
Ver.1.2

LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL

MSc in Systems and Control

Dr George Halikias

EEIE, School of Engineering and Mathematical Sciences, City University

4 March 2007


1. Calculus of variations

1.1 Introduction

Calculus of variations is the theory of optimisation of functionals, typically integrals.

Perhaps the first problem in the calculus of variations was the “brachistochrone” problem formulated by J. Bernoulli in 1696: Consider a bead sliding under gravity along a smooth wire joining two fixed points A and B (not on the same vertical line). What is the shape of the wire in order that the bead, when released from rest at point A, slides to B in minimum time?

[Figure: The brachistochrone problem. The wire y(x) joins A = (0, 0) to B = (a, b); x is horizontal and y is measured vertically downwards, in the direction of gravity g.]

The figure shows the choice of axes, with A taken to be the origin without loss of generality. Here we are required to minimise:

    ∫_A^B dt = ∫_A^B ds/v

where s is the arc-length along the wire and v is the instantaneous speed of the bead. Here total energy is conserved, so, if m is the mass of the bead,

    (1/2) m v² = m g y  ⇒  v = √(2gy)

On noting that

    ds = √((dx)² + (dy)²) = dx √(1 + (dy/dx)²) = dx √(1 + y′²)

we have to minimise:

    J[y] = (1/√(2g)) ∫_0^a ((1 + y′²)/y)^(1/2) dx

with y(0) = 0 and y(a) = b.

For each curve y(x) joining A and B, J[y] has a numerical value, namely the time taken. Thus J acts on a set of functions to produce a corresponding set of numbers. This is unlike what we think of as ordinary functions, which typically map numbers to numbers. To mark the difference, integrals like J[y] above are called functionals.

Calculus of variations deals with optimisation problems of the type described above. We will generalise this class of problems by imposing additional integral constraints (e.g. related to the total length of the curve y(x)) or, possibly, relaxing others (e.g. those related to the fixed end-points). To describe our problems rigorously we also need to specify clearly the class of the functions over which the optimisation is carried out.

1.2 The fixed-end-point problem

Here we change notation and consider functions x(t), where t is the independent variable and x the dependent, so that x(t) defines the equation of a curve. (This is so that we have a smoother notational transition to optimal control problems to be discussed later!). The problem considered here is to find, among all curves (in a specified class) joining two fixed points (t0, x0) and (t1, x1), the equation of the curve minimising a given functional. This functional is the integral from t0 to t1 of a given function f(t, x(t), ẋ(t)), where ẋ = dx/dt. We assume that f is sufficiently smooth, i.e. differentiable with respect to each of its three variables t, x and ẋ as many times as required. Note that the three variables are considered to be independent (i.e. the value of x at some time t does not constrain the gradient ẋ at t). Thus the problem is:

    min J[x] = ∫_{t0}^{t1} f(t, x, ẋ) dt     (1)

with x(t0) = x0 and x(t1) = x1. If x = x*(t) is a minimising curve, then we have that

    J[y] ≥ J[x*]

for all other curves y(t) (in a specified class - see later!) satisfying the end constraints.

Admissible classes of functions that can be considered include C (continuous), C1 (once continuously differentiable), C2 (twice continuously differentiable) and D1 (continuous piecewise-differentiable, i.e. continuously differentiable except at a finite - or countable - number of points). To minimise technicalities we start by taking the admissible class of functions to be C2. We first need to define the concept of “closeness” between functions.

Definition - Weak variations: Let x*(t) be the minimising curve and y(t) an admissible curve. Then, if there exist (small) numbers ε1 and ε2 such that:

    |x*(t) − y(t)| ≤ ε1  and  |ẋ*(t) − ẏ(t)| ≤ ε2

for all t ∈ [t0 t1], we will say that y(t) is a weak variation of x*(t). Note that we require not only the functions but also their derivatives to be “close”.

Definition - Strong variations: Let x*(t) be the minimising curve and y(t) an admissible curve. Then, if there exists a (small) number ε such that:

    |x*(t) − y(t)| ≤ ε

for all t ∈ [t0 t1], we will say that y(t) is a strong variation of x*(t).

[Figure: Weak and strong variations. Two sketches of curves x(t) joining A and B near the minimiser x*(t); in a weak variation both the curve and its slope stay close to x*(t), in a strong variation only the curve itself need stay close.]

Let x*(t) be a (local) minimiser of (1) in the class of C2 functions and define x(t) = x*(t) + εη(t), where ε is a small quantity independent of x*, η and t. Then x(t) will be a weak variation of x*(t) provided η(t) is C2 and η(t0) = η(t1) = 0. The difference ∆J = J[x] − J[x*] is known as the variation of J. Hence,

    ∆J = ∫_{t0}^{t1} f(x* + εη, ẋ* + εη̇, t) dt − ∫_{t0}^{t1} f(x*, ẋ*, t) dt

At each t ∈ [t0 t1] expand the first integrand as a Taylor series in the two variables εη and εη̇ around (x*, ẋ*). Then,

    ∆J = ∫_{t0}^{t1} { f(x*, ẋ*, t) + εη ∂f/∂x + εη̇ ∂f/∂ẋ + (1/2)( ε²η² ∂²f/∂x² + 2ε²ηη̇ ∂²f/∂x∂ẋ + ε²η̇² ∂²f/∂ẋ² ) } dt − ∫_{t0}^{t1} f(x*, ẋ*, t) dt + O(ε³)

where all partial derivatives are evaluated on the optimal curve and where O(ε³) denotes terms of at least order three. This may be written as

    ∆J = εV1 + ε²V2 + O(ε³)

where V1 and V2 denote the first and second variations:

    V1 = ∫_{t0}^{t1} ( η(t) ∂f(x*, ẋ*, t)/∂x + η̇(t) ∂f(x*, ẋ*, t)/∂ẋ ) dt

and

    V2 = (1/2) ∫_{t0}^{t1} ( η²(t) ∂²f(x*, ẋ*, t)/∂x² + 2η(t)η̇(t) ∂²f(x*, ẋ*, t)/∂x∂ẋ + η̇²(t) ∂²f(x*, ẋ*, t)/∂ẋ² ) dt

respectively. If x*(t) is minimising then it is necessary that:

    ∆J = εV1 + ε²V2 + O(ε³) ≥ 0

for all admissible η(t), i.e. all C2 functions in the interval [t0 t1] such that η(t0) = η(t1) = 0. Now ε can be either positive or negative; hence dividing by ε gives:

    V1 + εV2 + O(ε²) ≥ 0 for ε > 0

and

    V1 + εV2 + O(ε²) ≤ 0 for ε < 0

Now taking the limit as ε → 0 shows that V1 ≥ 0 and V1 ≤ 0, which can only be satisfied if V1 = 0. Thus a necessary condition for a minimum is that the first variation V1 = 0, i.e.

    ∫_{t0}^{t1} ( η(t) ∂f(x*, ẋ*, t)/∂x + η̇(t) ∂f(x*, ẋ*, t)/∂ẋ ) dt = 0

Integrating the second term by parts gives

    ∫_{t0}^{t1} η̇ (∂f/∂ẋ) dt = [η ∂f/∂ẋ]_{t0}^{t1} − ∫_{t0}^{t1} η (d/dt)(∂f/∂ẋ) dt = −∫_{t0}^{t1} η (d/dt)(∂f/∂ẋ) dt

since η(t0) = η(t1) = 0. Thus, the necessary condition becomes:

    ∫_{t0}^{t1} η(t) { ∂f/∂x − (d/dt)(∂f/∂ẋ) } dt = 0     (2)

for all admissible η, i.e. all C2 functions in the interval [t0 t1] such that η(t0) = η(t1) = 0. Note that the term in the curly brackets is continuous and does not involve the variation η(t). We now make use of the following result:

Lemma 1: If g(t) is continuous in an interval [t0 t1] and if

    ∫_{t0}^{t1} η(t) g(t) dt = 0

for every function η(t) ∈ C2(t0, t1) such that η(t0) = η(t1) = 0, then g(t) = 0 for every t ∈ [t0 t1].

Proof: Suppose the function g(t) is non-zero, say positive, at some point t in the interval [t0 t1]; then from continuity it must also be positive in some interval [α β] contained in [t0 t1]. Define:

    η(t) = (t − α)³(β − t)³ for t ∈ [α β]
    η(t) = 0 for t ∉ [α β]

Clearly, the function η(t) defined above is continuous everywhere in the interval [t0 t1] and also has continuous first and second derivatives at every point in this interval (the only points of contention are t = α and t = β, at which η(t) and its first two derivatives may be easily verified to be continuous). Since further η(t0) = η(t1) = 0 we have from the Lemma’s hypothesis that:

    ∫_{t0}^{t1} η(t) g(t) dt = 0  ⇒  ∫_α^β η(t) g(t) dt = 0

which is a contradiction since both η(t) and g(t) are (strictly) positive everywhere in the interval α < t < β. ∎
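As a quick sanity check of the bump function used in this proof, the following sketch (an illustration, not part of the original notes; it assumes sympy is available) verifies symbolically that η and its first two derivatives all vanish at t = α and t = β, so the piecewise definition is indeed C2.

    import sympy as sp

    t, alpha, beta = sp.symbols('t alpha beta', real=True)
    eta = (t - alpha)**3 * (beta - t)**3   # bump function from the proof of Lemma 1

    # eta and its first two derivatives must vanish at t = alpha and t = beta
    for k in range(3):
        d = sp.diff(eta, t, k)
        print(k, sp.simplify(d.subs(t, alpha)), sp.simplify(d.subs(t, beta)))
    # prints 0 0 for k = 0, 1, 2, confirming the piecewise function is C^2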

Applying the Lemma to (2) gives a necessary condition for a minimum as:

    ∂f/∂x − (d/dt)(∂f/∂ẋ) = 0


which is known as the Euler-Lagrange equation. We summarise this important result

as follows:

Theorem 1: In order that x = x*(t) should be a solution, in the class of C2 functions, of Problem (1), it is necessary that

    ∂f/∂x − (d/dt)(∂f/∂ẋ) = 0

at each point of x = x*(t).

Note 1: The Euler-Lagrange equation also gives necessary conditions for a local

maximum (as can be seen by adapting the above derivation!). For this reason the

solutions to the Euler-Lagrange equation are called extremals.

Note 2: Although here the Euler-Lagrange equation was derived for C2 curves, it is

possible to show that if we enlarge the admissible class to C1 curves (once continuously

differentiable) the Euler-Lagrange equation still holds along a minimising solution.

Example: Find the extremal of the functional J[x] = ∫_1^2 ẋ²t³ dt, given that x(1) = 0 and x(2) = 3.

We have t0 = 1, t1 = 2, x0 = 0, x1 = 3 and f(x, ẋ, t) = ẋ²t³. The Euler-Lagrange equation in this case gives:

    ∂f/∂x − (d/dt)(∂f/∂ẋ) = 0  ⇒  0 − (d/dt)(2ẋt³) = 0

giving ẋt³ = constant. On integrating we find:

    x(t) = k/t² + l for two constants k and l

When we apply the end conditions we get x(1) = 0 ⇒ k + l = 0 and x(2) = 3 ⇒ k/4 + l = 3. Solving gives the extremal as:

    x*(t) = 4 − 4/t²
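The following short sketch (illustration only; it assumes sympy is available) checks this extremal symbolically: it substitutes x*(t) = 4 − 4/t² into the Euler-Lagrange equation for f = ẋ²t³ and confirms the end conditions.

    import sympy as sp

    t = sp.symbols('t', positive=True)
    x = 4 - 4/t**2                         # candidate extremal x*(t)
    xdot = sp.diff(x, t)

    # For f = xdot^2 t^3: df/dx = 0 and df/dxdot = 2*xdot*t^3, so the
    # Euler-Lagrange equation reduces to d/dt(2*xdot*t^3) = 0 along x*(t).
    print(sp.simplify(sp.diff(2*xdot*t**3, t)))   # 0
    print(x.subs(t, 1), x.subs(t, 2))             # 0 and 3, the prescribed end values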

1.2.1 Special form of the Euler-Lagrange equation

Suppose that the function f in Problem (1) is independent of t. In this case the Euler-Lagrange equation simplifies. To show this let f = f(x, ẋ). Then:

    (d/dt)( f − ẋ ∂f/∂ẋ ) = (∂f/∂x) ẋ + (∂f/∂ẋ) ẍ − ẍ (∂f/∂ẋ) − ẋ (d/dt)(∂f/∂ẋ)

Hence

    (d/dt)( f − ẋ ∂f/∂ẋ ) = ẋ ( ∂f/∂x − (d/dt)(∂f/∂ẋ) )

On an extremal the RHS of this equation is zero and hence the Euler-Lagrange equation in this case reduces to:

    f − ẋ ∂f/∂ẋ = constant

Example: Consider the brachistochrone problem in the new (x, t) notation. This resulted in the minimisation:

    min J[x] = ∫_0^a ((1 + ẋ²)/x)^(1/2) dt

Here the integrand is independent of t and hence the Euler-Lagrange equation reduces to:

    f(x, ẋ) − ẋ ∂f/∂ẋ = constant  ⇒  (1 + ẋ²)^(1/2)/x^(1/2) − ẋ · ẋ (1 + ẋ²)^(−1/2)/x^(1/2) = constant

or

    (1 + ẋ²)^(1/2)/x^(1/2) − ẋ²/( x^(1/2) (1 + ẋ²)^(1/2) ) = constant  ⇒  x(1 + ẋ²) = c (constant)

This may be solved in parametric form (exercise!) by using the substitution ẋ = tan(φ) as: x = k(1 + cos 2φ) and t = l − k(2φ + sin 2φ), where k and l are constants determined by the end-conditions. The equation of the extremising curve (actually minimising) is that of a cycloid. This is the path traced by a point on the edge of a disc as the disc rolls along a straight line.
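As an illustration (not part of the original notes; it assumes sympy), the sketch below substitutes the parametric cycloid into the first integral and confirms that x(1 + ẋ²) is indeed constant (equal to 2k) along the curve.

    import sympy as sp

    phi, k, l = sp.symbols('phi k l', positive=True)
    x = k*(1 + sp.cos(2*phi))             # parametric solution x(phi)
    t = l - k*(2*phi + sp.sin(2*phi))     # parametric solution t(phi)

    xdot = sp.diff(x, phi) / sp.diff(t, phi)    # chain rule: dx/dt = (dx/dphi)/(dt/dphi)
    first_integral = sp.simplify(x*(1 + xdot**2))
    print(first_integral)                  # 2*k, a constant, as required by x(1 + xdot^2) = c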

1.2.2 A formal setting for calculus of variations

In order to formalize the arguments used in the calculus of variations we need to

generalise the concepts of distance and continuity to function spaces. An approach for

this is outlined in this section.

Recall that by a linear space we mean a set R of elements x, y, z, . . ., for which the

operations of addition and multiplication by (real) numbers (more generally elements

of a field) α, β, . . ., are defined according to the following axioms:

1. x + y = y + x;

2. (x + y) + z = x + (y + z);

3. There exists an element 0 (zero element) such that x + 0 = x for every x ∈ R;

4. For each x ∈ R there exists an element −x such that x + (−x) = 0;

5. 1 · x = x;

6. α(βx) = (αβ)x;

7. (α + β)x = αx + βx; and

8. α(x + y) = αx + αy.

A linear space R is said to be normed, if each element x ∈ R is assigned a non-negative

number ‖x‖ (the norm of x), such that:

1. ‖x‖ = 0 if and only if x = 0;

2. ‖αx‖ = |α|‖x‖; and

3. ‖x + y‖ ≤ ‖x‖+ ‖y‖.

For our purposes we are especially interested in real function spaces. We define:

• The space C (or more precisely C(a, b)) consisting of all continuous functions

x(t) defined on the closed interval [a, b]. By addition of elements of C and

multiplication of elements of C by numbers we mean the normal operations of

function additions and number-function multiplication, respectively, while the

norm is defined as the maximum absolute value, i.e. ‖x‖0 = maxa≤t≤b |x(t)|.

• The space C1, or more precisely C1(a, b), consisting of all functions x(t) defined in the interval [a b] which are continuous and have continuous first derivatives. The operations of addition and multiplication are the same as in C, but now the norm is defined as: ‖x‖1 = max_{a≤t≤b} |x(t)| + max_{a≤t≤b} |ẋ(t)|, where ẋ is the derivative of x(t). Thus, two functions in C1(a, b) will be regarded as close together (say within a distance ε) if both the functions themselves and their derivatives are close, i.e. ‖y − z‖1 < ε implies that |y(t) − z(t)| < ε and |ẏ(t) − ż(t)| < ε for every a ≤ t ≤ b.

• The space Cn, or more precisely Cn(a, b), consisting of all functions x(t) defined in the interval [a b] which are continuous and have continuous first n derivatives. The norm in Cn is defined as: ‖x‖n = Σ_{i=0}^{n} max_{a≤t≤b} |x^(i)(t)|, where x^(i)(t) is the i-th derivative of x(t). Note in particular that Cn ⊆ Cn−1 ⊆ . . . ⊆ C1 ⊆ C and that if x ∈ Cn then ‖x‖n < ε ⇒ ‖x‖n−1 < ε ⇒ . . . ⇒ ‖x‖1 < ε ⇒ ‖x‖0 < ε.
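To make the difference between the C0 and C1 norms concrete, here is a small numerical sketch (an illustration, not from the notes; it assumes numpy) that approximates ‖y − z‖0 and ‖y − z‖1 on a grid for two functions that are close in value but not in slope.

    import numpy as np

    a, b = 0.0, 1.0
    t = np.linspace(a, b, 2001)
    eps = 0.01
    y = np.sin(2*np.pi*t)
    z = y + eps*np.sin(50*np.pi*t)        # small in value, but with a rapidly varying wiggle

    dy, dz = np.gradient(y, t), np.gradient(z, t)    # numerical derivatives
    norm0 = np.max(np.abs(y - z))                     # C^0 distance: about eps
    norm1 = norm0 + np.max(np.abs(dy - dz))           # C^1 distance: adds roughly eps*50*pi
    print(norm0, norm1)   # z is close to y in the C^0 sense but not in the C^1 sense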

Functionals J can now be defined as maps from a linear space R to the set of real numbers ℝ, i.e. J : R → ℝ. R will normally be taken as a real function space (e.g. C1(a, b) or C2(a, b) depending on context). In a similar way that continuity is defined for functions f : ℝ → ℝ we can define continuity of functionals.

Definition: The functional J[x] is said to be continuous at a point x̄ ∈ R if for any ε > 0 there exists a δ > 0 such that:

    |J[x] − J[x̄]| < ε

provided that ‖x − x̄‖ < δ.

Next, we can define formally the concept of variation (or differential) of a functional,

analogously to the concept of differential of a function of n variables. We first give the

following definition:

Definition: Given a normed linear space R, let each element x ∈ R be assigned a

real number φ[x], i.e. let φ[x] be a functional defined in R. Then φ[x] is said to be a

(continuous) linear functional if:

1. φ[αx] = αφ[x] for any x ∈ R and any real number α;

2. φ[x1 + x2] = φ[x1] + φ[x2] for any x1 and x2 ∈ R; and

3. φ[x] is continuous (for all x ∈ R).

Example: We associate with each function x(t) ∈ C(a, b) its value at a fixed point

t0 ∈ [a, b], i.e. we define the functional φ[x] by the formula φ[x] = x(t0); then φ[x] is a

linear functional on C(a, b).

Example: The integral φ[x] = ∫_a^b x(t) dt defines a linear functional on C(a, b).

Exercise: Show that the two functionals defined in the above examples are linear by

verifying the three properties of the definition.

Let J[x] be a functional defined on some normed linear space and let ∆J[h] = J[x + h] − J[x] be its increment corresponding to the function h = h(t). If x is fixed, ∆J[h] is a functional of h (nonlinear in general). Suppose that ∆J[h] = φ[h] + ε‖h‖, where φ[h] is a linear functional, and that ε → 0 as ‖h‖ → 0. Then the functional J[x] is said to be differentiable, and the principal linear part of the increment ∆J[h], i.e. the linear functional φ[h] which differs from ∆J[h] by an infinitesimal of order higher than one relative to ‖h‖, is called the variation (or differential) of J[x] and is denoted by δJ[h]. It may easily be shown that the variation of a differentiable functional is unique.

We say that the functional J[x] has a (relative) extremum for x = x̄ if J[x] − J[x̄] does not change sign in some neighborhood of x̄. We are normally concerned with functionals defined on a set of continuously differentiable functions, and the functions themselves can be regarded as elements either of C or C1. Correspondingly, we can define two types of extrema, weak and strong. We will say that the functional J[x] has a weak extremum for x = x̄ if there exists an ε > 0 such that J[x] − J[x̄] has the same sign for all x in the domain of definition of the functional which satisfy ‖x − x̄‖1 < ε. On the other hand, we will say that J[x] has a strong extremum for x = x̄ if there exists an ε > 0 such that J[x] − J[x̄] has the same sign for all x such that ‖x − x̄‖0 < ε. Clearly, every strong extremum is also a weak extremum, although the converse is not true in general. It is now possible to show that a necessary condition for a differentiable functional to have an extremum for x = x̄ is that its (first) variation vanishes at x = x̄, i.e. that δJ[h] = 0 for x = x̄ and all admissible functions h. This leads to the Euler-Lagrange equation as a necessary condition for the functional:

    J[x] = ∫_{t0}^{t1} f(x, ẋ, t) dt

to have a weak extremum.

1.3 Problems in which the end points are not fixed

Here we will consider the modified problem:

    min J[x] = ∫_{t0}^{t1} f(t, x, ẋ) dt     (3)

where (t0, x(t0)) is fixed but (t1, x(t1)) is required to lie on some given (differentiable) curve x = c(t). We will again derive necessary conditions for a minimum.

Let x = x*(t) be a minimising curve and suppose it intersects the target curve at t = t1. Let y(t) = x*(t) + εη(t) be a weak variation starting at (t0, x(t0)) and reaching the target curve at t = t1 + ∆τ, where ∆τ is “small”. For a weak variation ∆τ will be O(ε).

[Figure: A weak variation with only one fixed end-point. The extremal x*(t) and the varied curve y(t) = x*(t) + εη(t) both start at A = (t0, x(t0)); x*(t) meets the target curve c(t) at t1, while y(t) meets it at t1 + ∆τ.]

Now, using Taylor series expansion:

    y(t1 + ∆τ) = x*(t1 + ∆τ) + εη(t1 + ∆τ) = x*(t1) + ∆τ ẋ*(t1) + εη(t1) + O(ε²)

and

    y(t1 + ∆τ) = c(t1 + ∆τ) = c(t1) + ∆τ ċ(t1) + O(ε²)

Thus,

    x*(t1) + ∆τ ẋ*(t1) + εη(t1) = c(t1) + ∆τ ċ(t1)

and hence

    εη(t1) = ( ċ(t1) − ẋ*(t1) ) ∆τ     (4)

since x*(t1) = c(t1). The variation in J is:

    ∆J = ∫_{t0}^{t1+∆τ} f(t, x* + εη, ẋ* + εη̇) dt − ∫_{t0}^{t1} f(t, x*, ẋ*) dt

Expanding for each t the first integrand around (x*(t), ẋ*(t)) gives

    f(t, x* + εη, ẋ* + εη̇) = f(t, x*, ẋ*) + εη ∂f/∂x + εη̇ ∂f/∂ẋ + O(ε²)

where we keep only terms up to O(ε) since we are only interested in the first variation. Thus

    ∆J = ∫_{t0}^{t1} { f(t, x*, ẋ*) + εη ∂f/∂x + εη̇ ∂f/∂ẋ } dt + ∫_{t1}^{t1+∆τ} { f(t, x*, ẋ*) + εη ∂f/∂x + εη̇ ∂f/∂ẋ } dt − ∫_{t0}^{t1} f(t, x*, ẋ*) dt + O(ε²)

On noting that the second integral can be written as:

    ∫_{t1}^{t1+∆τ} { f(t, x*, ẋ*) + εη ∂f/∂x + εη̇ ∂f/∂ẋ } dt = f(t1, x*(t1), ẋ*(t1)) ∆τ + O(ε²)

we get

    ∆J = ∫_{t0}^{t1} { εη ∂f/∂x + εη̇ ∂f/∂ẋ } dt + f(t1, x*(t1), ẋ*(t1)) ∆τ + O(ε²)

Integrating the second term by parts gives

    ∫_{t0}^{t1} η̇ (∂f/∂ẋ) dt = [η ∂f/∂ẋ]_{t0}^{t1} − ∫_{t0}^{t1} η (d/dt)(∂f/∂ẋ) dt = η(t1) (∂f/∂ẋ)(t1, x*(t1), ẋ*(t1)) − ∫_{t0}^{t1} η (d/dt)(∂f/∂ẋ) dt

since in this case η(t0) = 0, but the value at the second end-point, η(t1), is not in general zero. Thus:

    ∆J = ε ∫_{t0}^{t1} η(t) { ∂f/∂x − (d/dt)(∂f/∂ẋ) } dt + f(t1, x*(t1), ẋ*(t1)) ∆τ + εη(t1) (∂f/∂ẋ)(t1, x*(t1), ẋ*(t1)) + O(ε²)

in which all the explicitly-written terms are O(ε). Now consider an auxiliary minimisation problem in which both end-points are fixed, i.e. x(t) is constrained to pass through the points (t0, x(t0)) and (t1, x*(t1)). Clearly for this problem x*(t) is still an extremum: To see this, suppose for contradiction that there existed an admissible optimal curve y*(t) (in the same class as x*(t) and satisfying the (fixed) end-point constraints y*(t0) = x0, y*(t1) = x*(t1)) for which J[y*] < J[x*]; then y*(t) would also be an optimal solution for the original problem (its end-point lies on c(t)) and that would contradict minimality of x*(t) for the original problem.

Thus x*(t) must be a minimising curve for the fixed-end problem, and therefore (from section 1.2) it must satisfy the Euler-Lagrange equation at every point of the optimal curve. Thus the term inside the curly brackets in the above equation must be zero, and we get:

    ∆J = f(t1, x*(t1), ẋ*(t1)) ∆τ + εη(t1) (∂f/∂ẋ)(t1, x*(t1), ẋ*(t1)) + O(ε²)

Using equation (4) gives:

    ∆J = ∆τ { f(t1, x*(t1), ẋ*(t1)) + ( ċ(t1) − ẋ*(t1) ) (∂f/∂ẋ)(t1, x*(t1), ẋ*(t1)) } + O(ε²)

If this extremal is to minimise J, then using the same argument as in the proof of Theorem 1, the first variation must vanish (e.g. consider arbitrary signs in ∆τ) and hence we have the necessary condition:

    f(t1, x*(t1), ẋ*(t1)) + ( ċ(t1) − ẋ*(t1) ) (∂f/∂ẋ)(t1, x*(t1), ẋ*(t1)) = 0

at the free end-point of the extremal, when the end-point is not fixed but constrained to lie on the fixed curve c(t). Note that all the terms are evaluated at time t1, so this is an algebraic equation relating the slope of c(t) and the slope of the extremal at the point where they meet. Conditions of this type are called transversality conditions. Of course this condition must be satisfied in addition to the Euler-Lagrange equation, which is also a necessary condition.

Example: Find the extremal of ∫_1^T ẋ²t³ dt given that x(1) = 0, T > 1 is finite and x(T) lies on the curve x = 2/t² − 3.

Note that this is essentially the same example as in section 1.2, except that now the end-point is not fixed. Using the Euler-Lagrange equation the extremals were found to be:

    x(t) = k/t² + l,  k and l arbitrary constants

Since x(1) = 0 we have l = −k; thus x(t) = k/t² − k and ẋ = −2k/t³. The target curve is c(t) = 2/t² − 3, so ċ = −4/t³. The transversality condition gives:

    ẋ(T)² T³ + 2 ( ċ(T) − ẋ(T) ) ẋ(T) T³ = 0

Now T is finite and k ≠ 0 (otherwise the extremal would be x(t) = 0), so the transversality condition gives k = 4. Thus the required extremal is x*(t) = 4/t² − 4, which meets the target curve x = 2/t² − 3 at T = √2, x(T) = −2.
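As an illustration (assuming sympy; not part of the original notes), the sketch below solves the transversality condition for k and recovers the meeting point T = √2.

    import sympy as sp

    t, T = sp.symbols('t T', positive=True)
    k = sp.symbols('k', real=True)
    x = k/t**2 - k                         # family of extremals satisfying x(1) = 0
    c = 2/t**2 - 3                         # target curve
    xdot, cdot = sp.diff(x, t), sp.diff(c, t)

    # transversality: f + (cdot - xdot) * df/dxdot = 0 at t = T, with f = xdot^2 t^3
    cond = (xdot**2*t**3 + (cdot - xdot)*2*xdot*t**3).subs(t, T)
    print(sp.solve(sp.Eq(cond, 0), k))     # [0, 4]; k = 4 gives the nontrivial extremal
    print(sp.solve(sp.Eq((x - c).subs(k, 4), 0), t))   # [sqrt(2)], where x* meets c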


1.3.1 Special forms of the transversality conditions

If the problem is one in which either t1 or x(t1) is specified while the value of the other

is completely free, then the transversality condition can be simplified. Consider each

case separately:

1. x(t1) fixed, t1 free: Here the target curve is x = c(t) = constant and hence ċ = 0. The transversality condition in this case becomes:

    f(t1, x*(t1), ẋ*(t1)) − ẋ*(t1) (∂f/∂ẋ)(t1, x*(t1), ẋ*(t1)) = 0

[Figure: x(t1) fixed, t1 free. The extremal x*(t), starting at A, ends on the horizontal target line c(t) = constant, at t1 for x* and at t1 + ∆τ for the varied curve.]

2. t1 fixed, x(t1) free: Here the target curve is a straight line perpendicular to the t axis, whose slope ċ(t1) is infinite. Now, provided ċ(t) ≠ 0, the transversality condition can be written as:

    (1/ċ(t1)) ( f(t1, x*(t1), ẋ*(t1)) − ẋ*(t1) (∂f/∂ẋ)(t1, x*(t1), ẋ*(t1)) ) + (∂f/∂ẋ)(t1, x*(t1), ẋ*(t1)) = 0

which for infinite ċ(t1) simply reduces to:

    (∂f/∂ẋ)(t1, x*(t1), ẋ*(t1)) = 0

[Figure: t1 fixed, x(t1) free. The extremal x*(t), starting at A, ends on the vertical line t = t1.]

Example: Find the extremal of J = ∫_0^T (x² + ẋ²) dt for each of the following cases: (i) x(0) = 1, T = 2, and (ii) x(0) = 1, x(T) = 2.

The Euler-Lagrange equation is f_x − (d/dt)(f_ẋ) = 0, or 2x − 2ẍ = 0, which gives ẍ − x = 0. The general solution is x(t) = Ae^t + Be^(−t). We now distinguish between the two cases:

(i) x(0) = 1 so A + B = 1. Since x(T) is unspecified and T = 2, the appropriate end condition is f_ẋ(T) = 0, which gives ẋ(2) = 0, so B = Ae⁴. The extremal is x*(t) = cosh(t − 2)/cosh(2) and x(2) = 1/cosh(2).

(ii) x(0) = 1 so A + B = 1. Since T is unspecified and x(T) = 2, the appropriate end condition is f(T) − ẋ(T) f_ẋ(T) = 0, which gives AB = 0, so that the extremals are either e^t or e^(−t). Now x(T) = 2, so either 2 = e^T or 2 = e^(−T). The second of these has no positive solution for T. Thus the extremal is x = e^t, which cuts x = 2 at T = ln(2).
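The following sketch (illustrative only; it assumes sympy) checks case (i): it confirms that x*(t) = cosh(t − 2)/cosh 2 solves ẍ − x = 0 with x(0) = 1 and satisfies the free-end condition ẋ(2) = 0.

    import sympy as sp

    t = sp.symbols('t', real=True)
    x = sp.cosh(t - 2)/sp.cosh(2)             # candidate extremal for case (i)

    print(sp.simplify(sp.diff(x, t, 2) - x))  # 0: Euler-Lagrange equation xddot - x = 0
    print(x.subs(t, 0))                       # 1: fixed initial condition x(0) = 1
    print(sp.diff(x, t).subs(t, 2))           # 0: natural end condition f_xdot(2) = 2*xdot(2) = 0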

1.4 Finding minimising curves

Note that the Euler-Lagrange equation derived earlier is only a necessary condition,

i.e. it gives necessary conditions for a function to be a (local) minimiser of a functional

J . As we have seen, the Euler-Lagrange equations also give necessary conditions for

maximising functions. (Think of this equation as the equivalent of the condition that

the derivative of a function should vanish at a local minimum or maximum). What we

need in order to distinguish between minimising and maximising solutions are sufficient

conditions. Although such conditions have been developed (“field of extremals”) they


are unfortunately outside the scope of this course! Fortunately in some cases (for

complex problems read “hardly ever”!) we can identify minima and maxima from

simple additional arguments or by using geometrical/physical intuition. Consider the

following example:

Example: Find the extremal of J[x] = ∫_1^2 ẋ²t³ dt given that x(1) = 0 and x(2) = 3. Is this a minimum?

The extremal was identified in a previous example as x*(t) = 4 − 4/t². Let us consider the sign of J[y] − J[x*], where y = y(t) is an arbitrary C1 curve joining the (fixed) end-points. Then y(t) = x*(t) + η(t), where η(t) ∈ C1 and η(1) = η(2) = 0. Then

    J[y] − J[x*] = ∫_1^2 ( 8/t³ + η̇ )² t³ dt − ∫_1^2 ( 8/t³ )² t³ dt = ∫_1^2 ( 16η̇ + η̇²t³ ) dt
                 = [16η]_1^2 + ∫_1^2 η̇²t³ dt = ∫_1^2 η̇²t³ dt

since η(1) = η(2) = 0. Note that the integrand is non-negative in the interval t ∈ [1 2] and thus J[y] ≥ J[x*], i.e. x* is actually a global minimiser among all C1 functions which satisfy the end-point constraints (note that nowhere in the above argument was it assumed that η(t) is “small”). In fact we can say more than this:

Consider a variation y(t) = x*(t) + η(t) in the class D1 satisfying the end constraints. (Here D1 is the set of all continuous piecewise-differentiable functions, i.e. all continuous functions with a continuous derivative, except at most at a countable number of points). Then η̇(t) would have a finite number of discontinuities in the interval [1 2], i.e. y(t) would be continuous with a finite number of “corners” where the derivative “jumps”. In this case we could split [1 2] into a number of sub-intervals in which η̇(t) is continuous and repeat the calculation for J[y] − J[x*]; we would still get that J[y] ≥ J[x*], and hence x* is actually a global minimiser among (the wider class of) D1 functions. This is not accidental, as shown by the next Theorem:

Theorem 2: If x = x*(t) is a minimising curve in the class of C1 functions, then it is also a minimising curve in the wider class of D1 functions.

Proof: See [9].

The following theorem is also stated without proof:

Theorem 3: In order that a D1 function is a minimiser for the fixed end-point problem it is necessary that:

1. The Euler-Lagrange equation is satisfied between corners and between a corner and an end-point.

2. ∂f/∂ẋ is continuous at a corner.

3. f − ẋ ∂f/∂ẋ is continuous at a corner.

Proof: See [9].

1.5 Isoperimetric problems

Here we consider the problem of finding an extremum to a functional subject to an

equality constraint involving a second functional. Historically, the first problems of

this type involved finding an optimal curve whose total length (perimeter) was fixed

- hence the name. It turns out that the standard method used in the optimisation of

functions in Rn under equality constraints - Lagrange multipliers - also applies here:

Problem IP: Minimise the functional

    J[x] = ∫_{t0}^{t1} f(t, x, ẋ) dt

with x(t0) = x0, x(t1) = x1, subject to the integral constraint:

    I = ∫_{t0}^{t1} g(t, x, ẋ) dt = c

where c is a constant.

Theorem 4: In order that x = x*(t) is a solution of Problem IP (isoperimetric problem) it is necessary that it should be an extremal of:

    ∫_{t0}^{t1} ( f(t, x, ẋ) + λ g(t, x, ẋ) ) dt

for a certain constant λ (Lagrange multiplier).

Proof: See [9].

Example: Minimise J = ∫_0^1 ẋ² dt with x(0) = 2, x(1) = 4, subject to the constraint ∫_0^1 x dt = 1.

Theorem 4 says that we should find the extremals of ∫_0^1 ( ẋ² + λx ) dt. The Euler-Lagrange equation is λ − d(2ẋ)/dt = 0, or ẍ = λ/2. Integrating gives the solution:

    x(t) = λt²/4 + kt + l

The end conditions give l = 2, k = 2 − λ/4. We find λ by using the constraint:

    ∫_0^1 { λt²/4 + (2 − λ/4)t + 2 } dt = 1

which gives λ = 48 after some algebra. Hence the required extremal is x(t) = 12t² − 10t + 2.
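A quick symbolic check of this result (illustration only; assumes sympy) recovers λ = 48 from the constraint and verifies the end-point and integral conditions of the extremal.

    import sympy as sp

    t, lam = sp.symbols('t lam', real=True)
    x = lam*t**2/4 + (2 - lam/4)*t + 2             # extremal family satisfying x(0)=2, x(1)=4
    lam_sol = sp.solve(sp.Eq(sp.integrate(x, (t, 0, 1)), 1), lam)[0]
    print(lam_sol)                                  # 48
    x_star = sp.expand(x.subs(lam, lam_sol))
    print(x_star)                                   # 12*t**2 - 10*t + 2
    print(x_star.subs(t, 0), x_star.subs(t, 1), sp.integrate(x_star, (t, 0, 1)))  # 2, 4, 1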


2. Optimal control

2.1 The general optimal control problem

The theory of optimal control allows for the solution of a large class of non-linear control problems subject to complex state and control-signal constraints. The theory is an extension of classical calculus of variations since it does not rely on the “smoothness assumptions” made so far; indeed in most cases the optimal control is highly discontinuous (“bang-bang control”, control along “switching curves”, “sliding control”). The formulation of the problem involves the minimization of a “cost function” subject to initial and terminal constraints, which is reminiscent of calculus of variations problems. The optimal control signal is typically obtained either as a function of time u*(t) or, more interestingly for control applications, in feedback form, i.e. as a function of the state u*(x).

The most important result in this area is Pontryagin’s “maximum principle”, which gives necessary conditions for optimality under very general assumptions. The general optimal control problem can be formulated as follows:

Suppose the plant is described by the non-linear time-varying dynamical equation

    ẋ(t) = f(x, u, t)

where the state x(t) ∈ Rn and the control u(t) ∈ U ⊆ Rm, where U is some compact region of Rm. With this system we associate the performance index:

    J(x0) = M(x(T), T) + ∫_{t0}^{T} L(x(t), u(t), t) dt

where [t0, T] is the time-interval of interest. Note that the terminal cost M(·, ·) is a function of the terminal state and time, whereas the weighting function L(·, ·, ·) penalizes time and the state and control variables at intermediate times. The problem is to find the optimal control u*(t) ∈ U which drives the state-vector along an optimal trajectory x*(t), such that J(x0) is minimised, subject to a constraint on the final state of the form ψ(x(T), T) = 0 for a given function ψ(·, ·).

2.2 A simplified optimal control problem

The optimal control problem which will be considered here is simplified as follows: (1) The system dynamics are assumed to be time-invariant (no explicit dependence of f on t); (2) The terminal time T will be assumed fixed; (3) The constraint u(t) ∈ U on the control signal is removed, along with the terminal state constraint ψ(x(T), T) = 0; this avoids some intricate questions on the existence of an optimal u(t) (controllability). Note that removal of the direct constraint u(t) ∈ U does not mean that u(t) is allowed to be unrestrained - constraints on the size (or energy, etc.) of the control input can be imposed indirectly through the penalty term L(·, ·) under the integral sign, which is also taken not to depend explicitly on time. To summarize, we consider the following problem:

• Time interval of interest: [0, T], T fixed.

• System dynamics: ẋ(t) = f(x(t), u(t)), x(t) ∈ Rn, u(t) ∈ Rm. u(t) is assumed piecewise continuous in [0, T]; the solution of ẋ(t) = f(x(t), u(t)) is assumed to exist and to be unique for all t ≥ 0.

• Initial conditions: x(0) = x0 (fixed).

• Performance index: J[u] = M(x(T)) + ∫_0^T L(x(t), u(t)) dt. The functions L(·, ·) and M(·) are assumed to be differentiable in all their variables.

• Problem: Find u*(t), 0 ≤ t ≤ T, which minimizes J[u].

We develop necessary conditions for optimality using informal variational arguments. First suppose that u*(t), 0 ≤ t ≤ T, is the optimal solution and let x*(t) be the corresponding (optimal) state trajectory. Next, consider the variation u*(t) + δu(t) of u*(t), resulting in a state-trajectory x*(t) + δx(t), where ‖δx‖ = O(‖δu‖). Now consider the variation ∆J:

    ∆J = M(x*(T) + δx(T)) + ∫_0^T L(x* + δx, u* + δu) dt − M(x*(T)) − ∫_0^T L(x*, u*) dt

Expanding using Taylor series:

    ∆J = M(x*(T)) + (δx(T))ᵀ ∂M(T)/∂x + ∫_0^T { L(x*, u*) + (δx)ᵀ ∂L/∂x + (δu)ᵀ ∂L/∂u } dt − M(x*(T)) − ∫_0^T L(x*, u*) dt + O(‖δu‖²)

where all partial derivatives are evaluated on the optimal trajectory (x*(t), u*(t)). Ignoring terms O(‖δu‖²) gives the first variation of the performance index as:

    δJ = (δx)ᵀ ∂M/∂x |_T + ∫_0^T { (δx)ᵀ ∂L/∂x + (δu)ᵀ ∂L/∂u } dt

which may be written out in full as:

    δJ = Σ_{j=1}^{n} (∂M(T)/∂xj) δxj(T) + ∫_0^T { Σ_{j=1}^{n} (∂L/∂xj) δxj + Σ_{j=1}^{m} (∂L/∂uj) δuj } dt

Note that as u*(t) is optimal, δJ = 0.

We now face a problem: the δxi and δuj are not independent increments, since they are linked via the dynamic constraints ẋ = f(x, u). The standard procedure when we face constraints is to augment the cost function by introducing Lagrange multipliers. Then necessary conditions for a minimum of the constrained problem (here the dynamics ẋ = f(x, u) act as constraints) are transformed to necessary conditions for a minimum of the augmented unconstrained problem. The same technique is followed here, but since the constraints are “continuous”, the multipliers must now be time-dependent. We need n scalar Lagrange variables, pi(t), one for each dynamic constraint ẋi(t) = fi(x, u). Consider the n integrals:

    Φi = ∫_0^T pi(t) ( fi(x, u) − ẋi(t) ) dt,  i = 1, 2, . . . , n

evaluated along the optimal trajectory (x*(t), u*(t)), and note that Φi = 0 for any u(t) (not just the optimal). Since Φi = 0, so is its first variation, i.e. δΦi = 0. Now,

    ∆Φi = ∫_0^T pi(t) [ fi(x + δx, u + δu) − ( ẋi + (d/dt)(δxi) ) ] dt − ∫_0^T pi(t) ( fi(x, u) − ẋi ) dt

Expanding,

    ∆Φi = ∫_0^T pi(t) [ fi(x, u) + Σ_{j=1}^{n} (∂fi/∂xj) δxj + Σ_{j=1}^{m} (∂fi/∂uj) δuj − ẋi(t) − (d/dt)(δxi) ] dt − ∫_0^T pi(t) [ fi(x, u) − ẋi(t) ] dt + O(‖δu‖²)

or

    δΦi = ∫_0^T pi(t) [ Σ_{j=1}^{n} (∂fi/∂xj) δxj + Σ_{j=1}^{m} (∂fi/∂uj) δuj − (d/dt)(δxi) ] dt

Integrating the last term by parts,

    ∫_0^T pi(t) (d/dt)(δxi) dt = [pi(t) δxi(t)]_0^T − ∫_0^T ṗi(t) δxi(t) dt = pi(T) δxi(T) − ∫_0^T ṗi(t) δxi(t) dt

since δxi(0) = 0, and hence

    δΦi = ∫_0^T pi(t) [ Σ_{j=1}^{n} (∂fi/∂xj) δxj + Σ_{j=1}^{m} (∂fi/∂uj) δuj ] dt + ∫_0^T ṗi(t) δxi(t) dt − pi(T) δxi(T)

Thus, by changing the order in which summation is carried out under the integral sign,

    Σ_{i=1}^{n} δΦi = ∫_0^T Σ_{j=1}^{n} δxj ( Σ_{i=1}^{n} pi(t) ∂fi/∂xj ) dt + ∫_0^T Σ_{j=1}^{m} δuj ( Σ_{i=1}^{n} pi(t) ∂fi/∂uj ) dt + ∫_0^T (δx)ᵀ ṗ(t) dt − (δx(T))ᵀ p(T)

where the summation in the last integral is written as a vector product. Thus:

    δJ + Σ_{i=1}^{n} δΦi = (δx)ᵀ ∂M/∂x |_T + ∫_0^T Σ_{j=1}^{n} δxj ( ∂L/∂xj + Σ_{i=1}^{n} pi(t) ∂fi/∂xj ) dt + ∫_0^T Σ_{j=1}^{m} δuj ( ∂L/∂uj + Σ_{i=1}^{n} pi(t) ∂fi/∂uj ) dt + ∫_0^T (δx)ᵀ ṗ(t) dt − (δx(T))ᵀ p(T)

A necessary condition for u*(t) to be optimal is that

    δJ + Σ_{i=1}^{n} δΦi = 0

(since each δΦi = 0). The expression for the variation can be simplified considerably by introducing the Hamiltonian function:

    H(x, u) = L(x, u) + pᵀ f(x, u) = L(x, u) + Σ_{i=1}^{n} pi(t) fi(x, u)

Note that:

    ∂H(x, u)/∂xj = ∂L/∂xj + Σ_{i=1}^{n} pi(t) ∂fi(x, u)/∂xj

and

    ∂H(x, u)/∂uj = ∂L/∂uj + Σ_{i=1}^{n} pi(t) ∂fi(x, u)/∂uj

With this substitution we get that

    (δx)ᵀ ( ∂M/∂x − p(t) ) |_T + ∫_0^T { (δx)ᵀ ( ∂H/∂x + ṗ(t) ) + (δu)ᵀ ∂H/∂u } dt = 0

is a necessary condition for optimality. Now, we have the Lagrange variables at our disposal and we make the choice:

    ṗ(t) = −∂H/∂x,  p(T) = ∂M(T)/∂x

which form the “co-state” of the system. With this choice we need

    δJ = ∫_0^T (δu)ᵀ (∂H/∂u) dt = 0

for a minimum, or equivalently, since δu is arbitrary, we have that

    ∂H/∂u = 0

along the optimal trajectory, i.e. the Hamiltonian is an extremum at every point of the optimal trajectory. To summarize, to find the optimal solution to the problem:

• We define the Hamiltonian H(x, u, p) = L(x, u) + pᵀ f(x, u) and set ∂H/∂u = 0.

• To find the optimal control we need to solve this condition simultaneously with:

    ẋ = ∂H/∂p = f(x, u),  x(0) = x0  (state equations)

and

    ṗ = −∂H/∂x,  p(T) = ∂M(T)/∂x  (co-state equations)

Note that the state equations need to be integrated forward in time (from the initial condition x(0) = x0) and the co-state equations backwards in time (from the terminal condition p(T) = ∂M(T)/∂x).
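To make the forward/backward structure concrete, here is a small numerical sketch (an illustration under assumed data, not part of the notes; it uses scipy) for the scalar problem ẋ = u, L = (1/2)(x² + u²), M = 0, T = 1. The conditions above give u = −p, ẋ = −p, ṗ = −x, with x(0) = 1 and p(1) = 0, a two-point boundary value problem solved here with solve_bvp.

    import numpy as np
    from scipy.integrate import solve_bvp

    T = 1.0

    def rhs(t, y):
        # y[0] = x (state), y[1] = p (co-state); optimal u = -p from dH/du = u + p = 0
        x, p = y
        return np.vstack([-p, -x])         # xdot = -p, pdot = -x

    def bc(ya, yb):
        # x(0) = 1 (initial state), p(T) = 0 (terminal co-state, since M = 0)
        return np.array([ya[0] - 1.0, yb[1]])

    t = np.linspace(0.0, T, 50)
    sol = solve_bvp(rhs, bc, t, np.zeros((2, t.size)))
    u_opt = -sol.sol(t)[1]                  # optimal control u*(t) = -p(t)
    print(sol.status, u_opt[0])             # 0 (converged); u*(0) = -tanh(1), about -0.7616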


2.3 The finite-horizon linear/quadratic regulator

Here we consider the specific problem of determining the optimal control of a linear time-invariant (LTI) system with a quadratic performance index. The plant dynamics are described by the standard state-space equations:

    ẋ(t) = Ax(t) + Bu(t),  x(0) = x0

where A ∈ Rn×n and B ∈ Rn×m. The performance index to be minimised is:

    J[u] = (1/2) xᵀ(T) S x(T) + (1/2) ∫_0^T ( xᵀQx + uᵀRu ) dt

where T is fixed, and S, Q ≥ 0, R > 0 are all symmetric constant matrices. First, we form the Hamiltonian:

    H(x, u) = (1/2)( xᵀQx + uᵀRu ) + pᵀ( Ax + Bu )

To obtain the minimising solution we need to solve:

    ∂H/∂u = 0  ⇒  Ru + Bᵀp = 0  ⇒  u = −R⁻¹Bᵀp

    ṗ = −∂H/∂x = −Qx − Aᵀp

    p(T) = ∂M(T)/∂x = S x(T)

together with the dynamic equations. These can be assembled together in matrix form as:

    ( ẋ )   (  A    −BR⁻¹Bᵀ ) ( x )
    ( ṗ ) = ( −Q    −Aᵀ     ) ( p )

subject to the boundary conditions x(0) = x0 and p(T) = S x(T). This is a two-point boundary value problem, with part of the boundary conditions specified at the initial time t = 0 (x(0) = x0) and the other part at the terminal time t = T (p(T) = S x(T)). The matrix defining the dynamic map has a special structure and is called a “Hamiltonian” matrix; the overall dynamic map defines a “Hamiltonian system”.

To solve the problem (inspired by the boundary condition p(T) = Sx(T)), assume that we have p(t) = P(t)x(t) for some unknown matrix function P(t) which satisfies P(T) = S. If we can find such a P(t), then the assumption is valid. Now differentiating this co-state relation we get:

    ṗ = Ṗx + Pẋ = Ṗx + P(Ax + Bu) = Ṗx + P(Ax − BR⁻¹Bᵀp)
      = Ṗx + PAx − PBR⁻¹Bᵀp = Ṗx + PAx − PBR⁻¹BᵀPx

Using the co-state equation ṗ = −Qx − Aᵀp = −Qx − AᵀPx gives

    −Ṗx = ( AᵀP + PA − PBR⁻¹BᵀP + Q ) x

for t ∈ [0 T]. Since this must hold for all state-trajectories given any x(0) = x0, it is necessary that:

    −Ṗ = AᵀP + PA − PBR⁻¹BᵀP + Q

for t ∈ [0 T]. This is a matrix differential equation (the Riccati equation) and if P(t) is its solution with final condition P(T) = S, then p(t) = P(t)x(t) for all t ∈ [0 T], so that our assumption is justified. Note that this can be solved backwards in time (e.g. by numerical integration) from the terminal condition P(T) = S. Note also that if the differential Riccati equation is transposed it remains the same except that P(t) is replaced by Pᵀ(t). Since the terminal condition is symmetric by assumption, i.e. P(T) = S = Sᵀ, this implies that Pᵀ = P for all 0 ≤ t ≤ T. It can further be shown that under the assumptions S = Sᵀ ≥ 0, Q = Qᵀ ≥ 0 and R = Rᵀ > 0, the solution P(t) to the differential Riccati equation is unique. Note that the solution to the problem is obtained in state-feedback form, i.e. u*(t) = −K(t)x(t), where K(t) denotes the Kalman gain K(t) = R⁻¹BᵀP(t). Note further that the Kalman gain is time-varying.

Of course we have not yet shown that the u*(t) obtained above is actually the minimizing solution, i.e. that it results in the smallest performance index among all admissible controls u(t) defined on the interval [0 T]. This follows by considering the following identities:

    xᵀ(T)P(T)x(T) − xᵀ(0)P(0)x(0) = ∫_0^T (d/dt)( xᵀPx ) dt
                                   = ∫_0^T ( ẋᵀPx + xᵀṖx + xᵀPẋ ) dt
                                   = ∫_0^T { xᵀ( Ṗ + AᵀP + PA )x + uᵀBᵀPx + xᵀPBu } dt

Using the fact that P(T) = S, the performance index can be written as:

    J[u] = (1/2) xᵀ(T) S x(T) + (1/2) ∫_0^T ( xᵀQx + uᵀRu ) dt
         = (1/2) xᵀ(0)P(0)x(0) + (1/2) ∫_0^T { xᵀ( Ṗ + AᵀP + PA + Q )x + uᵀBᵀPx + xᵀPBu + uᵀRu } dt
         = (1/2) xᵀ(0)P(0)x(0) + (1/2) ∫_0^T { xᵀPBR⁻¹BᵀPx + uᵀBᵀPx + xᵀPBu + uᵀRu } dt
         = (1/2) xᵀ(0)P(0)x(0) + (1/2) ∫_0^T ( u + R⁻¹BᵀPx )ᵀ R ( u + R⁻¹BᵀPx ) dt
         ≥ (1/2) xᵀ(0)P(0)x(0)

with equality if and only if u(t) = −R⁻¹BᵀP(t)x(t).

[Figure: Finite-horizon LQR block diagram. The plant ẋ = Ax + Bu (integrator 1/s with feedback through A and input through B) in closed loop with the time-varying state feedback u*(t) = −R⁻¹BᵀP(t)x(t).]

We summarize the results of this section with a Theorem:

Theorem 1: Given:

• A system model: ẋ(t) = Ax(t) + Bu(t), x(0) = x0, and

• A performance index J[u] = (1/2) xᵀ(T) S x(T) + (1/2) ∫_0^T ( xᵀ(t)Qx(t) + uᵀ(t)Ru(t) ) dt with S = Sᵀ ≥ 0, Q = Qᵀ ≥ 0 and R = Rᵀ > 0, then:

• The optimal feedback control in [0 T] is given as u*(t) = −K(t)x(t), where K(t) = R⁻¹BᵀP(t) and P(t) is the unique symmetric solution of the differential matrix Riccati equation −Ṗ = AᵀP + PA − PBR⁻¹BᵀP + Q with terminal condition P(T) = S. The corresponding optimal performance index is J[u*] = (1/2) x0ᵀ P(0) x0.
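The following sketch (an illustration under assumed data, not from the notes; it uses numpy/scipy) integrates the matrix Riccati equation backwards from P(T) = S for a small example and forms the Kalman gain K(t) = R⁻¹BᵀP(t) at t = 0.

    import numpy as np
    from scipy.integrate import solve_ivp

    # assumed example data: double integrator with scalar input
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    Q = np.eye(2); R = np.array([[1.0]]); S = np.zeros((2, 2))
    T = 5.0

    def riccati_rhs(t, p_flat):
        P = p_flat.reshape(2, 2)
        # Pdot obtained from -Pdot = A'P + PA - P B R^-1 B' P + Q
        dP = -(A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T) @ P + Q)
        return dP.ravel()

    # integrate backwards in time from t = T (where P(T) = S) down to t = 0
    sol = solve_ivp(riccati_rhs, [T, 0.0], S.ravel(), rtol=1e-8)
    P0 = sol.y[:, -1].reshape(2, 2)
    K0 = np.linalg.solve(R, B.T @ P0)        # Kalman gain K(0) = R^-1 B' P(0)
    print(P0, K0, sep="\n")                   # P(0) is symmetric; optimal cost is 0.5*x0' P(0) x0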

2.4 The infinite horizon linear/quadratic regulator

In this section we consider the infinite-horizon LQ optimal control problem. This is obtained from the finite-horizon problem by setting the terminal cost matrix S to zero and taking the limit T → ∞. In particular, we seek an optimal control signal u(t), t ≥ 0, which minimizes the performance index:

    J[u] = (1/2) ∫_0^∞ ( xᵀQx + uᵀRu ) dt

where Q = Qᵀ ≥ 0 and R = Rᵀ > 0, subject to the plant dynamics constraints ẋ(t) = Ax(t) + Bu(t), x(0) = x0. It will be shown that in this case there exists an optimal control which solves the problem, provided we make certain controllability and observability assumptions (which can be relaxed!). The optimal control is obtained in a time-invariant state-feedback form and depends on the solution of an algebraic matrix Riccati equation.

Let us denote by Π(t, T) (for t ∈ [0, T]) the solution of the Riccati differential equation with terminal condition Π(T, T) = P(T) = 0. We also write the cost function in the more explicit form J(u, x0, T) to emphasize the initial condition and time-horizon of the problem. The following Theorem establishes an important link between the solutions of the differential and algebraic Riccati equations:

Theorem 2: If Q = Qᵀ ≥ 0, R = Rᵀ > 0 and (A, B) is controllable, then the following limit exists:

    lim_{T→∞} Π(0, T) = P

Further, P = Pᵀ ≥ 0 and satisfies the algebraic Riccati equation:

    AᵀP + PA − PBR⁻¹BᵀP + Q = 0

If (A, Q) is observable, then P > 0.

Proof: The proof is broken into the following steps for clarity:

1. x0ᵀ Π(0, T) x0 is a monotone non-decreasing function of T.

2. x0ᵀ Π(0, T) x0 is bounded above for all T.

3. Π(0, T) tends to a limit P.

4. P satisfies the algebraic Riccati equation AᵀP + PA − PBR⁻¹BᵀP + Q = 0.

5. P = Pᵀ ≥ 0, and P > 0 if (A, Q) is observable.

1. x0ᵀ Π(0, T) x0 is a monotone non-decreasing function of T: Consider two terminal times T1 < T2. Denote by u(t) the optimal control for the finite-horizon problem on [0 T2] (i.e. u(t) = −R⁻¹BᵀΠ(t, T2)x(t)) and by ur(t) its restriction to [0 T1] (i.e. ur(t) = u(t) for 0 ≤ t ≤ T1, ur(t) = 0 for T1 ≤ t ≤ T2). Then:

    (1/2) x0ᵀ Π(0, T1) x0 ≤ J(ur(t), x0, T1)  (LHS is the optimal cost on [0 T1])
                          ≤ J(u(t), x0, T2)   (extra state-penalty cost on [T1 T2])
                          = (1/2) x0ᵀ Π(0, T2) x0  (u(t) assumed optimal on [0 T2])

Thus x0ᵀ Π(0, T1) x0 ≤ x0ᵀ Π(0, T2) x0 if T1 < T2.

2. x0ᵀ Π(0, T) x0 is bounded above for all T: Because of the controllability assumption we can find a (finite) control which drives the state to the origin in time 1, say. Clearly, for this control J0 = (1/2) ∫_0^1 ( xᵀQx + uᵀRu ) dt is finite. By taking the input to be zero for t ≥ 1, the state of the system stays at the origin after t = 1 (since ẋ = 0 for t ≥ 1) and hence J0 ≥ (1/2) x0ᵀ Π(0, T) x0 for every T > 1 (since the LHS still represents the cost over the extended horizon - note that the additional state and control cost over [1 T] is zero - while the RHS of the inequality is the minimum cost over [0, T]), and hence for any T > 0. Hence x0ᵀ Π(0, T) x0 is bounded above for all T.

[Figure: Step 1 - the optimal control u*(t) on [0 T2] and its truncation u*r(t) on [0 T1]. Step 2 - a control drives the state from x0 to x(1) = 0; for t ≥ 1, u(t) = 0 and x(t) = 0, so the optimal cost over [0 T] is bounded.]

3. Π(0, T) tends to a limit P: According to a classical theorem of analysis, since x0ᵀ Π(0, T) x0 is monotonically non-decreasing and bounded from above, it must converge as T → ∞. Taking x0ᵀ = [0 . . . 0 1 0 . . . 0], with the 1 in the i-th position, shows that the diagonal entries of Π(0, T) tend to a limit. Taking x0ᵀ = [0 . . . 0 1 0 . . . 1 0 . . . 0], with the 1’s in the i-th and j-th positions (i ≠ j), shows that

    2 Π(0, T)ij = x0ᵀ Π(0, T) x0 − Π(0, T)ii − Π(0, T)jj

also converges as T → ∞. This shows that Π(0, T) converges and we denote the limit by P.

4. P satisfies the algebraic Riccati equation AᵀP + PA − PBR⁻¹BᵀP + Q = 0: Consider the matrix

    AᵀΠ(0, T) + Π(0, T)A − Π(0, T)BR⁻¹BᵀΠ(0, T) + Q     (5)

which from part 3 must tend to the limit

    AᵀP + PA − PBR⁻¹BᵀP + Q     (6)

as T → ∞. However, expression (5) is equal to

    − (d/dt) Π(t, T) |_{t=0}     (7)

which tends to the limit (6) as T → ∞. However, since the Riccati differential equation is time-invariant, (7) is equal to

    − (d/dt) Π(t, 0) |_{t=−T}     (8)

where Π(t, 0), t ≤ 0, is the solution of the Riccati differential equation with the terminal condition shifted to time 0. Thus Π(−T, 0) (= Π(0, T)) and its derivative (8) both tend to a limit as T → ∞, and hence (8) must tend to the zero limit. Thus the matrix expression (6) is zero and P := lim_{T→∞} Π(0, T) satisfies the algebraic Riccati equation AᵀP + PA − PBR⁻¹BᵀP + Q = 0.

5. P = Pᵀ ≥ 0, and P > 0 if (A, Q) is observable: Since Π(0, T) is symmetric for all T, so is its limit P = lim_{T→∞} Π(0, T). Also P is non-negative definite since

    x0ᵀ P x0 ≥ x0ᵀ Π(0, T) x0 ≥ 0     (9)

for all x0 ∈ Rn. If x0ᵀ Π(0, T) x0 = 0 for some x0, then J[u*] = 0 (see the last part of Theorem 1) and hence

    ∫_0^T { xᵀ(t)Qx(t) + u*ᵀ(t)Ru*(t) } dt = 0

(note that we have taken P(T) = S = 0 here). Since R = Rᵀ > 0 this implies that u*(t) = 0 identically in the interval [0 T] and thus

    ∫_0^T xᵀ(t)Qx(t) dt = 0     (10)

Since u*(t) = 0 identically, the (optimal) state trajectory is x(t) = e^{At}x0, 0 ≤ t ≤ T. Thus (10) can be written as:

    x0ᵀ ( ∫_0^T e^{Aᵀt} Q e^{At} dt ) x0 = 0

which implies that x0 = 0 if (A, Q) is observable. Thus in this case (9) implies that P is positive definite. ∎

We can now prove the main result of this section:

Theorem 3: The minimising control for the performance index

    J[u] = (1/2) ∫_0^∞ ( xᵀQx + uᵀRu ) dt

where Q = Qᵀ ≥ 0, R = Rᵀ > 0, subject to the dynamic equations

    ẋ = Ax + Bu,  x(0) = x0

where (A, B) is assumed controllable and (A, Q) is assumed observable, is given by:

    u(t) = −R⁻¹BᵀPx(t) := −Kx(t)

where P is the unique positive definite solution of the algebraic Riccati equation

    AᵀP + PA − PBR⁻¹BᵀP + Q = 0

The value of the performance index corresponding to the optimal control is given by:

    J[u*] = (1/2) x0ᵀ P x0

Proof: We simply outline the first part of the proof, which is too technical to present in full: It can be shown that under the stated assumptions there is a unique positive-definite solution P to the algebraic Riccati equation; further, for this P all eigenvalues of the matrix A − BR⁻¹BᵀP have negative real parts (such a solution is called stabilising). It can next be shown that under the assumptions of the Theorem, the only candidate optimal controls are those for which ‖x(t)‖ → 0; for all other control signals the performance index does not converge (i.e. is infinite).

Assume that u(t) is chosen so that ‖x(t)‖ → 0. Controls of this type do exist: For example, setting u(t) = −R⁻¹BᵀPx(t) results in asymptotically stable closed-loop dynamics ẋ = (A − BR⁻¹BᵀP)x(t). Thus

    x(t) = exp{(A − BR⁻¹BᵀP)t} x0

and hence

    u(t) = −R⁻¹BᵀP exp{(A − BR⁻¹BᵀP)t} x0

which results in finite cost (actually optimal) since

    J[u] = (1/2) ∫_0^∞ ( xᵀQx + uᵀRu ) dt ≤ (1/2) ∫_0^∞ { λmax(Q)‖x(t)‖² + λmax(R)‖u(t)‖² } dt
         = (1/2) { λmax(Q)‖x(t)‖₂² + λmax(R)‖u(t)‖₂² } < ∞

where ‖x(t)‖₂ denotes the total energy of x(t) in [0, ∞), i.e.

    ‖x(t)‖₂² = Σ_{i=1}^{n} ∫_0^∞ xi²(t) dt

which is finite for all signals of the form x(t) = exp(Ft)x0 with F an exponentially stable matrix.

So next consider all control signals u(t) that result in state-trajectories x(t) (with x(0) = x0) such that ‖x(t)‖ → 0 as t → ∞. Then, similarly to the derivation in section 2.3,

    J[u] = (1/2) ∫_0^∞ ( xᵀQx + uᵀRu ) dt
         = (1/2) ∫_0^∞ [ xᵀ( −AᵀP − PA + PBR⁻¹BᵀP )x + uᵀRu ] dt
         = (1/2) ∫_0^∞ [ −(d/dt)( xᵀPx ) + uᵀBᵀPx + xᵀPBu + xᵀPBR⁻¹BᵀPx + uᵀRu ] dt
         = (1/2) x0ᵀ P x0 + (1/2) ∫_0^∞ ( u + R⁻¹BᵀPx )ᵀ R ( u + R⁻¹BᵀPx ) dt
         ≥ (1/2) x0ᵀ P x0

with equality if and only if u(t) = −R⁻¹BᵀPx(t). ∎

Note that in order to minimize technicalities in the proofs, the limiting behaviour of

the finite horizon LQR problem was derived in this section under an unnecessarily

strong assumption, i.e. that (A,B) is controllable. Moreover, it was shown that (A,Q)

observable is sufficient for P to be positive definite (which is not necessary). These

may be strengthened as follows: The ARE equation has a unique, stabilising positive

definite solution which is the steady-state solution of the differential Riccati equation

if and only if (A,B) is stabilizable and (A,Q) is detectable. The relaxed conditions

are clarified in the following section.

2.5 The algebraic Riccati equation and its solution

The solution of the infinite-horizon LQR problem reduces to the solution of the algebraic Riccati equation:

    AᵀP + PA − PBR⁻¹BᵀP + Q = 0

where R = Rᵀ > 0 and Q = Qᵀ ≥ 0. Here we will investigate all solutions of this equation. To simplify the presentation we will factor R = R^(1/2)R^(1/2) with R^(1/2) symmetric and positive definite, and will redefine BR^(−1/2) → B. We will also factor Q = CᵀC, which is always possible since Q is symmetric and positive semidefinite. Thus the Riccati equation simplifies to:

    AᵀP + PA − PBBᵀP + CᵀC = 0

Apart from developing a method to generate all solutions of the algebraic Riccati equation, we will be especially interested in the stabilising solution. This is the solution for which the matrix A − BBᵀP is an (asymptotically) stable matrix (all eigenvalues have negative real parts). This is important because A − BBᵀP (or A − BR⁻¹BᵀP in our earlier set-up) is the closed-loop “A”-matrix when we use the optimal state feedback law u*(t) = −R⁻¹BᵀPx(t).

Associated with the algebraic Riccati equation (in which A is an n × n matrix) is the 2n × 2n Hamiltonian matrix:

    H = (  A      −BBᵀ )
        ( −CᵀC    −Aᵀ  )

which was considered earlier in conjunction with the finite-horizon problem. We will develop solvability conditions for the ARE in relation to the properties of H. Along the way, we will derive tighter conditions for the solvability of the LQR problem than those derived so far (by relaxing the assumptions on controllability and observability made earlier).

It is first useful to note that the spectrum of the Hamiltonian, σ(H), is symmetric about the imaginary axis. To see this, introduce the 2n × 2n matrix:

    J = ( O    −In )
        ( In    O  )

Then it is easily verified that J² = −I2n and that J⁻¹HJ = −JHJ = −Hᵀ. Thus H and −Hᵀ are similar, and hence if λ is an eigenvalue of H, so is −λ.
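A small numerical sketch (illustrative, with assumed random data; it uses numpy) confirms the two facts just used: J² = −I, J⁻¹HJ = −Hᵀ, and the pairing of eigenvalues λ, −λ.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 3
    A = rng.standard_normal((n, n))
    B = rng.standard_normal((n, 1))
    C = rng.standard_normal((1, n))

    H = np.block([[A, -B @ B.T], [-C.T @ C, -A.T]])           # 2n x 2n Hamiltonian matrix
    J = np.block([[np.zeros((n, n)), -np.eye(n)], [np.eye(n), np.zeros((n, n))]])

    print(np.allclose(J @ J, -np.eye(2*n)))                    # J^2 = -I
    print(np.allclose(np.linalg.solve(J, H @ J), -H.T))        # J^-1 H J = -H^T
    eig = np.linalg.eigvals(H)
    print(all(np.min(np.abs(eig + lam)) < 1e-8 for lam in eig))  # for every eigenvalue lam, -lam is also one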

34 of 58

Page 35: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Definition: A subspace S ⊆ Cn is called invariant for a linear transformation A

(or for a matrix A) if AS ⊆ S. Examples of invariant subspaces is the linear

span of the eigenvectors corresponding to n distinct eigenvalues, or the span of the

generalised eigenvectors corresponding to a multiple eigenvalue, provided that all lower

rank generalised eigenvectors are included. For example, suppose that λ in a multiple

eigenvalue of A and let {x1, . . . , xr} be the corresponding eigenvector and generalised

eigenvectors:

(A− λI)x1 = 0

(A− λI)x2 = x1

...

(A− λI)xr = xr−1

Then the subspace S spanned by the {x1, x2, . . . , xt} (t ≤ r) is A-invariant. Conversely,

if S is non-trivial and A-invariant, then there exists a x ∈ S and λ ∈ C such that

Ax = λx. An A-invariant subspace S ⊆ Cn is called a stable (resp. anti-stable) A-

invariant subspace if all eigenvalues of A constrained to S have negative (resp. positive)

real parts.

The next theorem gives a method for constructing solutions to the ARE:

Theorem 4: Let V ⊆ C2n be an n-dimensional invariant subspace of H, and let P1,

P2 ∈ Cn×n be two complex‘matrices such that

V = Im

(P1

P2

)

If P1 is invertible, then P := P2P−11 is a solution to the ARE and σ(A − BBT P ) =

σ(H|V). Further, the solution P is independent of a specific choice of bases of V .

Proof: Since V is H-invariant, there exists Λ ∈ Cn×n such that

(A −BBT

−CT C −AT

) (P1

P2

)=

(P1

P2

Post-multiplying by P−11 ,

(A −BBT

−CT C −AT

)(In

P

)=

(I

P

)P1ΛP−1

1

35 of 58

Page 36: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Pre-multiplying by [−P In]:

0 =(−P I

) (A −BBT

−CT C −AT

)(In

P

)

=(−PA− CT C PBBT − AT

) (In

P

)

= −PA− AT P + PBBT P − CT C

which shows that P is indeed a solution of the ARE. In addition,

A−BBT P = P1ΛP−11

so that σ(A− BBT P ) = σ(Λ). But by definition Λ is a matrix representation of H|Vso that σ(A−BBT P ) = σ(H|V). Finally note that any other basis spanning V can be

represented as: (P1

P2

)X =

(P1X

P2X

)

for some non-singular matrix X. Since (P2X)(P1X)−1 = P2P−11 the solution is

independent of the choice of basis for V . ¥

The following Theorem shows that the converse implication of Theorem 4 also holds,

i.e. that every solution of the ARE can be generated in the manner suggested by

Theorem 4.

Theorem 5: If P ∈ Cn×n is a solution of the ARE, then there are matrices

P1, P2 ∈ Cn×n, with P1 invertible, such that P = P2P−11 and the columns of [P T

1 P T2 ]T

form a basis of an n-invariant subspace of H.

Proof: Let Λ = A−BBT P . Multiplying this by P gives:

PΛ = PA− PBBT P = −CT C − AT P

since P solves the ARE. Write these two equations in matrix form as:(

A −BBT

−CT C −AT

)(I

P

)=

(I

P

Hence, the columns of [I P T ]T span an n-dimensional invariant subspace of H. Defining

P1 = In, P2 = P completes the proof. ¥

36 of 58

Page 37: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Theorems 4 and 5 above suggest the following method for finding the stabilising solution

of the ARE:

• Find n linearly independent vectors spanning the stable invariant subspace of H

and stack them in a 2n× n matrix, partitioned as:

(P1

P2

)

with P1, P2 square (the eigenvectors/generalised eigenvectors corresponding to

the stable eigenvalues of H would do). Then P = P2P−11 is the required solution.

This algorithm works provided that: (i) H does not have eigenvalues on the imaginary-

axis (for otherwise it is impossible to choose an n-dimensional stable invariant subspace

of H - recall symmetry property of the spectrum of H), (ii) P1 is invertible. Provided

these two conditions hold Theorems 4 and 5 guarantee that a stabilizing solution exists

and that it is unique. We will see that both conditions are guaranteed by relatively

mild assumptions. We will also establish some additional properties of the stabilising

solution (symmetry, positive semi-definiteness/definiteness).

Example: Consider the ARE with matrices:

A =

(0 0

1 0

), B =

(2

1

)and C =

(1 1

)

The Hamiltonean matrix is:

H =

0 0 −4 −2

1 0 −2 −1

−1 1 0 −1

1 −1 0 0

The eigenvalues of H are {−1.118 + 0.866j,−1.118− 0.866j, 1.118 + 0.866j,−1.118 +

0.866j} and the corresponding eigenvectors are the columns of the eigenvector matrix:

V =

−0.690 −0.690 −0.568− 0.232j −0.568 + 0.232j

0.040 + 0.299j 0.040− 0.299j −0.702 −0.702

−0.332− 0.092j −0.332 + 0.092j 0.121 + 0.281j 0.121− 0.281j

0.279 + 0.484j 0.279− 0.484j −0.025− 0.187j −0.025 + 0.187j

37 of 58

Page 38: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

−1.5 −1 −0.5 0 0.5 1 1.5−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

real

imag

−a+jb a+jb

−a−jb a−jb

O

Eigenvalues of Hamiltonean

To generate all solutions of the ARE we need to generate all 2-dimensional H-invariant

subspaces. Let Vij be the matrix formed by the i and j eigenvectors. There are 6

combinations: V12, V13, V14, V23, V24 and V34. Here are some of them:

(1) {λ1, λ2} = {−1.118 + 0.866j,−1.118− 0.866j}. Here:

V12 =

[P1

P2

]=

0.690 −0.690

0.040 + 0.299j 0.040− 0.299j

−0.332− 0.092j −0.332 + 0.092j

0.279 + 0.484j 0.279− 0.484j

P = P2P−11 =

[0.463 −0.309

−0.309 1.618

]

which is real, symmetric and positive-definite. This is the stabilising solution.

(2) {λ1, λ3} = {−1.118 + 0.866j, 1.118 + 0.866j}. Here:

V13 =

−0.6908 −0.5681− 0.2323j

0.0408 + 0.2991j −0.7022

−0.3328− 0.0924j 0.1213 + 0.2819j

0.2794 + 0.4840j −0.0256− 0.1879j

, P =

[0.625− 0.216j −0.750− 0.433j

−0.750− 0.433j 0.500 + 0.866j

]

38 of 58

Page 39: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Note that splitting a pair of complex-conjugate eigenvalues may result in solutions

which are not real.

(3) {λ3, λ4} = {1.118 + 0.866j, 1.118− 0.866j}. Here:

V34 =

−0.568− 0.232j −0.568 + 0.232j

−0.702 −0.702

0.121 + 0.281j 0.121− 0.281j

−0.025− 0.187j −0.025 + 0.187j

, P =

[−1.213 0.809

0.809 −0.618

]

which is a real, symmetric negative-definite and de-stabilising solution. ¥

Definition: Assume that H has no eigenvalues on the imaginary axis. Then from the

symmetry property of the spectrum of H we must have n eigenvalues in the left-half

plane and n eigenvalues in the right-half plane (counted according to their algebraic

multiplicity). We call this the “stability property”. Consider the two spectral subspaces

P−(H) and P+(H) corresponding to the eigenvalues in the left and right half planes

respectively. Find a basis for P−(H) and stack the basis vectors in a 2n× n matrix,

P− = Im

(P1

P2

)

where P1 and P2 are square. If P1 is nonsingular, or equivalently the two subspaces

P−(H) and Im

(On×n

In

)

are complementary (“complementarity property”) we can set P = P2P−11 . Then P is

the unique stabilising solution of the ARE, i.e. H → X is a function, which we will

denote as Ric(·). The domain of this function, dom(Ric), consists of all Hamiltonean

matrices (of the form defined above), which satisfy the two properties of stability and

complementarity.

Theorem 6: Suppose that H ∈ dom(Ric) and P = Ric(H). Then (i) P is (real)

symmetric, (ii) P satisfies the ARE AT P + PA − PBBT P + CT C = 0, and (iii)

A−BBT P is stable.

Proof: (i) Let P1 and P2 be as defined above. It is first shown that P T1 P2 is symmetric.

To prove this note that there exists a stable matrix H− ∈ Rn×n such that:

H

(P1

P2

)=

(P1

P2

)H−

39 of 58

Page 40: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

(H− is a matrix representation of H|P−(H)). Pre-multiply this equation by [P ?1 P ?

2 ]J to

get:(

P ?1 P ?

2

)JH

(P1

P2

)=

(P ?

1 P ?2

)J

(P1

P2

)H−

Since JH is symmetric, so is the LHS of this equation, and hence also the RHS:

(−P ?1 P2 + P ?

2 P1)H− = HT−(−P ?

1 P2 + P ?2 P1)

? = −HT−(−P ?

1 P2 + P ?2 P1)

?

= −HT−(−P ?

1 P2 + P ?2 P1)

This is a Lyapunov equation. Since H− is stable, the unique solution is:

−P ?1 P2 + P ?

2 P1 = 0

Thus P ?1 P2 = P ?

2 P1. Since P1 is non-singular, P = (P−11 )?(P ?

1 P2)P−11 is Hermitian. To

show that P is actually real (symmetric) note that since H is real, a “basis-matrix”

[P T1 P T

2 ]T for P−(H) may be chosen real.

(ii) Start with the equation

H

(P1

P2

)=

(P1

P2

)H−

and post-multiply by P−11 to get:

H

(I

P

)=

(I

P

)P1H−P−1

1 (11)

Next pre-multiply by [P − I]:

(P −I

)H

(I

P

)= 0 ⇒

(P −I

) (A −BBT

−CCT −AT

)(I

P

)= 0

which implies that AT P + PA− PBBT B + CT C = 0.

(iii) Pre-multiply (11) by [I 0] to get:

A−BBT P = P1H−P−11

Thus A−BBT P is stable because H− is. ¥

The following two results give necessary and sufficient conditions for the existence of

a unique stabilising solution to the ARE:

40 of 58

Page 41: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Theorem 7: Suppose that H has no imaginary axis eigenvalues. Then H ∈ dom(Ric)

if and only if (A,B) is stabilisable.

Proof: Assume that (A,B) is stabilisable. To prove that H ∈ dom(Ric) we must show

that:

P−(H), Im

(0

I

)

are complementary. Define P1, P2 and H− so that

P−(H) = Im

(P1

P2

)

or, (A −BBT

−CT C −AT

)(P1

P2

)=

(P1

P2

)H− (12)

We want to show that P1 is nonsingular, i.e. that Ker(P1) = {0}. First it is claimed

that Ker(P1) is H−-invariant. Let x ∈ Ker(P1) so that P1x = 0. Pre-multiply (12) by

[I O] to get

AP1 −BBT P2 = P1H− (13)

Pre-multiplying by x?P ?2 and post-multiplying by x gives: x?P ?

2 AP1x−x?P2BBT P2x =

x?P ?2 P1H−x or x?P2BBT P2x = x?P ?

1 P2H−x = 0 using the facts that P ?1 P2 is symmetric

and P1x = 0. Thus x?P ?2 BBT P2x = 0, which implies that BT P2x = 0. Now multiply

(13) by x to get P1H−x = 0, i.e. that H−x ∈ Ker(P1) which proves the claim. Now,

to prove that P1 is nonsingular, suppose for contradiction that Ker(P1) 6= {0}. Then,

H−|Ker(P1) has an eigenvalue, λ and an eigenvector x, such that H−x = λx, Re(λ) < 0,

0 6= x ∈ Ker(P1). Pre-multiply (12) by [0 I] to get −CT CP1 − AT P2 = P2H−.

Post-multiply by x to get (AT + λI)P2x = 0. Recalling that BT P2x = 0 gives

x?P ?2 [A + λI B] = 0. Then, stabilisability of (A,B) implies that P2x = 0. But if

both P1x = 0 and P2x = 0, then x = 0 since [P T1 P T

2 ]T has full column rank, which is

a contradiction.

Assume next that H ∈ dom(Ric). Then P is a stabilising solution and A − BBT P is

asymptotically stable, and hence (A,B) is stabilisable. ¥

Theorem 8:

H ∈ dom(Ric) if and only if (A,B) is stabilisable and (A,C) has no unobservable

modes on the imaginary axis. Furthermore, P = Ric(H) ≥ 0 if H ∈ dom(Ric) and

P > 0 if and only if (A, C) has no stable unobservable modes.

41 of 58

Page 42: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Proof: We first show that if H has an eigenvalue on the imaginary axis, then this

must be either an uncontrollable mode of (A, B) or an unobservable mode of (A,C)

(or both). Let jω be an eigenvalue of H and [xT yT ]T the corresponding eigenvector.

Then: (A −BBT

−CT C −AT

) (x

y

)= jω

(x

y

)

or

Ax−BBT y = jωx and − CT Cx− AT y = jωy

Pre-multiplying the first equation by y? and the second by x? gives:

y?Ax− y?BBT y = jωy?x and − x?CT Cx− x?AT y = jωx?y

Conjugating the first equation above gives

x?AT y − y?BBT y = −jωx?y

and adding this to the second we obtain:

y?BBT y + x?CT Cx = 0 ⇒ y?B = 0 and Cx = 0

Hence:

y?(

jωI − A B)

= 0 and

(jωI − A

C

)x = 0

Since x and y cannot be 0 simultaneously, jω is either and uncontrollable mode of

(A,B) or an unobservable mode of (A,C) (or both).

Assume now that H ∈ dom(Ric). Then H has no imaginary axis eigenvalues and

hence from the above derivation (A,C) has no unobservable modes on the imaginary

axis; moreover from Theorem 8, (A,B) is stabilisable. Conversely suppose that (A,C)

has no unobservable modes on the imaginary axis and (A, B) stabilisable. Then H

cannot have any imaginary axis eigenvalues (for if it did then this would have to be an

uncontrollable mode of (A,B) and hence (A,B) would not be stabilisable). Again from

Theorem 8 stabilisability of (A,B) implies that H ∈ dom(Ric), hence establishing the

equivalence of the two statements.

Next, let P = Ric(H). We need to show that P ≥ 0. The Riccati equation is:

AT P + PA− PBBT P + CT C = 0 (14)

42 of 58

Page 43: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

or equivalently

(A−BBT P )T P + P (A−BBT P ) +(

PB CT) (

BT P

C

)= 0 (15)

with A−BBT P stable (see Theorem 6). Then, from standard Lyapunov theory1,

P =

∫ ∞

0

e(A−BBT P )T t(PBBT P + CT C)e(A−BBT )tdt

and hence P ≥ 0 because PBBT P + CT C ≥ 0.

Finally, we need to show that P is nonsingular if and only if (A,C) has stable

unobservable modes. Let x ∈ Ker(P ) so that Px = 0. Pre-multiply (14) by x?

and post-multiply by x to get Cx = 0. No post-multiply again (14) by x to get

PAx = 0. Thus Ker(P ) is an A-invariant subspace. Now if Ker(A) 6= {0}, then there

is an x ∈ Ker(P ), x 6= 0 and a λ such that λx = Ax = (A − BBT P )x and Cx = 0.

Since A−BBT P is stable, Re(λ) < 0; thus λ is a stable unobservable mode of (A,C).

Conversely, suppose that (A,C) has a stable unobservable mode λ, i.e. there is an

x 6= 0 such that Ax = λx and Cx = 0. By pre-multiplying the Riccati by x? and

post-multiplying by x we get

2Re(λ)x?Px− x?PBBT Px = 0

and hence x?Px = 0, i.e. P is singular. ¥

The following result is an immediate consequence of Theorem 8 and defines the standard

assumptions for the solution of the LQR problem:

Theorem 9: Suppose that (A,B) is stabilisable and (A,C) is detectable. Then the

ARE

AT P + PA− PBBT P + CT C = 0

has a unique positive semi-definite solution. Moreover this solution is stabilising.

1Consider the system x = Ax, with A stable, x(0) = x0 arbitrary and let P satisfy the

Lyapunov equation PA + AT P + Q = 0. Let V (x(t)) = xT Px. Then V (x) = xT Px + xT Px =

xT (AT P +PA)x = −xT Qx. Integrating along a trajectory gives V (∞)−V (0) = − ∫∞0

xT (t)Qx(t)dt =

−xT0 (

∫∞0

eAT tQeAtdt)x0. Since V (0) = xT0 Px0 and V (∞) = 0 we get xT

0 Px0 = xT0 (

∫∞0

eAT tQeAtdt)x0

and hence P =∫∞0

eAT tQeAtdt since x0 is arbitrary.

43 of 58

Page 44: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Proof: For Theorem 8 it follows that the ARE has a unique stabilising solution which is

positive semi-definite. Thus it suffices to show that ever positive semi-definite solution

P ≥ 0 is stabilising. Re-write the ARE as

(A−BBT P )T P + P (A−BBT P ) + PBBT P + CT C = 0 (16)

and let λ and x be an unstable eigenvalue and the corresponding eigenvector of

A − BBT P , respectively, i.e. (A − BBT P )x = λx. Next, pre-multiply and post-

multiply (16) by x? and x respectively to give

(λ + λ)x?Px + x?(PBBT P + CT C)x = 0

which implies, (since λ+λ ≥ 0 and P ≥ 0) that BT P = 0 and Cx = 0. Thus Ax = λx

and Cx = 0 so that λ is an unstable unobservable mode of (A,C), a contradiction.

Hence Re(λ) < 0, i.e. P ≥ 0 is the stabilising solution. ¥

2.5 LQR problems with cross state-control penalty term

A slight modification to the infinite horizon LQR problem allows for a more general

cost function which includes a cross-term between control and state-variables. In this

formulation, the performance index to be minimised is given as:

J [u] =1

2

∫ ∞

0

(xT (t)Qx(t) + 2xT (t)Nu(t) + uT (t)Ru(t))dt

This may be put into more compact form as:

J [u] =1

2

∫ ∞

0

(xT (t) uT (t)

) (Q N

NT R

)(x(t)

u(t)

)dt

Here we require that R = RT > 0 and(

Q N

NT R

)≥ 0

The optimal solution is given by the modified state-feedback law:

u?(t) = −R−1(NT + BT P )x(t)

where P is the unique stabilizing solution of the (modified) ARE:

P (A−BR−1NT ) + (A−BR−1NT )T P − PBR−1BT P + Q−NR−1NT = 0

The standard assumptions for the existence of a (unique) stabilising solution is that

(A,B) is stabilisable and that (A,Q − NR−1NT ) detectable. Note that under the

stated assumptions we have Q = QT ≥ 0 and Q−NR−1NT ≥ 0.

44 of 58

Page 45: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

2.6 Deterministic state-observers

One of the main disadvantages in the formulation of the LQR problem (both finite

and infinite-horizon) is the assumption that the state-vector x(t) is available for

feedback. (Recall that the optimal control signal is given in state-feedback form,

u?(t) = −R−1BT P (t)x(t)). This is an important limitation for high-order systems

having a large number of states, since it implies that every state-variable is measured

by a sensor. In most practical cases, however, only a few state-variables (or a few linear

combinations of the state variables) are measurable.

The problem is thus independent from optimality and arises from the distinction

between state and output feedback. We have seen that if the pair (A,B) is controllable,

the eigenvalues of the closed-loop dynamics A + BF may be assigned at arbitrary

locations of the complex plane (subject to real-symmetry constraints, assuming that

no limitation is placed on the magnitude of the control signal and other practical

constraints). This is a very strong result which gives the designer extreme flexibility in

shaping the closed-loop characteristics. Although we can not expect to retain the same

degree of flexibility when output feedback is used, we would like to retain as much

as possible of the state-feedback approach when designing dynamic output-feedback

controllers. This can be achieved by using state-observers.

A state-observer is a dynamic filter, driven by the input and output signals of the

plant and producing as outputs “estimates” of the state-variables of the system to

which it is connected. When the plant operates in open-loop, we normally settle for

asymptotic tracking of the system’s states (if the system’s dynamics are exactly known).

In general, if the states can be tracked sufficiently fast and accurately, then its seems

sensible to use the estimates in the place of the “real” states (which are inaccessible)

in a state-feedback control scheme.

Under what conditions do we expect to be able reconstruct the system’s states? Clearly

a necessary condition is observability: If the system is unobservable, then at least

one initial state-variable can not be determined uniquely, and thus neither can its

subsequent trajectory. The observability condition turns out to be sufficient as well: If

the system is observable, then the initial state-vector x0 can be determined from the

output data, recorded over a finite time-interval, e.g. by inverting the observability

grammian. (Note that the control signal u(t) is assumed known over the time interval

45 of 58

Page 46: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

in question). Then, future states could be obtained from the state-equation

x(t) = eAtx0 +

∫ t

0

eA(t−τ)Bu(τ)dτ

Although this method (“open-loop observer”) is conceptually fine, and would result in

principle to perfect state-reconstruction, it is clumsy and would not work in practice

for various reasons, mainly due to its inability to take into account disturbances,

measurement noise and model errors (e.g think of the effect of such an observer in

tracking the state of the system x = (a+ε)x(t) with a > 0 and ε arbitrarily small when

the plant model ignores ε). Instead, closed-loop (asymptotic) observers are typically

used. Suppose that the system’s equations are:

x(t) = Ax(t) + Bu(t), y(t) = Cx(t) + Du(t)

with initial condition x(0) = x0. Define the observer via as the dynamic system:

˙x(t) = Ax(t) + Bu(t) + L(y(t)− y(t)), y(t) = Cx(t) + Du(t)

with initial condition x(0) = x0, L being the observer gain. Denote the state estimation

error by e(t) = x(t)− x(t). Then:

e(t) = x(t)− ˙x(t) = Ax + Bu− Ax−Bu− L(Cx + Du) + L(Cx + Du)

= A(x− x)− LC(x− x) = (A− LC)e(t)

If (A,C) is detectable, L can be chosen so that A − LC is asymptotically stable. If

(A,C) is observable, the eigenvalues of A−LC may be assigned at arbitrary locations

in the left-half plane (subject to symmetry constraints relative to the real axis). In

either case, if A− LC is asymptotically stable the error satisfies

e(t) = e(A−LC)te(0) → 0 as t →∞

where e(0) = x0 − x(0), and hence ‖x(t)− x(t)‖ → 0, i.e the state-estimates track the

state variables asymptotically. Now suppose that we use state feedback on the state-

estimates x(t) rather than on the states themselves, i.e. generate the control input as

u(t) = −Fx(t) where F is chosen so that A−BF is asymptotically stable. Then:

x(t) = Ax(t)−BFx(t) = Ax(t)−BF (x(t)− e(t)) = (A−BF )x(t) + BFe(t)

and the closed-loop dynamics can be written as:(

x(t)

e(t)

)=

(A−BF BF

0 A− LC

)(x(t)

e(t)

)

46 of 58

Page 47: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

and hence the spectrum of the closed-loop system “A”-matrix is σ(A−BF )∪σ(A−LC).

This is the simplest form of the “separation principle” which implies that the observer

and state-feedback dynamics can be assigned independently (under controllability and

observability assumptions). Typically, the eigenvalues of the observer dynamics A−LC

are assigned to faster modes relative to the state-feedback eigenvalues A−BF so that

the estimates track the states accurately while the response is still in the transient

region.

A

A

D

D

B C1/su(t)

L

y(t)

B 1/s C

x(t)

x(t)^

y(t)^

F

+_

+

++

+

+++

Step 1: Observer/state-feedback structure

2.7 Optimal state estimation: The Kalman filter

The Kalman filter is a stochastic optimal state estimator. The problem can be

formulated both on a finite and infinite horizon, similarly to the LQR problem.

Assume that continuous plant equations have the form:

x(t) = Ax(t) + Bu(t) + Gv(t) y(t) = Cx(t) + w(t)

47 of 58

Page 48: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Here v(t) and w(t) are zero-mean, white, mutually uncorrelated noise signals

representing process and measurement noise, respectively. We further assume:

E[v(t)vT (τ)] = V δ(t− τ), E[w(t)wT (τ)] = Wδ(t− τ)

where δ(t) is the Dirac delta function and E[w(t)vT (τ)] = 0. Further assume that

E[x(t0)] = x0, which is uncorrelated with the process and measurement noise:

E[x0vT (τ)] = 0 E[x0w

T (τ)] = 0

for all τ and that the covariance of the initial state is

E{[x0 − E(x0)][x0 − E(x0)]

T}

= P0

The state-estimator is assumed to be linear and to have the standard state-estimator

structure:

˙x(t) = Ax(t) + Bu(t) + L(t)[y(t)− cx(t)]

where the gain L(t) is assumed to be time-varying. First consider the estimation error

e(t) = x− ˙x(t)

= Ax + Bu + Gv − Ax−Bu− L(Cx + w − Cx)

= (A− LC)e(t) + Gv(t)− Lw(t)

This is a linear time-varying system, whose solution is:

e(t) = Φ(t, 0)e(0) +

∫ t

0

Φ(t, τ)[Gv(τ)− Lw(τ)]dτ

where Φ(t, 0) is the state-transition matrix of the error system. [Recall that the

solution of a time-varying system x(t) = A(t)x(t) (with A(t) sufficiently smooth) is

obtained in terms of the state-transition matrix Φ(t, τ) = X(t)X−1(τ), where X(t)

is the fundamental solution matrix (whose columns consist of n-linearly independent

solutions of the differential equation), as x(t) = Φ(t, t0)x(t0). The state transition

matrix satisfies the “semigroup” equations: (i) Φ(t, t) = I, Φ(t, τ) = −Φ−1(τ, t) and

Φ(t2, t1)Φ(t1, t0) = Φ(t2, t0). Further, Φ(t, τ) := ddt

(Φ(t, τ)) = A(t)Φ(t, τ) which is used

below].

The error covariance is:

P (t) = E[e(t)eT (τ)] =

E

{[Φ(t, 0)e(0) +

∫ t

0

Φ(t, τ)[Gv(τ)− Lw(τ)]dτ

] [Φ(t, 0)e(0) +

∫ t

0

Φ(t, s)[Gv(s)− Lw(s)]ds)

]T}

48 of 58

Page 49: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Noting that the expectation of the cross-terms is zero,

P (t) = E

[Φ(t, 0)e(0)eT (0)ΦT (t, 0) +

∫ t

0

∫ t

0

Φ(t, τ)[Gv(τ)vT (s)GT + Lw(τ)wT (s)LT ]ΦT (t, s)dτds

]

= Φ(t, 0)P0ΦT (t, 0) +

∫ t

0

∫ t

0

Φ(t, τ)[GV δ(τ − s)GT + LWδ(τ − s)LT ]ΦT (t, s)dτds

= Φ(t, 0)P0ΦT (t, 0) +

∫ t

0

Φ(t, s)[GV GT + LWLT ]ΦT (t, s)ds

Next we differentiate the error covariance using the general formula:

d

dt

[∫ h(t)

g(t)

f(s, t)ds

]=

∫ h(t)

g(t)

∂tf(s, t)ds + f(h(t), t)h(t)− f(g(t), t)g(t)

which gives in our case:

P (t) = Φ(t, 0)P0ΦT (t, 0) + Φ(t, 0)P0Φ

T (t, 0)

+

∫ T

0

[Φ(t, s)(GV GT + LWLT )ΦT (t, s) + Φ(t, s)(GV GT + LWLT )ΦT (t, s)]ds

+ GV GT + LWLT

= (A− LC)Φ(t, 0)P0ΦT (t, 0) + Φ(t, 0)P0Φ

T (t, 0)(A− LC)T

+ (A− LC)

∫ t

0

[Φ(t, s)(GV GT + LWLT )ΦT (t, s)]ds

+

∫ t

0

[Φ(t, s)(GV GT + LWLT )ΦT (t, s)]ds (A− LC)T

+ GV GT + LWLT

= (A− LC)P (t) + P (t)(A− LC)T + GV GT + LWLT

To obtain the optimal observer we need to minimize the mean-square error at any time

t. Now,

E[eT (t)e(t)] = trace{E[e(t)eT (t)]} = trace{P (t)}To carry out the minimisation, we need to make the derivative P as large as possible

(and negative) at any given instant; then P (t) would decrease at a maximal rate and

P (t) itself would be minimised over time. Thus we need to minimise over L(t) at any

t,

trace{P (t)} = trace{(A− LC)P + P (A− LC)T + GV GT + LWLT}

49 of 58

Page 50: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Setting the derivative to zero,

∂L[trace(P )] =

∂L

{trace

[(A− LC)P + P (A− LC)T + GV GT + LWLT

]}

=∂

∂Ltrace(−LCP ) +

∂Ltrace(−PCT LT ) +

∂Ltrace(LWLT )

= −2P (t)CT + 2L(t)W

= 0

and hence L(t) = P (t)CT W−1 which is the time-varying Kalman gain. Note that the

error covariance dynamics:

P (t) = (A− LC)P (t) + P (t)(A− LC)T + GV GT + LWLT

= AP (t)− P (t)CT W−1CP (t) + P (t)A− P (t)CT W−1CP (t) + GV GT + P (t)CT W−1CP (t)

= AP (t) + P (t)AT − P (t)CT W−1CP (t) + GV GT

satisfies a differential matrix Riccati equation (integrated forward from the initial

condition P (0) = P0 which is the dual equation of the LQR problem.

The similarity of the Kalman filter with the solution of the LQR problem are clear;

these are summarised below (in fact the two problems are dual of each other):

LQR controller: −P = PA + AT P − PBR−1BT P + Q, K = −R−1BT P

Kalman filter: P = AP + PAT − PCT W−1CP + GV GT , L = PCT W−1.

Note that the Riccati equation of the LQR problem is integrated backwards in time

(from the terminal solution P (T ) = S), whereas the error covariance P in the Kalman

filter is integrated forward in time (from the initial error covariance P0). In the same

way that we obtained the solution to the infinite-horizon LQR problem, we can define

the stationary Kalman filter, which minimises:

limt→∞

E[(x(t)− x(t))(x(t)− x(t))T ]

The stationary covariance may be obtained by setting P = 0 in the Kalman differential

Riccati equation, giving the stationary error covariance as the solution of the algebraic

Riccati equation:

AP + PAT − PCT W−1CP + GV GT = 0

It may be shown that the ARE has a unique, stabilising, positive definite solution P

that is the steady-state solution of the Kalman differential Riccati equation if and only

if (A,C) is detectable and (A,GV 1/2) is stabilisable.

50 of 58

Page 51: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

It should also be noted that the “output-injection” structure of the observer used

here does not limit the optimality of the Kalman filter. It may shown using Youla’s

parametrisation of all stabilising controllers, that every stabilising controller of the

plant can be generated by such an observer structure, possibly augmented with

additional dynamics. Thus the Kalman filter is the optimal state estimator (in the

mean-square error sense) among all linear state-estimator filters. If in addition the

noise processes are assumed to be Gaussian, the Kalman filter can be shown to be

optimal among the class of all (causal) non-linear filters as well.

2.8 The LQG optimal control problem and the “Certainty Equivalence

Principle”

The standard LQG optimal control problem is a stochastic version of the LQR problem:

Given a continuous-time system:

x(t) = Ax(t) + Bu(t) + Gv(t), y(t) = Cx(t) + Du(t) + w(t)

in which v(t) and w(t) are white noise processes satisfying the conditions given in

section 2.7, the objective is to obtain the optimal-control signal which minimises the

objective function:

J [u] = E

{1

2xT (T )Sx(T ) +

1

2

∫ T

0

[xT (t)Qx(t) + uT (t)Ru(t)]dt

}

where E[·] denotes the expectation operator and S, Q and R satisfy the usual

assumptions. The solution to the problem is provided by the “separation” or “certainty-

equivalence” principle which states that the optimal solution can be decomposed into

two separate sub-problems:

• An optimal estimator (Kalman filtering) problem over the interval [0, T ] for the

stochastic dynamic system, and

• An LQR problem over the interval [0, T ] in which the stochastic terms in the

dynamics and the performance index are ignored. The optimal control is given

by applying the optimal state-feedback law on the estimated rather than the

“true” state vector.

This is a very nice result, because the overall problem completely decomposes into two

separate sub-problems (optimal estimation and optimal regulation) which have been

51 of 58

Page 52: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

already solved. The “certainty equivalence” principle (which is not proved here) is one

of the most elegant results in optimal control theory. Of course a similar principle

applies for the infinite-horizon problem.

52 of 58

Page 53: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

2.9 LQG design example: Control of active suspension system

The case study included in this section involves a fully active suspension system of a

quarter car model. The model is described in the Figure below.

m1

m2

k1 u

z1

z2

z0

k2

Step 1: Quarter car model

In the diagram m1 represents the quarter car mass, m2 the mass of the tyre and

suspension assembly, k1 the stiffness of the passive suspension spring and k2 the tyre

stiffness. z1, z2 represent displacement of masses m1 and m2 from their equilibrium,

respectively, and z0 is the road profile. The tyre is always assumed to be in contact

with the road and u represents the force of the actuator, applied in opposite directions

to masses m1 and m2 to which it is connected. No bandwidth constraints are placed

on the actuator (i.e. the actuator can respond with zero delay to its command signal).

The dynamic equations corresponding to the model are:

m1z1 = u− k1(z1 − z2)

m2z2 = k0(z0 − z2)− k1(z2 − z1)− u

These will be augmented with the dynamic equation for the road profile, modelled as

approximate integrated white noise, i.e.

z0 = −βz0 + Kftw(t)

53 of 58

Page 54: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

where β is a filter parameter, kft is the road’s “roughness coefficient” and w(t)

represents white noise of unit intensity. The equations can be put into standard state-

space form x(t) = Ax(t) + Bu(t) + Fw(t), where

z1

z1

z2

z2

z0

=

0 1 0 0 0

− k1

m10 k1

m10 0

0 0 0 1 0k1

m20 −k1+k2

m20 k2

m2

0 0 0 0 −β

z1

z1

z2

z2

z0

+

01

m1

0

− 1m2

0

u(t) +

0

0

0

0

kft

w(t)

The design specifications are defined as follows:

• Passenger comfort: For a smooth ride m1 needs to be isolated from the road-

induced vertical vibrations. This objective will be addressed by including into

the cost function a term corresponding to vertical acceleration z21 .

• Realistic strut displacements: We need to keep the suspension working range

z1 − z2 within acceptable limits. Ideally this would be done by imposing a hard

upper limit on |z1−z2|, but this is not possible in our framework. We will do this

indirectly, by including a component ρ1(z1 − z2)2 into our cost function, which

will be minimised (together with the other penalty terms) in the mean-square

sense.

• Good road holding: The suspension needs to be sufficiently hard so that the

vehicle does not loose contact with the road. We will address this objective

by including a term in the cost function ρ2(z0 − z2)2 which penelizes the

dynamic tyre’s dynamic pressure. The more concentrated the dynamic pressure

distribution around its mean value (static pressure corresponding to weight

(m1 + m2)g), the less probable it is for the overall pressure to become zero and

thus for the tyre to lose contact with the road.

Thus the overall cost function that will be minimised is

J [u] = E

{∫ ∞

0

(z21 + ρ1(z1 − z2)

2 + ρ2(z0 − z2)2)dt

}

The relative weighting terms are (1, ρ1, ρ2) which may be chosen to shift the emphasis

between the three (conflicting) objectives. For example, raising the value of ρ2 means

that we are placing more emphasis on the road-holding objective and we expect that the

54 of 58

Page 55: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

corresponding design will reduce the rms value of z0−z2, possibly at the expense of the

other two objectives. Striking an acceptable compromise between all three objectives

involves “tuning” the two design parameters ρ1 and ρ2 through simulations.

Next we need to formulate the cost function in the standard LQR format. First expand

the cost-function integrand as:

z21 + ρ1(z

21 − 2z1z2 + z2

2) + ρ2(z22 − 2z2z0 + z2

0)

Substituting from the mechanical equation m1z1 = u− k1(z1 − z2) this becomes:

(ρ1 +

k21

m21

)z21+

(ρ1 + ρ2 +

k21

m21

)z22+ρ2z

20−2

(ρ1 +

k21

m21

)z1z2−2ρ2z2z0−2k1

m21

uz1+2k1

m21

uz2+u2

m21

which may be written as

xT

k21

m21

+ ρ1 0 − k21

m21− ρ1 0 0

0 0 0 0 0

− k21

m21− ρ1 0

k21

m21

+ ρ1 + ρ2 0 −ρ2

0 0 0 0 0

0 0 −ρ2 0 ρ2

x + 2xT

− k1

m21

0k1

m21

0

0

u +u2

m21

which is in the standard form xT Qx + 2xT Nu + uT Ru (see section 2.5). The design

can now be carried out using matlab’s lqr.m function. The following diagram shows

a typical simulation of the vertical displacement of a passenger for a “soft” design in

which passenger comfort is the dominant objective.

The reduced levels of acceleration are at the expense of both strut displacement and

dynamic tyre pressure, both of which increase. Finally, a Kalman filter can be designed

for the system using the function kalman.m (typical measurements are vertical body

acceleration and strut displacement).

55 of 58

Page 56: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

0 5 10 15 20 25 30 35−0.02

−0.01

0

0.01

0.02

0.03

0.04

0.05passenger displacement

time sec

m

Active vs passive simulation

References:

1. B.D.O Anderson and J.B. Moore, Optimal Control: Linear quadratic methods, Pren-

tice Hall, Englewood Cliffs, 1989.

2. J.S. Bay, Linear State-space Systems, McGraw-Hill International Editions, New

York, 1999.

3. B.D. Craven, Control and Optimization, Chapman and‘Hall, London, 1995.

4. M.H.A. Davis and R.B. Vinter, Stochastic modelling and control, Chapman and

Hall, London, 1985.

5. I.M. Gelfand and S.V. Fomin, Calculus of Variations, Dover, New York, 1991.

6. H. Goldstein, Classical Mechanics, Addison-Wesley, 1980.

7. F.L. Lewis and V.L. Syrmos, Optimal Control (second edition), John Wiley and

Sons Inc., New York, 1995.

8. D.J.N. Limebeer, Lecture Notes in Linear Systems (unpublished notes), Impe-

rial College, London, 1995.

9. E. Pinch, Optimal Control and Calculus of variations, Oxford University Press,

1985.

56 of 58

Page 57: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

10. K. Sharma, D.A. Crolla and D.A Wilson, The design of a fully active suspen-

sion system incorporating a Kalman filter for state estimation, Research Report, Uni-

versity of Leeds, 1997.

11. P. Whittle, Optimal Control, John Willey and Sons inc, New York, 1996.

12. W.M Wohnham, Linear Multivariable Control: A Geometric approach, Springer-

Verlag, New-York, 1979.

13. K. Zhou, J.C. Doyle and K. Glover, Robust and Optimal Control, Prentice-Hall,

New Jersey, 1996.

57 of 58

Page 58: LECTURE NOTES IN CALCULUS OF VARIATIONS AND …george1/calculus-variations-optimal-control.pdf · Ver.1.2 LECTURE NOTES IN CALCULUS OF VARIATIONS AND OPTIMAL CONTROL MSc in Systems

Additional topics to be addressed in a full course:

1. Introduction to linear algebra and matrix theory.

2. Background optimisation theory: Taylor series, gradient, hessian matrix, conditions

for local minima/maxima, constrained problems and Lagrange multipliers, Kuhn-

Tucker conditions, convexity, duality.

3. Review of linear systems (including time-varying systems): Stability, Controllability,

Observability, Kalman decomposition, State-feedback and observer design.

4. Special optimal control problems that can be solved as projections in Hilbert space

(minimum-energy control, mean-squares estimation).

3. Sufficient conditions for extrema; fields of extrema (calculus of variations).

4. Calculus of variations in Classical Mechanics.

5. Calculus of variations and Shape optimization.

6. Outline of Pontryagin’s maximum principle; simple optimal control problems.

7. Optimal estimation theory (review of stochastic processes, auto-correlation and

cross-correlation functions, linear systems excited by noise processes, Wiener-Hopf

integral equation, spectral factorization, Wiener and Kalman filtering).

8. Classical Interpretation of LQ optimal control I: Tracking, LQ regulators with

integral action. 9. Automotive pre-view case study.

10. Classical Interpretation of LQ optimal control II: Stability margins of LQR, “cheap”

control, pole Butterworth patterns.

11. Classical Interpretation of LQ optimal control III: Loss of stability margins in LQG

and Loop-Transfer Recovery.

12. Introduction to robustness and H-infinity optimal control.

58 of 58