Calculus of Variations and Partial Differential Equations

Diogo Aguiar Gomes


Contents

Introduction

1. Finite dimensional optimization problems
  1. Unconstrained minimization in $\mathbb{R}^n$
  2. Convexity
  3. Lagrange multipliers
  4. Linear programming
  5. Non-linear optimization with constraints
  6. Bibliographical notes

2. Calculus of variations in one independent variable
  1. Euler-Lagrange Equations
  2. Further necessary conditions
  3. Applications to Riemannian geometry
  4. Hamiltonian dynamics
  5. Sufficient conditions
  6. Symmetries and Noether theorem
  7. Critical point theory
  8. Invariant measures
  9. Non-convex problems
  10. Geometry of Hamiltonian systems
  11. Perturbation theory
  12. Bibliographical notes

3. Calculus of variations and elliptic equations
  1. Euler-Lagrange equation
  2. Further necessary conditions and applications
  3. Convexity and sufficient conditions
  4. Direct method in the calculus of variations
  5. Euler-Lagrange equations
  6. Regularity by energy methods
  7. Hölder continuity
  8. Schauder estimates

4. Optimal control and viscosity solutions
  1. Elementary examples and properties
  2. Dynamic programming principle
  3. Pontryagin maximum principle
  4. The Hamilton-Jacobi equation
  5. Verification theorem
  6. Existence of optimal controls - bounded control space
  7. Sub and superdifferentials
  8. Optimal control in the calculus of variations setting
  9. Viscosity solutions
  10. Stationary problems

5. Duality theory
  1. Model problems
  2. Some informal computations
  3. Duality
  4. Generalized Mather problem
  5. Monge-Kantorowich problem

Bibliography

Index


    Introduction

This book is dedicated to the study of the calculus of variations and its connections and applications to partial differential equations. We have tried to survey a wide range of techniques and problems, discussing both classical results and more recent techniques. This text is suitable for a first one-year graduate course on calculus of variations and optimal control, and is organized in the following way:

    1. Finite dimensional optimization problems;

    2. Calculus of variations with one independent variable;

    3. Calculus of variations and elliptic partial differential equations;

    4. Deterministic optimal control and viscosity solutions;

    5. Duality theory.

The first chapter is dedicated to finite dimensional optimization, giving emphasis to techniques that can be generalized and applied to infinite dimensional problems. This chapter starts with an elementary discussion of unconstrained optimization in $\mathbb{R}^n$ and convexity. Then we discuss constrained optimization problems, linear programming, and the KKT conditions. The following chapter concerns variational problems with one independent variable. We study classical results, including applications to Riemannian geometry and classical mechanics. We also discuss sufficient conditions for minimizers, Hamiltonian dynamics, and several other related topics. The next chapter concerns variational problems with functionals defined through multiple integrals. In many of these problems, the Euler-Lagrange equation is an elliptic partial differential equation, possibly non-linear. Using the direct method in the calculus of variations, we prove the existence of minimizers. Then we show that the minimum is a weak solution to the Euler-Lagrange equation and study its regularity. The study of regularity follows the classical path: first we consider energy methods, then we prove the De Giorgi-Nash-Moser estimates, and finally the Schauder estimates. In the fourth chapter we consider optimal control problems. We study both classical control theory methods, such as dynamic programming and the Pontryagin maximum principle, as well as more recent tools, such as viscosity solutions of Hamilton-Jacobi equations. The last chapter is a brief introduction to (infinite dimensional) duality theory and its applications to non-linear partial differential equations. We study Mather's problem and the Monge-Kantorowich optimal mass transport problem. These have important relations with the Hamilton-Jacobi and Monge-Ampère equations, respectively.

The prerequisites of these notes are some familiarity with Sobolev spaces and functional analysis, at the level of [Eva98b]. With a few exceptions, we do not assume familiarity with partial differential equations beyond the elementary theory.

Many of the results discussed, as well as important extensions, can be found in the bibliography. In what concerns finite dimensional optimization and linear programming, the main reference is [Fra02]. On variational problems with one independent variable, a key reference is [AKN97]. The approach to elliptic equations in chapter 3 was strongly influenced by the course the author attended at the University of California at Berkeley, taught by Fraydoun Rezakhanlou, by the (unpublished) notes on elliptic equations by my advisor L. C. Evans, and by the book [Gia83]. The books [GT01] and [Gia93] are also classical references in this area. Optimal control problems are discussed in chapter 4; the main references are [Eva98b], [Lio82], [Bar94], [FS93], and [BCD97]. The last chapter concerns duality theory. We recommend the books [Eva99], [Vil03a], and [Vil], as well as the author's papers [Gom00] and [Gom02b].


I would like to thank my students Tiago Alcaria, Patrícia Engrácia, Sílvia Guerra, Igor Kravchenko, Anabela Pelicano, Ana Rita Pires, Verónica Quítalo, Lucian Radu, Joana Santos, Ana Santos, and Vítor Saraiva, who took courses based on parts of these notes and suggested several corrections and improvements. My friend Pedro Girão deserves special thanks, as he read the first LaTeX version of these notes and suggested many corrections and improvements.


1. Finite dimensional optimization problems

This chapter is an introduction to optimization problems in finite dimension. We are certain that many of the results discussed, as well as their proofs, are familiar to the reader. However, we feel that it is instructive to recall them and, throughout this text, observe how they can be adapted to infinite dimensional problems. The plan of this chapter is the following: we start in section 1 by considering unconstrained minimization problems in $\mathbb{R}^n$; we discuss existence and uniqueness of minimizers, as well as first and second order tests for minimizers. The following section, section 2, concerns properties of convex functions which will be needed throughout the text. We start the discussion of constrained optimization problems in section 3 by studying the Lagrange multiplier method for equality constraints. Then, the general case involving both equality and inequality constraints is discussed in the two remaining sections: in section 4 we consider linear programming problems, and in section 5 we discuss non-linear optimization problems and derive the Karush-Kuhn-Tucker (KKT) conditions. The chapter ends with a few bibliographical references.

The general setting of optimization problems is the following: given a function $f : \mathbb{R}^n \to \mathbb{R}$ and a set $X \subset \mathbb{R}^n$, called the admissible set, we would like to solve the following minimization problem:

(1) $$\min f(x), \quad x \in X,$$

i.e., to find the solution set $S \subset X$ such that

$$f(y) = \inf_X f$$

for all $y \in S$. We should note that the min in (1) should be read "minimize" rather than "minimum", as the minimum may not be achieved. The number $\inf_X f$ is called the value of problem (1).

1. Unconstrained minimization in $\mathbb{R}^n$

In this section we address the unconstrained minimization case, that is, the case in which the admissible set $X$ is $\mathbb{R}^n$. Let $f : \mathbb{R}^n \to \mathbb{R}$ be an arbitrary function. We look for conditions on $f$ that

- ensure the existence of a minimum;
- show that this minimum is unique.

In many instances, existence and uniqueness results are not enough: we would also like to

- determine necessary or sufficient conditions for a point to be a minimum;
- estimate the location of a possible minimum.

By looking for all points that satisfy necessary conditions one can determine a set of candidate minimizers. Then, by looking at sufficient conditions one may in fact be able to show that some of these points are indeed minimizers.

To study the existence of a minimum of $f$, we can use the following procedure, called the direct method of the calculus of variations: let $(x_n)$ be a minimizing sequence, that is, a sequence such that

$$f(x_n) \to \inf f.$$

Proposition 1. Let $A$ be an arbitrary set and $f : A \to \mathbb{R}$. Then there exists a minimizing sequence.


Proof. If $\inf_A f = -\infty$, there exists $x_n \in A$ such that $f(x_n) \to -\infty$. Otherwise, if $\inf_A f > -\infty$, we can always find $x_n \in A$ such that $\inf_A f \le f(x_n) \le \inf_A f + \frac{1}{n}$, which again produces a minimizing sequence. □

Let $f : \mathbb{R}^n \to \mathbb{R}$. Suppose $(x_n)$ is a minimizing sequence for $f$. If $x_n$ (or some subsequence) converges to a point $x$ and, additionally, $f(x_n)$ converges to $f(x)$, then $x$ is a minimum of $f$: indeed,

$$f(x) = \lim f(x_n) \qquad \text{and} \qquad \lim f(x_n) = \inf f,$$

the latter because $x_n$ is a minimizing sequence. Thus $f(x) = \inf f$. Although minimizing sequences always exist, they may fail to converge, even up to subsequences, as the next exercise illustrates:

Exercise 1. Consider the function $f(x) = e^x$. Compute $\inf f$ and give an example of a minimizing sequence. Show that no minimizing sequence for $f$ converges.

As the previous exercise suggests, to ensure convergence it is natural to impose certain compactness conditions. In $\mathbb{R}^n$, any bounded sequence $(x_n)$ has a convergent subsequence. A convenient condition on $f$ that ensures boundedness of minimizing sequences is coercivity: a function $f : \mathbb{R}^n \to \mathbb{R}$ is called coercive if $f(x) \to +\infty$ as $|x| \to \infty$.

Exercise 2. Let $f$ be a coercive function and let $x_n$ be a sequence such that $f(x_n)$ is bounded. Show that $x_n$ is bounded. Note in particular that if $f(x_n)$ is convergent then $x_n$ is bounded.

Therefore, from the previous exercise, it follows:

Proposition 2. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a coercive function and let $(x_n)$ be a minimizing sequence for $f$. Then there exists a point $x$ for which, through some subsequence, $x_n \to x$.


Unfortunately, if $f$ is discontinuous at $x$, $f(x_n)$ may fail to converge to $f(x)$. This poses a problem: if $x_n$ is a minimizing sequence then $f(x_n) \to \inf f$, and if this limit is not $f(x)$ then $x$ cannot be a minimizer. It would, therefore, seem natural to require $f$ to be continuous. However, to establish that $x$ is a minimizer we do not really need continuity. In fact, a weaker property is sufficient: it is enough that for any sequence $(x_n)$ converging to $x$ the following inequality holds:

(2) $$\liminf f(x_n) \ge f(x).$$

A function $f$ is called lower semicontinuous if inequality (2) holds for any point $x$ and any sequence $x_n$ converging to $x$.

Example 1. The function

$$f(x) = \begin{cases} 1 & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$

is lower semicontinuous. However,

$$g(x) = \begin{cases} 0 & \text{if } x \ne 0 \\ 1 & \text{if } x = 0 \end{cases}$$

is not.

    ADD HERE GRAPH OF FUNCTIONS

Proposition 3. Let $f : \mathbb{R}^n \to \mathbb{R}$ be lower semicontinuous and let $(x_n) \subset \mathbb{R}^n$ be a minimizing sequence converging to $x \in \mathbb{R}^n$. Then $x$ is a minimizer of $f$.

Proof. Let $x_n$ be a minimizing sequence. Then

$$\inf f = \lim f(x_n) = \liminf f(x_n) \ge f(x),$$

that is, $f(x) \le \inf f$. □

Lower semicontinuity is a weaker property than continuity, and therefore easier to satisfy.


Establishing the uniqueness of minimizers is, in general, more complex. A convenient condition that implies uniqueness of minimizers is convexity.

A set $A \subset \mathbb{R}^n$ is convex if for all $x, y \in A$ and any $0 \le \lambda \le 1$ we have $\lambda x + (1-\lambda)y \in A$. Let $A$ be a convex set. A function $f : A \to \mathbb{R}$ is convex if, for any $x, y \in A$ and $0 \le \lambda \le 1$,

$$f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda)f(y),$$

and it is uniformly convex if there exists $\theta > 0$ such that, for all $x, y \in A$ and $0 \le \lambda \le 1$,

$$f(\lambda x + (1-\lambda)y) + \theta \lambda (1-\lambda) |x-y|^2 \le \lambda f(x) + (1-\lambda) f(y).$$

Example 2. Let $\|\cdot\|$ be any norm in $\mathbb{R}^n$. Then, by the triangle inequality,

$$\|\lambda x + (1-\lambda) y\| \le \|\lambda x\| + \|(1-\lambda) y\| = \lambda \|x\| + (1-\lambda) \|y\|,$$

for all $0 \le \lambda \le 1$. Thus the mapping $x \mapsto \|x\|$ is convex.

Exercise 3. Show that the square of the Euclidean norm in $\mathbb{R}^d$, $\|x\|^2 = \sum_k x_k^2$, is uniformly convex.

Proposition 4. Let $A \subset \mathbb{R}^n$ be a convex set and $f : A \to \mathbb{R}$ be a convex function. If $x$ and $y$ are minimizers of $f$ then so is $\lambda x + (1-\lambda) y$, for any $0 \le \lambda \le 1$. If $f$ is uniformly convex then $x = y$.

Proof. If $x$ and $y$ are minimizers then $f(x) = f(y) = \min f$. Consequently, by convexity,

$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y) = \min f.$$

Therefore $\lambda x + (1-\lambda) y$ is a minimizer of $f$. If $f$ is uniformly convex, then choosing $0 < \lambda < 1$ we obtain

$$f(\lambda x + (1-\lambda) y) + \theta \lambda (1-\lambda) |x - y|^2 \le \min f,$$

which implies $x = y$. □


The characterization of minimizers through necessary or sufficient conditions is usually made by introducing certain conditions that involve first or second derivatives. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function. Recall that $Df$ and $D^2 f$ denote, respectively, the first and second derivatives of $f$. We also write, for an $n \times n$ matrix $A$, $A \ge 0$ if $A$ is positive semidefinite and $A > 0$ if $A$ is positive definite. The next proposition is a well-known result that illustrates this.

Proposition 5. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function and $x$ a minimizer of $f$. Then

$$Df(x) = 0 \quad \text{and} \quad D^2 f(x) \ge 0.$$

Proof. For any vector $y \in \mathbb{R}^n$ and $\epsilon > 0$ we have

$$0 \le f(x + \epsilon y) - f(x) = \epsilon Df(x) y + O(\epsilon^2);$$

dividing by $\epsilon$ and letting $\epsilon \to 0$, we obtain $Df(x) y \ge 0$. Since $y$ is arbitrary (replace $y$ by $-y$), we conclude that

$$Df(x) = 0.$$

In a similar way,

$$0 \le \frac{f(x + \epsilon y) + f(x - \epsilon y) - 2 f(x)}{\epsilon^2} = y^T D^2 f(x) y + o(1),$$

and so, when $\epsilon \to 0$, we obtain $y^T D^2 f(x) y \ge 0$. □

Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^1$ function. A point $x$ is called a critical point of $f$ if $Df(x) = 0$.

Exercise 4. Let $A$ be any set and $f : A \to \mathbb{R}$ be a $C^1$ function in the interior $\operatorname{int} A$ of $A$. Show that any maximizer or minimizer of $f$ is either a critical point or lies on the boundary $\partial A$ of $A$.


We will now show that any critical point of a convex function is a minimizer. For that we need the following preliminary result:

Proposition 6. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^1$ convex function. Then, for any $x, y$, we have

$$f(y) \ge f(x) + Df(x)(y - x).$$

Proof. By convexity,

$$(1-\lambda) f(x) + \lambda f(y) \ge f(x + \lambda(y - x)) = f(x) + \lambda Df(x)(y - x) + o(\lambda |y - x|).$$

Thus, reorganizing the inequality and dividing by $\lambda$, we obtain

$$f(y) \ge f(x) + Df(x)(y - x) + o(1),$$

as $\lambda \to 0$. □

We can now use this result to prove:

Proposition 7. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^1$ convex function and $x$ a critical point of $f$. Then $x$ is a minimizer of $f$.

Proof. Since $Df(x) = 0$ and $f$ is convex, it follows from proposition 6 that

$$f(y) \ge f(x),$$

for all $y$. □

Exercise 5. Let $f(x, \lambda) : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ be a $C^2$ function and $x_0$ a minimizer of $f(\cdot, 0)$, with $D^2_{xx} f(x_0, 0)$ positive definite. Show that, for each $\lambda$ in a neighborhood of $\lambda = 0$, there exists a unique local minimizer $x_\lambda$ of $f(\cdot, \lambda)$ with $x_\lambda|_{\lambda = 0} = x_0$. Compute $D_\lambda x_\lambda$ at $\lambda = 0$.

Growth conditions on $f$ can be used to estimate the norm of a minimizer. In finite dimensional problems, estimates on the norm of a minimizer are important for numerical methods. For instance, if such an estimate exists, it makes it possible to localize the search region for a minimizer. In infinite dimensional problems this issue is even more relevant, as will become clear later in these notes. An elementary result is given in the next exercise:

Exercise 6. Let $f : \mathbb{R}^n \to \mathbb{R}$ be such that $f(x) \ge C_1 |x|^2 + C_2$, with $C_1 > 0$. Let $x_0$ be a minimizer of $f$. Show that

$$|x_0| \le \sqrt{\frac{f(y) - C_2}{C_1}},$$

for any $y \in \mathbb{R}^n$.

Exercise 7. Let $f(x, \lambda) : \mathbb{R}^2 \to \mathbb{R}$ be a continuous function. Suppose for each $\lambda$ there is at least one minimizer $x_\lambda$ of $x \mapsto f(x, \lambda)$. Suppose there exists $C$ such that $|x_\lambda| \le C$ for all $\lambda$ in a neighborhood of $\lambda = 0$. Suppose that for $\lambda = 0$ there exists a unique minimizer $x_0$. Show that $\lim_{\lambda \to 0} x_\lambda = x_0$.

Exercise 8. Let $f \in C^1(\mathbb{R}^2)$. Define $u(x) = \inf_{y \in \mathbb{R}} f(x, y)$. Suppose that

$$\lim_{|y| \to \infty} f(x, y) = +\infty,$$

uniformly in $x$. Let $x_0$ be a point at which the infimum in $y$ of $f$ is achieved at a single point $y_0$. Show that $u$ is differentiable in $x$ at $x_0$ and that

$$\frac{\partial u}{\partial x}(x_0) = \frac{\partial f}{\partial x}(x_0, y_0).$$

Give an example that shows that $u$ may fail to be differentiable if the infimum of $f$ in $y$ is achieved at more than one point.

Exercise 9. Find all maxima and minima (both local and global) of the function $xy(1 - x^2 - y^2)$ on the square $-1 \le x, y \le 1$.

    2. Convexity

As we discussed in the previous section, convexity is a central property in optimization. In this section we discuss additional properties of convex functions which will be necessary in the sequel.


2.1. Characterization of convex functions. We now discuss several tools that are useful to characterize convex functions. We first observe that, given a family of convex functions, it is possible to build another convex function by taking the pointwise supremum. This is a useful construction and is illustrated in the figure.

    ADD FIGURE HERE

Proposition 8. Let $I$ be an arbitrary set and $f_\alpha : \mathbb{R}^n \to \mathbb{R}$, $\alpha \in I$, an indexed collection of convex functions. Let

$$f(x) = \sup_{\alpha \in I} f_\alpha(x).$$

Then $f$ is convex.

Proof. Let $x, y \in \mathbb{R}^n$ and $0 \le \lambda \le 1$. Then

$$f(\lambda x + (1-\lambda) y) = \sup_{\alpha \in I} f_\alpha(\lambda x + (1-\lambda) y) \le \sup_{\alpha \in I} \left[ \lambda f_\alpha(x) + (1-\lambda) f_\alpha(y) \right]$$
$$\le \sup_{\alpha_1 \in I} \lambda f_{\alpha_1}(x) + \sup_{\alpha_2 \in I} (1-\lambda) f_{\alpha_2}(y) = \lambda f(x) + (1-\lambda) f(y). \qquad \square$$

Corollary 9. Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is a $C^1$ function satisfying

$$f(y) \ge f(x) + Df(x)(y - x),$$

for all $x$ and $y$. Then $f$ is convex.

Proof. It suffices to observe that

$$f(y) \le \sup_{x \in \mathbb{R}^n} \left[ f(x) + Df(x)(y - x) \right]$$

(take $x = y$), and the right-hand side, being a supremum of affine functions of $y$, is convex by proposition 8. Finally, we just observe that

$$\sup_{x \in \mathbb{R}^n} \left[ f(x) + Df(x)(y - x) \right] \le f(y)$$

by hypothesis, and so equality follows. □

Proposition 10. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function. Then $f$ is convex if and only if $D^2 f(x)$ is positive semi-definite for all $x \in \mathbb{R}^n$.


Proof. Observe that if $f$ is convex then for any $y \in \mathbb{R}^n$ and any $\epsilon > 0$ we have

$$\frac{f(x - \epsilon y) + f(x + \epsilon y) - 2 f(x)}{\epsilon^2} \ge 0.$$

By sending $\epsilon \to 0$ and using Taylor's formula we conclude that

$$y^T D^2 f(x) y \ge 0,$$

and so $D^2 f(x)$ is positive semi-definite.

Conversely,

$$f(y) - f(x) = \int_0^1 Df(x + s(y-x))(y-x)\, ds$$
$$= Df(x)(y-x) + \int_0^1 \left[ Df(x + s(y-x))(y-x) - Df(x)(y-x) \right] ds$$
$$= Df(x)(y-x) + \int_0^1 \int_0^1 s\, (y-x)^T D^2 f(x + t s (y-x)) (y-x)\, dt\, ds$$
$$\ge Df(x)(y-x),$$

since $(y-x)^T D^2 f(x + t s (y-x))(y-x) \ge 0$ by the positive semi-definiteness hypothesis. Thus $f(y) \ge f(x) + Df(x)(y-x)$ for all $x, y$, and convexity follows from corollary 9. □

Proposition 11. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function. Then $f$ is convex if and only if

(3) $$f(x + y) + f(x - y) - 2 f(x) \ge 0,$$

for any $x, y \in \mathbb{R}^n$.

Proof. Clearly convexity implies (3). Conversely, let $x, y \in \mathbb{R}^n$ and $0 \le \lambda \le 1$ be such that $\lambda x + (1-\lambda) y = z$. We must prove that

(4) $$\lambda f(x) + (1-\lambda) f(y) \ge f(z)$$

holds. We claim that the previous inequality holds for any $\lambda = \frac{k}{2^j}$, for any $0 \le k \le 2^j$. Clearly (4) holds when $j = 1$. Now we proceed by induction on $j$: assume that (4) holds for $\lambda = \frac{k}{2^j}$; we claim that it then holds with $\lambda = \frac{k}{2^{j+1}}$. If $k$ is even we can reduce the fraction; therefore


we may suppose that $k$ is odd, $\lambda = \frac{k}{2^{j+1}}$, and $\lambda x + (1-\lambda) y = z$. Now note that

$$z = \frac{1}{2} \left[ \frac{k-1}{2^{j+1}} x + \left( 1 - \frac{k-1}{2^{j+1}} \right) y \right] + \frac{1}{2} \left[ \frac{k+1}{2^{j+1}} x + \left( 1 - \frac{k+1}{2^{j+1}} \right) y \right].$$

Thus, by (3),

$$f(z) \le \frac{1}{2} f\left( \frac{k-1}{2^{j+1}} x + \left( 1 - \frac{k-1}{2^{j+1}} \right) y \right) + \frac{1}{2} f\left( \frac{k+1}{2^{j+1}} x + \left( 1 - \frac{k+1}{2^{j+1}} \right) y \right),$$

but, since $k-1$ and $k+1$ are even, $k_0 = \frac{k-1}{2}$ and $k_1 = \frac{k+1}{2}$ are integers. Hence

$$f(z) \le \frac{1}{2} f\left( \frac{k_0}{2^j} x + \left( 1 - \frac{k_0}{2^j} \right) y \right) + \frac{1}{2} f\left( \frac{k_1}{2^j} x + \left( 1 - \frac{k_1}{2^j} \right) y \right).$$

But this implies, by the induction hypothesis, that

$$f(z) \le \frac{k_0 + k_1}{2^{j+1}} f(x) + \left( 1 - \frac{k_0 + k_1}{2^{j+1}} \right) f(y).$$

Since $k_0 + k_1 = k$ we get

$$f(z) \le \frac{k}{2^{j+1}} f(x) + \left( 1 - \frac{k}{2^{j+1}} \right) f(y).$$

Since $f$ is continuous and the rationals of the form $\frac{k}{2^j}$ are dense in $[0, 1]$, we conclude that

$$f(z) \le \lambda f(x) + (1-\lambda) f(y),$$

for any real $0 \le \lambda \le 1$. □

Exercise 10. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function. Show that the following statements are equivalent:

1. $f$ is uniformly convex;
2. $D^2 f \ge \theta I > 0$, for some $\theta > 0$;
3. $f\left( \frac{x+y}{2} \right) + \theta \frac{|x-y|^2}{4} \le \frac{f(x) + f(y)}{2}$, for some $\theta > 0$;
4. $f(y) \ge f(x) + Df(x)(y - x) + \frac{\theta}{2} |x - y|^2$, for some $\theta > 0$.

Exercise 11. Let $\phi : \mathbb{R} \to \mathbb{R}$ be a non-decreasing convex function and $\psi : \mathbb{R}^n \to \mathbb{R}$ a convex function. Show that $\phi \circ \psi$ is convex. Show, by giving an example, that if $\phi$ is not non-decreasing then $\phi \circ \psi$ may fail to be convex.


2.2. Lipschitz continuity. Convex functions enjoy remarkable properties. We will first show that any convex function is locally bounded and locally Lipschitz.

Proposition 12. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a convex function. Then $f$ is locally bounded and locally Lipschitz.

Proof. For $x \in \mathbb{R}^d$ denote $|x|_1 = \sum_k |x_k|$. Define $X_M = \{x \in \mathbb{R}^d : |x|_1 \le M\}$. We will prove that $f$ is bounded on $X_{M/8}$.

Any point $x \in X_M$ can be written as a convex combination of the points $\pm M e_k$, where $e_k$ is the $k$-th standard unit vector. Thus

$$f(x) \le \max_k \{ f(M e_k), f(-M e_k) \}.$$

Suppose now $f$ is not bounded from below on $X_{M/8}$. Then there exists a sequence $x_n \in X_{M/8}$ such that $f(x_n) \to -\infty$. Choose a point $y \in X_{M/4} \cap X_{M/8}^c$. Note that $2y - x_n \in X_M$, therefore we can write $2y - x_n$ as a convex combination of the points $\pm M e_k$, i.e.,

$$y = \frac{1}{2} x_n + \frac{1}{2} \sum_k \lambda_k (\pm M e_k).$$

Thus

$$f(y) \le \frac{1}{2} f(x_n) + \frac{1}{2} \max_k \{ f(M e_k), f(-M e_k) \},$$

which is a contradiction if $f(x_n) \to -\infty$.

Now we will show the second part of the proposition, i.e., that any convex function is also locally Lipschitz. By contradiction, and by changing coordinates if necessary, we can assume that $0$ is not a Lipschitz point, that is, there exists a sequence $x_n \to 0$ such that

$$|f(x_n) - f(0)| \ge C |x_n|_1,$$

for all $C$ and all $n$ large enough. In particular this implies that

$$\limsup_n \frac{f(x_n) - f(0)}{|x_n|_1} \in \{-\infty, +\infty\},$$


and, similarly,

$$\liminf_n \frac{f(x_n) - f(0)}{|x_n|_1} \in \{-\infty, +\infty\}.$$

By the previous part of the proof, we can assume that $f$ is bounded on $X_1$. For each $n$ choose a point $y_n$ with $|y_n|_1 = 1$ such that $x_n = |x_n|_1 y_n$. Then, by convexity,

$$f(x_n) \le |x_n|_1 f(y_n) + (1 - |x_n|_1) f(0),$$

which implies

$$f(y_n) \ge f(0) + \frac{f(x_n) - f(0)}{|x_n|_1}.$$

Therefore

(5) $$\limsup_n \frac{f(x_n) - f(0)}{|x_n|_1} = -\infty,$$

otherwise we would have a contradiction (note that $f(y_n)$ is bounded). We can also write

$$0 = \frac{1}{1 + |x_n|_1} x_n + \frac{|x_n|_1}{1 + |x_n|_1} (-y_n).$$

So

$$f(0) \le \frac{1}{1 + |x_n|_1} f(x_n) + \frac{|x_n|_1}{1 + |x_n|_1} f(-y_n).$$

This implies

$$f(-y_n) \ge f(0) + \frac{f(0) - f(x_n)}{|x_n|_1}.$$

Because $f(-y_n)$ is bounded,

$$\limsup_n \frac{f(0) - f(x_n)}{|x_n|_1} < +\infty,$$

which is a contradiction to (5). □

2.3. Separation. In this last subsection we study separation properties that arise from convexity and present some applications.

Proposition 13. Let $C$ be a closed convex set not containing the origin. Then there exists $x_0 \in C$ which minimizes $\|x\|$ over all $x \in C$.


Proof. Consider a minimizing sequence $x_n$. By a simple computation, we have the parallelogram identity

$$\left\| \frac{x_n + x_m}{2} \right\|^2 + \frac{1}{4} \|x_n - x_m\|^2 = \frac{1}{2} \|x_n\|^2 + \frac{1}{2} \|x_m\|^2.$$

Because $\frac{x_n + x_m}{2} \in C$, by convexity, we have the inequality

$$\left\| \frac{x_n + x_m}{2} \right\|^2 \ge \inf_{y \in C} \|y\|^2.$$

As $n, m \to \infty$ we also have

$$\|x_n\|^2, \|x_m\|^2 \to \inf_{y \in C} \|y\|^2.$$

But then, as $n, m \to \infty$, we conclude that

$$\|x_n - x_m\|^2 \to 0.$$

Therefore any minimizing sequence is a Cauchy sequence and hence convergent. □
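As a minimal numerical sketch of proposition 13 (the set $C$ below is invented for the illustration; scipy's SLSQP solver handles the constrained minimization), we compute the minimal-norm point of a closed convex half-plane:

    import numpy as np
    from scipy.optimize import minimize

    # C = {x in R^2 : x1 + x2 >= 2} is closed, convex, and avoids the
    # origin; its point of minimal norm is (1, 1).
    cons = ({'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 2},)
    res = minimize(lambda x: x @ x, x0=np.array([3.0, 0.0]),
                   constraints=cons)
    print(res.x)   # approximately [1., 1.]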

Exercise 12. Let $F : \mathbb{R}^n \to \mathbb{R}$ be a uniformly convex function. Show that any minimizing sequence for $F$ is a Cauchy sequence. Hint:

$$\frac{F(x_n) + F(x_m)}{2} - \inf F \ge \frac{F(x_n) + F(x_m)}{2} - F\left( \frac{x_n + x_m}{2} \right) \ge \frac{\theta}{4} |x_n - x_m|^2.$$

Proposition 14. Let $U$ and $V$ be disjoint closed convex sets, and suppose one of them is compact. Then there exist $w \in \mathbb{R}^n$ and $a > 0$ such that

$$(w, x - y) \ge a > 0,$$

for all $x \in U$ and $y \in V$.

Proof. Consider the closed convex set $W = U - V$ (this set is closed because either $U$ or $V$ is compact). Then there exists a point $w \in W$ with minimal norm. Since $0 \notin W$, $w \ne 0$. So, for all $x \in U$ and $y \in V$, by the convexity of $W$,

$$\|w\|^2 \le \|\lambda(x - y) + (1-\lambda) w\|^2 = (1-\lambda)^2 \|w\|^2 + 2\lambda(1-\lambda)(x - y, w) + \lambda^2 \|x - y\|^2.$$

The last inequality implies

$$0 \le ((1-\lambda)^2 - 1) \|w\|^2 + 2\lambda(1-\lambda)(x - y, w) + \lambda^2 \|x - y\|^2.$$

Dividing by $\lambda$ and letting $\lambda \to 0$ we obtain

$$(x - y, w) \ge \|w\|^2 > 0. \qquad \square$$

As a first application of the separation result we discuss a generalization of derivatives for convex functions. The subdifferential $\partial f(x)$ of a convex function $f : \mathbb{R}^n \to \mathbb{R}$ at a point $x \in \mathbb{R}^n$ is the set of vectors $p \in \mathbb{R}^n$ such that

$$f(y) \ge f(x) + p \cdot (y - x),$$

for all $y \in \mathbb{R}^n$.

Proposition 15. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a convex function and $x_0 \in \mathbb{R}^n$. Then $\partial f(x_0) \ne \emptyset$.

Proof. Consider the set $E(f) = \{(x, y) \in \mathbb{R}^{n+1} : y \ge f(x)\}$, the epigraph of $f$. Then, because $f$ is convex and hence continuous, $E(f)$ is a closed convex set. Consider the sequence $y_n = f(x_0) - \frac{1}{n}$. Because for each $n$ the sets $E(f)$ and $\{(x_0, y_n)\}$ are disjoint closed convex sets, and the second one is compact, there is a separating plane:

(6) $$f(x) \ge p_n (x - x_0) + \beta_n,$$

for all $x$, and

(7) $$f(x_0) - \frac{1}{n} = y_n \le \beta_n \le f(x_0).$$

Thus, from (7), we get that $\beta_n$ is bounded. Since $f$ is locally bounded, inequality (6) implies the boundedness of $p_n$. Therefore, up to a subsequence, there exist $p = \lim p_n$ and $\beta = \lim \beta_n$. Furthermore

$$f(x) \ge p (x - x_0) + \beta,$$

and, again using (7), we get that $\beta = f(x_0)$. Thus

$$f(x) \ge p (x - x_0) + f(x_0),$$

and so $p \in \partial f(x_0)$. □

Exercise 13. Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = |x|$. Compute $\partial f$.


Exercise 14. Let $f : \mathbb{R}^n \to \mathbb{R}$ be convex. Show that if $f$ is differentiable at $x \in \mathbb{R}^n$ then $\partial f(x) = \{Df(x)\}$.

Proposition 16. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^1$ convex function. Then

$$(Df(x) - Df(y)) \cdot (x - y) \ge 0.$$

Proof. Observe that

$$f(y) \ge f(x) + Df(x) \cdot (y - x), \qquad f(x) \ge f(y) + Df(y) \cdot (x - y).$$

Add these two inequalities. □

Exercise 15. Prove the analogue of the previous proposition for the case in which $f$ is not $C^1$, replacing derivatives by points in the subdifferential.

Exercise 16. Let $f$ be a uniformly convex function. Show that

$$(Df(x) - Df(y)) \cdot (x - y) \ge \theta |x - y|^2,$$

for some $\theta > 0$.

Exercise 17. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a convex function. Show that a point $x \in \mathbb{R}^n$ is a minimizer of $f$ if and only if $0 \in \partial f(x)$.

Exercise 18. Let $A$ be a convex set and $f : A \to \mathbb{R}$ be a uniformly convex function. Let $x \in A$ be a maximizer of $f$. Show that $x$ is an extreme point, that is, that there are no $y, z \in A$, $y, z \ne x$, and $0 < \lambda < 1$ such that $x = \lambda y + (1-\lambda) z$.

The second application of proposition 14 is a very important result called Farkas' lemma:

Lemma 17 (Farkas lemma). Let $A$ be an $m \times n$ matrix and $c$ a row vector in $\mathbb{R}^n$. Then we have one and only one of the following alternatives:

1. $c = y^T A$, for some $y \ge 0$;
2. there exists a column vector $w \in \mathbb{R}^n$ such that $Aw \le 0$ and $cw > 0$.


Proof. If the first alternative does not hold, the sets $U = \{y^T A : y \ge 0\}$ and $V = \{c\}$ are disjoint, closed, and convex. Then the separation theorem for convex sets (see proposition 14) implies that there exists a hyperplane with normal $w$ which separates them, that is,

(8) $$y^T A w \le a,$$

for all $y \ge 0$, and

$$cw > a.$$

Note that $a \ge 0$ (by setting $y = 0$ in (8)), so $cw > 0$. Furthermore, for any $y \ge 0$ and any $\mu \ge 0$ we have

$$\mu\, y^T A w \le a;$$

letting $\mu \to +\infty$ we conclude that

$$y^T A w \le 0$$

for all $y \ge 0$ (taking for $y$ the standard basis vectors gives $Aw \le 0$). So this corresponds to the second alternative. □

Example 3. Consider a discrete state one-period pricing model, that is, we are given $n$ assets which at the initial time cost $c_i$, $1 \le i \le n$, per unit (we regard $c$ as a row vector) and, after one unit of time, each asset $i$ is worth $P^j_i$ with probability $p_j$, $1 \le j \le m$. A portfolio is a (column) vector $\theta \in \mathbb{R}^n$. The value of the portfolio at time $0$ is $c\theta$ and at time one, with probability $p_j$, the value is $(P\theta)_j$. An arbitrage opportunity is a portfolio $\theta$ such that $c\theta < 0$ and $(P\theta)_j \ge 0$ for all $j$, i.e., a portfolio with negative cost and non-negative return.

Farkas' lemma yields that either

1. there exists $y \in \mathbb{R}^m$, $y \ge 0$, such that $c = y^T P$, or
2. there exists an arbitrage portfolio.

Furthermore, suppose one of the assets is a no-interest bearing bank account, for instance $c_1 = 1$ and $P^j_1 = 1$ for all $j$. Then $\sum_j y_j = c_1 = 1$, so $y$ is a probability vector, which in general may be different from $p$.
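The following sketch illustrates this dichotomy numerically (the prices and payoffs are invented for the example): we search for state prices $y \ge 0$ with $y^T P = c$ as a linear-programming feasibility problem; when none exists, Farkas' lemma guarantees an arbitrage portfolio.

    import numpy as np
    from scipy.optimize import linprog

    # P[j, i]: time-1 value of asset i in state j; c: time-0 prices.
    P = np.array([[1.0, 2.0],
                  [1.0, 0.5]])
    c = np.array([1.0, 1.2])

    # First alternative: find y >= 0 with P^T y = c^T (state prices).
    res = linprog(np.zeros(2), A_eq=P.T, b_eq=c, bounds=[(0, None)] * 2)
    print("state prices:" if res.success else "arbitrage exists", res.x)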


    3. Lagrange multipliers

Many important problems require minimizing (or maximizing) functions under equality constraints. The Lagrange multiplier method is the standard tool to study these problems. For inequality constraints, the Lagrange multiplier method can be extended in a suitable way, as will be studied in the two following sections.

Proposition 18. Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$ ($m < n$) be $C^1$ functions. Suppose $c \in \mathbb{R}^m$ is fixed, and assume that the rank of $Dg$ is $m$ at all points of the set $\{g = c\}$. Then, if $x_0$ is a minimizer of $f$ on the set $g(x) = c$, there exists $\lambda \in \mathbb{R}^m$ such that

$$Df(x_0) = \lambda^T Dg(x_0).$$

Proof. Let $x_0$ be as in the statement. Suppose that $w_1, \ldots, w_m$ are vectors in $\mathbb{R}^n$ satisfying

$$\det[Dg(x_0) W] \ne 0,$$

where $W = [w_1 \cdots w_m]$ is the matrix with columns $w_1, \ldots, w_m$. Note that it is possible to choose such vectors because the rank of $Dg$ is $m$. Given $v \in \mathbb{R}^n$, consider the equation

$$g(x_0 + \epsilon v + W i) = c.$$

The implicit function theorem implies that there exists a unique function $i(\epsilon) : \mathbb{R} \to \mathbb{R}^m$,

$$i(\epsilon) = [i_1(\epsilon), \ldots, i_m(\epsilon)]^T,$$

defined in a neighborhood of $\epsilon = 0$, with $i(0) = 0$, such that

$$g(x_0 + \epsilon v + W i(\epsilon)) = c.$$

Additionally,

$$i'(0) = -(Dg(x_0) W)^{-1} Dg(x_0) v.$$

Since $x_0$ is a minimizer of $f$ on the set $g(x) = c$, the function

$$I(\epsilon) = f(x_0 + \epsilon v + W i(\epsilon))$$

satisfies

$$0 = I'(0) = Df(x_0) v + Df(x_0) W i'(0),$$

that is,

$$Df(x_0) v = \lambda^T Dg(x_0) v,$$

with

$$\lambda^T = Df(x_0) W (Dg(x_0) W)^{-1},$$

for any vector $v$. □

Proposition 19. Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$, with $m < n$, be smooth functions. Assume that $Dg$ has maximal rank at all points. Let $x_c$ be a minimizer of $f(x)$ under the constraint $g(x) = c$, and $\lambda_c$ the corresponding Lagrange multiplier, i.e.,

(9) $$Df(x_c) = \lambda_c^T Dg(x_c).$$

Suppose that $x_c$ is a differentiable function of $c$. Define

$$V(c) = f(x_c).$$

Then $D_c V(c) = \lambda_c$.

Proof. We have

$$g(x_c) = c.$$

By differentiating with respect to $c$ we obtain

$$Dg(x_c) \frac{\partial x_c}{\partial c} = I.$$

Multiplying by $\lambda_c^T$ and using (9) yields

$$\lambda_c^T = \lambda_c^T Dg(x_c) \frac{\partial x_c}{\partial c} = Df(x_c) \frac{\partial x_c}{\partial c} = D_c V(c). \qquad \square$$

Exercise 19. Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$, with $m < n$, be smooth functions. Assume that $Dg$ has maximal rank at all points. Let $x_0$ be a minimizer of $f(x)$ under the constraint $g(x) = g(x_0)$, $\lambda$ the corresponding Lagrange multiplier, and $F = f - \lambda^T g$. Show that

$$\sum_{i,j} D^2_{x_i x_j} F(x_0)\, \xi_i \xi_j \ge 0,$$

for all vectors $\xi$ that satisfy $\sum_i D_{x_i} g(x_0)\, \xi_i = 0$.


Proposition 20. Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$, with $m < n$. Let $x_0$ be a minimizer of $f(x)$ under the constraint $g(x) = g(x_0)$. Then there exist constants $\lambda_0, \ldots, \lambda_m$, not all zero, such that

$$\lambda_0 Df + \lambda_1 Dg_1 + \cdots + \lambda_m Dg_m = 0$$

at $x_0$. Furthermore, if $Dg$ has maximal rank we can choose $\lambda_0 = 1$.

Proof. First observe that the $(m+1) \times n$ matrix

$$\begin{pmatrix} Df \\ Dg \end{pmatrix}$$

cannot have rank $m+1$ at $x_0$. Indeed, if it did, applying the implicit function theorem to the function $(x, c) \mapsto (f(x) - c_0, g(x) - \tilde c)$, with $x \in \mathbb{R}^n$ and $c = (c_0, \tilde c) \in \mathbb{R}^{m+1}$, we could find points $x$ near $x_0$ with $g(x) = g(x_0)$ and $f(x) < f(x_0)$, contradicting $x_0$ being a minimizer.

This fact then implies that there exist constants $\lambda_0, \ldots, \lambda_m$, not all zero, such that

$$\lambda_0 Df + \lambda_1 Dg_1 + \cdots + \lambda_m Dg_m = 0$$

at $x_0$. Observe also that if $Dg$ has maximal rank we can choose $\lambda_0 = 1$. In fact, if $\lambda_0 \ne 0$, it suffices to multiply by $\frac{1}{\lambda_0}$. To see that $\lambda_0 \ne 0$ we argue by contradiction: if $\lambda_0 = 0$ we would have

$$\lambda_1 Dg_1 + \cdots + \lambda_m Dg_m = 0,$$

which contradicts the hypothesis that $Dg$ has maximal rank $m$. □

Example 4 (Minimax principle). There exists a nice formal interpretation of Lagrange multipliers which, although not rigorous, is quite useful. Fix $c \in \mathbb{R}^m$, and consider the problem of minimizing a function $f : \mathbb{R}^n \to \mathbb{R}$ under the constraint $g(x) - c = 0$, with $g : \mathbb{R}^n \to \mathbb{R}^m$. This problem can be rewritten as

$$\min_x \max_\lambda\ f(x) + \lambda^T (g(x) - c).$$

The minimax principle asserts that the maximum can be exchanged with the minimum (which is frequently false) and, therefore, we obtain the equivalent problem

$$\max_\lambda \min_x\ f(x) + \lambda^T (g(x) - c).$$

From this we deduce that, for each $\lambda$, the minimum $x_\lambda$ is determined by

(10) $$Df(x_\lambda) + \lambda^T Dg(x_\lambda) = 0.$$

Furthermore, the function to maximize in $\lambda$ is

$$f(x_\lambda) + \lambda^T (g(x_\lambda) - c).$$

Differentiating this expression with respect to $\lambda$, assuming that $x_\lambda$ is differentiable, and using (10), we obtain

$$g(x_\lambda) = c.$$

Exercise 20. Use the minimax principle to determine (formally) optimality conditions for the problem

$$\min f(x)$$

under the constraint $g(x) \le c$.

The next exercise illustrates that the minimax principle may indeed be false, although in many problems it is an important heuristic.

Exercise 21. Show that the minimax principle is not valid in the following cases:

1. $x + \lambda$;
2. $x^3 + \lambda (x^2 + 1)$;
3. $\frac{1}{1 + (x - \lambda)^2}$.

Exercise 22. Let $A$ and $B$ be arbitrary sets and $F : A \times B \to \mathbb{R}$. Show that

$$\inf_{a \in A} \sup_{b \in B} F(a, b) \ge \sup_{b \in B} \inf_{a \in A} F(a, b).$$


    4. Linear programming

We now continue the study of constrained optimization problems by looking into the minimization of linear functions subject to linear inequality constraints, i.e., linear programming problems. A detailed discussion of this class of problems can be found, for instance, in [GSS08] or [Fra02].

4.1. The setting of linear programming. A model problem in linear programming is the following: given a row vector $c \in \mathbb{R}^n$, a real $m \times n$ matrix $A$, and a column vector $b \in \mathbb{R}^m$, we look for a column vector $x \in \mathbb{R}^n$ which is a solution to the problem

(11) $$\max_x cx, \quad Ax \le b, \quad x \ge 0,$$

where the notation $v \ge 0$ for a vector $v$ means that all components of $v$ are non-negative. The set defined by the inequalities $Ax \le b$ and $x \ge 0$ may be empty, or on this set the function $cx$ may be unbounded from above. To simplify the discussion, we assume that this situation does not occur.

    Move here feasible set

    Example 5. Add example here.

Observe that if $c \ne 0$ the maximizers of $cx$ cannot be interior points of the feasible set; otherwise, by exercise 4, they would be critical points. Therefore, the maximizers must lie on the boundary of $\{Ax \le b, x \ge 0\}$. Unfortunately, this boundary can be quite complex, as it consists of a finite (but frequently large) union of intersections of hyperplanes (of the form $dx = e$) with half-spaces (of the form $dx \le e$).


Exercise 23. Suppose that no row of $A$ vanishes. Show that the boundary of the set $\{Ax \le b\}$ consists of all points which satisfy $Ax \le b$ with equality in at least one coordinate.

Note that the linear programming problem (11) is quite general, as it is possible to include equality constraints as inequalities: in fact, $Ax = b$ is the conjunction of $Ax \le b$ and $-Ax \le -b$.

A vector $x$ is called feasible for (11) if it satisfies the constraints, that is, $Ax \le b$ and $x \ge 0$.

Example 6 (Diet problem). An animal food factory would like to minimize the production cost of a pet food, while keeping it nutritionally balanced. Each food item $i$ costs $c_i$ per unit. Therefore, if each unit of pet food contains an amount $x_i$ of the food item $i$, the total cost is $cx$. There is, of course, the obvious constraint that $x \ge 0$. Suppose that $A_{ij}$ represents the amount of the nutrient $i$ in the food item $j$, and $b_i$ the minimum recommended amount of the nutrient $i$. Then, to ensure a nutritionally balanced diet, we must have

$$Ax \ge b.$$

Thus the diet problem is

$$\min cx, \quad Ax \ge b, \quad x \ge 0.$$

Example 7 (Optimal transport). A large multinational needs to transport its supply from each factory $i$ to the distribution points $j$. The supply at $i$ is $s_i$ and the demand at $j$ is $d_j$. The cost of transporting one unit from $i$ to $j$ is $c_{ij}$. We would like to determine the quantity $\pi_{ij}$ transported from $i$ to $j$ by solving the following optimization problem:

$$\min \sum_{ij} c_{ij} \pi_{ij},$$

under the constraints $\pi_{ij} \ge 0$ and the supply and demand bounds

$$\sum_j \pi_{ij} \le s_i, \qquad \sum_i \pi_{ij} \ge d_j.$$
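A minimal numerical sketch of this transport problem (supplies, demands, and costs invented for the illustration), with the quantities $\pi_{ij}$ flattened row-wise into a vector:

    import numpy as np
    from scipy.optimize import linprog

    cost = np.array([1.0, 3.0,       # c_11, c_12
                     2.0, 1.0])      # c_21, c_22
    s = np.array([5.0, 5.0])         # supplies s_i
    d = np.array([4.0, 6.0])         # demands d_j

    # Rows encode: sum_j pi_1j <= s_1, sum_j pi_2j <= s_2,
    # and -sum_i pi_i1 <= -d_1, -sum_i pi_i2 <= -d_2.
    A_ub = np.array([[ 1,  1,  0,  0],
                     [ 0,  0,  1,  1],
                     [-1,  0, -1,  0],
                     [ 0, -1,  0, -1]], dtype=float)
    b_ub = np.concatenate([s, -d])

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
    print(res.x.reshape(2, 2), res.fun)   # optimal quantities and cost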

Example 8. The existence of feasible vectors, i.e., vectors satisfying the constraint $Ax \le b$, is not obvious. There exists, however, a procedure that can convert this question into a new linear programming problem. Let $x_0$ be a new variable. We would like to solve

$$\min x_0,$$

where the minimum is taken over all vectors $(x_0, x)$ which satisfy the constraints $(Ax)_j \le b_j + x_0$, for all $j$. It is clear that the feasible set for this problem is non-empty: take, for instance, $x = 0$ and $x_0 = \max_j |b_j|$. This new linear programming problem therefore has a value (which could be $-\infty$ but not $+\infty$). If the value is non-positive, there exist feasible vectors for the constraint $Ax \le b$. Otherwise, if the value is positive, the feasible set of the original problem is empty.

Exercise 24. Let $A$ be an $m \times n$ matrix, with $m > n$. Consider the overdetermined system

$$Ax = b,$$

for $b \in \mathbb{R}^m$. In general, this equation has no solution. We would like to determine a vector $x \in \mathbb{R}^n$ which minimizes the maximum error

$$\sup_i |(Ax)_i - b_i|.$$

Rewrite this problem as a linear programming problem. Compare this problem with the least squares method, which consists in solving

$$\min_x \|Ax - b\|^2.$$


4.2. The dual problem. To problem (11), which we call primal, we associate another problem, called the dual, which consists in determining $y \in \mathbb{R}^m$ which solves

(12) $$\min_y y^T b, \quad y^T A \ge c, \quad y \ge 0.$$

As the next exercise shows, the dual problem can be motivated by the minimax principle:

Exercise 25. Show that (11) can be written as

(13) $$\max_{x \ge 0} \min_{y \ge 0}\ cx + y^T (b - Ax).$$

Suppose we can exchange the maximum with the minimum in (13). Relate the resulting problem with (12).

Example 9 (Interpretation of the dual of the diet problem). The dual of the diet problem (example 6) is the following:

$$\max y^T b, \quad y^T A \le c, \quad y \ge 0.$$

This problem admits the following interpretation. A competing company is willing to provide a nutritionally balanced diet, charging for each unit of the nutrient $i$ a price $y_i$. Obviously, the competing company would like to maximize its income. There are the following constraints: $y \ge 0$ and, furthermore, if the food item $j$ costs $c_j$, the competing company should charge an amount $(y^T A)_j$ no larger than $c_j$. This constraint is quite natural, since if it did not hold, at least part of the diet could be obtained by buying the food items $j$ such that $(y^T A)_j > c_j$.

    Exercise 26. Show that the dual of the dual is equivalent to the primal.

    Exercise 27. Determine the dual of the optimal transport problem and

    give a possible interpretation.


The next theorem concerns the relation between the primal and dual problems:

Theorem 21.

1. Weak duality: suppose $x$ and $y$ are feasible, respectively, for (11) and (12); then $cx \le y^T b$.
2. Optimality: furthermore, if $cx = y^T b$ then $x$ and $y$ are solutions of (11) and (12), respectively.
3. Strong duality: if (11) has a solution $x^*$, then (12) also has a solution $y^*$, and

$$c x^* = (y^*)^T b.$$

Finally, $y^*_j = 0$ for all indices $j$ such that $(A x^*)_j < b_j$.

Proof. To prove weak duality, observe that

$$cx \le (y^T A) x = y^T (Ax) \le y^T b.$$

The optimality criterion follows from the previous inequality.

To prove strong duality, we may assume that the inequality $Ax \le b$ also includes $x \ge 0$, for instance replacing $A$ by the augmented matrix

$$\tilde A = \begin{bmatrix} A \\ -I \end{bmatrix}$$

and the vector $b$ by

$$\tilde b = \begin{bmatrix} b \\ 0 \end{bmatrix}.$$

In this case it will be enough to prove that there exists a vector $\tilde y \in \mathbb{R}^{n+m}$ such that $\tilde y \ge 0$,

$$c = (\tilde y)^T \tilde A,$$

with $\tilde y_j = 0$ for all indices $j$ such that $(\tilde A x^*)_j < \tilde b_j$. In fact, if such a vector $\tilde y$ is given, we just set $y^*$ to be the first $m$ coordinates of $\tilde y$. Then $c \le (y^*)^T A$ and

$$c x^* = (\tilde y)^T \tilde A x^* = (\tilde y)^T \tilde b = (y^*)^T b,$$

since $\tilde b$ differs from $b$ by adding $n$ zero entries. From this point on we drop the tildes to simplify the notation.

First we state the following auxiliary result, whose proof is a simple corollary of Lemma 17:

Lemma 22. Let $A$ be an $m \times n$ matrix, $c$ a row vector in $\mathbb{R}^n$, and $J$ an arbitrary set of rows of $A$. Then we have one and only one of the following alternatives:

1. $c = y^T A$, for some $y \ge 0$ with $y_j = 0$ for all $j \notin J$;
2. there exists a column vector $w \in \mathbb{R}^n$ such that $(Aw)_j \le 0$ for all $j \in J$ and $cw > 0$.

Exercise 28. Use Lemma 17 to prove Lemma 22.

Let $x^*$ be a solution of (11). Let $J$ be the set of indices $j$ for which $(A x^*)_j = b_j$. We will show that there exists $y \ge 0$ such that $c = y^T A$ and $y_j = 0$ for $j \notin J$. By contradiction, assume that no such $y$ exists. By the previous lemma, there is $w$ such that $cw > 0$ and $(Aw)_j \le 0$ for $j \in J$. But then $x_\epsilon = x^* + \epsilon w$ is feasible for $\epsilon > 0$ sufficiently small, since

$$A x_\epsilon = A x^* + \epsilon A w \le b.$$

However,

$$c x_\epsilon = c(x^* + \epsilon w) > c x^*,$$

which contradicts the optimality of $x^*$. Therefore, for some $y \ge 0$ with $y_j = 0$ for $j \notin J$,

$$c x^* = y^T A x^* = y^T b.$$

Consequently, by the second part of the theorem, we conclude that $y$ is optimal. □


Lemma 23. Let $x$ and $y$ be, respectively, feasible for the primal and dual problems. Define

$$s = b - Ax \ge 0, \qquad e = A^T y - c^T \ge 0.$$

Then

$$s^T y + x^T e = b^T y - x^T c^T \ge 0.$$

Proof. Since $x, y \ge 0$ we have

$$s^T y = b^T y - x^T A^T y \ge 0, \qquad x^T e = x^T A^T y - x^T c^T \ge 0.$$

By adding these two expressions, we obtain

$$s^T y + x^T e = b^T y - x^T c^T \ge 0. \qquad \square$$

Theorem 24 (Complementarity). Suppose $x$ and $y$ are solutions of (11) and (12), respectively. Then

$$s^T y = 0 \quad \text{and} \quad x^T e = 0.$$

Proof. We have $s^T y, x^T e \ge 0$. If $x$ and $y$ are optimal then $cx = y^T b$. By the previous lemma,

$$s^T y + x^T e = 0,$$

which implies the theorem. □
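The following numerical sketch, on an invented instance of (11) and (12), checks the conclusions of theorems 21 and 24:

    import numpy as np
    from scipy.optimize import linprog

    c = np.array([1.0, 2.0])
    A = np.array([[1.0, 1.0],
                  [1.0, 0.0]])
    b = np.array([4.0, 2.0])

    # Primal: max cx s.t. Ax <= b, x >= 0 (linprog minimizes, hence -c).
    p = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)
    # Dual: min y^T b s.t. y^T A >= c, y >= 0, i.e. -A^T y <= -c^T.
    d = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 2)

    print(-p.fun, d.fun)       # equal optimal values (strong duality)
    s = b - A @ p.x            # primal slack
    e = A.T @ d.x - c          # dual slack
    print(s @ d.x, p.x @ e)    # s^T y = 0 and x^T e = 0 (complementarity)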

Exercise 29. Study the following problem in $\mathbb{R}^2$:

$$\max x_1 + 2 x_2,$$

with $x_1, x_2 \ge 0$, $x_1 + x_2 \le 1$, and $2 x_1 + x_2 \le 3/2$. Determine the dual problem and its solution, and show that it has the same value as the primal problem.

Exercise 30. Let $x^*$ be a solution of the problem

$$\min cx$$

under the constraints $Ax \ge b$, $x \ge 0$, and let $y^*$ be a solution of the dual. Use complementarity to show that $x^*$ minimizes

$$cx - (y^*)^T A x$$

under the constraint $x \ge 0$.

Exercise 31. Solve by elementary methods the problem

$$\max x_1 + x_2$$

under the constraints $3 x_1 + 4 x_2 \le 12$, $5 x_1 + 2 x_2 \le 10$.

Exercise 32. Consider the problem

$$\min 7 x_1 + 9 x_2 + 16 x_3,$$

under the constraints $x \ge 0$, $2 \le x_1 + 2 x_2 + 9 x_3 \le 7$. Obtain an upper and a lower bound for the value of the minimum.

Exercise 33. Show that the solution set of a linear programming problem is a convex set.

Exercise 34. Consider a linear programming problem in $\mathbb{R}^n$:

$$\min cx,$$

under the constraints $Ax \ge b$, $x \ge 0$. Suppose $c = c_0 + \epsilon c_1$. Suppose that for $\epsilon > 0$ there exists a minimizer $x_\epsilon$ which converges to a point $x_0$ as $\epsilon \to 0$. Show that $x_0$ is a minimizer of $c_0 x$ under $Ax \ge b$, $x \ge 0$. Show, furthermore, that if this limit problem has more than one minimizer then $x_0$ minimizes $c_1 x$ among the minimizers of the limit problem.

    5. Non-linear optimization with constraints

Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$ be $C^1$ functions. We consider the following non-linear optimization problem:

(14) $$\max_x f(x), \quad g(x) \le 0, \quad x \ge 0.$$


We denote the feasible set by $X$:

$$X = \{x \in \mathbb{R}^n : x \ge 0,\ g(x) \le 0\},$$

and the solution set by $S$:

$$S = \{x \in X : f(x) = \sup_X f\}.$$

In this section we derive necessary conditions, called the Karush-Kuhn-Tucker (KKT) conditions, for a point to be a solution of the problem. We start by explaining these conditions, which generalize both the Lagrange multipliers for equality constraints and the optimality conditions from linear programming. We then show that under convexity hypotheses these conditions are in fact sufficient. After that, we show that under a condition called constraint qualification the KKT conditions are indeed necessary optimality conditions. We end the discussion with several criteria that make it possible to check the constraint qualification condition in practice.

5.1. KKT conditions. For $y \in \mathbb{R}^m$ and $\lambda \in \mathbb{R}^n$ define the Lagrangian

$$L(x, y, \lambda) = f(x) - y^T g(x) + \lambda^T x.$$

For $(x, \lambda, y) \in \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^m$ the KKT conditions are the following:

(15) $$\frac{\partial L}{\partial x_i} = 0, \qquad g(x) \le 0, \quad y^T g(x) = 0, \qquad x \ge 0, \quad \lambda^T x = 0, \quad \lambda, y \ge 0.$$

The variables $y$ and $\lambda$ are called the Lagrange multipliers.
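The following sketch (objective and constraint invented for the illustration) solves a small instance of (14) numerically and recovers the multipliers from the stationarity equation in (15), checking complementarity:

    import numpy as np
    from scipy.optimize import minimize

    # max f(x) = -(x1-1)^2 - (x2-2)^2  s.t.  g(x) = x1 + x2 - 1 <= 0, x >= 0;
    # the solution is x = (0, 1) with multipliers y = 2 and lambda = (0, 0).
    f  = lambda x: -(x[0] - 1)**2 - (x[1] - 2)**2
    Df = lambda x: np.array([-2*(x[0] - 1), -2*(x[1] - 2)])
    g  = lambda x: x[0] + x[1] - 1
    Dg = np.array([1.0, 1.0])

    res = minimize(lambda x: -f(x), x0=np.zeros(2),
                   constraints=({'type': 'ineq', 'fun': lambda x: -g(x)},),
                   bounds=[(0, None)] * 2)
    x = res.x                          # approximately (0, 1)

    # Stationarity Df - y Dg + lambda = 0; x2 > 0 forces lambda_2 = 0.
    y = Df(x)[1] / Dg[1]
    lam = y * Dg - Df(x)
    print(x, y, lam)
    print(y * g(x), lam @ x)           # complementarity: both near 0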

Several variations of the KKT conditions arise in different problems. For instance, in the case in which there are no positivity constraints on the variable $x$, the KKT conditions take the following form: for $(x, y) \in \mathbb{R}^n \times \mathbb{R}^m$ and $L(x, y) = f(x) - y^T g(x)$,

(16) $$\frac{\partial L}{\partial x_i} = 0, \qquad g(x) \le 0, \quad y^T g(x) = 0, \qquad y \ge 0.$$

Exercise 35. Derive (16) from (15) by writing $x = x^+ - x^-$, where $x^+, x^- \ge 0$.

Another example is equality constraints $g(x) = 0$, again without positivity constraints on the variable $x$. We can write the equality constraint as $g(x) \le 0$ and $-g(x) \le 0$. Let $y^\pm$ be the multipliers corresponding to $\pm g(x) \le 0$, and define $y = y^+ - y^-$. Then (16) can be written as

$$\frac{\partial f}{\partial x_i} = \sum_{j=1}^m y_j \frac{\partial g_j}{\partial x_i}, \qquad g(x) = 0,$$

that is, $y$ is the Lagrange multiplier for the equality constraint $g(x) = 0$.

Consider a linear programming problem where in (14) we set

$$f(x) = cx, \qquad g(x) = Ax - b.$$

Then the KKT conditions are

$$c - y^T A = -\lambda^T, \qquad Ax \le b, \quad y^T (Ax - b) = 0, \qquad x \ge 0, \quad \lambda^T x = 0, \quad \lambda, y \ge 0.$$

In this case, the first of these conditions can be rewritten as

$$c - y^T A \le 0,$$

that is, since $y \ge 0$, $y$ is admissible for the dual problem. Using the condition $\lambda^T x = 0$ we conclude that

$$cx = y^T A x.$$


Then the second line of the KKT conditions yields $y^T A x = y^T b$, which implies

$$cx = y^T b,$$

the optimality criterion for the linear programming problem; this shows that a solution of the KKT conditions is in fact a solution of (14). Furthermore, it also shows that $y$ is a solution to the dual problem.

Example 10. Let $Q$ be an $n \times n$ real symmetric matrix. Consider the quadratic programming problem

(17) $$\max_x \frac{1}{2} x^T Q x, \quad Ax \le b, \quad x \ge 0.$$

The KKT conditions are

(18) $$x^T Q - y^T A = -\lambda^T, \qquad Ax \le b, \quad y^T (Ax - b) = 0, \qquad x \ge 0, \quad \lambda^T x = 0, \quad \lambda, y \ge 0.$$

5.2. Duality and sufficiency of the KKT conditions. We can write problem (14) in the following minimax form:

$$\sup_{x \ge 0} \inf_{y \ge 0}\ f(x) - y^T g(x).$$

We define the dual problem as

(19) $$\inf_{y \ge 0} \sup_{x \ge 0}\ f(x) - y^T g(x).$$

Let

$$h^*(y) = \sup_{x \ge 0}\ f(x) - y^T g(x),$$

and

$$h(x) = \inf_{y \ge 0}\ f(x) - y^T g(x).$$


Then (14) is equivalent to

$$\sup_{x \ge 0} h(x),$$

and (19) is equivalent to the problem

$$\inf_{y \ge 0} h^*(y).$$

From exercise 22, we have the duality inequality

$$\sup_{x \ge 0} h(x) = \sup_{x \ge 0} \inf_{y \ge 0}\ f(x) - y^T g(x) \le \inf_{y \ge 0} \sup_{x \ge 0}\ f(x) - y^T g(x) = \inf_{y \ge 0} h^*(y).$$

Furthermore, if $x \ge 0$ and $y \ge 0$ satisfy

$$h(x) = h^*(y),$$

then $x$ and $y$ are, respectively, solutions to (14) and (19).

If we choose

$$f(x) = cx, \qquad g(x) = Ax - b,$$

(14) is a linear programming problem. Then

$$h(x) = \begin{cases} cx & \text{if } Ax - b \le 0 \\ -\infty & \text{otherwise}, \end{cases}$$

and

$$h^*(y) = \begin{cases} b^T y & \text{if } A^T y - c^T \ge 0 \\ +\infty & \text{otherwise}. \end{cases}$$

Consider the quadratic programming problem

(20) $$\max \frac{1}{2} x^T Q x, \quad Ax - b \le 0.$$

Note that here the variable $x$ does not have any sign constraint. In this case we define

$$h(x) = \inf_{y \ge 0} \frac{1}{2} x^T Q x - y^T (Ax - b) = \begin{cases} \frac{1}{2} x^T Q x & \text{if } Ax - b \le 0 \\ -\infty & \text{otherwise}, \end{cases}$$


and

$$h^*(y) = \sup_x \frac{1}{2} x^T Q x - y^T (Ax - b).$$

If we assume that $Q$ is non-singular and negative definite, we have

$$h^*(y) = -\frac{1}{2} y^T A Q^{-1} A^T y + y^T b.$$

It is easy to check directly that $h(x) \le h^*(y)$.
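A quick numerical check of this inequality on invented data ($Q$ negative definite, as assumed above):

    import numpy as np

    Q = np.array([[-2.0, 0.0],
                  [ 0.0, -1.0]])       # negative definite
    A = np.array([[1.0, 1.0]])
    b = np.array([1.0])

    h     = lambda x: 0.5 * x @ Q @ x if np.all(A @ x <= b) else -np.inf
    hstar = lambda y: -0.5 * y @ A @ np.linalg.inv(Q) @ A.T @ y + y @ b

    rng = np.random.default_rng(0)
    for _ in range(5):
        x = rng.normal(size=2)
        y = rng.uniform(size=1)        # y >= 0
        assert h(x) <= hstar(y)
    print("h(x) <= h*(y) on all samples")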

It turns out that the KKT conditions are in fact sufficient if $f$ and $g$ satisfy additional convexity conditions.

Proposition 25. Suppose that $-f$ and each component of $g$ are convex. Let $(x, \lambda, y) \in \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^m$ be a solution of the KKT conditions (15). Then $x$ is a solution of (14).

Proof. Let $\tilde x \in X$. By the concavity of $f$ we have

$$f(\tilde x) - f(x) \le Df(x)(\tilde x - x).$$

By the KKT conditions (15),

$$Df(x)(\tilde x - x) = y^T Dg(x)(\tilde x - x) - \lambda^T (\tilde x - x).$$

Since each component of $g$ is convex and $y \ge 0$,

$$y^T Dg(x)(\tilde x - x) \le y^T (g(\tilde x) - g(x)).$$

Since $y^T g(x) = 0$, $y^T g(\tilde x) \le 0$, $\lambda^T \tilde x \ge 0$, and $\lambda^T x = 0$, we have

$$f(\tilde x) - f(x) \le 0,$$

that is, $x$ is a solution. □

As the next proposition shows, the KKT conditions imply strong duality.

Proposition 26. Suppose that $-f$ and each component of $g$ are convex. Let $(x, \lambda, y) \in \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^m$ be a solution of the KKT conditions (15). Then

$$h(x) = h^*(y).$$


Proof. Observe that, by the previous proposition, any solution to

$$Df(x) - y^T Dg(x) + \lambda^T = 0,$$

with $\lambda \ge 0$, $\lambda^T x = 0$, is a maximizer of the function

$$f(x) - y^T g(x)$$

under the constraint $x \ge 0$. Therefore

$$h^*(y) = f(x) - y^T g(x) = f(x),$$

since $y^T g(x) = 0$. Furthermore,

$$h(x) = f(x) + \inf_{y \ge 0} (-y^T g(x)) = f(x),$$

because $g(x) \le 0$. Thus

$$h(x) = h^*(y). \qquad \square$$

5.3. Constraint qualification and KKT conditions. Consider the constraints

(21) $$g(x) \le 0, \quad x \ge 0.$$

Let $X$ denote the admissible set for (21). For $x \in X$ define the active coordinate indices as $I(x) = \{i : x_i = 0\}$ and the active constraint indices as $J(x) = \{j : g_j(x) = 0\}$. For $x \in X$ define the tangent cone to the admissible set $X$ at the point $x$ as the set $T(x)$ of vectors $v \in \mathbb{R}^n$ which satisfy

$$v_i \ge 0, \qquad v \cdot Dg_j(x) \le 0,$$

for all $i \in I(x)$ and all $j \in J(x)$. We say that the constraints satisfy the constraint qualification condition if for any $x \in X$ and any $v \in T(x)$ there exists a $C^1$ curve $x(t)$, with $x(0) = x$ and $\dot x(0) = v$, such that $x(t) \in X$ for all $t \ge 0$ sufficiently small.

Proposition 27. Let $x$ be a solution of (14), and assume that the constraint qualification condition holds. Then there exist $\lambda \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$ such that (15) holds.


Proof. Fix $v \in T(x)$ and let $x(t)$ be a curve as in the constraint qualification condition. Because $x$ is a maximizer,

(22) $$0 \ge \frac{d}{dt} f(x(t)) \Big|_{t=0} = v \cdot Df(x).$$

From Farkas' lemma (Lemma 17) we know that either there is $v \in T(x)$ such that $v \cdot Df > 0$, or else the vector $Df$ belongs to the cone generated by $-e_i$, for $i \in I$, and $Dg_j(x)$, for $j \in J$. By (22) we know that the first alternative does not hold; hence there exist a vector $\lambda \in \mathbb{R}^n$, with $\lambda_i \ge 0$ for $i \in I$ and $\lambda_i = 0$ for $i \in I^c$, and $y \in \mathbb{R}^m$, with $y_j \ge 0$ for $j \in J$ and $y_j = 0$ for $j \in J^c$, such that

$$Df = y^T Dg - \lambda^T.$$

By the construction of $y$ and $\lambda$, as well as the definition of $I$ and $J$, it is clear that $\lambda^T x = 0$ as well as $y^T g(x) = 0$. □

To give an interpretation of the Lagrange multipliers in the KKT conditions, consider the family of problems

(23) $$\max_x f(x), \quad g^\alpha(x) \le 0,$$

where $\alpha \in \mathbb{R}^m$ and

$$g^\alpha(x) = g(x) - \alpha.$$

We will assume that for all $\alpha$ the constraint qualification condition holds. Furthermore, assume that there exists a unique solution $x^\alpha$, which is a differentiable function of $\alpha$. Define the value function

$$V(\alpha) = f(x^\alpha).$$

Let $y^\alpha \in \mathbb{R}^m$ be the corresponding Lagrange multipliers, which we assume to be also differentiable.

We claim that, for any $\alpha_0 \in \mathbb{R}^m$,

(24) $$\frac{\partial V(\alpha_0)}{\partial \alpha_j} = y^{\alpha_0}_j.$$


To prove this identity, observe first that, using the KKT conditions,

$$\frac{\partial V(\alpha)}{\partial \alpha_j} = \sum_k \frac{\partial f(x^\alpha)}{\partial x_k} \frac{\partial x^\alpha_k}{\partial \alpha_j} = \sum_{k,i} y^\alpha_i \frac{\partial g^\alpha_i(x^\alpha)}{\partial x_k} \frac{\partial x^\alpha_k}{\partial \alpha_j}.$$

By differentiating the complementarity condition $\sum_k y^\alpha_k g^\alpha_k(x^\alpha) = 0$ with respect to $\alpha_j$ we obtain

(25) $$0 = \sum_k \left[ \frac{\partial y^\alpha_k}{\partial \alpha_j} g^\alpha_k(x^\alpha) + y^\alpha_k \sum_i \frac{\partial g^\alpha_k(x^\alpha)}{\partial x_i} \frac{\partial x^\alpha_i}{\partial \alpha_j} \right] - y^\alpha_j.$$

For $\alpha = \alpha_0$ we either have $g^{\alpha_0}_k(x^{\alpha_0}) = 0$, or $g^{\alpha_0}_k(x^{\alpha_0}) < 0$, in which case $y^\alpha_k$ vanishes in a neighborhood of $\alpha_0$. Consequently, in this last case we have $\frac{\partial y^{\alpha_0}_k}{\partial \alpha_j} = 0$. Therefore

$$\frac{\partial y^{\alpha_0}_k}{\partial \alpha_j} g^{\alpha_0}_k(x^{\alpha_0}) = 0.$$

So, from (25), we conclude that

$$y^{\alpha_0}_j = \sum_{k,i} y^{\alpha_0}_k \frac{\partial g^{\alpha_0}_k(x^{\alpha_0})}{\partial x_i} \frac{\partial x^{\alpha_0}_i}{\partial \alpha_j}.$$

Thus we obtain (24).

5.4. Checking the constraint qualification conditions. Consider the following optimization problem:

(26) $$\max_x x_1, \quad x_2 - (1 - x_1)^3 \le 0, \quad x \ge 0.$$

The Lagrangian is

$$L(x, y, \lambda) = x_1 - y \left( x_2 - (1 - x_1)^3 \right) + \lambda_1 x_1 + \lambda_2 x_2,$$

and so

$$\frac{\partial L(x, y, \lambda)}{\partial x_1} = 1 - 3(1 - x_1)^2 y + \lambda_1.$$

In particular, when $x_1 = 1$, the equation

$$1 + \lambda_1 = 0$$

does not have a solution with $\lambda_1 \ge 0$. Hence the KKT conditions are not satisfied. Nevertheless, the point $(x_1, x_2) = (1, 0)$ is a solution.

This example illustrates the need for simple criteria to check whether the constraint qualification conditions hold. We will show that the following are sufficient conditions for the verification of the constraint qualification:

1. the Mangasarian-Fromowitz condition: for any $x \in X$ there is $v$ such that $Dg_i(x) \cdot v < 0$ for every active constraint $i$;
2. the Cotte-Dragominescu condition: for any $x \in X$ the gradients of the active constraints are positively linearly independent, that is, $\sum_i y_i Dg_i(x) = 0$ with $y \ge 0$ implies $y = 0$;
3. the Arrow-Hurwicz and Uzawa condition: for any $x \in X$ the gradients of the active constraints are linearly independent.

It is obvious that 3. implies 2. We will show that 1. is equivalent to 2. To do so we need the following lemma:

Proposition 28 (Gordon alternative). Let $A$ be a real-valued $m \times n$ matrix. Then one and only one of the following holds:

• There exists $x \in \mathbb{R}^n$ such that $Ax < 0$;
• There exists $y \in \mathbb{R}^m$, $y \geq 0$ and $y \neq 0$, such that $y^T A = 0$.

Proof. (i) It is clear that the two conditions are mutually exclusive. Otherwise, if $Ax < 0$ and $y^T A = 0$ with $y \geq 0$, $y \neq 0$, we would have $0 = (y^T A)x = y^T(Ax) < 0$, which is a contradiction.

(ii) We consider the following optimization problem:

(27) $\max_y\; y_1 + \cdots + y_m$ subject to $y^T A = 0$, $y \geq 0$.


It is clear that if the second alternative holds then the value of this problem is $+\infty$. Otherwise, $y = 0$ is a solution and the value is $0$. In this case the dual problem

(28) $\min_x\; 0$ subject to $(Ax)_i \leq -1$, $i = 1, \ldots, m,$

has a solution, i.e., there is a point $x$ satisfying the constraints. Hence, the first alternative holds. □
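The proof suggests a practical test. The sketch below (ours; it uses scipy's linprog and is merely illustrative) decides which alternative holds by solving the feasibility problem (28):

```python
import numpy as np
from scipy.optimize import linprog

def gordon_alternative(A):
    """Return ('first', x) with Ax < 0, or ('second', None) if no such x exists."""
    m, n = A.shape
    # feasibility version of (28): min 0 subject to (Ax)_i <= -1
    res = linprog(c=np.zeros(n), A_ub=A, b_ub=-np.ones(m),
                  bounds=[(None, None)] * n, method='highs')
    if res.success:
        return 'first', res.x
    return 'second', None     # a certificate y >= 0, y != 0, y^T A = 0 exists

print(gordon_alternative(np.array([[1.0, 0.0], [0.0, 1.0]])))  # first: x = (-1, -1)
print(gordon_alternative(np.array([[1.0], [-1.0]])))           # second: y = (1, 1)
```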

    Proposition 29. The Cotte-Dragominescu condition is equivalent to

    the Mangasarian-Fromowitz condition.

Proof. Set $A = Dg$, the matrix whose rows are the gradients of the active constraints. The Mangasarian-Fromowitz condition corresponds to the first case in the Gordon alternative. Therefore, the only solution of $\sum_i y_i\, Dg_i = 0$, $y \geq 0$, is $y = 0$. Thus the Cotte-Dragominescu condition is satisfied. Conversely, if the only solution to $\sum_i y_i\, Dg_i = 0$, $y \geq 0$, is $y = 0$, then the second case of the Gordon alternative does not hold. Then the first alternative holds, and so the Mangasarian-Fromowitz condition is satisfied. □

    Theorem 30. If the Mangasarian-Fromowitz condition holds then the

    constraint qualification condition is satisfied.

Proof. Let $x_0 \in X$. Take $w$ such that $Dg_i(x_0)\, w \leq 0$ for all active constraints $i$. We must construct a curve $x(\epsilon)$ in such a way that $x(\epsilon) \in X$ for $\epsilon$ sufficiently small and such that $\dot x(0) = w$. Let $v$ be a vector as in the Mangasarian-Fromowitz condition. Take $M$ sufficiently large and define

$x(\epsilon) = x_0 + \epsilon w + M\epsilon^2 v.$

Then, using Taylor series, we have

$g_i(x(\epsilon)) = g_i(x_0) + \epsilon\, Dg_i(x_0)\, w + M\epsilon^2\, Dg_i(x_0)\, v + \frac{\epsilon^2}{2}\, w^T D^2 g_i(x_0)\, w + O(\epsilon^3).$

Thus, if $M$ is large enough and $\epsilon > 0$ sufficiently small, $g_i(x(\epsilon)) < 0$. □
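The curve used in the proof is easy to check numerically. A minimal sketch (our own toy instance, with $X$ the unit disc) confirms that $x(\epsilon) = x_0 + \epsilon w + M\epsilon^2 v$ stays feasible:

```python
import numpy as np

# Toy instance (ours): X = {g <= 0} with g(x) = |x|^2 - 1, x0 = (1, 0) on
# the boundary. Dg(x0) = (2, 0); w = (0, 1) is tangent (Dg(x0) w = 0) and
# v = (-1, 0) satisfies Dg(x0) v < 0 (Mangasarian-Fromowitz direction).
g = lambda x: x @ x - 1.0
x0 = np.array([1.0, 0.0])
w, v, M = np.array([0.0, 1.0]), np.array([-1.0, 0.0]), 2.0

for eps in [0.1, 0.05, 0.01]:
    print(eps, g(x0 + eps * w + M * eps**2 * v))   # negative: x(eps) lies in X
```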

    Theorem 31. If either the Cotte-Dragominescu condition or the Arrow-

    Hurwicz and Uzawa condition hold then so does the constraint qualifi-

    cation condition.


    6. Bibliographical notes

In what concerns linear programming problems, we have used the books [GSS08] and [Fra02]...


    2

    Calculus of variations in one independent variable

This chapter is dedicated to a classical subject in the calculus of variations: variational problems with one independent variable. These are extremely important because of their applications to classical mechanics and Riemannian geometry. Furthermore, they serve as a model for optimal control problems and for problems with multiple integrals. We start in section 1 by deriving the Euler-Lagrange equation and giving some elementary applications. Then, in section 2, we study additional necessary conditions for minimizers, and in section 3 we discuss several applications to Riemannian geometry and classical mechanics.

An introduction to the Hamiltonian formalism is given in section 4. The next topic, in section 5, is the study of sufficient conditions for a trajectory to be a minimizer: first we establish the existence of local minimizers, then we study the connections between smooth solutions of Hamilton-Jacobi equations and global minimizers, and finally we discuss the Jacobi equation, conjugate points and curvature.

Symmetries are an important topic in the calculus of variations. In section 6 we present Routh's method for the integration of Lagrangian systems and Noether's theorem.

Of course, not every solution to the Euler-Lagrange equation is a minimizer. Section 7 is a brief introduction to minimax methods and to the mountain pass theorem. We also consider several examples of non-existence of minimizing orbits (Lavrentiev phenomenon) and relaxation methods (Young measures) in section 9.


    Invariant measures for Lagrangian and Hamiltonian systems are

    considered in section 8.

The next part of this chapter is dedicated to the study of the geometry of Hamiltonian systems: symplectic and Poisson structures, the Darboux theorem and Arnold-Liouville integrability, in section 10.

In the last section, section 11, we consider perturbation problems and describe the Lindstedt series perturbation procedure.

    We end the chapter with bibliographical notes.

    1. Euler-Lagrange Equations

In classical mechanics, the trajectories $x : [0, T] \to \mathbb{R}^n$ of a mechanical system are determined by a variational principle called the

    minimal action principle. This principle asserts that the trajectories

    are minimizers (or at least critical points) of an integral functional. In

    this section we study this problem and discuss several examples.

Consider a mechanical system on $\mathbb{R}^n$ with kinetic energy $K(x, v)$ and potential energy $U(x, v)$. We define the Lagrangian $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ to be the difference between the kinetic energy $K$ and the potential energy $U$ of the system, that is, $L = K - U$. The variational formulation of classical mechanics asserts that the trajectories of this mechanical system minimize (or at least are critical points of) the action functional

$S[x] = \int_0^T L(x(t), \dot x(t))\, dt,$

under fixed boundary conditions. More precisely, a $C^1$ trajectory $x : [0, T] \to \mathbb{R}^n$ is a minimizer of $S$ under fixed boundary conditions if for any $C^1$ trajectory $y : [0, T] \to \mathbb{R}^n$ such that $x(0) = y(0)$ and $x(T) = y(T)$ we have

$S[x] \leq S[y].$


In particular, for any $C^1$ function $\varphi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$, and any $\epsilon \in \mathbb{R}$, we have

$i(\epsilon) = S[x + \epsilon\varphi] \geq S[x] = i(0).$

Thus $i(\epsilon)$ has a minimum at $\epsilon = 0$. So, if $i$ is differentiable, $i'(0) = 0$. A trajectory $x$ is a critical point of $S$ if for any $C^1$ function $\varphi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$ we have

$i'(0) = \dfrac{d}{d\epsilon} S[x + \epsilon\varphi]\Big|_{\epsilon = 0} = 0.$

    The critical points of the action which are of class C2 are solutions

    to an ordinary differential equation, the Euler-Lagrange equation, that

    we derive in what follows. Any minimizer of the action functional

    satisfies further necessary conditions which will be discussed in section

    2.

Theorem 32 (Euler-Lagrange equation). Let $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function. Suppose that $x : [0, T] \to \mathbb{R}^n$ is a $C^2$ critical point of the action $S$ under fixed boundary conditions $x(0)$ and $x(T)$. Then

(29) $\dfrac{d}{dt} D_v L(x, \dot x) - D_x L(x, \dot x) = 0.$

Proof. Let $x$ be as in the statement. Then for any $\varphi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$, the function

$i(\epsilon) = S[x + \epsilon\varphi]$

has a critical point at $\epsilon = 0$. Thus

$i'(0) = 0,$

that is,

$\int_0^T D_x L(x, \dot x) \cdot \varphi + D_v L(x, \dot x) \cdot \dot\varphi = 0.$

Integrating by parts, we conclude that

$\int_0^T \left[\dfrac{d}{dt} D_v L(x, \dot x) - D_x L(x, \dot x)\right] \cdot \varphi = 0,$


for all $\varphi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$. By the fundamental lemma of the calculus of variations, this implies (29) and ends the proof of the theorem. □

Example 11. In classical mechanics, the kinetic energy $K$ of a particle with mass $m$ and trajectory $x(t)$ is

$K = \dfrac{m|\dot x|^2}{2}.$

Suppose that the potential energy $U(x)$ depends only on the position $x$. Assume also that $U$ is smooth. The Lagrangian for this mechanical system is then

$L = K - U,$

and the corresponding Euler-Lagrange equation is

$m\ddot x = -DU(x),$

which is Newton's law.
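Derivations such as this one can be checked symbolically. A minimal sketch (ours, using sympy) recovers Newton's law from the Lagrangian of Example 11 in one dimension:

```python
import sympy as sp

t, m = sp.symbols('t m', positive=True)
x = sp.Function('x')
U = sp.Function('U')
L = m * sp.diff(x(t), t)**2 / 2 - U(x(t))   # L = K - U in one dimension

p = sp.diff(L, sp.diff(x(t), t))            # D_v L, the momentum
el = sp.diff(p, t) - sp.diff(L, x(t))       # left-hand side of (29)
print(sp.simplify(el))                      # m*x''(t) + U'(x(t)) = 0 is (29)
```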

Exercise 36. Let $P \in \mathbb{R}^n$, and consider the Lagrangian $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ defined by $L(x, v) = g(x)|v|^2 + P \cdot v - U(x)$, where $g$ and $U$ are $C^2$ functions. Determine the Euler-Lagrange equation and show that it does not depend on $P$.

Exercise 37. Suppose we form a surface of revolution by connecting a point $(x_0, y_0)$ with a point $(x_1, y_1)$ by a curve $(x, y(x))$, $x \in [x_0, x_1]$, and then revolving it around the $y$ axis. The area of this surface is

$\int_{x_0}^{x_1} x\sqrt{1 + (y')^2}\, dx.$

Compute the Euler-Lagrange equation and study its solutions.

To understand the behavior of the Euler-Lagrange equation it is sometimes useful to change coordinates. The following proposition shows how this is achieved:

Proposition 33. Let $x : [0, T] \to \mathbb{R}^n$ be a critical point of the action

$\int_0^T L(x, \dot x)\, dt.$


Let $g : \mathbb{R}^n \to \mathbb{R}^n$ be a $C^2$ diffeomorphism and $\tilde L$ be given by

$\tilde L(y, w) = L(g(y), Dg(y)\, w).$

Then $y = g^{-1} \circ x$ is a critical point of

$\int_0^T \tilde L(y, \dot y)\, dt.$

    Proof. This is a simple computation and is left as an exercise to

    the reader.

    Before proceeding, we will discuss some applications of variational

    methods to classical mechanics. As mentioned before, the trajectories

    of a mechanical system with kinetic energy K and potential energy

    U are critical points of the action corresponding to the Lagrangian

$L = K - U$. In the following examples we use this variational principle to study the motion of a particle in a central field, and the planar two-body problem.

Example 12 (Central field motion). Consider the Lagrangian of a particle in the plane subjected to a radial potential field,

$L(x, y, \dot x, \dot y) = \dfrac{\dot x^2 + \dot y^2}{2} - U\left(\sqrt{x^2 + y^2}\right).$

Consider polar coordinates $(r, \theta)$, that is, $(x, y) = (r\cos\theta, r\sin\theta) = g(r, \theta)$. We can change coordinates (see proposition 33) and obtain the Lagrangian in these new coordinates:

$\tilde L(r, \theta, \dot r, \dot\theta) = \dfrac{\dot r^2 + r^2\dot\theta^2}{2} - U(r).$

Then the Euler-Lagrange equations can be written as

$\dfrac{d}{dt}\left(r^2\dot\theta\right) = 0, \qquad \dfrac{d}{dt}\dot r = -U'(r) + r\dot\theta^2.$

The first equation implies that $r^2\dot\theta$ is conserved. Therefore, setting $\omega = r^2\dot\theta$, we have $r\dot\theta^2 = \frac{\omega^2}{r^3}$. Multiplying the second equation by $\dot r$ we get

$\dfrac{d}{dt}\left[\dfrac{\dot r^2}{2} + U(r) + \dfrac{\omega^2}{2r^2}\right] = 0.$


Consequently,

$E = \dfrac{\dot r^2}{2} + U(r) + \dfrac{\omega^2}{2r^2}$

is a conserved quantity. Thus, we can solve for $\dot r$ as a function of $r$ (given the values of the conserved quantities $E$ and $\omega$) and so obtain a first-order differential equation for the trajectories.
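The conservation of $r^2\dot\theta$ can also be verified symbolically. The following sketch (our own, using sympy) computes the Euler-Lagrange expression for $\theta$ and shows it is exactly $\frac{d}{dt}(r^2\dot\theta)$:

```python
import sympy as sp

t = sp.symbols('t')
r, th = sp.Function('r'), sp.Function('theta')
U = sp.Function('U')
L = (sp.diff(r(t), t)**2 + r(t)**2 * sp.diff(th(t), t)**2) / 2 - U(r(t))

# Euler-Lagrange equation for theta; L does not depend on theta itself.
el_theta = sp.diff(sp.diff(L, sp.diff(th(t), t)), t) - sp.diff(L, th(t))
print(sp.expand(el_theta))   # r^2 theta'' + 2 r r' theta' = d/dt (r^2 theta')
```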

Example 13 (Planar two-body problem). Consider now the problem of two point bodies in the plane, with trajectories $(x_1, y_1)$ and $(x_2, y_2)$. Suppose that the interaction potential energy $U$ depends only on the distance $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$ between them. We will show how to reduce this problem to the one of a single body under a radial field. The Lagrangian of this system is

$L = m_1\dfrac{\dot x_1^2 + \dot y_1^2}{2} + m_2\dfrac{\dot x_2^2 + \dot y_2^2}{2} - U\left(\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}\right).$

Consider new coordinates $(X, Y, x, y)$, where $(X, Y)$ is the center of mass,

$X = \dfrac{m_1 x_1 + m_2 x_2}{m_1 + m_2}, \qquad Y = \dfrac{m_1 y_1 + m_2 y_2}{m_1 + m_2},$

and $(x, y)$ is the relative position of the two bodies,

$x = x_1 - x_2, \qquad y = y_1 - y_2.$

In these new coordinates the Lagrangian, using proposition 33, is

$\tilde L = L_1(\dot X, \dot Y) + L_2(x, y, \dot x, \dot y).$

Therefore, the equations for the variables $X$ and $Y$ are decoupled from the ones for $x, y$. Elementary computations show that

$\dfrac{d^2}{dt^2}X = \dfrac{d^2}{dt^2}Y = 0.$

Thus $X(t) = X_0 + V_X t$ and $Y(t) = Y_0 + V_Y t$, for suitable constants $X_0$, $Y_0$, $V_X$ and $V_Y$.

Since

$L_2 = \dfrac{m_1 m_2}{m_1 + m_2}\,\dfrac{\dot x^2 + \dot y^2}{2} - U\left(\sqrt{x^2 + y^2}\right),$

the problem is now reduced to the previous example.
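The decoupling claimed above is a short symbolic computation. A sketch (ours, using sympy) substitutes the center-of-mass and relative coordinates into the two-body Lagrangian and checks that it splits with reduced mass $\frac{m_1 m_2}{m_1 + m_2}$:

```python
import sympy as sp

t = sp.symbols('t')
m1, m2 = sp.symbols('m1 m2', positive=True)
X, Y, x, y = (sp.Function(s) for s in 'XYxy')
U = sp.Function('U')
d = lambda f: sp.diff(f, t)

# original positions expressed through (X, Y) and (x, y)
x1 = X(t) + m2 / (m1 + m2) * x(t); y1 = Y(t) + m2 / (m1 + m2) * y(t)
x2 = X(t) - m1 / (m1 + m2) * x(t); y2 = Y(t) - m1 / (m1 + m2) * y(t)

L = m1 * (d(x1)**2 + d(y1)**2) / 2 + m2 * (d(x2)**2 + d(y2)**2) / 2 \
    - U(sp.sqrt(x(t)**2 + y(t)**2))
mu = m1 * m2 / (m1 + m2)                       # reduced mass
L_split = (m1 + m2) * (d(X(t))**2 + d(Y(t))**2) / 2 \
    + mu * (d(x(t))**2 + d(y(t))**2) / 2 - U(sp.sqrt(x(t)**2 + y(t)**2))
print(sp.simplify(L - L_split))                # 0: the Lagrangian splits
```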


Exercise 38 (Two-body problem). Consider a system of two point bodies in $\mathbb{R}^3$ with masses $m_1$ and $m_2$, whose relative location is given by the vector $r \in \mathbb{R}^3$. Assume that the interaction depends only on the distance between the bodies. Show that, by choosing appropriate coordinates, the motion can be reduced to the one of a single point particle with mass $M = \frac{m_1 m_2}{m_1 + m_2}$ under a radial potential. Show, by proving that $r \times \dot r$ is conserved, that the orbit of a particle under a radial field lies in a fixed plane for all times.

Exercise 39. Let $x : [0, T] \to \mathbb{R}^n$ be a solution to the Euler-Lagrange equation associated to a $C^2$ Lagrangian $L : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$. Show that

$E(t) = -L(x, \dot x) + \dot x \cdot D_v L(x, \dot x)$

is constant in time. For mechanical systems this is simply the conservation of energy. Occasionally, the identity $\frac{d}{dt}E(t) = 0$ is also called the Beltrami identity.

Exercise 40. Consider a system of $n$ point bodies with masses $m_i$ and positions $r_i \in \mathbb{R}^3$, $1 \leq i \leq n$. Suppose the kinetic energy is $T = \sum_i \frac{m_i}{2}|\dot r_i|^2$ and the potential energy is $U = -\sum_i \sum_{j \neq i} \frac{m_i m_j}{2|r_i - r_j|}$. Let $I = \sum_i m_i |r_i|^2$. Show that

$\dfrac{d^2}{dt^2}I = 4T + 2U,$

which is strictly positive if the energy $T + U$ is positive. What implications does this identity have for the stability of planetary systems?

Exercise 41 (Jacobi metric). Let $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ Lagrangian. Let $x : [0, T] \to \mathbb{R}^n$ be a solution to the corresponding Euler-Lagrange equation

(30) $\dfrac{d}{dt}D_v L - D_x L = 0,$

for the Lagrangian

$L(x, v) = \dfrac{|v|^2}{2} - V(x).$

Let

$E(t) = \dfrac{|\dot x(t)|^2}{2} + V(x(t)).$

1. Show that $\dot E = 0$.


2. Let $E_0 = E(0)$. Show that $x$ is a solution to the Euler-Lagrange equation

(31) $\dfrac{d}{dt}D_v L_J - D_x L_J = 0$

associated to $L_J = \sqrt{E_0 - V(x)}\,|\dot x|$.
3. Show that any reparametrization in time of $x$ is also a solution to (31), and observe that the functional

$\int_0^T \sqrt{E_0 - V(x)}\,|\dot x|$

represents the length of the path between $x(0)$ and $x(T)$ in the Jacobi metric $g_{ij} = (E_0 - V(x))\,\delta_{ij}$.
4. Show that the solutions to the Euler-Lagrange equation (31), when reparametrized in time in such a way that the energy of the reparametrized trajectory is $E_0$, satisfy (30).

Exercise 42 (Brachistochrone problem). Let $(x_1, y_1)$ be a point in a (vertical) plane. Show that the curve $y = u(x)$ that connects $(0, 0)$ to $(x_1, y_1)$ in such a way that a particle with unit mass, moving under the influence of a unit gravity field, reaches $(x_1, y_1)$ in the minimum amount of time minimizes

$\int_0^{x_1} \sqrt{\dfrac{1 + (u')^2}{2u}}\, dx.$

Hint: use the fact that the sum of kinetic and potential energy is constant.

Determine the Euler-Lagrange equation and study its solutions, using exercise 39.

Exercise 43. Consider a second-order variational problem:

(32) $\min_x \int_0^T L(x, \dot x, \ddot x),$

where the minimum is taken over all trajectories $x : [0, T] \to \mathbb{R}^n$ with fixed boundary data $x(0)$, $x(T)$, $\dot x(0)$, $\dot x(T)$. Determine the Euler-Lagrange equation corresponding to (32).


    2. Further necessary conditions

    A classical strategy in the study of variational problems consists

    in establishing necessary conditions for minimizers. If there exists a

    minimizer and if the necessary conditions have a unique solution, then

    this solution has to be the unique minimizer and thus the problem is

    solved. In addition to Euler-Lagrange equations, several other neces-

    sary conditions can be derived. In this section we discuss boundary

    conditions which arise, for instance when the end-points are not fixed,

    and second-order conditions.

    2.1. Boundary conditions. In certain problems, the boundary

conditions, such as end-point values, are not prescribed a priori. In

    this case, it is possible to prove that the minimizers satisfy certain

    boundary conditions automatically. These are called natural boundary

    conditions.

Example 14. Consider the problem of minimizing the integral

(33) $\int_0^T L(x, \dot x)\, dt,$

over all $C^2$ curves $x : [0, T] \to \mathbb{R}^n$. Note that the boundary values for the trajectory $x$ at $t = 0, T$ are not prescribed a priori.

Let $x$ be a minimizer of (33) (with free endpoints). Then for all $\varphi : [0, T] \to \mathbb{R}^n$, not necessarily compactly supported,

$\int_0^T D_x L(x, \dot x) \cdot \varphi + D_v L(x, \dot x) \cdot \dot\varphi\; dt = 0.$

Integrating by parts and using the fact that $x$ is a solution to the Euler-Lagrange equation, we conclude that

$D_v L(x(0), \dot x(0)) = D_v L(x(T), \dot x(T)) = 0.$
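For a concrete instance (our own example, not from the text), take $L = \frac{\dot x^2 + x^2}{2}$ in one dimension. The Euler-Lagrange equation is $\ddot x = x$, and imposing the natural conditions $\dot x(0) = \dot x(T) = 0$ on the general solution forces $x \equiv 0$:

```python
import sympy as sp

t, T, C1, C2 = sp.symbols('t T C1 C2')
xg = C1 * sp.exp(t) + C2 * sp.exp(-t)        # general solution of x'' = x
natural = [sp.diff(xg, t).subs(t, 0),        # D_v L = x'(0) = 0
           sp.diff(xg, t).subs(t, T)]        # D_v L = x'(T) = 0
print(sp.solve(natural, [C1, C2]))           # {C1: 0, C2: 0}, so x = 0
```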


Exercise 44. Consider the problem of minimizing the integral

$\int_0^T L(x, \dot x)\, dt,$

over all $C^2$ curves $x : [0, T] \to \mathbb{R}^n$ such that $x(0) = x(T)$. Deduce that

$D_v L(x(0), \dot x(0)) = D_v L(x(T), \dot x(T)).$

Use the previous identity to show that any periodic (smooth) minimizer is in fact a periodic solution to the Euler-Lagrange equations.

Exercise 45. Consider the problem of minimizing

$\int_0^T L(x, \dot x)\, dt + \psi(x(T)),$

with $x(0)$ fixed and $x(T)$ free. Derive a boundary condition at $t = T$ for the minimizers.

Exercise 46 (Free boundary). Consider the problem of minimizing

$\int_0^T L(x, \dot x),$

over all terminal times $T$ and all $C^2$ curves $x : [0, T] \to \mathbb{R}^n$. Show that $x$ is a solution to the Euler-Lagrange equation and that

$L(x(T), \dot x(T)) = 0,$
$D_x L(x(T), \dot x(T)) \cdot \dot x(T) + D_v L(x(T), \dot x(T)) \cdot \ddot x(T) \geq 0,$
$D_v L(x(T), \dot x(T)) = 0.$

Let $q \in \mathbb{R}$ and $L : \mathbb{R}^2 \to \mathbb{R}$ be given by

$L(x, v) = \dfrac{(v - q)^2}{2} + \dfrac{x^2}{2} - 1.$

If possible, determine $T$ and $x : [0, T] \to \mathbb{R}$ that are (local) minimizers of

$\int_0^T L(x, \dot x)\, ds,$

with $x(0) = 0$.


2.2. Second-order conditions. If $f : \mathbb{R} \to \mathbb{R}$ is a $C^2$ function which has a minimum at a point $x_0$, then $f'(x_0) = 0$ and $f''(x_0) \geq 0$. For the minimal action problem, the analog of the vanishing of the first derivative is the Euler-Lagrange equation. We will now consider the analog of the second derivative being non-negative.

The next theorem concerns second-order conditions for minimizers:

Theorem 34 (Jacobi's test). Let $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ Lagrangian. Let $x : [0, T] \to \mathbb{R}^n$ be a $C^1$ minimizer of the action under fixed boundary conditions. Then, for each $\varphi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$, we have

(34) $\int_0^T \dfrac{1}{2}\varphi^T D^2_{xx}L(x, \dot x)\,\varphi + \varphi^T D^2_{xv}L(x, \dot x)\,\dot\varphi + \dfrac{1}{2}\dot\varphi^T D^2_{vv}L(x, \dot x)\,\dot\varphi \geq 0.$

Proof. If $x$ is a minimizer, the function $\epsilon \mapsto S[x + \epsilon\varphi]$ has a minimum at $\epsilon = 0$. By computing $\frac{d^2}{d\epsilon^2} S[x + \epsilon\varphi]$ at $\epsilon = 0$ we obtain (34). □

A corollary of the previous theorem is Lagrange's test, which we state next:

Corollary 35 (Lagrange's test). Let $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ Lagrangian. Suppose $x : [0, T] \to \mathbb{R}^n$ is a $C^1$ minimizer of the action under fixed boundary conditions. Then

$D^2_{vv}L(x, \dot x) \geq 0.$

Proof. Use Theorem 34 with $\varphi = \epsilon\psi(t)\sin\frac{t}{\epsilon}$, for $\psi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$, and let $\epsilon \to 0$. □

Exercise 47. Let $L : \mathbb{R}^{2n} \to \mathbb{R}$ be a continuous Lagrangian and let $x : [0, T] \to \mathbb{R}^n$ be a continuous piecewise $C^1$ trajectory. Show that for each $\epsilon > 0$ there exists a trajectory $y : [0, T] \to \mathbb{R}^n$ of class $C^1$ such that

$\left|\int_0^T L(x, \dot x) - \int_0^T L(y, \dot y)\right| < \epsilon.$


As a corollary, show that the value of the infimum of the action over piecewise $C^1$ trajectories is the same as the infimum over trajectories globally $C^1$. Note, however, that a minimizer may not be $C^1$.

Exercise 48 (Weierstrass test). Let $x : [0, T] \to \mathbb{R}^n$ be a $C^1$ minimizer of the action corresponding to a Lagrangian $L$. Let $v, w \in \mathbb{R}^n$ and $0 \leq \lambda \leq 1$ be such that $\lambda v + (1 - \lambda)w = 0$. Show that

$\lambda L(x, \dot x + v) + (1 - \lambda)L(x, \dot x + w) \geq L(x, \dot x).$

Hint: To prove the inequality at a point $t_0$, choose $\varphi_\epsilon$ such that

$\dot\varphi_\epsilon(t) = \begin{cases} v & \text{if } t_0 \leq t \leq t_0 + \lambda\epsilon, \\ w & \text{if } t_0 + \lambda\epsilon < t \leq t_0 + \epsilon, \\ 0 & \text{otherwise,} \end{cases}$

and consider $S[x + \varphi_\epsilon]$ as $\epsilon \to 0$.

    3. Applications to Riemannian geometry

This section is dedicated to some applications of the calculus of variations to Riemannian geometry, namely the study of geodesics and curvature. We also present some applications to geometric mechanics, namely the study of the rigid body.

In our examples we will mostly use local coordinates and will not try to address global problems in geometry. In fact, by using suitable charts, the problems we address can usually be reduced to problems in $\mathbb{R}^n$. To simplify the notation we will also use the Einstein convention for repeated indices, that is, $a_i b_i$ is in fact an abbreviation for $\sum_i a_i b_i$.

Example 15. Let $M$ be a Riemannian manifold with metric $g$, defined in local coordinates by the positive definite symmetric matrix $g_{ij}(x)$. Let $L : TM \to \mathbb{R}$ be given by

$L(x, v) = \dfrac{1}{2}g_{ij}(x)v^i v^j.$


Let $x : [a, b] \to M$ be a curve that minimizes

$\int_a^b L(x, \dot x)\, dt$

over all curves with certain fixed boundary conditions. Then we have

$\dfrac{d}{dt}\left(g_{ij}\dot x^i\right) - \dfrac{1}{2}D_j g_{mk}\,\dot x^m\dot x^k = 0,$

that is,

(35) $\ddot x^i + \dfrac{1}{2}g^{ij}\left(D_k g_{mj} + D_k g_{mj} - D_j g_{mk}\right)\dot x^m\dot x^k = 0,$

where $g^{ij}$ represents the inverse matrix of $g_{ij}$. We can write the previous equation in the more compact form

$\ddot x^i + \Gamma^i_{km}\dot x^m\dot x^k = 0,$

where

(36) $\Gamma^i_{km} = \dfrac{1}{2}g^{ij}\left(D_k g_{mj} + D_m g_{kj} - D_j g_{mk}\right)$

is the Christoffel symbol for the metric $g$ (note that the change in the order of the indices in the second term does not change the sum in (35), but makes $\Gamma$ symmetric in the indices $m$ and $k$).
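Formula (36) is mechanical enough to implement directly. The helper below (our own sketch, using sympy) computes the symbols for an arbitrary metric; as a test case we use the hyperbolic half-plane metric $g = \operatorname{diag}(1/y^2, 1/y^2)$, which is not treated in the exercises below:

```python
import sympy as sp

def christoffel(g, coords):
    """Gamma[i][k][m] from (36): (1/2) g^{ij} (D_k g_{mj} + D_m g_{kj} - D_j g_{mk})."""
    n, ginv = len(coords), g.inv()
    return [[[sp.simplify(sum(ginv[i, j] * (sp.diff(g[m, j], coords[k])
                                            + sp.diff(g[k, j], coords[m])
                                            - sp.diff(g[m, k], coords[j]))
                              for j in range(n)) / 2)
              for m in range(n)] for k in range(n)] for i in range(n)]

x, y = sp.symbols('x y', positive=True)
g = sp.diag(1 / y**2, 1 / y**2)              # hyperbolic half-plane metric
Gamma = christoffel(g, [x, y])
print(Gamma[0][0][1], Gamma[1][0][0], Gamma[1][1][1])   # -1/y, 1/y, -1/y
```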

Theorem 36. Let $g_{ij}$ be a smooth Riemannian metric in $\mathbb{R}^n$. The critical points $x$ of the functional

(37) $\int_0^T \dfrac{1}{2}g_{ij}(x)\dot x^i\dot x^j\, dt$

are also critical points of the functional

(38) $\int_0^T \sqrt{g_{ij}(x)\dot x^i\dot x^j}\, dt.$

Additionally, we can reparametrize the critical points of (38) in such a way that they are also critical points of (37).

Proof. The fact that the critical points of (37) are critical points of (38) is a simple computation. To prove the second part of the theorem it suffices to observe that the solutions of the Euler-Lagrange equation associated to $L$ preserve the energy $E = \frac{1}{2}g_{ij}(x)\dot x^i\dot x^j$. Using this fact it is

    gij(x)xixj. Using this fact is


easy to find the correct parametrization of the critical points of (38). □

    The minimizers of (38) are called geodesics, although sometimes

    the name is also used for critical points.

Example 16. Consider a parametrization $f : A \subset \mathbb{R}^m \to \mathbb{R}^n$ of an $m$-dimensional manifold. The induced metric in $\mathbb{R}^m$ is represented by the matrix

$g = (Df)^T Df.$

The motivation is the following: given a curve $\gamma(t) \in M$, consider the corresponding tangent vector $\dot\gamma(t)$ in $TM$. Let $x = f(\gamma)$ and $\dot x = Df\,\dot\gamma$. Then we define

$\langle\dot\gamma, \dot\gamma\rangle = \langle\dot x, \dot x\rangle,$

which gives rise precisely to the induced metric.

Exercise 49. Consider $\mathbb{R}^2\setminus\{0\}$ with polar coordinates $(r, \theta)$. Show that the standard metric in $\mathbb{R}^2$ can be written in these coordinates as

$g = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}.$

Let

$L(r, \theta, \dot r, \dot\theta) = \dfrac{\dot r^2 + r^2\dot\theta^2}{2}$

be the Lagrangian of a free particle in polar coordinates. Compute the Euler-Lagrange equation and determine the corresponding Christoffel symbol.

Exercise 50. Consider the sphere $x^2 + y^2 + z^2 = 1$ and the associated spherical coordinates $(\theta, \varphi)$,

$x = \cos\theta\sin\varphi, \qquad y = \sin\theta\sin\varphi, \qquad z = \cos\varphi,$

with $\theta \in (0, 2\pi)$ and $\varphi \in (0, \pi)$. Show that the induced metric is given by the matrix

$g = \begin{pmatrix} \sin^2\varphi & 0 \\ 0 & 1 \end{pmatrix}.$

Determine the Euler-Lagrange equation for $L = \frac{1}{2}g_{ij}v^i v^j$ and the Christoffel symbols corresponding to the coordinates $(\theta, \varphi)$.

Exercise 51. Consider the surface of revolution in $\mathbb{R}^3$ parametrized by $(r, \theta)$:

$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z(r).$

Show that the induced metric is

$g = \begin{pmatrix} 1 + (z')^2 & 0 \\ 0 & r^2 \end{pmatrix}.$

Show that the equation for the geodesics is

$\ddot\theta + \dfrac{2}{r}\dot r\dot\theta = 0, \qquad \ddot r - \dfrac{r\dot\theta^2}{1 + (z')^2} + \dfrac{z'z''}{1 + (z')^2}\dot r^2 = 0.$

Determine the corresponding Christoffel symbols. Prove the Clairaut identity, that is, that $r\cos\alpha$ is constant, where $\alpha$ is the angle between the velocity $\dot r\frac{\partial}{\partial r} + \dot\theta\frac{\partial}{\partial\theta}$ and the parallel direction $\frac{\partial}{\partial\theta}$.

Exercise 52 (Spherical pendulum). Show that for a spherical pendulum with unit mass, the Lagrangian can be written as

$L = \dfrac{\dot\theta^2\sin^2\varphi + \dot\varphi^2}{2} - U(\varphi).$

Exercise 53. Determine the Lagrangian of a point particle constrained to the cone $z^2 = x^2 + y^2$.

Exercise 54. Consider the Lagrangian of a particle of unit mass constrained to move on the cycloid parametrized by

$x = \theta - \sin\theta, \qquad y = \cos\theta.$

Show that the $y$ coordinate is $2\pi$-periodic in time for any initial condition that yields a periodic orbit.

3.1. Parallel Transport. The Christoffel symbols $\Gamma^i_{km}$ can be

    used to study parallel transport in a Riemannian manifold. In this

    section we define and discuss the main properties of parallel transport.

Let $M$ be a manifold and $\mathfrak{X}(M)$ the set of all $C^\infty$ vector fields on $M$. As usual in differential geometry, we identify vector fields on $M$ with the corresponding first-order linear differential operators. That is, if $X = (X^1, \ldots, X^n)$ is a vector field, we identify $X$ with the first-order differential operator

$X = \sum_i X^i\dfrac{\partial}{\partial x_i}.$

Then, the commutator of two vector fields $X$ and $Y$ is the vector field $[X, Y]$, which is defined through its action as a differential operator on smooth functions $f$:

$[X, Y]f = X(Y(f)) - Y(X(f)).$

A connection $\nabla$ on $M$ is a mapping

$\nabla : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$

satisfying the following properties:

1. $\nabla_{fX + gY}Z = f\nabla_X Z + g\nabla_Y Z$,
2. $\nabla_X(Y + Z) = \nabla_X Y + \nabla_X Z$,
3. $\nabla_X(fY) = f\nabla_X Y + X(f)Y$,

for all $X, Y, Z \in \mathfrak{X}(M)$ and all $f, g \in C^\infty(M)$.

The vector $\nabla_X Y$ represents the rate of variation of $Y$ along a curve tangent to $X$.


Exercise 55. Let $M$ be a manifold and $\nabla$ a connection on $M$. Define $\Gamma^i_{km}$ by

$\nabla_{\frac{\partial}{\partial x_k}}\dfrac{\partial}{\partial x_m} = \Gamma^i_{km}\dfrac{\partial}{\partial x_i}.$

Show that

(39) $\nabla_X Y = \left(\Gamma^i_{km}X^k Y^m + X^j\dfrac{\partial Y^i}{\partial x_j}\right)\dfrac{\partial}{\partial x_i},$

where $X = X^j\frac{\partial}{\partial x_j}$ and $Y = Y^j\frac{\partial}{\partial x_j}$.

At every point $x$, formula (39) depends only on the value of the vector field $X$ at $x$. This allows us to define the covariant derivative of a vector field $Y$ along a curve $x(t)$ through

$\dfrac{DY}{dt} = \nabla_{\dot x}Y.$

A vector field $X$ is parallel along a curve $x(t)$ if

$\dfrac{DX}{dt} = 0.$
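In coordinates, by exercise 55, $X$ is parallel along $x(t)$ exactly when $\dot X^i + \Gamma^i_{km}\dot x^k X^m = 0$, a linear ODE that is easy to integrate numerically. A sketch (ours, using scipy, for the hyperbolic half-plane metric considered earlier) transports a vector along the horizontal line $y = 1$:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Curve x(t) = (t, 1) in the half-plane, velocity (1, 0). The nonzero
# symbols are Gamma^x_{xy} = Gamma^x_{yx} = -1/y, Gamma^y_{xx} = 1/y,
# Gamma^y_{yy} = -1/y, so the parallel transport system reduces to
# X1' = X2, X2' = -X1 along this curve.
def rhs(t, X):
    y = 1.0
    return [(1.0 / y) * X[1], -(1.0 / y) * X[0]]

sol = solve_ivp(rhs, [0.0, np.pi / 2], [1.0, 0.0], rtol=1e-10)
print(sol.y[:, -1])   # ~ (0, -1): the vector rotates while staying parallel
```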

A connection is symmetric if

$\nabla_X Y - \nabla_Y X = [X, Y].$

In general, connections on a manifold do not have to be symmetric, and therefore

$\nabla_X Y - \nabla_Y X = T(X, Y) + [X, Y],$

where $T$ is the torsion.

    Exercise 56. Determine an expression for the torsion in local coordi-

    nates.

Exercise 57. Let $\nabla$ be a symmetric connection. Show that $\Gamma^k_{ij} = \Gamma^k_{ji}$.

A manifold can be endowed with different connections. For Riemannian manifolds, of special interest are the connections which are


compatible with the metric, that is, such that for all vector fields $X$ and $Y$

(40) $\dfrac{d}{dt}\langle X, Y\rangle = \left\langle\dfrac{DX}{dt}, Y\right\rangle + \left\langle X, \dfrac{DY}{dt}\right\rangle,$

where the derivatives are taken along any arbitrary curve $x(t)$. There exists a unique symmetric connection compatible with the metric, the Levi-Civita connection, whose Christoffel symbols are given by (36).

Theorem 37. Let $M$ be a Riemannian manifold with metric $g$. Then the Levi-Civita connection, defined in local coordinates by the Christoffel symbols (36), is the unique connection which is symmetric and compatible with the metric $g$.

Proof. Let $\nabla$ be a connection which is symmetric and compatible with the metric $g$. Then one can use (40) to determine $D_k g_{mj}$, $D_m g_{kj}$ and $D_j g_{mk}$, and it is a simple computation to show that its Christoffel symbols are given by (36). □
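As a computational footnote (our own check), compatibility with the metric reads, in coordinates, $D_k g_{mj} = \Gamma^l_{km} g_{lj} + \Gamma^l_{kj} g_{ml}$; the sketch below verifies it for the symbols (36) on the hyperbolic half-plane metric:

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
coords = [x, y]
g = sp.diag(1 / y**2, 1 / y**2)              # hyperbolic half-plane again
ginv = g.inv()
Gamma = [[[sum(ginv[i, j] * (sp.diff(g[m, j], coords[k])
               + sp.diff(g[k, j], coords[m])
               - sp.diff(g[m, k], coords[j])) for j in range(2)) / 2
           for m in range(2)] for k in range(2)] for i in range(2)]

ok = all(sp.simplify(sp.diff(g[m, j], coords[k])
                     - sum(Gamma[l][k][m] * g[l, j] + Gamma[l][k][j] * g[m, l]
                           for l in range(2))) == 0
         for k in range(2) for m in range(2) for j in range(2))
print(ok)   # True: the symbols (36) are compatible with g
```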

    Exercise 58. Verify that the Christoffel symbols define a connection.

Exercise 59. Use formula (36) to determine the Christoffel symbols corresponding to polar coordinates in $\mathbb{R}^2$; compare with the result of exercise 49.

Exercise 60. Let $X$ be a vector field and $x$ a trajectory that satisfies

$\dfrac{dx}{dt} = X(x).$

Show that in local coordinates

$\ddot x^i\dfrac{\partial}{\partial x_i} = X^k(x)\dfrac{\partial X^i}{\partial x_k}\dfrac{\partial}{\partial x_i},$

and, therefore,

$\dfrac{DX}{dt} = \left(\Gamma^i_{km}\dot x^k\dot x^m + \ddot x^i\right)\dfrac{\partial}{\partial x_i}.$


Show that the previous definition is independent of the choice of local coordinates, which allows us to define the covariant acceleration as

$\dfrac{D\dot x}{dt} = \left(\Gamma^i_{km}\dot x^k\dot x^m + \ddot x^i\right)\dfrac{\partial}{\partial x_i},$

for any $C^2$ trajectory.

Example 17. Equation (35) can then be rewritten as

$\dfrac{D\dot x}{dt} = 0,$

which should be compared with Newton's law for a particle in the absence of forces, $\ddot x = 0$.

Exercise 61. Let $M$ be a Riemannian manifold on which is defined a potential $V : M \to \mathbb{R}$. The corresponding Lagrangian is

$L(x, v) = \dfrac{1}{2}g_{ij}v^i v^j - V(x).$