Calculus of Variations and Partial Differential Equations

Diogo Aguiar Gomes


Contents

Introduction

1. Finite dimensional optimization problems
  1. Unconstrained minimization in $\mathbb{R}^n$
  2. Convexity
  3. Lagrange multipliers
  4. Linear programming
  5. Non-linear optimization with constraints
  6. Bibliographical notes

2. Calculus of variations in one independent variable
  1. Euler-Lagrange Equations
  2. Further necessary conditions
  3. Applications to Riemannian geometry
  4. Hamiltonian dynamics
  5. Sufficient conditions
  6. Symmetries and Noether theorem
  7. Critical point theory
  8. Invariant measures
  9. Non-convex problems
  10. Geometry of Hamiltonian systems
  11. Perturbation theory
  12. Bibliographical notes

3. Calculus of variations and elliptic equations
  1. Euler-Lagrange equation
  2. Further necessary conditions and applications
  3. Convexity and sufficient conditions
  4. Direct method in the calculus of variations
  5. Euler-Lagrange equations
  6. Regularity by energy methods
  7. Hölder continuity
  8. Schauder estimates

4. Optimal control and viscosity solutions
  1. Elementary examples and properties
  2. Dynamic programming principle
  3. Pontryagin maximum principle
  4. The Hamilton-Jacobi equation
  5. Verification theorem
  6. Existence of optimal controls - bounded control space
  7. Sub and superdifferentials
  8. Optimal control in the calculus of variations setting
  9. Viscosity solutions
  10. Stationary problems

5. Duality theory
  1. Model problems
  2. Some informal computations
  3. Duality
  4. Generalized Mather problem
  5. Monge-Kantorowich problem

Bibliography

Index


    Introduction

This book is dedicated to the study of the calculus of variations and its connections and applications to partial differential equations. We have tried to survey a wide range of techniques and problems, discussing both classical results and more recent techniques. This text is suitable for a first one-year graduate course on calculus of variations and optimal control, and is organized in the following way:

    1. Finite dimensional optimization problems;

    2. Calculus of variations with one independent variable;

    3. Calculus of variations and elliptic partial differential equations;

    4. Deterministic optimal control and viscosity solutions;

    5. Duality theory.

The first chapter is dedicated to finite dimensional optimization, giving emphasis to techniques that can be generalized and applied to infinite dimensional problems. This chapter starts with an elementary discussion of unconstrained optimization in $\mathbb{R}^n$ and convexity. Then we discuss constrained optimization problems, linear programming, and the KKT conditions. The following chapter concerns variational problems with one independent variable. We study classical results, including applications to Riemannian geometry and classical mechanics. We also discuss sufficient conditions for minimizers, Hamiltonian dynamics, and several other related topics. The next chapter concerns variational problems with functionals defined through multiple integrals. In many of these problems, the Euler-Lagrange equation is an elliptic partial differential equation, possibly non-linear. Using the direct method in the calculus of variations, we prove the existence of minimizers. Then we show that the minimum is a weak solution to the Euler-Lagrange equation and study its regularity. The study of regularity follows the classical path: first we consider energy methods, then we prove the De Giorgi-Nash-Moser estimates, and finally the Schauder estimates. In the fourth chapter we consider optimal control problems. We study both classical control theory methods, such as dynamic programming and the Pontryagin maximum principle, as well as more recent tools, such as viscosity solutions of Hamilton-Jacobi equations. The last chapter is a brief introduction to (infinite dimensional) duality theory and its applications to non-linear partial differential equations. We study Mather's problem and the Monge-Kantorowich optimal mass transport problem. These have important relations with the Hamilton-Jacobi and Monge-Ampère equations, respectively.

The prerequisites of these notes are some familiarity with Sobolev spaces and functional analysis, at the level of [Eva98b]. With a few exceptions, we do not assume familiarity with partial differential equations beyond the elementary theory.

Many of the results discussed, as well as important extensions, can be found in the bibliography. In what concerns finite dimensional optimization and linear programming, the main reference is [Fra02]. On variational problems with one independent variable, a key reference is [AKN97]. The approach to elliptic equations in chapter 3 was strongly influenced by the course the author attended at the University of California at Berkeley, taught by Fraydoun Rezakhanlou, by the (unpublished) notes on elliptic equations by my advisor L. C. Evans, and by the book [Gia83]. The books [GT01] and [Gia93] are also classical references in this area. Optimal control problems are discussed in chapter 4; the main references are [Eva98b], [Lio82], [Bar94], [FS93], and [BCD97]. The last chapter concerns duality theory. We recommend the books [Eva99], [Vil03a], and [Vil], as well as the author's papers [Gom00] and [Gom02b].


I would like to thank my students Tiago Alcaria, Patrícia Engrácia, Sílvia Guerra, Igor Kravchenko, Anabela Pelicano, Ana Rita Pires, Verónica Quítalo, Lucian Radu, Joana Santos, Ana Santos, and Vítor Saraiva, who took courses based on parts of these notes and suggested several corrections and improvements. My friend Pedro Girão deserves special thanks, as he read the first LaTeX version of these notes and suggested many corrections and improvements.


1. Finite dimensional optimization problems

This chapter is an introduction to optimization problems in finite dimension. We are certain that many of the results discussed, as well as their proofs, are familiar to the reader. However, we feel that it is instructive to recall them and, throughout this text, observe how they can be adapted to infinite dimensional problems. The plan of this chapter is the following: we start in section 1 by considering unconstrained minimization problems in $\mathbb{R}^n$; we discuss existence and uniqueness of minimizers, as well as first and second order tests for minimizers. The following section, section 2, concerns properties of convex functions which will be needed throughout the text. We start the discussion of constrained optimization problems in section 3 by studying the Lagrange multiplier method for equality constraints. Then, the general case involving both equality and inequality constraints is discussed in the two remaining sections: in section 4 we consider linear programming problems, and in section 5 we discuss non-linear optimization problems and derive the Karush-Kuhn-Tucker (KKT) conditions. The chapter ends with a few bibliographical references.

The general setting of optimization problems is the following: given a function $f : \mathbb{R}^n \to \mathbb{R}$ and a set $X \subset \mathbb{R}^n$, called the admissible set, we would like to solve the following minimization problem:

(1) $$\min f(x), \quad x \in X,$$

i.e., to find the solution set $S \subset X$ such that

$$f(y) = \inf_X f$$

for all $y \in S$. We should note that the min in (1) should be read "minimize" rather than "minimum", as the minimum may not be achieved. The number $\inf_X f$ is called the value of problem (1).

1. Unconstrained minimization in $\mathbb{R}^n$

In this section we address the unconstrained minimization case, that is, the case in which the admissible set $X$ is $\mathbb{R}^n$. Let $f : \mathbb{R}^n \to \mathbb{R}$ be an arbitrary function. We look for conditions on $f$ that

- ensure the existence of a minimum;
- show that this minimum is unique.

In many instances, existence and uniqueness results are not enough: we would also like to

- determine necessary or sufficient conditions for a point to be a minimum;
- estimate the location of a possible minimum.

By looking for all points that satisfy necessary conditions one can determine a set of candidate minimizers. Then, by looking at sufficient conditions one may in fact be able to show that some of these points are indeed minimizers.

To study the existence of a minimum of $f$, we can use the following procedure, called the direct method of the calculus of variations: let $(x_n)$ be a minimizing sequence, that is, a sequence such that

$$f(x_n) \to \inf f.$$

Proposition 1. Let $A$ be an arbitrary set and $f : A \to \mathbb{R}$. Then there exists a minimizing sequence.


Proof. If $\inf_A f = -\infty$, there exists $x_n \in A$ such that $f(x_n) \to -\infty$. Otherwise, if $\inf_A f > -\infty$, we can always find $x_n \in A$ such that $\inf_A f \le f(x_n) \le \inf_A f + \frac{1}{n}$, which again produces a minimizing sequence. □

Let $f : \mathbb{R}^n \to \mathbb{R}$. Suppose $(x_n)$ is a minimizing sequence for $f$. If $x_n$ (or some subsequence) converges to a point $x$ and, additionally, $f(x_n)$ converges to $f(x)$, then $x$ is a minimum of $f$: indeed,

$$f(x) = \lim f(x_n) \qquad \text{and} \qquad \lim f(x_n) = \inf f,$$

the latter because $x_n$ is a minimizing sequence. Thus $f(x) = \inf f$. Although minimizing sequences always exist, they may fail to converge, even up to subsequences, as the next exercise illustrates:

Exercise 1. Consider the function $f(x) = e^x$. Compute $\inf f$ and give an example of a minimizing sequence. Show that no minimizing sequence for $f$ converges.

As the previous exercise suggests, to ensure convergence it is natural to impose certain compactness conditions. In $\mathbb{R}^n$, any bounded sequence $(x_n)$ has a convergent subsequence. A convenient condition on $f$ that ensures boundedness of minimizing sequences is coercivity: a function $f : \mathbb{R}^n \to \mathbb{R}$ is called coercive if $f(x) \to +\infty$ as $|x| \to \infty$.

Exercise 2. Let $f$ be a coercive function and let $x_n$ be a sequence such that $f(x_n)$ is bounded. Show that $x_n$ is bounded. Note in particular that if $f(x_n)$ is convergent then $x_n$ is bounded.

Therefore, from the previous exercise, it follows:

Proposition 2. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a coercive function and let $(x_n)$ be a minimizing sequence for $f$. Then there exists a point $x$ for which, through some subsequence, $x_n \to x$.


Unfortunately, if $f$ is discontinuous at $x$, $f(x_n)$ may fail to converge to $f(x)$. This poses a problem: if $x_n$ is a minimizing sequence then $f(x_n) \to \inf f$, and if this limit is not $f(x)$ then $x$ cannot be a minimizer. It would, therefore, seem natural to require $f$ to be continuous. However, to establish that $x$ is a minimizer we do not really need continuity. In fact, a weaker property is sufficient: it is enough that for any sequence $(x_n)$ converging to $x$ the following inequality holds:

(2) $$\liminf f(x_n) \ge f(x).$$

A function $f$ is called lower semicontinuous if inequality (2) holds for any point $x$ and any sequence $x_n$ converging to $x$.

Example 1. The function

$$f(x) = \begin{cases} 1 & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$

is lower semicontinuous. However,

$$g(x) = \begin{cases} 0 & \text{if } x \ne 0 \\ 1 & \text{if } x = 0 \end{cases}$$

is not.

    ADD HERE GRAPH OF FUNCTIONS

Proposition 3. Let $f : \mathbb{R}^n \to \mathbb{R}$ be lower semicontinuous and let $(x_n) \subset \mathbb{R}^n$ be a minimizing sequence converging to $x \in \mathbb{R}^n$. Then $x$ is a minimizer of $f$.

Proof. Let $x_n$ be a minimizing sequence. Then

$$\inf f = \lim f(x_n) = \liminf f(x_n) \ge f(x),$$

that is, $f(x) \le \inf f$. □

Lower semicontinuity is a weaker property than continuity, and therefore easier to satisfy.


Establishing the uniqueness of minimizers is, in general, more complex. A convenient condition that implies uniqueness of minimizers is convexity.

A set $A \subset \mathbb{R}^n$ is convex if for all $x, y \in A$ and any $0 \le \lambda \le 1$ we have $\lambda x + (1-\lambda)y \in A$. Let $A$ be a convex set. A function $f : A \to \mathbb{R}$ is convex if, for any $x, y \in A$ and $0 \le \lambda \le 1$,

$$f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda)f(y),$$

and it is uniformly convex if there exists $\theta > 0$ such that, for all $x, y \in A$ and $0 \le \lambda \le 1$,

$$f(\lambda x + (1-\lambda)y) + \theta \lambda (1-\lambda) |x-y|^2 \le \lambda f(x) + (1-\lambda) f(y).$$

Example 2. Let $\|\cdot\|$ be any norm in $\mathbb{R}^n$. Then, by the triangle inequality,

$$\|\lambda x + (1-\lambda) y\| \le \|\lambda x\| + \|(1-\lambda) y\| = \lambda \|x\| + (1-\lambda) \|y\|,$$

for all $0 \le \lambda \le 1$. Thus the mapping $x \mapsto \|x\|$ is convex.

Exercise 3. Show that the square of the Euclidean norm in $\mathbb{R}^d$, $\|x\|^2 = \sum_k x_k^2$, is uniformly convex.

Proposition 4. Let $A \subset \mathbb{R}^n$ be a convex set and $f : A \to \mathbb{R}$ be a convex function. If $x$ and $y$ are minimizers of $f$ then so is $\lambda x + (1-\lambda) y$, for any $0 \le \lambda \le 1$. If $f$ is uniformly convex then $x = y$.

Proof. If $x$ and $y$ are minimizers then $f(x) = f(y) = \min f$. Consequently, by convexity,

$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y) = \min f.$$

Therefore $\lambda x + (1-\lambda) y$ is a minimizer of $f$. If $f$ is uniformly convex, then choosing $0 < \lambda < 1$ we obtain

$$f(\lambda x + (1-\lambda) y) + \theta \lambda (1-\lambda) |x - y|^2 \le \min f,$$

which implies $x = y$. □


The characterization of minimizers through necessary or sufficient conditions is usually made by introducing certain conditions that involve first or second derivatives. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function. Recall that $Df$ and $D^2 f$ denote, respectively, the first and second derivatives of $f$. We also write, for an $n \times n$ matrix $A$, $A \ge 0$ if $A$ is positive semidefinite and $A > 0$ if $A$ is positive definite. The next proposition is a well-known result that illustrates this.

Proposition 5. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function and $x$ a minimizer of $f$. Then

$$Df(x) = 0 \quad \text{and} \quad D^2 f(x) \ge 0.$$

Proof. For any vector $y \in \mathbb{R}^n$ and $\epsilon > 0$ we have

$$0 \le f(x + \epsilon y) - f(x) = \epsilon Df(x) y + O(\epsilon^2);$$

dividing by $\epsilon$ and letting $\epsilon \to 0$, we obtain $Df(x) y \ge 0$. Since $y$ is arbitrary (replace $y$ by $-y$), we conclude that

$$Df(x) = 0.$$

In a similar way,

$$0 \le \frac{f(x + \epsilon y) + f(x - \epsilon y) - 2 f(x)}{\epsilon^2} = y^T D^2 f(x) y + o(1),$$

and so, when $\epsilon \to 0$, we obtain $y^T D^2 f(x) y \ge 0$. □

Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^1$ function. A point $x$ is called a critical point of $f$ if $Df(x) = 0$.

Exercise 4. Let $A$ be any set and $f : A \to \mathbb{R}$ be a $C^1$ function in the interior $\operatorname{int} A$ of $A$. Show that any maximizer or minimizer of $f$ is either a critical point or lies on the boundary $\partial A$ of $A$.


We will now show that any critical point of a convex function is a minimizer. For that we need the following preliminary result:

Proposition 6. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^1$ convex function. Then, for any $x, y$, we have

$$f(y) \ge f(x) + Df(x)(y - x).$$

Proof. By convexity,

$$(1-\lambda) f(x) + \lambda f(y) \ge f(x + \lambda(y - x)) = f(x) + \lambda Df(x)(y - x) + o(\lambda |y - x|).$$

Thus, reorganizing the inequality and dividing by $\lambda$, we obtain

$$f(y) \ge f(x) + Df(x)(y - x) + o(1),$$

as $\lambda \to 0$. □

We can now use this result to prove:

Proposition 7. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^1$ convex function and $x$ a critical point of $f$. Then $x$ is a minimizer of $f$.

Proof. Since $Df(x) = 0$ and $f$ is convex, it follows from proposition 6 that

$$f(y) \ge f(x),$$

for all $y$. □

Exercise 5. Let $f(x, \lambda) : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ be a $C^2$ function and $x_0$ a minimizer of $f(\cdot, 0)$, with $D^2_{xx} f(x_0, 0)$ positive definite. Show that, for each $\lambda$ in a neighborhood of $\lambda = 0$, there exists a unique local minimizer $x_\lambda$ of $f(\cdot, \lambda)$ with $x_\lambda|_{\lambda = 0} = x_0$. Compute $D_\lambda x_\lambda$ at $\lambda = 0$.

Growth conditions on $f$ can be used to estimate the norm of a minimizer. In finite dimensional problems, estimates on the norm of a minimizer are important for numerical methods. For instance, if such an estimate exists, it makes it possible to localize the search region for a minimizer. In infinite dimensional problems this issue is even more relevant, as will become clear later in these notes. An elementary result is given in the next exercise:

Exercise 6. Let $f : \mathbb{R}^n \to \mathbb{R}$ be such that $f(x) \ge C_1 |x|^2 + C_2$, with $C_1 > 0$. Let $x_0$ be a minimizer of $f$. Show that

$$|x_0| \le \sqrt{\frac{f(y) - C_2}{C_1}},$$

for any $y \in \mathbb{R}^n$.

Exercise 7. Let $f(x, \lambda) : \mathbb{R}^2 \to \mathbb{R}$ be a continuous function. Suppose for each $\lambda$ there is at least one minimizer $x_\lambda$ of $x \mapsto f(x, \lambda)$. Suppose there exists $C$ such that $|x_\lambda| \le C$ for all $\lambda$ in a neighborhood of $\lambda = 0$. Suppose that for $\lambda = 0$ there exists a unique minimizer $x_0$. Show that $\lim_{\lambda \to 0} x_\lambda = x_0$.

Exercise 8. Let $f \in C^1(\mathbb{R}^2)$. Define $u(x) = \inf_{y \in \mathbb{R}} f(x, y)$. Suppose that

$$\lim_{|y| \to \infty} f(x, y) = +\infty,$$

uniformly in $x$. Let $x_0$ be a point at which the infimum in $y$ of $f$ is achieved at a single point $y_0$. Show that $u$ is differentiable in $x$ at $x_0$ and that

$$\frac{\partial u}{\partial x}(x_0) = \frac{\partial f}{\partial x}(x_0, y_0).$$

Give an example that shows that $u$ may fail to be differentiable if the infimum of $f$ in $y$ is achieved at more than one point.

Exercise 9. Find all maxima and minima (both local and global) of the function $xy(1 - x^2 - y^2)$ on the square $-1 \le x, y \le 1$.

    2. Convexity

As we discussed in the previous section, convexity is a central property in optimization. In this section we discuss additional properties of convex functions which will be necessary in the sequel.


2.1. Characterization of convex functions. We now discuss several tools that are useful to characterize convex functions. We first observe that, given a family of convex functions, it is possible to build another convex function by taking the pointwise supremum. This is a useful construction and is illustrated in the figure.

    ADD FIGURE HERE

Proposition 8. Let $I$ be an arbitrary set and $f_\alpha : \mathbb{R}^n \to \mathbb{R}$, $\alpha \in I$, an indexed collection of convex functions. Let

$$f(x) = \sup_{\alpha \in I} f_\alpha(x).$$

Then $f$ is convex.

Proof. Let $x, y \in \mathbb{R}^n$ and $0 \le \lambda \le 1$. Then

$$f(\lambda x + (1-\lambda) y) = \sup_{\alpha \in I} f_\alpha(\lambda x + (1-\lambda) y) \le \sup_{\alpha \in I} \left[ \lambda f_\alpha(x) + (1-\lambda) f_\alpha(y) \right]$$
$$\le \sup_{\alpha_1 \in I} \lambda f_{\alpha_1}(x) + \sup_{\alpha_2 \in I} (1-\lambda) f_{\alpha_2}(y) = \lambda f(x) + (1-\lambda) f(y). \qquad \square$$

Corollary 9. Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is a $C^1$ function satisfying

$$f(y) \ge f(x) + Df(x)(y - x),$$

for all $x$ and $y$. Then $f$ is convex.

Proof. It suffices to observe that

$$f(y) \le \sup_{x \in \mathbb{R}^n} \left[ f(x) + Df(x)(y - x) \right]$$

(take $x = y$), and the right-hand side, being a supremum of affine functions of $y$, is convex by proposition 8. Finally, we just observe that

$$\sup_{x \in \mathbb{R}^n} \left[ f(x) + Df(x)(y - x) \right] \le f(y)$$

by hypothesis, and so equality follows. □

Proposition 10. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function. Then $f$ is convex if and only if $D^2 f(x)$ is positive semi-definite for all $x \in \mathbb{R}^n$.


Proof. Observe that if $f$ is convex then for any $y \in \mathbb{R}^n$ and any $\epsilon > 0$ we have

$$\frac{f(x - \epsilon y) + f(x + \epsilon y) - 2 f(x)}{\epsilon^2} \ge 0.$$

By sending $\epsilon \to 0$ and using Taylor's formula we conclude that

$$y^T D^2 f(x) y \ge 0,$$

and so $D^2 f(x)$ is positive semi-definite.

Conversely,

$$f(y) - f(x) = \int_0^1 Df(x + s(y-x))(y-x)\, ds$$
$$= Df(x)(y-x) + \int_0^1 \left[ Df(x + s(y-x))(y-x) - Df(x)(y-x) \right] ds$$
$$= Df(x)(y-x) + \int_0^1 \int_0^1 s\, (y-x)^T D^2 f(x + t s (y-x)) (y-x)\, dt\, ds$$
$$\ge Df(x)(y-x),$$

since $(y-x)^T D^2 f(x + t s (y-x))(y-x) \ge 0$ by the positive semi-definiteness hypothesis. Thus $f(y) \ge f(x) + Df(x)(y-x)$ for all $x, y$, and convexity follows from corollary 9. □

Proposition 11. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function. Then $f$ is convex if and only if

(3) $$f(x + y) + f(x - y) - 2 f(x) \ge 0,$$

for any $x, y \in \mathbb{R}^n$.

Proof. Clearly convexity implies (3). Conversely, let $x, y \in \mathbb{R}^n$ and $0 \le \lambda \le 1$ be such that $\lambda x + (1-\lambda) y = z$. We must prove that

(4) $$\lambda f(x) + (1-\lambda) f(y) \ge f(z)$$

holds. We claim that the previous inequality holds for any $\lambda = \frac{k}{2^j}$, for any $0 \le k \le 2^j$. Clearly (4) holds when $j = 1$. Now we proceed by induction on $j$: assume that (4) holds for $\lambda = \frac{k}{2^j}$; we claim that it then holds with $\lambda = \frac{k}{2^{j+1}}$. If $k$ is even we can reduce the fraction; therefore


we may suppose that $k$ is odd, $\lambda = \frac{k}{2^{j+1}}$, and $\lambda x + (1-\lambda) y = z$. Now note that

$$z = \frac{1}{2} \left[ \frac{k-1}{2^{j+1}} x + \left( 1 - \frac{k-1}{2^{j+1}} \right) y \right] + \frac{1}{2} \left[ \frac{k+1}{2^{j+1}} x + \left( 1 - \frac{k+1}{2^{j+1}} \right) y \right].$$

Thus, by (3),

$$f(z) \le \frac{1}{2} f\left( \frac{k-1}{2^{j+1}} x + \left( 1 - \frac{k-1}{2^{j+1}} \right) y \right) + \frac{1}{2} f\left( \frac{k+1}{2^{j+1}} x + \left( 1 - \frac{k+1}{2^{j+1}} \right) y \right),$$

but, since $k-1$ and $k+1$ are even, $k_0 = \frac{k-1}{2}$ and $k_1 = \frac{k+1}{2}$ are integers. Hence

$$f(z) \le \frac{1}{2} f\left( \frac{k_0}{2^j} x + \left( 1 - \frac{k_0}{2^j} \right) y \right) + \frac{1}{2} f\left( \frac{k_1}{2^j} x + \left( 1 - \frac{k_1}{2^j} \right) y \right).$$

But this implies, by the induction hypothesis, that

$$f(z) \le \frac{k_0 + k_1}{2^{j+1}} f(x) + \left( 1 - \frac{k_0 + k_1}{2^{j+1}} \right) f(y).$$

Since $k_0 + k_1 = k$ we get

$$f(z) \le \frac{k}{2^{j+1}} f(x) + \left( 1 - \frac{k}{2^{j+1}} \right) f(y).$$

Since $f$ is continuous and the rationals of the form $\frac{k}{2^j}$ are dense in $[0, 1]$, we conclude that

$$f(z) \le \lambda f(x) + (1-\lambda) f(y),$$

for any real $0 \le \lambda \le 1$. □

Exercise 10. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function. Show that the following statements are equivalent:

1. $f$ is uniformly convex;
2. $D^2 f \ge \theta I > 0$, for some $\theta > 0$;
3. $f\left( \frac{x+y}{2} \right) + \theta \frac{|x-y|^2}{4} \le \frac{f(x) + f(y)}{2}$, for some $\theta > 0$;
4. $f(y) \ge f(x) + Df(x)(y - x) + \frac{\theta}{2} |x - y|^2$, for some $\theta > 0$.

Exercise 11. Let $\phi : \mathbb{R} \to \mathbb{R}$ be a non-decreasing convex function and $\psi : \mathbb{R}^n \to \mathbb{R}$ a convex function. Show that $\phi \circ \psi$ is convex. Show, by giving an example, that if $\phi$ is not non-decreasing then $\phi \circ \psi$ may fail to be convex.


2.2. Lipschitz continuity. Convex functions enjoy remarkable properties. We will first show that any convex function is locally bounded and locally Lipschitz.

Proposition 12. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a convex function. Then $f$ is locally bounded and locally Lipschitz.

Proof. For $x \in \mathbb{R}^d$ denote $|x|_1 = \sum_k |x_k|$. Define $X_M = \{x \in \mathbb{R}^d : |x|_1 \le M\}$. We will prove that $f$ is bounded on $X_{M/8}$.

Any point $x \in X_M$ can be written as a convex combination of the points $\pm M e_k$, where $e_k$ is the $k$-th standard unit vector. Thus

$$f(x) \le \max_k \{ f(M e_k), f(-M e_k) \}.$$

Suppose now $f$ is not bounded from below on $X_{M/8}$. Then there exists a sequence $x_n \in X_{M/8}$ such that $f(x_n) \to -\infty$. Choose a point $y \in X_{M/4} \cap X_{M/8}^c$. Note that $2y - x_n \in X_M$, therefore we can write $2y - x_n$ as a convex combination of the points $\pm M e_k$, i.e.,

$$y = \frac{1}{2} x_n + \frac{1}{2} \sum_k \lambda_k (\pm M e_k).$$

Thus

$$f(y) \le \frac{1}{2} f(x_n) + \frac{1}{2} \max_k \{ f(M e_k), f(-M e_k) \},$$

which is a contradiction if $f(x_n) \to -\infty$.

Now we will show the second part of the proposition, i.e., that any convex function is also locally Lipschitz. By contradiction, and by changing coordinates if necessary, we can assume that $0$ is not a Lipschitz point, that is, there exists a sequence $x_n \to 0$ such that

$$|f(x_n) - f(0)| \ge C |x_n|_1,$$

for all $C$ and all $n$ large enough. In particular this implies that

$$\limsup_n \frac{f(x_n) - f(0)}{|x_n|_1} \in \{-\infty, +\infty\},$$


and, similarly,

$$\liminf_n \frac{f(x_n) - f(0)}{|x_n|_1} \in \{-\infty, +\infty\}.$$

By the previous part of the proof, we can assume that $f$ is bounded on $X_1$. For each $n$ choose a point $y_n$ with $|y_n|_1 = 1$ such that $x_n = |x_n|_1 y_n$. Then, by convexity,

$$f(x_n) \le |x_n|_1 f(y_n) + (1 - |x_n|_1) f(0),$$

which implies

$$f(y_n) \ge f(0) + \frac{f(x_n) - f(0)}{|x_n|_1}.$$

Therefore

(5) $$\limsup_n \frac{f(x_n) - f(0)}{|x_n|_1} = -\infty,$$

otherwise we would have a contradiction (note that $f(y_n)$ is bounded). We can also write

$$0 = \frac{1}{1 + |x_n|_1} x_n + \frac{|x_n|_1}{1 + |x_n|_1} (-y_n).$$

So

$$f(0) \le \frac{1}{1 + |x_n|_1} f(x_n) + \frac{|x_n|_1}{1 + |x_n|_1} f(-y_n).$$

This implies

$$f(-y_n) \ge f(0) + \frac{f(0) - f(x_n)}{|x_n|_1}.$$

Because $f(-y_n)$ is bounded,

$$\limsup_n \frac{f(0) - f(x_n)}{|x_n|_1} < +\infty,$$

which is a contradiction to (5). □

2.3. Separation. In this last subsection we study separation properties that arise from convexity and present some applications.

Proposition 13. Let $C$ be a closed convex set not containing the origin. Then there exists $x_0 \in C$ which minimizes $\|x\|$ over all $x \in C$.


Proof. Consider a minimizing sequence $x_n$. By a simple computation, we have the parallelogram identity

$$\left\| \frac{x_n + x_m}{2} \right\|^2 + \frac{1}{4} \|x_n - x_m\|^2 = \frac{1}{2} \|x_n\|^2 + \frac{1}{2} \|x_m\|^2.$$

Because $\frac{x_n + x_m}{2} \in C$, by convexity, we have the inequality

$$\left\| \frac{x_n + x_m}{2} \right\|^2 \ge \inf_{y \in C} \|y\|^2.$$

As $n, m \to \infty$ we also have

$$\|x_n\|^2, \|x_m\|^2 \to \inf_{y \in C} \|y\|^2.$$

But then, as $n, m \to \infty$, we conclude that

$$\|x_n - x_m\|^2 \to 0.$$

Therefore any minimizing sequence is a Cauchy sequence and hence convergent. □
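As a minimal numerical sketch of proposition 13 (the set $C$ below is invented for the illustration; scipy's SLSQP solver handles the constrained minimization), we compute the minimal-norm point of a closed convex half-plane:

    import numpy as np
    from scipy.optimize import minimize

    # C = {x in R^2 : x1 + x2 >= 2} is closed, convex, and avoids the
    # origin; its point of minimal norm is (1, 1).
    cons = ({'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 2},)
    res = minimize(lambda x: x @ x, x0=np.array([3.0, 0.0]),
                   constraints=cons)
    print(res.x)   # approximately [1., 1.]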

Exercise 12. Let $F : \mathbb{R}^n \to \mathbb{R}$ be a uniformly convex function. Show that any minimizing sequence for $F$ is a Cauchy sequence. Hint:

$$\frac{F(x_n) + F(x_m)}{2} - \inf F \ge \frac{F(x_n) + F(x_m)}{2} - F\left( \frac{x_n + x_m}{2} \right) \ge \frac{\theta}{4} |x_n - x_m|^2.$$

Proposition 14. Let $U$ and $V$ be disjoint closed convex sets, and suppose one of them is compact. Then there exist $w \in \mathbb{R}^n$ and $a > 0$ such that

$$(w, x - y) \ge a > 0,$$

for all $x \in U$ and $y \in V$.

Proof. Consider the closed convex set $W = U - V$ (this set is closed because either $U$ or $V$ is compact). Then there exists a point $w \in W$ with minimal norm. Since $0 \notin W$, $w \ne 0$. So, for all $x \in U$ and $y \in V$, by the convexity of $W$,

$$\|w\|^2 \le \|\lambda(x - y) + (1-\lambda) w\|^2 = (1-\lambda)^2 \|w\|^2 + 2\lambda(1-\lambda)(x - y, w) + \lambda^2 \|x - y\|^2.$$

The last inequality implies

$$0 \le ((1-\lambda)^2 - 1) \|w\|^2 + 2\lambda(1-\lambda)(x - y, w) + \lambda^2 \|x - y\|^2.$$

Dividing by $\lambda$ and letting $\lambda \to 0$ we obtain

$$(x - y, w) \ge \|w\|^2 > 0. \qquad \square$$

As a first application of the separation result we discuss a generalization of derivatives for convex functions. The subdifferential $\partial f(x)$ of a convex function $f : \mathbb{R}^n \to \mathbb{R}$ at a point $x \in \mathbb{R}^n$ is the set of vectors $p \in \mathbb{R}^n$ such that

$$f(y) \ge f(x) + p \cdot (y - x),$$

for all $y \in \mathbb{R}^n$.

Proposition 15. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a convex function and $x_0 \in \mathbb{R}^n$. Then $\partial f(x_0) \ne \emptyset$.

Proof. Consider the set $E(f) = \{(x, y) \in \mathbb{R}^{n+1} : y \ge f(x)\}$, the epigraph of $f$. Then, because $f$ is convex and hence continuous, $E(f)$ is a closed convex set. Consider the sequence $y_n = f(x_0) - \frac{1}{n}$. Because for each $n$ the sets $E(f)$ and $\{(x_0, y_n)\}$ are disjoint closed convex sets, and the second one is compact, there is a separating plane:

(6) $$f(x) \ge p_n (x - x_0) + \beta_n,$$

for all $x$, and

(7) $$f(x_0) - \frac{1}{n} = y_n \le \beta_n \le f(x_0).$$

Thus, from (7), we get that $\beta_n$ is bounded. Since $f$ is locally bounded, inequality (6) implies the boundedness of $p_n$. Therefore, up to a subsequence, there exist $p = \lim p_n$ and $\beta = \lim \beta_n$. Furthermore

$$f(x) \ge p (x - x_0) + \beta,$$

and, again using (7), we get that $\beta = f(x_0)$. Thus

$$f(x) \ge p (x - x_0) + f(x_0),$$

and so $p \in \partial f(x_0)$. □

Exercise 13. Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = |x|$. Compute $\partial f$.


Exercise 14. Let $f : \mathbb{R}^n \to \mathbb{R}$ be convex. Show that if $f$ is differentiable at $x \in \mathbb{R}^n$ then $\partial f(x) = \{Df(x)\}$.

Proposition 16. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^1$ convex function. Then

$$(Df(x) - Df(y)) \cdot (x - y) \ge 0.$$

Proof. Observe that

$$f(y) \ge f(x) + Df(x) \cdot (y - x), \qquad f(x) \ge f(y) + Df(y) \cdot (x - y).$$

Add these two inequalities. □

Exercise 15. Prove the analogue of the previous proposition for the case in which $f$ is not $C^1$, replacing derivatives by points in the subdifferential.

Exercise 16. Let $f$ be a uniformly convex function. Show that

$$(Df(x) - Df(y)) \cdot (x - y) \ge \theta |x - y|^2,$$

for some $\theta > 0$.

Exercise 17. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a convex function. Show that a point $x \in \mathbb{R}^n$ is a minimizer of $f$ if and only if $0 \in \partial f(x)$.

Exercise 18. Let $A$ be a convex set and $f : A \to \mathbb{R}$ be a uniformly convex function. Let $x \in A$ be a maximizer of $f$. Show that $x$ is an extreme point, that is, that there are no $y, z \in A$, $y, z \ne x$, and $0 < \lambda < 1$ such that $x = \lambda y + (1-\lambda) z$.

The second application of proposition 14 is a very important result called Farkas' lemma:

Lemma 17 (Farkas lemma). Let $A$ be an $m \times n$ matrix and $c$ a row vector in $\mathbb{R}^n$. Then we have one and only one of the following alternatives:

1. $c = y^T A$, for some $y \ge 0$;
2. there exists a column vector $w \in \mathbb{R}^n$ such that $Aw \le 0$ and $cw > 0$.


Proof. If the first alternative does not hold, the sets $U = \{y^T A : y \ge 0\}$ and $V = \{c\}$ are disjoint, closed, and convex. Then the separation theorem for convex sets (see proposition 14) implies that there exists a hyperplane with normal $w$ which separates them, that is,

(8) $$y^T A w \le a,$$

for all $y \ge 0$, and

$$cw > a.$$

Note that $a \ge 0$ (by setting $y = 0$ in (8)), so $cw > 0$. Furthermore, for any $y \ge 0$ and any $\mu \ge 0$ we have

$$\mu\, y^T A w \le a;$$

letting $\mu \to +\infty$ we conclude that

$$y^T A w \le 0$$

for all $y \ge 0$ (taking for $y$ the standard basis vectors gives $Aw \le 0$). So this corresponds to the second alternative. □

Example 3. Consider a discrete state one-period pricing model, that is, we are given $n$ assets which at the initial time cost $c_i$, $1 \le i \le n$, per unit (we regard $c$ as a row vector) and, after one unit of time, each asset $i$ is worth $P^j_i$ with probability $p_j$, $1 \le j \le m$. A portfolio is a (column) vector $\theta \in \mathbb{R}^n$. The value of the portfolio at time $0$ is $c\theta$ and at time one, with probability $p_j$, the value is $(P\theta)_j$. An arbitrage opportunity is a portfolio $\theta$ such that $c\theta < 0$ and $(P\theta)_j \ge 0$ for all $j$, i.e., a portfolio with negative cost and non-negative return.

Farkas' lemma yields that either

1. there exists $y \in \mathbb{R}^m$, $y \ge 0$, such that $c = y^T P$, or
2. there exists an arbitrage portfolio.

Furthermore, suppose one of the assets is a no-interest bearing bank account, for instance $c_1 = 1$ and $P^j_1 = 1$ for all $j$. Then $\sum_j y_j = c_1 = 1$, so $y$ is a probability vector, which in general may be different from $p$.
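The following sketch illustrates this dichotomy numerically (the prices and payoffs are invented for the example): we search for state prices $y \ge 0$ with $y^T P = c$ as a linear-programming feasibility problem; when none exists, Farkas' lemma guarantees an arbitrage portfolio.

    import numpy as np
    from scipy.optimize import linprog

    # P[j, i]: time-1 value of asset i in state j; c: time-0 prices.
    P = np.array([[1.0, 2.0],
                  [1.0, 0.5]])
    c = np.array([1.0, 1.2])

    # First alternative: find y >= 0 with P^T y = c^T (state prices).
    res = linprog(np.zeros(2), A_eq=P.T, b_eq=c, bounds=[(0, None)] * 2)
    print("state prices:" if res.success else "arbitrage exists", res.x)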


    3. Lagrange multipliers

Many important problems require minimizing (or maximizing) functions under equality constraints. The Lagrange multiplier method is the standard tool to study these problems. For inequality constraints, the Lagrange multiplier method can be extended in a suitable way, as will be studied in the two following sections.

Proposition 18. Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$ ($m < n$) be $C^1$ functions. Suppose $c \in \mathbb{R}^m$ is fixed, and assume that the rank of $Dg$ is $m$ at all points of the set $\{g = c\}$. Then, if $x_0$ is a minimizer of $f$ on the set $g(x) = c$, there exists $\lambda \in \mathbb{R}^m$ such that

$$Df(x_0) = \lambda^T Dg(x_0).$$

Proof. Let $x_0$ be as in the statement. Suppose that $w_1, \ldots, w_m$ are vectors in $\mathbb{R}^n$ satisfying

$$\det[Dg(x_0) W] \ne 0,$$

where $W = [w_1 \cdots w_m]$ is the matrix with columns $w_1, \ldots, w_m$. Note that it is possible to choose such vectors because the rank of $Dg$ is $m$. Given $v \in \mathbb{R}^n$, consider the equation

$$g(x_0 + \epsilon v + W i) = c.$$

The implicit function theorem implies that there exists a unique function $i(\epsilon) : \mathbb{R} \to \mathbb{R}^m$,

$$i(\epsilon) = [i_1(\epsilon), \ldots, i_m(\epsilon)]^T,$$

defined in a neighborhood of $\epsilon = 0$, with $i(0) = 0$, such that

$$g(x_0 + \epsilon v + W i(\epsilon)) = c.$$

Additionally,

$$i'(0) = -(Dg(x_0) W)^{-1} Dg(x_0) v.$$

Since $x_0$ is a minimizer of $f$ on the set $g(x) = c$, the function

$$I(\epsilon) = f(x_0 + \epsilon v + W i(\epsilon))$$

satisfies

$$0 = I'(0) = Df(x_0) v + Df(x_0) W i'(0),$$

that is,

$$Df(x_0) v = \lambda^T Dg(x_0) v,$$

with

$$\lambda^T = Df(x_0) W (Dg(x_0) W)^{-1},$$

for any vector $v$. □

Proposition 19. Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$, with $m < n$, be smooth functions. Assume that $Dg$ has maximal rank at all points. Let $x_c$ be a minimizer of $f(x)$ under the constraint $g(x) = c$, and $\lambda_c$ the corresponding Lagrange multiplier, i.e.,

(9) $$Df(x_c) = \lambda_c^T Dg(x_c).$$

Suppose that $x_c$ is a differentiable function of $c$. Define

$$V(c) = f(x_c).$$

Then $D_c V(c) = \lambda_c$.

Proof. We have

$$g(x_c) = c.$$

By differentiating with respect to $c$ we obtain

$$Dg(x_c) \frac{\partial x_c}{\partial c} = I.$$

Multiplying by $\lambda_c^T$ and using (9) yields

$$\lambda_c^T = \lambda_c^T Dg(x_c) \frac{\partial x_c}{\partial c} = Df(x_c) \frac{\partial x_c}{\partial c} = D_c V(c). \qquad \square$$

Exercise 19. Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$, with $m < n$, be smooth functions. Assume that $Dg$ has maximal rank at all points. Let $x_0$ be a minimizer of $f(x)$ under the constraint $g(x) = g(x_0)$, $\lambda$ the corresponding Lagrange multiplier, and $F = f - \lambda^T g$. Show that

$$\sum_{i,j} D^2_{x_i x_j} F(x_0)\, \xi_i \xi_j \ge 0,$$

for all vectors $\xi$ that satisfy $\sum_i D_{x_i} g(x_0)\, \xi_i = 0$.


Proposition 20. Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$, with $m < n$. Let $x_0$ be a minimizer of $f(x)$ under the constraint $g(x) = g(x_0)$. Then there exist constants $\lambda_0, \ldots, \lambda_m$, not all zero, such that

$$\lambda_0 Df + \lambda_1 Dg_1 + \cdots + \lambda_m Dg_m = 0$$

at $x_0$. Furthermore, if $Dg$ has maximal rank we can choose $\lambda_0 = 1$.

Proof. First observe that the $(m+1) \times n$ matrix

$$\begin{pmatrix} Df \\ Dg \end{pmatrix}$$

cannot have rank $m+1$ at $x_0$. Indeed, if it did, applying the implicit function theorem to the function $(x, c) \mapsto (f(x) - c_0, g(x) - \tilde c)$, with $x \in \mathbb{R}^n$ and $c = (c_0, \tilde c) \in \mathbb{R}^{m+1}$, we could find points $x$ near $x_0$ with $g(x) = g(x_0)$ and $f(x) < f(x_0)$, contradicting $x_0$ being a minimizer.

This fact then implies that there exist constants $\lambda_0, \ldots, \lambda_m$, not all zero, such that

$$\lambda_0 Df + \lambda_1 Dg_1 + \cdots + \lambda_m Dg_m = 0$$

at $x_0$. Observe also that if $Dg$ has maximal rank we can choose $\lambda_0 = 1$. In fact, if $\lambda_0 \ne 0$, it suffices to multiply by $\frac{1}{\lambda_0}$. To see that $\lambda_0 \ne 0$ we argue by contradiction: if $\lambda_0 = 0$ we would have

$$\lambda_1 Dg_1 + \cdots + \lambda_m Dg_m = 0,$$

which contradicts the hypothesis that $Dg$ has maximal rank $m$. □

Example 4 (Minimax principle). There exists a nice formal interpretation of Lagrange multipliers which, although not rigorous, is quite useful. Fix $c \in \mathbb{R}^m$, and consider the problem of minimizing a function $f : \mathbb{R}^n \to \mathbb{R}$ under the constraint $g(x) - c = 0$, with $g : \mathbb{R}^n \to \mathbb{R}^m$. This problem can be rewritten as

$$\min_x \max_\lambda\ f(x) + \lambda^T (g(x) - c).$$

The minimax principle asserts that the maximum can be exchanged with the minimum (which is frequently false) and, therefore, we obtain the equivalent problem

$$\max_\lambda \min_x\ f(x) + \lambda^T (g(x) - c).$$

From this we deduce that, for each $\lambda$, the minimum $x_\lambda$ is determined by

(10) $$Df(x_\lambda) + \lambda^T Dg(x_\lambda) = 0.$$

Furthermore, the function to maximize in $\lambda$ is

$$f(x_\lambda) + \lambda^T (g(x_\lambda) - c).$$

Differentiating this expression with respect to $\lambda$, assuming that $x_\lambda$ is differentiable, and using (10), we obtain

$$g(x_\lambda) = c.$$

Exercise 20. Use the minimax principle to determine (formally) optimality conditions for the problem

$$\min f(x)$$

under the constraint $g(x) \le c$.

The next exercise illustrates that the minimax principle may indeed be false, although in many problems it is an important heuristic.

Exercise 21. Show that the minimax principle is not valid in the following cases:

1. $x + \lambda$;
2. $x^3 + \lambda (x^2 + 1)$;
3. $\frac{1}{1 + (x - \lambda)^2}$.

Exercise 22. Let $A$ and $B$ be arbitrary sets and $F : A \times B \to \mathbb{R}$. Show that

$$\inf_{a \in A} \sup_{b \in B} F(a, b) \ge \sup_{b \in B} \inf_{a \in A} F(a, b).$$


    4. Linear programming

We now continue the study of constrained optimization problems by looking into the minimization of linear functions subject to linear inequality constraints, i.e., linear programming problems. A detailed discussion of this class of problems can be found, for instance, in [GSS08] or [Fra02].

4.1. The setting of linear programming. A model problem in linear programming is the following: given a row vector $c \in \mathbb{R}^n$, a real $m \times n$ matrix $A$, and a column vector $b \in \mathbb{R}^m$, we look for a column vector $x \in \mathbb{R}^n$ which is a solution to the problem

(11) $$\max_x cx, \quad Ax \le b, \quad x \ge 0,$$

where the notation $v \ge 0$ for a vector $v$ means that all components of $v$ are non-negative. The set defined by the inequalities $Ax \le b$ and $x \ge 0$ may be empty, or on this set the function $cx$ may be unbounded from above. To simplify the discussion, we assume that this situation does not occur.

    Move here feasible set

    Example 5. Add example here.

Observe that if $c \ne 0$ the maximizers of $cx$ cannot be interior points of the feasible set; otherwise, by exercise 4, they would be critical points. Therefore, the maximizers must lie on the boundary of $\{Ax \le b, x \ge 0\}$. Unfortunately, this boundary can be quite complex, as it consists of a finite (but frequently large) union of intersections of hyperplanes (of the form $dx = e$) with half-spaces (of the form $dx \le e$).


Exercise 23. Suppose that no row of $A$ vanishes. Show that the boundary of the set $\{Ax \le b\}$ consists of all points which satisfy $Ax \le b$ with equality in at least one coordinate.

Note that the linear programming problem (11) is quite general, as it is possible to include equality constraints as inequalities: in fact, $Ax = b$ is the conjunction of $Ax \le b$ and $-Ax \le -b$.

A vector $x$ is called feasible for (11) if it satisfies the constraints, that is, $Ax \le b$ and $x \ge 0$.

Example 6 (Diet problem). An animal food factory would like to minimize the production cost of a pet food, while keeping it nutritionally balanced. Each food item $i$ costs $c_i$ per unit. Therefore, if each unit of pet food contains an amount $x_i$ of the food item $i$, the total cost is $cx$. There is, of course, the obvious constraint that $x \ge 0$. Suppose that $A_{ij}$ represents the amount of the nutrient $i$ in the food item $j$, and $b_i$ the minimum recommended amount of the nutrient $i$. Then, to ensure a nutritionally balanced diet, we must have

$$Ax \ge b.$$

Thus the diet problem is

$$\min cx, \quad Ax \ge b, \quad x \ge 0.$$

Example 7 (Optimal transport). A large multinational needs to transport its supply from each factory $i$ to the distribution points $j$. The supply at $i$ is $s_i$ and the demand at $j$ is $d_j$. The cost of transporting one unit from $i$ to $j$ is $c_{ij}$. We would like to determine the quantity $\pi_{ij}$ transported from $i$ to $j$ by solving the following optimization problem:

$$\min \sum_{ij} c_{ij} \pi_{ij},$$

under the constraints $\pi_{ij} \ge 0$ and the supply and demand bounds

$$\sum_j \pi_{ij} \le s_i, \qquad \sum_i \pi_{ij} \ge d_j.$$
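A minimal numerical sketch of this transport problem (supplies, demands, and costs invented for the illustration), with the quantities $\pi_{ij}$ flattened row-wise into a vector:

    import numpy as np
    from scipy.optimize import linprog

    cost = np.array([1.0, 3.0,       # c_11, c_12
                     2.0, 1.0])      # c_21, c_22
    s = np.array([5.0, 5.0])         # supplies s_i
    d = np.array([4.0, 6.0])         # demands d_j

    # Rows encode: sum_j pi_1j <= s_1, sum_j pi_2j <= s_2,
    # and -sum_i pi_i1 <= -d_1, -sum_i pi_i2 <= -d_2.
    A_ub = np.array([[ 1,  1,  0,  0],
                     [ 0,  0,  1,  1],
                     [-1,  0, -1,  0],
                     [ 0, -1,  0, -1]], dtype=float)
    b_ub = np.concatenate([s, -d])

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
    print(res.x.reshape(2, 2), res.fun)   # optimal quantities and cost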

Example 8. The existence of feasible vectors, i.e., vectors satisfying the constraint $Ax \le b$, is not obvious. There exists, however, a procedure that can convert this question into a new linear programming problem. Let $x_0$ be a new variable. We would like to solve

$$\min x_0,$$

where the minimum is taken over all vectors $(x_0, x)$ which satisfy the constraints $(Ax)_j \le b_j + x_0$, for all $j$. It is clear that the feasible set for this problem is non-empty: take, for instance, $x = 0$ and $x_0 = \max_j |b_j|$. This new linear programming problem therefore has a value (which could be $-\infty$ but not $+\infty$). If the value is non-positive, there exist feasible vectors for the constraint $Ax \le b$. Otherwise, if the value is positive, the feasible set of the original problem is empty.

Exercise 24. Let $A$ be an $m \times n$ matrix, with $m > n$. Consider the overdetermined system

$$Ax = b,$$

for $b \in \mathbb{R}^m$. In general, this equation has no solution. We would like to determine a vector $x \in \mathbb{R}^n$ which minimizes the maximum error

$$\sup_i |(Ax)_i - b_i|.$$

Rewrite this problem as a linear programming problem. Compare this problem with the least squares method, which consists in solving

$$\min_x \|Ax - b\|^2.$$


4.2. The dual problem. To problem (11), which we call primal, we associate another problem, called the dual, which consists in determining $y \in \mathbb{R}^m$ which solves

(12) $$\min_y y^T b, \quad y^T A \ge c, \quad y \ge 0.$$

As the next exercise shows, the dual problem can be motivated by the minimax principle:

Exercise 25. Show that (11) can be written as

(13) $$\max_{x \ge 0} \min_{y \ge 0}\ cx + y^T (b - Ax).$$

Suppose we can exchange the maximum with the minimum in (13). Relate the resulting problem with (12).

Example 9 (Interpretation of the dual of the diet problem). The dual of the diet problem (example 6) is the following:

$$\max y^T b, \quad y^T A \le c, \quad y \ge 0.$$

This problem admits the following interpretation. A competing company is willing to provide a nutritionally balanced diet, charging for each unit of the nutrient $i$ a price $y_i$. Obviously, the competing company would like to maximize its income. There are the following constraints: $y \ge 0$ and, furthermore, if the food item $j$ costs $c_j$, the competing company should charge an amount $(y^T A)_j$ no larger than $c_j$. This constraint is quite natural, since if it did not hold, at least part of the diet could be obtained by buying the food items $j$ such that $(y^T A)_j > c_j$.

    Exercise 26. Show that the dual of the dual is equivalent to the primal.

    Exercise 27. Determine the dual of the optimal transport problem and

    give a possible interpretation.


The next theorem concerns the relation between the primal and dual problems:

Theorem 21.

1. Weak duality: suppose $x$ and $y$ are feasible, respectively, for (11) and (12); then $cx \le y^T b$.
2. Optimality: furthermore, if $cx = y^T b$ then $x$ and $y$ are solutions of (11) and (12), respectively.
3. Strong duality: if (11) has a solution $x^*$, then (12) also has a solution $y^*$, and

$$c x^* = (y^*)^T b.$$

Finally, $y^*_j = 0$ for all indices $j$ such that $(A x^*)_j < b_j$.

Proof. To prove weak duality, observe that

$$cx \le (y^T A) x = y^T (Ax) \le y^T b.$$

The optimality criterion follows from the previous inequality.

To prove strong duality, we may assume that the inequality $Ax \le b$ also includes $x \ge 0$, for instance replacing $A$ by the augmented matrix

$$\tilde A = \begin{bmatrix} A \\ -I \end{bmatrix}$$

and the vector $b$ by

$$\tilde b = \begin{bmatrix} b \\ 0 \end{bmatrix}.$$

In this case it will be enough to prove that there exists a vector $\tilde y \in \mathbb{R}^{n+m}$ such that $\tilde y \ge 0$,

$$c = (\tilde y)^T \tilde A,$$

with $\tilde y_j = 0$ for all indices $j$ such that $(\tilde A x^*)_j < \tilde b_j$. In fact, if such a vector $\tilde y$ is given, we just set $y^*$ to be the first $m$ coordinates of $\tilde y$. Then $c \le (y^*)^T A$ and

$$c x^* = (\tilde y)^T \tilde A x^* = (\tilde y)^T \tilde b = (y^*)^T b,$$

since $\tilde b$ differs from $b$ by adding $n$ zero entries. From this point on we drop the tildes to simplify the notation.

First we state the following auxiliary result, whose proof is a simple corollary of Lemma 17:

Lemma 22. Let $A$ be an $m \times n$ matrix, $c$ a row vector in $\mathbb{R}^n$, and $J$ an arbitrary set of rows of $A$. Then we have one and only one of the following alternatives:

1. $c = y^T A$, for some $y \ge 0$ with $y_j = 0$ for all $j \notin J$;
2. there exists a column vector $w \in \mathbb{R}^n$ such that $(Aw)_j \le 0$ for all $j \in J$ and $cw > 0$.

Exercise 28. Use Lemma 17 to prove Lemma 22.

Let $x^*$ be a solution of (11). Let $J$ be the set of indices $j$ for which $(A x^*)_j = b_j$. We will show that there exists $y \ge 0$ such that $c = y^T A$ and $y_j = 0$ for $j \notin J$. By contradiction, assume that no such $y$ exists. By the previous lemma, there is $w$ such that $cw > 0$ and $(Aw)_j \le 0$ for $j \in J$. But then $x_\epsilon = x^* + \epsilon w$ is feasible for $\epsilon > 0$ sufficiently small, since

$$A x_\epsilon = A x^* + \epsilon A w \le b.$$

However,

$$c x_\epsilon = c(x^* + \epsilon w) > c x^*,$$

which contradicts the optimality of $x^*$. Therefore, for some $y \ge 0$ with $y_j = 0$ for $j \notin J$,

$$c x^* = y^T A x^* = y^T b.$$

Consequently, by the second part of the theorem, we conclude that $y$ is optimal. □


Lemma 23. Let $x$ and $y$ be, respectively, feasible for the primal and dual problems. Define

$$s = b - Ax \ge 0, \qquad e = A^T y - c^T \ge 0.$$

Then

$$s^T y + x^T e = b^T y - x^T c^T \ge 0.$$

Proof. Since $x, y \ge 0$ we have

$$s^T y = b^T y - x^T A^T y \ge 0, \qquad x^T e = x^T A^T y - x^T c^T \ge 0.$$

By adding these two expressions, we obtain

$$s^T y + x^T e = b^T y - x^T c^T \ge 0. \qquad \square$$

Theorem 24 (Complementarity). Suppose $x$ and $y$ are solutions of (11) and (12), respectively. Then

$$s^T y = 0 \quad \text{and} \quad x^T e = 0.$$

Proof. We have $s^T y, x^T e \ge 0$. If $x$ and $y$ are optimal then $cx = y^T b$. By the previous lemma,

$$s^T y + x^T e = 0,$$

which implies the theorem. □
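The following numerical sketch, on an invented instance of (11) and (12), checks the conclusions of theorems 21 and 24:

    import numpy as np
    from scipy.optimize import linprog

    c = np.array([1.0, 2.0])
    A = np.array([[1.0, 1.0],
                  [1.0, 0.0]])
    b = np.array([4.0, 2.0])

    # Primal: max cx s.t. Ax <= b, x >= 0 (linprog minimizes, hence -c).
    p = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)
    # Dual: min y^T b s.t. y^T A >= c, y >= 0, i.e. -A^T y <= -c^T.
    d = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 2)

    print(-p.fun, d.fun)       # equal optimal values (strong duality)
    s = b - A @ p.x            # primal slack
    e = A.T @ d.x - c          # dual slack
    print(s @ d.x, p.x @ e)    # s^T y = 0 and x^T e = 0 (complementarity)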

Exercise 29. Study the following problem in $\mathbb{R}^2$:

$$\max x_1 + 2 x_2,$$

with $x_1, x_2 \ge 0$, $x_1 + x_2 \le 1$, and $2 x_1 + x_2 \le 3/2$. Determine the dual problem and its solution, and show that it has the same value as the primal problem.

Exercise 30. Let $x^*$ be a solution of the problem

$$\min cx$$

under the constraints $Ax \ge b$, $x \ge 0$, and let $y^*$ be a solution of the dual. Use complementarity to show that $x^*$ minimizes

$$cx - (y^*)^T A x$$

under the constraint $x \ge 0$.

Exercise 31. Solve by elementary methods the problem

$$\max x_1 + x_2$$

under the constraints $3 x_1 + 4 x_2 \le 12$, $5 x_1 + 2 x_2 \le 10$.

Exercise 32. Consider the problem

$$\min 7 x_1 + 9 x_2 + 16 x_3,$$

under the constraints $x \ge 0$, $2 \le x_1 + 2 x_2 + 9 x_3 \le 7$. Obtain an upper and a lower bound for the value of the minimum.

Exercise 33. Show that the solution set of a linear programming problem is a convex set.

Exercise 34. Consider a linear programming problem in $\mathbb{R}^n$:

$$\min cx,$$

under the constraints $Ax \ge b$, $x \ge 0$. Suppose $c = c_0 + \epsilon c_1$. Suppose that for $\epsilon > 0$ there exists a minimizer $x_\epsilon$ which converges to a point $x_0$ as $\epsilon \to 0$. Show that $x_0$ is a minimizer of $c_0 x$ under $Ax \ge b$, $x \ge 0$. Show, furthermore, that if this limit problem has more than one minimizer then $x_0$ minimizes $c_1 x$ among the minimizers of the limit problem.

    5. Non-linear optimization with constraints

Let $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$ be $C^1$ functions. We consider the following non-linear optimization problem:

(14) $$\max_x f(x), \quad g(x) \le 0, \quad x \ge 0.$$


We denote the feasible set by $X$:

$$X = \{x \in \mathbb{R}^n : x \ge 0,\ g(x) \le 0\},$$

and the solution set by $S$:

$$S = \{x \in X : f(x) = \sup_X f\}.$$

In this section we derive necessary conditions, called the Karush-Kuhn-Tucker (KKT) conditions, for a point to be a solution of the problem. We start by explaining these conditions, which generalize both the Lagrange multipliers for equality constraints and the optimality conditions from linear programming. We then show that under convexity hypotheses these conditions are in fact sufficient. After that, we show that under a condition called constraint qualification the KKT conditions are indeed necessary optimality conditions. We end the discussion with several criteria that make it possible to check the constraint qualification condition in practice.

5.1. KKT conditions. For $y \in \mathbb{R}^m$ and $\lambda \in \mathbb{R}^n$ define the Lagrangian

$$L(x, y, \lambda) = f(x) - y^T g(x) + \lambda^T x.$$

For $(x, \lambda, y) \in \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^m$ the KKT conditions are the following:

(15) $$\frac{\partial L}{\partial x_i} = 0, \qquad g(x) \le 0, \quad y^T g(x) = 0, \qquad x \ge 0, \quad \lambda^T x = 0, \quad \lambda, y \ge 0.$$

The variables $y$ and $\lambda$ are called the Lagrange multipliers.
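The following sketch (objective and constraint invented for the illustration) solves a small instance of (14) numerically and recovers the multipliers from the stationarity equation in (15), checking complementarity:

    import numpy as np
    from scipy.optimize import minimize

    # max f(x) = -(x1-1)^2 - (x2-2)^2  s.t.  g(x) = x1 + x2 - 1 <= 0, x >= 0;
    # the solution is x = (0, 1) with multipliers y = 2 and lambda = (0, 0).
    f  = lambda x: -(x[0] - 1)**2 - (x[1] - 2)**2
    Df = lambda x: np.array([-2*(x[0] - 1), -2*(x[1] - 2)])
    g  = lambda x: x[0] + x[1] - 1
    Dg = np.array([1.0, 1.0])

    res = minimize(lambda x: -f(x), x0=np.zeros(2),
                   constraints=({'type': 'ineq', 'fun': lambda x: -g(x)},),
                   bounds=[(0, None)] * 2)
    x = res.x                          # approximately (0, 1)

    # Stationarity Df - y Dg + lambda = 0; x2 > 0 forces lambda_2 = 0.
    y = Df(x)[1] / Dg[1]
    lam = y * Dg - Df(x)
    print(x, y, lam)
    print(y * g(x), lam @ x)           # complementarity: both near 0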

Several variations of the KKT conditions arise in different problems. For instance, in the case in which there are no positivity constraints on the variable $x$, the KKT conditions take the following form: for $(x, y) \in \mathbb{R}^n \times \mathbb{R}^m$ and $L(x, y) = f(x) - y^T g(x)$,

(16) $$\frac{\partial L}{\partial x_i} = 0, \qquad g(x) \le 0, \quad y^T g(x) = 0, \qquad y \ge 0.$$

Exercise 35. Derive (16) from (15) by writing $x = x^+ - x^-$, where $x^+, x^- \ge 0$.

Another example is equality constraints $g(x) = 0$, again without positivity constraints on the variable $x$. We can write the equality constraint as $g(x) \le 0$ and $-g(x) \le 0$. Let $y^\pm$ be the multipliers corresponding to $\pm g(x) \le 0$, and define $y = y^+ - y^-$. Then (16) can be written as

$$\frac{\partial f}{\partial x_i} = \sum_{j=1}^m y_j \frac{\partial g_j}{\partial x_i}, \qquad g(x) = 0,$$

that is, $y$ is the Lagrange multiplier for the equality constraint $g(x) = 0$.

Consider a linear programming problem where in (14) we set

$$f(x) = cx, \qquad g(x) = Ax - b.$$

Then the KKT conditions are

$$c - y^T A = -\lambda^T, \qquad Ax \le b, \quad y^T (Ax - b) = 0, \qquad x \ge 0, \quad \lambda^T x = 0, \quad \lambda, y \ge 0.$$

In this case, the first of these conditions can be rewritten as

$$c - y^T A \le 0,$$

that is, since $y \ge 0$, $y$ is admissible for the dual problem. Using the condition $\lambda^T x = 0$ we conclude that

$$cx = y^T A x.$$


Then the second line of the KKT conditions yields $y^T A x = y^T b$, which implies

$$cx = y^T b,$$

the optimality criterion for the linear programming problem; this shows that a solution of the KKT conditions is in fact a solution of (14). Furthermore, it also shows that $y$ is a solution to the dual problem.

Example 10. Let $Q$ be an $n \times n$ real symmetric matrix. Consider the quadratic programming problem

(17) $$\max_x \frac{1}{2} x^T Q x, \quad Ax \le b, \quad x \ge 0.$$

The KKT conditions are

(18) $$x^T Q - y^T A = -\lambda^T, \qquad Ax \le b, \quad y^T (Ax - b) = 0, \qquad x \ge 0, \quad \lambda^T x = 0, \quad \lambda, y \ge 0.$$

5.2. Duality and sufficiency of the KKT conditions. We can write problem (14) in the following minimax form:

$$\sup_{x \ge 0} \inf_{y \ge 0}\ f(x) - y^T g(x).$$

We define the dual problem as

(19) $$\inf_{y \ge 0} \sup_{x \ge 0}\ f(x) - y^T g(x).$$

Let

$$h^*(y) = \sup_{x \ge 0}\ f(x) - y^T g(x),$$

and

$$h(x) = \inf_{y \ge 0}\ f(x) - y^T g(x).$$


Then (14) is equivalent to

$$\sup_{x \ge 0} h(x),$$

and (19) is equivalent to the problem

$$\inf_{y \ge 0} h^*(y).$$

From exercise 22, we have the duality inequality

$$\sup_{x \ge 0} h(x) = \sup_{x \ge 0} \inf_{y \ge 0}\ f(x) - y^T g(x) \le \inf_{y \ge 0} \sup_{x \ge 0}\ f(x) - y^T g(x) = \inf_{y \ge 0} h^*(y).$$

Furthermore, if $x \ge 0$ and $y \ge 0$ satisfy

$$h(x) = h^*(y),$$

then $x$ and $y$ are, respectively, solutions to (14) and (19).

If we choose

$$f(x) = cx, \qquad g(x) = Ax - b,$$

(14) is a linear programming problem. Then

$$h(x) = \begin{cases} cx & \text{if } Ax - b \le 0 \\ -\infty & \text{otherwise}, \end{cases}$$

and

$$h^*(y) = \begin{cases} b^T y & \text{if } A^T y - c^T \ge 0 \\ +\infty & \text{otherwise}. \end{cases}$$

Consider the quadratic programming problem

(20) $$\max \frac{1}{2} x^T Q x, \quad Ax - b \le 0.$$

Note that here the variable $x$ does not have any sign constraint. In this case we define

$$h(x) = \inf_{y \ge 0} \frac{1}{2} x^T Q x - y^T (Ax - b) = \begin{cases} \frac{1}{2} x^T Q x & \text{if } Ax - b \le 0 \\ -\infty & \text{otherwise}, \end{cases}$$


and

$$h^*(y) = \sup_x \frac{1}{2} x^T Q x - y^T (Ax - b).$$

If we assume that $Q$ is non-singular and negative definite, we have

$$h^*(y) = -\frac{1}{2} y^T A Q^{-1} A^T y + y^T b.$$

It is easy to check directly that $h(x) \le h^*(y)$.
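A quick numerical check of this inequality on invented data ($Q$ negative definite, as assumed above):

    import numpy as np

    Q = np.array([[-2.0, 0.0],
                  [ 0.0, -1.0]])       # negative definite
    A = np.array([[1.0, 1.0]])
    b = np.array([1.0])

    h     = lambda x: 0.5 * x @ Q @ x if np.all(A @ x <= b) else -np.inf
    hstar = lambda y: -0.5 * y @ A @ np.linalg.inv(Q) @ A.T @ y + y @ b

    rng = np.random.default_rng(0)
    for _ in range(5):
        x = rng.normal(size=2)
        y = rng.uniform(size=1)        # y >= 0
        assert h(x) <= hstar(y)
    print("h(x) <= h*(y) on all samples")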

It turns out that the KKT conditions are in fact sufficient if $f$ and $g$ satisfy additional convexity conditions.

Proposition 25. Suppose that $-f$ and each component of $g$ are convex. Let $(x, \lambda, y) \in \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^m$ be a solution of the KKT conditions (15). Then $x$ is a solution of (14).

Proof. Let $\tilde x \in X$. By the concavity of $f$ we have

$$f(\tilde x) - f(x) \le Df(x)(\tilde x - x).$$

By the KKT conditions (15),

$$Df(x)(\tilde x - x) = y^T Dg(x)(\tilde x - x) - \lambda^T (\tilde x - x).$$

Since each component of $g$ is convex and $y \ge 0$,

$$y^T Dg(x)(\tilde x - x) \le y^T (g(\tilde x) - g(x)).$$

Since $y^T g(x) = 0$, $y^T g(\tilde x) \le 0$, $\lambda^T \tilde x \ge 0$, and $\lambda^T x = 0$, we have

$$f(\tilde x) - f(x) \le 0,$$

that is, $x$ is a solution. □

As the next proposition shows, the KKT conditions imply strong duality.

Proposition 26. Suppose that $-f$ and each component of $g$ are convex. Let $(x, \lambda, y) \in \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^m$ be a solution of the KKT conditions (15). Then

$$h(x) = h^*(y).$$


Proof. Observe that, by the previous proposition, any solution to

$$Df(x) - y^T Dg(x) + \lambda^T = 0,$$

with $\lambda \ge 0$, $\lambda^T x = 0$, is a maximizer of the function

$$f(x) - y^T g(x)$$

under the constraint $x \ge 0$. Therefore

$$h^*(y) = f(x) - y^T g(x) = f(x),$$

since $y^T g(x) = 0$. Furthermore,

$$h(x) = f(x) + \inf_{y \ge 0} (-y^T g(x)) = f(x),$$

because $g(x) \le 0$. Thus

$$h(x) = h^*(y). \qquad \square$$

5.3. Constraint qualification and KKT conditions. Consider the constraints

(21) $$g(x) \le 0, \quad x \ge 0.$$

Let $X$ denote the admissible set for (21). For $x \in X$ define the active coordinate indices as $I(x) = \{i : x_i = 0\}$ and the active constraint indices as $J(x) = \{j : g_j(x) = 0\}$. For $x \in X$ define the tangent cone to the admissible set $X$ at the point $x$ as the set $T(x)$ of vectors $v \in \mathbb{R}^n$ which satisfy

$$v_i \ge 0, \qquad v \cdot Dg_j(x) \le 0,$$

for all $i \in I(x)$ and all $j \in J(x)$. We say that the constraints satisfy the constraint qualification condition if for any $x \in X$ and any $v \in T(x)$ there exists a $C^1$ curve $x(t)$, with $x(0) = x$ and $\dot x(0) = v$, such that $x(t) \in X$ for all $t \ge 0$ sufficiently small.

Proposition 27. Let $x$ be a solution of (14), and assume that the constraint qualification condition holds. Then there exist $\lambda \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$ such that (15) holds.


Proof. Fix $v \in T(x)$ and let $x(t)$ be a curve as in the constraint qualification condition. Because $x$ is a maximizer,

(22) $$0 \ge \frac{d}{dt} f(x(t)) \Big|_{t=0} = v \cdot Df(x).$$

From Farkas' lemma (Lemma 17) we know that either there is $v \in T(x)$ such that $v \cdot Df > 0$, or else the vector $Df$ belongs to the cone generated by $-e_i$, for $i \in I$, and $Dg_j(x)$, for $j \in J$. By (22) we know that the first alternative does not hold; hence there exist a vector $\lambda \in \mathbb{R}^n$, with $\lambda_i \ge 0$ for $i \in I$ and $\lambda_i = 0$ for $i \in I^c$, and $y \in \mathbb{R}^m$, with $y_j \ge 0$ for $j \in J$ and $y_j = 0$ for $j \in J^c$, such that

$$Df = y^T Dg - \lambda^T.$$

By the construction of $y$ and $\lambda$, as well as the definition of $I$ and $J$, it is clear that $\lambda^T x = 0$ as well as $y^T g(x) = 0$. □

To give an interpretation of the Lagrange multipliers in the KKT conditions, consider the family of problems

(23) $$\max_x f(x), \quad g^\alpha(x) \le 0,$$

where $\alpha \in \mathbb{R}^m$ and

$$g^\alpha(x) = g(x) - \alpha.$$

We will assume that for all $\alpha$ the constraint qualification condition holds. Furthermore, assume that there exists a unique solution $x^\alpha$, which is a differentiable function of $\alpha$. Define the value function

$$V(\alpha) = f(x^\alpha).$$

Let $y^\alpha \in \mathbb{R}^m$ be the corresponding Lagrange multipliers, which we assume to be also differentiable.

We claim that, for any $\alpha_0 \in \mathbb{R}^m$,

(24) $$\frac{\partial V(\alpha_0)}{\partial \alpha_j} = y^{\alpha_0}_j.$$


To prove this identity, observe first that, using the KKT conditions,

$$\frac{\partial V(\alpha)}{\partial \alpha_j} = \sum_k \frac{\partial f(x^\alpha)}{\partial x_k} \frac{\partial x^\alpha_k}{\partial \alpha_j} = \sum_{k,i} y^\alpha_i \frac{\partial g^\alpha_i(x^\alpha)}{\partial x_k} \frac{\partial x^\alpha_k}{\partial \alpha_j}.$$

By differentiating the complementarity condition $\sum_k y^\alpha_k g^\alpha_k(x^\alpha) = 0$ with respect to $\alpha_j$ we obtain

(25) $$0 = \sum_k \left[ \frac{\partial y^\alpha_k}{\partial \alpha_j} g^\alpha_k(x^\alpha) + y^\alpha_k \sum_i \frac{\partial g^\alpha_k(x^\alpha)}{\partial x_i} \frac{\partial x^\alpha_i}{\partial \alpha_j} \right] - y^\alpha_j.$$

For $\alpha = \alpha_0$ we either have $g^{\alpha_0}_k(x^{\alpha_0}) = 0$, or $g^{\alpha_0}_k(x^{\alpha_0}) < 0$, in which case $y^\alpha_k$ vanishes in a neighborhood of $\alpha_0$. Consequently, in this last case we have $\frac{\partial y^{\alpha_0}_k}{\partial \alpha_j} = 0$. Therefore

$$\frac{\partial y^{\alpha_0}_k}{\partial \alpha_j} g^{\alpha_0}_k(x^{\alpha_0}) = 0.$$

So, from (25), we conclude that

$$y^{\alpha_0}_j = \sum_{k,i} y^{\alpha_0}_k \frac{\partial g^{\alpha_0}_k(x^{\alpha_0})}{\partial x_i} \frac{\partial x^{\alpha_0}_i}{\partial \alpha_j}.$$

Thus we obtain (24).

5.4. Checking the constraint qualification conditions. Consider the following optimization problem:

(26) $$\max_x x_1, \quad x_2 - (1 - x_1)^3 \le 0, \quad x \ge 0.$$

The Lagrangian is

$$L(x, y, \lambda) = x_1 - y \left( x_2 - (1 - x_1)^3 \right) + \lambda_1 x_1 + \lambda_2 x_2,$$

and so

$$\frac{\partial L(x, y, \lambda)}{\partial x_1} = 1 - 3(1 - x_1)^2 y + \lambda_1.$$

In particular, when $x_1 = 1$, the equation

$$1 + \lambda_1 = 0$$

does not have a solution with $\lambda_1 \ge 0$. Hence the KKT conditions are not satisfied. Nevertheless, the point $(x_1, x_2) = (1, 0)$ is a solution.

This example illustrates the need for simple criteria to check whether the constraint qualification conditions hold. We will show that the following are sufficient conditions for the verification of the constraint qualification:

1. the Mangasarian-Fromowitz condition: for any $x \in X$ there is $v$ such that $Dg_i(x) \cdot v < 0$ for every active constraint $i$;
2. the Cotte-Dragominescu condition: for any $x \in X$ the gradients of the active constraints are positively linearly independent, that is, $\sum_i y_i Dg_i(x) = 0$ with $y \ge 0$ implies $y = 0$;
3. the Arrow-Hurwicz and Uzawa condition: for any $x \in X$ the gradients of the active constraints are linearly independent.

It is obvious that 3. implies 2. We will show that 1. is equivalent to 2. To do so we need the following lemma:

Proposition 28 (Gordon alternative). Let $A$ be a real-valued $m \times n$ matrix. Then one and only one of the following holds:

• There exists $x \in \mathbb{R}^n$ such that $Ax < 0$;
• There exists $y \in \mathbb{R}^m$, $y \geq 0$ and $y \neq 0$, such that $y^T A = 0$.

Proof. (i) It is clear that the two conditions are mutually exclusive. Otherwise, if $Ax < 0$ and $y^T A = 0$ with $y \geq 0$, $y \neq 0$, we would have $0 = (y^T A)x = y^T(Ax) < 0$, which is a contradiction.

(ii) We consider the following optimization problem:

(27) $\max_y\; y_1 + \cdots + y_m$ subject to $y^T A = 0$, $y \geq 0$.


It is clear that if the second alternative holds then the value of this problem is $+\infty$. Otherwise, $y = 0$ is a solution and the value is $0$. In this case the dual problem

(28) $\min_x\; 0$ subject to $(Ax)_i \leq -1$, $i = 1, \ldots, m,$

has a solution, i.e., there is a point $x$ satisfying the constraints. Hence, the first alternative holds. □
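The proof suggests a practical test. The sketch below (ours; it uses scipy's linprog and is merely illustrative) decides which alternative holds by solving the feasibility problem (28):

```python
import numpy as np
from scipy.optimize import linprog

def gordon_alternative(A):
    """Return ('first', x) with Ax < 0, or ('second', None) if no such x exists."""
    m, n = A.shape
    # feasibility version of (28): min 0 subject to (Ax)_i <= -1
    res = linprog(c=np.zeros(n), A_ub=A, b_ub=-np.ones(m),
                  bounds=[(None, None)] * n, method='highs')
    if res.success:
        return 'first', res.x
    return 'second', None     # a certificate y >= 0, y != 0, y^T A = 0 exists

print(gordon_alternative(np.array([[1.0, 0.0], [0.0, 1.0]])))  # first: x = (-1, -1)
print(gordon_alternative(np.array([[1.0], [-1.0]])))           # second: y = (1, 1)
```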

    Proposition 29. The Cotte-Dragominescu condition is equivalent to

    the Mangasarian-Fromowitz condition.

Proof. Set $A = Dg$, the matrix whose rows are the gradients of the active constraints. The Mangasarian-Fromowitz condition corresponds to the first case in the Gordon alternative. Therefore, the only solution of $\sum_i y_i\, Dg_i = 0$, $y \geq 0$, is $y = 0$. Thus the Cotte-Dragominescu condition is satisfied. Conversely, if the only solution to $\sum_i y_i\, Dg_i = 0$, $y \geq 0$, is $y = 0$, then the second case of the Gordon alternative does not hold. Then the first alternative holds, and so the Mangasarian-Fromowitz condition is satisfied. □

    Theorem 30. If the Mangasarian-Fromowitz condition holds then the

    constraint qualification condition is satisfied.

Proof. Let $x_0 \in X$. Take $w$ such that $Dg_i(x_0)\, w \leq 0$ for all active constraints $i$. We must construct a curve $x(\epsilon)$ in such a way that $x(\epsilon) \in X$ for $\epsilon$ sufficiently small and such that $\dot x(0) = w$. Let $v$ be a vector as in the Mangasarian-Fromowitz condition. Take $M$ sufficiently large and define

$x(\epsilon) = x_0 + \epsilon w + M\epsilon^2 v.$

Then, using Taylor series, we have

$g_i(x(\epsilon)) = g_i(x_0) + \epsilon\, Dg_i(x_0)\, w + M\epsilon^2\, Dg_i(x_0)\, v + \frac{\epsilon^2}{2}\, w^T D^2 g_i(x_0)\, w + O(\epsilon^3).$

Thus, if $M$ is large enough and $\epsilon > 0$ sufficiently small, $g_i(x(\epsilon)) < 0$. □
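The curve used in the proof is easy to check numerically. A minimal sketch (our own toy instance, with $X$ the unit disc) confirms that $x(\epsilon) = x_0 + \epsilon w + M\epsilon^2 v$ stays feasible:

```python
import numpy as np

# Toy instance (ours): X = {g <= 0} with g(x) = |x|^2 - 1, x0 = (1, 0) on
# the boundary. Dg(x0) = (2, 0); w = (0, 1) is tangent (Dg(x0) w = 0) and
# v = (-1, 0) satisfies Dg(x0) v < 0 (Mangasarian-Fromowitz direction).
g = lambda x: x @ x - 1.0
x0 = np.array([1.0, 0.0])
w, v, M = np.array([0.0, 1.0]), np.array([-1.0, 0.0]), 2.0

for eps in [0.1, 0.05, 0.01]:
    print(eps, g(x0 + eps * w + M * eps**2 * v))   # negative: x(eps) lies in X
```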

    Theorem 31. If either the Cotte-Dragominescu condition or the Arrow-

    Hurwicz and Uzawa condition hold then so does the constraint qualifi-

    cation condition.


    6. Bibliographical notes

In what concerns linear programming problems, we have used the books [GSS08] and [Fra02]...


    2

    Calculus of variations in one independent variable

This chapter is dedicated to a classical subject in the calculus of variations: variational problems with one independent variable. These are extremely important because of their applications to classical mechanics and Riemannian geometry. Furthermore, they serve as a model for optimal control problems and for problems with multiple integrals. We start in section 1 by deriving the Euler-Lagrange equation and giving some elementary applications. Then, in section 2, we study additional necessary conditions for minimizers, and in section 3 we discuss several applications to Riemannian geometry and classical mechanics.

An introduction to the Hamiltonian formalism is given in section 4. The next topic, in section 5, is the study of sufficient conditions for a trajectory to be a minimizer: first we establish the existence of local minimizers, then we study the connections between smooth solutions of Hamilton-Jacobi equations and global minimizers, and finally we discuss the Jacobi equation, conjugate points and curvature.

Symmetries are an important topic in the calculus of variations. In section 6 we present Routh's method for the integration of Lagrangian systems and Noether's theorem.

Of course, not every solution to the Euler-Lagrange equation is a minimizer. Section 7 is a brief introduction to minimax methods and to the mountain pass theorem. We also consider several examples of non-existence of minimizing orbits (Lavrentiev phenomenon) and relaxation methods (Young measures) in section 9.


    Invariant measures for Lagrangian and Hamiltonian systems are

    considered in section 8.

The next part of this chapter is dedicated to the study of the geometry of Hamiltonian systems: symplectic and Poisson structures, the Darboux theorem and Arnold-Liouville integrability, in section 10.

In the last section, section 11, we consider perturbation problems and describe the Lindstedt series perturbation procedure.

    We end the chapter with bibliographical notes.

    1. Euler-Lagrange Equations

In classical mechanics, the trajectories $x : [0, T] \to \mathbb{R}^n$ of a mechanical system are determined by a variational principle called the

    minimal action principle. This principle asserts that the trajectories

    are minimizers (or at least critical points) of an integral functional. In

    this section we study this problem and discuss several examples.

Consider a mechanical system on $\mathbb{R}^n$ with kinetic energy $K(x, v)$ and potential energy $U(x, v)$. We define the Lagrangian $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ to be the difference between the kinetic energy $K$ and the potential energy $U$ of the system, that is, $L = K - U$. The variational formulation of classical mechanics asserts that the trajectories of this mechanical system minimize (or at least are critical points of) the action functional

$S[x] = \int_0^T L(x(t), \dot x(t))\, dt,$

under fixed boundary conditions. More precisely, a $C^1$ trajectory $x : [0, T] \to \mathbb{R}^n$ is a minimizer of $S$ under fixed boundary conditions if for any $C^1$ trajectory $y : [0, T] \to \mathbb{R}^n$ such that $x(0) = y(0)$ and $x(T) = y(T)$ we have

$S[x] \leq S[y].$


In particular, for any $C^1$ function $\varphi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$, and any $\epsilon \in \mathbb{R}$, we have

$i(\epsilon) = S[x + \epsilon\varphi] \geq S[x] = i(0).$

Thus $i(\epsilon)$ has a minimum at $\epsilon = 0$. So, if $i$ is differentiable, $i'(0) = 0$. A trajectory $x$ is a critical point of $S$ if for any $C^1$ function $\varphi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$ we have

$i'(0) = \dfrac{d}{d\epsilon} S[x + \epsilon\varphi]\Big|_{\epsilon = 0} = 0.$

    The critical points of the action which are of class C2 are solutions

    to an ordinary differential equation, the Euler-Lagrange equation, that

    we derive in what follows. Any minimizer of the action functional

    satisfies further necessary conditions which will be discussed in section

    2.

Theorem 32 (Euler-Lagrange equation). Let $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function. Suppose that $x : [0, T] \to \mathbb{R}^n$ is a $C^2$ critical point of the action $S$ under fixed boundary conditions $x(0)$ and $x(T)$. Then

(29) $\dfrac{d}{dt} D_v L(x, \dot x) - D_x L(x, \dot x) = 0.$

Proof. Let $x$ be as in the statement. Then for any $\varphi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$, the function

$i(\epsilon) = S[x + \epsilon\varphi]$

has a critical point at $\epsilon = 0$. Thus

$i'(0) = 0,$

that is,

$\int_0^T D_x L(x, \dot x) \cdot \varphi + D_v L(x, \dot x) \cdot \dot\varphi = 0.$

Integrating by parts, we conclude that

$\int_0^T \left[\dfrac{d}{dt} D_v L(x, \dot x) - D_x L(x, \dot x)\right] \cdot \varphi = 0,$


for all $\varphi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$. By the fundamental lemma of the calculus of variations, this implies (29) and ends the proof of the theorem. □

Example 11. In classical mechanics, the kinetic energy $K$ of a particle with mass $m$ and trajectory $x(t)$ is

$K = \dfrac{m|\dot x|^2}{2}.$

Suppose that the potential energy $U(x)$ depends only on the position $x$. Assume also that $U$ is smooth. The Lagrangian for this mechanical system is then

$L = K - U,$

and the corresponding Euler-Lagrange equation is

$m\ddot x = -DU(x),$

which is Newton's law.
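Derivations such as this one can be checked symbolically. A minimal sketch (ours, using sympy) recovers Newton's law from the Lagrangian of Example 11 in one dimension:

```python
import sympy as sp

t, m = sp.symbols('t m', positive=True)
x = sp.Function('x')
U = sp.Function('U')
L = m * sp.diff(x(t), t)**2 / 2 - U(x(t))   # L = K - U in one dimension

p = sp.diff(L, sp.diff(x(t), t))            # D_v L, the momentum
el = sp.diff(p, t) - sp.diff(L, x(t))       # left-hand side of (29)
print(sp.simplify(el))                      # m*x''(t) + U'(x(t)) = 0 is (29)
```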

Exercise 36. Let $P \in \mathbb{R}^n$, and consider the Lagrangian $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ defined by $L(x, v) = g(x)|v|^2 + P \cdot v - U(x)$, where $g$ and $U$ are $C^2$ functions. Determine the Euler-Lagrange equation and show that it does not depend on $P$.

Exercise 37. Suppose we form a surface of revolution by connecting a point $(x_0, y_0)$ with a point $(x_1, y_1)$ by a curve $(x, y(x))$, $x \in [x_0, x_1]$, and then revolving it around the $y$ axis. The area of this surface is

$\int_{x_0}^{x_1} x\sqrt{1 + (y')^2}\, dx.$

Compute the Euler-Lagrange equation and study its solutions.

To understand the behavior of the Euler-Lagrange equation it is sometimes useful to change coordinates. The following proposition shows how this is achieved:

Proposition 33. Let $x : [0, T] \to \mathbb{R}^n$ be a critical point of the action

$\int_0^T L(x, \dot x)\, dt.$


Let $g : \mathbb{R}^n \to \mathbb{R}^n$ be a $C^2$ diffeomorphism and $\tilde L$ be given by

$\tilde L(y, w) = L(g(y), Dg(y)\, w).$

Then $y = g^{-1} \circ x$ is a critical point of

$\int_0^T \tilde L(y, \dot y)\, dt.$

    Proof. This is a simple computation and is left as an exercise to

    the reader.

    Before proceeding, we will discuss some applications of variational

    methods to classical mechanics. As mentioned before, the trajectories

    of a mechanical system with kinetic energy K and potential energy

    U are critical points of the action corresponding to the Lagrangian

$L = K - U$. In the following examples we use this variational principle to study the motion of a particle in a central field, and the planar two-body problem.

Example 12 (Central field motion). Consider the Lagrangian of a particle in the plane subjected to a radial potential field,

$L(x, y, \dot x, \dot y) = \dfrac{\dot x^2 + \dot y^2}{2} - U\left(\sqrt{x^2 + y^2}\right).$

Consider polar coordinates $(r, \theta)$, that is, $(x, y) = (r\cos\theta, r\sin\theta) = g(r, \theta)$. We can change coordinates (see proposition 33) and obtain the Lagrangian in these new coordinates:

$\tilde L(r, \theta, \dot r, \dot\theta) = \dfrac{\dot r^2 + r^2\dot\theta^2}{2} - U(r).$

Then the Euler-Lagrange equations can be written as

$\dfrac{d}{dt}\left(r^2\dot\theta\right) = 0, \qquad \dfrac{d}{dt}\dot r = -U'(r) + r\dot\theta^2.$

The first equation implies that $r^2\dot\theta$ is conserved. Therefore, setting $\omega = r^2\dot\theta$, we have $r\dot\theta^2 = \frac{\omega^2}{r^3}$. Multiplying the second equation by $\dot r$ we get

$\dfrac{d}{dt}\left[\dfrac{\dot r^2}{2} + U(r) + \dfrac{\omega^2}{2r^2}\right] = 0.$


Consequently,

$E = \dfrac{\dot r^2}{2} + U(r) + \dfrac{\omega^2}{2r^2}$

is a conserved quantity. Thus, we can solve for $\dot r$ as a function of $r$ (given the values of the conserved quantities $E$ and $\omega$) and so obtain a first-order differential equation for the trajectories.
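The conservation of $r^2\dot\theta$ can also be verified symbolically. The following sketch (our own, using sympy) computes the Euler-Lagrange expression for $\theta$ and shows it is exactly $\frac{d}{dt}(r^2\dot\theta)$:

```python
import sympy as sp

t = sp.symbols('t')
r, th = sp.Function('r'), sp.Function('theta')
U = sp.Function('U')
L = (sp.diff(r(t), t)**2 + r(t)**2 * sp.diff(th(t), t)**2) / 2 - U(r(t))

# Euler-Lagrange equation for theta; L does not depend on theta itself.
el_theta = sp.diff(sp.diff(L, sp.diff(th(t), t)), t) - sp.diff(L, th(t))
print(sp.expand(el_theta))   # r^2 theta'' + 2 r r' theta' = d/dt (r^2 theta')
```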

Example 13 (Planar two-body problem). Consider now the problem of two point bodies in the plane, with trajectories $(x_1, y_1)$ and $(x_2, y_2)$. Suppose that the interaction potential energy $U$ depends only on the distance $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$ between them. We will show how to reduce this problem to the one of a single body under a radial field. The Lagrangian of this system is

$L = m_1\dfrac{\dot x_1^2 + \dot y_1^2}{2} + m_2\dfrac{\dot x_2^2 + \dot y_2^2}{2} - U\left(\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}\right).$

Consider new coordinates $(X, Y, x, y)$, where $(X, Y)$ is the center of mass,

$X = \dfrac{m_1 x_1 + m_2 x_2}{m_1 + m_2}, \qquad Y = \dfrac{m_1 y_1 + m_2 y_2}{m_1 + m_2},$

and $(x, y)$ is the relative position of the two bodies,

$x = x_1 - x_2, \qquad y = y_1 - y_2.$

In these new coordinates the Lagrangian, using proposition 33, is

$\tilde L = L_1(\dot X, \dot Y) + L_2(x, y, \dot x, \dot y).$

Therefore, the equations for the variables $X$ and $Y$ are decoupled from the ones for $x, y$. Elementary computations show that

$\dfrac{d^2}{dt^2}X = \dfrac{d^2}{dt^2}Y = 0.$

Thus $X(t) = X_0 + V_X t$ and $Y(t) = Y_0 + V_Y t$, for suitable constants $X_0$, $Y_0$, $V_X$ and $V_Y$.

Since

$L_2 = \dfrac{m_1 m_2}{m_1 + m_2}\,\dfrac{\dot x^2 + \dot y^2}{2} - U\left(\sqrt{x^2 + y^2}\right),$

the problem is now reduced to the previous example.
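The decoupling claimed above is a short symbolic computation. A sketch (ours, using sympy) substitutes the center-of-mass and relative coordinates into the two-body Lagrangian and checks that it splits with reduced mass $\frac{m_1 m_2}{m_1 + m_2}$:

```python
import sympy as sp

t = sp.symbols('t')
m1, m2 = sp.symbols('m1 m2', positive=True)
X, Y, x, y = (sp.Function(s) for s in 'XYxy')
U = sp.Function('U')
d = lambda f: sp.diff(f, t)

# original positions expressed through (X, Y) and (x, y)
x1 = X(t) + m2 / (m1 + m2) * x(t); y1 = Y(t) + m2 / (m1 + m2) * y(t)
x2 = X(t) - m1 / (m1 + m2) * x(t); y2 = Y(t) - m1 / (m1 + m2) * y(t)

L = m1 * (d(x1)**2 + d(y1)**2) / 2 + m2 * (d(x2)**2 + d(y2)**2) / 2 \
    - U(sp.sqrt(x(t)**2 + y(t)**2))
mu = m1 * m2 / (m1 + m2)                       # reduced mass
L_split = (m1 + m2) * (d(X(t))**2 + d(Y(t))**2) / 2 \
    + mu * (d(x(t))**2 + d(y(t))**2) / 2 - U(sp.sqrt(x(t)**2 + y(t)**2))
print(sp.simplify(L - L_split))                # 0: the Lagrangian splits
```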


Exercise 38 (Two-body problem). Consider a system of two point bodies in $\mathbb{R}^3$ with masses $m_1$ and $m_2$, whose relative location is given by the vector $r \in \mathbb{R}^3$. Assume that the interaction depends only on the distance between the bodies. Show that, by choosing appropriate coordinates, the motion can be reduced to the one of a single point particle with mass $M = \frac{m_1 m_2}{m_1 + m_2}$ under a radial potential. Show, by proving that $r \times \dot r$ is conserved, that the orbit of a particle under a radial field lies in a fixed plane for all times.

Exercise 39. Let $x : [0, T] \to \mathbb{R}^n$ be a solution to the Euler-Lagrange equation associated to a $C^2$ Lagrangian $L : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$. Show that

$E(t) = -L(x, \dot x) + \dot x \cdot D_v L(x, \dot x)$

is constant in time. For mechanical systems this is simply the conservation of energy. Occasionally, the identity $\frac{d}{dt}E(t) = 0$ is also called the Beltrami identity.

Exercise 40. Consider a system of $n$ point bodies with masses $m_i$ and positions $r_i \in \mathbb{R}^3$, $1 \leq i \leq n$. Suppose the kinetic energy is $T = \sum_i \frac{m_i}{2}|\dot r_i|^2$ and the potential energy is $U = -\sum_i \sum_{j \neq i} \frac{m_i m_j}{2|r_i - r_j|}$. Let $I = \sum_i m_i |r_i|^2$. Show that

$\dfrac{d^2}{dt^2}I = 4T + 2U,$

which is strictly positive if the energy $T + U$ is positive. What implications does this identity have for the stability of planetary systems?

Exercise 41 (Jacobi metric). Let $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ Lagrangian. Let $x : [0, T] \to \mathbb{R}^n$ be a solution to the corresponding Euler-Lagrange equation

(30) $\dfrac{d}{dt}D_v L - D_x L = 0,$

for the Lagrangian

$L(x, v) = \dfrac{|v|^2}{2} - V(x).$

Let

$E(t) = \dfrac{|\dot x(t)|^2}{2} + V(x(t)).$

1. Show that $\dot E = 0$.


2. Let $E_0 = E(0)$. Show that $x$ is a solution to the Euler-Lagrange equation

(31) $\dfrac{d}{dt}D_v L_J - D_x L_J = 0$

associated to $L_J = \sqrt{E_0 - V(x)}\,|\dot x|$.
3. Show that any reparametrization in time of $x$ is also a solution to (31), and observe that the functional

$\int_0^T \sqrt{E_0 - V(x)}\,|\dot x|$

represents the length of the path between $x(0)$ and $x(T)$ in the Jacobi metric $g_{ij} = (E_0 - V(x))\,\delta_{ij}$.
4. Show that the solutions to the Euler-Lagrange equation (31), when reparametrized in time in such a way that the energy of the reparametrized trajectory is $E_0$, satisfy (30).

Exercise 42 (Brachistochrone problem). Let $(x_1, y_1)$ be a point in a (vertical) plane. Show that the curve $y = u(x)$ that connects $(0, 0)$ to $(x_1, y_1)$ in such a way that a particle with unit mass, moving under the influence of a unit gravity field, reaches $(x_1, y_1)$ in the minimum amount of time minimizes

$\int_0^{x_1} \sqrt{\dfrac{1 + (u')^2}{2u}}\, dx.$

Hint: use the fact that the sum of kinetic and potential energy is constant.

Determine the Euler-Lagrange equation and study its solutions, using exercise 39.

Exercise 43. Consider a second-order variational problem:

(32) $\min_x \int_0^T L(x, \dot x, \ddot x),$

where the minimum is taken over all trajectories $x : [0, T] \to \mathbb{R}^n$ with fixed boundary data $x(0)$, $x(T)$, $\dot x(0)$, $\dot x(T)$. Determine the Euler-Lagrange equation corresponding to (32).


    2. Further necessary conditions

    A classical strategy in the study of variational problems consists

    in establishing necessary conditions for minimizers. If there exists a

    minimizer and if the necessary conditions have a unique solution, then

    this solution has to be the unique minimizer and thus the problem is

    solved. In addition to Euler-Lagrange equations, several other neces-

    sary conditions can be derived. In this section we discuss boundary

    conditions which arise, for instance when the end-points are not fixed,

    and second-order conditions.

    2.1. Boundary conditions. In certain problems, the boundary

conditions, such as end-point values, are not prescribed a priori. In

    this case, it is possible to prove that the minimizers satisfy certain

    boundary conditions automatically. These are called natural boundary

    conditions.

Example 14. Consider the problem of minimizing the integral

(33) $\int_0^T L(x, \dot x)\, dt,$

over all $C^2$ curves $x : [0, T] \to \mathbb{R}^n$. Note that the boundary values for the trajectory $x$ at $t = 0, T$ are not prescribed a priori.

Let $x$ be a minimizer of (33) (with free endpoints). Then for all $\varphi : [0, T] \to \mathbb{R}^n$, not necessarily compactly supported,

$\int_0^T D_x L(x, \dot x) \cdot \varphi + D_v L(x, \dot x) \cdot \dot\varphi\; dt = 0.$

Integrating by parts and using the fact that $x$ is a solution to the Euler-Lagrange equation, we conclude that

$D_v L(x(0), \dot x(0)) = D_v L(x(T), \dot x(T)) = 0.$
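For a concrete instance (our own example, not from the text), take $L = \frac{\dot x^2 + x^2}{2}$ in one dimension. The Euler-Lagrange equation is $\ddot x = x$, and imposing the natural conditions $\dot x(0) = \dot x(T) = 0$ on the general solution forces $x \equiv 0$:

```python
import sympy as sp

t, T, C1, C2 = sp.symbols('t T C1 C2')
xg = C1 * sp.exp(t) + C2 * sp.exp(-t)        # general solution of x'' = x
natural = [sp.diff(xg, t).subs(t, 0),        # D_v L = x'(0) = 0
           sp.diff(xg, t).subs(t, T)]        # D_v L = x'(T) = 0
print(sp.solve(natural, [C1, C2]))           # {C1: 0, C2: 0}, so x = 0
```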


Exercise 44. Consider the problem of minimizing the integral

$\int_0^T L(x, \dot x)\, dt,$

over all $C^2$ curves $x : [0, T] \to \mathbb{R}^n$ such that $x(0) = x(T)$. Deduce that

$D_v L(x(0), \dot x(0)) = D_v L(x(T), \dot x(T)).$

Use the previous identity to show that any periodic (smooth) minimizer is in fact a periodic solution to the Euler-Lagrange equations.

Exercise 45. Consider the problem of minimizing

$\int_0^T L(x, \dot x)\, dt + \psi(x(T)),$

with $x(0)$ fixed and $x(T)$ free. Derive a boundary condition at $t = T$ for the minimizers.

Exercise 46 (Free boundary). Consider the problem of minimizing

$\int_0^T L(x, \dot x),$

over all terminal times $T$ and all $C^2$ curves $x : [0, T] \to \mathbb{R}^n$. Show that $x$ is a solution to the Euler-Lagrange equation and that

$L(x(T), \dot x(T)) = 0,$
$D_x L(x(T), \dot x(T)) \cdot \dot x(T) + D_v L(x(T), \dot x(T)) \cdot \ddot x(T) \geq 0,$
$D_v L(x(T), \dot x(T)) = 0.$

Let $q \in \mathbb{R}$ and $L : \mathbb{R}^2 \to \mathbb{R}$ be given by

$L(x, v) = \dfrac{(v - q)^2}{2} + \dfrac{x^2}{2} - 1.$

If possible, determine $T$ and $x : [0, T] \to \mathbb{R}$ that are (local) minimizers of

$\int_0^T L(x, \dot x)\, ds,$

with $x(0) = 0$.


2.2. Second-order conditions. If $f : \mathbb{R} \to \mathbb{R}$ is a $C^2$ function which has a minimum at a point $x_0$, then $f'(x_0) = 0$ and $f''(x_0) \geq 0$. For the minimal action problem, the analog of the vanishing of the first derivative is the Euler-Lagrange equation. We will now consider the analog of the second derivative being non-negative.

The next theorem concerns second-order conditions for minimizers:

Theorem 34 (Jacobi's test). Let $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ Lagrangian. Let $x : [0, T] \to \mathbb{R}^n$ be a $C^1$ minimizer of the action under fixed boundary conditions. Then, for each $\varphi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$, we have

(34) $\int_0^T \dfrac{1}{2}\varphi^T D^2_{xx}L(x, \dot x)\,\varphi + \varphi^T D^2_{xv}L(x, \dot x)\,\dot\varphi + \dfrac{1}{2}\dot\varphi^T D^2_{vv}L(x, \dot x)\,\dot\varphi \geq 0.$

Proof. If $x$ is a minimizer, the function $\epsilon \mapsto S[x + \epsilon\varphi]$ has a minimum at $\epsilon = 0$. By computing $\frac{d^2}{d\epsilon^2} S[x + \epsilon\varphi]$ at $\epsilon = 0$ we obtain (34). □

A corollary of the previous theorem is Lagrange's test, which we state next:

Corollary 35 (Lagrange's test). Let $L(x, v) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ Lagrangian. Suppose $x : [0, T] \to \mathbb{R}^n$ is a $C^1$ minimizer of the action under fixed boundary conditions. Then

$D^2_{vv}L(x, \dot x) \geq 0.$

Proof. Use Theorem 34 with $\varphi = \epsilon\psi(t)\sin\frac{t}{\epsilon}$, for $\psi : [0, T] \to \mathbb{R}^n$ with compact support in $(0, T)$, and let $\epsilon \to 0$. □

Exercise 47. Let $L : \mathbb{R}^{2n} \to \mathbb{R}$ be a continuous Lagrangian and let $x : [0, T] \to \mathbb{R}^n$ be a continuous piecewise $C^1$ trajectory. Show that for each $\epsilon > 0$ there exists a trajectory $y : [0, T] \to \mathbb{R}^n$ of class $C^1$ such that

$\left|\int_0^T L(x, \dot x) - \int_0^T L(y, \dot y)\right| < \epsilon.$


As a corollary, show that the value of the infimum of the action over piecewise $C^1$ trajectories is the same as the infimum over trajectories globally $C^1$. Note, however, that a minimizer may not be $C^1$.

Exercise 48 (Weierstrass test). Let $x : [0, T] \to \mathbb{R}^n$ be a $C^1$ minimizer of the action corresponding to a Lagrangian $L$. Let $v, w \in \mathbb{R}^n$ and $0 \leq \lambda \leq 1$ be such that $\lambda v + (1 - \lambda)w = 0$. Show that

$\lambda L(x, \dot x + v) + (1 - \lambda)L(x, \dot x + w) \geq L(x, \dot x).$

Hint: To prove the inequality at a point $t_0$, choose $\varphi_\epsilon$ such that

$\dot\varphi_\epsilon(t) = \begin{cases} v & \text{if } t_0 \leq t \leq t_0 + \lambda\epsilon, \\ w & \text{if } t_0 + \lambda\epsilon < t \leq t_0 + \epsilon, \\ 0 & \text{otherwise,} \end{cases}$

and consider $S[x + \varphi_\epsilon]$ as $\epsilon \to 0$.

    3. Applications to Riemannian geometry

This section is dedicated to some applications of the calculus of variations to Riemannian geometry, namely the study of geodesics and curvature. We also present some applications to geometric mechanics, namely the study of the rigid body.

In our examples we will mostly use local coordinates and will not try to address global problems in geometry. In fact, by using suitable charts, the problems we address can usually be reduced to problems in $\mathbb{R}^n$. To simplify the notation we will also use the Einstein convention for repeated indices, that is, $a_i b_i$ is in fact an abbreviation for $\sum_i a_i b_i$.

Example 15. Let $M$ be a Riemannian manifold with metric $g$, defined in local coordinates by the positive definite symmetric matrix $g_{ij}(x)$. Let $L : TM \to \mathbb{R}$ be given by

$L(x, v) = \dfrac{1}{2}g_{ij}(x)v^i v^j.$


Let $x : [a, b] \to M$ be a curve that minimizes

$\int_a^b L(x, \dot x)\, dt$

over all curves with certain fixed boundary conditions. Then we have

$\dfrac{d}{dt}\left(g_{ij}\dot x^i\right) - \dfrac{1}{2}D_j g_{mk}\,\dot x^m\dot x^k = 0,$

that is,

(35) $\ddot x^i + \dfrac{1}{2}g^{ij}\left(D_k g_{mj} + D_k g_{mj} - D_j g_{mk}\right)\dot x^m\dot x^k = 0,$

where $g^{ij}$ represents the inverse matrix of $g_{ij}$. We can write the previous equation in the more compact form

$\ddot x^i + \Gamma^i_{km}\dot x^m\dot x^k = 0,$

where

(36) $\Gamma^i_{km} = \dfrac{1}{2}g^{ij}\left(D_k g_{mj} + D_m g_{kj} - D_j g_{mk}\right)$

is the Christoffel symbol for the metric $g$ (note that the change in the order of the indices in the second term does not change the sum in (35), but makes $\Gamma$ symmetric in the indices $m$ and $k$).
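Formula (36) is mechanical enough to implement directly. The helper below (our own sketch, using sympy) computes the symbols for an arbitrary metric; as a test case we use the hyperbolic half-plane metric $g = \operatorname{diag}(1/y^2, 1/y^2)$, which is not treated in the exercises below:

```python
import sympy as sp

def christoffel(g, coords):
    """Gamma[i][k][m] from (36): (1/2) g^{ij} (D_k g_{mj} + D_m g_{kj} - D_j g_{mk})."""
    n, ginv = len(coords), g.inv()
    return [[[sp.simplify(sum(ginv[i, j] * (sp.diff(g[m, j], coords[k])
                                            + sp.diff(g[k, j], coords[m])
                                            - sp.diff(g[m, k], coords[j]))
                              for j in range(n)) / 2)
              for m in range(n)] for k in range(n)] for i in range(n)]

x, y = sp.symbols('x y', positive=True)
g = sp.diag(1 / y**2, 1 / y**2)              # hyperbolic half-plane metric
Gamma = christoffel(g, [x, y])
print(Gamma[0][0][1], Gamma[1][0][0], Gamma[1][1][1])   # -1/y, 1/y, -1/y
```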

Theorem 36. Let $g_{ij}$ be a smooth Riemannian metric in $\mathbb{R}^n$. The critical points $x$ of the functional

(37) $\int_0^T \dfrac{1}{2}g_{ij}(x)\dot x^i\dot x^j\, dt$

are also critical points of the functional

(38) $\int_0^T \sqrt{g_{ij}(x)\dot x^i\dot x^j}\, dt.$

Additionally, we can reparametrize the critical points of (38) in such a way that they are also critical points of (37).

Proof. The fact that the critical points of (37) are critical points of (38) is a simple computation. To prove the second part of the theorem it suffices to observe that the solutions of the Euler-Lagrange equation associated to $L$ preserve the energy $E = \frac{1}{2}g_{ij}(x)\dot x^i\dot x^j$. Using this fact it is

    gij(x)xixj. Using this fact is


easy to find the correct parametrization of the critical points of (38). □

    The minimizers of (38) are called geodesics, although sometimes

    the name is also used for critical points.

Example 16. Consider a parametrization $f : A \subset \mathbb{R}^m \to \mathbb{R}^n$ of an $m$-dimensional manifold. The induced metric in $\mathbb{R}^m$ is represented by the matrix

$g = (Df)^T Df.$

The motivation is the following: given a curve $\gamma(t) \in M$, consider the corresponding tangent vector $\dot\gamma(t)$ in $TM$. Let $x = f(\gamma)$ and $\dot x = Df\,\dot\gamma$. Then we define

$\langle\dot\gamma, \dot\gamma\rangle = \langle\dot x, \dot x\rangle,$

which gives rise precisely to the induced metric.

Exercise 49. Consider $\mathbb{R}^2\setminus\{0\}$ with polar coordinates $(r, \theta)$. Show that the standard metric in $\mathbb{R}^2$ can be written in these coordinates as

$g = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}.$

Let

$L(r, \theta, \dot r, \dot\theta) = \dfrac{\dot r^2 + r^2\dot\theta^2}{2}$

be the Lagrangian of a free particle in polar coordinates. Compute the Euler-Lagrange equation and determine the corresponding Christoffel symbol.

Exercise 50. Consider the sphere $x^2 + y^2 + z^2 = 1$ and the associated spherical coordinates $(\theta, \varphi)$,

$x = \cos\theta\sin\varphi, \qquad y = \sin\theta\sin\varphi, \qquad z = \cos\varphi,$

with $\theta \in (0, 2\pi)$ and $\varphi \in (0, \pi)$. Show that the induced metric is given by the matrix

$g = \begin{pmatrix} \sin^2\varphi & 0 \\ 0 & 1 \end{pmatrix}.$

Determine the Euler-Lagrange equation for $L = \frac{1}{2}g_{ij}v^i v^j$ and the Christoffel symbols corresponding to the coordinates $(\theta, \varphi)$.

Exercise 51. Consider the surface of revolution in $\mathbb{R}^3$ parametrized by $(r, \theta)$:

$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z(r).$

Show that the induced metric is

$g = \begin{pmatrix} 1 + (z')^2 & 0 \\ 0 & r^2 \end{pmatrix}.$

Show that the equation for the geodesics is

$\ddot\theta + \dfrac{2}{r}\dot r\dot\theta = 0, \qquad \ddot r - \dfrac{r\dot\theta^2}{1 + (z')^2} + \dfrac{z'z''}{1 + (z')^2}\dot r^2 = 0.$

Determine the corresponding Christoffel symbols. Prove the Clairaut identity, that is, that $r\cos\alpha$ is constant, where $\alpha$ is the angle between the velocity $\dot r\frac{\partial}{\partial r} + \dot\theta\frac{\partial}{\partial\theta}$ and the parallel direction $\frac{\partial}{\partial\theta}$.

Exercise 52 (Spherical pendulum). Show that for a spherical pendulum with unit mass, the Lagrangian can be written as

$L = \dfrac{\dot\theta^2\sin^2\varphi + \dot\varphi^2}{2} - U(\varphi).$

Exercise 53. Determine the Lagrangian of a point particle constrained to the cone $z^2 = x^2 + y^2$.

Exercise 54. Consider the Lagrangian of a particle of unit mass constrained to move on the cycloid parametrized by

$x = \theta - \sin\theta, \qquad y = \cos\theta.$

Show that the $y$ coordinate is $2\pi$-periodic in time for any initial condition that yields a periodic orbit.

3.1. Parallel Transport. The Christoffel symbols $\Gamma^i_{km}$ can be

    used to study parallel transport in a Riemannian manifold. In this

    section we define and discuss the main properties of parallel transport.

Let $M$ be a manifold and $\mathfrak{X}(M)$ the set of all $C^\infty$ vector fields on $M$. As usual in differential geometry, we identify vector fields on $M$ with the corresponding first-order linear differential operators. That is, if $X = (X^1, \ldots, X^n)$ is a vector field, we identify $X$ with the first-order differential operator

$X = \sum_i X^i\dfrac{\partial}{\partial x_i}.$

Then, the commutator of two vector fields $X$ and $Y$ is the vector field $[X, Y]$, which is defined through its action as a differential operator on smooth functions $f$:

$[X, Y]f = X(Y(f)) - Y(X(f)).$

A connection $\nabla$ on $M$ is a mapping

$\nabla : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$

satisfying the following properties:

1. $\nabla_{fX + gY}Z = f\nabla_X Z + g\nabla_Y Z$,
2. $\nabla_X(Y + Z) = \nabla_X Y + \nabla_X Z$,
3. $\nabla_X(fY) = f\nabla_X Y + X(f)Y$,

for all $X, Y, Z \in \mathfrak{X}(M)$ and all $f, g \in C^\infty(M)$.

The vector $\nabla_X Y$ represents the rate of variation of $Y$ along a curve tangent to $X$.


Exercise 55. Let $M$ be a manifold and $\nabla$ a connection on $M$. Define $\Gamma^i_{km}$ by

$\nabla_{\frac{\partial}{\partial x_k}}\dfrac{\partial}{\partial x_m} = \Gamma^i_{km}\dfrac{\partial}{\partial x_i}.$

Show that

(39) $\nabla_X Y = \left(\Gamma^i_{km}X^k Y^m + X^j\dfrac{\partial Y^i}{\partial x_j}\right)\dfrac{\partial}{\partial x_i},$

where $X = X^j\frac{\partial}{\partial x_j}$ and $Y = Y^j\frac{\partial}{\partial x_j}$.

At every point $x$, formula (39) depends only on the value of the vector field $X$ at $x$. This allows us to define the covariant derivative of a vector field $Y$ along a curve $x(t)$ through

$\dfrac{DY}{dt} = \nabla_{\dot x}Y.$

A vector field $X$ is parallel along a curve $x(t)$ if

$\dfrac{DX}{dt} = 0.$
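In coordinates, by exercise 55, $X$ is parallel along $x(t)$ exactly when $\dot X^i + \Gamma^i_{km}\dot x^k X^m = 0$, a linear ODE that is easy to integrate numerically. A sketch (ours, using scipy, for the hyperbolic half-plane metric considered earlier) transports a vector along the horizontal line $y = 1$:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Curve x(t) = (t, 1) in the half-plane, velocity (1, 0). The nonzero
# symbols are Gamma^x_{xy} = Gamma^x_{yx} = -1/y, Gamma^y_{xx} = 1/y,
# Gamma^y_{yy} = -1/y, so the parallel transport system reduces to
# X1' = X2, X2' = -X1 along this curve.
def rhs(t, X):
    y = 1.0
    return [(1.0 / y) * X[1], -(1.0 / y) * X[0]]

sol = solve_ivp(rhs, [0.0, np.pi / 2], [1.0, 0.0], rtol=1e-10)
print(sol.y[:, -1])   # ~ (0, -1): the vector rotates while staying parallel
```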

A connection is symmetric if

$\nabla_X Y - \nabla_Y X = [X, Y].$

In general, connections on a manifold do not have to be symmetric, and therefore

$\nabla_X Y - \nabla_Y X = T(X, Y) + [X, Y],$

where $T$ is the torsion.

    Exercise 56. Determine an expression for the torsion in local coordi-

    nates.

Exercise 57. Let $\nabla$ be a symmetric connection. Show that $\Gamma^k_{ij} = \Gamma^k_{ji}$.

A manifold can be endowed with different connections. For Riemannian manifolds, of special interest are the connections which are


compatible with the metric, that is, such that for all vector fields $X$ and $Y$

(40) $\dfrac{d}{dt}\langle X, Y\rangle = \left\langle\dfrac{DX}{dt}, Y\right\rangle + \left\langle X, \dfrac{DY}{dt}\right\rangle,$

where the derivatives are taken along any arbitrary curve $x(t)$. There exists a unique symmetric connection compatible with the metric, the Levi-Civita connection, whose Christoffel symbols are given by (36).

Theorem 37. Let $M$ be a Riemannian manifold with metric $g$. Then the Levi-Civita connection, defined in local coordinates by the Christoffel symbols (36), is the unique connection which is symmetric and compatible with the metric $g$.

Proof. Let $\nabla$ be a connection which is symmetric and compatible with the metric $g$. Then one can use (40) to determine $D_k g_{mj}$, $D_m g_{kj}$ and $D_j g_{mk}$, and it is a simple computation to show that its Christoffel symbols are given by (36). □
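As a computational footnote (our own check), compatibility with the metric reads, in coordinates, $D_k g_{mj} = \Gamma^l_{km} g_{lj} + \Gamma^l_{kj} g_{ml}$; the sketch below verifies it for the symbols (36) on the hyperbolic half-plane metric:

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
coords = [x, y]
g = sp.diag(1 / y**2, 1 / y**2)              # hyperbolic half-plane again
ginv = g.inv()
Gamma = [[[sum(ginv[i, j] * (sp.diff(g[m, j], coords[k])
               + sp.diff(g[k, j], coords[m])
               - sp.diff(g[m, k], coords[j])) for j in range(2)) / 2
           for m in range(2)] for k in range(2)] for i in range(2)]

ok = all(sp.simplify(sp.diff(g[m, j], coords[k])
                     - sum(Gamma[l][k][m] * g[l, j] + Gamma[l][k][j] * g[m, l]
                           for l in range(2))) == 0
         for k in range(2) for m in range(2) for j in range(2))
print(ok)   # True: the symbols (36) are compatible with g
```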

    Exercise 58. Verify that the Christoffel symbols define a connection.

Exercise 59. Use formula (36) to determine the Christoffel symbols corresponding to polar coordinates in $\mathbb{R}^2$; compare with the result of exercise 49.

Exercise 60. Let $X$ be a vector field and $x$ a trajectory that satisfies

$\dfrac{dx}{dt} = X(x).$

Show that in local coordinates

$\ddot x^i\dfrac{\partial}{\partial x_i} = X^k(x)\dfrac{\partial X^i}{\partial x_k}\dfrac{\partial}{\partial x_i},$

and, therefore,

$\dfrac{DX}{dt} = \left(\Gamma^i_{km}\dot x^k\dot x^m + \ddot x^i\right)\dfrac{\partial}{\partial x_i}.$


Show that the previous definition is independent of the choice of local coordinates, which allows us to define the covariant acceleration as

$\dfrac{D\dot x}{dt} = \left(\Gamma^i_{km}\dot x^k\dot x^m + \ddot x^i\right)\dfrac{\partial}{\partial x_i},$

for any $C^2$ trajectory.

Example 17. Equation (35) can then be rewritten as

$\dfrac{D\dot x}{dt} = 0,$

which should be compared with Newton's law for a particle in the absence of forces, $\ddot x = 0$.

Exercise 61. Let $M$ be a Riemannian manifold on which is defined a potential $V : M \to \mathbb{R}$. The corresponding Lagrangian is

$L(x, v) = \dfrac{1}{2}g_{ij}v^i v^j - V(x).$