15_11_2010_curs


  • 8/13/2019 15_11_2010_curs

    1/129

    Mathematics for Economists and Social Sciences

    Cristian Neculăescu

    (Cristian Neculăescu) Academy of Economic Studies, room 2625, Calea Dorobanți nr. 11-13, sector 1, București, România

    E-mail address, Cristian Neculăescu: [email protected]

    Dedicated to the memory of my Teachers and Professors: Dan Jebeleanu, Gheorghe Pntea, Aristide Halanay and Stefan Mirică.


    2000 Mathematics Subject Classification. Primary 05C38, 15A15; Secondary 05A15, 15A18


    Contents

    Part 1. Calculus

    Chapter 1. Infinite series
    1.1. Introduction
    1.2. Special cases
    1.3. Convergence Tests for positive series
    1.4. Convergence tests for general series
    1.5. Convergence tests for alternating series
    1.6. Some formulas and exercises
    1.7. A Macroeconomical Example
    1.8. Power Series
    1.9. Taylor's expansions

    Chapter 2. Functions of several variables (2 lectures)
    2.1. Introduction
    2.2. Limits of a function at a point
    2.3. Continuity
    2.4. Derivatives
    2.5. Higher order derivatives
    2.6. Further results about differentiable functions
    2.7. Implicit functions
    2.8. Taylor Polynomials
    2.9. Applications in Economics
    2.10. Extreme points
    2.11. Unconstrained Local Optimization
    2.12. Unconstrained optimization. Approximating functions by Least Square Method.
    2.13. Constrained Optimization

    Chapter 3. Ordinary differential equations (1/2 lecture)
    3.1. Classical elementary types of EODEs
    3.2. More elaborate examples
    3.3. Bernoulli equations
    3.4. Change of variables in an EODE
    3.5. Connections between EODE and Functional Equations
    3.6. Connections with Difference Equations

    Chapter 4. Qualitative Results for EODE
    4.1. Existence
    4.2. Unicity
    4.3. Global aspects
    4.4. Solutions as functions with respect to the initial values and parameters

    Chapter 5. Finite difference equations (1/2 lecture)

    Chapter 6. Improper integrals. Euler functions: Gamma, Beta

    Chapter 7. Applications of Calculus to economic modelling

    Part 2. Probabilities (7 lectures)

    Chapter 8. Events. Probability: classic and axiomatic definition. Field of events. Properties of probability.

    Chapter 9. Conditional probability. Probability of a union/intersection of events. Total probability formula. Bayes formulas. Classical probability schemes.

    Chapter 10. Definition of a random variable. Operations with random variables. Examples on the discrete case. Cumulative distribution function: definition, properties. Functions of random variables.

    Chapter 11. Continuous random variables. Probability density function: definition, properties.

    Chapter 12. Moments of random variables. Expectation and variance. Properties. Chebyshev inequality.

    Chapter 13. Discrete bivariate random variables: marginal distributions, moments, conditional distributions, covariance, correlation.

    Chapter 14. Discrete and classical distributions. Applications of probability theory to economic modelling.

    Chapter 15. Convexity

    Appendix A. * High School Revision
    A.1. Sets
    A.2. Usual Number Sets. Countability
    A.3. Minorants, majorants
    A.4. Relations
    A.5. Functions
    A.6. Binary Logic
    A.7. Database applications for Logic, Sets, Relations and functions
    A.8. Sequences
    A.9. Symbols

    Appendix B. Topology

    Appendix C. Functions of one variable

    Appendix. Bibliography


    Part 1

    Calculus


    CHAPTER 1

    Infinite series

    "Divergent series are the invention of the devil, and it is shameful to base on them any demonstration whatsoever." Abel, 1828

    The starting point for the main body of these Lecture Notes is the level of knowledge in Mathematics of a "High School graduate, with the maximum concentration on Mathematics". Broadly speaking, this means all of "Precalculus", "Geometry and Trigonometry", "Analytic Geometry", "Linear Algebra: linear systems, matrices, determinants", "Abstract Algebra: groups, fields, rings", "Calculus: limits, continuity, derivability, graphs of functions", "Calculus: elementary integrals". A brief review of some of these topics may be found in the Appendix; still, you may find it useful to keep appropriate high school texts close at hand. During the lectures and seminars, each of you is welcome to ask questions and to comment. As Murphy says, "Science advances when the student asks and the teacher doesn't know the answer".

    1.1. Introduction

    Consider a sequence of real numbers denoted $(a_n)_{n \in \mathbb{N}}$.

    1.1.1. Definition (Formal). The symbol $\sum_{n=1}^{\infty} a_n \overset{\text{Def}}{=} a_1 + a_2 + \cdots + a_n + \cdots$ is called "series" or "real series" or "real infinite series";

    the number $a_n$ is called "the [general] term of the series";

    the number $S_n$ defined by $S_n = a_1 + a_2 + a_3 + \cdots + a_n = \sum_{k=1}^{n} a_k$ is called "the nth order partial sum of the series";

    the sequence $(S_n)_{n \in \mathbb{N}}$ is called "the sequence of partial sums of the series".

    1.1.2. Definition (Informal). A series is an "infinite summation" or (more precisely) a "discrete infinite summation" or a "countable infinite summation".

    1.1.3. Remark. Series as an abstract mathematical model may be found in "Macroeconomics", representing "discrete dynamics" or "indefinite discrete financial flows"; a typical situation describes the (expected) present value of a future accumulation process in which the accumulation will take place at an indefinite number of future moments (e.g. dividends, insurance). The detailed study of these situations is beyond the purpose of the present text; the interested reader may consult titles like [17] or [18]. A small Macroeconomic model may be found at the end of this chapter.

    1.1.4. Example. A series: $\sum_{n=0}^{\infty} \frac{3 \cdot 2^n}{5^n}$. The sign "$\sum$" comes from the capital Greek letter "sigma".

    The general term: $a_n = \frac{3 \cdot 2^n}{5^n}$.

    !!! Pay attention to the first term (which is not always 0 or 1), located at the bottom of the summation symbol:

    $$\sum_{n = \text{!!! first term}}^{\infty} a_n$$

    The sequence of partial sums of the series: $S_n = \sum_{k=0}^{n} a_k = \sum_{k=0}^{n} \frac{3 \cdot 2^k}{5^k}$; in this particular case we may obtain an explicit form for $S_n$:

    $$S_n = \sum_{k=0}^{n} \frac{3 \cdot 2^k}{5^k} = 3 \sum_{k=0}^{n} \left(\frac{2}{5}\right)^k = 3 \cdot \frac{1 - \left(\frac{2}{5}\right)^{n+1}}{1 - \frac{2}{5}} = 5 \left[ 1 - \left(\frac{2}{5}\right)^{n+1} \right].$$

    It may be seen that $\exists \lim_{n \to \infty} S_n = 5$.
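The explicit form of the partial sums in Example 1.1.4 can be checked numerically. The following is only a small sketch: the function names `partial_sum` and `closed_form` are chosen here for illustration and do not come from the notes. Exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def partial_sum(n):
    """Return S_n = sum_{k=0}^{n} 3 * 2**k / 5**k as an exact fraction."""
    return sum(Fraction(3 * 2**k, 5**k) for k in range(n + 1))


def closed_form(n):
    """Return 5 * (1 - (2/5)**(n + 1)), the explicit form of S_n."""
    return 5 * (1 - Fraction(2, 5) ** (n + 1))


# Print a few partial sums next to the explicit form; both approach 5.
for n in (0, 1, 5, 20):
    print(n, float(partial_sum(n)), float(closed_form(n)))
```

The two columns agree for every n, and both approach the limit 5 as n grows.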

    1.1.5. Remark. Between the sequences $(a_n)_{n \in \mathbb{N}}$ and $(S_n)_{n \in \mathbb{N}}$ there are certain recurrence relations: $S_{n+1} = S_n + a_{n+1}$ (or $a_{n+1} = S_{n+1} - S_n$), $\forall n \in \mathbb{N}$.

    1.1.6. Definition (convergence/divergence). The symbol $\sum_{n=1}^{\infty} a_n$ is called "convergent" (we say "it converges") if the sequence $(S_n)_{n \in \mathbb{N}}$ is convergent (converges); only in this case is the value $S = \lim_{n \to \infty} S_n$ called "the sum of the series";

    the sequence $(S_n - S)_{n \in \mathbb{N}}$ (the difference between the partial sum and the sum) is "the remainder sequence" and it vanishes (it tends toward 0);

    $\sum_{n=1}^{\infty} a_n$ is divergent (diverges) if $(S_n)_{n \in \mathbb{N}}$ is divergent (diverges).

    1.1.7. Example. For the previous example, since $\lim_{n \to \infty} S_n = 5$ we conclude that the series $\sum_{n=0}^{\infty} \frac{3 \cdot 2^n}{5^n}$ converges and the sum is 5 (the value of the limit). We write $\sum_{n=0}^{\infty} \frac{3 \cdot 2^n}{5^n} = 5$.

    1.1.8. Theorem. [Divergence test] $\lim_{n \to \infty} a_n \neq 0 \Rightarrow \sum_{n \in \mathbb{N}} a_n$ diverges: IF the general term does not tend towards zero, THEN the series diverges.

    Proof. By contraposition: the statement ($a_n \not\to 0 \Rightarrow \sum_{n \in \mathbb{N}} a_n$ diverges) is logically equivalent to the statement ($\sum_{n \in \mathbb{N}} a_n$ converges $\Rightarrow a_n \to 0$). If $\sum_{n \in \mathbb{N}} a_n$ converges then, from the definition, $\exists S = \lim_{n \to \infty} S_n$, so $a_n = S_n - S_{n-1} \to S - S = 0$ as $n \to \infty$.

    The behavior of a series (convergent or divergent) is qualitative information, called "the nature of the series". When the series is convergent we may also talk about the sum of the series, which is quantitative information [conditioned by the qualitative information]. A difference between the two types of information (quantitative and qualitative) is that the algorithms embedded in software products are usually built upon the assumption that the qualitative part is satisfied, so the usage of some software products in situations where the qualitative part is not satisfied may lead to unexpected results. As a general rule, it is advisable to separate the qualitative and quantitative studies.


    1.1.9. Remark. There are some significant differences between finite and infinite addition (summation); some of them:

    (1) While the result of a finite addition always exists, this is not the case with infinite addition.
    (2) While finite addition is commutative, the rearrangement of the terms of an infinite addition may alter both the qualitative and the quantitative results.
    (3) While finite addition is associative, careless grouping and regrouping of the terms of an infinite addition is not valid and may lead to unexpected results.
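Point (2) can be illustrated numerically with a standard example that is not taken from these notes: the alternating harmonic series $1 - \frac12 + \frac13 - \frac14 + \cdots$, whose sum is $\ln 2$, and a rearrangement of the same terms (two positive terms followed by one negative term), whose sum is known to be $\frac32 \ln 2$. The sketch below assumes this example; the function names are illustrative only.

```python
import math


def natural_order(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... with n_terms terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))


def rearranged(n_blocks):
    """Same terms, reordered: two positive terms, then one negative term.

    Block j is 1/(4j-3) + 1/(4j-1) - 1/(2j).
    """
    return sum(
        1 / (4 * j - 3) + 1 / (4 * j - 1) - 1 / (2 * j)
        for j in range(1, n_blocks + 1)
    )


print(natural_order(200_000), math.log(2))
print(rearranged(100_000), 1.5 * math.log(2))
```

Both sums use exactly the same terms, each exactly once; only the order differs, and so does the limit.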

    1.1.10. Example. Consider $\sum_{n=0}^{\infty} (-1)^n$. The series diverges because the general term doesn't tend towards zero. Still, the following false line of reasoning "finds the sum of the series":

    $S = 1 - 1 + 1 - 1 + 1 - 1 + 1 - \cdots \Rightarrow$
    $S - 1 = -1 + 1 - 1 + 1 - \cdots \Rightarrow$
    $S - 1 = -(1 - 1 + 1 - 1 + 1 - \cdots) = -S \Rightarrow$
    $2S = 1 \Rightarrow S = \frac{1}{2}$.

    The "result" is false and the unique mistake is "the notation" $S = \sum_{n=0}^{\infty} (-1)^n$, which implicitly and falsely assumes that a number $S$ exists and is equal to the abstract symbol $\sum_{n=0}^{\infty} (-1)^n$.

    1.1.11. Remark. Given a series, the inclusion/exclusion of a finite number of terms doesn't change the nature of the series [because a finite number of additions/subtractions does not modify the existence of a limit] [the series $\sum_{n \in \mathbb{N}} a_n$ and $\sum_{n \in \mathbb{N} \setminus \{0, 1, \ldots, k\}} a_n$ have the same nature]. Still, it may change the value of the sum, when it exists.

    1.1.12. Remark. While finite addition is associative, infinite addition is not always associative. This means that infinite grouping of the added objects sometimes changes the nature of the infinite summation. Example: "0 = 1". False line of reasoning:

    $1 = 1 + 0 + 0 + \cdots + 0 + \cdots =$
    $= 1 + (-1 + 1) + (-1 + 1) + (-1 + 1) + \cdots =$
    $= (1 - 1) + (1 - 1) + \cdots + (1 - 1) + \cdots =$
    $= 0$.

    [The line of reasoning again makes the (hidden) false assumption that there is a number $S$ equal to the abstract symbol $\sum_{n=0}^{\infty} (-1)^n$ and falsely assumes that rearrangements are valid for divergent series]

    1.1.13. Remark. When convergent, the sum of a series is unique [because the limit of a sequence is unique].

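    1.1.14. Remark (Algebraic operations with series, Thms. 3.47, 3.50, 3.51 [19]). When the series $\sum_{n \in \mathbb{N}} a_n$ and $\sum_{n \in \mathbb{N}} b_n$ are both convergent and $\alpha \in \mathbb{R}$, the series $\sum_{n \in \mathbb{N}} (a_n + b_n)$ and $\sum_{n \in \mathbb{N}} (\alpha a_n)$ are also convergent and moreover, the following relations between the sums of series are valid:

    $$\sum_{n \in \mathbb{N}} (a_n + b_n) = \sum_{n \in \mathbb{N}} a_n + \sum_{n \in \mathbb{N}} b_n; \qquad \sum_{n \in \mathbb{N}} (\alpha a_n) = \alpha \sum_{n \in \mathbb{N}} a_n.$$

    1.1.15. Remark. In this result, the qualitative part is: "$\sum_{n \in \mathbb{N}} a_n$ and $\sum_{n \in \mathbb{N}} b_n$ are both convergent $\Rightarrow$ $\sum_{n \in \mathbb{N}} (a_n + b_n)$ and $\sum_{n \in \mathbb{N}} (\alpha a_n)$ are also convergent" while the quantitative part is: $\sum_{n \in \mathbb{N}} (a_n + b_n) = \sum_{n \in \mathbb{N}} a_n + \sum_{n \in \mathbb{N}} b_n$; $\sum_{n \in \mathbb{N}} (\alpha a_n) = \alpha \sum_{n \in \mathbb{N}} a_n$.

    1.1.16. Remark. The proof is based on translating the convergences in terms of "$\varepsilon$-definitions".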

    1.2. Special cases

    1.2.1. Arithmetic sequence (arithmetic progression). A sequence of numbers such that the difference between any two consecutive terms is constant (it is called the "common difference"). (Alternative characterization: for any three consecutive terms, the middle term is the arithmetic mean of the extreme terms.)

    $a_n = a_1 + (n - 1) d$.

    Arithmetic series:

    $$\sum_{k=1}^{n} a_k = \sum_{k=1}^{n} \left( a_1 + (k - 1) d \right) = n a_1 + \frac{n (n - 1)}{2} d$$

    [Used in finance, simple interest formulas]

    1.2.2. Geometric sequence (geometric progression). A sequence of numbers such that the ratio between any two consecutive terms is constant. (Alternative characterization: for any three consecutive terms, the middle term is the geometric mean of the extreme terms.)

    $a_n = a_1 r^{n-1}$, $r \neq 1$.

    Geometric series:

    $$\sum_{k=1}^{n} a_k = \sum_{k=1}^{n} a_1 r^{k-1} = a_1 \frac{1 - r^n}{1 - r}.$$

    $$\sum_{n \in \mathbb{N}} a^n \text{ is } \begin{cases} \text{convergent}, & a \in (-1, 1) \\ \text{divergent}, & a \in \mathbb{R} \setminus (-1, 1) \end{cases}$$

    In fact $S_n = 1 + a + a^2 + a^3 + \cdots + a^n = \begin{cases} \dfrac{1 - a^{n+1}}{1 - a}, & a \neq 1 \\ n + 1, & a = 1. \end{cases}$

    1.2.3. Harmonic sequence (harmonic progression).

    $$\sum_{k=1}^{n} a_k = \sum_{k=1}^{n} \frac{1}{a_1 + (k - 1) d}$$

    (no elementary formula available)

    Interpretation: Given $n$ (ordered) observations of a certain measurement (such that the observations are comparable), say that an observation is a "record" if it is the greatest of all observations up to it. Then the expected number of records is $1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}$.
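The "records" interpretation can be checked by brute force for small $n$. The sketch below makes an assumption that the text leaves implicit: the $n$ observations are distinct and every ordering is equally likely. Under that assumption, the average number of records over all orderings equals $1 + \frac12 + \cdots + \frac1n$. The function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def count_records(seq):
    """Count entries strictly greater than all earlier entries."""
    records, best = 0, None
    for value in seq:
        if best is None or value > best:
            records += 1
            best = value
    return records


def expected_records(n):
    """Average number of records over all orderings of n distinct values."""
    perms = list(permutations(range(n)))
    return Fraction(sum(count_records(p) for p in perms), len(perms))


def harmonic(n):
    """Exact value of 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for n in range(1, 7):
    print(n, expected_records(n), harmonic(n))
```

Enumeration is only feasible for small $n$ (there are $n!$ orderings), but it is exact.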

    1.2.4. The number e. $e = \sum_{n=0}^{\infty} \frac{1}{n!}$

    1.2.1. Theorem (Thm. 3.31, [19]). $\lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n = e$.

    1.2.2. Theorem (Thm. 3.32, [19]). The number e is irrational.
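The two descriptions of $e$ (the series and the limit) can be compared numerically. This is a small sketch with illustrative function names; it also shows that the series converges much faster than the limit.

```python
import math


def e_series(n_terms):
    """Partial sum of sum_{n>=0} 1/n! using the first n_terms terms."""
    total, term = 0.0, 1.0
    for n in range(n_terms):
        total += term
        term /= n + 1  # turns 1/n! into 1/(n+1)!
    return total


def e_limit(n):
    """The n-th term of the sequence (1 + 1/n)**n."""
    return (1 + 1 / n) ** n


print(e_series(20), e_limit(10**6), math.e)
```

Twenty terms of the series already agree with `math.e` to floating-point accuracy, while $(1 + 1/n)^n$ with $n = 10^6$ is still off in the sixth decimal place.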

    1.2.3. Example (Achilles and the Turtle (Zeno's paradox); also see Section 1.3 [4]). Achilles (A) and the Turtle (T) race together. It is assumed that Achilles' speed is much bigger than the Turtle's speed, so common sense tells us that even if Achilles gives the Turtle an initial advantage, he will still win the race.

    The following line of reasoning has been known since Ancient Greece as "Zeno's paradox":

    Denote A's speed by $v_A$ and T's speed by $v_T$ (with $v_A > v_T$). Suppose the advantage given by A is a distance $S_0$: A starts the race only when T has covered $S_0$. By the time A has also covered $S_0$, T has already covered another distance, called $S_1$. In the time needed by A to cover the new distance $S_1$, T covers a new distance $S_2$, and so on. "Common sense" says that the distances $S_n$, even if they are increasingly smaller, are always strictly positive. This is interpreted in the following manner: "Achilles will never outrun the Turtle, because the Turtle will always have a strictly positive distance in advance".

    Since T covers $S_{n+1}$ in the time $\frac{S_n}{v_A}$ that A needs to cover $S_n$, we have $S_{n+1} = S_n \frac{v_T}{v_A}$, so $S_n = S_0 \left( \frac{v_T}{v_A} \right)^n$. T's total advantage is (the geometric series):

    $$\sum_{n=0}^{\infty} S_0 \left( \frac{v_T}{v_A} \right)^n = \lim_{n \to \infty} S_0 \frac{1 - \left( \frac{v_T}{v_A} \right)^{n+1}}{1 - \frac{v_T}{v_A}} = S_0 \frac{v_A}{v_A - v_T}$$

    [Although the Turtle's total advantage is an infinite sum of strictly positive distances, the total value of the sum is finite]

    The time Achilles needs to cover this distance is $\frac{S_0}{v_A - v_T}$, which is equal to (the geometric series) $\sum_{n=0}^{\infty} \frac{S_0}{v_A} \left( \frac{v_T}{v_A} \right)^n$.
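A numerical sketch of the two geometric series in the Achilles example follows. The values of `s0`, `v_a` and `v_t` are arbitrary illustrative choices, not data from the notes; the successive head starts are taken as $S_n = S_0 (v_T / v_A)^n$.

```python
def turtle_head_starts(s0, v_a, v_t, n_stages):
    """Distances S_0, S_1, ..., with S_n = s0 * (v_t / v_a)**n."""
    ratio = v_t / v_a
    return [s0 * ratio**n for n in range(n_stages)]


# Illustrative values (assumptions): head start 100, speeds 10 and 1.
s0, v_a, v_t = 100.0, 10.0, 1.0
stages = turtle_head_starts(s0, v_a, v_t, 60)

total_distance = sum(stages)               # sum of S_n
total_time = sum(s / v_a for s in stages)  # sum of S_n / v_A

print(total_distance, s0 * v_a / (v_a - v_t))
print(total_time, s0 / (v_a - v_t))
```

Both sums are finite: Achilles catches the Turtle after a finite distance and a finite time, even though the number of stages is infinite.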


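    1.2.4. Example (Telescoping/collapsing series). Consider the series $\sum_{n=1}^{\infty} \frac{1}{n (n + 1)}$. It is convergent and is a "telescopic series" in the sense that the sum may be calculated in an elementary way, by successive cancellation: $\frac{1}{k (k + 1)} = \frac{1}{k} - \frac{1}{k + 1}$, so that

    $$\sum_{k=1}^{n} \frac{1}{k (k + 1)} = \sum_{k=1}^{n} \left( \frac{1}{k} - \frac{1}{k + 1} \right) = \left( \frac{1}{1} - \frac{1}{2} \right) + \left( \frac{1}{2} - \frac{1}{3} \right) + \cdots + \left( \frac{1}{n} - \frac{1}{n + 1} \right) = 1 - \frac{1}{n + 1}$$

    $\Rightarrow S_n = 1 - \frac{1}{n + 1} \Rightarrow \exists \lim_{n \to \infty} S_n$ (so the series is convergent) and $\lim_{n \to \infty} S_n = 1$ (so the sum is 1).

    CAUTION: The sum $\sum_{k=1}^{n} \left( \frac{1}{k} - \frac{1}{k + 1} \right)$ has a finite number of terms, so it is not wrong to write $\sum_{k=1}^{n} \frac{1}{k} - \sum_{k=1}^{n} \frac{1}{k + 1}$. On the contrary, in the situation $\sum_{n=1}^{\infty} \left( \frac{1}{n} - \frac{1}{n + 1} \right)$, because of the infinite number of terms, it is wrong to write $\sum_{n=1}^{\infty} \frac{1}{n} - \sum_{n=1}^{\infty} \frac{1}{n + 1}$; both series are divergent, so that actually we have:

    $$\sum_{n=1}^{\infty} \left( \frac{1}{n} - \frac{1}{n + 1} \right) \text{ "=" } \sum_{n=1}^{\infty} \frac{1}{n} - \sum_{n=1}^{\infty} \frac{1}{n + 1} \iff 1 \text{ "=" } \infty - \infty.$$

    1.2.5. Example. $\sum_{n=1}^{\infty} \left( \sqrt{n + 2} - 2 \sqrt{n + 1} + \sqrt{n} \right)$:

    $$\sum_{n=1}^{\infty} \left( \sqrt{n + 2} - 2 \sqrt{n + 1} + \sqrt{n} \right) = \sum_{n=1}^{\infty} \left[ \left( \sqrt{n + 2} - \sqrt{n + 1} \right) + \left( \sqrt{n} - \sqrt{n + 1} \right) \right] =$$

    $$= \lim_{n \to \infty} \sum_{k=1}^{n} \left[ \left( \sqrt{k + 2} - \sqrt{k + 1} \right) + \left( \sqrt{k} - \sqrt{k + 1} \right) \right] = \lim_{n \to \infty} \left( \sqrt{n + 2} - \sqrt{2} + 1 - \sqrt{n + 1} \right) = 1 - \sqrt{2}$$

    1.2.6. Exercise. For the following telescoping series, establish their nature and, if convergent, find the sum:

    (1) $\sum_{n=1}^{\infty} \frac{1}{\sqrt{n} + \sqrt{n + 1}} = \infty$

    (2) $\sum_{n=1}^{\infty} \frac{1}{n^2 + 5 n + 6} = \frac{1}{3}$

    (3) $\sum_{n=1}^{\infty} \frac{1}{n^2 + 4 n + 3} = \frac{5}{12}$

    (4) $\sum_{n=1}^{\infty} \ln \frac{n}{n + 1}$

    (5) $\sum_{n=1}^{\infty} \frac{3 n^2 + n - 1}{n^2 - 2 n + 3}$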
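Exercise (2) can be checked with exact arithmetic. The sketch below relies on the partial-fraction decomposition $\frac{1}{n^2 + 5n + 6} = \frac{1}{n + 2} - \frac{1}{n + 3}$, which gives the telescoped partial sum $\frac{1}{3} - \frac{1}{n + 3}$; the function names are illustrative.

```python
from fractions import Fraction


def partial_sum(n):
    """Exact S_n = sum_{k=1}^{n} 1 / (k**2 + 5*k + 6)."""
    return sum(Fraction(1, k * k + 5 * k + 6) for k in range(1, n + 1))


def telescoped(n):
    """S_n after cancellation: 1/3 - 1/(n + 3)."""
    return Fraction(1, 3) - Fraction(1, n + 3)


for n in (1, 10, 100):
    print(n, partial_sum(n), telescoped(n))
```

Since $\frac{1}{n + 3} \to 0$, the partial sums tend to $\frac{1}{3}$, in agreement with the stated answer.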

    1.2.5. Various series classifications.

    1.2.5.1. With respect to the convergence/divergence (and the type of divergence) of the sequence of partial sums, a series is:

    (1) convergent, or
    (2) divergent, in which case either the sum is equal to $\infty$, or the sum does not exist.

    1.2.5.2. With respect to the type of the general term, a series is:

    (1) general ($a_n \in \mathbb{R}$);
    (2) positive ($a_n \geq 0$);
    (3) alternate ($a_n = (-1)^n b_n$, $b_n \geq 0$, or $a_n a_{n+1} < 0$).

    1.3. Convergence Tests for positive series

    The general term of a positive series will be positive ($a_n \geq 0$), and strictly positive ($a_n > 0$) only when required by the operations involved. The sum of these series always exists, but it may be infinite ($+\infty$).

    1.3.1. Theorem (Thm. 1.48, [4]). For a positive terms series, changing the order of the terms does not change the nature of the series or the value of the sum.

    Proof. Consider $\sum_{n=1}^{\infty} a_n$ with $a_n > 0$ for all $n$ and $\sum_{n=1}^{\infty} b_n$ a rearrangement of the first series (that is, the same terms in a different order).

    The sequence of partial sums $S^a_n = \sum_{k=1}^{n} a_k$ is an increasing sequence (because $S^a_{n+1} = S^a_n + a_{n+1} > S^a_n$), so it has a limit (which may be infinite; denote it by $S^a$).

    The sequence of partial sums $S^b_n = \sum_{k=1}^{n} b_k$ is also an increasing sequence, with limit $S^b$.

    Consider an arbitrary fixed index $n$. Since $b_1, \ldots, b_n$ are terms of the first series, say $b_k = a_{n_k}$, let $p_n = \max \{ n_1, \ldots, n_n \}$ be the biggest index involved. Then $S^b_n \leq S^a_{p_n} \leq S^a$, so passing to the limit for $n \to \infty$ it follows that $S^b \leq S^a$. A similar argument leads to $S^a \leq S^b$, so in fact $S^a = S^b$.

    1.3.2. Theorem (First Comparison Test; Thm. 1.49, [4]). Consider two series with positive terms $\sum_{n \in \mathbb{N}} a_n$ and $\sum_{n \in \mathbb{N}} b_n$ such that there is an index $n_0 \in \mathbb{N}$ for which $0 \leq a_n \leq b_n$, $\forall n \geq n_0$. Then:

    (1) If $\sum_{n \in \mathbb{N}} b_n$ converges then $\sum_{n \in \mathbb{N}} a_n$ converges;
    (2) If $\sum_{n \in \mathbb{N}} a_n$ diverges then $\sum_{n \in \mathbb{N}} b_n$ diverges.

    Proof. Since the nature of a series does not change when subtracting a finite number of terms, it may be assumed that the inequality $0 \leq a_n \leq b_n$ is valid for all $n$. Then between the sequences of partial sums (which are increasing sequences in the present case) there is the relation $S^a_n \leq S^b_n$ for all $n$, which means that when $S^b_n$ is bounded $S^a_n$ is bounded too, and when $S^a_n$ is unbounded $S^b_n$ is unbounded too.

    Exercise: For the series $\sum_{n \in \mathbb{N}} \frac{1}{3^n + 2}$ use the inequality $3^n + 2 \geq 3^n \Rightarrow \frac{1}{3^n + 2} \leq \frac{1}{3^n}$ and the first comparison test to study the nature of the series.

    Exercise: For the series $\sum_{n \in \mathbb{N}} \frac{1}{\sqrt{n}}$ use the inequality $\sqrt{n} + \sqrt{n} = 2 \sqrt{n} \leq \sqrt{n} + \sqrt{n + 1} \Rightarrow \frac{1}{\sqrt{n}} \geq \frac{1}{\sqrt{n} + \sqrt{n + 1}}$ and the first comparison test to study the nature of the series.

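    1.3.3. Theorem (Ratio Comparison Test; Thm. 1.55, [4]). Consider two series with positive terms $\sum_{n \in \mathbb{N}} a_n$ and $\sum_{n \in \mathbb{N}} b_n$ such that there is an index $n_0 \in \mathbb{N}$ for which $\frac{a_{n+1}}{a_n} \leq \frac{b_{n+1}}{b_n}$, $\forall n \geq n_0$. Then:

    (1) If $\sum_{n \in \mathbb{N}} b_n$ converges then $\sum_{n \in \mathbb{N}} a_n$ also converges;
    (2) If $\sum_{n \in \mathbb{N}} a_n$ diverges then $\sum_{n \in \mathbb{N}} b_n$ also diverges.

    Proof. Again consider that $\frac{a_{n+1}}{a_n} \leq \frac{b_{n+1}}{b_n}$ for all $n$. By multiplying all the inequalities from $n = 0$ up to $n = k - 1$ it follows that $\frac{a_k}{a_0} \leq \frac{b_k}{b_0}$, so that $a_k \leq \frac{a_0}{b_0} b_k$ for all $k$, and the Comparison Test may be applied to conclude the proof.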

    1.3.4. Theorem (Limit Comparison Test; Thm. 1.52 [4]). If $\exists \lim_{n \to \infty} \frac{a_n}{b_n} \in (0, \infty)$ then the series $\sum_{n \in \mathbb{N}} a_n$ and $\sum_{n \in \mathbb{N}} b_n$ have the same nature.

    1.3.5. Theorem (nth Root Test / Cauchy's test, Thm. 1.65, [4]). For the series $\sum_{n \in \mathbb{N}} a_n$, $a_n > 0$: if $\lim_{n \to \infty} \sqrt[n]{a_n} = L \in (0, \infty)$, then:

    (1) For $L < 1$ the series converges;
    (2) For $L > 1$ the series diverges;
    (3) For $L = 1$ the test is inconclusive.

    1.3.6. Theorem (Ratio Test / D'Alembert's test, Thm. 1.62, [4]). For the series $\sum_{n \in \mathbb{N}} a_n$, $a_n > 0$: if $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = L \in (0, \infty)$, then:

    (1) For $L < 1$ the series converges;
    (2) For $L > 1$ the series diverges;
    (3) For $L = 1$ the test is inconclusive.

    1.3.7. Theorem (Integral Test, Thm. 1.57 [4]). Consider a function $\varphi(\cdot) : [1, \infty) \to \mathbb{R}_+$, continuous and decreasing. Then the series $\sum_{n \in \mathbb{N}} \varphi(n)$ converges if and only if the improper integral $\int_1^{\infty} \varphi(x) \, dx$ converges.

    1.3.8. Theorem (Cauchy Condensation Test, Thm. 2.3 [4]). For a decreasing sequence $a_n > 0$, the series $\sum_{n \in \mathbb{N}} a_n$ and $\sum_{n \in \mathbb{N}} 2^n a_{2^n}$ have the same nature.

    1.3.9. Theorem. The p-series $\sum_{n \in \mathbb{N}} \frac{1}{n^p}$ with $p \in \mathbb{R}$ is:

    (1) Convergent if $p > 1$.
    (2) Divergent if $p \leq 1$.

    1.3.10. Theorem (Schlömilch, Thm. 2.4 [4]). If $a_n > 0$ is eventually decreasing and the sequence $(n_k)_k$ is strictly increasing such that $\left( \frac{n_{k+1} - n_k}{n_k - n_{k-1}} \right)_k$ is a bounded sequence, then the series $\sum_{n \in \mathbb{N}} a_n$ and $\sum_{k \in \mathbb{N}} (n_{k+1} - n_k) a_{n_k}$ have the same nature.

    1.3.11. Theorem (Raabe's Test, Thm. 11, [12]). For a series $\sum_{n \in \mathbb{N}} a_n$ with positive terms ($a_n > 0$), suppose the limit $\lim_{n \to \infty} n \left( \frac{a_n}{a_{n+1}} - 1 \right)$ exists and is equal to $L$. Then:

    (1) If $L > 1$ then the series converges;
    (2) If $L < 1$ then the series diverges;
    (3) If $L = 1$ then the test is inconclusive.
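To see why several tests are needed, consider the p-series with $p = 2$, $a_n = \frac{1}{n^2}$ (an illustrative choice). The ratio $\frac{a_{n+1}}{a_n}$ tends to 1, so the Ratio Test is inconclusive, while the Raabe quantity $n \left( \frac{a_n}{a_{n+1}} - 1 \right) = 2 + \frac{1}{n}$ tends to $2 > 1$, so Raabe's Test gives convergence. The sketch below only evaluates the two quantities at a large index, which suggests the limits but does not prove them.

```python
def ratio(a, n):
    """Ratio test quantity a(n+1) / a(n)."""
    return a(n + 1) / a(n)


def raabe(a, n):
    """Raabe test quantity n * (a(n) / a(n+1) - 1)."""
    return n * (a(n) / a(n + 1) - 1)


def a(n):
    """General term of the p-series with p = 2 (illustrative choice)."""
    return 1 / n**2


for n in (10, 1000, 10**6):
    print(n, ratio(a, n), raabe(a, n))
```

Any other general term can be passed in place of `a` to explore how the two quantities behave.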

    1.4. Convergence tests for general series

    1.4.1. Definition. The series $\sum_{n \in \mathbb{N}} a_n$ is called absolutely convergent when $\sum_{n \in \mathbb{N}} |a_n|$ (the series of absolute values) is convergent.

    1.4.2. Remark. For a general series (with $a_n \in \mathbb{R}$) the series of absolute values is a positive terms series, so the previous section applies to it.

    1.4.3. Definition. The series $\sum_{n \in \mathbb{N}} a_n$ is called conditionally convergent when it is convergent but not absolutely convergent.

    1.4.4. Theorem. If a series converges absolutely then it converges (in the ordinary sense).

    Proof. Consider an absolutely convergent series $\sum_{n \in \mathbb{N}} a_n$. Then $\sum_{n \in \mathbb{N}} |a_n|$ is convergent and $0 \leq a_n + |a_n| \leq 2 |a_n|$ $\Rightarrow$ the series $\sum_{n \in \mathbb{N}} (a_n + |a_n|)$ has positive terms and is dominated by a convergent series, so by the Comparison Test it is convergent.

    Then, because the series $\sum_{n \in \mathbb{N}} (a_n + |a_n|)$ and $\sum_{n \in \mathbb{N}} |a_n|$ are convergent, so is their difference:

    $$\sum_{n \in \mathbb{N}} (a_n + |a_n|) - \sum_{n \in \mathbb{N}} |a_n| = \sum_{n \in \mathbb{N}} (a_n + |a_n| - |a_n|) = \sum_{n \in \mathbb{N}} a_n.$$

    1.4.5. Theorem (Abel). If $\sum_{n \in \mathbb{N}} a_n$ converges and $(b_n)_{n \in \mathbb{N}}$ is a bounded monotone sequence then $\sum_{n \in \mathbb{N}} a_n b_n$ converges.

    1.4.6. Theorem (Dirichlet). If $\sum_{n \in \mathbb{N}} a_n$ has bounded partial sums and $(b_n)_{n \in \mathbb{N}}$ is monotone with $\lim_{n \to \infty} b_n = 0$, then $\sum_{n \in \mathbb{N}} a_n b_n$ converges.

    1.5. Convergence tests for alternating series

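    1.5.1. Definition. The series $\sum_{n \in \mathbb{N}} a_n$ is called alternating when $a_n = (-1)^n b_n$, $b_n > 0$.

    1.5.2. Theorem (Alternating series test / Leibniz, Thm. 1.75, [4]). If:

    (1) $\exists n_0 \in \mathbb{N}$, $\forall n \geq n_0$, $b_{n+1} \leq b_n$;
    (2) $\lim_{n \to \infty} b_n = 0$;

    then the alternating series $\sum_{n \in \mathbb{N}} (-1)^n b_n$, $b_n > 0$, converges.

    1.6. Some formulas and exercises

    $\sum_{k=1}^{n} 1 = n$

    $\sum_{k=1}^{n} k = \frac{n (n + 1)}{2}$

    $\sum_{k=1}^{n} k^2 = \frac{n (n + 1) (2 n + 1)}{6}$

    $\sum_{k=1}^{n} k^3 = \left( \frac{n (n + 1)}{2} \right)^2$

    $1 + x + x^2 + \cdots + x^n = \begin{cases} \dfrac{1 - x^{n+1}}{1 - x}, & x \neq 1 \\ n + 1, & x = 1 \end{cases}$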
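The closed forms for $\sum 1$, $\sum k$, $\sum k^2$ and $\sum k^3$ can be confirmed by direct summation for many values of $n$. This is only a finite check, not a proof (the usual proof is by induction); the function name is illustrative.

```python
def closed_forms_hold(n):
    """Compare direct summation with the four closed forms for a given n."""
    ks = range(1, n + 1)
    return (
        sum(1 for _ in ks) == n
        and sum(ks) == n * (n + 1) // 2
        and sum(k**2 for k in ks) == n * (n + 1) * (2 * n + 1) // 6
        and sum(k**3 for k in ks) == (n * (n + 1) // 2) ** 2
    )


print(all(closed_forms_hold(n) for n in range(1, 101)))
```

Integer arithmetic is used throughout, so the comparison is exact.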

    1.7. A Macroeconomical Example

    1.7.1. Example. A typical Macroeconomics model, called "The household's maximization problem", may look like this:

    $$\max_{\{c_t\}_{t=1}^{\infty}} \sum_{t=1}^{\infty} \beta^{t-1} u(c_t),$$

    subject to:

    $$\sum_{t=1}^{\infty} \frac{P (y_t - c_t)}{(1 + R)^{t-1}} = 0.$$

    It is beyond the goal of the present text to study such models. Here we just mention the economic interpretations expressed by means of series:

    - $t \in \mathbb{N}$ means "(discrete) time" (0 means "now", 1 means "a year from now" and so on); the measurement unit for time may be "year" or a certain unspecified "period of time".
    - The discussion is about a "household", and not an individual; one difference is that while an individual lives a finite number of years, the household may be considered "to live forever" (an indefinite number of years).
    - The household uses a single commodity (say bananas) measured in quantities (kilos of bananas) both for income and consumption.
    - $y_t$ is "the household's income for period $t$" (kilos of bananas) (exogenous).
    - $c_t$ is "the household's consumption for the period $t$" (kilos of bananas).
    - $P$ is the price of one kilo of bananas (doesn't change over time :) ).
    - The household has access to a "bananas market", where it may buy (at price $P$), sell (at price $P$) and invest money to buy bonds on the bananas market, which bear interest $R$ (1 USD invested gives $(1 + R)$ USD in the next period).
    - $u(\cdot)$ is an increasing function of consumption, called "the household's consumption utility function".
    - $\beta \in [0, 1]$ is "the household's discount factor" and it is a way to express how much the household cares for current consumption as opposed to future consumption:
      - $\beta = 0$ means that the household only cares about current consumption;
      - $\beta = 1$ means that the household cares equally about current and indefinite future consumption;
      - $\beta = 0.95$ (a typical value) should mean that the household cares a little more about present than about future consumption.

    With the above conventions, the initial problem says: "find the maximum present utility and the consumption strategy to attain this, while keeping equal the present values of all future income and all future consumption".
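The two series of the household problem can be evaluated numerically for a concrete consumption plan. The sketch below does not solve the maximization problem; it only evaluates the objective and the constraint. Everything specific in it is an assumption made for illustration: the name `beta` for the discount factor, the logarithmic utility, the constant income and consumption of 10 kilos, the interest `rate`, and the truncation of the infinite horizon at 2000 periods.

```python
import math


def discounted_utility(consumption, beta, u):
    """Truncated objective: sum over t = 1, 2, ... of beta**(t-1) * u(c_t).

    enumerate starts at 0, which matches the exponent t - 1.
    """
    return sum(beta**t * u(c) for t, c in enumerate(consumption))


def budget_gap(income, consumption, price, rate):
    """Truncated constraint: sum of price * (y_t - c_t) / (1 + rate)**(t-1)."""
    return sum(
        price * (y - c) / (1 + rate) ** t
        for t, (y, c) in enumerate(zip(income, consumption))
    )


# Illustrative plan (assumption): consume the whole income every period.
horizon = 2000
income = [10.0] * horizon
consumption = [10.0] * horizon

print(budget_gap(income, consumption, price=2.0, rate=0.05))
print(discounted_utility(consumption, beta=0.95, u=math.log))
print(math.log(10.0) / (1 - 0.95))  # geometric series value for constant c
```

For a constant consumption $c$ the objective is a geometric series with sum $\frac{u(c)}{1 - \beta}$, which the truncated sum reproduces closely.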

    1.8. Power Series

    1.8.1. Definition. Given a sequence of real numbers $(a_n)_{n \in \mathbb{N}}$ and $a \in \mathbb{R}$, the series $\sum_{n=0}^{\infty} a_n (x - a)^n$ is a power series around $a$ and the numbers $a_n$ are the coefficients of the power series.

    1.8.2. Theorem. For the power series $\sum_{n=0}^{\infty} a_n (x - a)^n$ put $\rho = \lim_{n \to \infty} \sqrt[n]{|a_n|}$ (if it exists) and $R = \frac{1}{\rho}$. Then the power series converges if $|x - a| < R$ and diverges if $|x - a| > R$ ($R$ is called the radius of convergence). A similar result is valid when $\rho = \lim_{n \to \infty} \frac{|a_{n+1}|}{|a_n|}$ (if it exists) and $R = \frac{1}{\rho}$.

    Proof. Apply the root test (or the ratio test).
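The radius of convergence can be estimated numerically from the root formula. The sketch below uses the illustrative coefficients $a_n = \frac{2^n}{n}$ (not taken from the notes), for which $\sqrt[n]{|a_n|} \to 2$ and so $R = \frac{1}{2}$. The root is computed through logarithms because $2^n$ overflows floating-point numbers for large $n$; the function names are assumptions.

```python
import math


def root_estimate(log_abs_coeff, n):
    """Return |a_n|**(1/n), computed as exp(log|a_n| / n)."""
    return math.exp(log_abs_coeff(n) / n)


def log_abs_coeff(n):
    """log|a_n| for the illustrative coefficients a_n = 2**n / n."""
    return n * math.log(2) - math.log(n)


rho = root_estimate(log_abs_coeff, 100_000)
radius = 1 / rho
print(rho, radius)
```

Evaluating at a single large index only approximates the limit; here the estimate approaches $\frac{1}{2}$ slowly, because $\sqrt[n]{n} \to 1$ slowly.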


    1.9.2. Remark (Taylor Series). Suppose that the conditions in Taylor's Theorem are satisfied for any $n \in \mathbb{N}$ and that the remainder $\frac{f^{(n+1)}(\xi)}{(n + 1)!} h^{n+1}$ converges to 0 as $n \to \infty$ (uniformly with respect to $h$, for $a + h \in [a - R, a + R]$). Then $f(a + h) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} h^n$, for $a + h \in [a - R, a + R]$.

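    1.9.1. Some usual Taylor expansions.

    $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$, $\forall x \in \mathbb{R}$

    $\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2 n + 1)!} x^{2n+1}$, $\forall x \in \mathbb{R}$

    $\cos x = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2 n)!} x^{2n}$, $\forall x \in \mathbb{R}$

    $a^x = \sum_{n=0}^{\infty} \frac{\ln^n a}{n!} x^n$, $\forall x \in \mathbb{R}$

    $\sinh x = \sum_{n=0}^{\infty} \frac{1}{(2 n + 1)!} x^{2n+1}$, $\forall x \in \mathbb{R}$

    $\cosh x = \sum_{n=0}^{\infty} \frac{1}{(2 n)!} x^{2n}$, $\forall x \in \mathbb{R}$

    $(1 + x)^{\alpha} = \sum_{n=0}^{\infty} \frac{\alpha (\alpha - 1) \cdots (\alpha - n + 1)}{n!} x^n$, $|x| < 1$

    $\sqrt{1 + x} = 1 + \frac{1}{2} x - \frac{1 \cdot 1}{2 \cdot 4} x^2 + \frac{1 \cdot 1 \cdot 3}{2 \cdot 4 \cdot 6} x^3 - \frac{1 \cdot 1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6 \cdot 8} x^4 + \cdots$, $|x| \leq 1$

    $\sqrt[3]{1 + x} = 1 + \frac{1}{3} x - \frac{1 \cdot 2}{3 \cdot 6} x^2 + \frac{1 \cdot 2 \cdot 5}{3 \cdot 6 \cdot 9} x^3 - \frac{1 \cdot 2 \cdot 5 \cdot 8}{3 \cdot 6 \cdot 9 \cdot 12} x^4 + \cdots$, $|x| \leq 1$

    $\ln (1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} x^n$, $\forall x \in (-1, 1]$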
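A few of these expansions can be compared with the library functions by truncating the series. This is a sketch with illustrative function names and arbitrary truncation lengths; it checks the formulas at sample points inside the stated domains and does not address convergence at the endpoints.

```python
import math


def exp_series(x, terms=30):
    """Partial sum of sum_{n>=0} x**n / n!."""
    return sum(x**n / math.factorial(n) for n in range(terms))


def sin_series(x, terms=15):
    """Partial sum of sum_{n>=0} (-1)**n * x**(2n+1) / (2n+1)!."""
    return sum(
        (-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
        for n in range(terms)
    )


def log1p_series(x, terms=200):
    """Partial sum of sum_{n>=1} (-1)**(n+1) * x**n / n, for -1 < x <= 1."""
    return sum((-1) ** (n + 1) * x**n / n for n in range(1, terms + 1))


x = 0.5
print(exp_series(x), math.exp(x))
print(sin_series(x), math.sin(x))
print(log1p_series(x), math.log(1 + x))
```

Note that the logarithm series starts at $n = 1$, since the general term contains a division by $n$.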

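    1.9.3. Exercise. $\sum_{n=1}^{\infty} n a^n$

    1.9.4. Example. $\sum_{n=1}^{\infty} \frac{(n^3 + 1) a^n}{(n + 1)!}$

    1.9.5. Solution.

    $$\sum_{n=1}^{\infty} \frac{(n^3 + 1) a^n}{(n + 1)!} = \sum_{n=1}^{\infty} \frac{(n + 1) (n^2 - n + 1) a^n}{(n + 1)!} = \sum_{n=1}^{\infty} \frac{(n^2 - n + 1) a^n}{n!} =$$

    $$= \sum_{n=1}^{\infty} \left[ \frac{n a^n}{(n - 1)!} - \frac{a^n}{(n - 1)!} + \frac{a^n}{n!} \right] = \sum_{n=1}^{\infty} \frac{n a^n}{(n - 1)!} - \sum_{n=1}^{\infty} \frac{a^n}{(n - 1)!} + \sum_{n=1}^{\infty} \frac{a^n}{n!} =$$

    $$= e^a - 1 + \sum_{n=1}^{\infty} \frac{n a^n}{(n - 1)!} - a \sum_{n=1}^{\infty} \frac{a^{n-1}}{(n - 1)!} = e^a - 1 - a e^a + \sum_{n=1}^{\infty} \frac{(n - 1 + 1) a^n}{(n - 1)!} =$$

    $$= e^a - 1 - a e^a + a + \sum_{n=2}^{\infty} \frac{(n - 1 + 1) a^n}{(n - 1)!} = e^a - 1 - a e^a + a + \sum_{n=2}^{\infty} \left[ \frac{(n - 1) a^n}{(n - 1)!} + \frac{a^n}{(n - 1)!} \right] =$$

    $$= e^a - 1 - a e^a + a + a^2 \sum_{n=2}^{\infty} \frac{a^{n-2}}{(n - 2)!} + a \sum_{n=2}^{\infty} \frac{a^{n-1}}{(n - 1)!} =$$

    $$= e^a - 1 - a e^a + a + a^2 e^a + a (e^a - 1) = e^a - 1 + a^2 e^a$$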
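The result of Solution 1.9.5 can be verified numerically by comparing a long partial sum of the series with the closed form $e^a - 1 + a^2 e^a$. The sample values of $a$ and the truncation at 60 terms are arbitrary choices; the function names are illustrative.

```python
import math


def lhs(a, terms=60):
    """Partial sum of sum_{n>=1} (n**3 + 1) * a**n / (n + 1)!."""
    return sum(
        (n**3 + 1) * a**n / math.factorial(n + 1)
        for n in range(1, terms + 1)
    )


def rhs(a):
    """Closed form obtained in the solution: e**a - 1 + a**2 * e**a."""
    return math.exp(a) - 1 + a**2 * math.exp(a)


for a in (1.0, 0.5, -2.0):
    print(a, lhs(a), rhs(a))
```

The factorial in the denominator makes the terms decay very quickly, so 60 terms are far more than needed for these values of $a$.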


    CHAPTER 2

    Functions of several variables (2 lectures)

    Recommended reviews/prerequisites:

    From [3]:
    Chapter 2: "One-Variable Calculus: Foundations"
    Chapter 3: "One-Variable Calculus: Applications"
    Chapter 4: "One-Variable Calculus: Chain Rule"
    Chapter 5: "Exponents and Logarithms"

    From [8]:
    Chapter I: "Introduction to Analysis"
    Chapter II: "Differentiation of functions"
    Chapter III: "The extrema of a function and the geometric applications of a derivative"

    2.1. Introduction

    [See also Chapters 12, 13, [3]] Vectorial functions with vectorial variable have the general form: $f(\cdot) : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$. When $m = 1$, the function is a scalar one (with vectorial variable).

    A usual interpretation of a function in Economics, as a production function, is:

    - $n$ is the number of possible raw materials [or inputs].
    - $\mathbb{R}^n$ is the set of all possible input bundles [quantities of raw material].
    - $D$ is the set of all admissible/acceptable input bundles.
    - $x \in D \subseteq \mathbb{R}^n$ is the quantity of raw material.
    - $m$ is the number of possible finite products.
    - $\mathbb{R}^m$ is the set of all possible quantities of finite product.
    - $f(x) \in \mathbb{R}^m$ is the quantity of finite product obtained from the input bundle $x$.

    The notation $f(\cdot)$ is used to distinguish between the function itself and $f(x)$ (which is an element of the codomain) and to suggest the number of relevant variables when studying the function. In [3] the notation $x \mapsto f(x)$ is suggested, which is an alternative notation with the same role: to distinguish between the function and some image.

    The graph of the function is the set:

    $$G(f(\cdot)) = G_{f(\cdot)} = \{ (x, f(x)) ;\ x \in D \} \subseteq \mathbb{R}^n \times \mathbb{R}^m$$

    Some function operations are inherited from the operations possible with codomain elements, such as summation, multiplication with a scalar function and scalar product of two functions. Other function operations are function-specific, such as composition and limit-taking.

    2.1.1. Definition. For the functions $f(\cdot) : D \subseteq \mathbb{R}^n \to E \subseteq \mathbb{R}^m$, $g(\cdot) : E \subseteq \mathbb{R}^m \to \mathbb{R}^p$, the new function obtained by composition is $(g \circ f)(\cdot) : D \subseteq \mathbb{R}^n \to \mathbb{R}^p$, defined by: $(g \circ f)(x) = g(f(x))$. This operation is only possible when $f(D) \subseteq E$ (the range of $f(\cdot)$ is contained in the domain of $g(\cdot)$).


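    2.1.2. Definition. The function $f(\cdot) : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is called bounded (on the set $D$) when $f(D)$ is bounded (as a set):

    $$\exists M > 0, \forall x \in D, \| f(x) \| \leq M$$

    ($\Leftrightarrow$ the scalar function $x \mapsto \| f(x) \|$ is bounded).

    2.1.3. Remark (Some important vectorial functions). All the following functions are considered on corresponding appropriate algebraic structures (mostly linear spaces):

    Linear functions (linear operators):

    $$f(\cdot) : \mathbb{R}^n \to \mathbb{R}^m ; \quad f(x) = A x$$

    ($A$ is a matrix and $x$ is a column vector)

    Bilinear functions:

    $$f(\cdot, \cdot) : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R} ; \quad f(x, y) = x^T A y$$

    ($A$ is a matrix, $x$ and $y$ are column vectors)

    Quadratic functions:

    $$f(\cdot) : \mathbb{R}^n \to \mathbb{R} ; \quad f(x) = \sum_{i,j=1}^{n} a_{ij} x_i x_j = x^T A x.$$

    Euclidean distance function:

    $$d(\cdot, \cdot) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R} ; \quad d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} = \| x - y \|.$$

    Euclidean scalar product:

    $$\langle \cdot, \cdot \rangle : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R} ; \quad \langle x, y \rangle = \sum_{i=1}^{n} x_i y_i$$

    (a particular form of bilinear function).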
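The functions listed in Remark 2.1.3 can be written out directly for small vectors. The sketch below uses plain Python lists (a matrix is a list of rows); the function names and the sample matrix are illustrative, and the distance is taken as the Euclidean norm of the difference $x - y$.

```python
import math


def linear(A, x):
    """Linear function x -> A x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def bilinear(x, A, y):
    """Bilinear function (x, y) -> x^T A y."""
    return sum(x_i * v_i for x_i, v_i in zip(x, linear(A, y)))


def quadratic(A, x):
    """Quadratic function x -> x^T A x."""
    return bilinear(x, A, x)


def scalar_product(x, y):
    """Euclidean scalar product <x, y>."""
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


def distance(x, y):
    """Euclidean distance, the norm of x - y."""
    return math.sqrt(sum((x_i - y_i) ** 2 for x_i, y_i in zip(x, y)))


A = [[2, 1], [1, 3]]       # illustrative matrix
identity = [[1, 0], [0, 1]]
x, y = [1, 2], [3, 4]

print(linear(A, x), quadratic(A, x))
print(scalar_product(x, y), bilinear(x, identity, y))
print(distance([0, 0], [3, 4]))
```

Taking the identity matrix in the bilinear function reproduces the scalar product, and taking $y = x$ in the bilinear function gives the quadratic function.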


    Denote by LIMx! x0

    f (x) = fl; l is a limit point of f ( ) at x0g the set of all limit points of f ( ) at x0.Denote by lim inf x! x0 f (x) = inf LIMx! x0 f (x) the lower limit of f ( ) at x0 and by lim supx! x0 f (x) = sup LIMx! x 0 f (x)

    the upper limit of f ( ) at x0.

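    2.2.1. Definition. Consider $f(\cdot) : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ and $x_0 \in D'$ (a limit point of the domain). The function $f(\cdot)$ has the limit $l$ at the point $x_0$ if:

    $$\forall U \in \mathcal{V}(l) \ (\text{in } \mathbb{R}^m),\ \exists V \in \mathcal{V}(x_0) \ (\text{in } \mathbb{R}^n),\ f\big(V \cap D \setminus \{x_0\}\big) \subseteq U.$$

    Notation: $\lim_{x \to x_0} f(x) = l$ [less ambiguous alternative notations are: $\lim_{x \to x_0,\ x \in D} f(x) = l$ or $\lim_{D \ni x \to x_0} f(x) = l$].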

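    2.2.2. Remark (Equivalent definitions for the limit of the function).

    With sequences: $\lim_{x \to x_0} f(x) = l \Leftrightarrow \forall x_k \to x_0,\ x_k \in D \setminus \{x_0\},\ f(x_k) \to l$.

    With $\varepsilon$-$\delta$: $\lim_{x \to x_0} f(x) = l \Leftrightarrow \forall \varepsilon > 0,\ \exists \delta_\varepsilon > 0,\ \forall x \in D \setminus \{x_0\}$ with $\|x - x_0\| < \delta_\varepsilon$, $\|f(x) - l\| < \varepsilon$.

    2.2.3. Remark. When $\lim_{x \to x_0} f(x)$ exists, $\operatorname{LIM}_{x \to x_0} f(x) = \{\lim_{x \to x_0} f(x)\}$ and $\liminf_{x \to x_0} f(x) = \limsup_{x \to x_0} f(x) = \lim_{x \to x_0} f(x)$.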

    2.3. Continuity

    2.3.1. Definition. The function $f(\cdot) : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is called continuous at $x_0 \in D$ if

    $$\forall U \in \mathcal{V}(f(x_0)),\ \exists V \in \mathcal{V}(x_0),\ f(V \cap D) \subseteq U.$$

    [Equivalently] When $x_0 \in D$ is an isolated point of $D$, the function is continuous at $x_0$. When $x_0 \in D$ is not isolated, continuity is the same as the following condition: $\lim_{x \to x_0} f(x)$ exists and $\lim_{x \to x_0} f(x) = f(x_0)$ [the limit of the function at $x_0$ exists and is equal to the value of the function at $x_0$].

    2.3.2. Theorem (Thm. 13.4, [3]). Consider the functions $f(\cdot), g(\cdot) : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ and the fixed scalar values $\alpha$, $\beta$. If $f(\cdot)$ and $g(\cdot)$ are both continuous on $D$, then all the following new functions are also continuous: $(f + g)(\cdot)$, $(f - g)(\cdot)$, $\langle f, g \rangle(\cdot)$, $\|f\|(\cdot)$, $(\alpha f + \beta g)(\cdot)$.

    2.3.3. Theorem (Thm. 13.5, [3]). A vectorial function is continuous if and only if its components (which are scalar functions) are continuous.

    2.3.4. Theorem (Thm. 13.7, [3]). If two functions are continuous and their composition is possible, then the composed function is also continuous.


    2.3.5. Example. The function

    $$f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & (x,y) \neq (0,0) \\ 0, & (x,y) = (0,0) \end{cases}$$

    is partially continuous at $(0,0)$ with respect to each variable, in the sense that the following limits (also called iterated limits) exist:

    $$\lim_{x \to 0} \lim_{y \to 0} f(x,y) = 0 = f(0,0) \quad \text{and} \quad \lim_{y \to 0} \lim_{x \to 0} f(x,y) = 0 = f(0,0),$$

    but it fails to be continuous at $(0,0)$ because the limit $\lim_{(x,y) \to (0,0)} f(x,y)$ does not exist.

    Proof. Choose the sequences $x_n = \dfrac{a}{n}$, $y_n = \dfrac{b}{n}$, with $a, b \in \mathbb{R}$ arbitrary fixed, not both zero. The sequences satisfy $x_n \to 0$ and $y_n \to 0$, and also $(x_n, y_n) \neq (0,0)$. Then

    $$f(x_n, y_n) = \frac{\dfrac{a}{n} \cdot \dfrac{b}{n}}{\left(\dfrac{a}{n}\right)^2 + \left(\dfrac{b}{n}\right)^2} = \frac{ab}{a^2 + b^2},$$

    which means that the value of the limit depends on $a$ and $b$ (by changing these values we obtain a different limit). So $\lim_{(x,y) \to (0,0)} f(x,y)$ does not exist.

    [Figure: graph of $f(x,y) = \dfrac{xy}{x^2 + y^2}$.]
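The argument of Example 2.3.5 can be illustrated numerically. The following is only a sketch: the helper name `value_along_direction` and the chosen directions are illustrative assumptions, not part of the original notes. It evaluates $f$ along the sequences $(a/n, b/n)$ and shows that the values stay equal to $ab/(a^2+b^2)$, so they depend on the direction.

```python
def f(x, y):
    """f(x, y) = xy / (x^2 + y^2) away from the origin, and 0 at the origin."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)


def value_along_direction(a, b, n):
    """Evaluate f at the point (a/n, b/n) of the sequence used in the proof."""
    return f(a / n, b / n)


# Along each direction (a, b) the values do not change with n;
# they equal ab / (a^2 + b^2), which differs from one direction to another.
for a, b in [(1, 1), (1, 2), (1, -1), (1, 0)]:
    values = [value_along_direction(a, b, n) for n in (10, 1000, 100000)]
    print((a, b), values, "expected", a * b / (a * a + b * b))
```

Since different directions give different constant values, no single number can serve as the limit at the origin.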

    2.3.6. Example. Partial continuity [Ex. 13.21, p. 295, [3]]: continuity with respect to a single variable, all the other variables being considered constant (in Economics: CETERIS PARIBUS). The function $f(x,y) = x \sin \dfrac{1}{y}$ has limit at $(0,0)$ equal to $0$, while the iterated limit $\lim_{x \to 0} \lim_{y \to 0} f(x,y)$ fails to exist.

    The connection between continuity and partial continuity is not a simple one; more can be said about it only under supplementary conditions.

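    2.3.7. Definition. Uniform continuity: $f(\cdot) : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is uniformly continuous on $D$ if

    $$\forall \varepsilon > 0,\ \exists \delta_\varepsilon > 0,\ \forall x_0 \in D,\ \forall x \in D,\ \|x - x_0\| < \delta_\varepsilon \Rightarrow \|f(x) - f(x_0)\| < \varepsilon.$$

    2.3.8. Remark. Continuity on compact sets: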


    (1) Any continuous function on a compact set is bounded on that set.
    (2) Any continuous function on a compact set is uniformly continuous on that set.
    (3) The image of a compact set through a continuous function is a compact set.
    (4) A continuous scalar function on a compact set "reaches" its extreme values on that compact set.

    2.3.9. Definition. A set $A$ is connected if there is no pair of open sets $G_1, G_2$ with the properties:

    $$A \subseteq G_1 \cup G_2; \quad A \cap G_1 \neq \emptyset; \quad A \cap G_2 \neq \emptyset; \quad (A \cap G_1) \cap (A \cap G_2) = \emptyset$$

    (the set $A$ cannot be split into two nonempty disjoint parts by open sets; this is a generalization of the interval from the set $\mathbb{R}$).

    2.3.10. Remark. The image of a connected set through a continuous function is a connected set.

    2.3.11. Definition. Functions which transform connected sets into connected sets are called functions with the Darboux property (and they are not necessarily continuous).

    2.4. Derivatives

    A function of one real variable $f(\cdot) : I \subseteq \mathbb{R} \to \mathbb{R}^m$ is derivable at $t_0 \in \operatorname{int}(I)$ if the following limit exists and is finite:

    $$\lim_{t \to t_0} \frac{f(t) - f(t_0)}{t - t_0}.$$

    2.4.1. Example. The function $f(\cdot) : [-2, 2] \to \mathbb{R}^3$, defined by $f(t) = \begin{pmatrix} \cos t \\ \sin t \\ \dfrac{t}{2} \end{pmatrix}$ (which represents a helix, in parametric form) has the following graph:

    [Figure: the helix $t \mapsto \left(\cos t, \sin t, \dfrac{t}{2}\right)$, with axes $x$, $y$, $z$.]

    For each $t \in (-2, 2)$, the derivative is $f'(t) = \begin{pmatrix} -\sin t \\ \cos t \\ \dfrac{1}{2} \end{pmatrix}$.


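    A vectorial function $f(\cdot) : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at $x^0 \in \operatorname{int}(D)$ if there is $T(x^0)(\cdot) \in L(\mathbb{R}^n, \mathbb{R}^m)$ (a linear operator) such that

    $$\lim_{x \to x^0} \frac{\|f(x) - f(x^0) - T(x^0)(x - x^0)\|}{\|x - x^0\|} = 0.$$

    The linear operator $T(x^0)(\cdot)$ is called the (Fréchet) differential of the function $f(\cdot)$ at $x^0$ and is denoted by $f'(x^0)$ or $df(x^0)$.

    $T(x^0)(\cdot) : \mathbb{R}^n \to \mathbb{R}^m$ is represented as:

    $$\mathbb{R}^m \ni T(x^0)(dx) \overset{\text{Not}}{=} df(x^0)\, dx = \underbrace{\begin{pmatrix} (f_1)'_{x_1}(x^0) & (f_1)'_{x_2}(x^0) & \cdots & (f_1)'_{x_n}(x^0) \\ (f_2)'_{x_1}(x^0) & (f_2)'_{x_2}(x^0) & \cdots & (f_2)'_{x_n}(x^0) \\ \vdots & \vdots & & \vdots \\ (f_m)'_{x_1}(x^0) & (f_m)'_{x_2}(x^0) & \cdots & (f_m)'_{x_n}(x^0) \end{pmatrix}}_{m \text{ lines and } n \text{ columns}} \begin{pmatrix} dx_1 \\ \vdots \\ dx_n \end{pmatrix},$$

    where the components of the matrix are called partial derivatives and are defined as follows:

    $$(f_i)'_{x_j}(x^0) = \frac{\partial f_i}{\partial x_j}(x^0) = \lim_{x_j \to x_j^0} \frac{f_i(x_1^0, \ldots, x_{j-1}^0, x_j, x_{j+1}^0, \ldots, x_n^0) - f_i(x_1^0, \ldots, x_{j-1}^0, x_j^0, x_{j+1}^0, \ldots, x_n^0)}{x_j - x_j^0},$$

    which is the derivative of the component $f_i(\cdot)$ viewed as a function of the variable $x_j$ alone, keeping all the other variables constant and equal to the corresponding components of $x^0$.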

    2.4.2. Example. Consider the function $f(\cdot,\cdot) : \mathbb{R}^2 \to \mathbb{R}^2$, defined by

    $$f(x,y) = \begin{pmatrix} f_1(x,y) \\ f_2(x,y) \end{pmatrix} = \begin{pmatrix} x^3 - 2xy^2 + y^3 - 3 \\ x^3 + 2xy^2 - y^3 + 3 \end{pmatrix}.$$

    The graphics of the components:

    [Figure, two panels: graph of $f_1(x,y) = x^3 - 2xy^2 + y^3 - 3$ and graph of $f_2(x,y) = x^3 + 2xy^2 - y^3 + 3$, with axes $x$, $y$, $z$.]

    The partial derivatives are:

    $$\frac{\partial f_1}{\partial x}(x,y) = 3x^2 - 2y^2, \qquad \frac{\partial f_1}{\partial y}(x,y) = -4xy + 3y^2,$$

    $$\frac{\partial f_2}{\partial x}(x,y) = 3x^2 + 2y^2, \qquad \frac{\partial f_2}{\partial y}(x,y) = 4xy - 3y^2.$$

    The differential of the function is:


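    $$df(x,y) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x}(x,y) & \dfrac{\partial f_1}{\partial y}(x,y) \\ \dfrac{\partial f_2}{\partial x}(x,y) & \dfrac{\partial f_2}{\partial y}(x,y) \end{pmatrix} \begin{pmatrix} dx \\ dy \end{pmatrix} = \begin{pmatrix} \dfrac{\partial f_1}{\partial x}(x,y)\, dx + \dfrac{\partial f_1}{\partial y}(x,y)\, dy \\ \dfrac{\partial f_2}{\partial x}(x,y)\, dx + \dfrac{\partial f_2}{\partial y}(x,y)\, dy \end{pmatrix} = \begin{pmatrix} (3x^2 - 2y^2)\, dx + (-4xy + 3y^2)\, dy \\ (3x^2 + 2y^2)\, dx + (4xy - 3y^2)\, dy \end{pmatrix}$$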
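As a sanity check of Example 2.4.2, the matrix of partial derivatives can be compared with a finite-difference approximation. This is a minimal sketch: the function names, the step `h` and the test point are illustrative choices, not part of the original notes.

```python
def f(x, y):
    """The map of Example 2.4.2, returned as the pair (f1, f2)."""
    f1 = x**3 - 2 * x * y**2 + y**3 - 3
    f2 = x**3 + 2 * x * y**2 - y**3 + 3
    return (f1, f2)


def jacobian_exact(x, y):
    """Matrix of partial derivatives: rows = components, columns = variables."""
    return [[3 * x**2 - 2 * y**2, -4 * x * y + 3 * y**2],
            [3 * x**2 + 2 * y**2, 4 * x * y - 3 * y**2]]


def jacobian_numeric(x, y, h=1e-6):
    """Central-difference approximation of the same matrix."""
    fxp, fxm = f(x + h, y), f(x - h, y)
    fyp, fym = f(x, y + h), f(x, y - h)
    return [[(fxp[i] - fxm[i]) / (2 * h), (fyp[i] - fym[i]) / (2 * h)]
            for i in range(2)]


def apply_differential(x, y, dx, dy):
    """Value of the linear operator df(x, y) on the increment (dx, dy)."""
    J = jacobian_exact(x, y)
    return [J[i][0] * dx + J[i][1] * dy for i in range(2)]


print(jacobian_exact(1.0, 2.0))
print(jacobian_numeric(1.0, 2.0))
print(apply_differential(1.0, 2.0, 0.1, -0.2))
```

The two printed matrices should agree up to the finite-difference error, and the last line applies the differential at $(1,2)$ to a small increment.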

    2.4.3. Remark. For linear operators the following equivalent norm definitions are used:

    $$\|T\| = \inf \{ M \ge 0 \;;\; \|T(x)\| \le M \|x\|,\ \forall x \in \mathbb{R}^n \} = \sup_{\|x\| = 1} \|T(x)\| = \sup_{\|x\| \le 1} \|T(x)\|.$$

    2.4.4. Remark.
    1. The differential at a point, when it exists, is unique.
    2. The definition of the differential at $x^0$ is equivalent to:

    $$\exists T(x^0)(\cdot) \in L(\mathbb{R}^n, \mathbb{R}^m),\ \exists U \in \mathcal{V}(x^0),\ \exists \omega(\cdot) : U \to \mathbb{R}^m \text{ with } \lim_{x \to x^0} \omega(x) = 0,$$

    $$\text{such that } f(x) = f(x^0) + T(x^0)(x - x^0) + \|x - x^0\|\, \omega(x).$$

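    2.4.5. Remark (Explicit differentiability condition for functions of 2 variables). $f(\cdot,\cdot) : D \subseteq \mathbb{R}^2 \to \mathbb{R}$ is differentiable at $(x_0, y_0) \in \operatorname{int}(D)$ if there are the numbers $\alpha, \beta \in \mathbb{R}$, $\varepsilon > 0$ and the function $\omega(\cdot,\cdot) : B_\varepsilon(x_0) \times B_\varepsilon(y_0) \to \mathbb{R}$ such that:

    (1) $\lim_{(x,y) \to (x_0,y_0)} \omega(x,y) = 0$,
    (2) $f(x,y) = f(x_0,y_0) + \alpha (x - x_0) + \beta (y - y_0) + \omega(x,y) \sqrt{(x - x_0)^2 + (y - y_0)^2}$, $\forall (x,y) \in B_\varepsilon(x_0) \times B_\varepsilon(y_0)$

    [when the condition is satisfied, the numbers $\alpha$ and $\beta$ are equal to the corresponding partial derivatives].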

    2.4.6. Example. The function $f(x,y) = \dfrac{xy}{\sqrt{x^2 + y^2}}$ (with $f(0,0) = 0$) is continuous at $(0,0)$ and has partial derivatives at $(0,0)$, but it is not differentiable at $(0,0)$.

    [Figure: graph of $f(x,y) = \dfrac{xy}{\sqrt{x^2 + y^2}}$.]


    A geometrical interpretation would be that, while the surface accepts tangent lines at $(0,0)$, it does not accept a tangent plane at $(0,0)$.

    Proof. The partial derivatives at $(0,0)$ are:

    $$\frac{\partial f}{\partial x}(0,0) = \lim_{x \to 0} \frac{f(x,0) - f(0,0)}{x - 0} = 0, \qquad \frac{\partial f}{\partial y}(0,0) = \lim_{y \to 0} \frac{f(0,y) - f(0,0)}{y - 0} = 0.$$

    Suppose by contradiction that the function $f(\cdot,\cdot)$ is differentiable at $(0,0)$. Then the following relation would be satisfied:

    $$f(x,y) = f(0,0) + \frac{\partial f}{\partial x}(0,0)\, x + \frac{\partial f}{\partial y}(0,0)\, y + \omega(x,y) \sqrt{x^2 + y^2},$$

    so that we obtain $\omega(x,y) = \dfrac{f(x,y)}{\sqrt{x^2 + y^2}} = \dfrac{xy}{x^2 + y^2}$. But since this function has no limit at $(0,0)$, we get a contradiction.

    2.5. Higher order derivatives

    Second-order partial derivatives: $\dfrac{\partial^2 f}{\partial x_i \partial x_j}(x) = f''_{x_i x_j}(x) = \left( f'_{x_i} \right)'_{x_j}(x)$ [the second-order partial derivative is obtained by differentiating with respect to $x_j$ the partial derivative with respect to $x_i$].

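    2.5.1. Example. For the function $f_1(\cdot,\cdot) : \mathbb{R}^2 \to \mathbb{R}$, defined by $f_1(x,y) = x^3 - 2xy^2 + y^3 - 3$, with first-order partial derivatives given by

    $$\frac{\partial f_1}{\partial x}(x,y) = 3x^2 - 2y^2, \qquad \frac{\partial f_1}{\partial y}(x,y) = -4xy + 3y^2$$

    and the differential given by $df_1(x,y) = (3x^2 - 2y^2)\, dx + (-4xy + 3y^2)\, dy$, the second-order partial derivatives are:

    $$\frac{\partial^2 f_1}{\partial x^2}(x,y) = (3x^2 - 2y^2)'_x = 6x, \qquad \frac{\partial^2 f_1}{\partial x \partial y}(x,y) = (3x^2 - 2y^2)'_y = -4y,$$

    $$\frac{\partial^2 f_1}{\partial y \partial x}(x,y) = (-4xy + 3y^2)'_x = -4y, \qquad \frac{\partial^2 f_1}{\partial y^2}(x,y) = (-4xy + 3y^2)'_y = -4x + 6y.$$

    It may be seen that the second-order mixed partial derivatives are equal [$\dfrac{\partial^2 f_1}{\partial x \partial y}(x,y) = \dfrac{\partial^2 f_1}{\partial y \partial x}(x,y)$].

    The second-order differential is: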


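    $$d^2 f_1(x,y) = \begin{pmatrix} dx & dy \end{pmatrix} \begin{pmatrix} \dfrac{\partial^2 f_1}{\partial x^2}(x,y) & \dfrac{\partial^2 f_1}{\partial x \partial y}(x,y) \\ \dfrac{\partial^2 f_1}{\partial y \partial x}(x,y) & \dfrac{\partial^2 f_1}{\partial y^2}(x,y) \end{pmatrix} \begin{pmatrix} dx \\ dy \end{pmatrix} = \begin{pmatrix} dx & dy \end{pmatrix} \begin{pmatrix} 6x & -4y \\ -4y & -4x + 6y \end{pmatrix} \begin{pmatrix} dx \\ dy \end{pmatrix}$$

    $$d^2 f_1(x,y) = 6x\, dx^2 - 8y\, dx\, dy + (6y - 4x)\, dy^2$$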
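The second-order partial derivatives of $f_1$ can be cross-checked with second differences. The sketch below assumes the Hessian entries $6x$, $-4y$ and $-4x + 6y$, obtained by differentiating the first-order partial derivatives of $f_1$; the function names and the step `h` are illustrative.

```python
def f1(x, y):
    return x**3 - 2 * x * y**2 + y**3 - 3


def hessian_exact(x, y):
    """Second-order partial derivatives of f1 arranged as a 2 x 2 matrix."""
    return [[6 * x, -4 * y],
            [-4 * y, -4 * x + 6 * y]]


def hessian_numeric(x, y, h=1e-4):
    """Second-difference approximation of the same matrix."""
    fxx = (f1(x + h, y) - 2 * f1(x, y) + f1(x - h, y)) / h**2
    fyy = (f1(x, y + h) - 2 * f1(x, y) + f1(x, y - h)) / h**2
    fxy = (f1(x + h, y + h) - f1(x + h, y - h)
           - f1(x - h, y + h) + f1(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]


def second_differential(x, y, dx, dy):
    """Quadratic form d^2 f1(x, y) evaluated on the increment (dx, dy)."""
    H = hessian_exact(x, y)
    return H[0][0] * dx**2 + 2 * H[0][1] * dx * dy + H[1][1] * dy**2


print(hessian_exact(1.0, 2.0))
print(hessian_numeric(1.0, 2.0))
print(second_differential(1.0, 2.0, 1.0, 1.0))
```

The symmetric numerical matrix also illustrates the equality of the mixed partial derivatives.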

    2.5.2. Theorem (Young). If the second-order partial derivatives are continuous, then $\dfrac{\partial^2 f}{\partial x_i \partial x_j}(x) = \dfrac{\partial^2 f}{\partial x_j \partial x_i}(x)$.

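    Let $f(\cdot,\cdot) : A \subseteq \mathbb{R}^2 \to \mathbb{R}$ and $(a,b) \in A$.

    2.5.3. Definition. The $n$th degree Taylor polynomial for the function $f(\cdot,\cdot)$ at $(a,b)$ is the 2-variables polynomial:

    $$T_n(x,y) = f(a,b) + \frac{1}{1!} \left[ \frac{\partial f}{\partial x}(a,b)(x-a) + \frac{\partial f}{\partial y}(a,b)(y-b) \right] + \frac{1}{2!} \left[ \frac{\partial^2 f}{\partial x^2}(a,b)(x-a)^2 + 2 \frac{\partial^2 f}{\partial x \partial y}(a,b)(x-a)(y-b) + \frac{\partial^2 f}{\partial y^2}(a,b)(y-b)^2 \right] + \cdots + \frac{1}{n!} \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right]^{(n)} f(a,b).$$

    $R_n(x,y) = f(x,y) - T_n(x,y)$ is the $n$th order remainder.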

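    2.5.4. Definition. The second-order differential:

    $$d^2 f(a) = \sum_{i,j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j}(a)\, dx_i\, dx_j = \begin{pmatrix} dx_1 & dx_2 & \cdots & dx_n \end{pmatrix} \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1 \partial x_1}(a) & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_1}(a) \\ \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_1 \partial x_n}(a) & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_n}(a) \end{pmatrix} \begin{pmatrix} dx_1 \\ dx_2 \\ \vdots \\ dx_n \end{pmatrix}$$

    2.6. Further results about differentiable functions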

    2.6.1. Remark. If the function $f(\cdot) : U \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at each point of $U$, then the differential on $U$ is a new function with values in $L(\mathbb{R}^n, \mathbb{R}^m)$: $df(\cdot) : U \to L(\mathbb{R}^n, \mathbb{R}^m)$.

    2.6.2. Theorem. If a function is differentiable at a point, then it is also continuous at that point.

    2.6.3. Remark. There are functions continuous at a point but not differentiable there. Example: $f(x) = |x|$ at $x_0 = 0$.


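    2.6.4. Remark. Existence of the derivative on an interval does not imply the continuity of the derivative. Example:

    $$f(x) = \begin{cases} x^2 \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0 \end{cases} \quad \text{has the derivative} \quad f'(x) = \begin{cases} 2x \sin \dfrac{1}{x} - \cos \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0 \end{cases}$$

    which is not continuous at $0$. [$f'(0) = \lim_{x \to 0} \dfrac{f(x) - f(0)}{x - 0} = \lim_{x \to 0} \dfrac{x^2 \sin \frac{1}{x}}{x} = \lim_{x \to 0} x \sin \dfrac{1}{x} = 0$]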

    2.6.5. Remark. The existence of the partial derivatives at a point does not imply continuity at that point. Example:

    $$f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & (x,y) \neq (0,0), \\ 0, & (x,y) = (0,0). \end{cases}$$

    2.6.6. Remark. Linear operators on finite dimensional spaces are continuous and differentiable functions. At each point, the differential is the operator itself.

    2.6.7. Remark. An operator $T(\cdot,\cdot) \in L(\mathbb{R}^n, \mathbb{R}^m; \mathbb{R}^p)$ (bilinear operator, $T(\cdot,\cdot) : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^p$) is continuous and differentiable at each point; moreover,

    $$dT(x_0, y_0)(x, y) = T(x_0, y) + T(x, y_0).$$

    In particular, the square of the Euclidean norm $f(x) = \|x\|^2$ has the differential $df(x_0)(x) = 2 \langle x, x_0 \rangle$.

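    2.6.8. Theorem (Chain rule). [The differential of a composed function] Consider $U = \operatorname{int}(U) \subseteq \mathbb{R}^n$, $V = \operatorname{int}(V) \subseteq \mathbb{R}^m$, and the functions $f(\cdot) : U \to V$, $g(\cdot) : V \to \mathbb{R}^p$ such that $f(\cdot)$ is differentiable at $x^0 \in U$ and $g(\cdot)$ is differentiable at $f(x^0)$. Then the composed function $h(\cdot) = (g \circ f)(\cdot) : U \to \mathbb{R}^p$ is differentiable at $x^0$ and the connection between its differential and the differentials of the components is given by:

    $$d(g \circ f)(x^0) = dg(f(x^0)) \circ df(x^0).$$

    Proof. Since the function $f(\cdot)$ is differentiable at $x^0$ and the function $g(\cdot)$ is differentiable at $f(x^0)$, we may write:

    $$f(x) = f(x^0) + f'(x^0)(x - x^0) + \|x - x^0\|\, \omega_{f,x^0}(x), \quad \forall x \in U,$$

    $$g(y) = g(f(x^0)) + g'(f(x^0))(y - f(x^0)) + \|y - f(x^0)\|\, \omega_{g,f(x^0)}(y), \quad \forall y \in V,$$

    with $\lim_{x \to x^0} \omega_{f,x^0}(x) = 0 \ (\in \mathbb{R}^m)$ and $\lim_{y \to f(x^0)} \omega_{g,f(x^0)}(y) = 0 \ (\in \mathbb{R}^p)$. By replacing $y$ with $f(x)$ we get:

    $$\begin{aligned} g(f(x)) &= g(f(x^0)) + g'(f(x^0))\big(f(x) - f(x^0)\big) + \|f(x) - f(x^0)\|\, \omega_{g,f(x^0)}(f(x)) \\ &= g(f(x^0)) + g'(f(x^0))\big(f'(x^0)(x - x^0) + \|x - x^0\|\, \omega_{f,x^0}(x)\big) + \big\|f'(x^0)(x - x^0) + \|x - x^0\|\, \omega_{f,x^0}(x)\big\|\, \omega_{g,f(x^0)}(f(x)) \\ &= g(f(x^0)) + g'(f(x^0))\, f'(x^0)(x - x^0) + r(x), \end{aligned}$$

    where the remainder $r(x) = \|x - x^0\|\, g'(f(x^0))\, \omega_{f,x^0}(x) + \big\|f'(x^0)(x - x^0) + \|x - x^0\|\, \omega_{f,x^0}(x)\big\|\, \omega_{g,f(x^0)}(f(x))$ satisfies

    $$\|r(x)\| \le \|x - x^0\| \Big[ \|g'(f(x^0))\|\, \|\omega_{f,x^0}(x)\| + \big( \|f'(x^0)\| + \|\omega_{f,x^0}(x)\| \big) \|\omega_{g,f(x^0)}(f(x))\| \Big].$$

    Since $f(\cdot)$ is continuous at $x^0$, $f(x) \to f(x^0)$ when $x \to x^0$, so the bracket tends to $0$. Writing $r(x) = \|x - x^0\|\, \omega_{g \circ f, x^0}(x)$ for $x \neq x^0$, we get $\lim_{x \to x^0} \omega_{g \circ f, x^0}(x) = 0 \ (\in \mathbb{R}^p)$ and

    $$g(f(x)) = g(f(x^0)) + g'(f(x^0))\, f'(x^0)(x - x^0) + \|x - x^0\|\, \omega_{g \circ f, x^0}(x).$$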


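    2.6.9. Remark (The matrix form of the "Chain Rule"). $dh(x^0) \in L(\mathbb{R}^n, \mathbb{R}^p)$ (a linear operator described by a $p \times n$ matrix):

    $$\mathbb{R}^p \ni dh(x^0)\, dx = \underbrace{\begin{pmatrix} (h_1)'_{x_1}(x^0) & (h_1)'_{x_2}(x^0) & \cdots & (h_1)'_{x_n}(x^0) \\ (h_2)'_{x_1}(x^0) & (h_2)'_{x_2}(x^0) & \cdots & (h_2)'_{x_n}(x^0) \\ \vdots & \vdots & & \vdots \\ (h_p)'_{x_1}(x^0) & (h_p)'_{x_2}(x^0) & \cdots & (h_p)'_{x_n}(x^0) \end{pmatrix}}_{p \text{ lines and } n \text{ columns}} \begin{pmatrix} dx_1 \\ \vdots \\ dx_n \end{pmatrix}$$

    $df(x^0) \in L(\mathbb{R}^n, \mathbb{R}^m)$ (a linear operator described by an $m \times n$ matrix):

    $$\mathbb{R}^m \ni df(x^0)\, dx = \underbrace{\begin{pmatrix} (f_1)'_{x_1}(x^0) & (f_1)'_{x_2}(x^0) & \cdots & (f_1)'_{x_n}(x^0) \\ (f_2)'_{x_1}(x^0) & (f_2)'_{x_2}(x^0) & \cdots & (f_2)'_{x_n}(x^0) \\ \vdots & \vdots & & \vdots \\ (f_m)'_{x_1}(x^0) & (f_m)'_{x_2}(x^0) & \cdots & (f_m)'_{x_n}(x^0) \end{pmatrix}}_{m \text{ lines and } n \text{ columns}} \begin{pmatrix} dx_1 \\ \vdots \\ dx_n \end{pmatrix}$$

    $dg(u^0) \in L(\mathbb{R}^m, \mathbb{R}^p)$ (a linear operator described by a $p \times m$ matrix):

    $$\mathbb{R}^p \ni dg(u^0)\, du = \underbrace{\begin{pmatrix} (g_1)'_{u_1}(u^0) & (g_1)'_{u_2}(u^0) & \cdots & (g_1)'_{u_m}(u^0) \\ (g_2)'_{u_1}(u^0) & (g_2)'_{u_2}(u^0) & \cdots & (g_2)'_{u_m}(u^0) \\ \vdots & \vdots & & \vdots \\ (g_p)'_{u_1}(u^0) & (g_p)'_{u_2}(u^0) & \cdots & (g_p)'_{u_m}(u^0) \end{pmatrix}}_{p \text{ lines and } m \text{ columns}} \begin{pmatrix} du_1 \\ \vdots \\ du_m \end{pmatrix}$$

    and the relation has the matrix form $(p \times n) = (p \times m)(m \times n)$, namely:

    $$\begin{pmatrix} (h_1)'_{x_1}(x^0) & \cdots & (h_1)'_{x_n}(x^0) \\ \vdots & & \vdots \\ (h_p)'_{x_1}(x^0) & \cdots & (h_p)'_{x_n}(x^0) \end{pmatrix} \begin{pmatrix} dx_1 \\ \vdots \\ dx_n \end{pmatrix} = \begin{pmatrix} (g_1)'_{u_1}(f(x^0)) & \cdots & (g_1)'_{u_m}(f(x^0)) \\ \vdots & & \vdots \\ (g_p)'_{u_1}(f(x^0)) & \cdots & (g_p)'_{u_m}(f(x^0)) \end{pmatrix} \begin{pmatrix} (f_1)'_{x_1}(x^0) & \cdots & (f_1)'_{x_n}(x^0) \\ \vdots & & \vdots \\ (f_m)'_{x_1}(x^0) & \cdots & (f_m)'_{x_n}(x^0) \end{pmatrix} \begin{pmatrix} dx_1 \\ \vdots \\ dx_n \end{pmatrix}$$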
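The matrix form of the chain rule can be tried on a small example. The sketch below is not taken from the notes: the maps `f` and `g` (with $n = 2$, $m = 2$, $p = 3$), the helper names and the test point are hypothetical choices made only for illustration. It multiplies the $p \times m$ matrix of $g$ at $f(x^0)$ by the $m \times n$ matrix of $f$ at $x^0$ and compares the product with a finite-difference matrix of the composed function.

```python
def f(x):
    x1, x2 = x
    return [x1 * x2, x1 + x2**2]


def g(u):
    u1, u2 = u
    return [u1 * u2, u1 + u2, u1**2]


def jac_f(x):
    """m x n matrix of partial derivatives of f (here 2 x 2)."""
    x1, x2 = x
    return [[x2, x1], [1.0, 2 * x2]]


def jac_g(u):
    """p x m matrix of partial derivatives of g (here 3 x 2)."""
    u1, u2 = u
    return [[u2, u1], [1.0, 1.0], [2 * u1, 0.0]]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def jac_numeric(func, x, h=1e-6):
    """Central-difference matrix of partial derivatives of func at x."""
    columns = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    # columns[j][i] approximates d(func_i)/d(x_j); transpose to rows = components.
    return [[columns[j][i] for j in range(len(x))]
            for i in range(len(columns[0]))]


x0 = [1.0, 2.0]
chain = matmul(jac_g(f(x0)), jac_f(x0))        # (3 x 2)(2 x 2) = 3 x 2
direct = jac_numeric(lambda x: g(f(x)), x0)    # matrix of h = g o f
print(chain)
print(direct)
```

Note that the matrix of $g$ is evaluated at $f(x^0)$, not at $x^0$, exactly as in the operator identity $d(g \circ f)(x^0) = dg(f(x^0)) \circ df(x^0)$.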

    2.6.10. Exercise. Given the functions $u(x,y)$ and $v(x,y)$, find the partial derivatives of the following functions:

    $$f(x,y) = u(x,y)^{v(x,y)}; \qquad g(x,y) = \sqrt{u(x,y) + v(x,y)}; \qquad h(x,y) = \arctan \frac{u(x,y)}{v(x,y)}.$$


    2.6.11. Exercise. If $u(x,y) = e^{x + y^2}$ and $v(x,y) = x^2 + y$, find the partial derivatives of the functions:

    $$f(x,y) = \ln \big( u^2(x,y) + v(x,y) \big); \qquad g(x,y) = \arctan \frac{u(x,y)}{v(x,y)}.$$

    2.6.12. Exercise. Verify that the given functions satisfy the given equations:

    $f(x,y) = \varphi\!\left(\dfrac{y}{x}\right)$ verifies $x f'_x + y f'_y = 0$.

    $f(x,y,z) = \varphi(xy, x^2 + y^2 - z^2)$ verifies $xz f'_x - yz f'_y + (x^2 - y^2) f'_z = 0$.

    2.6.13. Remark. If $f(x) = T(g_1(x), g_2(x))$, with $T(\cdot,\cdot) \in L(\mathbb{R}^n, \mathbb{R}^m; \mathbb{R}^p)$ and $g_1(\cdot) : \mathbb{R}^q \to \mathbb{R}^n$, $g_2(\cdot) : \mathbb{R}^q \to \mathbb{R}^m$ differentiable at $x^0$, then $f(\cdot)$ is differentiable at $x^0$ and

    $$df(x^0)(v) = T\big(dg_1(x^0)(v),\ g_2(x^0)\big) + T\big(g_1(x^0),\ dg_2(x^0)(v)\big), \quad \forall v \in \mathbb{R}^q.$$

    2.6.14. Theorem (Denjoy-Bourbaki). [The Mean Value Theorem, [10], vol. I, (8.5.1)] [Premier théorème des accroissements finis] Consider $B = (r_k)_{k \in \mathbb{N}} \subseteq [a,b]$, $f(\cdot) : [a,b] \to \mathbb{R}^m$, $g(\cdot) : [a,b] \to \mathbb{R}$, such that:

    (1) $f(\cdot)$, $g(\cdot)$ are continuous on $[a,b]$ and $g(\cdot)$ is increasing on $[a,b]$;
    (2) $f(\cdot)$, $g(\cdot)$ are differentiable on $[a,b] \setminus B$;
    (3) $\|df(t)\| \le dg(t)$, $\forall t \in [a,b] \setminus B$.

    Then:

    $$\|f(b) - f(a)\| \le g(b) - g(a).$$

    2.6.15. Theorem. If $f(\cdot) : [a,b] \to \mathbb{R}^n$ is continuous on $[a,b]$, differentiable on $[a,b] \setminus B$ ($B$ countable) and $\|f'(t)\| \le M$, $\forall t \in [a,b] \setminus B$, then

    $$\|f(b) - f(a)\| \le M (b - a).$$

    [[10], vol. I, (8.5.2)]

    Proof. Apply the Mean Value Theorem with $g(t) = Mt$.

    2.6.16. Theorem. If $f(\cdot) : [a,b] \to \mathbb{R}^n$ is continuous on $[a,b]$, differentiable on $[a,b] \setminus B$ ($B$ countable) and $f'(t) = 0 \ (\in \mathbb{R}^n)$ for all $t \in [a,b] \setminus B$, then $f(\cdot)$ is constant on $[a,b]$. [Mentioned as a comment in [10], vol. I, Ch. VIII, Section 6, page 162, as an application of the Mean Value Theorem.]

    2.6.17. Theorem ([10], vol. I, Th. (8.5.4)). Consider $U = \operatorname{int}(U)$ convex $\subseteq \mathbb{R}^n$ and $f(\cdot) : U \to \mathbb{R}^m$ differentiable on $U$. Then:

    $$\|f(v) - f(u)\| \le \|v - u\| \sup_{x \in [u,v]} \|f'(x)\|, \quad \forall u, v \in U.$$

    Proof. Let $u, v \in U$ and $g(\cdot) : [0,1] \to U \subseteq \mathbb{R}^n$, $g(t) = (1 - t) u + t v$. Apply [[10], vol. I, (8.5.2)] for the function $h(t) = (f \circ g)(t)$. We have $g'(t) = v - u$ and

    $$h'(t) = f'(g(t)) \circ g'(t) = f'\big((1 - t) u + t v\big)(v - u),$$

    so that

    $$\|h'(t)\| = \big\|f'\big((1 - t) u + t v\big)(v - u)\big\| \le \big\|f'\big((1 - t) u + t v\big)\big\| \cdot \|v - u\| \le \|v - u\| \sup_{t \in [0,1]} \big\|f'\big((1 - t) u + t v\big)\big\| = \|v - u\| \sup_{x \in [u,v]} \|f'(x)\|.$$


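    Then

    $$\|f(v) - f(u)\| = \|h(1) - h(0)\| \le (1 - 0) \sup_{t \in [0,1]} \|h'(t)\| \le \|v - u\| \sup_{x \in [u,v]} \|f'(x)\|.$$

    2.6.18. Corollary. Consider $U = \operatorname{int}(U)$ convex $\subseteq \mathbb{R}^n$ and $f(\cdot) : U \to \mathbb{R}^m$ differentiable on $U$. If there is $M \ge 0$ such that

    $$\|f'(x)\| \le M, \quad \forall x \in U,$$

    then

    $$\|f(x_1) - f(x_2)\| \le M \|x_1 - x_2\|, \quad \forall x_1, x_2 \in U.$$

    [$f(\cdot)$ is Lipschitzian on $U$]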

    2.6.19. Theorem ([10], vol. I, Th. (8.6.2)). [Second théorème des accroissements finis] Consider $U$ open, convex in $\mathbb{R}^n$ and $f(\cdot) : U \to \mathbb{R}^m$ differentiable on $U$. Then, $\forall a, b \in U$ and $c \in [a,b]$,

    $$\|f(b) - f(a) - f'(c)(b - a)\| \le \|b - a\| \sup_{x \in [a,b]} \|f'(x) - f'(c)\|.$$

    Proof. Consider $a, b \in U$ and $c \in [a,b]$, and the function $g(\cdot) : U \to \mathbb{R}^m$ defined by $g(x) = f(x) - (f'(c))(x)$. Then

    $$g'(x) = f'(x) - f'(c)$$

    and, by the Mean Value Theorem, we have

    $$\|g(b) - g(a)\| \le \|b - a\| \sup_{x \in [a,b]} \|g'(x)\|,$$

    which means

    $$\|f(b) - f(a) - f'(c)(b - a)\| \le \|b - a\| \sup_{x \in [a,b]} \|f'(x) - f'(c)\|.$$

    2.6.20. Corollary. Let $D$ be open, convex in $\mathbb{R}^n$, and $f(\cdot) : D \to \mathbb{R}^m$ differentiable on $D$. If $f'(x) = 0$, $\forall x \in D$, then $f(\cdot)$ is constant on $D$.

    2.6.21. Theorem ([10], vol. I, Th. (8.6.3)). Let $U$ be open, convex in $\mathbb{R}^n$, and $f_n(\cdot) : U \to \mathbb{R}^m$ a sequence of differentiable functions such that:

    (1) There is $x_0 \in U$ such that the (numerical) sequence $(f_n(x_0))_n$ is Cauchy;
    (2) $\forall u \in U$, $\exists \varepsilon_u > 0$ such that $B_{\varepsilon_u}(u) \subseteq U$ and $\big( f'_n|_{B_{\varepsilon_u}(u)}(\cdot) \big)_n$ (the sequence of the restrictions of the derivatives to the ball $B_{\varepsilon_u}(u)$) converges uniformly.

    Then, for each $u \in U$, the sequence $\big( f_n|_{B_{\varepsilon_u}(u)}(\cdot) \big)_n$ converges uniformly on $B_{\varepsilon_u}(u)$. Moreover, if we denote $f(x) = \lim_{n \to \infty} f_n(x)$, $\forall x \in U$, and $g(x) = \lim_{n \to \infty} f'_n(x)$, $\forall x \in U$, then $g(\cdot) = f'(\cdot)$ on $U$.


    2.7. Implicit functions

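    2.7.1. Theorem (The Implicit function theorem). Let $U = \operatorname{int}(U) \subseteq \mathbb{R}^n \times \mathbb{R}^m$, $h(\cdot,\cdot) : U \to \mathbb{R}^m$ continuously differentiable on $U$, and $(a,b) \in U$ such that:

    (1) $h(a,b) = 0$,
    (2) $\dfrac{\partial h}{\partial y}(\cdot,\cdot) = D_2 h(\cdot,\cdot) : U \to L(\mathbb{R}^m, \mathbb{R}^m)$ is continuous on $U$,
    (3) $\dfrac{\partial h}{\partial y}(a,b) = D_2 h(a,b) \in L(\mathbb{R}^m, \mathbb{R}^m)$ is bijective (invertible).

    Then $\exists r > 0$ and $\varphi(\cdot) : B_r(a) \to \mathbb{R}^m$ such that:

    (a) $(x, \varphi(x)) \in U$, $\forall x \in B_r(a)$,
    (b) $\varphi(\cdot)$ is continuous on $B_r(a)$,
    (c) $h(x, \varphi(x)) = 0$, $\forall x \in B_r(a)$,
    (d) $\varphi(a) = b$,
    (e) $\varphi(\cdot)$ is unique.

    Moreover, $\varphi(\cdot)$ is differentiable at $a$ and:

    $$\varphi'(a) = -\left( \frac{\partial h}{\partial y}(a,b) \right)^{-1} \circ \frac{\partial h}{\partial x}(a,b).$$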
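A one-dimensional illustration of the formula for $\varphi'(a)$, given only as a sketch under stated assumptions: the relation $h(x,y) = x^2 + y^2 - 1$, the point $(a,b) = (0.6, 0.8)$ and all function names are hypothetical choices, not from the notes. Here the implicit function is known explicitly, $\varphi(x) = \sqrt{1 - x^2}$, so the slope given by the theorem can be compared with a direct numerical derivative.

```python
import math


def h(x, y):
    """Relation h(x, y) = 0 describing the unit circle."""
    return x * x + y * y - 1.0


def dh_dx(x, y):
    return 2.0 * x


def dh_dy(x, y):
    return 2.0 * y


def phi(x):
    """Explicit solution of h(x, y) = 0 near (0.6, 0.8): the upper arc."""
    return math.sqrt(1.0 - x * x)


a, b = 0.6, 0.8
# Formula of the theorem for n = m = 1: phi'(a) = -(dh/dy)^(-1) * dh/dx at (a, b).
slope_from_theorem = -dh_dx(a, b) / dh_dy(a, b)
# Direct central-difference derivative of the explicit solution.
step = 1e-6
slope_numeric = (phi(a + step) - phi(a - step)) / (2 * step)
print(h(a, b), slope_from_theorem, slope_numeric)
```

The hypothesis that $\dfrac{\partial h}{\partial y}(a,b)$ is invertible corresponds here to $b \neq 0$; at the points $(\pm 1, 0)$ the formula is not applicable.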

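    Proof. Let

    $$T := \left( \frac{\partial h}{\partial y}(a,b) \right)^{-1} \in L(\mathbb{R}^m, \mathbb{R}^m)$$

    and

    $$f(x,y) := y - (T \circ h)(x,y), \quad \forall (x,y) \in U.$$

    We have $f(x,y) = y \Leftrightarrow h(x,y) = 0$ (because $T$ is invertible).

    Choose $\varepsilon$, $r$, $\delta$ such that we may apply to $f(\cdot,\cdot)$ the parametric version of the contraction principle: from the continuity of $\dfrac{\partial h}{\partial y}(\cdot,\cdot)$ it follows that $\forall \varepsilon \in \left(0, \dfrac{1}{\|T\|}\right)$, $\exists r_1, \delta > 0$ such that:

    $$B_{r_1}(a) \times B_\delta(b) \subseteq U, \qquad \left\| \frac{\partial h}{\partial y}(x_1,y_1) - \frac{\partial h}{\partial y}(x_2,y_2) \right\| < \frac{\varepsilon}{2}, \quad \forall (x_1,y_1), (x_2,y_2) \in B_{r_1}(a) \times B_\delta(b).$$

    From (8.6.2) we have the inequalities:

    $$\begin{aligned} \|f(x,y_1) - f(x,y_2)\| &= \big\| y_1 - y_2 - T\big(h(x,y_1) - h(x,y_2)\big) \big\| \\ &= \big\| T\big[ T^{-1}(y_1 - y_2) - \big(h(x,y_1) - h(x,y_2)\big) \big] \big\| \\ &\le \|T\| \cdot \big\| T^{-1}(y_1 - y_2) - \big(h(x,y_1) - h(x,y_2)\big) \big\| \\ &= \|T\| \cdot \left\| \frac{\partial h}{\partial y}(a,b)(y_1 - y_2) - \big(h(x,y_1) - h(x,y_2)\big) \right\| \\ &= \|T\| \cdot \left\| h(x,y_1) - h(x,y_2) - \frac{\partial h}{\partial y}(x,y_1)(y_1 - y_2) + \frac{\partial h}{\partial y}(x,y_1)(y_1 - y_2) - \frac{\partial h}{\partial y}(a,b)(y_1 - y_2) \right\| \\ &\le \|T\| \left( \left\| h(x,y_1) - h(x,y_2) - \frac{\partial h}{\partial y}(x,y_1)(y_1 - y_2) \right\| + \left\| \left( \frac{\partial h}{\partial y}(x,y_1) - \frac{\partial h}{\partial y}(a,b) \right)(y_1 - y_2) \right\| \right) \end{aligned}$$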


    2.7.2. Exercise. The functions $u = u(x,y)$ and $v = v(x,y)$ are implicitly defined by the relations $u + v = x + y$, $xu + yv = 1$. Find their partial derivatives.

    2.8. Taylor Polynomials

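    Let $f(\cdot,\cdot) : A = \operatorname{int}(A) \subseteq \mathbb{R}^2 \to \mathbb{R}$ and $(a,b) \in A$. If $f(\cdot,\cdot)$ is $n$ times differentiable at $(a,b)$, then the mixed partial derivatives of the same order are equal.

    2.8.1. Definition. The polynomial:

    $$T_n(x,y) = f(a,b) + \frac{1}{1!} \left[ \frac{\partial f(a,b)}{\partial x}(x-a) + \frac{\partial f(a,b)}{\partial y}(y-b) \right] + \frac{1}{2!} \left[ \frac{\partial^2 f(a,b)}{\partial x^2}(x-a)^2 + 2 \frac{\partial^2 f(a,b)}{\partial x \partial y}(x-a)(y-b) + \frac{\partial^2 f(a,b)}{\partial y^2}(y-b)^2 \right] + \cdots + \frac{1}{n!} \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right]^{(n)} f(a,b)$$

    is called the "Taylor polynomial of degree $n$ of the function $f(\cdot,\cdot)$ at $(a,b)$", while

    $$R_n(x,y) = f(x,y) - T_n(x,y)$$

    is the $n$th order remainder of the Taylor polynomial of degree $n$.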

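    2.8.2. Theorem. Consider the function $f(\cdot,\cdot) : A = \operatorname{int}(A) \subseteq \mathbb{R}^2 \to \mathbb{R}$, $n + 1$ times differentiable in the neighborhood $V \subseteq A$, $V \in \mathcal{V}((a,b))$. Then $\forall (x,y) \in V$, $\exists (\xi, \eta) \in [x,a] \times [y,b]$ such that:

    $$R_n(x,y) = \frac{1}{(n+1)!} \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right]^{(n+1)} f(\xi, \eta).$$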

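    Proof. Let $(x,y) \in V$ and $x(t) = a + (x-a)t$, $y(t) = b + (y-b)t$, $t \in [0,1]$, and the function:

    $$F(t) = f(x(t), y(t)) = f\big(a + (x-a)t,\ b + (y-b)t\big).$$

    $F(0) = f(a,b)$, $F(1) = f(x,y)$. The function $F(\cdot)$ is $n + 1$ times differentiable on $[0,1]$, so we write the Taylor formula for a single variable, with the remainder in Lagrange form:

    $$F(t) = F(0) + \frac{t}{1!} F'(0) + \frac{t^2}{2!} F''(0) + \cdots + \frac{t^n}{n!} F^{(n)}(0) + \frac{t^{n+1}}{(n+1)!} F^{(n+1)}(\tau), \quad 0 < \tau < t.$$

    For $t = 1$ we have:

    $$F(1) = F(0) + \frac{1}{1!} F'(0) + \frac{1}{2!} F''(0) + \cdots + \frac{1}{n!} F^{(n)}(0) + \frac{1}{(n+1)!} F^{(n+1)}(\theta), \quad 0 < \theta < 1.$$

    Also, for $k = 1, 2, \ldots, n+1$, we have

    $$d^k F(t) = \left[ \frac{\partial}{\partial x}\, dx(t) + \frac{\partial}{\partial y}\, dy(t) \right]^{(k)} f(x(t), y(t)) = \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right]^{(k)} (dt)^k\, f(x(t), y(t)).$$

    It follows that

    $$\frac{d^k F(t)}{dt^k} = \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right]^{(k)} f(x(t), y(t))$$

    and

    $$\frac{d^k F(0)}{dt^k} = F^{(k)}(0) = \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right]^{(k)} f(a,b).$$

    Finally we get:

    $$f(x,y) = f(a,b) + \frac{1}{1!} \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right] f(a,b) + \frac{1}{2!} \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right]^{(2)} f(a,b) + \cdots + \frac{1}{n!} \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right]^{(n)} f(a,b) + \frac{1}{(n+1)!} \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right]^{(n+1)} f(\xi, \eta).$$

    It follows that

    $$R_n(x,y) = \frac{1}{(n+1)!} \left[ \frac{\partial}{\partial x}(x-a) + \frac{\partial}{\partial y}(y-b) \right]^{(n+1)} f(\xi, \eta),$$

    with $\xi = a + \theta (x-a)$, $\eta = b + \theta (y-b)$, $0 < \theta < 1$.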

    2.8.3. Theorem. If the function $f(\cdot,\cdot) : A = \operatorname{int}(A) \subseteq \mathbb{R}^2 \to \mathbb{R}$ is $n + 1$ times differentiable in a neighborhood of $(a,b)$, then there is a function $\omega(\cdot,\cdot) : A \to \mathbb{R}$, continuous and null at $(a,b)$, such that $R_n(x,y) = \dfrac{1}{n!} \rho^n(x,y)\, \omega(x,y)$, where $\rho(x,y) = \sqrt{(x-a)^2 + (y-b)^2}$.

    2.8.4. Remark. If $(x,y) \neq (a,b)$, then $\dfrac{R_n(x,y)}{\rho^n(x,y)} = \dfrac{1}{n!} \omega(x,y)$, and $\lim_{(x,y) \to (a,b)} \dfrac{R_n(x,y)}{\rho^n(x,y)} = 0$, because $\lim_{(x,y) \to (a,b)} \omega(x,y) = \omega(a,b) = 0$.
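The behaviour of the remainder described in Remark 2.8.4 can be observed numerically. The following sketch uses a function that does not appear in the notes, $f(x,y) = e^x \cos y$ at $(a,b) = (0,0)$, chosen only because its derivatives at the origin are easy to write down; all names are illustrative. It computes the ratio $R_2 / \rho^2$ at points approaching the origin.

```python
import math


def f(x, y):
    return math.exp(x) * math.cos(y)


def taylor2(x, y):
    """Second-degree Taylor polynomial of f at (a, b) = (0, 0)."""
    # At the origin: f = 1, f_x = 1, f_y = 0, f_xx = 1, f_xy = 0, f_yy = -1.
    return 1.0 + x + 0.5 * (x * x - y * y)


def scaled_remainder(x, y):
    """R_2(x, y) / rho^2(x, y), expected to tend to 0 at the origin."""
    rho_squared = x * x + y * y
    return (f(x, y) - taylor2(x, y)) / rho_squared


# Evaluate the ratio along the diagonal (t, t) for decreasing t.
ratios = [abs(scaled_remainder(t, t)) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The printed ratios decrease roughly in proportion to $t$, which is consistent with $R_2$ being of third order near the origin for this particular function.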

    2.9. Applications in Economics

    Common usage in Economics:

    A distinction should be made between the name of the good and the quantity of the good. I will distinguish between them by denoting by $X$ (capital letter) the name and by $x$ (small letter) the quantity of $X$.

    Observe that I use the word "quantities" and not the word "units" for measurements. This is because, while the word "units" may sound better than "quantities", it is also misleading from the point of view of Analysis and of what is meant by "appropriate measurement units". Analysis refers to "infinitesimal modifications" and not to integers; moreover, an appropriate measurement unit should be "indefinitely decomposable", in the sense that we should be able to refer to amounts as small as possible.

    Since $x$, $y$ and $f(x,y)$ are quantities, some measurement units should be chosen for each of them. Some usual choices will be used for exemplification purposes.

    $X$ is money (capital) with measurement unit € (so that $x$ is the quantity of €).
    $Y$ is labor with measurement unit wh (working hours) (so that $y$ is the quantity of working hours).
    $Z$ is nails (the production process uses capital and labor to produce nails) with measurement unit kg (of nails) (so that $z$ is the quantity of kg of nails).

    It should always be clear in what measurement units we are measuring the different quantities.

    2.9.1. Examples of usual production functions (from Economics):


    Cobb-Douglas type: $f(\cdot) : \mathbb{R}^n_+ \to \mathbb{R}_+$, $f(x_1, x_2, \ldots, x_n) = a\, x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$.

    CES [constant elasticity of substitution] type: $f(\cdot) : \mathbb{R}^n_+ \to \mathbb{R}_+$, $f(x_1, x_2, \ldots, x_n) = a \left( \sum_{i=1}^{n} c_i x_i^{\rho} \right)^{b/\rho}$, $0 \neq \rho < 1$ (for $\rho = 0$, $u(x_1, x_2, \ldots, x_n) = a \sum_{i=1}^{n} c_i \ln x_i$).

    Perfect substitutes [linear] type: $f(\cdot) : \mathbb{R}^n_+ \to \mathbb{R}_+$, $f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} a_i x_i$.

    Perfect complement type (Leontief): $f(\cdot) : \mathbb{R}^n_+ \to \mathbb{R}_+$, $f(x_1, x_2, \ldots, x_n) = \min \{ a_i x_i \;;\; i = 1, \ldots, n \}$.

    If $f(\cdot,\cdot)$ is a production function (so that $x$ and $y$ are quantities of raw materials $X$ and $Y$ used to obtain the quantity $f(x,y)$ of finished product $Z$), then:

    The partial derivative $\dfrac{\partial f}{\partial x}(x_0, y_0)$ is measured in (kg of nails)/€ and it may be interpreted as the speed of change in nails when money changes, around $(x_0, y_0)$. The usual naming convention in Economics is "marginal product of $X$ at $(x_0, y_0)$" and it literally expresses the following approximation:

    $$f(x, y_0) \simeq f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)(x - x_0).$$

    Note that, from the perspective of the measurement units, the (usual) interpretation of $\dfrac{\partial f}{\partial x}(x_0, y_0)$ as an estimator of "the change in output due to a one unit increase in capital" is misleading because the measurement units are different: while "the change in output" is $f(x, y_0) - f(x_0, y_0)$ and is measured in (kg of nails), the quantity $\dfrac{\partial f}{\partial x}(x_0, y_0)$ is measured in (kg of nails)/€.

    Of course the confusion comes from identifying the distinct objects

    $$\underbrace{\frac{\partial f}{\partial x}(x_0, y_0)}_{\text{(kg of nails)/€}} \quad \text{and} \quad \underbrace{\frac{\partial f}{\partial x}(x_0, y_0)}_{\text{(kg of nails)/€}} \cdot \underbrace{1}_{\text{€}}$$

    and it may be addressed by accepting a convention to make a distinction when appropriate, but still the situation leads to significant ambiguities.

    In a similar manner,

    $$f(x_0, y) \simeq f(x_0, y_0) + \frac{\partial f}{\partial y}(x_0, y_0)(y - y_0),$$

    $$f(x, y) \simeq f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)(x - x_0) + \frac{\partial f}{\partial y}(x_0, y_0)(y - y_0).$$

    The confusion may be further increased by the following facts:

    the approximations are valid only inside the convergence domains for the attached Taylor expansions;

    the "quality of approximation" decreases with further operations applied (the more operations applied, the worse the quality of approximation);

    the need to think of $dx$ [$= x - x_0$] in integer terms contradicts its significance as "small", "infinitesimal", "local".
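The local character of the "marginal product" approximation can be seen on a small numerical experiment. This is a sketch with made-up data: the Cobb-Douglas parameters `a`, `alpha`, `beta`, the base point and the increments are hypothetical, chosen only for illustration, with $x$ read in € and $y$ in working hours.

```python
def production(x, y, a=2.0, alpha=0.3, beta=0.7):
    """Cobb-Douglas production function f(x, y) = a x^alpha y^beta (kg of nails)."""
    return a * x**alpha * y**beta


def marginal_product_x(x, y, a=2.0, alpha=0.3, beta=0.7):
    """Partial derivative of f with respect to x, in (kg of nails) per EUR."""
    return a * alpha * x**(alpha - 1.0) * y**beta


x0, y0 = 100.0, 50.0
errors = {}
for dx in (0.01, 1.0, 50.0):
    # Exact change in output, in kg of nails.
    exact_change = production(x0 + dx, y0) - production(x0, y0)
    # Marginal product (kg per EUR) multiplied by dx (EUR): also kg of nails.
    estimated_change = marginal_product_x(x0, y0) * dx
    errors[dx] = abs(exact_change - estimated_change)
    print(dx, exact_change, estimated_change, errors[dx])
```

The estimate is the derivative multiplied by the increment, so its unit matches that of the change in output, and its error grows quickly as the increment stops being small.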


    2.9.1. Definition. A function $u(\cdot) : \mathbb{R}^n \to \mathbb{R}$ is called a utility function if it represents a preference relation, in the sense that $x \succsim y \iff u(x) \ge u(y)$.

    2.9.2. Elasticity for single-variable functions.

    2.9.2. Definition. Percentage change of a function $f(\cdot)$ at $x_2$ compared to the base from $x_1$: $\dfrac{f(x_2) - f(x_1)}{f(x_1)}$ [the difference between the two states as a fraction of the initial state $x_1$] [it is a dimensionless value].

    2.9.3. Definition. Mean value of a function at $x$: $\dfrac{f(x)}{x}$ [the average] [it is measured in (kg of nails)/€].

    2.9.4. Definition. Derivative of a function at $x_0$ (when it exists): $f'(x_0) = \lim_{x \to x_0} \dfrac{f(x) - f(x_0)}{x - x_0}$ [speed at $x_0$] [also measured in (kg of nails)/€].

    2.9.5. Definition. $\operatorname{El}(f(\cdot); x) = \dfrac{x f'(x)}{f(x)} = \dfrac{x}{f(x)} \dfrac{df(x)}{dx} = \dfrac{d(\ln f(x))}{d(\ln x)}$ [possible interpretation: the elasticity of $f(\cdot)$ with respect to $x$ (at a certain fixed value for $x$) is the percentage change in $f(\cdot)$ corresponding to a one per cent increase in $x$] [the ratio between the speed and the average] [dimensionless value].

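    2.9.6. Remark. General Economical terminology:

    $|\operatorname{El}(f(\cdot); x)| > 1$ ($f(\cdot)$ is elastic)
    $|\operatorname{El}(f(\cdot); x)| = 1$ ($f(\cdot)$ is unitary elastic)
    $|\operatorname{El}(f(\cdot); x)| < 1$ ($f(\cdot)$ is inelastic)
    $|\operatorname{El}(f(\cdot); x)| = 0$ ($f(\cdot)$ is completely inelastic)

    2.9.7. Remark. Calculation Rules:

    $\operatorname{El}((f \cdot g)(\cdot); x) = \operatorname{El}(f(\cdot); x) + \operatorname{El}(g(\cdot); x)$
    $\operatorname{El}\left(\left(\dfrac{f}{g}\right)(\cdot); x\right) = \operatorname{El}(f(\cdot); x) - \operatorname{El}(g(\cdot); x)$
    $\operatorname{El}((f \pm g)(\cdot); x) = \dfrac{f(x) \operatorname{El}(f(\cdot); x) \pm g(x) \operatorname{El}(g(\cdot); x)}{f(x) \pm g(x)}$
    $\operatorname{El}((f \circ g)(\cdot); x) = \operatorname{El}(f(\cdot); g(x)) \cdot \operatorname{El}(g(\cdot); x)$
    $\operatorname{El}(f^{-1}(\cdot); f(x_0)) = \dfrac{1}{\operatorname{El}(f(\cdot); x_0)}$ [if $f(\cdot)$ is invertible]
    $\operatorname{El}(A; x) = 0$ [$A$ is a constant]
    $\operatorname{El}(x^\alpha; x) = \alpha$ [$\alpha$ is a constant]
    $\operatorname{El}(\sin x; x) = x \cot x$
    $\operatorname{El}(\cos x; x) = -x \tan x$
    $\operatorname{El}(\tan x; x) = \dfrac{x}{\sin x \cos x}$
    $\operatorname{El}(\cot x; x) = -\dfrac{x}{\sin x \cos x}$
    $\operatorname{El}(\ln x; x) = \dfrac{1}{\ln x}$
    $\operatorname{El}(\log_a x; x) = \dfrac{1}{\ln x}$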

    2.9.8. Exercise . Prove the statements from the above remark.
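Before proving the rules, it may help to test them numerically. The sketch below is illustrative only: the functions `f` and `g`, the point `x` and the helper `elasticity` are hypothetical choices, and the derivative is approximated by a central difference rather than computed exactly.

```python
import math


def elasticity(func, x, h=1e-6):
    """El(func; x) = x * func'(x) / func(x), with func' from a central difference."""
    derivative = (func(x + h) - func(x - h)) / (2 * h)
    return x * derivative / func(x)


def f(x):
    return x**3


def g(x):
    return math.exp(2 * x)


x = 2.0
el_f = elasticity(f, x)                                   # power rule: 3
el_g = elasticity(g, x)                                   # x * 2 = 4
el_product = elasticity(lambda t: f(t) * g(t), x)         # rule: el_f + el_g
el_quotient = elasticity(lambda t: f(t) / g(t), x)        # rule: el_f - el_g
el_sum = elasticity(lambda t: f(t) + g(t), x)             # weighted-average rule
el_composed = elasticity(lambda t: f(g(t)), x)            # El(f; g(x)) * El(g; x)

print(el_f, el_g, el_product, el_quotient, el_sum, el_composed)
```

A numerical agreement of this kind is not a proof, but it is a quick way to catch a sign error in one of the rules.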


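    2.9.3. Elasticity for many-variables functions.

    2.9.9. Definition. Mean value with respect to the $i$th variable at $x$: $\dfrac{f(x)}{x_i}$ [measured in (kg of nails)/(unit of $X_i$)].

    2.9.10. Definition. $\operatorname{El}_{x_i}\big(f(x_1, \ldots, x_n); x\big) = \dfrac{x_i \dfrac{\partial f}{\partial x_i}(x)}{f(x)}$ [the partial elasticity of the function $f(\cdot)$ at $x$ with respect to the $i$th variable] [the ratio between the speed with respect to $i$ and the average with respect to $i$].

    $$\operatorname{El}_{t_j}\Big(F\big(f_1(t_1, \ldots, t_m), \ldots, f_n(t_1, \ldots, t_m)\big); t\Big) = \sum_{i=1}^{n} \operatorname{El}_{x_i}\big(F(x_1, \ldots, x_n); x\big) \cdot \operatorname{El}_{t_j}\big(f_i(t_1, \ldots, t_m); t\big) \quad \text{[chain rule]}$$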
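For a Cobb-Douglas function the partial elasticity with respect to $x_i$ equals the exponent $\alpha_i$, which gives a convenient test of Definition 2.9.10. The sketch below assumes hypothetical parameter values and an arbitrary evaluation point; the helper names are illustrative.

```python
def cobb_douglas(x, a=1.5, exponents=(0.2, 0.5, 0.3)):
    """f(x_1, ..., x_n) = a * x_1^alpha_1 * ... * x_n^alpha_n."""
    result = a
    for xi, alpha in zip(x, exponents):
        result *= xi**alpha
    return result


def partial_elasticity(func, x, i, h=1e-6):
    """x_i * (d func / d x_i)(x) / func(x), with the partial derivative
    approximated by a central difference in the i-th coordinate."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    partial = (func(xp) - func(xm)) / (2 * h)
    return x[i] * partial / func(x)


point = [4.0, 9.0, 2.5]
elasticities = [partial_elasticity(cobb_douglas, point, i) for i in range(3)]
print(elasticities)
```

The printed values should be close to the exponents whatever the evaluation point, since the partial elasticities of a Cobb-Douglas function are constant.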

    2.10. Extreme points

    2.10.1. Definition (Extreme points and extreme values concepts). Consider $f(\cdot) : A \subseteq \mathbb{R}^n \to \mathbb{R}$ and $x_0 \in A$.

    (1) The point $x_0 \in A$ is a (global) minimum point for $f(\cdot)$ if the following condition holds: $f(x_0) \le f(x)$, $\forall x \in A$. The value $f(x_0)$ is the (global) minimum value of $f(\cdot)$ on $A$.
    (2) The point $x_0 \in A$ is a (global) maximum point for $f(\cdot)$ if the following condition holds: $f(x_0) \ge f(x)$, $\forall x \in A$. The value $f(x_0)$ is the (global) maximum value of $f(\cdot)$ on $A$.
    (3) The point $x_0 \in A$ is a local minimum point for $f(\cdot)$ if the following condition holds: $\exists V \in \mathcal{N}(x_0)$, $f(x_0) \le f(x)$, $\forall x \in A \cap V$. The value $f(x_0)$ is a local minimum value of $f(\cdot)$ on $A$.
    (4) The point $x_0 \in A$ is a local maximum point for $f(\cdot)$ if the following condition holds: $\exists V \in \mathcal{N}(x_0)$, $f(x_0) \ge f(x)$, $\forall x \in A \cap V$. The value $f(x_0)$ is a local maximum value of $f(\cdot)$ on $A$.
    (5) The point $x_0 \in A$ is an unconstrained local minimum point for $f(\cdot)$ if $x_0$ is an interior point of $A$ and a local minimum point for $f(\cdot)$.
    (6) The point $x_0 \in A$ is an unconstrained local maximum point for $f(\cdot)$ if $x_0$ is an interior point of $A$ and a local maximum point for $f(\cdot)$.
    (7) The greatest lower bound (infimum) of the function $f(\cdot)$ on $A$ is denoted by $\inf_{x \in A} f(x)$ and it is the greatest lower bound (possibly infinite) for the set $f(A)$ ($\inf_{x \in A} f(x) = \inf f(A)$).
    (8) The least upper bound (supremum) of the function $f(\cdot)$ on $A$ is denoted by $\sup_{x \in A} f(x)$ and it is the least upper bound (possibly infinite) for the set $f(A)$ ($\sup_{x \in A} f(x) = \sup f(A)$).


    $\inf_{x \in A} f(x) = \alpha \iff$
    1. $\alpha \le f(x)$, $\forall x \in A$;
    2. $\forall \beta > \alpha$, $\exists x_\beta \in A$, $f(x_\beta) < \beta$.

    2.10.2. Theorem (Thm. 2.1, [14]). Consider a continuous function $f(\cdot) : A \subseteq \mathbb{R}^n \to \mathbb{R}$, $A$ closed ($A = \bar{A}$). Assume there is $\beta \in \mathbb{R}$ such that the set $A_\beta = \{x \in A;\ f(x) \le \beta\}$ is nonempty and bounded [i.e. the level set is relatively compact]. Then there is at least a point $x_0 \in A$ such that $f(x_0) = \inf_{x \in A} f(x)$ (the function attains its minimum).

    Proof. Consider $\alpha = \inf_{x \in A} f(x)$ and observe that $\alpha \le f(x)$, $\forall x \in A$.

    If $\beta < \alpha$ then $A_\beta$ would be empty, a contradiction with the hypotheses, so it has to be that $\beta \ge \alpha$.

    If $\beta = \alpha$, since the set $A_\beta$ is nonempty, there are points with the required property and the proof is finished for this case.

    Consider $\beta > \alpha$.

    If $\alpha$ is not an accumulation point for the set $f(A)$, then $\alpha$ should be an isolated point for $f(A)$. Then $\alpha \in f(A)$, so there is a point $x_0 \in A$ such that $\alpha = f(x_0)$.

    If $\alpha$ is an accumulation point for the set $f(A)$, then consider a sequence $(x_n)_{n \in \mathbb{N}}$ such that $x_n \in A$, $\forall n \in \mathbb{N}$, and $f(x_n) \to \alpha$. Because $\beta > \alpha$, and the codomain is separated (as a topological space), the sequence $(f(x_n))_{n \in \mathbb{N}}$ will eventually satisfy $f(x_n) \le \beta$ (i.e.: $\exists n_0 \in \mathbb{N}$ such that $\forall n \ge n_0$, $f(x_n) \le \beta$), so the sequence $(x_n)_{n \in \mathbb{N}}$ may be considered such that $x_n \in A$, $\forall n \in \mathbb{N}$, $f(x_n) \to \alpha$ and $f(x_n) \le \beta$ (i.e. $x_n \in A_\beta$, $\forall n \in \mathbb{N}$). Since $A_\beta$ is bounded, the sequence $(x_n)_{n \in \mathbb{N}}$ is also bounded and has a subsequence $(x_{n_k})_{k \in \mathbb{N}}$ convergent to a value $x_0 \in A$ (because the sequence is from the set $A_\beta$, which is included in the closed set $A$). Since the function $f(\cdot)$ is continuous, $f(x_0) = \lim_{k \to \infty} f(x_{n_k}) = \alpha$, so the function $f(\cdot)$ attains its minimum value at the point $x_0$.

    2.10.3. Theorem (Thm. 2.2, [14]). Consider $f(\cdot) : A \subseteq \mathbb{R}^n \to \mathbb{R}$ and $\alpha = \inf_{x \in A} f(x)$. Assume there is $b > \alpha$ such that, $\forall \beta < b$, the set $A_\beta$ is compact. Then there is $x_0 \in A$ such that $f(x_0) = \alpha$ ($f(\cdot)$ attains its minimum value). [Observe that the previous continuity requirement for $f(\cdot)$ is replaced by the compactness requirement not for just one level set $A_\beta$, but for all level sets $A_\beta$]

    Proof. For $\beta, \beta_1 \in (\alpha, b)$ with $\beta < \beta_1$ the sets $A_\beta$ and $A_{\beta_1}$ are nonempty, compact and $A_\beta \subseteq A_{\beta_1}$, so that the intersection $\bigcap_{\beta \in (\alpha, b)} A_\beta$ is nonempty. Then for $x_0 \in \bigcap_{\beta \in (\alpha, b)} A_\beta$ it happens that $f(x_0) \le \beta$, $\forall \beta \in (\alpha, b)$, which means $f(x_0) \le \alpha = \inf_{x \in A} f(x)$, so that in fact $f(x_0) = \alpha$.

    A consequence of the previous results is usually used:

    2.10.4. Theorem (Thm. 2.3, [14]). Consider $f(\cdot) : A \subseteq \mathbb{R}^n \to \mathbb{R}$. If $f(\cdot)$ is continuous and $A$ is compact, then there are points $x_0, x_1 \in A$ such that $f(x) \in [f(x_0), f(x_1)]$, $\forall x \in A$. [When the function is continuous and the domain is compact, both the minimum and the maximum values of the function are attained]

    2.10.5. Theorem (Thm. 2.5, [14]). Consider $f(\cdot) : A \subseteq \mathbb{R}^n \to \mathbb{R}$ such that the function is continuous in a neighborhood of the compact set $A_0 \subseteq A$. If $f(x) > \alpha$, $\forall x \in A_0$, then the inequality also holds for a neighborhood of $A_0$.

    Proof by contradiction.


    2.11. Unconstrained Local Optimization

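    Consider $f(\cdot) : A \subseteq \mathbb{R}^n \to \mathbb{R}$ and $x_0 \in \mathrm{int}(A)$.

    The derivative of $f(\cdot)$ at $x_0$ in the direction $h$ is the linear form $f'(x_0; \cdot) : \mathbb{R}^n \to \mathbb{R}$,

    $f'(x_0; h) = \sum_{i=1}^{n} \dfrac{\partial f}{\partial x_i}(x_0)\, h_i = f'(x_0)\, h$;

    it is the derivative of the function $g(t) = f(x_0 + th)$ at $t = 0$ ($\left.\dfrac{d}{dt} f(x_0 + th)\right|_{t=0}$).

    The second derivative of $f(\cdot)$ at $x_0$ in the direction $h$ is the quadratic form

    $f''(x_0; h) = \sum_{i=1}^{n} \sum_{j=1}^{n} \dfrac{\partial^2 f}{\partial x_i \partial x_j}(x_0)\, h_i h_j = h^t \left( \dfrac{\partial^2 f}{\partial x_i \partial x_j}(x_0) \right)_{i,j=\overline{1,n}} h = h^t f''(x_0)\, h$,

    where $f''(x_0) = \left( \dfrac{\partial^2 f}{\partial x_i \partial x_j}(x_0) \right)_{i,j=\overline{1,n}}$ is the Hessian of $f(\cdot)$ at $x_0$.

    The quadratic form $f''(x_0; h)$ is the second derivative of $g(t) = f(x_0 + th)$ at $t = 0$ ($\left.\dfrac{d^2}{dt^2} f(x_0 + th)\right|_{t=0}$).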
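The two identities above can be checked numerically by comparing finite differences of $g(t) = f(x_0 + th)$ with the gradient and Hessian formulas. This is a sketch under stated assumptions: the polynomial is the one from Example 2.11.7 of this section, while the base point, the direction and the step size are arbitrary choices made only for illustration.

```python
def f(x, y):
    # Polynomial borrowed from Example 2.11.7 of this section
    return 5 * x**2 + 4 * x * y + y**2 - 6 * x - 2 * y + 6


x0 = (0.0, 0.0)  # base point (arbitrary choice for this sketch)
h = (1.0, 2.0)   # direction (arbitrary choice for this sketch)


def g(t):
    # Restriction of f to the line through x0 with direction h
    return f(x0[0] + t * h[0], x0[1] + t * h[1])


s = 1e-3  # finite-difference step
first_numeric = (g(s) - g(-s)) / (2 * s)
second_numeric = (g(s) - 2 * g(0.0) + g(-s)) / s**2

# Gradient and Hessian of f computed by hand
grad = (10 * x0[0] + 4 * x0[1] - 6, 4 * x0[0] + 2 * x0[1] - 2)
hessian = ((10, 4), (4, 2))
first_formula = grad[0] * h[0] + grad[1] * h[1]
second_formula = sum(hessian[i][j] * h[i] * h[j] for i in range(2) for j in range(2))

print(first_numeric, first_formula)
print(second_numeric, second_formula)
```

Both pairs should agree up to rounding, since $g$ is a quadratic polynomial in $t$.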

    2.11.1. Theorem (Thm. 3.1, [14]). Consider a class $C^2$ function $f(\cdot) : A \subseteq \mathbb{R}^n \to \mathbb{R}$ and $x_0 \in \mathrm{int}(A)$. If $x_0$ is an unconstrained local minimum, then $f'(x_0) = 0$ and $f''(x_0)$ is nonnegative (positive semidefinite).

    2.11.2. Theorem (Thm. 3.2, [14]). Consider a class $C^2$ function $f(\cdot) : A \subseteq \mathbb{R}^n \to \mathbb{R}$ and $x_0 \in \mathrm{int}(A)$. If $f'(x_0) = 0$ and $f''(x_0)$ is positive (positive definite), then $\exists \varepsilon, m > 0$ such that $f(x) \ge f(x_0) + m \|x - x_0\|^2$, $\forall x \in A$ with $\|x - x_0\| < \varepsilon$.

    A solution $x_0$ of the system $f'(x) = 0$ is called a critical (stationary) point, and the corresponding value of the function, $f(x_0)$, is called a critical value. A critical point $x_0$ is called nondegenerate if $f''(x_0)$ is nonsingular (the determinant is nonzero).

    2.11.3. Theorem (Thm. 3.5, [14]). Consider a class $C^2$ function $f(\cdot) : A \subseteq \mathbb{R}^n \to \mathbb{R}$ and $x_0 \in \mathrm{int}(A)$ a nondegenerate critical point. The point $x_0$ is a local minimum if and only if $f''(x_0)$ is nonnegative. The point $x_0$ is a local maximum if and only if $f''(x_0)$ is nonpositive.

    2.11.4. Remark. Procedure for a function of 2 variables:

    (1) Solve (FOC): $f'_x(x, y) = 0$, $f'_y(x, y) = 0$.

    (2) For each solution of the (FOC) system,

    (a) Calculate the Hessian: $H_f(x, y) = \begin{pmatrix} f''_{x^2}(x, y) & f''_{xy}(x, y) \\ f''_{yx}(x, y) & f''_{y^2}(x, y) \end{pmatrix}$

    (b) Decide:

        $\det H_f(x, y)$    $f''_{x^2}(x, y)$    $(x, y)$
        > 0                 < 0                  maximum
        > 0                 > 0                  minimum
        < 0                                      saddle (no extreme)
        = 0                                      the method doesn't decide
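The decision table can be sketched as a small function. The name `classify_critical_point` and the returned labels are assumptions made for this illustration; the arguments are the second-order partial derivatives evaluated at a critical point, with $f''_{xy} = f''_{yx}$ assumed (class $C^2$).

```python
def classify_critical_point(fxx, fxy, fyy):
    """Apply the 2-variable second-order test at a critical point."""
    det = fxx * fyy - fxy * fxy  # determinant of the Hessian
    if det > 0:
        # det > 0 forces fxx != 0, so only the sign of fxx matters here
        return "minimum" if fxx > 0 else "maximum"
    if det < 0:
        return "saddle"
    return "undecided"


print(classify_critical_point(10, 4, 2))   # Hessian entries of Example 2.11.7
print(classify_critical_point(2, 0, -2))   # Hessian entries of Example 2.11.8
```

The two calls use the Hessians computed in Examples 2.11.7 and 2.11.8 of this section.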

    2.11.5. Remark. For the general case: for each stationary point $a$ consider the quadratic function

    $\varphi(\xi) = \sum_{i,j=1}^{n} \dfrac{\partial^2 f}{\partial x_i \partial x_j}(a)\, \xi_i \xi_j$.

    If:

    (1) $\varphi(\cdot)$ is negative definite, then $a$ is an extreme point, namely a local interior maximum point for $f(\cdot)$.

    (2) $\varphi(\cdot)$ is positive definite, then $a$ is an extreme point, namely a local interior minimum point for $f(\cdot)$.

    (3) $\varphi(\cdot)$ is indefinite (it takes both positive and negative values), then $a$ is not a local interior extreme point (it is a saddle point).

    (4) $\varphi(\cdot)$ is only semidefinite (nonnegative or nonpositive, without being definite), then the procedure cannot decide.

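    2.11.6. Remark. The Hessian $H_f(\cdot)$ is the matrix of the quadratic function:

    $H_f(a) = \left( \dfrac{\partial^2 f}{\partial x_i \partial x_j}(a) \right)_{i,j=\overline{1,n}} = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1 \partial x_1}(a) & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_1}(a) \\ \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_1 \partial x_n}(a) & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_n}(a) \end{pmatrix}$

    The Hessian (and hence the quadratic form $\varphi(\cdot)$) is positive if and only if all the determinants $\Delta_j$ are positive ($\forall j = \overline{1,n}$, $\Delta_j > 0$), and negative if and only if $\forall j = \overline{1,n}$, $(-1)^j \Delta_j > 0$, where

    $\Delta_j = \begin{vmatrix} \dfrac{\partial^2 f}{\partial x_1 \partial x_1}(a) & \cdots & \dfrac{\partial^2 f}{\partial x_j \partial x_1}(a) \\ \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_1 \partial x_j}(a) & \cdots & \dfrac{\partial^2 f}{\partial x_j \partial x_j}(a) \end{vmatrix}$.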
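The criterion with the determinants $\Delta_j$ can be sketched in plain Python. The helper names (`det`, `leading_minors`, `definiteness`) and the sample matrix are assumptions made for this illustration; the determinant uses cofactor expansion, which is adequate only for small matrices.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for c in range(n):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total


def leading_minors(h):
    """Return [Delta_1, ..., Delta_n] for a square matrix h."""
    return [det([row[:j] for row in h[:j]]) for j in range(1, len(h) + 1)]


def definiteness(h):
    minors = leading_minors(h)
    if all(d > 0 for d in minors):
        return "positive definite"
    if all((-1) ** j * d > 0 for j, d in enumerate(minors, start=1)):
        return "negative definite"
    return "test inconclusive"


H = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]  # hypothetical symmetric matrix
print(leading_minors(H))
print(definiteness(H))
```

When the test is inconclusive the matrix may be indefinite or semidefinite; the sign pattern of the leading minors alone does not separate those cases.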

    2.11.7. Example. The following function has a unique local minimum point.

    $f(\cdot, \cdot) : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = 5x^2 + 4xy + y^2 - 6x - 2y + 6$

    [Figure: graph of $z = 5x^2 + 4xy + y^2 - 6x - 2y + 6$ for $x, y \in [-4, 4]$]

    $f'_x(x, y) = (5x^2 + 4xy + y^2 - 6x - 2y + 6)'_x = 10x + 4y - 6$

    $f'_y(x, y) = (5x^2 + 4xy + y^2 - 6x - 2y + 6)'_y = 4x + 2y - 2$

    $\Rightarrow$ (FOC): $10x + 4y - 6 = 0$, $4x + 2y - 2 = 0$ $\Rightarrow$ $5x + 2y = 3$, $2x + y = 1$ $\Rightarrow$ $(1, -1)$

    $f''_{x^2}(x, y) = (10x + 4y - 6)'_x = 10$

    $f''_{xy}(x, y) = (10x + 4y - 6)'_y = 4$

    $f''_{yx}(x, y) = (4x + 2y - 2)'_x = 4$

    $f''_{y^2}(x, y) = (4x + 2y - 2)'_y = 2$

    $\Rightarrow d^2 f(x, y) = 10\,dx^2 + 8\,dx\,dy + 2\,dy^2$

    The Hessian is $H(x, y) = \begin{pmatrix} 10 & 4 \\ 4 & 2 \end{pmatrix}$; $H(1, -1) = \begin{pmatrix} 10 & 4 \\ 4 & 2 \end{pmatrix}$, $\det H(1, -1) = 20 - 16 = 4 > 0$, so the point $(1, -1)$ is a local extremum and, since $f''_{x^2}(x, y) = 10 > 0$, the point is a local minimum. The local minimum value of the function is $f(1, -1) = 5 - 4 + 1 - 6 + 2 + 6 = 4$.

    2.11.8. Example. For the following function, the critical point is not an extremum:

    $f(\cdot, \cdot) : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = x^2 - y^2$

    [Figure: graph of $z = x^2 - y^2$ for $x, y \in [-4, 4]$]

    $f'_x(x, y) = (x^2 - y^2)'_x = 2x$

    $f'_y(x, y) = (x^2 - y^2)'_y = -2y$

    $\Rightarrow$ (FOC): $2x = 0$, $-2y = 0$ $\Rightarrow$ $(0, 0)$ critical point.

    $f''_{x^2}(x, y) = (2x)'_x = 2$

    $f''_{xy}(x, y) = (2x)'_y = 0$

    $f''_{yx}(x, y) = (-2y)'_x = 0$

    $f''_{y^2}(x, y) = (-2y)'_y = -2$

    $\Rightarrow H(x, y) = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$, $H(0, 0) = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$, $\det H(0, 0) = -4 < 0$,

    so the critical point $(0, 0)$ is not a local extremum.

    2.12. Unconstrained optimization. Approximating functions by Least Square Method.

    Assume that $n$ observations were made and the values $(x_1, y_1)$, $(x_2, y_2)$, $\dots$, $(x_n, y_n)$ have been obtained.

    • There are $n$ observations.
    • There are 2 variables: $x$ [with values $x_1, \dots, x_n$] and $y$ [with values $y_1, \dots, y_n$].
    • The variable $x$ is called "control variable" and it is considered "error-free".
    • The variable $y$ is called "response variable".
    • Usually it is assumed that the values of the $y$ variable are imprecise, "contaminated", or that they contain small random perturbations.
    • A theoretical model $y = f(x; \theta)$ is considered, where $\theta$ stands for a list of parameters.
    • The method finds a value for the parameters which is best from a certain point of view.
    • The simplest example: $f(x; a, b) = ax + b$, with $\theta = (a, b)$ [the linear model]; so, we are looking for the line that "fits best" the observations, and the line is $y = ax + b$.
    • To find which value is best, the method measures the squared vertical distance between each point $(x_i, y_i)$ and the corresponding point on the line $(x_i, ax_i + b)$: $(ax_i + b - y_i)^2$.
    • The distance between the line and the observations is considered to be the sum of all squared vertical distances:


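    $E(a, b) = \sum_{i=1}^{n} (ax_i + b - y_i)^2$.

    Find the point which minimizes $E(\cdot, \cdot)$:

    $\dfrac{\partial E}{\partial a}(a, b) = 2 \sum_{i=1}^{n} (ax_i + b - y_i)\, x_i$, $\quad \dfrac{\partial E}{\partial b}(a, b) = 2 \sum_{i=1}^{n} (ax_i + b - y_i)$

    $\Rightarrow$ (FOC): $\dfrac{\partial E}{\partial a}(a, b) = 0$, $\dfrac{\partial E}{\partial b}(a, b) = 0$

    $\Rightarrow$ $2 \sum_{i=1}^{n} (ax_i + b - y_i)\, x_i = 0$, $\quad 2 \sum_{i=1}^{n} (ax_i + b - y_i) = 0$

    $\Rightarrow$ $\left( \sum_{i=1}^{n} x_i^2 \right) a + \left( \sum_{i=1}^{n} x_i \right) b = \sum_{i=1}^{n} x_i y_i$, $\quad \left( \sum_{i=1}^{n} x_i \right) a + \left( \sum_{i=1}^{n} 1 \right) b = \sum_{i=1}^{n} y_i$

    $\Rightarrow$

    $\hat{a} = \dfrac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$

    $\hat{b} = \dfrac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$.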
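The closed-form solution of the (FOC) system translates directly into code. This is a minimal sketch: the function name `fit_line` and the sample data are assumptions made for illustration, and the guard against a zero denominator corresponds to the case where all $x_i$ coincide.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing sum((a*x + b - y)^2) via the normal equations."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values coincide; the line is not determined")
    a = (n * sxy - sx * sy) / denom
    b = (sxx * sy - sx * sxy) / denom
    return a, b


xs = [0.0, 1.0, 2.0, 3.0]   # hypothetical control values
ys = [1.0, 3.1, 4.9, 7.0]   # hypothetical noisy responses
a_hat, b_hat = fit_line(xs, ys)
print(a_hat, b_hat)
```

For data lying exactly on a line, for example `fit_line([1, 2, 3], [5, 7, 9])`, the function recovers the slope and the intercept of that line.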

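    $\dfrac{\partial^2 E}{\partial a^2}(a, b) = 2 \sum_{i=1}^{n} x_i^2 > 0$

    $\dfrac{\partial^2 E}{\partial b^2}(a, b) = 2n$

    $\dfrac{\partial^2 E}{\partial a \partial b}(a, b) = \dfrac{\partial^2 E}{\partial b \partial a}(a, b) = 2 \sum_{i=1}^{n} x_i$

    The Hessian: $H(a, b) = \begin{pmatrix} 2 \sum_{i=1}^{n} x_i^2 & 2 \sum_{i=1}^{n} x_i \\ 2 \sum_{i=1}^{n} x_i & 2n \end{pmatrix}$

    $\det H(a, b) = 4n \sum_{i=1}^{n} x_i^2 - 4 \left( \sum_{i=1}^{n} x_i \right)^2 \ge 0$ from the Cauchy-Buniakowski-Schwarz inequality [with $y_i = 1$]; moreover, it is strictly greater than zero as soon as the values $x_i$ are not all equal (in particular, when the values $x_i$ are different for different values of $i$).

    From (SOC) it follows that the point $(\hat{a}, \hat{b})$ is a strict global minimum point.

    Obs [the Cauchy-Buniakowski-Schwarz inequality]:

    $\forall n \in \mathbb{N}$, $\forall x_i, y_i \in \mathbb{R}$: $\left( \sum_{i=1}^{n} x_i y_i \right)^2 \le \sum_{i=1}^{n} x_i^2 \cdot \sum_{i=1}^{n} y_i^2$.

    Proof: Observe that $(x_i t + y_i)^2 \ge 0$, $\forall t \in \mathbb{R}$; then $x_i^2 t^2 + 2 x_i y_i t + y_i^2 \ge 0$, $\forall t \in \mathbb{R}$. Summing from 1 to $n$:

    $\left( \sum_{i=1}^{n} x_i^2 \right) t^2 + 2 \left( \sum_{i=1}^{n} x_i y_i \right) t + \sum_{i=1}^{n} y_i^2 \ge 0$, $\forall t \in \mathbb{R}$.

    The last expression is a quadratic function in the variable $t$ which is nonnegative for all $t$, so the discriminant should be less than or equal to 0 (otherwise there are two distinct real roots and the function will be negative between them)

    $\Rightarrow 4 \left( \sum_{i=1}^{n} x_i y_i \right)^2 - 4 \sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2 \le 0$, so $\left( \sum_{i=1}^{n} x_i y_i \right)^2 \le \sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2$, which completes the proof.

    The line $y = \hat{a}x + \hat{b}$ is the line which best fits the data (the regression line).

    The value $E(\hat{a}, \hat{b})$ is the error of the evaluation.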
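A quick numerical illustration of the Cauchy-Buniakowski-Schwarz inequality is sketched below. The helper `cbs_gap` and the sample vectors are assumptions made for this example. With $y_i = 1$ the gap equals $n \sum x_i^2 - (\sum x_i)^2$, that is, one quarter of the Hessian determinant of the least squares problem.

```python
def cbs_gap(xs, ys):
    """Return sum(x^2) * sum(y^2) - (sum(x*y))^2, which should be >= 0."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sum(x * x for x in xs) * sum(y * y for y in ys) - sxy * sxy


xs = [1, 2, 4]                              # hypothetical data
gap_general = cbs_gap(xs, [3, -1, 2])       # generic second vector
gap_ones = cbs_gap(xs, [1, 1, 1])           # the case y_i = 1
gap_proportional = cbs_gap(xs, [2, 4, 8])   # y = 2x, the equality case
print(gap_general, gap_ones, gap_proportional)
```

The gap vanishes exactly when one vector is a multiple of the other, which for $y_i = 1$ means that all the $x_i$ are equal.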

    Econometrics language:

    $n$: the sample size (the number of observations)

    $i$: the subscript used to index observations (the $i$-th observation)

    $X$: the variable considered independent (with observed values $x_1, \dots, x_n$) (there may be more than one independent variable) (the explanatory variable) (the control variable) (the predictor variable) (the regressor) (the covariate)

    $\bar{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i$ is the sample average of the observed $X$ values.

    $Y$: the variable considered dependent (on $X$) (with observed values $y_1, \dots, y_n$) (the explained variable) (the response variable) (the predicted variable) (the regressand)

    $\bar{y} = \dfrac{1}{n} \sum_{i=1}^{n} y_i$ is the sample average of the observed $Y$ values.

    The pairs $(x_i, y_i)$ are the observed dependencies.

    The model is $y = ax + b$.

    The best fit is $y = \hat{a}x + \hat{b}$ (where $(\hat{a}, \hat{b})$ is the solution of the (FOC) system).

    For each $i$, $\hat{y}_i = \hat{a}x_i + \hat{b}$ is the predicted value.

    $\hat{u}_i = y_i - \hat{y}_i$ is the observed error (the residuals).

    The model aims "to explain $Y$ in terms of $X$".

    Issues of the explanation:

    1. There will never be an exact relationship of the form $Y = aX + b$. How do we include in the model the possibility that other (unknown) factors may affect $Y$ as well?

    2. How do we find the best linear functional relationship between $X$ and $Y$?

    3. How can we be sure that a "ceteris paribus"-type relationship is captured?

    Properties:

    1. $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$

    Proof: From the definition of $\bar{x}$.

    2. $\sum_{i=1}^{n} (x_i - \bar{x})\, x_i = \sum_{i=1}^{n} (x_i - \bar{x})^2$

    Proof: $\sum_{i=1}^{n} (x_i - \bar{x})\, x_i = \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x} + \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2 + \bar{x} \sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2$.


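    3. $\sum_{i=1}^{n} (\bar{y} - y_i)\, x_i = \sum_{i=1}^{n} (x_i - \bar{x})(\bar{y} - y_i)$.

    Proof: $\sum_{i=1}^{n} (\bar{y} - y_i)\, x_i = \sum_{i=1}^{n} (\bar{y} - y_i)(x_i - \bar{x} + \bar{x}) = \sum_{i=1}^{n} (\bar{y} - y_i)(x_i - \bar{x}) + \bar{x} \sum_{i=1}^{n} (\bar{y} - y_i) = \sum_{i=1}^{n} (\bar{y} - y_i)(x_i - \bar{x})$.

    4. $\hat{b} = \bar{y} - \bar{x}\hat{a}$

    Proof: From (FOC) it follows that the values $\hat{a}, \hat{b}$ satisfy the relation $\left( \sum_{i=1}^{n} x_i \right) \hat{a} + n\hat{b} = \sum_{i=1}^{n} y_i$; dividing by $n$ gives $\bar{x}\hat{a} + \hat{b} = \bar{y}$ $\Rightarrow$ $\hat{b} = \bar{y} - \bar{x}\hat{a}$.

    5. $\sum_{i=1}^{n} \hat{u}_i = 0$

    Proof: $\hat{u}_i = y_i - \hat{a}x_i - \hat{b}$, so from (FOC) it follows: $\sum_{i=1}^{n} \hat{u}_i = \sum_{i=1}^{n} (y_i - \hat{a}x_i - \hat{b}) = \sum_{i=1}^{n} (y_i - \hat{a}x_i - \bar{y} + \bar{x}\hat{a}) = \sum_{i=1}^{n} (y_i - \bar{y}) + \hat{a} \sum_{i=1}^{n} (\bar{x} - x_i) = 0$.

    6. $\sum_{i=1}^{n} x_i \hat{u}_i = 0$

    Proof: $\sum_{i=1}^{n} x_i \hat{u}_i = \sum_{i=1}^{n} x_i (y_i - \hat{y}_i) = \sum_{i=1}^{n} x_i (y_i - \hat{a}x_i - \hat{b}) = \sum_{i=1}^{n} x_i y_i - \hat{a} \sum_{i=1}^{n} x_i^2 - \hat{b} \sum_{i=1}^{n} x_i =$

    $= \sum_{i=1}^{n} x_i y_i - \hat{a} \sum_{i=1}^{n} x_i^2 - (\bar{y} - \bar{x}\hat{a}) \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i - \bar{y} \sum_{i=1}^{n} x_i - \hat{a} \left( \sum_{i=1}^{n} x_i^2 - \bar{x} \sum_{i=1}^{n} x_i \right) =$

    $= \sum_{i=1}^{n} x_i y_i - \bar{y} \sum_{i=1}^{n} x_i - \dfrac{1}{n} \left( n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i \right) = 0$.

    Consequently, $0 = \sum_{i=1}^{n} (\hat{a}x_i + \hat{b} - y_i)\, x_i = \sum_{i=1}^{n} (\hat{a}x_i + \bar{y} - \bar{x}\hat{a} - y_i)\, x_i = \sum_{i=1}^{n} (\hat{a}(x_i - \bar{x}) + \bar{y} - y_i)\, x_i$

    $\Rightarrow \hat{a} \sum_{i=1}^{n} (x_i - \bar{x})\, x_i = -\sum_{i=1}^{n} (\bar{y} - y_i)\, x_i$.

    By Properties 2 and 3, $\sum_{i=1}^{n} (x_i - \bar{x})\, x_i = \sum_{i=1}^{n} (x_i - \bar{x})^2$ and $\sum_{i=1}^{n} (\bar{y} - y_i)\, x_i = \sum_{i=1}^{n} (x_i - \bar{x})(\bar{y} - y_i)$.

    So: $\hat{a} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ $\Rightarrow$ $\hat{a} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$ (the sample covariance between $X$ and $Y$ divided by the sample variance of $X$).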
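The properties of the residuals can be checked on a small data set. This is a sketch with hypothetical data; the slope is computed with the covariance-over-variance formula and the intercept with Property 4, and the variable names are chosen only for this illustration.

```python
xs = [1.0, 2.0, 4.0, 5.0]   # hypothetical observations of X
ys = [2.0, 3.0, 7.0, 8.0]   # hypothetical observations of Y
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Slope: sum of cross deviations divided by sum of squared x deviations
a_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
# Intercept from Property 4
b_hat = y_bar - x_bar * a_hat

residuals = [y - (a_hat * x + b_hat) for x, y in zip(xs, ys)]
sum_u = sum(residuals)                                # Property 5
sum_xu = sum(x * u for x, u in zip(xs, residuals))    # Property 6
print(a_hat, b_hat, sum_u, sum_xu)
```

Both sums should vanish up to floating-point rounding, whatever data are used, as long as the $x_i$ are not all equal.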

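    $E(a, b) = \sum_{i=1}^{n} (ax_i + b - y_i)^2 = \sum_{i=1}^{n} \left( ax_i - \hat{a}x_i + \hat{a}x_i + b - \hat{b} + \hat{b} - y_i \right)^2 = \sum_{i=1}^{n} \left[ (\hat{a}x_i + \hat{b} - y_i) + x_i (a - \hat{a}) + (b - \hat{b}) \right]^2 =$

    $= \sum_{i=1}^{n} (\hat{a}x_i + \hat{b} - y_i)^2 + \sum_{i=1}^{n} \left[ x_i (a - \hat{a}) + (b - \hat{b}) \right]^2 + 2 \sum_{i=1}^{n} (\hat{a}x_i + \hat{b} - y_i) \left[ x_i (a - \hat{a}) + (b - \hat{b}) \right] =$

    $= E(\hat{a}, \hat{b}) + \sum_{i=1}^{n} \left[ x_i (a - \hat{a}) + (b - \hat{b}) \right]^2 + 2 (a - \hat{a}) \underbrace{\sum_{i=1}^{n} (\hat{a}x_i + \hat{b} - y_i)\, x_i}_{= 0 \text{ from FOC}} + 2 (b - \hat{b}) \underbrace{\sum_{i=1}^{n} (\hat{a}x_i + \hat{b} - y_i)}_{= 0 \text{ from FOC}} =$

    $= E(\hat{a}, \hat{b}) + \sum_{i=1}^{n} \left[ x_i (a - \hat{a}) + (b - \hat{b}) \right]^2$,

    which shows directly that the smallest value is attained when $a = \hat{a}$ and $b = \hat{b}$.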
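The decomposition of $E(a, b)$ can be verified numerically for an arbitrary pair $(a, b)$. The data, the trial pair and the helper name `sse` below are assumptions made for this sketch; the fitted values come from the normal equations.

```python
def sse(a, b, xs, ys):
    """Sum of squared vertical distances E(a, b)."""
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys))


xs = [1.0, 2.0, 4.0, 5.0]   # hypothetical data
ys = [2.0, 3.0, 7.0, 8.0]
n = len(xs)

# Least squares solution from the normal equations
sx, sy = sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
a_hat = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b_hat = (sxx * sy - sx * sxy) / (n * sxx - sx * sx)

a, b = 1.0, 1.0             # arbitrary trial line
lhs = sse(a, b, xs, ys)
extra = sum((x * (a - a_hat) + (b - b_hat)) ** 2 for x in xs)
rhs = sse(a_hat, b_hat, xs, ys) + extra
print(lhs, rhs)
```

Since the extra term is a sum of squares, every trial line has an error at least as large as $E(\hat{a}, \hat{b})$.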

    2.13. Constrained Optimization

    2.13.1. Theorem ([3], Th. 18.1; Necessary Optimality Conditions for Two Variables and One Equality Constraint). Consider the class $C^1$ functions $f(\cdot, \cdot), h(\cdot, \cdot) : \mathbb{R}^2 \to \mathbb{R}$ and $(x^*, y^*)$ a solution of the problem

    $\max_{(x, y)} f(x, y)$, subject to $h(x, y) = c$,

    such that $(x^*, y^*)$ is not a critical point of $h(\cdot, \cdot)$ [this condition is called "Non Degenerate Constraint Qualification" (NDCQ)]. Then, there is a real number $\hat{\lambda}$ such that the point $(x^*, y^*, \hat{\lambda})$ is a critical point of the function

    $L(x, y, \lambda) = f(x, y) - \lambda\, [h(x, y) - c]$.

    [The function $L(\cdot, \cdot, \cdot)$ is called "Lagrangian function"]

    Proof. Assume that $(x^*, y^*)$ is a constrained maximum.

    Consider the system of two equations in $(x, y)$ given by:

    $f(x, y) = f(x^*, y^*)$, $\quad h(x, y) = c$

    and the (extended) Jacobian matrix at $(x^*, y^*)$, $\begin{pmatrix} f'_x(x^*, y^*) & f'_y(x^*, y^*) \\ h'_x(x^*, y^*) & h'_y(x^*, y^*) \end{pmatrix}$.

    If the rank of this matrix is two, then by applying the Implicit Function Theorem we would get, for each $(\alpha, \beta)$ in a certain neighborhood of $(f(x^*, y^*), c)$ (which neighborhood contains $(f(x^*, y^*), c)$ as an interior point), a solution $(x_{\alpha, \beta}, y_{\alpha, \beta})$ of the system $f(x, y) = \alpha$, $h(x, y) = \beta$.

    In particular, for an $\varepsilon > 0$ sufficiently small there is a solution $(x_\varepsilon, y_\varepsilon)$ of the system $f(x, y) = f(x^*, y^*) + \varepsilon$, $h(x, y) = c$,

    which means that $h(x_\varepsilon, y_\varepsilon) = c$ ($(x_\varepsilon, y_\varepsilon)$ satisfies the constraint) and $f(x_\varepsilon, y_\varepsilon) = f(x^*, y^*) + \varepsilon > f(x^*, y^*)$ ($(x_\varepsilon, y_\varepsilon)$ is better than $(x^*, y^*)$), which is a contradiction with the assumption that $(x^*, y^*)$ is a constrained maximum.

    So, the matrix $\begin{pmatrix} f'_x(x^*, y^*) & f'_y(x^*, y^*) \\ h'_x(x^*, y^*) & h'_y(x^*, y^*) \end{pmatrix}$ cannot have rank two [and because of NDCQ it has to be of rank one]. This implies the existence of a constant $\hat{\lambda}$ such that $\left( f'_x(x^*, y^*), f'_y(x^*, y^*) \right) = \hat{\lambda} \left( h'_x(x^*, y^*), h'_y(x^*, y^*) \right)$,

    which means that $f'_x(x^*, y^*) = \hat{\lambda}\, h'_x(x^*, y^*)$ and $f'_y(x^*, y^*) = \hat{\lambda}\, h'_y(x^*, y^*)$. Finally, this together with $h(x^*, y^*) = c$ means that $(x^*, y^*, \hat{\lambda})$ is a critical point for the function $L(\cdot, \cdot, \cdot)$.

    2.13.2. Theorem ([3], Th. 19.7; Sufficient Conditions for Two Variables and One Equality Constraint). Under the same conditions as above, assume that $(x^*, y^*, \hat{\lambda})$ is a critical point of the Lagrangian function such that the (bordered) determinant

    $\begin{vmatrix} 0 & h'_x(x^*, y^*) & h'_y(x^*, y^*) \\ h'_x(x^*, y^*) & L''_{x^2}(x^*, y^*, \hat{\lambda}) & L''_{xy}(x^*, y^*, \hat{\lambda}) \\ h'_y(x^*, y^*) & L''_{yx}(x^*, y^*, \hat{\lambda}) & L''_{y^2}(x^*, y^*, \hat{\lambda}) \end{vmatrix} > 0$.

    Then $(x^*, y^*)$ is a constrained local maximum.
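The bordered determinant test can be sketched for a concrete problem. The problem used here, maximizing $f(x, y) = xy$ subject to $x + y = 1$, is an assumption chosen for illustration; its Lagrangian $L = xy - \lambda (x + y - 1)$ has the critical point $(1/2, 1/2, 1/2)$, with $h'_x = h'_y = 1$, $L''_{x^2} = L''_{y^2} = 0$ and $L''_{xy} = 1$. The helper names are also hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def bordered_determinant(hx, hy, lxx, lxy, lyy):
    """Bordered Hessian determinant for one equality constraint, two variables."""
    return det3([[0, hx, hy],
                 [hx, lxx, lxy],
                 [hy, lxy, lyy]])


# Assumed example: f(x, y) = x*y, constraint x + y = 1, at (1/2, 1/2, 1/2)
value = bordered_determinant(hx=1, hy=1, lxx=0, lxy=1, lyy=0)
print(value)  # a positive value indicates a constrained local maximum
```

For this assumed problem the determinant is positive, in agreement with the fact that $xy$ restricted to the line $x + y = 1$ is largest at $x = y = 1/2$.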

    2.13.1. Procedure for functions of 2 variables: [opt] $f(x, y)$ constrained by $g(x, y) = 0$

    1. Form the Lagrange function: $L(x, y, \lambda) = f(x, y) - \lambda\, g(x, y)$

    2. Find the critical points of $g(\cdot, \cdot)$.

    3. Apply (FOC) for $L(\cdot, \cdot, \cdot)$ to obtain:

    (CFOC ) : 8


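    (CFOC): $y = \lambda$, $x = \lambda$, $2\lambda = 1$ $\Rightarrow$ $\lambda = \dfrac{1}{2}$ $\Rightarrow$ solution $\left( \dfrac{1}{2}, \dfrac{1}{2}, \dfrac{1}{2} \right)$

    (CSOC):

    Alternative 1: differentiate the constraint

    $x + y - 1 = 0 \Rightarrow dx + dy = 0 \,|_{(x, y) = (1/2, 1/2)} \Rightarrow dx + dy = 0 \Rightarrow dy = -dx$.

    [Calculate the second-order differential for the function $(x, y) \mapsto L(x, y, \lambda_0)$]

    $d^2_{(x, y)} L(x, y, \lambda) = L''_{x^2}(x, y, \lambda)\, dx^2 + 2 L''_{xy}(x, y, \lambda)\, dx\, dy + L''_{y^2}(x, y, \lambda)\, dy^2 = 2\, dx\, dy$

    $d^2_{(x, y)} L\left( \dfrac{1}{2}, \dfrac{1}{2}, \dfrac{1}{2} \right) = 2\, dx\, dy\,|_{dy = -dx} = -2\, dx^2 < 0$ (negat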