Lecture Note 4: Dynamic Programming (講義ノート4:動的計画法)
Masaru Inaba (稲葉 大), July 3rd and 15th, 2010 @ Gakushuin
E-mail: [email protected] http://sites.google.com/site/masaruinaba/



  • I. Dynamic programming: finite horizon

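  • The problem:

    $$\max_{\{c_t,\,k_{t+1}\}} \sum_{t=0}^{T} \beta^t u(c_t)$$

    subject to

    $$k_{t+1} - k_t = f(k_t) - c_t - \delta k_t \quad \text{for } t = 0, 1, \dots, T, \qquad k_0 = \bar{k}_0, \qquad k_{T+1} \ge 0.$$

    Here $0 < \beta < 1$ is the discount factor, $\delta$ is the depreciation rate, $u(\cdot)$ is the utility function, and $f(\cdot)$ is the production function, with

    $$u(0) = 0,\ u'(\cdot) > 0,\ u''(\cdot) < 0,\ u'(0) = \infty,\ u'(\infty) = 0,$$

    $$f(0) = 0,\ f'(\cdot) > 0,\ f''(\cdot) < 0,\ f'(0) = \infty,\ f'(\infty) = 0.$$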

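  • The problem has a recursive structure.

    Bellman's principle of optimality: an optimal policy has the property that, whatever the initial state and the initial decision are, the remaining decisions must form an optimal policy with regard to the state resulting from the first decision (see (1994), p. 150).

    Solving the problem one period at a time, starting from the final period $t = T$ and moving backward, is called the backward induction method.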

  • I.1 Backward induction method

    In period $T$, given $k_T$, choose $c_T$ and $k_{T+1}$ to maximize $u(c_T)$:

    $$\max_{c_T,\,k_{T+1}} u(c_T) \quad \text{s.t. } k_{T+1} - k_T = f(k_T) - c_T - \delta k_T, \quad k_T \text{ given}, \quad k_{T+1} \ge 0.$$

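  • Substituting out $c_T$ with the constraint:

    $$\max_{k_{T+1}} u\big(f(k_T) + (1-\delta)k_T - k_{T+1}\big) \quad \text{s.t. } k_T \text{ given}, \quad k_{T+1} \ge 0.$$

    The optimal $k_{T+1}$ is a function of $k_T$:

    $$k_{T+1} = \phi_T(k_T).$$

    This function is called the policy function.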

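  • Deriving the policy function $\phi_T$

    Let $\lambda_T$ be the multiplier on the resource constraint and $\mu$ the multiplier on $k_{T+1} \ge 0$. The first-order conditions are

    $$u'(c_T) = \lambda_T, \qquad \lambda_T = \mu,$$

    $$k_{T+1} \ge 0, \quad \mu \ge 0, \quad \text{and} \quad \mu k_{T+1} = 0.$$

    Hence

    $$u'(c_T) = \mu, \qquad \mu k_{T+1} = 0.$$

    Since $u'(c_T) = \mu > 0$, it follows that $k_{T+1} = \phi_T(k_T) = 0$.

    Consumption is $c_T = f(k_T) + (1-\delta)k_T - \phi_T(k_T)$, which through the policy function depends only on $k_T$.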

  • Value function in period $T$

    $$V_T(k_T) = u\big(f(k_T) + (1-\delta)k_T - \phi_T(k_T)\big)$$

    $V_T(k_T)$ is the maximized utility in period $T$ given $k_T$. It is called the state evaluation function or the value function.

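  • Period $T-1$

    Move back from $T$ to $T-1$. The choice of $k_T$ made in $T-1$ determines $V_T(k_T)$, the maximized utility in period $T$. Taking $V_T(k_T)$ as known, the period $T-1$ problem is

    $$\max_{c_{T-1},\,k_T} u(c_{T-1}) + \beta V_T(k_T) \quad \text{s.t. } k_T - k_{T-1} = f(k_{T-1}) - c_{T-1} - \delta k_{T-1}, \quad k_{T-1} \text{ given}.$$

    Substituting out $c_{T-1}$:

    $$\max_{k_T} u\big(f(k_{T-1}) + (1-\delta)k_{T-1} - k_T\big) + \beta V_T(k_T) \quad \text{s.t. } k_{T-1} \text{ given}.$$

    The solution is a function of $k_{T-1}$:

    $$k_T = \phi_{T-1}(k_{T-1}).$$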

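  • Value function in period $T-1$

    $$V_{T-1}(k_{T-1}) = u\big(f(k_{T-1}) + (1-\delta)k_{T-1} - \phi_{T-1}(k_{T-1})\big) + \beta V_T(k_T), \qquad k_T = \phi_{T-1}(k_{T-1}).$$

    $V_{T-1}(k_{T-1})$ is the value function in period $T-1$ given $k_{T-1}$.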

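  • Period $t$ (general case)

    Move back from $t+1$ to $t$. The choice of $k_{t+1}$ made in $t$ determines $V_{t+1}(k_{t+1})$. Taking $V_{t+1}(k_{t+1})$ as known, the period $t$ problem is

    $$\max_{c_t,\,k_{t+1}} u(c_t) + \beta V_{t+1}(k_{t+1}) \quad \text{s.t. } k_{t+1} - k_t = f(k_t) - c_t - \delta k_t, \quad k_t \text{ given}.$$

    Substituting out $c_t$:

    $$\max_{k_{t+1}} u\big(f(k_t) + (1-\delta)k_t - k_{t+1}\big) + \beta V_{t+1}(k_{t+1}) \quad \text{s.t. } k_t \text{ given}.$$

    The solution is $k_{t+1} = \phi_t(k_t)$, and the value function is

    $$V_t(k_t) = u\big(f(k_t) + (1-\delta)k_t - \phi_t(k_t)\big) + \beta V_{t+1}(k_{t+1}).$$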

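  • Period $t = 0$

    $$\max_{c_0,\,k_1} u(c_0) + \beta V_1(k_1) \quad \text{s.t. } k_1 - k_0 = f(k_0) - c_0 - \delta k_0, \quad k_0 \text{ given}.$$

    Substituting out $c_0$:

    $$\max_{k_1} u\big(f(k_0) + (1-\delta)k_0 - k_1\big) + \beta V_1(k_1) \quad \text{s.t. } k_0 \text{ given}.$$

    The solution is $k_1 = \phi_0(k_0)$, and the value function is

    $$V_0(k_0) = u\big(f(k_0) + (1-\delta)k_0 - \phi_0(k_0)\big) + \beta V_1(k_1).$$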

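  • Constructing the optimal path forward from $k_0$

    1. At $t = 0$, given $k_0$: $k_1 = \phi_0(k_0)$ and $c_0 = f(k_0) + (1-\delta)k_0 - \phi_0(k_0)$.

    2. At $t = 1$, given $k_1$: $k_2 = \phi_1(k_1)$ and $c_1 = f(k_1) + (1-\delta)k_1 - \phi_1(k_1)$.

    3. At a general $t$, given $k_t$: $k_{t+1} = \phi_t(k_t)$ and $c_t = f(k_t) + (1-\delta)k_t - \phi_t(k_t)$.

    4. At $T$, given $k_T$: $k_{T+1} = \phi_T(k_T) = 0$, so $c_T = f(k_T) + (1-\delta)k_T - \phi_T(k_T) = f(k_T) + (1-\delta)k_T$.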

  • T


  • I.2 Fundamental recurrence relation

    $$\max_{\{c_t,\,k_{t+1}\}} \sum_{t=0}^{T} \beta^t u(c_t)$$

    subject to

    $$k_{t+1} - k_t = f(k_t) - c_t - \delta k_t \quad \text{for } t = 0, 1, \dots, T, \qquad k_0 = \bar{k}_0, \qquad k_{T+1} \ge 0.$$

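  • Period $T$

    $$V_T(k_T) = \max_{c_T,\,k_{T+1}} u(c_T) \quad \text{s.t. } k_{T+1} - k_T = f(k_T) - c_T - \delta k_T, \quad k_T \text{ given}, \quad k_{T+1} \ge 0.$$

    With $\lambda_T$ the multiplier on the resource constraint and $\mu$ the multiplier on $k_{T+1} \ge 0$, the first-order conditions are

    $$u'(c_T) = \lambda_T, \qquad \lambda_T = \mu,$$

    $$k_{T+1} \ge 0, \quad \mu \ge 0, \quad \text{and} \quad \mu k_{T+1} = 0.$$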

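  • Combining the first-order conditions in period $T$:

    $$u'(c_T) = \mu, \qquad \mu k_{T+1} = 0.$$

    Since $u'(c_T) = \mu > 0$, it follows that $k_{T+1} = \phi_T(k_T) = 0$.

    Hence $c_T = f(k_T) + (1-\delta)k_T - 0$, and the value function in period $T$ is

    $$V_T(k_T) = u\big(f(k_T) + (1-\delta)k_T\big). \tag{1}$$

    This is the boundary condition.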

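  • Period $T-1$

    $$V_{T-1}(k_{T-1}) = \max_{c_{T-1},\,c_T,\,k_T,\,k_{T+1}} u(c_{T-1}) + \beta u(c_T) \tag{2}$$

    $$\text{s.t. } k_{\tau+1} - k_\tau = f(k_\tau) - c_\tau - \delta k_\tau \ \ (\tau = T-1,\, T), \quad k_{T-1} \text{ given}, \quad k_{T+1} \ge 0.$$

    Splitting the maximization into two stages:

    $$V_{T-1}(k_{T-1}) = \max_{c_{T-1},\,k_T} \Big[ u(c_{T-1}) + \beta \max_{c_T,\,k_{T+1}} u(c_T) \Big]$$

    subject to the same constraints.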

  • The inner maximum is $V_T(k_T)$ from (1), so

    $$V_{T-1}(k_{T-1}) = \max_{k_T} \Big\{ u\big(\underbrace{f(k_{T-1}) - k_T + (1-\delta)k_{T-1}}_{c_{T-1}}\big) + \beta V_T(k_T) \Big\} \tag{3}$$

  • Period $T-2$

    $$V_{T-2}(k_{T-2}) = \max_{c_{T-2},\,c_{T-1},\,c_T,\,k_{T-1},\,k_T,\,k_{T+1}} u(c_{T-2}) + \beta u(c_{T-1}) + \beta^2 u(c_T) \tag{4}$$

    $$\text{s.t. } k_{\tau+1} - k_\tau = f(k_\tau) - c_\tau - \delta k_\tau \ \ (\tau = T-2,\, T-1,\, T), \quad k_{T-2} \text{ given}, \quad k_{T+1} \ge 0.$$

    Splitting the maximization into two stages:

    $$V_{T-2}(k_{T-2}) = \max_{c_{T-2},\,k_{T-1}} u(c_{T-2}) + \beta \Big\{ \max_{c_{T-1},\,c_T,\,k_T,\,k_{T+1}} u(c_{T-1}) + \beta u(c_T) \Big\}$$

    subject to the same constraints.

  • The inner maximum is $V_{T-1}(k_{T-1})$ as defined in (2), so

    $$V_{T-2}(k_{T-2}) = \max_{k_{T-1}} \Big\{ u\big(f(k_{T-2}) - k_{T-1} + (1-\delta)k_{T-2}\big) + \beta V_{T-1}(k_{T-1}) \Big\} \tag{5}$$

  • Period $T-3$

    $$V_{T-3}(k_{T-3}) = \max_{c_{T-3},\,c_{T-2},\,c_{T-1},\,c_T,\,k_{T-2},\,k_{T-1},\,k_T,\,k_{T+1}} u(c_{T-3}) + \beta u(c_{T-2}) + \beta^2 u(c_{T-1}) + \beta^3 u(c_T) \tag{6}$$

    $$\text{s.t. } k_{\tau+1} - k_\tau = f(k_\tau) - c_\tau - \delta k_\tau \ \ (\tau = T-3,\, \dots,\, T), \quad k_{T-3} \text{ given}, \quad k_{T+1} \ge 0.$$

  • Splitting the maximization into two stages:

    $$V_{T-3}(k_{T-3}) = \max_{c_{T-3},\,k_{T-2}} u(c_{T-3}) + \beta \Big\{ \max_{c_{T-2},\,c_{T-1},\,c_T,\,k_{T-1},\,k_T,\,k_{T+1}} u(c_{T-2}) + \beta u(c_{T-1}) + \beta^2 u(c_T) \Big\}$$

    subject to the same constraints.

  • The inner maximum is $V_{T-2}(k_{T-2})$ as defined in (4), so

    $$V_{T-3}(k_{T-3}) = \max_{k_{T-2}} \Big\{ u\big(f(k_{T-3}) - k_{T-2} + (1-\delta)k_{T-3}\big) + \beta V_{T-2}(k_{T-2}) \Big\} \tag{7}$$

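  • Period $t$ (general case)

    $$V_t(k_t) = \max_{\{c_\tau,\,k_{\tau+1}\}} \sum_{\tau=t}^{T} \beta^{\tau-t} u(c_\tau)$$

    $$\text{s.t. } k_{\tau+1} - k_\tau = f(k_\tau) - c_\tau - \delta k_\tau \ \ (\text{for } \tau = t, \dots, T), \quad k_t \text{ given}, \quad k_{T+1} \ge 0.$$

    Splitting the maximization into two stages:

    $$V_t(k_t) = \max_{c_t,\,k_{t+1}} \left[ u(c_t) + \beta \max_{\{c_\tau,\,k_{\tau+1}\}} \left\{ \sum_{\tau=t+1}^{T} \beta^{\tau-t-1} u(c_\tau) \right\} \right]$$

    $$\text{s.t. } k_{t+1} - k_t = f(k_t) - c_t - \delta k_t, \quad k_{\tau+1} - k_\tau = f(k_\tau) - c_\tau - \delta k_\tau \ \ (\text{for } \tau = t+1, \dots, T), \quad k_t \text{ given}.$$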

  • The inner maximum is the value function in period $t+1$, so

    $$V_t(k_t) = \max_{k_{t+1}} \Big\{ u\big(f(k_t) - k_{t+1} + (1-\delta)k_t\big) + \beta V_{t+1}(k_{t+1}) \Big\} \tag{8}$$

    This is Bellman's equation.

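  • Summary of the recursive solution

    Initial condition: $k_0 = \bar{k}_0$.

    Boundary condition:

    $$V_T(k_T) = u\big(f(k_T) + (1-\delta)k_T\big)$$

    For $t = 0, 1, 2, \dots, T-1$, the value function satisfies

    $$V_t(k_t) = \max_{k_{t+1}} \Big\{ u\big(f(k_t) - k_{t+1} + (1-\delta)k_t\big) + \beta V_{t+1}(k_{t+1}) \Big\}.$$

    Working backward gives the value functions in the order $V_T(k_T) \to V_{T-1}(k_{T-1}) \to \cdots \to V_0(k_0)$. The policy function in each period solves

    $$\max_{k_{t+1}} \Big\{ u\big(f(k_t) - k_{t+1} + (1-\delta)k_t\big) + \beta V_{t+1}(k_{t+1}) \Big\} \quad (\text{for } t = 0, 1, 2, \dots, T-1).$$

    Given $k_t$, this yields $k_{t+1} = \phi_t(k_t)$ and

    $$c_t = f(k_t) - \underbrace{\phi_t(k_t)}_{k_{t+1}} + (1-\delta)k_t.$$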
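The backward recursion summarized above can be sketched numerically. The following is only an illustrative sketch under assumptions that are not in the slides: log utility, Cobb-Douglas production $f(k) = Ak^\alpha$, $\delta = 1$ by default, an arbitrary capital grid, and the hypothetical function names `backward_induction` and `simulate`. It starts from the boundary condition (1), applies Bellman's equation (8) backward, and then builds the path forward with the stored policy functions.

```python
import math

def backward_induction(T, beta=0.95, alpha=0.3, A=1.0, delta=1.0, n=60):
    """Finite-horizon backward induction on a capital grid (illustrative)."""
    # Assumed grid for k; the bounds are arbitrary choices for this sketch.
    grid = [0.02 + 0.5 * i / (n - 1) for i in range(n)]
    resources = [A * k**alpha + (1 - delta) * k for k in grid]
    # Boundary condition (1): V_T(k) = u(f(k) + (1 - delta) k), since k_{T+1} = 0.
    V_next = [math.log(y) for y in resources]
    policies = []  # filled backward: policy for T-1 first, for 0 last
    for t in range(T - 1, -1, -1):
        V_now, pol = [], []
        for i in range(n):
            best_val, best_j = -math.inf, 0
            for j in range(n):
                c = resources[i] - grid[j]
                if c <= 0:
                    break  # grid is increasing, so larger k' is infeasible too
                val = math.log(c) + beta * V_next[j]  # right side of (8)
                if val > best_val:
                    best_val, best_j = val, j
            V_now.append(best_val)
            pol.append(best_j)
        policies.append(pol)
        V_next = V_now
    policies.reverse()  # policies[t][i] = grid index of k_{t+1} chosen at grid[i]
    return grid, V_next, policies  # V_next is now V_0 on the grid

def simulate(grid, policies, i0):
    """Forward pass: k_{t+1} = phi_t(k_t), starting from grid[i0]."""
    path, i = [grid[i0]], i0
    for pol in policies:
        i = pol[i]
        path.append(grid[i])
    return path

grid, V0, policies = backward_induction(T=5)
print(simulate(grid, policies, i0=10))
```

The terminal choice $k_{T+1} = 0$ is imposed analytically through the boundary condition, so the grid only needs to cover the choices in periods $0$ to $T-1$.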

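  • Optimal policy function

    $$\max_{k_{t+1}} \Big\{ u\big(f(k_t) - k_{t+1} + (1-\delta)k_t\big) + \beta V_{t+1}(k_{t+1}) \Big\} \quad (\text{for } t = 0, 1, 2, \dots, T-1).$$

    The first-order condition with respect to $k_{t+1}$ is

    $$-u'(c_t) + \beta V'_{t+1}(k_{t+1}) = 0 \tag{9}$$

    $$\iff u'(c_t) = \beta V'_{t+1}(k_{t+1}) \tag{10}$$

    $$\iff u'\big(f(k_t) - k_{t+1} + (1-\delta)k_t\big) = \beta V'_{t+1}(k_{t+1}) \tag{11}$$

    $$\iff k_{t+1} = \phi_t(k_t) \tag{12}$$

    so the optimal $k_{t+1}$ is a function of $k_t$.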

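  • Substituting $k_{t+1} = \phi_t(k_t)$ into Bellman's equation:

    $$V_t(k_t) = u\big(f(k_t) - \phi_t(k_t) + (1-\delta)k_t\big) + \beta V_{t+1}\big(\phi_t(k_t)\big) \quad (\text{for } t = 0, 1, 2, \dots, T-1).$$

    Differentiating with respect to $k_t$:

    $$V'_t(k_t) = u'(c_t)\big[f'(k_t) - \phi'_t(k_t) + (1-\delta)\big] + \beta V'_{t+1}(k_{t+1})\,\phi'_t(k_t)$$

    $$\iff V'_t(k_t) = u'(c_t)\big[f'(k_t) + (1-\delta)\big] + \phi'_t(k_t)\big[-u'(c_t) + \beta V'_{t+1}(k_{t+1})\big]$$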

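  • By (10), the last bracket is zero (envelope theorem), so

    $$V'_t(k_t) = u'(c_t)\big[f'(k_t) + (1-\delta)\big].$$

    Shifting this one period forward and substituting into (10) gives the Euler equation:

    $$u'(c_t) = \beta u'(c_{t+1})\big[f'(k_{t+1}) + (1-\delta)\big]$$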

  • II. Dynamic programming: infinite horizon

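  • II.1 The problem

    $$\max_{\{c_t,\,k_{t+1}\}} \sum_{t=0}^{\infty} \beta^t u(c_t)$$

    subject to

    $$k_{t+1} - k_t = f(k_t) - c_t - \delta k_t, \qquad k_0 = \bar{k}_0.$$

    Here $0 < \beta < 1$ is the discount factor, $u(\cdot)$ is the utility function, and $f(\cdot)$ is the production function, with

    $$u(0) = 0,\ u'(\cdot) > 0,\ u''(\cdot) < 0,\ u'(0) = \infty,\ u'(\infty) = 0,$$

    $$f(0) = 0,\ f'(\cdot) > 0,\ f''(\cdot) < 0,\ f'(0) = \infty,\ f'(\infty) = 0.$$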

  • Value function

    $$V_0(k_0) = \max_{\{c_t,\,k_{t+1}\}} \sum_{t=0}^{\infty} \beta^t u(c_t) \tag{13}$$

    $$\text{s.t. } k_{t+1} - k_t = f(k_t) - c_t - \delta k_t, \quad k_0 \text{ given}.$$

    $V_0(k_0)$ is also called the indirect utility function.

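  • Rewriting (13):

    $$V_0(k_0) = \max_{c_0,\,k_1} \Big\{ u(c_0) + \beta \max_{\{c_\tau,\,k_{\tau+1}\}} \sum_{\tau=1}^{\infty} \beta^{\tau-1} u(c_\tau) \Big\} \tag{14}$$

    $$\text{s.t. } k_{t+1} - k_t = f(k_t) - c_t - \delta k_t, \quad k_0 \text{ given}.$$

    The inner maximum is the value function from $t = 1$ onward, so

    $$V_0(k_0) = \max_{c_0,\,k_1} \Big\{ u(c_0) + \beta V_1(k_1) \Big\} \tag{15}$$

    $$\text{s.t. } k_1 - k_0 = f(k_0) - c_0 - \delta k_0, \quad k_0 \text{ given}.$$

    Substituting out $c_0$, $V_0(k_0)$ becomes

    $$V_0(k_0) = \max_{k_1} \Big\{ u\big(f(k_0) - k_1 + (1-\delta)k_0\big) + \beta V_1(k_1) \Big\} \tag{16}$$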

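  • Period $t$ (general case)

    $$V_t(k_t) = \max_{\{c_\tau,\,k_{\tau+1}\}} \sum_{\tau=t}^{\infty} \beta^{\tau-t} u(c_\tau) \tag{17}$$

    $$\text{s.t. } k_{\tau+1} - k_\tau = f(k_\tau) - c_\tau - \delta k_\tau, \quad k_t \text{ given}.$$

    Splitting the maximization into two stages:

    $$V_t(k_t) = \max_{c_t,\,k_{t+1}} \left[ u(c_t) + \beta \max_{\{c_\tau,\,k_{\tau+1}\}} \left\{ \sum_{\tau=t+1}^{\infty} \beta^{\tau-t-1} u(c_\tau) \right\} \right]$$

    $$\text{s.t. } k_{t+1} - k_t = f(k_t) - c_t - \delta k_t, \quad k_t \text{ given}.$$

    Substituting out $c_t$, $V_t(k_t)$ satisfies the Bellman equation

    $$V_t(k_t) = \max_{k_{t+1}} \Big\{ u\big(f(k_t) - k_{t+1} + (1-\delta)k_t\big) + \beta V_{t+1}(k_{t+1}) \Big\} \tag{18}$$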

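  • The problem in (17) has the same form for every $t$, so the value function does not depend on $t$ (time-invariant value function):

    $$V(\cdot) = V_t(\cdot).$$

    With $V(\cdot)$, the Bellman equation is

    $$V(k_t) = \max_{k_{t+1}} \Big\{ u\big(f(k_t) - k_{t+1} + (1-\delta)k_t\big) + \beta V(k_{t+1}) \Big\}.$$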

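  • II.2 Value function and policy function

    With an infinite horizon there is no terminal period $T$ from which to work backward to the value function and policy function. Instead, the value function $V(\cdot)$ is found as the solution of the Bellman equation

    $$V(k_t) = \max_{k_{t+1}} \Big\{ u\big(f(k_t) - k_{t+1} + (1-\delta)k_t\big) + \beta V(k_{t+1}) \Big\}. \tag{19}$$

    The unknown is the function $V(\cdot)$ itself.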

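  • (i) Value function

    Results from functional analysis show that the value function $V(\cdot)$ exists and can be obtained by value function iteration.

    1. The functional equation (19) has a unique solution.

    2. Writing $\tilde{k} = k_{t+1}$ and $k = k_t$, the solution is the limit as $j \to \infty$ of $V^j(\cdot)$ generated by the following iteration, starting from an initial $V^0$:

    $$V^{j+1}(k) = \max_{\tilde{k}} \Big\{ u\big(f(k) - \tilde{k} + (1-\delta)k\big) + \beta V^j(\tilde{k}) \Big\} \quad \text{s.t. } k \text{ given}.$$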

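  • (ii) Policy function

    The first-order condition of (19) with respect to $k_{t+1}$ is

    $$-u'(c_t) + \beta V'(k_{t+1}) = 0 \iff u'(c_t) = \beta V'(k_{t+1}) \tag{20}$$

    $$\iff u'(c_t) = \beta V'\big(f(k_t) - c_t + (1-\delta)k_t\big) \tag{21}$$

    Because $u(\cdot)$, $f(\cdot)$, and $V(\cdot)$ do not depend on time, the policy function is also time invariant (time-invariant policy function):

    $$k_{t+1} = \phi(k_t) \tag{22}$$

    In each period $t$, given $k_t$, consumption $c_t$ is

    $$c_t = f(k_t) - \phi(k_t) + (1-\delta)k_t \tag{23}$$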

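  • (iii) Euler equation

    Substituting the policy function via (23), $c_t = f(k_t) - \phi(k_t) + (1-\delta)k_t$, into the Bellman equation:

    $$V(k_t) = u\big(f(k_t) - \phi(k_t) + (1-\delta)k_t\big) + \beta V(k_{t+1}), \qquad k_{t+1} = \phi(k_t).$$

    Differentiating with respect to $k_t$:

    $$V'(k_t) = u'(c_t)\big[f'(k_t) - \phi'(k_t) + (1-\delta)\big] + \beta V'(k_{t+1})\,\phi'(k_t)$$

    $$\iff V'(k_t) = u'(c_t)\big[f'(k_t) + (1-\delta)\big] - \phi'(k_t)\big[u'(c_t) - \beta V'(k_{t+1})\big]$$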

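  • By (20), the last bracket is zero (envelope theorem), so

    $$V'(k_t) = u'(c_t)\big[f'(k_t) + (1-\delta)\big] \tag{24}$$

    Shifting (24) one period forward and substituting into (20), $u'(c_t) = \beta V'(k_{t+1})$:

    $$\iff u'(c_t) = \beta u'(c_{t+1})\big\{f'(k_{t+1}) + (1-\delta)\big\}$$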

  • II.3 Solution methods

    Three ways to find the value function:

    (i) Value function iteration

    (ii) Howard's improvement algorithm

    (iii) Guess and verify

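  • (i) Value function iteration

    1. Write $\tilde{k} = k_{t+1}$ and $k = k_t$, and choose an initial $V^0$.

    2. Compute the next iterate (iteration):

    $$V^{j+1}(k) = \max_{\tilde{k}} \Big\{ u\big(f(k) - \tilde{k} + (1-\delta)k\big) + \beta V^j(\tilde{k}) \Big\} \quad \text{s.t. } k \text{ given}.$$

    3. Set $j = j + 1$.

    4. Repeat until $V^j$ converges. This procedure is called value function iteration, or iterating on the Bellman equation.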
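The four steps above can be sketched on a discrete capital grid. This is only a minimal illustration under assumptions not fixed by the slides at this point: log utility, $f(k) = Ak^\alpha$, $\delta = 1$, arbitrary parameter values and grid bounds, $V^0 = 0$, and the hypothetical function name `value_function_iteration`. For this special case the exact policy is known to be $k_{t+1} = \alpha\beta A k_t^\alpha$, which gives a convenient check on the grid result.

```python
import math

def value_function_iteration(beta=0.9, alpha=0.3, A=1.0, n=80,
                             tol=1e-6, max_iter=1000):
    """Iterate on the Bellman equation over a capital grid (illustrative)."""
    grid = [0.02 + 0.4 * i / (n - 1) for i in range(n)]  # assumed grid
    output = [A * k**alpha for k in grid]                # f(k), with delta = 1
    V = [0.0] * n                                        # step 1: initial V^0
    policy = [grid[0]] * n
    for _ in range(max_iter):
        V_new, policy = [], []
        for i in range(n):
            best_val, best_j = -math.inf, 0
            for j in range(n):                           # candidate next-period capital
                c = output[i] - grid[j]
                if c <= 0:
                    break                                # larger choices are infeasible
                val = math.log(c) + beta * V[j]          # step 2: Bellman operator
                if val > best_val:
                    best_val, best_j = val, j
            V_new.append(best_val)
            policy.append(grid[best_j])
        gap = max(abs(a - b) for a, b in zip(V_new, V))
        V = V_new                                        # step 3: j = j + 1
        if gap < tol:                                    # step 4: stop at convergence
            break
    return grid, V, policy

grid, V, policy = value_function_iteration()
exact = [0.3 * 0.9 * 1.0 * k**0.3 for k in grid]
print(max(abs(p - e) for p, e in zip(policy, exact)))   # grid error in the policy
```

Because the Bellman operator is a contraction with modulus $\beta$, the sup-norm gap between successive iterates shrinks geometrically, which is why a simple tolerance on that gap works as a stopping rule.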

  • (ii) Howard's improvement algorithm

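  • (iii) Guess and verify

    Assume $u(c) = \log(c)$ and $f(k) = Ak^\alpha$ with $0 < \alpha < 1$ and $A > 0$, and full depreciation $\delta = 1$. The Bellman equation (19) becomes

    $$V(k_t) = \max_{k_{t+1}} \Big\{ \log\big(Ak_t^\alpha - k_{t+1}\big) + \beta V(k_{t+1}) \Big\} \tag{25}$$

    The unknown is $V(\cdot)$. Guess that $V(\cdot)$ takes the form

    $$V(k_t) = E + F \log(k_t) \tag{26}$$

    where $E$ and $F$ are undetermined coefficients.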

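  • Policy function under the guess

    The first-order condition of (25) is

    $$-\frac{1}{c_t} + \beta V'(k_{t+1}) = 0 \iff \frac{1}{c_t} = \beta V'(k_{t+1}) \tag{27}$$

    $$\iff -\frac{1}{c_t} + \beta F \frac{1}{k_{t+1}} = 0$$

    $$\iff \frac{1}{c_t} = \beta F \frac{1}{Ak_t^\alpha - c_t}$$

    $$\iff c_t = \frac{Ak_t^\alpha - c_t}{\beta F}$$

    $$\iff \left(1 + \frac{1}{\beta F}\right) c_t = \frac{Ak_t^\alpha}{\beta F}$$

    $$\iff c_t = \frac{Ak_t^\alpha}{1 + \beta F}.$$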

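  • Policy function for $k_{t+1}$

    $$k_{t+1} = Ak_t^\alpha - c_t$$

    $$\iff k_{t+1} = Ak_t^\alpha - \frac{Ak_t^\alpha}{1 + \beta F}$$

    $$\iff k_{t+1} = \frac{\beta F}{1 + \beta F} Ak_t^\alpha$$

    From (24) with $\delta = 1$, combined with (20):

    $$V'(k_t) = \beta V'(k_{t+1})\,\alpha A k_t^{\alpha-1}$$

    $$\iff V'(k_t) = \frac{1}{c_t}\,\alpha A k_t^{\alpha-1} \quad \text{(by (27))}$$

    $$\iff V'(k_t) = \frac{\alpha A k_t^{\alpha-1}}{\dfrac{Ak_t^\alpha}{1+\beta F}} \quad \text{(using the policy function for } c_t\text{)}$$

    $$\iff V'(k_t) = \alpha(1 + \beta F)\,k_t^{-1} \tag{28}$$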

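  • Differentiating the guess (26) with respect to $k_t$:

    $$V'(k_t) = F k_t^{-1} \tag{29}$$

    Comparing (28) and (29):

    $$F = \alpha(1 + \beta F) \iff F = \frac{\alpha}{1 - \alpha\beta}$$

    The value function and policy functions are therefore

    $$V(k_t) = E + \frac{\alpha}{1 - \alpha\beta} \log(k_t) \tag{30}$$

    $$c_t = (1 - \alpha\beta) A k_t^\alpha \tag{31}$$

    $$k_{t+1} = \alpha\beta A k_t^\alpha \tag{32}$$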
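The "verify" half of guess and verify can also be checked numerically. The sketch below is illustrative only. The parameter values are arbitrary assumptions, the helper names `V`, `bellman_rhs`, and `check` are hypothetical, and the constant $E$ is not derived in the slides: here it is obtained by matching the constant terms on both sides of (25), which gives $E = \big[\log((1-\alpha\beta)A) + \beta F \log(\alpha\beta A)\big]/(1-\beta)$. The code confirms that the guess (26) reproduces itself through the Bellman equation and that the maximizer is close to (32).

```python
import math

beta, alpha, A = 0.9, 0.3, 1.0      # illustrative parameter values (assumed)
F = alpha / (1 - alpha * beta)      # coefficient found by comparing (28) and (29)
# E from matching constant terms in (25); an added step, not from the slides
E = (math.log((1 - alpha * beta) * A)
     + beta * F * math.log(alpha * beta * A)) / (1 - beta)

def V(k):
    """Guessed value function (26)."""
    return E + F * math.log(k)

def bellman_rhs(k, k_next):
    """Objective inside the max in (25) for a candidate k_{t+1}."""
    return math.log(A * k**alpha - k_next) + beta * V(k_next)

def check(k, m=2000):
    """Grid-search the maximizer and evaluate (25) at the policy (32)."""
    y = A * k**alpha
    candidates = [y * (i + 1) / (m + 1) for i in range(m)]  # k' in (0, y)
    best = max(candidates, key=lambda kn: bellman_rhs(k, kn))
    return best, bellman_rhs(k, alpha * beta * y)

for k in (0.1, 0.5, 2.0):
    best, rhs = check(k)
    print(k, best, alpha * beta * A * k**alpha, rhs - V(k))
```

The last printed column is the gap between the two sides of the Bellman equation at the closed-form policy, which should be zero up to floating-point error.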

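  • Dynamics of $k_t$

    Given $k_t$ in period $t$, $k_{t+1}$ follows from the policy function $k_{t+1} = \alpha\beta A k_t^\alpha$. Taking logs:

    $$\log k_{t+1} = \log(\alpha\beta A) + \alpha \log k_t \tag{33}$$

    Since $|\alpha| < 1$, $k_t$ converges as $t \to \infty$ to the steady state $k^*$, which satisfies

    $$k^* = \alpha\beta A (k^*)^\alpha \iff k^* = (\alpha\beta A)^{\frac{1}{1-\alpha}}$$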
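The convergence claim can be illustrated by iterating the policy function directly. This is a small sketch with assumed parameter values and the hypothetical helper names `capital_path` and `steady_state`.

```python
def capital_path(k0, periods, alpha=0.3, beta=0.9, A=1.0):
    """Iterate k_{t+1} = alpha * beta * A * k_t**alpha from k0."""
    path = [k0]
    for _ in range(periods):
        path.append(alpha * beta * A * path[-1] ** alpha)
    return path

def steady_state(alpha=0.3, beta=0.9, A=1.0):
    """Closed-form steady state k* = (alpha * beta * A)**(1 / (1 - alpha))."""
    return (alpha * beta * A) ** (1 / (1 - alpha))

k_star = steady_state()
for k0 in (0.01, 1.0):
    path = capital_path(k0, 60)
    print(k0, path[-1], k_star)
```

In logs the deviation from the steady state shrinks by the factor $\alpha$ each period, so paths starting below and above $k^*$ both settle on it quickly.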

  • III


  • (functional analysis)


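  • General formulation

    Let $0 < \beta < 1$ and let $r(\cdot, \cdot)$ be the return function. Choose a sequence of control variables $\{u_t\}_{t=0}^{\infty}$ to solve

    $$\max_{\{u_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t r(x_t, u_t) \tag{34}$$

    $$\text{s.t. } x_{t+1} = g(x_t, u_t), \quad x_0 \text{ given}.$$

    Here $r(\cdot, \cdot)$ is assumed concave, $x_{t+1} = g(x_t, u_t)$ is the transition equation for the state $x_t$, and the set $\{(x_{t+1}, x_t) : x_{t+1} \le g(x_t, u_t)\}$ is assumed convex and compact.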

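  • The solution takes the form of a time-invariant policy function $h$. Here $h$ is a mapping from the state variables $x_t$ to the control variables $u_t$:

    $$u_t = h(x_t) \tag{35}$$

    $$x_{t+1} = g(x_t, u_t) \tag{36}$$

    With $x_0$ given, (35) and (36) generate $\{u_t\}_{t=0}^{\infty}$ recursively.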

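  • Value function

    $$V_0(x_0) = \max_{\{u_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t r(x_t, u_t) \tag{37}$$

    $$\text{s.t. } x_{t+1} = g(x_t, u_t), \quad x_0 \text{ given}.$$

    The value function can be rewritten as

    $$V_0(x_0) = \max_{u_0} \Big\{ r(x_0, u_0) + \beta \max_{\{u_\tau\}_{\tau=1}^{\infty}} \sum_{\tau=1}^{\infty} \beta^{\tau-1} r(x_\tau, u_\tau) \Big\}$$

    $$\text{s.t. } x_{t+1} = g(x_t, u_t), \quad x_0 \text{ given}.$$

    By the definition in (37), the inner maximum is $V_1(x_1)$:

    $$V_0(x_0) = \max_{u_0} \Big\{ r(x_0, u_0) + \beta V_1(x_1) \Big\} \quad \text{s.t. } x_{t+1} = g(x_t, u_t), \quad x_0 \text{ given}.$$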

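  • Period $t$ (general case)

    $$V_t(x_t) = \max_{u_t} \Big\{ r(x_t, u_t) + \beta V_{t+1}(x_{t+1}) \Big\} \quad \text{s.t. } x_{t+1} = g(x_t, u_t), \quad x_t \text{ given}.$$

    The problem in (37) has the same form for every $t$, so the value function is time-invariant: $V_t(\cdot) = V(\cdot)$. Writing $\tilde{x} = x_{t+1}$, $x = x_t$, and $u = u_t$, $V(\cdot)$ satisfies

    $$V(x) = \max_{u} \Big\{ r(x, u) + \beta V(\tilde{x}) \Big\} \tag{38}$$

    $$\text{s.t. } \tilde{x} = g(x, u), \quad x \text{ given}.$$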

  • Once the value function $V(\cdot)$ is known, the policy function is obtained by solving

    $$\max_{u} \Big\{ r(x, u) + \beta V(\tilde{x}) \Big\} \quad \text{s.t. } \tilde{x} = g(x, u), \quad x \text{ given}.$$

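  • Relation between the value function $V(\cdot)$ and the policy function $h(\cdot)$

    $$V(x) = \max_{u} \Big\{ r(x, u) + \beta V(\tilde{x}) \Big\} \tag{39}$$

    $$\text{s.t. } \tilde{x} = g(x, u), \quad x \text{ given}.$$

    Substituting the policy function $u = h(x)$ and $\tilde{x} = g(x, u)$ gives an expression in $x$ alone:

    $$V(x) = r\big(x, h(x)\big) + \beta V\big(g(x, h(x))\big). \tag{40}$$

    This is a functional equation in the unknown functions $V(\cdot)$ and $h(\cdot)$.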

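  • Properties of the solution

    1. The functional equation (39) has a unique solution.

    2. The solution is the limit as $j \to \infty$ of $V^j(\cdot)$ generated by the following iteration, starting from an initial $V^0$:

    $$V^{j+1}(x) = \max_{u} \Big\{ r(x, u) + \beta V^j(\tilde{x}) \Big\} \quad \text{s.t. } \tilde{x} = g(x, u), \quad x \text{ given}.$$

    3. The first-order condition of (39) is

    $$\frac{\partial r(x, u)}{\partial u} + \beta V'\big(g(x, u)\big) \frac{\partial g(x, u)}{\partial u} = 0 \tag{41}$$

    and it gives a unique time-invariant policy function $h(\cdot)$.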

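  • 4. Differentiating the value function in (40) gives

    $$V'(x) = \frac{\partial r\big(x, h(x)\big)}{\partial x} + \beta \frac{\partial g\big(x, h(x)\big)}{\partial x} V'\big(g(x, h(x))\big). \tag{42}$$

    This is the Benveniste and Scheinkman formula. When $\tilde{x} = g(u)$, so that $\dfrac{\partial g}{\partial x} = 0$, it reduces to

    $$V'(x) = \frac{\partial r\big(x, h(x)\big)}{\partial x}. \tag{43}$$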

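  • 5. Euler equation

    When $\tilde{x} = g(u)$, the first-order condition (41) becomes

    $$\frac{\partial r(x, u)}{\partial u} + \beta V'(\tilde{x}) \frac{\partial g(u)}{\partial u} = 0 \tag{44}$$

    Substituting (43), evaluated at $\tilde{x}$:

    $$\frac{\partial r(x, u)}{\partial u} + \beta \frac{\partial r\big(\tilde{x}, h(\tilde{x})\big)}{\partial \tilde{x}} \frac{\partial g(u)}{\partial u} = 0. \tag{45}$$

    This is the Euler equation.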

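  • With time subscripts, the Euler equation reads

    $$\frac{\partial r(x_t, u_t)}{\partial u_t} + \beta \frac{\partial r\big(x_{t+1}, h(x_{t+1})\big)}{\partial x_{t+1}} \frac{\partial g(u_t)}{\partial u_t} = 0$$

    or, using $u_{t+1} = h(x_{t+1})$,

    $$\frac{\partial r(x_t, u_t)}{\partial u_t} + \beta \frac{\partial r(x_{t+1}, u_{t+1})}{\partial x_{t+1}}\, g'(u_t) = 0.$$

    In the growth model, $x_t = k_t$, $u_t = k_{t+1}$, $r(x_t, u_t) = u\big(f(k_t) - k_{t+1} + (1-\delta)k_t\big)$, and $g(u_t) = k_{t+1}$, so

    $$-u'(c_t) + \beta u'(c_{t+1})\big\{f'(k_{t+1}) + (1-\delta)\big\} = 0.$$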
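As a quick sanity check on this Euler equation, one can plug in the closed-form policy from the guess-and-verify example ($u(c) = \log c$, $f(k) = Ak^\alpha$, $\delta = 1$) and confirm that the residual is zero. The sketch below does only that. The parameter values are arbitrary assumptions and `euler_residual` is a hypothetical helper name.

```python
def euler_residual(k, alpha=0.3, beta=0.9, A=1.0):
    """Left side of -u'(c_t) + beta * u'(c_{t+1}) * f'(k_{t+1}) with delta = 1."""
    c = (1 - alpha * beta) * A * k**alpha            # c_t from (31)
    k_next = alpha * beta * A * k**alpha             # k_{t+1} from (32)
    c_next = (1 - alpha * beta) * A * k_next**alpha  # c_{t+1} from (31)
    f_prime = alpha * A * k_next ** (alpha - 1)      # f'(k_{t+1})
    return -1 / c + beta * (1 / c_next) * f_prime

for k in (0.05, 0.2, 1.0, 3.0):
    print(k, euler_residual(k))
```

The residual vanishes for every $k$, not just at the steady state, because the Euler equation must hold along the whole optimal path.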

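  • (i) Derivation of (42) (Benveniste and Scheinkman (1979)) (1/2)

    Differentiating (40) with respect to $x$:

    $$V'(x) = \frac{\partial r\big(x, h(x)\big)}{\partial x} + \frac{\partial r\big(x, h(x)\big)}{\partial u} \frac{\partial h(x)}{\partial x} + \beta V'\big(g(x, h(x))\big) \left\{ \frac{\partial g\big(x, h(x)\big)}{\partial x} + \frac{\partial g\big(x, h(x)\big)}{\partial u} \frac{\partial h(x)}{\partial x} \right\}$$

    Collecting the terms in $\partial h(x) / \partial x$:

    $$V'(x) = \frac{\partial r\big(x, h(x)\big)}{\partial x} + \beta V'\big(g(x, h(x))\big) \frac{\partial g\big(x, h(x)\big)}{\partial x} + \left[ \frac{\partial r\big(x, h(x)\big)}{\partial u} + \beta V'\big(g(x, h(x))\big) \frac{\partial g\big(x, h(x)\big)}{\partial u} \right] \frac{\partial h(x)}{\partial x}.$$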

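  • Derivation of (42) (2/2)

    By the first-order condition (41), the term in square brackets is zero (envelope theorem), which gives (42):

    $$V'(x) = \frac{\partial r\big(x, h(x)\big)}{\partial x} + \beta V'\big(g(x, h(x))\big) \frac{\partial g\big(x, h(x)\big)}{\partial x}.$$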

  • III.2 Value function: solution methods

    There are three methods for obtaining the value function and policy function:

    (i) Value function iteration

    (ii) Howard's improvement algorithm

    (iii) Guess and verify

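  • (i) Value function iteration

    1. Choose an initial $V^0$.

    2. Compute the next iterate (iteration):

    $$V^{j+1}(x) = \max_{u} \Big\{ r(x, u) + \beta V^j(\tilde{x}) \Big\} \quad \text{s.t. } \tilde{x} = g(x, u), \quad x \text{ given}.$$

    3. Set $j = j + 1$.

    4. Repeat until $V^j$ converges. This procedure is called value function iteration, or iterating on the Bellman equation.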

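  • (ii) Howard's improvement algorithm

    1. Start from an initial policy function $u = h_0(x)$ and compute the value of following $h_j$ forever:

    $$V_{h_j}(x) = \sum_{t=0}^{\infty} \beta^t r\big(x_t, h_j(x_t)\big), \quad \text{s.t. } x_{t+1} = g\big(x_t, h_j(x_t)\big), \quad x_0 \text{ given}.$$

    2. Find a new policy function $u = h_{j+1}(x)$ that solves

    $$\max_{u} \Big\{ r(x, u) + \beta V_{h_j}\big(g(x, u)\big) \Big\}.$$

    3. Set $j = j + 1$.

    4. Repeat steps 1, 2, and 3 until $h_j(\cdot)$ converges.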
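The steps of Howard's improvement algorithm can be sketched on a grid for the growth model with $x = k$ and $u = k_{t+1}$. This is a rough illustration, not the lecture's implementation. It assumes log utility, $f(k) = Ak^\alpha$, $\delta = 1$, arbitrary parameters and grid, an initial policy that always picks the lowest grid point, and policy evaluation by repeated substitution rather than a linear solve. The function name `howard_improvement` is hypothetical.

```python
import math

def howard_improvement(beta=0.9, alpha=0.3, A=1.0, n=80, max_iter=100):
    """Policy iteration on a capital grid (illustrative sketch)."""
    grid = [0.02 + 0.4 * i / (n - 1) for i in range(n)]  # assumed grid
    output = [A * k**alpha for k in grid]                # f(k), with delta = 1

    def reward(i, j):
        c = output[i] - grid[j]
        return math.log(c) if c > 0 else -math.inf

    policy = [0] * n          # step 1: initial policy h_0 (lowest grid point)
    V = [0.0] * n
    for _ in range(max_iter):
        # Step 1: value of following the current policy forever
        V = [0.0] * n
        for _ in range(2000):
            V_new = [reward(i, policy[i]) + beta * V[policy[i]] for i in range(n)]
            diff = max(abs(a - b) for a, b in zip(V_new, V))
            V = V_new
            if diff < 1e-10:
                break
        # Step 2: improved policy given V_{h_j}
        new_policy = [max(range(n), key=lambda j, i=i: reward(i, j) + beta * V[j])
                      for i in range(n)]
        if new_policy == policy:   # step 4: stop when the policy is unchanged
            break
        policy = new_policy        # step 3: j = j + 1
    return grid, V, [grid[j] for j in policy]

grid, V, policy = howard_improvement()
exact = [0.3 * 0.9 * 1.0 * k**0.3 for k in grid]
print(max(abs(p - e) for p, e in zip(policy, exact)))
```

Policy iteration usually needs far fewer outer rounds than value function iteration, at the cost of a full policy evaluation in each round.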

  • (iii) Guess and verify

