56
Introduction Maximum principle Extensions to stochastic games Applications Conclusions and more extensions Maximum principle for one kind of discrete-time stochastic optimal control problem and its applications Zhen WU School of Mathematics, Shandong University Joint work with Dr. Feng ZHANG (SDUFE) 1 / 56

Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Maximum principle for one kind of discrete-timestochastic optimal control problem and its

applications

Zhen WU

School of Mathematics, Shandong University

Joint work with Dr. Feng ZHANG (SDUFE)

1 / 56

Page 2: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Outline

1 Introduction

2 Maximum principleProblem formulation and preliminariesMaximum principle

3 Extensions to stochastic gamesMP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

4 ApplicationsApplication to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

5 Conclusions and more extensions

2 / 56

Page 3: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Outline

1 Introduction

2 Maximum principleProblem formulation and preliminariesMaximum principle

3 Extensions to stochastic gamesMP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

4 ApplicationsApplication to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

5 Conclusions and more extensions

3 / 56

Page 4: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Stochastic MP

The maximum principle (MP ) is a milestone of optimal controltheory.

For stochastic MP, see e.g. J.M.Bismut (1978), A.Bensoussan(1981), U.G.Haussmann (1986), S.G.Peng (1990), J.M.Yong(2010) and Z.Wu (2013).

4 / 56

Page 5: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Discrete-time stochastic optimal control problems

Discrete-time optimal control problems are as important ascontinuous-time ones.

Difference equations are in their own right importantmathematical models.In many situations, it is sufficient or natural to describe asystem by a discrete-time model.

signal values are sometimes only available at certain times.discretization of the dynamics of continuous-time problems.computer manipulation.

Difficulties in studying discrete-time stochastic optimalcontrol problems:

differentiation and integration operations no longer work;well-known tools of Ito’s Calculus do not necessarily hold (Ito’sformula, BDG inequality, etc.).

5 / 56

Page 6: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Works for discrete-time stochastic optimal controls

To turn the original problem into a deterministic one and thenuse a matrix version of MP:A.Beghi and D.D’ Alessandro (J. Optim. Theory Appl., 1998),M.Ait Rami, X.Chen and X.Y.Zhou (J. Global Optim., 2002).

To use the completion of squares for LQ problems:J.B.Moore, X.Y.Zhou and A.E.B.Lim (Syst. Control Lett.,1999), R.J.Elliott, X.Li and Y.H.Ni (Automatica, 2013).

Oswaldo L.V.Costa, D.Li, J.F.Zhang, H.S.Zhang, W.H.Zhang,. . .

6 / 56

Page 7: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Comments on one literature

X.Y.Lin and W.H.Zhang, A maximum principle for optimal control of

discrete-time stochastic systems with multiplicative noise, IEEE

Transactions on Automatic Control, 60, 1121–1126, 2015.Main contributions:

necessary as well as sufficient conditions are established for onekind of discrete-time stochastic optimal control problem;one kind of new backward stochastic difference equation isintroduced to be the adjoint equation.

Some deficiencies:the assumptions are strict, excluding the quadratic costfunctional;the convex perturbation is not appropriately defined;the necessary condition does not necessarily hold for generalconvex control domains;the square-integrability of the solution to the adjointequation is not guaranteed.

7 / 56

Page 8: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Results and contributions of this talk

This talk considers one kind of discrete-time stochastic optimalcontrol problem with convex control domains.

Main results:

necessary MPsufficient MP (verification theorem)extensions and applications

Main contributions:

discrete-time stochastic MP in a rigorous versionproof in a clear and concise wayroad for further related topics

8 / 56

Page 9: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Outline

1 Introduction

2 Maximum principleProblem formulation and preliminariesMaximum principle

3 Extensions to stochastic gamesMP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

4 ApplicationsApplication to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

5 Conclusions and more extensions

9 / 56

Page 10: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Notations

(Ω,F ,P): a probability space

For fixed N ∈ N, T := 0, 1, · · · , N − 1, T := 1, 2, · · · , N.

Noises: Wk = (W 1k , · · · ,W d

k )> ∈ Rd, k ∈ T .

Filtration F0,F1, · · · ,FN ⊂ F :

F0 = ∅,Ω, Fk = σW0,W1, · · · ,Wk−1, k ∈ T.

10 / 56

Page 11: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Noises and admissible controls

Assume that for some α > 2,

E [|Wk|α|Fk] = C(α) <∞, ∀k ∈ T . (1)

Let U0, · · · , UN−1 be nonempty convex subsets of Rm.u = u0, · · · , uN−1 is called an admissible control, if

uk is an Uk-valued, Fk-measurable r.v. for each k ∈ T ;for some β > 2,

EN−1∑k=0

|uk|β <∞. (2)

Denote by U the set of admissible controls.

11 / 56

Page 12: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Problem formulation

The discrete-time stochastic optimal control problem, Problem (OCP),is to find u∗ = u∗0, · · · , u∗N−1 ∈ U such that

J(u∗) = infu∈U

J(u),

where the state X is defined byXk+1 = b(k,Xk, uk) + σ(k,Xk, uk)Wk, k ∈ T ,X0 = x0 ∈ Rn, (3)

and the cost J is given by

J(u) = E

N−1∑k=0

l(k,Xk, uk) + h(XN )

.

Such u∗ is called an optimal control; the corresponding system state is

denoted by X∗ = X∗k , k ∈ T.12 / 56

Page 13: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Assumptions

The coefficients are defined by

b(k, ·, ·, ·) : Rn ×Uk ×Ω→ Rn, σ(k, ·, ·, ·) : Rn ×Uk ×Ω→ Rn×d,l(k, ·, ·, ·) : Rn × Uk × Ω→ R and h(·, ·) : Rn × Ω→ R for k ∈ T .

(H1) b(k, x, v) and σ(k, x, v) are Fk-measurable for any (k, x, v),and continuously differentiable in (x, v) with bounded derivatives.In addition, |b(k, 0, 0)|+ |σ(k, 0, 0)| ≤ C for some C > 0.

(H2) l(k, x, v) is Fk-measurable and h(x) is FN -measurable forany (k, x, v); (l, h) are continuously differentiable in (x, v), and thederivatives are bounded by C(1 + |x|+ |v|). Besides,

E[∑N−1

k=0 |l(k, 0, 0)|+ |h(0)|]<∞.

13 / 56

Page 14: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Solvability of the state equation

Set γ = minα, β(> 2).

Lemma 1. Under (H1), Eq. (3) admits a unique solutionX = Xk, k ∈ T such that Xk ∈ Lγ(Ω,Fk,P;Rn) for each k.Moreover, there exists C = C(α, γ,N) > 0 satisfying

EN∑k=1

|Xk|γ ≤ CE

1 + |x0|γ +

N−1∑k=0

|uk|γ. (4)

Proof. The existence of X is obvious. We only need to check(4). In fact, by (H1),

E|Xk+1|γ ≤C(γ)E[

1 + |Xk|γ + |uk|γ]

+[1 + |Xk|γ |+ |uk|γ

]|Wk|γ

, k ∈ T .

14 / 56

Page 15: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Solvability of the state equation

By (1), Holder’s inequality yields E [|Wk|γ |Fk] ≤ C(α, γ) =: C1. Then

E [|Xk|γ |Wk|γ ] = E |Xk|γE [|Wk|γ |Fk] ≤ C1E|Xk|γ .

Similarly, E [|uk|γ |Wk|γ ] ≤ C1E|uk|γ . So, there exists C2 := C2(α, γ) s.t.

E|Xk+1|γ ≤ C2E 1 + |Xk|γ + |uk|γ , k ∈ T .

Then, by induction we get for k ∈ T,

E|Xk|γ ≤k∑j=1

(C2)j + (C2)k|x0|γ +

k−1∑j=0

(C2)k−jE|uj |γ .

Thus we can draw the conclusion.

Remark. The conditions (1) and (2) are not sufficient to get the

Lr-estimate of Xk (k ≥ 2) for r > γ. In fact, if ξ ∈ Lp and η ∈ Lq, then

generally we can only get ξη ∈ Lr for r ≤ pq/(p+ q)(< minp, q).15 / 56

Page 16: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Convex perturbation

Let us define the convex perturbation.

For any µ = µ0, · · · , µN−1 ∈ U , defineνk = µk − u∗k, ν = ν0, · · · , νN−1,uεk = u∗k + ενk, u

ε = uε0, · · · , uεN−1.

Then uε ∈ U .

Similar to (4), it’s easy to check that

EN∑k=0

|Xεk −X∗k |2 ≤ Cε2E

N−1∑k=0

|νk|2,

where C is a positive constant independent of ε.

16 / 56

Page 17: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The variational equation

Let us introduce (the variational equation)Zk+1 =

[b>x (k)Zk + b>v (k)νk

]+

d∑i=1

[(σix(k))>Zk + (σiv(k))>νk

]W ik, k ∈ T ,

Z0 = 0,

(5)

where σ = (σ1, · · · , σd).

Eq. (5) admits a unique solution Z = Zk, k ∈ T such thatZk ∈ L2(Ω,Fk,P;Rn) for each k. Moreover, there exists C > 0such that

EN∑k=1

|Zk|2 ≤ CEN−1∑k=0

|νk|2.

17 / 56

Page 18: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The variational equation

As a linear stochastic difference equation, the variational equation(5) could be explicitly solved. In fact, (5) is equivalent to

Zk+1 = MkZk +Rk, k ∈ T ; Z0 = 0,

where

Mk = b>x (k)+d∑i=1

(σix(k))>W ik, Rk = b>v (k)νk+

d∑i=1

(σiv(k))>W ikνk.

Then using induction yields

Zk =

k−1∑i=0

Mk−1i Ri, (6)

where M ji = MjMj−1 · · ·Mi+1 when i < j, and M j

i = In wheni = j.

18 / 56

Page 19: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The variational inequality

Lemma 2. Under (H1) and (H2), it holds that

LJ(u∗) := E

N−1∑k=0

[〈lx(k), Zk〉+ 〈lv(k), νk〉] + 〈hx(X∗N ), ZN 〉

≥ 0.

The proof is similar to that of the continuous-time counterpart.

Define the Hamiltonian H by

H(k, x, v, p, q) = 〈b(k, x, v), p〉+

d∑i=1

〈σi(k, x, v), qi〉+ l(k, x, v)

with q = (q1, · · · , qd). Note thatd∑i=1〈σi, qi〉 = tr

[σ>q

].

19 / 56

Page 20: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The adjoint equation

Introduce the following adjoint equation:Pk = E

Hx(k + 1, X∗k+1, u

∗k+1, Pk+1, Qk+1)

∣∣Fk ,Qk = E

Hx(k + 1, X∗k+1, u

∗k+1, Pk+1, Qk+1)W

>k

∣∣Fk ,k = 0, 1, · · · , N − 2,PN−1 = E[hx(X∗N )|FN−1],QN−1 = E[hx(X∗N )W>N−1

∣∣FN−1].(7)

20 / 56

Page 21: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The adjoint equation

This adjoint equation, introduced first by Lin and Zhang (IEEETAC, 2015), is a kind of backward stochastic difference equation,which is quite different from the continuous-time counterpart.

Note that Cohen and Elliott (SPA, 2010; SICON, 2011) alsostudied one kind of backward stochastic difference equation andobtained the existence and uniqueness result. However, our adjointequation is different from the equations studied in the above tworeferences. On the one hand, they have different forms. On theother hand, our adjoint equation is introduced naturally as the dualequation of the variational equation, while in our opinion, theequations studied in the above two references are not appropriateto serve as the adjoint equations of Problem (OCP).

21 / 56

Page 22: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The adjoint equation

The main concern of the adjoint equation (7) is the integrability ofits solution (Pk, Qk), k ∈ T .

For later use we need these integrability conditions:E[|Pk|+ |Qk|] <∞, k ∈ T ,E[(|Pk+1|+ |Qk+1|)|Wk|] <∞, k = 0, · · · , N − 2,E[(|Pk|+ |Qk|)|Xk|+ (|Pk|+ |Qk|)|uk|] <∞, k ∈ T .

(8)

Let us point out that the first two integrability conditions in (8)comes from the adjoint equation itself, while the third condition isneeded in the prove of necessary as well as sufficient MP.

22 / 56

Page 23: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The adjoint equation

The classical square-integrability conditions of the noises andadmissible controls are insufficient for our purpose.

If Wk are only square-integrable (α = 2), then Xk,1 ≤ k ≤ N , are at most square-integrable even when uk arebounded. In this case, (Pk, Qk), 1 ≤ k ≤ N − 2, are notsquare-integrable. Thus E[(|Pk+1|+ |Qk+1|)|Wk|] andE[(|Pk|+ |Qk|)|Xk|] are not well defined. So, a strongerintegrability condition on Wk is needed.

If uk are only square-integrable (β = 2) and Wk are notbounded, then Xk, 1 ≤ k ≤ N , are at most square-integrable,and (Pk, Qk), 1 ≤ k ≤ N − 2, are still not square-integrable.Thus E[(|Pk|+ |Qk|)|Xk|] and E(|Pk|+ |Qk|)|uk| are not welldefined. So, a stronger integrability condition on uk isneeded.

23 / 56

Page 24: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The adjoint equation

Based on the previous discussions, let us assume moreover

(H3) α ≥ N + 1 and β = 2α(α−N + 1)−1.

Lemma 3. Under (H1)-(H3), the integrability conditions in (8)hold true.

Proof. As a corollary of Holder’s inequality or Young’s inequality,the following assertion holds true: if ξ ∈ La and η ∈ Lb with

a, b > 0, then ξη ∈ Laba+b , and ξη ∈ L1 if 1

a + 1b ≤ 1 or ab

a+b ≥ 1.

Denote θk = αγα+γ(N−1−k) for k ∈ T . Note that θk is increasing in

k and θk ≤ γ.Let us use induction to show that

|Pk| ∈ Lθk , |Qk| ∈ Lθk−1 , k = 1, · · · , N − 1. (9)

24 / 56

Page 25: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The adjoint equation

Since |hx(X∗N )| ≤ C(1 + |X∗N |), |X∗N | ∈ Lγ and |WN−1| ∈ Lα, we have

|PN−1| ∈ Lγ and |QN−1| ∈ Lαγα+γ . Thus (9) holds for k = N − 1.

Let us use induction to proceed. Assume

|Pj+1| ∈ Lθj+1 , |Qj+1| ∈ Lθj (10)

for some j = 1, · · · , N − 2. Firstly, note that

|Pj | ≤ CE[(1 + |X∗j+1|+ |u∗j+1|+ |Pj+1|+ |Qj+1|

)|Fj ].

Since |X∗j+1| ∈ Lγ , |u∗j+1| ∈ Lβ , |Pj+1| ∈ Lθj+1 , |Qj+1| ∈ Lθj , and

θj < θj+1 ≤ γ ≤ β, it follows that Pj ∈ Lθj . Secondly, we have

|Qj | ≤ CE[(1 + |X∗j+1|+ |u∗j+1|+ |Pj+1|+ |Qj+1|

)|Wj ||Fj ]

Note that |Wj | ∈ Lα, |X∗j+1||Wj | ∈ Lαγα+γ , |u∗j+1||Wj | ∈ L

αβα+β ,

|P ∗j+1||Wj | ∈ Lθj , |Q∗j+1||Wj | ∈ Lθj−1 . Since

θj−1 < θj ≤ αγα+γ ≤

αβα+β < α, we have |Qj | ∈ Lθj−1 .

25 / 56

Page 26: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The adjoint equation

With these preparations, the integrability conditions in (8) hold if

|Q1||W0|, |Q1||X1|, |Q2||X2| ∈ L1.

It’s easy to check that the above holds true if

γ(α−N + 1) ≥ 2α. (11)

Since γ = minα, β, (11) is equivalent toα > N − 1,α(α−N + 1) ≥ 2α,β(α−N + 1) ≥ 2α.

Solving it yields

α ≥ N + 1, β ≥ 2α(α−N + 1)−1.

Thus, it’s proper and sufficient to choose α ≥ N + 1 andβ = 2α(α−N + 1)−1, which is just (H3).

26 / 56

Page 27: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The adjoint equation

Remark. (i). The proof of Lemma 3 shows the reason that weneed and how we worked out the assumption (H3). Under thepreconditions (H1)-(H2), (H3) is almost the minimum conditionthat ensures (8).

(ii). Lemma 3 guarantees that the adjoint equation is well definedwith a unique solution (Pk, Qk), k ∈ T , and all the expectationsinvolving (Pk, Qk), k ∈ T in this talk are well defined.

(iii). It is expected that both α and β could be small; however, it’snot the case. In fact, (H3) implies that 2 < β ≤ α; in addition, thesmaller β is, the larger α is.

27 / 56

Page 28: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

The adjoint equation

Lemma 4. Under (H1)-(H3), the variational equation (5) and theadjoint equation (7) admits the following duality:

EN−1∑k=0

⟨bv(k)Pk +

d∑j=1

σjv(k)Qjk, νk

⟩= E

N∑k=1

〈lx(k), Zk〉. (12)

In fact, the adjoint equation (7) can be solved explicitly to getPk =

N∑i=k+1

E[(M i−1

k )>lx(i)∣∣Fk] ,

Qk =N∑

i=k+1

E[(M i−1

k )>lx(i)W>k∣∣Fk] . (13)

Then Lemma 4 could be proved by interchanging the order of

summations and making use of the properties of conditional expectations.

Lemma 4 shows the main role played by (P,Q). (13) is the origin of the

adjoint equation (7).28 / 56

Page 29: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Necessary MP

Theorem 1. Assume (H1)-(H3). If u∗ = u∗k, k ∈ T is anoptimal control of Problem (DTC), then it satisfies

〈Hv(k,X∗k , u∗k, Pk, Qk), v − u∗k〉 ≥ 0, ∀v ∈ Uk, k ∈ T , (14)

Proof. By (12), the variational inequality becomes

EN−1∑i=0

〈Hv(i,X∗i , u∗i , Pi, Qi), µi − u∗i 〉 ≥ 0, ∀µ = µi, i ∈ T ∈ U .

For any k ∈ T , v ∈ Uk and A ∈ Fk, define µk = vχA + u∗kχA, andµi = u∗i if i 6= k. Then µ ∈ U . Applying to the last inequality gives

E [〈Hv(k,X∗k , u∗k, Pk, Qk), v − u∗k〉χA] ≥ 0.

Since 〈Hv(k,X∗k , u∗k, Pk, Qk), v − u∗k〉 is Fk-measurable and A ∈ Fk is

arbitrarily chosen, the result follows directly.29 / 56

Page 30: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Sufficient MP

Theorem 2. Assume (H1)-(H3). Let (u∗, X∗) be an admissiblepair and (P,Q) the solution of the adjoint equation (7). Assumethat h(·) and H(k, ·, ·, Pk, Qk) are convex functions for eachk ∈ T . Then u∗ is an optimal control of Problem (DTC) if itsatisfies (14).

Proof. For admissible pair (u,X), set b(k) = b(k,Xk, uk)− b(k,X∗k , u

∗k) and

σ(k) = σ(k,Xk, uk)− σ(k,X∗k , u

∗k). Then

J(u)− J(u∗) = EN−1∑k=0

[H(k,Xk, uk, Pk, Qk)−H(k,X∗k , u

∗k, Pk, Qk)]

− EN−1∑k=0

[〈b(k), Pk〉+

d∑j=1

〈σj(k), Qjk〉]+ E [h(XN )− h(X∗

N )] .

Note that for the first two expectations to be well defined, thethird integrability condition in (8) is needed.

30 / 56

Page 31: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Problem formulation and preliminariesMaximum principle

Sufficient MP

If h and H are convex in (x, v) and (14) holds, then

J(u)− J(u∗) ≥ EN∑k=1

〈lx(k), Xk〉 − EN−1∑k=0

[〈αk, Pk〉+

d∑j=1

〈βjk, Qjk〉],

where X = X −X∗, αk = b(k)− b>x (k)Xk and

βjk = σj(k)− (σjx(k))>Xk. Similar to (6), we have Xk =k−1∑i=0

Mk−1i δi

for k ∈ T, where δk = αk +d∑j=1

βjkWjk , k ∈ T .

Finally, in view of (13), by interchanging the order of summations and

using the properties of conditional expectations we can show that the

right-hand side of the last inequality is equal to zero.

31 / 56

Page 32: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

MP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

Outline

1 Introduction

2 Maximum principleProblem formulation and preliminariesMaximum principle

3 Extensions to stochastic gamesMP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

4 ApplicationsApplication to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

5 Conclusions and more extensions

32 / 56

Page 33: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

MP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

Let us consider a system described byXk+1 = b(k,Xk, u1,k, u2,k) + σ(k,Xk, u1,k, u2,k)Wk,X0 = x0 ∈ Rn, (15)

and introduce two cost functionals

Ji(u1, u2) = E

N−1∑k=0

li(k,Xk, u1,k, u2,k) + hi(XN )

, i = 1, 2.

Player i hopes to minimize the cost Ji by selecting an appropriatecontrol u∗i , i = 1, 2. The problem is to find (u∗1, u

∗2) ∈ U1 × U2 s.t. J1(u

∗1, u∗2) = inf

u1∈U1J1(u1, u

∗2),

J2(u∗1, u∗2) = inf

u2∈U2J2(u

∗1, u2).

(16)

It is a discrete-time nonzero-sum stochastic game. We call itProblem (NZSG). Such (u∗1, u

∗2) is called an equilibrium point.

33 / 56

Page 34: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

MP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

Similar assumptions are supposed on the noises, the controls andthe coefficients as in the previous section.

Let us define

Hi = 〈b, p〉+ tr[σ>q] + li, i = 1, 2.

Introduce the following adjoint equations for i = 1, 2:

Pi,k = EHix(k + 1, X∗k+1, u

∗1,k+1, u

∗2,k+1, Pi,k+1, Qi,k+1)

∣∣Fk ,Qi,k = E

Hix(k + 1, X∗k+1, u

∗1,k+1, u

∗2,k+1, Pi,k+1, Qi,k+1)W

>k

∣∣Fk ,k = 0, 1, · · · , N − 2,Pi,N−1 = E[hix(X∗N )|FN−1],Qi,N−1 = E[hix(X∗N )W>N−1

∣∣FN−1].34 / 56

Page 35: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

MP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

Note that Problem (NZSG) consists of two discrete-timestochastic optimal control problems of Problem (DTC)’s type.

Theorem 3. Suppose (u∗1, u∗2) is an equilibrium point of Problem

(NZSG). Then it holds that ∀vi,k ∈ Ui,k, k ∈ T , i = 1, 2,

〈Hiui(k,X∗k , u∗1,k, u

∗2,k, Pi,k, Qi,k), vi,k − u∗i,k〉 ≥ 0. (17)

Theorem 4. Let (u∗1, u∗2) be a pair of admissible controls for

Problem (NZSG). Suppose that H1(k, ·, ·, u∗2,k, P1,k, Q1,k),H2(k, ·, u∗1,k, ·, P2,k, Q2,k), h1(·) and h2(·) are convex for eachk ∈ T . Then (u∗1, u

∗2) is an equilibrium point if it satisfies (17).

35 / 56

Page 36: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

MP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

Next let us consider a discrete-time zero-sum stochastic game. Westill use the system (15) where u1 and u2 are the controls of twoplayers. The discrete-time zero-sum stochastic game, Problem(ZSG), is to find (u∗1, u

∗2) ∈ U1 × U2 such that

J(u∗1, u2) ≤ J(u∗1, u∗2) ≤ J(u1, u

∗2), ∀(u1, u2) ∈ U1 × U2, (18)

where

J(u1, u2) = E

N−1∑k=0

l(k,Xk, u1,k, u2,k) + h(XN )

.

Such (u∗1, u∗2) is called a saddle point.

36 / 56

Page 37: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

MP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

By transforms, Problem (ZSG) could be turned to a nonzero-sumstochastic game of Problem (NZSG)’s type. In fact, by defining

l1 = l, h1 = h, J1 = J, l2 = −l, h2 = −h, J2 = −J,

Problem (ZSG) is equivalent to finding (u∗1, u∗2) such that

J1(u∗1, u∗2) = inf

u1∈U1J1(u1, u

∗2), J2(u

∗1, u∗2) = inf

u2∈U2J2(u

∗1, u2),

where

Ji(u1, u2) = E

N−1∑k=0

li(k,Xk, u1,k, u2,k) + hi(XN )

, i = 1, 2.

37 / 56

Page 38: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

MP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

Define the Hamiltonian by H = 〈b, p〉+ tr[σ>q

]+ l, and the adjoint

equation byPk = E

[Hx(k + 1, X∗k+1, u

∗1,k+1, u

∗2,k+1, Pk+1, Qk+1)

∣∣Fk],Qk = E

[Hx(k + 1, X∗k+1, u

∗1,k+1, u

∗2,k+1, Pk+1, Qk+1)W>k

∣∣Fk],k = 0, 1, · · · , N − 2,PN−1 = E [hx(X∗N )|FN−1] ,QN−1 = E

[hx(X∗N )W>N−1

∣∣FN−1

].

By the necessary MP of Problem (NZSG) [see Theorem 3], we get thefollowing necessary MP for Problem (ZSG).

Theorem 5. Suppose (u∗1, u∗2) is a saddle point of Problem (ZSG).

Then for each k ∈ T ,⟨Hu1(k,X∗k , u

∗1,k, u

∗2,k, Pk, Qk), v1,k − u∗1,k

⟩≥ 0, ∀v1,k ∈ U1,k,⟨

Hu2(k,X∗k , u∗1,k, u

∗2,k, Pk, Qk), v2,k − u∗2,k

⟩≤ 0, ∀v2,k ∈ U2,k.

(19)

38 / 56

Page 39: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

MP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

By the sufficient MP of Problem (NZSG) [see Theorem 4], weget the following sufficient MP for Problem (ZSG).

Theorem 6. Let (u∗1, u∗2) be a pair of admissible controls for

Problem (ZSG). Assume that H(k, ·, ·, u∗2,k, Pk, Qk) is convex,H(k, ·, u∗1,k, ·, Pk, Qk) is concave and h(x) = Rx+ ξ with R beinga bounded FT -measurable r.v. and ξ an integrable FT -measurabler.v.. Then (u∗1, u

∗2) is a saddle point if (19) holds.

39 / 56

Page 40: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

Outline

1 Introduction

2 Maximum principleProblem formulation and preliminariesMaximum principle

3 Extensions to stochastic gamesMP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

4 ApplicationsApplication to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

5 Conclusions and more extensions

40 / 56

Page 41: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

Example 1. Suppose there is a market in which m+ 1 assets aretraded discontinuously. The price Bk, k ∈ T of one risk-lessasset satisfies

Bk+1 −Bk = rkBk, k ∈ T ; B0 = b0. (20)

The prices of m risk assets, Si,k, k ∈ T , i = 1, 2, · · · ,m, satisfy

Si,k+1 − Si,k = Si,k(µi,k + σi,kWk), k ∈ T ; Si,0 = si. (21)

We now consider an investor whose wealth at time k is Xk.Denote by ui,k the wealth invested in the i-th stock at time k, andcall uk = (u1,k, u2,k, · · · , um,k)> a portfolio of the investor at timek. Assume moreover that consumption occurs at each time k ∈ T ,and denote by vk, k ∈ T the consumption sequence. Then

Xk+1−Xk =

Xk −m∑i=1

ui,k

Bk(Bk+1−Bk)+

m∑i=1

ui,kSi,k

(Si,k+1−Si,k)−vk.

41 / 56

Page 42: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

Thus the dynamics of the wealth satisfiesXk+1 =

[(rk + 1)Xk + b>k uk − vk

]+ σ>k ukWk,

X0 = x0,

where bk = (µ1,k − rk, µ2,k − rk, · · · , µm,k − rk)> andσk = (σ1,k, σ2,k, · · · , σm,k)>. Assume that r, b, σ are bounded.

Let U be the set of u = uk, k ∈ T , where each uk is anFk-measurable and Rm-valued r.v. with necessary integrability.Denote by V the collection of all v = vk, k ∈ T such that eachvk is real-valued and Fk-measurable, vk ≥ a with some constanta > 0, and v has necessary integrability.

42 / 56

Page 43: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

The objective of the investor is to find (u∗, v∗) ∈ U × V such that

J(u∗, v∗) = inf(u,v)∈U×V

J(u, v)

with

J(u, v) = E

N−1∑k=0

[1

2Lk|uk − ηk|2 −Gk

(vk)1−δ

1− δ

]−RXN

.

Assume that Lk, Gk and R are bounded scalar-valued r.v.’s, ηk is abounded Rm-valued r.v. representing a benchmark, and δ is aconstant representing the Arrow-Pratt index of the risk aversion.Assume 0 < δ < 1, Gk > 0, Lk > c and R > c for some c > 0.

Note that the objective consists of three parts. The first one is aminimization of the difference between the portfolio and a givenbenchmark, the second is a maximization of a utility of theconsumption and the last is a maximization of the terminal wealth.

43 / 56

Page 44: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

Let (H1) and (H2) hold true. The Hamiltonian is defined by

H =[(rk + 1)x+ b>k u− v

]p+ σ>k uq +

Lk2|u− ηk|2 −Gk

v1−δ

1− δ.

It’s easily seen that H and h(x) = −Rx are convex functions of(x, u, v). Applying Theorem 1 shows that the optimal control(u∗, v∗) satisfies

〈bkPk + σkQk + Lk(u∗k − ηk), u− u∗k〉 ≥ 0, ∀u ∈ Rm, (22)[

−Pk −Gk(v∗k)−δ]

(v − v∗k) ≥ 0, ∀v ≥ a (23)

for each k, where (Pk, Qk), k ∈ T uniquely solvesPk = E [(rk+1 + 1)Pk+1|Fk] , k = 0, 1, · · · , N − 2,Qk = E [(rk+1 + 1)Pk+1Wk|Fk] , k = 0, 1, · · · , N − 2,PN−1 = −E [R|FN−1] , QN−1 = −E [RWN−1|FN−1] .

(24)

44 / 56

Page 45: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

By Theorem 2, an admissible control (u∗, v∗) satisfying (22) and(23) is indeed an optimal control.

From (22) we derive

u∗k = ηk − L−1k (bkPk + σkQk). (25)

Since H(v) := −Pkv −Gk v1−δ

1−δ is a differentiable and convexfunction, (23) implies that H(v), v ∈ [a,+∞), takes its minimumat v∗k. Note that Pk < 0, k ∈ T . Then the solution of (23) is

v∗k =

vk, if vk ≥ a,a, if vk < a,

(26)

with vk =(−GkPk

) 1δ

.

45 / 56

Page 46: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

It’s easy to see that each Pk is bounded. Thus, Qk ∈ Lα and soQk ∈ Lβ since β ≤ α. This implies that u∗k defined by (25) isindeed admissible. Using induction we can get Pk < −c for k ∈ T ,which shows that vk is bounded. Thus, v∗k defined by (26) isbounded and so admissible. Consequently, (u∗k, v∗k) defined by(25) and (26) is the optimal strategy of this investmentconsumption choice problem.

Eq. (24) can be solved explicitly in some special cases. Ifrk, k = 1, 2, · · · , N − 1 and R are deterministic constants, andE [Wk|Fk] = 0, k ∈ T , then (24) is uniquely solved by Qk ≡ 0 and

Pk =

−R(rN−1 + 1)(rN−2 + 1) · · · (rk+1 + 1), if 0 ≤ k ≤ N − 2,−R, if k = N − 1.

46 / 56

Page 47: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

Example 2. Let us consider one kind of discrete-time LQstochastic optimal control problem in the case when d = 1,E[Wk|Fk] = 0 and Uk = Rm for each k ∈ T .

Let us define

b(k, x, v) = Akx+Bkv + Ck, σ(k, x, v) = Dkx+ Ekv + Fk,

l(k, x, v) = [〈Gkx, x〉+ 〈Lkv, v〉] /2, h(x) = 〈Rx, x〉,

where the coefficients are deterministic and bounded matrices.Moreover, Gk, R ≥ 0 and Lk > 0 for each k.

47 / 56

Page 48: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

By Theorems 1 and 2, u∗k = −L−1k (B>k Pk + E>k Qk) is an optimal

control of this LQ problem if it’s admissible, where (P,Q) solvesPk = E

[A>k+1Pk+1 +D>k+1Qk+1 +Gk+1X

∗k+1

] ∣∣Fk ,Qk = E

[A>k+1Pk+1 +D>k+1Qk+1 +Gk+1X

∗k+1

]Wk

∣∣Fk ,k = 0, 1, · · · , N − 2,PN−1 = E[RX∗N |FN−1], QN−1 = E[RX∗NWN−1|FN−1].

Let us give a state-feedback form of u∗ in the simple case when Dk = 0and Ek = 0 for each k ∈ T . In this case, we have

u∗k = −L−1k B>k Pk, (27)

wherePk = E

[A>k+1Pk+1 +Gk+1X

∗k+1

] ∣∣Fk , k = 0, 1, · · · , N − 2,PN−1 = E[RX∗N |FN−1].

To conclude, we only need to express Pk in terms of X∗k .

Since the diffusion term is independent of x and v, and Qk is not

needed, the square-integrability issue disappears in this case.48 / 56

Page 49: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

By Lemma 4, we havePk = (AN−1

k )>RE[X∗N |Fk] +N−1∑j=k+1

(Aj−1k )>GjE[X∗j |Fk],

k = 0, · · · , N − 2,PN−1 = RE[X∗N |FN−1],

(28)

where Aji = AjAj−1 · · ·Ai+1 if i < j, and Aji = In if i = j.

The subsequent idea is: Pk ⇐= E[X∗k+1|Fk]⇐= X∗k .Let us assume

Pk = αkE[X∗k+1|Fk] + βk, k ∈ T , (29)

where αk, βk, k ∈ T are deterministic matrices which are to determinedlater. Then by (27), we have

u∗k = −L−1k B>k (αkE[X∗k+1|Fk] + βk). (30)

49 / 56

Page 50: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

By substituting (30) into the state equation and taking conditionalexpectation we get

ΦkE[X∗k+1|Fk] = AkX∗k + Ψk, k ∈ T ,

where Φk = In +BkL−1k B>k αk and Ψk = Ck −BkL−1k B>1,kβk.

Let us assume

Φ−1k exists and is bounded, ∀k ∈ T . (31)

ThenE[X∗k+1|Fk] = φkX

∗k + ψk, k ∈ T (32)

withφk = Φ−1k Ak, ψk = Φ−1k Ψk. (33)

50 / 56

Page 51: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

Consequently, by (30) and (32) we get

u∗k = −L−1k B>k αkφkX∗k − L−1k B>k (αkψk + βk). (34)

This is the feedback form. It remains to determine (αk, βk).

By (32), we can use induction to get

E[X∗i |Fk] = φi−1k E[X∗k+1|Fk] +

i−1∑j=k+1

φi−1j ψj , i ≥ k + 2, (35)

where φji is defined by: φji = φjφj−1 · · ·φi+1 if i < j, and φji = Inif i = j. That is, we can express E[X∗i |Fk], i ≥ k + 2, in terms ofE[X∗k+1|Fk] only.

51 / 56

Page 52: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

Taking i = N and k = N − 2 in (35) gives

E[X∗N |FN−2] = φN−1E[X∗N−1|FN−2] + ψN−1. (36)

By (28) we have

PN−2 = [(AN−1)>RφN−1 +GN−1]E[X∗N−1|FN−2] + (AN−1)>RψN−1.

Recall from (28) that Pk satisfies that

Pk = (AN−1k )>RE[X∗N |Fk] +

N−1∑j=k+2

(Aj−1k )>GjE[X∗j |Fk]

+Gk+1E[X∗k+1|Fk], k = 0, 1, · · · , N − 3,PN−2 = [(AN−1)>RφN−1 +GN−1]E[X∗N−1|FN−2]

+(AN−1)>RψN−1,PN−1 = RE[X∗N |FN−1].

(37)

52 / 56

Page 53: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Application to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

Next, substituting (29) and (35) into (37) and comparing the coefficientsleads to

αN−1 = R, βN−1 = 0,αN−2 = GN−1 + (AN−1)>RφN−1, βN−2 = (AN−1)>RψN−1,

αk = Gk+1 +N−1∑j=k+2

(Aj−1k )>Gjφ

j−1k + (AN−1

k )>RφN−1k ,

βk =N−1∑j=k+2

(Aj−1k )>Gj

j−1∑s=k+1

φj−1s ψs + (AN−1

k )>RN−1∑j=k+1

φN−1j ψj ,

k = 0, 1, · · · , N − 3.(38)

Note that (αk, βk), k ∈ T defined above are bounded.

Finally the conclusion is as follows. Let (φk, ψk) be defined by (33)

and (αk, βk), k ∈ T be defined by (38). If (31) holds, then u∗ defined

by (34) is the admissible optimal control of this LQ problem.

53 / 56

Page 54: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Outline

1 Introduction

2 Maximum principleProblem formulation and preliminariesMaximum principle

3 Extensions to stochastic gamesMP for discrete-time nonzero-sum gameMP for discrete-time zero-sum game

4 ApplicationsApplication to one investment/consumption choice problemApplication to one kind of discrete-time LQ problem

5 Conclusions and more extensions

54 / 56

Page 55: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Conclusions.

One kind of discrete-time stochastic optimal control problem isstudied. The main result is to establish the necessary as well assufficient stochastic maximum principle in a rigorous and clear way.A difficulty arising from the lack of necessary integrability of thesolution to the adjoint equation is overcome. The results areextended to two kinds of discrete-time stochastic games. Twoapplications are studied, for which the optimal controls are derived.This research paves a way for further related works.

More extensions.

general control domain case

different kinds of control systems, i.e., delayed system,mean-field system, · · ·state/control constrained cases

partial information/ partial observation cases55 / 56

Page 56: Maximum principle for one kind of discrete-time stochastic ... › events › 2019 › qfinance › files › zhen.pdf · Extensions to stochastic games Applications Conclusions and

IntroductionMaximum principle

Extensions to stochastic gamesApplications

Conclusions and more extensions

Thank Y ou !

56 / 56