
Stochastic Stability and Control


MATHEMATICS IN SCIENCE AND ENGINEERING

A SERIES OF MONOGRAPHS AND TEXTBOOKS

Edited by Richard Bellman, University of Southern California

1. TRACY Y. THOMAS. Concepts from Tensor Analysis and Differential Geometry. Second Edition. 1965
2. TRACY Y. THOMAS. Plastic Flow and Fracture in Solids. 1961
3. RUTHERFORD ARIS. The Optimal Design of Chemical Reactors: A Study in Dynamic Programming. 1961
4. JOSEPH LASALLE and SOLOMON LEFSCHETZ. Stability by Liapunov's Direct Method with Applications. 1961
5. GEORGE LEITMANN (ed.). Optimization Techniques: With Applications to Aerospace Systems. 1962
6. RICHARD BELLMAN and KENNETH L. COOKE. Differential-Difference Equations. 1963
7. FRANK A. HAIGHT. Mathematical Theories of Traffic Flow. 1963
8. F. V. ATKINSON. Discrete and Continuous Boundary Problems. 1964
9. A. JEFFREY and T. TANIUTI. Non-Linear Wave Propagation: With Applications to Physics and Magnetohydrodynamics. 1964
10. JULIUS T. TOU. Optimum Design of Digital Control Systems. 1963
11. HARLEY FLANDERS. Differential Forms: With Applications to the Physical Sciences. 1963
12. SANFORD M. ROBERTS. Dynamic Programming in Chemical Engineering and Process Control. 1964
13. SOLOMON LEFSCHETZ. Stability of Nonlinear Control Systems. 1965
14. DIMITRIS N. CHORAFAS. Systems and Simulation. 1965
15. A. A. PERVOZVANSKII. Random Processes in Nonlinear Control Systems. 1965
16. MARSHALL C. PEASE, III. Methods of Matrix Algebra. 1965
17. V. E. BENES. Mathematical Theory of Connecting Networks and Telephone Traffic. 1965
18. WILLIAM F. AMES. Nonlinear Partial Differential Equations in Engineering. 1965
19. J. ACZÉL. Lectures on Functional Equations and Their Applications. 1966
20. R. E. MURPHY. Adaptive Processes in Economic Systems. 1965
21. S. E. DREYFUS. Dynamic Programming and the Calculus of Variations. 1965
22. A. A. FEL'DBAUM. Optimal Control Systems. 1965


MATHEMATICS IN SCIENCE AND ENGINEERING

23. A. HALANAY. Differential Equations: Stability, Oscillations, Time Lags. 1966
24. M. NAMIK OGUZTORELI. Time-Lag Control Systems. 1966
25. DAVID SWORDER. Optimal Adaptive Control Systems. 1966
26. MILTON ASH. Optimal Shutdown Control of Nuclear Reactors. 1966
27. DIMITRIS N. CHORAFAS. Control System Functions and Programming Approaches. (In Two Volumes.) 1966
28. N. P. ERUGIN. Linear Systems of Ordinary Differential Equations. 1966
29. SOLOMON MARCUS. Algebraic Linguistics; Analytical Models. 1967
30. A. M. LIAPUNOV. Stability of Motion. 1966
31. GEORGE LEITMANN (ed.). Topics in Optimization. 1967
32. MASANAO AOKI. Optimization of Stochastic Systems. 1967
33. HAROLD J. KUSHNER. Stochastic Stability and Control. 1967
34. MINORU URABE. Nonlinear Autonomous Oscillations. 1967
35. F. CALOGERO. Variable Phase Approach to Potential Scattering. 1967
36. A. KAUFMANN. Graphs, Dynamic Programming, and Finite Games. 1967
37. A. KAUFMANN and R. CRUON. Dynamic Programming: Sequential Scientific Management. 1967
38. J. H. AHLBERG, E. N. NILSON, and J. L. WALSH. The Theory of Splines and Their Applications. 1967

In preparation

Y. SAWARAGI, Y. SUNAHARA, and T. NAKAMIZO. Statistical Decision Theory in Adaptive Control Systems
A. KAUFMANN and R. FAURE. Introduction to Operations Research
RICHARD BELLMAN. Introduction to the Mathematical Theory of Control Processes (In Three Volumes.)
E. STANLEY LEE. Quasilinearization and Invariant Imbedding
WILLARD MILLER, JR. Lie Theory and Special Functions
PAUL B. BAILEY, LAWRENCE F. SHAMPINE, and PAUL E. WALTMAN. Nonlinear Two Point Boundary Value Problems


STOCHASTIC STABILITY

AND CONTROL

Harold J. Kushner Brown University

Providence, Rhode Island

1967

ACADEMIC PRESS New York / London


COPYRIGHT © 1967, BY ACADEMIC PRESS INC. ALL RIGHTS RESERVED. NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.

ACADEMIC PRESS INC. 111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by ACADEMIC PRESS INC. (LONDON) LTD. Berkeley Square House, London W. 1

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 66-30089

PRINTED IN THE UNITED STATES OF AMERICA


To Linda


Preface

In recent years there has been a great deal of activity in the study of qualitative properties of solutions of differential equations using the Liapunov function approach. Some effort has also been devoted to the application of the "Liapunov function method" to the design of controls or to obtain sufficient conditions for the optimality of a given control (say, via the Hamilton-Jacobi equation, or dynamic programming approach to optimal control).

In this monograph we develop the stochastic Liapunov function approach, which plays the same role in the study of qualitative properties of Markov processes or controlled Markov processes as Liapunov functions do for analogous deterministic problems. Roughly speaking, a stochastic Liapunov function is a suitable function of the state of a process, which, considered as a random process, possesses the supermartingale property in a neighborhood. From the existence of such functions, many properties of the random trajectories, both asymptotic and finite time, can be inferred. The motivation for the work was the author's interest in stochastic problems in control, and the methods to be discussed enlighten many such problems; nevertheless, much of the material seems to have an independent probabilistic interest.

Although some discrete time results are given, we have emphasized processes with a continuous time parameter, which require somewhat more elaborate methods. Actually, most of the proofs are not difficult in that they do not involve highly detailed or subtle arguments; some of the proofs are difficult in the sense that we have found it


necessary to refer, in their proof, to theorems which are subtle. Some of these are discussed in the background material of Chapter I, and others in the referenced works of E. B. Dynkin.

The analysis requires the introduction of the weak infinitesimal operator of (strong) Markov processes, either as a general abstract object, or in the forms in which it appears for special cases. Instead of doing the analysis for various special processes, we chose the economical alternative of treating general cases, and listing special cases in remarks or corollaries.

Chapter I is devoted to a discussion of many of the concepts from probability theory which are used in the sequel. Chapter II, on stability, is the longest and probably the most basic part; the material underlies most of the results of the other chapters. The chapter is by no means exhaustive. We have concentrated on several results which we feel to be important (and can prove); there are certainly many other cases of potential interest, both obvious and not. The chapter contains a number of nonlinear and linear examples. Nevertheless, a quick survey of the examples reveals a shortcoming that we share with the deterministic method; namely, the difficulties in finding suitable Liapunov functions. In particular, we have not been able to obtain a family of Liapunov functions which can be used to completely characterize the asymptotic stability properties of the solutions of linear differential equations with homogeneous Markov process coefficients (i.e., obtain necessary and sufficient conditions in terms of (say) the transition functions of the coefficient processes, for certain types of statistical asymptotic stability).

Chapter III is devoted to the study of first exit times or, equivalently, to the problem of obtaining useful upper bounds to the probability that the state of the process will leave some given set at least once by a given time. The approach is expected to be useful, in view of the importance of such problems in control and the difficulty of obtaining useful estimates.

Chapter IV is devoted to problems in optimal control; in particular, to the determination of sufficient conditions for the optimality of a control. Some of the material provides a stochastic analog to the


Hamilton-Jacobi equation, or dynamic programming, approach to sufficient conditions for optimality in the deterministic case, and provides a justification for some well-known formal results obtained by dynamic programming.

In Chapter V, we discuss several uses of stochastic Liapunov functions in the design of controls. The method may be applied to the computation of a control which reduces some “cost” or ensures that some stability property obtains.

It is a pleasure to express my appreciation to P. L. Falb and W. Fleming for their helpful criticisms on parts of the manuscript, and to Mrs. KSue Brinson for the excellent typing of several drafts.

The work was supported in part by the National Aeronautics and Space Administration under Grant No. NGR-40-002-015 and in part by the United States Air Force through the Air Force Office of Scientific Research under Grant No. AF-AFOSR-693-66.

April 1967 HAROLD J. KUSHNER


Contents

Preface

I / Introduction
1. Markov Processes
   Discrete time Markov process, Continuous time Markov processes, Notation, Continuity
2. Strong Markov Processes
   Markov time, Strong Markov process, Discrete parameter processes, Feller processes, Weak infinitesimal operator of a strong Markov process, Dynkin's formula
3. Stopped Processes
4. Itô Processes
   Strong Markov property, Differential generator, Itô's lemma, Stopped processes, Weakening of condition (4-5)
5. Poisson Differential Equations
   Second form of the Poisson equation
6. Strong Diffusion Processes
7. Martingales

II / Stochastic Stability
1. Introduction
   Definitions, The idea of a Liapunov function, The stochastic Liapunov function approach, History
2. Theorems: Continuous Parameter
   Assumptions, Definition, Remark on time dependence, Remark on strong diffusion processes
3. Examples
4. Discrete Parameter Stability
5. On the Construction of Stochastic Liapunov Functions

III / Finite Time Stability and First Exit Times
1. Introduction
2. Theorems
   Strong diffusion processes
3. Examples
   Improving the bound

IV / Optimal Stochastic Control
1. Introduction
   A dynamic programming algorithm, Discussion, A deterministic result, The stochastic analog
2. Theorems
   Terminology, A particular form for F(V(x)), An optimality theorem, Control over a fixed time interval, Strong diffusion process, General nonanticipative comparison controls, "Practical optimality": Compactifying the state space
3. Examples
4. A Discrete Parameter Theorem

V / The Design of Controls
1. Introduction
2. The Calculation of Controls Which Assure a Given Stability Property
3. Design of Controls to Decrease the Cost
   The Liapunov function approach to design

References

Author Index

Subject Index


I / INTRODUCTION

In this chapter, we list and discuss some results and definitions in various branches of the theory of Markov processes which are to be used in the sequel. The chapter is for introductory purposes only, and all the ideas which are discussed are treated in greater detail in the listed references. Occasionally, to simplify the introduction, the list of properties which characterize a definition will be shortened, and reference made to the precise definition. The sample space is denoted by $\Omega$, with generic point $\omega$. The range of the process is in a Euclidean space $E$, and $\mathscr{B}$ is a Borel field of sets in $E$.

1. Markov Processes

DISCRETE TIME MARKOV PROCESS

Let $x_1, \ldots$ be a discrete time parameter stochastic process with the associated transition function $P(s, x; n+s, \Gamma)$. Then the function $P(s, x; n+s, \Gamma)$ is to be interpreted as the probability that $x_{n+s}$ is in $\Gamma \in \mathscr{B}$, given that $x_s = x$. Suppose that the conditional distribution function satisfies

$$P\{x_{n+s} \in \Gamma \mid x_1, \ldots, x_s\} = P\{x_{n+s} \in \Gamma \mid x_s\} \tag{1-1}$$

for all $\Gamma$ in $\mathscr{B}$ and nonnegative $n$ and $s$, with probability one. Then the process is termed a Markov process and, in addition, the Chapman-


Kolmogorov equation (1-2) holds:

$$P(s, x; n+m+s, \Gamma) = \int_E P(s, x; s+m, dy)\, P(s+m, y; n+m+s, \Gamma). \tag{1-2}$$
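As a quick illustration of (1-2) (not part of the original text), for a homogeneous finite-state chain the Chapman-Kolmogorov equation reduces to the matrix identity $P^{n+m} = P^m P^n$; the transition matrix below is an arbitrary illustrative choice.

```python
# Numerical check of the Chapman-Kolmogorov equation (1-2) for a hypothetical
# homogeneous three-state chain; the matrix P is an illustrative choice.
import numpy as np

P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.0, 0.5, 0.5]])   # one-step transition probabilities

n, m = 3, 4
# (1-2): P(x; n+m, .) = sum_y P(x; m, dy) P(y; n, .), i.e., P^(n+m) = P^m P^n.
lhs = np.linalg.matrix_power(P, n + m)
rhs = np.linalg.matrix_power(P, m) @ np.linalg.matrix_power(P, n)
assert np.allclose(lhs, rhs)
```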

The following alternative characterization of a Markov process will be useful. To each point $x$ in $E$ and nonnegative integer $s < \infty$, a probability measure $P_{x,s}\{A\}$ is associated. The measure, on an appropriate $\sigma$-algebra of sets in $\Omega$, is defined by* its values on the Borel sets

$$P_{x,s}\{x_{n+s} \in \Gamma\} = P(s, x; s+n, \Gamma)$$

and satisfies, with probability one,

$$P_{x,s}\{x_{m+n+s} \in \Gamma \mid x_1, \ldots, x_{m+s}\} = P(m+s, x_{m+s}; s+n+m, \Gamma)$$

for each $\Gamma$ in $\mathscr{B}$. $P_{x,s}\{A \mid b\}$ is interpreted as the probability of the event $A$, with respect to the measure $P_{x,s}$, conditioned on the minimum $\sigma$-algebra on $\Omega$ over which $b$ is measurable.† Thus $P_{x,s}\{x_{n+s} \in \Gamma\}$ is the probability that $x_{n+s}$ is in $\Gamma$, given that the process starts at $s$ with $x_s = x$, a constant. If $P_{x,s}\{x_{n+s} \in \Gamma\}$ does not depend on $s$, we write $P_x\{x_n \in \Gamma\}$ for the probability that $x_{n+s} \in \Gamma$, given that the process starts at $x$ with $x_s = x$, for any initial time $s$.

The sample paths of a stochastic process are not always defined for all time (for either continuous or discrete parameter processes). There may be a nonzero probability that the process will escape to infinity in a finite time, or at least leave the domain on which it is defined as a Markov process at some finite time. Consider the trivial Markov process whose paths are the solutions of the differential equation $\dot{x} = x^2$, where the initial value $x_0$ (taken at $t = 0$) is a random variable. Then, for any $T < \infty$, it is easily seen that $x_t \to \infty$ as $t \to \zeta(x_0) \le T$ with the probability that $x_0 \ge 1/T$.
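A small sketch of the escape-time computation just described (the exponential initial distribution is an illustrative assumption, not from the text): the explicit solution $x_t = x_0/(1 - x_0 t)$ blows up at the killing time $\zeta = 1/x_0$, so the fraction of paths escaping by time $T$ matches $P\{x_0 \ge 1/T\}$.

```python
# Monte Carlo sketch of the killing time for dx/dt = x^2; x0 ~ Exp(1) is a
# hypothetical choice of random initial condition.
import numpy as np

rng = np.random.default_rng(0)
T = 2.0
x0 = rng.exponential(1.0, size=100_000)   # random initial values, x0 > 0
zeta = 1.0 / x0                           # x_t = x0/(1 - x0 t) escapes at zeta = 1/x0
print(np.mean(zeta <= T))                 # fraction of paths escaping by time T
print(np.mean(x0 >= 1.0 / T))             # P{x0 >= 1/T}: the same number
```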

In general to each initial condition x in E there is associated a

* $\{x_{n+s} \in \Gamma\}$ is defined as the $\omega$ set $\{\omega : x_{n+s}(\omega) \in \Gamma\}$.
† Sometimes the appropriate $\sigma$-algebra is written in lieu of $b$.


random variable $\zeta$ which takes values in $[0, \infty]$. $\zeta$ is termed the "killing time" and the process $x_n$ is defined for $n < \zeta$ only. If $\zeta < \infty$ with a nonzero probability, then $P_{x,s}\{x_n \in E\}$ will be less than 1 for large $n$. If the sample paths are defined for all time with probability one, then $\zeta = \infty$ with probability one.

It is often convenient to suppose that $x_n = \infty$ for $n \ge \zeta$. However, for most purposes of the sequel, the killing time will be unimportant, and unless mentioned otherwise, we assume that it is equal to $\infty$ with probability one. Generally we will be concerned with the behavior of a process up to the first instant of time that the process exits from a given set. It will be required that the process be defined up to this time, and this will be true either by assumption or as a known property of the specific case treated. An exception to this is Theorem 8, Chapter II, in which it is proved that $\zeta = \infty$ with probability one (a type of Lagrange stability) under stated conditions on the process.

CONTINUOUS TIME MARKOV PROCESSES

If the time indices are allowed to take any values in $[0, \infty)$, then the discrete case definitions carry over to the continuous parameter case. Precise definitions and results on continuous parameter processes are given in Dynkin [1, 2]. Other useful sources on discrete or continuous parameter processes are Feller [1], Chung [1], Doob [1], Loève [1], and Bharucha-Reid [1].

Occasionally, to assist in the simplification of notation, it will be helpful to consider time as a component of the Markov process. According to the definitions above, if $x_t$ is a Markov process, then so is the pair* $(x_t, t)$. We will occasionally use the convention of writing the state $(x, s)$ (or $(x_s, s)$) simply as $x$ (or $x_s$). The new state space $E \times [0, \infty)$ (which contains the $(x, t)$ sets $\Gamma$) is then written simply as $E$.

* The pair $(x_t, t)$ does not satisfy the precise definition of a Markov process as given by Dynkin [2], p. 78, condition 3.1G. This deficiency is easily overcome by a standard enlargement of the space $\Omega$ which leaves all other properties intact. See footnote on p. 79 of Dynkin [2].


This will allow several results which are stated more succinctly in the homogeneous case terminology, to extend to the nonhomogeneous case.

NOTATION

The following terminology will be used. Suppose that time is not a component of the state. Let the process $x_t$ be homogeneous. Then $P_x\{x_t \in \Gamma\} = P(s, x; s+t, \Gamma) = P(x, t, \Gamma)$ with probability one for any $s \ge 0$. Also, $E_x f(x_t) = \int f(y)\, P(x, t, dy)$. If the process is not homogeneous, let $P_{x,s}\{x_{t+s} \in \Gamma\} = P(s, x; s+t, \Gamma)$ and $E_{x,s} f(x_{t+s}) = \int f(y)\, P(s, x; s+t, dy)$. In interpreting $P_{x,s}\{x_{t+s} \in \Gamma\}$ it is to be understood that $x$ is the initial value of the process at initial time $s$. $P_x(d\omega)$ and $P_{x,s}(d\omega)$ are the probability measures corresponding to the homogeneous and nonhomogeneous cases, respectively.

If time is considered as a component of the state, then either the homogeneous case or the nonhomogeneous case terminology may be applied.

CONTINUITY

The process $x_t$ is said to be stochastically continuous at the point $x$ if

$$P_x\{\|x_\delta - x\| \ge \varepsilon\} \to 0$$

as $\delta \to 0$, for any $\varepsilon > 0$ (where $\|x\|^2 = \sum_i x_i^2$). If

$$P_x\Bigl\{\sup_{\delta \ge \Delta \ge 0} \|x_\Delta - x\| \ge \varepsilon\Bigr\} \to 0$$

uniformly for $x$ in a set $M$, as $\delta \to 0$, for any $\varepsilon > 0$, then the process is uniformly stochastically continuous in the set $M$.

2. Strong Markov Processes

In this section, it is assumed that the killing time $\zeta$ equals infinity with probability one. This assumption will not restrict the results of


the following chapters, but will allow a discussion of a number of concepts and results of Dynkin [2] (which we will use later) with a reasonably unburdened notation.

MARKOV TIME

Let $\mathscr{F}_t$ be the minimum $\sigma$-algebra on $\Omega$ determined by conditions on $x_s$, $0 \le s \le t$, $x_0 = x$ (and completed with respect to the measure $P_x(d\omega)$). A random variable $\tau$ taking values in $[0, \infty]$ and depending also on $x$ is called a Markov time if the event $\{\tau \le t\}$ is contained in $\mathscr{F}_t$ for each $t < \infty$ and fixed $x$. $\tau$ is also called a "time" or "optional time." More intuitively, $\tau$ is a functional of the sample paths and, whether or not the time $\tau$ has "arrived" by the time $t$, can be determined by observing the sample paths only up to and including $t$.

Define the first exit time of $x_t$ from an open set $Q$ by $\tau = \inf\{t : x_t \notin Q\}$. It is intuitively clear that if $x_t$ is continuous from the right with probability one, then $\tau$ is a Markov time (since whether or not the process has left $Q$ at least once by time $t$ can be determined (with probability one) by observing $x_s$, $s \le t$). More precisely (see Loève [1], p. 580), let $R_n$ be the set of $x$ points whose distance (Euclidean) from $E - Q$ is less than $1/n$ and let $\{r_i\}$ be the rationals. Then

$$\{\tau \le t\} = \{x_t \notin Q\} \cup \bigcap_n \bigcup_{r_i \le t} \{x_{r_i} \in R_n\} + N \in \mathscr{F}_t,$$

where $N$ is a null set. A much more thorough characterization of the first entrance and exit times which are also Markov times is given in Dynkin [2, Chapter 4, Section 1]. See also Itô [2].

If a Markov time is undefined for some $\omega$, we set it equal to infinity.
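A minimal computational sketch of a first exit time (illustrative: a discrete-time Gaussian walk and the set $Q = (-1, 1)$ are assumptions, not from the text). Deciding whether $\tau \le t$ uses only the path up to $t$, which is what makes $\tau$ a Markov time.

```python
# First exit time tau = inf{t : x_t not in Q} for a Gaussian random walk, Q = (-1, 1).
import numpy as np

rng = np.random.default_rng(1)

def first_exit_time(x0, n_steps=1000, step_std=0.1):
    x = x0
    for t in range(1, n_steps + 1):
        x += step_std * rng.standard_normal()
        if abs(x) >= 1.0:      # decided from observations up to time t only
            return t
    return np.inf              # undefined Markov times are set to infinity

taus = np.array([first_exit_time(0.0) for _ in range(2000)])
print(np.mean(taus <= 100.0))  # estimate of P_x{tau <= 100}
```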

STRONG MARKOV PROCESS

Markov times play an important role in the analysis of a Markov process, and will play a crucial role in the sequel. The events of greatest


interest to us will occur at Markov times; for example, the first time that a controlled process enters a target set; the first time that a trajectory leaves a given neighborhood of the origin (which is a property of interest in stability); or the first time that two processes are within $\varepsilon$ of one another. In addition, much of the analysis concerns the behavior of sequences of random variables which are the values of a Markov process at a sequence of Markov times.

It is useful to further restrict the class of Markov processes with which we deal so that the analysis with the Markov times will be feasible. The definitions which follow are according to Dynkin [2]. First, let $\tau$ be a Markov time and define the $\sigma$-algebra $\mathscr{F}_\tau$ as follows: let the set $A$ be in $\mathscr{F}_\tau$ if

$$A \cap \{\tau \le t\} \in \mathscr{F}_t \tag{2-1}$$

for each $t \ge 0$. The verification that $\mathscr{F}_\tau$ is a $\sigma$-field is straightforward; for example, let $A_n$, $n = 1, \ldots$ be in $\mathscr{F}_\tau$. Then, since $A_n \cap \{\tau \le t\}$ is in $\mathscr{F}_t$ for each $t$ and $n$, $(\bigcup_n A_n) \cap \{\tau \le t\}$ is also in $\mathscr{F}_t$ for each $t \ge 0$; hence $\bigcup_n A_n$ is in $\mathscr{F}_\tau$, and so forth. $\mathscr{F}_\tau$ includes those events which can be determined by observations

on the process up to and including, but no later than, $\tau$. Two examples will clarify this. Let $\tau = s$, a constant. $\tau$ is obviously a Markov time. For $t < s$, $\{\tau \le t\}$ is empty, and all sets $A$ satisfy (2-1). For $t \ge s$, $\{\tau \le t\} = \Omega$ and only the sets $A$ in $\mathscr{F}_s$ satisfy $\Omega \cap A \in \mathscr{F}_t$, all $t \ge s$. Thus $\mathscr{F}_\tau = \mathscr{F}_s$.

Consider also the simple example where $\Omega$ contains all possible outcomes of two successive coin tosses; $\Omega = \{HH, HT, TH, TT\}$. Let $\tau = 1$ if the first toss is heads and $\tau = 2$ otherwise. $\tau$ is a random time. Also, $\mathscr{F}_1 = \{\Omega, \phi, HH \cup HT, TT \cup TH\}$ and $\mathscr{F}_2$ contains all possible combinations of points in $\Omega$. It is easily verified that $\mathscr{F}_\tau$ is the field over $\{\Omega, \phi, HH \cup HT, TH \cup TT, TH, TT\}$, that is, the collection of events which can be observed at $n = 1$, if the first toss is a head, or at $n = 2$ otherwise.

In what follows we always assume that the process $x_t$ is continuous on the right with probability one; no other processes will be considered


in the monograph. The process $x_t$ is termed* a strong Markov process if, for any Markov time $\tau$ and any $t \ge 0$, $\Gamma$ in $\mathscr{B}$, and $x$ in $E$, the conditional probability satisfies

$$P_x\{x_{t+\tau} \in \Gamma \mid \mathscr{F}_\tau\} = P_x\{x_{t+\tau} \in \Gamma \mid x_\tau\} \tag{2-2}$$

with probability one. Equation (2-2) has the interpretation that the probability of $\{x_{t+\tau} \in \Gamma\}$, conditioned upon the history up to $\tau$, equals the probability of $\{x_{t+\tau} \in \Gamma\}$, conditioned upon $x_\tau$ only. For example, let $x_t$ be a continuous process, and let $\tau$ be the first exit time from a bounded open set $Q$. Then $x_\tau \in \partial Q$, and (2-2) means that the conditional probability that the path will be in $\Gamma$, $t$ units of time after contacting $\partial Q$, is the same whether only the position on $\partial Q$ at $\tau$ is given, or whether the method of attaining the position on $\partial Q$ at $\tau$ is given.

Since (2-2) holds for $\tau$ equal to any finite constant, any strong Markov process is also a Markov process. The reverse is not true; see the counterexample in Loève [1], pp. 577-578. Nevertheless, to the author's knowledge, all Markov processes which have been studied as models of physical processes are strongly Markovian.

DISCRETE PARAMETER PROCESSES

All discrete parameter Markov processes are also strong Markov processes. This follows directly from Loève [1, p. 581]. The basic distinction between discrete and continuous parameter processes which yields the result is that Markov times for discrete parameter processes can take only countably many values. That this is sufficient may be seen from the following argument. Suppose that $\tau$ is a Markov time for the process $x_1, \ldots$. Assume, for simplicity, that the process is homogeneous. To prove the result we must show that

* If $x_t$ is not right continuous, then Dynkin [2], p. 99, requires that it be measurable.


$$P_x\{x_{\tau+n} \in \Gamma \mid \mathscr{F}_\tau\} = P(n, x_\tau, \Gamma) \tag{2-3}$$

with probability one. By the left side of (2-3), we mean $P_x\{x_{\tau+n} \in \Gamma \mid \mathscr{F}_\tau\}$. Equation (2-3) is equivalent to the statement that

$$P_x\{A \cap \{x_{n+\tau(\omega)} \in \Gamma\}\} = \int_A P_x(d\omega)\, P(n, x_{\tau(\omega)}, \Gamma) \tag{2-4}$$

for any set $A$ in $\mathscr{F}_\tau$. Define the event $B_j = \{\tau = j\}$. The $B_j$ are disjoint sets and their union (including the event $B_\infty = \{\tau = \infty\}$) is $\Omega$. Thus (2-4) may be written as

$$\sum_j P_x\{A \cap B_j \cap \{x_{n+j} \in \Gamma\}\} = \int_{A \cap (\cup_j B_j)} P_x(d\omega)\, P(n, x_{\tau(\omega)}, \Gamma) = \sum_j \int_{A \cap B_j} P_x(d\omega)\, P(n, x_j, \Gamma). \tag{2-5}$$

However, the defining relation (2-6) for discrete parameter Markov processes,

$$P_x\{x_{n+j} \in \Gamma \mid x_i,\ i \le j\} = P(n, x_j, \Gamma) \quad \text{(with probability one)}, \tag{2-6}$$

implies (2-5) and concludes the demonstration.

FELLER PROCESSES

Suppose that the function $f(x)$ is bounded, continuous, and has compact support. If the function

$$E_x f(x_t) = F(x)$$

is continuous in $x$, for each $t \ge 0$, the process is termed a Feller process. Also, every right continuous Feller Markov process is a strong Markov process (Dynkin [2, Theorem 5.10]). Indeed, this criterion provides a quite useful computational check for the strong Markov property; see, for example, Sections 4 and 5. Other, more abstract, criteria for the strong Markov property are given in Dynkin [2], Chapter 3, Section 3.


WEAK INFINITESIMAL OPERATOR OF A STRONG MARKOV PROCESS

The function $f(x)$ is said to be in the domain of the weak infinitesimal operator $\tilde{A}$ of the process $x_t$, and we write $\tilde{A}f(x) = h(x)$, if the limit

$$\lim_{\delta \to 0} \frac{E_x f(x_\delta) - f(x)}{\delta} = h(x) \tag{2-7}$$

exists pointwise in $E$, and satisfies*

$$\lim_{\delta \to 0} E_x h(x_\delta) = h(x). \tag{2-8}$$

Clearly, $\tilde{A}$ is linear. It will be calculated for Itô and Poisson processes in Sections 4 and 5, respectively.

Suppose that $x_t$ is the solution to an ordinary deterministic differential equation $\dot{x} = g(x)$. Then "$f(x)$ is in the domain of $\tilde{A}$" implies that $f(x)$ has continuous first partial derivatives and that $f(x_t)$ is continuous in $t$. Under these conditions $\tilde{A}f(x) = f_x'(x)\,\dot{x} = \dot{f}(x)$. In general, $\tilde{A}f(x)$ is interpreted as the average time rate of change of the process $f(x_s)$ at time $s$, given that $x_s = x$. Note that $\tilde{A}$ is not necessarily a local (in the ordinary Euclidean topology) operator. The value of the function $\tilde{A}f(x) = h(x)$ at some point $x_0$ will depend on the values of $f(x_s)$ (with initial condition $x_0$) which are attainable for small $s$. For example, if $x_t$ is a Poisson process, then $h(x_0)$ will depend on the values of $f(x)$ which can be reached by a single jump from $x_0$. For a detailed development of the properties and uses of $\tilde{A}$ and related operators, the reader is referred to the monograph of Dynkin [2]. Its application in the sequel is due to formula (2-9).
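A numerical sketch of definition (2-7) (illustrative, not from the text): for a standard scalar Wiener process (treated formally in Section 4), $E_x f(x_\delta)$ is available in closed form for $f(x) = \sin x$, and the difference quotient approaches $f''(x)/2$.

```python
# Difference quotient (2-7) for Brownian motion and f(x) = sin(x).
import numpy as np

f = np.sin
x = 0.7
for delta in (1e-1, 1e-2, 1e-3):
    # x_delta = x + sqrt(delta) Z with Z ~ N(0,1), so E f(x_delta) = sin(x) e^{-delta/2}
    Ef = np.sin(x) * np.exp(-delta / 2.0)
    print(delta, (Ef - f(x)) / delta)
print(-0.5 * np.sin(x))   # the limit h(x) = f''(x)/2
```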

Consider the form and domain of the weak infinitesimal operator when the process is not homogeneous or when $f(x, t)$ depends explicitly on time. Suppose that for each constant $b$ in some interval the function of $x$ given by $f(x, b)$ is in the domain of $\tilde{A}$ and $\tilde{A}f(x, b) = h(x, b)$. Suppose also that $f(x, t)$ has a continuous and bounded deriva-

* If the limits (2-7) and (2-8) are uniform in $x$ in $E$, then $f(x)$ is in the domain of the strong infinitesimal operator of the process.


tive, $f_t(x, t)$, with respect to $t$, for each $x$ in $E$ and $t \le T$. Let

$$E_{x,t}\, h(x_{t+\delta},\, t+\delta) - h(x, t) \to 0, \qquad E_{x,t}\, f_t(x_{t+\delta},\, t+\alpha) - f_t(x, t) \to 0$$

as $\delta \to 0$ and $\delta \ge \alpha \to 0$. Then (2-7a) and (2-8a) hold and $f(x, t)$ is in the domain of $\tilde{A}$. (In (2-7a) and (2-8a) the symbol $\tilde{A}$ is used for the weak infinitesimal operator of $x_t$ operating on $f(x, t)$, where $t$ is fixed.)

$$\begin{aligned}
\lim_{\delta \to 0} \frac{E_{x,t}\, f(x_{t+\delta}, t+\delta) - f(x, t)}{\delta}
&= \lim_{\delta \to 0} \frac{E_{x,t}\, f(x_{t+\delta}, t+\delta) - E_{x,t}\, f(x_{t+\delta}, t)}{\delta}
 + \lim_{\delta \to 0} \frac{E_{x,t}\, f(x_{t+\delta}, t) - f(x, t)}{\delta} \\
&= \lim_{\substack{\delta \to 0 \\ \delta \ge \alpha \to 0}} E_{x,t}\, f_t(x_{t+\delta}, t+\alpha) + \tilde{A} f(x, t) \\
&= f_t(x, t) + h(x, t).
\end{aligned} \tag{2-7a}$$

DYNKIN’S FORMULA

Suppose that $x_t$ is a right continuous strong Markov process and $\tau$ is a random time with $E_x \tau < \infty$. Let $f(x)$ be in the domain of $\tilde{A}$, with $\tilde{A}f(x) = h(x)$. Then (Dynkin [2], p. 133)

$$E_x f(x_\tau) - f(x) = E_x \int_0^\tau h(x_s)\, ds = E_x \int_0^\tau \tilde{A}f(x_s)\, ds. \tag{2-9}$$

Equation (2-9) will be termed Dynkin's formula. The deterministic counterpart is the basic formula of the calculus:

$$f(x_t) - f(x) = \int_0^t \dot{f}(x_s)\, ds.$$
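A Monte Carlo sketch of Dynkin's formula (2-9) under illustrative assumptions (standard scalar Brownian motion, $f(x) = x^2$, $\tau$ the first exit time from $(-1, 1)$): here $h = \tilde{A}f = 1$, so (2-9) reads $E_x f(x_\tau) - f(x) = E_x \tau$.

```python
# Both sides of (2-9) for Brownian motion, f(x) = x^2, tau = exit time from (-1, 1).
import numpy as np

rng = np.random.default_rng(2)
dt, x0 = 1e-3, 0.0
lhs, taus = [], []
for _ in range(5000):
    x, t = x0, 0.0
    while abs(x) < 1.0:
        x += np.sqrt(dt) * rng.standard_normal()   # Euler step of Brownian motion
        t += dt
    lhs.append(x**2 - x0**2)
    taus.append(t)
print(np.mean(lhs), np.mean(taus))   # both approximately E_0 tau = 1
```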


3. Stopped Processes

Define $t \wedge s = \min(t, s)$. Suppose that $\tau$ is the first time of exit of $x_t$ from an open set $Q$. The process $\tilde{x}_t = x_{t \wedge \tau}$ is a stopped process:

$$\tilde{x}_t = x_t \quad (t < \tau), \qquad \tilde{x}_t = x_\tau \quad (t \ge \tau).$$

We use the notation $\tilde{P}_x$ and $\tilde{E}_x$ (in lieu of $P_x$ and $E_x$) for the stopped process only when confusion may otherwise arise. If $x_t$ is right continuous, then so is $\tilde{x}_t$. The process $\tilde{x}_t$ is strongly Markovian (Dynkin [2], Theorem 10.2). Denote the corresponding weak infinitesimal operator by $\tilde{A}_Q$.

If the limit (3-2) exists for the process $\tilde{x}_t$ and $\lim_{\delta \to 0} \tilde{E}_x h(\tilde{x}_\delta) = h(x)$, then $f(x)$ is in the domain of $\tilde{A}_Q$:

$$\tilde{E}_x f(\tilde{x}_\delta) = E_x \chi_{\tau \ge \delta}\, f(x_\delta) + E_x \chi_{\tau < \delta}\, f(x_\tau), \tag{3-1}$$

$$\lim_{\delta \to 0} \frac{\tilde{E}_x f(\tilde{x}_\delta) - f(x)}{\delta} = \tilde{A}_Q f(x). \tag{3-2}$$

$\chi_A$ is the characteristic function of the set $A$. Suppose that $x$ is not in $Q$. Then $\tilde{A}_Q f(x) = 0$. If $x$ is in $Q$, then $\tau > 0$ with probability one by right continuity. If $f(x)$ is in the domain of $\tilde{A}$ and

$$\tilde{E}_x h(\tilde{x}_\delta) \to h(x) \quad \text{as } \delta \to 0 \tag{3-3a}$$

and

$$E_x h(x_\delta) \to h(x) \tag{3-3b}$$

for all $x \in Q$, then $f(x)$ is also in the domain of $\tilde{A}_Q$ and $\tilde{A}_Q f(x) = \tilde{A}f(x)$ for $x$ in $Q$. Equation (3-2) differs from the corresponding limit for the unstopped process by

$$\lim_{\delta \to 0} \frac{E_x \chi_{\delta > \tau}\, [f(x_{\delta \wedge \tau}) - f(x_\delta)]}{\delta}.$$


If $V(x)$ is bounded and continuous, then this difference is zero if

$$\frac{P_x\{\tau \le \delta\}}{\delta} \to 0 \quad \text{as } \delta \to 0. \tag{3-4}$$

Equation (3-4) is true for the Itô and Poisson processes to be discussed, and essentially implies that the transition density $P(t, x, \Gamma)$ is differentiable with respect to $t$ (from above) at $t = 0$.

If $f(x)$ is in the domain of $\tilde{A}$ and (3-2) holds, then $f(x)$ is in the domain of $\tilde{A}_Q$ if $h(x)$ is continuous and bounded in $Q$. This follows since the difference between the left-hand sides of (3-3a) and (3-3b) is

$$\lim_{\delta \to 0} E_x \chi_{\delta > \tau}\, [h(x_{\delta \wedge \tau}) - h(x_\delta)],$$

which equals zero under the above condition.

It is implicitly assumed in much of the text that if $Q \supset P$, then $V(x)$ in the domain of $\tilde{A}_Q$ implies $V(x)$ is in the domain of $\tilde{A}_P$, and $\tilde{A}_Q V(x) = \tilde{A}_P V(x)$ in $P$.

4. Itô Processes

A class of continuous time Markov processes whose members are often used as models of stochastic control systems are the solution processes of the stochastic differential (Itô) equation

$$dx = f(x, t)\, dt + \sigma(x, t)\, dz. \tag{4-1}$$

$z_t$ is a normalized vector Wiener process with $E(z_t - z_s)(z_t - z_s)' = I\,|t - s|$, where $I$ is the identity matrix. Define the matrix

$$\{S_{ij}(x, t)\} = S(x, t) = \sigma'(x, t)\, \sigma(x, t).$$

In some engineering literature, (4-1) appears as

$$\dot{x} = f(x, t) + \sigma(x, t)\, \psi, \tag{4-2}$$

where $\psi$ is called "white Gaussian noise." The resemblance of (4-2) to an ordinary differential equation is misleading, since $\psi$ is not a function. Some relations between the solutions of (4-1) (or (4-2)) and the solutions of ordinary differential equations are discussed in Example 2 of Chapter II. See also Wong and Zakai [1, 2]. If $\sigma(x, s) \equiv 0$, then (4-1)


is interpreted to be an ordinary differential equation. Equation (4-1) is given a precise interpretation by Itô [1], who writes it as an integral equation:

$$x_t = x_0 + \int_0^t f(x_s, s)\, ds + \int_0^t \sigma(x_s, s)\, dz_s.$$

The second integral, called a stochastic integral, is defined roughly as follows. A random function $V_t$ and variable $\tau$, respectively, are said to be nonanticipative if $V_s$ and the event $\{\tau \le s\}$, respectively, are independent of $z_t - z_u$ for all triples $t \ge u \ge s$. Suppose that $g(\omega, s)$ is a scalar-valued nonanticipative random function which satisfies

$$E \int_0^T g^2(\omega, s)\, ds < \infty. \tag{4-3}$$

Let $\{t_i\}$ be an increasing sequence of numbers tending to $T$. Now, let $g(\omega, s)$ satisfy (4-3) and take the value (independent of $s$) $g_i(\omega)$ in the interval $[t_i, t_{i+1})$. Define the stochastic integral of such a simple function by

$$\int_0^T g(\omega, s)\, dz_s = \sum_i g_i(\omega)\, (z_{t_{i+1}} - z_{t_i}). \tag{4-4}$$

Approximate $g(\omega, s)$ by a mean fundamental sequence of such simple functions. The integral $\int_0^t g(\omega, s)\, dz_s$ is defined as a (mean square or probability one) limit of the corresponding sequence (4-4). With this definition, and the conditions (4-5), Itô uses a Picard iteration technique to construct a sequence of processes which converge with probability one to a process which is defined as the solution to (4-1):

$$\begin{gathered}
f(x, t),\ \sigma(x, t)\ \text{are continuous functions,} \\
\|f(x, t) - f(y, t)\| + \|\sigma(x, t) - \sigma(y, t)\| \le K\, \|x - y\|, \\
\|f(x, t)\|^2 = \sum_i |f_i(x, t)|^2 \le K(1 + \|x\|^2), \qquad \|\sigma(x, t)\|^2 \le K(1 + \|x\|^2),
\end{gathered} \tag{4-5}$$

where $K$ is a nonnegative real number.
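A minimal Euler-Maruyama sketch for simulating (4-1) under conditions (4-5); the linear drift and constant diffusion coefficient are illustrative assumptions, not from the text.

```python
# Euler-Maruyama approximation of dx = f(x,t) dt + sigma(x,t) dz.
import numpy as np

rng = np.random.default_rng(3)
f = lambda x, t: -x          # continuous, Lipschitz, linear growth: satisfies (4-5)
sigma = lambda x, t: 0.5     # constant diffusion coefficient

dt, n_steps = 1e-3, 5000
x = np.full(10_000, 1.0)     # an ensemble of paths started at x_0 = 1
for k in range(n_steps):
    dz = np.sqrt(dt) * rng.standard_normal(x.shape)   # increments of the Wiener process
    x = x + f(x, k * dt) * dt + sigma(x, k * dt) * dz
print(x.mean(), x.var())     # stationary variance of this linear example is 0.125
```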


The scalar case of (4-1) is discussed by Doob [1], and the vector case by Dynkin [2] and Skorokhod [1]. Under (4-5), the solution of (4-1) is a Markov process with killing time equal to infinity. The process is continuous with probability one and, for any $0 \le a < b < \infty$ and initial condition $x$,

$$E_{x,a} \max_{a \le t \le b} \|x_t\|^2 < \infty.$$

$x_t$ is independent of $z_s - z_u$, all $s > u \ge t$. Also

$$P_{x,t}\Bigl\{\max_{\delta \ge \Delta \ge 0} \|x_{t+\Delta} - x\| \ge \varepsilon\Bigr\} \le (1 + \|x\|^2)^{3/2}\, O(\delta^{3/2}). \tag{4-6}$$

$O(\cdot)$ is uniform in $t$ and $x$, but depends on $\varepsilon$. The scalar version of (4-6) is given in Doob [1], p. 285, and the vector version is derived in an identical manner. By (4-6), the process is uniformly stochastically continuous in any compact set.

Skorokhod [1] shows that, for fixed $t$, $x_t$ is continuous in probability with respect to the initial condition. In other words, let the solutions $x_t$ and $y_t$ of (4-1) correspond to initial values $x$ and $y$, respectively. Then

$$P_{x,y,t}\{\|x_{t+\delta} - y_{t+\delta}\| \ge \varepsilon\} \to 0 \tag{4-7}$$

as $\|x - y\| \to 0$, for any $\varepsilon > 0$, $\delta > 0$. Since $E_{x,a} \max_{0 \le t \le b} \|x_t\|^2 < \infty$, for each fixed $a$, $b$, and $x$, Chebyshev's inequality gives

$$P_{x,t}\{\|x_T - x\| > N\} \le \frac{K_1}{N^2}, \tag{4-8}$$

where $K_1$ is some real number. See Doob [1], p. 285, for the derivations of the scalar versions of (4-9) and (4-10):

$$E_{x,t}(x_{t+\delta} - x) = \int_t^{t+\delta} f(x, s)\, ds + (1 + \|x\|^2)^{1/2}\, O(\delta^{3/2}), \tag{4-9}$$

$$E_{x,t}(x_{t+\delta} - x)(x_{t+\delta} - x)' = \int_t^{t+\delta} S(x, s)\, ds + (1 + \|x\|^2)\, O(\delta^{3/2}). \tag{4-10}$$


STRONG MARKOV PROPERTY


Next, we show that the solution (in the sense of Itô) of (4-1) is a Feller process and, hence, since the paths $x_t$ are continuous, $x_t$ is a strong Markov process. Let $g(x)$ be a bounded continuous real-valued function and consider $t$ as a component of the state; then we must show that $E_x g(x_t)$ is continuous in $x$ for each $t > 0$. Suppose that $|g(x)| \le B < \infty$ and $x_t$ and $y_t$ are solutions of (4-1) with initial conditions $x$ and $y$, respectively. Then we must show that $|E_x g(x_t) - E_y g(y_t)| \to 0$ as $\|x - y\| \to 0$. We now divide the possible situations into the following three cases: either $\|x_t - y_t\| \ge \rho$, or else $\|x_t - y_t\| < \rho$ occurs together with either $\|x_t\| \ge N$ or $\|x_t\| < N$. Thus

$$|E_x g(x_t) - E_y g(y_t)| \le P_{x,y}\{\|x_t - y_t\| \ge \rho\}\, 2B + P_x\{\|x_t\| \ge N\}\, 2B + \sup_{\substack{\|v\| \le \rho \\ \|w\| \le N}} |g(w + v) - g(w)|. \tag{4-11}$$

Given any $\delta > 0$, we will find an $\varepsilon > 0$ so that if $\|x - y\| \le \varepsilon$, then each term on the right of (4-11) is less than $\delta/3$. Choose $N < \infty$ so that $P_x\{\|x_t\| > N\} \le \delta/6B$. Then choose $\rho > 0$ so that the last term of (4-11) is less than $\delta/3$. Finally choose $\varepsilon > 0$ so that $P_{x,y}\{\|x_t - y_t\| \ge \rho\} < \delta/6B$, if $\|x - y\| < \varepsilon$. The demonstration is complete.

DIFFERENTIAL GENERATOR

The operator

$$\mathscr{L} = \sum_i f_i(x, t) \frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{i,j} S_{ij}(x, t) \frac{\partial^2}{\partial x_i \partial x_j} + \frac{\partial}{\partial t} \tag{4-12}$$

is known as the differential generator of the process $x_t$.

is known as the differential generator of the process x,. Suppose that the function g(x, r ) is bounded and has bounded and

continuous first and second partial derivatives with respect to the x i , and a bounded and continuous derivative with respect to t . By using the evaluations (4-9) and (4-10) it is not hard to show that for each


fixed $t$ and $x$,

$$\lim_{\delta \to 0} \frac{E_{x,t}\, g(x_{t+\delta}, t+\delta) - g(x, t)}{\delta} = \mathscr{L}g(x, t). \tag{4-13}$$

Let $\mathscr{L}g(x, t) = h(x, t)$. The verification of

$$\lim_{\delta \to 0} E_{x,t}\, h(x_{t+\delta}, t + \delta) = h(x, t) \tag{4-14}$$

is straightforward, and will be demonstrated for the term $g_{x_i x_j}(x, t)\, S_{ij}(x, t)$ only. By our assumptions on $g(x, t)$ and $S(x)$, $|g_{x_i x_j}(x, t)\, S_{ij}(x, t)| \le K_2(1 + \|x\|^2)$, for some real number $K_2$. Also, $E_x \max_{\delta \ge u \ge 0} \|x_u\|^2 < \infty$ for any $\delta > 0$. Then it follows directly from the dominated convergence theorem, and the continuity of both the process $x_t$ and the functions $S_{ij}(\cdot,\cdot)$ and $g_{x_i x_j}(\cdot,\cdot)$, that

$$\lim_{\delta \to 0} E_{x,t}\, g_{x_i x_j}(x_{t+\delta}, t+\delta)\, S_{ij}(x_{t+\delta}, t+\delta) = g_{x_i x_j}(x, t)\, S_{ij}(x, t).$$

Equation (4-13) together with (4-14) imply that, on the class of functions described by the first sentence of the previous paragraph,

$$\tilde{A} = \mathscr{L}.$$
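A small sketch of the differential generator (4-12) evaluated by central finite differences (illustrative; the helper `generator` and the test functions are hypothetical, not from the text).

```python
# (Lg)(x,t) = sum_i f_i g_xi + (1/2) sum_ij S_ij g_xixj + g_t, by finite differences.
import numpy as np

def generator(g, f, S, x, t, h=1e-4):
    n = len(x)
    e = np.eye(n)
    gt = (g(x, t + h) - g(x, t - h)) / (2 * h)
    gx = np.array([(g(x + h*e[i], t) - g(x - h*e[i], t)) / (2*h) for i in range(n)])
    gxx = np.array([[(g(x + h*e[i] + h*e[j], t) - g(x + h*e[i] - h*e[j], t)
                      - g(x - h*e[i] + h*e[j], t) + g(x - h*e[i] - h*e[j], t)) / (4*h*h)
                     for j in range(n)] for i in range(n)])
    return f(x, t) @ gx + 0.5 * np.sum(S(x, t) * gxx) + gt

# Scalar example dx = -x dt + dz, g = x^2: Lg = -2x^2 + 1, so Lg(0.5) = 0.5.
f = lambda x, t: -x
S = lambda x, t: np.eye(1)
g = lambda x, t: float(x[0] ** 2)
print(generator(g, f, S, np.array([0.5]), 0.0))
```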

ITÔ'S LEMMA

Let $x_t$ be an Itô process, and let $F(x, t)$ have continuous derivatives $F_t(x, t)$, $F_{x_i}(x, t)$, and $F_{x_i x_j}(x, t)$ for $0 \le t \le T$ and $\|x\| < \infty$, and suppose that $\sup_{T \ge t \ge 0} |F(x_t, t)| < \infty$ with probability one. Then, for $0 \le s \le t \le T$ (see Itô [1], Dynkin [2], Theorem 7.2, Skorokhod [1], Theorem 2.5),

$$F(x_t, t) - F(x_s, s) = \int_s^t \mathscr{L}F(x_u, u)\, du + \int_s^t F_x'(x_u, u)\, \sigma(x_u, u)\, dz_u \tag{4-15}$$

with probability one.
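As a worked illustration of (4-15) (not from the text), take the scalar case $F(x, t) = x^2$, for which $F_t = 0$, $F_x = 2x$, $F_{xx} = 2$:

```latex
% Ito's lemma (4-15) with F(x) = x^2 and dx = f(x,t) dt + sigma(x,t) dz:
% \mathscr{L}F = 2xf + \sigma^2, so
\[
x_t^2 - x_s^2
  = \int_s^t \bigl( 2 x_u f(x_u, u) + \sigma^2(x_u, u) \bigr)\, du
  + \int_s^t 2 x_u \, \sigma(x_u, u)\, dz_u .
\]
% The \sigma^2 term is the correction that distinguishes the stochastic rule
% from the ordinary chain rule.
```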

Now, consider a more general Itô process. Assume that $f(x, t, \omega)$


and $\sigma(x, t, \omega)$ satisfy (4-5) uniformly in $\omega$ and are nonanticipative random functions if $x$ is nonanticipative. Then the equation

$$dx = f(x, t, \omega)\, dt + \sigma(x, t, \omega)\, dz \tag{4-16}$$

may be interpreted in very much the same way as (4-1). The solutions are continuous with probability one, $E \max_{b \ge s \ge a \ge 0} x_s' x_s < \infty$, etc. Of course, $x_t$ is not necessarily a Markov process. Furthermore, (4-15) holds where

$$\mathscr{L} = \frac{\partial}{\partial t} + \sum_i f_i(x, t, \omega) \frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{i,j} S_{ij}(x, t, \omega) \frac{\partial^2}{\partial x_i \partial x_j}.$$

Equation (4-15) also holds if $t$ and $s$ are nonanticipative random variables. Write $\chi_\tau$ as the indicator of the $(\omega, t)$ set where $\tau \le t$, and suppose that $\tau$ and $q(\omega, t)$ are nonanticipative, and $\tau$ is uniformly bounded. Then

$$\int_\tau^T q(\omega, t)\, dz_t = \int_0^T \chi_\tau\, q(\omega, t)\, dz_t, \qquad E \int_0^T \chi_\tau\, q(\omega, t)\, dz_t = 0.$$

STOPPED PROCESSES

The processes obtained upon stopping the solutions of (4-1) at random times (of the process $z_t$) are discussed in Dynkin [2], Chapter 11, Section 3. Suppose that the stopping time $\tau$ is the first moment of exit of $x_t$ from an open set $Q$. By continuity, $x_\tau$ is on $\partial Q$ with probability one. By (4-6), the limit (3-4) is zero. Let $g(x, t)$ be the restriction to $Q + \partial Q$ of a function which, together with its derivatives, satisfies the boundedness and continuity conditions following equation (4-12). Then $g(x, t)$ is in the domain of $\tilde{A}_Q$ and $\mathscr{L}g(x, t) = \tilde{A}_Q g(x, t)$ in $Q$.


WEAKENING OF CONDITION (4-5)

Suppose that there is an increasing sequence of open sets $Q_n$, tending to $E$, so that (4-5) is satisfied in each $Q_n + \partial Q_n$ (with finite Lipschitz constant and bound $K_n$). Construct a sequence of functions $f_m(x, t)$ and $\sigma_m(x, t)$, agreeing with $f(x, t)$ and $\sigma(x, t)$, respectively, on $Q_m + \partial Q_m$, and satisfying (4-5) for some finite $K_m$. Corresponding to each of these functions there is a well-defined unique solution to (4-1), which we denote by $x_t^m$. The processes $x_t^m$, $m \ge n$, take identical values until the first exit time from $Q_n$, and this first exit time from $Q_n$ is identical for all $x_t^m$, $m \ge n$. (See, for example, Dynkin [2], Chapter 11, Section 3, or Doob [2].) The sequence of first exit times, $\tau_n$, of $x_t^n$ from $Q_n$, increase and tend to a finite or infinite limit $\zeta$. By this procedure we may define a unique solution $x_t$ to (4-1), for $t < \zeta$, with (4-5) being satisfied only locally. If $\zeta(\omega) < \infty$, then we say that $x_t(\omega)$ has a finite escape time.

In the sequel, we will generally be concerned with processes which are either stopped at or defined until the first moment of exit from open sets. It will then be required that (4-5) hold in these sets only.

5. Poisson Differential Equations

Let $q_t$ be a vector-valued process whose components are independent Poisson step processes with $a_i \Delta + o(\Delta)$ the probability that the $i$th component will experience a jump in $[t, t + \Delta)$. Given that a jump occurs, let $P_i(dy)$ be the corresponding probability measure on the jump amplitude. Let $P_i(dy)$ have compact support. Let*

$$\int y\, P_i(dy) = 0$$

for each $i$. Write†

$$dx = f(x, t)\, dt + \sigma(x, t)\, dq. \tag{5-1}$$

* $E y_i = 0$ is a helpful assumption, since then the $q_t$ in (5-1) is a martingale, and a very close analog with the results of the previous section is available.
† Actually, (5-1) was also studied first by Itô [1].


The theory of equations of the form (5-1) is precisely that of equations (4-1). The definitions and existence and uniqueness proofs are identical. This derives from the fact that both the Wiener process $z_t$ and the Poisson process $q_t$ are martingales and have independent infinitely divisible increments. Both (5-1) and (4-1) are special cases of the equations discussed by Itô [1] and Skorokhod [1].
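A simulation sketch of (5-1) with a single jump component (illustrative assumptions: jump rate $a$, amplitudes $\pm 1$ with equal probability, so $\int y\, P(dy) = 0$ as required above).

```python
# Euler-type simulation of dx = f(x,t) dt + sigma(x,t) dq, q a centered Poisson jump process.
import numpy as np

rng = np.random.default_rng(4)
f = lambda x, t: -x
sigma = lambda x, t: 0.3
a, dt, n_steps = 2.0, 1e-3, 5000

x = np.zeros(10_000)
for k in range(n_steps):
    jump = rng.random(x.shape) < a * dt              # jump in [t, t+dt) w.p. a dt + o(dt)
    amp = rng.choice([-1.0, 1.0], size=x.shape)      # jump amplitude, mean zero
    x = x + f(x, k * dt) * dt + sigma(x, k * dt) * jump * amp
print(x.mean(), x.var())   # mean near 0; variance near a sigma^2 E y^2 / 2 = 0.09
```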

Let (4-5) hold. Then, as in Section 4,

$$E_{x,a} \max_{a \le t \le b} \|x_t\|^2 < \infty, \tag{5-2}$$

and

$$P_{x,y,t}\{\|x_{t+\delta} - y_{t+\delta}\| \ge \varepsilon\} \to 0 \tag{5-3}$$

as $\|x - y\| \to 0$. Equation (5-3) is the analog of (4-7). Also

$$P_{x,t}\Bigl\{\max_{\delta \ge \Delta \ge 0} \|x_{t+\Delta} - x\| > \varepsilon\Bigr\} = O(\delta). \tag{5-4}$$

The derivation of (5-4) is almost exactly that of (4-6). (The scalar case derivation of (4-6) appears in Doob [1], p. 285; note that the last term of Doob's equation (3.19), p. 285, is now $O(\delta)$, rather than $O(\delta^{3/2})$, since our $q_t$ is a Poisson process. All other terms are similar in both cases.) The analog of (4-8) is derived by using (5-2). Equation (4-9) also holds here. Also, for fixed $x$ and $t$,

$$x_{t+\delta} - x = \int_t^{t+\delta} f(x_s, s)\, ds + \int_t^{t+\delta} \sigma(x, s)\, dq_s + \rho_\delta. \tag{5-5}$$

The normed term $\rho_\delta$ in (5-5) equals

$$\int_t^{t+\delta} \sigma(x_s, s)\, dq_s - \int_t^{t+\delta} \sigma(x, s)\, dq_s,$$

whose mean square value is of the order of $\delta$. (The integrals are evaluated exactly as in the scalar case in Doob [1], p. 284, (3.16) and (3.17).)

It is readily verified that $x_t$ is a Feller process and, hence, since the paths are continuous from the right, it is a strong Markov process.


Equation (5-4) implies that, at least on bounded continuous functions, if $g(x, t)$ is in the domain of $\tilde{A}$, then it is in the domain of $\tilde{A}_Q$, and $\tilde{A}g(x, t) = \tilde{A}_Q g(x, t)$ in $Q$ (see (3-1)).

Define the operator $\mathscr{D}$. Let $y^i$ be a vector with $y$ in the $i$th component and zeros elsewhere:

$$\mathscr{D}g(x, t) = \frac{\partial g(x, t)}{\partial t} + \sum_j g_{x_j}(x, t)\, f_j(x, t) + \sum_{i=1}^n a_i \int \bigl[ g(x + \sigma(x, t)\, y^i,\ t) - g(x, t) \bigr] P_i(dy).$$

Suppose that $g(x, t)$, $g_t(x, t)$, $g_{x_i}(x, t)$ are bounded and continuous, and the support of $P_i(dy)$ is compact. The evaluation (5-5) then yields

$$\lim_{\delta \to 0} \frac{E_{x,t}\, g(x_{t+\delta}, t+\delta) - g(x, t)}{\delta} = \mathscr{D}g(x, t) = h(x, t)$$

and

$$\lim_{\delta \to 0} E_{x,t}\, h(x_{t+\delta}, t + \delta) = h(x, t).$$

Thus

$$\mathscr{D} = \tilde{A}$$

on the class of functions described.

SECOND FORM OF THE POISSON EQUATION

Let us consider another form of a "Poisson driven" process. We use the homogeneous case notation. Let $\dot{x} = f(x, y)$, where $f(x, y)$ is bounded and satisfies a uniform Lipschitz condition in $x$, and $y_t$ is a Poisson step process, taking values $y^1, \ldots, y^n$. Let $a_{ij}\delta + o(\delta) = P\{y_{t+\delta} = y^j \mid y_t = y^i\}$. The pair $(x_t, y_t)$ is clearly a right continuous


strong Markov process with no finite escape time. In fact, $x_t$ is uniformly continuous.

Suppose that $g(x, y)$ is bounded and has continuous and bounded derivatives $g_{x_i}(x, y)$, for each $y^i$. Then we define the operator $\mathscr{A}$:

$$\mathscr{A}g(x, y^i) = \sum_j g_{x_j}(x, y^i)\, f_j(x, y^i) + \sum_j a_{ij} \bigl[ g(x, y^j) - g(x, y^i) \bigr] = h(x, y^i). \tag{5-8}$$

We will show that on the described class of functions, $\mathscr{A}$ is the weak infinitesimal operator of the process $(x_t, y_t)$. The fact that $E_{x,y^i}\, h(x_\delta, y_\delta) \to h(x, y^i)$ is obvious. This property holds for any function $h(x, y)$ which is bounded and continuous in $x$ for each value of $y$, $y = y^1, \ldots, y^n$. Now, supposing that $g(x, y)$ and its derivatives are bounded in absolute value by $G$,

$$E_{x,y^i}\, g(x_\delta, y_\delta) = \Bigl(1 - \sum_j a_{ij}\delta\Bigr)\, g\Bigl(x + \int_0^\delta f(x_s, y^i)\, ds,\ y^i\Bigr) + \sum_j a_{ij}\delta\, E_{x,y^i}\, g\Bigl(x + \int_0^\Delta f(x_s, y^i)\, ds + \int_\Delta^\delta f(x_s, y^j)\, ds,\ y^j\Bigr) + G\, o(\delta), \tag{5-9}$$

where $\Delta$ is the (random) time of a transition in $y_t$ (supposing that there is one transition), and the expectation on the right of (5-9) is over $\Delta(\omega)$. Continuing the evaluation of (5-9), we have

$$E_{x,y^i}\, g(x_\delta, y_\delta) = \delta \sum_j g_{x_j}(x, y^i)\, f_j(x, y^i) + \delta \sum_j g(x, y^j)\, a_{ij} - \delta \sum_j g(x, y^i)\, a_{ij} + G\, o(\delta) + g(x, y^i),$$

which yields the desired result that (5-8) equals $\tilde{A}$ on $g(x, y)$.

Let $Q_1, \ldots, Q_n$ be open $x$ sets. Let $Q$ have the form $Q = Q_1 \times \{y^1\} + \cdots + Q_n \times \{y^n\}$. Write $\bar{Q} = \bigcup_{i=1}^n Q_i$. Let $\tau$ be the first exit time of $(x_t, y_t)$ from the open set $Q$. Since for $(x, y)$ in $Q$, $P_{x,y}\{\tau < t\} = O(t)$, we have $\tilde{A} = \mathscr{A} = \tilde{A}_Q$ on the described class of functions. If $g(x, y)$ and


$f(x, y)$ satisfy the imposed uniform boundedness, differentiability, and uniform Lipschitz conditions only on $Q_i$ for each $y^i$, then we still have $\tilde{A}_Q = \mathscr{A}$, and the process $(x_{t \wedge \tau}, y_{t \wedge \tau})$ is a right continuous strong Markov process.

6. Strong Diffusion Processes

Markov processes with continuous paths and for which there are Hölder continuous functions $a_{ij}(x, t)$, $b_i(x, t)$, and $c(x, t)$ given by (6-1) are called diffusion processes. Let $\varepsilon$ be any positive real number:

$$\lim_{s \to 0} \frac{1}{s} \int_{\{\|x_{t+s} - x\| < \varepsilon\}} (x_{t+s} - x)\, P_{x,t}(d\omega) = \{b_i(x, t)\} = b(x, t),$$

$$\lim_{s \to 0} \frac{1}{s} \int_{\{\|x_{t+s} - x\| < \varepsilon\}} (x_{t+s} - x)(x_{t+s} - x)'\, P_{x,t}(d\omega) = \{a_{ij}(x, t)\} = a(x, t),$$

$$\lim_{s \to 0} \frac{1}{s} \Bigl[ 1 - \int_{\{x_{t+s} \in E\}} P_{x,t}(d\omega) \Bigr] = c(x, t). \tag{6-1}$$

Itô processes (Section 4) constitute one example (where $c = 0$). ($c(x, t)\,\Delta$ is essentially the probability of escape to infinity from $x$ in the interval $[t, t + \Delta)$.) Under conditions (6-2) and (6-3), there are some rather strong properties associated with the process. We now suppose that (6-2) and (6-3) hold in a bounded open region $G$ in $E$.

For some real positive $\mu$ and any vector $\xi$,

$$\sum_{i,j} a_{ij}(x, t)\, \xi_i \xi_j \ge \mu\, \|\xi\|^2. \tag{6-2}$$

$a_{ij}(x, t)$, $b_i(x, t)$, $c(x, t)$ are continuous and bounded and satisfy a uniform Hölder condition in $x$. (6-3)

$\partial G$ is of class* $H_{2+\alpha}$. (6-4)

* The local representation of the boundary $\partial G$ has Hölder continuous second derivatives.


Let $a_{ij}$, $b_i$, and $c$ depend only on $x$, and not on $t$. The following references are all to Dynkin [2]. Under (6-2) and (6-3), there is a neighborhood $U$ of each $x$ in $G$ with $E_x \tau < \infty$, where $\tau$ is the first exit time from $U$ (Theorem 5.8). Since $G$ has a compact closure, (6-2) and (6-3) imply the following (Theorems 5.10 and 13.18) for $x$ in $G$ and on the class of bounded functions of $x$ with continuous first and second derivatives*:

$$\tilde{A} = \tilde{A}_Q = L - c(x),$$

$$L - c(x) = \frac{1}{2} \sum_{i,j} a_{ij}(x) \frac{\partial^2}{\partial x_i \partial x_j} + \sum_i b_i(x) \frac{\partial}{\partial x_i} - c(x).$$

The process is stochastically continuous in $G$, and the killing time is greater than zero with probability one (Lemma 5.11). In fact, if $c(x) \equiv 0$, the process is continuous and defined with probability one up to at least the first moment of contact with $\partial G$. All processes with the same differential generator $L - c(x)$ in $G$ are equivalent in $G$ (p. 160) (that is, they have the same probability transition function).

Furthermore (Theorems 5.8, 5.11, 13.11, 13.12, and 13.16), to each operator $L - c(x)$ satisfying (6-2) and (6-3) on $G$, there is a unique† process on $G$ (terminated upon first exit from $G$) with differential generator $L - c(x)$. The transition density of this process is the unique fundamental solution $p(t, x, y)$ of

$$\frac{\partial p}{\partial t} = (L - c(x))\, p$$

with boundary condition $p(t, x, y) \to 0$ as $x \to \partial G$ and $t > 0$. Let $G$ be bounded and $\partial G$ satisfy (6-4). Let $g(x)$ satisfy a uniform Hölder condition and be bounded in $G$, and let $\varphi(x)$, defined on $\partial G$, be continuous. Then

$$(L - c(x))\, f(x) = -g(x) \quad (x \in G), \qquad f(x) = \varphi(x) \quad (x \in \partial G) \tag{6-5}$$

has a unique solution which is (6-6) (Theorem 13.16):

$$f(x) = E_x\, \varphi(x_\tau) \exp\Bigl(-\int_0^\tau c(x_s)\, ds\Bigr) + E_x \int_0^\tau g(x_t) \exp\Bigl(-\int_0^t c(x_s)\, ds\Bigr) dt. \tag{6-6}$$

* We write $L$ in lieu of $\mathscr{L}$, since $\mathscr{L}$ includes a term $\partial/\partial t$.
† The process is unique if the initial distribution is fixed.


$\tau$ is the first instant of contact of $x_t$ with $\partial G$. In particular, let $\varphi(x) = 0$ on $\partial G$ and $g(x) = 1$, $c = 0$. Then the

unique solution of $Lf = -1$ is $f(x) = E_x \tau$, which must thus be finite.

Suppose that $G$ is of the form of Figure 1, with smooth exterior boundary $\Gamma_1$ and smooth interior boundary $\Gamma_2$. Let $\varphi(x) = 1$ on $\Gamma_1$ and $\varphi(x) = 0$ on $\Gamma_2$, and let $c = g = 0$. Then $Lf = 0$ has the unique solution

$$f(x) = E_x\, \varphi(x_\tau) = P_x\{x_t \text{ touches } \Gamma_1 \text{ before } \Gamma_2\}.$$

[Figure 1: the region $G$, with exterior boundary $\Gamma_1$ and interior boundary $\Gamma_2$.]
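A Monte Carlo sketch of this boundary-value characterization (illustrative, not from the text): for standard scalar Brownian motion on $G = (0, 1)$, $L = \frac{1}{2}\, d^2/dx^2$, and the unique solution of $Lf = -1$ vanishing on $\partial G$ is $f(x) = x(1 - x) = E_x \tau$.

```python
# E_x tau for Brownian motion on (0, 1) versus the PDE solution x(1 - x).
import numpy as np

rng = np.random.default_rng(5)

def exit_time(x, dt=1e-4):
    t = 0.0
    while 0.0 < x < 1.0:
        x += np.sqrt(dt) * rng.standard_normal()
        t += dt
    return t

x0 = 0.3
est = np.mean([exit_time(x0) for _ in range(2000)])
print(est, x0 * (1.0 - x0))   # Monte Carlo estimate vs. 0.21
```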


7. Martingales

The references of this section are specifically to Doob [1]. (See also Loève [1].) Let $y_n$, $n = 1, \ldots$ be a sequence of random variables and $\mathscr{B}_n$, $n = 1, \ldots$ a nondecreasing sequence of $\sigma$-algebras. If $y_n$ is measurable with respect to $\mathscr{B}_n$ and satisfies

$$E[y_{n+1} \mid \mathscr{B}_n] = y_n$$

with probability one, then the sequence $\{y_n, \mathscr{B}_n\}$ is a martingale. If

$$E[y_{n+1} \mid \mathscr{B}_n] \le y_n \tag{7-1}$$

with probability one, then the sequence $\{y_n, \mathscr{B}_n\}$ is a supermartingale. Equation (7-1) implies, for each set $A$ in $\mathscr{B}_n$, and any $n \ge 1$, $m \ge 0$,

$$\int_A y_{n+m}\, P(d\omega) \le \int_A y_n\, P(d\omega).$$

Let $x_n$, $n = 1, \ldots$, be a Markov process, and $\mathscr{B}_n$ the minimum $\sigma$-algebra over which $x_i$, $i \le n$, is measurable. If $E|V(x_n)| < \infty$ for each $n < \infty$ and

$$E_x[V(x_{n+1}) \mid \mathscr{B}_n] \le V(x_n) \tag{7-2}$$

with probability one, then $\{V(x_n), \mathscr{B}_n\}$, $n < \infty$, is a supermartingale. The left-hand side of equation (7-2) equals $E_{x_n} V(x_{n+1})$ with probability one. (This holds also for continuous parameter processes.)

Let $\tau$ be an (integral-valued) Markov time. If $\{y_n, \mathscr{B}_n\}$ is a supermartingale or martingale, then so is $\{y_{n \wedge \tau}, \mathscr{B}_n\}$ (Chapter 7, Theorem 2.1).

Let $y_n \ge 0$ and suppose that $\{y_n, \mathscr{B}_n\}$ is a supermartingale. The arguments yielding equation (3.2) of Chapter 7 of Doob [1] (Theorem 4.1s (Doob)) also give, for each $\lambda \ge 0$,

$$P\Bigl\{\max_{n \ge i \ge 1} y_i \ge \lambda\Bigr\} \le \frac{E\, y_1}{\lambda}. \tag{7-3}$$

Let $\{y_n, \mathscr{B}_n\}$ be a submartingale. Let $\operatorname{l.u.b.}_n E|y_n| = M < \infty$. Then $\lim_n y_n = y_\infty$ exists with probability one and


$E|y_\infty| \le M$. (If $y_n \ge 0$ is a supermartingale, the theorem applies to $\{-y_n, \mathscr{B}_n\}$. Then the hypothesis is satisfied if $E|y_1| < \infty$.) If the $y_n$ are uniformly integrable, then $E|y_\infty - y_n| \to 0$.

The continuous parameter results are essentially identical to the discrete parameter results. Separability of the process is always assumed. If (7-1) holds and $\mathscr{B}_t \supset \mathscr{B}_s$, $t \ge s$, and $y_t$ is measurable over $\mathscr{B}_t$ and $E|y_t| < \infty$, then $\{y_t, \mathscr{B}_t\}$, $t < \infty$, is a supermartingale.

For a nonnegative supermartingale,

$$P\Bigl\{\sup_{T \ge t \ge 0} y_t \ge \lambda\Bigr\} \le \frac{E\, y_0}{\lambda}. \tag{7-4}$$

Let $\{y_t, \mathscr{B}_t\}$ be a submartingale. Let $\operatorname{l.u.b.}_t E|y_t| < \infty$. Then there exists a random variable $y_\infty$ such that $y_t \to y_\infty$ with probability one. If the $y_t$ are uniformly integrable, then $E|y_\infty - y_t| \to 0$.

(Theorem 11.6 (Doob); see also remark on p. 379). Let $\tau$ be a nonanticipative time for the process $y_t$, $0 \le t < \infty$, and let $\{y_t, \mathscr{B}_t\}$ be a submartingale, where $y_t$ is continuous from the right with probability one. Then $\{y_{t \wedge \tau}, \mathscr{B}_t\}$ is a submartingale. ($y_\tau$ is, of course, evaluated as the limit of $y_{\tau + \varepsilon}$ as $\varepsilon \to 0$ from above.)


II / STOCHASTIC STABILITY

1. Introduction

Deterministic stability is a branch of the qualitative theory of dynamical systems. In particular, the majority of presently available results which are termed stability results pertain to certain qualitative and quantitative (which do not involve the actual computation of a solution) properties of differential equations. Consider the differential equation $\dot{x} = f(x, t)$ with initial condition $x_0$ belonging to a set $R$. In what follows, $R$ may vary but will always be a nonempty bounded open set containing the origin $x = 0$. Let $P$ be a set containing $R$. Some typical problems which may be grouped under the title "stability problems" are:

(P1) Let $P$ be given. Is there an $R$ such that if $x_0 \in R$, then $x_t \in P$ for all finite $t$?
(P2) In reference to (P1), is there some $R$ corresponding to each open set $P$ containing the origin?
(P3) In reference to (P1), estimate the largest set $R$. (The estimate is a quantitative property.)
(P4) For a given set of initial values $R$, is the set $P$, containing the range of the trajectory for all $t < \infty$, bounded?
(P5) What is the smallest set containing the asymptotic values of the solution for all $x_0$ in $R$?
(P6) For fixed $P$ and $x_0$ in $R$, estimate the first time of exit of $x_t$ from $P$.



(P7) Will the trajectory intersect a given set $S$ in finite time, for any $x_0$ in $R$?

(P8) Let $\dot{x} = f(x, t) + p_t$. Will $\|x_t\|$ be uniformly bounded if $\|p_t\|$ is?

The reader is referred to the monographs of LaSalle and Lefschetz [1], Hahn [1], Krasovskii [1], Aiserman and Gantmacher [1], or Lefschetz [1] for more specific details and references on the deterministic "stability" problems.

Related questions arise in the analysis of stochastic processes; in particular, in the analysis of stochastic processes that are models of automatic control systems. The processes with which we will deal are right continuous strong Markov processes. Not all Markov processes are strongly Markovian (see counterexample in Loève [1], p. 578); however, to the author's knowledge, all Markov processes which have been studied as models of physical problems have the strong Markov property.

Some typical problems subsumed under the title of finite time stochastic stability are mentioned in the introduction to Chapter III. Stochastic analogies to the listed deterministic problems can easily be drawn. Let $x_t$ be a right continuous strong Markov process, whose initial value, a nonrandom constant $x_0$, lies in a nonempty open set $R$ containing the origin. $P$ is a set which contains $R$.

(Pl)’ Is there an R so that P,{x ,$ P, some t < co} 5 p < 1 for some given p , P and any x in R?

(P2)’ In reference to (Pl)’, is there some R corresponding to each given P and p < l?

(P3)’ For fixed P and p in (Pl)’, estimate the largest set R . (P4)’ What is the value of minxosR P x o { x r ~ P, all t < co}, when P is a

given bounded set? (P5)‘ What is the smallest set to which x, tends with probabilityone,

as t+w?

(P6)’ Estimate the probability P,{x ,$ P, some t 5 T } , XE R . (P7)’ Is there a finite-valued Markov time T such that x t - + S , as

t 4 z?


The obvious analogs of (P8) are of no probabilistic interest. Two related problems of interest are:

(P8)’ Let i = f ( x ) + p , where p, is a Markov process. Is the pair ( x , , ~ , ) a positive recurrent process? Is E Ilxrl12 uniformly bounded if E IIprl12 is?

(P6)‘ is a type of finite time stability or first exit time result. (See Chapter 111.) For a generic source of such problems in control theory, consider the system i = f ( x ) + p , . Suppose first that p , = 0. Under this condition, suppose that if x = xo is in R , then x, is in R, for all t < co, and that if x is in E - R , then x , will be in E - R , all t < co, R is the “desired” region of operation. Let the forcing term p , be a stochastic process. Then p , may eventually drive x, into E - R with probability one; however, if the first time of entrance into E - R is “large,” the fact that x, + E - R may be unimportant for practical purposes. In any case, estimates of the probabilities of entrance times are of interest.

The methods of this chapter are applicable to the determination ((P7)‘) of whether or not x, tends to a given set S as t+co, or inter- sects S at some finite-valued Markov time. The results are pertinent to the problem of choosing a control which will transfer the initial condition x to a given target set S with probability one (or, perhaps, with at least probability p, where p is given).

Uniform boundedness of the sample paths ((P4)‘) is also of some interest. Consider the case where a dynamical system is removed from a stable equilibrium state by a random disturbance whose “intensity” or “magnitude” eventually decreases to zero as t -, 03 ; for example, an earthquake, or an echoing series of shocks. Suppose that one of the states of the dynamical system is the stress associated with a particular member. Then the probability that this stress is uniformly bounded by some given number (perhaps the average breaking stress minus a safety factor) is of interest. This problem is distinguished from the finite time stability problem in that, here, the shocks decrease with time, and the state may indeed be uniformly bounded (for t < co) in a prescribed way with a nonzero probability.

Page 45: Stochastic Stability and Control Harold Joseph Kushner

30 11 / STOCHASTIC STABILITY

Problems on an infinite time interval, where the noise intensity is proportional to a function of x , but does not depend on time, also occur. Consider the scalar problem

i = f (x) + u (x) + u (x) t ,

where 5, is the solution of an It8 equation

d( = - a< dr + u dz (a > 0).

The “control” ~ ( x ) is to be selected to ensure that x, -0 with proba- bility one. Since the noise effects are proportional to ~ ( x ) , it is not immediately obvious that there is a ~ ( x ) which will accomplish the desired task (unless xf(x) < 0, in which case u ( x ) = 0 will do). Suppose that, if t, =O,thenasuitableu(x)hastheproperty: lu(x)l +Oas 1x1 -0. Then, with the same u ( x ) and the true t t , it may be expected that, at least for small 0, x, -, 0 with probability one. Some such situations are discussed in the examples. For any particular u(x) , an estimate of P x { ~ ~ p m , r z O (x,I 2 E ] , as well as the asymptotic behavior of x,, is of interest.

DEFINITIONS

For further use, and to fix ideas on the properties with which the chapter will deal, we list some definitions and some properties which will be discussed in the sequel. The reader should be cautioned not to take the assigned terminology too seriously. The terminology is largely motivated by the terminology used for related deterministic properties. There is some discrepancy in usage among authors in the field and, in any case, the most appropriate usage remains to be determined, as the mathematical results and practical applications develop further and become codified. There are stochastic Liapunov theorems correspond- ing to all the properties which are defined by (Dl) to (D9).

(Dl) The origin is stable with probability one if and only if, for any

Page 46: Stochastic Stability and Control Harold Joseph Kushner

1. INTRODUCTION 31

p > 0 and E > 0 there is a 6 ( p , E ) > 0 such that, if llxoII c 6 ( p , E ) ,

then, with xo = x,

p, { sup IIx,II 2 E ) s P . m > f z O

The “with probability one” clause is motivated by the arbitrariness of p. An alternative definition which is less ambiguous is:

(D2) The system is stable with respect to the triple (Q, P, p ) if and only if x in Q implies that

P x ( x t ~ P , all t <a} 5 p .

More important than mere knowledge of the satisfaction of the conditions of the definitions are the upper bounds of

P x { SUP v ( x t > L E } , m > r l O

which are provided by the theorems. V ( x ) is a stochastic Liapunov function.

(D3) The origin is asymptotically stable with probability one, if and only if it is stable with probability one and x, + 0 with probabili- ty one, for all xo in some neighborhood of the origin R . I f R is the whole space, we add “in the large.” For a given initial condition x, x, is un$ormly bounded by E

with probability p if and only if (D4)

P, { sup IIXfII >= E } s 1 - p * W > t t O

The stability properties of some components of the Markov process may not be of interest. For example, let y t be a Markov process and let k = f ( ~ ’ , y ) . Then, under suitable conditions on f (w, y ) , ( w i t , y , ) is a Markov process, but it is possible that only M’, is of interest in some problems. Then definition (D2) is still applicabie, and the probability bounds still yield useful information. (Dl) is irrelevent, but may be modified as : Let ( wr , y,) be a Markov process. Let wv take values in the

Page 47: Stochastic Stability and Control Harold Joseph Kushner

32 I1 / STOCHASTIC STABILITY

Euclidean space E,. The origin of E, is said to be stable with proba- bility one if and only if, for each y o = y , p > 0, and E > 0, there is a 6 ( p , E , y ) > 0 so that, if 11 wo 11 < 6 ( p , E , y ) , we have, with wo = w,

P,,,{ S U P I IWfI I 2 8 ) 5 P . m > i > O

In fact, the case where some components of the Markov process are of no interest is probably the general case. Some of the states of a Markov model of a control process will be the observed states, or the quantities of interest in the control process. However, in order to “Markovianize” a control process, many “states” may have to be added.

(D5) The origin is exponentially stable with probability one if and only if it is stable with probability one and, for all T < co,

P, { sup IIx,I( 2 E } 5 K e-‘*, m > f t T

where K < co and ci > 0. The origin is unstable with probability p if and only if

limlimP,{ sup I l x , l l Z ~ } = p .

(D6)

& + O X - r O W > f > O

(D7) The process is ultimately bounded with probability one and with bound m if and only if, for each x in E, there is a finite-valued (with probability one) random time T ( X ) such that

PX{ S U P IIXfII 5 m } = 1 9 m > f > T(X)

or, alternatively,

lim P x { sup IIx,(I > m} = 0. T - r m m > i > T

If the time ~ ( x ) does not depend on x (and is finite with proba- bility one), then the bound is said to be uniform. Let xi and it be processes corresponding to the initial con- ditions x and 2, respectively. The process is said to be equi-

(D8)

Page 48: Stochastic Stability and Control Harold Joseph Kushner

1. INTRODUCTION 33

distance* bounded with probability one if and only if for fixed

IIX -2117 Px,, { sup 11x1 - frl l 2 E } + 0

o o > t ~ O

uniformly in x and 2, as E + 00.

(D9) The process is said to be equistable ~ i f h probability one if and only if it satisfies (D8) and, for any fixed E > 0,

K*;{ sup 11x1 - 2111 L &} -0 m>t>O

as j/x -211 + O , uniformly in x and 2.

The properties (Dl) and (D9) depend on the behavior of the sample paths over a time interval of positive length (as opposed to moments, for example, which depend only on the distribution of x, at fixed instants of I, and only indirectly on the path characteristics). Most of the sequel is devoted to this type of property. An exception is Theorem 6 that allows the estimation oflim,,, E,KV(x,) (if such exists and is independent of x) and of l imjr E x J V ( x , ) d t /T for suitable functions

Finite time or first exit time results appear in Chapter 111.

THE IDEA OF A LIAPUNOV FUNCTION

The Liapunov function approach to the study of the stability of deterministic systems can be briefly introduced as follows. Suppose that the nonnegative continuous function V ( x ) , satisfying V(0) = 0, V ( x ) > 0, x # 0, has continuous first partial derivatives in the bounded set Q,, = (x: Y ( x ) < m } , m < co. Let i = f ( x ) have a unique solution in Q,,. Now the sets Q,, each containing the origin, decrease mono- tonically to the origin, as r --f 0. Suppose that r'(x,) = C': ( .Y,)~(X,) = - k ( x , ) 5 Oin Q,, where k ( x ) is continuous. The fact that r'(.~,) 5 0 in Q, together with the definition of Q, implies that, if V ( x o ) < m, then

* Following the deterministic definition of Yoshizawa [ I ] .

Page 49: Stochastic Stability and Control Harold Joseph Kushner

34 11 / STOCHASTIC STABILITY

V(x , ) < m ; hence, x, is in Q, for all t -= 00, provided only that xo is in Q,. This is a stability result. The function V ( x ) is termed a Liapunov function.

We may proceed further. Write

I

V ( x o ) - V ( x , ) = k ( x , ) ds = - 1 r‘(xs) ds. (1-1) i 0 0

Since k ( x ) is nonnegative and continuous in the bounded set Q,, and x, is in Q,, t < 00, and since the left side of (1-1) is bounded, it is obvious that x, + {x: k ( x ) = 0} as t 4 0 0 . Also, it is clear that V(x , ) decreases monotonically to some nonnegative constant v. If, for ex- ample, k ( x ) = 0 in Q , implies x = 0, then x, + 0 as t -+ 00.

These results are among many which may be obtained by the use of Liapunov functions. (See the references previously cited.) The “Liapunov function method” is a method of obtaining information on a family of solutions of i = f ( x ) without actually computing the numerical values of each solution. In lieu of the problem of numerical solution, there is the problem of obtaining useful Liapunov functions.

THE STOCHASTIC LIAPUNOV FUNCTION APPROACH

In studying the qualitative properties of a random process, we study the qualitative properties of a family of functions (the sample paths) simultaneously. If r‘(x,) 5 0 for each sample path, then the deterministic theorems may be used, and the problem may not have much probabilistic interest. A basic property of deterministic Liapunov functions, obtained in the previous subsection, is V(x , ) v 2 0. While we do not ordinarily expect r’(x,)SO in the stochastic case, it is reasonable to expect that some properties of a stability nature could be inferred from the form of A V ( x ) , the natural stochastic analog of the deterministic derivative. Further, since we are interested in making inferences concerning the asymptotic value of V(xJ from local proper- ties, the possible applicability of the martingale theorems is suggested.

Page 50: Stochastic Stability and Control Harold Joseph Kushner

1. INTRODUCTION 35

If V ( x ) 2 0 and, for any A > 0,

with probability one, then the supermartingale theorems apply and one may infer V(x, ) + v 2 0 with probability one. Just as the form of the functions k(x) was helpful in obtaining information on the asymp- totic values of x, in the deterministic case, we expect that the form of (1-2) would yield similar information in the stochastic case. In fact, Dynkin’s formula provides the needed connection between V ( x ) , A V ( x ) , and the martingale theorems, and plays the same role in the stochastic proofs as (1-1) plays in the deterministic proofs. Let V ( x ) 2 0 be in the domain of 2 and let T be a Markov time satisfying Exr < 03.

Then, supposing that A”V(x) = - k(x) 5 0,

V ( X ) - E X V ( x , ) = E x k(xJ ds = - Ex A V ( X , ) ds >= 0. (1-3) i 0 s 0

It is reasonable to expect that V(x, ) is a supermartingale, and also that x, + {x: k ( x ) = 0} with probability one (these facts will be proved in Section 2). A number of difficulties attend the immediate application of (1-3). First, the domain of 2 i s usually too restricted to include many functions V ( x ) of interest. Even if V ( x ) is in the domain of A, k ( x ) may be nonnegative in a subset of E only and, furthermore, to investi- gate asymptotic properties, we must let T + cc. These difficulties may be circumvented by first applying (1-3) to an appropriate stopped process derived from x,, and then taking limits.

HISTORY

The probability literature contains much material concerned with criteria under which random processes have some given qualitative property. However, the first suggestions that there may be a stochastic Liapunov method analogous to the deterministic Liapunov method seems to have appeared in the papers of Bertram and Sarachik [ l ] and Kats and Krasovskii [l]. Both papers are concerned with moments

Page 51: Stochastic Stability and Control Harold Joseph Kushner

36 I1 / STOCHASTIC STABILITY

only. Bucy [l] recognized that stochastic Liapunov functions should have the supermartingale property and proved a theorem on “with probability one” convergence for discrete parameter processes. Bucy’s work is probably the first to treat a nonlinear stochastic stability problem in any generality. Some results, of the Liapunov form, were given by Khas’minskii [3] for strong diffusion processes. The martin- gale theorems have had, of course, wide application in providing probability bounds on stochastic processes, although their applicability to the problems of the references had not been exploited very much prior to these works. Other works are those of the author included here and in Kushner [3-61, Wonham [2, 31 (for strong diffusion pro- cesses), Khas’minskii [2], Kats [l], and Rabotnikov [l]. There is cer- tainly much other material on “stochastic stability,” concerned with more direct methods. See Kozin [l-31 and Caughy and Gray [l].

2. Theorems. Continuous Parameter

ASSUMPTIONS

The following assumptions are collected here for use in the follow-

For some fixed m, ing theorems.

(Al) V(x) is nonnegative and is continuous in the open set Q, = {x: V(x) < m } .

(A2) x, is a right continuous strong Markov process defined until at least some 7’ > 7, = inf(t: x,$Q,f with probability one (or, for all t<co, if x,EQ, all t<co). If x, (w) is in Q,, for all t < co, define 7,(w) = 00.

(A3) Write 2, for xQm, the weak infinitesimal operator of xtnrm. (A4) V(x) is in the domain of 2,. (AS) Px{supf2s20 ~ ~ x s - x \ ~ >E}-+Oast+OforxEQ,and anyE>O.

Denote the w set { w : x,EQ,, all t < co} by B,.

Page 52: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS. CONTINUOUS PARAMETER 37

Lemma 1 establishes a result which is to be used repeatedly. It also illustrates the basic use to which Dynkin’s formula will be put in the sequel.

Lemma 1. Assume (Al) to (A4). Let A”,V(x) 5 O.* Then V(xrnrm) is a nonnegative supermartingale of the stopped process xrnr,,, and, for 2 5 m, and initial condition xo = x in Q,,

(2-1) V (XI

V (Xrnr,,) 2 I - } 5 __ P x { SUP m > r b o 2

Also there is a random variable c(o), 0 5 c(w) < m, such that, with probability one relative to B,, V ( x , ) + c ( w ) , as t+w. P,{B,}2 1 - V(x) /m.

Proof. By Dynkin’s formula

r n r m

E,V ( x r n r m ) - V (x) = E , A”,,,V (xJ ds 5 0. (2-2) J 0

Thus E,V(.xrnr,,,) 5 V ( x ) . Also, since V ( x ) is in the domain of A”,,,, E , V ( X ~ ~ , , ~ ) + V ( x ) , as t -+ 0. These two facts imply the supermartingale property (Dynkin [2], Theorem 12.6). Equation (2-1) is the super- martingale probability inequality, and can also be derived from (2-2). The existence of a c (w) follows from the supermartingale convergence theorem and the last statement of the hypothesis follows from (2-1).

Remark. Suppose the state is (x, t ) . Define Q , as in (Al). If ~ , , , V ( X , t ) 5 0 in Q,(J,,, is the weak infinitesimal operator of the process (xrnr,, t n7,)), then the conclusions of Lemma 1 hold.

* For greater clarity, we could write the redundant statement A,!, V ( x ) 6 0 in

+ An obvious set of increasing a-fields can be adjoined to V (xrnrm) to complete Qt3,. Arrl V ( x ) is, of course, defined only in Q, .

the description of the supermartingale.

Page 53: Stochastic Stability and Control Harold Joseph Kushner

38 I1 / STOCHASTIC STABILITY

Equation (2-1) is written as

(2-1') V ( x , T )

V ( X r 5 t ) 2 I> S ~

m > r z T I ' px, , { SUP

where (x, 7') is in Q, and A 5 m. Theorem 1 is the basic result on stochastic stability, and is the

stochastic analog of Liapunov's theorem on stability. The probability estimate (or (D2)) is more important than the mere fact that (DI) is satisfied.

Theorem 1. (Stability) Assume (Al) to (A4) for some m > 0. Let V ( 0 ) = 0, XEQ,. Then the system is stable relative to (Q,, Q,, 1 - r / m ) , for any r = V(xo) = V ( x ) s m (see (D2) in Section 1). Also, for almost all w in B,, V ( x r n Z m ) --t c ( w ) s m. If V ( x ) =- 0 for x # Oand XEQ,, then the origin is stable with probability one.

Proof. The proof is a direct consequence of Lemma 1, the defi- nitions of stability, and the properties of the function V(x) .

Lemma 2 establishes an upper bound on the average time that the process x, spends in a set. It will later be used to prove, among other things, that certain sets are reached in finite average time.

Lemma 2. Let V ( x ) be nonnegative in an open region Q (Q is not necessarily bounded), x, a right continuous strong Markov process 'in Q, and T the random first exit time from the open set P c Q. Let V ( x ) be in the domain of lQ. Let lQ V ( x ) 2 - b < 0 in P . Then the average time, E,T, spent interior to P up to the first exit time is no greater than

V(x)lb.

Proof. By Dynkin's formula r n r f n r

V ( x ) - E , V ( x r n T ) = - E , / l Q V ( x , ) ds 2 bE, I,(s, w ) bs 0

(2-3)

0 s = b E x ( t n T ) ,

where I , (s ,w) is the indicator function of the (s ,w) set where

Page 54: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS. CONTINUOUS PARAMETER 39

&V(x) 5 - b (it depends on xo = x). The nonnegativity of V(x) and (2-3) clearly imply the theorem.

DEFINITION

An &-neighborhood of a set M relative to an open set Q is an open set N , ( M ) c Q, such that

N , ( M ) = ( X E Q such that j/x - y/l < E for some EM}.

N , ( M ) = N , ( M ) + dN,(M). Note that it is not necessary that N , ( M ) contain M (although N , ( M ) will contain M in Theorem 2).

Theorems 2 and 3 are stochastic analogs of some Liapunov theorems on asymptotic stability.

Theorem 2. (Asymptotic Stability) Assume (Al) to (A5) and also A”,V(x) = - k ( x ) 5 0. Let Q, be bounded and define P , = Q, n {x: k ( x ) = O}. For each d less than some do > 0, let there exist an &d > 0 such that, for the &,-neighborhood N,,(P,) of P , relative to Q,, we have k(x) 2 d > 0 on Q, - NEd(P,,,). (A condition implying this is uniformcontinuity ofk(x)onP,andk(x) > Oforsornex~Q,.) Then

xt + prn

with a probability no less than 1 - V(x)/m. Suppose that the hypotheses hold for all m > 0. Define P = Uy P,.

and let N,(P) be the &-neighborhood of P relative to Q = Up Q,. If, for each d, 0 < d do, there is an &d > 0 such that k ( x ) 2 d on Q - N,, (P) , then x, -+ P with probability one.

Proof. Note that the statement of the last paragraph follows from the statement of the first paragraph. Note also that if k ( x ) = O in Q,, the theorem is trivially true as a consequence of Theorem 1. By Lemma 1, x, remains strictly interior to Q, with a probability 2 [ 1 - V(x)/m].

Fix dl and d2 such that do > dl > d , > 0. Let t i correspond to di

Page 55: Stochastic Stability and Control Harold Joseph Kushner

40 I1 / STOCHASTIC STABILITY

(that is, in Q,, - NCl(Pm), we have k ( x ) 2 di). It is no loss of generality to suppose that N,,(P,) is properly contained in N,,(Pm). Define Zx(s, W , E ~ ) as the indicator of the (s, W ) set where x,(w) is in Q , - N,, (P , ) and write

r,.

T,(t, E ~ ) = I , ( s , W , E ~ ) ds . I tnr,

T'(t, ci) is the total time spent in Q, - NZi(P, ) after time t and before either t = co or the first exit time from Q, (with xo = x). T,(t, ci) = 0 if T , 2 t . By Theorem 1 ,

v (XI P, {x, leaves Q , at least once before t =a} = 1 - P, {B , } 2 -

m

By Lemma 2 we conclude that T,(t, ti) < co with probability one and, hence, T,(t, c i ) + 0 as t + 00.

Now we distinguish two possibilities: (a) there is a random variable T ( E , ) <co with probability one such that x,EN,,(P,) with probability one (relative to B,) for all t > T ( E , ) ; or (b) for WEB,, x, moves from Ne2(P,) to Q,, - N E I ( P , ) and back to N,,(P,) infinitely often in any interval [ t , co). ((a) and(b) are all possibilities, since the path WEB,) can stay exterior to N E 2 ( P , ) for only a finite total time, since T,(t, c2)-+0 as t + co.) Consider case (b). Since T,(f , E J + O with probability one as t + co there are infinitely many movements from NE2(P,,) to Q , - NEI(P,) and back to N,,(P,) in a total time which (in the integral sense) is arbitrarily small. We now show that the proba- bility of case (b) is zero.

Fix 6, > 0. Choose h > 0 so that

sup P,{ sup IIx, - XI1 2 E l - E 2 } < 6 , . (2-4) XEQ", - N & z ( P , , , ) h 2 s b 0

By stochastic continuity and compactness of Q , + dQ,, for each 6, > 0 there is some h > 0 for which (2-4) holds. For each 6, > 0, and the fixed initial condition x, there is a t < co so that

P,{T,(t , E 2 ) > h } < 6 2 . (2-5)

Under case (b), (2-4) and (2-5) are contradictory. Thus, we conclude

Page 56: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS. CONTINUOUS PARAMETER 41

that P x { x s ~ Q r n - N E , ( P , ) for some s E ( f , a ) } + O

as t -+a, for XEQ,. Since and E~ are arbitrary, the theorem is proved.

REMARK ON TIME DEPENDENCE

Suppose that either the process x, is not homogeneous or that the Liapunov function V ( x , r ) depends on time, so that as a consequence either k or V are time dependent. Then Theorem 2 may still be used as stated, although it is the custom in the deterministic theory to re- word the hypothesis so that certain trivial possibilities are eliminated. For example, suppose that k(x , t ) = g ( x ) / ( l + r ) , where g ( x ) > 0 for x # 0. Then, considering t as a state, all that we may conclude (from Theorem 3, since Q, is not compact in this case) is that either t + 00

or g(x) + 0 with some probability. We may reword the hypothesis in the following way. Let V ( x , t ) be

nonnegative, continuous, and in the domain of 2, (corresponding to the process (xfnr,,, rnr ,>)wi th~ ,V(x , t ) = - k ( x , t )sOinQ,.(Note that Q, = {x, t : V ( x , t ) < m } . ) Suppose that k ( x , t ) k , (x) in Q,, where k , (x) satisfies the conditions on k ( x ) in Theorem 2. Let xT = x be the initial condition. Then x, -+ {x: k , (x) = 0 } n {x: V ( x , t ) < m for some t 2 T } with a probability no less than 1 - V ( x , T ) / m .

Many special results, analogous to deterministic results, may be developed for time-dependent processes or functions, but we will not pursue the matter here.

Theorem 3 will be useful in examples where the “stability” of only a few components of x is of interest. (For example, where some com- ponents are of the nature of “parameters.”) In such cases the sets Q, are not always bounded. Define the event DE, 3 { w : W E B , , and j: Ix(s, w, ci) ds <a}. DE, includes those o such that x,(w)~Q,, all t <a, and x,(w)EQ,, - N&,(P , ) for only a finite total time.

Theorem 3. (Asymptotic Stability) Assume the conditions of Theorem 2, except that boundedness of Q, and P , is not required.

Page 57: Stochastic Stability and Control Harold Joseph Kushner

42 I1 / STOCHASTIC STABILITY

Let, for any E~ > 0, P , { jlx, 11 -+ co, x , E Q, , DEi} = 0. Then the conclusions of Theorem 2 hold. They also hold if (A5) holds uniformly in x in Q , - N,(P,), for each E > 0.

Proof. If Q , and P , are not bounded, then the proof of Theorem 2 implies that either: (c) there is a random variable zl(o) < 00 with probability onesuch that x , E N , , ( P , ) , all f > T , ( w ) , for W E B , ; or (d) for any compact set* A c Q,, there is a random variable r ( A ) < co with probability one such that x, is not A for t > T ( A ) and W E B , . The event DEi, ci > 0, occurs in any case. (Equations (2-4) and (2-5) are valid when Q , is replaced by compact A . ) Under condition (d), there is a monotone increasing sequence of sets A , , A , t Q,, and random variables ?(A, ) < co with probability one for each n, which satisfy (d). Then, under condition (d), IIx,II -+a, and x ,EQ, for all t . Since P, { IIx, II -+ 00, x, E Q,, DEi} = 0, case (c) must hold. Since is arbitrary, x, -+ P , with probability one, relative to B,. The last statement follows from the proof of Theorem 2, and the observation that, under the uniformity hypothesis, there can be no escape to infinity in finite (with probability one) time, unless the tails of the trajectories are entirel- contained in NE(Pm).

The following corollary will be useful in the chapters on optimal cony trol and design of controls. The set S will correspond to a “target” set.

Corollary 3-1. Assume (Al) to (A5) with the following ex- ception: V ( x ) is not necessarily nonnegative. Let S = { x : V ( x ) 0}, and define Qk = { x : m > V ( x ) > O}. Define T, as inf { t : x ,$QL} . Let 2, be the weak infinitesimal operator corresponding to the stopped process xrnrm. Let V ( x ) be in the domain of A”, with X,V(x) = - k ( x ) 5 0 in QL.

Suppose that k ( x ) is uniformly continuous** in the set PL =

* The words “any compact set A” could be replaced by the words “any set A for which (A5) holds uniformly, bounded or not.”

** More generally let N , (Pm) be an &-neighborhood of P‘,,, relative to Q‘,,,. Let k ( x ) satisfy, on Q‘,,, - N,, (Fm) the conditions on Q‘,,, - N,, (P,) of Theorem 2. Then Corollary 3-1 remains true.

Page 58: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS. CONTINUOUS PARAMETER 43

{ x : k ( x ) = 0 } n {Q,, +as} and k ( x ) > 0 at some x in QL. Then there is a random variable 7 5 co such that x,-+ ( x : k ( x ) = 0) u {Q,, + S } as ?+T, with a probability no less than 1 - V(x) /m. Also

Proof. The proof follows closely the proofs of Theorem 2 and Lemma 1 and will be omitted.

This situation envisioned in Corollary 3-1 is depicted in Figure 1. The behavior of the process after 7 is not of interest. Note that if m is arbitrary in Corollary 3-1, and k ( x ) > 0 in each Q6, then there is a random variable 7’ co such that x, + S with probability one as t + 7’.

Figure 1

The result for It6 and Poisson processes are collected in Corollaries 3-2 and 3-3. The proofs involve merely the substitution ofthe properties of these processes into the hypotheses of Theorem 2 or 3.

Corollary 3-2. Let x, be an It8 process, the coefficients of whose differential generator satisfy the boundedness and Lipschitz conditions (of Chapter I, Section 4) in Q, . Let V ( x ) be a continuous nonnegative function which is bounded and has bounded continuous first and second derivatives in Q,. (If Sij (x) E 0 in Q,, then 8’ V(x)/dxi axj need not be continuous.) Then V ( x ) is in the domain of 2,. With

Page 59: Stochastic Stability and Control Harold Joseph Kushner

44 I1 / STOCHASTIC STABILITY

the homogeneous case notation,

d V ( X ) 1 a2 v ( x ) A”,V(x) = m q x ) = X&(x)-- axi + 2 i , j ~ CSi j (X) - - axi a x j = - k ( x ) .

i

s ( x ) = { S i i ( X ) } = a ‘ ( x ) . ( x ) .

Let k ( x ) 2 0 in Q,, and if Q, is unbounded, let P,{ IIx,II 3 co, X , E Q,, D,} =0, for each E > 0. Then x,+ {x: k ( x ) = 0} n Q, with at least the probability 1 - V(x) /m.

REMARK ON STRONG DIFFUSION PROCESSES

Define G, = {x: / lx/\ r > and R > r > 0. Suppose that x, is a strong diffusion process in G R - C,, for some fixed R > 0 and each r > 0 (that is, for each r > 0, there is an m, > 0 so that t’S(x) < 2 mrl\tl\z, for any vector 5). Note that the process is not necessarily a strong diffusion process in the set G R , since m, may tend to zero as r -+ 0. In fact, if the origin is stable with probability one ((2-6)) this must be the case, si.nce, otherwise, the escape time from GR is finite with proba- bility one, for any initial value x, including x = 0. Let

P x { sup )Ix,II g R } + O , as x + O . (2-6) r n > r z O

Then there is a nonnegative continuous function V ( x ) satisfying V(0) = 0, V ( x ) = 1, x E a C , , V ( x ) > 0, x # 0, and X E G , , and V ( x ) is in the domain of the weak infinitesimal operator of each process xtorr , where z, is the first exit time from GR - G, - aGR = D,. Also, z D , V ( x ) = L V ( x ) = 0 in D, for each r > 0 and

px { SUP llxrll L R } I Px { SUP v ( x r ) L I} I J‘(x). (2-7) m>r>=O U2>t20

In other words, the existence of a stochastic Liapunov function is also a necessary condition for stability of the origin with probability one (2-6). The sufficiency follows from Theorem 5 (we say nothing there about the properties of V ( x ) near x = 0). Also x, 0 with a probability

Page 60: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS. CONTINUOUS PARAMETER 45

at least 1 - Px{~~p , , , zO llxfll 2 R } . Thus, stability of the origin with probability one implies asymptotic stability with probability one, in this case. The result is due to Khas’minskii [3], and a proof follows.

Define the function q ( x ) on the boundary of GR - G,. Let q ( x ) = 1 on JGR and q ( x ) = 0 on dG,. Then, since x, is a strong diffusion pro- cess in D,, the equation (see Chapter I, Section 6) LV,(x) = 0 with boundary condition V,(x) = q(x) , for xEdD, has a unique continuous solution which is given by

V, (x) = E,q (xr,) = P, {x, reaches JG, before x, reaches JG,) ,

where T, is the first exit time from D,. Also E,T, < 03, X E D , , and

Let { r i } be a sequence of values of r tending to zero. It is easy to show, using the strong maximum principle for elliptic operators, that V,,(x) is a nondecreasing sequence in each GR - G, (for r, 5 r ) . Define V ( x ) :

0 s Vr(X) 5 1.

V (x) = lim V , (x) = lim P, {x, reaches dG, before JG,} r + O r + O

= P, { sup IIx,II 2 R } . (2-8)

The last term on the right of (2-8) should be PX{sup,,, - 2o llxfll 2 R}, where T = limr+o T,. If T < co with a probability which is greater than zero, then either IIx,II = R for some t < 03, or llx,ll 4 0 , t - 5 . In the latter case sup,>,zo IIx,II < R with probability one (relative to (Ilxtll + 0 as t + T}). This may be proved by use,of the hypothesis (2-6). Thus equation (2-8) is valid.

By hypothesis the right side of (2-8) goes to zero as x+O; thus Y(0) = 0. Also, by the method of constructing the Ifr(.). 0 5 V ( X ) 5 1 and V ( x ) = 1 on dGR. V ( x ) is the limit of a nondecreasing uniformly bounded sequence of harmonic functions; hence, L V ( x ) = LV,(x) = 0 in each D,, r > 0.

Now an argument similar to that used in Theorem 5 may be applied to complete the proof. V ( x ) is in the domain of the weak infinitesimal operator of the process in each D,, and the operator, acting on V ( x ) in this region, is L. In Theorem 5, we require A”,,, V ( X ) - 6, in

r n > f Z O

Page 61: Stochastic Stability and Control Harold Joseph Kushner

46 I1 / STOCHASTIC STABILITY

QR - Q, (or, equivalently, LV(x) - 6, < 0 for each r > 0 in Dr), in order to assure that the first exit time from the set QR - QE would be finite with probability one. The latter property holds here for each D,. Thus, pursuing the argument of Theorem 5 , we conclude that xt+O with at least the probability 1 - Px{sup, IIxtII 2 R } . Equation (2-7) is obvious from (2-8).

Corollary 3-3. Let the nonnegative function V ( x ) have bounded and continuous first partial derivatives in Q , and let

d x = f ( x ) dt + C(X) dz

be the Poisson equation of Section 5, Chapter I. Let f ( x ) and ~ ( x ) satisfy a uniform Lipschitz condition and be bounded in Q,. Let aid + o(d) be the probability that zit has a jump in [ t , t + d) and Pi(dy) the density of the jump in zit . Let each Pi(&) have compact support. Suppose that V ( x + ~ ( x ) yi) is bounded* and continuous for x in (9, and for each y' for which Pi(dy) is not zero. Then V ( x ) is in the domain of A', and

A",V(x) = ~ V ( X ) = C L ( x ) ~x , (x ) i

+ C = - k ( x ) .

[ v (X + 0 (x) yi) - v (.)I aipi ( d y ) i s

Suppose that k ( x ) satisfies the conditions of either Theorem 2 or 3.

xt + {x: k ( x ) = 0 } n Q,

(Note that k ( x ) is continuous in Q,.) Then

with probability at least 1 - V(x) /m.

Proof. The theorem follows from Theorem 3 and Section 5 ,

If A,V(x) = - k ( x ) 5 0 in (3, and k ( x ) is proportional to V ( x ) in Chapter I.

Q,, then a type of exponential convergence occurs. In particular:

* yi is a vector with y in the ith component and zero elsewhere.

Page 62: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS. CONTINUOUS PARAMETER 47

Theorem 4. (Exponential Asymptotic Stability) Assume (Al) to (A5), V(0) = 0, &V(x) 5 - o!V(x) in Q , for some o! > 0. Then, with xo = x,

v (x) + v (x) e-aT Px{ sup V ( X J 2 A} 5 ~

oo>rbT ni I

If the hypothesis holds for arbitrary rn, then

Proof. Let T,,, = inf { t : x,#Q,,,}. Define the process Zt

2, = X, ( t < T,), Zr = 0 ( t 2 Tm).

Let Ex correspond to the process Zr. The process Zr, t < oc), is a right continuous strong Markov process (although 7, is not necessarily a Markov time of the process &). By Dynkin's formula,

7,,,nr

EXV(2,) - V ( X ) E , V ( X ~ , ~ , ) - V ( X ) = E x . X , , , V ( X , ) ~ ~ s 0 r,nt r

- I - o!EJ V(X,) ds = - .Ex/ V(XJ ds

= - . p y ( P , ) ds ,

0 0 r

0

which implies, by the Gronwall-Bellman lemma, that

EXv(zT) 5 ~ ( x ) e-'=. Now, the facts

gxV(zr) s V(X)

E x ~ ( ~ J -, ~ ( x ) , as t -, 0

imply that V(2,) is a nonnegative supermartingale (Dynkin [2],

Page 63: Stochastic Stability and Control Harold Joseph Kushner

48 I1 / STOCHASTIC STABILITY

Theorem 12.6). The supermartingale inequality yields

which, together with

V (x) P,{ SUP IV(Tr) - V(x, ) l > 0 } = P x { SUP V ( x r ) L m } S -

Ca>tzo m > r 2 0 m

implies the theorem.

An alternative proof can be based on the observation that J,ezr V ( x ) S 0 in {x: V ( x ) < m } = Q,. Then earn', V(xlnl,) is a non- negative supermartingale. Thus

P,{ sup V ( x J exp at 2 m } = P , ( V ( x , ) >= m exp - a t , some t <a} m>rZO

with A < m and T > 0; we have

Since P,{t, < T } S V ( x ) / m and ExV(xTnr , ) exp c( T n r , S V ( x ) , Theorem 4 is again proved.

It is not usually necessary that V ( x ) be in the domain A", in all of Q,. In particular, if we are concerned only with the proof that x, -+ 0 as t + co, then the properties of V ( x ) at x = 0 may not be important. It may only be required that V ( x ) be in the domain of the weak infinite- simal operator in the complement (with respect to Q,) of each neigh- borhood of { x = O } . Theorem 5, giving just such an extension of Theorem 3, will be useful to improve the result in one of the examples where the process is an It6 process and the Liapunov function has a

Page 64: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS. CONTINUOUS PARAMETER 49

cusp at the origin. The theorem may be extended to include non- compact Q,.

Note that stochastic continuity is not explicitly used in the proof. The supermartingale inequalities (2-9) and (2-10) are sufficient to give the result in the case of Theorem 5, since { x : k ( x ) = 0} must be in- cluded in ( x : V ( x ) = O > . Define, for E < ~ , T , ( E ) = inf ( t : x , $ Q m - Q, - a Q J

Theorem 5. Assume (Al) to (A3), V ( 0 ) = 0, and that the set Q , is bounded. Denote the first exit time of x, from Q , - Q , - dQ, by T,(E). Let A”,,, denote the weak infinitesimal operator of the process xtnTm(&). Let V ( x ) be in the domain of zm,t for E > 0 arbitrarily small, and let A,,,, V ( x ) = - k ( x ) 5 0 in Q , - Q , - dQ,. For each small E > 0, let there be a 6 > 0 so that k ( x ) 2 6 in Q,- Q , - dQ, . Suppose that {x: V ( x ) = O } is an absorbing set; that is, exit is impossible.* Then x, + ( x : V ( x ) = 0 } with a probability at least 1 - V ( x ) / m , as t-+m.

Proof. define the sequence^^,&^=^^^^-^,^^=^, whereO<E<m, and E < 1. The Q,, decrease to { x : V ( x ) = 0}, and V ( x ) is in the do- main of X m x E i , for each i < 00. Let ai correspond to c i . Then, by Lem- ma 2, E,T, , (E~) < co and the T , ( E ~ ) , i < co, are all finite with probability one. Denote by T the limit T = limn T,(E,,). T is a Markov time of x , , since the nondecreasing T , ( E ~ ) are. Let x = xo # 0 and let y be any point in Q,, + dQ,,. Note that

. (2-10) V ( Y ) < - = E n En

E n - 1 E n - 1

- P, , {V(x , ) 2 before V ( x , ) 2 E , , + ~ } 2 ~ -

The probability of the set of trajectories (Q - Be) which leave Q , before entering any arbitrary Q,, E > 0, is less than V(x) /m. Then

* The assumption that { x : V ( x ) : 0) is an absorbing set is convenient if r < co

(see proof) witha nonzero probability. I t eliminates the need for introducing assump- tions which guarantee i t , as well as the attendent arguments. In any case, either V ( m + O a s r + m o r V(xJ =Oforsomes< co,withprobability 2 [ I -V(x)/rn].

Page 65: Stochastic Stability and Control Harold Joseph Kushner

50 I1 / STOCHASTIC STABILITY

x,(w)+ Q,, + ape", as t + T,(E,) with at least the probability 1 - V(x) /m. Once in Q,, + dQ,., the probability that a trajectory will leave pen-, before entering Q,.,, +dQ,,+, is bounded by E" by (2-10). Thus, for W E B , , , V(x t ) is bounded by E , , - ~ in the time interval [ ~ , , ( E , , ) , T , ( E , , + ~ ) ) with aprobabilitynoless than(1 - E , , / E , , - ~ ) = (1 - E").

Thus, x, converges uniformly to zero, as t + T , with a probability no less than

(2-1 I)

Since E is arbitrary, V(x, )+O with probability 2 [ l - V(x ) /m] , as t + 7, and the theorem is proved.

The Liapunov function method may be used to estimate the asymp- totic values of moments, providing that they exist. In any case, we have:

Theorem 6. Assume (Al) to (A4). Let V ( x ) be in the domain of x,,, for each m > 0. Suppose that V ( x ) is bounded for finite x, and that x, does not have a finite escape time (with probability one) to cc). Let

X,,,V(x) 5 - k ( x ) + c2 (2-12)

k ( x ) Z O ineach Q,, c c ) > c > O . Then

K

5 1. t

0

(2-1 3)

Strict equality in (2-12) implies strict equality in (2-13). If E,k(x,) has a limit, then it is at most c2.

Proof. By Dynkin's formula and (2-12) tnrm

(XI - (Xznr,,,) = - ~x 1 JmV (xs) ds 0

tnf,,,

2 Ex [ k (xs) - c2] ds , (2-14) s 0

Page 66: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS. CONTINUOUS PARAMETER 51

where T, is the first exit time from Q,. Since t < 00, co > c > 0, and k ( x ) 2 0, and also since, by hypothesis, ~ , + c o with probability one, we may replace the limit of the right-hand integral of (2-14) (as rn -+ co) by the integral from 0 to t (dominated convergence theorem). Also, by Fubini’s theorem, the order of integration may be changed. Thus

V ( x ) - lim EXV(xrnr,) 1 / E , k ( x , ) ds - c2r.

r

m-r w 0

Now, (2-13) follows by letting t + c o in

V ( x ) V ( x ) - limm+m E,V(xtnr,) 1 E k ( x ) ds > - x s - 1 .

C 2 t 2 - C2 t - j - t c 2 0

Remark. Generally, in applications of Theorem 6 , when V ( x ) is large, A^,V(x) is negative ( V ( x ) decereases “on the average” along trajectories of the process) and Z,V(x) is positive when V ( x ) is small ( V ( x ) then increases along trajectories of the process “on the average”). x, ‘‘oscillates’’ in some random fashion about the set Urn“= I ( ( x : x m Y ( x ) = 0 } n Qn,). Since, in applications generally A”,V(x) = x , , V ( x ) , n > rn, in Q,, the sets { x : Z ,V(x) = 0 } nQ, will generally be increasing.

Extension. Let x , V ( x ) = - k ( x ) +g(x), k ( x ) 2 0, g(x) 2 0. Then Theorem 6 may be extended to read

provided only that some condition is given which guarantees that rmnf r,nr r m n t

Ex J (- k(xJ + d x , ) ) ds = - E x 1 k(xJ ds + E x 1 g ( x J ds 0 0 0

and

Page 67: Stochastic Stability and Control Harold Joseph Kushner

52 I1 / STOCHASTIC STABILITY

While the problem of existence of stochastic Liapunov functions has not been solved,* some existence results are possible. Theorem 7 is applicable to the case where, for example, x, is an It6 process, h(x,) = IIxsll, and Ex llxJ tends to zero exponentially. See Kats and Krasovskii [l] and also Hahn [l], Theorem 24.5.

Theorem 7. Let x, be a right continuous strong Markov process. Let h(x) satisfy h(0) = 0, h ( x ) > 0 for x # 0, and be continuous at 0. Suppose that xo = x and

m

F ( x ) = E x h ( x , ) ds <a, all x in E s 0

E,h ( x s n I Q ) + h ( x ) as s --f 0

where T~ is the first exit time from an arbitrary neighborhood, Q, of the origin. Suppose that

F ( x ) + O as x + O ,

F ( x ) + a as llxll +a.

Then F ( x ) is a stochastic Liapunov function satisfying the conditions of Theorem 2, and x,+O with probability one, as s + w , for any xo.

Proof. First, we show that V ( x ) is in the domain of JQ, for each neighborhood Q of the origin. Let x be in Q . By Dynkin [2], Theorem 3.11,

m m

E X F ( x 6 , 1 Q ) = E X E X 6 n I Q 1 ( X S + d n I Q ) ds = E X ( x S > d s ' 0 d n r p

* Stochastic counterparts of the deterministic existence theorems will appear in H. J. Kushner, Converse theorems for stochastic Liapunov functions, SZAM J . Control 5, No. 2 (1967).

Page 68: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS. CONTINUOUS PARAMETER 53

Then, under the hypothesis,

--t - h (x) E J (XanrJ - F ( X I - - Ex IFTQ h (xs) ds -

6 6 Exh (4 -P h ( X I

and, hence,

A " , F ( x ) = - h ( x ) , each Q c E and X E Q .

By hypothesis, Q , = ( x : F(x) < m} is bounded and F(x) is continuous. Thus F(x) satisfies the hypothesis on V ( x ) of Theorem 2. Q.E.D.

Theorem 8. Assume (Al) to (A4), except that the killing time C may be less than infinity. Let 0 S V ( x ) -, 03 as llxll-+ co and, for each m > 0, A",V(x) =g(x), where g(x) S 0 for large IIxII. Let there be an infinite sequence of compact sets G,fE, such that, if llx,ll -00 as t -+ 5 -= 03, then (with probability one) x,(~)EG, for some sequence of random times t ( i ) f c. Then there is no finite escape time ([ = m with probability one). If A",V(x) 5 - E < 0 in Q , - G, for all large m and where G is compact, then x , always returns to G with probability one.

Remark. The purpose of the condition on the Gi and T ( i ) is to eliminate the possibility that x, either disappears or jumps directly to infinity from some finite point.* A simple example will illustrate the problem. Let x , be a constant, t <[. Let the killing time c have the distribution P { c S t +6l[ > t } = c6 + o(6). Then E,V(x =

(1 - cb + o(6)) V ( x ) and

A",v(x) = - c V ( x )

for all V(x) . But we may infer neither stability nor even boundedness of the x, process (in any time interval).

Proof. For large i each Gi is contained in some bounded Q,, and 0 in E - Q,. Let Q, 3 Q, =I Qp

* An equivalent condition is that for each bounded open set Q with first exit

contains some Qpi . For all large tl, g(x) -

time TQ, there is an EQ > 0 with probability one and C >= TQ + EQ.

Page 69: Stochastic Stability and Control Harold Joseph Kushner

54 I1 / STOCHASTIC STABILITY

with x = x o in Q, - Q, - dQ, and g(x) 5 0 in Q, - Q, - dQ,. Define T, = inf { t : x,$Qm - Q, - @,}.Then T, < 5 with probability one, for each m, and

rmnr r,nt

E,V(X~~,~,) - V ( X ) = E x XmV(xS) ds = E , g(xJ ds 5 0 . s 0 s 0

Since r,,, -= i we have mp, SUP^,^^ zs

all t < co and, hence, V(xs> 2 m> 5 Ex J'(Xr_,,) for

(2-15) V ( X I P,{ sup V(xs) >= m } 5 - .

r , b s b O m

Since (2-15) holds for arbitrary m < co, x , must always enter Q, + aQ, before going to infinity, with probability one. This implies that there is no finite escape time. The last statement of the theorem follows from the above reasoning and Lemma 2.

Many other stochastic Liapunov theorems are possible. The proof of Theorem 9 is a slight variation on the proof of Theorem 2 and will not be given.

Theorem 9. Assume (Al) to (A5) with x n , V ( x ) 5 - k(x) + qr, for each m, where k ( x ) Z O and cpt+O as t+m. Let k(x)+cr, and V(x)+co as / /x/ ]+co. Let k ( x ) be continuous on the compact set {x: k ( x ) = 0} = P. Then, for each neighborhood N ( P ) , there is a finite-valued random time T~ such that x , e N ( P ) , all I > T ~ , with probability one. Suppose that k ( x ) 5 0 in Q, 3 P only. Then x , + P with the probability that x , does not escape from Q m .

The following theorem is proved by use of the supermartingale V ( x ) +S:cp, ds; the proof is omitted.

Theorem 10. Assume ( A l ) to (A4) and let V(x)+cr , as llxll +a. Let x m V ( x ) 5 cp, where cp, 2 0 is integrable over [O,co). Then there is a finite-valued random variable v such that V ( x , ) - v with probability one. Also

Page 70: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 55

There is a simple stochastic analog of a theorem of Yoshizawa [ l ] concerning a bound on the “sensitivity” of the trajectory to the initial condition. The proof is similar to that of Theorem 1 or 2 and is omitted. The processes x,, y , have initial conditions x, y , respectively. If x, and y, are right continuous strong Markov processes, then so is (x,,y,).

Theorem 11. Let (Al ) to (A5) hold for the process (x,, y,) and function V ( x , y , t ) in the bounded open set Q = {x, y : W(llx -yII) < m>, where W(0) = 0 and W ( r ) is continuous and strictly monotone increasing for r 5 m. Let

W(lIx - Yll) I V(x, Y, t ) . Let

JQV(x, Y , 1 ) 5 - k(x, Y , t ) S - ki(Ilx - YII) 5 0. Then

K , , { S U P IIX, - Yrll 2 w - ’ ( m ) ) s P,,,{ S U P V ( X , ? Y f ? t ) 5 m> rn>t20 rn>?ZO

< V(X, Y , 0) - -

m

If k , ( r ) is continuous on P = { x , y : k, ( I lx -y l l )=O}nQ, then we have Ijx, - y,/j = r, + { r : k , ( r ) = 0 ) with a probability at least 1 - q x , y , 0 ) h .

3. Examples

First, several simple scalar It8 processes will be investigated.

Example I . Let x, be the solution of the scalar It8 equation

dx = ax d t + ox dz .

The function V ( x ) = x2 satisfies the conditions of Corollary 3-2 in each set Q, = {x: xz < m 2 } , and

(3-1)

,T,v(x) = YV((X) = x2(2a + 0’).

Page 71: Stochastic Stability and Control Harold Joseph Kushner

56 I1 / STOCHASTIC STABILITY

If 2a + u2 S 0, Corollary 3-2 yields the estimate

X 2 P,{ sup x,‘ 2 m2> 5 -

a J > t ~ O m2 .

Let 2a + u2 < 0; then xt -, 0 with at least the probability 1 - x2 /m2 . Since m is arbitrary here, x, + 0 with probability one.

Since 9 V ( x ) = - aV(x) , a = 12a + 0’1, and m is arbitrary, Theo- rem 4 yie1ds;for any A < 00,

Example 2. Let x, be the solution of (3-1). An improved result will be obtained with the use of the Liapunov function V ( x ) = IxIs, 03 > s > 0. If s < 2, the second derivative of V ( x ) does not exist at x = 0. Nevertheless, Theorem 5 is still applicable. Then, appealing to Corollary 3-2 and Theorem 5, we have in each Q,, - Q, - dQ,

- A , , , V ( X ) = ~ ’ V ( X ) = b x 2 ,

where we suppose that (s - 1) fJ2

b = a + - - - - - - 0 . 2

Hence,

Clearly the larger is s, the smaller is (3-2). In general, Liapunov functions of higher algebraic order are preferable, if they exist, since they yield better estimates of the probabilities, as in the case (3-2). If a < a2/2, there is some GO > s > 0 such that b < 0. Then x, + 0 with probability one. Thus, even if a > 0, there may still be asymptotic stability. Except for some specially constructed examples, there does not seem to be a vector analog of this result. In our case, it is verifiable

Page 72: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 51

directly by noting that the exact solution to (3-1) is (with probability one)

x , = xo exp [ (u - (izi)t + z , ] . (3-3)

If u - cr’ = - c < 0, then

x, = xo ex,[ - c + :] t

which tends to zero with probability one, as t --t co. In deterministic stability studies, if a function V ( x ) is a Liapunov

function in a region Q then so is Va(x) = V“(x) , ct > 1, in the same region. There is not necessarily a stochastic analog of this property. (If there were, then all estimates of the form (3-2) could be written with s = 00.) That V ( x ) is a stochastic Liapunov function in Q implies

W m

E x S AQV (x , ) ds = E,AQV (x,) ds < 00

0 0

or that EXJQV(x , ) decreases sufficiently rapidly as s-co. But this does not imply that ExxQVa(x , ) will decrease sufficiently rapidly to ensure that, for any a > 1,

ExAQVa(x,) d s < 0 0 . i 0 I t is these “integral moment” properties which effectively determine the probability bounds.

Remarks on the model (3-1). Astrom [l] gives a thorough analysis, together with numerical results on the sample path behavior, of the scalar equation dx = crlx dz, +a2 d z , , where zll and z2* are (possibly correlated) Wiener processes.

Let R = 0x5

Page 73: Stochastic Stability and Control Harold Joseph Kushner

58 11 / STOCHASTIC STABILITY

where 5, is integrable on each [0, TI. Then f

x , = xo exp oSg, d s .

The solution still holds (with probability one) if 5, is the solution of the It6 equation

0

d t = - dt + a dz (a > 0 ) .

As a-oo, Jb ( , d s + z , and

x , -+ xo exp oz,

in mean square. The last equation is the solution to the It6 equation

The conclusion to the discussion of this paragraph is the following. Suppose that 5, is a random function with a “large” bandwidth, which acts on the system whose output is governed by i = xo5, and which we wish to model by an It6 equation. Then the model dx = a2x/2dt + v x dz may be preferable to the model 2 = v x dz. This modelling question, which is concerned with the relation of the solution of ordinary and stochastic differential equations, is discussed by Wong and Zakai [l, 21.

Example 3, Consider the scalar problem

dx = - a x d t + o ( x ) d z (U > 0 ) .

The exponential form I/ ( x ) = exp I I x I ‘

is useful in the study of a variety of scalar situations. The constants a and I are to be selected. Here, and in Example 4, we set a = 1. Since V ( x ) does not have a second derivative at x = 0, but is in the domain of the weak infinitesimal operator of x, in all Q, - Q, - dQ,, E > 0,

Page 74: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 59

Theorem 5, rather than Theorem 2, must be applied. Suppose that

0’(x) = 1x1 0’.

(Note that the corresponding ~ ( x ) does not satisfy a Lipschitz con- dition in any Q,, owing to its behavior at x = 0. This is unimportant. We define the origin as an absorbing point, and the solution may be defined up to the time of contact with the origin by a method analogous to that of the last subsection of Section 4, Chapter I. Furthermore, the results below and the argument used in the proof of Theorem 5 show that either x, + 0 as t 3 00, or x , = 0 for some finite-valued Markov time s.)

For x # 0,

(3-4)

Setting I = 2a/a2 yields 9 V ( x ) = 0. Then, by an appeal to Theorem 5 ,

Example 4 . Take the model of Example 3 but set

a’ (x) = (I - e-lXl> 0’ .

The remarks in Example 3 regarding the properties of the process in the neighborhoods Q, hold here also. For x # 0,

3. (3-5) AT’ (I - e-lxl)

9 V ( x ) = W ( X ) 2

Set A = 2a/a2; this yields dpV(x) conclude, as in Example 3, that

0, for x # 0. By Theorem 5, we

The bound is identical to that obtained in Example 3, even though

Page 75: Stochastic Stability and Control Harold Joseph Kushner

60 I1 / STOCHASTIC STABILITY

the variance a’(.) is uniformly bounded here. The probability bounds are determined by the choice of 1, whose value in both cases depends upon the properties of ~ ’ ( x ) at x = 0 (or, alternatively, on the re- quirement that the bracketed terms of (3-4) and (3-5) be nonpositive).

Example 5 . A method of computing Liapunov functions for scalar It6 processes will be given. (See also Khas’minskii [I], where a similar method is used in the study of processes which are “dominated” in a certain sense by a one-dimensional process and Feller [2].) Let the model be

d x = f ( x ) d t + ~ ( x ) d z (3-6) f ( 0 ) = a(0) = 0 .

In order to obtain a tentative Liapunov function, we proceed formally at first. Assume that V(0) = 0, V ( x ) > 0 for x # 0. Let V ( x ) be in the domain of A”,,, in Q, - Q, - aQ, and let A”,, ,V(x) 5 0 in Q, - Q, - aQ,, where each Q, is bounded. Then, in Q, - Q, - aQ,,

or, equivalently,

(3-7)

The function defined by (3-8) satisfies the relation (3-7), provided that the right-hand integral exists:

X S

(The exponent of the exponential is written as an indefinite integral since, in many applications, the integral from 0 to s may not exist.) Since x = 0 is an absorbing point, the rest of the discussion may be specialized to x 2 0. It is readily verified that 9 V , ( x ) = 0 on Q, - Q, - aQ, for any 0 < E < m < co. From the nonformal point of view, the existence of the integrals in (3-8) must be verified.

Page 76: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 61

Some special cases where the preceeding method is applicable are of interest. Suppose that

a2 (x) = c1x2 + czlxl

f(x) = - ax

(cl > 0, cz > O ) , (u > 0).

Then, computation of the integral (3-8), and omission of the constants of integration, yields the tentative Liapunov function

2 a l c l + 1

V, (x) = (x + ;) *

The function V l ( x ) is in the domain of A",,, for each E > 0. By the method of construction, 9 V l ( x ) = 0. In order to infer asymptotic stability of the origin from Theorem 2 or 5 , we require a Liapunov function V(x) with -YV(x) = - k ( x ) 5 0, where k ( x ) = 0 implies x = 0. To achieve this, we take

V(X) = (x + :y - (;I (3-9)

where p = 2a/c1 + 1 - 6, for some arbitrarily small positive 6. Now V(0) = 0 and V ( x ) > 0, x > 0, and

Thus by Theorem 5 , Corollary 3-2, and the arbitrariness of m, x, + 0 with probability one, and for x = xo > 0,

Page 77: Stochastic Stability and Control Harold Joseph Kushner

62 I1 STOCHASTIC STABILITY

Example 6. Another application of the method of construction given in Example 5 will be considered. We suppose again that the origin is an absorbing point and that x > 0. Let f ( x ) be differentiable and less than -ax in the region 0 s x S 2 and also a>0. Let a2(x)=clxz +c,Ixl. Now, we replace the right side of (3-7) by its minorant 2ax/02 ( x ) and compute a tentative Liapunov function, V, (x), using the minorant.

Thus

V,(x> =jds[expj(-, du] 0

where

Vz (4 = Vl (X I 3

9 v z ( x ) j 0 ( x f O and x S 2 ) .

Now proceeding as in Example 5 , and using the function V(x) defined by (3-9) (which is in the domain of A”,,c for each E > 0 and m 5 V(Z)) , we may apply Theorem 5 and Corollary 3-2, which yield

and x, -+ 0 with at least the probability 1 - V(x) /V(P) .

Example 7. Consider the two-dimensional nonlinear Itd equation, where g ( x l ) satisfies a local Lipschitz condition

dx, = x2 dt

dx, = - g(xJ dt - axz dt - x2c dz

/g(s)ds+cc as t - c c

f

0

sg(s)> 0, S Z O g(0) = 0.

(3-10)

Page 78: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 63

The function X I

v (x) = x: + 2 g(s) ds J 0

is a Liapunov function for the deterministic problem where c = 0 (see LaSalle and Lefschetz [l], p. 59 ff). Each Q, is bounded and V ( x ) is in the domain of each 2,. (Since Sll(x) = Sl,(x) = 0, it is not re- quired that V x , x r ( ~ ) exist, i = 1, 2.) We have

9 v ( x ) = x;(- 2a + 2). (3-11)

Suppose that a > c2/2. Then (3-11) (by appeal to Corollary 3-2, and the arbitrariness of m) implies x,, + 0 with probability one.

It will now be proved that xlt + 0 with probability one. By Lemma 1, and the fact that, for any m > 0,

(3-12)

there is a random variable b(w) < 00 with probability one, such that V(x , ) + b(w) with probability one, as t + 00. Since xzt + 0 with proba- bility one, this implies that

X l t

G(x l , )=2 g ( s ) d s + b ( w ) < c o s 0

as t +a. Since G(0) = 0 and G(s) is strictly increasing as Is1 increases on either side of s = 0, for any w there are at the most two values, al (w) and a2(w) , such that G(a,(w)) = G(a,(w)) = b(w); al (w) >= 0, a2(w) 5 0. Thus xlf+ {a l (w) , a2(w)}. In fact, in the limit, each path xlt tends to only one of a1 (o), a2(w) , since it cannot (with probability one) jump from a1 ( w ) to az (w) and back to a1 (w) . (Since xlt + {al (o), a2(w)}, for any E > 0 there is a random variable T,(o) < cc with proba- bility one such that min [(x,, - a1(w))’ + x$, (xlt - a,(o))’ + xi,] < E

with probability one for t 2 ~ ~ ( w ) . x, cannot move from a neighbor- hood of {al(w),O} to a neighborhood of {a,(o), 0} without traversing the path between the points. By stochastic continuity, the probability

Page 79: Stochastic Stability and Control Harold Joseph Kushner

64 I1 / STOCHASTIC STABILITY

of such a jump is zero.) Thus x l t -+ a(w),la(o)l < co with probability one, where U(W) is either a l ( w ) or a2(o) .

Now, for any 1 > p > 0, there is a compact set M so that x,, t -= co, is confined to M with probability at least 1 - p (by (3-12)). Thus, by the local Lipschitz condition on g(s), for each p > 0 there is a real- valued Kp < 00, so that, uniformly in t,

Ig(x1J - g(.(o))l 5 K p l X l t - a(o)l

with a probability at least 1 - p. This and the convergence of x l t imply that g(xl,) -+g(a(o)) with probability one.

The final evaluation of a(o) is obtained by integrating the second equation of (3-10) between T, - A and T,, where A is positive but arbitrary. Let T, -+a and suppose that the intervals [T,, T, - A ) are disjoint :

T . T , T ,

X 2 T , - X Z T , - A = - 1 g(X1,) dt - a 1 X2r d t - c [ x2t dz,. (3-13)

T n - A T , - A T , - A

Now, all with probability one as n-co, the left side of (3-13) goes to zero, the first term on the right tends to -dg(a(w)), and the second term on the right tends to zero. Thus

T.

- g(a (o ) ) A = lim c x l t dz,. (3-14) n-tm

T , - A

The last term on the right side of (3-13) represents a sequence of ortho- gonal random variables with uniformly bounded variance (since Exxi , Ex V ( x , ) 5 V ( x ) ) . Thus, the sequence, since it converges, can converge only to zero (with probability one). Thus, by (3-14), g ( a ( o ) ) = 0, and, hence, ~ ( o ) = 0 with probability one. In conclusion, x, -+ 0 with probability one under the condition - 2a -t c2 < 0. Also, for any m > 0,

(3-15)

Page 80: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 65

Let us try to improve the probability estimate (3-15) for small c2.

Let Vn(x) = V"(x) , n 2 1. For each m > 0, V,,(x) is in the domain of 2, (where for this example we let Q , still denote the set { x : V ( x ) < m}) :

s 2 n V n - 1 ( x ) x ; [ - a + c 2 ( n - l ) ) ] . (3-16)

If a >= (n - 1) cz , then Y V " ( x ) s 0 and we have

P,{ sup V ( X ) Z rn} = P x { sup V " ( x ) Z rn"} s ' " ( ' ) , - r n z r z o m > r z O m"

(3-17)

which is better bound than (3-15).

Example 8. The function

Suppose that a2(x) = a2x:/(l + x:) in Example 7.

Vl (x) = exp i V ( x ) ,

where V ( x ) is the Liapunov function of Example 7, is in the domain of 2, for each m > 0, and

%I 2 IpVl ( x ) = 2,V1 ( x ) = 21V1 ( x ) - ax: + ( 1 + 21x3

o2 1x:a2 5 2nv1 ( x ) x:

[ [ 2(1 + x:> ++I (1 + x:) - a + ~

~ 2 ~ v , ( X ) X : [ - a + ~ 2 ( i + + ) ] .

If a > 02/2, then letting 1 = (a -a2/2)/az yields Y V , ( x ) 5 0 and the bound

P, { sup V ( X r ) Z P } = px { SUP Vl(X,) 2 exp API r n > t ' O rnsrz0

s exp n ( ~ ( x ) - p ) .

Example 9. A Stochastic van der Pol Equation. Consider the form

Page 81: Stochastic Stability and Control Harold Joseph Kushner

66 I1 / STOCHASTIC STABILITY

of the van der Pol equation

d(R) + ~ ( 1 - x’) R dt + bx d t = c x d z ( E > 0), (3-18) or

With c = 0, the origin is asymptotically stable, and there is an un- stable limit cycle. Equation (3-18) is also equivalent to the system

(3-20)

dy2 = - by1 dl + Cy1 d z .

Equation (3-20) is obtained from (3-18) by integrating (3-18) and letting y1 = x and yZl =Jb [- byls ds + cyls dz,]. The identity between (3-18) and (3-20) can also be seen by noting that (at least until the first exit time from an arbitrary compact set) y , is differentiable and dyi,/dr = (i - 1 ) y;; pit. See also the discussion of the deterministic problem in LaSalle and Lefschetz [l], pp. 59-62, where the following Liapunov function is used:

YZ by: V ( y ) = - + -.

2 2

Let A”, be the weak infinitesimal operator of the process (3-20) in Q, . Q, is bounded, and the terms of (3-20) satisfy a uniform Lipschitz condition in each Q,. V ( y ) is in the domain of 2, for each m > 0:

YpV(y ) = y : [(; - be) + (%)I Suppose that c2/2 - be < 0. Then Y V ( y ) < 0 on the set

p =b: Y : < P I

3 (be - c 2 / 2 ) ’= be

Page 82: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 67

Define b p 3(b& - c2/2)

m = - = 2 2E

Then, in the open set Q,, Y V ( y ) < 0, provided that y , # 0. By Corollary 3-2,

(3-2 1)

and x,, = y,, + 0 with a probability at least 1 - V(y) /m. By a method similar to that used in Example 7, it can be shown

tha ty , -+Oforwin{w: V ( x t ) < m , all t<03}=B,. Let d>O and define t , = t,-l + A , to=O. Write t' for tnq,,. Then, from (3-20),

f',, 1'"

Ylr,,, - Ylt,,,-, = j' Y2t dt - & j' (y1r - $) d t . (3-22)

1'"- 1 r ' . - I

There is a random variable c(w), 0 5 c(w) < m, such that V(x, ) + c(o) (with probability one, relative to B,) for w in B , . Let w be in El,.

Sincey,, -+ 0, we havey:, -, 240) . As in Example 7, eithery,, +JC(&) or y 2 , + - d m . Since the term on the left and the term on the far right of (3-22) tend to zero as n + 03, the first term on the right also tends to zero as n + 03. This implies that c(w) = 0. Note that P , {B,} is given by (3-21).

If c2 is small, there is a simple device which will improve the estimate (3-21). Let V, (y ) = V"(y) . V,(y) is in the domain ofz , for each m > 0:

I .Vn-yy)Y:[(; - b&) + y + ( n - 1)cZ . 1 -

Let c2 be sufficiently small so that

be > ~ ' ( n - 3). (3-23)

Page 83: Stochastic Stability and Control Harold Joseph Kushner

68 I1 / STOCHASTIC STABILITY

Then Y V n ( y ) < 0 (if y , # 0) provided that

2 3 ( b ~ - C ' ( U - *)) bs

Y l < P n = -

Define m,,

bp 3 ( b ~ - C'(U - 3)) m = - =

" 2 2 E

Then, UV,(y)<O ( y , # O ) in the set Q,, = { y : V ( y ) < m , } = {y:

V"(Y) < 4 1 , and

There is some n for which V"(y)/m: is minimum, within the constraint (3-23). Also, x l t = yl, + 0 with a probability at least 1 - V"( y)/m:.

An application of Theorem 6 and its extension is given in:

ExampIe 10. Let E < 0 in (3-20). Then the deterministic problem (c = 0) has an unstable origin and a stable limit cycle. With

Y: by: V ( y ) = - + -, 2 2

we have, for each m > 0,

Also there is no finite escape time. Suppose that (as is true)

E x J y : s d s = m .

0

Then

Page 84: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 69

If E,yts and E,y:, both converge to nonzero limits, then

lim E,yt, - c2/2 + be

lim E,y:, bs/3 . -

Moment results on another type of stochastic van der Pol equation appear in Zakai [l, 21.

Example I I . A Two-Dimensional Poisson Parameter Problem. Let

k = ( A + Y ) x ,

where Y, is a matrix-valued Poisson process with two possible values, & = D or Y, = - D . Let the probabilities of transition be P{ Y,+, = Dly , = - D} = P { Y , + , = - DI y , = D } = P A + o(d). Suppose that both M & D are positive definite. The function

V ( x , Y ) = x ’ ( M + Y ) x

is bounded and has bounded derivatives in every bounded set of x points and tends to infinity as llxll -+a. For the process stopped on exit from each region Q,,= {x, y : V ( x , y ) < m } , V ( x , y ) is in the domain of the weak infinitesimal operator (which, on V(x) , is d).

To continue, let the real parts of the eigenvalues of ( A + D ) be negative. We write

d V ( x , Y ) = - x ’ C + x ,

when the initial condition is Y = + D , and

d V ( x , Y ) = - X’C-x

when the initial condition is Y = - D :

- C + = ( A + D ) ’ ( M + D) + (&I + D ) ’ ( A + D) - 2pD (3-24a)

- C- = ( A - D)’ ( M - D) + ( M - 0)’ ( A - D) + 2pD (3-24b)

If ( M t D ) and both C, and C - were positive definite, then Theorem 2

Page 85: Stochastic Stability and Control Harold Joseph Kushner

70 I1 / STOCHASTIC STABILITY

would apply. To be specific, suppose that

A = [ - ' 1 - b '1, D = [ i -:] ( d > b > O )

( A - D is an unstable matrix, so that the problem is not trivial.) If F is a stable matrix and G positive definite and symmetric, then there exists a positive definite symmetric matrix solution to - G = F'B + BF (Bellman [l]). Thus, by choosing a diagonal C, = C, with positive entries cl, c2 chosen so that C, - 2pD is positive definite and sym- metric, there is a positive definite symmetric solution ( M + D ) to (3-24a) :

M + D = (

Also M - D is positive definite. Substitute ( M - D) into (3-24b) and compute - C- ; this yields that C- is positive definite provided that

1 ( d - b)(c2 + c1 + 2pd) - 4 ( d - b) d + 2pd

- d 2 [2 - c112 > 0. (3-25)

If p is sufficiently large, then the inequality is satisfied. As p increases, so does the probability of a transition in any interval. We will not pursue the matter further. Certainly, stability is implied by weaker conditions than (3-25). If (3-25) holds, then, for any m > 0,

x ' ( M + Y ) x Px, , { sup x: ( M + r,) x, 2 m} 5 ~~.

m > t 2 O m

The form of V(x , Y ) was suggested by the following observation. If Y = - D, then, although the derivative of x: (M - D) x, may be positive, the average rate of decrease in x ' (M + Y,) x (initial condition Y = - D) for fixed x may be sufficiently negative for at V(x , Y ) 5 0.

Page 86: Stochastic Stability and Control Harold Joseph Kushner

4. DISCRETE PARAMETER STABILITY 71

4. Discrete Parameter Stability

The statements of the stability theorems for the discrete parameter process are similar to the continuous parameter theorems. The proofs are generally simpler, since the technicalities centering about the weak infinitesimal operator are not required. To illustrate the method, we state and prove only one theorem.

Theorem 12. Let xl, . .. be a discrete parameter Markov process, V ( x ) 2 0, and Q, = {x: V ( x ) < m}. In Q,, let

E X , , V ( x , + 1 ) - V ( X ) = - k ( x ) 5 0. Then

There is a random variable v, 0 5 v 5 m, such that V(xn) + v with probability >= [I - V(x) /m] . Also k(x,)+O in Q , with at least the probability that x,, is in Q,, all n < co.

Proof. Let t be the first exit time of x, from Q,. Define I, = x,,,; 2, is a Markov process. Also, V(2 , ) is a nonnegative supermartingale since J!?~, ,V(X,+~) - V ( X ) S 0 for Z either in Q , or not in Q,. Equa- tion (4-1) and the sentence following 'it follow from the supermartin- gale properties and the fact that V( .Q 2 m if V(x , ) 2 m for some i n.

Define i ( y ) by Ex, , v(z,+l) - V ( I ) = - i(I).

If I is in Q,, then l ( X ) = k ( 2 ) , and if 2 is not in Q,, then 2n+l = 2, = I and L(2) = 0; thus i ( 2 ) 2 0. Also,

n- 1

EXJl v (x,) - v ( 2 ) = - c Ex,o l (Xi ) . i = 1

(4-2)

Equation (4-2) stays bounded as n + co since V ( x ) 2 0 and &(x) 2 0. Thus pi = P x , o ( a ( x i ) 2 E ) is a summable sequence for each E > 0. By the Borel-Cantelli lemma, &(f,) -+ 0 with probability one. Since I,, = x,, all n < co, with probability 2 [I - V(x)/m], the theorem follows.

Page 87: Stochastic Stability and Control Harold Joseph Kushner

72 I1 / STOCHASTIC STABILITY

5. O n the Construction of Stochastic Liapunov Functions

Generally, in the application of the Liapunov function method, a Liapunov function V ( x ) is given and &V(x) = - k ( x ) computed. The usefulness of the function for stability inferences depends heavily on the form of k ( x ) , and much effort is usually expended in search for functions V ( x ) yielding useful k(x ) . In the deterministic theory, a great deal of effort has also been devoted to the reverse problem: Choose a k ( x ) z O of the desired form and seek the function V ( x ) giving v(x) = - k ( x ) in some region Q. If such a V ( x ) can be found, and if it has the appropriate properties, then stability inferences can be drawn. This reverse approach is intriguing, but it has not, to date, produced many useful Liapunov functions, except for the case of time varying linear systems. See Zubov [l], Rekasius [l], Geiss [l], and Infante [l].

We now give the stochastic analog of the method of partial inte- gration, an approach which attempts to obtain V ( x ) by integrating a given k(x , ) , with the use of the system equations. In the stochastic case, the average value of the integral is the quantity of interest. Only diffusion processes will be considered. Our starting point will be Dynkin’s formula, instead of the (deterministic tool) integral-deriva- tive relation of the ordinary calculus. Suppose 9 V ( x ) is given, then under suitable conditions on V ( x ) and T ,

V ( X ) - E x V ( ~ J = - E x 9 V ( x S ) ds = E x k ( x J d s . (5-1) i 0 i 0

In order to reduce the number of technicalities in the discussion, all operations in the development will be formal. T (in (5-1)) will not be specified; we assume that, on all functions considered, 9 V ( x ) = x Q V ( x ) , and Q will not be specified. Further, E,V(x,) is assumed equal to zero. These assumptions provide for convenience in the derivations, and are not seriously restrictive from the point of view of obtaining a result, since, in any case, any computed Liapunov function must be checked against the theorems of Section 2. (Roughly, if T is the first

Page 88: Stochastic Stability and Control Harold Joseph Kushner

5. ON THE CONSTRUCTION OF STOCHASTIC LIAPUNOV FUNCTIONS 73

exit time of x , from an open set Q, and 9 V ( x ) = &V(x) = - k ( x ) in Q , then YPE,V(x,) = 0 and E,V(x,) contributes nothing of value to either V ( x ) or z Q V ( x ) . ) The method is illustrated by two examples.

Example 1. Suppose that the process is the solution to the It6 equation

d x , = x2 d t

d ~ 2 = - ( b ~ 2 + 2 ~ ~ 1 + 3 x : ) d t + o ~ , d ~ ( b > 0 , c > O ) .

Suppose that we seek a function V ( x ) which is a Liapunov function in a neighborhood of the origin and which satisfies

9 V ( x ) = - x : . Define

11 = E x ~ 1 t ~ 2 1 d t , i 0

I , = d t .

0

Note that, formally, 1, satisfies 91, = - x: , and we will solve for 1,. Upon applying Dynkin’s formula (and omitting the €,g(xT) terms),

and substituting the values of the 9x1, we have

i i 0 i 0

2 x2 = Ex (- 2&,) dt = 4c1, + 61, + (2b - d) I , , (5-3a)

0

x ; = E , (- 25:) dt = - 21, , (5-3b)

X : = E, ( - 2~;) dt = - 31,. (5-3c)

Page 89: Stochastic Stability and Control Harold Joseph Kushner

74 I1 / STOCHASTIC STABILITY

If a’ c 26, the solution of the linear set (5-3) for the function I,, which we now denote by V ( x ) , is

It is indeed true that -EpV(x) = - x: and, further, it is readily verified thatTheorem2may be applied to V(x) in a neighborhood of the origin.

Equation (5-4) was not hard to calculate. The answer (5-4) was known : the application of 9 to each term of (5-4) individually yields only terms which are integrands in (5-2); hence, the I i could be calcu- lated by use of the integral formula. In general, we pursue the following procedure. Choose a collection of functions g, (x) , i = 1, ..., n ; these correspond to theleft sides of(5-3). Computeeach 9 g i ( x ) = x j a i j h i j ( x ) . Hopefully, the set { A i j } will contain only n distinct functions and one of these will (hopefully) be the desired function k ( x ) , the desired value of - U V ( x ) . Finally, by writing

gi(x) = E x J - 9 g i ( x s ) ds = - z a i j E , kij(xs) ds

we may solve the set of linear equations for the desired Liapunov function

0 i s 0

T

V ( x ) = E x 1 k ( x , ) d s . 0

Of course, it is usually no easy matter (if not impossible) to select a proper set { g , ( x ) } (so that there are only n distinct terms in {h i j (x ) } ) . As in the deterministic case, the choice of { g i ( x ) } seems to be a matter of trial and error. The computed V ( x ) must be tested according to the appropriate theorem.

If the number of terms in {h i j (x ) } cannot be reduced to n ( k ( x ) must be a distinct term in {hij(x)} in any case), it may still be possible to compute a Liapunov function which gives some information. An example follows.

Page 90: Stochastic Stability and Control Harold Joseph Kushner

5. ON THE CONSTRUCTION OF STOCHASTIC LIAPUNOV FUNCTIONS 75

Example 2. Let

dx = (- a + y ) x dt (a > 0) ( 5 - 5 )

where the time varying coefficient y , is the solution of

Our aim is to obtain a function V ( x , y ) which is uniformly (in y ) positive definite in a neighborhood of the origin (of the x space) and which satisfies

Y V ( x , y ) = - x 2 .

We restrict our search to functions of the form V ( x , y ) = x 2 P ( y ) , where P ( y ) is a polynomial.

Define the sequence I , , .. . : T

I - , = I - , = 0, I,, = ~ , , , [ x : y : d t ( n > 0).

The desired V ( x , y ) equals I , . Then, a formal application of Dynkin's formula gives

0

1

x2yn = Ex,y (- Yx:~:) dt I 0

n ( n - 1 ) J n - 2 = ( 2 ~ + n b ) I , , - 2 I , , + l - - - - (5-7) 2

where J n - 2 = Ex,y JL a 2 ( y , ) x:y:-2 dt. We now make the additional assumption, which will be helpful in the following calculations, that J,,-2 = a 2 ( y ) I , - , . Let us solve (5-7) under the supposition that I , = 0, n 2 3. Then, providing that

A = inf {2a (2a + 2b) (2a + b ) - 4a2(y)} > 0, Y

Page 91: Stochastic Stability and Control Harold Joseph Kushner

76 I1 / STOCHASTIC STABILITY

we have

x 2 [ (2a + 6 ) (2a + 2 b ) + 2 ( 2 a + 2 6 ) y + 4y2]

A

2x2 [ - a (2a + 6 ) (2a + 2 6 ) + 2a2(y) + 4y3]

A

1, = V ( x , y) =

2 V ( x , Y) =

(5-8)

Suppose that a2(y) and yo are such that y: < A / 8 for all f < 03 and the bracketed term in (5-8) is always positive. Then (5-8) is a stochastic Liapunov function. The set of such a2(y) and initial conditions yo is neither empty nor trivial. Let a = b = 1 , and max 0 2 ( y ) = 1. Then A = 20,andtheconditionsaresatisfiedifa2(y) = 0,for lyl > (20/8)’’3> lyol. By truncating the set of linear equations at larger n, and solving the set for a new approximation to I,, a larger allowable range of variation in y, is obtained.

Page 92: Stochastic Stability and Control Harold Joseph Kushner

1 1 1 / F I N I T E T I M E STABIL ITY AND FIRST E X I T T I M E S

1. Introduction

The behavior of many tracking and control systems is of interest for a finite time only, and their study requires estimates of the quanti-

for some suitable function V ( x ) . First exit time probabilities have a traditional importance in communication theory (see, for example, Rice [l]) where they are used, for example, to study the statistical properties of local extrema of random functions, or of clipped random functions; the study of control systems via a study of appropriate first exit times is certainly not new (see, for example, Ruina and Van Valkenburg [l]). Nevertheless, most of the available work that is pertinent to control systems seems to bear mainly on asymptotic properties or to involve numerical solutions of partial differential equations which are associ- ated with certain types of first exit time problems. In this chapter, the stochastic Liapunov function method of obtaining upper bounds to (1-1) is discussed, and several theorems and examples are given.

Most of the published work on stochastic control and tracking sys- tems seems to be concerned with the calculation of moments, especially with second moments. Moments have generally been easier to calculate or to estimate than the values of (1-1). However, for many problems, knowledge of (1-1) has greater pertinence than knowledge of moments.

77

Page 93: Stochastic Stability and Control Harold Joseph Kushner

78 111 / FINITE TIME STABILITY A N D FIRST EXIT TIMES

Many tracking or breakdown problems must be considered only as finite time problems, since either a stationary distribution of error may not exist, owing to eventual breakdown or loss of track, or else the “transient” period is the time of greatest interest. Estimates of (1-1) are also of importance when the system, even in the absence of the disturbing noise, is unstable outside of a given region. The noise may eventually drive the system out of the stable region with probability one, and then estimates of the probabilities of the times of these occurrences are of interest.

Tracking systems occur in many forms. For example, an aircraft may be tracked through the use of the noise corrupted signals which are returned to a tracking radar system, or a parameter of a control system may be tracked either directly by using noise disturbed obser- vations of the value of the parameter, or indirectly by noise disturbed observations on the values of the states of the system which is para- metrized by the parameter. In such cases, bounds on the probability that the maximum tracking error (within some given time interval) will be greater than some given quantity are of interest. This is especial- ly true in systems which lose track at the moment that the error exceeds some given quantity. The orbiting telescope is a problem of some current interest, where first exit times are important. To allow successful photography of astronomical objects, the direction of point (attitude) of the telescope must remain within a given region for at least the time required to take the photograph. However, the attitude is influenced by the random forces acting on the satellite, and in its control system.

Problems in hill climbing are often of the same type. Given some maximum seeking method of the gradient estimation type, one would like a reasonably good upper bound of the probability that at least one member of some finite sequence of estimates of the peak deviates by more than a given amount from the true location of the peak. Such estimates provide a useful evaluation of the hill climbing procedure,*

* Two opposing demands on the design of maximum seeking methods (which are based on the use of noise corrupted observations) are (l), to have the probability

Page 94: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS 79

and are of particular use in cases where there are many relative maxima, or where large errors cannot be tolerated ; for example, where a parameter of an operating plant is being adjusted. Similarly, the use of first exit probabilities provide one natural way of evaluating other “so-called” adaptive control systems.

The reader is also referred to the work on deterministic finite time stability in Infante and Weiss [ l , 21. The finite time results differ from the infinite time results in that 2, V ( x ) may be positive for some or all values of x. Nontrivial statements regarding the values of the proba- bilities (1-1) may, nevertheless, be made.

If the system has some “finite time stability” properties, but is not stable in the sense of Chapter 11, then a positive definite Liapunov function cannot satisfy x , V ( x ) s O in the set of interest, since this would imply a result of the type of Chapter I1 which gives (stability) information on the trajectories over the infinite interval [0, a). Generally, in this chapter, A,,,V(x) will be positive for all or some values of x in the set of interest. Nevertheless, the form of x , V ( x ) will suggest time dependent nonnegative supermartingales which, in turn, will be useful to obtain upper bounds to the probabilities (1-1).

An analog to the deterministic case is helpful. Let V ( x ) be differ- entiable in the set Q, = {x: V ( x ) -= m}, m > 0. Let i = f ( x ) and as- sume that in the set Q,, v(x) = ( d V ( x ) / d x ) ‘ f ( x ) 5 cp, where cp is a nonnegative constant. Then elementary considerations show that x, is in the set Q, for at least the time T = (m - V(x))/cp, where x = xo is the initial condition. This is a result in deterministic finite time stability.

2. Theorems

Theorem 1. Assume (Al) to (A5) of Chapter 11. In Q,, let

A m V ( x ) s - P ~ ( x ) + qpI (P > O > , (2-1)

of “bad” estimates as low as possible (for example, keep the probability of at least one “bad” estimate below a given quantity), and (2), to have the largest possible “rate of improvement” in the estimate of the maximum. The greater “caution” demanded by (1) slows down (2); see Kushner 191.

Page 95: Stochastic Stability and Control Harold Joseph Kushner

80 I11 / FINITE TIME STABILITY AND FIRST EXIT TIMES

where rp, is nonnegative and continuous in [0, TI. Define in [O, T ]

@, = i rps ds

0

where c is selected so that

p 2 max crp,.

Let I be such that if V(x, ) 2 m, then W ( x , t ) 2 I. Then, with xo = x,

f s T

With suitable choices of I and c, the appropriate equation, (2-3) or (2-4), holds, where @; = @,p/max rp, :

Remark. Letr(t) = e-'*'(I - eCGT/c) + l/c.Thedefinition of W(x, r) together with equation (2-2) imply that (x,, t ) stays within the hatched area of Figure 1 with at least the probability given by (2-2). To obtain (2-3) or (2-4), I is chosen so that maxTbrhOr(t) = m.

Page 96: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS 81

Moximum of r ( t ) in LOJI occurs ot t = O

Moximum of r l t ) in t0,Tl occurs ot t=T

Figure 1

Remark. Note that, for V(x , ) = V ( x ) # 0 and T = 0, the right sides of (2-3) and (2-4) do not reduce to zero. This behavior is due to the method of proof which does not distinguish between initial con- ditions (of V ( x , ) ) which are fixed constants of value V ( x ) and initial conditions which are random variables whose expectation is V ( x ) . The results remain valid (as they do in Chapter 11) if the initial condition is a random variable (nonanticipative) with mean value V ( x ) .

The fact that W(x, , t ) (stopped on exit from some region) may be a nonnegative supermartingale is suggested by the form of A",V(x) in (2-1).

Proof. Define & = {x, t : W ( x , t ) < 1, t < T ) and let z be the first time of exit of ( x t , t ) form on. (Labeling the vertical axis in Figure 1

Page 97: Stochastic Stability and Control Harold Joseph Kushner

82 111 / FINITE TIME STABILITY AND FIRST EXIT TIMES

by V ( x ) , the scaled Q, corresponds to the interior of the shaded areas.) Let s, = r n 7 . Both 2, = xrnr and it = (Z,, s,) are right continuous strong Markov processes in [0, TI. In reference to the it process, let px,, { B } be the probability of the event B given the initial condition 2, = x, s, = t where (x, t)e&,. Now, let (x, t )e&. Then the stochastic continuity of the process 2, at t implies that p,,,(7 > t + h) -, 1 as h decreases to zero. Consequently Ex, ,(exp [ c @ ( t + h ) n T] - e'@')/h +

c q , ecdt as h + 0. Let a, be the weak infinitesimal operator of it. The operation of d, on W ( x , t ) , for (x, t ) in Q A , is easily calculated to be (note that if (x,, s,) is in Q A , then t = s,)

a , W ( x , s) = e '*s[cq ,V(x) + A",V(x)] - ~ s e c * s ~ 0. (2-5)

Note that W ( x , s) 2 0 for (x, s) in Q,. Now an application of Dynkin's formula vields

which is (2-2).

bound (2-2). If W ( x , t ) 2 1 for some t 5 T , then Inequalities (2-3) and (2-4) are obtained by a weakening of the

V(XJ 2 r ( t ) = e-'Or C

for some t 2 T(and vice versa). (See the remark following the theorem statement.) Also, Px{supT,,,, V ( x , ) 2 supT2,,,r(t)} is no greater than the bound given by (2-2). r ( t ) is either monotonic nonincreasing 01 monotonic nondecreasing, as t increases to T, depending on whether or not 1 is greater than or less than eCoT/c. Each case will be considered separately.

Let the maximum o f r ( t ) ( in [0, TI) occur at t = 0. Then, choosing 1 so that the maximum of r ( t ) equals m, we have 1 = in + (eCoT - 1)/c . Note that, for consistency, we require 12 ec*T/c; thus c 2 l / m is required in this case, and the right side of (2-2) equals

~ ( x ) + (ecQT - l ) / c

in + (ecQT - l ) / c B ( c ) = *

Page 98: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS 83

A further constraint on the value of c is c 5 p/max cp,, which assures that (2-5) is nonpositive in &. Suppose that the interval A = [l/m, p/max cp,] is not empty (a necessary and sufficient condition for the maximum of r ( t ) , in [0, TI, to occur at t = 0). The value of c in A which minimizes B(c) is c = l/m, the smallest number in A . Equation (2-3) follows immediately.

If A is empty, then the maximum of r ( t ) in [0, TI occurs at t = T. To satisfy c 5 p/max cp,, we set c = p/max cp,. Also, choose 1 so that m = max,.. - r ( t ) = r ( T ) = leKCaT. The right side of (2-2) now equals

~ ( x ) + (ecaT - l)/c

mecoT ~~ B(c) =

and (2-4) follows immediately upon substitution of the chosen c. (It can also be verified directly that A s e-C*T/c in this case.)

Remark. In the examples, we use the special case cp, = cp, a constant. Then QT = cpT and rnax,.,cp,/p = cp/p.

Corollary 1-1. Assume the conditions of Theorem 1, except that p 2 0. Then

Proof. The proof is the same as that of Theorem 1, except that W ( x , t ) = V ( x ) + QT - Q, is used. (Alternatively, let c + 0 in Theorem 1.) For m s @jT, the bound (2-6) is trivial.

Remark. The theorems of this chapter are only particular cases of the general idea that if W ( x , t ) , t 5 T, is a nonnegative supermartingale in a suitable region, then

Theorem 1 is given in the form in which it appears, since form (2-1)

Page 99: Stochastic Stability and Control Harold Joseph Kushner

84 I11 / FINITE TIME STABILITY AND FIRST EXIT TIMES

arose in a number of examples, with the use of some simple V(x) . It is likely that other cases, not covered in the theorems or by the examples, will be of greater use for some problems, either in that the corresponding Liapunov functions will be easier to find or that they will yield better bounds. One such special case is given by Corollary 1-2, and the Wiener process example following it.

Remark. The forms of A”, on given domains for It6 and Poisson processes are given in Chapter 1, Sections 4 and 5.

Remark. The bound (2-6) may seem poor, and it probably is for most problems that one is likely to encounter. Yet there are simple systems for which the probability in (2-6) does increase linearly with time. Let i = 1, and let the initial condition be a random variable which is uniformly distributed in the interval [0, 13. Then P x { ~ ~ p T 2 , 2 0 x, 2 l} = T f o r T s 1.

The Liapunov function approach is often a crude approach. Given a tentative Liapunov function, its application to systems of rather diverse natures may yield very similar forms for &,V(x). If there is to be no further analysis or search for other possible Liapunov functions, then we must be content with the result given by a function which does not distinguish between quite different systems.

Corollary 1-2. Assume (Al) to (A5) and let A“,V(x) S p V ( x ) . Let m 2 WLT, 11, 2 p > 0. Then

(2-7) V (x) P, { sup e-”lr V ( x , ) 2 A} S ~.

T > r z O 1

Proof. The proof is very similar to the first part of the proof of Theorem 1 and will be omitted. The following example illustrates a case where the value of p, may be chosen as a function of T, to yield a useful estimate.

Example. Let dx = a dz, where z , is a Wiener process. Let V ( x ) =

cosh px = (eP” + C P X ) / 2 . Then, for each m c co, A“, acting on V ( x )

Page 100: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS 85

(or on a function which is equal to V ( x ) on Q, and which is bounded and has bounded and continuous second derivatives) is (a2/2) d’ldx’. Thus

- a2J2 A , V ( x ) = - V ( x ) .

2

By reasoning as in Theorem 1, for pl = J20’/2 = p, we have

~ ( x ) = 0. Define m by

Then cosh Jrn = (cosh JE) exp (pT). (2-8)

P, { sup (x,[ 2 m> 2 P, { sup e-”‘cosh px, 2 cosh PE) T>rgO T g r t O

cosh Jx cosh Px - <-=-

cash PE cosh Jm (exp PT) -

5 2 exp [pT + P (x - rn)] . (2-9)

The last term on the right of (2-9) is minimized by J = ( rn - x)/a’T yielding a bound

( m - x)’ 2 exp -

2a’T ’

The success of the method for more difficult problems depends, of course, on the availability of functions satisfying J , V ( x ) = p V ( x ) , for an appropriate range of p.

Suppose that x, and y , are two processes (not necessarily with the same transition function) whose initial conditions are close. Theorem 2 gives a “Liapunov function” estimate for the probability that the processes have separated by at least some fixed distance at least once in a given time interval. The proof is essentially that of Theorem 1 and will be omitted. Such estimates are particularly important in control applications (as, for example, in the study of the deviations of a ran- domly perturbed trajectory from the“nomina1” unperturbed trajectory).

Page 101: Stochastic Stability and Control Harold Joseph Kushner

86 111 / FINITE TIME STABILITY AND FIRST EXIT TIMES

Theorem 2. Let (Al ) to (A5) hold for processes x,, y,(withinitial conditions x, y , respectively) and function V(x, y , r ) 2 Oin the bounded set Q = {x, y : W, (Ijx - y 11) < m}, where W, ( r ) >= 0 is continuous and increasing and W, (0) = 0, and V ( x , y , r ) 2 W, (IIx - y ll), for t 5 T. Let

where p > 0 and cp, is nonnegative and continuous in 10, TI. Define

W ( x , y , t ) as in Theorem 1, with V ( x , y, I ) replacing V ( x ) . Then, (2-2)- (2-4) hold, with V(x , , y,, t ) replacing V(x, ) and V ( x , y , 0) replacing V ( x ) . Also, with these substitutions the left sides of (2-3) and (2-4) majorize the quantity

Theorem 3 is the discrete time parameter version of Theorem 1.

Theorem 3. Let x,, n = 0, ..., N be a Markov process and V ( x ) a continuous nonnegative function with

(2-10)

in Q, = { x : Y ( x ) < m}, where P > 1, cp, 2 0. Define

whereF"(K)=n{-' [ K / ( K - cpi)],FO(k)= l , andP> K / ( K - cpJ.Let1 be such that if Y ( x ) 2 m, then W(x, n) 5 A. Then with xo = x,

V(X) + K F N ( K ) - K P x { sup W(x,,, n) 2 A} 5 ~ ~ ~ = B ( K ) . (2-11)

N L n z O 1

With appropriate choices of A and K, and letting m > max 'pi = cp', we

Page 102: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS 87

obtain

P,{ sup V ( x n ) z m> 5 1 - ( 1 - y ) ~ y (1 - "i), (2-12) N t n Z O M

m ' =- "" > m (2-13) P - 1

Remark. reduces to

In the special case where 'pi = q, the right side of (2-13)

(2-13a)

Proof. Define T as the first Markov time (integral valued) for which W(x, , n) 2 A, n 5 N , and let &, be as in the proof of Theorem 1. Except for the fact that Dynkin's formula is not needed, the proof is similar to that ofTheorem 1. The stopped process X, = x,,, is a Markov process, and S O is in = (z,, z nn). A straightforwardcalculation shows that W(x, , n) is a nonnegative supermartingale if f i > K / ( K - qi) (for each i) or K > (p'/?/(/? - 1) (with adjunction of the appropriate family of a-fields), for n 5 N . (For ( x , , n ) in Q,, Ex, W ( X , + ~ , n + 1) - W(x, , n) 5 0.) Inequality (2-1 1) is the nonnegative supermartingale probability inequality. Define

= yj! (+) K - ~ p . [A: - K y (A)] + K .

The maximum of r ( n ) , in the interval [0, N ] , occurs at either n = 0 or n = N , depending on whether A is greater than or less than

Page 103: Stochastic Stability and Control Harold Joseph Kushner

88 111 / FINITE TIME STABILITY AND FIRST EXIT TIMES

K n E - ‘ ( K / ( K - (pi))= K F N ( K ) . If 1 is chosen so that maxos,5Nr(n) = m, then

P,{ sup V(x,,)Z m } 2 P,{ sup W ( x , , n) 2 A} i B ( K ) . N t n t O N > n t O

Assume A? KF,(K) and let maxNtntor(n) = m. Then 1 = m + K F N ( K ) - K . The last two sentences jmply that m 2 K . Thus we require

Now

which is minimized (in the alloted interval m 2 K 2 (p ’p/ (p - 1)) by the maximum value K = m, and (2-12) follows immediately upon making this substitution.

Now, let 1 5 KF, (K) ; then the maximum of r ( n ) occurs at n = N , and, setting the maximum equal to m yields 1 = mF,(K) . For consist- ency, we now require that K > m. Since K 2 (p ’p / (p - 1) is also re- quired, and (in this case) m < (p’p/(p - l), and

decreases as K decreases, we set K = (p ’p / (p - 1) = m’ > m, the mini- mum allowed value. Equation (2-1 3) follows immediately upon this substitution.

Corollary 2-1. Assume the conditions of Theorem 2, except that BL 1. Then .~

P, { sup V (x,) 2 m } i (2-14) N t n t O m

Proof. The proof is similar to that of Theorem 2, using W ( x , n) =

V ( X ) + 1;- cpi.

Page 104: Stochastic Stability and Control Harold Joseph Kushner

2. THEOREMS 89

STRONG DIFFUSION PROCESSES

It is instructive to consider a derivation of a result of the type of Theorem 1 from another point of view. The derivation is for the special case of the strong diffusion process in a compact region and makes use of the relationship between such processes and parabolic differential equations.

Let G be an open region with boundary dG and compact closure. In order to apply the result for parabolic equations it is required that dG satisfy some smoothness condition. We suppose that dG is of class H 2 + = (see Chapter I , Section 6). If g is given by Q, = {x: V ( x ) < m ) , where Q, is bounded and V ( x ) has Holder continuous second deriva- tives, the smoothness condition on dG is satisfied.

Theorem 4. Let x, be a strong diffusion process in G, where C and dG satisfy the conditions in the previous paragraph. Let P ( x , t ) be a continuous nonnegative function with continuous second deriva- tives in the components of x and a continuous first derivative in I , in G = [G + dG] x [0, TI. If

( L - ;) P ( x , t ) 2 0 (2-15)

and

(2-16) P(x, 0) L 0 , P ( x , t ) Z l , T Z t > O , x ~ d G .

Then, for t 5 T,

P , { x , $ G , some s 5 t } 5 P ( x , t ) . (2-17)

Remark. ,Although Theorem 4 is applicable only to strong dif- fusion processes, it is suggestive as a criterion for evaluating stochastic Liapunov functions to which Theorems 1 and 4 are to be applied. Let P, (x, t ) and P,(x , t ) satisfy the conditions on P ( x , r ) of Theorem4. Let P, (x, t ) 2 P,(x , t ) on dG x [0, TI + G x {0},

Page 105: Stochastic Stability and Control Harold Joseph Kushner

90 111 / FINITE TIME STABILITY AND FIRST EXIT TIMES

in G . Then Theorem 4 and the strong maximum principle for para- bolic operators (Friedman [l], Chapter 2) yields

P, (x, t ) 2 P2 (x, t ) 2 P, {xs# G , some s 5 t } .

Thus P 2 ( x , t ) is no worse an estimate than P, (x, t ) , and will be better if the inequalities are strict at some point.

Let x, be a strong diffusion process and suppose that V ( x ) has continuous second derivatives. Let P,(x , t ) be the right side of (2-3) of Theorem 1 (where cp, = cp and p 2 cpfm) and L = A",, and let G = {x: V ( x ) < m } be bounded. On dG, V ( x ) = rn and Pl (x, t ) = 1. At t = 0, Pl(x, 0) = V(x) /m 2 0. Also

Since LV(x) 5 - p V ( x ) + cp and p 2 cp/m by assumption,

and we conclude that the right side of (2-3) satisfies the conditions on P ( x , t ) of Theorem 4, under the appropriate conditions on the process x,. A similar statement can be made for the right side of (2-4).

Proof of Theorem 4. Under the conditions on S i j ( x ) , f ; ( x ) , and on dG, Theorem 13.18 of Dynkin [2] (see Section 6, Chapter I) yields that p ( t , x, y ) , the probability transition density of the process x, in G (and stopped on dG), satisfies ( L - d / d t ) u = 0. (Here P , ( x , E ~ c G) =

Jr(p(t,x,y)dy). The function p ( / , x , y ) is continuous and has contin- uous second derivatives for t > 0. Alsop(t, x, y ) + 0, x + dG, t > 0, and p ( t , x, y ) 5 Kt - exp - Ily - x1I2/2th, for some positive real numbers K and h.

So, the function 1 - jGP(tr x,y) dy = Q(x, t ) is the probability that the first time that xs exists from G is no larger than t . Q(x, t ) satisfies ( L - a / d t ) Q(x, t ) = 0 with boundary condition Q(x, t ) + 1 as x + dG, t > O a n d Q(x,t)+Oas t + O , x ~ G . N o t i n g t h a t ( L - d / d t ) P ( x , t ) g O and that P(x , t ) 1 Q(x, 1 ) on G x ( 0 ) + dG x [0, T I , an application of

Page 106: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 91

the strong maximum principle for parabolic operators yields that

in P(x, t ) 2 Q(x, t ) [G + dG] x [0, T I .

Thus, for t T,

Q(x, t ) = P,{x,$G, some s S t } S P ( x , t ) . Q.E.D.

3. Examples

Example I. A Hill Climbing Problem. In this example, bounds on the errors associated with a standard simple gradient hill climbing method are developed. Let f ( x ) be a smooth scalar-valued function with a unique maximum O n , at time n, which varies in time according to the rule en+, = 8, + $,, where the $, are assumed to be inde- pendent random variables. The quantities x and 8, are scalar valued, and x, is the nth sequential estimate of 8,. For the most general problem, it would be assumed that not much a priori information concerningf(x) is available, but that observations on f(x), corrupted by additive noise, can be taken.

To facilitate a simple development we suppose that f ( x ) takes the simple formf(x) = - k(x - 8,)/2 and let two observations (at x, + c and at x, - c) be taken simultaneously. The sequence x, is given by the gradient procedure

a [observation at (x, + c) - observation at (x, - c)]

2c ~~~~~ - x,+1 = x, + - ~~

(3-1) (xn + C) - f (xn - C) + t n l

= x, + ~~

2c

The random variable t, is the total observation noise and the members of {t,} are assumed to be independent. a and c are suitable positive constants. Equation (3-1) can be written as

xn+ 1 - e n + 1 = (1 - a k ) (xn - e n > + v n

at, 0 =--$,

2c

Page 107: Stochastic Stability and Control Harold Joseph Kushner

92 111 / FINITE TIME STABILITY AND FIRST EXIT TIMES

Define Ev; = m2 and ED: = m4 and let Eon = Evi = 0. In order to apply Theorem 3, the assumption that 11 - akl c 1 will also be needed. Let en = x, - 9,; then

e n + , = (1 - &)en + 0,. (3-2)

The estimates of P,(supNLntO lenl 2 E ) given by twodifferent Liapunov functions will be compared. First, define V, ( e ) = e2 . Then (3-2), applied to V, (e), gives

Eerie;+ = (1 - ak)2 e: + m 2 ,

p = (1 - ak) - ' , 9, = m2.

Thus Theorem 3 is applicable and, with the use of m = E~ and e t = e2, gives the bounds

P,{ sup lenl 2 E } = P,{ sup e: 2 2) N t n z O N t n z O

if m2 - 2 9P

& z - - p - 1 [I - (1 - a k y j

and

e2( l - ak)2N m2[1 - (1 - ~ k ) ~ ~ ] P x { sup (enl 2 E } 5 ~ ~~ + ~ ~ ~~~~ (3-3b)

N B n z O E 2 E 2 [l - (1 - ak)2]

if 2 m2

& 5 - p [l - ( 1 - ak)2j'

The bound in (3-3b) is separated into two terms, the first depending on the initial error, and the second depending on the noise variance. As N increases, the contribution of the initial error decreases exponen- tially, while the noise contribution increases as a constant minus a decreasing exponential.

Now let us try V,(e) = Be4 + e2 as a Liapunov function.

Page 108: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 93

2 &“(Be:+ + en+ = B ( l - ak) , e:

cp = Bin, + m 2 . (3-4)

Apply Theorem 3 to V,(e) and let m = Be4 + E ’ ; this gives the bounds

P, { SUP lenl 2 E l = P,{ SUP V 2 ( 4 1 Vz(E)l N L n > O N > n > O

N Be4 + e2 Bm, + m2 I l - 1-- 1 -___ ) - ( BE, + c 2 ) ( BE, + E’

(3-5a)

when 4 Bm, + m2

m=BE +E‘L--- 1 - 1 / p ’

and

P,{ SUP lenl L E l N L n > O

(Be4 + e’) /TN f (1 - p- ” ) (Bm, + m2)/(l - p - ’ ) I (3-5b)

BE,> E 2 -

when Bin, + m2

BE, + E’ < 1 - l $

Let us compare the two forms V,(e) and V2(e) . To simplify the comparison, let (3-3a) and (3-5a) be applicable, and let e=O. The bounds (3-3a) and (3-5a) reduce to 1 - ( r n , / ~ ~ ) ’ and 1 -(Bm, + m 2 ) ” / ( B ~ 4 + E ’ ) ~ , respectively. The ratio of the powered quantities

Page 109: Stochastic Stability and Control Harold Joseph Kushner

94 111 / FINITE TIME STABILITY AND FIRST EXIT TIMES

is (the second to the first)

1 + Bm4/m2

1 + B e 2 *

It is now readily seen that V,(e) is preferable to V2(e) if and only if rn4 2 e 2 m 2 . If V2 (e ) is preferable to Vl ( e ) there remains the problem of choosing B, but we will not pursue it.

Example 2. A Second Order Nonlinear It6 Equation. We consider the particular stochastic form of Lienard's equation

d x , = x2 dt d ~ 2 = - (2x1 + 3 ~ : + ~ 2 ) dt + 0 d z .

The function

(3-6)

x: ( X I + X 2 l 2 2 v ( x ) = ~ + ~- ~- + 2x , (1 + X I ) 2 2

is a Liapunov function for the deterministic problem.* The determin- istic system has a saddle point at x = (- 2/3, 0) (see LaSalle and Lefschetz [l], p. 64) and the origin is asymptotically stable in the sense of Liapunov. At x = (- 2/3, 0), V ( x ) = 14/27 = f i . Thus, the proba- bility that V(x , ) 2 f i is essentially the probability that the object will be lost in the time interval [0, TI. Q, is within the domain of attraction of the origin for the deterministic problem. V ( x ) is in the domain of z,,, for m 5 f i . Suppose that rn 2 f i :

A",V(x) = Y V ( X ) = - x: - 3x: (3 + x l ) + cT2

A majorization of V ( x ) yields

2

V ( x ) 5 3 ( : + .:) + 2x:

~~ -

* This Liapunov function is a special case of one constructed by E. F. Infante (personal communication) for equation (3-6) with arbitrary coefficients.

Page 110: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 95

and

In Q,, the least value of the right side of (3-7) occurs at (xl = - q,, x2 = 0) where qm < 2/3. Thus

and Y V ( x ) 6 - p,V (x) + a2.

P,{ SUP V(Xl )Z A},

Now, we seek a good estimate of

T L 1 Z O (3-9)

and, to obtain it, we would like to use the stronger part of Theorem 1, namely (2-3), if applicable. This form, whose application to the region Q, requires f i 2 q / p A = a’/.,, is not directly applicable, however, since pA = 0. Suppose that there is some m, < A such that mo 2 a2/p,, and V ( x ) < m,. Then the form (2-3) may be used to estimate P X { ~ ~ p T z l z , V ( x J 2 m,}, and a bound on (3-9) obtained by the in- equality

P, { sup V ( x , ) 2 A} 5 P, { sup V ( x , ) 2 m,} . (3-10)

We now outline a procedure for optimizing m,. (Note first, how- ever, that for some (small) values of T and V ( x ) , the form (2-4) applied to QA may be preferable to the following procedure.) Since the ex- ponent in (2-3) is a2/m, provided m 2 0 2 / p , , the remarks above sug- gest that we seek the largest value of m < A for which m 2 a2/p,. This is the largest m (or, equivalently, the largest q,,, < 5) for which

T Z I Z O T h t Z O

Page 111: Stochastic Stability and Control Harold Joseph Kushner

96 I11 / FINITE TIME STABILITY AND FIRST EXIT TIMES

For small u2, the estimate can be improved by the use of either V”(x) or exp $V(x) , although, since the computations are somewhat more complex, the problem will not be pursued here.

Example 3. In this example, a simple first order linear It8 equation is considered. This is the simplest model of a continuous time tracker.

Let x, be the position of the object to be tracked, and let us suppose that the system dx = u1 dw, where w , is a Wiener process, models x,. Let rn, and e, = rn, - x, denote the estimate of x,, and the tracking error, respectively. A fairly common form for the observations on x, is

g (4 + $5 I

where $, is a noise term and g(e )e 2 0. The observation may be a function of e , , rather than of x,, since the physical observer or tracker may center his “sighting” at rn,, the estimate of x,, and the obser- vation would then be a function, g(e,), of the difference between the center of the sighting and the true location.

Let the $, be “white” Gaussian noise and represent the m, process

by dm = - g ( e ) dt + o2 d u ,

where u, is a Wiener process (to account for the supposed white Gaussian observation noise). Then

de = - g ( e ) dt + 0 d z ,

where z , is a Wiener process and (T dz = - u1 dw + u2 du. If g(e ) = 0 for (el 2 E , then the only “restoring force,” when [el 2 E , is the noise, and track may be considered to be lost if lei 2 E .

Page 112: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 97

In the remainder of this example, we consider the process in { e : [el < E } and the linear case g ( e ) = ue. Let V(e) = lei" n 2 2. Then

I1 ( n - 1) A",T/(e) = P V ( e ) = - nu [el" + Q ~ (3-1 1)

2 which takes the form

P V ( e ) 5 - pT/(e) + cp

p = n u , c p = - - ----

(3-12)

o 2 n ( n - 1 ) P

2

Let m = E" 2 cp/p = 0 2 ~ " - 2 ( n - 1)/2u. Then (2-3) is applicable and gives the bound

P,{ sup letl 2 E } = P x { sup le,l" 2 E " } T z r z o T L t t O

n(n - 1) 02T - < 1 - ( I - Fr)exp[- ~ 2E2 --I . (3-13)

When putting (3-1 1) into the form (3-12), a helpful tradeoff between the first and second forms of terms of (3-14) is possible. Write (3-1 1) as

2 - pyleln + c p r , 1 L y 2 0. (3-14)

A judicious choice of y can reduce the bound (3-13). To facilitate a simple illustration, let

m2--[Zn CTyn - 1) ( n - 1) CT2 ] ( n - 2 ) i 2 E D .

cpy = o 2 ( n - 1)[(--- -1

2

The reason for this assumption will appear shortly. It implies that form (2-3) is applicable for some 0 < y < 1 ; cpy is the maximum of the second term of (3-11) and equals

n - 1) (. - 2) CT2 ' n - 2 ) / 2

2(1 - y ) a n

Page 113: Stochastic Stability and Control Harold Joseph Kushner

98 I11 / FINITE TIME STABILITY AND FIRST EXIT TIMES

2 and occurs at e2 = (n - 1) (a - 2) a2/(2an(l - y ) ) = ey ; if e, > E, then the maximum value of the second term of (3-1 1) occurs at E = e. Let E~ 5 E . To make the most effective use of Theorem 1 (assuming (2-3) is applicable) we want to select y so that cpy/m is a minimum subject to the constraint p y 2 q y / m (which assures us that (2-3) applies); i.e., if m is sufficientlylarge, we would like to select y = y * so that m = cpy./py.. (See Figure 2.) The value of y which minimizes cpu/py is y* = 2/n and

m

I 1 0 Y* I

Y-

Figure 2

yields cp,./py. = D. Thus, we require

m >= cpy,/py. >= D . The inequalities are satisfied by our assumption on m. Thus, corre- sponding to some 0 5 y 6 1, the equality m = cpy/p7 holds. With this y, (2-3) yields the bound

(3 -15)

Page 114: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 99

If the hypotheses of the construction are satisfied, namely m 2 D, then (3-15) is preferable to (3-13), provided that the same value of n is used in each. However, the best value of n in (3-13) is n = 2 (at least for e = 0; n 2 2 by assumption), and the computations which have been made have not proved that (3-15) is preferable to (3-13) when each is minimized seperately with respect to n. Nevertheless, the method does seem to have a general interest.

Example 4. A Two-Dimensional Linear It6 Equation. Let the system be modeled by

d x , = x2 dt

dx2 = (- X I - ~ 2 ) dt + CT d~ (3-16)

We will use the stochastic Liapunov function

V ( x ) = +x; + X l X 2 + x: = x’Bx with

A”,V(X) = ~ V ( X ) = - X : - X : + C T ~ = - X ’ A X + 0’. (3-17)

Equation (3-17) can be put into the form

J m V ( x ) 2 - p V ( x ) + cp

p = min (x: + x:)/ (+x: + x 1 x 2 + x : ) . 2

cp = 0 , X

The minimum of the ratio is easily shown to be the least value of 111 for which

( A + 1B) x = 0

has a nontrivial solution, and it equals p = 1 - 0.4 J1.25 2 0.552. Thus

z m V ( x ) 2 - 0 . 5 5 2 V ( x ) + a’. Suppose that m 2 a2/(0.552). Then equation (2-3) gives the bound

P,{ sup V ( x , ) 2 m} 5 1 - ( 1 - ~ v ( X I )exp(-$). (3-18) T L t Z O

Page 115: Stochastic Stability and Control Harold Joseph Kushner

100 I11 / FINITE TIME STABILITY AND FIRST EXIT TIMES

IMPROVING THE BOUND

We will give a qualitative description of the situation when the Liapunov function V"(x) is used :

A",V"(x) = 9 V " ( x )

(2x2 + d ( n - 1 ) = n V ~ - l ( x ) [ - x : - x 2 + f J +- 2V ( X I --I.

(3-19)

There are many ways of factoring (3-19) into the form 9 V " ( x ) s - pV"(x) + cp. Let ji = 0.552 and write

9 V " ( X ) 5 - p , V " ( x ) + cp" where

p,, = npa, ,

cpn=max nji(a,,- ~ ) ~ " ( x ) + n ~ " - ' ( x ) c ~ ~

1 > a, > 0

~ -1. (3-20) 2

The first term in the bracket of (3-20) is negative and dominates the other terms for large ]lxll. Thus, for any 1 > a, > 0, the term being maximized will have a finite maximum which will occur at a finite point. cpn increases, as a, increases to one. cpn also increases with n roughly as n". Choose 1 > a, > 0 so that cpn is as small as possible subject to the constraint m" 2 cp,/pn. (To simplify the discussion, we assume that m" 2 cp,/pn for the range of n of interest. For any fixed m, this will hold for at most finitely many n. If m" 2 cp,/p, cannot be achieved, then the argument must be based on (2-4) rather than on

For any n, if m is suflciently large, then the exponent in (2-3) or (3-21) satisfies cpn/mn < cpr/mr for all r < n. Also, Vn(x) /mn < V r ( x ) / m r , r < n. We may conclude that, for each value of m, there is an optimum n (n is not required to be integral valued, provided n 2 1); the optimum

s (2x2 + x l ) 2 V " - ' ( x ) c?(n - 1 ) n +-

(2- 3).

Page 116: Stochastic Stability and Control Harold Joseph Kushner

3. EXAMPLES 101

value of n is an increasing function of m. While no computations have been made, it seems reasonable to expect that, if the optimum n(m) were substituted into (3-21), then, for fixed T, the graph of the right side of (3-21), considered as a function of m, would have a bell shape, rather than the shape of a simple exponential:

P, { sup v (XI) 2 m } = P, { sup I/" (XI) 2 m") T z t h O T z I t O

- I 1 - (1 - 7) V" (X I exp (- ::). (3-21)

A related approach involves the use of the Liapunov function exp $ V ( x ) in lieu of V " ( x ) . The choice of $ replaces the choice of n. We have

$0' exp $V(x) = $ exp $v(x ) - x: - x i + 0' + -(2x, + x 1

2

(3-22)

which may be treated similarly to 9 V n ( x ) . For more details, see Example 2 of Chapter V which concerns the design of a control for the system (3-16).

Page 117: Stochastic Stability and Control Harold Joseph Kushner

Iv / OPTIMAL STOCHASTIC CONTROL

1. Introduction

This section motivates the more general discussion of the succeeding sections. Let x: be a family of strong Markov processes. The parameter u associated with each member is termed a control. The control determines the probability transition function p”( t , x; s, y ) of the process xy. Usually each u may be identified with a specific member of a given family of functions which takes values u(x:, t ) depending only on x: and t , at time t. The object of control theory is to select the control so that the corresponding process possesses some desired property. One object may be to transfer an initial state x to some target set S with probability one, that is, choose a u so that x: + S with probability one); another object could be for x: to follow as closely as possible (in a suitable statistical sense) some preassigned path.

Write A”u and A”,” for the weak infinitesimal operators of the process x: and the process stopped on exit from an open set Q,,,, respectively, and write E: for the expectation given that the initial state is x. Hence- forth x: will be written as x,, and #($, t ) may be written u,. The specific control associated with x, will be clear from the context. The problem may be developed further by associating to each control a cost

C” (x) = E l b (XI,) + E l k (x,, us) ds . (1-1) 1 0

102

Page 118: Stochastic Stability and Control Harold Joseph Kushner

1 . INTRODUCTION 103

If we wish to attain a target set S, then 7, is the random time of ar- rival at S . A problem of optimal control is to select u so that x, -+ S with probability one as t approaches some random time 7, (possibly infinite valued) and which minimizes the cost C"(x) with respect to other controls of a specified class of "admissible comparison controls."

A DYNAMIC PROGRAMMING ALGORITHM

Let us first consider the control of processes governed by It8 equations, and, for the moment, proceed in a formal way. Suppose that the control process is modeled by the homogeneous process

d x = f (x, U) dt + a ( x , u ) d z

Y = c f i ( X , u ) - + a 1 - c s i j ( x , u ) - - - . at , (1-2) j ax i 2 i , j axi a x j

and assume A"u = 2" on all functions to which the operator is to be applied. The function u takes values u ( x J at time t . The system (1-2) has a well-defined solution in the sense of Itd for any sufficiently smoothf, u, and 0. Thus, the control u indeed determines the process.

Let a target set S be given and let (1-1) represent the cost given that xo = x is not in S. Denote the minimum cost by V ( x ) ; in other words, V ( x ) 5 C"(x) for all controls u lying in a specified comparison class. If V ( x ) is indeed the minimum cost, then the principle of dynamic programming (Bellman [ 2 ] ) implies that V ( x ) satisfies the functional equation

V ( x ) = min E;[V(xAnru + k ( x , , us) d s ] . (1-3) U ) AT 0

The time index Ant, appears since control is terminated at 7,. The initial time in (1-3) is set equal to zero for convenience. Since u depends only on x, the process is homogeneous in time and the convention is inconsequential. If V ( x ) is given, then, by the algorithm of dynamic programming, the control u minimizing the right side of (1-3) is the optimal control.

Page 119: Stochastic Stability and Control Harold Joseph Kushner

104 IV / OPTIMAL STOCHASTIC CONTROL

Continuing formally, let us put (1-3) into a more convenient form. Divide (1-3) by A and let

k ( x s , us) = k ( x , u ( x ) ) . A - 0 ff

0

This yields 0 = min [ 9 " " V ( x ) + k ( x , u)] ,

U

which is a nonlinear partial differential equation for V(x) , and also gives the optimum control u ( x ) in terms of the derivatives of V ( x ) . In other words, if u is the function minimizing (1-5) and w is some other control, then

0 = ~ ' V ( X ) + k ( x , U) 5 Y w V ( ~ ) + k ( x , w ) . (1-6)

V ( x ) is, of course, subject to the boundary condition V ( x ) = b(x ) for x on as. In a sense, for stochastic control, (1-5) replaces the Hamilton- Jacobi equation corresponding to the deterministic optimal control problem, where the It6 equation is replaced by an ordinary differential equation. See, for example, Kalman [l], Dreyfus [l], and Athans and Falb [l].

A slightly different control problem occurs if we add the requirement that the process is to be terminated upon the first time x_t exits from an open set G containing S, provided that this occurs before x_t reaches S. Then, by letting τ_u be the least time to either ∂G or ∂S, whichever is contacted first, and assigning the penalty b(x_{τ_u}) to the stopping position, the problem is unchanged except that the solution of (1-5) is subject to the boundary condition b(x) on ∂S ∪ ∂G.

A seemingly different type of control problem appears when no target set S is specified, and the control is to be terminated at a given finite time T. Now let u take values u(x_t, t), depending on both x_t and t. By letting τ_u equal the minimum of T and the first time of contact with ∂G, and letting x_t = x ∈ G be the initial condition, the cost is written as

\[
C^u(x,t) = E_{x,t}^u\, b(x_{\tau_u}, \tau_u) + E_{x,t}^u \int_t^{\tau_u} k(x_s, u_s, s)\,ds, \tag{1-7}
\]

where E_{x,t}^u is the expectation given x_t = x ∈ G and t ≤ T. The associated dynamic programming equation is (we now allow all functions to depend explicitly on t)

\[
0 = \min_u \Big[\frac{\partial V(x,t)}{\partial t} + \mathscr{L}^u V(x,t) + k(x,u,t)\Big], \tag{1-8}
\]

where V(x, t) is the minimum cost function (the minimum of C^u(x, t) over u).

Actually, from the point of view of formal development, the problem leading to (1-7) and (1-8) is a special case of the preceding problem. It is not necessary to introduce time explicitly, since some component of x_t can be considered to be time; that is, some component satisfies dx_i = dt, so that x_{i,t} = t. In fact, unless otherwise mentioned, we will omit the t argument and assume that some state is time.

The control forms which have been discussed up to now are functions of only the present value x_t at time t. Their values at t have not been allowed to depend on the values of x_s, s < t. Let P be the class of controls whose values depend only on the present state x_t. Let P₁ be the class of controls whose values have an arbitrary dependence upon the present and also upon the past history of the state. Is the class P no worse than the class P₁? In other words, for each control w in P₁, is there a control u(w) in P such that C^{u(w)}(x) ≤ C^w(x)? The question as formulated is still a little vague. Nevertheless, it is intuitively reasonable that, if the uncontrolled process has the Markov property, the class P is "just as good" as the class P₁. Derman [1] gives some results on this problem for the discrete parameter control problem, and Fleming [2] for the continuous parameter diffusion process; see also Theorem 8 and Example 1 of this chapter.


DISCUSSION

A number of difficulties with the development leading to (1-5) and (1-6) are readily apparent. Let Q be the family of controls (functions of x) such that, if u ∈ Q, then (1-2) has a unique solution (in the sense of Itô) which is a well-defined strong Markov process. The class Q is essentially limited to functions satisfying at least a local Lipschitz condition in x. First, there may be no u in Q such that C^u(x) ≤ C^w(x) for every other control w in Q.* Second, even if an optimum control u in Q did exist, there is no guarantee that the cost C^u(x) is in the domain of 𝒜^u and that 𝒜^u = ℒ^u when acting on C^u(x); that is, the limits (1-4) may be meaningless. Third, if a smooth solution to (1-5) did exist, there is no guarantee that the control which minimizes (1-5) will be in Q; in particular, it may not satisfy even a local Lipschitz condition in x. (If u does not satisfy this condition, then the corresponding Itô process has not been defined, although there are diffusion processes whose differential generators have coefficients that are less smooth than that required (at present) for the existence and uniqueness of solutions to the Itô equation.)

The family of comparison controls Q is not clear from the derivation of (1-5) and (1-6). Also, without further analysis, we cannot assume, as is done in (1-4), that P_x^u{Δ < τ_u} → 1 as Δ → 0. (This property holds for all the right continuous processes of concern here.) Lastly, the solution to (1-5), subject to the appropriate boundary conditions, may not be unique.

* Consider a control problem where a target set S is to be attained. Typically, the question of the existence of an optimal control is broken into two questions. The first is a question of attainability and the second of optimality. First: Is there a control with which the set S will be attained with probability one? It is possible to treat this with the methods developed here for stochastic stability. See Chapter V. The second question is: Given that there is one control accomplishing the desired task, is there an optimal control? The latter question has been well treated for the deterministic problem. See, for example, Lee and Marcus [1], Roxin [1], and Filippov [1]. For the stochastic problem, the possibilities are great and the results more fragmentary; see Kushner [7] and Fleming and Nisio [1]. See the survey papers Kushner [2, 10] for more references to the literature on stochastic control.


The purpose of the next section is to resolve these and related questions.

A DETERMINISTIC RESULT

To motivate the theorems of the sequel, we first consider the control of the deterministic system ẋ = f(x, u). Let Q be a class of controls, each member of which is a function whose values depend on time, and impose conditions on Q and f so that ẋ = f(x, u) has a unique continuous solution and ‖f(x, u)‖ is bounded in the time interval [0, ∞), for each u in Q. Let S be a compact target set and define the cost as

C"(X) = b (xJ + k ( x s , us) ds i 0

where τ_u is the time of arrival at ∂S. Let b(x) be continuous and let k(x, u) > 0 outside of S. Now, let V(x) be a nonnegative function with continuous derivatives which tends to infinity as ‖x‖ → ∞. Define the operator H^u = Σᵢ fᵢ(x, u) ∂/∂xᵢ. Let

\[
H^u V(x) = \dot V(x) = -k(x,u).
\]

Then V(x) is a Liapunov function for ẋ = f(x, u), and any trajectory of the system is uniformly bounded in time. Also, with x₀ = x and t ≤ τ_u,

\[
V(x_t) - V(x) = \int_0^t H^u V(x_s)\,ds = -\int_0^t k(x_s, u_s)\,ds. \tag{1-9}
\]

Owing to the boundedness of the solutions and of ‖ẋ‖, and to the properties of V(x) and V̇(x) outside S, we see that, as t increases to some finite or infinite value τ_u, x_t must approach some point x_{τ_u} on ∂S, and V(x_t) must approach b(x_{τ_u}). Hence V(x) = C^u(x).

Now, let w be any other control in Q which transfers x to ∂S, and let H^w V(x) + k(x, w) ≥ H^u V(x) + k(x, u). Then, again, V(x_t) → b(x_{τ_w}), and a simple calculation yields

\[
C^u(x) = \int_0^{\tau_u} k(x_s, u_s)\,ds + b(x_{\tau_u}) \le C^w(x) = \int_0^{\tau_w} k(x_s, w_s)\,ds + b(x_{\tau_w}).
\]

In other words, if 0 = H^u V(x) + k(x, u) ≤ H^w V(x) + k(x, w) for all w in Q, then u is an optimal control relative to the set of controls Q. Also, the minimum cost is a solution to H^u V(x) + k(x, u) = 0 with boundary data b(x), and the optimum u minimizes H^u V(x) + k(x, u). (H^u V(x) + k(x, u) = 0 is an equation of the Hamilton–Jacobi type.)
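The verification idea can be seen in miniature: for a hypothetical scalar system ẋ = u with k(x, u) = x² + u², the candidates V(x) = x² and u(x) = −x satisfy the test exactly, and the following sketch checks the two conditions on a grid (all choices are ours, for illustration only).

```python
import numpy as np

# A minimal sketch of the optimality test 0 = H^u V + k <= H^w V + k,
# for the hypothetical scalar system xdot = u, cost rate k(x, u) = x^2 + u^2,
# candidate V(x) = x^2 and candidate control u(x) = -x (all assumptions,
# chosen so the test can also be checked in closed form).

def HV_plus_k(x, w):
    # H^w V(x) + k(x, w), with V(x) = x^2 so that H^w V = 2*x*w
    return 2.0 * x * w + x**2 + w**2

for x in np.linspace(-2.0, 2.0, 41):
    u = -x                                   # candidate optimal control
    assert abs(HV_plus_k(x, u)) < 1e-12      # equals zero along u
    ws = np.linspace(-3.0, 3.0, 121)         # comparison controls w
    assert (HV_plus_k(x, ws) >= -1e-12).all()
print("candidate u(x) = -x passes the Hamilton-Jacobi test on the grid")
```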

Except for a few results concerning existence and uniqueness for some problems in the control of strong diffusion processes, the approach of the sequel is similar to the deterministic approach just described. Given candidates for the minimum cost and optimum control, we discuss criteria which may be used to test for optimality. The processes, target sets, and conditions will vary from theorem to theorem. The stochastic situation is much more difficult than the deterministic counterpart just described.


THE STOCHASTIC ANALOG

Stochastic analogs of the deterministic result are the burden of part of Section 2. In order to base a stochastic proof upon the deterministic model of the previous subsection, an analog of the integral formula (1-9) is needed. Dynkin's formula

\[
E_x^u V(x_\tau) - V(x) = E_x^u \int_0^{\tau} \mathscr{A}^u V(x_s)\,ds \tag{1-10}
\]

will provide this. Here τ is a Markov time with finite average value, and V(x) is in the domain of 𝒜^u. The use of (1-10) would appear to be hindered by these conditions. Nevertheless, by first applying (1-10) under these conditions, we may subsequently extend its domain of validity in such a way as to provide a useful tool. The extension is accomplished by truncating either V(x) or the process x_t and then taking limits, and is developed in the proofs which follow.
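As a concrete illustration of how (1-10) is used, the following sketch checks Dynkin's formula by simulation in the simplest setting (our choice, for illustration): x_t a standard Wiener process, V(x) = x², so that 𝒜V = 1, and τ the exit time of an interval.

```python
import numpy as np

# A minimal sketch checking Dynkin's formula (1-10) by simulation:
# x_t a standard Wiener process, V(x) = x^2 so that A V = 1, and tau the
# first exit time of (-1, 1). Then (1-10) reads E_x V(x_tau) - V(x) = E_x tau.

rng = np.random.default_rng(1)

def exit_sample(x0, dt=1e-3):
    """One sample of (V(x_tau), tau) for Brownian motion started at x0."""
    x, t = x0, 0.0
    while abs(x) < 1.0:
        x += np.sqrt(dt) * rng.standard_normal()
        t += dt
    return x**2, t

x0 = 0.3
samples = [exit_sample(x0) for _ in range(1000)]
lhs = np.mean([v for v, _ in samples]) - x0**2   # E V(x_tau) - V(x)
rhs = np.mean([t for _, t in samples])           # E of integral of A V = E tau
print(lhs, rhs)   # both should be near 1 - x0^2 = 0.91
```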

2. Theorems

TERMINOLOGY

In Theorem 1, the following conventions and assumptions are used for the target set S. Let x = (x̄, x̂), where x̄_t is continuous with probability one, for t ≤ τ_u if τ_u < ∞, and for t < ∞ otherwise. The component x̂_t is only right continuous on the same time interval. Let x̄ be in the Euclidean space Ē and x̂ in Ê, E = Ē × Ê. The object of the control is to transfer the initial value of x̄_t, denoted by x̄, to S̄ in Ē. These conventions will simplify the proof. They do not seem to be a serious compromise of generality, since discontinuous processes will probably enter the problem as parameters or, at least, not as variables whose values are to be transferred to some set.

For each control under consideration, the corresponding process is assumed to be a right continuous strong Markov process. The previous usage Q_m = {x: V(x) < m}, where V(x) is a given function, will be retained. Note also that 𝒜_m^u refers to the weak infinitesimal operator of the process which corresponds to u and which is stopped on exit from a given set Q_m. The set Q_m (and function V(x)) will be clear from the context. Except where explicitly noted, u will always denote a real- or vector-valued function whose values will depend either only on x, or on x and t, as indicated. We define 𝒜_{m,b}^u as the weak infinitesimal operator of the process (with control u) stopped on exit from Q_m − S = {x: V(x) < m} − S. τ_u is always the first entrance time into the set S, or the time of termination of control if this occurs first.

Theorem 1 gives conditions on a function V(x) and control u so that V(x) = C^u(x). It is assumed that b(x) = 0 and x̄_t → ∂S̄ with probability one as t → τ_u. Theorem 2 gives conditions on a function V(x) so that V(x) = C^u(x), where b(x) may take nonzero values, if x_t → S as t → τ_u with τ_u < ∞ with probability one. Theorem 3 gives a readily checkable condition under which a condition (uniform integrability) required in Theorems 1 and 2 is satisfied. Theorem 4 is an optimality theorem; conditions on two controls u and w are given so that C^u(x) ≤ C^w(x). The analog of Theorem 2 for fixed time of control appears in Theorem 5. Theorems 6 and 7 give special forms (stronger results under stronger conditions) of Theorems 2 and 5, respectively, for the special case of strong diffusion processes. Finally, there are examples and a discrete parameter theorem.

Theorem 1. Assume the target set conventions of the first paragraph of Section 2. Let b(x) = 0. Let V(x) ≥ 0 be a continuous function defined on E which takes the value zero on S. For each m > 0, let V(x) be in the domain of 𝒜_{m,b}^u and

\[
\mathscr{A}_{m,b}^u V(x) = -k(x,u) \le 0.
\]

Suppose* that x̄_t → ∂S̄ with probability one as t → τ_u. There is a nondecreasing sequence of Markov times τᵢ satisfying E_x^u τᵢ < ∞, τᵢ → τ_u with probability one, and τᵢ ≤ inf{t: V(x_t) ≥ i}. Let x be in E − S. Assume that either (2-1) holds, or that (2-2) together† with either (2-3) or (2-4) holds:

\[
E_x^u V(x_{\tau_i}) \to 0 \quad \text{as } i \to \infty; \tag{2-1}
\]

the V(x_{τᵢ}) are uniformly integrable‡; (2-2)

V(x) is uniformly continuous on ∂S; (2-3)

\[
P_x^u\{\sup_{\tau_u > t \ge 0} \|x_t\| \ge N\} \to 0 \quad \text{as } N \to \infty. \tag{2-4}
\]

* This is a question in stochastic stability, and can be verified immediately if k(x, u) has the appropriate form; see, for example, Lemma 2 or Corollary 3-1 of Chapter II.

† If V(x) → ∞ as ‖x‖ → ∞, then the nonpositivity of 𝒜_{m,b}^u V(x) implies (2-4). See Chapter II.

‡ A sequence fᵢ is uniformly integrable if, for every ε > 0, there is an N < ∞, independent of i, such that E_x^u |fᵢ| χ_{{|fᵢ| ≥ N}} ≤ ε, where χ_A is the characteristic function of the set A. If fᵢ → f with probability one, then uniform integrability implies E_x^u fᵢ → E_x^u f.


Then

\[
V(x) = C^u(x) = E_x^u \int_0^{\tau_u} k(x_s, u_s)\,ds. \tag{2-5}
\]

Proof. Define τᵢ = min(i, τ_u, inf{t: V(x_t) ≥ i}). Since P_x{sup_{τ_u > t ≥ 0} V(x_t) ≥ i} ≤ V(x)/i implies that inf{t: V(x_t) ≥ i} → ∞ as i → ∞ (with probability one), we have τ_u ∧ inf{t: V(x_t) ≥ i} → τ_u (with probability one) as i → ∞. Thus {τᵢ} satisfies the hypothesis. First, we prove that (2-2) and (2-4) imply (2-1). If the sample function x_t, t < τ_u, is uniformly bounded, then, by the continuity of V(x), the sample values V(x_{τᵢ}) → 0 as τᵢ → τ_u. Equation (2-4) implies that the sample functions x_t are uniformly bounded with a probability arbitrarily close to one. Thus V(x_{τᵢ}) → 0 with probability one. (This is true even though the x_t may not converge to a unique limit on ∂S. By hypothesis, x_t → ∂S = ∂S̄ × Ê with probability one. Thus, for each ε > 0, there is a finite-valued random variable φ_ε such that the distance between x_t and ∂S is less than ε for t > φ_ε, with probability one. V(x_t) → 0 as t → τ_u, since V(x) is uniformly continuous on the range of x_t, t < τ_u, with a probability arbitrarily close to one.) Finally, this fact together with (2-2) implies (2-1).

Similarly, it is proved that (2-2) and (2-3) imply (2-1). (Note that the fact that V(x_{τᵢ}) → 0 with probability one follows immediately from the uniform continuity of V(x) on ∂S together with the fact that x_t → ∂S with probability one as t → τ_u.)

By Dynkin's formula,

\[
V(x) = E_x^u V(x_{\tau_i}) - E_x^u \int_0^{\tau_i} \mathscr{A}_{m,b}^u V(x_s)\,ds = E_x^u V(x_{\tau_i}) + E_x^u \int_0^{\tau_i} k(x_s, u_s)\,ds.
\]

Since k(x, u) ≥ 0, the integral tends to E_x^u ∫_0^{τ_u} k(x_s, u_s) ds = C^u(x). Since the first term on the right tends to zero as i → ∞, the proof is complete.

Theorem 2. Let the nonnegative function V(x) be defined and continuous on E and take the nonnegative value b(x) on the target set S. Assume that V(x) is in the domain of 𝒜_{m,b}^u for each m > 0, and that

\[
\mathscr{A}_{m,b}^u V(x) = -k(x,u) \le 0
\]

in E − S + ∂S, and that the control u transfers the initial condition to S at τ_u, where τ_u < ∞ with probability one. There is a nondecreasing sequence of Markov times τᵢ such that τᵢ → τ_u with probability one, E_x^u τᵢ < ∞, and τᵢ ≤ inf{t: V(x_t) ≥ i}. For some such sequence, let either (2-1a) or (2-2a) hold:

E_x^u V(x_{τᵢ}) → E_x^u b(x_{τ_u}) as i → ∞; (2-1a)

the V(x_{τᵢ}) are uniformly integrable. (2-2a)

Then

\[
V(x) = C^u(x) = E_x^u\, b(x_{\tau_u}) + E_x^u \int_0^{\tau_u} k(x_s, u_s)\,ds. \tag{2-5a}
\]

Remark. The primary difference between the problems stated in Theorems 1 and 2 is that here the limit lim_{t→τ_u} x_t = x_{τ_u} must exist with probability one. Otherwise, the "terminal cost" term E_x^u b(x_{τ_u}) is meaningless. This is the reason for the requirement that τ_u be essentially finite valued.

Proof. Since 𝒜_{m,b}^u V(x) ≤ 0 in E − S + ∂S, we have, for x ∈ E − S,

\[
P_x^u\{\sup_{\tau_u > t \ge 0} V(x_t) \ge m\} \le \frac{V(x)}{m},
\]

which goes to zero as m → ∞. Define τᵢ = min(i, τ_u, inf{t: V(x_t) ≥ i}). As in Theorem 1, {τᵢ} satisfies the hypothesis. Also, since τ_u < ∞ with probability one, there is an i(ω), finite valued with probability one, such that x_{τᵢ} = x_{τ_u} for all i > i(ω). Hence lim_{i→∞} V(x_{τᵢ}) = V(x_{τ_u}) with probability one, whether or not the process V(x_t) is continuous. Now (2-2a) implies (2-1a). Applying Dynkin's formula to τᵢ and V(x), we have

\[
V(x) = E_x^u V(x_{\tau_i}) + E_x^u \int_0^{\tau_i} k(x_s, u_s)\,ds.
\]

Using (2-1a) and taking limits, we obtain (2-5a).

Remark on the significance of (2-2) or (2-2a). Whether or not any of (2-1)–(2-4), (2-1a), or (2-2a) holds, the nonnegativity of k(x, u) and the monotone convergence theorem imply that, for any sequence of Markov times τᵢ → τ_u,

\[
V(x) = E_x^u \int_0^{\tau_u} k(x_s, u_s)\,ds + \lim_{i\to\infty} E_x^u V(x_{\tau_i}).
\]

By Fatou's lemma (if b(x) is identically zero, set b(x_{τ_u}) = V(x_{τ_u}) = 0 in what follows),

\[
\lim_{i\to\infty} E_x^u V(x_{\tau_i}) \ge E_x^u V(x_{\tau_u}) = E_x^u\, b(x_{\tau_u}).
\]

Hence

\[
V(x) \ge C^u(x) \ge 0,
\]

and, for x on S,

\[
V(x) = b(x) = C^u(x).
\]

Also, since V(x) is in the domain of each 𝒜_{m,b}^u, the function defined by

\[
C^u(x) = E_x^u \int_0^{\tau_u} k(x_s, u_s)\,ds + E_x^u\, b(x_{\tau_u})
\]

may be assumed* to be in the domain of each 𝒜_{m,b}^u with 𝒜_{m,b}^u C^u(x) = −k(x, u). Then the difference q(x) = V(x) − C^u(x) satisfies 𝒜_{m,b}^u q(x) = 0 and q(x) = 0 on S, so that q(x_{t∧τ_u}) is a martingale.


* It is not always true that integrals of the form E_x^u ∫_0^{τ_u} g(x_s) ds = G(x) are in the domain of 𝒜_{m,b}^u for any m > 0. If this is so and g(x) is continuous, then 𝒜_{m,b}^u G(x) = −g(x). The assumption is made for the sake of argument, but it will not be pursued here. See Dynkin [2], Chapter 5, for a discussion of such questions.


If the only function q(x) of the process x_{t∧τ_u} which is a martingale and which satisfies the above boundary and expectation conditions is the trivial function q(x) = 0, then condition (2-2) is satisfied and (2-5) holds.

To see that there can be such functions q(x) which are not identically zero, consider the scalar uncontrolled Itô process whose stopped subprocesses have the weak infinitesimal operators 𝒜_{m,b}, and with the origin as the target set S:

\[
dx = -x\,dt + \sqrt{2/3}\,x\,dz.
\]

Suppose that

\[
k(x,u) = k(x) = \frac{4x^2}{3}, \qquad b(0) = 0.
\]

Let V(x) = x². Then V(x) is in the domain of 𝒜_{m,b}, and 𝒜_{m,b} = ℒ⁰ on V(x) in the set Q_m, for each m, and

\[
\mathscr{L}^0 V(x) = -\frac{4x^2}{3} = -k(x).
\]


Now let V₁(x) = x² + x⁴; V₁(x) is also in the domain of 𝒜_{m,b} for each m, and V₁(0) = 0 (note that Q_m is now relative to V₁(x)), and 𝒜_{m,b} = ℒ⁰ on V₁(x):

\[
\mathscr{L}^0 V_1(x) = -\frac{4x^2}{3} = -k(x) = \mathscr{L}^0 V(x).
\]

In this case it can be shown that τ₀ = ∞ with probability one (see the proof of Khas'minskii [3]); there are analogous examples, however, when τ₀ is finite with probability one.

The process x_t may be essentially written as

\[
x_t = x_0 \exp\big(\sqrt{2/3}\,z_t - \tfrac{4}{3}\,t\big).
\]

A direct computation yields

\[
E\big[x_{t+s}^4 \mid x_s\big] = x_s^4
\]

with probability one, verifying that the difference V₁(x_t) − V(x_t) = x_t⁴ is a martingale. Hence, the stopped process x⁴_{t∧τ₀} is also a martingale. Write Vᵢ = V(x_{τᵢ}). By a supermartingale convergence theorem of Doob [1], the supermartingale sequence Vᵢ converges to a finite valued random variable v with probability one if E|Vᵢ| ≤ M < ∞. Also, if the sequence Vᵢ is uniformly integrable, then E_x Vᵢ → E_x v. Thus, uniform integrability (condition (2-2)) together with the convergence of the Vᵢ to the appropriate boundary values is equivalent to (2-1) or (2-1a). Theorem 3 gives a general and usable criterion for the uniform integrability of the sequence V(x_{τᵢ}); hence, under the conditions of Theorem 3, if V(x_{τᵢ}) converges to the desired boundary value b(x_{τ_u}) with probability one, then E_x^u V(x_{τᵢ}) → E_x^u b(x_{τ_u}).
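The martingale property of x_t⁴ can also be seen numerically. The following sketch simulates the equation above by the Euler–Maruyama method (the method and parameters are our choices, not from the text) and checks that the sample average of x_t⁴ stays near x₀⁴.

```python
import numpy as np

# A numerical check (our illustration) that x_t^4 is a martingale for
#   dx = -x dt + sqrt(2/3) x dz,
# i.e. that the sample mean of x_t^4 stays near x_0^4, since L x^4 = 0.

rng = np.random.default_rng(2)
dt, n_steps, n_paths = 1e-3, 300, 20000
x = np.full(n_paths, 1.0)                 # x_0 = 1, so E x_t^4 should stay ~1
for _ in range(n_steps):
    dz = np.sqrt(dt) * rng.standard_normal(n_paths)
    x += -x * dt + np.sqrt(2.0 / 3.0) * x * dz

print("E x_t^4 ~", np.mean(x**4), "(exact martingale value: 1.0)")
```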

Theorem 3. Let the function V(x) and the sequence τᵢ satisfy the conditions of Theorem 1 or 2. Let there exist a nonnegative function Ṽ(x) in the domain of 𝒜_{m,b}^u with 𝒜_{m,b}^u Ṽ(x) ≤ 0 for each m. Let

\[
\inf_{\{x:\,V(x)\ge\lambda\}} \frac{\tilde V(x)}{V(x)} = g(\lambda) \to \infty
\]

as λ → ∞. Then the V(x_{τᵢ}) are uniformly integrable. In particular,

\[
E_x^u V(x_{\tau_i}) \to E_x^u V(x_{\tau_u}).
\]

Proof. Define the random sequences Vᵢ = V(x_{τᵢ}) and F̃ᵢ = Ṽ(x_{τᵢ}). For each ω in the set B_{mi} = {ω: Vᵢ ≥ m}, we have Vᵢ/F̃ᵢ ≤ 1/g(m). Thus,

\[
E_x^u V_i\,\chi_{B_{mi}} \le \frac{1}{g(m)}\, E_x^u \tilde F_i\,\chi_{B_{mi}} \le \frac{1}{g(m)}\, E_x^u \tilde F_i. \tag{2-7}
\]

Since 𝒜_{m,b}^u Ṽ(x) ≤ 0, the process Ṽ(x_{t∧τ_u}) is a nonnegative supermartingale. The discrete parameter process obtained by sampling a supermartingale at a sequence of Markov times is a discrete parameter supermartingale (Doob [1], Chapter VIII, Theorems 11.6 and 11.8). Thus the process F̃ᵢ is a discrete parameter nonnegative supermartingale. By the supermartingale property,

\[
E_x^u \tilde F_i \le \tilde F_0 = \tilde V(x).
\]

Substituting this into the right side of (2-7), one obtains a bound which is independent of i and tends to zero as m → ∞. Hence the sequence Vᵢ is uniformly integrable. Q.E.D.

Remark on the selection of Ṽ(x) for Itô processes. In Example 3, Ṽ(x) will be taken in the special form Ṽ(x) = F(V(x)), where F(λ)/λ → ∞ as λ → ∞. Let F(λ) have continuous second derivatives and let V(x) be in the domain of 𝒜_{m,b}^u for each m > 0. Then F(V(x)) is in the domain of 𝒜_{m,b}^u for each m > 0, and

\[
\mathscr{L}^u F(V(x)) = -F'(V(x))\,k(x,u) + \tfrac{1}{2}\,F''(V(x))\,\big[V_x'(x)\,S(x)\,V_x(x)\big], \tag{2-8}
\]

where V_x(x) is the gradient of V(x) and

\[
k(x,u) = -\mathscr{L}^u V(x).
\]

Let m(s) be any continuous function which satisfies

\[
m(V(x)) \le \frac{2\,k(x,u)}{V_x'(x)\,S(x)\,V_x(x)} \tag{2-9}
\]

in E − S. If F(V(x)) is to satisfy the conditions of Theorem 2, then (2-8) must be nonpositive in E − S or, equivalently,

\[
\frac{F''(V(x))}{F'(V(x))} \le \frac{2\,k(x,u)}{V_x'(x)\,S(x)\,V_x(x)}
\]

in E − S. The function F(λ) defined by

\[
F(\lambda) = \int_0^{\lambda} \exp\Big[\int_0^{s} m(y)\,dy\Big]\,ds \tag{2-10}
\]

satisfies ℒ^u F(V(x)) ≤ 0 in E − S. If the condition

\[
\exp\Big[\int_0^{s} m(y)\,dy\Big] \to \infty \quad \text{as } s \to \infty \tag{2-11}
\]

is also satisfied, then F(λ)/λ → ∞ as λ → ∞, and since F(λ) has continuous second derivatives, it will satisfy the hypotheses of Theorem 3. If m(s) is the trivial function m(s) = 0, then F(λ)/λ = 1.

Thus, if the bound (2-9) satisfies (2-11), the V(x_{τᵢ}) are uniformly integrable.

If the process x_t is confined to a set Q_r ⊃ S with probability one, then the Vᵢ are uniformly integrable, since they are uniformly bounded. This boundedness is guaranteed if V_x'(x) S(x) V_x(x) → 0 uniformly as x → ∂Q_r from the interior and k(x, u) ≥ ε > 0 in some neighborhood of ∂Q_r (relative to Q_r) for some ε > 0. To show this we need only observe that the given conditions imply the existence of a function m(λ) which is continuous in Q_r − S, satisfies (2-9), and tends to infinity as λ ↑ r. Thus F(λ)/λ (defined by (2-10)) tends to infinity as λ → r and, finally, this implies that

\[
P_x\{\sup_{\tau_u > t \ge 0} V(x_t) \ge r\} = 0.
\]

A PARTICULAR FORM FOR F(V(x))

For Itô processes, it is convenient to use the particular form

\[
F(\lambda) = \lambda\,\log(A + \lambda) \tag{2-12}
\]

for a suitably large real number A. Then

\[
\mathscr{L}^u F(V(x)) = -\Big[\log\big(A + V(x)\big) + \frac{V(x)}{A + V(x)}\Big]\,k(x,u)
+ \Big[\frac{1}{A + V(x)} - \frac{V(x)}{2\big(A + V(x)\big)^2}\Big]\big[V_x'(x)\,S(x)\,V_x(x)\big]. \tag{2-13}
\]
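The derivative algebra behind (2-13) is easy to confirm symbolically; the following sketch (assuming the sympy library is available) checks the two coefficients.

```python
import sympy as sp

# A symbolic check of the calculus behind (2-12)-(2-13): for
# F(lam) = lam*log(A + lam), the generator expansion uses F' (the
# coefficient of -k(x,u)) and F''/2 (the coefficient of V_x' S V_x).
lam, A = sp.symbols('lam A', positive=True)
F = lam * sp.log(A + lam)

Fp = sp.diff(F, lam)
Fpp = sp.diff(F, lam, 2)

# F' should equal log(A + lam) + lam/(A + lam)
assert sp.simplify(Fp - (sp.log(A + lam) + lam / (A + lam))) == 0
# F''/2 should equal 1/(A + lam) - lam/(2*(A + lam)**2)
assert sp.simplify(Fpp / 2 - (1 / (A + lam) - lam / (2 * (A + lam)**2))) == 0
print("coefficients appearing in (2-13) check out")
```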

AN OPTIMALITY THEOREM

The cost corresponding to control u (of the statement of Theorem 2) may be compared to the costs corresponding to other controls by the following considerations. Let w be a control which takes the initial condition x to S with probability one, for each x ∈ E − S. Let V(x) satisfy the conditions of Theorem 2, and suppose that V(x) is also in the domain of 𝒜_{m,b}^w for each m > 0 and that

\[
\mathscr{A}_{m,b}^w V(x) \ge -k(x,w). \tag{2-14}
\]

Suppose that τ_w < ∞ with probability one, and that, with control w, there is a sequence τᵢ of Markov times satisfying the conditions of Theorem 2 (but converging to τ_w). Then, by Dynkin's formula, for each i,

\[
V(x) \le E_x^w V(x_{\tau_i}) + E_x^w \int_0^{\tau_i} k(x_s, w_s)\,ds.
\]


By taking limits, we obtain

\[
V(x) \le C^w(x) + \delta^w(x),
\]

where

\[
C^w(x) = E_x^w\, b(x_{\tau_w}) + E_x^w \int_0^{\tau_w} k(x_s, w_s)\,ds,
\qquad
\delta^w(x) = \lim_{i\to\infty} E_x^w V(x_{\tau_i}) - E_x^w V(x_{\tau_w}) \ge 0.
\]

In order to compare the controls u and w, we need to evaluate δ^w(x). If δ^w(x) = 0, then C^u(x) = V(x) ≤ C^w(x). In general, u is optimal (in the sense of minimizing the cost) with respect to at least all controls w for which 𝒜_{m,b}^w V(x) ≥ −k(x, w) and δ^w(x) = 0.

In the deterministic problem, if V(x) is continuous and if a control w transfers the initial point x to a finite point on ∂S, then V(x_t) → V(x_{τ_w}) = b(x_{τ_w}). In the stochastic problem, if w transfers the initial point x to S in finite time, and V(x) is continuous and E − S is bounded, then V(x_t) → V(x_{τ_w}) = b(x_{τ_w}) with probability one, and E_x^w V(x_{τᵢ}) → E_x^w b(x_{τ_w}). Some of the most common models taken for continuous time control processes do not have a bounded state space. Then, although we may have x_t → S and V(x_t) → V(x_{τ_w}), both with probability one as t → τ_w < ∞, we may not have E_x^w V(x_{τᵢ}) → E_x^w V(x_{τ_w}), as the example following Theorem 2 showed. Thus, in order to compare C^u(x) and C^w(x) in this case, some further constraints on the effects of the comparison control w are required. Theorem 4 sums up this discussion.

Theorem 4. Let both the controls u and w transfer the initial condition x to the set S, with τ_u < ∞ and τ_w < ∞ with probability one. Let the nonnegative continuous function V(x) take the value b(x) on S, and be in the domain of both 𝒜_{m,b}^u and 𝒜_{m,b}^w for each m > 0. Suppose that

\[
\mathscr{A}_{m,b}^u V(x) = -k(x,u), \tag{2-14a}
\]

\[
\mathscr{A}_{m,b}^w V(x) = -k_w(x,w) \ge -k(x,w). \tag{2-14b}
\]


Let either k_w ≥ 0 or

\[
P_x^w\{\sup_{\tau_w > t \ge 0} V(x_t) \ge m\} \to 0 \quad \text{as } m \to \infty.
\]

For the process corresponding to each control, there exists a sequence of Markov times satisfying the condition of Theorem 2. For any such pair of sequences, let either (v = u or w)

\[
E_x^v V(x_{\tau_i}) \to E_x^v V(x_{\tau_v}) \tag{2-15}
\]

or let the V(x_{τᵢ}) be uniformly integrable. Then

\[
C^u(x) \le C^w(x).
\]

Remark. Theorem 4 is a precise statement of the dynamic programming result of Section 1 for a more general process. In Section 1, 𝒜^u = ℒ^u was used. Conditions on k(x, u) which imply that x_t → S with probability one are discussed as stability results in Chapter II.

Proof. The conditions imply that the appropriate sequences {τᵢ} exist. The rest of the proof follows from Theorem 2 and the discussion preceding the statement of Theorem 4.

Corollary 4-1 follows immediately from Theorem 4, and applies to the case where C^u(x) = P_x^u{sup_{τ_u > t ≥ 0} V(x_t) ≥ λ} and the target set S is contained strictly interior to Q_λ; that is, the cost is the probability that x_t will leave Q_λ at least once before absorption on S at τ_u.

Corollary 4-1. Let k(x, u) = 0 with V(x) = 0 on S (and V(x) = λ on ∂Q_λ), where S is strictly interior to Q_λ. Let V(x) and the controls u and w satisfy the other conditions of Theorem 4. Then

\[
P_x^u\{\sup_{\tau_u > t \ge 0} V(x_t) \ge \lambda\} \le P_x^w\{\sup_{\tau_w > t \ge 0} V(x_t) \ge \lambda\}.
\]

For future reference in Example 3 we write:

Corollary 4-2. Let u and w be controls and let the corresponding processes be Itô processes, for t < τ_u and t < τ_w, respectively. Let (2-14) and the conditions preceding it in Theorem 4 be satisfied. Suppose that A > 0 and

\[
\mathscr{L}^u\big[V(x)\log(A + V(x))\big] \le 0,
\qquad
\mathscr{L}^w\big[V(x)\log(A + V(x))\big] \le 0.
\]

Then

\[
C^u(x) \le C^w(x).
\]

Proof. The proof follows from Theorems 1 to 4.

Corollary 4-3. Let the set E − S + ∂S be bounded. Let the continuous nonnegative function V(x) satisfy the boundary condition b(x) on S, and let V(x) be in the domains of both 𝒜_{m,b}^u and 𝒜_{m,b}^w. Let the controls u and w transfer the initial condition x to S with probability one as t → τ_u or τ_w, respectively. Then, if

\[
\mathscr{A}_{m,b}^u V(x) = -k(x,u) \le 0,
\qquad
\mathscr{A}_{m,b}^w V(x) \ge -k(x,w),
\]

we have C^u(x) ≤ C^w(x).

Proof. V(x) is continuous and bounded in the bounded set E − S + ∂S. Thus, for both controls u and w, the corresponding processes V(x_t) converge to the appropriate boundary values (V(x_{τ_u}) or V(x_{τ_w})). Also, for the associated sequences {τᵢ}, we have both E_x^u V(x_{τᵢ}) → E_x^u V(x_{τ_u}) and E_x^w V(x_{τᵢ}) → E_x^w V(x_{τ_w}). The rest of the proof follows from Theorem 1 or 2. Note that, if b(x) ≡ 0, the arrival times τ_u and τ_w may take infinite values.

CONTROL OVER A FIXED TIME INTERVAL

If the control is to be exercised over a fixed time interval (t, T] only, the result is essentially a special case of Theorem 2. We now introduce the time variable t explicitly, and list some conventions which provide for the immediate application of Theorem 2. The pair (x_t, t) is considered as a right continuous Markov process, for t ≤ T, and the conventions of Chapter I, Section 3, are used to define the weak infinitesimal operator 𝒜_m^u acting on the function V(x, t) in the set Q_m = {x, t: V(x, t) < m}. Define the target set S_T = S × [0, T] + {T} × E. We allow all components of x_t to be only right continuous. Define τ_u = T ∧ inf{t: x_t ∈ S}. The terminal cost is the nonnegative continuous function b(x, t) defined on the target set S_T. Let the initial value x_t = x ∈ E − S + ∂S be given at time t < T. Then we define the cost corresponding to control u and the interval (t, T] by

\[
C^u(x,t) = E_{x,t}^u \int_t^{\tau_u} k(x_s, u_s, s)\,ds + E_{x,t}^u\, b(x_{\tau_u}, \tau_u),
\]

where k(x, u, s) ≥ 0. In Theorem 5, we use 𝒜_{m,b}^u as the weak infinitesimal operator of the nonhomogeneous process x_t (with control u) stopped at τ_u, acting on functions which may depend on time.

Theorem 5. Let the killing times for the processes corresponding to controls u and w be no less than T with probability one. Let the nonnegative function V(x, t) be continuous in all its arguments and suppose that it satisfies the boundary condition V(x, t) = b(x, t) ≥ 0 on S_T. Assume that, in (E − S) × [0, T],

\[
\mathscr{A}_{m,b}^u V(x,t) = -k(x,u,t) \le 0 \tag{2-16a}
\]

for each m, and

\[
\mathscr{A}_{m,b}^w V(x,t) = -k_w(x,w,t) \ge -k(x,w,t), \tag{2-16b}
\]

where either k_w(x, w, t) ≥ 0 or

\[
P_{x,t}^w\{\sup_{\tau_w > s \ge t} V(x_s, s) \ge m\} \to 0 \quad \text{as } m \to \infty. \tag{2-17}
\]

To each control u and w, there is a corresponding sequence of nondecreasing Markov times with the properties: τᵢ → τ_v with probability one; E_{x,t}^v τᵢ < ∞ (where v is either u or w); τᵢ ≤ inf{s: V(x_s, s) ≥ i};


for each ω, there is an integer i(ω) < ∞ so that τᵢ = τ_v for all i > i(ω). Finally, suppose that either (2-18) or (2-19) holds:

\[
E_{x,t}^v V(x_{\tau_i}, \tau_i) \to E_{x,t}^v V(x_{\tau_v}, \tau_v); \tag{2-18}
\]

the V(x_{τᵢ}, τᵢ) are uniformly integrable for both u and w. (2-19)

Then

\[
V(x,t) = C^u(x,t) \le C^w(x,t). \tag{2-20}
\]

Proof. Let the control be u. By (2-16a), the process V(x_s, s), stopped at τ_u, is bounded with probability one; that is,

\[
P_{x,t}^u\{\sup_{\tau_u > s \ge t} V(x_s, s) \ge m\} \le \frac{V(x,t)}{m},
\]

which goes to zero as m → ∞. Thus there is a sequence {τᵢ} satisfying the conditions of the hypothesis (for example, τᵢ = min(i, τ_u, inf{s: V(x_s, s) ≥ i})). Let x be in E − S and t < T. Dynkin's formula may be applied to τᵢ and V(x, t) and yields

\[
V(x,t) = E_{x,t}^u V(x_{\tau_i}, \tau_i) + E_{x,t}^u \int_t^{\tau_i} k(x_s, u_s, s)\,ds.
\]

The last term on the right converges to

\[
E_{x,t}^u \int_t^{\tau_u} k(x_s, u_s, s)\,ds
\]

as i → ∞. By (2-18), V(x, t) = C^u(x, t). In any case, since x_{τᵢ} = x_{τ_u} for some i < ∞ with probability one (since τᵢ = τ_u for some i < ∞ with probability one), we have lim_{i→∞} V(x_{τᵢ}, τᵢ) = V(x_{τ_u}, τ_u) with probability one. Now (2-19) implies E_{x,t}^u V(x_{τᵢ}, τᵢ) → E_{x,t}^u V(x_{τ_u}, τ_u), and the left side of (2-20) follows.


If the control w satisfies either k_w(x, w, s) ≥ 0 in (E − S) × [0, T] or condition (2-17), then we also infer that V(x_{τᵢ}, τᵢ) → V(x_{τ_w}, τ_w) with probability one. Then (2-19) implies (2-18). Combining (2-18) with Dynkin's formula, we obtain

\[
V(x,t) \le \lim_{i\to\infty} E_{x,t}^w V(x_{\tau_i}, \tau_i) + \lim_{i\to\infty} E_{x,t}^w \int_t^{\tau_i} k(x_s, w_s, s)\,ds = C^w(x,t),
\]

and (2-20) follows.

STRONG DIFFUSION PROCESS

A stronger result is available when x_t is a strong diffusion process and E − S is bounded. In Theorem 6 the control depends on x only, and time is not one of the components of x.

Theorem 6. Let Q be a bounded open region of class* H_{2+α}, and let the process x_t, corresponding to controls v = u or w, be a strong diffusion process in Q + ∂Q, with differential generator

\[
\mathscr{L}^v = \sum_i f_i(x,v)\,\frac{\partial}{\partial x_i} + \frac{1}{2}\sum_{i,j} S_{ij}(x,v)\,\frac{\partial^2}{\partial x_i\,\partial x_j},
\]

where the control v(x) and the functions fᵢ(x, v) and Sᵢⱼ(x, v) satisfy a uniform Hölder condition in Q + ∂Q. Then each v transfers the initial condition x ∈ Q to ∂Q in time τ_v with E_x τ_v < ∞.

Let the function V(x) have continuous second derivatives at each x in Q, let V(x) = b(x) on ∂Q, where b(x) is continuous, and suppose that

\[
\mathscr{L}^u V(x) = -k(x,u) \le 0,
\qquad
\mathscr{L}^w V(x) \ge -k(x,w)
\]

* See Chapter I, Section 6, for the definition.


at each x in Q, where k(x, u) satisfies a uniform Hölder condition in its arguments in Q + ∂Q. Then

\[
V(x) = C^u(x) \le C^w(x).
\]

For any such k(x, u), the equation ℒ^u A(x) = −k(x, u) has a unique solution satisfying A(x) = b(x) on ∂Q, and A(x) has Hölder continuous second derivatives in compact subsets of Q.

Proof. The proof is almost an immediate consequence of the remarks in Chapter I, Section 6. Under the hypothesis on ℒ^u, it is the differential generator of a strong diffusion process and, hence, E_x τ_u < ∞. Since, in addition, the process is continuous with probability one, and Q is bounded, the cost C^u(x) is well defined (either by the arguments of Theorem 2 or Section 6 of Chapter I). Also, there is a unique solution to ℒ^u A(x) = −k(x, u) with A(x) = b(x) on ∂Q. By Section 6 of Chapter I, this solution must be C^u(x); also, by the uniqueness and the hypothesis on V(x), V(x) = C^u(x).

The same reasoning yields a unique solution to ℒ^w W(x) = −k(x, w), with boundary condition W(x) = b(x) on ∂Q. Also, W(x) = C^w(x). By the hypothesis,

\[
\mathscr{L}^w U(x) \le 0,
\]

where we define

\[
U(x) = C^w(x) - C^u(x)
\]

in Q and U(x) = 0 on ∂Q. Finally, the strong maximum principle yields C^w(x) ≥ C^u(x).

Theorem 7 is the analog of Theorem 5 for strong diffusion processes.

Theorem 7. Let Q be a bounded open set of class H_{2+α}, and x_t a strong diffusion process in Q, for 0 ≤ t ≤ T, with differential generator

\[
\mathscr{L}^u = \sum_i f_i(x,u,t)\,\frac{\partial}{\partial x_i} + \frac{1}{2}\sum_{i,j} S_{ij}(x,u,t)\,\frac{\partial^2}{\partial x_i\,\partial x_j} + \frac{\partial}{\partial t}.
\]


Let each considered control u be of the form u(x, t). For any vector ξ, let

\[
\sum_{i,j} S_{ij}(x,u,t)\,\xi_i\,\xi_j \ge \alpha\,\|\xi\|^2 \qquad (\alpha > 0)
\]

in D + ∂D, where D = Q × [0, T]. In D + ∂D, let fᵢ(x, u, t), u(x, t), Sᵢⱼ(x, u, t), and k(x, u, t) be continuous and satisfy uniform Hölder conditions in x and u. Suppose that the function V(x, t) has continuous second partial derivatives in the components of x and a continuous derivative with respect to t, that V(x, t) equals the continuous function b(x, t) on Q × {T} + ∂Q × [0, T], and that

\[
\mathscr{L}^u V(x,t) = -k(x,u,t),
\qquad
\mathscr{L}^w V(x,t) \ge -k(x,w,t),
\]

where both u and w satisfy the conditions on u. Then

\[
C^w(x,t) \ge C^u(x,t).
\]

Proof. Except that the corresponding theorems for parabolic equations are required, the proof is essentially the same as that of Theorem 6, and will not be given.

GENERAL NONANTICIPATIVE COMPARISON CONTROLS

The control u of Theorem 4 or 5, whose values at t depend on either x_t or x_t and t, is optimal with respect to a class of comparison controls whose values at t depend only on either x_t or x_t and t. When the controlled processes are the solutions of stochastic differential equations of the Itô type, it is possible to widen the class of comparison controls to include a family of nonanticipative functions.* We suppose that

\[
dy = f(y, w, t)\,dt + \sigma(y, w, t)\,dz,
\]

* The possibility of the extension was pointed out to the author by W. H. Fleming. See also Fleming [2].


where the control w takes values in a set U and is nonanticipative, and also that the possible w are further restricted so that the solution process y_t is defined in the time interval of interest.

Let ℬ_t be the least σ-algebra over which z_s, s ≤ t, is measurable. Then w_t is measurable over ℬ_t. Let y_t be the process corresponding to control w_t, and x_t to u_t (whose values depend on x_t and t only). y_t may not be a Markov process. The cost corresponding to control w in the interval [t, τ_w] is the random variable

\[
C^w(\mathscr{B}_t, t) = E^{\mathscr{B}_t}\, b(y_{\tau_w}, \tau_w) + E^{\mathscr{B}_t} \int_t^{\tau_w} k(y_s, w_s, s)\,ds,
\]

where τ_w is a nonanticipative random variable which is the terminal time of control (as defined in Theorem 4 or 5). The cost is a random variable simply because the value of the control at s ≥ t may depend on values of y_s for s < t.

Theorem 8. Suppose that the Itô process corresponding to the continuous and locally Lipschitz control u(x) or u(x, t) satisfies the conditions of Theorem 4 or 5 (depending on whether or not the terminal time T is fixed), and that ℒ^u V(x, t) = −k(x, u, t) ≤ 0 in E − S and V(x, t) = C^u(x, t). For any other value v in U and any x in E − S, let ℒ^v V(x, t) ≥ −k(x, v, t). Let there be a sequence of nonanticipative random variables {τᵢ} tending to τ_w with probability one and satisfying

\[
E^{\mathscr{B}_t} V(y_{\tau_i}, \tau_i) \to E^{\mathscr{B}_t} V(y_{\tau_w}, \tau_w) \tag{2-21}
\]

with probability one and

\[
E^{\mathscr{B}_t} \int_t^{\tau_i} V_x'(y_\rho, \rho)\,\sigma(y_\rho, w_\rho, \rho)\,dz_\rho = 0. \tag{2-22}
\]

Let V(y_t, t) < ∞ with probability one. Then

\[
V(x,t) = C^u(x,t) \le C^w(\mathscr{B}_t, t)
\]

with probability one.


Proof. The inequalities and equalities are understood to hold only with probability one. Itô's lemma (Chapter I, Section 4) is applicable to V(y, t) and τᵢ, and

\[
V(y_{\tau_i}, \tau_i) - V(y_t, t) = \int_t^{\tau_i} \mathscr{L}^{w_\rho} V(y_\rho, \rho)\,d\rho + \int_t^{\tau_i} V_x'(y_\rho, \rho)\,\sigma(y_\rho, w_\rho, \rho)\,dz_\rho. \tag{2-23}
\]

Since ℒ^v V(x, t) ≥ −k(x, v, t) for all v ∈ U and x in E − S, (2-23) and (2-22) yield

\[
E^{\mathscr{B}_t} V(y_t, t) \le E^{\mathscr{B}_t} V(y_{\tau_i}, \tau_i) + E^{\mathscr{B}_t} \int_t^{\tau_i} k(y_\rho, w_\rho, \rho)\,d\rho.
\]

By (2-21),

\[
V(y_t, t) = E^{\mathscr{B}_t} V(y_t, t) \le C^w(\mathscr{B}_t, t).
\]

Since C^u(x, t) = V(x, t) by the hypothesis, we have, for any random variable y_t,

\[
C^u(y_t, t) \le C^w(\mathscr{B}_t, t),
\]

and the theorem is proved.

“PRACTICAL OPTIMALITY”: COMPACTIFYING THE STATE SPACE

Commonly, the descriptions of the process x_t at large ‖x‖ are idealizations which provide for a relatively simple or tractable mathematical model. Often, the behavior of the process model at large ‖x‖ will be grossly different from the behavior of the "physical" process at large ‖x‖. For example, the paths of the "physical" process may be uniformly bounded, whereas those of the model may not be (for example, if the model is "Gaussian"). In such cases, the model (which is equivalent to 𝒜^u) and the assigned cost k(x, u) cannot be taken too seriously for large ‖x‖. Recall also that the possible behavior of V(x_t) for large ‖x‖ demanded, in the theorems, the introduction of technical conditions which would ensure that E_x^u V(x_{τᵢ}) → E_x^u V(x_{τ_u}). These considerations suggest that a modification of the problem statement, allowing for some freedom in the choice of the process for large ‖x‖, would be useful to simplify the statements of the theorems and to simplify the checking of the "optimality" of any given control.

Let us modify the problem in the following way. Define the rate of cost to be k(x, u). Let u be a fixed control transferring x to S at τ_u < ∞ with probability one. Assume that there is a finite region of the state space, R, which incorporates part of S and such that P_x^u{x_s ∉ R, some s < τ_u} < δ, where x = x₀ is in R and δ is given and is small. (Naturally, R − S is not to be empty.) The discussion of the preceding paragraph suggests that, if x₀ = x is in R, then the behavior outside R of the process x_t corresponding to the model should not be important, and should be changeable if a change would be helpful in studying the "optimality" properties of a given control.

Denote the weak infinitesimal operator of x_t stopped on first entrance into S by 𝒜_b^u. Let k̂(x, u) = k(x, u) in R; then k̂(x, u) in E − R may be altered in any way consistent with k̂(x, u) ≥ 0 (which the theorems require*). Let R₁ ⊃ R be a compact set, and modify 𝒜_b^u in R₁ − R so that the process x_t (with control u) never leaves R₁. (For example, for Itô processes, let the σᵢⱼ → 0 as x → ∂R₁, and alter the fᵢ(x, u) in R₁ − R so that the paths of the resulting system never leave R₁, and return to R with probability one if they ever leave R.) In general, let us alter the process so that, with an arbitrary control, it is deterministic in E − R₁.

Now consider the altered control problem. Denote the alteration of 𝒜_b^u by 𝒜̂_b^u. Let V(x) be in the domain of 𝒜̂_b^u. We still have 𝒜̂_b^u V(x) = −k(x, u) in R. Now define the function k̂(x, u) ≥ 0 in R₁ by the expression 𝒜̂_b^u V(x) = −k̂(x, u). Then E_x^u V(x_{τᵢ}) → E_x^u b(x_{τ_u}) and

\[
C^u(x) = V(x) = E_x^u\, b(x_{\tau_u}) + E_x^u \int_0^{\tau_u} \hat k(x_s, u_s)\,ds.
\]


* Theorems 6 and 7 do not require k ≥ 0, but there the space was bounded for all mathematical purposes.


Let w be any control transferring x to S with probability one, let V(x) be in the domain of 𝒜̂_b^w, and let

\[
\hat{\mathscr{A}}_b^w V(x) \ge -\hat k(x, w).
\]

Then, by Theorem 2, and noting that the paths of x_t corresponding to control w are bounded (since x_t → S with probability one and x_t is deterministic and bounded in E − R₁), we have

\[
C^u(x) = V(x) \le C^w(x).
\]

For the altered process, the control u is optimal with respect to w. If an arbitrary nonzero δ yields a compact R, and if there is a process alteration so that (for the controls u, w) the paths do not leave R₁ and satisfy the other conditions above, then we say that u is "practically optimum" with respect to w.

Let V(x) → ∞ as ‖x‖ → ∞, and suppose that V(x) is in the domain of 𝒜_{m,b}^u and 𝒜_{m,b}^u V(x) = −k(x, u) ≤ 0, for all m > 0. Suppose that the initial value x₀ is such that V(x₀) ≤ V(x̄) for some fixed x̄. Then, since

\[
P_x^u\Big\{\sup_{\tau_u > t \ge 0} V(x_t) \ge \frac{V(\bar x)}{\delta}\Big\} \le \frac{V(x_0)\,\delta}{V(\bar x)} \le \delta,
\]

a candidate for the region R is

\[
R = \Big\{x: V(x) \le \frac{V(\bar x)}{\delta}\Big\}.
\]

Note that R is compact, since V(x) → ∞ as ‖x‖ → ∞.

3. Examples

Example 1. In the first example, we prove that the usual solution (as, for example, given in Florentin [1]) of the fixed time of control, linear system, and quadratic loss problem for the Itô equation is indeed an optimal control, and we exhibit a class of controls with respect to which it is optimal. Let the loss and system be represented by

\[
k(x,u,t) = x'C(t)\,x + u'D(t)\,u, \qquad b(x) = x'Mx, \tag{3-1}
\]
\[
dx = A(t)\,x\,dt + B(t)\,u\,dt + \sigma(t)\,dz,
\]

where D(t) satisfies ξ'D(t)ξ ≥ β‖ξ‖² for all ξ and all t in [0, T], where β > 0. The control interval is [0, T]. C(t) and M are symmetric and nonnegative definite. The target set is the entire space. First, we obtain the usual result formally by the use of dynamic programming.

A standard application of the dynamic programming result of Section 1 yields that the optimal control is the control which produces the minimum in (3-2):

\[
0 = \min_v \Big( V_t(x,t) + V_x'(x,t)\,(Ax + Bv) + x'Cx + v'Dv + \frac{1}{2}\sum_{i,j} S_{ij}(t)\,V_{x_i x_j}(x,t) \Big), \tag{3-2}
\]

where S(t) = σ(t)σ'(t).

V(x, t) is the minimum cost. Denote the optimal control by u. Then, formally, V(x, t) = C^u(x, t). From (3-2),

\[
u(x,t) = -\frac{D^{-1}(t)\,B'(t)\,V_x(x,t)}{2}. \tag{3-3}
\]

Equation (3-2) is to be solved by assuming a solution of the form

\[
V(x,t) = x'P(t)\,x + Q(t),
\]

where P(t) is nonnegative definite and symmetric and Q(t) ≥ 0:

\[
V_x(x,t) = 2P(t)\,x, \qquad u(x,t) = -D^{-1}(t)\,B'(t)\,P(t)\,x.
\]


Upon substituting V(x, t) and u(x, t) into (3-2) and collecting coefficients of like powers of x, we obtain that the satisfaction of (the pᵢⱼ(t) are the elements of P(t))

\[
\dot P + A'P + PA + C - P(BD^{-1}B')P = 0, \qquad P(T) = M, \tag{3-4}
\]

\[
\dot Q + \sum_{i,j} p_{ij}\,S_{ij} = 0, \qquad Q(T) = 0, \tag{3-5}
\]

is (at least formally) a sufficient condition for the optimality of (3-3) and the minimality of V(x, t). Equation (3-5) is not important, since the control depends only on P. Note that if P is nonnegative, then Σᵢⱼ pᵢⱼSᵢⱼ = trace PS ≥ 0 and, hence, Q(t) ≥ 0 for 0 ≤ t ≤ T (Q(T) = 0). Equation (3-4) is known as the matrix Riccati equation. Since D(t) is positive definite and C(t) and M semidefinite, there is a nonnegative definite solution to the Riccati equation in the interval [0, T] which satisfies P(t) → M as t → T. Since u(x, t) is continuous in t and satisfies a uniform Lipschitz condition in x, the process (3-1) corresponding to u(x, t) is defined and finite with probability one in the time interval [0, ∞) and enjoys the property

\[
E_{x,0}^u \sup_{0 \le s \le T} x_s' x_s < \infty. \tag{3-6}
\]
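As a concrete illustration of (3-3)–(3-4), the following sketch integrates the matrix Riccati equation backward from P(T) = M by Euler steps and forms the feedback gain; the particular matrices A, B, C, D, M are hypothetical stand-ins chosen only to make the fragment runnable.

```python
import numpy as np

# A minimal sketch: integrate the matrix Riccati equation (3-4),
#   -dP/dt = A'P + PA + C - P B D^{-1} B' P,   P(T) = M,
# backward in time by Euler steps, for hypothetical constant matrices.

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.eye(2)            # state weighting C(t)
D = np.array([[1.0]])    # control weighting D(t), positive definite
M = np.eye(2)            # terminal weighting
T, dt = 5.0, 1e-3

P = M.copy()
for _ in range(int(T / dt)):     # march from t = T down to t = 0
    Pdot = A.T @ P + P @ A + C - P @ B @ np.linalg.solve(D, B.T) @ P
    P -= dt * Pdot               # explicit Euler step, backward in time

K = np.linalg.solve(D, B.T) @ P  # feedback gain: u(x, 0) = -K x, as in (3-3)
print("P(0) =\n", P)
print("gain K =\n", K)
```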

By construction, ℒ^u V(x, t) = −k(x, u, t), and V(x, t) is in the domain of 𝒜_m^u = ℒ^u for each m > 0. It will be verified that

\[
E_{x,t}^u\big(x_s' P(s)\,x_s\big) \to E_{x,t}^u\big(x_T' M\,x_T\big)
\]

for any sequence of Markov times s which tend monotonically to T. But, by the property (3-6) and the dominated convergence theorem, E_{x,t}^u(x_s'(P(s) − M)x_s) → 0 as t ≤ s → T; by (3-6) and the fact that P_{x,t}^u{sup_s ‖x_{s+Δ} − x_s‖ ≥ ε} → 0 as Δ → 0 (s ≥ t), for each ε > 0, we have E_{x,t}^u(x_s − x_T)'M(x_s − x_T) → 0 as the Markov times s → T. Thus, by Theorem 5, C^u(x, t) = V(x, t).

Let w be another control, which is continuous in t and uniformly Lipschitz continuous in x. Then (3-6) holds for w, and V(x, t) is in the domain of 𝒜_m^w = ℒ^w. The proof that E_{x,t}^w V(x_s, s) → E_{x,t}^w V(x_T, T) for any sequence of nondecreasing Markov times s tending to T is similar to the proof in the case of u. By virtue of the minimization in (3-2),

\[
\mathscr{L}^w V(x,t) + k(x,w,t) \ge \mathscr{L}^u V(x,t) + k(x,u,t).
\]

Thus, by Theorem 5, we conclude that

\[
C^u(x,t) = V(x,t) \le C^w(x,t)
\]

for any such control w, and u is optimal. The case where σ(t) is a linear function of x may be treated in exactly the same way, and the dynamic programming solution is indeed an optimal control with respect to the same class of comparison controls w. Finally, by appealing to Theorem 8, we note that u(x, t) is also optimal with respect to any bounded nonanticipative control. Also, the result of Example 1 is slightly easier to prove by a direct application of Itô's lemma, as in Theorem 8.

Example 2. A short treatment of the combined linear filtering and control problem with a quadratic rate of loss will be given. Example 2 is more difficult than Example 1. The added difficulty is due to the fact that the control driving (3-7) is a functional of the observations on (3-7). It must be proved that the control at time t is actually a function of the expectation of x_t conditioned upon the observations up to t. The analysis is possible owing to the linearity of the system.

The system to be controlled is represented by

\[
dx = A(t)\,x\,dt + B(t)\,u\,dt + W(t)\,dz, \tag{3-7}
\]

where the initial condition x₀ = x is a normally distributed random variable. We suppose that the available observations can be represented by the function y_t:

\[
dy = H(t)\,x\,dt + W_1(t)\,dz_1, \tag{3-8a}
\]

or

\[
y_t = \int_0^t \big(H(\tau)\,x_\tau\,d\tau + W_1(\tau)\,dz_{1\tau}\big), \tag{3-8b}
\]

where W₁W₁' is a positive definite matrix for each t in the interval of interest. In more intuitive terms, equations (3-8a) and (3-8b) imply


that the "function" η(t) = dy/dt = H(t)x_t + W₁(t)ξ_t is observed, where ξ_t is white Gaussian noise. Mathematically, it is only the integrated "observation," namely (3-8b), which can be treated.

Let W dz = W₂ dz₁ + W₃ dz₂, where z_{1t} and z_{2t} are independent Wiener processes. Define m_t = E[x_t | y_s, s ≤ t] (the expectation conditioned upon the minimum σ-algebra over which y_s, s ≤ t, is measurable). The development requires that, until further notice, we set the control u equal to zero in (3-7). When u = 0, we denote the corresponding processes by x̃_t, ỹ_t, and m̃_t.

It is known (Kalman and Bucy [1]) that a version of the conditional expectation m̃_t satisfies the equation

\[
d\tilde m = A\tilde m\,dt + K\,d\tilde l,
\qquad
d\tilde l = d\tilde y - H\tilde m\,dt = H(\tilde x - \tilde m)\,dt + W_1\,dz_1, \tag{3-9}
\]
\[
K = [\Sigma H' + W_2 W_1']\,[W_1 W_1']^{-1},
\]
\[
\dot\Sigma = A\Sigma + \Sigma A' - [\Sigma H' + W_2 W_1'](W_1 W_1')^{-1}[H\Sigma + W_1 W_2'] + WW'.
\]

Σ(t) is, of course, the covariance of (m̃_t − x̃_t), and its initial value Σ(t₀) is a given positive definite matrix. Actually, the derivation in Kalman and Bucy [1] is still formal. The form (3-9) can be established as a special case of a result which describes the differential equations satisfied by a version of the conditional expectation of functions of the solution of Itô equations. Let x_t be the solution to an Itô equation and let the observations be y_t = ∫₀ᵗ g(x_τ, τ) dτ + ∫₀ᵗ W₁ dz_{1τ}; then the results of Kushner [11] give conditions on x_t and on the possibly nonlinear functions g(x, t) and h(x) under which a version of E[h(x_t)|y_s, s ≤ t] (considered as a stochastic process) satisfies a stochastic differential equation of the Itô type. From the equations for the case where g(x, t) is linear in x, and h(x) is linear and quadratic in the components of x, the form (3-9) can be derived. See also Kushner [1].
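A scalar numerical sketch of the filter equations (3-9) follows; it assumes uncorrelated plant and observation noise (W₂ = 0), so the gain reduces to K = ΣH/W₁², and all numerical values are hypothetical.

```python
import numpy as np

# A minimal scalar sketch of the filter equations (3-9), assuming
# uncorrelated plant/observation noise (W2 = 0) so that K = Sigma*H/W1^2.

rng = np.random.default_rng(4)
A, H, W, W1 = -1.0, 1.0, 0.5, 0.2     # plant, observation, noise levels
dt, N = 1e-3, 5000

x, m, Sigma = 1.0, 0.0, 1.0           # state, conditional mean, covariance
for _ in range(N):
    dz, dz1 = np.sqrt(dt) * rng.standard_normal(2)
    x += A * x * dt + W * dz                   # uncontrolled plant (3-7), u = 0
    dy = H * x * dt + W1 * dz1                 # observation increment (3-8a)
    K = Sigma * H / W1**2                      # filter gain
    m += A * m * dt + K * (dy - H * m * dt)    # conditional-mean equation (3-9)
    Sigma += (2 * A * Sigma - K**2 * W1**2 + W**2) * dt  # variance equation

print("true state", x, "| filter estimate", m, "| variance", Sigma)
```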

Let 𝔅_t and B_t be the minimal σ-algebras over which ỹ_s and l̃_s = ỹ_s − ∫₀ˢ H(τ)m̃_τ dτ, s ≤ t, are measurable, respectively. Since m̃_τ is measurable over 𝔅_t, τ ≤ t, B_t is contained in 𝔅_t, and any function which is measurable over B_t is also measurable over 𝔅_t. The process l̃_t (that is, the solution of the stochastic differential equation dl̃ = dỹ − Hm̃ dt = H(x̃ − m̃) dt + W₁ dz₁) is a vector-valued Wiener process (the process l̃_t is unnormalized; its covariance is not t times the identity). This is easily verified by noting that it is a Gaussian process with orthogonal (hence independent) and infinitely divisible increments and E(l̃_t − l̃_s)(l̃_t − l̃_s)' = ∫ₛᵗ W₁(τ)W₁'(τ) dτ. The details are left to the reader. (See the remark at the end of the example.) From this, it follows that m̃_t is actually the solution to a linear Itô equation (3-9).

Now define m̂_t by the "feedback" system (3-10) or (3-11). We suppose that the control is of the form u_t = ψ(m̂_t, t), where ψ(m̂, t) is continuous and satisfies a uniform Lipschitz condition in m̂. Let Φ(t, s) be the fundamental matrix of ẋ = Ax, let the initial condition for (3-11) be given at time s, and suppose that m̂_s = m̃_s. (In other words, the initial condition is the initial value of the conditional expectation of x_s.) Thus,

\[
\hat m_t = \tilde m_t + U_{s,t}, \qquad U_{s,t} = \int_s^t \Phi(t,\tau)\,B(\tau)\,u(\hat m_\tau, \tau)\,d\tau, \tag{3-10}
\]

or

\[
\hat m_t = \Phi(t,s)\,\hat m_s + \int_s^t \Phi(t,\tau)\,K(\tau)\,d\tilde l_\tau + \int_s^t \Phi(t,\tau)\,B(\tau)\,u(\hat m_\tau, \tau)\,d\tau.
\]

Equation (3-10) is equivalent to the Itô equation

\[
d\hat m = A\hat m\,dt + B\,u(\hat m, t)\,dt + K\,d\tilde l \tag{3-11}
\]

and has a unique solution with probability one in the interval [s, ∞), with m̂_s = m̃_s given.

Let us add the same control u(m̂, t) to (3-7). Then

\[
x_t = \tilde x_t + U_{s,t},
\]

or

\[
dx = Ax\,dt + B\,u(\hat m, t)\,dt + W\,dz. \tag{3-12}
\]

We will show first that m̂_t = m_t = E[x_t | y_s, s ≤ t] with probability one. First, we show m̂_t = E[x_t | 𝔅_t]. Let P(dω) be the probability measure.


Since m̂_t is measurable over B_t (it is a functional of l̃_s, s ≤ t), it is measurable over 𝔅_t. Also, let Λ be any set in 𝔅_t. Then the definition m̃_t = E[x̃_t|𝔅_t] and the evaluation

\[
\int_\Lambda \hat m_t\,P(d\omega) = \int_\Lambda \tilde m_t\,P(d\omega) + \int_\Lambda U_{s,t}\,P(d\omega) = \int_\Lambda \tilde x_t\,P(d\omega) + \int_\Lambda U_{s,t}\,P(d\omega) = \int_\Lambda x_t\,P(d\omega)
\]

yield m̂_t = E[x_t|𝔅_t] with probability one; that is, m̂_t is a version of the conditional expectation of x_t given the observations on the uncontrolled process x̃_s, s ≤ t.

Although the control is a random process, the actual value that a particular realization assumes at time t is known. Since dỹ − Hm̃ dt = dy − Hm̂ dt,

\[
d\hat m = [A\hat m + B\,u(\hat m, t)]\,dt + K(d\tilde y - H\tilde m\,dt) = [A\hat m + B\,u(\hat m, t)]\,dt + K(dy - H\hat m\,dt). \tag{3-13}
\]

Thus m̂_t and, hence, the control, is computed from the observations on x_s, s ≤ t. The observations on the controlled and uncontrolled process differ by

\[
y_t - \tilde y_t = \int_0^t H(\tau) \int_0^\tau \Phi(\tau, s)\,B(s)\,u(\hat m_s, s)\,ds\,d\tau. \tag{3-14}
\]

Equations (3-13) and (3-14) imply that the observations on the controlled and uncontrolled process are nonanticipative functionals of one another. Hence m̂_t = E[x_t | ỹ_s, s ≤ t] = E[x_t | y_s, s ≤ t].

Let the realization of the conditional mean at time t be given as the number m_t = m. Then the cost, defined by

\[
C^u(m,t) = E_{m,t}^u\, x_T' M\,x_T + E_{m,t}^u \int_t^T \big(x_s' C(s)\,x_s + u_s' D(s)\,u_s\big)\,ds,
\]

is truly a function of m. The terms of the integrand which contain x_s can be replaced by terms containing only m_s. We will show this to be true for the right-hand term only. We have

\[
x_s' C(s)\,x_s = m_s' C(s)\,m_s + (x_s - m_s)' C(s)(x_s - m_s) + 2(x_s - m_s)' C(s)\,m_s,
\]

and

\[
E_{x,t}^u \int_t^T x_s' C(s)\,x_s\,ds = E_{m,t}^u \int_t^T x_s' C(s)\,x_s\,ds. \tag{3-15}
\]

Noting that (x_s − m_s) and m_s are independent Gaussian random variables, and using the above equalities, we obtain that (3-15) is equal to

\[
E_{m,t}^u \int_t^T \big(m_s' C(s)\,m_s + q(s)\big)\,ds,
\]

where q(s) is a function of s only; it depends on the covariance of (x_s − m_s) but does not depend on either m_s or u_s. We may write C^u(m, t) as (modulo some known function of time)

\[
C_1^u(m,t) = E_{m,t}^u \int_t^T \big(m_s' C(s)\,m_s + u_s' D(s)\,u_s\big)\,ds + E_{m,t}^u\, m_T' M\,m_T. \tag{3-16}
\]

We may write (3-13) as

\[
dm = [Am + B\,u(m,t)]\,dt + K\,dl = [Am + B\,u(m,t)]\,dt + K W_1\,db, \tag{3-17}
\]

where b_t is a Wiener process. Finally, the optimization problem is equivalent to the minimization of (3-16) subject to the constraint (3-17). The remainder of the procedure is exactly as for Example 1. The optimal control is (3-3) with m replacing x. Again, a family of comparison controls is the family of continuous u(m, t) satisfying a uniform Lipschitz condition in m. The family may be extended to include a class of local Lipschitz controls and bounded nonanticipative controls.
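To make the reduction concrete, a minimal closed-loop sketch is given below: the filter of (3-13) is cascaded with the Riccati gain of Example 1, the control acting on the conditional mean m rather than on x. All scalar coefficients are hypothetical, and stationary (infinite-horizon) gains are used purely for brevity; they are not part of the original text.

```python
import numpy as np

# A minimal closed-loop sketch of the separation suggested by (3-16)-(3-17):
# the Riccati gain of Example 1 is applied to the conditional mean m.
# Scalar, constant, hypothetical coefficients; P is the positive root of the
# stationary Riccati equation 2*A*P + C - P**2 * B**2 / D = 0, and Sigma, K
# are the stationary filter quantities (W2 = 0 assumed).

rng = np.random.default_rng(5)
A, B, C, D = -0.5, 1.0, 1.0, 1.0          # plant and cost weights
H, W, W1 = 1.0, 0.4, 0.2                  # observation and noise levels
dt, N = 1e-3, 5000

P = (A * D + np.sqrt((A * D)**2 + C * D * B**2)) / B**2
Sigma = (A + np.sqrt(A**2 + (W * H / W1)**2)) * W1**2 / H**2
K = Sigma * H / W1**2                     # stationary filter gain

x, m = 1.0, 0.0
for _ in range(N):
    u = -(B * P / D) * m                  # control (3-3) with m replacing x
    dz, dz1 = np.sqrt(dt) * rng.standard_normal(2)
    dy = H * x * dt + W1 * dz1            # observation increment
    x += (A * x + B * u) * dt + W * dz    # controlled plant (3-7)
    m += (A * m + B * u) * dt + K * (dy - H * m * dt)   # filter (3-13)

print("x_T =", x, " m_T =", m)
```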


Remark. In order to prove that ν_t − ν₀ = ∫₀ᵗ K_s dl_s is an (unnormalized) Wiener process, we need only show that it is a (vector-valued) Gaussian martingale. The Gaussian property is obvious. Since ν_t − ν₀ is measurable over the minimum σ-algebra determined by conditions on the l_τ, 0 ≤ τ ≤ t, the martingale property then follows from the evaluation

\[
E\{\nu_t - \nu_s \mid l_r,\, r \le s\} = E\Big\{\int_s^t \big[K H(\tilde x_r - \tilde m_r)\,dr + K W_1\,dz_{1r}\big] \,\Big|\, l_r,\, r \le s\Big\} = 0
\]

with probability one. The last equality follows from the independence of the Gaussian variables x̃_r − m̃_r and l_τ, τ ≤ s, and from the independent increments of z₁. The process ν_t has orthogonal, hence independent, increments, and is therefore a Wiener process. Furthermore, the evaluation

\[
E\big(\nu_{t+h} - \nu_t\big)\big(\nu_{t+h} - \nu_t\big)' = \int_t^{t+h} K_s\,W_1(s)\,W_1'(s)\,K_s'\,ds + o(h)
\]

implies that there is a normalized Brownian motion b_t such that

\[
\nu_t - \nu_0 = \int_0^t K_s\,W_1(s)\,db_s.
\]

Example 3. We consider a stochastic version of the so-called "norm invariant" problem (Athans and Falb [1]). Let the target set be S = {x: ‖x‖ ≤ r} and let the process be

\[
dx = f(x)\,dt + u\,dt + \sigma I\,dz, \tag{3-18}
\]

where σ is a constant and I is the identity matrix. It is supposed that x'f(x) = 0 (which implies that the uncontrolled system is conservative if u = 0). The control values are confined to a sphere of radius ρ; that is, u'u ≤ ρ². We seek the control which minimizes the average time required to transfer x₀ = x to ∂S; thus k(x, u) = 1 and b(x) = 0.

It will be shown that the uniformly Lipschitz control (3-19) is an optimal control:

\[
u(x) = -\rho\,\frac{x}{\|x\|}. \tag{3-19}
\]

Equation (3-19) is also the optimal control for the corresponding deterministic problem (when σ = 0). Owing to the symmetry of the system (3-18), this result is not surprising.

First we will use (3-19) and find a function V(x) which satisfies V(x) > 0 and ℒ^u V(x) = −1 for ‖x‖ > r, and V(x) = 0 for ‖x‖ = r. Then the optimality will be proved.

The symmetry of (3-18) suggests that C^u(x) is of the form C^u(x) = g(‖x‖) for some monotonically increasing function g(w).

Let w = ‖x‖. Then, supposing that the candidate V(x) is of the form g(‖x‖), the relation

\[
\frac{\partial w}{\partial x_i} = \frac{x_i}{w}
\]

implies that

\[
\mathscr{L}^u V(x) = -\rho\,g'(w) + \frac{\sigma^2}{2}\Big[g''(w) + \frac{n-1}{w}\,g'(w)\Big] = -1, \tag{3-20}
\]

where n is the dimension of the vector x. Equation (3-20) admits of a solution of the form

\[
g(w) = A_0 + A_1 w + A_2 \log w + \sum_{i=1}^{n-2} \frac{B_i}{w^i}. \tag{3-21}
\]

Substituting (3-21) into (3-20), and equating coefficients of like terms in w on each side of (3-20), we obtain the coefficients Aᵢ and Bᵢ; in particular,

\[
A_1 = \frac{1}{\rho}, \qquad A_2 = (n-1)\,\frac{\sigma^2}{2\rho^2}, \tag{3-22}
\]

with the Bᵢ determined recursively from these.

Since the set (3-22) contains only n nonzero terms,

\[
V(x) = g(\|x\|) = A_0 + A_1\|x\| + A_2 \log\|x\| + \sum_{i=1}^{n-2} \frac{B_i}{\|x\|^i}, \tag{3-23}
\]

where the empty sum Σ₁⁰ Bᵢ/wⁱ is defined to be zero. The value of A₀ is obtained from the boundary condition

\[
g(r) = 0 = A_0 + A_1 r + A_2 \log r + \sum_{i=1}^{n-2} \frac{B_i}{r^i}.
\]

Since Aᵢ ≥ 0 and Bᵢ ≤ 0, we have g(w) > 0 for w > r. The optimality of (3-19) and the minimality of (3-23) will now be

verified by the use of Theorems 1 and 3. By construction, V(x) > 0 and ℒ^u V(x) = −1 in E − S, and V(x) = 0 on ∂S. We may define V(x) = 0 on S. V(x) is in the domain of 𝒜_{m,b}^u for each m > 0, and 𝒜_{m,b}^u V(x) = ℒ^u V(x). ℒ^u V(x) = −1 implies that x_t → ∂S with probability one in a finite average time, for x in E − S. Finally, by the use of Theorem 3 with F(V(x)) = V(x) log(A + V(x)) and

\[
\mathscr{L}^u F(V(x)) \le 0
\]

for large A and x in E − S, the uniform integrability condition of Theorem 1 is satisfied. Thus V(x) = C^u(x).

Now, noting that the control (3-19) absolutely minimizes YW V ( x ) + 1, where w is subject to the constraint w’w S p2, Theorems 3 and 4 imply that (3-19) is optimal with respect to at least the family of uniform Lipschitz controls w such that

in E - S . Y p ” V ( x ) log@ + V ( x ) ) 5 0
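The coefficients in (3-22) can be checked numerically. The following sketch is not from the book: the recursion for the $B_i$ is derived here by equating the coefficients of $w^{-i-1}$ in (3-20), and the parameter values (n, ρ, σ) are arbitrary. It builds g'(w) and g''(w) and verifies that (3-20) holds identically.

```python
# Numerical sanity check of (3-20)-(3-22): a minimal sketch, with an assumed
# recursion for the B_i obtained by matching powers of w in (3-20).
import numpy as np

n, rho, sigma = 5, 2.0, 0.7                # dimension, control bound, noise (arbitrary)

A1 = 1.0 / rho                             # coefficient of w       (from (3-22))
A2 = (n - 1) * sigma**2 / (2 * rho**2)     # coefficient of log w   (from (3-22))

# B_1 from the 1/w^2 terms; B_i, i >= 2, from the 1/w^(i+1) terms (derived here).
B = [-(sigma**2) * (n - 2) * A2 / (2 * rho)]
for i in range(2, n - 1):
    B.append(B[-1] * (sigma**2 / (2 * rho)) * (i - 1) * (n - 1 - i) / i)

def g1(w):   # g'(w)
    return A1 + A2 / w + sum(-i * Bi * w**(-i - 1) for i, Bi in enumerate(B, 1))

def g2(w):   # g''(w)
    return -A2 / w**2 + sum(i * (i + 1) * Bi * w**(-i - 2) for i, Bi in enumerate(B, 1))

w = np.linspace(1.0, 10.0, 50)
residual = 0.5 * sigma**2 * (g2(w) + (n - 1) * g1(w) / w) - rho * g1(w)
assert np.allclose(residual, -1.0)         # (3-20) says this is identically -1
print("max deviation from -1:", np.abs(residual + 1).max())
```

The constant $A_0$ plays no role in the check, since only derivatives of g enter (3-20).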


4. A Discrete Parameter Theorem

Theorem 9 is the discrete parameter version of Theorem 2. The discrete parameter case has been more extensively studied than the continuous parameter case. See Howard [1], Blackwell [1], Derman [1], [2], and Bellman and Dreyfus [1] for some representative results. We denote the target set by S and suppose that the transition probability of the homogeneous Markov process $x_1, x_2, \ldots$ depends on the control. (The result for the nonhomogeneous case is the same, except for notation.) The values of the control $u_n$ depend only on $x_n$. $N_u$ is the first Markov time of arrival at S, and we suppose that the process is stopped at $N_u$. The cost is

$C^u(x) = E_x^u b(x_{N_u}) + E_x^u \sum_{n=1}^{N_u - 1} k_1(x_{n+1}, u_n), \qquad b(x) \ge 0,\quad k_1(x, u) \ge 0,$

conditioned on the initial condition $x_1 = x$. Write

$E_{x,n}^u\, k_1(x_{n+1}, u_n) = k(x, u_n).$

Let k(x, u) = 0 for x in S.

By the algorithm of dynamic programming, the minimum cost V(x) satisfies the functional equation

$V(x) = \min_u E_{x,n}^u\big[V(x_{n+1}) + k_1(x_{n+1}, u_n)\big].$ (4-1)
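Equation (4-1) can be iterated directly. The sketch below is purely illustrative and not from the text: a hypothetical five-state controlled random walk with target S = {0}, two control values, and a per-step loss that, for simplicity, depends only on the control.

```python
# A minimal value-iteration sketch for the functional equation (4-1) on a
# finite state space.  All data (chain, losses, target set) are hypothetical.
import numpy as np

n_states, actions = 5, (0, 1)
S = {0}                                    # target set; process stops on arrival

def P(a):
    """Transition matrix under action a (a random walk with controllable drift)."""
    p_down = 0.5 + 0.3 * a                 # action 1 pushes harder toward 0
    M = np.zeros((n_states, n_states))
    for x in range(1, n_states):
        M[x, x - 1] = p_down
        M[x, min(x + 1, n_states - 1)] += 1 - p_down
    M[0, 0] = 1.0                          # stopped at the target
    return M

k1 = lambda a: 1.0 + 0.5 * a               # per-step loss (k1 >= 0); simplified to
b = np.zeros(n_states)                     # depend on the action only; b = 0 on S

V = np.zeros(n_states)
for _ in range(500):                       # iterate V <- min_a E[V(x_{n+1}) + k1]
    Q = np.stack([P(a) @ V + k1(a) for a in actions])
    V_new = Q.min(axis=0)
    V_new[list(S)] = b[list(S)]            # boundary condition V = b on S
    if np.max(np.abs(V_new - V)) < 1e-12:
        break
    V = V_new
print("V =", V, " minimizing actions:", Q.argmin(axis=0))   # actions on S immaterial
```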

Theorem 9. Let u transfer $x_1 = x$ to S with probability one (with $N_u < \infty$ with probability one if b(x) ≢ 0). Let V(x) be a continuous nonnegative function satisfying b(x) = V(x) for x ∈ S and

$E_{x,n}^u V(x_{n+1}) - V(x) = -k(x, u) \le 0$ (4-2)

in E − S. Let either (4-3), or (4-4) together with either (4-5) or (4-6), hold:

$E_x^u V(x_{n \wedge N_u}) \to E_x^u b(x_{N_u})$ as $n \to \infty$, (4-3)

the $V(x_{n \wedge N_u})$ are uniformly integrable, (4-4)

V(x) uniformly continuous on S, (4-5)

$P_x^u\{\sup_{N_u > i \ge 1} \|x_i\| \ge N\} \to 0$ as $N \to \infty$. (4-6)

Then $C^u(x) = V(x)$.

Let w be a control such that the above conditions are satisfied except that

$E_{x,n}^w V(x_{n+1}) - V(x) \ge -k(x, w).$

Then

$C^u(x) = V(x) \le C^w(x).$

Proof. If b(x) = 0, then $x_n \to S$ and either (4-5) or (4-6) imply $V(x_{n \wedge N_u}) \to 0$. If b(x) ≢ 0, then $x_n \to S$, $N_u < \infty$, and either (4-5) or (4-6) imply $V(x_{n \wedge N_u}) \to b(x_{N_u})$ (all with probability one). Then (4-4) implies (4-3) whether or not b(x) = 0. Next, we note that the application of (4-3) to

$V(x) = E_x^u \sum_{i=1}^{n \wedge N_u} k(x_i, u_i) + E_x^u V(x_{n+1 \wedge N_u})$

implies that $V(x) = C^u(x)$. Similarly, it can be proved that $V(x) \le C^w(x)$.


V / THE DESIGN OF CONTROLS

1. Introduction

In this chapter we discuss some methods of designing controls which are suggested by the results of Chapters II to IV. If a control problem is posed as a well-formulated mathematical optimization problem, as in the introduction to Chapter IV, then it is natural at least to attempt to compute the optimizing control. Owing to the difficulty of the computational problem (even if existence and other theoretical questions were settled), this is not always possible. In addition, the practical control problem is not usually posed as a well-formulated mathematical optimization problem.

The goal which the control is designed to accomplish may be phrased somewhat loosely. We may desire a control which will guarantee that a given target set is attained with probability one at some random time which is specified only by a bound on its average value or on the probability of large values. Or we may require that the system satisfy some particular statistical stability property, either in some asymptotic sense, or with bounded paths, or with a condition on the average rate of decrease of some error for large values of the error. Any member of some family of loss functions may be satisfactory. It may only be desired that the control, which accomplishes a given task, not take “large” values with a high probability. For the deterministic problem there is interest in the use of Liapunov function methods to design controls which will satisfy such qualitative requirements; for example,


Kalman and Bertram [1], LaSalle [1], Geiss [1], Nahi [1], Rekasius [1], and Johnson [1]; see Kushner [5] for the stochastic case.

In the next section, we give two examples of the use of the stability results to design controls which will assure that the resulting process has some specified stability property. The design of controls to improve the cost is discussed in Section 3.

It is doubtful that the design of a control can be based solely on the Liapunov function method; however, the method should provide helpful guidance.

2. The Calculation of Controls Which Assure a Given Stability Property

Example 1. We will compute a control which will assure that the stability properties of (2-1), (2-2) satisfy a given specification:

$dx_1 = x_2\,dt$ (2-1)

$dx_2 = (-x_1 - x_2 + u)\,dt + \sigma\,dz.$ (2-2)

The function

$V(x) = \tfrac{3}{2}x_1^2 + x_1 x_2 + x_2^2$

is in the domain of $\tilde A_m^u$ for each m, and $\tilde A_m^u = \mathscr{L}^u$ for any uniformly Lipschitz control:

$\mathscr{L}^u V(x) = -x_1^2 - x_2^2 + u(2x_2 + x_1) + \sigma^2.$

Suppose that u(x) is uniformly Lipschitz. Define $S = \{x: x'x \le R^2\}$, where $R^2 > \sigma^2$. Let u = 0. Outside of S, $\mathscr{L}^0 V(x) < 0$ and tends to $-\infty$ as $\|x\|$ tends to infinity. By Chapter II, Corollary 2-1, there is a time $\tau_0$ (possibly taking infinite values) so that $x_t \to \partial S$ with probability one as $t \to \tau_0$. Similarly, if the control u(x) satisfies sign u(x) = $-$sign$(2x_2 + x_1)$, there is a $\tau_u$ such that $x_t \to \partial S$ with probability one as $t \to \tau_u$. We will select a control u so that

$P_x^u\{\sup_{\tau_u > t \ge 0} V(x_t) \ge \varepsilon\} \le p,$ (2-3)


where p and ε are fixed; that is, the control will assure that the probability that $x_t$ leaves $Q_\varepsilon$ before touching $\partial S$ is no greater than p.

Let c > 0. Then, for any stochastic Liapunov function V(x), V(x) + c gives a worse estimate than V(x). This is easily seen by noting that

$\tilde A_m(V(x) + c) = \tilde A_m V(x)$

(provided only that the probability that $x_t$ is killed in $[t, t+\Delta)$, given $x_t = x \in Q_m$, is $o(\Delta)$), and

$P_x\{\sup_{\tau_m > t \ge 0} V(x_t) \ge \varepsilon\} = P_x\{\sup_{\tau_m > t \ge 0} (V(x_t) + c) \ge \varepsilon + c\} \le \frac{V(x) + c}{\varepsilon + c},$

while

$\frac{V(x) + c}{\varepsilon + c} \ge \frac{V(x)}{\varepsilon} \ge P_x\{\sup_{\tau_m > t \ge 0} V(x_t) \ge \varepsilon\}.$

Therefore, to improve the estimate in our problem, we use $\bar V(x) = V(x) - v \ge 0$ in E − S, where $v = \min_{x \in \partial S} V(x)$.

Define $\bar\varepsilon = \varepsilon - v$, $W(x) = \exp\psi\bar V(x) - 1$, and $\bar m = \exp\psi\bar\varepsilon - 1$. W(x) is in the domain of $\tilde A_m^u$ and $\tilde A_m^u = \mathscr{L}^u$. Note that $W(x) \ge 0$ where $\bar V(x) \ge 0$, and

$\mathscr{L}^u W(x) = \psi(W(x) + 1)\left[-x_1^2 - x_2^2 + \sigma^2 + u(2x_2 + x_1) + \frac{\psi\sigma^2}{2}(2x_2 + x_1)^2\right].$ (2-4)

If u(x) is such that $\mathscr{L}^u W(x) \le 0$ in E − S, then by Chapter II, Corollary 2-1,

$P_x^u\{\sup_{\tau_u > t \ge 0} V(x_t) \ge \varepsilon\} = P_x^u\{\sup_{\tau_u > t \ge 0} W(x_t) \ge \exp\psi\bar\varepsilon - 1\} \le \frac{\exp\psi\bar V(x) - 1}{\exp\psi\bar\varepsilon - 1} \equiv B(\psi).$

To complete the choice of u(x), fix the initial condition x, choose ψ so that B(ψ) = p and, finally, choose u(x) so that $u(2x_2 + x_1)$ cancels the noise contribution $(\psi\sigma^2/2)(2x_2 + x_1)^2$; one choice, which is not necessarily the "smallest," is $u = -\psi\sigma^2(2x_2 + x_1)/2$.


It cannot be claimed that the chosen control is best in the sense of minimizing the left side of (2-3), or even that it improves the stability in any absolute sense. All that can be claimed is that (2-3) is satisfied. Nevertheless, if (2-3) cannot be ascertained mathematically by some other technique with some other control, then the method employed does have value for the problem.

It should be mentioned that, in a realistic control problem, the specification (2-3) would rarely be given so precisely; a fair amount of freedom may be available in the choice of both S and $Q_\varepsilon$, and other Liapunov functions, giving information on different regions, may also be useful. In fact, it would be desirable to compare several Liapunov functions, bounds, and regions.
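To make the last step of Example 1 concrete, here is a small numerical sketch. All parameter values ($\sigma$, $\bar V(x)$, $\bar\varepsilon$, p) are illustrative assumptions, not from the text; the sketch solves B(ψ) = p by bisection, using the fact that B is decreasing in ψ when $\bar V(x) < \bar\varepsilon$, and then forms the cancelling control.

```python
# Choosing psi so that B(psi) = p, then forming the cancelling control of
# Example 1.  All numbers below are illustrative assumptions.
import math

sigma = 0.5
Vbar_x, eps_bar = 0.4, 2.0        # Vbar(x) = V(x) - v at the initial x, and eps - v
p = 0.05                          # allowed exit probability in (2-3)

def B(psi):                       # the bound B(psi) derived in Example 1
    return math.expm1(psi * Vbar_x) / math.expm1(psi * eps_bar)

# B is decreasing in psi for Vbar_x < eps_bar, so bisection applies.
lo, hi = 1e-6, 50.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if B(mid) > p else (lo, mid)
psi = 0.5 * (lo + hi)
print(f"psi = {psi:.4f},  B(psi) = {B(psi):.4f}  (target p = {p})")

def u(x1, x2):                    # control cancelling the noise term in (2-4)
    return -psi * sigma**2 * (2 * x2 + x1) / 2

print("u(1.0, 0.5) =", u(1.0, 0.5))
```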

Example 2. We consider now a finite time problem for the system (2-1), (2-2). Let V(x) be defined as in Example 1 and let $w(x) = \exp\psi V(x) - 1$ and $W(x) = \exp\psi V(x)$. Determine a control which assures that

$P_x\{\sup_{T \ge t \ge 0} V(x_t) \ge \varepsilon\} \le p.$ (2-5)

Recall that (Chapter III, Example 3) $x_1^2 + x_2^2 \ge \mu V(x)$, where $\mu = 0.552$. Let $u = -(2x_2 + x_1)\sigma^2\psi/2$ and write (analogously to (2-4))

$\mathscr{L}^u w(x) \le -\psi\mu V(x) + \psi W(x)\left[\frac{\mu}{\psi} + \sigma^2 - \mu V(x)\right].$ (2-6)

The second term of (2-6) has a maximum, denoted by φ, at any point x such that $V(x) = \sigma^2/\mu$. If the maximum occurs at V(x) = 0, then $\varphi = \psi(\mu/\psi + \sigma^2)$; otherwise, $\varphi = \mu\exp[\psi(\mu/\psi + \sigma^2)/\mu - 1]$. In what follows, we suppose that the latter case holds; we also suppose that ε is large enough so that, for all ψ of interest,

$\varepsilon > \sigma^2/\mu.$ (2-7)

We allow these assumptions merely for ease of presentation.


Under condition (2-7), Theorem 1 of Chapter III gives

$P_x\{\sup_{T \ge t \ge 0} V(x_t) \ge \varepsilon\} = P_x\{\sup_{T \ge t \ge 0} w(x_t) \ge \exp\psi\varepsilon - 1 = \bar m\} \le \frac{w(x) + \varphi T}{\exp\psi\varepsilon - 1}.$ (2-8)

ψ can be selected so that the right-hand side of (2-8) equals p, as required by (2-5). Note that, as ψ → ∞, the right side of (2-8) tends to zero for all fixed T < ∞, provided (2-7) holds. Of course, as ψ increases, so does the magnitude of the control.
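The finite-time specification (2-5) can also be checked by simulation. The following Euler-Maruyama sketch uses assumed parameter values and the V of Example 1; it estimates the left side of (2-5) for the system (2-1), (2-2) under the cancelling control.

```python
# Monte Carlo check of (2-5): simulate (2-1), (2-2) under the control
# u = -(2*x2 + x1)*sigma^2*psi/2 and estimate P{ sup_{0<=t<=T} V(x_t) >= eps }.
# All parameter values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
sigma, psi, eps, T, dt, n_paths = 0.5, 4.0, 3.0, 5.0, 1e-3, 2000

V = lambda x1, x2: 1.5 * x1**2 + x1 * x2 + x2**2     # V from Example 1
x1 = np.full(n_paths, 0.3)                           # illustrative initial condition
x2 = np.zeros(n_paths)
exceeded = V(x1, x2) >= eps

for _ in range(int(T / dt)):
    u = -(2 * x2 + x1) * sigma**2 * psi / 2
    dz = rng.normal(0.0, np.sqrt(dt), n_paths)
    x1, x2 = x1 + x2 * dt, x2 + (-x1 - x2 + u) * dt + sigma * dz
    exceeded |= V(x1, x2) >= eps

print("estimated P{sup V >= eps} =", exceeded.mean())
```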

3. Design of Controls to Decrease the Cost

Theorem 1. Suppose that the pair (V(x), u) satisfies the conditions of Theorem 1 (or Theorem 2 if b(x) ≢ 0), Chapter IV (so that $V(x) = C^u(x)$). Let w be any control which transfers the initial condition x to S with probability one and for which V(x) is in the domain of $\tilde A_m^w$ for each m > 0. If b(x) ≢ 0, then let $\tau_w < \infty$ with probability one. Let

$\tilde A_m^w V(x) \le -k(x, w) \le 0.$ (3-1)

Then

$C^w(x) \le C^u(x).$ (3-2)

If (3-1) is strict for some x, then so is (3-2) for some x.

Proof. By Theorem 1 (or Theorem 2, if applicable) of Chapter IV, $V(x) = C^u(x)$. Let $\tau_i$ be a sequence of Markov times (for the process with control w) tending to $\tau_w$ with probability one and satisfying $E_x^w \tau_i < \infty$, $\tau_i \le \inf\{t: V(x_t) \ge i$, with control $w\}$. Then, by Dynkin's


formula and (3-1),

$V(x) \ge E_x^w V(x_{\tau_i}) + E_x^w \int_0^{\tau_i} k(x_s, w_s)\,ds.$ (3-3)

Now applying Fatou's lemma and using the fact that $V(x_{\tau_i}) \to V(x_{\tau_w})$ with probability one, we obtain

$V(x) \ge \liminf_i E_x^w V(x_{\tau_i}) + \liminf_i E_x^w \int_0^{\tau_i} k(x_s, w_s)\,ds \ge E_x^w b(x_{\tau_w}) + E_x^w \int_0^{\tau_w} k(x_s, w_s)\,ds = C^w(x),$ (3-4)

whether or not b(x) = 0. Q.E.D.

Remark. Note that it is not required that $E_x^w V(x_{\tau_i}) \to E_x^w b(x_{\tau_w})$. If this is not the case, then the difference is in favor of the control w.

If E − S + ∂S is bounded, then it is again only required that the processes $x_t$ converge either to S, if b(x) is identically zero, or to a specific point on S in finite time, if b(x) is not identically zero.

Remark. Suppose that V(x) is known to equal the cost $C^u(x) = E_x^u b(x_{\tau_u}) + E_x^u \int_0^{\tau_u} k(x_s, u_s)\,ds$, and is in the domain of $\tilde A_m^u$ for each m > 0, as well. Suppose also that k(x, u) is continuous at each x and u and that $u_t$ is also right continuous with probability one for $t < \tau_u$. Then $\tilde A_m^u V(x) = -k(x, u)$. (This can be demonstrated by an application of Theorem 5.2 and Lemma 5.6 of Dynkin [2].) Then any control w for which V(x) is in the domain of $\tilde A_m^w$ and $\tilde A_m^w V(x) \le -k(x, w)$ yields a cost that is no larger than $C^u(x)$, provided $V(x_t) \to V(x_{\tau_w})$ as $t \to \tau_w$.

The theorem provides a procedure which, at least in principle, may be used to compute a sequence of controls, each member of the sequence yielding a smaller cost than the previous member. Suppose that the cost corresponding to u can be computed, and suppose that the pair $(C^u(x), u)$ satisfies the conditions on (V(x), u) of the theorem. Suppose also that a control w can be found which satisfies the conditions of the theorem. Then $C^w(x) \le C^u(x)$. If $C^w(x)$ can be computed, and $(C^w(x), w)$ has the properties of (V(x), u) of the theorem, then the procedure may be repeated, etc.
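In the discrete-parameter setting of Chapter IV, Section 4, this improvement cycle can be carried out exactly; it is policy iteration in spirit. A minimal sketch, in which the chain, losses, and target set are all hypothetical: given a control, compute its cost by solving the linear equations C = k + PC off the target, then improve pointwise.

```python
# Iterative control improvement on a finite-state chain (all data hypothetical).
import numpy as np

n, actions, S = 6, (0, 1), {0}

def P(a):                                   # hypothetical controlled random walk
    M = np.zeros((n, n))
    p = 0.45 + 0.35 * a
    for x in range(1, n):
        M[x, x - 1] = p
        M[x, min(x + 1, n - 1)] += 1 - p
    M[0, 0] = 1.0                           # absorbed on the target
    return M

k = lambda x, a: 1.0 + 0.4 * a              # running loss; zero cost after S

def cost(policy):                           # solve C(x) = k(x,u(x)) + sum_y P C(y), C = 0 on S
    A, rhs = np.eye(n), np.zeros(n)
    for x in range(n):
        if x in S:
            continue
        A[x] -= P(policy[x])[x]
        rhs[x] = k(x, policy[x])
    return np.linalg.solve(A, rhs)

policy = np.zeros(n, dtype=int)             # start from u = action 0 everywhere
for _ in range(20):
    C = cost(policy)
    improved = np.array([min(actions, key=lambda a: k(x, a) + P(a)[x] @ C)
                         for x in range(n)])
    if (improved == policy).all():
        break
    policy = improved
print("cost:", np.round(cost(policy), 3), "policy:", policy)
```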

For Itô processes the computations involve, under suitable conditions, the solution of a partial differential equation; see, for example, Chapter IV. If a solution to the appropriate equation is available, it must still be checked against the conditions of the theorems of Chapter IV. The limiting properties of the sequence of solutions to this partial differential equation, for strong diffusion processes in bounded domains, are discussed by Fleming [1]. See also Fleming [2]. We will not pursue these interesting results here.

Remark. The following form for Itô processes will be useful in the example. Let

$dx = f(x, u)\,dt + \sigma(x)\,dz,$

where σ does not depend on the control. Suppose that the triple V(x), u = 0, and w satisfies the conditions of Theorem 1, and that, on V(x), $\tilde A_m^0 = \mathscr{L}^0$ and $\tilde A_m^w = \mathscr{L}^w$. Let

$C^u(x) = E_x^u \int_0^{\tau_u} [k(x_s) + l(x_s, u_s)]\,ds,$

where

$k(x) \ge 0, \qquad l(x, u) \ge 0, \qquad l(x, 0) = 0.$

Let

$\mathscr{L}^0 V(x) + k(x) = 0.$

By the theorem, $C^0(x) \ge C^w(x)$. The inequality is strict for some x if

$\mathscr{L}^w V(x) - \mathscr{L}^0 V(x) + l(x, w) < 0,$

or, equivalently, if

$V_x'(x)\big(f(x, w) - f(x, 0)\big) + l(x, w) < 0.$


THE LIAPUNOV FUNCTION APPROACH TO DESIGN

Suppose that some stochastic Liapunov function V(x) ≥ 0 is given. Then the control problem may be studied in several ways. The common factor underlying the several approaches is that V(x) is assumed to equal the cost associated with some control and some loss function k(x, u). The control may be given a priori, but the loss function is determined by V(x) and u.

First let b(x) = 0 and define the loss to be $k(x, u) = -\tilde A_m^u V(x) \ge 0$, and suppose that the value of $\tilde A_m^u V(x)$ in any fixed set does not depend on m. Define $R^u = \{x: k(x, u(x)) \le 0\}$. Suppose that there is some γ > 0 so that $Q_\gamma + \partial Q_\gamma = \{x: V(x) \le \gamma\} \supset R^u$. Then, for the loss k(x, u) and target set $Q_\gamma + \partial Q_\gamma = S$, the cost corresponding to control u is $C^u(x) = V(x) - \gamma$. (We suppose, of course, that (V(x), u) satisfies the conditions of Theorem 1.) Let w be any control satisfying the conditions of Theorem 1, and suppose also that $S \supset R^w$. Then the theorem says that $C^w(x) \le C^u(x)$. Thus, given some suitable V(x), we have constructed a control problem and then improved the control. Obviously, the usefulness of the procedure depends upon whether the chosen V(x) yields a k(x, u) and an S of suitable forms. In any case, with the given V(x) and computed S and any $k(x, u) \ge 0$ in E − S, some type of stochastic stability is guaranteed.

Now choose a set S which has a smooth boundary and which contains $R^u$, and define the function V(x) = b(x) on S. The procedure outlined above may be repeated. Now $C^u(x) = V(x)$ and, if there is a w satisfying Theorem 1, then w is at least as useful as u.

The method is applied to a control process in Example 3. More examples are given in Kushner [5].

Example 3. Let the control system be defined by

$dx_1 = x_2\,dt$

$dx_2 = (-x_1 - x_2 + u)\,dt + \sigma\,dz.$

Define

$V(x) = \tfrac{3}{2}x_1^2 + x_1 x_2 + x_2^2.$


V(x) is in the domain of $\tilde A_m^u$ for each m > 0 and each locally Lipschitz control u(x). On V(x), $\mathscr{L}^u = \tilde A_m^u$. In particular,

$\mathscr{L}^0 V(x) = -x'x + \sigma^2.$

We will now construct a control problem based on V(x), and compare the control u = 0 to another control. Define $R^0 = \{x: x'x \le \sigma^2\}$ and let $S = Q_\gamma + \partial Q_\gamma = \{x: V(x) \le \gamma\} \supset R^0$. By Corollary 2-1, Chapter II, there is some $\tau_0 \le \infty$ such that $x_s \to \partial Q_\gamma$ as $s \to \tau_0$, with probability one. In addition, it is not hard to verify (although we omit the proof*) that $E_x^0 V(x_{t_i}) \to \gamma$ for any sequence of random times $t_i$ tending to $\tau_0$. Thus, by Theorem 1, Chapter IV,

$V(x) - \gamma = C^0(x) = E_x^0 \int_0^{\tau_0} (x_s'x_s - \sigma^2)\,ds.$

We conclude that V(x) − γ is the cost corresponding to the control problem with target set S, loss $x'x - \sigma^2$, and control u = 0. Suppose now that we wish to find a control u = w for which there is a random time $\tau_w$ such that $x_t \to \partial S$ as $t \to \tau_w$ and, in addition, for which ($x_0 = x \notin S$)

$C^0(x) > C^w(x) = E_x^w \int_0^{\tau_w} (x_s'x_s - \sigma^2 + w_s^2)\,ds.$

The procedure to be followed is outlined in the last remark following Theorem 1.

Let w be a uniformly Lipschitz control. Define the loss

$k(x, w) = k(x) + l(w) \ge k(x) = x'x - \sigma^2, \qquad l(w) = w^2.$

Then

$\mathscr{L}^w V(x) = -x_1^2 - x_2^2 + \sigma^2 + w(x_1 + 2x_2).$

* Suppose that the distance between ∂S and ∂R⁰ is greater than zero. Then $-x'x + \sigma^2 < -\varepsilon < 0$ in E − S, and it is easy to show that $E_x^0 V(x_{t_i}) \to \gamma$. Let α > 0. Then $\mathscr{L}^0 V^{1+\alpha}(x) = (1+\alpha)V^\alpha(x)[-x'x + \sigma^2 + \sigma^2\alpha(2x_2 + x_1)^2/2]$, which, for small α, is nonpositive in E − S. An appeal to Theorem 3, Chapter IV, completes the demonstration for this case.


If sign w = −sign(x₁ + 2x₂), then $\mathscr{L}^w V(x) \le \mathscr{L}^0 V(x)$ and, hence, $x_t \to \partial S$ as $t \to \tau_w$ (that is, $\tau_w$ is defined). Suppose that w satisfies, in addition,

$\mathscr{L}^w V(x) \le -k(x, w),$

or, equivalently,

$V_x'(x)\big(f(x, w) - f(x, 0)\big) + w^2 = (x_1 + 2x_2)w + w^2 < 0;$

then the control u = w is better than the control u = 0. The control minimizing the left side of the last expression (and which also satisfies the other requirements) is

$w = -\frac{x_1 + 2x_2}{2}.$

Other forms, such as $w_1 = w$ if $|w| \le 1$, and $w_1 = \operatorname{sign} w$ otherwise, are also satisfactory.

It does not appear easy to compute explicitly the improvement in the cost obtained by the use of w, $C^0(x) - C^w(x)$. Nevertheless, the use of the control w allows an improved estimate of $P_x\{\sup_{\tau_w > t \ge 0} V(x_t) \ge m\}$. In fact, this estimate is essentially provided by Example 1.
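Although the improvement is not computed explicitly here, the two costs can be estimated by simulation. A sketch under assumed parameter values (σ, γ, the initial point, and the time step are all illustrative); paths are frozen once they enter S.

```python
# Monte Carlo estimate of the cost improvement in Example 3, comparing u = 0
# with w = -(x1 + 2*x2)/2 under the loss x'x - sigma^2 + l(u), l(u) = u^2.
# All parameter values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
sigma, gamma, dt, n_paths = 0.4, 1.0, 1e-3, 1000
V = lambda x1, x2: 1.5 * x1**2 + x1 * x2 + x2**2

def mc_cost(control):
    x1, x2 = np.full(n_paths, 2.0), np.zeros(n_paths)   # x0 outside S
    cost = np.zeros(n_paths)
    alive = V(x1, x2) > gamma                           # not yet in S
    for _ in range(int(60.0 / dt)):                     # horizon cap for the sketch
        if not alive.any():
            break
        u = control(x1, x2)
        cost += alive * (x1**2 + x2**2 - sigma**2 + u**2) * dt
        dz = rng.normal(0.0, np.sqrt(dt), n_paths)
        x1, x2 = (x1 + alive * x2 * dt,
                  x2 + alive * ((-x1 - x2 + u) * dt + sigma * dz))
        alive &= V(x1, x2) > gamma
    return cost.mean()

print("C^0(x) ~", mc_cost(lambda x1, x2: np.zeros_like(x1)))
print("C^w(x) ~", mc_cost(lambda x1, x2: -(x1 + 2 * x2) / 2))
```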


REFERENCES

Aiserman, M. A., and Gantmacher, F. R.
[1] Absolute Stability of Regulator Systems, Holden-Day, San Francisco, 1964.

Astrom, K. J.
[1] "On a First Order Stochastic Differential Equation," Intern. J. Control 1, No. 4, 301-326 (1965).

Athans, M., and Falb, P. L.
[1] Optimal Control: An Introduction to the Theory and Its Applications, McGraw-Hill, New York, 1966.

Bellman, R.
[1] Introduction to Matrix Analysis, McGraw-Hill, New York, 1960.
[2] Adaptive Control Processes: A Guided Tour, Princeton Univ. Press, Princeton, New Jersey, 1961.

Bellman, R., and Dreyfus, S. E.
[1] Applied Dynamic Programming, Princeton Univ. Press, Princeton, New Jersey, 1962.

Bertram, J. E., and Sarachik, P. E.
[1] "On the Stability of Systems with Random Parameters," Trans. IRE-PGCT 5 (1959).

Bharucha-Reid, A. T.
[1] Elements of the Theory of Markov Processes and Their Applications, McGraw-Hill, New York, 1960.

Blackwell, D.
[1] "Discrete Dynamic Programming," Ann. Math. Statist. 33, 719-726 (1962).

Bucy, R. S.
[1] "Stability and Positive Supermartingales," J. Differential Equations 1, No. 2, 151-155 (1965).

Caughy, T. K., and Gray, A. H., Jr.
[1] "On the Almost Sure Stability of Linear Dynamic Systems with Stochastic Coefficients," J. Appl. Mech. (ASME) 87, 365-372 (1965).

Chung, K. L.
[1] Markov Chains with Stationary Transition Probabilities, Springer, Berlin, 1960.

Derman, C.
[1] "On Sequential Control Processes," Ann. Math. Statist. 35, 341-349 (1964).
[2] "Markovian Sequential Control Processes - Denumerable State Space," J. Math. Anal. and Appl. 10, No. 2, 303-318 (1965).

Doob, J. L.
[1] Stochastic Processes, Wiley, New York, 1953.
[2] "Martingales and One Dimensional Diffusion," Trans. Amer. Math. Soc. 78, 168-208 (1955).

Dreyfus, S. E.
[1] Dynamic Programming and the Calculus of Variations, Academic Press, New York, 1965.

Dynkin, E. B.
[1] Foundations of the Theory of Markov Processes, Springer, Berlin, 1961 (translation of the 1959 Russian monograph).
[2] Markov Processes, Springer, Berlin, 1965 (translation of the 1963 publication of the State Publishing House, Moscow).
[3] "Controlled Random Sequences," Theory of Probability and Its Applications 10, 1-14 (1965).

Feller, W.
[1] Probability Theory and Its Applications, Vol. I, Wiley, New York, 1950.
[2] "Diffusion Processes in One Dimension," Trans. Amer. Math. Soc. 77, 1-31 (1954).

Fillipov, A. F.
[1] "On Certain Questions in the Theory of Optimal Control" (English translation), SIAM J. Control 1, 76-84 (1962).

Fleming, W. H.
[1] "Some Markovian Optimization Problems," J. Math. and Mech. 12, 131-140 (1963).
[2] "Duality and a Priori Estimates in Markovian Optimization Problems," J. Math. Anal. and Appl. 16, 254-279 (1966).

Fleming, W. H., and Nisio, M.
[1] "On the Existence of Optimal Stochastic Controls," J. Math. and Mech. 15, 777-794 (1966).

Florentin, J. J.
[1] "Optimal Control of Continuous-time Markov Stochastic Systems," J. Electronics and Control 10, 473-488 (1961).

Friedman, A.
[1] Partial Differential Equations of Parabolic Type, Prentice Hall, Englewood Cliffs, New Jersey, 1964.

Geiss, G.
[1] "The Analysis and Design of Nonlinear Control Systems via Liapunov's Direct Method," Grumman Aircraft Corp. Rept. RTD-TDR-63-4076 (1964).

Hahn, W.
[1] Theory and Application of Liapunov's Direct Method, Prentice Hall, Englewood Cliffs, New Jersey, 1963.

Howard, R. A.
[1] Dynamic Programming and Markov Processes, Technology Press of M.I.T., 1960.

Infante, E. F.
[1] "Stability Criteria for nth Order, Homogeneous Linear Differential Equations," in Proc. Intern. Symp. on Differential Equations and Dynamical Systems, December, 1965, Academic Press, New York, 1967. To be published.

Infante, E. F., and Weiss, L.
[1] "On the Stability of Systems Defined over a Finite Time Interval," Proc. Natl. Acad. of Sci. 54, 44-48 (1965).
[2] "Finite Time Stability under Perturbing Forces and on Product Spaces," in Proc. Intern. Symp. on Differential Equations and Dynamical Systems, December, 1965, Academic Press, New York, 1967. To be published.

Ingwerson, D. R.
[1] "A Modified Lyapunov Method for Nonlinear Stability Analysis," Trans. IEEE-AC AC-6, 199-210 (1961).

Itô, K.
[1] "On Stochastic Differential Equations," Memoirs, Amer. Math. Soc. No. 4 (1951).
[2] Lectures on Stochastic Processes, Tata Institute, Bombay, 1961.

Johnson, G. W.
[1] "Synthesis of Control Systems with Stability Constraints via the Second Method of Liapunov," Trans. IEEE-AC AC-9, 380-385 (1964).

Kalman, R. E.
[1] "The Theory of Optimal Control and the Calculus of Variations," in Mathematical Optimization Techniques (R. Bellman, ed.), pp. 309-332, Univ. of California Press, Berkeley, 1963.

Kalman, R. E., and Bertram, J. E.
[1] "Control System Analysis and Design via the Second Method of Liapunov," J. Basic Eng. (ASME) 82, 371-393 (1960).

Kalman, R. E., and Bucy, R. S.
[1] "New Results in Linear Filtering and Prediction Theory," J. Basic Eng. (ASME) 83D, 95-108 (1961).

Kats, I. I.
[1] "On the Stability of Stochastic Systems in the Large," J. Appl. Math. and Mech. (PMM) 28 (1964).

Kats, I. I., and Krasovskii, N. N.
[1] "On the Stability of Systems with Random Disturbances," J. Appl. Math. and Mech. (PMM) 24 (1960).

Khas'minskii, R. Z.
[1] "Diffusion Processes and Elliptic Differential Equations Degenerating at the Boundary of a Domain," Theory of Probability and Its Applications 3, 400-419 (1958).
[2] "Ergodic Properties of Recurrent Diffusion Processes and Stabilization of the Solution to the Cauchy Problem for Parabolic Equations," Theory of Probability and Its Applications 5, 179-196 (1960).
[3] "On the Stability of the Trajectories of Markov Processes," J. Appl. Math. and Mech. (PMM) 26, 1554-1565 (1962).
[4] "The Behavior of a Conservative System under the Action of Slight Friction and Slight Random Noise," J. Appl. Math. and Mech. (PMM) 28, 1126-1130 (1964).

Kozin, F.
[1] "On Almost Sure Stability of Linear Systems with Random Coefficients," J. Math. and Phys. 43, 59-67 (1963).
[2] "On Almost Sure Asymptotic Sample Properties of Diffusion Processes Defined by Stochastic Differential Equations," J. Math. Kyoto Univ. 4, 515-528 (1965).
[3] "On Relations between Moment Properties and Almost Sure Lyapunov Stability for Linear Stochastic Systems," J. Math. Anal. and Appl. 10, 342-353 (1965).

Krasovskii, N. N.
[1] Stability of Motion, Stanford Univ. Press, Stanford, California, 1963 (translation of the 1959 Russian book).

Kushner, H. J.
[1] "On the Differential Equations Satisfied by Conditional Probability Densities of Markov Processes, with Applications," SIAM J. Control 2, 106-119 (1964).
[2] "Some Problems and Some Recent Results in Stochastic Control," IEEE Intern. Conv. Rec. Pt. 6, 108-116 (1965).
[3] "On the Stability of Stochastic Dynamical Systems," Proc. Natl. Acad. Sci. 53, 8-12 (1965).
[4] "On the Theory of Stochastic Stability," Rept. 65-1, Center for Dynamical Systems, Brown Univ., Providence, Rhode Island (1965); see also Advan. Control Systems 4, and Joint Automatic Control Conf., 1965.
[5] "Stochastic Stability and the Design of Feedback Controls," Rept. 65-5, Center for Dynamical Systems, Brown Univ., Providence, Rhode Island (May, 1965); Proc. Polytechnic Inst. of Brooklyn Symp. on Systems, 1965.
[6] "On the Construction of Stochastic Liapunov Functions," Trans. IEEE-AC AC-10, 477-478 (1965).
[7] "On the Existence of Optimal Stochastic Controls," SIAM J. Control 3, 463-474 (1966).
[8] "Sufficient Conditions for the Optimality of a Stochastic Control," SIAM J. Control 3, 499-508 (1966).
[9] "A Note on the Maximum Sample Excursions of Stochastic Approximation Processes," Ann. Math. Statist. 37, 513-516 (1966).
[10] "On the Status of Optimal Control and Stability for Stochastic Systems," IEEE Intern. Conv. Rec. Pt. 6, 143-151 (1966).
[11] "Dynamical Equations for Optimal Non-linear Filtering," J. Differential Equations 3, No. 2 (1967), to appear.

LaSalle, J. P.
[1] "Stability and Control," SIAM J. Control 1, 3-15 (1962).

LaSalle, J. P., and Lefschetz, S.
[1] Stability by Liapunov's Direct Method, with Applications, Academic Press, New York, 1961.

Lee, E. B., and Marcus, L.
[1] "Optimal Control for Nonlinear Processes," Arch. Ratl. Mech. Anal. 8, 36-58 (1961).

Lefschetz, S.
[1] Stability of Nonlinear Control Systems, Academic Press, New York, 1964.

Loeve, M.
[1] Probability Theory, 3rd ed., Van Nostrand, Princeton, New Jersey, 1963.

Nahi, N. E.
[1] "On the Design of Optimal Systems via the Second Method of Liapunov," Trans. IEEE-AC AC-9, 214-275 (1964).

Rabotnikov, I. L.
[1] "On the Impossibility of Stabilizing a System in the Mean-Square by Random Perturbations of Its Parameters," J. Appl. Math. and Mech. (PMM) 28, 1131-1136 (1964).

Rekasius, Z. V.
[1] "Suboptimal Design of Intentionally Nonlinear Controllers," Trans. IEEE-AC AC-9, 380-385 (1964).

Rice, S. O.
[1] "Mathematical Analysis of Random Noise," Bell System Tech. J. 23, 282-332 (1944); 24, 46-156 (1945). Reprinted in Wax, N., Selected Papers on Noise and Stochastic Processes, Dover, New York, 1954.

Roxin, E.
[1] "The Existence of Optimal Controls," Michigan Math. J. 9, 109-119 (1962).

Ruina, J. P., and Van Valkenburg, M. E.
[1] "Stochastic Analysis of Automatic Tracking Systems," Proc. 1st IFAC Conf., Moscow 1, 810-815 (1960).

Skorokhod, A. V.
[1] Studies in the Theory of Random Processes, Addison-Wesley, Reading, Massachusetts, 1965 (translated from the 1961 Russian edition).

Wong, E., and Zakai, M.
[1] "On the Convergence of Ordinary Integrals to Stochastic Integrals," Ann. Math. Statist. 36, 1560-1564 (1965).
[2] "The Oscillation of Stochastic Integrals," Z. Wahrscheinlichkeitstheorie 4, No. 2, 103-112 (1965).

Wonham, W. M.
[1] "Stochastic Problems in Optimal Control," IEEE Intern. Conv. Rec. Pt. 4 (1963).
[2] "A Liapunov Criteria for Weak Stochastic Stability," J. Differential Equations 2, 195-207 (1966).
[3] "A Lyapunov Method for the Estimation of Statistical Averages," J. Differential Equations 2, 365-377 (1966).

Yoshizawa, T.
[1] "Stability and Boundedness of Systems," Arch. Ratl. Mech. and Anal. 6, 409-421 (1960).

Zakai, M.
[1] "On the First Order Probability Distribution of the van der Pol Oscillator Output," J. Electronics and Control 14, 381-388 (1963).
[2] "The Effect of Background Noise on the Operation of Oscillators," Intern. J. Electronics, to appear.

Zubov, V. I.
[1] "Methods of A. M. Liapunov and Their Application," Mat. Leningrad Univ. (1957) (English transl.: Noordhoff, Groningen, 1964).

AUTHOR INDEX

Numbers in italics indicate the page on which the complete reference is listed.

Aiserman, M. A., 28, 153
Astrom, K. J., 57, 153
Athans, M., 104, 138, 153
Bellman, R., 70, 103, 141, 153
Bertram, J. E., 35, 144, 153, 155
Bharucha-Reid, A. T., 3, 153
Blackwell, D., 141, 153
Bucy, R. S., 36, 134, 153, 155
Caughy, T. K., 36, 153
Chung, K. L., 3, 153
Derman, C., 105, 141, 153
Doob, J. L., 3, 14, 18, 19, 25, 115, 154
Dreyfus, S. E., 104, 141, 153, 154
Dynkin, E. B., 3, 5, 6, 7, 8, 9, 10, 11, 14, 16, 17, 18, 23, 37, 41, 52, 90, 113, 148, 154
Falb, P. L., 104, 138, 153
Feller, W., 3, 60, 154
Fillipov, A. F., 106, 154
Fleming, W. H., 105, 106, 126, 149, 154
Florentin, J. J., 130, 154
Friedman, A., 90, 154
Gantmacher, F. R., 28, 153
Geiss, G., 72, 144, 154
Gray, A. H., Jr., 36, 153
Hahn, W., 28, 52, 154
Howard, R. A., 141, 155
Infante, E. F., 12, 79, 155
Ingwerson, D. R., 155
Itô, K., 5, 13, 16, 18, 19, 155
Johnson, G. W., 144, 155
Kalman, R. E., 104, 134, 144, 155
Kats, I. I., 35, 36, 52, 155
Khas'minskii, R. Z., 36, 45, 60, 115, 155, 156
Kozin, F., 36, 156
Krasovskii, N. N., 28, 35, 52, 155, 156
Kushner, H. J., 36, 52, 79, 106, 134, 144, 150, 156, 157
LaSalle, J. P., 28, 63, 66, 94, 157
Lee, E. B., 106, 157
Lefschetz, S., 28, 63, 66, 94, 157
Loeve, M., 3, 5, 7, 25, 28, 157
Marcus, L., 106, 157
Nahi, N. E., 144, 157
Nisio, M., 106, 154
Rabotnikov, I. L., 36, 157
Rekasius, Z. V., 72, 144, 157
Rice, S. O., 77, 157
Roxin, E., 106, 157
Ruina, J. P., 77, 157
Sarachik, P. E., 35, 153
Skorokhod, A. V., 14, 16, 19, 157
Van Valkenburg, M. E., 77, 157
Weiss, L., 79, 155
Wong, E., 12, 58, 158
Wonham, W. M., 36, 158
Yoshizawa, T., 33, 55, 158
Zakai, M., 12, 58, 69, 158
Zubov, V. I., 72, 158


SUBJECT INDEX

Asymptotic sets, 39, 42, 54
Converse theorem, 52
Design of controls, 143ff.
  to decrease cost, 147
  with given stability properties, 144
  Liapunov function approach to, 150
Differential generator, 15
Dynamic programming, 103
Dynkin's formula, 10
ε-Neighborhood, 39
Equiultimate boundedness w.p.1., 32
Equistability w.p.1., 33, 55
Feller process, 8
Finite time stability, 77ff.
First exit times, 77ff.
  discrete parameter, 86
  strong diffusion process, 89
Hamilton-Jacobi equation, 108
Hill climbing estimates, 91
Instability w.p.1., 32
Itô stochastic differential equation, 12
  differential generator, 15
  stopped process, 17
  strong Markov property, 15
Itô's lemma, 16
Killing time, 3
Lagrange stability, 53
Liapunov function, 23
  stochastic, 34, 55ff.
    construction, 72
    degenerate at origin, 49
    Itô process, 43, 55ff.
    moment estimates, 50
    nonhomogeneous process, 41
    Poisson differential equation, 46, 69
    strong diffusion process, 44
Markov process
  continuous parameter, 3
  discrete parameter, 1, 7
  strong, 4
Markov time, 5
Martingale, 25
  super, 25, 31
  probability inequality, 26
Moment estimates, 50
Poisson differential equation, 18, 20, 46, 69
  differential generator, 20, 21
Stability
  definitions, 30
  of origin, w.p.1., 30, 37, 38
  asymptotic w.p.1., 31, 39, 41
  exponential, 32, 47
  with respect to (Q, P, p), 31
Stochastic continuity, 4
  uniform, 4
Stochastic control, 102ff.
  attaining target set, 110ff.
  discrete time, 141
  fixed time, 121
  linear system, 130
  with filtering, 133
  nonanticipative controls, 126
  norm invariant system, 138
  practical optimality, 128
  strong diffusion process, 124
  sufficient condition for optimality, 110ff.
Stochastic differential equation, see Itô or Poisson differential equation
Stopped process, 11
Strong diffusion process, 22, 44, 89, 124
Ultimate boundedness w.p.1., 32
Uniform integrability, sufficient condition for, 115
Weak infinitesimal operator, 9
