
CHAPTER 1

Martingale Theory

We review basic facts from martingale theory. We start with discrete-time parameter martingales and proceed to explain what modifications are needed in order to extend the results from discrete time to continuous time. The Doob-Meyer decomposition theorem for continuous semimartingales is stated, but the proof is omitted. At the end of the chapter we discuss the quadratic variation process of a local martingale, a key concept in stochastic analysis based on martingale theory.

    1. Conditional expectation and conditional probability

In this section, we review basic properties of conditional expectation. Let (Ω, F , P) be a probability space and G a σ-algebra of measurable events contained in F . Suppose that X ∈ L1(Ω, F , P), i.e., X is an integrable random variable. There exists a random variable Y, unique up to a set of probability zero, which has the following two properties:

(1) Y ∈ L1(Ω, G , P), i.e., Y is measurable with respect to the σ-algebra G and is integrable;

(2) for any C ∈ G , we have

E {X; C} = E {Y; C} .

This random variable Y is called the conditional expectation of X with respect to G and is denoted by E {X|G }.

The existence and uniqueness of the conditional expectation is an easy consequence of the Radon-Nikodym theorem in real analysis. Define two measures on (Ω, G ) by

µ {C} = E {X; C} ,  ν {C} = P {C} ,  C ∈ G .

It is clear that µ is absolutely continuous with respect to ν. The conditional expectation E {X|G } is precisely the Radon-Nikodym derivative dµ/dν.

If Y is another random variable, then we denote E {X|σ(Y)} simply by E {X|Y}. Here σ(Y) is the σ-algebra generated by Y.

    The conditional probability of an event A is defined by

P {A|G } = E {IA|G } .

The following two examples are helpful.

EXAMPLE 1.1. Suppose that the σ-algebra G is generated by a partition:

Ω = ∪_{i=1}^∞ Ai,  Ai ∩ Aj = ∅ if i ≠ j,

and P {Ai} > 0 for every i. Then E {X|G } is constant on each Ai and is equal to the average of X on Ai, i.e.,

E {X|G } (ω) = (1/P {Ai}) ∫_{Ai} X dP,  ω ∈ Ai.
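The partition picture is easy to experiment with numerically. The following sketch (our illustration, not part of the original notes; the choice of X and of the partition is arbitrary) computes E {X|G } by averaging over each cell and checks the defining property (2):

```python
import numpy as np

rng = np.random.default_rng(0)

# X = f(U) with U uniform on [0, 1); G is generated by the partition
# A_i = {i/4 <= U < (i+1)/4}, i = 0, 1, 2, 3.
u = rng.random(100_000)
x = np.sin(2 * np.pi * u)           # an arbitrary integrable X
cell = np.floor(4 * u).astype(int)  # which A_i each sample falls in

# E{X|G} is constant on each A_i: the average of X over A_i.
cond_exp = np.array([x[cell == i].mean() for i in range(4)])
y = cond_exp[cell]                  # Y = E{X|G} as a random variable

# Defining property (2): E{X; C} = E{Y; C} for C in G, e.g. C = A_0 ∪ A_1.
c = cell <= 1
print(np.mean(x * c), np.mean(y * c))  # approximately equal
```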

EXAMPLE 1.2. The conditional expectation E {X|Y} is measurable with respect to σ(Y). By a well-known result, it must be a Borel function f (Y) of Y. We usually write symbolically

f (y) = E {X|Y = y} .

Suppose that (X, Y) has a joint density function p(x, y) on R². Then we can take

f (y) = ∫_R x p(x, y) dx / ∫_R p(x, y) dx.

    The following three properties of conditional expectation are often used.

(1) If X ∈ G (i.e., X is G -measurable), then E {X|G } = X; more generally, if X ∈ G and XY is integrable, then

E {XY|G } = X E {Y|G } .

(2) If G1 ⊆ G2, then

E {X|G1|G2} = E {X|G2|G1} = E {X|G1} .

(3) If X is independent of G , then

E {X|G } = E {X} .

The monotone convergence theorem, the dominated convergence theorem, and Fatou's lemma are three basic convergence theorems in Lebesgue integration theory. They still hold when the usual expectation is replaced by the conditional expectation with respect to an arbitrary σ-algebra.

    2. Martingales

A sequence F∗ = {Fn, n ∈ Z+} of increasing σ-algebras on Ω is called a filtration (of σ-algebras). The quadruple (Ω, F∗, F , P) with Fn ⊆ F is a filtered probability space. We usually assume that

F = F∞ := ∨_{n=0}^∞ Fn,

the smallest σ-algebra containing all Fn. Intuitively Fn represents the information of an evolving random system under consideration accumulated up to time n.


A sequence of random variables X = {Xn} is said to be adapted to the filtration F∗ if Xn is measurable with respect to Fn for all n. The filtration F^X_∗ generated by the sequence X is given by

F^X_n = σ {Xi, i ≤ n} .

It is the smallest filtration to which the sequence X is adapted.

DEFINITION 2.1. A sequence of integrable random variables

X = {Xn, n ∈ Z+}

on a filtered probability space (Ω, F∗, P) is called a martingale with respect to F∗ if X is adapted to F∗ and

E {Xn|Fn−1} = Xn−1

for all n ≥ 1. It is called a submartingale or supermartingale if in the above relation = is replaced by ≥ or ≤, respectively.

If the reference filtration is not explicitly mentioned, it is usually understood from the context. In many situations, we simply take the reference filtration to be the one generated by the sequence itself.

REMARK 2.2. If X is a submartingale with respect to some filtration F∗, then it is also a submartingale with respect to its own filtration F^X_∗.

The definition of a supermartingale is rather unfortunate, for EXn ≤ EXn−1, that is, the sequence of expected values is decreasing.

Intuitively a martingale represents a fair game. The defining property of a martingale can be written as

E {Xn − Xn−1|Fn−1} = 0.

We can regard the difference Xn − Xn−1 as a gambler's gain at the nth play of a game. The above equation says that even after applying all the information and knowledge he has accumulated up to time n − 1, his expected gain is still zero.

EXAMPLE 2.3. If {Xn} is a sequence of independent and integrable random variables with mean zero, then the partial sums

Sn = X1 + X2 + · · · + Xn

form a martingale.

EXAMPLE 2.4. (1) If {Xn} is a martingale and f : R → R is a convex function such that each f (Xn) is integrable, then { f (Xn)} is a submartingale. (2) If {Xn} is a submartingale and f : R → R is a convex and increasing function such that each f (Xn) is integrable, then { f (Xn)} is a submartingale.

EXAMPLE 2.5. (martingale transform) Let M be a martingale and Z = {Zn} an adapted sequence. Suppose further that each Zn is uniformly bounded. Define

Nn = ∑_{i=1}^n Z_{i−1}(Mi − M_{i−1}).

Then N is a martingale. Note that in the general summand, the multiplicative factor Z_{i−1} is measurable with respect to F_{i−1}, the σ-algebra at the left time point of the martingale difference Mi − M_{i−1}.
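Because the transform models a gambling strategy, a quick simulation makes the fairness statement tangible. The following is a minimal sketch of ours (the walk and the strategy are arbitrary choices, not from the notes), checking that the transformed process N still has mean zero:

```python
import numpy as np

rng = np.random.default_rng(1)

# The martingale M: a symmetric +/-1 random walk, M_0 = 0.
n_paths, n_steps = 200_000, 50
dM = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
M = np.cumsum(dM, axis=1)
M_prev = np.hstack([np.zeros((n_paths, 1)), M[:, :-1]])  # M_{i-1}

# A bounded adapted strategy: Z_{i-1} is a function of M_{i-1} only
# (double the stake whenever the walk is below zero).
Z = np.where(M_prev < 0, 2.0, 1.0)

# Martingale transform N_n = sum_{i<=n} Z_{i-1} (M_i - M_{i-1}).
N = np.cumsum(Z * dM, axis=1)

# E N_n should vanish for every n: a fair game stays fair.
print(N.mean(axis=0)[[0, 9, 49]])  # all approximately zero
```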

EXAMPLE 2.6. (Reverse martingale) Suppose that {Xn} is a sequence of i.i.d. integrable random variables, Sn = X1 + X2 + · · · + Xn, and

Zn = (X1 + X2 + · · · + Xn)/n.

Then {Zn} is a reverse martingale, which means that

E {Zn|Gn+1} = Zn+1,

where

Gn = σ {Si, i ≥ n} .

If we write Wn = Z−n and Hn = G−n for n = −1, −2, . . . , then we can write

E {Wn|Hn−1} = Wn−1.

    3. Basic properties

Suppose that X = {Xn} is a submartingale. We have EXn ≤ EXn+1. Thus on average a submartingale is increasing. On the other hand, for a martingale we have EXn = EXn+1, which shows that it is purely noise. The Doob decomposition theorem claims that a submartingale can be decomposed uniquely into the sum of a martingale and an increasing sequence. The following example shows that the uniqueness question for the decomposition is not an entirely trivial matter.

EXAMPLE 3.1. Consider Sn, the sum of a sequence of independent and square integrable random variables with mean zero. Then {Sn²} is a submartingale. We have obviously

Sn² = (Sn² − ESn²) + ESn².

From ESn² = ∑_{i=1}^n EXi² it is easy to verify that Mn = Sn² − ESn² is a martingale. Therefore the above identity is a decomposition of the submartingale {Sn²} into the sum of a martingale and an increasing process. On the other hand,

Sn² = 2 ∑_{i=1}^n S_{i−1}Xi + ∑_{i=1}^n Xi².

The first sum on the right side is a martingale (in the form of a martingale transform). Thus the above gives another such decomposition. In general these two decompositions are different.

The above example shows that in order to have a unique decomposition we need further restrictions.

DEFINITION 3.2. A sequence Z = {Zn} is said to be predictable with respect to a filtration F∗ if Zn ∈ Fn−1 for all n ≥ 1.


THEOREM 3.3. (Doob decomposition) Let X be a submartingale. Then there is a unique increasing predictable process Z with Z0 = 0 and a martingale M such that

Xn = Mn + Zn.

PROOF. Suppose that we have such a decomposition. Conditioning on Fn−1 in

Zn − Zn−1 = Xn − Xn−1 − (Mn − Mn−1),

we have

Zn − Zn−1 = E {Xn − Xn−1|Fn−1} .

The right side is nonnegative because X is a submartingale. This shows that a Doob decomposition, if it exists, must be unique. It is now clear how to proceed to show the existence. We define

Zn = ∑_{i=1}^n E {Xi − Xi−1|Fi−1} .

It is clear that Z is increasing, predictable, and Z0 = 0. Define Mn = Xn − Zn. We have

Mn − Mn−1 = Xn − Xn−1 − E {Xn − Xn−1|Fn−1} ,

from which it is easy to see that E {Mn − Mn−1|Fn−1} = 0. This shows that M is a martingale. □
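For a concrete instance, let ξ1, ξ2, . . . be independent ±1 coin flips, Sn = ξ1 + · · · + ξn, and take the submartingale Xn = Sn². Then E {Xn − Xn−1|Fn−1} = E ξn² = 1, so the predictable part is Zn = n and the martingale part is Mn = Sn² − n. The following simulation sketch (our illustration; all parameters are arbitrary) checks that Mn has constant mean zero:

```python
import numpy as np

rng = np.random.default_rng(2)

# Submartingale X_n = S_n^2 for a +/-1 random walk S_n.
n_paths, n_steps = 100_000, 30
xi = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
S = np.cumsum(xi, axis=1)
X = S**2

# Doob decomposition: Z_n = sum_{i<=n} E{X_i - X_{i-1} | F_{i-1}} = n here,
# since E{2 S_{i-1} xi_i + xi_i^2 | F_{i-1}} = E xi_i^2 = 1.
Z = np.arange(1, n_steps + 1)
M = X - Z  # the martingale part

print(M.mean(axis=0)[[0, 9, 29]])  # all approximately 0 = E M_n
```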

    4. Optional sampling theorem

The concept of a martingale derives much of its power from the optional sampling theorem we will discuss in this section.

Most interesting events concerning a random sequence occur not at a fixed constant time, but at a random time. The first time the sequence reaches above a given level is a typical example.

DEFINITION 4.1. A function τ : Ω → Z+ on a filtered measurable space (Ω, F∗) is called a stopping time if {τ ≤ n} ∈ Fn for all n ≥ 0. Equivalently, {τ = n} ∈ Fn for all n ≥ 0.

Let X be a random sequence and τ the first time such that Xn > 100. The event {τ ≤ n} means that the sequence has reached above 100 before or at time n. We can determine whether {τ ≤ n} is true or false by looking at the sequence X1, X2, . . . , Xn; we do not need to look at the sequence beyond time n. Therefore τ is a stopping time. On the other hand, let σ be the last time that Xn > 100. Knowing only the first n terms of the sequence will not determine the event {σ ≤ n}. We need to look beyond time n. Thus σ is not a stopping time.

EXAMPLE 4.2. Many years ago I was invited to give a talk by a fellow probabilist working in a town I had never been to. When asking for directions to the mathematics department, I was instructed to turn at the last traffic light on a street.


EXAMPLE 4.3. Let σn be stopping times. Then σ1 + σ2, sup_n σn, and inf_n σn are all stopping times.

We have mentioned that Fn should be regarded as the information accumulated up to time n. We need the corresponding concept for a stopping time τ. Define

Fτ = {C ∈ F∞ : C ∩ {τ ≤ n} ∈ Fn for all n ≥ 0} .

It is easy to show that Fτ is a σ-algebra. But more importantly, we have the following facts:

(1) Fτ = Fn if τ = n;
(2) if σ ≤ τ, then Fσ ⊆ Fτ;
(3) τ is Fτ-measurable;
(4) if {Xn} is adapted to F∗, then Xτ is Fτ-measurable.

Note that the random variable Xτ is defined as Xτ(ω) = X_{τ(ω)}(ω). We say that a stopping time τ is bounded if there is an integer N such that P {τ ≤ N} = 1. If X is a sequence of integrable random variables and τ a bounded stopping time, then Xτ is also integrable. Keep this technical point in mind when studying the following optional sampling theorem of Doob.

THEOREM 4.4. Let X be a submartingale and let σ and τ be two bounded stopping times such that σ ≤ τ. Then

E {Xτ|Fσ} ≥ Xσ.

PROOF. Let C ∈ Fσ. It is enough to show that

E {Xτ; C} ≥ E {Xσ; C} .

This is implied by

(4.1) E {Xτ; Cn} ≥ E {Xn; Cn} ,

where Cn = C ∩ {σ = n}.

For k ≥ n, we have obviously

E {Xτ; Cn ∩ {τ = k}} = E {Xk; Cn ∩ {τ ≥ k}} − E {Xk; Cn ∩ {τ ≥ k + 1}} .

In the last term, the random variable Xk is integrated on the set Cn ∩ {τ ≥ k + 1}. From k ≥ n we have Cn ∈ Fn ⊆ Fk, hence

Cn ∩ {τ ≥ k + 1} = Cn ∩ {τ ≤ k}ᶜ ∈ Fk.

Using this and the fact that X is a submartingale, we have

E {Xk; Cn ∩ {τ ≥ k + 1}} ≤ E {Xk+1; Cn ∩ {τ ≥ k + 1}} .

It follows that

E {Xτ; Cn ∩ {τ = k}} ≥ E {Xk; Cn ∩ {τ ≥ k}} − E {Xk+1; Cn ∩ {τ ≥ k + 1}} .

We now sum over k ≥ n. On the right side the sum telescopes, and only the first term survives because τ is bounded. We also have Cn ∩ {τ ≥ n} = Cn because σ ≤ τ. It follows that E {Xτ; Cn} ≥ E {Xn; Cn}. □

COROLLARY 4.5. Let X be a submartingale and let σ ≤ τ be two bounded stopping times. Then E Xσ ≤ E Xτ.

COROLLARY 4.6. (optional sampling theorem) Let X be a submartingale. Let τ0 ≤ τ1 ≤ τ2 ≤ · · · be an increasing sequence of bounded stopping times. Then {Xτn} is a submartingale with respect to the filtration {Fτn}.

COROLLARY 4.7. Let X be a submartingale with respect to a filtration F∗ and τ a stopping time. Then the stopped process {Xn∧τ} is a submartingale with respect to the original filtration F∗.
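The optional sampling theorem is easy to probe by simulation. The sketch below (our own; the level 3 and the time caps are arbitrary) builds two bounded stopping times σ ≤ τ for the submartingale Sn² and checks E Xσ ≤ E Xτ, as in COROLLARY 4.5:

```python
import numpy as np

rng = np.random.default_rng(3)

# Submartingale X_n = S_n^2; two bounded stopping times sigma <= tau.
n_paths, N = 200_000, 20
S = np.cumsum(rng.choice([-1, 1], size=(n_paths, N)), axis=1)

# tau = first time |S_n| >= 3, capped at time N (indices 0..N-1).
hit = np.abs(S) >= 3
tau = np.where(hit.any(axis=1), hit.argmax(axis=1), N - 1)
sigma = np.minimum(tau, 9)  # sigma <= tau, both bounded

rows = np.arange(n_paths)
print((S[rows, sigma] ** 2).mean(), (S[rows, tau] ** 2).mean())
# first value <= second value, as the corollary predicts
```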

    5. Submartingale inequalities

The Lp-norm of a random variable X is

‖X‖p = {E|X|^p}^{1/p} .

The set of random variables X such that ‖X‖p < ∞ is denoted by Lp(Ω, F , P) (or an abbreviated variant of it).

Let X∗n = max_{1≤i≤n} Xi. We first show that the tail probability of X∗n is controlled in some sense by the tail probability of the last element Xn.

LEMMA 5.1. (Doob's submartingale inequality) Let X be a nonnegative submartingale. Then for any λ > 0 we have

P {X∗n ≥ λ} ≤ (1/λ) E {Xn; X∗n ≥ λ} .

PROOF. Let τ = inf {i : Xi ≥ λ} with the convention that inf ∅ = n + 1. It is clear that τ is a stopping time; hence {τ ≤ n} ∈ Fτ and by the optional sampling theorem we have

E {Xn; τ ≤ n} ≥ E {Xτ; τ ≤ n} ≥ λ P {τ ≤ n} .

In the last step we have used the fact that Xτ ≥ λ if τ ≤ n. From the above inequality and the fact that {X∗n ≥ λ} = {τ ≤ n} we have the desired inequality. □

    COROLLARY 5.2. Let X be a nonnegative submartingale. Then

P {X∗n ≥ λ} ≤ EXn / λ.

EXAMPLE 5.3. The classical Kolmogorov inequality is a special case of Doob's submartingale inequality. Let {Xn, 1 ≤ n ≤ N} be a sequence of independent, square integrable random variables with mean zero and Sn = X1 + · · · + Xn. Then {|Sn|²} is a nonnegative submartingale and we have

P { max_{1≤n≤N} |Sn| ≥ λ } ≤ E|SN|² / λ².
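As a sanity check, both sides of Kolmogorov's inequality can be estimated by Monte Carlo. Here is a minimal sketch of ours (the walk and all parameters are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)

# Both sides of Kolmogorov's inequality for a +/-1 random walk.
n_paths, N, lam = 100_000, 100, 25.0
S = np.cumsum(rng.choice([-1.0, 1.0], size=(n_paths, N)), axis=1)

lhs = np.mean(np.abs(S).max(axis=1) >= lam)  # P{max_n |S_n| >= lam}
rhs = np.mean(S[:, -1] ** 2) / lam**2        # E|S_N|^2 / lam^2
print(lhs, rhs)                              # lhs <= rhs
```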


To convert the above inequality about probability to an inequality about moments, we need the following fact about nonnegative random variables.

PROPOSITION 5.4. Let X and Y be two nonnegative random variables. Suppose that they satisfy

P {Y ≥ λ} ≤ (1/λ) E {X; Y ≥ λ}

for all λ > 0. Then for any p > 1,

‖Y‖p ≤ (p/(p − 1)) ‖X‖p.

For the case p = 1 we have

‖Y‖1 ≤ (e/(e − 1)) ‖1 + X ln⁺ X‖1.

Here ln⁺ x = max {ln x, 0} for x ≥ 0.

PROOF. We need to truncate the random variable Y in order to deal with the possibility that the moments of Y may be infinite; if they were known to be finite, the truncation would be unnecessary and the proof easier to read. Let YN = min {Y, N}. For p > 0, we have by Fubini's theorem,

E YN^p = p E ∫_0^{YN} λ^{p−1} dλ = p E ∫_0^∞ λ^{p−1} I{λ ≤ YN} dλ = p ∫_0^N λ^{p−1} P {Y ≥ λ} dλ.

Now if p > 1, then

E YN^p = p ∫_0^N λ^{p−1} P {Y ≥ λ} dλ ≤ p ∫_0^N λ^{p−2} E {X; Y ≥ λ} dλ = (p/(p − 1)) E {X YN^{p−1}} .

Using Hölder's inequality we have

E YN^p ≤ (p/(p − 1)) ‖X‖p ‖YN‖p^{p−1}.

Since E YN^p is finite, we may divide both sides by ‖YN‖p^{p−1} and obtain

‖YN‖p ≤ (p/(p − 1)) ‖X‖p.

Letting N → ∞ and using the monotone convergence theorem, we obtain the desired inequality. For the proof of the case p = 1, see EXERCISE ??. □

The following moment inequalities for a nonnegative submartingale are often useful.


THEOREM 5.5. Let X be a nonnegative submartingale. Then

‖X∗n‖p ≤ (p/(p − 1)) ‖Xn‖p,  p > 1;

‖X∗n‖1 ≤ (e/(e − 1)) ‖1 + Xn ln⁺ Xn‖1.

Here ln⁺ x = max {ln x, 0}.

PROOF. These inequalities follow immediately from PROPOSITION 5.4 and LEMMA 5.1. □

    6. Convergence theorems

Let X = {Xn} be a submartingale. Fix two real numbers a < b and let U_N^X [a, b] be the number of upcrossings of the sequence X1, X2, . . . , XN from below a to above b. The precise definition of U_N^X [a, b] will be clear from the proof of the next lemma.

LEMMA 6.1. (Upcrossing inequality) We have

E U_N^X [a, b] ≤ (E(XN − a)⁺ − E(X1 − a)⁺) / (b − a).

PROOF. Let Yn = (Xn − a)⁺. Then Y is a nonnegative submartingale. An upcrossing of X from below a to above b is an upcrossing of Y from 0 to b − a and vice versa, so we may consider Y instead of X. Starting from τ0 = 1, we define the following sequence of stopping times:

σi = inf {n ≥ τi−1 : Yn = 0} ,
τi = inf {n ≥ σi : Yn ≥ b − a} ,

with the convention that inf ∅ = N. If Xσi ≤ a and Xτi ≥ b, that is, if [σi, τi] is an interval of a completed upcrossing from below a to above b, then we have Yτi − Yσi ≥ b − a. Even if [σi, τi] is not an interval of a completed upcrossing, we always have Yτi − Yσi ≥ 0. Therefore,

(6.1) ∑_{i=1}^N (Yτi − Yσi) ≥ (b − a) U_N^X [a, b].

On the other hand, since Y is a submartingale and τi−1 ≤ σi, we have

E [Yσi − Yτi−1] ≥ 0.

Now we have

YN − Y1 = ∑_{i=1}^N [Yτi − Yσi] + ∑_{i=1}^N [Yσi − Yτi−1].

Taking the expected value we have

E [YN − Y1] ≥ (b − a) E U_N^X [a, b]. □
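To make the definition of U_N^X [a, b] concrete, here is a small upcrossing counter together with an empirical check of the inequality for a symmetric random walk (our illustration; the helper function and all parameters are ours, not from the notes):

```python
import numpy as np

def upcrossings(path, a, b):
    """Count completed upcrossings of [a, b] along a finite path."""
    count, below = 0, False
    for x in path:
        if x <= a:
            below = True       # the path has gone (weakly) below a
        elif x >= b and below:
            count += 1         # an upcrossing from below a to above b
            below = False
    return count

rng = np.random.default_rng(5)
n_paths, N, a, b = 20_000, 200, -1.0, 1.0
S = np.cumsum(rng.choice([-1.0, 1.0], size=(n_paths, N)), axis=1)

lhs = np.mean([upcrossings(p, a, b) for p in S])       # E U_N^X[a, b]
rhs = (np.maximum(S[:, -1] - a, 0).mean()
       - np.maximum(S[:, 0] - a, 0).mean()) / (b - a)  # upper bound
print(lhs, rhs)                                        # lhs <= rhs
```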


    The next result is the basic convergence theorem in martingale theory.

THEOREM 6.2. Let {Xn} be a submartingale such that sup_{n≥1} E Xn⁺ ≤ C for some constant C. Then the sequence converges with probability one to an integrable random variable X∞.

PROOF. We use the following easy fact from analysis: a sequence x = {xn} of real numbers converges to a finite number or ±∞ if and only if the number of upcrossings of [a, b] is finite for every pair of rational numbers a < b. Thus it is enough to show that

P { U_∞^X [a, b] < ∞ for all rational a < b } = 1.

Since the set of intervals with rational endpoints is countable, we only need to show that

P { U_∞^X [a, b] < ∞ } = 1

for fixed a < b. We will show the stronger statement that E U_∞^X [a, b] < ∞. From the upcrossing inequality we have

E U_N^X [a, b] ≤ (E(XN − a)⁺ − E(X1 − a)⁺) / (b − a)

for all N. From

(XN − a)⁺ ≤ XN⁺ + |a|

we have

E U_N^X [a, b] ≤ (C + |a|) / (b − a).

We have U_N^X [a, b] ↑ U_∞^X [a, b] as N ↑ ∞, hence by the monotone convergence theorem, E U_N^X [a, b] ↑ E U_∞^X [a, b]. It follows that

E U_∞^X [a, b] ≤ (C + |a|) / (b − a) < ∞.

It follows that the limit X∞ = lim_{n→∞} Xn exists with probability one. From |Xn| = 2Xn⁺ − Xn,

E|Xn| = E [2Xn⁺ − Xn] ≤ 2 E Xn⁺ − E X1 ≤ 2C − E X1.

By Fatou's lemma we have E|X∞| ≤ 2C − E X1, which shows that X∞ is integrable. □

COROLLARY 6.3. Every nonnegative supermartingale converges to an integrable random variable with probability one.

PROOF. If X is a nonnegative supermartingale, then −X is a nonpositive submartingale. Now apply the basic convergence THEOREM 6.2. □


    7. Uniformly integrable martingales

The basic convergence THEOREM 6.2 does not claim that EXn → EX∞ or E|Xn − X∞| → 0, and indeed this does not hold in general. In many applications it is desirable to show that the convergence is in the sense of L1. To claim convergence in this sense, we need more conditions. One such condition that is used very often is uniform integrability. We recall this concept and the relevant convergence theorem.

DEFINITION 7.1. A sequence {Xn} is called uniformly integrable if

lim_{C→∞} sup_n E {|Xn|; |Xn| ≥ C} = 0.

Equivalently, for any ε > 0, there is a constant C such that

E {|Xn|; |Xn| ≥ C} ≤ ε

for all n; i.e., the sequence {Xn} has uniformly small L1-tails.

The following equivalent condition justifies the terminology uniform integrability from another point of view.

PROPOSITION 7.2. A sequence {Xn} is uniformly integrable if and only if for any positive ε, there exists a positive δ such that

E {|Xn|; C} ≤ ε

for all n and all sets C such that P {C} ≤ δ.

The concept of uniform integrability is useful mainly because of the following so-called Vitali convergence theorem, which supplements the three more well-known convergence theorems (Fatou, monotone, and dominated).

THEOREM 7.3. If Xn → X in probability and {Xn} is uniformly integrable, then X is integrable and E|Xn − X| → 0.

PROOF. It is clear that uniform integrability implies that E|Xn| is uniformly bounded, say E|Xn| ≤ C. There is a subsequence of {Xn} converging to X with probability 1, hence by Fatou's lemma we have E|X| ≤ C. This shows that X is integrable. Let Yn = |Xn − X|. Then {Yn} is also uniformly integrable and Yn → 0 in probability. We need to show that E Yn → 0. Fix ε > 0 and let Cn = {Yn ≥ ε}. Then P {Cn} → 0. We have

E Yn = E {Yn; Yn < ε} + E {Yn; Yn ≥ ε} .

The first term is bounded by ε, and the second term goes to zero as n → ∞ by PROPOSITION 7.2. This shows that E Yn → 0. □

    The following criterion for uniform integrability is often useful.

PROPOSITION 7.4. If there are p > 1 and C such that E|Xn|^p ≤ C for all n, then {Xn} is uniformly integrable.

PROOF. EXERCISE ?? □


EXAMPLE 7.5. Let X be an integrable random variable and {Fn} a family of σ-algebras. Then the family of random variables Xn = E {X|Fn} is uniformly integrable. This can be seen as follows. We may assume that X is nonnegative, since |Xn| ≤ E {|X| | Fn}. Let Cn = {Xn ≥ C}. Then Cn ∈ Fn. We have

E {Xn; Xn ≥ C} = E {X; Cn} .

Considering the two cases X < K and X ≥ K, we have

E {Xn; Xn ≥ C} ≤ K P {Cn} + E {X; X ≥ K} .

The second term can be made arbitrarily small by taking K sufficiently large. For a fixed K, the first term can be made arbitrarily small if C is sufficiently large, because

P {Cn} ≤ EXn / C = EX / C.

THEOREM 7.6. If {Xn} is a uniformly integrable submartingale, then it converges with probability 1 to an integrable random variable X∞ and E|Xn − X∞| → 0.

PROOF. From uniform integrability we have E|Xn| ≤ C for some constant C; hence the submartingale converges with probability 1. The L1-convergence then follows from THEOREM 7.3. □

The following theorem gives the general form of a uniformly integrable martingale.

THEOREM 7.7. Every uniformly integrable martingale {Xn} has the form Xn = E {X|Fn} for some integrable random variable X.

PROOF. From EXAMPLE 7.5 we see that a martingale of the indicated form is uniformly integrable. On the other hand, if the martingale {Xn} is uniformly integrable, then Xn → X∞ with probability 1 for an integrable random variable X∞, and furthermore E|Xn − X∞| → 0. Taking the limit as m → ∞ in Xn = E {Xm|Fn}, we obtain Xn = E {X∞|Fn}, so we may take X = X∞. □

The following theorem gives the "last term" of the uniformly integrable martingale Xn = E {X|Fn}.

THEOREM 7.8. Suppose that F∗ is a filtration of σ-algebras and that X is an integrable random variable. Then with probability 1,

E {X|Fn} → E {X|F∞} ,

where F∞ = σ {Fn, n ≥ 0}, the smallest σ-algebra containing all Fn, n ≥ 0.

PROOF. Let Xn = E {X|Fn}. Then {Xn, n ≥ 0} is a uniformly integrable martingale (see EXERCISE ?? below). Hence the limit

X∞ = lim_{n→∞} Xn

exists with probability 1 and the convergence is also in L1(Ω, F , P). We need to show that X∞ = E {X|F∞}.

It is clear that X∞ ∈ F∞. Letting m → ∞ in Xn = E {Xm|Fn}, we have

E {X|Fn} = Xn = E {X∞|Fn} .

Now consider the collection G of sets G ∈ F∞ such that

E {X∞; G} = E {X; G} .

If we can show that G = F∞, then by the definition of conditional expectation we have immediately X∞ = E {X|F∞}.

The following two facts are clear:
1. G is a monotone class;
2. G contains every Fn.

Therefore G contains the field ∪_{n≥1} Fn. By the monotone class theorem (see EXERCISE ??), G contains the smallest σ-algebra generated by this field, which is F∞. This shows that G = F∞, and the proof is complete. □

    8. Continuous time parameter martingales

Brownian motion B = {Bt} is the most important continuous time parameter martingale. From it we will derive other examples such as Bt² − t and exp[Bt − t/2]. We will introduce Brownian motion in the next chapter. We assume that the reader has some familiarity with Brownian motion so that the topics of the next three sections are not as dry as they would otherwise be.

The definitions of martingales, submartingales, and supermartingales extend in an obvious way to the case of continuous-time parameters. For example, a real-valued stochastic process X = {Xt} is a submartingale with respect to a filtration F∗ = {Ft} if Xt ∈ L1(Ω, Ft, P) for every t and

E {Xt|Fs} ≥ Xs,  s ≤ t.

It is clear that for any increasing sequence {tn} of times the sequence {Xtn} is a submartingale with respect to {Ftn}. For this reason, the theory of continuous-parameter submartingales is by and large parallel to that of discrete-parameter martingales. Most of the results (except Doob's decomposition theorem) in the previous sections on discrete parameter martingales have their counterparts in continuous parameter martingales, and the proofs there work in the present situation after obvious necessary modifications. However, there are two issues we need to deal with. The first one is a somewhat technical measurability problem. In order to use discrete approximations, we need to make sure that every quantity we deal with is measurable and can be approximated in some sense by the corresponding discrete counterpart. This is not true in general. For example, an expression such as

(8.1) X∗t = sup_{0≤s≤t} Xs


is in general not measurable, hence cannot be approximated this way. The second issue is to find a continuous-time version of the Doob decomposition, which plays a central role in stochastic analysis. In Doob's theorem, we need to use the concept of predictability to ensure uniqueness. This concept does not seem to have a straightforward generalization to the continuous-time situation. We will resolve these issues by restricting our attention to a class of submartingales satisfying some technical conditions. This class turns out to be wide enough for our applications.

The first restriction is on the filtration. Suppose that (Ω, F , P) is a probability space. A set is a null set if it is contained in a set of measure zero. Recall that a σ-algebra G is said to be complete (relative to F ) with respect to the probability measure P if it contains all null sets.

DEFINITION 8.1. A filtration of σ-algebras F∗ = {Ft, t ∈ R+} on a probability space (Ω, F∞, P) is said to satisfy the usual conditions if

(1) F0, hence every Ft, is complete relative to F∞ with respect to P;
(2) it is right-continuous, i.e., Ft = Ft+ for all t ≥ 0, where

Ft+ := ∩_{s>t} Fs.

CONDITION (1) can usually be satisfied by completing every Ft, i.e., by adding subsets of sets of measure zero. For our purpose, the most important example is the filtration generated by Brownian motion. We will show that after completion CONDITION (2) is satisfied in this case. In general these conditions have to be stated as technical assumptions. Unless otherwise stated, from now on we will assume that all filtrations satisfy the usual conditions.

    Our second restriction is on sample paths of stochastic processes.

DEFINITION 8.2. A function φ : R+ → R is called a Skorokhod function (path) if at every point t ∈ R+ it is right continuous, lim_{s↓t} φs = φt, and has a finite left limit φt− = lim_{s↑t} φs. A stochastic process is called regular if almost all of its sample paths are Skorokhod functions. Recall also that two processes X and Y are called versions of each other if P {Xt = Yt} = 1 for every t.


It is clear from the definition that if two processes are versions of each other, then they have the same family of finite-dimensional marginal distributions. For this reason, they are identical for all practical purposes. In stochastic analysis it is rarely necessary to distinguish a stochastic process from a version of it.

The following result shows that having Skorokhod paths is not a very restrictive condition for submartingales.

THEOREM 8.5. (Path regularity of submartingales) Suppose that {Xt, t ≥ 0} is a submartingale with respect to a filtration F∗ which satisfies the usual conditions. If the function t → EXt is right-continuous, then {Xt, t ≥ 0} has a regular version, i.e., there is a version of X almost all of whose paths are Skorokhod.

PROOF. See Meyer [9], pages 95-96. □

Once the process X has Skorokhod paths, a quantity such as X∗t defined in (8.1) becomes measurable because the supremum can be taken over a dense countable set (e.g., the set of rational or dyadic numbers). We give a proof of Doob's submartingale inequality as an example of passing from discrete to continuous-time submartingales under the assumption of sample path regularity.

THEOREM 8.6. (Submartingale inequality) Suppose that X is a nonnegative submartingale with regular sample paths and let X∗T = max_{0≤t≤T} Xt. Then for any λ > 0,

λ P {X∗T ≥ λ} ≤ E {XT; X∗T ≥ λ} .

PROOF. Without loss of generality we assume T = 1. Let

ZN = max_{0≤i≤2^N} X_{i/2^N} .

Since the paths are regular we have ZN ↑ X∗1. For any positive ε, we have for all sufficiently large N,

λ P {X∗1 ≥ λ} ≤ λ P {ZN ≥ λ − ε} ≤ (λ/(λ − ε)) E {X1; X∗1 ≥ λ − ε} .

In the last step we have used the submartingale inequality for discrete submartingales and the fact that X∗1 ≥ ZN. Letting ε ↓ 0, we obtain the desired inequality. □

THEOREM 8.7. (Submartingale convergence theorem) Let X = {Xt} be a regular submartingale. If sup_{t≥0} E Xt⁺ < ∞, then the limit X∞ = lim_{t→∞} Xt exists and is integrable. If {Xt} is uniformly integrable, then E|Xt − X∞| → 0.

PROOF. EXERCISE ??. □

THEOREM 8.8. Under the same conditions as in the preceding theorem, if in addition {Xt, t ≥ 0} is uniformly integrable, then the limit X∞ exists a.s. Furthermore, E|Xt − X∞| → 0 and Xt = E {X∞|Ft}.

PROOF. Same as in the discrete-parameter case, mutatis mutandis. □


    9. Doob–Meyer decomposition theorem

In this section we discuss the Doob-Meyer decomposition theorem, the analogue of Doob's decomposition for continuous-time submartingales. It states, roughly speaking, that a submartingale X can be uniquely written as the sum of a martingale M and an increasing process Z, i.e.,

Xt = Mt + Zt.

This result plays a central role in stochastic analysis. There are several proofs of this theorem and they all start naturally by approximating a continuous-time submartingale by discrete-time submartingales. There are two difficulties we need to overcome. First, we need to show that these discrete-time approximations converge to the given submartingale in an appropriate sense. Second, in view of the fact that in the discrete-time case the uniqueness holds only after we assume that the increasing part of the decomposition is predictable, we need to formulate an appropriate analogue of predictability in the continuous-time case. Neither of these two problems is easy to overcome. A complete proof of this result can be found in Meyer [9]. The proof can be somewhat simplified under the hypothesis of continuous sample paths, and the problem of predictability disappears under the usual conditions for the filtration; see Bass [1]. Fortunately, the Doob-Meyer theorem is one of those results for which we gain very little by going through its proof, and understanding its meaning is quite adequate for our later study. We will restrict ourselves to a form of this decomposition theorem that is sufficient for our purpose.

We always assume that the filtration F∗ = {Ft, t ∈ R+} satisfies the usual conditions. The definition of a martingale requires that each Xt be integrable. A local martingale is a more flexible concept.

DEFINITION 9.1. A continuous, F∗-adapted process {Xt} is called a local martingale if there exists an increasing sequence of stopping times τn ↑ ∞ with probability one such that for each n the stopped process Xτn = {Xt∧τn , t ∈ R+} is an F∗-martingale.

Similarly we can define a local submartingale or a local supermartingale.

EXAMPLE 9.2. Let {Bt} be a Brownian motion in R³ and a ≠ 0. We will show later that

Xt = 1/|Bt + a|

is a local martingale.

REMARK 9.3. A word of warning is in order. For a local martingale X = {Xt}, the condition E|Xt| < ∞ for all t does not guarantee that it is a martingale.

The following fact is often useful.


PROPOSITION 9.4. A nonnegative local martingale {Mt} is a supermartingale. In particular EMt ≤ EM0.

PROOF. Let {τn} be a nondecreasing sequence of stopping times going to infinity such that the stopped processes Mτn are martingales. Then in particular M0 = M_{0∧τn} is integrable by definition. We have

E {Mt∧τn |Fs} = Ms∧τn .

We let n → ∞. On the left side, we use Fatou's lemma (in its conditional form); the limit on the right side is Ms because τn → ∞. Hence

E {Mt|Fs} ≤ Ms.

It follows that every Mt is integrable and {Mt} is a supermartingale. □

If with probability 1 the sample path t → Zt is an increasing function, we call Z an increasing process. If A = Z¹ − Z² for two increasing processes Z¹ and Z², then A is called a process of bounded variation.

REMARK 9.5. We will often drop the cumbersome phrase "with probability 1" when it is obviously understood from the context.

    We are ready to state the important decomposition theorem.

THEOREM 9.6. (Doob-Meyer decomposition) Suppose that the filtration F∗ satisfies the usual conditions. Let X = {Xt, t ∈ R+} be an F∗-adapted local submartingale with continuous sample paths. Then there is a unique continuous F∗-adapted local martingale {Mt, t ∈ R+} and a continuous F∗-adapted increasing process {Zt, t ∈ R+} with Z0 = 0 such that

Xt = Mt + Zt,  t ∈ R+.

PROOF. See Karatzas and Shreve [7], pages 21-30. Bass [1] contains a simple proof for continuous submartingales. □

In the theorem we stated that the martingale part M of a local submartingale is a local martingale. In general we cannot claim that it is a martingale even if X is a submartingale. In many applications, it is important to know that the martingale part of a submartingale is in fact a martingale, so that we can claim the equality

EXt = EMt + EZt.

We now discuss a condition for a local martingale to be a martingale, which can be useful in certain situations.

DEFINITION 9.7. Let ST be the set of stopping times bounded by T. A continuous process X = {Xt} is said to be of (DL)-class if {Xσ, σ ∈ ST} is uniformly integrable for every T ≥ 0.

    This definition is useful for the following reason.

PROPOSITION 9.8. A local martingale is a martingale if and only if it is of (DL)-class.


PROOF. If M is a martingale, then

Mσ = E {MT|Fσ} for all σ ∈ ST.

Hence {Mσ, σ ∈ ST} is uniformly integrable (see EXAMPLE 7.5). Thus M is of (DL)-class.

Conversely, suppose that M is a (DL)-class local martingale and let {τn} be a sequence of stopping times going to infinity such that the stopped processes Mτn are martingales. We have

E {Mt∧τn |Fs} = Ms∧τn ,  s ≤ t.

As n → ∞, the limit on the right side is Ms. On the left side, {Mt∧τn} is uniformly integrable because t ∧ τn ∈ St. Hence we can pass to the limit on the left side and obtain E {Mt|Fs} = Ms. This shows that M is a martingale. □

THEOREM 9.9. Let X be a local submartingale and let X = M + Z be its Doob-Meyer decomposition. Then the following two conditions are equivalent:

(1) M is a martingale and Z is integrable;
(2) X is of (DL)-class.

PROOF. Suppose that Z is increasing with Z0 = 0. If Z is integrable, i.e., EZt < ∞ for all t ≥ 0, then Z is clearly of (DL)-class. On the other hand, every martingale M is of (DL)-class because the family

Mσ = E {MT|Fσ} ,  σ ∈ ST,

is uniformly integrable. Thus if (1) holds, then both M and Z are of (DL)-class, hence the sum X = M + Z is also of (DL)-class. This proves that (1) implies (2).

Suppose now that the submartingale X is of (DL)-class and X = M + Z is its Doob-Meyer decomposition. Let {τn} be a sequence of stopping times tending to infinity such that the stopped processes Mτn are martingales. We have

Xt∧τn = Mt∧τn + Zt∧τn .

Taking expectations we have

E Zt∧τn = E Xt∧τn − E X0.

We have Zt∧τn ↑ Zt, hence E Zt∧τn ↑ E Zt by the monotone convergence theorem. On the other hand, the assumption that X is of (DL)-class implies that E Xt∧τn → E Xt. Hence

E Zt = E Xt − E X0 < ∞.

Thus Zt is integrable. Since Z is also nonnegative and increasing, we see that Z is of (DL)-class. Now that both X and Z are of (DL)-class, the difference M = X − Z is also of (DL)-class (see EXERCISE ??) and is therefore a martingale. □

In these notes the broadest class of stochastic processes we will deal with is that of semimartingales.


DEFINITION 9.10. A stochastic process is called a semimartingale (with respect to a given filtration of σ-algebras) if it is the sum of a continuous local martingale and a continuous process of bounded variation.

In real analysis it is well known that a function of bounded variation is the difference of two increasing functions. Using this result, we can show easily that a semimartingale X has the form X = M + Z¹ − Z², where M is a local martingale and Z¹, Z² are two increasing processes.

The significance of this class of stochastic processes lies in the fact that it is closed under most common operations we usually perform on them. In particular, Itô's formula shows that a smooth function of a semimartingale is still a semimartingale.

    For a semimartingale, the Doob-Meyer decomposition is still unique.

PROPOSITION 9.11. A semimartingale X can be uniquely decomposed into the sum X = M + Z of a continuous local martingale M and a continuous process of bounded variation Z with Z0 = 0.

PROOF. It is enough to show that if M + Z = 0, then M = 0 and Z = 0. Let Zt = Z¹t − Z²t, where Z¹ and Z² are increasing processes with Z¹0 = Z²0 = 0. We have Mt + Z¹t = Z²t. By the uniqueness of the Doob-Meyer decomposition we have Mt = 0 and Z¹t = Z²t, hence Zt = 0. □

    10. Quadratic variation of a continuous martingale

The quadratic variation is an important characterization of a continuous martingale. In fact, Lévy's criterion shows that Brownian motion B is a martingale completely characterized by its quadratic variation 〈B〉t = t.

Let M = {Mt} be a continuous local martingale. Then M² is a continuous local submartingale. By the Doob-Meyer decomposition theorem, there is a continuous increasing process, which we will denote by 〈M, M〉 = {〈M, M〉t} or simply 〈M〉 = {〈M〉t}, with 〈M, M〉0 = 0, such that M² − 〈M, M〉 is a continuous local martingale. This increasing process 〈M, M〉 is uniquely determined by M and is called the quadratic variation process of the local martingale M.

EXAMPLE 10.1. The most important example is 〈B〉t = t. This means that Bt² − t is a (local) martingale, which can be verified directly from the definition of Brownian motion. We will show that the quadratic variation process of the martingale Bt² − t is

4 ∫_0^t Bs² ds.
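As a quick check of the first claim, write Bt = Bs + (Bt − Bs) for s ≤ t; since the increment Bt − Bs is independent of Fs with mean 0 and variance t − s,

E {Bt²|Fs} = Bs² + 2Bs E {Bt − Bs|Fs} + E {(Bt − Bs)²|Fs} = Bs² + (t − s),

so E {Bt² − t|Fs} = Bs² − s, as claimed.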

Let's look at the quadratic variation process from another point of view. This also gives a good motivation for the term itself. Consider a partition of the time interval [0, t]:

∆ : 0 = t0 < t1 < · · · < tn = t.


The mesh of the partition is the length of its largest subinterval:

|∆| = max_{1≤l≤n} |tl − tl−1|.

Now let M be a continuous local martingale. We have the identity

Mt² = M0² + 2 ∑_{i=1}^n M_{ti−1}(M_{ti} − M_{ti−1}) + ∑_{i=1}^n (M_{ti} − M_{ti−1})².

The second term on the right side looks like a Riemann sum and the third term is the quadratic variation of the process M along the partition. In CHAPTER 3 we will show by stochastic integration theory that the Riemann sum converges (in probability) to the stochastic integral

∫_0^t Ms dMs

as |∆| → 0. The stochastic integral above is a local martingale. It follows that the quadratic variation along ∆ also converges, and by the uniqueness of the Doob-Meyer decomposition we have, in probability,

lim_{|∆|→0} ∑_{i=1}^n (M_{ti} − M_{ti−1})² = 〈M, M〉t.

This relation justifies the name "quadratic variation process" for 〈M, M〉. We also have

Mt² = M0² + 2 ∫_0^t Ms dMs + 〈M, M〉t.

This gives an explicit Doob-Meyer decomposition of the local submartingale M². It is also the simplest case of Itô's formula.
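The convergence of the quadratic sums is easy to see numerically for Brownian motion. The sketch below (ours; the partition sizes are arbitrary) sums squared increments over finer and finer uniform partitions of [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(6)

# Sum of squared Brownian increments over uniform partitions of [0, 1]:
# it should approach <B>_1 = 1 as the mesh |Delta| shrinks.
t = 1.0
for n in (10, 100, 1_000, 10_000):
    dB = rng.normal(0.0, np.sqrt(t / n), size=n)  # B_{t_i} - B_{t_{i-1}}
    print(n, np.sum(dB**2))  # approaches t = 1.0
```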

For a semimartingale X = M + Z in its Doob-Meyer decomposition, we define 〈X, X〉 to be the quadratic variation of its martingale part, i.e.,

〈X, X〉t = 〈M, M〉t.

Note that we still have

(10.1) lim_{|∆|→0} ∑_{i=1}^n (X_{ti} − X_{ti−1})² = 〈X, X〉t.

To see this, letting ∆Xi = X_{ti} − X_{ti−1}, and similarly for ∆Mi and ∆Zi, we have

(∆Xi)² = (∆Mi)² + (2∆Mi + ∆Zi)∆Zi.

By sample path continuity, we have

C(∆) := max_{1≤i≤n} |2∆Mi + ∆Zi| → 0

as |∆| → 0. From this we have

|∑_{i=1}^n (2∆Mi + ∆Zi)∆Zi| ≤ C(∆) ∑_{i=1}^n |∆Zi| ≤ C(∆) |Z|t → 0.


Here |Z|t denotes the total variation of {Zs, 0 ≤ s ≤ t}. It follows that

lim_{|∆|→0} ∑_{i=1}^n (∆Xi)² = lim_{|∆|→0} ∑_{i=1}^n (∆Mi)² = 〈M, M〉t.

This shows that (10.1) holds.

Finally we show that the product of two local martingales M and N is a semimartingale. The covariation process 〈M, N〉 is defined by polarization:

〈M, N〉 = (〈M + N, M + N〉 − 〈M − N, M − N〉)/4.

    It is obviously a process of bounded variation.

PROPOSITION 10.2. Let M and N be two local martingales. The process MN − 〈M, N〉 is a local martingale.

PROOF. By the definition of covariation it is easy to verify that MN − 〈M, N〉 is equal to

((M + N)² − 〈M + N, M + N〉)/4 − ((M − N)² − 〈M − N, M − N〉)/4.

Both processes on the right side are local martingales, hence MN − 〈M, N〉 is also a local martingale. □

We will show later that the Doob-Meyer decomposition of MN is

Mt Nt = M0 N0 + ∫_0^t Ms dNs + ∫_0^t Ns dMs + 〈M, N〉t.
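The polarization identity is also easy to test numerically. The sketch below is our own construction (two Brownian motions with correlation ρ, for which 〈M, N〉t = ρt); it compares the direct sum of products of increments with the polarized combination:

```python
import numpy as np

rng = np.random.default_rng(7)

# Two correlated Brownian motions on [0, 1]: <M, N>_t = rho * t.
t, n, rho = 1.0, 100_000, 0.6
dW1 = rng.normal(0.0, np.sqrt(t / n), size=n)
dW2 = rng.normal(0.0, np.sqrt(t / n), size=n)
dM = dW1
dN = rho * dW1 + np.sqrt(1 - rho**2) * dW2

direct = np.sum(dM * dN)  # sum of products of increments
polarized = (np.sum((dM + dN) ** 2) - np.sum((dM - dN) ** 2)) / 4
print(direct, polarized, rho * t)  # all approximately equal
```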

    11. First assignment

EXERCISE 1.1. Consider the Hilbert space L2(Ω, F , P) of square integrable random variables. The space L2(Ω, G , P) of G -measurable square integrable random variables is a closed subspace. Show that if X ∈ L2(Ω, F , P), then E {X|G } is the orthogonal projection of X onto L2(Ω, G , P).

    EXERCISE 1.2. Conditioning reduces variance, i.e.,

    Var(X) ≥ Var (E {X|G }) .

EXERCISE 1.3. Suppose that X = {Xn, n ≥ 0} is an i.i.d. integrable sequence. Let Sn = X1 + X2 + · · · + Xn. Then {Sn/n, n ≥ 1} is a reversed martingale with respect to the reversed filtration

Fn = σ {Sm, m ≥ n} ,  n ∈ Z+.

EXERCISE 1.4. If a sequence X is adapted to F∗ and τ is a stopping time, then Xτ ∈ Fτ.

EXERCISE 1.5. Let X be a submartingale and σ and τ two bounded stopping times. Then

    E {Xτ|Fσ} ≥ Xσ∧τ.


EXERCISE 1.6. Let {Xn} be a submartingale and X†N = min_{0≤n≤N} Xn. Then for any λ > 0, we have

λ P {X†N ≤ −λ} ≤ E[XN − X1] − E[XN; X†N ≤ −λ] .

We also have for any λ > 0,

λ P { max_{0≤n≤N} |Xn| ≥ λ } ≤ E|X0| + 2 E|XN| .

EXERCISE 1.7. If there are C and p > 1 such that E|Xn|^p ≤ C for all n, then {Xn} is uniformly integrable.

EXERCISE 1.8. Find a submartingale {Xn} such that E Xn⁺ ≤ C for some constant C, but EXn → EX∞ does not hold.

EXERCISE 1.9. Let {Xi, i ∈ I} be a uniformly integrable family of random variables. Then the family

{Xi + Xj, (i, j) ∈ I × I}

is also uniformly integrable.

EXERCISE 1.10. Let M be a square integrable martingale. For any partition ∆ = {0 = t0 < t1 < · · · < tn = t} of the interval [0, t] we have

E ∑_{j=1}^n [M_{tj} − M_{tj−1}]² = E 〈M, M〉t.