Scaling Limits of Stochastic Processes - Lancasterturnera/thesis.pdf · 2007-03-30 · Abstract In this thesis we analyse two classes of stochastic processes, both of which exhibit

Scaling Limits of Stochastic Processes

Amanda Georgina Turner

St John’s College and Statistical Laboratory

University of Cambridge

A dissertation submitted for the

degree of Doctor of Philosophy

September 2006

Abstract

In this thesis we analyse two classes of stochastic processes, both of which exhibit

unusual scaling limits.

The first class is of sequences of Markov processes in two dimensions whose fluid

limit is a stable solution of an ordinary differential equation with a saddle fixed

point. In order to investigate the most interesting behaviour of these processes,

we establish a fluid limit which is valid for large times. The limit is shown to be

inherently random and its distribution is obtained. This is then used to derive

surprising scaling limits for the points where these processes hit straight lines

through the origin, and the minimum distance from the origin that the processes

can attain.

We apply the above results to study the accumulation of numerical rounding errors incurred by some deterministic solvers for systems of ordinary differential equations. We show that the trajectory of the numerical solution exhibits random-like behaviour, and calculate the theoretical distribution of the trajectory. Numerical experiments are then performed and the results are fitted to the predicted distributions with good agreement.

The second class of processes that we consider are stochastic flows on the circle,

constructed by iteratively composing specific maps at times of a Poisson process.

These maps occur naturally in a simplified version of the Hastings-Levitov model

for planar diffusion-limited aggregation on the circle, known as the Eden model.

We define a metric space on which to realize our stochastic flows and show that,

under a specified scaling, they converge to the Brownian web with respect to this

metric.

Acknowledgments

This thesis would not have been achievable without the help and support of a large

number of people.

First of all I would like to thank my supervisor, Professor James Norris.

Throughout my PhD he has been generous with his time, never ceasing to provide

me with encouragement and inspiration. It has been a great pleasure to work

under his guidance.

I am privileged to have been part of the Cambridge University Statistical Laboratory. I would like to thank everyone for making it such an enjoyable place to work, and in particular the secretaries and computer officer for being so accommodating. Special thanks go to my long-suffering officemate Teresa Barata, and to Christina Goldschmidt and Richard Samworth for their continual willingness to share their time and knowledge. I am also indebted to Christina for proofreading my thesis and offering many helpful comments.

This work was made possible by a studentship from the Engineering and Physical Sciences Research Council. I am grateful to the Department of Pure Mathematics and Mathematical Statistics and to St John’s College for the financial support they have given to enable me to travel to conferences, and to St John’s College for providing a pleasant environment in which to live.

Special mention should go to Danielle, Alan and all the friends I have made over

the years in Cambridge, and to the Cambridge University Canoe Club, for never

failing to provide me with distractions when I needed them, and sometimes when

I didn’t! My parents have also been an endless source of help and encouragement,

from painstakingly reading through drafts of my thesis to pick up the odd missing

bracket, to regularly enquiring if I was ever going to finish!

I would like to acknowledge the anonymous referees at the Annals of Probability

for their helpful comments in the preparation of the paper [32], which have been

incorporated into Chapter 2. The numerical work in Chapter 3 of my thesis was

done in collaboration with Sebastian Mosbach and has formed the basis for a joint

paper [28]. Chapter 4, and in particular Section 4.2, of my thesis contains work

which was done in collaboration with my supervisor, James Norris. It is intended

that this will contribute towards a joint paper in the future.

This dissertation is my own work and contains nothing which is the outcome of work done in collaboration with others, except where specifically indicated in these acknowledgments and in the text. This dissertation has not been submitted in whole or in part for any other degree or qualification at any other university.

Amanda Turner

Cambridge

September 2006

I would like to add my thanks to my examiners Geoffrey Grimmett and Terry

Lyons for their thorough reading of my thesis and insightful comments.

Amanda Turner

February 2007

Contents

Abstract

Acknowledgments

1 Introduction

2 Convergence of Markov processes near saddle fixed points
2.1 Introduction
2.2 The linear case
2.2.1 Applications
2.3 Linearization of the limit process
2.4 Convergence of the fluctuations
2.5 A fluid limit for jump Markov processes
2.6 Continuous diffusion Markov processes
2.7 Applications
2.7.1 Hitting lines through the origin
2.7.2 Minimum distance from the origin

3 Accumulation of rounding errors in the numerical solution of ODEs
3.1 Introduction
3.2 Theoretical background
3.2.1 Accumulation of rounding errors
3.2.2 Explicit calculation of the variance
3.3 Numerical experiments
3.3.1 The system
3.3.2 Theoretical hitting distribution
3.3.3 Choice of parameters
3.3.4 Results and observations for explicit methods
3.3.5 Adaptive solvers

4 Stochastic flows, planar aggregation and the Brownian web
4.1 Introduction
4.2 A Lévy flow on the circle
4.2.1 Some generalities for functions on the circle
4.2.2 Construction of the flow
4.2.3 Convergence to the Arratia flow
4.3 Hastings–Levitov DLA
4.4 The Brownian web
4.4.1 A description of the flow space
4.4.2 Existence and uniqueness of the Brownian web
4.4.3 Convergence to the Brownian web
4.5 Some properties of $(D, d_D)$
4.6 An equivalent space for the Brownian web
4.6.1 Compact sets of functions
4.6.2 The isomorphism between the spaces

Bibliography

Chapter 1

Introduction

Many fundamental results in probability theory stem from considering the limit of a

random process whose jump sizes tend to zero whilst the jump rate tends to infinity.

The simplest example of this is the law of large numbers for random variables which

states that the average value of a sequence (Xn)n∈N of independent identically

distributed random variables with finite mean µ converges to the deterministic

limit $\mu$. By viewing these random variables as the jump sizes in a random walk, the value of
\[ \frac{X_1 + \cdots + X_N}{N} \]
can be regarded as the position of a random walk at time 1 with jump sizes of order $N^{-1}$ and jump rate $N$. The central limit theorem asserts that if, in addition, the random variables have finite variance $\sigma^2$, then
\[ \frac{(X_1 - \mu) + \cdots + (X_N - \mu)}{\sqrt{N}} \]
converges in distribution to an $N(0, \sigma^2)$ random variable. As above, this value can be regarded as the position of a (mean zero) random walk at time 1 with jump sizes of order $N^{-1/2}$ and jump rate $N$. The fluid limit theorem and diffusion

approximation for stochastic processes generalize the law of large numbers and

central limit theorem for random variables by proving the existence of a limit

along the entire trajectory of a process, scaled as above, on a compact time-set.
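The two scalings described above can be seen in a small simulation. The following is only an illustrative sketch, not part of the thesis: it assumes Uniform(0, 1) jump sizes (mean 1/2, variance 1/12), and the function name `scaled_sums` and all parameter values are hypothetical choices.

```python
import random
import statistics

MEAN = 0.5  # mean of a Uniform(0, 1) random variable (assumed jump distribution)

def scaled_sums(n, reps, seed=0):
    """Return (averages, fluctuations) over `reps` independent samples of
    X_1, ..., X_n ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    averages, fluctuations = [], []
    for _ in range(reps):
        total = sum(rng.random() for _ in range(n))
        averages.append(total / n)                           # law of large numbers scaling
        fluctuations.append((total - n * MEAN) / n ** 0.5)   # central limit scaling
    return averages, fluctuations

averages, fluctuations = scaled_sums(n=400, reps=2000)
print(statistics.mean(averages))            # should be close to 1/2
print(statistics.pvariance(fluctuations))   # should be close to 1/12
```

The averages concentrate near the mean, while the centred sums divided by the square root of N keep a variance of order one, which is the distinction between the fluid limit and the diffusion approximation.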

Scaling limits of stochastic processes arise in a variety of contexts. Where the

random objects originate from geometrical or physical settings, the jump sizes can

be proportional to lattice spacing or particle sizes, or inversely proportional to the

number of particles. As in the case of the law of large numbers and central limit

theorem, limit theorems are often obtained by scaling time by the order of the

jump sizes or the square root of the order of the jump sizes. However, in some

situations limit results can be proved by scaling by more unusual powers of the

jump size, and are generally accompanied by interesting behaviour. In this thesis

we analyse two classes of processes both of which exhibit unexpected scaling limits.

The first class is of sequences $(X^N_t)_{t \ge 0}$ of Markov processes in two dimensions whose fluid limit is a stable solution of an ordinary differential equation of the form $\dot{x}_t = b(x_t)$, where
\[ b(x) = \begin{pmatrix} -\mu & 0 \\ 0 & \lambda \end{pmatrix} x + \tau(x) \]
for some $\lambda, \mu > 0$ and $\tau(x) = O(|x|^2)$. Here the processes are indexed so that the variance of the fluctuations of $X^N_t$ is inversely proportional to $N$. The simplest

example arises from the OK Corral gunfight model which was formulated in 1998 by

Williams and McIlroy [34] and studied by Kingman [25] in 1999. These processes

exhibit their most interesting behaviour at times of order logN , so it is necessary

to establish a fluid limit that is valid for large times. We find that this limit is

inherently random and obtain its distribution. Using this, it is possible to derive

scaling limits for the points where these processes hit straight lines through the

origin, and the minimum distance from the origin that the processes can attain.

The power of $N$ that gives the appropriate scaling is surprising. For example, if $T$ is the time that $X^N_t$ first hits one of the lines $y = x$ or $y = -x$, then
\[ N^{\frac{\mu}{2(\lambda+\mu)}} |X^N_T| \Rightarrow |Z|^{\frac{\mu}{\lambda+\mu}}, \]
for some zero mean Gaussian random variable $Z$.

Numerical rounding errors incurred by some deterministic solvers for systems of ordinary differential equations can be modelled as a special case of these processes. The above results can then be applied to acquire theoretical predictions about the accumulation of these rounding errors. It is shown that the trajectory of the numerical solution exhibits random-like behaviour, and the theoretical distribution of the trajectory is obtained as a function of time, the step size and the numerical precision of the computer. By performing multiple repetitions with different values of the time step size, the random distributions predicted theoretically can be observed numerically. We mainly focus on the explicit Euler and fourth order Runge-Kutta (RK4) methods, but also briefly consider more complex algorithms such as the implicit solvers VODE [5] and RADAU5 [14].

The second class of processes is motivated by Hastings-Levitov diffusion-limited aggregation (DLA). DLA is a random growth model which was originally introduced in 1981 by Witten and Sander [35]. In this model, particles perform Brownian motions in the plane until they collide with a cluster at the origin, at which point they stick to the cluster. In 1998 Hastings and Levitov [17] formulated a model of DLA in which the cluster is represented by a sequence of iterated conformal maps. We construct a family of stochastic flows on the circle by iteratively applying small localized perturbations to the circle at uniformly distributed points and show that a case of simplified Hastings-Levitov DLA, where the incoming particles are slits of length $N^{-1}$ sticking to the unit disc, falls under this scheme. This model is known as the Eden model [8], and describes the growth of bacterial cells or tissue cultures of cells that are constrained from moving. If time is scaled in such a way that particles arrive as a Poisson process of rate proportional to $N^3$, the resulting flow map (restricted to points on the unit circle) converges to a random object known as the Brownian web. This object can be defined loosely as a family of coalescing Brownian motions starting at all possible points in continuous space-time. It was first studied in 1979 by Arratia [1] as a limit for discrete coalescing random walks. Once again, the power of $N$ by which time is scaled is curious and may give rise to the fractal behaviour which can be observed in simulations of the model.

The Markov processes near saddle points are investigated in Chapter 2, and

our results are applied to rounding errors in Chapter 3. Chapter 4 concerns the

stochastic flows arising from Hastings-Levitov DLA and shows that they converge

to the Brownian web.

Chapter 2

Convergence of Markov processes

near saddle fixed points

2.1 Introduction

The fluid limit theorem is a powerful result which shows that, under certain conditions, sequences of Markov processes converge to solutions of ordinary differential equations. We are interested in situations where the differential equation can be written in the form
\[ \dot{x}_t = B x_t + \tau(x_t), \tag{2.1} \]
for some matrix $B$, where $\tau(x) = O(|x|^2)$ is twice continuously differentiable. These differential equations have been studied extensively in the dynamical systems literature, with the aim of finding precise relationships between their solutions and solutions of the corresponding linear differential equations
\[ \dot{y}_t = B y_t. \tag{2.2} \]

We restrict ourselves to the two dimensional case where the origin is a saddle fixed point of the system, i.e. $B$ has eigenvalues $\lambda, -\mu$, with $\lambda, \mu > 0$. The phase portrait of (2.1) in the neighbourhood of the origin is shown in Figure 2.1.

In particular, there exists some $x_0 \neq 0$ such that $\phi_t(x_0) \to 0$ as $t \to \infty$, where $\phi$ is the flow associated with the ordinary differential equation (2.1). The set of such $x_0$ is the stable manifold. There also exists some $x_\infty$ such that $\phi_t^{-1}(x_\infty) \to 0$ as $t \to \infty$. The set of such $x_\infty$ is the unstable manifold. The saddle point case is interesting in this setting as it is the only case in two dimensions where there is both a stable and an unstable manifold.

[Figure 2.1: The phase portrait of an ordinary differential equation having a saddle fixed point at the origin. The stable manifold and the unstable manifold through 0 are labelled.]

Fix an $x_0$ in the stable manifold and consider sequences of Markov processes with initial condition $X^N_0 = x_0$, where the processes are indexed so that the variance of the fluctuations of $X^N_t$ is inversely proportional to $N$. The fluid limit theorem tells us that for fixed values of $t$, $X^N_t \to \phi_t(x_0)$ as $N \to \infty$. However, if we allow the value of $t$ to grow with $N$ as $N \to \infty$, we shall see that $X^N_t$ deviates from the stable solution to a limit which is inherently random, before converging to an unstable solution (see Figure 2.2).

More precisely, we observe three different types of behaviour depending on the

time scale:

A. On compact time intervals $[0, R]$, $X^N_t$ converges to the stable solution of (2.1), the fluctuations around this limit being of order $N^{-1/2}$.

B. There exists some $\bar{x}_0 \neq 0$, depending only on $x_0$, and a Gaussian random variable $Z_\infty$ such that if $t$ lies in the interval $[R, \frac{1}{2\lambda}\log N - R]$, then
\[ X^N_t = \bar{x}_0 e^{-\mu t}(e_1 + \varepsilon_1) + N^{-1/2} Z_\infty e^{\lambda t}(e_2 + \varepsilon_2) \]
for some $\varepsilon_i(t, N) \to 0$ uniformly in $t$ in probability as $R, N \to \infty$, where $e_1, e_2$ is the standard basis for $\mathbb{R}^2$. In other words, $X^N_t$ can be approximated by the solution to the linear ordinary differential equation (2.2) starting from the random point
\[ \begin{pmatrix} \bar{x}_0 \\ N^{-1/2} Z_\infty \end{pmatrix}. \]

C. On time intervals of a fixed length around $\frac{1}{2\lambda}\log N$, $X^N_t$ converges to the unstable solution of (2.1).

[Figure 2.2: Diagram showing how the Markov process $X^N_t$ deviates from the stable solution $\phi_t(x_0)$ for large values of $t$. Labels in the figure: Markov process, stable solution, unstable solution, the starting point $x_0$, the origin 0, and the regimes A, B and C.]

The most interesting behaviour occurs on time intervals of fixed lengths around $\frac{1}{2(\lambda+\mu)}\log N$, as for these values of $t$ the two terms $\bar{x}_0 e^{-\mu t}$ and $N^{-1/2} Z_\infty e^{\lambda t}$ are of the same order. By considering
\[ \bar{x}_0 e^{-\mu t} e_1 + N^{-1/2} Z_\infty e^{\lambda t} e_2, \]
we show in Section 2.7 that it is at these times that $X^N_t$ crosses all the straight lines passing through 0, and also that $|X^N_t|$ attains its minimum value when $t$ is in this range. The distance from the origin of $X^N_t$ for these values of $t$ is of order $N^{-\frac{\mu}{2(\lambda+\mu)}}$, which gives us surprising scaling limits for the points at which $X^N_t$ intersects various straight lines, and for $\inf |X^N_t|$.
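The order of the minimum distance can be checked directly on the deterministic curve considered above. The sketch below is an illustration only, not the argument of Section 2.7: the minimising time comes from elementary calculus applied to the squared distance, and the names `x0_bar`, `z`, `lam`, `mu` and the numerical values are hypothetical stand-ins for the coefficient of $e_1$, a fixed realisation of $Z_\infty$, $\lambda$ and $\mu$.

```python
import math

def squared_distance(t, n, x0_bar, z, lam, mu):
    """Squared norm of x0_bar*exp(-mu*t)*e1 + n**-0.5 * z*exp(lam*t)*e2."""
    return (x0_bar * math.exp(-mu * t)) ** 2 + (z * math.exp(lam * t)) ** 2 / n

def argmin_time(n, x0_bar, z, lam, mu):
    """Unique stationary point in t of squared_distance (a convex function of t)."""
    return math.log(mu * x0_bar ** 2 * n / (lam * z ** 2)) / (2 * (lam + mu))

def scaled_min_distance(n, x0_bar, z, lam, mu):
    """Minimum distance from the origin, multiplied by n**(mu / (2*(lam + mu)))."""
    t_star = argmin_time(n, x0_bar, z, lam, mu)
    dist = math.sqrt(squared_distance(t_star, n, x0_bar, z, lam, mu))
    return n ** (mu / (2 * (lam + mu))) * dist

for n in (10 ** 4, 10 ** 6, 10 ** 8):
    print(n, scaled_min_distance(n, x0_bar=1.0, z=0.7, lam=1.0, mu=2.0))
```

The printed values agree across the three values of n, which reflects the fact that the minimising time is $\frac{1}{2(\lambda+\mu)}\log N$ plus a term not depending on $N$.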

In order to study the Markov processes at times of order $\log N$, it is necessary to establish a strong form of the fluid limit theorem that is valid for large times. The key idea is to show that for $N$ and $t_0$ sufficiently large, the process $(X^N_t)_{t \ge t_0}$ is close to $(\phi_{t-t_0}(X^N_{t_0}))_{t \ge t_0}$. This is done in Section 2.2 in the case when (2.1) is linear and $X^N_t$ is a pure jump Markov process, in Section 2.5 for pure jump Markov processes where (2.1) is nonlinear, and in Section 2.6 for continuous diffusion processes. In Sections 2.3 and 2.4 we look at the process $(\phi_{t-t_0}(X^N_{t_0}))_{t \ge t_0}$ for large values of $N$ and $t_0$, which then enables us to obtain scaling limits for the process $X^N_t$. The same idea can be used to obtain fluid limit theorems for arbitrary matrices $B$ in (2.1), e.g. with eigenvalues having the same sign, or in higher dimensions. However, an analysis of the solutions of the underlying differential equation is required, which we do not go into here.

The simplest example of this type of behaviour arises from the OK Corral

gunfight model which was formulated by Williams and McIlroy [34] and studied

by Kingman [25] and Kingman and Volkov [26]. Two lines of gunmen face each

other, there initially being N on each side. Each gunman fires lethal gunshots at

times of a Poisson process with rate 1 until either there is no one left on the other

side or he is killed. The process terminates when all the gunmen on one side are

dead. It is shown by Kingman that if $S^N$ is the number of survivors when the process terminates, then
\[ N^{-3/4} S^N \Rightarrow 2^{3/4} |Z|^{1/2}, \]
where $Z \sim N(0, \frac{1}{3})$. It is the occurrence of the unexpected power of $N$ that interested the above authors in the problem. By using our scaling limits we re-derive this result in Section 2.2.1 and show that it is a special case of a much more general phenomenon, and that in fact by a suitable choice of $B$, every number in the interval $(\frac{1}{2}, 1)$ may be obtained as a power of $N$ in this way. An application of the nonlinear case to a model of two competing species is given in Section 2.7.


2.2 The linear case

In this section we restrict ourselves to sequences of Markov processes in the special

case where equation (2.1) is linear. We begin by describing the conditions under

which a limit theorem exists for large times and then establish the exact limit

by means of an appropriate martingale inequality. In Section 2.2.1 this result is

used to derive scaling limits for the points where these processes hit straight lines

through the origin and we use this to obtain a solution to the OK Corral problem.

The fluid limit theorem that we state below is widely known and has been the

subject of many works. We use the formulation found in Darling and Norris [6].

Let $(X^N_t)_{t \ge 0}$ be a sequence of pure jump Markov processes, starting from $x_0$ and taking values in some subsets $I^N$ of $\mathbb{R}^2$, with Lévy kernels $K^N(x, dy)$. Let $S$ be an open subset of $\mathbb{R}^2$ with $x_0 \in S$, and set $S^N = I^N \cap S$. For $x \in S^N$ and $\theta \in (\mathbb{R}^2)^*$, define the Laplace transform corresponding to the Lévy kernel $K^N(x, dy)$ by
\[ m^N(x, \theta) = \int_{\mathbb{R}^2} e^{\langle \theta, y \rangle} K^N(x, dy). \]

We assume that there is a limit kernel $K(x, dy)$ defined for $x \in S$, with corresponding Laplace transform $m(x, \theta)$, with the following properties.

(a) There exists a constant $\eta_0 > 0$ such that $m(x, \theta)$ is uniformly bounded for all $x \in S$ and $|\theta| \le \eta_0$.

(b) As $N \to \infty$,
\[ \sup_{x \in S^N} \sup_{|\theta| \le \eta_0} \left| \frac{m^N(x, N\theta)}{N} - m(x, \theta) \right| \to 0. \]

Set $b(x) = m'(x, 0)$ where $'$ denotes differentiation in $\theta$. Suppose that $b$ is Lipschitz on $S$ so that $b$ has an extension to a Lipschitz vector field $\tilde{b}$ on $\mathbb{R}^2$. Then there is a unique solution $(x_t)_{t \ge 0}$ to the ordinary differential equation $\dot{x}_t = \tilde{b}(x_t)$ starting from $x_0$. Suppose that $S$ contains a neighbourhood of the path $(x_t)_{t \ge 0}$. By stopping $X^N_t$ at the first time it leaves $S$ if necessary, we may assume that $X^N_t$ remains in $S$ for all $t \ge 0$. Under these assumptions, for all $t_0 \ge 0$ and $\delta > 0$,
\[ \limsup_{N \to \infty} N^{-1} \log P\Big( \sup_{t \le t_0} |X^N_t - x_t| > \delta \Big) < 0. \]


Suppose additionally that

(c) $b$ is $C^1$ on $S$ and
\[ \sup_{x \in S^N} N^{1/2} |b^N(x) - b(x)| \to 0, \]
where $b^N(x) = m^{N\prime}(x, 0)$.

(d) $a$, defined by $a(x) = m''(x, 0)$, is Lipschitz on $S$.

It follows from the above that for any $\eta < \eta_0$ there exists a constant $A$ such that
\[ \sup_{x \in S^N} \sup_{|\theta| \le \eta} N |m^{N\prime\prime}(x, N\theta)| \le A, \tag{2.3} \]
where $|\cdot|$ is the operator norm.

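Let $\gamma^N_t = N^{1/2}(X^N_t - x_t)$. Then for any $t \ge 0$, $\gamma^N_t \Rightarrow \gamma_t$ as $N \to \infty$, where $(\gamma_t)_{t \ge 0}$ is the unique solution to the linear stochastic differential equation
\[ d\gamma_t = \sigma(x_t)\,dW_t + \nabla b(x_t)\gamma_t\,dt \tag{2.4} \]
starting from 0, $W$ a Brownian motion in $\mathbb{R}^2$, and $\sigma \in \mathbb{R}^2 \otimes (\mathbb{R}^2)^*$ satisfying $\sigma(x)\sigma(x)^* = a(x)$. The distribution of $(\gamma_t)_{t \ge 0}$ does not depend on the choice of $\sigma$.

We are interested in the case where $b(x) = Bx$ for some matrix
\[ B = \begin{pmatrix} -\mu & 0 \\ 0 & \lambda \end{pmatrix}, \quad \mu, \lambda > 0. \]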

Let $\phi_t(x)$ be the solution to the ordinary differential equation
\[ \dot{\phi}_t(x) = b(\phi_t(x)), \quad \phi_0(x) = x. \tag{2.5} \]

In the linear case we can solve (2.5) explicitly to get $\phi_t(x) = e^{Bt}x$. We concentrate on processes where the initial condition is chosen to be $x_0 = (x_{0,1}, 0)$ with $x_{0,1} \neq 0$, so that $x_t = \phi_t(x_0) \to 0$ as $t \to \infty$. We shall show that for sufficiently large values of $N$ and $t_0$, $X^N_t$ is in some sense close to $\phi_{t-t_0}(X^N_{t_0})$ for $t \ge t_0$.

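Introduce random measures $\mu^N$ and $\nu^N$ on $(0, \infty) \times \mathbb{R}^2$, given by
\[ \mu^N = \sum_{\Delta X^N_t \neq 0} \delta_{(t, \Delta X^N_t)}, \qquad \nu^N(dt, dy) = K^N(X^N_{t-}, dy)\,dt, \]
where $\delta_{(t,y)}$ denotes the unit mass at $(t, y)$ and $\Delta X^N_t = X^N_t - X^N_{t-}$.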

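Let $f(t, x) = e^{-Bt}\big(x - \phi_{t-t_0}(X^N_{t_0})\big)$, for $t \ge t_0$. By Itô's formula,
\[ f(t, X^N_t) = f(t_0, X^N_{t_0}) + M^{B,N}_t - M^{B,N}_{t_0} + \int_{t_0}^t \Big( \frac{\partial f}{\partial t} + K^N f \Big)(s, X^N_{s-})\,ds, \]
where
\[ \frac{\partial f}{\partial t} = -B e^{-Bt} x, \]
\[ K^N f(s, x) = \int_{\mathbb{R}^2} \big( f(s, x+y) - f(s, x) \big) K^N(x, dy) = \int_{\mathbb{R}^2} e^{-Bs} y\, K^N(x, dy) = e^{-Bs} b^N(x), \]
and
\[ M^{B,N}_t = \int_{(0,t] \times \mathbb{R}^2} \big( f(s, X^N_{s-} + y) - f(s, X^N_{s-}) \big)(\mu^N - \nu^N)(ds, dy) = \int_{(0,t] \times \mathbb{R}^2} e^{-Bs} y\,(\mu^N - \nu^N)(ds, dy). \]

So if $t \ge t_0$, then
\[ e^{-Bt}\big(X^N_t - \phi_{t-t_0}(X^N_{t_0})\big) = M^{B,N}_t - M^{B,N}_{t_0} + \int_{t_0}^t e^{-Bs}\big(b^N(X^N_{s-}) - b(X^N_{s-})\big)\,ds. \tag{2.6} \]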

Lemma 2.1. There exists some constant $C$ such that
\[ E\Big( \sup_{t \ge t_0} e^{-\lambda t} \big| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \big| \Big) \le C N^{-1/2} e^{-\lambda t_0}. \]


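Proof. By the product rule,
\[ e^{(B-\lambda I)t}(M^{B,N}_t - M^{B,N}_{t_0}) = \int_{t_0}^t (B - \lambda I) e^{(B-\lambda I)s}(M^{B,N}_s - M^{B,N}_{t_0})\,ds + \int_{t_0}^t \int_{\mathbb{R}^2} e^{-\lambda s} y\,(\mu^N - \nu^N)(dy, ds) \]
and hence,
\begin{align*}
E\Big( \sup_{t \ge t_0} e^{-\lambda t} \big| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \big| \Big)
&\le E\Big( \sup_{t \ge t_0} \int_{t_0}^t (\lambda + \mu) e^{-(\lambda+\mu)s} \big| (M^{B,N}_s - M^{B,N}_{t_0})_1 \big|\,ds \Big) \\
&\quad + E\Big( \sup_{t \ge t_0} \Big| \int_{t_0}^t \int_{\mathbb{R}^2} e^{-\lambda s} y\,(\mu^N - \nu^N)(dy, ds) \Big| \Big) \\
&\le \int_{t_0}^\infty (\lambda + \mu) e^{-(\lambda+\mu)s} \Big( E (M^{B,N}_s - M^{B,N}_{t_0})_1^2 \Big)^{1/2} ds \\
&\quad + E\Big( \sup_{t \ge t_0} \Big| \int_{t_0}^t \int_{\mathbb{R}^2} e^{-\lambda s} y\,(\mu^N - \nu^N)(dy, ds) \Big|^2 \Big)^{1/2}.
\end{align*}

Since
\[ E \int_0^t \int_{\mathbb{R}^2} |e^{-\lambda s} y|\,\nu^N(dy, ds) < \infty \]
for all $t \ge 0$, the process
\[ \Big( \int_0^t \int_{\mathbb{R}^2} e^{-\lambda s} y\,(\mu^N - \nu^N)(dy, ds) \Big)_{t \ge 0} \]
is a martingale, and hence, by Doob's $L^2$ inequality
\[ E\Big( \sup_{t \ge t_0} \Big| \int_{t_0}^t \int_{\mathbb{R}^2} e^{-\lambda s} y\,(\mu^N - \nu^N)(dy, ds) \Big|^2 \Big) \le 4 \sup_{t \ge t_0} E\Big( \Big| \int_{t_0}^t \int_{\mathbb{R}^2} e^{-\lambda s} y\,(\mu^N - \nu^N)(dy, ds) \Big|^2 \Big). \]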


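Now
\begin{align*}
E\big( (M^{B,N}_t - M^{B,N}_{t_0})_1^2 \big)
&= E \int_{t_0}^t \int_{\mathbb{R}^2} e^{2\mu s} y_1^2\,\nu^N(dy, ds) \\
&\le E \int_{t_0}^t e^{2\mu s} |m^{N\prime\prime}(X^N_{s-}, 0)|\,ds \\
&\le \frac{e^{2\mu t} A}{2\mu N},
\end{align*}
where $A$ is defined in (2.3). Similarly
\[ E\Big( \Big| \int_{t_0}^t \int_{\mathbb{R}^2} e^{-\lambda s} y\,(\mu^N - \nu^N)(dy, ds) \Big|^2 \Big) \le \frac{e^{-2\lambda t_0} A}{2\lambda N}. \]

Hence,
\begin{align*}
E\Big( \sup_{t \ge t_0} e^{-\lambda t} \big| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \big| \Big)
&\le \int_{t_0}^\infty (\lambda + \mu) e^{-\lambda s} \Big( \frac{A}{2\mu N} \Big)^{1/2} ds + e^{-\lambda t_0} \Big( \frac{2A}{\lambda N} \Big)^{1/2} \\
&\le \frac{A^{1/2}\big(\lambda + \mu + 2(\lambda\mu)^{1/2}\big)}{\lambda (2\mu)^{1/2}} N^{-1/2} e^{-\lambda t_0}.
\end{align*}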

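Theorem 2.2. For all $\varepsilon > 0$,
\[ \lim_{t_0 \to \infty} \limsup_{N \to \infty} P\Big( \sup_{t \ge t_0} e^{-\lambda t} \big| X^N_t - \phi_{t-t_0}(X^N_{t_0}) \big| > N^{-1/2} \varepsilon \Big) = 0. \]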

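Proof. Let $N_0$ be sufficiently large that $\sup_{N \ge N_0} N^{1/2} \|b^N - b\| < \lambda\varepsilon/2$, where $\|b^N - b\| = \sup_{x \in S^N} |b^N(x) - b(x)|$, and set
\[ \Omega_{N,t_0} = \Big\{ \sup_{t \ge t_0} e^{-\lambda t} \big| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \big| \le \frac{N^{-1/2}\varepsilon}{2} \Big\}. \]

By (2.6), on the set $\Omega_{N,t_0}$ with $N \ge N_0$,
\begin{align*}
\sup_{t \ge t_0} e^{-\lambda t} \big| X^N_t - \phi_{t-t_0}(X^N_{t_0}) \big|
&\le \sup_{t \ge t_0} \big| e^{-\lambda t} e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \big| + \sup_{t \ge t_0} e^{-\lambda t} \int_{t_0}^t |e^{B(t-s)}|\, \|b^N - b\|\,ds \\
&\le N^{-1/2} \varepsilon.
\end{align*}

Hence,
\[ \limsup_{N \to \infty} P\Big( \sup_{t \ge t_0} e^{-\lambda t} \big| X^N_t - \phi_{t-t_0}(X^N_{t_0}) \big| > N^{-1/2} \varepsilon \Big) \le \limsup_{N \to \infty} P(\Omega^c_{N,t_0}) \le \frac{2C e^{-\lambda t_0}}{\varepsilon} \to 0 \]
as $t_0 \to \infty$, where the second inequality follows by Markov's inequality and Lemma 2.1.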

Let $Z_\infty \sim N(0, \sigma_\infty^2)$, where
\[ \sigma_\infty^2 = \int_0^\infty e^{-2\lambda s} a(x_s)_{2,2}\,ds. \]

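Theorem 2.3. The following converge in probability as $N \to \infty$.

(i)
\[ \sup_{t \le t_N} |e^{\mu t} X^N_{t,1} - x_{0,1}| \to 0 \]
for any sequence $t_N \to \infty$ with $e^{(\lambda+\mu)t_N} = O(N^{1/2})$;

(ii)
\[ \sup_{t \ge t_N} N^{1/2} e^{-\lambda t} |X^N_{t,1}| \to 0 \]
for any sequence $t_N$ with $e^{(\lambda+\mu)t_N} = \omega(N^{1/2})$;

(iii)
\[ \sup_{t_1, t_2 \ge t_N} N^{1/2} \big| e^{-\lambda t_1} X^N_{t_1,2} - e^{-\lambda t_2} X^N_{t_2,2} \big| \to 0 \]
for any sequence $t_N \to \infty$.

Furthermore, if $\sigma_\infty \neq 0$, then
\[ N^{1/2} e^{-\lambda t} X^N_{t,2} \Rightarrow Z_\infty \]
as $t, N \to \infty$.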

Remark 2.4. Given any sequence of times $t_N \to \infty$ as $N \to \infty$, by the Skorohod Representation Theorem, it is possible to choose a sample space in which $Z^N_\infty = N^{1/2} e^{-\lambda t_N} X^N_{t_N,2} \to Z_\infty$ almost surely as $N \to \infty$. In this case the above result can be expressed as
\[ X^N_t = x_{0,1} e^{-\mu t}(e_1 + \varepsilon_1) + N^{-1/2} Z_\infty e^{\lambda t}(e_2 + \varepsilon_2) \tag{2.7} \]
where $\varepsilon_i = \varepsilon_i(N, t) \to 0$, uniformly in $t$, in probability as $N \to \infty$.

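Proof. For any fixed $t_0$, $\sup_{t \le t_0} |e^{\mu t} X^N_{t,1} - x_{0,1}| \to 0$ in probability as an immediate consequence of the fluid limit theorem. For (i), it is therefore sufficient to show that for any $\varepsilon > 0$, $\lim_{t_0 \to \infty} \limsup_{N \to \infty} P(\sup_{t_0 \le t \le t_N} |e^{\mu t} X^N_{t,1} - x_{0,1}| > \varepsilon) = 0$. Now if $t \ge t_0$, then $\phi_{t-t_0}(X^N_{t_0}) = e^{B(t-t_0)} X^N_{t_0} = e^{B(t-t_0)}(x_{t_0} + N^{-1/2} \gamma^N_{t_0})$. Since $x_0 = x_{0,1} e_1$, we have that $e^{B(t-t_0)} x_{t_0} = e^{-\mu t} x_0$. Hence,
\[ \phi_{t-t_0}(X^N_{t_0}) = e^{-\mu t}\big(x_0 + N^{-1/2} e^{\mu t_0} \gamma^N_{t_0,1} e_1\big) + N^{-1/2} e^{\lambda t} e^{-\lambda t_0} \gamma^N_{t_0,2} e_2, \]
and so
\[ X^N_{t,1} = e^{-\mu t}\Big( x_{0,1} + N^{-1/2} e^{\mu t_0} \gamma^N_{t_0,1} + e^{(\lambda+\mu)t} e^{-\lambda t}\big(X^N_t - \phi_{t-t_0}(X^N_{t_0})\big)_1 \Big) \]
and
\[ X^N_{t,2} = N^{-1/2} e^{\lambda t}\Big( e^{-\lambda t_0} \gamma^N_{t_0,2} + N^{1/2} e^{-\lambda t}\big(X^N_t - \phi_{t-t_0}(X^N_{t_0})\big)_2 \Big). \tag{2.8} \]

Let $N \to \infty$ and then $t_0 \to \infty$. Statements (i)-(iii) follow by Theorem 2.2 and the fact that $\gamma^N_{t_0} \Rightarrow \gamma_{t_0}$, a Gaussian random variable.

For the last part, note that by (2.4),
\[ e^{-\lambda t_0} \gamma^N_{t_0,2} \Rightarrow e^{-\lambda t_0} \gamma_{t_0,2} = \int_0^{t_0} e^{-\lambda s} \langle e_2, \sigma(x_s)\,dW_s \rangle \]
as $N \to \infty$. Since
\[ \int_0^\infty |e^{-\lambda s} e_2^* \sigma(x_s)|^2\,ds \le \int_0^\infty e^{-2\lambda s} |a(x_s)|\,ds \le \frac{A}{2\lambda}, \]
where $e_i^*$ is the transpose of $e_i$, $e^{-\lambda t_0} \gamma_{t_0,2} \to Z_\infty$ almost surely as $t_0 \to \infty$, for
\[ Z_\infty = \Big( \int_0^\infty e^{-\lambda t} \sigma(x_t)\,dW_t \Big)_2 \sim N(0, \sigma_\infty^2). \]

The result follows by (2.8) and Theorem 2.2.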
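The identification of the variance in the last display is an application of the Itô isometry, which the proof leaves implicit. A short worked version, using only the relation $\sigma(x)\sigma(x)^* = a(x)$ stated with (2.4), might read as follows.

```latex
\begin{align*}
\operatorname{Var}(Z_\infty)
  &= E\Big( \int_0^\infty e^{-\lambda s}\, e_2^* \sigma(x_s)\, dW_s \Big)^2
   = \int_0^\infty e^{-2\lambda s}\, \big| e_2^* \sigma(x_s) \big|^2 \, ds \\
  &= \int_0^\infty e^{-2\lambda s}\, e_2^* \sigma(x_s)\sigma(x_s)^* e_2 \, ds
   = \int_0^\infty e^{-2\lambda s}\, a(x_s)_{2,2} \, ds
   = \sigma_\infty^2 .
\end{align*}
```

The integrand is deterministic because $x_s$ is the deterministic solution of the limiting differential equation, so the stochastic integral is Gaussian with mean zero.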

2.2.1 Applications

Applications will be dealt with more fully in Section 2.7. However, we illustrate here how the above result can be used to study the first time that $X^N_t$ hits $l_\theta$ or $l_{-\theta}$, the straight lines passing through the origin at angles $\theta$ and $-\theta$, where $\theta \in (0, \frac{\pi}{2})$, as $N \to \infty$. As $X^N_t$ is not continuous, we define the time that $X^N_t$ first crosses one of the lines $l_{\pm\theta}$ as
\[ T^N_\theta = \inf\Big\{ t \ge 0 : \Big| \frac{X^N_{t-,2}}{X^N_{t-,1}} \Big| \le |\tan\theta| \text{ and } \Big| \frac{X^N_{t,2}}{X^N_{t,1}} \Big| > |\tan\theta| \Big\}. \]

Let
\[ t_N = \frac{1}{2(\lambda+\mu)} \log N, \]
and
\[ c_\theta = \frac{1}{\lambda+\mu} \log \Big| \frac{x_{0,1} \tan\theta}{Z_\infty} \Big|. \]

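Theorem 2.5. Under the assumptions listed at the beginning of Section 2.2,
\[ T^N_\theta - t_N \Rightarrow c_\theta \tag{2.9} \]
and
\[ N^{\frac{\mu}{2(\lambda+\mu)}} |X^N_{T^N_\theta}| \Rightarrow |\sec\theta|\, |\tan\theta|^{-\frac{\mu}{\lambda+\mu}}\, |x_0|^{\frac{\lambda}{\lambda+\mu}}\, |Z_\infty|^{\frac{\mu}{\lambda+\mu}} \tag{2.10} \]
as $N \to \infty$.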

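Proof. For simplicity, we work in a sample space in which $Z^N_\infty \to Z_\infty$ almost surely. Define $\varepsilon_i$ as in Remark 2.4. By observing that
\[ x_{0,1} e^{-\mu t} e_1 + N^{-1/2} Z_\infty e^{\lambda t} e_2 \]
first intersects one of the lines $l_{\pm\theta}$ at time $t = t_N + c_\theta$, given any $\varepsilon > 0$,
\begin{align*}
P\big( T^N_\theta \le t_N + c_\theta - \varepsilon \big)
&\le P\Big( \sup_{t \le t_N + c_\theta - \varepsilon} \Big| \frac{X^N_{t,2}}{X^N_{t,1}} \Big| > |\tan\theta| \Big) \\
&= P\Big( \sup_{t \le t_N + c_\theta - \varepsilon} \Big| \frac{x_{0,1} e^{-\mu t} \varepsilon_{1,2} + N^{-1/2} Z_\infty e^{\lambda t}(1 + \varepsilon_{2,2})}{x_{0,1} e^{-\mu t}(1 + \varepsilon_{1,1}) + N^{-1/2} Z_\infty e^{\lambda t} \varepsilon_{2,1}} \Big| > |\tan\theta| \Big) \\
&\to 0
\end{align*}
as $N \to \infty$, where $\varepsilon_{i,j}$ is the $j$th coordinate of $\varepsilon_i$. Similarly,
\[ P\big( T^N_\theta \ge t_N + c_\theta + \varepsilon \big) \le P\Big( \inf_{t \ge t_N + c_\theta + \varepsilon} \Big| \frac{X^N_{t,2}}{X^N_{t,1}} \Big| \le |\tan\theta| \Big) \to 0. \]

The result follows immediately.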
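The observation used at the start of the proof, and the constant appearing in (2.10), can be checked by a direct calculation on the deterministic curve. The following worked steps are a sketch of that calculation; they use $|x_0| = |x_{0,1}|$, which holds because $x_0 = x_{0,1} e_1$.

```latex
\begin{align*}
\frac{N^{-1/2} |Z_\infty| e^{\lambda t}}{|x_{0,1}| e^{-\mu t}} = |\tan\theta|
  &\iff e^{(\lambda+\mu)t} = N^{1/2} \left| \frac{x_{0,1}\tan\theta}{Z_\infty} \right| \\
  &\iff t = \frac{1}{2(\lambda+\mu)}\log N
        + \frac{1}{\lambda+\mu}\log\left| \frac{x_{0,1}\tan\theta}{Z_\infty} \right|
        = t_N + c_\theta .
\end{align*}
At this time the point on the curve lies on $l_{\pm\theta}$, so its distance from the
origin is $|\sec\theta|$ times the modulus of its first coordinate:
\begin{align*}
|\sec\theta|\,|x_{0,1}|\,e^{-\mu(t_N + c_\theta)}
  &= |\sec\theta|\,|x_{0,1}|\,N^{-\frac{\mu}{2(\lambda+\mu)}}
     \left| \frac{x_{0,1}\tan\theta}{Z_\infty} \right|^{-\frac{\mu}{\lambda+\mu}} \\
  &= N^{-\frac{\mu}{2(\lambda+\mu)}}\,|\sec\theta|\,|\tan\theta|^{-\frac{\mu}{\lambda+\mu}}
     \,|x_{0,1}|^{\frac{\lambda}{\lambda+\mu}}\,|Z_\infty|^{\frac{\mu}{\lambda+\mu}} .
\end{align*}
```

Multiplying through by $N^{\frac{\mu}{2(\lambda+\mu)}}$ gives the right-hand side of (2.10).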

Remark 2.6. The sign of $Z_\infty$ determines whether $X^N_t$ hits $l_\theta$ or $l_{-\theta}$ at time $T^N_\theta$. Since $Z_\infty$ is a Gaussian random variable with mean 0, each event occurs with probability $\frac{1}{2}$.

Example 2.7 (The OK Corral Problem). The OK Corral process is a $\mathbb{Z}^2$-valued process $(U^N_t, V^N_t)$ used to model the famous gunfight. Here $U^N_t$ and $V^N_t$ are the number of gunmen on each side and $U^N_0 = V^N_0 = N$. Each gunman fires lethal gunshots at times of a Poisson process with rate 1 until either there is no-one left on the other side or he is killed. The transition rates are
\[ (u, v) \to \begin{cases} (u-1, v) & \text{at rate } v \\ (u, v-1) & \text{at rate } u \end{cases} \]
until $uv = 0$.

The process terminates when all the gunmen on one side are dead. We are

interested in the number of gunmen surviving when the process terminates, for


large values of N .

This model was formulated by Williams and McIlroy [34] and later studied by

Kingman [25] and subsequently Kingman and Volkov [26].

Let $X^N_t = (U^N_t, V^N_t)/N$. This gives a sequence of pure jump Markov processes, starting from $x_0 = (1, 1)$, with Lévy kernels
\[ K^N(x, dy) = N x_2 \delta_{(-1/N, 0)} + N x_1 \delta_{(0, -1/N)}. \]

If we let
\[ K(x, dy) = x_2 \delta_{(-1, 0)} + x_1 \delta_{(0, -1)}, \]
then
\[ m(x, \theta) = x_2 e^{-\theta_1} + x_1 e^{-\theta_2} = \frac{m^N(x, N\theta)}{N}, \]
\[ b(x) = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} x = b^N(x), \]
and
\[ a(x) = \begin{pmatrix} x_2 & 0 \\ 0 & x_1 \end{pmatrix}. \]

So, under a rotation by $\frac{\pi}{4}$, the conditions required for Theorem 2.5 are satisfied, with $\lambda = \mu = 1$. In the original coordinates, the process terminates when $X^N_t$ hits the $x$ or $y$ axes. Under the rotation, this corresponds to hitting $l_{\pm\pi/4}$. Hence, if the OK Corral process terminates at time $T^N$ and there are $S^N$ survivors, then $T^N = T^N_{\pi/4}$ and $S^N = N |X^N_{T^N_{\pi/4}}|$, and so
\[ T^N - \frac{1}{4}\log N \Rightarrow \frac{1}{4}\log 2 - \frac{1}{2}\log |Z_\infty| \]
and
\[ N^{-3/4} S^N \Rightarrow 2^{3/4} |Z_\infty|^{1/2}, \]
where $Z_\infty \sim N(0, \frac{1}{3})$. The limiting distribution of $N^{-3/4} S^N$ is the one obtained by Kingman in [25].
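The $N^{3/4}$ scaling of the number of survivors can be observed in a direct simulation of the transition rates given in this example. The following is a minimal sketch rather than anything taken from the thesis: it simulates only the embedded jump chain (the holding times are not needed to count survivors), and the function names, seed and parameter values are hypothetical.

```python
import random

def ok_corral_survivors(n, rng):
    """Run one gunfight with n gunmen per side and return the number of survivors.

    From state (u, v) the jump to (u - 1, v) has rate v and the jump to
    (u, v - 1) has rate u, so the next casualty is on the U side with
    probability v / (u + v).
    """
    u = v = n
    while u > 0 and v > 0:
        if rng.random() * (u + v) < v:
            u -= 1  # a gunman on the U side is killed (rate v)
        else:
            v -= 1  # a gunman on the V side is killed (rate u)
    return u + v    # one side is empty, so this is the survivor count

def mean_scaled_survivors(n, reps, seed=0):
    """Average of N^{-3/4} S^N over `reps` independent gunfights."""
    rng = random.Random(seed)
    total = sum(ok_corral_survivors(n, rng) for _ in range(reps))
    return total / (reps * n ** 0.75)

print(mean_scaled_survivors(n=1000, reps=200))
```

Repeating the run for several values of `n` gives averages of comparable size, in line with the convergence of $N^{-3/4} S^N$ stated above. Replacing the exponent 0.75 by another value makes the averages drift with `n`.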

Remark 2.8. It is remarked by Kingman [25] that it is the occurrence of the surprising power of $N$ that makes the OK Corral process of interest. Theorem 2.5 shows that this is a special case of a more general phenomenon and, in fact, by a suitable choice of $\frac{\lambda}{\mu}$, every number in the interval $(\frac{1}{2}, 1)$ may be obtained as a power of $N$ in this way.

2.3 Linearization of the limit process

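We now turn to the general case where $b(x) = Bx + \tau(x)$ for
\[ B = \begin{pmatrix} -\mu & 0 \\ 0 & \lambda \end{pmatrix}, \quad \mu, \lambda > 0, \]
and $\tau : \mathbb{R}^2 \to \mathbb{R}^2$ twice continuously differentiable, with $\tau(0) = \nabla\tau(0) = 0$. Let $\phi_t(x)$ be the solution to the ordinary differential equation
\[ \dot{\phi}_t(x) = b(\phi_t(x)), \quad \phi_0(x) = x. \tag{2.11} \]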

This section consists of a technical calculation which expresses φt(x) in a linear

form.

We are interested in the behaviour of solutions starting near the stable mani-

fold. Lemma 2.10 proves the existence of the stable manifold and establishes the

limiting behaviour of a stable solution. First order behaviour is investigated in

Lemma 2.11, and these results are then used in Theorem 2.12 to express solutions

near the stable manifold in the required linear form. Theorem 2.13 shows that

over large time periods, solutions starting near the stable manifold approach the

unstable manifold.

Throughout this section we use the following classical planar linearization theorem due to Hartman [16].

Theorem 2.9. There exists a $C^1$ diffeomorphism $h : U \to V = h(U)$, defined on an open neighbourhood $U$ of the origin, with uniformly Hölder continuous partial derivatives and having the form $h(x) = x + o(x)$ such that

\[ h(\phi_t(x)) = e^{Bt}h(x) \]

for all $(t, x)$ with $\phi_t(x) \in U$.

Pick $0 < \delta < 1$ sufficiently small that the ball of radius $\delta$ centered at the origin is contained in $U \cap V$. Since $h^{-1}(x) = x + o(x)$ and $\nabla h(x) = I + o(1)$, we can further ensure that $\delta$ is sufficiently small that

\[ \sup_{0<|x|<\delta}\left( \frac{|h(x)|}{|x|} \vee \frac{|h^{-1}(x)|}{|x|} \right) < 2 \]

and

\[ \sup_{|x|<\delta}\left( |\nabla h(x) - I| \vee |\nabla h^{-1}(x) - I| \right) < 1/2. \]

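Lemma 2.10. There exists an $x_0 \in \mathbb{R}^2$ with $0 < |x_0| < \delta/8$ such that $\phi_t(x_0) \to 0$ as $t \to \infty$. Furthermore, for any such $x_0$, there exists some $\bar{x}_0 \in \mathbb{R}$ with $0 < |\bar{x}_0| < \delta/4$ such that

\[ e^{\mu t}\phi_t(x_0) \to \begin{pmatrix} \bar{x}_0 \\ 0 \end{pmatrix} \]

as $t \to \infty$, and

\[ |\phi_t(x_0)| \le 2|\bar{x}_0|e^{-\mu t} < \delta e^{-\mu t}/2 \]

for all $t \ge 0$.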

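Proof. Pick some $\bar{x}_0 \in \mathbb{R}$ with $0 < |\bar{x}_0| < \delta/16$ and define $x_0 = h^{-1}(\bar{x}_0, 0)$. Then

\[ 0 < |x_0| \le \sup_{0<|x|<\delta}\frac{|h^{-1}(x)|}{|x|}\,|\bar{x}_0| < \frac{\delta}{8}, \]

and

\[ \phi_t(x_0) = h^{-1}\left( e^{Bt}\begin{pmatrix} \bar{x}_0 \\ 0 \end{pmatrix} \right) = h^{-1}\begin{pmatrix} e^{-\mu t}\bar{x}_0 \\ 0 \end{pmatrix} \to 0 \]

as $t \to \infty$.

Conversely, given $x_0$ satisfying the above conditions, define $\bar{x}_0 = h(x_0)_1$. Note that because of the form of $h(x)$, $\bar{x}_0$ has the same sign as $x_{0,1}$. Since $e^{Bt}h(x_0) = h(\phi_t(x_0)) \to 0$ as $t \to \infty$, $h(x_0)_2 = 0$, and so

\[ 0 < |\bar{x}_0| = |h(x_0)| \le 2|x_0| < \delta/4. \]

Also

\[ e^{\mu t}\phi_t(x_0) = e^{\mu t}h^{-1}\left( e^{Bt}\begin{pmatrix} \bar{x}_0 \\ 0 \end{pmatrix} \right) = e^{\mu t}\left( \begin{pmatrix} e^{-\mu t}\bar{x}_0 \\ 0 \end{pmatrix} + o(e^{-\mu t}\bar{x}_0) \right) \to \begin{pmatrix} \bar{x}_0 \\ 0 \end{pmatrix} \]

as $t \to \infty$, and

\[ |\phi_t(x_0)| = \left| h^{-1}\left( e^{Bt}\begin{pmatrix} \bar{x}_0 \\ 0 \end{pmatrix} \right) \right| = \left| h^{-1}\begin{pmatrix} e^{-\mu t}\bar{x}_0 \\ 0 \end{pmatrix} \right| \le 2|\bar{x}_0|e^{-\mu t} < \frac{\delta}{2}e^{-\mu t} \]

for all $t \ge 0$.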

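Lemma 2.11. (i) There exists some $D_0 \in (\mathbb{R}^2)^* \setminus \{0\}$, where $0 = (0\ \ 0)$, such that

\[ e^{-\lambda t}\nabla\phi_t(x_0) \to \begin{pmatrix} 0 \\ D_0 \end{pmatrix} \]

as $t \to \infty$.

(ii) If $|x| < \delta$ and $|\phi_t(x)| < \delta/2$, then $|\nabla\phi_t(x)| < 4e^{\lambda t}$.

(iii) If $|x| + |y| < \delta$ and $\sup_{0\le\theta\le 1}|\phi_t(x + \theta y)| < \delta/2$, then there exist constants $K \in \mathbb{R}$ and $0 < \alpha \le 1$ such that

\[ |\nabla\phi_t(x+y) - \nabla\phi_t(x)| \le K e^{\lambda t(1+\alpha)}|y|^\alpha. \]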

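Proof. (i) Let $D_0 = \nabla h_2(x_0) \in (\mathbb{R}^2)^* \setminus \{0\}$. Then

\[ e^{-\lambda t}\nabla\phi_t(x_0) = \nabla h^{-1}\begin{pmatrix} e^{-\mu t}\bar{x}_0 \\ 0 \end{pmatrix} e^{(B-\lambda I)t}\,\nabla h(x_0) \to \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\nabla h(x_0) = \begin{pmatrix} 0 \\ D_0 \end{pmatrix} \]

as $t \to \infty$, where $\bar{x}_0 = h(x_0)_1$.

(ii) If $|\phi_t(x)| < \delta/2$, then $|e^{Bt}h(x)| = |h(\phi_t(x))| < \delta$ and so

\[ |\nabla\phi_t(x)| = |\nabla h^{-1}(e^{Bt}h(x))\,e^{Bt}\,\nabla h(x)| \le \sup_{|y|<\delta}|\nabla h^{-1}(y)|\,\sup_{|y|<\delta}|\nabla h(y)|\,e^{\lambda t} < 4e^{\lambda t}. \]

(iii) Since $h$ and $h^{-1}$ have uniformly Hölder continuous partial derivatives, there exists some $K_0 \in \mathbb{R}$ and $0 < \alpha < 1$ such that

\[ |\nabla h(w) - \nabla h(z)| \le K_0|w - z|^\alpha \]

and

\[ |\nabla h^{-1}(w) - \nabla h^{-1}(z)| \le K_0|w - z|^\alpha. \]

Therefore

\begin{align*}
|\nabla\phi_t(x+y) - \nabla\phi_t(x)|
&= \left| \nabla h^{-1}(e^{Bt}h(x+y))\,e^{Bt}\,\nabla h(x+y) - \nabla h^{-1}(e^{Bt}h(x))\,e^{Bt}\,\nabla h(x) \right| \\
&\le \left| \nabla h^{-1}(e^{Bt}h(x+y))\,e^{Bt}\left(\nabla h(x+y) - \nabla h(x)\right) \right| \\
&\quad + \left| \left( \nabla h^{-1}(e^{Bt}h(x+y)) - \nabla h^{-1}(e^{Bt}h(x)) \right) e^{Bt}\,\nabla h(x) \right| \\
&\le 2e^{\lambda t}K_0|y|^\alpha + 2e^{\lambda t}K_0\left| e^{\lambda t}(h(x+y) - h(x)) \right|^\alpha \\
&\le 8K_0 e^{\lambda t(1+\alpha)}|y|^\alpha.
\end{align*}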

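Suppose that $z \in \mathbb{R}^2$, with $0 < |z| < 1$, and $x_z = x_0 + z$.

Theorem 2.12. Fix $C$ and consider the limit $z \to 0$ with $\left| \frac{z}{D_0 z} \right| < C$, where $D_0$ is defined in Lemma 2.11. There exist $w_i$, $i = 1, 2$ (not necessarily unique) with $w_i(t, z) \to 0$ uniformly in $t \in [R, -\frac{1}{\lambda}\log|z| - R]$ as $z \to 0$ and $R \to \infty$ such that

\[ \phi_t(x_z) = \bar{x}_0 e^{-\mu t}(e_1 + w_1) + D_0 z\, e^{\lambda t}(e_2 + w_2), \]

where $\bar{x}_0 \in \mathbb{R}$ is the scalar limit from Lemma 2.10.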

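Proof. Suppose that $R > \frac{1}{\lambda}\log\frac{8}{\delta - 4|\bar{x}_0|}$. If $|x - x_0| \le |z|$ and

\[ 0 \le t \le \left( \inf_{|x - x_0| \le |z|} \inf\left\{ s > 0 : |\phi_s(x)| \ge \frac{\delta}{2} \right\} \right) \wedge \left( -\frac{1}{\lambda}\log|z| - R \right), \]

then

\begin{align*}
|\phi_t(x)| &\le |\phi_t(x_0)| + |\phi_t(x) - \phi_t(x_0)| \\
&\le 2|\bar{x}_0|e^{-\mu t} + |\nabla\phi_t(x_0 + \theta'(x - x_0))|\,|x - x_0| \\
&\le 2|\bar{x}_0|e^{-\mu t} + 4|z|e^{\lambda t} \\
&\le 2|\bar{x}_0| + 4e^{-\lambda R} \\
&< \frac{\delta}{2},
\end{align*}

where $\theta' \in (0, 1)$. Hence, $|\phi_t(x)| < \delta/2$ for all $|x - x_0| \le |z|$ and $t \le -\frac{1}{\lambda}\log|z| - R$.

Now

\[ \phi_t(x_z) = \phi_t(x_0) + \nabla\phi_t(x_0)z + \left( \nabla\phi_t(x_0 + \theta z) - \nabla\phi_t(x_0) \right)z \]

for some $\theta \in (0, 1)$ and so, defining

\[ w_1(t, z) = \bar{x}_0^{-1}\left( e^{\mu t}\phi_t(x_0) - \bar{x}_0 e_1 \right) \]

and

\[ w_2(t, z) = (D_0 z)^{-1}\left( e^{-\lambda t}\nabla\phi_t(x_0)z - D_0 z\,e_2 + e^{-\lambda t}\left( \nabla\phi_t(x_0 + \theta z) - \nabla\phi_t(x_0) \right)z \right), \]

we have

\[ \phi_t(x_z) = \bar{x}_0 e^{-\mu t}(e_1 + w_1) + D_0 z\,e^{\lambda t}(e_2 + w_2). \]

Then $|w_1| \to 0$ uniformly in $t > R$ as $R \to \infty$ by Lemma 2.10, and

\begin{align*}
|w_2| &\le \frac{|z|}{|D_0 z|}\left( \left| e^{-\lambda t}\nabla\phi_t(x_0) - \begin{pmatrix} 0 \\ D_0 \end{pmatrix} \right| + K e^{\lambda\alpha t}|z|^\alpha \right) \\
&\le C\left( \left| e^{-\lambda t}\nabla\phi_t(x_0) - \begin{pmatrix} 0 \\ D_0 \end{pmatrix} \right| + K e^{-\lambda\alpha R} \right) \\
&\to 0
\end{align*}

uniformly in $t \in [R, -\frac{1}{\lambda}\log|z| - R]$ as $R \to \infty$ and $z \to 0$, by Lemma 2.11.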
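As a sanity check, the decomposition can be worked out by hand in the purely linear case. The sketch below is an illustration only, under the assumption that $\tau = 0$, so that $h$ may be taken to be the identity and $D_0 = e_2^*$. It writes $\bar{x}_0$ for the first coordinate of $x_0 = (\bar{x}_0, 0)$ and uses the definitions of $w_1$ and $w_2$ from the proof.

```latex
\begin{align*}
\phi_t(x_z) &= e^{Bt}(x_0 + z)
  = \begin{pmatrix} (\bar{x}_0 + z_1)e^{-\mu t} \\ z_2 e^{\lambda t} \end{pmatrix},
  \qquad D_0 z = z_2, \\
w_1(t,z) &= \bar{x}_0^{-1}\bigl(e^{\mu t}\phi_t(x_0) - \bar{x}_0 e_1\bigr) = 0, \\
w_2(t,z) &= z_2^{-1}\bigl(e^{-\lambda t}e^{Bt}z - z_2 e_2\bigr)
  = \frac{z_1}{z_2}\,e^{-(\lambda+\mu)t}\,e_1, \\
|w_2(t,z)| &\le \frac{|z|}{|D_0 z|}\,e^{-(\lambda+\mu)t} \le C e^{-(\lambda+\mu)R}
  \quad \text{for } t \ge R.
\end{align*}
```

In this special case the error terms vanish as $R \to \infty$ alone, and the condition $|z|/|D_0 z| < C$ is exactly what keeps the starting perturbation from lying too close to the stable direction.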

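Since $\phi_t^{-1}(x)$ satisfies (2.11) with $b$ replaced by $-b$, we may apply Lemma 2.10 and Lemma 2.11 to deduce the existence of $x_\infty$ with $0 < |x_\infty| < \delta/8$ such that $e^{\lambda t}\phi_t^{-1}(x_\infty) \to \begin{pmatrix} 0 \\ \bar{x}_\infty \end{pmatrix}$ for some $\bar{x}_\infty \in \mathbb{R}$ as $t \to \infty$, and $D_\infty$ such that $e^{-\mu t}\nabla\phi_t^{-1}(x_\infty) \to \begin{pmatrix} D_\infty \\ 0 \end{pmatrix}$ as $t \to \infty$. Suppose that as $z \to 0$, the sign of $D_0 z$ is eventually constant and non-zero. As $\bar{x}_\infty$ has the same sign as $x_{\infty,2}$ (see the proof of Lemma 2.10), we may choose $x_\infty$ such that $\frac{D_0 z}{\bar{x}_\infty} > 0$.

There exists some $t_\infty > 0$ such that $\phi_t(x_0)$ does not intersect the line $l(r) = x_\infty + rD_\infty^*$ for any $t > t_\infty$. Let

\[ s_z = \inf\{ t > t_\infty : \phi_t(x_z) = x_\infty + rD_\infty^* \text{ for some } r \in \mathbb{R} \}. \]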


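Theorem 2.13. Fix $C > 0$ and consider the limit $z \to 0$ with $\left| \frac{z}{D_0 z} \right| \le C$. Then

\[ s_z - \frac{1}{\lambda}\log\frac{\bar{x}_\infty}{D_0 z} \to 0 \]

and

\[ \left( \frac{\bar{x}_\infty}{D_0 z} \right)^{\mu/\lambda}\left( \phi_{s_z}(x_z) - x_\infty \right) \to \bar{x}_0\frac{D_\infty^*}{|D_\infty|^2} \]

as $z \to 0$.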

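Proof. We shall prove this theorem in the case where, for $z$ sufficiently small, $D_0 z, \bar{x}_0 > 0$. The other cases are similar.

Since

\[ \frac{\phi_t(x_0)_2}{\phi_t(x_0)_1} = \frac{e^{\mu t}\phi_t(x_0)_2}{e^{\mu t}\phi_t(x_0)_1} \to \frac{0}{\bar{x}_0} = 0 \]

as $t \to \infty$, there exists some $T > 0$ such that $\left| \frac{\phi_t(x_0)_2}{\phi_t(x_0)_1} \right| < 1$ for all $t > T$. Let

\[ t_z = \inf\{ t > T : |\phi_t(x_z)_1| = |\phi_t(x_z)_2| \}. \]

By expressing $\phi_t(x_z)$ in the form derived in Theorem 2.12, we may use a similar argument to that in Theorem 2.5 to show

\[ t_z - \frac{1}{\lambda+\mu}\log\frac{\bar{x}_0}{D_0 z} \to 0 \]

as $z \to 0$. Let $f : B(0,1) \to \mathbb{R}$ be defined by $f(z) = \phi_{t_z}(x_z)_1$. Again as in Theorem 2.5,

\[ (D_0 z)^{-\frac{\mu}{\lambda+\mu}} f(z) \to \bar{x}_0^{\frac{\lambda}{\lambda+\mu}} \]

as $z \to 0$.

Define $g : \mathbb{R}^+ \to \mathbb{R}$ by

\[ g(y) = \phi^{-1}_{t'_y}\left( x_\infty + y\frac{D_\infty^*}{|D_\infty|^2} \right)_1, \]

where $t'_y$ is defined in the same way as $t_z$ except for $\phi^{-1}$ instead of $\phi$. (The scaling factor of $|D_\infty|^2$ is chosen so that $D_\infty\left( y\frac{D_\infty^*}{|D_\infty|^2} \right) = y$.) Note that $\phi_{s_z}(x_z) = x_\infty + g^{-1}(f(z))\frac{D_\infty^*}{|D_\infty|^2}$.

By a similar argument to above, $y^{-\frac{\lambda}{\lambda+\mu}} g(y) \to \bar{x}_\infty^{\frac{\mu}{\lambda+\mu}}$ as $y \to 0$. But then

\begin{align*}
\left| \left( \frac{\bar{x}_\infty}{D_0 z} \right)^{\mu/\lambda} g^{-1}(f(z)) - \bar{x}_0 \right|
&\le (D_0 z)^{-\mu/\lambda}\left| \bar{x}_\infty^{\mu/\lambda} g^{-1}(f(z)) - f(z)^{\frac{\lambda+\mu}{\lambda}} \right|
 + \left| \left( (D_0 z)^{-\frac{\mu}{\lambda+\mu}} f(z) \right)^{\frac{\lambda+\mu}{\lambda}} - \bar{x}_0 \right| \\
&= \left( \frac{(D_0 z)^{-\frac{\mu}{\lambda+\mu}} f(z)}{y^{-\frac{\lambda}{\lambda+\mu}} g(y)} \right)^{\frac{\lambda+\mu}{\lambda}}
 \left| \bar{x}_\infty^{\mu/\lambda} - \left( y^{-\frac{\lambda}{\lambda+\mu}} g(y) \right)^{\frac{\lambda+\mu}{\lambda}} \right|
 + \left| \left( (D_0 z)^{-\frac{\mu}{\lambda+\mu}} f(z) \right)^{\frac{\lambda+\mu}{\lambda}} - \bar{x}_0 \right| \\
&\to 0
\end{align*}

as $z \to 0$, where $y = g^{-1}(f(z)) \to 0$ as $z \to 0$. So

\[ \left( \frac{\bar{x}_\infty}{D_0 z} \right)^{\mu/\lambda}\left( \phi_{s_z}(x_z) - x_\infty \right) = \left( \frac{\bar{x}_\infty}{D_0 z} \right)^{\mu/\lambda} g^{-1}(f(z))\frac{D_\infty^*}{|D_\infty|^2} \to \bar{x}_0\frac{D_\infty^*}{|D_\infty|^2}. \]

Also, since $t'_y = s_z - t_z$, and $t'_y - \frac{1}{\lambda+\mu}\log\frac{\bar{x}_\infty}{y} \to 0$ as $y \to 0$,

\[ (s_z - t_z) - \frac{1}{\lambda+\mu}\log\frac{\bar{x}_\infty}{\left( \frac{D_0 z}{\bar{x}_\infty} \right)^{\mu/\lambda}\bar{x}_0} \to 0, \]

i.e.

\[ s_z - \frac{1}{\lambda}\log\frac{\bar{x}_\infty}{D_0 z} \to 0. \]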

2.4 Convergence of the fluctuations

Now suppose that $X^N_t$ is a pure jump Markov process satisfying all the conditions in Section 2.2, except with $b(x) = Bx + \tau(x)$, $B$ and $\tau$ defined as in Section 2.3. In this section we express $\phi_{t-t_0}(X^N_{t_0})$ in a linear form for large values of $N$ and $t_0$.

Recall from Section 2.2 (page 9) that $\gamma^N_t = N^{1/2}(X^N_t - x_t)$ and $\gamma^N_t \Rightarrow \gamma_t$ for each $t$ as $N \to \infty$, where $(\gamma_t)_{t \ge 0}$ is the unique solution to the linear stochastic differential equation (2.4).

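Fix some $t_0 > 0$. Then $\phi_{t-t_0}(X^N_{t_0}) = \phi_t(\phi^{-1}_{t_0}(X^N_{t_0}))$ and, using the same notation as in Section 2.2, there exists some $\theta \in (0, 1)$ such that

\begin{align*}
\phi^{-1}_{t_0}(X^N_{t_0}) &= \phi^{-1}_{t_0}(x_{t_0}) + N^{-1/2}\nabla\phi^{-1}_{t_0}(x_{t_0})\gamma^N_{t_0}
 + N^{-1/2}\left( \nabla\phi^{-1}_{t_0}(x_{t_0} + \theta N^{-1/2}\gamma^N_{t_0}) - \nabla\phi^{-1}_{t_0}(x_{t_0}) \right)\gamma^N_{t_0} \\
&= x_0 + N^{-1/2}Z^N_{t_0},
\end{align*}

where $Z^N_{t_0} \Rightarrow Z_{t_0} = \nabla\phi^{-1}_{t_0}(x_{t_0})\gamma_{t_0}$ as $N \to \infty$. Now

\begin{align*}
D_0 Z_{t_0} &= \lim_{t\to\infty} e_2^* e^{-\lambda t}\nabla\phi_t(x_0)\int_0^{t_0}\nabla\phi_s^{-1}(x_s)\sigma(x_s)\,dW_s \\
&= \lim_{t\to\infty} e_2^* e^{-\lambda t}\int_0^{t_0}\nabla\phi_{t-s}(x_s)\sigma(x_s)\,dW_s,
\end{align*}

and

\begin{align*}
\liminf_{t\to\infty} e^{-2\lambda t}\int_0^\infty |e_2^*\nabla\phi_{t-s}(x_s)\sigma(x_s)|^2\,ds
&\le \liminf_{t\to\infty} e^{-2\lambda t}\int_0^\infty |\nabla\phi_{t-s}(x_s)_2|^2\,|a(x_s)|\,ds \\
&\le \liminf_{t\to\infty} e^{-2\lambda t}\int_0^\infty 16|D_s|^2 e^{2\lambda(t-s)}A\,ds \\
&\le \frac{32A}{\lambda},
\end{align*}

where $A$ is defined in (2.3) and the modulus of $D_s = \lim_{t\to\infty} e^{-\lambda t}\nabla\phi_t(x_s)_2$ is bounded above by 2, by the same argument used to show existence of $D_0$ in Lemma 2.11. Hence, if we define

\begin{align*}
\sigma_\infty^2 &= \int_0^\infty \lim_{t\to\infty} e^{-2\lambda t}\,\nabla\phi_{t-s}(x_s)_2\,a(x_s)\,\nabla\phi_{t-s}(x_s)_2^*\,ds \\
&= \int_0^\infty e^{-2\lambda s} D_s\,a(x_s)\,D_s^*\,ds, \tag{2.12}
\end{align*}

then $D_0 Z_{t_0} \to Z_\infty$ almost surely as $t_0 \to \infty$, where $Z_\infty \sim N(0, \sigma_\infty^2)$.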
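For the linear OK Corral drift of Section 2.2, the integral in (2.12) can be evaluated by hand. The sketch below is an illustration under stated assumptions: it works in the rotated coordinates with $\lambda = \mu = 1$, where the flow is linear so that $D_s = e_2^*$ for every $s$, and it uses the fact that along the deterministic path started from $(1,1)$ the diffusivity $a(x) = \mathrm{diag}(x_2, x_1)$ is a scalar multiple of the identity, which a rotation leaves unchanged.

```latex
\begin{align*}
x_s &= e^{-s}(1, 1) \ \text{(original coordinates)}, \qquad
a(x_s) = \begin{pmatrix} e^{-s} & 0 \\ 0 & e^{-s} \end{pmatrix} = e^{-s} I, \\
\sigma_\infty^2 &= \int_0^\infty e^{-2\lambda s}\, D_s\, a(x_s)\, D_s^*\, ds
 = \int_0^\infty e^{-2s}\, e^{-s}\, ds = \frac{1}{3}.
\end{align*}
```

This agrees with the variance of $Z_\infty \sim N(0, \tfrac{1}{3})$ quoted for the OK Corral process.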

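Choose $x_\infty^+$ and $x_\infty^-$, with $0 < |x_\infty^\pm| < \delta/2$ and $x^-_{\infty,2} < 0 < x^+_{\infty,2}$, such that $\phi_t^{-1}(x_\infty^\pm) \to 0$ as $t \to \infty$. Define a random variable $X_\infty$ on the same sample space as $Z_\infty$ by

\[ X_\infty = \begin{cases} x_\infty^+ & \text{if } Z_\infty > 0, \\ 0 & \text{if } Z_\infty = 0, \\ x_\infty^- & \text{if } Z_\infty < 0, \end{cases} \]

and define $\bar{X}_\infty$ similarly, except replacing $x_\infty^\pm$ by the corresponding scalars $\bar{x}_\infty^\pm$.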

By the Skorohod Representation Theorem, we may assume we are working in a sample space in which $Z^N_{t_0} \to Z_{t_0}$ almost surely for all $t_0 \in \mathbb{N}$. Without this assumption, analogous results about weak convergence hold; however, this assumption simplifies the formulation. Let

\[ S_{N,t_0} = \inf\{ s > t_\infty : \phi_{s-t_0}(X^N_{t_0}) = X_\infty + rD_\infty^* \text{ for some } r \in \mathbb{R} \} \tag{2.13} \]

and

\[ S_N = \frac{1}{2\lambda}\log N + \frac{1}{\lambda}\log\frac{\bar{X}_\infty}{Z_\infty}, \tag{2.14} \]

where we interpret $\frac{0}{0} = 1$.

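Theorem 2.14. Suppose that $\sigma_\infty \ne 0$.

(i) As $N \to \infty$ and then $t_0 \to \infty$,

\[ e^{\mu t}|\phi_{t-t_0}(X^N_{t_0}) - \phi_t(x_0)| \to 0 \]

in probability, uniformly in $t$ on compacts.

(ii) If $R \le t \le \frac{1}{2\lambda}\log N - R$, then there exist $\varepsilon'_i(N, t_0, t) \to 0$, uniformly in $t$, in probability as $R, N \to \infty$ and then $t_0 \to \infty$, such that

\[ \phi_{t-t_0}(X^N_{t_0}) = \bar{x}_0 e^{-\mu t}(e_1 + \varepsilon'_1) + N^{-1/2}Z_\infty e^{\lambda t}(e_2 + \varepsilon'_2). \]

(iii) As $N \to \infty$ and then $t_0 \to \infty$, $S_{N,t_0} - S_N \to 0$ in probability. Furthermore, if $t = S_{N,t_0} - s$ for some $s$, then

\[ e^{\lambda s}|\phi_{t-t_0}(X^N_{t_0}) - \phi_s^{-1}(X_\infty)| \to 0 \]

uniformly in $s$ on compacts, in probability as $N \to \infty$ and then $t_0 \to \infty$.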

Proof. (i) By Lemma 2.11, for some $\theta \in (0, 1)$,

\begin{align*}
e^{\mu t}|\phi_{t-t_0}(X^N_{t_0}) - \phi_t(x_0)| &= e^{\mu t}\left| \nabla\phi_t\left( x_0 + \theta N^{-1/2}Z^N_{t_0} \right) \right| N^{-1/2}|Z^N_{t_0}| \\
&\le 4e^{(\lambda+\mu)t}N^{-1/2}|Z^N_{t_0}| \to 0
\end{align*}

uniformly in $t$ on compacts, in probability.

(ii) We apply Theorem 2.12 with $z = N^{-1/2}Z^N_{t_0}$ and use the fact that $D_0 Z^N_{t_0} \to Z_\infty$ almost surely as $N \to \infty$ and then $t_0 \to \infty$. A potential problem arises when $Z_\infty$ is close to 0. However, as it is a Gaussian random variable, the probability of this occurring can be made arbitrarily small.

(iii) The first result follows from Theorem 2.13 by a similar argument to (ii). For the second result apply a similar argument to the proof of (i) to $\phi_t^{-1}$.

2.5 A fluid limit for jump Markov processes

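We now show that for large values of $N$ and $t$, $X^N_t$ is in some sense close to $\phi_{t-t_0}(X^N_{t_0})$ as $t_0 \to \infty$, and combine this with results from Section 2.3 to obtain results analogous to those in the linear case in Section 2.2.

Let $f(t, x) = e^{-Bt}\left( x - \phi_{t-t_0}(X^N_{t_0}) \right)$. By Itô's formula,

\[ f(t, X^N_t) = f(0, X^N_0) + M^{B,N}_t + \int_0^t \left( \frac{\partial f}{\partial t} + Kf \right)(s, X^N_{s-})\,ds, \]

where

\[ \frac{\partial f}{\partial t} = -Be^{-Bt}x - e^{-Bt}\tau(\phi_{t-t_0}(X^N_{t_0})), \]

\begin{align*}
Kf(s, X^N_{s-}) &= \int_{\mathbb{R}^2}\left( f(s, X^N_{s-} + y) - f(s, X^N_{s-}) \right)K^N(X^N_{s-}, dy) \\
&= \int_{\mathbb{R}^2} e^{-Bs}y\,K^N(X^N_{s-}, dy) \\
&= e^{-Bs}b^N(X^N_{s-}),
\end{align*}

and

\begin{align*}
M^{B,N}_t &= \int_{(0,t]\times\mathbb{R}^2}\left( f(s, X^N_{s-} + y) - f(s, X^N_{s-}) \right)(\mu^N - \nu^N)(ds, dy) \\
&= \int_{(0,t]\times\mathbb{R}^2} e^{-Bs}y\,(\mu^N - \nu^N)(ds, dy).
\end{align*}

So if $t > t_0$, then

\begin{align*}
e^{-Bt}\left( X^N_t - \phi_{t-t_0}(X^N_{t_0}) \right) &= M^{B,N}_t - M^{B,N}_{t_0} + \int_{t_0}^t e^{-Bs}\left( b^N(X^N_{s-}) - b(X^N_{s-}) \right)ds \tag{2.15} \\
&\quad + \int_{t_0}^t e^{-Bs}\left( \tau(X^N_{s-}) - \tau(\phi_{s-t_0}(X^N_{t_0})) \right)ds.
\end{align*}

Since $\tau \in C^2$, $\nabla\tau$ is Lipschitz continuous on the unit disc with Lipschitz constant denoted by $K_0$. In addition to the restrictions on $\delta$ from Section 2.3, suppose that $\delta < \frac{\lambda\mu}{9K_0(\lambda+\mu)}$.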

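Theorem 2.15. For all $\varepsilon > 0$,

\[ \lim_{t_0\to\infty}\limsup_{N\to\infty} P\left( \sup_{t_0 \le t \le S_{N,t_0}} e^{-\lambda t}|X^N_t - \phi_{t-t_0}(X^N_{t_0})| > \varepsilon N^{-1/2} \right) = 0. \]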

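Proof. Let

\[ R_{N,t_0} = \inf\left\{ t > t_0 : e^{-\lambda t}|X^N_t - \phi_{t-t_0}(X^N_{t_0})| > N^{-1/2}\varepsilon \right\} \wedge S_{N,t_0}. \]

We shall show that $R_{N,t_0} = S_{N,t_0}$ by bounding the terms on the right hand side of (2.15).

Fix $c > 0$. Since increasing $\varepsilon$ decreases the above probability, we may assume $0 < \varepsilon < \eta_0 \wedge \frac{\lambda e^{-\lambda c}}{9K_0}$, where $\eta_0$ is defined at the start of Section 2.2. Suppose that $C > 4$ and pick $R > \frac{1}{\lambda}\log\left( \frac{8CK_0 e^{\lambda c}}{\lambda} \right)$. Define

\[ \Omega^1_{N,t_0} = \left\{ \sup_{t \ge t_0} e^{-\lambda t}\left| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \right| < \frac{N^{-1/2}\varepsilon}{3} \right\}, \]

\begin{align*}
\Omega^2_{N,t_0,R} &= \left\{ \sup_{0 \le t \le R} e^{\mu t}|\phi_{t-t_0}(X^N_{t_0}) - \phi_t(x_0)| < \frac{\delta}{2} \right\} \\
&\quad \cap \left\{ \sup_{R < t < S_{N,t_0} - R} |\varepsilon'_1(N, t_0, t)| \vee |\varepsilon'_2(N, t_0, t)| < 1 \right\} \\
&\quad \cap \left\{ \sup_{S_{N,t_0} - R \le t \le S_{N,t_0}} e^{\lambda(S_{N,t_0} - t)}\left| \phi_{t-t_0}(X^N_{t_0}) - \phi^{-1}_{S_{N,t_0} - t}(X_\infty) \right| < \frac{\delta}{2} \right\},
\end{align*}

where $\varepsilon'_1$ and $\varepsilon'_2$ are defined in Theorem 2.14, and

\[ \Omega^3_{N,t_0,c} = \left\{ S_{N,t_0} \le \frac{1}{2\lambda}\log N + c \right\}. \]

Let $N_0$ be sufficiently large that $\sup_{N \ge N_0} N^{1/2}\|b^N - b\| < \lambda\varepsilon/3$.

On the set $\Omega^1_{N,t_0} \cap \Omega^2_{N,t_0,R} \cap \Omega^3_{N,t_0,c} \cap \{ C^{-1} < |Z_\infty| < C \}$ with $N > N_0$, if $t_0 \le t < R$, then

\[ |\phi_{t-t_0}(X^N_{t_0})| \le \delta e^{-\mu t}, \]

if $R \le t \le S_{N,t_0} - R$, then

\begin{align*}
|\phi_{t-t_0}(X^N_{t_0})| &\le |\bar{x}_0|e^{-\mu t}(1 + |\varepsilon'_1|) + N^{-1/2}|Z_\infty|e^{\lambda t}(1 + |\varepsilon'_2|) \\
&\le \frac{\delta}{2}e^{-\mu t} + N^{-1/2}\,2Ce^{\lambda t},
\end{align*}

and if $S_{N,t_0} - R \le t \le S_{N,t_0}$, then

\[ |\phi_{t-t_0}(X^N_{t_0})| < \delta e^{-\lambda(S_{N,t_0} - t)}. \]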


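From (2.15), for some $\theta \in (0, 1)$,

\begin{align*}
e^{-\lambda t}\left| X^N_t - \phi_{t-t_0}(X^N_{t_0}) \right|
&\le e^{-\lambda t}\left| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \right| + e^{-\lambda t}\int_{t_0}^t |e^{B(t-s)}|\left| b^N(X^N_{s-}) - b(X^N_{s-}) \right|ds \\
&\quad + e^{-\lambda t}\int_{t_0}^t |e^{B(t-s)}|\left| \tau(X^N_{s-}) - \tau(\phi_{s-t_0}(X^N_{t_0})) \right|ds \\
&\le e^{-\lambda t}\left| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \right| + \frac{1}{\lambda}\|b^N - b\| \\
&\quad + \int_{t_0}^t e^{-\lambda s}\left| \nabla\tau\left( \phi_{s-t_0}(X^N_{t_0}) + \theta(X^N_{s-} - \phi_{s-t_0}(X^N_{t_0})) \right) \right| |X^N_{s-} - \phi_{s-t_0}(X^N_{t_0})|\,ds \\
&\le e^{-\lambda t}\left| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \right| + \frac{1}{\lambda}\|b^N - b\| \\
&\quad + K_0\int_{t_0}^t \left( |\phi_{s-t_0}(X^N_{t_0})| + |X^N_{s-} - \phi_{s-t_0}(X^N_{t_0})| \right)e^{-\lambda s}|X^N_{s-} - \phi_{s-t_0}(X^N_{t_0})|\,ds.
\end{align*}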

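Hence, on $\Omega^1_{N,t_0} \cap \Omega^2_{N,t_0,R} \cap \Omega^3_{N,t_0,c} \cap \{ C^{-1} < |Z_\infty| < C \}$ with $N > N_0$,

\begin{align*}
\sup_{t_0 \le t \le R_{N,t_0}} e^{-\lambda t}\left| X^N_t - \phi_{t-t_0}(X^N_{t_0}) \right|
&\le \frac{N^{-1/2}\varepsilon}{3} + \frac{N^{-1/2}\varepsilon}{3} + K_0\int_{t_0}^{R_{N,t_0}}\left( |\phi_{s-t_0}(X^N_{t_0})| + |X^N_{s-} - \phi_{s-t_0}(X^N_{t_0})| \right)N^{-1/2}\varepsilon\,ds \\
&\le N^{-1/2}\varepsilon\left( \frac{2}{3} + K_0\left( \int_{t_0}^{R_{N,t_0}}\left( \delta(e^{-\mu t} + e^{-\lambda(S_{N,t_0} - t)}) + N^{-1/2}\varepsilon e^{\lambda t} \right)dt + \int_{t_0}^{S_{N,t_0} - R} N^{-1/2}\,2Ce^{\lambda t}\,dt \right) \right) \\
&\le N^{-1/2}\varepsilon\left( \frac{2}{3} + K_0\left( \frac{\delta(\lambda+\mu)}{\lambda\mu} + \frac{\varepsilon e^{\lambda c}}{\lambda} + \frac{2Ce^{\lambda c}}{\lambda}e^{-\lambda R} \right) \right) \\
&< N^{-1/2}\varepsilon.
\end{align*}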

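Since $X^N_t$ is right continuous, this means $R_{N,t_0} = S_{N,t_0}$ and so

\begin{align*}
&P\left( \sup_{t_0 \le t \le S_{N,t_0}} e^{-\lambda t}|X^N_t - \phi_{t-t_0}(X^N_{t_0})| > N^{-1/2}\varepsilon \right) \\
&\qquad \le P((\Omega^1_{N,t_0})^c) + P((\Omega^2_{N,t_0,R})^c) + P((\Omega^3_{N,t_0,c})^c) + P\left( |Z_\infty| \notin (C^{-1}, C) \right).
\end{align*}

Letting $N, t_0, R, C, c \to \infty$ in that order, and using Lemma 2.1 and Theorem 2.14 gives

\[ \lim_{t_0\to\infty}\limsup_{N\to\infty} P\left( \sup_{t \le S_{N,t_0}} e^{-\lambda t}|X^N_t - \phi_{t-t_0}(X^N_{t_0})| > N^{-1/2}\varepsilon \right) = 0. \]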

Remark 2.16. The same idea can be used to obtain convergence results for arbitrary

matrices B e.g. with eigenvalues having the same sign or in higher dimensions. The

rate of convergence and the time up to which convergence is valid will depend on

the eigenvalues of B and bounds on |φt(x)|.

Combining the above result with Theorem 2.14 we get the following.

Theorem 2.17. (i) For all $N \in \mathbb{N}$,

\[ N^{1/2}|X^N_t - \phi_t(x_0)| \]

is bounded uniformly in $t$ on compacts, in probability. (This follows directly from the fluid limit theorem and diffusion approximation stated in Section 2.2.)

(ii) Suppose that $R \le t \le \frac{1}{2\lambda}\log N - R$. Then, provided that $\sigma_\infty \ne 0$, for $i = 1, 2$ there exist $\varepsilon_i(N, t) \to 0$ uniformly in $t$ in probability as $R, N \to \infty$ such that

\[ X^N_t = \bar{x}_0 e^{-\mu t}(e_1 + \varepsilon_1) + N^{-1/2}Z_\infty e^{\lambda t}(e_2 + \varepsilon_2) \]

(cf. (2.7)).

(iii) As $N \to \infty$,

\[ X^N_{S_N - s} \to \phi_s^{-1}(X_\infty), \]

uniformly on compacts in $s \ge 0$, in probability.

Remark 2.18. These results can be reformulated as results about weak convergence which are true independently of the choice of sample space, in a manner analogous to Theorem 2.3. In particular, for any sequence $t_N \to \infty$ as $N \to \infty$, $Z^N_\infty = N^{1/2}e^{-\lambda t_N}X^N_{t_N,2} \Rightarrow Z_\infty$. Working on a space in which this sequence converges almost surely is sufficient for Theorem 2.17.


2.6 Continuous diffusion Markov processes

Our interest in this problem arose through looking at the OK Corral problem. It

was therefore natural to prove results for pure jump Markov processes. However

the proof of the analogous result in the case of continuous diffusion processes is

similar and we give it below. The pure jump and continuous cases can be combined

to obtain results for more general Markov processes.

Let $(X^N_t)_{t \ge 0}$ be a sequence of diffusion processes, starting from $x_0$ and taking values in some open subset $S \subset \mathbb{R}^2$, that satisfy the stochastic differential equations

\[ dX^N_t = \sigma^N(X^N_t)\,dW_t + b^N(X^N_t)\,dt \]

with $\sigma^N$, $b^N$ Lipschitz.

Suppose that there exist limit functions $b(x) = Bx + \tau(x)$, with $B$ and $\tau$ as in Section 2.3, and $\sigma$, bounded, satisfying

(a) $\sup_{x \in S} N^{1/2}|b^N(x) - b(x)| \to 0$.

(b) $\sup_{x \in S} |N^{1/2}\sigma^N(x) - \sigma(x)| \to 0$.

It follows that there exists a constant $A$ such that for all $N$

\[ \|\sigma^N\| \le (A/N)^{1/2}. \tag{2.16} \]

Let $\gamma^N_t = N^{1/2}(X^N_t - x_t)$, where $x_t$ is defined as before. It is straightforward, using Gronwall's Lemma, to show that $\gamma^N_t \to \gamma_t$ as $N \to \infty$, where $(\gamma_t)_{t \ge 0}$ is the unique solution to the linear stochastic differential equation

\[ d\gamma_t = \sigma(x_t)\,dW_t + \nabla b(x_t)\gamma_t\,dt \tag{2.17} \]

starting from 0, where $W$ is a Brownian motion.
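The role of the unstable coordinate can be illustrated numerically in the simplest diffusion case. The following is a minimal sketch, assuming $\tau = 0$ and $\sigma^N = N^{-1/2}I$ (choices made purely for illustration, not taken from the text): it applies the Euler-Maruyama scheme to $dX = N^{-1/2}dW + BX\,dt$ started on the stable axis and returns $N^{1/2}e^{-\lambda t}X_{t,2}$, whose variance should approach $\int_0^\infty e^{-2\lambda s}ds = 1/(2\lambda)$ for large $t$. The function name and parameters are hypothetical.

```python
import math
import random


def scaled_unstable_coordinate(n, lam, mu, x0, t_end, dt, rng):
    """Euler-Maruyama for dX = N^{-1/2} dW + B X dt with B = diag(-mu, lam).

    Starts from (x0, 0) on the stable axis and returns
    N^{1/2} * exp(-lam * t_end) * X_{t_end, 2}.
    """
    x1, x2 = x0, 0.0
    noise_scale = math.sqrt(dt / n)  # standard deviation of each noise increment
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x1 += -mu * x1 * dt + noise_scale * rng.gauss(0.0, 1.0)
        x2 += lam * x2 * dt + noise_scale * rng.gauss(0.0, 1.0)
    return math.sqrt(n) * math.exp(-lam * t_end) * x2


def sample_variance_of_limit(paths, n=100, lam=1.0, mu=1.0, seed=0):
    """Empirical second moment of the scaled unstable coordinate over many paths."""
    rng = random.Random(seed)
    samples = [
        scaled_unstable_coordinate(n, lam, mu, 0.5, 3.0, 0.005, rng)
        for _ in range(paths)
    ]
    return sum(s * s for s in samples) / paths
```

With `lam = 1.0` the returned second moment should be close to $1/2$, up to discretisation and Monte Carlo error; in this linear setting the result does not depend on `n`, which reflects the exact $N^{-1/2}$ scaling of the fluctuations.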


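Consider the function $f(t, x) = e^{-Bt}\left( x - \phi_{t-t_0}(X^N_{t_0}) \right)$ for $t > t_0$. By Itô's formula,

\[ f(t, X^N_t) = f(t_0, X^N_{t_0}) + M^{B,N}_t - M^{B,N}_{t_0} + \int_{t_0}^t \left( \frac{\partial f}{\partial s}(s, X^N_s) + e^{-Bs}b^N(X^N_s) \right)ds, \]

where

\[ \frac{\partial f}{\partial t} = -Be^{-Bt}x - e^{-Bt}\tau(\phi_{t-t_0}(X^N_{t_0})), \]

and

\[ M^{B,N}_t = \int_0^t e^{-Bs}\sigma^N(X^N_s)\,dW_s. \]

So if $t > t_0$,

\begin{align*}
e^{-Bt}\left( X^N_t - \phi_{t-t_0}(X^N_{t_0}) \right) &= M^{B,N}_t - M^{B,N}_{t_0} + \int_{t_0}^t e^{-Bs}\left( b^N(X^N_{s-}) - b(X^N_{s-}) \right)ds \tag{2.18} \\
&\quad + \int_{t_0}^t e^{-Bs}\left( \tau(X^N_{s-}) - \tau(\phi_{s-t_0}(X^N_{t_0})) \right)ds.
\end{align*}

By comparison with (2.15), in order for the conclusion of Theorem 2.17 to hold for diffusion processes, it is sufficient to prove an analogue of Lemma 2.1.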

Lemma 2.19. There exists some constant $C$ such that

\[ E\left( \sup_{t \ge t_0} e^{-\lambda t}\left| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \right| \right) \le CN^{-1/2}e^{-\lambda t_0}. \]

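Proof. By the product rule,

\[ e^{(B-\lambda I)t}(M^{B,N}_t - M^{B,N}_{t_0}) = \int_{t_0}^t (B - \lambda I)e^{(B-\lambda I)s}(M^{B,N}_s - M^{B,N}_{t_0})\,ds + \int_{t_0}^t e^{-\lambda s}\sigma^N(X^N_s)\,dW_s \]

and, hence,

\begin{align*}
E\left( \sup_{t \ge t_0} e^{-\lambda t}\left| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \right| \right)
&\le E\left( \sup_{t \ge t_0}\int_{t_0}^t (\lambda+\mu)e^{-(\lambda+\mu)s}\left| (M^{B,N}_s - M^{B,N}_{t_0})_1 \right|ds \right)
 + E\left( \sup_{t \ge t_0}\left| \int_{t_0}^t e^{-\lambda s}\sigma^N(X^N_s)\,dW_s \right| \right) \\
&\le \int_{t_0}^\infty (\lambda+\mu)e^{-(\lambda+\mu)s}\left( E(M^{B,N}_s - M^{B,N}_{t_0})_1^2 \right)^{1/2}ds
 + \left( E\left( \sup_{t \ge t_0}\left| \int_{t_0}^t e^{-\lambda s}\sigma^N(X^N_s)\,dW_s \right|^2 \right) \right)^{1/2}.
\end{align*}

Since

\[ E\int_0^t |e^{-\lambda s}\sigma^N(X^N_s)|^2\,ds < \infty \]

for all $t \ge 0$, the process

\[ \left( \int_0^t e^{-\lambda s}\sigma^N(X^N_s)\,dW_s \right)_{t \ge 0} \]

is a martingale, and hence, by Doob's $L^2$ inequality,

\[ E\left( \sup_{t \ge t_0}\left| \int_{t_0}^t e^{-\lambda s}\sigma^N(X^N_s)\,dW_s \right|^2 \right) \le 4\sup_{t \ge t_0} E\left( \left| \int_{t_0}^t e^{-\lambda s}\sigma^N(X^N_s)\,dW_s \right|^2 \right). \]

Now

\begin{align*}
E\left( (M^{B,N}_t - M^{B,N}_{t_0})_1^2 \right) &= E\int_{t_0}^t e^{2\mu s}a^N(X^N_s)_{1,1}\,ds \\
&\le E\int_{t_0}^t e^{2\mu s}\frac{A}{N}\,ds \\
&\le \frac{e^{2\mu t}A}{2\mu N},
\end{align*}

where $A$ is defined in (2.16). Similarly,

\[ E\left( \left| \int_{t_0}^t e^{-\lambda s}\sigma^N(X^N_s)\,dW_s \right|^2 \right) \le \frac{e^{-2\lambda t_0}A}{2\lambda N}. \]

Hence,

\begin{align*}
E\left( \sup_{t \ge t_0} e^{-\lambda t}\left| e^{Bt}(M^{B,N}_t - M^{B,N}_{t_0}) \right| \right)
&\le \int_{t_0}^\infty (\lambda+\mu)e^{-\lambda s}\left( \frac{A}{2\mu N} \right)^{1/2}ds + e^{-\lambda t_0}\left( \frac{2A}{\lambda N} \right)^{1/2} \\
&\le \frac{A^{1/2}\left( \lambda + \mu + 2(\lambda\mu)^{1/2} \right)}{\lambda(2\mu)^{1/2}}\,N^{-1/2}e^{-\lambda t_0}.
\end{align*}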

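Define $\sigma_\infty$, $Z_\infty$, $X_\infty$, $\bar{X}_\infty$ as in Section 2.4 and let

\[ S_N = \frac{1}{2\lambda}\log N + \frac{1}{\lambda}\log\frac{\bar{X}_\infty}{Z_\infty}. \]

The following analogue of Theorem 2.17 for diffusion processes holds.

Theorem 2.20. (i) For all $N \in \mathbb{N}$,

\[ N^{1/2}|X^N_t - \phi_t(x_0)| \]

is bounded uniformly in $t$ on compacts, in probability.

(ii) Suppose that $R \le t \le \frac{1}{2\lambda}\log N - R$. Then, provided that $\sigma_\infty \ne 0$, for $i = 1, 2$ there exist $\varepsilon_i(N, t) \to 0$ uniformly in $t$ in probability as $R, N \to \infty$ such that

\[ X^N_t = \bar{x}_0 e^{-\mu t}(e_1 + \varepsilon_1) + N^{-1/2}Z_\infty e^{\lambda t}(e_2 + \varepsilon_2). \]

(iii) As $N \to \infty$,

\[ X^N_{S_N - s} \to \phi_s^{-1}(X_\infty), \]

uniformly on compacts in $s \ge 0$, in probability.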


2.7 Applications

Throughout this section we work in a sample space on which ZN∞ → Z∞ almost

surely so that, in particular, the statement of Theorem 2.17 holds.

2.7.1 Hitting lines through the origin

As in the linear case, Theorems 2.17 and 2.20 may be used to study the first time that $X^N_t$ hits $l_\theta$ or $l_{-\theta}$, the straight lines passing through the origin at angles $\theta$ and $-\theta$, where $\theta \in (0, \frac{\pi}{2})$, as $N \to \infty$. As in Section 2.2, we define the time that $X^N_t$ first crosses one of the lines $l_{\pm\theta}$ by

\[ T^N_\theta = \inf\left\{ t > 0 : \left| \frac{X^N_{t-,2}}{X^N_{t-,1}} \right| \le |\tan\theta| \text{ and } \left| \frac{X^N_{t,2}}{X^N_{t,1}} \right| > |\tan\theta| \right\}. \]

First note that by Lemma 2.10,

\[ \frac{\phi_t(x_0)_2}{\phi_t(x_0)_1} = \frac{e^{\mu t}\phi_t(x_0)_2}{e^{\mu t}\phi_t(x_0)_1} \to \frac{0}{\bar{x}_0} = 0 \]

as $t \to \infty$. In particular, since $\tan\theta \ne 0$, there exists some $s_\theta \ge 0$ such that $\left| \frac{\phi_t(x_0)_2}{\phi_t(x_0)_1} \right| < |\tan\theta|$ for all $t > s_\theta$. To rule out the trivial case where $T^N_\theta$ converges to the first time that $\phi_t(x_0)$ hits $l_{\pm\theta}$, we shall assume that $x_0$ is chosen sufficiently close to the origin that $s_\theta = 0$.

We prove the following result in the case where XNt is a pure jump process. The

proof for continuous diffusion processes is identical, except that it uses Theorem

2.20 in place of Theorem 2.17.

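Theorem 2.21. Under the conditions required for Theorem 2.17,

\[ T^N_\theta - t_N \Rightarrow c_\theta \]

and

\[ N^{\frac{\mu}{2(\lambda+\mu)}}|X^N_{T^N_\theta}| \Rightarrow |\sec\theta|\,|\tan\theta|^{-\frac{\mu}{\lambda+\mu}}\,|\bar{x}_0|^{\frac{\lambda}{\lambda+\mu}}\,|Z_\infty|^{\frac{\mu}{\lambda+\mu}} \]

as $N \to \infty$, where

\[ t_N = \frac{1}{2(\lambda+\mu)}\log N \quad\text{and}\quad c_\theta = \frac{1}{\lambda+\mu}\log\left| \frac{\bar{x}_0\tan\theta}{Z_\infty} \right|. \]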

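Proof. By the fluid limit theorem and diffusion approximation, for any constant $R > 0$,

\begin{align*}
P\left( T^N_\theta \le R \right) &\le P\left( \sup_{t \le R}\left| \frac{X^N_{t,2}}{X^N_{t,1}} \right| > |\tan\theta| \right) \\
&= P\left( \sup_{t \le R}\left| \frac{\phi_t(x_0)_2 + N^{-1/2}\gamma^N_{t,2}}{\phi_t(x_0)_1 + N^{-1/2}\gamma^N_{t,1}} \right| > |\tan\theta| \right) \\
&\to 0
\end{align*}

as $N \to \infty$.

By an identical argument to that in the proof of Theorem 2.5,

\[ P\left( R \le T^N_\theta \le t_N + c_\theta - \varepsilon \right) \to 0 \]

and

\[ P\left( t_N + c_\theta + \varepsilon \le T^N_\theta \le S_N - R \right) \to 0 \]

as $R, N \to \infty$. The result follows immediately.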
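The constants $t_N$ and $c_\theta$ can be read off heuristically from Theorem 2.17(ii). The sketch below is an illustration only: it drops the error terms $\varepsilon_1, \varepsilon_2$, writes $\bar{x}_0$ for the scalar limit of $e^{\mu t}\phi_t(x_0)_1$ from Lemma 2.10, and solves for the time at which the ratio of the two coordinates equals $|\tan\theta|$.

```latex
\begin{align*}
\left| \frac{N^{-1/2} Z_\infty e^{\lambda t}}{\bar{x}_0 e^{-\mu t}} \right| = |\tan\theta|
 &\iff e^{(\lambda+\mu)t} = N^{1/2}\left| \frac{\bar{x}_0 \tan\theta}{Z_\infty} \right| \\
 &\iff t = \frac{1}{2(\lambda+\mu)}\log N
   + \frac{1}{\lambda+\mu}\log\left| \frac{\bar{x}_0 \tan\theta}{Z_\infty} \right|
   = t_N + c_\theta, \\
|X^N_t| \approx |\sec\theta|\,|\bar{x}_0|\,e^{-\mu(t_N + c_\theta)}
 &= N^{-\frac{\mu}{2(\lambda+\mu)}}\,|\sec\theta|\,|\tan\theta|^{-\frac{\mu}{\lambda+\mu}}\,
   |\bar{x}_0|^{\frac{\lambda}{\lambda+\mu}}\,|Z_\infty|^{\frac{\mu}{\lambda+\mu}}.
\end{align*}
```

Both limits in Theorem 2.21 therefore come from balancing the decaying stable component against the growing unstable component.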

Remark 2.22. As in the linear case, the sign of $Z_\infty$ determines whether $X^N_t$ hits $l_\theta$ or $l_{-\theta}$ at time $T^N_\theta$. Since $Z_\infty$ is a Gaussian random variable with mean 0, each event occurs with probability $\frac{1}{2}$. Furthermore, provided that $x_\infty$ is chosen sufficiently close to the origin that $\phi_t^{-1}(x_\infty)$ does not intersect $l_{\pm\theta}$, if $X^N_t$ hits one of the two lines then the probability of it hitting either line again before $S_N$ converges to 0, as $N \to \infty$.

2.7.2 Minimum distance from the origin

Our second application is to investigate the minimum distance from the origin that

XNt can attain for large values of N .


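Theorem 2.23. Under the conditions required for Theorem 2.17,

\[ N^{\frac{\mu}{2(\lambda+\mu)}}\inf_{t \le S_N}|X^N_t| \Rightarrow \left( \frac{\mu}{\lambda} \right)^{\frac{\lambda}{2(\lambda+\mu)}}\left( \frac{\lambda}{\mu} + 1 \right)^{1/2}|\bar{x}_0|^{\frac{\lambda}{\lambda+\mu}}|Z_\infty|^{\frac{\mu}{\lambda+\mu}} \]

as $N \to \infty$.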

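Proof. By the fluid limit theorem and diffusion approximation, for any constant $R > 0$,

\[ \inf_{t \le R} N^{\frac{\mu}{2(\lambda+\mu)}}\left| X^N_t \right| \ge \inf_{t \le R} N^{\frac{\mu}{2(\lambda+\mu)}}\left( |\phi_t(x_0)| - N^{-1/2}|\gamma^N_t| \right) \to \infty, \]

as $N \to \infty$.

By Theorem 2.17,

\begin{align*}
\inf_{R \le t \le t_N - R} N^{\frac{\mu}{2(\lambda+\mu)}}\left| X^N_t \right|
&\ge \inf_{R \le t \le t_N - R}\left( e^{\mu(t_N - t)}|\bar{x}_0|(1 - |\varepsilon_1|) - e^{\lambda(t - t_N)}|Z_\infty|(1 + |\varepsilon_2|) \right) \\
&\to \infty
\end{align*}

in probability as $R, N \to \infty$, where

\[ t_N = \frac{1}{2(\lambda+\mu)}\log N. \]

For each $c > 0$, there exists some $\varepsilon = \varepsilon(N) \to 0$ in probability as $N \to \infty$ such that

\[ \inf_{S_N - c \le t \le S_N} N^{\frac{\mu}{2(\lambda+\mu)}}\left| X^N_t \right| \ge \inf_{0 \le s \le c} N^{\frac{\mu}{2(\lambda+\mu)}}\left( |\phi_s^{-1}(X_\infty)| - \varepsilon \right) \to \infty. \]

Also,

\begin{align*}
\inf_{t_N + R \le t \le \frac{1}{2\lambda}\log N - R} N^{\frac{\mu}{2(\lambda+\mu)}}\left| X^N_t \right|
&\ge \inf_{t_N + R \le t \le \frac{1}{2\lambda}\log N - R}\left( e^{\lambda(t - t_N)}|Z_\infty|(1 - |\varepsilon_2|) - e^{\mu(t_N - t)}|\bar{x}_0|(1 + |\varepsilon_1|) \right) \\
&\to \infty
\end{align*}

in probability as $R, N \to \infty$.

Finally, if $t = t_N + c$, then

\begin{align*}
N^{\frac{\mu}{2(\lambda+\mu)}}\left| X^N_t \right|
&= N^{\frac{\mu}{2(\lambda+\mu)}}\left( \left( e^{-\mu t}\bar{x}_0(1 + \varepsilon_{1,1}) + N^{-1/2}Z_\infty e^{\lambda t}\varepsilon_{2,1} \right)^2
 + \left( e^{-\mu t}\bar{x}_0\varepsilon_{1,2} + N^{-1/2}Z_\infty e^{\lambda t}(1 + \varepsilon_{2,2}) \right)^2 \right)^{1/2} \\
&\to \left( (e^{-\mu c}\bar{x}_0)^2 + (e^{\lambda c}Z_\infty)^2 \right)^{1/2}
\end{align*}

in probability, uniformly in $c$ on compact intervals. The right hand side is minimised when

\[ c = \frac{1}{2(\lambda+\mu)}\log\frac{\mu\bar{x}_0^2}{\lambda Z_\infty^2}. \]

Therefore,

\[ N^{\frac{\mu}{2(\lambda+\mu)}}\inf_{t \le S_N}|X^N_t| \Rightarrow \left( \frac{\mu}{\lambda} \right)^{\frac{\lambda}{2(\lambda+\mu)}}\left( \frac{\lambda}{\mu} + 1 \right)^{1/2}|\bar{x}_0|^{\frac{\lambda}{\lambda+\mu}}|Z_\infty|^{\frac{\mu}{\lambda+\mu}} \]

as $N \to \infty$.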
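The final minimisation step is easy to check numerically. The following is a small sketch, not part of the proof: it evaluates the limiting profile $c \mapsto ((e^{-\mu c}\bar{x}_0)^2 + (e^{\lambda c}Z_\infty)^2)^{1/2}$ at the stated minimiser and compares the value with the closed form appearing in Theorem 2.23. The function names and the test values of $\bar{x}_0$, $Z_\infty$, $\lambda$, $\mu$ are arbitrary illustrative choices.

```python
import math


def limit_profile(c, x0, z, lam, mu):
    """Limiting rescaled distance from the origin at time t_N + c."""
    return math.hypot(math.exp(-mu * c) * x0, math.exp(lam * c) * z)


def minimiser(x0, z, lam, mu):
    """Value of c at which the derivative of the squared profile vanishes."""
    return math.log(mu * x0 ** 2 / (lam * z ** 2)) / (2 * (lam + mu))


def closed_form_minimum(x0, z, lam, mu):
    """Right hand side of the limit in Theorem 2.23."""
    s = lam + mu
    return (
        (mu / lam) ** (lam / (2 * s))
        * math.sqrt(lam / mu + 1)
        * abs(x0) ** (lam / s)
        * abs(z) ** (mu / s)
    )
```

For example, with `x0 = 1.3`, `z = -0.4`, `lam = 0.7`, `mu = 1.9`, the value `limit_profile(minimiser(...), ...)` agrees with `closed_form_minimum(...)` to floating point accuracy, and perturbing `c` in either direction increases the profile.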

Example 2.24. Let $(U^N_t, V^N_t)$ be a $\mathbb{Z}^2$-valued process modelling the sizes of two populations of the same species with $U^N_0 = V^N_0 = N$. The environment that they occupy is assumed to be closed. Each individual reproduces at rate 1. Additionally, the individuals are in competition with each other, a death occurring due to competition over resources at rate $\alpha$, and due to aggression between the populations at rate $\beta$. Hence, the transition rates are

\[ (u, v) \to \begin{cases}
(u+1, v) & \text{at rate } u, \\
(u-1, v) & \text{at rate } \alpha u(u+v-1)/N + \beta uv/N, \\
(u, v+1) & \text{at rate } v, \\
(u, v-1) & \text{at rate } \alpha v(u+v-1)/N + \beta uv/N.
\end{cases} \]

Let $X^N_t = (U^N_t, V^N_t)/N$. This gives a sequence of pure jump Markov processes, starting from $x_0 = (1, 1)$, with Lévy kernels

\begin{align*}
K^N(x, dy) &= Nx_1\delta_{(1/N,0)} + N\left( \alpha x_1(x_1 + x_2 - 1/N) + \beta x_1x_2 \right)\delta_{(-1/N,0)} \\
&\quad + Nx_2\delta_{(0,1/N)} + N\left( \alpha x_2(x_1 + x_2 - 1/N) + \beta x_1x_2 \right)\delta_{(0,-1/N)}.
\end{align*}

If we let

\[ K(x, dy) = x_1\delta_{(1,0)} + (\alpha x_1^2 + (\alpha+\beta)x_1x_2)\delta_{(-1,0)} + x_2\delta_{(0,1)} + (\alpha x_2^2 + (\alpha+\beta)x_1x_2)\delta_{(0,-1)}, \]

then for $S = (0, 2)^2$ and $\eta_0 = 1$,

\[ m(x, \theta) = x_1e^{\theta_1} + (\alpha x_1^2 + (\alpha+\beta)x_1x_2)e^{-\theta_1} + x_2e^{\theta_2} + (\alpha x_2^2 + (\alpha+\beta)x_1x_2)e^{-\theta_2} \]

satisfies

\[ \sup_{x \in S_N}\sup_{|\theta| \le \eta_0}\left| \frac{m^N(x, N\theta)}{N} - m(x, \theta) \right| \to 0 \]

as $N \to \infty$. Therefore,

\[ b(x) = m'(x, 0) = \begin{pmatrix} x_1(1 - \alpha x_1 - (\alpha+\beta)x_2) \\ x_2(1 - \alpha x_2 - (\alpha+\beta)x_1) \end{pmatrix}. \]

The deterministic differential equation

\[ \dot{\phi}_t(x) = b(\phi_t(x)), \qquad \phi_0(x) = x \]

is a special case of the Lotka-Volterra model for two-species competition. See Brown and Rothery [4] for a detailed interpretation of the parameters $\alpha$ and $\beta$. Further generalizations are discussed in Durrett [7].

It is straightforward to check that $b(x)$ is $C^1$ on $S$ and satisfies

\[ \sup_{x \in S_N} N^{1/2}|b^N(x) - b(x)| \to 0 \]

as $N \to \infty$, and that

\[ a(x) = \begin{pmatrix} x_1(1 + \alpha x_1 + (\alpha+\beta)x_2) & 0 \\ 0 & x_2(1 + \alpha x_2 + (\alpha+\beta)x_1) \end{pmatrix} \]

is Lipschitz on $S$.


Now $b(x)$ has a saddle fixed point at $\left( \frac{1}{2\alpha+\beta}, \frac{1}{2\alpha+\beta} \right)$ and, by symmetry, any point $x$ on the line $x_1 = x_2$ satisfies $\phi_t(x) \to \left( \frac{1}{2\alpha+\beta}, \frac{1}{2\alpha+\beta} \right)$ as $t \to \infty$. The linearization of $b$ at this point has eigenvalue $-1$ along the direction $(1, 1)$ and eigenvalue $\frac{\beta}{2\alpha+\beta}$ along the direction $(1, -1)$. So under an appropriate translation and rotation, the conditions required for Theorem 2.17 are satisfied, with $\mu = 1$ and $\lambda = \frac{\beta}{2\alpha+\beta}$. (Note that $\sigma_\infty^2 > 0$ since $a(x)$ is positive definite on $S$.) Hence, for times $t$ satisfying $t \ll t_N$, where $t_N = \frac{2\alpha+\beta}{4(\alpha+\beta)}\log N$, the two populations co-exist with the sizes of both being equal. However, at time $t_N + O(1)$, the deterministic approximation breaks down and one side begins to dominate. Our results give a quantitative description of the behaviour of the processes in this region; however, we do not go into this here. At time $t = S_N + s = \frac{2\alpha+\beta}{2\beta}\log N + O(1)$, $X^N_t \to \phi_s(X_\infty)$ in probability as $N \to \infty$, where $S_N$ is defined in (2.14) and $X_\infty$ is defined in Section 2.4. Now $b(x)$ has stable fixed points at $(\alpha^{-1}, 0)$ and $(0, \alpha^{-1})$ and hence $\phi_s(X_\infty)$ converges to one of these two fixed points as $s \to \infty$. For any $\varepsilon \in (0, 1)$ we say that a population is $\varepsilon$-extinct if the proportion of the original population that remains is less than $\varepsilon$. Thus, for arbitrarily small $\varepsilon$, one of the populations will become $\varepsilon$-extinct at time $\frac{2\alpha+\beta}{2\beta}\log N + O(1)$.
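The saddle structure of the competition model can be checked directly from the drift $b(x)$. The following is a minimal sketch with illustrative function names: it evaluates the drift at the symmetric fixed point, forms the Jacobian there (which is symmetric with equal diagonal entries), and returns its two eigenvalues. The resulting rates sum to $1 + \frac{\beta}{2\alpha+\beta}$, which is consistent with $t_N = \frac{2\alpha+\beta}{4(\alpha+\beta)}\log N$.

```python
def drift(x1, x2, alpha, beta):
    """Drift b(x) of the two-population competition model."""
    return (
        x1 * (1 - alpha * x1 - (alpha + beta) * x2),
        x2 * (1 - alpha * x2 - (alpha + beta) * x1),
    )


def jacobian_at_saddle(alpha, beta):
    """Jacobian of b at the symmetric fixed point (p, p), p = 1 / (2*alpha + beta)."""
    p = 1.0 / (2 * alpha + beta)
    diag = 1 - 2 * alpha * p - (alpha + beta) * p  # d b_1 / d x_1 at (p, p)
    off = -(alpha + beta) * p                      # d b_1 / d x_2 at (p, p)
    return ((diag, off), (off, diag))


def symmetric_eigenvalues(j):
    """Eigenvalues of [[a, b], [b, a]]: a + b along (1, 1) and a - b along (1, -1)."""
    a, b = j[0][0], j[0][1]
    return a + b, a - b


def t_n_coefficient(alpha, beta):
    """Coefficient of log N in t_N = log N / (2 * (sum of the two rates))."""
    along_diagonal, across_diagonal = symmetric_eigenvalues(jacobian_at_saddle(alpha, beta))
    return 1.0 / (2 * (abs(along_diagonal) + abs(across_diagonal)))
```

For instance, with `alpha = 0.5` and `beta = 2.0` the eigenvalue along the diagonal is $-1$ and the eigenvalue across it is $2/3$, so the diagonal is the contracting direction and the fixed point is a saddle.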

Chapter 3

Accumulation of rounding errors in the numerical solution of ODEs

3.1 Introduction

In this chapter we examine the rounding errors incurred by deterministic solvers

for systems of ordinary differential equations (ODEs). We show, by the application

of ideas from Chapter 2, that the accumulation of rounding errors results in a ‘solution’

to the ODE which exhibits random behaviour. The theoretical distribution

of the solution is obtained as a function of time, the step size and the numerical

precision of the computer. We consider, in particular, systems which amplify the

effect of the rounding errors so that over long time periods the solutions exhibit

divergent behaviour. The distributions predicted theoretically are then observed

numerically by performing multiple repetitions with different values of the time

step size.

Consider ordinary differential equations of the form

\[ \dot{x}_t = b(x_t). \]

These can be solved numerically using iteration methods of the type

\[ x_{t+h} = x_t + \beta(h, x_t), \]

where $\beta(h, x)/h \to b(x)$ as $h \to 0$.

The simplest example is the Euler method, where β(h, x) = hb(x). This method

is generally not used in practice as it is relatively inaccurate and unstable compared

to other methods. However, more useful methods, such as the fourth order

Runge-Kutta formula (RK4), fall into this scheme.
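Both methods can be written as increment functions $\beta(h, x)$ in the sense above. The following is a minimal sketch, with illustrative names, in which the state is a list of floats and `b` maps a state to its derivative; the RK4 increment uses the classical four-stage formula.

```python
def euler_increment(b, h, x):
    """Euler method: beta(h, x) = h * b(x)."""
    return [h * bi for bi in b(x)]


def rk4_increment(b, h, x):
    """Classical fourth order Runge-Kutta increment beta(h, x)."""
    k1 = b(x)
    k2 = b([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = b([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = b([xi + h * ki for xi, ki in zip(x, k3)])
    return [
        h * (a + 2 * c + 2 * d + e) / 6
        for a, c, d, e in zip(k1, k2, k3, k4)
    ]


def solve(increment, b, x0, h, steps):
    """Iterate x_{t+h} = x_t + beta(h, x_t) for a fixed number of steps."""
    x = list(x0)
    for _ in range(steps):
        x = [xi + di for xi, di in zip(x, increment(b, h, x))]
    return x
```

For the test equation $\dot{x} = -x$ with $h = 0.01$, both schemes approximate $e^{-1}$ at time 1, with RK4 far more accurate than Euler, in line with the remark that Euler is rarely used in practice.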

When solving an ordinary differential equation numerically, each time an iteration is performed an error $\varepsilon$ is incurred due to rounding, i.e.

\[ X^h_{t+h} = X^h_t + \beta(h, X^h_t) + \varepsilon. \tag{3.1} \]

Rounding errors in numerical computations are an inevitable consequence of finite

precision arithmetic. The first work thoroughly analysing the effects of rounding

errors on numerical algorithms is the classical textbook by Wilkinson [33]. A

recent comprehensive treatment of the behaviour of numerical algorithms in finite

precision, including an extensive list of references, can be found in Higham [21].

Although rounding errors are not random in the sense that the exact error incurred

in any given calculation is fully determined (see Higham [21] or Forsythe [12]), in

many situations probabilistic models have been shown to adequately describe their

behaviour. In fact, statistical analysis of rounding errors can be traced back to

one of the first works on rounding error analysis by Goldstine and von Neumann

[13].

Henrici [18, 19, 20] proposes a probabilistic model for individual rounding errors

whereby they are assumed to be independent and uniform, the exact distribution

depending on the specific finite precision arithmetic being used. Using the central

limit theorem, he shows that the theoretical distribution of the error accumulated

after a fixed number of steps in the numerical solution of an ODE is asymptotically

normal with variance proportional to h−1. By varying the initial conditions, he

obtains numerical distributions for the accumulated errors with good agreement.

Hull and Swenson [22] test the validity of the above model by adding a randomly

generated error with the same distribution at each stage of the calculation, and

comparing the distribution of the accumulated errors with those obtained purely

by rounding. They observe that, although rounding is neither a random process

nor are successive errors independent, probabilistic models appear to provide a


good description of what actually happens.

We shall concentrate on floating point arithmetic, as used by modern computers. However, our methods can be used equally well for any finite precision arithmetic. We use the model, discussed and tested by the authors cited above, whereby under generic conditions the errors in (3.1) can be viewed as independent, zero mean, uniform random variables,

\[ \varepsilon_i \sim U\left[ -|X^h_{t,i}|2^{-p},\ |X^h_{t,i}|2^{-p} \right], \]

$p$ being a constant determined by the precision of the computer.

In the first half of the chapter we analyse the cumulative effect of these rounding

errors as the step size h tends to 0. Where previous authors have considered the

accumulated error at a particular point, we derive a theoretical model for the entire

trajectory. Cases in R2 where the ordinary differential equation has a saddle fixed

point at the origin demonstrate the most interesting behaviour, as the structure

of the ODE system amplifies the effect of the rounding errors and causes the

numerical solution to diverge from the actual solution. In this case, the solution

Xht exhibits random behaviour and its theoretical distribution can be obtained as

an explicit function of time, the step size and the precision of the computer. As

the step size h tends to 0, the numerical solution exhibits three different types

of behaviour, depending on the time. More precisely, there exists a constant c,

determined by the ODE system, such that for times much smaller than −c log h

the numerical solution converges to the actual solution; for times close to −c log h

the solution undergoes a transition, determined by a Gaussian random variable

whose distribution is obtained; for times much larger than −c log h the numerical

solution diverges from the actual solution.

In the second half of the chapter, we perform numerical simulations which

illustrate this behaviour. By performing multiple repetitions with different values

of the time step size, the random distributions predicted theoretically are observed.

Where previous authors have obtained their numerical distributions by varying

the initial conditions, we do so by introducing small variations in the step size h.

During the transition period described in the previous paragraph, the numerical

solution intersects straight lines through the origin and we compare the theoretical


and numerical distributions for the points at which these intersections occur. Both

the mean and the standard deviation of these distributions are of the form $ah^\gamma$,

where γ ∈ (0, 1/2] is a constant determined by the ODE system, and a can be

found explicitly in terms of the precision of the computer, i.e. the number of bits

used internally by the computer to represent floating point numbers. We mainly

focus on the explicit Euler and RK4 methods, but show that the same behaviour

is also observable for more complex algorithms such as the adaptive solvers VODE

[5] and RADAU5 [14].

3.2 Theoretical background

In Chapter 2, limiting results are established for sequences of Markov processes

that approximate solutions of ordinary differential equations with saddle fixed

points. By modelling the rounding errors as random variables, we show that

the solutions obtained when performing numerical schemes for solving ordinary

differential equations can be viewed as a special case of this. This enables us to

quantify how the rounding errors accumulate. The resulting numerical solutions

exhibit random behaviour, the exact distribution of which is obtained.

In Section 3.2.1 we describe how rounding errors can be modelled as random

variables with specified distributions. The results of Chapter 2 are applied to

obtain a qualitative description of the accumulation of the rounding errors. The

distribution is calculated explicitly in Section 3.2.2.

3.2.1 Accumulation of rounding errors

We are interested in numerically solving ordinary differential equations of the form

$$\dot{x}_t = b(x_t). \tag{3.2}$$

In particular we consider using iteration methods of the type

$$x_{t+h} = x_t + \beta(h, x_t) \tag{3.3}$$

where $\beta(h, x)/h \to b(x)$ as $h \to 0$.

Each time an iteration is performed, an error $\varepsilon = \varepsilon(h, t)$ is incurred due to rounding. The process $(X^h_t)_{t \in h\mathbb{N}}$ is obtained iteratively by

$$X^h_{t+h} = X^h_t + \beta(h, X^h_t) + \varepsilon. \tag{3.4}$$

Modern computers store real numbers by expressing them in binary as $x = m 2^n$ for some $1 \le |m| < 2$ and $n \in \mathbb{Z}$. They allocate a fixed number of bits to store the mantissa $m$ and a (different) fixed number of bits to store the exponent $n$ [23]. When adding to $x$ a number of smaller order, the size of the rounding error incurred is between 0 and $2^{n-p} = 2^{\lfloor \log_2 |x| \rfloor - p}$, where $p$ is the number of bits allocated to store the mantissa. Although it is possible to carry out the calculations below using the exact value of $2^{\lfloor \log_2 |x| \rfloor - p}$, the calculations are greatly simplified by approximating it by $|x| 2^{-p}$. This results in the ‘effective’ value of $p$ differing from the actual value of $p$ by some number between 0 and 1. Provided $\beta(h, X^h_t)$ is sufficiently small compared with $X^h_t$, the errors $\varepsilon$ can therefore be viewed as independent, mean zero, uniform random variables with approximate distribution

$$\varepsilon_i \sim U\left[-|X^h_{t,i}|\, 2^{-p},\ |X^h_{t,i}|\, 2^{-p}\right]$$

(see Henrici [18, 19, 20]). The assumption that the $\varepsilon_i$ are independent is in general not true. In fact, in certain pathological cases, for example where there is a lot of symmetry in the components, the $\varepsilon_i$ can be strongly correlated. Nevertheless,

under generic conditions one would expect any correlations to be weak and so this

is a reasonable assumption to make. We shall see by the agreement of our numerical

and theoretical results that the effect of making this assumption is indeed small.
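The relation between the exact bound $2^{\lfloor \log_2 |x| \rfloor - p}$ and its proportional approximation $|x| 2^{-p}$ can be checked directly. The following is a minimal sketch, assuming IEEE 754 double precision with 52 explicitly stored mantissa bits; the function names `exponent` and `spacing_bounds` are illustrative and not part of the original text.

```python
import math

def exponent(x: float) -> int:
    """Return n such that |x| = m * 2**n with 1 <= m < 2."""
    _, e = math.frexp(abs(x))  # abs(x) = m' * 2**e with 0.5 <= m' < 1
    return e - 1

def spacing_bounds(x: float, stored_bits: int = 52):
    """Return (|x| 2^-(p+1), 2^(n-p), |x| 2^-p) for p = stored_bits.

    stored_bits = 52 is an assumption about the platform (IEEE 754 double).
    """
    exact = 2.0 ** (exponent(x) - stored_bits)
    lower = abs(x) * 2.0 ** -(stored_bits + 1)
    upper = abs(x) * 2.0 ** -stored_bits
    return lower, exact, upper

for value in (0.3, 1.0, 1.5, 3.14159, 1e-8, 12345.678):
    lower, exact, upper = spacing_bounds(value)
    # The exact quantity always sits between the two proportional bounds,
    # which is the sense in which the effective p shifts by at most 1.
    print(value, lower < exact <= upper)
```

Writing the exact bound as $|x| 2^{-p'}$ therefore gives an effective $p'$ within one unit of the stored mantissa width, in line with the remark above.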

Although the above iterations are carried out at discrete time intervals, it is

convenient to embed the processes in continuous time by performing the iterations

at times of a Poisson process with rate $h^{-1}$. As $\beta(h, x)$ does not depend on $t$, this does not affect the shape of the resulting trajectories. In this way Markov processes $X^h_t$ are obtained that approximate the stable solution of (3.2) for small values of $h$. If, in addition, the assumption is made that

$$h^{-1/2}\left(\frac{\beta(h, x)}{h} - b(x)\right) \to 0$$

as $h \to 0$ (note that both the Euler and Runge-Kutta methods satisfy this condition), then under the correspondence $N \sim h^{-1}$, the conditions needed to apply the results in Chapter 2 are satisfied.
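As a brief check of this condition for the two schemes named, the following sketch assumes that $b$ is locally Lipschitz and that $x$ ranges over a bounded set; the stage notation $k_1, \ldots, k_4$ for the classical fourth order Runge-Kutta method is introduced here for illustration only.

```latex
% Explicit Euler: \beta(h,x) = h\,b(x), so the bracket vanishes identically.
\[
h^{-1/2}\left(\frac{h\,b(x)}{h} - b(x)\right) = 0 .
\]
% Classical RK4: \beta(h,x) = \frac{h}{6}(k_1 + 2k_2 + 2k_3 + k_4), where each
% stage satisfies k_j = b(x + O(h)) = b(x) + O(h).
\[
\frac{\beta(h,x)}{h} - b(x) = O(h)
\quad\Longrightarrow\quad
h^{-1/2}\left(\frac{\beta(h,x)}{h} - b(x)\right) = O(h^{1/2}) \to 0 .
\]
```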

We focus on $\mathbb{R}^2$ in the case where the origin is a saddle fixed point of the system, i.e. $b(x_t) = Bx_t + \tau(x_t)$, where $B$ is a matrix with eigenvalues $-\mu, \lambda$, with $\lambda, \mu > 0$ and corresponding eigenvectors $v_1, v_2$, and $\tau(x) = O(|x|^2)$ is twice

continuously differentiable. This case is of particular interest as the structure of

the system amplifies the effect of the rounding errors and causes the numerical

solution to diverge from the actual solution over large times. Similar behaviour

can be observed in higher dimensions where the matrix B has at least one positive

and one negative eigenvalue, although the corresponding quantitative analysis is

much harder and we do not go into it here.

As shown in Chapter 2, our numerical solution exhibits the following random

behaviour:

A. For times of order much smaller than $-\log h$, $X^h_t$ approximates the stable solution of (3.2), the fluctuations around this limit being of order $h^{1/2}$.

B. There exists some $\bar{x}_0 \neq 0$, depending only on $x_0$, and a Gaussian random variable $Z_\infty$, such that if $t$ lies in the interval $[-c \log h,\ -\frac{1}{2\lambda}\log h + c \log h]$ for some $c > 0$, then $X^h_t$ is asymptotic to

$$\bar{x}_0 e^{-\mu t} v_1 + h^{1/2} Z_\infty e^{\lambda t} v_2, \tag{3.5}$$

the solution to the linear ordinary differential equation

$$\dot{y}_t = B y_t$$

starting from the random point $\bar{x}_0 v_1 + h^{1/2} Z_\infty v_2$.

C. Provided $Z_\infty \neq 0$, in time intervals around $-\frac{1}{2\lambda}\log h$ whose length is of much smaller order than $-\log h$, $X^h_t$ approximates one of the two unstable trajectories of (3.2), each with probability $\frac{1}{2}$, depending on the sign of $Z_\infty$.

The random behaviour resulting from the accumulation of rounding errors is most noticeable on time intervals of fixed lengths around $-\frac{1}{2(\lambda+\mu)}\log h$, as for these values of $t$ the two terms $\bar{x}_0 e^{-\mu t}$ and $h^{1/2} Z_\infty e^{\lambda t}$ in (3.5) are of the same order. During these time intervals, the numerical solution undergoes a transition from converging to the actual solution to diverging from it. During this transition, for each value of $\theta \in (0, \pi/2)$, $X^h_t$ crosses one of the straight lines passing through 0 in the direction $v_1 \cos\theta \pm v_2 \sin\theta$. These intersections are important as they indicate the onset of divergent behaviour. The distribution of the point at which $X^h_t$ intersects one of the lines in the direction $v_1 \cos\theta \pm v_2 \sin\theta$ is asymptotic to

$$h^{\frac{\mu}{2(\lambda+\mu)}}\, |Z_\infty|^{\frac{\mu}{\lambda+\mu}}\, |\bar{x}_0|^{\frac{\lambda}{\lambda+\mu}}\, |\tan\theta|^{\frac{\mu}{\lambda+\mu}}\, (v_1 \cos\theta \pm v_2 \sin\theta). \tag{3.6}$$

In Section 3.2.2 we show how to evaluate the variance of Z∞, doing so explicitly

in the linear case and obtaining bounds in the non-linear case. In Section 3.3 these

results are verified by numerically obtaining the predicted distribution for hitting

a line through the origin.

3.2.2 Explicit calculation of the variance

Consider a numerical scheme that satisfies the above conditions, applied to obtain

a solution to the ordinary differential equation (3.2), starting from x0 for some

x0 in the stable manifold. In the non-linear case we require that x0 is sufficiently

close to the origin that $\tau(x_0)$ is small. In general, for simplicity, we assume that $|x_0| \le 1$.

Define the flow φ associated with this system by

$$\dot{\phi}_t(x) = b(\phi_t(x)), \quad \phi_0(x) = x$$

and let $x_t = \phi_t(x_0)$.

Suppose that $v_1, v_2 \in \mathbb{R}^2$ are the unit right-eigenvectors of $B$ corresponding to $-\mu, \lambda$ respectively, and that $v_1', v_2' \in (\mathbb{R}^2)^*$ are the corresponding left-eigenvectors (i.e. $v_i' v_j = \delta_{ij}$).

Define

$$\bar{x}_0 = \lim_{t\to\infty} e^{\mu t} v_1' \phi_t(x_0)$$

and

$$D_s = \lim_{t\to\infty} e^{-\lambda t} v_2' \nabla\phi_t(x_s).$$

It is shown in the proofs of Lemmas 2.10 and 2.11 that these limits exist and that $|\bar{x}_0| \le 2|x_0| \le 2$ and $|D_s| \le 2$.

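Finally, let

$$a(x) = \frac{1}{3}\, 2^{-2p} \begin{pmatrix} x_1^2 & 0 \\ 0 & x_2^2 \end{pmatrix}$$

be the covariance matrix of the multivariate uniform random variable $\varepsilon$, defined in equation (3.4), when $X^h_t = x$. Then $Z_\infty \sim N(0, \sigma_\infty^2)$ where, by (2.12),

$$\sigma_\infty^2 = \int_0^\infty e^{-2\lambda s} D_s\, a(x_s)\, D_s^*\, ds.$$

Note that $\sigma_\infty^2 \le \frac{2}{3\lambda}\, 2^{-2p}$.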

In the general non-linear case, evaluating $\sigma_\infty^2$ explicitly is not possible as it involves solving (3.2). It is possible to obtain better approximations than that above, although the important observation is that $\sigma_\infty^2$ is proportional to $2^{-2p}$.

In the linear case, $\phi_t(x) = e^{Bt}x$ and $x_0 = |x_0| v_1$. Hence $x_t = |x_0| e^{-\mu t} v_1$, $\bar{x}_0 = |x_0|$, and $D_s = v_2'$, and so

$$\sigma_\infty^2 = \frac{1}{3(\lambda+\mu)}\, 2^{-2p}\, |x_0|^2\, (v_{1,1} v_{2,1}')^2.$$

Note that the directions of $v_1$ and $v_2'$, relative to the standard basis, are critical. For example, if either $v_1$ or $v_2'$ is parallel to one of the standard basis vectors, then $\sigma_\infty^2 = 0$.
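The linear-case formula can be recovered from the integral for $\sigma_\infty^2$ in a few lines. The sketch below uses only the definitions given in this section, with $v_{i,k}$ denoting the $k$th component of $v_i$ and $v_{i,k}'$ that of $v_i'$.

```latex
% Linear case: x_s = |x_0| e^{-\mu s} v_1 and D_s = v_2'.
\[
D_s\, a(x_s)\, D_s^{*}
  = \tfrac{1}{3}\, 2^{-2p}\, |x_0|^2 e^{-2\mu s}
    \left( (v_{1,1} v_{2,1}')^2 + (v_{1,2} v_{2,2}')^2 \right).
\]
% Biorthogonality v_2' v_1 = 0 gives v_{1,1} v_{2,1}' = -v_{1,2} v_{2,2}',
% so the two squares are equal.
\[
\sigma_\infty^2
  = \tfrac{2}{3}\, 2^{-2p}\, |x_0|^2 (v_{1,1} v_{2,1}')^2
    \int_0^\infty e^{-2(\lambda+\mu)s}\, ds
  = \frac{1}{3(\lambda+\mu)}\, 2^{-2p}\, |x_0|^2 (v_{1,1} v_{2,1}')^2 .
\]
```

The factor $\frac{1}{3}$ is the variance of a uniform random variable on $[-1, 1]$, which is where the scaling in $a(x)$ comes from.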


3.3 Numerical experiments

In this section we solve ODEs numerically using deterministic solvers and observe

the predicted random distributions arising as a consequence of the accumulation

of rounding errors. For simplicity, and in order to observe the desired effects as

clearly as possible, we mainly focus on the most elementary of all numerical ODE

solution methods, the standard explicit Euler algorithm with constant time step

size. However, we observe similar behaviour for RK4 and also briefly mention

results obtained with more complex solvers, such as VODE [5].

3.3.1 The system

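For $x : [0,\infty) \to \mathbb{R}^2$, consider the linear ODE

$$\dot{x}(t) = Bx(t), \tag{3.7}$$

where

$$B = \begin{pmatrix} -\mu & 0 \\ 0 & \lambda \end{pmatrix}$$

for fixed $\lambda, \mu > 0$. Introduce new coordinates

$$\tilde{x}(t) = R(\varphi)x(t)$$

by rotating about the origin by a fixed angle $\varphi \in [0, \pi/2)$, i.e.

$$R(\varphi) = \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}.$$

We arrive at the transformed system

$$\dot{\tilde{x}}(t) = B(\varphi)\tilde{x}(t) \tag{3.8}$$

with

$$B(\varphi) = R(\varphi) B R(\varphi)^*,$$

which will be the system under consideration in the following. Throughout, the initial value

$$\tilde{x}(0) = R(\varphi)\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \cos\varphi \\ \sin\varphi \end{pmatrix} \tag{3.9}$$

is used. The phase space evolution is sketched in Figure 3.1.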

Figure 3.1: Phase space for the saddle point ODE system (3.8) with sample trajectories and lines where hitting distributions are recorded (dashed lines).

3.3.2 Theoretical hitting distribution

As discussed in Section 3.2.1, the numerical solution to the above ODE system

undergoes a transition from converging to the actual solution to diverging from

it. During this transition, the numerical trajectory crosses one of the straight

lines passing through 0 at an angle ϕ ± θ for each value of θ ∈ (0, π/2). These

intersections are important as they indicate the onset of divergent behaviour. The

hitting distributions also provide a means of measuring the random variable Z∞,

which determines the random variations in our solutions, and hence of verifying

the theoretical results.

Equation (3.6) gives the asymptotic distribution of the magnitude of the point at which the numerical solution hits the line through the origin at angle $\varphi \pm \frac{\pi}{4}$ as $|Z|^{\frac{\mu}{\lambda+\mu}}$, where $Z$ is a Gaussian random variable with mean 0 and variance

$$\sigma^2 = h\sigma_\infty^2 = \frac{1}{3(\lambda+\mu)}\, h\, 2^{-2p} (\cos\varphi \sin\varphi)^2 \tag{3.10}$$

i.e. $Z \sim N(0, \sigma^2)$. We obtain an explicit formula for the asymptotic distribution by starting from the $N(0, \sigma^2)$ distribution

$$p(x)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{1}{2\sigma^2} x^2\right) dx$$

and performing a change of variable given by $y = |x|^{\frac{\mu}{\lambda+\mu}}$. The result is

$$p(y)\,dy = \frac{2(\lambda+\mu)}{\sqrt{2\pi}\,\sigma\mu}\, y^{\frac{\lambda}{\mu}} \exp\left(-\frac{1}{2\sigma^2}\, y^{\frac{2(\lambda+\mu)}{\mu}}\right) dy.$$

In the case $\lambda = \mu = 1$, which is considered below, setting $a = \frac{4}{\sqrt{2\pi}\,\sigma}$ produces the family of distributions

$$f(y)\,dy = a y \exp\left(-\frac{\pi}{16}\, a^2 y^4\right) dy, \quad y \in (0,\infty), \tag{3.11}$$

which will be fitted to the numerical data to confirm the theoretical value of $a$.
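A quick numerical sanity check of the change of variable is sketched below. It confirms that the general density reduces to the form (3.11) when $\lambda = \mu = 1$ and that (3.11) integrates to one. The function names and the value $a = 2$ are illustrative choices, not taken from the original experiments.

```python
import math

def general_density(y: float, sigma: float, lam: float, mu: float) -> float:
    """Density of y = |x|^(mu/(lam+mu)) for x ~ N(0, sigma^2)."""
    prefactor = 2.0 * (lam + mu) / (math.sqrt(2.0 * math.pi) * sigma * mu)
    return prefactor * y ** (lam / mu) * math.exp(
        -(y ** (2.0 * (lam + mu) / mu)) / (2.0 * sigma**2)
    )

def hitting_density(y: float, a: float) -> float:
    """Density (3.11): a*y*exp(-pi*a^2*y^4/16) for y > 0."""
    return a * y * math.exp(-math.pi * a**2 * y**4 / 16.0)

def sigma_from_a(a: float) -> float:
    """Invert the relation a = 4 / (sqrt(2*pi) * sigma)."""
    return 4.0 / (math.sqrt(2.0 * math.pi) * a)

def trapezoid(func, lo: float, hi: float, n: int) -> float:
    """Composite trapezoidal rule with n equal subintervals."""
    step = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, n):
        total += func(lo + k * step)
    return total * step

a = 2.0  # arbitrary illustrative value of the fit parameter
total_mass = trapezoid(lambda y: hitting_density(y, a), 0.0, 5.0, 50_000)
print(total_mass)
```

In a fit to data, `a` would be the single free parameter, and `sigma_from_a` converts the fitted value back to the standard deviation of $Z$.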

3.3.3 Choice of parameters

Rounding errors are deterministic in the sense that any given number of iterations

of a particular numerical scheme will generate the same solution. In order to

obtain a distribution from the numerical solutions to (3.7), for each repetition it

is necessary to vary at least one parameter by a small amount. In this section we

discuss this issue as well as the choice of the fixed parameters of the system such

as the eigenvalues.

The possible parameters that can be varied are the initial value x0, and the time

step size h. As x0 is constrained by being on the stable manifold, any variation

is required to be in the direction of the eigenvector corresponding to eigenvalue

−µ. Varying the initial value in this way did not yield any interesting results

as the chosen distribution of initial values was reproduced exactly in the hitting


distribution. In terms of the system it is also preferable to vary the time step

size as this parameter is internal to the algorithm, whereas the initial value is a

physical parameter of the system. We varied the time step size as follows. Given

a user-supplied value of $h$, define the step size $h_i$ for the $i$th repetition by

$$h_i = h + \Delta h\,(i - 1 - k), \quad i = 1, \ldots, L,$$

where the number of repetitions $L = 2k + 1$ and $0 < \Delta h \ll h$ are user-supplied. For all simulations, we set $k = 10^4$.
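The step size schedule can be generated as in the following minimal sketch. The function name `step_sizes` is hypothetical, and the values of $h$ and $\Delta h$ are those quoted for the double precision Euler runs.

```python
def step_sizes(h: float, delta_h: float, k: int) -> list[float]:
    """Return h_i = h + delta_h*(i - 1 - k) for i = 1, ..., L with L = 2k + 1."""
    return [h + delta_h * (i - 1 - k) for i in range(1, 2 * k + 2)]

sizes = step_sizes(h=1e-4, delta_h=1e-10, k=10**4)
print(len(sizes), sizes[0], sizes[10**4], sizes[-1])
```

The schedule is symmetric about $h$, and its total spread $2k\Delta h$ stays far below $h$ for these values.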

Reasonable choices of h and ∆h are limited by several factors. The hitting

distribution predicted theoretically in Section 3.3.2 is asymptotic as h → 0 and

hence, if $h$ is too large (in the considered case, if $h > 10^{-1}$ for both single and

double precision), the observed hitting distribution differs substantially from the

theoretical one. The onset of such effects can be seen for large values of h in

Figure 3.4. Lower bounds on h are imposed by computational cost and by the

numerical precision of the computer. In practice, computational expense becomes

prohibitive for values of h much larger than the smallest values permitted by

numerical accuracy. Our particular choice of step size distribution requires that

k∆h should be (much) smaller than h. The lower limit for ∆h is determined solely

by the numerical precision, i.e. ∆h/h must not be smaller than the numerical

precision.

We did not investigate in detail the dependence of our observations on the

distribution of step sizes. However, preliminary experiments with varying ∆h

and even with non-uniform step size distributions suggest that this dependence

is very weak for a wide range of conditions. Figure 3.2 shows that the shape of

the distribution exhibits no discernible systematic dependence on ∆h over at least

nine orders of magnitude. The deviations seen for values of ∆h smaller than about

$10^{-19}$ are due to the fact that $\Delta h/h$ approaches the limits of numerical precision.

The remaining parameters that we need to choose are the eigenvalues $\lambda, -\mu$ and the rotation angle $\varphi$. Since the limit distribution is given by $|Z|^{\frac{\mu}{\lambda+\mu}}$, for some Gaussian random variable $Z$, if the values of $\lambda$ and $\mu$ differ significantly then the distribution is hard to observe in a numerical experiment. This suggests choosing $\lambda$ and $\mu$ of the same order of magnitude, and we therefore take $\lambda = \mu = 1$ for all simulations.

[Figure 3.2 plot: parameter $a$ (in units of $10^{14}$) against step size variation $\Delta h$.]

Figure 3.2: Step size variation for Euler’s algorithm (double precision, step size $h = 10^{-4}$, $L = 20001$ repetitions each).

There is some subtlety in the choice of the rotation angle ϕ. For certain values,

trivial trajectories or symmetry effects can occur which conceal the desired accumulation of rounding errors. For instance, for $\varphi = 0$ the second component $\tilde{x}_2$ of the solution is always zero, and therefore the trajectory stays on the line $x_2 = 0$ (or equivalently $\tilde{x}_2 = 0$) with no fluctuations. Note that this is in agreement with $\sigma^2 = 0$ in equation (3.10). For $\varphi = \pi/4$, any rounding error that appears in one component also appears in the other one, which implies that, again, the trajectory always stays on the line $x_2 = 0$ (or equivalently $\tilde{x}_1 = \tilde{x}_2$). This case is pathological

as it consistently violates our assumption that the rounding errors for the different

components are independent. For these reasons, we chose ϕ = π/5 throughout.

3.3.4 Results and observations for explicit methods

Using the values of the parameters discussed above, we carried out multiple repetitions of Euler’s algorithm and RK4. In each run we noted the point at which the trajectory given by the numerical solution intersected one of the lines $x_1 = \pm x_2$ (the dashed lines in Figure 3.1). Histograms were then produced by partitioning the interval [0, 1] into a given fixed number of subintervals of equal length and

counting how many times y fell into each subinterval, where y denotes the distance

of the point of intersection from the origin. The empirical distributions shown in

Figure 3.3 were obtained. The theoretical distribution (3.11) was fitted to the

empirical distributions with very good agreement.
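A single repetition of such an experiment might look like the sketch below. It is not the code used for the reported results: the function name `euler_hitting_distance`, the step cap `max_steps` and the crossing test in eigenvector coordinates are assumptions made for illustration, and Python floats (IEEE 754 double precision) stand in for the double precision runs.

```python
import math

def euler_hitting_distance(h, phi=math.pi / 5, lam=1.0, mu=1.0, max_steps=10**6):
    """Run explicit Euler on (3.8) from (3.9) and return the distance from
    the origin at the first step where the trajectory has crossed one of the
    lines at angle phi +/- pi/4, or None if no crossing occurs in max_steps."""
    c, s = math.cos(phi), math.sin(phi)
    # Entries of B(phi) = R(phi) diag(-mu, lam) R(phi)^T (symmetric matrix).
    b11 = -mu * c * c + lam * s * s
    b12 = -(mu + lam) * c * s
    b22 = -mu * s * s + lam * c * c
    x1, x2 = c, s  # initial value (3.9), on the stable manifold
    for _ in range(max_steps):
        # One explicit Euler step; rounding happens implicitly in each operation.
        x1, x2 = x1 + h * (b11 * x1 + b12 * x2), x2 + h * (b12 * x1 + b22 * x2)
        # Components along the stable (u1) and unstable (u2) eigenvectors.
        u1 = c * x1 + s * x2
        u2 = -s * x1 + c * x2
        if abs(u2) >= abs(u1):
            return math.hypot(x1, x2)
    return None

y = euler_hitting_distance(1e-3)
print(y)
```

Repeating the call with the same `h` returns the same value, since rounding is deterministic; a histogram only emerges once `h` is perturbed slightly between repetitions, as described in Section 3.3.3.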

[Figure 3.3 plot: relative frequency density $p(y)$ (in units of $10^7$) against $y$ (in units of $10^{-8}$), for $h = 1.0 \times 10^{-4}$, $h = 3.2 \times 10^{-4}$ and $h = 1.0 \times 10^{-3}$.]

Figure 3.3: Observed hitting distributions with theoretical fits for Euler’s algorithm ($\Delta h = 10^{-10}$, $L = 20001$ repetitions each).

For each value of h, we obtained a value for the parameter a by fitting a

distribution of the form (3.11) to our numerical data. In Figure 3.4 the parameter

a is plotted as a function of the time step size h, both for single (Figure 3.4(a)) and

double (Figure 3.4(b)) precision (4 and 8 bytes internal representation of floating

point numbers respectively). Error bars due to the fit are only about 1% and hence

insignificant. In both cases, the dependence between a and h is well described by

$a \propto \sqrt{h}$.


[Figure 3.4 plots: parameter $a$ against time step size $h$ for Euler and 4th order Runge-Kutta. Panel (a): single precision ($\Delta h = 10^{-8}$). Panel (b): double precision ($\Delta h = 10^{-10}$), with fitted line $4.9563 \times 10^{16}\, h^{1/2}$.]

Figure 3.4: Parameter $a$ in equation (3.11) as function of the time step size $h$ for simple explicit methods (Euler and 4th order Runge-Kutta).

Equation (3.10) predicts the value of $ah^{-1/2}$ to be

$$ah^{-1/2} = \frac{4\sqrt{3}}{\sqrt{\pi}\cos\frac{\pi}{5}\sin\frac{\pi}{5}} \times 2^p = 8.220 \times 2^p.$$

For Euler’s method, the above data give $ah^{-1/2} = 9.411 \times 10^7$ for single precision and $ah^{-1/2} = 4.956 \times 10^{16}$ for double precision. For the 4th order Runge-Kutta method, the values are $ah^{-1/2} = 9.27 \times 10^7$ (with a relatively large error of $\pm 0.12 \times 10^7$) for single precision and $ah^{-1/2} = 4.746 \times 10^{16}$ for double precision. Using the approximation discussed in Section 3.2.1, the actual value of $p$ is between 23 and 24, when working in single precision, and between 52 and 53 when working in double precision. The particular value depends on the exact number being computed. Our theoretical results therefore predict $ah^{-1/2}$ lies between $6.895 \times 10^7$ and $1.379 \times 10^8$ for single precision and between $3.702 \times 10^{16}$ and $7.404 \times 10^{16}$ for double precision.
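The arithmetic behind these ranges is easy to reproduce. The sketch below recomputes the coefficient 8.220 and the predicted intervals, and compares them with the fitted values quoted above; the variable and function names are illustrative.

```python
import math

phi = math.pi / 5
# Coefficient 4*sqrt(3) / (sqrt(pi) * cos(phi) * sin(phi)) from the prediction.
coefficient = 4.0 * math.sqrt(3.0) / (
    math.sqrt(math.pi) * math.cos(phi) * math.sin(phi)
)

def predicted_range(p_low: int):
    """Range of coefficient * 2^p as the effective p runs over [p_low, p_low + 1]."""
    return coefficient * 2.0**p_low, coefficient * 2.0 ** (p_low + 1)

single = predicted_range(23)
double = predicted_range(52)

# Fitted values quoted in the text.
observed = {
    "Euler, single": (9.411e7, single),
    "RK4, single": (9.27e7, single),
    "Euler, double": (4.956e16, double),
    "RK4, double": (4.746e16, double),
}
for label, (value, (low, high)) in observed.items():
    print(label, low <= value <= high)
```

All four fitted values fall inside the corresponding predicted interval.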

There are three possible sources of error in our calculations. The first is the error in fitting the numerical data to the theoretical model. The second is that our theoretical models are based on asymptotic results as $h \to 0$, whereas we are applying them to values of $h$ which are necessarily larger than the precision of the

computer. The third source of error arises from the assumption that at each stage

the rounding error can be viewed as an independent uniform random variable,

depending on a fixed value of p. The above results show that these errors are all

small and that our theoretical model provides a very good fit.

3.3.5 Adaptive solvers

Our theoretical results cover ODE solvers which use algorithms of the form (3.3).

In practice, more sophisticated adaptive solvers are used, such as VODE [5] and

RADAU5 [14]. For these solvers, the user inputs the error tolerances RTOL (relative)

and ATOL (absolute) and the global time step $h_g$ (the time interval after which

the user requests solution output from the solver). However, the user has no

immediate control over the size of the actual steps taken. These are determined

algorithmically as a function of the error tolerance parameters RTOL and ATOL,

generally by trial-and-error methods using heuristics, rather than by an explicit

formula.

[Figure 3.5 plot: relative frequency density $p(y)$ (in units of $10^7$) against $y$ (in units of $10^{-8}$), for $h_g = 1.0 \times 10^{-4}$ and $h_g = 3.2 \times 10^{-4}$.]

Figure 3.5: Hitting distributions for VODE.


Although it is not possible to analyse such adaptive solvers in the way that we

have analysed explicit solvers above, it is still of interest to see whether they exhibit

the same qualitative random behaviour. We performed numerical experiments

similar to those discussed above and obtained the distributions shown in Figure

3.5 in the case where RTOL=0.

Experiments do not readily suggest a simple relationship between the parameter $a$ in equation (3.11) and any of the parameters ATOL, RTOL, and $h_g$. This

is possibly not surprising given the lack of direct control over the time step size.

However, the fact that the results are qualitatively similar supports the assertion

that the observed phenomena are not specific to a particular algorithm, but rather

are general effects.

Chapter 4

Stochastic flows, planar aggregation and the Brownian web

4.1 Introduction

In this chapter we change course and consider a class of stochastic flows on the

circle which, under a certain scaling, converge to the Brownian web.

The Brownian web can loosely be defined as a family of coalescing Brownian

motions, starting at all possible points in continuous space-time. Arratia [1] first

considered this object in 1979 as a limit for discrete coalescing random walks, and

since then it has been studied by Toth and Werner [30] and Fontes, Isopi, Newman

and Ravishankar [10] amongst others.

Our motivation for looking at the Brownian web arises from a surprising connection with Hastings-Levitov diffusion-limited aggregation (DLA). DLA is a random

growth model which was originally introduced in 1981 by Witten and Sander [35].

In this model particles diffuse in from “infinity” and perform Brownian motions in

the plane until they collide with a cluster at the origin, at which point they stick to

the cluster. In 1998 Hastings and Levitov [17] formulated a model of DLA in which

the cluster is represented by a sequence of iterated conformal maps. We study a


simplified version of this model, known as the Eden model [8], and show that the

boundary values of the associated process of conformal mappings converge to the

Brownian web.

We begin by defining a class of stochastic flows which result from iteratively

applying small localized perturbations to the circle at uniformly distributed points.

By scaling the rate at which we apply these perturbations appropriately, the flows

converge to the Brownian web as the size of the individual perturbations approaches zero. We consider the case of simplified Hastings-Levitov DLA where the incoming particles are slits of length $N^{-1}$ sticking to the unit disc. If time is scaled in such a way that particles arrive as a Poisson process of rate proportional to $N^3$, the resulting flow map, restricted to points on the unit circle, evolves as

an element of our class of stochastic flows and so converges to the Brownian web.

The power of N by which time is scaled is curious, and may give rise to the fractal

behaviour which can be observed in simulations of the model.

The paper [10] of Fontes, Isopi, Newman and Ravishankar, the original work

characterizing the Brownian web, constructs it as a random element of the space of

compact collections of paths with specified starting points. In this chapter we take

an alternative approach and formulate the Brownian web as an element of a space

of flows. We believe that working in a space whose structure inherently contains

the restrictions imposed by the Brownian web is more natural and find that this

simplifies characterization and convergence results. In particular, we prove that

there is a unique measure on our space of flows with respect to which the finite

dimensional distributions of the flows are those of coalescing Brownian motions.

This contrasts with the results of Fontes, Isopi, Newman and Ravishankar, where

they find that there are other natural measures on their space with this property.

We also show that any sequence of flows, whose finite dimensional distributions

converge to those of the Brownian web, converges to the Brownian web. It is

interesting to note that tightness is automatically satisfied.

This chapter is organized as follows. In Section 4.2 we construct the class of

flows on the circle and show that the finite dimensional distributions converge to

those of coalescing Brownian motions. The simplified Hastings-Levitov DLA model

is discussed in Section 4.3 and scaling limits are established for the boundary of

the associated process. A metric space of flows is constructed in Section 4.4, and


characterization and weak convergence results are formulated for the Brownian web

as an element of this space. Section 4.5 deals with some technical issues pertaining

to the space of flows, and Section 4.6 looks more closely at the correspondence

between our results and those of Fontes, Isopi, Newman and Ravishankar. In

particular, we show that the space of flows on which we construct the Brownian

web is isomorphic to a subspace of the space of compact sets of functions on which

they construct the Brownian web.

4.2 A Levy flow on the circle

In this section we construct a family of stochastic flows on the circle and show that

under certain conditions they converge to Arratia’s flow of coalescing Brownian

motions. In Section 4.2.1 we describe the class of functions on the circle which will

form the “building blocks” of our flows. In Section 4.2.2 we construct the flows

themselves, and in Section 4.2.3 we define Arratia’s flow of coalescing Brownian

motions and show that the finite dimensional distributions of our flows converge

to coalescing Brownian motions.

4.2.1 Some generalities for functions on the circle

We identify the unit circle $S^1$ with the set $\mathbb{R}/\mathbb{Z} \sim [0, 1)$. We say that a nondecreasing function $f : \mathbb{R} \to \mathbb{R}$ is of degree 1 on the circle if

$$f(x + n) = f(x) + n \quad \text{for every } n \in \mathbb{Z}. \tag{4.1}$$

Such maps correspond to functions $S^1 \to S^1$ in an obvious way. This correspondence is not strictly bijective, however, as for any $m \in \mathbb{Z}$, both $f$ and $f + m$ give rise to the same map.

Let D0 denote the set of all nondecreasing maps of degree 1 on the circle. We

can define an equivalence relation on this set by f ∼ g if f(x) = g(x) for all

points x at which f is continuous. Let [D0] be the set of all equivalence classes of

nondecreasing degree 1 maps on the circle. Every element of [D0] has a unique right


(and left) continuous version. Let $C_0$ denote the set of all contractions $g : \mathbb{R} \to \mathbb{R}$ which are periodic with period 1. Define a map $\Phi : [D_0] \to C_0$ by

$$\Phi(f)(t) = t - x \quad \text{where} \quad \tfrac{1}{2}(x + f(x-)) \le t \le \tfrac{1}{2}(x + f(x+)).$$

A geometrical interpretation of the function Φ(f) is that it is the map obtained

from f by rotating the axes by π/4 and scaling appropriately (see Figure 4.1).

[Figure 4.1 diagram: graph of $f(x)$ against $x$ with the rotated axes $t$ and $\Phi(f)(t)$.]

Figure 4.1: The map $\Phi(f)$ obtained from $f$ by rotating the axes by $\frac{\pi}{4}$.

Since $f \in D_0$ is nondecreasing, $\Phi(f)$ is well defined and, by (4.1), is periodic with period 1. Suppose that $t > s$, and $x \ge y$ are such that $\Phi(f)(t) = t - x$ and $\Phi(f)(s) = s - y$. If $x = y$, then $\Phi(f)(t) - \Phi(f)(s) = t - s$. Otherwise,

$$\Phi(f)(t) - \Phi(f)(s) = t - s - (x - y) < t - s.$$

Also, if $x \neq y$,

$$\begin{aligned} \Phi(f)(t) - \Phi(f)(s) &= -(t - s) + (2t - x) - (2s - y) \\ &\ge -(t - s) + f(x-) - f(y+) \\ &\ge -(t - s). \end{aligned}$$

Hence,

$$|\Phi(f)(t) - \Phi(f)(s)| \le |t - s|,$$

and so $\Phi(f) \in C_0$. The map $\Phi$ is invertible with the (right continuous representative of the) inverse $\Phi^{-1} : C_0 \to [D_0]$ given by

$$\Phi^{-1}(g)(x) = \sup\{t + g(t) : x = t - g(t)\}.$$

Define a metric on $[D_0]$ by

$$d_{D_0}(f, g) = \|\Phi(f) - \Phi(g)\| = \sup_{t \in [0,1)} |\Phi(f)(t) - \Phi(g)(t)|.$$

The condition $d_{D_0}(f, g) < \varepsilon$ is equivalent to

$$f(x - \varepsilon) - \varepsilon < g(x-) \le g(x+) < f(x + \varepsilon) + \varepsilon$$

for all $x \in \mathbb{R}$ and so, if $d_{D_0}(f_n, f) \to 0$ as $n \to \infty$, then $f_n(x) \to f(x)$ for every point $x$ at which $f$ is continuous. Lemma 4.11 shows that the converse is also true. Since $(C_0, \|\cdot\|)$ is a complete separable metric space, $([D_0], d_{D_0})$ is a complete separable metric space.

Maps in $D_0$ can be thought of as being perturbations of the identity map. For each non-zero $f \in D_0$ define $\tilde{f} : \mathbb{R} \to \mathbb{R}$ by $\tilde{f}(x) = f(x) - x$. The function $\tilde{f}$ is periodic with period 1. Let $\eta = \eta(f)$ be the positive real number that satisfies

$$\eta \int_{[0,1)} \tilde{f}(x)^2\, dx = 1. \tag{4.2}$$

Suppose that $|\tilde{f}(x)|$ is maximized at some $x_m \in [0, 1)$. Since $f$ is increasing, $|\tilde{f}(x)| \ge \frac{1}{2}\|\tilde{f}\|$ for all $x \in [x_m, x_m + \|\tilde{f}\|/2]$ if $\tilde{f}(x_m) > 0$, or $x \in [x_m - \|\tilde{f}\|/2, x_m]$ if $\tilde{f}(x_m) \le 0$. Hence,

$$\|\tilde{f}\| \wedge 1 \le 2\eta^{-1/3}, \tag{4.3}$$

so large values of $\eta$ mean that the perturbation of $f$ about the identity map is small.
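The step from the interval estimate to (4.3) can be spelled out as follows. This is a sketch of one way to fill in the argument, writing $\tilde{f}(x) = f(x) - x$ for the perturbation and treating the case $\tilde{f}(x_m) > 0$; the other case is symmetric.

```latex
% Since f is nondecreasing, \tilde f(x_m + s) \ge \tilde f(x_m) - s for s \ge 0,
% so \tilde f \ge \tfrac12\|\tilde f\| on an interval of length \tfrac12\|\tilde f\|.
% By periodicity, a portion of length \tfrac12\|\tilde f\| \wedge 1 lies in one period.
\[
\eta^{-1} = \int_{[0,1)} \tilde f(x)^2\,dx
  \;\ge\; \left(\tfrac12\|\tilde f\|\right)^2
          \left(\tfrac12\|\tilde f\| \wedge 1\right).
\]
% If \|\tilde f\| \le 2 this reads \eta^{-1} \ge \tfrac18\|\tilde f\|^3, that is
% \|\tilde f\| \le 2\eta^{-1/3}. If \|\tilde f\| > 2 then \eta^{-1} > 1, so
% 2\eta^{-1/3} > 1 = \|\tilde f\| \wedge 1. In both cases
\[
\|\tilde f\| \wedge 1 \le 2\eta^{-1/3}.
\]
```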


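Define the drift $b = b(f)$ by

$$b = \eta \int_{[0,1)} \tilde{f}(x)\, dx. \tag{4.4}$$

Let $\delta = \delta(f) > 0$ be the smallest positive real number that satisfies

$$\sup_{\delta \le a \le 1 - \delta} \eta \int_{[0,1)} |\tilde{f}(x + a)\tilde{f}(x)|\, dx \le \delta. \tag{4.5}$$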

Small values of δ mean that the perturbation of f about the identity map is well

localized on the circle.

Note that since the set of discontinuities of any f ∈ D0 has Lebesgue measure

zero, these values are well defined for elements of [D0].
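For a concrete feel for these quantities, the sketch below estimates $\eta$ and $b$ numerically for one illustrative map, $f(x) = x + \frac{c}{2\pi}\sin(2\pi x)$ with $|c| \le 1$ (so that $f$ is nondecreasing and of degree 1). This choice of $f$, the function name `perturbation_stats` and the midpoint-rule integration are assumptions made for the example; $\delta$ is not computed here.

```python
import math

def perturbation_stats(f, n: int = 100_000):
    """Midpoint-rule estimates of eta in (4.2) and b in (4.4) for a degree 1 map f."""
    step = 1.0 / n
    first = 0.0   # approximates the integral of f(x) - x over [0, 1)
    second = 0.0  # approximates the integral of (f(x) - x)^2 over [0, 1)
    for k in range(n):
        x = (k + 0.5) * step
        d = f(x) - x
        first += d * step
        second += d * d * step
    eta = 1.0 / second
    return eta, eta * first

c = 0.5

def f(x: float) -> float:
    # Illustrative smooth perturbation of the identity; derivative 1 + c*cos(2*pi*x) >= 0.
    return x + c * math.sin(2.0 * math.pi * x) / (2.0 * math.pi)

eta, b = perturbation_stats(f)
sup_norm = c / (2.0 * math.pi)  # maximum of |f(x) - x| for this f
print(eta, b, min(sup_norm, 1.0) <= 2.0 * eta ** (-1.0 / 3.0))
```

For this family the integral in (4.2) is $c^2/(8\pi^2)$, so $\eta = 8\pi^2/c^2$ grows as the perturbation shrinks, and the drift vanishes by symmetry.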

4.2.2 Construction of the flow

Given f ∈ D0, we construct a compensated Poisson process which results from

applying f , ‘centered’ at uniformly distributed points, at rate η.

Let $\Omega$ denote the set of integer-valued measures on $\mathbb{R} \times [0, 1)$ which are finite on bounded sets, excluding those measures with two or more atoms at the same time point, i.e. for all $\omega \in \Omega$, $\omega(\{t\} \times [0, 1)) \in \{0, 1\}$ for all $t \in \mathbb{R}$. Write $\mathcal{F}^o$ for the $\sigma$-algebra on $\Omega$ generated by evaluations on open sets.

We first construct the process for $f \in D_0$ with $b = 0$, where $b$ is defined in (4.4). Given $\omega \in \Omega$, for each $t \in \mathbb{R}$ define $F_t \in D_0$ by

$$F_t(x) = \begin{cases} u + f(x - u) & \text{if } \omega|_{\{t\} \times [0,1)} = \delta_{(t,u)}, \\ x & \text{otherwise.} \end{cases}$$

Write $X_{ts}(x) = X(\cdot, s, t, x)$, where $X = X^f$ is the map $X : \Omega \times \{(s, t) : s \le t\} \times \mathbb{R} \to \mathbb{R}$ defined as follows. For each $s \le t$, construct $X_{ts} \in D_0$ by

$$X_{ts} = F_{T_n} \circ \cdots \circ F_{T_1} \circ \mathrm{id},$$

where $T_1 < \cdots < T_n$ are the times of the atoms of $\omega$ in $(s, t] \times [0, 1)$. Equivalently, define $X_{ts}$ for $t \ge s$ recursively at jump times by $X_{ss} = \mathrm{id}$ and $X_{ts} = F_t \circ X_{t-,s}$ for all $t > s$.
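The composition of maps at Poisson times can be simulated directly. The following is a minimal sketch for the case $b = 0$, assuming the illustrative map $f(x) = x + \frac{c}{2\pi}\sin(2\pi x)$ with rate $\eta = 8\pi^2/c^2$ (the value that satisfies (4.2) for this $f$); the function name `simulate_flow` and the fixed seed are hypothetical choices.

```python
import math
import random

def simulate_flow(f, eta, s, t, points, rng):
    """Apply F(x) = u + f(x - u) to every point at the jump times in (s, t]
    of a rate-eta Poisson process, with centres u uniform on [0, 1)."""
    xs = list(points)
    time = s + rng.expovariate(eta)  # first jump time after s
    while time <= t:
        u = rng.random()
        xs = [u + f(x - u) for x in xs]
        time += rng.expovariate(eta)  # exponential inter-arrival times
    return xs

c = 0.5
eta = 8.0 * math.pi**2 / c**2

def f(x: float) -> float:
    # Illustrative nondecreasing degree 1 map with zero drift.
    return x + c * math.sin(2.0 * math.pi * x) / (2.0 * math.pi)

start = [0.1, 0.2, 0.6, 1.1]
end = simulate_flow(f, eta, 0.0, 1.0, start, random.Random(0))
print(end)
```

Because each $F_t$ is nondecreasing and of degree 1, the simulated flow keeps the starting points in order, and points that start one unit apart stay one unit apart up to floating point error.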

There is a unique probability measure $P = P_\eta$ on $(\Omega, \mathcal{F}^o)$ making the identity map $\mu(\omega, dt, du) = \omega(dt, du)$, $\omega \in \Omega$, into a Poisson random measure on $\mathbb{R} \times [0, 1)$ with intensity $\nu(dt, du) = \eta\, dt\, du$, where $\eta = \eta(f)$ is as in (4.2). We write $\mathcal{F}$ for the completion of $\mathcal{F}^o$ with respect to $P$, extending $P$ to $\mathcal{F}$ as usual. For $s, t \in \mathbb{R}$ with $s \le t$, let $\mathcal{F}_{st}$ denote the completion with respect to $P$ of the $\sigma$-algebra generated by $\mu(U)$ for open sets $U \subseteq [s, t] \times [0, 1)$.

With respect to P, for each x ∈ R, Xts(x) is an Fts-martingale starting from

x, obtained by applying f , ‘centered’ at uniformly distributed points, at times of

a Poisson process with rate η.

We now extend this to $f \in D_0$ with $b$ taking any real value. Set

$$\omega^b(dt, du) = \omega(dt, d(u + bt))$$

and apply the preceding construction with $\omega$ replaced by $\omega^b$ in the definition of $F_t$ to obtain the map $X^b$. Define

$$X_{ts}(x) = X^b_{ts}(x + bs) - bt.$$

As above, with respect to P, for each x ∈ R, Xts(x) is an Fts-martingale starting

from x, obtained by applying f , ‘centered’ at uniformly distributed points, at times

of a Poisson process with rate η.

4.2.3 Convergence to the Arratia flow

The Brownian web is a continuous family of coalescing one dimensional Brownian

motions. We give a full characterization in Section 4.4, where we prove that as

η → ∞ and δ → 0, the distribution of X converges to that of the Brownian web.

In this section we define what we mean by a finite dimensional flow of coalescing

Brownian motions and the Arratia flow on the circle. We show that if f ∈ D0 with

η sufficiently large and δ sufficiently small, then X approximates the Arratia flow.


Definition 4.1. An $n$-dimensional system of coalescing Brownian motions on the circle, starting at $x$, is an $n$-dimensional random process $B = (B_1, \ldots, B_n)$ with the following properties.

(i) For each $1 \le i \le n$, the continuous process $(B_i(t))_{t \ge 0}$ is a standard Brownian motion with respect to $\mathcal{F}^B$, where $\mathcal{F}^B$ denotes the filtration generated by $B$, with $B_i(0) = x_i$.

(ii) For each pair $1 \le i < j \le n$, the process $(B_j(t) - B_i(t))_{t \ge 0}$ is a Brownian motion (with diffusivity $\sigma^2 = 2$) with respect to $\mathcal{F}^B$, stopped at the first time it hits $\mathbb{Z}$.

The intuitive interpretation is of a family of Brownian motions on the circle,

the paths of any two of which are independent until they meet, at which point

they coalesce.

Definition 4.2. An $n$-dimensional flow of coalescing Brownian motions on the circle, starting at $(x_1, s_1), \ldots, (x_n, s_n)$, is an $n$-dimensional random process $B = (B_1, \ldots, B_n)$ with the following properties.

(i) For each $1 \le i \le n$, the continuous process $(B_i(t))_{t \ge s_i}$ is a standard Brownian motion with respect to $\mathcal{F}^B$, where $\mathcal{F}^B$ denotes the filtration generated by $B$, with $B_i(s_i) = x_i$.

(ii) For each pair $1 \le i < j \le n$, the process $(B_j(t) - B_i(t))_{t \ge s_i \vee s_j}$ is a Brownian motion (with diffusivity $\sigma^2 = 2$) with respect to $\mathcal{F}^B$, stopped at the first time it hits $\mathbb{Z}$.

Definition 4.3. The Arratia flow on the circle is a family of random processes $\{(B_{sx}(t))_{t \ge s} : s, x \in \mathbb{R}\}$, with $B_{sx}(s) = x$, such that for any deterministic $n \in \mathbb{N}$ and any $(s_1, x_1), \ldots, (s_n, x_n) \in \mathbb{R}^2$, $B_{s_1 x_1}, \ldots, B_{s_n x_n}$ is an $n$-dimensional flow of coalescing Brownian motions on the circle.

The construction of the above objects is discussed in Arratia [2].

In the following theorem we show that the finite dimensional distributions of

the flows constructed in Section 4.2.2 converge to those of the Arratia flow in the


following sense. Fix $n \in \mathbb{N}$ and $(s_1, x_1), \ldots, (s_n, x_n) \in \mathbb{R}^2$. Suppose that for each $f \in D_0$, $\mu_f$ is the law of $((X_{ts_1}(x_1))_{t \ge s_1}, \ldots, (X_{ts_n}(x_n))_{t \ge s_n})$, and $\mu_B$ is the law of $(B_{s_1 x_1}, \ldots, B_{s_n x_n})$. Then

$$((X_{ts_1}(x_1))_{t \ge s_1}, \ldots, (X_{ts_n}(x_n))_{t \ge s_n}) \Rightarrow (B_{s_1 x_1}, \ldots, B_{s_n x_n})$$

as $\eta \to \infty$ and $\delta \to 0$ means

$$\sup\{d(\mu_f, \mu_B) : f \in D_0,\ \eta(f) \ge N,\ \delta(f) \le N^{-1}\} \to 0$$

as $N \to \infty$, where $d$ is the Prohorov metric on the space of probability measures (see Ethier and Kurtz [9]).

Throughout the rest of this chapter we use interchangeably the notation $X_{ts}$ and $X_{t,s}$ and, similarly, $B_{sx}$ and $B_{s,x}$.

Theorem 4.4. Let $X$ be constructed as above, and $\eta$, $\delta$ be defined as in (4.2) and (4.5). Then the following hold.

(i) As $\eta \to \infty$,

\[
(X_{ts}(x))_{t \ge s} \Rightarrow (B_{sx}(t))_{t \ge s},
\]

where $(B_{sx}(t))_{t \ge s}$ is a standard Brownian motion, starting from $x$ at time $s$.

(ii) Given any $n \in \mathbb{N}$ and $(s_1, x_1), \dots, (s_n, x_n) \in \mathbb{R}^2$,

\[
((X_{ts_1}(x_1))_{t \ge s_1}, \dots, (X_{ts_n}(x_n))_{t \ge s_n}) \Rightarrow B
\]

as $\eta \to \infty$ and $\delta \to 0$, where $B$ is a flow of coalescing Brownian motions starting at $(s_1, x_1), \dots, (s_n, x_n)$.

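Proof. (i) Denote $X_t = X_{ts}(x)$, $t \ge s$. By Itô's formula (see, for example, Kallenberg [24]),

\[
\left( \exp\left( i\theta X_t - \int_{(s,t]\times[0,1)} \left( e^{i\theta f(X_r - u)} - 1 - i\theta f(X_r - u) \right) \eta\,dr\,du \right) \right)_{t \ge s}
\]

is a martingale and hence, for any $\theta \in \mathbb{R}$ and $t_2 > t_1 > s$,

\[
\begin{aligned}
E\left(e^{i\theta(X_{t_2} - X_{t_1})}\right)
&= E\left( \exp\left( \int_{(t_1,t_2]\times[0,1)} \left( e^{i\theta f(X_r - u)} - 1 - i\theta f(X_r - u) \right) \eta\,dr\,du \right) \right) \\
&= E\left( \exp\left( \int_{(t_1,t_2]\times[0,1)} \int_0^1 \eta\,(i\theta f(X_r - u))^2 (1 - z)\, e^{iz\theta f(X_r - u)}\,dz\,dr\,du \right) \right) \\
&= \exp\left( -\theta^2 (t_2 - t_1) \int_0^1 (1 - z)\,\eta \int_{[0,1)} f(u)^2 e^{iz\theta f(u)}\,du\,dz \right),
\end{aligned}
\]

where the second equality follows by Taylor's theorem and the third follows by Fubini's theorem and substituting $u$ for $X_r - u$.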
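For reference, the Taylor step in the second equality is the integral form of the second-order remainder of the exponential, applied with $y = \theta f(X_r - u)$. The identity below is standard and is written out here only as a reading aid:

```latex
% Integral form of the second-order Taylor remainder, valid for all real y.
\[
  e^{iy} - 1 - iy = (iy)^2 \int_0^1 (1 - z)\, e^{izy}\, dz .
\]
% Check by integrating by parts in z:
\[
  \int_0^1 (1 - z)\, e^{izy}\, dz
  = -\frac{1}{iy} + \frac{e^{iy} - 1}{(iy)^2} .
\]
```

With $y = \theta f(X_r - u)$ one has $(iy)^2 = -\theta^2 f(X_r - u)^2$, which accounts for the factor $-\theta^2$ in the final line of the display.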

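By (4.2) and (4.3),

\[
\left| \eta \int_{[0,1)} f(u)^2 e^{iz\theta f(u)}\,du - 1 \right|
\le z|\theta|\,\eta \int_{[0,1)} |f(u)|^3\,du
\le z|\theta|\,\|f\| \to 0
\]

as $\eta \to \infty$. Hence,

\[
\left| E\left(e^{i\theta(X_{t_2} - X_{t_1})}\right) - \exp\left( -\tfrac{1}{2}\theta^2 (t_2 - t_1) \right) \right|
\le \tfrac{1}{2}|\theta|^3 (t_2 - t_1)\,\|f\| \to 0
\]

as $\eta \to \infty$.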
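The first of these bounds can be unpacked as follows. The sketch assumes, as the displays suggest, that (4.2) normalises $\eta$ so that $\eta \int_{[0,1)} f(u)^2\,du = 1$ and that $|f(u)| \le \|f\|$ for all $u$; these are readings of definitions given earlier in the chapter, not restatements of them.

```latex
\[
\begin{aligned}
\left| \eta \int_{[0,1)} f(u)^2 e^{iz\theta f(u)}\,du - 1 \right|
  &= \left| \eta \int_{[0,1)} f(u)^2 \left( e^{iz\theta f(u)} - 1 \right) du \right| \\
  &\le \eta \int_{[0,1)} f(u)^2\, z|\theta|\,|f(u)|\,du
     && \text{since } |e^{iy} - 1| \le |y| \\
  &\le z|\theta|\,\|f\|\, \eta \int_{[0,1)} f(u)^2\,du
   = z|\theta|\,\|f\| .
\end{aligned}
\]
```

Since $\int_0^1 (1 - z)\,dz = \tfrac{1}{2}$, replacing the inner integral by 1 in the characteristic function gives exactly $\exp(-\tfrac{1}{2}\theta^2(t_2 - t_1))$, the characteristic function of a Brownian increment.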

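As $X_t$ has independent increments, the finite dimensional distributions converge to those of Brownian motion.

It remains to show that the family of laws of $(X_t)_{t \ge s}$ for all $f \in \mathcal{D}_0$ is tight. We prove the more general result that for any $n \in \mathbb{N}$ and $(s_1, x_1), \dots, (s_n, x_n) \in \mathbb{R}^2$, the family of laws of $((X_{ts_1}(x_1))_{t \ge s_1}, \dots, (X_{ts_n}(x_n))_{t \ge s_n})$, $f \in \mathcal{D}_0$, is tight.

Suppose that $(s, x) \in \mathbb{R}^2$ and $s \le r \le u \le t$. Then, writing $(q, v)$ for the variables of integration to avoid a clash with the times $r$ and $u$,

\[
\begin{aligned}
E\left( |X_{us}(x) - X_{rs}(x)|^2\, |X_{ts}(x) - X_{us}(x)|^2 \right)
&= E\left( |X_{us}(x) - X_{rs}(x)|^2 \right) E\left( |X_{ts}(x) - X_{us}(x)|^2 \right) \\
&= E\left( \int_{(r,u]\times[0,1)} f(X_{qs}(x) - v)^2\, \eta\,dq\,dv \right)
   \times E\left( \int_{(u,t]\times[0,1)} f(X_{qs}(x) - v)^2\, \eta\,dq\,dv \right) \\
&= (u - r)(t - u) \\
&\le (t - r)^2 .
\end{aligned}
\]

Hence (see Billingsley [3], page 143), $(X_{ts}(x))_{t \ge s} \Rightarrow (B_{sx}(t))_{t \ge s}$ as $\eta \to \infty$.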

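(ii) We first show the following. For any $x < y \in [0, 1)$, let $(W_{ts})_{t \ge s}$ be a Brownian motion on the circle with diffusivity $\sigma^2 = 2$, starting from $y - x$. Given any $\varepsilon > 0$, let $S_\varepsilon = \inf\{t \ge s : X_{ts}(y) - X_{ts}(x) \notin (\varepsilon, 1 - \varepsilon)\}$ and $T_\varepsilon = \inf\{t \ge s : W_{ts} \notin (\varepsilon, 1 - \varepsilon)\}$. Then as $\eta \to \infty$ and $\delta \to 0$,

\[
\left( X^{S_\varepsilon}_{ts}(y) - X^{S_\varepsilon}_{ts}(x) \right)_{t \ge s} \Rightarrow \left( W^{T_\varepsilon}_{ts} \right)_{t \ge s} .
\]

Suppose that $X'$ is constructed on the same probability space as $X$, with $X$ and $X'$ independent and identically distributed. Define

\[
Z_t(x) = X_{ts}(x), \quad t \ge s, \qquad
Z_t(y) =
\begin{cases}
X_{ts}(y), & s \le t \le S_\varepsilon, \\
X'_{tS_\varepsilon}(X_{S_\varepsilon s}(y)), & t > S_\varepsilon .
\end{cases}
\]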


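For any $\theta \in \mathbb{R}$ and $t > s$, as in the proof of (i),

\[
\begin{aligned}
E\left(e^{i\theta(Z_t(x) - Z_t(y))}\right)
&= E\Big( e^{i\theta(Z_{t \wedge S_\varepsilon}(x) - Z_{t \wedge S_\varepsilon}(y))}\,
   E\big( e^{i\theta((Z_t(x) - Z_{t \wedge S_\varepsilon}(x)) - (Z_t(y) - Z_{t \wedge S_\varepsilon}(y)))} \,\big|\, \mathcal{F}_{S_\varepsilon} \big) \Big) \\
&= E\Big( \exp\Big( -\theta^2 \int_{(s, t \wedge S_\varepsilon]\times[0,1)} \int_0^1
   \eta\,\big( f(Z_r(x) - u) - f(Z_r(y) - u) \big)^2 \\
&\qquad\qquad\qquad \times (1 - z)\, e^{iz\theta( f(Z_r(x) - u) - f(Z_r(y) - u) )}\,dz\,dr\,du \Big) \\
&\qquad \times E\big( e^{i\theta(Z_t(x) - Z_{t \wedge S_\varepsilon}(x))} \,\big|\, \mathcal{F}_{S_\varepsilon} \big)\,
   E\big( e^{i\theta(Z_t(y) - Z_{t \wedge S_\varepsilon}(y))} \,\big|\, \mathcal{F}_{S_\varepsilon} \big) \Big) .
\end{aligned}
\]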

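Now provided $\delta < \varepsilon$, by (4.5),

\[
\left| \eta \int_{[0,1)} f(Z_r(x) - u)\, f(Z_r(y) - u)\, e^{iz\theta( f(Z_r(x) - u) - f(Z_r(y) - u) )}\,du \right| \le \delta
\]

for all $s \le r \le S_\varepsilon$, and so, by a similar argument to (i),

\[
\begin{aligned}
&\left| E\left(e^{i\theta(Z_t(x) - Z_t(y))}\right) - \exp\left( -\theta^2 (t - s) \right) \right| \\
&\qquad = \left| E\left(e^{i\theta(Z_t(x) - Z_t(y))}\right)
   - E\left( e^{-\theta^2 (t \wedge S_\varepsilon - s)}\, e^{-\frac{1}{2}\theta^2 (t - t \wedge S_\varepsilon)}\, e^{-\frac{1}{2}\theta^2 (t - t \wedge S_\varepsilon)} \right) \right|
   \to 0
\end{aligned}
\]

as $\eta \to \infty$ and $\delta \to 0$.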

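Convergence of the finite dimensional distributions and tightness can be shown by similar arguments to (i). Hence, $(Z_t(x) - Z_t(y))_{t \ge s} \Rightarrow (W_{ts})_{t \ge s}$ and so $(X^{S_\varepsilon}_{ts}(x) - X^{S_\varepsilon}_{ts}(y))_{t \ge s} \Rightarrow (W^{T_\varepsilon}_{ts})_{t \ge s}$, as $\eta \to \infty$ and $\delta \to 0$.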

We now prove the general result. Suppose that for each $f \in \mathcal{D}_0$ the law of $((X_{ts_1}(x_1))_{t \ge s_1}, \dots, (X_{ts_n}(x_n))_{t \ge s_n})$ is given by $\mu_f$. By (i) we know that the family of laws $\{\mu_f : f \in \mathcal{D}_0\}$ is tight. Hence, every sequence with $\eta \to \infty$ and $\delta \to 0$ has a subsequence which converges weakly. We prove that any such weak limit law is equal to $\mu_B$, the law of $(B_{s_1x_1}, \dots, B_{s_nx_n})$.

It is enough to show that for all $x < y \in [0, 1)$, if $(X_{ts}(y) - X_{ts}(x))_{t \ge s} \Rightarrow (Y_{ts})_{t \ge s}$, then $(Y_{ts})_{t \ge s} \sim (B^T_{ts})_{t \ge s}$, where $(B_{ts})_{t \ge s}$ is a Brownian motion on the circle with diffusivity $\sigma^2 = 2$, starting from $y - x$, and $T = \inf\{t \ge s : B_{ts} \in \{0, 1\}\}$. Since our initial result is true for all values of $\varepsilon > 0$, and $(Y_{ts})_{t \ge s}$ is almost surely continuous, $(Y^S_{ts})_{t \ge s} \sim (B^T_{ts})_{t \ge s}$, where $S = \inf\{t \ge s : Y_{ts} \in \{0, 1\}\}$. It remains to show that $Y_{t+S,s}\mathbf{1}_{\{S < \infty\}} = Y_{Ss}\mathbf{1}_{\{S < \infty\}}$ almost surely for all $t > 0$.

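We show that $(Y_{ts})_{t \ge s}$ is a martingale (with respect to the natural filtration) by the following standard argument. Suppose that $s \le t_1 < \dots < t_n \le t < u$. Since $X_{ts}$ has independent increments of mean 0, and $X_{ts}(y) - X_{ts}(x)$ takes values in the interval $[0, 1]$, if $g : \mathbb{R}^n \to \mathbb{R}$ is any bounded continuous function, then

\[
\begin{aligned}
&E\big( (X_{us}(y) - X_{us}(x))\, g(X_{t_1s}(y) - X_{t_1s}(x), \dots, X_{t_ns}(y) - X_{t_ns}(x)) \big) \\
&\qquad = E\big( (X_{ts}(y) - X_{ts}(x))\, g(X_{t_1s}(y) - X_{t_1s}(x), \dots, X_{t_ns}(y) - X_{t_ns}(x)) \big) .
\end{aligned}
\]

But

\[
E\big( (X_{us}(y) - X_{us}(x))\, g(X_{t_1s}(y) - X_{t_1s}(x), \dots, X_{t_ns}(y) - X_{t_ns}(x)) \big)
\to E\big( Y_{us}\, g(Y_{t_1s}, \dots, Y_{t_ns}) \big)
\]

and

\[
E\big( (X_{ts}(y) - X_{ts}(x))\, g(X_{t_1s}(y) - X_{t_1s}(x), \dots, X_{t_ns}(y) - X_{t_ns}(x)) \big)
\to E\big( Y_{ts}\, g(Y_{t_1s}, \dots, Y_{t_ns}) \big) .
\]

Therefore $(Y_{ts})_{t \ge s}$ is a martingale taking values in $[0, 1]$. Hence, by the optional stopping theorem,

\[
E\big( Y_{t+S,s}\mathbf{1}_{\{S < \infty,\, Y_{Ss} = 0\}} \big) = E\big( Y_{Ss}\mathbf{1}_{\{S < \infty,\, Y_{Ss} = 0\}} \big) = 0
\]

and so, since $Y$ is non-negative, $Y_{t+S,s}\mathbf{1}_{\{S < \infty,\, Y_{Ss} = 0\}} = 0$ almost surely. Similarly $Y_{t+S,s}\mathbf{1}_{\{S < \infty,\, Y_{Ss} = 1\}} = \mathbf{1}_{\{S < \infty,\, Y_{Ss} = 1\}}$ almost surely, as required.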


4.3 Hastings–Levitov DLA

In this section we describe diffusion-limited aggregation (DLA) and show how the

processes defined in Section 4.2.2 occur naturally in a simplified version of the

Hastings-Levitov model for planar DLA [17].

Diffusion-limited aggregation is a random growth model which was originally

introduced in 1981 by Witten and Sander [35]. Consider the unit disc at the origin

of the plane R2. A particle is introduced at “infinity” and performs a Brownian

motion until it contacts the unit disc, at which point it sticks. A second particle

is introduced and performs a Brownian motion until it contacts either the disc or

the first particle at which point it sticks. Further particles are introduced creating

a large tree-like structure, the shape of which is strongly dependent on the size of

the incoming particles. Of particular interest is the limiting case where the particle

size tends to zero as the rate of arrivals tends to infinity. Simulations suggest that

an incoming particle is most likely to attach to the tips of the cluster; the resulting

structure is therefore highly branched and fractal.

Hastings and Levitov formulated a model of DLA in which the cluster is repre-

sented by a sequence of iterated conformal maps. We describe a simplified version

of their construction below.

Suppose that $\mathbb{D}$ is the open unit disc, and let $A$ be a compact subset of $\mathbb{C}$, of diameter $r_0$, such that $K = A \cup \mathbb{D}$ is simply connected. Set $D_0 = \mathbb{C} \setminus \mathbb{D}$ and $D = \mathbb{C} \setminus K$. There is a unique conformal map $g : D \to D_0$ and a unique constant $\kappa \in [0, \infty)$ such that $g(z) \sim e^{-\kappa} z$ as $z \to \infty$ (see, for example, Lawler [27]).

For $\theta \in [0, 2\pi)$ and $z \in e^{i\theta}D$, set $g_\theta(z) = e^{i\theta} g(e^{-i\theta} z)$. Let $(\Theta_n)_{n \in \mathbb{N}}$ be a sequence of independent random variables, distributed uniformly on $[0, 2\pi)$, and let $(\nu_t)_{t \ge 0}$ be a Poisson process of rate 1, with jump times $0 = T_0 < T_1 < \cdots$.

For each $z \in D_0$, define a $\mathbb{C}$-valued jump process $(Z_t(z))_{t < \zeta(z)}$ as follows. Set $Z_0(z) = z$, and initially set $\zeta(z) = \infty$. While $\zeta(z) = \infty$, recursively for $n \ge 0$, check whether $Z_{T_n}(z) \in e^{i\Theta_{n+1}}K$. If so, set $\zeta(z) = T_{n+1}$; otherwise, set $Z_{T_{n+1}}(z) = g_{\Theta_{n+1}}(Z_{T_n}(z))$. Then, for $T_n \le t < T_{n+1} \wedge \zeta(z)$, let $Z_t(z) = Z_{T_n}(z)$.

We define set-valued jump processes $(K_t)_{t \ge 0}$, $(D_t)_{t \ge 0}$ by $D_t = \{z \in D_0 : \zeta(z) > t\}$ and $K_t = \mathbb{C} \setminus D_t$. Note that $Z_t$ is the unique conformal map $D_t \to D_0$ with $Z_t(z) \sim e^{-\kappa\nu_t} z$ as $z \to \infty$. The set $K_t$ represents the cluster formed by particles with shape $A \setminus \mathbb{D}$ and hence studying the process $Z_t$ gives an insight into the evolving shape of the cluster.

Figure 4.2: The slit model case of simplified Hastings-Levitov DLA. (a) The cluster after a few arrivals with N = 1. (b) The cluster after 100 arrivals with N = 1. (c) The cluster after 800 arrivals with N = 10. (d) The cluster after 5000 arrivals with N = 25. (e) The cluster after 20000 arrivals with N = 50. (f) The stochastic flow $(X_{t0})_{t \in [0,1]}$ with N = 50.

This model is a simplification of Hastings-Levitov DLA. At each iteration we

are scaling the size of the cluster boundary, whilst keeping the size and shape of

the set A unchanged, whereas for Hastings-Levitov DLA it is necessary to scale

A appropriately as well. As the scaling depends on the position on the cluster

boundary at which the particle attaches itself, this complicates the problem signif-

icantly and, in particular, the resulting process is no longer Markov. As such our

model is a toy model, and differs in structure from DLA. For example, it is easy

to show that the cluster formed by the above model has only one infinite branch,

whereas simulations of actual DLA appear to have several. This simplification is

known as the Eden model [8], and describes the growth of bacterial cells or tissue

cultures of cells that are constrained from moving. Figure 4.2(b) illustrates how,

in the simplified model, the particles arriving later tend to be larger than those

arriving earlier.

The advantage of this model, however, is that the location at which each arriv-

ing particle attaches itself is correctly distributed. A particle performing Brownian

motion will meet the cluster in a location whose distribution is determined by the

harmonic measure along the boundary of the cluster. As conformal mappings pre-

serve harmonic measure, the location is precisely that obtained by ‘centering’ the

map g at a uniformly distributed point on the boundary of the unit disc.

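We are interested in the asymptotics of the processes $Z$ (and the corresponding clusters $K$) in the limit as $r_0$ (and hence $\kappa$) tends to 0. In particular, we study the case where $A$ is the slit $\{y : 1 \le y \le 1 + N^{-1}\}$ as $N \to \infty$ (see Figure 4.2). In this case $g = m^{-1} \circ r \circ h_{\frac{1}{4}\left(\frac{1}{1+2N}\right)^2} \circ m$, where

\[
m(z) = \frac{iz - i}{z + 1}
\]

is the Möbius transformation taking the unit circle to the real line,

\[
r(z) = \frac{z}{\sqrt{1 - \left( \frac{1}{1+2N} \right)^2}}
\]

is a linear scaling, and

\[
h_t(z)^2 = 4t + z^2
\]

is the Loewner transform taking $\mathbb{H} \setminus \{z = si : s \in (0, 2\sqrt{t}]\} \to \mathbb{H}$, the branch of the square root to be used being determined accordingly. (With $t = \frac{1}{4}(1+2N)^{-2}$ the removed slit has height $(1+2N)^{-1}$, which is the image under $m$ of the slit $A \setminus \mathbb{D}$.) This gives

\[
g(z) = \left( 1 - \left( \frac{1}{1+2N} \right)^2 \right) \frac{z + 1}{2z}
\left( z + 1 + \sqrt{ z^2 + 1 - 2z\, \frac{1 + \left( \frac{1}{1+2N} \right)^2}{1 - \left( \frac{1}{1+2N} \right)^2} } \right) - 1 .
\]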
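The composition $g = m^{-1} \circ r \circ h_t \circ m$ can be evaluated numerically, which gives a quick sanity check of the slit map. The sketch below is an illustration only: the function names are hypothetical, and the branch of the square root is chosen as the root in the closed upper half-plane, an assumption consistent with $h_t$ mapping into $\mathbb{H}$.

```python
import cmath
import math


def slit_map(z, N):
    """Evaluate g = m^{-1} o r o h_t o m at a point z outside the unit disc."""
    a = 1.0 / (1 + 2 * N)             # height of the slit after applying m
    w = 1j * (z - 1) / (z + 1)        # m: Mobius map, sends infinity to i
    s = cmath.sqrt(a * a + w * w)     # h_t with 4t = a^2
    if s.imag < 0:                    # assumed branch: root in the upper half-plane
        s = -s
    v = s / math.sqrt(1 - a * a)      # r: linear scaling
    return (1j + v) / (1j - v)        # inverse of m


def rotated_slit_map(z, theta, N):
    """g_theta(z) = e^{i theta} g(e^{-i theta} z), the map centred at angle theta."""
    rot = cmath.exp(1j * theta)
    return rot * slit_map(z / rot, N)


N = 10
a = 1.0 / (1 + 2 * N)
tip = slit_map(1 + 1.0 / N, N)        # tip of the slit, sent to the unit circle
ratio = slit_map(1e6, N) / 1e6        # approximates e^{-kappa} = 1 - a^2
print(abs(tip), ratio.real, 1 - a * a)
```

Composing `rotated_slit_map` at independent uniform angles, as in the definition of $Z_t$, gives one way to experiment with the cluster map for small numbers of arrivals.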

The process Z exhibits some interesting and unexpected behaviour. In particu-

lar, let us consider how the boundary of the disc evolves under appropriate scaling.

Let Xts(x) be the position at time t of the point on the boundary that was at x

at time s, under the identification of the boundary with the interval [0, 1).

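Scaling time by $\eta$ (to be determined),

\[
X_{ts}(x) = F_t(X_{t-,s}(x)),
\]

where $F_t$ is constructed as in Section 4.2.2 from $f \in \mathcal{D}_0$ given by

\[
f(x + n) = n +
\begin{cases}
\pi^{-1} \tan^{-1} \sqrt{ \dfrac{\tan^2 \pi x + (1+2N)^{-2}}{1 - (1+2N)^{-2}} }, & x \in [0, \tfrac{1}{2}), \\[2ex]
-\pi^{-1} \tan^{-1} \sqrt{ \dfrac{\tan^2 \pi x + (1+2N)^{-2}}{1 - (1+2N)^{-2}} }, & x \in (\tfrac{1}{2}, 1),
\end{cases}
\]

for each $n \in \mathbb{N}$. Then $\|f\| = \Theta(N^{-1})$ and $b(f) = 0$. By considering separately the contributions when $x = O(N^{-1})$ and $x = \omega(N^{-1})$, and using appropriate Taylor expansions, it can be shown that

\[
6N^3\pi^3 \int_{[0,1)} f(x)^2\,dx \to 1
\]

as $N \to \infty$ and hence $\eta(f) = 6N^3\pi^3 + o(N^3) \to \infty$ as $N \to \infty$.

Also, by considering separately the cases $x = O(N^{-1})$ with $x + a = \omega(N^{-1})$, and $x = \omega(N^{-1})$, it can be shown that

\[
N^{1/4} \sup_{N^{-1/4} \le a \le 1 - N^{-1/4}} \eta \int_{[0,1)} |f(x + a) f(x)|\,dx \to 0
\]

as $N \to \infty$. Therefore $\delta(f) = O(N^{-1/4})$ and so, in particular, $\delta \to 0$ as $N \to \infty$.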

Hence, X satisfies the conditions of Theorem 4.4 and so its finite dimensional

distributions converge to those of the Arratia flow as N → ∞ (see Figure 4.2(f)).

We shall prove in the next section the stronger result that X converges to the

Brownian web.
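The normalisation $6N^3\pi^3 \int f^2 \to 1$ stated above can be checked numerically. The sketch below rests on an assumed reading of the notation: the integrand is taken to be the displacement $f(x) - x$ of the boundary map, integrated over one period $(-\tfrac{1}{2}, \tfrac{1}{2})$, with the negative branch of the formula used for $x < 0$. The function names and the midpoint rule are choices made for this example.

```python
import math


def displacement(x, N):
    """Assumed reading: f(x) - x for x in (-1/2, 1/2), x != 0."""
    a2 = (1 + 2 * N) ** -2
    t = math.tan(math.pi * x)
    angle = math.atan(math.sqrt((t * t + a2) / (1 - a2))) / math.pi
    return (angle if x > 0 else -angle) - x


def scaled_second_moment(N, points=100_000):
    """Midpoint rule for 6 N^3 pi^3 times the integral of displacement^2."""
    h = 1.0 / points
    total = 0.0
    for k in range(points):
        x = -0.5 + (k + 0.5) * h   # midpoints avoid x = 0 and x = +-1/2
        total += displacement(x, N) ** 2
    return 6 * N ** 3 * math.pi ** 3 * total * h


for N in (50, 200):
    print(N, scaled_second_moment(N))
```

Under this reading the printed values approach 1 as $N$ grows, in line with $\eta(f) = 6N^3\pi^3 + o(N^3)$; the leading correction comes from replacing $(1+2N)^{-1}$ by $(2N)^{-1}$.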

In future work, we intend to investigate scaling limits for the evolution of the

conformal map in the above model away from the boundary and, therefore, to

obtain a limiting structure for the cluster Kt.

4.4 The Brownian web

The Brownian web is the collection of graphs of coalescing one-dimensional Brown-

ian motions (with unit diffusion constant and zero drift) starting from all possible

points in continuous space-time. This object was originally studied in 1979 by

Arratia [1]. Further work has been carried out by many people including Harris

[15], who was interested in coalescing stochastic flows, and Piterbarg [29], who

showed that the Arratia flow arises as a weak limit of rescaled isotropic stochastic

flows. In 1998 Toth and Werner [30] took an alternative view of the Brownian web,

showing that it could be used to construct a continuous true self-repelling motion.

Fontes, Isopi, Newman and Ravishankar [10] extended this work in 2004 by pro-

viding a new characterization and obtaining convergence results. In 2005, Fontes

and Newman [11] studied a closely related object, the so-called full Brownian web.

In both [10] and [11], the authors obtained results by regarding the Brownian

web as an object in the space of compact sets of functions with specified starting

points. By instead regarding the Brownian web as an element of a space of flows,

we are able to simplify these authors’ characterization and convergence results.

The exact correspondence between our work and that in [10] and [11] is discussed

in Section 4.6. The viewpoint that we take of the Brownian web being an element

of an equivalence class on a space of stochastic flows, is used by Tsirelson [31].

However, he does not explicitly construct a metric space on which to realize the

flows in the way that we do below.

In Section 4.4.1 we define the space of flows. In Section 4.4.2 we construct


the Brownian web as a random element of this space and prove uniqueness of the

distribution. Section 4.4.3 establishes criteria for convergence to the Brownian web,

and shows that the flows constructed in Section 4.2.2 converge to the Brownian

web as elements of the space of flows. The proofs of the various technical results

referred to in this section can be found in Section 4.5.

4.4.1 A description of the flow space

Throughout the rest of this chapter we limit ourselves to time taking values in the compact interval $[-T, T]$ for some fixed $T > 0$. An alternative approach (used in [10] and [11]) is to introduce a metric under which $\mathbb{R} \cup \{\infty\}$ is compact. Both approaches result in identical topologies, but we use the former as it avoids the technical difficulties resulting from compactification.

Let $\mathcal{D}_0$ be the set of nondecreasing degree 1 maps on the circle, and $([\mathcal{D}_0], d_{\mathcal{D}_0})$ be the metric space of equivalence classes on $\mathcal{D}_0$, defined in Section 4.2.1.

Definition 4.5. We say that $\phi : \{(s, t) \in [-T, T] \times [-T, T] : s \le t\} \times \mathbb{R} \to \mathbb{R}$, denoted by $\phi(s, t, x) = \phi_{ts}(x)$, is a cadlag flow on the circle if it satisfies the following properties.

(a) For all $s \le t$, $\phi_{ts} \in \mathcal{D}_0$.

(b) For all $s \le t \le u$, $\phi_{us} = \phi_{ut} \circ \phi_{ts}$.

(c) For all $s \in [-T, T]$, $\phi_{ss} = \mathrm{id}$.

(d) For all $s \in [-T, T]$ and $x \in \mathbb{R}$, $\phi(s, \cdot, x) : [s, T] \to \mathbb{R}$ is right continuous with regular left limits, where we say $\phi_{t-,s}(x)$ is regular if $\phi_{t-,s}(x) \to x$ as $s \uparrow t$.

We define an equivalence relation on this set of flows by $\phi \sim \phi'$ if $\phi_{ts}(x) = \phi'_{ts}(x)$ at all points $(s, t, x)$ for which $\phi_{ts}$ is continuous at $x$. Let $\mathcal{D}$ be the space of all equivalence classes of cadlag flows on the circle, together with the metric

\[
d_{\mathcal{D}}(\phi^1, \phi^2) = \inf_{\lambda \in \Lambda} \left[ \gamma(\lambda) \vee \sup_{s \le t} d_{\mathcal{D}_0}\big( \phi^1_{ts}, \phi^2_{\lambda(t)\lambda(s)} \big) \right],
\]

where $\Lambda$ is the set of strictly increasing functions $\lambda$ mapping $[-T, T]$ onto itself for which

\[
\gamma(\lambda) = \sup_{-T \le t < u \le T} \left| \log \frac{\lambda(u) - \lambda(t)}{u - t} \right| < \infty
\]

(cf. the Skorohod metric (see Billingsley [3]) on the space $D_{\mathbb{R}}[-T, T]$ of cadlag functions from $[-T, T]$ to $\mathbb{R}$. The difference here is that the supremum is taken over the set $\{(s, t) \in [-T, T] \times [-T, T] : s \le t\}$). $(\mathcal{D}, d_{\mathcal{D}})$ is a complete separable metric space (see Theorem 4.15). Define $\mathcal{F}_{\mathcal{D}}$ to be the Borel σ-algebra on $\mathcal{D}$.
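For a piecewise linear time change $\lambda$, the quantity $\gamma(\lambda)$ reduces to the largest value of $|\log(\text{slope})|$ over the linear pieces, because every chord slope is a weighted average of segment slopes. The sketch below assumes the absolute-value form of $\gamma$ (as in Billingsley's version of the Skorohod metric) and uses a hypothetical helper name; the sample time change mimics the kind of piecewise linear $\lambda$, with slopes $e^{-\varepsilon}$ and $e^{\varepsilon}$, used in the separability argument of Theorem 4.15.

```python
import math


def gamma_piecewise_linear(knots):
    """gamma(lambda) for a piecewise linear, strictly increasing time change.

    `knots` lists the breakpoints (t, lambda(t)) in increasing order. Every
    chord slope is a weighted average of segment slopes, so the supremum of
    |log slope| over all chords is attained on a single segment.
    """
    worst = 0.0
    for (t0, l0), (t1, l1) in zip(knots, knots[1:]):
        worst = max(worst, abs(math.log((l1 - l0) / (t1 - t0))))
    return worst


T, eps = 1.0, 0.1
s_j, t_next = 0.15, 0.2   # hypothetical jump time and the grid point just after it
left = (s_j - math.exp(-eps) * t_next) / (1 - math.exp(-eps))
right = (s_j - math.exp(eps) * t_next) / (1 - math.exp(eps))
lam = [(-T, -T), (left, left), (t_next, s_j), (right, right), (T, T)]
print(gamma_piecewise_linear(lam))
```

The time change `lam` moves the grid point `t_next` onto the jump time `s_j` while keeping $\gamma(\lambda)$ equal to `eps` up to rounding, which is the trade-off the metric $d_{\mathcal{D}}$ is designed to measure.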

Definition 4.6. We now give an alternative formulation of $\mathcal{D}$ which only depends on the value of $\phi$ at continuity points. A map $\phi : \{(s, t) \in [-T, T] \times [-T, T] : s \le t\} \to [\mathcal{D}_0]$, denoted by $\phi(s, t) = \phi_{ts}$, is a cadlag flow on the circle if it satisfies the following properties.

(a′) For all $s \le t$, $\phi_{ts} \in [\mathcal{D}_0]$.

(b′) For all $s \le t \le u$, $\phi_{us}(x) = \phi_{ut}(\phi_{ts}(x))$ at every point $x$ for which $\phi_{ts}$ is continuous at $x$ and $\phi_{ut}$ is continuous at $\phi_{ts}(x)$.

(c′) For all $s \in [-T, T]$, $\phi_{ss} = \mathrm{id}$.

(d′) Given $\varepsilon > 0$, for each $s \in [-T, T]$ there exists $\delta > 0$ such that $\|\phi_{ut} - \mathrm{id}\| < \varepsilon$ for all $s \le t \le u < s + \delta$ and all $s - \delta < t \le u < s$.

It is shown in Lemma 4.14 that if $\phi$ satisfies the above conditions, then there exists some $\phi' \sim \phi$ that satisfies the conditions in Definition 4.5.

In what follows, we use interchangeably the same notation to mean a class

representative of an element of D, an element of D defined as in Definition 4.5,

and an element of D defined as in Definition 4.6. Lemma 4.14 shows that it is

consistent to view these three objects as the same thing.

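Definition 4.7. We say that $\phi : \{(s, t) \in [-T, T] \times [-T, T] : s \le t\} \times \mathbb{R} \to \mathbb{R}$, denoted by $\phi(s, t, x) = \phi_{ts}(x)$, is a compact flow on the circle if conditions (c) and (d) in Definition 4.5 are replaced by the following condition.

(e) For every $\varepsilon > 0$, there exists some $\delta > 0$ such that for all $s \le t \le s + \delta$, $\|\phi_{ts} - \mathrm{id}\| < \varepsilon$.

Let $\mathcal{C} \subset \mathcal{D}$ be the space of all equivalence classes of compact flows on the circle. The metric $d_{\mathcal{D}}$ restricted to $\mathcal{C}$ simplifies to

\[
d_{\mathcal{C}}(\phi^1, \phi^2) = \sup_{s \le t} d_{\mathcal{D}_0}\big( \phi^1_{ts}, \phi^2_{ts} \big) .
\]

Define $\mathcal{F}_{\mathcal{C}}$ to be the Borel σ-algebra on $\mathcal{C}$.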

4.4.2 Existence and uniqueness of the Brownian web

The following theorem characterizes the Brownian web.

Theorem 4.8. There exists a $(\mathcal{C}, \mathcal{F}_{\mathcal{C}})$-valued random variable $W$ whose distribution is uniquely determined by the following property. If $\mathcal{E}$ is any deterministic countable dense subset of $[-T, T] \times [0, 1)$ (dense on $\{-T\} \times [0, 1)$), then for any deterministic $m \in \mathbb{N}$ and $(s_1, x_1), \dots, (s_m, x_m) \in \mathcal{E}$, the joint distribution of $W_{ts_1}(x_1), \dots, W_{ts_m}(x_m)$ is that of coalescing Brownian motions on the circle (with unit diffusion constant).

Proof. We first prove that the distribution determined by the condition in Theorem 4.8 is unique. Suppose that $W^1, W^2$ are two such $\mathcal{C}$-valued random variables. Define $\mathcal{S}$ to be the d-system given by

\[
\mathcal{S} = \{ C \in \mathcal{F}_{\mathcal{C}} : P(W^1 \in C) = P(W^2 \in C) \} .
\]

Now $\mathcal{S}$ contains the π-system consisting of all finite intersections of open cylinders, defined in Definition 4.16 in the following section. By Lemma 4.17, the open cylinders of $\mathcal{C}$ generate the σ-algebra $\mathcal{F}_{\mathcal{C}}$ and so, by Dynkin's π-system Lemma, $\mathcal{S} = \mathcal{F}_{\mathcal{C}}$. Hence, $W^1$ and $W^2$ have the same distribution.

We construct a class representative of a $\mathcal{C}$-valued random variable with the required property as follows. Let $(\Omega, \mathcal{F}, P)$ be a probability space on which $(B_j)_{j \in \mathbb{N}}$, an independent identically distributed family of standard Brownian motions, is defined. Suppose that $\mathcal{E} = \{(s_j, x_j) : j \in \mathbb{N}\}$. For each $j \in \mathbb{N}$, $n \in \mathbb{Z}$, we define $W^n_j$ to be the Brownian path starting at position $x_j + n$ at time $s_j$ given by

\[
W^n_j(t) = x_j + n + B_j(t - s_j), \quad t \ge s_j .
\]

Using the method of Arratia [2], we construct coalescing Brownian paths out of the family of paths $(W^n_j)_{j \in \mathbb{N}, n \in \mathbb{Z}}$ by specifying coalescing rules. When two paths meet for the first time, they coalesce into a single path, which is that of the Brownian motion with the lower index. Denote the coalescing Brownian paths by $(\widetilde{W}^n_j)_{j \in \mathbb{N}, n \in \mathbb{Z}}$ (the tilde is used here to distinguish them from the independent paths above). Define a random variable $W : \Omega \times \{(s, t) \in [-T, T] \times [-T, T] : s \le t\} \times \mathbb{R} \to \mathbb{R}$ by

\[
W_{ts}(x) = \inf\{ \widetilde{W}^n_j(t) : s_j \le s,\ \widetilde{W}^n_j(s) \ge x \} .
\]

By definition, for all $(s_j, x_j) \in \mathcal{E}$, $(W_{ts_j}(x_j))_{t \ge s_j} = (\widetilde{W}^0_j(t))_{t \ge s_j}$ and so $W$ has the required finite dimensional distributions. It is proved by Arratia in Section 5 of [2] that almost surely $W$ satisfies the conditions to be a class representative of a $\mathcal{C}$-valued random variable.

In what follows, we fix $\mathcal{E}$ to be a deterministic countable dense subset of $[-T, T] \times [0, 1)$ (dense on $\{-T\} \times [0, 1)$). By the uniqueness theorem proved above, all results are independent of the choice of $\mathcal{E}$.

4.4.3 Convergence to the Brownian web

In this section, we establish criteria under which D-valued random variables con-

verge to the Brownian web, and show that, under specified conditions, the stochas-

tic flows constructed in Section 4.2.2 converge to the Brownian web.

Theorem 4.9. Suppose that $(X^n)_{n \in \mathbb{N}}$ is a sequence of $(\mathcal{D}, \mathcal{F}_{\mathcal{D}})$-valued random variables. If, for any deterministic $m \in \mathbb{N}$ and $(s_1, x_1), \dots, (s_m, x_m) \in \mathcal{E}$, the joint distribution of $X^n_{ts_1}(x_1), \dots, X^n_{ts_m}(x_m)$ converges as $n \to \infty$ to that of coalescing Brownian motions on the circle (with unit diffusion constant), then the distribution $\mu_n$ of $X^n$ converges to the distribution $\mu_W$ of the Brownian web.

Proof. Let $\mathcal{E} = \{(s_i, x_i) : i \in \mathbb{N}\}$ and define $\mu^m = \lim_{n \to \infty} \mu^m_n$, where $\mu^m_n$ is the law of $(X^n_{ts_1}(x_1), \dots, X^n_{ts_m}(x_m))$. By Kolmogorov's existence theorem, there exists a measure $\mu$ on the space of families of continuous functions $\{f_i : [s_i, T] \to \mathbb{R} : i \in \mathbb{N} : f_i(s_i) = x_i\}$ such that $\mu \circ (\pi^m)^{-1} = \mu^m$, where $\pi^m$ is the natural projection map from $\{f_i : [s_i, T] \to \mathbb{R} : i \in \mathbb{N} : f_i(s_i) = x_i\}$ to $\{f_i : [s_i, T] \to \mathbb{R} : i \le m : f_i(s_i) = x_i\}$. Suppose that $\{B_{s_ix_i} : i \in \mathbb{N}\}$ is distributed with law $\mu$. As in the proof of existence in Theorem 4.8, the random variable $W : \Omega \times \{(s, t) \in [-T, T] \times [-T, T] : s \le t\} \times \mathbb{R} \to \mathbb{R}$, defined by

\[
W(\cdot, s, t, x + m) = W_{ts}(x + m) = m + \inf\{ B_{s_jx_j}(t) : s_j \le s,\ B_{s_jx_j}(s) \ge x \}
\]

for all $s \le t$, $x \in [0, 1)$, $m \in \mathbb{Z}$, is almost surely a class representative of a $\mathcal{C}$-valued random variable, whose distribution is that of the Brownian web.

It remains to show that the distributions of the $X^n$ converge to that of $W$ as elements of the metric space $(\mathcal{D}, d_{\mathcal{D}})$. But this is an immediate consequence of Proposition 4.18 in the next section.

Theorem 4.10. Let $X$ be constructed as in Section 4.2.2, and define $\eta$, $\delta$ as in (4.2) and (4.5) respectively. Then as $\eta \to \infty$ and $\delta \to 0$, the distribution of $X$ converges to that of the Brownian web.

Proof. Suppose that $f \in \mathcal{D}_0$ and that $X = X^f$ is constructed as in Section 4.2.2. Since $X_{ts} = F_t \circ X_{t-,s}$ almost surely for some $F_t \in \mathcal{D}_0$, $X$ (with time restricted to the compact set $[-T, T]$) satisfies the conditions in Definition 4.5 and so is a (class representative of a) $\mathcal{D}$-valued random variable.

By Theorem 4.4, the conditions of Theorem 4.9 are satisfied and hence the distribution of $X$ converges to that of the Brownian web.

4.5 Some properties of (D, dD)

This section contains the proofs of various technical results, pertaining to the space

(D, dD), which are referred to elsewhere in this chapter.

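Lemma 4.11. Suppose that $(f_n)_{n \ge 1}$ is a sequence in $\mathcal{D}_0$, with $f_n(x) \to f(x)$ for every $x$ at which $f$ is continuous. Then $d_{\mathcal{D}_0}(f_n, f) \to 0$ as $n \to \infty$.

Proof. It is enough to show that there exists a subsequence $n_r$ for which $f_{n_r} \to f$. Suppose that $t_n \in [0, 1]$ are chosen such that

\[
|\Phi(f_n)(t_n) - \Phi(f)(t_n)| = d_{\mathcal{D}_0}(f_n, f) .
\]

Since $[0, 1]$ is a compact set, there exists a subsequence $n_r$ for which $t_{n_r} \to t$. Let $x_n \in \mathbb{R}$ be such that

\[
t \in [x_n + f_n(x_n-),\ x_n + f_n(x_n+)] .
\]

By restricting to a further subsequence if necessary, there exists some $x \in \mathbb{R}$ such that $x_{n_r} \to x$. Given $\varepsilon > 0$, there exist $x - \varepsilon < y_1 < x < y_2 < x + \varepsilon$ such that $f$ is continuous at $y_1$ and $y_2$. Pick $N \in \mathbb{N}$ sufficiently large that for all $n_r \ge N$, $|t_{n_r} - t| < \varepsilon$, $y_1 < x_{n_r} < y_2$ and $|f_{n_r}(y_i) - f(y_i)| < \varepsilon$ for $i = 1, 2$. Then

\[
f(x - 2\varepsilon) + x - 2\varepsilon \le f(y_1) + x - 2\varepsilon < f_{n_r}(y_1) + x - \varepsilon \le f_{n_r}(x_{n_r}-) + x_{n_r}
\]

and, similarly,

\[
f(x + 2\varepsilon) + x + 2\varepsilon > f_{n_r}(x_{n_r}+) + x_{n_r} .
\]

Hence,

\[
t \in [x - 2\varepsilon + f(x - 2\varepsilon),\ x + 2\varepsilon + f(x + 2\varepsilon)] .
\]

Therefore, if $n_r \ge N$,

\[
\begin{aligned}
d_{\mathcal{D}_0}(f_{n_r}, f) &= |\Phi(f_{n_r})(t_{n_r}) - \Phi(f)(t_{n_r})| \\
&\le |\Phi(f_{n_r})(t_{n_r}) - \Phi(f_{n_r})(t)| + |\Phi(f_{n_r})(t) - \Phi(f)(t)| + |\Phi(f)(t) - \Phi(f)(t_{n_r})| \\
&\le 2|t_{n_r} - t| + |x_{n_r} - (x - 2\varepsilon)| \vee |x_{n_r} - (x + 2\varepsilon)| \\
&< 5\varepsilon,
\end{aligned}
\]

and so $f_{n_r} \to f$ as required.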

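Lemma 4.12. Suppose that $(f_n)_{n \ge 1}$ and $(g_n)_{n \ge 1}$ are sequences in $\mathcal{D}_0$, with $f_n \to f$ and $g_n \to g$ as $n \to \infty$. Then for every $x \in \mathbb{R}$ for which $g$ is continuous at $x$ and $f$ is continuous at $g(x)$, $f_n \circ g_n(x) \to f \circ g(x)$ as $n \to \infty$.

Proof. Given $\varepsilon > 0$, there exists $0 < \delta < \varepsilon$ such that

\[
f(g(x + \delta) + 2\delta) + \delta < f(g(x)) + \varepsilon
\]

and

\[
f(g(x - \delta) - 2\delta) - \delta > f(g(x)) - \varepsilon .
\]

Pick $N \in \mathbb{N}$ sufficiently large that for all $n \ge N$, $d_{\mathcal{D}_0}(f_n, f) < \delta$ and $d_{\mathcal{D}_0}(g_n, g) < \delta$. Then for all $y \in \mathbb{R}$,

\[
f(y - \delta) - \delta < f_n(y-) \le f_n(y) < f(y + \delta) + \delta
\]

and

\[
g(y - \delta) - \delta < g_n(y-) \le g_n(y) < g(y + \delta) + \delta .
\]

Hence,

\[
\begin{aligned}
f \circ g(x) - \varepsilon &< f(g(x - \delta) - 2\delta) - \delta \\
&\le f(g_n(x) - \delta) - \delta \\
&< f_n(g_n(x)) .
\end{aligned}
\]

Similarly

\[
f \circ g(x) + \varepsilon > f_n \circ g_n(x),
\]

and so $f_n \circ g_n(x) \to f \circ g(x)$, as required.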

Lemma 4.13. Suppose that $\phi : \{(s, t) \in [-T, T] \times [-T, T] : s \le t\} \times \mathbb{R} \to \mathbb{R}$, denoted by $\phi(s, t, x) = \phi_{ts}(x)$, satisfies conditions (a), (b) and (c) in Definition 4.5. Then the following are equivalent.

(i) For all $s \in [-T, T]$ and $x \in \mathbb{R}$, $\phi(s, \cdot, x) : [s, T] \to \mathbb{R}$ is right continuous.

(ii) For all $s \in [-T, T]$ and $x \in \mathbb{R}$, $\phi_{ts}(x) \to x$ as $t \downarrow s$.

(iii) Given $\varepsilon > 0$, for each $s \in [-T, T]$ there exists $\delta > 0$ such that $\|\phi_{ut} - \mathrm{id}\| < \varepsilon$ for all $s \le t \le u < s + \delta$.

Similarly, the following are equivalent.

(i′) For all $s \in [-T, T]$ and $x \in \mathbb{R}$, $\phi(s, \cdot, x) : [s, T] \to \mathbb{R}$ has regular left limits.

(ii′) Given $\varepsilon > 0$, for each $s \in [-T, T]$ and $x \in \mathbb{R}$, there exists $\delta > 0$ such that $|\phi_{tu}(x) - x| < \varepsilon$ for all $s - \delta < u \le t < s$.

(iii′) Given $\varepsilon > 0$, for each $s \in [-T, T]$ there exists $\delta > 0$ such that $\|\phi_{ut} - \mathrm{id}\| < \varepsilon$ for all $s - \delta < t \le u < s$.

Proof. It is immediate that (i) implies (ii). Suppose that (ii) holds. Then given $s \in [-T, T]$, for each $m \in \mathbb{N}$, there exists some $\delta > 0$ such that $s \le t < s + \delta$ implies that $|\phi_{ts}(i/m) - i/m| < m^{-1}$, $i = 0, \dots, m - 1$. But then if $t \le u < s + \delta$,

\[
\|\phi_{ut} - \mathrm{id}\| \le \sup_i \big( |\phi_{ts}((i + 2)/m) - (i - 1)/m| \vee |\phi_{ts}((i - 1)/m) - (i + 2)/m| \big) < 4m^{-1},
\]

and so (iii) holds. To see that (iii) implies (i), suppose that $s \le t \in [-T, T]$, $x \in \mathbb{R}$ and $t_n \downarrow t$. Then $|\phi_{t_ns}(x) - \phi_{ts}(x)| \le \|\phi_{t_nt} - \mathrm{id}\| \to 0$.

The proof of the conditions for left limits is similar.

Lemma 4.14. Suppose that $\phi$ satisfies the conditions in Definition 4.6. Then there exists some $\phi'$ with $\phi'_{ts} \sim \phi_{ts}$ for all $s \le t$ that satisfies the conditions in Definition 4.5.

Proof. We first show that, for any $\phi$ satisfying the conditions in Definition 4.6, there are only countably many values of $t \in [-T, T]$ for which $\phi(s, \cdot) : [s, T] \to [\mathcal{D}_0]$ has a discontinuity at $t$ for some $s \le t$. Let $A_n = \{t \in [-T, T] : \|\phi_{ts} - \phi_{t-,s}\| > n^{-1} \text{ for some } s \le t\}$. Suppose that $(t_m)_{m \ge 1}$ is a sequence of distinct points in $A_n$. Since $[-T, T]$ is compact, by restricting to a subsequence if necessary, we may assume $t_m \uparrow t$ or $t_m \downarrow t$ for some $t$. But then there exists $\delta > 0$ such that if $t \le u < t_m \le t + \delta$, then $\|\phi_{t_ms} - \phi_{us}\| \le \|\phi_{t_mt} - \mathrm{id}\| + \|\phi_{ut} - \mathrm{id}\| < n^{-1}$, and similarly if $t - \delta < u < t_m < t$, then $\|\phi_{t_ms} - \phi_{us}\| \le \|\phi_{t_mu} - \mathrm{id}\| < n^{-1}$, contradicting $t_m \in A_n$.

For every $s \in ([-T, T] \cap \mathbb{Q}) \cup (\bigcup_{n \ge 1} A_n) \cup \{T\}$, there exists a countable dense set consisting of points $x$ at which $\phi_{ts}$ is continuous for all rationals $t > s$. Let $\mathcal{E}$ be the countable dense set consisting of such pairs $(s, x)$. Define $f_{sx} : [s, T] \to \mathbb{R}$ by $f_{sx}(t) = \phi_{ts}(x)$. The family $\{f_{sx} : (s, x) \in \mathcal{E}\}$ consists of noncrossing paths. Set

\[
\phi'_{ts}(x) = \inf\{ f_{ry}(t) : r \le s,\ f_{ry}(s) \ge x,\ (r, y) \in \mathcal{E} \} .
\]

It is straightforward to check that $\phi'$ satisfies conditions (a), (b) and (c) in Definition 4.5, and that $\phi'_{ts} \sim \phi_{ts}$ for all $s \le t$. Condition (d) follows by Lemma 4.13.

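Theorem 4.15. The spaces $(\mathcal{D}, d_{\mathcal{D}})$ and $(\mathcal{C}, d_{\mathcal{C}})$, defined in Section 4.4.1, are complete separable metric spaces.

Proof. To prove completeness, it is enough to prove that every Cauchy sequence in $(\mathcal{D}, d_{\mathcal{D}})$ contains a convergent subsequence. Suppose that $(\psi^n)_{n \ge 1}$ is a Cauchy sequence in $(\mathcal{D}, d_{\mathcal{D}})$. There exists a subsequence $(\phi^r)_{r \ge 1} = (\psi^{n_r})_{r \ge 1}$ such that $d_{\mathcal{D}}(\phi^n, \phi^{n+1}) < 2^{-n}$ for all $n$. Then $\Lambda$ contains a sequence $(\mu_n)_{n \ge 1}$ for which $\gamma(\mu_n) < 2^{-n}$ and

\[
\sup_{s \le t} d_{\mathcal{D}_0}\big( \phi^n_{ts}, \phi^{n+1}_{\mu_n(t)\mu_n(s)} \big)
= \sup_{s \le t} d_{\mathcal{D}_0}\big( \phi^n_{\mu_n^{-1}(t)\mu_n^{-1}(s)}, \phi^{n+1}_{ts} \big) < 2^{-n} .
\]

As in the proof of the completeness of the Skorohod space $D_{\mathbb{R}}[-T, T]$ (see Billingsley [3]),

\[
\mu_{n+m} \circ \cdots \circ \mu_n \to \lambda_n
\]

as $m \to \infty$ for some $\lambda_n \in \Lambda$, with $\gamma(\lambda_n) \le 2^{-(n-1)}$ and

\[
\sup_{s \le t} d_{\mathcal{D}_0}\big( \phi^n_{\lambda_n^{-1}(t)\lambda_n^{-1}(s)}, \phi^{n+1}_{\lambda_{n+1}^{-1}(t)\lambda_{n+1}^{-1}(s)} \big) < 2^{-n} .
\]

Hence, for all $s \le t$, $(\phi^n_{\lambda_n^{-1}(t)\lambda_n^{-1}(s)})_{n \ge 1}$ is a Cauchy sequence in $[\mathcal{D}_0]$ and, as $[\mathcal{D}_0]$ is complete, there exist $\phi_{ts} \in [\mathcal{D}_0]$ for which $\phi^n_{\lambda_n^{-1}(t)\lambda_n^{-1}(s)} \to \phi_{ts}$. Define $\phi : \{(s, t) \in [-T, T] \times [-T, T] : s \le t\} \to [\mathcal{D}_0]$ by $\phi(s, t, x) = \phi_{ts}(x)$. Conditions (a′) and (c′) in Definition 4.6 are immediate. Condition (b′) follows from Lemma 4.12, and condition (d′) follows from the property that

\[
\|f - \mathrm{id}\| = 2 d_{\mathcal{D}_0}(f, \mathrm{id})
\]

for all $f \in \mathcal{D}_0$. Hence, by Lemma 4.14, there exists some $\phi' \in \mathcal{D}$ with $\phi' \sim \phi$. Then, as for all $m > n$

\[
\gamma(\lambda_n) \vee \sup_{s \le t} d_{\mathcal{D}_0}\big( \phi^n_{\lambda_n^{-1}(t)\lambda_n^{-1}(s)}, \phi^m_{\lambda_m^{-1}(t)\lambda_m^{-1}(s)} \big) < 2^{-(n-1)},
\]

letting $m \to \infty$ gives $\phi^n \to \phi'$ as $n \to \infty$.

To show that $(\mathcal{C}, d_{\mathcal{C}})$ is complete, it is enough to check that if $(\phi^n)_{n \ge 1}$ is a sequence in $\mathcal{C}$ with $\phi^n \to \phi$ for some $\phi \in \mathcal{D}$, then $\phi$ satisfies condition (e) in Definition 4.7. This follows by an identical argument to the proof of condition (d′) above.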

To prove separability we first observe that since $[\mathcal{D}_0]$ is separable, it has a countable dense subset $A_0 = \{f_i : i \in \mathbb{N}\}$. For each $n \in \mathbb{N}$, $i_{m,k} \in \mathbb{N}$, where $0 \le m \le k \le n$, and $-T = t_0 < t_1 < \cdots < t_n = T < t_{n+1}$, define $\alpha : \{(s, t) \in [-T, T] \times [-T, T] : s \le t\} \to [\mathcal{D}_0]$ by

\[
\alpha(s, t) = \alpha_{ts} = f_{i_{m,k}}, \quad s \in [t_m, t_{m+1}),\ t \in [t_k, t_{k+1}). \tag{4.6}
\]

Let $A$ be the countable collection of all such elements $\alpha$, where $-T = t_0 < t_1 < \cdots < t_n = T < t_{n+1}$ are of the form $qT$ for some rational $q$. Note that in general such elements $\alpha$ do not lie in $\mathcal{D}$. We shall show that there is a dense subset of $\mathcal{D}$, indexed by $A$.

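Suppose that $\phi \in \mathcal{D}$ and $\varepsilon > 0$. Recall from the proof of Lemma 4.14 that the map $t \mapsto \phi_{ts}$ has only finitely many discontinuities in the interval $(-T, T)$ (for all $s \le t$) of size $> \varepsilon$, at $s_1, \dots, s_m$, say, where $-T = s_0 < s_1 < \cdots < s_m < s_{m+1} = T$. By Lemma 4.13, for each $j = 0, \dots, m$ there exists some $0 < \delta_j < s_{j+1} - s_j$ such that if $s_j \le s \le t < s_{j+1} \wedge (s + \delta_j)$, then $\|\phi_{ts} - \mathrm{id}\| < \varepsilon$. Pick some integers $M > \frac{1}{e^{\varepsilon} - 1}$ and $n > 3M(T \wedge 1)(\delta_0 \wedge \cdots \wedge \delta_m)^{-1}$ and let $t_k = -T + \frac{k}{n}T$ for $k = 0, \dots, n + 1$. Since $A_0$ is dense in $[\mathcal{D}_0]$, there exist positive integers $i_{m,k}$, $0 \le m \le k \le n$, such that

\[
d_{\mathcal{D}_0}\big( \phi_{t_kt_m}, f_{i_{m,k}} \big) < \varepsilon
\]

for all $0 \le m \le k \le n$. Define $\alpha \in A$ as in (4.6). For each $j = 1, \dots, m$ with $\frac{ns_j}{T} \notin \mathbb{N}$, there exists some $k_j$ such that $s_j \in (t_{k_j}, t_{k_j+1})$. Let

\[
t'_{k_j} = \frac{s_j - e^{-\varepsilon} t_{k_j+1}}{1 - e^{-\varepsilon}} \in (t_{k_j - M}, s_j),
\]

and

\[
t'_{k_j+1} = \frac{s_j - e^{\varepsilon} t_{k_j+1}}{1 - e^{\varepsilon}} \in (t_{k_j+1}, t_{k_j+M+1}) .
\]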

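Note that by the definition of $n$, $|k_j - k_i| > 3M$ for all $i \ne j$ and so the $t'_{k_j}$ are strictly increasing. Let $\lambda \in \Lambda$ be the strictly increasing piecewise linear function which joins $(t'_{k_j}, t'_{k_j})$ to $(t_{k_j+1}, s_j)$ to $(t'_{k_j+1}, t'_{k_j+1})$ for those $j = 1, \dots, m$ for which $\frac{ns_j}{T} \notin \mathbb{N}$, and which has gradient 1 otherwise. The two non-trivial pieces have gradients $e^{-\varepsilon}$ and $e^{\varepsilon}$ respectively, so

\[
\gamma(\lambda) \le \max_j \left( \left| \log \frac{t'_{k_j+1} - s_j}{t'_{k_j+1} - t_{k_j+1}} \right| \vee \left| \log \frac{s_j - t'_{k_j}}{t_{k_j+1} - t'_{k_j}} \right| \right) = \varepsilon .
\]

Also, for any $s \le t$, there exist $m \le k$ with $0 \le t - t_k,\ s - t_m < \delta_0 \wedge \cdots \wedge \delta_m$ such that

\[
\begin{aligned}
d_{\mathcal{D}_0}\big( \phi_{ts}, \alpha_{\lambda(t)\lambda(s)} \big)
&\le d_{\mathcal{D}_0}\big( \phi_{ts}, \phi_{t_kt_m} \big) + d_{\mathcal{D}_0}\big( \phi_{t_kt_m}, \alpha_{t_kt_m} \big) \\
&< \|\phi_{tt_k} - \mathrm{id}\| + d_{\mathcal{D}_0}\big( \phi_{t_ks}, \phi_{t_ks} \circ \phi_{st_m} \big) + \varepsilon \\
&< 3\varepsilon,
\end{aligned}
\]

where in the last inequality we have used the fact that for any $f, g \in \mathcal{D}_0$,

\[
d_{\mathcal{D}_0}(f, f \circ g) \le \|g - \mathrm{id}\| .
\]

Hence, $d_{\mathcal{D}}(\phi, \alpha) \le 3\varepsilon$. Observe that although $A \not\subset \mathcal{D}$, the metric $d_{\mathcal{D}}$ extends to general functions $\alpha : \{(s, t) \in [-T, T] \times [-T, T] : s \le t\} \to [\mathcal{D}_0]$ in an obvious way.

Now, for each $\alpha \in A$, pick

\[
\phi^\alpha \in \{ \phi \in \mathcal{D} : d_{\mathcal{D}}(\phi, \alpha) = \inf_{\phi' \in \mathcal{D}} d_{\mathcal{D}}(\phi', \alpha) \} .
\]

Note that $\phi^\alpha$ exists since $\mathcal{D}$ is closed. Given $\varepsilon > 0$ and $\phi \in \mathcal{D}$, there exists $\alpha \in A$ such that $d_{\mathcal{D}}(\phi, \alpha) < \varepsilon$. But then $d_{\mathcal{D}}(\phi, \phi^\alpha) \le d_{\mathcal{D}}(\phi, \alpha) + d_{\mathcal{D}}(\alpha, \phi^\alpha) < 2\varepsilon$ and hence, the countable set $\{\phi^\alpha : \alpha \in A\}$ is dense in $\mathcal{D}$.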

By an identical argument, it is possible to construct a countable dense subset

of C. Hence, C is separable.

Definition 4.16. Suppose that the sets $I_i$, $i = 0, \dots, n$, are intervals in $\mathbb{R}$. Let $t_0 < \cdots < t_n \in [-T, T]$, and define

\[
C^{t_0, \dots, t_n}_{I_0, \dots, I_n} = \{ \phi \in \mathcal{C} : \text{there exists } x \in I_0 \text{ such that } \phi_{t_it_0}(x\pm) \in I_i,\ i = 1, \dots, n \} .
\]

We call such sets the open cylinders of $\mathcal{C}$ if the $I_i$ are all open, and closed cylinders of $\mathcal{C}$ if the $I_i$ are all closed. It is straightforward to see that the closed cylinders can be generated by the open cylinders, using countable set operations.

Lemma 4.17. The σ-algebra FC is generated by the open cylinders of C.

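Proof. We first show that the open cylinders are open subsets of $C$. Suppose that $\phi \in C^{t_0, \dots, t_n}_{I_0, \dots, I_n}$ and that $x \in I_0$ is such that $\phi_{t_i t_0}(x\pm) \in I_i$ for $i = 1, \dots, n$. Since the $I_i$ are open intervals, there exists some $\varepsilon > 0$ such that $(\phi_{t_i t_0}(x - \varepsilon) - \varepsilon,\ \phi_{t_i t_0}(x + \varepsilon) + \varepsilon) \subset I_i$ for $i = 0, \dots, n$. Then if $\phi' \in B_C(\phi, \varepsilon)$, we have $\phi' \in C^{t_0, \dots, t_n}_{I_0, \dots, I_n}$. Hence, $C^{t_0, \dots, t_n}_{I_0, \dots, I_n}$ is open.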

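We now show that it is possible to generate the closed $\varepsilon$-balls in $C$ with cylinders. Suppose that $\phi \in C$. For $s \in [-T, T]$, $n \in \mathbb{N}$, let $t_i = t_i(s) = s + \frac{i}{n}(T - s)$. For each $x \in [0, 1)$, let $I_i = I_i(x, s) = [\phi_{t_i s}((x - \varepsilon)-) - \varepsilon,\ \phi_{t_i s}((x + \varepsilon)+) + \varepsilon]$, $i = 1, \dots, n$, and $C^{\varepsilon}_{\phi}(s, x, n) = C^{t_0, \dots, t_n}_{I_0, \dots, I_n}$. Let

\[ C^{\varepsilon}_{\phi} = \bigcap_{(s,x) \in E} \bigcap_{n \in \mathbb{N}} C^{\varepsilon}_{\phi}(s, x, n). \]

The closed ball $B_C(\phi, \varepsilon)$ is contained in $C^{\varepsilon}_{\phi}$. Conversely, if $\phi' \in C^{\varepsilon}_{\phi}$, by considering sequences $t_n \to t$, and sequences $(s_n, x_n) \in E$ with $\phi_{s s_n}(x_n) \to x$, it can be shown that for every $s \le t$, $d_{D_0}(\phi'_{ts}, \phi_{ts}) \le \varepsilon$. Hence, $C^{\varepsilon}_{\phi} \subset B_C(\phi, \varepsilon)$, as required.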

Proposition 4.18. Suppose that $(\phi^n)_{n \in \mathbb{N}}$ is a sequence in $D$ and that for each $(s, x) \in E$ the process $(\phi^n_{ts}(x))_{t \ge s}$ converges (uniformly) to some continuous $\mathbb{R}$-valued process $(f_{sx}(t))_{t \ge s}$. If the flow map $\phi : \{(s, t) \in [-T, T] \times [-T, T] : s \le t\} \times \mathbb{R} \to \mathbb{R}$ defined by

\[ \phi(s, t, x + m) = \phi_{ts}(x + m) = m + \inf\{ f_{s_j x_j}(t) : s_j \le s,\ f_{s_j x_j}(s) > x \} \]

for all $s \le t$, $x \in [0, 1)$, $m \in \mathbb{Z}$, is a class representative of a $C$-valued random variable, then $d_D(\phi^n, \phi) \to 0$.

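Proof. Given $\varepsilon > 0$, pick some $m \in \mathbb{N}$ with $m > 5/\varepsilon$. There exist $-T = a_0 < \cdots < a_k = T$ such that $\|\phi_{s a_i} - \mathrm{id}\| < m^{-1}$ for all $s \in [a_i, a_{i+1})$, $i = 0, \dots, k - 1$. Let $y_j = j/m$ for $j = 0, \dots, m$. For each $0 \le i \le k$ and $0 \le j \le m$, there exists some $(s_{i,j}, x_{i,j}) \in E$ for which

\[ f_{s_{i,j} x_{i,j}}(t) - m^{-1} < \phi_{t a_i}(y_j) \le f_{s_{i,j} x_{i,j}}(t), \]

for all $t > a_i$. Pick $N \in \mathbb{N}$ sufficiently large that

\[ \sup_{t > s_{i,j}} |\phi^n_{t s_{i,j}}(x_{i,j}) - f_{s_{i,j} x_{i,j}}(t)| < m^{-1}. \]

Suppose $s \le t$ with $s \in [a_i, a_{i+1})$. For any $x \in [0, 1)$, there exists some $j$ such that $x \in [y_j, y_{j+1})$. Then if $n > N$

\[ x > \phi_{s a_i}(y_{j-3}) + 2m^{-1} > f_{s_{i,j-3} x_{i,j-3}}(s) + m^{-1} > \phi^n_{s s_{i,j-3}}(x_{i,j-3}) \]

and hence, $\phi^n_{s s_{i,j-3}}(x_{i,j-3}) < x$. Similarly $\phi^n_{s s_{i,j+4}}(x_{i,j+4}) > x$ and so

\[ \phi^n_{t s_{i,j-3}}(x_{i,j-3}) \le \phi^n_{ts}(x-) \le \phi^n_{ts}(x+) \le \phi^n_{t s_{i,j+4}}(x_{i,j+4}). \]

But then

\[ \begin{aligned} \phi_{ts}(x - 5m^{-1}) - 5m^{-1} &< \phi_{t a_i}(y_{j-3}) - 2m^{-1} \\ &< f_{s_{i,j-3} x_{i,j-3}}(t) - m^{-1} \\ &< \phi^n_{t s_{i,j-3}}(x_{i,j-3}) \\ &\le \phi^n_{ts}(x-), \end{aligned} \]

and similarly

\[ \phi^n_{ts}(x+) < \phi_{ts}(x + 5m^{-1}) + 5m^{-1}. \]

Therefore $d_{D_0}(\phi^n_{ts}, \phi_{ts}) < 5m^{-1}$ and so $d_D(\phi, \phi^n) < \varepsilon$, as required.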

4.6 An equivalent space for the Brownian web

The paper [10] of Fontes, Isopi, Newman and Ravishankar, the original work characterizing the Brownian web, constructs it as a random element of the space H


of compact collections of R-valued paths with specified starting points. In this

chapter, we have chosen to formulate the Brownian web as an element of the

space C of flows defined in Section 4.4.1. We note that H is a considerably larger

space than is required to support the Brownian web and believe that using a space

whose structure inherently contains the noncrossing and space filling restrictions

imposed by the Brownian web is more natural and simplifies characterization and

convergence results.

In [10], the authors make the comment that there is more than one natural

H-valued random variable that satisfies the following two conditions.

(i) From any deterministic point $(x, t)$ in space-time, there is almost surely a unique path $W_{x,t}$ starting from $(x, t)$.

(ii) For any deterministic $n$ and $(x_1, t_1), \dots, (x_n, t_n)$, the joint distribution of $W_{x_1,t_1}, \dots, W_{x_n,t_n}$ is that of coalescing Brownian motions (with unit diffusion constant).
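To make condition (ii) concrete, here is a minimal simulation sketch of coalescing Brownian motions. It rests on several simplifying assumptions that are not part of the source: the paths live on the line rather than the circle, they all start at time 0, time is discretised with an Euler step `dt`, and two neighbouring paths are merged at the first grid time at which their order is violated (so coalescence is only detected at grid times). The function name `coalescing_paths` is hypothetical.

```python
import math
import random

def coalescing_paths(starts, n_steps, dt, seed=0):
    """Euler sketch of coalescing Brownian motions started at time 0.

    `starts` must be strictly increasing. Returns one path per starting
    point; each path is a list of n_steps + 1 positions.
    """
    rng = random.Random(seed)
    pos = list(starts)
    leader = list(range(len(pos)))   # leader[i] == i while path i is still free
    paths = [[x] for x in pos]
    for _ in range(n_steps):
        # Free paths receive independent Gaussian increments (unit diffusion).
        for i in range(len(pos)):
            if leader[i] == i:
                pos[i] += math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Sweep left to right: merged paths copy their left neighbour, and a
        # free path that has met or crossed its left neighbour merges into it.
        for i in range(1, len(pos)):
            if leader[i] != i:
                pos[i] = pos[leader[i]]
            elif pos[i] <= pos[i - 1]:
                leader[i] = i - 1
                pos[i] = pos[i - 1]
        for i, x in enumerate(pos):
            paths[i].append(x)
    return paths

# Illustrative run: four starting points, 2000 steps of size 0.001.
paths = coalescing_paths([0.0, 0.1, 0.2, 0.5], n_steps=2000, dt=0.001, seed=1)
print("distinct positions at the final time:", len({p[-1] for p in paths}))
```

By construction the sampled paths never cross and, once two of them have merged, they move together for all later grid times, which is the discrete counterpart of the noncrossing and coalescing behaviour described above.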

The standard Brownian web is the minimal collection of paths in H that satisfies

(i) and (ii). In [11], Fontes and Newman describe the forward full Brownian web,

which is the maximal collection of (noncrossing) paths that satisfies (i) and (ii).

In [11], the authors characterize a third object, the full Brownian web, which is a

random variable on the space HF of compact collections of paths from R → R, and

show that there is a one-to-one correspondence between the full Brownian web and

the forward full Brownian web. It can be similarly shown that there is a one-to-one

correspondence between the standard Brownian web and the full Brownian web.

In this section we shall show that C is in some sense isomorphic to a subset of

H′ ⊂ HF and that the full Brownian web almost surely lives on H′. Therefore, the

Brownian web that we construct on C is equivalent to the full Brownian web and,

hence, is also equivalent to both the standard Brownian web and the forward full

Brownian web. By working in the space C, we have the advantage of the existence

of a unique natural random variable that satisfies conditions (i) and (ii) above.

In Section 4.6.1 we describe the space HF on which the full Brownian web is

constructed and give its characterization. In Section 4.6.2 we discuss the sense in

which C is isomorphic to H′.


4.6.1 Compact sets of functions

In Fontes and Newman [11], $H^F$ is constructed from space-time points in $(\mathbb{R}^2, \rho)$,

the compactification of $\mathbb{R}^2$ under a specified metric $\rho$. Instead, we shall take our

space points on the circle and our time points from [−T, T ] for some fixed T .

This avoids any technical issues resulting from compactification, but is essentially

equivalent. We give an outline of the results in [11] below.

Construct the two spaces $(\Pi^F, d_F)$ and $(H^F, d_{H^F})$ as follows. Let $\Pi^F$ denote the set of continuous functions $f : [-T, T] \to \mathbb{R}$ and let

\[ d_F(f_1, f_2) = \sup_{-T \le t \le T} |f_1(t) - f_2(t)|. \]

The space $(\Pi^F, d_F)$ is complete and separable.

Let $H^F$ denote the set of all subsets $K$ of $(\Pi^F, d_F)$, for which $f \in K$ if and only if $f + m \in K$ for all $m \in \mathbb{Z}$ and for which $K|_{[0,1]} = \{f \in K : f(-T) \in [0, 1]\}$ is compact. Define the induced Hausdorff metric $d_{H^F}$ by

\[ d_{H^F}(K_1, K_2) = \sup_{g_1 \in K_1} \inf_{g_2 \in K_2} d_F(g_1, g_2) \vee \sup_{g_2 \in K_2} \inf_{g_1 \in K_1} d_F(g_1, g_2). \]

The space $(H^F, d_{H^F})$ is also complete and separable.
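The two metrics above can be illustrated on discretised data. The sketch below is a finite analogue only: each path is represented by its values on a common time grid, each collection is a finite nonempty list of such paths, and the periodicity requirement ($f \in K$ if and only if $f + m \in K$) is ignored. The names `sup_dist` and `hausdorff_dist` are hypothetical.

```python
def sup_dist(f, g):
    """Discrete analogue of d_F: largest pointwise gap between two sampled paths."""
    return max(abs(a - b) for a, b in zip(f, g))

def hausdorff_dist(K1, K2):
    """Discrete analogue of the induced Hausdorff metric for finite,
    nonempty families of paths sampled on a common time grid."""
    one_sided_12 = max(min(sup_dist(g1, g2) for g2 in K2) for g1 in K1)
    one_sided_21 = max(min(sup_dist(g1, g2) for g1 in K1) for g2 in K2)
    return max(one_sided_12, one_sided_21)

# Illustrative data: paths sampled at three common time points.
K1 = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
K2 = [[0.0, 0.1, 0.0], [1.0, 1.0, 1.5], [0.5, 0.5, 0.5]]
print(hausdorff_dist(K1, K2))  # 0.5
```

In this example the path $(1, 1, 1)$ in `K1` is at sup-distance 0.5 from its nearest neighbour in `K2`, and no path in either family is further than 0.5 from the other family, so both one-sided terms equal 0.5.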

Definition 4.19. Let H′ ⊂ HF consist of those K ∈ HF which satisfy the following two conditions.

(a) The paths of K are noncrossing (although they may touch; in particular they

may coalesce or bifurcate).

(b) For any point (x, t) ∈ R × [−T, T ], there exists some f ∈ K with f(t) = x.

Fontes and Newman [11] give the following characterization of the full Brownian

web.

Definition 4.20. A full Brownian web $W^F$ is any $(H^F, d_{H^F})$-valued random variable whose distribution has the following properties.

(a) Almost surely the paths of WF are noncrossing (although they may touch,

including coalescing and bifurcating).


(b1) From any deterministic point $(x, t) \in \mathbb{R} \times [-T, T]$, there is almost surely a unique path $W^F_{x,t}$ passing through $x$ at time $t$.

(b2) For any deterministic $n$, $(x_1, t_1), \dots, (x_n, t_n)$, the joint distribution of the semipaths $W^F_{x_j,t_j}(t)$, $t > t_j$, $j = 1, \dots, n$ is that of a flow of coalescing Brownian motions on the circle (with unit diffusion constant).

They show (Theorem 3.2 of [11]) that any two Brownian webs have the same

distribution.

We would like to show that (H′, dHF ) is isomorphic to (C, dC). However, we

need the following additional regularity condition.

Definition 4.21. Let H1 ⊂ H′ consist of those K ∈ HF which satisfy the following

additional condition.

(c) For each point $(x, s) \in \mathbb{Q}^2$ with $s \in [-T, T]$, there is a unique path $f_{x,s}$ passing through $x$ at time $s$.

Let C1 ⊂ C consist of those flows φ ∈ C which satisfy the following condition.

(c′) For each $(x, s) \in \mathbb{Q}^2$ with $s \in [-T, T]$, $\phi_{ts}$ is continuous at $x$ for all $t > s$.

We shall show that (H1, dHF ) is isomorphic to (C1, dC). In both settings the

full Brownian web (respectively Brownian web) is almost surely H1 (respectively

C1) valued.

4.6.2 The isomorphism between the spaces

We define an isometry $\theta : C_1 \to H_1$ as follows.

For each $t_0 \in [-T, T]$, let $C[t_0]$ be the set of all continuous functions $f : [t_0, T] \to \mathbb{R}$. Let

\[ \Pi = \bigcup_{t_0 \in [-T, T]} C[t_0]. \]

Let $H$ be the set containing all subsets of $\Pi$ with noncrossing paths. For each $\phi \in C_1$, define $\theta_0(\phi) \in H$ by

\[ \theta_0(\phi) = \bigcup_{t_0 \in [-T, T] \cap \mathbb{Q}} \{ f \in C[t_0] : f = \phi(t_0, \cdot, x) \text{ for some } x \in \mathbb{Q} \}. \]

We say that a set $U \in H$ is maximal if, for any $f \in \Pi$, $U \cup \{f\} \in H$ implies $f \in U$. There is a unique maximal set, which we denote $\bar{\theta}(\phi)$, that contains $\theta_0(\phi)$. Define $\theta(\phi) \in H_1$ by $\theta(\phi) = \bar{\theta}(\phi) \cap C[-T]$.

Proposition 4.22. θ(φ) ∈ H1.

Proof. By definition, $\theta(\phi)$ consists of continuous noncrossing functions and, since $\phi_{ts} \in D_0$, by (4.1), if $f \in \theta(\phi)$, then $f + m \in \theta(\phi)$ for all $m \in \mathbb{Z}$. It is straightforward to check that condition (c′) in Definition 4.21 implies condition (c). The property that for every $(x, t) \in \mathbb{R} \times [-T, T]$ there exists some $f \in \theta(\phi)$ with $f(t) = x$ is an immediate consequence of maximality. It remains to check that $\theta(\phi)|_{[0,1]} = \{f \in \theta(\phi) : f(-T) \in [0, 1]\}$ is compact.

Suppose that $f_1, f_2, \ldots \in \theta(\phi)|_{[0,1]}$. Since paths in $\theta(\phi)$ are noncrossing, there exists a subsequence $n_r$ such that $f_{n_r}(t)$ is monotone for all $t \in [-T, T]$. Furthermore, for each $t$, $f_{n_r}(t)$ lies in an interval of length 1 for all $n_r$ and so there exists some $f : [-T, T] \to \mathbb{R}$ such that $f_{n_r} \to f$ pointwise.

Given $\varepsilon > 0$, there exist $-T = a_0 < \cdots < a_M = T$ such that $\|\phi_{s a_k} - \mathrm{id}\| < \frac{\varepsilon}{4}$ for all $a_k \le s \le a_{k+1}$, $k = 0, \dots, M - 1$. Pick $N$ sufficiently large that if $n_r > N$, then $|f_{n_r}(a_k) - f(a_k)| < \frac{\varepsilon}{4}$ for $k = 1, \dots, M - 1$. But then, for $a_k \le s < a_{k+1}$,

\[ |f_{n_r}(s) - f(s)| \le \left| \phi_{s a_k}\!\left(f(a_k) + \tfrac{\varepsilon}{4}\right) - \phi_{s a_k}\!\left(f(a_k) - \tfrac{\varepsilon}{4}\right) \right| < \varepsilon. \]

Hence, $f_{n_r} \to f$ uniformly. Therefore, $f$ is continuous and $\theta(\phi) \cup \{f\}$ is noncrossing and so $f \in \theta(\phi)$ with $f(-T) \in [0, 1]$, proving compactness.

Proposition 4.23. The function θ : C1 → H1 is bijective.

Proof. To see that $\theta$ is injective, suppose that $\theta(\phi) = \theta(\phi')$ for some $\phi, \phi' \in C_1$. Then $\theta(\phi) \cup \theta(\phi') \in H$. Now for each $s \in [-T, T]$, $x \in \mathbb{R}$, there exists $f_{sx} : [s, T] \to \mathbb{R}$ such that $f_{sx} \in \theta(\phi)$ and $f'_{sx} : [s, T] \to \mathbb{R}$ such that $f'_{sx} \in \theta(\phi')$, with $f_{sx}(t) = \phi_{ts}(x\pm)$ and $f'_{sx}(t) = \phi'_{ts}(x\pm)$ for all $t > s$. By the noncrossing property, for each $n \in \mathbb{N}$, $f'_{sx}(t) \in [f_{s,x-\frac{1}{n}}(t),\ f_{s,x+\frac{1}{n}}(t)]$. Letting $n \to \infty$ gives $\phi'_{ts}(x) = \phi_{ts}(x)$ at every point $(s, t, x)$ for which $\phi_{ts}$ is continuous at $x$. Hence, $\phi = \phi'$.

To see that $\theta$ is surjective, for $K \in H_1$, let

\[ \phi_{ts}(x) = \inf\{ f_{ur}(t) : u \le s,\ r \in \mathbb{Q},\ f_{ur}(s) > x \}, \]

where, for each $(u, r) \in \mathbb{Q}^2$ with $u \in [-T, T]$, $f_{ur}$ is the unique element $f$ of $K$ with $f(u) = r$. We first show that $\phi \in C_1$. Since $K$ consists of noncrossing functions and $f \in K$ implies $f + m \in K$ for all $m \in \mathbb{Z}$, $\phi_{ts} \in D_0$. It is straightforward to check that condition (c) in Definition 4.21 implies condition (c′).
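The infimum formula for recovering a flow from a collection of paths can be sketched on discretised data. The following is a finite illustration only, under assumptions not in the source: the collection is a finite list of noncrossing paths sampled on a common time grid (indices stand in for times), every path is defined on the whole grid, and periodicity is ignored. The name `flow_from_paths` is hypothetical.

```python
def flow_from_paths(paths, s_idx, t_idx, x):
    """Finite analogue of phi_{ts}(x) = inf{ f(t) : f(s) > x }.

    `paths` is a list of sampled paths on a common time grid; `s_idx` and
    `t_idx` are grid indices with s_idx <= t_idx. Returns None when no
    sampled path lies strictly above x at time s.
    """
    above = [f[t_idx] for f in paths if f[s_idx] > x]
    return min(above) if above else None

# Illustrative noncrossing paths; the first two coalesce at the last grid time.
paths = [
    [0.0, 0.1, 0.2],
    [0.5, 0.3, 0.2],
    [0.9, 0.8, 0.7],
]
print(flow_from_paths(paths, 0, 2, 0.2))   # 0.2
print(flow_from_paths(paths, 0, 2, 0.5))   # 0.7
```

With a finite family the infimum is attained, so the strict inequality $f(s) > x$ can discard the very path that realises $\phi_{sr}(x)$; the composition identity established in the text should therefore not be expected to hold exactly for this sketch. It relies on the family of paths through rational points being dense.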

Now suppose that $r \le s \le t$. For every $x \in \mathbb{R}$ there exists a sequence of functions $f_n \in K$ with $f_n(r) \downarrow x$ and $f_n(s) \downarrow \phi_{sr}(x)$ as $n \to \infty$. Since $K|_{[y,y+1]} = \{f \in K : f(-T) \in [y, y + 1]\}$ is compact for all $y$ (and the functions $f_n$ are monotone and eventually lie in $K|_{[y,y+1]}$ for some $y$), there exists some $f \in K$ such that $f_n \to f$. Then $\phi_{tr}(x) = f(t)$ for all $t > r$ and $\phi_{ts}(\phi_{sr}(x)) = f(t)$ for all $t > s$. Hence, $\phi_{ts} \circ \phi_{sr} = \phi_{tr}$.

Since $K|_{[0,1]}$ is compact, given $\varepsilon > 0$, there exist $f_1, \dots, f_N \in K$ such that $\|f_i - f_{i+1}\| < \frac{\varepsilon}{2}$ for $i = 1, \dots, N$ (where we take $f_{N+1} = f_1 + 1$). Since the $f_i$ are continuous, there exists $\delta > 0$ such that if $|s - t| < \delta$, then $|f_i(s) - f_i(t)| < \frac{\varepsilon}{2}$ for $i = 1, \dots, N$. Then for any $s \le t < s + \delta$ and any $x \in \mathbb{R}$ at which $\phi_{ts}$ is continuous, there exists some $i, m$ for which $x \in [f_i(s) + m,\ f_{i+1}(s) + m)$. But then

\[ \begin{aligned} |\phi_{ts}(x) - x| &\le |f_i(t) - f_{i+1}(s)| \vee |f_{i+1}(t) - f_i(s)| \\ &\le |f_i(t) - f_i(s)| \vee |f_{i+1}(t) - f_{i+1}(s)| + \|f_i - f_{i+1}\| \\ &< \varepsilon. \end{aligned} \]

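Finally, observe that by the argument used to show that $\phi_{ts} \circ \phi_{sr} = \phi_{tr}$,

\[ \theta(\phi) \subset \bigcup_{s \in [-T, T]} \{ f \in C[s] : \text{there exists } g \in K \text{ such that } f(t) = g(t) \text{ for all } t > s \} \]

and hence $\theta(\phi) = K$, as required.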

Proposition 4.24. The function θ : C1 → H1 is an isometry.


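Proof. Since for all $K \in H_1$, $K = \bigcup_{m \in \mathbb{Z}} K|_{[m,m+1]}$ and $K|_{[m,m+1]}$ is compact for all $m$, given $K_1, K_2 \in H_1$, there exist $f \in K_1$, $g \in K_2$ such that $\|f - g\| = d_{H^F}(K_1, K_2)$. Without loss of generality, suppose that $g$ is chosen so that $\|f - g\| = \inf_{h \in K_2} \|f - h\|$. Because $K_2$ satisfies property (b) of Definition 4.19, $g$ can be chosen so that there exist $s, t \in [-T, T]$ for which $f(t) - g(t) = \|f - g\| = -(f(s) - g(s))$. In the case where $K_i = \theta(\phi^i)$ for some $\phi^1, \phi^2 \in C_1$, if $f \in K_1$ with $f(s) = x$ and $g \in K_2$ with $g(s) = y$, then

\[ f(t) \in [\phi^1_{ts}(x-),\ \phi^1_{ts}(x+)], \]

\[ g(t) \in [\phi^2_{ts}(y-),\ \phi^2_{ts}(y+)], \]

and so

\[ \phi^2_{ts}(y-) - \phi^1_{ts}(x+) \le x - y \le \phi^2_{ts}(y+) - \phi^1_{ts}(x-). \]

Hence there exists some

\[ u \in \left[ \tfrac{1}{2}(x + \phi^1_{ts}(x-)),\ \tfrac{1}{2}(x + \phi^1_{ts}(x+)) \right] \cap \left[ \tfrac{1}{2}(y + \phi^2_{ts}(y-)),\ \tfrac{1}{2}(y + \phi^2_{ts}(y+)) \right]. \]

Therefore,

\[ \begin{aligned} d_{H^F}(\theta(\phi^1), \theta(\phi^2)) &= |x - y| \\ &= |\Phi(\phi^1_{ts})(u) - \Phi(\phi^2_{ts})(u)| \\ &\le \|\Phi(\phi^1_{ts}) - \Phi(\phi^2_{ts})\| \\ &\le d_C(\phi^1, \phi^2). \end{aligned} \]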

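Conversely, suppose that $s, t \in [-T, T]$ and $u \in [0, 1)$ are such that

\[ d_C(\phi^1, \phi^2) = |\Phi(\phi^1_{ts})(u) - \Phi(\phi^2_{ts})(u)|. \]

There exist $x, y \in [0, 1)$ such that

\[ u \in \left[ \tfrac{1}{2}(x + \phi^1_{ts}(x-)),\ \tfrac{1}{2}(x + \phi^1_{ts}(x+)) \right] \cap \left[ \tfrac{1}{2}(y + \phi^2_{ts}(y-)),\ \tfrac{1}{2}(y + \phi^2_{ts}(y+)) \right]. \]

Without loss of generality suppose that $x > y$. There exists $f \in \theta(\phi^1)$ such that $f(r) = \phi^1_{rs}(x-)$ for all $r > s$. Since $x - y \le \phi^2_{ts}(y+) - \phi^1_{ts}(x-)$, if $g \in \theta(\phi^2)$, either $|g(s) - f(s)| \ge |x - y|$ or $|g(t) - f(t)| \ge |\phi^2_{ts}(y+) - \phi^1_{ts}(x-)| \ge |x - y|$. Hence,

\[ d_{H^F}(\theta(\phi^1), \theta(\phi^2)) \ge |x - y| = |\Phi(\phi^1_{ts})(u) - \Phi(\phi^2_{ts})(u)| = d_C(\phi^1, \phi^2), \]

and so, combining this with the reverse inequality,

\[ d_{H^F}(\theta(\phi^1), \theta(\phi^2)) = d_C(\phi^1, \phi^2), \]

as required.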

Bibliography

[1] Arratia, R. A. (1979). Coalescing Brownian motions on the line. Ph.D.

thesis, University of Wisconsin, Madison.

[2] Arratia, R. A. (1981). Coalescing Brownian motions and the voter model on

Z. Unpublished partial manuscript. Available from [email protected].

[3] Billingsley, P. (1999). Convergence of probability measures. Wiley Series in

Probability and Statistics: Probability and Statistics. John Wiley & Sons Inc.,

New York.

[4] Brown, D. and Rothery, P. (1993). Models in Biology: Mathematics,

Statistics and Computing. Wiley, Chichester.

[5] Brown, P. N., Byrne, G. D., and Hindmarsh, A. C. (1989). VODE:

a variable-coefficient ODE solver. SIAM J. Sci. Statist. Comput. 10, 5, 1038–

1051.

[6] Darling, R. W. R. and Norris, J. R. (2005). Structure of large random

hypergraphs. Ann. Appl. Probab. 15, 1A, 125–152.

[7] Durrett, R. (1999). Stochastic spatial models. SIAM Rev. 41, 4, 677–718

(electronic).

[8] Eden, M. (1961). A two-dimensional growth process. In Proc. 4th Berkeley

Sympos. Math. Statist. and Prob., Vol. IV. Univ. California Press, Berkeley,

Calif., 223–239.

[9] Ethier, S. N. and Kurtz, T. G. (1986). Markov processes. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. John Wiley & Sons Inc., New York.

[10] Fontes, L. R. G., Isopi, M., Newman, C. M., and Ravishankar,

K. (2004). The Brownian web: characterization and convergence. Ann.

Probab. 32, 4, 2857–2883.

[11] Fontes, L. R. G. and Newman, C. M. (2005). The full Brownian web as

scaling limit of stochastic flows. arXiv:math.PR/0511029.

[12] Forsythe, G. E. (1959). Reprint of a note on rounding-off errors. SIAM

Rev. 1, 66–67.

[13] Goldstine, H. H. and von Neumann, J. (1951). Numerical inverting of

matrices of high order. II. Proc. Amer. Math. Soc. 2, 188–202.

[14] Hairer, E. and Wanner, G. (1996). Solving ordinary differential equations. II. Springer Series in Computational Mathematics, Vol. 14. Springer-Verlag, Berlin.

[15] Harris, T. E. (1984). Coalescing and noncoalescing stochastic flows in R1.

Stochastic Process. Appl. 17, 2, 187–210.

[16] Hartman, P. (1960). On local homeomorphisms of Euclidean spaces. Bol.

Soc. Mat. Mexicana (2) 5, 220–241.

[17] Hastings, M. B. and Levitov, L. S. (1998). Laplacian growth as one-

dimensional turbulence. Physica D 116 (1-2), 244.

[18] Henrici, P. (1962). Discrete variable methods in ordinary differential equations. John Wiley & Sons Inc., New York.

[19] Henrici, P. (1963). Error propagation for difference method. John Wiley

and Sons, Inc., New York.

[20] Henrici, P. (1964). Elements of numerical analysis. John Wiley & Sons

Inc., New York.

[21] Higham, N. J. (1996). Accuracy and stability of numerical algorithms. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA.

[22] Hull, T. E. and Swenson, J. R. (1966). Tests of probabilistic models for

the propagation of roundoff errors. Comm. ACM 9, 108–113.

[23] IEEE754 (1985). IEEE standard for binary floating-point arithmetic,

ANSI/IEEE Standard 754-1985. Reprinted in SIGPLAN Notices, 22(2):9-25,

1987.

[24] Kallenberg, O. (2002). Foundations of modern probability. Probability

and its Applications (New York). Springer-Verlag, New York.

[25] Kingman, J. F. C. (1999). Martingales in the OK Corral. Bull. London

Math. Soc. 31, 5, 601–606.

[26] Kingman, J. F. C. and Volkov, S. E. (2003). Solution to the OK Corral

model via decoupling of Friedman’s urn. J. Theoret. Probab. 16, 1, 267–276.

[27] Lawler, G. F. (2005). Conformally invariant processes in the plane. Mathematical Surveys and Monographs, Vol. 114. American Mathematical Society, Providence, RI.

[28] Mosbach, S. and Turner, A. G. (2005). A quantitative investigation

into the accumulation of rounding errors in numerical ODE solution. Technical

Report 36, c4e-Preprint Series, Cambridge.

[29] Piterbarg, V. V. (1998). Expansions and contractions of isotropic stochastic flows of homeomorphisms. Ann. Probab. 26, 2, 479–499.

[30] Toth, B. and Werner, W. (1998). The true self-repelling motion. Probab.

Theory Related Fields 111, 3, 375–452.

[31] Tsirelson, B. (2004). Nonclassical stochastic flows and continuous products.

Probab. Surv. 1, 173–298 (electronic).

[32] Turner, A. G. (2006). Convergence of Markov processes near saddle fixed

points. To appear in Ann. Probab.

[33] Wilkinson, J. H. (1994). Rounding errors in algebraic processes. Dover

Publications Inc., New York.

[34] Williams, D. and McIlroy, P. (1998). The OK Corral and the power

of the law (a curious Poisson-kernel formula for a parabolic equation). Bull.

London Math. Soc. 30, 2, 166–170.

[35] Witten, T. A. and Sander, L. M. (1981). Diffusion-limited aggregation,

a kinetic critical phenomenon. Phys. Rev. Lett. 47, 19, 1400–1403.
