Lecture Notes — Stochastic Processes
Manuel Cabral Morais
Department of Mathematics
Instituto Superior Tecnico
Lisbon/Bern, February–May 2014
Preliminary note
I am convinced that the students of Introduction to Stochastic Processes will benefit
from these lecture notes, which were written assuming that the structure of the classes is
based on the philosophy of learning by doing. Thus, the subjects tend to be motivated; the
definitions are introduced; and the results are stated (occasionally proved) and illustrated by
examples and exercises worked through together with the students.
Some more facts about these lecture notes. The main sources of inspiration are
undoubtedly Ross (1983, 1989, 2003) and Kulkarni (1995). However, I decided to
complement the lecture notes with material from a few other sources.
Please also note that the definitions, results, etc. are preceded by headers, and the
source(s) of inspiration I used is (are) added in most cases; I strongly believe that the
presentation benefits from these headers and that the identification of sources is not only
fair but absolutely essential. The examples and the detailed solutions of some exercises in
these lecture notes are presented in small sections with headers, with the purpose of
suggesting how students should structure the detailed solutions of the exercises in
Introduction to Stochastic Processes.
I am fully responsible for the typos, imprecisions or errors in these lecture notes —
if you detect any, do let me know by sending an e-mail to [email protected].
I would like to express my sincere thanks to Prof. Antonio Pacheco, for giving me
the opportunity to teach this course and for some invaluable material used during the
preparation of these lecture notes.
Enjoy them and I wish you a splendid semester...
Manuel Cabral Morais
Bern, February 12, 2014
Contents
Preliminary note i
0. Introduction to stochastic processes 1
0.1 Stochastic processes and their characterization . . . . . . . . . . . . . . . . 2
0.2 A pivotal characteristic of some stochastic processes . . . . . . . . . . . . . 8
0.3 A few examples of stochastic processes . . . . . . . . . . . . . . . . . . . . 10
1 Poisson Processes 17
1.1 Properties of the exponential distribution . . . . . . . . . . . . . . . . . . . 18
1.2 Poisson process: definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.3 Event times in Poisson processes . . . . . . . . . . . . . . . . . . . . . . . . 39
1.4 Merging and splitting Poisson processes . . . . . . . . . . . . . . . . . . . . 43
1.5 Non-homogeneous Poisson process . . . . . . . . . . . . . . . . . . . . . . . 51
1.6 Conditional Poisson process . . . . . . . . . . . . . . . . . . . . . . . . . . 58
1.7 Compound Poisson process . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2 Renewal Processes 70
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
2.2 Properties of the number of renewals . . . . . . . . . . . . . . . . . . . . . 72
2.3 Renewal function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
2.4 Renewal-type equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
2.5 Key renewal theorem and some other limit theorems . . . . . . . . . . . . 84
2.6 Recurrence times; the inspection paradox . . . . . . . . . . . . . . . . . . . 95
2.7 Renewal reward processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
2.8 Alternating renewal processes . . . . . . . . . . . . . . . . . . . . . . . . . 107
2.9 Delayed renewal processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
2.10 Regenerative processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
3 Discrete time Markov chains 119
3.1 Definitions and examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.2 Chapman-Kolmogorov equations; marginal and joint distributions . . . . . 125
3.3 Classification of states; recurrent and transient states . . . . . . . . . . . . 130
3.4 Limit behavior of irreducible Markov chains . . . . . . . . . . . . . . . . . 141
3.5 Limit behavior of reducible Markov chains . . . . . . . . . . . . . . . . . . 149
3.6 Markov chains with costs/rewards . . . . . . . . . . . . . . . . . . . . . . . 153
3.7 Reversible Markov chains . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
3.8 Branching processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
3.9 First passage times; absorption probabilities . . . . . . . . . . . . . . . . . 167
4 Continuous time Markov chains 175
4.1 Definitions and examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
4.2 Properties of the transition matrix; Chapman-Kolmogorov equations . . . . 178
4.3 Computing the transition matrix: finite state space . . . . . . . . . . . . . 184
4.4 Computing the transition matrix: infinite state space . . . . . . . . . . . . 185
4.5 Birth and death processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
4.6 Classification of states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
4.7 Limit behavior of CTMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
4.8 Birth and death queueing systems in equilibrium . . . . . . . . . . . . . . . 208
4.8.1 Performance measures . . . . . . . . . . . . . . . . . . . . . . . . . 209
4.8.2 M/M/1, the classical queueing system . . . . . . . . . . . . . . . . 212
4.8.3 The M/M/∞ queueing system . . . . . . . . . . . . . . . . . . . . . 218
4.8.4 M/M/m, the m server case . . . . . . . . . . . . . . . . . . . . . . . 221
4.8.5 M/M/m/m, the m–server loss system . . . . . . . . . . . . . . . . . 226
Bibliography 229
0. Introduction to stochastic processes
In Probability Theory, a stochastic process (or random process) is a collection of (indexed)
random variables (r.v.). These collections of r.v. are frequently used to represent the
evolution of a random quantity (X) over time (t)
(http://en.wikipedia.org/wiki/Stochastic_process). This random
quantity could be, for example:
• a stock market index, such as the Dow Jones Industrial Average
(DJIA)1 at the end of a daily trading session at the New York
Stock Exchange (NYSE).
A stochastic process is the random analogue of a deterministic
process: even if the initial condition is known, there are several
(often infinitely many) directions in which the process may evolve
(http://en.wikipedia.org/wiki/Stochastic_process).
Remark 0.1 — Practical importance of stochastic processes (Shumway and
Stoffer, 2006, p. 1)
The relevance of stochastic processes in practice can be described by mentioning a brief
list of some of the important areas in which stochastic processes arise:
1. Economics — we frequently deal with daily stock market quotations or monthly
unemployment figures;
1The DJIA is an index that shows how 30 large publicly owned companies based in the USA have
traded in the stock market (http://en.wikipedia.org/wiki/Dow_Jones_Industrial_Average).
2. Social sciences — population birth rates and school enrollments series have been
followed for many centuries in several countries;
3. Epidemiology — numbers of influenza cases are often monitored over long periods
of time;
4. Medicine — blood pressure measurements are traced over time to evaluate the
impact of pharmaceutical drugs used in treating hypertension. •
Quiz 0.2 — Stochastic processes
Try to think of more stochastic processes in the world similar to 1–4 in Remark 0.1. •
0.1 Stochastic processes and their characterization
As noted by Brockwell and Davis (1991, p. 8), to allow for the unpredictable nature of
future observations, we have to suppose that each observation at time t is a realization of
a r.v. X(t) (or Xt). As a result, the sequence of observations taken sequentially in time
is a realization of a collection of r.v., known as a stochastic process.
Definition 0.3 — Stochastic process (Karr, 1993, p. 45; Brockwell and Davis, 1991,
p. 8)
A stochastic process (with index set T ) is a collection X(t) : t ∈ T of r.v. defined on a
common probability space (Ω,F ,P). •
Remark 0.4 — State of the process; index set; discrete- and continuous-time
processes
• The index t is often interpreted as time and, thus, X(t) is referred to as the state
of the process at time t (Ross, 2003, p. 83).
• T is called index set (or parameter set) (Ross, 1983, p. 26; Kulkarni, 1995, p. 2).
• If T is a countable set (e.g., N0 or Z) then we are dealing with a discrete-time process
(Ross, 1983, p. 26).
• If T is a continuum (e.g., R+0 or R) then X(t) : t ∈ T is said to be a
continuous-time process (Ross, 1983, p. 26).2 •
Remark 0.5 — Sample path, state space; discrete and continuous value
processes
• For each t ∈ T , X(t) is a r.v. (Ross, 2003, p. 83).
• Any realization of the stochastic process X(t) : t ∈ T is called a sample path
(Ross, 1983, p. 26).
• The set of all possible values that the r.v. X(t) can take at all t, say S, is said to
be the state space of the stochastic process X(t) : t ∈ T (Ross, 2003, p. 84).
• If S is a countable set (e.g., N0 or Z), X(t) : t ∈ T is a discrete value process
(Yates and Goodman, 1999, p. 206).
• If S is a continuum (e.g., R+0 or R), X(t) : t ∈ T is a continuous value process
(Yates and Goodman, 1999, p. 206). •
In this course we restrict our attention to stochastic processes in discrete or continuous
time. Moreover, we shall assume that X(t) can take values on either a discrete or a
continuous set.
2Stochastic processes in which T is not a subset of R are also of importance for instance in geophysics
where T is the surface of a sphere and Xt represents a relevant r.v. at location t on the surface of the
Earth (Wei, 1990, p. 1).
Example 0.6 — Stochastic processes
1. Discrete-time, discrete value process — X(t) : t ∈ N, where X(t) is the
outcome (“effective”, 1; “non effective”, 0) referring to patient t, in a clinical trial in
which an experimental drug is administered to a series of patients (Kulkarni, 1995,
p. 4).
2. Discrete-time, continuous value process — X(t) : t ∈ {1, . . . , 365}, where
X(t) represents the noontime temperature in degrees Celsius at Lisbon Airport on
day t, from January 1 to December 31, 2013 (Yates and Goodman, 1999, p. 203).
3. Continuous-time, discrete value process — X(t) : t ∈ [0, 1], where X(t) is
the number of active calls associated to a coverage cell at time t, tomorrow from 8
to 9PM (Yates and Goodman, 1999, p. 203).
4. Continuous-time, continuous value process — X(t) : t ∈ R+0 , where X(t)
denotes the temperature in degrees Kelvin on the surface of a space shuttle at time
t, starting at launch time t = 0 (Yates and Goodman, 1999, p. 202). •
Quiz 0.7 — Stochastic processes
Give more examples of discrete/continuous-time, discrete/continuous value processes.3 •
Example 0.8 — Sample paths (Hajek, 2009, p. 97)
Consider X(t) : t ∈ N, where X(1), X(2), . . . are independent and identically
distributed (i.i.d.) r.v. such that P [X(t) = 1] = p and P [X(t) = −1] = 1 − p, for
each t, where p ∈ (0, 1). Moreover, suppose Y(t) = ∑_{i=1}^{t} X(i), for t ∈ N.
3X(t) might represent: the number of arrivals to a queue during the service interval of the tth customer,
or the socio-economic status of a family after t generations (discrete-time, discrete value processes); the
waiting time (including the service time) of the tth customer in a system (discrete-time, continuous
value process); the number of people in a queue at time t (continuous-time, discrete value process); the
accumulated operating time of a server in [0, t], or the accumulated claims paid by an insurance company
in [0, t] (continuous-time, continuous value processes).
Both X(t) : t ∈ N and Y(t) : t ∈ N are discrete-time (discrete value) stochastic
processes. A sample path of X(t) : t ∈ N and the corresponding sample path of
Y(t) : t ∈ N are shown below for p = 1/2.
[Figure: a sample path of X(t), taking values −1 and 1, and the corresponding sample
path of the random walk Y(t), for t = 1, . . . , 8.]
•
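The sample paths above can be reproduced with a short simulation. The following Python sketch (the seed and the use of the standard `random` module are arbitrary implementation choices) generates one realization of X(t) : t ∈ N and the corresponding partial sums Y(t):

```python
import random

def random_walk(n, p=0.5, seed=1):
    """Simulate X(1), ..., X(n) with P[X(t) = 1] = p, P[X(t) = -1] = 1 - p,
    and the partial sums Y(t) = X(1) + ... + X(t)."""
    rng = random.Random(seed)
    x = [1 if rng.random() < p else -1 for _ in range(n)]
    y, s = [], 0
    for step in x:
        s += step
        y.append(s)
    return x, y

x, y = random_walk(8)
print("X:", x)
print("Y:", y)
```

Plotting t against Y(t) for a few hundred steps gives pictures similar to the one above.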
Motivation 0.9 — Characterization of a stochastic process (Kulkarni, 1995, pp.
9–10)
The r.v. X is fully characterized by its distribution function (d.f.),
FX(x) = P (X ≤ x), x ∈ R.
A random vector (X1, . . . , Xn) is completely described by its joint d.f.,
FX1,...,Xn(x1, . . . , xn) = P (X1 ≤ x1, . . . , Xn ≤ xn), xi ∈ R (i = 1, . . . , n).
Can we similarly and completely describe a stochastic process X(t) : t ∈ T? •
The full mathematical description of a stochastic process varies depending on whether
the index set T is finite, infinite (yet countable) or uncountable. Moreover, the full
description of a continuous-time stochastic process is not trivial because we have to deal
with an uncountable number of r.v. in this case; a complete description can be provided if
we make certain assumptions about the continuity of sample paths, etc. (Kulkarni, 1995,
p. 10).
Proposition 0.10 — Characterization of stochastic processes (Kulkarni, 1995, p.
10)
• If the index set T is finite and #T = n, then the stochastic process X(t) : t ∈ T
is completely described by the corresponding joint d.f.
• If T is infinite but countable, say T = N0, the discrete-time process X(t) : t ∈ N0
is fully described by a consistent family of (finite-dimensional) joint d.f., say Fn :
n ∈ N0.4 What one means by "fully describing" a stochastic process in this case is
being able to construct a probability space on which the process resides.5
• If T is uncountable, say T = R+0, and almost all the sample paths of X(t) : t ∈ R+0
are right-continuous6 with a finite number of jumps in a(ny) finite interval of time,
then X(t) : t ∈ R+0 is completely described by a consistent family of (finite-
dimensional) joint d.f. Ft1,...,tn(x1, . . . , xn) = P[X(t1) ≤ x1, . . . , X(tn) ≤ xn], for any
n ∈ N and 0 ≤ t1 < · · · < tn. •
Quiz 0.11 — Characterization of stochastic processes
How can we fully characterize the stochastic process X(t) : t ∈ N, where X(1), X(2), . . .
are i.i.d. r.v. with common d.f. F?7
4 Let Fn(x0, x1, . . . , xn) = P[X(0) ≤ x0, X(1) ≤ x1, . . . , X(n) ≤ xn], xi ∈ R (i = 0, 1, . . . , n) and n ∈ N0. Then the family of joint d.f. Fn : n ∈ N0 is called consistent if lim_{x→+∞} Fn+1(x0, x1, . . . , xn, x) = Fn(x0, x1, . . . , xn), for all xi ∈ R (i = 0, 1, . . . , n) and n ∈ N0. That is, the "marginal" distribution of (X(0), X(1), . . . , X(n)) obtained from the joint d.f. of (X(0), X(1), . . . , X(n), X(n + 1)) should be the same as the joint d.f. specified for (X(0), X(1), . . . , X(n)) (Walrand, 2004, p. 190).
5 The Kolmogorov existence theorem guarantees that a suitably "consistent" collection of finite-dimensional distributions will define a stochastic process. This theorem is credited to the Soviet mathematician Kolmogorov (http://en.wikipedia.org/wiki/Andrey_Kolmogorov) and can be stated as follows (for more details, see Karr, 1993, p. 65). Let Fn be a joint d.f. on R^{n+1} and suppose that lim_{x→+∞} Fn+1(x0, x1, . . . , xn, x) = Fn(x0, x1, . . . , xn), for all xi ∈ R (i = 0, 1, . . . , n) and n ∈ N0. Then there is a probability space, say (Ω, F, P), and a sequence of r.v. X(t), t ∈ N0, defined on it such that Fn is the d.f. of (X(0), X(1), . . . , X(n)), for each n ∈ N0.
6 I.e., X(s) tends to X(t) as s decreases to t, for all t.
7 This stochastic process is completely described by F because we can create a consistent family of joint d.f. as follows: Fn(x1, . . . , xn) = ∏_{i=1}^{n} F(xi), xi ∈ R (i = 1, . . . , n) and n ∈ N (Kulkarni, 1995, p. 10).
Motivation 0.12 — Partial characterization of stochastic processes (Brockwell
and Davis, 1991, p. 11)
Let us remind the reader that, while handling a random vector, it is often useful to
compute its mean vector and, more importantly, its covariance matrix (or correlation
matrix) to gain insight into the dependence between its components.
While dealing with a stochastic process X(t) : t ∈ T, we have to extend the concepts
of mean vector and covariance or correlation matrices — the mean, the autocovariance
and the autocorrelation functions provide the necessary extension. •
Definition 0.13 — Mean, variance, autocovariance and autocorrelation
functions (Wei, 1990, p. 7)
A stochastic process X(t) : t ∈ T can be partially described by the following functions:
µ(t) = E[X(t)], t ∈ T ;
σ2(t) = V [X(t)], t ∈ T ;
γ(t1, t2) = Cov(X(t1), X(t2)), t1, t2 ∈ T ;
ρ(t1, t2) = Corr(X(t1), X(t2))
= γ(t1, t2) / √(σ2(t1) × σ2(t2)), t1, t2 ∈ T.
They represent the mean, variance, autocovariance and autocorrelation functions,
respectively. •
Quiz 0.14 — Mean, variance and autocovariance functions
Let A and B be two independent N(0, 1) r.v., and X(t) = A + Bt + t^2, t ∈ R. Determine
the mean, variance and autocovariance functions.8 •
8For t ∈ R, we have µ(t) = t2, σ2(t) = 1 + t2 and γ(t1, t2) = 1 + t1t2 (Hajek, 2009, p. 99).
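The answer in footnote 8 can be checked by Monte Carlo simulation. The sketch below (the function name `simulate_xy`, the sample size and the seed are arbitrary illustrative choices) estimates µ(t1), σ2(t1) and γ(t1, t2) for t1 = 1 and t2 = 2, where the theoretical values are 1, 2 and 3:

```python
import random
from statistics import fmean, variance

def simulate_xy(t1, t2, n=200_000, seed=7):
    # X(t) = A + B*t + t^2 with A, B independent N(0, 1) r.v.
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        x1.append(a + b * t1 + t1 ** 2)
        x2.append(a + b * t2 + t2 ** 2)
    m1, m2 = fmean(x1), fmean(x2)
    # sample covariance between X(t1) and X(t2)
    cov = sum((u - m1) * (v - m2) for u, v in zip(x1, x2)) / (n - 1)
    return m1, variance(x1), cov

m, v, c = simulate_xy(1.0, 2.0)
print(m, v, c)  # theory: mu(1) = 1, sigma^2(1) = 2, gamma(1, 2) = 3
```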
0.2 A pivotal characteristic of some stochastic processes
A crucial feature of several stochastic processes is some form of statistical equilibrium or
stationarity.
In order to state the following notions of stationarity, let us consider (without loss of
generality) a stochastic process X(t) : t ∈ R+0 .
Definition 0.15 — nth order stationarity in distribution (Wei, 1990, p. 7)9
The stochastic process X(t) : t ∈ R+0 is said to be nth order stationary in distribution
(n ∈ N), if the n−dimensional joint d.f. is time invariant, i.e., if
Ft1,...,tn(x1, . . . , xn) = Ft1+u,...,tn+u(x1, x2, . . . , xn), (1)
for any (x1, x2, . . . , xn) ∈ Rn, any n-tuple (t1, t2, . . . , tn) with ti ∈ R+0, and any u > 0. •
Remark 0.16 — nth order stationarity in distribution
• If X(t) : t ∈ R+0 is nth order stationary in distribution, then the n-dimensional
joint d.f. are unaffected by shifting all the time epochs t1, t2, . . . , tn by any positive
amount u (Grimmett and Stirzaker, 2001a, p. 361).
• A higher order of stationarity always implies a lower order of stationarity because
the n-dimensional joint d.f. determines all finite-dimensional joint d.f. of lower
dimension, say m < n (Wei, 1990, p. 7). •
Definition 0.17 — Strict stationarity (Wei, 1990, p. 7; Grimmett and Stirzaker,
2001a, p. 361)
The stochastic process X(t) : t ∈ R+0 is strictly stationary if it is nth order stationary
in distribution, for any n ∈ N. •
9 Wei (1990, p. 7) states this notion of stationarity for X(t) : t ∈ Z; the extension to X(t) : t ∈ R+0 follows in a straightforward manner.
Remark 0.18 — Strict stationarity
• The terms strongly stationary and completely stationary are also used to denote a
strictly stationary stochastic process (Wei, 1990, p. 7).
• For a strictly stationary process, the mean function µ(t) is constant, say equal to
µ, provided the expectation of X(t) exists, i.e., E[|X(t)|] < +∞ (Wei, 1990, p. 7).
Likewise, if E[X2(t)] < +∞, then the variance function σ2(t) is also constant, say
equal to σ2; moreover, the autocovariance function satisfies γ(t, t+ u) = γ(0, u), for
t ∈ R+0 and u > 0 (Wei, 1990, pp. 7–8). •
Quiz 0.19 — Strict stationarity
Is the stochastic process defined in Quiz 0.14 strictly stationary?10 •
It is very difficult or virtually impossible to verify strict stationarity; thus, we often use
weaker notions of stationarity defined in terms of the moments of the stochastic process
(Wei, 1990, p. 8).
Definition 0.20 — First and second order weak stationarity (Wei, 1990, p. 8;
Pires, 2001, p. 11)
• A first order weakly stationary process X(t) : t ∈ R+0 has constant mean function
µ(t) = µ, t ∈ R+0 .
• A second order weakly stationary process X(t) : t ∈ R+0 — or simply stationary
— has constant mean function µ(t) = µ, for t ∈ R+0 , and an autocovariance function
which depends on the time lag alone, i.e.,
γ(t, t+ u) = γ(0, u) (2)
for any t, u ∈ R+0 . •
10No! For instance, µ(t) = t2 is not constant.
Quiz 0.21 — Second order weak stationarity
Let X(t) : t ∈ Z be a stochastic process such that
X(t) = µ+ φ[X(t− 1)− µ] + ε(t), t ∈ Z, (3)
where: φ is a constant satisfying −1 < φ < 1; and ε(t) : t ∈ Z is a sequence of
disturbances such that ε(t) i.i.d. ∼ N(0, σ2ε), with σ2ε = (1 − φ2)σ2(0).
Is X(t) : t ∈ Z a stationary (i.e., a second order weakly stationary) process?11 Is
this stochastic process strictly stationary? •
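The stationarity claimed in footnote 11 can be illustrated numerically. A minimal Python sketch, assuming Gaussian disturbances and µ = 0, σ2(0) = 1 (arbitrary illustrative choices), simulates (3) started in its stationary distribution and estimates γ(t, t + k), which should be close to σ2(0)φ^k:

```python
import math
import random

def ar1_path(phi, sigma0=1.0, n=100_000, seed=42):
    # X(t) = phi*X(t-1) + eps(t), with Var[eps] = (1 - phi^2)*sigma0^2 and mu = 0;
    # the chain is started in its stationary N(0, sigma0^2) distribution
    rng = random.Random(seed)
    se = math.sqrt(1 - phi ** 2) * sigma0
    x = rng.gauss(0, sigma0)
    path = []
    for _ in range(n):
        x = phi * x + rng.gauss(0, se)
        path.append(x)
    return path

def autocov(path, k):
    # sample autocovariance at lag k
    m = sum(path) / len(path)
    pairs = range(len(path) - k)
    return sum((path[i] - m) * (path[i + k] - m) for i in pairs) / len(path)

p = ar1_path(0.6)
print([round(autocov(p, k), 3) for k in (0, 1, 2)])  # theory: 1, 0.6, 0.36
```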
0.3 A few examples of stochastic processes
A sequence of i.i.d. r.v. is the simplest stochastic process. Although it is devoid of an
interesting structure, we can construct non-trivial stochastic processes from it, such as
Y(t) : t ∈ N, defined in Example 0.8 and called a (one-dimensional) random walk.
This is the simplest discrete-time stochastic process with a non-trivial structure
(Kulkarni, 1995, p. 11); it is going to be addressed in more detail in future chapters.
Remark 0.22 — Applications of random walk
The path followed by an atom in a gas, moving under the influence of collisions with
other atoms, can be described by a random walk (RW). Random walks have also been
applied in other areas such as:
• Economics — RW used to model share prices and other factors;
• Population genetics — RW describes the statistical properties of genetic drift;12
11 Yes! Since −1 < φ < 1, X(t) admits the representation X(t) = µ + ∑_{i=0}^{+∞} φ^i ε(t − i); hence E[X(t)] = µ and V[X(t)] = σ2ε/(1 − φ2) = σ2(0) are constant, and Cov(X(t), X(t + k)) = Cov(X(t) − µ, X(t + k) − µ) = E[∑_{i=0}^{+∞} ∑_{j=0}^{+∞} φ^i φ^j ε(t − i) ε(t + k − j)] = σ2ε ∑_{i=0}^{+∞} φ^i φ^{i+k} = σ2ε φ^k / (1 − φ2) = σ2(0) φ^k, for k ∈ N0.
12 Genetic drift is one of several evolutionary processes which lead to changes in allele frequencies over time.
• Ecology — RW used to describe individual animal movements, and occasionally
to model population dynamics. •
An extremely relevant stochastic process arises from counting events occurring one at
a time.13
Definition 0.23 — Counting process (Ross, 1989, p. 210)
A stochastic process N(t) : t ≥ 0 is said to be a counting process if N(t) represents the
total number of events (e.g. arrivals, departures) that have occurred up to time t; it
must satisfy:
• N(t) ∈ N0, t ≥ 0;
• N(s) ≤ N(t), 0 ≤ s < t;
• N(t)−N(s) corresponds to the number of events that have occurred in the interval
(s, t], 0 ≤ s < t. •
Example 0.24 — Counting process (Ross, 2003, p. 288)
• Let N(t) be the number of persons who enter a specific store at or prior to time t.
Then N(t) : t ≥ 0 is a counting process in which an event corresponds to a person
entering the store.
• Let N(t) be the number of children born by time t in a maternity hospital. Then N(t) :
t ≥ 0 is a counting process in which an event occurs whenever a child is born. •
Quiz 0.25 — Counting process (Ross, 2003, p. 288)
Let N(t) represent now the number of persons in a store at time t. Is N(t) : t ≥ 0 a
counting process?14
Give examples of counting processes. •
The two following properties play a major role in the characterization of counting
processes.
13 What follows is an adaptation of Morais (2011, Sec. 3.6).
14 No! N(t) : t ≥ 0 does not satisfy N(s) ≤ N(t), 0 ≤ s < t.
Definition 0.26 — Counting process with stationary increments (Ross, 1989, p.
210)
The counting process N(t) : t ≥ 0 is said to have stationary increments if the distribution
of the number of events that occur in any interval of time depends only on the length of
the interval,15 that is,
• N(t2 + s) − N(t1 + s) =d N(t2) − N(t1), for all 0 ≤ t1 < t2 and s > 0. •
Definition 0.27 — Counting process with independent increments (Ross, 1989,
p. 209)
The counting process N(t) : t ≥ 0 is said to have independent increments if the numbers
of events that occur in disjoint intervals are independent r.v., i.e.,
• for 0 < t1 < · · · < tn, N(t1), N(t2) − N(t1), N(t3) − N(t2), . . . , N(tn) − N(tn−1)
are independent r.v.
•
What follows is a detailed description of a counting process in discrete time (arising
from a sequence of i.i.d. r.v.) with stationary and independent increments.16
Motivation 0.28 — Bernoulli (counting) process (Karr, 1993, p. 88)
Counting successes in repeated, independent trials, each of which has one of two possible
outcomes (success and failure). •
Definition 0.29 — Bernoulli process (Karr, 1993, p. 88)
A Bernoulli process with parameter p is a sequence Xi : i ∈ N of i.i.d. r.v. with Bernoulli
distribution with parameter p = P(success). •
15 The distributions do not depend on the origin of the time interval; they only depend on the length of the interval.
16 Since we are going to deal with discrete-time processes, X(t) is replaced by Xi, etc.
Definition 0.30 — Important r.v. in a Bernoulli process (Karr, 1993, pp. 88–89)
In isolation a Bernoulli process is neither deep nor interesting. However, we can identify
three associated and very important r.v.:
• Sn = ∑_{i=1}^{n} Xi, the number of successes in the first n trials (n ∈ N);
• Tk = min{n : Sn = k}, the time (trial number) at which the kth success occurs
(k ∈ N), that is, the number of trials needed to get k successes;
• Uk = Tk − Tk−1, the time (number of trials) between the (k−1)th and kth successes
(k ∈ N, T0 = 0, U1 = T1). •
Definition 0.31 — Bernoulli counting process (Karr, 1993, p. 88)
The sequence Sn : n ∈ N is usually termed the Bernoulli counting process (or success
counting process). •
Exercise 0.32 — Bernoulli counting process
Simulate a Bernoulli process with parameter p = 1/2 and consider n = 100 trials. Plot the
realizations of both the Bernoulli process and the Bernoulli counting process. •
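Since plotting facilities vary, here is a minimal Python sketch of the simulation part of this exercise, printing the realizations instead of plotting them (the function name and the seed are arbitrary choices):

```python
import random

def bernoulli_process(n, p, seed=2014):
    """Simulate X_1, ..., X_n i.i.d. Bernoulli(p) and the counting process
    S_m = X_1 + ... + X_m, m = 1, ..., n."""
    rng = random.Random(seed)
    x = [1 if rng.random() < p else 0 for _ in range(n)]
    s, total = [], 0
    for xi in x:
        total += xi
        s.append(total)
    return x, s

x, s = bernoulli_process(100, 0.5)
print("first 10 trials:", x[:10])
print("first 10 counts:", s[:10])
print("S_100 =", s[-1])  # should be close to n*p = 50
```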
Definition 0.33 — Bernoulli success time process (Karr, 1993, p. 88)
The sequence Tk : k ∈ N is usually called the Bernoulli success time process. •
Proposition 0.34 — Important distributions in a Bernoulli process (Karr, 1993,
pp. 89–90)
In a Bernoulli process with parameter p (p ∈ [0, 1]) we have:
• Sn ∼ Binomial(n, p), n ∈ N;
• (Sm | Sn = k) ∼ Hypergeometric(n,m, k), 0 ≤ m ≤ n, 0 ≤ k ≤ n.
• Tk ∼ NegativeBinomial(k, p), k ∈ N;
• Uk i.i.d. ∼ Geometric(p) =d NegativeBinomial(1, p), k ∈ N. •
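These distributional results can be checked by simulation. The following Python sketch (the parameter values n = 20, k = 3, p = 0.4 and the seed are arbitrary illustrative choices) estimates E(Sn) and E(Tk), which should be close to the theoretical values np = 8 and k/p = 7.5:

```python
import random

def empirical_means(n=20, k=3, p=0.4, reps=40_000, seed=5):
    # Estimate E[S_n] (successes in the first n trials) and
    # E[T_k] (trial number of the kth success) over many replications
    rng = random.Random(seed)
    sum_s = sum_t = 0
    for _ in range(reps):
        successes = trials = 0
        t_k = None
        # run until both the first n trials are done and the kth success is seen
        while trials < n or t_k is None:
            trials += 1
            success = rng.random() < p
            successes += success
            if success and successes == k and t_k is None:
                t_k = trials
            if trials == n:
                sum_s += successes
        sum_t += t_k
    return sum_s / reps, sum_t / reps

es, et = empirical_means()
print(es, et)  # theory: E[S_20] = 20*0.4 = 8, E[T_3] = 3/0.4 = 7.5
```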
Proposition 0.35 — Properties of the Bernoulli counting process (Karr, 1993,
p. 90)
The Bernoulli counting process Sn : n ∈ N has:
• independent increments — i.e., for 0 < n1 < · · · < nk, the r.v. Sn1, Sn2 − Sn1,
Sn3 − Sn2, . . . , Snk − Snk−1 are independent;
• stationary increments — that is, for fixed j ∈ N, the distribution of Sk+j−Sk is the
same for all k ∈ N. •
Quiz 0.36 — Properties of the Bernoulli counting process
(a) Argue why Proposition 0.35 (Karr, 1993, p. 90) holds.
(b) Obtain the mean, variance and autocovariance functions of the Bernoulli counting
process. Is it a second order weakly stationary process? •
Remark 0.37 — Bernoulli counting process (web.mit.edu/6.262/www/lectures/
6.262.Lec1.pdf)
Some application areas for discrete stochastic processes such as the Bernoulli counting
process (and the Poisson process, studied in the next chapter) are:
• Operations Research — Queueing in any area, failures in manufacturing systems,
finance, risk modelling, network models;
• Biology and Medicine — Epidemiology, genetics and DNA studies, cell modelling,
bioinformatics, medical screening, neurophysiology;
• Computer Systems — Communication networks, intelligent control systems, data
compression, detection of signals, job flow in computer systems, physics – statistical
mechanics. •
Exercise 0.38 — Bernoulli process modelling of sexual HIV transmission
(Pinkerton and Holtgrave, 1998, pp. 13–14)
In the Bernoulli-process model of sexual HIV transmission, each act of sexual intercourse
is treated as an independent stochastic trial that is associated with a probability α of HIV
transmission. α is also known as the infectivity of HIV, and a number of factors are
believed to influence α.17
Prove that the expression of the probability of HIV transmission in n multiple contacts
with the same infected partner is 1− (1− α)n. •
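The formula follows because no transmission occurs in any of the n independent contacts with probability (1 − α)^n. A short Python illustration (the value α = 0.001 is a hypothetical illustrative infectivity, not a value taken from the references):

```python
def p_transmission(alpha, n):
    # P(at least one transmission in n contacts) = 1 - P(no transmission in all n)
    return 1 - (1 - alpha) ** n

# alpha = 0.001 is hypothetical; the probability grows with the number of contacts
for n in (1, 10, 100):
    print(n, round(p_transmission(0.001, n), 6))
```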
Definition 0.39 — Independent Bernoulli processes
Two Bernoulli counting processes S(1)n : n ∈ N and S(2)n : n ∈ N are independent
if, for every positive integer k and all times n1, . . . , nk, the random vector
(S(1)n1, . . . , S(1)nk) associated with the first process is independent of the random
vector (S(2)n1, . . . , S(2)nk) associated with the second process. •
Proposition 0.40 — Merging independent Bernoulli processes
Let S(1)n : n ∈ N and S(2)n : n ∈ N be two independent Bernoulli counting processes
with parameters α and β, respectively. Then the merged process S(1)n ⊕ S(2)n : n ∈ N
is a Bernoulli counting process with parameter α + β − αβ.18 •
17 Such as the type of sex act engaged in, sex role, etc.
18 An event is said to occur in the merged process if and only if an event occurs in at least one of the two original processes, which happens with probability α + β − αβ (Bertsekas, 2—, p. 10).
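A quick Monte Carlo check of this proposition, with arbitrary illustrative parameters α = 0.2 and β = 0.3 (so the merged parameter should be 0.2 + 0.3 − 0.06 = 0.44):

```python
import random

def merged_rate(alpha, beta, n=200_000, seed=11):
    # Empirical per-trial event rate of the merged process: an event occurs
    # iff at least one of the two independent Bernoulli processes has one
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        e1 = rng.random() < alpha
        e2 = rng.random() < beta
        hits += e1 or e2
    return hits / n

alpha, beta = 0.2, 0.3
print("empirical  :", merged_rate(alpha, beta))
print("theoretical:", alpha + beta - alpha * beta)
```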
Quiz 0.41 — Merging independent Bernoulli processes
Give an example of a merger between two independent Bernoulli processes. Provide
a detailed description of the two original processes and the process resulting from the
merger. •
Proposition 0.42 — Splitting a Bernoulli process (or sampling a Bernoulli
process)
Let Sn : n ∈ N be a Bernoulli counting process with parameter α. Splitting the original
Bernoulli counting process based on a selection probability p yields two Bernoulli counting
processes with parameters αp and α(1− p).
•
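As a numerical illustration of the splitting mechanism, the sketch below (α = 0.5, p = 0.3 and the seed are arbitrary illustrative choices) routes each event of a simulated Bernoulli process and estimates the rates of the two resulting streams, which should be close to αp = 0.15 and α(1 − p) = 0.35:

```python
import random

def split_rates(alpha, p, n=200_000, seed=3):
    # Each event of the original Bernoulli process (parameter alpha) is routed
    # to stream 1 with probability p, otherwise to stream 2
    rng = random.Random(seed)
    c1 = c2 = 0
    for _ in range(n):
        if rng.random() < alpha:
            if rng.random() < p:
                c1 += 1
            else:
                c2 += 1
    return c1 / n, c2 / n

r1, r2 = split_rates(0.5, 0.3)
print(r1, r2)  # theory: alpha*p = 0.15 and alpha*(1 - p) = 0.35
```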
Quiz 0.43 — Splitting a Bernoulli process
(a) Are the two processes resulting from splitting a Bernoulli process independent?19
(b) Give an example where we are dealing with a splitting of a Bernoulli process.20
Provide a detailed description of the original process and the two resulting from
the splitting. •
19 No! If we try to merge the two split processes and assume they are independent, we get a parameter αp + α(1 − p) − αp × α(1 − p), which is different from α.
20 A two-machine work center may see a stream of arriving parts to be processed and split them by sending each part to a randomly chosen machine (Bertsekas, 2—, p. 10).
Chapter 1
Poisson Processes
Is there a continuous analogue of the Bernoulli process?
Yes!
Motivation 1.1 — Poisson processes
In the Bernoulli process, the times between consecutive events are i.i.d. r.v. with Geometric
distribution with parameter p — the only discrete distribution with lack of memory...
Similarly, if the times between consecutive events are i.i.d. r.v. with Exponential
distribution with parameter λ — the only continuous distribution with lack of memory!
— we end up dealing with the (homogeneous) Poisson process, named after the French
mathematician Simeon-Denis Poisson. In this stochastic process events occur continuously
and independently of one another. •
Assuming that the times between consecutive events are i.i.d. r.v. exponentially
distributed is certainly a simplifying assumption so as to render the mathematics tractable
(Ross, 2003, p. 269), and yet the radioactive decay of atoms, telephone calls arriving at
a switchboard, page view requests to a website and several other phenomena are
well-modeled as Poisson processes (http://en.wikipedia.org/wiki/Poisson_process).
What follows is an extended version of Morais (2011, Section 3.7), prepended by a
section inspired by Kulkarni (1995, Section 5.1), Ross (2003, Section 5.2) and Pacheco
(2002, Section 2.1).
1.1 Properties of the exponential distribution
The purpose of this section is to state results concerning the exponential distribution,
namely a few that play a major role in the analysis of some stochastic processes.
Definition 1.2 — Exponential distribution
The r.v. X is said to have exponential distribution with parameter λ > 0 — for short
X ∼ Exponential(λ) — if it has p.d.f. given by
fX(x) = { 0, x < 0
          λ e^{−λx}, x ≥ 0.   (1.1)
•
Exercise 1.3 — C.d.f., moments, m.g.f., expected value, variance, coefficient
of variation, median, mode, skewness, kurtosis of the Exponential distribution
Let X ∼ Exponential(λ). Prove the following results:
(a) FX(x) = { 0, x ≤ 0
              1 − e^{−λx}, x > 0.
The survival function of X, SX(x) = 1 − FX(x), is an exponential function with
negative exponent, which explains in part why the exponential distribution is also
called negative exponential distribution (Pacheco, 2002, p. 38).
(b) E(X^s) = Γ(s + 1)/λ^s, for s > −1, where: Γ(s) = ∫_0^{+∞} λ^s x^{s−1} e^{−λx} dx;
Γ(s + 1) = sΓ(s), s > 0; and Γ(s + 1) = s!, for s ∈ N0.
(c) MX(t) = E(e^{tX}) = λ/(λ − t), for t < λ.1
(d) E(X) = 1/λ.
(e) V(X) = 1/λ^2.
1 If the function MX(t) = E(e^{tX}) exists in a neighborhood of t = 0, it is called the moment generating function (m.g.f.) of the r.v. X. Note that MX(t) = E(e^{tX}) = ∑_{k=0}^{+∞} t^k E(X^k)/k!. Moreover, if the m.g.f. is defined for |t| ≤ t0, where t0 > 0, then E(X^k) = d^k MX(t)/dt^k |_{t=0}, for k = 1, 2, . . .
(f) CV(X) = √V(X)/|E(X)| = 1 (coefficient of variation).

(g) median(X) = λ^{−1} × ln(2).

(h) mode(X) = 0.

(i) SC(X) = E[X − E(X)]^3/[SD(X)]^3 = 2 (skewness coefficient; SC(X) > 0, thus a skewed-to-the-right distribution).

(j) KC(X) = E[X − E(X)]^4/[SD(X)]^4 − 3 = 6 (excess kurtosis coefficient; KC(X) > 0, hence a leptokurtic distribution). •
Proposition 1.4 — Univariate properties of the Exponential distribution
Let X ∼ Exponential(λ). Then X has the following properties:
• Lack of memory — This is the most important property of the exponential
distribution and it can be stated as follows:
P (X > s+ t | X > s) = P (X > t), s, t ≥ 0 (1.2)
(Kulkarni, 1995, p. 189), i.e., P (X > s + t) = P (X > s) × P (X > t), s, t ≥ 0.2
Equivalently, the residual lifetime of X at age s, (X − s | X > s),
has the same distribution as X itself (Pacheco, 2002, p. 38):

(X − s | X > s) =st X. (1.3)
The Exponential r.v. is the only continuous r.v. with the lack of memory property
(Kulkarni, 1995, p. 190).3
2 This means that the conditional probability that we need to wait more than another t seconds before the first arrival, given that the first arrival has not yet happened after s seconds, is equal to the initial probability that we need to wait more than t seconds for the first arrival (http://en.wikipedia.org/wiki/Exponential_distribution#Memorylessness).
3 The proof of this relevant result is interesting and can be found in Ross (1983, pp. 24–25), Kulkarni (1995, p. 190) or Ross (2003, p. 275).
• Failure rate — The lack of memory property is translated into a constant failure
rate function for the Exponential r.v.:
λX(x) = fX(x)/[1 − FX(x)] = λ, x ≥ 0. (1.4)
Since the failure rate function completely characterizes the c.d.f. of a non-negative
r.v.,4 the Exponential r.v. is the only non-negative continuous r.v. with constant
failure rate (Kulkarni, 1995, p. 191). •
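The lack of memory property (1.2) lends itself to a quick empirical check; the values lam = 1.0, s = 0.7 and t = 1.2 below are illustrative assumptions only.

```python
import random

# Empirical check of lack of memory: P(X > s+t | X > s) vs P(X > t),
# for illustrative choices lam = 1.0, s = 0.7, t = 1.2.
random.seed(2024)
lam, s, t = 1.0, 0.7, 1.2
sample = [random.expovariate(lam) for _ in range(500_000)]

survived_s = [x for x in sample if x > s]
p_cond = sum(x > s + t for x in survived_s) / len(survived_s)  # P(X > s+t | X > s)
p_uncond = sum(x > t for x in sample) / len(sample)            # P(X > t)

print(p_cond, p_uncond)  # both should be close to exp(-1.2)
```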
Another useful result refers to the distribution of the minimum of independent
exponentially distributed r.v., one of several multivariate properties of the
Exponential distribution.
Proposition 1.5 — Minimum of Exponentials
Let Xi indep∼ Exponential(λi), i = 1, . . . , n. It turns out that the smallest of the Xi also
has an Exponential distribution — with parameter equal to the sum of the λi:

min{X1, . . . , Xn} ∼ Exponential(∑_{i=1}^{n} λi). (1.5)

Moreover, if we define a r.v. N as

N = j, iff Xj = min{X1, . . . , Xn}, (1.6)

then min{X1, . . . , Xn} and N are independent r.v.5 and

P(N = j, min{X1, . . . , Xn} > x) = [λj/∑_{i=1}^{n} λi] × e^{−(∑_{i=1}^{n} λi) x}. (1.7)

•
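Result (1.5) can also be illustrated numerically; the rates (1.0, 2.0, 3.0) below are an illustrative assumption.

```python
import random
import statistics

# Monte Carlo sketch of Proposition 1.5: with illustrative rates
# lams = (1.0, 2.0, 3.0), min{X1, X2, X3} should behave as an
# Exponential(6.0) r.v., i.e., have mean 1/6 and variance 1/36.
random.seed(7)
lams = (1.0, 2.0, 3.0)
mins = [min(random.expovariate(l) for l in lams) for _ in range(300_000)]

mean_min = statistics.mean(mins)      # approx. 1/sum(lams) = 1/6
var_min = statistics.pvariance(mins)  # approx. 1/sum(lams)**2 = 1/36
print(mean_min, var_min)
```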
Exercise 1.6 — Minimum of independent Exponentials
(a) Prove result (1.5).
4 In fact, the c.d.f. of a non-negative continuous r.v. X can be obtained in terms of λX(x): FX(x) = 1 − exp[−∫_0^x λX(u) du], x ≥ 0 (Ross, 2003, p. 277).
5 The proof of this intriguing result can be found in Kulkarni (1995, pp. 192–193), for n = 2; the extension is easily proved.
(b) Let Xi indep∼ Exponential(λi) represent the duration of component i of a system.
What is the survival function of the duration of the system if the components are
set in series? Obtain the expected value and variance of the duration of the series
system.
(c) Prove result (1.7) without assuming that N and Xj are independent r.v. or using
result (1.9). •
The assumption of independence is critical (Kulkarni, 1995, p. 192) to obtain the
following result concerning the probability that one Exponential r.v. is smaller than
another:
Proposition 1.7 — Probability of first failure
Let Xi indep∼ Exponential(λi), i = 1, . . . , n. Then

P(X1 < X2) = P(X1 = min{X1, X2}) = λ1/(λ1 + λ2), (1.8)

and, in general,

P(Xj = min{X1, . . . , Xn}) = λj/∑_{i=1}^{n} λi. (1.9)

•
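A quick simulation makes result (1.8) concrete; the rates lam1 = 1.0 and lam2 = 3.0 below are illustrative assumptions.

```python
import random

# Numerical check of result (1.8): with illustrative rates
# lam1 = 1.0 and lam2 = 3.0, P(X1 < X2) should be 1/(1+3) = 0.25.
random.seed(99)
lam1, lam2 = 1.0, 3.0
n = 400_000
hits = sum(random.expovariate(lam1) < random.expovariate(lam2) for _ in range(n))
p_hat = hits / n
print(p_hat)  # approx. lam1/(lam1 + lam2) = 0.25
```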
Example/Exercise 1.8 — Probability of first failure
(a) Prove result (1.8).
Let Xi indep∼ Exponential(λi), i = 1, 2. Then

P(X1 < X2) = ∫_0^{+∞} P(X1 < X2 | X2 = x) × fX2(x) dx
(X1, X2 indep.) = ∫_0^{+∞} P(X1 < x) × fX2(x) dx
= ∫_0^{+∞} (1 − e^{−λ1 x}) × λ2 e^{−λ2 x} dx
= λ1/(λ1 + λ2).
(b) Prove result (1.9).
(c) Harry and John arrived at the same time to the barber shop: Harry to get shaved,
John to get a haircut. Suppose that Harry and John were immediately (and
independently!) served. Moreover, assume that the duration of a haircut (resp. a
shave) is an Exponential r.v. with expected value equal to 20 (resp. 15) minutes.
Calculate the probability that John gets his hair cut before Harry gets his beard
shaved. •
Exercise 1.9 — Probability of first failure (Kulkarni, 1995, Example 5.1, p. 192)
The running track in a stadium is 1 km long. Two runners start on it at the same time.
Suppose the speeds of the runners are Xi indep∼ Exponential(λi), i = 1, 2. The mean speeds
of runners 1 and 2 are 20 km/hr and 22 km/hr, respectively.
What is the probability that runner 1 wins the race? •
Exercise 1.10 — Probability of first failure (bis)
Let X1 and X2 be two independent non-negative continuous r.v. with failure rate functions
λ1(x) and λ2(x), respectively.
Prove that

P[X1 < X2 | min{X1, X2} = x] = λ1(x)/[λ1(x) + λ2(x)].

•
Exercise 1.11 — Probability of first failure (bis, bis)
Consider a post office with two clerks (who operate independently!). Suppose that
customers A, B and C enter the system simultaneously, A is served by one of the clerks,
B by the other and C is told that her/his service will begin as soon as either A or B
leaves.
What is the probability that, of the three customers, A is the last to leave the post
office if the amount of time a clerk spends with a customer is6
6Adapted from Ross (1983, Example 1.6(a), pp. 23–24).
(a) equal to 10 minutes?
(b) a r.v. with discrete uniform distribution in {1, 2, 3}?
(c) an Exponential r.v. with expected value 1/λ? •
A stronger version of the lack of memory property is stated in the next proposition.
Proposition 1.12 — Strong lack of memory; Rényi's representation
Let Xi i.i.d.∼ Exponential(λ), i = 1, . . . , n, and let X(1) = min{X1, . . . , Xn}, . . . , X(n) =
max{X1, . . . , Xn} be the associated order statistics. Then

(Xj − X(1) | Xj > X(1)) =d Xj, (1.10)

for j = 1, . . . , n.7

Moreover, {(Xj − min{X1, . . . , Xn} | Xj > min{X1, . . . , Xn}) : j = 1, . . . , n} is a
sequence of independent r.v. As a consequence, if D1, D2, . . . , Dn represent the (1st.
order) spacings — i.e., D1 = X(1), D2 = X(2) − X(1), . . . , Dn = X(n) − X(n−1) —, then

Dk indep∼ Exponential((n − k + 1)λ) (1.11)

for k = 1, . . . , n,8 and we can certainly add that

X(k) = ∑_{i=1}^{k} Di (1.12)

E[X(k)] = ∑_{i=1}^{k} 1/[(n − i + 1)λ], (1.13)

where (1.12) is usually called the Rényi representation of the order statistic X(k). •
7 When Xi represents the lifetime of component i (i = 1, . . . , n) and the n components are put to test at the same time, this result can be interpreted as follows: the remaining lifetime of component j, given that it is larger than the smallest of the lifetimes, is still exponentially distributed.
8 This result allows us to say that the times between successive failures — when dealing with items whose lifetimes are i.i.d. Exponential r.v. — are independent Exponential r.v. (with different parameters).
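The distribution of the spacings in (1.11) can be checked by simulation; n = 4 and lam = 1.0 below are illustrative assumptions.

```python
import random
import statistics

# Monte Carlo look at result (1.11): for n = 4 i.i.d. Exponential(lam)
# r.v. (lam = 1.0 for illustration), the spacing D_k = X(k) - X(k-1)
# should have mean 1/((n - k + 1)*lam), i.e., 1/4, 1/3, 1/2, 1.
random.seed(31)
n, lam, runs = 4, 1.0, 200_000
spacings = [[] for _ in range(n)]
for _ in range(runs):
    xs = sorted(random.expovariate(lam) for _ in range(n))
    prev = 0.0
    for k, x in enumerate(xs):
        spacings[k].append(x - prev)
        prev = x

means = [statistics.mean(d) for d in spacings]
print(means)  # approx. [0.25, 0.333, 0.5, 1.0]
```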
Exercise 1.13 — Strong lack of memory
Prove results (1.10) and (1.11). •
Exercise 1.14 — Rényi's representation
Use the Rényi representation to obtain the expected value and variance of the duration
of a parallel system with n components with i.i.d. exponentially distributed lifetimes.
Interpret the expression of the expected value you obtained.9 •
A parallel system is said to be operating with (n− 1) warm standbys. Another way of
operating the system is to use (n − 1) components as spares; thus, only one component
is working at a time and, when it fails, it is immediately replaced by one of the remaining
spares — the spares are in cold standby, that is, they do not fail unless they are put into
use (Kulkarni, 1995, p. 195); if the lifetimes of the components are i.i.d. Exponential r.v.
then the duration of this new system has a known distribution.
The distribution of sums of i.i.d. Exponential r.v. also arises when we are dealing with
the epoch of the nth arrival in a Poisson process.
Proposition 1.15 — Sums of i.i.d. Exponentials (Erlang distribution)
Let Xi i.i.d.∼ Exponential(λ), i = 1, . . . , n, and Sn = ∑_{i=1}^{n} Xi. Then Sn ∼ Gamma(n, λ) ≡ Erlang(n, λ), i.e.,

fSn(x) = [λ^n/(n − 1)!] x^{n−1} e^{−λx}, x ≥ 0. (1.14)

•
The gamma distribution stands in the same relation to exponential as negative
binomial to geometric: sums of i.i.d. exponential r.v. have gamma distribution (Morais,
2011, p. 78).
9 The equation of E[X(n)] is an example of the law of diminishing returns: a system of one component has expected lifetime 1/λ, whereas a system with two components in parallel has expected lifetime 1.5/λ; thus, doubling the number of components translates into just a 50% increase in the mean lifetime. One reason behind this diminishing return is that all the n components are in operation, and hence subject to failure simultaneously, although the parallel system requires just one component to function (Kulkarni, 1995, p. 195).
The parameter n is usually called the number of phases of the Erlang distribution
and λ its rate (the reciprocal of the expected duration of each phase); the Erlang distribution is a
particular case of the gamma family of distributions with very important applications
(Pacheco, 2002, p. 39). The Erlang distribution was developed by Agner Krarup Erlang
(1878–1929) to examine the number of telephone calls which might be made at the
same time to the operators of the switching stations; this work on telephone traffic
engineering has been expanded to consider waiting times in queueing systems in general;
the distribution is now used in the fields of stochastic processes and of biomathematics
(http://en.wikipedia.org/wiki/Erlang_distribution).
Exercise 1.16 — M.g.f., moments, expected value, variance, coefficient of
variation, mode, skewness, excess kurtosis of the Erlang distribution
Let Sn ∼ Gamma(n, λ), n ∈ N. Prove that:
(a) MSn(t) = [λ/(λ − t)]^n, for t < λ;

(b) E(Sn^k) = (n + k − 1)!/[(n − 1)! λ^k], k ∈ N;

(c) E(Sn) = n/λ;

(d) V(Sn) = n/λ^2;

(e) CV(Sn) = 1/√n ≤ 1;

(f) mode(Sn) = (n − 1)/λ;10

(g) SC(Sn) = 2/√n (skewed to the right distribution);

(h) KC(Sn) = 6/n. •
Exercise 1.17 — Erlang distribution
Prove that

FErlang(n,λ)(x) = ∑_{i=n}^{+∞} e^{−λx} (λx)^i/i! = 1 − FPoisson(λx)(n − 1), x > 0, n ∈ N. (1.15)

•
10 mode(Gamma(α, λ)) = (α − 1)/λ, α ∈ R+\(0, 1); median(Sn) has no simple closed form.
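Identity (1.15) can be verified numerically by integrating the Erlang p.d.f. (1.14) and comparing with the Poisson tail; n = 10, λ = 1 and x = 15 below are illustrative choices (they echo the setting of Exercise 1.18).

```python
import math

# Numerical check of identity (1.15): integrating the Erlang(n, lam)
# p.d.f. over (0, x] should equal the Poisson(lam*x) tail P(Poisson >= n).
n, lam, x = 10, 1.0, 15.0

def erlang_pdf(u):
    return lam**n / math.factorial(n - 1) * u**(n - 1) * math.exp(-lam * u)

# composite trapezoidal rule on a fine grid
steps = 200_000
h = x / steps
cdf_numeric = h * (sum(erlang_pdf(i * h) for i in range(1, steps)) +
                   0.5 * (erlang_pdf(0.0) + erlang_pdf(x)))

poisson_tail = 1.0 - sum(math.exp(-lam * x) * (lam * x)**i / math.factorial(i)
                         for i in range(n))
print(cdf_numeric, poisson_tail)  # the two values should agree
```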
Exercise 1.18 — Erlang distribution (bis) (Kulkarni, 1995, Example 5.4, p. 197)
Suppose the times between two successive births at a maternity hospital are i.i.d.
exponential r.v. with mean equal to one day.
What is the probability that the 10th birth in a calendar year takes place after January
15? •
The Erlang distribution arises when we sum i.i.d. exponential r.v. When the i.d.
assumption is dropped, another distribution arises with a coefficient of variation smaller
than one.
Proposition 1.19 — Sums of independent Exponentials: Hypo-exponential
distribution
Let Xi indep∼ Exponential(λi), i = 1, . . . , n, and suppose that λi ≠ λj for all i ≠ j. Then ∑_{i=1}^{n} Xi is said to be a Hypo-exponential r.v. and its p.d.f. is given by

f_{∑_{i=1}^{n} Xi}(x) = ∑_{i=1}^{n} Ci,n × λi e^{−λi x}, (1.16)

where Ci,n = ∏_{j≠i} λj/(λj − λi) (Ross, 2003, pp. 284–285; Kulkarni, 1995, p. 197). •
Exercise 1.20 — Hypo-exponential distribution
Let ∑_{i=1}^{n} Xi ∼ Hypo-exponential(λ1, . . . , λn).

(a) Describe an example where the Hypo-exponential distribution arises.

(b) Derive the p.d.f. of ∑_{i=1}^{n} Xi when n = 2, without using result (1.16).11

(c) Prove result (1.16) by taking advantage of the result derived in (b).12 •
Exercise 1.21 — C.d.f. and failure rate function of the Hypo-exponential distribution
Let ∑_{i=1}^{n} Xi ∼ Hypo-exponential(λ1, . . . , λn). Prove that:
11 See Ross (2003, p. 284).
12 This proof can be found in Ross (2003, pp. 285–286).
(a) P(∑_{i=1}^{n} Xi > x) = ∑_{i=1}^{n} Ci,n e^{−λi x}, x > 0;

(b) λ_{∑_{i=1}^{n} Xi}(x) = [∑_{i=1}^{n} Ci,n λi e^{−λi x}]/[∑_{i=1}^{n} Ci,n e^{−λi x}], x > 0;

(c) lim_{x→+∞} λ_{∑_{i=1}^{n} Xi}(x) = min{λ1, . . . , λn} (interpret this result!).13 •
Exercise 1.22 — M.g.f., expected value, variance, coefficient of variation of the
Hypo-exponential distribution
Let ∑_{i=1}^{n} Xi ∼ Hypo-exponential(λ1, . . . , λn). Prove that:

(a) M_{∑_{i=1}^{n} Xi}(t) = ∏_{i=1}^{n} [λi/(λi − t)], for t < min{λ1, . . . , λn};

(b) E(∑_{i=1}^{n} Xi) = ∑_{i=1}^{n} 1/λi;

(c) V(∑_{i=1}^{n} Xi) = ∑_{i=1}^{n} 1/λi^2;

(d) CV(∑_{i=1}^{n} Xi) = √(∑_{i=1}^{n} 1/λi^2)/(∑_{i=1}^{n} 1/λi).14 •
Proposition 1.23 — Mixtures of independent Exponentials: Hyper-exponential distribution
X is said to have a Hyper-exponential distribution if its p.d.f. is given by

fX(x) = ∑_{i=1}^{n} pi fXi(x), (1.17)

where: Xi indep∼ Exponential(λi), i = 1, . . . , n, with λi ≠ λj whenever i ≠ j; pi > 0 and ∑_{i=1}^{n} pi = 1. The Hyper-exponential distribution is an example of a mixture density.15 It is going to be represented for short by X ∼ Hyper-exponential(λ1, . . . , λn; p1, . . . , pn). •
13 The remaining lifetime of a hypo-exponentially distributed item that has survived to age x is, for very large x, approximately that of an exponentially distributed r.v. with parameter equal to the minimum of the parameters of the r.v. which are the summands of the Hypo-exponential (Ross, 2003, p. 286).
14 CV(∑_{i=1}^{n} Xi) < 1 (http://en.wikipedia.org/wiki/Hypoexponential_distribution).
15 Its name is due to the fact that the coefficient of variation of this distribution is greater than that of the Exponential distribution, whose coefficient of variation is 1 (http://en.wikipedia.org/wiki/Hyperexponential_distribution).
To see how such a r.v. might arise, consider a factory responsible for the production
of n types of batteries, with a type i battery lasting for an exponentially distributed time
with parameter λi, i = 1, . . . , n. Suppose further that pi represents the proportion of
produced batteries of type i (i = 1, . . . , n). If a battery is randomly chosen from the daily
production, then its lifetime X will have a Hyper-exponential distribution (Ross, 2003, p.
278).
Exercise 1.24 — C.d.f. and failure rate function of the Hyper-exponential
distribution
Let X ∼ Hyper-exponential(λ1, . . . , λn; p1, . . . , pn). After having described an(other) example where the Hyper-exponential distribution arises, prove that:

(a) P(X > x) = ∑_{i=1}^{n} pi e^{−λi x}, x > 0;

(b) λX(x) = [∑_{i=1}^{n} pi λi e^{−λi x}]/[∑_{i=1}^{n} pi e^{−λi x}];

(c) lim_{x→+∞} λX(x) = min{λ1, . . . , λn} (interpret this result!).16 •
Exercise 1.25 — M.g.f., expected value, second order moment and coefficient
of variation of the Hyper-exponential distribution
Let X ∼ Hyper-exponential(λ1, . . . , λn; p1, . . . , pn). Prove that:

(a) MX(t) = E(e^{tX}) = ∑_{i=1}^{n} pi λi/(λi − t), t < min{λ1, . . . , λn};

(b) E(X) = ∑_{i=1}^{n} pi/λi;

(c) E(X^2) = ∑_{i=1}^{n} 2pi/λi^2;

(d) CV(X) = √[∑_{i=1}^{n} 2pi/λi^2 − (∑_{i=1}^{n} pi/λi)^2]/(∑_{i=1}^{n} pi/λi) > 1. •
16 As a randomly chosen item ages, its failure rate function converges to the failure rate of the exponential type with the smallest failure rate, which is intuitive because the longer the item lasts, the more likely it is to be of the item type with the smallest failure rate (Ross, 2003, p. 279).
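The fact that the Hyper-exponential coefficient of variation exceeds 1 (in contrast with the Hypo-exponential case) is easy to see by simulation; the parameters below are illustrative assumptions.

```python
import random
import statistics

# Simulation sketch of the Hyper-exponential mixture: with the
# illustrative parameters below, CV(X) should exceed 1 and
# E(X) should be close to sum p_i/lam_i = 0.3/0.5 + 0.7/5 = 0.74.
random.seed(17)
lams = (0.5, 5.0)
probs = (0.3, 0.7)

def draw():
    # pick a component type with probability p_i, then draw its lifetime
    lam = lams[0] if random.random() < probs[0] else lams[1]
    return random.expovariate(lam)

sample = [draw() for _ in range(300_000)]
mean_hat = statistics.mean(sample)
cv_hat = statistics.pstdev(sample) / mean_hat
print(mean_hat, cv_hat)
```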
Proposition 1.26 — Random sums of i.i.d. Exponentials
Let:

• Xi i.i.d.∼ Exponential(λ), i ∈ N;

• N ∼ Geometric(p);

and

S = ∑_{i=1}^{N} Xi. (1.18)

If N is independent of {Xi : i ∈ N} then

S ∼ Exponential(λp) (1.19)

(Kulkarni, 1995, p. 198). •
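Result (1.19) can be illustrated by simulating a Geometric(p) number (support 1, 2, . . .) of i.i.d. exponential terms; lam = 2.0 and p = 0.25 below are illustrative assumptions.

```python
import random
import statistics

# Monte Carlo sketch of Proposition 1.26: S should behave as an
# Exponential(lam*p) r.v., here with lam*p = 0.5, hence mean 2, variance 4.
random.seed(271828)
lam, p = 2.0, 0.25

def geometric(p):
    # number of Bernoulli(p) trials until the first success
    n = 1
    while random.random() >= p:
        n += 1
    return n

sums = [sum(random.expovariate(lam) for _ in range(geometric(p)))
        for _ in range(200_000)]
mean_hat = statistics.mean(sums)      # approx. 1/(lam*p) = 2.0
var_hat = statistics.pvariance(sums)  # approx. 1/(lam*p)**2 = 4.0
print(mean_hat, var_hat)
```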
Exercise 1.27 — Random sums of i.i.d. Exponentials
After describing an example where a random sum of i.i.d. Exponentials can arise, prove
result (1.19). •
Exercise 1.28 — Random sums of i.i.d. Exponentials (bis)17
A machine is subject to a series of randomly occurring shocks. Assume that: the times
(in hours) between consecutive shocks are i.i.d. r.v. with Exponential distribution with
common parameter λ = 1/10; each shock has the same probability p = 0.3 of breaking
the machine.
(a) Identify the distribution of the time (in hours) until the machine breaks down, S.
(b) Compute the expected value of S and the probability that the machine operates for
more than E(S) hours. •
17 This exercise was inspired by Kulkarni (1995, Example 5.5, pp. 198–199).
If we drop the i.d. assumption and let N be an integer r.v. taking values in {1, . . . , m} in
Proposition 1.26, then we end up with another interesting r.v., which is a mixture of
Hypo-exponential distributions.
Proposition 1.29 — Random sums of independent Exponentials: Coxian r.v.
Let:

• Xn indep∼ Exponential(λn), n = 1, . . . , m, and suppose that λi ≠ λj for all i ≠ j;

• N be an integer r.v. with p.f. pn = P(N = n), n = 1, . . . , m.

If N is independent of {Xn : n = 1, . . . , m} then ∑_{j=1}^{N} Xj is said to be a Coxian r.v. and its p.d.f. is given by

f_{∑_{j=1}^{N} Xj}(x) = ∑_{n=1}^{m} [pn ∑_{i=1}^{n} Ci,n × λi e^{−λi x}], (1.20)

where Ci,n = ∏_{j≠i} λj/(λj − λi) (Ross, 2003, p. 287). •
Coxian r.v. arise as follows (Ross, 2003, p. 287). Suppose an item goes through m
treatment stages and after each stage there is a probability r(n) = P(N = n | N ≥ n)
that the item will be considered unfit to proceed to the next treatment stage. Moreover,
admit that the times spent in each treatment are independent exponential r.v. and that
the probability that the item, having just completed treatment stage n, is considered unfit
to proceed to the next stage of the treatment program equals r(n), regardless of the time
the item took to go through the n stages. Then the total time spent in treatment
is a Coxian r.v.
Exercise 1.30 — Coxian r.v.
Derive:
(a) result (1.20);
(b) the expected value of a Coxian r.v. •
1.2 Poisson process: definitions
The importance of the Poisson process is undisputed as a model for counting events
occurring one at a time.
This section is essentially devoted to three alternate definitions of the Poisson process,
in terms of:
1. the inter-event time distribution;
2. characteristics of the increments of the counting process and the distribution of the
number of events in the interval (0, t];
3. characteristics of the increments of the counting process and the probability of the
occurrence of i (i = 1, 2) events in an interval of infinitesimal range, say (0, h].
Definition 1.31 — Poisson process; 1st. definition (Kulkarni, 1995, p. 199)
Let:

(i) {Xi : i ∈ N} be a sequence of r.v. representing the inter-event times;

(ii) S0 = 0;

(iii) Sn = ∑_{i=1}^{n} Xi be the time of the occurrence of the nth event;

(iv) N(t) = max{n ∈ N0 : Sn ≤ t}, t ≥ 0 — i.e., N(t) represents the number of events
that have taken place in the interval (0, t].

If Xi i.i.d.∼ Exponential(λ) then the counting process {N(t) : t ≥ 0} is said to be a Poisson
process with rate λ — for short,

{N(t) : t ≥ 0} ∼ PP(λ).

•
Example 1.32 — Sample path of a Poisson process (Kulkarni, 1995, pp. 199–200)
This is a typical path of a Poisson process with rate λ, {N(t) : t ≥ 0}:
Note that N(0) = 0 and N(t) jumps by one at t = Sn, n ∈ N, thus, it has piecewise
constant sample paths (Kulkarni, 1995, p. 199). •
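The constructive definition above (cumulative sums of i.i.d. exponential inter-event times) translates directly into a simulation; lam = 2.0 and t = 5.0 below are illustrative assumptions, and both the mean and variance of N(t) should be close to λt.

```python
import random
import statistics

# Simulating the sample-path construction of Definition 1.31:
# N(t) counts the event epochs S_1, S_2, ... falling in (0, t].
random.seed(42)
lam, t = 2.0, 5.0

def count_events(lam, t):
    s, n = 0.0, 0
    while True:
        s += random.expovariate(lam)   # next inter-event time
        if s > t:
            return n
        n += 1

counts = [count_events(lam, t) for _ in range(100_000)]
mean_hat = statistics.mean(counts)      # approx. lam*t = 10
var_hat = statistics.pvariance(counts)  # approx. lam*t = 10 (Poisson property)
print(mean_hat, var_hat)
```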
Quiz 1.33 — Poisson process; 1st. definition
What is the distribution of Sn? •
Exercise 1.34 — Poisson process; 1st. definition (Ross, 2003, pp. 294–295)
Suppose that people immigrate into a territory according to a Poisson process with rate
λ = 1 person per day.
(a) What is the expected time until the 10th immigrant arrives to the territory?
(b) What is the probability that the elapsed time between the 10th and the 11th arrival
exceeds two days? •
Exercise 1.35 — Distribution of N(t) (Kulkarni, 1995, pp. 200–201)
Prove that N(t) ∼ Poisson(λt), for any fixed t ≥ 0, by capitalizing on the fact that the
nth event will occur prior to or at time t iff the number of events occurring by time t is at
least n (Ross, 2003, p. 294), i.e.,

N(t) ≥ n ⇔ Sn ≤ t. (1.21)

•
Exercise 1.36 — Distribution of N(t) (bis)
Suppose that customers arrive at a 24/7 shop according to a PP (λ), with λ = 10
customers per hour.18
(a) What is the distribution and the expected value of the number of arrivals in the
interval (0, 8]?
18 Inspired by Kulkarni (1995, p. 201).
(b) What is the probability that nobody arrives in the first 6 minutes? •
Definition 1.37 — Poisson process; 2nd. definition (Karr, 1993, p. 91; Kulkarni,
1995, p. 203)
A counting process {N(t) : t ≥ 0} is said to be a Poisson process with rate λ if:

(i) {N(t) : t ≥ 0} has independent and stationary increments;19

(ii) N(t) ∼ Poisson(λt). •
Remark 1.38 — Poisson process (Karr, 1993, p. 91)
Actually, N(t) ∼ Poisson(λt) follows from the fact that {N(t) : t ≥ 0} has independent
and stationary increments (see the proof in Billingsley, 2012, Section 23, pp. 297–309),
and is thus redundant in Definition 1.37. •
The independent and stationary increments make the calculations tractable (!!!) when
we are dealing with the Poisson process.
Example/Exercise 1.39 — Capitalizing on independent and stationary
increments
Admit that the times between any consecutive birth notifications are i.i.d. r.v. with
Exponential distribution with expected value equal to 2 hours.
(a) Determine the distribution, the expected value and the coefficient of variation of the
yearly number of birth notifications.
(b) Obtain the probability that there are no birth notifications in one day.
(c) Calculate the probability of 100 birth notifications in 3 days given than in the first 2
of those 3 days there were 80 birth notifications.
19 Recall that: {N(t) : t ≥ 0} has independent increments iff the r.v. N(t1), N(t2) − N(t1), . . . , N(tn) − N(tn−1) are independent, for 0 < t1 < · · · < tn; {N(t) : t ≥ 0} has stationary increments iff, for any fixed t ≥ 0, the distribution of the increment N(t + s) − N(s) is the same for all s ≥ 0 (Karr, 1993, p. 91).
• Stochastic process

{N(t) : t ≥ 0} ∼ PP(λ)

• R.v.

N(t) = number of birth notifications in (0, t]
N(t) ∼ Poisson(λt), where λ = 0.5 birth notifications per hour and t is in hours

• Requested probability

P(100 notif. in 3 days | 80 notif. in the first 2 of those 3 days)
= P[N(3 × 24) = 100 | N(2 × 24) = 80]
= P[N(3 × 24) = 100, N(2 × 24) = 80] / P[N(2 × 24) = 80]
= P[N(2 × 24) = 80, N(3 × 24) − N(2 × 24) = 100 − 80] / P[N(2 × 24) = 80]
(indep. incr.) = P[N(2 × 24) = 80] × P[N(3 × 24) − N(2 × 24) = 20] / P[N(2 × 24) = 80]
= P[N(3 × 24) − N(2 × 24) = 20]
(station. incr.) = P[N(3 × 24 − 2 × 24) = 20]
= P[N(24) = 20]
= e^{−0.5×24} (0.5 × 24)^20/20!
≈ 0.00968.

•
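The final number in the computation above is a Poisson(12) probability evaluated at 20, which can be double-checked directly:

```python
import math

# Double-checking the last step of Example 1.39: with lam = 0.5
# notifications per hour, P[N(24) = 20] is a Poisson(12) p.f. value at 20.
mu = 0.5 * 24                                    # = 12
p = math.exp(-mu) * mu**20 / math.factorial(20)
print(round(p, 5))  # approx. 0.00968
```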
Exercise 1.40 — Capitalizing on independent and stationary increments (bis)
A machine produces electronic components according to a Poisson process with rate equal
to 10 components per hour. Let N(t) be the number of produced components up to time
t.
Evaluate the probability of producing at least 8 components in the first hour given
that exactly 20 components have been produced in the first two hours. •
Exercise 1.41 — Distribution of N(t) and Sn (Pacheco, 2002, Example 19, p. 41)
Orders for laptops arrive at a computer manufacturing facility according to a Poisson
process with rate 0.5 orders per hour.
Compute the probability that more than 7 orders for laptops are received over a 4
hour period and the probability that the third laptop order arrives in the second hour of
operation. •
Exercise 1.42 — Distribution of N(t) (Ross, 2003, Exercises 37 and 38, p. 41)
Cars pass a certain point on the highway in accordance with a Poisson process with rate
λ = 3 cars per minute.

(a) If Harry blindly runs across the highway at that specific point, what is the
probability that he will be uninjured if the amount of time it takes him to cross the
highway is s seconds?20
Obtain and comment on the results for s = 2, 5, 10, 20.
(b) Now, suppose Harry is agile enough to escape from a single car, but if he encounters at
least two cars while attempting to cross the highway at that point he will be injured.
What is the probability that Harry will be unhurt if he takes s = 5, 10, 20, 30 seconds
to cross the highway? •
Exercise 1.43 — Capitalizing on independent and stationary increments; mean
and autocovariance functions of the Poisson process
Let N(t) : t ≥ 0 ∼ PP (λ).
(a) Obtain E[N(t)].
Is N(t) : t ≥ 0 a first order weakly stationary process?
(b) Verify that Cov(N(t), N(t+ s)) = λt and Cov(N(t), N(s)) = λmint, s, for t, s ≥ 0.
Are we dealing with a second order weakly stationary process?
20 Admit that if Harry is crossing the highway when a car passes by, then he will be injured.
(c) Define the r.v. E [N(t+ s) | N(t)]. •
The previous exercises suggest that we can capitalize on the fact that the
Poisson process has independent and stationary increments to derive the joint p.f. of
N(t1), . . . , N(tn).
Proposition 1.44 — Joint p.f. of N(t1), . . . , N(tn) in a Poisson process (Karr, 1993, p. 91)
Let {N(t) : t ≥ 0} ∼ PP(λ). Then, for 0 < t1 < · · · < tn and 0 = k0 ≤ k1 ≤ · · · ≤ kn,

P[N(t1) = k1, . . . , N(tn) = kn] = ∏_{j=1}^{n} e^{−λ(tj − tj−1)} [λ(tj − tj−1)]^{kj − kj−1}/(kj − kj−1)!, (1.22)

where t0 = 0 and k0 = 0. •
Exercise 1.45 — Joint p.f. of N(t1), . . . , N(tn) in a Poisson process
Prove Proposition 1.44 by taking advantage, namely, of the fact that a Poisson process
has independent and stationary increments.21 •
Exercise 1.46 — Joint p.f. of N(t1), . . . , N(tn) in a Poisson process (bis) (Kulkarni,
1995, p. 205)
Consider the customer arrival process described in Exercise 1.36 and determine the
probability that one customer arrives between 1:00PM and 1:06PM and two customers
arrive between 1:03PM and 1:12PM. •
Exercise 1.47 — Joint p.f. of N(t1), . . . , N(tn) in a Poisson process (bis, bis)
Let {N(t) : t ≥ 0} ∼ PP(λ).

(a) Prove that (N(s) | N(t) = n) ∼ Binomial(n, s/t), for 0 < s < t.

(b) Obtain E[N(t) | N(t + s)]. •
21 See, for example, Karr (1993, p. 92) or Kulkarni (1995, p. 204).
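The binomial conditional law in part (a) can be seen in a simulation: among simulated paths with N(t) = n events, count how many events fall in (0, s]. The values lam = 1.0, t = 10.0, s = 4.0, n = 8 below are illustrative assumptions.

```python
import random
import statistics

# Monte Carlo sketch of (N(s) | N(t) = n) ~ Binomial(n, s/t):
# conditional mean should be close to n*s/t = 8*0.4 = 3.2.
random.seed(1234)
lam, t, s, n = 1.0, 10.0, 4.0, 8
kept = []
for _ in range(200_000):
    epochs, clock = [], 0.0
    while True:
        clock += random.expovariate(lam)
        if clock > t:
            break
        epochs.append(clock)
    if len(epochs) == n:                     # condition on N(t) = n
        kept.append(sum(e <= s for e in epochs))

mean_hat = statistics.mean(kept)
print(len(kept), mean_hat)  # conditional mean approx. 3.2
```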
As we previously mentioned, the Poisson process can be alternatively defined in terms
of the characteristics of the increments of this counting process and the probability of the
occurrence of i (i = 1, 2) events in an interval of infinitesimal range.
Definition 1.48 — Poisson process; 3rd. definition (Ross, 1989, p. 212)
The counting process {N(t) : t ≥ 0} is said to be a Poisson process with rate λ if:

(i) N(0) = 0;

(ii) {N(t) : t ≥ 0} has independent and stationary increments;

(iii) P[N(h) = 1] = λh + o(h);22

(iv) P[N(h) ≥ 2] = o(h). •
Remark 1.49 — Poisson process; 3rd. definition (Ross, 2003, p. 292)
The explicit assumption that the process {N(t) : t ≥ 0} has stationary increments can be
eliminated (from Definition 1.48), as long as assumptions (ii), (iii) and (iv) in Definition
1.48 are replaced by:

(ii′) {N(t) : t ≥ 0} has independent increments;

(iii′) P[N(t + h) − N(t) = 1] = λh + o(h);

(iv′) P[N(t + h) − N(t) ≥ 2] = o(h). •
Exercise 1.50 — Poisson process; 2nd. and 3rd. definitions
Prove that Definitions 1.37 and 1.48 are equivalent.23
22 A function f is said to be o(h) if lim_{h→0} f(h)/h = 0 (Ross, 1989, p. 211). For instance, f(x) = x^2 is o(h) since lim_{h→0} f(h)/h = lim_{h→0} h = 0, whereas f(x) = x is not o(h) since lim_{h→0} f(h)/h = lim_{h→0} 1 = 1 ≠ 0.
23 See Ross (1989, pp. 212–214) or Ross (2003, pp. 291–292).
Exercise 1.51 — More on the Poisson process (Hastings, 2001, Example 2, pp.
125–126)
Suppose that cars travelling on an expressway arrive at a toll station according to a
Poisson process with rate λ = 5 cars per minute. Let N(t) be the number of cars that
arrive to the toll station in (0, t].
(a) Calculate P [N(2) > 8].
(b) Obtain P [N(1) < 5 | N(0.5) > 2].
(c) Assuming that λ is no longer known, find the largest possible arrival rate λ such that
the probability of 8 or more arrivals in the first minute does not exceed 0.1. •
Exercise 1.52 — More on the Poisson process (bis) (Hastings, 2001, Activity 5
and examples 2 and 3, pp. 268–269)
Admit that users of a computer lab arrive according to a Poisson process with rate λ = 15
users per hour. Let N(t) be the number of users that arrive to the computer lab in (0, t]
and eventually use Mathematica to answer the following questions.
(a) What is the standard deviation of the number of arrivals to the computer lab in a 2
hour period?
(b) Obtain the probability that there are 5 or fewer arrivals in the first half hour.
(c) Calculate the probability that the 3rd. arrival occurs somewhere between the first 10
and the first 20 minutes.
(d) Determine the probability that 10 or fewer users arrived in the first hour.
(e) What is the probability that the second user arrives within the first 6 minutes?
(f) Now assume that λ is no longer known and find the smallest value of the arrival rate
λ such that the probability of having at least 5 arrivals in the first hour is at least
0.95. •
1.3 Event times in Poisson processes
Definition 1.31 singles out important r.v. in a homogeneous Poisson process.
Remark 1.53 — Important r.v. in a Poisson process (Karr, 1993, pp. 88–89, 92–93)
Let {N(t) : t ≥ 0} be a Poisson process with rate λ. Then:

• Sn = inf{t : N(t) = n} represents the time of the occurrence of the nth event (e.g.
arrival), n ∈ N; S0 = 0;

• Xn = Sn − Sn−1 corresponds to the time between the (n − 1)th and the nth events (e.g.
interarrival time), n ∈ N.

We also know that N(t) ∼ Poisson(λt), t > 0, and:

• Sn ∼ Erlang(n, λ), n ∈ N;

• Xn i.i.d.∼ Exponential(λ), n ∈ N. •
Remark 1.54 — Relating N(t) and Sn in a Poisson process
Let us remind the reader that N(t) ≥ n ⇔ Sn ≤ t. Thus,

FSn(t) = FErlang(n,λ)(t) = P[N(t) ≥ n] = ∑_{j=n}^{+∞} e^{−λt} (λt)^j/j! = 1 − FPoisson(λt)(n − 1), n ∈ N. (1.23)

Hence, values of the c.d.f. of a r.v. with an Erlang distribution, such as Sn, may be obtained
using (tables with) values of the Poisson c.d.f. •
The next proposition provides an expression for the probability that the nth event in
one Poisson process occurs before the mth event in a second and independent Poisson
process.
Proposition 1.55 — Comparing event times of two independent Poisson processes (Ross, 2003, pp. 300–301)
Let:

• {N1(t) : t ≥ 0} and {N2(t) : t ≥ 0} be two independent Poisson processes with rates
λ1 and λ2 (respectively);

• S_n^(1) denote the time of the nth event of the first PP;

• S_m^(2) denote the time of the mth event of the second PP.

Then

P[S_n^(1) < S_m^(2)] = ∑_{k=n}^{n+m−1} C(n + m − 1, k) [λ1/(λ1 + λ2)]^k [λ2/(λ1 + λ2)]^{n+m−1−k}
= 1 − FBinomial(n+m−1, λ1/(λ1+λ2))(n − 1). (1.24)

•
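Formula (1.24) can be cross-checked against a direct simulation of the two Erlang event times; lam1 = 2.0, lam2 = 4.0, n = 1 and m = 2 below are illustrative assumptions (echoing the setting of Exercise 1.57).

```python
import math
import random

# Checking formula (1.24) against simulation.
random.seed(8)
lam1, lam2, n, m = 2.0, 4.0, 1, 2
q = lam1 / (lam1 + lam2)

# right-hand side of (1.24): a Binomial(n+m-1, q) tail probability
p_formula = sum(math.comb(n + m - 1, k) * q**k * (1 - q)**(n + m - 1 - k)
                for k in range(n, n + m))

# direct simulation of the two independent Erlang event times
runs = 300_000
hits = sum(sum(random.expovariate(lam1) for _ in range(n)) <
           sum(random.expovariate(lam2) for _ in range(m))
           for _ in range(runs))
p_sim = hits / runs
print(p_formula, p_sim)  # both approx. 1 - (2/3)**2 = 5/9
```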
Exercise 1.56 — Comparing event times of two independent PP
Consider the setting of Proposition 1.55.

(a) Show that P[S_1^(1) < S_1^(2)] = λ1/(λ1 + λ2), without using result (1.24).24

(b) Argue that P[S_2^(1) < S_1^(2)] = [λ1/(λ1 + λ2)]^2.25

(c) Prove (1.24), using the fact that S_n^(1) ∼ Erlang(n, λ1), S_m^(2) ∼ Erlang(m, λ2) and

FNegativeBinomial(r,p)(x) = ∑_{i=r}^{x} C(i − 1, r − 1) (1 − p)^{i−r} p^r = 1 − FBinomial(x,p)(r − 1) = FBinomial(x,1−p)(x − r).

•
24 Hint (Ross, 2003, pp. 300–301): S_1^(1) ∼ Exponential(λ1) is independent of S_1^(2) ∼ Exponential(λ2).
25 Hint (Ross, 2003, p. 301): Use result (a) and the lack of memory property of Poisson processes.
Exercise 1.57 — Comparing event times of two independent PP (bis)
Men and women enter a supermarket according to two independent Poisson processes
having respective rates two and four per minute.
Compute the probability that the first male customer arrives before the arrival of the
second female customer. •
Exercise 1.58 — Joint p.d.f. of event times of a Poisson process
Obtain the joint p.d.f. of S1, S2, S3.26 •
Suppose we are told that exactly one event of a Poisson process has taken place by
time t (i.e., N(t) = 1), and we are asked to determine the distribution of the time S1 at
which the event occurred — since the Poisson process possesses stationary and independent
increments, it is in fact reasonable that each interval in (0, t] of equal length should have
the same probability of containing the event (Ross, 2003, p. 301).
Proposition 1.59 — Conditional distribution of the first event time (Ross, 1989,
p. 223)
Let {N(t) : t ≥ 0} be a Poisson process with rate λ > 0. Then

(S1 | N(t) = 1) ∼ Uniform(0, t). (1.25)

•
Exercise 1.60 — Conditional distribution of the first event time
Prove Proposition 1.59 (Ross, 1989, p. 223; Ross, 2003, p. 302). •
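Proposition 1.59 can be visualized by simulation: among runs of a PP(λ) with exactly one event in (0, t], the event time S1 should behave as a Uniform(0, t) r.v. (mean t/2, variance t²/12). The values lam = 1.0 and t = 1.0 below are illustrative assumptions.

```python
import random
import statistics

# Monte Carlo sketch of Proposition 1.59, keeping only the runs
# with exactly one event in (0, t].
random.seed(321)
lam, t = 1.0, 1.0
s1_given_one = []
for _ in range(200_000):
    first = random.expovariate(lam)
    if first > t:
        continue                       # no event in (0, t]
    second = first + random.expovariate(lam)
    if second > t:                     # exactly one event in (0, t]
        s1_given_one.append(first)

mean_hat = statistics.mean(s1_given_one)      # approx. t/2 = 0.5
var_hat = statistics.pvariance(s1_given_one)  # approx. t**2/12
print(mean_hat, var_hat)
```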
Proposition 1.59 can be generalized and the joint distribution of event times S1, . . . , Sn,
given that exactly n events took place in (0, t], can be obtained (Kulkarni, 1995, p. 209).
26 Hint: Rewrite S1, S2, S3 in terms of X1, X2, X3 and capitalize on the fact that these r.v. are
independent and exponentially distributed.
41
Proposition 1.61 — Conditional distribution of the event times (Kulkarni, 1995,
pp. 208–209)
Let:
• N(t) : t ≥ 0 ∼ PP (λ);
• Sn the time of the nth event (n ∈ N);
• Y_i i.i.d. ∼ Uniform(0, t), i = 1, . . . , n.
Then
(S1, . . . , Sn | N(t) = n) ∼ (Y(1), . . . , Y(n)), (1.26)
that is,
f_{S1,...,Sn | N(t)=n}(s1, . . . , sn) = n!/t^n, (1.27)
for 0 < s1 < · · · < sn < t and n ∈ N. •
Remark 1.62 — Conditional distribution of the event times (Ross, 1989, p. 224)
Proposition 1.61 is usually paraphrased as stating that, under the condition that n events
have occurred in (0, t], the times S1, . . . , Sn at which events occur behave as the order
statistics Y(1), . . . , Y(n) associated with Y_i i.i.d. ∼ Uniform(0, t).
This result is a particular case of Campbell's theorem. For the statement of this
theorem, please refer to http://www.stats.gla.ac.uk/glossary/?q=node/43. •
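Proposition 1.61 lends itself to a quick Monte Carlo check: condition a simulated PP(λ) path on N(t) = n (by rejection) and compare the average first event time with E[Y(1)] = t/(n + 1), the mean of the minimum of n i.i.d. Uniform(0, t) r.v. (see footnote 27). A minimal sketch; the parameters λ = 1, t = 3, n = 3 are illustrative, not from the text:

```python
import random

def arrival_times_given_n(lam, t, n, rng):
    """Sample PP(lam) arrival times on (0, t], retrying until the path
    has exactly n events in (0, t] (rejection sampling of N(t) = n)."""
    while True:
        times, s = [], rng.expovariate(lam)
        while s <= t:
            times.append(s)
            s += rng.expovariate(lam)
        if len(times) == n:
            return times

rng = random.Random(1)
lam, t, n, reps = 1.0, 3.0, 3, 20000
mean_s1 = sum(arrival_times_given_n(lam, t, n, rng)[0] for _ in range(reps)) / reps
# Proposition 1.61 plus the order-statistics moments in footnote 27 give
# E[S_1 | N(3) = 3] = 1 * 3 / (3 + 1) = 0.75; mean_s1 should be close to that.
```

Replacing the index 0 by k − 1 checks E[Sk | N(t) = n] = kt/(n + 1) in the same way.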
Exercise 1.63 — Conditional distribution of the event times
Prove Proposition 1.61 (Ross, 1989, p. 224; Ross, 2003, p. 303; Kulkarni, 1995, pp. 209–
210).27 •
^27 Recall the following results. Let X_i i.i.d. ∼ X, i = 1, . . . , n, where X is a continuous r.v. with p.d.f.
f_X(x) and c.d.f. F_X(x). Then (Rohatgi, 1976, pp. 150–152):
f_{X(1),...,X(n)}(x(1), . . . , x(n)) = n! × ∏_{i=1}^{n} f_X(x(i)), for x(1) ≤ · · · ≤ x(n);
F_{X(i)}(x) = 1 − F_Binomial(n, F_X(x))(i − 1), for i = 1, . . . , n;
f_{X(i)}(x) = [n!/((i−1)! (n−i)!)] × [F_X(x)]^{i−1} × [1 − F_X(x)]^{n−i} × f_X(x), for i = 1, . . . , n.
Moreover, when X ∼ Uniform(0, t) we get E[X(k)] = kt/(n+1) (Kulkarni, 1995, p. 209).
Exercise 1.64 — Conditional distribution of the event times (Kulkarni, 1995,
examples 5.10 and 5.11, pp. 210–212)
(a) Compute P [S1 > s | N(t) = n].
(b) Find E[Sk | N(t) = n], for k = 1, . . . , n, and also for k = n + 1, n + 2, . . .
(c) Obtain E[S1 | N(t) = 2, S1 ≤ 4, S2 ≤ 10], t ≥ 10. •
1.4 Merging and splitting Poisson processes
The operation of merging two counting processes to generate a new process is also called
superposition (Kulkarni, 1995, p. 214).
Suppose we merge two independent PP. Is the combined process another PP?
Yes!
Proposition 1.65 — Merging independent Poisson processes (Kulkarni, 1995, p.
214)
Let N1(t), t ≥ 0 and N2(t), t ≥ 0 be two independent Poisson processes with rates
λ1 and λ2, respectively. Then the merged process N(t) = N1(t) + N2(t), t ≥ 0 is a
Poisson process with rate λ1 + λ2.
•
Kulkarni (1995, Theorem 5.5, p. 214) generalizes Proposition 1.65 to the superposition
of r independent Poisson processes.
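Proposition 1.65 can also be checked numerically: superpose two independent simulated PP and verify that the merged count over (0, t] has Poisson mean and variance λ1 + λ2. A minimal sketch; the rates 2 and 3 are illustrative:

```python
import random

def pp_count(lam, t, rng):
    """Number of events of a PP(lam) in (0, t], built from i.i.d. exponential gaps."""
    n, s = 0, rng.expovariate(lam)
    while s <= t:
        n += 1
        s += rng.expovariate(lam)
    return n

rng = random.Random(7)
lam1, lam2, t, reps = 2.0, 3.0, 1.0, 100000
merged = [pp_count(lam1, t, rng) + pp_count(lam2, t, rng) for _ in range(reps)]
mean = sum(merged) / reps
var = sum((m - mean) ** 2 for m in merged) / reps
# Proposition 1.65: the merged count should look Poisson(lam1 + lam2) = Poisson(5),
# so both the sample mean and the sample variance should be close to 5.
```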
Quiz 1.66 — Merging independent Poisson processes
Merging (or superposition) of Poisson processes arises, for example, when customers arrive
at a service facility from different sources — each source generating a Poisson stream
(Kulkarni, 1995, p. 214).
Give more examples of the superposition of PP. •
Exercise 1.67 — Merging two independent Poisson processes
(a) Prove Proposition 1.65 using Definition 1.31 or 1.37.
(b) Show that the probability that the first event in the merged process comes from the
first process is equal to λ1/(λ1 + λ2). •
Exercise 1.68 — Merging two independent PP; comparing event times of two
independent PP (Pacheco, 2002, p. 45)
Suppose that a manufacturing facility of Exercise 1.41 also produces desktops and that
the orders of desktops arrive according to a Poisson process, with rate equal to 1 desktop
per hour and independent of the process of orders of laptops.
(a) Obtain the probability that the total number of orders does not exceed 2 in the
interval (5, 8].
(b) Compute the probability that the 3rd desktop is ordered before the 2nd laptop is
ordered. •
Exercise 1.69 — Merging two independent PP; comparing event times of two
independent PP (bis)
Men and women enter a supermarket according to two independent Poisson processes
having respective rates two and four per minute.
(a) What is the probability that the number of arrivals (men and women) exceeds ten in
the first 20 minutes?
(b) Starting at an arbitrary time, compute the probability that the second man arrives
before the third woman arrives (Ross, 1989, Exercise 20, p. 242). •
Exercise 1.70 — Merging more than two independent Poisson process (bis)
(Kulkarni, 1995, Example 5.14, pp. 215–216)
Jobs are submitted by 4 distinct and independent sources for execution on a central
computer. The jobs arrive from source i according to a Poisson process with rate
λ_i = 1/10, 1/15, 1/30, 1/60 jobs per minute, for i = 1, . . . , 4, respectively.
(a) Let N(t) be the total number of jobs submitted for execution up to time t.
Characterize the stochastic process N(t) : t ≥ 0.
(b) What is the probability that no jobs arrive in a 10-minute interval?
(c) Obtain:
(i) P [N(10) = 5 | N(5) = 2];
(ii) P [N(5) = 2 | N(10) = 5];
(iii) P [N(10) < 6 | N(5) > 3]. •
Exercise 1.71 — Sampling a Poisson (process)
The number of signals emitted by a source in (0, t], say N(t), has a Poisson(λt)
distribution. Suppose that each signal is recorded by a receptor with probability p,
regardless of the remaining signals. Let N1(t) (resp. N(t) − N1(t)) be the number of
signals recorded (resp. unrecorded) by the receptor up to time t.

(a) Obtain the p.f. of N1(t) (resp. N(t) − N1(t)) conditional on N(t) = n and derive the
marginal distribution of N1(t) (resp. N(t) − N1(t)).

(b) Are (N1(t) | N(t) = n) and (N(t) − N1(t) | N(t) = n) (conditionally) independent r.v.?
(c) What about N1(t) and N(t)−N1(t)? Are they independent r.v.? •
The operation of generating two counting processes out of a single counting process is
called splitting (Kulkarni, 1995, p. 214).28
Are the two processes resulting from splitting a Poisson process also PP?
Yes!
Proposition 1.72 — Splitting a Poisson process (or sampling a Poisson
process) (Ross, 1989, p. 217)
Let N(t) : t ≥ 0 be a Poisson process with rate λ. Splitting the original Poisson process
based on a selection probability p yields two independent Poisson processes with rates
λp and λ(1− p).
We can also add that:
(N1(t)|N(t) = n) ∼ Binomial(n, p); (1.28)
(N2(t)|N(t) = n) ∼ Binomial(n, 1− p). (1.29)
•
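Proposition 1.72 can be illustrated by simulation: generate a PP(λ) on (0, t] and tag each event independently with probability p. A sketch; λ = 10 and p = 0.3 are illustrative choices:

```python
import random

rng = random.Random(3)
lam, p, t, reps = 10.0, 0.3, 1.0, 50000
pairs = []
for _ in range(reps):
    n, s = 0, rng.expovariate(lam)
    while s <= t:
        n += 1
        s += rng.expovariate(lam)
    n1 = sum(1 for _ in range(n) if rng.random() < p)  # Bernoulli(p) selection
    pairs.append((n1, n - n1))
m1 = sum(a for a, _ in pairs) / reps
m2 = sum(b for _, b in pairs) / reps
cov = sum((a - m1) * (b - m2) for a, b in pairs) / reps
# Proposition 1.72: E[N1(t)] = lam*p*t = 3, E[N2(t)] = lam*(1-p)*t = 7, and the
# sample covariance should be near 0, in line with independence of the split processes.
```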
Exercise 1.73 — Splitting a Poisson process
Prove Proposition 1.72 (Ross, 1989, pp. 218–219). •
Example 1.74 — Splitting a Poisson process (Ross, 1989, Example 3c, p. 220)
If immigrants to area A arrive at a Poisson rate of ten per week, and if each immigrant is
of English descent with probability 1/12 (independently of the remaining immigrants), then
what is the probability that no people of English descent will immigrate to area A during
the month of February ?
^28 Splitting is obviously the opposite of superposition (Kulkarni, 1995, p. 217).
• Stochastic process
N(t) : t ≥ 0 ∼ PP (λ)
• R.v.
N(t) = number of immigrants to area A up to time t
N(t) ∼ Poisson(λt)
λ = 10 people per week
• Split process
N1(t) : t ≥ 0 ∼ PP (λp)
• R.v.
N1(t) = number of immigrants to area A with English descent up to time t
N1(t) ∼ Poisson(λpt)
p = P(selecting immigrant of English descent) = 1/12
λp = 10/12 = 5/6 immigrants of English descent per week
• Requested probability
P[N1(4) = 0] = e^{−(5/6)×4}
             = e^{−10/3}.
•
Exercise 1.75 — Splitting a Poisson process (bis) (Pacheco, 2002, Example 20, p.
43)
Suppose that in Exercise 1.41 each order is correctly processed with probability 0.98,
independently of other orders.
Compute the probability that at least one order is not correctly processed in the first
24h of operation. •
Exercise 1.76 — Splitting a Poisson process (bis, bis) (Kulkarni, 1995, Example,
5.16, pp. 219–220)
Suppose radioactive particles arrive to a Geiger counter according to a Poisson process
having rate λ = 103 particles per second and the counter fails to register a particle with
probability 0.1, independent of everything else.
What is the probability that the total number of radioactive particles that arrived at
the Geiger counter is greater than 5 in one-hundredth of a second, given that in the
same time interval the Geiger counter registered exactly 4 radioactive particles? •
Exercise 1.77 — Splitting a Poisson process (bis, bis, bis)
Inquiries arrive at a recorded message device according to a Poisson process of rate 15
inquiries per minute.
(a) Find the probability that in a one-minute period, 3 inquiries arrive during the first
10 seconds and 2 inquiries arrive during the last 15 seconds.
(b) Admit that 25% of those inquiries are actually complaints. If 10 inquiries have
arrived to the recorded message device in a one-minute period, what is the probability
that at least 3 of those 10 inquiries are complaints? •
Exercise 1.78 — More on splitting a Poisson process (Ross, 1989, p. 243, Exercise
23)
Cars pass a point on a highway at a Poisson rate of one per minute. If five percent of the
cars on the road are Dodges, then:
(a) What is the probability that at least one Dodge passes during an hour?
(b) If 50 cars have passed by an hour, what is the probability that five of them were
Dodges?
(c) Given that ten Dodges have passed by in an hour, obtain the expected value of the
number of cars to have passed by in that time. •
We could also be interested in studying the nature of the split processes Ni(t) : t ≥ 0,
i = 1, . . . , r, resulting from the classification of events into r distinct types.

The process of classification is called the splitting mechanism; we are going to focus on
the Bernoulli splitting mechanism, under which each event is classified as a type i event
with probability pi — independent of every other event — where pi > 0 and Σ_{i=1}^{r} pi = 1
(Kulkarni, 1995, p. 218).
Proposition 1.79 — Splitting a Poisson process in r processes (Kulkarni, 1995,
pp. 218–219)
Let N(t) : t ≥ 0 ∼ PP (λ) and Ni(t) : t ≥ 0, i = 1, . . . , r, be the split processes
generated by the Bernoulli splitting mechanism. Then
Ni(t) : t ≥ 0 ∼ PP (λ× pi). (1.30)
Moreover, the r processes Ni(t) : t ≥ 0 are independent PP and
(N1(t), . . . , Nr(t)|N(t) = n) ∼ Multinomial(n, (p1, . . . , pr)). (1.31)
•
Exercise 1.80 — Splitting a Poisson process in r processes
Prove Proposition 1.79 (Kulkarni, 1995, p. 219). •
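The r-way Bernoulli splitting of Proposition 1.79 can be checked the same way: each event of a simulated PP(λ) is classified into one of r = 3 types. A sketch; λ = 6, t = 2 and the probabilities p_i below are illustrative:

```python
import random

rng = random.Random(11)
lam, t, probs, reps = 6.0, 2.0, [0.5, 0.3, 0.2], 40000
totals = [0, 0, 0]
for _ in range(reps):
    n, s = 0, rng.expovariate(lam)
    while s <= t:
        n += 1
        s += rng.expovariate(lam)
    for _ in range(n):               # Bernoulli splitting mechanism: classify each event
        u, i, acc = rng.random(), 0, probs[0]
        while u >= acc and i < len(probs) - 1:
            i += 1
            acc += probs[i]
        totals[i] += 1
means = [tot / reps for tot in totals]
# Proposition 1.79: E[N_i(t)] = lam * p_i * t, i.e. about [6.0, 3.6, 2.4] here.
```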
Can we consider a Bernoulli splitting of a PP in two processes such that p depends on
the time at which the event took place?
Yes!
In this case we are dealing with what is called a non-homogeneous Bernoulli splitting
mechanism (Kulkarni, 1995, p. 220).
Definition 1.81 — Non-homogeneous Bernoulli splitting mechanism (Kulkarni,
1995, p. 220)
Let:
• p : R_0^+ → [0, 1] be a pre-specified function;
• N(t) : t ≥ 0 ∼ PP (λ).
Under the non-homogeneous Bernoulli splitting mechanism an event that took place at
time s is registered with probability p(s), regardless of the remaining events. •
Proposition 1.82 — Non-homogeneous Bernoulli splitting (Kulkarni, 1995, p.
220)
Consider:
• N(t) : t ≥ 0 ∼ PP (λ);
• a non-homogeneous Bernoulli splitting mechanism associated to a pre-specified
function p : R_0^+ → [0, 1];
• N1(t) the number of registered events during (0, t] under the non-homogeneous
Bernoulli splitting mechanism.
Then
N1(t) ∼ Poisson(λ ∫_0^t p(s) ds). (1.32)
•
Exercise 1.83 — Non-homogeneous Bernoulli splitting
Prove Proposition 1.82 (Kulkarni, 1995, pp. 220–221). •
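Proposition 1.82 can be checked numerically: register each simulated event occurring at time s with probability p(s), and compare the registered count with a Poisson r.v. of mean λ ∫_0^t p(s) ds. A sketch with the illustrative choices λ = 8, t = 2 and p(s) = s/t, for which λ ∫_0^2 p(s) ds = 8:

```python
import random

rng = random.Random(5)
lam, t, reps = 8.0, 2.0, 40000
counts = []
for _ in range(reps):
    n1, s = 0, rng.expovariate(lam)
    while s <= t:
        if rng.random() < s / t:   # register the event at time s with prob p(s) = s/t
            n1 += 1
        s += rng.expovariate(lam)
    counts.append(n1)
mean = sum(counts) / reps
var = sum((c - mean) ** 2 for c in counts) / reps
# Proposition 1.82: N1(2) ~ Poisson(lam * integral of s/2 over (0, 2]) = Poisson(8),
# so both sample mean and sample variance should be close to 8.
```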
Exercise 1.84 — Non-homogeneous Bernoulli splitting (Ross, 2003, p. 308)
Let us suppose that individuals contract HIV in accordance with a Poisson process having
rate λ. Suppose:
• the incubation period of the HIV (i.e., the time elapsed from exposure to the HIV
until the individual shows the first symptoms and signs) is a r.v. with distribution
G;
• the incubation periods of the HIV in different infected individuals are i.i.d. r.v.
What is the distribution of N1(t), the number of individuals who have shown symptoms
by time t? •
Exercise 1.85 — Non-homogeneous Bernoulli splitting (bis) (Kulkarni, 1995, pp.
221–222)
Suppose users arrive at a public library according to a Poisson process with rate λ. Admit
that the amount of time a user spends in the library is a r.v. with c.d.f. G and independent
of the times spent by the other users in the library.
Assuming that the library opened at time 0 and never closes:
(a) What is the distribution and the expected value of the number of users in the library
at time t?
(b) Obtain the probability that no users are in the library at time t. •
1.5 Non-homogeneous Poisson process
Assuming a constant arrival rate is rather unrealistic! Therefore it is pertinent to ask
ourselves whether a counting process, obtained by allowing the arrival rate at time t to be a
function of t, is easily manageable.

Yes, it is!
When λ is replaced with a non-negative function λ(t), we are dealing with the first of the
three generalizations of the Poisson process.
Definition 1.86 — Non-homogeneous Poisson process (Ross, 1989, p. 234)
The counting process N(t) : t ≥ 0 is said to be a non-homogeneous Poisson process
with intensity function λ(t), t ≥ 0 — for short, N(t) : t ≥ 0 ∼ NHPP (λ(t)) — if:
• N(0) = 0;
• N(t) : t ≥ 0 has independent increments;
• P [N(t+ h)−N(t) = 1] = λ(t)× h+ o(h), t ≥ 0;
• P [N(t+ h)−N(t) ≥ 2] = o(h), t ≥ 0. •
Proposition 1.87 — Non-homogeneous Poisson process (Ross, 2003, p. 316)
Let N(t) : t ≥ 0 ∼ NHPP (λ(t)). Then
N(t + s) − N(s) ∼ Poisson(∫_s^{t+s} λ(z) dz), s ≥ 0, t > 0. (1.33)
•
Remark 1.88 — Mean value function and relevance of a non-homogeneous
Poisson process (Ross, 2003, pp. 316, 318)
• Let N(t) : t ≥ 0 ∼ NHPP (λ(t)) and
m(t) = ∫_0^t λ(z) dz. (1.34)
Then
N(t+ s)−N(s) ∼ Poisson(m(t+ s)−m(s))
N(t) ∼ Poisson(m(t)).
Unsurprisingly, m(t) is called the mean value function of the non-homogeneous
Poisson process.

• The relevance of the non-homogeneous Poisson process is essentially due to the
fact that the condition of stationary increments was dropped; the possibility that
events are more likely to occur at certain times than others is now allowed! •
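A standard way to simulate a NHPP(λ(t)) — and a direct application of the non-homogeneous Bernoulli splitting idea of Section 1.4 — is thinning: simulate a homogeneous PP(lam_bar) with lam_bar ≥ λ(s) and keep an event occurring at time s with probability λ(s)/lam_bar. A sketch; the intensity 3 + 2s and the bound 7 are illustrative, not from the text:

```python
import random

def nhpp_times(lam_fn, lam_bar, t_max, rng):
    """Sample NHPP event times on (0, t_max] by thinning a homogeneous
    PP(lam_bar): keep a candidate event at time s with probability
    lam_fn(s)/lam_bar (requires lam_fn(s) <= lam_bar on (0, t_max])."""
    times, s = [], 0.0
    while True:
        s += rng.expovariate(lam_bar)
        if s > t_max:
            return times
        if rng.random() < lam_fn(s) / lam_bar:
            times.append(s)

def lam_fn(s):
    # illustrative intensity (an assumption), bounded by 7 on (0, 2]
    return 3.0 + 2.0 * s

rng = random.Random(2)
reps, t_max = 30000, 2.0
mean = sum(len(nhpp_times(lam_fn, 7.0, t_max, rng)) for _ in range(reps)) / reps
# Proposition 1.87: N(2) ~ Poisson(m(2)), with m(2) = integral of (3 + 2z) over (0, 2] = 10.
```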
Example 1.89 — Non-homogeneous Poisson process
A souvenir shop is open from 10:00 to 16:00. Admit that customers enter this shop
according to a non-homogeneous Poisson process with a time dependent rate described in
the following table:
Period Rate
10:00–12:00 6
12:00–14:00 15
14:00–16:00 linearly decreases from 15 to 10
(a) Identify the intensity function of the arrival process.
• Stochastic process and r.v.
N(t) : t ≥ 0 ∼ NHPP (λ(t))
N(t) = number of customer arrivals by time t
• Intensity function (or time dependent rate)
λ(t) =
  0, 0 < t ≤ 10
  6, 10 < t ≤ 12
  15, 12 < t ≤ 14
  15 + (t − 14) × (10 − 15)/(16 − 14) = 15 − 2.5(t − 14), 14 < t ≤ 16
  0, 16 < t ≤ 24.
(b) Find the probability that no customers enter the shop between 13:00 and 15:00.
• Requested probability
According to Proposition 1.87, N(t + s) − N(s) ∼ Poisson(∫_s^{t+s} λ(z) dz). Thus,

P[N(15) − N(13) = 0] = e^{−∫_13^15 λ(z) dz}
                     = e^{−(∫_13^14 15 dz + ∫_14^15 [15 − 2.5(z−14)] dz)}
                     = e^{−28.75}
                     ≈ 0.
•
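The computation in Example 1.89 can be reproduced numerically by integrating the intensity function over (13, 15]; a sketch using a simple midpoint-rule quadrature:

```python
import math

def lam(z):
    # intensity from the table in Example 1.89 (customers per hour, clock time z)
    if 10 < z <= 12:
        return 6.0
    if 12 < z <= 14:
        return 15.0
    if 14 < z <= 16:
        return 15.0 - 2.5 * (z - 14.0)
    return 0.0

def midpoint(f, a, b, n=4000):
    # midpoint-rule quadrature, exact here since lam is piecewise linear
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

m = midpoint(lam, 13.0, 15.0)
p_zero = math.exp(-m)
# m comes out as 28.75, so P[N(15) - N(13) = 0] = e^{-28.75}, which is essentially 0.
```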
Exercise 1.90 — Non-homogeneous Poisson process (Ross, 2003, Example 5.22,
pp. 318–319)
Harry owns a vegetarian food stand that opens at 8AM.
• From 8AM until 11AM customers seem to arrive at a linearly increasing rate that
starts with 5 customers per hour at 8AM and reaches a maximum of 20 customers
per hour at 11AM.
• From 11AM until 1PM the arrival rate remains constant at 20 customers per hour.
• The arrival rate then drops linearly from 1PM until closing time at 5PM at which
time it has the value of 12 customers per hour.
(a) If we assume that the numbers of customers arriving at Harry’s stand during non
overlapping periods are independent, then what is a good probability model for the
number of customers arriving in the interval (0, t]?
(b) What is the probability that no customers arrive between 8:30AM and 9:30AM on
Monday morning?
(c) Obtain the expected number of arrivals in the period mentioned in (b). •
Exercise 1.91 — Non-homogeneous Poisson process (bis)
Admit that a travel agency is open from 8:00 to 17:00 and customers arrive to it according
to a non-homogenous process and that the time dependent arrival rate:
• equals 4 customers per hour, from 8:00 and 10:00;
• is of 8 customers per hour, from 10:00 and 12:00;
• linearly increases from 8 to 10 customers per hour, from 12:00 to 14:00;
• linearly decreases from 10 to 4 customers per hour, from 14:00 to 17:00.
(a) Determine the intensity function of the process and find the expected number of
arriving customers during a whole day.
(b) Calculate the probability that the number of arrivals between 13:00 and 15:00 exceeds
5. What is the probability that there are no arrivals during this period? •
Exercise 1.92 — Non-homogeneous Poisson process (bis, bis)
Consider a non-homogeneous Poisson process with mean value function given by
m(t) = t^2 + 2t, t ≥ 0.
(a) Determine the probability that exactly n events occurs in (4, 5].
(b) Obtain the intensity function of the process. •
It is easy to compute joint and conditional distributions of a non-homogeneous Poisson
process due to the independence of increments (Kulkarni, 1995, p. 225). Thus, we state
now a result similar to Proposition 1.44.
Proposition 1.93 — Joint and conditional distributions in a non-homogeneous
Poisson process
Let N(t) : t ≥ 0 ∼ NHPP (λ(t)). Then, for 0 < t1 < · · · < tn and 0 ≤ k1 ≤ · · · ≤ kn,
P[N(t1) = k1, . . . , N(tn) = kn] = ∏_{j=1}^{n} e^{−[m(tj) − m(tj−1)]} [m(tj) − m(tj−1)]^{kj − kj−1} / (kj − kj−1)!, (1.35)

where k0 = 0, t0 = 0 and m(t) = ∫_0^t λ(z) dz (Kulkarni, 1995, pp. 225–226). Moreover,
(N(s) | N(t) = n) ∼ Binomial(n, m(s)/m(t)), for 0 < s < t and n ∈ N. •
Exercise 1.94 — Joint p.f. of N(t1), . . . , N(tn) in a non-homogeneous Poisson
process29
Suppose that customers arrive to do business at a bank according to a non-homogeneous
Poisson process with time dependent rate λ(z) = {20 + 10 cos[2π(z − 9.5)]} × I_[9,17](z).
What is the probability that twenty customers arrive between 9:30 and 10:30, and
another twenty arrive in the following half hour? •
^29 Inspired by www.maths.uq.edu.au/courses/STAT3004/PastYears/notes/poisson_processes.pdf
Exercise 1.95 — Conditional distribution of (N(s) | N(t) = n)
The number of arrivals to a shop is governed by a Poisson process with time dependent
rate
λ(t) =
  4 + 2t, 0 ≤ t ≤ 4
  24 − 3t, 4 < t ≤ 8.
(a) Draw the graphs of λ(t) and m(t), for 0 ≤ t ≤ 8.
(b) Derive the probability of no arrivals in the interval (3, 5].
(c) Determine the expected value of the number of arrivals in the last 5 opening hours
(i.e., in the interval (3, 8]), given that 15 customers have arrived in the last 3 opening
hours (that is, in the interval (5, 8]).
(d) Given that 60 customers visited the shop during those 8 opening hours, find an
approximate value to the probability that more than 40 of those 60 customers arrived
in the interval (0, 6]. •
Exercise 1.96 — Epochs of a non-homogeneous Poisson process (Ross, 2003, p.
321)
Let N(t) : t ≥ 0 ∼ NHPP (λ(t)) and Sn the time of the nth arrival (n ∈ N). Prove that
f_{Sn}(t) = λ(t) e^{−m(t)} [m(t)]^{n−1} / (n − 1)!,

where m(t) = ∫_0^t λ(z) dz. •
We can generalize Proposition 1.61 and derive the conditional distribution of the event
times S1, . . . , Sn given N(t) = n, in a non-homogeneous Poisson process.
Proposition 1.97 — Conditional distribution of the event times in a non-
homogeneous Poisson process
Let:
• N(t) : t ≥ 0 ∼ NHPP(λ(t)) and m(t) = ∫_0^t λ(z) dz its mean value function;
• Sn the time of the nth event (n ∈ N);
• Y_i i.i.d. ∼ Y, i = 1, . . . , n, where P(Y ≤ u) = m(u)/m(t), 0 ≤ u ≤ t.
Then
(S1, . . . , Sn | N(t) = n) ∼ (Y(1), . . . , Y(n)). (1.36)
•
Exercise 1.98 — Inter-event times in a non-homogeneous Poisson process
Let Xi be the time between the (i− 1)th and ith events of a NHPP (λ(t)).
Are the r.v. Xi, i ∈ N, identically distributed?^30 •
Proposition 1.99 — Inter-event times in a non-homogeneous Poisson process
(Kulkarni, 1995, p. 227)
Let N(t) : t ≥ 0 ∼ NHPP(λ(t)) and Xn+1 the time between the nth and (n + 1)th
events (n ∈ N). Then

P(Xn+1 > t) = P(Sn+1 − Sn > t)
            = ∫_0^{+∞} λ(s) e^{−m(t+s)} [m(s)]^{n−1} / (n − 1)! ds, (1.37)

where m(t) = ∫_0^t λ(z) dz. •
Exercise 1.100 — Inter-event times in a non-homogeneous Poisson process
Prove Proposition 1.99. •
Exercise 1.101 — More on the non-homogeneous Poisson process (bis, bis)
Let N(t) : t ≥ 0 ∼ NHPP(λ(t)), where the intensity function λ(t) is positive for
t ≥ 0. Now, define N*(t) = N(m^{−1}(t)), where m(t) = ∫_0^t λ(z) dz denotes the mean value
function.

Prove that N*(t) : t ≥ 0 ∼ PP(1). Comment on this result.^31 •

^30 The r.v. Xi are neither independent nor identically distributed; for example, P(X1 > t) = e^{−m(t)}
(Kulkarni, 1995, p. 226) and P(X2 > t) = ∫_0^{+∞} λ(s) e^{−m(t+s)} ds.
^31 If we re-scale time in a non-homogeneous Poisson process with mean value function m(t) by taking
m^{−1}(t) instead of t, then we end up dealing with a homogeneous Poisson process with unit rate.
Exercise 1.102 — The output process of an infinite server Poisson queue and
the non-homogeneous Poisson process (Ross, 2003, Example 5.23, p. 320)
Prove that the output process of the M/G/∞ queue — i.e., the number of customers
who (by time t) have already left the infinite server queue with Poisson arrivals and
general service d.f. G — is a non-homogeneous Poisson process with intensity function
λ(t) = λG(t). •
1.6 Conditional Poisson process
What happens if the arrival rate is a positive r.v.? Is the resulting stochastic process
mathematically tractable?
Yes!
In this case we end up dealing with another generalization of the Poisson process.
Definition 1.103 — Conditional Poisson process (Ross, 1983, pp. 49–50)
Let:
• Λ be a positive r.v. having c.d.f. G;
• N(t) : t ≥ 0 be a counting process such that, given that Λ = λ, N(t) : t ≥ 0
is a Poisson process with rate λ.
Then N(t) : t ≥ 0 is called a conditional (or mixed) Poisson process and^32

P[N(t + s) − N(s) = n] = ∫_0^{+∞} e^{−λt} (λt)^n / n! dG(λ), (1.38)

for s ≥ 0. •

^32 What follows is a Lebesgue–Stieltjes integral, equivalent to the Riemann–Stieltjes integral, which is
particularly common in probability theory when G is the c.d.f. of a real-valued r.v. such as Λ. For more
details, the reader is referred to Morais (2011, Subsection 4.2.1) and links therein.
Remark 1.104 — Conditional Poisson process
N(t) : t ≥ 0 is not a homogeneous PP (Ross, 1983, p. 50). Even though it has stationary
increments, it does not, in general, have independent increments:

• the stationarity of the increments follows immediately from (1.38) (Ross, 2003, p. 327),
which does not depend on the origin s of the time interval (s, t + s]; thus, P[N(t) = n] is
also given by (1.38);

• knowing how many events occur in an interval gives information about the possible
value of the random arrival rate Λ, therefore affecting the distribution of the number
of events in other time intervals (Ross, 2003, p. 327). •
Example/Exercise 1.105 — Conditional Poisson process33
Suppose that the number of requests to a web server follows a conditional (or mixed)
Poisson process with random rate Λ (in requests per minute) and admit that
Λ ∼ Gamma(r, β), where r, β ∈ N.
(a) Derive a simplified expression for the probability that the server receives at most m
requests in t minutes (m ∈ N0, t > 0), and obtain the value of this probability for
r = β = t = 2, m = 8.
• Stochastic process
N(t) : t ≥ 0 conditional (or mixed) Poisson process with random rate Λ
• R.v.
N(t) = number of requests to a web server in the 1st. t minutes
(N(t) | Λ = λ) ∼ Poisson(λt)
• Random rate and its p.d.f.
Λ ∼ Gamma(r, β), r, β ∈ N
g_Λ(λ) = [β^r / Γ(r)] λ^{r−1} e^{−βλ}, λ ≥ 0

^33 Inspired by Ross (2003, Example 5.27, p. 327).
• P.f. of N(t)
Since Λ is a continuous r.v. with p.d.f. g_Λ(λ), by using (1.38) we get

P[N(t + s) − N(s) = n] = ∫_0^{+∞} [e^{−λt} (λt)^n / n!] g_Λ(λ) dλ.

Moreover, since N(t) : t ≥ 0 has stationary increments, we can add that:

P[N(t) = n] = ∫_0^{+∞} [e^{−λt} (λt)^n / n!] × [β^r / Γ(r)] λ^{r−1} e^{−βλ} dλ
            = [t^n Γ(n + r) β^r] / [n! Γ(r) (β + t)^{r+n}] × ∫_0^{+∞} [(β + t)^{n+r} / Γ(n + r)] λ^{n+r−1} e^{−(β+t)λ} dλ
            = [t^n Γ(n + r) β^r] / [n! Γ(r) (β + t)^{r+n}] × ∫_0^{+∞} f_{Gamma(n+r, β+t)}(λ) dλ
            = C(n + r − 1, n) × [β/(β + t)]^r × [1 − β/(β + t)]^n, n ∈ N0
            = p.f. of a NegativeBinomial*(r, β/(β + t)).
• Requested probability
For m ∈ N0, we have

P[N(t) ≤ m] = F_NegativeBinomial*(r, β/(β+t))(m)
            = F_NegativeBinomial(r, β/(β+t))(m + r)
            = 1 − F_Binomial(m+r, β/(β+t))(r − 1).

Thus, for r = β = t = 2 and m = 8, we get

P[N(2) ≤ 8] = 1 − F_Binomial(10, 1/2)(1)
            = 1 − 0.0107 (from the tables)
            = 0.9893.
(b) Prove that (Λ|N(t) = n) ∼ Gamma(r + n, β + t).
(c) Determine lim_{h→0+} P[N(t + h) − N(t) = 1 | N(t) = n] / h. •
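Part (a) of Example 1.105 can be double-checked by summing the NegativeBinomial*(r, β/(β + t)) p.f. directly; a sketch, where `pmf_N` is an ad hoc helper name:

```python
from math import comb

def pmf_N(n, r, beta, t):
    """P[N(t) = n] for the mixed Poisson process with Lambda ~ Gamma(r, beta):
    the NegativeBinomial*(r, beta/(beta + t)) p.f. derived in Example 1.105."""
    q = beta / (beta + t)
    return comb(n + r - 1, n) * q**r * (1 - q)**n

r, beta, t, m = 2, 2, 2, 8
p = sum(pmf_N(n, r, beta, t) for n in range(m + 1))
print(round(p, 4))  # -> 0.9893, matching the tabulated value above
```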
Exercise 1.106 — Conditional Poisson process (bis)
An airport has a 24/7 booking center at which customers arrive according to a conditional
Poisson process with an exponentially distributed arrival rate Λ.
(a) What is the probability that no arrivals occur in a t hours period?
(b) Admit that the expected value of Λ is equal to 4 customers per hour. Verify that the
probability that Λ does not exceed 6 customers per hour, given that 49 customers
arrived in (0, 12], is given by F_{χ²_100}(147).^34 •
Exercise 1.107 — Conditional Poisson process (bis, bis) (Ross, 1983, p. 50)
Admit that, depending on factors not at present understood, the rate at which seismic
shocks occur in a certain region over a given season is either λ1 or λ2. Admit also that
the rate equals λ1 for p× 100% of the seasons and λ2 in the remaining time.
A simple model would be to suppose that N(t), t ≥ 0 is a conditional Poisson
process such that Λ is either λ1 or λ2 with respective probabilities p and 1− p.
Prove that the probability that it is a λ1−season, given n shocks in the first t units of
a season, equals
p e^{−λ1 t} (λ1 t)^n / [p e^{−λ1 t} (λ1 t)^n + (1 − p) e^{−λ2 t} (λ2 t)^n], (1.39)
by applying the Bayes’ theorem. •
Exercise 1.108 — Splitting PP; Conditional PP
One estimates that meteors enter the atmosphere in a specific region of the globe according
to a Poisson process having rate λ equal to 100 meteors per hour and that 1% of those
meteors are visible to the “naked eye” as shooting stars.
(a) What is the probability that an observer is lucky enough to see at least two shooting
stars in 30 minutes?
After a detailed study of the meteor automatic detection process of a certain type of
telescope, one admits that:
^34 Hint: X ∼ Gamma(α, δ) ⇔ 2δX ∼ χ²_{2α}.
• the automatic detection rate of meteors per hour has a Uniform distribution on the
interval [20, 200];
• conditionally on the knowledge of the automatic detection rate, the number of
meteors detected by the telescope is governed by a Poisson process.
(b) What are the expected value and standard deviation of the number of meteors
automatically detected by the telescope in 6 hours? (Hint: E(X) = E[E(X|Y )]
and V (X) = V [E(X|Y )] + E[V (X|Y )].)
(c) What is the probability that the telescope automatically detects at least one meteor
in 15 minutes? •
1.7 Compound Poisson process
Now, consider a continuous-time stochastic process with jumps. Admit
that the jumps occur randomly according to a Poisson process and the
size of the jumps is also random, with a specified probability distribution
(http://en.wikipedia.org/wiki/Compound_Poisson_process). Does the total size of
the jumps that occurred up to time t define a stochastic process easy to deal with?
Yes!
It is another generalization of the Poisson process.
Definition 1.109 — Compound Poisson process (Ross, 1989, p. 237)
A stochastic process X(t) : t ≥ 0 is said to be a compound Poisson process if it can be
represented as
X(t) = Σ_{i=1}^{N(t)} Y_i, (1.40)
where
• N(t) : t ≥ 0 ∼ PP (λ) and
• Y_i i.i.d. ∼ Y and independent of N(t) : t ≥ 0. •
Example 1.110 — Compound Poisson process (Ross, 2003, pp. 321–322)
Compound Poisson processes arise, for example, in the following settings.
• Suppose that buses arrive to a venue according to a Poisson process, the numbers
of persons in each bus are i.i.d. r.v. Yi and X(t) denotes the total number of persons
who arrived by time t. Then X(t) : t ≥ 0 is a compound Poisson process.
• Admit that customers leave a supermarket in accordance to a Poisson process, the
amounts of money each person has spent are i.i.d. r.v. Yi and X(t) represents the
total amount of money spent by the customers who left the supermarket until time
t. Then X(t) : t ≥ 0 is also a compound Poisson process. •
Quiz 1.111 — Compound Poisson process
Give more examples of compound Poisson processes. •
Proposition 1.112 — Compound Poisson process (Ross, 1989, pp. 238–239)
Let X(t) : t ≥ 0 be the compound Poisson process described in Definition 1.109. Then
E[X(t)] = λt × E(Y) (1.41)
V[X(t)] = λt × E(Y²). (1.42)
•
Exercise 1.113 — Compound Poisson process
Let X(t) : t ≥ 0 be a compound Poisson process.
(a) Prove Proposition 1.112, by noting that E[X(t)] = E{E[X(t) | N(t)]} and
V[X(t)] = E{V[X(t) | N(t)]} + V{E[X(t) | N(t)]} (Ross, 1989, pp. 238–239).
(b) Use the total probability law to prove that the m.g.f. of X(t) can be written as

M_{X(t)}(s) = E[e^{sX(t)}] = e^{λt[M_Y(s) − 1]} (1.43)

(Kulkarni, 1995, pp. 229–230).^35 •
Example 1.114 — Compound Poisson process
Let X(t) be the total amount of money paid by an insurance company in (0, t]. Admit
that:
• the number of payments is governed by a Poisson process having rate λ equal to 5
payments a week;
• the payments are i.i.d. r.v. with Exponential distribution with expected value equal
to 20 000 Euros.
Determine the expected value and variance of the total amount of money paid by the
insurance company in 4 weeks.
• Stochastic process

X(t) = Σ_{i=1}^{N(t)} Y_i : t ≥ 0 ∼ compound PP

• R.v. et al.

X(t) = total amount paid by the insurance company by time t

Y_i = ith amount paid by the insurance company

Y_i i.i.d. ∼ Y

Y ∼ Exponential(1/20000)

Y_i : i ∈ N indep. of N(t) : t ≥ 0 ∼ PP(λ = 5 payments per week)
^35 See the proof also in http://en.wikipedia.org/wiki/Compound_Poisson_process. This result is another
consequence of Campbell's theorem, named after Norman Robert Campbell, who first published the result
in 1909/1910; this result gives the m.g.f. of a compound Poisson process, from which the expected value
and variance can be easily computed (http://en.wikipedia.org/wiki/Campbell's_theorem_(probability)).
• Requested expected value and variance
According to Proposition 1.112, we have E[X(t)] = λt × E(Y) and
V[X(t)] = λt × E(Y²) = λt × [V(Y) + E²(Y)]. Thus,

E[X(4)] = 5 × 4 × 20000 = 400000

V[X(4)] = 5 × 4 × [20000² + 20000²] = 1.6 × 10^10.
•
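Example 1.114 can be confirmed by Monte Carlo simulation of the compound Poisson process, using the numbers of the example; a sketch (run time grows with `reps`):

```python
import random

rng = random.Random(9)
lam, weeks, mean_pay, reps = 5.0, 4.0, 20000.0, 50000
totals = []
for _ in range(reps):
    # number of payments in (0, 4] weeks, from exponential inter-event gaps
    n, s = 0, rng.expovariate(lam)
    while s <= weeks:
        n += 1
        s += rng.expovariate(lam)
    # total paid: sum of n i.i.d. Exponential payments with mean 20000
    totals.append(sum(rng.expovariate(1.0 / mean_pay) for _ in range(n)))
mean = sum(totals) / reps
var = sum((x - mean) ** 2 for x in totals) / reps
# Proposition 1.112: E[X(4)] = 20 * 20000 = 400000 and V[X(4)] = 1.6e10;
# the sample mean and variance should be close to these values.
```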
Exercise 1.115 — Compound Poisson process (bis) (Kulkarni, 1995, Example 5.20,
pp. 230–231)
Suppose that:
• customers arrive at a restaurant in batches of size 1, 2, 3, 4, 5 and 6;
• the batches themselves arrive according to a Poisson process having rate λ;
• the successive batch sizes Y_i are i.i.d. r.v., distributed as Y, with p.f.

P(Y = y) =
  0.1, y = 1, 3
  0.25, y = 2, 4
  0.15, y = 5, 6.
Compute the mean and the variance of the number of customers who arrived at the
restaurant in (0, t]. •
Exercise 1.116 — Compound Poisson process (bis, bis) (Walrand, 2004, p. 208)
Let N(t) : t ≥ 0 be a Poisson process with rate λ. At each jump time, a random
number Yi of customers arrive at a cashier waiting line. The r.v. Yi are i.i.d. with mean µ
and variance σ2. Let X(t) be the number of customers who arrived by time t, for t ≥ 0.
Calculate E[X(t)] and V [X(t)]. •
Exercise 1.117 — Compound Poisson process (bis, bis, bis) (Ross, 2003, pp. 322,
326)
Suppose that families migrate to an area at a Poisson rate λ = 2 per week. Assume that
the number of people in each family is independent and takes values 1, 2, 3 and 4 with
respective probabilities 1/6, 1/3, 1/3 and 1/6.
(a) What is the expected value and variance of the number of individuals migrating to
this area during a five-week period?
(b) Find an approximate value for the probability that at least 240 people migrate within
the next 50 weeks. •
Remark 1.118 — Values of the compound Poisson process; joint p.f. of
X(t1), . . . , X(tn)
• Since the r.v. Y_i may take positive as well as negative values, the value of
X(t) = Σ_{i=1}^{N(t)} Y_i may either increase or decrease (Kulkarni, 1995, p. 228), unlike
any counting process.
• If the Y_i are integer-valued r.v., then so is X(t); moreover, for 0 < t1 < · · · < tn, we
have

P[X(t1) = k1, . . . , X(tn) = kn] = ∏_{j=1}^{n} p_{kj − kj−1}(tj − tj−1), (1.44)

where p_k(t) = P[X(t) = k], k0 = 0 and t0 = 0 (Kulkarni, 1995, pp. 228–229). •
Since the r.v. Y_i (i = 1, . . . , n) are i.i.d. and N(t) : t ≥ 0 has stationary and
independent increments, the compound Poisson process X(t) = Σ_{i=1}^{N(t)} Y_i : t ≥ 0 also
has stationary and independent increments (Kulkarni, 1995, p. 228).
Remark 1.119 — Homogeneous, non-homogeneous, conditional and
compound Poisson processes
Stochastic process Independent increments? Stationary increments?
Homogeneous PP Yes!!! Yes!!!
Non-homogeneous PP Yes!!! No!
Conditional PP No! Yes!!!
Compound PP Yes!!! Yes!!!
•
Exercise 1.120 — Independent increments and the compound Poisson process
(Ross, 1983, Exercise 2.26, p. 53)
Obtain the autocovariance function of a compound Poisson process X(t) : t ≥ 0. •
Quiz 1.121 — A generalization of the compound Poisson process (Kulkarni,
1995, p. 231)
It is possible to construct a non-homogeneous compound Poisson process, X(t) : t ≥ 0,
by assuming that, in Definition 1.109, N(t) : t ≥ 0 is a non-homogeneous Poisson
process.
Can you derive the m.g.f. of X(t)? •
Exercise 1.122 — A generalization of the compound Poisson process (Ross,
1983, Exercise 2.14, p. 52)
Admit that: busloads of customers arrive at an infinite server queue according to a Poisson
process having rate λ; G denotes the service distribution (of each busload of customers
regardless of its size); a bus contains j customers with probability αj. Let X(t) denote
the number of customers that have been served by time t.
(a) Obtain E[X(t)].
(b) Is X(t) Poisson distributed? •
Exercise 1.123 — Mind expanding exercise (Ross, 1983, Exercise 2.25, p. 53)
A two-dimensional Poisson process is characterized as follows:
(i) it is a process of randomly occurring events in the plane;
(ii) for any region of area A the number of events in that region has a Poisson distribution
with parameter λA;
(iii) the numbers of events in non-overlapping regions are independent r.v.
Now consider an arbitrary point in the plane and let X denote its distance from its nearest
event of the two-dimensional Poisson process.
(a) Show that P(X > t) = e^{−λπt²}.
(b) Prove that E(X) = 1/(2√λ). •
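Both claims can be checked by simulation: given their Poisson count, the events of a planar Poisson process falling in a window are i.i.d. uniform on that window. A Monte Carlo sketch (the window size, intensity and seed are arbitrary choices of mine):

```python
import math
import random

random.seed(12345)
lam, half, reps = 1.0, 4.0, 4000
mu = lam * (2 * half) ** 2        # expected number of events in the window

def poisson(mu):
    """Knuth's product-of-uniforms Poisson generator (fine for moderate mu)."""
    L, k, p = math.exp(-mu), 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

total = 0.0
for _ in range(reps):
    n = poisson(mu)
    # given the count, the events are i.i.d. uniform on the square
    d = min(math.hypot(random.uniform(-half, half), random.uniform(-half, half))
            for _ in range(n)) if n else half
    total += d

est = total / reps
print(est, 1 / (2 * math.sqrt(lam)))   # estimate vs the exact value 0.5
```

The window is large enough that the nearest event essentially never lies outside it, so the boundary truncation is negligible.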
Exercise 1.124 — Mind expanding exercise (bis)
Every Sunday, 15 units of a perishable product are stocked in order to be sold in the
remaining week days. The orders of this product are governed by a Poisson process with
rate λ equal to 3 units per day.36 Moreover, admit that due to the nature of the product
all unsold units are destroyed on Sunday before restocking for the next week.
(a) Determine the probability that there are no units for sale on Tuesday (at 00:00).
(b) Compute the probability that all the 15 units were sold by Saturday (at 23:59:59).
(c) Obtain a simplified expression for the expected value of units destroyed weekly. •
Exercise 1.125 — Mind expanding exercise (bis, bis)
Consider a junction between a main and a secondary road. Admit that: cars pass in the
main road according to a Poisson process with a rate of 10 cars per minute; Harry is
driving a car in the secondary road and needs 10 seconds to enter the main road; the cars
circulating in the main road take a negligible time to pass the junction.
36 Note that an order of the product does not result in a sale if there are no units in stock.
Let:
• N be the number of cars that pass the junction while Harry waits to enter the main
road;
• Yn be the time (in seconds) at which the nth car passed the junction while Harry
waited to enter the main road;
• Y0 = 0.
(a) Obtain the distribution of N and compute its 75th percentile.
(b) Show that E(Yn) = 2n (3 − 8e^{−5/3}) / (1 − e^{−5/3}), for n = 1, . . . , N, and N ∈ N.
(c) Obtain the expected value of the time Harry has to wait at the junction until he
initiates the manoeuvre to enter the main road. •
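For part (b), the key quantity is the conditional mean of an exponential gap given that it is shorter than 10 seconds, which equals 2(3 − 8e^{−5/3})/(1 − e^{−5/3}) ≈ 3.67 s. A Monte Carlo check of this value (my own sketch, not part of the exercise):

```python
import math
import random

random.seed(7)
rate, gate = 1 / 6, 10.0   # 10 cars/minute = 1/6 per second; Harry needs 10 s
theory = 2 * (3 - 8 * math.exp(-5 / 3)) / (1 - math.exp(-5 / 3))

# mean of an Exp(1/6) gap conditioned on being shorter than 10 s
acc, count = 0.0, 0
while count < 200_000:
    x = random.expovariate(rate)
    if x < gate:
        acc += x
        count += 1

mc = acc / count
print(mc, theory)   # both ~3.67 seconds
```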
Chapter 2
Renewal Processes
(Homogeneous) Poisson processes are counting processes for which the times between
successive events are i.i.d. exponential r.v.
Can we be slightly more realistic, by dropping the exponentially distributed
assumption and considering inter-event times with a common but arbitrary distribution?
Yes!
The resulting process is called a renewal process and is still mathematically tractable.
Applications of renewal processes include calculating the expected time
for a monkey who is randomly tapping at a keyboard to type the word
Macbeth and comparing the long-term benefits of different insurance policies
(http://en.wikipedia.org/wiki/Renewal_theory). More importantly, many questions
about more complex and interesting stochastic processes can be addressed by identifying
a relevant renewal process.1
2.1 Introduction
Informally, a renewal process is a generalization of the Poisson process. Expectedly, renewal processes are going to be defined in terms of the times between consecutive events, just as the Poisson process was in Definition 1.31.
1 What follows was essentially inspired by Kulkarni (1995, Chap. 8), Ross (1983, Chap. 3), and Ross (2003, Chap. 7).
Definition 2.1 — Renewal process (Kulkarni, 1995, pp. 401–402; Ross, 2003, pp.
401–402)
Let:
• Xi : i ∈ N be a sequence of r.v. representing the inter-event times;
• S0 = 0;
• Sn = ∑_{i=1}^{n} Xi be the time of the occurrence of the nth event;
• N(t) = sup{n ∈ N0 : Sn ≤ t}, t ≥ 0 — i.e., N(t) represents the number of events (or renewals) that occurred in (0, t].
If Xi : i ∈ N is a sequence of i.i.d. non-negative (real) r.v. with common c.d.f. F ,2 then
the counting process N(t) : t ≥ 0 is said to be a renewal process. •
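Definition 2.1 translates directly into a simulation: draw i.i.d. inter-event times and count how many partial sums Sn fall in (0, t]. A sketch (the Uniform(30, 60) inter-event distribution is merely an illustrative choice, anticipating the battery example of Section 2.5):

```python
import random

random.seed(2014)

def renewal_count(t, draw):
    """N(t) = max{n in N0 : S_n <= t} for i.i.d. inter-event times draw()."""
    s, n = 0.0, 0
    while True:
        s += draw()    # next inter-event time X_i
        if s > t:
            return n
        n += 1

# e.g. Uniform(30, 60) inter-event times (mean mu = 45)
t = 100_000.0
n = renewal_count(t, lambda: random.uniform(30, 60))
print(n / t, 1 / 45)   # empirical rate close to 1/mu, cf. Section 2.5
```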
Remark 2.2 — Renewal sequence; characterization of renewal processes
• Sn : n ∈ N is said to be a renewal sequence (Kulkarni, 1995, p. 402).
• The renewal process N(t) : t ≥ 0 is fully characterized by the inter-event time
distribution F (Kulkarni, 1995, Theorem 8.1, p. 404).
• The designation of N(t) : t ≥ 0 as a renewal process is due to the fact that
N(t + Sn) − n : t ≥ 0 is stochastically identical to N(t) : t ≥ 0, for n ∈ N0
(Kulkarni, 1995, pp. 404-405). •
Example 2.3 — Renewal process (Kulkarni, 1995, pp. 403–404)
• Admit that a process requires the continuous use of a specific machine. At time 0 a
brand new machine is put to work until it fails after a random amount of time
X1; this machine is instantly replaced with a new one which will last for a random
amount of time X2; etc. In case Xi are i.i.d. r.v. and N(t) denotes the number of
failures/replacements by time t, N(t) : t ≥ 0 is a renewal process.
2 To avoid trivialities, we assume that F(0) = P(Xi = 0) < 1. From the non-negativity of Xi and the fact that Xi is not identically 0, we get E(Xi) > 0.
• Let Sn be the completion time of the nth busy cycle of an M/G/1 queue and N(t) be
the number of busy cycles completed by time t. Then N(t) : t ≥ 0 is a renewal
process. •
Quiz 2.4 — Renewal process
Give examples of stochastic processes arising in your daily life that could be modeled as
renewal processes. •
2.2 Properties of the number of renewals
It is essential to study in some detail the properties of the number of renewals up to and including time t, N(t). The distribution of N(t) can be obtained, at least in theory
(Ross, 2003, p. 403), by capitalizing on a familiar result
N(t) ≥ n ⇔ Sn ≤ t, (2.1)
or on the fact that
N(t) = n ⇔ Sn ≤ t < Sn+1. (2.2)
But before we proceed to derive the p.f. of N(t) let us explore a bit more the
relationship between N(t) and Sn by solving the following exercise.
Exercise 2.5 — Relating N(t) and Sn (Ross, 1983, Exercise 3.1, p. 93)
Is it true that:
(a) N(t) < n ⇔ Sn > t ?
(b) N(t) ≤ n ⇔ Sn ≥ t ?
(c) N(t) > n ⇔ Sn < t ?
Justify your answers! •
Proposition 2.6 — Relating the p.f. of N(t) and the c.d.f. of the event times
(Ross, 2003, p. 403)
Let:
• N(t) : t ≥ 0 be a renewal process;
• S0 = 0;
• F0(t) = P (S0 ≤ t) = 1, t ≥ 0;
• Sn the time of the occurrence of the nth event, for n ∈ N;
• Fn(t) = P (Sn ≤ t) the c.d.f. of Sn.
Then P [N(t) ≥ n] = Fn(t) and
P [N(t) = n] = Fn(t)− Fn+1(t), (2.3)
for n ∈ N and t > 0. •
Computing the p.f. of N(t) is a non-trivial task for all but a few renewal processes
(Kulkarni, 1995, p. 405).
Example 2.7 — P.f. of N(t) (Ross, 2003, Example 7.1, pp. 403–404)
Admit that
P(Xn = k) = (1 − p)^{k−1} p, k, n ∈ N,
that is, S1 = X1 can be interpreted as the number of i.i.d. Bernoulli trials until the first success and Sn = ∑_{i=1}^{n} Xi may be interpreted as the number of trials necessary to attain n successes.
Obtain the distribution of Sn and P [N(t) = n].
• Renewal process
N(t) : t ≥ 0
• R.v.
N(t) = number of renewals up to time t
Xi = time between the (i− 1)th and ith renewals
Xi ∼ Geometric(p), i.i.d., i ∈ N
• Renewal times
Sn = time of the nth renewal, n ∈ N
Sn = ∑_{i=1}^{n} Xi
Sn ∼ NegativeBinomial(n, p)
P(Sn = k) = (k−1 choose n−1) p^n (1 − p)^{k−n}, k = n, n+1, . . .
• P.f. of N(t)
For 0 < t < 1, P[N(t) = 0] = 1.
For t ≥ 1, N(t) ∼ Binomial(⌊t⌋, p), where ⌊t⌋ represents the integer part of t.3 In fact:
P[N(t) = 0] = P(X1 > t) = (1 − p)^{⌊t⌋};
P[N(t) = n] = P(Sn ≤ t) − P(Sn+1 ≤ t)
= ∑_{k=n}^{⌊t⌋} (k−1 choose n−1) p^n (1 − p)^{k−n} − ∑_{k=n+1}^{⌊t⌋} (k−1 choose n) p^{n+1} (1 − p)^{k−n−1}
= · · ·
= (⌊t⌋ choose n) p^n (1 − p)^{⌊t⌋−n},
for n ∈ {1, . . . , ⌊t⌋}. •
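The binomial claim of Example 2.7 can be checked by simulating the renewal process with Geometric(p) inter-event times (a sketch; the values of p and t are arbitrary choices of mine):

```python
import random
from math import comb

random.seed(42)
p, t, reps = 0.3, 7.5, 100_000

def geom(p):
    """Number of Bernoulli(p) trials until the first success."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

counts = {}
for _ in range(reps):
    s, n = 0, 0
    while True:
        s += geom(p)       # Geometric(p) inter-event time
        if s > t:
            break
        n += 1
    counts[n] = counts.get(n, 0) + 1

ft = int(t)                # floor(t)
for n in range(ft + 1):    # compare with Binomial(floor(t), p)
    exact = comb(ft, n) * p**n * (1 - p) ** (ft - n)
    print(n, counts.get(n, 0) / reps, round(exact, 4))
```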
Exercise 2.8 — P.f. of N(t) (bis) (Ross, 2003, Exercise 7.2, p. 460)
Suppose the inter-event distribution of a renewal process, say N(t) : t ≥ 0, is Poisson
with expected value λ, i.e., Xi ∼ Poisson(λ), i.i.d., i ∈ N.
3 We are essentially dealing with a Bernoulli counting process...
(a) Find the distribution of Sn.
(b) Compute P [N(t) = n]. •
Now, the question we attempt to answer is whether an infinite number of renewals can occur in a finite time (Ross, 1983, p. 55).
Proposition 2.9 — Finiteness of N(t) in finite time (Kulkarni, 1995, Theorem 8.3,
p. 406)
Let N(t) : t ≥ 0 be a renewal process. Then N(t) is a proper r.v. for all (finite!) t ≥ 0,
i.e.,
P [N(t) < +∞] = 1, (2.4)
for 0 ≤ t < +∞. •
Exercise 2.10 — Finiteness of N(t) in finite time
Prove Proposition 2.9 (Ross, 1983, pp. 55–56; Kulkarni, 1995, p. 406). •
Since Proposition 2.9 holds, we can write N(t) = max{n ∈ N0 : Sn ≤ t} (instead of N(t) = sup{n ∈ N0 : Sn ≤ t}), and we can indeed identify the p.f. of N(t) in terms of the c.d.f. of the renewal times Sn.
2.3 Renewal function
Definition 2.11 — Renewal function (Ross, 2003, p. 404)
The expected value of the number of renewals up to t,
m(t) = E[N(t)], t ≥ 0, (2.5)
defines what is called the mean-value or the renewal function. •
By capitalizing on the fact that N(t) is a non-negative integer r.v. such that P[N(t) ≥ n] = Fn(t), we can provide an expression for m(t).
Proposition 2.12 — Renewal function (Ross, 2003, p. 404)
Let N(t) : t ≥ 0 be a renewal process. Then
m(t) = ∑_{n=1}^{+∞} Fn(t). (2.6)
•
Exercise 2.13 — Renewal function
Prove Proposition 2.12 (Ross, 2003, p. 404; Ross, 1983, pp. 56–57). •
Exercise 2.14 — Renewal function (Ross, 2003, Exercise 6, p. 461)
Consider a renewal process N(t) : t ≥ 0 with a Gamma(r, λ) (r ∈ N) inter-renewal
distribution.
(a) Show that, for t ≥ 0 and n ∈ N0,
P[N(t) ≥ n] = ∑_{i=nr}^{+∞} e^{−λt} (λt)^i / i!.
(b) Use (a) to prove that
m(t) = ∑_{i=r}^{+∞} ⌊i/r⌋ e^{−λt} (λt)^i / i!,
for t ≥ 0.4 •
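Both expressions in this exercise can be evaluated numerically, since Sn ∼ Gamma(nr, λ) and a gamma c.d.f. with integer shape is a Poisson tail sum; truncating at a level where the Poisson tail is negligible, the series (2.6) and the closed form in (b) coincide. A sketch (λ, r and t are arbitrary choices of mine):

```python
from math import exp, factorial

lam, r, t = 1.0, 3, 5.0
CAP = 60   # Poisson(lam*t) mass beyond CAP is negligible for lam*t = 5

def pois_pmf(i):
    return exp(-lam * t) * (lam * t) ** i / factorial(i)

def gamma_cdf(k):
    """P(Gamma(k, lam) <= t) for integer shape k, via the Poisson tail sum."""
    return sum(pois_pmf(i) for i in range(k, CAP))

series = sum(gamma_cdf(n * r) for n in range(1, CAP // r + 1))   # eq. (2.6)
closed = sum((i // r) * pois_pmf(i) for i in range(r, CAP))      # part (b)
print(series, closed)   # the two expressions agree
```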
The renewal function is very important because it completely characterizes the renewal
process (Kulkarni, 1995, p. 414).
Proposition 2.15 — Renewal function and the inter-renewal distribution
There is a one-to-one correspondence between m(t) and the inter-renewal distribution F
(Ross, 2003, p. 404).5 •
4 Use the relationship between the Gamma(r, λ) distribution and the sum of r independent exponentially distributed r.v. with rate λ to define N(t) in terms of the number of events of a Poisson process with rate λ.
5 The proof can be found in Kulkarni (1995, p. 414) and it makes use of the Laplace-Stieltjes transforms
Exercise 2.16 — Renewal function and the inter-renewal distribution (Ross,
2003, Example 7.2, p. 405)
(a) Prove Proposition 2.15.
(b) Suppose the renewal process N(t) : t ≥ 0 has renewal function m(t) = 2t, t ≥ 0.
What is the distribution of N(10)? •
Exercise 2.17 — Renewal function and the inter-renewal distribution (bis)
(Ross, 2003, Exercise 7.3, p. 460)
If the mean-value function of the renewal process N(t) : t ≥ 0 is given by m(t) = t/2, t ≥ 0, what is the value of P[N(5) = 0]? •
It is time to inquire about the finiteness of m(t) = E[N(t)] in finite time.
Some readers might think that the finiteness of N(t) (with probability 1) implies the
finiteness of m(t); even though such reasoning should be avoided,6 the result is valid, as
stated in the next proposition.
Proposition 2.18 — Finiteness of m(t) in finite time (Ross, 1983, p. 57; Kulkarni, 1995, Theorem 8.8, p. 416)
Let N(t) : t ≥ 0 be a renewal process. Then
m(t) < +∞, (2.7)
for 0 ≤ t < +∞.7 •
5 (cont.) of m(t) and F(t), m̃(s) = ∫_{0−}^{+∞} e^{−st} dm(t) and F̃(s) = E(e^{−s S1}) = M_{S1}(−s) = ∫_{0−}^{+∞} e^{−st} dF(t), respectively. In fact, these two Laplace-Stieltjes transforms satisfy m̃(s) = F̃(s) / (1 − F̃(s)) and the renewal function can be obtained by inverting its Laplace-Stieltjes transform — so this is another method of computing the renewal function besides the use of (2.6). Note that the Laplace transform of a function f(t) is defined as ∫_{0−}^{+∞} e^{−st} f(t) dt and should not be mistaken for the Laplace-Stieltjes transform even though they can be obviously related. Moreover, Mathematica can be used to obtain the Laplace transform and the inverse Laplace transform.
6 Consider, for instance, the r.v. Y that takes value 2^n with probability (1/2)^n, for n ∈ N. Even though Y is finite — P(Y < +∞) = ∑_{n=1}^{+∞} P(Y = 2^n) = 1 —, we have E(Y) = ∑_{n=1}^{+∞} 2^n P(Y = 2^n) = +∞.
7 Ross (1983, p. 57) and Kulkarni (1995, pp. 416–417) provide proofs of this result; Ross (2003, p. 405) states the result without a proof.
2.4 Renewal-type equations
In general the renewal function is difficult to compute for an arbitrary inter-arrival
distribution F , namely using (2.6).8 However, we are able to derive the expected value of
N(t) via an integral equation.
Remark 2.19 — Renewal argument (Kulkarni, 1995, p. 407)
One of the most useful tools of renewal theory is the renewal argument. It allows us to
derive an integral equation for certain probabilistic quantities, such as m(t), in renewal
processes, by conditioning on the time of the first renewal S1. •
Proposition 2.20 — Renewal equation (Kulkarni, 1995, p. 415)
Let N(t) : t ≥ 0 be a renewal process with inter-renewal distribution F. Then the renewal argument leads to the following integral equation involving the renewal function:
m(t) = F(t) + ∫_0^t m(t − x) dF(x). (2.8)
(2.8) is called the renewal equation and provides another method of computing m(t). •
Exercise 2.21 — Renewal equation
Prove the renewal equation (Kulkarni, 1995, pp. 414–415; Ross, 2003, p. 406). •
The renewal equation can sometimes be solved to obtain the renewal function (Ross, 2003, p. 406).
Exercise 2.22 — Renewal equation (Ross, 2003, Example 7.3, pp. 406–407)
Solve the renewal equation for 0 ≤ t ≤ 1, when the inter-renewal distribution is
Uniform(0, 1), and show that m(t) = e^t − 1, 0 ≤ t ≤ 1. •
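The renewal equation can also be attacked numerically: discretizing m(t) = F(t) + ∫_0^t m(t − x) dF(x) with a Riemann sum reproduces m(t) = e^t − 1 on [0, 1] up to O(h) error. A sketch (not the analytic solution the exercise asks for):

```python
from math import exp

# m(t) = F(t) + int_0^t m(t-x) dF(x); for Uniform(0,1) inter-renewals and
# 0 <= t <= 1 this reads m(t) = t + int_0^t m(t-x) dx.
h, n = 0.001, 1000          # grid step; t runs over [0, 1]
m = [0.0] * (n + 1)
cum = 0.0                   # running sum m[0] + ... + m[k-1]
for k in range(1, n + 1):
    cum += m[k - 1]
    m[k] = k * h + h * cum  # right-endpoint Riemann sum of the convolution

print(m[n], exp(1.0) - 1)   # ~1.7169 vs 1.7183
```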
Exercise 2.23 — Renewal equation (bis) (Kulkarni, 1995, examples 8.8 and 8.15,
pp. 404, 416)
Suppose the inter-renewal times have Bernoulli(α) distribution (α ∈ (0, 1)).9
Prove that m(t) = (⌊t⌋ + 1 − α)/α by solving the renewal equation. •
8 Or capitalizing on the relationship between the Laplace-Stieltjes transforms of m(t) and F(t).
9 The associated renewal process is called the negative binomial process.
We know that P [N(t) = 0] = P (X1 > t) = 1− F (t). Can we derive P [N(t) = n], for
n ∈ N, when we are not able to obtain a nice formula for the convolution Fn(t)?
In certain cases!
The renewal argument can be also used to derive what is called a renewal-type equation
for P [N(t) = n].
Proposition 2.24 — Renewal-type equation for P [N(t) = n] (Kulkarni, 1995, pp.
408–409)
Let N(t) : t ≥ 0 be a renewal process with inter-renewal distribution F . Then the
renewal argument leads to the following renewal-type equation:
P[N(t) = n] = ∫_0^t P[N(t − x) = n − 1] dF(x), n ∈ N. (2.9)
•
Exercise 2.25 — Renewal-type equation for P [N(t) = n]
Prove the renewal-type equation (2.9) (Kulkarni, 1995, p. 408). •
The integral equation (2.9) is not simple to solve unless the inter-renewal times are
discrete r.v. with common p.f. P(X = i) = αi, i ∈ N0. In this case, N(t) =st N(⌊t⌋) and its p.f. can be obtained recursively, for a fixed t ≥ 0 (Kulkarni, 1995, p. 408):
P[N(⌊t⌋) = 0] = P(X1 > ⌊t⌋) = 1 − ∑_{i=0}^{⌊t⌋} αi; (2.10)
P[N(⌊t⌋) = n] = ∑_{i=0}^{⌊t⌋} P[N(⌊t⌋ − i) = n − 1] × αi, n ∈ N. (2.11)
Equation (2.11) provides a computationally stable method of computing the p.f. of the
number of renewals up to time t, N(t) (Kulkarni, 1995, p. 409).
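The recursion (2.10)–(2.11) is straightforward to implement. As a check, for Bernoulli(α) inter-renewal times the resulting mean matches the renewal function m(t) = (⌊t⌋ + 1 − α)/α of Exercise 2.23 (a sketch; the truncation level nmax is my own choice):

```python
def pf_N(T, alpha, nmax):
    """P(N(T) = n), n = 0..nmax, via the recursion (2.10)-(2.11);
    alpha maps i to P(X = i) for integer-valued inter-renewal times."""
    tail = [1 - sum(alpha.get(i, 0.0) for i in range(u + 1)) for u in range(T + 1)]
    prev, out = tail, [tail[T]]                     # row n = 0, eq. (2.10)
    for _ in range(nmax):
        cur = [sum(alpha.get(i, 0.0) * prev[u - i] for i in range(u + 1))
               for u in range(T + 1)]               # eq. (2.11)
        out.append(cur[T])
        prev = cur
    return out

a = 0.5
pf = pf_N(3, {0: 1 - a, 1: a}, 120)
mean = sum(n * q for n, q in enumerate(pf))
print(mean, (3 + 1 - a) / a)   # both equal 7
```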
Exercise 2.26 — Renewal-type equation for P [N(t) = n]
Use results (2.10) and (2.11) to derive the renewal function obtained in Exercise 2.23. •
The renewal argument can also be used to obtain a renewal-type equation involving
the expected value of the time of the first renewal after t, SN(t)+1.
Proposition 2.27 — Renewal-type equation for E[SN(t)+1] (Kulkarni, 1995, p. 418)
Let N(t) : t ≥ 0 be a renewal process with inter-renewal distribution F and H(t) =
E[SN(t)+1]. Then
H(t) = E(S1) + ∫_0^t H(t − x) dF(x). (2.12)
•
Exercise 2.28 — Renewal-type equation for E[SN(t)+1]
Derive the renewal-type equation (2.12) (Kulkarni, 1995, p. 418). •
After stating the renewal equation and two renewal-type equations (for P [N(t) = n]
and E[SN(t)+1]), we proceed with a more thorough treatment of a general renewal-type
equation.
Definition 2.29 — Renewal-type equation (Kulkarni, 1995, pp. 420–421)
The type of integral equations that arise from using the renewal argument are called
renewal-type equations and have the following form:
H(t) = D(t) + ∫_0^t H(t − x) dF(x), (2.13)
where F (x) is the c.d.f. of a non-negative r.v., D(t) is a known function and H(t) is a
function to be determined. •
If D(t) = F (t) and H(t) = m(t) then equation (2.13) corresponds to the renewal
equation (2.8). If D(t) = E(S1), for all t ≥ 0, and H(t) = E[SN(t)+1], we get the renewal-
type equation for E[SN(t)+1] in (2.12).
The following result not only gives the conditions for the existence and uniqueness of
the solution of the renewal-type equation in (2.13), but also one possible representation
of the solution (Kulkarni, 1995, p. 421).10
10 The proof of this result can be found in Kulkarni (1995, pp. 421–422).
Proposition 2.30 — Existence, uniqueness and representation of the solution
of the renewal-type equation (Kulkarni, 1995, Theorem 8.10, p. 421)
Let N(t) : t ≥ 0 be a renewal process, with inter-renewal distribution F (x) and renewal
function m(t). Suppose |D(t)| < +∞, for all t ≥ 0. Then the renewal-type equation
H(t) = D(t) + ∫_0^t H(t − x) dF(x) has a unique solution. This unique solution is such that |H(t)| < +∞, for all t ≥ 0, and can be written in terms of m(t):
H(t) = D(t) + ∫_0^t D(t − x) dm(x). (2.14)
•
Since the renewal function m(t) is difficult to compute, the solution (2.14) of a renewal-
type equation is not easy to obtain. However, there are a few methods of solving renewal-
type equations (Kulkarni, 1995, p. 423), such as the following methods:
• Laplace-Stieltjes transforms — for short, LST (Kulkarni, 1995, pp. 423-426);
• discrete approximation (Kulkarni, 1995, pp. 426–427);
• successive approximation (Kulkarni, 1995, p. 427).
For the sake of briefness, we concisely describe and illustrate the first of these three
methods.
Proposition 2.31 — Solving a renewal-type equation using the method of LST
(Kulkarni, 1995, p. 423)
If D(t) admits a LST, D̃(s) = ∫_{0−}^{+∞} e^{−st} dD(t), then the LST of H(t) = D(t) + ∫_0^t D(t − x) dm(x) is given by
H̃(s) = ∫_{0−}^{+∞} e^{−st} dH(t) = D̃(s) / (1 − F̃(s)), (2.15)
and H(t), the solution of the renewal-type equation, can be obtained by inverting the right-hand side of (2.15). •
Exercise 2.32 — Solving a renewal-type equation using the method of LST
Prove Proposition 2.31. •
Exercise 2.33 — Solving a renewal-type equation using the method of LST
(Kulkarni, 1995, Example 8.16, pp. 423–425)
Consider a machine that alternates between two states — up and down. Admit that:
• the successive up times Ui are i.i.d. r.v. with exponential distribution with parameter
µ;
• if the up time is equal to Ui, then the subsequent down time Di = cUi, where c is a
non-negative constant;
• the machine is up at time 0;
• H(t) represents the probability that the machine is up at time t.
Derive a renewal-type equation for H(t) and solve it by using the method of Laplace-
Stieltjes transforms. •
Exercise 2.34 — Solving a renewal-type equation using the method of LST
(bis) (Kulkarni, 1995, Exercise 24, pp. 470–471)
Consider a system that can be in one of two states: on and off. Admit that:
• the system is on at time 0;
• the successive durations of on and off periods are independent and exponentially
distributed with parameter λ and µ, respectively;
• W (t) represents the total amount of on time during (0, t].
Derive a renewal-type equation for H(t) = E[W (t)] and solve it by using the method
of Laplace-Stieltjes transforms. •
Exercise 2.35 — Back to the renewal equation (Ross, 2003, Exercise 18, p. 465)
Use Mathematica to compute the renewal function when the inter-renewal distribution is
a hyper-exponential with survival function given by
1 − F(t) = p e^{−µ1 t} + (1 − p) e^{−µ2 t},
for t ≥ 0 and µ1, µ2 > 0. •
Proposition 2.36 — Another renewal-type equation (Ross, 1983, p. 65)
Let N(t) : t ≥ 0 be a renewal process, with inter-renewal distribution F (x) and renewal
function m(t). Then the c.d.f. of S_{N(t)}, the time of the last renewal prior to (or at) time t, can be represented in the form (2.14) with D(t) = F̄(t) = 1 − F(t):
P[S_{N(t)} ≤ s] = F̄(t) + ∫_0^s F̄(t − x) dm(x), (2.16)
for 0 ≤ s ≤ t. •
Exercise 2.37 — Another renewal-type equation
Prove Proposition 2.36 (Ross, 1983, p. 66). •
Remark 2.38 — Another renewal-type equation (Ross, 1983, p. 65)
Proposition 2.36 leads us to conclude that:
P[S_{N(t)} = 0] = F̄(t); (2.17)
dF_{S_{N(t)}}(s) = F̄(t − s) dm(s), 0 < s < +∞. (2.18)
If the inter-renewal times are continuous r.v. with common p.d.f. f, then:
dm(s) = ∑_{n=1}^{+∞} f_n(s) ds = ∑_{n=1}^{+∞} P[nth renewal occurs in (s, s + ds)] = P[renewal occurs in (s, s + ds)]; (2.19)
f_{S_{N(t)}}(s) ds = P[renewal in (s, s + ds), next inter-renewal time > t − s] = F̄(t − s) dm(s). (2.20)
•
2.5 Key renewal theorem and some other limit
theorems
Computing the exact distribution of N(t) for finite t is far from being a trivial problem,
either analytically or numerically (Kulkarni, 1995, p. 409). Moreover, the renewal function
m(t) is also difficult to compute for an arbitrary inter-arrival distribution F .
Unsurprisingly, we have to turn our attention to the study of the limiting behavior of
N(t), as t → +∞ (Kulkarni, 1995, p. 405), and we are bound to study the asymptotic
behavior of m(t) (Kulkarni, 1995, p. 416). However,
N(+∞) ≡ lim_{t→+∞} N(t) = +∞, (2.21)
with probability 1,11 even though N(t) is finite for finite t. (2.21) follows because the only
way in which N(+∞) can be finite is for one of the inter-renewal times to be infinite, and
the probability of this last event is equal to zero (Ross, 1983, pp. 57-58; Ross, 2003, pp.
402–403).12
Is it possible to identify the approximate distribution of N(t) for large t?
Yes!
N(t) is asymptotically normally distributed.
Theorem 2.39 — Central limit theorem for renewal processes (Ross, 1983,
Theorem 3.3.5, pp. 62–63)
Let N(t) : t ≥ 0 be a renewal process whose inter-renewal times Xi have common
expected value µ and finite variance σ2. Then
lim_{t→+∞} P[ (N(t) − t/µ) / √(tσ²/µ³) < z ] = Φ(z). (2.22)
Consequently, for sufficiently large t, the distribution of N(t) is approximately normal
11 N(+∞) represents the total number of renewals that occur. Moreover, N(+∞) = +∞ implies m(+∞) = +∞.
12 Ross (1983, p. 58) and Ross (2003, p. 403) note that P[N(+∞) < +∞] = P(Xi = +∞ for some i) = P(∪_{i=1}^{+∞} {Xi = +∞}) ≤ ∑_{i=1}^{+∞} P(Xi = +∞) = 0. We should add that this is true if F(+∞) = 1, that is, if the renewal process N(t) : t ≥ 0 is recurrent, as put by Kulkarni (1995, p. 409).
with mean t/µ and variance tσ²/µ³, and we have
P[N(t) < n] ≈ Φ( (n − t/µ) / √(tσ²/µ³) ). (2.23)
•
Exercise 2.40 — Central limit theorem for renewal processes
Prove Theorem 2.39 (Ross, 1983, pp. 62–63). •
Exercise 2.41 — Central limit theorem for renewal processes (Kulkarni, 1995,
Example 8.13, p. 413)
Suppose a part in a machine is available from two different sources, a and b. When the
part fails it is replaced by a new one from source a (resp. b) with probability 0.3 (resp. 0.7)
independently of everything else. A part from source a (resp. b) lasts for an exponentially
distributed time with a mean of 8 (resp. 5) days and it takes exactly 1 (resp. half a) day
to install it. Moreover, assume that a failure has taken place just before time 0.
Compute the approximate distribution of the number of failures during the first year
(not counting the one just before time 0). •
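One way to apply Theorem 2.39 here: the inter-renewal time is a mixture (exponential lifetime plus fixed installation time), whose moments are computed below. This is my own sketch of a solution, not the official one:

```python
from math import sqrt

# inter-renewal time = lifetime + installation time; mixture over sources:
# (probability, exponential mean lifetime in days, installation days)
comps = [(0.3, 8.0, 1.0), (0.7, 5.0, 0.5)]
EX = sum(p * (m + d) for p, m, d in comps)                 # 6.55 days
EX2 = sum(p * (m**2 + (m + d) ** 2) for p, m, d in comps)  # Var(Exp) = m^2
var = EX2 - EX**2

t = 365.0
mean_N = t / EX                 # ~55.7 failures in a year
sd_N = sqrt(t * var / EX**3)    # ~7.1
print(mean_N, sd_N)             # N(365) approx Normal(55.7, 7.1^2)
```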
Exercise 2.42 — Central limit theorem for renewal processes (bis)
Resume Exercise 2.35 and obtain an approximate value to P [N(50) ≥ 50], when:
(a) p = 1/2, µ1 = 1 and µ2 = 2;
(b) p = 1 and µ1 = 1. •
Although N(t) and m(t) go to infinity, as t → +∞, and N(t) is asymptotically normal, it would be quite useful to inquire at what rate N(t) and m(t) grow (Ross, 1983, p. 58; Ross, 2003, p. 407).
Can we calculate lim_{t→+∞} N(t)/t and lim_{t→+∞} m(t)/t?
Yes!
The limit of the sequence of r.v. N(n)/n : n ∈ N, which can be represented by lim_{t→+∞} N(t)/t, is given in the following proposition.
Proposition 2.43 — Strong law of large numbers for renewal processes (Ross,
1983, Proposition 3.3.1, pp. 58–59; Serfozo, 2009, Corollary 11, p. 105)
Let N(t) : t ≥ 0 be a renewal process whose inter-renewal times X1, X2, . . . have
common expected value µ. Then
N(t)/t → 1/µ w.p.1, (2.24)
that is, the long-run rate at which renewals occur equals 1/µ.13 •
Remark 2.44 — Almost sure convergence or convergence with probability 1
• Almost sure convergence — or convergence with probability one — is the
probabilistic version of pointwise convergence known from elementary real analysis.
• The sequence of r.v. Y1, Y2, . . . is said to converge almost surely or with probability
1 to a r.v. Y if
P({ω : lim_{n→+∞} Yn(ω) = Y(ω)}) = 1 (2.25)
(Karr, 1993, p. 135; Rohatgi, 1976, p. 249). In this case we write Yn →a.s. Y or Yn →w.p.1 Y. Moreover, equation (2.25) does not mean that lim_{n→+∞} P({ω : Yn(ω) = Y(ω)}) = 1.
• Almost sure convergence is preserved under continuous mappings (Karr, 1993, p.
148). •
Exercise 2.45 — Strong law of large numbers for renewal processes
Prove Proposition 2.43, by using the fact that S_{N(t)}/N(t) ≤ t/N(t) < S_{N(t)+1}/N(t) and by applying the strong law of large numbers (SLLN)14 (Ross, 1983, pp. 58–59) and then the preservation of almost sure convergence under continuous mappings. •
13 Since the rate at which renewals occur will equal 1/µ w.p.1, 1/µ is also called the rate of the renewal process (Ross, 1983, p. 59). This result is valid for both finite and infinite µ (Kulkarni, 1995, pp. 410–411).
14 The SLLN for i.i.d. r.v. in L1 (or Kolmogorov's SLLN) can be stated as follows. Let Y1, Y2, . . . be a sequence of i.i.d. r.v. to Y. Then Ȳn = (1/n) ∑_{i=1}^{n} Yi →a.s. µ iff Y ∈ L1 (i.e., E(|Y|) < +∞), and then µ = E(Y) (Karr, 1993, p. 188; Rohatgi, 1976, p. 274, Theorem 7). Note that if µ = ∞, then we have to use a more delicate argument to show that the result is still valid (Kulkarni, 1995, p. 411).
Exercise 2.46 — Strong law of large numbers for renewal processes (Ross, 2003,
examples 7.5 and 7.6, pp. 409–410)
Evaristo has a radio that works on a single 3 volt battery. As soon as the battery fails,
Evaristo immediately replaces it with a new battery.
(a) Admit that the lifetime (in hours) of those batteries is uniformly distributed over the
interval [30, 60]. At what rate does Evaristo have to change batteries in the long-run?
(b) Now, admit that Evaristo does not keep any surplus batteries and each time a failure
occurs he must go and buy a new battery spending a uniformly distributed time over
the interval [0, 1]. Recalculate the rate at which Evaristo has to change batteries in
the long-run. •
Exercise 2.47 — Strong law of large numbers for renewal processes (bis) (Ross,
2003, Exercise 7, p. 461)
Clotilde regrettably works on a temporary basis and the mean length of each job she gets is three months.
At what rate does Clotilde get new jobs in the long-run if the amount of time she spends unemployed is exponentially distributed with mean equal to 2? •
Exercise 2.48 — Strong law of large numbers for renewal processes (bis, bis)
(Ross, 2003, Example 7.8, pp. 411–412)
A game consists of a sequence of independent trials — each of which results in outcome
i with probability Pi (i = 1, . . . , n and∑n
i=1 Pi = 1) — which is observed until the same
outcome occurs k times in a row; this outcome is then declared to be the winner of the
game. For instance, if k = 2 and the sequence of outcomes is 1, 2, 4, 3, 5, 2, 1, 3, 3, then
the game stops after nine trials and number 3 is declared the winner.
(a) What is the probability that outcome i wins, i = 1, . . . , n?
(b) Determine the expected number of trials until an outcome is declared the winner. •
To prove that lim_{t→+∞} m(t)/t = 1/µ, which is not a simple consequence of Proposition 2.43 (Ross, 2003, p. 409),15 we have to digress to the notion of stopping time, state Wald's equation (Ross, 1983, p. 59) and establish such limit independently (Kulkarni, 1995, p. 418).
Definition 2.49 — Stopping time (Ross, 1983, p. 59)
An integer-valued r.v. N is said to be a stopping time for the sequence of independent
r.v. Xi : i ∈ N if the event N = n is independent of Xn+1, Xn+2, . . . , for all n ∈ N.16 •
Exercise 2.50 — Stopping time
Let Xi : i ∈ N be a sequence of i.i.d. r.v. with Bernoulli(1/2) distribution.17
(a) Prove that N = min{n : ∑_{i=1}^{n} Xi = 10} is a stopping time for this sequence.
(b) Now, consider Yi =st 2Xi − 1, i ∈ N, i.e., P(Yi = −1) = P(Yi = 1) = 1/2.
Show that N = min{n : ∑_{i=1}^{n} Yi = 1} is a stopping time for the sequence of i.i.d. r.v. Yi : i ∈ N. •
Exercise 2.51 — Stopping time (Ross, 2003, Exercise 13, pp. 462–463)
Let Xi : i ∈ N be a sequence of i.i.d. r.v. with Bernoulli(p) distribution, where 0 < p < 1,
and define:
(a) N1 = inf{n ∈ N : ∑_{i=1}^{n} Xi = 5};
(b) N2 = 3, if X1 = 0; 5, if X1 = 1;
(c) N3 = 3, if X4 = 0; 2, if X4 = 1.
Which of these three r.v. are stopping times for the sequence Xi : i ∈ N? Justify. •
15 Because it is not always true that lim_{t→+∞} m(t)/t (= lim_{t→+∞} E[N(t)/t]) = E[lim_{t→+∞} N(t)/t]; after all, almost sure convergence does not imply convergence in expected value.
16 N essentially represents the number of r.v. observed before stopping.
17 Ross (1983, p. 59) illustrates the notion of stopping time without proving that we are indeed dealing with two stopping times for two sequences of r.v.
Proposition 2.52 — Wald’s equation (Ross, 1983, Theorem 3.3.2, p. 59)
Let:
• Xi : i ∈ N be a sequence of i.i.d. r.v. with common finite expectation E(X);
• N be a stopping time for the sequence Xi : i ∈ N such that E(N) <∞.
Then
E(∑_{i=1}^{N} Xi) = E(N) × E(X). (2.26)
•
Exercise 2.53 — Wald’s equation
Prove Proposition 2.52 (Ross, 1983, pp. 59–60). •
Exercise 2.54 — Wald’s equation (Ross, 2003, Exercise 115, p. 464)
Consider a miner trapped in a room that contains 3 doors:
• door 1 leads him/her to freedom after two days of travel;
• door 2 returns him/her to the room after a four-day journey;
• door 3 returns him/her to the room after a six-day journey.
Suppose at all times the miner is equally likely to choose any of the 3 doors, and let T
denote the time it takes the miner to become free.
(a) Define a sequence of i.i.d. r.v. Xi : i ∈ N and a stopping time N such that T = ∑_{i=1}^{N} Xi.
(b) Use Wald’s equation to obtain E(T ).
(c) Compute E(∑_{i=1}^{N} Xi | N = n) and verify that it is not equal to E(∑_{i=1}^{n} Xi).
(d) Use part (c) for a second derivation of E(T). •
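Wald's equation can be checked by simulating the miner: with Xi i.i.d. uniform on {2, 4, 6} and N the first trial at which door 1 is chosen, E(N) = 3 and E(X) = 4, so E(T) = 12. A Monte Carlo sketch:

```python
import random

random.seed(3)
reps, total = 200_000, 0
for _ in range(reps):
    while True:                       # keep choosing doors until freedom
        x = random.choice((2, 4, 6))  # travel times; 2 = door to freedom
        total += x
        if x == 2:
            break

print(total / reps)   # ~12 = E(N) * E(X) = 3 * 4
```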
Exercise 2.55 — Back to stopping times
Argue that:
(a) N(t)+1 is indeed a stopping time for the sequence of inter-renewal times, X1, X2, . . .
(Ross, 1983, p. 60);
(b) N(t) is not a stopping time for X1, X2, . . . .18 •
Since N(t) + 1 is a stopping time for the sequence of inter-renewal times, we can use
Wald’s equation to state the following auxiliary result that plays a crucial role in the proof
of the elementary renewal theorem.
Proposition 2.56 — Relating E[S_{N(t)+1}] and the renewal function (Ross, 1983, Corollary 3.3.3, p. 61)
Let N(t) : t ≥ 0 be a renewal process, whose inter-renewal times X1, X2, . . . have common (and finite) expected value µ, and m(t) be its renewal function. Then the expected value of S_{N(t)+1} = ∑_{i=1}^{N(t)+1} Xi, the time of the first renewal after time t, is equal to:
E[S_{N(t)+1}] = µ × [m(t) + 1]. (2.27)
•
Exercise 2.57 — Relating E[S_{N(t)+1}] and the renewal function
Prove Proposition 2.56. •
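For a Poisson process the renewal function is known exactly (m(t) = λt), so (2.27) predicts E[S_{N(t)+1}] = µ × [m(t) + 1] = t + 1/λ. The following Monte Carlo sketch (our own illustration; λ = 1 and t = 10 are arbitrary choices) checks this.

```python
import random

def first_renewal_after(t, rng, lam=1.0):
    """S_{N(t)+1}: the time of the first renewal after t, Exp(lam) inter-renewals."""
    s = 0.0
    while s <= t:
        s += rng.expovariate(lam)
    return s

rng = random.Random(2)
t, reps = 10.0, 100_000
est = sum(first_renewal_after(t, rng) for _ in range(reps)) / reps
# For a PP(lam): mu * [m(t) + 1] = (1/lam) * (lam*t + 1) = t + 1/lam = 11
print(round(est, 2))
```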
Theorem 2.58 — Elementary renewal theorem (Ross, 1983, Theorem 3.3.4, p. 61; Ross, 2003, p. 409; Kulkarni, 1995, Theorem 8.9, p. 417)
Let {N(t) : t ≥ 0} be a renewal process whose inter-renewal times X1, X2, . . . have common expected value µ. Then
lim_{t→+∞} m(t)/t = 1/µ, (2.28)
where 1/∞ ≡ 0. That is, the expected average renewal rate converges to 1/µ. •
18 Hint: N(t) = n ⇔ Sn = ∑_{i=1}^{n} Xi ≤ t and S_{n+1} = ∑_{i=1}^{n+1} Xi > t (Ross, 2003, p. 464).
Exercise 2.59 — Elementary renewal theorem
(a) Prove Theorem 2.58 (Ross, 1983, p. 61; Kulkarni, 1995, pp. 419–420).
(b) Resume Exercise 2.23 and apply the elementary renewal theorem to obtain lim_{t→+∞} m(t)/t (Kulkarni, 1995, Example 8.21, pp. 430–431). •
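A numerical illustration of Theorem 2.58 (our own sketch, not from the sources): with Uniform(0, 2) inter-renewal times, µ = 1, so m(t)/t should approach 1 for large t.

```python
import random

def n_renewals(t, rng):
    """N(t) for Uniform(0, 2) inter-renewal times (mu = 1)."""
    s, n = 0.0, 0
    while True:
        s += rng.uniform(0.0, 2.0)
        if s > t:
            return n
        n += 1

rng = random.Random(7)
t, reps = 200.0, 20_000
m_t = sum(n_renewals(t, rng) for _ in range(reps)) / reps  # estimate of m(t) = E[N(t)]
rate = m_t / t
# Elementary renewal theorem: m(t)/t -> 1/mu = 1
print(round(rate, 3))
```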
Now, it is time to study the limiting behavior of the solutions of the renewal-type
equations (Kulkarni, 1995, p. 428). But before we proceed, we need to define lattice r.v.19
and its period,20 and also directly Riemann integrable (dRi) functions.
Definition 2.60 — Lattice r.v. and its period (Ross, 1983, p. 63; Kulkarni, 1995,
Definition 8.3, p. 428)
A non-negative r.v. X and its c.d.f. F are said to be lattice if there exists a constant d > 0 such that ∑_{n=0}^{+∞} P(X = nd) = 1, that is, if X only takes on integral multiples of some positive number d.
The largest d having this property is said to be the period of X. •
Example 2.61 — Lattice r.v. and its period (Ross, 1983, p. 63; Kulkarni, 1995,
Example 8.18, p. 428)
The r.v. taking values in the following sets are lattice:
• {0, 1, 2, . . .} (d = 1);
• {0, 2, 4, . . .} (d = 2);
• {0, √2} (d = √2). •
Definition 2.62 — Directly Riemann integrable (dRi) function (Caravena, 2012)
A non-negative function D, defined on (the real line or on) a half-line, is said to be
directly Riemann integrable if the upper and lower Riemann sums of D over the whole
(unbounded) domain converge to the same finite limit, as the mesh of the partition
vanishes. •
19 Or arithmetic or periodic r.v.
20 Or span.
Remark 2.63 — Directly Riemann integrable function (Ross, 1983, p. 64)
Let:
• D be a function defined on [0, +∞);
• m̄_n(a) (resp. m_n(a)) be the supremum (resp. infimum) of D(t) over the interval [(n − 1)a, na], for any a > 0.
Then D is said to be a directly Riemann integrable function if ∑_{n=1}^{+∞} m̄_n(a) and ∑_{n=1}^{+∞} m_n(a) are finite, for all a > 0, and lim_{a→0} ∑_{n=1}^{+∞} a m̄_n(a) = lim_{a→0} ∑_{n=1}^{+∞} a m_n(a).
A (jointly!) sufficient condition for D to be a directly Riemann integrable function is that
(i) D(t) ≥ 0, t ≥ 0,
(ii) D(t) is non-increasing, and
(iii) ∫_0^{+∞} D(t) dt < +∞. •
Theorem 2.64 — Key renewal theorem (Ross, 1983, Theorem 3.4.2, p. 65; Kulkarni, 1995, Theorem 8.11, pp. 428–429)
Let:
• {N(t) : t ≥ 0} be a renewal process, with renewal function m(t) and whose inter-renewal times {Xi : i ∈ N} have common c.d.f. F and expected value µ;
• D(t) be a directly Riemann integrable (dRi) function;
• H(t) be a solution to the following renewal-type equation:
H(t) = D(t) + ∫_0^t H(t − x) dF(x) = D(t) + ∫_0^t D(t − x) dm(x).
If F is not lattice then
lim_{t→+∞} H(t) = lim_{t→+∞} ∫_0^t H(t − x) dF(x) = lim_{t→+∞} ∫_0^t D(t − x) dm(x) = (1/µ) ∫_0^{+∞} D(y) dy. (2.29)
If F is lattice with period d then
lim_{k→+∞} H(kd + x) = (d/µ) ∑_{n=0}^{+∞} D(nd + x). (2.30) •
Remark 2.65 — Key renewal theorem
The proof of the key renewal theorem is excruciating and can be found in: Feller (1971, Vol. II, pp. 364–366), for non-lattice distributions; Feller (1968, Vol. I, pp. 335–337), for lattice distributions. •
Exercise 2.66 — Key renewal theorem
Use the key renewal theorem to obtain lim_{t→+∞} [m(t) − t/µ], when the inter-renewal times are not lattice and have finite variance (Kulkarni, 1995, Example 8.23, pp. 431–433). •
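Exercise 2.83(b) below records this limit as E(X²)/(2µ²) − 1, which invites a numerical check (our own sketch): for Gamma(2, 1) inter-renewal times, µ = 2 and E(X²) = 6, so the limit is 6/8 − 1 = −0.25, and m(t) is already very close to t/2 − 0.25 at moderate t.

```python
import random

def n_renewals(t, rng):
    """N(t) for Gamma(2, 1) inter-renewal times (mu = 2, E[X^2] = 6)."""
    s, n = 0.0, 0
    while True:
        s += rng.gammavariate(2, 1.0)
        if s > t:
            return n
        n += 1

rng = random.Random(1)
t, reps = 50.0, 40_000
m_t = sum(n_renewals(t, rng) for _ in range(reps)) / reps  # estimate of m(t)
excess = m_t - t / 2.0
# Key renewal theorem: m(t) - t/mu -> E(X^2)/(2 mu^2) - 1 = 6/8 - 1 = -0.25
print(round(excess, 2))
```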
The next theorem is an application of the key renewal theorem (Kulkarni, 1995, p.
429).
Theorem 2.67 — Blackwell's (renewal) theorem (Ross, 1983, Theorem 3.4.1, p. 63)
Let {N(t) : t ≥ 0} be a renewal process, with renewal function m(t) and whose inter-renewal times X1, X2, . . . have common c.d.f. F and expected value µ.
• If F is not lattice then
lim_{t→+∞} [m(t + a) − m(t)] = a/µ. (2.31)
• If F is lattice with period d then
lim_{n→+∞} E[number of renewals at nd] = d/µ. (2.32) •
Remark 2.68 — Interpreting Blackwell's theorem (Ross, 1983, pp. 63–64)
• Blackwell's theorem states that if F is not lattice then the expected number of renewals in an interval of length a, far from the origin, is approximately a/µ, i.e., it is proportional to the length of the interval (a) and to the long-run rate at which renewals occur (1/µ).
• If F is lattice with period d then lim_{t→+∞} [m(t + a) − m(t)] does not exist because renewals can only occur at integral multiples of d and, thus, the expected number of renewals in an interval far from the origin would clearly depend on how many integer multiples of the period d it contains and not on the interval length. In the lattice case the relevant limit is that of the expected number of renewals at nd, which is proportional to the period (d) and to the long-run rate at which renewals occur (1/µ). •
Remark 2.69 — Relating Blackwell's theorem and the key renewal theorem
Blackwell’s theorem and the key renewal theorem can be shown to be equivalent (Ross,
1983, p. 65). In fact, we can deduce Blackwell’s theorem from the key renewal theorem,
by considering a function D(t) = I[0,h](t), for a fixed h > 0 (Kulkarni, 1995, pp. 429–430);
the reverse can be proven by approximating the directly Riemann integrable function D(t)
with step functions (Ross, 1983, p. 65). •
Exercise 2.70 — Blackwell's theorem
Show that Blackwell's theorem is verified in the following cases:
(a) {N(t) : t ≥ 0} ∼ PP(λ) (Kulkarni, 1995, Example 8.20, p. 430);
(b) a renewal process {N(t) : t ≥ 0} whose inter-renewal times have a Gamma(α = 2, λ = 1) distribution. •
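Part (b) can also be probed numerically: for Gamma(α = 2, λ = 1) inter-renewal times, µ = 2, so the expected number of renewals in an interval of length a = 1 far from the origin should be close to a/µ = 0.5. The sketch below is our own illustration (not a proof); the window parameters are arbitrary.

```python
import random

def renewals_in_window(t, a, rng):
    """Number of renewals in (t, t+a] for Gamma(2, 1) inter-renewal times."""
    s, count = 0.0, 0
    while s <= t + a:
        s += rng.gammavariate(2, 1.0)  # inter-renewal time, mu = 2
        if t < s <= t + a:
            count += 1
    return count

rng = random.Random(3)
t, a, reps = 30.0, 1.0, 50_000
est = sum(renewals_in_window(t, a, rng) for _ in range(reps)) / reps
# Blackwell's theorem: m(t + a) - m(t) -> a/mu = 0.5
print(round(est, 3))
```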
2.6 Recurrence times; the inspection paradox
Now, we study the following r.v. associated with a renewal process {N(t) : t ≥ 0}.
Definition 2.71 — Age, residual life and total life at time t (Kulkarni, 1995, p.
433; Ross, 1983, pp. 67-68)
Let {N(t) : t ≥ 0} be a renewal process whose inter-renewal times are not lattice.
Then, for t ≥ 0, we define
A(t) = t − S_{N(t)}, (2.33)
Y(t) = S_{N(t)+1} − t, (2.34)
X_{N(t)+1} = S_{N(t)+1} − S_{N(t)} = A(t) + Y(t), (2.35)
which are called the age at time t, the residual (or excess) life at time t and the total life at time t, respectively. •
Remark 2.72 — Age, residual life and total life at time t (Kulkarni, 1995, p. 433;
Ross, 1983, pp. 67-68)
• A(t) represents the time from t since the last renewal and is sometimes called the
backward recurrence time.
• Y (t) denotes the time from t until the next renewal and is called the forward
recurrence time.
• X_{N(t)+1} represents the time between the last renewal before (or at) t and the first renewal after t, i.e., the inter-renewal time covering t. •
Exercise 2.73 — Age, residual life and total life at time t
(a) Draw a scheme with A(t), Y(t) and X_{N(t)+1} (Kulkarni, 1995, p. 434).
(b) Draw sample paths of the following stochastic processes:
(i) {A(t) : t ≥ 0} (the age process),
(ii) {Y(t) : t ≥ 0} (the residual life process) and
(iii) {X_{N(t)+1} : t ≥ 0} (the total life process)
(Kulkarni, 1995, pp. 434–435).21 •
Exercise 2.74 — Age, residual life and total life at time t (Ross, 1983, Exercise
3.11, p. 95)
Let A(t) and Y (t) denote the age and residual life at t of a renewal process. Fill in the
missing terms, considering 0 < x ≤ t and y > 0:
(a) A(t) > x ⇔ 0 events in the interval ______;
(b) Y(t) > y ⇔ 0 events in the interval ______;
(c) P[Y(t) > y] = P[A(______) > ______]. •
Exercise 2.75 — Age, residual life and total life at time t (bis) (Ross, 1983,
Exercises 3.11 and 3.12, p. 95)
Let A(t) and Y(t) denote the age and residual life at t of a renewal process, {N(t) : t ≥ 0}, with inter-renewal c.d.f. F. Consider 0 < x ≤ t, 0 ≤ s ≤ t + x/2, 0 ≤ u < t + x and y > 0, and find:
(a) the joint c.d.f. of (A(t), Y (t)) for a Poisson process;
(b) P [Y (t) > y | A(t) = x];
(c) P [Y (t) > x | A(t+ x/2) = s];
(d) P [Y (t) > x | A(t+ x) > u] for a Poisson process;
(e) P[A(t) > x, Y(t) > y]. •
21 According to Kulkarni (1995, p. 435), the sample paths of the age process have slope 1 and downward jumps of size Xn at Sn; the sample paths of the residual life process decrease at a unit rate with upward jumps of size X_{n+1} at Sn; the sample paths of the total life process are piecewise constant with upward or downward jumps of size X_{n+1} − Xn at Sn.
Proposition 2.76 — Relating E[Y (t)] and the renewal function
Let {N(t) : t ≥ 0} be a renewal process, whose inter-renewal times X1, X2, . . . have common (and finite) expected value µ, and let m(t) be its renewal function. Then the expected residual life is equal to
E[Y(t)] = µ × [m(t) + 1] − t. (2.36) •
Exercise 2.77 — Relating E[Y (t)] and the renewal function
Prove Proposition 2.76. •
Exercise 2.78 — Relating E[Y (t)] and the renewal function
Consider the renewal process whose inter-renewal times have a hypo-exponential distribution with parameters µ1 and µ2 (i.e., we are dealing with a convolution of two exponentials).22
(a) Obtain the renewal function m(t) using the relationship between the LST of m(t) and that of the common c.d.f. F(t) of the inter-renewal times: m̃(s) = F̃(s) / [1 − F̃(s)].23
(b) Determine the expected residual life at time t, E[Y (t)]. •
A curious feature of renewal processes is that if we wait some predetermined time t and then observe how large the renewal interval containing time t is, we should expect it to be larger than a typical renewal interval (http://en.wikipedia.org/wiki/Renewal_theory#The_inspection_paradox).
This counterintuitive fact is called the inspection paradox (Kulkarni, 1995, p. 439) and is formalized in the following proposition.
22 For more details check Proposition 1.19.
23 Ross (2003, Example 7.9, pp. 414–415) obtained the renewal function by first determining the expected residual life via a continuous-time Markov chain reasoning.
Proposition 2.79 — Inspection paradox (Ross, 1983, Exercise 3.3, p. 93)
Let:
• {N(t) : t ≥ 0} be a renewal process with inter-renewal times {Xi : i ∈ N} and inter-renewal distribution F;
• X_{N(t)+1} be the inter-renewal time covering t.
Then
P[X_{N(t)+1} > x] ≥ P(X > x), (2.37)
for any x > 0, i.e., X_{N(t)+1} ≥st Xi, i ∈ N.24 •
Exercise 2.80 — Inspection paradox (Ross, 1983, Exercise 3.3, p. 93)
(a) Prove Proposition 2.79 (Ross, 2003, p. 438).
(b) Compute P[X_{N(t)+1} > x] when F(x) = 1 − e^{−λx}, x ≥ 0 (Ross, 2003, pp. 439–440). •
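A Monte Carlo sketch of the inspection paradox in the setting of part (b), with λ = 1 (our own illustration; the function name is ours): the interval covering t should be stochastically larger than a typical Exp(1) inter-renewal time, whose mean is 1 and whose survival probability at 1 is e^{−1} ≈ 0.37.

```python
import random

def covering_interval(t, rng, lam=1.0):
    """Length X_{N(t)+1} of the inter-renewal interval covering time t."""
    s_prev, s = 0.0, 0.0
    while s <= t:
        s_prev = s
        s += rng.expovariate(lam)
    return s - s_prev

rng = random.Random(11)
t, reps = 20.0, 50_000
xs = [covering_interval(t, rng) for _ in range(reps)]
mean_cover = sum(xs) / reps                  # close to 2/lam for large t
frac_gt1 = sum(x > 1.0 for x in xs) / reps   # compare with P(X > 1) = e^{-1}
print(round(mean_cover, 2), round(frac_gt1, 3))
```

The covering interval averages about twice a typical inter-renewal time, as the limit (2.40) below anticipates.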
By capitalizing on limit theorems we are able to derive several results concerning the
limit behavior of the age, the residual life and the total life of a renewal process. Two of
those results are stated as an exercise.
Exercise 2.81 — Limit behavior of A(t)/t and E[Y(t)]/t
Use the:
(a) SLLN for renewal processes to prove that A(t)/t → 0 w.p.1 (Ross, 1983, Exercise 3.12, p. 95);
(b) elementary renewal theorem to show that lim_{t→+∞} E[Y(t)]/t = 0 (Ross, 2003, p. 414). •
Can we determine the limit behavior of E[A(t)], E[Y(t)] and E[X_{N(t)+1}]?
Yes!
We have to capitalize on the key renewal theorem for non-lattice inter-renewal times.
24 X_{N(t)+1} ≥st Xi reads as follows: X_{N(t)+1} is stochastically larger than Xi. Moreover, X_{N(t)+1} ≥st Xi, i ∈ N ⇒ E[X_{N(t)+1}] ≥ E(Xi), i ∈ N.
Proposition 2.82 — Limit behavior of E[Y(t)], E[A(t)] and E[X_{N(t)+1}] (Ross, 1983, Proposition 3.4.6, p. 71; Kulkarni, 1995, Theorem 8.13 and Corollary 8.4, pp. 438–439)
Consider a renewal process {N(t) : t ≥ 0} whose inter-renewal times have a common non-lattice distribution F, expected value E(X) = µ and E(X²) < +∞. Then:
lim_{t→+∞} E[Y(t)] = E(X²)/(2µ); (2.38)
lim_{t→+∞} E[A(t)] = E(X²)/(2µ); (2.39)
lim_{t→+∞} E[X_{N(t)+1}] = E(X²)/µ. (2.40) •
Note that lim_{t→+∞} E[X_{N(t)+1}] = E(X²)/µ ≥ µ = E(X), thus agreeing with the inspection paradox.
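A quick numerical check of (2.38)–(2.39) (our own sketch, with an arbitrary choice of distribution): for Uniform(0, 2) inter-renewal times, µ = 1 and E(X²) = 4/3, so both limiting expectations equal 2/3.

```python
import random

def age_and_residual(t, rng):
    """(A(t), Y(t)) for Uniform(0, 2) inter-renewal times."""
    s_prev, s = 0.0, 0.0
    while s <= t:
        s_prev = s
        s += rng.uniform(0.0, 2.0)
    return t - s_prev, s - t      # age and residual life at time t

rng = random.Random(5)
t, reps = 50.0, 40_000
pairs = [age_and_residual(t, rng) for _ in range(reps)]
mean_age = sum(a for a, _ in pairs) / reps
mean_res = sum(y for _, y in pairs) / reps
# Proposition 2.82 with mu = 1 and E(X^2) = 4/3: both limits equal 2/3
print(round(mean_age, 3), round(mean_res, 3))
```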
Exercise 2.83 — Limit behavior of E[Y(t)] and m(t) − t/µ
(a) Prove Proposition 2.82 (Ross, 1983, pp. 70–71; Kulkarni, 1995, pp. 438–439).25
(b) Use Proposition 2.82 to prove that if E(X²) < +∞ and X is not lattice then
lim_{t→+∞} [m(t) − t/µ] = E(X²)/(2µ²) − 1.26 •
Exercise 2.84 — Limit behavior of E[Y(t)]
Consider a renewal process whose inter-arrival distribution is Gamma(n, λ).
(a) Use Proposition 2.82 to prove that lim_{t→+∞} E[Y(t)] = (n + 1)/(2λ) (Ross, 1983, Exercise 3.13, p. 95).
(b) Compute lim_{t→+∞} E[Y(t)], by capitalizing not only on the fact that the inter-arrival distribution is a sum of n independent and exponentially distributed r.v., but also on the lack of memory of the exponential distribution and any convenient properties of the Poisson process. •
25 To prove the first (resp. second) result, derive the following renewal-type equation: E[Y(t)] = ∫_t^{+∞} (x − t) dF(x) + ∫_0^t E[Y(t − x)] dF(x), t ≥ 0 (resp. E[A(t)] = t × [1 − F(t)] + ∫_0^t E[A(t − x)] dF(x), t ≥ 0).
26 This should be the result of Exercise 2.66.
Can we use the key renewal theorem to derive the limiting survival function of A(t)
and Y (t)?
Yes!
Proposition 2.85 — Obtaining the limiting survival function of Y(t) via the key renewal theorem (Kulkarni, 1995, Theorem 8.12, p. 435)
Consider a renewal process {N(t) : t ≥ 0} whose inter-renewal times have a common non-lattice distribution F and expected value E(X) = µ. Then
lim_{t→+∞} P[Y(t) > x] = (1/µ) ∫_x^{+∞} [1 − F(u)] du, x > 0. (2.41) •
Exercise 2.86 — Obtaining the limiting survival function of Y (t) via the key
renewal theorem
Prove Proposition 2.85 (Kulkarni, 1995, pp. 435–436).27 •
Proposition 2.87 — Obtaining the limiting survival function of A(t) (Kulkarni, 1995, Corollary 8.2, p. 436)
Under the conditions of Proposition 2.85, the limiting survival function of A(t) is given by
lim_{t→+∞} P[A(t) > y] = (1/µ) ∫_y^{+∞} [1 − F(u)] du, y > 0. (2.42) •
Exercise 2.88 — Obtaining the limiting survival function of A(t)
Prove Proposition 2.87 (Kulkarni, 1995, p. 436).28 •
27 Consider H(t) = P[Y(t) > x], show that H(t) satisfies the renewal-type equation H(t) = [1 − F(x + t)] + ∫_0^t H(t − u) dF(u), and then apply the key renewal theorem.
28 Capitalize on the fact that A(t) > y ⇔ no renewals in [t − y, t] ⇔ Y(t − y) > y.
Exercise 2.89 — Obtaining the limiting c.d.f. of Y(t) and A(t)
Use propositions 2.85 and 2.87 to show that lim_{t→+∞} P[Y(t) ≤ x] = lim_{t→+∞} P[A(t) ≤ x] = (1/µ) ∫_0^x [1 − F(u)] du. •
Remark 2.90 — Equilibrium distribution (Kulkarni, 1995, p. 437; Ross, 2003, pp. 432 and 469)
The c.d.f. Fe(x) = (1/µ) ∫_0^x [1 − F(u)] du is called the equilibrium distribution associated with the inter-renewal distribution F. It represents the long-run proportion of time that the age and the residual life of the renewal process do not exceed x. •
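The equilibrium c.d.f. can be made concrete with a small simulation (our own illustration, with an arbitrary choice of F): for Uniform(0, 2) inter-renewal times, Fe(x) = x − x²/4 on [0, 2], so the empirical distribution of Y(t) for large t should match it, e.g. Fe(1) = 0.75.

```python
import random

def residual_life(t, rng):
    """Y(t) = S_{N(t)+1} - t for Uniform(0, 2) inter-renewal times (mu = 1)."""
    s = 0.0
    while s <= t:
        s += rng.uniform(0.0, 2.0)
    return s - t

def fe(x):
    """Equilibrium c.d.f.: (1/mu) * integral_0^x (1 - u/2) du = x - x^2/4."""
    return x - x * x / 4.0

rng = random.Random(9)
t, reps = 50.0, 40_000
ys = [residual_life(t, rng) for _ in range(reps)]
emp = sum(y <= 1.0 for y in ys) / reps   # empirical P[Y(t) <= 1]
print(round(emp, 3), fe(1.0))            # both should be close to 0.75
```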
Exercise 2.91 — Equilibrium distribution (Ross, 2003, Exercise 42, p. 469)
Let Fe(x) = (1/µ) ∫_0^x [1 − F(u)] du be the equilibrium distribution associated with the inter-renewal distribution F.
(a) Show that if F is an exponential distribution then F = Fe. Comment on this result.
(b) Let c be some positive constant and F(x) = I_{[c,+∞)}(x) (i.e., the inter-renewal times are all equal to c). Show that Fe is the uniform distribution over (0, c).
(c) The city of Berkeley, California, allows two hours of parking at all non-metered locations within one mile of the University of California. Parking officials regularly tour around, passing the same point every 2 hours. When an official encounters a car, he/she marks it with chalk. If the same car is there on the official's return 2 hours later, then a parking ticket is written.
What is the probability you receive a ticket if you park your car in one of those locations and return after 3 hours? •
2.7 Renewal reward processes
Can we generalize compound Poisson processes and study reward models associated with
renewal processes?
Yes!
They are called renewal reward processes: each time a renewal occurs we receive a reward. These processes are formally defined as follows.
Definition 2.92 — Renewal reward process (Kulkarni, 1995, p. 452; Ross, 2003, pp. 416–417)
Let:
• {N(t) : t ≥ 0} be a renewal process;
• Xn be the nth inter-renewal time (n ∈ N);
• Rn be the reward earned at the time of the nth renewal (n ∈ N);
• {(Xn, Rn) : n ∈ N} i.i.d. ∼ (X, R);29
• R(t) = ∑_{n=1}^{N(t)} Rn be the total reward earned by time t.30
Then {R(t) : t ≥ 0} is called a renewal reward process. •
Exercise 2.93 — Renewal reward process
(a) Are renewal processes and compound PP examples of renewal reward processes?
(b) Give a detailed example of a renewal reward process.31
(c) Having in mind that Rn is a real-valued r.v., draw a typical sample path of a renewal reward process {R(t) : t ≥ 0} (Kulkarni, 1995, p. 453).32 •
29 We shall assume that the rewards Rn, n ∈ N, can (and usually do) depend on Xn, n ∈ N.
30 R(t) = 0 if N(t) = 0.
31 See, for instance, Kulkarni (1995, Example 8.33, p. 453).
32 The sample paths of {R(t) : t ≥ 0} may go up and down, and a jump of size Rn occurs at time Sn = ∑_{i=1}^{n} Xi.
Computing the distribution of R(t) is rather difficult (Kulkarni, 1995, p. 454), and obtaining the expected value of R(t) is far from trivial, namely because N(t) is not a stopping time for the sequence of i.i.d. inter-renewal times (nor for the sequence of i.i.d. rewards).
Consequently, the question arises as to whether it is possible to study the limit behavior of R(t)/t and E[R(t)]/t.
Yes!
We can use the SLLN for renewal processes (resp. Wald's equation and the elementary renewal theorem) to compute lim_{t→+∞} R(t)/t (resp. lim_{t→+∞} E[R(t)]/t).
Proposition 2.94 — SLLN (and elementary renewal theorem) for renewal reward processes (Ross, 2003, Proposition 7.3, p. 417)
Let {R(t) : t ≥ 0} be a renewal reward process such that the common expected values of the rewards and of the inter-renewal times, E(R) and E(X), are finite. Then
R(t)/t → E(R)/E(X) w.p.1, (2.43)
i.e., the long-run reward per time unit equals E(R)/E(X). Moreover,
lim_{t→+∞} E[R(t)]/t = E(R)/E(X). (2.44) •
Exercise 2.95 — SLLN (and elementary renewal theorem) for renewal reward
processes
Prove Proposition 2.94 (Ross, 1983, pp. 78-79).33 •
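A simulation sketch of Proposition 2.94 (our own example, not from the sources): take Exp(1/2) cycle lengths (E(X) = 2) and let the reward of a cycle be its squared length, so that E(R) = E(X²) = 8 and the long-run reward rate should approach E(R)/E(X) = 4. Note that the rewards depend on the cycle lengths, as the definition allows.

```python
import random

def reward_rate(t, rng):
    """R(t)/t for Exp(mean 2) cycle lengths X_n with rewards R_n = X_n^2."""
    s, total = 0.0, 0.0
    while True:
        x = rng.expovariate(0.5)   # cycle length, E(X) = 2
        if s + x > t:
            return total / t       # R(t) counts only completed cycles
        s += x
        total += x * x             # reward R_n = X_n^2, earned at the renewal

rng = random.Random(13)
rate = reward_rate(1_000_000.0, rng)
# Proposition 2.94: rate -> E(R)/E(X) = E(X^2)/E(X) = 8/2 = 4
print(round(rate, 2))
```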
Exercise 2.96 — SLLN for renewal reward processes (Ross, 2003, Example 7.10,
pp. 417–418)
Resume Exercise 2.109 and suppose that the amounts that the successive customers deposit in the bank are independent r.v. with common c.d.f. H and expected value µH, respectively.
At what rate do deposits accumulate in the long-run? •
33 The proof of (2.43) is similar to that of the SLLN for renewal processes (Proposition 2.43).
Exercise 2.97 — SLLN for renewal reward processes (bis) (Ross, 2003, Example
7.11, pp. 418–419)
The lifetime of a car is a continuous r.v. with c.d.f. H and p.d.f. h. Evaristo has a policy
that he buys a new car as soon as his old one either breaks down or reaches the age of T
years. Suppose that a new car costs C1 (thousand) euros and also that an additional cost
of C2 (thousand) euros is incurred whenever Evaristo’s car breaks down.
(a) Under the assumption that a used car has no resale value, how much does Evaristo spend on cars per time unit in the long-run?34
(b) Now, suppose the lifetime of a car (in years) is uniformly distributed over (0, 10),
T ≤ 10, C1 = 3 (thousand) euros, and C2 = 0.5 (thousand) euros. What value of T
minimizes Evaristo’s cost per time unit in the long-run? •
Exercise 2.98 — SLLN for renewal reward processes (bis, bis) (Ross, 2003,
Exercises 22–24, pp. 465–466)
Resume part (a) of Exercise 2.97.
(a) Recalculate the long-run cost per time unit if one assumes that a T -year-old car in
working order has an expected resale value of R(T ).
(b) What value of T minimizes the previous cost per time unit in the long-run when:
(i) H represents the c.d.f. of the uniform distribution over (2, 8), C1 = 4 (thousand) euros, C2 = 1 (thousand) euros, and R(T) = 4 − T/2?
(ii) H is the c.d.f. of the exponential distribution with mean 5 years, C1 = 3
(thousand) euros, C2 = 0.5 (thousand) euros, and R(T ) = 0? Interpret the
result. •
Exercise 2.99 — SLLN for renewal reward processes (bis, bis, bis) (Ross, 2003,
Example 7.12, p. 420)
Suppose that:
34 Ross (2003, p. 419) called it the long-run average cost.
• customers arrive at a train depot according to a renewal process with mean inter-
arrival time µ;
• whenever there are N customers waiting in the depot, a train leaves;
• the depot incurs a cost at the rate of nc per time unit whenever there are n customers
waiting.
(a) What is the cost per time unit incurred by the train depot in the long-run?
(b) Suppose now that each time a train leaves, the depot incurs a cost of 6 monetary
units. What value of N minimizes the cost per time unit incurred by the train depot
in the long-run? •
Exercise 2.100 — Renewal reward processes (4bis) (Ross, 2003, Exercise 26, p.
466)
Resume Exercise 2.99 and suppose that:
• the customers arrive according to a Poisson process with rate λ;
• a train is summoned whenever there are N customers waiting in the depot, but the
train takes K time units to arrive at the depot;
• when the train arrives at the depot it picks up all waiting customers.
What is now the cost per time unit incurred by the train depot in the long-run? •
Exercise 2.101 — SLLN for renewal reward processes (5bis) (Ross, 2003,
Example 7.14, pp. 421–423)
Consider a manufacturing process that sequentially produces items, each of which is either
defective or acceptable. The following type of scheme is often employed in an attempt to
detect and eliminate most of the defective items:
• initially, every single item is inspected and this continues until there are k items
that are acceptable;
• at this point 100% inspection ends and each successive item is independently
inspected with probability α ∈ (0, 1);
• this partial inspection continues until a defective item is encountered, at which time
100% inspection is resumed, and the process begins anew.
Admit each item is defective with probability q, independently of the remaining items.
(a) What proportion of items are inspected in the long-run?
(b) If defective items are removed when detected, what proportion of the remaining items
are defective in the long-run? •
Exercise 2.102 — SLLN for renewal (reward) processes (Ross, 2003, Exercise 8,
p. 461)
A machine in use is replaced by a new machine either when it fails or when it reaches the
age of T years.
After having admitted that the lifetimes of the successive machines are independent with common c.d.f. F (resp. p.d.f. f), show that:
(a) the long-run rate at which machines are replaced equals 1 / {∫_0^T x f(x) dx + T × [1 − F(T)]};
(b) the long-run rate at which machines in use fail is given by F(T) / {∫_0^T x f(x) dx + T × [1 − F(T)]}. •
Exercise 2.103 — Key renewal theorem and renewal reward processes (bis,
bis) (Ross, 1983, Exercise 3.20, p. 97)
For a renewal reward process show that
lim_{t→+∞} E[R_{N(t)+1}] = E(R1 × X1) / E(X1). (2.45)
In this proof assume the inter-renewal distribution is not lattice and that any relevant function is dRi. •
2.8 Alternating renewal processes
Consider a system that alternates between two states, up (or on) and down (or off ), such that:
• the system is initially up/on and remains up/on for a random time U1;
• it then goes down/off and remains down/off for a random time D1;
• it then goes up/on for a time U2, then down/off for a time D2, etc.
Can we deduce the long-run proportion of time that the alternating renewal process
is up/on (resp. down/off)?
Yes!
It suffices to apply the key renewal theorem, but let us define first an alternating
renewal process.
Definition 2.104 — Alternating renewal process (Kulkarni, 1995, p. 447; Ross, 2003, pp. 66–67)
Let:
• Un be the nth up time;
• Dn be the nth down time;
• {(Un, Dn) : n ∈ N} i.i.d. ∼ (U, D);35
• Xn = Un + Dn be the duration of the nth up-and-down cycle;
• Z(t) be the state of the process at time t (1 ≡ up/on; 0 ≡ down/off ).
Then, with S0 = 0 and Sn = ∑_{i=1}^{n} Xi = ∑_{i=1}^{n} (Ui + Di),
Z(t) = 1, if ∃ n ∈ N0 : Sn ≤ t < Sn + U_{n+1}; Z(t) = 0, if ∃ n ∈ N0 : Sn + U_{n+1} ≤ t < S_{n+1}, (2.46)
and {Z(t) : t ≥ 0} is called an alternating renewal process.36 •
35 We allow Un and Dn to be dependent!
36 {(Un, Dn) : n ∈ N} is usually called the alternating renewal sequence.
Exercise 2.105 — Alternating renewal processes
Draw a typical sample path of an alternating renewal process (Kulkarni, 1995, p. 448). •
Proposition 2.106 — Key renewal theorem and alternating renewal processes (Ross, 1983, Theorem 3.4.4, p. 67; Kulkarni, 1995, Theorem 8.23, pp. 447–448)
Let:
• H, G and F be the distributions of Un, Dn and Xn = Un + Dn, respectively;
• E(U) = E(Un) (resp. E(D) = E(Dn)) denote the expected length of an up/on (resp. a down/off ) period;
• P(t) = P(system is up/on at time t) = P[Z(t) = 1];
• Q(t) = P(system is down/off at time t) = 1 − P(t) = P[Z(t) = 0].
If E(Xn) = E(Un + Dn) < +∞ and F is not lattice then the proportion of time that the system is up is, in the long-run, equal to
lim_{t→+∞} P(t) = E(U) / [E(U) + E(D)]. (2.47)
Moreover,
lim_{t→+∞} Q(t) = E(D) / [E(U) + E(D)]. (2.48) •
Remark 2.107 — Key renewal theorem and alternating renewal processes (Kulkarni, 1995, Theorem 8.23, p. 448)
If F is lattice with period d, then the results of Proposition 2.106 should be restated as follows:
lim_{n→+∞} P(nd) = E(U) / [E(U) + E(D)];
lim_{n→+∞} Q(nd) = E(D) / [E(U) + E(D)]. •
Exercise 2.108 — Key renewal theorem and alternating renewal processes
Prove Proposition 2.106 (Ross, 1983, p. 67; Kulkarni, 1995, pp. 448–449). •
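Proposition 2.106 can be illustrated with a small simulation (our own choices of distributions, not from the sources): Exp(1) up times (E(U) = 1) and Uniform(0, 1) down times (E(D) = 1/2) give a long-run up proportion of E(U)/[E(U) + E(D)] = 2/3.

```python
import random

def up_fraction(t_max, rng):
    """Fraction of [0, t_max] the system is up: Exp(1) up, Uniform(0,1) down."""
    t, up_time = 0.0, 0.0
    while t < t_max:
        u = rng.expovariate(1.0)        # up period, E(U) = 1
        up_time += min(u, t_max - t)    # clip the last up period at t_max
        t += u
        if t >= t_max:
            break
        t += rng.uniform(0.0, 1.0)      # down period, E(D) = 1/2
    return up_time / t_max

rng = random.Random(21)
frac = up_fraction(500_000.0, rng)
# Proposition 2.106: E(U)/[E(U) + E(D)] = 1/(1 + 0.5) = 2/3
print(round(frac, 3))
```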
Exercise 2.109 — Key renewal theorem and alternating renewal processes
Consider an M/G/1/1 system.37
(a) What is the rate at which customers enter the system in the long-run (Ross, 2003,
Example 7.7, p. 410)?
(b) What proportion of potential customers actually enter the system in the long-run (Ross, 2003, Example 7.7, p. 410)?
(c) Determine the long-run proportion of time that the server is busy. •
Exercise 2.110 — Key renewal theorem and alternating renewal processes
(bis)
Consider a single-server bank (with infinite capacity) to which customers arrive in accordance with a Poisson process with rate λ. Moreover, admit that the service time provided by the server is a r.v. with c.d.f. G and expected value µ^{−1} (µ > λ).38
Obtain the long-run proportion of time that the server is busy (Ross, 1989, Example 5.1d, pp. 322–325). •
The limit behavior of the c.d.f. of the age and the residual life of a renewal process can
be determined using Proposition 2.10639 and appropriate alternating renewal processes.
37 In this case: potential customers arrive at this single-server system according to a Poisson process with rate λ; a potential customer enters the system iff the only server is free when he/she arrives; the time spent in the system by an entering customer corresponds to the duration of the service provided by the server and is a r.v. with c.d.f. G.
38 We are dealing with an M/G/1 system.
39 Instead of the key renewal theorem.
Proposition 2.111 — Obtaining the limiting c.d.f. of A(t), Y(t) and X_{N(t)+1} via alternating renewal processes (Ross, 1983, Proposition 3.4.5, p. 68)
If the inter-renewal distribution F is not lattice and µ < +∞ then:
lim_{t→+∞} P[A(t) ≤ x] = (1/µ) ∫_0^x [1 − F(y)] dy; (2.49)
lim_{t→+∞} P[Y(t) ≤ x] = (1/µ) ∫_0^x [1 − F(y)] dy; (2.50)
lim_{t→+∞} P[X_{N(t)+1} ≤ x] = (1/µ) ∫_0^x y dF(y). (2.51) •
Exercise 2.112 — Obtaining the limiting c.d.f. of A(t), Y(t) and X_{N(t)+1} via alternating renewal processes
After having considered convenient alternating renewal processes, prove Proposition 2.111
(Ross, 1983, p. 68). •
Exercise 2.113 — Limiting c.d.f. of A(t) (Ross, 2003, Exercise 41, p. 469)
Each time a certain machine breaks down it is replaced by a new one of the same type.
In the long-run, what percentage of time is the machine in use less than one year if
the life distribution of a machine is:
(a) uniformly distributed over (0, 2)?
(b) exponentially distributed with expected value 1? •
2.9 Delayed renewal processes
Can we mathematically treat a counting process for which the first inter-event time has
a different distribution from the remaining ones?
Yes!
It is a generalization of renewal processes, which we describe in this section.
Definition 2.114 — Delayed renewal process (Ross, 1983, p. 74)
Let:
• {Xi : i ∈ N} be a sequence of independent non-negative r.v. representing the inter-event times, with X1 having distribution G and Xi (i = 2, 3, . . . ) having distribution F;
• S0 = 0;
• Sn = ∑_{i=1}^{n} Xi, n ∈ N;
• N_D(t) = sup{n ∈ N0 : Sn ≤ t} be the number of events that occurred in (0, t].
Then {N_D(t) : t ≥ 0} is called a (general or) delayed renewal process. •
Remark 2.115 — Delayed renewal process (Kulkarni, 1995, p. 440)
The term delayed is pertinent because the process behaves exactly like a (standard) renewal process after the occurrence of the first renewal at time X1. {N_D(t + X1) − 1 : t ≥ 0} is indeed a (standard) renewal process with inter-renewal distribution F. •
Exercise 2.116 — Delayed renewal process
Give a few detailed examples of delayed renewal processes (Kulkarni, 1995, examples 8.24
and 8.26, p. 440). •
Exercise 2.117 — Delayed renewal process (bis) (Ross, 1983, Exercise 3.15, p. 96)
In Exercise 2.109 suppose that potential customers arrive in accordance with a renewal
process having distribution F .
Would the number of events by time t constitute a (possibly delayed) renewal process
if an event corresponds to a customer:
(a) entering the system?
(b) leaving the system?
What if F were exponential? •
The delayed renewal process inherits most properties of the (standard) renewal process (Kulkarni, 1995, p. 440): for instance, N_D(t) and its expected value are finite in finite time, the p.f. of N_D(t) can be written in terms of the c.d.f. of Sn and S_{n+1}, the LST of the renewal function depends on the LSTs of the inter-renewal distributions G and F, etc.
Proposition 2.118 — Properties of N_D(t) (Kulkarni, 1995, pp. 440–441)
Let:
• {N_D(t) : t ≥ 0} be a delayed renewal process;
• G and F be the inter-renewal distributions referring to X1 and Xi (i = 2, 3, . . . ), respectively;
• F_{n−1}(t) = P(Sn − X1 = X2 + · · · + Xn ≤ t) be the (n − 1)-fold convolution of F with itself (n ∈ N);
• (G ⋆ F_{n−1})(t) = P(Sn ≤ t) = ∫_0^t G(t − x) dF_{n−1}(x).
Then
• P[N_D(t) < +∞] = 1, 0 ≤ t < +∞;
• P[N_D(t) = n] = P(Sn ≤ t) − P(S_{n+1} ≤ t) = (G ⋆ F_{n−1})(t) − (G ⋆ F_n)(t);
• m_D(t) = E[N_D(t)] < +∞, 0 ≤ t < +∞;
• m_D(t) = ∑_{n=1}^{+∞} (G ⋆ F_{n−1})(t);
• m̃_D(s) = ∫_{0−}^{+∞} e^{−st} dm_D(t) = G̃(s) / [1 − F̃(s)]. •
Moreover, it is easy to prove similar limit theorems for the delayed renewal process
(Ross, 1983, p. 74). Note that the distribution of the first inter-renewal time X1, G, plays
no role in the asymptotic behavior of the delayed renewal process (Kulkarni, 1995, p.
441), as illustrated by the following proposition.
Proposition 2.119 — Limit theorems for delayed renewal processes (Ross, 1983,
Proposition 3.5.1, pp. 74–75; Kulkarni, 1995, pp. 441–442)
Let {N_D(t) : t ≥ 0} be a delayed renewal process and µ = E(Xi), i = 2, 3, . . . . Then:
• N_D(t)/t → 1/µ w.p.1 (SLLN for delayed renewal processes);
• lim_{t→+∞} m_D(t)/t = 1/µ (elementary renewal theorem for delayed renewal processes);
• F is not lattice ⇒ lim_{t→+∞} [m_D(t + a) − m_D(t)] = a/µ (Blackwell's theorem for delayed renewal processes);
• G and F are lattice with period d ⇒ lim_{n→+∞} E[number of renewals at nd] = d/µ (ibidem). •
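The claim that G plays no role in the asymptotics can be checked numerically (our own sketch, with arbitrary choices of G and F): take a first inter-event time X1 ~ Uniform(0, 10) and Exp(1) inter-event times thereafter (µ = 1); N_D(t)/t should still approach 1/µ = 1.

```python
import random

def delayed_rate(t, rng):
    """N_D(t)/t for X1 ~ Uniform(0, 10) (c.d.f. G) and Xi ~ Exp(1) (c.d.f. F)."""
    s = rng.uniform(0.0, 10.0)     # first inter-event time, distribution G
    n = 0
    while s <= t:                  # count every event epoch S_n <= t
        n += 1
        s += rng.expovariate(1.0)  # subsequent inter-event times, mu = 1
    return n / t

rng = random.Random(17)
rate = delayed_rate(200_000.0, rng)
# Proposition 2.119: N_D(t)/t -> 1/mu = 1, whatever G is
print(round(rate, 3))
```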
Proposition 2.120 — Key renewal theorem for delayed renewal processes (Ross,
1983, Proposition 3.5.1, p. 75; Kulkarni, 1995, pp. 442–443)
Let:
• {N_D(t) : t ≥ 0} be a delayed renewal process and µ = E(Xi) < +∞, i = 2, 3, . . . ;
• D(t) be a dRi function;
• H_D(t) be a solution to the renewal-type equation H_D(t) = D(t) + ∫_0^t D(t − x) dm_D(x).
If F is not lattice then
lim_{t→+∞} H_D(t) = lim_{t→+∞} ∫_0^t D(t − x) dm_D(x) = (1/µ) ∫_0^{+∞} D(y) dy. (2.52) •
Exercise 2.121 — Delayed renewal processes (Ross, 1983, Exercise 3.18, p. 96)
Consider a delayed renewal process {N_D(t) : t ≥ 0} whose first inter-event time has distribution G and the others have distribution F.
(a) Prove that m_D(t) satisfies the following renewal-type equation:
m_D(t) = G(t) + ∫_0^t m(t − x) dG(x), (2.53)
where m(t) = ∑_{n=1}^{+∞} F_n(t).
(b) Show that if G has a finite mean then lim_{t→+∞} t × [1 − G(t)] = 0.
(c) Let A_D(t) denote the age at time t. Prove that if F is not lattice, with ∫_0^{+∞} x² dF(x) < +∞, and lim_{t→+∞} t × [1 − G(t)] = 0, then
lim_{t→+∞} E[A_D(t)] = (∫_0^{+∞} x² dF(x)) / (2 ∫_0^{+∞} x dF(x)). (2.54) •
2.10 Regenerative processes
Can we further generalize renewal processes?
Yes!
Renewal processes lead to an important and more general class of stochastic processes
defined below.
Definition 2.122 — Regenerative process (Ross, 1983, p. 84)
Consider a stochastic process {X(t) : t ≥ 0} with state space N0 and having the property that there are time points at which the stochastic process restarts itself (probabilistically speaking!).40 Then {X(t) : t ≥ 0} is called a regenerative process. •

40 That is, w.p.1 there is a time S1 such that the continuation of the process beyond S1 is a probabilistic replica of the whole process starting at 0. Note that this property implies the existence of further points S2, S3, ... with the same property as S1.
Remark 2.123 — Regenerative process (Ross, 1983, p. 84)
It follows that S1, S2, S3, ... constitute the event times of a renewal process. Moreover, we say that a cycle is completed every time a renewal occurs and N(t) = max{n ∈ N0 : Sn ≤ t} denotes the number of cycles by time t.41 •
Exercise 2.124 — Regenerative process
Give a few detailed examples of regenerative processes (Kulkarni, 1995, examples 8.36
and 8.39, pp. 460–461). •
To obtain the limiting p.f. of X(t) we have to use the key renewal theorem.
Proposition 2.125 — Limiting behavior of P [X(t) = j] (Kulkarni, 1995, Theorem
8.26, p. 461; Ross, 1983, Theorem 3.7.1, p. 84)
Let:
• X(t) : t ≥ 0 be a regenerative process with state space N0;42
• S1 be the first regeneration epoch and F its distribution;
• Uj be the time that the process spends in state j during [0, S1).
If F is not lattice and E(S1) < +∞ then

P_j ≡ lim_{t→+∞} P[X(t) = j]
    = E(amount of time in state j during a cycle) / E(time of a cycle)
    = E(U_j) / E(S1). (2.55)

•
Exercise 2.126 — Limiting behavior of P [X(t) = j]
Prove Proposition 2.125 (Ross, 1983, p. 84). •

41 Once again we consider S0 = 0.
42 And right-continuous sample paths with left limits.
Example/Exercise 2.127 — Limiting behavior of P [X(t) = j] (Ross, 1983, Exercise
3.26, p. 98)
Packages arrive at a mailing depot in accordance with a Poisson process having rate λ. Trucks arrive in accordance with a renewal process with a non-lattice inter-event distribution F; each truck instantly picks up all waiting packages and immediately leaves the depot. Let X(t) denote the number of packages waiting to be picked up at time t.
(a) What type of stochastic process is X(t) : t ≥ 0? Justify!
(b) Find an expression for limt→+∞ P [X(t) = j], j ∈ N0.
• Regenerative process
{X(t) : t ≥ 0}
• Regeneration times
Times S1, S2, ... of departing trucks
• Limiting value of P[X(t) = j]
According to Proposition 2.125,

lim_{t→+∞} P[X(t) = j] = E(U_j) / E(S1),

where:

E(S1) = ∫_0^{+∞} x dF(x);
E(U_j) = E[E(U_j | S1)] = ∫_0^{+∞} E(U_j | S1 = x) dF(x).
Conditioning on the number of packages that arrived during the time x elapsed between two successive truck departures, we can add that N(x) ∼ Poisson(λx) and

E(U_j) = ∫_0^{+∞} [ ∑_{i=j}^{+∞} E(U_j | S1 = x, N(x) = i) × P(N(x) = i) ] dF(x).
Furthermore, let S*_1, ..., S*_i be the epochs of the i arrivals of packages that occurred in an interval of length x. Since (S*_1, ..., S*_i | N(x) = i) behaves like the vector of order statistics of a random sample (Y_1, ..., Y_i) from a Uniform(0, x) distribution, we have

Y_(k)/x ∼ Beta(k, i − k + 1), k = 1, ..., i,
E[Y_(k)/x] = k / [k + (i − k + 1)] = k / (i + 1).

Moreover, there are j packages in the depot waiting to be picked up between the arrivals of the jth and (j + 1)th packages. Therefore,

(U_j/x | S1 = x, N(x) = i) ∼ Y_(j+1)/x − Y_(j)/x.
Consequently:

E[U_j | S1 = x, N(x) = i] = x × E[Y_(j+1)/x − Y_(j)/x | S1 = x, N(x) = i]
                          = x × [(j + 1)/(i + 1) − j/(i + 1)]
                          = x/(i + 1);

E(U_j) = ∫_0^{+∞} [ ∑_{i=j}^{+∞} x/(i + 1) × P(N(x) = i) ] dF(x)
       = ∫_0^{+∞} (1/λ) ∑_{i=j}^{+∞} e^{−λx} (λx)^{i+1} / (i + 1)! dF(x)
       = (1/λ) ∫_0^{+∞} F_{Gamma(j+1,λ)}(x) dF(x),

since the sum equals P[N(x) ≥ j + 1], i.e., the probability that the (j + 1)th arrival epoch — a Gamma(j + 1, λ) r.v. — does not exceed x;

lim_{t→+∞} P[X(t) = j] = ∫_0^{+∞} F_{Gamma(j+1,λ)}(x) dF(x) / [λ × ∫_0^{+∞} x dF(x)].
•
We are particularly interested in determining the long-run proportion of time a regenerative process spends in state j (Ross, 2003, p. 425). We can obtain this quantity by applying the theory of renewal reward processes.
Proposition 2.128 — Long-run proportion of time that X(t) = j (Ross, 1983,
Theorem 3.7.2, p. 85)
For a regenerative process {X(t) : t ≥ 0} with E(S1) < +∞, we have

[amount of time in state j during (0, t)] / t → P_j, w.p.1, (2.56)

that is, the long-run proportion of time a regenerative process spends in state j is equal to P_j. •
Exercise 2.129 — Long-run proportion of time that X(t) = j
Prove Proposition 2.128 (Ross, 1983, p. 85). •
Exercise 2.130 — Long-run proportion of time that X(t) = j (Kulkarni, 1995,
Example 8.41, pp. 465–466)
Customers arrive at a bus depot according to a renewal process with i.i.d. inter-arrival
times with mean µ < +∞. As soon as there are k (k ∈ N) customers waiting at the
depot, a shuttle is immediately dispatched to (instantly) clear all the k customers.
Let X(t) denote the number of customers in the depot at time t. What is the long-run proportion of time the bus depot has j (j ∈ {0, 1, ..., k − 1}) customers? •
Chapter 3
Discrete time Markov chains
While trying to realistically model a system, we are forced to tackle all sorts of
dependencies which make for unmanageable or impossible calculations (Resnick, 1992,
p. 60). Thus, when constructing a stochastic model, the challenge is to have dependencies
which allow for sufficient realism but which can be analytically tamed to permit
mathematical tractability (Resnick, 1992, p. 60). Markov processes balance these two
demands quite nicely because
conditional on a history up to the present, the probabilistic structure of the
future does not depend on the whole history but only on the present
(Resnick, 1992, p. 60), i.e., they satisfy the Markov property (Kulkarni, 1995, p. 17).
Markov processes were named after the Russian mathematician Andrey Markov (1856–1922), who produced the first purely theoretical results for these processes in 1906 (http://en.wikipedia.org/wiki/Markov chain).
This chapter is devoted to the simplest Markov processes, time-homogeneous discrete
time Markov chains (DTMC) with finite or countable state space.1
A few quantities that could be modeled by a DTMC: the state of deterioration of a
piece of equipment; the popularity of a politician; the inventory level of an item in a store;
the number of jobs waiting to be processed by a computer (Hastings, 2001, p. 309).
1For the definition of DTMC with a general state space, the reader is referred to Kulkarni (1995,
Definition 2.1, pp. 17–18).
3.1 Definitions and examples
The formal definition and examples of time-homogeneous DTMC with finite or countable
state space can be found in this section.
Definition 3.1 — Time-homogeneous DTMC with finite or countable state
space (Ross, 2003, p. 181)
Let {Xn : n ∈ N0} be a stochastic process with a finite or countable state space S. If

P(Xn+1 = j | Xn = i, Xn−1 = in−1, ..., X0 = i0) = P(Xn+1 = j | Xn = i) = Pij, (3.1)

for all i0, ..., in−1, i, j ∈ S and n ∈ N0, then {Xn : n ∈ N0} is said to be a time-homogeneous DTMC with finite or countable state space.2 •
From now on, we shall assume that the state space S is finite or countable and the
DTMC is time-homogeneous, thus, we shall drop the terms time-homogeneous and with
finite or countable state space whenever we refer to such a DTMC.
Remark 3.2 — (One-step) transition probability matrices, stochastic matrices,
transition diagrams
• A DTMC is a probabilistic model that undergoes transitions from one state to
another (http://en.wikipedia.org/wiki/Markov chain). Consequently, the law of
motion is specified by one-step transition probabilities (Walrand, 2004, p. 225) and,
rightly so, the matrix
P = [Pij]i,j∈S (3.2)
is called the one-step transition probability matrix. Often we omit the word one-
step and simply refer to P as transition probability matrix (Kulkarni, 1995, p. 17)
or briefly TPM.
2 Although the stochastic process possesses stationary/time-homogeneous transition probabilities, it is in general not stationary (Resnick, 1992, p. 64). If P(Xn+1 = j | Xn = i) depends not only on i and j, but also on n, {Xn : n ∈ N0} is said to be a (non-homogeneous) DTMC with finite or countable state space.
• P is a stochastic matrix because Pij ≥ 0, for all i, j ∈ S, and ∑_{j∈S} Pij = 1, for all i ∈ S (Kulkarni, 1995, Definition 2.3 and Theorem 2.1, pp. 17–18).
• The random behavior of a DTMC is best visualized via its transition diagram — a
directed graph with
– one node for each state i ∈ S,
– a directed arc from node i to node j if Pij > 0 and
– a loop at node i (i.e., an arc from node i to itself) if Pii > 0
(Kulkarni, 1995, p. 19). •
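As a small illustration of the remark above (a Python sketch of mine, not from the cited sources), the functions below verify the defining properties of a stochastic matrix and list the arcs of the transition diagram; the two-state matrix used is the weather TPM discussed later in Example 3.5.

```python
def is_stochastic(P, tol=1e-12):
    """Check the stochastic-matrix properties:
    nonnegative entries and rows summing to 1."""
    return all(all(p >= 0 for p in row) and abs(sum(row) - 1.0) < tol
               for row in P)

def diagram_arcs(P):
    """Arcs (i, j) of the transition diagram: one per strictly positive Pij;
    pairs with i == j are the loops."""
    return [(i, j) for i, row in enumerate(P) for j, p in enumerate(row) if p > 0]

# two-state weather chain (0 = sunny, 1 = rainy)
P = [[0.9, 0.1],
     [0.5, 0.5]]
print(is_stochastic(P))  # True
print(diagram_arcs(P))   # [(0, 0), (0, 1), (1, 0), (1, 1)]
```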
Proposition 3.3 — Characterization of a DTMC (Kulkarni, 1995, Theorem 2.2, p.
18)
A DTMC Xn : n ∈ N0 is fully characterized by
• its TPM, P, and
• the p.f. of X0 denoted by α = [αi]i∈S = [P (X0 = i)]i∈S .3 •
Remark 3.4 — Applications of DTMC
Markov chains constitute an important class of probabilistic models because they are
fairly general and good numerical techniques exist for computing probabilistic quantities
referring to Markov chains (Walrand, 2004, p. 225). Unsurprisingly, the application of
Markov chains has been reported in areas, such as:
• chemistry (the classical model of enzyme activity, Michaelis-Menten kinetics, can
be viewed as a Markov chain);
• internet (the PageRank of a webpage as used by Google is defined by a Markov
chain; Markov models have also been used to analyze web navigation behavior of
users);
3 Since X0 is the initial state of the DTMC, the row vector [αi]_{i∈S} is usually called the initial distribution of the DTMC.
• music (Markov chains are employed in algorithmic music composition, particularly
in software programs such as CSound, Max or SuperCollider);
• physics (Markov processes appear extensively in thermodynamics and statistical
mechanics);
• sports (Markov chain models have been used in advanced baseball analysis since
1960).4
According to Kulkarni (1995, p. 33), DTMC have also been used in
• sociology to study the issues of social mobility (how the social/economic status of
the nth generation affects that of the (n + 1)th generation) or the impact of social
(and sexist!) traditions (how family names propagate through generations in a
family tree, etc.).
For an extensive account of other applications of DTMC (namely in genetics, manpower
planning, neurology, telecommunication), the reader is referred to Kulkarni (1995, pp.
30–41). •
Example 3.5 — DTMC and weather prediction (Ross, 2003, Example 4.1, p. 182;
http://en.wikipedia.org/wiki/Examples of Markov chains)
A very simple weather model can be represented by the following transition diagram, where a sunny day is 90% likely to be followed by another sunny day, and a rainy day is 50% likely to be followed by another rainy day. As a consequence, the probabilities of the weather conditions (sunny or rainy), given the weather on the preceding day, are given by the TPM

P =
[ 0.9 0.1 ]
[ 0.5 0.5 ].

Its rows can be labelled sunny and rainy, and its columns are labelled in the same way and order.

This two-state DTMC is a particular case of the one with a TPM of the form

P =
[ α 1−α ]
[ β 1−β ],

where α, β ∈ [0, 1]. •

4 For a more detailed account of these applications of DTMC, consult http://en.wikipedia.org/wiki/Markov chain or http://en.wikipedia.org/wiki/Examples of Markov chains.
Exercise 3.6 — TPM (Ross, 2003, Example 4.3, p. 182)
On any given day, Evaristo is either cheerful (C), so-so (S), or glum (G).
• If he is cheerful today, then he will be C, S, or G tomorrow with probabilities 0.5,
0.4, 0.1, respectively.
• If he is feeling so-so today, then he will be C, S, or G tomorrow with probabilities
0.3, 0.4, 0.3.
• If he is glum today, then he will be C, S, or G tomorrow with probabilities 0.2, 0.3,
0.5.
Let Xn denote Evaristo’s mood on the nth day, then Xn : n ∈ N0 is a three-state Markov
chain (state 1 = C, state 2 = S, state 3 = G).
Identify the TPM of this DTMC. •
Exercise 3.7 — Peculiar TPM
(a) Let Xn : n ∈ N0 be a sequence of i.i.d. discrete r.v. with p.f. pj = P (Xn = j), j ∈ S.
Identify the TPM of this DTMC (Kulkarni, 1995, Example 2.3, pp. 21–22).
(b) Let:
• {Zn : n ∈ N0} be a sequence of i.i.d. discrete r.v. with common p.f. pj = P(Zn = j), j ∈ Z;
• X0 = 0 and Xn = ∑_{i=1}^{n} Zi, n ∈ N.

Then {Xn : n ∈ N0} is a DTMC with state space S = Z.
Identify the TPM of this DTMC and note that Pij = p_{j−i}, i.e., {Xn : n ∈ N0} is a space-homogeneous DTMC (Kulkarni, 1995, Example 2.4, pp. 22–23). •
Exercise 3.8 — Identifying DTMC (Ross, 1983, Exercise 4.7, p. 135)
Let X1, X2, ... be independent r.v. such that P(Xi = j) = αj, j ≥ 0. Say that a record occurs at time n if Xn > max{X1, ..., Xn−1}, where X0 = −∞, and if a record does occur at time n call Xn the record value.
Let Ri denote the ith record value.
(a) Argue that {Ri : i ≥ 1} is a Markov chain and compute its transition probabilities.
(b) Let Ti denote the time between the ith and (i + 1)th record.
(i) Is {Ti : i ≥ 1} a Markov chain?
(ii) What about {(Ri, Ti) : i ≥ 1}?
Compute transition probabilities where appropriate. •
Exercise 3.9 — More on transition probabilities (Ross, 1983, Exercise 4.2, p. 134)
Prove that, for a DTMC,
P (Xn = j | Xn1 = in1 , . . . , Xnk = ink) = P (Xn = j | Xnk = ink),
whenever 0 ≤ n1 < n2 < · · · < nk < n. •
3.2 Chapman-Kolmogorov equations; marginal and
joint distributions
So far we dealt with one-step transition probabilities Pij, i, j ∈ S. Can we calculate
n−step transition probabilities, such as
P (Xn+m = j | Xm = i), i, j ∈ S, n,m ∈ N0? (3.3)
Yes!
The n-step transition probabilities P(Xn+m = j | Xm = i) are usually denoted by P^n_ij (Ross, 1983, p. 103; Ross, 2003, p. 185), or by P^(n)_ij, or even by p^(n)_ij (Kulkarni, 1995, p. 41).
Proposition 3.10 — Chapman-Kolmogorov equations and n − step transition
probabilities (Ross, 2003, p. 185; Kulkarni, 1995, Theorem 2.3, p. 42)
The Chapman-Kolmogorov equations provide a method for computing the n−step
transition probabilities and can be stated as follows:
P^{n+m}_ij = P(Xn+m = j | X0 = i) (3.4)
           = ∑_{k∈S} P^n_ik × P^m_kj, (3.5)

for i, j ∈ S and n, m ∈ N0. Equivalently, P^n_ij = ∑_{k∈S} P^l_ik × P^{n−l}_kj, for i, j ∈ S, n ∈ N0 and a fixed l in {0, 1, ..., n}. •
Exercise 3.11 — Chapman-Kolmogorov equations
Prove the Chapman-Kolmogorov equations (Ross, 2003, p. 185; Kulkarni, 1995, Theorem
2.3, p. 42). •
Proposition 3.12 — n−step TPM (Ross, 2003, p. 186; Kulkarni, 1995, Theorem 2.4,
p. 42)
Let P(n) = [P^n_ij]_{i,j∈S} denote the n-step TPM. Then the Chapman-Kolmogorov equations assert that P(n+m) = P(n) P(m), n, m ∈ N0, and, most importantly,

P(n) = P^n, n ∈ N0, (3.6)

i.e., the n-step TPM may be calculated by multiplying the TPM P by itself n times. •
Remark 3.13 — Computing n−step TPM
The n−step TPM may be obtained either by multiplying the TPM P by itself n times or
by using the method of direct multiplication described by Algorithm 2 in Kulkarni (1995,
p. 47). For instance, to obtain P^37 we have to compute P^2, P^4 = (P^2)^2, P^8 = (P^4)^2, P^16 = (P^8)^2, P^32 = (P^16)^2, and compute P^37 as P × P^4 × P^32 (since 37 = 1 + 4 + 32).
For more methods of computing the powers of a TPM, please refer to Kulkarni (1995,
pp. 47–54) and Kleinrock (1975, pp. 36–38). •
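The "direct multiplication" method in Remark 3.13 is binary exponentiation of the TPM. The following sketch (mine, not from Kulkarni or Kleinrock) computes P^n with about log2(n) matrix products by squaring P repeatedly and multiplying in the squares that correspond to the binary digits of n, exactly as in the P^37 = P × P^4 × P^32 example.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def mat_pow(P, n):
    """n-step TPM P^n by repeated squaring (binary exponentiation)."""
    m = len(P)
    result = [[float(i == j) for j in range(m)] for i in range(m)]  # P^0 = I
    while n > 0:
        if n & 1:                        # this binary digit of n is set:
            result = mat_mul(result, P)  # multiply the current square in
        P = mat_mul(P, P)                # P, P^2, P^4, P^8, ...
        n >>= 1
    return result

P = [[0.9, 0.1],
     [0.5, 0.5]]
print(mat_pow(P, 2)[0][0])  # 0.9*0.9 + 0.1*0.5 ≈ 0.86
```

The Chapman-Kolmogorov identity P(5) = P(2) P(3) can be checked directly with these helpers.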
Exercise 3.14 — n−step TPM
Consider a DTMC with two states and TPM

P =
[ p  1−p ]
[ 1−p  p ].

Use mathematical induction to prove that the n-step TPM is given by

P^n =
[ 1/2 + (2p−1)^n/2   1/2 − (2p−1)^n/2 ]
[ 1/2 − (2p−1)^n/2   1/2 + (2p−1)^n/2 ],

for n ∈ N0. •
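A quick numerical check of the claimed closed form (my own Python sketch, not a substitute for the induction proof) multiplies P by itself and compares every entry with 1/2 ± (2p−1)^n/2 for an illustrative p = 0.7:

```python
p = 0.7
P = [[p, 1 - p],
     [1 - p, p]]

def closed_form(n):
    """Claimed diagonal entry of P^n."""
    return 0.5 + (2 * p - 1) ** n / 2

Pn = [[1.0, 0.0], [0.0, 1.0]]  # P^0 = I
for n in range(1, 11):
    Pn = [[sum(Pn[i][k] * P[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    assert abs(Pn[0][0] - closed_form(n)) < 1e-12        # diagonal entries
    assert abs(Pn[0][1] - (1 - closed_form(n))) < 1e-12  # off-diagonal entries
print("closed form verified for n = 1, ..., 10")
```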
Exercise 3.15 — n−step TPM (bis) (Ross, 2003, Example 4.3, p. 182)
Resume Exercise 3.6.
(a) Obtain the probability that Evaristo is cheerful (C) two days after being glum (G).
(b) What is the probability that Evaristo is not cheerful (C) in four days time, given that
he is so-so (S) today? •
Exercise 3.16 — n−step TPM (bis, bis) (Resnick, 1992, Exercise 2.1, p. 147)
Consider a DTMC, {Xn : n ≥ 0}, with state space {1, 2, 3} and TPM equal to

P =
[ 0.3 0.3 0.4 ]
[ 0.2 0.7 0.1 ]
[ 0.2 0.3 0.5 ].

Compute:
(a) P(X8 = 3 | X0 = 1);
(b) P(X4 = 3, X8 = 3 | X0 = 1);
(c) P(X16 = 3 | X0 = 1);
(d) P(X12 = 3, X16 = 3 | X0 = 1).
Try not to do this by hand. •
Can we obtain P (Xn = j), for j ∈ S?
Yes! But how?
We need to know the initial distribution of the DTMC, to use the total probability
law and to capitalize on the n−step transition probabilities.
Proposition 3.17 — Marginal probabilities
Let:
• Xn : n ∈ N0 be a DTMC with TPM P = [Pij]i,j∈S ;
• α = [αi]i∈S be the row vector with the initial distribution of the DTMC (i.e., the
p.f. of X0).
Then
P(Xn = j) = ∑_{i∈S} P(X0 = i) × P(Xn = j | X0 = i)
          = ∑_{i∈S} αi × P^n_ij, j ∈ S, (3.7)

and the row vector with the p.f. of Xn is given by

αn = [P(Xn = j)]_{j∈S} = α P^n. (3.8)

•
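In code, Eq. (3.8) amounts to n vector-matrix products. The sketch below (mine, illustrative) applies it to the two-state weather chain with a hypothetical initial distribution α = [0.6, 0.4]:

```python
def step_distribution(alpha, P, n):
    """p.f. of X_n: alpha_n = alpha P^n, computed as n vector-matrix products."""
    for _ in range(n):
        alpha = [sum(alpha[i] * P[i][j] for i in range(len(P)))
                 for j in range(len(P))]
    return alpha

P = [[0.9, 0.1],
     [0.5, 0.5]]
alpha = [0.6, 0.4]  # P(X0 = sunny) = 0.6, P(X0 = rainy) = 0.4
print(step_distribution(alpha, P, 1))
# [0.6*0.9 + 0.4*0.5, 0.6*0.1 + 0.4*0.5] = [0.74, 0.26]
```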
Exercise 3.18 — Marginal probabilities
Resume Exercise 3.16 and calculate P(X8 = 2) by considering the probability distribution of the initial state given by α1 = 0.7, α2 = 0.2 and α3 = 0.1. •
Exercise 3.19 — n−step transition probabilities and marginal probabilities
Brand switching models are used quite often in practice by industries to predict market shares, etc. Admit that a DTMC, {Xn : n ≥ 0}, with state space {A, B, C} and TPM equal to

P =
[ 0.1 0.2 0.7 ]
[ 0.2 0.4 0.4 ]
[ 0.1 0.3 0.6 ]

describes the choice of beer brand a typical customer buys weekly (Kulkarni, 1995, Example 2.6, p. 26).

(a) Compute P(X2 = A | X0 = B).

(b) Consider the initial distribution α = [0.2 0.3 0.5] (Kulkarni, 1995, Example 3.1, p. 65) and find P(X2 = A). •
Can we compute joint probabilities such as P (Xn1 = in1 , . . . , Xnk = ink)?
Yes!
By simply capitalizing on the multiplication rule and on the Markov property.
Proposition 3.20 — Joint probabilities
Let:
• Xn : n ∈ N0 be a DTMC with TPM P = [Pij]i,j∈S ;
• α = [αi]i∈S be the row vector with the initial distribution of this DTMC.
Then
P(Xnk = ink, ..., Xn1 = in1) = (∑_{i∈S} αi × P^{n1}_{i,in1}) × ∏_{j=2}^{k} P^{nj−nj−1}_{inj−1,inj}, (3.9)

for 0 ≤ n1 < n2 < ... < nk and in1, ..., ink ∈ S. •
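Proposition 3.20 translates directly into code. The sketch below (mine, not from the sources; the chain and distributions are illustrative) computes P(X1 = 0, X3 = 1) for a two-state chain with a uniform initial distribution: a marginal term for the first epoch, then one transition factor per gap between observation times.

```python
def mat_mul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def mat_pow(P, n):
    R = [[float(i == j) for j in range(len(P))] for i in range(len(P))]
    for _ in range(n):
        R = mat_mul(R, P)
    return R

def joint_prob(alpha, P, times, states):
    """P(X_{n1} = i1, ..., X_{nk} = ik) via Eq. (3.9)."""
    n1, i1 = times[0], states[0]
    Pn1 = mat_pow(P, n1)
    prob = sum(alpha[i] * Pn1[i][i1] for i in range(len(P)))  # marginal at n1
    for (n_prev, i_prev), (n_cur, i_cur) in zip(zip(times, states),
                                                zip(times[1:], states[1:])):
        prob *= mat_pow(P, n_cur - n_prev)[i_prev][i_cur]     # one factor per gap
    return prob

alpha = [0.5, 0.5]
P = [[0.9, 0.1],
     [0.5, 0.5]]
print(joint_prob(alpha, P, [1, 3], [0, 1]))
# P(X1 = 0) * (P^2)_{01} = 0.7 * 0.14 = 0.098
```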
Exercise 3.21 — Joint probabilities
Prove Proposition 3.20. •
Exercise 3.22 — Marginal and joint probabilities
Let {Xn : n ∈ N0} be a DTMC with state space S = {1, 2, 3, 4}, TPM

P =
[ 0.1 0.2 0.3 0.4 ]
[ 0.2 0.2 0.3 0.3 ]
[ 0.5 0   0.5 0   ]
[ 0.6 0.2 0.1 0.1 ]

and initial distribution α = [0.25 0.25 0.25 0.25]. Compute:
(a) the p.f., the expected value and the variance of X4;
(b) P(X3 = 4, X2 = 1, X1 = 3, X0 = 1);
(c) P(X3 = 4, X2 = 1, X1 = 3)
(Kulkarni, 1995, Example 2.10, pp. 43–44). •
3.3 Classification of states; recurrent and transient
states
To infer the evolution of the DTMC it is critical to understand which paths through the
state space are possible and to unravel the allowable movements of the stochastic process
(Resnick, 1992, p. 77).
To identify which states j can be reached from a (starting) state i, we need to define
the notion of accessibility.
Definition 3.23 — Accessibility (Ross, 2003, p. 189)
State j is said to be accessible from state i — for short, i → j — if P^n_ij > 0 for some n ∈ N0. •
Remark 3.24 — Accessibility
A state i ∈ S is accessible from itself (i.e., i → i) because P^0_ii = P(X0 = i | X0 = i) = 1 > 0 (see footnote 5). •
Exercise 3.25 — Accessibility
Consider a DTMC with TPM

P =
[ p  1−p ]
[ 1−p  p ],

where p ∈ (0, 1).

(a) Draw the transition diagram of this DTMC.

(b) Is state 2 accessible from state 1?

(c) Is state 2 accessible from state 1 if p = 1? And if p = 0?
5 In fact, P^0 = [P(X0 = j | X0 = i)]_{i,j∈S} = I_{#S×#S}, where I_{#S×#S} represents the identity matrix with rank #S.
(d) Compute P^{2n}_{12}, n ∈ N, when p = 0, and comment on this result. •
Definition 3.26 — Communicating states (Ross, 2003, p. 190)
Two states i and j that are accessible to each other are said to communicate, and we
write i↔ j. •
Proposition 3.27 — Properties of communication (Ross, 1983, Proposition 4.2.1,
p. 104; Kulkarni, 1995, Theorem 3.1, p. 71)
Communication is an equivalence relation, i.e.:
• i↔ i (reflexivity);
• i↔ j ⇒ j ↔ i (symmetry);
• i↔ j, j ↔ k ⇒ i↔ k (transitivity). •
Exercise 3.28 — Properties of communication
Prove Proposition 3.27 (Ross, 1983, p. 104; Kulkarni, 1995, p. 71). •
Two states that communicate are said to be in the same class. Moreover, since
communication is a reflexive, symmetric and transitive relation, we can use it to partition
the state space S into subsets known as communicating classes (Kulkarni, 1995, p. 71).
Definition 3.29 — Communicating class (Kulkarni, 1995, Definition 3.3, p. 71;
http://en.wikipedia.org/wiki/Markov chain)
A set of states C ⊂ S is a communicating class if every pair of states in C communicates
with each other, and no state in C communicates with any state not in C, that is:
(i) i, j ∈ C ⇒ i↔ j;
(ii) i ∈ C, i↔ j ⇒ j ∈ C. •
Definition 3.30 — Closed communicating class (Kulkarni, 1995, Definition 3.4, p.
72)
A communicating class C ⊂ S is said to be closed if i ∈ C, j ∉ C ⇒ i ↛ j. •
Definition 3.31 — Irreducible and reducible DTMC
(http://en.wikipedia.org/wiki/Markov chain; Ross, 2003, p. 190; Kulkarni, 1995,
Definition 3.5, p. 72)
The DTMC is said to be irreducible if its state space S is a single closed communicating
class.6 Otherwise, the DTMC is called reducible. •
Exercise 3.32 — Irreducible and reducible DTMC
Draw the transition diagrams of the DTMC with state space S = {1, 2} and the following TPM, verify if they are irreducible/reducible and whether the communicating classes are closed or not:

(a) P =
[ 0.2 0.8 ]
[ 0.3 0.7 ],

(b) P =
[ 1   0   ]
[ 0.3 0.7 ],

(c) P =
[ 1 0 ]
[ 0 1 ]
(Kulkarni, 1995, Example 3.4, pp. 73–74).

(d) Consider now S = {1, 2, 3, 4, 5, 6}. Draw the transition diagram associated to the following TPM and identify the communicating classes

P =
[ 1/2  0   0   0   0   1/2 ]
[ 0   1/3  0   0   2/3  0  ]
[ 1/6 1/6 1/6 1/6 1/6 1/6 ]
[ 0    0   0   1   0    0  ]
[ 0   2/3  0   0   1/3  0  ]
[ 1/2  0   0   0   0   1/2 ].

Which of these classes are closed? (Kulkarni, 1995, Example 3.5, p. 74). •

6 In other words, if it is possible to get to any state from any state, that is, if all states communicate with each other.
After introducing the notions of accessibility, communication, communicating class
and irreducibility, it is time to introduce the concept of periodicity.
Definition 3.33 — Periodic and aperiodic states (Ross, 1983, pp. 104–105;
Kulkarni, 1995, Definitions 3.6 and 3.7, pp. 74–75)
State i is said to be periodic, with period d ≡ d(i), if P^n_ii = 0 whenever n is not divisible by d and d is the greatest positive integer with this property.
A state with period d = 1 is said to be aperiodic. •
Remark 3.34 — Periodicity
• If P^n_ii = 0 for all n ∈ N, then the period of state i is said to be infinite (Ross, 1983, pp. 104–105), that is, the DTMC never returns to state i after leaving this state.

• An alternative definition of periodicity can be stated in terms of a r.v. that tells us when the DTMC revisits state i (the first hitting time or recurrence time), given that it started in state i:

Ti = min{n ∈ N : Xn = i | X0 = i}. (3.10)

Then state i is said to be periodic with period d ≡ d(i) if d is the largest integer such that

P(Ti = n) > 0 ⇒ n is an integer multiple of d (3.11)

(Kulkarni, 1995, Definition 3.7, p. 75). •
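For a small finite chain, the period can be approximated mechanically as the gcd of the n with (P^n)_ii > 0 up to some horizon. The sketch below is my own illustration (the truncation at n_max is a heuristic, not a proof), applied to the deterministic alternating chain that appears later in Exercise 3.35:

```python
from math import gcd

def mat_mul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def period(P, i, n_max=50):
    """gcd of the n <= n_max with (P^n)_ii > 0; 0 means no return observed."""
    d, Pn = 0, P
    for n in range(1, n_max + 1):
        if Pn[i][i] > 0:
            d = gcd(d, n)
        Pn = mat_mul(Pn, P)
    return d

# the chain that alternates deterministically between its two states
P = [[0.0, 1.0],
     [1.0, 0.0]]
print(period(P, 0), period(P, 1))  # 2 2
```

Both states share the same period, in line with Proposition 3.36 (periodicity is a class property).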
Exercise 3.35 — Periodicity
Consider the DTMC on S = {1, 2} with TPM

P =
[ 0 1 ]
[ 1 0 ]

(Kulkarni, 1995, Example 3.8, p. 76).
Find the period of state 1, d ≡ d(1), and show that state 2 has the same period. •
Exercise 3.35 suggests that, like communication, periodicity is a class property.
Proposition 3.36 — A property of periodicity (Ross, 1983, Proposition 4.2.2, p. 105;
Kulkarni, 1995, Theorem 3.2, p. 76)
Periodicity is a class property, i.e.,
i↔ j ⇒ d(i) = d(j) (3.12)
•
Exercise 3.37 — A property of periodicity
Show Proposition 3.36 (Ross, 1983, p. 105; Kulkarni, 1995, p. 76). •
Now, we introduce the concepts of recurrence and transience of states. These notions
play an important role in the study of the limiting behavior of DTMC (Kulkarni, 1995, p.
77). But before proceeding into the definitions of recurrent and transient states, we need
to define the following probabilities.
Definition 3.38 — Probabilities of a first transition to a state and of ever
making a transition to a state (Ross, 1983, p. 105)
For any states i, j ∈ S define f^n_ij to be the probability that, starting from state i, the first transition into state j occurs exactly at time n. Formally,

f^0_ij = 0, i ≠ j (f^0_ii = 1), (3.13)
f^n_ij = P(Xn = j, Xn−1 ≠ j, ..., X1 ≠ j | X0 = i), n ∈ N. (3.14)

(The computation of f^n_ij is thoroughly discussed in Section 3.9.)
The probability of ever making a transition into state j, given that the process starts in state i, equals

fij = ∑_{n=1}^{+∞} f^n_ij. (3.15)

•
Note that for i 6= j, fij is positive iff state j is accessible from state i.
Exercise 3.39 — More on n−step transition probabilities (Ross, 1983, Exercise
4.4, p. 134)
Prove that P nij =
∑nk=1 f
kijP
n−kjj . •
Definition 3.40 — Recurrent and transient states (Ross, 2003, p. 191)
For any state i ∈ S, let

fi ≡ fii = ∑_{n=1}^{+∞} f^n_ii = P(Ti < +∞) (3.16)

be the probability that, starting in state i, the process will ever reenter state i.7 Then state i is said to be:
• recurrent if fi = 1;
• transient if fi < 1. •
Remark 3.41 — Transient and recurrent states, recurrence time and absorbing
states
• A state i is said to be transient if, given that we start in state i, there is a
non-zero probability that we will never return to i, that is, P (Ti < +∞) < 1
(http://en.wikipedia.org/wiki/Markov chain).
• State i is recurrent (or persistent) if it is not transient; recurrent states have finite
recurrence time with probability 1 (http://en.wikipedia.org/wiki/Markov chain),
i.e., P (Ti < +∞) = 1.
• If state i is transient then, starting in state i, the number of time periods that the process will be in state i has a geometric distribution with finite mean 1/(1 − fi) (Ross, 2003, pp. 191–192).
• If Pii = 1 then state i is said to be an absorbing state; in this case state i is
(obviously!) a recurrent state. •
7 Recall that the r.v. Ti is the first return time to state i.
Exercise 3.42 — Recurrent and transient states (bis) (Resnick, 1992, Exercise
2.15(a), p. 151)
The Media Police has identified six states associated with TV watching habits of its
inhabitants:
• 1 (never watch TV);
• 2 (watch only PBS);
• 3 (watch TV fairly frequently);
• 4 (addicted);
• 5 (undergoing behavior modification);
• 6 (brain dead).
Transitions from state to state can be modeled as a DTMC with the following TPM:

P =
[ 1   0   0   0   0   0   ]
[ 0.5 0   0.5 0   0   0   ]
[ 0.1 0   0.5 0.3 0   0.1 ]
[ 0   0   0   0.7 0.1 0.2 ]
[ 1/3 0   0   1/3 1/3 0   ]
[ 0   0   0   0   0   1   ].

After having drawn the associated transition diagram, identify which states are transient and which are recurrent. •
Necessary and sufficient conditions to guarantee recurrence and transience of a state i can be written in terms of the expected number of periods that the DTMC is in state i, ∑_{n=1}^{+∞} P^n_ii (Ross, 2003, p. 192).8

8 By letting In = 1, if Xn = i, and In = 0, otherwise, we have that ∑_{n=1}^{+∞} In represents the number of periods that the DTMC is in state i; moreover, E(∑_{n=1}^{+∞} In | X0 = i) = ∑_{n=1}^{+∞} P^n_ii (Ross, 2003, p. 192).
Proposition 3.43 — Recurrent and transient states (Ross, 1983, Proposition 4.2.3,
p. 105; Ross, 2003, p. 192)
These are necessary and sufficient conditions for recurrence and transience:
• state i is recurrent iff ∑_{n=1}^{+∞} P^n_ii = +∞;
• state i is transient iff ∑_{n=1}^{+∞} P^n_ii < +∞. •
Exercise 3.44 — Recurrent and transient states
Prove Proposition 3.43 (Ross, 1983, p. 105). •
Exercise 3.45 — More on transient states (Ross, 1983, Exercise 4.8, p. 135)
Show that if fi ≡ fii < 1 and fj ≡ fjj < 1 then:
(a) ∑_{n=1}^{+∞} P^n_ij < +∞;
(b) fij = [∑_{n=1}^{+∞} P^n_ij] / [1 + ∑_{n=1}^{+∞} P^n_jj]. •
Proposition 3.46 — Property of recurrence and transience (Ross, 2003, p. 193;
Kulkarni, 1995, Theorem 3.5, p. 81)
Recurrence and transience9 are class properties, i.e.:
i is recurrent (resp. transient), i↔ j ⇒ j is recurrent (resp. transient). (3.17)
•
Exercise 3.47 — Property of recurrence and transience
Prove Proposition 3.46 (Ross, 2003, p. 193; Kulkarni, 1995, p. 81). •
Exercise 3.48 — Classification of states
Consider a DTMC {Xn : n ∈ N0} with state space S = Z and transition probabilities given by

P_{i,i+1} = p = 1 − P_{i,i−1}, i ∈ Z,

where 0 < p < 1 and {Xn : n ∈ N0} represents the one-dimensional random walk on the integer number line.

9 Like communication and periodicity.
Classify the states of this DTMC (Ross, 2003, Example 4.15, pp. 194–195). •
Exercise 3.49 — Classification of states of a symmetric random walk (bis)
(Ross, 1983, Exercise 4.5, p. 134)
Show that the symmetric random walk is recurrent in two dimensions and transient in 3
dimensions (Ross, 2003, Example 4.15, pp. 195–196; Resnick, 1992, pp. 95–97). •
Definition 3.50 — Mean recurrence time (Ross, 1983, p. 108)
The mean recurrence time at state i is the expected number of transitions needed to return to state i:

µii = E(Ti) =
  +∞, if state i is transient;
  ∑_{n=1}^{+∞} n × f^n_ii, if state i is recurrent. (3.18)

•
Even if the recurrence time Ti is finite with probability 1 when the state i is recurrent,
it need not have a finite expectation (http://en.wikipedia.org/wiki/Markov chain).
Unsurprisingly, recurrent states are further classified according to the finiteness (or not)
of this expected value.
Definition 3.51 — Positive and null recurrence (Ross, 1983, p. 108; Kulkarni, 1995,
Definition 3.9, p. 78)
A recurrent state i is said to be:
• positive recurrent if µii < +∞;
• null recurrent, if µii = +∞. •
Exercise 3.52 — Property of positive and null recurrence (Ross, 1983, Exercise
4.10, p. 135)
Show that positive and null recurrence are class properties (Kulkarni, 1995, p. 82). •
The next proposition gives a necessary and sufficient condition for positive and null
recurrence.
Proposition 3.53 — Necessary and sufficient condition for positive and null
recurrence (Kulkarni, 1995, Theorem 3.4, pp. 80–81)
Let

P*_ij(n) = [1/(n + 1)] ∑_{k=0}^{n} P^k_ij (3.19)

be the expected number of visits to state j starting from state i, per time unit, up to time n. Then a recurrent state i is:
• positive recurrent iff lim_{n→+∞} P*_ii(n) > 0;
• null recurrent iff lim_{n→+∞} P*_ii(n) = 0. •
Definition 3.54 — Recurrent/transient/positive-recurrent/null-recurrent
classes (resp. DTMC) (Kulkarni, 1995, definitions 3.10 and 3.11, p. 82)
A communicating class (resp. DTMC) is said to be recurrent/transient/positive-
recurrent/null-recurrent if all its states are recurrent/transient/positive recurrent/null
recurrent. •
Determining recurrence and transience of a finite communicating class or a finite state
space DTMC is trivial (Kulkarni, 1995, p. 82), essentially because of the following results.
Proposition 3.55 — Classification of states of a finite communicating class and
a finite state space DTMC (Kulkarni, 1995, theorems 3.7 and 3.8 and corollaries 3.1
and 3.2, pp. 82–84)
• Let C ⊂ S be a finite closed communicating class. Then all states in C are
positive recurrent.
• Let C ⊂ S be a finite communicating class that is not closed. Then all
states in C are transient.
• There are no null recurrent states in a finite state space DTMC.
• Not all states in a finite state space DTMC can be transient. •
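For a finite chain, Proposition 3.55 reduces the classification to graph reachability: find the communicating classes and check whether each one is closed. The following is my own Python sketch (not from Kulkarni); the transitive closure is computed with a Floyd–Warshall-style triple loop, and TPM (b) of Exercise 3.32 serves as the example.

```python
def reachable(P):
    """R[i][j] = True iff j is accessible from i (P^n_ij > 0 for some n >= 0)."""
    n = len(P)
    R = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):                       # transitive closure of the arc relation
        for i in range(n):
            for j in range(n):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R

def classify(P):
    """Communicating classes of a finite DTMC, each tagged 'positive recurrent'
    (closed class) or 'transient' (non-closed class), per Proposition 3.55."""
    n, R = len(P), reachable(P)
    seen, classes = set(), []
    for i in range(n):
        if i in seen:
            continue
        C = {j for j in range(n) if R[i][j] and R[j][i]}  # states communicating with i
        seen |= C
        closed = all(not R[i2][j] for i2 in C for j in range(n) if j not in C)
        classes.append((sorted(C), "positive recurrent" if closed else "transient"))
    return classes

# TPM (b) of Exercise 3.32: state 0 is absorbing, state 1 leads to it
P = [[1.0, 0.0],
     [0.3, 0.7]]
print(classify(P))  # [([0], 'positive recurrent'), ([1], 'transient')]
```

The same helper settles all the finite chains of Exercise 3.57 mechanically.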
The simple and extremely useful results in Proposition 3.55 do not hold for infinite-state-space DTMC (Kulkarni, 1995, p. 84).
For methods of establishing transience and recurrence in the infinite-state-space case,
please refer to (Kulkarni, 1995, pp. 84–98).
Exercise 3.56 — Classification of states of a finite state space DTMC (Ross,
1983, Exercise 4.11, p. 135)
Show that in a finite state space DTMC there are no null recurrent states and not all
states can be transient. •
Exercise 3.57 — More on the classification of states
Specify the classes of the following DTMC, and determine whether they are
recurrent/transient/positive-recurrent/null-recurrent classes:
(a) P =
[ p  1−p ]
[ 1−p  p ], where p ∈ (0, 1);

(b) P1 =
[ 0   1/2 1/2 ]
[ 1/2 0   1/2 ]
[ 1/2 1/2 0   ]

P2 =
[ 0   0   0 1 ]
[ 0   0   0 1 ]
[ 1/2 1/2 0 0 ]
[ 0   0   1 0 ]
(Ross, 2003, Exercise 14, p. 254);

(c) P3 =
[ 0   0   1   0 ]
[ 1   0   0   0 ]
[ 1/2 1/2 0   0 ]
[ 1/3 1/3 1/3 0 ]

P4 =
[ 0   1 0   0 ]
[ 0   0 0   1 ]
[ 0   1 0   0 ]
[ 1/3 0 2/3 0 ];

(d) P5 =
[ 1/2 0   1/2 0   0   ]
[ 1/4 1/2 1/4 0   0   ]
[ 1/2 0   1/2 0   0   ]
[ 0   0   0   1/2 1/2 ]
[ 0   0   0   1/2 1/2 ]

P6 =
[ 1/4 3/4 0   0   0 ]
[ 1/2 1/2 0   0   0 ]
[ 0   0   1   0   0 ]
[ 0   0   1/3 2/3 0 ]
[ 1   0   0   0   0 ];

(e) P7 =
[ 1/3 0   2/3 0   0   0   ]
[ 0   1/4 0   3/4 0   0   ]
[ 2/3 0   1/3 0   0   0   ]
[ 0   1/5 0   4/5 0   0   ]
[ 1/4 1/4 0   0   1/4 1/4 ]
[ 1/6 1/6 1/6 1/6 1/6 1/6 ]

P8 =
[ 1   0   0   0   0   0 ]
[ 0   3/4 1/4 0   0   0 ]
[ 0   1/8 7/8 0   0   0 ]
[ 1/4 1/4 0   1/8 3/8 0 ]
[ 1/3 0   1/6 1/4 1/4 0 ]
[ 0   0   0   0   0   1 ]. •
3.4 Limit behavior of irreducible Markov chains
Let:
• Xn : n ∈ N0 be an irreducible DTMC with state space S and TPM P;
• α = [P (X0 = j)]j∈S be the initial distribution of the DTMC;
• αn = [P (Xn = j)]j∈S be the marginal distribution of Xn.
Since αn = αPn, it is clear that once we obtain the limiting behavior of Pn we can
immediately derive the limiting behavior of αn (Kulkarni, 1995, p. 65), as illustrated
by the following example/exercise.
Example/Exercise 3.58 — Limiting behavior of the DTMC for brand
switching (Kulkarni, 1995, pp. 65–66)
Resume the brand switching model described in Exercise 3.19. Recall that the TPM in
this case is
P =
[ 0.1  0.2  0.7 ]
[ 0.2  0.4  0.4 ]
[ 0.1  0.3  0.6 ]
.
141
Now, suppose that the initial distribution is
α = [0.2 0.3 0.5],
i.e., a typical customer buys brands A, B and C with probabilities 0.2, 0.3 and 0.5 at
time 0, respectively — essentially, 0.2, 0.3 and 0.5 are the initial market shares of these 3
brands.
It is of interest to the manufacturers of brands A, B and C to know how the market
shares will evolve with time (n). For that matter, use Mathematica to complete the
following table
n     Pn                                                                             αn
1     [0.1 0.2 0.7; 0.2 0.4 0.4; 0.1 0.3 0.6]                                        [0.130 0.310 0.560]
2     [0.12 0.31 0.57; 0.14 0.32 0.54; 0.13 0.32 0.55]                               [0.131 0.318 0.551]
5     [0.13188 0.31869 0.54943; 0.13186 0.31868 0.54946; 0.13187 0.31868 0.54945]    [0.131869 0.318682 0.549449]
10    [ ]                                                                            [ ]
100   [ ]                                                                            [ ]
and realize that all 3 rows of Pn converge to [0.131868 0.318681 0.549451], as n→ +∞,
and so does αn.
The numbers 0.131868, 0.318681 and 0.549451 represent the long-run market shares
of brands A, B and C,10 respectively. •
10 I.e., the fraction of the long-run daily sales volume that goes to brands A, B and C (Kulkarni, 1995, p. 68).
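The table above was meant to be completed with Mathematica; the same computation can be sketched in Python (using numpy — an assumption, since these notes rely on Mathematica):

```python
import numpy as np

# TPM and initial distribution of the brand-switching DTMC (Example/Exercise 3.58)
P = np.array([[0.1, 0.2, 0.7],
              [0.2, 0.4, 0.4],
              [0.1, 0.3, 0.6]])
alpha = np.array([0.2, 0.3, 0.5])

for n in (1, 2, 5, 10, 100):
    Pn = np.linalg.matrix_power(P, n)   # n-step TPM, P^n
    alpha_n = alpha @ Pn                # marginal distribution of X_n
    print(n, alpha_n.round(6))
```

For n = 100 every row of Pn, and αn itself, agrees with [0.131868 0.318681 0.549451] to six decimal places.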
142
Before we provide a summary of the main results concerning the limiting behavior of
irreducible DTMC, we need a preliminary definition (Ross, 1983, p. 108).
Definition 3.59 — Stationary distribution (Ross, 1983, p. 108)
A probability distribution {Pj : j ∈ S} is said to be stationary for a DTMC {Xn : n ∈ N0},
with state space S and TPM P = [Pij]i,j∈S, if
Pj = ∑_{i∈S} Pi Pij,  j ∈ S.  (3.20)
•
Exercise 3.60 — Stationary distribution
Use mathematical induction to show that if the p.f. of X0 is given by {Pj : j ∈ S} defined
in (3.20) then
P(Xn = j) = ∑_{i∈S} P(Xn = j | Xn−1 = i) × P(Xn−1 = i) = Pj,
for all n ∈ N and j ∈ S (Ross, 1983, pp. 108–109). •
The rather convenient limiting results that are going to be stated in the next theorems
are a consequence of the discrete version of the key renewal theorem if we interpret the
transitions into state j as being renewals, as suggested by Ross (1983, p. 108).11
11 Kulkarni (1995, p. 100) called it the discrete renewal theorem (Kulkarni, 1995, Theorem 3.11, p. 100).
Ross (1983, Theorem 4.3.1, p. 108) also states it, as follows. Let Nj(t) be the number of transitions into
state j up to time t. If states i and j communicate then:
(i) P[ lim_{n→+∞} Nj(n)/n = 1/µjj | X0 = i ] = 1;
(ii) lim_{n→+∞} (1/n) ∑_{k=1}^n P^k_ij = 1/µjj;
(iii) if state j is aperiodic then lim_{n→+∞} P^n_ij = 1/µjj;
(iv) if state j is periodic with period d then lim_{n→+∞} P^{nd}_jj = d/µjj.
143
Theorem 3.61 — Limiting behavior of irreducible aperiodic DTMC (Kulkarni,
1995, theorems 3.13–3.15, pp. 103–105; Ross, 1983, Theorem 4.3.3, p. 109)
An irreducible aperiodic DTMC, with state space S and TPM P = [Pij]i,j∈S , belongs
to one of the following two classes.
(i) Either the states are all transient or all null recurrent, and in this case
lim_{n→+∞} P^n_ij = 0, i, j ∈ S,  (3.21)
and there is no stationary distribution.
(ii) Or else all states are positive recurrent and
lim_{n→+∞} P^n_ij = πj > 0, i, j ∈ S,  (3.22)
where {πj : j ∈ S} is the unique stationary distribution and satisfies the following
system of equations:
πj = ∑_{i∈S} πi Pij, j ∈ S
∑_{j∈S} πj = 1.  (3.23)
•
Remark 3.62 — Limiting behavior of an irreducible positive recurrent DTMC
• An irreducible DTMC is positive recurrent iff there is a solution to the system
of equations (3.23); if there is a solution then it is unique and πj > 0, j ∈ S
(Kulkarni, 1995, Theorem 3.18, p. 111).
• The previous result is extremely useful because it allows us to solve (3.23) without
first checking for positive recurrence; in fact, if we can solve (3.23), positive
recurrence is automatically guaranteed (Kulkarni, 1995, p. 111). •
Theorem 3.63 — Limiting behavior of irreducible positive recurrent and
periodic DTMC (Ross, 1983, p. 111; Kulkarni, 1995, Theorem 3.17, p. 109)
144
For an irreducible positive recurrent and periodic DTMC with period d,
lim_{n→+∞} P^{nd}_ij = d × πj, i, j ∈ S,  (3.24)
where {πj : j ∈ S} is the unique non-negative solution of (3.23).12 •
Remark 3.64 — Interpretation of the πj (Kulkarni, 1995, p. 111)
• In an aperiodic DTMC, πj has two interpretations:
(i) πj is the limiting probability that the DTMC is in state j;
(ii) πj is the long-run fraction of time that the DTMC spends in state j, and
µjj = 1/πj.
• If the DTMC is periodic then only the second interpretation is valid.
• {πj : j ∈ S} is the stationary distribution regardless of whether the DTMC
is aperiodic or not — if X0 has p.f. {πj : j ∈ S} then Xn has the same p.f. for all
n ∈ N (recall Exercise 3.60). •
Exercise 3.65 — Limiting behavior of irreducible aperiodic DTMC
Resume Example/Exercise 3.58 (brand switching) and use Theorem 3.61 to confirm that
the stationary distribution is given by [0.131868 0.318681 0.549451]. •
Exercise 3.66 — Limiting behavior of irreducible aperiodic DTMC (bis)
Resume Exercise 3.6 in which Evaristo's mood is governed by a DTMC with a TPM
P =
[ 0.5  0.4  0.1 ]
[ 0.3  0.4  0.3 ]
[ 0.2  0.3  0.5 ]
.
In the long-run, what proportion of time is the stochastic process in each of the three
states (Ross, 2003, Example 4.18, p. 202)? •
12 Equivalently, lim_{n→+∞} P*_ij(n) = lim_{n→+∞} (1/(n+1)) ∑_{k=0}^n P^k_ij = πj, i, j ∈ S.
145
Remark 3.67 — Obtaining the limiting probabilities (Kulkarni, 1995, p. 113)
• The study of the limiting behavior of irreducible positive recurrent DTMC
essentially involves solving the system of equations
πj = ∑_{i∈S} πi Pij, j ∈ S
∑_{j∈S} πj = 1.
• If the DTMC has finite state space and the transition probabilities are given
numerically (such as in exercises 3.65 and 3.66) — rather than algebraically as in
exercises 3.72 and 3.73 — then one can provide numerical values for the πj : j ∈ S,
namely by making use of Proposition 3.68 and Mathematica to avoid tedious
calculations by hand. •
Proposition 3.68 — Obtaining the limiting probabilities numerically (Resnick,
1992, Proposition 2.14.1, p. 138)
Let:
• 1 = [1 · · · 1] be a row vector with #S = m ones;
• I be the m×m identity matrix;
• P = [Pij]i,j∈S be an m×m irreducible TPM;
• ONE be the m×m matrix all of whose entries are equal to 1;
• π = [πj]j∈S be the row vector denoting the stationary distribution.
Then
π = 1× (I−P + ONE)−1. (3.25)
•
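A minimal Python sketch of Equation (3.25) (numpy assumed; the notes themselves suggest Mathematica), tried out on the brand-switching TPM of Example/Exercise 3.58:

```python
import numpy as np

def stationary(P):
    """Stationary distribution via pi = 1 (I - P + ONE)^{-1}, Equation (3.25)."""
    m = P.shape[0]
    ones_row = np.ones(m)        # the row vector 1
    ONE = np.ones((m, m))        # the matrix ONE
    return ones_row @ np.linalg.inv(np.eye(m) - P + ONE)

P = np.array([[0.1, 0.2, 0.7],
              [0.2, 0.4, 0.4],
              [0.1, 0.3, 0.6]])
pi = stationary(P)
print(pi)   # approx [0.131868 0.318681 0.549451]
```

By construction π(I − P + ONE) = π − πP + 1 = 1, which is why the inverse recovers the stationary distribution.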
Exercise 3.69 — Obtaining the limiting probabilities numerically
Prove Proposition 3.68 (Resnick, 1992, pp. 138–139). •
146
Exercise 3.70 — Obtaining the limiting probabilities numerically
Admit that in Evaristo’s home country, the transitions between (1) upper-, (2) middle-,
or (3) lower-class of the successive generations can be regarded as transitions of a DTMC
with TPM given by
P =
[ 0.45  0.48  0.07 ]
[ 0.05  0.70  0.25 ]
[ 0.01  0.50  0.49 ]
.
Determine the percentage of inhabitants in each one of those social classes in the
long-run (Ross, 2003, Example 4.19, pp. 202–203). •
Exercise 3.71 — Obtaining the limiting probabilities numerically (bis) (Ross,
2003, Exercise 25, p. 256)
Each morning Evaristo leaves his house and goes for a run. He is equally likely to leave
either from his front or back door. Upon leaving the house, he chooses a pair of running
shoes (or goes running barefoot if there are no running shoes at the door from which he
departed). On his return he is equally likely to enter, and leave his running shoes, either
by the front or back door.
What proportion of the time does Evaristo run barefoot if he owns a total of k pairs
of running shoes? •
Exercise 3.72 — Obtaining the limiting probabilities algebraically (Ross, 1983,
Exercise 4.9, p. 135; Ross, 2003, Exercise 20, p. 255)
A TPM P is said to be doubly stochastic if ∑_{i∈S} Pij = 1, for all j.
If the associated DTMC has n states and is ergodic,13 show that the limiting
probabilities are given by πj = 1/n. •
13A positive recurrent, aperiodic state is called ergodic (Ross, 1983, p. 108).
147
Exercise 3.73 — Obtaining the limiting probabilities algebraically (bis) (Ross,
1983, Exercise 4.13, p. 135)
Clotilde possesses r umbrellas which she employs in going from her home to office, and
vice versa. If she is at home (resp. the office) at the beginning (resp. end) of a day and it is
raining, then she will take an umbrella with her to the office (resp. home), provided there
is one to be taken. If it is not raining, then she never takes an umbrella. Assume that,
independent of the past, it rains at the beginning (resp. end) of a day with probability p.
(a) Define a Markov chain with r+1 states which will help us to determine the proportion
of time that Clotilde gets wet.
(Note: She gets wet if it is raining, and all umbrellas are at her other location.)
(b) Compute the limiting probabilities.
(c) What value of p maximizes the fraction of time Clotilde gets wet when r = 3? •
Remark 3.74 — Obtaining the limiting probabilities, infinite state space
(Kulkarni, 1995, p. 113)
The limiting behavior of irreducible positive recurrent DTMC with infinite state space
can be also derived, namely when the transition probabilities are given algebraically, such
as in Exercise 3.75. For a detailed description of methods of solution, please refer to
Kulkarni (1995, pp. 113-123). •
Exercise 3.75 — Obtaining the limiting probabilities algebraically, infinite
state space (Ross, 1983, Exercise 4.16, p. 136)
Consider a DTMC with state space S = N0 and TPM such that
Pi,i+1 = pi = 1− Pi,i−1,
where p0 = 1.
Find the necessary and sufficient condition on the pi’s for this DTMC to be positive
recurrent, and compute the limiting probabilities in this case (Kulkarni, 1995, Example
3.23, pp. 115–117). •
148
Exercise 3.76 — More on limiting probabilities (Ross, 1983, Exercise 4.31, p. 139)
Let {Xn : n ∈ N} denote an irreducible DTMC with a countable state space S.
Now consider a new stochastic process Yn : n ∈ N where Yn denotes the nth value
of Xn : n ∈ N that is between 0 and N . For instance, if N = 3 and X1 = 1, X2 = 3,
X3 = 5, X4 = 6, X5 = 2, then Y1 = 1, Y2 = 3, Y3 = 2.
(a) Is Yn : n ∈ N a DTMC? Explain briefly.
(b) Let πj denote the proportion of time that Xn : n ∈ N is in state j. If πj > 0 for all
j, what proportion of time is Yn : n ∈ N in each of the states 0, 1, . . . , N? •
3.5 Limit behavior of reducible Markov chains
In this section, we follow Kulkarni (1995, pp. 132–137) quite closely and assume that the
DTMC has k closed communicating classes C1, . . . , Ck and that the remaining states form a
set T (i.e., T = S \ ∪_{r=1}^k Cr). Moreover, the states are assumed to have been relabeled so
that the TPM of the reducible DTMC is of the form
P =
[ P(1)   O    · · ·   O     O ]
[  O    P(2)  · · ·   O     O ]
[  ⋮     ⋮     ⋱      ⋮     ⋮ ]
[  O     O    · · ·  P(k)   O ]
[            D              Q ]
,  (3.26)
where:
• P(r) = [Pij]i,j∈Cr denotes the stochastic matrix associated with class Cr, r =
1, . . . , k;
• the O’s are matrices of zeroes;
149
• Q = [Qij]i,j∈T is a sub-stochastic matrix governing the transitions between the states
in T ;14
• D = [Dij]i∈T, j∈S\T is a matrix such that ∑_{j∈S\T} Dij + ∑_{j∈T} Qij = 1, i ∈ T.
Exercise 3.77 — Relabeling the states of a reducible DTMC (Kulkarni, 1995,
Example 3.26, pp. 135–136)
Relabel the states of the TPM in Exercise 3.32(d),
P =
[ 1/2  0    0    0    0    1/2 ]
[ 0    1/3  0    0    2/3  0   ]
[ 1/6  1/6  1/6  1/6  1/6  1/6 ]
[ 0    0    0    1    0    0   ]
[ 0    2/3  0    0    1/3  0   ]
[ 1/2  0    0    0    0    1/2 ]
,
such that the TPM of the reducible DTMC is of the form (3.26). Identify the matrices
P(1), P(2), P(3), Q and D. •
Remark 3.78 — Limiting behavior of a reducible DTMC (Kulkarni, 1995, p. 132)
• Elementary matrix algebra leads to the following n-step TPM
Pn =
[ Pn(1)   O     · · ·   O      O  ]
[  O     Pn(2)  · · ·   O      O  ]
[  ⋮      ⋮      ⋱      ⋮      ⋮  ]
[  O      O     · · ·  Pn(k)   O  ]
[              Dn             Qn ]
,  (3.27)
where Dn = [Dn(i, j)]i∈T, j∈S\T.
14 I.e., Qij ≥ 0, i, j ∈ T, and ∑_{j∈T} Qij < 1, i ∈ T.
150
• Moreover, since P(r) is, for r = 1, . . . , k, the TPM of an irreducible DTMC, we can
add, for instance, that
lim_{n→+∞} P^n_ij(r) = πj(r),  (3.28)
where the limiting probabilities {πj(r) : j ∈ Cr} are given by the unique solution of
the system of equations
πj(r) = ∑_{i∈Cr} πi(r) Pij, j ∈ Cr
∑_{j∈Cr} πj(r) = 1,  (3.29)
in case Cr is a positive recurrent and aperiodic closed communicating class.
• Similarly, since all states in T must be transient, we know that
lim_{n→+∞} Q^n_ij = 0, i, j ∈ T.  (3.30)
• In conclusion, deriving the limiting behavior of Pn boils down to the study of the
limiting behavior of Dn, described in Proposition 3.79. •
Proposition 3.79 — Limiting behavior of a reducible DTMC (Kulkarni, 1995,
Theorem 3.21, pp. 134–135)
Let:
• αi(r) = P(Xn ∈ Cr, for some n ∈ N0 | X0 = i), for i ∈ T and fixed r ∈ {1, . . . , k}.15
For i ∈ T, j ∈ Cr and fixed r ∈ {1, . . . , k}:
• if Cr is transient or null recurrent then
lim_{n→+∞} Dn(i, j) = 0, i ∈ T, j ∈ Cr;  (3.31)
15 For a fixed r ∈ {1, . . . , k}, the quantities αi(r), i ∈ T, are given by the smallest non-negative
solution {ui : i ∈ T} to ui = ∑_{j∈Cr} Pij + ∑_{j∈T} Pij uj, for i ∈ T (Kulkarni, 1995, Theorem 3.20, p. 133).
151
• if Cr is positive recurrent and aperiodic then
lim_{n→+∞} Dn(i, j) = αi(r) πj(r), i ∈ T, j ∈ Cr,  (3.32)
where {πj(r) : j ∈ Cr} are given by the solution of (3.29);
• if Cr is positive recurrent and periodic then
lim_{n→+∞} (1/(n+1)) ∑_{m=0}^n Dm(i, j) = αi(r) πj(r), i ∈ T, j ∈ Cr,  (3.33)
where {πj(r) : j ∈ Cr} are given by the solution of (3.29); in this case Dn(i, j) does
not have a limit as n → +∞. •
Exercise 3.80 — Limiting behavior of a reducible DTMC
Use Mathematica to “verify” that
P =
[ 1/2  1/2  0    0    0    0   ]
[ 1/2  1/2  0    0    0    0   ]
[ 0    0    1/3  2/3  0    0   ]
[ 0    0    2/3  1/3  0    0   ]
[ 0    0    0    0    1    0   ]
[ 1/6  1/6  1/6  1/6  1/6  1/6 ]
converges to
[ 0.5  0.5  0    0    0    0 ]
[ 0.5  0.5  0    0    0    0 ]
[ 0    0    0.5  0.5  0    0 ]
[ 0    0    0.5  0.5  0    0 ]
[ 0    0    0    0    1    0 ]
[ 0.2  0.2  0.2  0.2  0.2  0 ]
(Kulkarni, 1995, Example 3.26, pp. 135–137). •
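A quick numerical check, sketched in Python with numpy (an assumption — the exercise asks for Mathematica):

```python
import numpy as np

# TPM of Exercise 3.80 and its claimed limit matrix
P = np.array([[1/2, 1/2, 0,   0,   0,   0],
              [1/2, 1/2, 0,   0,   0,   0],
              [0,   0,   1/3, 2/3, 0,   0],
              [0,   0,   2/3, 1/3, 0,   0],
              [0,   0,   0,   0,   1,   0],
              [1/6, 1/6, 1/6, 1/6, 1/6, 1/6]])

limit = np.array([[0.5, 0.5, 0,   0,   0,   0],
                  [0.5, 0.5, 0,   0,   0,   0],
                  [0,   0,   0.5, 0.5, 0,   0],
                  [0,   0,   0.5, 0.5, 0,   0],
                  [0,   0,   0,   0,   1,   0],
                  [0.2, 0.2, 0.2, 0.2, 0.2, 0]])

P100 = np.linalg.matrix_power(P, 100)
print(np.abs(P100 - limit).max())   # essentially zero
```

The last row illustrates Proposition 3.79: from the transient state 6, each closed class is reached with probability 2/5 (and state 5 with probability 1/5), and those probabilities are spread according to the stationary distributions of the classes.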
152
3.6 Markov chains with costs/rewards
Consider a DTMC {Xn : n ∈ N0} and suppose that every time we visit state i we incur
a cost c(i). Then the expected total cost up to time N is given by
E[ ∑_{n=0}^N c(Xn) | X0 = i ],  (3.34)
and the expected cost per time unit — up to time N — equals
(1/(N+1)) E[ ∑_{n=0}^N c(Xn) | X0 = i ].
Can we calculate the long-run expected cost per time unit?
Yes!
It is related to the limiting probabilities, as stated in the following proposition for an
irreducible, positive recurrent DTMC.
Proposition 3.81 — Long-run expected cost per time unit (Kulkarni, 1995,
Theorem 3.23, p. 140)
Let:
• Xn : n ∈ N0 be an irreducible, positive recurrent DTMC, with TPM P, state
space S;
• πj : j ∈ S be the stationary distribution of this DTMC;
• c(i) be the cost incurred whenever we visit state i.
If |c(i)| ≤ B, for all i ∈ S, then the long-run expected cost per time unit — or long-run
cost rate — is given by
lim_{N→+∞} (1/(N+1)) E[ ∑_{n=0}^N c(Xn) | X0 = i ] = ∑_{j∈S} πj c(j),  (3.35)
regardless of the value of the initial state i. •
Remark 3.82 — Long-run expected cost per time unit (Kulkarni, 1995, Theorem
3.23, p. 140)
• It can be shown that Equation (3.35) is still valid if ∑_{j∈S} πj |c(j)| < +∞, which is
a condition weaker than |c(i)| ≤ B, i ∈ S.
153
• Proposition 3.81 can be extended to reducible DTMC; however, the long-run cost
rate then depends on the initial state i. •
Exercise 3.83 — Long-run expected cost per time unit
Prove Proposition 3.81 (Kulkarni, 1995, pp. 140–141). •
Exercise 3.84 — Long-run expected cost per time unit
Clotilde used to play semi-pro basketball and her scoring productivity per game fluctuated
between 3 states:
• 1 (scored 0 or 1 points);
• 2 (scored between 2 and 5 points);
• 3 (scored more than 5 points).
Inevitably, if Clotilde scored more than 5 points in one game, her jealous teammates
refused to pass her the ball in the next game.
The team statistician, upon observing the transitions between states, concluded that
these transitions could be modeled by a DTMC with TPM
P =
[ 0    1/3  2/3 ]
[ 1/3  0    2/3 ]
[ 1    0    0   ]
.
(a) What is the long-run proportion of games that Clotilde scores more than 5 points
(Resnick, 1992, pp. 139–141)?
(b) The salaries in the semi-pro leagues include incentives for scoring. Clotilde was paid
20, 30 and 40 euros per game for scoring in states 1, 2 and 3, respectively.
What was the long-run earning rate of Clotilde (Resnick, 1992, pp. 139, 141)? •
154
Exercise 3.85 — Long-run expected cost per time unit (bis) (Resnick, 1992,
Exercise 2.29, pp. 156–157)
Evaristo visits the dentist every six months. Because of a sweet tooth and fetish for
chocolate, the condition of his teeth varies according to a DTMC on the states 1, 2, 3, 4,
where: 1 means no dental work is required; 2 means a cleaning is required; 3 means a
filling is required; and 4 means a root canal work is needed. Admit that transitions from
state to state are governed by the TPM
P =
[ 0.6  0.2  0.1  0.1 ]
[ 0.4  0.4  0.1  0.1 ]
[ 0.3  0.3  0.2  0.2 ]
[ 0.4  0.5  0.1  0   ]
.
Charges for each visit to the dentist depend on the work done: 20, 30, 50 and 300 euros
if the condition of Evaristo’s teeth are in states 1, 2, 3 and 4, respectively.
(a) What is the percentage of visits associated with a charge of at least 50 euros?
(b) Determine Evaristo’s long-run cost rate for maintaining his teeth. •
Can we consider time-dependent cost functions, namely when we admit that a cost of c
monetary units incurred at time n is equivalent to α^n c (α ∈ [0, 1)) at time 0?
Yes!
Proposition 3.86 — Expected total discounted cost (Kulkarni, 1995, p. 138)
Let:
• Xn : n ∈ N0 be a DTMC, with TPM P, state space S;
• c = [c(i)]i∈S be a column vector of costs;
• α (α ∈ [0, 1)) be the rate at which the costs c(i) are discounted.
155
Then the expected total discounted cost incurred over the infinite horizon, starting at
state i, is equal to
φ(i) = E[ ∑_{n=0}^{+∞} α^n c(Xn) | X0 = i ].  (3.36)
Moreover, φ(i) satisfies
φ(i) = c(i) + α ∑_{j∈S} Pij φ(j), i ∈ S.  (3.37)
Equivalently, the column vector φ = [φ(i)]i∈S is given by
φ = (I − αP)^{−1} × c.  (3.38)
•
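A Python sketch of Equation (3.38) (numpy assumed), using for illustration the brand-switching TPM and the costs of Exercise 3.88:

```python
import numpy as np

P = np.array([[0.1, 0.2, 0.7],
              [0.2, 0.4, 0.4],
              [0.1, 0.3, 0.6]])
c = np.array([1.00, 1.50, 2.00])   # cost per visit of brands A, B, C
alpha = 0.90                       # discount factor

# phi = (I - alpha P)^{-1} c; solving the linear system is preferable
# to forming the inverse explicitly
phi = np.linalg.solve(np.eye(3) - alpha * P, c)
print(phi)
```

Each entry φ(i) is the expected total discounted expenditure of a customer who starts with brand i, and it satisfies the fixed-point equation (3.37) by construction.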
Exercise 3.87 — Expected total discounted cost
Prove Proposition 3.86 (Kulkarni, 1995, p. 138). •
Exercise 3.88 — Long-run cost rate; expected total discounted cost
Consider a brand-switching model such that a typical customer keeps switching between
brands A, B and C according to the following TPM:
P =
[ 0.1  0.2  0.7 ]
[ 0.2  0.4  0.4 ]
[ 0.1  0.3  0.6 ]
.
Suppose brands A, B and C cost 1.00, 1.50 and 2.00 euros, respectively.
(a) Find the long-run expected cost per time unit (Kulkarni, 1995, Example 3.28, p. 141).
(b) Compute the expected total discounted expenditure of a typical customer, assuming
a discount factor α = 0.90 (Kulkarni, 1995, Example 3.27, pp. 139–140). •
156
3.7 Reversible Markov chains
Are there any DTMC with the property that when the direction of time is reversed the
behavior of the process remains the same?
Yes!
Some DTMC (and other stochastic processes) have this curious property and are
called reversible DTMC. Loosely speaking, if we film a DTMC and then run the film
backwards the result will be statistically indistinguishable from the original DTMC
(www.statslab.cam.ac.uk/∼frank/BOOKS/book/ch1.pdf).
But before we proceed, note that we have to consider from now on:
• a DTMC with index set Z, {Xm : m ∈ Z},16 that happens to be irreducible, positive
recurrent and also stationary;17
• the reversed process of this DTMC at n (n ∈ Z), {Xn−m : m ∈ Z}.
Proposition 3.89 — Property of the reversed process (Kulkarni, 1995, Theorem
3.25, p. 143; Ross, 2003, p. 232)
Let {Xm : m ∈ Z} be an irreducible positive recurrent DTMC, with stationary
distribution {πj : j ∈ S}. Then its reversed process at n, {Xn−m : m ∈ Z}, is a DTMC
with transition probabilities
Qij = πj × Pji / πi,  (3.39)
for i, j ∈ S. •
Exercise 3.90 — Property of the reversed process
Prove Proposition 3.89 (Kulkarni, 1995, pp. 143–144; Ross, 2003, p. 232). •
16 For the definition of such a DTMC please refer to Kulkarni (1995, Definition 3.14, p. 142).
17 That is, its initial state, say X−∞, is chosen according to the stationary probabilities {πj : j ∈ S}.
157
Definition 3.91 — Time reversible DTMC (Kulkarni, 1995, Definition 3.12, p. 142)
The DTMC {Xm : m ∈ Z} is said to be time reversible if it has the same probabilistic
behavior as {Xn−m : m ∈ Z}, for all n ∈ Z.18 •
If we capitalize on Definition 3.91, Proposition 3.89 suggests necessary and sufficient
conditions for time reversibility.
Proposition 3.92 — Necessary and sufficient conditions for time reversibility
(Kulkarni, 1995, Theorem 3.26, p. 142)
Let Xm : m ∈ Z be an irreducible positive recurrent DTMC, with stationary
distribution πj : j ∈ S. Then Xm : m ∈ Z is time reversible iff
πi × Pij = πj × Pji, i, j ∈ S. (3.40)
•
Equations (3.40) are usually called detailed balance equations (Kulkarni, 1995, p. 144)
and can be stated as follows: for all states i and j, the rate at which the DTMC goes
from i to j, πi×Pij, is equal to the rate at which it goes from j to i, πj ×Pji (Ross, 2003,
p. 233).
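The detailed balance equations are easy to check numerically. A Python sketch (numpy assumed; the two chains below — a small birth-death chain, which is reversible, and the brand-switching chain, which is not — are chosen purely for illustration):

```python
import numpy as np

def is_reversible(P, tol=1e-10):
    """Check pi_i P_ij = pi_j P_ji, Equation (3.40), for a finite irreducible TPM."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    pi = pi / pi.sum()              # stationary distribution (left eigenvector)
    F = pi[:, None] * P             # F[i, j] = pi_i P_ij
    return bool(np.abs(F - F.T).max() < tol)

birth_death = np.array([[0.5, 0.5, 0.0],
                        [0.5, 0.0, 0.5],
                        [0.0, 0.5, 0.5]])
brand = np.array([[0.1, 0.2, 0.7],
                  [0.2, 0.4, 0.4],
                  [0.1, 0.3, 0.6]])
print(is_reversible(birth_death), is_reversible(brand))   # True False
```

The birth-death chain satisfies (3.40) (its TPM is symmetric and its stationary distribution uniform), while for the brand-switching chain π1 P12 ≠ π2 P21.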
Exercise 3.93 — Time reversible DTMC
Consider a DTMC Xn : n ∈ Z with state space S = 0, 1, . . . ,M and transition
probabilities
Pi,i+1 = αi = 1− Pi,i−1, i = 1, . . . ,M − 1
P0,1 = α0 = 1− P0,0
PM,M = αM = 1− PM,M−1.
(a) Prove that Xn : n ∈ Z is time reversible and compute its limiting probabilities
(Ross, 2003, Example 4.31, pp. 234–235).
18 That is, if (Xn1, Xn2, . . . , Xnk) has the same distribution as (Xn−n1, Xn−n2, . . . , Xn−nk), for all
n, n1, n2, . . . , nk ∈ Z and k ∈ N (www.statslab.cam.ac.uk/∼frank/BOOKS/book/ch1.pdf).
158
(b) Determine those limiting probabilities when αi = α, i = 0, 1, . . . ,M (Ross, 2003,
Example 4.31, p. 235).
(c) The DTMC considered in this exercise arose in an urn model proposed by the
physicists P. and T. Ehrenfest to describe the movements of molecules. These authors
admitted that M molecules were distributed among two urns, I and II, and that at
each time point n one of the molecules is chosen at random, removed from its urn,
and placed in the other one. Let Yn be the number of molecules in urn I at time n.
Then Yn : n ∈ N0 is a DTMC with the same state space and transition probabilities
as Xn : n ∈ N0.
Compute the limiting probabilities in this case (Ross, 2003, pp. 235–236). •
Exercise 3.94 — Time reversible DTMC
Let G be an arbitrary connected graph with cost cij associated with the arc (i, j). Now
consider a particle moving from node i to node j with probability
Pij = cij / ∑_k cik,
where cik = 0 if there is no arc (i, k).
Define a DTMC that describes the movement of this particle and show that this
DTMC is time reversible (Ross, 2003, Example 4.32, pp. 236–237). •
Exercise 3.95 — Time reversible DTMC (Ross, 1983, Exercise 4.29, p. 138)
Consider a time reversible DTMC with state space N0 and transition probabilities Pij
and limiting probabilities πi. Now, consider the same DTMC truncated to the state space
{0, 1, . . . ,M}, with transition probabilities
P̃ij = Pij / ∑_{k=0}^M Pik, for i, j = 0, 1, . . . ,M (and P̃ij = 0 otherwise).
159
Show that the truncated DTMC is also time reversible and has limiting probabilities
given by
π̃i = ( πi ∑_{k=0}^M Pik ) / ( ∑_{k=0}^M πk ∑_{j=0}^M Pkj ). •
Note that the detailed balance equations allow us to determine whether a DTMC
is time reversible based on the transition probabilities and the stationary
distribution, while the stationary distribution is itself determined solely by the
transition probabilities. Thus, we are led to believe that it is possible to
determine whether a DTMC is time reversible from the transition probabilities alone
(www.math.ucsd.edu/∼williams/courses/.../scullardMath289 Reversibility.pdf). This is
indeed the case; the result is known as Kolmogorov's criterion.
Proposition 3.96 — Kolmogorov’s criterion for time reversibility (Ross, 2003,
Theorem 4.2, p. 238; Kulkarni, 1995, Theorem 3.27, p. 145)
An irreducible, positive recurrent stationary DTMC is time reversible iff its transition
probabilities satisfy
Pi,i1 × Pi1,i2 × · · · × Pik,i = Pi,ik × · · · × Pi2,i1 × Pi1,i,  i, i1, i2, . . . , ik ∈ S,  (3.41)
for any k ∈ N, i.e., if, starting in state i, any path back to state i has the same probability
as the reversed path. •
Exercise 3.97 — Kolmogorov’s criterion for time reversibility
Prove Proposition 3.96 (Kulkarni, 1995, pp. 145–146). •
Exercise 3.98 — Kolmogorov’s criterion for time reversibility
Use Proposition 3.96 to investigate if the DTMC described by the following transition
diagrams are time reversible:
160
(a) [transition diagram from Scullard's notes: a Markov chain shown there, via Kolmogorov's
Criterion, to be reversible]
(b) [transition diagram from Scullard's notes: a chain shown there not to be reversible, since a
clockwise loop around the graph has probability 1/256 while a counterclockwise loop has
probability 1/16]
(www.math.ucsd.edu/ williams/courses/.../scullardMath289 Reversibility.pdf). •
161
3.8 Branching processes
Branching processes:
• are Markov processes that model a population in which each individual in generation
n produces some random number of individuals in generation n+1, according, in the
simplest case, to a fixed probability distribution that does not vary from individual
to individual (http://en.wikipedia.org/wiki/Branching process);
• have been applied, for instance, in biology,19 sociology20 and engineering (Ross,
2003, p. 228), namely to model the size of a population of individuals, bacteria,
etc., the spread of surnames and the propagation of neutrons in a nuclear reactor
(http://en.wikipedia.org/wiki/Branching process).
The most common formulation of a branching process is as a Galton-Watson process,
arising originally from Francis Galton’s statistical investigation of the extinction of family
names (http://en.wikipedia.org/wiki/Branching process).
Definition 3.99 — Branching process, X0 = 1 (Kulkarni, 1995, p. 34; Ross, 2003,
pp. 228–229)
Let:
• Xn denote the number of individuals of the nth generation, starting with X0 = 1
individual (the size of the zeroth generation);
• Zl (or Zl,n) be the number of offspring of the lth individual of the nth generation.
If {Zl : l ∈ N} are i.i.d. non-negative integer-valued r.v., with p.f. Pj = P(Zl = j), j ∈ N0,
and independent of the size of the generation, then
Xn = ∑_{l=1}^{Xn−1} Zl, n ∈ N,  (3.42)
19 For concrete illustrations of branching processes in biology, see http://en.wikipedia.org/wiki/Galton-Watson process.
20 See Kulkarni (1995, pp. 33–34).
162
and Xn : n ∈ N0 is usually called a branching process (or, rightly so, a Galton-Watson
process). •
Proposition 3.100 — Branching process, X0 = 1 (Kulkarni, 1995, p. 34; Ross, 2003,
p. 229)
The branching process Xn : n ∈ N0 is a DTMC with state space S = N0,21 and
transition probabilities given by
Pij = P( Xn = ∑_{l=1}^{Xn−1} Zl = j | Xn−1 = i ) = P( ∑_{l=1}^i Zl = j ).  (3.43)
•
Remark 3.101 — Classification of states of a branching process, X0 = 1
• Since P00 = P(Xn+1 = 0 | Xn = 0) = 1, we can conclude that state 0 is absorbing
and, thus, recurrent.
• If P1 = 1 then Xn = 1, n ∈ N0, and the branching process is a DTMC with state
space S = {1}, whose single state is obviously absorbing.
• If P1 < 1 and P0 = 0 then the branching process is an increasing DTMC, with state
space S = N, and all its states are transient (Resnick, 1992, p. 97).22
• If P1 < 1 and P0 > 0 then all the states of the branching process are also transient
(Resnick, 1992, pp. 97–98).23 •
Deriving the p.f. of Xn is far from being simple. One way of identifying this p.f.
is via the p.g.f. of Xn, Pn(s) = PXn(s) = E(s^Xn), s ∈ [0, 1]; after all,
P(Xn = k) = (1/k!) × d^k PXn(s)/ds^k |_{s=0}.
21 In most cases! See Remark 3.101.
22 Because fkk = P(eventual return to k) = P(Xn+1 = k | Xn = k) = P(Zl,n+1 = 1, l = 1, . . . , k) = P1^k < 1, for k ∈ N.
23 Note that in this case fkk ≤ P(X1 ≠ 0 | X0 = k) = 1 − P(X1 = 0 | X0 = k) = 1 − P0^k < 1, for k ∈ N.
163
Proposition 3.102 — P.g.f. of a branching process, X0 = 1 (Resnick, 2003, p. 19)
Let:
• {Xn : n ∈ N0} be a branching process such that X0 = 1;
• Pn(s) = E(s^Xn) = ∑_{j=0}^{+∞} s^j P(Xn = j), s ∈ [0, 1], be the p.g.f. of Xn;
• P(s) = E(s^Zl) = ∑_{j=0}^{+∞} s^j Pj, s ∈ [0, 1], be the common p.g.f. of the r.v. Zl.
Then the p.g.f. of Xn can be obtained recursively:
Pn+1(s) = Pn[P(s)], n ∈ N, s ∈ [0, 1].  (3.44)
Similarly, Pn+1(s) = P[Pn(s)]. •
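Equation (3.44) can be exercised numerically: composing the offspring p.g.f. with itself yields the exact p.f. of X2. A Python sketch (numpy assumed; the offspring p.f. below, P0 = 1/4, P1 = 1/4, P2 = 1/2, is the one in Exercise 3.109(b)):

```python
import numpy as np

def compose(p, q):
    """Coefficients of p(q(s)) by Horner's rule, with polynomial
    multiplication done via convolution of coefficient vectors."""
    out = np.array([0.0])
    for c in reversed(p):
        out = np.convolve(out, q)
        out[0] += c
    return out

pgf = [0.25, 0.25, 0.5]          # P(s) = 1/4 + 1/4 s + 1/2 s^2
pgf2 = compose(pgf, pgf)         # P_2(s) = P(P(s)), the p.g.f. of X_2
print(pgf2)                      # p.f. of X_2 (trailing zeros beyond k = 4)
print(pgf2[0])                   # P(X_2 = 0) = pi_2 = 0.34375
```

The coefficient of s^k in the composed polynomial is exactly P(X2 = k), and the coefficients sum to 1, as a p.f. must.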
Exercise 3.103 — P.g.f. of a branching process, X0 = 1
Prove Proposition 3.102 taking advantage of Equation (3.42) (Resnick, 2003, p. 19). •
In general, determining Pn(s) is not trivial24 (and neither is the computation of the p.f. of
Xn via its p.g.f.). However, we can provide expressions for the expected value and variance
of a branching process.
Proposition 3.104 — Expected value and variance of a branching process, X0 =
1 (Ross, 2003, pp. 229–230)
Let {Xn : n ∈ N0} be a branching process such that X0 = 1. Then
E(Xn | X0 = 1) = µ^n,  (3.45)
V(Xn | X0 = 1) = σ² µ^{n−1} × (µ^n − 1)/(µ − 1), if µ ≠ 1, and n σ², if µ = 1,  (3.46)
where µ = E(Zl) = ∑_{j∈N0} j × Pj and σ² = V(Zl) = ∑_{j∈N0} (j − µ)² × Pj represent the
expected value and variance of the number of offspring an individual has. •
Exercise 3.105 — Expected value and variance of a branching process, X0 = 1
Prove Proposition 3.104 (Ross, 2003, pp. 229–230). •
24For a case where calculations are possible, please refer to Resnick (1992, p. 20).
164
Remark 3.106 — Expected value and variance of a branching process, X0 = 1
From Proposition 3.104 we can conclude that:
lim_{n→+∞} E(Xn | X0 = 1) = 0, if µ < 1; 1, if µ = 1; +∞, if µ > 1;  (3.47)
lim_{n→+∞} V(Xn | X0 = 1) = 0, if µ < 1; +∞, if µ = 1; +∞, if µ > 1.  (3.48)
•
•
A central question in the theory of branching processes is the probability of (ultimate)
extinction (http://en.wikipedia.org/wiki/Branching process), π, and the probability of
extinction on or before generation n, πn = P(Xn = 0 | X0 = 1) = Pn(0) = P(Pn−1(0)) =
P(πn−1), for n ∈ N (with π0 = 0). Unsurprisingly, the problem of determining the value of
the probability of extinction π was first raised in connection with the extinction of family
surnames by Galton in 1889 (Ross, 2003, p. 231).
Proposition 3.107 — Probability of extinction, X0 = 1 (Ross, 1983, p. 117)
Let π denote the probability that the population will eventually die out (assuming that
X0 = 1), i.e.,
π = lim_{n→+∞} P(Xn = 0 | X0 = 1).  (3.49)
Suppose that P0 > 0 and P0 + P1 < 1. Then:
• π = 1 iff µ ≤ 1;
• if µ > 1, π is the smallest positive number satisfying π = P(π), i.e.,
π = ∑_{j=0}^{+∞} π^j × Pj,  (3.50)
where Pj denotes the probability that an individual has j offspring.25 •
25 In fact, π is the smallest positive number satisfying s = P(s), where P(s) represents the p.g.f. of Zl
(Resnick, 1992, Theorem 1.4.1, p. 21). Moreover, for s ∈ [0, 1), π = lim_{n→+∞} Pn(s).
165
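The characterization π = P(π) also suggests a simple numerical scheme: iterate πn = P(πn−1) from π0 = 0, which converges to the smallest root. A Python sketch with a made-up supercritical offspring p.f. (P0 = 1/8, P1 = 3/8, P2 = 1/2, so µ = 11/8 > 1 — an illustrative choice, not one of the exercises below):

```python
# offspring p.f. P0 = 1/8, P1 = 3/8, P2 = 1/2 (illustrative; mu = 11/8 > 1)
def pgf(s):
    return 1/8 + (3/8) * s + (1/2) * s**2

pi = 0.0                  # pi_0 = 0
for _ in range(200):
    pi = pgf(pi)          # pi_n = P(pi_{n-1})
print(pi)                 # smallest positive root of s = P(s), namely 1/4
```

Here s = P(s) reduces to 4s² − 5s + 1 = 0, with roots 1/4 and 1; the iteration converges monotonically to the smaller one, in agreement with Proposition 3.107.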
Exercise 3.108 — Probability of extinction, X0 = 1
Prove Proposition 3.107 (Ross, 2003, p. 231, to show (3.50); Resnick, 1992, p. 22, to show
that π is the smallest positive number satisfying (3.50)). •
Exercise 3.109 — Probability of extinction, X0 = 1
Compute the extinction probability π when:
(a) P0 = 1/2, P1 = 1/4, P2 = 1/4 (Ross, 2003, Example 4.28, p. 232);
(b) P0 = 1/4, P1 = 1/4, P2 = 1/2 (Ross, 2003, Example 4.29, p. 232);
(c) P0 = 1/4, P1 = 1/12, P2 = 2/3;
(d) P0 = 1/6, P1 = 1/12, P2 = 3/4. •
Exercise 3.110 — Probability of extinction, X0 = m
What is the probability that the population will die out if it initially consists of m
individuals (Ross, 2003, Example 4.30, p. 232)? •
Exercise 3.111 — Probability of extinction (Ross, 1983, Exercise 4.24, p. 137)
Let {Xn : n ∈ N0} be a branching process such that the number of offspring per individual
has a binomial distribution with parameters (2, p). Starting with a single individual (i.e.,
X0 = 1), calculate:
(a) the extinction probability π;
(b) the probability that the population becomes extinct (for the first time) in the 3rd
generation.
Suppose that, instead of starting with a single individual, X0 = Z0, where Z0 ∼
Poisson(λ).
(c) Show that, in this case, the extinction probability is given by
exp[−λ(1 − π)],

for π ≡ π(p) and p > 1/2. •
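For part (a), the offspring p.g.f. is P(s) = (q + ps)², so the fixed-point equation s = P(s) is a quadratic. A quick numerical sketch (the sample values of p are illustrative, not from the text) suggesting that, for p > 1/2, the smallest positive root is (q/p)²:

```python
import math

def extinction_bin2(p):
    """Extinction probability for Binomial(2, p) offspring, p > 1/2:
    smallest positive root of s = (q + p*s)^2, i.e. of
    p^2 s^2 + (2pq - 1) s + q^2 = 0."""
    q = 1 - p
    a, b, c = p**2, 2 * p * q - 1, q**2
    disc = math.sqrt(b * b - 4 * a * c)   # the discriminant equals (p - q)^2
    return min((-b - disc) / (2 * a), (-b + disc) / (2 * a))

for p in (0.6, 0.75, 0.9):
    q = 1 - p
    print(p, extinction_bin2(p), (q / p) ** 2)  # the two values agree
```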
3.9 First passage times; absorption probabilities
We start this section by noting that:
• to classify a state i as recurrent or transient we have to calculate fi, the probability
that, starting in state i, the process will ever reenter state i; since fi ≡ fii = ∑_{n=1}^{+∞} f_ii^n,
we need to calculate f_ii^n, a particular case of f_ij^n, the probability that, starting from
state i, the first transition into state j occurs exactly at time n;
• a state i is called absorbing if it is impossible to leave this state, thus, the state i is
absorbing iff Pii = 1; if every state can reach an absorbing state, then the Markov
chain is an absorbing Markov chain (http://en.wikipedia.org/wiki/Markov_chain);
• we are frequently interested in finding the probability that a DTMC (e.g., a
branching process) reaches an absorbing state (e.g., extinction).
We can use a recursive scheme to compute the probabilities f_ij^n, as described in the
next proposition.
Proposition 3.112 — First passage probabilities (Resnick, 1992, pp. 89–90)
Let:
• {Xn : n ∈ N0} be a DTMC with state space S and TPM P;

• f_ij^n be the probability that, starting from state i, the first transition into state j
occurs exactly at time n, i.e., f_ij^n = P(Xn = j, Xn−1 ≠ j, . . . , X1 ≠ j | X0 = i), n ∈ N,
and f_j^n = [f_ij^n]_{i∈S} be the corresponding column vector;

• (j)P be a matrix obtained by setting all the entries of the jth column of P equal to
0.

Then f_ij^n can be obtained by a first jump decomposition,26

f_ij^n = Pij, for n = 1;  f_ij^n = ∑_{k≠j} Pik f_kj^{n−1}, for n = 2, 3, . . . (3.51)

26 f_ij^n = ∑_{k≠j, k∈S} P(Xn = j, Xn−1 ≠ j, . . . , X1 = k | X0 = i) = ∑_{k≠j} P(Xn = j, Xn−1 ≠ j, . . . , X2 ≠ j | X1 = k) × P(X1 = k | X0 = i).
or, more conveniently, f_j^n can be computed by a matrix recursion:

f_j^1 = [Pij]_{i∈S}, for n = 1;  f_j^n = (j)P × f_j^{n−1} = [(j)P]^{n−1} × f_j^1, for n = 2, 3, . . . (3.52)

•
Exercise 3.113 — First passage probabilities (Resnick, 1992, Exercise 2.8, p. 149)
Consider a DTMC on S = {1, 2, 3} with TPM

P = | 1    0    0    |
    | 1/2  1/6  1/3  |
    | 1/3  3/5  1/15 |

(a) Find f_i3^n = P(Xn = 3, Xn−1 ≠ 3, . . . , X1 ≠ 3 | X0 = i), for i, n = 1, 2, 3, without using
Proposition 3.112.

(b) Obtain a recursive equation for f_i3^n, i = 1, 2, 3 and n ∈ N.27 •
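The matrix recursion (3.52) is easy to run numerically. A sketch for the chain of Exercise 3.113 (states relabelled 0–2 for 0-based indexing; the target state is 3):

```python
import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [1/2, 1/6, 1/3],
              [1/3, 3/5, 1/15]])

j = 2                      # target state 3, using 0-based indexing
Pj = P.copy()
Pj[:, j] = 0.0             # (j)P: zero out the jth column of P

f = P[:, j].copy()         # f_j^1 = [P_ij]_{i in S}
first_passage = [f.copy()]
for _ in range(9):         # f_j^n = (j)P f_j^{n-1}, n = 2, ..., 10
    f = Pj @ f
    first_passage.append(f.copy())

# e.g. f_23^2 = P_22 f_23^1 = (1/6)(1/3) = 1/18, since f_13^1 = 0
print(first_passage[1])
```

State 1 is absorbing, so f_13^n = 0 for every n, which the recursion reproduces automatically.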
We are frequently interested in characterizing the time Ti until the system goes from
some initial state i to some terminal critical state, which may represent a machine
breakdown, bankruptcy, or simply a state of interest (Resnick, 1992, p. 102).
These problems can often be formulated as first passage probabilities/times to
absorption/exiting times, as put by Resnick (1992, p. 102), and solutions can be provided
by making use of what resembles the renewal argument and what Resnick (1992, p. 104)
calls first step analysis, as illustrated by Exercise 3.114.
Exercise 3.114 — Time to absorption (Ross, 1983, Exercise 4.15, pp. 135–136)
A DTMC has an absorbing state 0 — that is, P00 = 1 — and set of transient states N.
Let Ti be the time the DTMC takes to reach state 0 given it starts in state i, and let
Mi = E(Ti), i ∈ N.
(a) Show that Mi = 1 + ∑_{j∈N} Pij Mj.

(b) Let σi(n) = P(Ti > n). Derive a formula for σi(n + 1) in terms of σj(n), j ∈ N. •

27 See Resnick (1992, p. 90).
In the absence of absorbing states, the distribution of exiting times can be determined
in a quite trivial way in specific cases, as shown by Exercise 3.115.
Exercise 3.115 — Exiting times (Resnick, 1992, Exercise 2.4, p. 148)
Suppose Pii > 0 and let

τi = inf{n ∈ N : Xn ≠ i | X0 = i}

be the exit time from state i.
Show that τi has a geometric distribution and identify its parameter. •
The general treatment of first passage probabilities/times is as follows.
Proposition 3.116 — First passage probabilities/times (Resnick, 1992, pp. 106–
107)
Let:
• {Xn : n ∈ N0} be a DTMC with state space S;

• S = T ∪ C1 ∪ C2 ∪ · · · be the (canonical) decomposition of the state space, where T
consists of the set of transient states and the communicating classes Cl are closed
and recurrent;

• P = [Pij]i,j∈S be its TPM;

• Q = [Qij]i,j∈T be the restriction of P to the transient states;

• R = [Pkl], k ∈ T, l ∉ T;

• P = | Q  R  |
      | 0  P2 |

be the partition of the TPM;

• τ = inf{n ∈ N0 : Xn ∉ T} be the exit time of the set of transient states;28

• Xτ be the first state hit by the DTMC outside T (assuming τ finite!);

28 We shall assume that τ is finite for all starting states i ∈ T.
• uik = P(Xτ = k | X0 = i) be the probability that the first state the DTMC reaches
when it leaves the set of transient states is k ∉ T, given that the initial state of the
chain is state i ∈ T;

• U = [uik], i ∈ T, k ∉ T, be the matrix of the uik probabilities;

• E(τ | X0 = i), i ∈ T, be the expected value of the first passage time τ, given the
initial state of the DTMC is i ∈ T;

• E[∑_{n=0}^{τ−1} g(Xn) | X0 = i], i ∈ T, be the expected cumulative reward starting from
the initial state i ∈ T until we leave T, where g(j) represents the reward for
being in state j.

Then, once the DTMC leaves T, it will hit one of the closed recurrent communicating
classes and can never return to T, and ui(Cl) = P(Xτ ∈ Cl | X0 = i) = ∑_{k∈Cl} uik.
Moreover:

U = (I − Q)^{−1} × R (3.53)

[E(τ | X0 = i)]_{i∈T} = (I − Q)^{−1} × 1 (3.54)

[E(∑_{n=0}^{τ−1} g(Xn) | X0 = i)]_{i∈T} = (I − Q)^{−1} × g, (3.55)

where (I − Q)^{−1} is usually known as the fundamental matrix, 1 is a vector of ones and g is
the vector of rewards.29
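The three formulas (3.53)–(3.55) are a few lines of linear algebra in code. A sketch on a small hypothetical chain (the matrix entries and rewards are made up for illustration; states 0 and 1 are transient, states 2 and 3 absorbing):

```python
import numpy as np

# Hypothetical TPM, already in canonical form [[Q, R], [0, I]]:
P = np.array([[0.2, 0.3, 0.4, 0.1],
              [0.1, 0.4, 0.2, 0.3],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
Q, R = P[:2, :2], P[:2, 2:]

F = np.linalg.inv(np.eye(2) - Q)   # fundamental matrix (I - Q)^{-1}
U = F @ R                          # (3.53): absorption probabilities
mean_tau = F @ np.ones(2)          # (3.54): expected exit times from T
g = np.array([5.0, 2.0])           # hypothetical per-visit rewards on T
mean_reward = F @ g                # (3.55): expected cumulative reward

print(U.sum(axis=1))  # each row of U sums to 1, since tau is finite here
```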
Exercise 3.117 — First passage times (Resnick, 1992, Exercise 2.16, pp. 151–152)
Evaristo owns a restaurant which fluctuates in successive years between 3 states — 1
(bankruptcy), 2 (verge of bankruptcy), and 3 (solvency) —, according to a DTMC with
TPM equal to
P = | 1    0     0    |
    | 0.5  0.25  0.25 |
    | 0.5  0.25  0.25 |

29 When the state space is finite or when T is finite, (I − Q) has indeed an inverse, which can be
represented as (I − Q)^{−1} = ∑_{n=0}^{+∞} Q^n.
(a) Compute the expected number of years until Evaristo’s restaurant goes bankrupt
assuming it starts from the state of solvency.
Evaristo’s rich aunt, Mrs. T. da Cunha, decides that it is bad for the family name if the
restaurant is allowed to go bankrupt. Thus, when state 1 is entered, Mrs. T. da Cunha
infuses Evaristo’s restaurant with cash, returning it to solvency with probability 1, i.e.,
the TPM for this new DTMC is

P = | 0    0     1    |
    | 0.5  0.25  0.25 |
    | 0.5  0.25  0.25 |
.(b) Is this new DTMC irreducible? Is it aperiodic?
(c) What is the expected number of years between consecutive cash infusions from
Evaristo’s rich aunt?30 •
Exercise 3.118 — First passage times (bis) (Resnick, 1992, Exercise 2.17, p. 152)
Some graduate students exhibit 4 states of mind: 1 (suicidal); 2 (severe depression); 3
(mild depression); 4 (seeking for professional psychiatric help).
Admit weekly changes in state of mind can be modeled as a DTMC with TPM given by
P = | 1     0     0     0    |
    | 0.50  0     0.25  0.25 |
    | 0.25  0.50  0     0.25 |
    | 0     0     0     1    |
(a) Compute the probability the student will eventually be suicidal, starting from state
X0 = 2. Recalculate this probability considering X0 = 3.
(b) Find the expected number of changes of state of mind until a student is suicidal or
seeks professional psychiatric help, considering the initial state X0 = 2. Determine
this expected number assuming now X0 = 3. •

30 Recall that µjj = 1/πj.
Exercise 3.119 — First passage times (bis, bis) (Resnick, 1992, pp. 106–107)
Clotilde runs a restaurant and organizes an “amateur night” there on Fridays. The clientele
of the restaurant judges the performers, whose quality falls into 5 categories, with:
• 1 being the best;
• 5 being atrocious and able to cause a riot with probability 0.3.
Clotilde admits that the succession of states on amateur night can be modeled as a
DTMC with
• 6 states, where i represents a class i performer (i = 1, . . . , 5), and state 6 represents
“riot”;
• and TPM given by
P = | 0.05  0.15  0.3  0.3   0.2   0   |
    | 0.05  0.3   0.3  0.3   0.05  0   |
    | 0.05  0.2   0.3  0.35  0.1   0   |
    | 0.05  0.2   0.3  0.35  0.1   0   |
    | 0.01  0.1   0.1  0.1   0.39  0.3 |
    | 0.2   0.2   0.2  0.2   0.2   0   |
To play it safe Clotilde starts the evening off with a class 2 performer.
(a) What is the probability that a class 1 performer is discovered before a riot is started?
(b) What is the expected number of performers seen before the first riot? •
Exercise 3.120 — More on first passage times (Ross, 1983, Exercise 4.21, p. 137)
A spider hunting a fly moves between locations 1 and 2 according to a DTMC with TPM
P = | 0.7  0.3 |
    | 0.3  0.7 |

and starting in location 1.
The fly, unaware of the spider, starts in location 2 and moves according to a DTMC
with TPM
Q = | 0.4  0.6 |
    | 0.6  0.4 |
(a) Show that the progress of the hunt, except for knowing the location where it ends,
can be described by a three-state DTMC. Obtain the TPM for this DTMC.
(b) Find the probability that at time n the spider and fly are both in the same
compartment.
(c) What is the average duration of the hunt? •
Exercise 3.121 — Classification of states of a symmetric random walk (Ross,
1983, Exercise 4.6, pp. 134–135)
For the symmetric random walk starting at 0:
(a) What is the expected time to return to 0?
(b) Let N2n denote the number of returns by time 2n. Show that E(N2n) = (2n + 1) 2^{−2n} (2n choose n) − 1.

(c) Use (b) and Stirling’s approximation31 to show that, for n large, E(Nn) is proportional
to √n. •
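The identity in (b) is easy to check numerically: one route uses E(N2n) = ∑_{k=1}^{n} P(X2k = 0) = ∑_{k=1}^{n} (2k choose k) 4^{−k}, which should match the closed form (2n + 1) 4^{−n} (2n choose n) − 1. A quick sketch:

```python
from math import comb

def expected_returns(n):
    """E(N_2n) as a sum of return probabilities P(X_2k = 0) = C(2k, k) / 4^k."""
    return sum(comb(2 * k, k) / 4**k for k in range(1, n + 1))

def closed_form(n):
    """The closed form of Exercise 3.121(b)."""
    return (2 * n + 1) * comb(2 * n, n) / 4**n - 1

for n in (1, 5, 20):
    print(n, expected_returns(n), closed_form(n))  # the two columns agree
```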
Exercise 3.122 — The gambler’s ruin problem
Consider a gambler who at each play of the game has probability p of winning one
monetary unit and probability q = 1 − p of losing one monetary unit.
Assuming successive plays of the game are independent, prove that the probability
that, starting with i monetary units, the gambler’s fortune will reach N before reaching
0 equals:

[1 − (q/p)^i] / [1 − (q/p)^N], if p ≠ 1/2;
i/N, if p = 1/2

(Ross, 1983, Example 4.4(a), pp. 115–116). •

31 n! ∼ √(2π) e^{−n} n^{n+1/2}, for sufficiently large n.
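One way to sanity-check the gambler’s ruin formula is to solve the first-step-analysis system h_i = p h_{i+1} + q h_{i−1}, with h_0 = 0 and h_N = 1, and compare it with the closed form. A sketch (the function names are my own):

```python
import numpy as np

def reach_N_before_0(i, N, p):
    """Closed-form probability of reaching N before 0 from fortune i."""
    if p == 0.5:
        return i / N
    r = (1 - p) / p
    return (1 - r**i) / (1 - r**N)

def reach_N_linear_solve(i, N, p):
    """Same probability via h_i = p h_{i+1} + q h_{i-1}, h_0 = 0, h_N = 1."""
    A = np.zeros((N + 1, N + 1))
    b = np.zeros(N + 1)
    A[0, 0], A[N, N], b[N] = 1.0, 1.0, 1.0
    for k in range(1, N):
        # row k encodes q*h_{k-1} - h_k + p*h_{k+1} = 0
        A[k, k - 1], A[k, k], A[k, k + 1] = 1 - p, -1.0, p
    return np.linalg.solve(A, b)[i]

print(reach_N_before_0(5, 10, 0.6), reach_N_linear_solve(5, 10, 0.6))
```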
Exercise 3.123 — The gambler’s ruin problem (bis) (Ross, 1983, Exercise 4.18, p.
136)
In the gambler’s ruin problem, prove that the probability that he/she wins the next
gamble, given that the present fortune is i and he/she eventually reaches N, is equal to:

p [1 − (q/p)^{i+1}] / [1 − (q/p)^i], if p ≠ 1/2;
(i + 1)/(2i), if p = 1/2.

•
Exercise 3.124 — The gambler’s ruin problem (bis, bis) (Ross, 1983, Exercise
4.20, pp. 136–137)
Suppose that two independent sequences X1, X2, . . . and Y1, Y2, . . . are coming in from
some laboratory and that they represent the results of Bernoulli trials with unknown
success probabilities P1 and P2.32

To decide whether P1 > P2 or P2 > P1, we use the following test. Choose some
positive integer M and stop at N, the first value of n such that either ∑_{i=1}^{n}(Xi − Yi) = M or
∑_{i=1}^{n}(Xi − Yi) = −M. In the former case we then assert that P1 > P2, and in the latter
that P2 > P1.
Show that, when P1 > P2:
(a) the probability of making an error (that is, of asserting that P2 > P1) is equal to
1/(1 + λ^M), where λ = [P1(1 − P2)] / [P2(1 − P1)];

(b) the expected number of pairs observed is [M(λ^M − 1)] / [(P1 − P2)(λ^M + 1)].
Hint: Relate this to the gambler’s ruin problem. •
32That is, P (Xi = 1) = 1 − P (Xi = 0) = P1 and P (Yi = 1) = 1 − P (Yi = 0) = P2 and all r.v. are
independent.
Chapter 4
Continuous time Markov chains
In the DTMC setting, we enter a state i at time n and stay there for exactly one time
unit and then jump to state j at time n + 1 with probability Pij, regardless of the states
we have visited up to and including time n − 1 (Kulkarni, 1995, p. 240).
Since many processes we may wish to model occur in continuous time,1 can we consider
a stochastic model, with index set R+0 , such that we enter state i at time t, stay there for
a random amount of time, then jump to state j with probability Pij and still satisfy the
Markov property?
YES!
As long as the time spent in state i is an exponentially distributed r.v. independent of
the next state visited. The resulting stochastic process is called a continuous time Markov
chain (CTMC).
CTMC have a wide variety of applications in the real world (Ross, 2003, p. 349)
— they naturally arise in control and optimization, manufacturing systems, biology and
financial engineering. Moreover, a large class of queueing models can be studied as CTMC
(Resnick, 1992, p. 367).
We have already dealt with a few CTMC. For instance, if the total number of arrivals
by time t is the state of the process at time t, then the Poisson process {N(t) : t ≥ 0} is
a CTMC with state space N0 (Ross, 2003, p. 349).
1E.g., disease transmission events, state of deterioration of mechanical components, etc.
4.1 Definitions and examples
We start this section with two possible definitions of CTMC.
Definition 4.1 — CTMC (Ross, 2003, p. 350)
Let {X(t) : t ≥ 0} be a continuous time stochastic process taking values in the set of
non-negative integers (that is, the state space S ⊆ N0). If

P[X(t + s) = j | X(s) = i, X(u) = x(u), 0 ≤ u < s] = P[X(t + s) = j | X(s) = i], (4.1)

for all s, t ≥ 0 and non-negative integers i, j and x(u), 0 ≤ u < s, then {X(t) : t ≥ 0} is
said to be a CTMC. If, in addition, the transition probabilities satisfy

P[X(t + s) = j | X(s) = i] = P[X(t) = j | X(0) = i], (4.2)

then the CTMC is said to be time-homogeneous.2 •
To motivate another definition of CTMC, let us recall Ross (1983, p. 142), who
mentions that, by the Markov property, this stochastic process has the following properties
each time it enters state i:
• the amount of time spent in state i (sojourn or holding time!) before making a
transition into a different state has exponential distribution with parameter νi;3
• the probability that the process leaves state i and the next state it enters is j equals
Pij, where ∑_{j≠i} Pij = 1.4
Definition 4.2 — CTMC (bis) (Kulkarni, 1995, pp. 240–241)
Let:
• {X(t) : t ≥ 0} be a continuous time stochastic process with state space S ⊆ N0;
2 Or to have stationary or homogeneous transition probabilities. The material in this and the next
sections only refers to CTMC with stationary transition probabilities.
3 A state i is called instantaneous if νi = +∞ (Kulkarni, 1995, p. 241), i.e., the expected sojourn
time in state i is equal to 0. From now on, we shall only deal with CTMC with no instantaneous states.
4 Pii = 0 unless state i is an absorbing state — in this case Pii = 1 (Kulkarni, 1995, p. 246) and νi = 0.
• S0 = 0;
• Sn be the time of the nth transition;
• Yn = Sn − Sn−1 be the nth sojourn or holding time;
• X0 = X(0) be the initial state of the process;
• Xn = X(S_n^+) = X(Sn) be the state of the stochastic process immediately after the
nth transition;

• Pij = P[X(S_{n+1}^+) = j | X(S_n^+) = i].
Then the stochastic process {X(t) : t ≥ 0} is said to be a CTMC with initial state
X0 = X(0) if it changes states at times 0 < S1 < S2 < . . . and the embedded process
{X0, (Xn, Yn) : n ∈ N} satisfies

P[Xn+1 = j, Yn+1 > y | (Xn, Yn) = (i, yn), (Xn−1, Yn−1) = (in−1, yn−1), . . . ,
(X1, Y1) = (i1, y1), X0 = i0] = Pij × e^{−νi × y}, (4.3)

for all non-negative integers i, j, in−1, . . . , i1, i0 and non-negative real numbers
y, yn, yn−1, . . . , y1.

{Xn : n ∈ N0} is usually called the embedded DTMC in the CTMC {X(t) : t ≥ 0}. •
Less formally, in a CTMC the succession of states visited still follows a DTMC but
now the flow of time is perturbed by exponentially distributed sojourn (or holding) times
in each state (Resnick, 1992, p. 367).
Example/Exercise 4.3 — (Sample path of a) CTMC
Some of the stochastic processes we have previously studied are indeed CTMC (Kulkarni,
1995, p. 242):
• {X(t) : t ≥ 0} ∼ PP(λ) is a CTMC with νi = λ and Pi,i+1 = 1;

• a compound PP — with batch arrival rate λ and batch sizes with p.f. ak =
P(batch size = k), k ∈ N — is a CTMC with νi = λ and Pij = aj−i, for j > i.
Draw a typical sample path of a CTMC (Kulkarni, 1995, Figure 6.1, p. 241). •
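Definition 4.2 translates directly into a simulation recipe: draw an Exp(νi) sojourn, then jump according to row i of the embedded TPM. A minimal sketch in Python (the function name and the illustrative rates are my own; the two-state machine of Exercise 4.12 is used as an example):

```python
import random

def simulate_ctmc(nu, P, x0, n_jumps, rng):
    """Simulate a CTMC path as a list of (jump time, new state) pairs.
    nu[i] = rate out of state i (sojourns are Exp(nu[i]));
    P[i]  = row i of the embedded DTMC's TPM."""
    t, x, path = 0.0, x0, [(0.0, x0)]
    for _ in range(n_jumps):
        t += rng.expovariate(nu[x])                         # sojourn in x
        x = rng.choices(range(len(P[x])), weights=P[x])[0]  # embedded jump
        path.append((t, x))
    return path

# Machine down (0) / up (1): repair rate lam, failure rate mu (made-up values)
lam, mu = 2.0, 1.0
nu = [lam, mu]
P = [[0, 1], [1, 0]]   # from each state, the next state is the other one
path = simulate_ctmc(nu, P, x0=1, n_jumps=10, rng=random.Random(42))
print(path)
```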
4.2 Properties of the transition matrix; Chapman-Kolmogorov equations
The law of motion of a CTMC is governed by a time-dependent transition probability
matrix.
Definition 4.4 — Transition probability matrix (Kulkarni, 1995, p. 243)
Let:
• {X(t) : t ≥ 0} be a (time-homogeneous) CTMC with state space S;
• Pij(t) = P [X(t) = j | X(0) = i], i, j ∈ S, be the (time-dependent) transition
probabilities.
Then
P(t) = [Pij(t)]i,j∈S (4.4)
is called the transition probability matrix (TPM). •
Remark 4.5 — Transition probability matrix (Kulkarni, 1995, p. 243)
The reader should not mistake the transition probabilities Pij(t) = P [X(t) = j | X(0) = i]
of the CTMC for the transition probabilities Pij of the embedded DTMC. •
Proposition 4.6 — Characterization of a CTMC (Kulkarni, 1995, Theorem 6.2, p.
244)
A CTMC {X(t) : t ≥ 0} is fully characterized by its

• TPM, P(t), and

• initial distribution, that is, the p.f. of X(0), denoted by α = [αi]i∈S = [P[X(0) = i]]i∈S . •
P(t) is certainly a stochastic matrix and satisfies the Chapman-Kolmogorov equations
(obviously rewritten for a continuous time stochastic process).
Proposition 4.7 — Properties of the TPM; Chapman-Kolmogorov equations
(Kulkarni, 1995, Theorem 6.3, p. 253–254)
The TPM, P(t), of a CTMC {X(t) : t ≥ 0} has the following properties:

• Pij(t) ≥ 0, i, j ∈ S, t ≥ 0;

• ∑_{j∈S} Pij(t) = 1, i ∈ S, t ≥ 0.

Moreover, P(0) = I and the Chapman-Kolmogorov equations5 are written as follows:

Pij(t + s) = ∑_{k∈S} Pik(t) × Pkj(s), i, j ∈ S, t, s ≥ 0, (4.5)

or, in matrix form,

P(t + s) = P(t) × P(s) (4.6)
         = P(s) × P(t), t, s ≥ 0. (4.7)
•
Exercise 4.8 — Chapman-Kolmogorov equations
Prove the Chapman-Kolmogorov equations (4.5) (Ross, 2003, p. 363). •
Exercise 4.9 — Properties of the TPM (Isaacson and Madsen, 1976, Exercise 1, p.
231)
Which of the following matrices have the properties of the TPM for a CTMC?
(a) P(t) = | e^{−t}  1 − e^{−t} |
           | 0       1          |

(b) P(t) = | e^t  1 − e^t |
           | 0    1       |

(c) P(t) = | 1             0       |
           | 1 − te^{−t}   te^{−t} |

(d) P(t) = | t + e^{−t}  1 − t − e^{−t} |
           | 0           1              |

(e) P(t) = | 1 − te^{−t}   te^{−t}        0              0            |
           | te^{−t}       1 − 3te^{−t}   2te^{−t}       0            |
           | 0             te^{−t}        1 − 2te^{−t}   te^{−t}      |
           | 0             0              te^{−t}        1 − te^{−t}  |

•

5 The equations were arrived at independently by both the British mathematician Sydney
Chapman (1888–1970) and the Russian mathematician Andrey Kolmogorov (1903–1987)
(http://en.wikipedia.org/wiki/Chapman-Kolmogorov_equation).
Marginal and joint probabilities (ADDED)
Let:
• {X(t) : t ≥ 0} be a CTMC with TPM P(t) = [Pij(t)]i,j∈S ;

• α = [αi]i∈S be the row vector with the initial distribution of the CTMC (i.e., the
p.f. of X(0)).

Then

P[X(t) = j] = ∑_{i∈S} P[X(0) = i] × P[X(t) = j | X(0) = i] = ∑_{i∈S} αi × Pij(t), j ∈ S, (4.8)

and the row vector with the p.f. of X(t) is given by

[P[X(t) = j]]_{j∈S} = α × P(t). (4.9)

Moreover,

P[X(t1) = x(t1), . . . , X(tk) = x(tk)] = [∑_{i∈S} αi × Pi,x(t1)(t1)] × ∏_{j=2}^{k} Px(tj−1),x(tj)(tj − tj−1), (4.10)

for 0 ≤ t1 < t2 < · · · < tk and x(t1), . . . , x(tk) ∈ S. •
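Equation (4.9) in code, for a two-state chain whose P(t) entries admit the closed form derived later in Exercise 4.28 (the rates and the initial distribution below are made-up illustrative values):

```python
import math

lam, mu, t = 1.0, 2.0, 0.5   # illustrative rates: state 0 fails at rate lam
a = math.exp(-(lam + mu) * t)
# Closed-form TPM P(t) for the two-state machine chain
P = [[lam / (lam + mu) * a + mu / (lam + mu), lam / (lam + mu) * (1 - a)],
     [mu / (lam + mu) * (1 - a), mu / (lam + mu) * a + lam / (lam + mu)]]

alpha = [0.3, 0.7]           # hypothetical initial distribution of X(0)
pf = [sum(alpha[i] * P[i][j] for i in range(2)) for j in range(2)]  # alpha x P(t)
print(pf)  # the p.f. of X(t); its two entries sum to 1
```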
Definition 4.10 — Instantaneous transition rates (Ross, 2003, p. 362)
For any pair of states i and j (i 6= j), let
qij = νi × Pij, (4.11)
where νi is the rate at which the process makes a transition when in state i and Pij
is the probability that this transition is into state j. The quantities qij are called the
instantaneous transition rates and represent the rate, when in state i, at which the process
makes a transition into state j. •
Remark 4.11 — Instantaneous transition rates and rate diagrams (Kulkarni,
1995, p. 246)
A rate diagram is a directed graph in which each state is represented by a node and there
is an arc going from node i to node j (if qij > 0) with qij written on it.
The rate diagram helps us visualize the dynamics of the CTMC and is the continuous
analogue of the transition diagram of a DTMC. •
Exercise 4.12 — Rate diagram (Kulkarni, 1995, Example 6.1, pp. 242 and 246–247)
Consider a machine that can be either up (1) or down (0). If the machine is up (resp.
down), it fails (resp. is repaired) after an Exp(µ) (resp. Exp(λ)) amount of time. Once
this machine is repaired it is good as new.
Let X(t) : t ≥ 0 be the state of the machine at time t and draw the corresponding
rate diagram. •
Specifying the instantaneous transition rates determines the parameters of the CTMC
(Ross, 2003, p. 362). In addition, the instantaneous transition rates are related to the
infinitesimal behavior of the transition probabilities, as stated by the next proposition.
Proposition 4.13 — Infinitesimal behavior of the transition probabilities (Ross,
2003, Lemma 6.2, p. 362)
Let {X(t) : t ≥ 0} be a CTMC with state space S, TPM P(t) and instantaneous transition
rates qij. Then:

lim_{h→0+} Pij(h)/h = qij, i ≠ j; (4.12)

lim_{h→0+} [1 − Pii(h)]/h = νi. (4.13)

•
Capitalizing on Proposition 4.13 and on the Chapman-Kolmogorov equations, we can
derive a set of differential equations that the transition probabilities Pij(t) satisfy (Ross,
2003, p. 362) and provide a solution for them.
Proposition 4.14 — Kolmogorov’s backward and forward equations (Ross, 2003,
Theorem 6.1, pp. 364, 367)
For all states i and j and times t ≥ 0:
dPij(t)/dt = lim_{h→0+} [Pij(h + t) − Pij(t)] / h = ∑_{k≠i} qik Pkj(t) − νi Pij(t) (backward equations); (4.14)

dPij(t)/dt = lim_{h→0+} [Pij(t + h) − Pij(t)] / h = ∑_{k≠j} Pik(t) qkj − Pij(t) νj (forward equations). (4.15)

•
Exercise 4.15 — Kolmogorov’s backward and forward equations
Prove Proposition 4.14 (Ross, 2003, Theorem 6.1, pp. 363–364 and 367). •
Proposition 4.16 — Kolmogorov’s backward and forward equations in matrix
form (Ross, 2003, p. 388)
Let

rij = qij, if i ≠ j;  rij = −νi, if i = j, (4.16)

and R = [rij]i,j∈S .6 Then Kolmogorov’s backward and forward equations can be written in
matrix form:

dP(t)/dt = [dPij(t)/dt]_{i,j∈S} = R × P(t) (backward equations) (4.17)
                               = P(t) × R (forward equations). (4.18)

•

6 R is usually called the rate matrix (or the infinitesimal generator) of the CTMC.
Proposition 4.17 — Solution of the Kolmogorov’s backward and forward
equations in matrix form (Ross, 2003, p. 388)
The solution of the matrix differential equations dP(t)/dt = R × P(t) and dP(t)/dt = P(t) × R
is

P(t) = e^{R t} (4.19)
     = ∑_{n=0}^{+∞} R^n t^n / n!. (4.20)

•
According to Ross (2003, p. 389), the direct use of (4.20) to compute P(t) turns out to
be very inefficient (why?),7 not to mention the case of CTMC with infinite state space.
Consequently, we are going to discuss methods to derive or to approximate the TPM P(t)
in the next two sections.
7Since R contains both positive and negative entries we are bound to deal with computer round-off
errors when we compute the powers of the matrix R. Moreover, to arrive at a good approximation we
have to compute a lot of the terms in the infinite sum (4.20).
4.3 Computing the transition matrix: finite state
space
Rather than using (4.20) to compute the TPM, we can use the matrix equivalent of the
identities
e^x = lim_{n→+∞} (1 + x/n)^n = lim_{n→+∞} [(1 − x/n)^{−1}]^n

to efficiently (derive or) approximate P(t).
Proposition 4.18 — Two approximations to P(t) (Ross, 2003, pp. 389–390)
Since

P(t) = e^{R t} = lim_{n→+∞} (I + R t/n)^n (4.21)
              = lim_{n→+∞} [(I − R t/n)^{−1}]^n, (4.22)

if we let n be a power of 2, say n = 2^k, then we can approximate P(t) by raising either the
matrix (I + R t/n) or the matrix (I − R t/n)^{−1} to the nth power, which can be accomplished
by k matrix multiplications.8 •
Exercise 4.19 — Two approximations to P(t)
Consider

R = | −λ  λ  |
    | µ   −µ |

where λ = 1 and µ = 2.

8 For instance, we multiply (I + R t/n) by itself to obtain (I + R t/n)^2 and then multiply that by
itself to obtain (I + R t/n)^4 and so on.
(a) Use Mathematica, in particular the function MatrixExp, to obtain P(t), for t = 1, 100.
(b) Compare the exact results in (a) to the approximate ones, obtained by using
Proposition 4.18. •
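The squaring trick of Proposition 4.18 takes only a few lines. A sketch for the 2 × 2 rate matrix above, compared against the closed form P00(t) = µ/(λ + µ) + [λ/(λ + µ)] e^{−(λ+µ)t} derived in Exercise 4.28 (the function name is my own):

```python
import numpy as np

def approx_P(R, t, k=20):
    """Approximate P(t) = exp(R t) by (I + R t / 2^k)^(2^k),
    computed with k successive squarings (Proposition 4.18)."""
    n = 2**k
    M = np.eye(R.shape[0]) + R * t / n
    for _ in range(k):
        M = M @ M
    return M

lam, mu = 1.0, 2.0
R = np.array([[-lam, lam], [mu, -mu]])
t = 1.0
P = approx_P(R, t)

# Closed form for this two-state chain:
exact_P00 = mu / (lam + mu) + lam / (lam + mu) * np.exp(-(lam + mu) * t)
print(P[0, 0], exact_P00)
```

Each squaring doubles the exponent, so n = 2^20 costs only 20 matrix multiplications.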
For a more detailed account on other methods to compute the TPM P(t) of a CTMC
with finite state space, the reader is referred to Kulkarni (1995, pp. 261–274).
4.4 Computing the transition matrix: infinite state
space
When the CTMC is defined on an infinite state space, exact analytical computation of
P(t) is only possible if the CTMC has a special structure, such as the one in Exercise
4.20.
The method of Laplace transforms, also used when the state space is finite, is
particularly useful and therefore briefly described, following Kulkarni (1995, pp. 269–270).
Since P(t) is a matrix of bounded continuous functions, it is possible to define Laplace
transforms of its entries:

P*_ij(s) = ∫_0^{+∞} e^{−st} Pij(t) dt, i, j ∈ S. (4.23)

Using the properties of Laplace transforms and Kolmogorov’s forward equations, we
successively get

∫_0^{+∞} e^{−st} [dPij(t)/dt] dt = s × P*_ij(s) − Pij(0) (4.24)

and

s × P*_ij(s) − Pij(0) = ∑_{k≠j} P*_ik(s) × qkj − P*_ij(s) × νj. (4.25)
Now, using the fact that P(0) = I, we can solve (4.25) recursively, as in the next exercise.
Exercise 4.20 — Obtaining P(t) via the method of Laplace transforms
Consider {X(t) : t ≥ 0} ∼ PP(λ) and Pk(t) ≡ P[X(t) = k | X(0) = 0]:
(a) write Kolmogorov’s forward equations for this CTMC in terms of Pk(t);
(b) use the method of Laplace transforms to obtain a solution to Pk(t)
(Kulkarni, 1995, Example 6.20, p. 275). •
Kulkarni (1995, pp. 275–282) shows how the problem of computing P(t) — when the
CTMC is defined on an infinite state space — needs analytical tools such as Laplace
transforms, partial differential equations, etc.9
All these methods are particularly useful to solve Kolmogorov’s backward and forward
equations, while dealing with a broad and popular class of CTMC, the birth and death
processes.
4.5 Birth and death processes
Birth-death processes are special cases of CTMC where the state transitions are of only
two types:

• births (or arrivals), which increase the state variable by one;

• deaths (or departures), which decrease the state variable by one.
The model’s name comes from a common application, the use of such models to represent
the current size of a population where the transitions are literally due to births and deaths
(http://en.wikipedia.org/wiki/Birth-death_process).

Unsurprisingly, birth-death processes have many applications in demography, queueing
theory, performance engineering, epidemiology, or biology — they may be used,
for example, to study the evolution of bacteria, the number of people with a
disease within a population, or the number of customers in line at the supermarket
(http://en.wikipedia.org/wiki/Birth-death_process).
9In fact, partial differential equations arise while solving Kolmogorov’s forward equations via the p.g.f.,
as illustrated by Kulkarni (1995, Example 6.24, pp. 278–282).
Definition 4.21 — Birth and death process (Ross, 2003, p. 352)
Let the state variable X(t) be the number of people in a system at time t. Now, suppose
that whenever there are n people in the system

• the time until the next birth/arrival is exponentially distributed, with mean λn^{−1}
(n ∈ N0), and independent of

• the time until the next death/departure, which is exponentially distributed with
mean µn^{−1} (n ∈ N).

Then {X(t) : t ≥ 0} is called a birth and death process, with birth rates {λn : n ∈ N0}
and death rates {µn : n ∈ N}. •
Remark 4.22 — Birth and death processes (Ross, 2003, pp. 352–353; Kleinrock,
1975, p. 54)
• A birth and death process is a CTMC with state space N0 for which transitions only
to states n− 1 and n+ 1 are possible from state n.
• The rates at which the process makes a transition when in state i are:

ν0 = λ0; (4.26)

νi = λi + µi, i ∈ N. (4.27)

The transition probabilities Pij of the embedded DTMC are equal to:

P01 = 1; (4.28)

Pi,i+1 = P(birth before a death given i people in the system) = λi / (λi + µi), i ∈ N; (4.29)

Pi,i−1 = P(death before a birth given i people in the system) = µi / (λi + µi), i ∈ N. (4.30)

• In addition, the instantaneous transition rates are given by

qij = νi × Pij = λi, if j = i + 1;  µi, if j = i − 1, (4.31)

for i ≠ j.

• Given that X(t) = i, the probability that:

– one birth occurs in the interval (t, t + ∆t] is given by Pi,i+1(∆t) = λi × ∆t + o(∆t);

– one death occurs in the interval (t, t + ∆t] is equal to Pi,i−1(∆t) = µi × ∆t + o(∆t);

– no death or birth occurs in the interval (t, t + ∆t] amounts to Pi,i(∆t) = 1 − (λi + µi) × ∆t + o(∆t).
Consequently,
– multiple births,
– multiple deaths,
– a birth and a death,
in intervals of infinitesimal range ∆t are not possible.
• A birth and death process for which µn = 0, n ∈ N (resp. λn = 0, n ∈ N0), is called
a pure birth (resp. pure death) process. •
Exercise 4.23 — Rate diagrams and rate matrices of birth and death processes
Draw the rate diagrams of the following birth and death processes:
(a) Poisson process with arrival rate λ (Kulkarni, 1995, Figure 6.5, p. 249);
(b) pure birth process with birth rates λi (Kulkarni, 1995, Figure 6.6, p. 249);
(c) pure death process with death rates µi (Kulkarni, 1995, Figure 6.7, p. 250);
(d) general birth and death process with birth and death rates λi and µi, respectively
(Kulkarni, 1995, Figure 6.8, p. 251; http://en.wikipedia.org/wiki/Birth-death_process).
Identify the rate matrices of all these CTMC. •
Before we proceed to describe the derivation of the TPM P(t), we illustrate the
computation of the expected value of the state variable of a few birth and death processes
with two exercises.
Exercise 4.24 — Expected value of a linear growth model
Consider a population in which each individual gives birth at an exponential rate λ and
dies at an exponential rate µ.
After having identified the birth and death rates of this linear growth model, derive
Mi(t) = E[X(t) | X(0) = i], the expected value of the size of the population at time t
given that the population started with i ∈ N individuals (Ross, 1989, Example 3c, pp.
252–254). •
Exercise 4.25 — Expected value of a linear growth model with immigration
Admit the size of a bird colony is governed by a birth and death process with rates
λn = nλ+ θ, n ∈ N0 and µn = nµ, n ∈ N.10
Derive E[X(t) | X(0) = i], the expected value of the size of the bird colony at time
t given that this colony was founded by i ∈ N individuals (Ross, 2003, Example 6.4, pp.
353–355). •10This is called a linear growth model with immigration: each individual in the population is assumed
to give birth at an exponential rate λ; there is an exponential rate of increase θ of the population due to
an external source such as immigration; deaths are assumed to occur at an exponential rate µ for each
member of the population (Ross, 2003, Example 6.4, p. 353).
Proposition 4.26 — Kolmogorov’s backward and forward equations for birth
and death processes (Ross, 2003, examples 6.10 and 6.12, pp. 364 and 368)
For birth and death processes:
• Kolmogorov’s backward (h + t) equations become

dP0j(t)/dt = λ0 P1j(t) − λ0 P0j(t), j ∈ N0; (4.32)

dPij(t)/dt = λi Pi+1,j(t) + µi Pi−1,j(t) − (λi + µi) Pij(t), i ∈ N, j ∈ N0; (4.33)

• Kolmogorov’s forward (t + h) equations are given by

dPi0(t)/dt = Pi1(t) µ1 − Pi0(t) λ0, i ∈ N0; (4.34)

dPij(t)/dt = Pi,j−1(t) λj−1 + Pi,j+1(t) µj+1 − Pij(t)(λj + µj), i ∈ N0, j ∈ N. (4.35)

•
Exercise 4.27 — Kolmogorov’s backward and forward equations for a pure
birth process
Write Kolmogorov’s backward and forward equations for a pure birth process (Ross, 2003,
Example 6.9, p. 364). •
Solving Kolmogorov’s backward differential equations is feasible, namely for some
birth and death processes with finite state space such as the CTMC of the next exercise.
Exercise 4.28 — Solving Kolmogorov’s backward differential equations
Suppose that:
• a machine works for an exponential amount of time with mean λ−1 before breaking
down;
• it takes an exponential amount of time with mean µ−1 to repair the machine.
(a) Show that if the machine is in working condition (state 0) at time 0 then the
probability that it will be working at time t is equal to

P00(t) = λ/(λ + µ) × e^{−(λ+µ)t} + µ/(λ + µ)

and

P10(t) = µ/(λ + µ) − µ/(λ + µ) × e^{−(λ+µ)t}

(Ross, 1989, Example 4c, pp. 263–265; Ross, 2003, Example 6.11, pp. 364–366).

(b) Consider λ = 1, µ = 2 and t = 10 and compare P(t) to its approximations
(I + R t/n)^n and [(I − R t/n)^{−1}]^n, where n = 2^10. •
Solving Kolmogorov’s forward differential equations is also possible in certain
cases, namely for pure birth processes, as shown by Proposition 4.29 and Exercise 4.31.
Moreover, Kolmogorov’s forward differential equations are in fact differential-difference
equations; they can always be solved, at least in principle, by recurrence, that is,
successive substitution (Cooper, 1981, p. 16).
Proposition 4.29 — Solving Kolmogorov’s forward equations for pure birth
processes (Ross, 1989, Proposition 4.1, p. 266)
Let {X(t) : t ≥ 0} be a pure birth process with rates λi, i ∈ N0. Then the entries of the TPM can be obtained recursively:

Pii(t) = e^{−λi t}, i ∈ N0; (4.36)
Pij(t) = λj−1 × e^{−λj t} × ∫_0^t e^{λj s} Pi,j−1(s) ds, i ∈ N0, j = i + 1, i + 2, . . . ; (4.37)

and Pij(t) = 0, for j = 0, 1, . . . , i − 1. •
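The recursion (4.36)–(4.37) also lends itself to numerical evaluation. The sketch below (my own, not from the sources) approximates the integral in (4.37) by cumulative trapezoidal quadrature, and can be checked against the pure birth process with constant rates — the Poisson process — for which P0j(t) = e^{−λt}(λt)^j/j!:

```python
import numpy as np

def pure_birth_P(rates, i, j, t, n_grid=4001):
    """P_{ij}(t) of a pure birth process via the recursion (4.36)-(4.37),
    with the integral in (4.37) evaluated by cumulative trapezoidal quadrature."""
    s = np.linspace(0.0, t, n_grid)
    Pk = np.exp(-rates[i] * s)                    # (4.36): P_{ii}(s) = e^{-lambda_i s}
    for k in range(i + 1, j + 1):
        integrand = np.exp(rates[k] * s) * Pk     # e^{lambda_k s} P_{i,k-1}(s)
        cum = np.concatenate(([0.0], np.cumsum(
            (integrand[1:] + integrand[:-1]) * np.diff(s) / 2.0)))
        Pk = rates[k - 1] * np.exp(-rates[k] * s) * cum   # (4.37)
    return Pk[-1]
```

For instance, with rates λk ≡ 2, i = 0 and t = 1.5, the result should be close to the Poisson probability e^{−3} 3^3/3!.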
Exercise 4.30 — Solving Kolmogorov’s forward equations for pure birth
processes
Prove Proposition 4.29 (Ross, 1989, p. 266). •
Exercise 4.31 — Solving Kolmogorov’s forward equations for a Yule process
The Yule process is a pure birth process having rates λj = jλ, j ∈ N0.
(a) Use Proposition 4.29 to prove that, for fixed i ∈ N,
Pij(t) = (j−1 choose i−1) × (e^{−λt})^i × (1 − e^{−λt})^{j−i}, j = i, i + 1, . . .
(Ross, 1989, pp. 266–267).
(b) Give a probabilistic interpretation to the result (Ross, 1983, pp. 144–145). •
Kolmogorov’s forward differential equations are also easy to derive and handle
when we are dealing with pure death processes, such as the ones in exercises 4.32 and
4.33.
Exercise 4.32 — Verifying Kolmogorov’s forward differential equations
Admit the size of a population at time t, X(t), can be described by a pure death
process with rates µk = kµ, k = 0, 1, . . . , n, where n (n ∈ N) represents the initial
number of individuals.
(a) Write Kolmogorov’s forward differential equations in terms of Pk(t) ≡ Pnk(t) =
P [X(t) = k | X(0) = n].
(b) Show that

Pk(t) = (n choose k) × (e^{−µt})^k × (1 − e^{−µt})^{n−k}, k = 0, 1, . . . , n,

verifies Kolmogorov’s forward equations written in (a). •
Exercise 4.33 — Kolmogorov’s forward differential equations for a pure death
process
There are n0 (n0 ∈ N) seals in an isolated cove; they are all sick and have to be captured
and taken from the cove to be treated.
Let X(t) be the number of (uncaptured) seals in the isolated cove at time t and admit
that {X(t) : t ≥ 0} is a pure death process with rates µk = kµ, k ∈ {0, 1, . . . , n0}.
(a) Derive Kolmogorov’s forward equations in terms of Pk(t) ≡ Pn0,k(t) = P [X(t) = k |
X(0) = n0].
(b) Argue that the solution to these equations is
Pk(t) = P[X(t) = k | X(0) = n0] = (n0 choose k) × (pt)^k × (1 − pt)^{n0−k},
and identify pt.
(c) Compute E[X(t) | X(0) = n0].
(d) Let Tc be the time needed to capture all the seals. Derive the p.d.f. of Tc. •
The p.g.f. method, also called z − transform method, is frequently used to reduce the
Kolmogorov’s forward differential equations to a single partial differential equation,
whose solution can be derived for some birth and death processes.
Let:

• {X(t) : t ≥ 0} be a birth and death process such that X(0) = i (where i ≠ 0);

• Pj(t) ≡ P[X(t) = j | X(0) = i] be the p.f. of the r.v. (X(t) | X(0) = i);

• P(z, t) = E[z^{X(t)} | X(0) = i], |z| ≤ 1, be the p.g.f. of (X(t) | X(0) = i).

Then multiplying the jth Kolmogorov’s forward differential equation in (4.35) by z^j and summing up in j (Kulkarni, 1995, p. 279), we get a single equation:

∑_{j∈S} z^j × dPj(t)/dt = ∑_{j∈S} z^j × [Pj−1(t) λj−1 + Pj+1(t) µj+1 − Pj(t) (λj + µj)]. (4.38)

By noting that

∑_{j∈S} z^j × dPj(t)/dt = ∂P(z, t)/∂t (4.39)

and that, depending on the birth and death rates, the right-hand side of (4.38) can be written in terms of P(z, t) and

∂P(z, t)/∂z = ∑_{j∈S} j z^{j−1} × Pj(t) = ∑_{j∈S} (j + 1) z^j × Pj+1(t), (4.40)

(4.38) is nothing but a (first-order) partial differential equation whose solution is the p.g.f. of the r.v. (X(t) | X(0) = i).
Exercise 4.34 — Solving Kolmogorov’s forward equations via the p.g.f. method
(Kleinrock, 1975, Exercise 2.10(a)–(d), p. 81)
Admit X(t) : t ≥ 0 is a Yule process — i.e., a pure birth process with birth rates
λj = jλ, for j ∈ N0 — with X(0) = 1.
(a) Derive Kolmogorov’s forward equations in terms of Pj(t) ≡ P1j(t) = P [X(t) = j |
X(0) = 1].
(b) After having rewritten the Kolmogorov’s forward equations derived in (a) as a partial differential equation in terms of the p.g.f. of the r.v. (X(t) | X(0) = 1), P(z, t) = E[z^{X(t)} | X(0) = 1], verify that

P(z, t) = z e^{−λt} / [1 − (1 − e^{−λt}) × z], |z| ≤ 1,

satisfies that partial differential equation (Cooper, 1981, Exercise 6 a)–b), p. 34).
(c) Identify the distribution of (X(t) | X(0) = 1) and compute E[X(t) | X(0) = 1]. •
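For the Yule process started at X(0) = 1, the procedure above yields the first-order PDE ∂P(z, t)/∂t = λ z(z − 1) ∂P(z, t)/∂z. A numerical sanity check of part (b) can be sketched with central finite differences (the values of λ, z and t below are arbitrary assumptions of mine):

```python
import math

lam = 0.7   # assumed birth-rate parameter

def P(z, t):
    """Candidate p.g.f. of the Yule process started at X(0) = 1."""
    a = math.exp(-lam * t)
    return z * a / (1.0 - (1.0 - a) * z)

# central finite differences at an arbitrary interior point (|z| < 1, t > 0)
z0, t0, h = 0.4, 1.3, 1e-5
dP_dt = (P(z0, t0 + h) - P(z0, t0 - h)) / (2 * h)
dP_dz = (P(z0 + h, t0) - P(z0 - h, t0)) / (2 * h)
residual = dP_dt - lam * z0 * (z0 - 1.0) * dP_dz   # should be ~0
```

A near-zero residual at several points (z, t) is of course not a proof, but it is a quick way to catch algebra mistakes before attempting the analytical verification.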
Exercise 4.35 — Solving Kolmogorov’s forward equations via the p.g.f. method
(bis) (Kleinrock, 1975, Exercise 2.12, p. 82)
Let:
• X(t) : t ≥ 0 be a birth and death process with X(0) = 0 and rates λj = λ, j ∈ N0
and µj = jµ, j ∈ N;
• Pj(t) ≡ P0j(t) be the p.f. of the r.v. (X(t) | X(0) = 0).
(a) Derive Kolmogorov’s forward equations in terms of Pj(t).
(b) After having rewritten the Kolmogorov’s forward equations derived in (a) as a partial differential equation in terms of the p.g.f. of (X(t) | X(0) = 0), verify that

P(z, t) = exp[−λ × (1 − e^{−µt}) × (1 − z) / µ], |z| ≤ 1,

satisfies that partial differential equation (Cooper, 1981, pp. 32–33).
(c) Rewrite P (z, t) as a power series to identify Pj(t) and calculate limt→+∞ Pj(t) (Cooper,
1981, p. 33). •
Exercise 4.36 — Solving Kolmogorov’s forward equations via the p.g.f. method
(bis, bis) (Kleinrock, 1975, Exercise 2.14, pp. 82–83)
Let:
• X(t) : t ≥ 0 a birth and death process with X(0) = 1 and rates λj = jλ, j ∈ N0
and µj = jµ, j ∈ N;
• Pj(t) ≡ P1j(t) be the p.f. of the r.v. (X(t) | X(0) = 1).
(a) Derive Kolmogorov’s forward equations in terms of Pj(t) and a partial differential
equation satisfied by the p.g.f. of (X(t) | X(0) = 1) (Kulkarni, 1995, Example 6.24,
pp. 278–279).
(b) Verify that

P(z, t) = ( µ [1 − e^{(λ−µ)t}] − [λ − µ e^{(λ−µ)t}] z ) / ( µ − λ e^{(λ−µ)t} − λ [1 − e^{(λ−µ)t}] z )

satisfies the partial differential equation derived in (a).
(c) Calculate the expected value and the variance of (X(t) | X(0) = 1).
(d) After having rewritten P(z, t) as a power series, show that

Pj(t) = α(t), j = 0;  [1 − α(t)] × [1 − β(t)] × [β(t)]^{j−1}, j ∈ N,

and obtain expressions for α(t) and β(t) (Kulkarni, 1995, Example 6.24, p. 281).
(e) Find the extinction probability, limt→+∞ P0(t). •
4.6 Classification of states
The concepts of accessibility, communication, irreducibility, transience and recurrence for CTMC can be defined along the same lines as for DTMC. Consequently, these concepts are only briefly discussed here.
Definition 4.37 — CTMC and accessibility, communication, irreducibility,
transience and recurrence (Kulkarni, 1995, definitions 6.2–6.8, pp. 283–285)
Let:
• {X(t) : t ≥ 0} be a CTMC with state space S, TPM P(t) and initial state i;

• S1 be the time of the first jump of this stochastic process;

• Tj = inf{t ≥ S1 : X(t) = j} be the first time the CTMC enters state j ∈ S;

• Ti = inf{t ≥ S1 : X(t) = i} be the first time the CTMC returns to state i ∈ S;
• fij = P [Tj < +∞ | X(0) = i] be the probability that the first visit to state j (resp.
the first return to the initial state i if j = i) occurs in finite time;
• µij = E[Tj | X(0) = i] be the expected time until the first visit to state j (resp. the
first return to the initial state i if j = i).
Then, for i, j ∈ S:
• state j is said to be accessible from state i, i.e., i→ j, if Pij(t) > 0 for some t ≥ 0;
• states i and j are said to communicate, i.e., i↔ j, if i→ j and j → i;11
• a set of states C ⊂ S is said to be a communicating class if
(i) i, j ∈ C ⇒ i↔ j
(ii) i ∈ C, i↔ j ⇒ j ∈ C;
11Two states that communicate are obviously said to be in the same class.
• A communicating class C ⊂ S is said to be closed if i ∈ C, j 6∈ C ⇒ i 6→ j.
• the CTMC is said to be irreducible if its state space S is a single closed
communicating class, i.e., if all states communicate with each other; otherwise,
the CTMC is called reducible;
• state i is said to be recurrent if fii = 1;
• state i is called transient if fii < 1;
• a recurrent state i is said to be
(i) positive recurrent if µii < +∞
(ii) null recurrent if µii = +∞. •
Remark 4.38 — Periodicity (Kulkarni, 1995, p. 287)
Tj is a continuous r.v. and, thus, if state j is accessible from state i (i → j) then it is
possible to visit j at any time t > 0 starting from i.12 Consequently, the notion of period
of a state of a CTMC does not exist. •
Since CTMC can be alternatively described in terms of holding times and an embedded
DTMC, can accessibility, communication, irreducibility, transience and recurrence be
defined in terms of such DTMC?
Yes!
This is indeed possible if we are dealing with what is called a regular CTMC, i.e., with
no instantaneous states.
Definition 4.39 — Regular CTMC (Ross, 1983, p. 142)
A CTMC is said to be regular if, with probability one, the number of transitions in any time interval of finite length is finite; this holds, in particular, when sup_{i∈S} νi < +∞. •
12That is, if ∃s > 0 : Pij(s) > 0 then Pij(t) > 0,∀t > 0.
Proposition 4.40 — Accessibility, communication, irreducibility, transience
and recurrence redefined for CTMC (Kulkarni, 1995, theorems 6.8 and 6.9, pp.
284–285)
Let:

• {X(t) : t ≥ 0} be a regular CTMC with state space S and TPM P(t) = [Pij(t)]i,j∈S;

• R = [rij]i,j∈S be the associated rate matrix (or infinitesimal generator), where rij = qij = νi × Pij (i ≠ j) and rij = −νi (i = j);

• {Xn : n ∈ N0} be the embedded DTMC with TPM P = [Pij]i,j∈S, where13

Pij = qij/νi, if νi ≠ 0, i ≠ j;  0, if νi ≠ 0, i = j;  0, if νi = 0, i ≠ j;  1, if νi = 0, i = j. (4.41)
Then

{X(t) : t ≥ 0}                            {Xn : n ∈ N0}
i → j                          ⇔          i → j
i ↔ j                          ⇔          i ↔ j
C is a communicating class     ⇔          C is a communicating class
MC is irreducible              ⇔          MC is irreducible
i is recurrent                 ⇔          i is recurrent
i is transient                 ⇔          i is transient

•
Remark 4.41 — Transience and recurrence redefined for CTMC (Kulkarni, 1995,
p. 285)
Immediate consequences of Proposition 4.40:
• recurrence and transience are class properties;
13This is because the quantities are undefined when νi = 0 (Kulkarni, 1995, p. 284).
• the criteria to test recurrence and transience of DTMC (see Proposition 3.43) can
be used to establish the recurrence and transience of the embedded DTMC and
therefore of the CTMC. •
Needless to say, positive and null recurrence cannot be defined in terms of that embedded DTMC because those two concepts rely on the holding times. However, the next proposition establishes a criterion for positive (resp. null) recurrence somewhat analogous to a result on the positive recurrence of DTMC (see Remark 3.62).
Proposition 4.42 — Criterion for positive (resp. null) recurrence (Kulkarni,
1995, Theorem 6.10, p. 285)
Let:
• {X(t) : t ≥ 0} be an irreducible and recurrent CTMC with state space S;

• {Xn : n ∈ N0} be the recurrent embedded DTMC with TPM P = [Pij]i,j∈S;

• π be a positive solution to π = π × P.

Then the CTMC is positive (resp. null) recurrent iff ∑_{i∈S} πi/νi < +∞ (resp. ∑_{i∈S} πi/νi = +∞). •
Proposition 4.42 also proves that positive and null recurrence are class properties in
the CTMC setting (Kulkarni, 1995, p. 286).14
14Please refer to Kulkarni (1995, Example 6.28, pp. 286–287) for a positive recurrent CTMC with null
recurrent embedded DTMC and vice-versa.
4.7 Limit behavior of CTMC
Computing the TPM P(t) for a fixed finite t is not a trivial problem to handle,
algebraically or numerically (Kulkarni, 1995, p. 282). Expectedly, we shift our focus
to the study of the behavior of P(t) as t→ +∞. But can we determine limt→+∞P(t)?
Yes!
What follows provides answers to questions, such as:
• when does Pij(t) have a limit as t→ +∞?
• how to compute limt→+∞ Pij(t)?
(Kulkarni, 1995, p. 282).
Example/Exercise 4.43 — Limit behavior of P(t)
(a) The CTMC described in Exercise 4.28 has TPM equal to

P(t) = e^{−(λ+µ)t} × [ λ/(λ+µ)   −λ/(λ+µ)
                      −µ/(λ+µ)    µ/(λ+µ) ] + [ µ/(λ+µ)   λ/(λ+µ)
                                                µ/(λ+µ)   λ/(λ+µ) ],

thus

lim_{t→+∞} P(t) = [ µ/(λ+µ)   λ/(λ+µ)
                    µ/(λ+µ)   λ/(λ+µ) ],

obviously independent of the initial state of the CTMC (Kulkarni, 1995, Example 6.25, p. 282).
(b) Consider the CTMC described in Kulkarni (1995, Example 6.13, pp. 261–262), with five states and the following rate matrix

[ −λ1      0        λ1           0            0
   0      −λ2       0            λ2           0
   0       µ1      −(µ1 + λ2)    0            λ2
   µ2      0        0           −(µ2 + λ1)    λ1
   0       0        µ2           µ1          −(µ1 + µ2) ],

where λ1 = 1, λ2 = 2, µ1 = 0.1 and µ2 = 0.15 (Kulkarni, 1995, Example 6.26, p. 283). After having drawn the rate diagram of this CTMC, use the Mathematica function MatrixExp to obtain P(t) and investigate the limit behavior of this TPM.
(c) Consider the CTMC from Exercise 4.36 — now with X(0) = i and λ > µ. It can be shown that

lim_{t→+∞} Pij(t) = (µ/λ)^i, j = 0;  0, j ∈ N,

thus, the limiting probabilities are dependent on the initial state (Kulkarni, 1995, Example 6.27, p. 283). •
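A Python alternative to MatrixExp for part (b) is scipy.linalg.expm; the sketch below (my own — the time points and tolerances are arbitrary choices) computes P(t) and probes its limit behavior:

```python
import numpy as np
from scipy.linalg import expm

# Rate matrix of part (b), with the stated parameter values
l1, l2, m1, m2 = 1.0, 2.0, 0.1, 0.15
R = np.array([
    [-l1,  0.0,  l1,          0.0,         0.0],
    [0.0, -l2,   0.0,         l2,          0.0],
    [0.0,  m1, -(m1 + l2),    0.0,         l2],
    [m2,   0.0,  0.0,       -(m2 + l1),    l1],
    [0.0,  0.0,  m2,          m1,        -(m1 + m2)],
])

P10 = expm(R * 10.0)     # TPM at t = 10
P_big = expm(R * 1e4)    # TPM at a very large t, to probe the limit
```

At large t the rows of P(t) become (numerically) identical, illustrating that for this irreducible finite-state CTMC the limiting distribution does not depend on the initial state.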
After this example/exercise, we proceed with results concerning the limit behavior of
the TPM P(t) of a general CTMC.
Proposition 4.44 — Limit behavior of P(t) (Kulkarni, 1995, theorems 6.11–6.12 and
Corollary 6.3, pp. 287–288)
Let {X(t) : t ≥ 0} be a CTMC. Then:

• lim_{t→+∞} Pjj(t) = 1/(νj × µjj), where 1/µjj is taken to be 0 if µjj = +∞;

• lim_{t→+∞} Pij(t) = fij/(νj × µjj) if νj > 0, and fij if νj = 0, where, once again, 1/µjj is taken to be 0 if µjj = +∞;

• if j is a transient or null recurrent state of the CTMC then lim_{t→+∞} Pij(t) = 0, for all i ∈ S. •
Now, we turn our attention to the limit behavior of positive recurrent (i.e., ergodic),
irreducible CTMC. Unsurprisingly, it depends on the stationary distribution of the
embedded DTMC.
Theorem 4.45 — Limiting behavior of irreducible, positive recurrent CTMC
(Kulkarni, 1995, Theorem 6.13, p. 288; Ross, 1983, p. 152)
Let:

• {X(t) : t ≥ 0} be an irreducible, positive recurrent CTMC;

• {Xn : n ∈ N0} be the embedded DTMC;

• π = [πj]j∈S be the unique stationary distribution of the embedded DTMC.15

Then the limiting probabilities

Pj = lim_{t→+∞} Pij(t) (4.42)

are given by

Pj = (πj/νj) / [∑_{k∈S} πk/νk], j ∈ S. (4.43)
•
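Formula (4.43) can be illustrated on the two-state machine of Exercise 4.28 (the values λ = 1, µ = 2 are assumed for illustration): the embedded DTMC alternates deterministically between the two states, so π = (1/2, 1/2), and weighting by the mean holding times 1/νj recovers the limiting distribution (µ/(λ+µ), λ/(λ+µ)):

```python
import numpy as np

lam, mu = 1.0, 2.0
nu = np.array([lam, mu])     # holding-time rates: nu_0 = lambda, nu_1 = mu
pi = np.array([0.5, 0.5])    # stationary distribution of the embedded DTMC

weights = pi / nu            # pi_j / nu_j
P_limit = weights / weights.sum()   # formula (4.43)
```

Note how the state with the longer mean holding time (state 0, mean 1/λ = 1) ends up with the larger limiting probability, even though the embedded DTMC visits both states equally often.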
Remark 4.46 — Limiting behavior of irreducible, positive recurrent CTMC
(Ross, 1983, p. 152)
• Pj also equals the long-run proportion of time the CTMC is in state j.
• If the initial state is chosen according to the limiting probabilities {Pj : j ∈ S}, then P[X(t) = j] = ∑_{i∈S} Pi × Pij(t) = Pj, for all t, i.e., the resultant CTMC is stationary.16 •
The next theorem gives one method of computing the limiting distribution of X(t) in
terms of the rate matrix.
15 I.e., πj = ∑_{i∈S} πi Pij, j ∈ S, and ∑_{j∈S} πj = 1; in other words, π = πP.
16 In fact, P[X(t) = j] = ∑_{i∈S} Pi × Pij(t) = ∑_{i∈S} [lim_{s→+∞} Pki(s)] × Pij(t) = lim_{s→+∞} ∑_{i∈S} Pki(s) × Pij(t) = lim_{s→+∞} Pkj(s + t) = Pj.
Theorem 4.47 — Limiting distribution of an irreducible, positive recurrent
CTMC in terms of its rate matrix (Kulkarni, 1995, Theorem 6.11, p. 289; Ross,
1983, p. 152)
Let {X(t) : t ≥ 0} be an irreducible, positive recurrent CTMC with rate matrix R. Then the limiting distribution, represented by the row vector P = [Pj]j∈S, is given by the unique nonnegative solution to

P × R = 0,  ∑_{j∈S} Pj = 1. (4.44)
•
Remark 4.48 — Limiting distribution of an irreducible, positive recurrent
CTMC in terms of its rate matrix
• P × R = 0 can be written as

Pj × νj = ∑_{i∈S} Pi × qij, j ∈ S (4.45)

(Ross, 1983, p. 152), where qii = 0.
• Pj × νj = rate at which the process leaves state j,
because Pj is the proportion of time the process is in state j and when it is in state
j it leaves at rate νj (Ross, 1983, p. 153).
• ∑_{i∈S} Pi × qij = rate at which the process enters state j,

because Pi is the proportion of time the process is in state i and when it is in state i it departs to state j at rate qij (Ross, 1983, p. 153).
• Since equations (4.45) can be thought of as a statement of the equality of the rates at which the process leaves and enters state j, they are sometimes referred to as balance equations (Ross, 1983, p. 153).
• An irreducible CTMC is positive recurrent iff there is a solution to the system of
equations (4.44).17 Hence, like in the DTMC setting, by solving these equations, we
are automatically guaranteed positive recurrence of the CTMC. •
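System (4.44) is readily solved numerically. A common sketch (my own, with assumed rates) overwrites one of the redundant balance equations with the normalization condition:

```python
import numpy as np

def limiting_distribution(R):
    """Solve P x R = 0 together with sum_j P_j = 1 (system (4.44)) by
    overwriting one redundant balance equation with the normalization condition."""
    n = R.shape[0]
    A = R.T.copy()           # balance equations read R^T P^T = 0
    A[-1, :] = 1.0           # replace the last equation by sum_j P_j = 1
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# two-state machine with assumed values lam = 1, mu = 2
lam, mu = 1.0, 2.0
P = limiting_distribution(np.array([[-lam, lam], [mu, -mu]]))
```

For an irreducible CTMC any single balance equation is redundant (the rank of R is one less than its dimension), which is why one of them can be replaced by the normalization condition without losing information.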
Exercise 4.49 — Limiting distribution of an irreducible, positive recurrent
CTMC in terms of its rate matrix
Derive and solve the balance equations of the CTMC with the following rate matrices:
(a)

[ −λ    λ
   µ   −µ ]

(Kulkarni, 1995, Example 6.29, p. 290);
(b)

[ −λ1      0        λ1           0            0
   0      −λ2       0            λ2           0
   0       µ1      −(µ1 + λ2)    0            λ2
   µ2      0        0           −(µ2 + λ1)    λ1
   0       0        µ2           µ1          −(µ1 + µ2) ],

where λ1 = 1, λ2 = 2, µ1 = 0.1 and µ2 = 0.15 (Kulkarni, 1995, Example 6.30, p. 291).18 •
Let us now determine the limiting probabilities for a birth and death process (Ross,
1983, p. 153), with rates λn, n ∈ N0, and µn, n ∈ N. These are obtained by equating the
rate at which the process leaves a state with the rate at which it enters that state,19 as
follows:
State      Rate at which process leaves state = rate at which process enters state
0          P0 λ0 = P1 µ1
n ∈ N      Pn (λn + µn) = Pn−1 λn−1 + Pn+1 µn+1
and then rewriting and solving these equations in terms of P0 we get the limiting
probabilities in the following proposition.
17 See Kulkarni (1995, Theorem 6.15, p. 290).
18 Try not to solve (b) by hand...
19 This is the result of taking limits as t → +∞ throughout Kolmogorov’s forward equations (4.34)–(4.35), setting lim_{t→+∞} dPij(t)/dt = lim_{t→+∞} dPj(t)/dt = 0 (because if dPij(t)/dt converges then it must converge to 0) and lim_{t→+∞} Pij(t) = Pj, and normalizing so that ∑_{j∈S} Pj = 1 (Cooper, 1981, p. 21).
Proposition 4.50 — Limiting probabilities for a birth and death process (Ross,
1983, p. 154)
Let X(t) : t ≥ 0 be a birth and death process with rates λn, n ∈ N0, and µn, n ∈ N.
Then

P0 = 1 / [1 + ∑_{n=1}^{+∞} (λ0 λ1 . . . λn−1)/(µ1 µ2 . . . µn)] (4.46)

Pj = (λj−1/µj) × Pj−1 = P0 × (λ0 λ1 . . . λj−1)/(µ1 µ2 . . . µj), j ∈ N. (4.47)
•
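Equations (4.46)–(4.47) translate directly into code. The sketch below (my own) truncates the state space at n_max, which is adequate whenever the neglected tail mass is negligible — an assumption the user must check:

```python
def bd_limiting_probs(lam, mu, n_max):
    """Limiting probabilities (4.46)-(4.47) of an ergodic birth and death process,
    truncated at state n_max.  lam(n): birth rate in state n; mu(n): death rate."""
    rho = [1.0]                          # rho[j] = lam_0...lam_{j-1} / mu_1...mu_j
    for j in range(1, n_max + 1):
        rho.append(rho[-1] * lam(j - 1) / mu(j))
    total = sum(rho)                     # 1 + the series appearing in (4.46)
    return [r / total for r in rho]

# constant rates lam = 1, mu = 2 (assumed), for which P_j = (1/2)^{j+1}
probs = bd_limiting_probs(lambda n: 1.0, lambda n: 2.0, 400)
```

Passing the rates as functions of the state makes the same sketch reusable for the state-dependent rate patterns of Exercise 4.56.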
For an account on the limiting behavior of reducible CTMC, the reader should refer
to Kulkarni (1995, pp. 296–299).
Exercise 4.51 — Limiting probabilities for a birth and death process
Prove Proposition 4.50 (Ross, 1983, p. 154). •
Exercise 4.52 — Limiting probabilities for a birth and death process
A taxi company has one mechanic who replaces fuel pumps when they fail. Assume:
• the waiting time in days until a fuel pump fails is exponentially distributed with
parameter 1300
;
• the company has 1000 cars;
• the repair time for each car is exponentially distributed with expected repair time
of 14
days.
Find the long-run distribution for X(t), the number of cars with a broken fuel pump
at time t, by considering X(t) : t ≥ 0 a process where a birth corresponds to a broken
fuel pump and a death corresponds to a repaired fuel pump (Isaacson and Madsen, 1976,
Example VII.3.5, p. 246).20 •20Note that the rates are given by λn = 1000−n
300 and µn+1 = 4, for n = 0, 1, . . . , 1000.
Remark 4.53 — Existence of limiting probabilities for a birth and death
process
• Equation (4.46) shows us what condition is needed for the limiting probabilities of a birth and death process to exist:

∑_{n=1}^{+∞} (λ0 λ1 . . . λn−1)/(µ1 µ2 . . . µn) < +∞ (4.48)

(Ross, 1983, p. 154); we are simply requiring that P0 > 0 (Kleinrock, 1975, p. 93).
• We should also note that the condition for the existence of limiting probabilities of a birth and death process is met whenever the sequence {λk/µk : k ∈ N} remains below one from some k onwards, i.e., if

∃k0 : λk/µk < 1, ∀k ≥ k0 (4.49)

(Kleinrock, 1975, p. 94).
Simply stated, in order for those expressions to represent a probability distribution
we have to place a condition on the birth and death rates that essentially says that
the system occasionally empties (Kleinrock, 1975, p. 93). •
Remark 4.54 — Classification of states of a birth and death process (Kleinrock,
1975, pp. 93–94)
Let

S1 = ∑_{n=1}^{+∞} (λ0 λ1 . . . λn−1)/(µ1 µ2 . . . µn) (4.50)

S2 = ∑_{n=1}^{+∞} (µ1 µ2 . . . µn)/(λ0 λ1 . . . λn). (4.51)
Then, all states will be:
• positive recurrent (i.e., ergodic) iff S1 < +∞ and S2 = +∞;
• null recurrent iff S1 = +∞ and S2 = +∞;
• transient iff S1 = +∞ and S2 < +∞.
It is the ergodic case that gives rise to the equilibrium/limiting probabilities and that is
of most interest to our studies. •
Exercise 4.55 — (Existence of) limiting probabilities for a birth and death
process (Ross, 1983, Exercise 5.13, p. 179)
The size of a biological population is assumed to be modeled as a birth and death process — for
which immigration is not allowed when the population size is N or larger — with rates
λk = kλ + θ, k = 0, 1, . . . , N − 1;  kλ, k = N, N + 1, . . .

and µk = kµ, k ∈ N.
Determine the proportion of time that immigration is restricted, in case N = 3,
λ = θ = 1 and µ = 2. •
Exercise 4.56 — (Existence of) limiting probabilities for a birth and death
process (bis)
After having established conditions that guarantee the existence of limiting probabilities,
obtain (in case it is possible) those probabilities for the birth and death processes with
the following birth and death rates λk, k ∈ N0, and µk, k ∈ N:
(a) λk ≡ λ and µk ≡ µ;
(b) λk ≡ λ and µk = kµ;
(c) λk ≡ λ and µk = kµ, k = 1, 2, . . . , c;  cµ, k = c + 1, c + 2, . . ., where c ∈ N;
(d) λk = kλ and µk ≡ µ;
(e) λk = αkλ and µk ≡ µ, with 0 < α < 1
(Kleinrock and Gail, 1996, p. 71);
(f) λk = (M − k)λ, k = 0, 1, 2, . . . , M;  0, k = M + 1, M + 2, . . ., and µk ≡ µ

(Ross, 1983, Example 5.5(b), p. 155). •
Interestingly enough, some of the birth and death processes described in Exercise 4.56
are in fact related to queueing systems we shall study in Section 4.8.
4.8 Birth and death queueing systems in equilibrium
This section is devoted to a class of models in which customers arrive in some random
manner at a service facility. Upon arrival they are made to wait in queue21 until it is their
turn to be served. Once served they are generally assumed to leave the system (Ross,
2003, p. 475). Such models are usually called queueing systems.
Queueing theory started with research by the Danish mathematician, statistician and
engineer Agner Krarup Erlang (1878–1929), when he created models to describe the
Copenhagen telephone exchange; the ideas have since seen applications in areas such as telecommunications, traffic engineering, computing and the design of factories,
shops, offices and hospitals (https://en.wikipedia.org/wiki/Queueing theory).
In this section, we narrow the class of queueing systems to the ones that can be modeled as birth and death processes — also called birth and death queues. Recall that these systems enjoy a most convenient property: the times between consecutive arrivals and the service times are all independent and exponentially distributed r.v. (Kleinrock, 1975, p. 89).
We are going to describe these queueing systems using Kendall’s notation
(https://en.wikipedia.org/wiki/Kendall’s notation) in the form A/S/c, where:
• A describes the time between consecutive arrivals to the queue;
21The word queue comes, via French, from the Latin cauda, meaning tail
(https://en.wikipedia.org/wiki/Queueing theory#Etymology).
• S refers to the service time distribution;
• c represents the number of (identical) service channels or servers.
For instance, when we write M/M/1:
• the first M stands for a Poisson arrival process (that is, for a Markovian arrival
process);
• the second M refers to exponentially distributed service times (that is, to Markovian
service times);
• 1 means we are dealing with a single server queue.
We shall also assume that customers are served according to a first-come, first-served
(FCFS) service policy, whereby the requests of customers are attended to in the order
that they arrived, without other biases or preferences (http://en.wikipedia.org/wiki/First-
come, first-served).
Finally, since the study of the transient behavior of queueing systems is far from trivial, we focus on their equilibrium behavior, namely on the limiting probabilities of the number of customers an arriving customer sees in the system.
4.8.1 Performance measures
For birth and death queueing models, we will be interested in determining the following
performance measures in the long-run or equilibrium:
• Ls, the number of customers in the system — an arriving customer sees;
• Lq, the number of customers in the queue (waiting to be served) — an arriving
customer sees;
• Ws, the time an arriving customer will spend in the system;
• Wq, the time an arriving customer will spend in the queue waiting to be served.22
22 Note that Ws =st Wq + service time.
Ws and Wq influence customer satisfaction, whereas Ls and Lq are particularly important
performance measures to resource management (Pacheco, 2002, p. 76).
The distributions of these four r.v. depend on the following parameters:
• λ, the arrival rate;
• µ, the service rate;
• c, the number of (identical) servers in parallel;
• a = λ/µ, the (offered) load;23

• Pb, the blocking probability;24

• λe = λ × (1 − Pb), the input rate;25

• ρ = λ/(c µ), the traffic intensity;26

• ρe = λe/(c µ) = ρ × (1 − Pb), the carried traffic intensity;

• Pi, the long-run fraction of time in state i.
Remark 4.57 — PASTA (Poisson Arrivals See Time Averages) (Pacheco, 2002,
p. 76)
Birth and death queueing systems possess the PASTA (Poisson Arrivals See Time
Averages) property, i.e., the long-run fraction of customers that find at arrival i customers
in the system coincides with the fraction of time the system spends in state i. •

23 It corresponds to the expected amount of time a (single) server would take to serve all customers that in the long-run arrive to the system during one unit of time, including blocked customers (Pacheco, 2002, p. 74).
24 It is the long-run fraction of customers that are blocked (Pacheco, 2002, p. 75) upon arrival and unable to enter the system.
25 It is the effective arrival rate, which corresponds to the rate at which customers enter the system; thus, we are excluding blocked customers (Pacheco, 2002, p. 75).
26 Or utilization factor (Kleinrock, 1975, p. 98). It is a (relative) measure of congestion and represents the load offered to each server if the work is divided equally among servers (Pacheco, 2002, p. 75); it should be strictly less than one for the system to function well (https://en.wikipedia.org/wiki/Queueing theory#Utilization).
In addition, relationships between these four performance measures can be obtained
by using all these parameters and capitalizing on the following result.
Theorem 4.58 — Little’s law (http://en.wikipedia.org/wiki/Little’s law; Ross, 2003,
p. 478)
The long-term average number of customers in a stable system, L, is equal to the long-
term average effective arrival rate, λe, multiplied by the average time a customer spends
in the system, W — expressed algebraically:
L = λeW. (4.52)
•
Remark 4.59 — Little’s law
• Consequently:
E(Ls) = λeE(Ws); (4.53)
E(Lq) = λeE(Wq). (4.54)
• Although Little’s law looks intuitively reasonable, it is a quite remarkable result
(http://en.wikipedia.org/wiki/Little’s law), as it is valid regardless of the
– arrival process distribution;
– service distribution;
– number of servers;
– service policy (as long as it is not biased);
– etc. •
4.8.2 M/M/1, the classical queueing system
The celebrated M/M/1 queue is the simplest nontrivial queueing system
(Kleinrock, 1975, p. 94).
An M/M/1 queue may be described by a birth and death process with rates:
λk = λ, k ∈ N0 (4.55)
µk = µ, k ∈ N (4.56)
(Kleinrock, 1975, p. 94).
Furthermore, the necessary and sufficient condition for ergodicity of the M/M/1 system is simply written in terms of the traffic intensity:27

ρ = λ/µ < 1 (4.57)

(Kleinrock, 1975, p. 95).
Needless to say, the next results, referring to Ls, Lq, Ws and Wq, are stated assuming that ρ < 1.
Proposition 4.60 — M/M/1: distribution of Ls (Kleinrock, 1975, p. 96)
The steady-state probability of finding k customers in the M/M/1 system only depends
on λ and µ through their ratio ρ and is given by:
P (Ls = k) = ρk (1− ρ), k ∈ N0, (4.58)
i.e., Ls ∼ Geometric∗(1− ρ). •
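As a sanity check, the geometric p.f. of Proposition 4.60 satisfies the birth and death balance equations of the M/M/1 system. The snippet below verifies this for the assumed values λ = 1, µ = 2 (so ρ = 1/2):

```python
lam, mu = 1.0, 2.0
rho = lam / mu
P = [rho**k * (1 - rho) for k in range(50)]   # candidate limiting probabilities

# balance equations: rate out of a state = rate into that state
res0 = P[0] * lam - P[1] * mu
res = [P[n] * (lam + mu) - (P[n - 1] * lam + P[n + 1] * mu)
       for n in range(1, 49)]
```

All residuals vanish, which is the numerical counterpart of the proof asked for in Exercise 4.61.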
Exercise 4.61 — M/M/1: distribution of Ls
Prove Proposition 4.60 (Kleinrock, 1975, pp. 95–96). •

27 Note that in this case ρ = ρe because the M/M/1 system has a waiting area with infinite capacity.
Exercise 4.62 — M/M/1: characteristics of Ls
Consider an M/M/1 queueing system.
(a) Plot the p.f. of Ls for ρ = 1/2 (Kleinrock, 1975, Figure 3.2, p. 97).
(b) Obtain the expected value and the variance of Ls as a function of ρ.
(c) Plot E(Ls) (Kleinrock, 1975, Figure 3.3, p. 97) to show that this performance measure
grows in an unbounded fashion with ρ (Kleinrock, 1975, p. 98).
(d) Show that Ls stochastically increases with the arrival rate, λ, and with the expected service time, µ^{−1}.28 •
Proposition 4.63 — M/M/1: distribution of Lq
The equilibrium probability of finding k customers waiting to be served in the M/M/1 system equals:

P(Lq = k) = 1 − ρ^2, k = 0;  ρ^{k+1} (1 − ρ), k ∈ N. (4.59)

•
Exercise 4.64 — M/M/1: distribution of Lq
Prove Proposition 4.63. •
Proposition 4.65 — M/M/1: distribution of Ws
Since the service times are memoryless in the M/M/1 queueing system, we get:29

(Ws | Ls = k) ∼ Gamma(k + 1, µ), k ∈ N0; (4.60)
Ws ∼ Exponential(µ(1 − ρ)). (4.61)

•

28 A r.v. X, whose distribution depends on the parameter θ, is said to stochastically increase with θ if Pθ(X > x) is an increasing function of θ, for all −∞ < x < +∞.
29 Given that upon arrival a customer finds k customers in the M/M/1 system, he/she will leave this system after the completion of 1 + (k − 1) + 1 = k + 1 services: the service that had already started when the customer arrived; those of the k − 1 customers waiting to be served when the customer arrived; and his/her own service.
Exercise 4.66 — M/M/1: distribution of Ws
Prove Proposition 4.65.30 •
Proposition 4.67 — M/M/1: distribution of Wq
For the M/M/1 queueing system, Wq is a mixed r.v. with the following characteristics:

(Wq | Ls = 0) =st 0;
(Wq | Ls = k) ∼ Gamma(k, µ), k ∈ N; (4.62)
(Wq | Wq > 0) ∼ Exponential(µ(1 − ρ)); (4.63)

FWq(t) = 0, t < 0;  1 − ρ, t = 0;  (1 − ρ) + ρ × FExp(µ(1−ρ))(t), t > 0. (4.64)
•
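The mixed distribution (4.64) can be cross-checked by total probability over the number of customers found at arrival; the sketch below (my own, assuming scipy is available and using the illustrative values λ = 1, µ = 2, t = 1) compares the geometric mixture of Gamma c.d.f.s with the closed form:

```python
import math
from scipy.stats import gamma

lam, mu, t = 1.0, 2.0, 1.0    # assumed example values (rho = 0.5)
rho = lam / mu

# P(Wq <= t) = P(Ls = 0) + sum_{k>=1} P(Ls = k) P(Gamma(k, mu) <= t)
cdf_mix = (1 - rho) + sum(
    rho**k * (1 - rho) * gamma.cdf(t, a=k, scale=1.0 / mu)
    for k in range(1, 400))

# closed form (4.64): F_Wq(t) = (1 - rho) + rho (1 - e^{-mu(1-rho)t}), t > 0
cdf_closed = (1 - rho) + rho * (1 - math.exp(-mu * (1 - rho) * t))
```

The agreement of the truncated mixture with the closed form illustrates why the conditional waiting time (Wq | Wq > 0) turns out to be exponential.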
Exercise 4.68 — M/M/1: distribution of Wq
Prove Proposition 4.67. •
Exercise 4.69 — M/M/1 queueing system
Consider an M/M/1 queueing system and draw the graphs of the following parameters
in terms of ρ:
(a) the limiting probability that the system is empty;
(b) E(Wq);
(c) E(Ws) (Kleinrock, 1975, Figure 3.4, p. 97). •
Exercise 4.70 — M/M/1 queueing system (bis)
Derive V (Ls), V (Lq), V (Ws) and V (Wq). •
The following table condenses the distributions and expected values of the four performance measures of an (ergodic) M/M/1 queue.

30 Apply the total probability law to prove (4.61).
M/M/1

Rates   λk = λ, k ∈ N0;  µk = µ, k ∈ N

Ls      P(Ls = k) = ρ^k (1 − ρ), k ∈ N0
        E(Ls) = ρ/(1 − ρ)

Lq      P(Lq = k) = 1 − ρ^2, k = 0;  ρ^{k+1} (1 − ρ), k ∈ N
        E(Lq) = ρ^2/(1 − ρ)

Ws      (Ws | Ls = k) ∼ Gamma(k + 1, µ), k ∈ N0
        Ws ∼ Exponential(µ(1 − ρ))
        E(Ws) = 1/[µ(1 − ρ)]

Wq      (Wq | Ls = k) ∼ Gamma(k, µ), k ∈ N
        FWq(t) = 0, t < 0;  1 − ρ, t = 0;  (1 − ρ) + ρ × FExp(µ(1−ρ))(t), t > 0
        (Wq | Wq > 0) ∼ Exponential(µ(1 − ρ))
        E(Wq) = ρ/[µ(1 − ρ)]
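The expected values in the table are mutually consistent with Little's law (4.52). A quick numerical check, with the assumed rates λ = 3 and µ = 5 (so ρ = 0.6 < 1):

```python
lam, mu = 3.0, 5.0    # assumed example rates
rho = lam / mu

# expected values taken from the M/M/1 table
E_Ls = rho / (1 - rho)
E_Lq = rho**2 / (1 - rho)
E_Ws = 1.0 / (mu * (1 - rho))
E_Wq = rho / (mu * (1 - rho))
```

Little's law gives E(Ls) = λ E(Ws) and E(Lq) = λ E(Wq) (here λe = λ, since nothing is blocked), and the pairs are further linked by E(Ws) = E(Wq) + 1/µ and E(Ls) = E(Lq) + ρ.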
Exercise 4.71 — M/M/1 queueing system (bis, bis)
Admit that defective items from a production line arrive to the repair shop of the same
factory according to a Poisson process with constant rate λ. The repair shop has a single-
server who completes repairs after independent and exponentially distributed times with
expected value equal to 3 minutes.
(a) Determine the distribution of Ls.
(b) The manager of the production line wishes that the probability of having more than
5 defective items waiting for repair does not exceed 10% and that the probability of
having an idle server in the repair shop does not exceed 30%. Identify the arrival
rates that satisfy both conditions. •
Exercise 4.72 — More on the M/M/1 queueing model
Passengers arrive to a passport control area in a very small airport according to a Poisson
process having rate equal to 30 passengers per hour. The passport control has a sole
officer who completes checks after independent and exponentially distributed times with
expected value equal to 1.5 minutes.
(a) Obtain the probability that the server is idle.
(b) Calculate the expected number of passengers in the passport control area.
(c) What is the probability that passengers form a queue, and what is the queue’s expected size?
(d) Determine not only the expected time an arriving passenger spends in the passport
control area, but also the expected time this passenger waits to be served.
(e) What is the probability that a passenger waits at least 10 minutes until his/her
passport starts to be checked by the officer? •
Exercise 4.73 — More on the M/M/1 queueing model (bis)
People arrive to a phone booth according to a Poisson process with rate 0.1 persons per
minute and the durations of the phone calls are independent and exponentially distributed
r.v. with common expected value equal to 3 minutes.
(a) What is the probability that someone has to wait to make a phone call?
(b) Determine the expected size of the queue.
(c) The phone company will install another phone booth in the same area if the expected
waiting time is at least 3 minutes. Calculate the increase in the arrival rate that
justifies the installation of the second phone booth.
(d) Obtain the probability that a customer has to wait more than 10 minutes to start
his/her phone call.
(e) What is the probability that a person does not spend more than 10 minutes from
arrival to the system until the end of the phone call?
(f) Calculate the percentage of time the phone booth is being used. •
Exercise 4.74 — More on the M/M/1 queueing model (bis, bis)
Vehicles arrive to a car wash according to a Poisson process with rate 5 vehicles per hour
and the durations of the car washes are independent and exponentially distributed r.v.
with expected value equal to 10 minutes. Admit that the car wash has a waiting area
with infinite capacity.
(a) What is the probability that a vehicle has to wait to be washed?
(b) Determine the expected number of vehicles that have to wait to be washed.
(c) Compute the standard deviation of the time spent in queue waiting for the vehicle to
be washed.
(d) What is the percentage of time the car wash machine is not working? •
Exercise 4.75 — M/M/1 queueing system with discouraged arrivals
Consider an M/M/1 queueing system where arrivals tend to get discouraged when more
and more people are present in the system. One possible way to model this effect is to
consider a harmonic discouragement of arrivals with respect to the number present in
the system, i.e., having birth rates equal to
λk = λ/(k + 1),  k ∈ N0,
and keep the death rates equal to µk = µ, k ∈ N (Kleinrock, 1975, p. 99).
(a) Draw the rate diagram of this birth and death process (Kleinrock, 1975, Figure 3.5,
p. 100).
(b) Verify that the process is ergodic if λ/µ < +∞.
(c) Show that the limiting probabilities are given by
Pk = e^(−λ/µ) (λ/µ)^k / k!,  k ∈ N0
(Kleinrock, 1975, pp. 99–100), i.e., Ls ∼ Poisson(λ/µ). •
4.8.3 The M/M/∞ queueing system
The M/M/∞ can be thought of as a system where there is always a new server for each
arriving customer (Kleinrock, 1975, p. 101).31
This queueing system can be obviously described by a birth and death process with
rates
λk = λ, k ∈ N0, (4.65)
µk = kµ, k ∈ N, (4.66)
and the ergodic condition is simply λ/µ < +∞ (Kleinrock, 1975, p. 101).
Proposition 4.76 — M/M/∞: distribution of Ls (Kleinrock, 1975, p. 101)
Ls ∼ Poisson(λ/µ). •
Exercise 4.77 — M/M/∞: distribution of Ls
Prove Proposition 4.76. •
Suffice it to say that there is no waiting in this system and the time spent in the system
coincides with the duration of the service; thus,

Lq =st 0    (4.67)
Ws ∼ Exponential(µ)    (4.68)
Wq =st 0.    (4.69)
Since this system is quite simple to describe in the equilibrium state, we are tempted
to state the transient behavior of the number of customers in the system at time t.
31It may also be interpreted as a system with a responsive server who accelerates his/her service rate
linearly (Kleinrock, 1975, p. 101), to avoid any customers waiting.
Proposition 4.78 — M/M/∞: transient behavior of number of customers
Let X(t) be the number of customers in the M/M/∞ system at time t. Then
(X(t) | X(0) = 0) ∼ Poisson(λ (1 − e^(−µt))/µ).    (4.70)
•
Exercise 4.79 — M/M/∞: transient behavior of number of customers
Prove Proposition 4.78 by deriving Kolmogorov's forward equations and the
associated partial differential equation. •
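Proposition 4.78 also explains the approach to equilibrium: the transient mean λ(1 − e^(−µt))/µ increases monotonically to the mean λ/µ of the limiting Poisson distribution. A quick numerical illustration (the rates below are hypothetical, chosen only for the example):

```python
from math import exp

def mminf_transient_mean(lam, mu, t):
    """E[X(t) | X(0) = 0] = lam * (1 - exp(-mu t)) / mu for the M/M/infinity queue."""
    return lam * (1 - exp(-mu * t)) / mu

lam, mu = 3.0, 0.5          # hypothetical arrival and service rates
for t in (1, 5, 50):
    print(t, mminf_transient_mean(lam, mu, t))   # approaches lam/mu = 6
```

The convergence is exponentially fast in µt, so after a few mean service times the transient distribution is essentially the equilibrium Poisson(λ/µ).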
Even though the M/G/∞ queueing system32 cannot be modeled as a birth and death
process, we digress and state the transient and limit behavior of its number of customers.
Proposition 4.80 — M/G/∞: transient and limit behavior of number of
customers (Pacheco, 2002, p. 91)
Let X(t) be the number of customers in the M/G/∞ system at time t. Then
(X(t) | X(0) = 0) ∼ Poisson(λ ∫_0^t [1 − G(t − s)] ds)    (4.71)

lim_{t→+∞} (X(t) | X(0) = 0) ∼ Poisson(λ/µ).    (4.72)
•
Exercise 4.81 — M/G/∞: transient and limit behavior of number of
customers
Prove Proposition 4.80 (Pacheco, 2002, p. 91). •
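Formula (4.71) involves the service distribution only through ∫_0^t [1 − G(t − s)] ds = ∫_0^t [1 − G(u)] du. A sketch (function names are mine) that evaluates this integral numerically and, assuming for the sanity check that G is an Exponential(µ) c.d.f., recovers the M/M/∞ transient mean λ(1 − e^(−µt))/µ of Proposition 4.78:

```python
from math import exp

def mginf_transient_mean(lam, G, t, n=20_000):
    """lam * integral_0^t [1 - G(u)] du, evaluated with the midpoint rule;
    by (4.71), X(t) | X(0) = 0 is Poisson with this mean."""
    h = t / n
    return lam * h * sum(1 - G((i + 0.5) * h) for i in range(n))

lam, mu = 3.0, 0.5                       # hypothetical rates
G_exp = lambda u: 1.0 - exp(-mu * u)     # Exponential(mu) service time c.d.f.
approx = mginf_transient_mean(lam, G_exp, 4.0)
exact = lam * (1.0 - exp(-mu * 4.0)) / mu   # M/M/infinity closed form (4.70)
```

Replacing `G_exp` by any other c.d.f. (e.g., a uniform one) gives the transient mean of the corresponding M/G/∞ system with no extra work.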
Exercise 4.82 — M/G/∞: transient and limit behavior of number of
customers
Users arrive at a library according to a Poisson process with rate equal to 3 users per
minute and spend in the library an amount of time with Uniform(10, 210) distribution.
What is the expected number of users in the library two hours after it opened (Pacheco,
2002, p. 91)? •
32This system is associated with a Poisson arrival process with rate λ and service time distribution
function G with finite expected value µ−1.
M/M/∞
Rates  λk = λ, k ∈ N0
       µk = kµ, k ∈ N
Ls     Ls ∼ Poisson(λ/µ)
Lq     Lq =st 0
Ws     Ws ∼ Exp(µ)
Wq     Wq =st 0
X(t) = number of customers in the system at time t
       (X(t) | X(0) = 0) ∼ Poisson(λ (1 − e^(−µt))/µ)
M/G/∞
       (X(t) | X(0) = 0) ∼ Poisson(λ ∫_0^t [1 − G(t − s)] ds)
       lim_{t→+∞} (X(t) | X(0) = 0) ∼ Poisson(λ/µ)
4.8.4 M/M/m, the m-server case
Once again we consider a queueing system with an unlimited waiting area and with a
constant arrival rate; this system provides a maximum of m servers, is within the reach
of a birth and death formulation and leads to
λk = λ,  k ∈ N0    (4.73)
µk = min{kµ, mµ} = kµ,  k = 1, . . . , m;
                   mµ,  k = m + 1, m + 2, . . .    (4.74)
(Kleinrock, 1975, p. 102).
From these birth and death rates, it is easily seen that the condition for ergodicity is
written, expectedly, in terms of the traffic intensity:
ρ = λ/(mµ) < 1.    (4.75)
Proposition 4.83 — M/M/m: distribution of Ls (Kleinrock, 1975, pp. 102–103)
The limit probability of finding k customers in the M/M/m system depends, once again,
on λ and µ through the traffic intensity ρ = λ/(mµ):

P(Ls = k) = P0 (mρ)^k / k!,  k = 0, 1, . . . , m − 1;
            P0 m^m ρ^k / m!,  k = m, m + 1, . . . ,    (4.76)

where P0 = P(Ls = 0) = [Σ_{k=0}^{m−1} (mρ)^k/k! + (mρ)^m/(m!(1 − ρ))]^(−1).
Equivalently,

P(Ls = k) = (m!/k!) (1 − ρ)(mρ)^(k−m) C(m,mρ),  k = 0, 1, . . . , m − 1;
            (1 − ρ) ρ^(k−m) C(m,mρ),  k = m, m + 1, . . . ,    (4.77)

where

C(m,mρ) = P(queueing) = P(Ls ≥ m) = [(mρ)^m/(m!(1 − ρ))] / [Σ_{k=0}^{m−1} (mρ)^k/k! + (mρ)^m/(m!(1 − ρ))]    (4.78)

is usually referred to as Erlang's C formula (or Erlang's second formula).33 •

33 Some authors, such as Kleinrock (1975, p. 103), represent this probability by C(m, ρ) instead of
C(m,mρ). We prefer the notation of Pacheco (2002, p. 80).
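Erlang's C formula (4.78) translates directly into code. A minimal sketch (the helper name is mine); the special cases C(1, ρ) = ρ and C(2, 2ρ) = 2ρ^2/(1 + ρ) listed in the summary table of Exercise 4.87 serve as sanity checks:

```python
from math import factorial

def erlang_c(m, a):
    """Erlang's C formula C(m, a), where a = m*rho = lam/mu is the offered load;
    requires rho = a/m < 1 for ergodicity."""
    rho = a / m
    assert rho < 1, "requires rho = a/m < 1"
    top = a**m / (factorial(m) * (1 - rho))
    bottom = sum(a**k / factorial(k) for k in range(m)) + top
    return top / bottom

# Sanity checks against the closed forms for m = 1 and m = 2
rho = 0.8
c1 = erlang_c(1, rho)        # should equal rho
c2 = erlang_c(2, 2 * rho)    # should equal 2 rho^2 / (1 + rho)
```

For large m the factorials overflow; in that case the recursion of Exercise 4.96 is numerically preferable.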
Proposition 4.84 — M/M/m: distribution of Lq
The equilibrium probability of finding k customers waiting in line in the M/M/m queueing
system is simply given by:
P(Lq = k) = 1 − ρ C(m,mρ),  k = 0;
            (1 − ρ) ρ^k C(m,mρ),  k ∈ N.    (4.79)
•
Proposition 4.85 — M/M/m: distribution of Ws
The distribution of Ws, conditional on Ls = k, depends on whether the arriving customer
is immediately served or not:

(Ws | Ls = k) ∼ Exp(µ),  k = 0, . . . , m − 1;
                Exp(µ) ⋆ Gamma(k − m + 1, mµ),  k = m, m + 1, . . . ,    (4.80)

where ⋆ denotes convolution (the sum of two independent r.v.).
The survival function of Ws has two expressions, depending on whether ρ is equal
to (m − 1)/m or not:

1 − FWs(t) = [1 + µt C(m,mρ)] e^(−µt),  t ≥ 0,  if ρ = (m − 1)/m;
             [1 + (e^(µ[1−m(1−ρ)]t) − 1)/(1 − m(1 − ρ)) × C(m,mρ)] e^(−µt),  t ≥ 0,  if ρ ≠ (m − 1)/m.    (4.81)
•
Proposition 4.86 — M/M/m: distribution of Wq
Once more Wq is a mixed r.v. In this case:
(Wq | Ls = k) =st 0,  k = 0, . . . , m − 1;
              =st Gamma(k − m + 1, mµ),  k = m, m + 1, . . . ;    (4.82)

(Wq | Wq > 0) ∼ Exponential(mµ(1 − ρ));    (4.83)

1 − FWq(t) = 1,  t < 0;
             C(m,mρ),  t = 0;
             C(m,mρ) × [1 − FExp(mµ(1−ρ))(t)],  t > 0.    (4.84)
•
Exercise 4.87 — M/M/m: distributions of Ls, Lq, Ws and Wq
Prove Propositions 4.83–4.86 and obtain the expected values below.
M/M/m
Rates  λk = λ, k ∈ N0
       µk = kµ, k = 1, . . . , m;  mµ, k = m + 1, m + 2, . . .
Ls     P(Ls = k) = (m!/k!)(1 − ρ)(mρ)^(k−m) C(m,mρ), k = 0, 1, . . . , m − 1;
                   (1 − ρ) ρ^(k−m) C(m,mρ), k = m, m + 1, . . .
       C(m,mρ) = P(Ls ≥ m) = [(mρ)^m/(m!(1 − ρ))] / [Σ_{k=0}^{m−1} (mρ)^k/k! + (mρ)^m/(m!(1 − ρ))]
       C(1, ρ) = ρ
       C(2, 2ρ) = 2ρ^2/(1 + ρ)
       E(Ls) = mρ + [ρ/(1 − ρ)] C(m,mρ)
Lq     P(Lq = k) = 1 − ρ C(m,mρ), k = 0;  (1 − ρ) ρ^k C(m,mρ), k ∈ N
       E(Lq) = [ρ/(1 − ρ)] C(m,mρ)
Ws     (Ws | Ls = k) ∼ Exp(µ), k = 0, . . . , m − 1;  Exp(µ) ⋆ Gamma(k − m + 1, mµ), k = m, m + 1, . . .
       1 − FWs(t) = [1 + µt C(m,mρ)] e^(−µt), t ≥ 0, if ρ = (m − 1)/m;
                    [1 + (e^(µ[1−m(1−ρ)]t) − 1)/(1 − m(1 − ρ)) × C(m,mρ)] e^(−µt), t ≥ 0, if ρ ≠ (m − 1)/m
       E(Ws) = 1/µ + C(m,mρ)/[mµ(1 − ρ)]
Wq     (Wq | Ls = k) ∼ Gamma(k − m + 1, mµ), k = m, m + 1, . . .
       (Wq | Wq > 0) ∼ Exponential(mµ(1 − ρ))
       1 − FWq(t) = 1, t < 0;  C(m,mρ), t = 0;  C(m,mρ) × [1 − FExp(mµ(1−ρ))(t)], t > 0
       E(Wq) = C(m,mρ)/[mµ(1 − ρ)]
•
Exercise 4.88 — M/M/m queueing system
A system has two servers, who attend to customers on a FCFS basis and whose service
times are independent and exponentially distributed r.v. with mean value 1.8 minutes.
Considering that customers arrive to the system according to a Poisson process with rate
equal to 1 customer per minute, compute:
(a) the probability that there are more than 10 customers in the system;
(b) the expected time a customer spends in line waiting to be served;
(c) the expected number of customers in the system;
(d) the probability that exactly one server is idle. •
Exercise 4.89 — M/M/m queueing system (bis)
A small public office has two officers, whose service times are independent and exponentially
distributed r.v. with rate equal to 60 visitors per hour. Admit that the times between
consecutive arrivals of visitors are i.i.d. r.v. exponentially distributed with parameter equal
to 100 visitors per hour and calculate:
(a) the probability that there are more than 4 visitors in the system;
(b) the expected number of visitors in the system;
(c) the expected time a visitor spends in the system. •
Exercise 4.90 — M/M/m queueing system (bis, bis)
A department has three secretaries, who process requests that arrive according to a
Poisson process with rate equal to 20 requests per 8 hours. Assume that the processing
times are independent and exponentially distributed r.v. with expected value equal to 40
minutes.
(a) What is the percentage of time all (resp. at least one of) the secretaries are busy?
(b) Obtain the expected time one waits for a request to be completely processed.
(c) Admit that due to financial problems one of the secretaries had to be fired. Recompute
the quantities in (a) and (b). •
Exercise 4.91 — M/M/m queueing system (bis, bis, bis)
Airplanes arrive to an airport according to a Poisson process having rate equal to 18
airplanes per hour, and their times in a runway during landing are independent and
exponentially distributed r.v. with expected value equal to 2 minutes.
Derive the number of runways the airport should have so that the probability that an
arriving airplane waits to land does not exceed 0.20. •
Exercise 4.92 — An M/M/m queueing system with impatient customers
(Isaacson and Madsen, 1976, Example VII.3.4, pp. 245–246)
Assume:
• customers arrive at a ticket counter with m windows according to a Poisson process
with parameter 6 per minute;
• customers are served on a first-come-first-served basis;
• service times are independent and exponentially distributed with mean 1/3 of a
minute.
(a) What is the minimum number of windows needed to guarantee that the line does not
get infinitely long?
(b) Assume m = 4 and that we are dealing with impatient customers who:
• wait for service if Ls ≤ 4;
• wait for service with probability 1/2 if Ls = 5;
• leave if Ls ≥ 6.
What is the distribution of Ls? •
4.8.5 M/M/m/m, the m–server loss system
Kendall’s notation has been extended, namely to A/S/c/K where:
• K stands for the capacity of the system, i.e., the maximum number of customers
allowed in the system including those in service.
When the number is at this maximum, further arrivals are turned away
(http://en.wikipedia.org/wiki/Kendall’s notation).34 K is sometimes denoted by m + c
where c is the buffer size, that is, the number of places in the waiting area.
The M/M/m/m queueing system is an m-server system with no waiting area:
each newly arriving customer is given her/his private server; however, if a customer
arrives when all servers are occupied, that customer is lost (Kleinrock, 1975, p. 105).
Unsurprisingly, this queueing system is also called an m-server loss system.
This queueing system can be modeled as a birth and death process with

λk = λ,  k = 0, 1, . . . , m − 1;
     0,  k = m, m + 1, . . .    (4.85)

µk = kµ,  k = 1, . . . , m;
     0,  k = m + 1, m + 2, . . .    (4.86)

(Kleinrock, 1975, p. 105). Since we are dealing with a finite state space (S =
{0, 1, . . . , m}), ergodicity is obviously assured as long as the traffic intensity ρ = λ/(mµ)
is finite, and this condition can be written in terms of the offered load:

mρ = λ/µ < +∞.    (4.87)
Proposition 4.93 — M/M/m/m: distribution of Ls (Kleinrock, 1975, p. 105)
The limit probability of finding k customers in the M/M/m/m queueing system depends
on the offered load mρ = λ/µ:

P(Ls = k) = P0 (mρ)^k / k!,  k = 0, 1, . . . , m;
            0,  k = m + 1, m + 2, . . . ,    (4.88)

where P0 = P(Ls = 0) = [Σ_{k=0}^{m} (mρ)^k/k!]^(−1). •
34If this number is omitted, the capacity is assumed to be unlimited, or infinite.
Remark 4.94 — M/M/m/m system and Erlang’s B formula
The long-run fraction of lost customers is equal to

P(Ls = m) = [(mρ)^m/m!] / [Σ_{k=0}^{m} (mρ)^k/k!]    (4.89)
          = B(m,mρ) ≡ B(m, λ/µ),    (4.90)

usually referred to as Erlang's B formula (or Erlang's first formula, or the Erlang loss
formula); it was first derived by Erlang in 1917 (Kleinrock, 1975, p. 106).

The (equilibrium) distribution of Ls is sometimes written in terms of B(m,mρ):

P(Ls = k) = [m!/(k! (mρ)^(m−k))] × B(m,mρ),  k = 0, 1, . . . , m;
            0,  k = m + 1, m + 2, . . .    (4.91)
•
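Erlang's B formula is equally direct to compute. A minimal sketch (the helper name and the sample load are mine), where B(1, a) = a/(1 + a) serves as a check and a[1 − B(m, a)] gives the carried load E(Ls) = mρ[1 − B(m, mρ)]:

```python
from math import factorial

def erlang_b(m, a):
    """Erlang's B (loss) formula B(m, a), with offered load a = lam/mu = m*rho."""
    return (a**m / factorial(m)) / sum(a**k / factorial(k) for k in range(m + 1))

a = 2.5                      # hypothetical offered load
blocked = erlang_b(3, a)     # long-run fraction of lost customers with m = 3 servers
carried = a * (1 - blocked)  # E(Ls) = m*rho*[1 - B(m, m*rho)]
```

Adding servers always lowers the blocking probability, which is a handy way of sizing m for a target loss rate.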
Exercise 4.95 — M/M/m/m: distributions of Ls, Lq, Ws and Wq
Prove Proposition 4.93 and show that E(Ls) = mρ[1−B(m,mρ)]. •
Exercise 4.96 — Erlang’s B and C formulae (Cooper, 1981, pp. 82, 92)
Prove that:
(a) Erlang’s B formula can be obtained in a recursive way:
B(m,mρ) = B(m, λ/µ) = ρ/(1 + ρ),  m = 1;
                      mρ × B(m − 1, λ/µ) / [m + mρ × B(m − 1, λ/µ)],  m = 2, 3, . . . ;

(b) Erlang's C formula is related to Erlang's B formula as follows:

    C(m,mρ) = m × B(m,mρ) / {m − mρ × [1 − B(m,mρ)]}

    C(m, λ/µ) = 1 / {1 + (m − mρ) × [mρ × B(m − 1, λ/µ)]^(−1)},

    where B(0, λ/µ) = 1;

(c) C(m,mρ) > B(m,mρ);

(d) C(m,mρ) ≡ C(m, λ/µ) = 1 / {1 + [(1 − ρ)/ρ] × [m − 1 − mρ × C(m − 1, λ/µ)] / [(m − 1 − mρ) × C(m − 1, λ/µ)]},
    for m > mρ + 1. •
We are dealing, once more, with a system where there is no wait — in this case because
there is no waiting area and, thus, arriving customers who find all the m servers busy are
lost. As a consequence:
Lq =st 0    (4.92)
Ws ∼ Exp(µ)    (4.93)
Wq =st 0.    (4.94)
M/M/m/m
Rates  λk = λ, k = 0, 1, . . . , m − 1;  0, k = m, m + 1, . . .
       µk = kµ, k = 1, . . . , m;  0, k = m + 1, m + 2, . . .
Ls     P(Ls = k) = [(mρ)^k/k!] / [Σ_{j=0}^{m} (mρ)^j/j!]
                 = [m!/(k! (mρ)^(m−k))] × B(m,mρ), k = 0, 1, . . . , m;
                   0, k = m + 1, m + 2, . . .
       B(m,mρ) = [(mρ)^m/m!] / [Σ_{j=0}^{m} (mρ)^j/j!]
       E(Ls) = mρ[1 − B(m,mρ)]
Lq     Lq =st 0
Ws     Ws ∼ Exp(µ)
Wq     Wq =st 0
Exercise 4.97 — M/M/m/m system
Answer the questions in Exercise 4.89, considering that the small public office has no
waiting area. •
Bibliography
• Bertsekas, D.P. (2—). Stochastic Processes (Chapter 5).
(www.telecom.otago.ac.nz/tele302/ref/Bertsekas ch5.pdf)
• Billingsley, P. (1990). Probability and Measure (3rd. edition). Wiley.
(QA273.4-.67.BIL.37008 and QA273.4-.67.BIL.36649 refer to the library code of the
2nd. edition)
• Brockwell, P.J. and Davis, R.A. (1991). Time Series: Theory and Methods (2nd.
edition). Springer-Verlag.
• Caravena, F. (2012). A note on directly Riemann integrable functions. Accessed
from http://arxiv.org/abs/1210.2361 on 2013-04-03.
• Cooper, R.B. (1981). Introduction to Queueing Theory (2nd. edition). North
Holland.
• Feller, W. (1968). An introduction to probability theory and its applications, Vol. 1
(3rd. edition). John Wiley & Sons.
(QA273.4-.67.FEL.30377, QA273.4-.67.FEL.27086)
• Feller, W. (1971). An introduction to probability theory and its applications, Vol. 2
(2nd. edition). John Wiley & Sons.
(QA273.FEI.1018)
• Grimmett, G.R. and Stirzaker, D.R. (2001a). Probability and Random Processes
(3rd. edition). Oxford.
(QA274.12-.76.GRI.40695 refers to the library code of the 1st. and 2nd. editions
from 1982 and 1992, respectively.)
• Grimmett, G.R. and Stirzaker, D.R. (2001b). One Thousand Exercises in
Probability. Oxford University Press.
• Hajek, B. (2009). Notes for ECE 534 — An Exploration of Random Processes for
Engineers.
(http://www.ifp.illinois.edu/~hajek/Papers/randomprocesses.html)
• Hastings, K. (2001). Introduction to probability with Mathematica. Chapman &
Hall.
(QA273.19.HAS.54617)
• Isaacson, D.L. and Madsen, R.W. (1976). Markov Chains: Theory and Applications.
John Wiley & Sons.
(QA274.12-.76.ISA.28858)
• Karr, A.F. (1993). Probability. Springer-Verlag.
• Kleinrock, L. (1975). Queueing Systems, Volume I: Theory. John Wiley & Sons.
(T57.9.KLE)
• Kleinrock, L. and Gail, R. (1996). Queueing Systems: Problems and Solutions.
John Wiley & Sons.
(T57.92.KLE.49916)
• Kulkarni, V.G. (1995). Modeling and Analysis of Stochastic Systems. Chapman &
Hall.
(QA274.12-.76.KUL.59065, QA274.12-.76.KUL.45259)
• Morais, M.C. (2011). Lecture Notes — Probability Theory. Departamento de
Matematica, Instituto Superior Tecnico, Universidade Tecnica de Lisboa.
(https://fenix.ist.utl.pt/disciplinas/tp/2010-2011/1-semestre/material-didactico)
• Morais, M.C. (2012). Real- and Integer-valued Time Series and Quality Control
Charts. Departamento de Matematica, Instituto Superior Tecnico, Universidade
Tecnica de Lisboa.
• Pacheco, A. (2002). Class Notes – Stochastic Manufacturing and Service Systems.
Georgia Institute of Technology, Atlanta, USA.
(https://fenix.ist.utl.pt/disciplinas/ipe64/2012-2013/2-semestre/material-
didactico)
• Pinkerton, S.D. and Holtgrave, D.R. (1998). The Bernoulli-process model in HIV
transmission: applications and implications. In Handbook of economic evaluation
of HIV prevention programs, Holtgrave, D.R. (Ed.), pp. 13–32. Plenum Press, New
York.
• Resnick, S. (1992). Adventures in Stochastic Processes. Birkhauser, Boston.
(QA274.12-.76.RES.43493)
• Rohatgi, V.K. (1976). An Introduction to Probability Theory and Mathematical
Statistics. John Wiley & Sons.
(QA273-280/4.ROH.34909)
• Ross, S.M. (1983). Stochastic Processes. John Wiley & Sons, New York.
(QA274.12-.76.ROS.36921, QA274.12-.76.ROS.37578)
• Ross, S.M. (1989). Introduction to Probability Models (4th edition). Academic
Press. (QA274.12-.76.ROS.43540 refers to the library code of the 5th. revised edition
from 1993.)
• Ross, S.M. (2003). Introduction to Probability Models (8th edition). Academic Press,
San Diego, California.
(QA273.ROS.62694)
• Serfozo, R. (2009). Basics of Applied Stochastic Processes. Springer-Verlag.
• Shumway, R.H. and Stoffer, D.S. (2006). Time Series Analysis and Its Applications:
With R Examples (2nd. edition). Springer-Verlag.
• Walrand, J. (2004). Lecture Notes on Probability Theory and Random Processes.
Department of Electrical Engineering and Computer Sciences, University of
California, Berkeley.
(walrandpc.eecs.berkeley.edu/126notes.pdf)
• Yates, R.D. and Goodman, D.J. (1999). Probability and Stochastic Processes: A
friendly Introduction for Electrical and Computer Engineers. John Wiley & Sons,
Inc. (QA273-280/4.YAT.49920)