Lecture Notes — Stochastic Processes
Manuel Cabral Morais
Department of Mathematics
Instituto Superior Tecnico
Lisbon/Bern, February–May 2014
Preliminary note
I am convinced that the students of Introduction to Stochastic Processes will benefit
from these lecture notes, which were written assuming that the structure of the classes is
based on the philosophy of learning by doing. Thus, the subjects tend to be motivated; the
definitions are introduced; and the results are stated (occasionally proved) and illustrated by
examples and exercises worked through together with the students.
Some more facts about these lecture notes. The main sources of inspiration are
undoubtedly Ross (1983, 1989, 2003) and Kulkarni (1995). However, I decided to
complement the lecture notes with material from a few other sources.
Please also note that the definitions, results, etc. are preceded by headers, and the
source(s) of inspiration I used is (are) added in most cases; I strongly believe that the
presentation benefits from these headers and that the identification of sources is not only
fair but absolutely essential. The examples and the detailed solutions of some exercises in
these lecture notes are presented in small sections with headers, with the purpose of
suggesting how students should structure the detailed solutions of the exercises in
Introduction to Stochastic Processes.
I am fully responsible for the typos, imprecisions or errors in these lecture notes —
if you detect any, do let me know by sending an e-mail to [email protected].
I would like to express my sincere thanks to Prof. Antonio Pacheco, for giving me
the opportunity to teach this course and for some invaluable material used during the
preparation of these lecture notes.
Enjoy them and I wish you a splendid semester...
Manuel Cabral Morais
Bern, February 12, 2014
Contents
Preliminary note i
0. Introduction to stochastic processes 1
0.1 Stochastic processes and their characterization . . . . . . . . . . . . . . . . 2
0.2 A pivotal characteristic of some stochastic processes . . . . . . . . . . . . . 8
0.3 A few examples of stochastic processes . . . . . . . . . . . . . . . . . . . . 10
1 Poisson Processes 17
1.1 Properties of the exponential distribution . . . . . . . . . . . . . . . . . . . 18
1.2 Poisson process: definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.3 Event times in Poisson processes . . . . . . . . . . . . . . . . . . . . . . . . 39
1.4 Merging and splitting Poisson processes . . . . . . . . . . . . . . . . . . . . 43
1.5 Non-homogeneous Poisson process . . . . . . . . . . . . . . . . . . . . . . . 51
1.6 Conditional Poisson process . . . . . . . . . . . . . . . . . . . . . . . . . . 58
1.7 Compound Poisson process . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2 Renewal Processes 70
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
2.2 Properties of the number of renewals . . . . . . . . . . . . . . . . . . . . . 72
2.3 Renewal function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
2.4 Renewal-type equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
2.5 Key renewal theorem and some other limit theorems . . . . . . . . . . . . 84
2.6 Recurrence times; the inspection paradox . . . . . . . . . . . . . . . . . . . 95
2.7 Renewal reward processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
2.8 Alternating renewal processes . . . . . . . . . . . . . . . . . . . . . . . . . 107
2.9 Delayed renewal processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
2.10 Regenerative processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
3 Discrete time Markov chains 119
3.1 Definitions and examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.2 Chapman-Kolmogorov equations; marginal and joint distributions . . . . . 125
3.3 Classification of states; recurrent and transient states . . . . . . . . . . . . 130
3.4 Limit behavior of irreducible Markov chains . . . . . . . . . . . . . . . . . 141
3.5 Limit behavior of reducible Markov chains . . . . . . . . . . . . . . . . . . 149
3.6 Markov chains with costs/rewards . . . . . . . . . . . . . . . . . . . . . . . 153
3.7 Reversible Markov chains . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
3.8 Branching processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
3.9 First passage times; absorption probabilities . . . . . . . . . . . . . . . . . 167
4 Continuous time Markov chains 175
4.1 Definitions and examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
4.2 Properties of the transition matrix; Chapman-Kolmogorov equations . . . . 178
4.3 Computing the transition matrix: finite state space . . . . . . . . . . . . . 184
4.4 Computing the transition matrix: infinite state space . . . . . . . . . . . . 185
4.5 Birth and death processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
4.6 Classification of states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
4.7 Limit behavior of CTMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
4.8 Birth and death queueing systems in equilibrium . . . . . . . . . . . . . . . 208
4.8.1 Performance measures . . . . . . . . . . . . . . . . . . . . . . . . . 209
4.8.2 M/M/1, the classical queueing system . . . . . . . . . . . . . . . . 212
4.8.3 The M/M/∞ queueing system . . . . . . . . . . . . . . . . . . . . . 218
4.8.4 M/M/m, the m server case . . . . . . . . . . . . . . . . . . . . . . . 221
4.8.5 M/M/m/m, the m–server loss system . . . . . . . . . . . . . . . . . 226
Bibliography 229
0. Introduction to stochastic processes
In Probability Theory, a stochastic process (or random process) is a collection of (indexed)
random variables (r.v.). These collections of r.v. are frequently used to represent the
evolution of a random quantity (X) over time (t)
(http://en.wikipedia.org/wiki/Stochastic_process). This random
quantity could be, for example:
• a stock market index, such as the Dow Jones Industrial Average
(DJIA)1 at the end of a daily trading session at the New York
Stock Exchange (NYSE).
A stochastic process is the random analogue of a deterministic
process: even if the initial condition is known, there are several
(often infinitely many) directions in which the process may evolve
(http://en.wikipedia.org/wiki/Stochastic_process).
Remark 0.1 — Practical importance of stochastic processes (Shumway and
Stoffer, 2006, p. 1)
The relevance of stochastic processes in practice can be described by mentioning a brief
list of some of the important areas in which stochastic processes arise:
1. Economics — we frequently deal with daily stock market quotations or monthly
unemployment figures;
1The DJIA is an index that shows how 30 large publicly owned companies based in the USA have
traded in the stock market (http://en.wikipedia.org/wiki/Dow_Jones_Industrial_Average).
2. Social sciences — population birth rates and school enrollments series have been
followed for many centuries in several countries;
3. Epidemiology — numbers of influenza cases are often monitored over long periods
of time;
4. Medicine — blood pressure measurements are traced over time to evaluate the
impact of pharmaceutical drugs used in treating hypertension. •
Quiz 0.2 — Stochastic processes
Try to think of more stochastic processes in the world similar to 1–4 in Remark 0.1. •
0.1 Stochastic processes and their characterization
As noted by Brockwell and Davis (1991, p. 8), to allow for the unpredictable nature of
future observations, we have to suppose that each observation at time t is a realization of
a r.v. X(t) (or Xt). As a result, the sequence of observations taken sequentially in time
is a realization of a collection of r.v., known as a stochastic process.
Definition 0.3 — Stochastic process (Karr, 1993, p. 45; Brockwell and Davis, 1991,
p. 8)
A stochastic process (with index set T ) is a collection X(t) : t ∈ T of r.v. defined on a
common probability space (Ω,F ,P). •
Remark 0.4 — State of the process; index set; discrete- and continuous-time
processes
• The index t is often interpreted as time and, thus, X(t) is referred to as the state
of the process at time t (Ross, 2003, p. 83).
• T is called index set (or parameter set) (Ross, 1983, p. 26; Kulkarni, 1995, p. 2).
• If T is a countable set (e.g., N0 or Z) then we are dealing with a discrete-time process
(Ross, 1983, p. 26).
• If T is a continuum (e.g., R+0 or R) then X(t) : t ∈ T is said to be a
continuous-time process (Ross, 1983, p. 26).2 •
Remark 0.5 — Sample path, state space; discrete and continuous value
processes
• For each t ∈ T , X(t) is a r.v. (Ross, 2003, p. 83).
• Any realization of the stochastic process X(t) : t ∈ T is called a sample path
(Ross, 1983, p. 26).
• The set of all possible values that the r.v. X(t) can take at all t, say S, is said to
be the state space of the stochastic process X(t) : t ∈ T (Ross, 2003, p. 84).
• If S is a countable set (e.g., N0 or Z), X(t) : t ∈ T is a discrete value process
(Yates and Goodman, 1999, p. 206).
• If S is a continuum (e.g., R+0 or R), X(t) : t ∈ T is a continuous value process
(Yates and Goodman, 1999, p. 206). •
In this course we restrict our attention to stochastic processes in discrete or continuous
time. Moreover, we shall assume that X(t) can take values on either a discrete or a
continuous set.
2Stochastic processes in which T is not a subset of R are also of importance for instance in geophysics
where T is the surface of a sphere and Xt represents a relevant r.v. at location t on the surface of the
Earth (Wei, 1990, p. 1).
Example 0.6 — Stochastic processes
1. Discrete-time, discrete value process — X(t) : t ∈ N, where X(t) is the
outcome (“effective”, 1; “non effective”, 0) referring to patient t, in a clinical trial in
which an experimental drug is administered to a series of patients (Kulkarni, 1995,
p. 4).
2. Discrete-time, continuous value process — X(t) : t ∈ {1, . . . , 365}, where
X(t) represents the noontime temperature in degrees Celsius at Lisbon Airport on
day t, from January 1 to December 31, 2013 (Yates and Goodman, 1999, p. 203).
3. Continuous-time, discrete value process — X(t) : t ∈ [0, 1], where X(t) is
the number of active calls associated to a coverage cell at time t, tomorrow from 8
to 9PM (Yates and Goodman, 1999, p. 203).
4. Continuous-time, continuous value process — X(t) : t ∈ R+0 , where X(t)
denotes the temperature in degrees Kelvin on the surface of a space shuttle at time
t, starting at launch time t = 0 (Yates and Goodman, 1999, p. 202). •
Quiz 0.7 — Stochastic processes
Give more examples of discrete/continuous-time, discrete/continuous value processes.3 •
Example 0.8 — Sample paths (Hajek, 2009, p. 97)
Consider X(t) : t ∈ N, where X(1), X(2), . . . are independent and identically
distributed (i.i.d.) r.v. such that P [X(t) = 1] = p and P [X(t) = −1] = 1 − p, for
each t, where p ∈ (0, 1). Moreover, suppose Y(t) = ∑_{i=1}^{t} X(i), for t ∈ N.
3X(t) might represent: the number of arrivals to a queue during the service interval of the tth customer,
or the socio-economic status of a family after t generations (discrete-time, discrete value processes); the
waiting time (including the service time) of the tth customer in a system (discrete-time, continuous
value process); the number of people in a queue at time t (continuous-time, discrete value process); the
accumulated operating time of a server in [0, t], or the accumulated claims paid by an insurance company
in [0, t] (continuous-time, continuous value processes).
Both X(t) : t ∈ N and Y(t) : t ∈ N are discrete-time (discrete value) stochastic
processes. A sample path of X(t) : t ∈ N and the corresponding sample path of
Y(t) : t ∈ N are shown below for p = 1/2.
[Figure: a sample path of X(t), taking values −1 and 1, and the corresponding sample
path of the random walk Y(t), for t = 1, . . . , 8.]
•
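The sample paths above can be reproduced with a short simulation. The following Python sketch (the seed and the use of the standard `random` module are arbitrary implementation choices) generates one realization of X(t) : t ∈ N and the corresponding partial sums Y(t):

```python
import random

def random_walk(n, p=0.5, seed=1):
    """Simulate X(1), ..., X(n) with P[X(t) = 1] = p, P[X(t) = -1] = 1 - p,
    and the partial sums Y(t) = X(1) + ... + X(t)."""
    rng = random.Random(seed)
    x = [1 if rng.random() < p else -1 for _ in range(n)]
    y, s = [], 0
    for step in x:
        s += step
        y.append(s)
    return x, y

x, y = random_walk(8)
print("X:", x)
print("Y:", y)
```

Plotting t against Y(t) for a few hundred steps gives pictures similar to the one above.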
Motivation 0.9 — Characterization of a stochastic process (Kulkarni, 1995, pp.
9–10)
The r.v. X is fully characterized by its distribution function (d.f.),
FX(x) = P (X ≤ x), x ∈ R.
A random vector (X1, . . . , Xn) is completely described by its joint d.f.,
FX1,...,Xn(x1, . . . , xn) = P (X1 ≤ x1, . . . , Xn ≤ xn), xi ∈ R (i = 1, . . . , n).
Can we similarly and completely describe a stochastic process X(t) : t ∈ T? •
The full mathematical description of a stochastic process varies depending on whether
the index set T is finite, infinite (yet countable) or uncountable. Moreover, the full
description of a continuous-time stochastic process is not trivial because we have to deal
with an uncountable number of r.v. in this case; a complete description can be provided if
we make certain assumptions about the continuity of sample paths, etc. (Kulkarni, 1995,
p. 10).
Proposition 0.10 — Characterization of stochastic processes (Kulkarni, 1995, p.
10)
• If the index set T is finite and #T = n, then the stochastic process X(t) : t ∈ T
is completely described by the corresponding joint d.f.
• If T is infinite but countable, say T = N0, the discrete-time process X(t) : t ∈ N0
is fully described by a consistent family of (finite-dimensional) joint d.f., say Fn :
n ∈ N0.4 What one means by "fully describing" a stochastic process in this case is
being able to construct a probability space on which the process resides.5
• If T is uncountable, say T = R+0, and almost all the sample paths of X(t) : t ∈ R+0
are right-continuous6 with a finite number of jumps in a(ny) finite interval of time,
then X(t) : t ∈ R+0 is completely described by a consistent family of (finite-
dimensional) joint d.f. Ft1,...,tn(x1, . . . , xn) = P[X(t1) ≤ x1, . . . , X(tn) ≤ xn], for any
n ∈ N and 0 ≤ t1 < · · · < tn. •
Quiz 0.11 — Characterization of stochastic processes
How can we fully characterize the stochastic process X(t) : t ∈ N, where X(1), X(2), . . .
are i.i.d. r.v. with common d.f. F?7
4 Let Fn(x0, x1, . . . , xn) = P[X(0) ≤ x0, X(1) ≤ x1, . . . , X(n) ≤ xn], xi ∈ R (i = 0, 1, . . . , n) and n ∈ N0. Then the family of joint d.f. Fn : n ∈ N0 is called consistent if lim_{x→+∞} Fn+1(x0, x1, . . . , xn, x) = Fn(x0, x1, . . . , xn), for all xi ∈ R (i = 0, 1, . . . , n) and n ∈ N0. That is, the "marginal" distribution of (X(0), X(1), . . . , X(n)) obtained from the joint d.f. of (X(0), X(1), . . . , X(n), X(n + 1)) should be the same as the joint d.f. specified for (X(0), X(1), . . . , X(n)) (Walrand, 2004, p. 190).
5 The Kolmogorov existence theorem guarantees that a suitably "consistent" collection of finite-dimensional distributions will define a stochastic process. This theorem is credited to the Soviet mathematician Kolmogorov (http://en.wikipedia.org/wiki/Andrey_Kolmogorov) and can be stated as follows (for more details, see Karr, 1993, p. 65). Let Fn be a joint d.f. on R^{n+1} and suppose that lim_{x→+∞} Fn+1(x0, x1, . . . , xn, x) = Fn(x0, x1, . . . , xn), for all xi ∈ R (i = 0, 1, . . . , n) and n ∈ N0. Then there is a probability space, say (Ω, F, P), and a sequence of r.v. X(t), t ∈ N0, defined on it such that Fn is the d.f. of (X(0), X(1), . . . , X(n)), for each n ∈ N0.
6 I.e., X(s) tends to X(t) as s decreases to t, for all t.
7 This stochastic process is completely described by F because we can create a consistent family of joint d.f. as follows: Fn(x1, . . . , xn) = ∏_{i=1}^{n} F(xi), xi ∈ R (i = 1, . . . , n) and n ∈ N (Kulkarni, 1995, p. 10).
Motivation 0.12 — Partial characterization of stochastic processes (Brockwell
and Davis, 1991, p. 11)
Let us remind the reader that, while handling a random vector, it is often useful to
compute its mean vector and, more importantly, its covariance matrix (or correlation
matrix) to gain insight into the dependence between its components.
While dealing with a stochastic process X(t) : t ∈ T, we have to extend the concepts
of mean vector and covariance or correlation matrices — the mean, the autocovariance
and the autocorrelation functions provide the necessary extension. •
Definition 0.13 — Mean, variance, autocovariance and autocorrelation
functions (Wei, 1990, p. 7)
A stochastic process X(t) : t ∈ T can be partially described by the following functions:
µ(t) = E[X(t)], t ∈ T ;
σ2(t) = V [X(t)], t ∈ T ;
γ(t1, t2) = Cov(X(t1), X(t2)), t1, t2 ∈ T ;
ρ(t1, t2) = Corr(X(t1), X(t2))
= γ(t1, t2) / √(σ2(t1) × σ2(t2)), t1, t2 ∈ T.
They represent the mean, variance, autocovariance and autocorrelation functions,
respectively. •
Quiz 0.14 — Mean, variance and autocovariance functions
Let A and B be two independent N(0, 1) r.v., and X(t) = A + Bt + t^2, t ∈ R. Determine
the mean, variance and autocovariance functions.8 •
8For t ∈ R, we have µ(t) = t2, σ2(t) = 1 + t2 and γ(t1, t2) = 1 + t1t2 (Hajek, 2009, p. 99).
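The answer in footnote 8 can be checked by Monte Carlo simulation. The sketch below (the function name `simulate_xy`, the sample size and the seed are arbitrary illustrative choices) estimates µ(t1), σ2(t1) and γ(t1, t2) for t1 = 1 and t2 = 2, where the theoretical values are 1, 2 and 3:

```python
import random
from statistics import fmean, variance

def simulate_xy(t1, t2, n=200_000, seed=7):
    # X(t) = A + B*t + t^2 with A, B independent N(0, 1) r.v.
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        x1.append(a + b * t1 + t1 ** 2)
        x2.append(a + b * t2 + t2 ** 2)
    m1, m2 = fmean(x1), fmean(x2)
    # sample covariance between X(t1) and X(t2)
    cov = sum((u - m1) * (v - m2) for u, v in zip(x1, x2)) / (n - 1)
    return m1, variance(x1), cov

m, v, c = simulate_xy(1.0, 2.0)
print(m, v, c)  # theory: mu(1) = 1, sigma^2(1) = 2, gamma(1, 2) = 3
```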
0.2 A pivotal characteristic of some stochastic processes
A crucial feature of several stochastic processes is some form of statistical equilibrium or
stationarity.
In order to state the following notions of stationarity, let us consider (without loss of
generality) a stochastic process X(t) : t ∈ R+0 .
Definition 0.15 — nth order stationarity in distribution (Wei, 1990, p. 7)9
The stochastic process X(t) : t ∈ R+0 is said to be nth order stationary in distribution
(n ∈ N), if the n−dimensional joint d.f. is time invariant, i.e., if
Ft1,...,tn(x1, . . . , xn) = Ft1+u,...,tn+u(x1, x2, . . . , xn), (1)
for any (x1, x2, . . . , xn) ∈ Rn, any n-tuple (t1, t2, . . . , tn) with ti ∈ R+0, and any u > 0. •
Remark 0.16 — nth order stationarity in distribution
• If X(t) : t ∈ R+0 is nth order stationary in distribution, then the n-dimensional
joint d.f. are unaffected by shifting all the time epochs t1, t2, . . . , tn by any positive
amount u (Grimmett and Stirzaker, 2001a, p. 361).
• A higher order of stationarity always implies a lower order of stationarity because
the n-dimensional joint d.f. determines all finite-dimensional joint d.f. of lower
dimension, say m < n (Wei, 1990, p. 7). •
Definition 0.17 — Strict stationarity (Wei, 1990, p. 7; Grimmett and Stirzaker,
2001a, p. 361)
The stochastic process X(t) : t ∈ R+0 is strictly stationary if it is nth order stationary
in distribution, for any n ∈ N. •
9 Wei (1990, p. 7) states this notion of stationarity for X(t) : t ∈ Z; the extension to X(t) : t ∈ R+0 follows in a straightforward manner.
Remark 0.18 — Strict stationarity
• The terms strongly stationary and completely stationary are also used to denote a
strictly stationary stochastic process (Wei, 1990, p. 7).
• For a strictly stationary process, the mean function µ(t) is constant, say equal to
µ, provided the expectation of X(t) exists, i.e., E[|X(t)|] < +∞ (Wei, 1990, p. 7).
Likewise, if E[X2(t)] < +∞, then the variance function σ2(t) is also constant, say
equal to σ2; moreover, the autocovariance function satisfies γ(t, t+ u) = γ(0, u), for
t ∈ R+0 and u > 0 (Wei, 1990, pp. 7–8). •
Quiz 0.19 — Strict stationarity
Is the stochastic process defined in Quiz 0.14 strictly stationary?10 •
It is very difficult or virtually impossible to verify strict stationarity; thus, we often use
weaker notions of stationarity defined in terms of the moments of the stochastic process
(Wei, 1990, p. 8).
Definition 0.20 — First and second order weak stationarity (Wei, 1990, p. 8;
Pires, 2001, p. 11)
• A first order weakly stationary process X(t) : t ∈ R+0 has constant mean function
µ(t) = µ, t ∈ R+0 .
• A second order weakly stationary process X(t) : t ∈ R+0 — or simply stationary
— has constant mean function µ(t) = µ, for t ∈ R+0 , and an autocovariance function
which depends on the time lag alone, i.e.,
γ(t, t+ u) = γ(0, u) (2)
for any t, u ∈ R+0 . •
10No! For instance, µ(t) = t2 is not constant.
Quiz 0.21 — Second order weak stationarity
Let X(t) : t ∈ Z be a stochastic process such that
X(t) = µ+ φ[X(t− 1)− µ] + ε(t), t ∈ Z, (3)
where: φ is a constant satisfying −1 < φ < 1; and ε(t) : t ∈ Z is a sequence of
disturbances such that ε(t) i.i.d. ∼ N(0, σ2ε), with σ2ε = (1 − φ2)σ2(0).
Is X(t) : t ∈ Z a stationary (i.e., a second order weakly stationary) process?11 Is
this stochastic process strictly stationary? •
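The stationarity claimed in footnote 11 can be illustrated numerically. A minimal Python sketch, assuming Gaussian disturbances and µ = 0, σ2(0) = 1 (arbitrary illustrative choices), simulates (3) started in its stationary distribution and estimates γ(t, t + k), which should be close to σ2(0)φ^k:

```python
import math
import random

def ar1_path(phi, sigma0=1.0, n=100_000, seed=42):
    # X(t) = phi*X(t-1) + eps(t), with Var[eps] = (1 - phi^2)*sigma0^2 and mu = 0;
    # the chain is started in its stationary N(0, sigma0^2) distribution
    rng = random.Random(seed)
    se = math.sqrt(1 - phi ** 2) * sigma0
    x = rng.gauss(0, sigma0)
    path = []
    for _ in range(n):
        x = phi * x + rng.gauss(0, se)
        path.append(x)
    return path

def autocov(path, k):
    # sample autocovariance at lag k
    m = sum(path) / len(path)
    pairs = range(len(path) - k)
    return sum((path[i] - m) * (path[i + k] - m) for i in pairs) / len(path)

p = ar1_path(0.6)
print([round(autocov(p, k), 3) for k in (0, 1, 2)])  # theory: 1, 0.6, 0.36
```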
0.3 A few examples of stochastic processes
A sequence of i.i.d. r.v. is the simplest stochastic process. Although it is devoid of an
interesting structure, we can construct non-trivial stochastic processes from it, such as
Y(t) : t ∈ N, defined in Example 0.8 and called a (one-dimensional) random walk.
This is the simplest discrete-time stochastic process with a non-trivial structure
(Kulkarni, 1995, p. 11); it is going to be addressed in more detail in future chapters.
Remark 0.22 — Applications of random walk
The path followed by an atom in a gas, moving under the influence of collisions with
other atoms, can be described by a random walk (RW). Random walks have also been
applied in other areas such as:
• Economics — RW used to model share prices and other factors;
• Population genetics — RW describes the statistical properties of genetic drift;12
11 Yes! Since −1 < φ < 1, X(t) admits the representation X(t) = µ + ∑_{i=0}^{+∞} φ^i ε(t − i); hence E[X(t)] = µ and V[X(t)] = σ2ε/(1 − φ2) = σ2(0) are constant, and Cov(X(t), X(t + k)) = Cov(X(t) − µ, X(t + k) − µ) = E[∑_{i=0}^{+∞} ∑_{j=0}^{+∞} φ^i φ^j ε(t − i) ε(t + k − j)] = σ2ε ∑_{i=0}^{+∞} φ^i φ^{i+k} = σ2ε φ^k / (1 − φ2) = σ2(0) φ^k, for k ∈ N0.
12 Genetic drift is one of several evolutionary processes which lead to changes in allele frequencies over time.
• Ecology — RW used to describe individual animal movements, and occasionally
to model population dynamics. •
An extremely relevant stochastic process arises from counting events occurring one at
a time.13
Definition 0.23 — Counting process (Ross, 1989, p. 210)
A stochastic process N(t) : t ≥ 0 is said to be a counting process if N(t) represents the
total number of events (e.g. arrivals, departures) that have occurred up to time t; it
must satisfy:
• N(t) ∈ N0, t ≥ 0;
• N(s) ≤ N(t), 0 ≤ s < t;
• N(t)−N(s) corresponds to the number of events that have occurred in the interval
(s, t], 0 ≤ s < t. •
Example 0.24 — Counting process (Ross, 2003, p. 288)
• Let N(t) be the number of persons who enter a specific store at or prior to time t.
Then N(t) : t ≥ 0 is a counting process in which an event corresponds to a person
entering the store.
• Let N(t) be the number of children born by time t in a maternity hospital. Then N(t) :
t ≥ 0 is a counting process in which an event occurs whenever a child is born. •
Quiz 0.25 — Counting process (Ross, 2003, p. 288)
Let N(t) represent now the number of persons in a store at time t. Is N(t) : t ≥ 0 a
counting process?14
Give examples of counting processes. •
The two following properties play a major role in the characterization of counting
processes.
13 What follows is an adaptation of Morais (2011, Sec. 3.6).
14 No! N(t) : t ≥ 0 does not satisfy N(s) ≤ N(t), 0 ≤ s < t.
Definition 0.26 — Counting process with stationary increments (Ross, 1989, p.
210)
The counting process N(t) : t ≥ 0 is said to have stationary increments if the distribution
of the number of events that occur in any interval of time depends only on the length of
the interval,15 that is,
• N(t2 + s) − N(t1 + s) =d N(t2) − N(t1), for all 0 ≤ t1 < t2 and s > 0. •
Definition 0.27 — Counting process with independent increments (Ross, 1989,
p. 209)
The counting process N(t) : t ≥ 0 is said to have independent increments if the numbers
of events that occur in disjoint intervals are independent r.v., i.e.,
• for 0 < t1 < · · · < tn, N(t1), N(t2) − N(t1), N(t3) − N(t2), . . . , N(tn) − N(tn−1)
are independent r.v.
•
What follows is a detailed description of a counting process in discrete time (arising
from a sequence of i.i.d. r.v.) with stationary and independent increments.16
Motivation 0.28 — Bernoulli (counting) process (Karr, 1993, p. 88)
Counting successes in repeated, independent trials, each of which has one of two possible
outcomes (success and failure). •
Definition 0.29 — Bernoulli process (Karr, 1993, p. 88)
A Bernoulli process with parameter p is a sequence Xi : i ∈ N of i.i.d. r.v. with Bernoulli
distribution with parameter p = P(success). •
15 The distributions do not depend on the origin of the time interval; they only depend on the length of the interval.
16 Since we are going to deal with discrete-time processes, X(t) is replaced by Xi, etc.
Definition 0.30 — Important r.v. in a Bernoulli process (Karr, 1993, pp. 88–89)
In isolation a Bernoulli process is neither deep nor interesting. However, we can identify
three associated and very important r.v.:
• Sn = ∑_{i=1}^{n} Xi, the number of successes in the first n trials (n ∈ N);
• Tk = min{n : Sn = k}, the time (trial number) at which the kth success occurs
(k ∈ N), that is, the number of trials needed to get k successes;
• Uk = Tk − Tk−1, the time (number of trials) between the (k−1)th and kth successes
(k ∈ N, T0 = 0, U1 = T1). •
Definition 0.31 — Bernoulli counting process (Karr, 1993, p. 88)
The sequence Sn : n ∈ N is usually termed the Bernoulli counting process (or success
counting process). •
Exercise 0.32 — Bernoulli counting process
Simulate a Bernoulli process with parameter p = 1/2 and consider n = 100 trials. Plot the
realizations of both the Bernoulli process and the Bernoulli counting process. •
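Since plotting facilities vary, here is a minimal Python sketch of the simulation part of this exercise, printing the realizations instead of plotting them (the function name and the seed are arbitrary choices):

```python
import random

def bernoulli_process(n, p, seed=2014):
    """Simulate X_1, ..., X_n i.i.d. Bernoulli(p) and the counting process
    S_m = X_1 + ... + X_m, m = 1, ..., n."""
    rng = random.Random(seed)
    x = [1 if rng.random() < p else 0 for _ in range(n)]
    s, total = [], 0
    for xi in x:
        total += xi
        s.append(total)
    return x, s

x, s = bernoulli_process(100, 0.5)
print("first 10 trials:", x[:10])
print("first 10 counts:", s[:10])
print("S_100 =", s[-1])  # should be close to n*p = 50
```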
Definition 0.33 — Bernoulli success time process (Karr, 1993, p. 88)
The sequence Tk : k ∈ N is usually called the Bernoulli success time process. •
Proposition 0.34 — Important distributions in a Bernoulli process (Karr, 1993,
pp. 89–90)
In a Bernoulli process with parameter p (p ∈ [0, 1]) we have:
• Sn ∼ Binomial(n, p), n ∈ N;
• (Sm | Sn = k) ∼ Hypergeometric(n,m, k), 0 ≤ m ≤ n, 0 ≤ k ≤ n.
• Tk ∼ NegativeBinomial(k, p), k ∈ N;
• Uk i.i.d. ∼ Geometric(p) =d NegativeBinomial(1, p), k ∈ N. •
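These distributional results can be checked by simulation. The following Python sketch (the parameter values n = 20, k = 3, p = 0.4 and the seed are arbitrary illustrative choices) estimates E(Sn) and E(Tk), which should be close to the theoretical values np = 8 and k/p = 7.5:

```python
import random

def empirical_means(n=20, k=3, p=0.4, reps=40_000, seed=5):
    # Estimate E[S_n] (successes in the first n trials) and
    # E[T_k] (trial number of the kth success) over many replications
    rng = random.Random(seed)
    sum_s = sum_t = 0
    for _ in range(reps):
        successes = trials = 0
        t_k = None
        # run until both the first n trials are done and the kth success is seen
        while trials < n or t_k is None:
            trials += 1
            success = rng.random() < p
            successes += success
            if success and successes == k and t_k is None:
                t_k = trials
            if trials == n:
                sum_s += successes
        sum_t += t_k
    return sum_s / reps, sum_t / reps

es, et = empirical_means()
print(es, et)  # theory: E[S_20] = 20*0.4 = 8, E[T_3] = 3/0.4 = 7.5
```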
Proposition 0.35 — Properties of the Bernoulli counting process (Karr, 1993,
p. 90)
The Bernoulli counting process Sn : n ∈ N has:
• independent increments — i.e., for 0 < n1 < · · · < nk, the r.v. Sn1, Sn2 − Sn1,
Sn3 − Sn2, . . . , Snk − Snk−1 are independent;
• stationary increments — that is, for fixed j ∈ N, the distribution of Sk+j−Sk is the
same for all k ∈ N. •
Quiz 0.36 — Properties of the Bernoulli counting process
(a) Argue why Proposition 0.35 (Karr, 1993, p. 90) holds.
(b) Obtain the mean, variance and autocovariance functions of the Bernoulli counting
process. Is it a second order weakly stationary process? •
Remark 0.37 — Bernoulli counting process (web.mit.edu/6.262/www/lectures/
6.262.Lec1.pdf)
Some application areas for discrete stochastic processes such as the Bernoulli counting
process (and the Poisson process, studied in the next chapter) are:
• Operations Research — Queueing in any area, failures in manufacturing systems,
finance, risk modelling, network models;
• Biology and Medicine — Epidemiology, genetics and DNA studies, cell modelling,
bioinformatics, medical screening, neurophysiology;
• Computer Systems — Communication networks, intelligent control systems, data
compression, detection of signals, job flow in computer systems, physics – statistical
mechanics. •
Exercise 0.38 — Bernoulli process modelling of sexual HIV transmission
(Pinkerton and Holtgrave, 1998, pp. 13–14)
In the Bernoulli-process model of sexual HIV transmission, each act of sexual intercourse
is treated as an independent stochastic trial that is associated with a probability α of HIV
transmission. α is also known as the infectivity of HIV, and a number of factors are
believed to influence α.17
Prove that the expression of the probability of HIV transmission in n multiple contacts
with the same infected partner is 1− (1− α)n. •
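The formula follows because no transmission occurs in any of the n independent contacts with probability (1 − α)^n. A short Python illustration (the value α = 0.001 is a hypothetical illustrative infectivity, not a value taken from the references):

```python
def p_transmission(alpha, n):
    # P(at least one transmission in n contacts) = 1 - P(no transmission in all n)
    return 1 - (1 - alpha) ** n

# alpha = 0.001 is hypothetical; the probability grows with the number of contacts
for n in (1, 10, 100):
    print(n, round(p_transmission(0.001, n), 6))
```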
Definition 0.39 — Independent Bernoulli processes
Two Bernoulli counting processes S(1)n : n ∈ N and S(2)n : n ∈ N are independent
if, for every positive integer k and all times n1, . . . , nk, the random vector
(S(1)n1, . . . , S(1)nk) associated with the first process is independent of the random
vector (S(2)n1, . . . , S(2)nk) associated with the second process. •
Proposition 0.40 — Merging independent Bernoulli processes
Let S(1)n : n ∈ N and S(2)n : n ∈ N be two independent Bernoulli counting processes
with parameters α and β, respectively. Then the merged process S(1)n ⊕ S(2)n : n ∈ N
is a Bernoulli counting process with parameter α + β − αβ.18 •
17 Such as the type of sex act engaged in, sex role, etc.
18 An event is said to occur in the merged process if and only if an event occurs in at least one of the two original processes, which happens with probability α + β − αβ (Bertsekas, 2—, p. 10).
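A quick Monte Carlo check of this proposition, with arbitrary illustrative parameters α = 0.2 and β = 0.3 (so the merged parameter should be 0.2 + 0.3 − 0.06 = 0.44):

```python
import random

def merged_rate(alpha, beta, n=200_000, seed=11):
    # Empirical per-trial event rate of the merged process: an event occurs
    # iff at least one of the two independent Bernoulli processes has one
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        e1 = rng.random() < alpha
        e2 = rng.random() < beta
        hits += e1 or e2
    return hits / n

alpha, beta = 0.2, 0.3
print("empirical  :", merged_rate(alpha, beta))
print("theoretical:", alpha + beta - alpha * beta)
```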
Quiz 0.41 — Merging independent Bernoulli processes
Give an example of a merger between two independent Bernoulli processes. Provide
a detailed description of the two original processes and the process resulting from the
merger. •
Proposition 0.42 — Splitting a Bernoulli process (or sampling a Bernoulli
process)
Let Sn : n ∈ N be a Bernoulli counting process with parameter α. Splitting the original
Bernoulli counting process based on a selection probability p yields two Bernoulli counting
processes with parameters αp and α(1− p).
•
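As a numerical illustration of the splitting mechanism, the sketch below (α = 0.5, p = 0.3 and the seed are arbitrary illustrative choices) routes each event of a simulated Bernoulli process and estimates the rates of the two resulting streams, which should be close to αp = 0.15 and α(1 − p) = 0.35:

```python
import random

def split_rates(alpha, p, n=200_000, seed=3):
    # Each event of the original Bernoulli process (parameter alpha) is routed
    # to stream 1 with probability p, otherwise to stream 2
    rng = random.Random(seed)
    c1 = c2 = 0
    for _ in range(n):
        if rng.random() < alpha:
            if rng.random() < p:
                c1 += 1
            else:
                c2 += 1
    return c1 / n, c2 / n

r1, r2 = split_rates(0.5, 0.3)
print(r1, r2)  # theory: alpha*p = 0.15 and alpha*(1 - p) = 0.35
```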
Quiz 0.43 — Splitting a Bernoulli process
(a) Are the two processes resulting from splitting a Bernoulli process independent?19
(b) Give an example where we are dealing with a splitting of a Bernoulli process.20
Provide a detailed description of the original process and the two resulting from
the splitting. •
19 No! If we try to merge the two split processes and assume they are independent, we get a parameter αp + α(1 − p) − αp × α(1 − p), which is different from α.
20 A two-machine work center may see a stream of arriving parts to be processed and split them by sending each part to a randomly chosen machine (Bertsekas, 2—, p. 10).
Chapter 1
Poisson Processes
Is there a continuous analogue of the Bernoulli process?
Yes!
Motivation 1.1 — Poisson processes
In the Bernoulli process, the times between consecutive events are i.i.d. r.v. with Geometric
distribution with parameter p — the only discrete distribution with lack of memory...
Similarly, if the times between consecutive events are i.i.d. r.v. with Exponential
distribution with parameter λ — the only continuous distribution with lack of memory!
— we end up dealing with the (homogeneous) Poisson process, named after the French
mathematician Simeon-Denis Poisson. In this stochastic process events occur continuously
and independently of one another. •
Assuming that the times between consecutive events are i.i.d. r.v. exponentially
distributed is certainly a simplifying assumption so as to render the mathematics tractable
(Ross, 2003, p. 269), and yet the radioactive decay of atoms, telephone calls arriving at
a switchboard, page view requests to a website and several other phenomena are
well-modeled as Poisson processes (http://en.wikipedia.org/wiki/Poisson_process).
What follows is an extended version of Morais (2011, Section 3.7), prepended by a
section inspired by Kulkarni (1995, Section 5.1), Ross (2003, Section 5.2) and Pacheco
(2002, Section 2.1).
1.1 Properties of the exponential distribution
The purpose of this section is to state results concerning the exponential distribution,
namely a few that play a major role in the analysis of some stochastic processes.
Definition 1.2 — Exponential distribution
The r.v. X is said to have exponential distribution with parameter λ > 0 — for short
X ∼ Exponential(λ) — if it has p.d.f. given by
fX(x) = { 0, x < 0
          λ e^{−λx}, x ≥ 0.   (1.1)
•
Exercise 1.3 — C.d.f., moments, m.g.f., expected value, variance, coefficient
of variation, median, mode, skewness, kurtosis of the Exponential distribution
Let X ∼ Exponential(λ). Prove the following results:
(a) FX(x) = { 0, x ≤ 0
              1 − e^{−λx}, x > 0.
The survival function of X, SX(x) = 1 − FX(x), is an exponential function with
negative exponent, which explains in part why the exponential distribution is also
called negative exponential distribution (Pacheco, 2002, p. 38).
(b) E(X^s) = Γ(s + 1)/λ^s, for s > −1, where: Γ(s) = ∫_0^{+∞} λ^s x^{s−1} e^{−λx} dx;
Γ(s + 1) = sΓ(s), s > 0; and Γ(s + 1) = s!, for s ∈ N0.
(c) MX(t) = E(e^{tX}) = λ/(λ − t), for t < λ.1
(d) E(X) = 1/λ.
(e) V(X) = 1/λ^2.
1 If the function MX(t) = E(e^{tX}) exists in a neighborhood of t = 0, it is called the moment generating function (m.g.f.) of the r.v. X. Note that MX(t) = E(e^{tX}) = ∑_{k=0}^{+∞} t^k E(X^k)/k!. Moreover, if the m.g.f. is defined for |t| ≤ t0, where t0 > 0, then E(X^k) = d^k MX(t)/dt^k |_{t=0}, for k = 1, 2, . . .
(f) CV(X) = √V(X)/|E(X)| = 1 (coefficient of variation).

(g) median(X) = λ^{−1} × ln(2).

(h) mode(X) = 0.

(i) SC(X) = E[X − E(X)]^3/[SD(X)]^3 = 2 (skewness coefficient; SC(X) > 0, thus a skewed-to-the-right distribution).

(j) KC(X) = E[X − E(X)]^4/[SD(X)]^4 − 3 = 6 (excess kurtosis coefficient; KC(X) > 0, hence a leptokurtic distribution). •
Proposition 1.4 — Univariate properties of the Exponential distribution
Let X ∼ Exponential(λ). Then X has the following properties:
• Lack of memory — This is the most important property of the exponential
distribution and it can be stated as follows:
P (X > s+ t | X > s) = P (X > t), s, t ≥ 0 (1.2)
(Kulkarni, 1995, p. 189), i.e., P (X > s + t) = P (X > s) × P (X > t), s, t ≥ 0.2
Equivalently, the residual lifetime of X at age s, (X − s | X > s),
has the same distribution as X itself (Pacheco, 2002, p. 38):

(X − s | X > s) =st X. (1.3)
The Exponential r.v. is the only continuous r.v. with the lack of memory property
(Kulkarni, 1995, p. 190).3
2 This means that the conditional probability that we need to wait more than another t seconds before the first arrival, given that the first arrival has not yet happened after s seconds, is equal to the initial probability that we need to wait more than t seconds for the first arrival (http://en.wikipedia.org/wiki/Exponential_distribution#Memorylessness).
3 The proof of this relevant result is interesting and can be found in Ross (1983, pp. 24–25), Kulkarni (1995, p. 190) or Ross (2003, p. 275).
• Failure rate — The lack of memory property is translated into a constant failure
rate function for the Exponential r.v.:
λX(x) = fX(x)/[1 − FX(x)] = λ, x ≥ 0. (1.4)
Since the failure rate function completely characterizes the c.d.f. of a non-negative
r.v.,4 the Exponential r.v. is the only non-negative continuous r.v. with constant
failure rate (Kulkarni, 1995, p. 191). •
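The lack of memory property (1.2) lends itself to a quick empirical check; the values lam = 1.0, s = 0.7 and t = 1.2 below are illustrative assumptions only.

```python
import random

# Empirical check of lack of memory: P(X > s+t | X > s) vs P(X > t),
# for illustrative choices lam = 1.0, s = 0.7, t = 1.2.
random.seed(2024)
lam, s, t = 1.0, 0.7, 1.2
sample = [random.expovariate(lam) for _ in range(500_000)]

survived_s = [x for x in sample if x > s]
p_cond = sum(x > s + t for x in survived_s) / len(survived_s)  # P(X > s+t | X > s)
p_uncond = sum(x > t for x in sample) / len(sample)            # P(X > t)

print(p_cond, p_uncond)  # both should be close to exp(-1.2)
```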
Another useful result refers to the distribution of the minimum of independent
exponentially distributed r.v., one of several multivariate properties of the
Exponential distribution.
Proposition 1.5 — Minimum of Exponentials
Let Xi indep∼ Exponential(λi), i = 1, . . . , n. It turns out that the smallest of the Xi also
has an Exponential distribution — with parameter equal to the sum of the λi:

min{X1, . . . , Xn} ∼ Exponential(∑_{i=1}^{n} λi). (1.5)

Moreover, if we define a r.v. N as

N = j, iff Xj = min{X1, . . . , Xn}, (1.6)

then min{X1, . . . , Xn} and N are independent r.v.5 and

P(N = j, min{X1, . . . , Xn} > x) = [λj/∑_{i=1}^{n} λi] × e^{−(∑_{i=1}^{n} λi) x}. (1.7)

•
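Result (1.5) can also be illustrated numerically; the rates (1.0, 2.0, 3.0) below are an illustrative assumption.

```python
import random
import statistics

# Monte Carlo sketch of Proposition 1.5: with illustrative rates
# lams = (1.0, 2.0, 3.0), min{X1, X2, X3} should behave as an
# Exponential(6.0) r.v., i.e., have mean 1/6 and variance 1/36.
random.seed(7)
lams = (1.0, 2.0, 3.0)
mins = [min(random.expovariate(l) for l in lams) for _ in range(300_000)]

mean_min = statistics.mean(mins)      # approx. 1/sum(lams) = 1/6
var_min = statistics.pvariance(mins)  # approx. 1/sum(lams)**2 = 1/36
print(mean_min, var_min)
```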
Exercise 1.6 — Minimum of independent Exponentials
(a) Prove result (1.5).
4 In fact, the c.d.f. of a non-negative continuous r.v. X can be obtained in terms of λX(x): FX(x) = 1 − exp[−∫_0^x λX(u) du], x ≥ 0 (Ross, 2003, p. 277).
5 The proof of this intriguing result can be found in Kulkarni (1995, pp. 192–193), for n = 2; the extension is easily proved.
(b) Let Xi indep∼ Exponential(λi) represent the duration of component i of a system.
What is the survival function of the duration of the system if the components are
set in series? Obtain the expected value and variance of the duration of the series
system.
(c) Prove result (1.7) without assuming that N and Xj are independent r.v. or using
result (1.9). •
The assumption of independence is critical (Kulkarni, 1995, p. 192) to obtain the
following result concerning the probability that one Exponential r.v. is smaller than
another:
Proposition 1.7 — Probability of first failure
Let Xi indep∼ Exponential(λi), i = 1, . . . , n. Then

P(X1 < X2) = P(X1 = min{X1, X2}) = λ1/(λ1 + λ2), (1.8)

and, in general,

P(Xj = min{X1, . . . , Xn}) = λj/∑_{i=1}^{n} λi. (1.9)

•
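A quick simulation makes result (1.8) concrete; the rates lam1 = 1.0 and lam2 = 3.0 below are illustrative assumptions.

```python
import random

# Numerical check of result (1.8): with illustrative rates
# lam1 = 1.0 and lam2 = 3.0, P(X1 < X2) should be 1/(1+3) = 0.25.
random.seed(99)
lam1, lam2 = 1.0, 3.0
n = 400_000
hits = sum(random.expovariate(lam1) < random.expovariate(lam2) for _ in range(n))
p_hat = hits / n
print(p_hat)  # approx. lam1/(lam1 + lam2) = 0.25
```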
Example/Exercise 1.8 — Probability of first failure
(a) Prove result (1.8).
Let Xi indep∼ Exponential(λi), i = 1, 2. Then

P(X1 < X2) = ∫_0^{+∞} P(X1 < X2 | X2 = x) × fX2(x) dx
(X1, X2 indep.) = ∫_0^{+∞} P(X1 < x) × fX2(x) dx
= ∫_0^{+∞} (1 − e^{−λ1 x}) × λ2 e^{−λ2 x} dx
= λ1/(λ1 + λ2).
(b) Prove result (1.9).
(c) Harry and John arrived at the same time to the barber shop: Harry to get shaved,
John to get a haircut. Suppose that Harry and John were immediately (and
independently!) served. Moreover, assume that the duration of a haircut (resp. a
shave) is an Exponential r.v. with expected value equal to 20 (resp. 15) minutes.
Calculate the probability that John gets his hair cut before Harry gets his beard
shaved. •
Exercise 1.9 — Probability of first failure (Kulkarni, 1995, Example 5.1, p. 192)
The running track in a stadium is 1 km long. Two runners start on it at the same time.
Suppose the speeds of the runners are Xi indep∼ Exponential(λi), i = 1, 2. The mean speeds
of runners 1 and 2 are 20 km/hr and 22 km/hr, respectively.
What is the probability that runner 1 wins the race? •
Exercise 1.10 — Probability of first failure (bis)
Let X1 and X2 be two independent non-negative continuous r.v. with failure rate functions
λ1(x) and λ2(x), respectively.
Prove that

P[X1 < X2 | min{X1, X2} = x] = λ1(x)/[λ1(x) + λ2(x)].

•
Exercise 1.11 — Probability of first failure (bis, bis)
Consider a post office with two clerks (who operate independently!). Suppose that
customers A, B and C enter the system simultaneously, A is served by one of the clerks,
B by the other and C is told that her/his service will begin as soon as either A or B
leaves.
What is the probability that, of the three customers, A is the last to leave the post
office if the amount of time a clerk spends with a customer is6
6Adapted from Ross (1983, Example 1.6(a), pp. 23–24).
(a) equal to 10 minutes?
(b) a r.v. with discrete uniform distribution in {1, 2, 3}?
(c) an Exponential r.v. with expected value 1/λ? •
A stronger version of the lack of memory property is stated in the next proposition.
Proposition 1.12 — Strong lack of memory; Rényi's representation
Let Xi i.i.d.∼ Exponential(λ), i = 1, . . . , n, and let X(1) = min{X1, . . . , Xn}, . . . , X(n) =
max{X1, . . . , Xn} be the associated order statistics. Then

(Xj − X(1) | Xj > X(1)) =d Xj, (1.10)

for j = 1, . . . , n.7

Moreover, {(Xj − min{X1, . . . , Xn} | Xj > min{X1, . . . , Xn}) : j = 1, . . . , n} is a
sequence of independent r.v. As a consequence, if D1, D2, . . . , Dn represent the (1st.
order) spacings — i.e., D1 = X(1), D2 = X(2) − X(1), . . . , Dn = X(n) − X(n−1) —, then

Dk indep∼ Exponential((n − k + 1)λ) (1.11)

for k = 1, . . . , n,8 and we can certainly add that

X(k) = ∑_{i=1}^{k} Di (1.12)

E[X(k)] = ∑_{i=1}^{k} 1/[(n − i + 1)λ], (1.13)

where (1.12) is usually called the Rényi representation of the order statistic X(k). •
7 When Xi represents the lifetime of component i (i = 1, . . . , n) and the n components are put to test at the same time, this result can be interpreted as follows: the remaining lifetime of component j, given that it is larger than the smallest of the lifetimes, is still exponentially distributed.
8 This result allows us to say that the times between successive failures — when dealing with items whose lifetimes are i.i.d. Exponential r.v. — are independent Exponential r.v. (with different parameters).
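The distribution of the spacings in (1.11) can be checked by simulation; n = 4 and lam = 1.0 below are illustrative assumptions.

```python
import random
import statistics

# Monte Carlo look at result (1.11): for n = 4 i.i.d. Exponential(lam)
# r.v. (lam = 1.0 for illustration), the spacing D_k = X(k) - X(k-1)
# should have mean 1/((n - k + 1)*lam), i.e., 1/4, 1/3, 1/2, 1.
random.seed(31)
n, lam, runs = 4, 1.0, 200_000
spacings = [[] for _ in range(n)]
for _ in range(runs):
    xs = sorted(random.expovariate(lam) for _ in range(n))
    prev = 0.0
    for k, x in enumerate(xs):
        spacings[k].append(x - prev)
        prev = x

means = [statistics.mean(d) for d in spacings]
print(means)  # approx. [0.25, 0.333, 0.5, 1.0]
```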
Exercise 1.13 — Strong lack of memory
Prove results (1.10) and (1.11). •
Exercise 1.14 — Rényi's representation
Use the Rényi representation to obtain the expected value and variance of the duration
of a parallel system with n components with i.i.d. exponentially distributed lifetimes.
Interpret the expression of the expected value you obtained.9 •
A parallel system is said to be operating with (n− 1) warm standbys. Another way of
operating the system is to use (n − 1) components as spares; thus, only one component
is working at a time and, when it fails, it is immediately replaced by one of the remaining
spares — the spares are in cold standby, that is, they do not fail unless they are put into
use (Kulkarni, 1995, p. 195); if the lifetimes of the components are i.i.d. Exponential r.v.
then the duration of this new system has a known distribution.
The distribution of sums of i.i.d. Exponential r.v. also arises when we are dealing with
the epoch of the nth arrival in a Poisson process.
Proposition 1.15 — Sums of i.i.d. Exponentials (Erlang distribution)
Let Xi i.i.d.∼ Exponential(λ), i = 1, . . . , n, and Sn = ∑_{i=1}^{n} Xi. Then Sn ∼ Gamma(n, λ) ≡ Erlang(n, λ), i.e.,

fSn(x) = [λ^n/(n − 1)!] x^{n−1} e^{−λx}, x ≥ 0. (1.14)

•
The gamma distribution stands in the same relation to exponential as negative
binomial to geometric: sums of i.i.d. exponential r.v. have gamma distribution (Morais,
2011, p. 78).
9 The equation of E[X(n)] is an example of the law of diminishing returns: a system of one component has expected lifetime 1/λ, whereas a system with two components in parallel has expected lifetime 1.5/λ; thus, doubling the number of components translates into just a 50% increase in the mean lifetime. One reason behind this diminishing return is that all the n components are in operation, and hence subject to failure simultaneously, although the parallel system requires just one component to function (Kulkarni, 1995, p. 195).
The parameter n is usually called the number of phases of the Erlang distribution
and λ its rate (the reciprocal of the expected duration of each phase); the Erlang distribution is a
particular case of the gamma family of distributions with very important applications
(Pacheco, 2002, p. 39). The Erlang distribution was developed by Agner Krarup Erlang
(1878–1929) to examine the number of telephone calls which might be made at the
same time to the operators of the switching stations; this work on telephone traffic
engineering has been expanded to consider waiting times in queueing systems in general;
the distribution is now used in the fields of stochastic processes and of biomathematics
(http://en.wikipedia.org/wiki/Erlang_distribution).
Exercise 1.16 — M.g.f., moments, expected value, variance, coefficient of
variation, mode, skewness, excess kurtosis of the Erlang distribution
Let Sn ∼ Gamma(n, λ), n ∈ N. Prove that:
(a) MSn(t) = [λ/(λ − t)]^n, for t < λ;

(b) E(Sn^k) = (n + k − 1)!/[(n − 1)! λ^k], k ∈ N;

(c) E(Sn) = n/λ;

(d) V(Sn) = n/λ^2;

(e) CV(Sn) = 1/√n ≤ 1;

(f) mode(Sn) = (n − 1)/λ;10

(g) SC(Sn) = 2/√n (skewed to the right distribution);

(h) KC(Sn) = 6/n. •
Exercise 1.17 — Erlang distribution
Prove that

FErlang(n,λ)(x) = ∑_{i=n}^{+∞} e^{−λx} (λx)^i/i! = 1 − FPoisson(λx)(n − 1), x > 0, n ∈ N. (1.15)

•
10 mode(Gamma(α, λ)) = (α − 1)/λ, α ∈ R+\(0, 1); median(Sn) has no simple closed form.
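Identity (1.15) can be verified numerically by integrating the Erlang p.d.f. (1.14) and comparing with the Poisson tail; n = 10, λ = 1 and x = 15 below are illustrative choices (they echo the setting of Exercise 1.18).

```python
import math

# Numerical check of identity (1.15): integrating the Erlang(n, lam)
# p.d.f. over (0, x] should equal the Poisson(lam*x) tail P(Poisson >= n).
n, lam, x = 10, 1.0, 15.0

def erlang_pdf(u):
    return lam**n / math.factorial(n - 1) * u**(n - 1) * math.exp(-lam * u)

# composite trapezoidal rule on a fine grid
steps = 200_000
h = x / steps
cdf_numeric = h * (sum(erlang_pdf(i * h) for i in range(1, steps)) +
                   0.5 * (erlang_pdf(0.0) + erlang_pdf(x)))

poisson_tail = 1.0 - sum(math.exp(-lam * x) * (lam * x)**i / math.factorial(i)
                         for i in range(n))
print(cdf_numeric, poisson_tail)  # the two values should agree
```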
Exercise 1.18 — Erlang distribution (bis) (Kulkarni, 1995, Example 5.4, p. 197)
Suppose the times between two successive births at a maternity hospital are i.i.d.
exponential r.v. with mean equal to one day.
What is the probability that the 10th birth in a calendar year takes place after January
15? •
The Erlang distribution arises when we sum i.i.d. exponential r.v. When the i.d.
assumption is dropped, another distribution arises with a coefficient of variation smaller
than one.
Proposition 1.19 — Sums of independent Exponentials: Hypo-exponential
distribution
Let Xi indep∼ Exponential(λi), i = 1, . . . , n, and suppose that λi ≠ λj for all i ≠ j. Then ∑_{i=1}^{n} Xi is said to be a Hypo-exponential r.v. and its p.d.f. is given by

f_{∑_{i=1}^{n} Xi}(x) = ∑_{i=1}^{n} Ci,n × λi e^{−λi x}, (1.16)

where Ci,n = ∏_{j≠i} λj/(λj − λi) (Ross, 2003, pp. 284–285; Kulkarni, 1995, p. 197). •
Exercise 1.20 — Hypo-exponential distribution
Let ∑_{i=1}^{n} Xi ∼ Hypo-exponential(λ1, . . . , λn).

(a) Describe an example where the Hypo-exponential distribution arises.

(b) Derive the p.d.f. of ∑_{i=1}^{n} Xi when n = 2, without using result (1.16).11

(c) Prove result (1.16) by taking advantage of the result derived in (b).12 •
Exercise 1.21 — C.d.f. and failure rate function of the Hypo-exponential distribution
Let ∑_{i=1}^{n} Xi ∼ Hypo-exponential(λ1, . . . , λn). Prove that:
11 See Ross (2003, p. 284).
12 This proof can be found in Ross (2003, pp. 285–286).
(a) P(∑_{i=1}^{n} Xi > x) = ∑_{i=1}^{n} Ci,n e^{−λi x}, x > 0;

(b) λ_{∑_{i=1}^{n} Xi}(x) = [∑_{i=1}^{n} Ci,n λi e^{−λi x}]/[∑_{i=1}^{n} Ci,n e^{−λi x}], x > 0;

(c) lim_{x→+∞} λ_{∑_{i=1}^{n} Xi}(x) = min{λ1, . . . , λn} (interpret this result!).13 •
Exercise 1.22 — M.g.f., expected value, variance, coefficient of variation of the
Hypo-exponential distribution
Let ∑_{i=1}^{n} Xi ∼ Hypo-exponential(λ1, . . . , λn). Prove that:

(a) M_{∑_{i=1}^{n} Xi}(t) = ∏_{i=1}^{n} [λi/(λi − t)], for t < min{λ1, . . . , λn};

(b) E(∑_{i=1}^{n} Xi) = ∑_{i=1}^{n} 1/λi;

(c) V(∑_{i=1}^{n} Xi) = ∑_{i=1}^{n} 1/λi^2;

(d) CV(∑_{i=1}^{n} Xi) = √(∑_{i=1}^{n} 1/λi^2)/(∑_{i=1}^{n} 1/λi).14 •
Proposition 1.23 — Mixtures of independent Exponentials: Hyper-exponential distribution
X is said to have a Hyper-exponential distribution if its p.d.f. is given by

fX(x) = ∑_{i=1}^{n} pi fXi(x), (1.17)

where: Xi indep∼ Exponential(λi), i = 1, . . . , n, with λi ≠ λj whenever i ≠ j; pi > 0 and ∑_{i=1}^{n} pi = 1. The Hyper-exponential distribution is an example of a mixture density.15 It is going to be represented for short by X ∼ Hyper-exponential(λ1, . . . , λn; p1, . . . , pn). •
13 The remaining lifetime of a hypo-exponentially distributed item that has survived to age x is, for very large x, approximately that of an exponentially distributed r.v. with parameter equal to the minimum of the parameters of the r.v. which are the summands of the Hypo-exponential (Ross, 2003, p. 286).
14 CV(∑_{i=1}^{n} Xi) < 1 (http://en.wikipedia.org/wiki/Hypoexponential_distribution).
15 Its name is due to the fact that the coefficient of variation of this distribution is greater than that of the Exponential distribution, whose coefficient of variation is 1 (http://en.wikipedia.org/wiki/Hyperexponential_distribution).
To see how such a r.v. might arise, consider a factory responsible for the production
of n types of batteries, with a type i battery lasting for an exponentially distributed time
with parameter λi, i = 1, . . . , n. Suppose further that pi represents the proportion of
produced batteries of type i (i = 1, . . . , n). If a battery is randomly chosen from the daily
production, then its lifetime X will have a Hyper-exponential distribution (Ross, 2003, p.
278).
Exercise 1.24 — C.d.f. and failure rate function of the Hyper-exponential
distribution
Let X ∼ Hyper-exponential(λ1, . . . , λn; p1, . . . , pn). After having described an(other) example where the Hyper-exponential distribution arises, prove that:

(a) P(X > x) = ∑_{i=1}^{n} pi e^{−λi x}, x > 0;

(b) λX(x) = [∑_{i=1}^{n} pi λi e^{−λi x}]/[∑_{i=1}^{n} pi e^{−λi x}];

(c) lim_{x→+∞} λX(x) = min{λ1, . . . , λn} (interpret this result!).16 •
Exercise 1.25 — M.g.f., expected value, second order moment and coefficient
of variation of the Hyper-exponential distribution
Let X ∼ Hyper-exponential(λ1, . . . , λn; p1, . . . , pn). Prove that:

(a) MX(t) = E(e^{tX}) = ∑_{i=1}^{n} pi λi/(λi − t), t < min{λ1, . . . , λn};

(b) E(X) = ∑_{i=1}^{n} pi/λi;

(c) E(X^2) = ∑_{i=1}^{n} 2pi/λi^2;

(d) CV(X) = √[∑_{i=1}^{n} 2pi/λi^2 − (∑_{i=1}^{n} pi/λi)^2]/(∑_{i=1}^{n} pi/λi) > 1. •
16 As a randomly chosen item ages, its failure rate function converges to the failure rate of the exponential type with the smallest failure rate, which is intuitive because the longer the item lasts, the more likely it is to be of the item type with the smallest failure rate (Ross, 2003, p. 279).
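The fact that the Hyper-exponential coefficient of variation exceeds 1 (in contrast with the Hypo-exponential case) is easy to see by simulation; the parameters below are illustrative assumptions.

```python
import random
import statistics

# Simulation sketch of the Hyper-exponential mixture: with the
# illustrative parameters below, CV(X) should exceed 1 and
# E(X) should be close to sum p_i/lam_i = 0.3/0.5 + 0.7/5 = 0.74.
random.seed(17)
lams = (0.5, 5.0)
probs = (0.3, 0.7)

def draw():
    # pick a component type with probability p_i, then draw its lifetime
    lam = lams[0] if random.random() < probs[0] else lams[1]
    return random.expovariate(lam)

sample = [draw() for _ in range(300_000)]
mean_hat = statistics.mean(sample)
cv_hat = statistics.pstdev(sample) / mean_hat
print(mean_hat, cv_hat)
```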
Proposition 1.26 — Random sums of i.i.d. Exponentials
Let:

• Xi i.i.d.∼ Exponential(λ), i ∈ N;

• N ∼ Geometric(p);

and

S = ∑_{i=1}^{N} Xi. (1.18)

If N is independent of {Xi : i ∈ N} then

S ∼ Exponential(λp) (1.19)

(Kulkarni, 1995, p. 198). •
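Result (1.19) can be illustrated by simulating a Geometric(p) number (support 1, 2, . . .) of i.i.d. exponential terms; lam = 2.0 and p = 0.25 below are illustrative assumptions.

```python
import random
import statistics

# Monte Carlo sketch of Proposition 1.26: S should behave as an
# Exponential(lam*p) r.v., here with lam*p = 0.5, hence mean 2, variance 4.
random.seed(271828)
lam, p = 2.0, 0.25

def geometric(p):
    # number of Bernoulli(p) trials until the first success
    n = 1
    while random.random() >= p:
        n += 1
    return n

sums = [sum(random.expovariate(lam) for _ in range(geometric(p)))
        for _ in range(200_000)]
mean_hat = statistics.mean(sums)      # approx. 1/(lam*p) = 2.0
var_hat = statistics.pvariance(sums)  # approx. 1/(lam*p)**2 = 4.0
print(mean_hat, var_hat)
```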
Exercise 1.27 — Random sums of i.i.d. Exponentials
After describing an example where a random sum of i.i.d. Exponentials can arise, prove
result (1.19). •
Exercise 1.28 — Random sums of i.i.d. Exponentials (bis)17
A machine is subject to a series of randomly occurring shocks. Assume that: the times
(in hours) between consecutive shocks are i.i.d. r.v. with Exponential distribution with
common parameter λ = 1/10; each shock has the same probability p = 0.3 of breaking
the machine.
(a) Identify the distribution of the time (in hours) until the machine breaks down, S.
(b) Compute the expected value of S and the probability that the machine operates for
more than E(S) hours. •
17 This exercise was inspired by Kulkarni (1995, Example 5.5, pp. 198–199).
If we drop the i.d. assumption and let N be an integer r.v. taking values in {1, . . . , m} in
Proposition 1.26, then we end up with another interesting r.v., which is a mixture of
Hypo-exponential distributions.
Proposition 1.29 — Random sums of independent Exponentials: Coxian r.v.
Let:

• Xn indep∼ Exponential(λn), n = 1, . . . , m, and suppose that λi ≠ λj for all i ≠ j;

• N be an integer r.v. with p.f. pn = P(N = n), n = 1, . . . , m.

If N is independent of {Xn : n = 1, . . . , m} then ∑_{j=1}^{N} Xj is said to be a Coxian r.v. and its p.d.f. is given by

f_{∑_{j=1}^{N} Xj}(x) = ∑_{n=1}^{m} [pn ∑_{i=1}^{n} Ci,n × λi e^{−λi x}], (1.20)

where Ci,n = ∏_{j≠i} λj/(λj − λi) (Ross, 2003, p. 287). •
Coxian r.v. arise as follows (Ross, 2003, p. 287). Suppose an item goes through m
treatment stages and after each stage there is a probability r(n) = P(N = n | N ≥ n)
that the item will be considered unfit to proceed to the next treatment stage. Moreover,
admit that the times spent in each treatment are independent exponential r.v. and that
the probability that the item, having just completed treatment stage n, is considered unfit
to proceed to the next stage of the treatment program equals r(n), regardless of the time
the item took to go through the n stages. Then the total time spent in treatment
is a Coxian r.v.
Exercise 1.30 — Coxian r.v.
Derive:
(a) result (1.20);
(b) the expected value of a Coxian r.v. •
1.2 Poisson process: definitions
The importance of the Poisson process is undisputed as a model for counting events
occurring one at a time.
This section is essentially devoted to three alternate definitions of the Poisson process,
in terms of:
1. the inter-event time distribution;
2. characteristics of the increments of the counting process and the distribution of the
number of events in the interval (0, t];
3. characteristics of the increments of the counting process and the probability of the
occurrence of i (i = 1, 2) events in an interval of infinitesimal range, say (0, h].
Definition 1.31 — Poisson process; 1st. definition (Kulkarni, 1995, p. 199)
Let:

(i) {Xi : i ∈ N} be a sequence of r.v. representing the inter-event times;

(ii) S0 = 0;

(iii) Sn = ∑_{i=1}^{n} Xi be the time of the occurrence of the nth event;

(iv) N(t) = max{n ∈ N0 : Sn ≤ t}, t ≥ 0 — i.e., N(t) represents the number of events
that have taken place in the interval (0, t].

If Xi i.i.d.∼ Exponential(λ) then the counting process {N(t) : t ≥ 0} is said to be a Poisson
process with rate λ — for short,

{N(t) : t ≥ 0} ∼ PP(λ).

•
Example 1.32 — Sample path of a Poisson process (Kulkarni, 1995, pp. 199–200)
This is a typical path of a Poisson process with rate λ, {N(t) : t ≥ 0}:
Note that N(0) = 0 and N(t) jumps by one at t = Sn, n ∈ N, thus, it has piecewise
constant sample paths (Kulkarni, 1995, p. 199). •
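The constructive definition above (cumulative sums of i.i.d. exponential inter-event times) translates directly into a simulation; lam = 2.0 and t = 5.0 below are illustrative assumptions, and both the mean and variance of N(t) should be close to λt.

```python
import random
import statistics

# Simulating the sample-path construction of Definition 1.31:
# N(t) counts the event epochs S_1, S_2, ... falling in (0, t].
random.seed(42)
lam, t = 2.0, 5.0

def count_events(lam, t):
    s, n = 0.0, 0
    while True:
        s += random.expovariate(lam)   # next inter-event time
        if s > t:
            return n
        n += 1

counts = [count_events(lam, t) for _ in range(100_000)]
mean_hat = statistics.mean(counts)      # approx. lam*t = 10
var_hat = statistics.pvariance(counts)  # approx. lam*t = 10 (Poisson property)
print(mean_hat, var_hat)
```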
Quiz 1.33 — Poisson process; 1st. definition
What is the distribution of Sn? •
Exercise 1.34 — Poisson process; 1st. definition (Ross, 2003, pp. 294–295)
Suppose that people immigrate into a territory according to a Poisson process with rate
λ = 1 person per day.
(a) What is the expected time until the 10th immigrant arrives to the territory?
(b) What is the probability that the elapsed time between the 10th and the 11th arrival
exceeds two days? •
Exercise 1.35 — Distribution of N(t) (Kulkarni, 1995, pp. 200–201)
Prove that N(t) ∼ Poisson(λt), for any fixed t ≥ 0, by capitalizing on the fact that the
nth event will occur prior to or at time t iff the number of events occurring by time t is at
least n (Ross, 2003, p. 294), i.e.,

N(t) ≥ n ⇔ Sn ≤ t. (1.21)

•
Exercise 1.36 — Distribution of N(t) (bis)
Suppose that customers arrive at a 24/7 shop according to a PP (λ), with λ = 10
customers per hour.18
(a) What is the distribution and the expected value of the number of arrivals in the
interval (0, 8]?
18 Inspired by Kulkarni (1995, p. 201).
(b) What is the probability that nobody arrives in the first 6 minutes? •
Definition 1.37 — Poisson process; 2nd. definition (Karr, 1993, p. 91; Kulkarni,
1995, p. 203)
A counting process {N(t) : t ≥ 0} is said to be a Poisson process with rate λ if:

(i) {N(t) : t ≥ 0} has independent and stationary increments;19

(ii) N(t) ∼ Poisson(λt). •
Remark 1.38 — Poisson process (Karr, 1993, p. 91)
Actually, N(t) ∼ Poisson(λt) follows from the fact that {N(t) : t ≥ 0} has independent
and stationary increments (see the proof in Billingsley, 2012, Section 23, pp. 297–309),
and is thus redundant in Definition 1.37. •
The independent and stationary increments make the calculations tractable (!!!) when
we are dealing with the Poisson process.
Example/Exercise 1.39 — Capitalizing on independent and stationary
increments
Admit that the times between any consecutive birth notifications are i.i.d. r.v. with
Exponential distribution with expected value equal to 2 hours.
(a) Determine the distribution, the expected value and the coefficient of variation of the
yearly number of birth notifications.
(b) Obtain the probability that there are no birth notifications in one day.
(c) Calculate the probability of 100 birth notifications in 3 days given than in the first 2
of those 3 days there were 80 birth notifications.
19 Recall that: {N(t) : t ≥ 0} has independent increments iff the r.v. N(t1), N(t2) − N(t1), . . . , N(tn) − N(tn−1) are independent, for 0 < t1 < · · · < tn; {N(t) : t ≥ 0} has stationary increments iff, for any fixed t ≥ 0, the distribution of the increment N(t + s) − N(s) is the same for all s ≥ 0 (Karr, 1993, p. 91).
• Stochastic process

{N(t) : t ≥ 0} ∼ PP(λ)

• R.v.

N(t) = number of birth notifications in (0, t]
N(t) ∼ Poisson(λt), where λ = 0.5 birth notifications per hour and t is in hours

• Requested probability

P(100 notif. in 3 days | 80 notif. in the first 2 of those 3 days)
= P[N(3 × 24) = 100 | N(2 × 24) = 80]
= P[N(3 × 24) = 100, N(2 × 24) = 80] / P[N(2 × 24) = 80]
= P[N(2 × 24) = 80, N(3 × 24) − N(2 × 24) = 100 − 80] / P[N(2 × 24) = 80]
(indep. incr.) = P[N(2 × 24) = 80] × P[N(3 × 24) − N(2 × 24) = 20] / P[N(2 × 24) = 80]
= P[N(3 × 24) − N(2 × 24) = 20]
(station. incr.) = P[N(3 × 24 − 2 × 24) = 20]
= P[N(24) = 20]
= e^{−0.5×24} (0.5 × 24)^20/20!
≈ 0.00968.

•
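The final number in the computation above is a Poisson(12) probability evaluated at 20, which can be double-checked directly:

```python
import math

# Double-checking the last step of Example 1.39: with lam = 0.5
# notifications per hour, P[N(24) = 20] is a Poisson(12) p.f. value at 20.
mu = 0.5 * 24                                    # = 12
p = math.exp(-mu) * mu**20 / math.factorial(20)
print(round(p, 5))  # approx. 0.00968
```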
Exercise 1.40 — Capitalizing on independent and stationary increments (bis)
A machine produces electronic components according to a Poisson process with rate equal
to 10 components per hour. Let N(t) be the number of produced components up to time
t.
Evaluate the probability of producing at least 8 components in the first hour given
that exactly 20 components have been produced in the first two hours. •
Exercise 1.41 — Distribution of N(t) and Sn (Pacheco, 2002, Example 19, p. 41)
Orders for laptops arrive at a computer manufacturing facility according to a Poisson
process with rate 0.5 orders per hour.
Compute the probability that more than 7 orders for laptops are received over a 4
hour period and the probability that the third laptop order arrives in the second hour of
operation. •
Exercise 1.42 — Distribution of N(t) (Ross, 2003, Exercises 37 and 38, p. 41)
Cars pass a certain point on the highway in accordance with a Poisson process with rate
λ = 3 cars per minute.

(a) If Harry blindly runs across the highway at that specific point, what is the
probability that he will be uninjured if the amount of time it takes him to cross the
highway is s seconds?20
Obtain and comment on the results for s = 2, 5, 10, 20.
(b) Now, suppose Harry is agile enough to escape from a single car, but if he encounters at
least two cars while attempting to cross the highway at that point he will be injured.
What is the probability that Harry will be unhurt if he takes s = 5, 10, 20, 30 seconds
to cross the highway? •
Exercise 1.43 — Capitalizing on independent and stationary increments; mean
and autocovariance functions of the Poisson process
Let N(t) : t ≥ 0 ∼ PP (λ).
(a) Obtain E[N(t)].
Is N(t) : t ≥ 0 a first order weakly stationary process?
(b) Verify that Cov(N(t), N(t+ s)) = λt and Cov(N(t), N(s)) = λmint, s, for t, s ≥ 0.
Are we dealing with a second order weakly stationary process?
20 Admit that if Harry is crossing the highway when a car passes by, then he will be injured.
(c) Define the r.v. E [N(t+ s) | N(t)]. •
The previous exercises suggest that we can capitalize on the fact that the
Poisson process has independent and stationary increments to derive the joint p.f. of
N(t1), . . . , N(tn).
Proposition 1.44 — Joint p.f. of N(t1), . . . , N(tn) in a Poisson process (Karr, 1993, p. 91)
Let {N(t) : t ≥ 0} ∼ PP(λ). Then, for 0 < t1 < · · · < tn and 0 = k0 ≤ k1 ≤ · · · ≤ kn,

P[N(t1) = k1, . . . , N(tn) = kn] = ∏_{j=1}^{n} e^{−λ(tj − tj−1)} [λ(tj − tj−1)]^{kj − kj−1}/(kj − kj−1)!, (1.22)

where t0 = 0 and k0 = 0. •
Exercise 1.45 — Joint p.f. of N(t1), . . . , N(tn) in a Poisson process
Prove Proposition 1.44 by taking advantage, namely, of the fact that a Poisson process
has independent and stationary increments.21 •
Exercise 1.46 — Joint p.f. of N(t1), . . . , N(tn) in a Poisson process (bis) (Kulkarni,
1995, p. 205)
Consider the customer arrival process described in Exercise 1.36 and determine the
probability that one customer arrives between 1:00PM and 1:06PM and two customers
arrive between 1:03PM and 1:12PM. •
Exercise 1.47 — Joint p.f. of N(t1), . . . , N(tn) in a Poisson process (bis, bis)
Let {N(t) : t ≥ 0} ∼ PP(λ).

(a) Prove that (N(s) | N(t) = n) ∼ Binomial(n, s/t), for 0 < s < t.

(b) Obtain E[N(t) | N(t + s)]. •
21 See, for example, Karr (1993, p. 92) or Kulkarni (1995, p. 204).
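The binomial conditional law in part (a) can be seen in a simulation: among simulated paths with N(t) = n events, count how many events fall in (0, s]. The values lam = 1.0, t = 10.0, s = 4.0, n = 8 below are illustrative assumptions.

```python
import random
import statistics

# Monte Carlo sketch of (N(s) | N(t) = n) ~ Binomial(n, s/t):
# conditional mean should be close to n*s/t = 8*0.4 = 3.2.
random.seed(1234)
lam, t, s, n = 1.0, 10.0, 4.0, 8
kept = []
for _ in range(200_000):
    epochs, clock = [], 0.0
    while True:
        clock += random.expovariate(lam)
        if clock > t:
            break
        epochs.append(clock)
    if len(epochs) == n:                     # condition on N(t) = n
        kept.append(sum(e <= s for e in epochs))

mean_hat = statistics.mean(kept)
print(len(kept), mean_hat)  # conditional mean approx. 3.2
```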
As we previously mentioned, the Poisson process can be alternatively defined in terms
of the characteristics of the increments of this counting process and the probability of the
occurrence of i (i = 1, 2) events in an interval of infinitesimal range.
Definition 1.48 — Poisson process; 3rd. definition (Ross, 1989, p. 212)
The counting process {N(t) : t ≥ 0} is said to be a Poisson process with rate λ if:

(i) N(0) = 0;

(ii) {N(t) : t ≥ 0} has independent and stationary increments;

(iii) P[N(h) = 1] = λh + o(h);22

(iv) P[N(h) ≥ 2] = o(h). •
Remark 1.49 — Poisson process; 3rd. definition (Ross, 2003, p. 292)
The explicit assumption that the process {N(t) : t ≥ 0} has stationary increments can be
eliminated (from Definition 1.48), as long as assumptions (ii), (iii) and (iv) in Definition
1.48 are replaced by:

(ii′) {N(t) : t ≥ 0} has independent increments;

(iii′) P[N(t + h) − N(t) = 1] = λh + o(h);

(iv′) P[N(t + h) − N(t) ≥ 2] = o(h). •
Exercise 1.50 — Poisson process; 2nd. and 3rd. definitions
Prove that Definitions 1.37 and 1.48 are equivalent.23
22 A function f is said to be o(h) if lim_{h→0} f(h)/h = 0 (Ross, 1989, p. 211). For instance, f(x) = x^2 is o(h) since lim_{h→0} f(h)/h = lim_{h→0} h = 0, whereas f(x) = x is not o(h) since lim_{h→0} f(h)/h = lim_{h→0} 1 = 1 ≠ 0.
23 See Ross (1989, pp. 212–214) or Ross (2003, pp. 291–292).
Exercise 1.51 — More on the Poisson process (Hastings, 2001, Example 2, pp.
125–126)
Suppose that cars travelling on an expressway arrive at a toll station according to a
Poisson process with rate λ = 5 cars per minute. Let N(t) be the number of cars that
arrive to the toll station in (0, t].
(a) Calculate P [N(2) > 8].
(b) Obtain P [N(1) < 5 | N(0.5) > 2].
(c) Assuming that λ is no longer known, find the largest possible arrival rate λ such that
the probability of 8 or more arrivals in the first minute does not exceed 0.1. •
Exercise 1.52 — More on the Poisson process (bis) (Hastings, 2001, Activity 5
and examples 2 and 3, pp. 268–269)
Admit that users of a computer lab arrive according to a Poisson process with rate λ = 15
users per hour. Let N(t) be the number of users that arrive to the computer lab in (0, t]
and eventually use Mathematica to answer the following questions.
(a) What is the standard deviation of the number of arrivals to the computer lab in a 2
hour period?
(b) Obtain the probability that there are 5 or fewer arrivals in the first half hour.
(c) Calculate the probability that the 3rd. arrival occurs somewhere between the first 10
and the first 20 minutes.
(d) Determine the probability that 10 or fewer users arrived in the first hour.
(e) What is the probability that the second user arrives within the first 6 minutes?
(f) Now assume that λ is no longer known and find the smallest value of the arrival rate
λ such that the probability of having at least 5 arrivals in the first hour is at least
0.95. •
1.3 Event times in Poisson processes
Definition 1.31 singles out important r.v. in a homogeneous Poisson process.
Remark 1.53 — Important r.v. in a Poisson process (Karr, 1993, pp. 88–89, 92–93)
Let {N(t) : t ≥ 0} be a Poisson process with rate λ. Then:

• Sn = inf{t : N(t) = n} represents the time of the occurrence of the nth event (e.g.
arrival), n ∈ N; S0 = 0;

• Xn = Sn − Sn−1 corresponds to the time between the (n − 1)th and the nth events (e.g.
interarrival time), n ∈ N.

We also know that N(t) ∼ Poisson(λt), t > 0, and:

• Sn ∼ Erlang(n, λ), n ∈ N;

• Xn i.i.d.∼ Exponential(λ), n ∈ N. •
Remark 1.54 — Relating N(t) and Sn in a Poisson process
Let us remind the reader that N(t) ≥ n ⇔ Sn ≤ t. Thus,

FSn(t) = FErlang(n,λ)(t) = P[N(t) ≥ n] = ∑_{j=n}^{+∞} e^{−λt} (λt)^j/j! = 1 − FPoisson(λt)(n − 1), n ∈ N. (1.23)

Hence, values of the c.d.f. of a r.v. with an Erlang distribution, such as Sn, may be obtained
using (tables with) values of the Poisson c.d.f. •
The next proposition provides an expression for the probability that the nth event in
one Poisson process occurs before the mth event in a second and independent Poisson
process.
Proposition 1.55 — Comparing event times of two independent Poisson processes (Ross, 2003, pp. 300–301)
Let:

• {N1(t) : t ≥ 0} and {N2(t) : t ≥ 0} be two independent Poisson processes with rates
λ1 and λ2 (respectively);

• S_n^(1) denote the time of the nth event of the first PP;

• S_m^(2) denote the time of the mth event of the second PP.

Then

P[S_n^(1) < S_m^(2)] = ∑_{k=n}^{n+m−1} C(n + m − 1, k) [λ1/(λ1 + λ2)]^k [λ2/(λ1 + λ2)]^{n+m−1−k}
= 1 − FBinomial(n+m−1, λ1/(λ1+λ2))(n − 1). (1.24)

•
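Formula (1.24) can be cross-checked against a direct simulation of the two Erlang event times; lam1 = 2.0, lam2 = 4.0, n = 1 and m = 2 below are illustrative assumptions (echoing the setting of Exercise 1.57).

```python
import math
import random

# Checking formula (1.24) against simulation.
random.seed(8)
lam1, lam2, n, m = 2.0, 4.0, 1, 2
q = lam1 / (lam1 + lam2)

# right-hand side of (1.24): a Binomial(n+m-1, q) tail probability
p_formula = sum(math.comb(n + m - 1, k) * q**k * (1 - q)**(n + m - 1 - k)
                for k in range(n, n + m))

# direct simulation of the two independent Erlang event times
runs = 300_000
hits = sum(sum(random.expovariate(lam1) for _ in range(n)) <
           sum(random.expovariate(lam2) for _ in range(m))
           for _ in range(runs))
p_sim = hits / runs
print(p_formula, p_sim)  # both approx. 1 - (2/3)**2 = 5/9
```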
Exercise 1.56 — Comparing event times of two independent PP
Consider the setting of Proposition 1.55.

(a) Show that P[S_1^(1) < S_1^(2)] = λ1/(λ1 + λ2), without using result (1.24).24

(b) Argue that P[S_2^(1) < S_1^(2)] = [λ1/(λ1 + λ2)]^2.25

(c) Prove (1.24), using the fact that S_n^(1) ∼ Erlang(n, λ1), S_m^(2) ∼ Erlang(m, λ2) and

FNegativeBinomial(r,p)(x) = ∑_{i=r}^{x} C(i − 1, r − 1) (1 − p)^{i−r} p^r = 1 − FBinomial(x,p)(r − 1) = FBinomial(x,1−p)(x − r).

•
24 Hint (Ross, 2003, pp. 300–301): S_1^(1) ∼ Exponential(λ1) is independent of S_1^(2) ∼ Exponential(λ2).
25 Hint (Ross, 2003, p. 301): Use result (a) and the lack of memory property of Poisson processes.
Exercise 1.57 — Comparing event times of two independent PP (bis)
Men and women enter a supermarket according to two independent Poisson processes
having respective rates two and four per minute.
Compute the probability that the first male customer arrives before the arrival of the
second female customer. •
Exercise 1.58 — Joint p.d.f. of event times of a Poisson process
Obtain the joint p.d.f. of S1, S2, S3.26 •
Suppose we are told that exactly one event of a Poisson process has taken place by
time t (i.e., N(t) = 1), and we are asked to determine the distribution of the time S1 at
which the event occurred — since the Poisson process possesses stationary and independent
increments, it is in fact reasonable that each interval in (0, t] of equal length should have
the same probability of containing the event (Ross, 2003, p. 301).
Proposition 1.59 — Conditional distribution of the first event time (Ross, 1989,
p. 223)
Let {N(t) : t ≥ 0} be a Poisson process with rate λ > 0. Then

(S1 | N(t) = 1) ∼ Uniform(0, t). (1.25)

•
Exercise 1.60 — Conditional distribution of the first event time
Prove Proposition 1.59 (Ross, 1989, p. 223; Ross, 2003, p. 302). •
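Proposition 1.59 can be visualized by simulation: among runs of a PP(λ) with exactly one event in (0, t], the event time S1 should behave as a Uniform(0, t) r.v. (mean t/2, variance t²/12). The values lam = 1.0 and t = 1.0 below are illustrative assumptions.

```python
import random
import statistics

# Monte Carlo sketch of Proposition 1.59, keeping only the runs
# with exactly one event in (0, t].
random.seed(321)
lam, t = 1.0, 1.0
s1_given_one = []
for _ in range(200_000):
    first = random.expovariate(lam)
    if first > t:
        continue                       # no event in (0, t]
    second = first + random.expovariate(lam)
    if second > t:                     # exactly one event in (0, t]
        s1_given_one.append(first)

mean_hat = statistics.mean(s1_given_one)      # approx. t/2 = 0.5
var_hat = statistics.pvariance(s1_given_one)  # approx. t**2/12
print(mean_hat, var_hat)
```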
Proposition 1.59 can be generalized and the joint distribution of event times S1, . . . , Sn,
given that exactly n events took place in (0, t], can be obtained (Kulkarni, 1995, p. 209).
26 Hint: Rewrite S1, S2, S3 in terms of X1, X2, X3 and capitalize on the fact that these r.v. are
independent and exponentially distributed.
41
Proposition 1.61 — Conditional distribution of the event times (Kulkarni, 1995,
pp. 208–209)
Let:
• N(t) : t ≥ 0 ∼ PP (λ);
• Sn the time of the nth event (n ∈ N);
• Y_i i.i.d. ∼ Uniform(0, t), i = 1, . . . , n.
Then
(S1, . . . , Sn | N(t) = n) ∼ (Y(1), . . . , Y(n)), (1.26)
that is,
f_{S1,...,Sn | N(t)=n}(s1, . . . , sn) = n!/t^n, (1.27)
for 0 < s1 < · · · < sn < t and n ∈ N. •
Remark 1.62 — Conditional distribution of the event times (Ross, 1989, p. 224)
Proposition 1.61 is usually paraphrased as stating that, under the condition that n events
have occurred in (0, t], the times S1, . . . , Sn at which events occur behave as the order
statistics Y(1), . . . , Y(n) associated with Y_i i.i.d. ∼ Uniform(0, t).
This result is a particular case of Campbell's theorem. For the statement of this
theorem, please refer to http://www.stats.gla.ac.uk/glossary/?q=node/43. •
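Proposition 1.61 lends itself to a quick Monte Carlo check: condition a simulated PP(λ) path on N(t) = n (by rejection) and compare the average first event time with E[Y(1)] = t/(n + 1), the mean of the minimum of n i.i.d. Uniform(0, t) r.v. (see footnote 27). A minimal sketch; the parameters λ = 1, t = 3, n = 3 are illustrative, not from the text:

```python
import random

def arrival_times_given_n(lam, t, n, rng):
    """Sample PP(lam) arrival times on (0, t], retrying until the path
    has exactly n events in (0, t] (rejection sampling of N(t) = n)."""
    while True:
        times, s = [], rng.expovariate(lam)
        while s <= t:
            times.append(s)
            s += rng.expovariate(lam)
        if len(times) == n:
            return times

rng = random.Random(1)
lam, t, n, reps = 1.0, 3.0, 3, 20000
mean_s1 = sum(arrival_times_given_n(lam, t, n, rng)[0] for _ in range(reps)) / reps
# Proposition 1.61 plus the order-statistics moments in footnote 27 give
# E[S_1 | N(3) = 3] = 1 * 3 / (3 + 1) = 0.75; mean_s1 should be close to that.
```

Replacing the index 0 by k − 1 checks E[Sk | N(t) = n] = kt/(n + 1) in the same way.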
Exercise 1.63 — Conditional distribution of the event times
Prove Proposition 1.61 (Ross, 1989, p. 224; Ross, 2003, p. 303; Kulkarni, 1995, pp. 209–
210).27 •
^27 Recall the following results. Let X_i i.i.d. ∼ X, i = 1, . . . , n, where X is a continuous r.v. with p.d.f.
f_X(x) and c.d.f. F_X(x). Then (Rohatgi, 1976, pp. 150–152):
f_{X(1),...,X(n)}(x(1), . . . , x(n)) = n! × ∏_{i=1}^{n} f_X(x(i)), for x(1) ≤ · · · ≤ x(n);
F_{X(i)}(x) = 1 − F_Binomial(n, F_X(x))(i − 1), for i = 1, . . . , n;
f_{X(i)}(x) = [n!/((i−1)! (n−i)!)] × [F_X(x)]^{i−1} × [1 − F_X(x)]^{n−i} × f_X(x), for i = 1, . . . , n.
Moreover, when X ∼ Uniform(0, t) we get E[X(k)] = kt/(n+1) (Kulkarni, 1995, p. 209).
Exercise 1.64 — Conditional distribution of the event times (Kulkarni, 1995,
examples 5.10 and 5.11, pp. 210–212)
(a) Compute P [S1 > s | N(t) = n].
(b) Find E[Sk | N(t) = n], for k = 1, . . . , n, and also for k = n + 1, n + 2, . . .
(c) Obtain E[S1 | N(t) = 2, S1 ≤ 4, S2 ≤ 10], t ≥ 10. •
1.4 Merging and splitting Poisson processes
The operation of merging two counting processes to generate a new process is also called
superposition (Kulkarni, 1995, p. 214).
Suppose we merge two independent PP. Is the combined process another PP?
Yes!
Proposition 1.65 — Merging independent Poisson processes (Kulkarni, 1995, p.
214)
Let N1(t), t ≥ 0 and N2(t), t ≥ 0 be two independent Poisson processes with rates
λ1 and λ2, respectively. Then the merged process N(t) = N1(t) + N2(t), t ≥ 0 is a
Poisson process with rate λ1 + λ2.
•
Kulkarni (1995, Theorem 5.5, p. 214) generalizes Proposition 1.65 to the superposition
of r independent Poisson processes.
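Proposition 1.65 can also be checked numerically: superpose two independent simulated PP and verify that the merged count over (0, t] has Poisson mean and variance λ1 + λ2. A minimal sketch; the rates 2 and 3 are illustrative:

```python
import random

def pp_count(lam, t, rng):
    """Number of events of a PP(lam) in (0, t], built from i.i.d. exponential gaps."""
    n, s = 0, rng.expovariate(lam)
    while s <= t:
        n += 1
        s += rng.expovariate(lam)
    return n

rng = random.Random(7)
lam1, lam2, t, reps = 2.0, 3.0, 1.0, 100000
merged = [pp_count(lam1, t, rng) + pp_count(lam2, t, rng) for _ in range(reps)]
mean = sum(merged) / reps
var = sum((m - mean) ** 2 for m in merged) / reps
# Proposition 1.65: the merged count should look Poisson(lam1 + lam2) = Poisson(5),
# so both the sample mean and the sample variance should be close to 5.
```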
Quiz 1.66 — Merging independent Poisson processes
Merging (or superposition) of Poisson processes arises, for example, when customers arrive
at a service facility from different sources — each source generating a Poisson stream
(Kulkarni, 1995, p. 214).
Give more examples of the superposition of PP. •
Exercise 1.67 — Merging two independent Poisson processes
(a) Prove Proposition 1.65 using Definition 1.31 or 1.37.
(b) Show that the probability that the first event in the merged process comes from the
first process is equal to λ1/(λ1 + λ2). •
Exercise 1.68 — Merging two independent PP; comparing event times of two
independent PP (Pacheco, 2002, p. 45)
Suppose that a manufacturing facility of Exercise 1.41 also produces desktops and that
the orders of desktops arrive according to a Poisson process, with rate equal to 1 desktop
per hour and independent of the process of orders of laptops.
(a) Obtain the probability that the total number of orders does not exceed 2 in the
interval (5, 8].
(b) Compute the probability that the 3rd desktop is ordered before the 2nd laptop is
ordered. •
Exercise 1.69 — Merging two independent PP; comparing event times of two
independent PP (bis)
Men and women enter a supermarket according to two independent Poisson processes
having respective rates two and four per minute.
(a) What is the probability that the number of arrivals (men and women) exceeds ten in
the first 20 minutes?
(b) Starting at an arbitrary time, compute the probability that the second man arrives
before the third woman arrives (Ross, 1989, Exercise 20, p. 242). •
Exercise 1.70 — Merging more than two independent Poisson process (bis)
(Kulkarni, 1995, Example 5.14, pp. 215–216)
Jobs are submitted by 4 distinct and independent sources for execution on a central
computer. The jobs arrive from source i according to a Poisson process with rate
λ_i = 1/10, 1/15, 1/30, 1/60 jobs per minute, for i = 1, . . . , 4, respectively.
(a) Let N(t) be the total number of jobs submitted for execution up to time t.
Characterize the stochastic process N(t) : t ≥ 0.
(b) What is the probability that no jobs arrive in a 10-minute interval?
(c) Obtain:
(i) P [N(10) = 5 | N(5) = 2];
(ii) P [N(5) = 2 | N(10) = 5];
(iii) P [N(10) < 6 | N(5) > 3]. •
Exercise 1.71 — Sampling a Poisson (process)
The number of signals emitted by a source in (0, t], say N(t), has a Poisson(λt)
distribution. Suppose that each signal is recorded by a receptor with probability p,
regardless of the remaining signals. Let N1(t) (resp. N(t) − N1(t)) be the number of
signals recorded (resp. unrecorded) by the receptor up to time t.

(a) Obtain the p.f. of N1(t) (resp. N(t) − N1(t)) conditional on N(t) = n and derive the
marginal distribution of N1(t) (resp. N(t) − N1(t)).

(b) Are (N1(t) | N(t) = n) and (N(t) − N1(t) | N(t) = n) (conditionally) independent r.v.?
(c) What about N1(t) and N(t)−N1(t)? Are they independent r.v.? •
The operation of generating two counting processes out of a single counting process is
called splitting (Kulkarni, 1995, p. 214).28
Are the two processes resulting from splitting a Poisson process also PP?
Yes!
Proposition 1.72 — Splitting a Poisson process (or sampling a Poisson
process) (Ross, 1989, p. 217)
Let N(t) : t ≥ 0 be a Poisson process with rate λ. Splitting the original Poisson process
based on a selection probability p yields two independent Poisson processes with rates
λp and λ(1− p).
We can also add that:
(N1(t)|N(t) = n) ∼ Binomial(n, p); (1.28)
(N2(t)|N(t) = n) ∼ Binomial(n, 1− p). (1.29)
•
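Proposition 1.72 can be illustrated by simulation: generate a PP(λ) on (0, t] and tag each event independently with probability p. A sketch; λ = 10 and p = 0.3 are illustrative choices:

```python
import random

rng = random.Random(3)
lam, p, t, reps = 10.0, 0.3, 1.0, 50000
pairs = []
for _ in range(reps):
    n, s = 0, rng.expovariate(lam)
    while s <= t:
        n += 1
        s += rng.expovariate(lam)
    n1 = sum(1 for _ in range(n) if rng.random() < p)  # Bernoulli(p) selection
    pairs.append((n1, n - n1))
m1 = sum(a for a, _ in pairs) / reps
m2 = sum(b for _, b in pairs) / reps
cov = sum((a - m1) * (b - m2) for a, b in pairs) / reps
# Proposition 1.72: E[N1(t)] = lam*p*t = 3, E[N2(t)] = lam*(1-p)*t = 7, and the
# sample covariance should be near 0, in line with independence of the split processes.
```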
Exercise 1.73 — Splitting a Poisson process
Prove Proposition 1.72 (Ross, 1989, pp. 218–219). •
Example 1.74 — Splitting a Poisson process (Ross, 1989, Example 3c, p. 220)
If immigrants to area A arrive at a Poisson rate of ten per week, and if each immigrant is
of English descent with probability 1/12 (independently of the remaining immigrants), then
what is the probability that no people of English descent will immigrate to area A during
the month of February ?
^28 Splitting is obviously the opposite of superposition (Kulkarni, 1995, p. 217).
• Stochastic process
N(t) : t ≥ 0 ∼ PP (λ)
• R.v.
N(t) = number of immigrants to area A up to time t
N(t) ∼ Poisson(λt)
λ = 10 people per week
• Split process
N1(t) : t ≥ 0 ∼ PP (λp)
• R.v.
N1(t) = number of immigrants to area A with English descent up to time t
N1(t) ∼ Poisson(λpt)
p = P(selecting immigrant of English descent) = 1/12
λp = 10/12 = 5/6 immigrants of English descent per week
• Requested probability
P[N1(4) = 0] = e^{−(5/6)×4}
             = e^{−10/3}.
•
Exercise 1.75 — Splitting a Poisson process (bis) (Pacheco, 2002, Example 20, p.
43)
Suppose that in Exercise 1.41 each order is correctly processed with probability 0.98,
independently of other orders.
Compute the probability that at least one order is not correctly processed in the first
24h of operation. •
Exercise 1.76 — Splitting a Poisson process (bis, bis) (Kulkarni, 1995, Example,
5.16, pp. 219–220)
Suppose radioactive particles arrive to a Geiger counter according to a Poisson process
having rate λ = 103 particles per second and the counter fails to register a particle with
probability 0.1, independent of everything else.
What is the probability that the total number of radioactive particles that arrived at
the Geiger counter is greater than 5 in one-hundredth of a second, given that in the
same time interval the Geiger counter registered exactly 4 radioactive particles? •
Exercise 1.77 — Splitting a Poisson process (bis, bis, bis)
Inquiries arrive at a recorded message device according to a Poisson process of rate 15
inquiries per minute.
(a) Find the probability that in a one-minute period, 3 inquiries arrive during the first
10 seconds and 2 inquiries arrive during the last 15 seconds.
(b) Admit that 25% of those inquiries are actually complaints. If 10 inquiries have
arrived to the recorded message device in a one-minute period, what is the probability
that at least 3 of those 10 inquiries are complaints? •
Exercise 1.78 — More on splitting a Poisson process (Ross, 1989, p. 243, Exercise
23)
Cars pass a point on a highway at a Poisson rate of one per minute. If five percent of the
cars on the road are Dodges, then:
(a) What is the probability that at least one Dodge passes during an hour?
(b) If 50 cars have passed by an hour, what is the probability that five of them were
Dodges?
(c) Given that ten Dodges have passed by in an hour, obtain the expected value of the
number of cars to have passed by in that time. •
We could also be interested in studying the nature of the split processes Ni(t) : t ≥ 0,
i = 1, . . . , r, resulting from the classification of events into r distinct types.

The process of classification is called the splitting mechanism; we are going to focus on
the Bernoulli splitting mechanism, under which each event is classified as a type i event
with probability pi — independent of every other event — where pi > 0 and Σ_{i=1}^{r} pi = 1
(Kulkarni, 1995, p. 218).
Proposition 1.79 — Splitting a Poisson process in r processes (Kulkarni, 1995,
pp. 218–219)
Let N(t) : t ≥ 0 ∼ PP (λ) and Ni(t) : t ≥ 0, i = 1, . . . , r, be the split processes
generated by the Bernoulli splitting mechanism. Then
Ni(t) : t ≥ 0 ∼ PP (λ× pi). (1.30)
Moreover, the r processes Ni(t) : t ≥ 0 are independent PP and
(N1(t), . . . , Nr(t)|N(t) = n) ∼ Multinomial(n, (p1, . . . , pr)). (1.31)
•
Exercise 1.80 — Splitting a Poisson process in r processes
Prove Proposition 1.79 (Kulkarni, 1995, p. 219). •
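The r-way Bernoulli splitting of Proposition 1.79 can be checked the same way: each event of a simulated PP(λ) is classified into one of r = 3 types. A sketch; λ = 6, t = 2 and the probabilities p_i below are illustrative:

```python
import random

rng = random.Random(11)
lam, t, probs, reps = 6.0, 2.0, [0.5, 0.3, 0.2], 40000
totals = [0, 0, 0]
for _ in range(reps):
    n, s = 0, rng.expovariate(lam)
    while s <= t:
        n += 1
        s += rng.expovariate(lam)
    for _ in range(n):               # Bernoulli splitting mechanism: classify each event
        u, i, acc = rng.random(), 0, probs[0]
        while u >= acc and i < len(probs) - 1:
            i += 1
            acc += probs[i]
        totals[i] += 1
means = [tot / reps for tot in totals]
# Proposition 1.79: E[N_i(t)] = lam * p_i * t, i.e. about [6.0, 3.6, 2.4] here.
```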
Can we consider a Bernoulli splitting of a PP in two processes such that p depends on
the time at which the event took place?
Yes!
In this case we are dealing with what is called a non-homogeneous Bernoulli splitting
mechanism (Kulkarni, 1995, p. 220).
Definition 1.81 — Non-homogeneous Bernoulli splitting mechanism (Kulkarni,
1995, p. 220)
Let:
• p : R_0^+ → [0, 1] be a pre-specified function;
• N(t) : t ≥ 0 ∼ PP (λ).
Under the non-homogeneous Bernoulli splitting mechanism an event that took place at
time s is registered with probability p(s), regardless of the remaining events. •
Proposition 1.82 — Non-homogeneous Bernoulli splitting (Kulkarni, 1995, p.
220)
Consider:
• N(t) : t ≥ 0 ∼ PP (λ);
• a non-homogeneous Bernoulli splitting mechanism associated to a pre-specified
function p : R_0^+ → [0, 1];
• N1(t) the number of registered events during (0, t] under the non-homogeneous
Bernoulli splitting mechanism.
Then
N1(t) ∼ Poisson(λ ∫_0^t p(s) ds). (1.32)
•
Exercise 1.83 — Non-homogeneous Bernoulli splitting
Prove Proposition 1.82 (Kulkarni, 1995, pp. 220–221). •
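Proposition 1.82 can be checked numerically: register each simulated event occurring at time s with probability p(s), and compare the registered count with a Poisson r.v. of mean λ ∫_0^t p(s) ds. A sketch with the illustrative choices λ = 8, t = 2 and p(s) = s/t, for which λ ∫_0^2 p(s) ds = 8:

```python
import random

rng = random.Random(5)
lam, t, reps = 8.0, 2.0, 40000
counts = []
for _ in range(reps):
    n1, s = 0, rng.expovariate(lam)
    while s <= t:
        if rng.random() < s / t:   # register the event at time s with prob p(s) = s/t
            n1 += 1
        s += rng.expovariate(lam)
    counts.append(n1)
mean = sum(counts) / reps
var = sum((c - mean) ** 2 for c in counts) / reps
# Proposition 1.82: N1(2) ~ Poisson(lam * integral of s/2 over (0, 2]) = Poisson(8),
# so both sample mean and sample variance should be close to 8.
```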
Exercise 1.84 — Non-homogeneous Bernoulli splitting (Ross, 2003, p. 308)
Let us suppose that individuals contract HIV in accordance with a Poisson process having
rate λ. Suppose:
• the incubation period of the HIV (i.e., the time elapsed from exposure to the HIV
until the individual shows the first symptoms and signs) is a r.v. with distribution
G;
• the incubation periods of the HIV in different infected individuals are i.i.d. r.v.
What is the distribution of N1(t), the number of individuals who have shown symptoms
by time t? •
Exercise 1.85 — Non-homogeneous Bernoulli splitting (bis) (Kulkarni, 1995, pp.
221–222)
Suppose users arrive at a public library according to a Poisson process with rate λ. Admit
that the amount of time a user spends in the library is a r.v. with c.d.f. G and independent
of the times spent by the other users in the library.
Assuming that the library opened at time 0 and never closes:
(a) What is the distribution and the expected value of the number of users in the library
at time t?
(b) Obtain the probability that no users are in the library at time t. •
1.5 Non-homogeneous Poisson process
Assuming a constant arrival rate is rather unrealistic! Therefore it is pertinent to ask
ourselves whether a counting process, obtained by allowing the arrival rate at time t to be a
function of t, is easily manageable.

Yes, it is!
When λ is replaced with a non-negative function λ(t), we are dealing with the first of the
three generalizations of the Poisson process.
Definition 1.86 — Non-homogeneous Poisson process (Ross, 1989, p. 234)
The counting process N(t) : t ≥ 0 is said to be a non-homogeneous Poisson process
with intensity function λ(t), t ≥ 0 — for short, N(t) : t ≥ 0 ∼ NHPP (λ(t)) — if:
• N(0) = 0;
• N(t) : t ≥ 0 has independent increments;
• P [N(t+ h)−N(t) = 1] = λ(t)× h+ o(h), t ≥ 0;
• P [N(t+ h)−N(t) ≥ 2] = o(h), t ≥ 0. •
Proposition 1.87 — Non-homogeneous Poisson process (Ross, 2003, p. 316)
Let N(t) : t ≥ 0 ∼ NHPP (λ(t)). Then
N(t + s) − N(s) ∼ Poisson(∫_s^{t+s} λ(z) dz), s ≥ 0, t > 0. (1.33)
•
Remark 1.88 — Mean value function and relevance of a non-homogeneous
Poisson process (Ross, 2003, pp. 316, 318)
• Let N(t) : t ≥ 0 ∼ NHPP (λ(t)) and
m(t) = ∫_0^t λ(z) dz. (1.34)
Then
N(t+ s)−N(s) ∼ Poisson(m(t+ s)−m(s))
N(t) ∼ Poisson(m(t)).
Unsurprisingly, m(t) is called the mean value function of the non-homogeneous
Poisson process.

• The relevance of the non-homogeneous Poisson process is essentially due to the
fact that the condition of stationary increments was dropped; the possibility that
events are more likely to occur at certain times than others is now allowed! •
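A standard way to simulate a NHPP(λ(t)) — and a direct application of the non-homogeneous Bernoulli splitting idea of Section 1.4 — is thinning: simulate a homogeneous PP(lam_bar) with lam_bar ≥ λ(s) and keep an event occurring at time s with probability λ(s)/lam_bar. A sketch; the intensity 3 + 2s and the bound 7 are illustrative, not from the text:

```python
import random

def nhpp_times(lam_fn, lam_bar, t_max, rng):
    """Sample NHPP event times on (0, t_max] by thinning a homogeneous
    PP(lam_bar): keep a candidate event at time s with probability
    lam_fn(s)/lam_bar (requires lam_fn(s) <= lam_bar on (0, t_max])."""
    times, s = [], 0.0
    while True:
        s += rng.expovariate(lam_bar)
        if s > t_max:
            return times
        if rng.random() < lam_fn(s) / lam_bar:
            times.append(s)

def lam_fn(s):
    # illustrative intensity (an assumption), bounded by 7 on (0, 2]
    return 3.0 + 2.0 * s

rng = random.Random(2)
reps, t_max = 30000, 2.0
mean = sum(len(nhpp_times(lam_fn, 7.0, t_max, rng)) for _ in range(reps)) / reps
# Proposition 1.87: N(2) ~ Poisson(m(2)), with m(2) = integral of (3 + 2z) over (0, 2] = 10.
```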
Example 1.89 — Non-homogeneous Poisson process
A souvenir shop is open from 10:00 to 16:00. Admit that customers enter this shop
according to a non-homogeneous Poisson process with a time dependent rate described in
the following table:
Period Rate
10:00–12:00 6
12:00–14:00 15
14:00–16:00 linearly decreases from 15 to 10
(a) Identify the intensity function of the arrival process.
• Stochastic process and r.v.
N(t) : t ≥ 0 ∼ NHPP (λ(t))
N(t) = number of customer arrivals by time t
• Intensity function (or time dependent rate)
λ(t) =
  0, 0 < t ≤ 10
  6, 10 < t ≤ 12
  15, 12 < t ≤ 14
  15 + (t − 14) × (10 − 15)/(16 − 14) = 15 − 2.5(t − 14), 14 < t ≤ 16
  0, 16 < t ≤ 24.
(b) Find the probability that no customers enter the shop between 13:00 and 15:00.
• Requested probability
According to Proposition 1.87, N(t + s) − N(s) ∼ Poisson(∫_s^{t+s} λ(z) dz). Thus,

P[N(15) − N(13) = 0] = e^{−∫_13^15 λ(z) dz}
                     = e^{−(∫_13^14 15 dz + ∫_14^15 [15 − 2.5(z−14)] dz)}
                     = e^{−28.75}
                     ≈ 0.
•
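The computation in Example 1.89 can be reproduced numerically by integrating the intensity function over (13, 15]; a sketch using a simple midpoint-rule quadrature:

```python
import math

def lam(z):
    # intensity from the table in Example 1.89 (customers per hour, clock time z)
    if 10 < z <= 12:
        return 6.0
    if 12 < z <= 14:
        return 15.0
    if 14 < z <= 16:
        return 15.0 - 2.5 * (z - 14.0)
    return 0.0

def midpoint(f, a, b, n=4000):
    # midpoint-rule quadrature, exact here since lam is piecewise linear
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

m = midpoint(lam, 13.0, 15.0)
p_zero = math.exp(-m)
# m comes out as 28.75, so P[N(15) - N(13) = 0] = e^{-28.75}, which is essentially 0.
```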
Exercise 1.90 — Non-homogeneous Poisson process (Ross, 2003, Example 5.22,
pp. 318–319)
Harry owns a vegetarian food stand that opens at 8AM.
• From 8AM until 11AM customers seem to arrive at a linearly increasing rate that
starts with 5 customers per hour at 8AM and reaches a maximum of 20 customers
per hour at 11AM.
• From 11AM until 1PM the arrival rate remains constant at 20 customers per hour.
• The arrival rate then drops linearly from 1PM until closing time at 5PM at which
time it has the value of 12 customers per hour.
(a) If we assume that the numbers of customers arriving at Harry’s stand during non
overlapping periods are independent, then what is a good probability model for the
number of customers arriving in the interval (0, t]?
(b) What is the probability that no customers arrive between 8:30AM and 9:30AM on
Monday morning?
(c) Obtain the expected number of arrivals in the period mentioned in (b). •
Exercise 1.91 — Non-homogeneous Poisson process (bis)
Admit that a travel agency is open from 8:00 to 17:00 and customers arrive to it according
to a non-homogenous process and that the time dependent arrival rate:
• equals 4 customers per hour, from 8:00 and 10:00;
• is of 8 customers per hour, from 10:00 and 12:00;
• linearly increases from 8 to 10 customers per hour, from 12:00 to 14:00;
• linearly decreases from 10 to 4 customers per hour, from 14:00 to 17:00.
(a) Determine the intensity function of the process and find the expected number of
arriving customers during a whole day.
(b) Calculate the probability that the number of arrivals between 13:00 and 15:00 exceeds
5. What is the probability that there are no arrivals during this period? •
Exercise 1.92 — Non-homogeneous Poisson process (bis, bis)
Consider a non-homogeneous Poisson process with mean value function given by
m(t) = t^2 + 2t, t ≥ 0.
(a) Determine the probability that exactly n events occurs in (4, 5].
(b) Obtain the intensity function of the process. •
It is easy to compute joint and conditional distributions of a non-homogeneous Poisson
process due to the independence of increments (Kulkarni, 1995, p. 225). Thus, we state
now a result similar to Proposition 1.44.
Proposition 1.93 — Joint and conditional distributions in a non-homogeneous
Poisson process
Let N(t) : t ≥ 0 ∼ NHPP (λ(t)). Then, for 0 < t1 < · · · < tn and 0 ≤ k1 ≤ · · · ≤ kn,
P[N(t1) = k1, . . . , N(tn) = kn] = ∏_{j=1}^{n} e^{−[m(tj) − m(tj−1)]} [m(tj) − m(tj−1)]^{kj − kj−1} / (kj − kj−1)!, (1.35)

where k0 = 0, t0 = 0 and m(t) = ∫_0^t λ(z) dz (Kulkarni, 1995, pp. 225–226). Moreover,
(N(s) | N(t) = n) ∼ Binomial(n, m(s)/m(t)), for 0 < s < t and n ∈ N. •
Exercise 1.94 — Joint p.f. of N(t1), . . . , N(tn) in a non-homogeneous Poisson
process29
Suppose that customers arrive to do business at a bank according to a non-homogeneous
Poisson process with time dependent rate λ(z) = {20 + 10 cos[2π(z − 9.5)]} × I_[9,17](z).
What is the probability that twenty customers arrive between 9:30 and 10:30, and
another twenty arrive in the following half hour? •
^29 Inspired by www.maths.uq.edu.au/courses/STAT3004/PastYears/notes/poisson_processes.pdf
Exercise 1.95 — Conditional distribution of (N(s) | N(t) = n)
The number of arrivals to a shop is governed by a Poisson process with time dependent
rate
λ(t) =
  4 + 2t, 0 ≤ t ≤ 4
  24 − 3t, 4 < t ≤ 8.
(a) Draw the graphs of λ(t) and m(t), for 0 ≤ t ≤ 8.
(b) Derive the probability of no arrivals in the interval (3, 5].
(c) Determine the expected value of the number of arrivals in the last 5 opening hours
(i.e., in the interval (3, 8]), given that 15 customers have arrived in the last 3 opening
hours (that is, in the interval (5, 8]).
(d) Given that 60 customers visited the shop during those 8 opening hours, find an
approximate value to the probability that more than 40 of those 60 customers arrived
in the interval (0, 6]. •
Exercise 1.96 — Epochs of a non-homogeneous Poisson process (Ross, 2003, p.
321)
Let N(t) : t ≥ 0 ∼ NHPP (λ(t)) and Sn the time of the nth arrival (n ∈ N). Prove that
f_{Sn}(t) = λ(t) e^{−m(t)} [m(t)]^{n−1} / (n − 1)!,

where m(t) = ∫_0^t λ(z) dz. •
We can generalize Proposition 1.61 and derive the conditional distribution of the event
times S1, . . . , Sn given N(t) = n, in a non-homogeneous Poisson process.
Proposition 1.97 — Conditional distribution of the event times in a non-
homogeneous Poisson process
Let:
• N(t) : t ≥ 0 ∼ NHPP(λ(t)) and m(t) = ∫_0^t λ(z) dz its mean value function;
• Sn the time of the nth event (n ∈ N);
• Y_i i.i.d. ∼ Y, i = 1, . . . , n, where P(Y ≤ u) = m(u)/m(t), 0 ≤ u ≤ t.
Then
(S1, . . . , Sn | N(t) = n) ∼ (Y(1), . . . , Y(n)). (1.36)
•
Exercise 1.98 — Inter-event times in a non-homogeneous Poisson process
Let Xi be the time between the (i− 1)th and ith events of a NHPP (λ(t)).
Are the r.v. Xi, i ∈ N, identically distributed?^30 •
Proposition 1.99 — Inter-event times in a non-homogeneous Poisson process
(Kulkarni, 1995, p. 227)
Let N(t) : t ≥ 0 ∼ NHPP(λ(t)) and Xn+1 the time between the nth and (n + 1)th
events (n ∈ N). Then

P(Xn+1 > t) = P(Sn+1 − Sn > t)
            = ∫_0^{+∞} λ(s) e^{−m(t+s)} [m(s)]^{n−1} / (n − 1)! ds, (1.37)

where m(t) = ∫_0^t λ(z) dz. •
Exercise 1.100 — Inter-event times in a non-homogeneous Poisson process
Prove Proposition 1.99. •
Exercise 1.101 — More on the non-homogeneous Poisson process (bis, bis)
Let N(t) : t ≥ 0 ∼ NHPP(λ(t)), where the intensity function λ(t) is positive for
t ≥ 0. Now, define N*(t) = N(m^{−1}(t)), where m(t) = ∫_0^t λ(z) dz denotes the mean value
function.

Prove that N*(t) : t ≥ 0 ∼ PP(1). Comment on this result.^31 •

^30 The r.v. Xi are neither independent nor identically distributed; for example, P(X1 > t) = e^{−m(t)}
(Kulkarni, 1995, p. 226) and P(X2 > t) = ∫_0^{+∞} λ(s) e^{−m(t+s)} ds.
^31 If we re-scale time in a non-homogeneous Poisson process with mean value function m(t) by taking
m^{−1}(t) instead of t, then we end up dealing with a homogeneous Poisson process with unit rate.
Exercise 1.102 — The output process of an infinite server Poisson queue and
the non-homogeneous Poisson process (Ross, 2003, Example 5.23, p. 320)
Prove that the output process of the M/G/∞ queue — i.e., the number of customers
who (by time t) have already left the infinite server queue with Poisson arrivals and
general service d.f. G — is a non-homogeneous Poisson process with intensity function
λ(t) = λG(t). •
1.6 Conditional Poisson process
What happens if the arrival rate is a positive r.v.? Is the resulting stochastic process
mathematically tractable?
Yes!
In this case we end up dealing with another generalization of the Poisson process.
Definition 1.103 — Conditional Poisson process (Ross, 1983, pp. 49–50)
Let:
• Λ be a positive r.v. having c.d.f. G;
• N(t) : t ≥ 0 be a counting process such that, given that Λ = λ, N(t) : t ≥ 0
is a Poisson process with rate λ.
Then N(t) : t ≥ 0 is called a conditional (or mixed) Poisson process and^32

P[N(t + s) − N(s) = n] = ∫_0^{+∞} e^{−λt} (λt)^n / n! dG(λ), (1.38)

for s ≥ 0. •

^32 What follows is a Lebesgue–Stieltjes integral, equivalent to the Riemann–Stieltjes integral, which is
particularly common in probability theory when G is the c.d.f. of a real-valued r.v. such as Λ. For more
details, the reader is referred to Morais (2011, Subsection 4.2.1) and links therein.
Remark 1.104 — Conditional Poisson process
N(t) : t ≥ 0 is not a homogeneous PP (Ross, 1983, p. 50). Even though it has stationary
increments, it does not, in general, have independent increments:

• the stationarity of the increments follows immediately from (1.38) (Ross, 2003, p. 327),
which does not depend on the origin s of the time interval (s, t + s]; thus, P[N(t) = n] is
also given by (1.38);

• knowing how many events occur in an interval gives information about the possible
value of the random arrival rate Λ, therefore affecting the distribution of the number
of events in other time intervals (Ross, 2003, p. 327). •
Example/Exercise 1.105 — Conditional Poisson process33
Suppose that the number of requests to a web server follows a conditional (or mixed)
Poisson process with random rate Λ (in requests per minute) and admit that
Λ ∼ Gamma(r, β), where r, β ∈ N.
(a) Derive a simplified expression for the probability that the server receives at most m
requests in t minutes (m ∈ N0, t > 0), and obtain the value of this probability for
r = β = t = 2, m = 8.
• Stochastic process
N(t) : t ≥ 0 conditional (or mixed) Poisson process with random rate Λ
• R.v.
N(t) = number of requests to a web server in the 1st. t minutes
(N(t) | Λ = λ) ∼ Poisson(λt)
• Random rate and its p.d.f.
Λ ∼ Gamma(r, β), r, β ∈ N
g_Λ(λ) = [β^r / Γ(r)] λ^{r−1} e^{−βλ}, λ ≥ 0

^33 Inspired by Ross (2003, Example 5.27, p. 327).
• P.f. of N(t)
Since Λ is a continuous r.v. with p.d.f. g_Λ(λ), by using (1.38) we get

P[N(t + s) − N(s) = n] = ∫_0^{+∞} [e^{−λt} (λt)^n / n!] g_Λ(λ) dλ.

Moreover, since N(t) : t ≥ 0 has stationary increments, we can add that:

P[N(t) = n] = ∫_0^{+∞} [e^{−λt} (λt)^n / n!] × [β^r / Γ(r)] λ^{r−1} e^{−βλ} dλ
            = [t^n Γ(n + r) β^r] / [n! Γ(r) (β + t)^{r+n}] × ∫_0^{+∞} [(β + t)^{n+r} / Γ(n + r)] λ^{n+r−1} e^{−(β+t)λ} dλ
            = [t^n Γ(n + r) β^r] / [n! Γ(r) (β + t)^{r+n}] × ∫_0^{+∞} f_{Gamma(n+r, β+t)}(λ) dλ
            = C(n + r − 1, n) × [β/(β + t)]^r × [1 − β/(β + t)]^n, n ∈ N0
            = p.f. of a NegativeBinomial*(r, β/(β + t)).
• Requested probability
For m ∈ N0, we have

P[N(t) ≤ m] = F_NegativeBinomial*(r, β/(β+t))(m)
            = F_NegativeBinomial(r, β/(β+t))(m + r)
            = 1 − F_Binomial(m+r, β/(β+t))(r − 1).

Thus, for r = β = t = 2 and m = 8, we get

P[N(2) ≤ 8] = 1 − F_Binomial(10, 1/2)(1)
            = 1 − 0.0107 (from the tables)
            = 0.9893.
(b) Prove that (Λ|N(t) = n) ∼ Gamma(r + n, β + t).
(c) Determine lim_{h→0+} P[N(t + h) − N(t) = 1 | N(t) = n] / h. •
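Part (a) of Example 1.105 can be double-checked by summing the NegativeBinomial*(r, β/(β + t)) p.f. directly; a sketch, where `pmf_N` is an ad hoc helper name:

```python
from math import comb

def pmf_N(n, r, beta, t):
    """P[N(t) = n] for the mixed Poisson process with Lambda ~ Gamma(r, beta):
    the NegativeBinomial*(r, beta/(beta + t)) p.f. derived in Example 1.105."""
    q = beta / (beta + t)
    return comb(n + r - 1, n) * q**r * (1 - q)**n

r, beta, t, m = 2, 2, 2, 8
p = sum(pmf_N(n, r, beta, t) for n in range(m + 1))
print(round(p, 4))  # -> 0.9893, matching the tabulated value above
```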
Exercise 1.106 — Conditional Poisson process (bis)
An airport has a 24/7 booking center at which customers arrive according to a conditional
Poisson process with an exponentially distributed arrival rate Λ.
(a) What is the probability that no arrivals occur in a t hours period?
(b) Admit that the expected value of Λ is equal to 4 customers per hour. Verify that the
probability that Λ does not exceed 6 customers per hour, given that 49 customers
arrived in (0, 12], is given by F_{χ²_100}(147).^34 •
Exercise 1.107 — Conditional Poisson process (bis, bis) (Ross, 1983, p. 50)
Admit that, depending on factors not at present understood, the rate at which seismic
shocks occur in a certain region over a given season is either λ1 or λ2. Admit also that
the rate equals λ1 for p× 100% of the seasons and λ2 in the remaining time.
A simple model would be to suppose that N(t), t ≥ 0 is a conditional Poisson
process such that Λ is either λ1 or λ2 with respective probabilities p and 1− p.
Prove that the probability that it is a λ1−season, given n shocks in the first t units of
a season, equals
p e^{−λ1 t} (λ1 t)^n / [p e^{−λ1 t} (λ1 t)^n + (1 − p) e^{−λ2 t} (λ2 t)^n], (1.39)
by applying the Bayes’ theorem. •
Exercise 1.108 — Splitting PP; Conditional PP
One estimates that meteors enter the atmosphere in a specific region of the globe according
to a Poisson process having rate λ equal to 100 meteors per hour and that 1% of those
meteors are visible to the “naked eye” as shooting stars.
(a) What is the probability that an observer is lucky enough to see at least two shooting
stars in 30 minutes?
After a detailed study of the meteor automatic detection process of a certain type of
telescope, one admits that:
^34 Hint: X ∼ Gamma(α, δ) ⇔ 2δX ∼ χ²_{2α}.
• the automatic detection rate of meteors per hour has a Uniform distribution on the
interval [20, 200];
• conditionally on the knowledge of the automatic detection rate, the number of
meteors detected by the telescope is governed by a Poisson process.
(b) What are the expected value and standard deviation of the number of meteors
automatically detected by the telescope in 6 hours? (Hint: E(X) = E[E(X|Y )]
and V (X) = V [E(X|Y )] + E[V (X|Y )].)
(c) What is the probability that the telescope automatically detects at least one meteor
in 15 minutes? •
1.7 Compound Poisson process
Now, consider a continuous-time stochastic process with jumps. Admit
that the jumps occur randomly according to a Poisson process and the
size of the jumps is also random, with a specified probability distribution
(http://en.wikipedia.org/wiki/Compound_Poisson_process). Does the total size of
the jumps that occurred up to time t define a stochastic process easy to deal with?
Yes!
It is another generalization of the Poisson process.
Definition 1.109 — Compound Poisson process (Ross, 1989, p. 237)
A stochastic process X(t) : t ≥ 0 is said to be a compound Poisson process if it can be
represented as
X(t) = Σ_{i=1}^{N(t)} Y_i, (1.40)
where
• N(t) : t ≥ 0 ∼ PP (λ) and
• Y_i i.i.d. ∼ Y and independent of N(t) : t ≥ 0. •
Example 1.110 — Compound Poisson process (Ross, 2003, pp. 321–322)
Compound Poisson processes arise, for example, in the following settings.
• Suppose that buses arrive to a venue according to a Poisson process, the numbers
of persons in each bus are i.i.d. r.v. Yi and X(t) denotes the total number of persons
who arrived by time t. Then X(t) : t ≥ 0 is a compound Poisson process.
• Admit that customers leave a supermarket in accordance to a Poisson process, the
amounts of money each person has spent are i.i.d. r.v. Yi and X(t) represents the
total amount of money spent by the customers who left the supermarket until time
t. Then X(t) : t ≥ 0 is also a compound Poisson process. •
Quiz 1.111 — Compound Poisson process
Give more examples of compound Poisson processes. •
Proposition 1.112 — Compound Poisson process (Ross, 1989, pp. 238–239)
Let X(t) : t ≥ 0 be the compound Poisson process described in Definition 1.109. Then
E[X(t)] = λt × E(Y) (1.41)
V[X(t)] = λt × E(Y²). (1.42)
•
Exercise 1.113 — Compound Poisson process
Let X(t) : t ≥ 0 be a compound Poisson process.
(a) Prove Proposition 1.112, by noting that E[X(t)] = E{E[X(t) | N(t)]} and
V[X(t)] = E{V[X(t) | N(t)]} + V{E[X(t) | N(t)]} (Ross, 1989, pp. 238–239).
(b) Use the total probability law to prove that the m.g.f. of X(t) can be written as

M_{X(t)}(s) = E[e^{sX(t)}] = e^{λt[M_Y(s) − 1]} (1.43)

(Kulkarni, 1995, pp. 229–230).^35 •
Example 1.114 — Compound Poisson process
Let X(t) be the total amount of money paid by an insurance company in (0, t]. Admit
that:
• the number of payments is governed by a Poisson process having rate λ equal to 5
payments a week;
• the payments are i.i.d. r.v. with Exponential distribution with expected value equal
to 20 000 Euros.
Determine the expected value and variance of the total amount of money paid by the
insurance company in 4 weeks.
• Stochastic process

X(t) = Σ_{i=1}^{N(t)} Y_i : t ≥ 0 ∼ compound PP

• R.v. et al.

X(t) = total amount paid by the insurance company by time t

Y_i = ith amount paid by the insurance company

Y_i i.i.d. ∼ Y

Y ∼ Exponential(1/20000)

Y_i : i ∈ N indep. of N(t) : t ≥ 0 ∼ PP(λ = 5 payments per week)
^35 See the proof also in http://en.wikipedia.org/wiki/Compound_Poisson_process. This result is another
consequence of Campbell's theorem, named after Norman Robert Campbell, who first published the result
in 1909/1910; this result gives the m.g.f. of a compound Poisson process, from which the expected value
and variance can be easily computed (http://en.wikipedia.org/wiki/Campbell's_theorem_(probability)).
• Requested expected value and variance
According to Proposition 1.112, we have E[X(t)] = λt × E(Y) and
V[X(t)] = λt × E(Y²) = λt × [V(Y) + E²(Y)]. Thus,

E[X(4)] = 5 × 4 × 20000 = 400000

V[X(4)] = 5 × 4 × [20000² + 20000²] = 1.6 × 10^10.
•
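Example 1.114 can be confirmed by Monte Carlo simulation of the compound Poisson process, using the numbers of the example; a sketch (run time grows with `reps`):

```python
import random

rng = random.Random(9)
lam, weeks, mean_pay, reps = 5.0, 4.0, 20000.0, 50000
totals = []
for _ in range(reps):
    # number of payments in (0, 4] weeks, from exponential inter-event gaps
    n, s = 0, rng.expovariate(lam)
    while s <= weeks:
        n += 1
        s += rng.expovariate(lam)
    # total paid: sum of n i.i.d. Exponential payments with mean 20000
    totals.append(sum(rng.expovariate(1.0 / mean_pay) for _ in range(n)))
mean = sum(totals) / reps
var = sum((x - mean) ** 2 for x in totals) / reps
# Proposition 1.112: E[X(4)] = 20 * 20000 = 400000 and V[X(4)] = 1.6e10;
# the sample mean and variance should be close to these values.
```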
Exercise 1.115 — Compound Poisson process (bis) (Kulkarni, 1995, Example 5.20,
pp. 230–231)
Suppose that:
• customers arrive at a restaurant in batches of size 1, 2, 3, 4, 5 and 6;
• the batches themselves arrive according to a Poisson process having rate λ;
• the successive batch sizes Y_i are i.i.d. r.v., distributed as Y, with p.f.

P(Y = y) =
  0.1, y = 1, 3
  0.25, y = 2, 4
  0.15, y = 5, 6.
Compute the mean and the variance of the number of customers who arrived at the
restaurant in (0, t]. •
Exercise 1.116 — Compound Poisson process (bis, bis) (Walrand, 2004, p. 208)
Let N(t) : t ≥ 0 be a Poisson process with rate λ. At each jump time, a random
number Yi of customers arrive at a cashier waiting line. The r.v. Yi are i.i.d. with mean µ
and variance σ2. Let X(t) be the number of customers who arrived by time t, for t ≥ 0.
Calculate E[X(t)] and V [X(t)]. •
Exercise 1.117 — Compound Poisson process (bis, bis, bis) (Ross, 2003, pp. 322,
326)
Suppose that families migrate to an area at a Poisson rate λ = 2 per week. Assume that
the number of people in each family is independent and takes values 1, 2, 3 and 4 with
respective probabilities 1/6, 1/3, 1/3 and 1/6.
(a) What is the expected value and variance of the number of individuals migrating to
this area during a five-week period?
(b) Find an approximate value for the probability that at least 240 people migrate within
the next 50 weeks. •
Remark 1.118 — Values of the compound Poisson process; joint p.f. of
X(t1), . . . , X(tn)
• Since the r.v. Y_i may take positive as well as negative values, the value of
X(t) = Σ_{i=1}^{N(t)} Y_i may either increase or decrease (Kulkarni, 1995, p. 228), unlike
any counting process.
• If the Y_i are integer-valued r.v., then so is X(t); moreover, for 0 < t1 < · · · < tn, we
have

P[X(t1) = k1, . . . , X(tn) = kn] = ∏_{j=1}^{n} p_{kj − kj−1}(tj − tj−1), (1.44)

where p_k(t) = P[X(t) = k], k0 = 0 and t0 = 0 (Kulkarni, 1995, pp. 228–229). •
Since the r.v. Y_i (i = 1, . . . , n) are i.i.d. and N(t) : t ≥ 0 has stationary and
independent increments, the compound Poisson process X(t) = Σ_{i=1}^{N(t)} Y_i : t ≥ 0 also
has stationary and independent increments (Kulkarni, 1995, p. 228).
Remark 1.119 — Homogeneous, non-homogeneous, conditional and
compound Poisson processes
Stochastic process Independent increments? Stationary increments?
Homogeneous PP Yes!!! Yes!!!
Non-homogeneous PP Yes!!! No!
Conditional PP No! Yes!!!
Compound PP Yes!!! Yes!!!
•
Exercise 1.120 — Independent increments and the compound Poisson process
(Ross, 1983, Exercise 2.26, p. 53)
Obtain the autocovariance function of a compound Poisson process X(t) : t ≥ 0. •
Quiz 1.121 — A generalization of the compound Poisson process (Kulkarni,
1995, p. 231)
It is possible to construct a non-homogeneous compound Poisson process, X(t) : t ≥ 0,
by assuming that, in Definition 1.109, N(t) : t ≥ 0 is a non-homogeneous Poisson
process.
Can you derive the m.g.f. of X(t)? •
Exercise 1.122 — A generalization of the compound Poisson process (Ross,
1983, Exercise 2.14, p. 52)
Admit that: busloads of customers arrive at an infinite server queue according to a Poisson
process having rate λ; G denotes the service distribution (of each busload of customers
regardless of its size); a bus contains j customers with probability αj. Let X(t) denote
the number of customers that have been served by time t.
(a) Obtain E[X(t)].
(b) Is X(t) Poisson distributed? •
Exercise 1.123 — Mind expanding exercise (Ross, 1983, Exercise 2.25, p. 53)
A two-dimensional Poisson process is characterized as follows:
(i) it is a process of randomly occurring events in the plane;
(ii) for any region of area A the number of events in that region has a Poisson distribution
with parameter λA;
(iii) the numbers of events in non-overlapping regions are independent r.v.
Now consider an arbitrary point in the plane and let X denote its distance from its nearest
event of the two-dimensional Poisson process.
(a) Show that P(X > t) = e^{−λπt²}.
(b) Prove that E(X) = 1/(2√λ). •
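Both claims can be checked by simulation: given their Poisson count, the events of a planar Poisson process falling in a window are i.i.d. uniform on that window. A Monte Carlo sketch (the window size, intensity and seed are arbitrary choices of mine):

```python
import math
import random

random.seed(12345)
lam, half, reps = 1.0, 4.0, 4000
mu = lam * (2 * half) ** 2        # expected number of events in the window

def poisson(mu):
    """Knuth's product-of-uniforms Poisson generator (fine for moderate mu)."""
    L, k, p = math.exp(-mu), 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

total = 0.0
for _ in range(reps):
    n = poisson(mu)
    # given the count, the events are i.i.d. uniform on the square
    d = min(math.hypot(random.uniform(-half, half), random.uniform(-half, half))
            for _ in range(n)) if n else half
    total += d

est = total / reps
print(est, 1 / (2 * math.sqrt(lam)))   # estimate vs the exact value 0.5
```

The window is large enough that the nearest event essentially never lies outside it, so the boundary truncation is negligible.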
Exercise 1.124 — Mind expanding exercise (bis)
Every Sunday, 15 units of a perishable product are stocked in order to be sold in the
remaining week days. The orders of this product are governed by a Poisson process with
rate λ equal to 3 units per day.36 Moreover, admit that due to the nature of the product
all unsold units are destroyed on Sunday before restocking for the next week.
(a) Determine the probability that there are no units for sale on Tuesday (at 00:00).
(b) Compute the probability that all the 15 units were sold by Saturday (at 23:59:59).
(c) Obtain a simplified expression for the expected value of units destroyed weekly. •
Exercise 1.125 — Mind expanding exercise (bis, bis)
Consider a junction between a main and a secondary road. Admit that: cars pass in the
main road according to a Poisson process with a rate of 10 cars per minute; Harry is
driving a car in the secondary road and needs 10 seconds to enter the main road; the cars
circulating in the main road take a negligible time to pass the junction.
36 Note that an order of the product does not result in a sale if there are no units in stock.
Let:
• N be the number of cars that pass the junction while Harry waits to enter the main
road;
• Yn be the time (in seconds) at which the nth car passed the junction while Harry
waited to enter the main road;
• Y0 = 0.
(a) Obtain the distribution of N and compute its 75th percentile.
(b) Show that E(Yn) = 2n (3 − 8e^{−5/3}) / (1 − e^{−5/3}), for n = 1, . . . , N, and N ∈ N.
(c) Obtain the expected value of the time Harry has to wait at the junction until he
initiates the manoeuvre to enter the main road. •
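For part (b), the key quantity is the conditional mean of an exponential gap given that it is shorter than 10 seconds, which equals 2(3 − 8e^{−5/3})/(1 − e^{−5/3}) ≈ 3.67 s. A Monte Carlo check of this value (my own sketch, not part of the exercise):

```python
import math
import random

random.seed(7)
rate, gate = 1 / 6, 10.0   # 10 cars/minute = 1/6 per second; Harry needs 10 s
theory = 2 * (3 - 8 * math.exp(-5 / 3)) / (1 - math.exp(-5 / 3))

# mean of an Exp(1/6) gap conditioned on being shorter than 10 s
acc, count = 0.0, 0
while count < 200_000:
    x = random.expovariate(rate)
    if x < gate:
        acc += x
        count += 1

mc = acc / count
print(mc, theory)   # both ~3.67 seconds
```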
Chapter 2
Renewal Processes
(Homogeneous) Poisson processes are counting processes for which the times between
successive events are i.i.d. exponential r.v.
Can we be slightly more realistic, by dropping the exponentially distributed
assumption and considering inter-event times with a common but arbitrary distribution?
Yes!
The resulting process is called a renewal process and is still mathematically tractable.
Applications of renewal processes include calculating the expected time
for a monkey who is randomly tapping at a keyboard to type the word
Macbeth and comparing the long-term benefits of different insurance policies
(http://en.wikipedia.org/wiki/Renewal_theory). More importantly, many questions
about more complex and interesting stochastic processes can be addressed by identifying
a relevant renewal process.1
2.1 Introduction
Informally, a renewal process is a generalization of the Poisson process. Expectedly, renewal processes are going to be defined in terms of the times between consecutive events, just as the Poisson process was in Definition 1.31.
1 What follows was essentially inspired by Kulkarni (1995, Chap. 8), Ross (1983, Chap. 3), and Ross (2003, Chap. 7).
Definition 2.1 — Renewal process (Kulkarni, 1995, pp. 401–402; Ross, 2003, pp.
401–402)
Let:
• Xi : i ∈ N be a sequence of r.v. representing the inter-event times;
• S0 = 0;
• Sn = ∑_{i=1}^{n} Xi be the time of the occurrence of the nth event;
• N(t) = sup{n ∈ N0 : Sn ≤ t}, t ≥ 0 — i.e., N(t) represents the number of events (or renewals) that occurred in (0, t].
If Xi : i ∈ N is a sequence of i.i.d. non-negative (real) r.v. with common c.d.f. F ,2 then
the counting process N(t) : t ≥ 0 is said to be a renewal process. •
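Definition 2.1 translates directly into a simulation: draw i.i.d. inter-event times and count how many partial sums Sn fall in (0, t]. A sketch (the Uniform(30, 60) inter-event distribution is merely an illustrative choice, anticipating the battery example of Section 2.5):

```python
import random

random.seed(2014)

def renewal_count(t, draw):
    """N(t) = max{n in N0 : S_n <= t} for i.i.d. inter-event times draw()."""
    s, n = 0.0, 0
    while True:
        s += draw()    # next inter-event time X_i
        if s > t:
            return n
        n += 1

# e.g. Uniform(30, 60) inter-event times (mean mu = 45)
t = 100_000.0
n = renewal_count(t, lambda: random.uniform(30, 60))
print(n / t, 1 / 45)   # empirical rate close to 1/mu, cf. Section 2.5
```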
Remark 2.2 — Renewal sequence; characterization of renewal processes
• Sn : n ∈ N is said to be a renewal sequence (Kulkarni, 1995, p. 402).
• The renewal process N(t) : t ≥ 0 is fully characterized by the inter-event time
distribution F (Kulkarni, 1995, Theorem 8.1, p. 404).
• The designation of N(t) : t ≥ 0 as a renewal process is due to the fact that
N(t + Sn) − n : t ≥ 0 is stochastically identical to N(t) : t ≥ 0, for n ∈ N0
(Kulkarni, 1995, pp. 404-405). •
Example 2.3 — Renewal process (Kulkarni, 1995, pp. 403–404)
• Admit that a process requires the continuous use of a specific machine. At time 0 a
brand new machine is put to work until it fails after a random amount of time
X1; this machine is instantly replaced with a new one which will last for a random
amount of time X2; etc. In case Xi are i.i.d. r.v. and N(t) denotes the number of
failures/replacements by time t, N(t) : t ≥ 0 is a renewal process.
2 To avoid trivialities, we assume that F(0) = P(Xi = 0) < 1. From the non-negativity of Xi and the fact that Xi is not identically 0, we get E(Xi) > 0.
• Let Sn be the completion time of the nth busy cycle of an M/G/1 queue and N(t) be
the number of busy cycles completed by time t. Then N(t) : t ≥ 0 is a renewal
process. •
Quiz 2.4 — Renewal process
Give examples of stochastic processes arising in your daily life that could be modeled as
renewal processes. •
2.2 Properties of the number of renewals
It is essential to study in some detail the properties of the number of renewals up to and including time t, N(t). The distribution of N(t) can be obtained, at least in theory
(Ross, 2003, p. 403), by capitalizing on a familiar result
N(t) ≥ n ⇔ Sn ≤ t, (2.1)
or on the fact that
N(t) = n ⇔ Sn ≤ t < Sn+1. (2.2)
But before we proceed to derive the p.f. of N(t) let us explore a bit more the
relationship between N(t) and Sn by solving the following exercise.
Exercise 2.5 — Relating N(t) and Sn (Ross, 1983, Exercise 3.1, p. 93)
Is it true that:
(a) N(t) < n ⇔ Sn > t ?
(b) N(t) ≤ n ⇔ Sn ≥ t ?
(c) N(t) > n ⇔ Sn < t ?
Justify your answers! •
Proposition 2.6 — Relating the p.f. of N(t) and the c.d.f. of the event times
(Ross, 2003, p. 403)
Let:
• N(t) : t ≥ 0 be a renewal process;
• S0 = 0;
• F0(t) = P (S0 ≤ t) = 1, t ≥ 0;
• Sn the time of the occurrence of the nth event, for n ∈ N;
• Fn(t) = P (Sn ≤ t) the c.d.f. of Sn.
Then P [N(t) ≥ n] = Fn(t) and
P [N(t) = n] = Fn(t)− Fn+1(t), (2.3)
for n ∈ N and t > 0. •
Computing the p.f. of N(t) is a non-trivial task for all but a few renewal processes
(Kulkarni, 1995, p. 405).
Example 2.7 — P.f. of N(t) (Ross, 2003, Example 7.1, pp. 403–404)
Admit that
P(Xn = k) = (1 − p)^{k−1} p, k, n ∈ N,
that is, S1 = X1 can be interpreted as the number of i.i.d. Bernoulli trials until the first success and Sn = ∑_{i=1}^{n} Xi may be interpreted as the number of trials necessary to attain n successes.
Obtain the distribution of Sn and P [N(t) = n].
• Renewal process
N(t) : t ≥ 0
• R.v.
N(t) = number of renewals up to time t
Xi = time between the (i− 1)th and ith renewals
Xi ∼ Geometric(p), i.i.d., i ∈ N
• Renewal times
Sn = time of the nth renewal, n ∈ N
Sn = ∑_{i=1}^{n} Xi
Sn ∼ NegativeBinomial(n, p)
P(Sn = k) = (k−1 choose n−1) p^n (1 − p)^{k−n}, k = n, n+1, . . .
• P.f. of N(t)
For 0 < t < 1, P[N(t) = 0] = 1.
For t ≥ 1, N(t) ∼ Binomial(⌊t⌋, p), where ⌊t⌋ represents the integer part of t.3 In fact:
P[N(t) = 0] = P(X1 > t) = (1 − p)^{⌊t⌋};
P[N(t) = n] = P(Sn ≤ t) − P(Sn+1 ≤ t)
= ∑_{k=n}^{⌊t⌋} (k−1 choose n−1) p^n (1 − p)^{k−n} − ∑_{k=n+1}^{⌊t⌋} (k−1 choose n) p^{n+1} (1 − p)^{k−n−1}
= · · ·
= (⌊t⌋ choose n) p^n (1 − p)^{⌊t⌋−n},
for n ∈ {1, . . . , ⌊t⌋}. •
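The binomial claim of Example 2.7 can be checked by simulating the renewal process with Geometric(p) inter-event times (a sketch; the values of p and t are arbitrary choices of mine):

```python
import random
from math import comb

random.seed(42)
p, t, reps = 0.3, 7.5, 100_000

def geom(p):
    """Number of Bernoulli(p) trials until the first success."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

counts = {}
for _ in range(reps):
    s, n = 0, 0
    while True:
        s += geom(p)       # Geometric(p) inter-event time
        if s > t:
            break
        n += 1
    counts[n] = counts.get(n, 0) + 1

ft = int(t)                # floor(t)
for n in range(ft + 1):    # compare with Binomial(floor(t), p)
    exact = comb(ft, n) * p**n * (1 - p) ** (ft - n)
    print(n, counts.get(n, 0) / reps, round(exact, 4))
```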
Exercise 2.8 — P.f. of N(t) (bis) (Ross, 2003, Exercise 7.2, p. 460)
Suppose the inter-event distribution of a renewal process, say N(t) : t ≥ 0, is Poisson
with expected value λ, i.e., Xi ∼ Poisson(λ), i.i.d., i ∈ N.
3 We are essentially dealing with a Bernoulli counting process...
(a) Find the distribution of Sn.
(b) Compute P [N(t) = n]. •
Now, the question we attempt to answer is whether an infinite number of renewals can occur in a finite time (Ross, 1983, p. 55).
Proposition 2.9 — Finiteness of N(t) in finite time (Kulkarni, 1995, Theorem 8.3,
p. 406)
Let N(t) : t ≥ 0 be a renewal process. Then N(t) is a proper r.v. for all (finite!) t ≥ 0,
i.e.,
P [N(t) < +∞] = 1, (2.4)
for 0 ≤ t < +∞. •
Exercise 2.10 — Finiteness of N(t) in finite time
Prove Proposition 2.9 (Ross, 1983, pp. 55–56; Kulkarni, 1995, p. 406). •
Since Proposition 2.9 holds, we can write N(t) = max{n ∈ N0 : Sn ≤ t} (instead of N(t) = sup{n ∈ N0 : Sn ≤ t}), and we can indeed identify the p.f. of N(t) in terms of the c.d.f. of the renewal times Sn.
2.3 Renewal function
Definition 2.11 — Renewal function (Ross, 2003, p. 404)
The expected value of the number of renewals up to t,
m(t) = E[N(t)], t ≥ 0, (2.5)
defines what is called the mean-value or the renewal function. •
By capitalizing on the fact that N(t) is a non-negative integer r.v. such that P[N(t) ≥ n] = Fn(t), we can provide an expression for m(t).
Proposition 2.12 — Renewal function (Ross, 2003, p. 404)
Let N(t) : t ≥ 0 be a renewal process. Then
m(t) = ∑_{n=1}^{+∞} Fn(t). (2.6)
•
Exercise 2.13 — Renewal function
Prove Proposition 2.12 (Ross, 2003, p. 404; Ross, 1983, pp. 56–57). •
Exercise 2.14 — Renewal function (Ross, 2003, Exercise 6, p. 461)
Consider a renewal process N(t) : t ≥ 0 with a Gamma(r, λ) (r ∈ N) inter-renewal
distribution.
(a) Show that, for t ≥ 0 and n ∈ N0,
P[N(t) ≥ n] = ∑_{i=nr}^{+∞} e^{−λt} (λt)^i / i!.
(b) Use (a) to prove that
m(t) = ∑_{i=r}^{+∞} ⌊i/r⌋ e^{−λt} (λt)^i / i!,
for t ≥ 0.4 •
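Both expressions in this exercise can be evaluated numerically, since Sn ∼ Gamma(nr, λ) and a gamma c.d.f. with integer shape is a Poisson tail sum; truncating at a level where the Poisson tail is negligible, the series (2.6) and the closed form in (b) coincide. A sketch (λ, r and t are arbitrary choices of mine):

```python
from math import exp, factorial

lam, r, t = 1.0, 3, 5.0
CAP = 60   # Poisson(lam*t) mass beyond CAP is negligible for lam*t = 5

def pois_pmf(i):
    return exp(-lam * t) * (lam * t) ** i / factorial(i)

def gamma_cdf(k):
    """P(Gamma(k, lam) <= t) for integer shape k, via the Poisson tail sum."""
    return sum(pois_pmf(i) for i in range(k, CAP))

series = sum(gamma_cdf(n * r) for n in range(1, CAP // r + 1))   # eq. (2.6)
closed = sum((i // r) * pois_pmf(i) for i in range(r, CAP))      # part (b)
print(series, closed)   # the two expressions agree
```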
The renewal function is very important because it completely characterizes the renewal
process (Kulkarni, 1995, p. 414).
Proposition 2.15 — Renewal function and the inter-renewal distribution
There is a one-to-one correspondence between m(t) and the inter-renewal distribution F
(Ross, 2003, p. 404).5 •
4 Use the relationship between the Gamma(r, λ) distribution and the sum of r independent exponentially distributed r.v. with rate λ to define N(t) in terms of the number of events of a Poisson process with rate λ.
5 The proof can be found in Kulkarni (1995, p. 414) and it makes use of the Laplace-Stieltjes transforms
Exercise 2.16 — Renewal function and the inter-renewal distribution (Ross,
2003, Example 7.2, p. 405)
(a) Prove Proposition 2.15.
(b) Suppose the renewal process N(t) : t ≥ 0 has renewal function m(t) = 2t, t ≥ 0.
What is the distribution of N(10)? •
Exercise 2.17 — Renewal function and the inter-renewal distribution (bis)
(Ross, 2003, Exercise 7.3, p. 460)
If the mean-value function of the renewal process N(t) : t ≥ 0 is given by m(t) = t/2, t ≥ 0, what is the value of P[N(5) = 0]? •
It is time to inquire about the finiteness of m(t) = E[N(t)] in finite time.
Some readers might think that the finiteness of N(t) (with probability 1) implies the
finiteness of m(t); even though such reasoning should be avoided,6 the result is valid, as
stated in the next proposition.
Proposition 2.18 — Finiteness of m(t) in finite time (Ross, 1983, p. 57; Kulkarni, 1995, Theorem 8.8, p. 416)
Let N(t) : t ≥ 0 be a renewal process. Then
m(t) < +∞, (2.7)
for 0 ≤ t < +∞.7 •
5 (cont.) of m(t) and F(t), m̃(s) = ∫_{0−}^{+∞} e^{−st} dm(t) and F̃(s) = E(e^{−s S1}) = M_{S1}(−s) = ∫_{0−}^{+∞} e^{−st} dF(t), respectively. In fact, these two Laplace-Stieltjes transforms satisfy m̃(s) = F̃(s) / (1 − F̃(s)) and the renewal function can be obtained by inverting its Laplace-Stieltjes transform — so this is another method of computing the renewal function besides the use of (2.6). Note that the Laplace transform of a function f(t) is defined as ∫_{0−}^{+∞} e^{−st} f(t) dt and should not be mistaken for the Laplace-Stieltjes transform even though they can be obviously related. Moreover, Mathematica can be used to obtain the Laplace transform and the inverse Laplace transform.
6 Consider, for instance, the r.v. Y that takes value 2^n with probability (1/2)^n, for n ∈ N. Even though Y is finite — P(Y < +∞) = ∑_{n=1}^{+∞} P(Y = 2^n) = 1 —, we have E(Y) = ∑_{n=1}^{+∞} 2^n P(Y = 2^n) = +∞.
7 Ross (1983, p. 57) and Kulkarni (1995, pp. 416–417) provide proofs of this result; Ross (2003, p. 405) states the result without a proof.
2.4 Renewal-type equations
In general the renewal function is difficult to compute for an arbitrary inter-arrival
distribution F , namely using (2.6).8 However, we are able to derive the expected value of
N(t) via an integral equation.
Remark 2.19 — Renewal argument (Kulkarni, 1995, p. 407)
One of the most useful tools of renewal theory is the renewal argument. It allows us to
derive an integral equation for certain probabilistic quantities, such as m(t), in renewal
processes, by conditioning on the time of the first renewal S1. •
Proposition 2.20 — Renewal equation (Kulkarni, 1995, p. 415)
Let N(t) : t ≥ 0 be a renewal process with inter-renewal distribution F. Then the renewal argument leads to the following integral equation involving the renewal function:
m(t) = F(t) + ∫_0^t m(t − x) dF(x). (2.8)
(2.8) is called the renewal equation and provides another method of computing m(t). •
Exercise 2.21 — Renewal equation
Prove the renewal equation (Kulkarni, 1995, pp. 414–415; Ross, 2003, p. 406). •
The renewal equation can sometimes be solved to obtain the renewal function (Ross, 2003, p. 406).
Exercise 2.22 — Renewal equation (Ross, 2003, Example 7.3, pp. 406–407)
Solve the renewal equation for 0 ≤ t ≤ 1, when the inter-renewal distribution is
Uniform(0, 1), and show that m(t) = e^t − 1, 0 ≤ t ≤ 1. •
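The renewal equation can also be attacked numerically: discretizing m(t) = F(t) + ∫_0^t m(t − x) dF(x) with a Riemann sum reproduces m(t) = e^t − 1 on [0, 1] up to O(h) error. A sketch (not the analytic solution the exercise asks for):

```python
from math import exp

# m(t) = F(t) + int_0^t m(t-x) dF(x); for Uniform(0,1) inter-renewals and
# 0 <= t <= 1 this reads m(t) = t + int_0^t m(t-x) dx.
h, n = 0.001, 1000          # grid step; t runs over [0, 1]
m = [0.0] * (n + 1)
cum = 0.0                   # running sum m[0] + ... + m[k-1]
for k in range(1, n + 1):
    cum += m[k - 1]
    m[k] = k * h + h * cum  # right-endpoint Riemann sum of the convolution

print(m[n], exp(1.0) - 1)   # ~1.7169 vs 1.7183
```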
Exercise 2.23 — Renewal equation (bis) (Kulkarni, 1995, examples 8.8 and 8.15,
pp. 404, 416)
Suppose the inter-renewal times have Bernoulli(α) distribution (α ∈ (0, 1)).9
Prove that m(t) = (⌊t⌋ + 1 − α)/α by solving the renewal equation. •
8 Or capitalizing on the relationship between the Laplace-Stieltjes transforms of m(t) and F(t).
9 The associated renewal process is called the negative binomial process.
We know that P [N(t) = 0] = P (X1 > t) = 1− F (t). Can we derive P [N(t) = n], for
n ∈ N, when we are not able to obtain a nice formula for the convolution Fn(t)?
In certain cases!
The renewal argument can be also used to derive what is called a renewal-type equation
for P [N(t) = n].
Proposition 2.24 — Renewal-type equation for P [N(t) = n] (Kulkarni, 1995, pp.
408–409)
Let N(t) : t ≥ 0 be a renewal process with inter-renewal distribution F . Then the
renewal argument leads to the following renewal-type equation:
P[N(t) = n] = ∫_0^t P[N(t − x) = n − 1] dF(x), n ∈ N. (2.9)
•
Exercise 2.25 — Renewal-type equation for P [N(t) = n]
Prove the renewal-type equation (2.9) (Kulkarni, 1995, p. 408). •
The integral equation (2.9) is not simple to solve unless the inter-renewal times are
discrete r.v. with common p.f. P(X = i) = αi, i ∈ N0. In this case, N(t) =st N(⌊t⌋) and its p.f. can be obtained recursively, for a fixed t ≥ 0 (Kulkarni, 1995, p. 408):
P[N(⌊t⌋) = 0] = P(X1 > ⌊t⌋) = 1 − ∑_{i=0}^{⌊t⌋} αi; (2.10)
P[N(⌊t⌋) = n] = ∑_{i=0}^{⌊t⌋} P[N(⌊t⌋ − i) = n − 1] × αi, n ∈ N. (2.11)
Equation (2.11) provides a computationally stable method of computing the p.f. of the
number of renewals up to time t, N(t) (Kulkarni, 1995, p. 409).
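The recursion (2.10)–(2.11) is straightforward to implement. As a check, for Bernoulli(α) inter-renewal times the resulting mean matches the renewal function m(t) = (⌊t⌋ + 1 − α)/α of Exercise 2.23 (a sketch; the truncation level nmax is my own choice):

```python
def pf_N(T, alpha, nmax):
    """P(N(T) = n), n = 0..nmax, via the recursion (2.10)-(2.11);
    alpha maps i to P(X = i) for integer-valued inter-renewal times."""
    tail = [1 - sum(alpha.get(i, 0.0) for i in range(u + 1)) for u in range(T + 1)]
    prev, out = tail, [tail[T]]                     # row n = 0, eq. (2.10)
    for _ in range(nmax):
        cur = [sum(alpha.get(i, 0.0) * prev[u - i] for i in range(u + 1))
               for u in range(T + 1)]               # eq. (2.11)
        out.append(cur[T])
        prev = cur
    return out

a = 0.5
pf = pf_N(3, {0: 1 - a, 1: a}, 120)
mean = sum(n * q for n, q in enumerate(pf))
print(mean, (3 + 1 - a) / a)   # both equal 7
```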
Exercise 2.26 — Renewal-type equation for P [N(t) = n]
Use results (2.10) and (2.11) to derive the renewal function obtained in Exercise 2.23. •
The renewal argument can also be used to obtain a renewal-type equation involving
the expected value of the time of the first renewal after t, SN(t)+1.
Proposition 2.27 — Renewal-type equation for E[SN(t)+1] (Kulkarni, 1995, p. 418)
Let N(t) : t ≥ 0 be a renewal process with inter-renewal distribution F and H(t) =
E[SN(t)+1]. Then
H(t) = E(S1) + ∫_0^t H(t − x) dF(x). (2.12)
•
Exercise 2.28 — Renewal-type equation for E[SN(t)+1]
Derive the renewal-type equation (2.12) (Kulkarni, 1995, p. 418). •
After stating the renewal equation and two renewal-type equations (for P [N(t) = n]
and E[SN(t)+1]), we proceed with a more thorough treatment of a general renewal-type
equation.
Definition 2.29 — Renewal-type equation (Kulkarni, 1995, pp. 420–421)
The type of integral equations that arise from using the renewal argument are called
renewal-type equations and have the following form:
H(t) = D(t) + ∫_0^t H(t − x) dF(x), (2.13)
where F (x) is the c.d.f. of a non-negative r.v., D(t) is a known function and H(t) is a
function to be determined. •
If D(t) = F (t) and H(t) = m(t) then equation (2.13) corresponds to the renewal
equation (2.8). If D(t) = E(S1), for all t ≥ 0, and H(t) = E[SN(t)+1], we get the renewal-
type equation for E[SN(t)+1] in (2.12).
The following result not only gives the conditions for the existence and uniqueness of
the solution of the renewal-type equation in (2.13), but also one possible representation
of the solution (Kulkarni, 1995, p. 421).10
10 The proof of this result can be found in Kulkarni (1995, pp. 421–422).
Proposition 2.30 — Existence, uniqueness and representation of the solution
of the renewal-type equation (Kulkarni, 1995, Theorem 8.10, p. 421)
Let N(t) : t ≥ 0 be a renewal process, with inter-renewal distribution F (x) and renewal
function m(t). Suppose |D(t)| < +∞, for all t ≥ 0. Then the renewal-type equation
H(t) = D(t) + ∫_0^t H(t − x) dF(x) has a unique solution. This unique solution is such that |H(t)| < +∞, for all t ≥ 0, and can be written in terms of m(t):
H(t) = D(t) + ∫_0^t D(t − x) dm(x). (2.14)
•
Since the renewal function m(t) is difficult to compute, the solution (2.14) of a renewal-
type equation is not easy to obtain. However, there are a few methods of solving renewal-
type equations (Kulkarni, 1995, p. 423), such as the following methods:
• Laplace-Stieltjes transforms — for short, LST (Kulkarni, 1995, pp. 423-426);
• discrete approximation (Kulkarni, 1995, pp. 426–427);
• successive approximation (Kulkarni, 1995, p. 427).
For the sake of briefness, we concisely describe and illustrate the first of these three
methods.
Proposition 2.31 — Solving a renewal-type equation using the method of LST
(Kulkarni, 1995, p. 423)
If D(t) admits a LST, D̃(s) = ∫_{0−}^{+∞} e^{−st} dD(t), then the LST of H(t) = D(t) + ∫_0^t D(t − x) dm(x) is given by
H̃(s) = ∫_{0−}^{+∞} e^{−st} dH(t) = D̃(s) / (1 − F̃(s)), (2.15)
and H(t), the solution of the renewal-type equation, can be obtained by inverting the right-hand side of (2.15). •
Exercise 2.32 — Solving a renewal-type equation using the method of LST
Prove Proposition 2.31. •
Exercise 2.33 — Solving a renewal-type equation using the method of LST
(Kulkarni, 1995, Example 8.16, pp. 423–425)
Consider a machine that alternates between two states — up and down. Admit that:
• the successive up times Ui are i.i.d. r.v. with exponential distribution with parameter
µ;
• if the up time is equal to Ui, then the subsequent down time Di = cUi, where c is a
non-negative constant;
• the machine is up at time 0;
• H(t) represents the probability that the machine is up at time t.
Derive a renewal-type equation for H(t) and solve it by using the method of Laplace-
Stieltjes transforms. •
Exercise 2.34 — Solving a renewal-type equation using the method of LST
(bis) (Kulkarni, 1995, Exercise 24, pp. 470–471)
Consider a system that can be in one of two states: on and off. Admit that:
• the system is on at time 0;
• the successive durations of on and off periods are independent and exponentially
distributed with parameter λ and µ, respectively;
• W (t) represents the total amount of on time during (0, t].
Derive a renewal-type equation for H(t) = E[W (t)] and solve it by using the method
of Laplace-Stieltjes transforms. •
Exercise 2.35 — Back to the renewal equation (Ross, 2003, Exercise 18, p. 465)
Use Mathematica to compute the renewal function when the inter-renewal distribution is
a hyper-exponential with survival function given by
1 − F(t) = p e^{−µ1 t} + (1 − p) e^{−µ2 t},
for t ≥ 0 and µ1, µ2 > 0. •
Proposition 2.36 — Another renewal-type equation (Ross, 1983, p. 65)
Let N(t) : t ≥ 0 be a renewal process, with inter-renewal distribution F (x) and renewal
function m(t). Then the c.d.f. of S_{N(t)}, the time of the last renewal prior to (or at) time t, can be represented in the form (2.14) with D(t) = F̄(t) = 1 − F(t):
P[S_{N(t)} ≤ s] = F̄(t) + ∫_0^s F̄(t − x) dm(x), (2.16)
for 0 ≤ s ≤ t. •
Exercise 2.37 — Another renewal-type equation
Prove Proposition 2.36 (Ross, 1983, p. 66). •
Remark 2.38 — Another renewal-type equation (Ross, 1983, p. 65)
Proposition 2.36 leads us to conclude that:
P[S_{N(t)} = 0] = F̄(t); (2.17)
dF_{S_{N(t)}}(s) = F̄(t − s) dm(s), 0 < s < +∞. (2.18)
If the inter-renewal times are continuous r.v. with common p.d.f. f, then:
dm(s) = ∑_{n=1}^{+∞} f_n(s) ds = ∑_{n=1}^{+∞} P[nth renewal occurs in (s, s + ds)] = P[renewal occurs in (s, s + ds)]; (2.19)
f_{S_{N(t)}}(s) ds = P[renewal in (s, s + ds), next inter-renewal time > t − s] = F̄(t − s) dm(s). (2.20)
•
2.5 Key renewal theorem and some other limit
theorems
Computing the exact distribution of N(t) for finite t is far from being a trivial problem,
either analytically or numerically (Kulkarni, 1995, p. 409). Moreover, the renewal function
m(t) is also difficult to compute for an arbitrary inter-arrival distribution F .
Unsurprisingly, we have to turn our attention to the study of the limiting behavior of
N(t), as t → +∞ (Kulkarni, 1995, p. 405), and we are bound to study the asymptotic
behavior of m(t) (Kulkarni, 1995, p. 416). However,
N(+∞) ≡ lim_{t→+∞} N(t) = +∞, (2.21)
with probability 1,11 even though N(t) is finite for finite t. (2.21) follows because the only
way in which N(+∞) can be finite is for one of the inter-renewal times to be infinite, and
the probability of this last event is equal to zero (Ross, 1983, pp. 57-58; Ross, 2003, pp.
402–403).12
Is it possible to identify the approximate distribution of N(t) for large t?
Yes!
N(t) is asymptotically normally distributed.
Theorem 2.39 — Central limit theorem for renewal processes (Ross, 1983,
Theorem 3.3.5, pp. 62–63)
Let N(t) : t ≥ 0 be a renewal process whose inter-renewal times Xi have common
expected value µ and finite variance σ2. Then
lim_{t→+∞} P[ (N(t) − t/µ) / √(tσ²/µ³) < z ] = Φ(z). (2.22)
Consequently, for sufficiently large t, the distribution of N(t) is approximately normal
11 N(+∞) represents the total number of renewals that occur. Moreover, N(+∞) = +∞ implies m(+∞) = +∞.
12 Ross (1983, p. 58) and Ross (2003, p. 403) note that P[N(+∞) < +∞] = P(Xi = +∞ for some i) = P(∪_{i=1}^{+∞} {Xi = +∞}) ≤ ∑_{i=1}^{+∞} P(Xi = +∞) = 0. We should add that this is true if F(+∞) = 1, that is, if the renewal process N(t) : t ≥ 0 is recurrent, as put by Kulkarni (1995, p. 409).
with mean t/µ and variance tσ²/µ³, and we have
P[N(t) < n] ≈ Φ( (n − t/µ) / √(tσ²/µ³) ). (2.23)
•
Exercise 2.40 — Central limit theorem for renewal processes
Prove Theorem 2.39 (Ross, 1983, pp. 62–63). •
Exercise 2.41 — Central limit theorem for renewal processes (Kulkarni, 1995,
Example 8.13, p. 413)
Suppose a part in a machine is available from two different sources, a and b. When the
part fails it is replaced by a new one from source a (resp. b) with probability 0.3 (resp. 0.7)
independently of everything else. A part from source a (resp. b) lasts for an exponentially
distributed time with a mean of 8 (resp. 5) days and it takes exactly 1 (resp. half a) day
to install it. Moreover, assume that a failure has taken place just before time 0.
Compute the approximate distribution of the number of failures during the first year
(not counting the one just before time 0). •
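One way to apply Theorem 2.39 here: the inter-renewal time is a mixture (exponential lifetime plus fixed installation time), whose moments are computed below. This is my own sketch of a solution, not the official one:

```python
from math import sqrt

# inter-renewal time = lifetime + installation time; mixture over sources:
# (probability, exponential mean lifetime in days, installation days)
comps = [(0.3, 8.0, 1.0), (0.7, 5.0, 0.5)]
EX = sum(p * (m + d) for p, m, d in comps)                 # 6.55 days
EX2 = sum(p * (m**2 + (m + d) ** 2) for p, m, d in comps)  # Var(Exp) = m^2
var = EX2 - EX**2

t = 365.0
mean_N = t / EX                 # ~55.7 failures in a year
sd_N = sqrt(t * var / EX**3)    # ~7.1
print(mean_N, sd_N)             # N(365) approx Normal(55.7, 7.1^2)
```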
Exercise 2.42 — Central limit theorem for renewal processes (bis)
Resume Exercise 2.35 and obtain an approximate value to P [N(50) ≥ 50], when:
(a) p = 1/2, µ1 = 1 and µ2 = 2;
(b) p = 1 and µ1 = 1. •
Although N(t) and m(t) go to infinity, as t → +∞, and N(t) is asymptotically normal, it would be quite useful to inquire at what rate N(t) and m(t) grow (Ross, 1983, p. 58; Ross, 2003, p. 407).
Can we calculate lim_{t→+∞} N(t)/t and lim_{t→+∞} m(t)/t?
Yes!
The limit of the sequence of r.v. N(n)/n : n ∈ N, which can be represented by lim_{t→+∞} N(t)/t, is given in the following proposition.
Proposition 2.43 — Strong law of large numbers for renewal processes (Ross,
1983, Proposition 3.3.1, pp. 58–59; Serfozo, 2009, Corollary 11, p. 105)
Let N(t) : t ≥ 0 be a renewal process whose inter-renewal times X1, X2, . . . have
common expected value µ. Then
N(t)/t → 1/µ w.p.1, (2.24)
that is, the long-run rate at which renewals occur equals 1/µ.13 •
Remark 2.44 — Almost sure convergence or convergence with probability 1
• Almost sure convergence — or convergence with probability one — is the
probabilistic version of pointwise convergence known from elementary real analysis.
• The sequence of r.v. Y1, Y2, . . . is said to converge almost surely or with probability
1 to a r.v. Y if
P({ω : lim_{n→+∞} Yn(ω) = Y(ω)}) = 1 (2.25)
(Karr, 1993, p. 135; Rohatgi, 1976, p. 249). In this case we write Yn →a.s. Y or Yn →w.p.1 Y. Moreover, equation (2.25) does not mean that lim_{n→+∞} P({ω : Yn(ω) = Y(ω)}) = 1.
• Almost sure convergence is preserved under continuous mappings (Karr, 1993, p.
148). •
Exercise 2.45 — Strong law of large numbers for renewal processes
Prove Proposition 2.43, by using the fact that S_{N(t)}/N(t) ≤ t/N(t) < S_{N(t)+1}/N(t) and by applying the strong law of large numbers (SLLN)14 (Ross, 1983, pp. 58–59) and then the preservation of almost sure convergence under continuous mappings. •
13 Since the rate at which renewals occur will equal 1/µ w.p.1, 1/µ is also called the rate of the renewal process (Ross, 1983, p. 59). This result is valid for both finite and infinite µ (Kulkarni, 1995, pp. 410–411).
14 The SLLN for i.i.d. r.v. in L1 (or Kolmogorov's SLLN) can be stated as follows. Let Y1, Y2, . . . be a sequence of i.i.d. r.v. to Y. Then Ȳn = (1/n) ∑_{i=1}^{n} Yi →a.s. µ iff Y ∈ L1 (i.e., E(|Y|) < +∞), and then µ = E(Y) (Karr, 1993, p. 188; Rohatgi, 1976, p. 274, Theorem 7). Note that if µ = ∞, then we have to use a more delicate argument to show that the result is still valid (Kulkarni, 1995, p. 411).
Exercise 2.46 — Strong law of large numbers for renewal processes (Ross, 2003,
examples 7.5 and 7.6, pp. 409–410)
Evaristo has a radio that works on a single 3 volt battery. As soon as the battery fails,
Evaristo immediately replaces it with a new battery.
(a) Admit that the lifetime (in hours) of those batteries is uniformly distributed over the
interval [30, 60]. At what rate does Evaristo have to change batteries in the long-run?
(b) Now, admit that Evaristo does not keep any surplus batteries and each time a failure
occurs he must go and buy a new battery spending a uniformly distributed time over
the interval [0, 1]. Recalculate the rate at which Evaristo has to change batteries in
the long-run. •
Exercise 2.47 — Strong law of large numbers for renewal processes (bis) (Ross,
2003, Exercise 7, p. 461)
Clotilde regrettably works on a temporary basis and the mean length of each job she gets is three months.
At what rate does Clotilde get new jobs in the long-run if the amount of time she spends unemployed is exponentially distributed with mean equal to 2? •
Exercise 2.48 — Strong law of large numbers for renewal processes (bis, bis)
(Ross, 2003, Example 7.8, pp. 411–412)
A game consists of a sequence of independent trials — each of which results in outcome
i with probability Pi (i = 1, . . . , n and∑n
i=1 Pi = 1) — which is observed until the same
outcome occurs k times in a row; this outcome is then declared to be the winner of the
game. For instance, if k = 2 and the sequence of outcomes is 1, 2, 4, 3, 5, 2, 1, 3, 3, then
the game stops after nine trials and number 3 is declared the winner.
(a) What is the probability that outcome i wins, i = 1, . . . , n?
(b) Determine the expected number of trials until an outcome is declared the winner. •
To prove that lim_{t→+∞} m(t)/t = 1/µ, which is not a simple consequence of Proposition 2.43 (Ross, 2003, p. 409),15 we have to digress to the notion of stopping time, state Wald's equation (Ross, 1983, p. 59) and establish such limit independently (Kulkarni, 1995, p. 418).
Definition 2.49 — Stopping time (Ross, 1983, p. 59)
An integer-valued r.v. N is said to be a stopping time for the sequence of independent
r.v. Xi : i ∈ N if the event N = n is independent of Xn+1, Xn+2, . . . , for all n ∈ N.16 •
Exercise 2.50 — Stopping time
Let Xi : i ∈ N be a sequence of i.i.d. r.v. with Bernoulli(1/2) distribution.17
(a) Prove that N = min{n : ∑_{i=1}^{n} Xi = 10} is a stopping time for this sequence.
(b) Now, consider Yi =st 2Xi − 1, i ∈ N, i.e., P(Yi = −1) = P(Yi = 1) = 1/2.
Show that N = min{n : ∑_{i=1}^{n} Yi = 1} is a stopping time for the sequence of i.i.d. r.v. Yi : i ∈ N. •
Exercise 2.51 — Stopping time (Ross, 2003, Exercise 13, pp. 462–463)
Let Xi : i ∈ N be a sequence of i.i.d. r.v. with Bernoulli(p) distribution, where 0 < p < 1,
and define:
(a) N1 = inf{n ∈ N : ∑_{i=1}^{n} Xi = 5};
(b) N2 = 3, if X1 = 0; 5, if X1 = 1;
(c) N3 = 3, if X4 = 0; 2, if X4 = 1.
Which of these three r.v. are stopping times for the sequence Xi : i ∈ N? Justify. •
15 Because it is not always true that lim_{t→+∞} m(t)/t (= lim_{t→+∞} E[N(t)/t]) = E[lim_{t→+∞} N(t)/t]; after all, almost sure convergence does not imply convergence in expected value.
16 N essentially represents the number of r.v. observed before stopping.
17 Ross (1983, p. 59) illustrates the notion of stopping time without proving that we are indeed dealing with two stopping times for two sequences of r.v.
Proposition 2.52 — Wald’s equation (Ross, 1983, Theorem 3.3.2, p. 59)
Let:
• Xi : i ∈ N be a sequence of i.i.d. r.v. with common finite expectation E(X);
• N be a stopping time for the sequence Xi : i ∈ N such that E(N) <∞.
Then
E(∑_{i=1}^{N} Xi) = E(N) × E(X). (2.26)
•
Exercise 2.53 — Wald’s equation
Prove Proposition 2.52 (Ross, 1983, pp. 59–60). •
Exercise 2.54 — Wald’s equation (Ross, 2003, Exercise 115, p. 464)
Consider a miner trapped in a room that contains 3 doors:
• door 1 leads him/her to freedom after two days of travel;
• door 2 returns him/her to the room after a four-day journey;
• door 3 returns him/her to the room after a six-day journey.
Suppose at all times the miner is equally likely to choose any of the 3 doors, and let T
denote the time it takes the miner to become free.
(a) Define a sequence of i.i.d. r.v. Xi : i ∈ N and a stopping time N such that T = ∑_{i=1}^{N} Xi.
(b) Use Wald’s equation to obtain E(T ).
(c) Compute E(∑_{i=1}^{N} Xi | N = n) and verify that it is not equal to E(∑_{i=1}^{n} Xi).
(d) Use part (c) for a second derivation of E(T). •
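Wald's equation can be checked by simulating the miner: with Xi i.i.d. uniform on {2, 4, 6} and N the first trial at which door 1 is chosen, E(N) = 3 and E(X) = 4, so E(T) = 12. A Monte Carlo sketch:

```python
import random

random.seed(3)
reps, total = 200_000, 0
for _ in range(reps):
    while True:                       # keep choosing doors until freedom
        x = random.choice((2, 4, 6))  # travel times; 2 = door to freedom
        total += x
        if x == 2:
            break

print(total / reps)   # ~12 = E(N) * E(X) = 3 * 4
```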
Exercise 2.55 — Back to stopping times
Argue that:
(a) N(t)+1 is indeed a stopping time for the sequence of inter-renewal times, X1, X2, . . .
(Ross, 1983, p. 60);
(b) N(t) is not a stopping time for X1, X2, . . . .18 •
Since N(t) + 1 is a stopping time for the sequence of inter-renewal times, we can use
Wald’s equation to state the following auxiliary result that plays a crucial role in the proof
of the elementary renewal theorem.
Proposition 2.56 — Relating E[S_{N(t)+1}] and the renewal function (Ross, 1983, Corollary 3.3.3, p. 61)
Let N(t) : t ≥ 0 be a renewal process, whose inter-renewal times X1, X2, . . . have common (and finite) expected value µ, and m(t) be its renewal function. Then the expected value of S_{N(t)+1} = ∑_{i=1}^{N(t)+1} Xi, the time of the first renewal after time t, is equal to:
E[S_{N(t)+1}] = µ × [m(t) + 1]. (2.27)
•
Exercise 2.57 — Relating E[S_{N(t)+1}] and the renewal function
Prove Proposition 2.56. •
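For a Poisson process the renewal function is known exactly (m(t) = λt), so (2.27) predicts E[S_{N(t)+1}] = µ × [m(t) + 1] = t + 1/λ. The following Monte Carlo sketch (our own illustration; λ = 1 and t = 10 are arbitrary choices) checks this.

```python
import random

def first_renewal_after(t, rng, lam=1.0):
    """S_{N(t)+1}: the time of the first renewal after t, Exp(lam) inter-renewals."""
    s = 0.0
    while s <= t:
        s += rng.expovariate(lam)
    return s

rng = random.Random(2)
t, reps = 10.0, 100_000
est = sum(first_renewal_after(t, rng) for _ in range(reps)) / reps
# For a PP(lam): mu * [m(t) + 1] = (1/lam) * (lam*t + 1) = t + 1/lam = 11
print(round(est, 2))
```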
Theorem 2.58 — Elementary renewal theorem (Ross, 1983, Theorem 3.3.4, p. 61; Ross, 2003, p. 409; Kulkarni, 1995, Theorem 8.9, p. 417)
Let {N(t) : t ≥ 0} be a renewal process whose inter-renewal times X1, X2, . . . have common expected value µ. Then
lim_{t→+∞} m(t)/t = 1/µ, (2.28)
where 1/∞ ≡ 0. That is, the expected average renewal rate converges to 1/µ. •
18 Hint: N(t) = n ⇔ Sn = ∑_{i=1}^{n} Xi ≤ t and S_{n+1} = ∑_{i=1}^{n+1} Xi > t (Ross, 2003, p. 464).
Exercise 2.59 — Elementary renewal theorem
(a) Prove Theorem 2.58 (Ross, 1983, p. 61; Kulkarni, 1995, pp. 419–420).
(b) Resume Exercise 2.23 and apply the elementary renewal theorem to obtain lim_{t→+∞} m(t)/t (Kulkarni, 1995, Example 8.21, pp. 430–431). •
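A numerical illustration of Theorem 2.58 (our own sketch, not from the sources): with Uniform(0, 2) inter-renewal times, µ = 1, so m(t)/t should approach 1 for large t.

```python
import random

def n_renewals(t, rng):
    """N(t) for Uniform(0, 2) inter-renewal times (mu = 1)."""
    s, n = 0.0, 0
    while True:
        s += rng.uniform(0.0, 2.0)
        if s > t:
            return n
        n += 1

rng = random.Random(7)
t, reps = 200.0, 20_000
m_t = sum(n_renewals(t, rng) for _ in range(reps)) / reps  # estimate of m(t) = E[N(t)]
rate = m_t / t
# Elementary renewal theorem: m(t)/t -> 1/mu = 1
print(round(rate, 3))
```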
Now, it is time to study the limiting behavior of the solutions of the renewal-type
equations (Kulkarni, 1995, p. 428). But before we proceed, we need to define lattice r.v.19
and its period,20 and also directly Riemann integrable (dRi) functions.
Definition 2.60 — Lattice r.v. and its period (Ross, 1983, p. 63; Kulkarni, 1995,
Definition 8.3, p. 428)
A non-negative r.v. X and its c.d.f. F are said to be lattice if there exists a constant d > 0 such that ∑_{n=0}^{+∞} P(X = nd) = 1, that is, if X only takes on integral multiples of some positive number d.
The largest d having this property is said to be the period of X. •
Example 2.61 — Lattice r.v. and its period (Ross, 1983, p. 63; Kulkarni, 1995,
Example 8.18, p. 428)
The r.v. taking values in the following sets are lattice:
• {0, 1, 2, . . .} (d = 1);
• {0, 2, 4, . . .} (d = 2);
• {0, √2} (d = √2). •
Definition 2.62 — Directly Riemann integrable (dRi) function (Caravena, 2012)
A non-negative function D, defined on (the real line or on) a half-line, is said to be
directly Riemann integrable if the upper and lower Riemann sums of D over the whole
(unbounded) domain converge to the same finite limit, as the mesh of the partition
vanishes. •
19 Or arithmetic or periodic r.v.
20 Or span.
Remark 2.63 — Directly Riemann integrable function (Ross, 1983, p. 64)
Let:
• D be a function defined on [0, +∞);
• m̄_n(a) (resp. m_n(a)) be the supremum (resp. infimum) of D(t) over the interval [(n − 1)a, na], for any a > 0.
Then D is said to be a directly Riemann integrable function if ∑_{n=1}^{+∞} m̄_n(a) and ∑_{n=1}^{+∞} m_n(a) are finite, for all a > 0, and lim_{a→0} ∑_{n=1}^{+∞} a m̄_n(a) = lim_{a→0} ∑_{n=1}^{+∞} a m_n(a).
A (jointly!) sufficient condition for D to be a directly Riemann integrable function is that
(i) D(t) ≥ 0, t ≥ 0,
(ii) D(t) is non-increasing, and
(iii) ∫_0^{+∞} D(t) dt < +∞. •
Theorem 2.64 — Key renewal theorem (Ross, 1983, Theorem 3.4.2, p. 65; Kulkarni, 1995, Theorem 8.11, pp. 428–429)
Let:
• {N(t) : t ≥ 0} be a renewal process, with renewal function m(t) and whose inter-renewal times {Xi : i ∈ N} have common c.d.f. F and expected value µ;
• D(t) be a directly Riemann integrable (dRi) function;
• H(t) be a solution to the following renewal-type equation:
H(t) = D(t) + ∫_0^t H(t − x) dF(x) = D(t) + ∫_0^t D(t − x) dm(x).
If F is not lattice then
lim_{t→+∞} H(t) = lim_{t→+∞} ∫_0^t H(t − x) dF(x) = lim_{t→+∞} ∫_0^t D(t − x) dm(x) = (1/µ) ∫_0^{+∞} D(y) dy. (2.29)
If F is lattice with period d then
lim_{k→+∞} H(kd + x) = (d/µ) ∑_{n=0}^{+∞} D(nd + x). (2.30) •
Remark 2.65 — Key renewal theorem
The proof of the key renewal theorem is excruciating and can be found in: Feller (1971, Vol. II, pp. 364–366), for non-lattice distributions; Feller (1968, Vol. I, pp. 335–337), for lattice distributions. •
Exercise 2.66 — Key renewal theorem
Use the key renewal theorem to obtain lim_{t→+∞} [m(t) − t/µ], when the inter-renewal times are not lattice and have finite variance (Kulkarni, 1995, Example 8.23, pp. 431–433). •
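Exercise 2.83(b) below records this limit as E(X²)/(2µ²) − 1, which invites a numerical check (our own sketch): for Gamma(2, 1) inter-renewal times, µ = 2 and E(X²) = 6, so the limit is 6/8 − 1 = −0.25, and m(t) is already very close to t/2 − 0.25 at moderate t.

```python
import random

def n_renewals(t, rng):
    """N(t) for Gamma(2, 1) inter-renewal times (mu = 2, E[X^2] = 6)."""
    s, n = 0.0, 0
    while True:
        s += rng.gammavariate(2, 1.0)
        if s > t:
            return n
        n += 1

rng = random.Random(1)
t, reps = 50.0, 40_000
m_t = sum(n_renewals(t, rng) for _ in range(reps)) / reps  # estimate of m(t)
excess = m_t - t / 2.0
# Key renewal theorem: m(t) - t/mu -> E(X^2)/(2 mu^2) - 1 = 6/8 - 1 = -0.25
print(round(excess, 2))
```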
The next theorem is an application of the key renewal theorem (Kulkarni, 1995, p.
429).
Theorem 2.67 — Blackwell's (renewal) theorem (Ross, 1983, Theorem 3.4.1, p. 63)
Let {N(t) : t ≥ 0} be a renewal process, with renewal function m(t) and whose inter-renewal times X1, X2, . . . have common c.d.f. F and expected value µ.
• If F is not lattice then
lim_{t→+∞} [m(t + a) − m(t)] = a/µ. (2.31)
• If F is lattice with period d then
lim_{n→+∞} E[number of renewals at nd] = d/µ. (2.32) •
Remark 2.68 — Interpreting Blackwell's theorem (Ross, 1983, pp. 63–64)
• Blackwell's theorem states that if F is not lattice then the expected number of renewals in an interval of length a, far from the origin, is approximately a/µ, i.e., it is proportional to the length of the interval (a) and to the long-run rate at which renewals occur (1/µ).
• If F is lattice with period d then lim_{t→+∞} [m(t + a) − m(t)] does not exist because renewals can only occur at integral multiples of d and, thus, the expected number of renewals in an interval far from the origin would clearly depend on how many integer multiples of the period d it contains and not on the interval length. In the lattice case the relevant limit is that of the expected number of renewals at nd, which is proportional to the period (d) and to the long-run rate at which renewals occur (1/µ). •
Remark 2.69 — Relating Blackwell's theorem and the key renewal theorem
Blackwell’s theorem and the key renewal theorem can be shown to be equivalent (Ross,
1983, p. 65). In fact, we can deduce Blackwell’s theorem from the key renewal theorem,
by considering a function D(t) = I[0,h](t), for a fixed h > 0 (Kulkarni, 1995, pp. 429–430);
the reverse can be proven by approximating the directly Riemann integrable function D(t)
with step functions (Ross, 1983, p. 65). •
Exercise 2.70 — Blackwell's theorem
Show that Blackwell's theorem is verified in the following cases:
(a) {N(t) : t ≥ 0} ∼ PP(λ) (Kulkarni, 1995, Example 8.20, p. 430);
(b) a renewal process {N(t) : t ≥ 0} whose inter-renewal times have a Gamma(α = 2, λ = 1) distribution. •
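Part (b) can also be probed numerically: for Gamma(α = 2, λ = 1) inter-renewal times, µ = 2, so the expected number of renewals in an interval of length a = 1 far from the origin should be close to a/µ = 0.5. The sketch below is our own illustration (not a proof); the window parameters are arbitrary.

```python
import random

def renewals_in_window(t, a, rng):
    """Number of renewals in (t, t+a] for Gamma(2, 1) inter-renewal times."""
    s, count = 0.0, 0
    while s <= t + a:
        s += rng.gammavariate(2, 1.0)  # inter-renewal time, mu = 2
        if t < s <= t + a:
            count += 1
    return count

rng = random.Random(3)
t, a, reps = 30.0, 1.0, 50_000
est = sum(renewals_in_window(t, a, rng) for _ in range(reps)) / reps
# Blackwell's theorem: m(t + a) - m(t) -> a/mu = 0.5
print(round(est, 3))
```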
2.6 Recurrence times; the inspection paradox
Now, we study the following r.v. associated with a renewal process {N(t) : t ≥ 0}.
Definition 2.71 — Age, residual life and total life at time t (Kulkarni, 1995, p.
433; Ross, 1983, pp. 67-68)
Let {N(t) : t ≥ 0} be a renewal process whose inter-renewal times are not lattice.
Then, for t ≥ 0, we define
A(t) = t − S_{N(t)}, (2.33)
Y(t) = S_{N(t)+1} − t, (2.34)
X_{N(t)+1} = S_{N(t)+1} − S_{N(t)} = A(t) + Y(t), (2.35)
which are called the age at time t, the residual (or excess) life at time t and the total life at time t, respectively. •
Remark 2.72 — Age, residual life and total life at time t (Kulkarni, 1995, p. 433;
Ross, 1983, pp. 67-68)
• A(t) represents the time from t since the last renewal and is sometimes called the
backward recurrence time.
• Y (t) denotes the time from t until the next renewal and is called the forward
recurrence time.
• X_{N(t)+1} represents the time between the last renewal before (or at) t and the first renewal after t, i.e., the inter-renewal time covering t. •
Exercise 2.73 — Age, residual life and total life at time t
(a) Draw a scheme with A(t), Y(t) and X_{N(t)+1} (Kulkarni, 1995, p. 434).
(b) Draw sample paths of the following stochastic processes:
(i) {A(t) : t ≥ 0} (the age process),
(ii) {Y(t) : t ≥ 0} (the residual life process) and
(iii) {X_{N(t)+1} : t ≥ 0} (the total life process)
(Kulkarni, 1995, pp. 434–435).21 •
Exercise 2.74 — Age, residual life and total life at time t (Ross, 1983, Exercise
3.11, p. 95)
Let A(t) and Y (t) denote the age and residual life at t of a renewal process. Fill in the
missing terms, considering 0 < x ≤ t and y > 0:
(a) A(t) > x ⇔ 0 events in the interval ______;
(b) Y(t) > y ⇔ 0 events in the interval ______;
(c) P[Y(t) > y] = P[A(______) > ______]. •
Exercise 2.75 — Age, residual life and total life at time t (bis) (Ross, 1983,
Exercises 3.11 and 3.12, p. 95)
Let A(t) and Y(t) denote the age and residual life at t of a renewal process, {N(t) : t ≥ 0}, with inter-renewal c.d.f. F. Consider 0 < x ≤ t, 0 ≤ s ≤ t + x/2, 0 ≤ u < t + x and y > 0, and find:
(a) the joint c.d.f. of (A(t), Y (t)) for a Poisson process;
(b) P [Y (t) > y | A(t) = x];
(c) P [Y (t) > x | A(t+ x/2) = s];
(d) P [Y (t) > x | A(t+ x) > u] for a Poisson process;
(e) P[A(t) > x, Y(t) > y]. •
21 According to Kulkarni (1995, p. 435), the sample paths of the age process have slope 1 and downward jumps of size Xn at Sn; the sample paths of the residual life process decrease at a unit rate with upward jumps of size X_{n+1} at Sn; the sample paths of the total life process are piecewise constant with upward or downward jumps of size X_{n+1} − Xn at Sn.
Proposition 2.76 — Relating E[Y (t)] and the renewal function
Let {N(t) : t ≥ 0} be a renewal process, whose inter-renewal times X1, X2, . . . have common (and finite) expected value µ, and let m(t) be its renewal function. Then the expected residual life is equal to
E[Y(t)] = µ × [m(t) + 1] − t. (2.36) •
Exercise 2.77 — Relating E[Y (t)] and the renewal function
Prove Proposition 2.76. •
Exercise 2.78 — Relating E[Y (t)] and the renewal function
Consider the renewal process whose inter-renewal times have a hypo-exponential distribution with parameters µ1 and µ2 (i.e., we are dealing with a convolution of two exponentials).22
(a) Obtain the renewal function m(t) using the relationship between the LST of m(t) and that of the common c.d.f. F(t) of the inter-renewal times: m̃(s) = F̃(s) / [1 − F̃(s)].23
(b) Determine the expected residual life at time t, E[Y (t)]. •
A curious feature of renewal processes is that if we wait some predetermined time t and then observe how large the renewal interval containing time t is, we should expect it to be larger than a typical renewal interval (http://en.wikipedia.org/wiki/Renewal_theory#The_inspection_paradox).
This counterintuitive fact is called the inspection paradox (Kulkarni, 1995, p. 439) and is formalized in the following proposition.
22 For more details check Proposition 1.19.
23 Ross (2003, Example 7.9, pp. 414–415) obtained the renewal function by first determining the expected residual life via a continuous-time Markov chain reasoning.
Proposition 2.79 — Inspection paradox (Ross, 1983, Exercise 3.3, p. 93)
Let:
• {N(t) : t ≥ 0} be a renewal process with inter-renewal times {Xi : i ∈ N} and inter-renewal distribution F;
• X_{N(t)+1} be the inter-renewal time covering t.
Then
P[X_{N(t)+1} > x] ≥ P(X > x), (2.37)
for any x > 0, i.e., X_{N(t)+1} ≥st Xi, i ∈ N.24 •
Exercise 2.80 — Inspection paradox (Ross, 1983, Exercise 3.3, p. 93)
(a) Prove Proposition 2.79 (Ross, 2003, p. 438).
(b) Compute P[X_{N(t)+1} > x] when F(x) = 1 − e^{−λx}, x ≥ 0 (Ross, 2003, pp. 439–440). •
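A Monte Carlo sketch of the inspection paradox in the setting of part (b), with λ = 1 (our own illustration; the function name is ours): the interval covering t should be stochastically larger than a typical Exp(1) inter-renewal time, whose mean is 1 and whose survival probability at 1 is e^{−1} ≈ 0.37.

```python
import random

def covering_interval(t, rng, lam=1.0):
    """Length X_{N(t)+1} of the inter-renewal interval covering time t."""
    s_prev, s = 0.0, 0.0
    while s <= t:
        s_prev = s
        s += rng.expovariate(lam)
    return s - s_prev

rng = random.Random(11)
t, reps = 20.0, 50_000
xs = [covering_interval(t, rng) for _ in range(reps)]
mean_cover = sum(xs) / reps                  # close to 2/lam for large t
frac_gt1 = sum(x > 1.0 for x in xs) / reps   # compare with P(X > 1) = e^{-1}
print(round(mean_cover, 2), round(frac_gt1, 3))
```

The covering interval averages about twice a typical inter-renewal time, as the limit (2.40) below anticipates.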
By capitalizing on limit theorems we are able to derive several results concerning the
limit behavior of the age, the residual life and the total life of a renewal process. Two of
those results are stated as an exercise.
Exercise 2.81 — Limit behavior of A(t)/t and E[Y(t)]/t
Use the:
(a) SLLN for renewal processes to prove that A(t)/t → 0 w.p.1 (Ross, 1983, Exercise 3.12, p. 95);
(b) elementary renewal theorem to show that lim_{t→+∞} E[Y(t)]/t = 0 (Ross, 2003, p. 414). •
Can we determine the limit behavior of E[A(t)], E[Y(t)] and E[X_{N(t)+1}]?
Yes!
We have to capitalize on the key renewal theorem for non-lattice inter-renewal times.
24 X_{N(t)+1} ≥st Xi reads as follows: X_{N(t)+1} is stochastically larger than Xi. Moreover, X_{N(t)+1} ≥st Xi, i ∈ N ⇒ E[X_{N(t)+1}] ≥ E(Xi), i ∈ N.
Proposition 2.82 — Limit behavior of E[Y(t)], E[A(t)] and E[X_{N(t)+1}] (Ross, 1983, Proposition 3.4.6, p. 71; Kulkarni, 1995, Theorem 8.13 and Corollary 8.4, pp. 438–439)
Consider a renewal process {N(t) : t ≥ 0} whose inter-renewal times have a common non-lattice distribution F, expected value E(X) = µ and E(X²) < +∞. Then:
lim_{t→+∞} E[Y(t)] = E(X²)/(2µ); (2.38)
lim_{t→+∞} E[A(t)] = E(X²)/(2µ); (2.39)
lim_{t→+∞} E[X_{N(t)+1}] = E(X²)/µ. (2.40) •
Note that lim_{t→+∞} E[X_{N(t)+1}] = E(X²)/µ ≥ µ = E(X), thus agreeing with the inspection paradox.
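A quick numerical check of (2.38)–(2.39) (our own sketch, with an arbitrary choice of distribution): for Uniform(0, 2) inter-renewal times, µ = 1 and E(X²) = 4/3, so both limiting expectations equal 2/3.

```python
import random

def age_and_residual(t, rng):
    """(A(t), Y(t)) for Uniform(0, 2) inter-renewal times."""
    s_prev, s = 0.0, 0.0
    while s <= t:
        s_prev = s
        s += rng.uniform(0.0, 2.0)
    return t - s_prev, s - t      # age and residual life at time t

rng = random.Random(5)
t, reps = 50.0, 40_000
pairs = [age_and_residual(t, rng) for _ in range(reps)]
mean_age = sum(a for a, _ in pairs) / reps
mean_res = sum(y for _, y in pairs) / reps
# Proposition 2.82 with mu = 1 and E(X^2) = 4/3: both limits equal 2/3
print(round(mean_age, 3), round(mean_res, 3))
```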
Exercise 2.83 — Limit behavior of E[Y(t)] and m(t) − t/µ
(a) Prove Proposition 2.82 (Ross, 1983, pp. 70–71; Kulkarni, 1995, pp. 438–439).25
(b) Use Proposition 2.82 to prove that if E(X²) < +∞ and X is not lattice then
lim_{t→+∞} [m(t) − t/µ] = E(X²)/(2µ²) − 1.26 •
Exercise 2.84 — Limit behavior of E[Y(t)]
Consider a renewal process whose inter-arrival distribution is Gamma(n, λ).
(a) Use Proposition 2.82 to prove that lim_{t→+∞} E[Y(t)] = (n + 1)/(2λ) (Ross, 1983, Exercise 3.13, p. 95).
(b) Compute lim_{t→+∞} E[Y(t)], by capitalizing not only on the fact that the inter-arrival distribution is a sum of n independent and exponentially distributed r.v., but also on the lack of memory of the exponential distribution and any convenient properties of the Poisson process. •
25 To prove the first (resp. second) result, derive the following renewal-type equation: E[Y(t)] = ∫_t^{+∞} (x − t) dF(x) + ∫_0^t E[Y(t − x)] dF(x), t ≥ 0 (resp. E[A(t)] = t × [1 − F(t)] + ∫_0^t E[A(t − x)] dF(x), t ≥ 0).
26 This should be the result of Exercise 2.66.
Can we use the key renewal theorem to derive the limiting survival function of A(t)
and Y (t)?
Yes!
Proposition 2.85 — Obtaining the limiting survival function of Y(t) via the key renewal theorem (Kulkarni, 1995, Theorem 8.12, p. 435)
Consider a renewal process {N(t) : t ≥ 0} whose inter-renewal times have a common non-lattice distribution F and expected value E(X) = µ. Then
lim_{t→+∞} P[Y(t) > x] = (1/µ) ∫_x^{+∞} [1 − F(u)] du, x > 0. (2.41) •
Exercise 2.86 — Obtaining the limiting survival function of Y (t) via the key
renewal theorem
Prove Proposition 2.85 (Kulkarni, 1995, pp. 435–436).27 •
Proposition 2.87 — Obtaining the limiting survival function of A(t) (Kulkarni, 1995, Corollary 8.2, p. 436)
Under the conditions of Proposition 2.85, the limiting survival function of A(t) is given by
lim_{t→+∞} P[A(t) > y] = (1/µ) ∫_y^{+∞} [1 − F(u)] du, y > 0. (2.42) •
Exercise 2.88 — Obtaining the limiting survival function of A(t)
Prove Proposition 2.87 (Kulkarni, 1995, p. 436).28 •
27 Consider H(t) = P[Y(t) > x], show that H(t) satisfies the renewal-type equation H(t) = [1 − F(x + t)] + ∫_0^t H(t − u) dF(u), and then apply the key renewal theorem.
28 Capitalize on the fact that A(t) > y ⇔ no renewals in [t − y, t] ⇔ Y(t − y) > y.
Exercise 2.89 — Obtaining the limiting c.d.f. of Y(t) and A(t)
Use propositions 2.85 and 2.87 to show that lim_{t→+∞} P[Y(t) ≤ x] = lim_{t→+∞} P[A(t) ≤ x] = (1/µ) ∫_0^x [1 − F(u)] du. •
Remark 2.90 — Equilibrium distribution (Kulkarni, 1995, p. 437; Ross, 2003, pp. 432 and 469)
The c.d.f. Fe(x) = (1/µ) ∫_0^x [1 − F(u)] du is called the equilibrium distribution associated with the inter-renewal distribution F. It represents the long-run proportion of time that the age and the residual life of the renewal process do not exceed x. •
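The equilibrium c.d.f. can be made concrete with a small simulation (our own illustration, with an arbitrary choice of F): for Uniform(0, 2) inter-renewal times, Fe(x) = x − x²/4 on [0, 2], so the empirical distribution of Y(t) for large t should match it, e.g. Fe(1) = 0.75.

```python
import random

def residual_life(t, rng):
    """Y(t) = S_{N(t)+1} - t for Uniform(0, 2) inter-renewal times (mu = 1)."""
    s = 0.0
    while s <= t:
        s += rng.uniform(0.0, 2.0)
    return s - t

def fe(x):
    """Equilibrium c.d.f.: (1/mu) * integral_0^x (1 - u/2) du = x - x^2/4."""
    return x - x * x / 4.0

rng = random.Random(9)
t, reps = 50.0, 40_000
ys = [residual_life(t, rng) for _ in range(reps)]
emp = sum(y <= 1.0 for y in ys) / reps   # empirical P[Y(t) <= 1]
print(round(emp, 3), fe(1.0))            # both should be close to 0.75
```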
Exercise 2.91 — Equilibrium distribution (Ross, 2003, Exercise 42, p. 469)
Let Fe(x) = (1/µ) ∫_0^x [1 − F(u)] du be the equilibrium distribution associated with the inter-renewal distribution F.
(a) Show that if F is an exponential distribution then F = Fe. Comment on this result.
(b) Let c be some positive constant and F(x) = I_{[c,+∞)}(x) (i.e., the inter-renewal times are all equal to c). Show that Fe is the uniform distribution over (0, c).
(c) The city of Berkeley, California, allows two hours of parking at all non-metered locations within one mile of the University of California. Parking officials regularly tour around, passing the same point every 2 hours. When an official encounters a car, he/she marks it with chalk. If the same car is there on the official's return 2 hours later, then a parking ticket is written.
What is the probability you receive a ticket if you park your car in one of those locations and return after 3 hours? •
2.7 Renewal reward processes
Can we generalize compound Poisson processes and study reward models associated with
renewal processes?
Yes!
They are called renewal reward processes: each time a renewal occurs we receive a reward. These processes are formally defined as follows.
Definition 2.92 — Renewal reward process (Kulkarni, 1995, p. 452; Ross, 2003, pp. 416–417)
Let:
• {N(t) : t ≥ 0} be a renewal process;
• Xn be the nth inter-renewal time (n ∈ N);
• Rn be the reward earned at the time of the nth renewal (n ∈ N);
• {(Xn, Rn) : n ∈ N} i.i.d. ∼ (X, R);29
• R(t) = ∑_{n=1}^{N(t)} Rn be the total reward earned by time t.30
Then {R(t) : t ≥ 0} is called a renewal reward process. •
Exercise 2.93 — Renewal reward process
(a) Are renewal processes and compound PP examples of renewal reward processes?
(b) Give a detailed example of a renewal reward process.31
(c) Having in mind that Rn is a real-valued r.v., draw a typical sample path of a renewal reward process {R(t) : t ≥ 0} (Kulkarni, 1995, p. 453).32 •
29 We shall assume that the rewards Rn, n ∈ N, can (and usually do) depend on Xn, n ∈ N.
30 R(t) = 0 if N(t) = 0.
31 See, for instance, Kulkarni (1995, Example 8.33, p. 453).
32 The sample paths of {R(t) : t ≥ 0} may go up and down, and a jump of size Rn occurs at time Sn = ∑_{i=1}^{n} Xi.
Computing the distribution of R(t) is rather difficult (Kulkarni, 1995, p. 454), and obtaining the expected value of R(t) is far from trivial, namely because N(t) is not a stopping time for the sequence of i.i.d. inter-renewal times (nor for the sequence of i.i.d. rewards).
Consequently, the question arises as to whether it is possible to study the limit behavior of R(t)/t and E[R(t)]/t.
Yes!
We can use the SLLN for renewal processes (resp. Wald's equation and the elementary renewal theorem) to compute lim_{t→+∞} R(t)/t (resp. lim_{t→+∞} E[R(t)]/t).
Proposition 2.94 — SLLN (and elementary renewal theorem) for renewal reward processes (Ross, 2003, Proposition 7.3, p. 417)
Let {R(t) : t ≥ 0} be a renewal reward process such that the common expected values of the rewards and of the inter-renewal times, E(R) and E(X), are finite. Then
R(t)/t → E(R)/E(X) w.p.1, (2.43)
i.e., the long-run reward per time unit equals E(R)/E(X). Moreover,
lim_{t→+∞} E[R(t)]/t = E(R)/E(X). (2.44) •
Exercise 2.95 — SLLN (and elementary renewal theorem) for renewal reward
processes
Prove Proposition 2.94 (Ross, 1983, pp. 78-79).33 •
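A simulation sketch of Proposition 2.94 (our own example, not from the sources): take Exp(1/2) cycle lengths (E(X) = 2) and let the reward of a cycle be its squared length, so that E(R) = E(X²) = 8 and the long-run reward rate should approach E(R)/E(X) = 4. Note that the rewards depend on the cycle lengths, as the definition allows.

```python
import random

def reward_rate(t, rng):
    """R(t)/t for Exp(mean 2) cycle lengths X_n with rewards R_n = X_n^2."""
    s, total = 0.0, 0.0
    while True:
        x = rng.expovariate(0.5)   # cycle length, E(X) = 2
        if s + x > t:
            return total / t       # R(t) counts only completed cycles
        s += x
        total += x * x             # reward R_n = X_n^2, earned at the renewal

rng = random.Random(13)
rate = reward_rate(1_000_000.0, rng)
# Proposition 2.94: rate -> E(R)/E(X) = E(X^2)/E(X) = 8/2 = 4
print(round(rate, 2))
```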
Exercise 2.96 — SLLN for renewal reward processes (Ross, 2003, Example 7.10,
pp. 417–418)
Resume Exercise 2.109 and suppose that the amounts that the successive customers deposit in the bank are independent r.v. with common c.d.f. H and expected value µH, respectively.
At what rate do deposits accumulate in the long-run? •
33 The proof of (2.43) is similar to that of the SLLN for renewal processes (Proposition 2.43).
Exercise 2.97 — SLLN for renewal reward processes (bis) (Ross, 2003, Example
7.11, pp. 418–419)
The lifetime of a car is a continuous r.v. with c.d.f. H and p.d.f. h. Evaristo has a policy
that he buys a new car as soon as his old one either breaks down or reaches the age of T
years. Suppose that a new car costs C1 (thousand) euros and also that an additional cost
of C2 (thousand) euros is incurred whenever Evaristo’s car breaks down.
(a) Under the assumption that a used car has no resale value, how much does Evaristo spend on cars per time unit in the long-run?34
(b) Now, suppose the lifetime of a car (in years) is uniformly distributed over (0, 10),
T ≤ 10, C1 = 3 (thousand) euros, and C2 = 0.5 (thousand) euros. What value of T
minimizes Evaristo’s cost per time unit in the long-run? •
Exercise 2.98 — SLLN for renewal reward processes (bis, bis) (Ross, 2003,
Exercises 22–24, pp. 465–466)
Resume part (a) of Exercise 2.97.
(a) Recalculate the long-run cost per time unit if one assumes that a T -year-old car in
working order has an expected resale value of R(T ).
(b) What value of T minimizes the previous cost per time unit in the long-run when:
(i) H represents the c.d.f. of the uniform distribution over (2, 8), C1 = 4 (thousand) euros, C2 = 1 (thousand) euros, and R(T) = 4 − T/2?
(ii) H is the c.d.f. of the exponential distribution with mean 5 years, C1 = 3
(thousand) euros, C2 = 0.5 (thousand) euros, and R(T ) = 0? Interpret the
result. •
Exercise 2.99 — SLLN for renewal reward processes (bis, bis, bis) (Ross, 2003,
Example 7.12, p. 420)
Suppose that:
34 Ross (2003, p. 419) called it the long-run average cost.
• customers arrive at a train depot according to a renewal process with mean inter-
arrival time µ;
• whenever there are N customers waiting in the depot, a train leaves;
• the depot incurs a cost at the rate of nc per time unit whenever there are n customers
waiting.
(a) What is the cost per time unit incurred by the train depot in the long-run?
(b) Suppose now that each time a train leaves, the depot incurs a cost of 6 monetary
units. What value of N minimizes the cost per time unit incurred by the train depot
in the long-run? •
Exercise 2.100 — Renewal reward processes (4bis) (Ross, 2003, Exercise 26, p.
466)
Resume Exercise 2.99 and suppose that:
• the customers arrive according to a Poisson process with rate λ;
• a train is summoned whenever there are N customers waiting in the depot, but the
train takes K time units to arrive at the depot;
• when the train arrives at the depot it picks up all waiting customers.
What is now the cost per time unit incurred by the train depot in the long-run? •
Exercise 2.101 — SLLN for renewal reward processes (5bis) (Ross, 2003,
Example 7.14, pp. 421–423)
Consider a manufacturing process that sequentially produces items, each of which is either
defective or acceptable. The following type of scheme is often employed in an attempt to
detect and eliminate most of the defective items:
• initially, every single item is inspected and this continues until there are k items
that are acceptable;
• at this point 100% inspection ends and each successive item is independently
inspected with probability α ∈ (0, 1);
• this partial inspection continues until a defective item is encountered, at which time
100% inspection is resumed, and the process begins anew.
Admit each item is defective with probability q, independently of the remaining items.
(a) What proportion of items are inspected in the long-run?
(b) If defective items are removed when detected, what proportion of the remaining items
are defective in the long-run? •
Exercise 2.102 — SLLN for renewal (reward) processes (Ross, 2003, Exercise 8,
p. 461)
A machine in use is replaced by a new machine either when it fails or when it reaches the
age of T years.
After having admitted that the lifetimes of the successive machines are independent with common c.d.f. F (resp. p.d.f. f), show that:
(a) the long-run rate at which machines are replaced equals 1 / {∫_0^T x f(x) dx + T × [1 − F(T)]};
(b) the long-run rate at which machines in use fail is given by F(T) / {∫_0^T x f(x) dx + T × [1 − F(T)]}. •
Exercise 2.103 — Key renewal theorem and renewal reward processes (bis,
bis) (Ross, 1983, Exercise 3.20, p. 97)
For a renewal reward process show that
lim_{t→+∞} E[R_{N(t)+1}] = E(R1 × X1) / E(X1). (2.45)
In this proof assume the inter-renewal distribution is not lattice and that any relevant function is dRi. •
2.8 Alternating renewal processes
Consider a system that alternates between two states, up (or on) and down (or off ), such that:
• the system is initially up/on and remains up/on for a random time U1;
• it then goes down/off and remains down/off for a random time D1;
• it then goes up/on for a time U2, then down/off for a time D2, etc.
Can we deduce the long-run proportion of time that the alternating renewal process
is up/on (resp. down/off)?
Yes!
It suffices to apply the key renewal theorem, but let us define first an alternating
renewal process.
Definition 2.104 — Alternating renewal process (Kulkarni, 1995, p. 447; Ross, 2003, pp. 66–67)
Let:
• Un be the nth up time;
• Dn be the nth down time;
• {(Un, Dn) : n ∈ N} i.i.d. ∼ (U, D);35
• Xn = Un + Dn be the duration of the nth up-and-down cycle;
• Z(t) be the state of the process at time t (1 ≡ up/on; 0 ≡ down/off ).
Then, with S0 = 0 and Sn = ∑_{i=1}^{n} Xi = ∑_{i=1}^{n} (Ui + Di),
Z(t) = 1, if ∃ n ∈ N0 : Sn ≤ t < Sn + U_{n+1}; Z(t) = 0, if ∃ n ∈ N0 : Sn + U_{n+1} ≤ t < S_{n+1}, (2.46)
and {Z(t) : t ≥ 0} is called an alternating renewal process.36 •
35 We allow Un and Dn to be dependent!
36 {(Un, Dn) : n ∈ N} is usually called the alternating renewal sequence.
Exercise 2.105 — Alternating renewal processes
Draw a typical sample path of an alternating renewal process (Kulkarni, 1995, p. 448). •
Proposition 2.106 — Key renewal theorem and alternating renewal processes (Ross, 1983, Theorem 3.4.4, p. 67; Kulkarni, 1995, Theorem 8.23, pp. 447–448)
Let:
• H, G and F be the distributions of Un, Dn and Xn = Un + Dn, respectively;
• E(U) = E(Un) (resp. E(D) = E(Dn)) denote the expected length of an up/on (resp. a down/off ) period;
• P(t) = P(system is up/on at time t) = P[Z(t) = 1];
• Q(t) = P(system is down/off at time t) = 1 − P(t) = P[Z(t) = 0].
If E(Xn) = E(Un + Dn) < +∞ and F is not lattice then the proportion of time that the system is up is, in the long-run, equal to
lim_{t→+∞} P(t) = E(U) / [E(U) + E(D)]. (2.47)
Moreover,
lim_{t→+∞} Q(t) = E(D) / [E(U) + E(D)]. (2.48) •
Remark 2.107 — Key renewal theorem and alternating renewal processes (Kulkarni, 1995, Theorem 8.23, p. 448)
If F is lattice with period d, then the results of Proposition 2.106 should be restated as follows:
lim_{n→+∞} P(nd) = E(U) / [E(U) + E(D)];
lim_{n→+∞} Q(nd) = E(D) / [E(U) + E(D)]. •
Exercise 2.108 — Key renewal theorem and alternating renewal processes
Prove Proposition 2.106 (Ross, 1983, p. 67; Kulkarni, 1995, pp. 448–449). •
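Proposition 2.106 can be illustrated with a small simulation (our own choices of distributions, not from the sources): Exp(1) up times (E(U) = 1) and Uniform(0, 1) down times (E(D) = 1/2) give a long-run up proportion of E(U)/[E(U) + E(D)] = 2/3.

```python
import random

def up_fraction(t_max, rng):
    """Fraction of [0, t_max] the system is up: Exp(1) up, Uniform(0,1) down."""
    t, up_time = 0.0, 0.0
    while t < t_max:
        u = rng.expovariate(1.0)        # up period, E(U) = 1
        up_time += min(u, t_max - t)    # clip the last up period at t_max
        t += u
        if t >= t_max:
            break
        t += rng.uniform(0.0, 1.0)      # down period, E(D) = 1/2
    return up_time / t_max

rng = random.Random(21)
frac = up_fraction(500_000.0, rng)
# Proposition 2.106: E(U)/[E(U) + E(D)] = 1/(1 + 0.5) = 2/3
print(round(frac, 3))
```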
Exercise 2.109 — Key renewal theorem and alternating renewal processes
Consider an M/G/1/1 system.37
(a) What is the rate at which customers enter the system in the long-run (Ross, 2003,
Example 7.7, p. 410)?
(b) What proportion of potential customers actually enter the system in the long-run (Ross, 2003, Example 7.7, p. 410)?
(c) Determine the long-run proportion of time that the server is busy. •
Exercise 2.110 — Key renewal theorem and alternating renewal processes
(bis)
Consider a single-server bank (with infinite capacity) to which customers arrive in accordance with a Poisson process with rate λ. Moreover, admit that the service time provided by the server is a r.v. with c.d.f. G and expected value µ^{−1} (µ > λ).38
Obtain the long-run proportion of time that the server is busy (Ross, 1989, Example 5.1d, pp. 322–325). •
The limit behavior of the c.d.f. of the age and the residual life of a renewal process can
be determined using Proposition 2.10639 and appropriate alternating renewal processes.
37 In this case: potential customers arrive at this single-server system according to a Poisson process with rate λ; a potential customer enters the system iff the only server is free when he/she arrives; the time spent in the system by an entering customer corresponds to the duration of the service provided by the server and is a r.v. with c.d.f. G.
38 We are dealing with an M/G/1 system.
39 Instead of the key renewal theorem.
Proposition 2.111 — Obtaining the limiting c.d.f. of A(t), Y(t) and X_{N(t)+1} via alternating renewal processes (Ross, 1983, Proposition 3.4.5, p. 68)
If the inter-renewal distribution F is not lattice and µ < +∞ then:
lim_{t→+∞} P[A(t) ≤ x] = (1/µ) ∫_0^x [1 − F(y)] dy; (2.49)
lim_{t→+∞} P[Y(t) ≤ x] = (1/µ) ∫_0^x [1 − F(y)] dy; (2.50)
lim_{t→+∞} P[X_{N(t)+1} ≤ x] = (1/µ) ∫_0^x y dF(y). (2.51) •
Exercise 2.112 — Obtaining the limiting c.d.f. of A(t), Y(t) and X_{N(t)+1} via alternating renewal processes
After having considered convenient alternating renewal processes, prove Proposition 2.111
(Ross, 1983, p. 68). •
Exercise 2.113 — Limiting c.d.f. of A(t) (Ross, 2003, Exercise 41, p. 469)
Each time a certain machine breaks down it is replaced by a new one of the same type.
In the long-run, what percentage of time is the machine in use less than one year if
the life distribution of a machine is:
(a) uniformly distributed over (0, 2)?
(b) exponentially distributed with expected value 1? •
2.9 Delayed renewal processes
Can we mathematically treat a counting process for which the first inter-event time has
a different distribution from the remaining ones?
Yes!
It is a generalization of renewal processes, which we describe in this section.
Definition 2.114 — Delayed renewal process (Ross, 1983, p. 74)
Let:
• {Xi : i ∈ N} be a sequence of independent non-negative r.v. representing the inter-event times, with X1 having distribution G and Xi (i = 2, 3, . . . ) having distribution F;
• S0 = 0;
• Sn = ∑_{i=1}^{n} Xi, n ∈ N;
• N_D(t) = sup{n ∈ N0 : Sn ≤ t} be the number of events that occurred in (0, t].
Then {N_D(t) : t ≥ 0} is called a (general or) delayed renewal process. •
Remark 2.115 — Delayed renewal process (Kulkarni, 1995, p. 440)
The term delayed is pertinent because the process behaves exactly like a (standard) renewal process after the occurrence of the first renewal at time X1. {N_D(t + X1) − 1 : t ≥ 0} is indeed a (standard) renewal process with inter-renewal distribution F. •
Exercise 2.116 — Delayed renewal process
Give a few detailed examples of delayed renewal processes (Kulkarni, 1995, examples 8.24
and 8.26, p. 440). •
Exercise 2.117 — Delayed renewal process (bis) (Ross, 1983, Exercise 3.15, p. 96)
In Exercise 2.109 suppose that potential customers arrive in accordance with a renewal
process having distribution F .
Would the number of events by time t constitute a (possibly delayed) renewal process
if an event corresponds to a customer:
(a) entering the system?
(b) leaving the system?
What if F were exponential? •
The delayed renewal process inherits most properties of the (standard) renewal process (Kulkarni, 1995, p. 440): for instance, N_D(t) and its expected value are finite in finite time, the p.f. of N_D(t) can be written in terms of the c.d.f. of Sn and S_{n+1}, the LST of the renewal function depends on the LSTs of the inter-renewal distributions G and F, etc.
Proposition 2.118 — Properties of N_D(t) (Kulkarni, 1995, pp. 440–441)
Let:
• {N_D(t) : t ≥ 0} be a delayed renewal process;
• G and F be the inter-renewal distributions referring to X1 and Xi (i = 2, 3, . . . ), respectively;
• F_{n−1}(t) = P(Sn − X1 = X2 + · · · + Xn ≤ t) be the (n − 1)-fold convolution of F with itself (n ∈ N);
• (G ⋆ F_{n−1})(t) = P(Sn ≤ t) = ∫_0^t G(t − x) dF_{n−1}(x).
Then
• P[N_D(t) < +∞] = 1, 0 ≤ t < +∞;
• P[N_D(t) = n] = P(Sn ≤ t) − P(S_{n+1} ≤ t) = (G ⋆ F_{n−1})(t) − (G ⋆ F_n)(t);
• m_D(t) = E[N_D(t)] < +∞, 0 ≤ t < +∞;
• m_D(t) = ∑_{n=1}^{+∞} (G ⋆ F_{n−1})(t);
• m̃_D(s) = ∫_{0−}^{+∞} e^{−st} dm_D(t) = G̃(s) / [1 − F̃(s)]. •
Moreover, it is easy to prove similar limit theorems for the delayed renewal process
(Ross, 1983, p. 74). Note that the distribution of the first inter-renewal time X1, G, plays
no role in the asymptotic behavior of the delayed renewal process (Kulkarni, 1995, p.
441), as illustrated by the following proposition.
Proposition 2.119 — Limit theorems for delayed renewal processes (Ross, 1983,
Proposition 3.5.1, pp. 74–75; Kulkarni, 1995, pp. 441–442)
Let {N_D(t) : t ≥ 0} be a delayed renewal process and µ = E(Xi), i = 2, 3, . . . . Then:
• N_D(t)/t → 1/µ w.p.1 (SLLN for delayed renewal processes);
• lim_{t→+∞} m_D(t)/t = 1/µ (elementary renewal theorem for delayed renewal processes);
• F is not lattice ⇒ lim_{t→+∞} [m_D(t + a) − m_D(t)] = a/µ (Blackwell's theorem for delayed renewal processes);
• G and F are lattice with period d ⇒ lim_{n→+∞} E[number of renewals at nd] = d/µ (ibidem). •
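The claim that G plays no role in the asymptotics can be checked numerically (our own sketch, with arbitrary choices of G and F): take a first inter-event time X1 ~ Uniform(0, 10) and Exp(1) inter-event times thereafter (µ = 1); N_D(t)/t should still approach 1/µ = 1.

```python
import random

def delayed_rate(t, rng):
    """N_D(t)/t for X1 ~ Uniform(0, 10) (c.d.f. G) and Xi ~ Exp(1) (c.d.f. F)."""
    s = rng.uniform(0.0, 10.0)     # first inter-event time, distribution G
    n = 0
    while s <= t:                  # count every event epoch S_n <= t
        n += 1
        s += rng.expovariate(1.0)  # subsequent inter-event times, mu = 1
    return n / t

rng = random.Random(17)
rate = delayed_rate(200_000.0, rng)
# Proposition 2.119: N_D(t)/t -> 1/mu = 1, whatever G is
print(round(rate, 3))
```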
Proposition 2.120 — Key renewal theorem for delayed renewal processes (Ross,
1983, Proposition 3.5.1, p. 75; Kulkarni, 1995, pp. 442–443)
Let:
• {N_D(t) : t ≥ 0} be a delayed renewal process and µ = E(Xi) < +∞, i = 2, 3, . . . ;
• D(t) be a dRi function;
• H_D(t) be a solution to the renewal-type equation H_D(t) = D(t) + ∫_0^t D(t − x) dm_D(x).
If F is not lattice then
lim_{t→+∞} H_D(t) = lim_{t→+∞} ∫_0^t D(t − x) dm_D(x) = (1/µ) ∫_0^{+∞} D(y) dy. (2.52) •
Exercise 2.121 — Delayed renewal processes (Ross, 1983, Exercise 3.18, p. 96)
Consider a delayed renewal process {N_D(t) : t ≥ 0} whose first inter-event time has distribution G and the others have distribution F.
(a) Prove that m_D(t) satisfies the following renewal-type equation:
m_D(t) = G(t) + ∫_0^t m(t − x) dG(x), (2.53)
where m(t) = ∑_{n=1}^{+∞} F_n(t).
(b) Show that if G has a finite mean then lim_{t→+∞} t × [1 − G(t)] = 0.
(c) Let A_D(t) denote the age at time t. Prove that if F is not lattice, with ∫_0^{+∞} x² dF(x) < +∞, and lim_{t→+∞} t × [1 − G(t)] = 0, then
lim_{t→+∞} E[A_D(t)] = (∫_0^{+∞} x² dF(x)) / (2 ∫_0^{+∞} x dF(x)). (2.54) •
2.10 Regenerative processes
Can we further generalize renewal processes?
Yes!
Renewal processes lead to an important and more general class of stochastic processes
defined below.
Definition 2.122 — Regenerative process (Ross, 1983, p. 84)
Consider a stochastic process {X(t) : t ≥ 0} with state space N0 and having the property that there are time points at which the stochastic process restarts itself (probabilistically speaking!).40 Then {X(t) : t ≥ 0} is called a regenerative process. •

40 That is, w.p.1 there is a time S1 such that the continuation of the process beyond S1 is a probabilistic replica of the whole process starting at 0. Note that this property implies the existence of further points S2, S3, ... with the same property as S1.
Remark 2.123 — Regenerative process (Ross, 1983, p. 84)
It follows that S1, S2, S3, ... constitute the event times of a renewal process. Moreover, we say that a cycle is completed every time a renewal occurs and N(t) = max{n ∈ N0 : Sn ≤ t} denotes the number of cycles by time t.41 •
Exercise 2.124 — Regenerative process
Give a few detailed examples of regenerative processes (Kulkarni, 1995, examples 8.36
and 8.39, pp. 460–461). •
To obtain the limiting p.f. of X(t) we have to use the key renewal theorem.
Proposition 2.125 — Limiting behavior of P [X(t) = j] (Kulkarni, 1995, Theorem
8.26, p. 461; Ross, 1983, Theorem 3.7.1, p. 84)
Let:
• X(t) : t ≥ 0 be a regenerative process with state space N0;42
• S1 be the first regeneration epoch and F its distribution;
• Uj be the time that the process spends in state j during [0, S1).
If F is not lattice and E(S1) < +∞ then

P_j ≡ lim_{t→+∞} P[X(t) = j]
    = E(amount of time in state j during a cycle) / E(time of a cycle)
    = E(U_j) / E(S1). (2.55)

•
Exercise 2.126 — Limiting behavior of P [X(t) = j]
Prove Proposition 2.125 (Ross, 1983, p. 84). •

41 Once again we consider S0 = 0.
42 And right-continuous sample paths with left limits.
Example/Exercise 2.127 — Limiting behavior of P [X(t) = j] (Ross, 1983, Exercise
3.26, p. 98)
Packages arrive at a mailing depot in accordance with a Poisson process having rate λ. Trucks arrive in accordance with a renewal process with a non-lattice inter-event distribution F; each truck instantly picks up all waiting packages and immediately leaves the depot. Let X(t) denote the number of packages waiting to be picked up at time t.
(a) What type of stochastic process is X(t) : t ≥ 0? Justify!
(b) Find an expression for limt→+∞ P [X(t) = j], j ∈ N0.
• Regenerative process
{X(t) : t ≥ 0}
• Regeneration times
Times S1, S2, ... of departing trucks
• Limiting value of P[X(t) = j]
According to Proposition 2.125,

lim_{t→+∞} P[X(t) = j] = E(U_j) / E(S1),

where:

E(S1) = ∫_0^{+∞} x dF(x);
E(U_j) = E[E(U_j | S1)] = ∫_0^{+∞} E(U_j | S1 = x) dF(x).
Conditioning on the number of packages that arrived during the time x elapsed between two successive truck departures, we can add that N(x) ∼ Poisson(λx) and

E(U_j) = ∫_0^{+∞} [ ∑_{i=j}^{+∞} E(U_j | S1 = x, N(x) = i) × P(N(x) = i) ] dF(x).
Furthermore, let S*_1, ..., S*_i be the epochs of the i arrivals of packages that occurred in an interval of length x. Since (S*_1, ..., S*_i | N(x) = i) behaves like the vector of order statistics of a random sample (Y_1, ..., Y_i) from a Uniform(0, x) distribution, we have

Y_(k)/x ∼ Beta(k, i − k + 1), k = 1, ..., i,
E[Y_(k)/x] = k / [k + (i − k + 1)] = k / (i + 1).

Moreover, there are j packages in the depot waiting to be picked up between the arrivals of the jth and (j + 1)th packages. Therefore,

(U_j/x | S1 = x, N(x) = i) ∼ Y_(j+1)/x − Y_(j)/x.
Consequently:

E[U_j | S1 = x, N(x) = i] = x × E[Y_(j+1)/x − Y_(j)/x | S1 = x, N(x) = i]
                          = x × [(j + 1)/(i + 1) − j/(i + 1)]
                          = x/(i + 1);

E(U_j) = ∫_0^{+∞} [ ∑_{i=j}^{+∞} x/(i + 1) × P(N(x) = i) ] dF(x)
       = ∫_0^{+∞} (1/λ) ∑_{i=j}^{+∞} e^{−λx} (λx)^{i+1} / (i + 1)! dF(x)
       = (1/λ) ∫_0^{+∞} F_{Gamma(j+1,λ)}(x) dF(x),

since the sum equals P[N(x) ≥ j + 1], i.e., the probability that the (j + 1)th arrival epoch — a Gamma(j + 1, λ) r.v. — does not exceed x;

lim_{t→+∞} P[X(t) = j] = ∫_0^{+∞} F_{Gamma(j+1,λ)}(x) dF(x) / [λ × ∫_0^{+∞} x dF(x)].
•
We are particularly interested in determining the long-run proportion of time a regenerative process spends in state j (Ross, 2003, p. 425). We can obtain this quantity by applying the theory of renewal reward processes.
Proposition 2.128 — Long-run proportion of time that X(t) = j (Ross, 1983,
Theorem 3.7.2, p. 85)
For a regenerative process {X(t) : t ≥ 0} with E(S1) < +∞, we have

[amount of time in state j during (0, t)] / t → P_j, w.p.1, (2.56)

that is, the long-run proportion of time a regenerative process spends in state j is equal to P_j. •
Exercise 2.129 — Long-run proportion of time that X(t) = j
Prove Proposition 2.128 (Ross, 1983, p. 85). •
Exercise 2.130 — Long-run proportion of time that X(t) = j (Kulkarni, 1995,
Example 8.41, pp. 465–466)
Customers arrive at a bus depot according to a renewal process with i.i.d. inter-arrival
times with mean µ < +∞. As soon as there are k (k ∈ N) customers waiting at the
depot, a shuttle is immediately dispatched to (instantly) clear all the k customers.
Let X(t) denote the number of customers in the depot at time t. What is the long-run proportion of time the bus depot has j (j ∈ {0, 1, ..., k − 1}) customers? •
Chapter 3
Discrete time Markov chains
While trying to realistically model a system, we are forced to tackle all sorts of
dependencies which make for unmanageable or impossible calculations (Resnick, 1992,
p. 60). Thus, when constructing a stochastic model, the challenge is to have dependencies
which allow for sufficient realism but which can be analytically tamed to permit
mathematical tractability (Resnick, 1992, p. 60). Markov processes balance these two
demands quite nicely because
conditional on a history up to the present, the probabilistic structure of the
future does not depend on the whole history but only on the present
(Resnick, 1992, p. 60), i.e., they satisfy the Markov property (Kulkarni, 1995, p. 17).
Markov processes were named after the Russian mathematician Andrey Markov (1856–1922), who produced the first purely theoretical results for these processes in 1906 (http://en.wikipedia.org/wiki/Markov chain).
This chapter is devoted to the simplest Markov processes, time-homogeneous discrete
time Markov chains (DTMC) with finite or countable state space.1
A few quantities that could be modeled by a DTMC: the state of deterioration of a
piece of equipment; the popularity of a politician; the inventory level of an item in a store;
the number of jobs waiting to be processed by a computer (Hastings, 2001, p. 309).
1For the definition of DTMC with a general state space, the reader is referred to Kulkarni (1995,
Definition 2.1, pp. 17–18).
3.1 Definitions and examples
The formal definition and examples of time-homogeneous DTMC with finite or countable
state space can be found in this section.
Definition 3.1 — Time-homogeneous DTMC with finite or countable state
space (Ross, 2003, p. 181)
Let {Xn : n ∈ N0} be a stochastic process with a finite or countable state space S. If

P(Xn+1 = j | Xn = i, Xn−1 = in−1, ..., X0 = i0) = P(Xn+1 = j | Xn = i) = Pij, (3.1)

for all i0, ..., in−1, i, j ∈ S and n ∈ N0, then {Xn : n ∈ N0} is said to be a time-homogeneous DTMC with finite or countable state space.2 •
From now on, we shall assume that the state space S is finite or countable and the
DTMC is time-homogeneous, thus, we shall drop the terms time-homogeneous and with
finite or countable state space whenever we refer to such a DTMC.
Remark 3.2 — (One-step) transition probability matrices, stochastic matrices,
transition diagrams
• A DTMC is a probabilistic model that undergoes transitions from one state to
another (http://en.wikipedia.org/wiki/Markov chain). Consequently, the law of
motion is specified by one-step transition probabilities (Walrand, 2004, p. 225) and,
rightly so, the matrix
P = [Pij]i,j∈S (3.2)
is called the one-step transition probability matrix. Often we omit the word one-
step and simply refer to P as transition probability matrix (Kulkarni, 1995, p. 17)
or briefly TPM.
2 Although the stochastic process possesses stationary/time-homogeneous transition probabilities, it is in general not stationary (Resnick, 1992, p. 64). If P(Xn+1 = j | Xn = i) depends not only on i and j, but also on n, {Xn : n ∈ N0} is said to be a (non-homogeneous) DTMC with finite or countable state space.
• P is a stochastic matrix because Pij ≥ 0, for all i, j ∈ S, and ∑_{j∈S} Pij = 1, for all i ∈ S (Kulkarni, 1995, Definition 2.3 and Theorem 2.1, pp. 17–18).
• The random behavior of a DTMC is best visualized via its transition diagram — a
directed graph with
– one node for each state i ∈ S,
– a directed arc from node i to node j if Pij > 0 and
– a loop at node i (i.e., an arc from node i to itself) if Pii > 0
(Kulkarni, 1995, p. 19). •
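As a small illustration of the remark above (a Python sketch of mine, not from the cited sources), the functions below verify the defining properties of a stochastic matrix and list the arcs of the transition diagram; the two-state matrix used is the weather TPM discussed later in Example 3.5.

```python
def is_stochastic(P, tol=1e-12):
    """Check the stochastic-matrix properties:
    nonnegative entries and rows summing to 1."""
    return all(all(p >= 0 for p in row) and abs(sum(row) - 1.0) < tol
               for row in P)

def diagram_arcs(P):
    """Arcs (i, j) of the transition diagram: one per strictly positive Pij;
    pairs with i == j are the loops."""
    return [(i, j) for i, row in enumerate(P) for j, p in enumerate(row) if p > 0]

# two-state weather chain (0 = sunny, 1 = rainy)
P = [[0.9, 0.1],
     [0.5, 0.5]]
print(is_stochastic(P))  # True
print(diagram_arcs(P))   # [(0, 0), (0, 1), (1, 0), (1, 1)]
```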
Proposition 3.3 — Characterization of a DTMC (Kulkarni, 1995, Theorem 2.2, p.
18)
A DTMC Xn : n ∈ N0 is fully characterized by
• its TPM, P, and
• the p.f. of X0 denoted by α = [αi]i∈S = [P (X0 = i)]i∈S .3 •
Remark 3.4 — Applications of DTMC
Markov chains constitute an important class of probabilistic models because they are
fairly general and good numerical techniques exist for computing probabilistic quantities
referring to Markov chains (Walrand, 2004, p. 225). Unsurprisingly, the application of
Markov chains has been reported in areas, such as:
• chemistry (the classical model of enzyme activity, Michaelis-Menten kinetics, can
be viewed as a Markov chain);
• internet (the PageRank of a webpage as used by Google is defined by a Markov
chain; Markov models have also been used to analyze web navigation behavior of
users);
3 Since X0 is the initial state of the DTMC, the row vector [αi]_{i∈S} is usually called the initial distribution of the DTMC.
• music (Markov chains are employed in algorithmic music composition, particularly
in software programs such as CSound, Max or SuperCollider);
• physics (Markov processes appear extensively in thermodynamics and statistical
mechanics);
• sports (Markov chain models have been used in advanced baseball analysis since
1960).4
According to Kulkarni (1995, p. 33), DTMC have also been used in
• sociology to study the issues of social mobility (how the social/economic status of
the nth generation affects that of the (n + 1)th generation) or the impact of social
(and sexist!) traditions (how family names propagate through generations in a
family tree, etc.).
For an extensive account of other applications of DTMC (namely in genetics, manpower
planning, neurology, telecommunication), the reader is referred to Kulkarni (1995, pp.
30–41). •
Example 3.5 — DTMC and weather prediction (Ross, 2003, Example 4.1, p. 182;
http://en.wikipedia.org/wiki/Examples of Markov chains)
A very simple weather model can be represented by the following transition diagram, where a sunny day is 90% likely to be followed by another sunny day, and a rainy day is 50% likely to be followed by another rainy day. As a consequence, the probabilities of the weather conditions (sunny or rainy), given the weather on the preceding day, are given by the TPM

P =
[ 0.9 0.1 ]
[ 0.5 0.5 ].

Its rows can be labelled sunny and rainy, and its columns are labelled in the same way and order.

This two-state DTMC is a particular case of the one with a TPM of the form

P =
[ α 1−α ]
[ β 1−β ],

where α, β ∈ [0, 1]. •

4 For a more detailed account of these applications of DTMC, consult http://en.wikipedia.org/wiki/Markov chain or http://en.wikipedia.org/wiki/Examples of Markov chains.
Exercise 3.6 — TPM (Ross, 2003, Example 4.3, p. 182)
On any given day, Evaristo is either cheerful (C), so-so (S), or glum (G).
• If he is cheerful today, then he will be C, S, or G tomorrow with probabilities 0.5,
0.4, 0.1, respectively.
• If he is feeling so-so today, then he will be C, S, or G tomorrow with probabilities
0.3, 0.4, 0.3.
• If he is glum today, then he will be C, S, or G tomorrow with probabilities 0.2, 0.3,
0.5.
Let Xn denote Evaristo’s mood on the nth day, then Xn : n ∈ N0 is a three-state Markov
chain (state 1 = C, state 2 = S, state 3 = G).
Identify the TPM of this DTMC. •
Exercise 3.7 — Peculiar TPM
(a) Let Xn : n ∈ N0 be a sequence of i.i.d. discrete r.v. with p.f. pj = P (Xn = j), j ∈ S.
Identify the TPM of this DTMC (Kulkarni, 1995, Example 2.3, pp. 21–22).
(b) Let:
• {Zn : n ∈ N0} be a sequence of i.i.d. discrete r.v. with common p.f. pj = P(Zn = j), j ∈ Z;
• X0 = 0 and Xn = ∑_{i=1}^{n} Zi, n ∈ N.

Then {Xn : n ∈ N0} is a DTMC with state space S = Z.
Identify the TPM of this DTMC and note that Pij = p_{j−i}, i.e., {Xn : n ∈ N0} is a space-homogeneous DTMC (Kulkarni, 1995, Example 2.4, pp. 22–23). •
Exercise 3.8 — Identifying DTMC (Ross, 1983, Exercise 4.7, p. 135)
Let X1, X2, ... be independent r.v. such that P(Xi = j) = αj, j ≥ 0. Say that a record occurs at time n if Xn > max{X1, ..., Xn−1}, where X0 = −∞, and if a record does occur at time n call Xn the record value.
Let Ri denote the ith record value.
(a) Argue that {Ri : i ≥ 1} is a Markov chain and compute its transition probabilities.
(b) Let Ti denote the time between the ith and (i + 1)th record.
(i) Is {Ti : i ≥ 1} a Markov chain?
(ii) What about {(Ri, Ti) : i ≥ 1}?
Compute transition probabilities where appropriate. •
Exercise 3.9 — More on transition probabilities (Ross, 1983, Exercise 4.2, p. 134)
Prove that, for a DTMC,
P (Xn = j | Xn1 = in1 , . . . , Xnk = ink) = P (Xn = j | Xnk = ink),
whenever 0 ≤ n1 < n2 < · · · < nk < n. •
3.2 Chapman-Kolmogorov equations; marginal and
joint distributions
So far we dealt with one-step transition probabilities Pij, i, j ∈ S. Can we calculate
n−step transition probabilities, such as
P (Xn+m = j | Xm = i), i, j ∈ S, n,m ∈ N0? (3.3)
Yes!
The n-step transition probabilities P(Xn+m = j | Xm = i) are usually denoted by P^n_ij (Ross, 1983, p. 103; Ross, 2003, p. 185), or by P^(n)_ij, or even by p^(n)_ij (Kulkarni, 1995, p. 41).
Proposition 3.10 — Chapman-Kolmogorov equations and n − step transition
probabilities (Ross, 2003, p. 185; Kulkarni, 1995, Theorem 2.3, p. 42)
The Chapman-Kolmogorov equations provide a method for computing the n−step
transition probabilities and can be stated as follows:
P^{n+m}_ij = P(Xn+m = j | X0 = i) (3.4)
           = ∑_{k∈S} P^n_ik × P^m_kj, (3.5)

for i, j ∈ S and n, m ∈ N0. Equivalently, P^n_ij = ∑_{k∈S} P^l_ik × P^{n−l}_kj, for i, j ∈ S, n ∈ N0 and a fixed l in {0, 1, ..., n}. •
Exercise 3.11 — Chapman-Kolmogorov equations
Prove the Chapman-Kolmogorov equations (Ross, 2003, p. 185; Kulkarni, 1995, Theorem
2.3, p. 42). •
Proposition 3.12 — n−step TPM (Ross, 2003, p. 186; Kulkarni, 1995, Theorem 2.4,
p. 42)
Let P(n) = [P^n_ij]_{i,j∈S} denote the n-step TPM. Then the Chapman-Kolmogorov equations assert that P(n+m) = P(n) P(m), n, m ∈ N0, and, most importantly,

P(n) = P^n, n ∈ N0, (3.6)

i.e., the n-step TPM may be calculated by multiplying the TPM P by itself n times. •
Remark 3.13 — Computing n−step TPM
The n−step TPM may be obtained either by multiplying the TPM P by itself n times or
by using the method of direct multiplication described by Algorithm 2 in Kulkarni (1995,
p. 47). For instance, to obtain P^37 we have to compute P^2, P^4 = (P^2)^2, P^8 = (P^4)^2, P^16 = (P^8)^2, P^32 = (P^16)^2, and compute P^37 as P × P^4 × P^32 (since 37 = 1 + 4 + 32).
For more methods of computing the powers of a TPM, please refer to Kulkarni (1995,
pp. 47–54) and Kleinrock (1975, pp. 36–38). •
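The "direct multiplication" method in Remark 3.13 is binary exponentiation of the TPM. The following sketch (mine, not from Kulkarni or Kleinrock) computes P^n with about log2(n) matrix products by squaring P repeatedly and multiplying in the squares that correspond to the binary digits of n, exactly as in the P^37 = P × P^4 × P^32 example.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def mat_pow(P, n):
    """n-step TPM P^n by repeated squaring (binary exponentiation)."""
    m = len(P)
    result = [[float(i == j) for j in range(m)] for i in range(m)]  # P^0 = I
    while n > 0:
        if n & 1:                        # this binary digit of n is set:
            result = mat_mul(result, P)  # multiply the current square in
        P = mat_mul(P, P)                # P, P^2, P^4, P^8, ...
        n >>= 1
    return result

P = [[0.9, 0.1],
     [0.5, 0.5]]
print(mat_pow(P, 2)[0][0])  # 0.9*0.9 + 0.1*0.5 ≈ 0.86
```

The Chapman-Kolmogorov identity P(5) = P(2) P(3) can be checked directly with these helpers.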
Exercise 3.14 — n−step TPM
Consider a DTMC with two states and TPM

P =
[ p  1−p ]
[ 1−p  p ].

Use mathematical induction to prove that the n-step TPM is given by

P^n =
[ 1/2 + (2p−1)^n/2   1/2 − (2p−1)^n/2 ]
[ 1/2 − (2p−1)^n/2   1/2 + (2p−1)^n/2 ],

for n ∈ N0. •
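A quick numerical check of the claimed closed form (my own Python sketch, not a substitute for the induction proof) multiplies P by itself and compares every entry with 1/2 ± (2p−1)^n/2 for an illustrative p = 0.7:

```python
p = 0.7
P = [[p, 1 - p],
     [1 - p, p]]

def closed_form(n):
    """Claimed diagonal entry of P^n."""
    return 0.5 + (2 * p - 1) ** n / 2

Pn = [[1.0, 0.0], [0.0, 1.0]]  # P^0 = I
for n in range(1, 11):
    Pn = [[sum(Pn[i][k] * P[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    assert abs(Pn[0][0] - closed_form(n)) < 1e-12        # diagonal entries
    assert abs(Pn[0][1] - (1 - closed_form(n))) < 1e-12  # off-diagonal entries
print("closed form verified for n = 1, ..., 10")
```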
Exercise 3.15 — n−step TPM (bis) (Ross, 2003, Example 4.3, p. 182)
Resume Exercise 3.6.
(a) Obtain the probability that Evaristo is cheerful (C) two days after being glum (G).
(b) What is the probability that Evaristo is not cheerful (C) in four days time, given that
he is so-so (S) today? •
Exercise 3.16 — n−step TPM (bis, bis) (Resnick, 1992, Exercise 2.1, p. 147)
Consider a DTMC, {Xn : n ≥ 0}, with state space {1, 2, 3} and TPM equal to

P =
[ 0.3 0.3 0.4 ]
[ 0.2 0.7 0.1 ]
[ 0.2 0.3 0.5 ].

Compute:
(a) P(X8 = 3 | X0 = 1);
(b) P(X4 = 3, X8 = 3 | X0 = 1);
(c) P(X16 = 3 | X0 = 1);
(d) P(X12 = 3, X16 = 3 | X0 = 1).
Try not to do this by hand. •
Can we obtain P (Xn = j), for j ∈ S?
Yes! But how?
We need to know the initial distribution of the DTMC, to use the total probability
law and to capitalize on the n−step transition probabilities.
Proposition 3.17 — Marginal probabilities
Let:
• Xn : n ∈ N0 be a DTMC with TPM P = [Pij]i,j∈S ;
• α = [αi]i∈S be the row vector with the initial distribution of the DTMC (i.e., the
p.f. of X0).
Then
P(Xn = j) = ∑_{i∈S} P(X0 = i) × P(Xn = j | X0 = i)
          = ∑_{i∈S} αi × P^n_ij, j ∈ S, (3.7)

and the row vector with the p.f. of Xn is given by

αn = [P(Xn = j)]_{j∈S} = α P^n. (3.8)

•
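In code, Eq. (3.8) amounts to n vector-matrix products. The sketch below (mine, illustrative) applies it to the two-state weather chain with a hypothetical initial distribution α = [0.6, 0.4]:

```python
def step_distribution(alpha, P, n):
    """p.f. of X_n: alpha_n = alpha P^n, computed as n vector-matrix products."""
    for _ in range(n):
        alpha = [sum(alpha[i] * P[i][j] for i in range(len(P)))
                 for j in range(len(P))]
    return alpha

P = [[0.9, 0.1],
     [0.5, 0.5]]
alpha = [0.6, 0.4]  # P(X0 = sunny) = 0.6, P(X0 = rainy) = 0.4
print(step_distribution(alpha, P, 1))
# [0.6*0.9 + 0.4*0.5, 0.6*0.1 + 0.4*0.5] = [0.74, 0.26]
```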
Exercise 3.18 — Marginal probabilities
Resume Exercise 3.16 and calculate P(X8 = 2) by considering the probability distribution of the initial state given by α1 = 0.7, α2 = 0.2 and α3 = 0.1. •
Exercise 3.19 — n−step transition probabilities and marginal probabilities
Brand switching models are used quite often in practice by industries to predict market shares, etc. Admit that a DTMC, {Xn : n ≥ 0}, with state space {A, B, C} and TPM equal to

P =
[ 0.1 0.2 0.7 ]
[ 0.2 0.4 0.4 ]
[ 0.1 0.3 0.6 ]

describes the choice of beer brand a typical customer buys weekly (Kulkarni, 1995, Example 2.6, p. 26).

(a) Compute P(X2 = A | X0 = B).

(b) Consider the initial distribution α = [0.2 0.3 0.5] (Kulkarni, 1995, Example 3.1, p. 65) and find P(X2 = A). •
Can we compute joint probabilities such as P (Xn1 = in1 , . . . , Xnk = ink)?
Yes!
By simply capitalizing on the multiplication rule and on the Markov property.
Proposition 3.20 — Joint probabilities
Let:
• Xn : n ∈ N0 be a DTMC with TPM P = [Pij]i,j∈S ;
• α = [αi]i∈S be the row vector with the initial distribution of this DTMC.
Then
P(Xnk = ink, ..., Xn1 = in1) = (∑_{i∈S} αi × P^{n1}_{i,in1}) × ∏_{j=2}^{k} P^{nj−nj−1}_{inj−1,inj}, (3.9)

for 0 ≤ n1 < n2 < ... < nk and in1, ..., ink ∈ S. •
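Proposition 3.20 translates directly into code. The sketch below (mine, not from the sources; the chain and distributions are illustrative) computes P(X1 = 0, X3 = 1) for a two-state chain with a uniform initial distribution: a marginal term for the first epoch, then one transition factor per gap between observation times.

```python
def mat_mul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def mat_pow(P, n):
    R = [[float(i == j) for j in range(len(P))] for i in range(len(P))]
    for _ in range(n):
        R = mat_mul(R, P)
    return R

def joint_prob(alpha, P, times, states):
    """P(X_{n1} = i1, ..., X_{nk} = ik) via Eq. (3.9)."""
    n1, i1 = times[0], states[0]
    Pn1 = mat_pow(P, n1)
    prob = sum(alpha[i] * Pn1[i][i1] for i in range(len(P)))  # marginal at n1
    for (n_prev, i_prev), (n_cur, i_cur) in zip(zip(times, states),
                                                zip(times[1:], states[1:])):
        prob *= mat_pow(P, n_cur - n_prev)[i_prev][i_cur]     # one factor per gap
    return prob

alpha = [0.5, 0.5]
P = [[0.9, 0.1],
     [0.5, 0.5]]
print(joint_prob(alpha, P, [1, 3], [0, 1]))
# P(X1 = 0) * (P^2)_{01} = 0.7 * 0.14 = 0.098
```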
Exercise 3.21 — Joint probabilities
Prove Proposition 3.20. •
Exercise 3.22 — Marginal and joint probabilities
Let {Xn : n ∈ N0} be a DTMC with state space S = {1, 2, 3, 4}, TPM

P =
[ 0.1 0.2 0.3 0.4 ]
[ 0.2 0.2 0.3 0.3 ]
[ 0.5 0   0.5 0   ]
[ 0.6 0.2 0.1 0.1 ]

and initial distribution α = [0.25 0.25 0.25 0.25]. Compute:
(a) the p.f., the expected value and the variance of X4;
(b) P(X3 = 4, X2 = 1, X1 = 3, X0 = 1);
(c) P(X3 = 4, X2 = 1, X1 = 3)
(Kulkarni, 1995, Example 2.10, pp. 43–44). •
3.3 Classification of states; recurrent and transient
states
To infer the evolution of the DTMC it is critical to understand which paths through the
state space are possible and to unravel the allowable movements of the stochastic process
(Resnick, 1992, p. 77).
To identify which states j can be reached from a (starting) state i, we need to define
the notion of accessibility.
Definition 3.23 — Accessibility (Ross, 2003, p. 189)
State j is said to be accessible from state i — for short, i → j — if P^n_ij > 0 for some n ∈ N0. •
Remark 3.24 — Accessibility
A state i ∈ S is accessible from itself (i.e., i → i) because P^0_ii = P(X0 = i | X0 = i) = 1 > 0 (see footnote 5). •
Exercise 3.25 — Accessibility
Consider a DTMC with TPM

P =
[ p  1−p ]
[ 1−p  p ],

where p ∈ (0, 1).

(a) Draw the transition diagram of this DTMC.

(b) Is state 2 accessible from state 1?

(c) Is state 2 accessible from state 1 if p = 1? And if p = 0?
5 In fact, P^0 = [P(X0 = j | X0 = i)]_{i,j∈S} = I_{#S×#S}, where I_{#S×#S} represents the identity matrix with rank #S.
(d) Compute P^{2n}_{12}, n ∈ N, when p = 0, and comment on this result. •
Definition 3.26 — Communicating states (Ross, 2003, p. 190)
Two states i and j that are accessible to each other are said to communicate, and we
write i↔ j. •
Proposition 3.27 — Properties of communication (Ross, 1983, Proposition 4.2.1,
p. 104; Kulkarni, 1995, Theorem 3.1, p. 71)
Communication is an equivalence relation, i.e.:
• i↔ i (reflexivity);
• i↔ j ⇒ j ↔ i (symmetry);
• i↔ j, j ↔ k ⇒ i↔ k (transitivity). •
Exercise 3.28 — Properties of communication
Prove Proposition 3.27 (Ross, 1983, p. 104; Kulkarni, 1995, p. 71). •
Two states that communicate are said to be in the same class. Moreover, since
communication is a reflexive, symmetric and transitive relation, we can use it to partition
the state space S into subsets known as communicating classes (Kulkarni, 1995, p. 71).
Definition 3.29 — Communicating class (Kulkarni, 1995, Definition 3.3, p. 71;
http://en.wikipedia.org/wiki/Markov chain)
A set of states C ⊂ S is a communicating class if every pair of states in C communicates
with each other, and no state in C communicates with any state not in C, that is:
(i) i, j ∈ C ⇒ i↔ j;
(ii) i ∈ C, i↔ j ⇒ j ∈ C. •
Definition 3.30 — Closed communicating class (Kulkarni, 1995, Definition 3.4, p.
72)
A communicating class C ⊂ S is said to be closed if i ∈ C, j ∉ C ⇒ i ↛ j. •
Definition 3.31 — Irreducible and reducible DTMC
(http://en.wikipedia.org/wiki/Markov chain; Ross, 2003, p. 190; Kulkarni, 1995,
Definition 3.5, p. 72)
The DTMC is said to be irreducible if its state space S is a single closed communicating
class.6 Otherwise, the DTMC is called reducible. •
Exercise 3.32 — Irreducible and reducible DTMC
Draw the transition diagrams of the DTMC with state space S = {1, 2} and the following TPM, verify if they are irreducible/reducible and whether the communicating classes are closed or not:

(a) P =
[ 0.2 0.8 ]
[ 0.3 0.7 ],

(b) P =
[ 1   0   ]
[ 0.3 0.7 ],

(c) P =
[ 1 0 ]
[ 0 1 ]
(Kulkarni, 1995, Example 3.4, pp. 73–74).

(d) Consider now S = {1, 2, 3, 4, 5, 6}. Draw the transition diagram associated to the following TPM and identify the communicating classes

P =
[ 1/2  0   0   0   0   1/2 ]
[ 0   1/3  0   0   2/3  0  ]
[ 1/6 1/6 1/6 1/6 1/6 1/6 ]
[ 0    0   0   1   0    0  ]
[ 0   2/3  0   0   1/3  0  ]
[ 1/2  0   0   0   0   1/2 ].

Which of these classes are closed? (Kulkarni, 1995, Example 3.5, p. 74). •

6 In other words, if it is possible to get to any state from any state, that is, if all states communicate with each other.
After introducing the notions of accessibility, communication, communicating class
and irreducibility, it is time to introduce the concept of periodicity.
Definition 3.33 — Periodic and aperiodic states (Ross, 1983, pp. 104–105;
Kulkarni, 1995, Definitions 3.6 and 3.7, pp. 74–75)
State i is said to be periodic, with period d ≡ d(i), if P^n_ii = 0 whenever n is not divisible by d and d is the greatest positive integer with this property.
A state with period d = 1 is said to be aperiodic. •
Remark 3.34 — Periodicity
• If P^n_ii = 0 for all n ∈ N, then the period of state i is said to be infinite (Ross, 1983, pp. 104–105), that is, the DTMC never returns to state i after leaving this state.

• An alternative definition of periodicity can be stated in terms of a r.v. that tells us when the DTMC revisits state i (the first hitting time or recurrence time), given that it started in state i:

Ti = min{n ∈ N : Xn = i | X0 = i}. (3.10)

Then state i is said to be periodic with period d ≡ d(i) if d is the largest integer such that

P(Ti = n) > 0 ⇒ n is an integer multiple of d (3.11)

(Kulkarni, 1995, Definition 3.7, p. 75). •
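For a small finite chain, the period can be approximated mechanically as the gcd of the n with (P^n)_ii > 0 up to some horizon. The sketch below is my own illustration (the truncation at n_max is a heuristic, not a proof), applied to the deterministic alternating chain that appears later in Exercise 3.35:

```python
from math import gcd

def mat_mul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def period(P, i, n_max=50):
    """gcd of the n <= n_max with (P^n)_ii > 0; 0 means no return observed."""
    d, Pn = 0, P
    for n in range(1, n_max + 1):
        if Pn[i][i] > 0:
            d = gcd(d, n)
        Pn = mat_mul(Pn, P)
    return d

# the chain that alternates deterministically between its two states
P = [[0.0, 1.0],
     [1.0, 0.0]]
print(period(P, 0), period(P, 1))  # 2 2
```

Both states share the same period, in line with Proposition 3.36 (periodicity is a class property).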
Exercise 3.35 — Periodicity
Consider the DTMC on S = {1, 2} with TPM

P =
[ 0 1 ]
[ 1 0 ]

(Kulkarni, 1995, Example 3.8, p. 76).
Find the period of state 1, d ≡ d(1), and show that state 2 has the same period. •
Exercise 3.35 suggests that, like communication, periodicity is a class property.
Proposition 3.36 — A property of periodicity (Ross, 1983, Proposition 4.2.2, p. 105;
Kulkarni, 1995, Theorem 3.2, p. 76)
Periodicity is a class property, i.e.,
i↔ j ⇒ d(i) = d(j) (3.12)
•
Exercise 3.37 — A property of periodicity
Show Proposition 3.36 (Ross, 1983, p. 105; Kulkarni, 1995, p. 76). •
Now, we introduce the concepts of recurrence and transience of states. These notions
play an important role in the study of the limiting behavior of DTMC (Kulkarni, 1995, p.
77). But before proceeding into the definitions of recurrent and transient states, we need
to define the following probabilities.
Definition 3.38 — Probabilities of a first transition to a state and of ever
making a transition to a state (Ross, 1983, p. 105)
For any states i, j ∈ S define f^n_ij to be the probability that, starting from state i, the first transition into state j occurs exactly at time n. Formally,

f^0_ij = 0, i ≠ j (f^0_ii = 1), (3.13)
f^n_ij = P(Xn = j, Xn−1 ≠ j, ..., X1 ≠ j | X0 = i), n ∈ N. (3.14)

(The computation of f^n_ij is thoroughly discussed in Section 3.9.)
The probability of ever making a transition into state j, given that the process starts in state i, equals

fij = ∑_{n=1}^{+∞} f^n_ij. (3.15)

•
Note that for i 6= j, fij is positive iff state j is accessible from state i.
Exercise 3.39 — More on n−step transition probabilities (Ross, 1983, Exercise
4.4, p. 134)
Prove that P nij =
∑nk=1 f
kijP
n−kjj . •
Definition 3.40 — Recurrent and transient states (Ross, 2003, p. 191)
For any state i ∈ S, let

fi ≡ fii = ∑_{n=1}^{+∞} f^n_ii = P(Ti < +∞) (3.16)

be the probability that, starting in state i, the process will ever reenter state i.7 Then state i is said to be:
• recurrent if fi = 1;
• transient if fi < 1. •
Remark 3.41 — Transient and recurrent states, recurrence time and absorbing
states
• A state i is said to be transient if, given that we start in state i, there is a
non-zero probability that we will never return to i, that is, P (Ti < +∞) < 1
(http://en.wikipedia.org/wiki/Markov chain).
• State i is recurrent (or persistent) if it is not transient; recurrent states have finite
recurrence time with probability 1 (http://en.wikipedia.org/wiki/Markov chain),
i.e., P (Ti < +∞) = 1.
• If state i is transient then, starting in state i, the number of time periods that the process will be in state i has a geometric distribution with finite mean 1/(1 − fi) (Ross, 2003, pp. 191–192).
• If Pii = 1 then state i is said to be an absorbing state; in this case state i is
(obviously!) a recurrent state. •
7 Recall that the r.v. Ti is the first return time to state i.
Exercise 3.42 — Recurrent and transient states (bis) (Resnick, 1992, Exercise
2.15(a), p. 151)
The Media Police has identified six states associated with TV watching habits of its
inhabitants:
• 1 (never watch TV);
• 2 (watch only PBS);
• 3 (watch TV fairly frequently);
• 4 (addicted);
• 5 (undergoing behavior modification);
• 6 (brain dead).
Transitions from state to state can be modeled as a DTMC with the following TPM:

P =
[ 1   0   0   0   0   0   ]
[ 0.5 0   0.5 0   0   0   ]
[ 0.1 0   0.5 0.3 0   0.1 ]
[ 0   0   0   0.7 0.1 0.2 ]
[ 1/3 0   0   1/3 1/3 0   ]
[ 0   0   0   0   0   1   ].

After having drawn the associated transition diagram, identify which states are transient and which are recurrent. •
Necessary and sufficient conditions to guarantee recurrence and transience of a state i can be written in terms of the expected number of periods that the DTMC is in state i, ∑_{n=1}^{+∞} P^n_ii (Ross, 2003, p. 192).8

8 By letting In = 1, if Xn = i, and In = 0, otherwise, we have that ∑_{n=1}^{+∞} In represents the number of periods that the DTMC is in state i; moreover, E(∑_{n=1}^{+∞} In | X0 = i) = ∑_{n=1}^{+∞} P^n_ii (Ross, 2003, p. 192).
Proposition 3.43 — Recurrent and transient states (Ross, 1983, Proposition 4.2.3,
p. 105; Ross, 2003, p. 192)
These are necessary and sufficient conditions for recurrence and transience:
• state i is recurrent iff ∑_{n=1}^{+∞} P^n_ii = +∞;
• state i is transient iff ∑_{n=1}^{+∞} P^n_ii < +∞. •
Exercise 3.44 — Recurrent and transient states
Prove Proposition 3.43 (Ross, 1983, p. 105). •
Exercise 3.45 — More on transient states (Ross, 1983, Exercise 4.8, p. 135)
Show that if fi ≡ fii < 1 and fj ≡ fjj < 1 then:
(a) ∑_{n=1}^{+∞} P^n_ij < +∞;
(b) fij = [∑_{n=1}^{+∞} P^n_ij] / [1 + ∑_{n=1}^{+∞} P^n_jj]. •
Proposition 3.46 — Property of recurrence and transience (Ross, 2003, p. 193;
Kulkarni, 1995, Theorem 3.5, p. 81)
Recurrence and transience9 are class properties, i.e.:
i is recurrent (resp. transient), i↔ j ⇒ j is recurrent (resp. transient). (3.17)
•
Exercise 3.47 — Property of recurrence and transience
Prove Proposition 3.46 (Ross, 2003, p. 193; Kulkarni, 1995, p. 81). •
Exercise 3.48 — Classification of states
Consider a DTMC {Xn : n ∈ N0} with state space S = Z and transition probabilities given by

P_{i,i+1} = p = 1 − P_{i,i−1}, i ∈ Z,

where 0 < p < 1 and {Xn : n ∈ N0} represents the one-dimensional random walk on the integer number line.

9 Like communication and periodicity.
Classify the states of this DTMC (Ross, 2003, Example 4.15, pp. 194–195). •
Exercise 3.49 — Classification of states of a symmetric random walk (bis)
(Ross, 1983, Exercise 4.5, p. 134)
Show that the symmetric random walk is recurrent in two dimensions and transient in 3
dimensions (Ross, 2003, Example 4.15, pp. 195–196; Resnick, 1992, pp. 95–97). •
Definition 3.50 — Mean recurrence time (Ross, 1983, p. 108)
The mean recurrence time at state i is the expected number of transitions needed to return to state i:

µii = E(Ti) =
  +∞, if state i is transient;
  ∑_{n=1}^{+∞} n × f^n_ii, if state i is recurrent. (3.18)

•
Even if the recurrence time Ti is finite with probability 1 when the state i is recurrent,
it need not have a finite expectation (http://en.wikipedia.org/wiki/Markov chain).
Unsurprisingly, recurrent states are further classified according to the finiteness (or not)
of this expected value.
Definition 3.51 — Positive and null recurrence (Ross, 1983, p. 108; Kulkarni, 1995,
Definition 3.9, p. 78)
A recurrent state i is said to be:
• positive recurrent if µii < +∞;
• null recurrent, if µii = +∞. •
Exercise 3.52 — Property of positive and null recurrence (Ross, 1983, Exercise
4.10, p. 135)
Show that positive and null recurrence are class properties (Kulkarni, 1995, p. 82). •
The next proposition gives a necessary and sufficient condition for positive and null
recurrence.
Proposition 3.53 — Necessary and sufficient condition for positive and null
recurrence (Kulkarni, 1995, Theorem 3.4, pp. 80–81)
Let

P*_ij(n) = [1/(n + 1)] ∑_{k=0}^{n} P^k_ij (3.19)

be the expected number of visits to state j starting from state i, per time unit, up to time n. Then a recurrent state i is:
• positive recurrent iff lim_{n→+∞} P*_ii(n) > 0;
• null recurrent iff lim_{n→+∞} P*_ii(n) = 0. •
Definition 3.54 — Recurrent/transient/positive-recurrent/null-recurrent
classes (resp. DTMC) (Kulkarni, 1995, definitions 3.10 and 3.11, p. 82)
A communicating class (resp. DTMC) is said to be recurrent/transient/positive-
recurrent/null-recurrent if all its states are recurrent/transient/positive recurrent/null
recurrent. •
Determining recurrence and transience of a finite communicating class or a finite state
space DTMC is trivial (Kulkarni, 1995, p. 82), essentially because of the following results.
Proposition 3.55 — Classification of states of a finite communicating class and
a finite state space DTMC (Kulkarni, 1995, theorems 3.7 and 3.8 and corollaries 3.1
and 3.2, pp. 82–84)
• Let C ⊂ S be a finite closed communicating class. Then all states in C are
positive recurrent.
• Let C ⊂ S be a finite communicating class that is not closed. Then all
states in C are transient.
• There are no null recurrent states in a finite state space DTMC.
• Not all states in a finite state space DTMC can be transient. •
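For a finite chain, Proposition 3.55 reduces the classification to graph reachability: find the communicating classes and check whether each one is closed. The following is my own Python sketch (not from Kulkarni); the transitive closure is computed with a Floyd–Warshall-style triple loop, and TPM (b) of Exercise 3.32 serves as the example.

```python
def reachable(P):
    """R[i][j] = True iff j is accessible from i (P^n_ij > 0 for some n >= 0)."""
    n = len(P)
    R = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):                       # transitive closure of the arc relation
        for i in range(n):
            for j in range(n):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R

def classify(P):
    """Communicating classes of a finite DTMC, each tagged 'positive recurrent'
    (closed class) or 'transient' (non-closed class), per Proposition 3.55."""
    n, R = len(P), reachable(P)
    seen, classes = set(), []
    for i in range(n):
        if i in seen:
            continue
        C = {j for j in range(n) if R[i][j] and R[j][i]}  # states communicating with i
        seen |= C
        closed = all(not R[i2][j] for i2 in C for j in range(n) if j not in C)
        classes.append((sorted(C), "positive recurrent" if closed else "transient"))
    return classes

# TPM (b) of Exercise 3.32: state 0 is absorbing, state 1 leads to it
P = [[1.0, 0.0],
     [0.3, 0.7]]
print(classify(P))  # [([0], 'positive recurrent'), ([1], 'transient')]
```

The same helper settles all the finite chains of Exercise 3.57 mechanically.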
The simple and extremely useful results in Proposition 3.55 do not hold for infinite-state-space DTMC (Kulkarni, 1995, p. 84).
For methods of establishing transience and recurrence in the infinite-state-space case,
please refer to (Kulkarni, 1995, pp. 84–98).
Exercise 3.56 — Classification of states of a finite state space DTMC (Ross,
1983, Exercise 4.11, p. 135)
Show that in a finite state space DTMC there are no null recurrent states and not all
states can be transient. •
Exercise 3.57 — More on the classification of states
Specify the classes of the following DTMC, and determine whether they are
recurrent/transient/positive-recurrent/null-recurrent classes:
(a) P =
[ p  1−p ]
[ 1−p  p ], where p ∈ (0, 1);

(b) P1 =
[ 0   1/2 1/2 ]
[ 1/2 0   1/2 ]
[ 1/2 1/2 0   ]

P2 =
[ 0   0   0 1 ]
[ 0   0   0 1 ]
[ 1/2 1/2 0 0 ]
[ 0   0   1 0 ]
(Ross, 2003, Exercise 14, p. 254);

(c) P3 =
[ 0   0   1   0 ]
[ 1   0   0   0 ]
[ 1/2 1/2 0   0 ]
[ 1/3 1/3 1/3 0 ]

P4 =
[ 0   1 0   0 ]
[ 0   0 0   1 ]
[ 0   1 0   0 ]
[ 1/3 0 2/3 0 ];

(d) P5 =
[ 1/2 0   1/2 0   0   ]
[ 1/4 1/2 1/4 0   0   ]
[ 1/2 0   1/2 0   0   ]
[ 0   0   0   1/2 1/2 ]
[ 0   0   0   1/2 1/2 ]

P6 =
[ 1/4 3/4 0   0   0 ]
[ 1/2 1/2 0   0   0 ]
[ 0   0   1   0   0 ]
[ 0   0   1/3 2/3 0 ]
[ 1   0   0   0   0 ];

(e) P7 =
[ 1/3 0   2/3 0   0   0   ]
[ 0   1/4 0   3/4 0   0   ]
[ 2/3 0   1/3 0   0   0   ]
[ 0   1/5 0   4/5 0   0   ]
[ 1/4 1/4 0   0   1/4 1/4 ]
[ 1/6 1/6 1/6 1/6 1/6 1/6 ]

P8 =
[ 1   0   0   0   0   0 ]
[ 0   3/4 1/4 0   0   0 ]
[ 0   1/8 7/8 0   0   0 ]
[ 1/4 1/4 0   1/8 3/8 0 ]
[ 1/3 0   1/6 1/4 1/4 0 ]
[ 0   0   0   0   0   1 ]. •
3.4 Limit behavior of irreducible Markov chains
Let:
• Xn : n ∈ N0 be an irreducible DTMC with state space S and TPM P;
• α = [P (X0 = j)]j∈S be the initial distribution of the DTMC;
• αn = [P (Xn = j)]j∈S be the marginal distribution of Xn.
Since αn = αPn, it is clear that once we obtain the limiting behavior of Pn we can
immediately derive the limiting behavior of αn (Kulkarni, 1995, p. 65), as illustrated
by the following example/exercise.
Example/Exercise 3.58 — Limiting behavior of the DTMC for brand
switching (Kulkarni, 1995, pp. 65–66)
Resume the brand switching model described in Exercise 3.19. Recall that the TPM in
this case is
P =
[ 0.1  0.2  0.7 ]
[ 0.2  0.4  0.4 ]
[ 0.1  0.3  0.6 ]
.
141
Now, suppose that the initial distribution is
α = [0.2 0.3 0.5],
i.e., a typical customer buys brands A, B and C with probabilities 0.2, 0.3 and 0.5 at
time 0, respectively — essentially, 0.2, 0.3 and 0.5 are the initial market shares of these 3
brands.
It is of interest to the manufacturers of brands A, B and C to know how the market
shares will evolve with time (n). For that matter, use Mathematica to complete the
following table
n     Pn                                                                             αn
1     [0.1 0.2 0.7; 0.2 0.4 0.4; 0.1 0.3 0.6]                                        [0.130 0.310 0.560]
2     [0.12 0.31 0.57; 0.14 0.32 0.54; 0.13 0.32 0.55]                               [0.131 0.318 0.551]
5     [0.13188 0.31869 0.54943; 0.13186 0.31868 0.54946; 0.13187 0.31868 0.54945]    [0.131869 0.318682 0.549449]
10    [ ]                                                                            [ ]
100   [ ]                                                                            [ ]
and realize that all 3 rows of Pn converge to [0.131868 0.318681 0.549451], as n→ +∞,
and so does αn.
The numbers 0.131868, 0.318681 and 0.549451 represent the long-run market shares
of brands A, B and C,10 respectively. •
10 I.e., the fraction of the long-run daily sales volume that goes to brands A, B and C (Kulkarni, 1995, p. 68).
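The table above was meant to be completed with Mathematica; the same computation can be sketched in Python (using numpy — an assumption, since these notes rely on Mathematica):

```python
import numpy as np

# TPM and initial distribution of the brand-switching DTMC (Example/Exercise 3.58)
P = np.array([[0.1, 0.2, 0.7],
              [0.2, 0.4, 0.4],
              [0.1, 0.3, 0.6]])
alpha = np.array([0.2, 0.3, 0.5])

for n in (1, 2, 5, 10, 100):
    Pn = np.linalg.matrix_power(P, n)   # n-step TPM, P^n
    alpha_n = alpha @ Pn                # marginal distribution of X_n
    print(n, alpha_n.round(6))
```

For n = 100 every row of Pn, and αn itself, agrees with [0.131868 0.318681 0.549451] to six decimal places.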
142
Before we provide a summary of the main results concerning the limiting behavior of
irreducible DTMC, we need a preliminary definition (Ross, 1983, p. 108).
Definition 3.59 — Stationary distribution (Ross, 1983, p. 108)
A probability distribution {Pj : j ∈ S} is said to be stationary for a DTMC {Xn : n ∈ N0},
with state space S and TPM P = [Pij]i,j∈S, if
Pj = ∑_{i∈S} Pi Pij,  j ∈ S.  (3.20)
•
Exercise 3.60 — Stationary distribution
Use mathematical induction to show that if the p.f. of X0 is given by {Pj : j ∈ S} defined
in (3.20) then
P(Xn = j) = ∑_{i∈S} P(Xn = j | Xn−1 = i) × P(Xn−1 = i) = Pj,
for all n ∈ N and j ∈ S (Ross, 1983, pp. 108–109). •
The rather convenient limiting results that are going to be stated in the next theorems
are a consequence of the discrete version of the key renewal theorem if we interpret the
transitions into state j as being renewals, as suggested by Ross (1983, p. 108).11
11 Kulkarni (1995, p. 100) called it the discrete renewal theorem (Kulkarni, 1995, Theorem 3.11, p. 100).
Ross (1983, Theorem 4.3.1, p. 108) also states it, as follows. Let Nj(t) be the number of transitions into
state j up to time t. If states i and j communicate then:
(i) P[ lim_{n→+∞} Nj(n)/n = 1/µjj | X0 = i ] = 1;
(ii) lim_{n→+∞} (1/n) ∑_{k=1}^n P^k_ij = 1/µjj;
(iii) if state j is aperiodic then lim_{n→+∞} P^n_ij = 1/µjj;
(iv) if state j is periodic with period d then lim_{n→+∞} P^{nd}_jj = d/µjj.
143
Theorem 3.61 — Limiting behavior of irreducible aperiodic DTMC (Kulkarni,
1995, theorems 3.13–3.15, pp. 103–105; Ross, 1983, Theorem 4.3.3, p. 109)
An irreducible aperiodic DTMC, with state space S and TPM P = [Pij]i,j∈S , belongs
to one of the following two classes.
(i) Either the states are all transient or all null recurrent, and in this case
lim_{n→+∞} P^n_ij = 0, i, j ∈ S,  (3.21)
and there is no stationary distribution.
(ii) Or else all states are positive recurrent and
lim_{n→+∞} P^n_ij = πj > 0, i, j ∈ S,  (3.22)
where {πj : j ∈ S} is the unique stationary distribution and satisfies the following
system of equations:
πj = ∑_{i∈S} πi Pij, j ∈ S
∑_{j∈S} πj = 1.  (3.23)
•
Remark 3.62 — Limiting behavior of an irreducible positive recurrent DTMC
• An irreducible DTMC is positive recurrent iff there is a solution to the system
of equations (3.23); if there is a solution then it is unique and πj > 0, j ∈ S
(Kulkarni, 1995, Theorem 3.18, p. 111).
• The previous result is extremely useful because it allows us to solve (3.23) without
first checking for positive recurrence; in fact, if we can solve (3.23), positive
recurrence is automatically guaranteed (Kulkarni, 1995, p. 111). •
Theorem 3.63 — Limiting behavior of irreducible positive recurrent and
periodic DTMC (Ross, 1983, p. 111; Kulkarni, 1995, Theorem 3.17, p. 109)
144
For an irreducible positive recurrent and periodic DTMC with period d,
lim_{n→+∞} P^{nd}_ij = d × πj, i, j ∈ S,  (3.24)
where {πj : j ∈ S} is the unique non-negative solution of (3.23).12 •
Remark 3.64 — Interpretation of the πj (Kulkarni, 1995, p. 111)
• In an aperiodic DTMC, πj has two interpretations:
(i) πj is the limiting probability that the DTMC is in state j;
(ii) πj is the long-run fraction of time that the DTMC spends in state j, and
µjj = 1/πj.
• If the DTMC is periodic then only the second interpretation is valid.
• {πj : j ∈ S} is the stationary distribution regardless of whether the DTMC
is aperiodic or not — if X0 has p.f. {πj : j ∈ S} then Xn has the same p.f. for all
n ∈ N (recall Exercise 3.60). •
Exercise 3.65 — Limiting behavior of irreducible aperiodic DTMC
Resume Example/Exercise 3.58 (brand switching) and use Theorem 3.61 to confirm that
the stationary distribution is given by [0.131868 0.318681 0.549451]. •
Exercise 3.66 — Limiting behavior of irreducible aperiodic DTMC (bis)
Resume Exercise 3.6 in which Evaristo's mood is governed by a DTMC with a TPM
P =
[ 0.5  0.4  0.1 ]
[ 0.3  0.4  0.3 ]
[ 0.2  0.3  0.5 ]
.
In the long-run, what proportion of time is the stochastic process in each of the three
states (Ross, 2003, Example 4.18, p. 202)? •
12 Equivalently, lim_{n→+∞} P*_ij(n) = lim_{n→+∞} (1/(n+1)) ∑_{k=0}^n P^k_ij = πj, i, j ∈ S.
145
Remark 3.67 — Obtaining the limiting probabilities (Kulkarni, 1995, p. 113)
• The study of the limiting behavior of irreducible positive recurrent DTMC
essentially involves solving the system of equations
πj = ∑_{i∈S} πi Pij, j ∈ S
∑_{j∈S} πj = 1.
• If the DTMC has finite state space and the transition probabilities are given
numerically (such as in exercises 3.65 and 3.66) — rather than algebraically as in
exercises 3.72 and 3.73 — then one can provide numerical values for the πj : j ∈ S,
namely by making use of Proposition 3.68 and Mathematica to avoid tedious
calculations by hand. •
Proposition 3.68 — Obtaining the limiting probabilities numerically (Resnick,
1992, Proposition 2.14.1, p. 138)
Let:
• 1 = [1 · · · 1] be a row vector with #S = m ones;
• I be the m×m identity matrix;
• P = [Pij]i,j∈S be an m×m irreducible TPM;
• ONE be the m×m matrix all of whose entries are equal to 1;
• π = [πj]j∈S be the row vector denoting the stationary distribution.
Then
π = 1× (I−P + ONE)−1. (3.25)
•
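A minimal Python sketch of Equation (3.25) (numpy assumed; the notes themselves suggest Mathematica), tried out on the brand-switching TPM of Example/Exercise 3.58:

```python
import numpy as np

def stationary(P):
    """Stationary distribution via pi = 1 (I - P + ONE)^{-1}, Equation (3.25)."""
    m = P.shape[0]
    ones_row = np.ones(m)        # the row vector 1
    ONE = np.ones((m, m))        # the matrix ONE
    return ones_row @ np.linalg.inv(np.eye(m) - P + ONE)

P = np.array([[0.1, 0.2, 0.7],
              [0.2, 0.4, 0.4],
              [0.1, 0.3, 0.6]])
pi = stationary(P)
print(pi)   # approx [0.131868 0.318681 0.549451]
```

By construction π(I − P + ONE) = π − πP + 1 = 1, which is why the inverse recovers the stationary distribution.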
Exercise 3.69 — Obtaining the limiting probabilities numerically
Prove Proposition 3.68 (Resnick, 1992, pp. 138–139). •
146
Exercise 3.70 — Obtaining the limiting probabilities numerically
Admit that in Evaristo’s home country, the transitions between (1) upper-, (2) middle-,
or (3) lower-class of the successive generations can be regarded as transitions of a DTMC
with TPM given by
P =
[ 0.45  0.48  0.07 ]
[ 0.05  0.70  0.25 ]
[ 0.01  0.50  0.49 ]
.
Determine the percentage of inhabitants in each one of those social classes in the
long-run (Ross, 2003, Example 4.19, pp. 202–203). •
Exercise 3.71 — Obtaining the limiting probabilities numerically (bis) (Ross,
2003, Exercise 25, p. 256)
Each morning Evaristo leaves his house and goes for a run. He is equally likely to leave
either from his front or back door. Upon leaving the house, he chooses a pair of running
shoes (or goes running barefoot if there are no running shoes at the door from which he
departed). On his return he is equally likely to enter, and leave his running shoes, either
by the front or back door.
What proportion of the time does Evaristo run barefoot if he owns a total of k pairs
of running shoes? •
Exercise 3.72 — Obtaining the limiting probabilities algebraically (Ross, 1983,
Exercise 4.9, p. 135; Ross, 2003, Exercise 20, p. 255)
A TPM P is said to be doubly stochastic if ∑_{i∈S} Pij = 1, for all j.
If the associated DTMC has n states and is ergodic,13 show that the limiting
probabilities are given by πj = 1/n. •
13A positive recurrent, aperiodic state is called ergodic (Ross, 1983, p. 108).
147
Exercise 3.73 — Obtaining the limiting probabilities algebraically (bis) (Ross,
1983, Exercise 4.13, p. 135)
Clotilde possesses r umbrellas which she employs in going from her home to office, and
vice versa. If she is at home (resp. the office) at the beginning (resp. end) of a day and it is
raining, then she will take an umbrella with her to the office (resp. home), provided there
is one to be taken. If it is not raining, then she never takes an umbrella. Assume that,
independent of the past, it rains at the beginning (resp. end) of a day with probability p.
(a) Define a Markov chain with r+1 states which will help us to determine the proportion
of time that Clotilde gets wet.
(Note: She gets wet if it is raining, and all umbrellas are at her other location.)
(b) Compute the limiting probabilities.
(c) What value of p maximizes the fraction of time Clotilde gets wet when r = 3? •
Remark 3.74 — Obtaining the limiting probabilities, infinite state space
(Kulkarni, 1995, p. 113)
The limiting behavior of irreducible positive recurrent DTMC with infinite state space
can be also derived, namely when the transition probabilities are given algebraically, such
as in Exercise 3.75. For a detailed description of methods of solution, please refer to
Kulkarni (1995, pp. 113-123). •
Exercise 3.75 — Obtaining the limiting probabilities algebraically, infinite
state space (Ross, 1983, Exercise 4.16, p. 136)
Consider a DTMC with state space S = N0 and TPM such that
Pi,i+1 = pi = 1− Pi,i−1,
where p0 = 1.
Find the necessary and sufficient condition on the pi’s for this DTMC to be positive
recurrent, and compute the limiting probabilities in this case (Kulkarni, 1995, Example
3.23, pp. 115–117). •
148
Exercise 3.76 — More on limiting probabilities (Ross, 1983, Exercise 4.31, p. 139)
Let {Xn : n ∈ N} denote an irreducible DTMC with a countable state space S.
Now consider a new stochastic process Yn : n ∈ N where Yn denotes the nth value
of Xn : n ∈ N that is between 0 and N . For instance, if N = 3 and X1 = 1, X2 = 3,
X3 = 5, X4 = 6, X5 = 2, then Y1 = 1, Y2 = 3, Y3 = 2.
(a) Is Yn : n ∈ N a DTMC? Explain briefly.
(b) Let πj denote the proportion of time that Xn : n ∈ N is in state j. If πj > 0 for all
j, what proportion of time is Yn : n ∈ N in each of the states 0, 1, . . . , N? •
3.5 Limit behavior of reducible Markov chains
In this section, we follow Kulkarni (1995, pp. 132–137) quite closely and assume that the
DTMC has k closed communicating classes C1, . . . , Ck and that the remaining states form a
set T (i.e., T = S \ ∪_{r=1}^k Cr). Moreover, the states are assumed to have been relabeled so
that the TPM of the reducible DTMC is of the form
P =
[ P(1)   O    · · ·   O     O ]
[  O    P(2)  · · ·   O     O ]
[  ⋮     ⋮     ⋱      ⋮     ⋮ ]
[  O     O    · · ·  P(k)   O ]
[            D              Q ]
,  (3.26)
where:
• P(r) = [Pij]i,j∈Cr denotes the stochastic matrix associated with class Cr, r =
1, . . . , k;
• the O’s are matrices of zeroes;
149
• Q = [Qij]i,j∈T is a sub-stochastic matrix governing the transitions between the states
in T ;14
• D = [Dij]i∈T, j∈S\T is a matrix such that ∑_{j∈S\T} Dij + ∑_{j∈T} Qij = 1, i ∈ T.
Exercise 3.77 — Relabeling the states of a reducible DTMC (Kulkarni, 1995,
Example 3.26, pp. 135–136)
Relabel the states of the TPM in Exercise 3.32(d),
P =
[ 1/2  0    0    0    0    1/2 ]
[ 0    1/3  0    0    2/3  0   ]
[ 1/6  1/6  1/6  1/6  1/6  1/6 ]
[ 0    0    0    1    0    0   ]
[ 0    2/3  0    0    1/3  0   ]
[ 1/2  0    0    0    0    1/2 ]
,
such that the TPM of the reducible DTMC is of the form (3.26). Identify the matrices
P(1), P(2), P(3), Q and D. •
Remark 3.78 — Limiting behavior of a reducible DTMC (Kulkarni, 1995, p. 132)
• Elementary matrix algebra leads to the following n-step TPM
Pn =
[ Pn(1)   O     · · ·   O      O  ]
[  O     Pn(2)  · · ·   O      O  ]
[  ⋮      ⋮      ⋱      ⋮      ⋮  ]
[  O      O     · · ·  Pn(k)   O  ]
[              Dn             Qn ]
,  (3.27)
where Dn = [Dn(i, j)]i∈T, j∈S\T.
14 I.e., Qij ≥ 0, i, j ∈ T, and ∑_{j∈T} Qij < 1, i ∈ T.
150
• Moreover, since P(r) is, for r = 1, . . . , k, the TPM of an irreducible DTMC, we can
add, for instance, that
lim_{n→+∞} P^n_ij(r) = πj(r),  (3.28)
where the limiting probabilities {πj(r) : j ∈ Cr} are given by the unique solution of
the system of equations
πj(r) = ∑_{i∈Cr} πi(r) Pij, j ∈ Cr
∑_{j∈Cr} πj(r) = 1,  (3.29)
in case Cr is a positive recurrent and aperiodic closed communicating class.
• Similarly, since all states in T must be transient, we know that
lim_{n→+∞} Q^n_ij = 0, i, j ∈ T.  (3.30)
• In conclusion, deriving the limiting behavior of Pn boils down to the study of the
limiting behavior of Dn, described in Proposition 3.79. •
Proposition 3.79 — Limiting behavior of a reducible DTMC (Kulkarni, 1995,
Theorem 3.21, pp. 134–135)
Let:
• αi(r) = P(Xn ∈ Cr, for some n ∈ N0 | X0 = i), for i ∈ T and fixed r ∈ {1, . . . , k}.15
For i ∈ T, j ∈ Cr and fixed r ∈ {1, . . . , k}:
• if Cr is transient or null recurrent then
lim_{n→+∞} Dn(i, j) = 0, i ∈ T, j ∈ Cr;  (3.31)
15 For a fixed r ∈ {1, . . . , k}, the quantities αi(r), i ∈ T, are given by the smallest non-negative
solution {ui : i ∈ T} to ui = ∑_{j∈Cr} Pij + ∑_{j∈T} Pij uj, for i ∈ T (Kulkarni, 1995, Theorem 3.20, p. 133).
151
• if Cr is positive recurrent and aperiodic then
lim_{n→+∞} Dn(i, j) = αi(r) πj(r), i ∈ T, j ∈ Cr,  (3.32)
where {πj(r) : j ∈ Cr} are given by the solution of (3.29);
• if Cr is positive recurrent and periodic then
lim_{n→+∞} (1/(n+1)) ∑_{m=0}^n Dm(i, j) = αi(r) πj(r), i ∈ T, j ∈ Cr,  (3.33)
where {πj(r) : j ∈ Cr} are given by the solution of (3.29); in this case Dn(i, j) does
not have a limit as n → +∞. •
Exercise 3.80 — Limiting behavior of a reducible DTMC
Use Mathematica to “verify” that
P =
[ 1/2  1/2  0    0    0    0   ]
[ 1/2  1/2  0    0    0    0   ]
[ 0    0    1/3  2/3  0    0   ]
[ 0    0    2/3  1/3  0    0   ]
[ 0    0    0    0    1    0   ]
[ 1/6  1/6  1/6  1/6  1/6  1/6 ]
converges to
[ 0.5  0.5  0    0    0    0 ]
[ 0.5  0.5  0    0    0    0 ]
[ 0    0    0.5  0.5  0    0 ]
[ 0    0    0.5  0.5  0    0 ]
[ 0    0    0    0    1    0 ]
[ 0.2  0.2  0.2  0.2  0.2  0 ]
(Kulkarni, 1995, Example 3.26, pp. 135–137). •
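A quick numerical check, sketched in Python with numpy (an assumption — the exercise asks for Mathematica):

```python
import numpy as np

# TPM of Exercise 3.80 and its claimed limit matrix
P = np.array([[1/2, 1/2, 0,   0,   0,   0],
              [1/2, 1/2, 0,   0,   0,   0],
              [0,   0,   1/3, 2/3, 0,   0],
              [0,   0,   2/3, 1/3, 0,   0],
              [0,   0,   0,   0,   1,   0],
              [1/6, 1/6, 1/6, 1/6, 1/6, 1/6]])

limit = np.array([[0.5, 0.5, 0,   0,   0,   0],
                  [0.5, 0.5, 0,   0,   0,   0],
                  [0,   0,   0.5, 0.5, 0,   0],
                  [0,   0,   0.5, 0.5, 0,   0],
                  [0,   0,   0,   0,   1,   0],
                  [0.2, 0.2, 0.2, 0.2, 0.2, 0]])

P100 = np.linalg.matrix_power(P, 100)
print(np.abs(P100 - limit).max())   # essentially zero
```

The last row illustrates Proposition 3.79: from the transient state 6, each closed class is reached with probability 2/5 (and state 5 with probability 1/5), and those probabilities are spread according to the stationary distributions of the classes.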
152
3.6 Markov chains with costs/rewards
Consider a DTMC {Xn : n ∈ N0} and suppose that every time we visit state i we incur
a cost c(i). Then the expected total cost up to time N is given by
E[ ∑_{n=0}^N c(Xn) | X0 = i ],  (3.34)
and the expected cost per time unit — up to time N — equals
(1/(N+1)) E[ ∑_{n=0}^N c(Xn) | X0 = i ].
Can we calculate the long-run expected cost per time unit?
Yes!
It is related to the limiting probabilities, as stated in the following proposition for an
irreducible, positive recurrent DTMC.
Proposition 3.81 — Long-run expected cost per time unit (Kulkarni, 1995,
Theorem 3.23, p. 140)
Let:
• Xn : n ∈ N0 be an irreducible, positive recurrent DTMC, with TPM P, state
space S;
• πj : j ∈ S be the stationary distribution of this DTMC;
• c(i) be the cost incurred whenever we visit state i.
If |c(i)| ≤ B, for all i ∈ S, then the long-run expected cost per time unit — or long-run
cost rate — is given by
lim_{N→+∞} (1/(N+1)) E[ ∑_{n=0}^N c(Xn) | X0 = i ] = ∑_{j∈S} πj c(j),  (3.35)
regardless of the value of the initial state i. •
Remark 3.82 — Long-run expected cost per time unit (Kulkarni, 1995, Theorem
3.23, p. 140)
• It can be shown that Equation (3.35) is still valid if ∑_{j∈S} πj |c(j)| < +∞, which is
a condition weaker than |c(i)| ≤ B, i ∈ S.
153
• Proposition 3.81 can be extended to reducible DTMC; however, the long-run cost
rate then depends on the initial state i. •
Exercise 3.83 — Long-run expected cost per time unit
Prove Proposition 3.81 (Kulkarni, 1995, pp. 140–141). •
Exercise 3.84 — Long-run expected cost per time unit
Clotilde used to play semi-pro basketball and her scoring productivity per game fluctuated
between 3 states:
• 1 (scored 0 or 1 points);
• 2 (scored between 2 and 5 points);
• 3 (scored more than 5 points).
Inevitably, if Clotilde scored more than 5 points in one game, her jealous teammates
refused to pass her the ball in the next game.
The team statistician, upon observing the transitions between states, concluded that
these transitions could be modeled by a DTMC with TPM
P =
[ 0    1/3  2/3 ]
[ 1/3  0    2/3 ]
[ 1    0    0   ]
.
(a) What is the long-run proportion of games that Clotilde scores more than 5 points
(Resnick, 1992, pp. 139–141)?
(b) The salaries in the semi-pro leagues include incentives for scoring. Clotilde was paid
20, 30 and 40 euros per game for scoring in states 1, 2 and 3, respectively.
What was the long-run earning rate of Clotilde (Resnick, 1992, pp. 139, 141)? •
154
Exercise 3.85 — Long-run expected cost per time unit (bis) (Resnick, 1992,
Exercise 2.29, pp. 156–157)
Evaristo visits the dentist every six months. Because of a sweet tooth and fetish for
chocolate, the condition of his teeth varies according to a DTMC on the states 1, 2, 3, 4,
where: 1 means no dental work is required; 2 means a cleaning is required; 3 means a
filling is required; and 4 means a root canal work is needed. Admit that transitions from
state to state are governed by the TPM
P =
[ 0.6  0.2  0.1  0.1 ]
[ 0.4  0.4  0.1  0.1 ]
[ 0.3  0.3  0.2  0.2 ]
[ 0.4  0.5  0.1  0   ]
.
Charges for each visit to the dentist depend on the work done: 20, 30, 50 and 300 euros
if the condition of Evaristo’s teeth are in states 1, 2, 3 and 4, respectively.
(a) What is the percentage of visits associated with a charge of at least 50 euros?
(b) Determine Evaristo’s long-run cost rate for maintaining his teeth. •
Can we consider time-dependent cost functions, namely when we admit that a cost of c
monetary units incurred at time n is equivalent to α^n c (α ∈ [0, 1)) at time 0?
Yes!
Proposition 3.86 — Expected total discounted cost (Kulkarni, 1995, p. 138)
Let:
• Xn : n ∈ N0 be a DTMC, with TPM P, state space S;
• c = [c(i)]i∈S be a column vector of costs;
• α (α ∈ [0, 1)) be the rate at which the costs c(i) are discounted.
155
Then the expected total discounted cost incurred over the infinite horizon, starting at
state i, is equal to
φ(i) = E[ ∑_{n=0}^{+∞} α^n c(Xn) | X0 = i ].  (3.36)
Moreover, φ(i) satisfies
φ(i) = c(i) + α ∑_{j∈S} Pij φ(j), i ∈ S.  (3.37)
Equivalently, the column vector φ = [φ(i)]i∈S is given by
φ = (I − αP)^{−1} × c.  (3.38)
•
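A Python sketch of Equation (3.38) (numpy assumed), using for illustration the brand-switching TPM and the costs of Exercise 3.88:

```python
import numpy as np

P = np.array([[0.1, 0.2, 0.7],
              [0.2, 0.4, 0.4],
              [0.1, 0.3, 0.6]])
c = np.array([1.00, 1.50, 2.00])   # cost per visit of brands A, B, C
alpha = 0.90                       # discount factor

# phi = (I - alpha P)^{-1} c; solving the linear system is preferable
# to forming the inverse explicitly
phi = np.linalg.solve(np.eye(3) - alpha * P, c)
print(phi)
```

Each entry φ(i) is the expected total discounted expenditure of a customer who starts with brand i, and it satisfies the fixed-point equation (3.37) by construction.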
Exercise 3.87 — Expected total discounted cost
Prove Proposition 3.86 (Kulkarni, 1995, p. 138). •
Exercise 3.88 — Long-run cost rate; expected total discounted cost
Consider a brand-switching model such that a typical customer keeps switching between
brands A, B and C according to the following TPM:
P =
[ 0.1  0.2  0.7 ]
[ 0.2  0.4  0.4 ]
[ 0.1  0.3  0.6 ]
.
Suppose brands A, B and C cost 1.00, 1.50 and 2.00 euros, respectively.
(a) Find the long-run expected cost per time unit (Kulkarni, 1995, Example 3.28, p. 141).
(b) Compute the expected total discounted expenditure of a typical customer, assuming
a discount factor α = 0.90 (Kulkarni, 1995, Example 3.27, pp. 139–140). •
156
3.7 Reversible Markov chains
Are there any DTMC with the property that when the direction of time is reversed the
behavior of the process remains the same?
Yes!
Some DTMC (and other stochastic processes) have this curious property and are
called reversible DTMC. Loosely speaking, if we film a DTMC and then run the film
backwards the result will be statistically indistinguishable from the original DTMC
(www.statslab.cam.ac.uk/∼frank/BOOKS/book/ch1.pdf).
But before we proceed, note that we have to consider from now on:
• a DTMC with index set Z, {Xm : m ∈ Z},16 that happens to be irreducible, positive
recurrent and also stationary;17
• the reversed process of this DTMC at n (n ∈ Z), {Xn−m : m ∈ Z}.
Proposition 3.89 — Property of the reversed process (Kulkarni, 1995, Theorem
3.25, p. 143; Ross, 2003, p. 232)
Let {Xm : m ∈ Z} be an irreducible positive recurrent DTMC, with stationary
distribution {πj : j ∈ S}. Then its reversed process at n, {Xn−m : m ∈ Z}, is a DTMC
with transition probabilities
Qij = πj × Pji / πi,  (3.39)
for i, j ∈ S. •
Exercise 3.90 — Property of the reversed process
Prove Proposition 3.89 (Kulkarni, 1995, pp. 143–144; Ross, 2003, p. 232). •
16 For the definition of such a DTMC please refer to Kulkarni (1995, Definition 3.14, p. 142).
17 That is, its initial state, say X−∞, is chosen according to the stationary probabilities {πj : j ∈ S}.
157
Definition 3.91 — Time reversible DTMC (Kulkarni, 1995, Definition 3.12, p. 142)
The DTMC {Xm : m ∈ Z} is said to be time reversible if it has the same probabilistic
behavior as {Xn−m : m ∈ Z}, for all n ∈ Z.18 •
If we capitalize on Definition 3.91, Proposition 3.89 suggests necessary and sufficient
conditions for time reversibility.
Proposition 3.92 — Necessary and sufficient conditions for time reversibility
(Kulkarni, 1995, Theorem 3.26, p. 142)
Let Xm : m ∈ Z be an irreducible positive recurrent DTMC, with stationary
distribution πj : j ∈ S. Then Xm : m ∈ Z is time reversible iff
πi × Pij = πj × Pji, i, j ∈ S. (3.40)
•
Equations (3.40) are usually called detailed balance equations (Kulkarni, 1995, p. 144)
and can be stated as follows: for all states i and j, the rate at which the DTMC goes
from i to j, πi×Pij, is equal to the rate at which it goes from j to i, πj ×Pji (Ross, 2003,
p. 233).
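The detailed balance equations are easy to check numerically. A Python sketch (numpy assumed; the two chains below — a small birth-death chain, which is reversible, and the brand-switching chain, which is not — are chosen purely for illustration):

```python
import numpy as np

def is_reversible(P, tol=1e-10):
    """Check pi_i P_ij = pi_j P_ji, Equation (3.40), for a finite irreducible TPM."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    pi = pi / pi.sum()              # stationary distribution (left eigenvector)
    F = pi[:, None] * P             # F[i, j] = pi_i P_ij
    return bool(np.abs(F - F.T).max() < tol)

birth_death = np.array([[0.5, 0.5, 0.0],
                        [0.5, 0.0, 0.5],
                        [0.0, 0.5, 0.5]])
brand = np.array([[0.1, 0.2, 0.7],
                  [0.2, 0.4, 0.4],
                  [0.1, 0.3, 0.6]])
print(is_reversible(birth_death), is_reversible(brand))   # True False
```

The birth-death chain satisfies (3.40) (its TPM is symmetric and its stationary distribution uniform), while for the brand-switching chain π1 P12 ≠ π2 P21.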
Exercise 3.93 — Time reversible DTMC
Consider a DTMC Xn : n ∈ Z with state space S = 0, 1, . . . ,M and transition
probabilities
Pi,i+1 = αi = 1− Pi,i−1, i = 1, . . . ,M − 1
P0,1 = α0 = 1− P0,0
PM,M = αM = 1− PM,M−1.
(a) Prove that Xn : n ∈ Z is time reversible and compute its limiting probabilities
(Ross, 2003, Example 4.31, pp. 234–235).
18 That is, if (Xn1, Xn2, . . . , Xnk) has the same distribution as (Xn−n1, Xn−n2, . . . , Xn−nk), for all
n, n1, n2, . . . , nk ∈ Z and k ∈ N (www.statslab.cam.ac.uk/∼frank/BOOKS/book/ch1.pdf).
158
(b) Determine those limiting probabilities when αi = α, i = 0, 1, . . . ,M (Ross, 2003,
Example 4.31, p. 235).
(c) The DTMC considered in this exercise arose in an urn model proposed by the
physicists P. and T. Ehrenfest to describe the movements of molecules. These authors
admitted that M molecules were distributed among two urns, I and II, and that at
each time point n one of the molecules is chosen at random, removed from its urn,
and placed in the other one. Let Yn be the number of molecules in urn I at time n.
Then Yn : n ∈ N0 is a DTMC with the same state space and transition probabilities
as Xn : n ∈ N0.
Compute the limiting probabilities in this case (Ross, 2003, pp. 235–236). •
Exercise 3.94 — Time reversible DTMC
Let G be an arbitrary connected graph with cost cij associated with the arc (i, j). Now
consider a particle moving from node i to node j with probability
Pij = cij / ∑_k cik,
where cik = 0 if there is no arc (i, k).
Define a DTMC that describes the movement of this particle and show that this
DTMC is time reversible (Ross, 2003, Example 4.32, pp. 236–237). •
Exercise 3.95 — Time reversible DTMC (Ross, 1983, Exercise 4.29, p. 138)
Consider a time reversible DTMC with state space N0 and transition probabilities Pij
and limiting probabilities πi. Now, consider the same DTMC truncated to the state space
{0, 1, . . . ,M}, with transition probabilities
P̃ij = Pij / ∑_{k=0}^M Pik, for i, j = 0, 1, . . . ,M (and P̃ij = 0 otherwise).
159
Show that the truncated DTMC is also time reversible and has limiting probabilities
given by
π̃i = ( πi ∑_{k=0}^M Pik ) / ( ∑_{k=0}^M πk ∑_{j=0}^M Pkj ). •
Note that the detailed balance equations allow us to determine whether a DTMC
is time reversible based on the transition probabilities and the stationary
distribution, while the stationary distribution is itself determined solely by the
transition probabilities. Thus, we are led to believe that it is possible to
determine whether a DTMC is time reversible from the transition probabilities alone
(www.math.ucsd.edu/∼williams/courses/.../scullardMath289 Reversibility.pdf). This is
indeed the case; the result is known as Kolmogorov's criterion.
Proposition 3.96 — Kolmogorov’s criterion for time reversibility (Ross, 2003,
Theorem 4.2, p. 238; Kulkarni, 1995, Theorem 3.27, p. 145)
An irreducible, positive recurrent stationary DTMC is time reversible iff its transition
probabilities satisfy
Pi,i1 × Pi1,i2 × · · · × Pik,i = Pi,ik × · · · × Pi2,i1 × Pi1,i,  i, i1, i2, . . . , ik ∈ S,  (3.41)
for any k ∈ N, i.e., if, starting in state i, any path back to state i has the same probability
as the reversed path. •
Exercise 3.97 — Kolmogorov’s criterion for time reversibility
Prove Proposition 3.96 (Kulkarni, 1995, pp. 145–146). •
Exercise 3.98 — Kolmogorov’s criterion for time reversibility
Use Proposition 3.96 to investigate if the DTMC described by the following transition
diagrams are time reversible:
160
(a) [transition diagram from Scullard's notes: a Markov chain shown there, via Kolmogorov's
Criterion, to be reversible]
(b) [transition diagram from Scullard's notes: a chain shown there not to be reversible, since a
clockwise loop around the graph has probability 1/256 while a counterclockwise loop has
probability 1/16]
(www.math.ucsd.edu/ williams/courses/.../scullardMath289 Reversibility.pdf). •
161
3.8 Branching processes
Branching processes:
• are Markov processes that model a population in which each individual in generation
n produces some random number of individuals in generation n+1, according, in the
simplest case, to a fixed probability distribution that does not vary from individual
to individual (http://en.wikipedia.org/wiki/Branching process);
• have been applied, for instance, in biology,19 sociology20 and engineering (Ross,
2003, p. 228), namely to model the size of a population of individuals, bacteria,
etc., the spread of surnames and the propagation of neutrons in a nuclear reactor
(http://en.wikipedia.org/wiki/Branching process).
The most common formulation of a branching process is as a Galton-Watson process,
arising originally from Francis Galton’s statistical investigation of the extinction of family
names (http://en.wikipedia.org/wiki/Branching process).
Definition 3.99 — Branching process, X0 = 1 (Kulkarni, 1995, p. 34; Ross, 2003,
pp. 228–229)
Let:
• Xn denote the number of individuals of the nth generation, starting with X0 = 1
individual (the size of the zeroth generation);
• Zl (or Zl,n) be the number of offspring of the lth individual of the nth generation.
If {Zl : l ∈ N} are i.i.d. non-negative integer-valued r.v., with p.f. Pj = P(Zl = j), j ∈ N0,
and independent of the size of the generation, then
Xn = ∑_{l=1}^{Xn−1} Zl, n ∈ N,  (3.42)
19 For concrete illustrations of branching processes in biology, see http://en.wikipedia.org/wiki/Galton-Watson process.
20 See Kulkarni (1995, pp. 33–34).
162
and Xn : n ∈ N0 is usually called a branching process (or, rightly so, a Galton-Watson
process). •
Proposition 3.100 — Branching process, X0 = 1 (Kulkarni, 1995, p. 34; Ross, 2003,
p. 229)
The branching process Xn : n ∈ N0 is a DTMC with state space S = N0,21 and
transition probabilities given by
Pij = P( Xn = ∑_{l=1}^{Xn−1} Zl = j | Xn−1 = i ) = P( ∑_{l=1}^i Zl = j ).  (3.43)
•
Remark 3.101 — Classification of states of a branching process, X0 = 1
• Since P00 = P(Xn+1 = 0 | Xn = 0) = 1, we can conclude that state 0 is absorbing
and, thus, recurrent.
• If P1 = 1 then Xn = 1, n ∈ N0, and the branching process is a DTMC with state
space S = {1}, whose single state is obviously absorbing.
• If P1 < 1 and P0 = 0 then the branching process is an increasing DTMC, with state
space S = N, and all its states are transient (Resnick, 1992, p. 97).22
• If P1 < 1 and P0 > 0 then all the states of the branching process are also transient
(Resnick, 1992, pp. 97–98).23 •
Deriving the p.f. of Xn is far from being simple. One way of identifying this p.f.
is via the p.g.f. of Xn, Pn(s) = PXn(s) = E(s^Xn), s ∈ [0, 1]; after all,
P(Xn = k) = (1/k!) × d^k PXn(s)/ds^k |_{s=0}.
21 In most cases! See Remark 3.101.
22 Because fkk = P(eventual return to k) = P(Xn+1 = k | Xn = k) = P(Zl,n+1 = 1, l = 1, . . . , k) = P1^k < 1, for k ∈ N.
23 Note that in this case fkk ≤ P(X1 ≠ 0 | X0 = k) = 1 − P(X1 = 0 | X0 = k) = 1 − P0^k < 1, for k ∈ N.
163
Proposition 3.102 — P.g.f. of a branching process, X0 = 1 (Resnick, 2003, p. 19)
Let:
• {Xn : n ∈ N0} be a branching process such that X0 = 1;
• Pn(s) = E(s^Xn) = ∑_{j=0}^{+∞} s^j P(Xn = j), s ∈ [0, 1], be the p.g.f. of Xn;
• P(s) = E(s^Zl) = ∑_{j=0}^{+∞} s^j Pj, s ∈ [0, 1], be the common p.g.f. of the r.v. Zl.
Then the p.g.f. of Xn can be obtained recursively:
Pn+1(s) = Pn[P(s)], n ∈ N, s ∈ [0, 1].  (3.44)
Similarly, Pn+1(s) = P[Pn(s)]. •
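Equation (3.44) can be exercised numerically: composing the offspring p.g.f. with itself yields the exact p.f. of X2. A Python sketch (numpy assumed; the offspring p.f. below, P0 = 1/4, P1 = 1/4, P2 = 1/2, is the one in Exercise 3.109(b)):

```python
import numpy as np

def compose(p, q):
    """Coefficients of p(q(s)) by Horner's rule, with polynomial
    multiplication done via convolution of coefficient vectors."""
    out = np.array([0.0])
    for c in reversed(p):
        out = np.convolve(out, q)
        out[0] += c
    return out

pgf = [0.25, 0.25, 0.5]          # P(s) = 1/4 + 1/4 s + 1/2 s^2
pgf2 = compose(pgf, pgf)         # P_2(s) = P(P(s)), the p.g.f. of X_2
print(pgf2)                      # p.f. of X_2 (trailing zeros beyond k = 4)
print(pgf2[0])                   # P(X_2 = 0) = pi_2 = 0.34375
```

The coefficient of s^k in the composed polynomial is exactly P(X2 = k), and the coefficients sum to 1, as a p.f. must.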
Exercise 3.103 — P.g.f. of a branching process, X0 = 1
Prove Proposition 3.102 taking advantage of Equation (3.42) (Resnick, 2003, p. 19). •
In general, determining Pn(s) is not trivial24 (and neither is the computation of the p.f. of
Xn via its p.g.f.). However, we can provide expressions for the expected value and variance
of a branching process.
Proposition 3.104 — Expected value and variance of a branching process, X0 =
1 (Ross, 2003, pp. 229–230)
Let {Xn : n ∈ N0} be a branching process such that X0 = 1. Then
E(Xn | X0 = 1) = µ^n,  (3.45)
V(Xn | X0 = 1) = σ² µ^{n−1} × (µ^n − 1)/(µ − 1), if µ ≠ 1, and n σ², if µ = 1,  (3.46)
where µ = E(Zl) = ∑_{j∈N0} j × Pj and σ² = V(Zl) = ∑_{j∈N0} (j − µ)² × Pj represent the
expected value and variance of the number of offspring an individual has. •
Exercise 3.105 — Expected value and variance of a branching process, X0 = 1
Prove Proposition 3.104 (Ross, 2003, pp. 229–230). •
24For a case where calculations are possible, please refer to Resnick (1992, p. 20).
164
Remark 3.106 — Expected value and variance of a branching process, X0 = 1
From Proposition 3.104 we can conclude that:
lim_{n→+∞} E(Xn | X0 = 1) = 0, if µ < 1; 1, if µ = 1; +∞, if µ > 1;  (3.47)
lim_{n→+∞} V(Xn | X0 = 1) = 0, if µ < 1; +∞, if µ = 1; +∞, if µ > 1.  (3.48)
•
•
A central question in the theory of branching processes is the probability of (ultimate)
extinction (http://en.wikipedia.org/wiki/Branching process), π, and the probability of
extinction on or before generation n, πn = P(Xn = 0 | X0 = 1) = Pn(0) = P(Pn−1(0)) =
P(πn−1), for n ∈ N (with π0 = 0). Unsurprisingly, the problem of determining the value of
the probability of extinction π was first raised in connection with the extinction of family
surnames by Galton in 1889 (Ross, 2003, p. 231).
Proposition 3.107 — Probability of extinction, X0 = 1 (Ross, 1983, p. 117)
Let π denote the probability that the population will eventually die out (assuming that
X0 = 1), i.e.,
π = lim_{n→+∞} P(Xn = 0 | X0 = 1).  (3.49)
Suppose that P0 > 0 and P0 + P1 < 1. Then:
• π = 1 iff µ ≤ 1;
• if µ > 1, π is the smallest positive number satisfying π = P(π), i.e.,
π = ∑_{j=0}^{+∞} π^j × Pj,  (3.50)
where Pj denotes the probability that an individual has j offspring.25 •
25 In fact, π is the smallest positive number satisfying s = P(s), where P(s) represents the p.g.f. of Zl
(Resnick, 1992, Theorem 1.4.1, p. 21). Moreover, for s ∈ [0, 1), π = lim_{n→+∞} Pn(s).
165
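The characterization π = P(π) also suggests a simple numerical scheme: iterate πn = P(πn−1) from π0 = 0, which converges to the smallest root. A Python sketch with a made-up supercritical offspring p.f. (P0 = 1/8, P1 = 3/8, P2 = 1/2, so µ = 11/8 > 1 — an illustrative choice, not one of the exercises below):

```python
# offspring p.f. P0 = 1/8, P1 = 3/8, P2 = 1/2 (illustrative; mu = 11/8 > 1)
def pgf(s):
    return 1/8 + (3/8) * s + (1/2) * s**2

pi = 0.0                  # pi_0 = 0
for _ in range(200):
    pi = pgf(pi)          # pi_n = P(pi_{n-1})
print(pi)                 # smallest positive root of s = P(s), namely 1/4
```

Here s = P(s) reduces to 4s² − 5s + 1 = 0, with roots 1/4 and 1; the iteration converges monotonically to the smaller one, in agreement with Proposition 3.107.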
Exercise 3.108 — Probability of extinction, X0 = 1
Prove Proposition 3.107 (Ross, 2003, p. 231, to show (3.50); Resnick, 1992, p. 22, to show
that π is the smallest positive number satisfying (3.50)). •
Exercise 3.109 — Probability of extinction, X0 = 1
Compute the extinction probability π when:
(a) P0 = 1/2, P1 = 1/4, P2 = 1/4 (Ross, 2003, Example 4.28, p. 232);
(b) P0 = 1/4, P1 = 1/4, P2 = 1/2 (Ross, 2003, Example 4.29, p. 232);
(c) P0 = 1/4, P1 = 1/12, P2 = 2/3;
(d) P0 = 1/6, P1 = 1/12, P2 = 3/4. •
Exercise 3.110 — Probability of extinction, X0 = m
What is the probability that the population will die out if it initially consists of m
individuals (Ross, 2003, Example 4.30, p. 232)? •
Exercise 3.111 — Probability of extinction (Ross, 1983, Exercise 4.24, p. 137)
Let {Xn : n ∈ N0} be a branching process such that the number of offspring per individual
has a binomial distribution with parameters (2, p). Starting with a single individual (i.e.,
X0 = 1), calculate:
(a) the extinction probability π;
(b) the probability that the population becomes extinct (for the first time) in the 3rd
generation.
Suppose that, instead of starting with a single individual, X0 = Z0, where Z0 ∼
Poisson(λ).
(c) Show that, in this case, the extinction probability is given by
exp[−λ(1 − π)],

for π ≡ π(p) and p > 1/2. •
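For part (a), the offspring p.g.f. is P(s) = (q + ps)², so the fixed-point equation s = P(s) is a quadratic. A quick numerical sketch (the sample values of p are illustrative, not from the text) suggesting that, for p > 1/2, the smallest positive root is (q/p)²:

```python
import math

def extinction_bin2(p):
    """Extinction probability for Binomial(2, p) offspring, p > 1/2:
    smallest positive root of s = (q + p*s)^2, i.e. of
    p^2 s^2 + (2pq - 1) s + q^2 = 0."""
    q = 1 - p
    a, b, c = p**2, 2 * p * q - 1, q**2
    disc = math.sqrt(b * b - 4 * a * c)   # the discriminant equals (p - q)^2
    return min((-b - disc) / (2 * a), (-b + disc) / (2 * a))

for p in (0.6, 0.75, 0.9):
    q = 1 - p
    print(p, extinction_bin2(p), (q / p) ** 2)  # the two values agree
```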
3.9 First passage times; absorption probabilities
We start this section by noting that:
• to classify a state i as recurrent or transient we have to calculate fi, the probability
that, starting in state i, the process will ever reenter state i; since fi ≡ fii = ∑_{n=1}^{+∞} f_ii^n,
we need to calculate f_ii^n, a particular case of f_ij^n, the probability that, starting from
state i, the first transition into state j occurs exactly at time n;
• a state i is called absorbing if it is impossible to leave this state, thus, the state i is
absorbing iff Pii = 1; if every state can reach an absorbing state, then the Markov
chain is an absorbing Markov chain (http://en.wikipedia.org/wiki/Markov_chain);
• we are frequently interested in finding the probability that a DTMC (e.g., a
branching process) reaches an absorbing state (e.g., extinction).
We can use a recursive scheme to compute the probabilities f_ij^n, as described in the
next proposition.
Proposition 3.112 — First passage probabilities (Resnick, 1992, pp. 89–90)
Let:
• {Xn : n ∈ N0} be a DTMC with state space S and TPM P;

• f_ij^n be the probability that, starting from state i, the first transition into state j
occurs exactly at time n, i.e., f_ij^n = P(Xn = j, Xn−1 ≠ j, . . . , X1 ≠ j | X0 = i), n ∈ N,
and f_j^n = [f_ij^n]_{i∈S} be the corresponding column vector;

• (j)P be a matrix obtained by setting all the entries of the jth column of P equal to
0.

Then f_ij^n can be obtained by a first jump decomposition,26

f_ij^n = Pij, for n = 1;  f_ij^n = ∑_{k≠j} Pik f_kj^{n−1}, for n = 2, 3, . . . (3.51)

26 f_ij^n = ∑_{k≠j, k∈S} P(Xn = j, Xn−1 ≠ j, . . . , X1 = k | X0 = i) = ∑_{k≠j} P(Xn = j, Xn−1 ≠ j, . . . , X2 ≠ j | X1 = k) × P(X1 = k | X0 = i).
or, more conveniently, f_j^n can be computed by a matrix recursion:

f_j^1 = [Pij]_{i∈S}, for n = 1;  f_j^n = (j)P × f_j^{n−1} = [(j)P]^{n−1} × f_j^1, for n = 2, 3, . . . (3.52)

•
Exercise 3.113 — First passage probabilities (Resnick, 1992, Exercise 2.8, p. 149)
Consider a DTMC on S = {1, 2, 3} with TPM

P = | 1    0    0    |
    | 1/2  1/6  1/3  |
    | 1/3  3/5  1/15 |

(a) Find f_i3^n = P(Xn = 3, Xn−1 ≠ 3, . . . , X1 ≠ 3 | X0 = i), for i, n = 1, 2, 3, without using
Proposition 3.112.

(b) Obtain a recursive equation for f_i3^n, i = 1, 2, 3 and n ∈ N.27 •
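The matrix recursion (3.52) is easy to run numerically. A sketch for the chain of Exercise 3.113 (states relabelled 0–2 for 0-based indexing; the target state is 3):

```python
import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [1/2, 1/6, 1/3],
              [1/3, 3/5, 1/15]])

j = 2                      # target state 3, using 0-based indexing
Pj = P.copy()
Pj[:, j] = 0.0             # (j)P: zero out the jth column of P

f = P[:, j].copy()         # f_j^1 = [P_ij]_{i in S}
first_passage = [f.copy()]
for _ in range(9):         # f_j^n = (j)P f_j^{n-1}, n = 2, ..., 10
    f = Pj @ f
    first_passage.append(f.copy())

# e.g. f_23^2 = P_22 f_23^1 = (1/6)(1/3) = 1/18, since f_13^1 = 0
print(first_passage[1])
```

State 1 is absorbing, so f_13^n = 0 for every n, which the recursion reproduces automatically.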
We are frequently interested in characterizing the time Ti until the system goes from
some initial state i to some terminal critical state, which may represent a machine
breakdown, bankruptcy, or simply a state of interest (Resnick, 1992, p. 102).
These problems can often be formulated as first passage probabilities/times to
absorption/exiting times, as put by Resnick (1992, p. 102), and solutions can be provided
by making use of what resembles the renewal argument and what Resnick (1992, p. 104)
calls first step analysis, as illustrated by Exercise 3.114.
Exercise 3.114 — Time to absorption (Ross, 1983, Exercise 4.15, pp. 135–136)
A DTMC has an absorbing state 0 — that is, P00 = 1 — and set of transient states N.
Let Ti be the time the DTMC takes to reach state 0 given it starts in state i, and let
Mi = E(Ti), i ∈ N.
(a) Show that Mi = 1 + ∑_{j∈N} Pij Mj.

(b) Let σi(n) = P(Ti > n). Derive a formula for σi(n + 1) in terms of σj(n), j ∈ N. •

27 See Resnick (1992, p. 90).
In the absence of absorbing states, the distribution of exiting times can be determined
in a quite trivial way in specific cases, as shown by Exercise 3.115.
Exercise 3.115 — Exiting times (Resnick, 1992, Exercise 2.4, p. 148)
Suppose Pii > 0 and let

τi = inf{n ∈ N : Xn ≠ i | X0 = i}

be the exit time from state i.
Show that τi has a geometric distribution and identify its parameter. •
The general treatment of first passage probabilities/times is as follows.
Proposition 3.116 — First passage probabilities/times (Resnick, 1992, pp. 106–
107)
Let:
• {Xn : n ∈ N0} be a DTMC with state space S;

• S = T ∪ C1 ∪ C2 ∪ · · · be the (canonical) decomposition of the state space, where T
consists of the set of transient states and the communicating classes Cl are closed
and recurrent;

• P = [Pij]i,j∈S be its TPM;

• Q = [Qij]i,j∈T be the restriction of P to the transient states;

• R = [Pkl], k ∈ T, l ∉ T;

• P = | Q  R  |
      | 0  P2 |

be the partition of the TPM;

• τ = inf{n ∈ N0 : Xn ∉ T} be the exit time of the set of transient states;28

• Xτ be the first state hit by the DTMC outside T (assuming τ finite!);

28 We shall assume that τ is finite for all starting states i ∈ T.
• uik = P(Xτ = k | X0 = i) be the probability that the first state the DTMC reaches
when it leaves the set of transient states is k ∉ T, given that the initial state of the
chain is state i ∈ T;

• U = [uik], i ∈ T, k ∉ T, be the matrix of the uik probabilities;

• E(τ | X0 = i), i ∈ T, be the expected value of the first passage time τ, given the
initial state of the DTMC is i ∈ T;

• E[∑_{n=0}^{τ−1} g(Xn) | X0 = i], i ∈ T, be the expected cumulative reward starting from
the initial state i ∈ T until we leave T, where g(j) represents the reward for
being in state j.

Then, once the DTMC leaves T, it will hit one of the closed recurrent communicating
classes and can never return to T, and ui(Cl) = P(Xτ ∈ Cl | X0 = i) = ∑_{k∈Cl} uik.
Moreover:

U = (I − Q)^{−1} × R (3.53)

[E(τ | X0 = i)]_{i∈T} = (I − Q)^{−1} × 1 (3.54)

[E(∑_{n=0}^{τ−1} g(Xn) | X0 = i)]_{i∈T} = (I − Q)^{−1} × g, (3.55)

where (I − Q)^{−1} is usually known as the fundamental matrix, 1 is a vector of ones and g is
the vector of rewards.29
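The three formulas (3.53)–(3.55) are a few lines of linear algebra in code. A sketch on a small hypothetical chain (the matrix entries and rewards are made up for illustration; states 0 and 1 are transient, states 2 and 3 absorbing):

```python
import numpy as np

# Hypothetical TPM, already in canonical form [[Q, R], [0, I]]:
P = np.array([[0.2, 0.3, 0.4, 0.1],
              [0.1, 0.4, 0.2, 0.3],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
Q, R = P[:2, :2], P[:2, 2:]

F = np.linalg.inv(np.eye(2) - Q)   # fundamental matrix (I - Q)^{-1}
U = F @ R                          # (3.53): absorption probabilities
mean_tau = F @ np.ones(2)          # (3.54): expected exit times from T
g = np.array([5.0, 2.0])           # hypothetical per-visit rewards on T
mean_reward = F @ g                # (3.55): expected cumulative reward

print(U.sum(axis=1))  # each row of U sums to 1, since tau is finite here
```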
Exercise 3.117 — First passage times (Resnick, 1992, Exercise 2.16, pp. 151–152)
Evaristo owns a restaurant which fluctuates in successive years between 3 states — 1
(bankruptcy), 2 (verge of bankruptcy), and 3 (solvency) —, according to a DTMC with
TPM equal to
P = | 1    0     0    |
    | 0.5  0.25  0.25 |
    | 0.5  0.25  0.25 |

29 When the state space is finite or when T is finite, (I − Q) has indeed an inverse, which can be
represented as (I − Q)^{−1} = ∑_{n=0}^{+∞} Q^n.
(a) Compute the expected number of years until Evaristo’s restaurant goes bankrupt
assuming it starts from the state of solvency.
Evaristo’s rich aunt, Mrs. T. da Cunha, decides that it is bad for the family name if the
restaurant is allowed to go bankrupt. Thus, when state 1 is entered, Mrs. T. da Cunha
infuses Evaristo’s restaurant with cash, returning it to solvency with probability 1, i.e.,
the TPM for this new DTMC is

P = | 0    0     1    |
    | 0.5  0.25  0.25 |
    | 0.5  0.25  0.25 |
.(b) Is this new DTMC irreducible? Is it aperiodic?
(c) What is the expected number of years between consecutive cash infusions from
Evaristo’s rich aunt?30 •
Exercise 3.118 — First passage times (bis) (Resnick, 1992, Exercise 2.17, p. 152)
Some graduate students exhibit 4 states of mind: 1 (suicidal); 2 (severe depression); 3
(mild depression); 4 (seeking for professional psychiatric help).
Admit weekly changes in state of mind can be modeled as a DTMC with TPM given by
P = | 1     0     0     0    |
    | 0.50  0     0.25  0.25 |
    | 0.25  0.50  0     0.25 |
    | 0     0     0     1    |
(a) Compute the probability the student will eventually be suicidal, starting from state
X0 = 2. Recalculate this probability considering X0 = 3.
(b) Find the expected number of changes of state of mind until a student is suicidal or
seeks professional psychiatric help, considering the initial state X0 = 2. Determine
this expected number assuming now X0 = 3. •

30 Recall that µjj = 1/πj.
Exercise 3.119 — First passage times (bis, bis) (Resnick, 1992, pp. 106–107)
Clotilde runs a restaurant and organizes an “amateur night” there on Fridays. The clientele
of the restaurant judges the performers, whose quality falls into 5 categories, with:
• 1 being the best;
• 5 being atrocious and able to cause a riot with probability 0.3.
Clotilde admits that the succession of states on amateur night can be modeled as a
DTMC with
• 6 states, where i represents a class i performer (i = 1, . . . , 5), and state 6 represents
“riot”;
• and TPM given by
P = | 0.05  0.15  0.3  0.3   0.2   0   |
    | 0.05  0.3   0.3  0.3   0.05  0   |
    | 0.05  0.2   0.3  0.35  0.1   0   |
    | 0.05  0.2   0.3  0.35  0.1   0   |
    | 0.01  0.1   0.1  0.1   0.39  0.3 |
    | 0.2   0.2   0.2  0.2   0.2   0   |
To play it safe Clotilde starts the evening off with a class 2 performer.
(a) What is the probability that a class 1 performer is discovered before a riot is started?
(b) What is the expected number of performers seen before the first riot? •
Exercise 3.120 — More on first passage times (Ross, 1983, Exercise 4.21, p. 137)
A spider hunting a fly moves between locations 1 and 2 according to a DTMC with TPM
P = | 0.7  0.3 |
    | 0.3  0.7 |

and starting in location 1.
The fly, unaware of the spider, starts in location 2 and moves according to a DTMC
with TPM
Q = | 0.4  0.6 |
    | 0.6  0.4 |
(a) Show that the progress of the hunt, except for knowing the location where it ends,
can be described by a three-state DTMC. Obtain the TPM for this DTMC.
(b) Find the probability that at time n the spider and fly are both in the same
compartment.
(c) What is the average duration of the hunt? •
Exercise 3.121 — Classification of states of a symmetric random walk (Ross,
1983, Exercise 4.6, pp. 134–135)
For the symmetric random walk starting at 0:
(a) What is the expected time to return to 0?
(b) Let N2n denote the number of returns by time 2n. Show that E(N2n) = (2n + 1) 2^{−2n} (2n choose n) − 1.

(c) Use (b) and Stirling’s approximation31 to show that, for n large, E(Nn) is proportional
to √n. •
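The identity in (b) is easy to check numerically: one route uses E(N2n) = ∑_{k=1}^{n} P(X2k = 0) = ∑_{k=1}^{n} (2k choose k) 4^{−k}, which should match the closed form (2n + 1) 4^{−n} (2n choose n) − 1. A quick sketch:

```python
from math import comb

def expected_returns(n):
    """E(N_2n) as a sum of return probabilities P(X_2k = 0) = C(2k, k) / 4^k."""
    return sum(comb(2 * k, k) / 4**k for k in range(1, n + 1))

def closed_form(n):
    """The closed form of Exercise 3.121(b)."""
    return (2 * n + 1) * comb(2 * n, n) / 4**n - 1

for n in (1, 5, 20):
    print(n, expected_returns(n), closed_form(n))  # the two columns agree
```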
Exercise 3.122 — The gambler’s ruin problem
Consider a gambler who at each play of the game has probability p of winning one
monetary unit and probability q = 1 − p of losing one monetary unit.
Assuming successive plays of the game are independent, prove that the probability
that, starting with i monetary units, the gambler’s fortune will reach N before reaching
0 equals:

[1 − (q/p)^i] / [1 − (q/p)^N], if p ≠ 1/2;
i/N, if p = 1/2

(Ross, 1983, Example 4.4(a), pp. 115–116). •

31 n! ∼ √(2π) e^{−n} n^{n+1/2}, for sufficiently large n.
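One way to sanity-check the gambler’s ruin formula is to solve the first-step-analysis system h_i = p h_{i+1} + q h_{i−1}, with h_0 = 0 and h_N = 1, and compare it with the closed form. A sketch (the function names are my own):

```python
import numpy as np

def reach_N_before_0(i, N, p):
    """Closed-form probability of reaching N before 0 from fortune i."""
    if p == 0.5:
        return i / N
    r = (1 - p) / p
    return (1 - r**i) / (1 - r**N)

def reach_N_linear_solve(i, N, p):
    """Same probability via h_i = p h_{i+1} + q h_{i-1}, h_0 = 0, h_N = 1."""
    A = np.zeros((N + 1, N + 1))
    b = np.zeros(N + 1)
    A[0, 0], A[N, N], b[N] = 1.0, 1.0, 1.0
    for k in range(1, N):
        # row k encodes q*h_{k-1} - h_k + p*h_{k+1} = 0
        A[k, k - 1], A[k, k], A[k, k + 1] = 1 - p, -1.0, p
    return np.linalg.solve(A, b)[i]

print(reach_N_before_0(5, 10, 0.6), reach_N_linear_solve(5, 10, 0.6))
```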
Exercise 3.123 — The gambler’s ruin problem (bis) (Ross, 1983, Exercise 4.18, p.
136)
In the gambler’s ruin problem, prove that the probability that he/she wins the next
gamble, given that the present fortune is i and he/she eventually reaches N, is equal to:

p [1 − (q/p)^{i+1}] / [1 − (q/p)^i], if p ≠ 1/2;
(i + 1)/(2i), if p = 1/2.

•
Exercise 3.124 — The gambler’s ruin problem (bis, bis) (Ross, 1983, Exercise
4.20, pp. 136–137)
Suppose that two independent sequences X1, X2, . . . and Y1, Y2, . . . are coming in from
some laboratory and that they represent the results of Bernoulli trials with unknown
success probabilities P1 and P2.32

To decide whether P1 > P2 or P2 > P1, we use the following test. Choose some
positive integer M and stop at N, the first value of n such that either ∑_{i=1}^{n}(Xi − Yi) = M or
∑_{i=1}^{n}(Xi − Yi) = −M. In the former case we then assert that P1 > P2, and in the latter
that P2 > P1.
Show that, when P1 > P2:
(a) the probability of making an error (that is, of asserting that P2 > P1) is equal to
1/(1 + λ^M), where λ = [P1(1 − P2)] / [P2(1 − P1)];

(b) the expected number of pairs observed is [M(λ^M − 1)] / [(P1 − P2)(λ^M + 1)].
Hint: Relate this to the gambler’s ruin problem. •
32That is, P (Xi = 1) = 1 − P (Xi = 0) = P1 and P (Yi = 1) = 1 − P (Yi = 0) = P2 and all r.v. are
independent.
Chapter 4
Continuous time Markov chains
In the DTMC setting, we enter a state i at time n and stay there for exactly one time
unit and then jump to state j at time n + 1 with probability Pij, regardless of the states
we have visited up to and including time n − 1 (Kulkarni, 1995, p. 240).
Since many processes we may wish to model occur in continuous time,1 can we consider
a stochastic model, with index set R+0 , such that we enter state i at time t, stay there for
a random amount of time, then jump to state j with probability Pij and still satisfy the
Markov property?
YES!
As long as the time spent in state i is an exponentially distributed r.v. independent of
the next state visited. The resulting stochastic process is called a continuous time Markov
chain (CTMC).
CTMC have a wide variety of applications in the real world (Ross, 2003, p. 349)
— they naturally arise in control and optimization, manufacturing systems, biology and
financial engineering. Moreover, a large class of queueing models can be studied as CTMC
(Resnick, 1992, p. 367).
We have already dealt with a few CTMC. For instance, if the total number of arrivals
by time t is the state of the process at time t, then the Poisson process {N(t) : t ≥ 0} is
a CTMC with state space N0 (Ross, 2003, p. 349).
1E.g., disease transmission events, state of deterioration of mechanical components, etc.
4.1 Definitions and examples
We start this section with two possible definitions of CTMC.
Definition 4.1 — CTMC (Ross, 2003, p. 350)
Let {X(t) : t ≥ 0} be a continuous time stochastic process taking values in the set of
non-negative integers (that is, the state space S ⊆ N0). If

P[X(t + s) = j | X(s) = i, X(u) = x(u), 0 ≤ u < s] = P[X(t + s) = j | X(s) = i], (4.1)

for all s, t ≥ 0 and non-negative integers i, j and x(u), 0 ≤ u < s, then {X(t) : t ≥ 0} is
said to be a CTMC. If, in addition, the transition probabilities satisfy

P[X(t + s) = j | X(s) = i] = P[X(t) = j | X(0) = i], (4.2)

then the CTMC is said to be time-homogeneous.2 •
To motivate another definition of CTMC, let us recall Ross (1983, p. 142), who
mentions that, by the Markov property, this stochastic process has the following properties
each time it enters state i:
• the amount of time spent in state i (sojourn or holding time!) before making a
transition into a different state has exponential distribution with parameter νi;3
• the probability that the process leaves state i and the next state it enters is j equals
Pij, where ∑_{j≠i} Pij = 1.4
Definition 4.2 — CTMC (bis) (Kulkarni, 1995, pp. 240–241)
Let:
• {X(t) : t ≥ 0} be a continuous time stochastic process with state space S ⊆ N0;
2 Or to have stationary or homogeneous transition probabilities. The material in this and the next
sections only refers to CTMC with stationary transition probabilities.
3 A state i is called instantaneous if νi = +∞ (Kulkarni, 1995, p. 241), i.e., the expected sojourn
time in state i is equal to 0. From now on, we shall only deal with CTMC with no instantaneous states.
4 Pii = 0 unless state i is an absorbing state — in this case Pii = 1 (Kulkarni, 1995, p. 246) and νi = 0.
• S0 = 0;
• Sn be the time of the nth transition;
• Yn = Sn − Sn−1 be the nth sojourn or holding time;
• X0 = X(0) be the initial state of the process;
• Xn = X(S_n^+) = X(Sn) be the state of the stochastic process immediately after the
nth transition;

• Pij = P[X(S_{n+1}^+) = j | X(S_n^+) = i].
Then the stochastic process {X(t) : t ≥ 0} is said to be a CTMC with initial state
X0 = X(0) if it changes states at times 0 < S1 < S2 < . . . and the embedded process
{X0, (Xn, Yn) : n ∈ N} satisfies

P[Xn+1 = j, Yn+1 > y | (Xn, Yn) = (i, yn), (Xn−1, Yn−1) = (in−1, yn−1), . . . ,
(X1, Y1) = (i1, y1), X0 = i0] = Pij × e^{−νi × y}, (4.3)

for all non-negative integers i, j, in−1, . . . , i1, i0 and non-negative real numbers
y, yn, yn−1, . . . , y1.

{Xn : n ∈ N0} is usually called the embedded DTMC in the CTMC {X(t) : t ≥ 0}. •
Less formally, in a CTMC the succession of states visited still follows a DTMC but
now the flow of time is perturbed by exponentially distributed sojourn (or holding) times
in each state (Resnick, 1992, p. 367).
Example/Exercise 4.3 — (Sample path of a) CTMC
Some of the stochastic processes we have previously studied are indeed CTMC (Kulkarni,
1995, p. 242):
• {X(t) : t ≥ 0} ∼ PP(λ) is a CTMC with νi = λ and Pi,i+1 = 1;

• a compound PP — with batch arrival rate λ and batch sizes with p.f. ak =
P(batch size = k), k ∈ N — is a CTMC with νi = λ and Pij = aj−i, for j > i.
Draw a typical sample path of a CTMC (Kulkarni, 1995, Figure 6.1, p. 241). •
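Definition 4.2 translates directly into a simulation recipe: draw an Exp(νi) sojourn, then jump according to row i of the embedded TPM. A minimal sketch in Python (the function name and the illustrative rates are my own; the two-state machine of Exercise 4.12 is used as an example):

```python
import random

def simulate_ctmc(nu, P, x0, n_jumps, rng):
    """Simulate a CTMC path as a list of (jump time, new state) pairs.
    nu[i] = rate out of state i (sojourns are Exp(nu[i]));
    P[i]  = row i of the embedded DTMC's TPM."""
    t, x, path = 0.0, x0, [(0.0, x0)]
    for _ in range(n_jumps):
        t += rng.expovariate(nu[x])                         # sojourn in x
        x = rng.choices(range(len(P[x])), weights=P[x])[0]  # embedded jump
        path.append((t, x))
    return path

# Machine down (0) / up (1): repair rate lam, failure rate mu (made-up values)
lam, mu = 2.0, 1.0
nu = [lam, mu]
P = [[0, 1], [1, 0]]   # from each state, the next state is the other one
path = simulate_ctmc(nu, P, x0=1, n_jumps=10, rng=random.Random(42))
print(path)
```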
4.2 Properties of the transition matrix; Chapman-Kolmogorov equations
The law of motion of a CTMC is governed by a time-dependent transition probability
matrix.
Definition 4.4 — Transition probability matrix (Kulkarni, 1995, p. 243)
Let:
• {X(t) : t ≥ 0} be a (time-homogeneous) CTMC with state space S;
• Pij(t) = P [X(t) = j | X(0) = i], i, j ∈ S, be the (time-dependent) transition
probabilities.
Then
P(t) = [Pij(t)]i,j∈S (4.4)
is called the transition probability matrix (TPM). •
Remark 4.5 — Transition probability matrix (Kulkarni, 1995, p. 243)
The reader should not mistake the transition probabilities Pij(t) = P [X(t) = j | X(0) = i]
of the CTMC for the transition probabilities Pij of the embedded DTMC. •
Proposition 4.6 — Characterization of a CTMC (Kulkarni, 1995, Theorem 6.2, p.
244)
A CTMC {X(t) : t ≥ 0} is fully characterized by its

• TPM, P(t), and

• initial distribution, that is, the p.f. of X(0), denoted by α = [αi]i∈S = [P[X(0) = i]]i∈S . •
P(t) is certainly a stochastic matrix and satisfies the Chapman-Kolmogorov equations
(obviously rewritten for a continuous time stochastic process).
Proposition 4.7 — Properties of the TPM; Chapman-Kolmogorov equations
(Kulkarni, 1995, Theorem 6.3, p. 253–254)
The TPM, P(t), of a CTMC {X(t) : t ≥ 0} has the following properties:

• Pij(t) ≥ 0, i, j ∈ S, t ≥ 0;

• ∑_{j∈S} Pij(t) = 1, i ∈ S, t ≥ 0.

Moreover, P(0) = I and the Chapman-Kolmogorov equations5 are written as follows:

Pij(t + s) = ∑_{k∈S} Pik(t) × Pkj(s), i, j ∈ S, t, s ≥ 0, (4.5)

or, in matrix form,

P(t + s) = P(t) × P(s) (4.6)
         = P(s) × P(t), t, s ≥ 0. (4.7)
•
Exercise 4.8 — Chapman-Kolmogorov equations
Prove the Chapman-Kolmogorov equations (4.5) (Ross, 2003, p. 363). •
Exercise 4.9 — Properties of the TPM (Isaacson and Madsen, 1976, Exercise 1, p.
231)
Which of the following matrices have the properties of the TPM for a CTMC?
(a) P(t) = | e^{−t}  1 − e^{−t} |
           | 0       1          |

(b) P(t) = | e^t  1 − e^t |
           | 0    1       |

(c) P(t) = | 1             0       |
           | 1 − te^{−t}   te^{−t} |

(d) P(t) = | t + e^{−t}  1 − t − e^{−t} |
           | 0           1              |

(e) P(t) = | 1 − te^{−t}   te^{−t}        0              0            |
           | te^{−t}       1 − 3te^{−t}   2te^{−t}       0            |
           | 0             te^{−t}        1 − 2te^{−t}   te^{−t}      |
           | 0             0              te^{−t}        1 − te^{−t}  |

•

5 The equations were arrived at independently by both the British mathematician Sydney
Chapman (1888–1970) and the Russian mathematician Andrey Kolmogorov (1903–1987)
(http://en.wikipedia.org/wiki/Chapman-Kolmogorov_equation).
Marginal and joint probabilities (ADDED)
Let:
• {X(t) : t ≥ 0} be a CTMC with TPM P(t) = [Pij(t)]i,j∈S ;

• α = [αi]i∈S be the row vector with the initial distribution of the CTMC (i.e., the
p.f. of X(0)).

Then

P[X(t) = j] = ∑_{i∈S} P[X(0) = i] × P[X(t) = j | X(0) = i] = ∑_{i∈S} αi × Pij(t), j ∈ S, (4.8)

and the row vector with the p.f. of X(t) is given by

[P[X(t) = j]]_{j∈S} = α × P(t). (4.9)

Moreover,

P[X(t1) = x(t1), . . . , X(tk) = x(tk)] = [∑_{i∈S} αi × Pi,x(t1)(t1)] × ∏_{j=2}^{k} Px(tj−1),x(tj)(tj − tj−1), (4.10)

for 0 ≤ t1 < t2 < · · · < tk and x(t1), . . . , x(tk) ∈ S. •
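Equation (4.9) in code, for a two-state chain whose P(t) entries admit the closed form derived later in Exercise 4.28 (the rates and the initial distribution below are made-up illustrative values):

```python
import math

lam, mu, t = 1.0, 2.0, 0.5   # illustrative rates: state 0 fails at rate lam
a = math.exp(-(lam + mu) * t)
# Closed-form TPM P(t) for the two-state machine chain
P = [[lam / (lam + mu) * a + mu / (lam + mu), lam / (lam + mu) * (1 - a)],
     [mu / (lam + mu) * (1 - a), mu / (lam + mu) * a + lam / (lam + mu)]]

alpha = [0.3, 0.7]           # hypothetical initial distribution of X(0)
pf = [sum(alpha[i] * P[i][j] for i in range(2)) for j in range(2)]  # alpha x P(t)
print(pf)  # the p.f. of X(t); its two entries sum to 1
```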
Definition 4.10 — Instantaneous transition rates (Ross, 2003, p. 362)
For any pair of states i and j (i 6= j), let
qij = νi × Pij, (4.11)
where νi is the rate at which the process makes a transition when in state i and Pij
is the probability that this transition is into state j. The quantities qij are called the
instantaneous transition rates and represent the rate, when in state i, at which the process
makes a transition into state j. •
Remark 4.11 — Instantaneous transition rates and rate diagrams (Kulkarni,
1995, p. 246)
A rate diagram is a directed graph in which each state is represented by a node and there
is an arc going from node i to node j (if qij > 0) with qij written on it.
The rate diagram helps us visualize the dynamics of the CTMC and is the continuous
analogue of the transition diagram of a DTMC. •
Exercise 4.12 — Rate diagram (Kulkarni, 1995, Example 6.1, pp. 242 and 246–247)
Consider a machine that can be either up (1) or down (0). If the machine is up (resp.
down), it fails (resp. is repaired) after an Exp(µ) (resp. Exp(λ)) amount of time. Once
this machine is repaired it is good as new.
Let X(t) : t ≥ 0 be the state of the machine at time t and draw the corresponding
rate diagram. •
Specifying the instantaneous transition rates determines the parameters of the CTMC
(Ross, 2003, p. 362). In addition, the instantaneous transition rates are related to the
infinitesimal behavior of the transition probabilities, as stated by the next proposition.
Proposition 4.13 — Infinitesimal behavior of the transition probabilities (Ross,
2003, Lemma 6.2, p. 362)
Let {X(t) : t ≥ 0} be a CTMC with state space S, TPM P(t) and instantaneous transition
rates qij. Then:

lim_{h→0+} Pij(h)/h = qij, i ≠ j; (4.12)

lim_{h→0+} [1 − Pii(h)]/h = νi. (4.13)

•
Capitalizing on Proposition 4.13 and on the Chapman-Kolmogorov equations, we can
derive a set of differential equations that the transition probabilities Pij(t) satisfy (Ross,
2003, p. 362) and provide a solution for them.
Proposition 4.14 — Kolmogorov’s backward and forward equations (Ross, 2003,
Theorem 6.1, pp. 364, 367)
For all states i and j and times t ≥ 0:
dPij(t)/dt = lim_{h→0+} [Pij(h + t) − Pij(t)] / h = ∑_{k≠i} qik Pkj(t) − νi Pij(t) (backward equations); (4.14)

dPij(t)/dt = lim_{h→0+} [Pij(t + h) − Pij(t)] / h = ∑_{k≠j} Pik(t) qkj − Pij(t) νj (forward equations). (4.15)

•
Exercise 4.15 — Kolmogorov’s backward and forward equations
Prove Proposition 4.14 (Ross, 2003, Theorem 6.1, pp. 363–364 and 367). •
Proposition 4.16 — Kolmogorov’s backward and forward equations in matrix
form (Ross, 2003, p. 388)
Let

rij = qij, if i ≠ j;  rij = −νi, if i = j, (4.16)

and R = [rij]i,j∈S .6 Then Kolmogorov’s backward and forward equations can be written in
matrix form:

dP(t)/dt = [dPij(t)/dt]_{i,j∈S} = R × P(t) (backward equations) (4.17)
                               = P(t) × R (forward equations). (4.18)

•

6 R is usually called the rate matrix (or the infinitesimal generator) of the CTMC.
Proposition 4.17 — Solution of the Kolmogorov’s backward and forward
equations in matrix form (Ross, 2003, p. 388)
The solution of the matrix differential equations dP(t)/dt = R × P(t) and dP(t)/dt = P(t) × R
is

P(t) = e^{R t} (4.19)
     = ∑_{n=0}^{+∞} R^n t^n / n!. (4.20)

•
According to Ross (2003, p. 389), the direct use of (4.20) to compute P(t) turns out to
be very inefficient (why?),7 not to mention the case of CTMC with infinite state space.
Consequently, we are going to discuss methods to derive or to approximate the TPM P(t)
in the next two sections.
7Since R contains both positive and negative entries we are bound to deal with computer round-off
errors when we compute the powers of the matrix R. Moreover, to arrive at a good approximation we
have to compute a lot of the terms in the infinite sum (4.20).
4.3 Computing the transition matrix: finite state
space
Rather than using (4.20) to compute the TPM, we can use the matrix equivalent of the
identities
e^x = lim_{n→+∞} (1 + x/n)^n = lim_{n→+∞} [(1 − x/n)^{−1}]^n

to efficiently (derive or) approximate P(t).
Proposition 4.18 — Two approximations to P(t) (Ross, 2003, pp. 389–390)
Since

P(t) = e^{R t} = lim_{n→+∞} (I + R t/n)^n (4.21)
              = lim_{n→+∞} [(I − R t/n)^{−1}]^n, (4.22)

if we let n be a power of 2, say n = 2^k, then we can approximate P(t) by raising either the
matrix (I + R t/n) or the matrix (I − R t/n)^{−1} to the nth power, which can be accomplished
by k matrix multiplications.8 •
Exercise 4.19 — Two approximations to P(t)
Consider

R = | −λ  λ  |
    | µ   −µ |

where λ = 1 and µ = 2.

8 For instance, we multiply (I + R t/n) by itself to obtain (I + R t/n)^2 and then multiply that by
itself to obtain (I + R t/n)^4 and so on.
(a) Use Mathematica, in particular the function MatrixExp, to obtain P(t), for t = 1, 100.
(b) Compare the exact results in (a) to the approximate ones, obtained by using
Proposition 4.18. •
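The squaring trick of Proposition 4.18 takes only a few lines. A sketch for the 2 × 2 rate matrix above, compared against the closed form P00(t) = µ/(λ + µ) + [λ/(λ + µ)] e^{−(λ+µ)t} derived in Exercise 4.28 (the function name is my own):

```python
import numpy as np

def approx_P(R, t, k=20):
    """Approximate P(t) = exp(R t) by (I + R t / 2^k)^(2^k),
    computed with k successive squarings (Proposition 4.18)."""
    n = 2**k
    M = np.eye(R.shape[0]) + R * t / n
    for _ in range(k):
        M = M @ M
    return M

lam, mu = 1.0, 2.0
R = np.array([[-lam, lam], [mu, -mu]])
t = 1.0
P = approx_P(R, t)

# Closed form for this two-state chain:
exact_P00 = mu / (lam + mu) + lam / (lam + mu) * np.exp(-(lam + mu) * t)
print(P[0, 0], exact_P00)
```

Each squaring doubles the exponent, so n = 2^20 costs only 20 matrix multiplications.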
For a more detailed account on other methods to compute the TPM P(t) of a CTMC
with finite state space, the reader is referred to Kulkarni (1995, pp. 261–274).
4.4 Computing the transition matrix: infinite state
space
When the CTMC is defined on an infinite state space, exact analytical computation of
P(t) is only possible if the CTMC has a special structure, such as the one in Exercise
4.20.
The method of Laplace transforms, also used when the state space is finite, is
particularly useful and therefore briefly described, following Kulkarni (1995, pp. 269–270).
Since P(t) is a matrix of bounded continuous functions, it is possible to define Laplace
transforms of its entries:

P*_ij(s) = ∫_0^{+∞} e^{−st} Pij(t) dt, i, j ∈ S. (4.23)

Using the properties of Laplace transforms and Kolmogorov’s forward equations, we
successively get

∫_0^{+∞} e^{−st} [dPij(t)/dt] dt = s × P*_ij(s) − Pij(0) (4.24)

and

s × P*_ij(s) − Pij(0) = ∑_{k≠j} P*_ik(s) × qkj − P*_ij(s) × νj. (4.25)
Now, using the fact that P(0) = I, we can solve (4.25) recursively, as in the next exercise.
Exercise 4.20 — Obtaining P(t) via the method of Laplace transforms
Consider {X(t) : t ≥ 0} ∼ PP(λ) and Pk(t) ≡ P[X(t) = k | X(0) = 0]:
(a) write Kolmogorov’s forward equations for this CTMC in terms of Pk(t);
(b) use the method of Laplace transforms to obtain a solution to Pk(t)
(Kulkarni, 1995, Example 6.20, p. 275). •
Kulkarni (1995, pp. 275–282) shows how the problem of computing P(t) — when the
CTMC is defined on an infinite state space — needs analytical tools such as Laplace
transforms, partial differential equations, etc.9
All these methods are particularly useful to solve Kolmogorov’s backward and forward
equations, while dealing with a broad and popular class of CTMC, the birth and death
processes.
4.5 Birth and death processes
Birth-death processes are special cases of CTMC where the state transitions are of only
two types:

• births (or arrivals), which increase the state variable by one;

• deaths (or departures), which decrease the state variable by one.
The model’s name comes from a common application, the use of such models to represent
the current size of a population where the transitions are literally due to births and deaths
(http://en.wikipedia.org/wiki/Birth-death_process).

Unsurprisingly, birth-death processes have many applications in demography, queueing
theory, performance engineering, epidemiology, or biology — they may be used,
for example, to study the evolution of bacteria, the number of people with a
disease within a population, or the number of customers in line at the supermarket
(http://en.wikipedia.org/wiki/Birth-death_process).
9In fact, partial differential equations arise while solving Kolmogorov’s forward equations via the p.g.f.,
as illustrated by Kulkarni (1995, Example 6.24, pp. 278–282).
Definition 4.21 — Birth and death process (Ross, 2003, p. 352)
Let the state variable X(t) be the number of people in a system at time t. Now, suppose
that whenever there are n people in the system

• the time until the next birth/arrival is exponentially distributed, with mean λn^{−1}
(n ∈ N0), and independent of

• the time until the next death/departure, which is exponentially distributed with
mean µn^{−1} (n ∈ N).

Then {X(t) : t ≥ 0} is called a birth and death process, with birth rates {λn : n ∈ N0}
and death rates {µn : n ∈ N}. •
Remark 4.22 — Birth and death processes (Ross, 2003, pp. 352–353; Kleinrock,
1975, p. 54)
• A birth and death process is a CTMC with state space N0 for which transitions only
to states n− 1 and n+ 1 are possible from state n.
• The rates at which the process makes a transition when in state i are:

ν0 = λ0; (4.26)

νi = λi + µi, i ∈ N. (4.27)

The transition probabilities Pij of the embedded DTMC are equal to:

P01 = 1; (4.28)

Pi,i+1 = P(birth before a death given i people in the system) = λi / (λi + µi), i ∈ N; (4.29)

Pi,i−1 = P(death before a birth given i people in the system) = µi / (λi + µi), i ∈ N. (4.30)

• In addition, the instantaneous transition rates are given by

qij = νi × Pij = λi, if j = i + 1;  µi, if j = i − 1, (4.31)

for i ≠ j.

• Given that X(t) = i, the probability that:

– one birth occurs in the interval (t, t + ∆t] is given by Pi,i+1(∆t) = λi × ∆t + o(∆t);

– one death occurs in the interval (t, t + ∆t] is equal to Pi,i−1(∆t) = µi × ∆t + o(∆t);

– no death or birth occurs in the interval (t, t + ∆t] amounts to Pi,i(∆t) = 1 − (λi + µi) × ∆t + o(∆t).
Consequently,
– multiple births,
– multiple deaths,
– a birth and a death,
in intervals of infinitesimal range ∆t are not possible.
• A birth and death process for which µn = 0, n ∈ N (resp. λn = 0, n ∈ N0), is called
a pure birth (resp. pure death) process. •
Exercise 4.23 — Rate diagrams and rate matrices of birth and death processes
Draw the rate diagrams of the following birth and death processes:
(a) Poisson process with arrival rate λ (Kulkarni, 1995, Figure 6.5, p. 249);
(b) pure birth process with birth rates λi (Kulkarni, 1995, Figure 6.6, p. 249);
(c) pure death process with death rates µi (Kulkarni, 1995, Figure 6.7, p. 250);
(d) general birth and death process with birth and death rates λi and µi, respectively
(Kulkarni, 1995, Figure 6.8, p. 251; http://en.wikipedia.org/wiki/Birth-death_process).
Identify the rate matrices of all these CTMC. •
Before we proceed to describe the derivation of the TPM P(t), we illustrate the
computation of the expected value of the state variable of a few birth and death processes
with two exercises.
Exercise 4.24 — Expected value of a linear growth model
Consider a population in which each individual gives birth at an exponential rate λ and
dies at an exponential rate µ.
After having identified the birth and death rates of this linear growth model, derive
Mi(t) = E[X(t) | X(0) = i], the expected value of the size of the population at time t
given that the population started with i ∈ N individuals (Ross, 1989, Example 3c, pp.
252–254). •
Exercise 4.25 — Expected value of a linear growth model with immigration
Admit the size of a bird colony is governed by a birth and death process with rates
λn = nλ+ θ, n ∈ N0 and µn = nµ, n ∈ N.10
Derive E[X(t) | X(0) = i], the expected value of the size of the bird colony at time
t given that this colony was founded by i ∈ N individuals (Ross, 2003, Example 6.4, pp.
353–355). •10This is called a linear growth model with immigration: each individual in the population is assumed
to give birth at an exponential rate λ; there is an exponential rate of increase θ of the population due to
an external source such as immigration; deaths are assumed to occur at an exponential rate µ for each
member of the population (Ross, 2003, Example 6.4, p. 353).
Proposition 4.26 — Kolmogorov’s backward and forward equations for birth
and death processes (Ross, 2003, examples 6.10 and 6.12, pp. 364 and 368)
For birth and death processes:
• Kolmogorov’s backward (h + t) equations become

dP0j(t)/dt = λ0 P1j(t) − λ0 P0j(t), j ∈ N0; (4.32)

dPij(t)/dt = λi Pi+1,j(t) + µi Pi−1,j(t) − (λi + µi) Pij(t), i ∈ N, j ∈ N0; (4.33)

• Kolmogorov’s forward (t + h) equations are given by

dPi0(t)/dt = Pi1(t) µ1 − Pi0(t) λ0, i ∈ N0; (4.34)

dPij(t)/dt = Pi,j−1(t) λj−1 + Pi,j+1(t) µj+1 − Pij(t)(λj + µj), i ∈ N0, j ∈ N. (4.35)

•
Exercise 4.27 — Kolmogorov’s backward and forward equations for a pure
birth process
Write Kolmogorov’s backward and forward equations for a pure birth process (Ross, 2003,
Example 6.9, p. 364). •
Solving Kolmogorov’s backward differential equations is feasible, namely for some
birth and death processes with finite state space such as the CTMC of the next exercise.
Exercise 4.28 — Solving Kolmogorov’s backward differential equations
Suppose that:
• a machine works for an exponential amount of time with mean λ−1 before breaking
down;
• it takes an exponential amount of time with mean µ−1 to repair the machine.
(a) Show that if the machine is in working condition (state 0) at time 0 then the
probability that it will be working at time t is equal to

P00(t) = λ/(λ + µ) × e^{−(λ+µ)t} + µ/(λ + µ)

and

P10(t) = µ/(λ + µ) − µ/(λ + µ) × e^{−(λ+µ)t}

(Ross, 1989, Example 4c, pp. 263–265; Ross, 2003, Example 6.11, pp. 364–366).

(b) Consider λ = 1, µ = 2 and t = 10 and compare P(t) to its approximations
(I + R t/n)^n and [(I − R t/n)^{−1}]^n, where n = 2^10. •
Solving Kolmogorov’s forward differential equations is also possible in certain
cases, namely for pure birth processes, as shown by Proposition 4.29 and Exercise 4.31.
Moreover, Kolmogorov’s forward differential equations are in fact differential-difference
equations; they can always be solved, at least in principle, by recurrence, that is,
successive substitution (Cooper, 1981, p. 16).
Proposition 4.29 — Solving Kolmogorov’s forward equations for pure birth
processes (Ross, 1989, Proposition 4.1, p. 266)
Let {X(t) : t ≥ 0} be a pure birth process with rates λi, i ∈ N0. Then the entries of the TPM can be obtained recursively:

Pii(t) = e^{−λi t}, i ∈ N0; (4.36)
Pij(t) = λj−1 × e^{−λj t} × ∫_0^t e^{λj s} Pi,j−1(s) ds, i ∈ N0, j = i + 1, i + 2, . . . ; (4.37)

and Pij(t) = 0, for j = 0, 1, . . . , i − 1. •
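The recursion (4.36)–(4.37) also lends itself to numerical evaluation. The sketch below (my own, not from the sources) approximates the integral in (4.37) by cumulative trapezoidal quadrature, and can be checked against the pure birth process with constant rates — the Poisson process — for which P0j(t) = e^{−λt}(λt)^j/j!:

```python
import numpy as np

def pure_birth_P(rates, i, j, t, n_grid=4001):
    """P_{ij}(t) of a pure birth process via the recursion (4.36)-(4.37),
    with the integral in (4.37) evaluated by cumulative trapezoidal quadrature."""
    s = np.linspace(0.0, t, n_grid)
    Pk = np.exp(-rates[i] * s)                    # (4.36): P_{ii}(s) = e^{-lambda_i s}
    for k in range(i + 1, j + 1):
        integrand = np.exp(rates[k] * s) * Pk     # e^{lambda_k s} P_{i,k-1}(s)
        cum = np.concatenate(([0.0], np.cumsum(
            (integrand[1:] + integrand[:-1]) * np.diff(s) / 2.0)))
        Pk = rates[k - 1] * np.exp(-rates[k] * s) * cum   # (4.37)
    return Pk[-1]
```

For instance, with rates λk ≡ 2, i = 0 and t = 1.5, the result should be close to the Poisson probability e^{−3} 3^3/3!.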
Exercise 4.30 — Solving Kolmogorov’s forward equations for pure birth
processes
Prove Proposition 4.29 (Ross, 1989, p. 266). •
Exercise 4.31 — Solving Kolmogorov’s forward equations for a Yule process
The Yule process is a pure birth process having rates λj = jλ, j ∈ N0.
(a) Use Proposition 4.29 to prove that, for fixed i ∈ N,
Pij(t) = (j−1 choose i−1) × (e^{−λt})^i × (1 − e^{−λt})^{j−i}, j = i, i + 1, . . .
(Ross, 1989, pp. 266–267).
(b) Give a probabilistic interpretation to the result (Ross, 1983, pp. 144–145). •
Kolmogorov’s forward differential equations are also easy to derive and handle
when we are dealing with pure death processes, such as the ones in exercises 4.32 and
4.33.
Exercise 4.32 — Verifying Kolmogorov’s forward differential equations
Admit the size of a population at time t, X(t), can be described by a pure death
process with rates µk = kµ, k = 0, 1, . . . , n, where n (n ∈ N) represents the initial
number of individuals.
(a) Write Kolmogorov’s forward differential equations in terms of Pk(t) ≡ Pnk(t) =
P [X(t) = k | X(0) = n].
(b) Show that

Pk(t) = (n choose k) × (e^{−µt})^k × (1 − e^{−µt})^{n−k}, k = 0, 1, . . . , n,

verifies Kolmogorov’s forward equations written in (a). •
Exercise 4.33 — Kolmogorov’s forward differential equations for a pure death
process
There are n0 (n0 ∈ N) seals in an isolated cove; they are all sick and have to be captured
and taken from the cove to be treated.
Let X(t) be the number of (uncaptured) seals in the isolated cove at time t and admit
that {X(t) : t ≥ 0} is a pure death process with rates µk = kµ, k ∈ {0, 1, . . . , n0}.
(a) Derive Kolmogorov’s forward equations in terms of Pk(t) ≡ Pn0,k(t) = P [X(t) = k |
X(0) = n0].
(b) Argue that the solution to these equations is
Pk(t) = P[X(t) = k | X(0) = n0] = (n0 choose k) × (pt)^k × (1 − pt)^{n0−k},
and identify pt.
(c) Compute E[X(t) | X(0) = n0].
(d) Let Tc be the time needed to capture all the seals. Derive the p.d.f. of Tc. •
The p.g.f. method, also called z − transform method, is frequently used to reduce the
Kolmogorov’s forward differential equations to a single partial differential equation,
whose solution can be derived for some birth and death processes.
Let:

• {X(t) : t ≥ 0} be a birth and death process such that X(0) = i (where i ≠ 0);

• Pj(t) ≡ P[X(t) = j | X(0) = i] be the p.f. of the r.v. (X(t) | X(0) = i);

• P(z, t) = E[z^{X(t)} | X(0) = i], |z| ≤ 1, be the p.g.f. of (X(t) | X(0) = i).

Then multiplying the jth Kolmogorov’s forward differential equation in (4.35) by z^j and summing up in j (Kulkarni, 1995, p. 279), we get a single equation:

∑_{j∈S} z^j × dPj(t)/dt = ∑_{j∈S} z^j × [Pj−1(t) λj−1 + Pj+1(t) µj+1 − Pj(t) (λj + µj)]. (4.38)

By noting that

∑_{j∈S} z^j × dPj(t)/dt = ∂P(z, t)/∂t (4.39)

and that, depending on the birth and death rates, the right-hand side of (4.38) can be written in terms of P(z, t) and

∂P(z, t)/∂z = ∑_{j∈S} j z^{j−1} × Pj(t) = ∑_{j∈S} (j + 1) z^j × Pj+1(t), (4.40)

(4.38) is nothing but a (first-order) partial differential equation whose solution is the p.g.f. of the r.v. (X(t) | X(0) = i).
Exercise 4.34 — Solving Kolmogorov’s forward equations via the p.g.f. method
(Kleinrock, 1975, Exercise 2.10(a)–(d), p. 81)
Admit X(t) : t ≥ 0 is a Yule process — i.e., a pure birth process with birth rates
λj = jλ, for j ∈ N0 — with X(0) = 1.
(a) Derive Kolmogorov’s forward equations in terms of Pj(t) ≡ P1j(t) = P [X(t) = j |
X(0) = 1].
(b) After having rewritten the Kolmogorov’s forward equations derived in (a) as a partial differential equation in terms of the p.g.f. of the r.v. (X(t) | X(0) = 1), P(z, t) = E[z^{X(t)} | X(0) = 1], verify that

P(z, t) = z e^{−λt} / [1 − (1 − e^{−λt}) × z], |z| ≤ 1,

satisfies that partial differential equation (Cooper, 1981, Exercise 6 a)–b), p. 34).
(c) Identify the distribution of (X(t) | X(0) = 1) and compute E[X(t) | X(0) = 1]. •
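For the Yule process started at X(0) = 1, the procedure above yields the first-order PDE ∂P(z, t)/∂t = λ z(z − 1) ∂P(z, t)/∂z. A numerical sanity check of part (b) can be sketched with central finite differences (the values of λ, z and t below are arbitrary assumptions of mine):

```python
import math

lam = 0.7   # assumed birth-rate parameter

def P(z, t):
    """Candidate p.g.f. of the Yule process started at X(0) = 1."""
    a = math.exp(-lam * t)
    return z * a / (1.0 - (1.0 - a) * z)

# central finite differences at an arbitrary interior point (|z| < 1, t > 0)
z0, t0, h = 0.4, 1.3, 1e-5
dP_dt = (P(z0, t0 + h) - P(z0, t0 - h)) / (2 * h)
dP_dz = (P(z0 + h, t0) - P(z0 - h, t0)) / (2 * h)
residual = dP_dt - lam * z0 * (z0 - 1.0) * dP_dz   # should be ~0
```

A near-zero residual at several points (z, t) is of course not a proof, but it is a quick way to catch algebra mistakes before attempting the analytical verification.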
Exercise 4.35 — Solving Kolmogorov’s forward equations via the p.g.f. method
(bis) (Kleinrock, 1975, Exercise 2.12, p. 82)
Let:
• X(t) : t ≥ 0 be a birth and death process with X(0) = 0 and rates λj = λ, j ∈ N0
and µj = jµ, j ∈ N;
• Pj(t) ≡ P0j(t) be the p.f. of the r.v. (X(t) | X(0) = 0).
(a) Derive Kolmogorov’s forward equations in terms of Pj(t).
(b) After having rewritten the Kolmogorov’s forward equations derived in (a) as a partial differential equation in terms of the p.g.f. of (X(t) | X(0) = 0), verify that

P(z, t) = exp[−λ × (1 − e^{−µt}) × (1 − z) / µ], |z| ≤ 1,

satisfies that partial differential equation (Cooper, 1981, pp. 32–33).
(c) Rewrite P (z, t) as a power series to identify Pj(t) and calculate limt→+∞ Pj(t) (Cooper,
1981, p. 33). •
Exercise 4.36 — Solving Kolmogorov’s forward equations via the p.g.f. method
(bis, bis) (Kleinrock, 1975, Exercise 2.14, pp. 82–83)
Let:
• X(t) : t ≥ 0 a birth and death process with X(0) = 1 and rates λj = jλ, j ∈ N0
and µj = jµ, j ∈ N;
• Pj(t) ≡ P1j(t) be the p.f. of the r.v. (X(t) | X(0) = 1).
(a) Derive Kolmogorov’s forward equations in terms of Pj(t) and a partial differential
equation satisfied by the p.g.f. of (X(t) | X(0) = 1) (Kulkarni, 1995, Example 6.24,
pp. 278–279).
(b) Verify that

P(z, t) = ( µ [1 − e^{(λ−µ)t}] − [λ − µ e^{(λ−µ)t}] z ) / ( µ − λ e^{(λ−µ)t} − λ [1 − e^{(λ−µ)t}] z )

satisfies the partial differential equation derived in (a).
(c) Calculate the expected value and the variance of (X(t) | X(0) = 1).
(d) After having rewritten P(z, t) as a power series, show that

Pj(t) = α(t), j = 0;  [1 − α(t)] × [1 − β(t)] × [β(t)]^{j−1}, j ∈ N,

and obtain expressions for α(t) and β(t) (Kulkarni, 1995, Example 6.24, p. 281).
(e) Find the extinction probability, limt→+∞ P0(t). •
4.6 Classification of states
The concepts of accessibility, communication, irreducibility, transience and recurrence for CTMC can be defined along the same lines as for DTMC. Consequently, these concepts are only briefly discussed here.
Definition 4.37 — CTMC and accessibility, communication, irreducibility,
transience and recurrence (Kulkarni, 1995, definitions 6.2–6.8, pp. 283–285)
Let:
• {X(t) : t ≥ 0} be a CTMC with state space S, TPM P(t) and initial state i;

• S1 be the time of the first jump of this stochastic process;

• Tj = inf{t ≥ S1 : X(t) = j} be the first time the CTMC enters state j ∈ S;

• Ti = inf{t ≥ S1 : X(t) = i} be the first time the CTMC returns to state i ∈ S;
• fij = P [Tj < +∞ | X(0) = i] be the probability that the first visit to state j (resp.
the first return to the initial state i if j = i) occurs in finite time;
• µij = E[Tj | X(0) = i] be the expected time until the first visit to state j (resp. the
first return to the initial state i if j = i).
Then, for i, j ∈ S:
• state j is said to be accessible from state i, i.e., i→ j, if Pij(t) > 0 for some t ≥ 0;
• states i and j are said to communicate, i.e., i↔ j, if i→ j and j → i;11
• a set of states C ⊂ S is said to be a communicating class if
(i) i, j ∈ C ⇒ i↔ j
(ii) i ∈ C, i↔ j ⇒ j ∈ C;
11Two states that communicate are obviously said to be in the same class.
• A communicating class C ⊂ S is said to be closed if i ∈ C, j 6∈ C ⇒ i 6→ j.
• the CTMC is said to be irreducible if its state space S is a single closed
communicating class, i.e., if all states communicate with each other; otherwise,
the CTMC is called reducible;
• state i is said to be recurrent if fii = 1;
• state i is called transient if fii < 1;
• a recurrent state i is said to be
(i) positive recurrent if µii < +∞
(ii) null recurrent if µii = +∞. •
Remark 4.38 — Periodicity (Kulkarni, 1995, p. 287)
Tj is a continuous r.v. and, thus, if state j is accessible from state i (i → j) then it is
possible to visit j at any time t > 0 starting from i.12 Consequently, the notion of period
of a state of a CTMC does not exist. •
Since CTMC can be alternatively described in terms of holding times and an embedded
DTMC, can accessibility, communication, irreducibility, transience and recurrence be
defined in terms of such DTMC?
Yes!
This is indeed possible if we are dealing with what is called a regular CTMC, i.e., with
no instantaneous states.
Definition 4.39 — Regular CTMC (Ross, 1983, p. 142)
A CTMC is said to be regular if, with probability one, the number of transitions in any time interval of finite length is finite; this holds, in particular, when sup_{i∈S} νi < +∞. •
12That is, if ∃s > 0 : Pij(s) > 0 then Pij(t) > 0,∀t > 0.
Proposition 4.40 — Accessibility, communication, irreducibility, transience
and recurrence redefined for CTMC (Kulkarni, 1995, theorems 6.8 and 6.9, pp.
284–285)
Let:

• {X(t) : t ≥ 0} be a regular CTMC with state space S and TPM P(t) = [Pij(t)]i,j∈S;

• R = [rij]i,j∈S be the associated rate matrix (or infinitesimal generator), where rij = qij = νi × Pij (i ≠ j) and rij = −νi (i = j);

• {Xn : n ∈ N0} be the embedded DTMC with TPM P = [Pij]i,j∈S, where13

Pij = qij/νi, if νi ≠ 0, i ≠ j;  0, if νi ≠ 0, i = j;  0, if νi = 0, i ≠ j;  1, if νi = 0, i = j. (4.41)
Then

{X(t) : t ≥ 0}                            {Xn : n ∈ N0}
i → j                          ⇔          i → j
i ↔ j                          ⇔          i ↔ j
C is a communicating class     ⇔          C is a communicating class
MC is irreducible              ⇔          MC is irreducible
i is recurrent                 ⇔          i is recurrent
i is transient                 ⇔          i is transient

•
Remark 4.41 — Transience and recurrence redefined for CTMC (Kulkarni, 1995,
p. 285)
Immediate consequences of Proposition 4.40:
• recurrence and transience are class properties;
13This is because the quantities are undefined when νi = 0 (Kulkarni, 1995, p. 284).
• the criteria to test recurrence and transience of DTMC (see Proposition 3.43) can
be used to establish the recurrence and transience of the embedded DTMC and
therefore of the CTMC. •
Needless to say, positive and null recurrence cannot be defined in terms of that embedded DTMC because those two concepts rely on the holding times. However, the next proposition establishes a criterion for positive (resp. null) recurrence somewhat analogous to a result on the positive recurrence of DTMC (see Remark 3.62).
Proposition 4.42 — Criterion for positive (resp. null) recurrence (Kulkarni,
1995, Theorem 6.10, p. 285)
Let:
• {X(t) : t ≥ 0} be an irreducible and recurrent CTMC with state space S;

• {Xn : n ∈ N0} be the recurrent embedded DTMC with TPM P = [Pij]i,j∈S;

• π be a positive solution to π = π × P.

Then the CTMC is positive (resp. null) recurrent iff ∑_{i∈S} πi/νi < +∞ (resp. ∑_{i∈S} πi/νi = +∞). •
Proposition 4.42 also proves that positive and null recurrence are class properties in
the CTMC setting (Kulkarni, 1995, p. 286).14
14Please refer to Kulkarni (1995, Example 6.28, pp. 286–287) for a positive recurrent CTMC with null
recurrent embedded DTMC and vice-versa.
4.7 Limit behavior of CTMC
Computing the TPM P(t) for a fixed finite t is not a trivial problem to handle,
algebraically or numerically (Kulkarni, 1995, p. 282). Expectedly, we shift our focus
to the study of the behavior of P(t) as t→ +∞. But can we determine limt→+∞P(t)?
Yes!
What follows provides answers to questions, such as:
• when does Pij(t) have a limit as t→ +∞?
• how to compute limt→+∞ Pij(t)?
(Kulkarni, 1995, p. 282).
Example/Exercise 4.43 — Limit behavior of P(t)
(a) The CTMC described in Exercise 4.28 has TPM equal to

P(t) = e^{−(λ+µ)t} × [ λ/(λ+µ)   −λ/(λ+µ)
                      −µ/(λ+µ)    µ/(λ+µ) ] + [ µ/(λ+µ)   λ/(λ+µ)
                                                µ/(λ+µ)   λ/(λ+µ) ],

thus

lim_{t→+∞} P(t) = [ µ/(λ+µ)   λ/(λ+µ)
                    µ/(λ+µ)   λ/(λ+µ) ],

obviously independent of the initial state of the CTMC (Kulkarni, 1995, Example 6.25, p. 282).
(b) Consider the CTMC described in Kulkarni (1995, Example 6.13, pp. 261–262), with five states and the following rate matrix

[ −λ1      0        λ1           0            0
   0      −λ2       0            λ2           0
   0       µ1      −(µ1 + λ2)    0            λ2
   µ2      0        0           −(µ2 + λ1)    λ1
   0       0        µ2           µ1          −(µ1 + µ2) ],

where λ1 = 1, λ2 = 2, µ1 = 0.1 and µ2 = 0.15 (Kulkarni, 1995, Example 6.26, p. 283). After having drawn the rate diagram of this CTMC, use the Mathematica function MatrixExp to obtain P(t) and investigate the limit behavior of this TPM.
(c) Consider the CTMC from Exercise 4.36 — now with X(0) = i and λ > µ. It can be shown that

lim_{t→+∞} Pij(t) = (µ/λ)^i, j = 0;  0, j ∈ N,

thus, the limiting probabilities are dependent on the initial state (Kulkarni, 1995, Example 6.27, p. 283). •
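A Python alternative to MatrixExp for part (b) is scipy.linalg.expm; the sketch below (my own — the time points and tolerances are arbitrary choices) computes P(t) and probes its limit behavior:

```python
import numpy as np
from scipy.linalg import expm

# Rate matrix of part (b), with the stated parameter values
l1, l2, m1, m2 = 1.0, 2.0, 0.1, 0.15
R = np.array([
    [-l1,  0.0,  l1,          0.0,         0.0],
    [0.0, -l2,   0.0,         l2,          0.0],
    [0.0,  m1, -(m1 + l2),    0.0,         l2],
    [m2,   0.0,  0.0,       -(m2 + l1),    l1],
    [0.0,  0.0,  m2,          m1,        -(m1 + m2)],
])

P10 = expm(R * 10.0)     # TPM at t = 10
P_big = expm(R * 1e4)    # TPM at a very large t, to probe the limit
```

At large t the rows of P(t) become (numerically) identical, illustrating that for this irreducible finite-state CTMC the limiting distribution does not depend on the initial state.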
After this example/exercise, we proceed with results concerning the limit behavior of
the TPM P(t) of a general CTMC.
Proposition 4.44 — Limit behavior of P(t) (Kulkarni, 1995, theorems 6.11–6.12 and
Corollary 6.3, pp. 287–288)
Let {X(t) : t ≥ 0} be a CTMC. Then:

• lim_{t→+∞} Pjj(t) = 1/(νj × µjj), where 1/µjj is taken to be 0 if µjj = +∞;

• lim_{t→+∞} Pij(t) = fij/(νj × µjj) if νj > 0, and fij if νj = 0, where, once again, 1/µjj is taken to be 0 if µjj = +∞;

• if j is a transient or null recurrent state of the CTMC then lim_{t→+∞} Pij(t) = 0, for all i ∈ S. •
Now, we turn our attention to the limit behavior of positive recurrent (i.e., ergodic),
irreducible CTMC. Unsurprisingly, it depends on the stationary distribution of the
embedded DTMC.
Theorem 4.45 — Limiting behavior of irreducible, positive recurrent CTMC
(Kulkarni, 1995, Theorem 6.13, p. 288; Ross, 1983, p. 152)
Let:

• {X(t) : t ≥ 0} be an irreducible, positive recurrent CTMC;

• {Xn : n ∈ N0} be the embedded DTMC;

• π = [πj]j∈S be the unique stationary distribution of the embedded DTMC.15

Then the limiting probabilities

Pj = lim_{t→+∞} Pij(t) (4.42)

are given by

Pj = (πj/νj) / [∑_{k∈S} πk/νk], j ∈ S. (4.43)
•
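Formula (4.43) can be illustrated on the two-state machine of Exercise 4.28 (the values λ = 1, µ = 2 are assumed for illustration): the embedded DTMC alternates deterministically between the two states, so π = (1/2, 1/2), and weighting by the mean holding times 1/νj recovers the limiting distribution (µ/(λ+µ), λ/(λ+µ)):

```python
import numpy as np

lam, mu = 1.0, 2.0
nu = np.array([lam, mu])     # holding-time rates: nu_0 = lambda, nu_1 = mu
pi = np.array([0.5, 0.5])    # stationary distribution of the embedded DTMC

weights = pi / nu            # pi_j / nu_j
P_limit = weights / weights.sum()   # formula (4.43)
```

Note how the state with the longer mean holding time (state 0, mean 1/λ = 1) ends up with the larger limiting probability, even though the embedded DTMC visits both states equally often.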
Remark 4.46 — Limiting behavior of irreducible, positive recurrent CTMC
(Ross, 1983, p. 152)
• Pj also equals the long-run proportion of time the CTMC is in state j.
• If the initial state is chosen according to the limiting probabilities {Pj : j ∈ S}, then P[X(t) = j] = ∑_{i∈S} Pi × Pij(t) = Pj, for all t, i.e., the resultant CTMC is stationary.16 •
The next theorem gives one method of computing the limiting distribution of X(t) in
terms of the rate matrix.
15 I.e., πj = ∑_{i∈S} πi Pij, j ∈ S, and ∑_{j∈S} πj = 1; in other words, π = πP.
16 In fact, P[X(t) = j] = ∑_{i∈S} Pi × Pij(t) = ∑_{i∈S} [lim_{s→+∞} Pki(s)] × Pij(t) = lim_{s→+∞} ∑_{i∈S} Pki(s) × Pij(t) = lim_{s→+∞} Pkj(s + t) = Pj.
Theorem 4.47 — Limiting distribution of an irreducible, positive recurrent
CTMC in terms of its rate matrix (Kulkarni, 1995, Theorem 6.11, p. 289; Ross,
1983, p. 152)
Let {X(t) : t ≥ 0} be an irreducible, positive recurrent CTMC with rate matrix R. Then the limiting distribution, represented by the row vector P = [Pj]j∈S, is given by the unique nonnegative solution to

P × R = 0,  ∑_{j∈S} Pj = 1. (4.44)
•
Remark 4.48 — Limiting distribution of an irreducible, positive recurrent
CTMC in terms of its rate matrix
• P × R = 0 can be written as

Pj × νj = ∑_{i∈S} Pi × qij, j ∈ S (4.45)

(Ross, 1983, p. 152), where qii = 0.
• Pj × νj = rate at which the process leaves state j,
because Pj is the proportion of time the process is in state j and when it is in state
j it leaves at rate νj (Ross, 1983, p. 153).
• ∑_{i∈S} Pi × qij = rate at which the process enters state j,

because Pi is the proportion of time the process is in state i and when it is in state i it departs to state j at rate qij (Ross, 1983, p. 153).
• Since equations (4.45) can be thought of as a statement of the equality of the rates at which the process leaves and enters state j, they are sometimes referred to as balance equations (Ross, 1983, p. 153).
• An irreducible CTMC is positive recurrent iff there is a solution to the system of
equations (4.44).17 Hence, like in the DTMC setting, by solving these equations, we
are automatically guaranteed positive recurrence of the CTMC. •
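System (4.44) is readily solved numerically. A common sketch (my own, with assumed rates) overwrites one of the redundant balance equations with the normalization condition:

```python
import numpy as np

def limiting_distribution(R):
    """Solve P x R = 0 together with sum_j P_j = 1 (system (4.44)) by
    overwriting one redundant balance equation with the normalization condition."""
    n = R.shape[0]
    A = R.T.copy()           # balance equations read R^T P^T = 0
    A[-1, :] = 1.0           # replace the last equation by sum_j P_j = 1
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# two-state machine with assumed values lam = 1, mu = 2
lam, mu = 1.0, 2.0
P = limiting_distribution(np.array([[-lam, lam], [mu, -mu]]))
```

For an irreducible CTMC any single balance equation is redundant (the rank of R is one less than its dimension), which is why one of them can be replaced by the normalization condition without losing information.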
Exercise 4.49 — Limiting distribution of an irreducible, positive recurrent
CTMC in terms of its rate matrix
Derive and solve the balance equations of the CTMC with the following rate matrices:
(a)

[ −λ    λ
   µ   −µ ]

(Kulkarni, 1995, Example 6.29, p. 290);
(b)

[ −λ1      0        λ1           0            0
   0      −λ2       0            λ2           0
   0       µ1      −(µ1 + λ2)    0            λ2
   µ2      0        0           −(µ2 + λ1)    λ1
   0       0        µ2           µ1          −(µ1 + µ2) ],

where λ1 = 1, λ2 = 2, µ1 = 0.1 and µ2 = 0.15 (Kulkarni, 1995, Example 6.30, p. 291).18 •
Let us now determine the limiting probabilities for a birth and death process (Ross,
1983, p. 153), with rates λn, n ∈ N0, and µn, n ∈ N. These are obtained by equating the
rate at which the process leaves a state with the rate at which it enters that state,19 as
follows:
State      Rate at which process leaves state = rate at which process enters state
0          P0 λ0 = P1 µ1
n ∈ N      Pn (λn + µn) = Pn−1 λn−1 + Pn+1 µn+1
and then rewriting and solving these equations in terms of P0 we get the limiting
probabilities in the following proposition.
17 See Kulkarni (1995, Theorem 6.15, p. 290).
18 Try not to solve (b) by hand...
19 This is the result of taking limits as t → +∞ throughout Kolmogorov’s forward equations (4.34)–(4.35), setting lim_{t→+∞} dPij(t)/dt = lim_{t→+∞} dPj(t)/dt = 0 (because if dPij(t)/dt converges then it must converge to 0) and lim_{t→+∞} Pij(t) = Pj, and normalizing so that ∑_{j∈S} Pj = 1 (Cooper, 1981, p. 21).
Proposition 4.50 — Limiting probabilities for a birth and death process (Ross,
1983, p. 154)
Let X(t) : t ≥ 0 be a birth and death process with rates λn, n ∈ N0, and µn, n ∈ N.
Then

P0 = 1 / [1 + ∑_{n=1}^{+∞} (λ0 λ1 . . . λn−1)/(µ1 µ2 . . . µn)] (4.46)

Pj = (λj−1/µj) × Pj−1 = P0 × (λ0 λ1 . . . λj−1)/(µ1 µ2 . . . µj), j ∈ N. (4.47)
•
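Equations (4.46)–(4.47) translate directly into code. The sketch below (my own) truncates the state space at n_max, which is adequate whenever the neglected tail mass is negligible — an assumption the user must check:

```python
def bd_limiting_probs(lam, mu, n_max):
    """Limiting probabilities (4.46)-(4.47) of an ergodic birth and death process,
    truncated at state n_max.  lam(n): birth rate in state n; mu(n): death rate."""
    rho = [1.0]                          # rho[j] = lam_0...lam_{j-1} / mu_1...mu_j
    for j in range(1, n_max + 1):
        rho.append(rho[-1] * lam(j - 1) / mu(j))
    total = sum(rho)                     # 1 + the series appearing in (4.46)
    return [r / total for r in rho]

# constant rates lam = 1, mu = 2 (assumed), for which P_j = (1/2)^{j+1}
probs = bd_limiting_probs(lambda n: 1.0, lambda n: 2.0, 400)
```

Passing the rates as functions of the state makes the same sketch reusable for the state-dependent rate patterns of Exercise 4.56.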
For an account on the limiting behavior of reducible CTMC, the reader should refer
to Kulkarni (1995, pp. 296–299).
Exercise 4.51 — Limiting probabilities for a birth and death process
Prove Proposition 4.50 (Ross, 1983, p. 154). •
Exercise 4.52 — Limiting probabilities for a birth and death process
A taxi company has one mechanic who replaces fuel pumps when they fail. Assume:
• the waiting time in days until a fuel pump fails is exponentially distributed with
parameter 1300
;
• the company has 1000 cars;
• the repair time for each car is exponentially distributed with expected repair time
of 14
days.
Find the long-run distribution for X(t), the number of cars with a broken fuel pump
at time t, by considering X(t) : t ≥ 0 a process where a birth corresponds to a broken
fuel pump and a death corresponds to a repaired fuel pump (Isaacson and Madsen, 1976,
Example VII.3.5, p. 246).20 •20Note that the rates are given by λn = 1000−n
300 and µn+1 = 4, for n = 0, 1, . . . , 1000.
Remark 4.53 — Existence of limiting probabilities for a birth and death
process
• Equation (4.46) shows us what condition is needed for the limiting probabilities of a birth and death process to exist:

∑_{n=1}^{+∞} (λ0 λ1 . . . λn−1)/(µ1 µ2 . . . µn) < +∞ (4.48)

(Ross, 1983, p. 154); we are simply requiring that P0 > 0 (Kleinrock, 1975, p. 93).
• We should also note that the condition for the existence of limiting probabilities of a birth and death process is met whenever the sequence {λk/µk : k ∈ N} remains below one from some k onwards, i.e., if

∃k0 : λk/µk < 1, ∀k ≥ k0 (4.49)

(Kleinrock, 1975, p. 94).
Simply stated, in order for those expressions to represent a probability distribution
we have to place a condition on the birth and death rates that essentially says that
the system occasionally empties (Kleinrock, 1975, p. 93). •
Remark 4.54 — Classification of states of a birth and death process (Kleinrock,
1975, pp. 93–94)
Let

S1 = ∑_{n=1}^{+∞} (λ0 λ1 . . . λn−1)/(µ1 µ2 . . . µn) (4.50)

S2 = ∑_{n=1}^{+∞} (µ1 µ2 . . . µn)/(λ0 λ1 . . . λn). (4.51)
Then, all states will be:
• positive recurrent (i.e., ergodic) iff S1 < +∞ and S2 = +∞;
• null recurrent iff S1 = +∞ and S2 = +∞;
• transient iff S1 = +∞ and S2 < +∞.
It is the ergodic case that gives rise to the equilibrium/limiting probabilities and that is
of most interest to our studies. •
Exercise 4.55 — (Existence of) limiting probabilities for a birth and death
process (Ross, 1983, Exercise 5.13, p. 179)
The size of a biological population is assumed to be modeled as a birth and death process — for
which immigration is not allowed when the population size is N or larger — with rates
λk = kλ + θ, k = 0, 1, . . . , N − 1;  kλ, k = N, N + 1, . . .

and µk = kµ, k ∈ N.
Determine the proportion of time that immigration is restricted, in case N = 3,
λ = θ = 1 and µ = 2. •
Exercise 4.56 — (Existence of) limiting probabilities for a birth and death
process (bis)
After having established conditions that guarantee the existence of limiting probabilities,
obtain (in case it is possible) those probabilities for the birth and death processes with
the following birth and death rates λk, k ∈ N0, and µk, k ∈ N:
(a) λk ≡ λ and µk ≡ µ;
(b) λk ≡ λ and µk = kµ;
(c) λk ≡ λ and µk = kµ, k = 1, 2, . . . , c;  cµ, k = c + 1, c + 2, . . ., where c ∈ N;
(d) λk = kλ and µk ≡ µ;
(e) λk = αkλ and µk ≡ µ, with 0 < α < 1
(Kleinrock and Gail, 1996, p. 71);
(f) λk = (M − k)λ, k = 0, 1, 2, . . . , M;  0, k = M + 1, M + 2, . . ., and µk ≡ µ

(Ross, 1983, Example 5.5(b), p. 155). •
Interestingly enough, some of the birth and death processes described in Exercise 4.56
are in fact related to queueing systems we shall study in Section 4.8.
4.8 Birth and death queueing systems in equilibrium
This section is devoted to a class of models in which customers arrive in some random
manner at a service facility. Upon arrival they are made to wait in queue21 until it is their
turn to be served. Once served they are generally assumed to leave the system (Ross,
2003, p. 475). Such models are usually called queueing systems.
Queueing theory started with research by the Danish mathematician, statistician and
engineer Agner Krarup Erlang (1878–1929), when he created models to describe the
Copenhagen telephone exchange; the ideas have since seen applications in areas such as telecommunications, traffic engineering, computing and the design of factories,
shops, offices and hospitals (https://en.wikipedia.org/wiki/Queueing theory).
In this section, we narrow the class of queueing systems to the ones that can be modeled as birth and death processes — also called birth and death queues. Recall that these systems enjoy a most convenient property: the times between consecutive arrivals and the service times are all independent and exponentially distributed r.v. (Kleinrock, 1975, p. 89).
We are going to describe these queueing systems using Kendall’s notation
(https://en.wikipedia.org/wiki/Kendall’s notation) in the form A/S/c, where:
• A describes the time between consecutive arrivals to the queue;
21The word queue comes, via French, from the Latin cauda, meaning tail
(https://en.wikipedia.org/wiki/Queueing theory#Etymology).
• S refers to the service time distribution;
• c represents the number of (identical) service channels or servers.
For instance, when we write M/M/1:
• the first M stands for a Poisson arrival process (that is, for a Markovian arrival
process);
• the second M refers to exponentially distributed service times (that is, to Markovian
service times);
• 1 means we are dealing with a single server queue.
We shall also assume that customers are served according to a first-come, first-served
(FCFS) service policy, whereby the requests of customers are attended to in the order
that they arrived, without other biases or preferences (http://en.wikipedia.org/wiki/First-
come, first-served).
Finally, since the study of the transient behavior of queueing systems is far from trivial, we focus on their equilibrium behavior, namely on the limiting probabilities of the number of customers an arriving customer sees in the system.
4.8.1 Performance measures
For birth and death queueing models, we will be interested in determining the following
performance measures in the long-run or equilibrium:
• Ls, the number of customers in the system — an arriving customer sees;
• Lq, the number of customers in the queue (waiting to be served) — an arriving
customer sees;
• Ws, the time an arriving customer will spend in the system;
• Wq, the time an arriving customer will spend in the queue waiting to be served.22
22 Note that Ws =st Wq + service time.
Ws and Wq influence customer satisfaction, whereas Ls and Lq are particularly important
performance measures to resource management (Pacheco, 2002, p. 76).
The distributions of these four r.v. depend on the following parameters:
• λ, the arrival rate;
• µ, the service rate;
• c, the number of (identical) servers in parallel;
• a = λ/µ, the (offered) load;23

• Pb, the blocking probability;24

• λe = λ × (1 − Pb), the input rate;25

• ρ = λ/(c µ), the traffic intensity;26

• ρe = λe/(c µ) = ρ × (1 − Pb), the carried traffic intensity;

• Pi, the long-run fraction of time in state i.
Remark 4.57 — PASTA (Poisson Arrivals See Time Averages) (Pacheco, 2002,
p. 76)
Birth and death queueing systems possess the PASTA (Poisson Arrivals See Time
Averages) property, i.e., the long-run fraction of customers that find at arrival i customers
in the system coincides with the fraction of time the system spends in state i. •

23 It corresponds to the expected amount of time a (single) server would take to serve all customers that in the long-run arrive to the system during one unit of time, including blocked customers (Pacheco, 2002, p. 74).
24 It is the long-run fraction of customers that are blocked (Pacheco, 2002, p. 75) upon arrival and unable to enter the system.
25 It is the effective arrival rate, which corresponds to the rate at which customers enter the system; thus, we are excluding blocked customers (Pacheco, 2002, p. 75).
26 Or utilization factor (Kleinrock, 1975, p. 98). It is a (relative) measure of congestion and represents the load offered to each server if the work is divided equally among servers (Pacheco, 2002, p. 75); it should be strictly less than one for the system to function well (https://en.wikipedia.org/wiki/Queueing theory#Utilization).
In addition, relationships between these four performance measures can be obtained
by using all these parameters and capitalizing on the following result.
Theorem 4.58 — Little’s law (http://en.wikipedia.org/wiki/Little’s law; Ross, 2003,
p. 478)
The long-term average number of customers in a stable system, L, is equal to the long-
term average effective arrival rate, λe, multiplied by the average time a customer spends
in the system, W — expressed algebraically:
L = λeW. (4.52)
•
Remark 4.59 — Little’s law
• Consequently:
E(Ls) = λeE(Ws); (4.53)
E(Lq) = λeE(Wq). (4.54)
• Although Little’s law looks intuitively reasonable, it is a quite remarkable result
(http://en.wikipedia.org/wiki/Little’s law), as it is valid regardless of the
– arrival process distribution;
– service distribution;
– number of servers;
– service policy (as long as it is not biased);
– etc. •
4.8.2 M/M/1, the classical queueing system
The celebrated M/M/1 queue is the simplest nontrivial queueing system
(Kleinrock, 1975, p. 94).
An M/M/1 queue may be described by a birth and death process with rates:
λk = λ, k ∈ N0 (4.55)
µk = µ, k ∈ N (4.56)
(Kleinrock, 1975, p. 94).
Furthermore, the necessary and sufficient condition for ergodicity of the M/M/1 system is simply written in terms of the traffic intensity:27

ρ = λ/µ < 1 (4.57)

(Kleinrock, 1975, p. 95).
Needless to say, the next results, referring to Ls, Lq, Ws and Wq, are stated assuming that ρ < 1.
Proposition 4.60 — M/M/1: distribution of Ls (Kleinrock, 1975, p. 96)
The steady-state probability of finding k customers in the M/M/1 system only depends
on λ and µ through their ratio ρ and is given by:
P (Ls = k) = ρk (1− ρ), k ∈ N0, (4.58)
i.e., Ls ∼ Geometric∗(1− ρ). •
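As a sanity check, the geometric p.f. of Proposition 4.60 satisfies the birth and death balance equations of the M/M/1 system. The snippet below verifies this for the assumed values λ = 1, µ = 2 (so ρ = 1/2):

```python
lam, mu = 1.0, 2.0
rho = lam / mu
P = [rho**k * (1 - rho) for k in range(50)]   # candidate limiting probabilities

# balance equations: rate out of a state = rate into that state
res0 = P[0] * lam - P[1] * mu
res = [P[n] * (lam + mu) - (P[n - 1] * lam + P[n + 1] * mu)
       for n in range(1, 49)]
```

All residuals vanish, which is the numerical counterpart of the proof asked for in Exercise 4.61.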
Exercise 4.61 — M/M/1: distribution of Ls
Prove Proposition 4.60 (Kleinrock, 1975, pp. 95–96). •

27 Note that in this case ρ = ρe because the M/M/1 system has a waiting area with infinite capacity.
Exercise 4.62 — M/M/1: characteristics of Ls
Consider an M/M/1 queueing system.
(a) Plot the p.f. of Ls for ρ = 1/2 (Kleinrock, 1975, Figure 3.2, p. 97).
(b) Obtain the expected value and the variance of Ls as a function of ρ.
(c) Plot E(Ls) (Kleinrock, 1975, Figure 3.3, p. 97) to show that this performance measure
grows in an unbounded fashion with ρ (Kleinrock, 1975, p. 98).
(d) Show that Ls stochastically increases with the arrival rate, λ, and with the expected service time, µ^{−1}.28 •
Proposition 4.63 — M/M/1: distribution of Lq
The equilibrium probability of finding k customers waiting to be served in the M/M/1 system equals:

P(Lq = k) = 1 − ρ^2, k = 0;  ρ^{k+1} (1 − ρ), k ∈ N. (4.59)

•
Exercise 4.64 — M/M/1: distribution of Lq
Prove Proposition 4.63. •
Proposition 4.65 — M/M/1: distribution of Ws
Since the service times are memoryless in the M/M/1 queueing system, we get:29

(Ws | Ls = k) ∼ Gamma(k + 1, µ), k ∈ N0; (4.60)
Ws ∼ Exponential(µ(1 − ρ)). (4.61)

•

28 A r.v. X, whose distribution depends on the parameter θ, is said to stochastically increase with θ if Pθ(X > x) is an increasing function of θ, for all −∞ < x < +∞.
29 Given that upon arrival a customer finds k customers in the M/M/1 system, he/she will leave this system after the completion of 1 + (k − 1) + 1 = k + 1 services: the service that had already started when the customer arrived; those of the k − 1 customers waiting to be served when the customer arrived; and his/her own service.
Exercise 4.66 — M/M/1: distribution of Ws
Prove Proposition 4.65.30 •
Proposition 4.67 — M/M/1: distribution of Wq
For the M/M/1 queueing system, Wq is a mixed r.v. with the following characteristics:

(Wq | Ls = 0) =st 0;
(Wq | Ls = k) ∼ Gamma(k, µ), k ∈ N; (4.62)
(Wq | Wq > 0) ∼ Exponential(µ(1 − ρ)); (4.63)

FWq(t) = 0, t < 0;  1 − ρ, t = 0;  (1 − ρ) + ρ × FExp(µ(1−ρ))(t), t > 0. (4.64)
•
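The mixed distribution (4.64) can be cross-checked by total probability over the number of customers found at arrival; the sketch below (my own, assuming scipy is available and using the illustrative values λ = 1, µ = 2, t = 1) compares the geometric mixture of Gamma c.d.f.s with the closed form:

```python
import math
from scipy.stats import gamma

lam, mu, t = 1.0, 2.0, 1.0    # assumed example values (rho = 0.5)
rho = lam / mu

# P(Wq <= t) = P(Ls = 0) + sum_{k>=1} P(Ls = k) P(Gamma(k, mu) <= t)
cdf_mix = (1 - rho) + sum(
    rho**k * (1 - rho) * gamma.cdf(t, a=k, scale=1.0 / mu)
    for k in range(1, 400))

# closed form (4.64): F_Wq(t) = (1 - rho) + rho (1 - e^{-mu(1-rho)t}), t > 0
cdf_closed = (1 - rho) + rho * (1 - math.exp(-mu * (1 - rho) * t))
```

The agreement of the truncated mixture with the closed form illustrates why the conditional waiting time (Wq | Wq > 0) turns out to be exponential.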
Exercise 4.68 — M/M/1: distribution of Wq
Prove Proposition 4.67. •
Exercise 4.69 — M/M/1 queueing system
Consider an M/M/1 queueing system and draw the graphs of the following parameters
in terms of ρ:
(a) the limiting probability that the system is empty;
(b) E(Wq);
(c) E(Ws) (Kleinrock, 1975, Figure 3.4, p. 97). •
Exercise 4.70 — M/M/1 queueing system (bis)
Derive V (Ls), V (Lq), V (Ws) and V (Wq). •
The following table condenses the distributions and expected values of the four performance measures of an (ergodic) M/M/1 queue.

30 Apply the total probability law to prove (4.61).
M/M/1

Rates   λk = λ, k ∈ N0;  µk = µ, k ∈ N

Ls      P(Ls = k) = ρ^k (1 − ρ), k ∈ N0
        E(Ls) = ρ/(1 − ρ)

Lq      P(Lq = k) = 1 − ρ^2, k = 0;  ρ^{k+1} (1 − ρ), k ∈ N
        E(Lq) = ρ^2/(1 − ρ)

Ws      (Ws | Ls = k) ∼ Gamma(k + 1, µ), k ∈ N0
        Ws ∼ Exponential(µ(1 − ρ))
        E(Ws) = 1/[µ(1 − ρ)]

Wq      (Wq | Ls = k) ∼ Gamma(k, µ), k ∈ N
        FWq(t) = 0, t < 0;  1 − ρ, t = 0;  (1 − ρ) + ρ × FExp(µ(1−ρ))(t), t > 0
        (Wq | Wq > 0) ∼ Exponential(µ(1 − ρ))
        E(Wq) = ρ/[µ(1 − ρ)]
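The expected values in the table are mutually consistent with Little's law (4.52). A quick numerical check, with the assumed rates λ = 3 and µ = 5 (so ρ = 0.6 < 1):

```python
lam, mu = 3.0, 5.0    # assumed example rates
rho = lam / mu

# expected values taken from the M/M/1 table
E_Ls = rho / (1 - rho)
E_Lq = rho**2 / (1 - rho)
E_Ws = 1.0 / (mu * (1 - rho))
E_Wq = rho / (mu * (1 - rho))
```

Little's law gives E(Ls) = λ E(Ws) and E(Lq) = λ E(Wq) (here λe = λ, since nothing is blocked), and the pairs are further linked by E(Ws) = E(Wq) + 1/µ and E(Ls) = E(Lq) + ρ.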
Exercise 4.71 — M/M/1 queueing system (bis, bis)
Admit that defective items from a production line arrive to the repair shop of the same
factory according to a Poisson process with constant rate λ. The repair shop has a single-
server who completes repairs after independent and exponentially distributed times with
expected value equal to 3 minutes.
(a) Determine the distribution of Ls.
(b) The manager of the production line wishes that the probability of having more than
5 defective items waiting for repair does not exceed 10% and that the probability of
having an idle server in the repair shop does not exceed 30%. Identify the arrival
rates that satisfy both conditions. •
Exercise 4.72 — More on the M/M/1 queueing model
Passengers arrive to a passport control area in a very small airport according to a Poisson
process having rate equal to 30 passengers per hour. The passport control has a sole
officer who completes checks after independent and exponentially distributed times with
expected value equal to 1.5 minutes.
(a) Obtain the probability that the server is idle.
(b) Calculate the expected number of passengers in the passport control area.
(c) What is the probability that passengers form a queue, and what is the queue’s expected size?
(d) Determine not only the expected time an arriving passenger spends in the passport
control area, but also the expected time this passenger waits to be served.
(e) What is the probability that a passenger waits at least 10 minutes until his/her
passport starts to be checked by the officer? •
Exercise 4.73 — More on the M/M/1 queueing model (bis)
People arrive to a phone booth according to a Poisson process with rate 0.1 persons per
minute and the durations of the phone calls are independent and exponentially distributed
r.v. with common expected value equal to 3 minutes.
(a) What is the probability that someone has to wait to make a phone call?
(b) Determine the expected size of the queue.
(c) The phone company will install another phone booth in the same area if the expected
waiting time is at least 3 minutes. Calculate the increase in the arrival rate that
justifies the installation of the second phone booth.
(d) Obtain the probability that a customer has to wait more than 10 minutes to start
his/her phone call.
(e) What is the probability that a person does not spend more than 10 minutes from
arrival to the system until the end of the phone call?
(f) Calculate the percentage of time the phone booth is being used. •
Exercise 4.74 — More on the M/M/1 queueing model (bis, bis)
Vehicles arrive to a car wash according to a Poisson process with rate 5 vehicles per hour
and the durations of the car washes are independent and exponentially distributed r.v.
with expected value equal to 10 minutes. Admit that the car wash has a waiting area
with infinite capacity.
(a) What is the probability that a vehicle has to wait to be washed?
(b) Determine the expected number of vehicles that have to wait to be washed.
(c) Compute the standard deviation of the time spent in queue waiting for the vehicle to
be washed.
(d) What is the percentage of time the car wash machine is not working? •
Exercise 4.75 — M/M/1 queueing system with discouraged arrivals
Consider an M/M/1 queueing system where arrivals tend to get discouraged when more
and more people are present in the system. One possible way to model this effect is to
consider a harmonic discouragement of arrivals with respect to the number present in
the system, i.e., having birth rates equal to
λk = λ/(k + 1),  k ∈ N0,
and keep the death rates equal to µk = µ, k ∈ N (Kleinrock, 1975, p. 99).
(a) Draw the rate diagram of this birth and death process (Kleinrock, 1975, Figure 3.5,
p. 100).
(b) Verify that the process is ergodic if λ/µ < +∞.
(c) Show that the limiting probabilities are given by
Pk = e^(−λ/µ) (λ/µ)^k / k!,  k ∈ N0
(Kleinrock, 1975, pp. 99–100), i.e., Ls ∼ Poisson(λ/µ). •
4.8.3 The M/M/∞ queueing system
The M/M/∞ can be thought of as a system where there is always a new server for each
arriving customer (Kleinrock, 1975, p. 101).31
This queueing system can be obviously described by a birth and death process with
rates
λk = λ, k ∈ N0, (4.65)
µk = kµ, k ∈ N, (4.66)
and the ergodic condition is simply λ/µ < +∞ (Kleinrock, 1975, p. 101).
Proposition 4.76 — M/M/∞: distribution of Ls (Kleinrock, 1975, p. 101)
Ls ∼ Poisson(λ/µ). •
Exercise 4.77 — M/M/∞: distribution of Ls
Prove Proposition 4.76. •
Suffice it to say that there is no waiting in this system and the time spent in the system
coincides with the duration of the service; thus,

Lq =st 0    (4.67)
Ws ∼ Exponential(µ)    (4.68)
Wq =st 0.    (4.69)
Since this system is quite simple to describe in the equilibrium state, we are tempted
to state the transient behavior of the number of customers in the system at time t.
31It may also be interpreted as a system with a responsive server who accelerates his/her service rate
linearly (Kleinrock, 1975, p. 101), to avoid any customers waiting.
Proposition 4.78 — M/M/∞: transient behavior of number of customers
Let X(t) be the number of customers in the M/M/∞ system at time t. Then
(X(t) | X(0) = 0) ∼ Poisson(λ (1 − e^(−µt))/µ).    (4.70)
•
Exercise 4.79 — M/M/∞: transient behavior of number of customers
Prove Proposition 4.78 by deriving Kolmogorov's forward equations and the
associated partial differential equation. •
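Proposition 4.78 also explains the approach to equilibrium: the transient mean λ(1 − e^(−µt))/µ increases monotonically to the mean λ/µ of the limiting Poisson distribution. A quick numerical illustration (the rates below are hypothetical, chosen only for the example):

```python
from math import exp

def mminf_transient_mean(lam, mu, t):
    """E[X(t) | X(0) = 0] = lam * (1 - exp(-mu t)) / mu for the M/M/infinity queue."""
    return lam * (1 - exp(-mu * t)) / mu

lam, mu = 3.0, 0.5          # hypothetical arrival and service rates
for t in (1, 5, 50):
    print(t, mminf_transient_mean(lam, mu, t))   # approaches lam/mu = 6
```

The convergence is exponentially fast in µt, so after a few mean service times the transient distribution is essentially the equilibrium Poisson(λ/µ).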
Even though the M/G/∞ queueing system32 cannot be modeled as a birth and death
process, we digress and state the transient and limit behavior of its number of customers.
Proposition 4.80 — M/G/∞: transient and limit behavior of number of
customers (Pacheco, 2002, p. 91)
Let X(t) be the number of customers in the M/G/∞ system at time t. Then
(X(t) | X(0) = 0) ∼ Poisson(λ ∫_0^t [1 − G(t − s)] ds)    (4.71)

lim_{t→+∞} (X(t) | X(0) = 0) ∼ Poisson(λ/µ).    (4.72)
•
Exercise 4.81 — M/G/∞: transient and limit behavior of number of
customers
Prove Proposition 4.80 (Pacheco, 2002, p. 91). •
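Formula (4.71) involves the service distribution only through ∫_0^t [1 − G(t − s)] ds = ∫_0^t [1 − G(u)] du. A sketch (function names are mine) that evaluates this integral numerically and, assuming for the sanity check that G is an Exponential(µ) c.d.f., recovers the M/M/∞ transient mean λ(1 − e^(−µt))/µ of Proposition 4.78:

```python
from math import exp

def mginf_transient_mean(lam, G, t, n=20_000):
    """lam * integral_0^t [1 - G(u)] du, evaluated with the midpoint rule;
    by (4.71), X(t) | X(0) = 0 is Poisson with this mean."""
    h = t / n
    return lam * h * sum(1 - G((i + 0.5) * h) for i in range(n))

lam, mu = 3.0, 0.5                       # hypothetical rates
G_exp = lambda u: 1.0 - exp(-mu * u)     # Exponential(mu) service time c.d.f.
approx = mginf_transient_mean(lam, G_exp, 4.0)
exact = lam * (1.0 - exp(-mu * 4.0)) / mu   # M/M/infinity closed form (4.70)
```

Replacing `G_exp` by any other c.d.f. (e.g., a uniform one) gives the transient mean of the corresponding M/G/∞ system with no extra work.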
Exercise 4.82 — M/G/∞: transient and limit behavior of number of
customers
Users arrive at a library according to a Poisson process with rate equal to 3 users per
minute and spend in the library an amount of time with Uniform(10, 210) distribution.
What is the expected number of users in the library two hours after it opened (Pacheco,
2002, p. 91)? •
32This system is associated with a Poisson arrival process with rate λ and service time distribution
function G with finite expected value µ−1.
M/M/∞
Rates  λk = λ, k ∈ N0
       µk = kµ, k ∈ N
Ls     Ls ∼ Poisson(λ/µ)
Lq     Lq =st 0
Ws     Ws ∼ Exp(µ)
Wq     Wq =st 0
X(t) = number of customers in the system at time t
       (X(t) | X(0) = 0) ∼ Poisson(λ (1 − e^(−µt))/µ)
M/G/∞
       (X(t) | X(0) = 0) ∼ Poisson(λ ∫_0^t [1 − G(t − s)] ds)
       lim_{t→+∞} (X(t) | X(0) = 0) ∼ Poisson(λ/µ)
4.8.4 M/M/m, the m-server case
Once again we consider a queueing system with an unlimited waiting area and with a
constant arrival rate; this system provides a maximum of m servers, is within the reach
of a birth and death formulation and leads to
λk = λ,  k ∈ N0    (4.73)
µk = min{kµ, mµ} = kµ,  k = 1, . . . , m;
                   mµ,  k = m + 1, m + 2, . . .    (4.74)
(Kleinrock, 1975, p. 102).
From these birth and death rates, it is easily seen that the condition for ergodicity is
written, expectedly, in terms of the traffic intensity:
ρ = λ/(mµ) < 1.    (4.75)
Proposition 4.83 — M/M/m: distribution of Ls (Kleinrock, 1975, pp. 102–103)
The limit probability of finding k customers in the M/M/m system depends, once again,
on λ and µ through the traffic intensity ρ = λ/(mµ):

P(Ls = k) = P0 (mρ)^k / k!,  k = 0, 1, . . . , m − 1;
            P0 m^m ρ^k / m!,  k = m, m + 1, . . . ,    (4.76)

where P0 = P(Ls = 0) = [Σ_{k=0}^{m−1} (mρ)^k/k! + (mρ)^m/(m!(1 − ρ))]^(−1).
Equivalently,

P(Ls = k) = (m!/k!) (1 − ρ)(mρ)^(k−m) C(m,mρ),  k = 0, 1, . . . , m − 1;
            (1 − ρ) ρ^(k−m) C(m,mρ),  k = m, m + 1, . . . ,    (4.77)

where

C(m,mρ) = P(queueing) = P(Ls ≥ m) = [(mρ)^m/(m!(1 − ρ))] / [Σ_{k=0}^{m−1} (mρ)^k/k! + (mρ)^m/(m!(1 − ρ))]    (4.78)

is usually referred to as Erlang's C formula (or Erlang's second formula).33 •

33 Some authors, such as Kleinrock (1975, p. 103), represent this probability by C(m, ρ) instead of
C(m,mρ). We prefer the notation of Pacheco (2002, p. 80).
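Erlang's C formula (4.78) translates directly into code. A minimal sketch (the helper name is mine); the special cases C(1, ρ) = ρ and C(2, 2ρ) = 2ρ^2/(1 + ρ) listed in the summary table of Exercise 4.87 serve as sanity checks:

```python
from math import factorial

def erlang_c(m, a):
    """Erlang's C formula C(m, a), where a = m*rho = lam/mu is the offered load;
    requires rho = a/m < 1 for ergodicity."""
    rho = a / m
    assert rho < 1, "requires rho = a/m < 1"
    top = a**m / (factorial(m) * (1 - rho))
    bottom = sum(a**k / factorial(k) for k in range(m)) + top
    return top / bottom

# Sanity checks against the closed forms for m = 1 and m = 2
rho = 0.8
c1 = erlang_c(1, rho)        # should equal rho
c2 = erlang_c(2, 2 * rho)    # should equal 2 rho^2 / (1 + rho)
```

For large m the factorials overflow; in that case the recursion of Exercise 4.96 is numerically preferable.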
Proposition 4.84 — M/M/m: distribution of Lq
The equilibrium probability of finding k customers waiting in line in the M/M/m queueing
system is simply given by:
P(Lq = k) = 1 − ρ C(m,mρ),  k = 0;
            (1 − ρ) ρ^k C(m,mρ),  k ∈ N.    (4.79)
•
Proposition 4.85 — M/M/m: distribution of Ws
The distribution of Ws, conditional on Ls = k, depends on whether the arriving customer
is immediately served or not:

(Ws | Ls = k) ∼ Exp(µ),  k = 0, . . . , m − 1;
                Exp(µ) ⋆ Gamma(k − m + 1, mµ),  k = m, m + 1, . . . ,    (4.80)

where ⋆ denotes convolution (the sum of two independent r.v.).
The survival function of Ws has two expressions, depending on whether ρ is equal
to (m − 1)/m or not:

1 − FWs(t) = [1 + µt C(m,mρ)] e^(−µt),  t ≥ 0,  if ρ = (m − 1)/m;
             [1 + (e^(µ[1−m(1−ρ)]t) − 1)/(1 − m(1 − ρ)) × C(m,mρ)] e^(−µt),  t ≥ 0,  if ρ ≠ (m − 1)/m.    (4.81)
•
Proposition 4.86 — M/M/m: distribution of Wq
Once more Wq is a mixed r.v. In this case:
(Wq | Ls = k) =st 0,  k = 0, . . . , m − 1;
              =st Gamma(k − m + 1, mµ),  k = m, m + 1, . . . ;    (4.82)

(Wq | Wq > 0) ∼ Exponential(mµ(1 − ρ));    (4.83)

1 − FWq(t) = 1,  t < 0;
             C(m,mρ),  t = 0;
             C(m,mρ) × [1 − FExp(mµ(1−ρ))(t)],  t > 0.    (4.84)
•
Exercise 4.87 — M/M/m: distributions of Ls, Lq, Ws and Wq
Prove Propositions 4.83–4.86 and obtain the expected values below.
M/M/m
Rates  λk = λ, k ∈ N0
       µk = kµ, k = 1, . . . , m;  mµ, k = m + 1, m + 2, . . .
Ls     P(Ls = k) = (m!/k!)(1 − ρ)(mρ)^(k−m) C(m,mρ), k = 0, 1, . . . , m − 1;
                   (1 − ρ) ρ^(k−m) C(m,mρ), k = m, m + 1, . . .
       C(m,mρ) = P(Ls ≥ m) = [(mρ)^m/(m!(1 − ρ))] / [Σ_{k=0}^{m−1} (mρ)^k/k! + (mρ)^m/(m!(1 − ρ))]
       C(1, ρ) = ρ
       C(2, 2ρ) = 2ρ^2/(1 + ρ)
       E(Ls) = mρ + [ρ/(1 − ρ)] C(m,mρ)
Lq     P(Lq = k) = 1 − ρ C(m,mρ), k = 0;  (1 − ρ) ρ^k C(m,mρ), k ∈ N
       E(Lq) = [ρ/(1 − ρ)] C(m,mρ)
Ws     (Ws | Ls = k) ∼ Exp(µ), k = 0, . . . , m − 1;  Exp(µ) ⋆ Gamma(k − m + 1, mµ), k = m, m + 1, . . .
       1 − FWs(t) = [1 + µt C(m,mρ)] e^(−µt), t ≥ 0, if ρ = (m − 1)/m;
                    [1 + (e^(µ[1−m(1−ρ)]t) − 1)/(1 − m(1 − ρ)) × C(m,mρ)] e^(−µt), t ≥ 0, if ρ ≠ (m − 1)/m
       E(Ws) = 1/µ + C(m,mρ)/[mµ(1 − ρ)]
Wq     (Wq | Ls = k) ∼ Gamma(k − m + 1, mµ), k = m, m + 1, . . .
       (Wq | Wq > 0) ∼ Exponential(mµ(1 − ρ))
       1 − FWq(t) = 1, t < 0;  C(m,mρ), t = 0;  C(m,mρ) × [1 − FExp(mµ(1−ρ))(t)], t > 0
       E(Wq) = C(m,mρ)/[mµ(1 − ρ)]
•
Exercise 4.88 — M/M/m queueing system
A system has two servers, who attend to customers on a FCFS basis and whose service
times are independent and exponentially distributed r.v. with mean value 1.8 minutes.
Considering that customers arrive to the system according to a Poisson process with rate
equal to 1 customer per minute, compute:
(a) the probability that there are more than 10 customers in the system;
(b) the expected time a customer spends in line waiting to be served;
(c) the expected number of customers in the system;
(d) the probability that exactly one server is idle. •
Exercise 4.89 — M/M/m queueing system (bis)
A small public office has two officers, whose service times are independent and exponentially
distributed r.v. with rate equal to 60 visitors per hour. Admit that the times between
consecutive arrivals of visitors are i.i.d. r.v. exponentially distributed with parameter equal
to 100 visitors per hour and calculate:
(a) the probability that there are more than 4 visitors in the system;
(b) the expected number of visitors in the system;
(c) the expected time a visitor spends in the system. •
Exercise 4.90 — M/M/m queueing system (bis, bis)
A department has three secretaries, who process requests that arrive according to a
Poisson process with rate equal to 20 requests per 8 hours. Assume that the processing
times are independent and exponentially distributed r.v. with expected value equal to 40
minutes.
(a) What is the percentage of time all (resp. at least one of) the secretaries are busy?
(b) Obtain the expected time one waits for a request to be completely processed.
(c) Admit that due to financial problems one of the secretaries had to be fired. Recompute
the quantities in (a) and (b). •
Exercise 4.91 — M/M/m queueing system (bis, bis, bis)
Airplanes arrive to an airport according to a Poisson process having rate equal to 18
airplanes per hour, and their times in a runway during landing are independent and
exponentially distributed r.v. with expected value equal to 2 minutes.
Derive the number of runways the airport should have so that the probability that an
arriving airplane waits to land does not exceed 0.20. •
Exercise 4.92 — An M/M/m queueing system with impatient customers
(Isaacson and Madsen, 1976, Example VII.3.4, pp. 245–246)
Assume:
• customers arrive at a ticket counter with m windows according to a Poisson process
with parameter 6 per minute;
• customers are served on a first-come-first-served basis;
• service times are independent and exponentially distributed with mean 1/3 of a
minute.
(a) What is the minimum number of windows needed to guarantee that the line does not
get infinitely long?
(b) Assume m = 4 and that we are dealing with impatient customers who:
• wait for service if Ls ≤ 4;
• wait for service with probability 1/2 if Ls = 5;
• leave if Ls ≥ 6.
What is the distribution of Ls? •
4.8.5 M/M/m/m, the m–server loss system
Kendall’s notation has been extended, namely to A/S/c/K where:
• K stands for the capacity of the system, i.e., the maximum number of customers
allowed in the system including those in service.
When the number is at this maximum, further arrivals are turned away
(http://en.wikipedia.org/wiki/Kendall’s notation).34 K is sometimes denoted by m + c
where c is the buffer size, that is, the number of places in the waiting area.
The M/M/m/m queueing system is an m-server system with no waiting area:
each newly arriving customer is given her/his private server; however, if a customer
arrives when all servers are occupied, that customer is lost (Kleinrock, 1975, p. 105).
Unsurprisingly, this queueing system is also called an m-server loss system.
This queueing system can be modeled as a birth and death process with

λk = λ,  k = 0, 1, . . . , m − 1;
     0,  k = m, m + 1, . . .    (4.85)

µk = kµ,  k = 1, . . . , m;
     0,  k = m + 1, m + 2, . . .    (4.86)

(Kleinrock, 1975, p. 105). Since we are dealing with a finite state space (S =
{0, 1, . . . , m}), ergodicity is obviously assured as long as the traffic intensity ρ = λ/(mµ)
is finite, and this condition can be written in terms of the offered load:

mρ = λ/µ < +∞.    (4.87)
Proposition 4.93 — M/M/m/m: distribution of Ls (Kleinrock, 1975, p. 105)
The limit probability of finding k customers in the M/M/m/m queueing system depends
on the offered load mρ = λ/µ:

P(Ls = k) = P0 (mρ)^k / k!,  k = 0, 1, . . . , m;
            0,  k = m + 1, m + 2, . . . ,    (4.88)

where P0 = P(Ls = 0) = [Σ_{k=0}^{m} (mρ)^k/k!]^(−1). •
34If this number is omitted, the capacity is assumed to be unlimited, or infinite.
Remark 4.94 — M/M/m/m system and Erlang’s B formula
The long-run fraction of lost customers is equal to

P(Ls = m) = [(mρ)^m/m!] / [Σ_{k=0}^{m} (mρ)^k/k!]    (4.89)
          = B(m,mρ) ≡ B(m, λ/µ),    (4.90)

usually referred to as Erlang's B formula (or Erlang's first formula, or the Erlang loss
formula); it was first derived by Erlang in 1917 (Kleinrock, 1975, p. 106).

The (equilibrium) distribution of Ls is sometimes written in terms of B(m,mρ):

P(Ls = k) = [m!/(k! (mρ)^(m−k))] × B(m,mρ),  k = 0, 1, . . . , m;
            0,  k = m + 1, m + 2, . . .    (4.91)
•
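Erlang's B formula is equally direct to compute. A minimal sketch (the helper name and the sample load are mine), where B(1, a) = a/(1 + a) serves as a check and a[1 − B(m, a)] gives the carried load E(Ls) = mρ[1 − B(m, mρ)]:

```python
from math import factorial

def erlang_b(m, a):
    """Erlang's B (loss) formula B(m, a), with offered load a = lam/mu = m*rho."""
    return (a**m / factorial(m)) / sum(a**k / factorial(k) for k in range(m + 1))

a = 2.5                      # hypothetical offered load
blocked = erlang_b(3, a)     # long-run fraction of lost customers with m = 3 servers
carried = a * (1 - blocked)  # E(Ls) = m*rho*[1 - B(m, m*rho)]
```

Adding servers always lowers the blocking probability, which is a handy way of sizing m for a target loss rate.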
Exercise 4.95 — M/M/m/m: distributions of Ls, Lq, Ws and Wq
Prove Proposition 4.93 and show that E(Ls) = mρ[1−B(m,mρ)]. •
Exercise 4.96 — Erlang’s B and C formulae (Cooper, 1981, pp. 82, 92)
Prove that:
(a) Erlang’s B formula can be obtained in a recursive way:
B(m,mρ) = B(m, λ/µ) = ρ/(1 + ρ),  m = 1;
                      mρ × B(m − 1, λ/µ) / [m + mρ × B(m − 1, λ/µ)],  m = 2, 3, . . . ;

(b) Erlang's C formula is related to Erlang's B formula as follows:

    C(m,mρ) = m × B(m,mρ) / {m − mρ × [1 − B(m,mρ)]}

    C(m, λ/µ) = 1 / {1 + (m − mρ) × [mρ × B(m − 1, λ/µ)]^(−1)},

    where B(0, λ/µ) = 1;

(c) C(m,mρ) > B(m,mρ);

(d) C(m,mρ) ≡ C(m, λ/µ) = 1 / {1 + [(1 − ρ)/ρ] × [m − 1 − mρ × C(m − 1, λ/µ)] / [(m − 1 − mρ) × C(m − 1, λ/µ)]},
    for m > mρ + 1. •
We are dealing, once more, with a system where there is no wait — in this case because
there is no waiting area and, thus, arriving customers who find all the m servers busy are
lost. As a consequence:
Lq =st 0    (4.92)
Ws ∼ Exp(µ)    (4.93)
Wq =st 0.    (4.94)
M/M/m/m
Rates  λk = λ, k = 0, 1, . . . , m − 1;  0, k = m, m + 1, . . .
       µk = kµ, k = 1, . . . , m;  0, k = m + 1, m + 2, . . .
Ls     P(Ls = k) = [(mρ)^k/k!] / [Σ_{j=0}^{m} (mρ)^j/j!]
                 = [m!/(k! (mρ)^(m−k))] × B(m,mρ), k = 0, 1, . . . , m;
                   0, k = m + 1, m + 2, . . .
       B(m,mρ) = [(mρ)^m/m!] / [Σ_{j=0}^{m} (mρ)^j/j!]
       E(Ls) = mρ[1 − B(m,mρ)]
Lq     Lq =st 0
Ws     Ws ∼ Exp(µ)
Wq     Wq =st 0
Exercise 4.97 — M/M/m/m system
Answer the questions in Exercise 4.89, considering that the small public office has no
waiting area. •
Bibliography
• Bertsekas, D.P. (2—). Stochastic Processes (Chapter 5).
(www.telecom.otago.ac.nz/tele302/ref/Bertsekas ch5.pdf)
• Billingsley, P. (1990). Probability and Measure (3rd. edition). Wiley.
(QA273.4-.67.BIL.37008 and QA273.4-.67.BIL.36649 refer to the library code of the
2nd. edition)
• Brockwell, P.J. and Davis, R.A. (1991). Time Series: Theory and Methods (2nd.
edition). Springer-Verlag.
• Caravena, F. (2012). A note on directly Riemann integrable functions. Accessed
from http://arxiv.org/abs/1210.2361 on 2013-04-03.
• Cooper, R.B. (1981). Introduction to Queueing Theory (2nd. edition). North
Holland.
• Feller, W. (1968). An introduction to probability theory and its applications, Vol. 1
(3rd. edition). John Wiley & Sons.
(QA273.4-.67.FEL.30377, QA273.4-.67.FEL.27086)
• Feller, W. (1971). An introduction to probability theory and its applications, Vol. 2
(2nd. edition). John Wiley & Sons.
(QA273.FEI.1018)
• Grimmett, G.R. and Stirzaker, D.R. (2001a). Probability and Random Processes
(3rd. edition). Oxford.
(QA274.12-.76.GRI.40695 refers to the library code of the 1st. and 2nd. editions
from 1982 and 1992, respectively.)
• Grimmett, G.R. and Stirzaker, D.R. (2001b). One Thousand Exercises in
Probability. Oxford University Press.
• Hajek, B. (2009). Notes for ECE 534 — An Exploration of Random Processes for
Engineers.
(http://www.ifp.illinois.edu/~hajek/Papers/randomprocesses.html)
• Hastings, K. (2001). Introduction to probability with Mathematica. Chapman &
Hall.
(QA273.19.HAS.54617)
• Isaacson, D.L. and Madsen, R.W. (1976). Markov Chains: Theory and Applications.
John Wiley & Sons.
(QA274.12-.76.ISA.28858)
• Karr, A.F. (1993). Probability. Springer-Verlag.
• Kleinrock, L. (1975). Queueing Systems, Volume I: Theory. John Wiley & Sons.
(T57.9.KLE)
• Kleinrock, L. and Gail, R. (1996). Queueing Systems: Problems and Solutions.
John Wiley & Sons.
(T57.92.KLE.49916)
• Kulkarni, V.G. (1995). Modeling and Analysis of Stochastic Systems. Chapman &
Hall.
(QA274.12-.76.KUL.59065, QA274.12-.76.KUL.45259)
• Morais, M.C. (2011). Lecture Notes — Probability Theory. Departamento de
Matematica, Instituto Superior Tecnico, Universidade Tecnica de Lisboa.
(https://fenix.ist.utl.pt/disciplinas/tp/2010-2011/1-semestre/material-didactico)
• Morais, M.C. (2012). Real- and Integer-valued Time Series and Quality Control
Charts. Departamento de Matematica, Instituto Superior Tecnico, Universidade
Tecnica de Lisboa.
• Pacheco, A. (2002). Class Notes – Stochastic Manufacturing and Service Systems.
Georgia Institute of Technology, Atlanta, USA.
(https://fenix.ist.utl.pt/disciplinas/ipe64/2012-2013/2-semestre/material-
didactico)
• Pinkerton, S.D. and Holtgrave, D.R. (1998). The Bernoulli-process model in HIV
transmission: applications and implications. In Handbook of economic evaluation
of HIV prevention programs, Holtgrave, D.R. (Ed.), pp. 13–32. Plenum Press, New
York.
• Resnick, S. (1992). Adventures in Stochastic Processes. Birkhauser, Boston.
(QA274.12-.76.RES.43493)
• Rohatgi, V.K. (1976). An Introduction to Probability Theory and Mathematical
Statistics. John Wiley & Sons.
(QA273-280/4.ROH.34909)
• Ross, S.M. (1983). Stochastic Processes. John Wiley & Sons, New York.
(QA274.12-.76.ROS.36921, QA274.12-.76.ROS.37578)
• Ross, S.M. (1989). Introduction to Probability Models (4th edition). Academic
Press. (QA274.12-.76.ROS.43540 refers to the library code of the 5th. revised edition
from 1993.)
• Ross, S.M. (2003). Introduction to Probability Models (8th edition). Academic Press,
San Diego, California.
(QA273.ROS.62694)
• Serfozo, R. (2009). Basics of Applied Stochastic Processes. Springer-Verlag.
• Shumway, R.H. and Stoffer, D.S. (2006). Time Series Analysis and Its Applications:
With R Examples (2nd. edition). Springer-Verlag.
• Walrand, J. (2004). Lecture Notes on Probability Theory and Random Processes.
Department of Electrical Engineering and Computer Sciences, University of
California, Berkeley.
(walrandpc.eecs.berkeley.edu/126notes.pdf)
• Yates, R.D. and Goodman, D.J. (1999). Probability and Stochastic Processes: A
friendly Introduction for Electrical and Computer Engineers. John Wiley & Sons,
Inc. (QA273-280/4.YAT.49920)