Note on Markov Chains


1 Markov Chains

We will adopt the following conventions: random variables will be denoted by capital letters $X_t$ and $Z_t$, while realizations of random variables will be denoted by the corresponding lowercase letters $x_t$ and $z_t$. A stochastic process will be a sequence of random variables, say $\{X_t\}$ and $\{Z_t\}$, while a sample path will be a sequence of realizations $\{x_t\}$ and $\{z_t\}$.

Roughly speaking, a stochastic process $\{X_t\}$ has the Markov property if the probability distributions satisfy

$$\Pr\{X_{t+1} \le x \mid x_t, x_{t-1}, \ldots, x_{t-k}\} = \Pr\{X_{t+1} \le x \mid x_t\}$$

for any $k \ge 2$. A Markov chain is a stochastic process with this property that takes values in a finite set. A Markov chain $(\mathbf{x}, P, \pi_0)$ is characterized by a triple of objects: a state space identified with an $n$-vector $\mathbf{x}$, an $n$-by-$n$ transition matrix $P$, and an initial distribution, an $n$-vector $\pi_0$. Let

$$\mathbf{x} = [x_1 \; x_2 \; \cdots \; x_n]'$$

Then the transition matrix $P = [p_{ij}]$ has elements with the interpretation

$$p_{ij} = \Pr(X_{t+1} = x_j \mid X_t = x_i)$$

So, fix a row $i$. Then the element in column $j$ gives the conditional probability of transiting from state $x_i$ to state $x_j$. In order for these to be well-defined probabilities, we require

$$0 \le p_{ij} \le 1, \quad i, j = 1, 2, \ldots, n$$

$$\sum_{j=1}^{n} p_{ij} = 1, \quad i = 1, 2, \ldots, n$$

For example, if $n = 2$ and

$$P = \begin{bmatrix} 1/2 & 1/2 \\ 1/10 & 9/10 \end{bmatrix}$$

then the conditional probability of moving from state $i = 1$ to state $j = 2$ is $p_{12} = 1/2$, while the conditional probability of moving from state $i = 2$ to state $j = 1$ is only $p_{21} = 1/10$. Notice that the diagonal elements of $P$ give a sense of the persistence of the chain, because these elements give the probability of staying in a given state.

Similarly, the initial distribution $\pi_0 = [\pi_{0,i}]$ has elements with the interpretation

$$\pi_{0,i} = \Pr(X_0 = x_i)$$

And in order for these to be well-defined probabilities, we require

$$0 \le \pi_{0,i} \le 1, \quad i = 1, 2, \ldots, n$$

$$\sum_{i=1}^{n} \pi_{0,i} = 1$$


In applications we will often set the initial distribution to have $\pi_{0,i} = 1$ for some $i$ and zero everywhere else, so that the chain is started in state $i$ with probability 1.
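For concreteness, the two-state example above can be set up and checked in MATLAB as follows (a minimal sketch; the variable names x, P, and pi0 are my own choices, not taken from the course files):

    % Two-state chain from the example above
    x   = [1; 2];              % state space, an n-vector
    P   = [1/2  1/2;           % row i holds Pr(X_{t+1} = x_j | X_t = x_i)
           1/10 9/10];
    pi0 = [1; 0];              % start in state 1 with probability 1

    % Verify the well-definedness conditions:
    % entries in [0,1] and each row sums to one
    assert(all(P(:) >= 0 & P(:) <= 1));
    assert(all(abs(sum(P, 2) - 1) < 1e-12));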

    2 Higher order transitions

    The main convenience of the Markov chain model is the ease with which it canbe manipulated using ordinary linear algebra. For example,

$$\Pr(X_{t+2} = x_j \mid X_t = x_i) = \sum_{k=1}^{n} \Pr(X_{t+2} = x_j \mid X_{t+1} = x_k) \Pr(X_{t+1} = x_k \mid X_t = x_i) = \sum_{k=1}^{n} p_{ik} p_{kj} = (P^2)_{ij}$$

where $(P^2)_{ij}$ is the $ij$-th element of the matrix $P^2 = PP$. More generally,

$$\Pr(X_{t+s} = x_j \mid X_t = x_i) = (P^s)_{ij}$$

The Markov chain gives a law of motion for probability distributions over a finite set of possible state values. The sequence of unconditional probability distributions is $\{\pi_t\}_{t=0}^{\infty}$, where each $\pi_t$ is an $n$-vector. Let $\pi_t = [\pi_{t,i}]$ be a vector whose elements have the interpretation

$$\pi_{t,i} = \Pr(X_t = x_i)$$

    Then the sequence of probability distributions is computed using

$$\pi_1' = \pi_0' P$$

$$\pi_2' = \pi_1' P = \pi_0' P^2$$

$$\vdots$$

$$\pi_t' = \pi_{t-1}' P = \pi_0' P^t$$

where the prime ($'$) denotes the transpose. In short, probability distributions evolve according to the linear homogeneous system of difference equations

$$\pi_{t+1} = P' \pi_t$$
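This recursion is immediate to implement. A sketch in MATLAB, reusing the x, P, and pi0 defined in the earlier snippet:

    % Iterate the distribution forward: pi_{t+1} = P' * pi_t
    T  = 25;                         % horizon, chosen arbitrarily
    Pi = zeros(2, T+1);              % column t+1 stores pi_t
    Pi(:, 1) = pi0;
    for t = 1:T
        Pi(:, t+1) = P' * Pi(:, t);
    end

    % One-shot alternative using powers of P: pi_t' = pi_0' * P^t
    piT = (pi0' * P^T)';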

The conditional and unconditional moments of the Markov chain can be calculated using similar reasoning. For example,

$$E\{X_0\} = \pi_0' \mathbf{x}$$

$$E\{X_1\} = \pi_1' \mathbf{x} = \pi_0' P \mathbf{x}$$

$$\vdots$$

$$E\{X_t\} = \pi_t' \mathbf{x} = \pi_0' P^t \mathbf{x}$$


while conditional expectations are

$$E\{X_{t+k} \mid X_t = x\} = P^k \mathbf{x}$$

Similarly, let $g$ be an $n$-vector that represents a function, with typical element $g(x)$ on the state $x$. Then

$$E\{g(X_{t+k}) \mid X_t = x\} = P^k g$$
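These moment formulas are one-liners in MATLAB. A sketch, with illustrative state values and an arbitrarily chosen function $g$ (again reusing P and pi0 from the earlier snippet):

    xvec = [0; 1];                 % illustrative numerical state values
    g    = xvec.^2;                % a function g evaluated state by state

    EX10 = (pi0' * P^10) * xvec;   % unconditional mean E{X_10} = pi_0' P^10 x
    EXk  = P^5 * xvec;             % i-th entry: E{X_{t+5} | X_t = x_i}
    Egk  = P^5 * g;                % i-th entry: E{g(X_{t+5}) | X_t = x_i}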

    3 Stationary Distribution

A steady state or stationary probability distribution is a vector $\pi$ such that

$$\pi' = \pi' P$$

or, equivalently,

$$(P' - I)\pi = 0$$

Solving this matrix equation gives us $\pi$. Clearly, for a non-trivial $\pi$ (i.e., $\pi \ne 0$), we need $\det(P' - I) = 0$, which will always be satisfied for the problems we will work with. Will such a non-trivial $\pi$ be unique? Yes, if $P$ is regular, i.e., $0 < p_{ij} < 1$ for each $i, j$ (I will not prove this result).
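For the two-state example from Section 1, the singularity of $P' - I$ is easy to confirm numerically (a quick check of my own, not from the course files):

    P = [1/2 1/2; 1/10 9/10];   % the two-state example from Section 1
    d = det(P' - eye(2))        % prints 0 (up to rounding error)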

    4 Computing Stationary Distributions By Hand

Consider a two-state Markov chain $(\mathbf{x}, P, \pi_0)$ with transition matrix

$$P = \begin{bmatrix} 1-p & p \\ q & 1-q \end{bmatrix}$$

for $0 < p, q < 1$. Then this Markov chain has a unique invariant distribution $\pi$, which we can solve for as follows:

$$0 = (I - P')\pi$$

or

$$\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 1-p & q \\ p & 1-q \end{bmatrix} \right) \begin{bmatrix} \pi_1 \\ \pi_2 \end{bmatrix} = \begin{bmatrix} p & -q \\ -p & q \end{bmatrix} \begin{bmatrix} \pi_1 \\ \pi_2 \end{bmatrix}$$

Carrying out the calculations, we find that

$$0 = p\pi_1 - q\pi_2$$

$$0 = -p\pi_1 + q\pi_2$$


These two equations only tell us one piece of information, namely

$$\pi_2 = \frac{p}{q}\,\pi_1$$

But we also know that the elements must satisfy

$$\pi_1 + \pi_2 = 1$$

So we can solve these two equations in two unknowns to get

$$\pi_1 = \frac{q}{p+q}, \quad \pi_2 = \frac{p}{p+q}$$
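This formula is easy to verify numerically. A sketch, using the values of $p$ and $q$ implied by the example in Section 1:

    p = 1/2; q = 1/10;             % values implied by the Section 1 example
    P = [1-p p; q 1-q];
    pibar = [q; p] / (p + q)       % the formula just derived
    max(abs(P' * pibar - pibar))   % zero up to rounding error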

5 Computing Stationary Distributions in MATLAB

One way to compute the stationary distribution using a computer is to use brute force. So if $P$ is the transition matrix, simply compute $P^T$ for some large $T$. Then you will get back a matrix with identical rows, each equal to the chain's stationary distribution. For example, if

$$P = \begin{bmatrix} 0.7 & 0.2 & 0.1 \\ 0 & 0.5 & 0.5 \\ 0 & 0.9 & 0.1 \end{bmatrix}$$

then

$$P^{1000} = \begin{bmatrix} 0 & 0.6429 & 0.3571 \\ 0 & 0.6429 & 0.3571 \\ 0 & 0.6429 & 0.3571 \end{bmatrix}$$

and the stationary distribution is $\pi = [0 \;\; 0.6429 \;\; 0.3571]'$, which is what we would get from the "by-hand" method of the last section. This is not very elegant, of course. A neater approach is to use MATLAB to compute the eigenvalues and eigenvectors of $P$. Why eigenvalues and eigenvectors? Return to the equation that describes the steady state

$$(P' - I)\pi = 0$$

and notice that solving this equation for $\pi$ amounts to solving for the eigenvector of $P'$ corresponding to an eigenvalue equal to 1. Can we be sure that $P'$ will have an eigenvalue equal to 1? Yes, always (this follows from the property that $0 \le p_{ij} \le 1$ and $\sum_{j=1}^{n} p_{ij} = 1$). Can we be sure that it will have only one eigenvalue equal to 1, so that the eigenvector corresponding to it will be unique? Yes, provided $P'$ is regular (as defined in Section 3). Again, I will not prove these last two claims. The requirement that $\sum_{i=1}^{n} \pi_i = 1$ is a normalization of the eigenvector.

Note that in the example above, $P$ does not satisfy the regularity property. So how did we get convergence? We got it because the regularity property is only sufficient for uniqueness. In this course, we may not always have a regular transition matrix, but $P$ will always be chosen by me (for examples, problem sets, and exams) so that the stationary distribution is unique.

Step 1: Compute the matrix of eigenvalues and eigenvectors of $P'$ (remember the transpose!). In MATLAB,

[V,D] = eig(P')

gives a matrix of eigenvectors V and a diagonal matrix D whose entries are the eigenvalues of $P'$.

Step 2: Now since $P$ is a transition matrix, one of the eigenvalues is 1. Pick the column of V associated with the eigenvalue 1. With the matrix $P$ given above, this will be the second column, and we will have

v = V(:,2) =

$$\begin{bmatrix} 0 \\ 0.8742 \\ 0.4856 \end{bmatrix}$$

Step 3: Finally, normalize the eigenvector to sum to one, that is,

pibar = v/sum(v) =

$$\begin{bmatrix} 0 \\ 0.6429 \\ 0.3571 \end{bmatrix}$$
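Putting the three steps together: rather than picking the column of V by inspection, one can locate the eigenvalue closest to 1 programmatically. A sketch of this variant (my own, not from the course files):

    P = [0.7 0.2 0.1;
         0.0 0.5 0.5;
         0.0 0.9 0.1];

    [V, D]   = eig(P');                    % Step 1 (note the transpose)
    [~, idx] = min(abs(diag(D) - 1));      % Step 2: eigenvalue closest to 1
    pibar    = V(:, idx) / sum(V(:, idx))  % Step 3: normalize to sum to one

    % Cross-check with brute force: every row of P^1000 is (approximately) pibar'
    Pbig = P^1000;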

    6 Simulating Markov Chains in MATLAB

In the problem set, you will have to simulate an $n$-state Markov chain $(\mathbf{x}, P, \pi_0)$ for $t = 0, 1, 2, \ldots, T$ time periods. I use a bold $\mathbf{x}$ to distinguish the vector of possible state values from sample realizations of the chain. Iterating on the Markov chain will produce a sample path $\{x_t\}_{t=0}^{T}$ where, for each $t$, $x_t \in \mathbf{x}$. In the exposition below, I suppose that $n = 2$ for simplicity, so that the transition matrix can be written

$$P = \begin{bmatrix} p_1 & 1-p_1 \\ p_2 & 1-p_2 \end{bmatrix}$$

Step 1: Set values for each of $(\mathbf{x}, P, \pi_0)$.

Step 2: Determine the initial state $x_0$. To do this, draw a random variable from a uniform distribution on $[0, 1]$. Call that realization $\varepsilon_0$. In MATLAB, this can be done with the rand() command. If the number $\varepsilon_0 \le \pi_{0,1}$, set $x_0 = \mathbf{x}(1)$. Otherwise, set $x_0 = \mathbf{x}(2)$.


Step 3: Draw a vector of length $T$ of independent random variables from a uniform distribution on $[0, 1]$. Call a typical realization $\varepsilon_t$. Again, in MATLAB, this can be done with the command rand(T,1). Now if the current state is $x_t = \mathbf{x}(i)$, check whether $\varepsilon_t \le p_{i,1}$. If so, set $x_{t+1} = \mathbf{x}(1)$; otherwise, set $x_{t+1} = \mathbf{x}(2)$, just as with the initial draw. Iterating in this manner builds up an entire simulation; a sketch of such a simulator, generalized to $n$ states, appears at the end of this section.

The MATLAB file markov_example.m (uploaded to BB) implements this procedure for a specific chain $(\mathbf{x}, P, \pi_0)$. When this file is executed, it calls another MATLAB file simulate_markov.m (also uploaded to BB), which performs the simulation for an arbitrary chain $(\mathbf{x}, P, \pi_0)$ and a specified simulation length $T$. The latter file is a function file.

As an example, suppose the growth rate of log GDP is

$$x_{t+1} = \log(y_{t+1}) - \log(y_t)$$

and suppose that $\{X_t\}$ follows a 3-state Markov chain with

$$\mathbf{x} = (x_1, x_2, x_3) = (-0.02,\; 0.02,\; 0.04)$$

and transition probabilities

$$P = \begin{bmatrix} 0.5 & 0.5 & 0 \\ 0.03 & 0.9 & 0.07 \\ 0 & 0.2 & 0.8 \end{bmatrix}$$

The idea here is that the economy can either be shrinking ($x_1 = -0.02 < 0$), growing at its usual pace ($x_2 = 0.02 > 0$), or growing even faster ($x_3 = 0.04$). If the economy is in recession, there is a 50/50 chance of reverting back to the average growth rate. If the economy is growing at an average pace, there is a slight probability of it falling into recession and a slightly bigger probability of it growing even faster, and so on. We simulate a Markov chain on $\mathbf{x}$ and then recover the level of output via the sum

$$\log(y_t) = \log(y_0) + \sum_{k=1}^{t} x_k, \quad t \ge 1$$

The MATLAB file gdpgrowth_example.m (uploaded to BB) produces a simulation of this stochastic growth process assuming $\pi_0 = [0 \;\; 1 \;\; 0]'$, simulation length $T = 100$, and initial condition $\log(y_0) = 0$. This MATLAB file also calls the function file simulate_markov.m.
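Since the actual simulate_markov.m is available on BB, the following is only a sketch of what such a function file might look like; the name simulate_markov_sketch and everything in the body are my own, not the BB code. It generalizes Steps 2 and 3 to $n$ states by comparing each uniform draw against the cumulative probabilities of the relevant row of $P$:

    function xpath = simulate_markov_sketch(x, P, pi0, T)
    % Simulate a sample path {x_t}, t = 0,...,T, of the chain (x, P, pi0).
    %   x    n-vector of state values
    %   P    n-by-n transition matrix
    %   pi0  n-vector initial distribution
    %   T    simulation length
    % (A sketch only, not the simulate_markov.m file from BB.)
        state = zeros(T+1, 1);                  % indices of visited states

        % Step 2: draw the initial state from pi0 via its cumulative distribution
        state(1) = find(rand() <= cumsum(pi0), 1);

        % Step 3: transit period by period using the row of P for the current state
        u = rand(T, 1);
        for t = 1:T
            state(t+1) = find(u(t) <= cumsum(P(state(t), :)), 1);
        end

        xpath = x(state);                       % realized sample path
    end

A usage sketch along the lines of the GDP experiment above:

    x   = [-0.02; 0.02; 0.04];
    P   = [0.50 0.50 0;
           0.03 0.90 0.07;
           0    0.20 0.80];
    pi0 = [0; 1; 0];
    T   = 100;

    growth = simulate_markov_sketch(x, P, pi0, T);   % x_0, x_1, ..., x_T
    logy   = [0; cumsum(growth(2:end))];             % log(y_t) with log(y_0) = 0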

    7 Homework

Work through the programs mentioned above. For information on MATLAB function files (what is a function file? how does one code one?), go to the Help section of MATLAB (once you have opened the program) and look up the index entry on "function".
