Economic Dynamics
I: Discrete Time
Jianjun Miao
April 9, 2012
Contents
Preface xi
I Dynamical Systems 1
1 Deterministic Difference Equations 5
1.1 Scalar First-Order Linear Equations . . . . . . . . . . . . . . 5
1.2 Lag Operators . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3 Scalar Second-Order Linear Equations . . . . . . . . . . . . . 12
1.4 First-Order Linear Systems . . . . . . . . . . . . . . . . . . . 15
1.4.1 Nonsingular System . . . . . . . . . . . . . . . . . . . 17
1.4.2 Singular System . . . . . . . . . . . . . . . . . . . . . 23
1.5 Phase Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.6 Nonlinear Systems . . . . . . . . . . . . . . . . . . . . . . . . 28
1.7 Numerical Solutions Using Dynare . . . . . . . . . . . . . . . 32
1.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2 Stochastic Difference Equations 43
2.1 First-Order Linear Systems . . . . . . . . . . . . . . . . . . . 43
2.2 Scalar Linear Rational Expectations Models . . . . . . . . . . 46
2.2.1 Lag Operators . . . . . . . . . . . . . . . . . . . . . . 46
2.2.2 The Method of Undetermined Coefficients . . . . . . . 49
2.3 Multivariate Linear Rational Expectations Models . . . . . . 50
2.3.1 The Blanchard and Kahn Method . . . . . . . . . . . 51
2.3.2 The Klein Method . . . . . . . . . . . . . . . . . . . . 53
2.3.3 The Sims Method . . . . . . . . . . . . . . . . . . . . 56
2.4 Nonlinear Rational Expectations Models . . . . . . . . . . . . 60
2.5 Numerical Solutions Using Dynare . . . . . . . . . . . . . . . 65
2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3 Markov Processes 81
3.1 Markov Chains . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.1.1 Classification of States . . . . . . . . . . . . . . . . . . 86
3.1.2 Stationary Distribution . . . . . . . . . . . . . . . . . 90
3.1.3 Countable-State Markov Chains . . . . . . . . . . . . 93
3.2 General Markov Processes . . . . . . . . . . . . . . . . . . . . 95
3.3 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
3.3.1 Strong Convergence . . . . . . . . . . . . . . . . . . . 100
3.3.2 Weak Convergence . . . . . . . . . . . . . . . . . . . . 104
3.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4 Ergodic Theory and Stationary Processes 111
4.1 Ergodic Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.2 Application to Stationary Processes . . . . . . . . . . . . . . 116
4.3 Application to Stationary Markov Processes . . . . . . . . . . 120
4.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
II Dynamic Optimization 125
5 Markov Decision Process Model 129
5.1 Model Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.2.1 Discrete Choice . . . . . . . . . . . . . . . . . . . . . . 137
5.2.2 Optimal Stopping . . . . . . . . . . . . . . . . . . . . 138
5.2.3 Bandit Model . . . . . . . . . . . . . . . . . . . . . . . 143
5.2.4 Optimal Control . . . . . . . . . . . . . . . . . . . . . 147
5.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
6 Finite-Horizon Dynamic Programming 151
6.1 A Motivating Example . . . . . . . . . . . . . . . . . . . . . . 151
6.2 Measurability Problem . . . . . . . . . . . . . . . . . . . . . . 155
6.3 The Principle of Optimality . . . . . . . . . . . . . . . . . . . 157
6.4 Optimal Control . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.5 The Maximum Principle . . . . . . . . . . . . . . . . . . . . . 174
6.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.6.1 The Secretary Problem . . . . . . . . . . . . . . . . . 179
6.6.2 A Consumption-Saving Problem . . . . . . . . . . . . 181
6.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
7 Infinite-Horizon Dynamic Programming 187
7.1 The Principle of Optimality . . . . . . . . . . . . . . . . . . . 188
7.2 Bounded Rewards . . . . . . . . . . . . . . . . . . . . . . . . 198
7.3 Optimal Control . . . . . . . . . . . . . . . . . . . . . . . . . 201
7.3.1 Bounded Rewards . . . . . . . . . . . . . . . . . . . . 201
7.3.2 Unbounded Rewards . . . . . . . . . . . . . . . . . . . 205
7.4 The Maximum Principle and Transversality Condition . . . . 219
7.5 Euler Equations and Transversality Condition . . . . . . . . . 223
7.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
8 Applications 233
8.1 Option Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . 233
8.2 Discrete Choice . . . . . . . . . . . . . . . . . . . . . . . . . . 236
8.3 Consumption and Saving . . . . . . . . . . . . . . . . . . . . 239
8.3.1 Deterministic income . . . . . . . . . . . . . . . . . . . 243
8.3.2 Stochastic income . . . . . . . . . . . . . . . . . . . . 251
8.4 Inventory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
8.5 Investment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
8.5.1 Neoclassical Theory . . . . . . . . . . . . . . . . . . . 276
8.5.2 Q Theory . . . . . . . . . . . . . . . . . . . . . . . . . 278
8.5.3 Augmented Adjustment Costs . . . . . . . . . . . . . . 281
8.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
9 Linear-Quadratic Models 293
9.1 Controlled Linear State-Space System . . . . . . . . . . . . . 293
9.2 Finite-Horizon Problems . . . . . . . . . . . . . . . . . . . . . 296
9.3 Infinite-Horizon Limits . . . . . . . . . . . . . . . . . . . . . . 300
9.4 Optimal Policy under Commitment . . . . . . . . . . . . . . . 306
9.5 Optimal Discretional Policy . . . . . . . . . . . . . . . . . . . 312
9.6 Robust Control . . . . . . . . . . . . . . . . . . . . . . . . . . 317
9.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
10 Control under Partial Information 329
10.1 Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
10.1.1 Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . 329
10.1.2 Hidden Markov Chain . . . . . . . . . . . . . . . . . . 339
10.1.3 Hidden Markov-Switching Model . . . . . . . . . . . . 341
10.2 Control Problems . . . . . . . . . . . . . . . . . . . . . . . . . 342
10.3 Linear-Quadratic Control . . . . . . . . . . . . . . . . . . . . 344
10.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
11 Numerical Methods 349
11.1 Numerical Integration . . . . . . . . . . . . . . . . . . . . . . 349
11.1.1 Gaussian Quadrature . . . . . . . . . . . . . . . . . . 350
11.1.2 Multidimensional Quadrature . . . . . . . . . . . . . . 352
11.2 Discretizing AR(1) Processes . . . . . . . . . . . . . . . . . . 352
11.3 Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
11.3.1 Orthogonal Polynomials . . . . . . . . . . . . . . . . . 358
11.3.2 Splines . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
11.3.3 Multidimensional Approximation . . . . . . . . . . . . 366
11.4 Perturbation Methods . . . . . . . . . . . . . . . . . . . . . . 368
11.5 Projection Methods . . . . . . . . . . . . . . . . . . . . . . . 372
11.6 Numerical Dynamic Programming . . . . . . . . . . . . . . . 378
11.6.1 Discrete Approximation Methods . . . . . . . . . . . . 379
11.6.2 Smooth Approximation Methods . . . . . . . . . . . . 382
11.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
12 Structural Estimation 387
12.1 Estimation Methods and Asymptotic Properties . . . . . . . . 387
12.1.1 Generalized Method of Moments . . . . . . . . . . . . 387
12.1.2 Maximum Likelihood . . . . . . . . . . . . . . . . . . . 387
12.1.3 Simulation-Based Methods . . . . . . . . . . . . . . . 387
12.2 Application: Investment . . . . . . . . . . . . . . . . . . . . . 387
12.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
III Equilibrium Analysis 389
13 Complete Markets Exchange Economies 393
13.1 Arrow-Debreu Equilibrium versus Radner Equilibrium . . . . 393
13.2 Recursive Formulation . . . . . . . . . . . . . . . . . . . . . . 393
13.3 The Lucas Model . . . . . . . . . . . . . . . . . . . . . . . . . 393
13.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
14 Neoclassical Growth Models 395
14.1 Deterministic Models . . . . . . . . . . . . . . . . . . . . . . . 396
14.1.1 A Basic Ramsey Model . . . . . . . . . . . . . . . . . 396
14.1.2 Incorporating Fiscal Policy . . . . . . . . . . . . . . . 407
14.2 Real Business Cycle Theory . . . . . . . . . . . . . . . . . . . 411
14.2.1 A Basic RBC Model . . . . . . . . . . . . . . . . . . . 412
14.2.2 Extensions . . . . . . . . . . . . . . . . . . . . . . . . 426
14.3 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
15 Bayesian Estimation of DSGE Models Using Dynare 449
15.1 Principles of Bayesian Estimation . . . . . . . . . . . . . . . . 450
15.2 Bayesian Estimation of DSGE Models . . . . . . . . . . . . . 452
15.2.1 Numerical Solution and State Space Representation . 452
15.2.2 Evaluating the Likelihood Function . . . . . . . . . . . 454
15.2.3 Computing the Posterior . . . . . . . . . . . . . . . . . 456
15.3 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
15.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
16 Overlapping-Generations Models 469
16.1 Exchange Economies . . . . . . . . . . . . . . . . . . . . . . . 470
16.2 Production Economies . . . . . . . . . . . . . . . . . . . . . . 484
16.3 Asset Bubbles . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
16.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
17 Incomplete Markets Models 499
17.1 Exchange Economies . . . . . . . . . . . . . . . . . . . . . . . 499
17.2 Production Economies . . . . . . . . . . . . . . . . . . . . . . 499
17.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
18 Search and Matching Models of Unemployment 501
18.1 A Basic DMP Model . . . . . . . . . . . . . . . . . . . . . . . 502
18.2 Endogenous Job Destruction . . . . . . . . . . . . . . . . . . 514
18.3 Unemployment and Business Cycles . . . . . . . . . . . . . . 521
18.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528
19 New Keynesian Models 529
19.1 A Basic Model . . . . . . . . . . . . . . . . . . . . . . . . . . 529
19.2 A Medium-Scale DSGE Model . . . . . . . . . . . . . . . . . 529
19.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
IV Further Topics 531
20 Bandit Models 533
20.1 Basic Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 533
20.2 Market Learning . . . . . . . . . . . . . . . . . . . . . . . . . 533
20.3 Job Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
20.4 Experimentation and Pricing . . . . . . . . . . . . . . . . . . 533
20.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
21 Recursive Utility 535
21.1 Utility Models . . . . . . . . . . . . . . . . . . . . . . . . . . 535
21.2 Optimal Growth . . . . . . . . . . . . . . . . . . . . . . . . . 535
21.3 Portfolio Choice and Asset Pricing . . . . . . . . . . . . . . . 535
21.4 Pareto Optimality . . . . . . . . . . . . . . . . . . . . . . . . 535
21.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
22 Recursive Contracts 537
22.1 Limited Commitment . . . . . . . . . . . . . . . . . . . . . . 537
22.2 Hidden Actions . . . . . . . . . . . . . . . . . . . . . . . . . . 537
22.3 Hidden Information . . . . . . . . . . . . . . . . . . . . . . . . 537
22.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
V Mathematical Appendix 539
A Linear Algebra 541
A.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
A.2 Linear Transformations . . . . . . . . . . . . . . . . . . . . . 541
A.3 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
A.4 Eigentheory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
A.5 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . 541
A.6 Unitary Diagonalization Theorems . . . . . . . . . . . . . . . 541
A.7 Quadratic Forms . . . . . . . . . . . . . . . . . . . . . . . . . 541
A.8 Jordan Decomposition Theorem . . . . . . . . . . . . . . . . . 541
B Real and Functional Analysis 543
B.1 Metric Spaces and Normed Vector Spaces . . . . . . . . . . . 543
B.2 The Contraction Mapping Theorem . . . . . . . . . . . . . . 543
B.3 Dual Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
B.4 Hahn-Banach Theorem . . . . . . . . . . . . . . . . . . . . . . 543
C Convex Analysis 545
C.1 Convex Functions . . . . . . . . . . . . . . . . . . . . . . . . . 545
C.2 The Theorem of the Maximum . . . . . . . . . . . . . . . . . 545
C.3 Lagrange Theorem . . . . . . . . . . . . . . . . . . . . . . . . 545
C.4 Kuhn-Tucker Theorem . . . . . . . . . . . . . . . . . . . . . . 545
D Measure and Probability Theory 547
D.1 Measurable Spaces . . . . . . . . . . . . . . . . . . . . . . . . 547
D.2 Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
D.3 Measurable Functions . . . . . . . . . . . . . . . . . . . . . . 547
D.4 Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
D.5 Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 547
D.6 The Monotone Class Lemma . . . . . . . . . . . . . . . . . . 547
D.7 Conditional Expectation . . . . . . . . . . . . . . . . . . . . . 547
Afterword 549
References 551
Preface
To be written.
Part I
Dynamical Systems
The dynamics of economic variables are typically described by the following system of p-th-order difference equations:

x_{t+p} = f(x_t, x_{t+1}, ..., x_{t+p-1}, z_t, z_{t+1}, ..., z_{t+p-1}),   (1)

where f : R^{np} × R^{n_z p} → R^n, x_t ∈ R^n, z_t ∈ R^{n_z} for all t = 0, 1, ..., and n, n_z, and p are natural numbers. The vector z_t consists of exogenously given forcing variables. We need to impose certain initial or terminal conditions to solve (1). These conditions typically depend on the economic problem at hand. By appropriate changes of variables, we can often transform (1) into a system of first-order difference equations. If the sequence {z_t} is deterministic, then (1) is a deterministic system. In Chapter 1, we study this case. If {z_t} is a stochastic process, then (1) is a stochastic system. In this case, we introduce an information structure and require f to satisfy a certain measurability condition. In a dynamic economy, economic agents must
form expectations about future variables. If system (1) characterizes a ra-
tional expectations equilibrium, we must introduce conditional expectations
into this system. We study the stochastic case in Chapter 2. Researchers
typically use a recursive approach to study dynamic equilibria. Under this
approach, equilibrium variables typically satisfy a certain Markov property.
In Chapter 3, we study Markov processes and their convergence. A central
issue is the existence and uniqueness of a stationary distribution. In Chap-
ter 4, we discuss ergodic theory and its application to stationary processes.
We establish several strong laws of large numbers for stationary processes
and in particular for Markov processes.
Chapter 1
Deterministic Difference Equations
In this chapter, we focus on deterministic dynamics characterized by systems
of first-order linear difference equations. We distinguish between singular
and nonsingular systems because different solution methods are applied to
these two cases. We also introduce lag operators and apply them to solve
second-order linear difference equations. Finally, we provide a brief intro-
duction to nonlinear dynamics.
1.1 Scalar First-Order Linear Equations
Consider the following scalar first-order linear difference equation:
xt+1 = bxt + czt, (1.1)
where xt, b, c, zt are all real numbers. We assume {zt} is an exogenously given
bounded sequence. If zt is constant for each t, then (1.1) is autonomous.
When czt = 0 for all t, we call (1.1) a homogeneous difference equation.
These concepts can be generalized to systems of high-order difference equa-
tions introduced later.
In the autonomous case, we may suppose zt = 1 for all t in (1.1) without
loss of generality. We then obtain
xt+1 = bxt + c. (1.2)
A particular solution to this difference equation is a constant solution x_t = x for all t, where

x = \frac{c}{1 - b},   for b ≠ 1.
This solution is called a stationary point or steady state. One can verify
that the general solution to (1.2) is given by
xt = abt + x, (1.3)
where a = x0 − x is a constant determined by the initial value x0. We are
interested in the long-run behavior of this solution:
• If |b| < 1, then the solution in (1.3) converges asymptotically to the
steady state x for any initial value x0. In this case, we call x a globally
asymptotically stable steady state. If x0 is not exogenously given,
then the solution is indeterminate. Starting from any initial value
x0 = a+ x, equation (1.3) gives a solution to (1.2).
• If |b| > 1, then the solution in (1.3) explodes or is unstable, unless
a = 0 or x0 = x. In this case, we often assume x0 is unknown and
solve for the entire path of xt. The only stable solution is xt = x for
all t ≥ 0.
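These two cases can be illustrated by direct iteration. The following is a minimal Python sketch; the parameter values (b = 0.5 and b = 1.5, c = 1) are illustrative, not from the text:

```python
# Long-run behavior of x_{t+1} = b*x_t + c (illustrative parameter values).
b, c = 0.5, 1.0
x_bar = c / (1 - b)                      # steady state x = c/(1 - b) = 2
path = [10.0]                            # an arbitrary initial value x_0
for _ in range(60):
    path.append(b * path[-1] + c)
assert abs(path[-1] - x_bar) < 1e-8      # |b| < 1: convergence from any x_0

b2 = 1.5
x_bar2 = c / (1 - b2)                    # steady state = -2
explode = [x_bar2 + 0.01]                # start slightly away from the steady state
for _ in range(60):
    explode.append(b2 * explode[-1] + c)
assert abs(explode[-1] - x_bar2) > 1e3   # |b| > 1: any deviation explodes
```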
In the nonautonomous case, we may solve for {xt} in two ways depending
on whether or not the initial value x0 is exogenously given. First, consider
the case in which x0 is exogenously given. Then we solve for xt backward
by repeated substitution to obtain the backward-looking solution:
x_t = c \sum_{j=0}^{t-1} b^j z_{t-1-j} + b^t x_0.   (1.4)
If |b| < 1, then

\lim_{t\to\infty} x_t = \lim_{t\to\infty} c \sum_{j=0}^{t-1} b^j z_{t-1-j},   (1.5)

where a finite limit exists because we assume {z_t} is a bounded sequence.
Thus, for any given initial value x0, the difference equation in (1.1) has a
solution for {xt} , which converges to a finite limit in (1.5). We call this
limit a generalized steady state. It is globally asymptotically stable. If
|b| > 1, then (1.4) shows that {xt} diverges. If |b| = 1, then the solution
does not converge to a finite limit unless \sum_{j=0}^{\infty} z_j is finite. Even if a finite
limit exists, it depends on the initial condition x0 so that the solution is not
globally stable.
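The backward-looking solution (1.4) can be checked against direct iteration of (1.1). The sketch below uses an arbitrary bounded forcing sequence; all values are illustrative:

```python
# Check the backward-looking solution (1.4) against direct iteration of
# x_{t+1} = b*x_t + c*z_t (illustrative parameters and forcing sequence).
import math

b, c, x0 = 0.8, 0.5, 3.0
z = [math.sin(t) + 1.0 for t in range(50)]   # any bounded forcing sequence

# direct iteration of the difference equation
x = [x0]
for t in range(49):
    x.append(b * x[t] + c * z[t])

# closed form (1.4): x_t = c * sum_{j=0}^{t-1} b^j z_{t-1-j} + b^t x_0
def x_closed(t):
    return c * sum(b**j * z[t - 1 - j] for j in range(t)) + b**t * x0

assert all(abs(x[t] - x_closed(t)) < 1e-10 for t in range(50))
```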
Second, suppose x0 is not exogenously given. For example, xt represents
an asset’s price. Let b be the gross return and −czt > 0 be the asset’s
dividends. Then equation (1.1) is an asset pricing equation. We may solve
for x_t forward by repeated substitution:

x_t = \left(\frac{1}{b}\right)^T x_{t+T} - \frac{c}{b} \sum_{j=0}^{T-1} \left(\frac{1}{b}\right)^j z_{t+j},   (1.6)

for any T ≥ 1. Taking T → ∞ and assuming the transversality condition (or no-bubble condition)

\lim_{T\to\infty} \left(\frac{1}{b}\right)^T x_{t+T} = 0,   (1.7)

we obtain the forward-looking solution:

x_t = -\frac{c}{b} \sum_{j=0}^{\infty} \left(\frac{1}{b}\right)^j z_{t+j}.   (1.8)

If |b| > 1, then the above infinite sum is finite since {z_t} is a bounded sequence. Clearly, the above solution also satisfies the transversality condition (1.7). This solution is stable in the sense that x_t is bounded for all t.
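One can verify numerically that the forward-looking solution (1.8) indeed satisfies x_{t+1} = b x_t + c z_t. The sketch below truncates the infinite sum, which is harmless because (1/b)^j decays geometrically; the parameter values are illustrative:

```python
# Verify that the forward-looking solution (1.8) satisfies
# x_{t+1} = b*x_t + c*z_t when |b| > 1 (illustrative parameters).
import math

b, c = 1.25, 1.0
z = lambda t: math.cos(0.3 * t)               # bounded forcing sequence

def x_fwd(t, J=200):
    # truncated version of -(c/b) * sum_{j>=0} (1/b)^j z_{t+j}
    return -(c / b) * sum(b**(-j) * z(t + j) for j in range(J))

for t in range(10):
    assert abs(x_fwd(t + 1) - (b * x_fwd(t) + c * z(t))) < 1e-8
```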
If we remove the transversality condition (1.7), then (1.1) admits many
solutions. Let x∗t denote the solution given by (1.8). Then for any Bt
satisfying
Bt+1 = bBt, (1.9)
the expression, xt = x∗t + Bt, is a solution to (1.1). We often call x∗t the
fundamental solution and Bt a bubble. The bubble grows at the gross rate
b.
If |b| < 1, then the infinite sum in (1.8) generally fails to converge. There are infinitely many bubble solutions, which are globally stable rather than
exploding. For example, let zt = 1 for all t, then the expression below is a
solution to (1.1):
x_t = \frac{c}{1 - b} + B_t,
where Bt satisfies (1.9) and B0 is any given value. This is an instance of
indeterminacy discussed earlier for the autonomous system. Theorem 1.4.3
studied later will consider more general cases.
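The indeterminacy can be confirmed directly: with z_t = 1 and |b| < 1, every path x_t = c/(1 - b) + B_0 b^t solves (1.1), whatever the value of B_0. A minimal sketch with illustrative values:

```python
# Indeterminacy under |b| < 1: the bubble B_t = B_0 * b^t satisfies
# B_{t+1} = b*B_t, so x_t = c/(1-b) + B_0*b^t solves x_{t+1} = b*x_t + c
# for *any* B_0 (the B_0 values below are arbitrary illustrations).
b, c = 0.9, 1.0
for B0 in (-5.0, 0.0, 2.5):
    x = lambda t: c / (1 - b) + B0 * b**t
    assert all(abs(x(t + 1) - (b * x(t) + c)) < 1e-9 for t in range(100))
```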
Example 1.1.1 Asset prices under adapted versus rational expectations.
Consider the following asset pricing equation:
p_t = \frac{{}_t p^e_{t+1} + d}{R},   (1.10)

where d represents constant dividends, R is the gross return on the asset, p_t is the asset price in period t, and {}_t p^e_{t+1} is investors' period-t forecast of the price in period t+1. An important question is how to form this forecast. According to the adapted expectations hypothesis, the forecast satisfies

{}_t p^e_{t+1} = (1 - λ) {}_{t-1} p^e_t + λ p_t,   (1.11)

where λ ∈ (0, 1). This means that investors' current forecast of the next-period price is equal to a weighted average of the current price and the previous-period forecast of the current price. Using equation (1.10) to substitute for {}_t p^e_{t+1} and {}_{t-1} p^e_t into equation (1.11), we obtain
R p_t - d = (1 - λ)(R p_{t-1} - d) + λ p_t.

Simplifying yields:

(R - λ) p_t = (1 - λ) R p_{t-1} + λ d.

Solving this equation backward until time 0, we obtain the backward-looking solution:

p_t = a^t p_0 + (1 - a^t) \frac{d}{R - 1},

where a = R(1 - λ)/(R - λ). We need to assign the initial value p_0. For this solution to be stable, we must assume that |a| < 1. In this case, p_t converges to its steady state value p = d/(R - 1), starting from any initial value p_0.
We next turn to the case under rational expectations. In a deterministic
model, rational expectations mean perfect foresight in the sense that {}_t p^e_{t+1} = p_{t+1}. That is, investors' rational forecast of the future price is identical to its true value. In this case, we rewrite (1.10) as:

p_t = \frac{p_{t+1} + d}{R}.   (1.12)

Solving this equation forward, we obtain:

p_t = \frac{d}{R - 1} + \lim_{T\to\infty} \frac{p_{t+T}}{R^T}.
Ruling out bubbles, we obtain the forward-looking solution p_t = d/(R - 1). This means that the stock price is always equal to the constant fundamental value.
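Both parts of this example are easy to check numerically. The sketch below verifies that the adapted-expectations path satisfies the recursion (R - λ) p_t = (1 - λ) R p_{t-1} + λ d and converges to the fundamental value d/(R - 1); the parameter values are illustrative:

```python
# Check Example 1.1.1 under adapted expectations:
# p_t = a^t p_0 + (1 - a^t) d/(R-1), with a = R(1-lam)/(R-lam),
# satisfies the recursion and converges to d/(R-1) (illustrative numbers).
R, lam, d, p0 = 1.05, 0.6, 1.0, 30.0
a = R * (1 - lam) / (R - lam)          # |a| < 1 here, so the path is stable
p_bar = d / (R - 1)                    # fundamental value = 20

p = lambda t: a**t * p0 + (1 - a**t) * p_bar
for t in range(1, 50):
    assert abs((R - lam) * p(t) - ((1 - lam) * R * p(t - 1) + lam * d)) < 1e-9
assert abs(p(400) - p_bar) < 1e-8      # convergence to the fundamental value
```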
Example 1.1.2 Dividend taxes.
Suppose dividends are taxed at the constant rate τ_1 from time 0 to time T. From time T + 1 on, the dividend tax rate is increased to τ_2 forever. Suppose this policy is publicly announced at time 0. What will happen to the stock price at time 0? Given the rational expectations hypothesis, we solve for the price at time T:

p_T = \frac{(1 - τ_1) d}{R - 1}.
At time 0, we use equation (1.6) to derive the forward-looking solution:

p_0 = \frac{1}{R^T} p_T + \frac{1}{R} \sum_{j=0}^{T-1} \frac{(1 - τ_1) d}{R^j}
    = \frac{(1 - τ_1) d}{R - 1} + \frac{1}{R^T} \left( \frac{(1 - τ_2) d}{R - 1} - \frac{(1 - τ_1) d}{R - 1} \right).
Thus, the stock price drops immediately at time 0 and then continuously declines until it reaches the new fundamental value. Figure 1.1 shows a numerical example. The dashed and solid lines represent the price paths without and with the tax change, respectively.
Figure 1.1: The impact of dividend taxes on the stock price.
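The closed-form expression for p_0 can be cross-checked by backward induction on the pricing recursion p_t = (p_{t+1} + (1 - τ_1) d)/R with terminal value p_T; the numbers below are illustrative:

```python
# Check Example 1.1.2: iterate p_t = (p_{t+1} + (1 - tau1)*d)/R backward
# from p_T = (1 - tau2)*d/(R - 1) and compare with the closed-form p_0
# (illustrative parameter values).
R, d, tau1, tau2, T = 1.04, 1.0, 0.2, 0.5, 25

pT = (1 - tau2) * d / (R - 1)
p = pT
for _ in range(T):
    p = (p + (1 - tau1) * d) / R         # backward induction from T down to 0

f1 = (1 - tau1) * d / (R - 1)            # old fundamental value
f2 = (1 - tau2) * d / (R - 1)            # new (lower) fundamental value
p0_closed = f1 + (f2 - f1) / R**T
assert abs(p - p0_closed) < 1e-10
assert f2 < p0_closed < f1               # p_0 lies between the two fundamentals
```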
In this section, we have shown that two conditions are important for
solving a linear difference equation: (i) whether the initial value is given;
and (ii) whether the coefficient b is smaller than one or not in absolute value.
We will show below that similar conditions apply to general multivariate
linear systems. In particular, the first condition determines whether the
variable xt is predetermined or not, and the second condition corresponds
to whether the eigenvalue is stable or not.
1.2 Lag Operators
Lag operators provide a powerful tool for solving difference equations. They
are also useful for analyzing economic dynamics and time series econometrics. We now introduce these operators.¹
Consider a sequence {X_t}_{t=-∞}^{∞}. The lag operator L on this sequence is defined by

L X_t = X_{t-1},   L^n X_t = X_{t-n},   for all n = ..., -2, -1, 0, 1, 2, ....

In addition, L^n c = c for any constant c that is independent of time. The following formulas are useful in applications:

\frac{1}{1 - λ L^n} = \sum_{j=0}^{\infty} λ^j L^{nj},   \frac{1}{(1 - λ L^n)^2} = \sum_{j=0}^{\infty} (j+1) λ^j L^{nj},

for |λ| < 1, and

(I - A L^n)^{-1} = \sum_{j=0}^{\infty} A^j L^{nj},
if all of the eigenvalues of the matrix A are inside the unit circle. Here n is any integer.

¹We refer readers to Sargent (1987) and Hamilton (1994) for a further introduction.
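The first expansion can be checked numerically: applying the truncated sum to a bounded sequence X_t yields y_t satisfying (1 - λL) y_t = X_t, i.e. y_t - λ y_{t-1} = X_t. A sketch with illustrative values (truncation is harmless since λ^j decays geometrically):

```python
# Check 1/(1 - lam*L) = sum_j lam^j L^j for |lam| < 1 on a bounded
# doubly infinite sequence (illustrative values, truncated sum).
import math

lam = 0.7
X = lambda t: math.sin(0.5 * t)          # bounded sequence, t ranging over Z

def y(t, J=150):
    # truncated sum_{j=0}^{J-1} lam^j X_{t-j}
    return sum(lam**j * X(t - j) for j in range(J))

for t in range(-5, 6):
    assert abs(y(t) - lam * y(t - 1) - X(t)) < 1e-8   # (1 - lam*L) y_t = X_t
```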
1.3 Scalar Second-Order Linear Equations
We consider the following scalar second-order linear difference equation:
xt+2 = axt+1 + bxt + czt, (1.13)
where x0 ∈ R is given, a, b, and c are real-valued constants, and {zt}t≥0 is
an exogenously given bounded sequence of real numbers. We shall use the
lag operator introduced in the previous section to solve (1.13). We write
this equation as:

(L^{-2} - a L^{-1} - b) x_t = c z_t.   (1.14)
Define the characteristic equation as
λ^2 - aλ - b = 0.
This equation has two (possibly complex) characteristic roots λ1 and λ2,
satisfying λ1 + λ2 = a and λ1λ2 = −b. As will be clear later, these roots
correspond to eigenvalues of a linear system. Suppose the two roots are distinct and neither equals 1 in modulus. We then factorize the left-hand side
of (1.14) to derive:
(L^{-1} - λ_1)(L^{-1} - λ_2) x_t = c z_t.   (1.15)
The solution method below relies on this factorization. As a result, it is often called the method of factorization (see Sargent (1987)).

An important case in economic analysis is that the modulus of one characteristic root is bigger than 1 and the other is less than 1. The solution in this case is often referred to as the saddle path solution. The associated steady state, if it exists, is called a saddle point. Without loss of generality,
we assume |λ_1| < 1 and |λ_2| > 1. It follows from (1.15) that

(L^{-1} - λ_1) x_t = \frac{c z_t}{L^{-1} - λ_2} = \frac{-c z_t}{λ_2 (1 - λ_2^{-1} L^{-1})}.   (1.16)

We then obtain:

x_{t+1} = λ_1 x_t - \frac{c}{λ_2} \sum_{j=0}^{\infty} λ_2^{-j} L^{-j} z_t = λ_1 x_t - \frac{c}{λ_2} \sum_{j=0}^{\infty} λ_2^{-j} z_{t+j}.
Given an initial value x_0, this equation gives the solution to (1.13). Note that we have assumed that {z_t} is a bounded sequence for the above infinite sum to converge. We can relax this assumption to allow growth in {z_t}.

For the general case, we can write the general solution to (1.13) as:

x_t = c_1 λ_1^t + c_2 λ_2^t + \frac{c z_t}{(L^{-1} - λ_1)(L^{-1} - λ_2)},   (1.17)
where c_1 and c_2 are two constants to be determined by two boundary conditions. We can easily verify that (1.17) gives the general solution by substituting it into (1.15) and noticing that

(L^{-1} - λ_1)(L^{-1} - λ_2) c_1 λ_1^t + (L^{-1} - λ_1)(L^{-1} - λ_2) c_2 λ_2^t = 0.
We need to compute the second term on the right-hand side of (1.17). A simple method is to use the following factorization:

\frac{1}{(L^{-1} - λ_1)(L^{-1} - λ_2)} = \frac{L^2}{(1 - λ_1 L)(1 - λ_2 L)} = \frac{L^2}{λ_1 - λ_2} \left( \frac{λ_1}{1 - λ_1 L} - \frac{λ_2}{1 - λ_2 L} \right).
Substituting this equation into (1.17) yields:

x_t = c_1 λ_1^t + c_2 λ_2^t + \frac{c}{λ_1 - λ_2} \left( \frac{λ_1}{1 - λ_1 L} - \frac{λ_2}{1 - λ_2 L} \right) z_{t-2}.   (1.18)
As we have done before, we can expand the two terms in the bracket depending on whether the characteristic roots are bigger or smaller than 1. To analyze the stability of the solution, we suppose that

\lim_{t\to\infty} \frac{c z_t}{(L^{-1} - λ_1)(L^{-1} - λ_2)} = x   (1.19)

exists and is finite. This limit is the generalized steady state of {x_t}. If z_t converges to some constant, say 1, then the steady state is x = c/(1 - a - b). There are three cases to consider:
• Both characteristic roots have modulus less than 1. Then the steady
state is globally asymptotically stable. In this case, the initial condi-
tion x0 and an additional condition are needed to pin down the two
constants c1 and c2. Furthermore, if both roots are positive, then xt
converges monotonically to the steady state. If at least one of the
roots is negative, then the system will flip from one side of the steady
state to the other as it converges. If both roots are complex, the the
solution spirals into the steady state. In all cases, the steady state is
called a sink.
• One characteristic root has modulus less than 1 and the other has
modulus bigger than 1. Without loss of generality, we assume |λ1| < 1
and |λ2| > 1. To obtain a stable solution, we need c2 = 0. The constant
c1 is determined by the initial condition for x0, c1 = x0 − x. Once c1
is determined, x1 is obtained, x1 − x = λ1 (x0 − x) . We then obtain
the saddle path solution derived earlier.
• Both characteristic roots are greater than 1 in modulus. In this case,
there is no stable solution. The steady state is called a source.
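The saddle path case can be verified numerically: with |λ_1| < 1 < |λ_2|, so that a = λ_1 + λ_2 and b = -λ_1 λ_2, the path x_{t+1} = λ_1 x_t - (c/λ_2) Σ_j λ_2^{-j} z_{t+j} should satisfy (1.13). A sketch with illustrative values:

```python
# Saddle-path check for x_{t+2} = a*x_{t+1} + b*x_t + c*z_t with roots
# lam1 = 0.5, lam2 = 2 (so a = 2.5, b = -1); values are illustrative.
import math

lam1, lam2, c = 0.5, 2.0, 1.0
a, b = lam1 + lam2, -lam1 * lam2
z = lambda t: math.cos(0.4 * t)               # bounded forcing sequence

def forward_term(t, J=120):
    # truncated (c/lam2) * sum_{j>=0} lam2^{-j} z(t+j)
    return (c / lam2) * sum(lam2**(-j) * z(t + j) for j in range(J))

x = [1.5]                                     # arbitrary initial value x_0
for t in range(40):
    x.append(lam1 * x[t] - forward_term(t))   # saddle path recursion

for t in range(38):
    assert abs(x[t + 2] - (a * x[t + 1] + b * x[t] + c * z(t))) < 1e-8
```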
1.4 First-Order Linear Systems
An alternative way to solve the second-order linear difference equation (1.13)
is to transform it into a system of linear difference equations:

\begin{bmatrix} x_{t+1} \\ x_{t+2} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ b & a \end{bmatrix} \begin{bmatrix} x_t \\ x_{t+1} \end{bmatrix} + \begin{bmatrix} 0 \\ c \end{bmatrix} z_t.   (1.20)
In a similar way, we can transform any high-order linear difference equation
into a system of linear difference equations. In this section, we turn to the
study of linear systems. We consider a general system of the following form:
Axt+1 = Bxt + Czt, (1.21)
where xt ∈ Rn is endogenous, zt ∈ Rnz is a vector of exogenous forcing
variables, and A, B, C are conformable matrices. As we will show later,
many economic problems can be described by this form.
Following King and Watson (1998), we make the following regularity
assumption:
Assumption 1.4.1 (regularity) There is α such that the determinant det(Aα - B) ≠ 0.
When the pair (A,B) satisfies this assumption, we say it is regular.
This assumption is essentially a solvability condition. If this assumption is
violated, then (1.21) may not have a solution for generic exogenous sequences
{z_t}. To see this, we rewrite (1.21) as:

(A L^{-1} - B) x_t = C z_t.

If Aα - B is singular for all α, then its rows are linearly dependent, and hence there exists a row vector polynomial ψ(α) such that ψ(α)' (Aα - B) = 0 identically in α. Thus, we have ψ(L^{-1})' C z_t = 0. This equation cannot hold for all sequences {z_t}. As a simple example, consider the scalar difference equation a x_{t+1} = b x_t + c z_t. If a = b = 0, then there is no α such that det(Aα - B) ≠ 0. Thus, Assumption 1.4.1 is violated, and there is no solution to the difference equation for c z_t ≠ 0.
Sometimes, we make the following assumption:
Assumption 1.4.2 There exists T > 0 such that zt = z for all t ≥ T.
Under this assumption, it is possible for system (1.21) to have a steady
state.
Definition 1.4.1 A steady state or stationary point of a sequence {X_t} is a point X such that if X_t = X, then X_s = X for all s > t.
In words, if a sequence arrives at a steady state, then it always stays
at this state. Let Assumption 1.4.2 hold. If A − B is invertible, then the
solution to (1.21) has a unique steady state x given by:
x = (A - B)^{-1} C z.   (1.22)
We introduce the following definitions for points in an arbitrary finite
dimensional Euclidean space:
Definition 1.4.2 A sequence {X_t} is stable if there exists M > 0 such that ‖X_t‖_max < M for all t, where ‖X‖_max = max_j |X_j| for any X ∈ R^n.
This definition rules out growth in the sequence in order to derive non-
exploding solutions. In principle, we can relax this definition to allow for
growth. Whenever there is growth in a sequence, we can normalize it by the
growth factor to obtain a stable sequence.
Definition 1.4.3 A point X is asymptotically stable for the sequence {X_t} if \lim_{t\to\infty} X_t = X for some X_0. The basin (of attraction) of an asymptotically stable steady state X is the set of all points X_0 such that \lim_{t\to\infty} X_t = X.
Definition 1.4.4 A point X is globally (asymptotically) stable for the sequence {X_t} if \lim_{t\to\infty} X_t = X for any initial value X_0.
We make the following assumption throughout this section:
Assumption 1.4.3 {zt} is a stable sequence.
Our goal is to find a stable solution to (1.21). Given Assumption 1.4.2,
we will also discuss the issue of asymptotic stability.
1.4.1 Nonsingular System
The solution to (1.21) for the case of nonsingular A was first studied by Blanchard and Kahn (1980). In this case, we rewrite (1.21) as:

x_{t+1} = A^{-1} B x_t + A^{-1} C z_t.   (1.23)

The regularity assumption (Assumption 1.4.1) is always satisfied.² Define W = A^{-1} B. By the Jordan form, there exists a nonsingular matrix P (of left eigenvectors of W) such that W = P^{-1} J P, where J is a Jordan matrix:
J =
⎡⎢⎢⎣J1
J2...
Jl
⎤⎥⎥⎦ , with Ji =⎡⎢⎢⎣λi 1
λi 1... 1
λi
⎤⎥⎥⎦mi×mi
,
where λ1, ..., λl are the distinct eigenvalues ofW satisfying det(W − λI) = 0.
Here, λi has mi repeated values for all i = 1, ..., l.
Definition 1.4.5 An eigenvalue of some matrix is stable if it has modulus
less than 1. It is unstable if it has modulus greater than 1. A matrix is
stable if all its eigenvalues have modulus less than 1.
²If z is not an eigenvalue of A⁻¹B, then det(Az − B) = det(A) det(zI − A⁻¹B) ≠ 0.
Define
x∗t = Pxt and C∗ = PA−1C. (1.24)
It follows from (1.23) that
x∗t+1 = Jx∗t + C∗zt. (1.25)
Solving this equation backward, we obtain the backward-looking solution:

x∗t = J^t x∗0 + Σ_{j=0}^{t−1} J^j C∗ zt−1−j,   t ≥ 1,   (1.26)

where J^t = diag(J1^t, J2^t, ..., Jl^t), and each block Ji^t is the upper triangular matrix

        ⎡ λi^t   t λi^(t−1)   (t(t−1)/2) λi^(t−2)   ...   ⎤
Ji^t =  ⎢        λi^t         t λi^(t−1)            ...   ⎥
        ⎢                     ...                          ⎥
        ⎣                                           λi^t  ⎦ .

If the initial value x0 is exogenously given, then x∗0 = Px0 is determined. By (1.26), we can then solve for {x∗t}. Transforming back, we obtain the solution to (1.23): xt = P⁻¹x∗t. We next study the stability issue:
Theorem 1.4.1 Suppose A and I − W are nonsingular and Assumption 1.4.2 is satisfied. Let x be given by (1.22). Then there is a solution {xt} such that limt→∞ xt = x for any given x0 if and only if all eigenvalues of W are stable.
Proof. We have constructed a solution previously. Suppose zt = z for t ≥ T. As in (1.26), we can derive:

x∗t+1 = J^(t−T+1) x∗T + Σ_{j=0}^{t−T} J^j C∗ zt−j = J^(t−T+1) x∗T + Σ_{j=0}^{t−T} J^j C∗ z.
If all eigenvalues of W are stable, we obtain

lim_{t→∞} x∗t+1 = lim_{t→∞} Σ_{j=0}^{t−T} J^j C∗ z = (I − J)⁻¹ C∗ z.

Thus, by definition (1.24),

lim_{t→∞} xt+1 = lim_{t→∞} P⁻¹ x∗t+1 = P⁻¹ (I − J)⁻¹ C∗ z = (I − W)⁻¹ A⁻¹ C z = x.

The converse is also true.
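Theorem 1.4.1 can be illustrated numerically. The sketch below (in Python with NumPy rather than the Matlab tools used later in this chapter; the matrices are purely illustrative) iterates xt+1 = Wxt + A⁻¹Cz with a stable W and checks convergence to the steady state x = (A − B)⁻¹Cz of (1.22):

```python
import numpy as np

# Illustrative matrices; all eigenvalues of W = A^{-1}B lie inside the unit circle.
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
B = np.array([[1.0, 0.4],
              [0.1, 0.5]])
C = np.array([1.0, 1.0])
z = 1.0

W = np.linalg.solve(A, B)                        # W = A^{-1} B
assert np.max(np.abs(np.linalg.eigvals(W))) < 1  # stability of W

# Iterate x_{t+1} = W x_t + A^{-1} C z from an arbitrary initial value.
x = np.array([5.0, -3.0])
for _ in range(500):
    x = W @ x + np.linalg.solve(A, C) * z

# Steady state, as in (1.22): x_bar = (A - B)^{-1} C z.
x_bar = np.linalg.solve(A - B, C) * z
```

Replacing B so that some eigenvalue of W lies outside the unit circle makes the iteration diverge for generic x0, which is the "source" case discussed next.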
If the condition in this theorem is satisfied, the steady state is called a
sink. If all eigenvalues of W lie outside of the unit circle, then the steady
state is called a source. What if some components of the initial value x0 are
not given? For the deterministic case, we call a variable predetermined if
its initial value is exogenously given. Otherwise, it is a non-predetermined
or jump variable. In the presence of non-predetermined variables, we cannot
use the backward-looking solution (1.26) and hence Theorem 1.4.1 does not
apply.
Next, we turn to this case. We order components of xt as xt = [k′t, y′t]′
where the vector yt contains ny non-predetermined variables and the vector
kt contains nk = n−ny predetermined variables. SupposeW has nu unstable
eigenvalues and ns = n−nu stable eigenvalues. We rule out the case in which
some eigenvalues are on the unit circle. We partition its Jordan matrix J
as:
J = [ Js  0  ]
    [ 0   Ju ],

where all eigenvalues of the ns × ns matrix Js are stable and all eigenvalues of the nu × nu matrix Ju are unstable. We partition x∗t and C∗ as

x∗t = [ st ]        C∗ = [ C∗s ]
      [ ut ],            [ C∗u ],

so that they are conformable with the partition of J. We then rewrite (1.25) as:

[ st+1 ]   [ Js  0  ] [ st ]   [ C∗s ]
[ ut+1 ] = [ 0   Ju ] [ ut ] + [ C∗u ] zt.   (1.27)
By (1.24), we write

[ st ]     [ kt ]   [ Psk  Psy ] [ kt ]
[ ut ] = P [ yt ] = [ Puk  Puy ] [ yt ],   (1.28)

where we partition P conformably. For example, Puy is an nu × ny matrix. Let R = P⁻¹. By definition,

[ kt ]     [ st ]   [ Rks  Rku ] [ st ]
[ yt ] = R [ ut ] = [ Rys  Ryu ] [ ut ],   (1.29)

where we have partitioned R conformably.
Now, we are ready to prove the following theorem due to Blanchard and
Kahn (1980).
Theorem 1.4.2 Suppose the following three conditions hold: (i) the number
of unstable eigenvalues of W = A−1B is equal to the number of nonpredeter-
mined variables, i.e., nu = ny; (ii) Puy is nonsingular; and (iii) W has no
eigenvalues on the unit circle. Then the system (1.23) has a unique stable
solution for any given k0 and any stable sequence {zt}.
Proof. We first solve for ut. We use the lag operator L to rewrite
equation (1.27) as:
L−1ut = Juut + C∗uzt.
Because Ju contains only unstable eigenvalues, solving forward yields the forward-looking solution:

ut = −(I − L⁻¹Ju⁻¹)⁻¹ Ju⁻¹ C∗u zt = −Σ_{j=0}^{∞} Ju^(−j−1) C∗u zt+j,   (1.30)

where we need {zt} to be stable for the series to converge.
By equation (1.28), we obtain:
st = Psyyt + Pskkt, (1.31)
ut = Puyyt + Pukkt. (1.32)
When Puy is nonsingular, we can solve for yt as:
yt = Puy⁻¹ ut − Puy⁻¹ Puk kt.   (1.33)
To solve for kt, we use (1.29) to derive:

kt+1 = Rku ut+1 + Rks st+1.

Substituting (1.27) into this equation yields:

kt+1 = Rku (Ju ut + C∗u zt) + Rks (Js st + C∗s zt)
     = Rku (Ju ut + C∗u zt) + Rks [Js (Psy yt + Psk kt) + C∗s zt]
     = Rku (Ju ut + C∗u zt) + Rks Js Psy (Puy⁻¹ ut − Puy⁻¹ Puk kt) + Rks Js Psk kt + Rks C∗s zt,

where we have substituted (1.31) for st in the second equality and have used (1.33) to substitute for yt in the third equality. Simplifying the equation yields:

kt+1 = Rks Js Rks⁻¹ kt + (Rku C∗u + Rks C∗s) zt + (Rku Ju Puy + Rks Js Psy) Puy⁻¹ ut,   (1.34)

where, by the partition of R and the assumption that Puy is nonsingular, we deduce that Rks is also nonsingular and

Rks = [Psk − Psy Puy⁻¹ Puk]⁻¹.
Finally, (1.30) and (1.33) give the solution for yt. Note that because Js contains all stable eigenvalues, (1.34) implies that {kt} is stable.

The first two conditions in the theorem are often called the Blanchard-
Kahn conditions in the literature. They are equivalent to the conditions
that (i) ns = nk (order condition) and (ii) Rks is invertible (rank condi-
tion), given that we rule out eigenvalues on the unit circle. The intuition
behind the proof is the following. We first use a forward-looking solution
to determine the transformed variables related to unstable eigenvalues. The
number of these transformed variables is equal to the number of unstable
eigenvalues. Next, to transform these variables back to the original non-
predetermined variables, we need a certain matrix to be invertible and also
the number of non-predetermined variables to be equal to the number of un-
stable eigenvalues. Finally, we use a backward-looking solution to determine
the predetermined variables. The above proof of the theorem is construc-
tive and gives a solution algorithm to derive the stable solution of the linear
system (1.23).
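The order condition nu = ny can be checked mechanically for any given (A, B). A minimal sketch (Python; the 2 × 2 matrices and the variable count are illustrative):

```python
import numpy as np

def bk_order_condition(A, B, n_jump):
    """Order condition only: the number of eigenvalues of W = A^{-1}B outside
    the unit circle must equal the number of jump variables.  The rank
    condition (invertibility of P_uy) must be checked separately."""
    eig = np.linalg.eigvals(np.linalg.solve(A, B))
    n_unstable = int(np.sum(np.abs(eig) > 1))
    return n_unstable == n_jump

A = np.eye(2)
B = np.array([[0.9, 0.1],
              [0.2, 1.5]])   # one stable and one unstable eigenvalue
ok = bk_order_condition(A, B, n_jump=1)
```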
An immediate corollary of the above theorem is the following:
Corollary 1.4.1 If the conditions in Theorem 1.4.2 and Assumption 1.4.2
are satisfied, then the solution to (1.23) satisfies limt→∞ xt = x, where x is
the steady state defined in (1.22).
In this case, the steady state x is called a saddle. By (1.33), the initial
value (k0, y0) satisfies:
y0 = Puy⁻¹ u0 − Puy⁻¹ Puk k0.   (1.35)
All points (k0, y0) satisfying the above equation form an nk-dimensional
subspace of Rn; this space is called the stable manifold or stable arm of
(1.23).
What happens if some eigenvalues of W are on the unit circle? We
note that the solution constructed in the proof of the previous theorem still
applies to this case. However, the solution may not be stable. For example,
suppose nk = 1. Then (1.34) shows that kt behaves like a random walk,
which is not stable in general. Condition (iii) in Theorem 1.4.2 is often
called a hyperbolic condition.
What happens if the Blanchard-Kahn conditions are violated?
Theorem 1.4.3 If nu > ny, then there is no solution to (1.23). If nu < ny,
then there are infinitely many solutions to (1.23).
Proof. From equation (1.32), we obtain:
u0 = Puyy0 + Pukk0, (1.36)
where k0 is given, u0 is derived from (1.30), and Puy is an nu × ny matrix. If nu > ny, there is no solution for y0. If nu < ny, there are infinitely many solutions for y0.
Intuitively, if the number of unstable eigenvalues is greater than the number of non-predetermined variables, then there are too many restrictions on the initial values of the non-predetermined variables, so these initial values cannot be determined. If the number of unstable eigenvalues is less than the number of non-predetermined variables, then there are too few restrictions on these initial values, so that infinitely many solutions satisfy them. In this case, the solution is indeterminate.
1.4.2 Singular System
In many economic problems, the assumption of nonsingular A is violated.
To illustrate this point, consider a simple version of the Cagan (1956) model.
Let Rt be the nominal interest rate, pt be the logarithm of the price level,
and mt be the logarithm of the money stock. The model consists of a
Fisher equation, Rt = pt+1 − pt, and a monetary equilibrium condition,
mt − pt = −αRt. Writing these equations as a linear system gives

[ 0  1 ] [ Rt+1 ]   [ 1   1 ] [ Rt ]   [ 0 ]
[ 0  0 ] [ pt+1 ] = [ α  −1 ] [ pt ] + [ 1 ] mt.
In this system A is singular. This example shows that whenever there are
some intratemporal relations, A will be singular.
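For this Cagan system the generalized eigenvalues can be computed directly: solving det(B − λA) = 0 gives the finite eigenvalue λ = (1 + α)/α, and the singularity of A contributes an infinite (hence unstable) eigenvalue. A quick check with SciPy (the value of α is illustrative):

```python
import numpy as np
from scipy import linalg

alpha = 0.5                                  # illustrative parameter value
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])                   # singular
B = np.array([[1.0, 1.0],
              [alpha, -1.0]])

# Generalized eigenvalues lambda solving B x = lambda A x.
lam = linalg.eig(B, A, right=False)
lam_sorted = sorted(lam, key=abs)
finite_one = lam_sorted[0]                   # should equal (1 + alpha)/alpha
# lam_sorted[1] is infinite (or numerically huge) because A is singular.
```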
King and Watson (1998, 2002), Sims (2000), and Klein (2000) provide algorithms to solve the case of singular A. We follow the approach of Klein (2000). We first introduce some definitions:
Definition 1.4.6 A scalar λ ∈ C is a generalized eigenvalue of the matrices (A,B) if there is a nonzero vector x ∈ Cⁿ such that Bx = λAx.³ We call x the corresponding right generalized eigenvector.
Let λ (A,B) denote the set of all generalized eigenvalues of (A,B) . Let
dij denote the element in the ith row and the jth column of any matrix D.
Theorem 1.4.4 (the complex generalized Schur form) Let the n × n ma-
trices A and B be regular, i.e., Assumption 1.4.1 holds. Then there exist
unitary n× n matrices of complex numbers Q and Z such that
1. QAZ = S is upper triangular;
2. QBZ = T is upper triangular;
3. For each i, sii and tii are not both zero;
4. λ (A,B) = {tii/sii : sii ≠ 0};
5. The pairs (sii, tii) , i = 1, ..., n can be arranged in any order.
The proof can be found in Golub and van Loan (1996) and is omitted
here. The generalized Schur form is also called the QZ decomposition. Note
that the possibility that sii = tii = 0 for some i is ruled out by Assumption
1.4.1. Note that λ (A,B) may contain fewer than n elements, since if A is singular, we may have sii = 0 for some i. Since sii and tii cannot both equal zero, we interpret an eigenvalue corresponding to sii = 0 as infinite, so it is unstable. Let S and T be arranged in such a way that the ns stable
³C is the space of complex numbers.
generalized eigenvalues come first. The remaining nu generalized eigenvalues
are unstable. Define x∗t = Z^H xt. As in the previous subsection, we use the
partition:
xt = [ kt ]  nk × 1        x∗t = [ st ]  ns × 1
     [ yt ]  ny × 1,             [ ut ]  nu × 1.

We also partition Z conformably with the above partition:

Z = [ Z11  Z12 ]
    [ Z21  Z22 ],
where, for example, Z11 is ns × nk.
Theorem 1.4.5 Suppose the following conditions hold: (i) ns = nk; (ii)
Z11 is invertible; and (iii) (A,B) has no generalized eigenvalue on the unit
circle. Then system (1.21) has a unique stable solution for any given k0 and
any stable sequence {zt}.
Proof. We apply Theorem 1.4.4 and rewrite (1.21) as:
S x∗t+1 = T x∗t + QC zt.

We partition matrices conformably so that:

[ S11  S12 ] [ st+1 ]   [ T11  T12 ] [ st ]   [ Q1 ]
[ 0    S22 ] [ ut+1 ] = [ 0    T22 ] [ ut ] + [ Q2 ] C zt.   (1.37)
By the ordering of stable generalized eigenvalues, S11 is invertible because
all diagonal elements sii of S11 cannot be zero. In addition, T22 is also
invertible because all diagonal elements tii of T22 cannot be zero. Similar to
the proof of Theorem 1.4.2, we use the following steps.
Step 1. Solve for {ut} . By construction all generalized eigenvalues of
(S22, T22) are unstable. We can then derive the forward-looking solution:
ut = −Σ_{j=0}^{∞} (T22⁻¹ S22)^j T22⁻¹ Q2 C zt+j.   (1.38)
Step 2. Solve for {st} . By the first block of (1.37),
S11st+1 + S12ut+1 = T11st + T12ut +Q1Czt.
Manipulating it yields the backward-looking solution:
st+1 = S11⁻¹ T11 st + S11⁻¹ T12 ut − S11⁻¹ S12 ut+1 + S11⁻¹ Q1 C zt.   (1.39)

We need to determine s0. By the definition xt = Z x∗t, we obtain

kt = Z11 st + Z12 ut,   (1.40)

yt = Z21 st + Z22 ut.   (1.41)

Since Z11 is invertible by assumption, we obtain

s0 = Z11⁻¹ (k0 − Z12 u0).
Step 3. Solve for {kt} and {yt} . Equations (1.40) and (1.41) give the
solutions for {kt} and {yt} given that {st} and {ut} are obtained in Steps
1-2.
To apply the above theorem to solve the system (1.21), one has to specify
the exogenous forcing sequence {zt} . A useful specification is given by zt+1 =
Φzt where Φ is an nz × nz stable matrix. In this case, we can simplify the
computation of ut in (1.38). Another way is to rewrite the system (1.21) as:

[ Inz  0 ] [ zt+1 ]   [ Φ  0 ] [ zt ]
[ 0    A ] [ xt+1 ] = [ C  B ] [ xt ].
As a result, we need to solve a linear system for (z′t, x′t)′ without a forcing
variable. We can simplify some formulas in the proof of the above theorem.
For example, we have ut = 0. On the other hand, the transformed matrices
A and B are bigger. We have to manipulate these bigger matrices, which
may be computationally costly.
We can state a corollary similar to Corollary 1.4.1 if Assumption 1.4.2
and the conditions in Theorem 1.4.5 are satisfied. We will not repeat it here.
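In the special case zt = 0 (so that ut = 0), Steps 1-3 reduce to the policy function yt = Z21 Z11⁻¹ kt and the transition kt+1 = Z11 S11⁻¹ T11 Z11⁻¹ kt. The sketch below implements this with SciPy's ordqz (Python; the 2 × 2 system is illustrative, and production codes such as Dynare or Klein's own routines handle the general case):

```python
import numpy as np
from scipy.linalg import ordqz

# Illustrative system A x_{t+1} = B x_t with one predetermined variable k_t
# and one jump variable y_t, and no forcing term (z_t = 0).
A = np.eye(2)
B = np.array([[0.9, 0.1],
              [0.2, 1.5]])

# Generalized eigenvalues solve B x = lambda A x, so pass (B, A) to ordqz;
# sort='iuc' orders the stable (inside-unit-circle) eigenvalues first.
TT, SS, alpha, beta, Q, Z = ordqz(B, A, sort='iuc', output='complex')

ns = 1                                   # number of stable eigenvalues = n_k
Z11, Z21 = Z[:ns, :ns], Z[ns:, :ns]
S11, T11 = SS[:ns, :ns], TT[:ns, :ns]

g = (Z21 @ np.linalg.inv(Z11)).real      # policy: y_t = g k_t, from (1.40)-(1.41)
h = (Z11 @ np.linalg.inv(S11) @ T11 @ np.linalg.inv(Z11)).real  # k_{t+1} = h k_t
```

Starting from any k0 and setting y0 = g k0 then keeps the path on the stable subspace, with kt shrinking at rate |h| < 1.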
[Figure: the phase line Δxt = 0 in the (xt, yt) plane, separating the regions Δxt > 0 and Δxt < 0.]

Figure 1.2: Phase line Δxt = 0
1.5 Phase Diagrams
Phase diagrams are intuitive graphical devices showing the orbits of a two-
dimensional dynamic system. They are useful for analyzing both linear and
nonlinear systems. Here we use a linear system to illustrate their usefulness.
Let
xt+1 = axt + byt − e,
yt+1 = cxt − dyt + f.
Define Δxt = xt+1 − xt, Δyt = yt+1 − yt. We then have
Δxt = ā xt + b yt − e,

Δyt = c xt − d̄ yt + f,

where ā = a − 1 and d̄ = d + 1. We assume that ā, b, c, d̄, e, and f are positive.
We call points (xt, yt) such that Δxt = 0 or Δyt = 0 phase lines. In the
(xt, yt) plane, we draw these phase lines and the arrows indicating directions
of changes in xt and yt. These arrows are called vector fields.
Because

Δxt ≥ 0 ⟺ yt ≥ −(ā/b) xt + e/b,

Δyt ≥ 0 ⟺ yt ≤ (c/d̄) xt + f/d̄,
we deduce that xt increases if and only if it is above the phase line Δxt = 0,
and yt increases if and only if it is below the phase line Δyt = 0. Figures 1.2
and 1.3 illustrate the phaselines and vector fields:
[Figure: the phase line Δyt = 0 in the (xt, yt) plane, separating the regions Δyt > 0 and Δyt < 0.]

Figure 1.3: Phase line Δyt = 0
Combining these two figures, we obtain a phase diagram in Figure 1.4. This diagram suggests the existence of a convergent saddle path through the upper and lower quadrants; all other paths diverge.
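The qualitative conclusion can be confirmed numerically by solving Δxt = Δyt = 0 for the steady state and inspecting the eigenvalues of the transition matrix. A sketch (Python; the parameter values are illustrative and chosen to satisfy the positivity assumptions above):

```python
import numpy as np

# x_{t+1} = a x_t + b y_t - e,  y_{t+1} = c x_t - d y_t + f,
# with a - 1 > 0 and d + 1 > 0 as assumed in the text.
a, b, c, d, e, f = 1.2, 0.5, 0.3, 0.4, 1.0, 0.8
M = np.array([[a, b],
              [c, -d]])
q = np.array([-e, f])

# Steady state: s = M s + q, i.e., (I - M) s = q.
s = np.linalg.solve(np.eye(2) - M, q)

eig = np.linalg.eigvals(M)
is_saddle = int(np.sum(np.abs(eig) > 1)) == 1   # one stable, one unstable root
```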
1.6 Nonlinear Systems
We consider the following first-order nonlinear system:
f (xt+1, xt, zt) = 0, (1.42)
where xt ∈ Rn is endogenous, zt ∈ Rnz is a vector of exogenous forcing
variables, and f : R2n ×Rnz → Rn. Many high-order nonlinear systems can
[Figure: the two phase lines Δxt = 0 and Δyt = 0 and the associated vector fields in the (xt, yt) plane.]

Figure 1.4: Phase diagram
be transformed to a first-order system by a suitable change of variables as
we have done previously. In order to study the steady state of this nonlinear
system, we assume that there exists T > 0 such that zt = z for all t ≥ T. Then the steady state of {xt} defined by (1.42) is the point x such that
f (x, x, z) = 0. We now take a first-order Taylor approximation to (1.42)
about x :
0 = fx′ dxt+1 + fx dxt + fz dzt, (1.43)
where fx′, fx, and fz are the partial derivatives of f with respect to xt+1,
xt, and zt, evaluated at the steady state. We may interpret dXt for any variable Xt as the level deviation from the steady state, i.e., dXt = Xt − X̄. Sometimes we also consider a variable's log deviation from its steady state. We adopt the following notation:

X̂t = dXt/X̄ = (Xt − X̄)/X̄ ≈ ln(Xt/X̄).
Since equation (1.43) is a linear system of the form in (1.21), we may apply
the method discussed before to analyze this system. The critical conditions
for stability are based on the properties of the generalized eigenvalues of the
pair of Jacobian matrices (fx′ ,−fx) evaluated at the steady state.
Example 1.6.1 Optimal Growth
Consider a social planner’s problem:
max_{Ct, Kt+1}  Σ_{t=0}^{∞} β^t u(Ct)
subject to the resource constraint
Ct +Kt+1 − (1− δ)Kt = f (Kt) , K0 given, (1.44)
where β, δ ∈ (0, 1). Assume that u and f are increasing, concave, and twice continuously differentiable and satisfy the usual Inada conditions. The optimal allocation satisfies the system of difference equations, (1.44) and
u′(Ct) = β u′(Ct+1) (f′(Kt+1) + 1 − δ).
The steady state (K̄, C̄) satisfies the system of equations:

C̄ = f(K̄) − δ K̄,

1 = β (f′(K̄) + 1 − δ).
To analyze the stability of this steady state, we linearize the dynamical system
around the steady state:
u′′dCt = βu′′(f ′ + 1− δ
)dCt+1 + βu′f ′′dKt+1,
dCt + dKt+1 − (1− δ) dKt = f ′dKt.
Simplifying yields:

[ βu′f″/u″  1 ] [ dKt+1 ]   [ 0          1  ] [ dKt ]
[ 1         0 ] [ dCt+1 ] = [ f′+1−δ    −1 ] [ dCt ].
This is a nonsingular linear system, which is equivalent to

[ dKt+1 ]   [ 1/β             −1                ] [ dKt ]
[ dCt+1 ] = [ f″(K̄)/A(C̄)    1 − βf″(K̄)/A(C̄) ] [ dCt ],

where A(C̄) = −u″(C̄)/u′(C̄). The above coefficient matrix has trace
T = 1 + 1/β − βf″(K̄)/A(C̄) > 1 + 1/β > 2,

and determinant D = 1/β > 1. Thus, the characteristic equation λ² − Tλ + D = 0 has two positive real roots satisfying 0 < λ1 < 1 < λ2. As a result, the steady state is a saddle point.
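The saddle-point property is easy to verify numerically. The sketch below (Python; log utility, so that A(C̄) = 1/C̄, f(K) = K^α, and illustrative parameter values) computes the steady state and the two roots:

```python
import numpy as np

alpha, beta, delta = 0.33, 0.96, 0.1       # illustrative values

# Steady state from 1 = beta*(f'(K) + 1 - delta) with f(K) = K^alpha.
K = (alpha / (1/beta - 1 + delta))**(1/(1 - alpha))
C = K**alpha - delta * K
f2 = alpha * (alpha - 1) * K**(alpha - 2)  # f''(K) < 0
Acoef = 1.0 / C                            # A(C) = -u''/u' = 1/C for log utility

M = np.array([[1/beta,         -1.0],
              [f2/Acoef, 1 - beta*f2/Acoef]])
lam = np.sort(np.linalg.eigvals(M).real)
```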
Nonlinear systems can generate complicated dynamics.
Definition 1.6.1 A point x∗ is called a periodic point of period p for the dynamical system xt+1 = f(xt) if it is a fixed point of the pth iterate of f, i.e., if f^p(x∗) = x∗ and p is the smallest integer for which this is true. The set of all iterates of a periodic point, {x∗1 = x∗, x∗2 = f(x∗), ..., x∗p = f^(p−1)(x∗)}, is called a periodic orbit of period p or a p-cycle.
Here we will not study the general conditions for the existence of a
periodic cycle.4 Instead, we give a simple example taken from Stokey and
Lucas with Prescott (1989):
xt+1 = f(xt) = 4xt − 4xt².
We can compute the second iterate of f and obtain:

xt+2 = f²(xt) = 4(4xt − 4xt²) − 4(4xt − 4xt²)² = 16xt − 80xt² + 128xt³ − 64xt⁴.
Figure 1.5 plots f(x) and f²(x). It is straightforward to verify that f has two fixed points, 0 and 0.75, and that f² has four fixed points, 0, 0.3455, 0.75, and 0.9045. The two points 0.3455 and 0.9045 constitute a two-cycle.
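The two-cycle is available in closed form: solving f²(x) = x and discarding the fixed points of f yields x = (5 ± √5)/8 ≈ 0.3455 and 0.9045, the values reported above. A quick check (Python):

```python
import math

def f(x):
    # The logistic-type map from the text.
    return 4*x - 4*x*x

x1 = (5 - math.sqrt(5)) / 8    # ≈ 0.3455
x2 = (5 + math.sqrt(5)) / 8    # ≈ 0.9045
# f maps x1 to x2 and back, so each is a fixed point of f^2 but not of f.
```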
⁴See Azariadis (1993) for a good introduction to nonlinear dynamics.
[Figure: plots of f(x) and f²(x) on [0, 1].]

Figure 1.5: Two-cycle
1.7 Numerical Solutions Using Dynare
There are several Matlab codes publicly available on the web that implement the algorithms discussed earlier. Here we provide a brief introduction to a powerful and convenient Matlab-based software package, Dynare.⁵ Dynare is a software platform for solving and estimating dynamic stochastic general equilibrium (DSGE) and overlapping generations (OLG) models. It can also solve deterministic linear or nonlinear systems of difference equations. Dynare typically handles models based on the rational expectations
hypothesis. But it can also handle models where expectations are formed
differently: on one extreme, models where agents perfectly anticipate the
future; on the other extreme, models where agents have limited rational-
ity or imperfect knowledge of the state of the economy and, hence, form
their expectations through a learning process. Dynare can solve compli-
cated models that include consumers, productive firms, governments, mon-
etary authorities, investors, and financial intermediaries. Dynare can also
⁵See Adjemian et al. (2011) for a detailed introduction. Dynare can be freely downloaded from the web site: http://www.dynare.org/.
handle heterogeneous-agent models by including several distinct classes of
agents in each of the aforementioned agent categories.
A Dynare program is a .mod file. To run a Dynare program, simply type dynare filename.mod at the Matlab command line. Dynare has its own
timing convention. In particular, the timing of each variable reflects when
that variable is decided. A variable xt chosen at time t is written as x in
Dynare. Similarly, xt−j is written as x (−j) and xt+j is written as x (+j) .
A variable that was chosen at time t − 1 and is known at time t, such as the capital stock, is often denoted kt at time t, but is written as k(−1) in Dynare. This convention applies to any predetermined variable and is important for Dynare to identify which variables are predetermined. A (+1)
next to a variable tells Dynare to count the occurrence of that variable as a jump (forward-looking, or non-predetermined) variable. For other
particulars of Dynare, the reader is referred to the Reference Manual by
Adjemian et al. (2011) and the User Guide by Griffoli (2010).
In this section, we explain how to use Dynare to solve deterministic
models. In Sections 2.5 and 15, we will explain how to use Dynare to solve
stochastic models and to estimate these models, respectively.
Dynare solves the nonlinear system (1.42) using a Newton-type algorithm
(see Juillard (1996)). It is much more accurate than the linear approximation
method described in the previous section. To explain the basic idea, we let xt = (kt′, yt′)′, where kt is an nk × 1 vector of predetermined variables and yt is an ny × 1 vector of non-predetermined variables. The initial value k0 = k0∗ is given. Suppose that the terminal value yT+1 = yT+1∗ is known at time T + 1. Our objective is to solve for the (T + 1) × (nk + ny) unknowns, {k1, ..., kT+1; y0, y1, ..., yT}. There is a system of (T + 1) × (nk + ny) nonlinear equations obtained from (1.42):

F(kt, yt, kt+1, yt+1, zt) ≡ f(xt+1, xt, zt) = 0,   t = 0, 1, ..., T.
We can then use a Newton-type root-finding method to solve this system.
For infinite-horizon problems, the terminal condition is often expressed in terms of transversality conditions. We then solve for a stable solution. We use a shooting algorithm by specifying a time T + 1 by which the system has reached a steady state, so that F(k̄, ȳ, k̄, ȳ, z̄) = 0. We then set yT+1 = ȳ and use the above algorithm. Finally, we adjust T until the solution does not change within some error bound or until (kT, yT, kT+1) is sufficiently close to (k̄, ȳ, k̄). The Blanchard-Kahn conditions must be satisfied for this algorithm to find a stable solution.
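As an illustration of the stacked-system idea, the sketch below solves a transition path for a simple optimal growth model with log utility and no labor choice (Python, using SciPy's generic root finder in place of Dynare's own Newton solver; parameters, horizon, and initial condition are illustrative):

```python
import numpy as np
from scipy.optimize import fsolve

alpha, beta, delta = 0.33, 0.96, 0.1       # illustrative values

# Steady state of C_t + K_{t+1} - (1-delta)K_t = K_t^alpha with log utility.
Kbar = (alpha / (1/beta - 1 + delta))**(1/(1 - alpha))
Cbar = Kbar**alpha - delta * Kbar

T = 60            # terminal date at which we impose K_{T+1} = Kbar
K0 = 0.5 * Kbar   # start below the steady state

def residuals(x):
    C = x[:T+1]                                  # C_0, ..., C_T
    K = np.concatenate(([K0], x[T+1:], [Kbar]))  # K_0 given, K_{T+1} pinned
    res = np.empty(2*T + 1)
    # Resource constraints for t = 0, ..., T.
    res[:T+1] = C + K[1:] - K[:T+1]**alpha - (1 - delta)*K[:T+1]
    # Euler equations for t = 0, ..., T-1.
    res[T+1:] = 1/C[:-1] - (beta/C[1:])*(alpha*K[1:T+1]**(alpha-1) + 1 - delta)
    return res

# Initial guess: K interpolates from K0 to Kbar; C is then feasible.
Kguess = np.linspace(K0, Kbar, T + 2)
Cguess = Kguess[:T+1]**alpha + (1 - delta)*Kguess[:T+1] - Kguess[1:]
guess = np.concatenate((Cguess, Kguess[1:T+1]))

sol = fsolve(residuals, guess)
Kpath = np.concatenate(([K0], sol[T+1:], [Kbar]))
```

The terminal condition KT+1 = K̄ plays the role of yT+1 = ȳ above; increasing T until the path no longer changes implements the adjustment step.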
Using Dynare timing conventions, Dynare solves any deterministic model
of the following form:
F(y⁻t−1, yt, y⁺t+1, et) = 0,   (1.45)

where yt is a vector that contains the endogenous variables, y⁻t−1 is the subset of variables that enter with a lag (the predetermined variables), y⁺t+1 is the subset of variables that enter with a lead, and et is a vector of exogenous forcing variables. Note that this form is more general than (1.42) in that a variable with both a one-period lag and a one-period lead can enter the system. The initial value for y⁻−1 is given.
We use a simple example to illustrate how to use Dynare:6
max_{Ct, Nt, Kt+1}  Σ_{t=0}^{∞} β^t {ln(Ct) + χ ln(1 − Nt)},

subject to

Ct + Kt+1 − (1 − δ)Kt = zt Kt^α Nt^(1−α),   K0 given,   (1.46)

ln zt = ρ ln zt−1 + σ et,   z0 given,   (1.47)
where the deterministic sequence {et} is exogenously given. The equilibrium
consists of a system of 3 equations,

1/Ct = (β/Ct+1) (α zt+1 Kt+1^(α−1) Nt+1^(1−α) + 1 − δ),

χ Ct/(1 − Nt) = (1 − α) zt Kt^α Nt^(−α),

and (1.46), for the 3 unknowns Ct, Nt, Kt+1. Note that zt is often treated as an "endogenous" variable and equation (1.47) is used to solve for this variable. In this case, zt−1, zt, and zt+1 enter the equilibrium system in the form of (1.45).

⁶The Dynare code is deterministicgrowth.mod downloadable from ???.
A Dynare program consists of the following blocks:
Preamble. The preamble generally involves three commands that tell Dynare what the model's variables are, which of them are endogenous, and what the parameters are. The commands are:
• var starts the list of endogenous variables, to be separated by commas.
• varexo starts the list of exogenous variables that will be shocked.
• parameters starts the list of parameters and assigns values to each.
For our example, the preamble block may be written as:
var c k n z;
varexo e;
parameters beta chi delta alpha rho;
alpha = 0.33;
beta = 0.99;
delta = 0.023;
chi = 1.75;
rho = 0.95;
Model. One of the beauties of Dynare is that we can input our model’s
equations naturally using Dynare conventions. The following is a list of these
conventions:
• The model block of the .mod file begins with the command model and
ends with the command end.
• There must be as many equations as we declared endogenous variables
using var (this is actually one of the first things that Dynare checks; it will immediately let us know if there are any problems).
• As in the preamble and everywhere in the .mod file, each line of instruction ends with a semicolon (except when a line is too long and we break it across two lines; this is unlike Matlab, where breaking a line requires adding ...).
• Equations are entered one after the other; no matrix representation is
necessary. Note that variable and parameter names used in the model
block must be the same as those declared in the preamble. Note that
variable and parameter names are case sensitive.
For our example, the model block reads:
model;
(1/c) = beta*(1/c(+1))*(1+alpha*(k^(alpha-1))*exp(z(+1))*(n(+1))^(1-alpha)-delta);
chi*c/(1-n) = (1-alpha)*(k(-1)^alpha)*exp(z)*(n^(-alpha));
c + k - (1-delta)*k(-1) = (k(-1)^alpha)*exp(z)*n^(1-alpha);
z = rho*z(-1) + e;
end;
Note that both z(−1) and z(+1) appear in the model block, so Dynare treats z as both a predetermined and a non-predetermined variable. Thus, there are 3 non-predetermined variables (C, N, z) and 2 predetermined variables (K, z).
Steady State or Initial Values. The Dynare commands in this block
consist of initval, endval, steady, and check. Deterministic models do not need
to be linearized in order to be solved. Thus, technically, we do not need
to provide a steady state for these models. But in applications, researchers
typically study how the economy responds to a shock starting from an initial
steady state and ending up in a new steady state. If we wanted to shock
our model starting from a steady-state value, we would enter approximate
(or exact) steady state values in the initval block, followed by the command
steady, which computes the initial steady state. Otherwise, if we want to
begin our solution path from an arbitrary point, we will enter those values
in our initval block and not use the steady command.
Solving for the steady state is typically the most difficult part. For complicated
models, the Dynare command steady may not be able to find a steady state
because one has to start with a good initial guess to get convergence. In
this case, one may write a separate Matlab file to compute the steady state
using a reverse engineering strategy. By this strategy, one solves for some
parameter values such that some steady state values are prespecified using
moments from the data.
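For the example model, the steady state with z = 0 is available in closed form because utility is logarithmic, so exact initval entries can be generated directly. A sketch in Python (mirroring what a separate steady-state file would compute; it yields roughly k ≈ 10.3, c ≈ 0.79, n ≈ 0.33, close to the approximate values entered below):

```python
# Closed-form steady state of the example model with z = 0, using the
# parameter values from the preamble above.
alpha, beta, delta, chi = 0.33, 0.99, 0.023, 1.75

kappa = (alpha / (1/beta - 1 + delta))**(1/(1 - alpha))   # K/N from the Euler equation
psi = (1 - alpha)*kappa**alpha / (chi*(kappa**alpha - delta*kappa))
n = psi / (1 + psi)                         # from the labor-supply condition
k = kappa * n
c = k**alpha * n**(1 - alpha) - delta*k     # from the resource constraint
```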
Following steady, one may use the command check to display the gen-
eralized eigenvalues to check the Blanchard-Kahn condition. For our ex-
ample, we need 3 generalized eigenvalues larger than 1 in modulus for the
Blanchard-Kahn condition to hold.
If the shock is temporary, then the new steady state is identical to the
initial steady state. The endval block is not needed. For our example, if
we consider the impact of a temporary technology shock, then we write the
block for the steady state or initial values as:
initval;
k = 9;
c = 0.76;
n = 0.3;
z = 0;
e = 0;
end;
steady;
If the shock is permanent, then we have to use the command endval
following the initval and steady block. The endval block gives the terminal
condition after a permanent shock. This terminal condition is often the new
steady state. Thus, the command steady is often used after the endval block.
For our example, if we consider the impact of a permanent technology
shock when et changes from 0 to 0.01 permanently, we will write:
endval;
k = 9;
c = 0.76;
n = 0.3;
z = 0;
e = 0.01;
end;
steady;
The output of steady is saved in the vector oo_.steady_state. Each element corresponds to a variable defined in var, arranged in the same order.
Shocks. To study the impact of a temporary shock, the duration and
the level of the shock must be specified. To specify a shock that lasts 10
periods on et, for instance, we would write:
shocks;
var e; periods 1:10;
values 0.01;
end;
Given the above codes, Dynare would replace the value of et specified
in the initval block with the value of 0.01 for periods 1 to 10. Note that we
could have entered future periods in the shocks block, such as periods 5:10,
in order to study the impact of an anticipated temporary shock.
If the shock is permanent and starts immediately in period 1, then the shocks block is not needed. But if the permanent shock starts in the future, we have to add a shocks block after the endval block to "undo" the first several periods of the permanent shock. For instance, suppose that the
technology shock et moves to 0.01 permanently starting in period 10. Then
we would follow the above endval block with:
shocks;
var e ; periods 1:9;
values 0;
end;
Computation. The Dynare command for this block is simul. Its option periods specifies the number of simulation periods. In applications, one may choose an arbitrarily large number of periods and then adjust this number until the solution does not change within an error bound. The output is stored in the matrix oo_.endo_simul. The variables are arranged row by row, in order of declaration in the command var. Note that oo_.endo_simul also contains the initial and terminal conditions, so it has 2 more columns than the value of the periods option.
For our example, we write:
simul(periods=200);
Results. The final block presents the results of the computation in
terms of graphs and tables. The results can also be stored in some variables.
This block is not required by Dynare and is user defined. In our example, to plot the solutions for the impact of the temporary shocks, we write:
tt = 1:202;
figure
subplot(2,2,1);
plot(tt, oo_.endo_simul(1,:), tt, oo_.steady_state(1)*ones(1,202));
title('C')
subplot(2,2,2);
plot(tt, oo_.endo_simul(2,:), tt, oo_.steady_state(2)*ones(1,202));
title('K')
subplot(2,2,3);
plot(tt, oo_.endo_simul(3,:), tt, oo_.steady_state(3)*ones(1,202));
title('N')
subplot(2,2,4);
plot(tt, oo_.endo_simul(4,:));
title('z')
We present the results in Figure 1.6.
1.8 Exercises
1. Consider the deterministic Cagan (1956) model:
mt − pt = −α (pᵉt+1 − pt),   α > 0,

where mt is the log of the nominal money supply and pt is the log of the price level. Under rational expectations, pᵉt+1 = pt+1. Suppose
that money supply satisfies
mt+1 = ρmt + μ, ρ ∈ [0, 1], m0 given.
(a) Give conditions on the parameters such that there exists a unique stable solution for pt. Derive this solution.
[Figure: four panels plotting the simulated paths of C, K, N, and z against their steady-state levels.]

Figure 1.6: Transitional dynamics in response to a temporary technology shock.
(b) Suppose there is a public announcement that the money supply will
increase on a future date T. In particular, the money supply mt = m
for t < T , and mt = m′ > m for t ≥ T. Derive the path for {pt} .
2. Consider a version of the Dornbusch (1976) model:
yᵈt = δ (et + p∗ − pt) − γ rt + φ y + g,

pt+1 − pt = θ (yᵈt − y),

mt − pt = α y − β rt,

rt = r∗ + et+1 − et,

where yᵈt is the domestic aggregate demand, et is the nominal exchange
rate, mt is the exogenous domestic money supply, pt is the price level,
y is the constant domestic output supply, g is the constant domestic
government spending, rt is the domestic nominal interest rate, and an
asterisk denotes the corresponding foreign variable. All these variables
are in terms of the logarithm. Assume that all parameters are positive.
In addition, pt is sluggish to adjust so that it is predetermined.
(a) Derive the steady state of this model.
(b) Give conditions on parameter values such that there is a unique
stable solution.
(c) Draw the phase diagram.
(d) Assume mt = m for all t. Analyze the effect of an unexpected
permanent increase in the money supply from m to m′ at date zero,
assuming that the economy is in the steady state initially.
(e) In the previous question, suppose there is a public announcement
that the money supply will increase to m′ on a future date T, i.e., mt =
m for t < T , and mt = m′ > m for t ≥ T. Analyze the equilibrium
dynamics.
3. Consider the example studied in Section 1.7. What will happen if you
set the option periods = 50 in simul? Experiment with other specifications
and discuss how to choose this option.
4. Consider the example studied in Section 1.7. Use Dynare to solve for
output Yt, investment It, the wage rate wt, and the rental rate Rt. Plot
transitional dynamics for these variables in response to
(a) a temporary shock such that et is equal to 0.015 from periods 5 to
10;
(b) a permanent shock such that et = 0.015 starting from period 1 on.
Chapter 2

Stochastic Difference Equations
In this chapter, we introduce stochastic shocks into the deterministic sys-
tems studied in Chapter 1. The highlight is the solution methods for linear
rational expectations models. We fix a probability space (Ω,F ,Pr) and a fil-
tration {Ft}t≥0 throughout this chapter.1 A stochastic process {Xt}t≥0
with state space X is a sequence of random variables valued in X. We say
that {Xt} is adapted to {Ft} if Xt is measurable with respect to Ft for each t = 0, 1, 2, .... Unless we make it explicit, all stochastic processes in this
chapter are assumed to be adapted to {Ft} and valued in some finite dimen-
sional Euclidean space. We use Et to denote the conditional expectation
operator given information Ft.
2.1 First-Order Linear Systems
We often assume that exogenous shocks {zt} are governed by the following
first-order linear difference equation:
zt+1 = Φzt +Σεt+1, z0 given, (2.1)
1Formally, {Ft}t≥0 is an increasing sequence of σ-algebras F0 ⊂ F1 ⊂ F2... ⊂ F .
where {zt} is an Rnz -valued stochastic process, Φ is an nz ×nz matrix, Σ is
an nz × nε matrix, and {εt+1} is a stochastic process satisfying:
Etεt+1 = 0 and Etεt+1ε′t+1 = I, (2.2)
where I is an identity matrix. Many stochastic processes can be written as
in (2.1).
Example 2.1.1 Scalar first-order autoregression with nonzero mean: Let
yt+1 = byt + d+ σεt+1,
where b, d, and σ are constants satisfying 1 − b ≠ 0. Define z_t = y_t − d/(1 − b). We obtain:
zt+1 = bzt + σεt+1.
Example 2.1.2 First-order scalar mixed moving average and autoregres-
sion: Let
yt+1 = byt + εt+1 + cεt,
where b and c are constants. Rewrite the above equation as:

\begin{bmatrix} y_{t+1} \\ \varepsilon_{t+1} \end{bmatrix} = \begin{bmatrix} b & c \\ 0 & 0 \end{bmatrix} \begin{bmatrix} y_t \\ \varepsilon_t \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} \varepsilon_{t+1}.
Example 2.1.3 Vector autoregression: Let
yt+1 = A1yt +A2yt−1 + Cεt+1,
where {y_t} is an R^n-valued stochastic process. We rewrite the above equation as:

\begin{bmatrix} y_{t+1} \\ y_t \end{bmatrix} = \begin{bmatrix} A_1 & A_2 \\ I & 0 \end{bmatrix} \begin{bmatrix} y_t \\ y_{t-1} \end{bmatrix} + \begin{bmatrix} C \\ 0 \end{bmatrix} \varepsilon_{t+1}.
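As a quick numerical check of this companion-form construction, the following sketch (with illustrative coefficients a1 and a2, not taken from the text) stacks a scalar AR(2) into first-order form and confirms covariance stationarity by inspecting the eigenvalues of the companion matrix:

```python
import numpy as np

# Scalar AR(2): y_{t+1} = a1*y_t + a2*y_{t-1} + eps_{t+1},
# stacked into the first-order form of Example 2.1.3 (illustrative a1, a2).
a1, a2 = 0.5, 0.3
A = np.array([[a1, a2],
              [1.0, 0.0]])   # companion matrix [A1 A2; I 0]

# The stacked process has a unique stationary distribution when every
# eigenvalue of the companion matrix lies inside the unit circle.
moduli = np.abs(np.linalg.eigvals(A))
assert moduli.max() < 1.0
```

The same construction extends to vector autoregressions by replacing the scalars with the blocks A1, A2, and I.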
We are interested in characterizing the first and second moments of {zt} defined in (2.1).
Definition 2.1.1 A stochastic process {Xt} is covariance stationary if
it has a constant mean m = EX_t, finite second moments E X_t^2 < ∞, and
a covariance matrix E (X_t − m)(X_s − m)′ that depends only on the time
difference |t − s|.
Assume that all eigenvalues of the matrix Φ are inside the unit cir-
cle. Then {zt} has a unique stationary distribution. Assume z0 is drawn
from the stationary distribution with mean μ0 = Ez0 and covariance matrix
E (z0 − μ0)(z0 − μ0)′. We shall compute these moments and all the autocovariance
matrices as well. Taking mathematical expectations on both sides of equation
(2.1), we obtain μt+1 = Φμt, where μt = Ezt. If all eigenvalues of the matrix
Φ are inside the unit circle, then μt → 0. Thus, when we set μ0 = 0, we have
μt = 0 for all t.
Next, we compute the stationary covariance matrix, which satisfies:

C_z(0) = \Phi C_z(0) \Phi' + \Sigma \Sigma'.

This equation is a discrete Lyapunov equation in the n_z × n_z matrix C_z(0).
It can be solved numerically by successive iteration. To compute the auto-
covariance matrix, we note that
zt+j = Φjzt +Σεt+j +ΦΣεt+j−1 + ...+Φj−1Σεt+1.
Postmultiplying both sides by z′t and taking expectations show that the
autocovariance matrix satisfies:
C_z(j) = E\left( z_{t+j} z_t' \right) = \Phi^j C_z(0).
After we solve for C_z(0), the above equation gives the whole sequence of
autocovariance matrices. We call this sequence the autocovariogram.
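The successive-iteration scheme mentioned above is short to code. Though the book works in Matlab/Dynare, here is a NumPy sketch under illustrative Φ and Σ (not from the text); it iterates C ← ΦCΦ′ + ΣΣ′ to a fixed point and then builds autocovariances as Φ^j C_z(0):

```python
import numpy as np

def lyapunov_iterate(Phi, Sigma, tol=1e-12, max_iter=100_000):
    """Solve C = Phi C Phi' + Sigma Sigma' by successive iteration."""
    S = Sigma @ Sigma.T
    C = S.copy()
    for _ in range(max_iter):
        C_next = Phi @ C @ Phi.T + S
        if np.max(np.abs(C_next - C)) < tol:
            return C_next
        C = C_next
    raise RuntimeError("iteration did not converge")

Phi = np.array([[0.8, 0.1],
                [0.0, 0.5]])   # all eigenvalues inside the unit circle
Sigma = np.array([[1.0, 0.0],
                  [0.2, 0.6]])

C0 = lyapunov_iterate(Phi, Sigma)
# fixed-point property: C_z(0) = Phi C_z(0) Phi' + Sigma Sigma'
assert np.allclose(C0, Phi @ C0 @ Phi.T + Sigma @ Sigma.T)

# autocovariogram: C_z(j) = Phi^j C_z(0)
autocov = [np.linalg.matrix_power(Phi, j) @ C0 for j in range(5)]
```

The iteration converges geometrically at the rate of the squared spectral radius of Φ, which is why stability of Φ is assumed.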
Now we turn to the impulse response function. We still assume that all
eigenvalues of the matrix Φ are inside the unit circle. Using the lag operator
L, we obtain:
(I − ΦL)zt+1 = Σεt+1.
Thus,
zt+1 = Σεt+1 +ΦΣεt +Φ2Σεt−1 + ...+ΦjΣεt+1−j + ....
Viewed as a function of the lag j, ΦjΣ is called the impulse response
function. It gives the impact of ε_{t+1} on future values z_{t+1+j} for all j. Equivalently,
it gives the impact of all past ε_{t+1−j} on z_{t+1}.
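Since Φ^j Σ is all that is needed, impulse responses are direct to compute. A sketch with illustrative matrices (not from the text):

```python
import numpy as np

# Impulse responses Phi^j Sigma for an illustrative stable Phi.
Phi = np.array([[0.9, 0.1],
                [0.0, 0.5]])
Sigma = np.array([[1.0],
                  [0.5]])
irf = np.stack([np.linalg.matrix_power(Phi, j) @ Sigma for j in range(21)])

# j = 0 gives the impact response Sigma itself; with all eigenvalues
# of Phi inside the unit circle, the responses die out geometrically.
assert np.allclose(irf[0], Sigma)
assert np.abs(irf[20]).max() < np.abs(irf[0]).max()
```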
2.2 Scalar Linear Rational Expectations Models
Scalar linear rational expectations models can typically be solved analyt-
ically. Here we introduce two widely adopted methods: the lag operator
method and the method of undetermined coefficients.
2.2.1 Lag Operators
We adopt the following rule for the lag operator L on any stochastic process
{Xt} :
LjXt = Xt−j , LjEtXt+i = EtXt+i−j ,
L−jXt = EtXt+j , i ≥ 0, j ≥ 0.
According to this rule, the lag operator shifts the date of the variable Xt+i,
but not the date of the conditional expectation Et. In addition, because
LjEtXt+j = EtXt = Xt, we define the inverse lag operator L−j in the above
way.
To illustrate the usefulness of lag operators, we first consider the follow-
ing scalar first-order difference equation:
Etxt+1 = bxt + czt,
where b, c are constants, and {zt} is a bounded stochastic process. Suppose
|b| > 1. Using the inverse lag operator, we rewrite the above equation as:

\left( L^{-1} - b \right) x_t = c z_t.
Thus, we obtain the forward-looking solution or the fundamental solution:
x_t = \frac{c z_t}{L^{-1} - b} = \frac{-c z_t}{b \left( 1 - b^{-1} L^{-1} \right)} = -\frac{c}{b} \sum_{j=0}^{\infty} b^{-j} L^{-j} z_t = -\frac{c}{b} \sum_{j=0}^{\infty} b^{-j} E_t z_{t+j}. \tag{2.3}
In general, there is a solution with bubbles:
xt = x∗t +Bt
where x_t^* denotes the fundamental solution given in (2.3) and B_t is the bubble
solution satisfying
EtBt+1 = bBt. (2.4)
If we impose a transversality (or no-bubble) condition:
\lim_{T \to \infty} b^{-T} E_t x_{t+T} = 0, \tag{2.5}
then bubbles cannot exist.
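As a concrete check, suppose for this sketch that z_t follows a stationary AR(1) with persistence ρ, so that E_t z_{t+j} = ρ^j z_t. The forward sum then collapses to x_t = −c z_t/(b − ρ), which can be verified directly against the difference equation E_t x_{t+1} = b x_t + c z_t (the parameter values below are illustrative):

```python
# When z_t = rho*z_{t-1} + eps_t, E_t z_{t+j} = rho^j z_t and the
# fundamental solution sums to x_t = -c z_t / (b - rho).
b, c, rho = 2.0, 1.0, 0.5     # illustrative values with |b| > 1
z_t = 0.7
x_t = -c * z_t / (b - rho)

# check against a truncated version of the forward sum
series = -(c / b) * sum((rho / b) ** j for j in range(200)) * z_t
assert abs(series - x_t) < 1e-10

# check the original equation E_t x_{t+1} = b x_t + c z_t,
# where E_t x_{t+1} = k * E_t z_{t+1} with k = -c/(b - rho)
Et_x_next = -c * (rho * z_t) / (b - rho)
assert abs(Et_x_next - (b * x_t + c * z_t)) < 1e-12
```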
Example 2.2.1 Asset price bubbles (Blanchard and Watson (1982)).
Suppose the stock price satisfies the asset pricing equation:
p_t = \frac{E_t p_{t+1} + d_t}{R},
where R > 1 and dt satisfies
dt = ρdt−1 + εt, |ρ| < 1.
Assume εt is independently and identically distributed (IID) with mean zero.
The general solution for the stock price is given by:
p_t = \frac{d_t}{R - \rho} + B_t,
where the bubble {Bt} satisfies
B_{t+1} = \begin{cases} R B_t / q + e_{t+1} & \text{with probability } q, \\ e_{t+1} & \text{with probability } 1 - q, \end{cases} \qquad E_t e_{t+1} = 0.
This process satisfies (2.4). The bubble bursts with probability 1 − q each
period and continues with probability q. If it bursts, it returns to zero in
expected value. Imposing a transversality condition like (2.5) will rule out
bubbles.
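A short simulation makes the bursting-bubble process concrete (R and q are illustrative values; drawing the innovation e_{t+1} as small Gaussian noise is an assumption of this sketch):

```python
import numpy as np

# Blanchard-Watson bursting bubble with illustrative R and q.
rng = np.random.default_rng(0)
R, q = 1.05, 0.9

B = 1.0
# Conditional expectation over the two branches (E_t e_{t+1} = 0):
# q * (R*B/q) + (1 - q) * 0 = R * B, which is exactly (2.4) with b = R.
assert abs(q * (R * B / q) - R * B) < 1e-12

# One simulated path: the bubble grows at rate R/q while alive and
# collapses (up to the innovation e) with probability 1 - q each period.
path = [B]
for _ in range(100):
    e = 0.01 * rng.standard_normal()
    B = R * B / q + e if rng.random() < q else e
    path.append(B)
```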
Next, we apply lag operators to solve the following stochastic second-
order difference equation:
Etxt+1 = axt + bxt−1 + czt, (2.6)
where x0 ∈ R is given, a, b, and c are real-valued constants, and {zt}t≥0
is an exogenously given bounded stochastic process. We rewrite the above equation as:

\left( L^{-1} - a - bL \right) x_t = c z_t. \tag{2.7}
Let λ1 and λ2 be the two characteristic roots defined in Section 1.3.
Assume |λ1| < 1 and |λ2| > 1. We rewrite (2.7) as:
\left( L^{-1} - \lambda_1 \right) \left( L^{-1} - \lambda_2 \right) L x_t = c z_t. \tag{2.8}
It follows from (1.15) that
\left( L^{-1} - \lambda_1 \right) L x_t = \frac{c z_t}{L^{-1} - \lambda_2} = \frac{-c z_t}{\lambda_2 \left( 1 - \lambda_2^{-1} L^{-1} \right)}. \tag{2.9}
We then obtain the solution:
x_t = \lambda_1 x_{t-1} - \frac{c}{\lambda_2} \sum_{j=0}^{\infty} \lambda_2^{-j} L^{-j} z_t = \lambda_1 x_{t-1} - \frac{c}{\lambda_2} \sum_{j=0}^{\infty} \lambda_2^{-j} E_t z_{t+j}.
Finally, we consider the following stochastic second-order difference equa-
tion:
Etxt+1 = a0xt + a1Et−1xt + a2xt−1 + czt, (2.10)
where a0, a1, a2, and c are constants, and {zt} is a bounded stochastic pro-
cess. A special feature of this equation is that it involves a conditional
expectation at date t − 1. To solve this equation, we first take conditional
expectations given information at t− 1 to derive:
Et−1xt+1 = a0Et−1xt + a1Et−1xt + a2xt−1 + cEt−1zt.
Next, we apply lag operators and rewrite the above equation as:
(L−1 − (a0 + a1)− a2L)Et−1xt = cEt−1zt.
This equation is similar to (2.7) and can be solved in the same way. After
we solve for Et−1xt, we may obtain the solution for Etxt+1 by shifting one
period forward. Finally, substituting the solutions for Et−1xt and Etxt+1
into (2.10), we obtain the solution for xt.
2.2.2 The Method of Undetermined Coefficients
The method of undetermined coefficients is extremely useful for solving lin-
ear models. To apply this method, we first guess the form of the solution and
then verify the guess and solve the undetermined coefficients. We use the
stochastic second-order difference equation (2.6) to illustrate this method.
We assume that {zt} satisfies:
zt = ρzt−1 + εt,
where |ρ| < 1 and εt is IID with mean zero.
We guess the solution is linear too:
xt = Gxt−1 +Hzt,
where G and H are coefficients to be determined. We substitute this guess
into (2.6) to derive:
E_t (G x_t + H z_{t+1}) = G^2 x_{t-1} + GH z_t + \rho H z_t = a (G x_{t-1} + H z_t) + b x_{t-1} + c z_t.
Matching coefficients on xt−1 and zt gives
G2 = aG+ b,
GH + ρH = aH + c.
The first equation is quadratic, which has two roots. Assume that one root
has modulus less than 1 and the other has modulus greater than 1. To have
a stable and nonexploding solution, we pick the solution with modulus less
than 1. Once we obtain G, the second equation above gives the solution for
H.
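These two steps can be sketched numerically (the coefficients below are illustrative, chosen so that the quadratic has one root inside and one outside the unit circle):

```python
import numpy as np

# Solve G^2 = a*G + b for the stable root and back out H
# (illustrative coefficients, not from the text).
a, b, c, rho = 2.5, -1.0, 1.0, 0.5
roots = np.roots([1.0, -a, -b])          # roots of G^2 - a*G - b = 0
G = roots[np.abs(roots) < 1.0][0]        # keep the root with modulus < 1
H = c / (G + rho - a)                    # from G*H + rho*H = a*H + c

# verify that the guess x_t = G x_{t-1} + H z_t satisfies both
# matching conditions
assert abs(G**2 - (a * G + b)) < 1e-10
assert abs(G * H + rho * H - (a * H + c)) < 1e-10
```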
2.3 Multivariate Linear Rational Expectations Models
Multivariate linear rational expectations models can be typically character-
ized by the following first-order system:
AEtxt+1 = Bxt + Czt (2.11)
where xt is an n × 1 random vector, A and B are n × n matrices, C is
an n× nz matrix, and zt is an nz-dimensional exogenously given stochastic
process. We maintain the regularity Assumption 1.4.1 for the same reason
as discussed in Section 1.4.
Definition 2.3.1 A stochastic process {Xt} is stable if there exists M > 0
such that ‖X_t‖_max < M for all t, where ‖x‖_max = max_j E|x_j| for any R^n-valued random variable x.2
We assume that {zt} is a stable stochastic process. In applications, we
typically assume that {zt} takes the form as in (2.1). We intend to solve for
a stable solution to (2.11).
Two issues are important for solving (2.11): (a) the matrix A may be
singular. (b) Some components of xt are predetermined and others are
non-predetermined. The three methods discussed below deal with these two
issues in different ways.
2.3.1 The Blanchard and Kahn Method
We first introduce the Blanchard and Kahn (1980) method. As in the de-
terministic case discussed in Section 1.4.1, Blanchard and Kahn (1980) as-
sume that A is nonsingular. As a result, we define W = A−1B and apply
the Jordan decomposition to W. We can then apply a method similar to the
deterministic case studied in Section 1.4.1. However, we have to redefine
the notion of predetermined variables. According to Blanchard and Kahn
(1980), a stochastic process {Xt}t≥0 is predetermined if X0 is exogenously
given and if Xt+1 is measurable with respect to Ft, i.e., E [Xt+1|Ft] = Xt+1.
As in the deterministic case, let xt = [k′t, y′t]′ , where the vector yt contains
ny non-predetermined variables and the vector kt contains nk = n − ny
predetermined variables. Suppose W has nu unstable eigenvalues and ns =
n − nu stable eigenvalues. Then Theorem 1.4.2 and Corollary 1.4.1 still
apply with small changes. In particular, the forward-looking solution for
the transformed variables corresponding to the unstable eigenvalues is given
2 Following Blanchard and Kahn (1980) and Sims (2002), we may generalize the notion of stability to include growth. In applications, one often uses detrended variables.
by

u_t = -\left( I - L^{-1} J_u^{-1} \right)^{-1} J_u^{-1} C_u^* z_t = -\sum_{j=0}^{\infty} J_u^{-j-1} C_u^* E_t z_{t+j}. \tag{2.12}
The rest of the formulas in the proof of Theorem 1.4.2 still apply. As a result,
we obtain the certainty equivalence principle for linear rational expectations
models: the solution for xt in the stochastic case is identical to that in the
deterministic case except that the conditional expectation operator Et is
applied to all terms related to {zt} in the stochastic case. In particular,
if {zt} is given by (2.1) and Σ = 0 in the deterministic case, then the
two solutions give identical expressions. In addition, Theorem 1.4.3 in the
deterministic case also applies here.
Example 2.3.1 Interest rate rule in a New Keynesian model.
We consider a simple log-linearized new Keynesian model:
\pi_t = \kappa x_t + \beta E_t \pi_{t+1}, \tag{2.13}

x_t = E_t x_{t+1} - \frac{1}{\sigma} \left( i_t - E_t \pi_{t+1} \right) + e_t, \tag{2.14}
where πt is the log inflation rate, xt is the log output gap, it is the log
gross nominal interest rate, and et represents exogenous IID shocks, and
κ, β, and σ are positive constants. Equation (2.13) represents the short-run
aggregate supply relation or the Phillips curve. Equation (2.14) represents
the aggregate demand relation or the IS curve. An equilibrium is obtained
once we specify the central bank’s policy rule. Here we suppose that monetary
policy is represented by the following rule for interest rates:
it = ρrit−1 + εt, (2.15)
where |ρr| < 1 and εt is IID. Combining (2.13), (2.14) and (2.15) gives the
following system:

\begin{bmatrix} 1 & 0 & 0 \\ -\sigma^{-1} & 1 & \sigma^{-1} \\ 0 & 0 & \beta \end{bmatrix}
\begin{bmatrix} i_t \\ E_t x_{t+1} \\ E_t \pi_{t+1} \end{bmatrix}
=
\begin{bmatrix} \rho_r & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -\kappa & 1 \end{bmatrix}
\begin{bmatrix} i_{t-1} \\ x_t \\ \pi_t \end{bmatrix}
+
\begin{bmatrix} \varepsilon_t \\ -e_t \\ 0 \end{bmatrix}. \tag{2.16}

Premultiplying both sides by the inverse of the matrix on the left gives

\begin{bmatrix} i_t \\ E_t x_{t+1} \\ E_t \pi_{t+1} \end{bmatrix}
= W
\begin{bmatrix} i_{t-1} \\ x_t \\ \pi_t \end{bmatrix}
+
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \sigma^{-1} \\ 0 & 0 & \beta \end{bmatrix}^{-1}
\begin{bmatrix} \varepsilon_t \\ -e_t \\ 0 \end{bmatrix},

where

W = \begin{bmatrix} \rho_r & 0 & 0 \\ \frac{1}{\sigma} & 1 + \frac{\kappa}{\beta\sigma} & -\frac{1}{\beta\sigma} \\ 0 & -\frac{\kappa}{\beta} & \frac{1}{\beta} \end{bmatrix}.

The above system has a unique stable solution if and only if the number of
eigenvalues of W outside the unit circle is equal to the number of nonpre-
determined variables, in this case, two. Since |ρr| < 1 by assumption, we
can show that only one eigenvalue of W is outside the unit circle. Thus,
equilibrium is indeterminate.
This example illustrates that an exogenous policy rule that does not re-
spond to endogenous variables may cause indeterminacy. In Exercise 2
below, you are asked to give conditions for the existence of a unique equilib-
rium when the interest rate responds to inflation or output gap.
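The eigenvalue count in the example can be verified numerically. A sketch with illustrative parameter values (κ, β, σ, and ρ_r below are assumptions, not from the text):

```python
import numpy as np

# Build W for the New Keynesian example and count unstable eigenvalues
# (illustrative parameter values).
beta, kappa, sigma, rho_r = 0.99, 0.3, 1.0, 0.5
W = np.array([
    [rho_r,     0.0,                      0.0],
    [1/sigma,   1 + kappa/(beta*sigma),  -1/(beta*sigma)],
    [0.0,      -kappa/beta,               1/beta],
])
n_unstable = int((np.abs(np.linalg.eigvals(W)) > 1.0).sum())

# Two nonpredetermined variables (x_t, pi_t) but only one unstable
# eigenvalue, so the equilibrium is indeterminate under rule (2.15).
assert n_unstable == 1
```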
2.3.2 The Klein Method
Klein (2000) allows A to be singular. In addition, Klein (2000) general-
izes the definition of predetermined variables given by Blanchard and Kahn
(1980). To introduce this definition, we define the following concepts.
Definition 2.3.2 A stochastic process {Xt} is a martingale if EtXt+1 =
Xt for all t ≥ 0. It is a martingale difference process if EtXt+1 = 0.
Definition 2.3.3 A stochastic process {Xt} is a white noise if E [Xt] = 0
and

E\left[ X_t X_s' \right] = \begin{cases} 0 & \text{if } s \neq t, \\ \Sigma & \text{if } s = t, \end{cases}
where Σ is a matrix.
A martingale difference process may not be stable. It is stable if it is
uniformly bounded. A white noise process is stable. The process {εt+1}
satisfying (2.2) is a white noise. However, a white noise may not satisfy
(2.2).
Definition 2.3.4 A process {Xt} is predetermined or backward looking
if (a) the prediction error {ξt} defined by ξt+1 = Xt+1 − EtXt+1 is an
exogenously given martingale difference process; and (b) the initial value
X0 is measurable with respect to F0 and exogenously given.
When ξt+1 = 0, we obtain the Blanchard and Kahn (1980) definition.
We assume that kt is predetermined in the sense of Definition 2.3.4 with
exogenously given prediction error {ξt}. We also make the same assumptions
as in Theorem 1.4.5.
We now use the generalized Schur decomposition theorem and similar
steps in the proof of Theorem 1.4.5. In step 1, we modify equation (1.38):
u_t = -T_{22}^{-1} \sum_{j=0}^{\infty} \left[ T_{22}^{-1} S_{22} \right]^j Q_2 C E_t z_{t+j}. \tag{2.17}
We modify step 2 as follows. We replace equation (1.39) with:
E_t s_{t+1} = S_{11}^{-1} T_{11} s_t + S_{11}^{-1} T_{12} u_t - S_{11}^{-1} S_{12} E_t u_{t+1} + S_{11}^{-1} Q_1 C z_t.
We also have
kt+1 = Z11st+1 + Z12ut+1.
Since {k_t} is predetermined with exogenously given prediction error {ξ_t}, we derive:

Z_{11} (s_{t+1} - E_t s_{t+1}) + Z_{12} (u_{t+1} - E_t u_{t+1}) = \xi_{t+1}.

Since Z_{11} is invertible, we obtain:

s_{t+1} = S_{11}^{-1} T_{11} s_t + S_{11}^{-1} T_{12} u_t - S_{11}^{-1} S_{12} E_t u_{t+1} + S_{11}^{-1} Q_1 C z_t - Z_{11}^{-1} Z_{12} (u_{t+1} - E_t u_{t+1}) + Z_{11}^{-1} \xi_{t+1}.
The rest of the steps are identical to those in the proof of Theorem 1.4.5. Unlike
in the Blanchard and Kahn method, {ξt} enters the expressions of the solution
because of the new definition of predetermined variables. We can simplify
the solution if we assume that {zt} satisfies (2.1). In this case, the solution
will be of the following form:
yt = Aykkt +Ayzzt
kt+1 = Akkkt +Akzzt + ξt+1,
where k0 and z0 are given. We refer readers to Klein (2000) for details.
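In practice one applies an ordered generalized Schur (QZ) routine such as scipy.linalg.ordqz. The role of a singular A can already be seen in a tiny pencil; the following dependency-light NumPy sketch (with illustrative matrices) finds the generalized eigenvalues solving det(B − μA) = 0 by hand for a 2 × 2 system A E_t x_{t+1} = B x_t:

```python
import numpy as np

# Illustrative 2x2 pencil with singular A (second row is a static relation).
A = np.array([[1.0, 0.2],
              [0.0, 0.0]])
B = np.array([[0.5, 0.0],
              [0.3, 1.5]])

# For 2x2 matrices, det(B - mu*A) = det(B) - mu*(A00*B11 + A11*B00
# - A01*B10 - A10*B01) + mu^2*det(A). Here det(A) = 0, so one
# generalized eigenvalue is "at infinity" (counted as unstable) and
# only one finite root remains.
c2 = np.linalg.det(A)
c1 = -(A[0, 0] * B[1, 1] + A[1, 1] * B[0, 0]
       - A[0, 1] * B[1, 0] - A[1, 0] * B[0, 1])
c0 = np.linalg.det(B)
finite_mu = np.roots([c1, c0])      # c2 == 0 for this singular A
assert abs(c2) < 1e-12
assert abs(finite_mu[0]) < 1.0      # the finite root is stable
```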
To illustrate the usefulness of the previous generalized notion of predeter-
mined variables, we consider the following linear-quadratic control problem:
\max_{\{u_t\}} \; -\frac{1}{2} E \sum_{t=0}^{\infty} \beta^t \begin{bmatrix} x_t' & u_t' \end{bmatrix} \begin{bmatrix} Q & S \\ S' & R \end{bmatrix} \begin{bmatrix} x_t \\ u_t \end{bmatrix}

subject to
xt+1 = Axt +But + Cξt+1, x0 given, (2.18)
where {ξt} is an exogenously given white noise process. Let Ft be the
σ-algebra generated by {x0, ξ1, ξ2, ..., ξt} . Assume that {xt} and {ut} are
adapted to {Ft}. Then {xt} is predetermined with the prediction error {ξt}.
Suppose the matrix

\begin{bmatrix} Q & S \\ S' & R \end{bmatrix}
is symmetric and positive semi-definite so that the first-order conditions
together with the transversality condition are necessary and sufficient for
optimality. Let βt+1λt+1 be the Lagrange multiplier associated with (2.18).
We obtain the linear system as in (2.11):

\begin{bmatrix} I & 0 & 0 \\ 0 & 0 & -\beta B' \\ 0 & 0 & \beta A' \end{bmatrix}
\begin{bmatrix} E_t x_{t+1} \\ E_t u_{t+1} \\ E_t \lambda_{t+1} \end{bmatrix}
=
\begin{bmatrix} A & B & 0 \\ S' & R & 0 \\ -Q & -S & I \end{bmatrix}
\begin{bmatrix} x_t \\ u_t \\ \lambda_t \end{bmatrix}.
⎤⎦ .In this system ut and λt are forward-looking or non-predetermined variables.
We can then apply the Klein method to solve the above system. This method
does not require the matrix Q to be invertible. It is more general than many
other methods discussed in Anderson et al. (1996). See Chapter 9 for linear-
quadratic models.
2.3.3 The Sims Method
Sims (2002) proposes a different method. He represents the rational expec-
tations model in the following form:
Γ0yt = Γ1yt−1 + C +Ψzt +Πηt, (2.19)
where y_t is an n × 1 random vector, z_t is an exogenously given n_z × 1
random vector, and η_t is an endogenously determined m × 1 vector of prediction
errors satisfying E_t η_{t+1} = 0. In addition, Γ0 and Γ1 are n × n matrices, C is
an n × 1 constant vector, Ψ is an n × n_z matrix, and Π is an n × m matrix.
Without loss of generality, we may assume C = 0 by appropriate change
of variables as in Example 2.1.1. We may also assume that {zt} is serially
uncorrelated with mean zero if it is a process given by (2.1), because we
may append zt to yt and replace zt in (2.19) with εt. For these reasons, we
consider the following system below:
Γ0yt = Γ1yt−1 +Ψεt +Πηt. (2.20)
As in the Klein method, Γ0 may be singular. We assume that the pair
(Γ0,Γ1) is regular in that it satisfies Assumption 1.4.1. Unlike the methods
of Blanchard and Kahn (1980) and Klein (2000), the Sims method does not
need to define what variables are predetermined and nonpredetermined in
the first place. This method determines endogenously a linear combination
of variables that are predetermined. In addition, the Sims method adopts
the convention that all variables dated t are observable at t, i.e., all variables
are adapted to {Ft}. We apply the generalized Schur decomposition to (Γ0, Γ1). There exist
unitary complex matrices Q and Z, and upper triangular matrices S and T
such that QΓ0Z = S and QΓ1Z = T. Let S and T be ordered in the way
that the first ns generalized eigenvalues are stable and the remaining nu
generalized eigenvalues are unstable. Define y_t^* = Z^H y_t. We partition y_t^* as:

y_t^* = \begin{bmatrix} s_t \\ u_t \end{bmatrix} \begin{matrix} n_s \times 1 \\ n_u \times 1 \end{matrix}.
Premultiplying both sides of equation (2.20) by Q yields:

S y_{t+1}^* = T y_t^* + Q (\Psi \varepsilon_t + \Pi \eta_t).

We partition matrices conformably so that:

\begin{bmatrix} S_{11} & S_{12} \\ 0 & S_{22} \end{bmatrix} \begin{bmatrix} s_{t+1} \\ u_{t+1} \end{bmatrix} = \begin{bmatrix} T_{11} & T_{12} \\ 0 & T_{22} \end{bmatrix} \begin{bmatrix} s_t \\ u_t \end{bmatrix} + \begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix} (\Psi \varepsilon_t + \Pi \eta_t), \tag{2.21}
where Q1 is an ns × n matrix and Q2 is an nu × n matrix.
From this equation, we obtain:

u_{t+1} = S_{22}^{-1} T_{22} u_t + S_{22}^{-1} Q_2 (\Psi \varepsilon_t + \Pi \eta_t). \tag{2.22}

Because u_t corresponds to unstable generalized eigenvalues, we solve it forward to obtain:

u_t = -T_{22}^{-1} \sum_{j=0}^{\infty} \left[ T_{22}^{-1} S_{22} \right]^j Q_2 \left( \Psi \varepsilon_{t+j} + \Pi \eta_{t+j} \right). \tag{2.23}
Taking expectations E_t gives:

E_t u_t = u_t = 0, \quad \text{for all } t,

where we have used the convention that u_t is F_t-measurable. We partition y_t^* = Z^H y_t as:

y_t^* = \begin{bmatrix} s_t \\ u_t \end{bmatrix} = \begin{bmatrix} Z_1^H \\ Z_2^H \end{bmatrix} y_t.

It follows that

Z_2^H y_0 = 0.

This equation gives the condition that the initial value of y_t must satisfy.
From equation (2.21) and ut = 0 for all t, we obtain:
Q2 (Ψεt +Πηt) = 0. (2.24)
Lemma 2.3.1 For every εt, there exists ηt ∈ Rm satisfying equation (2.24)
if and only if there exists an m× nz matrix Λ such that Q2Ψ = Q2ΠΛ.
Proof. For the “if” part, we obtain (2.24) by setting ηt = Λεt. For the
“only if” part, let the jth column of Λ be an ηt that solves (2.24) for εt = Ij ,
where Ij is the jth column of the nz × nz identity matrix, j = 1, ..., nz .
Thus, for a solution to (2.20) to exist, we must make the following:
Assumption 2.3.1 There exists an m × nz matrix Λ such that Q2Ψ =
Q2ΠΛ.
In addition, we need the following assumption for uniqueness:
Assumption 2.3.2 There exists an ns × nu matrix Ξ such that
Q1Π = ΞQ2Π.
Given this assumption, we premultiply (2.21) by [I  −Ξ] to derive:

\begin{bmatrix} S_{11} & S_{12} - \Xi S_{22} \end{bmatrix} \begin{bmatrix} s_{t+1} \\ u_{t+1} \end{bmatrix} = \begin{bmatrix} T_{11} & T_{12} - \Xi T_{22} \end{bmatrix} \begin{bmatrix} s_t \\ u_t \end{bmatrix} + \left[ Q_1 - \Xi Q_2 \right] \Psi \varepsilon_t. \tag{2.25}

We combine this equation with (2.22) and (2.24) to derive:

\begin{bmatrix} S_{11} & S_{12} - \Xi S_{22} \\ 0 & I \end{bmatrix} \begin{bmatrix} s_{t+1} \\ u_{t+1} \end{bmatrix} = \begin{bmatrix} T_{11} & T_{12} - \Xi T_{22} \\ 0 & S_{22}^{-1} T_{22} \end{bmatrix} \begin{bmatrix} s_t \\ u_t \end{bmatrix} + \begin{bmatrix} Q_1 - \Xi Q_2 \\ 0 \end{bmatrix} \Psi \varepsilon_t.
Using y_t = Z y_t^* and the above equation, we obtain the solution to (2.20):

y_{t+1} = \Theta_1 y_t + \Theta_2 \varepsilon_t,

where

\Theta_1 = Z \begin{bmatrix} S_{11} & S_{12} - \Xi S_{22} \\ 0 & I \end{bmatrix}^{-1} \begin{bmatrix} T_{11} & T_{12} - \Xi T_{22} \\ 0 & S_{22}^{-1} T_{22} \end{bmatrix} Z^H,

\Theta_2 = Z \begin{bmatrix} S_{11} & S_{12} - \Xi S_{22} \\ 0 & I \end{bmatrix}^{-1} \begin{bmatrix} Q_1 - \Xi Q_2 \\ 0 \end{bmatrix} \Psi.
The following theorem summarizes the above analysis.
Theorem 2.3.1 Necessary and sufficient conditions for the existence of
a unique stable solution to (2.20) are Assumptions 2.3.1 and 2.3.2.
The conditions in this theorem are different from the Blanchard and
Kahn conditions which check a rank condition and also check whether the
number of unstable eigenvalues nu is equal to the number of nonpredeter-
mined variables (possibly m). To apply the conditions in Theorem 2.3.1 in
practice and to implement the previous algorithm, we use the singular value
decomposition of the nu × m matrix Q2Π:
Q_2 \Pi = \underbrace{U}_{n_u \times n_u} \underbrace{D}_{n_u \times m} \underbrace{V'}_{m \times m} = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} D_{11} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_1' \\ V_2' \end{bmatrix} = \underbrace{U_1}_{n_u \times r} \underbrace{D_{11}}_{r \times r} \underbrace{V_1'}_{r \times m},
where D11 is a nonsingular diagonal matrix and U and V are orthonormal
matrices satisfying U'U = I and V'V = I. As a result, we obtain U_1' U_1 = I_r
and V_1 V_1' + V_2 V_2' = I_m. If m = r, then V_2 = 0. In this case, we obtain:

\Lambda = V_1 D_{11}^{-1} U_1' (Q_2 \Psi) \quad \text{and} \quad \Xi = (Q_1 \Pi) V_1 D_{11}^{-1} U_1'.
We thus have proved the following result:
Corollary 2.3.1 A necessary and sufficient condition for the existence of
a unique stable solution to (2.20) is m = r.
Lubik and Schorfheide (2003) generalize this result and show that if m >
r there are infinitely many stable solutions. They also provide a computational
algorithm to solve this case.
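The rank condition in Corollary 2.3.1 is one singular value decomposition away; a sketch with an illustrative (rank-deficient) Q2Π:

```python
import numpy as np

# Check m == r via the SVD of Q2 @ Pi (illustrative matrix, rank 1).
Q2Pi = np.array([[1.0, 0.5],
                 [2.0, 1.0]])      # second row is twice the first
U, d, Vt = np.linalg.svd(Q2Pi)
r = int((d > 1e-10).sum())         # numerical rank
m = Q2Pi.shape[1]

# Here m = 2 > r = 1, so a unique stable solution fails for this
# illustrative pair (by Corollary 2.3.1 and Lubik-Schorfheide there
# are infinitely many stable solutions).
assert (m, r) == (2, 1)
```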
We use the model in Example 2.3.1 to illustrate how to apply the Sims
method. Let \xi_t^\pi = E_t \pi_{t+1}, \xi_t^x = E_t x_{t+1}, \eta_t^\pi = \pi_t - \xi_{t-1}^\pi, and \eta_t^x = x_t - \xi_{t-1}^x.
We rewrite system (2.16) in terms of the Sims form (2.20):

\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \sigma^{-1} \\ 0 & 0 & \beta \end{bmatrix}
\begin{bmatrix} i_t \\ \xi_t^x \\ \xi_t^\pi \end{bmatrix}
=
\begin{bmatrix} \rho_r & 0 & 0 \\ \sigma^{-1} & 1 & 0 \\ 0 & -\kappa & 1 \end{bmatrix}
\begin{bmatrix} i_{t-1} \\ \xi_{t-1}^x \\ \xi_{t-1}^\pi \end{bmatrix}
+
\begin{bmatrix} \varepsilon_t \\ -e_t \\ 0 \end{bmatrix}
+
\begin{bmatrix} 0 & 0 \\ 1 & 0 \\ -\kappa & 1 \end{bmatrix}
\begin{bmatrix} \eta_t^x \\ \eta_t^\pi \end{bmatrix}.
To derive decision rules for xt and πt directly, we may append xt and πt and
two identity equations ηπt = πt − ξπt−1 and ηxt = xt − ξxt−1 to the system.
2.4 Nonlinear Rational Expectations Models
Many nonlinear rational expectations models can be represented by the fol-
lowing system:
Et [f (yt+1, yt, xt+1, xt)] = 0, (2.26)
where xt is an (nk + nz) × 1 vector of predetermined variables and yt is
an ny × 1 vector of nonpredetermined variables in the sense of Blanchard
and Kahn (1980). Let n = nk + nz + ny. The function f : R2n → Rn.
The vector xt consists of nk endogenous variables and nz exogenous shocks:
xt = [k′t, z′t]′. In addition to the initial values for x0 and z0, there is a terminal
condition in the form of transversality conditions. Rather than specifying
transversality conditions explicitly, we shall focus on a stable solution (see
Definition 2.3.1) which satisfies these conditions trivially.
Assume that zt satisfies (2.1), where all eigenvalues of Φ are inside the
unit circle. Suppose the system (2.26) has a deterministic steady state (x, y)
such that f (y, y, x, x) = 0. We intend to find a linearized solution locally
around the deterministic steady state. We totally differentiate (2.26) to
obtain the linearized system:
fy′ Etdyt+1 + fy dyt + fx′ Etdxt+1 + fx dxt = 0,
where the Jacobian matrix of f is evaluated at the steady state. Adopting
the notation introduced in Section 1.6, we can also derive a log-linearized
system. Both systems are in the form of (2.11), e.g.,

\begin{bmatrix} f_{x'} & f_{y'} \end{bmatrix} \begin{bmatrix} dx_{t+1} \\ dy_{t+1} \end{bmatrix} = -\begin{bmatrix} f_x & f_y \end{bmatrix} \begin{bmatrix} dx_t \\ dy_t \end{bmatrix}. \tag{2.27}
We can thus apply previous methods to solve these systems. The stable
solution will be of the following form:
dyt = Ayk dkt +Ayz dzt
dkt+1 = Akk dkt +Akz dzt,
dzt+1 = Φ dzt +Σεt+1,
where dk0 and dz0 are given.
Linearly approximated solutions are inaccurate for computing risk premia
and welfare. Higher-order approximations are needed to compute such
objects. We now introduce the second-order approximation method studied by
Sims (2000) and Schmitt-Grohe and Uribe (2004). This method is actually
an application of the more general perturbation method that will be intro-
duced in Section 11.4. To apply this method, we introduce a perturbation
scalar σ ∈ R in (2.1):
zt+1 = Φzt + σΣεt+1,
where {εt} is an IID process with bounded support.
Suppose the solution to (2.26) takes the following form:
yt = g (xt, σ) , xt+1 = h (xt, σ) + ησεt+1, (2.28)
where g and h are functions to be determined and η is an nx × nε matrix
defined by
\eta = \begin{bmatrix} 0 \\ \Sigma \end{bmatrix}.
Our goal is to find an approximated solution for g and h. We shall consider
local approximation around a particular point (x, σ). In applications, a con-
venient point to take is the non-stochastic steady state (x, 0). At this point,
f (y, y, x, x) = 0. We then write the Taylor approximation up to the second
order,
g (x, \sigma) = g (x, 0) + g_x (x, 0)(x - x) + g_\sigma (x, 0) \sigma + \frac{1}{2} g_{xx} (x, 0)(x - x)^2 + g_{x\sigma} (x, 0)(x - x)\sigma + \frac{1}{2} g_{\sigma\sigma} (x, 0) \sigma^2,

h (x, \sigma) = h (x, 0) + h_x (x, 0)(x - x) + h_\sigma (x, 0) \sigma + \frac{1}{2} h_{xx} (x, 0)(x - x)^2 + h_{x\sigma} (x, 0)(x - x)\sigma + \frac{1}{2} h_{\sigma\sigma} (x, 0) \sigma^2,
where, for notational convenience, we consider the scalar case only. We shall
follow Schmitt-Grohe and Uribe (2004) and derive an algorithm to solve
for all the coefficients.3 This algorithm also applies to the general high-
dimensional case. We first note that
y = g (x, 0) , x = h (x, 0) .
We then define
F (x, \sigma) \equiv E f\left( g\left( h(x, \sigma) + \eta\sigma\varepsilon', \sigma \right),\; g(x, \sigma),\; h(x, \sigma) + \eta\sigma\varepsilon',\; x \right) = 0, \tag{2.29}
where we have substituted (2.28) into (2.26) and the expectation is taken
with respect to ε′. Since F (x, σ) is equal to zero for any values of (x, σ) , we
must have
Fxiσj (x, σ) = 0, any i, j, x, σ,
where Fxiσj denotes the partial derivative of F with respect to x taken i
times and with respect to σ taken j times. We shall use the above equation
to derive other coefficients in the approximated solution.
We start with the first-order approximation, using the conditions

F_x (x, 0) = 0, \quad F_\sigma (x, 0) = 0.
We then use (2.29) to compute
F_\sigma (x, 0) = E\left\{ f_{y'} \left[ g_x \left( h_\sigma + \eta\varepsilon' \right) + g_\sigma \right] + f_y g_\sigma + f_{x'} \left( h_\sigma + \eta\varepsilon' \right) \right\} = f_{y'} \left[ g_x h_\sigma + g_\sigma \right] + f_y g_\sigma + f_{x'} h_\sigma = 0,
where all derivatives are evaluated at the non-stochastic steady state. This
is a system of homogeneous linear equations for gσ and hσ . For a unique
solution to exist, we must have
gσ = hσ = 0.
3 The Matlab codes are publicly available at http://www.columbia.edu/~mu2166/2nd_order.htm.
This result reflects the certainty equivalence principle in that the policy func-
tion in the first-order approximation is independent of the variance of the
shock. Thus, linear approximation is not suitable to address risk premium
and welfare.
Next, we use (2.29) to compute
F_x (x, 0) = f_{y'} g_x h_x + f_y g_x + f_{x'} h_x + f_x = 0.

We can equivalently rewrite the above equation as

\begin{bmatrix} f_{x'} & f_{y'} \end{bmatrix} \begin{bmatrix} I \\ g_x \end{bmatrix} h_x = -\begin{bmatrix} f_x & f_y \end{bmatrix} \begin{bmatrix} I \\ g_x \end{bmatrix}. \tag{2.30}
Let A = \begin{bmatrix} f_{x'} & f_{y'} \end{bmatrix}, B = -\begin{bmatrix} f_x & f_y \end{bmatrix}, and \hat{x}_t = x_t - x. Postmultiplying each side of (2.30) by \hat{x}_t yields:

A \begin{bmatrix} I \\ g_x \end{bmatrix} h_x \hat{x}_t = B \begin{bmatrix} I \\ g_x \end{bmatrix} \hat{x}_t. \tag{2.31}
Using the fact that \hat{x}_{t+1} = h_x \hat{x}_t and \hat{y}_t = g_x \hat{x}_t for the linearized non-stochastic solution, we obtain

A \begin{bmatrix} \hat{x}_{t+1} \\ \hat{y}_{t+1} \end{bmatrix} = B \begin{bmatrix} \hat{x}_t \\ \hat{y}_t \end{bmatrix}. \tag{2.32}
Notice that this equation is equivalent to equation (2.27). One may adopt
the Klein method or the Sims method to solve this linear system. The
solution gives hx and gx.
Now, we turn to the second-order approximation. We compute
0 = F_{xx} (x, 0)
= \left( f_{y'y'} g_x h_x + f_{y'y} g_x + f_{y'x'} h_x + f_{y'x} \right) g_x h_x + f_{y'} g_{xx} h_x h_x + f_{y'} g_x h_{xx}
+ \left( f_{yy'} g_x h_x + f_{yy} g_x + f_{yx'} h_x + f_{yx} \right) g_x + f_y g_{xx}
+ \left( f_{x'y'} g_x h_x + f_{x'y} g_x + f_{x'x'} h_x + f_{x'x} \right) h_x + f_{x'} h_{xx}
+ f_{xy'} g_x h_x + f_{xy} g_x + f_{xx'} h_x + f_{xx}.
Since we know the derivatives of f as well as the first derivatives of g and
h evaluated at (y, y, x, x), it follows that the above equation represents a
system of n × nx × nx linear equations for the same number of unknowns
given by the elements of gxx and hxx.
Similarly, we compute gσσ and hσσ by solving the equation Fσσ (x, 0) = 0.
We compute
0 = F_{\sigma\sigma} (x, 0) = f_{y'} g_x h_{\sigma\sigma} + f_{y'y'} g_x \eta g_x \eta I + f_{y'x'} \eta g_x \eta I + f_{y'} g_{xx} \eta\eta I + f_{y'} g_{\sigma\sigma} + f_y g_{\sigma\sigma} + f_{x'} h_{\sigma\sigma} + f_{x'y'} g_x \eta\eta I + f_{x'x'} \eta\eta I.
This is a system of n linear equations for n unknowns gσσ and hσσ.
Finally, we show that the cross derivatives gσx and hσx are equal to zero
when evaluated at (x, 0). We compute
0 = Fσx (x, 0) = fy′gxhσx + fy′gσxhx + fygσx + fx′hσx.
This is a system of n × nx linear homogeneous equations for the n × nx unknowns gσx and hσx. If a unique solution exists, then the solution must be equal to
zero.
In summary, uncertainty matters in the second-order approximated solution
only through the terms \frac{1}{2} g_{\sigma\sigma} (x, 0) \sigma^2 and \frac{1}{2} h_{\sigma\sigma} (x, 0) \sigma^2. In applications,
we typically set σ = 1 when Σ is “small.” In principle, we can proceed to
compute a third-order or even higher-order approximate solution using the
previous procedure. Because the algebra is highly involved, we omit the
exposition here.
2.5 Numerical Solutions Using Dynare
Dynare can solve DSGE models in the form of (2.26). Using Dynare timing
conventions, Dynare solves any stochastic model of the following general
form:
E_t f\left( y_{t-1}^-, y_t, y_{t+1}^+, \varepsilon_t \right) = 0, \tag{2.33}
where y_t is a vector that contains endogenous variables, y_{t-1}^- is a subset of
predetermined variables or variables with a lag, y_{t+1}^+ is a subset of variables
with a lead, and εt is a vector of exogenous IID shocks with mean zero. This
form is more general than that in (2.26) in that the former can incorporate
a variable with both a one-period lead and a one-period lag. For example,
for habit formation utility, the consumption Euler equation contains current
consumption as well as consumption with one-period lead and lag. Thus,
consumption is both a predetermined and a non-predetermined variable and
can be naturally incorporated in the Dynare form (2.33), but not in the
form (2.26). One has to perform a change of variable to incorporate habit
formation in (2.26).
The solution to (2.33) is given by the following form:

y_t = g(y^-_{t-1}, ε_t).
Dynare uses the perturbation method as in the previous section to deliver
an approximate solution up to the third order. In particular, we introduce a
perturbation scalar σ and then use Taylor expansions. Finally, we set σ = 1.
The first-order approximate solution is given by

y_t = ȳ + g_y ỹ^-_{t-1} + g_u ε_t,

where ỹ^-_{t-1} = y^-_{t-1} − ȳ.

As in the deterministic case discussed in Section 1.7, a Dynare program
typically consists of 5 blocks. The preamble block and the model block are
similar for both the deterministic and stochastic cases. The steady state
and initial values block is needed for the stochastic case because Dynare has
to use the nonstochastic steady state as the perturbation point. The main
differences between the two cases lie in the shocks block and the computation
block.
We shall use the following simple stochastic growth model as an example to illustrate how to work with Dynare:

max_{C_t, N_t, K_{t+1}}  E Σ_{t=0}^∞ β^t {ln(C_t) + χ ln(1 − N_t)},

subject to

C_t + K_{t+1} − (1 − δ)K_t = z_t K_t^α N_t^{1−α},    (2.34)

ln z_t = ρ ln z_{t−1} + σ ε_t.    (2.35)
The equilibrium system consists of the two first-order conditions:

1/C_t = E_t [ (β/C_{t+1}) (z_{t+1} α K_{t+1}^{α−1} N_{t+1}^{1−α} + 1 − δ) ],

χ C_t/(1 − N_t) = (1 − α) z_t K_t^α N_t^{−α},

and the resource constraint (2.34) for three unknowns C_t, K_t, N_t. We have to
decide approximations in levels or in logs. If we use approximations in logs,
we do the following transformation for any variable xt:
lxt = ln (xt) and xt = exp (lxt) .
We then rewrite the equilibrium system in terms of the transformed vari-
ables. Dynare will conduct approximations for these variables. Note that
Dynare treats exogenous shocks (e.g., zt) as endogenous variables. It treats
the shock innovations as exogenous variables.
We shall write a Dynare program for our example using approximations in levels. (The Dynare code is rbc0.mod.) We start with the first three blocks:

%----------------------------------------------------------------
% 1. Preamble
%----------------------------------------------------------------
var c k n z;
varexo e;
parameters beta chi delta alpha rho sigma;
alpha = 0.33;
beta = 0.99;
delta = 0.023;
chi = 1.75;
rho = 0.95;
sigma = 0.01;
%----------------------------------------------------------------
% 2. Model
%----------------------------------------------------------------
model;
(1/c) = beta*(1/c(+1))*(1+alpha*(k^(alpha-1))*exp(z(+1))*(n(+1))^(1-alpha)-delta);
chi*c/(1-n) = (1-alpha)*(k(-1)^alpha)*exp(z)*(n^(-alpha));
c + k - (1-delta)*k(-1) = (k(-1)^alpha)*exp(z)*n^(1-alpha);
z = rho*z(-1)+e;
end;
%----------------------------------------------------------------
% 3. Steady State and Initial Values
%----------------------------------------------------------------
initval;
k = 9;
c = 0.76;
n = 0.3;
z = 0;
e = 0;
end;
steady;
check;
Shocks. Next, we introduce the shocks block. For stochastic simu-
lations, the shocks block specifies the non-zero elements of the covariance
matrix of the shocks of exogenous variables. The following commands may
be used in this block:
var VARIABLE_NAME; stderr EXPRESSION;
Specifies the standard error of a variable.
var VARIABLE_NAME = EXPRESSION;
Specifies the variance of a variable.
var VARIABLE_NAME, VARIABLE_NAME = EXPRESSION;
Specifies the covariance of two variables.
corr VARIABLE_NAME, VARIABLE_NAME = EXPRESSION;
Specifies the correlation of two variables.
For our example, we write the shocks block as:
%----------------------------------------------------------------
% 4. Shocks Block
%----------------------------------------------------------------
shocks;
var e = sigma^2;
end;
It is possible to mix deterministic and stochastic shocks to build models where agents know from the start of the simulation about future exogenous changes. In that case stoch_simul will compute the rational expectations solution, adding future information to the state space (nothing is shown in the output of stoch_simul), and the command forecast will compute a simulation conditional on initial conditions and future information.
Here is an example:
varexo_det tau;
varexo e;
...
shocks;
var e; stderr 0.01;
var tau;
periods 1:9;
values -0.15;
end;
stoch_simul(irf=0);
forecast;
Computation. Now, we turn to the computation block. The Dynare
command used in this block is:
stoch_simul (OPTIONS...) [VARIABLE_NAME...];
This command solves a stochastic (i.e. rational expectations) model,
using the perturbation method by Taylor approximations of the policy and
transition functions around the steady state. Using this solution, it com-
putes impulse response functions and various descriptive statistics (mo-
ments, variance decomposition, correlation and autocorrelation coefficients).
For correlated shocks, the variance decomposition is computed as in the VAR
literature through a Cholesky decomposition of the covariance matrix of the
exogenous variables. When the shocks are correlated, the variance decom-
position depends upon the order of the variables in the varexo command.
The IRFs are computed as the difference between the trajectory of a
variable following a shock at the beginning of period 1 and its steady state
value. Currently, the IRFs are only plotted for 12 variables. Select the ones
you want to see, if your model contains more than 12 endogenous variables.
Variance decomposition, correlation, and autocorrelation are only displayed for variables with positive variance. Impulse response functions are only plotted for variables with a response larger than 10^(−10).
Variance decomposition is computed relative to the sum of the contribu-
tion of each shock. Normally, this is of course equal to aggregate variance,
but if a model generates very large variances, it may happen that, due to
numerical error, the two differ by a significant amount. Dynare issues a
warning if the maximum relative difference between the sum of the contri-
bution of each shock and aggregate variance is larger than 0.01%.
The covariance matrix of the shocks is specified in the shocks block.
When a list of VARIABLE NAME is specified, results are displayed only
for these variables.
The command stoch_simul contains many options. Here I list a few commonly used options. The reader is referred to the Dynare Reference Manual for more options.
• ar = INTEGER: Order of autocorrelation coefficients to compute and
to print. Default: 5.
• drop = INTEGER: Number of points dropped at the beginning of sim-
ulation before computing the summary statistics. Default: 100.
• hp_filter = INTEGER: Uses the HP filter with smoothing parameter λ = INTEGER before computing moments. Default: no filter.
• irf = INTEGER: Number of periods on which to compute the IRFs. Setting irf=0 suppresses the plotting of IRFs. Default: 40.
• relative_irf: Requests the computation of normalized IRFs in percentage of the standard error of each shock.
• linear: Indicates that the original model is linear (put it rather in the
model command).
• nocorr: Don’t print the correlation matrix (printing them is the de-
fault).
• nofunctions: Don’t print the coefficients of the approximated solution
(printing them is the default).
• nomoments: Don’t print moments of the endogenous variables (print-
ing them is the default).
• nograph: Don’t do the graphs. Useful for loops.
• noprint: Don’t print anything. Useful for loops.
• print: Print results (opposite of noprint).
• order = INTEGER: Order of Taylor approximation. Acceptable values are 1, 2 and 3. Note that for third order, the k_order_solver option is implied and only empirical moments are available (you must provide a value for the periods option). Default: 2.
• periods = INTEGER: If different from zero, empirical moments will
be computed instead of theoretical moments. The value of the option
specifies the number of periods to use in the simulations. Values of the
initval block, possibly recomputed by steady, will be used as starting
point for the simulation. The simulated endogenous variables are made
available to the user in a vector for each variable and in the global
matrix oo_.endo_simul. Default: 0.
• conditional_variance_decomposition = INTEGER: See below.

• conditional_variance_decomposition = [INTEGER1:INTEGER2]: See below.

• conditional_variance_decomposition = [INTEGER1 INTEGER2 ...]: Computes a conditional variance decomposition for the specified period(s). Conditional variances are given by var_t(y_{t+k}). For period 1, the conditional variance decomposition provides the decomposition of the effects of shocks upon impact.
• pruning: Discard higher order terms when iteratively computing sim-
ulations of the solution, as in Schaumburg and Sims (2008).
• partial_information: Computes the solution of the model under partial information, along the lines of Currie and Levine (1986). Agents are
supposed to observe only some variables of the economy. The set of
observed variables is declared using the varobs command. Note that if
varobs is not present or contains all endogenous variables, then this is
the full information case and this option has no effect.
For our example, we may write:
%----------------------------------------------------------------
% 5. Computation Block
%----------------------------------------------------------------
stoch_simul(hp_filter = 1600, order = 1);
Output and Results. The command stoch_simul produces the following results, which are stored in Matlab global variables starting with oo_.
• Model summary: a count of the various variable types in your model
(endogenous, jumpers, etc...).
• Eigenvalues: they are displayed if the command check is used. The
Blanchard-Kahn condition is checked.
• Matrix of covariance of exogenous shocks: this should square
with the values of the shock variances and co-variances you provided
in the .mod file.
• Policy and transition functions: Coefficients of these functions are
stored up to third-order approximations and are displayed on screen
up to second-order approximations. See below.
• Moments of simulated variables: up to the fourth moments.
• Correlation of simulated variables: these are the contemporane-
ous correlations, presented in a table.
• Autocorrelation of simulated variables: up to the fifth lag, as specified in the options of stoch_simul.
• Graphs of impulse response functions: up to 12 variables.
You can easily browse the output global variables starting with oo_ in Matlab by either calling them in the command line, or using the workspace interface. Steady state values are stored in oo_.dr.ys or oo_.steady_state. Impulse response functions are stored under the following nomenclature: “x_e” denotes the impulse response of variable x to shock e. Note that variables will always appear in the order in which you declare them in the preamble block of your .mod file.
Typology and Ordering of Variables. Dynare distinguishes four
types of endogenous variables:
• Purely backward (or purely predetermined) variables: Those that ap-
pear only at current and past period in the model, but not in future
period (i.e. at t and t−1 but not t+1). The number of such variables
is equal to oo_.dr.npred − oo_.dr.nboth.
• Purely forward variables: Those that appear only at current and future
period in the model, but not at past period (i.e. at t and t+1 but not
t − 1). The number of such variables is stored in oo_.dr.nfwrd.
• Mixed variables: Those that appear at current, past and future periods
in the model (i.e. at t, t− 1 and t+ 1). The number of such variables
is stored in oo_.dr.nboth.
• Static variables: Those that appear only at current, not past and
future period in the model (i.e. only at t, not at t+ 1 or t− 1). The
number of such variables is stored in oo_.dr.nstatic.
Note that all endogenous variables fall into one of these four categories,
since after the creation of auxiliary variables, all endogenous variables have
at most one lead and one lag. We therefore have the following identity:
oo_.dr.npred + oo_.dr.nfwrd + oo_.dr.nstatic = M_.endo_nbr
Internally, Dynare uses two orderings of the endogenous variables: the
order of declaration (which is reflected in M_.endo_names), and an order
based on the four types described above, which we will call the DR-order
(“DR” stands for decision rules). Most of the time, the declaration order is
used, but for elements of the decision rules, the DR-order is used.
The DR-order is the following: static variables appear first, then purely
backward variables, then mixed variables, and finally purely forward vari-
ables. Within each category, variables are arranged according to the decla-
ration order.
First-Order Approximation. The approximation has the form:

y_t = ȳ + A ỹ^-_{t-1} + B ε_t,

where ȳ is the deterministic steady-state value of y_t and ỹ^-_{t-1} = y^-_{t-1} − ȳ^-. The coefficients of the decision rules are stored as follows:
• ȳ is stored in oo_.dr.ys. The vector rows correspond to all endogenous variables in declaration order.

• A is stored in oo_.dr.ghx. The matrix rows correspond to all endogenous variables in DR-order. The matrix columns correspond to state variables in DR-order.

• B is stored in oo_.dr.ghu. The matrix rows correspond to all endogenous variables in DR-order. The matrix columns correspond to exogenous variables in declaration order.
For our example, the following table gives the first-order approximated policy and transition functions:

              c           k           n          z
constant   0.793902   10.269592    0.331892   0
k(-1)      0.041667    0.951446   -0.008169   0
z(-1)      0.306067    1.143748    0.226601   0.950000
e          0.322175    1.203945    0.238528   1.000000
The row “constant” should give the steady-state values of the model. As an example, we write the consumption policy in the more familiar form:

C_t = 0.793902 + 0.041667 K̃_t + 0.306067 z_{t−1} + 0.322175 ε_t,

or

C̃_t = 0.041667 K̃_t + 0.322175 z_t,

where tildes denote deviations from the steady state and we have used the fact that z_t = 0.95 z_{t−1} + ε_t.
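As a quick cross-check (a few lines of Python, not part of the original text; the numbers are copied from the table above and the tolerance is our choice), the two forms of the consumption rule are consistent only if the coefficient on z_{t−1} equals ρ times the coefficient on ε_t:

```python
# The table reports, in the c column, a coefficient 0.306067 on z(-1)
# and 0.322175 on e. Collapsing them into a single coefficient on z(t)
# requires 0.306067 = rho * 0.322175, since z(t) = rho*z(t-1) + e(t).
rho = 0.95
coef_z_lag = 0.306067
coef_e = 0.322175
assert abs(coef_z_lag - rho * coef_e) < 1e-5  # consistent up to table rounding
```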
Second-Order Approximation. The approximation has the form:

y_t = ȳ + (1/2) g_σσ + A ỹ^-_{t-1} + B ε_t + (1/2) C (ỹ^-_{t-1} ⊗ ỹ^-_{t-1}) + (1/2) D (ε_t ⊗ ε_t) + E (ỹ^-_{t-1} ⊗ ε_t),

where g_σσ is the shift effect of the variance of future shocks and ⊗ denotes
the Kronecker product. The coefficients of the decision rules are stored in
the variables described for the case of first-order approximations, plus the
following variables:
• g_σσ is stored in oo_.dr.ghs2. The vector rows correspond to all endogenous variables in DR-order.

• C is stored in oo_.dr.ghxx. The matrix rows correspond to all endogenous variables in DR-order. The matrix columns correspond to the Kronecker product of the vector of state variables in DR-order.

• D is stored in oo_.dr.ghuu. The matrix rows correspond to all endogenous variables in DR-order. The matrix columns correspond to the Kronecker product of exogenous variables in declaration order.

• E is stored in oo_.dr.ghxu. The matrix rows correspond to all endogenous variables in DR-order. The matrix columns correspond to the Kronecker product of the vector of state variables (in DR-order) by the vector of exogenous variables (in declaration order).
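The second-order rule above is straightforward to evaluate once the stored matrices are read off oo_.dr. The Python/NumPy sketch below (our own illustration; the one-variable, one-shock numbers are made up and do not come from Dynare) shows how the Kronecker-product terms enter:

```python
import numpy as np

def second_order_rule(ybar, ghs2, A, B, C, D, E, yhat, eps):
    """Evaluate y_t = ybar + ghs2/2 + A yhat + B eps
       + C (yhat kron yhat)/2 + D (eps kron eps)/2 + E (yhat kron eps),
    where yhat is the state deviation from steady state."""
    return (ybar + 0.5 * ghs2 + A @ yhat + B @ eps
            + 0.5 * C @ np.kron(yhat, yhat)
            + 0.5 * D @ np.kron(eps, eps)
            + E @ np.kron(yhat, eps))

# Toy one-variable, one-shock system (made-up numbers, not Dynare output)
ybar = np.array([1.0]); ghs2 = np.array([0.02])
A = np.array([[0.9]]); B = np.array([[0.5]])
C = np.array([[0.1]]); D = np.array([[0.05]]); E = np.array([[0.2]])
y = second_order_rule(ybar, ghs2, A, B, C, D, E,
                      yhat=np.array([0.1]), eps=np.array([0.0]))
```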
We often refer to y_s ≡ ȳ + 0.5 g_σσ as the stochastic steady state, which corrects for uncertainty. It is the steady state when we set the current shock ε_t = 0 but take future shocks into account. This is unlike the deterministic steady state, in which shocks in all periods are shut down.
Third-order approximation. The approximation has the form:

y_t = ȳ + G_0 + G_1 x_t + G_2 (x_t ⊗ x_t) + G_3 (x_t ⊗ x_t ⊗ x_t),

where x_t is a vector consisting of the deviations from the steady state of the state variables (in DR-order) at date t − 1 followed by the exogenous
variables at date t (in declaration order). The coefficients of the decision
rules are stored as follows:
• ȳ is stored in oo_.dr.ys. The vector rows correspond to all endogenous variables in declaration order.

• G_0 is stored in oo_.dr.g_0.

• G_1 is stored in oo_.dr.g_1.

• G_2 is stored in oo_.dr.g_2.

• G_3 is stored in oo_.dr.g_3.
2.6 Exercises
1. Consider the stochastic Cagan (1956) model:

m_t − p_t = −α E_t (p_{t+1} − p_t), α > 0,

where m_t is the log of the nominal money supply and p_t is the log of the price level. Suppose that the money supply satisfies

m_{t+1} = ρ m_t + ε_{t+1}, ρ ∈ [0, 1], m_0 given,

where ε_{t+1} is a white noise. Give conditions on the parameters such that there exists a unique stable solution for p_t. Derive this solution.
2. Modify the policy rule in Example 2.3.1 as follows.
(a) Set the policy rule as:

i_t = δ π_t + ε_t.

Prove that δ > 1 is necessary and sufficient for the existence of a unique stable equilibrium. This condition is called the Taylor principle.
(b) Suppose the policy rule is given by the Taylor rule:

i_t = δ_π π_t + δ_x x_t + ε_t.

Derive conditions for the existence of a unique stable equilibrium.
3. For the example in Section 2.5, write a Dynare program using approx-
imations in logs. Plot impulse response functions for consumption Ct,
investment It, output Yt, hours Nt, capital Kt, wage wt and rental
rate Rt. Compare the solutions using first-, second- and third-order
approximations.
Chapter 3
Markov Processes
Economic data and economic models often have a recursive structure. A
recursive structure is essential for researchers to use recursive methods to
analyze dynamic problems and also is convenient for organizing time series
data. To introduce a recursive structure, it is necessary to assume that
exogenous shocks follow Markov processes. We are also interested in long-
run properties of Markov processes induced by these exogenous shocks and
economic agents' optimal behavior. In this chapter, we provide a brief introduction to Markov processes and their convergence.¹ We first study Markov chains, for which the state space is finite. We then study Markov processes with an infinite state space. Finally, we discuss strong and weak convergence for general Markov processes.

¹Our exposition draws heavily on the treatment in Stokey and Lucas with Prescott (1989), Billingsley (1995), Shiryaev (1996), and Gallager (1998).

Throughout this chapter, we fix a probability space (Ω, F, Pr) and a measurable space (X, X).

Definition 3.0.1 A stochastic process {X_t}_{t=0}^∞ with the state space X is a Markov process if

Pr (X_{t+1} ∈ A | X_t = x_0, X_{t−1} = x_1, ..., X_{t−k} = x_k) = Pr (X_{t+1} ∈ A | X_t = x_0)
for all A ∈ X , all x0, x1, ..., xk ∈ X, all 0 ≤ k ≤ t, and all t ≥ 0. If the above
probability does not depend on t, the Markov process is time homogeneous.
In this chapter, we focus on time-homogeneous Markov processes only.
3.1 Markov Chains
If the state space X is countable or finite, then the Markov process {X_t} is called a Markov chain.² A Markov chain {X_t} is characterized by three objects:

• A finite or countable state space X. This set gives all possible values that X_t may take. Without loss of generality, we may simply assume that X = {1, 2, ..., n}, where n ≤ ∞.

• A transition matrix (or stochastic matrix) P = (p_ij)_{n×n}, where

p_ij = Pr (X_{t+1} = j | X_t = i)  and  Σ_{k=1}^n p_ik = 1

for all i, j = 1, 2, ..., n.

• An initial distribution π_0 = (π_01, π_02, ..., π_0n), where π_0i = Pr (X_0 = i) and Σ_{i=1}^n π_0i = 1.
Given a transition matrix P and an initial distribution π_0, there exists a unique probability distribution Pr on (X^∞, X^∞) such that there exists a Markov chain {X_t} defined on this probability space and

Pr (X_0 = i) = π_{0,i} and Pr (X_{t+1} = j | X_t = i) = p_ij

for all i, j ∈ X. A general version of this result is stated in Theorem 3.2.3 below.

²In the literature, some researchers use the term Markov chain to refer to any Markov process with a discrete time parameter, regardless of the nature of the state space.
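To make the (P, π_0) characterization concrete, here is a minimal Python simulator (our own helper, not from the text) that draws a path of a chain given its transition matrix and initial distribution:

```python
import random

def simulate_chain(P, pi0, T, seed=0):
    """Simulate T steps of a Markov chain with transition matrix P
    (list of rows, each summing to 1) and initial distribution pi0.
    States are labeled 0, 1, ..., n-1."""
    rng = random.Random(seed)

    def draw(dist):
        # Draw a state from the discrete distribution dist by inverting the CDF
        u, cum = rng.random(), 0.0
        for state, prob in enumerate(dist):
            cum += prob
            if u < cum:
                return state
        return len(dist) - 1  # guard against floating-point rounding

    path = [draw(pi0)]
    for _ in range(T):
        path.append(draw(P[path[-1]]))
    return path

# Two-state example: state 1 is absorbing, so the path stays there once it enters
P = [[0.5, 0.5], [0.0, 1.0]]
path = simulate_chain(P, [1.0, 0.0], 50)
```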
P = [ 1/3   0   2/3   0
       0   1/4   0   3/4
      1/2   0   1/2   0
       0   1/4   0   3/4 ]

(graph with nodes 1, 2, 3, 4 and a directed arc for each nonzero p_ij)

Figure 3.1: The graph representation of a Markov chain.
Given a transition matrix P, we may compute the probabilities of transition from state i to state j in m steps, m = 1, 2, 3, .... In particular, we can verify that if P is a transition matrix, then so is P^m. If the initial state is i, then the probability distribution over states m periods ahead is given by the ith row of P^m. We denote by p^(m)_ij the element in the ith row and jth column of P^m.

The m-step transition probabilities satisfy the Chapman–Kolmogorov equation:

P^{m+k} = P^k P^m  or  p^(m+k)_ij = Σ_{r=1}^n p^(k)_ir p^(m)_rj  for any m, k ≥ 1.    (3.1)
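The Chapman–Kolmogorov equation is easy to verify numerically. The Python/NumPy snippet below (our own check, using the transition matrix of Figure 3.1) confirms that powers of P are again stochastic matrices and that P^(m+k) = P^k P^m:

```python
import numpy as np

# Transition matrix of Figure 3.1
P = np.array([[1/3, 0, 2/3, 0],
              [0, 1/4, 0, 3/4],
              [1/2, 0, 1/2, 0],
              [0, 1/4, 0, 3/4]])

P2 = np.linalg.matrix_power(P, 2)
P3 = np.linalg.matrix_power(P, 3)

# If P is a transition matrix, then so is P^m: rows still sum to 1
assert np.allclose(P3.sum(axis=1), 1.0)

# Chapman-Kolmogorov: P^(m+k) = P^k P^m (here m = 2, k = 3)
assert np.allclose(np.linalg.matrix_power(P, 5), P3 @ P2)
```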
It is convenient to use a directed graph to describe a Markov chain. In the graph, there is one node for each state and a directed arc for each nonzero transition probability. If p_ij = 0, then the arc from node i to node j is omitted. Since many important properties of a Markov chain depend only on which transition probabilities are zero, the graph representation is an intuitive device for understanding these properties. Figure 3.1 gives an example. We also give more examples below.
Example 3.1.1 Regime shifts. Suppose there are two states representing a boom and a recession, respectively. The transition matrix is given by:

P = [  p    1−p
      1−q    q  ],  0 < p, q < 1.

The m-step transition matrix is given by:

P^m = 1/(2−p−q) [ 1−q  1−p
                  1−q  1−p ]
      + (p+q−1)^m/(2−p−q) [ 1−p  p−1
                            q−1  1−q ].

Thus,

lim_{m→∞} P^m = 1/(2−p−q) [ 1−q  1−p
                            1−q  1−p ].
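The closed form in Example 3.1.1 can be checked numerically (Python/NumPy; p = 0.8 and q = 0.7 are our own arbitrary values satisfying 0 < p, q < 1):

```python
import numpy as np

p, q = 0.8, 0.7
P = np.array([[p, 1 - p], [1 - q, q]])

def Pm_closed(m):
    # Closed-form m-step transition matrix from Example 3.1.1
    A = np.array([[1 - q, 1 - p], [1 - q, 1 - p]]) / (2 - p - q)
    B = np.array([[1 - p, p - 1], [q - 1, 1 - q]]) * (p + q - 1) ** m / (2 - p - q)
    return A + B

assert np.allclose(np.linalg.matrix_power(P, 7), Pm_closed(7))

# The (p+q-1)^m term dies out, so P^m converges to the first matrix
limit = np.array([[1 - q, 1 - p], [1 - q, 1 - p]]) / (2 - p - q)
assert np.allclose(np.linalg.matrix_power(P, 200), limit)
```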
Example 3.1.2 Cyclically moving subsets. Suppose the transition matrix is given by:

P = [ 0   P1
      P2  0 ],

where P1 and P2 are k × (n−k) and (n−k) × k transition matrices, respectively, each with strictly positive elements. Then

P^{2m} = [ (P1 P2)^m      0
               0      (P2 P1)^m ],

P^{2m+1} = [      0        (P1 P2)^m P1
             (P2 P1)^m P2       0       ].
Example 3.1.3 Suppose the transition matrix is given by:

P = [ 1−α  α/2  α/2
       0   1/2  1/2
       0   1/2  1/2 ],  α ∈ (0, 1).

Simple algebra gives

P^m = [ (1−α)^m  β_m/2  β_m/2
           0      1/2    1/2
           0      1/2    1/2 ],

where β_m = 1 − (1−α)^m. Since α ∈ (0, 1), it follows that

lim_{m→∞} P^m = [ 0  1/2  1/2
                  0  1/2  1/2
                  0  1/2  1/2 ].
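A numerical check of Example 3.1.3 (Python/NumPy, with the arbitrary value α = 0.4 chosen by us):

```python
import numpy as np

alpha = 0.4
P = np.array([[1 - alpha, alpha / 2, alpha / 2],
              [0, 1/2, 1/2],
              [0, 1/2, 1/2]])

m = 100
beta_m = 1 - (1 - alpha) ** m
Pm_formula = np.array([[(1 - alpha) ** m, beta_m / 2, beta_m / 2],
                       [0, 1/2, 1/2],
                       [0, 1/2, 1/2]])
assert np.allclose(np.linalg.matrix_power(P, m), Pm_formula)

# Probability mass escapes from the transient state 1 (index 0 here)
assert np.allclose(np.linalg.matrix_power(P, m)[0], [0, 0.5, 0.5], atol=1e-12)
```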
Example 3.1.4 Birth–death Markov chain. Take a sequence of IID binary random variables {b_t}_{t≥1} with Pr (b_t = 1) = p and Pr (b_t = −1) = q = 1 − p. Let X_t = max {0, b_t + X_{t−1}} with X_0 = 0. The sequence {X_t} is a special birth–death Markov chain. Its state space is {0, 1, 2, ...} and its graph representation is given by Figure 3.2.
(figure: states 0, 1, 2, 3, 4, ...; each rightward arc has probability p and each leftward arc has probability q = 1 − p, with a self-loop of probability q at state 0)
Figure 3.2: The graph representation of the birth-death Markov chain.
Example 3.1.5 Unrestricted one-dimensional random walk. Consider a sequence of IID random variables {w_t : t ≥ 1} with Pr (w_t = 1) = 0.5 and Pr (w_t = −1) = 0.5. The unrestricted one-dimensional random walk {X_t} is defined as X_t = X_{t−1} + w_t, X_0 = 0. Its state space is {..., −2, −1, 0, 1, 2, ...}. Its graph representation is given by Figure 3.3.
Figure 3.3: The graph representation of the unrestricted one-dimensionalrandom walk.
Example 3.1.6 Random walk with two barriers. Suppose there are 4 states {1, 2, 3, 4} with the transition matrix:

P = [ 1  0  0  0
      q  0  p  0
      0  q  0  p
      0  0  0  1 ].

States 1 and 4 are absorbing states in the sense that once the chain enters one of these states, it stays there forever.
In Sections 3.1 and 3.2, we shall focus on finite-state Markov chains. In Section 3.3, we shall briefly introduce countable-state Markov chains and highlight the key differences between these two cases.
3.1.1 Classification of States
This section, except where indicated otherwise, applies to Markov chains with both finite and countable state spaces. We begin with several definitions.
Definition 3.1.1 A state j is accessible from state i (denoted i → j) if p^(m)_ij > 0 for some m ≥ 1. Two distinct states i and j communicate (denoted i ↔ j) if i is accessible from j and j is accessible from i. Let i ↔ i regardless of whether p^(m)_ii > 0 or not.
The property of communicating is an equivalence relation because the following properties hold:

• i ↔ i (reflexivity),

• i ↔ j if and only if j ↔ i (symmetry),

• i ↔ k and k ↔ j implies i ↔ j (transitivity).

As a result, the state space X can be completely divided into disjoint equivalence classes with respect to the equivalence relation ↔. For the example of Figure 3.1, there are two equivalence classes {1, 3} and {2, 4}.
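The partition into equivalence classes can be computed mechanically. The following Python sketch (our own helper, not from the text) intersects forward and backward accessibility and recovers the classes {1, 3} and {2, 4} for the chain of Figure 3.1 (states relabeled 0 through 3):

```python
def communicating_classes(P):
    """Partition states {0,...,n-1} into equivalence classes of <->,
    where i -> j means p_ij^(m) > 0 for some m >= 1 (and i <-> i always)."""
    n = len(P)
    # reach[i] = states accessible from i; seed with one-step arcs plus i itself
    reach = [{j for j in range(n) if P[i][j] > 0} | {i} for i in range(n)]
    changed = True
    while changed:  # transitive closure by iteration
        changed = False
        for i in range(n):
            new = set().union(*(reach[j] for j in reach[i]))
            if not new <= reach[i]:
                reach[i] |= new
                changed = True
    classes = []
    for i in range(n):
        cls = {j for j in range(n) if i in reach[j] and j in reach[i]}
        if cls not in classes:
            classes.append(cls)
    return classes

# Figure 3.1 chain, states relabeled 0-3
P = [[1/3, 0, 2/3, 0], [0, 1/4, 0, 3/4],
     [1/2, 0, 1/2, 0], [0, 1/4, 0, 3/4]]
print(communicating_classes(P))  # [{0, 2}, {1, 3}]
```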
Definition 3.1.2 A Markov chain is called irreducible if the state space
consists of only one equivalence class. It is called decomposable if there
are at least two equivalence classes.
By this definition, for an irreducible Markov chain with transition matrix P and any states i, j, there exists k such that p^(k)_ij > 0. (This k may depend on i and j.) That is, all states communicate with each other. Sometimes this property is used to define irreducibility. If the chain is decomposable, we may classify the equivalence classes according to whether the states in a class are recurrent or transient.
Definition 3.1.3 For finite-state Markov chains, a state i is called recurrent if whenever i → j for a state j, it must be that j → i. A state is called transient if it is not recurrent.
According to this definition, a state i in a finite-state Markov chain is
recurrent if there is no possibility of going to a state j from which there is no
return. A state i is transient if there is some state j that is accessible from
i, but from which there is no possible return. For the example of Figure 3.1,
all states are recurrent. In Examples 3.1.1-3.1.2, all states are recurrent.
In Example 3.1.3, state 1 is transient and states 2 and 3 are recurrent. Examples 3.1.4 and 3.1.5 give two countable-state Markov chains. We will see in Section 3.3 that the definition of recurrence for finite-state Markov chains is inadequate for the case with a countable state space. In Section 3.3, we shall propose a new definition.
Theorem 3.1.1 (a) For finite-state Markov chains, either all states in an equivalence class are transient or all are recurrent. (b) The finite state space can be partitioned into at least one recurrent class and at most one transient class. (c) For an irreducible finite-state Markov chain, all states are recurrent.
Proof. (a) If there is no transient state in an equivalence class, then this class is recurrent. Suppose there is a transient state i in a class, so that i → j for some j but j ↛ i, and suppose that i and m are in the same class, i.e., i ↔ m. Then m → i and i → j, so m → j. But j ↛ m; otherwise, j → m → i, leading to a contradiction. Thus, m is transient. Therefore, all states in the class containing i are transient.
(b) First, we show that there is at least one recurrent state, so that there is at least one recurrent class. Suppose state i₁ is transient. Then there is a state i₂ such that i₁ → i₂ but i₂ ↛ i₁. If i₂ is recurrent, then we are done. If i₂ is transient, then there is a state i₃ such that i₂ → i₃ but i₃ ↛ i₂. Continuing in this way, we obtain a directed graph that does not contain any cycle. Since the state space is finite, there is a last state j which does not have an outgoing arc. Such a state j must be recurrent because p_jj = 1 and p_jk = 0 for all k ≠ j by construction. Now, all transient states form one class if there exists a transient state. The remaining states are recurrent and can be partitioned into disjoint classes of recurrent states using the equivalence relation ↔.
(c) By definition, an irreducible Markov chain has only one class. If this class were not recurrent, then there would be a state i in this class and a state j outside this class such that i → j but j ↛ i. The state j belongs to another equivalence class, contradicting the irreducibility of the Markov chain.
Remark 3.1.1 Stokey and Lucas with Prescott (1989) call a recurrent class an ergodic set. They define an ergodic set as a subset of states E such that Σ_{j∈E} p_ij = 1 for each i ∈ E and no proper subset of E has this property. Exercise 1 asks readers to check that this definition is equivalent to our definition of a recurrent class for finite Markov chains.
In terms of the graph of a Markov chain, a class is transient if there is some directed arc going from a node in the class to a node outside the class. Since an irreducible Markov chain has only one class, this class must be recurrent if the state space is finite. For the example of Figure 3.1, {1, 3} and {2, 4} are two recurrent classes. In Examples 3.1.1 and 3.1.2, all states are recurrent and the Markov chain is irreducible. The Markov chain in Example 3.1.3 has two classes. One class consists of state 1 only and is transient. The other class consists of states 2 and 3 and is recurrent.
Next, we classify states according to their periods.
Definition 3.1.4 The period of a state i, denoted d(i), is the greatest common divisor of those values of m for which p^(m)_ii > 0. If the period is 1, the state is aperiodic, and if the period is at least 2, the state is periodic.
By this definition, the state i is periodic with period d(i) if a return to i is impossible except in d(i), 2d(i), 3d(i), ... steps, and d(i) ≥ 1 is the greatest integer with this property. Clearly this definition applies to recurrent states only. It also applies to both finite- and countable-state Markov chains. For the example of Figure 3.1 and Examples 3.1.1 and 3.1.4, all states are aperiodic. In Example 3.1.3, the two recurrent states 2 and 3 are aperiodic. In Examples 3.1.2 and 3.1.5, all states have period 2.
Theorem 3.1.2 For any finite- or countable-state Markov chain, all states
in the same class have the same period.
Proof. Let i and j be any distinct pair of states in a class. Then i ↔ j, so there is some r such that p^(r)_ij > 0 and some s such that p^(s)_ji > 0. Thus, by the Chapman–Kolmogorov equation (3.1), p^(r+s)_ii > 0. By definition, r + s must be divisible by d(i). Let t be any integer such that p^(t)_jj > 0. By the Chapman–Kolmogorov equation (3.1), p^(r+s+t)_ii > 0. Thus, r + s + t must be divisible by d(i), and thus t is divisible by d(i). Since this is true for any t such that p^(t)_jj > 0, d(j) is divisible by d(i). By a symmetric argument, d(i) is divisible by d(j), so d(i) = d(j).
We have classified each class of states for a finite-state Markov chain in
terms of its period and in terms of whether it is recurrent or transient. In
summary, we can reorder states so that a transition matrix can be written
as follows:

P = [ A1   0   ...   0
       0   A2  ...   0
      ...  ...  ...  ...
      B1   B2  ...  Ak ],
where A1, ..., Ak−1 correspond to k − 1 classes of recurrent states. The last
block corresponds to a single class of transient states. Each sub-matrix Ai,
i = 1, ..., k − 1, forms a transition matrix. The states corresponding to this
matrix have the same period. We may further reorder states in a submatrix
with period d so that it can be decomposed into d subclasses as in Example
3.1.2. In Example 3.1.6, states {1} and {4} form two recurrent classes.
States {2, 3} form one transient class.
3.1.2 Stationary Distribution
Suppose the initial distribution on the state space X is given by a row vector π_0. Then the distribution in period t is given by π_t = π_0 P^t. We are interested in long-run properties of Markov chains. That is, does π_t converge to a distribution? Does this convergence depend on the initial distribution π_0? We formally introduce the following definition to study these questions.
Definition 3.1.5 An invariant distribution or stationary distribu-
tion for a Markov chain with the transition matrix P is a distribution (row
vector) π such that π = πP.
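Because π = πP together with Σ_j π_j = 1 is a linear system, a stationary distribution can be computed directly. A Python/NumPy sketch (our own code), applied to Example 3.1.1 with the arbitrary values p = 0.8 and q = 0.7; it reproduces the rows of lim P^m found there:

```python
import numpy as np

def stationary(P):
    """Solve pi = pi P together with sum(pi) = 1 as a linear system,
    i.e. stack (P' - I) pi' = 0 with the adding-up constraint."""
    n = len(P)
    A = np.vstack([np.asarray(P).T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

p, q = 0.8, 0.7
P = [[p, 1 - p], [1 - q, q]]
pi = stationary(P)
# Matches each row of lim P^m from Example 3.1.1:
# ((1-q)/(2-p-q), (1-p)/(2-p-q)) = (0.6, 0.4)
print(pi)
```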
The following theorem gives a sufficient condition for the existence and
uniqueness of a stationary distribution.
Theorem 3.1.3 For a finite-state Markov chain with transition matrix P, if p^(k)_ij > 0 for some k > 0 and for all i, j, then the Markov chain has a unique stationary distribution π such that lim_{t→∞} p^(t)_ij = π_j > 0 for all j and all i.

Proof. Fix any column j of P^t for all t ≥ 1. Let m_t = min_i p^(t)_ij and M_t = max_i p^(t)_ij. We will prove that:

1. The sequence {m_t} is increasing.
2. The sequence {Mt} is decreasing.
3. Δt =Mt −mt converges to zero exponentially fast.
To prove the first property, we apply the Chapman-Kolmogorov equation:

m_{t+1} = min_i ∑_v p_iv p_vj^(t) ≥ min_i ∑_v p_iv m_t = m_t.
The second property follows from a similar argument. To prove the third
property, we temporarily suppose all p_ij > 0. Let δ = min_{i,j} p_ij > 0. Since
there are at least two states, it follows from ∑_j p_ij ≥ 2δ that δ ≤ 1/2. Let l
be the row such that M_t = p_lj^(t). Then:

m_{t+1} = min_i p_ij^(t+1) = min_i ∑_k p_ik p_kj^(t)
        = min_i [ p_il M_t + ∑_{k≠l} p_ik p_kj^(t) ]
        ≥ min_i [ p_il M_t + ∑_{k≠l} p_ik m_t ]
        = min_i [ p_il M_t + (1 − p_il) m_t ]
        ≥ δ M_t + (1 − δ) m_t,

where the last inequality holds because M_t ≥ m_t, so the bracketed expression is smallest when p_il = δ. Similarly,

M_{t+1} ≤ δ m_t + (1 − δ) M_t.

Taking the above two inequalities together yields:

Δ_{t+1} = M_{t+1} − m_{t+1} ≤ (1 − 2δ)(M_t − m_t) = (1 − 2δ)Δ_t ≤ (1 − 2δ)^{t+1}.
Letting t → ∞, we find that m_t and M_t converge to a common limit,
denoted π_j. Since m_t is increasing, π_j = lim_t m_t ≥ m_1 > 0. For the case
with p_ij^(k) > 0 for some k > 0, we consider the subsequences m_kt and M_kt. Applying
a similar argument, we can show that M_kt − m_kt ≤ (1 − 2δ)^t. Letting t → ∞ gives
the desired result as well.
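The geometric contraction in the proof is easy to check numerically. The sketch below (with a hypothetical strictly positive transition matrix) tracks the gap Δ_t = M_t − m_t for a fixed column of P^t and confirms the bound Δ_t ≤ (1 − 2δ)^t:

```python
import numpy as np

# Hypothetical strictly positive transition matrix.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
delta = P.min()  # smallest one-step transition probability

j = 0            # track column j of P^t
Pt = P.copy()
gaps = []
for t in range(1, 11):
    gaps.append(Pt[:, j].max() - Pt[:, j].min())  # Delta_t = M_t - m_t
    Pt = Pt @ P

# Delta_t shrinks at least geometrically at rate (1 - 2*delta).
assert all(g <= (1 - 2 * delta) ** t for t, g in enumerate(gaps, start=1))
```

For this matrix the gap actually decays like 0.3^t, faster than the guaranteed rate (1 − 2δ)^t = 0.4^t.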
Since Example 3.1.1 satisfies the assumption of the theorem, there ex-
ists a unique stationary distribution, and each row of the limiting transition
matrix gives the stationary distribution. Example 3.1.2 violates the assump-
tion of the theorem, and the convergence fails: p_ij^(t) does not converge as
t → ∞. When the dimension of the transition matrix is large, the condition in the theo-
rem may be hard to check. The following theorem gives a simple equivalent
condition.
Theorem 3.1.4 A finite-state Markov chain is irreducible and aperiodic if
and only if there exists k0 > 0 such that p(k)ij > 0 for all i, j and all k ≥ k0.
Proof. The “if” part is trivial. We consider the “only if” part. Since
the Markov chain is irreducible, for each state j there exists some m ≥ 1 such that p_jj^(m) > 0.
Define the set:

S = { k ≥ 1 : p_jj^(k) > 0 }.

By the Chapman-Kolmogorov equation, p_jj^(k+l) ≥ p_jj^(k) p_jj^(l). Thus, if k, l ∈ S,
then k + l ∈ S. A lemma in number theory states that if a set of positive integers
is closed under addition and has greatest common divisor 1, then it contains
all integers exceeding some k_j (see Appendix A21 of Billingsley (1995)). We
can apply this lemma because the Markov chain is aperiodic. Given i and
j, there exists r_ij such that p_ij^(r_ij) > 0 since the Markov chain is irreducible.
For all k > k_ij ≡ k_j + r_ij, p_ij^(k) ≥ p_ij^(r_ij) p_jj^(k−r_ij) > 0. Since the state space is
finite, we can choose k_0 to be the largest of the k_ij.
By definition, for an irreducible Markov chain, for any i, j there exists k > 0 such
that p_ij^(k) > 0. However, this k may take different values for
different i or j. The force of the theorem is that we can choose a k_0 common
to all i, j such that p_ij^(k) > 0 for all k ≥ k_0. The Markov chain in Example
3.1.2 has period 2, which violates the assumption of the theorem: no such
common k_0 exists, and p_ij^(k) fails to converge.
For the general non-irreducible case, we can partition the state space
into several classes of recurrent states and one class of transient states. Then
each recurrent class is associated with an invariant distribution whose support
is that recurrent class. All invariant distributions are linear combinations of
the invariant distributions associated with recurrent classes. Thus, if there
are at least two recurrent classes, then there are infinitely many invariant
distributions. See Theorem 11.1 in Stokey, Lucas and Prescott (1989) for a
more formal proof of this result.
3.1.3 Countable-State Markov Chains
Countable-state Markov chains exhibit some types of behavior not possible
for finite-state Markov chains. We use Examples 3.1.4-3.1.5 to illustrate the
new issues posed by countable state spaces. For Example 3.1.5, all states
communicate with all other states, and are thus in the same class. The
chain is irreducible and all states have period 2. The probability of each
state approaches zero as time goes to infinity. These probabilities cannot
form a steady-state distribution.
Next, consider Example 3.1.4. This chain is irreducible and aperiodic.
For p > 0.5, transitions to the right occur with higher frequency than tran-
sitions to the left. Thus, reasoning heuristically, we expect the state
X_t at time t to drift to the right as t increases. Given X_0 = 0, the prob-
ability p_0j^(l) of being in state j at time l should tend to zero for any fixed j
as l increases. Thus, a steady state cannot exist.
It is interesting to compare with the truncated finite-state Markov chain.
In particular, suppose there are k states {0, 1, ..., k − 1}. Transition probabil-
ities are given in Figure 3.2; in addition, p_{k−1,k−1} = p. In this case, Exercise
3 asks readers to show that the stationary distribution π = (π_0, π_1, ..., π_{k−1})
is given by π_i = (1 − ρ) ρ^i / (1 − ρ^k), where ρ = p/q and ρ ≠ 1. For ρ = 1,
π_i = 1/k. Now consider the limiting behavior as k → ∞. For ρ < 1 (p < 0.5),
π_i → (1 − ρ) ρ^i for all i, which we later interpret as the stationary distri-
bution for the untruncated chain. On the other hand, for ρ > 1 (p > 0.5),
π_i → 0 for all i, so this limit cannot be the stationary distribution for the
untruncated chain.
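The claimed stationary distribution of the truncated chain can be verified directly. The sketch below assumes, as a hypothetical reading of Figure 3.2 (which is not reproduced here), that the chain steps right with probability p, left with probability q = 1 − p, and stays put at both boundaries:

```python
import numpy as np

# Truncated birth-death chain on {0, ..., k-1}: step right w.p. p,
# left w.p. q = 1 - p; assumed to stay put at both boundaries
# (p_{00} = q and p_{k-1,k-1} = p).
k, p = 6, 0.3
q = 1 - p
P = np.zeros((k, k))
for i in range(k - 1):
    P[i, i + 1] = p
    P[i + 1, i] = q
P[0, 0] = q
P[k - 1, k - 1] = p

rho = p / q
pi = (1 - rho) * rho ** np.arange(k) / (1 - rho ** k)
assert np.allclose(pi @ P, pi)   # pi is stationary
assert np.isclose(pi.sum(), 1.0)
```

The check works because the chain satisfies detailed balance, π_i p = π_{i+1} q, which forces π_i ∝ ρ^i.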
These two examples motivate us to redefine recurrence. To do so, we
denote by τ_i the first return time of the Markov chain {X_t}_{t=0}^∞ to the
state i,

τ_i = min { k ≥ 1 : X_k = i }.

Also denote by Pr the probability distribution of {X_t}.
Definition 3.1.6 The state i is called:

• positive recurrent if Pr(τ_i < ∞ | X_0 = i) = 1 and E[τ_i | X_0 = i] < ∞;

• null recurrent if Pr(τ_i < ∞ | X_0 = i) = 1 and E[τ_i | X_0 = i] = ∞;

• transient if Pr(τ_i < ∞ | X_0 = i) < 1.
According to this definition, state i is positive (null) recurrent if, starting
at i, the Markov chain eventually returns to i with certainty and also the
mean return time is finite (infinite). State i is transient if, with positive
probability, the Markov chain started at state i never returns to it.
Given this new definition of recurrence, parts (a)-(b) of Theorem
3.1.1 are still true, but the proof must be modified (see Lemma 1 on page
575 of Shiryaev (1996)). In addition, a recurrent class for countable state
spaces is further classified into either positive recurrent or null recurrent.
Part (c) is no longer true. For an irreducible countable-state Markov chain,
all states can be either transient or recurrent (see Theorem 8.3 in Billingsley
(1995)). For Example 3.1.4, all states are transient if p > 0.5. The notion of
period in Definition 3.1.5 and Theorem 3.1.2 still apply to countable-state
Markov chains. Theorem 3.1.4 also applies, but the proof must be modified
(see Lemma 2 in Section 8 of Billingsley (1995)). Theorem 3.1.3 does not
apply and is modified below (see Theorem 8.8 in Billingsley (1995)):
Theorem 3.1.5 For an irreducible, aperiodic Markov chain, there are three
possibilities:
(a) The chain is transient. For all i and j, lim_{k→∞} p_ij^(k) = 0 and ∑_k p_ij^(k) < ∞.

(b) The chain is null recurrent and there exists no stationary distribu-
tion. For all i, j, lim_{k→∞} p_ij^(k) = 0, but so slowly that ∑_k p_ij^(k) = ∞, and
E[τ_j | X_0 = j] = ∞.

(c) The chain is positive recurrent and there exists a unique stationary
distribution π such that lim_{k→∞} p_ij^(k) = π_j = 1/E[τ_j | X_0 = j] > 0 for all i, j.
Part (c) gives the intuitive meaning that the stationary probability
of a state equals the long-run frequency of visits to that state. For a
finite state space, cases (a)-(b) cannot happen because ∑_j p_ij^(k) = 1.
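The identity π_j = 1/E[τ_j | X_0 = j] in part (c) can be checked by hand for a two-state chain. The sketch below uses a hypothetical chain with P = [[1 − a, a], [b, 1 − b]], for which π_0 = b/(a + b), while the return time to state 0 equals 1 with probability 1 − a and equals k ≥ 2 with probability a(1 − b)^{k−2} b:

```python
import numpy as np

# Hypothetical two-state chain: P = [[1-a, a], [b, 1-b]].
a, b = 0.3, 0.2
pi0 = b / (a + b)  # stationary probability of state 0

# Return-time law from state 0:
#   tau = 1 w.p. 1 - a;  tau = k w.p. a * (1-b)**(k-2) * b for k >= 2.
ks = np.arange(2, 2000)  # truncate the series; (1-b)**k is negligible beyond
mean_return = 1 * (1 - a) + (ks * a * (1 - b) ** (ks - 2) * b).sum()

assert np.isclose(mean_return, 1 / pi0)  # E[tau_0 | X_0 = 0] = 1/pi_0
```

Here E[τ_0 | X_0 = 0] = (a + b)/b = 2.5 and π_0 = 0.4, so the identity holds exactly.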
3.2 General Markov Processes
General Markov processes can be conveniently characterized by transition
functions, analogous to transition matrices for finite-state Markov chains.
We fix measurable spaces (X,X ) and (Y,Y).
Definition 3.2.1 A function P : X × Y → [0, 1] is called a stochastic
kernel if:
(a) For any x ∈ X, P(x, ·) is a probability measure on (Y, Y).

(b) For any A ∈ Y, P(·, A) is an X-measurable function.
If the two spaces (X,X ) and (Y,Y) are identical, then P is called a
transition function.
Given any time-homogeneous Markov process {Xt} on a probability
space (Ω,F ,Pr) with the state space X, we can construct a transition func-
tion P : X× X → [0, 1] as follows:
P(x, A) = Pr(X_{t+1} ∈ A | X_t = x),  (x, A) ∈ X × X.
The interpretation is that P (x,A) is the probability that the next period’s
state lies in the set A, given that the current state is x.
Next, we consider composition of stochastic kernels. The following result
is needed. Its proof is available in Stokey and Lucas with Prescott (1989, p.
228).
Theorem 3.2.1 Let (W, W), (X, X), and (Y, Y) be measurable spaces, and
let P_1 and P_2 be stochastic kernels on W × X and X × Y, respectively. Then
the map P_3 defined by:

P_3(w, A × B) = ∫_A P_2(x, B) P_1(w, dx),

for all w ∈ W and all A ∈ X, B ∈ Y, is a stochastic kernel on (W, X × Y).
The kernel P_3 is often called the direct product of P_1 and P_2 and
denoted P_1 ⊗ P_2. The direct product gives the probability of landing in A × B
when starting at the state w. It is equal to the probability of first going to a
point x ∈ A and then landing in B, integrated over all points in A. Given a
transition function P, one can construct a corresponding time-homogeneous
Markov process. We make use of Theorem 3.2.1 and the following important
result in probability theory:
Theorem 3.2.2 (Kolmogorov Extension Theorem) Let {μ_t} be a sequence
of probability measures, with μ_t defined on X^t, such that

μ_t(A_1 × ... × A_t) = μ_{t+1}(A_1 × ... × A_t × X), (3.2)

for all t and all A_i ∈ X. Then there exists a unique probability measure μ
on X^∞ such that μ_t(A) = μ(A × X^∞) for each A ∈ X^t.
Equation (3.2) is often called the Kolmogorov consistency condition.
This condition and the theorem can be stated in terms of finite dimensional
distributions. We refer to Shiryaev (1996, p. 246-247) for a proof.
Theorem 3.2.3 Let μ0 be a probability measure on X and P be a transition
function on (X,X ) . Then there exists a unique probability measure Pr on
(X^∞, X^∞) and a Markov process {X_t}_{t=0}^∞ on (X^∞, X^∞, Pr) with the state
space X such that (a) Pr(X_{t+1} ∈ A | X_t = x) = P(x, A) for all (x, A) ∈ X × X,
and (b) the distribution of X_0 is μ_0.
Proof. Define a sequence {μ_t}_{t=0}^∞ as follows:

μ_t(A_0 × A_1 × ... × A_t)
= ∫_{A_0} ∫_{A_1} ... ∫_{A_{t−1}} P(x_{t−1}, A_t) P(x_{t−2}, dx_{t−1}) ... P(x_0, dx_1) μ_0(dx_0),

for all t ≥ 0 and all A_i ∈ X, i = 0, 1, ..., t. Repeatedly applying
Theorem 3.2.1, we can show that μ_t is a probability measure on X^{t+1}. We can
check that {μt} satisfies (3.2). By the Kolmogorov Extension Theorem, there
exists a unique probability measure Pr on (X^∞, X^∞). Define the sequence
{X_t}_{t=0}^∞ by:

X_t(x_0, x_1, ..., x_t, ...) = x_t, (3.3)

for any {x_t}_{t=0}^∞ ∈ X^∞. Then we can verify that {X_t} is the desired Markov
process.
Remark 3.2.1 The constructed probability space (X∞,X∞,Pr) is called
canonical, and the construction given by (3.3) is called the coordinate
method of constructing a stochastic process. This method is quite general
and applies to continuous time (see Shiryaev (1996)).
An extension of the above theorem is the following:
Theorem 3.2.4 (Ionescu Tulcea’s Theorem on Extending a Measure and
the Existence of a Stochastic Process) Let (Ω_n, F_n), n = 0, 1, 2, ..., be any
measurable spaces, and let Ω = Π_{n=0}^∞ Ω_n and F = ⊗_{n=0}^∞ F_n be the product
measurable space. Suppose that a probability measure Q_0 is given on (Ω_0, F_0)
and that a sequence of stochastic kernels Q_n : (Ω_0 × Ω_1 × ... × Ω_{n−1}) × F_n → [0, 1],
n ≥ 1, is given. Let

P_n(A_0 × A_1 × ... × A_n)
= ∫_{A_0} Q_0(dω_0) ∫_{A_1} Q_1(ω_0; dω_1) ··· ∫_{A_n} Q_n(ω_0, ω_1, ..., ω_{n−1}; dω_n),
for all Ai ∈ Fi, n ≥ 1. Then there exists a unique probability measure Pr on
(Ω,F) such that
Pr (ω : ω0 ∈ A0, ..., ωn ∈ An) = Pn (A0 ×A1 × ...×An)
and there is a stochastic process X = (X0 (ω) ,X1 (ω) , ...) such that
Pr (ω : X0 (ω) ∈ A0, ...,Xn (ω) ∈ An) = Pn (A0 ×A1 × ...×An) ,
where Ai are Borel sets for some measurable spaces (Xi,Xi) .
Proof. See the proof of Theorem 2 in Shiryaev (1996, p.249-251).
The following result is fundamental to the description of Markov pro-
cesses.
Theorem 3.2.5 Let {Xt} be a time-homogeneous Markov process with the
transition function P : X × X → [0, 1]. Define n-step transitions recursively
as follows:

P^1 = P,  P^n(a, A) = ∫_X P(x, A) P^{n−1}(a, dx), (3.4)

where (a, A) ∈ X × X. Then P^n is a transition function for each n ≥ 2.
The interpretation is that Pn (a,A) is the probability of going from the
point a to the set A in exactly n periods. Equation (3.4) is called a Chapman-
Kolmogorov equation, analogous to (3.1). Exercise 2 asks readers to prove
this theorem and also the following result:
P^{m+n}(a, A) = ∫_X P^m(x, A) P^n(a, dx) for each m, n ≥ 1. (3.5)
Next, we introduce two important operators associated with a Markov
process. Let B (X,X ) denote the set of all bounded and X -measurable
functions on X. Let P(X, X) denote the set of all probability measures on
(X, X).
Definition 3.2.2 The Markov operator T associated with the transition
function P : X × X → [0, 1] is a map on B(X, X) such that:

Tf(x) = ∫ f(y) P(x, dy) for any f ∈ B(X, X).

The adjoint operator T* of T is a map on P(X, X) such that:

T*λ(A) = ∫ P(x, A) λ(dx) for any λ ∈ P(X, X).
Exercise 4 asks readers to show that T is a self-map on B(X, X) and T*
is a self-map on P(X, X). An intuitive property of T and T* is that

∫ (Tf)(x) λ(dx) = ∫ f(y) (T*λ)(dy) = ∫∫ f(y) P(x, dy) λ(dx),

for any f ∈ B(X, X) and any λ ∈ P(X, X). The above equation gives the
expected value of f next period, if λ is the probability measure over the
current state and the conditional probability is P(x, ·).

Let T^n and T^{*n} be the operators corresponding to the n-step transitions P^n. It
is straightforward to verify that

T^{n+m} f = T^n (T^m f), f ∈ B(X, X),
T^{*(n+m)} λ = T^{*n} (T^{*m} λ), λ ∈ P(X, X).
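On a finite state space, T and T* reduce to matrix products: T acts on a function (column vector) f as Pf, and T* acts on a distribution (row vector) λ as λP. The duality displayed above is then just associativity of matrix multiplication; the numbers below are hypothetical:

```python
import numpy as np

# Finite-state sketch (hypothetical 2-state transition matrix).
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
f = np.array([1.0, -2.0])     # a bounded "function" on the state space
lam = np.array([0.25, 0.75])  # a distribution over the current state

Tf = P @ f            # (Tf)(x) = sum_y f(y) P(x, {y})
Tstar_lam = lam @ P   # (T* lam)({y}) = sum_x P(x, {y}) lam({x})

# Duality: the lam-integral of Tf equals the (T* lam)-integral of f.
assert np.isclose(lam @ Tf, Tstar_lam @ f)
```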
It follows that given any initial probability measure λ0 on (X,X ) , we can
define a sequence of probability measures {λn} on (X,X ) by:
λn = T ∗λn−1 = T ∗nλ0.
The interpretation is that given the initial distribution λ0, λn is the proba-
bility measure over the state in period n. As in the case of Markov chains,
we define a fixed point π of T ∗ as an invariant distribution or stationary
distribution. It satisfies the equation:
π (A) =
∫P (x,A) π (dx) , all A ∈ X .
We will use the terms invariant distribution and stationary distribution
interchangeably below.
3.3 Convergence
We are interested in the existence and uniqueness of an invariant distribu-
tion π. In addition, we are interested in the long-run stability of the Markov
process, i.e., whether or not T^{*n}λ_0 converges to π for any initial distribu-
tion λ_0. Unlike the finite-state case, the mode of convergence of a sequence
of probability measures {λn} is delicate. We need to introduce some metric
to measure the distance between probability measures. We shall study two
metrics that induce two notions of convergence – strong and weak. The for-
mer implies the latter. Since the theory of strong convergence for Markov
processes is developed in close analogy to the theory of Markov chains, we
study it first.
3.3.1 Strong Convergence
We define the total variation distance on a set of probability measures
P (X,X ) by:
d_TV(λ_1, λ_2) = sup_{A∈X} |λ_1(A) − λ_2(A)|,  λ_1, λ_2 ∈ P(X, X). (3.6)
Intuitively, this is the largest possible difference between the probabilities
that the two probability measures can assign to the same event. An im-
portant fact is that the space P (X,X ) endowed with the total variation
distance is a complete metric space (see Stokey and Lucas with Prescott
(1989, Lemma 11.8)). Exercise 5 asks readers to show that the total varia-
tion distance in (3.6) can be written in terms of Radon-Nikodym derivatives:

d_TV(λ_1, λ_2) = (1/2) ∫ |dλ_1/dν − dλ_2/dν| dν
             = 1 − ∫ min(dλ_1/dν, dλ_2/dν) dν
             ≡ 1 − λ_1 ∧ λ_2, (3.7)

where ν = λ_1 + λ_2, so that both λ_1 and λ_2 are absolutely continuous with
respect to ν, and dλ_1/dν and dλ_2/dν denote the Radon-Nikodym deriva-
tives. In the second equality, we have used the fact that |x − y| = x + y − 2 min(x, y).
Definition 3.3.1 The sequence of probability measures {λ_n} in P(X, X)
converges strongly to λ if lim_{n→∞} d_TV(λ_n, λ) = 0.
Remark 3.3.1 There are several equivalent ways of defining strong conver-
gence. Stokey and Lucas with Prescott (1989) provide a good discussion on
these definitions.
Theorem 3.3.1 below provides a sharp characterization of the stability
of Markov processes in terms of strong convergence. This theorem needs
the following condition, which appears in Onicescu (1969) and is discussed by
Stokey and Lucas with Prescott (1989):3

Condition M. There exists ε > 0 and an integer N ≥ 1 such that for any
A ∈ X, either P^N(x, A) ≥ ε for all x ∈ X, or P^N(x, A^c) ≥ ε for all x ∈ X.
3 Stokey and Lucas with Prescott (1989) also discuss another condition due to Doeblin and its implications for the existence of invariant measures.
Theorem 3.3.1 Let P(X, X) be the space of probability measures on the
measurable space (X, X) with the total variation distance, let P be a transi-
tion function on (X,X ) , and let T ∗ be the adjoint operator associated with
P . If P satisfies Condition M for N ≥ 1 and ε > 0, then there exists a
unique probability measure π in P (X,X ) such that
dTV (T∗Nkλ0, π) ≤ (1− ε)k dTV (λ0, π), (3.8)
for all λ0 ∈ P (X,X ) , k = 1, 2, .... Conversely, if (3.8) holds, then Condition
M is satisfied for some N and ε > 0.
Proof. We adapt the proof of Theorem 11.12 in Stokey and Lucas with
Prescott (1989). Suppose Condition M holds. We first show that T^{*N} is a
contraction of modulus (1 − ε) on the space P(X, X) with the total variation
distance. Let λ_1 ∧ λ_2 also denote the measure with density min(dλ_1/dν, dλ_2/dν)
with respect to ν. Define two measures λ̂_i, i = 1, 2, by:

λ_i = λ_1 ∧ λ_2 + d_TV(λ_1, λ_2) λ̂_i.

By (3.7), we can verify that each λ̂_i is a probability measure in P(X, X). For any
sets A, A^c ∈ X, suppose without loss of generality that P^N(x, A) ≥ ε
for all x ∈ X. Then

d_TV(T^{*N}λ_1, T^{*N}λ_2)
= d_TV(λ_1, λ_2) sup_{A∈X} | ∫ P^N(x, A) λ̂_1(dx) − ∫ P^N(x, A) λ̂_2(dx) |
≤ (1 − ε) d_TV(λ_1, λ_2).

Since P(X, X) is a complete metric space with the total variation distance,
the desired result follows from the Contraction Mapping Theorem.
Conversely, suppose that (3.8) holds. Let δ_{x} denote the Dirac measure
that puts a unit mass on {x}. Choosing any x ∈ X, we have

|P^{Nk}(x, A) − π(A)| ≤ d_TV(P^{Nk}(x, ·), π(·))
= d_TV(T^{*Nk} δ_{x}, π)
≤ (1 − ε)^k d_TV(δ_{x}, π)
≤ (1 − ε)^k.

Choose K sufficiently large such that (1 − ε)^K ≤ 1/4. Let A, A^c ∈ X be
given, and without loss of generality suppose that π(A) ≥ 1/2. Then
|P^{NK}(x, A) − π(A)| ≤ 1/4 implies that P^{NK}(x, A) ≥ 1/4. Condition M
thus holds for N_0 = NK and ε_0 = 1/4.
Meyn and Tweedie (1993) provide another condition for the stability of
Markov processes.
Minorization Condition. There exist α > 0, a probability measure μ in
P(X, X), and an integer N ≥ 1 such that P^N(x, ·) ≥ αμ(·) for all x ∈ X.
Proposition 3.3.1 The Minorization Condition and Condition M are equiv-
alent.
Proof. Suppose the Minorization Condition holds. For any set A ∈ X,
if μ(A) ≥ 1/2, then P^N(x, A) ≥ αμ(A) ≥ α/2. If μ(A) ≤ 1/2, then
P^N(x, A^c) ≥ αμ(A^c) ≥ α/2. So Condition M holds by setting ε = α/2.

Conversely, suppose Condition M holds. Pick any A ∈ X. Define the
probability measure μ = δ_A, where δ_A denotes the Dirac measure that
puts a unit mass on A. Then the Minorization Condition holds by setting
α = ε.
We now give some examples to verify Condition M or the Minorization
Condition.
Example 3.3.1 Suppose there is a point x_0 ∈ X, an integer N ≥ 1, and
a number ε > 0 such that P^N(x, {x_0}) > ε for all x ∈ X. Then for
any set A ∈ X containing x_0, P^N(x, A) ≥ P^N(x, {x_0}) > ε. For any set
A ∈ X that does not contain x_0, if P^N(x, A) > ε, then Condition M is
satisfied. But if P^N(x, A) ≤ ε, then we consider A^c. Since A^c contains x_0,
P^N(x, A^c) ≥ P^N(x, {x_0}) > ε. Thus, Condition M is also satisfied.
Example 3.3.2 Suppose the transition function P has a density p(x, x′) ≥ ε > 0
with respect to a measure μ ∈ P(X, X), i.e.,

P(x, A) = ∫_A p(x, x′) μ(dx′).

Then the Minorization Condition is satisfied.
Example 3.3.3 Suppose X is countable. If there exists an integer N ≥ 1
such that

α ≡ ∑_{y∈X} inf_{x∈X} P^N(x, {y}) > 0, (3.9)

then the Minorization Condition is satisfied for (N, α, ν), where ν is the mea-
sure defined by ν({y}) = α^{−1} inf_{x∈X} P^N(x, {y}). If X is finite, then condition
(3.9) is satisfied for any irreducible and aperiodic Markov chain by Theorem
3.1.4.
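Condition (3.9) is straightforward to check numerically. The sketch below computes α and the minorizing measure ν for a hypothetical three-state irreducible, aperiodic chain, using N = 2:

```python
import numpy as np

# Hypothetical irreducible, aperiodic 3-state chain
# (the self-loops guarantee aperiodicity).
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])
N = 2
PN = np.linalg.matrix_power(P, N)

alpha = PN.min(axis=0).sum()   # alpha = sum_y inf_x P^N(x, {y})
assert alpha > 0               # condition (3.9) holds
nu = PN.min(axis=0) / alpha    # the minorizing probability measure

# Minorization: P^N(x, .) >= alpha * nu(.) for every x.
assert np.all(PN >= alpha * nu - 1e-12)
```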
3.3.2 Weak Convergence
Strong convergence may be too demanding, ruling out some intuitive limiting be-
havior. For example, consider the sequence of Dirac measures {δ_{1/n}}. We
may expect this sequence to converge to δ_{0}. However, this is not
true under strong convergence. This motivates us to introduce a weaker
notion of convergence that accommodates the preceding example.
This notion is called weak convergence. For this convergence, we need to
introduce topological structure on the state space. Throughout this subsec-
tion, we assume that X is a metric space and that X is the Borel space of
X.
Definition 3.3.2 Let {λ_n} and λ be measures in P(X, X), and let C(X)
be the space of bounded, continuous, real-valued functions on X. Then {λ_n}
converges weakly to λ (denoted λ_n ⇒ λ) if

lim_{n→∞} ∫ f dλ_n = ∫ f dλ, all f ∈ C(X).
Remark 3.3.2 As in the case of strong convergence, there are several equiv-
alent ways of defining weak convergence. Stokey and Lucas with Prescott
(1989) and Billingsley (1999) provide a good discussion on the characteri-
zation of weak convergence.
To study the weak convergence to an invariant distribution, we make use
of the Prohorov Theorem, which needs a tightness condition.
Definition 3.3.3 A family of probability measures M on (X, X) is tight if
for every ε > 0, there exists a compact set K ⊂ X such that λ(K) ≥ 1 − ε
for each λ ∈ M.
Theorem 3.3.2 (Prohorov) Let {μ_n} be a sequence of probability measures
on a complete and separable metric space X. Then there exist a probability
measure μ on X and a subsequence {μ_{n_k}} such that μ_{n_k} ⇒ μ if and only if
{μ_n} is tight.
Proof. See Section 1.5 of Billingsley (1999).
To apply this theorem, we impose a certain continuity assumption on
Markov processes.
Definition 3.3.4 A transition function P on (X,X ) has the Feller prop-
erty if the associated operator T maps the space of bounded continuous
functions on X into itself, i.e., T : C(X) → C(X).
Theorem 3.3.3 (Krylov-Bogolubov) Let P be a Feller transition function
on a complete separable metric space X. If there exists x ∈ X such that
the sequence of probability measures {Pn (x, ·)} is tight, then there exists an
invariant probability measure for P .
Proof. Define a sequence of probability measures {μ_n} by:

μ_n(A) = (1/n) ∑_{k=1}^{n} P^k(x, A), any A ∈ X.
It follows that {μ_n} is tight. By the Prohorov Theorem, there exists a
subsequence {μ_{n_k}} such that μ_{n_k} ⇒ μ. Now, we show that μ is an invariant
distribution for P. In doing so, we take any function f ∈ C(X). Let T and
T* denote the Markov and adjoint operators associated with P, respectively.
Since μ_{n_k} ⇒ μ, for n_k sufficiently large,

| ∫ f(x) μ_{n_k}(dx) − ∫ f(x) μ(dx) | < ε.

Since P has the Feller property, Tf(x) is continuous, so that

| ∫ Tf(x) μ(dx) − ∫ Tf(x) μ_{n_k}(dx) | < ε,

for n_k sufficiently large. We then compute:

| ∫ f(x) (T*μ)(dx) − ∫ f(x) μ(dx) |
≤ | ∫ f(x) (T*μ)(dx) − ∫ f(x) (T*μ_{n_k})(dx) |
  + | ∫ f(x) (T*μ_{n_k})(dx) − ∫ f(x) μ_{n_k}(dx) |
  + | ∫ f(x) μ_{n_k}(dx) − ∫ f(x) μ(dx) |
≤ | ∫ Tf(x) μ(dx) − ∫ Tf(x) μ_{n_k}(dx) |
  + (1/n_k) | ∫ f(y) P^{n_k+1}(x, dy) − ∫ f(y) P(x, dy) |
  + ε
≤ 2ε + 2||f||/n_k.

Letting n_k → ∞ and noting that ε > 0 is arbitrary, we obtain

∫ f(x) (T*μ)(dx) = ∫ f(x) μ(dx)

for any f ∈ C(X). Thus, T*μ = μ.
Because every family of probability measures on a compact space is tight,
we immediately obtain the following:

Corollary 3.3.1 If X is a compact metric space (hence complete and separable),
then every Feller transition function P has an invariant distribution.
If there are multiple invariant distributions, not all of them describe the
long-run steady state in an intuitive way. For example, consider a determin-
istic transition P on {0, 1} defined by P(0, ·) = δ_{0} and P(1, ·) = δ_{1}.4
Every probability measure on {0, 1} is an invariant distribution. To see this, let
π({0}) = p and π({1}) = 1 − p, p ∈ [0, 1]. Then T*π = p δ_{0} + (1 − p) δ_{1} = π.
Clearly, π is not a good long-run stationary distribution for this system
unless p is zero or one. This motivates us to introduce the following:
Definition 3.3.5 Let P be a transition function on (X,X ) and let π be
an associated invariant distribution. The set A ∈ X is a π-invariant set
if P (x,A) = 1 for π-almost every x ∈ A. The measure π is ergodic if
either π (A) = 1 or π (A) = 0 for every π-invariant set A. If, in addition,
π (A) = 1, then A is an ergodic set.
4Recall the notation of the Dirac measure.
To ensure uniqueness of the invariant distribution, we need to impose more
assumptions on transition functions. We restrict attention to transition
functions that satisfy a certain monotonicity property.
Definition 3.3.6 A transition function P on (X,X ) is increasing if for
any bounded and increasing function f on X, T f is increasing.
The following condition is crucial for uniqueness.
Mixing Condition. Let X = [a, b] ⊂ Rl. There exists c ∈ X, ε > 0, and
N ≥ 1 such that PN (a, [c, b]) ≥ ε and PN (b, [a, c]) ≥ ε.
Theorem 3.3.4 Let X = [a, b] ⊂ Rl. If P is increasing, has the Feller
property, and satisfies the Mixing Condition, then P has a unique invariant
probability measure λ∗, and T ∗nλ0 ⇒ λ∗, for all λ0 ∈ P (X,X ) .
Proof. See the proof of Theorem 12.12 in Stokey and Lucas with
Prescott (1989).
In applications, we are often interested in how the equilibrium system
behaves when certain model parameters change. These parameters might
affect exogenous shocks, preferences or technology. The result below studies
how the long-run behavior changes when some parameters alter a transition
function.
Let the state space be X ⊂ R^l, with its Borel sets X, and let the exogenous
parameters be described by a vector θ ∈ Θ ⊂ R^m. For each θ ∈ Θ, let
P_θ be a transition function on (X, X), and let T_θ and T*_θ be the operators
associated with P_θ.
Theorem 3.3.5 Assume that
(a) X is compact;
(b) if {(xn, θn)} is a sequence in X ×Θ converging to (x0, θ0) , then the
sequence {Pθn (xn, ·)} in P (X,X ) converges weakly to Pθ0 (x0, ·) ; and
(c) for each θ ∈ Θ, T*_θ has a unique fixed point μ_θ ∈ P(X, X).

If {θ_n} is a sequence in Θ converging to θ_0, then the sequence {μ_{θ_n}}
converges weakly to μ_{θ_0}.
Proof. See the proof of Theorem 12.13 in Stokey and Lucas with
Prescott (1989).
3.4 Exercises
1. Prove the statement in Remark 3.1.1.
2. Prove Theorem 3.2.5 and equation (3.5).
3. Derive the stationary distribution for the truncated finite-state birth-
death Markov chain discussed in Section 3.1.3.
4. Prove that T is a self-map on B(X, X) and T* is a self-map on
P(X, X). In addition, show that T and T* satisfy

∫ (Tf)(x) λ(dx) = ∫ f(y) (T*λ)(dy) = ∫∫ f(y) P(x, dy) λ(dx).
5. Consider two measures λ_1, λ_2 ∈ P(X, X). Show the following:

(a) For any a < b,

d_TV(λ_1, λ_2) = (1/(b − a)) sup_{f:X→[a,b]} | ∫ f dλ_1 − ∫ f dλ_2 |.

(b) If λ_1 and λ_2 have densities g and h, respectively, with respect to
some measure ρ, then

d_TV(λ_1, λ_2) = 1 − ∫ min(g, h) dρ.
6. A transition matrix P is called doubly stochastic if

∑_j p_ij = 1 for all i, and ∑_i p_ij = 1 for all j.

If a doubly stochastic chain has N states and is irreducible and aperi-
odic, compute its stationary distribution.
Chapter 4

Ergodic Theory and Stationary Processes
Ergodic theory studies dynamical systems with an invariant distribution and
related problems. A central question of ergodic theory is how a dynamical
system behaves when it is allowed to run for a long time. The main results of
ergodic theory are various ergodic theorems which assert that, under certain
conditions, the time average of a function along the trajectories exists almost
everywhere and is related to the space average. Ergodic theory has been
applied to economics, especially to econometrics. It leads to laws of large
numbers for dependent random variables. In this chapter, we provide a
brief introduction to ergodic theory. We then discuss two of its applications.
First, we apply it to stationary processes and establish strong laws of large
numbers for stationary processes. Second, we apply it to stationary Markov
processes. We characterize the set of invariant measures and also establish
a strong law of large numbers for stationary Markov processes.
4.1 Ergodic Theorem
Fix a probability space (Ω,F ,Pr) .
Definition 4.1.1 The map θ : Ω → Ω is a measurable transformation
if for every A ∈ F ,
θ−1A = {ω ∈ Ω : θω ∈ A} ∈ F .
A dynamical system consists of (Ω,F ,Pr, θ) .
Definition 4.1.2 A measurable transformation θ : Ω → Ω is measure
preserving if for every A ∈ F, Pr(θ^{−1}A) = Pr(A).
Measure preserving means that the measure of a set is always the same
as the measure of the set of points which map into it. Let θ^n be the nth
iterate of θ, n ≥ 1, and let θ^0 be the identity transformation. Consider the
following examples.
Example 4.1.1 Let the probability space be (R∞,B (R∞) ,Pr). A real-valued
stochastic process X = (X0,X1,X2, ...) is defined on this space. Define the
shift operator θ by θX = (X_1, X_2, ...), so that θ^k X = (X_k, X_{k+1}, ...). It is a
measurable transformation. However, it is not measure-preserving in gen-
eral.
Example 4.1.2 Let Ω = {ω_1, ..., ω_n}, n ≥ 2. Let θω_k = ω_{k+1} for 1 ≤ k ≤ n − 1, and θω_n = ω_1. If Pr(ω_k) = 1/n, then θ is measure-preserving.
Definition 4.1.3 Let θ be a measurable transformation on (Ω,F) . A set
A ∈ F is called transformation invariant if θ−1A = A.
It is straightforward to verify that the class of transformation invariant
sets forms a σ-algebra (see Exercise 1). We denote it by I.
Definition 4.1.4 A dynamical system (Ω, F, Pr, θ) is ergodic if every trans-
formation invariant set has measure either zero or one.
Note that the ergodic property is a property of the measurable transfor-
mation θ as well as of the probability space (Ω, F, Pr).
Theorem 4.1.1 (Birkhoff) Let (Ω,F ,Pr, θ) be a dynamical system and let
f : Ω → R be a measurable and integrable function, i.e., E |f | < ∞. If θ is
a measure-preserving transformation, then
lim_{N→∞} (1/N) ∑_{n=0}^{N−1} f(θ^n ω) = E[f | I], Pr-a.s.,
where I is the class of transformation invariant sets. If the dynamical system
is also ergodic, then
lim_{N→∞} (1/N) ∑_{n=0}^{N−1} f(θ^n ω) = E[f], Pr-a.s.
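A classical illustration of the ergodic case is the irrational rotation of the circle, which preserves Lebesgue measure on [0, 1) and is ergodic; the Birkhoff average of any integrable f then converges to its space average. A small numerical sketch (the choices of α, ω_0, and f below are arbitrary):

```python
import numpy as np

# theta(w) = w + alpha (mod 1): measure-preserving and ergodic on [0, 1)
# for irrational alpha, with Lebesgue measure as Pr.
alpha = np.sqrt(2) - 1   # irrational rotation number (arbitrary choice)
w0 = 0.1                 # arbitrary starting point
N = 100_000

orbit = (w0 + alpha * np.arange(N)) % 1.0
time_avg = np.cos(2 * np.pi * orbit).mean()

# Space average of f(w) = cos(2*pi*w) over [0, 1) is 0.
assert abs(time_avg) < 1e-3
```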
The proof is based on the following result, whose simple proof is given
by Garsia (1965).
Lemma 4.1.1 (Maximal Ergodic Theorem) Let the assumptions of the Birkhoff
Theorem hold. Define

S_N(ω) = ∑_{n=0}^{N−1} f(θ^n ω),  M_N(ω) = max{S_0(ω), S_1(ω), ..., S_N(ω)},

with the convention S_0(ω) = 0. Then for every N ≥ 1,1

E[ f(ω) 1_{M_N>0}(ω) ] ≥ 0.
Proof. For every N ≥ k ≥ 0 and every ω ∈ Ω, we have M_N(θω) ≥ S_k(θω)
by definition. It follows that

f(ω) + M_N(θω) ≥ f(ω) + S_k(θω) = S_{k+1}(ω).
1 Here 1_A(x) is an indicator function equal to 1 if x ∈ A and 0 otherwise.
Thus,

f(ω) ≥ max{S_1(ω), ..., S_N(ω)} − M_N(θω).

In addition, M_N(ω) = max{S_1(ω), ..., S_N(ω)} on the set {M_N > 0}, so that

E[ f(ω) 1_{M_N>0}(ω) ] ≥ E[ (M_N(ω) − M_N(θω)) 1_{M_N>0}(ω) ]
                      ≥ E[M_N(ω)] − E[M_N(θω)] = 0,

where the second inequality follows from M_N(ω) ≥ 0 and M_N(θω) ≥ 0,
and the last equality follows from the fact that θ is measure-preserving (see
Exercise 2).
Proof of the Birkhoff Theorem: Replacing f by f − E[f | I], we can as-
sume without loss of generality that E[f | I] = 0. Define η̄ = lim sup_{n→∞} S_n/n
and η̲ = lim inf_{n→∞} S_n/n. It is sufficient to show that

0 ≤ η̲ ≤ η̄ ≤ 0, Pr-a.s.
Since η̄(ω) = η̄(θω) for every ω, the set A_ε = {ω ∈ Ω : η̄(ω) > ε} is
transformation invariant. Define
f∗ (ω) = (f (ω)− ε)1Aε (ω) ,
and define S∗N and M∗
N accordingly. It follows from the lemma that
E[f∗ (ω)1{M∗
N>0} (ω)]≥ 0.
By definition,S∗k (ω)
k=
{0 if η (ω) ≤ ε,
Sk(ω)−εk otherwise.
But as N →∞,
{M∗N > 0} =
{max
1≤k≤NS∗N
}↑{supk≥1
S∗k > 0
}=
{supk≥1
S∗k
k> 0
}
=
{supk≥1
Skk> ε
}∩Aε = Aε,
where the last equality follows because $\sup_{k\ge 1} S_k/k \ge \bar\eta$ and $A_\varepsilon = \{\omega \in \Omega : \bar\eta(\omega) > \varepsilon\}$.
Since E|f*| ≤ E|f| + ε < ∞, the Dominated Convergence Theorem implies that
\[
0 \le E\big[f^*\, 1_{\{M_N^* > 0\}}\big] \to E[f^*\, 1_{A_\varepsilon}].
\]
Thus,
\[
0 \le E[f^*\, 1_{A_\varepsilon}] = E[(f - \varepsilon)\, 1_{A_\varepsilon}] = E[f\, 1_{A_\varepsilon}] - \varepsilon \Pr(A_\varepsilon) = E[E(f \mid \mathcal I)\, 1_{A_\varepsilon}] - \varepsilon \Pr(A_\varepsilon) = -\varepsilon \Pr(A_\varepsilon),
\]
where we have used the fact that $A_\varepsilon \in \mathcal I$. It follows that Pr(A_ε) = 0, so that $\Pr(\bar\eta \le 0) = 1$.
Similarly, if we consider −f(ω) instead of f(ω), we find that
\[
\limsup\Big({-\frac{S_N}{N}}\Big) = -\liminf \frac{S_N}{N} = -\underline\eta,
\]
and $\Pr(-\underline\eta \le 0) = 1$, i.e., $\Pr(\underline\eta \ge 0) = 1$. This completes the proof of the first part.
Turn to the second part. Let $\bar f = E[f \mid \mathcal I]$. Define the sets
\[
A^+ = \{\omega \in \Omega : \bar f(\omega) > Ef\}, \qquad A^- = \{\omega \in \Omega : \bar f(\omega) < Ef\},
\]
and $A^0 = \{\omega \in \Omega : \bar f(\omega) = Ef\}$. All three sets belong to I because $\bar f(\omega)$ is I-measurable, and they form a partition of Ω. Since the dynamical system is ergodic, only one of the sets has measure 1 and the other two have measure 0. If it were A⁺, then one would have
\[
Ef = E\bar f = E[\bar f\, 1_{A^+}] > Ef,
\]
which is a contradiction, and similarly for A⁻. Thus, we can only have Pr(A⁰) = 1 and hence $\bar f(\omega) = Ef$, Pr-a.s.
4.2 Application to Stationary Processes
We now apply ergodic theory to stationary processes. Our goal is to establish
several strong laws of large numbers for stationary processes. Intuitively, a
stationary process is a stochastic process whose probabilistic laws remain
unchanged through shifts in time (sometimes in space). It is an important
concept in econometrics and statistics. For simplicity of exposition, we suppose that the processes in this section are real-valued, i.e., X = R.
Definition 4.2.1 A stochastic process {Xt}∞t=0 on (Ω,F ,Pr) is (strictly)
stationary if the distribution of the random vector (Xt,Xt+1, ...,Xt+n) does
not depend on t for any finite n.
The simplest example is the real-valued sequence {Xt}∞t=0 of IID random
variables. From this sequence, we can construct many stationary processes.
For example, define the process {Yt}∞t=0 as Yt = g (Xt,Xt+1, ...,Xt+n) , where
n ≥ 0 and g is a Borel function. Then {Yt}∞t=0 is a stationary process.
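As a quick numerical check (an illustration assuming the moving-average choice g(x₀, x₁) = x₀ + x₁, which is not specified in the text), one can simulate Y_t = X_t + X_{t+1} from an IID Gaussian sequence and verify that the distribution of (Y_t, Y_{t+1}) does not appear to depend on t, e.g., by comparing sample moments at two dates:

```python
import random

random.seed(0)

def simulate_Y(t, n_paths):
    """Draw n_paths independent realizations of (Y_t, Y_{t+1}),
    where Y_t = X_t + X_{t+1} and the X's are IID standard normals."""
    pairs = []
    for _ in range(n_paths):
        x = [random.gauss(0.0, 1.0) for _ in range(t + 3)]
        pairs.append((x[t] + x[t + 1], x[t + 1] + x[t + 2]))
    return pairs

def moments(pairs):
    """Sample means of the pair and their sample covariance."""
    n = len(pairs)
    m0 = sum(p[0] for p in pairs) / n
    m1 = sum(p[1] for p in pairs) / n
    cov = sum((p[0] - m0) * (p[1] - m1) for p in pairs) / n
    return m0, m1, cov

# The distribution of (Y_t, Y_{t+1}) should not depend on t:
# means near 0 and covariance near Var(X) = 1 at every date.
m_a = moments(simulate_Y(0, 50_000))
m_b = moments(simulate_Y(7, 50_000))
print(m_a)
print(m_b)
```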
Recall our definition of covariance stationary processes in Definition
2.1.1. A stationary process is covariance stationary if it has finite second
moments. It is possible that a covariance stationary process is not station-
ary. An exception is Gaussian processes, because they are characterized by their first two moments.
We can use the measure-preserving transformation θ on (Ω,F ,Pr) to
construct a stationary process. Let g : Ω → R be a measurable function.
Define $X_0(\omega) = g(\omega)$, $X_k(\omega) = g(\theta^k\omega)$, $k \ge 1$. We claim the process {Xt} is stationary. In fact, let $A = \{\omega \in \Omega : (X_0, X_1, \ldots) \in B\}$ and $A_1 = \{\omega \in \Omega : (X_1, X_2, \ldots) \in B\}$, where $B \in \mathcal B(\mathbb R^\infty)$. Since
\[
A = \big\{\omega \in \Omega : (g(\omega), g(\theta\omega), g(\theta^2\omega), \ldots) \in B\big\},
\]
\[
A_1 = \big\{\omega \in \Omega : (g(\theta\omega), g(\theta^2\omega), g(\theta^3\omega), \ldots) \in B\big\},
\]
we have ω ∈ A1 if and only if θω ∈ A. Thus, A1 = θ−1A. Since θ is measure-
preserving, Pr (A1) = Pr (A) . Similarly, we can show that Pr (Ak) = Pr (A)
for all k ≥ 2, where $A_k = \{\omega \in \Omega : (X_k, X_{k+1}, \ldots) \in B\}$.

Conversely, for any real-valued stationary process {Xt} on (Ω,F ,Pr), we can construct a new stochastic process $\{\tilde X_t\}$, a new probability space $(\tilde\Omega, \tilde{\mathcal F}, \tilde{\Pr})$, and a measure-preserving transformation $\tilde\theta$ such that the distributions of $\{X_t\}$ and $\{\tilde X_t\}$ are identical.
In fact, take $\tilde\Omega$ to be the coordinate space $\mathbb R^\infty$, let $\tilde{\mathcal F} = \mathcal B(\mathbb R^\infty)$, and define $\tilde{\Pr}(D) = \Pr(\omega \in \Omega : \{X_t(\omega)\} \in D)$ for any $D \in \mathcal B(\mathbb R^\infty)$. Define a Borel measurable function $f : \mathbb R^\infty \to \mathbb R$ as $f(x) = x_0$, where $x = (x_0, x_1, \ldots) \in \mathbb R^\infty$. Let $\tilde\theta$ be the shift operator on $\tilde\Omega = \mathbb R^\infty$. Define $\tilde X_0(\tilde\omega) = f(\tilde\omega) = X_0$ and $\tilde X_k(\tilde\omega) = f(\tilde\theta^k\tilde\omega) = X_k$, $k \ge 1$, where $\tilde\omega = (X_0, X_1, X_2, \ldots) \in \mathbb R^\infty$. Let B be a Borel set in $\mathbb R^{k+1}$ and define:
\[
A = \{\tilde\omega = (x_0, x_1, \ldots) \in \mathbb R^\infty : (x_0, x_1, \ldots, x_k) \in B\}.
\]
Then we have
\[
\tilde\theta^{-1}A = \{\tilde\omega = (x_0, x_1, \ldots) \in \mathbb R^\infty : (x_1, x_2, \ldots, x_{k+1}) \in B\},
\]
and
\[
\tilde{\Pr}(A) = \Pr\big(\omega \in \Omega : (X_0(\omega), X_1(\omega), \ldots, X_k(\omega)) \in B\big) = \Pr\big(\omega \in \Omega : (X_1(\omega), X_2(\omega), \ldots, X_{k+1}(\omega)) \in B\big) = \tilde{\Pr}\big(\tilde\theta^{-1}A\big),
\]
where the second equality follows because {Xt} is a stationary process. We can further show that the above equation is true for any $A \in \mathcal B(\mathbb R^\infty)$. Thus, $\tilde\theta$ is a measure-preserving transformation. Finally, by the previous construction,
\[
\tilde{\Pr}\big(\tilde\omega \in \tilde\Omega : (\tilde X_0, \tilde X_1, \ldots, \tilde X_k) \in B\big) = \Pr\big(\omega \in \Omega : (X_0, X_1, \ldots, X_k)(\omega) \in B\big).
\]
Thus, $\{X_t\}$ and $\{\tilde X_t\}$ have the same distribution.
Next, we apply the Birkhoff Ergodic Theorem to the stationary pro-
cess and derive a strong law of large numbers. We need to introduce some
definitions for stochastic processes.
Definition 4.2.2 Let θ : R∞ → R∞ be a shift operator. A set A ∈ B (R∞)
is shift invariant if θx ∈ A if and only if x ∈ A.
Definition 4.2.3 A stationary process is ergodic if the measure of every
shift-invariant set is either zero or one.
Theorem 4.2.1 (Strong Law of Large Numbers) Let X = (X0, X1, ...) be a stationary process on (Ω,F ,Pr) with E|X0| < ∞. Then
\[
\lim_{N\to\infty} \frac{1}{N} \sum_{k=0}^{N-1} X_k = E[X_0 \mid \mathcal I], \quad \Pr\text{-a.s.}, \tag{4.1}
\]
where I is the class of shift invariant sets. If X is also an ergodic sequence, then
\[
\lim_{N\to\infty} \frac{1}{N} \sum_{k=0}^{N-1} X_k = E[X_0], \quad \Pr\text{-a.s.} \tag{4.2}
\]
Proof. By the previous construction, we have a new process $\{\tilde X_t\}$ defined on a new probability space $(\tilde\Omega, \tilde{\mathcal F}, \tilde{\Pr})$. On this space, the shift operator $\tilde\theta$ is a measure-preserving transformation. In addition, the distributions of $\{X_t\}$ and $\{\tilde X_t\}$ are identical. We apply the Birkhoff Theorem to $\{\tilde X_t\}$:
\[
\lim_{N\to\infty} \frac{1}{N} \sum_{k=0}^{N-1} f(\tilde\theta^k\tilde\omega) = \tilde E[f(\tilde\omega) \mid \mathcal I], \quad \tilde{\Pr}\text{-a.s.}, \tag{4.3}
\]
where I is the σ-algebra of transformation invariant sets and $\tilde E$ is the expectation operator with respect to $\tilde{\Pr}$. By definition, the class of transformation invariant sets is the same as the class of shift invariant sets. By construction, (4.3) is equivalent to (4.1). We can similarly prove (4.2).
Karlin and Taylor (1975) provide several equivalent formulations of er-
godicity. We state their results below, which can be proved by applying the
Birkhoff Theorem.
Theorem 4.2.2 Let {Xt} = (X0, X1, ...) be a real-valued stationary process on (Ω,F ,Pr). The following conditions are equivalent:

(a) {Xt} is ergodic.

(b) For every shift invariant set A,
\[
\Pr\{(X_0, X_1, \ldots) \in A\} = 0 \text{ or } 1.
\]

(c) For every set $A \in \mathcal B(\mathbb R^\infty)$,
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n} 1_{\{(X_j, X_{j+1}, \ldots) \in A\}} = \Pr\{(X_0, X_1, \ldots) \in A\}, \quad \Pr\text{-a.s.}
\]

(d) For every k = 1, 2, ... and every set $A \in \mathcal B(\mathbb R^{k+1})$,
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n} 1_{\{(X_j, X_{j+1}, \ldots, X_{j+k}) \in A\}} = \Pr\{(X_0, X_1, \ldots, X_k) \in A\}, \quad \Pr\text{-a.s.}
\]

(e) For every k and every Borel measurable integrable function φ,
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n} \phi(X_j, X_{j+1}, \ldots, X_{j+k}) = E[\phi(X_0, X_1, \ldots, X_k)], \quad \Pr\text{-a.s.}
\]

(f) For every Borel measurable function $\phi : \mathbb R^\infty \to \mathbb R$ with $E[|\phi(X_0, X_1, \ldots)|] < \infty$,
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \phi(X_k, X_{k+1}, \ldots) = E[\phi(X_0, X_1, \ldots)], \quad \Pr\text{-a.s.}
\]
Because ergodicity leads to laws of large numbers, it must be related to some sense of independence. The following result formalizes this intuition.

Definition 4.2.4 A stationary process {Xt} is mixing if for all sets $A, B \in \mathcal B(\mathbb R^\infty)$,
\[
\lim_{n\to\infty} \Pr\{(X_1, X_2, \ldots) \in A \text{ and } (X_n, X_{n+1}, \ldots) \in B\} = \Pr\{(X_1, X_2, \ldots) \in A\}\, \Pr\{(X_1, X_2, \ldots) \in B\}.
\]
Theorem 4.2.3 A mixing process is ergodic.
Proof. Suppose {Xt} is mixing and A = B is a shift invariant set. Then (X1, X2, ...) ∈ B if and only if (Xn, Xn+1, ...) ∈ B. Applying the mixing condition yields:
\[
\Pr\{(X_1, X_2, \ldots) \in A\} = \Pr\{(X_1, X_2, \ldots) \in A \text{ and } (X_n, X_{n+1}, \ldots) \in A\} \to [\Pr\{(X_1, X_2, \ldots) \in A\}]^2, \quad \text{as } n\to\infty.
\]
The only possibility is Pr{(X1, X2, ...) ∈ A} = 0 or 1. So {Xt} is ergodic.
4.3 Application to Stationary Markov Processes
In Chapter 3, we have shown that under certain conditions, a Markov process
has a stationary distribution. However, there may be multiple stationary
distributions. The goal of this section is to apply the Birkhoff Ergodic
Theorem to characterize the set of stationary distributions. We then provide
a strong law of large numbers for stationary Markov processes.
To apply the Birkhoff Ergodic Theorem, we need to make Markov pro-
cesses stationary. There are two ways to achieve this. First, Theorem 3.2.3
shows that given a transition function P on a measurable space (X, X) and an initial distribution μ0, we can construct a Markov process {Xt}∞t=0 such that its associated conditional distribution is consistent with P. Given
any stationary distribution π on (X,X ) , we can make the Markov process
{Xt}∞t=0 stationary by setting X0 to follow the distribution π.
There is another way to make a Markov process stationary, which we
shall adopt below. The idea is to consider bi-infinite sequences. The space Ω
is the set of all sequences {Xn : n = 0,±1,±2, ...} and F is the corresponding
product σ-algebra. That is, Ω is the countable product space XZ, where Z
is the set of all integers. We construct a probability measure Pπ on XZ as
follows: Given a positive integer n, we define $P^n_\pi$ on $X^{2n+1}$ as:
\[
P^n_\pi(A_{-n} \times \cdots \times A_n) = \int_{A_{-n}} \int_{A_{-n+1}} \cdots \int_{A_n} P(x_{n-1}, dx_n) \cdots P(x_{-n}, dx_{-n+1})\, \pi(dx_{-n}). \tag{4.4}
\]
We can verify that $P^n_\pi$ satisfies the Kolmogorov consistency condition, so that it induces a unique probability measure $P_\pi$ on $X^{\mathbb Z}$ by the Kolmogorov Extension Theorem. We can then construct a Markov process $\{X_n\}_{n=-\infty}^{\infty}$ on the probability space $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_\pi)$. Exercise 3 asks readers to show that this Markov process is stationary and that its marginal distribution for any $X_n$ is π.
We define a shift operator θ on $X^{\mathbb Z}$ as $(\theta^k x)(n) = x_{n+k}$ for any $k \in \mathbb Z$ and any $x = \{x_n\}$. It is a measurable transformation from $X^{\mathbb Z}$ to $X^{\mathbb Z}$. Then $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_\pi, \theta)$ forms a dynamical system for every invariant distribution π. We shall focus on this system below. The main result of this section is the following:
Theorem 4.3.1 Let M be the set of invariant measures for the transition function P. Then $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_\pi, \theta)$ is ergodic if and only if π is an extremal point of the convex set M.
To prove this theorem, we need some preliminary results. Let $\mathcal F^m_n = \sigma(x_k : n \le k \le m)$, where σ(Y) denotes the σ-algebra generated by the random variable Y. Let $\mathcal F^\infty_\infty = \cap_{n\ge 1} \mathcal F^\infty_n$ and $\mathcal F^{-\infty}_{-\infty} = \cap_{n\ge 1} \mathcal F^{-n}_{-\infty}$.
Lemma 4.3.1 Let I be the σ-algebra of shift invariant sets associated with the dynamical system $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_\pi, \theta)$. Then $\mathcal I \subset \mathcal F^0_0$, up to sets of $P_\pi$-measure zero.
Proof. For simplicity, we assume X = R. However, a similar argument
can be applied to the general space X. Let A be a shift invariant set.
Then A can be approximated by sets $A_k$ in the σ-algebras corresponding to the coordinates from $[-k, k]$, $k \ge 1$. Since A is shift invariant and θ is measure-preserving, the sets $\theta^{\pm 2k}A_k$, $k \ge 1$, will approximate A just as well. Thus, A belongs to the tail σ-algebras, the remote past and the remote future. More formally, $A \in \bar{\mathcal F}^\infty_\infty$ and $A \in \bar{\mathcal F}^{-\infty}_{-\infty}$, where $\bar{\mathcal F}^\infty_\infty$ and $\bar{\mathcal F}^{-\infty}_{-\infty}$ are the completions of $\mathcal F^\infty_\infty$ and $\mathcal F^{-\infty}_{-\infty}$, respectively. By Exercise 4, we have
\[
P_\pi(A \mid \mathcal F^0_0) = P_\pi(A \cap A \mid \mathcal F^0_0) = P_\pi(A \mid \mathcal F^0_0)\, P_\pi(A \mid \mathcal F^0_0).
\]
This implies that $P_\pi(A \mid \mathcal F^0_0)$ is equal to zero or one. Thus, $A \in \mathcal F^0_0$ up to a set of $P_\pi$-measure zero.
Proof of Theorem 4.3.1: Suppose first that π ∈ M is not extremal, i.e., there exist an α ∈ (0, 1) and two distinct stationary distributions $\pi_1, \pi_2 \in \mathcal M$ such that $\pi = \alpha\pi_1 + (1-\alpha)\pi_2$. Thus, $P_\pi = \alpha P_{\pi_1} + (1-\alpha)P_{\pi_2}$. Assume by contradiction that $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_\pi, \theta)$ is ergodic, so that $P_\pi(A) = 0$ or 1 for every $A \in \mathcal I$. Thus, $P_{\pi_1}(A) = P_{\pi_2}(A) = 0$, or $P_{\pi_1}(A) = P_{\pi_2}(A) = 1$. This means that $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_{\pi_1}, \theta)$ and $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_{\pi_2}, \theta)$ are also ergodic. By the Birkhoff Ergodic Theorem, for any bounded measurable function f on $X^{\mathbb Z}$,
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} f(\theta^k x) = \int f(x)\, P_{\pi_i}(dx), \quad P_{\pi_i}\text{-a.s.}
\]
Let $E_i$ be the set of points x at which the equality holds. Then $P_{\pi_i}(E_i) = 1$, so that $P_\pi(E_1) = P_\pi(E_2) = 1$. Since f is arbitrary and since $\pi_1$ and $\pi_2$ are distinct, we can choose an f such that
\[
\int f(x)\, P_{\pi_1}(dx) \ne \int f(x)\, P_{\pi_2}(dx).
\]
This implies that $E_1 \cap E_2 = \emptyset$, leading to $P_\pi(E_1 \cup E_2) = 2$, a contradiction.
To prove the converse, suppose that $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_\pi, \theta)$ is not ergodic. We wish to show that there exist an α ∈ (0, 1) and two distinct stationary distributions $\pi_1, \pi_2 \in \mathcal M$ such that $\pi = \alpha\pi_1 + (1-\alpha)\pi_2$. By Lemma 4.3.1, $A \in \mathcal F^0_0$ for any shift invariant set A in $X^{\mathbb Z}$. Since we can label any time as time zero, we have $A \in \mathcal F^n_n$ for any $n \in \mathbb Z$. Thus, there exists a set $A_1 \in \mathcal X$ such that $A = A_1^{\mathbb Z} \equiv \{\omega \in X^{\mathbb Z} : x_n \in A_1, n \in \mathbb Z\}$ up to a set of $P_\pi$-measure zero. The set $A_1$ must be π-invariant.² Otherwise, we cannot have $A = A_1^{\mathbb Z}$.
Thus, $P_\pi(A) = P_\pi(A_1^{\mathbb Z}) = \pi(A_1)$. Since we assume that $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_\pi, \theta)$ is not ergodic, there exists α ∈ (0, 1) such that $P_\pi(A) = \pi(A_1) = \alpha$. Furthermore, we have $\pi(A_1^c) = 1 - \alpha$. The stationarity of π implies that $P(x, A_1^c) = 1$ for π-almost every $x \in A_1^c$. Define two measures on $(X, \mathcal X)$ as
\[
\pi_1(B) = \frac{1}{\alpha}\, \pi(A_1 \cap B), \qquad \pi_2(B) = \frac{1}{1-\alpha}\, \pi(A_1^c \cap B), \quad \text{all } B \in \mathcal X.
\]
Clearly these two measures are stationary distributions, and $\pi = \alpha\pi_1 + (1-\alpha)\pi_2$ with $\pi_1 \ne \pi_2$, so π is not extremal.
From the proof of this theorem, we can relate our previous definition of ergodicity associated with $(X, \mathcal X, P, \pi)$ in Definition 3.3.5 to the present notion of ergodicity for the dynamical system $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_\pi, \theta)$. In particular, we have shown that $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_\pi, \theta)$ is ergodic if and only if the invariant distribution π is ergodic. It is also straightforward to show the following:
Corollary 4.3.1 If a Markov process with the transition function P on
(X,X ) has a unique invariant distribution π, then π is ergodic.
Finally, we introduce a strong law of large numbers for stationary Markov processes. Its proof follows from the Birkhoff Ergodic Theorem and Theorem 4.3.1. We omit the details here.
Theorem 4.3.2 Let $P_{x_0}$ be the probability measure on $(X^\infty, \mathcal X^\infty)$ for the Markov process with the transition function P on $(X, \mathcal X)$ starting at time zero from $x_0 \in X$. Let f be a bounded measurable function on X. Then for almost all $x_0$ with respect to any extremal invariant distribution π on $(X, \mathcal X)$,
\[
\lim_{N\to\infty} \frac{1}{N} \sum_{k=1}^{N} f(x_k) = \int f(y)\, \pi(dy),
\]
for $P_{x_0}$-almost every $\omega = (x_1, x_2, \ldots) \in X^\infty$.

²This means that $P(x, A_1) = 1$ for π-almost every $x \in A_1$ (recall Definition 3.3.5).
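The Markov-chain version of this law is easy to see numerically. The sketch below (an illustration with an assumed two-state chain, not from the text) simulates a chain whose unique, hence ergodic, invariant distribution π is known in closed form, and checks that the time average of f along one path approaches ∫ f dπ:

```python
import random

random.seed(1)

# Transition matrix of a two-state Markov chain on {0, 1}.
P = [[0.9, 0.1],
     [0.2, 0.8]]
# Its unique invariant distribution solves pi P = pi: pi = (2/3, 1/3).
pi = (2 / 3, 1 / 3)

def step(x):
    """Draw the next state from row P[x]."""
    return 0 if random.random() < P[x][0] else 1

def path_average(f, x0, n):
    """Time average (1/N) * sum_{k=1}^{N} f(x_k) along one simulated path."""
    x, total = x0, 0.0
    for _ in range(n):
        x = step(x)
        total += f(x)
    return total / n

f = lambda x: float(x)
avg = path_average(f, x0=0, n=400_000)
space_avg = sum(f(x) * pi[x] for x in (0, 1))   # integral of f against pi
print(avg, space_avg)
```

Because the invariant distribution is unique, Corollary 4.3.1 applies and the starting state does not matter.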
4.4 Exercises
1. Prove that the class of transformation invariant sets is a σ-algebra.
2. Prove that if θ is a measurable measure-preserving transformation, then
\[
E[f(\omega)] = E[f(\theta\omega)]
\]
for any measurable and integrable function f on Ω.
3. Prove that the constructed process $\{X_n\}$ on $(X^{\mathbb Z}, \mathcal X^{\mathbb Z}, P_\pi)$ in Section 4.3 is a stationary Markov process. In addition, show that its marginal distribution for any $X_n$ is π.
4. Prove that for any probability measure P on the product space X × Y × Z, and for any measurable and integrable functions f, g, the following two equations are equivalent:
\[
E[f(x)\, g(z) \mid \mathcal F_y] = E[f(x) \mid \mathcal F_y]\, E[g(z) \mid \mathcal F_y],
\]
\[
E[g(z) \mid \mathcal F_{x,y}] = E[g(z) \mid \mathcal F_y],
\]
where $\mathcal F_y$ is the σ-algebra generated by the projection onto Y and $\mathcal F_{x,y}$ is the σ-algebra generated by the projection onto X × Y.
5. Prove the laws of large numbers Theorems 4.2.2 and 4.3.2.
6. Prove that any two distinct ergodic invariant measures are mutually
singular.
7. Let X = (X1, X2, ...) be a Gaussian stationary sequence with $EX_n = 0$ and covariance $R_n = E[X_{n+k}X_k]$. Show that $\lim_{n\to\infty} R_n = 0$ is sufficient for X to be ergodic.
Part II
Dynamic Optimization
Economic agents such as households, firms, or governments often face
many choice problems. These choice problems are typically dynamic. For
example, a household decides how much to consume today and how much to
save for the future. A firm decides what amount of dividends to distribute
today and how much to invest for the future. A worker decides whether to
take a job offer today or to wait and search for a new job tomorrow. All
these problems involve a cost-benefit tradeoff between today and the future.
In this part of the book, we discuss how to solve this type of problem. We focus on economic problems that have a recursive structure. We shall formulate these problems as a Markov decision process model. We then apply a recursive approach, dynamic programming, to solve these problems.
In Chapter 5, we present the model setup and some examples. In Chapter
6, we discuss the dynamic programming method for finite-horizon problems,
while Chapter 7 studies infinite-horizon problems. In Chapter 8, we analyze
linear-quadratic models. In Chapter 9, we examine models with incomplete
information. In Chapter 10, we study numerical methods for solving dy-
namic programming problems.
Chapter 5
Markov Decision Process Model
This chapter introduces the Markov decision process model and presents a
variety of examples to illustrate the generality of this model.
5.1 Model Setup
Economic decisions can often be summarized as a problem in which a de-
cision maker chooses an action to influence a stochastic dynamical system
so as to maximize some criterion. To ensure a recursive structure, we for-
mulate the framework by a Markov decision process, which consists of five
elements: time, states, actions, stochastic kernels, and one-period rewards.
A Markov decision process together with an optimality criterion are called
a Markov decision problem. A Markov decision process and the associated
decision problem form a Markov decision process model.
We now describe this model formally. Suppose decisions are made at
discrete points in time, denoted by t ∈ T ≡ {0, 1, 2, ..., T}, where T ≤ ∞. The decision time may not be the same as the calendar time. If T < ∞ (T = ∞), the decision process has a finite (infinite) horizon. At each time t, the
system occupies a state st. The initial state s0 is given. Denote the set of
all possible states by S, called the state space. Let (S,S) and (A,A) be
some measurable spaces. After observing the state st, the decision maker
chooses an action at. The constraints on this action are described by a
nonempty-valued correspondence Γ : S → A; that is, Γ (st) ⊂ A is the set
of feasible actions given that the current state is st ∈ S. Actions may be
chosen randomly or deterministically. When the decision criterion function
has kinks, random actions may be optimal. In game theory, one may consider
randomized actions to allow for mixed strategies. In this chapter, we shall
focus on deterministic actions. In the end of this section, we shall briefly
introduce the model setup for randomized actions.
As a result of choosing action at at state st at time t, the decision maker
receives a one-period reward ut (st, at) . In addition, the system state at time
t+1 is determined by the stochastic kernel Pt+1 : (S× A)×S → [0, 1]; that
is, Pt+1 (st, at; ·) is the probability measure for states at time t + 1 if the
state and action at time t are st and at, respectively. In addition, for any
B ∈ S, Pt+1(·, ·; B) is a measurable function on S × A. Note that the decision maker's actions may affect the stochastic kernel.
The collection of objects,
\[
\big\{\mathcal T, (S, \mathcal S), (A, \mathcal A), \Gamma, (P_t)_{t=1}^{T}, (u_t)_{t=0}^{T}\big\}, \tag{5.1}
\]
is called a Markov decision process. In our formulation, we have assumed
that S,A and Γ are independent of time, but Pt+1 and ut may depend on
time. For the infinite-horizon case, we shall assume that Pt+1 and ut are
independent of time. In this case, the Markov decision process is station-
ary. In finite-horizon decision processes, we may assume that no decision is
made in the last period so that the terminal reward uT is a function of the
state only. We may refer to it as a salvage value, a scrap value, or a bequest
value. In some economic problems, we often assume that there is no reward from the terminal state sT, so that uT = 0. In this case, a constraint on the
terminal state is often needed.1
In sum, a Markov decision process has the following essential features:
• The state system is governed by a Markov process $(s_t)$.

• The stochastic kernels $(P_t)_{t=1}^{T}$ of this Markov process are affected by the decision maker's current actions, independent of future or past actions.

• The constraint correspondence Γ maps from a current state to a set of actions, independent of histories.

• The one-period rewards $u_t$ are functions of the current state and current action, independent of histories. In the finite-horizon case, the terminal reward is either zero or a function of the state only.
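These essential features translate directly into a container type. The sketch below (hypothetical names, not the book's code) bundles a stationary Markov decision process for the finite-state case, where each stochastic kernel is a transition matrix indexed by the current action:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class MarkovDecisionProcess:
    states: List[int]                       # state space S
    actions: List[int]                      # action set A
    feasible: Callable[[int], List[int]]    # constraint correspondence Gamma(s)
    kernel: Dict[int, List[List[float]]]    # kernel[a][s][s'] = P(s, a; {s'})
    reward: Callable[[int, int], float]     # one-period reward u(s, a)
    beta: float                             # discount factor

# A toy two-state, two-action example (assumed numbers, for illustration only).
mdp = MarkovDecisionProcess(
    states=[0, 1],
    actions=[0, 1],
    feasible=lambda s: [0, 1],
    kernel={0: [[0.9, 0.1], [0.2, 0.8]],
            1: [[0.5, 0.5], [0.5, 0.5]]},
    reward=lambda s, a: 1.0 if s == 0 else -float(a),
    beta=0.95,
)

# Each row of each kernel must be a probability measure over next states.
for a in mdp.actions:
    for row in mdp.kernel[a]:
        assert abs(sum(row) - 1.0) < 1e-12
```

The kernel depends on the current state and action only, which is exactly the history-independence emphasized in the list above.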
Next, we define the optimality criterion and formulate the Markov de-
cision problem. Suppose the initial state s0 is given. Given this state, the
decision maker chooses an action a0. At time t = 1, the decision maker
observes state s1 and the past state and action (s0, a0) . He then chooses
an action a1 contingent on this information. Since s0 is known, we may
ignore (s0, a0) in the information set. We then require the action a1 be a
measurable function of the history s1 = (s1) . Measurability is necessary for
computing expected values. At time t = 2, the decision maker observes s2
and (s0, a0, s1, a1) . Since a1 is a measurable function of s1, it does not pro-
vide any new information, we can then write the history as s2 = (s1, s2) and
require the action a2 be a measurable function of s2. In general, at any time
t ≥ 1, the decision maker observes the current state st and all the past states
and actions. His information is summarized by a history st = (s1, ..., st) . He
chooses an action at as a measurable function of this history st. His choice of
1See Section 6.6.2 for an example.
sequences of actions must satisfy some optimality criterion. In this chapter,
we assume that he maximizes total discounted expected rewards, where the
expectation is over realizations of the states.
Formally, define the product measurable spaces $(S^t, \mathcal S^t)$, $t = 1, \ldots, T-1$.

Definition 5.1.1 A plan or policy π is a sequence of actions $(\pi_0, \pi_1, \ldots, \pi_{T-1})$ such that $\pi_0 \in A$ and $\pi_t : S^t \to A$ is measurable for $t = 1, 2, \ldots, T-1$. A plan π is Markovian if $\pi_t$ is independent of the states from time 0 to time $t-1$, $t \ge 1$. It is stationary if $\pi_t$ is independent of time t.

Definition 5.1.2 A plan $\pi = (\pi_0, \pi_1, \ldots, \pi_{T-1})$ is feasible from $s_0 \in S$ if $\pi_0 \in \Gamma(s_0)$ and $\pi_t(s^t) \in \Gamma(s_t)$ for all $s^t \in S^t$, $t = 1, 2, \ldots, T-1$.
Let Π (s0) denote the set of all feasible plans from s0. Because of the
measurability requirement, Π (s0) may be empty. We give the following
sufficient condition to ensure nonemptiness of Π (s0) .
Assumption 5.1.1 The correspondence Γ has a measurable selection; that
is, there exists a measurable function h : S → A such that h (s) ∈ Γ (s) , all
s ∈ S.
Lemma 5.1.1 Let (S,S) , (A,A) and Γ be given. Under Assumption 5.1.1,
Π(s0) is nonempty for all s0 ∈ S.
Proof. Let h be a measurable selection of Γ. Define
\[
\pi_0 = h(s_0), \qquad \pi_t(s^t) = h(s_t),
\]
for all $s^t \in S^t$, $t = 1, 2, \ldots, T-1$. Clearly the plan $\pi = (\pi_0, \pi_1, \ldots, \pi_{T-1}) \in \Pi(s_0)$.
Now, we discuss how total discounted expected rewards are computed
for a feasible plan. First, we maintain the following:
Assumption 5.1.2 The discount factor is β ∈ (0, 1) .
Next, we define expectations. Given an initial state $s_0$, a feasible plan π from $s_0$, and the stochastic kernels $P_t$, $t = 1, \ldots, T$, define a probability measure on $S^t$ as:
\[
\mu^t(s_0, B; \pi) = \int_{B_1} \cdots \int_{B_{t-1}} \int_{B_t} P_t\big(s_{t-1}, \pi_{t-1}(s^{t-1}); ds_t\big)\, P_{t-1}\big(s_{t-2}, \pi_{t-2}(s^{t-2}); ds_{t-1}\big) \cdots P_1(s_0, \pi_0; ds_1), \tag{5.2}
\]
for any $B = B_1 \times \cdots \times B_t \in \mathcal S^t$. By Tulcea's Theorem 3.2.4, $\mu^t(s_0, \cdot\,; \pi)$ is a well-defined probability measure on $S^t$. Note that this measure depends on the policy π. We use this measure to compute expected rewards at each date. In the infinite-horizon case, the sequence of measures $\mu^t(s_0, \cdot\,; \pi)$, $t = 1, 2, \ldots$, induces a unique measure $\mu^\infty(s_0, \cdot\,; \pi)$ on $S^\infty$. We impose the following assumption for the finite-horizon case.
Assumption 5.1.3 Let T < ∞. The one-period reward function $u_t : S \times A \to \mathbb R$ is measurable for $t = 0, 1, \ldots, T-1$, and the terminal reward function $u_T : S \to \mathbb R$ is measurable. For each $s_0 \in S$ and each plan $\pi \in \Pi(s_0)$, $u_t(s_t, \pi_t(s^t))$ is $\mu^t(s_0, ds^t; \pi)$-integrable.
For the infinite-horizon case, we require that the sequences of one-period reward functions and stochastic kernels be time homogeneous, or stationary.

Assumption 5.1.4 Let T = ∞. At any date, the one-period reward function is $u : S \times A \to \mathbb R$ and the stochastic kernel is $P : (S \times A) \times \mathcal S \to [0, 1]$. For each $s_0 \in S$ and each plan $\pi \in \Pi(s_0)$, $u(s_t, \pi_t(s^t))$ is $\mu^t(s_0, ds^t; \pi)$-integrable and the limit
\[
u(s_0, \pi_0) + \lim_{T\to\infty} \sum_{t=1}^{T} \beta^t \int u\big(s_t, \pi_t(s^t)\big)\, \mu^t(s_0, ds^t; \pi)
\]
exists (although it may be plus or minus infinity).
Given the above assumptions, for any plan $\pi \in \Pi(s_0)$, we define the total discounted expected rewards for T < ∞ as:
\[
J_0^T(s_0, \pi) = u_0(s_0, \pi_0) + \sum_{t=1}^{T-1} \beta^t \int u_t\big(s_t, \pi_t(s^t)\big)\, \mu^t(s_0, ds^t; \pi) + \beta^T \int u_T(s_T)\, \mu^T(s_0, ds^T; \pi). \tag{5.3}
\]
If T = ∞, we define the total discounted expected rewards as:
\[
J(s_0, \pi) = u(s_0, \pi_0) + \lim_{T\to\infty} \sum_{t=1}^{T} \beta^t \int u\big(s_t, \pi_t(s^t)\big)\, \mu^t(s_0, ds^t; \pi). \tag{5.4}
\]
We emphasize that for the finite-horizon case, discounting is not important, because we can simply redefine $u_t$ so that β = 1. But discounting is crucial for the infinite-horizon case, as we will show in Chapter 7. Here we maintain Assumption 5.1.2 for both cases because we will study the connection between these two cases in Chapter 7.
Assumptions 5.1.1, 5.1.3, and 5.1.4 ensure that $J_0^T(s, \cdot)$ and $J(s, \cdot)$ are well defined on the nonempty set Π(s) for each initial state s ∈ S. We can then define the optimality criterion as:
\[
V_0(s) = \sup_{\pi \in \Pi(s)} J_0^T(s, \pi), \quad \text{if } T < \infty, \tag{5.5}
\]
\[
V(s) = \sup_{\pi \in \Pi(s)} J(s, \pi), \quad \text{if } T = \infty, \tag{5.6}
\]
for each s ∈ S. More precisely, we may write $V_0(s)$ as $V_0^T(s)$ to indicate that the horizon is T. Whenever there is no risk of confusion, we will adopt the former notation.
We call the problem in (5.5) a T -horizon Markov decision problem
and the problem in (5.6) an infinite-horizon Markov decision problem.
The restriction “Markov” indicates that we focus on problems resulting from
Markov decision processes. We call V0 : S→ R∪{±∞} the value function
at date zero for the finite-horizon Markov decision problem and call V :
S → R ∪ {±∞} the value function for the infinite-horizon Markov decision
problem. That is, V0 is the unique function satisfying the following three
conditions:
1. If $|V_0(s)| < \infty$, then:

   • for all $\pi \in \Pi(s)$, $V_0(s) \ge J_0^T(s, \pi)$;

   • for any ε > 0, there exists a feasible plan $\pi \in \Pi(s)$ such that $V_0(s) \le J_0^T(s, \pi) + \varepsilon$.

2. If $V_0(s) = +\infty$, then there exists a sequence of plans $(\pi^k)$ in Π(s) such that $\lim_{k\to\infty} J_0^T(s, \pi^k) = +\infty$.

3. If $V_0(s) = -\infty$, then $J_0^T(s, \pi) = -\infty$ for all $\pi \in \Pi(s)$.
An optimal plan $\pi^* \in \Pi(s)$ exists if it attains the supremum, in that $V_0(s) = J_0^T(s, \pi^*)$. If an optimal plan exists, then we can replace supremum
with maximum in (5.5). Similar definitions apply to V (s) in the infinite-
horizon case. The following example illustrates that the supremum may be
infinite and an optimal plan may not exist.
Example 5.1.1 (a) Consider the following static problem:
\[
V_0(s) = \sup_{a \in [s, 2]} \frac{1}{a} + s, \quad a \ne 0.
\]
If s ∈ (0, 2), then $V_0(s) = 1/s + s < \infty$ and $a^* = s$ attains the supremum. If s < 0, then $V_0(s) = +\infty$, and no $a \in [s, 2]$ attains this supremum.

(b) Consider $V_0(s) = \sup_{a \in (0,1)} a = 1$, $s \in S = \mathbb R$. The value function is a finite constant, but no action $a \in (0, 1)$ attains the supremum.
At this point, we have completed the description of the Markov decision
process model. This model includes deterministic decision problems as a
special case. Thus, we will not treat this case separately in this book. In
the deterministic case, the choice of actions determines the subsequent state
with certainty. We may write the state transition as st+1 = f (st, at) for
some function f. The stochastic kernel becomes:
\[
P_{t+1}(s_t, a_t; B) = \begin{cases} 1 & \text{if } f(s_t, a_t) \in B, \\ 0 & \text{otherwise,} \end{cases} \qquad B \in \mathcal S.
\]
To close this section, we offer a few remarks on the model formulation.
First, we have assumed that the criterion function is total discounted ex-
pected rewards. In operations research, researchers often consider other
criterion functions; for example, the average expected rewards. We will not
study this case in this book. Second, to explain some experimental evidence
such as the Allais Paradox and the Ellsberg Paradox, researchers propose
nonexpected utility models. Our model formulation does not include this
case. Third, because we have assumed time-additive criterion functions, our model does not include dynamic choice problems with non-additive criterion functions such as recursive utility. Chapter 21 will consider the last two cases.
Finally, we briefly introduce a very general setup with randomized ac-
tions. We refer readers to Bertsekas and Shreve (1978) for a more de-
tailed treatment. Suppose the initial state is unknown and drawn from
the distribution P0 on S. Because actions may be randomized, the history
at any date must include realizations of past actions. Define $H_0 = S$, $H_t = H_{t-1} \times A \times S$, and let $\mathcal H_t$ denote the σ-algebra of $H_t$. A date-t history is $h_t = (s_0, a_0, \ldots, a_{t-1}, s_t) \in H_t$. An action at date t is a measurable mapping $\pi_t$ from $H_t$ to the set of probability measures on A. In addition, for feasibility, we require that $\Gamma(s) \in \mathcal A$ and $\pi_t(h_t)(\Gamma(s_t)) = 1$. A plan π is a sequence of actions $\pi_t$, $t = 0, 1, 2, \ldots, T$. Define the measurable space $(S^t \times A^t, \mathcal S^t \times \mathcal A^t)$ and the probability measure on this space given any plan π:
\[
\mu^t(ds_0, da_0, \ldots, ds_t, da_t; \pi) = P_0(ds_0)\, \pi_0(h_0)(da_0)\, P_1(s_0, a_0; ds_1)\, \pi_1(h_1)(da_1) \times \cdots \times P_t(s_{t-1}, a_{t-1}; ds_t)\, \pi_t(h_t)(da_t),
\]
for $t = 0, 1, \ldots, T-1$. In the finite-horizon case, there is no action choice in the last period; we thus have
\[
\mu^T(ds_0, da_0, \ldots, ds_T; \pi) = P_0(ds_0)\, \pi_0(h_0)(da_0)\, P_1(s_0, a_0; ds_1)\, \pi_1(h_1)(da_1) \cdots P_T(s_{T-1}, a_{T-1}; ds_T).
\]
We use these measures to compute the total discounted expected rewards. In
the infinite-horizon case, Tulcea’s Theorem implies that these measures in-
duce a unique probability measure on the measurable space (S∞ × A∞,S∞ ×A∞) ,
which is used to compute the total discounted expected rewards.
5.2 Examples
In this section, we present some examples to illustrate our model formulation
provided in the previous section. Some examples admit a variety of different
interpretations and may have wide applicability.
5.2.1 Discrete Choice
If the action set A is finite, the Markov decision problem becomes a dis-
crete choice problem. An example is the problem of optimal replacement
of durable assets analyzed in Rust (1985, 1986). In this problem, the state
space is S = R+. Time is denoted by t = 0, 1, 2, .... The state st is interpreted
as a measure of accumulated utilization of a durable good at time t. In par-
ticular, st = 0 means a brand new durable good. At each time t, a decision
maker decides whether to keep or replace the durable. Let the action set
be A = {0, 1} where at = 0 denotes the action of keeping the durable and
at = 1 denotes the action of selling the durable at the scrap value Ps and
replacing it with a new durable at a cost Pb. Suppose the level of utiliza-
tion follows an exponential distribution. We may write the stochastic kernels as:
\[
P(s_t, a_t; [0, s_{t+1}]) = \begin{cases} 1 - \exp(-\lambda(s_{t+1} - s_t)) & \text{if } a_t = 0 \text{ and } s_{t+1} \ge s_t, \\ 1 - \exp(-\lambda s_{t+1}) & \text{if } a_t = 1 \text{ and } s_{t+1} \ge 0, \\ 0 & \text{otherwise.} \end{cases}
\]
Assume that the cost of maintaining the durable at state s is c (s) and
the objective is to minimize the discounted expected cost of maintaining
the durable over an infinite horizon. The discount factor is β ∈ (0, 1) . Since
minimizing a function is equivalent to maximizing its negative, we can define
the reward function as:
\[
u_t(s_t, a_t) = \begin{cases} -c(s_t) & \text{if } a_t = 0, \\ -(P_b - P_s) - c(0) & \text{if } a_t = 1. \end{cases}
\]
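A compact sketch of this replacement problem (with assumed parameter values and maintenance cost; the text specifies only the functional forms) encodes the reward and draws next-period utilization from the exponential kernel, here evaluated under an arbitrary threshold replacement rule rather than the optimal one:

```python
import random

random.seed(3)

lam, Pb, Ps, beta = 0.5, 10.0, 1.0, 0.95   # assumed parameters
c = lambda s: 0.2 * s                      # assumed maintenance cost function

def reward(s, a):
    """u_t(s, a): negative cost of keeping (a = 0) or replacing (a = 1)."""
    return -c(s) if a == 0 else -(Pb - Ps) - c(0.0)

def next_state(s, a):
    """Draw s_{t+1}: exponential increment above s if kept, above 0 if replaced."""
    base = s if a == 0 else 0.0
    return base + random.expovariate(lam)

s, total = 0.0, 0.0
for t in range(200):
    a = 1 if s > 8.0 else 0                # assumed threshold replacement rule
    total += beta ** t * reward(s, a)
    s = next_state(s, a)
print(total)   # discounted reward (a negative number) under this rule
```

Finding the threshold that maximizes this objective is exactly the dynamic programming problem studied in later chapters.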
5.2.2 Optimal Stopping
In the above discrete choice problem, the decision maker’s choice affects the
evolution of the state system or its stochastic kernels. In one class of discrete choice problems, the so-called optimal stopping problems, the decision maker's choice does not affect the state system. Examples are abundant. An
investor decides whether and when to invest in a project with exogenously
given stochastic payoffs. A firm decides whether and when to enter or exit
an industry. It may also decide whether and when to default on debt. A
worker decides whether and when to accept a job offer or to quit his job. All
these decisions share three characteristics. First, the decision is irreversible
to some extent. Second, there is uncertainty about future rewards or costs.
Third, decision makers have some flexibility in choosing the timing of the
decision.
Now, we formulate such problems. Time is denoted by t = 0, 1, ..., T,
T ≤ ∞. Uncertainty is generated by an exogenous Markov state process
(zt)t≥0 taking values in some state space Z. The transition function of (zt)t≥0
is given by Qt : Z×Z → [0, 1], t = 1, ..., T − 1, where (Z,Z) is a measurable
space. At any time t, after observing the exogenous state zt, a decision
maker decides whether to stop or to continue. His decision is irreversible
in that if he chooses to stop, he will not make any choices in the future.
Continuation at date t generates a payoff ft (zt) , while stopping at date t
yields an immediate payoff gt (zt) and zero payoff in the future, where ft
and gt are functions that map Z into R. The problem may be finite horizon.
In this case, we assume that the terminal reward is h (zT ) if the decision
maker does not choose to stop before. The decision maker is risk neutral
and maximizes the total discounted payoffs with a discount factor β ∈ (0, 1].
In the infinite horizon case, we assume that ft and gt are independent of t.
To formulate the optimal stopping problem as a Markov decision problem, we introduce an additional state $\bar{s}$, representing stopping. This state is an absorbing state. Let a = 1 and a = 0 denote the actions of continuing and stopping, respectively. The Markov decision model consists of the following elements:

• Decision time:
$$\mathcal{T} = \{0, 1, ..., T\}, \quad T \le \infty.$$

• States:
$$S = Z \cup \{\bar{s}\}.$$

• Actions:
$$A = \{0, 1\}, \qquad \Gamma(s) = \begin{cases} \{1, 0\} & \text{if } s \in Z, \\ \emptyset & \text{if } s = \bar{s}. \end{cases}$$

• Rewards:
$$u_t(s, a) = \begin{cases} f_t(s) & \text{if } s \in Z,\ a = 1, \\ g_t(s) & \text{if } s \in Z,\ a = 0,\ t < T, \\ 0 & \text{if } s = \bar{s}, \end{cases} \qquad u_T(s) = h(s),\ s \in Z,\ \text{if } T < \infty.$$

• Stochastic kernels:
$$P_{t+1}(s_t, a_t; B) = Q_{t+1}(s_t, B) \quad \text{if } s_t \in Z,\ a_t = 1,\ B \in \mathcal{Z},$$
$$P_{t+1}(s_t, a_t; \{\bar{s}\}) = 1 \quad \text{if } s_t \in Z,\ a_t = 0, \text{ or } s_t = \bar{s},\ t < T,$$
and $P_{t+1}(s_t, a_t; \cdot) = 0$ otherwise.

• Optimality criterion: maximization of total discounted expected rewards with discount factor β ∈ (0, 1).
In applications, we only need to specify the Markov process $(z_t)$ and the functions $f_t$, $g_t$, and $h$. We next consider some specific examples.
Option Exercise
Consider an infinite-horizon irreversible investment problem. The state
process (zt)t≥0 represents the stochastic project payoffs and has a time-
homogeneous transition function Q. The decision maker is risk neutral. An
investment project yields payoffs zt if it is invested at date t. Investment
costs a lump-sum value I > 0 at the time of the investment. If the decision
maker chooses to wait, he receives nothing in the current period. Then he
draws a new payoff in the next period and decides whether he should invest
again. We can cast this problem into our framework by setting:
gt (zt) = zt − I, (5.7)
ft (zt) = 0. (5.8)
We may view this problem as an option exercise problem, in which the decision maker decides when to exercise an American call option on a stock. In this case, $z_t$ represents the stock price and I represents the strike price. The discount factor is equal to the inverse of the gross interest rate, β = 1/(1 + r).
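A small value-iteration sketch can make the exercise rule concrete. The three-state chain for z, the investment cost I, and β below are illustrative assumptions, not from the text; the Bellman equation solved is V(z) = max{z − I, β Σ_{z'} Q(z, z') V(z')}.

```python
import numpy as np

# Illustrative setup (assumptions): project payoff z follows a 3-state
# Markov chain with transition matrix Q; I_cost is the lump-sum cost.
z = np.array([0.5, 1.0, 1.5])
Q = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])
I_cost, beta = 0.9, 0.95

V = np.zeros(3)
for _ in range(10_000):                 # value iteration
    stop = z - I_cost                   # g(z): invest now and collect z - I
    wait = beta * Q @ V                 # f(z) = 0 plus discounted continuation
    V_new = np.maximum(stop, wait)
    if np.max(np.abs(V_new - V)) < 1e-12:
        break
    V = V_new

invest = stop >= wait                   # the optimal stopping region
print("invest in states:", z[invest])
```

In this example investment is optimal only in the highest payoff state, which is the usual option-value intuition: waiting is valuable because a better draw may arrive.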
Exit
Exit is an important problem in industrial organization and macroeco-
nomics.2 We may describe a stylized infinite-horizon exit model as follows.
Consider a firm in an industry. The Markov state process (zt)t≥0 with a
stationary transition function Q could be interpreted as a demand shock or
a productivity shock. Staying in business at date t generates profits ϕ (zt)
and incurs a fixed cost cf > 0. The owner/manager may decide to exit and
seek outside opportunities. Let the outside opportunity value be a constant
ξ > 0. Then the problem fits into our framework by setting:
gt (zt) = ξ, (5.9)
ft (zt) = ϕ (zt)− cf . (5.10)
Secretary Problem
Cayley (1875) proposes a problem of how to play a lottery. This problem
may be described as a secretary problem, which has been studied extensively
in the fields of applied probability, statistics, decision theory, and auction
theory. Here we follow Puterman’s (1994) presentation closely. Consider an
employer’s decision to hire an individual to fill a vacancy for a secretarial
position. There are N candidates or applicants for this job, with different
abilities. Candidates are interviewed sequentially. After an interview, the
employer decides whether or not to offer the job to the current candidate. If
2See Hopenhayn (1992) for an industry equilibrium model of entry and exit.
he does not offer the job to this candidate, the candidate must seek employ-
ment elsewhere. The employer wishes to maximize the probability of giving
an offer to the best candidate.
We formalize this problem in an abstract setting as follows. There are
N objects that have different qualities. They are ranked from 1 to N ,
according to decreasing order of quality. The true rankings are unknown to
the decision maker. He observes the objects one at a time in a random order.
He can either select the current object and terminate the search, or discard
it and choose the next object. His objective is to maximize the probability
of choosing the object ranked number 1.
To cast this problem in the optimal stopping model, we set the horizon
T = N . The decision time t = 1, 2, ..., N indexes the rank of a candidate
and has nothing to do with the calendar time. The difficulty of this problem
is how to model uncertainty generated by some Markov state process. We
set the state space for this process as Z = {0, 1}, where 1 denotes that
the current object is the best object (rank closest to 1) seen so far, and 0
denotes that a previous object was better. The transition function of this
state process is given by
$$Q_{t+1}(z_t, \{1\}) = \frac{1}{t+1}, \qquad Q_{t+1}(z_t, \{0\}) = \frac{t}{t+1},$$
for each $z_t \in Z$, $t = 1, 2, ..., N - 1$. The interpretation is that, given the date-t state $z_t$, the probability that the (t+1)th candidate is the best among the t + 1 candidates seen so far is 1/(t + 1). Note that this transition function is independent of the state $z_t$.
In any state, the action a = 0 means select the current object (give an
offer to the current candidate) and a = 1 means do not select the current
object and continue the search (interview the next candidate).
Rewards are received only when stopping; that is, choosing action a = 0.
They correspond to the probability of choosing the best candidate. To
compute this probability, we observe that
$$\Pr(\text{best object in first } t) = \frac{\#\{\text{subsets of } \{1, ..., N\} \text{ of size } t \text{ containing } 1\}}{\#\{\text{subsets of } \{1, ..., N\} \text{ of size } t\}} = \frac{\binom{N-1}{t-1}}{\binom{N}{t}} = \frac{t}{N},$$
where $\binom{N}{r}$ denotes a binomial coefficient.
We can now define the rewards. We set ft (zt) = 0, zt ∈ Z, gt (0) = 0,
gt (1) = t/N , and the terminal rewards, h (0) = 0 and h (1) = 1. To interpret
the terminal rewards, we note that if the last object is the best, then the
probability of choosing the best is 1; otherwise, it is 0.
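With these rewards and transition probabilities the problem can be solved by backward induction in a few lines; the horizon N = 100 below is an arbitrary illustrative choice.

```python
# Backward induction for the secretary problem with N candidates.
# State 1: current candidate is the best seen so far; state 0: otherwise.
# Stopping in state 1 at time t wins with probability g_t(1) = t/N; beta = 1.
N = 100
V1, V0 = 1.0, 0.0                    # terminal values h(1) = 1, h(0) = 0 at t = N
first_accept = N
for t in range(N - 1, 0, -1):        # t = N-1, ..., 1
    cont = V1 / (t + 1) + V0 * t / (t + 1)   # E[V_{t+1}] via Q_{t+1}
    if t / N >= cont:                # stopping beats continuing in state 1
        first_accept = t
    V1, V0 = max(t / N, cont), cont  # V_t(1) and V_t(0)

print(first_accept, round(V1, 4))    # acceptance threshold near N/e, value near 1/e
```

The resulting rule is the familiar one: pass over roughly the first N/e candidates, then accept the first candidate who is the best seen so far; the success probability is about 1/e.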
5.2.3 Bandit Model
Consider a gambler’s problem in which he chooses one of many arms of a
slot machine to play. Each arm may give different winning probabilities
unknown to the gambler. The gambler will choose the arm that yields the
largest expected reward. This problem is often referred to as a multi-armed
bandit problem. If the slot machine has only one arm, the problem is called a
one-armed bandit problem. Typically there are two types of bandit problems
according to the information available to the decision maker. In the first
type, the decision maker has complete information about the underlying
distribution of uncertainty and the structure of the system. In the other
type, the decision maker has incomplete information about the system. For
example, he may not know the type of the arms. He may learn about the
unknown part of the system by observing past states and actions. In this
chapter, we focus on the first type of bandit problems. We discuss the second
type in Chapter 10.3
Now we formulate the bandit problem with full information in terms of the Markov decision model. Suppose a decision maker observes the state of each of k Markov processes (k arms) at each date t = 0, 1, ..., T, T ≤ ∞. Each Markov process $(s_t^i)_{t=0}^T$, i = 1, 2, ..., k, is characterized by a sequence of transition functions $Q_t^i : S^i \times \mathcal{S}^i \to [0, 1]$, where $(S^i, \mathcal{S}^i)$ is a measurable space, t = 1, 2, ..., T − 1. The decision maker chooses one of the Markov processes at each date. If he chooses process i at date t when the state is $s_t = (s_t^1, ..., s_t^k)$, then he receives a reward $u_t^i(s_t^i)$, which does not depend on the other state processes. In addition, $s_t^i$ moves to a new state at date t + 1 randomly according to the transition function $Q_{t+1}^i$, while the states of the other processes remain frozen in that $s_{t+1}^j = s_t^j$ for all $j \neq i$. The decision maker's objective is to maximize the total discounted expected rewards from the selected processes. We summarize the bandit model as follows:
• Decision time:
$$t = 0, 1, 2, ....$$

• States:
$$S = S^1 \times S^2 \times \cdots \times S^k.$$

• Actions:
$$A = \{1, 2, ..., k\}, \qquad \Gamma(s) = A, \ s \in S.$$

• Stochastic kernels:
$$P_{t+1}\left((s_t^1, ..., s_t^k), a_t; B\right) = Q_{t+1}^i(s_t^i, B^i) \quad \text{if } a_t = i,$$
for $B = \{s_t^1\} \times \cdots \times \{s_t^{i-1}\} \times B^i \times \{s_t^{i+1}\} \times \cdots \times \{s_t^k\}$ and $B^i \in \mathcal{S}^i$. In addition, $P_{t+1}\left((s_t^1, ..., s_t^k), a_t; B\right) = 0$ for all other $B \in \mathcal{S}^1 \times \cdots \times \mathcal{S}^k$.
3See Berry and Fristedt (1985) and Gittins, Glazebrook and Weber (2011) for a comprehensive study of the bandit model.
• Rewards:
$$u_t(s_t, a_t) = u_t^i(s_t^i), \quad s_t = (s_t^1, ..., s_t^k) \in S, \ \text{if } a_t = i.$$

• Optimality criterion:
$$\sup_{(a_t)} E\left[\sum_{t=0}^{T} \beta^t u_t(s_t, a_t)\right], \quad \beta \in (0, 1).$$
A crucial feature of the bandit problem is that when process i is chosen
at time t, its state at time t+ 1 moves randomly, while the states of other
processes remain frozen. Thus, even though process i may be attractive at
time t, it may draw a bad state at time t + 1. At that state, the decision
maker may choose a different process. The bandit model is related to the
optimal stopping model. Some optimal stopping problems may be viewed
as a special case of the bandit model. For example, when there are two
arms, one arm is represented by a deterministic constant process and the
other arm is stochastic. Then this two-armed bandit problem is an optimal
stopping problem. In general, if both arms are stochastic, then the decision
maker may switch choices between these two arms depending on the state
of the arms. It is not an optimal stopping problem.
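The switching behavior can be illustrated by value iteration on the joint state, exploiting the frozen-arm structure of the stochastic kernels. The two two-state arms below, with their rewards and transition matrices, are illustrative assumptions.

```python
import numpy as np
from itertools import product

# Illustrative two-armed bandit (assumptions): arm i has two states,
# reward u[i][s_i], and transition matrix Q[i]; the unchosen arm is frozen.
u = [np.array([0.0, 1.0]), np.array([0.4, 0.5])]
Q = [np.array([[0.7, 0.3], [0.4, 0.6]]),      # Q^1
     np.array([[0.9, 0.1], [0.2, 0.8]])]      # Q^2
beta = 0.9

V = np.zeros((2, 2))                          # V(s1, s2) on the joint state
for _ in range(5_000):                        # value iteration
    V_new = np.empty_like(V)
    for s1, s2 in product(range(2), range(2)):
        # pull arm 1: s1 moves by Q^1, s2 stays frozen (and vice versa)
        pull1 = u[0][s1] + beta * Q[0][s1] @ V[:, s2]
        pull2 = u[1][s2] + beta * Q[1][s2] @ V[s1, :]
        V_new[s1, s2] = max(pull1, pull2)
    if np.max(np.abs(V_new - V)) < 1e-12:
        break
    V = V_new

for s1, s2 in product(range(2), range(2)):    # report the greedy choice
    pull1 = u[0][s1] + beta * Q[0][s1] @ V[:, s2]
    pull2 = u[1][s2] + beta * Q[1][s2] @ V[s1, :]
    print((s1, s2), "-> pull arm", 1 if pull1 >= pull2 else 2)
```

Printing the optimal choice at each joint state shows that it generally depends on both components, so the gambler may switch arms as states evolve.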
There are many applications of the bandit model in economics and op-
erations research, including task selection, search, resource allocation, and
choice of R&D processes, etc. Here we shall describe the first two applica-
tions and also discuss some extensions of the bandit model. More applica-
tions are available for the bandit model with incomplete information. We
will discuss this case in Chapter 10.
Task Selection
Consider the task selection problem taken from Puterman (1994). A decision maker decides to take on one of k tasks at any date t. The state $s_t^i$ represents
the degree of completion of task i, i = 1, 2, ..., k. If the decision maker takes on task i, he receives an expected reward $u_t^i(s_t^i)$. He receives a certain reward only if the task is completed. We may take $S^i = [0, 1]$, with $s^i \in S^i$ representing the fraction of the task that has been completed. Working on a task for an additional period is likely to draw the decision maker closer to completion, but there is a chance that he falls behind. We may model $(s_t^i)$ as a Markov process. Let $p^i(C|s_t^i)$ be the conditional probability of completing task i given the state $s_t^i$, where C represents completion of the task. Let $u_t^i(s_t^i) = R^i p^i(C|s_t^i)$, where $R^i$ represents the reward if task i is completed. Assume that the decision maker can work on only one task at any time. If he works on a given task, then the other tasks remain frozen at their previous states.
Search for the Best Alternative
Weitzman (1979) studies the following Pandora’s problem. There are k
closed boxes. Each box i, 1 ≤ i ≤ k, contains an unknown reward xi
drawn from a distribution F i (xi) , independent of other boxes. It costs ci
to open box i and the reward is delivered after time ti. The continuously
compounded discount rate is r. So the reward is discounted at βi = e−rti .
An outside value x0 is available.
At each stage, Pandora decides whether to open a box and which box
to open. If she chooses to stop searching, Pandora collects the maximum
reward she has uncovered. If she continues to search, she pays the searching
cost and waits for the reward to come. At the time when that reward is
realized, she faces the same decision problem again. Pandora’s objective is
to maximize the total discounted expected rewards. The question is: what is the optimal sequential search strategy for opening boxes?
Extensions
A natural extension of the above bandit model is to allow unchosen processes
(arms) to change states between decision times. The effect of this extension
is that we have to change the structure of stochastic kernels. This extended
model is often called the restless bandits model. A second extension is to allow the number of arms to change over time; for example, new arms may arrive according to a Poisson process. A third extension is to allow for switching costs when the decision maker switches among different arms.
5.2.4 Optimal Control
In many applications, it is more convenient to describe the evolution of the
state system by difference equations instead of stochastic kernels. The deci-
sion maker’s actions may affect these equations. This modelling framework
is called the optimal control model. We now introduce this model.
Let $(X, \mathcal{X})$, $(Z, \mathcal{Z})$, and $(A, \mathcal{A})$ be measurable spaces, where $X \subset \mathbb{R}^{n_x}$, $Z \subset \mathbb{R}^{n_z}$, and $A \subset \mathbb{R}^{n_a}$. Suppose the state s of the system consists of an
endogenous state x ∈ X and an exogenous state z ∈ Z, i.e., s = (x, z) . Time
is denoted by t = 0, 1, 2, ..., T ≤ ∞. The initial state (x0, z0) is given. The
exogenous state evolves according to a time-homogeneous Markov process
with the stationary transition function Q : Z×Z → [0, 1] .4 The endogenous
state evolves according to the following difference equation:
xt+1 = φt (xt, at, zt, zt+1) , t = 0, 1, ..., T − 1, (x0, z0) given, (5.11)
where φt : X× A× Z× Z→ X is a measurable function. After choosing an
action at ∈ Γ (xt, zt) , the decision maker obtains reward ut (xt, zt, at) . The
action at is a vector of control variables.
4The framework can be generalized to incorporate time-inhomogeneous Markov pro-cesses.
If T < ∞, we may set a terminal reward uT (xT , zT ) , which may be
interpreted as a scrap value or a bequest value. If there is no reward from
the terminal state xT , we may impose some additional constraint on xT .5
The decision maker's objective is to choose a feasible policy $(a_t)_{t=0}^{T-1}$ from $(x_0, z_0)$ to maximize the total discounted expected reward:
$$\sup_{(a_t)_{t=0}^{T-1} \in \Pi(x_0, z_0)} E\left[\sum_{t=0}^{T-1} \beta^t u_t(x_t, z_t, a_t) + \beta^T u_T(x_T, z_T)\right], \qquad (5.12)$$
subject to (5.11), where E is the expectation operator. The objective func-
tion in (5.12) is called the Bolza form. If uT = 0 in (5.12), it is the
Lagrange form. If ut = 0 in (5.12) for t = 0, 1, ..., T − 1, it is called the
Mayer form.
If T = ∞, we typically consider the time-homogeneous case in which $u_t = u$ and $\phi_t = \phi$. The decision maker's objective is to choose a feasible policy $(a_t)_{t=0}^{\infty}$ from $(x_0, z_0)$ to maximize the total discounted expected reward:
$$\sup_{(a_t)_{t=0}^{\infty} \in \Pi(x_0, z_0)} E\left[\sum_{t=0}^{\infty} \beta^t u(x_t, z_t, a_t)\right], \qquad (5.13)$$
subject to
xt+1 = φ (xt, at, zt, zt+1) , t ≥ 0, (x0, z0) given. (5.14)
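As a sketch of what an objective like (5.13) measures, the following Monte Carlo simulation evaluates it for one fixed feasible policy. The transition function φ, the reward u, the two-state chain for z, and the policy a_t = 0.1 x_t are all illustrative assumptions, not from the text.

```python
import numpy as np

# Estimate E[sum beta^t u(x_t, z_t, a_t)] by simulating many paths of the
# controlled system x_{t+1} = phi(x_t, a_t, z_t, z_{t+1}) under a fixed policy.
rng = np.random.default_rng(0)
beta, T_sim, n_paths = 0.95, 300, 5000
z_vals = np.array([0.95, 1.10])                  # exogenous state z (a return)
Qz = np.array([[0.8, 0.2], [0.3, 0.7]])          # its transition function Q

x = np.full(n_paths, 10.0)                       # x_0 (wealth)
iz = np.zeros(n_paths, dtype=int)                # index of z_0
total = np.zeros(n_paths)
for t in range(T_sim):
    a = 0.1 * x                                  # policy: consume 10% of wealth
    total += beta ** t * np.log(a)               # reward u(x, z, a) = log(a)
    # draw z_{t+1}: next index is 1 with probability Qz[iz, 1]
    iz = (rng.random(n_paths) >= Qz[iz, 0]).astype(int)
    x = z_vals[iz] * (x - a)                     # phi: reinvest what is not consumed

print("estimated discounted expected reward:", total.mean())
```

Solving for the optimal policy, rather than evaluating a given one, is the subject of dynamic programming in the following chapters.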
The optimal control model fits into the Markov decision process model by defining transition kernels as
$$P_{t+1}(x_t, z_t, a_t; B_1 \times B_2) = \int_{B_2} 1_{B_1}\left(\phi_t(x_t, a_t, z_t, z_{t+1})\right) Q(z_t, dz_{t+1}),$$
where $B_1 \in \mathcal{X}$, $B_2 \in \mathcal{Z}$, and $1_A(y)$ is an indicator function that is equal to 1 if $y \in A$ and equal to 0 otherwise.
The system transition equation (5.11) may be generalized to contain
past states and actions. This case can be cast into the present model by
5See Section 6.6.2 for an example.
redefining states and controls in terms of vectors to take account of past
states and controls. One may also include intratemporal constraints on the
states and actions each period.
When the action set A is a finite set, the control problem is a discrete
choice problem. When actions involve both discrete and continuous choices,
the problem is often called a mixed stopping and control problem.
Examples include inventory management and optimal investment with fixed
costs. With fixed costs, the decision maker has to decide when to make
adjustments as well as what size of adjustments to make. We will analyze
such problems in Chapter 8.5.
5.3 Exercises
1. (Job search) Consider McCall's (1970) job search model. An unemployed worker is searching for a job. Each period he draws one offer w
from the same wage distribution F over [0, B] . If the worker rejects the
offer, he receives unemployment compensation c this period and waits
until next period to draw another offer from F. If the worker accepts
the offer to work at w, he receives a wage of w each period forever.
The worker’s objective is to maximize the total discounted expected
income. Formulate this problem as an optimal stopping problem.
2. (Put option) An American put option gives the holder the right to sell
stocks at a specified price at any time up to its expiry date. Formulate
the problem of when to exercise the put option as an optimal stopping
problem.
3. (Secretary problem) Formulate a variant of the secretary problem in
which the employer’s objective is to maximize the probability of choos-
ing one of the two best candidates.
4. (Pandora’s problem) Formulate Pandora’s problem analyzed by Weitz-
man (1979) as a bandit problem.
5. (Stochastic Ramsey growth model) Consider a social planner’s re-
source allocation problem. His objective is to choose sequences of
consumption (Ct) , labor supply (Nt), and investment (It) so as to
maximize a representative household’s utility given by:
$$E\left[\sum_{t=0}^{\infty} \beta^t u(C_t, 1 - N_t)\right], \quad \beta \in (0, 1),$$
subject to the resource constraint:
Ct + It = ztF (Kt, Nt) ,
Kt+1 = (1− δ)Kt + It,
where $u : \mathbb{R}_+ \times (0, 1) \to \mathbb{R}$ is a von Neumann-Morgenstern utility function, $F : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}$ is a production function, and $z_t$ represents productivity shocks that follow a Markov process with transition function $Q : Z \times \mathcal{Z} \to [0, 1]$.
Formulate this problem as an optimal control problem. Argue that
one can choose Ct, It, and Nt as control variables or choose Kt+1 and
Nt as control variables.
6. (Inventory management) Consider a manager’s inventory management
problem. Each period he has some amount of inventory. He also faces
demand for his product. His sales are stochastic and are modeled as a
Markov process. Sales generate revenues. The current inventory may
not meet the demand. Thus, the manager may place a new order to increase inventory. A new order incurs a fixed cost and a variable cost.
Maintaining inventories also incurs costs. The manager’s objective is
to choose an ordering policy so as to maximize the total discounted
expected profits over an infinite horizon. Formulate this problem as
an optimal control problem.
Chapter 6
Finite-Horizon Dynamic Programming
The method of dynamic programming is best understood by studying finite-
horizon problems first. These problems are often encountered in making
life-cycle planning decisions on optimal consumption, savings, and portfolio
choice, etc. In this chapter, we first solve a simple example to illustrate
the idea of dynamic programming. We then present three main results
in dynamic programming theory. Finally, we use the method of dynamic
programming to solve some more complicated examples.
6.1 A Motivating Example
To explain the main idea of dynamic programming, we start with a deter-
ministic example. Consider the following social planner’s problem:
$$\max \sum_{t=0}^{T} \beta^t \ln(C_t) \qquad (6.1)$$
subject to
$$C_t + K_{t+1} = K_t^\alpha, \quad t = 0, 1, ..., T, \quad K_0 \text{ given.} \qquad (6.2)$$
Here, Ct and Kt represent consumption and capital, respectively. The dis-
count factor is β ∈ (0, 1) and α ∈ (0, 1) represents the capital share. In
151
this problem, the social planner makes a planning decision at time zero by
choosing sequences of consumption and capital.
A natural way to solve this problem is to use the resource constraint to
substitute out (Ct) in the objective function. Then one can derive first-order
conditions. Another way is to use the Lagrange method. One can also derive
first-order conditions. Both ways give the following second-order difference
equation:
$$\frac{1}{K_t^\alpha - K_{t+1}} = \beta \frac{\alpha K_{t+1}^{\alpha-1}}{K_{t+1}^\alpha - K_{t+2}}, \quad t = 0, 1, ..., T - 1.$$
We need two boundary conditions. One boundary condition is the initial condition with given $K_0$. The other is the terminal condition $K_{T+1} = 0$. Using these two conditions, we obtain the following solution:
$$K_{t+1} = \alpha\beta\, \frac{1 - (\alpha\beta)^{T-t}}{1 - (\alpha\beta)^{T-t+1}}\, K_t^\alpha, \quad t = 0, 1, ..., T. \qquad (6.3)$$
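The closed form (6.3) can be checked numerically against a brute-force discretization of the same planning problem; all parameter values below are illustrative assumptions.

```python
import numpy as np

# Compare the closed-form policy (6.3) with grid-based backward induction
# for the planner's problem (6.1)-(6.2). Parameters are illustrative.
alpha, beta, T, K0 = 0.3, 0.95, 10, 0.5
ab = alpha * beta

# Closed-form capital path from (6.3), starting at K0.
K = [K0]
for t in range(T + 1):
    K.append(ab * (1 - ab ** (T - t)) / (1 - ab ** (T - t + 1)) * K[-1] ** alpha)

# Backward induction on a capital grid:
#   V_T(K) = ln(K^alpha),  V_t(K) = max_{K'} ln(K^alpha - K') + beta V_{t+1}(K').
grid = np.linspace(1e-4, 1.2, 1500)
V = np.log(grid ** alpha)                       # V_T
policies = []
for t in range(T - 1, -1, -1):
    C = grid[:, None] ** alpha - grid[None, :]  # consumption for each (K, K')
    val = np.where(C > 0,
                   np.log(np.maximum(C, 1e-300)) + beta * V[None, :],
                   -np.inf)
    policies.append(grid[val.argmax(axis=1)])   # g_t(K) on the grid
    V = val.max(axis=1)
policies.reverse()                              # policies[t] is g_t

g0 = np.interp(K0, grid, policies[0])           # grid policy at K_0, t = 0
print(g0, K[1])                                 # should agree up to grid spacing
```

The two printed numbers should coincide up to the accuracy of the grid, and the terminal condition $K_{T+1} = 0$ is satisfied by construction in (6.3).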
Next, we consider another problem in which the social planner chooses a sequence $\{C_{t+j}, K_{t+j+1}\}_{j=0}^{T-t}$ at time t so as to maximize discounted utility from time t onward:
$$V_t(K_t) = \max \sum_{j=0}^{T-t} \beta^j \ln(C_{t+j})$$
subject to a sequence of resource constraints in (6.2) starting from $K_t$. Here,
Vt (Kt) is called the value function at date t. The original problem (6.1)
is the special case with t = 0. Using the same method, we can derive:
$$K_{t+1+j} = \alpha\beta\, \frac{1 - (\alpha\beta)^{T-t-j}}{1 - (\alpha\beta)^{T-t-j+1}}\, K_{t+j}^\alpha, \quad j = 0, 1, ..., T - t, \qquad (6.4)$$
$$V_t(K_t) = \frac{\alpha\left(1 - (\alpha\beta)^{T-t+1}\right)}{1 - \alpha\beta} \ln K_t + A_t, \qquad (6.5)$$
where At satisfies the difference equation:
$$A_t = \beta A_{t+1} + b_t, \quad A_T = 0,$$
$$b_t = \ln\frac{1 - \alpha\beta}{1 - (\alpha\beta)^{T-t+1}} + \frac{\alpha\beta\left(1 - (\alpha\beta)^{T-t}\right)}{1 - \alpha\beta}\left[\ln(\alpha\beta) + \ln\frac{1 - (\alpha\beta)^{T-t}}{1 - (\alpha\beta)^{T-t+1}}\right]. \qquad (6.6)$$
Note that the above problem is a deterministic optimal control problem.
We may use the dynamic programming method. The idea of this method
is to use backward induction. We start with the last period t = T. The
planner solves the following problem:
$$\max \ \ln(C_T) \qquad (6.7)$$
subject to (6.2) for t = T. Clearly, the solution is given by:
$$C_T = K_T^\alpha, \quad K_{T+1} = 0.$$
In addition, VT (KT ) = α lnKT given in (6.5) is equal to the maximum value
of problem (6.7).
At time t = T − 1, there are only two decision periods. The planner
solves the following problem:
$$\max \ \ln(C_{T-1}) + \beta V_T(K_T), \qquad (6.8)$$
subject to (6.2) for t = T − 1. The solution is given by:
$$C_{T-1} = \frac{1}{1 + \alpha\beta} K_{T-1}^\alpha, \qquad K_T = \frac{\alpha\beta}{1 + \alpha\beta} K_{T-1}^\alpha.$$
In addition, we can verify that VT−1 (KT−1) is equal to the maximum value
of problem (6.8).
In general, at any time t = 0, 1, ..., T − 1, the planner knows that the
maximized value of utility from time t+1 onward is Vt+1 (Kt+1) if the capital
stock at t+1 is Kt+1. The planner solves the following two-period problem:
$$\max_{C_t, K_{t+1}} \ \ln(C_t) + \beta V_{t+1}(K_{t+1}),$$
subject to (6.2) at t. We obtain the solution for Kt+1 given in (6.3). We
may write the solution as Kt+1 = gt (Kt) , where the function gt is called an
optimal decision rule or policy function. In addition, we can show that
Vt (Kt) given in (6.5) satisfies the equation:
$$V_t(K_t) = \max_{C_t, K_{t+1}} \ \ln(C_t) + \beta V_{t+1}(K_{t+1}).$$
This equation is often called a Bellman equation. In the literature, it is also called the dynamic programming equation, the optimality equation, or the functional equation. The problem in this equation is called a dynamic programming problem.
From the above analysis, we observe the following properties: (i) The
sequence of value functions obtained from the planning problem satisfies
the Bellman equation from dynamic programming. (ii) The optimal capital policy (plan) obtained from the planning problem is identical to the one generated from the optimal policy functions obtained from dynamic programming. In addition, the optimal plan made at time zero remains optimal at any future date t whenever the date-t state $K_t$ results from the initial optimal plan. In fact, these properties are quite general and are called
the Principle of Optimality. This principle is stated in Bellman (1957,
p. 83) as follows:
“An optimal policy has the property that whatever the ini-
tial state and initial decision are, the remaining decisions must
constitute an optimal policy with regard to the state resulting
from the first decision.”
Before formalizing this principle in the general Markov decision process
model in Section 6.3, we first explain the difficulty involved in stochastic
models in the next section.
6.2 Measurability Problem
In stochastic models, measurability is necessary for computing expectations.
This issue does not arise in deterministic models or stochastic models with
countable state spaces. Unfortunately, the measurability of value functions or policy functions on general uncountable state spaces may fail to hold, in which case the dynamic programming method does not work. We use a simple two-period optimization problem adapted from Bertsekas and Shreve (1978) as an example to illustrate this issue.
Suppose a decision maker chooses a plan $\pi = (\pi_0, \pi_1)$ to solve the following problem:
$$V_0(s_0) = \sup_{\pi} \int u(s_1, \pi_1(s_1))\, P(s_0, \pi_0; ds_1),$$
where π0 ∈ A and π1 : S → A is a measurable function. The measurability
of π1 is a necessary requirement for the computation of integration. Suppose
we use the method of dynamic programming to solve the above problem.
We proceed by backward induction:
$$V_1^*(s_1) = \sup_{a_1 \in A} u(s_1, a_1),$$
$$V_0^*(s_0) = \sup_{a_0 \in A} \int V_1^*(s_1)\, P(s_0, a_0; ds_1).$$
Suppose $V_0$, $V_1^*$, and $V_0^*$ are finite functions. The Principle of Optimality implies that $V_0(s_0) = V_0^*(s_0)$. An informal proof of this result consists of the following two steps:
Step 1. For any plan π,
$$V_0^*(s_0) \ge \int V_1^*(s_1)\, P(s_0, \pi_0; ds_1) \ge \int u(s_1, \pi_1(s_1))\, P(s_0, \pi_0; ds_1).$$
Thus, $V_0^*(s_0) \ge V_0(s_0)$.
Step 2. By the definition of supremum, given any $s_1 \in S$ and any $\varepsilon > 0$, there exists $a_1^*(s_1) \in A$ such that
$$u(s_1, a_1^*(s_1)) + \varepsilon \ge V_1^*(s_1). \qquad (6.9)$$
Thus, for any $a_0 \in A$,
$$V_0(s_0) \ge \int u(s_1, a_1^*(s_1))\, P(s_0, a_0; ds_1) \qquad (6.10)$$
$$\ge \int V_1^*(s_1)\, P(s_0, a_0; ds_1) - \varepsilon. \qquad (6.11)$$
Taking $\varepsilon \to 0$ and taking the supremum over $a_0$, we deduce that $V_0(s_0) \ge V_0^*(s_0)$.
What goes wrong in the above proof? First, we need $a_1^*$ to be measurable for the integration in (6.10) to make sense. But the measurable selection $a_1^*$ in (6.9) may not exist. Second, we need $V_1^*$ to be measurable for the integration in (6.11) to make sense. However, there are examples in which $V_1^*$ may not be measurable. As Blackwell (1965) points out, the nonmeasurability of $V_1^*$ stems from the fact that the projection of a Borel set in $\mathbb{R}^2$ on one of the axes need not be Borel measurable in the case where $S = A = \mathbb{R}$: for any $c \in \mathbb{R}$,
$$\{s_1 : V_1^*(s_1) > c\} = \operatorname{proj}_{s_1}\{(s_1, a_1) : u(s_1, a_1) > c\},$$
where for any set $E \subset S \times A$,
$$\operatorname{proj}_{s_1} E = \{s_1 : (s_1, a_1) \in E \text{ for some } a_1 \in A\}.$$
Even though $\{(s_1, a_1) : u(s_1, a_1) > c\}$ is a Borel set, $\{s_1 : V_1^*(s_1) > c\}$ need not be. There are several approaches to dealing with these two difficulties in the literature (see Bertsekas and Shreve (1978)). Instead of getting
into the complicated technical details, we simply sidestep these issues by
imposing sufficient conditions in order to formalize the key idea of dynamic
programming. For the model in Section 6.4, all these conditions are satisfied
so that we can apply the method of dynamic programming.
6.3 The Principle of Optimality
Consider the Markov decision process model presented in Chapter 5. In addition to studying the initial planning decisions, we also consider planning decisions at any future date t = 1, 2, ..., T − 1. Suppose at date t the decision maker arrives at the history $s^t \in S^t$. Imagine that at the history $s^t$, after experiencing a history of actions $(\pi_0, ..., \pi_{t-1}(s^{t-1}))$, the decision maker reconsiders the decision made initially at time zero. What is his new optimal plan at $s^t$? To formulate this problem, we need to consider feasible plans from any history $s^t$ and their feasible continuation plans following $s_{t+1}$.
A feasible plan from history $s^t$, denoted $\pi^{s^t}$, consists of a point $\pi_t^{s^t} \in \Gamma(s_t)$ and a sequence of measurable functions $\pi_{t+j}^{s^t} : S^j \to A$ satisfying $\pi_{t+j}^{s^t}(s_{t+1}^{t+j}) \in \Gamma(s_{t+j})$, for any $s_{t+1}^{t+j} \equiv (s_{t+1}, ..., s_{t+j}) \in S^j$, $j = 1, 2, ..., T - t$. Denote the set of all feasible plans from history $s^t$ by $\Pi(s^t)$. For t = 0, we set $s^0 = s_0$.
For any feasible plan from $s^t$, $\pi^{s^t} \in \Pi(s^t)$, we define its continuation plan following $s_{t+1}$, $C(\pi^{s^t}, s_{t+1})$, as follows:
$$C_{t+1}(\pi^{s^t}, s_{t+1}) = \pi_{t+1}^{s^t}(s_{t+1}),$$
$$C_{t+j}(s_{t+2}^{t+j}; \pi^{s^t}, s_{t+1}) = \pi_{t+j}^{s^t}(s_{t+1}, s_{t+2}^{t+j}),$$
for any $s_{t+2}^{t+j} \in S^{j-1}$, $j = 2, ..., T - t$. By definition, $C_{t+1}(\pi^{s^t}, s_{t+1})$ is a measurable function of $s_{t+1}$. Each $C_{t+j}(\cdot\,; \pi^{s^t}, s_{t+1})$ is a measurable function on $S^{j-1}$ and is also a measurable function of $s_{t+1}$. Since $\pi^{s^t}$ is feasible, $C(\pi^{s^t}, s_{t+1})$ is also feasible by definition in that $C_{t+1}(\pi^{s^t}, s_{t+1}) \in \Gamma(s_{t+1})$ and $C_{t+j}(s_{t+2}^{t+j}; \pi^{s^t}, s_{t+1}) \in \Gamma(s_{t+j})$. Thus, $C(\pi^{s^t}, s_{t+1}) \in \Pi(s^{t+1})$.
Now, we define the conditional probability measure on $S^j$ given the history $s^t$ and the feasible plan $\pi^{s^t}$ from $s^t$:
$$\mu_t^{t+j}(s^t, B; \pi^{s^t}) = \int_{B_1} \cdots \int_{B_j} P_{t+j}\left(s_{t+j-1}, \pi_{t+j-1}^{s^t}(s_{t+1}^{t+j-1}); ds_{t+j}\right) \cdots P_{t+1}\left(s_t, \pi_t^{s^t}; ds_{t+1}\right), \qquad (6.12)$$
where $B = B_1 \times \cdots \times B_j \in \mathcal{S}^j$, $j = 1, 2, ..., T - t$. For j = 1, we set $\pi_{t+j-1}^{s^t}(s_{t+1}^{t+j-1}) = \pi_t^{s^t}$. Using these measures, we compute the sequence of discounted expected rewards resulting from any plan $\pi^{s^t}$ from history $s^t$:
$$J_T^T(s^T) = u_T(s_T),$$
$$J_t^T(s^t, \pi^{s^t}) = u_t(s_t, \pi_t^{s^t}) + \sum_{j=1}^{T-t-1} \beta^j \int u_{t+j}\left(s_{t+j}, \pi_{t+j}^{s^t}(s_{t+1}^{t+j})\right) \mu_t^{t+j}(s^t, ds_{t+1}^{t+j}; \pi^{s^t}) + \beta^{T-t} \int u_T(s_T)\, \mu_t^T(s^t, ds_{t+1}^T; \pi^{s^t}), \qquad (6.13)$$
for t = 0, 1, ..., T − 1. Here we use $ds_{t+1}^{t+j}$ to denote integration with respect to $(s_{t+1}, ..., s_{t+j})$ on $S^j$.
We are ready to write the decision maker's decision problem when arriving at any history $s^t$ as follows:
$$V_t(s^t) = \sup_{\pi^{s^t} \in \Pi(s^t)} J_t^T(s^t, \pi^{s^t}), \quad t = 0, 1, ..., T - 1, \qquad (6.14)$$
where $V_t : S^t \to \mathbb{R} \cup \{\pm\infty\}$ is the value function at time t. Since at the terminal time there is no decision to make, we set $V_T(s^T) = J_T^T(s^T) = u_T(s_T)$. The original Markov decision problem in (5.5) is a special case of the sequence problem (6.14) with t = 0.
In this book, we shall focus on the case in which value functions always
take on finite values because the value of infinity has no economic meaning.
In view of the discussion in Section 6.2, we shall impose:
Assumption 6.3.1 (a) The value function $V_t : S^t \to \mathbb{R}$ is measurable. (b) For any $\varepsilon > 0$, there exists a plan $\pi^{s^t}$ from history $s^t$ such that
$$J_t^T(s^t, \pi^{s^t}) + \varepsilon \ge V_t(s^t), \qquad (6.15)$$
for t = 1, 2, ..., T − 1. In addition, as functions of $s^t$, $\pi_t^{s^t}$ is measurable and each $\pi_{t+j}^{s^t}(s_{t+1}^{t+j})$ is measurable for any fixed $s_{t+1}^{t+j} \in S^j$, $j = 1, 2, ..., T - t - 1$.
By the definition of supremum, there exists a feasible $\pi^{s^t}$ such that (6.15) holds. The added restriction of Assumption 6.3.1(b) is that $\pi^{s^t}$ must satisfy certain measurability requirements.
The following lemma is fundamental for the Principle of Optimality.
Lemma 6.3.1 Let the Markov decision process in (5.1) be given. Suppose Assumptions 5.1.1-5.1.3 hold. Then for any feasible plan $\pi^{s^t}$ from $s^t \in S^t$,
$$J_t^T(s^t, \pi^{s^t}) = u_t(s_t, \pi_t^{s^t}) + \beta \int J_{t+1}^T\left(s^{t+1}, C(\pi^{s^t}, s_{t+1})\right) P_{t+1}(s_t, \pi_t^{s^t}; ds_{t+1}), \qquad (6.16)$$
where t = 0, 1, ..., T − 1.
Proof. Substituting equation (6.13) at t + 1 into the right-hand side of (6.16), we compute
$$u_t(s_t, \pi_t^{s^t}) + \beta \int J_{t+1}^T\left(s^{t+1}, C(\pi^{s^t}, s_{t+1})\right) P_{t+1}(s_t, \pi_t^{s^t}; ds_{t+1})$$
$$= u_t(s_t, \pi_t^{s^t}) + \beta \int u_{t+1}\left(s_{t+1}, \pi_{t+1}^{s^t}(s_{t+1})\right) P_{t+1}(s_t, \pi_t^{s^t}; ds_{t+1})$$
$$\quad + \sum_{j=2}^{T-t-1} \beta^j \int\!\!\int u_{t+j}\left(s_{t+j}, \pi_{t+j}^{s^t}(s_{t+1}^{t+j})\right) \mu_{t+1}^{t+j}\left(s^{t+1}, ds_{t+2}^{t+j}; C(\pi^{s^t}, s_{t+1})\right) P_{t+1}(s_t, \pi_t^{s^t}; ds_{t+1})$$
$$\quad + \beta^{T-t} \int\!\!\int u_T(s_T)\, \mu_{t+1}^T\left(s^{t+1}, ds_{t+2}^T; C(\pi^{s^t}, s_{t+1})\right) P_{t+1}(s_t, \pi_t^{s^t}; ds_{t+1})$$
$$= u_t(s_t, \pi_t^{s^t}) + \sum_{j=1}^{T-t-1} \beta^j \int u_{t+j}\left(s_{t+j}, \pi_{t+j}^{s^t}(s_{t+1}^{t+j})\right) \mu_t^{t+j}(s^t, ds_{t+1}^{t+j}; \pi^{s^t}) + \beta^{T-t} \int u_T(s_T)\, \mu_t^T(s^t, ds_{t+1}^T; \pi^{s^t})$$
$$= J_t^T(s^t, \pi^{s^t}),$$
where the second equality follows from the definition of $\mu_t^{t+j}$ in (6.12).
160 CHAPTER 6. FINITE-HORIZON DYNAMIC PROGRAMMING
Equation (6.16) implies that the objective function has a recursive structure. This recursive structure, together with the Markov decision process, ensures that the sequence of value functions also has a recursive structure and is Markovian, as the following theorem shows.

Theorem 6.3.1 Let the Markov decision process in (5.1) be given. Let (V_t)_{t=0}^T be the sequence of value functions for the sequence problem (6.14). Suppose Assumptions 5.1.1-5.1.3 and 6.3.1 hold. Then (V_t)_{t=0}^T satisfies the Bellman equation:

V_t(s^t) = sup_{a_t ∈ Γ(s_t)} u_t(s_t, a_t) + β ∫ V_{t+1}(s^{t+1}) P_{t+1}(s_t, a_t; ds_{t+1}), (6.17)

where V_T(s^T) = u_T(s_T). In addition, (V_t) is Markovian in that V_t(s^t) does not depend on s^{t−1} for t = 1, 2, ..., T.
Proof. In the terminal period T, there is no decision to make and thus V_T(s^T) = u_T(s_T). Let W_t(s^t) denote the right-hand side of (6.17). We first show that W_t(s^t) ≥ V_t(s^t). Consider any plan from s^t, π^{s^t} ∈ Π(s^t). By the definition of V_{t+1}, we derive:

W_t(s^t) = sup_{a_t ∈ Γ(s_t)} u_t(s_t, a_t) + β ∫ V_{t+1}(s^{t+1}) P_{t+1}(s_t, a_t; ds_{t+1})
≥ sup_{a_t ∈ Γ(s_t)} u_t(s_t, a_t) + β ∫ J_{t+1}^T(s^{t+1}, C(π^{s^t}, s_{t+1})) P_{t+1}(s_t, a_t; ds_{t+1})
≥ u_t(s_t, π_t^{s^t}) + β ∫ J_{t+1}^T(s^{t+1}, C(π^{s^t}, s_{t+1})) P_{t+1}(s_t, π_t^{s^t}; ds_{t+1})
= J_t^T(s^t, π^{s^t}),

where the first inequality uses the fact that C(π^{s^t}, s_{t+1}) ∈ Π(s^{t+1}), the second follows from the fact that π_t^{s^t} ∈ Γ(s_t), and the last equality follows from Lemma 6.3.1. Thus, W_t(s^t) ≥ V_t(s^t).
Next, we show that W_t(s^t) ≤ V_t(s^t). By Assumption 6.3.1, for any ε > 0, there exists a plan from s^{t+1}, π^{s^{t+1}} ∈ Π(s^{t+1}), such that:

V_{t+1}(s^{t+1}) ≤ J_{t+1}^T(s^{t+1}, π^{s^{t+1}}) + ε. (6.18)

In addition, this plan satisfies a certain measurability property. For any action a_t ∈ Γ(s_t), we construct a feasible plan π^{s^t} ∈ Π(s^t) such that π_t^{s^t} = a_t and C(π^{s^t}, s_{t+1}) = π^{s^{t+1}}. By the definition of V_t, Lemma 6.3.1, and (6.18), we obtain:

V_t(s^t) ≥ J_t^T(s^t, π^{s^t})
= u_t(s_t, a_t) + β ∫ J_{t+1}^T(s^{t+1}, π^{s^{t+1}}) P_{t+1}(s_t, a_t; ds_{t+1})
≥ u_t(s_t, a_t) + β ∫ V_{t+1}(s^{t+1}) P_{t+1}(s_t, a_t; ds_{t+1}) − βε.

The measurability assumption on π^{s^{t+1}} is needed to ensure that the above integration makes sense. Since a_t ∈ Γ(s_t) is arbitrary, we deduce that

V_t(s^t) ≥ W_t(s^t) − βε.

Letting ε → 0, we obtain V_t(s^t) ≥ W_t(s^t). Combining this with the first step, we have proved (6.17).

Finally, by definition, V_T(s^T) is independent of s^{T−1}. Suppose V_{t+1}(s^{t+1}) is independent of s^t. Then it follows from (6.17) that V_t(s^t) is independent of s^{t−1}. By induction, we conclude that (V_t) is Markovian.
Theorem 6.3.1 is often called the Dynamic Programming Principle in control theory. By this theorem, we may write the value function V_t as a function of s_t and rewrite the Bellman equation (6.17) as:

V_t(s_t) = sup_{a_t ∈ Γ(s_t)} u_t(s_t, a_t) + β ∫ V_{t+1}(s_{t+1}) P_{t+1}(s_t, a_t; ds_{t+1}). (6.19)

If the supremum is attained, we denote the set of maximizers by G_t(s_t) for any s_t ∈ S. This gives a policy correspondence G_t from S to A. If each
G_t is nonempty and if there exists a measurable selection from G_t, then we say that π is generated by (G_t)_{t=0}^{T−1} from s_0 if it is formed in the following way. Let (g_t)_{t=0}^{T−1} be a sequence of measurable selections from (G_t)_{t=0}^{T−1}, and define π by:

π_0 = g_0(s_0),
π_t(s^t) = g_t(s_t), all s^t ∈ S^t, t = 1, ..., T − 1.

By this construction, the plan π is Markovian.
Theorem 6.3.1 demonstrates that the value function from the Markov
decision problem (5.5) can be computed by backward induction using the
Bellman equation (6.17) or (6.19). A natural question is how to derive an
optimal plan for the Markov decision problem using Bellman equations. The
following theorem shows that any plan generated by the sequence of optimal
policy correspondences associated with the Bellman equation (6.19) is an
optimal plan for the Markov decision problem. In addition, this optimal
plan is Markovian.
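For finite state and action spaces, this backward-induction computation can be sketched in a few lines (a hypothetical numerical illustration, not from the text; the function `backward_induction` and its toy inputs are names of my own choosing, and the reward and transition arrays are assumed time-invariant for brevity):

```python
import numpy as np

def backward_induction(u, uT, P, beta, T):
    """Compute V_t(s) = max_a { u[s, a] + beta * sum_{s'} P[s, a, s'] * V_{t+1}(s') }
    backward from the terminal condition V_T = uT, as in the Bellman equation (6.19)."""
    V = [None] * (T + 1)
    g = [None] * T
    V[T] = uT                              # no decision at the terminal time
    for t in range(T - 1, -1, -1):
        Q = u + beta * (P @ V[t + 1])      # Q[s, a] = u(s, a) + beta * E V_{t+1}
        V[t] = Q.max(axis=1)               # value function V_t
        g[t] = Q.argmax(axis=1)            # a (trivially measurable) selection from G_t
    return V, g
```

For instance, with two states and two actions (action 0 stays put, action 1 switches states), a reward of 1 for action 0 in state 0 and for action 1 in state 1, zero terminal reward, β = 1, and T = 2, the recursion returns V_0 = (2, 2) with greedy policy g_0 = (0, 1).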
Theorem 6.3.2 Let the Markov decision process in (5.1) be given and Assumptions 5.1.1-5.1.3 hold. Suppose that the solution V_t^* to the Bellman equation (6.19) is a finite and measurable function on S and that the associated optimal policy correspondence G_t is nonempty and permits a measurable selection, for each t = 0, 1, ..., T − 1. Then any Markovian plan π^* generated by (G_t) attains the supremum of (6.14), in that:

V_t^*(s_t) = J_t^T(s^t, π^{*s^t}) = V_t(s_t), all s_t ∈ S, t = 0, 1, ..., T − 1,

where π^{*s^t} is the plan from history s^t defined by π_t^{*s^t} = π_t^*(s_t), π_{t+j}^{*s^t}(s_{t+1}^{t+j}) = π_{t+j}^*(s_{t+j}), j = 1, ..., T − t − 1.
Proof. Consider any feasible plan from s^t, π^{s^t} ∈ Π(s^t). By (6.19) and the assumptions,

V_t^*(s_t) = sup_{a_t ∈ Γ(s_t)} u_t(s_t, a_t) + β ∫ V_{t+1}^*(s_{t+1}) P_{t+1}(s_t, a_t; ds_{t+1})

≥ u_t(s_t, π_t^{s^t}) + β ∫ V_{t+1}^*(s_{t+1}) P_{t+1}(s_t, π_t^{s^t}; ds_{t+1})

= u_t(s_t, π_t^{s^t}) + β ∫ { sup_{a_{t+1} ∈ Γ(s_{t+1})} [ u_{t+1}(s_{t+1}, a_{t+1}) + β ∫ V_{t+2}^*(s_{t+2}) P_{t+2}(s_{t+1}, a_{t+1}; ds_{t+2}) ] } P_{t+1}(s_t, π_t^{s^t}; ds_{t+1})

≥ u_t(s_t, π_t^{s^t}) + β ∫ [ u_{t+1}(s_{t+1}, π_{t+1}^{s^t}(s_{t+1})) + β ∫ V_{t+2}^*(s_{t+2}) P_{t+2}(s_{t+1}, π_{t+1}^{s^t}(s_{t+1}); ds_{t+2}) ] P_{t+1}(s_t, π_t^{s^t}; ds_{t+1})

= u_t(s_t, π_t^{s^t}) + β ∫ u_{t+1}(s_{t+1}, π_{t+1}^{s^t}(s_{t+1})) μ_t^{t+1}(s^t, ds_{t+1}; π^{s^t}) + β^2 ∫∫ V_{t+2}^*(s_{t+2}) μ_t^{t+2}(s^t, ds_{t+1}, ds_{t+2}; π^{s^t}),

where the two inequalities follow from the feasibility of π^{s^t}, and the last equality follows from (6.12). Continuing in this way, we can show by induction that V_t^*(s_t) ≥ J_t^T(s^t, π^{s^t}) for any π^{s^t} ∈ Π(s^t). Thus, V_t^*(s_t) ≥ V_t(s_t). By assumption, we can replace the above inequalities by equalities if we set π = π^*. Thus, V_t^*(s_t) = J_t^T(s^t, π^{*s^t}) = V_t(s_t).
The above theorem may be viewed as a Verification Theorem in con-
trol theory. The final result gives a partial converse of this theorem. It
states that any optimal plan for the Markov decision problem can be gen-
erated by the sequence of optimal policy correspondences for the dynamic
programming problem. This result ensures that the dynamic programming
method finds all optimal plans for the Markov decision problem.
Theorem 6.3.3 Let the Markov decision process in (5.1) be given and Assumptions 5.1.1-5.1.3 hold. Suppose that the sequence of value functions (V_t)_{t=0}^T satisfies the Bellman equation (6.17) and V_t : S^t → R is measurable for t = 1, 2, ..., T. Suppose that the associated optimal policy correspondence G_t is nonempty and permits a measurable selection for t = 0, 1, ..., T − 1. Then any optimal plan π^* ∈ Π(s_0) for problem (5.5) satisfies:

π_0^* ∈ G_0(s_0) and π_t^*(s^t) ∈ G_t(s_t), all t, s^t.
Proof. Since the sequence of value functions (V_t) satisfies the Bellman equation (6.17), it is sufficient to show that

V_0(s_0) = u_0(s_0, π_0^*) + β ∫ V_1(s_1) P_1(s_0, π_0^*; ds_1), (6.20)

V_t(s^t) = u_t(s_t, π_t^*(s^t)) + β ∫ V_{t+1}(s^{t+1}) P_{t+1}(s_t, π_t^*(s^t); ds_{t+1}). (6.21)

We derive these equations by forward induction.

Consider (6.20) first. Since π^* is an optimal plan,

V_0(s_0) = J_0^T(s_0, π^*) ≥ J_0^T(s_0, π), all π ∈ Π(s_0). (6.22)

By Lemma 6.3.1, we can rewrite the above inequality as:

u_0(s_0, π_0^*) + β ∫ J_1^T(s_1, C(π^*, s_1)) P_1(s_0, π_0^*; ds_1) (6.23)
≥ u_0(s_0, π_0) + β ∫ J_1^T(s_1, C(π, s_1)) P_1(s_0, π_0; ds_1),

for all π ∈ Π(s_0).

Next, choose a measurable selection g_t from G_t and define the plan π^g ∈ Π(s_0) as follows:

π_0^g = π_0^*,
π_t^g(s^t) = g_t(s_t), all s^t ∈ S^t, t = 1, 2, ..., T − 1.
Thus, for any s_1 ∈ S, the continuation plan C(π^g, s_1) is a plan generated by (G_t) from s_1. Replacing π by π^g in (6.23) gives

∫ J_1^T(s_1, C(π^*, s_1)) P_1(s_0, π_0^*; ds_1) ≥ ∫ J_1^T(s_1, C(π^g, s_1)) P_1(s_0, π_0^*; ds_1). (6.24)

By Theorem 6.3.2,

V_1(s_1) = J_1^T(s_1, C(π^g, s_1)) ≥ J_1^T(s_1, σ), (6.25)

for all s_1 ∈ S and σ ∈ Π(s_1). Since C(π^*, s_1) ∈ Π(s_1), it follows that

J_1^T(s_1, C(π^g, s_1)) ≥ J_1^T(s_1, C(π^*, s_1)), all s_1 ∈ S. (6.26)

The two inequalities (6.24) and (6.26) imply that

J_1^T(s_1, C(π^g, s_1)) = J_1^T(s_1, C(π^*, s_1)), P_1(s_0, π_0^*; ·)-a.e.

It then follows from (6.25) that

V_1(s_1) = J_1^T(s_1, C(π^*, s_1)), P_1(s_0, π_0^*; ·)-a.e. (6.27)

Thus, by the optimality of π^*,

V_0(s_0) = J_0^T(s_0, π^*)
= u_0(s_0, π_0^*) + β ∫ J_1^T(s_1, C(π^*, s_1)) P_1(s_0, π_0^*; ds_1)
= u_0(s_0, π_0^*) + β ∫ V_1(s_1) P_1(s_0, π_0^*; ds_1),

where the second line follows from Lemma 6.3.1. Since (V_t) satisfies the Bellman equation (6.19) at t = 0, the above equation implies that π_0^* ∈ G_0(s_0).

Using an analogous argument, with (6.27) in place of (6.22) as the starting point, we can show that (6.21) holds for t = 1, and continue by induction.
In this theorem, we may impose Assumption 6.3.1 to ensure that (Vt)
satisfies the Bellman equation by Theorem 6.3.1. If we know the solutions to
problems (6.17) and (6.14) are unique, then Theorem 6.3.2 guarantees that the dynamic programming method finds the unique solution to the Markov decision problem, and Theorem 6.3.3 is unnecessary.

Theorems 6.3.1-6.3.3 give the formal statement of the Principle of Optimality. Though Theorem 6.3.1 is intuitive, its key Assumption 6.3.1 is hard to check in applications. By contrast, the assumptions of Theorem 6.3.2 can easily be verified by imposing certain continuity conditions, as we show in the next section. Thus, this theorem is widely used in applications.
6.4 Optimal Control
In this section, we consider the optimal control model introduced in Section 5.2.4. In this model, a state s can be decomposed into an endogenous state x and an exogenous state z. The exogenous state follows a time-homogeneous Markov process with transition function Q. The endogenous state follows the dynamics in (5.14). The system transition function φ may be time dependent, as in (5.11); all results in this section can easily be extended to that general case by suitably modifying the assumptions.
We consider the control problem in the Bolza form:

sup_{(a_t)_{t=0}^{T−1} ∈ Π(x_0, z_0)} E[ Σ_{t=0}^{T−1} β^t u_t(x_t, z_t, a_t) + β^T u_T(x_T, z_T) ] (6.28)

subject to (5.14).
The corresponding dynamic programming problem is given by:

V_t(x_t, z_t) = sup_{a_t ∈ Γ(x_t, z_t)} u_t(x_t, z_t, a_t) + β ∫ V_{t+1}(x_{t+1}, z_{t+1}) Q(z_t, dz_{t+1}) (6.29)

subject to (5.14) and the terminal condition V_T(x_T, z_T) = u_T(x_T, z_T). We shall study the existence and properties of the solution to this problem and its connection to the control problem. We impose the following assumptions
on the primitives of the model, (X, 𝒳), (A, 𝒜), (Z, 𝒵), Q, Γ, φ, (u_t)_{t=0}^T, and β:
Assumption 6.4.1 X is a convex Borel set in R^{n_x}, with its Borel subsets 𝒳.

Assumption 6.4.2 A is a convex Borel set in R^{n_a}, with its Borel subsets 𝒜.

Assumption 6.4.3 One of the following conditions holds:

(i) Z is a countable set and 𝒵 is the σ-algebra containing all subsets of Z; or

(ii) Z is a compact Borel set in R^{n_z}, with its Borel subsets 𝒵, and the transition function Q on (Z, 𝒵) has the Feller property.
The assumption that Z is compact may be dispensed with. See Section
12.6 in Stokey, Lucas and Prescott (1989).
Assumption 6.4.4 The correspondence Γ : X × Z → A is nonempty, compact-valued, and continuous.

Assumption 6.4.5 The system transition function φ : X × A × Z × Z → X is continuous.

Assumption 6.4.6 The reward functions u_t : X × Z × A → R, t = 0, 1, ..., T − 1, and u_T : X × Z → R are continuous.
If Z is a countable set, continuity on it is understood in the sense of the discrete topology. We may suppose that β satisfies Assumption 5.1.2, but this is unnecessary in the finite-horizon case.
The following lemma is crucial for solving Bellman equations.
Lemma 6.4.1 Let (X, 𝒳), (A, 𝒜), (Z, 𝒵), Q, and φ satisfy Assumptions 6.4.1, 6.4.2, 6.4.3, and 6.4.5. Then for any continuous function f : X × Z → R, the function defined by

h(x, a, z) = ∫ f(φ(x, a, z, z′), z′) Q(z, dz′)

is also continuous.
Proof. See the proof of Lemmas 9.5 and 9.5’ in Stokey, Lucas and
Prescott (1989).
Theorem 6.4.1 (Existence) Let (X, 𝒳), (A, 𝒜), (Z, 𝒵), Q, Γ, φ, (u_t)_{t=0}^T, and β satisfy Assumptions 6.4.1-6.4.6 and 5.1.2. Then the solution V_t : X × Z → R to the Bellman equation (6.29) is continuous, and the associated policy correspondence G_t : X × Z → A is nonempty, compact-valued, and upper hemicontinuous, for any t = 0, 1, ..., T − 1.
Proof. The proof is by induction. At t = T − 1, since V_T = u_T, the Bellman equation becomes:

V_{T−1}(x_{T−1}, z_{T−1}) = sup_{a ∈ Γ(x_{T−1}, z_{T−1})} u_{T−1}(x_{T−1}, z_{T−1}, a) (6.30)
+ β ∫ u_T(φ(x_{T−1}, a, z_{T−1}, z_T), z_T) Q(z_{T−1}, dz_T).

By Assumptions 6.4.3, 6.4.5, and 6.4.6, it follows from Lemma 6.4.1 that the expression in the second line of the above equation is continuous in (x_{T−1}, a, z_{T−1}). By the assumptions, we can use the Theorem of the Maximum to deduce that V_{T−1} is continuous and the associated policy correspondence G_{T−1} is nonempty, compact-valued, and upper hemicontinuous. Suppose V_{t+1} is continuous. Then applying Lemma 6.4.1 and the Theorem of the Maximum to the problem in (6.29), we deduce that the conclusion of the theorem holds at time t. Thus, it holds for any t = 0, 1, ..., T − 1.
By this theorem and the Measurable Selection Theorem, each G_t permits a measurable selection. We can then apply Theorem 6.3.2 to deduce that the function V_0 delivered by the Bellman equation equals the value function for the Markov decision problem at date zero, and that any plan generated by (G_t) is an optimal plan for this problem.
Now we provide some characterizations of (V_t, G_t). First, we impose assumptions to ensure that the value function is monotone.
Assumption 6.4.7 For each (z, a) ∈ Z × A, u_T(·, z) : X → R is strictly increasing and u_t(·, z, a) : X → R is strictly increasing for t = 0, 1, ..., T − 1.

Assumption 6.4.8 For each z ∈ Z, Γ(·, z) : X → A is increasing in the sense that x ≤ x′ implies Γ(x, z) ⊂ Γ(x′, z).

Assumption 6.4.9 For each (a, z, z′) ∈ A × Z × Z, φ(·, a, z, z′) : X → X is increasing.
Theorem 6.4.2 (Monotonicity) Let (X, 𝒳), (A, 𝒜), (Z, 𝒵), Q, Γ, φ, (u_t)_{t=0}^T, and β satisfy Assumptions 6.4.1-6.4.9 and 5.1.2. Then the solution V_t(·, z) : X → R to the Bellman equation (6.29) is strictly increasing for each z ∈ Z and t = 0, 1, ..., T.
Proof. The proof is by induction. At time T, V_T(·, z) = u_T(·, z) is strictly increasing for each z ∈ Z. Let x′_{T−1} > x_{T−1}. Use the Bellman equation (6.30) at time T − 1 to derive:

V_{T−1}(x_{T−1}, z_{T−1}) = u_{T−1}(x_{T−1}, z_{T−1}, a*) + β ∫ u_T(φ(x_{T−1}, a*, z_{T−1}, z_T), z_T) Q(z_{T−1}, dz_T)
< u_{T−1}(x′_{T−1}, z_{T−1}, a*) + β ∫ u_T(φ(x′_{T−1}, a*, z_{T−1}, z_T), z_T) Q(z_{T−1}, dz_T)
≤ sup_{a ∈ Γ(x′_{T−1}, z_{T−1})} u_{T−1}(x′_{T−1}, z_{T−1}, a) + β ∫ u_T(φ(x′_{T−1}, a, z_{T−1}, z_T), z_T) Q(z_{T−1}, dz_T)
= V_{T−1}(x′_{T−1}, z_{T−1}),

where a* ∈ Γ(x_{T−1}, z_{T−1}) attains the supremum; the existence of a* follows from Theorem 6.4.1. The first inequality follows from Assumptions 6.4.7 and 6.4.9. The second inequality follows from a* ∈ Γ(x′_{T−1}, z_{T−1}), which holds by Assumption 6.4.8. Suppose V_{t+1}(·, z) is strictly increasing. We then apply the Bellman equation (6.29) at time t and follow the above argument to deduce that V_t(·, z) is strictly increasing. This completes the proof.
Second, we impose assumptions to ensure that the value function is con-
cave and the optimal policy correspondence is single-valued.
Assumption 6.4.10 For each z ∈ Z, u_T(·, z) is strictly concave and

u_t(θx + (1 − θ)x′, z, θa + (1 − θ)a′) ≥ θ u_t(x, z, a) + (1 − θ) u_t(x′, z, a′)

for all (x, a), (x′, a′) ∈ X × A, all θ ∈ (0, 1), and t = 0, 1, ..., T − 1. The inequality is strict if x ≠ x′.

Assumption 6.4.11 For each z ∈ Z and all x, x′ ∈ X, a ∈ Γ(x, z) and a′ ∈ Γ(x′, z) imply

θa + (1 − θ)a′ ∈ Γ(θx + (1 − θ)x′, z), all θ ∈ (0, 1).

Assumption 6.4.12 For each (z, z′) ∈ Z × Z, φ(·, ·, z, z′) is concave.
Theorem 6.4.3 (Concavity) Let (X, 𝒳), (A, 𝒜), (Z, 𝒵), Q, Γ, φ, (u_t)_{t=0}^T, and β satisfy Assumptions 6.4.1-6.4.12 and 5.1.2. Let V_t be the solution to the Bellman equation (6.29) and G_t be the associated optimal policy correspondence, t = 0, 1, ..., T − 1. Then, for each z ∈ Z, V_t(·, z) : X → R is strictly concave and G_t(·, ·) : X × Z → A is a single-valued continuous function.
Proof. The proof is by induction. At the terminal time T, V_T(·, z) = u_T(·, z) is strictly concave by Assumption 6.4.10. At T − 1, let x_{T−1} ≠ x′_{T−1} ∈ X and define

x^θ_{T−1} = θx_{T−1} + (1 − θ)x′_{T−1} for θ ∈ (0, 1).

By Assumption 6.4.1, x^θ_{T−1} ∈ X. Let a* ∈ Γ(x_{T−1}, z_{T−1}) and a′ ∈ Γ(x′_{T−1}, z_{T−1}) attain the suprema V_{T−1}(x_{T−1}, z_{T−1}) and V_{T−1}(x′_{T−1}, z_{T−1}), respectively; this is guaranteed by Theorem 6.4.1. By Assumption 6.4.11, a^θ = θa* + (1 − θ)a′ ∈ Γ(x^θ_{T−1}, z_{T−1}). Use the Bellman equation (6.30) to deduce that

V_{T−1}(x^θ_{T−1}, z_{T−1}) ≥ u_{T−1}(x^θ_{T−1}, z_{T−1}, a^θ) + β ∫ u_T(φ(x^θ_{T−1}, a^θ, z_{T−1}, z_T), z_T) Q(z_{T−1}, dz_T)
> θ u_{T−1}(x_{T−1}, z_{T−1}, a*) + (1 − θ) u_{T−1}(x′_{T−1}, z_{T−1}, a′)
+ βθ ∫ u_T(φ(x_{T−1}, a*, z_{T−1}, z_T), z_T) Q(z_{T−1}, dz_T)
+ β(1 − θ) ∫ u_T(φ(x′_{T−1}, a′, z_{T−1}, z_T), z_T) Q(z_{T−1}, dz_T)
= θ V_{T−1}(x_{T−1}, z_{T−1}) + (1 − θ) V_{T−1}(x′_{T−1}, z_{T−1}),

where the second inequality follows from the concavity of u_{T−1}(·, z_{T−1}, ·) and of u_T(φ(·, ·, z_{T−1}, z_T), z_T) by Assumptions 6.4.7 and 6.4.10. Note that we need u_T to be monotonic for the composition of concave functions to be concave. This establishes strict concavity of V_{T−1}(·, z_{T−1}). Suppose V_{t+1}(·, z) is strictly concave. Then by Theorem 6.4.2, V_{t+1}(·, z) is strictly increasing. Applying the Bellman equation (6.29) at date t and the above argument, we deduce that V_t(·, z) is strictly concave. Because V_t(·, z) is strictly concave for all t = 0, 1, ..., T, the optimal policy correspondence is a single-valued continuous function. This completes the proof.
If the assumptions of the theorem hold, then we write the optimal policy function as g_t : X × Z → A. This policy function generates an optimal plan (a_0, a_1, ..., a_{T−1}) and optimal system dynamics (x_0, x_1, ..., x_T) recursively as follows:

a_t(z^t) = g_t(x_t, z_t),
x_{t+1}(z^{t+1}) = φ(x_t, a_t, z_t, z_{t+1}), (x_0, z_0) given,

where z^t ∈ Z^t.
Third, we impose assumptions to ensure differentiability of the value
function. For any set B in some Euclidean space we use the notation int(B)
to denote the interior of B.
Assumption 6.4.13 For each z ∈ Z, u_T(·, z) is differentiable on the interior of X, and u_t(·, z, ·) is continuously differentiable on the interior of X × A, for t = 0, 1, ..., T − 1.

Assumption 6.4.14 For each (z, z′) ∈ Z × Z, φ(·, ·, z, z′) is differentiable on the interior of X × A.
Theorem 6.4.4 (Differentiability) Let (X, 𝒳), (A, 𝒜), (Z, 𝒵), Q, Γ, φ, (u_t)_{t=0}^T, and β satisfy Assumptions 6.4.1-6.4.14 and 5.1.2. Let x_t ∈ int X and a_t = g_t(x_t, z_t) ∈ int Γ(x_t, z_t) for each z_t ∈ Z. Then V_t(·, z_t) is continuously differentiable on int X and

∂V_t(x_t, z_t)/∂x = ∂u_t(x_t, z_t, a_t)/∂x + β E_t [ ∂V_{t+1}(x_{t+1}, z_{t+1})/∂x · ∂φ(x_t, a_t, z_t, z_{t+1})/∂x ], (6.31)

for t = 1, 2, ..., T − 1, and

∂V_T(x_T, z_T)/∂x = ∂u_T(x_T, z_T)/∂x, (6.32)

where E_t is the expectation operator conditioned on (x_t, z_t) and x_{t+1} satisfies (5.14) with a_t = g_t(x_t, z_t) for t = 0, ..., T − 1.
Proof. The proof is by induction. At time T, V_T(x_T, z_T) = u_T(x_T, z_T). Let x_{T−1} ∈ int X and a_{T−1} = g_{T−1}(x_{T−1}, z_{T−1}) ∈ int Γ(x_{T−1}, z_{T−1}). By Assumption 6.4.4, there is some open neighborhood D of x_{T−1} such that g_{T−1}(x_{T−1}, z_{T−1}) ∈ int Γ(x, z_{T−1}) for all x ∈ D. We define a function W_{T−1} : D → R by:

W_{T−1}(x) = u_{T−1}(x, z_{T−1}, g_{T−1}(x_{T−1}, z_{T−1})) + β ∫ u_T(φ(x, g_{T−1}(x_{T−1}, z_{T−1}), z_{T−1}, z_T), z_T) Q(z_{T−1}, dz_T).

Then, since g_{T−1}(x_{T−1}, z_{T−1}) ∈ int Γ(x, z_{T−1}), it follows from (6.30) that W_{T−1}(x) ≤ V_{T−1}(x, z_{T−1}), with equality at x = x_{T−1}. By Assumptions 6.4.13 and 6.4.14, W_{T−1} is differentiable. By the Benveniste and Scheinkman Theorem, V_{T−1}(·, z_{T−1}) is differentiable at x_{T−1} and

∂V_{T−1}(x_{T−1}, z_{T−1})/∂x = ∂W_{T−1}(x_{T−1})/∂x.

We then obtain equation (6.31). Suppose V_{t+1}(·, z_{t+1}) is continuously differentiable. We use the Bellman equation (6.29) at time t to construct a function W_t(x) in the same way and use the same argument to conclude that V_t(·, z_t) is continuously differentiable and that its derivative at x_t satisfies equation (6.31). This completes the proof.
Equation (6.31) is often referred to as the envelope condition. Note that we have not used the assumption that φ(x, a, z, z′) is differentiable in a to derive this condition. We shall use that assumption to derive first-order conditions for the Bellman equation (6.29). By Theorem 6.4.4 and Assumptions 6.4.13 and 6.4.14, differentiating with respect to a_t yields:

0 = ∂u_t(x_t, z_t, a_t)/∂a + β E_t [ ∂V_{t+1}(x_{t+1}, z_{t+1})/∂x · ∂φ(x_t, a_t, z_t, z_{t+1})/∂a ], (6.33)

for t = 0, 1, ..., T − 1. Combining this with the envelope conditions (6.31)-(6.32) yields all the necessary conditions for an optimal plan. At first glance, these conditions are complicated because of the presence of the value function in each period. In the next section, we will show that solving for value functions is unnecessary.
In the special case in which φ(x, a, z, z′) = a, we can simplify the solution significantly. We do not need V_t to be monotonic to establish concavity or differentiability, so Assumptions 6.4.7-6.4.9 can be dispensed with. The first-order conditions become:

0 = ∂u_t(x_t, z_t, x_{t+1})/∂a + β E_t ∂u_{t+1}(x_{t+1}, z_{t+1}, x_{t+2})/∂x,

for t = 0, 1, ..., T − 2, and

0 = ∂u_{T−1}(x_{T−1}, z_{T−1}, x_T)/∂a + β E_{T−1} ∂u_T(x_T, z_T)/∂x,

for t = T − 1. These equations are often called Euler equations. They are recursive and can be solved by backward induction to obtain a solution of the form x_{t+1} = g_t(x_t, z_t), t = 0, 1, ..., T − 1.
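As a hypothetical illustration (my own example, not one from the text), take the deterministic cake-eating case φ(x, a, z, z′) = a with u_t(x, z, a) = log(x − a) for t < T and u_T(x, z) = log x. The Euler equations reduce to c_{t+1} = β c_t with terminal condition x_T = β c_{T−1}; writing the policy as x_{t+1} = s_t x_t, backward induction on the saving rate gives s_t = β/(β + 1 − s_{t+1}) with s_T = 0:

```python
def saving_rates(beta, T):
    """Backward induction on the cake-eating Euler equations:
    s_t = beta / (beta + 1 - s_{t+1}), with terminal condition s_T = 0."""
    s = [0.0] * (T + 1)                  # s[T] = 0
    for t in range(T - 1, -1, -1):
        s[t] = beta / (beta + 1.0 - s[t + 1])
    return s[:-1]                        # s_0, ..., s_{T-1}

beta, T = 0.9, 5
s = saving_rates(beta, T)
# Known closed form: s_t = beta (1 - beta^(T-t)) / (1 - beta^(T-t+1))
closed = [beta * (1 - beta ** (T - t)) / (1 - beta ** (T - t + 1)) for t in range(T)]
```

The recursion reproduces the closed form above, so initial consumption is c_0 = (1 − s_0) x_0 = x_0 (1 − β)/(1 − β^{T+1}).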
6.5 The Maximum Principle
Pontryagin's Maximum Principle was originally proposed for continuous-time optimal control problems by Pontryagin et al. (1962). The principle is closely related to the Lagrange multiplier method and also applies to discrete-time optimal control problems. We now present it for the control problem (6.28) studied in Section 6.4. In that section, we used the dynamic programming method; here we shall study the relation between dynamic programming and the Maximum Principle.
To fix ideas, we simply assume that X = A = R and Γ(x, z) = A for all (x, z) ∈ X × Z. The analysis can easily be extended to the general multidimensional case. We write the Lagrangian form for problem (6.28) as:

L = E Σ_{t=0}^{T−1} [ β^t u_t(x_t, z_t, a_t) − β^{t+1} μ_{t+1}(x_{t+1} − φ(x_t, a_t, z_t, z_{t+1})) ] + E β^T u_T(x_T, z_T),

where β^{t+1} μ_{t+1} is the discounted Lagrange multiplier associated with (5.14). Under Assumptions 6.4.13 and 6.4.14, we can derive the first-order conditions for an interior solution. Specifically, differentiating with respect to a_t yields:

∂u_t(x_t, z_t, a_t)/∂a + β E_t μ_{t+1} ∂φ(x_t, a_t, z_t, z_{t+1})/∂a = 0, (6.34)
for t = 0, 1, ..., T − 1. Differentiating with respect to x_t yields:

μ_t = ∂u_t(x_t, z_t, a_t)/∂x + β E_t μ_{t+1} ∂φ(x_t, a_t, z_t, z_{t+1})/∂x, (6.35)

for t = 1, 2, ..., T − 1, and

μ_T = ∂u_T(x_T, z_T)/∂x. (6.36)
Define the Hamiltonian function as:

H_t(x_t, a_t, z_t, μ_{t+1}) = u_t(x_t, z_t, a_t) + β E_t μ_{t+1} φ(x_t, a_t, z_t, z_{t+1}), (6.37)

and express the preceding first-order conditions as:

∂H_t(x_t, a_t, z_t, μ_{t+1})/∂a = 0, t = 0, 1, ..., T − 1, (6.38)

∂H_t(x_t, a_t, z_t, μ_{t+1})/∂x = μ_t, t = 1, 2, ..., T − 1. (6.39)

Combining these with the state transition equation (5.14) yields a system of 3T equations in the 3T variables (a_t, x_{t+1}, μ_{t+1})_{t=0}^{T−1}. This system gives the necessary conditions for optimality and can be solved recursively by backward induction. This result is called the Maximum Principle. Given suitable
concavity conditions, the system of 3T equations is also sufficient for opti-
mality. Here we state a result in the one-dimensional case. It can be easily
generalized to the multi-dimensional case.
Theorem 6.5.1 (Sufficiency of the necessary conditions from the Maximum Principle) Let X = A = R, Γ(x, z) = A for all (x, z) ∈ X × Z, and Assumptions 5.1.2, 6.4.3, 6.4.13, and 6.4.14 hold. Let (x*_{t+1}, a*_t, μ*_{t+1})_{t=0}^{T−1} satisfy the system of 3T equations (5.14), (6.36), (6.38), and (6.39). Suppose that H_t(·, ·, z_t, μ*_{t+1}) is concave in (x, a) for each z_t ∈ Z. Then (a*_t)_{t=0}^{T−1} is an optimal policy for problem (6.28).
Proof. We wish to show that

D = E Σ_{t=0}^{T−1} β^t u_t(x*_t, z_t, a*_t) − E Σ_{t=0}^{T−1} β^t u_t(x_t, z_t, a_t) + E β^T u_T(x*_T, z_T) − E β^T u_T(x_T, z_T) ≥ 0,

for any feasible plan (a_t)_{t=0}^{T−1} and the associated state process (x_t)_{t=1}^T satisfying (5.14). Since both (x_{t+1}, a_t)_{t=0}^{T−1} and (x*_{t+1}, a*_t)_{t=0}^{T−1} satisfy the system transition equation, we deduce that:

D = E Σ_{t=0}^{T−1} [ β^t u_t(x*_t, z_t, a*_t) − β^{t+1} μ*_{t+1}(x*_{t+1} − φ(x*_t, a*_t, z_t, z_{t+1})) ]
− E Σ_{t=0}^{T−1} [ β^t u_t(x_t, z_t, a_t) − β^{t+1} μ*_{t+1}(x_{t+1} − φ(x_t, a_t, z_t, z_{t+1})) ]
+ E β^T u_T(x*_T, z_T) − E β^T u_T(x_T, z_T).

By the assumptions, we obtain

D = E Σ_{t=0}^{T−1} β^t H_t(x*_t, a*_t, z_t, μ*_{t+1}) − E Σ_{t=0}^{T−1} β^t H_t(x_t, a_t, z_t, μ*_{t+1})
− E Σ_{t=0}^{T−1} β^{t+1} μ*_{t+1}(x*_{t+1} − x_{t+1}) + E β^T u_T(x*_T, z_T) − E β^T u_T(x_T, z_T)

≥ E Σ_{t=0}^{T−1} β^t [ ∂H_t(x*_t, a*_t, z_t, μ*_{t+1})/∂x ] (x*_t − x_t)
+ E Σ_{t=0}^{T−1} β^t [ ∂H_t(x*_t, a*_t, z_t, μ*_{t+1})/∂a ] (a*_t − a_t)
− E Σ_{t=0}^{T−1} β^{t+1} μ*_{t+1}(x*_{t+1} − x_{t+1}) + E β^T [ ∂u_T(x*_T, z_T)/∂x ] (x*_T − x_T).

Rearranging terms and using (6.36), (6.38), (6.39), and x*_0 = x_0, we find that the right-hand side of the above inequality is equal to zero. Thus, D ≥ 0.
A set of sufficient conditions for H_t(·, ·, z_t, μ*_{t+1}) to be concave is the following: u_t is concave in (x, a) for t = 0, 1, ..., T − 1, u_T is concave in x, φ is concave in (x, a), and μ*_{t+1} ≥ 0 for t = 0, 1, ..., T − 1. If φ is linear in (x, a), then the last assumption on μ*_{t+1} is not needed.
What is the relation between the Maximum Principle and dynamic programming? Let

μ_t = ∂V_t(x_t, z_t)/∂x_t, t = 1, 2, ..., T,

where V_t is the value function defined in (6.29). The Lagrange multiplier μ_t is interpreted as the shadow value of the state. Then we can immediately see that equation (6.34) or (6.38) is equivalent to equation (6.33), and that equation (6.35) or (6.39) is equivalent to equation (6.31). The beauty of the
Maximum Principle is that we do not need to study value functions directly, which are complicated objects, as shown in the previous section. Instead, the Lagrange multipliers are the objects to solve for. In addition, this approach can easily handle additional intratemporal constraints on states or actions: one only needs to introduce additional Lagrange multipliers and then apply the Kuhn-Tucker Theorem to derive first-order conditions.
In many economic problems, the system transition equation takes the following form:

x_{t+1} = φ(x_t, a_t, z_t),

where x_{t+1} is nonstochastic conditional on the current state and action. In this case, we may replace the Lagrange multiplier β^{t+1} μ_{t+1} with β^t λ_t in the Lagrangian form. The first-order conditions become:

∂u_t(x_t, z_t, a_t)/∂a + λ_t ∂φ(x_t, a_t, z_t)/∂a = 0, (6.40)

for t = 0, 1, ..., T − 1, and

λ_t = β E_t ∂u_{t+1}(x_{t+1}, z_{t+1}, a_{t+1})/∂x + β E_t λ_{t+1} ∂φ(x_{t+1}, a_{t+1}, z_{t+1})/∂x,

for t = 0, 1, ..., T − 2, and

λ_{T−1} = β E_{T−1} [ ∂u_T(x_T, z_T)/∂x ].
6.6 Applications
In this section, we analyze two Markov decision problems using the method
of dynamic programming. The first problem is a classic optimal stopping
problem and the second is a classic optimal control problem.
6.6.1 The Secretary Problem
Consider the secretary problem presented in Section 5.2.2. The time horizon is T = N. Suppose N ≥ 3. The state space is {0, 1}. Let the value V_t(1) be the maximum probability of choosing the best candidate when the current candidate has the highest relative rank among the first t interviewed, and let V_t(0) be the maximum probability of choosing the best candidate when the current candidate does not have the highest relative rank among the first t interviewed. At the terminal time,

V_N(1) = h(1) = 1, V_N(0) = h(0) = 0.
We write the Bellman equations as:

V_t(1) = max{ g_t(1), f_t(1) + Q_{t+1}(1, 1) V_{t+1}(1) + Q_{t+1}(1, 0) V_{t+1}(0) }
= max{ t/N, V_{t+1}(1)/(t + 1) + t V_{t+1}(0)/(t + 1) },

and

V_t(0) = max{ g_t(0), f_t(0) + Q_{t+1}(0, 1) V_{t+1}(1) + Q_{t+1}(0, 0) V_{t+1}(0) }
= max{ 0, V_{t+1}(1)/(t + 1) + t V_{t+1}(0)/(t + 1) }.

Since V_t ≥ 0, we deduce that

V_t(0) = V_{t+1}(1)/(t + 1) + t V_{t+1}(0)/(t + 1), (6.41)

V_t(1) = max{ t/N, V_t(0) }. (6.42)
The following proposition characterizes an optimal policy.

Proposition 6.6.1 It is optimal to interview τ candidates and subsequently choose the first candidate who is better than all of those τ candidates, where τ is given by:

τ = max{ t ≥ 1 : 1/t + 1/(t + 1) + ... + 1/(N − 1) > 1 }. (6.43)
Proof. The proof consists of three steps.

Step 1. Suppose τ ≥ 1 is the largest number such that V_τ(1) = V_τ(0) ≥ τ/N. We use the Bellman equation (6.41) to show that

V_{τ−1}(0) = V_τ(1)/τ + (τ − 1) V_τ(0)/τ = V_τ(0) ≥ τ/N.

Thus, the Bellman equation (6.42) implies that:

V_{τ−1}(1) = max{ (τ − 1)/N, V_{τ−1}(0) } ≥ τ/N > (τ − 1)/N,

which implies that it is optimal to continue. By backward induction, we conclude that V_t(1) = V_t(0) > t/N for all t < τ.

Step 2. Clearly, τ < N. We shall show that τ = 0 is impossible. If τ = 0, then for all t ≥ 1, V_t(1) = t/N. Equation (6.41) implies that

V_t(0) = (1/(t + 1)) · (t + 1)/N + t V_{t+1}(0)/(t + 1) = 1/N + t V_{t+1}(0)/(t + 1).

Note that V_N(0) = 0. Solving this equation backward yields

V_t(0) = (t/N)(1/t + 1/(t + 1) + ... + 1/(N − 1)), 1 ≤ t < N.

This implies that

V_1(0) > 1/N = V_1(1) ≥ V_1(0),

which is a contradiction.

Step 3. By Steps 1 and 2, we deduce that there exists 1 ≤ τ < N such that

V_1(0) = V_1(1) = ... = V_τ(0) = V_τ(1) ≥ τ/N,

V_t(1) = t/N for τ < t ≤ N,

and

V_t(0) = (t/N)(1/t + 1/(t + 1) + ... + 1/(N − 1)), t > τ.

Thus, it is optimal to continue to interview candidates as long as V_t(0) > t/N, so τ is given by (6.43).
We may write τ as a function of N. One can check that τ(4) = 1: with four candidates, the employer should first interview one candidate and then choose the next candidate who is better than the first one. For N sufficiently large,

1 ≈ 1/τ(N) + 1/(τ(N) + 1) + ... + 1/(N − 1) ≈ ∫_{τ(N)}^N (1/x) dx.

Thus,

ln(N/τ(N)) ≈ 1 ⟹ τ(N)/N ≈ 1/e ≈ 0.368.

This implies that with a large number of candidates, the employer should first interview about 36.8% of the candidates and then choose the first candidate who is better than all of those candidates.
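Both the formula (6.43) and the backward induction on (6.41)-(6.42) are easy to check numerically (a sketch; the function names are mine):

```python
def tau_formula(N):
    """Largest t >= 1 with 1/t + 1/(t+1) + ... + 1/(N-1) > 1, as in (6.43)."""
    tail = 0.0
    for t in range(N - 1, 0, -1):
        tail += 1.0 / t
        if tail > 1.0:
            return t          # first hit while descending = largest such t
    return 0                  # cannot occur for N >= 3 (Step 2 of the proof)

def tau_dp(N):
    """Backward induction on (6.41)-(6.42); returns the largest t with V_t(0) >= t/N."""
    V1, V0 = 1.0, 0.0         # terminal values V_N(1) = 1, V_N(0) = 0
    tau = 0
    for t in range(N - 1, 0, -1):
        V0 = V1 / (t + 1) + t * V0 / (t + 1)   # (6.41)
        V1 = max(t / N, V0)                    # (6.42)
        if tau == 0 and V0 >= t / N:
            tau = t           # loop descends, so the first hit is the largest t
    return tau
```

Both give τ(4) = 1 and τ(1000) = 368, so τ(N)/N indeed approaches 1/e ≈ 0.368.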
6.6.2 A Consumption-Saving Problem
We consider a classical consumption-saving problem with uncertain labor
income. A consumer is initially endowed with some savings. Each period
he receives uncertain labor income. He then decides how much to consume
and how much to save in order to maximize his discounted expected utility
over a finite horizon.
We formulate this problem as the following Markov decision model:

max E[ Σ_{t=0}^T β^t u(c_t) ],

subject to

c_t + a_{t+1} = R a_t + y_t, a_0 ≥ 0 given,

where R > 0 is the gross interest rate and c_t, a_t, and y_t represent consumption, savings, and labor income, respectively. Assume u″ < 0, u′ > 0, and lim_{c→0} u′(c) = ∞. Assume each y_t is independently and identically drawn from a distribution over [y_min, y_max]. Note that there is no reward from the terminal state a_{T+1}. We need to impose a constraint on a_{T+1}; otherwise, the consumer could borrow and consume an infinite amount. Thus, we may impose a borrowing constraint of the form a_{T+1} ≥ −b_T, where b_T ≥ 0 represents a borrowing limit. In general, there may be borrowing constraints in each period because of financial market frictions:

a_{t+1} ≥ −b_t, for some b_t ≥ 0, t = 0, 1, ..., T,

where the borrowing limit b_t may be time and state dependent. Here we simply take an ad hoc constant limit: b_t = b ≥ 0.
Let â_t = a_t + b and define the cash on hand as x_t = R a_t + y_t + b, which is the maximum amount that can be spent on consumption. Letting r = R − 1, we may rewrite the budget and borrowing constraints as:

x_{t+1} = R(x_t − c_t) + y_{t+1} − rb,
c_t + â_{t+1} = x_t, â_{t+1} ≥ 0.

By the Principle of Optimality, we consider the following dynamic programming problem:

V_t(x_t) = max_{0 ≤ â_{t+1} ≤ x_t} u(x_t − â_{t+1}) + β E V_{t+1}(R â_{t+1} + y_{t+1} − rb),

for t = 0, 1, ..., T − 1, where the expectation is taken with respect to y_{t+1}. Clearly, in the last period c_T = x_T, and hence V_T(x_T) = u(x_T). Note that this change of variables reduces the dimension of the state space: we need only the single state variable x_t instead of the two state variables a_t and y_t. This method typically works for models with IID shocks, but not for models with Markov shocks.
It is useful to define n = T − t, which is interpreted as an index for
“time-to-go” following the terminology in Whittle (1982). We then rewrite
the preceding Bellman equation as:

v_0 (x) = u (x),

v_n (x) = max_{s ∈ [0, x]}  u (x − s) + β E v_{n−1} (Rs + y − rb),   n = 1, ..., T.
We can apply the theory developed in Section 6.4 to the above problem. In
particular, we deduce that v_n inherits the basic properties of u; i.e., v_n is
strictly increasing, strictly concave, and continuously differentiable on R_{++}.
In addition, there exist optimal consumption and saving policy functions
g_n (x) and s_n (x) = x − g_n (x), respectively. The following properties can be
proved:
• v′_n (x) is increasing in n.

• g_0 (x) = x and s_0 (x) = 0.

• g_n (x) and s_n (x) are continuous and increasing in x.

• g_n (x) is decreasing in n and s_n (x) is increasing in n.

• g_n (x) and s_n (x) satisfy the first-order and envelope conditions:

v′_n (x) = u′ (g_n (x)) ≥ β R E [ v′_{n−1} (R s_n (x) + y − rb) ],

with equality if s_n (x) > 0, for n = 1, ..., T.
Relating back to the original problem in calendar time, the consumption
and saving functions at time t are c_t (x) = g_{T−t} (x) and â_{t+1} (x) =
s_{T−t} (x), respectively.
To compare with the deterministic case, let c_t (x; y_min) denote the optimal
consumption policy when y_t = y_min for all t = 0, 1, ..., T. A similar
interpretation applies to the notation c_t (x; y_max). We then have the following
result: c_t (x; y_max) ≥ c_t (x) ≥ c_t (x; y_min).
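The time-to-go recursion for v_n can be computed by backward induction on a grid of cash-on-hand values. The following sketch uses illustrative choices that are not from the text: log utility, a two-point IID income distribution, b = 0, and linear interpolation of v_{n−1} between grid points.

```python
import numpy as np

# Illustrative parameters and grids -- not taken from the text.
beta, R, b = 0.95, 1.03, 0.0
r = R - 1.0
incomes = np.array([0.5, 1.5])       # IID labor income, each with prob. 1/2
probs = np.array([0.5, 0.5])

def u(c):                            # log utility: u' > 0, u'' < 0
    return np.log(c)

grid = np.linspace(0.05, 10.0, 200)  # cash-on-hand grid, x > 0

v = u(grid)                          # time-to-go n = 0: consume everything
policies = []                        # consumption policy g_n on the grid
T = 5
for n in range(1, T + 1):
    v_new = np.empty_like(grid)
    g = np.empty_like(grid)
    for i, x in enumerate(grid):
        s = np.linspace(0.0, x - 1e-6, 120)   # candidate savings s in [0, x)
        # E v_{n-1}(R s + y - r b); np.interp extrapolates flat off the grid
        ev = sum(p * np.interp(R * s + y - r * b, grid, v)
                 for p, y in zip(probs, incomes))
        vals = u(x - s) + beta * ev
        j = int(np.argmax(vals))
        v_new[i], g[i] = vals[j], x - s[j]
    v = v_new
    policies.append(g)

print(policies[0][:3], policies[-1][:3])  # g_1 vs g_T at low cash on hand
```

Consistent with the listed properties, the computed g_n is (up to grid noise) increasing in x and decreasing in the time-to-go n.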
6.7 Exercises
1. Establish Theorems 6.3.1-6.3.3 for the deterministic case by suitably
modifying the assumptions. Compare to the stochastic case and dis-
cuss how the assumptions and proofs can be simplified without mea-
surability restrictions.
2. Consider the example analyzed in Section 6.1. Suppose there is a
multiplicative productivity shock zt to the production function, where
(zt) follows a Markov process with transition function Q. Solve the
optimal consumption and investment policies.
3. Solve the secretary problem, in which the objective is to maximize the
probability of choosing one of two best candidates, i.e., a candidate
with rank 1 or 2.
4. Solve the secretary problem, in which the objective is to maximize the
expected rank of the candidate selected (note that 1 is the highest
rank).
5. Prove the properties of the consumption and saving policy functions
stated in Section 6.6.2.
6. Formulate and analyze the stochastic consumption-saving problem in
Section 6.6.2 when labor income follows a general Markov process.
7. (Phelps (1962)) Consider the stochastic consumption-saving problem
analyzed in Section 6.6.2. Suppose there is no borrowing in that b = 0.
Suppose that labor income yt = y is constant over time, but the inter-
est rate is uncertain. The gross interest rate between time t and t+ 1
is Rt+1, which is drawn independently and identically from some dis-
tribution. Analyze the properties of consumption and saving policies.
8. Consider the following deterministic consumption-saving problem:
max Σ_{t=0}^{T} β^t c_t^{1−γ} / (1 − γ),   γ > 0,
subject to
ct + at+1 = Rat + yt, a0 given,
where yt+1 = (1 + g) yt. There is no borrowing constraint. Explicitly
solve this problem.
9. (Schechtman and Escudero (1977)) Consider the following stochastic
consumption-saving problem:
max E [ Σ_{t=0}^{T} β^t ( −e^{−γ c_t} ) ],   γ > 0,
subject to
ct + at+1 = Rat + yt, a0 given,
where yt is independently and identically distributed on [ymin, ymax] .
Explicitly solve the optimal consumption and saving policies and the
value function. Note that consumption can be negative for exponential
utility.
10. (Levhari, Mirman, and Zilcha (1980)) Suppose we impose the con-
straints ct ≥ 0 and at+1 ≥ 0 in the above problem. Solve the optimal
consumption and saving policies.
Chapter 7
Infinite-Horizon Dynamic Programming
In this chapter, we shall study the infinite-horizon Markov decision process
model introduced in Chapter 5. The primitives of this model are (S, S), (A, A), Γ, P, u, and β. We shall maintain Assumptions 5.1.1 and 5.1.4.
Unlike the finite-horizon case, the infinite-horizon model has a stationary
structure in that both the one-period rewards and the stochastic kernels
for the state process are time homogeneous. Intuitively, we may view the
infinite-horizon model as the limit of the finite-horizon model as the time
horizon goes to infinity. As a result, the theory developed in the previous
chapter can be extended to the infinite-horizon case. In Section 7.1, we
formalize the Principle of Optimality and derive the Bellman equation for
the infinite-horizon case. The difficulty of the infinite-horizon case is that
there is no general theory to guarantee the existence of a solution to the
Bellman equation. For bounded rewards, we can use the powerful Contrac-
tion Mapping Theorem to deal with this issue. Section 7.2 studies models
with bounded rewards. Section 7.3 analyzes the optimal control model with
bounded and unbounded rewards. In Section 7.4, we derive first-order con-
ditions and transversality conditions.
7.1 The Principle of Optimality
In this section, we will prove the infinite-horizon Principle of Optimality by
establishing three theorems, analogous to Theorems 6.3.1-6.3.3. Recall that
the value function V : S → R ∪ {±∞} is defined by:

V (s) = sup_{π ∈ Π(s)} J (s, π),     (7.1)
where Π (s) is the set of feasible plans from initial state s ∈ S and J (s, π)
is the total discounted expected rewards from π defined in (5.4). Under
Assumptions 5.1.1 and 5.1.4, this is a well defined problem. We also make
the following stronger assumption in order to establish that J has a recursive
structure. This structure is important for dynamic programming.
Assumption 7.1.1 If u takes on both signs, there is a collection of nonnegative,
measurable functions L_t : S → R_+, t = 0, 1, ..., such that for all
π ∈ Π(s_0) and all s_0 ∈ S,

|u (s_0, π_0)| ≤ L_0 (s_0),   |u (s_t, π_t (s^t))| ≤ L_t (s_0),

and

Σ_{t=0}^{∞} β^t L_t (s_0) < ∞.
Lemma 7.1.1 Let (S, S), (A, A), Γ, P, u, and β be given. Suppose Assumptions
5.1.1, 5.1.2, 5.1.4, and 7.1.1 hold. Then for any s_0 ∈ S and any
π ∈ Π(s_0),

J (s_0, π) = u (s_0, π_0) + β ∫ J (s_1, C (π, s_1)) P (s_0, π_0; ds_1),

where for each s_1 ∈ S, C (π, s_1) is the continuation plan of π following s_1.
Proof. By definition,

J (s_0, π) = u (s_0, π_0) + β lim_{n→∞} Σ_{t=1}^{n} β^{t−1} ∫ u (s_t, π_t (s^t)) μ^t (s_0, ds^t; π).
We compute the second term on the right-hand side of the above equation:
lim_{n→∞} Σ_{t=1}^{n} β^{t−1} ∫ u (s_t, π_t (s^t)) μ^t (s_0, ds^t; π)

= lim_{n→∞} ∫ { u (s_1, π_1 (s_1)) + Σ_{t=2}^{n} β^{t−1} ∫ u (s_t, π_t (s^t)) μ^{t−1} (s_1, ds^t_2; C (π, s_1)) } P (s_0, π_0; ds_1)

= ∫ lim_{n→∞} { u (s_1, π_1 (s_1)) + Σ_{t=2}^{n} β^{t−1} ∫ u (s_t, π_t (s^t)) μ^{t−1} (s_1, ds^t_2; C (π, s_1)) } P (s_0, π_0; ds_1)

= ∫ J (s_1, C (π, s_1)) P (s_0, π_0; ds_1),
where the first equality follows from the definition of μt in (5.2), the second
from the Monotone Convergence Theorem if u ≥ 0 or u ≤ 0, and the Dom-
inated Convergence Theorem by Assumption 7.1.1 if u takes on both signs,
and the last from the definition of J.
The following assumption is analogous to Assumption 6.3.1.
Assumption 7.1.2 (a) The value function V : S → R is measurable. (b)
Given any s ∈ S and any ε > 0, there exists a plan from s, π^s ∈ Π(s), such
that

J (s, π^s) + ε ≥ V (s).

In addition, as functions of s, π^s_0 and π^s_t (s^t_1) for any fixed s^t_1 ∈ S^t are
measurable, t = 1, 2, ....
A plan satisfying the measurability restriction in the above assumption
is called a global plan in Stokey, Lucas and Prescott (1989).
Theorem 7.1.1 (Dynamic Programming Principle) Let (S, S), (A, A), Γ, P, u, and β be given. Suppose Assumptions 5.1.1, 5.1.2, 5.1.4, 7.1.1, and
7.1.2 hold. Then V satisfies the Bellman equation:
V (s) = sup_{a ∈ Γ(s)}  u (s, a) + β ∫ V (s′) P (s, a; ds′).     (7.2)
Proof. The proof consists of two steps. First, for any plan π ∈ Π(s),

W (s) ≡ sup_{a ∈ Γ(s)}  u (s, a) + β ∫ V (s′) P (s, a; ds′)

≥ sup_{a ∈ Γ(s)}  u (s, a) + β ∫ J (s′, C (π, s′)) P (s, a; ds′)

≥ u (s, π_0) + β ∫ J (s′, C (π, s′)) P (s, π_0; ds′) = J (s, π).
Thus, W (s) ≥ V (s) .
Second, by Assumption 7.1.2, given any s ∈ S and any ε > 0, there
exists a plan from s, π^s ∈ Π(s), such that

J (s, π^s) + ε ≥ V (s).     (7.3)

In addition, as functions of s, π^s_0 and π^s_t (s^t_1) for any fixed s^t_1 ∈ S^t are
measurable, t = 1, 2, .... For any a ∈ Γ (s), define the plan d = (a, d_1, d_2, ...) where
C (d, s_1) = π^{s_1} for any s_1 ∈ S. That is, d_1 (s_1) = π^{s_1}_0 and d_t (s_1, ..., s_t) =
π^{s_1}_{t−1} (s_2, ..., s_t), t = 1, 2, .... Assumption 7.1.2 implies that d ∈ Π(s). Thus,
V (s) ≥ J (s, d)

= u (s, a) + β ∫ J (s′, π^{s′}) P (s, a; ds′)

≥ u (s, a) + β ∫ V (s′) P (s, a; ds′) − βε,
where the second line follows from Lemma 7.1.1 and the last line from (7.3).
Taking supremum over a ∈ Γ (s) and letting ε→ 0, we obtain V (s) ≥W (s) .
Combining the above two steps completes the proof.
Unlike in the finite-horizon case, the Bellman equation in the infinite-
horizon case gives a nontrivial fixed point problem: To solve the problem
on the right-hand side of (7.2), one needs to know the function V, which is
also equal to the optimized value of this problem. We call this problem a
dynamic programming problem. If there is a solution a ∈ Γ (s) that
attains the supremum in (7.2), we denote the set of all solutions as G (s) .
This defines an optimal policy correspondence G : S → A. If G permits a
sequence of measurable selections (g_t)_{t=0}^{∞} (some selections may be identical),
then we can define a plan π generated by G as follows:

π_0 = g_0 (s_0),   π_t (s^t) = g_t (s_t),   s^t ∈ S^t.
This plan is feasible from s0 and is Markovian. If gt is independent of t,
then π is a stationary plan.
The following result demonstrates that under some conditions the solu-
tion to the Bellman equation (7.2) gives the value function for the Markov
decision problem (7.1). In addition, any plan generated by the optimal pol-
icy correspondence from the dynamic programming problem is optimal for
the Markov decision problem.
Theorem 7.1.2 (Verification) Let (S,S) , (A,A), Γ, P, u, and β be given.
Suppose Assumptions 5.1.1, 5.1.2, 5.1.4 and 7.1.1 hold. Suppose that V ∗ :
S→ R is a measurable function satisfying the Bellman equation (7.2) and
lim_{n→∞} β^n ∫ V* (s_n) μ^n (s_0, ds_n; π) = 0,     (7.4)
for all π ∈ Π(s0) and all s0 ∈ S, where μt (s0, · ;π) is defined in (5.2). Let
G be the optimal policy correspondence associated with (7.2) and suppose
that G is nonempty and permits a measurable selection. Then V = V ∗ and
any plan π∗ generated by G attains the supremum in (7.1).
Proof. Given any s0 ∈ S, for any feasible plan π ∈ Π(s0) , use the
Bellman equation to derive:
V* (s_t) ≥ u (s_t, π_t (s^t)) + β ∫ V* (s_{t+1}) P (s_t, π_t (s^t); ds_{t+1}),     (7.5)
for t = 0, 1, ..., n. Multiplying by β^t and rearranging yield:

β^t V* (s_t) − β^{t+1} ∫ V* (s_{t+1}) P (s_t, π_t (s^t); ds_{t+1}) ≥ β^t u (s_t, π_t (s^t)).

Taking expectations with respect to μ^t (s_0, ds^t; π) and summing over t = 0, 1, ..., n − 1,
we obtain:

V* (s_0) − β^n ∫ V* (s_n) μ^n (s_0, ds_n; π) ≥ Σ_{t=0}^{n−1} β^t ∫ u (s_t, π_t (s^t)) μ^t (s_0, ds^t; π).
Taking limits as n→∞ and using (7.4), we deduce
V ∗ (s0) ≥ J (s0, π) .
Since this is true for all π ∈ Π(s_0), V* (s_0) ≥ V (s_0). Equality is
attained when we replace π by π*: because π* is generated by G, (7.5)
holds with equality for all t at π = π*.
Condition (7.4) is often called a transversality condition. It is trivially
satisfied if V* is bounded. To illustrate its role when V* is unbounded,
we consider the following deterministic example:
sup_{(c_t, s_{t+1})_{t=0}^{∞}}  Σ_{t=0}^{∞} β^t c_t,   β = 1/R ∈ (0, 1),

subject to

c_t + s_{t+1} = R s_t,   c_t ≥ 0,   s_0 > 0 given.
Clearly, the consumer can borrow an infinite amount to approach an in-
finitely large value of discounted utility. Thus, the value function is V (s0) =
+∞ for any s0 > 0.
Now consider the dynamic programming problem:
V (s) = sup_{c ≥ 0}  c + β V (Rs − c).

There are two solutions to this Bellman equation. One is V* (s) = +∞. The
other is V* (s) = Rs. For the feasible plan c_t = 0 for all t, the asset holdings
are s_t = R^t s_0, so that β^n V* (s_n) = R s_0 > 0 for all n. Thus, the transversality condition
(7.4) is violated for the solution V* (s) = Rs.
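A short computation makes the violation concrete: under the plan c_t = 0 and the candidate solution V*(s) = Rs, the discounted value β^n V*(s_n) stays at a strictly positive constant rather than vanishing. The parameter values below are illustrative.

```python
# Check (7.4) for V*(s) = R*s under the feasible plan c_t = 0, s_{t+1} = R*s_t.
R, s0 = 1.05, 1.0
beta = 1.0 / R                         # beta = 1/R as in the example
s = s0
for n in range(1, 6):
    s = R * s                          # s_n = R^n * s0
    print(n, round(beta ** n * R * s, 9))  # stays at R*s0 = 1.05, never 0
```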
We should emphasize that the transversality condition (7.4) is a sufficient
condition, but not a necessary one. It is quite strong because it requires that the
limit converge to zero for every feasible plan. This condition is often violated in
an analytical solution to the Bellman equation is available, one can then
verify the transversality condition (7.4) for the particular plan generated by
the optimal policy function associated with the Bellman equation. If any
feasible plan that violates the transversality condition is dominated by some
feasible plan that satisfies this condition, then the following result, adapted
from Stokey, Lucas, and Prescott (1989), guarantees that the solution to the
Bellman equation is the value function and the associated policy function
generates an optimal plan.
Corollary 7.1.1 Let (S, S), (A, A), Γ, P, u, and β be given. Suppose Assumptions
5.1.1, 5.1.2, 5.1.4, and 7.1.1 hold. Let V be the finite value function
for problem (5.6). Suppose that V* : S → R is a measurable function
satisfying the Bellman equation (7.2) and that the associated optimal policy
correspondence G is nonempty and permits a measurable selection. Define
Π̂(s) as the set of feasible plans π such that the transversality condition (7.4)
holds, and define the function V̂ : S → R by

V̂ (s) = sup_{π ∈ Π̂(s)} J (s, π),   all s ∈ S.

Let π* (s) ∈ Π(s) be a plan generated by G. Suppose that

(a) π* (s) ∈ Π̂(s), all s ∈ S; and

(b) for any s ∈ S and π ∈ Π(s), there exists π̂ ∈ Π̂(s) such that J (s, π̂) ≥ J (s, π).

Then

V̂ (s) = V* (s) = V (s) = J (s, π* (s)) for all s ∈ S.
Proof. Since condition (a) holds, we can adapt the proof of Theorem
7.1.2 to show that

V* (s) = V̂ (s) = J (s, π* (s)).

Since Π̂(s) ⊂ Π(s), V̂ (s) ≤ V (s). To show the reverse inequality, suppose
that {π^k}_{k=1}^{∞} is a sequence in Π(s) such that

lim_{k→∞} J (s, π^k) = V (s).

Then by condition (b), there is a sequence {π̂^k}_{k=1}^{∞} in Π̂(s) such that J (s, π̂^k) ≥
J (s, π^k). Thus,

lim_{k→∞} J (s, π̂^k) ≥ V (s).

So V̂ (s) ≥ V (s), completing the proof.
The following example illustrates this result by explicitly computing the
value function. We call this method the value function approach.
Example 7.1.1 Consider the infinite-horizon case of the example analyzed
in Section 6.1. We write the Bellman equation as follows:

V (K) = max_{K′, C ≥ 0}  ln C + β V (K′)     (7.6)

subject to

C + K′ = K^α.
We may take the limit as the horizon T → ∞ in (6.5) and (6.3) to obtain
the value function:

V (K) = (1/(1 − β)) [ ln (1 − αβ) + (αβ/(1 − αβ)) ln (αβ) ] + (α/(1 − αβ)) ln K,
and the policy function:
K′ = g (K) = αβ K^α.
Alternatively, given log utility, one may conjecture that the value function
also takes a log form:
V (K) = A+B lnK,
for some constants A and B to be determined. One then substitutes this
conjecture into the Bellman equation (7.6) to verify this conjecture is correct
and also to determine the constants A and B.
We use Corollary 7.1.1 to verify that the above solution is indeed optimal.
First, we compute the value implied by the policy K*_{t+1} = αβ (K*_t)^α:

V (K*_t) = A + B ln K*_t = A + B ((1 − α^t)/(1 − α)) ln (αβ) + B α^t ln K_0.
Thus, condition (a) in the corollary is satisfied:

lim_{t→∞} β^t V (K*_t) = 0.     (7.7)
To verify condition (b) in the corollary, we show that for any feasible plan (K_t),
either J (K_0, (K_t)) = −∞, so that it is dominated by (K*_t), or (K_t) ∈ Π̂ (K_0).
We note that (K_t) ∈ Π̂ (K_0) if and only if

lim_{t→∞} β^t ln K_t = 0.
Suppose this equation fails. Then the series on the right-hand side of the
following equation must diverge:
J (K_0, (K_t)) = Σ_{t=0}^{∞} β^t ln C_t ≤ Σ_{t=0}^{∞} β^t ln K_t^α = Σ_{t=0}^{∞} α β^t ln K_t.
On the other hand, this series is bounded above because we can use the
resource constraint to show that

ln K_t ≤ α^t ln K_0.

Thus, the series must diverge to −∞, implying that J (K_0, (K_t)) = −∞.
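The closed-form solution of Example 7.1.1 can also be verified numerically by iterating the Bellman operator on a grid of capital stocks until convergence and comparing the result with A + B ln K. The grid and parameter values below are illustrative choices, not from the text.

```python
import numpy as np

alpha, beta = 0.3, 0.95

# Closed-form coefficients of V(K) = A + B ln K
B = alpha / (1 - alpha * beta)
A = (np.log(1 - alpha * beta)
     + alpha * beta / (1 - alpha * beta) * np.log(alpha * beta)) / (1 - beta)

K = np.linspace(0.05, 0.5, 150)           # capital grid (contains K' = a*b*K^a)
C = K[:, None] ** alpha - K[None, :]      # consumption for each (K, K') pair
logC = np.where(C > 0, np.log(np.maximum(C, 1e-12)), -np.inf)

v = np.zeros_like(K)
for _ in range(600):                      # value iteration: v <- Tv
    v = (logC + beta * v[None, :]).max(axis=1)

err = np.max(np.abs(v - (A + B * np.log(K))))
print(err)   # small: the grid solution matches the closed form
```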
The final result is a partial converse of Theorem 7.1.2. It ensures that
any optimal plan for the Markov decision problem can be found by solving
the dynamic programming problem in (7.2).
Theorem 7.1.3 Let (S,S) , (A,A), Γ, P, u, and β be given. Suppose As-
sumptions 5.1.1, 5.1.2, 5.1.4, and 7.1.1 hold. Suppose that the value func-
tion V is measurable and satisfies the Bellman equation (7.2). Suppose that
the associated optimal policy correspondence G is nonempty and permits a
measurable selection. Then any optimal plan π∗ ∈ Π(s0) for problem (7.1)
satisfies:
π*_0 ∈ G (s_0) and π*_t (s^t) ∈ G (s_t), all t, s^t.
Proof. Since the value function V satisfies the Bellman equation (7.2),
it is sufficient to show that
V (s_0) = u (s_0, π*_0) + β ∫ V (s_1) P (s_0, π*_0; ds_1),     (7.8)

V (s_t) = u (s_t, π*_t (s^t)) + β ∫ V (s_{t+1}) P (s_t, π*_t (s^t); ds_{t+1}).     (7.9)
We derive these equations by forward induction.
Consider (7.8) first. Since π∗ is an optimal plan,
V (s0) = J (s0, π∗) ≥ J (s0, π) , all π ∈ Π(s0) . (7.10)
By Lemma 7.1.1, we can rewrite the above inequality as:
u (s_0, π*_0) + β ∫ J (s_1, C (π*, s_1)) P (s_0, π*_0; ds_1)     (7.11)

≥ u (s_0, π_0) + β ∫ J (s_1, C (π, s_1)) P (s_0, π_0; ds_1),

for all π ∈ Π(s_0).
Next, choose a measurable selection g from G and define the plan π^g ∈ Π(s_0) as follows:

π^g_0 = π*_0,   π^g_t (s^t) = g (s_t),   all s^t ∈ S^t, t = 1, 2, ....
Thus, for any s_1 ∈ S, the continuation plan C (π^g, s_1) is a plan generated by
G from s_1. Replacing π by π^g in (7.11) gives

∫ J (s_1, C (π*, s_1)) P (s_0, π*_0; ds_1) ≥ ∫ J (s_1, C (π^g, s_1)) P (s_0, π*_0; ds_1).     (7.12)
By Theorem 7.1.2,

V (s_1) = J (s_1, C (π^g, s_1)) ≥ J (s_1, σ),     (7.13)

for all s_1 ∈ S and σ ∈ Π(s_1). Since C (π*, s_1) ∈ Π(s_1), it follows that

J (s_1, C (π^g, s_1)) ≥ J (s_1, C (π*, s_1)),   all s_1 ∈ S.     (7.14)
The two inequalities in (7.12) and (7.14) imply that

J (s_1, C (π^g, s_1)) = J (s_1, C (π*, s_1)),   P (s_0, π*_0; ·)-a.e.

It then follows from (7.13) that

V (s_1) = J (s_1, C (π*, s_1)),   P (s_0, π*_0; ·)-a.e.     (7.15)
Thus, by the optimality of π*,

V (s_0) = J (s_0, π*)

= u (s_0, π*_0) + β ∫ J (s_1, C (π*, s_1)) P (s_0, π*_0; ds_1)

= u (s_0, π*_0) + β ∫ V (s_1) P (s_0, π*_0; ds_1),

where the second line follows from Lemma 7.1.1. Since V satisfies the Bellman
equation (7.2), the above equation implies that π*_0 ∈ G (s_0).
An analogous argument, with (7.15) in place of (7.10) as the starting
point, shows that (7.9) holds for t = 1; continuing by induction completes the proof.
Theorems 7.1.1-7.1.3 constitute the main theory of infinite-horizon dy-
namic programming and formalize the Principle of Optimality in the infinite-
horizon case. As in the finite-horizon case, one main difficulty in applying
these theorems is to check measurability. Another main difficulty arising
only in the infinite-horizon case is the existence of a solution to the fixed-
point problem in the Bellman equation (7.2). In the next section, we shall
impose certain continuity assumptions for bounded rewards to resolve both
difficulties. For unbounded rewards, there is no general solution to the sec-
ond difficulty. We will provide some recent results in Section 7.3.2.
7.2 Bounded Rewards
Consider the Markov decision process model with the primitives (S, S),
(A, A), Γ, P, u, and β. Our objective is to study the existence of a solution
to the Bellman equation (7.2). We focus on the Borel case in which
Assumptions 6.4.1 and 6.4.2 hold. In addition, we impose the following:
Assumption 7.2.1 The reward function u : S× A→ R is bounded and
continuous.
Assumption 7.2.2 The correspondence Γ : S→ A is nonempty, compact-
valued and continuous.
Assumption 7.2.3 The stochastic kernel P satisfies the property that
∫ f (s′) P (s, a; ds′) is continuous in (s, a) for any bounded and continuous
function f : S → R.
These assumptions allow us to use the powerful Contraction Mapping
Theorem to find a fixed point of some operator or mapping. To apply this
theorem, we need to introduce a Banach space and a contraction mapping.
We consider the set of bounded and continuous functions on S endowed with
the sup norm. Denote this set by C (S) . The sup norm is defined as:
||f|| = sup_{s ∈ S} |f (s)|,   f ∈ C (S).
This set is a Banach space by Theorem 3.1 in Stokey, Lucas and Prescott
(1989).
Define a Bellman operator T on C (S) by:

(Tf) (s) = sup_{a ∈ Γ(s)}  u (s, a) + β ∫ f (s′) P (s, a; ds′),     (7.16)

for any f ∈ C (S). Then the solution to the Bellman equation (7.2) is a
fixed point of T in that V = TV.
Theorem 7.2.1 (Existence) Let (S,S) , (A,A), Γ, P, u, and β satisfy As-
sumptions 5.1.2, 6.4.1, 6.4.2, and 7.2.1-7.2.3. Then there exists a unique
solution V ∈ C (S) to the Bellman equation (7.2) and the associated policy
correspondence G is non-empty, compact valued and upper hemicontinuous.
Proof. For each f ∈ C (S), the problem in (7.16) is to maximize a
continuous function over the compact set Γ (s). Thus, the maximum is attained.
Since both u and f are bounded, Tf is bounded. It then follows
from the Theorem of the Maximum that Tf is continuous in s. Thus,
T : C (S) → C (S). We use Assumption 5.1.2 to verify that the conditions of
the Blackwell Theorem are satisfied, so that T is a contraction. Since C (S)
is a Banach space, it follows from the Contraction Mapping Theorem that
T has a unique fixed point V ∈ C (S). Finally, apply the Theorem of the
Maximum to deduce the stated properties of G.
By this theorem and the Measurable Selection Theorem, G permits a
measurable selection. We can then apply Theorem 7.1.2 to deduce that the
solution V to the Bellman equation is equal to the value function for the
Markov decision problem at date zero and any plan generated by G gives
an optimal Markov plan for this problem.
The contraction property of the Bellman operator T gives a globally
convergent algorithm to solve the value function. In addition, it allows us to
formalize the intuitive result that the solution to an infinite-horizon problem
is the limit of that for a finite-horizon problem.
Theorem 7.2.2 (Convergence) Suppose that (S, S), (A, A), Γ, P, u, and β
satisfy Assumptions 5.1.2, 6.4.1, 6.4.2, and 7.2.1-7.2.3. Let V be the unique
solution to the Bellman equation (7.2). Let V^N_0 be the value function for the
N-horizon Markov decision problem (5.5):

V^N_0 (s) = sup_{π ∈ Π(s)} J^N_0 (s; π),   any s ∈ S,

where J^N_0 is defined in (5.3) with u_t = u, t = 0, 1, ..., N − 1, and u_N = 0.
Then

||T^n v_0 − V|| ≤ β^n ||v_0 − V||,   n = 1, 2, ...,     (7.17)

lim_{N→∞} T^N v_0 = lim_{N→∞} V^N_0 = V,     (7.18)

for any v_0 ∈ C (S).
Proof. Equation (7.17) follows from the Contraction Mapping Theorem.
Thus, lim_{N→∞} T^N v_0 = V for any v_0 ∈ C (S). By the definition of T and Theorem
6.3.2, we can verify that V^N_t = T^{N−t} 0 for t = 0, 1, ..., N − 1, where 0 denotes the
function equal to zero for every s ∈ S. Thus,

lim_{N→∞} V^N_0 = lim_{N→∞} T^N 0 = V.

This completes the proof.
This theorem connects the finite- and infinite-horizon problems. In terms
of notation, we use V^N_t to denote the value function at time t for an
N-period problem. At time t, the decision maker faces an (N − t) horizon;
thus, V^N_t = V^{N−t}_0. In the proof, we show that v_n = T^n 0 = V^N_{N−n} = V^n_0,
where n is the horizon or the time-to-go. Similar notations apply to the
policy functions.
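The contraction bound (7.17) is easy to see numerically on a small finite-state, finite-action problem with randomly generated bounded rewards and transition probabilities. This is an illustrative sketch, not an example from the text; the problem instance is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
nS, nA, beta = 6, 3, 0.9
u = rng.uniform(0.0, 1.0, (nS, nA))               # bounded rewards u(s, a)
P = rng.dirichlet(np.ones(nS), size=(nS, nA))     # P[s, a, s']: stochastic kernel

def T(f):
    # (Tf)(s) = max_a { u(s, a) + beta * sum_{s'} P(s, a, s') f(s') }
    return (u + beta * P @ f).max(axis=1)

V = np.zeros(nS)                 # iterate to an accurate fixed point V = TV
for _ in range(1000):
    V = T(V)

v0 = rng.uniform(-5.0, 5.0, nS)  # arbitrary starting guess
err0 = np.abs(v0 - V).max()
v = v0
for n in range(1, 6):
    v = T(v)
    # the bound (7.17): ||T^n v0 - V|| <= beta^n ||v0 - V||
    print(n, np.abs(v - V).max() <= beta ** n * err0 + 1e-10)  # True
```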
7.3 Optimal Control
In order to provide sharper characterizations of the value function and the
optimal policy correspondence, we study the infinite-horizon optimal control
model introduced in Section 5.2.4. In this model, the state space is S = X × Z.
The set X is the set of possible values for the endogenous state variable and
Z is the set of possible values for the exogenous shock. The primitives of
this model are (X, X), (A, A), (Z, Z), Q, Γ, φ, u, and β.
The control problem is given by (5.13) and (5.14). We shall study the
associated dynamic programming problem:

V (x, z) = sup_{a ∈ Γ(x,z)}  u (x, z, a) + β ∫ V (φ (x, a, z, z′), z′) Q (z, dz′).     (7.19)
The associated Bellman operator T on C (S) is defined by

(Tf) (x, z) = sup_{a ∈ Γ(x,z)}  u (x, z, a) + β ∫ f (φ (x, a, z, z′), z′) Q (z, dz′),     (7.20)

for any f ∈ C (S). We shall use this operator extensively to study the
existence and properties of the solution to the Bellman equation.
7.3.1 Bounded Rewards
For bounded rewards, we can use the powerful Contraction Mapping Theo-
rem to analyze dynamic programming problems as we have shown in Section
7.2. As in that section, we require that the reward function be continuous.
Given S = X× Z, we replace Assumption 7.2.1 by:
Assumption 7.3.1 The reward function u : X× Z× A→ R is bounded
and continuous.
Using Lemma 6.4.1 and Theorem 7.2.1, we immediately obtain:
Theorem 7.3.1 (Existence) Let (X, X), (A, A), (Z, Z), Q, Γ, φ, u, and β
satisfy Assumptions 5.1.2, 6.4.1-6.4.5, and 7.3.1. Then there exists a unique
solution V ∈ C (S) to the Bellman equation (7.19) and the associated policy
correspondence G is nonempty, compact-valued, and upper hemicontinuous.
Now, we study the properties of the solution (V,G). As in the finite-
horizon case, we focus on the properties of monotonicity, concavity, and
differentiability. First, consider monotonicity.
Assumption 7.3.2 For each (z, a) ∈ Z × A, u (·, z, a) : X → R is strictly
increasing.
Theorem 7.3.2 (Monotonicity) Let (X, X), (A, A), (Z, Z), Q, Γ, φ, β, and
u satisfy Assumptions 5.1.2, 6.4.1-6.4.5, 6.4.8, 6.4.9, 7.3.1, and 7.3.2. Then
the solution to the Bellman equation (7.19), V (·, z) : X → R, is strictly
increasing for each z ∈ Z.
Proof. Let C ′ (S) ⊂ C (S) be the set of bounded continuous functions
f on S = X× Z that are increasing in x, and let C ′′ (S) be the set of
functions that are strictly increasing in x. Since C ′ (S) is a closed subspace
of the complete metric space C (S) , by the Closed Subspace Theorem, it is
sufficient to show that T maps an element in C ′ (S) into C ′′ (S) . This can
be done by applying an argument similar to that in the proof of Theorem
6.4.2.
Second, we consider concavity by replacing Assumption 6.4.10 with the
following:
Assumption 7.3.3 For each z ∈ Z,

u (θx + (1 − θ) x′, z, θa + (1 − θ) a′) ≥ θ u (x, z, a) + (1 − θ) u (x′, z, a′)

for all (x, a), (x′, a′) ∈ X × A and all θ ∈ (0, 1). The inequality is strict if
x ≠ x′.
Theorem 7.3.3 (Concavity) Let (X, X), (A, A), (Z, Z), Q, Γ, φ, β, and
u satisfy Assumptions 5.1.2, 6.4.1-6.4.5, 6.4.8, 6.4.9, 6.4.12, 7.3.1, 7.3.2,
and 7.3.3. Let V be the solution to the Bellman equation (7.19) and G
be the associated optimal policy correspondence. Then, for each z ∈ Z,
V (·, z) : X → R is strictly concave and G (·, ·) : X × Z → A is a single-valued
continuous function.
Proof. Let C ′ (S) ⊂ C (S) be the set of bounded continuous functions f
on S = X× Z that are concave in x, and let C ′′ (S) be the set of functions that
are strictly concave in x. Since C ′ (S) is a closed subspace of the complete
metric space C (S) , by the Closed Subspace Theorem, it is sufficient to show
that T maps an element in C ′ (S) into C ′′ (S) . This can be done by applying
an argument similar to that in the proof of Theorem 6.4.3.
If the assumptions in the above theorem hold, then we write the optimal
policy function as g : X × Z → A. This policy function generates an optimal
plan (a_0, a_1, ...) and optimal system dynamics (x_0, x_1, ...) recursively
as follows:

a_t (z^t) = g (x_t, z_t),

x_{t+1} (z^{t+1}) = φ (x_t, a_t, z_t, z_{t+1}),   (x_0, z_0) given,

where z^t ∈ Z^t.
To establish differentiability we replace Assumptions 6.4.13 and 6.4.14
with the following:
Assumption 7.3.4 For each z ∈ Z, u (·, z, ·) is continuously differentiable
on the interior of X× A.
Assumption 7.3.5 The system transition function φ (x, a, z, z′) = a is in-
dependent of (x, z, z′).
Assumption 7.3.5 seems strong. But it is often satisfied in applications
because one may perform a suitable change of variables such that x′ = a.
Given this assumption, the Bellman equation (7.19) becomes:
V (x, z) = sup_{x′ ∈ Γ(x,z)}  u (x, z, x′) + β ∫ V (x′, z′) Q (z, dz′).     (7.21)
This is the problem extensively studied in Stokey, Lucas and Prescott (1989).
We call this class of control problems the Euler class.
Theorem 7.3.4 (Differentiability) Let (X, X), (A, A), (Z, Z), Q, Γ, φ, β,
and u satisfy Assumptions 5.1.2, 6.4.1-6.4.4, 6.4.11, and 7.3.3-7.3.5. Let
V : X × Z → R be the solution to (7.21) and g : X × Z → R be the associated
optimal policy function. Let x_0 ∈ int(X) and g (x_0, z_0) ∈ int(Γ (x_0, z_0)) for
some z_0 ∈ Z. Then V (·, z_0) is continuously differentiable at x_0 and

V_x (x_0, z_0) = u_x (x_0, z_0, g (x_0, z_0)).     (7.22)
Proof. By Theorems 7.3.1 and 7.3.3, the solution for V and g exists
and V is strictly concave. Let x_0 ∈ int(X) and g (x_0, z_0) ∈ int(Γ (x_0, z_0)).
By Assumption 6.4.4, there is some open neighborhood D of x_0 such that
g (x_0, z_0) ∈ int(Γ (x, z_0)) for all x ∈ D. We define W : D → R by:

W (x) = u (x, z_0, g (x_0, z_0)) + β ∫ V (g (x_0, z_0), z′) Q (z_0, dz′).

Since g (x_0, z_0) ∈ int(Γ (x, z_0)), it follows from (7.21) that W (x) ≤
V (x, z_0), with equality at x = x_0. By Assumption 7.3.4, W is differentiable.
By the Benveniste and Scheinkman Theorem, V (·, z_0) is differentiable at x_0
and

∂V (x_0, z_0)/∂x = ∂W (x_0)/∂x.

We then obtain equation (7.22).
Finally, we show that not only does the sequence of value functions in the
finite-horizon problem converge to the value function in the infinite-horizon
problem, but so does the sequence of policy functions.
Theorem 7.3.5 (Convergence) Let (X, X), (A, A), (Z, Z), Q, Γ, φ, β, and
u satisfy Assumptions 5.1.2, 6.4.1-6.4.5, 6.4.8, 6.4.9, 6.4.12, 7.3.1, 7.3.2,
and 7.3.3. Let V be the solution to the Bellman equation (7.19) and g be the
associated optimal policy function. Let C′ (S) ⊂ C (S) be the set of bounded
and continuous functions that are concave on X. Let v_0 ∈ C′ (S) and define
(v_n, g_n) by:

v_n = T^n v_0, and

g_n (x, z) = arg max_{a ∈ Γ(x,z)}  u (x, z, a) + β ∫ v_n (φ (x, a, z, z′), z′) Q (z, dz′),

for n = 1, 2, .... Then g_n → g pointwise. If X and Z are compact, then the
convergence is uniform.
Proof. The proof is analogous to that of Theorem 9.9 in Stokey, Lucas
and Prescott (1989). We omit it here.
As in Theorem 7.2.2, we may relate the above result to that in the
finite-horizon case. We have the relation T^n 0 = V^N_{N−n} = V^n_0. Intuitively,
T^n 0 is equal to the value function at time N − n for an N-period problem
with the terminal reward u_N = 0. It is also equal to the value function
for the n-horizon problem with the terminal reward u_n = 0, and g_n is
the associated initial policy function. As the horizon or time-to-go
n → ∞, the sequence of n-horizon policy functions (g_n) converges to the
infinite-horizon policy function g.
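This convergence can be illustrated with the growth model of Example 7.1.1, whose infinite-horizon policy g(K) = αβK^α is known in closed form. The sketch below computes the greedy policy after n Bellman iterations starting from v_0 = 0 and reports its sup-distance to g; the grid and parameter values are illustrative choices, not from the text.

```python
import numpy as np

# Growth model of Example 7.1.1: the exact policy is g(K) = alpha*beta*K^alpha.
alpha, beta = 0.3, 0.95
K = np.linspace(0.05, 0.5, 400)
C = K[:, None] ** alpha - K[None, :]          # consumption for each (K, K')
logC = np.where(C > 0, np.log(np.maximum(C, 1e-12)), -np.inf)

g_true = alpha * beta * K ** alpha            # infinite-horizon policy
v = np.zeros_like(K)                          # v_0 = 0
dists = {}
for n in range(1, 41):
    vals = logC + beta * v[None, :]
    j = vals.argmax(axis=1)                   # greedy choice of K' on the grid
    v = vals[np.arange(K.size), j]
    gn = K[j]                                  # policy after n iterations
    if n in (1, 5, 10, 40):
        dists[n] = float(np.abs(gn - g_true).max())
print(dists)  # sup-distance to the true policy shrinks as n grows
```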
7.3.2 Unbounded Rewards
For Markov decision problems with unbounded rewards, there is no general
theory for dynamic programming to work. One difficulty is that there is no
general fixed point theorem to guarantee the existence of a solution to the
Bellman equation. The Contraction Mapping Theorem cannot be simply
applied to the set of unbounded and continuous functions because the sup
norm on this set is not well defined. One may apply other fixed point
theorems such as the Schauder Fixed-Point Theorem or the Tarski Fixed-
Point Theorem to find a fixed point. However, there is no general theorem to
guarantee its uniqueness and global convergence. The other difficulty is that
the transversality condition (7.4) often fails for some feasible plans so that
Theorem 7.1.2 does not apply. In this case, Corollary 7.1.1 is useful. Stokey,
Lucas and Prescott (1989) provide several examples to illustrate this result.
In Section 8, we will provide further examples. Here we will introduce two
additional approaches. We refer readers to Le Van and Morhaim (2002) for
other approaches.
Weighted Norm Approach
A natural way to apply the Contraction Mapping Theorem to unbounded continuous functions is to restrict attention to a subset of these functions. Each function in this subset is bounded by some weight function, and thus one can define a sup norm weighted by this function. Under this weighted sup norm, the subset is a complete metric space. As a result, if one can construct a contraction mapping on this space, one can apply the Contraction Mapping Theorem. This idea appears in the mathematics literature (e.g., Wessels (1977)) and was introduced into economics by Boyd (1990). Recently, Duran (2000, 2003) has further extended this approach.
To formalize it, let $\varphi : S \to \mathbb{R}_{++}$ be a continuous function. Define $C_\varphi(S)$ as the set of continuous functions $f$ on $S$ such that $\sup_{s \in S} |f(s)/\varphi(s)| < \infty$. We call $\varphi$ a weight function and define a weighted sup norm by:
\[
\|f\|_\varphi = \sup_{s \in S} \left| \frac{f(s)}{\varphi(s)} \right|, \qquad f \in C_\varphi(S). \tag{7.23}
\]
In Exercise 1, the reader is asked to show that this is a well defined norm
and that Cϕ(S) endowed with this norm is a Banach space.
Now, we need to check that the Bellman operator is a contraction under
the weighted sup norm. The following theorem, analogous to the Blackwell Theorem, is useful.
Theorem 7.3.6 Let the operator $T : C_\varphi(S) \to C(S)$ be such that:
(a) (monotonicity) $f \le g$ implies $Tf \le Tg$;
(b) (discounting) $T(f + a\varphi) \le Tf + \theta a \varphi$ for some constant $\theta \in (0, 1)$ and any constant $a > 0$;
(c) (one-period weighted boundedness) $T0 \in C_\varphi(S)$.
Then $T$ is a contraction with modulus $\theta$ and has a unique fixed point in $C_\varphi(S)$.
Proof. Since $f, g \in C_\varphi(S)$, $f \le g + |f - g| \le g + \|f - g\|_\varphi\, \varphi$. By (a) and (b),
\[
Tf \le T\big(g + \|f - g\|_\varphi\, \varphi\big) \le Tg + \theta \|f - g\|_\varphi\, \varphi.
\]
Similarly, $Tg \le Tf + \theta \|f - g\|_\varphi\, \varphi$. Thus,
\[
\|Tf - Tg\|_\varphi \le \theta \|f - g\|_\varphi,
\]
and $T$ is a contraction with modulus $\theta$. For any $f \in C_\varphi(S)$, setting $g = 0$ in the first inequality above yields:
\[
Tf \le T0 + \theta \|f\|_\varphi\, \varphi \le \|T0\|_\varphi\, \varphi + \theta \|f\|_\varphi\, \varphi.
\]
Thus, $\|Tf\|_\varphi < \infty$ by (c). By the Contraction Mapping Theorem, $T$ has a unique fixed point in $C_\varphi(S)$.
Conditions (a)-(b) ensure that T is a contraction. Condition (c) en-
sures that ||Tf ||ϕ < ∞ so that T is a self-map. This result is often called
the Weighted Contraction Mapping Theorem. The following example
illustrates this theorem.
Example 7.3.1 Consider the following deterministic AK model:
\[
\max_{(C_t, K_{t+1})_{t=0}^\infty} \; \sum_{t=0}^\infty \beta^t C_t^\gamma, \quad \gamma \in (0, 1),
\]
subject to
\[
C_t + K_{t+1} = A K_t, \quad A > 0, \; K_0 > 0 \text{ given}.
\]
Define the value function for this problem as $V(K)$. Clearly it is an unbounded function. Note that the utility function is homogeneous of degree $\gamma$ and the constraint set is linearly homogeneous. We choose the weight function $\varphi(K) = K^\gamma$, $K \ge 0$.$^1$ For any continuous function $f : \mathbb{R}_+ \to \mathbb{R}$ that is homogeneous of degree $\gamma$, the weighted sup norm is $\|f\|_\varphi = \sup_{k \ge 0} |f(k)/k^\gamma| = |f(1)| < \infty$. Let $C_\varphi(\mathbb{R}_+)$ be the space of all such functions. Define the Bellman operator $T$ on this space:
\[
Tf(K) = \max_{y \in [0, AK]} \; (AK - y)^\gamma + \beta f(y), \qquad f \in C_\varphi(\mathbb{R}_+).
\]
We now verify that $T$ is a contraction under the weighted sup norm $\|\cdot\|_\varphi$. First, $T$ satisfies monotonicity. Second, we derive:
\begin{align*}
T(f + a\varphi)(K) &= \max_{y \in [0, AK]} \; (AK - y)^\gamma + \beta f(y) + \beta a y^\gamma \\
&\le \max_{y \in [0, AK]} \; (AK - y)^\gamma + \beta f(y) + \beta a (AK)^\gamma \\
&= Tf(K) + \beta A^\gamma a \varphi(K).
\end{align*}
Thus, if $\beta A^\gamma < 1$, then $T$ satisfies discounting. Third, we compute
\[
T0(K) = \max_{C \in [0, AK]} C^\gamma = A^\gamma K^\gamma = A^\gamma \varphi(K).
\]
Thus, $T0 \in C_\varphi(\mathbb{R}_+)$. We conclude that if $\beta A^\gamma < 1$, then $T$ is a contraction mapping and has a unique fixed point in $C_\varphi(\mathbb{R}_+)$.
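By homogeneity, each $f \in C_\varphi(\mathbb{R}_+)$ is determined by the scalar $b = f(1)$, so iterating $T$ collapses to the one-dimensional fixed-point problem $b_{n+1} = \max_{y \in [0, A]} (A - y)^\gamma + \beta b_n y^\gamma$. A numerical sketch (the parameter values are hypothetical and chosen so that $\beta A^\gamma < 1$):

```python
import numpy as np

# Value iteration for the AK model, reduced by homogeneity: V(K) = b* K**gamma.
# Parameter values are hypothetical but satisfy beta * A**gamma < 1.
A, beta, gamma = 1.05, 0.95, 0.5
assert beta * A**gamma < 1           # discounting condition from the text

y = np.linspace(0.0, A, 4001)        # choice grid for y = K' when K = 1
b = 0.0                              # b_0 = 0, i.e., v_0 = 0
for _ in range(2000):
    b_new = np.max((A - y)**gamma + beta * b * y**gamma)
    if abs(b_new - b) < 1e-12:
        b = b_new
        break
    b = b_new

# Closed-form check: the optimal savings rate is s = (beta * A**gamma)**(1/(1-gamma)),
# so V(K) = ((1-s)*A)**gamma * K**gamma / (1 - beta * (s*A)**gamma).
s = (beta * A**gamma)**(1.0 / (1.0 - gamma))
b_closed = ((1.0 - s) * A)**gamma / (1.0 - beta * (s * A)**gamma)
print(b, b_closed)
```

The iteration converges at the contraction rate $\beta A^\gamma$, so it is slow when the discounting condition holds with little slack; the closed-form value serves only as a check on the sketch.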
Alvarez and Stokey (1998) consider general dynamic programming problems with a similar homogeneity property. They show that the Weighted Contraction Mapping Theorem can be applied to general cases with a positive degree of homogeneity. Duran (2000, 2003) extends the weighted norm
$^1$This function is equal to zero only if $K = 0$. But it is still a well-defined weight function for homogeneous functions if we use the convention that $K^\gamma / K^\gamma = 1$ for $K = 0$.
approach to general problems without the homogeneity property. His idea
does not rely on the Weighted Contraction Mapping Theorem because the
conditions in this theorem are hard to verify in some applications. He im-
poses conditions on the weight function to directly verify that the Bellman
operator is a contraction.
To explain his idea, we consider the deterministic model:
\[
\max_{(x_{t+1})} \; \sum_{t=0}^\infty \beta^t u(x_t, x_{t+1}), \quad x_0 \text{ given}, \tag{7.24}
\]
subject to $x_{t+1} \in \Gamma(x_t)$, $t = 0, 1, \ldots$. We impose the following:
Assumption 7.3.6 The function $u : X \times X \to \mathbb{R}$ is continuous, where $X \subset \mathbb{R}^{n_x}$.
Assumption 7.3.7 The correspondence Γ : X→ X is nonempty, compact-
valued, and continuous.
Define the Bellman operator:
\[
Tf(x) = \sup_{y \in \Gamma(x)} \; u(x, y) + \beta f(y). \tag{7.25}
\]
Let ϕ be a weight function and consider the space Cϕ (X) endowed with the
weighted sup norm ||·||ϕ. Our objective is to give conditions such that T is a
contraction and a self-map on this space so that a unique fixed point exists.
First, in Exercise 2, the reader is asked to show the following:
\begin{align}
|Tf - Tg|(x) &\le \beta \sup_{x' \in \Gamma(x)} \big|f(x') - g(x')\big| \notag\\
&= \beta \sup_{x' \in \Gamma(x)} \frac{|f(x') - g(x')|}{\varphi(x')} \cdot \frac{\varphi(x')}{\varphi(x)} \cdot \varphi(x) \notag\\
&\le \beta \sup_{x' \in \Gamma(x)} \frac{\varphi(x')}{\varphi(x)} \, \|f - g\|_\varphi \, \varphi(x). \tag{7.26}
\end{align}
Thus, if
\[
\beta \sup_{x' \in \Gamma(x)} \frac{\varphi(x')}{\varphi(x)} \le \theta < 1, \quad \text{for all } x \in X, \tag{7.27}
\]
then $T$ is a contraction with modulus $\theta$ under the weighted sup norm.
Next, we need to show that $T$ maps a function in $C_\varphi(X)$ into itself. The key step is to verify that $\|Tf\|_\varphi < \infty$ if $\|f\|_\varphi < \infty$. Setting $g = 0$ in (7.26) yields:
\[
Tf(x) \le \theta \|f\|_\varphi \varphi(x) + T0(x) \le \theta \|f\|_\varphi \varphi(x) + \|T0\|_\varphi \varphi(x).
\]
Thus, if we can show that
\[
\|T0\|_\varphi < \infty, \tag{7.28}
\]
then
\[
\|Tf\|_\varphi \le \theta \|f\|_\varphi + \|T0\|_\varphi < \infty.
\]
Finally, we need to verify certain continuity properties. This is routine. To conclude, (7.27) and (7.28) are the key conditions for the application of the Contraction Mapping Theorem to $T$ under the norm $\|\cdot\|_\varphi$. This result can be applied to models without homogeneity properties. The following example illustrates this result.
Example 7.3.2 We modify the previous example to incorporate a nonhomogeneous technology. Let the production function be $F(x) = Ax + x^\alpha$, $A > 1$ and $\alpha \in (0, 1)$. Then the one-period utility is $u(x, y) = (Ax + x^\alpha - y)^\gamma$. Take the weight function $\varphi(x) = 1 + (Ax + x^\alpha)^\gamma$. To check condition (7.27), we compute:
\[
\beta \sup_{y \in [0,\, Ax + x^\alpha]} \frac{1 + (Ay + y^\alpha)^\gamma}{1 + (Ax + x^\alpha)^\gamma}
\le \beta\, \frac{1 + (Az + z^\alpha)^\gamma}{1 + z^\gamma}
\le \lim_{z \to \infty} \beta\, \frac{1 + (Az + z^\alpha)^\gamma}{1 + z^\gamma} = \beta A^\gamma,
\]
where $z = Ax + x^\alpha \ge 0$. We assume $\beta A^\gamma < 1$. To check condition (7.28), we compute:
\[
T0(x) = \sup_{y \in [0,\, Ax + x^\alpha]} (Ax + x^\alpha - y)^\gamma < \varphi(x).
\]
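These bounds can be spot-checked on a grid. For the hypothetical parameter values below (which satisfy $\beta A^\gamma < 1$), the ratio in (7.27) increases in $z$ toward its limit $\beta A^\gamma$, so the contraction modulus stays below one; a sketch:

```python
import numpy as np

# Grid check of condition (7.27) for the weight phi(x) = 1 + (A*x + x**alpha)**gamma.
# Parameter values are hypothetical; they satisfy beta * A**gamma < 1.
A, alpha, gamma, beta = 2.0, 0.5, 0.5, 0.6

# z = A*x + x**alpha sweeps (0, infinity) as x does; phi is increasing in its
# argument, so the sup over y in [0, z] of phi(y)/phi(x) is attained at y = z.
z = np.logspace(-6, 6, 20001)
ratio = beta * (1.0 + (A * z + z**alpha)**gamma) / (1.0 + z**gamma)

theta = ratio.max()
print(theta, beta * A**gamma)   # theta stays below beta * A**gamma < 1
```

For other values of $A$ close to one the ratio can peak above its limit at intermediate $z$, so a grid check of this kind is a useful sanity test before relying on the limiting bound.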
It is straightforward to extend the above approach to stochastic models (see Duran (2003)). As Duran (2000) points out, it is not easy to apply this approach to models with reward functions that are bounded above but unbounded below. Alvarez and Stokey (1998) also point out this difficulty and propose solutions for homogeneous functions. To see the issue, consider the AK model in Example 7.3.1 with log utility: $u(x, y) = \ln(Ax - y)$. To ensure one-period boundedness, we may set $\varphi(x) = 1 + |\ln(Ax)|$. However, condition (7.27) can never be satisfied.
Local Contraction Approach
One limitation of the weighted norm approach is that it is hard to handle reward functions that are bounded above but unbounded below. Rincón-Zapatero and Rodríguez-Palmero (2003, 2009) propose a new approach based on local contractions. This approach has been extended and modified by Martins-Da-Rocha and Vailakis (2010) and Matkowski and Nowak (2009). These studies show that the local contraction approach is an effective way to analyze reward functions that are bounded above but unbounded below.
Here we present the latest development by Martins-Da-Rocha and Vailakis (2010). Let $\mathcal{F}$ be a set and $D = (d_j)_{j \in J}$ be a family of semi-distances defined on $\mathcal{F}$. We let $\sigma$ be the weak topology on $\mathcal{F}$ defined by the family $D$. This means that, for any $f, g \in \mathcal{F}$, $f \to g$ if $d_j(f, g) \to 0$ for all $j \in J$. In this case, we say $f$ is $\sigma$-convergent to $g$. The space $\mathcal{F}$ is said to be $\sigma$-Hausdorff if for any $f, g \in \mathcal{F}$, $f \neq g$ implies that there exists $j \in J$ such that $d_j(f, g) > 0$. A sequence $(f_n)_{n=1}^\infty$ is said to be $\sigma$-Cauchy if it is $d_j$-Cauchy for each $j \in J$. A subset $A$ of $\mathcal{F}$ is said to be sequentially $\sigma$-complete if every $\sigma$-Cauchy sequence in $A$ converges in $A$ for the $\sigma$-topology. A subset $A$ of $\mathcal{F}$ is said to be $\sigma$-bounded if $\mathrm{diam}_j(A) \equiv \sup\{d_j(f, g) : f, g \in A\}$ is finite for every $j \in J$. For any $h \in \mathcal{F}$ and $B \subset \mathcal{F}$, define $d_j(h, B) = \inf\{d_j(h, g) : g \in B\}$. The following definition plays a central role.
Definition 7.3.1 Let $r : J \to J$. An operator $T : \mathcal{F} \to \mathcal{F}$ is a local contraction with respect to $(D, r)$ if for every $j \in J$ there exists $\beta_j \in [0, 1)$ such that
\[
d_j(Tf, Tg) \le \beta_j \, d_{r(j)}(f, g), \quad \text{all } f, g \in \mathcal{F}.
\]
It is a $k$-local contraction if $r(j) = j + k$ for all $j \in J$.
The following theorem gives conditions for the existence, uniqueness, and stability of a fixed point for local contractions.

Theorem 7.3.7 Let the space $\mathcal{F}$ be $\sigma$-Hausdorff and $T : \mathcal{F} \to \mathcal{F}$ be a local contraction with respect to $(D, r)$. Consider a non-empty, $\sigma$-bounded, sequentially $\sigma$-complete subset $A \subset \mathcal{F}$ satisfying $TA \subset A$. If
\[
\lim_{n \to \infty} \beta_j \beta_{r(j)} \cdots \beta_{r^n(j)} \, \mathrm{diam}_{r^{n+1}(j)}(A) = 0, \quad \text{any } j \in J, \tag{7.29}
\]
then the operator $T$ admits a fixed point $f^*$ in $A$. Moreover, if $h \in \mathcal{F}$ satisfies
\[
\lim_{n \to \infty} \beta_j \beta_{r(j)} \cdots \beta_{r^n(j)} \, d_{r^{n+1}(j)}(h, A) = 0, \quad \text{any } j \in J, \tag{7.30}
\]
then the fixed point $f^*$ is unique in $A$ and the sequence $(T^n h)_{n=1}^\infty$ is $\sigma$-convergent to $f^*$.
Proof. Fix an element $g \in A$. Since $T$ is a local contraction, for every pair of integers $k > n > 0$, we have
\[
d_j\big(T^k g, T^n g\big) \le \beta_j \, d_{r(j)}\big(T^{k-1}g, T^{n-1}g\big) \le \cdots \le \beta_j \beta_{r(j)} \cdots \beta_{r^{n-1}(j)} \, d_{r^n(j)}\big(T^{k-n}g, g\big).
\]
Since $TA \subset A$, $T^{k-n}g \in A$ and the above inequality implies that
\[
d_j\big(T^k g, T^n g\big) \le \beta_j \beta_{r(j)} \cdots \beta_{r^{n-1}(j)} \, \mathrm{diam}_{r^n(j)}(A).
\]
It follows from (7.29) that the sequence $(T^n g)$ is $d_j$-Cauchy for each $j$. Since $A$ is assumed to be sequentially $\sigma$-complete, there exists $f^*$ in $A$ such that $(T^n g)$ is $\sigma$-convergent to $f^*$. It follows that
\[
d_j(Tf^*, f^*) = \lim_n d_j\big(Tf^*, T^{n+1}g\big) \le \beta_j \lim_n d_{r(j)}\big(f^*, T^n g\big) = 0, \quad \text{all } j \in J,
\]
where the inequality follows from the assumption that $T$ is a local contraction with respect to $(D, r)$. Since $\mathcal{F}$ is $\sigma$-Hausdorff, it follows that $Tf^* = f^*$. Thus, $f^*$ is a fixed point of $T$.
Now, we show that $(T^n h)$ is $\sigma$-convergent to $f^*$ for all $h \in \mathcal{F}$ satisfying (7.30). Repeatedly applying the definition of a local contraction, we derive:
\begin{align*}
d_j\big(T^{n+1}h, T^{n+1}f^*\big) &\le \beta_j \, d_{r(j)}\big(T^n h, T^n f^*\big) \le \cdots \\
&\le \beta_j \beta_{r(j)} \cdots \beta_{r^n(j)} \, d_{r^{n+1}(j)}(h, f^*) \\
&\le \beta_j \beta_{r(j)} \cdots \beta_{r^n(j)} \left[ d_{r^{n+1}(j)}(h, A) + \mathrm{diam}_{r^{n+1}(j)}(A) \right],
\end{align*}
where the last inequality follows from the triangle inequality. Since $Tf^* = f^*$, it follows from conditions (7.29) and (7.30) that $(T^{n+1}h)$ is $\sigma$-convergent to $f^*$.
Finally, we prove the uniqueness of $f^*$ by contradiction. If there is another fixed point $f^{**} \in A$ such that $f^{**} \neq f^*$, then condition (7.30) is satisfied for $h = f^{**}$. Thus, we obtain that the sequence $(T^n f^{**})$ is $\sigma$-convergent to $f^*$. Since $Tf^{**} = f^{**}$, $T^n f^{**} = f^{**}$ for all $n = 1, 2, \ldots$. It follows that $d_j(f^{**}, f^*) = 0$ for all $j \in J$, contradicting the assumption that $\mathcal{F}$ is $\sigma$-Hausdorff.
The next theorem is analogous to the Contraction Mapping Theorem.
We may refer to it as the Local Contraction Mapping Theorem.
Theorem 7.3.8 Let $T : \mathcal{F} \to \mathcal{F}$ be a $0$-local contraction with respect to a family $D = (d_j)_{j \in J}$ of semi-distances. Assume that $\mathcal{F}$ is sequentially $\sigma$-complete. Then the operator $T$ admits a unique fixed point $f^*$ in $\mathcal{F}$. Moreover, for any $f \in \mathcal{F}$, the sequence $(T^n f)_{n=1}^\infty$ is $\sigma$-convergent to $f^*$.

Proof. Fix any $f \in \mathcal{F}$ and define the set
\[
\mathcal{F}_f = \left\{ g \in \mathcal{F} : d_j(g, f) \le d_j(Tf, f) / (1 - \beta_j), \ \text{all } j \in J \right\}.
\]
Then using the triangle inequality we can show that $\mathcal{F}_f$ is nonempty, $\sigma$-bounded, $\sigma$-closed, and satisfies $T\mathcal{F}_f \subset \mathcal{F}_f$. Applying Theorem 7.3.7 with $A = \mathcal{F}_f$ gives the desired result.
We offer a few remarks on the above two theorems. First, the proof of Theorem 7.3.7 only requires each $\beta_j$ to be nonnegative. The requirement that $\beta_j \in [0, 1)$ is used only in the proof of the Local Contraction Mapping Theorem. Second, the index set $J$ used in the above two theorems can be uncountable. This property is useful in applications, for example, in the proof of the existence of recursive utility, as illustrated by Martins-Da-Rocha and Vailakis (2010). Third, the family $(\beta_j)_{j \in J}$ of contraction coefficients can be arbitrarily close to one. Rincón-Zapatero and Rodríguez-Palmero (2003, 2009) and Matkowski and Nowak (2009) require $\sup_{j \in J} \beta_j < 1$. As Martins-Da-Rocha and Vailakis (2010) show, this assumption is violated in many applications.
We now apply the Local Contraction Mapping Theorem to the deterministic control problem in (7.24). We focus on reward functions that are bounded above but unbounded below. Leading examples are logarithmic utility and power utility with a negative exponent. These reward functions tend to minus infinity at the origin. Thus, the origin is an undesirable point. Nevertheless, in many applications, it is natural to include it as part of the feasible set. For example, this helps make the constraint correspondence compact-valued. To include the origin in the domain, we set the state space $X = \mathbb{R}^{n_x}_+$ and consider extended real-valued functions.
Let $A^* = A \setminus \{0\}$ for any set $A \subset X$. Let $C^*(X)$ be the set of upper semicontinuous functions $f : X \to \mathbb{R} \cup \{-\infty\}$ that are continuous and finite on $X^*$. Given two functions $f, g \in C^*(X)$, we say that $f/g = O(1)$ at $x = 0$ if $f/g$ is bounded in some neighborhood $N \setminus \{0\}$ of $x = 0$ and $g \neq 0$ in this neighborhood. Define
\[
\mathrm{Graph}(\Gamma) = \{(x, y) \in X \times X : y \in \Gamma(x)\},
\]
\[
\Pi_0(x_0) = \{\pi \in \Pi(x_0) : J(x_0, \pi) > -\infty\}.
\]
We make the following assumptions:
Assumption 7.3.8 The correspondence Γ : X→ X is nonempty, compact-
valued, and continuous. In addition, Γ (0) = {0} .
Assumption 7.3.9 The reward function $u : \mathrm{Graph}(\Gamma) \to \mathbb{R} \cup \{-\infty\}$ is upper semicontinuous and continuous at every point where it is finite. The only point at which $u$ takes the value $-\infty$ is $(0, 0)$.
Assumption 7.3.10 For all x ∈ X∗, there is a continuous selection q of Γ
with u (x, q (x)) > −∞.
Assumption 7.3.11 There exist three functions $w_-$, $w_+$, and $w$ in $C^*(X)$ such that:

(a) $w_- \le w_+ < w$ and
\[
\frac{w_- - w}{w_+ - w} = O(1) \text{ at } 0, \tag{7.31}
\]
\[
\frac{w_+ - w}{Tw - w} = O(1) \text{ at } 0, \tag{7.32}
\]
(b) $Tw < w$, $Tw_- \ge w_-$, $Tw_+ \le w_+$, and
(c) $\Pi_0(x_0) \neq \emptyset$ for all $x_0 \in X^*$, and for each $(x_t) \in \Pi_0(x_0)$,
\[
\lim_{t \to \infty} \beta^t w_-(x_t) = \lim_{t \to \infty} \beta^t w_+(x_t) = 0.
\]
Assumption 7.3.12 There exists a countable increasing sequence $\{K_j\}$ of nonempty and compact subsets of $X$ with $X = \cup_j K_j$, such that every compact set $K \subset X$ satisfies $K \subset K_j$ for some $j$, and such that $\Gamma(K_j^*) \equiv \cup_{x \in K_j^*}\Gamma(x) \subset K_j^*$ for all $j = 1, 2, \ldots$
Assumptions 7.3.8 and 7.3.9 take into account the special role of the origin. Otherwise, they are standard assumptions. Assumption 7.3.10 ensures that the value function is finite on $X^*$ and allows for the use of the Theorem of the Maximum on $X^*$. This assumption was first introduced by Alvarez and Stokey (1998). Assumption 7.3.11 (a)-(b) allows us to construct a set
\[
[w_-, w_+] \equiv \{f \in C^*(X) : w_- \le f \le w_+\}
\]
such that the Bellman operator $T$ maps this order interval into itself. Moreover, the Bellman operator iterations will either be an increasing or a decreasing sequence that always lies in the given order interval. The following theorem shows that the limit under a suitable topology is the unique fixed point of $T$ on this order interval. Assumption 7.3.11 (c) is a transversality condition that ensures that the fixed point of $T$ coincides with the value function. Assumption 7.3.12 and conditions (7.31)-(7.32) allow us to define a family of semi-distances that are used in the Local Contraction Mapping Theorem.
Theorem 7.3.9 Let $T$ be the Bellman operator defined in (7.25) satisfying Assumptions 7.3.8-7.3.12. Then the following statements hold:

(a) The Bellman operator has a unique fixed point $V^* \in [w_-, w_+]$.

(b) The value function $V = V^*$.

(c) For any $f \in [w_-, w_+]$, the sequence $(T^n f)$ converges to $V$ under the topology associated with the family $(d_j)_{j=1}^\infty$ of semi-distances defined on $[w_-, w_+]$ by:
\[
d_j(f, g) = \sup_{x \in K_j^*} \left| \ln\!\left(\frac{f - w}{w_+ - w}(x)\right) - \ln\!\left(\frac{g - w}{w_+ - w}(x)\right) \right|. \tag{7.33}
\]
This theorem is adapted from Rincón-Zapatero and Rodríguez-Palmero (2003) and Martins-Da-Rocha and Vailakis (2010). Its proof is lengthy and omitted here. We refer readers to these two papers for details. One just needs to verify that all conditions of the Local Contraction Mapping Theorem are satisfied. The hard part is to prove that (a) $T$ is a self-map on $[w_-, w_+]$, and (b) $T$ is a $0$-local contraction with respect to $(d_j)_{j=1}^\infty$. For (a), we need to apply a modified version of the Theorem of the Maximum, Lemma 2 of Alvarez and Stokey (1998), and another modification for upper semicontinuous functions (see ...). For (b), we need to use the convexity property of the Bellman operator $T$. Rincón-Zapatero and Rodríguez-Palmero (2003) prove that for all $j = 1, 2, \ldots$,
\[
d_j(Tf, Tg) \le \left(1 - e^{-\mu_j}\right) d_j(f, g), \quad \text{for all } f, g \in [w_-, w_+],
\]
where
\[
\mu_j = \sup_{f \in [w_-, w_+]} d_j(f, Tw).
\]
To apply the above theorem, one needs to find suitable functions $(w_-, w_+, w)$ such that the conditions in the theorem are satisfied. It may seem that Assumptions 7.3.11 and 7.3.12 are hard to verify. However, in some applications, we may set $w = 0$. In this case, $Tw(x) = T0(x) = \sup_{y \in \Gamma(x)} u(x, y)$. This makes verification of (7.31) and (7.32) simple. If the semi-distance in (7.33) is well defined on the whole set $X$, then we can replace $K_j^*$ with $X$. In this case, the assumption $\Gamma(K_j^*) \subset K_j^*$ can be dispensed with.

We give two examples taken from Alvarez and Stokey (1998) and Rincón-Zapatero and Rodríguez-Palmero (2003) to illustrate Theorem 7.3.9.
Example 7.3.3 (Homogeneous negative case) Let $u(x, y)$ be a negative function, homogeneous of degree $\theta < 0$. Assume that $u(0, 0) = -\infty$ and that the graph of $\Gamma$ is a cone. In addition, we assume the following:

Condition (a): $u(x, y) \le -a\|x\|^\theta$ for some $a > 0$, for all $x \in X$ and $y \in \Gamma(x)$.

Condition (b): For all $x \in X^*$, there exists a continuous selection $q$ from $\Gamma$ with $\|q(x)\| \ge \alpha\|x\|$ for some $\alpha > 0$ such that $\beta\alpha^\theta < 1$, and $u(x, q(x)) \ge -b\|x\|^\theta$ for some $b > 0$.

Condition (b) implies that Assumption 7.3.10 holds. We verify that Assumption 7.3.11 is satisfied by constructing:
\[
w_-(x) = \frac{-b}{1 - \beta\alpha^\theta}\,\|x\|^\theta, \qquad
w_+(x) = T0(x) = \sup_{y \in \Gamma(x)} u(x, y), \qquad
w(x) = 0.
\]
Since $-b\|x\|^\theta \le w_+ \le -a\|x\|^\theta$, we obtain that $w_-/w_+ = O(1)$, so that (7.31) holds. In addition, since
\[
\left|\ln\!\left(\frac{w_- - w}{w_+ - w}\right)(x)\right| = \left|\ln\!\left(\frac{w_-}{w_+}\right)(x)\right| \le \frac{b}{a\,(1 - \beta\alpha^\theta)}
\]
for all $x \in X^*$, we can replace $K_j^*$ with $X$ in equation (7.33). Thus, we do not need the assumption $\Gamma(K_j^*) \subset K_j^*$. The other parts of Assumption 7.3.11 (a) also hold.
Example 7.3.4 (Logarithmic utility and linear technology) Let $X = \mathbb{R}^{n_x}_+$. The reward function is $u(x, y) = \ln(\psi(x, y))$, where $\psi : \mathrm{Graph}(\Gamma) \to \mathbb{R}_+$ is continuous, $\psi(0, 0) = 0$, and $\mathrm{Graph}(\Gamma)$ is a cone. When $\psi$ is homogeneous of degree one, this model is known as the homogeneous-of-degree-zero case analyzed by Alvarez and Stokey (1998). However, this last hypothesis is not necessary and can be eliminated. Other assumptions used in Alvarez and Stokey (1998) can also be weakened, as shown in the conditions we assume below:

Condition (a): $\|y\| \le \varepsilon\|x\|$ for some $\varepsilon > 0$, for all $(x, y) \in \mathrm{Graph}(\Gamma)$.

Condition (b): $\psi(x, y) \le B\,(\|x\|^n + \|y\|^n)$ for some $B, n > 0$, for all $(x, y) \in \mathrm{Graph}(\Gamma)$.

Condition (c): For all $x \in X^*$, there exists a continuous selection $q$ of $\Gamma$ with $q(x) \ge \alpha x$ for some $\alpha > 0$, and such that $\psi(x, q(x)) \ge b\|x\|^n$ for some $b > 0$.
We construct the functions
\[
w_-(x) = \frac{\ln b}{1 - \beta} + \frac{\beta n \ln\alpha}{(1 - \beta)^2} + \frac{n}{1 - \beta}\ln\|x\|,
\]
\[
w_+(x) = \frac{\ln\!\big[B(1 + \varepsilon^n)\big]}{1 - \beta} + \frac{\beta n \ln\varepsilon}{(1 - \beta)^2} + \frac{n}{1 - \beta}\ln\|x\|,
\]
\[
w(x) = w_+(2x).
\]
Since $\ln\!\big(\frac{w_- - w}{w_+ - w}\big)$ is a constant independent of $x$, (7.31) holds, and we can replace $K_j^*$ with $X$ in (7.33). Thus, we do not need the assumption $\Gamma(K_j^*) \subset K_j^*$. Since the other assumptions are easy to verify, we only need to check Assumption 7.3.11 (c).
to check Assumption 7.3.11 (c).
Let x0 �= 0. Then the sequence (xt) ∈ Π0 (x0) where xt+1 = q (xt) , so
Π0 (x0) �= ∅. Finally, for all (xt) ∈ Π0 (x0) ,
limt→∞ βtn ln ‖xt‖ = lim
t→∞ βt ln (B (1 + εn) ‖xt‖n) ≥ − limt→∞βtu− (xt, xt+1) = 0,
limt→∞ βt ln ‖xt‖ ≤ lim
t→∞βt ln εt + limt→∞ βt ln (x0) = 0.
Thus, limt→∞ βtw+ (xt) = limt→∞ βtw− (xt) = 0.
7.4 The Maximum Principle and Transversality Condition
Consider the optimal control problem studied in the previous section. For simplicity, we assume that $X \subset \mathbb{R}_+$ and $A \subset \mathbb{R}$ are convex. To present the infinite-horizon Maximum Principle, we write the Lagrangian form as:
\[
L = E\sum_{t=0}^\infty \left[\beta^t u(x_t, z_t, a_t) - \beta^{t+1}\mu_{t+1}\big(x_{t+1} - \phi(x_t, a_t, z_t, z_{t+1})\big)\right], \tag{7.34}
\]
where $\beta^{t+1}\mu_{t+1}$ is the discounted Lagrange multiplier associated with (5.14).$^2$ Under Assumptions 7.3.4 and 6.4.14, we can derive the first-order conditions for an interior solution:
\begin{align*}
a_t &: \quad u_a(x_t, z_t, a_t) + \beta E_t \mu_{t+1}\phi_a(x_t, a_t, z_t, z_{t+1}) = 0, \quad t \ge 0, \\
x_t &: \quad \mu_t = u_x(x_t, z_t, a_t) + \beta E_t \mu_{t+1}\phi_x(x_t, a_t, z_t, z_{t+1}), \quad t \ge 1.
\end{align*}
Define the Hamiltonian function as:
\[
H\big(x_t, a_t, z_t, \mu_{t+1}\big) = u(x_t, z_t, a_t) + \beta E_t \mu_{t+1}\phi(x_t, a_t, z_t, z_{t+1}),
\]
and express the preceding first-order conditions as:
\[
\frac{\partial H\big(x_t, a_t, z_t, \mu_{t+1}\big)}{\partial a} = 0, \quad t \ge 0, \tag{7.35}
\]
\[
\frac{\partial H\big(x_t, a_t, z_t, \mu_{t+1}\big)}{\partial x} = \mu_t, \quad t \ge 1. \tag{7.36}
\]
To study the connection with dynamic programming, we consider the Bellman equation (7.19). Given suitable differentiability assumptions, we derive the first-order condition for $a$:
\[
0 = u_a(x, z, a) + \beta \int V_x\big(\phi(x, a, z, z'), z'\big)\, \phi_a(x, a, z, z')\, Q(z, dz').
\]
The envelope condition is given by:
\[
V_x(x, z) = u_x(x, z, a) + \beta \int V_x\big(\phi(x, a, z, z'), z'\big)\, \phi_x(x, a, z, z')\, Q(z, dz').
\]
$^2$As discussed in Section 6.5, if the state transition equation is $x_{t+1} = \phi(x_t, a_t, z_t)$, then we may replace $\beta^{t+1}\mu_{t+1}$ with $\beta^t\lambda_t$.
After setting μt = Vx (xt, zt) , we can see that the above two conditions are
equivalent to (7.35) and (7.36). The Lagrange multiplier μt is interpreted
as the shadow value of the state. As in the finite-horizon case, the beauty
of the Maximum Principle is that the Lagrange multiplier is the object of
study, instead of the value function. As studied in the last section, analyzing
the existence and properties of the value function is nontrivial, especially for
unbounded reward functions. By contrast, unbounded reward functions do
not pose any difficulty for the Maximum Principle to work.3
Next, we turn to the solution of the first-order conditions. Suppose we can use (7.35) to solve for $a_t$ as a function of $x_t$, $z_t$, and $\mu_{t+1}$. We then substitute this solution into equation (7.36) and the state transition equation (5.14) to obtain a system of two first-order nonlinear difference equations for $(x_{t+1}, \mu_{t+1})_{t=0}^\infty$. We need two boundary conditions to obtain a solution. One boundary condition is the initial value for $x_0$. Unlike in the finite-horizon case analyzed in Section 6.5, where we have a terminal condition for $\mu_T$, there is no well-defined terminal condition in the infinite-horizon case. Instead, the second boundary condition takes the form of the transversality condition:
\[
\lim_{T \to \infty} E\left[\beta^T \mu_T x_T\right] = 0. \tag{7.37}
\]
The interpretation is that the discounted shadow value of the state variable must be zero at infinity. Intuitively, this condition prevents the overaccumulation of wealth. One should not confuse it with the no-Ponzi-game condition, which is also called a transversality condition in the literature. The no-Ponzi-game condition prevents the overaccumulation of debt, as we show in Section 8.3. It is a restriction on the choice set of feasible plans. By contrast, the transversality condition (7.37) is an optimality condition for a given choice set. In Section 8.3, we will show that these two conditions are closely related given the first-order conditions.
3See Chow (1997) for extensive applications of the Maximum Principle to economics.
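In a deterministic setting, this two-point boundary-value structure (an initial condition on $x_0$ plus the transversality condition in place of a terminal condition) can be handled by shooting on the first-order conditions. A sketch for the growth model with log utility, Cobb-Douglas production, and full depreciation, where the known closed-form policy $c(k) = (1 - \alpha\beta)k^\alpha$ provides a check (the model and parameter values are illustrative, not from the text):

```python
# Forward shooting for the deterministic growth model
#   max sum_t beta**t * ln(c_t)  s.t.  k_{t+1} = k_t**alpha - c_t,
# using the Euler equation c_{t+1} = beta * alpha * k_{t+1}**(alpha-1) * c_t.
# The model and parameter values are illustrative; the closed-form policy
# c(k) = (1 - alpha*beta) * k**alpha serves as a check.
alpha, beta, k0, T = 0.3, 0.95, 0.1, 80
k_ss = (alpha * beta)**(1.0 / (1.0 - alpha))   # steady-state capital

def too_low(c0):
    """True if initial consumption c0 undershoots the saddle path
    (capital ends up above the steady state after T periods)."""
    k, c = k0, c0
    for _ in range(T):
        k_next = k**alpha - c
        if k_next <= 1e-12:                    # capital exhausted: c0 too high
            return False
        c = beta * alpha * k_next**(alpha - 1.0) * c
        k = k_next
    return k > k_ss

lo, hi = 1e-9, k0**alpha                       # bisect on the initial control c0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if too_low(mid):
        lo = mid
    else:
        hi = mid
c0_hat = 0.5 * (lo + hi)
print(c0_hat, (1.0 - alpha * beta) * k0**alpha)
```

Because the saddle path is unstable under forward iteration, small errors in the guessed initial control grow geometrically, which is exactly why bisection on the divergence direction pins down the unique initial condition consistent with the transversality condition.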
Now, we state a sufficiency result for the one-dimensional case.

Theorem 7.4.1 (Sufficiency of the necessary conditions from the Maximum Principle and the transversality condition) Let $X \subset \mathbb{R}_+$ and $A \subset \mathbb{R}$ be convex. Suppose that Assumptions 5.1.2, 6.4.3, 6.4.12, 6.4.14, 7.3.3, and 7.3.4 hold. Let $(x_{t+1}^*, a_t^*, \mu_{t+1}^*)_{t=0}^\infty$ satisfy (5.14), (7.35), (7.36), and (7.37). Suppose that $a_t^*(z^t) \in \operatorname{int} \Gamma\big(x_t^*(z^{t-1}), z_t\big)$, $x_t^* \in \operatorname{int} X$, and $\mu_{t+1}^* \ge 0$ for $t \ge 0$. Then $(a_t^*)$ is an optimal plan.
Proof. We define
\[
D_T \equiv E\sum_{t=0}^T \beta^t u(x_t^*, z_t, a_t^*) - E\sum_{t=0}^T \beta^t u(x_t, z_t, a_t),
\]
for any feasible plan $(a_t)_{t=0}^\infty$ and the associated state process $(x_t)_{t=1}^\infty$ satisfying (5.14). Since both $(x_{t+1}, a_t)_{t=0}^\infty$ and $(x_{t+1}^*, a_t^*)_{t=0}^\infty$ satisfy the state transition equation, we rewrite the preceding equation as:
\begin{align*}
D_T = \; & E\sum_{t=0}^T \left[\beta^t u(x_t^*, z_t, a_t^*) - \beta^{t+1}\mu_{t+1}^*\big(x_{t+1}^* - \phi(x_t^*, a_t^*, z_t, z_{t+1})\big)\right] \\
& - E\sum_{t=0}^T \left[\beta^t u(x_t, z_t, a_t) - \beta^{t+1}\mu_{t+1}^*\big(x_{t+1} - \phi(x_t, a_t, z_t, z_{t+1})\big)\right].
\end{align*}
Substituting the definition of the Hamiltonian and using the concavity of $H$ in $(x, a)$, we obtain
\begin{align*}
D_T = \; & E\sum_{t=0}^T \beta^t H\big(x_t^*, a_t^*, z_t, \mu_{t+1}^*\big) - E\sum_{t=0}^T \beta^t H\big(x_t, a_t, z_t, \mu_{t+1}^*\big) - E\sum_{t=0}^T \beta^{t+1}\mu_{t+1}^*\big(x_{t+1}^* - x_{t+1}\big) \\
\ge \; & E\sum_{t=0}^T \beta^t \frac{\partial H\big(x_t^*, a_t^*, z_t, \mu_{t+1}^*\big)}{\partial x}\,(x_t^* - x_t)
+ E\sum_{t=0}^T \beta^t \frac{\partial H\big(x_t^*, a_t^*, z_t, \mu_{t+1}^*\big)}{\partial a}\,(a_t^* - a_t) \\
& - E\sum_{t=0}^T \beta^{t+1}\mu_{t+1}^*\big(x_{t+1}^* - x_{t+1}\big).
\end{align*}
Rearranging terms and using (7.35), (7.36), and $x_0^* = x_0$, we obtain
\[
D_T \ge -E\beta^{T+1}\mu_{T+1}^*\big(x_{T+1}^* - x_{T+1}\big).
\]
Since $\mu_{T+1}^* \ge 0$ and $x_{T+1} \ge 0$, we deduce that
\[
D_T \ge -E\beta^{T+1}\mu_{T+1}^* x_{T+1}^*.
\]
Taking the limit as $T \to \infty$ and using the transversality condition (7.37), we obtain the desired result.
Note that if $\phi$ is linear in $(x, a)$, then the assumption $\mu_{t+1}^* \ge 0$ is not needed. The assumption that $X \subset \mathbb{R}_+$ is satisfied in many economic problems. For example, the capital stock as a state variable is always nonnegative. In many problems, we can make a suitable change of variables such that the state variables become nonnegative. For example, in the consumption-saving model analyzed in Section 6.6.2, we can choose either the modified asset holdings $\hat{a}_t = a_t + b \ge 0$ or cash on hand $x_t \ge 0$ as the state variable.
Proving the necessity of the transversality condition is quite difficult. In the next section, we analyze this issue for a special class of control problems, the Euler class.
7.5 Euler Equations and Transversality Condition
In this section, we consider the following Euler class of stochastic optimal control problems analyzed in Section 7.3:
\[
\max_{(x_{t+1})_{t=0}^\infty} J\big(x_0, z_0, (x_{t+1})_{t=0}^\infty\big) \equiv E\left[\sum_{t=0}^\infty \beta^t u(x_t, z_t, x_{t+1})\right], \tag{7.38}
\]
subject to $x_{t+1}(z^t) \in \Gamma\big(x_t(z^{t-1}), z_t\big)$ for all $z^t \in Z^t$, $t = 0, 1, \ldots$, and $(x_0, z_0)$ given. Under suitable differentiability assumptions, we can derive first-order necessary conditions for (7.38).
We can derive these conditions in two ways. First, we may use the dynamic programming approach. Consider the associated dynamic programming problem (7.21). Under suitable differentiability assumptions, we can derive the first-order condition for an interior solution:
\[
u_a(x, z, g(x, z)) + \beta \int V_x\big(g(x, z), z'\big)\, Q(z, dz') = 0,
\]
where $x' = g(x, z)$ is the optimal policy function. Combining this with the envelope condition (7.22) yields the Euler equation:
\[
0 = u_a(x, z, g(x, z)) + \beta \int u_x\big(g(x, z), z', g(g(x, z), z')\big)\, Q(z, dz').
\]
This is a functional equation for the optimal policy function $g$. If we set $x_t^* = x$, $z_t = z$, and $x_{t+1}^* = g(x_t^*, z_t)$, we obtain the Euler equation in sequence form:
\[
u_a\big(x_t^*, z_t, x_{t+1}^*\big) + \beta E_t u_x\big(x_{t+1}^*, z_{t+1}, x_{t+2}^*\big) = 0, \quad t \ge 0, \tag{7.39}
\]
where $E_t$ is the conditional expectation operator given the information available at time $t$. Both forms of the Euler equation are useful for the numerical analysis studied later.
Another way of deriving the Euler equation is to note that if the plan $(x_{t+1}^*)_{t=0}^\infty$ is optimal, then $x_{t+1}^*$ must solve the following problem:
\[
\max_y \; u(x_t^*, z_t, y) + \beta E_t u\big(y, z_{t+1}, x_{t+2}^*\big)
\]
subject to
\[
y \in \Gamma(x_t^*, z_t), \quad x_{t+2}^* \in \Gamma(y, z_{t+1}).
\]
The idea is that a small variation of $x_{t+1}^*$ should not improve the total expected reward. This idea underlies the classical variational approach to optimal control. It is also related to the Maximum Principle studied in the last section.
Unlike in the finite-horizon case, we cannot solve the Euler equation simply by backward induction because we do not have an explicit terminal condition. Instead, the terminal condition for the infinite-horizon case typically takes the form of the so-called transversality condition:
\[
\lim_{T \to \infty} E\left[\beta^T u_a\big(x_T^*, z_T, x_{T+1}^*\big) \cdot x_{T+1}^*\right] = 0. \tag{7.40}
\]
To understand the intuition behind this condition, we start with a finite-horizon social planner's problem:
\[
\sup_{(c_t)_{t=0}^T} E\left[\sum_{t=0}^T \beta^t u(c_t)\right]
\]
subject to
\[
x_{t+1} = z_t F(x_t) - c_t, \quad t = 0, 1, \ldots, T, \quad (x_0, z_0) \text{ given},
\]
where $x_t \ge 0$ represents the capital stock. In the last period, the planner solves the problem:
\[
\max_{x_{T+1}} E\left[\beta^T u(z_T F(x_T) - x_{T+1})\right].
\]
Clearly, there must be a terminal condition that constrains the capital stock $x_{T+1}$. Otherwise, the planner will make $x_{T+1}$ as large (small) as possible if $u' < (>)\, 0$. Given the assumption that $u' \ge 0$ and $x_{T+1} \ge 0$, we obtain the first-order condition for $x_{T+1}$:
\[
E\left[\beta^T u'(c_T^*)\right] \ge 0, \quad \text{with strict inequality only if } x_{T+1}^* = 0.
\]
We rewrite it as
\[
E\left[\beta^T u'(c_T^*) \cdot x_{T+1}^*\right] = 0. \tag{7.41}
\]
This is the transversality condition in the finite-horizon case. The economic meaning is that the expected discounted shadow value of the terminal state (e.g., capital or wealth) must be zero. Note that we have used the inner product notation in (7.41) to anticipate the multi-dimensional case considered below.
In the infinite-horizon case, one may take the limit of (7.41) to obtain (7.40). Using the Euler equation, we may rewrite it as:
\[
\lim_{T \to \infty} E\left[\beta^T u_x\big(x_T^*, z_T, x_{T+1}^*\big) \cdot x_T^*\right] = 0. \tag{7.42}
\]
This form has no counterpart in the finite-horizon case or in the continuous-time case. Note that from the analysis in the previous section and the envelope condition for the Euler class, we deduce that $\mu_T = V_x(x_T^*, z_T) = u_x\big(x_T^*, z_T, x_{T+1}^*\big)$. It follows that (7.42) is a special case of (7.37).
Researchers used to believe that it is nontrivial to prove the necessity of the transversality condition (see Weitzman (1973) and Ekeland and Scheinkman (1986)). In a series of recent papers, Kamihigashi (2000, 2001, 2002, 2003) simplifies the proof significantly. I now present one of his results.

Theorem 7.5.1 (Necessity of the Euler and transversality conditions) Let $X \subset \mathbb{R}^{n_x}$ be a convex set containing the origin. Let $0 \in \Gamma(0, z)$ for any $z \in Z$. Suppose Assumptions 6.4.3, 6.4.11, and 7.3.2-7.3.4 hold. Let $(x_{t+1}^*)_{t=0}^\infty$ be an interior optimal plan that gives a finite discounted expected reward $J\big(x_0, z_0, (x_{t+1}^*)\big)$. Suppose that $J\big(x_0, z_0, (\xi x_{t+1}^*)\big)$ is also finite for some $\xi \in [0, 1)$. Then $(x_{t+1}^*)_{t=0}^\infty$ satisfies the Euler equation (7.39) and the transversality condition (7.40).
Proof. We have outlined the proof of the necessity of the Euler equation, so we focus on the transversality condition. The key is the following result: let $f : [0, 1] \to \mathbb{R} \cup \{-\infty\}$ be a concave function with $f(1) > -\infty$. Then for any $\gamma \in [0, 1)$ and $\lambda \in [\gamma, 1)$,
\[
\frac{f(1) - f(\lambda)}{1 - \lambda} \le \frac{f(1) - f(\gamma)}{1 - \gamma}. \tag{7.43}
\]
To prove this, let $\alpha = (1 - \lambda)/(1 - \gamma) \in (0, 1]$. Since $\lambda = \alpha\gamma + (1 - \alpha)$, concavity of $f$ gives $f(\lambda) \ge \alpha f(\gamma) + (1 - \alpha) f(1)$. Rearranging yields the above inequality.
Now, fix any integer $T \ge 0$. By the convexity assumptions on $X$ and $\Gamma$ and the assumption that $0 \in \Gamma(0, z)$ for any $z \in Z$, $\{x_0^*, \ldots, x_T^*, \lambda x_{T+1}^*, \lambda x_{T+2}^*, \ldots\}$ is a feasible plan for some $\lambda \in [\xi, 1)$. By optimality,
\begin{align*}
0 \ge E\Big\{ & \beta^T u\big(x_T^*, z_T, \lambda x_{T+1}^*\big) - \beta^T u\big(x_T^*, z_T, x_{T+1}^*\big) \\
& + \sum_{t=T+1}^\infty \left[\beta^t u\big(\lambda x_t^*, z_t, \lambda x_{t+1}^*\big) - \beta^t u\big(x_t^*, z_t, x_{t+1}^*\big)\right] \Big\}.
\end{align*}
By Assumption 7.3.3, u(λx*_t, z_t, λx*_{t+1}) is a concave function of λ.
Using (7.43) with γ = ξ, we derive:

   E β^T [u(x*_T, z_T, λx*_{T+1}) − u(x*_T, z_T, x*_{T+1})] / (1 − λ)

   ≤ E Σ_{t=T+1}^∞ β^t [u(x*_t, z_t, x*_{t+1}) − u(λx*_t, z_t, λx*_{t+1})] / (1 − λ)

   ≤ E Σ_{t=T+1}^∞ β^t [u(x*_t, z_t, x*_{t+1}) − u(ξx*_t, z_t, ξx*_{t+1})] / (1 − ξ).
Since (x*_{t+1}) is an interior solution, taking λ → 1 yields

   0 ≤ −E β^T u_a(x*_T, z_T, x*_{T+1}) · x*_{T+1}

     ≤ E Σ_{t=T+1}^∞ β^t [u(x*_t, z_t, x*_{t+1}) − u(ξx*_t, z_t, ξx*_{t+1})] / (1 − ξ),

where the first inequality follows from Assumptions 7.3.2 and 7.3.4 and
the Euler equation. The last term above converges to zero as T → ∞ because
J(x_0, z_0, (ξx*_{t+1})) and J(x_0, z_0, (x*_{t+1})) are finite. Letting
T → ∞ yields the transversality condition (7.42).
The idea of the proof is simple. If we proportionally shift down the
continuation plan at any date by a small amount, and if this results in a
feasible plan that reduces the optimal total discounted expected reward by
a finite value, then the transversality condition is necessary. This
condition is easy to check for models with a homogeneity property or
models with a bounded reward function. From the proof we can see that some
assumptions can be weakened. For example, we can weaken Assumption 6.4.3
and allow a general stochastic process for the shock (z_t).
Note that if we can apply the above argument to each component of the
state vector, then the transversality condition (7.40) applies to all
components of the state vector. As a result, we obtain n_x equations as
terminal boundary conditions. The Euler equation gives a system of n_x
second-order difference equations. Given n_x initial conditions and n_x
terminal transversality conditions, we can solve this system. We call this
method the Euler equation approach.
Now we turn to the sufficiency part. Its proof is similar to that of
Theorem 7.4.1.
Theorem 7.5.2 (Sufficiency of the Euler and transversality conditions) Let
X ⊂ R^{n_x}_+ be convex and suppose Assumptions 6.4.3 and 7.3.2-7.3.4
hold. If the plan (x*_{t+1})_{t=0}^∞ with
x*_{t+1}(z^t) ∈ int(Γ(x*_t(z^{t−1}), z_t)) for any z^t ∈ Z^t satisfies the
Euler equation (7.39) and the transversality condition (7.40), then it is
optimal for problem (7.38).
Proof. Consider any feasible plan (x_t) ∈ Π(x_0, z_0) and any integer
T ≥ 0. We compute

   D_T ≡ E Σ_{t=0}^T β^t [u(x*_t, z_t, x*_{t+1}) − u(x_t, z_t, x_{t+1})]

       ≥ E Σ_{t=0}^T β^t [u_x(x*_t, z_t, x*_{t+1}) · (x*_t − x_t)
           + u_a(x*_t, z_t, x*_{t+1}) · (x*_{t+1} − x_{t+1})]

       = E Σ_{t=0}^{T−1} β^t [u_a(x*_t, z_t, x*_{t+1})
           + βu_x(x*_{t+1}, z_{t+1}, x*_{t+2})] · (x*_{t+1} − x_{t+1})
           + E β^T u_a(x*_T, z_T, x*_{T+1}) · (x*_{T+1} − x_{T+1}),

where the inequality follows from Assumption 7.3.3 and the last equality
follows from x*_0 = x_0. Using the Euler equation (7.39), we deduce that
the first expectation on the right-hand side of the last equality is zero.
By Assumption 7.3.2 and the Euler equation, we deduce that u_a < 0. It
follows from x_{T+1} ≥ 0 that

   D_T ≥ E β^T u_a(x*_T, z_T, x*_{T+1}) · x*_{T+1}.
Taking the limit as T → ∞ and using the transversality condition (7.40)
yields

   lim_{T→∞} E Σ_{t=0}^T β^t [u(x*_t, z_t, x*_{t+1}) − u(x_t, z_t, x_{t+1})] ≥ 0.

Thus, J(x_0, z_0, (x*_{t+1})) ≥ J(x_0, z_0, (x_{t+1})), establishing the
desired result.
Now we consider an example to illustrate the previous theorems.
Example 7.5.1 (Phelps (1962) and Levhari and Srinivasan (1969)) Consider
the following consumption-saving problem:

   max_{(c_t)} E Σ_{t=0}^∞ β^t c_t^{1−γ}/(1 − γ),   γ > 0, γ ≠ 1,

subject to

   x_{t+1} = R_{t+1}(x_t − c_t),   x_0 > 0 given,

where R_{t+1} > 0 is independently and identically drawn from some
distribution. There is no labor income. We interpret γ = 1 as log utility.
We can derive the Euler equation and the transversality condition:

   c_t^{−γ} = βE_t[R_{t+1} c_{t+1}^{−γ}],

   lim_{t→∞} E[β^t c_t^{−γ} x_t] = 0.
We conjecture that the consumption policy is given by c_t = Ax_t, where
A ∈ (0, 1) is a constant to be determined. Substituting this conjecture
into the Euler equation and using the budget constraint, we obtain:

   A^{−γ} x_t^{−γ} = βE[R_{t+1} A^{−γ} R_{t+1}^{−γ} (x_t − Ax_t)^{−γ}].

Solving yields

   A = 1 − (βE[R_{t+1}^{1−γ}])^{1/γ}.   (7.44)

For log utility (γ = 1), we obtain A = 1 − β. In general, for A ∈ (0, 1),
we need to assume that

   0 < (βE[R_{t+1}^{1−γ}])^{1/γ} < 1.
Now we show that the above consumption policy is optimal by verifying the
transversality condition. Using the budget constraint and the policy
function, we derive:

   E[β^t c_t^{−γ} x_t] = E[β^t A^{−γ} x_t^{1−γ}]
                       = E[β^t A^{−γ} R_t^{1−γ} (1 − A)^{1−γ} x_{t−1}^{1−γ}]
                       = ... = β^t A^{−γ} (1 − A)^{t(1−γ)} x_0^{1−γ} E[Π_{s=1}^t R_s^{1−γ}].

Since R_t is IID, it follows from (7.44) that

   E[β^t c_t^{−γ} x_t] = A^{−γ} (1 − A)^t x_0^{1−γ}.

Since A ∈ (0, 1), we have verified that the transversality condition is
satisfied.
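As a quick numerical sanity check of (7.44), the following Python sketch
(with a hypothetical two-point distribution for R_{t+1}; all numbers are
illustrative) computes A, verifies the condition βE[R^{1−γ}] = (1 − A)^γ
implied by the Euler equation, and confirms the geometric decay of
E[β^t c_t^{−γ} x_t]:

```python
import numpy as np

# Hypothetical two-point return distribution (illustrative numbers only)
beta, gamma = 0.95, 2.0
R_vals = np.array([0.9, 1.2])
probs = np.array([0.5, 0.5])

ER = probs @ R_vals**(1 - gamma)      # E[R^{1-gamma}]
A = 1 - (beta * ER)**(1 / gamma)      # equation (7.44)

# The conjecture c_t = A x_t satisfies the Euler equation iff
# beta * E[R^{1-gamma}] = (1 - A)^gamma.
euler_gap = beta * ER - (1 - A)**gamma

# E[beta^t c_t^{-gamma} x_t] = A^{-gamma} (1 - A)^t x_0^{1-gamma} -> 0
x0 = 1.0
tvc = [A**(-gamma) * (1 - A)**t * x0**(1 - gamma) for t in range(1001)]
print(A, euler_gap, tvc[-1])          # tvc term shrinks geometrically
```

The check only exercises the algebra of the closed-form solution; it is
not a substitute for the verification argument in the text.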
7.6 Exercises

1. Prove that C_ϕ(X) is a complete metric space when endowed with the
weighted sup norm defined in (7.23).

2. Let T be the Bellman operator defined in (7.25). Prove that

   |Tf − Tg|(x) ≤ β sup_{x′∈Γ(x)} |f(x′) − g(x′)|,

for any continuous functions f, g on X.
3. Replace the production function in Example 7.3.1 so that the constraint
correspondence is Γ(K) = [0, AK + K^α], A > 1 and α ∈ (0, 1). Use the
weighted norm approach to analyze the Bellman equation.
4. Consider the deterministic Ramsey growth model

   max Σ_{t=0}^∞ β^t u(C_t)

subject to

   C_t + K_{t+1} = f(K_t),   K_0 given.

Assume that f is continuous and strictly increasing. In addition,
f(0) = 0 and there exists some x_m such that x ≤ f(x) ≤ x_m for all
0 ≤ x ≤ x_m, and f(x) < x for all x > x_m. Prove that one can restrict the
state space to X = [0, x_m]. This exercise illustrates that even though
the reward function is unbounded, one may choose a suitable compact state
space so that one can transform the problem into a problem with bounded
reward.
5. Consider the stochastic growth model:

   max E[Σ_{t=0}^∞ β^t ln(C_t)]

subject to

   C_t + K_{t+1} = z_t K_t^α,   K_0 given,

where z_t follows a Markov process with transition function Q. Derive the
optimal consumption and investment policies and the value function using
the Euler equation approach or the value function approach. Verify that
they are indeed optimal.
6. Use the value function approach to derive the optimal consumption
policy and the value function in Example 7.5.1, and verify that the
solution is optimal.

7. (Schechtman and Escudero (1977)) Let T → ∞ in Exercise 8 of Chapter 6.
Solve for the limiting consumption and saving policies. Prove that the
limiting policies are optimal for the infinite-horizon case and derive the
value function.
8. (Phelps (1962)) Consider the model in Exercise 6 of Chapter 6. Let the
utility function be u(c) = c^{1−γ}/(1 − γ). Solve for the consumption
policy in the finite-horizon case. Then let the time horizon go to
infinity and derive the consumption policy for the infinite-horizon case.
Prove that the limiting policy is optimal.

9. (Levhari, Mirman and Zilcha (1980)) Let the horizon T → ∞ in Exercise 9
of Chapter 6. Derive the limiting policy and value functions. Prove that
these functions are optimal for the infinite-horizon problem.

10. Solve for the value function in Example 7.5.1. Relate the
transversality condition (7.4) to the transversality condition (7.40).
Chapter 8
Applications
In this chapter, we study five economic problems to illustrate how the theory
developed previously can be applied. We focus on stochastic models. For
some problems, we also consider their deterministic counterparts.
8.1 Option Exercise
Consider the option exercise problem presented in Section 5.2.2, which
belongs to the class of optimal stopping problems. For simplicity, assume
that the lump-sum investment payoff is drawn independently and identically
from the distribution F over [0, B], where B > I. We write the Bellman
equation as:

   V(z) = max{β ∫ V(z′) dF(z′), z − I},

where the maximization is over two choices: (i) make the investment and
obtain profits z − I, or (ii) wait and draw a new payoff from the
distribution F next period. The function V represents the option value. By
the dynamic programming theory developed in Section 7.3.1, we can
immediately show that there exists a unique solution V to the Bellman
equation. In addition, it is increasing and convex (see Figure 8.1).
[Figure 8.1: Option value. This figure plots the value function V(z) for
the irreversible project investment problem.]

This figure reveals that there exists a unique threshold value z* such that
the decision maker chooses to invest whenever the project payoff exceeds z∗
for the first time. The value function V satisfies:
   V(z) = { z − I,                  z > z*,
          { β ∫_0^B V(z′) dF(z′),  z ≤ z*.
At z*, we obtain:

   V(z*) = β ∫_0^B V(z′) dF(z′) = z* − I

         = β ∫_0^{z*} (z* − I) dF(z′) + β ∫_{z*}^B (z′ − I) dF(z′)

         = β ∫_0^B (z* − I) dF(z′) + β ∫_{z*}^B (z′ − z*) dF(z′).

Thus, the threshold value is determined by the equation:

   z* − I = (β/(1 − β)) ∫_{z*}^B (z′ − z*) dF(z′).
One can verify that there exists a unique solution z* ∈ (0, B) to the
above equation, and that z* > I. According to the net present value rule
often used in capital budgeting, the investment threshold is equal to I.
With irreversibility, the decision maker forgoes some projects with
positive net present values because there is an option value of waiting.

It is straightforward to derive the probability distribution of the
waiting time until the project is invested. Let λ = F(z*) be the
probability that the project is not invested in a given period. Let N be
the random variable "length of time until the project is invested." Then
we can compute Pr(N = j) = (1 − λ)λ^j. We can also compute the mean
waiting time to invest as E[N] = 1/(1 − λ). The mean waiting time
increases with the threshold z*. Thus, z* fully characterizes the
investment timing.
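To make the threshold equation concrete, the following sketch (all
parameter values are hypothetical) solves it by bisection for a uniform F
on [0, B], for which ∫_{z*}^B (z′ − z*) dF(z′) = (B − z*)²/(2B), and
reports the implied mean waiting time:

```python
# Solve for the investment threshold z* when F is uniform on [0, B].
beta, I, B = 0.95, 1.0, 2.0      # hypothetical parameters, B > I

def g(z):
    # g is strictly increasing with g(I) < 0 < g(B), so bisection
    # finds the unique root z* in (I, B)
    return (z - I) - beta / (1 - beta) * (B - z)**2 / (2 * B)

lo, hi = I, B
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if g(mid) < 0:
        lo = mid
    else:
        hi = mid
z_star = 0.5 * (lo + hi)

lam = z_star / B                  # F(z*): probability of waiting
mean_wait = 1 / (1 - lam)         # E[N] = 1/(1 - lambda)
print(z_star, mean_wait)
```

With a more patient decision maker (β closer to 1), the same code delivers
a higher z* and a longer mean wait, in line with the discussion above.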
What will happen if the investment project becomes riskier? We use
second-order stochastic dominance to characterize the relative riskiness
of two distributions with the same mean. We say distribution F_1 is
riskier than distribution F_2 in the sense of second-order stochastic
dominance if

   ∫ x dF_1(x) = ∫ x dF_2(x)  and  ∫ u(x) dF_1(x) ≤ ∫ u(x) dF_2(x),

for any concave and increasing function u. Another equivalent definition
is in terms of mean-preserving spreads. We refer to Rothschild and
Stiglitz (1970) and Mas-Colell, Whinston and Green (1995) for a detailed
discussion.
Now suppose that there are two projects that generate identical mean
payoffs, and that project 2 is riskier than project 1. Then by the
convexity of the value function, we can easily show that the decision
maker prefers the riskier project 2 to the safer project 1. This seemingly
counterintuitive result follows from the fact that the decision maker
receives payoffs only from the tail of the distribution: he does not
exercise the investment option for bad payoff realizations. This result
underlies the key idea of the asset substitution (or risk shifting)
problem identified by Jensen and Meckling (1976), who point out that
equityholders or managers with limited liability have incentives to make
excessively risky investments at the expense of bondholders.
We may restate the preceding result in terms of the investment timing:
An increase in risk in terms of second-order stochastic dominance induces
the decision maker to delay investment. In Chapter 21, we will argue that
there is a distinction between risk and uncertainty and that this distinc-
tion matters for the analysis of the investment timing (see Miao and Wang
(2011)).
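This comparative static can be illustrated numerically (all parameter
values below are hypothetical). For uniform payoff distributions on
[m − w, m + w] with the same mean m, the threshold equation becomes
z* − I = (β/(1 − β))(m + w − z*)²/(4w), and widening the spread w, a
mean-preserving spread, raises z*:

```python
# Threshold z* for a uniform payoff distribution on [m - w, m + w];
# a larger spread w is a mean-preserving spread, so z* should rise.
beta, I, m = 0.9, 1.0, 1.2    # hypothetical parameters

def threshold(w):
    # Solve z - I = beta/(1-beta) * (m + w - z)^2 / (4w) by bisection;
    # the left side minus the right side is strictly increasing in z.
    def g(z):
        return (z - I) - beta / (1 - beta) * (m + w - z)**2 / (4 * w)
    lo, hi = I, m + w
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

z_safe, z_risky = threshold(0.3), threshold(0.6)
print(z_safe, z_risky)    # the riskier project has the higher threshold
```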
We should mention that McCall's (1970) job search model is similar to the
investment problem (see Exercise 1 of Chapter 5). Each period a worker
draws one wage offer w from the distribution F. He may reject the offer,
receive unemployment compensation c, and then wait until next period to
draw another offer from F. Alternatively, the worker accepts the offer to
work at w and receives w each period forever. The Bellman equation for the
worker's problem is

   V(w) = max{c + β ∫ V(w′) dF(w′), w/(1 − β)}.

Stokey, Lucas and Prescott (1989) and Ljungqvist and Sargent (2004)
provide an extensive analysis of this problem and its various extensions.
8.2 Discrete Choice

Consider the discrete choice problem analyzed by Rust (1985, 1986) and
presented in Section 5.2.1. We write the Bellman equation as:

   V(s) = max{ −c(s) + β ∫_s^∞ V(s′) λ exp(−λ(s′ − s)) ds′,
               −(P_b − P_s) − c(0) + β ∫_0^∞ V(s′) λ exp(−λs′) ds′ },
[Figure 8.2: Value function. This figure plots the value function V(s) for
the discrete choice problem.]
where the state variable s represents the accumulated utilization of a
durable good. The first expression in the curly brackets represents the
value from maintaining the durable, and the second represents the value
from replacing the durable with a new one. The decision maker chooses the
maximum of these two values. Assume that c(s) is a bounded, continuously
differentiable, and strictly increasing function. By the dynamic
programming theory with bounded reward, we can show that V is decreasing
and continuous (see Figure 8.2).

Since P_b > P_s, it is never optimal to replace the durable when s = 0.
Note that the value of replacing the durable is constant. Let s* be the
smallest value such that the decision maker is indifferent between
maintaining and replacing. Differentiating the Bellman equation on the
region (0, s*) yields

   V′(s) = −c′(s) + λc(s) + λ(1 − β)V(s).   (8.1)
This is a first-order differential equation. It is called a free boundary
value problem since the boundary condition at the point s* is endogenous.
To derive this condition, we note that

   V(0) = −c(0) + β ∫_0^∞ V(s′) λ exp(−λs′) ds′.

Thus, at s*,

   V(s*) = −(P_b − P_s) − c(0) + β ∫_0^∞ V(s′) λ exp(−λs′) ds′
         = −(P_b − P_s) + V(0).

In addition, because V(s′) = V(s*) for all s′ ≥ s*,

   V(s*) = −c(s*) + β ∫_{s*}^∞ V(s′) λ exp(−λ(s′ − s*)) ds′
         = −c(s*) + β ∫_{s*}^∞ V(s*) λ exp(−λ(s′ − s*)) ds′
         = −c(s*) + βV(s*).

It follows that

   V(s*) = −c(s*)/(1 − β),   V(0) = (P_b − P_s) − c(s*)/(1 − β).   (8.2)
We can then solve the differential equation (8.1) and the trigger value s*
subject to the two boundary conditions in (8.2). The solution is given by

   V(s) = max{ −c(s*)/(1 − β),
               −c(s)/(1 − β) + ∫_s^{s*} (c′(y)/(1 − β)) [1 − βe^{−λ(1−β)(y−s)}] dy },

and s* is the unique solution to:

   P_b − P_s = ∫_0^{s*} (c′(z)/(1 − β)) [1 − βe^{−λ(1−β)z}] dz.
The decision maker's optimal policy is to replace the old durable with a
new one whenever s > s*. Unlike the option exercise problem analyzed in
the last section, the problem is not over after replacement. Rather, when
the accumulated utilization of the new durable rises over time and exceeds
s*, the decision maker replaces this durable with a second new one. This
process is repeated. Such a problem is called a regenerative optimal
stopping problem.
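For a linear maintenance cost c(s) = κs (a hypothetical specification; κ,
λ, β, and the price gap below are all illustrative), the integral defining
s* has the closed form (κ/(1 − β))[s* − (β/(λ(1 − β)))(1 − e^{−λ(1−β)s*})],
and s* can be found by bisection:

```python
import math

# Replacement threshold s* with linear maintenance cost c(s) = kappa*s.
beta, lam, kappa, price_gap = 0.95, 0.5, 1.0, 2.0   # price_gap = Pb - Ps

def lhs(s):
    # Closed form of Int_0^s (c'(z)/(1-beta)) [1 - beta*exp(-lam*(1-beta)*z)] dz
    a = lam * (1 - beta)
    return kappa / (1 - beta) * (s - beta / a * (1 - math.exp(-a * s)))

# lhs is 0 at s = 0, strictly increasing, and unbounded, so a unique
# root of lhs(s) = Pb - Ps exists; bracket it and bisect.
lo, hi = 0.0, 1.0
while lhs(hi) < price_gap:
    hi *= 2
for _ in range(100):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if lhs(mid) < price_gap else (lo, mid)
s_star = 0.5 * (lo + hi)
print(s_star)
```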
8.3 Consumption and Saving

We consider the following infinite-horizon consumption-saving problem:

   max_{(c_t)} E[Σ_{t=0}^∞ β^t u(c_t)],   β ∈ (0, 1)

subject to

   c_t + a_{t+1} = Ra_t + y_t,   (8.3)

   c_t ≥ 0,   a_{t+1} ≥ −b,   (a_0, y_0) > 0 given.   (8.4)

Here, c_t, a_t, and y_t represent consumption, asset holdings, and labor
income at time t, respectively. Assume that the gross interest rate
satisfies R > 1 and the utility function u satisfies the usual conditions:
u′ > 0, u′′ < 0, lim_{c→0} u′(c) = ∞, lim_{c→∞} u′(c) = 0. Let y_t follow
a Markov process with transition function Q. Suppose that its support is
[y_min, y_max] ⊂ R_+. This implies that the present value of labor income
is finite almost surely.

The constant b ≥ 0 represents an exogenous borrowing limit. Without any
constraint on borrowing, the consumer could achieve any consumption level
by borrowing and rolling over debt. In this way, his utility level
approaches infinity. This strategy is often called a Ponzi game or Ponzi
scheme. To rule out this strategy, the following no-Ponzi-game condition
is often imposed:

   lim_{T→∞} a_{T+1}/R^T ≥ 0.   (8.5)
To see what will happen if this condition is violated, we use the budget
constraint (8.3) to derive:

   Σ_{t=0}^T c_t/R^t + a_{T+1}/R^T = Ra_0 + Σ_{t=0}^T y_t/R^t.   (8.6)

Taking the limit as T → ∞, we find that if (8.5) is violated, then the
present value of consumption exceeds the sum of the present value of labor
income and initial wealth, because the consumer's debt is never repaid.
For example, the consumer can keep rolling over his initial debt, say
a_1 < 0, at the interest rate r = R − 1, so that his debt grows at the
interest rate: a_{T+1} = R^T a_1. We then have

   lim_{T→∞} a_{T+1}/R^T = lim_{T→∞} R^T a_1/R^T = a_1 < 0.
The following result is adapted from Aiyagari (1994).

Theorem 8.3.1 Let R = 1 + r > 1. Suppose that (y_t) has support
[y_min, y_max] ⊂ R_+. If a_{t+1} ≥ −b for all t ≥ 0 and any b ≥ 0, then
the no-Ponzi-game condition (8.5) holds. Conversely, if the no-Ponzi-game
condition (8.5) holds, then

   a_t ≥ −y_min/r  a.s. for all t ≥ 0.
Proof. First, let a_{t+1} ≥ −b for all t ≥ 0. It follows from (8.6) that

   Ra_0 + Σ_{t=0}^T y_t/R^t = Σ_{t=0}^T c_t/R^t + a_{T+1}/R^T
                            ≥ Σ_{t=0}^T c_t/R^t − b/R^T.

Taking the limit as T → ∞, we obtain:

   Ra_0 + Σ_{t=0}^∞ y_t/R^t ≥ Σ_{t=0}^∞ c_t/R^t.

Using (8.6) again yields:

   a_{T+1}/R^T = Ra_0 + Σ_{t=0}^T y_t/R^t − Σ_{t=0}^T c_t/R^t.

Taking the limit as T → ∞ yields (8.5).
Now, let (8.5) hold. Suppose a_t = −y_min/r − ε for some ε > 0 and some t
with positive probability. By assumption, there is a positive probability
that y_t ≤ y_min + rε/2. Using (8.3) and c_t ≥ 0, we derive that
a_{t+1} ≤ −y_min/r + rε/2 − (1 + r)ε. Repeatedly using this argument, we
deduce that

   a_{t+n} ≤ −y_min/r + (rε/2)[1 + (1 + r) + ... + (1 + r)^{n−1}] − (1 + r)^n ε,

with positive probability. Therefore, with positive probability,

   a_{t+n}/(1 + r)^n ≤ −(1/(1 + r)^n)(y_min/r + ε/2) − ε/2 < −ε/2.

Fix an n large enough such that

   (y_max/r)(1/(1 + r)^{n−1}) < ε/4.

By repeatedly applying the budget constraint (8.3) and c_t ≥ 0, we obtain

   a_{t+n+m}/(1 + r)^{n+m} ≤ (y_max/r)(1/(1 + r)^{n−1})[1 − 1/(1 + r)^{m+1}]
                             + a_{t+n}/(1 + r)^n.

Thus, with positive probability,

   lim_{m→∞} a_{t+n+m}/(1 + r)^{n+m} ≤ (y_max/r)(1/(1 + r)^{n−1}) + a_{t+n}/(1 + r)^n
                                      < ε/4 − ε/2 < 0,

contradicting (8.5).
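The role of the natural debt limit can be seen in a short simulation (all
parameter values are hypothetical): if initial debt exceeds y_min/r, then
even with zero consumption forever, a_{T+1}/R^T converges to a strictly
negative number, violating (8.5):

```python
# If a_0 < -y_min/r, zero consumption forever still violates (8.5):
# a_{t+1} = R*a_t + y_min implies a_t/R^t -> a_0 + y_min/r < 0.
r, y_min, eps = 0.04, 1.0, 0.5     # hypothetical parameters
R = 1 + r
a = -(y_min / r) - eps             # debt beyond the natural limit

for T in range(400):
    a = R * a + y_min              # budget constraint with c_t = 0
ratio = a / R**400
print(ratio)                       # close to a_0 + y_min/r = -eps
```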
Following Aiyagari (1994), we call y_min/r the natural debt limit. This is
the largest debt level that is feasible for the consumer to repay almost
surely. If b > y_min/r, then the borrowing constraint will never be
binding. One may impose more stringent debt limits. Here we assume that
b < y_min/r in (8.4).

As in Section 6.6.2, it is more convenient to redefine variables. Let
â_t = a_t + b. Define the cash on hand, or the disposable resource, as
x_t = Ra_t + y_t + b. Then we may rewrite the budget and borrowing
constraints as:

   x_{t+1} = R(x_t − c_t) + y_{t+1} − rb,

   c_t + â_{t+1} = x_t,   â_{t+1} ≥ 0,   x_0 ≥ 0 given.

By the Maximum Principle, we can derive the Euler inequality:

   u′(c_t) ≥ βRE_t[u′(c_{t+1})],  with equality if â_{t+1} > 0,   (8.7)

and the transversality condition:

   lim_{t→∞} E β^t u′(c_t) x_t = 0.   (8.8)

These conditions are also sufficient for optimality.
We may also write the associated Bellman equation as:

   V(x_t, y_t) = max_{â_{t+1}∈[0,x_t]} u(x_t − â_{t+1})
                 + βE_t[V(Râ_{t+1} + y_{t+1} − rb, y_{t+1})].

If (y_t) is IID, then y_t is not a state variable and x_t is the only
state variable. In this case, we write the value function as V(x). When u
is unbounded, the analysis of the above Bellman equation is nontrivial.
One approach, followed by Schechtman and Escudero (1977) and Sotomayor
(1984), is to first study the truncated finite-horizon problem as in
Section 6.6.2 and then analyze the limit as the horizon goes to infinity.
Here, we simply assume that there exists a unique solution V to the
Bellman equation and that V inherits the properties of u, in that V(x) is
strictly increasing, strictly concave, and continuously differentiable in
x. In addition, the envelope condition, V′(x) = u′(c), holds. We write the
consumption and saving policy functions as c(x) and a(x). From the
envelope condition and the concavity of V, we can see that c(x) is
strictly increasing in x. Properties of the consumption and saving
policies depend on uncertainty in the income process and on the relation
between the discount factor β and the interest rate R. The intuition is
simple: uncertainty affects consumption smoothing and precautionary
savings across states and over time, while discounting affects consumption
smoothing over time even without uncertainty.
8.3.1 Deterministic income

We start with the deterministic case in which y_t is deterministic for
each t. Suppose y_t = y > rb for all t. The Bellman equation is given by:

   V(x) = max_{a∈[0,x]} u(x − a) + βV(Ra + y − rb).   (8.9)

We allow u to be unbounded.

Case 1. βR < 1.

This case has been analyzed by Schechtman and Escudero (1977) and Stokey,
Lucas and Prescott (1989). In this case, the consumer is so impatient that
he prefers to consume more in early periods. If he has low wealth, he also
prefers to borrow and may exhaust the borrowing limit. Thus, the initial
wealth level plays an important role. We conjecture that there are
critical levels x*_0 < x*_1 < x*_2 < ... for the cash on hand such that if
x ∈ [x*_{t−1}, x*_t] with x*_{−1} = y − rb, then the consumer will not
exhaust the borrowing limit from periods 0 to t − 1, and will exhaust the
borrowing limit from period t on. We will construct a consumption-saving
plan based on this conjecture and verify that this plan is optimal. We
will also characterize the value function explicitly.

First, consider t = 0. Note that u′(x) is continuous and decreasing in x.
Since u′(y − rb) > βRu′(y − rb) and u′(∞) = 0 < βRu′(y − rb), there exists
a unique value x*_0 > y − rb such that

   u′(x*_0) = βRu′(y − rb).   (8.10)
Then for any x ∈ [x*_{−1}, x*_0),

   u′(x) > u′(x*_0) = βRu′(y − rb).

We then define the consumption and saving policies by:

   f_0(x) = x,   g_0(x) = 0  for x ∈ [x*_{−1}, x*_0],   (8.11)

which satisfy the above Euler inequality. Since
x′ = R(x − f_0(x)) + y − rb = y − rb < x*_0, we deduce that f_0(x′) = x′
and g_0(x′) = 0. This means that if the consumer's initial cash on hand
x_0 ∈ [x*_{−1}, x*_0], then he will always exhaust the borrowing limit
(a_{t+1} = −b, t ≥ 0) and will consume all his labor income net of
interest payments on debt (c_0 = x_0 = Ra_0 + y + b and c_t = y − rb,
t ≥ 1). This plan is optimal because it satisfies the Euler inequality
(8.7) and the transversality condition (8.8). Given this plan, we can
compute the value function explicitly as:

   V_0(x) = u(x) + (β/(1 − β)) u(y − rb),   x ∈ [x*_{−1}, x*_0].

By the deterministic version of Theorem 7.1.1, it satisfies the Bellman
equation (8.9). Clearly, it satisfies the envelope condition,
V′_0(x) = u′(x) for x ∈ [x*_{−1}, x*_0].
Next, consider t = 1. We conjecture that for initial cash on hand x in
some interval [x*_0, x*_1], the consumer's optimal plan is to consume part
of his disposable resources in period 0 and to save the remainder a > 0
such that his cash on hand in the next period, z = Ra + y − rb, is in the
interval [x*_{−1}, x*_0]. As a result, from period t = 1 on, the consumer
follows the plan described previously for the case t = 0.

We first need to determine the value x*_1. Because u′(0) = ∞ and
u′(∞) = 0, it can be verified that there exists a unique solution
x > (z − y + rb)/R to the equation:

   u′(x − (z − y + rb)/R) = βRu′(z) = βRV′_0(z),   (8.12)

for any z ∈ [x*_{−1}, x*_0]. Define this solution x as a function of z,
x = h_0(z) for z ∈ [x*_{−1}, x*_0]. Because u′ is continuous and strictly
decreasing, we can check that h_0 is continuous and strictly increasing.
From equations (8.10) and (8.12), we see that as z → x*_{−1} = y − rb,
h_0(z) → x*_0. Define x*_1 = lim_{z→x*_0} h_0(z) > x*_0. Then
h_0 : [x*_{−1}, x*_0] → [x*_0, x*_1].
Now, we verify that the previously conjectured plan is optimal. We define
the initial saving and consumption policies, respectively, by

   g_1(x) = (h_0^{−1}(x) − y + rb)/R,   f_1(x) = x − g_1(x),  for x ∈ [x*_0, x*_1],

which deliver cash on hand in the next period z = h_0^{−1}(x_0) ∈
[x*_{−1}, x*_0]. From period 1 on, the consumer follows the plan for t = 0
with initial cash on hand h_0^{−1}(x_0) ∈ [x*_{−1}, x*_0]. Because we can
show that (i) the initial policies satisfy the Euler equation (8.12)
between periods 0 and 1, (ii) the plan from period 1 on also satisfies the
Euler inequality, and (iii) the transversality condition holds, we
conclude that the constructed plan from time 0 is indeed optimal. This
plan gives the value function:

   V_1(x) = u(f_1(x)) + βV_0(h_0^{−1}(x)),   x ∈ [x*_0, x*_1].

By construction, it satisfies:

   V_1(x) = max_{a∈[0,x]} u(x − a) + βV_0(Ra + y − rb),   x ∈ [x*_0, x*_1].

By the envelope condition, V′_1(x) = u′(f_1(x)) for x ∈ [x*_0, x*_1]. As
in Section 6.4, we can show that V_1 inherits the properties of V_0 in
that V_1 is strictly increasing and strictly concave.
In general, for any t ≥ 1, suppose that we have obtained the following:

• a sequence of nontrivial intervals [x*_{s−1}, x*_s] for s = 0, 1, ..., t;

• a sequence of continuous and strictly increasing functions,

   h_{s−1} : [x*_{s−2}, x*_{s−1}] → [x*_{s−1}, x*_s],

with h_{s−1}(x*_{s−2}) = x*_{s−1} and h_{s−1}(x*_{s−1}) = x*_s, for
s = 1, ..., t;

• a sequence of saving and consumption policies g_s(x) and f_s(x) for
x ∈ [x*_{s−1}, x*_s], s = 0, 1, ..., t;

• a sequence of value functions V_s : [x*_{s−1}, x*_s] → R for
s = 0, 1, ..., t.

These functions satisfy the following equations:

   u′(h_s(z) − (z − y + rb)/R) = βRu′(z − (h_{s−1}^{−1}(z) − y + rb)/R)
                               = βRV′_s(z),   z ∈ [x*_{s−1}, x*_s],

   g_s(x) = (h_{s−1}^{−1}(x) − y + rb)/R,   f_s(x) = x − g_s(x),   x ∈ [x*_{s−1}, x*_s],

   V_s(x) = u(f_s(x)) + βV_{s−1}(h_{s−1}^{−1}(x)),   x ∈ [x*_{s−1}, x*_s],

for s = 0, 1, ..., t. Each value function V_s is strictly increasing,
strictly concave, and satisfies the envelope condition
V′_s(x) = u′(f_s(x)) for x ∈ [x*_{s−1}, x*_s]. At each critical value
x*_s, h_s(x*_s) = h_{s+1}(x*_s), g_s(x*_s) = g_{s+1}(x*_s),
V_s(x*_s) = V_{s+1}(x*_s), and V′_s(x*_s) = V′_{s+1}(x*_s) for
s = 0, 1, ..., t − 1.
Now, at s = t + 1, we define another function h_t(z) as the unique
solution to the equation:

   u′(h_t(z) − (z − y + rb)/R) = βRu′(z − (h_{t−1}^{−1}(z) − y + rb)/R)   (8.13)
                               = βRV′_t(z),

for z ∈ [x*_{t−1}, x*_t]. The last equality follows from the envelope
condition for z ∈ [x*_{t−1}, x*_t] and
f_t(z) = z − (h_{t−1}^{−1}(z) − y + rb)/R. Since V_t is strictly concave
on [x*_{t−1}, x*_t], the function h_t is strictly increasing on
[x*_{t−1}, x*_t]. By definition, h_t(x*_{t−1}) = x*_t. Define
x*_{t+1} = h_t(x*_t). Setting z = x*_t in equation (8.13) and using
h_{t−1}^{−1}(x*_t) = x*_{t−1} and βR < 1, we obtain:

   u′(x*_{t+1} − (x*_t − y + rb)/R) = βRu′(x*_t − (x*_{t−1} − y + rb)/R)
                                    < u′(x*_t − (x*_{t−1} − y + rb)/R)
                                    < u′(x*_t − (x*_t − y + rb)/R).

Thus, x*_{t+1} > x*_t. It follows that h_t : [x*_{t−1}, x*_t] → [x*_t, x*_{t+1}].
Define the saving and consumption policies as:

   g_{t+1}(x) = (h_t^{−1}(x) − y + rb)/R > 0,   f_{t+1}(x) = x − g_{t+1}(x),   (8.14)

for any initial cash on hand x ∈ (x*_t, x*_{t+1}]. Because these policies
deliver cash on hand in the next period z = h_t^{−1}(x) ∈ (x*_{t−1}, x*_t],
from the next period on the consumer will follow the consumption and
saving plans starting from initial cash on hand z ∈ (x*_{t−1}, x*_t].
Continuing by backward induction, we deduce that the consumer will not
exhaust the borrowing limit until period t and his borrowing will always
reach the limit from period t + 1 on. Given these policies, we compute the
value function as:

   V_{t+1}(x) = u(f_{t+1}(x)) + βV_t(h_t^{−1}(x)),   x ∈ [x*_t, x*_{t+1}].

By construction, it satisfies the equation:

   V_{t+1}(x) = max_{a∈[0,x]} u(x − a) + βV_t(Ra + y − rb),   x ∈ [x*_t, x*_{t+1}].

As in the analysis in Section 6.4, we can show that V_{t+1} is strictly
increasing, strictly concave, and satisfies the envelope condition,
V′_{t+1}(x) = u′(f_{t+1}(x)) for x ∈ [x*_t, x*_{t+1}]. In addition, we can
show that V_t(x*_t) = V_{t+1}(x*_t) and V′_t(x*_t) = V′_{t+1}(x*_t).

By induction, we can define the consumption, saving, and value functions
on the whole state space [y − rb, ∞) by

   c(x) = f_t(x),   a(x) = g_t(x),   V(x) = V_t(x),  if x ∈ [x*_{t−1}, x*_t],
for any t ≥ 0. We can verify that the value function satisfies the Bellman
equation (8.9). We can also verify that for any initial cash on hand
x_0 ≥ y − rb, the following sequences,

   a_{t+1} = a(x_t),   c_t = x_t − a(x_t),

   x_{t+1} = Ra(x_t) + y − rb,   t ≥ 0,

give the optimal consumption and saving plans.

In terms of dynamics, starting from any initial wealth level, there is a
finite time T such that consumption c_t, savings a_{t+1}, and cash on hand
x_t all decrease until period T, and after period T they are constant over
time, in that c_t = x_t = y − rb and a_{t+1} = −b for t > T.
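These decreasing dynamics can be reproduced with a crude value-iteration
sketch of the Bellman equation (8.9); the log utility, grid sizes, and
parameter values below are all hypothetical. Starting from high cash on
hand, the simulated path falls toward y − rb:

```python
import numpy as np

# Value iteration on (8.9) with u = log, b = 0 (so rb = 0), constant
# income y, and beta*R < 1 (hypothetical parameter values).
beta, R, y = 0.9, 1.05, 1.0
grid = np.linspace(y, 8.0, 400)       # cash-on-hand grid
a_grid = np.linspace(0.0, 7.0, 400)   # candidate savings levels
V = np.log(grid)

c = grid[:, None] - a_grid[None, :]   # consumption for each (x, a) pair
payoff = np.where(c > 1e-10, np.log(np.maximum(c, 1e-10)), -1e12)

for _ in range(500):
    cont = np.interp(R * a_grid + y, grid, V)      # V(R*a + y)
    V = np.max(payoff + beta * cont[None, :], axis=1)

cont = np.interp(R * a_grid + y, grid, V)
policy = a_grid[np.argmax(payoff + beta * cont[None, :], axis=1)]

# Simulate: starting from high cash on hand, x_t falls toward y - rb = y.
x = 6.0
x_path = [x]
for _ in range(60):
    a = np.interp(x, grid, policy)
    x = R * a + y
    x_path.append(x)
print(x_path[0], round(x_path[-1], 3))
```

This is only a discretized approximation; the analytical construction
above pins down the exact critical values x*_t, which the grid solution
can only locate approximately.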
Case 2. βR > 1.

In this case, the consumer is so patient that he prefers to save and
postpone consumption. We claim that the borrowing constraint is never
binding. Suppose to the contrary that at time t the borrowing constraint
is binding, in that â_{t+1} = 0. Then x_{t+1} = y − rb. By the Euler
inequality and the envelope condition,

   V′(x_t) ≥ βRV′(x_{t+1}) > V′(x_{t+1}),

where the last inequality follows from βR > 1. Because V is strictly
concave, x_t < x_{t+1}. By the budget constraint,
x_t = Râ_t + y − rb ≥ y − rb = x_{t+1}, a contradiction.

What are the limiting properties of the optimal consumption and saving
plans? By the Euler equation,

   u′(c_t) = βRu′(c_{t+1}) > u′(c_{t+1}).

It follows that c_t < c_{t+1}. Thus, (c_t) is an increasing sequence and
hence has a limit as t → ∞. Suppose lim_{t→∞} c_t = c* < ∞. Then taking
the limit in the above Euler equation as t → ∞, we obtain
u′(c*) = βRu′(c*), which contradicts βR > 1. Thus, lim c_t = ∞. By the
budget constraint, this also implies that lim x_t = lim a_{t+1} = ∞.
Case 3. βR = 1.

In this case, we assume that y_t is deterministic, but possibly not
constant over time. Without borrowing constraints, the consumer would
enjoy a constant consumption stream. In the presence of borrowing
constraints, his current consumption is less than future consumption
whenever he currently exhausts the borrowing limit. Formally, by the Euler
inequality,

   u′(c_t) ≥ u′(c_{t+1}),  with strict inequality only if â_{t+1} = 0.

Thus, either c_t = c_{t+1}, or c_t < c_{t+1} when â_{t+1} = 0. When
â_{t+1} = 0, x_{t+1} = y_{t+1} − rb. Suppose that the consumer arrives in
period t and is borrowing constrained. If he knew that the borrowing
constraint would never bind again in the future, he would find it optimal
to choose the highest sustainable constant consumption level, given by the
annuity value of the income stream (net of interest payments on debt)
starting from period t:

   A_t = (r/R) Σ_{j=0}^∞ R^{−j}(y_{t+j} − rb).

Chamberlain and Wilson (2000) show that the impact of the borrowing
constraint does not vanish until the consumer reaches the period with the
highest annuity value of the remainder of the income process, in that

   c* ≡ lim_{t→∞} c_t = sup_t A_t ≡ A*,

where (c_t) is an optimal plan. We now formally prove this result.
We first show that c* ≤ A*. Suppose to the contrary that c* > A*. Then
there is a t such that the consumer is borrowing constrained, in that
â_t = 0 and x_t = y_t − rb, and c_j > A_t for all j ≥ t. Thus, there is a
τ sufficiently large such that

   0 < Σ_{j=t}^τ R^{t−j}(c_j − y_j + rb) = R^{t−τ}(c_τ − x_τ),

where the equality uses x_t = y_t − rb and successive iterations on the
budget constraint. This is a contradiction because c_τ ≤ x_τ.
To show that c* ≥ A*, suppose to the contrary that c* < A*. Then there is
an A_t such that c_j < A_t for all j ≥ t, and hence

   Σ_{j=t}^∞ R^{t−j}c_j < Σ_{j=t}^∞ R^{t−j}A_t = Σ_{j=t}^∞ R^{t−j}(y_j − rb)
                        ≤ x_t + Σ_{j=t+1}^∞ R^{t−j}(y_j − rb),

where the last inequality uses x_t ≥ y_t − rb. Thus, there are an ε > 0
and a τ* > t such that for all τ > τ*,

   Σ_{j=t}^τ R^{t−j}c_j < x_t + Σ_{j=t+1}^τ R^{t−j}(y_j − rb) − ε.

Using the budget constraint repeatedly, we obtain

   R^{t−τ}c_τ < R^{t−τ}x_τ − ε,

or equivalently,

   c_τ < x_τ − R^{τ−t}ε.

We can then construct an alternative feasible consumption plan (c′_j) such
that c′_j = c_j for all j ≠ τ* and c′_j = c_j + ε for j = τ*. This plan
yields higher utility than the plan (c_j), which is a contradiction.
We next consider two examples taken from Ljungqvist and Sargent (2004)
to illustrate the preceding result.
8.3. CONSUMPTION AND SAVING 251
Example 8.3.1 Assume that a0 = 0, b = 0, and the labor income stream {yt} = {yh, yl, yh, yl, ...}, where yh > yl > 0. The present value of the labor income is

∑_{t=0}^∞ yt/(1 + r)^t = (yh + yl/R)/(1 − R^{−2}).

The annuity value c that has the same present value as the labor income stream is given by

Rc/r = (yh + yl/R)/(1 − R^{−2}), or c = (yh + yl/R)/(1 + 1/R).

Consuming c each period satisfies the Euler equation. Using the budget constraint, we can find the associated savings plan: at+1 = (yh − yl)R/(1 + R) for even t and at+1 = 0 for odd t.¹
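The present-value and annuity formulas in Example 8.3.1 can be checked numerically. The following Python sketch uses made-up parameter values (r = 0.05, yh = 2, yl = 1); it verifies that the truncated series matches the closed form and that the annuity value c has the same present value, since a constant stream c has present value cR/r:

```python
import numpy as np

# Numerical check of Example 8.3.1 (parameter values are illustrative).
r = 0.05
R = 1.0 + r
yh, yl = 2.0, 1.0

# Present value of the alternating stream {yh, yl, yh, yl, ...}
T = 2000                                   # truncation horizon; tail is negligible
y = np.where(np.arange(T) % 2 == 0, yh, yl)
pv_numeric = np.sum(y / R ** np.arange(T))
pv_formula = (yh + yl / R) / (1.0 - R ** -2)
assert abs(pv_numeric - pv_formula) < 1e-8

# Annuity value: the constant c with the same present value, c * R / r = PV
c = (yh + yl / R) / (1.0 + 1.0 / R)
assert abs(c * R / r - pv_formula) < 1e-8
```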
Example 8.3.2 Assume that a0 = 0, b = 0, and the labor income stream {yt} = {yl, yh, yl, yh, yl, ...}, where yh > yl > 0. The solution is c0 = yl and a1 = 0, and from period 1 onward the solution is the same as in the previous example. Hence, the consumer is borrowing constrained in the first period only.

In the case of constant labor income yt = y for all t, the policies ct = y + ra0 and at+1 = a0, t ≥ 0, for any initial asset holdings a0 ≥ −b, satisfy the Euler equation and the transversality condition. Thus, they are optimal. This means that it is optimal to roll over the initial asset (or debt) level forever.
8.3.2 Stochastic Income
We now suppose that the labor income yt follows an IID process. We will
show that the limiting behavior of the consumption and saving policies is
¹Note that at+1 = 0 does not necessarily imply that the consumer is borrowing constrained: the consumer is borrowing constrained if and only if the Lagrange multiplier associated with the borrowing constraint is positive.
quite different from that in the deterministic case. We first establish some
properties of consumption and saving policies.
Lemma 8.3.1 Let β ∈ (0, 1) and R > 0. The consumption policy function c(x) is strictly increasing for x ≥ xmin, and the saving policy function a(x) satisfies

0 ≤ a(x′) − a(x) < x′ − x for all xmin ≤ x < x′,

with equality on the left if and only if a(x′) = a(x) = 0.
Proof. By the envelope condition, V ′ (x) = u′ (c (x)) . It follows from
the strict concavity of V that c is strictly increasing.
Given any x ≥ xmin = ymin − rb, define the function f by

f(s; x) = u′(x − s) − βR E V′(Rs + y − rb), s ≥ 0, (8.15)

where the expectation E is taken with respect to the stochastic labor income y. Let x′ > x. We consider two cases. First, suppose a(x) = 0. Then a(x′) ≥ a(x) = 0. If a(x′) = 0, the result is trivial. Consider a(x′) > 0, and suppose that a(x′) > x′ − x > 0. Then by the Euler equation at x′ and the strict concavity of u and V,

0 = u′(x′ − a(x′)) − βR E V′(Ra(x′) + y − rb)
  > u′(x) − βR E V′(R(x′ − x) + y − rb)
  > u′(x) − βR E V′(y − rb),

which contradicts the Euler inequality at x since a(x) = 0.
In the second case, a(x) > 0. Then we use the Euler equation at x to deduce that f(a(x); x′) < f(a(x); x) = 0. In addition,

f(a(x) + x′ − x; x′) = u′(x − a(x)) − βR E V′(Ra(x) + R(x′ − x) + y − rb)
= βR E V′(Ra(x) + y − rb) − βR E V′(Ra(x) + R(x′ − x) + y − rb) > 0.

Since f(s; x′) is strictly increasing in s, it follows from the Intermediate Value Theorem that for each x′ > x there exists a unique solution a(x′) ∈ (a(x), a(x) + x′ − x) to the equation f(s; x′) = 0, with f defined in (8.15).
Lemma 8.3.1 implies that whenever the borrowing constraint does not bind, the consumption and saving functions are strictly increasing and have slopes less than one. From the proof, we observe that this result also holds in the deterministic case and is independent of whether βR > 1, = 1, or < 1.
Case 1 βR < 1.
Similar to the deterministic case, the optimal consumption, savings, and cash-on-hand sequences under uncertainty will settle down to a steady state. However, this steady state does not imply constant paths; rather, it implies the existence of a stationary distribution. Before establishing this result formally, we present two lemmas.
Lemma 8.3.2 (Schechtman and Escudero (1977)) Let βR < 1. Suppose that xmin ≡ ymin − rb > 0. Then there is an x̄ > xmin such that if xt ≤ x̄, then at+1 = 0, and if xt > x̄, then at+1 > 0.
Proof. Suppose the borrowing constraint is never binding. Then by the Euler equation and the envelope condition,

V′(xt) = βR Et[V′(xt+1)] ≤ βR V′(xmin) < V′(xmin),

where the first inequality follows from the concavity of V. Letting xt → xmin leads to a contradiction. Thus, there exist a t and xt ≥ xmin such that at+1 = 0 and the Euler inequality

u′(xt) > βR Et V′(yt+1 − rb) > 0

is satisfied. Since u′(∞) = 0 and u′ is continuous and strictly decreasing, there exists a unique x̄ > xt such that u′(x̄) = βR Et V′(yt+1 − rb). For any xt < x̄, the Euler inequality u′(xt) > βR Et V′(yt+1 − rb) holds. Thus, for any xt ≤ x̄, the consumer must exhaust the borrowing limit, in that ct = xt and at+1 = 0. If xt > x̄, then u′(xt) < βR Et V′(yt+1 − rb), implying that at+1 > 0.
This result is similar to that in the deterministic case. The key intu-
ition is that high discounting of the future induces the consumer to borrow
more and consume more in the present relative to the future. If his wealth
is sufficiently low, he will exhaust his borrowing limit. Figure 8.3 plots
consumption and assets as functions of cash on hand.
Lemma 8.3.3 (Schechtman and Escudero (1977)) Let βR < 1. If −cu′′(c)/u′(c) < γ for all c sufficiently large, for some γ > 0, then there exists an x* such that if xt > x*, then xt+1 < xt.
Proof. Recall that the consumption and saving policy functions are c(x) and a(x), respectively. Suppose first that a(x) is bounded, in that a(x) ≤ K for all x ≥ xmin = ymin − rb. Then we take x* = RK + ymax − rb. It follows that xt+1 = Ra(xt) + yt+1 − rb ≤ RK + ymax − rb = x*. Thus, if xt > x*, then xt+1 ≤ x* < xt.
8.3. CONSUMPTION AND SAVING 255
Figure 8.3: Policy functions. This figure plots consumption and assets as functions of cash on hand.
Now, suppose a(x) is unbounded as x → ∞. Using the envelope condition and the concavity of V, we derive that for xt sufficiently large,

Et V′(xt+1) / V′(ymax − rb + Ra(xt)) ≤ V′(ymin − rb + Ra(xt)) / V′(ymax − rb + Ra(xt))
= u′(c(ymin − rb + Ra(xt))) / u′(c(ymax − rb + Ra(xt)))
≤ [c(ymax − rb + Ra(xt)) / c(ymin − rb + Ra(xt))]^γ,

where γ is the upper bound of −cu′′(c)/u′(c).² By Lemma 8.3.1,

c(ymax − rb + Ra(xt)) ≤ c(ymin − rb + Ra(xt)) + ymax − ymin.

Thus,

1 ≤ Et V′(xt+1) / V′(ymax − rb + Ra(xt)) ≤ [1 + (ymax − ymin)/c(ymin − rb + Ra(xt))]^γ. (8.16)

²If −cu′′(c)/u′(c) ≤ γ for all c > c0, then u′(c)/u′(c0) ≥ (c0/c)^γ for all c > c0.
We first show that c(x) goes to infinity as x → ∞ if a(x) → ∞. When a(xt) → ∞ as xt → ∞, the borrowing constraint does not bind at xt for xt sufficiently large. In this case, the Euler inequality holds as an equality:

u′(c(xt)) = βR Et u′(c(xt+1)).

If limx→∞ c(x) = c̄ were finite and positive, then taking the limit as xt → ∞ (so that xt+1 = Ra(xt) + yt+1 − rb → ∞) would yield u′(c̄) = βR u′(c̄), which is impossible since βR < 1 and u′(c̄) > 0. Thus, we must have limx→∞ c(x) = ∞. It follows from (8.16) that

lim_{xt→∞} Et V′(xt+1) / V′(ymax − rb + Ra(xt)) = 1.

Choose a positive ε < (1 − βR)/(βR) and note that there exists a sufficiently large x* such that for xt > x*, Et V′(xt+1)/V′(ymax − rb + Ra(xt)) < 1 + ε. Using the Euler equation,

V′(xt) = βR Et V′(xt+1) < (1 + ε) βR V′(ymax − rb + Ra(xt)) ≤ V′(ymax − rb + Ra(xt)).

It follows from the concavity of V that if xt ≥ x*, then xt > ymax − rb + Ra(xt) ≥ xt+1.
Lemma 8.3.3 implies that whenever the cash on hand is larger than a critical value x*, the cash on hand in the next period will fall. Let xmax denote the smallest critical value satisfying the property in this lemma. Then the cash on hand in the long run will settle down in the closed interval [xmin, xmax], as illustrated in Figure 8.4.

Exercise 9 asks the reader to show that xmax > xmin and that xmax is the largest value such that Ra(xmax) + ymax − rb = xmax.
Figure 8.4: Evolution of cash on hand. This figure plots xt+1 as a function of xt for the largest and the lowest income levels.

Proposition 8.3.1 (Schechtman and Escudero (1977), Clarida (1987), and Aiyagari (1994)) Let βR < 1. Let yt be independently and identically drawn from the distribution μ satisfying

μ([a, b]) ≥ α(b − a) for any [a, b] ⊂ [ymin, ymax],

for some α > 0. Suppose that −cu′′(c)/u′(c) < γ for all c sufficiently large, for some γ > 0. Then there exists a unique stable stationary distribution over the support [xmin, xmax] for (xt). Further, the stationary distribution is continuous (in the weak convergence topology) in the parameters R and b.
Proof. The Markov process (xt) defined by

xt+1 = Ra(xt) + yt+1 − rb

induces a transition function P on the state space [xmin, xmax]. We will apply Theorems 3.3.4 and 3.3.5. We can easily verify that the transition function is monotone and satisfies the Feller property. The key is to verify that it satisfies the mixing condition, i.e., that there exist x̂ and ε > 0 such that P^n(xmin, [x̂, xmax]) ≥ ε and P^n(xmax, [xmin, x̂]) ≥ ε for n sufficiently large.
Define the sequence {ϕn(x, y)} by

ϕ0(x, y) = x, ϕn+1(x, y) = Ra(ϕn(x, y)) + y − rb.

That is, ϕn(x, y) is the cash on hand at the beginning of period n if x is the initial cash on hand, the labor income takes the constant value y each period, and the saving policy a is followed each period. Since ϕn(x, y) is increasing in y, we deduce that

P^n(x, [ϕn(x, ymax − δ), ϕn(x, ymax)]) ≥ α^n δ^n,
P^n(x, [ϕn(x, ymin), ϕn(x, ymin + δ)]) ≥ α^n δ^n,

for all 0 < δ < ymax − ymin and all n ≥ 1.
Since

ϕ0(xmin, ymax) = xmin < ymax − rb = ϕ1(xmin, ymax),

it follows by induction that {ϕn(xmin, ymax)} is a strictly increasing sequence. Since it is bounded by xmax, it converges to a limit ξ ≤ xmax. Similarly, limn→∞ ϕn(xmin, Ey) exists; we choose this limit as x̂. Since ϕn(xmin, ymax) > ϕn(xmin, Ey), it follows that ξ ≥ x̂. If ξ = x̂, then by continuity of a,

ξ = Ra(ξ) + ymax − rb = x̂ = Ra(x̂) + Ey − rb,

which contradicts ymax > Ey. Thus, we must have ξ > x̂. This allows us to choose N sufficiently large such that ϕN(xmin, ymax) > x̂, and δ > 0
sufficiently small such that ϕN(xmin, ymax − δ) > x̂. Then

P^N(xmin, [x̂, xmax]) ≥ P^N(xmin, [ϕN(xmin, ymax − δ), ϕN(xmin, ymax)]) ≥ α^N δ^N > 0.

A similar argument applies to x0 = xmax:

P^N(xmax, [xmin, x̂]) ≥ α^N δ^N > 0.

The rest of the proof is routine.
This is a remarkable result and is the foundation for the incomplete
markets model of Bewley (1980) and Aiyagari (1994).
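The mechanism behind stability can be illustrated with a hypothetical saving rule. Since Lemma 8.3.1 bounds the slope of a(x) by one, two wealth paths facing identical income draws contract toward each other whenever R times that slope is below one; their long-run behavior is then independent of initial wealth. The rule a(x) = 0.5 max(x − 1, 0) below is purely illustrative, not the optimal policy:

```python
import numpy as np

# Stylized saving policy with slope 1/2, as permitted by Lemma 8.3.1, and R
# chosen so that R * slope < 1 (all values illustrative, not the optimal policy).
rng = np.random.default_rng(0)
R = 1.02
a = lambda x: 0.5 * np.maximum(x - 1.0, 0.0)   # a(x) >= 0, slope in {0, 1/2}

T = 200
y = rng.uniform(0.7, 1.3, size=T)              # common IID income draws

x1, x2 = 1.0, 50.0                             # two very different initial wealths
for t in range(T):
    x1 = R * a(x1) + y[t]
    x2 = R * a(x2) + y[t]

# Facing identical shocks, the paths have merged: the long-run distribution
# of x_t does not depend on the initial condition.
assert abs(x1 - x2) < 1e-10
```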
Case 2 βR > 1.
In the deterministic case, we have shown that (ct) is an increasing sequence and hence has a limit. In the stochastic case, a process (Mt) with the property Mt ≥ Et[Mt+1] (resp. Mt ≤ Et[Mt+1]), analogous to monotonicity in the deterministic case, is called a supermartingale (resp. submartingale). Chamberlain and Wilson (2000) analyze the long-run behavior of consumption and savings using the powerful Supermartingale Convergence Theorem. Let Mt = β^t R^t u′(ct) ≥ 0. Then it follows from the Euler inequality that {Mt} is a nonnegative supermartingale. By the Supermartingale Convergence Theorem, {Mt} converges to a nonnegative limit. If βR > 1, then since β^t R^t → ∞ we must have limt u′(ct) = 0 and hence limt ct = ∞. By the budget constraint, we also have limt xt = ∞ and limt at+1 = ∞.

Sotomayor (1984) proves a stronger result when labor income is an IID process.
Proposition 8.3.2 (Sotomayor (1984)) Let βR > 1. (i) limx→∞ a(x) = ∞. (ii) If −cu′′(c)/u′(c) < γ for all c sufficiently large, for some γ > 0, then there exists an x̄ such that if xt > x̄, then xt+1 > xt.
260 CHAPTER 8. APPLICATIONS
Proof. (i) Suppose to the contrary that limx→∞ a(x) = k < ∞. Then by the Euler inequality,

V′(xt) ≥ βR Et V′(yt+1 − rb + Ra(xt)).

Taking the limit as xt → ∞ yields

lim_{xt→∞} V′(xt) ≥ βR Et V′(yt+1 − rb + Rk) > 0.

Let lim_{xt→∞} V′(xt) = K > 0. Then when x is larger than ymax − rb + Rk, it follows from the concavity of V that

K ≥ βR Et V′(yt+1 − rb + Rk) ≥ βR V′(x).

Letting x → ∞ yields K ≥ βRK, a contradiction with βR > 1.
(ii) Using the concavity of u and the assumption on −cu′′(c)/u′(c), for xt sufficiently large,

Et u′(c(yt+1 − rb + Ra(xt))) / u′(c(ymin − rb + Ra(xt)))
≥ u′(c(ymax − rb + Ra(xt))) / u′(c(ymin − rb + Ra(xt)))
≥ [c(ymin − rb + Ra(xt)) / c(ymax − rb + Ra(xt))]^γ.

Using part (i) and an argument similar to that in the proof of Lemma 8.3.3, we deduce that

Et u′(c(yt+1 − rb + Ra(xt))) / u′(c(ymin − rb + Ra(xt))) → 1 as xt → ∞.

Choose ε > 0 such that (1 − ε)βR > 1. Then there exists an x̄ sufficiently large such that when xt > x̄,

V′(xt) = βR Et u′(c(yt+1 − rb + Ra(xt))) ≥ (1 − ε) βR u′(c(ymin − rb + Ra(xt))) > V′(ymin − rb + Ra(xt)),

where the first equality follows from the Euler equation and the envelope condition, and the first inequality from the limit above. Thus, it follows from the concavity of V that xt < ymin − rb + Ra(xt) ≤ xt+1.
Case 3 βR = 1.
Chamberlain and Wilson (2000) show that the above result also holds for βR = 1 for very general stochastic labor income processes, provided the discounted present value of labor income is sufficiently stochastic. Sotomayor (1984) and Aiyagari (1994) provide a simple proof in the IID case. When βR = 1, the Supermartingale Convergence Theorem does not necessarily imply that u′(ct) converges to zero. Suppose that u′(ct) converges to a positive limit, so that ct converges to a finite limit almost surely. Then xt also converges to a finite limit almost surely. Since the set of sample paths on which yt+1 = ymin infinitely often and yt+1 = ymax infinitely often has probability one, we may restrict attention to sample paths on which xt converges and this property holds. This contradicts the budget constraint xt+1 = Ra(xt) + yt+1 − rb, given the continuity of a.
Compared to the deterministic case, this limiting result is remarkable.
As discussed by Chamberlain and Wilson (2000), the economic intuition
behind this result is hard to explain. The key argument relies on the powerful
Supermartingale Convergence Theorem.
Sotomayor (1984) argues that although limt xt = ∞ for both βR > 1 and βR = 1, the convergence behavior is different. Unlike the case βR > 1, when βR = 1 there is no initial level of cash on hand from which the optimal sequence of cash on hand is increasing. Intuitively, when xt is sufficiently large, the borrowing constraint does not bind at xt. Then by the Euler equation and the envelope condition,

V′(xt) = βR Et V′(xt+1) = Et V′(Ra(xt) + yt+1 − rb) ≤ V′(Ra(xt) + ymin − rb).

Thus, xt ≥ Ra(xt) + ymin − rb. A similar argument shows that xt ≤ Ra(xt) + ymax − rb. Thus, unless xt = Ra(xt) + ymin − rb for all large xt, there is a positive probability that xt > Ra(xt) + ymin − rb and hence a positive probability that xt > xt+1 = Ra(xt) + yt+1 − rb.
Lemma 8.3.4 (Sotomayor (1984)) Let βR = 1. There is no x̄ such that x = Ra(x) + ymin − rb for all x > x̄.

Proof. Suppose there exists such an x̄. Then for x > x̄,

V′(Ra(x) + ymin − rb) = V′(x) = E V′(Ra(x) + y − rb),

where the second equality follows from the Euler equation (with βR = 1) and the expectation is taken with respect to the random variable y. The above equation cannot hold for strictly concave V.
8.4 Inventory
Inventory control is a classical problem in operations research and is also important in economics. This problem typically involves both fixed and variable costs and belongs to the class of mixed stopping and control problems. Our presentation below follows Scarf (1960) and Iglehart (1963) closely. Suppose that a firm has an inventory stock xt of some good at the beginning of period t. It may sell the good to meet the exogenous stochastic market demand zt+1 ≥ 0 at the end of period t. Suppose that the demand {zt+1} is IID with distribution ϕ. After observing xt and zt and before observing zt+1, the firm may order dt ≥ 0 units of the good to increase its inventory stock. Then the law of motion of the inventory stock is given by

xt+1 = xt + dt − zt+1, x0 given. (8.17)

We allow xt to take negative values; negative stocks represent debt to suppliers, or backlogging. Sales of zt+1 generate profits R(zt+1) ≥ 0. Storage
of xt incurs costs Φ(xt) ≥ 0. Placing an order of size dt costs C(dt) ≥ 0. Assume that future cash flows are discounted by β ∈ (0, 1). The firm's problem is given by

max_{(dt≥0)} E ∑_{t=0}^∞ β^t [R(zt+1) − Φ(xt) − C(dt)], (8.18)

subject to (8.17), where the expectation is taken with respect to (zt).
What makes this problem special is the specification of the order cost function C(d). Suppose that

C(d) = 0 if d = 0, and C(d) = c + pd if d > 0,

where c represents the fixed cost and p the proportional cost. The presence of fixed costs makes the analysis of the problem in (8.18) difficult. We shall use the dynamic programming method to analyze it. Observe that this problem is equivalent to minimizing discounted expected costs. It is also convenient to define yt = xt + dt, which represents the stock immediately after the order and before sales. In addition, we define the expected holding and storage costs by

L(yt) = βE Φ(xt+1) = βE Φ(yt − zt+1),
where E is the expectation operator for the random variable z.
Assumption 8.4.1 The function L : R → R+ is convex and coercive, in the sense that lim|y|→∞ L(y) = +∞.

After the change of variables, we write the cost minimization problem as

min_{(yt)} E ∑_{t=0}^∞ β^t [L(yt) + C(yt − xt)], (8.19)
subject to xt+1 = yt − zt+1, yt ≥ xt, t ≥ 0, with x0 given. The Bellman equation for this problem is

V(x) = min_{y≥x} L(y) + C(y − x) + βE V(y − z), (8.20)

where E is the expectation operator for the random variable z. Note that the reward function for this problem, L(y) + C(y − x), is bounded below by zero but unbounded above, so we cannot apply the dynamic programming theory for bounded rewards developed earlier. We will instead study the corresponding finite-horizon problems and take the limit as the horizon goes to infinity.
Define the Bellman operator by

Tf(x) = min_{y≥x} L(y) + c·1{y>x} + p(y − x) + βE f(y − z),

for any nonnegative function f : R → R+. Here 1{y>x} is an indicator function. The sequence of finite-horizon problems is given by fn(x) = T^n f0(x), n = 1, 2, ..., where we set f0 = 0. To get a quick idea about the nature of the solution, we consider the single-period case:

f1(x) = min_{y≥x} L(y) + c·1{y>x} + p(y − x).
Figure 8.5 illustrates that the optimal ordering policy takes the (s, S) form. The stock level S minimizes the function L(y) + py. At the point s < S, the firm is indifferent between placing an order S − s and not ordering, so that L(S) + c + pS = L(s) + ps. For x < s, L(S) + c + pS < L(x) + px; thus, it is optimal for the firm to order S − x. For x > s and any y > x, L(y) + c + py > L(x) + px; thus, it is not optimal for the firm to place any order y − x. This form of the ordering policy is called the (s, S) form.
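A minimal numerical sketch of the one-period computation, under illustrative primitives (holding cost 2, backlog cost 5, uniform demand on [0, 4], p = 1, c = 3; none of these values come from the text):

```python
import numpy as np

# One-period (s, S) computation on a grid. All primitives are illustrative.
beta, p, c = 0.95, 1.0, 3.0
z = np.linspace(0.0, 4.0, 401)                        # discretized demand
y = np.linspace(-5.0, 10.0, 1501)                     # stock-after-order grid

Phi = lambda v: 2.0 * np.maximum(v, 0.0) + 5.0 * np.maximum(-v, 0.0)
L = beta * Phi(y[None, :] - z[:, None]).mean(axis=0)  # L(y) = beta * E Phi(y - z)

G = L + p * y                                         # G(y) = L(y) + p y
S = y[np.argmin(G)]                                   # order-up-to level
target = c + G.min()                                  # indifference: G(s) = c + G(S)
s = y[np.argmax(G <= target)]                         # first grid point with G(y) <= target

assert s < S                                          # reorder point lies below S
assert abs(np.interp(s, y, G) - target) < 0.1         # G(s) ~ c + G(S) up to grid error
```

The grid search confirms the geometry of Figure 8.5: G decreases to its minimum at S, and s is the point on the decreasing branch where ordering up to S just pays for the fixed cost c.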
Finite-Horizon Problem
Now we generalize the above intuition to a dynamic setting. The n-period problem is given by

fn(x) = min_{y≥x} L(y) + c·1{y>x} + p(y − x) + βE fn−1(y − z), (8.21)
Figure 8.5: Optimality of the (s, S) policy. This figure illustrates the solution for the one-period problem.
where f0 = 0. We may rewrite it as

fn(x) = min{ L(x) + βE fn−1(x − z), min_{y≥x} [L(y) + c + p(y − x) + βE fn−1(y − z)] }.

This problem is a mixed control and stopping problem: the firm first decides whether and when to order (stopping), and if it decides to order, it then chooses how much to order (control).
Define

Gn(y) = L(y) + py + βE fn−1(y − z).

If it is optimal to order from x to y > x, then Gn(x) > c + Gn(y), and this y must minimize Gn(y). Let Sn be the smallest minimizer if there are several. If Gn(y) decreases to a minimum and then increases (e.g., if Gn(y) is convex), then there is a solution sn < Sn to the equation

Gn(sn) = c + Gn(Sn). (8.22)

At the stock level sn, the firm is indifferent between ordering and not ordering. If x < sn, the firm should place an order Sn − x to bring the stock to Sn. Thus, it is natural that the optimal ordering policy takes the (s, S) form.

The problem with the above argument is that Gn(y) may not be convex, and there may be multiple local minimizers, as can be verified by numerical examples even if L(x) is strictly convex. To prove the optimality of the (s, S) policy, we need the notion of c-convexity introduced by Scarf (1960).
Definition 8.4.1 Let c ≥ 0, and let h : R → R be a differentiable function. Say that h is c-convex if

c + h(x + a) − h(x) − a h′(x) ≥ 0 for all a > 0. (8.23)

If differentiability is not assumed, then the above inequality is replaced with

c + h(x + a) − h(x) − a [h(x) − h(x − b)]/b ≥ 0, (8.24)

for all a, b > 0.
Lemma 8.4.1 If h is a differentiable c-convex function, then

c + h(y) − h(x) − (y − x) h′(u) ≥ 0

for all y > x > u.

Proof. We consider two cases.

Case 1. h(u) − h(x) + (x − u) h′(u) ≤ 0. In this case,

c + h(y) − h(x) − (y − x) h′(u) ≥ c + h(y) − h(x) − [(y − x)/(x − u)][h(x) − h(u)] ≥ 0,

where the last inequality follows from (8.24).

Case 2. h(u) − h(x) + (x − u) h′(u) > 0. We then have

c + h(y) − h(x) − (y − x) h′(u) > c + h(y) − h(u) − (x − u) h′(u) − (y − x) h′(u) = c + h(y) − h(u) − (y − u) h′(u) ≥ 0,

where the last inequality follows from (8.23).
Additional properties of c-convex functions are:

(i) A convex function is 0-convex, and any c-convex function is also c′-convex for c′ > c ≥ 0.

(ii) If h(x) is c-convex, then h(x + a) is also c-convex for all a ∈ R.

(iii) If f is c-convex and g is b-convex, then αf + γg is (αc + γb)-convex for α, γ > 0.
An important property of c-convexity is that it is impossible that on three consecutive subintervals (x1, x2), (x2, x3), (x3, x4) the optimal policy is not to order, to order, and not to order, respectively, as illustrated in Figure 8.6. Suppose the contrary. Then there is a local maximizer of Gn in the interval (x2, x3). Let x* be one of the maximizers, and let S2 denote the corresponding order-up-to level. By the c-convexity of Gn(y),

0 ≤ c + Gn(S2) − Gn(x*) − (S2 − x*)[Gn(x*) − Gn(x* − b)]/b ≤ c + Gn(S2) − Gn(x*), for small b > 0,

which is a contradiction, since ordering on (x2, x3) requires Gn(x*) > c + Gn(S2).
The following lemma demonstrates that the optimal policy must be of the (s, S) form.

Lemma 8.4.2 If Gn(y) is continuous, c-convex, and coercive, then the following (s, S) policy is optimal:

dn(x) = 0 if x > sn, and dn(x) = Sn − x if x ≤ sn,

where Sn is a global minimizer of Gn(y) and sn is the smallest solution to (8.22) such that sn < Sn.
Figure 8.6: Minimization of Gn(y).
Proof. Since Gn is continuous and coercive, there exists a minimizing point of Gn; let Sn be such a point. Let sn be the smallest value such that (8.22) holds. For all x < sn, it follows from the definition of c-convexity that

c + Gn(Sn) ≥ Gn(sn) + [(Sn − sn)/(sn − x)](Gn(sn) − Gn(x)).

Thus, Gn(sn) ≤ Gn(x). Since x < sn and sn is the smallest value such that (8.22) holds, we must have Gn(x) > Gn(sn) = Gn(Sn) + c. Thus, for all x < sn, it is optimal to order Sn − x.

Now, we show that Gn(x) ≤ Gn(y) + c for all x, y with sn ≤ x ≤ y. This is true for x = y, y = Sn, or y = sn. There remain two other possibilities: Sn < x < y and sn < x < Sn. If Sn < x < y, then by c-convexity,

c + Gn(y) ≥ Gn(x) + [(y − x)/(x − Sn)](Gn(x) − Gn(Sn)) ≥ Gn(x).

If sn < x < Sn, then

Gn(sn) = c + Gn(Sn) ≥ Gn(x) + [(Sn − x)/(x − sn)](Gn(x) − Gn(sn)).

Thus, Gn(sn) ≥ Gn(x). It follows that

Gn(y) + c ≥ Gn(Sn) + c = Gn(sn) ≥ Gn(x).

We conclude that it is optimal not to order for all x > sn.

We can also prove that Gn(x) is decreasing for x < sn. Let x1 < x2 < sn. We have

c + Gn(Sn) ≥ Gn(x2) + [(Sn − x2)/(x2 − x1)](Gn(x2) − Gn(x1)).

Since Gn(x2) > Gn(Sn) + c, we must have Gn(x2) < Gn(x1).
The following lemma completes the proof of the optimality of the (s, S) policy for the finite-horizon problem by showing that Gn(x) is indeed c-convex for all n ≥ 1.

Lemma 8.4.3 Suppose that Assumption 8.4.1 holds. Then Gn(x) is c-convex for all n ≥ 1.

Proof. We prove this by induction. By definition, f0(x) = 0 and G1(x) = L(x) + px. Clearly G1(x) is c-convex. Suppose that Gn(x) is c-convex. Using the previous lemma, we obtain that for x > sn,

fn(x) = −px + Gn(x),
which is c-convex. For x ≤ sn,

fn(x) = L(Sn) + c + p(Sn − x) + βE fn−1(Sn − z) = c − px + Gn(Sn).

Consider two cases. If x < x + a ≤ sn, then fn(x) is linear in x and hence c-convex. If x ≤ sn < x + a, then

c + fn(x + a) − fn(x) − a [fn(x) − fn(x − b)]/b
= c + [−p(x + a) + Gn(x + a)] − [c − px + Gn(Sn)] − a([c − px + Gn(Sn)] − [c − p(x − b) + Gn(Sn)])/b
= Gn(x + a) − Gn(Sn) ≥ 0,

where the inequality follows from the fact that Sn is the global minimizer of Gn. Thus, fn(x) is c-convex.

By the properties of c-convex functions, fn(x − z), E fn(x − z), and βE fn(x − z) are all c-convex functions. Thus, Gn+1(x) is c-convex.
Infinite-Horizon Problem
We shall study the convergence of the sequence of solutions to the finite-horizon problems. To do so, we need to exploit additional properties of the finite-horizon problems, and we impose:

Assumption 8.4.2 The function L(x) is differentiable.

From the proof of the previous lemma and equation (8.22),

fn(x) = −px + Gn(sn) if x ≤ sn, and fn(x) = −px + Gn(x) if x > sn. (8.25)

In addition, each fn is continuous. Since L(x) is differentiable, we can prove the following results by induction.
(i) Each function fn(x) is differentiable except at the point x = sn, where the left and right derivatives exist:

f′n(x) = −p if x ≤ sn, and f′n(x) = −p + G′n(x) if x > sn. (8.26)

(ii) Each function Gn(y) is differentiable:

G′n(y) = L′(y) + p + βE f′n−1(y − z).
Now, we establish bounds for the sequences {sn} and {Sn}. The following lemma shows that {Sn} is bounded below.

Lemma 8.4.4 Suppose that Assumptions 8.4.1 and 8.4.2 hold. Then S1 ≤ Sn for n = 2, 3, ....

It is sufficient to prove inductively that G′n(x) < 0 for x < S1. We leave the proof as an exercise. The lemma below shows that {Sn} is bounded above.

Lemma 8.4.5 Suppose that Assumptions 8.4.1 and 8.4.2 hold. Then Sn ≤ M < ∞ for all n = 1, 2, ....
Proof. For all a > 0,

Gn(x + a) − Gn(x) = pa + L(x + a) − L(x) + β ∫_0^∞ [fn−1(x + a − z) − fn−1(x − z)] ϕ(z) dz.

Dividing the integral into two regions, we obtain

Gn(x + a) − Gn(x) = pa + L(x + a) − L(x) − paβ ∫_{max{x+a−sn−1, 0}}^∞ ϕ(z) dz
+ β ∫_0^{max{x+a−sn−1, 0}} [fn−1(x + a − z) − fn−1(x − z)] ϕ(z) dz.

Since fn−1 is c-convex, it follows from Lemma 8.4.1 that

∫_0^{max{x+a−sn−1, 0}} [fn−1(x + a − z) − fn−1(x − z)] ϕ(z) dz
≥ ∫_0^{max{x+a−sn−1, 0}} [−c + a f′n−1(x − max{x + a − sn−1, 0})] ϕ(z) dz
≥ ∫_0^{max{x+a−sn−1, 0}} [−c − ap] ϕ(z) dz,

where we have used (8.26). Thus, we obtain

Gn(x + a) − Gn(x) ≥ L(x + a) − L(x) − c.

Since L(x) is convex and lim|x|→∞ L(x) = ∞, there exist M1 and M2 such that when x > M1 and a > M2, L(x + a) − L(x) − c > 0. Suppose that Sn > M1 + M2. Then letting x + a = Sn with x > M1 and a > M2, we find that Gn(Sn) − Gn(x) > 0. This contradicts the fact that Sn minimizes Gn(x). Thus, Sn ≤ M1 + M2 ≡ M.
Clearly, since sn < Sn ≤ M, {sn} is bounded above. The following lemma shows that sn is also bounded below.

Lemma 8.4.6 Suppose that Assumptions 8.4.1 and 8.4.2 hold. Then 2s1 − S1 < sn for all n = 2, 3, ....
Proof. The result is obtained by inductively proving that G′n(x) ≤ −c/(S1 − s1) for x ≤ s1 and all n ≥ 1. For n = 1, the chord connecting the two points (s1, G1(s1)) and (S1, G1(S1)) has slope −c/(S1 − s1). Since G1(x) = px + L(x) is convex, G′1(x) ≤ −c/(S1 − s1) for x ≤ s1. Suppose that G′n(x) ≤ −c/(S1 − s1) for x ≤ s1. Then, since f′n(x) ≤ −p for x ≤ s1 by (8.26) and the induction hypothesis, it follows that

G′n+1(x) ≤ G′1(x) − βp < −c/(S1 − s1) for x ≤ s1.

By the Mean Value Theorem, there exists a ξ ∈ [s1 − (S1 − s1), s1] such that

Gn(Sn) ≤ Gn(s1) = Gn(s1 − (S1 − s1)) + G′n(ξ)(S1 − s1) < Gn(s1 − (S1 − s1)) − c,

where the first inequality follows from the fact that Sn minimizes Gn(x) and the second from the previous claim. From these inequalities, we know that at the stock level s1 − (S1 − s1) it is optimal for the firm to place an order, so that s1 − (S1 − s1) < sn for n = 2, 3, ....
Given the above bounds, we can prove:
Proposition 8.4.1 Suppose that Assumptions 8.4.1 and 8.4.2 hold. Then
{fn (x)} and {Gn (x)} converge monotonically to finite limits and the con-
vergence is uniform for all x in any finite interval. The limit functions f (x)
and G (x) are continuous.
Proof. Define

T(y, x, f) = C(y − x) + L(y) + βE f(y − z),

and let yn(x) be a minimizing value of y, as a function of x, in the finite-horizon problem (8.21). Then (8.21) becomes

fn(x) = min_{y≥x} T(y, x, fn−1) = T(yn, x, fn−1).

Using the optimality properties of yn and yn+1, we obtain

T(yn+1, x, fn) − T(yn+1, x, fn−1) ≤ fn+1(x) − fn(x) ≤ T(yn, x, fn) − T(yn, x, fn−1),

or, expressed differently,

|fn+1(x) − fn(x)| ≤ max{ |T(yn, x, fn) − T(yn, x, fn−1)|, |T(yn+1, x, fn) − T(yn+1, x, fn−1)| }.
Expanding T(·, ·, ·) and cancelling terms yields

|fn+1(x) − fn(x)| ≤ max{ β |∫_0^∞ [fn(yn − z) − fn−1(yn − z)] ϕ(z) dz|, β |∫_0^∞ [fn(yn+1 − z) − fn−1(yn+1 − z)] ϕ(z) dz| }.

Now choose two positive constants A and B such that −A ≤ 2s1 − S1 and B ≥ M, where M is given in Lemma 8.4.5. Then the above inequality implies that

max_{−A≤x≤B} |fn+1(x) − fn(x)| ≤ β max_{−A≤x≤B} ∫ |fn(x − z) − fn−1(x − z)| ϕ(z) dz,

where we have used the fact that −A ≤ yn(x) ≤ B for −A ≤ x ≤ B and n = 1, 2, .... From Lemma 8.4.6, sn > −A. By equation (8.25), fn(x) = −px + Gn(sn) for x ≤ −A < sn. Thus, we deduce that

max_{−A≤x≤B} |fn+1(x) − fn(x)| ≤ β max_{−A≤x≤B} |fn(x) − fn−1(x)|.

Iterating this inequality yields

max_{−A≤x≤B} |fn+1(x) − fn(x)| ≤ β^n max_{−A≤x≤B} |f1(x)|, n = 1, 2, ....

Hence the series ∑_{n=0}^∞ [fn+1(x) − fn(x)] converges absolutely and uniformly for all x in the interval −A ≤ x ≤ B. Since {fn(x)} is an increasing sequence in n for each fixed x, and A and B may be chosen arbitrarily large, we know that fn(x) converges monotonically and uniformly for all x in any finite interval. Since the functions fn(x) are continuous and the convergence is uniform, the limit function f(x) is also continuous. By the definition of Gn(x), we immediately obtain the result for {Gn(x)}.
Proposition 8.4.2 Suppose that Assumptions 8.4.1 and 8.4.2 hold. Then
the limit function f (x) satisfies the Bellman equation (8.20).
Proof. Since convergence to f(·) is monotone,

fn(x) = min_{y≥x} T(y, x, fn−1) ≤ min_{y≥x} T(y, x, f).

Letting n → ∞ yields

f(x) ≤ min_{y≥x} T(y, x, f).

On the other hand, monotone convergence implies that

f(x) ≥ min_{y≥x} T(y, x, fn).

Since Sn ≤ M < ∞, we can choose a constant N > M such that the above inequality is equivalent to

f(x) ≥ min_{N≥y≥x} T(y, x, fn).

This allows us to take the limit by interchanging the limiting, minimum, and integration operations, though the proof is tedious. Thus, we have f(x) = min_{y≥x} T(y, x, f).
The following result characterizes the limiting behavior of {sn} and {Sn}. The proof is left as an exercise.

Proposition 8.4.3 Suppose that Assumptions 8.4.1 and 8.4.2 hold. Then the sequences {sn} and {Sn} contain convergent subsequences. Every limit point of the sequence {Sn} is a point at which the function

G(x) = px + L(x) + βE f(x − z)

attains its minimum. If G(x) has a unique minimum point, the sequence {Sn} converges. Furthermore, G(x) is c-convex, and any limit point s of the sequence {sn} satisfies G(S) + c = G(s), where S is a limit point of the sequence {Sn}. The optimal ordering policy for the infinite-horizon problem is of the (s, S) form:

d(x) = 0 if x > s, and d(x) = S − x if x ≤ s.
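The construction of fn by iterating the Bellman operator, and the extraction of the limiting (s, S) pair, can be sketched on a grid. All primitives below are illustrative, and the linear extension of fn below the grid uses the slope −p implied by (8.25):

```python
import numpy as np

# Grid-based value iteration for the inventory problem (illustrative primitives:
# holding cost 2, backlog cost 5, z ~ U[0, 4], p = 1, fixed cost c = 3).
beta, c, p = 0.95, 3.0, 1.0
z = np.linspace(0.0, 4.0, 81)              # discretized demand
xg = np.linspace(-5.0, 10.0, 601)          # stock grid

Phi = lambda v: 2.0 * np.maximum(v, 0.0) + 5.0 * np.maximum(-v, 0.0)
L = beta * Phi(xg[None, :] - z[:, None]).mean(axis=0)

def expect_f(f):
    """E f(y - z) on the y grid; below the grid, extend f with slope -p (8.25)."""
    pts = xg[None, :] - z[:, None]
    vals = np.interp(np.clip(pts, xg[0], xg[-1]), xg, f)
    below = pts < xg[0]
    vals[below] = f[0] + p * (xg[0] - pts[below])
    return vals.mean(axis=0)

f = np.zeros_like(xg)                      # f_0 = 0
for _ in range(300):
    G = L + p * xg + beta * expect_f(f)                 # G_n(y)
    sufmin = np.minimum.accumulate(G[::-1])[::-1]       # min over y >= x of G(y)
    f = -p * xg + np.minimum(G, c + sufmin)             # order iff it saves > c

S = xg[np.argmin(G)]                       # order-up-to level
order = G > c + sufmin + 1e-9              # states where ordering is optimal
s = xg[order][-1]                          # reorder threshold

assert order.any() and s < S
```

The order region is an interval of low stocks bounded above by s < S, as Proposition 8.4.3 asserts for the limit policy.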
Note that G (x) may have multiple minimizers. If there is a unique
minimizer, then {Sn} will converge to it. Finally, we state a uniqueness
result for the Bellman equation. The proof is nontrivial and is omitted here.
Interested readers may consult Iglehart (1963).
Proposition 8.4.4 Suppose that Assumptions 8.4.1 and 8.4.2 hold. If F (x)
is a solution to the Bellman equation (8.20), which is bounded in any finite
interval and is such that for some y infx≤y F (x) > −∞, then F (x) = f (x) .
What is the limiting property of the stochastic process {xt}? From the
optimal (s, S) policy, we obtain that the sequence of inventory stocks {xt}satisfies:
xt+1 =
{xt − zt+1 if s < xt ≤ SS − zt+1 if xt ≤ s
.
Iglehart (1963) shows that {xt} has a stationary distribution which can be
characterized explicitly using renewal theory for stochastic processes. A
simple way to prove the existence of a stationary distribution is to apply
Theorem 3.3.1 by verifying Condition M. We leave this analysis as an exercise.
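The dynamics of {xt} under an (s, S) policy are easy to simulate, and the empirical distribution of a long sample path approximates the stationary distribution whose existence is asserted above. The following sketch is illustrative: the parameter values and the exponential demand distribution (the specification used in Exercise 6 below) are hypothetical choices, not taken from the text.

```python
import random

def simulate_sS(s, S, lam, T, x0=0.0, seed=0):
    """Simulate inventory under an (s, S) policy with i.i.d.
    exponential(lam) demand. Order up to S whenever x_t <= s;
    unmet demand is backlogged (x_t may go negative)."""
    rng = random.Random(seed)
    x, path = x0, []
    for _ in range(T):
        y = S if x <= s else x         # stock after ordering
        x = y - rng.expovariate(lam)   # stock after demand z_{t+1}
        path.append(x)
    return path

path = simulate_sS(s=2.0, S=10.0, lam=0.5, T=100_000)
assert max(path) <= 10.0  # the stock never exceeds the order-up-to level S
```

A histogram of `path` then approximates the stationary distribution of the inventory process.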
8.5 Investment
We first present neoclassical theory and Q theory in the deterministic case.
Introducing uncertainty to these theories does not change their essence. We
then present models with fixed and/or proportional adjustment costs. In
these models, uncertainty plays an important role in investment decisions
and will be explicitly incorporated.3
8.5.1 Neoclassical Theory
Consider a firm’s investment decisions. The firm uses capital and labor
inputs to produce output according to the production function F : R2+ → R+. Assume that F(0, ·) = F(·, 0) = 0, F1 > 0, F2 > 0, F11 < 0, F22 < 0, limK→0 F1(K,L) = ∞, limL→0 F2(K,L) = ∞, and limK→∞ F1(K,L) = 0.

3See Dixit and Pindyck (1994) for an extensive study of investment under uncertainty.
The firm hires labor at the wage rate wt. Suppose that it rents capital at
the rental rate Rkt. The firm’s objective is to maximize discounted present
value of profits:
max_{(Kt,Lt)} Σ_{t=0}^∞ β^t [F(Kt, Lt) − wtLt − RktKt],
where β = 1/ (1 + r) and r is the interest rate. This problem reduces to
maximizing static profits. The solution is such that
F1 (Kt, Lt) = Rkt and F2 (Kt, Lt) = wt.
In reality, most capital is not rented, but is owned by firms that use it.
We now present the problem if the firm owns capital and makes investment
decisions:
max_{(Kt,Lt)} Σ_{t=0}^∞ β^t [F(Kt, Lt) − wtLt − It], (8.27)
subject to
Kt+1 = (1− δ)Kt + It, K0 > 0 given, (8.28)
where It represents investment spending and δ ∈ (0, 1) represents the depreciation rate. Let β^t Qt be the Lagrange multiplier associated with (8.28). The
variable Qt represents the shadow price of capital. The first-order conditions
are:
Qt = 1, Qt = β(1− δ + F1 (Kt+1, Lt+1)).
Thus,
F1 (Kt+1, Lt+1) = r + δ.
Thus, if the rental rate Rk is equal to r + δ, then the above two problems
give identical solutions.
It is convenient to define operating profits net of labor costs only as:
π(Kt) = max_{Lt} F(Kt, Lt) − wtLt.
Then by the envelope condition, π′ (Kt) = F1 (Kt, Lt) . For the Cobb-Douglas
production function F(K,L) = K^αk L^αl, we can derive that

π(Kt) = (1 − αl) (αl/wt)^{αl/(1−αl)} Kt^{αk/(1−αl)} ≡ At Kt^{αk/(1−αl)}.
We may assume that wt is a fixed constant and hence At is constant too.
If the production function has constant returns to scale αk + αl = 1, then
π (K) = AK. If the production function has increasing (decreasing) returns
to scale αk + αl > (<)1, then π (K) is convex (concave).
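As a sanity check on the profit function just derived, one can maximize F(K,L) − wL over L by brute force and compare with the closed form. The parameter values below are illustrative (decreasing returns, αk + αl < 1):

```python
def profit_direct(K, alpha_k, alpha_l, w, grid=100_000, L_max=20.0):
    """Maximize F(K, L) - w*L over a fine grid of L (brute force)."""
    best = 0.0
    for i in range(1, grid):
        L = L_max * i / grid
        best = max(best, K**alpha_k * L**alpha_l - w * L)
    return best

def profit_formula(K, alpha_k, alpha_l, w):
    """pi(K) = (1 - a_l) * (a_l/w)**(a_l/(1-a_l)) * K**(a_k/(1-a_l))."""
    e = alpha_l / (1.0 - alpha_l)
    return (1.0 - alpha_l) * (alpha_l / w)**e * K**(alpha_k / (1.0 - alpha_l))

# Illustrative parameters: alpha_k = 0.3, alpha_l = 0.6, w = 1.
assert abs(profit_direct(2.0, 0.3, 0.6, 1.0) - profit_formula(2.0, 0.3, 0.6, 1.0)) < 1e-4
```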
8.5.2 Q Theory
The neoclassical investment theory implies that capital demand is deter-
mined by the static condition that the marginal product of capital is equal
to the user cost of capital. This condition has two undesirable implications.
First, investment and capital are too responsive to exogenous shocks. Sec-
ond, there is no mechanism through which expectations about the future
affect investment demand. To modify this theory, we now introduce the
Q theory of investment developed by Abel (1982), Hayashi (1982), and
Summers (1981).4
The key idea of the Q theory is that investment incurs convex adjustment
costs. Following Lucas (1967), Gould (1968), and Treadway (1969), assume
that these costs reduce firm profits directly. These costs may reflect costs
of installing capital and training workers. Assume that the adjustment cost function Ψ : R × R+ → R+ satisfies Ψ1 > 0, Ψ11 > 0, Ψ2 ≤ 0, and Ψ(0, ·) = 0.

4See Lucas (1967), Gould (1969), Uzawa (1969), and Treadway (1969) for early studies of investment with adjustment costs.

The firm's problem is given by:
max_{(It,Kt)} Σ_{t=0}^∞ β^t [π(Kt) − It − Ψ(It, Kt)],
subject to (8.28). Let βtQt be the Lagrange multiplier associated with
(8.28). The first-order conditions are given by:
1 + Ψ1 (It,Kt) = Qt, (8.29)
Qt = β[π′ (Kt+1)−Ψ2 (It+1,Kt+1)] + β(1− δ)Qt+1. (8.30)
The transversality condition is given by:

lim_{T→∞} β^T QT KT+1 = 0. (8.31)
From equation (8.30), we derive that

Qt = Σ_{j=1}^∞ β^j (1 − δ)^{j−1} [π′(Kt+j) − Ψ2(It+j, Kt+j)], (8.32)

where we have used the condition:

lim_{T→∞} β^T (1 − δ)^T QT+1 = 0,
which follows from the transversality condition (8.31). Equation (8.32)
shows that Q is equal to the present value of undepreciated marginal rev-
enue product of capital net of changes in adjustment costs due to changes
in the capital stock. This equation indicates that Q summarizes all relevant
information about the future profitability of investment. Equation (8.29)
shows that investment increases with Q and Q is a sufficient statistic for
investment decisions.
Q has a natural economic interpretation. A unit increase in the firm’s
capital stock raises the present value of profits by Q and thus raises the
value of the firm by Q. Thus, Q is the market value of a marginal unit
of capital. Since we have assumed that the purchase price of capital is 1,
Q is also the ratio of the market value of a marginal unit of capital to its
replacement cost. It is often referred to as Tobin’s marginal Q (Tobin
(1969)). Tobin’s average Q is defined as the ratio of the total market
value of the firm to the replacement cost of its total capital stock. Marginal
Q is unobservable in the data. To test the Q theory of investment, one must
use a proxy for marginal Q.
Hayashi (1982) shows that if the production function has constant re-
turns to scale and the adjustment cost function is linearly homogenous, then
marginal Q is equal to average Q. Since the latter is observable in the data,
one can then use average Q to replace marginal Q to test the Q theory of
investment. We now prove Hayashi’s result formally.
Let π(K) = AK. By equation (8.30) and linear homogeneity of Ψ (Euler's theorem gives Ψ2(I,K) = [Ψ(I,K) − IΨ1(I,K)]/K),

Qt = β[π′(Kt+1) − (Ψ(It+1,Kt+1) − It+1Ψ1(It+1,Kt+1))/Kt+1] + β(1 − δ)Qt+1.

Using equation (8.29) yields:

Qt = β[π′(Kt+1) − (Ψ(It+1,Kt+1) − It+1Qt+1 + It+1)/Kt+1] + β(1 − δ)Qt+1
   = β[π′(Kt+1) − (Ψ(It+1,Kt+1) + It+1 − (Kt+2 − (1 − δ)Kt+1)Qt+1)/Kt+1] + β(1 − δ)Qt+1
   = β[AKt+1 − Ψ(It+1,Kt+1) − It+1]/Kt+1 + βKt+2Qt+1/Kt+1.

Thus,

QtKt+1 = β[AKt+1 − Ψ(It+1,Kt+1) − It+1] + βKt+2Qt+1.
Iterating forward and using the transversality condition, we obtain that:

QtKt+1 = Σ_{j=1}^∞ β^j [π(Kt+j) − It+j − Ψ(It+j, Kt+j)].
The expression on the right-hand side of the above equation represents the
total (ex-dividend) market value of the firm or (ex-dividend) equity value.
Thus, we have proved Hayashi’s result.
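Hayashi's result can also be checked numerically. The sketch below assumes π(K) = AK and the linearly homogeneous adjustment cost Ψ(I,K) = I²/(2K); both the functional form and the parameter values are illustrative choices, not taken from the text. Under this specification Q is constant along the optimal path: substituting the first-order condition Q = 1 + I/K into (8.30) gives a quadratic in Q, and the claim QtKt+1 = Σ_{j≥1} β^j [π − I − Ψ] can then be verified by direct summation.

```python
# Assumed specification: pi(K) = A*K, Psi(I, K) = I^2/(2K), illustrative parameters.
# FOC (8.29): Q = 1 + i with i = I/K; Euler (8.30): Q = beta*(A + i^2/2) + beta*(1-delta)*Q,
# since pi'(K) = A and -Psi_2 = i^2/2.
beta, A, delta = 0.95, 0.15, 0.10

# Substituting i = Q - 1 gives (beta/2)Q^2 - (1 + beta*delta)Q + beta*(A + 1/2) = 0.
b = 1.0 + beta * delta
disc = b * b - 2.0 * beta**2 * (A + 0.5)
Q = (b - disc**0.5) / beta          # stable root: beta*(1 - delta + i) < 1

i = Q - 1.0                         # investment rate I/K
g = 1.0 - delta + i                 # gross growth rate of capital
d = A - i - i * i / 2.0             # dividend per unit of capital

# Hayashi: Q_t K_{t+1} equals the ex-dividend value sum_{j>=1} beta^j [pi - I - Psi].
K0, pv, K = 1.0, 0.0, 1.0
for t in range(1, 2000):
    K = g * K                       # K_t along the optimal path
    pv += beta**t * d * K
assert abs(Q * g * K0 - pv) < 1e-8  # marginal Q times K_{t+1} = firm value
```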
An alternative way of introducing adjustment costs is suggested by Uzawa
(1969). He leaves (8.27) unchanged, but changes (8.28) as follows:
Kt+1 = (1− δ)Kt +Φ(It,Kt) , (8.33)
where Φ represents the adjustment cost function. In this formulation I units
of investment do not turn into I units of capital, but Φ units. Assume that
Φ satisfies Φ1 > 0, Φ11 < 0 and Φ (0, ·) = 0. Exercise 14 asks the reader to
show that both this formulation and the previous one deliver similar results.
8.5.3 Augmented Adjustment Costs
There is ample empirical evidence that documents irreversibility and lumpi-
ness of investment in the data. To model this behavior, we follow Abel
and Eberly (1994) and Cooper and Haltiwanger (2006) and introduce an
augmented adjustment cost function given by:
C(I,K) =
    { pbI + φ(K) + Ψ(I,K)  if I > 0,
    { 0                    if I = 0,
    { psI + φ(K) + Ψ(I,K)  if I < 0,
where pb > ps ≥ 0 represent purchase and sale prices, respectively, φ (K)
represents fixed costs of adjusting capital, and Ψ (I,K) represents convex
adjustment costs. The difference between the purchase and sale prices re-
flects transactions costs. Assume that fixed costs may depend on the size of
the firm measured by K. Assume that the convex adjustment cost function
Ψ satisfies previous assumptions and also Ψ1 (0,K) = 0.
The firm's optimization problem is given by:

max_{(It)} E Σ_{t=0}^∞ β^t [π(Kt, zt) − pbIt1{It≥0} − psIt1{It≤0} − φ(Kt)1{It≠0} − Ψ(It, Kt)],
where (zt) represents profitability shocks and follows a Markov process with
transition function P . Given any state (K, z) , the firm has three options:
(i) take inaction and let capital depreciate and obtain value V i (K, z) , (ii)
buy capital and obtain value V b (K, z) , (iii) sell capital and obtain value
V s (K, z). The firm chooses the best among these three and obtains value
V(K, z). The Bellman equation is given by

V(K, z) = max{V^i(K, z), V^b(K, z), V^s(K, z)}, (8.34)

where

V^i(K, z) = π(K, z) + βE[V((1 − δ)K, z′)|z],

V^b(K, z) = max_{I>0} π(K, z) − pbI − φ(K) − Ψ(I,K) + βE[V(K′, z′)|z] subject to (8.28),

V^s(K, z) = max_{I<0} π(K, z) − psI − φ(K) − Ψ(I,K) + βE[V(K′, z′)|z] subject to (8.28).
Here E [·|z] represents the conditional expectation operator over z′ given z.
This problem is similar to the inventory problem analyzed earlier. It is
hard to analyze given Markov shocks. To provide transparent intuition, we
will proceed heuristically. Interested readers are invited to make arguments
below more rigorous.5 From the Bellman equation, we observe that optimal
investment I∗ solves the following problem:

ϕ(K, z) = max_I βE[V((1 − δ)K + I, z′)|z] − βE[V((1 − δ)K, z′)|z] − psI1{I<0} − pbI1{I>0} − φ(K) − Ψ(I,K). (8.35)

5Under some conditions, the continuous-time version of the investment problem has a closed-form solution.
Suppose that V is differentiable in K.6 Using Taylor approximation up to
the first order around I = 0, we rewrite the above problem as:

ϕ(K, z) ≈ max_I βE[V1((1 − δ)K, z′)|z] I − psI1{I<0} − pbI1{I>0} − φ(K) − Ψ(I,K)
        = max_I QI − psI1{I<0} − pbI1{I>0} − φ(K) − Ψ(I,K) ≡ ψ(Q,K),

where we define marginal Q at zero investment as:

Q = βE[V1((1 − δ)K, z′)|z]. (8.36)
We analyze two cases. First, consider the case without fixed costs
φ (K) = 0. The first-order conditions for I in the above problem are given
by7
pb +Ψ1 (I∗,K) = Q and I∗ > 0 if Q > pb, (8.37)
I∗ = 0 if ps ≤ Q ≤ pb, (8.38)
ps +Ψ1 (I∗,K) = Q and I∗ < 0 if Q < ps. (8.39)
These conditions deliver a modified Q theory in the sense that optimal
investment is weakly increasing in Q. Optimal investment is in one of three
regimes (positive, zero and negative investment), depending on the value of
Q relative to two critical values pb and ps. Figure 8.7 illustrates the function
ψ (Q,K) for a fixed K.
Using equation (8.36), we obtain a boundary K = Kb(z) determined by the equation:

pb = βE[V1((1 − δ)K, z′)|z]. (8.40)

6Clearly, there are kinks in V due to fixed costs or transactions costs.
7More precisely, the first-order condition for (8.35) implies that Q = βE[V1((1 − δ)K + I, z′)|z].
Figure 8.7: The reward to investing. This figure plots the function ψ(Q,K) for a fixed value of K in the case without fixed costs.
On this boundary, the firm is indifferent between buying capital and taking
inaction. If V is concave, V1 increases with z′, and the transition function
P of (zt) is monotone, then Kb (z) increases with z. Below this boundary,
Q > pb and the firm buys capital such that the target investment level
satisfies equation (8.37). We say that states (K, z) below the boundary
Kb(z) are in the “buy region.” Similarly, there is a boundary K = Ks (z)
determined by the equation:
ps = βE[V1
((1− δ)K, z′
)|z]. (8.41)
Above this boundary, Q < ps and the firm sells capital such that the target
disinvestment level satisfies equation (8.39). We say that states (K, z) above
the boundary Ks (z) are in the “sell region.” Within these two boundaries,
the firm does not invest or disinvest. This is the “inaction region.” Whenever
the firm faces a large positive (negative) shock z such that the state (K, z)
moves outside the inaction region, the firm responds by buying (selling)
capital to bring the state into the inaction region.
Figure 8.8: Three regions of states (K, z). This figure plots the two boundaries that partition the state space into the buy region, sell region, and inaction region in the case without fixed costs.
Note that without fixed costs, there is no investment lumpiness in that
investment responds to profitability shocks continuously. There is no dis-
crete jump of investment. Figure 8.8 illustrates the three regions of states
(K, z) and the property of the solution.
In the special case of ps = 0, equation (8.41) cannot hold. Then it is
never optimal for the firm to make negative investment. This implies that
investment is fully irreversible. When pb > ps > 0, investment is partially
irreversible.
Example 8.5.1 Consider the following static investment problem:

ψ(Q) = max_I QI − pbI1{I>0} − psI1{I<0} − I²/2 − φ1{I≠0}.
If there is no fixed cost (φ = 0), the optimal investment level is given by

I∗ =
    { Q − pb > 0  if Q > pb,
    { 0           if ps ≤ Q ≤ pb,
    { Q − ps < 0  if Q < ps.

The corresponding profits are given by

ψ(Q) =
    { (Q − pb)²/2  if Q > pb,
    { 0            if ps ≤ Q ≤ pb,
    { (Q − ps)²/2  if Q < ps.
In the presence of fixed costs φ > 0, we must subtract φ from the preceding equation. To ensure nonnegative profits, the new investment policy is given by

I∗ =
    { Q − pb > 0  if Q > pb + √(2φ),
    { 0           if ps − √(2φ) ≤ Q ≤ pb + √(2φ),
    { Q − ps < 0  if Q < ps − √(2φ),

where the thresholds follow from setting (Q − pb)²/2 = φ and (Q − ps)²/2 = φ.
Figure 8.9: The reward to investing. This figure plots the function ψ(Q,K) for a fixed value of K in the presence of fixed costs. The curve crosses the horizontal axis at Q1 and Q2.
Next, consider the case of fixed costs φ (K) > 0. Although equations
(8.37)-(8.39) are still first-order necessary conditions, they are not sufficient
for optimality. As Figure 8.9 illustrates, the presence of fixed costs shifts
the previous ψ (Q,K) curve down by φ (K) . The new curve crosses the
horizontal axis at two points Q1 and Q2. The new investment rule is given
by:
pb +Ψ1 (I∗,K) = Q and I∗ > 0 if Q > Q2, (8.42)
I∗ = 0 if Q1 ≤ Q ≤ Q2, (8.43)
ps +Ψ1 (I∗,K) = Q and I∗ < 0 if Q < Q1. (8.44)
When Q = Q2 (Q = Q1), the firm is indifferent between buying (selling)
and taking inaction. We can derive these boundaries more explicitly using
the Bellman equation (8.34). The buy boundary Kb (z) is the maximal
solution K to the equation:
V i (K, z) = V b (K, z) , (8.45)
for each z. Similarly, the equation
V i (K, z) = V s (K, z) , (8.46)
determines a sale boundary Ks(z). It is intuitive that the firm will buy cap-
ital if K < Kb (z) , sell capital if K > Ks (z) , and take inaction, otherwise.
These two boundaries are increasing functions of z and partition the state
space for (K, z) into three regions as in Figure 8.8. The presence of fixed
costs widens the inaction region.
Equivalently, the two boundaries are determined by the maximal and minimal solutions to the equation ϕ(K, z) = 0. In the case without fixed costs (φ(K) = 0), ϕ(K, z) ≈ ψ(Q,K). Thus, conditions (8.40)-(8.41) are consistent with (8.45)-(8.46). We may view the former as the differential form of the boundary conditions.
In the presence of fixed costs φ (K) > 0, the boundary conditions cannot
take a differential form. In this case, investment exhibits lumpiness in that
it may have discrete jumps. As illustrated in Figure 8.9, between pb and
Q2, investment returns are not sufficient to cover the fixed costs. The firm
must refrain from investment until Q is sufficiently large. When Q just
exceeds Q2, optimal investment jumps up discretely. The target investment
(or disinvestment) level satisfies equation (8.42) (or (8.44)). When there is
no transactions cost in that pb = ps, these two equations collapse to a single
one.
8.6 Exercises
1. Consider McCall's (1970) job search model analyzed in Section 8.1.
Suppose that the wage offer follows a Markov process with transition
function Q. Write down the Bellman equation and characterize the
solution.
2. Suppose that each period there is probability α that the worker is fired in McCall's (1970) job search model. After the worker is fired, he
has to wait for one period and start to search for a new job in the next
period. The job offer is still drawn from the previous wage distribution.
Write down the Bellman equation for this case and characterize the
solution.
3. Prove Lemma 8.4.4.
4. Prove Proposition 8.4.3.
5. Let {xt} be the sequence of optimal inventory levels for the inventory
control model analyzed in Section 8.4. Show that it is a Markov process
and derive its transition function. Prove that it has a unique stationary
distribution.
6. Consider the inventory model in Section 8.4. Suppose that the fixed
distribution for the demand zt has density λe−λz, z ≥ 0. Characterize
s and S.
7. Suppose that there is no fixed cost (c = 0) in the inventory control
problem analyzed in Section 8.4. Show that s = S and s satisfies the
equation:
p (1− β) + L′ (s) = 0.
8. Consider the following deterministic consumption-saving problem

max Σ_{t=0}^∞ β^t c_t^{1−γ}/(1 − γ)
subject to
ct + at+1 = Rat + yt, a0 given,
and a no-Ponzi-game condition (natural borrowing constraint). Here
yt is deterministic labor income. Derive the optimal consumption sav-
ing policy. Give conditions such that the value function is finite. Study
properties of the optimal consumption and saving plans for the cases
of βR > 1, βR < 1, and βR = 1.
9. Consider the stochastic consumption-saving problem in the case of
βR < 1 analyzed in Section 8.3.2. Prove that xmax > xmin and xmax
is the largest value such that Ra(xmax) + ymax − rb = xmax.
10. (Levhari, Mirman, and Zilcha (1980)) Consider the following consumption-saving problem:

max_{0≤ct≤xt} E Σ_{t=0}^∞ −e^{−γct}

subject to

xt+1 = R(xt − ct) + yt+1, x0 given.

Characterize the optimal consumption-saving policies for two cases: (i) yt = y is constant over time; (ii) yt is IID and drawn from a fixed distribution.
11. (Schechtman and Escudero (1977)) Explicitly solve the optimal consumption-saving policies for the following problem:

max_{ct} E[ Σ_{t=0}^∞ −e^{−γct} ]

subject to

xt+1 = R(xt − ct) + yt, x0 given,
yt+1 = ρyt + εt+1,

where εt+1 is independently and identically drawn from a fixed distribution over a bounded support. Let ρ = 0. Show that there is no stationary distribution for (xt).
12. Use the Bellman equation to derive Hayashi’s result. Interpret marginal
Q in terms of the value function.
13. Consider the following investment problem

max Σ_{t=0}^∞ β^t [AKt − It − It²/(2Kt)]

subject to

Kt+1 = It + (1 − δ)Kt.
Solve for the optimal investment policy and the value of the firm. Give
conditions such that the value function is finite.
14. Consider Uzawa’s formulation of convex adjustment costs in equation
(8.33). Derive the optimal investment policy and formulate the Q
theory of investment in this framework.
Chapter 9
Linear-Quadratic Models
Linear-quadratic models have two essential features: quadratic reward func-
tions and linear constraints. As we will show, these models deliver linear
decision rules and quadratic value functions. Thus, they are tractable in
applications. These models can also be used to approximate and solve non-
linear models. In this chapter, we first explain how to solve linear-quadratic
control models. We then study optimal policy under commitment and dis-
cretion using a linear-quadratic framework. Finally, we introduce robust
control within the linear-quadratic framework.
9.1 Controlled Linear State-Space System
We are given a fixed probability space with a filtration {Ft}_{t=0}^∞. On this space, we define a process of conditionally homoskedastic martingale differences {εt}_{t=1}^∞ which satisfies:

Assumption 9.1.1 The process {εt}_{t=1}^∞ satisfies:

(i) E[εt+1|Ft] = 0;

(ii) E[εt+1 ε′t+1|Ft] = I.
Consider the controlled linear system:
xt+1 = Axt +But + Cεt+1, (9.1)
where xt is an nx × 1 vector of state variables, ut is an nu × 1 vector of control variables adapted to {Ft}, and εt is an nε × 1 vector of random variables satisfying
Assumption 9.1.1. Here A,B, and C are conformable matrices. Note that
xt may contain exogenous state variables that are independent of controls.
Equation (9.1) is often called a transition equation. The state vector xt
may be partially observable in that it is not Ft-measurable. An unobserv-
able state is called a hidden state. Let Ft be generated by the observation
ny-vector yt, which satisfies
yt = Gxt +Hut−1 +Dεt, (9.2)
where G,H, and D are conformable matrices. We call equation (9.2) a
measurement equation. The above two equations constitute a state
space representation. When the processes {xt} and {εt} are adapted
to {Ft}, the decision maker possesses full information and the measurement
equation can be ignored. Otherwise, the system is partially observed.
In this section, we introduce some important concepts for the controlled
linear system when A, B, C, D, G, and H are independent of time. These concepts remain valid when the matrices are time dependent. First, we consider
stability.
Definition 9.1.1 The pair of matrices (A,B) is stabilizable if there is a
matrix H such that A+BH is stable.
The motivation of this definition is the following. Take a control ut =
Hxt so that the transition equation reduces to
xt+1 = (A+BH)xt + Cεt+1.
Then the deterministic steady-state x = 0 will be stable if and only if A +
BH is stable, i.e., (A,B) is stabilizable. This condition can be verified by
choosing a trivial control law.
Next, we say the system (9.1) is controllable if, given an arbitrary
initial value x0, one can choose values u0, u1, ..., ur−1 such that xr has an
arbitrary prescribed value for any r > 0. This definition is equivalent to the
following (see Whittle (1982), Theorem 4.3.1):
Definition 9.1.2 The pair of matrices (A,B) is controllable if
Σ_{j=0}^{nx−1} A^j BB′ (A′)^j > 0,

where nx is the number of rows of the matrix A.
Note that controllability implies stabilizability. But the converse is not
true. For example, let A be a stability matrix and set B = 0. Clearly, this
system is stabilizable, but not controllable. This example shows that the
concept of controllability applies to endogenous states affected by controls
only. It does not apply to exogenous states.
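Definition 9.1.2 is straightforward to check numerically: form the finite Gramian and test whether it is positive definite. A sketch (the example matrices are illustrative):

```python
import numpy as np

def is_controllable(A, B, tol=1e-10):
    """Check Definition 9.1.2: sum_{j=0}^{nx-1} A^j B B' (A')^j > 0."""
    nx = A.shape[0]
    W = np.zeros((nx, nx))
    Aj = np.eye(nx)
    for _ in range(nx):
        W += Aj @ B @ B.T @ Aj.T
        Aj = Aj @ A
    return bool(np.all(np.linalg.eigvalsh(W) > tol))

A = np.array([[0.5, 1.0],
              [0.0, 0.5]])
assert is_controllable(A, np.array([[0.0], [1.0]]))      # input reaches both states
assert not is_controllable(A, np.array([[1.0], [0.0]]))  # second state is never moved
```

The same Gramian test applied to (A′, G′) in place of (A, B) checks observability in the sense of Definition 9.1.3.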
In the case of partial information, the state space representation (9.1)
and (9.2) is called observable if x0 can be determined from knowledge of
y0, y1, .., yr; u0, u1, .., ur−1, ε0, ..., εr for any initial value x0 and for any r > 0.
This definition is equivalent to the following (see Whittle (1982), Theorem
4.4.1):
Definition 9.1.3 The pair of matrices (A,G) is observable if
Σ_{j=0}^{nx−1} (A′)^j G′G A^j > 0,

where nx is the number of rows of the matrix A.
We now introduce the last concept.
Definition 9.1.4 The pair of matrices (A,G) is detectable if there is a
matrix J such that A− JG is stable.
The motivation of this definition is the following. Let xt+1|t be the estimate of xt+1 given information yt. Under some conditions, we will show in Section ?? that xt+1|t satisfies the system:

xt+1|t = Axt|t−1 + But + AKat, (9.3)

where at = yt − E[yt|yt−1] = G(xt − xt|t−1) + Dεt. Define Δt = xt − xt|t−1. Then subtracting equation (9.3) from equation (9.1) yields:

Δt+1 = (A − AKG)Δt + Cεt+1 − AKDεt.
Note that Δt is stable if and only if A−AKG is stable. Thus, detectability
implies that one can find an updating rule of type (9.3) which will asymp-
totically yield a correct estimate of state.
9.2 Finite-Horizon Problems
In what follows, we focus on the case of full information and ignore the
measurement equation (9.2). In the next chapter, we will study the case
of partial information and incorporate the measurement equation. We first
consider the non-stationary finite-horizon problem:

max_{ut} −E[ Σ_{t=0}^{T−1} β^t (x′tQtxt + u′tRtut + 2x′tStut) + β^T x′T PT xT ], β ∈ (0, 1],

subject to

xt+1 = Atxt + Btut + Ctεt+1,
where At, Bt, Ct, Qt, Rt, and St are deterministic matrices that depend on t
only. Suppose that {εt} satisfies Assumption 9.1.1.
For any two symmetric matrices M and N, say that M ≥ N if M − N is positive semi-definite and M > N if M − N is positive definite.
Proposition 9.2.1 Suppose that

[ Qt  St ]
[ S′t Rt ]

and PT are positive semi-definite and that Rt > 0 for t < T. Then (i) the value function takes the form

Vt(xt) = −x′tPtxt − dt, (9.4)

where Pt ≥ 0 and satisfies

Pt = Qt + βA′tPt+1At − (βA′tPt+1Bt + St)(Rt + βB′tPt+1Bt)^{−1}(βB′tPt+1At + S′t), (9.5)

for t < T, and dt satisfies

dt = β tr(Pt+1CtC′t) + βdt+1, t < T, dT = 0. (9.6)

(ii) The optimal control is

ut = −Ftxt, (9.7)

where

Ft = (Rt + βB′tPt+1Bt)^{−1}(S′t + βB′tPt+1At), t < T. (9.8)
Proof. The proof is by induction. At time T, VT(xT) = −x′TPTxT. Suppose (9.4) holds at time t + 1 ≤ T. Consider time t. By the Bellman equation,

Vt(xt) = max_{ut} −(x′tQtxt + u′tRtut + 2x′tStut) + βEtVt+1(xt+1),
where Et denotes the conditional expectation operator given information Ft. Noting that Etεt+1 = 0 and

Etε′t+1C′tPt+1Ctεt+1 = tr(Etε′t+1C′tPt+1Ctεt+1) = Et tr(Pt+1Ctεt+1ε′t+1C′t) = tr(Pt+1CtC′t),

we can compute

EtVt+1(xt+1) = −Et(Atxt + Btut + Ctεt+1)′Pt+1(Atxt + Btut + Ctεt+1) − dt+1
             = −(Atxt + Btut)′Pt+1(Atxt + Btut) − Etε′t+1C′tPt+1Ctεt+1 − dt+1
             = −(Atxt + Btut)′Pt+1(Atxt + Btut) − tr(Pt+1CtC′t) − dt+1.
Substituting this equation into the above Bellman equation and taking the first-order condition,1 we obtain

(Rt + βB′tPt+1Bt)ut = −(S′t + βB′tPt+1At)xt.

This gives (9.7) and (9.8). Since Rt > 0 and Pt+1 ≥ 0, (Rt + βB′tPt+1Bt) is also positive definite. Substituting this decision rule into the Bellman equation and matching coefficients yields (9.4)-(9.6).
The decision rule in (9.7) is called a feedback or closed-loop control
in that ut depends on the current state xt. An open-loop control may be
expressed as ut = u∗ (x0, t) for which actions are determined by the clock
rather than by the current state. These two forms may not be equivalent.
Equation (9.5) is called a Riccati equation, which is of fundamental importance for later analysis. To understand this equation, note that Pt

1We use the following differentiation rules: ∂(x′Ax)/∂x = (A + A′)x, ∂(y′Bz)/∂y = Bz, and ∂(y′Bz)/∂z = B′y.
satisfies

Pt = min_{Ft} Qt − F′tS′t − StFt + F′tRtFt + β(At − BtFt)′Pt+1(At − BtFt), (9.9)

where the minimization is in the sense of positive definiteness. The minimizer Ft is given by (9.8).
From the above analysis, we observe that uncertainty through the matrix
Ct does not affect the linear decision rule (9.7) and (9.8). It also does not
affect the matrix Pt in (9.5). It only affects the constant term dt. We state
the observation as:
Certainty Equivalence Principle: The decision rule that solves the stochas-
tic optimal linear-quadratic problem is identical with the decision rule for the
corresponding deterministic linear-quadratic problem.
By this principle, we can solve for the decision rules using the corre-
sponding deterministic model. This principle is very useful for us to study
linear-quadratic control problems.
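The recursion in Proposition 9.2.1 is easy to implement. The sketch below iterates (9.5) and (9.8) backward; the matrices are taken to be time-invariant only to keep the code short, and the example system is illustrative. By certainty equivalence, C is not needed to compute Pt or Ft:

```python
import numpy as np

def backward_riccati(A, B, Q, R, S, P_T, beta, T):
    """Backward induction on the Riccati equation (9.5) and the
    feedback gains (9.8), assuming time-invariant matrices."""
    P, gains = P_T, []
    for _ in range(T):
        M = R + beta * B.T @ P @ B
        F = np.linalg.solve(M, S.T + beta * B.T @ P @ A)           # (9.8)
        P = Q + beta * A.T @ P @ A - (beta * A.T @ P @ B + S) @ F  # (9.5)
        gains.append(F)
    gains.reverse()        # gains[t] = F_t for t = 0, ..., T-1
    return P, gains

# Illustrative system: the control moves the second state, which drives the first.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R, S = np.eye(2), np.array([[0.5]]), np.zeros((2, 1))
P0, gains = backward_riccati(A, B, Q, R, S, np.zeros((2, 2)), beta=0.95, T=200)
assert np.allclose(P0, P0.T)                    # value matrix is symmetric
assert np.all(np.linalg.eigvalsh(P0) > -1e-8)   # and positive semi-definite
```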
Note that we have assumed that the return function is “centred.” In the uncentred case,

r(x, u) = −E[ Σ_{t=0}^{T−1} {(xt − x∗)′Qt(xt − x∗) + (ut − u∗)′Rt(ut − u∗) + 2(xt − x∗)′St(ut − u∗)} + (xT − x∗)′PT(xT − x∗) ],

where x∗ and u∗ are some constants. One way to deal with this case is to adopt the augmented state vector [x′, 1]′. We can then conduct the previous analysis using this augmented state vector. The other way is to perform a change of variables by defining x̃t = xt − x∗ and ũt = ut − u∗. We can then rewrite the state transition equation and conduct a similar analysis.
9.3 Infinite-Horizon Limits
Consider the following infinite-horizon stochastic linear regulator problem:

max_{ut} −E Σ_{t=0}^∞ β^t (x′tQxt + u′tRut + 2x′tSut), β ∈ (0, 1],

subject to (9.1), where the matrices Q, R, S, A, B, and C are time-homogeneous. In economic applications, we often impose the following side condition (see Hansen and Sargent (2008)):

E[ Σ_{t=0}^∞ β^t (ut · ut + xt · xt) ] < ∞. (9.10)
Condition (9.10) is a stability condition that ensures that ut and xt cannot
grow at a rate exceeding 1/√β. This condition is not required in standard
control problems, but may be useful in some economic problems since it is
related to the transversality condition.
To study the above infinite-horizon problem, we shall analyze the limits of a sequence of corresponding finite-horizon problems. Let

P(s) = Pt, F(s) = Ft, d(s) = dt, s = T − t,
where Pt, Ft and dt are given in Proposition 9.2.1 for the time-homogeneous
case. By this proposition, we obtain

P(s) = T(P(s − 1)) = T^s(P(0)),

where T is an operator defined by

T(P) = Q + βA′PA − (S + βA′PB)(R + βB′PB)^{−1}(S′ + βB′PA). (9.11)

In addition,

F(s) = (R + βB′P(s − 1)B)^{−1}(S′ + βB′P(s − 1)A).
Our goal is to show that P (s) has a limit P as s→∞, and hence F (s) and
d (s) have limits F and d, respectively. These limits give the value function
and the optimal policy function for the infinite-horizon problem.
For ease of exposition, we shall assume that the matrix S has been nor-
malized to zero. This can be done by a suitable change of variable. In ad-
dition, we shall consider the deterministic case by the certainty equivalence
principle. In this case, the constant term in the value function vanishes. We
then write the value function for the s-horizon problem as:
Vs (x) = −x′P (s)x.
Proposition 9.3.1 Suppose that (i) Q > 0, (ii) R > 0, and (iii) (√βA, √βB) is either controllable or stabilizable. Then: (i) the Riccati equation

P = T(P), (9.12)

has a unique positive semi-definite solution P. (ii) P(s) → P for any finite, positive semi-definite P(0).
Proof. Let P(0) = 0. Then by (9.11), P(1) = Q. Thus, V1(x) = −x′P(1)x < 0 = V0(x). By the Bellman equation, we deduce that Vs+1(x) < Vs(x) for any s. Thus, P(s + 1) > P(s) for any s. We next show that {P(s)} is bounded above and hence has a finite positive semi-definite limit P.
To demonstrate boundedness, we only need to show that a policy can
be found which incurs a finite infinite-horizon value for any initial value x0.
In the controllable case, one achieves this by choosing a control u bringing
x to zero in a finite number of steps, and by holding u (and so x) zero
thereafter. In the stabilizable case, one chooses ut = Kxt, where K is such that √β(A + BK) is a stable matrix. Define Γ = A + BK. Then xt = Γ^t x0, ut = KΓ^t x0. The infinite-horizon objective function under this control takes
a finite value,

−E Σ_{t=0}^∞ β^t x′0 (Γ′)^t (Q + K′RK) Γ^t x0.
The limit P is positive semi-definite and satisfies (9.12), which can be rewritten as

P = Q + F′RF + βΓ′PΓ, (9.13)

where

F = (R + βB′PB)^{−1} βB′PA, Γ = A − BF.
Consider the function V(x) = −x′Px and the sequence xt = (√βΓ)^t x0. Then

V(xt+1) − V(xt) = −x′t+1Pxt+1 + x′tPxt
                = −βx′tΓ′PΓxt + x′t(Q + F′RF + βΓ′PΓ)xt
                = x′t(Q + F′RF)xt ≥ 0.
Thus, V(xt) increases to a finite limit because {V(xt)} is bounded above by zero. Thus,

x′t(Q + F′RF)xt → 0, (9.14)

which implies that xt → 0. Since x0 is arbitrary, this implies that (√βΓ)^t → 0. Thus, by (9.13),

P = Σ_{j=0}^∞ (√βΓ′)^j (Q + F′RF)(√βΓ)^j.
Note that for arbitrary finite positive semi-definite P(0),

P(s) = T^s(P(0)) ≥ T^s(0) → P.

Comparing the maximal s-horizon reward with that incurred by using the stationary rule ut = −Fxt, we deduce the reverse inequality

P(s) ≤ Σ_{j=0}^{s−1} β^j (Γ′)^j (Q + F′RF) Γ^j + β^s (Γ′)^s P(0) Γ^s → P.
The above two equations imply that P (s)→ P for any finite positive semi-
definite P (0) .
Finally, we prove uniqueness. If P̄ is another solution to (9.12), then

P̄ = T(P̄) = T^s(P̄) → P,

so that P̄ = P.
This completes the proof.
Proposition 9.3.2 Proposition 9.3.1 still holds if condition (i) is weakened to: Q = L′L, where (√βA, L) is either observable or detectable.
Proof. Let xt = (√βΓ)^t x0. Relation (9.14) becomes

(Lxt)′(Lxt) + (Fxt)′R(Fxt) → 0,

which implies that Fxt → 0 and Lxt → 0. Using xt+1 = √β(A − BF)xt, we deduce that xt asymptotically enters a system such that

Lxt = 0 and xt+1 = √βAxt. (9.15)

The observability condition implies that the above relation can hold only if xt is identically zero. The detectability condition implies that we can find an H such that √βA − HL is a stable matrix. By equation (9.15),

xt+1 = (√βA − HL)xt.
Thus, xt → 0, implying that√βΓ is a stable matrix. The rest of the proof
is the same as that for Proposition 9.3.1.
Note that we can replace the detectability or observability condition with the side condition (9.10) in the preceding proposition. We now turn to the general case under uncertainty, where S ≠ 0. The value function is given by

    V(x) = −x′Px − d,

where P is a positive semi-definite matrix solution to equation (9.12), with T(P) given by (9.11). In addition, d is a constant,

    d = (1 − β)⁻¹β tr(PCC′).
The optimal decision rule is given by

    u = −Fx,  (9.16)

where

    F = (R + βB′PB)⁻¹(S′ + βB′PA).  (9.17)

We now discuss three numerical methods for solving the linear-quadratic control problem in the centered case.
Value Function Iteration
We may first solve the truncated finite-horizon problem and then take the limit to obtain the infinite-horizon solution. Proposition 9.3.1 gives an algorithm. The Matlab program olrp.m can be used to solve the discounted linear-quadratic control problem.
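The iteration Pⱼ₊₁ = T(Pⱼ) starting from P(0) = 0 can be sketched in a few lines. The following Python fragment is a minimal sketch, not the olrp.m program itself; the scalar example at the bottom is hypothetical and only illustrates the interface:

```python
import numpy as np

def riccati_iteration(A, B, Q, R, S, beta, tol=1e-10, max_iter=10_000):
    """Iterate P <- T(P) starting from P = 0 (the truncated finite-horizon problem)."""
    P = np.zeros_like(Q)
    for _ in range(max_iter):
        RB = R + beta * B.T @ P @ B
        P_new = (Q + beta * A.T @ P @ A
                 - (S + beta * A.T @ P @ B) @ np.linalg.solve(RB, S.T + beta * B.T @ P @ A))
        if np.max(np.abs(P_new - P)) < tol:
            P = P_new
            break
        P = P_new
    # decision rule u = -F x from (9.17)
    F = np.linalg.solve(R + beta * B.T @ P @ B, S.T + beta * B.T @ P @ A)
    return P, F

# Hypothetical scalar example: x_{t+1} = x_t + u_t, period reward -(x_t^2 + u_t^2)
A = np.array([[1.0]]); B = np.array([[1.0]])
Q = np.array([[1.0]]); R = np.array([[1.0]]); S = np.zeros((1, 1))
P, F = riccati_iteration(A, B, Q, R, S, beta=0.95)
```

By Proposition 9.3.1, the limit does not depend on the initial matrix, so P(0) = 0 is simply a convenient starting point.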
Policy Improvement Algorithm
We may use (9.9) to rewrite the Riccati equation as:

    P = Q + F′RF − SF − F′S′ + β(A − BF)′P(A − BF).

Starting from an initial F0 for which the eigenvalues of A − BF0 are less than 1/√β in modulus, the algorithm iterates on the two equations:

    Pj = Q + F′jRFj − SFj − F′jS′ + β(A − BFj)′Pj(A − BFj),
    Fj+1 = (R + βB′PjB)⁻¹(S′ + βB′PjA).
The first equation is a discrete Lyapunov or Sylvester equation, which is to be solved for Pj. If the eigenvalues of the matrix A − BFj are bounded in modulus by 1/√β, then a solution to this equation exists. There are several methods available for solving this equation. See Anderson et al. (1996) and Chapter 5 in Ljungqvist and Sargent (2004).
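The two equations above can be implemented directly. The sketch below uses SciPy's discrete Lyapunov solver in place of the solvers cited above; the scalar example and the starting rule F0 are hypothetical:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def policy_improvement(A, B, Q, R, S, beta, F0, tol=1e-12, max_iter=500):
    """Howard-style iteration: solve a Lyapunov equation for P_j, then improve F."""
    F = F0
    for _ in range(max_iter):
        Abar = np.sqrt(beta) * (A - B @ F)          # must be a stable matrix
        Qbar = Q + F.T @ R @ F - S @ F - F.T @ S.T  # flow payoff under u = -F x
        P = solve_discrete_lyapunov(Abar.T, Qbar)   # solves P = Qbar + Abar' P Abar
        F_new = np.linalg.solve(R + beta * B.T @ P @ B, S.T + beta * B.T @ P @ A)
        if np.max(np.abs(F_new - F)) < tol:
            return P, F_new
        F = F_new
    return P, F

# Hypothetical scalar example; F0 = 0.5 makes A - B F0 stable
A = np.array([[1.0]]); B = np.array([[1.0]])
Q = np.array([[1.0]]); R = np.array([[1.0]]); S = np.zeros((1, 1))
P, F = policy_improvement(A, B, Q, R, S, beta=0.95, F0=np.array([[0.5]]))
```

Each pass evaluates the current rule exactly (the Lyapunov solve) before improving it, which typically converges in far fewer iterations than value function iteration.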
The Lagrange Method
The Lagrange method is not only useful for computation; it also delivers insights about the connection between stability and optimality, with the Lagrange multipliers playing an important role as shadow prices. Form the Lagrangian:

    −E Σ_{t=0}^∞ βᵗ { x′tQxt + u′tRut + 2x′tSut − 2βλ′t+1(xt+1 − Axt − But − εt+1) },
where 2βᵗ⁺¹λt+1 is the Lagrange multiplier associated with the state transition constraint xt+1 = Axt + But + εt+1. First-order conditions with respect to {ut, xt} are

    ut : Rut + S′xt + βB′Etλt+1 = 0,
    xt : λt = Qxt + Sut + βA′Etλt+1.
Combining these with the state transition equation, we obtain the following linear system:

    ⎡ I  0   0  ⎤ ⎡ Etxt+1 ⎤   ⎡  A   B  0 ⎤ ⎡ xt ⎤
    ⎢ 0  0  −βB′ ⎥ ⎢ Etut+1 ⎥ = ⎢  S′  R  0 ⎥ ⎢ ut ⎥
    ⎣ 0  0   βA′ ⎦ ⎣ Etλt+1 ⎦   ⎣ −Q  −S  I ⎦ ⎣ λt ⎦

We can solve this system using the Klein method discussed in Section 2.3.2.
In this system xt is a vector of predetermined or backward-looking variables
according to Definition 2.3.4 and ut and λt are vectors of non-predetermined
or forward-looking variables. Anderson et al. (1996) and Ljungqvist and
Sargent (2004) discuss several numerical methods based on the assumption
that Q is invertible. This assumption is not needed for the Klein method.
The Lagrange multiplier has an important economic meaning. By the envelope condition, we can show that

    ∂V(xt)/∂xt = Qxt + Sut + βA′Etλt+1 = λt,  t ≥ 0.

Thus we have λt = Pxt, which is interpreted as the shadow price of xt.
9.4 Optimal Policy under Commitment
Suppose that the equilibrium system can be summarized by the following
state space form:
    D ⎡ zt+1   ⎤ = A ⎡ zt ⎤ + But + ⎡ εt+1 ⎤ ,  (9.18)
      ⎣ Etyt+1 ⎦     ⎣ yt ⎦         ⎣  0   ⎦
where z0 is given. Here zt is an nz × 1 vector of predetermined variables, yt
is an ny × 1 vector of nonpredetermined or forward-looking variables, ut is
an nu × 1 vector of instrument or control variables, and εt+1 is a vector of
exogenous shocks to zt that are white noise with covariance matrix Σ. Any
shock to yt is incorporated in the vector zt. We typically use zt to represent
the state of the economy, which may include productivity shocks, preference
shocks, or capital stock. Note that zt may include a component of unity in
order to handle constants. The vector yt represents endogenous variables
such as consumption, inflation rate, and output. Examples of instruments
ut include interest rates and money growth rates.
The matrices A, B, and D are conformable with the above vectors. We
assume that the matrix D can be partitioned conformably as:

    D = ⎡ I   0  ⎤
        ⎣ 0  Dyy ⎦

Note that we allow the matrix Dyy to be singular, in which case the equilibrium system (9.18) may contain intratemporal optimality conditions.
Define ξt+1 = yt+1 − Etyt+1 and xt = [z′t, y′t]′. We can then rewrite the system (9.18) as:

    Dxt+1 = Axt + But + ⎡  εt+1   ⎤ .  (9.19)
                        ⎣ Dyyξt+1 ⎦
The policymaker's objective is to minimize the loss function by choosing {ut} and {xt}:

    E [ Σ_{t=0}^∞ βᵗ r(xt, ut) ]

subject to (9.19), where

    r(x, u) = (1/2)x′Qx + (1/2)u′Ru + x′Su,

Q and R are positive semi-definite matrices, and S is a conformable matrix. In this problem, the policymaker commits at time zero to the choice of {ut} and {xt} and will not reoptimize in the future. This problem is also called a Ramsey problem.
Example 9.4.1 Consider the following basic New Keynesian model:

    πt = κ(yᵃt − yᶠt) + βEtπt+1 + et,  (9.20)
    yᵃt − yᶠt = Et(yᵃt+1 − yᶠt+1) − σ(it − Etπt+1 − rⁿt),  (9.21)
    et+1 = ρe et + εe,t+1,
    yᶠt+1 = ρy yᶠt + εy,t+1,
    rⁿt+1 = ((ρy − 1)/σ)(ρy yᶠt + εy,t+1),

where πt is the inflation rate, yᵃt is actual output, yᶠt is output under flexible prices, it is the nominal interest rate, et is a cost-push shock, and rⁿt is the natural rate of interest, defined as:

    rⁿt = (1/σ)(Et yᶠt+1 − yᶠt).
The vector of forward-looking variables is yt = (πt, yᵃt)′. The vector of predetermined variables is zt = (1, et, yᶠt, rⁿt)′. The vector of instruments is ut = it. The one-period loss function is

    r(xt, ut) = (1/2)(πt − π*)² + (1/2)λ(yᵃt − yᶠt)²,

where π* is the inflation target. We use unity in the state vector zt to handle the constant π*.
Form the Lagrangian expression:

    E Σ_{t=0}^∞ βᵗ { (1/2)x′tQxt + (1/2)u′tRut + x′tSut
                     + βμ′t+1( Axt + But + ⎡ εt+1 ; Dyyξt+1 ⎤′ − Dxt+1 ) },
where βᵗ⁺¹μt+1 is the Lagrange multiplier associated with the constraint (9.19). First-order conditions with respect to xt and ut are given by:

    xt : Qxt + Sut + βA′Etμt+1 − D′μt = 0,  t ≥ 0,
    ut : Rut + S′xt + βB′Etμt+1 = 0,  t ≥ 0.

Note that at time 0, z0 is given exogenously, but y0 is chosen endogenously. Thus the Lagrange multiplier corresponding to z0, μz0, must be endogenously determined, while the Lagrange multiplier corresponding to y0 is set to μy0 = 0.
Combining the first-order conditions with (9.19) yields:

    ⎡ D  0   0  ⎤ ⎡ xt+1   ⎤   ⎡  A   B   0  ⎤ ⎡ xt ⎤
    ⎢ 0  0  βA′ ⎥ ⎢ ut+1   ⎥ = ⎢ −Q  −S   D′ ⎥ ⎢ ut ⎥
    ⎣ 0  0  βB′ ⎦ ⎣ Etμt+1 ⎦   ⎣ −S′ −R   0  ⎦ ⎣ μt ⎦

Taking conditional expectations given information at date t yields the system:

    ⎡ D  0   0  ⎤ ⎡ Etxt+1 ⎤   ⎡  A   B   0  ⎤ ⎡ xt ⎤
    ⎢ 0  0  βA′ ⎥ ⎢ Etut+1 ⎥ = ⎢ −Q  −S   D′ ⎥ ⎢ ut ⎥
    ⎣ 0  0  βB′ ⎦ ⎣ Etμt+1 ⎦   ⎣ −S′ −R   0  ⎦ ⎣ μt ⎦

This system is in the form of a rational expectations equilibrium system as in (2.11). As discussed in Section 2.3, there are many methods for solving it. Here we adopt the Klein method as discussed in Soderlind (1999).²

²See Currie and Levine (1985, 1993) and Backus and Driffill (1986) for other methods.
An important advantage of this method is that it allows the matrix R to be singular. Define the (nz + ny) × 1 vector of predetermined variables kt and the (ny + nz + nu) × 1 vector of non-predetermined variables λt by:

    kt = ⎡ zt  ⎤ ,  λt = ⎡ yt  ⎤
         ⎣ μyt ⎦         ⎢ ut  ⎥
                         ⎣ μzt ⎦

where k0 is exogenously given and μy0 = 0. We reorder the equations in the expectational system above and rewrite it as:

    G ⎡ Etkt+1 ⎤ = H ⎡ kt ⎤
      ⎣ Etλt+1 ⎦     ⎣ λt ⎦
for some conformable matrices G and H. In general, G is singular. As we show in Chapter 2, the above system has a unique saddle-path solution as long as a stability condition and a rank condition are satisfied. The solution for the original system takes the following form:
    zt+1 = Azz zt + Azμy μyt + εt+1,
    μy,t+1 = Aμyz zt + Aμyμy μyt,  z0 given, μy0 = 0,

for the predetermined variables and the Lagrange multipliers corresponding to the forward-looking variables, and

    yt = Ayz zt + Ayμy μyt,
    ut = Auz zt + Auμy μyt,
    μzt = Aμzz zt + Aμzμy μyt,

for the forward-looking and control variables and the Lagrange multipliers associated with the predetermined variables. In this form, the vector of Lagrange multipliers μyt associated with the forward-looking variables yt serves as a state vector, which makes the solution recursive.
The optimal policy under commitment is generally time inconsistent, in the sense that if the policymaker can reoptimize in the future, he may choose a different sequence of policy rules than the one chosen in period 0. Formally, suppose that the policymaker decides to reoptimize in period t. We can then derive first-order conditions with respect to xs and us for s ≥ t. By an argument similar to the one above, we must have μyt = 0 because yt is freely chosen. This value of μyt is likely to differ from the value chosen by the policymaker who optimizes in period 0.
Example 9.4.2 Consider the following basic New Keynesian model:

    πt = κyt + βEtπt+1 + zt,  (9.22)
    yt = Etyt+1 − σ(it − Etπt+1 − rⁿt),  (9.23)

where πt is the inflation rate, yt = yᵃt − yᶠt is the output gap, defined as actual output relative to the equilibrium level of output under flexible prices, it is the nominal interest rate, rⁿt is the natural rate of interest, and zt is a cost-push shock. Suppose that {zt} follows the process:

    zt+1 = ρzt + εt+1,
where εt+1 is IID with variance Σ. The policymaker's optimal policy under commitment is to choose {yt, πt, it} to solve the following problem:

    min (1/2) E Σ_{t=0}^∞ βᵗ [ π²t + λ(yt − y*)² ]

subject to (9.22) and (9.23), where y* is the target level of output. Since it does not enter the objective function, we can ignore equation (9.23). We form the Lagrangian:

    E Σ_{t=0}^∞ βᵗ { (1/2)π²t + (1/2)λ(yt − y*)²
                     − βμz,t+1(zt+1 − ρzt − εt+1) − βμy,t+1(κyt + βπt+1 + zt − πt) },
where βᵗ⁺¹μy,t+1 is the Lagrange multiplier associated with (9.22). First-order conditions with respect to {πt, yt, zt} are given by:³

    πt : πt + βEtμy,t+1 − βμyt = 0,  (9.24)
    yt : λ(yt − y*) − βκEtμy,t+1 = 0,  (9.25)
    zt : −βEtμy,t+1 + βρEtμz,t+1 − μzt = 0,  (9.26)

for all t ≥ 0, where μy0 = 0. Combining these conditions with (9.22) yields a system of linear rational expectations equations:

    ⎡ 1   0   0  0   0  ⎤ ⎡ Etzt+1   ⎤   ⎡  ρ  0   0   0  0 ⎤ ⎡ zt  ⎤   ⎡  0  ⎤
    ⎢ 0   β   0  0   0  ⎥ ⎢ Etμy,t+1 ⎥   ⎢  0  β   0  −1  0 ⎥ ⎢ μyt ⎥   ⎢  0  ⎥
    ⎢ 0   βκ  0  0   0  ⎥ ⎢ Etyt+1   ⎥ = ⎢  0  0   λ   0  0 ⎥ ⎢ yt  ⎥ − ⎢ λy* ⎥
    ⎢ 0   0   0  β   0  ⎥ ⎢ Etπt+1   ⎥   ⎢ −1  0  −κ   1  0 ⎥ ⎢ πt  ⎥   ⎢  0  ⎥
    ⎣ 0  −β   0  0   βρ ⎦ ⎣ Etμz,t+1 ⎦   ⎣  0  0   0   0  1 ⎦ ⎣ μzt ⎦   ⎣  0  ⎦
The predetermined variables are zt and μyt, and the non-predetermined variables are πt, yt, and μzt. We can then use the Klein method to solve this system. Note that the Lagrange multiplier μzt is not needed for solving the other variables.
We can solve this simple problem analytically. Using (9.24) and (9.25) to solve for πt and yt, respectively, and then substituting them into (9.22), we obtain:

    Et{ βμy,t+2 − (1 + β + κ²/λ)μy,t+1 + μyt } = (κ/β)y* + zt/β,  (9.27)

³Note that zt is exogenous, so the first-order condition with respect to this variable has no meaning. But we will see below that this condition does not affect the solution.
where μy0 = 0. The characteristic equation of this second-order difference equation is

    βθ² − (1 + β + κ²/λ)θ + 1 = 0,

which has two real roots satisfying

    0 < θ1 < 1 < θ2.
We use the lag operator method discussed in Section 2.2.1 to solve (9.27):

    β(L⁻¹ − θ1)(L⁻¹ − θ2)μyt = (κ/β)y* + zt/β,

where L⁻ʲXt = EtXt+j. Thus,

    (L⁻¹ − θ1)μyt = (1/β) · 1/(L⁻¹ − θ2) · ( (κ/β)y* + zt/β ).
Since μyt is predetermined without any prediction error, Etμy,t+1 = μy,t+1, and we obtain:

    μy,t+1 − θ1μyt = (−1/(β²θ2)) · 1/(1 − L⁻¹θ2⁻¹) · (κy* + zt)
                   = (−1/(β²θ2)) Σ_{j=0}^∞ θ2⁻ʲ (κy* + Etzt+j).
Thus,

    μy,t+1 = θμyt − (θ/β)( κy*/(1 − βθ) + zt/(1 − βθρ) ),

where we have defined θ = θ1 and used the fact that θ2 = β⁻¹θ1⁻¹. Given μy0 = 0, the above equation determines {μyt}. Equations (9.24)–(9.26) then determine πt, yt, and μzt for t ≥ 0.
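Because βθ1θ2 = 1 and β(θ1 + θ2) = 1 + β + κ²/λ, the relation θ2 = β⁻¹θ1⁻¹ and the closed-form recursion can be checked numerically. The sketch below uses a hypothetical calibration (β, κ, λ, ρ, y*, z0 are illustrative, not from the text) and verifies that the recursion satisfies (9.27) along a deterministic path for zt:

```python
import numpy as np

beta, kappa, lam, rho = 0.99, 0.3, 0.25, 0.8   # hypothetical calibration
ystar, z0 = 0.1, 0.05

# roots of beta*theta^2 - (1 + beta + kappa^2/lam)*theta + 1 = 0
theta1, theta2 = np.sort(np.roots([beta, -(1 + beta + kappa**2 / lam), 1.0]).real)
theta = theta1                                  # the stable root

# closed-form recursion with deterministic z_{t+1} = rho*z_t (so E_t z_{t+j} = rho^j z_t)
T = 60
z = z0 * rho ** np.arange(T)
mu = np.zeros(T)
for t in range(T - 1):
    mu[t + 1] = theta * mu[t] - (theta / beta) * (
        kappa * ystar / (1 - beta * theta) + z[t] / (1 - beta * theta * rho))

# residual of (9.27): beta*mu_{t+2} - (1+beta+kappa^2/lam)*mu_{t+1} + mu_t - kappa*ystar/beta - z_t/beta
resid = [beta * mu[t + 2] - (1 + beta + kappa**2 / lam) * mu[t + 1] + mu[t]
         - kappa * ystar / beta - z[t] / beta for t in range(T - 2)]
```

The residual is zero up to floating-point error, confirming the factorization into the stable and unstable roots.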
9.5 Optimal Discretional Policy
Under discretion, the policymaker reoptimizes each period, and the sequence of policy rules is time consistent. We solve for a linear Markov perfect equilibrium in which we take the predetermined variable zt as the state variable, and the public's reaction function and the policy rule in period t are linear functions of zt:

    yt = Mzt,  ut = −Fzt,

where M and F are matrices to be determined.
To solve for these matrices, we use the method of successive approximations by backward induction, as in Oudiz and Sachs (1985), Backus and Driffill (1986), and Soderlind (1999). Consider the corresponding finite-horizon problem. In period t, the policymaker expects that the public's reaction function in period t + 1 is

    yt+1 = Mt+1zt+1,  (9.28)
where Mt+1 is some conformable matrix. The policymaker’s problem in
period t is given by the Bellman equation:

    Jt(zt) = min_{ut} (1/2)x′tQxt + (1/2)u′tRut + x′tSut + βEt[Jt+1(zt+1)],  (9.29)

subject to (9.18) and (9.28). Because this is a linear-quadratic control problem, we conjecture that the value function in period t takes the quadratic form:

    Jt(zt) = (1/2)z′tVtzt + (1/2)vt,  (9.30)

where Vt is a matrix and vt is a scalar.
We proceed with the following three steps.
Step 1. Partitioning the matrices in (9.18) conformably and using equations (9.18) and (9.28), we derive

    yt+1 = Mt+1zt+1 = Mt+1(Azz zt + Azy yt + Bz ut + εt+1).  (9.31)

Partitioning the matrices in (9.18) conformably also yields:

    Dyy Etyt+1 = Ayz zt + Ayy yt + By ut.  (9.32)

Taking conditional expectations of (9.31) and premultiplying both sides by Dyy, we use equation (9.32) to derive

    Ayz zt + Ayy yt + By ut = DyyMt+1(Azz zt + Azy yt + Bz ut).

Assuming that the matrix Ayy − DyyMt+1Azy is invertible, we can use the preceding equation to solve for

    yt = Hzt zt + Hut ut,  (9.33)

where

    Hzt = (Ayy − DyyMt+1Azy)⁻¹(DyyMt+1Azz − Ayz),
    Hut = (Ayy − DyyMt+1Azy)⁻¹(DyyMt+1Bz − By).

Substituting (9.33) into the upper block of (9.18), we obtain the new system dynamics:

    zt+1 = A*t zt + B*t ut + εt+1,  z0 given,  (9.34)

where

    A*t = Azz + Azy Hzt,
    B*t = Bz + Azy Hut.
Step 2. We rewrite the return function in (9.29) as

    (1/2)x′tQxt + (1/2)u′tRut + x′tSut
    = (1/2)z′tQzz zt + (1/2)y′tQyy yt + z′tQzy yt + (1/2)u′tRut + z′tSz ut + y′tSy ut
    = (1/2)z′tQ*t zt + (1/2)u′tR*t ut + z′tS*t ut,

where

    Q*t = Qzz + H′ztQyyHzt + QzyHzt + H′ztQ′zy,
    R*t = R + H′utQyyHut + H′utSy + S′yHut,
    S*t = QzyHut + Sz + H′ztSy + H′ztQyyHut.
Step 3. Substituting the conjectured value function (9.30) and the above return function into the Bellman equation (9.29), we obtain the following control problem:

    (1/2)z′tVtzt + (1/2)vt = min_{ut} (1/2)z′tQ*t zt + (1/2)u′tR*t ut + z′tS*t ut
                             + (1/2)βEt[ z′t+1Vt+1zt+1 + vt+1 ]  (9.35)

subject to (9.34), where z0 is given. The first-order condition with respect to ut gives

    ut = −Ftzt,  (9.36)

where

    Ft = (R*t + βB*′t Vt+1 B*t)⁻¹(S*′t + βB*′t Vt+1 A*t).

Substituting (9.36) back into the Bellman equation (9.35) and using (9.34), we obtain:

    Vt = Q*t + F′tR*tFt − S*tFt − F′tS*′t + β(A*t − B*tFt)′Vt+1(A*t − B*tFt).  (9.37)

In addition, we obtain a recursive equation for vt:

    vt = β tr(Vt+1Σ) + βvt+1.

Substituting (9.36) into (9.33) yields:

    yt = (Hzt − HutFt)zt.

Thus, we obtain the updated value:

    Mt = Hzt − HutFt.  (9.38)
In practice, we start with an initial guess of a positive semi-definite matrix Vt+1 and any matrix Mt+1. We then obtain the updated matrices Vt and Mt using equations (9.37)–(9.38) and the other related equations above. Iterating this process until convergence gives V and M in a Markov perfect equilibrium. In addition, vt converges to v in the value function, and Ft converges to F, which gives the stationary policy rule.
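Steps 1–3 can be collected into a small fixed-point routine. The sketch below is a direct transcription of the iteration (with the quadratic-form matrices written in explicitly symmetric form); the New Keynesian calibration at the bottom is hypothetical and illustrates a two-equation forward block (Phillips curve and IS curve) with a single cost-push state and the interest rate as control:

```python
import numpy as np

def solve_discretion(Azz, Azy, Ayz, Ayy, Bz, By, Dyy,
                     Qzz, Qzy, Qyy, R, Sz, Sy, beta,
                     tol=1e-10, max_iter=20_000):
    """Backward iteration on (V, M) following Steps 1-3."""
    nz, ny = Azz.shape[0], Ayy.shape[0]
    V, M = np.zeros((nz, nz)), np.zeros((ny, nz))
    F = np.zeros((R.shape[0], nz))
    for _ in range(max_iter):
        # Step 1: the public's reaction y_t = Hz z_t + Hu u_t
        G = np.linalg.inv(Ayy - Dyy @ M @ Azy)
        Hz = G @ (Dyy @ M @ Azz - Ayz)
        Hu = G @ (Dyy @ M @ Bz - By)
        Astar, Bstar = Azz + Azy @ Hz, Bz + Azy @ Hu
        # Step 2: period loss in terms of (z_t, u_t) only
        Qstar = Qzz + Hz.T @ Qyy @ Hz + Qzy @ Hz + Hz.T @ Qzy.T
        Rstar = R + Hu.T @ Qyy @ Hu + Hu.T @ Sy + Sy.T @ Hu
        Sstar = Qzy @ Hu + Sz + Hz.T @ Sy + Hz.T @ Qyy @ Hu
        # Step 3: policy update and value update
        F = np.linalg.solve(Rstar + beta * Bstar.T @ V @ Bstar,
                            Sstar.T + beta * Bstar.T @ V @ Astar)
        AF = Astar - Bstar @ F
        Vnew = (Qstar + F.T @ Rstar @ F - Sstar @ F - F.T @ Sstar.T
                + beta * AF.T @ V @ AF)
        Mnew = Hz - Hu @ F
        if max(np.max(np.abs(Vnew - V)), np.max(np.abs(Mnew - M))) < tol:
            return Vnew, Mnew, F
        V, M = Vnew, Mnew
    return V, M, F

# Hypothetical NK example: forward vector (pi, ygap), state z (cost-push), control i.
beta, kappa, lam, sigma, rho = 0.99, 0.3, 0.25, 1.0, 0.8
Dyy = np.array([[beta, 0.0], [sigma, 1.0]])
Ayy = np.array([[1.0, -kappa], [0.0, 1.0]])
Ayz = np.array([[-1.0], [0.0]])
By = np.array([[0.0], [sigma]])
Azz, Azy, Bz = np.array([[rho]]), np.zeros((1, 2)), np.zeros((1, 1))
Qzz, Qzy = np.zeros((1, 1)), np.zeros((1, 2))
Qyy = np.diag([1.0, lam])
R, Sz, Sy = np.zeros((1, 1)), np.zeros((1, 1)), np.zeros((2, 1))
V, M, F = solve_discretion(Azz, Azy, Ayz, Ayy, Bz, By, Dyy,
                           Qzz, Qzy, Qyy, R, Sz, Sy, beta)
```

In this calibration the fixed point can be checked against the analytical discretionary solution of Example 9.5.1: with y* = 0, πt = Mπ zt with Mπ = θ/(1 − βρθ) and θ = λ/(κ² + λ).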
Example 9.5.1 Consider again the canonical New Keynesian model discussed in Example 9.4.2. We now solve for the optimal discretionary policy. There is no predetermined endogenous state variable; the only state variable is the exogenous shock zt. Thus, the discretionary policy solves the following static problem:

    min (1/2)π²t + (1/2)λ(yt − y*)²

subject to

    πt = κyt + βπᵉt + zt,

where πᵉt is expected inflation, which the policymaker takes as given. The solution of this problem is given by

    πt = (λ/(κ² + λ))(κy* + βπᵉt + zt).  (9.39)

A Markov perfect equilibrium is a pair of functions π(zt) and πᵉ(zt) such that (i) π(zt) = πt in (9.39) when πᵉt = πᵉ(zt), and (ii) πᵉ(zt) = Etπ(zt+1). Using (9.39) yields:

    πt = (λ/(κ² + λ))(κy* + βEtπt+1 + zt).

We can then solve for the discretionary inflation rate:

    πt = θ Σ_{j=0}^∞ βʲθʲ [κy* + Etzt+j],  θ = λ/(κ² + λ).
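Summing the two geometric series inside (the constant κy* term and the AR(1) forecasts Etzt+j = ρʲzt) gives a closed form. The sketch below, with a hypothetical calibration, checks the truncated series against it:

```python
import numpy as np

# Hypothetical calibration; theta = lambda/(kappa^2 + lambda)
beta, kappa, lam, rho = 0.99, 0.3, 0.25, 0.8
ystar, z0 = 0.1, 0.05
theta = lam / (kappa**2 + lam)

# truncated series: pi_t = theta * sum_j (beta*theta)^j (kappa*ystar + E_t z_{t+j})
terms = [(beta * theta)**j * (kappa * ystar + rho**j * z0) for j in range(500)]
pi_series = theta * sum(terms)

# closed form from summing the two geometric series
pi_closed = theta * (kappa * ystar / (1 - beta * theta) + z0 / (1 - beta * theta * rho))
```

The series converges because βθ < 1, so the truncation at 500 terms is innocuous.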
9.6 Robust Control
An optimal control problem can be written in the following standard form:

    max_{ut} E Σ_{t=0}^∞ βᵗ r(xt, ut),  β ∈ (0, 1),
subject to
xt+1 = f (xt, ut, εt+1) , x0 given, (9.40)
where r is a reward or return function, f is a state transition function, xt is a state vector, ut is a control vector, and εt is a vector of possibly serially correlated random variables. An implicit assumption of this setup is that the decision maker fully trusts his model, in that the expectation operator E is taken with respect to the probability measure P underlying the stochastic process {εt}. In economics, this means the decision maker has rational expectations because his subjective beliefs coincide with the objective probability measure.
Control theorists and applied mathematicians have sought ways to relax the assumption that the decision maker trusts his model. They formulated robust control theory to take into account the possibility that the model is misspecified.⁴ Lars Hansen and Thomas Sargent have extended this theory so that it can be applied to economics.⁵ In this section, we provide a brief introduction to their work. In Chapter 21, we will discuss the connection with the literature on ambiguity aversion in decision theory.
Belief Distortions and Entropy
The decision maker's information set at date t is given by x0 and εᵗ = (ε′1, ..., ε′t)′. For simplicity, we assume that εt is IID and drawn from a distribution with density π(ε). This density induces a probability measure P on the full state space of the model. The decision maker views this model as an approximation and fears that it is misspecified. He believes that εt+1 is drawn from an alternative distribution with density π̂(ε|εᵗ). Form the likelihood ratio:

    mt+1 = π̂(εt+1|εᵗ) / π(εt+1).

⁴See Zhou, Doyle and Glover (1996) for an introduction to robust control theory.
⁵See Hansen and Sargent (2008) for a comprehensive treatment.
Clearly,

    E[mt+1|εᵗ] = ∫ ( π̂(ε|εᵗ)/π(ε) ) π(ε) dε = 1,

where we use E to denote the expectation operator with respect to the reference measure P.
Set M0 = 1 and define

    Mt+1 = mt+1Mt = Π_{j=1}^{t+1} mj.

The random variable Mt is the ratio of the distorted joint density of εᵗ to the reference joint density, conditioned on x0 and evaluated at the random vector εᵗ. The process {Mt} is a martingale and induces a distorted measure Q on the full state space. For any Borel function ϕ(εᵗ) of εᵗ, we can compute its expectation with respect to the distorted belief Q as:

    EQ[ϕ(εᵗ, x0)] = E[Mt ϕ(εᵗ, x0)].
We use relative entropy to measure the discrepancy between the two probability measures P and Q. It is defined as:

    EQ[ ln( dQ/dP |εᵗ ) ] = E[Mt ln Mt],

where dQ/dP|εᵗ denotes the Radon–Nikodym derivative of Q with respect to P when restricted to the σ-algebra generated by εᵗ. Relative entropy is nonnegative, and it equals zero if and only if Mt = 1, that is, Q = P. We can decompose relative entropy as:

    E[Mt ln Mt] = Σ_{j=0}^{t−1} E[ Mj E(mj+1 ln mj+1|εʲ) ],
where E[mt+1 ln mt+1|εᵗ] is the conditional relative entropy of a perturbation to the one-step transition density associated with the approximating model. Discounted entropy over an infinite horizon can be expressed as:

    (1 − β) Σ_{j=0}^∞ βʲ E[Mj ln Mj] = Σ_{j=0}^∞ βʲ E[ Mj E(mj+1 ln mj+1|εʲ) ].

To express the idea that model P is a good approximation when Q actually generates the data, we restrain the approximation errors by the entropy constraint:

    Σ_{j=0}^∞ βʲ E[ Mj E(mj+1 ln mj+1|εʲ) ] ≤ η0,  (9.41)

where η0 represents the size of the errors.
Two Robust Control Problems

To acknowledge model misspecification, it is natural to formulate the following problem:

    max_{ut} min_{mt+1} EQ Σ_{t=0}^∞ βᵗ r(xt, ut),

subject to (9.40) and (9.41). The interpretation is as follows. The decision maker may use a distorted belief Q, induced by a sequence of one-step likelihood ratios {mt+1}, to evaluate discounted rewards. Because he is averse to model uncertainty, he chooses the distorted belief that minimizes discounted expected rewards subject to the entropy constraint (9.41), and then chooses an optimal control {ut} under this worst-case belief.
Hansen and Sargent (2008) call the above formulation the constrained problem. They also propose a related multiplier problem, which is more tractable in applications:

    max_{ut} min_{mt+1} EQ Σ_{t=0}^∞ βᵗ r(xt, ut) + βθ EQ{ Σ_{t=0}^∞ βᵗ E[mt+1 ln mt+1|εᵗ] },

subject to (9.40). One may view βᵗ⁺¹θ as the Lagrange multiplier associated with the entropy constraint (9.41). The parameter θ captures the concern for robustness. Following Hansen and Sargent (2008), we may rewrite the above multiplier problem as:

    max_{ut} min_{mt+1} E Σ_{t=0}^∞ βᵗ Mt { r(xt, ut) + βθ E[mt+1 ln mt+1|εᵗ] },  (9.42)

subject to (9.40) and

    Mt+1 = mt+1Mt and E[mt+1|εᵗ] = 1.
When θ goes to infinity, the penalty on distortions is so large that no distortion is optimal. In this case, the multiplier problem converges to the standard control problem under the reference probability measure P. When θ becomes smaller, the decision maker is more concerned about model misspecification. We will show in Chapter 21 that θ can also be interpreted as a parameter for ambiguity aversion: a smaller θ means a larger degree of ambiguity aversion.

Note that θ cannot be arbitrarily small. There is a lower bound θ̲, called the breakdown point, beyond which it is fruitless to seek more robustness because the minimizing agent is sufficiently unconstrained that he can push the objective function to −∞.⁶ We shall assume θ > θ̲ throughout this section.

⁶See Chapter 8 of Hansen and Sargent (2008) for more discussion of the breakdown point.
Recursive Formulation
We can formulate the multiplier problem (9.42) recursively. We conjecture that the value function takes the form W(M, x) = MV(x). Substituting the constraints, we write the Bellman equation as:

    MV(x) = max_u min_{m(ε)} M{ r(x, u) + β ∫ ( m(ε)V(f(x, u, ε)) + θ m(ε) ln m(ε) ) π(ε) dε },

subject to E[m(ε)] = 1. Thus, we obtain

    V(x) = max_u r(x, u) + β min_{m(ε)} ∫ ( m(ε)V(f(x, u, ε)) + θ m(ε) ln m(ε) ) π(ε) dε.
The first-order condition for the minimization problem gives

    ln m*(ε) = −V(f(x, u, ε))/θ + λ,

where the constant λ is chosen such that E[m(ε)] = 1. Thus,

    m*(ε) = exp( −V(f(x, u, ε))/θ ) / ∫ exp( −V(f(x, u, ε))/θ ) π(ε) dε.  (9.43)
Under the minimizing m*, we compute

    R(V)(x, u) ≡ ∫ ( m*(ε)V(f(x, u, ε)) + θ m*(ε) ln m*(ε) ) π(ε) dε  (9.44)
               = −θ ln[ ∫ exp( −V(f(x, u, ε))/θ ) π(ε) dε ],

where R is an operator that adjusts the value function. The expression on the second line is the risk-sensitive adjustment to the value function; we may reinterpret 1/θ as a parameter that enhances risk aversion. The multiplier problem now becomes

    V(x) = max_u r(x, u) + βR(V)(x, u).  (9.45)
This problem is identical to a risk-sensitive control problem in which there is no concern about model misspecification. Absent any choice of control, the above recursive equation defines a recursive utility function. This utility model is a special case of the recursive utility studied by Epstein and Zin (1989); it is also studied by Hansen and Sargent (1995) in a linear-quadratic setting.
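On a finite state space, the equality in (9.44) between the minimized objective and the risk-sensitive formula is easy to verify directly. A sketch with hypothetical probabilities and continuation values:

```python
import numpy as np

# Discrete approximation: reference probabilities pi and continuation values V
pi = np.array([0.2, 0.3, 0.5])
V = np.array([1.0, -0.5, 2.0])
theta = 2.0

# Worst-case likelihood ratio m* is proportional to exp(-V/theta), per (9.43)
m = np.exp(-V / theta)
m /= pi @ m                                  # normalize so that E[m] = 1

# Minimized objective: E[m V] + theta * E[m log m]
obj = pi @ (m * V) + theta * (pi @ (m * np.log(m)))

# Risk-sensitive formula from (9.44): -theta * log E[exp(-V/theta)]
rs = -theta * np.log(pi @ np.exp(-V / theta))
```

By Jensen's inequality, the adjusted value is below the expected value under the reference measure, which is the sense in which the adjustment enhances risk aversion.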
Linear Quadratic Model with Gaussian Disturbances
Now consider the linear-quadratic setting studied in Section 9.3. Define

    r(x, u) = −(1/2)(x′Qx + u′Ru + 2x′Su).

Suppose the disturbances {εt} form an IID Gaussian process. In this setting, the value function for the multiplier problem is quadratic and takes the form:

    V(x) = −(1/2)x′Px − d/2.  (9.46)
By equation (9.43), we can compute the worst-case likelihood ratio:

    m*(ε) ∝ exp[ (1/(2θ))ε′C′PCε + (1/θ)ε′C′P(Ax + Bu) ],

where ∝ means "proportional to." When the reference density π is a standard normal density, we can compute the worst-case distorted density:

    π*(ε) = π(ε)m*(ε) ∝ exp[ −(1/2)ε′(I − (1/θ)C′PC)ε + (1/θ)ε′C′P(Ax + Bu) ].

Thus, π*(ε) is also a normal density, with mean (θI − C′PC)⁻¹C′P(Ax + Bu) and covariance matrix (I − θ⁻¹C′PC)⁻¹. In this computation, we must assume that the matrix (I − θ⁻¹C′PC) is nonsingular.
Next, we compute the adjustment to the value function in (9.44):

    R(V)(x, u) = ∫ V(f(x, u, ε)) π*(ε) dε + θ ∫ π*(ε) ln( π*(ε)/π(ε) ) dε
               = −(1/2)(Ax + Bu)′[ P + PC(θI − C′PC)⁻¹C′P ](Ax + Bu)
                 − d/2 − (θ/2) ln det( I − (1/θ)C′PC )⁻¹.

Alternatively, we can compute this adjustment directly using the expression on the second line of equation (9.44).
Define an operator D by:

    D(P) = P + PC(θI − C′PC)⁻¹C′P.

Using this operator and substituting the conjecture (9.46) and the expression for R(V) into (9.45), we obtain:

    −(1/2)x′Px − d/2 = max_u −(1/2)x′Qx − (1/2)u′Ru − x′Su
                       − (β/2)(Ax + Bu)′D(P)(Ax + Bu)
                       − βd/2 − (βθ/2) ln det( I − (1/θ)C′PC )⁻¹.

Using the first-order condition, we can solve for the optimal control:

    u = −Fx,  (9.47)

where

    F = (R + βB′D(P)B)⁻¹(S′ + βB′D(P)A).  (9.48)

Substituting this decision rule back into the Bellman equation and matching coefficients, we obtain

    P = T ∘ D(P),  (9.49)
where the operator T is defined as

    T(P) = Q + βA′PA − (S + βA′PB)(R + βB′PB)⁻¹(S′ + βB′PA),  (9.50)

and

    d = (1 − β)⁻¹βθ ln det( I − (1/θ)C′PC )⁻¹.  (9.51)

Note that the operator T is identical to that in (9.11) for the standard control problem.

In summary, we use equations (9.49) and (9.50) to solve for P; a simple value function iteration algorithm as in Section 9.3 may be applied. We then use equation (9.51) to determine the constant term in the value function. Finally, we use equations (9.47) and (9.48) to solve for the robust decision rule.
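A minimal sketch of this procedure (hypothetical scalar model; D as defined above, T from (9.50)) iterates P ← T(D(P)) and recovers the robust rule from (9.47)–(9.48):

```python
import numpy as np

def D_op(P, C, theta):
    """Robustness adjustment: D(P) = P + P C (theta I - C'PC)^{-1} C' P."""
    k = C.shape[1]
    return P + P @ C @ np.linalg.solve(theta * np.eye(k) - C.T @ P @ C, C.T @ P)

def T_op(P, A, B, Q, R, S, beta):
    """Standard Riccati operator T(P) from (9.50)."""
    RB = R + beta * B.T @ P @ B
    return (Q + beta * A.T @ P @ A
            - (S + beta * A.T @ P @ B) @ np.linalg.solve(RB, S.T + beta * B.T @ P @ A))

def robust_rule(A, B, C, Q, R, S, beta, theta, tol=1e-12, max_iter=100_000):
    P = np.zeros_like(Q)
    P_new = P
    for _ in range(max_iter):
        P_new = T_op(D_op(P, C, theta), A, B, Q, R, S, beta)
        if np.max(np.abs(P_new - P)) < tol:
            break
        P = P_new
    DP = D_op(P_new, C, theta)
    F = np.linalg.solve(R + beta * B.T @ DP @ B, S.T + beta * B.T @ DP @ A)
    return P_new, F

# Hypothetical scalar example
A = np.array([[1.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])
Q = np.array([[1.0]]); R = np.array([[1.0]]); S = np.zeros((1, 1))
P_robust, F_robust = robust_rule(A, B, C, Q, R, S, beta=0.95, theta=10.0)
P_std, F_std = robust_rule(A, B, C, Q, R, S, beta=0.95, theta=1e12)  # theta -> infinity
```

A very large θ shuts down the adjustment (D(P) ≈ P), recovering the standard Riccati solution, while a moderate θ raises P, since the worst-case belief lowers continuation values.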
Relative Entropy and Normal Distributions
We have shown that the worst-case distribution is normal, so it is useful to compute relative entropy for normal distributions. Suppose that π is a multivariate standard normal density and that π̂ is normal with mean w and nonsingular covariance matrix Σ. Our objective is to compute the relative entropy

    ∫ π̂(ε) ln( π̂(ε)/π(ε) ) dε.

First, we compute the log likelihood ratio:

    ln( π̂(ε)/π(ε) ) = (1/2)[ −(ε − w)′Σ⁻¹(ε − w) + ε′ε − ln det Σ ].
Next, we compute

    −∫ (1/2)(ε − w)′Σ⁻¹(ε − w) π̂(ε) dε = −(1/2)tr(I).

Using the fact that

    (1/2)ε′ε = (1/2)w′w + (1/2)(ε − w)′(ε − w) + w′(ε − w),

we compute

    (1/2) ∫ ε′ε π̂(ε) dε = (1/2)w′w + (1/2)tr(Σ).

Combining terms yields

    ∫ π̂(ε) ln( π̂(ε)/π(ε) ) dε = (1/2)w′w − (1/2) ln det Σ + (1/2)tr(Σ − I).  (9.52)

The first term on the right-hand side measures the mean distortion, and the last two terms measure the distortion of the variances.
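A quick one-dimensional check of (9.52), comparing grid integration against the closed form (w and s are arbitrary test values):

```python
import numpy as np

# 1-D check of the entropy formula: pi = N(0, 1), pihat = N(w, s^2)
w, s = 0.7, 1.3
eps = np.linspace(-12.0, 12.0, 200_001)
pihat = np.exp(-(eps - w)**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
pi = np.exp(-eps**2 / 2) / np.sqrt(2 * np.pi)

# Riemann-sum approximation of the relative entropy integral
kl_numeric = np.sum(pihat * np.log(pihat / pi)) * (eps[1] - eps[0])

# closed form (9.52) in one dimension: (1/2)w^2 - (1/2)log s^2 + (1/2)(s^2 - 1)
kl_formula = 0.5 * w**2 - 0.5 * np.log(s**2) + 0.5 * (s**2 - 1)
```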
Modified Certainty Equivalence Principle
In computation, it is convenient to solve a deterministic problem with the state transition equation:

    xt+1 = Axt + But + Cwt+1,

where we have replaced the stochastic shock by a distorted mean w. To solve for the worst-case distortion, we consider the following minimization problem:

    min_w −(1/2)(Ax + Bu + Cw)′P(Ax + Bu + Cw) + (θ/2)w′w.

In this problem, relative entropy is not well defined. Instead, motivated by (9.52), we penalize only distortions to the mean. The solution for w is

    w* = (θI − C′PC)⁻¹C′P(Ax + Bu).

This is identical to the mean distortion of the worst-case normal distribution π*(ε) discussed earlier. The minimized objective function is

    −(1/2)(Ax + Bu)′[ P + PC(θI − C′PC)⁻¹C′P ](Ax + Bu),

which is identical to the part of the stochastic robust adjustment to the value function coming from the quadratic form in (Ax + Bu).
The difference between this deterministic problem and the stochastic problem is that the distorted covariance matrix of the worst-case normal distribution and the constant term in the adjusted value function are absent from the deterministic problem. However, neither object affects the computation of the robust decision rule or the matrix P in the value function. This is the modified certainty equivalence principle studied by Hansen and Sargent (2005a). We emphasize that, unlike in the standard control problem, the volatility matrix C affects both the decision rule and the matrix P in the deterministic robust control problem.
9.7 Exercises
1. One can normalize S = 0 in the control problem. Show that this can be achieved by adopting the transformed control variable ũt = ut + R⁻¹S′xt, where we assume R is nonsingular.
2. Solve the following uncentered control problem:

    max_{ut} −E Σ_{t=0}^∞ βᵗ [ (xt − x*)′ (ut − u*)′ ] ⎡ Q   S ⎤ ⎡ xt − x* ⎤ ,  β ∈ (0, 1],
                                                       ⎣ S′  R ⎦ ⎣ ut − u* ⎦

subject to

    xt+1 = Axt + But + Cεt+1,  x0 given,  (9.53)

where x* and u* are exogenously given.
3. Show that as θ → ∞, the solution for the multiplier problem in (9.48)–(9.51) converges to the solution of the standard control problem without robustness.
4. (Hall (1978)) Consider the following consumption-saving problem:

    max −E Σ_{t=0}^∞ βᵗ(ct − b)²,  β ∈ (0, 1),

subject to

    ct + at+1 = Rat + yt,
    yt = μy(1 − ρ) + ρyt−1 + σyεt,

where b, μy > 0, R > 1, |ρ| < 1, and εt is IID Gaussian. Use xt = [1, at, yt]′ as the state vector. Solve for the optimal consumption rule.
5. Let the robustness parameter be θ and consider the multiplier prob-
lem for the above consumption-saving choices. Solve for the robust
consumption rule.
Chapter 10

Control under Partial Information
Economic agents often possess limited information. In addition, economic information is often noisy, possibly because of measurement errors. In this chapter, we study control problems under partial (or imperfect) information. To do so, we first study how to estimate hidden states given observations.
10.1 Filters
Filtering is a procedure of estimating hidden states based on observable
data. We first present the Kalman filter in a linear-Gaussian framework.
We then study nonlinear filters.
10.1.1 Kalman Filter
We consider the following linear state-space representation:

    xt+1 = Atxt + Btut + Ctεt+1,  (10.1)
    yt = Gtxt + Ht−1ut−1 + Dtεt,  (10.2)

where xt is an nx × 1 state vector, yt is an ny × 1 observation (or measurement) vector, ut is an nu × 1 control vector, {εt} is an IID sequence of Gaussian vectors with Eεtε′t = I, and At, Bt, Ct, Dt, Gt, and Ht are conformable matrices with known but possibly time-varying elements.¹ Assume that x0 is Gaussian, distributed N(x̄0, Σ0), and independent of all other random variables. Let Ft = {y0, y1, ..., yt} be the information set at date t, and let ut be adapted to Ft. Equation (10.1) is called the transition equation, and equation (10.2) is called the measurement equation. Note that we allow the errors in these two equations to be correlated, though the uncorrelated case is often considered in applications.
In econometrics, the state-space representation is often written as

    xt+1 = Axt + Cεt+1,  (10.3)
    yt = Gxt + Hzt + Dεt,  (10.4)

where zt is an nz × 1 vector of exogenous or predetermined variables. Let Ft = {y0, y1, ..., yt, z0, z1, ..., zt}. Assume that zt is independent of xs and εs for any s ≥ t. The matrices A, C, D, G, and H may be unknown and need to be estimated from the observations {y0, ..., yT} and {z0, ..., zT}.

We adopt the following notation:

• xt|s = E[xt|Fs] for s ≤ t,
• Σt|t = E[(xt − xt|t)(xt − xt|t)′|Ft],
• Σt|t−1 = E[(xt − xt|t−1)(xt − xt|t−1)′|Ft−1],
• Ωt|t−1 = E[(yt − yt|t−1)(yt − yt|t−1)′|Ft−1].
Because of the assumption of Gaussian random variables, we can treat the above conditional variance-covariance matrices as the unconditional variance-covariance matrices. Thus, given information F_t, x_t is distributed according to N(x_{t|t}, Σ_{t|t}), x_{t+1} is distributed according to N(x_{t+1|t}, Σ_{t+1|t}), and y_{t+1} is distributed according to N(G_{t+1} x_{t+1|t} + H_t u_t, Ω_{t+1|t}).

1See Hamilton (1994), Kim and Nelson (1999), and Ljungqvist and Sargent (2004) for various applications of the Kalman filter.
Example 10.1.1 Labor income process.
Guvenen (2007) considers the following dynamics for the log labor income process:

y_t = α + β t + z_t + v_t,  (10.5)

z_t = ρ z_{t-1} + η_t,  (10.6)

where α and β are unknown parameters and z_t is a hidden state representing the persistent component of labor income. The information set at date t is y^t = {y_0, ..., y_t}. Assume that {v_t}, {η_t}, α, β, and z_0 are Gaussian and independent of each other. We can express the above system in the state-space representation:

x_{t+1} ≡ [α, β, z_{t+1}]' = [[1, 0, 0], [0, 1, 0], [0, 0, ρ]] [α, β, z_t]' + [[0, 0], [0, 0], [1, 0]] [η_{t+1}, v_{t+1}]' = A x_t + C ε_{t+1},

and

y_t = [1, t, 1] [α, β, z_t]' + [0, 1] [η_t, v_t]' = G_t x_t + D ε_t.
The Kalman filter gives the following recursive forecasting procedure. Suppose that we know x_{t|t-1} and y_{t|t-1} at date t. After observing y_t, the Kalman filter gives the estimate x_{t|t} using the linear equation:

x_{t|t} = x_{t|t-1} + K_t (y_t - y_{t|t-1}),  (10.7)

where K_t is called a Kalman gain, which measures how much x_{t|t-1} is updated based on the error in predicting y_t. This error is often called an innovation, denoted by a_t = y_t - y_{t|t-1}. The variance-covariance matrix of the innovation is given by

Ω_{t|t-1} = E[a_t a_t'] = E[(G_t (x_t - x_{t|t-1}) + D_t ε_t)(G_t (x_t - x_{t|t-1}) + D_t ε_t)']
= G_t Σ_{t|t-1} G_t' + D_t D_t' + G_t C_t D_t' + D_t C_t' G_t',  (10.8)

where we have used the fact that

E[(x_t - x_{t|t-1}) ε_t'] = E[(A_{t-1} x_{t-1} + B_{t-1} u_{t-1} + C_t ε_t - x_{t|t-1}) ε_t'] = C_t.  (10.9)

The optimal Kalman gain K_t minimizes the sum of the variances of the estimation errors:

E[(x_t - x_{t|t})'(x_t - x_{t|t})].

We will show below that the solution is given by

K_t = (Σ_{t|t-1} G_t' + C_t D_t') Ω_{t|t-1}^{-1}.  (10.10)
The intuition behind the formula for K_t is the following: if there is a large error in forecasting x_t using past information y^{t-1} (i.e., Σ_{t|t-1} is large), then the Kalman filter gives a larger weight to new information (i.e., K_t is large). If new information is noisy (i.e., D_t D_t' is large), the Kalman filter gives a larger weight to the old prediction (i.e., K_t is small).
Since x_{t+1|t} = A_t x_{t|t} + B_t u_t and y_{t+1|t} = G_{t+1} x_{t+1|t} + H_t u_t, we can then move to date t + 1 and use the previous procedure again. To make this procedure work, we need to update Σ_{t|t-1}. In doing so, we first compute

Σ_{t+1|t} = E[(A_t (x_t - x_{t|t}) + C_t ε_{t+1})(A_t (x_t - x_{t|t}) + C_t ε_{t+1})'] = A_t Σ_{t|t} A_t' + C_t C_t'.
We then compute

Σ_{t|t} = E[(x_t - x_{t|t})(x_t - x_{t|t})']
= E[(x_t - x_{t|t-1} - K_t a_t)(x_t - x_{t|t-1} - K_t a_t)']
= E[((I - K_t G_t)(x_t - x_{t|t-1}) - K_t D_t ε_t)((I - K_t G_t)(x_t - x_{t|t-1}) - K_t D_t ε_t)']
= (I - K_t G_t) Σ_{t|t-1} (I - K_t G_t)' + K_t D_t D_t' K_t'
- E[(I - K_t G_t)(x_t - x_{t|t-1}) ε_t' D_t' K_t'] - E[K_t D_t ε_t (x_t - x_{t|t-1})'(I - K_t G_t)'],

where we have used (10.7) in the second equality and the definition of the innovation in the third equality. Using (10.9), we compute

E[(I - K_t G_t)(x_t - x_{t|t-1}) ε_t' D_t' K_t'] = E[(I - K_t G_t)(A_{t-1} x_{t-1} + B_{t-1} u_{t-1} + C_t ε_t - x_{t|t-1}) ε_t' D_t' K_t'] = (I - K_t G_t) C_t D_t' K_t'.

Similarly, we can compute E[K_t D_t ε_t (x_t - x_{t|t-1})'(I - K_t G_t)']. We then obtain

Σ_{t|t} = (I - K_t G_t) Σ_{t|t-1} (I - K_t G_t)' + K_t D_t D_t' K_t'
- (I - K_t G_t) C_t D_t' K_t' - K_t D_t C_t' (I - K_t G_t)'.  (10.11)
We can use (10.11) to understand how we obtain the optimal Kalman gain matrix in (10.10). Minimizing E[(x_t - x_{t|t})'(x_t - x_{t|t})] is equivalent to minimizing the trace of Σ_{t|t}. Taking the first-order condition using (10.11) yields:

0 = ∂tr(Σ_{t|t})/∂K_t = -2 (Σ_{t|t-1} G_t' + C_t D_t') + 2 K_t (G_t Σ_{t|t-1} G_t' + D_t D_t' + G_t C_t D_t' + D_t C_t' G_t').

Solving gives the optimal Kalman gain matrix K_t in (10.10). Plugging this equation and (10.8) into (10.11) yields:

Σ_{t|t} = (I - K_t G_t) Σ_{t|t-1} - K_t D_t C_t'.
One can derive (10.7) using a probabilistic approach. By the Projection Theorem,

x_{t|t} = x_{t|t-1} + Cov(x_t, y_t | y^{t-1}) Var(y_t | y^{t-1})^{-1} (y_t - y_{t|t-1}).  (10.12)

We then compute

Cov(x_t, y_t | y^{t-1}) = E[(x_t - x_{t|t-1})(y_t - y_{t|t-1})'] = E[(x_t - x_{t|t-1})(G_t (x_t - x_{t|t-1}) + D_t ε_t)'] = Σ_{t|t-1} G_t' + C_t D_t',

Var(y_t | y^{t-1}) = E[(y_t - y_{t|t-1})(y_t - y_{t|t-1})'] = Ω_{t|t-1}.

The Kalman gain is given by

K_t = Cov(x_t, y_t | y^{t-1}) Var(y_t | y^{t-1})^{-1}.
Now, we summarize the Kalman filter algorithm as follows:

Step 1. Initialization:

Σ_{0|-1} = Σ_0,  x_{0|-1} = x̄_0.

We may set the initial values at the steady-state values such that

x̄_0 = A x̄_0,  Σ_0 = A Σ_0 A' + C C',

where the second equation can be solved as vec(Σ_0) = (I - A ⊗ A)^{-1} vec(CC').

Step 2. At date t, start with x_{t|t-1} and Σ_{t|t-1}. After observing y_t, compute the following:

Ω_{t|t-1} = G_t Σ_{t|t-1} G_t' + D_t D_t' + G_t C_t D_t' + D_t C_t' G_t',

K_t = (Σ_{t|t-1} G_t' + C_t D_t') Ω_{t|t-1}^{-1},

x_{t|t} = x_{t|t-1} + K_t (y_t - G_t x_{t|t-1} - H_{t-1} u_{t-1}),

Σ_{t|t} = (I - K_t G_t) Σ_{t|t-1} - K_t D_t C_t',

Σ_{t+1|t} = A_t Σ_{t|t} A_t' + C_t C_t',

x_{t+1|t} = A_t x_{t|t} + B_t u_t.

Step 3. We obtain x_{t+1|t} and Σ_{t+1|t} and return to Step 2, starting at date t + 1.
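The three steps above can be sketched in a few lines of NumPy for the time-invariant case. The matrix names follow the state-space system (10.1)-(10.2); the function name and interface are ours, not the book's:

```python
import numpy as np

def kalman_filter(y, A, B, C, D, G, H, x0, Sigma0, u=None):
    """Kalman filter for x_{t+1} = A x_t + B u_t + C e_{t+1},
    y_t = G x_t + H u_{t-1} + D e_t, with e_t ~ N(0, I).
    Returns filtered means x_{t|t}, the sequence of Sigma_{t|t-1},
    and the final one-step-ahead variance."""
    T = len(y)
    nx = A.shape[0]
    if u is None:
        u = np.zeros((T, B.shape[1]))
    x_pred, S_pred = np.asarray(x0, float).copy(), np.asarray(Sigma0, float).copy()
    u_prev = np.zeros(B.shape[1])
    xs_filt, S_preds = [], []
    for t in range(T):
        S_preds.append(S_pred.copy())
        # innovation and its covariance, equation (10.8)
        a = y[t] - G @ x_pred - H @ u_prev
        Omega = G @ S_pred @ G.T + D @ D.T + G @ C @ D.T + D @ C.T @ G.T
        # optimal gain (10.10) and updating step (10.7)
        K = (S_pred @ G.T + C @ D.T) @ np.linalg.inv(Omega)
        x_filt = x_pred + K @ a
        S_filt = (np.eye(nx) - K @ G) @ S_pred - K @ D @ C.T
        xs_filt.append(x_filt)
        # one-step-ahead prediction for date t + 1
        x_pred = A @ x_filt + B @ u[t]
        S_pred = A @ S_filt @ A.T + C @ C.T
        u_prev = u[t]
    return np.array(xs_filt), S_preds, S_pred
```

Running this recursion on Muth's example below (Q = R = 1), the one-step-ahead variance settles at its stationary value after a few dozen iterations.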
We may rewrite the Kalman filter in a fully recursive form:

x_{t+1|t} = A_t x_{t|t-1} + B_t u_t + A_t K_t a_t,

where the innovation a_t = y_t - G_t x_{t|t-1} - H_{t-1} u_{t-1} is independently and normally distributed with mean zero and covariance matrix Ω_{t|t-1}, and Σ_{t+1|t} satisfies the recursive equation:

Σ_{t+1|t} = A_t [(I - K_t G_t) Σ_{t|t-1} - K_t D_t C_t'] A_t' + C_t C_t'
= A_t Σ_{t|t-1} A_t' + C_t C_t' - A_t (Σ_{t|t-1} G_t' + C_t D_t') Ω_{t|t-1}^{-1} (G_t Σ_{t|t-1} + D_t C_t') A_t',

where we have used equation (10.10). This is a Riccati equation for Σ_{t+1|t}.

Alternatively, we may rewrite the Kalman filter as:

x_{t+1|t+1} = x_{t+1|t} + K_{t+1} a_{t+1} = A_t x_{t|t} + B_t u_t + K_{t+1} a_{t+1},

and Σ_{t|t} satisfies the recursion:

Σ_{t+1|t+1} = (I - K_{t+1} G_{t+1}) Σ_{t+1|t} - K_{t+1} D_{t+1} C_{t+1}'
= (I - K_{t+1} G_{t+1}) (A_t Σ_{t|t} A_t' + C_t C_t') - K_{t+1} D_{t+1} C_{t+1}'.
For infinite-horizon models, we typically consider a time-homogeneous
setup in which At, Bt, Ct, Dt, Gt, and Ht are all independent of t. In this
case, we often use the long-run stationary Kalman filter in which Σt → Σ
and Kt → K. The law of motion for the Kalman filter is stationary so that
we can build it into stationary dynamic models using a recursive approach.
Now, we provide some examples to illustrate the application of the
Kalman filter.
Example 10.1.2 Muth's (1960) example.
Muth (1960) studies the model:

x_{t+1} = x_t + η_{t+1},
y_t = x_t + v_t,

where x_t, y_t are scalar stochastic processes, and {η_t}, {v_t} are mutually independent IID Gaussian processes with means zero and variances E η_t² = Q and E v_t² = R. Here x_t is a hidden state and y_t is observable. The initial condition is that x_0 is Gaussian with mean x̄_0 and variance Σ_0. Applying the Kalman filter gives

K_t = Σ_{t|t-1} / (Σ_{t|t-1} + R),

Σ_{t+1|t} = [Σ_{t|t-1} (R + Q) + QR] / (Σ_{t|t-1} + R).

As t → ∞, Σ_{t|t-1} → Σ and K_t → K. Thus, we obtain

x_{t+1|t} = (1 - K) x_{t|t-1} + K y_t.

This is a version of Cagan's adaptive expectations formula. Thus, Muth (1960) constructed stochastic processes for which the above adaptive expectations formula is optimal.

In the special case where η_t = Q = 0, we may set x_t = θ for all t. Define m_t = E[θ | y^t] = x_{t+1|t} and σ_{t+1} = E(θ - m_t)² = Σ_{t+1|t}. Using the above result, we obtain:

m_t = (1 - K_t) m_{t-1} + K_t y_t,
K_t = σ_t / (σ_t + R),
σ_t = σ_{t-1} R / (σ_{t-1} + R).

We may write the innovation form:

m_{t+1} = m_t + K_{t+1} a_{t+1} = m_t + √(K_{t+1} σ_{t+1}) ξ_{t+1},

where ξ_{t+1} is a standard IID normal random variable. Here we used the fact that the variance of a_{t+1} is σ_{t+1} + R, so the variance of K_{t+1} a_{t+1} is K_{t+1}² (σ_{t+1} + R) = K_{t+1} σ_{t+1}.
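The variance recursion in this example is easy to iterate numerically. The sketch below (function name and parameter values are ours, for illustration) drives Σ_{t|t-1} to its stationary value; for Q = R = 1 the limit solves Σ² - Σ - 1 = 0, i.e., Σ = (1 + √5)/2:

```python
def muth_sigma(Q, R, Sigma0=1.0, n_iter=500):
    """Iterate the Muth recursion Sigma' = (Sigma*(R+Q) + Q*R)/(Sigma + R)
    to (numerical) convergence and return the stationary variance and gain."""
    Sigma = Sigma0
    for _ in range(n_iter):
        Sigma = (Sigma * (R + Q) + Q * R) / (Sigma + R)
    K = Sigma / (Sigma + R)  # stationary Kalman gain
    return Sigma, K
```

With the stationary K in hand, the filter collapses to the adaptive expectations formula x_{t+1|t} = (1 - K) x_{t|t-1} + K y_t.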
Example 10.1.3 Estimating long-run productivity growth.
Edge, Laubach, and Williams (2007) and Gilchrist and Saito (2008) consider the following filtering problem in dynamic stochastic general equilibrium models. Assume that agents observe the level of labor productivity, but not its persistent and transitory components. We denote by z_t the log of total factor productivity, and by x_t its expected growth rate. Assume z_t and x_t follow:

x_t = (1 - ρ) g + ρ x_{t-1} + η_t,
z_t = z_{t-1} + x_t + v_t,

where {η_t} and {v_t} are mutually independent IID Gaussian processes with E η_t = E v_t = 0, E η_t² = σ_η², and E v_t² = σ_v². Let x_{t|t} = E[x_t | z_0, ..., z_t] and y_t = z_t - z_{t-1}. Applying the stationary Kalman filter gives:

x_{t+1|t+1} = ρ x_{t|t} + (1 - ρ) g + K a_{t+1},

where

a_{t+1} = z_{t+1} - z_t - ρ x_{t|t} - (1 - ρ) g.

The Kalman gain is given by

K = Σ / (Σ + σ_v²),

and the stationary variance Σ = E(x_t - x_{t|t-1})² satisfies:

Σ = ρ² (1 - Σ (Σ + σ_v²)^{-1}) Σ + σ_η².

Solving this equation yields:

Σ = (σ_v²/2) [-(1 - ρ² - φ) + √((1 - ρ² - φ)² + 4φ)],

where φ = σ_η²/σ_v². It follows that

K = 1 - 2 / [2 - (1 - ρ² - φ) + √((1 - ρ² - φ)² + 4φ)].

We can also compute the variance of the innovation:

E a_t² = Σ + σ_v².

Thus, conditional on z^t, x_{t+1|t+1} is normally distributed with mean ρ x_{t|t} + (1 - ρ) g and variance K² (Σ + σ_v²) = K Σ.
Edge, Laubach, and Williams (2007) and Gilchrist and Saito (2008) apply this filter to dynamic stochastic general equilibrium models.
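The closed-form expressions for Σ and K are straightforward to verify numerically. The sketch below (function name and parameter values are ours, for illustration) checks that the closed-form Σ satisfies its fixed-point equation:

```python
import numpy as np

def stationary_filter(rho, sig_eta, sig_v):
    """Closed-form stationary variance Sigma and gain K
    for the productivity-growth filtering example."""
    phi = sig_eta ** 2 / sig_v ** 2
    root = np.sqrt((1 - rho ** 2 - phi) ** 2 + 4 * phi)
    Sigma = 0.5 * sig_v ** 2 * (-(1 - rho ** 2 - phi) + root)
    K = Sigma / (Sigma + sig_v ** 2)
    return Sigma, K
```

The gain formula K = 1 - 2/[2 - (1 - ρ² - φ) + √((1 - ρ² - φ)² + 4φ)] in the text is algebraically equivalent to K = Σ/(Σ + σ_v²).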
Example 10.1.4 Likelihood function.
Suppose that we observe a sequence of data y^T generated by the state-space system (10.1) and (10.2). The log-likelihood is

ln p(y^T | θ) = Σ_{t=1}^T ln p(y_t | y^{t-1}, θ)
= -(n_y T/2) ln(2π) - (1/2) Σ_{t=1}^T ln |Ω_{t|t-1}| - (1/2) Σ_{t=1}^T a_t' Ω_{t|t-1}^{-1} a_t,

where θ is a vector of model parameters, a_t = y_t - y_{t|t-1} is the innovation, and Ω_{t|t-1} is given by (10.8).
10.1.2 Hidden Markov Chain
Consider an N-state Markov chain. We represent the state space in terms of the unit vectors {e_1, e_2, ..., e_N}, where e_i is the ith N-dimensional unit (column) vector. Let the transition matrix be P, with (i, j) element
pij = Pr (xt+1 = ej |xt = ei) .
Given this formulation, we have
E [xt+1|xt] = P ′xt.
We then obtain the state transition equation:
x_{t+1} = P' x_t + v_{t+1},  (10.13)

where {v_t} is a martingale difference process adapted to {x_t} since E[v_{t+1} | x_t] = 0. To compute its conditional second moment, we use (10.13) to derive

x_{t+1} x_{t+1}' = P' x_t (P' x_t)' + P' x_t v_{t+1}' + v_{t+1} x_t' P + v_{t+1} v_{t+1}',

and

x_{t+1} x_{t+1}' = diag(x_{t+1}) = diag(P' x_t) + diag(v_{t+1}),

where diag(x_{t+1}) is a diagonal matrix whose diagonal entries are the elements of x_{t+1}. Thus,

v_{t+1} v_{t+1}' = diag(P' x_t) + diag(v_{t+1}) - [P' x_t (P' x_t)' + P' x_t v_{t+1}' + v_{t+1} x_t' P].

It follows that

E[v_{t+1} v_{t+1}' | x_t] = diag(P' x_t) - P' diag(x_t) P.
We next introduce a measurement equation. Suppose that x_t is not observed, but that a noisy signal of x_t, denoted by y_t, is observed. Let y_t take values in the set {ξ_1, ξ_2, ..., ξ_M}, where ξ_i is the ith M-dimensional unit vector. Define the N × M matrix Q = (q_{ij}) by

Pr(y_t = ξ_j | x_t = e_i) = q_{ij}.

It follows that

E[y_t | x_t] = Q' x_t.

We can then write the measurement equation as

y_t = Q' x_t + w_t,  (10.14)

where {w_t} is a martingale difference process satisfying E[w_t | x_t] = 0 and

E[w_t w_t' | x_t] = diag(Q' x_t) - Q' diag(x_t) Q.
Our goal is to give a recursive formula for computing the conditional (posterior) distribution of the hidden state. By Bayes's rule,

π_t(e_i | y^t) ≡ Pr{x_t = e_i | y^t} = Pr(x_t = e_i, y_t | y^{t-1}) / Pr(y_t | y^{t-1})
= Σ_{x_{t-1}} Pr(y_t | x_t = e_i) Pr(x_t = e_i | x_{t-1}) Pr(x_{t-1} | y^{t-1}) / [Σ_{x_t} Σ_{x_{t-1}} Pr(y_t | x_t) Pr(x_t | x_{t-1}) Pr(x_{t-1} | y^{t-1})].

It follows that

π_t(e_i | y^{t-1}, y_t = ξ_j) = q_{ij} Σ_s p_{si} π_{t-1}(e_s | y^{t-1}) / [Σ_i q_{ij} Σ_s p_{si} π_{t-1}(e_s | y^{t-1})],

where ξ_j is the observed value of y_t. The interpretation is that given the conditional distribution π_{t-1}(· | y^{t-1}) at date t - 1 and after observing the signal y_t = ξ_j, the conditional distribution of the hidden state at date t is given by π_t(· | y^{t-1}, y_t = ξ_j). We may set the prior distribution of x_0 as the stationary distribution π^{ss}. Thus,

π_0(e_i | y_0 = ξ_j) = Pr(x_0 = e_i | y_0 = ξ_j) = Pr(y_0 = ξ_j | x_0 = e_i) Pr(x_0 = e_i) / [Σ_i Pr(y_0 = ξ_j | x_0 = e_i) Pr(x_0 = e_i)] = q_{ij} π_i^{ss} / Σ_i q_{ij} π_i^{ss}.
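The recursion above amounts to one matrix-vector product and one normalization per observation. A minimal sketch follows (function name and example matrices are ours); it treats the supplied prior as the date t - 1 posterior, so each step first predicts through P and then updates by Bayes's rule:

```python
import numpy as np

def hmm_filter(P, Q, pi0, obs):
    """Discrete hidden Markov chain filter.
    P[i, j] = Pr(x' = e_j | x = e_i), Q[i, j] = Pr(y = xi_j | x = e_i),
    pi0 is the prior over states, obs is a sequence of signal indices."""
    pi = np.asarray(pi0, float).copy()
    path = []
    for j in obs:
        pred = P.T @ pi           # one-step-ahead state distribution
        post = Q[:, j] * pred     # multiply by signal likelihoods q_{ij}
        pi = post / post.sum()    # normalize (Bayes's rule)
        path.append(pi.copy())
    return np.array(path)
```

Initializing pi0 at the stationary distribution of P matches the convention suggested in the text.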
10.1.3 Hidden Markov-Switching Model
Consider the hidden regime switching model in which the state transition
equation is given by (10.13) and the measurement equation is given by
yt = h (xt, εt) ,
where ε_t is IID. An example is the following:

Δ ln C_t = κ(x_t) + ε_t,

where Δ ln C_t is consumption growth, x_t ∈ {e_1, e_2}, κ(e_i) = κ_i, and ε_t is IID N(0, σ²). In this example, x_t describes hidden economic regimes. The expected growth rates of consumption are different in the two regimes.

Let

f(x, y) = Pr(h(x, ε) = y) = Pr(y_t = y | x_t = x).
Define the conditional (posterior) distributions of the hidden state as

π_t(e_i | y^t) = Pr(x_t = e_i | y^t),  π_t(e_i | y^{t-1}) = Pr(x_t = e_i | y^{t-1}).

Our goal is to derive a recursive formula for computing these conditional distributions. First, we compute

π_t(e_i | y^{t-1}) = Pr(x_t = e_i | y^{t-1}) = Σ_{x_{t-1}} Pr(x_t = e_i | y^{t-1}, x_{t-1}) Pr(x_{t-1} | y^{t-1})
= Σ_{x_{t-1}} Pr(x_t = e_i | x_{t-1}) Pr(x_{t-1} | y^{t-1}) = Σ_j p_{ji} π_{t-1}(e_j | y^{t-1}).
We then compute

π_t(e_i | y^t) = Pr(x_t = e_i | y^t) = Pr(y_t | x_t = e_i, y^{t-1}) Pr(x_t = e_i | y^{t-1}) / [Σ_j Pr(y_t | x_t = e_j, y^{t-1}) Pr(x_t = e_j | y^{t-1})]
= f(e_i, y_t) π_t(e_i | y^{t-1}) / [Σ_j f(e_j, y_t) π_t(e_j | y^{t-1})].

The interpretation is that given the date t - 1 conditional distribution π_{t-1}(· | y^{t-1}) and after observing y_t at date t, we can use the above equations to compute the conditional distributions π_t(· | y^t) and π_t(e_i | y^{t-1}) at date t. We may set the initial prior distribution π_0(e_i | y^{-1}) as the stationary distribution of the Markov chain.
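For the consumption-growth example, f(e_i, y) is a normal density with regime-specific mean κ_i, and the two-equation recursion can be sketched directly. The parameter values below are illustrative, not from the text:

```python
import numpy as np

def switching_filter(P, kappa, sigma, prior, data):
    """Filter for the hidden regime-switching model
    dlnC_t = kappa(x_t) + eps_t, with eps_t ~ N(0, sigma^2)."""
    def f(k, y):  # normal density of observation y given regime mean k
        return np.exp(-0.5 * ((y - k) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    pi = np.asarray(prior, float).copy()
    out = []
    for y in data:
        pred = P.T @ pi                               # pi_t(e_i | y^{t-1})
        post = np.array([f(k, y) for k in kappa]) * pred
        pi = post / post.sum()                        # pi_t(e_i | y^t)
        out.append(pi.copy())
    return np.array(out)
```

After several observations of growth near κ_1, the posterior concentrates on the high-growth regime, as one would expect.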
10.2 Control Problems
For simplicity, consider a finite-horizon problem. Suppose the state xt is
not perfectly observable at date t. Instead, the decision maker observes a
noisy signal yt about xt. He then chooses an action ut. The information
set at date t is given by y^t. We make some assumptions about the stochastic structure: (i) The prior distribution of x_0 is given. (ii) Given the past states, signals, and actions, the conditional distribution of x_{t+1} satisfies Pr(x_{t+1} | x^t, y^t, u^t) = Pr(x_{t+1} | x_t, u_t). (iii) The conditional distribution of the signal y_t satisfies Pr(y_t | x^t, u^t, y^{t-1}) = Pr(y_t | x_t, u_{t-1}). These assumptions are typically satisfied for the state-space representation. The objective is to maximize the expected total rewards:

E[Σ_{t=0}^{T-1} β^t r(x_t, u_t, t) + β^T R(x_T)],

where R is a terminal reward function.

Given the above assumptions, we can compute

Pr(x_{t+1}, y_{t+1} | x^t, y^t, u^t) = Pr(y_{t+1} | x_{t+1}, x^t, y^t, u^t) Pr(x_{t+1} | x^t, y^t, u^t)
= Pr(y_{t+1} | x_{t+1}, u_t) Pr(x_{t+1} | x_t, u_t).
Define the posterior distribution as

P_t(x_t) = Pr(x_t | y^t, u^{t-1}).

We can show that

P_{t+1}(x_{t+1}) = Pr(x_{t+1} | y^{t+1}, u^t) = Pr(x_{t+1}, y_{t+1} | y^t, u^t) / Pr(y_{t+1} | y^t, u^t)
= Σ_{x_t} Pr(x_{t+1}, x_t, y_{t+1} | y^t, u^t) / [Σ_{x_{t+1}} Σ_{x_t} Pr(x_{t+1}, x_t, y_{t+1} | y^t, u^t)]
= Σ_{x_t} Pr(x_t | y^t, u^t) Pr(x_{t+1}, y_{t+1} | x_t, y^t, u^t) / [Σ_{x_{t+1}} Σ_{x_t} Pr(x_{t+1}, x_t, y_{t+1} | y^t, u^t)]
= Σ_{x_t} P_t(x_t) Pr(x_{t+1}, y_{t+1} | x_t, u_t) / [Σ_{x_{t+1}} Σ_{x_t} P_t(x_t) Pr(x_{t+1}, y_{t+1} | x_t, u_t)].
This equation links today's posterior distribution P of the hidden state to tomorrow's posterior P*. Using this distribution as a state variable, we can then establish the following dynamic programming equation:

V(P, t) = max_u Σ_x P(x) r(x, u, t) + β Σ_x Σ_y Σ_z P(z) Pr(x_{t+1} = x, y_{t+1} = y | x_t = z, u) V(P*, t + 1),

where V(P, t) is the value function for t = 0, 1, ..., T - 1, and

V(P, T) = Σ_x P(x) R(x).
In applications, the posterior distribution is an infinite dimensional ob-
ject. This makes the general control problem under partial information hard
to solve numerically. In some special cases such as normal distributions or
discrete states, the posterior distribution is finite dimensional. For example,
for the linear state-space representation, the Kalman filter implies that the
posterior distribution of the hidden state is Gaussian. Hence we can use the
conditional mean and conditional variance to summarize this distribution.
Example 10.2.1 Life-cycle consumption/saving problem (Guvenen (2007)).
Guvenen (2007) studies a life-cycle consumption/saving problem when a household's log labor income follows the process in (10.5) and (10.6). The household observes past income data, but does not observe α, β, and z. It solves the following problem:

max_{c_t, s_{t+1}} E Σ_{t=0}^T β^t U(c_t)

subject to the budget constraint:

c_t + s_{t+1} = R s_t + exp(y_t),  s_{t+1} ≥ 0.  (10.15)

To solve this problem, we first derive the state-space representation as in Example 10.1.1 and then apply the Kalman filter to derive the recursion:

x_{t+1|t} = A x_{t|t-1} + A K_t (y_t - G_t x_{t|t-1}).  (10.16)

Finally, we can write the dynamic programming problem as follows: V_{T+1} = 0, and for t = 0, 1, ..., T,

V_t(s_t, y_t, x_{t|t-1}) = max_{c_t, s_{t+1}} U(c_t) + β E[V_{t+1}(s_{t+1}, y_{t+1}, x_{t+1|t}) | s_t, y_t, x_{t|t-1}],

subject to (10.15) and (10.16). Note that the conditional expectation is with respect to y_{t+1} only. The conditional distribution of y_{t+1} is N(G_{t+1} x_{t+1|t}, Ω_{t+1|t}). For a general utility function U, there is no analytical solution; one has to use a numerical method to solve this dynamic program. Note that we have used (y_t, x_{t|t-1}) as state variables. Alternatively, we can use x_{t|t} as a state variable.
10.3 Linear-Quadratic Control
Consider the following linear-quadratic control problem:

max_{u_t} -E Σ_{t=0}^∞ β^t r(x_t, u_t),  β ∈ (0, 1],

subject to

x_{t+1} = A x_t + B u_t + η_{t+1},
y_t = G x_t + v_t,

where

r(x, u) = (1/2)(x'Qx + u'Ru + 2x'Su).

Assume that {η_t} and {v_t} are IID Gaussian processes independent of each other and that x_0 is Gaussian and independent of other random variables. The decision maker observes only y^t at date t.
In the case of full information studied in Section 9.3, we have shown that the value function takes the form:

V(x_t) = -(1/2) x_t' P x_t + constant,

where P satisfies equation (9.12), and the optimal policy is given by

u_t = -F x_t,  (10.17)

where F is given by (9.17). We shall show that, under partial information, the value function takes the form:

V(x_{t|t}) = -(1/2) x_{t|t}' P x_{t|t} + constant,

where P also satisfies (9.12), although the constant term may differ from that in the full-information case. We shall also show that the optimal policy is given by

u_t = -F x_{t|t},

where F satisfies (9.17). This result is a form of certainty equivalence: the optimal control is obtained by substituting the estimate x_{t|t} for the actual state x_t in the decision rule for the full-information optimal control.
We now sketch a proof. The first step is to apply the long-run stationary Kalman filter to obtain the estimate x_{t|t}. Then, conjecture

V(x_{t|t}) = -(1/2) x_{t|t}' P x_{t|t} + constant.

Define

Δ_t ≡ x_t - x_{t|t} = A x_{t-1} + η_t - A x_{t-1|t-1} - K a_t.

It is independent of the control u_t since E[(x_t - x_{t|t}) u_t'] = 0. We first compute

E_t[x_t'Qx_t + u_t'Ru_t + 2x_t'Su_t]
= E_t[(x_{t|t} + Δ_t)'Q(x_{t|t} + Δ_t) + u_t'Ru_t + 2(x_{t|t} + Δ_t)'Su_t]
= x_{t|t}'Qx_{t|t} + u_t'Ru_t + 2x_{t|t}'Su_t + ...,

where we use the notation "+ ..." to denote terms independent of the control u_t, and E_t denotes the conditional expectation operator given information y^t. Next, we compute

E_t[x_{t+1|t+1}' P x_{t+1|t+1}] = E_t[(x_{t+1} - Δ_{t+1})' P (x_{t+1} - Δ_{t+1})]
= E_t[x_{t+1}' P x_{t+1}] - 2 E_t[Δ_{t+1}' P x_{t+1}] + E_t[Δ_{t+1}' P Δ_{t+1}]
= E_t[x_{t+1}' P x_{t+1}] - 2 E_t[Δ_{t+1}' P (A x_t + B u_t + η_{t+1})] + E_t[Δ_{t+1}' P Δ_{t+1}]
= E_t[(A x_t + B u_t + η_{t+1})' P (A x_t + B u_t + η_{t+1})] + ...
= E_t[(A x_t + B u_t)' P (A x_t + B u_t)] + ...
= E_t[(A x_{t|t} + B u_t)' P (A x_{t|t} + B u_t)] + ...
We then solve the problem

V(x_{t|t}) = max_{u_t} E_t[-(1/2)(x_t'Qx_t + u_t'Ru_t + 2x_t'Su_t)] - (β/2) E_t[x_{t+1|t+1}' P x_{t+1|t+1}] + ...
= max_{u_t} -(1/2)(x_{t|t}'Qx_{t|t} + u_t'Ru_t + 2x_{t|t}'Su_t) + β E_t[-(1/2)(A x_{t|t} + B u_t)' P (A x_{t|t} + B u_t)] + ...

Clearly, this is the full-information problem in which the state is x_{t|t}. Following the same analysis as in Section 9.3 delivers the desired result.
10.4 Exercises
1. Let
yt = μ+ εt + bεt−1.
Write down the state-space representation.
2. Consider Example 10.1.1. Set α = β = 0. Assume that ρ is unknown and normally distributed with mean μ_ρ ∈ (0, 1) and variance σ_ρ². Derive the Kalman filter.
3. Let xt = θ for all t and yt = θ + vt. Assume that θ is Gaussian with
mean μ and variance Σ0. Derive the Kalman filter. What happens in
the long run?
4. There is another way of writing the state-space representation:
xt+1 = Axt +But + Cεt+1,
yt+1 = Gxt +Hut +Dεt+1,
or equivalently,
xt+1 = Axt +But + Cεt,
yt = Gxt +Hut−1 +Dεt.
Derive the Kalman filter for this formulation.
5. Derive the Kalman filter for the state-space representation (10.3) and
(10.4).
6. Consider Examples 10.1.1 and 10.2.1. Set α = β = 0 and u (c) =
− exp (−γc) /γ.
(a) Derive the Kalman filter.
(b) Let T = 1. Solve the dynamic programming problem.

(c) Solve the dynamic programming problem for any finite T.

(d) If we use x_{t|t} as a state variable instead of x_{t|t-1}, how do we write down the Bellman equation?
Chapter 11
Numerical Methods
Quantitative answers are often needed for many economic problems. For ex-
ample, how much do technology shocks contribute to business cycles? What
is the impact of a major tax reform on the macroeconomy? To provide
quantitative answers, numerical solutions are needed. In this chapter, we
present some widely used numerical methods for solving dynamic program-
ming problems and equilibrium systems. The reader is referred to Judd
(1998), Miranda and Fackler (2002), and Adda and Cooper (2003) for other
textbook treatments of numerical methods in economics and finance.1
11.1 Numerical Integration
Computing expected values is necessary in economic problems under uncertainty. This section studies how to numerically compute the integral ∫_D f(x) dx, where f : R^n → R is an integrable function over the domain D. There are several numerical methods, as detailed by Judd (1998). In this section, we focus on the Gaussian quadrature method. We start with the one-dimensional case and then discuss the multidimensional case.
1Miranda and Fackler (2002) provide an excellent toolbox, CompEcon, downloadable from the web site: http://www4.ncsu.edu/unity/users/p/pfackler/www/compecon/. This toolbox can be used to solve many problems described in this chapter.
11.1.1 Gaussian Quadrature
The idea of a quadrature method is to approximate the integral by a weighted sum:

∫_a^b f(x) dx ≈ Σ_{i=1}^n w_i f(x_i),

where the x_i are quadrature nodes and the w_i are quadrature weights. Newton-Cotes methods are one example of a quadrature method. The Gaussian quadrature method is constructed with respect to a specific weight function w. For a given order of approximation n, this method chooses the nodes x_1, x_2, ..., x_n and weights w_1, w_2, ..., w_n so as to satisfy the following 2n conditions:

∫_a^b x^k w(x) dx = Σ_{i=1}^n w_i x_i^k,  k = 0, 1, ..., 2n - 1.

The integral approximation is then computed by forming the prescribed weighted sum of function values at the prescribed nodes:

∫_a^b f(x) w(x) dx ≈ Σ_{i=1}^n w_i f(x_i).
Thus, if the function f can be approximated accurately by polynomials, then the Gaussian quadrature method will give an accurate approximation to the integral. When the weight function w is a density function, Gaussian
quadrature has a straightforward interpretation. It essentially discretizes a
continuous random variable as a discrete random variable with mass points
xi and probabilities wi.
Computing the n-degree Gaussian quadrature nodes and weights is nontrivial because it requires solving 2n nonlinear equations. The CompEcon
toolbox provided by Miranda and Fackler (2002) contains efficient numer-
ical programs for computing Gaussian quadrature nodes and weights for
different weight functions, including most known probability distribution
functions such as the uniform, normal, lognormal, exponential, gamma, and
beta distributions.
The following is a list of some well-known quadrature rules.

• Gauss-Legendre Quadrature. The weight function is w(x) = 1. Gauss-Legendre quadrature computes the integral ∫_{-1}^1 f(x) dx. A change of variable handles integrals over any bounded interval:

∫_a^b f(x) dx ≈ [(b - a)/2] Σ_{i=1}^n w_i f((x_i + 1)(b - a)/2 + a).

• Gauss-Chebyshev Quadrature. This method computes

∫_{-1}^1 f(x)(1 - x²)^{-1/2} dx ≈ (π/n) Σ_{i=1}^n f(x_i),

where

x_i = cos((2i - 1)π/(2n)),  i = 1, 2, ..., n.

• Gauss-Hermite Quadrature. This method computes

∫_{-∞}^∞ f(x) e^{-x²} dx ≈ Σ_{i=1}^n w_i f(x_i).

This quadrature is very useful for computing expectations with respect to normal distributions:

E f(Y) ≈ π^{-1/2} Σ_{i=1}^n w_i f(√2 σ x_i + μ),

where Y is normal with mean μ and variance σ².

• Gauss-Laguerre Quadrature. This method computes

∫_0^∞ e^{-x} f(x) dx ≈ Σ_{i=1}^n w_i f(x_i).

It is useful for computing present values:

∫_a^∞ e^{-ry} f(y) dy ≈ (e^{-ra}/r) Σ_{i=1}^n w_i f(x_i/r + a).
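As an illustration of the Gauss-Hermite rule for normal expectations, NumPy's hermgauss returns the nodes x_i and weights w_i for the weight function e^{-x²}; with n nodes the rule is exact for polynomials of degree up to 2n - 1 (the wrapper function below is our own):

```python
import numpy as np

# nodes and weights for the weight function exp(-x^2)
nodes, weights = np.polynomial.hermite.hermgauss(5)

def normal_expectation(f, mu, sigma):
    """E[f(Y)] for Y ~ N(mu, sigma^2) via the change of variable
    y = sqrt(2)*sigma*x + mu, as in the Gauss-Hermite formula above."""
    return np.sum(weights * f(np.sqrt(2) * sigma * nodes + mu)) / np.sqrt(np.pi)
```

For example, E[Y²] for Y ~ N(1, 4) is recovered exactly (up to floating-point error) because y² is a polynomial of degree 2.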
11.1.2 Multidimensional Quadrature
Suppose that we want to compute the d-dimensional integral ∫_{[a,b]^d} f(x) dx. We can use the product rule based on any one-dimensional quadrature rule. Let x_i^j, w_i^j, i = 1, ..., n_j, be one-dimensional quadrature nodes and weights in dimension j. Then the product rule gives the approximation:

Σ_{i_1=1}^{n_1} · · · Σ_{i_d=1}^{n_d} w_{i_1}^1 w_{i_2}^2 · · · w_{i_d}^d f(x_{i_1}^1, x_{i_2}^2, ..., x_{i_d}^d).

For Gaussian quadrature, if w^j is a scalar weight function in dimension j, then let

W(x) = Π_{j=1}^d w^j(x_j).

Gaussian quadrature computes the integral ∫_{[a,b]^d} f(x) W(x) dx using the above approximation. We can apply a change of variable to compute the integral over any interval. A difficulty of the product approach is the curse of dimensionality. See Judd (1998) for a discussion of nonproduct approaches.
11.2 Discretizing AR(1) Processes
In economic problems, exogenous shocks are often modeled as a Markov
process. When solving stochastic dynamic programming problems, we need
to compute expected values with respect to the shocks. If a shock follows
a discrete Markov chain, then expected values are just simple summation.
However, if a shock follows an autoregressive process of order one (an AR(1) process), then computing expectations is cumbersome; the previous section discussed how to compute such integrals numerically. In this section, we
shall introduce two methods proposed by Tauchen (1986) and Tauchen and
Hussey (1991) to approximate an AR(1) process by a discrete Markov chain.
Consider the following AR(1) process:

z_{t+1} = μ(1 - ρ) + ρ z_t + σ ε_{t+1},  (11.1)

where μ and ρ are the unconditional mean and autocorrelation of z_t and ε_t is IID standard normal. Without loss of generality, we set μ = 0. The unconditional standard deviation of z_t is σ_z = σ/√(1 - ρ²). Our goal is to construct a discrete Markov chain with state space {z̄_i}_{i=1}^N and transition matrix π = (π_{ij}) that approximates {z_t}. As N becomes larger, the discrete Markov chain approximation becomes more accurate.
Tauchen (1986) Method  The method of Tauchen (1986) consists of the following steps:2

Step 1. Choose a discrete state space with N equally spaced points:

(z̄_i)_{i=1}^N = {z_min, z_min + h, z_min + 2h, ..., z_max},  h = (z_max - z_min)/(N - 1),

where z_max = m σ_z and z_min = -m σ_z. In practice, setting m = 3 or 4 is sufficient.

Step 2. Choose N intervals with boundaries at the midpoints between adjacent grid points (the two end intervals are unbounded):

I_1 = (-∞, z̄_1 + h/2),  I_j = (z̄_j - h/2, z̄_j + h/2) for j = 2, ..., N - 1,  I_N = (z̄_N - h/2, ∞).

Step 3. Determine the transition probabilities:

π_{ij} = Pr(z_{t+1} ∈ I_j | z_t = z̄_i).

By (11.1), these transition probabilities can be easily computed from the standard normal CDF.
2The Matlab code markovappr.m implements this method.
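A compact sketch of Steps 1-3 (our own implementation, not the markovappr.m code referenced above):

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def tauchen(N, rho, sigma, m=3):
    """Tauchen (1986) discretization of z' = rho*z + sigma*eps (mu = 0)."""
    sigma_z = sigma / np.sqrt(1 - rho ** 2)
    z = np.linspace(-m * sigma_z, m * sigma_z, N)  # Step 1: grid
    h = z[1] - z[0]                                # grid step
    Pi = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            lo = (z[j] - h / 2 - rho * z[i]) / sigma
            hi = (z[j] + h / 2 - rho * z[i]) / sigma
            if j == 0:                 # left interval is unbounded below
                Pi[i, j] = norm_cdf(hi)
            elif j == N - 1:           # right interval is unbounded above
                Pi[i, j] = 1.0 - norm_cdf(lo)
            else:                      # Steps 2-3: probability mass of I_j
                Pi[i, j] = norm_cdf(hi) - norm_cdf(lo)
    return z, Pi
```

Because the intervals partition the real line, each row of the transition matrix sums to one by construction.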
Tauchen and Hussey (1991) Method  The method of Tauchen and Hussey (1991) is related to the Gauss-Hermite quadrature method discussed earlier.3 Suppose that we want to compute the conditional expectation:

E[v(z') | z] = ∫_{-∞}^∞ v(z') f(z' | z) dz',

where f(· | z) is the conditional density of z_{t+1} given z_t = z. Using the Gauss-Hermite method, this integral can be approximated as

∫ v(z') f(z' | z) dz' ≈ π^{-1/2} Σ_{j=1}^N w_j v(√2 σ x_j + ρz),

where x_j and w_j are the Gauss-Hermite nodes and weights, respectively. The problem with this approach is that the approximation depends continuously on z. Tauchen and Hussey (1991) propose the following transformation:

∫ v(z') f(z' | z) dz' = ∫ v(z') [f(z' | z)/f(z' | 0)] f(z' | 0) dz' ≈ (1/√π) Σ_{j=1}^N w_j v(z̄_j) f(z̄_j | z)/f(z̄_j | 0),

where z̄_j = √2 σ x_j. Thus,

E[v(z') | z̄_i] ≈ Σ_{j=1}^N π̃_{ij} v(z̄_j),

where

π̃_{ij} = π^{-1/2} w_j f(z̄_j | z̄_i)/f(z̄_j | 0).

Note that we may have Σ_{j=1}^N π̃_{ij} ≠ 1. We can then normalize

π_{ij} = π̃_{ij} / Σ_{j=1}^N π̃_{ij},
3The Matlab program tauch_hussey.m implements this method. The code gauss_herm.m is needed.
and take the approximation

E[v(z') | z̄_i] ≈ Σ_{j=1}^N π_{ij} v(z̄_j).

Note that this is not the Gauss-Hermite approximation of the integral, but as N becomes sufficiently large, the pre-normalization row sums converge to one. Thus, we can use the discrete Markov chain with state space {z̄_1, z̄_2, ..., z̄_N} and transition matrix (π_{ij}) to approximate the AR(1) process.
Simulating a Markov Chain  Suppose that {z_t} is a Markov chain with state space {z̄_1, z̄_2, ..., z̄_N} and transition matrix (π_{ij}). We want to simulate this chain to generate a series for t = 0, 1, 2, ..., T. Start in period t = 0 and suppose z_0 is initialized at some z̄_i. Next we have to assign values to z_t, t = 1, 2, ..., T. The following steps describe an algorithm:4

Step 1. Compute the cumulative distribution Π^c of the Markov chain. That is,

Π^c_{ij} = Σ_{k=1}^j π_{ik}

is the probability that the Markov chain moves to a state with index lower than or equal to j, given that it was in state i in period t - 1.

Step 2. Set the initial state and draw T random numbers from a uniform distribution over [0, 1]: {p_t}_{t=1}^T.

Step 3. Assume the Markov chain was in state i in period t - 1. If p_t ≤ Π^c_{i1}, then z̄_1 is the state in period t. Otherwise, find the index j ≥ 2 such that

Π^c_{i(j-1)} < p_t ≤ Π^c_{ij};

then z̄_j is the state in period t.
4The Matlab code markovsimul.m implements this algorithm.
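A sketch of this simulation algorithm (our own implementation, not markovsimul.m), inverting each row of the cumulative matrix with a sorted search:

```python
import numpy as np

def simulate_chain(Pi, states, i0, T, seed=0):
    """Simulate a Markov chain for T periods starting from state index i0."""
    rng = np.random.default_rng(seed)
    Pic = np.cumsum(Pi, axis=1)    # Step 1: cumulative distribution by row
    p = rng.uniform(size=T)        # Step 2: uniform draws on [0, 1)
    idx = [i0]
    for t in range(T):             # Step 3: find j with Pic[i, j-1] < p <= Pic[i, j]
        i = idx[-1]
        j = min(int(np.searchsorted(Pic[i], p[t])), Pi.shape[1] - 1)
        idx.append(j)
    return np.array([states[k] for k in idx])
```

The min(...) guard only protects against floating-point row sums falling marginally below one; it does not change the algorithm.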
11.3 Interpolation
Function approximation is an important step in solving computational problems. Generally speaking, there are three methods to approximate a given function f. The first method is a local approximation using data about f around a particular point x*; Taylor's theorem is the main theory behind this method, and the perturbation method discussed in Chapters 1 and 2 and in Section 11.4 belongs to this category. The second method uses global properties of f. For example, an L^p approximation finds a function g that is close to f over some domain in the L^p norm. This approximation method typically requires too much information to be feasible in applications. The third method is interpolation, which uses a finite set of data to fit the function f. In particular, it uses n conditions to determine n free parameters. There is also a method, called regression, that lies between L^p approximation and interpolation in that it uses m conditions to fix n parameters, where m > n. This method is more stable but requires more information than interpolation, and it is less accurate than the L^p approximation.
In this section, we focus on the interpolation method.5 We start with the one-dimensional case to explain the key idea. Consider a continuous function f : [a, b] → R. The objective of interpolation is to approximate f by f̂,

f̂(x) = Σ_{j=1}^n c_j φ_j(x),

where n is called the degree of interpolation or the degree of approximation, φ_1, φ_2, ..., φ_n are linearly independent basis functions, and c_1, c_2, ..., c_n are basis coefficients.

The first step in implementing the interpolation method is to choose
5The CompEcon toolbox designed by Miranda and Fackler (2002) contains many useful Matlab programs for implementing the approximation methods discussed in this section.
suitable basis functions. There are two ways of specifying basis functions.
The spectral method uses basis functions which are almost everywhere
nonzero on [a, b]. Typically these functions are polynomials. The finite
element method uses basis functions which are nonzero on only small
subdomains of [a, b]. These functions are typically splines. In the next two
subsections, we will discuss these two types of basis functions.
The second step in implementing the interpolation method is to determine
the basis coefficients c1, ..., cn. A natural way to solve for these n
coefficients is to use n data points (xi, yi), where yi = f(xi) is known, and
to require that \hat{f}(x_i) = y_i, i = 1, 2, ..., n. That is, we solve the following
system of n equations:

\sum_{j=1}^{n} c_j \phi_j(x_i) = f(x_i), \quad i = 1, \ldots, n.

We can rewrite it in matrix form:

\Phi c = y,
where Φij = φj(xi). This equation is called the interpolation equation and
the matrix Φ is called the interpolation matrix. The points x1, x2, ..., xn
are called interpolation nodes. The choices of the degree of interpolation
and of the interpolation nodes are important for an accurate approximation.
When the number m of interpolation nodes exceeds the number n of
basis coefficients, we cannot use the interpolation equation. Instead, we can
use the least squares method or the regression method by solving the
following problem:

\min_{\{c_j\}} \sum_{i=1}^{m} \left[ f(x_i) - \sum_{j=1}^{n} c_j \phi_j(x_i) \right]^2, \quad m > n.

This problem leads to the solution

c = (\Phi' \Phi)^{-1} \Phi' y,
which is equivalent to the interpolation equation if m = n and Φ is
invertible.
Interpolation conditions may not be limited to data on function values.
The information about function derivatives may also be used to solve for
the basis coefficients.
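As an illustration of the interpolation equation Φc = y, the following Python sketch (a minimal example of my own, not from the text; the helper names are illustrative) fits a three-parameter monomial interpolant through three data points and verifies the exact fit at the nodes:

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small systems)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

# Monomial basis with n = 3 and three interpolation nodes: Phi[i][j] = phi_j(x_i).
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]
nodes = [0.0, 0.5, 1.0]
Phi = [[phi(x) for phi in basis] for x in nodes]
y = [math.exp(x) for x in nodes]
c = solve_linear(Phi, y)                    # basis coefficients from Phi c = y

fhat = lambda x: sum(cj * phi(x) for cj, phi in zip(c, basis))
for x in nodes:
    assert abs(fhat(x) - math.exp(x)) < 1e-10   # exact fit at the nodes
```

With m > n nodes, the same `Phi` would become rectangular and `c` would instead come from the normal equations of the least squares problem above.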
11.3.1 Orthogonal Polynomials
A natural family of basis functions is the monomial basis {1, x, x², x³, ...}.
The interpolation matrix Φ is then the Vandermonde matrix

\Phi = \begin{bmatrix} 1 & x_1 & \cdots & x_1^{n-1} \\ 1 & x_2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & x_n & \cdots & x_n^{n-1} \end{bmatrix}.

This matrix is ill conditioned, and hence interpolation using this basis yields
large rounding errors. One reason is that the monomial basis functions are
not mutually orthogonal.
Definition 11.3.1 A weighting function w(x) on [a, b] is any function that
is positive and has a finite integral over [a, b]. Given a weighting function
w(x), define an inner product for integrable functions on [a, b]:

\langle f, g \rangle_w = \int_a^b f(x) g(x) w(x) \, dx.

The family of functions {φj} on [a, b] is mutually orthogonal with respect to
w(x) if

\langle \phi_i, \phi_j \rangle_w = 0, \quad i \neq j.

The inner product induces a norm,

\| f \|_w = \sqrt{\langle f, f \rangle_w},

and hence a natural metric.
The following is a list of commonly used orthogonal basis functions:
• Legendre polynomials. The nth Legendre polynomial is

P_n(x) = \frac{(-1)^n}{2^n n!} \frac{d^n}{dx^n} \left[ (1 - x^2)^n \right], \quad P_0(x) = 1, \quad x \in [-1, 1].

The weighting function is w(x) = 1 on [−1, 1]. We can use the transformation
z = 2(x − a)/(b − a) − 1 to define functions on any bounded interval [a, b].
• Chebyshev polynomials. The nth Chebyshev polynomial on [−1, 1] is
defined recursively:

T_0(x) = 1, \quad T_1(x) = x, \quad T_2(x) = 2x^2 - 1, \quad \ldots, \quad T_n(x) = 2x T_{n-1}(x) - T_{n-2}(x).

The weighting function is w(x) = (1 − x²)^{−1/2}. One can also equivalently
define

T_n(x) = \cos(n \arccos(x)).
• Laguerre polynomials. The nth member is

L_n(x) = \frac{e^x}{n!} \frac{d^n}{dx^n} \left( x^n e^{-x} \right), \quad L_0(x) = 1, \quad x \in [0, \infty).

The weighting function is w(x) = e^{−x} on [0, ∞). Laguerre polynomials
are used to approximate functions of time in deterministic models.
• Hermite polynomials. The nth member is

H_n(x) = (-1)^n e^{x^2} \frac{d^n}{dx^n} \left( e^{-x^2} \right), \quad H_0(x) = 1, \quad x \in (-\infty, \infty).

The weighting function is w(x) = e^{−x²}. Hermite polynomials are often
used to approximate functions of normal random variables.
We shall focus on Chebyshev polynomials because they possess some
desirable properties. The roots of Tn(x) on [−1, 1] are given by

x_k = \cos\left( \frac{\pi (2k - 1)}{2n} \right), \quad k = 1, 2, \ldots, n.
Figure 11.1: Chebyshev polynomial basis functions up to degree 9 on the unit interval.
Tn(x) has n + 1 extrema at

x = \cos\left( \frac{\pi k}{n} \right), \quad k = 0, 1, \ldots, n.

At all maxima Tn(x) = 1, and at all minima Tn(x) = −1. Figure 11.1
illustrates nine Chebyshev polynomial basis functions on the unit interval.
The Chebyshev polynomials have a discrete orthogonality property. If
xk (k = 1, 2, ..., m) are the m roots of Tm(x), and if i, j < m, then

\sum_{k=1}^{m} T_i(x_k) T_j(x_k) = \begin{cases} 0 & \text{if } i \neq j, \\ m/2 & \text{if } i = j \neq 0, \\ m & \text{if } i = j = 0. \end{cases}
Using this property, we now compute the basis coefficients c1, c2, ..., cn when
the basis functions are given by

\phi_j(x) = T_{j-1}(x), \quad j = 1, \ldots, n.

A suitable choice of interpolation nodes is critical for a good approximation.
We shall choose the n roots of Tn(x), {x1, x2, ..., xn}, as the interpolation
nodes. We can then compute Φ'Φ = diag(n, n/2, n/2, ..., n/2). Using the
least squares formula c = (Φ'Φ)^{−1}Φ'y and the discrete orthogonality
property, we can derive:

c_j = \frac{\sum_{k=1}^{n} f(x_k) T_{j-1}(x_k)}{\sum_{k=1}^{n} T_{j-1}(x_k)^2} = \begin{cases} \frac{1}{n} \sum_{k=1}^{n} f(x_k) T_{j-1}(x_k) & \text{for } j = 1, \\ \frac{2}{n} \sum_{k=1}^{n} f(x_k) T_{j-1}(x_k) & \text{for } j \geq 2. \end{cases}

We then obtain the Chebyshev interpolant:

I_{n-1}^f(x) = \sum_{i=1}^{n} c_i T_{i-1}(x).

One can also check that I_{n-1}^f and f are exactly matched at the n roots of
Tn(x).
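These formulas can be sketched directly in Python (a minimal illustration of my own, not the CompEcon implementation): compute the n roots of T_n, evaluate the coefficient formula, and check that the interpolant matches f at the nodes and is accurate in between.

```python
import math

def cheb_T(n, x):
    """Evaluate T_n(x) via the recursion T_n = 2x T_{n-1} - T_{n-2}."""
    if n == 0:
        return 1.0
    t_prev, t = 1.0, x
    for _ in range(n - 1):
        t_prev, t = t, 2.0 * x * t - t_prev
    return t

def cheb_interp(f, n):
    """Coefficients c_1..c_n of the Chebyshev interpolant at the n roots of T_n."""
    nodes = [math.cos(math.pi * (2 * k - 1) / (2 * n)) for k in range(1, n + 1)]
    y = [f(x) for x in nodes]
    c = []
    for j in range(1, n + 1):            # c_j multiplies T_{j-1}
        s = sum(y[k] * cheb_T(j - 1, nodes[k]) for k in range(n))
        c.append(s / n if j == 1 else 2.0 * s / n)
    return c, nodes

n = 10
c, nodes = cheb_interp(math.exp, n)
fhat = lambda x: sum(c[j] * cheb_T(j, x) for j in range(n))

for x in nodes:                          # exact at the interpolation nodes
    assert abs(fhat(x) - math.exp(x)) < 1e-12
for x in (-0.9, -0.3, 0.2, 0.7):         # and accurate in between
    assert abs(fhat(x) - math.exp(x)) < 1e-7
```

The off-node accuracy reflects the uniform error bound of the Chebyshev interpolation theorem stated below.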
The following theorem from Rivlin (1990, p. 14) guarantees that interpolation
at the zeros of Chebyshev polynomials leads to asymptotically valid
approximations of f.

Theorem 11.3.1 (Chebyshev Interpolation Theorem) Suppose that the
function f : [−1, 1] → R is k times continuously differentiable for some k ≥ 1.
Let I_n^f be the degree n Chebyshev interpolant of f. Then there is some d_k
such that for all n,

\left\| f - I_n^f \right\|_\infty \leq \left( \frac{2}{\pi} \ln(n + 1) + 2 \right) \frac{d_k}{n^k} \left\| f^{(k)} \right\|_\infty,

where ||·||∞ is the uniform norm.
This theorem says that the Chebyshev interpolant converges to f rapidly
as we increase the degree of interpolation. Note that interpolation at uni-
formly spaced nodes does not necessarily converge as we use more points.
Thus, the Chebyshev interpolant is extremely useful whenever the approxi-
mated function is smooth.
We may use more interpolation nodes to construct a Chebyshev interpolant
of lower degree. Suppose that n is large, so that I_{n-1}^f approximates f
almost perfectly. Consider the truncated approximation:

I_{m-1}(x) = \sum_{i=1}^{m} c_i T_{i-1}(x), \quad m \ll n, \qquad (11.2)

where the terms c_k T_{k-1}(x) for k = m + 1, ..., n are dropped from I_{n-1}^f.
Since T_k(x) is bounded between −1 and 1, the difference between I_{n-1}^f and
I_{m-1} is bounded by the sum of the dropped coefficients c_k for k = m + 1, ..., n.
Since the coefficients decrease rapidly, the difference is dominated by
c_{m+1} T_m(x). Thus, the truncated approximation I_{m-1} is almost the same
as the minimax polynomial that (among all polynomials of the same degree)
has the smallest maximum deviation from the true function f. In this sense,
Chebyshev polynomial interpolation is almost the most accurate approximation.
See Section 6.7 of Judd (1998) for a more formal discussion and for
an equivalent Chebyshev regression algorithm to compute (11.2).
11.3.2 Splines
Chebyshev polynomial interpolation works well for well-behaved smooth
functions. For functions that have kinks or regions of high curvature, it
is often better to use spline approximations (or, more generally, finite element
methods). The idea of constructing a spline interpolation is to first divide
the domain into nonintersecting subdomains called elements. The function
is approximated by a polynomial on each subdomain. The local
approximations are then patched together to get a global approximation. A
function φ(x) on [a, b] is a spline of order k if φ is C^{k−2} on [a, b] and there is
a grid of points (called nodes or breaks), a = x0 < x1 < ... < xn = b, such
that φ is a polynomial of order at most k − 1 on each subinterval [xi, xi+1],
i = 0, 1, ..., n − 1. Note that a spline of order 2 (or linear spline) is a
piecewise linear function.
A cubic spline φ (that is, of order 4) has the following properties: (i)
φ(xi) = f(xi) = yi, i = 0, 1, ..., n. (ii) On each subinterval [xi, xi+1],
φ(x) = ai + bi x + ci x² + di x³, i = 0, 1, ..., n − 1. (iii) At each interior break,

\lim_{x \uparrow x_i} \phi(x) = \lim_{x \downarrow x_i} \phi(x), \quad \lim_{x \uparrow x_i} \phi'(x) = \lim_{x \downarrow x_i} \phi'(x), \quad \lim_{x \uparrow x_i} \phi''(x) = \lim_{x \downarrow x_i} \phi''(x).

These give 4n − 2 conditions for 4n coefficients, so we need two additional
restrictions. The natural spline imposes φ''(x0) = φ''(xn) = 0. The Hermite
spline imposes φ'(x0) = f'(x0) and φ'(xn) = f'(xn) (called the clamped end
condition) if those derivatives are known. The not-a-knot end condition
requires that

\lim_{x \uparrow x_1} \phi'''(x) = \lim_{x \downarrow x_1} \phi'''(x) \quad \text{and} \quad \lim_{x \uparrow x_{n-1}} \phi'''(x) = \lim_{x \downarrow x_{n-1}} \phi'''(x).

Cubic splines can approximate an almost smooth function with kinks or
with small regions of high curvature. The Matlab spline toolbox uses the
not-a-knot condition as the default and the Hermite spline as an option.
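A natural cubic spline can be sketched in a few lines (this is the standard second-derivative construction with a tridiagonal Thomas solve, not the Matlab toolbox's algorithm; the function names are illustrative):

```python
import math

def natural_cubic_spline(xs, ys):
    """Natural cubic spline through (xs, ys): phi''(x_0) = phi''(x_n) = 0.

    Solves the standard tridiagonal system for the second derivatives M_i
    at the breaks (Thomas algorithm), then evaluates the piecewise cubic.
    """
    n = len(xs) - 1                      # number of subintervals
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = n - 1                            # unknowns M_1, ..., M_{n-1}
    diag = [2.0 * (h[i] + h[i + 1]) for i in range(m)]
    rhs = [6.0 * ((ys[i + 2] - ys[i + 1]) / h[i + 1]
                  - (ys[i + 1] - ys[i]) / h[i]) for i in range(m)]
    for i in range(1, m):                # forward elimination
        w = h[i] / diag[i - 1]
        diag[i] -= w * h[i]
        rhs[i] -= w * rhs[i - 1]
    M = [0.0] * (n + 1)                  # natural end conditions M_0 = M_n = 0
    for i in range(m, 0, -1):            # back substitution
        M[i] = (rhs[i - 1] - h[i] * M[i + 1]) / diag[i - 1]

    def phi(x):
        i = 0                            # locate the subinterval containing x
        while i < n - 1 and x > xs[i + 1]:
            i += 1
        A = (xs[i + 1] - x) / h[i]
        B = (x - xs[i]) / h[i]
        return (A * ys[i] + B * ys[i + 1]
                + ((A ** 3 - A) * M[i] + (B ** 3 - B) * M[i + 1]) * h[i] ** 2 / 6.0)

    return phi

xs = [i * math.pi / 6.0 for i in range(7)]            # 7 breaks on [0, pi]
spline = natural_cubic_spline(xs, [math.sin(x) for x in xs])
for x in xs:
    assert abs(spline(x) - math.sin(x)) < 1e-12      # interpolates at the breaks
assert abs(spline(1.0) - math.sin(1.0)) < 1e-2       # small interior error

lin = natural_cubic_spline([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
assert abs(lin(1.5) - 4.0) < 1e-12                   # reproduces linear data exactly
```

Clamped or not-a-knot ends would change only the two extra restrictions, i.e., the first and last rows of the linear system.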
There are two approaches to representing a spline. The pp-form of a
polynomial spline of order k provides a description in terms of its breaks
x0, x1, ..., xn and the local polynomial coefficients cji of its n pieces,

p_j(x) = \sum_{i=1}^{k} (x - x_{j-1})^{k-i} c_{ji}, \quad j = 1, \ldots, n.
The B-form of a spline of order k is a linear combination of some basis
splines or B-splines of the required order k. A B-spline is a spline function
that has a minimal support with respect to a given degree, smoothness, and
domain partition. de Boor (1978) shows that every spline function of a given
degree, smoothness, and domain partition, can be uniquely represented as
a linear combination of B-splines of that same degree and smoothness, and
over that same partition. We now construct B-splines following the method
of de Boor (1978). Define the knots ξ1 = ξ2 = ... = ξk < ξk+1 < ... <
ξk+n = ... = ξk+n+k−1, with ξk+i = xi, i = 0, 1, ..., n, and the jth B-spline
of order k for these knots:

B_{j1}(x) = \begin{cases} 1 & \text{if } x \in [\xi_j, \xi_{j+1}), \\ 0 & \text{else}, \end{cases}

and

B_{jk}(x) = \frac{x - \xi_j}{\xi_{j+k-1} - \xi_j} B_{j,k-1}(x) + \frac{\xi_{j+k} - x}{\xi_{j+k} - \xi_{j+1}} B_{j+1,k-1}(x),

for j = 1, 2, ..., n + k − 1. Note that B_{jk}(x) is a nonnegative piecewise
polynomial of degree k − 1 and is zero outside the interval [ξ_j, ξ_{j+k}]. We
can then write the B-form of a spline of order k as

\phi(x) = \sum_{j=1}^{n+k-1} B_{jk}(x) a_j,
where the coefficients aj are called de Boor points or control points.
As an example, the B-spline of order 2 with knots ξ1 = ξ2 = x0 < ξ3 =
x1 < ... < xn = ξn+2 = ξn+3 is given by

B_{j2}(x) = \begin{cases} \dfrac{x - \xi_j}{\xi_{j+1} - \xi_j} & \text{if } x \in [\xi_j, \xi_{j+1}] \text{ and } \xi_j < \xi_{j+1}, \\[1ex] \dfrac{\xi_{j+2} - x}{\xi_{j+2} - \xi_{j+1}} & \text{if } x \in [\xi_{j+1}, \xi_{j+2}] \text{ and } \xi_{j+1} < \xi_{j+2}, \\[1ex] 0 & \text{otherwise}, \end{cases}
for j = 1, 2, ..., n + 1. These are linear splines and are also called tent
functions. Figure 11.2 illustrates the 9 = n + k − 1 linear B-splines (order
k = 2) built on n + 1 = 9 equally spaced breakpoints on the unit interval.
Figure 11.3 illustrates the 9 = n + k − 1 cubic B-splines (order k = 4) built
on n + 1 = 7 equally spaced breakpoints on the unit interval.
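The Cox–de Boor recursion above translates directly into code. The sketch below (my own minimal version; terms with a zero denominator are dropped, as is standard with repeated end knots) builds cubic B-splines on five equally spaced breaks and checks nonnegativity and the partition-of-unity property:

```python
def bspline_basis(j, k, knots, x):
    """B_{j,k}(x) by the Cox-de Boor recursion; order k means degree k - 1.

    Knots are 0-indexed here; a term whose denominator is zero (repeated
    knots) is simply dropped.
    """
    if k == 1:
        return 1.0 if knots[j] <= x < knots[j + 1] else 0.0
    val = 0.0
    den1 = knots[j + k - 1] - knots[j]
    if den1 > 0:
        val += (x - knots[j]) / den1 * bspline_basis(j, k - 1, knots, x)
    den2 = knots[j + k] - knots[j + 1]
    if den2 > 0:
        val += (knots[j + k] - x) / den2 * bspline_basis(j + 1, k - 1, knots, x)
    return val

# Cubic B-splines (order k = 4) on [0, 1] with n + 1 = 5 equally spaced breaks;
# the end breaks are repeated, following the knot convention in the text.
k = 4
breaks = [0.0, 0.25, 0.5, 0.75, 1.0]
knots = [breaks[0]] * (k - 1) + breaks + [breaks[-1]] * (k - 1)
nbasis = len(knots) - k                  # equals n + k - 1 = 7 basis functions

# B-splines are nonnegative and sum to one at every interior point.
for x in [0.1, 0.3, 0.5, 0.7, 0.9]:
    vals = [bspline_basis(j, k, knots, x) for j in range(nbasis)]
    assert all(v >= 0.0 for v in vals)
    assert abs(sum(vals) - 1.0) < 1e-12
```

A B-form spline is then just `sum(a[j] * bspline_basis(j, k, knots, x) for j in range(nbasis))` for de Boor points `a`.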
Figure 11.2: Linear B-splines with 9 equally spaced breakpoints {0, 1/8, 1/4, 3/8, 1/2, 5/8, 3/4, 7/8, 1}.
In many applications, preserving the shape of the approximating function
is important. For example, when approximating value functions, preserving
concavity is important. Schumaker (1983) proposes a shape-preserving
quadratic spline.6 See Judd (1998, p. 231) for an introduction to the
construction of this spline.
6 Leonardo Rezende wrote a Matlab program schumaker.m to compute this spline. Use the Matlab command ppval to evaluate the spline.
Figure 11.3: Cubic B-splines with 7 equally spaced breakpoints {0, 1/6, 1/3, 1/2, 2/3, 5/6, 1}.
11.3.3 Multidimensional Approximation
Consider the problem of interpolating a d-variate function f on a d-dimensional
interval

I = \{ (x_1, x_2, \ldots, x_d) : a_i \leq x_i \leq b_i, \; i = 1, 2, \ldots, d \}.

Let {φ_{ij} : j = 1, 2, ..., n_i} be an n_i-degree univariate basis for real-valued
functions defined on [a_i, b_i], and let {x_{ij} : j = 1, 2, ..., n_i} be a sequence of
n_i interpolation nodes for the interval [a_i, b_i]. Then an n = \prod_{i=1}^{d} n_i degree
interpolation on I may be constructed using the basis functions:

\phi_{j_1 j_2 \ldots j_d}(x_1, x_2, \ldots, x_d) = \phi_{1 j_1}(x_1) \phi_{2 j_2}(x_2) \cdots \phi_{d j_d}(x_d),
for i = 1, ..., d and j_i = 1, 2, ..., n_i. An approximation of f in the tensor
product basis takes the form:

\hat{f}(x_1, x_2, \ldots, x_d) = \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} c_{j_1 \ldots j_d} \phi_{j_1 j_2 \ldots j_d}(x_1, x_2, \ldots, x_d).

In tensor notation,

\hat{f}(x_1, x_2, \ldots, x_d) = \left[ \phi_d(x_d) \otimes \phi_{d-1}(x_{d-1}) \otimes \cdots \otimes \phi_1(x_1) \right] c,
where c is an n × 1 column vector and each φ_i is the 1 × n_i row vector of
basis functions over dimension i. When evaluated at the n interpolation
nodes, the above equation can be written in the matrix form

\Phi c = y,

where Φ = Φ_d ⊗ Φ_{d−1} ⊗ ... ⊗ Φ_1 is the n × n interpolation matrix, each Φ_i
is an n_i × n_i univariate basis matrix, and y is the n × 1 vector containing
the values of f at the n interpolation nodes, properly stacked. If each Φ_i is
invertible, then

c = \left[ \Phi_d^{-1} \otimes \Phi_{d-1}^{-1} \otimes \cdots \otimes \Phi_1^{-1} \right] y.

For orthogonal polynomials, we may use least squares to compute c =
(Φ'Φ)^{−1}Φ'y when the number of nodes exceeds the degree of interpolation.
Now, we take an example from Miranda and Fackler (2002) to illustrate
the tensor product basis. Consider a two-dimensional basis built from
univariate monomial bases with n1 = 3 and n2 = 2. Then the vectors of basis
functions for each dimension are

\phi_1(x_1) = \begin{bmatrix} 1 & x_1 & x_1^2 \end{bmatrix}, \quad \phi_2(x_2) = \begin{bmatrix} 1 & x_2 \end{bmatrix}.

The full basis functions are listed in the vector:

\phi(x) = \begin{bmatrix} 1 & x_2 \end{bmatrix} \otimes \begin{bmatrix} 1 & x_1 & x_1^2 \end{bmatrix} = \begin{bmatrix} 1 & x_1 & x_1^2 & x_2 & x_1 x_2 & x_1^2 x_2 \end{bmatrix}.
Construct a grid of nodes from the univariate nodes x1 ∈ {0, 1, 2} and
x2 ∈ {0, 1}. Then

\Phi_1 = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \end{bmatrix}, \quad \Phi_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix},

and

\Phi = \Phi_2 \otimes \Phi_1 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 2 & 4 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 1 & 2 & 4 \end{bmatrix}.

Stacking the interpolation nodes yields:

X = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 2 & 0 \\ 0 & 1 \\ 1 & 1 \\ 2 & 1 \end{bmatrix}.

Let yi = f(zi), where zi is the ith row of the matrix X. We then obtain the
vector y.
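The Kronecker-product structure of Φ is easy to verify directly. The following sketch (a hand-rolled `kron` for matrices stored as lists of lists; the helper name is my own) reproduces the 6 × 6 matrix displayed above:

```python
def kron(A, B):
    """Kronecker product A ⊗ B: block (i, j) of the result is A[i][j] * B."""
    return [[a_ij * b_pq for a_ij in Arow for b_pq in Brow]
            for Arow in A for Brow in B]

Phi1 = [[1, 0, 0], [1, 1, 1], [1, 2, 4]]   # monomial basis at nodes x1 in {0, 1, 2}
Phi2 = [[1, 0], [1, 1]]                    # monomial basis at nodes x2 in {0, 1}
Phi = kron(Phi2, Phi1)                     # Phi = Phi_2 ⊗ Phi_1, as in the text

assert Phi == [[1, 0, 0, 0, 0, 0],
               [1, 1, 1, 0, 0, 0],
               [1, 2, 4, 0, 0, 0],
               [1, 0, 0, 1, 0, 0],
               [1, 1, 1, 1, 1, 1],
               [1, 2, 4, 1, 2, 4]]
```

In d dimensions one simply chains `kron` over Φ_d, ..., Φ_1, and the coefficient solve inherits the dimension-by-dimension inverse shown above.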
The tensor product approach suffers from the curse of dimensionality
when the dimension is high. Judd (1998) recommends using complete
polynomial bases instead. See Judd (1998, p. 239).
11.4 Perturbation Methods
As Judd (1998) points out, the basic idea of perturbation methods is to
formulate a general problem, find a special case with a known solution, and
then use the special case and its solution as a reference point for comput-
ing approximate solutions to the nearby general problem. The foundation
of this approach is the implicit function theorem and the Taylor expansion
theorem. An application of this approach familiar in economics is the lin-
earization and second-order approximation methods in solving deterministic
and stochastic nonlinear difference equations. In fact, perturbation meth-
ods have wide applications in economics including performing comparative
statics and solving difference or differential equations.
In this section, we take an example from Hansen, Heaton and Li (2008)
to illustrate the method. Consider a representative agent with the following
recursive utility (Kreps and Porteus (1978) and Epstein and Zin (1989)):

V_t = \left[ (1 - \beta) C_t^{1-\rho} + \beta \big( R_t(V_{t+1}) \big)^{1-\rho} \right]^{\frac{1}{1-\rho}}, \qquad (11.3)

where

R_t(V_{t+1}) = \left[ E_t \left( V_{t+1}^{1-\gamma} \right) \right]^{\frac{1}{1-\gamma}}.
Here the parameter γ measures the degree of risk aversion and 1/ρ measures
the elasticity of intertemporal substitution (EIS). We will discuss general
recursive utility in Chapter 21. Aggregate consumption follows the
dynamics:

\ln \frac{C_{t+1}}{C_t} = \mu_c + x_t + \sigma_c \varepsilon_{c,t+1}, \qquad x_{t+1} = \rho x_t + \sigma_x \varepsilon_{x,t+1},

where ε_{c,t} and ε_{x,t} are IID standard normal and mutually independent.
Our objective is to compute the utility value V_t.
We will derive a closed-form expression for V_t in the special case with
unitary EIS, ρ = 1. We will then use the perturbation method to derive an
approximate solution for V_t for the general case with ρ ≠ 1.

To do so, rewrite (11.3) as

\frac{V_t}{C_t} = \left[ (1 - \beta) + \beta \left( R_t\!\left( \frac{V_{t+1}}{C_{t+1}} \frac{C_{t+1}}{C_t} \right) \right)^{1-\rho} \right]^{\frac{1}{1-\rho}},
and define

v_t = \ln \left( \frac{V_t}{C_t} \right) \quad \text{and} \quad c_t = \ln C_t.

It follows that

v_t = \frac{1}{1-\rho} \ln \left( (1 - \beta) + \beta \exp \left[ (1 - \rho) Q_t (v_{t+1} + c_{t+1} - c_t) \right] \right), \qquad (11.4)

where

Q_t(v_{t+1}) = \frac{1}{1-\gamma} \ln E_t \exp \left[ (1 - \gamma) v_{t+1} \right].

When ρ = 1, we can show that

v_t = \beta Q_t (v_{t+1} + c_{t+1} - c_t) = \frac{\beta}{1-\gamma} \ln E_t \exp \left[ (1 - \gamma)(v_{t+1} + c_{t+1} - c_t) \right]. \qquad (11.5)
Guess

v_t = A x_t + \mu_v, \qquad (11.6)

where A and μ_v are constants to be determined.
Using the conjecture, compute

\ln E_t \exp[(1-\gamma)(v_{t+1} + c_{t+1} - c_t)]
= \ln E_t \exp[(1-\gamma)(A x_{t+1} + \mu_v + \mu_c + x_t + \sigma_c \varepsilon_{c,t+1})]
= \ln E_t \exp[(1-\gamma)(A \rho x_t + A \sigma_x \varepsilon_{x,t+1} + \mu_v + \mu_c + x_t + \sigma_c \varepsilon_{c,t+1})]
= (1-\gamma)\left[(A\rho + 1) x_t + \mu_v + \mu_c\right] + \frac{(1-\gamma)^2}{2} \left[ (A\sigma_x)^2 + \sigma_c^2 \right].
Substituting the above expression and the conjecture into (11.5) and
matching coefficients, we find:

A = \frac{\beta}{1 - \beta\rho}, \qquad \mu_v = \frac{\beta}{1-\beta} \left[ \mu_c + \frac{1-\gamma}{2} \left( \left( \frac{\beta \sigma_x}{1 - \beta\rho} \right)^2 + \sigma_c^2 \right) \right].
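For completeness, the matching step behind these two formulas can be written out explicitly (this fills in the algebra between (11.5), the conjecture (11.6), and the displayed solution):

```latex
% Substitute the computed expectation into (11.5) and use the guess v_t = A x_t + \mu_v:
A x_t + \mu_v
  = \beta\left[(A\rho + 1)x_t + \mu_v + \mu_c\right]
  + \frac{\beta(1-\gamma)}{2}\left[(A\sigma_x)^2 + \sigma_c^2\right].
% Matching the coefficients on x_t:
A = \beta(A\rho + 1) \quad\Longrightarrow\quad A = \frac{\beta}{1-\beta\rho}.
% Matching the constants, using A\sigma_x = \beta\sigma_x/(1-\beta\rho):
\mu_v = \frac{\beta}{1-\beta}\left[\mu_c
  + \frac{1-\gamma}{2}\left(\left(\frac{\beta\sigma_x}{1-\beta\rho}\right)^2
  + \sigma_c^2\right)\right].
```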
We now use the perturbation method to solve for v_t for ρ ≠ 1. Let
v_t = v_t(ρ). By Taylor expansion, the approximate solution is given by

v_t(\rho) \approx v_t(1) + (\rho - 1) v_t'(1).

We intend to find v_t(1) and v_t'(1). Use (11.4) to define a functional equation:

0 = F(v_t(\rho), v_{t+1}(\rho), \rho) \equiv \ln \left( (1 - \beta) + \beta \exp \left[ (1 - \rho) Q_t (v_{t+1}(\rho) + c_{t+1} - c_t) \right] \right) - (1 - \rho) v_t(\rho).
Taking the derivative with respect to ρ yields:

0 = F_1(v_t(\rho), v_{t+1}(\rho), \rho) v_t'(\rho) + F_2(v_t(\rho), v_{t+1}(\rho), \rho) v_{t+1}'(\rho) + F_3(v_t(\rho), v_{t+1}(\rho), \rho).

Evaluating at ρ = 1, we find that v_t(1) and v_{t+1}(1) satisfy (11.5). Thus,
v_t(1) is given by (11.6). Next, taking the derivative with respect to ρ again in
the above equation and evaluating at ρ = 1, we can show that

v_t'(1) = \frac{-\beta(1-\beta)}{2} \left[ Q_t(v_{t+1}(1) + c_{t+1} - c_t) \right]^2 + \beta \tilde{E}_t \left[ v_{t+1}'(1) \right], \qquad (11.7)

where \tilde{E}_t is the expectation operator with respect to the distorted density

\frac{(V_{t+1}(1))^{1-\gamma}}{E_t \left[ (V_{t+1}(1))^{1-\gamma} \right]}.

Conjecture

v_t'(1) = -\frac{A_d}{2} x_t^2 + B_d x_t + \mu_d.

Substituting this conjecture into (11.7) and matching coefficients, we can
determine the coefficients A_d, B_d, and μ_d. Exercise 2 asks you to complete
the explicit solution.
11.5 Projection Methods
Consider the following functional equation:
F (f) = 0,
where f : D ⊂ RN → RM and F maps the function space F of all such f
to itself. The operator F may encode equilibrium conditions. For ease of
exposition, we consider the case of M = 1 only. Since an analytical solution
is typically not available, an approximated solution to f is needed. A natural
approximation is to take a linear combination of some basis functions as in
the interpolation method,
\hat{f}(x) = \sum_{j=1}^{n} c_j \phi_j(x).
The goal of the projection method is to determine the degree of approximation
n, the vector of basis coefficients c = (c1, c2, ..., cn), and the basis
functions {φ1, φ2, ..., φn}. As Judd (1992, 1998) explains, the projection
method typically consists of the following five steps.
Step 1. Choose basis functions, φi, i = 1, 2, ..., and inner products,
< ·, · >, on F.
The basis functions should be flexible, capable of yielding a good ap-
proximation for the solution, and the inner products should induce useful
norms on the spaces. We have discussed this issue in Section 11.3. Next, we
decide how many basis elements to use and how to implement F.
Step 2. Choose a degree of approximation n for f, a computable
approximation \hat{F} of F, and a collection of n functions p_i : D → R,
i = 1, ..., n.
One may start with a small value of n and increase it until some
diagnostic indicates that little is gained by continuing. Whenever possible, we
choose \hat{F} = F. The functions p_i are the projection directions or weighting
functions.
Step 3. For a guess c, compute the approximation,

\hat{f}(x; c) = \sum_{j=1}^{n} c_j \phi_j(x),

and the residual function:

R(x; c) = \hat{F}\left( \hat{f}(x; c) \right).
The initial guess of c should reflect some initial knowledge about the
solution. After the initial guess, further guesses are generated in Steps 4
and 5, where the inner product, ⟨·, ·⟩, is used to measure how close the
approximation is to the true solution.
Step 4. For each guess of c, compute the n projections,

P_i(c) = \langle R(\cdot; c), p_i(\cdot) \rangle, \quad i = 1, 2, \ldots, n.

Step 5. By iterating over Steps 3 and 4, find the c that sets the n
projections P_i(c) equal to zero.
Different choices of the pi lead to different implementations of the pro-
jection method.
The collocation method. This method uses the following projection
directions:

p_i(x) = \begin{cases} 1 & \text{if } x = x_i, \\ 0 & \text{if } x \neq x_i, \end{cases} \quad \text{for } i = 1, \ldots, n,

where the points x1, ..., xn are selected collocation nodes on D. It follows
from the projection equation,

P_i(c) = \langle R(\cdot; c), p_i(\cdot) \rangle = 0,

that

R(x_i; c) = 0, \quad i = 1, \ldots, n.
Thus, by a suitable choice of collocation nodes, one solves the above system
of n equations for the n basis coefficients (c1, c2, ..., cn).
Orthogonal collocation chooses the n roots of the (n + 1)th basis function
as collocation points.7 The basis functions are orthogonal with respect to
the inner product. Chebyshev polynomial basis functions are especially
useful. Suppose we have found a c such that R(xi; c) = 0, i = 1, 2, ..., n,
where the xi's are the n roots of Tn(x). As long as R(x; c) is smooth in x,
the Chebyshev interpolation theorem says that these zero conditions force
R(x; c) to be close to zero for all x, and that these are the best possible
points to use if we are to force R(x; c) to be close to zero. In practice, the
performance of the orthogonal collocation method is surprisingly good.
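To make the collocation recipe concrete, consider a deterministic special case with u(C) = ln C, f(k) = k^α, and full depreciation, whose exact policy g(k) = αβk^α is known. The sketch below (parameter values and node locations are my own choices, and two arbitrary nodes are used rather than Chebyshev roots, for brevity) collocates ln g as a linear function of ln k at two nodes and solves the two residual equations by Newton's method with a finite-difference Jacobian, starting near the known solution as the text recommends:

```python
import math

alpha, beta = 0.3, 0.95                 # illustrative parameter values

def g(k, c):
    """Approximant: ln g is linear in ln k, i.e. g(k) = exp(c0) * k**c1."""
    return math.exp(c[0] + c[1] * math.log(k))

def residual(k, c):
    """Euler-equation residual at capital stock k (log utility, delta = 1)."""
    k1 = g(k, c)
    k2 = g(k1, c)
    return 1.0 / (k ** alpha - k1) - beta * alpha * k1 ** (alpha - 1.0) / (k1 ** alpha - k2)

nodes = [0.1, 0.25]                     # two collocation nodes
c = [math.log(alpha * beta) + 0.1, alpha + 0.05]   # perturbed initial guess

for _ in range(50):                     # Newton with a finite-difference Jacobian
    R = [residual(k, c) for k in nodes]
    eps = 1e-7
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        cp = c[:]
        cp[j] += eps
        for i in range(2):
            J[i][j] = (residual(nodes[i], cp) - R[i]) / eps
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    c[0] -= (J[1][1] * R[0] - J[0][1] * R[1]) / det
    c[1] -= (J[0][0] * R[1] - J[1][0] * R[0]) / det

# The exact policy is g(k) = alpha*beta*k**alpha, i.e. c = [ln(alpha*beta), alpha].
assert abs(c[0] - math.log(alpha * beta)) < 1e-6
assert abs(c[1] - alpha) < 1e-6
```

Because the exact solution lies in the span of the basis {1, ln k}, the residual can be driven to zero exactly; in general one would only drive it to zero at the collocation nodes.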
The least squares method. This method chooses the projection
directions:

p_i(x) = \frac{\partial R(x; c)}{\partial c_i}, \quad i = 1, 2, \ldots, n.

It is equivalent to solving the following problem:

\min_c \; \langle R(\cdot; c), R(\cdot; c) \rangle.
The Galerkin method. This method chooses the basis functions as
projection directions:

p_i(x) = \phi_i(x), \quad i = 1, 2, \ldots, n,

leading to the projection conditions:

P_i(c) = \langle R(\cdot; c), \phi_i(\cdot) \rangle = 0, \quad i = 1, 2, \ldots, n.
7 Note that the first basis function is typically equal to 1, and the (n + 1)th basis function has a degree n leading polynomial term and hence exactly n roots.
The method of moments. This method chooses the first n polynomials
as projection directions:

p_i(x) = x^{i-1}, \quad i = 1, 2, \ldots, n,

leading to the projection conditions:

P_i(c) = \langle R(\cdot; c), x^{i-1} \rangle = 0, \quad i = 1, 2, \ldots, n.
Choosing the projection directions is critical since the major compu-
tational task is the computation of the projection conditions. The col-
location method is fastest in this regard since it evaluates R (·; c) at n
points and solves a system of n nonlinear algebraic equations. More generally,
the projections involve integration when evaluating the inner products
⟨R(·; c), p_i(·)⟩. Numerical quadrature techniques are often used to compute
these integrals. As explained previously, a typical quadrature formula
approximates \int_a^b f(x) g(x) \, dx with a finite sum \sum_i f(x_i) w_i, where
the x_i's are the quadrature nodes and the w_i's are the weights. Since these
formulas also evaluate R (·; c) at just a finite number of points, quadrature-
based projection techniques are essentially weighted collocation methods.
The advantage of quadrature formulas over collocation is that information
at more points is used to compute the approximation, hopefully yielding a
more accurate approximation of the projections.
There are many root-finding algorithms to solve the projection conditions
P_i(c) = 0. A good initial guess is critical for convergence. In applications,
one typically uses a known solution to a special case or a linear
approximation as an initial guess.
Now, we use the following simple stochastic growth model as an example
to illustrate the projection method:
\max \; E \left[ \sum_{t=0}^{\infty} \beta^t u(C_t) \right],
subject to

C_t + K_{t+1} = f(K_t, z_t), \qquad \ln z_{t+1} = \rho \ln z_t + \sigma \varepsilon_{t+1},

where {ε_t} is IID standard normal and f(k, z) = z k^α + (1 − δ)k. The
equilibrium conditions consist of the Euler equation:

u'(f(K_t, z_t) - K_{t+1}) = \beta E_t \, u'(f(K_{t+1}, z_{t+1}) - K_{t+2}) \, f'(K_{t+1}, z_{t+1}), \qquad (11.8)
and a transversality condition. We seek a stable solution of the form

K_{t+1} = g(K_t, z_t).

To reduce nonlinearity, we use the Euler equation to define the functional
equation:

0 = f(k, z) - g(k, z) - (u')^{-1} \left( \beta \int I(k, z, \varepsilon) \, \frac{\exp(-\varepsilon^2/2)}{\sqrt{2\pi}} \, d\varepsilon \right) \equiv F(g)(k, z),

where we define

I(k, z, \varepsilon) \equiv u' \big( f(g(k, z), \exp(\rho \ln z + \sigma \varepsilon)) - g(g(k, z), \exp(\rho \ln z + \sigma \varepsilon)) \big) \times f'(g(k, z), \exp(\rho \ln z + \sigma \varepsilon)).
Since the operator F involves an integral without an explicit solution,
we need to use an approximation \hat{F} of F. We may use Gauss-Hermite
quadrature to approximate

\int I(k, z, \varepsilon) \, \frac{\exp(-\varepsilon^2/2)}{\sqrt{2\pi}} \, d\varepsilon \approx \pi^{-1/2} \sum_{j=1}^{m_\varepsilon} I(k, z, \sqrt{2}\, \varepsilon_j) \, w_j,

where w_j and ε_j are the Gauss-Hermite quadrature weights and nodes. We
where wj and εj are Gauss-Hermite quadrature weights and points. We
choose Chebyshev polynomials as the basis functions. Thus, we need to
truncate the domain such that
(k, z) ∈ [km, kM ]× [zm, zM ] ,
where [k_m, k_M] must contain the deterministic steady-state value of capital,
and we may set

z_m = \exp \left( \frac{-4\sigma}{\sqrt{1 - \rho^2}} \right) \quad \text{and} \quad z_M = \exp \left( \frac{4\sigma}{\sqrt{1 - \rho^2}} \right).
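The change of variable behind the Gauss-Hermite formula above can be checked on a case with a known answer. The sketch below (standalone Python; the two-point rule's nodes ±1/√2 and weights √π/2 are the standard values for the weight e^{-x²}) approximates E[exp(σε)] for ε ~ N(0, 1) and compares it with the exact lognormal mean exp(σ²/2):

```python
import math

# Two-point Gauss-Hermite rule for weight exp(-x^2): nodes ±1/sqrt(2), weights sqrt(pi)/2.
gh_nodes = [-1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
gh_weights = [math.sqrt(math.pi) / 2.0] * 2

def expect_normal(h):
    """E[h(eps)], eps ~ N(0,1), via the change of variable eps = sqrt(2)*x."""
    return sum(w * h(math.sqrt(2.0) * x)
               for x, w in zip(gh_nodes, gh_weights)) / math.sqrt(math.pi)

sigma = 0.1
approx = expect_normal(lambda e: math.exp(sigma * e))
exact = math.exp(sigma ** 2 / 2.0)       # exact mean of the lognormal exp(sigma*eps)
assert abs(approx - exact) < 1e-4
```

In the growth model, the integrand h would be I(k, z, ·; c), evaluated once per quadrature node, with more nodes m_ε used for accuracy.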
For Chebyshev polynomials, choose the inner product

\langle f, g \rangle_w = \int \!\! \int f(k, z) \, g(k, z) \, w(k, z) \, dk \, dz,

where

w(k, z) = \left[ 1 - \left( \frac{2k - k_m - k_M}{k_M - k_m} \right)^2 \right]^{-1/2} \left[ 1 - \left( \frac{2z - z_m - z_M}{z_M - z_m} \right)^2 \right]^{-1/2}.
Use the tensor product approximation:

\hat{g}(k, z; c) = \sum_{i=1}^{n_k} \sum_{j=1}^{n_z} c_{ij} \phi_{ij}(k, z),

where an n_k × n_z degree of approximation is used and

\phi_{ij}(k, z) = T_{i-1} \left( \frac{2k - k_m - k_M}{k_M - k_m} \right) T_{j-1} \left( \frac{2z - z_m - z_M}{z_M - z_m} \right).
Now we can write the residual function as

R(k, z; c) = f(k, z) - \hat{g}(k, z; c) - (u')^{-1} \left( \beta \pi^{-1/2} \sum_{j=1}^{m_\varepsilon} I(k, z, \sqrt{2}\, \varepsilon_j; c) \, w_j \right),

where

I(k, z, \varepsilon; c) = u' \big( f(\hat{g}(k, z; c), \exp(\rho \ln z + \sigma \varepsilon)) - \hat{g}(\hat{g}(k, z; c), \exp(\rho \ln z + \sigma \varepsilon); c) \big) \times f'(\hat{g}(k, z; c), \exp(\rho \ln z + \sigma \varepsilon)).
Once we obtain the residual function, we can proceed to compute c numer-
ically using one of the above implementation methods. Judd (1992, 1998)
shows that the orthogonal collocation method performs well in terms of
speed and accuracy. Finally, we should check that the solution is stable.
This is easy to do for deterministic models. But it seems hard to check
for stochastic models. One way is to check whether the ergodic set for the
endogenous state is compact.
We may use other basis functions, for example, piecewise polynomials
or splines. This leads to the finite element method. We refer to McGrattan
(1996) for details on this method.
11.6 Numerical Dynamic Programming
From the standpoint of computation, there is an important distinction be-
tween discrete dynamic programming problems whose state and control vari-
ables can take only a finite number of possible values, and continuous dy-
namic programming problems whose state and control variables can take a
continuum of possible values. This section will focus on the latter problems,
since the methods for solving discrete dynamic programming problems have
been well developed in the operations research literature (e.g., Bertsekas
(1987), Porteus (1980), and Puterman (1990, 1994)). There are two basic
strategies for solving continuous dynamic programming problems numeri-
cally: (i) discrete approximation and (ii) smooth approximation. Discrete
approximation methods solve a discrete dynamic problem that approximates
the original continuous dynamic programming problem over a finite grid of
points in the state space and the action space. Smooth approximation meth-
ods use continuous or smooth functions to approximate the value function
or the policy function. These methods are typically more efficient, but less
reliable than the discrete approximation methods. The reader is referred to
Chapter 12 of Judd (1998) and Rust (1996) for more discussions on numer-
ical dynamic programming.
We shall use a simple infinite-horizon stochastic growth model to illustrate
the different numerical methods. We will not treat finite-horizon problems
separately; solving a finite-horizon problem may be an intermediate step.
The dynamic programming problem is given by

V(K, z) = \max_{C, K' > 0} \; u(C) + \beta E \left[ V(K', z') \mid z \right], \qquad (11.9)

subject to

C + K' = f(K, z), \qquad \ln z' = \rho \ln z + \sigma \varepsilon',

where ε' is IID standard normal.
11.6.1 Discrete Approximation Methods
The first step in applying a discrete approximation method is to discretize the
state space. This requires us to truncate the state space to a compact set,
because many dynamic programming problems in economics have unbounded
state spaces. As long as the truncated state space is sufficiently large, the
truncated problem can approximate the original problem arbitrarily closely.
For the above stochastic growth model, we choose the truncated endogenous
state space as [kmin, kmax]. We take a small positive value as kmin and
require kmax to be larger than the deterministic steady-state value of capital.
We put grid points on [kmin, kmax]: kmin = k1 < k2 < ... < k_{nk} = kmax.
These points are often equally spaced, but for problems with kinks one may
put more points in the region with kinks. We then obtain the discretized
space K = {k1, k2, ..., k_{nk}}.
For the exogenous shock, we define st = ln zt. Then {st} is an AR(1)
process. To approximate this process, we use either the Tauchen (1986)
method or the Tauchen and Hussey (1991) method to derive the discretized
Markov chain with the state space S = {s1, s2, ..., snz} and the transition
matrix (πij) .
Now, the original problem is transformed into a discretized dynamic
programming problem:

V(k_m, s_i) = \max_{k' \in \mathcal{K}} \; u \big( f(k_m, \exp(s_i)) - k' \big) + \beta \sum_{j=1}^{n_z} \pi_{ij} V(k', s_j), \qquad (11.10)

for m = 1, 2, ..., n_k and i = 1, 2, ..., n_z. There are two basic algorithms for
solving this problem.
Value Function Iteration The first method is to iterate on the value
function; it is often called the value function iteration method or the method
of successive approximations. Define the Bellman operator B mapping the
space of bounded functions on K × S into itself:

B v(k_m, s_i) = \max_{k' \in \mathcal{K}} \; u \big( f(k_m, \exp(s_i)) - k' \big) + \beta \sum_{j=1}^{n_z} \pi_{ij} v(k', s_j). \qquad (11.11)
The solution to (11.10) is the fixed point V = BV. The contraction
mapping theorem ensures that there is a unique fixed point. To solve for this
fixed point, we may start with an arbitrary guess V0 and iterate on the
Bellman operator,

V_n = B(V_{n-1}) = B^n V_0.

Note that in the case where we set V0 = 0, this method is equivalent to
solving an approximate finite-horizon problem by backward induction. The
contraction mapping theorem guarantees the convergence of this algorithm:
in particular, the contraction property implies that ||V − Vn|| converges to
zero at a geometric rate.
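A minimal sketch of discretized value function iteration follows, using a deterministic version of the growth model (log utility, full depreciation, no shock) so the exact policy g(k) = αβk^α is available as a check; the grid bounds, parameter values, and tolerances are illustrative choices of my own:

```python
import math

alpha, beta = 0.3, 0.95
grid = [0.05 + 0.30 * i / 49 for i in range(50)]      # capital grid on [0.05, 0.35]

def bellman(V):
    """One application of the discretized Bellman operator; returns (TV, policy)."""
    TV, pol = [], []
    for k in grid:
        y = k ** alpha                                 # output; full depreciation
        best, arg = -1e18, 0
        for j, kp in enumerate(grid):
            c = y - kp
            if c <= 0.0:
                break                                  # grid is increasing
            val = math.log(c) + beta * V[j]
            if val > best:
                best, arg = val, j
        TV.append(best)
        pol.append(arg)
    return TV, pol

V = [0.0] * len(grid)
for _ in range(2000):
    TV, pol = bellman(V)
    if max(abs(a - b) for a, b in zip(TV, V)) < 1e-7:  # sup-norm stopping rule
        break
    V = TV

# The computed policy is monotone and close to the exact rule g(k) = alpha*beta*k**alpha.
assert all(pol[i] <= pol[i + 1] for i in range(len(pol) - 1))
for i in (5, 25, 45):
    assert abs(grid[pol[i]] - alpha * beta * grid[i] ** alpha) < 0.02
```

Starting from V = 0, each sweep is one step of backward induction, and the sup-norm differences shrink at the geometric rate β, exactly as the contraction argument predicts.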
Policy Function Iteration The second method iterates on the policy
function. This method starts by choosing an arbitrary initial policy g0. Note
that g0 takes n_k × n_z discrete values. Next, a policy evaluation step is carried
out to compute the value function implied by the stationary decision rule
g0. This amounts to solving the following linear system:

V_0(k_m, s_i) = u \big( f(k_m, \exp(s_i)) - g_0(k_m, s_i) \big) + \beta \sum_{j=1}^{n_z} \pi_{ij} V_0(g_0(k_m, s_i), s_j), \qquad (11.12)

for the n_k × n_z values of V0. Once the solution V0 is obtained, a policy
improvement step is used to generate an updated policy g1:

g_1(k_m, s_i) = \arg\max_{k' \in \mathcal{K}} \; u \big( f(k_m, \exp(s_i)) - k' \big) + \beta \sum_{j=1}^{n_z} \pi_{ij} V_0(k', s_j).

Given g1, one continues the cycle of policy evaluation and policy improvement
steps until the first iteration n such that gn = gn−1 (or, alternatively, Vn =
Vn−1). Since such a gn satisfies the Bellman equation (11.10), it is optimal.
The policy function iteration method is much faster than the value
function iteration method. The reason is closely connected to the very rapid
quadratic convergence rate of Newton's method for nonlinear equations.
Indeed, Puterman and Brumelle (1979) show that policy iteration is a form
of the Newton-Kantorovich method for finding a zero of the nonlinear
mapping F : R^{n_k n_z} → R^{n_k n_z} defined by F(V) = (I − B)V.
One may use the modified policy iteration method to speed up the policy
evaluation step. Instead of solving the large linear system, one may use
successive approximations to approximate the solution to (11.12). Define the
operator

B V_0(k_m, s_i) = u \big( f(k_m, \exp(s_i)) - g_0(k_m, s_i) \big) + \beta \sum_{j=1}^{n_z} \pi_{ij} V_0(g_0(k_m, s_i), s_j).

Then the solution to the linear system is the fixed point V0 = BV0. We can
iterate on B a small number of times to obtain an approximation of V0.
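The policy evaluation step can be illustrated on a toy two-state chain (the utilities and transition probabilities below are made up): solve (I − βP)V = u directly, and check that successive approximation — which modified policy iteration truncates after a few sweeps — converges to the same fixed point:

```python
beta = 0.95
u = [1.0, 2.0]                      # made-up per-period utilities under a fixed policy
P = [[0.9, 0.1], [0.2, 0.8]]        # made-up transition matrix induced by the policy

# Direct policy evaluation: solve (I - beta*P) V = u for the 2x2 case.
a11 = 1.0 - beta * P[0][0]; a12 = -beta * P[0][1]
a21 = -beta * P[1][0];      a22 = 1.0 - beta * P[1][1]
det = a11 * a22 - a12 * a21
V_exact = [(a22 * u[0] - a12 * u[1]) / det,
           (a11 * u[1] - a21 * u[0]) / det]

# Successive approximation on the same fixed point; modified policy iteration
# would simply stop this loop after a small number of sweeps.
V = [0.0, 0.0]
for _ in range(1000):
    V = [u[i] + beta * sum(P[i][j] * V[j] for j in range(2)) for i in range(2)]

assert all(abs(V[i] - V_exact[i]) < 1e-6 for i in range(2))
```

For realistic state spaces the direct solve is an n_k n_z × n_k n_z linear system, which is why truncated successive approximation is attractive.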
Multigrid Method In the previous two methods, we have fixed the discretized state space. To obtain an accurate solution, a large state space is needed, which slows down computation. Chow and Tsitsiklis (1991)
propose a multigrid method that makes convergence much faster. The idea of this method is very simple. One first starts with a coarse grid and solves the discretized dynamic programming problem on it. One then doubles the number of grid points to obtain a finer grid and solves the discretized problem on this grid, using the solution from the previous grid as an initial guess. This step is repeated until convergence.
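The grid-doubling logic can be sketched as follows. Here `solve_on_grid` stands for any fixed-grid dynamic programming solver (such as the value function iteration above) and is an assumed input, not something defined in the text:

```python
import numpy as np

def multigrid_solve(solve_on_grid, kmin, kmax, nk0=50, n_levels=3):
    """Solve a dynamic program on successively finer capital grids.

    solve_on_grid(kgrid, V_init) -> V runs ordinary value function
    iteration on the fixed grid kgrid starting from the guess V_init;
    its implementation is assumed. Each level doubles the grid and
    warm-starts from the interpolated coarse-grid solution.
    """
    kgrid = np.linspace(kmin, kmax, nk0)
    V = solve_on_grid(kgrid, np.zeros(nk0))
    for _ in range(n_levels - 1):
        fine = np.linspace(kmin, kmax, 2 * len(kgrid))
        V = solve_on_grid(fine, np.interp(fine, kgrid, V))  # warm start
        kgrid = fine
    return kgrid, V

# Example with a trivial "solver" that ignores the initial guess:
kgrid, V = multigrid_solve(lambda grid, v0: grid.copy(), 0.0, 1.0,
                           nk0=4, n_levels=3)
```

The speedup comes from the warm start: most of the costly iterations happen on the cheap coarse grids, while the fine grid only needs a few iterations to refine an already good guess.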
11.6.2 Smooth Approximation Methods
The discrete approximation methods require the endogenous states to lie on the grid. In addition, many grid points are needed to get an accurate solution. Thus, these methods are inefficient. A natural
strategy to deal with this issue is to use value function interpolation. This
method consists of the following steps.
Step 1. Discretize the endogenous state space: kmin = k1 < k2 < ... < k_{nk} = kmax. Discretize the AR(1) process for {st} to obtain a Markov chain with state space S = {s1, s2, ..., s_{nz}} and transition matrix (πij).
Step 2. Start with an initial guess V0 (km, si) for m = 1, 2, ..., nk and
i = 1, 2, ..., nz. Use one of the interpolation methods introduced in Section 11.3 to get an approximate V0 on [kmin, kmax] × S.
Step 3. Compute
V1 (km, si) = max_{k′∈[kmin,kmax]} u (f (km, exp (si)) − k′) + β ∑_{j=1}^{nz} πij V0 (k′, sj).
Step 4. Update V0 by V1 and go to Step 2. Keep iterating until conver-
gence.
This method is a simple variation of the discrete value function iteration method and requires far fewer grid points to get an accurate solution with
faster convergence. This method, however, still suffers from relatively slow convergence due to the discretization of the state space. Next, we introduce two more efficient methods based on polynomial approximations.
Projection Methods Projection methods approximate the value func-
tion by a linear combination of some basis functions on a truncated state
space [kmin, kmax]× [smin, smax] :
V (k, s; c) = ∑_{i=1}^{dk} ∑_{j=1}^{dz} cij φij (k, s),
where cij are basis coefficients and φij are basis functions. Define the residual
function as
R (k, s; c) = V (k, s; c)−BV (k, s; c) ,
where
BV (k, s; c) = max_{k′∈[kmin,kmax]} u (f (k, exp (s)) − k′) + E [V (k′, s′; c) | s].
We may use the Gauss quadrature method to evaluate the conditional expectation numerically. We then obtain an approximate residual function and can apply one of the implementations of the projection methods discussed earlier to solve for c.
Parameterized Expectations Approach The parameterized expec-
tations approach (PEA) was introduced by Marcet (1988) and den Haan and Marcet (1990) and refined by Christiano and Fisher (2000). Here we shall
introduce the refinement of Christiano and Fisher (2000), which borrows
ideas from the projection methods and approximates the conditional expec-
tation from the Euler equation by polynomials. This method is especially
useful for models with occasionally binding constraints.
Following Christiano and Fisher (2000), we use the stochastic growth
model with irreversible investment as an example. Suppose we introduce
the constraint
K ′ ≥ (1− δ)K, (11.13)
into the previous dynamic programming problem (11.9). We may write the
first-order conditions as
u′ (f (k, z) − g (k, z)) − h (k, z) = β ∫ m (g (k, z), z′; g, h) q (z′|z) dz′,

h (k, z) (g (k, z) − (1 − δ) k) = 0,

g (k, z) − (1 − δ) k ≥ 0,  h (k, z) ≥ 0,

where

m (k′, z′; g, h) ≡ u′ (f (k′, z′) − g (k′, z′)) f ′ (k′, z′) − h (k′, z′) (1 − δ) ≥ 0. (11.14)
Here g (k, z) is the policy function for the next-period capital stock, h (k, z) is the Lagrange multiplier associated with the constraint (11.13), and q (z′|z) is the conditional density implied by the AR(1) shock process. The inequality in (11.14) reflects the fact that m is the derivative of the value function and the value function is increasing in the capital stock. The above conditions combined with the transversality condition are sufficient for optimality.
We may use the projection methods to solve for g and h. This means that
we have to approximate these two functions by polynomials.
PEA simplifies the solution in that we need to approximate only one
function. To achieve this goal, we write the first-order conditions as:
u′ (f (k, z) − g̃ (k, z)) = β ∫ m (g (k, z), z′; g, h) q (z′|z) dz′,

g (k, z) = { g̃ (k, z) if g̃ (k, z) ≥ (1 − δ) k; (1 − δ) k otherwise },
h (k, z) = u′ (f (k, z) − g (k, z)) − β ∫ m (g (k, z), z′; g, h) q (z′|z) dz′.
Define the functional equation:
0 = F (e) (k, z) ≡ e (k, z) − β ∫ m (g (k, z), z′; g, h) q (z′|z) dz′,
where
g (k, z) = max{(1 − δ) k, f (k, z) − (u′)^{−1} (e (k, z))},

h (k, z) = u′ (f (k, z) − g (k, z)) − e (k, z),
and m is defined in (11.14). The goal is to approximate the function e (k, z)
by polynomials. We can then apply the projection methods to solve the
above functional equation for e (k, z) .
As Christiano and Fisher (2000) point out, an important advantage of
PEA is that the conditional expectation, e (k, z) , is a relatively smooth
function because expectations smooth out some kinks. Thus, this method
performs better, in terms of both speed and accuracy, than the usual projection methods that approximate g and h directly.
11.7 Exercises
1. Use the CompEcon toolbox to construct the 5- and 20-degree approximants of the functions f (x) = |x|^{0.6} and f (x) = (1 + 2x^2)^{−1} on the interval [−1, 1] using each of the following interpolation methods. For each method and degree of approximation, plot the approximation error and the approximant.
(a) Uniform grid, monomial basis polynomial approximant.
(b) Chebyshev node, Chebyshev basis polynomial approximant.
(c) Uniform node, linear spline.
(d) Uniform node, cubic spline.
2. Complete the solution for the example in Section 11.4.
3. Introduce endogenous labor choice in the stochastic growth model:
max_{Ct,Nt,Kt+1} E ∑_{t=0}^∞ β^t (Ct^θ (1 − Nt)^{1−θ})^{1−γ} / (1 − γ)

subject to

Ct + Kt+1 = (1 − δ) Kt + zt Kt^α Nt^{1−α},

ln zt+1 = ρ ln zt + σ εt+1.
Let θ = 0.8, γ = 2, α = 0.33, ρ = 0.95, σ = 0.01, δ = 0.025, and
β = 0.99. Use the following methods to solve the model in Matlab:
(a) The discrete value function iteration method.
(b) The discrete policy function iteration method.
(c) The value function interpolation method implemented by cubic
spline or Chebyshev polynomials.
(d) The projection method based on the Euler equation, implemented
by the orthogonal collocation using Chebyshev polynomials.
(e) The projection method based on the value function, implemented
by the orthogonal collocation using Chebyshev polynomials.
Chapter 12
Structural Estimation
12.1 Estimation Methods and Asymptotic Properties
12.1.1 Generalized Method of Moments
12.1.2 Maximum Likelihood
12.1.3 Simulation-Based Methods
12.2 Application: Investment
Cooper and Haltiwanger
12.3 Exercises
Part III
Equilibrium Analysis
In this part, we apply the tools developed in previous chapters to an-
alyze a variety of equilibrium models widely used in macroeconomics and
finance. They range from finite-horizon overlapping-generations models to
infinite-horizon exchange and production models, from complete markets
models to incomplete markets models, from perfectly competitive models to
monopolistically competitive models, and from Walrasian models to search
and matching models. We shall introduce analytical methods to explore
model properties as well as numerical methods to investigate empirical im-
plications.
Chapter 13
Complete Markets Exchange Economies
13.1 Arrow-Debreu Equilibrium versus Radner Equilibrium
13.2 Recursive Formulation
13.3 The Lucas Model
13.4 Exercises
Chapter 14
Neoclassical Growth Models
Neoclassical growth theory is the cornerstone of modern macroeconomics. The theory was first developed independently by Solow (1956) and Swan (1956) and is used to study long-run economic growth. The Solow-
Swan model explains long-run economic growth by looking at productivity,
capital accumulation, population growth and technological progress. In the
Solow-Swan model, the saving rate is exogenously given. Ramsey (1928) develops a model in which national savings are endogenously chosen through household utility maximization. Cass (1965) and Koopmans (1965) modify the Ramsey model by incorporating population growth and technical
progress as in the Solow-Swan model, giving birth to the optimal growth
model, also named the Ramsey–Cass–Koopmans model or simply the Ram-
sey model. Brock and Mirman (1972) introduce uncertainty into the Ram-
sey model. Kydland and Prescott (1982) take the stochastic optimal growth
model to the U.S. data and study how much technology shocks can explain
business cycles quantitatively. They argue that growth and fluctuations can be studied in the same neoclassical growth framework, thereby pioneering real business cycle theory. In this chapter, we first present deterministic models. We then present real business cycle models. Finally, we discuss some empirical strategies for taking the models to the data.
14.1 Deterministic Models
14.1.1 A Basic Ramsey Model
We start with a basic deterministic Ramsey model in which there is no gov-
ernment. There is population and technological growth. We first study the
Arrow-Debreu equilibrium in which markets for contingent claims open at
date zero. We then consider two sequential market arrangements depending on who owns the capital stock in the economy. Finally, we study a social
planner’s problem.
Arrow-Debreu Equilibrium
There is a continuum of households of measure unity. Each household con-
sists of identical members whose size grows at gross rate gn. The initial size
is normalized to one. Each household derives utility from consumption and
leisure according to the utility function:
∑_{t=0}^∞ β^t g_n^t u (ct, nt), (14.1)
where β ∈ (0, 1) is the subjective discount factor and ct and nt represent a
household member’s consumption and labor, respectively. Assume that
uc > 0, ucc < 0, lim_{c→0} uc (c, ·) = ∞, un < 0, unn > 0.
There is a continuum of identical firms of measure unity. Each firm com-
bines capital and labor to produce output using a constant-returns-to-scale
production function. Due to the constant-returns-to-scale technology, we
only need to consider a single aggregate firm. Let the aggregate production
be given by
Yt = F (Kt, AtNt) ,
where Yt, Kt, Nt, and At represent aggregate output, aggregate capital,
aggregate labor demand, and technical progress, respectively. Here, AtNt
represents effective labor. Technical progress entering in this fashion is known as labor-augmenting or Harrod-neutral.1 Technology grows at the gross rate ga so that At = ga^t. Assume that F is homogeneous of degree one and
satisfies:
F (0, ·) = F (·, 0) = F (0, 0) = 0,

F1 > 0, F2 > 0, F11 < 0, F22 < 0,

lim_{K→0} F1 (K, 1) = ∞,  lim_{K→∞} F1 (K, 1) = 0.
Suppose that households own all factors of production and all shares in
firms and that these endowments are equally distributed across households.
Each period households sell capital and supply labor to firms and buy goods
produced by firms, consuming some and accumulating the rest as capital.
Assume that firms hire labor and rent capital from households to produce
output, sell output, and pay out profits to households. All transactions take
place in a single once-and-for-all market that meets in period 0. Prices and
quantities are determined simultaneously at date 0. After this market is
closed, in future dates, agents simply deliver the quantities of factors and
goods they have contracted to sell and receive those they have contracted
to buy according to the contracted prices.
Let q0t be the date-zero Arrow-Debreu price of a claim in terms of con-
sumption goods to be delivered at date t. Let wt and Rkt be the wage rate
and the rental rate at date t. Then the budget constraint is given by
1 If technology enters in the form Y = F (AK, N), technical progress is capital-augmenting. If it enters in the form Y = AF (K, N), technical progress is Hicks-neutral. For the Cobb-Douglas production function, all three forms of technical progress are equivalent and consistent with balanced growth. For all other production functions, the only form of technical progress consistent with balanced growth is labor-augmenting.
∑_{t=0}^∞ q0t (g_n^t ct + g_n^t it) = ∑_{t=0}^∞ q0t (Rkt kt g_n^t + wt nt g_n^t) + Π,
where Π represents the present value of profits from firms. A representative
household chooses sequences of {ct, it, nt} so as to maximize utility (14.1)
subject to the above budget constraint and the capital accumulation equa-
tion:
gnkt+1 = (1− δ) kt + it, k0 given, (14.2)
where δ ∈ (0, 1) is the capital depreciation rate.
The first-order conditions from the household’s optimization problem are
given by
ct : β^t uc (ct, nt) = λ q0t,

nt : −β^t un (ct, nt) = λ q0t wt,

kt+1 : q0t = q0t+1 (Rkt+1 + 1 − δ),
where λ is the Lagrange multiplier associated with the budget constraint.
The transversality condition is given by
lim_{t→∞} q0t kt+1 = 0.
Each firm solves the following problem:
Π = max_{Kt,Nt} ∑_{t=0}^∞ q0t [F (Kt, At Nt) − Rkt Kt − wt Nt],
where Kt and Nt represent aggregate capital and aggregate labor demanded
by firms. The first-order conditions are given by
FK (Kt, AtNt) = Rkt, AtFN (Kt, AtNt) = wt. (14.3)
The firm makes zero profit so that Π = 0, because of the constant-returns-
to-scale technology.
An Arrow-Debreu equilibrium consists of sequences of per capita quantities {ct, it, nt, kt+1}, aggregate quantities {Ct, It, Nt, Kt+1}, and prices {q0t, wt, Rkt} such that (i) given prices, the sequences {ct, it, nt, kt+1} solve the household's problem; (ii) given prices, {Kt, Nt} solves the firm's problem; and
(iii) markets clear in that:
Ct = g_n^t ct,  Kt = g_n^t kt, (14.4a)

It = g_n^t it,  Nt = g_n^t nt, (14.4b)

Kt+1 = (1 − δ) Kt + It, (14.4c)

Ct + It = F (Kt, At Nt). (14.4d)
We focus on the equilibrium that is consistent with balanced growth
in that all aggregate variables in the steady state must grow at constant
rates (some rates could be zero). To satisfy this property, King, Plosser and
Rebelo (1988) and King and Rebelo (??) show that the utility function must
satisfy:
u (c, n) = c^{1−σ} v (n) / (1 − σ) for σ ∈ (0, 1) or σ > 1, (14.5)
or
u (c, n) = ln c+ v (n) for σ = 1, (14.6)
for some function v.
Given this class of utility functions, we can show that Yt, Kt, It and Ct
all grow at the gross rate gagn in the steady state. In addition, the wage
rate grows at the rate ga and the rental rate of capital Rkt is constant in the
steady state. Define a detrended variable as x̃t = g_a^{−t} xt for xt = kt, it, ct, or wt. Using the first-order conditions for the household's problem and for the
firm’s problem, we can then characterize the equilibrium system as:
uc (ct, nt) = βuc (ct+1, nt+1) (Rkt+1 + 1− δ) ,
400 CHAPTER 14. NEOCLASSICAL GROWTH MODELS
−un (ct, nt)uc (ct, nt)
= wt,
Rkt = FK
(kt, nt
), wt = FN
(kt, nt
)gngakt+1 = (1− δ) kt + ıt,
ct + ıt = F(kt, nt
)= Yt (1 + ga)
−t g−tn .
Households Own Capital
The market structure in the Arrow-Debreu equilibrium is unrealistic in that
there is a single big market opening only in date zero. We now consider
the market arrangement in which agents trade sequentially. Suppose that
households own capital and make consumption, investment, and labor sup-
ply decisions each period. Households receive income from firms by renting
capital and supplying labor. Households also own firms and earn profits πt.
A representative household’s budget constraint is given by
ct + it = Rktkt + wtnt +Πt,
where it, kt, Rkt, and wt represent investment, capital, the rental rate, and
the wage rate, respectively.
Since the firm rents capital and hires labor each period, it solves the
following static problem:
Πt = max_{Kt,Nt} F (Kt, At Nt) − Rkt Kt − wt Nt.
A sequential competitive equilibrium consists of sequences of per capita quantities {ct, it, nt, kt+1}, aggregate quantities {Ct, It, Nt, Kt+1}, and prices {wt, Rkt} such that (i) given prices, the sequences {ct, it, nt, kt+1} solve the
household’s problem; (ii) given prices, {Kt, Nt} solves the firm’s problem;
and (iii) markets clear in that conditions in (14.4) hold.
From the household’s optimization problem, we can derive the first-order
conditions:
uc (ct, nt) = βuc (ct+1, nt+1) (Rkt+1 + 1− δ) , (14.7)
−un (ct, nt) / uc (ct, nt) = wt, (14.8)
and the transversality condition:
lim_{t→∞} β^t uc (ct, nt) kt+1 = 0. (14.9)
The first-order conditions from the firm’s problem are the same as in equa-
tion (14.3). After detrending, we can easily show that the equilibrium system
in this subsection is identical to that in the previous subsection.
Firms Own Capital
Consider a different sequential market arrangement in which firms own cap-
ital and make investment decisions. Households own firms and trade firm
shares in a stock market. Each household chooses consumption {ct}, share holdings {ψt+1}, and labor supply {nt} so as to maximize utility (14.1)
subject to the budget constraint:
g_n^t ct + ψt+1 (Vt − Dt) = ψt Vt + wt g_n^t nt,
where Vt is the cum-dividend stock market value of the firm and Dt represents dividends.
There is a continuum of identical firms of measure unity. Each firm
chooses labor and investment so as to maximize its stock market value at
any time t:
Vt = max_{It+j, Nt+j} ∑_{j=0}^∞ Mt,t+j Dt+j,
subject to
Dt+j = F (Kt+j , At+jNt+j)− wt+jNt+j − It+j ,
Kt+j+1 = (1− δ)Kt+j + It+j , (14.10)
where Mt,t+j is the discount factor or the pricing kernel. Mt,t+j gives the date-t value of a claim that delivers one unit of the consumption good at date t + j. Normalize Mt,t = 1.
A sequential competitive equilibrium consists of sequences of quantities {ct, nt, ψt+1}, {Kt+1, Nt, It}, and prices {Vt, wt, Mt,t+j} such that households and firms optimize and markets clear:
ψt = 1,  Ct = g_n^t ct,  Nt = g_n^t nt,

Ct + It = F (Kt, At Nt).
To analyze this equilibrium, we first note that the pricing kernel Mt,t+j
comes from the household’s optimization problem. The first-order condition
with respect to share holdings gives
uc (ct, nt) (Vt −Dt) = βuc (ct+1, nt+1)Vt+1.
Thus, we obtain an asset-pricing equation:
Vt = Dt + [β uc (ct+1, nt+1) / uc (ct, nt)] Vt+1. (14.11)
Solving this equation forward yields:
Vt = ∑_{j=0}^∞ [β^j uc (ct+j, nt+j) / uc (ct, nt)] Dt+j, (14.12)
where we have used the transversality condition to rule out bubbles:
lim_{j→∞} β^j uc (ct+j, nt+j) Vt+j = 0. (14.13)
Equation (14.12) states that the stock price is equal to the discounted
present value of the future dividends. To derive condition (14.13), we use
the transversality condition from the household problem:
lim_{t→∞} β^t uc (ct, nt) ψt+1 (Vt − Dt) = 0.
Using the equilibrium condition ψt+1 = 1 and equation (14.11) delivers (14.13).
Equation (14.12) implies that the pricing kernel is equal to the intertemporal
marginal rate of substitution:
Mt,t+j = β^j uc (ct+j, nt+j) / uc (ct, nt).
From the firm’s optimization problem, we can derive the first-order con-
ditions for Nt as in (14.3) and for It and Kt+1 :
It : Qt = 1,
Kt+1 : Qt =Mt,t+1 [FK (Kt+1, At+1Nt+1) + (1− δ)Qt+1] , (14.14)
where Qt+j Mt,t+j is the Lagrange multiplier associated with the capital accumulation equation (14.10). Here Qt is Tobin's marginal Q, which represents the marginal value to the firm of an additional unit of installed capital. Thus, Qt may also be interpreted as the price of capital.
More formally, we write the firm’s problem by dynamic programming:
Vt (Kt) = max_{It,Kt+1} Dt + Mt,t+1 Vt+1 (Kt+1)
subject to (14.10), where Vt (Kt) is the value of the firm with capital Kt.
Let Qt be the Lagrange multiplier associated with (14.10). By the envelope
theorem,
Qt = Mt,t+1 ∂Vt+1 (Kt+1) / ∂Kt+1.
Because there is no adjustment cost in the model, Qt = 1. We can then
show that equations (14.7) and (14.14) are equivalent. Consequently, we can
show that the equilibrium allocations implied by the two different sequential
market arrangements are identical.
This decentralization is very useful for studying asset pricing questions.
For example, equation (14.11) implies that the (gross) return on capital or
the stock return is given by
Vt+1 / (Vt − Dt) = uc (ct, nt) / [β uc (ct+1, nt+1)] = Rkt+1 + 1 − δ,
where we have used equation (14.7). This return is also equal to the risk-free
rate if households can trade risk-free bonds.
Social Planner’s Problem
We have considered three decentralized market arrangements.2 We have
shown that they imply identical equilibrium allocations. Are these alloca-
tions Pareto optimal? To answer this question, we consider a social planner’s
problem in which the social planner chooses consumption, labor, and invest-
ment so as to maximize a representative household’s utility subject to the
resource constraint:
max_{ct,nt,it} ∑_{t=0}^∞ β^t g_n^t u (ct, nt)

subject to

g_n^t ct + g_n^t it = F (g_n^t kt, At g_n^t nt) = Yt,

gn kt+1 = (1 − δ) kt + it.
We can derive the first-order conditions:
uc (ct, nt) = βuc (ct+1, nt+1) [FK (kt+1, At+1nt+1) + 1− δ] ,
−un (ct, nt) / uc (ct, nt) = At FN (kt, At nt),
ct + it = F (kt, Atnt) ,
2 There exist other market arrangements that give identical equilibrium allocations. See Stokey, Lucas and Prescott (1989) and Ljungqvist and Sargent (2004) for examples.
and the transversality condition given by (14.9). Defining detrended vari-
ables as in the previous analysis, we can show that
g_a^σ uc (c̃t, nt) = β uc (c̃t+1, nt+1) [FK (k̃t+1, nt+1) + 1 − δ],

−un (c̃t, nt) / uc (c̃t, nt) = FN (k̃t, nt),

c̃t + ĩt = F (k̃t, nt),

gn ga k̃t+1 = (1 − δ) k̃t + ĩt.
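The factor g_a^σ in the detrended Euler equation can be traced directly to the balanced-growth utility (14.5). A short derivation, using uc(c, n) = c^{−σ} v(n) and the detrending c_t = g_a^t c̃_t:

```latex
\begin{aligned}
u_c(c_t, n_t) &= c_t^{-\sigma} v(n_t)
  = g_a^{-\sigma t}\,\tilde{c}_t^{-\sigma} v(n_t)
  = g_a^{-\sigma t}\, u_c(\tilde{c}_t, n_t).\\
\intertext{Substituting into the Euler equation
  $u_c(c_t,n_t)=\beta\, u_c(c_{t+1},n_{t+1})\,[F_K + 1-\delta]$
  and multiplying both sides by $g_a^{\sigma(t+1)}$ gives}
g_a^{\sigma}\, u_c(\tilde{c}_t, n_t)
  &= \beta\, u_c(\tilde{c}_{t+1}, n_{t+1})
     \left[F_K(\tilde{k}_{t+1}, n_{t+1}) + 1 - \delta\right].
\end{aligned}
```

An analogous substitution cancels the growth factors in the intratemporal condition, which is why no such factor appears there.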
Comparing with the equilibrium systems derived earlier, we can see that
the competitive equilibrium allocations are Pareto optimal. Conversely, we
can also find equilibrium prices to support the Pareto optimal allocation for
different market arrangements.
Recursive Competitive Equilibrium
Mehra and Prescott (1980) develop an alternative approach to summarize
competitive equilibrium behavior. This approach defines equilibrium by
time-invariant decision rules and other functions that characterize prices
and state transitions, rather than by sequences of quantities and prices.
These decision rules specify current actions as a function of a limited num-
ber of state variables which fully summarize the effects of past decisions
and current information. Knowledge of these state variables provides the
economic agents with a full description of the economy’s current state. The
state transition equation determines the state in the next period, giving rise
to a recursive structure of the equilibrium. This type of equilibrium is called
recursive competitive equilibrium. This recursive approach makes it natural to apply the powerful dynamic programming theory developed earlier to analyze both theoretical issues and empirical questions. It has become
the workhorse in modern macroeconomics. The idea of defining equilibrium recursively has also been applied to define recursive equilibria for noncompetitive economies.
In order to apply the concept of recursive competitive equilibrium, the
equilibrium decision rules must be time invariant. Because the previously
presented Ramsey model features long-run growth, one has to detrend vari-
ables suitably. For ease of exposition, we shall consider the special case
without long-run growth to illustrate the equilibrium concept. Suppose
ga = gn = 1.
We consider the market arrangement in which households own capital.
The equilibrium rental rate and wage rate are functions of the aggregate
state, the aggregate capital stock K. From the firm’s optimization problem,
we can show that these functions satisfy
FK (K, N) = Rk (K),  FN (K, N) = W (K). (14.15)
Next, we write a household’s decision problem by dynamic programming:
V (k, K) = max_{c,n,k′} u (c, n) + β V (k′, K′),
subject to
c+ k′ − (1− δ) k = kRk (K) +W (K)n,
K ′ = G (K) . (14.16)
Here k is the individual state of the capital stock. Because prices depend on
the aggregate state and households take equilibrium sequences of prices as
given when making decisions, the aggregate state K enters each household’s
dynamic programming problem. In addition, households must form rational
expectations about the transition of the aggregate state, which is given by
(14.16). This transition or law of motion determines the evolution of prices.
A recursive competitive equilibrium consists of the value function V (k,K) ,
decision rules c = gc (k,K), n = gn (k,K) , k′ = g (k,K) , the law of motion
for K in (14.16), and price functions Rk (K) and W (K) such that: (i) Given
the price functions and the law of motion for K, the value function and de-
cision rules solve the household’s dynamic programming problem. (ii) Price
functions satisfy (14.15). (iii) Markets clear in that
k = K, k′ = K ′, g (K,K) = G (K) , c = C,n = N,
K ′ − (1− δ)K + C = F (K,N) .
Because all households are identical, the individual choices must be equal
to the aggregate choices in equilibrium. In addition, the individual law of
motion for capital must be consistent with the aggregate law of motion in
equilibrium.
Given a recursive competitive equilibrium, equilibrium sequences of prices
and quantities can be naturally generated. One can show that these se-
quences are equivalent to those in the sequential competitive equilibrium
analyzed earlier. Conversely, one can often show that a sequential com-
petitive equilibrium can be characterized recursively. But sometimes it can
be nontrivial to represent a sequential competitive equilibrium in a recur-
sive way. The difficulty comes from the selection of state variables. We
will see later that selecting suitable state variables for some problems needs
ingenious thinking.
14.1.2 Incorporating Fiscal Policy
In this subsection, we use the deterministic growth model to study fiscal
policy. We abstract from long-run growth. Consider the decentralization
in which firms own capital and make investment decisions. Consider a tax
system in which households pay dividends taxes, capital gains taxes, con-
sumption taxes and labor income taxes. The treatment of capital gains
taxes is complicated because these taxes are realization based rather than
408 CHAPTER 14. NEOCLASSICAL GROWTH MODELS
accrual based. In addition, the capital gains tax rate is usually lower than
the dividend tax rate. Hence, firms have an incentive to repurchase shares
instead of paying dividends. This requires introducing some frictions in
the model to limit share repurchases. To avoid this complication, we fol-
low King (1977), Auerbach (1979), Bradford (1981), and McGrattan and
Prescott (2005) and assume that dividend and capital gains taxes together are modeled as distribution taxes.
The representative household’s problem is given by
max_{Ct,Nt,ψt+1} E ∑_{t=0}^∞ β^t u (Ct, Nt), (14.17)
subject to
(1 + τ_t^s) Ct + ψt+1 (Vt − (1 − τ_t^d) Dt) + Bt+1 / Rt = ψt Vt + (1 − τ_t^n) wt Nt + Bt + Tt,
where τ_t^s, τ_t^d, and τ_t^n are the tax rates on consumption, distributions, and labor income, respectively. Here Bt represents holdings of riskless government bonds and Tt represents lump-sum taxes. First-order conditions are given
by
Ct : uC (Ct, Nt) = (1 + τ_t^s) Λt,

Nt : −uN (Ct, Nt) = (1 − τ_t^n) wt Λt,

ψt+1 : Λt (Vt − (1 − τ_t^d) Dt) = β Λt+1 Vt+1,

Bt+1 : Λt = β Λt+1 Rt.
The representative firm’s problem is given by
Vt = max_{It,Kt+1} (1 − τ_t^d) Dt + (β Λt+1 / Λt) Vt+1,

subject to

Dt = (1 − τ_t^c) (F (Kt, Nt) − wt Nt) − It + τ_t^c δ Kt,
Kt+1 = (1− δ)Kt + It, (14.18)
where τ_t^c is the corporate income tax rate. First-order conditions are given
by
It : Qt = 1 − τ_t^d, (14.19)

Kt+1 : Qt = (β Λt+1 / Λt) [(1 − δ) Qt+1 + (1 − τ_{t+1}^d) τ_{t+1}^c δ + (1 − τ_{t+1}^d) (1 − τ_{t+1}^c) FK (Kt+1, Nt+1)], (14.20)
where Qt represents the price of capital (Tobin’s marginal Q).
The government budget constraint is given by
Gt + Bt = Bt+1 / Rt + τ_t^s Ct + τ_t^d Dt + τ_t^c (F (Kt, Nt) − wt Nt) + τ_t^n wt Nt + Tt.
To close the model, we impose the market clearing condition ψt = 1 and the
resource constraint
Ct + It +Gt = F (Kt, Nt) = Yt.
McGrattan and Prescott (2005) prove the following result.
Proposition 14.1.1 Ex-dividend equity value satisfies

Vt − (1 − τ_t^d) Dt = (1 − τ_t^d) Kt+1.
Proof. Multiplying both sides of equation (14.20) by Kt+1 yields:

Qt Kt+1 = (β Λt+1 / Λt) [(1 − δ) Qt+1 Kt+1 + (1 − τ_{t+1}^d) τ_{t+1}^c δ Kt+1 + (1 − τ_{t+1}^d) (1 − τ_{t+1}^c) FK (Kt+1, Nt+1) Kt+1]

= (β Λt+1 / Λt) (1 − τ_{t+1}^d) (1 − τ_{t+1}^c) (F (Kt+1, Nt+1) − wt+1 Nt+1) + (β Λt+1 / Λt) [Qt+1 (Kt+2 − It+1) + (1 − τ_{t+1}^d) τ_{t+1}^c δ Kt+1]

= (β Λt+1 / Λt) (1 − τ_{t+1}^d) (1 − τ_{t+1}^c) (F (Kt+1, Nt+1) − wt+1 Nt+1) + (β Λt+1 / Λt) [Qt+1 Kt+2 − (1 − τ_{t+1}^d) It+1 + (1 − τ_{t+1}^d) τ_{t+1}^c δ Kt+1]

= (β Λt+1 / Λt) ((1 − τ_{t+1}^d) Dt+1 + Qt+1 Kt+2),
where the second equality follows from substituting the capital accumulation equation and using the linear homogeneity of F, the third equality follows from (14.19), and the last equality follows from the definition of dividends Dt+1. Given the transversality condition,
lim_{T→∞} β^T ΛT QT KT+1 = 0,
we find
Qt Kt+1 = ∑_{j=1}^∞ (β^j Λt+j / Λt) (1 − τ_{t+j}^d) Dt+j.
Thus, QtKt+1 is the ex-dividend equity value.
What is the impact of changes in the distribution tax rate? Consider
a permanent tax change in τd when all tax rates are constant over time.
Suppose that lump-sum taxes are levied to balance the government budget.
In this case, equations (14.19) and (14.20) imply that
1 = (β Λt+1 / Λt) [(1 − δ) + τ^c δ + (1 − τ^c) FK (Kt+1, Nt+1)]. (14.21)
Thus, the distribution tax does not change investment decisions, but it
changes the price of capital and hence equity value. More strikingly, it
does not affect the equilibrium allocation. When other distortionary taxes
are levied to balance the government budget, the change in the distribution
tax will be distortionary.
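The irrelevance of a constant distribution tax can be verified by substituting Qt = Qt+1 = 1 − τ^d from (14.19) into (14.20) with constant tax rates:

```latex
1 - \tau^d
  = \frac{\beta \Lambda_{t+1}}{\Lambda_t}
    \Bigl[(1-\delta)(1-\tau^d) + (1-\tau^d)\,\tau^c \delta
          + (1-\tau^d)(1-\tau^c)\, F_K(K_{t+1}, N_{t+1})\Bigr].
```

Dividing both sides by 1 − τ^d yields (14.21), in which the distribution tax rate no longer appears.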
Notice that the above result depends on an implicit assumption that
firms finance investment by retained earnings so that Dt > 0 for all t. When Dt < 0, investors must inject new equity into firms. That is, firms must rely on external new equity to finance investment. New equity changes capital
gains and hence triggers capital gains taxes in the U.S. tax system. In-
troducing both capital gains taxes and dividend taxes separately, one can
show that dividend taxes affect investment (Poterba and Summers (1985)).
Gourio and Miao (2010) document empirical evidence that in the cross sec-
tion some firms use external equity financing and some firms use internal
financing. They then build a model with firm heterogeneity in idiosyncratic
productivity shocks. They calibrate the model and study the long-run quan-
titative impact of the 2003 dividend and capital gains tax reform. Gourio
and Miao (2011) study transitional dynamics when the tax reform is tem-
porary.
In the above model, we have assumed that lump-sum taxes are available
so that whenever there is a change in tax rates, we use lump-sum taxes as a
residual to balance the government budget. When there is no lump-sum tax,
we have to adjust either some particular tax or government expenditures to balance the government budget constraint. Since this adjustment is arbitrary,
there could be many equilibria.
14.2 Real Business Cycle Theory
Real business cycle (RBC) theory was pioneered by Kydland and Prescott (1982) and Long and Plosser (1983). The basic idea of this theory is to
introduce uncertainty into the neoclassical growth model. The resulting
stochastic growth framework is then used to study business cycles quan-
titatively. This theory argues that technology shocks account for most of
business cycles. Though this view is highly debated, RBC theory provides
a profound methodological contribution and has become the cornerstone for
business cycle analysis. In particular, the use of a quantitative analysis of
dynamic stochastic general equilibrium (DSGE) models has become a stan-
dard procedure. In this section, we first present a basic RBC model and
then discuss various extensions. We refer the reader to King and Rebelo
(1999) for an excellent survey of the RBC theory.
412 CHAPTER 14. NEOCLASSICAL GROWTH MODELS
14.2.1 A Basic RBC Model
We introduce uncertainty into the basic Ramsey model in the form of ag-
gregate total factor productivity (TFP) shocks. We emphasize that our
previous analysis regarding Pareto optimality and decentralized competi-
tive equilibrium still applies to the stochastic case. We ignore long-run
growth and consider some specific functional forms. We start with the social planner's problem:
$$\max_{\{C_t,N_t,I_t\}}\; E\sum_{t=0}^{\infty}\beta^t\left[\ln C_t + \chi\ln\left(1-N_t\right)\right], \qquad (14.22)$$
subject to
$$C_t + I_t = z_t K_t^{\alpha} N_t^{1-\alpha} = Y_t, \qquad (14.23)$$
$$K_{t+1} = (1-\delta)K_t + I_t, \qquad (14.24)$$
$$\ln z_t = \rho \ln z_{t-1} + \sigma\varepsilon_t, \quad \varepsilon_t \sim \text{IID } N(0,1), \qquad (14.25)$$
where $\chi$ measures the utility weight on leisure and the technology shock $\{z_t\}$ is modeled as an AR(1) process. The optimal allocation $\{C_t, N_t, K_{t+1}, I_t, Y_t\}$ satisfies the following system of nonlinear difference equations:
$$\frac{1}{C_t} = E_t\,\frac{\beta}{C_{t+1}}\left[\alpha z_{t+1} K_{t+1}^{\alpha-1} N_{t+1}^{1-\alpha} + 1 - \delta\right], \qquad (14.26)$$
$$\frac{\chi C_t}{1-N_t} = (1-\alpha)\, z_t K_t^{\alpha} N_t^{-\alpha}, \qquad (14.27)$$
and (14.23) and (14.24). The usual transversality condition also holds. In a decentralized equilibrium in which households own capital and make real investment, the rental rate and the wage rate are given by
$$R_{kt} = \alpha z_t K_t^{\alpha-1} N_t^{1-\alpha} \quad \text{and} \quad w_t = (1-\alpha)\, z_t K_t^{\alpha} N_t^{-\alpha}.$$
The next step is to parameterize the model and solve it numerically. The RBC approach advocates a calibration methodology. This method assigns parameter values based on either long-run steady-state relationships or moments from other studies such as microeconomic evidence.
Steady State. Now, we solve for the nonstochastic steady state, in which $z_t = 1$ for all $t$. We use $X$ to denote the steady-state value of a variable $X_t$. By the Euler equation (14.26),
$$1 = \beta\left[\alpha K^{\alpha-1} N^{1-\alpha} + 1 - \delta\right].$$
It follows that
$$\frac{K}{N} = \left(\frac{\alpha}{1/\beta - (1-\delta)}\right)^{\frac{1}{1-\alpha}}.$$
Hence, the steady-state factor prices are given by
$$R_k = \alpha\frac{Y}{K} = \alpha\left(\frac{K}{N}\right)^{\alpha-1}, \quad w = (1-\alpha)\left(\frac{K}{N}\right)^{\alpha}.$$
By (14.24),
$$I = \delta K = \delta\frac{K}{N}N.$$
Thus, given $N$, we can solve for $I$ and also
$$Y = \left(\frac{K}{N}\right)^{\alpha} N.$$
Finally, the steady-state consumption $C$ comes from the resource constraint (14.23).
Note that we have not used the equilibrium condition (14.27). We shall use this equation to solve for $\chi$:
$$\chi = (1-\alpha)\left(\frac{K}{N}\right)^{\alpha}\frac{1-N}{C}. \qquad (14.28)$$
The idea is that, given the other parameter values, there is a one-to-one mapping between $\chi$ and $N$. Here, we are using a reverse engineering approach to calibrate a parameter using an endogenous variable as the input. This approach is very useful for complicated models because it avoids solving a system of nonlinear equations for the endogenous variables given exogenous parameter values.
Calibration. We are ready to assign parameter values. One period corresponds to a quarter. First, we set α = 0.33, which is consistent with the fact that the average labor share of income is about 2/3 in the U.S. We set β = 0.99, implying a quarterly interest rate of 1/β − 1 ≈ 1%, or about 4% per year. Set δ = 0.025, implying that the annual depreciation rate is about 10%. Take N = 0.33, implying that households work 8 hours out of 24 hours a day. Given this value of N and the other calibrated parameter values, we can pin down the parameter χ = 1.75 and also solve for all other steady-state values.3
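The steady-state formulas above can be collected into a short script. The book's own code is calibration.m in Matlab; the Python sketch below reproduces the same computation. With the stated parameters it yields χ ≈ 1.78, slightly above the 1.75 reported in the text; the small gap presumably reflects rounding in the reported parameter values.

```python
import numpy as np

# Baseline quarterly calibration from the text; chi is backed out from the
# intratemporal condition (14.28) given the target N = 0.33.
alpha, beta, delta, N = 0.33, 0.99, 0.025, 0.33

k_n = (alpha / (1 / beta - (1 - delta))) ** (1 / (1 - alpha))  # K/N
y_n = k_n ** alpha                                             # Y/N
K = k_n * N
Y = y_n * N
I = delta * K                                                  # steady-state investment
C = Y - I                                                      # resource constraint (14.23)
chi = (1 - alpha) * k_n ** alpha * (1 - N) / C                 # equation (14.28)

print(f"K/N = {k_n:.2f}, Y = {Y:.3f}, C = {C:.3f}, chi = {chi:.2f}")
```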
We can use the Solow residuals $\{SR_t\}$ for U.S. data, constructed below, to estimate the parameters $\rho$ and $\sigma$ in two steps. First, we estimate the following equation by OLS:
$$SR_t = \beta_0 + \beta_1 t + u_t. \qquad (14.29)$$
Next, we fit an AR(1) to the estimated residuals by OLS:
$$u_t = \rho u_{t-1} + \sigma\varepsilon_t, \qquad (14.30)$$
where $u_t$ corresponds to $\ln z_t$ in the basic RBC model. We then obtain $\rho = 0.99$ and $\sigma = 0.0089$.
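The two-step procedure in (14.29)–(14.30) is plain OLS and is easy to sketch. The following Python snippet is a stand-in for the Matlab workflow; the data here are synthetic (a trend plus an AR(1) deviation with known parameters), not the actual Solow residual series.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "Solow residual": linear trend plus an AR(1) deviation.
T, rho_true, sigma_true = 5000, 0.9, 0.01
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho_true * u[t - 1] + sigma_true * rng.standard_normal()
sr = 0.5 + 0.001 * np.arange(T) + u

# Step 1: regress SR_t on a constant and a time trend (equation 14.29).
X = np.column_stack([np.ones(T), np.arange(T)])
b = np.linalg.lstsq(X, sr, rcond=None)[0]
u_hat = sr - X @ b

# Step 2: regress the detrended residual on its own lag (equation 14.30).
rho_hat = np.linalg.lstsq(u_hat[:-1, None], u_hat[1:], rcond=None)[0][0]
eps = u_hat[1:] - rho_hat * u_hat[:-1]
sigma_hat = eps.std(ddof=1)

print(f"rho_hat = {rho_hat:.3f}, sigma_hat = {sigma_hat:.4f}")
```

With a long enough sample the estimates recover the true values closely; on actual quarterly data the sample is far shorter, which is one reason the estimated persistence is so close to unity.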
Log-linearized System. Once parameter values are assigned, the model can be solved numerically. There are many different methods for solving this model, as discussed in previous chapters. It turns out that the log-linearization method is extremely useful in balancing speed against accuracy (see Aruoba, Fernández-Villaverde and Rubio-Ramírez (2006)). Dynare can deliver the log-linearized solution in a second. Before doing this, we perform the log-linearization by hand in order to understand the workings of the RBC model. We use $\hat{X}_t$ to denote the log deviation of a variable $X_t$ from its steady state.
3The Matlab code for calibration is calibration.m.
Start with equation (14.27). Taking logarithms and totally differentiating both sides of this equation yields
$$\hat{C}_t + \frac{N}{1-N}\hat{N}_t = \hat{z}_t + \alpha\hat{K}_t - \alpha\hat{N}_t.$$
Solving yields
$$\hat{N}_t = \frac{1}{\alpha+\zeta}\hat{z}_t + \frac{\alpha}{\alpha+\zeta}\hat{K}_t - \frac{1}{\alpha+\zeta}\hat{C}_t, \qquad (14.31)$$
where $\zeta \equiv N/(1-N)$.
Substituting (14.24) into (14.23) and totally differentiating, we obtain
$$dC_t + dK_{t+1} - (1-\delta)\,dK_t = K^{\alpha}N^{1-\alpha}dz_t + \alpha K^{\alpha-1}N^{1-\alpha}dK_t + (1-\alpha)K^{\alpha}N^{-\alpha}dN_t.$$
Thus,
$$C\hat{C}_t + K\hat{K}_{t+1} - (1-\delta)K\hat{K}_t = K^{\alpha}N^{1-\alpha}\hat{z}_t + \alpha K^{\alpha}N^{1-\alpha}\hat{K}_t + (1-\alpha)K^{\alpha}N^{1-\alpha}\hat{N}_t.$$
Solving for $\hat{K}_{t+1}$ yields
$$\hat{K}_{t+1} = \frac{Y}{K}\left[\hat{z}_t + \alpha\hat{K}_t + (1-\alpha)\hat{N}_t\right] - \frac{C}{K}\hat{C}_t + (1-\delta)\hat{K}_t = \frac{1}{\beta}\hat{K}_t + \frac{Y}{K}\left[\hat{z}_t + (1-\alpha)\hat{N}_t\right] - \frac{C}{K}\hat{C}_t,$$
where we have used the steady-state Euler equation
$$1 = \beta\left(\alpha\frac{Y}{K} + 1 - \delta\right).$$
Substituting equation (14.31) for $\hat{N}_t$ yields
$$\hat{K}_{t+1} = \left(\frac{1}{\beta} + \frac{1-\alpha}{\alpha+\zeta}\,\alpha\frac{Y}{K}\right)\hat{K}_t + \frac{Y}{K}\,\frac{1+\zeta}{\zeta+\alpha}\,\hat{z}_t - \left(\frac{C}{K} + \frac{1-\alpha}{\alpha+\zeta}\frac{Y}{K}\right)\hat{C}_t. \qquad (14.32)$$
Total differentiation of the Euler equation yields
$$-\frac{dC_t}{C^2} = -E_t\frac{\beta\,dC_{t+1}}{C^2}\left[\alpha K^{\alpha-1}N^{1-\alpha} + 1 - \delta\right] + E_t\frac{\beta}{C}\left[\alpha K^{\alpha-1}N^{1-\alpha}dz_{t+1} + \alpha(\alpha-1)K^{\alpha-2}N^{1-\alpha}dK_{t+1}\right] + E_t\frac{\beta}{C}\,\alpha(1-\alpha)K^{\alpha-1}N^{-\alpha}dN_{t+1}.$$
It follows that
$$-\hat{C}_t = -E_t\hat{C}_{t+1} + \beta\alpha\frac{Y}{K}E_t\left[\hat{z}_{t+1} + (\alpha-1)\hat{K}_{t+1} + (1-\alpha)\hat{N}_{t+1}\right].$$
Substituting for $\hat{N}_{t+1}$ using (14.31) yields
$$-\hat{C}_t = -E_t\hat{C}_{t+1} + \beta\alpha\frac{Y}{K}E_t\left[\hat{z}_{t+1} + (\alpha-1)\hat{K}_{t+1}\right] + (1-\alpha)\beta\alpha\frac{Y}{K}E_t\left[\frac{1}{\alpha+\zeta}\hat{z}_{t+1} + \frac{\alpha}{\alpha+\zeta}\hat{K}_{t+1} - \frac{1}{\alpha+\zeta}\hat{C}_{t+1}\right].$$
Simplifying yields
$$-\hat{C}_t = -\left[1 + (1-\alpha)\beta\alpha\frac{Y}{K}\frac{1}{\alpha+\zeta}\right]E_t\hat{C}_{t+1} + \beta\alpha\frac{Y}{K}\,\frac{1+\zeta}{\alpha+\zeta}\,E_t\hat{z}_{t+1} - \beta\alpha(1-\alpha)\frac{Y}{K}\,\frac{\zeta}{\alpha+\zeta}\,E_t\hat{K}_{t+1}. \qquad (14.33)$$
Equations (14.32) and (14.33), combined with the shock process in (14.25), constitute a linear system of difference equations for $\{\hat{C}_t, \hat{K}_{t+1}\}$. To analyze this system, we plot the phase diagram for the corresponding deterministic system:
$$\hat{K}_{t+1} - \hat{K}_t = \left(\frac{1}{\beta} - 1 + \frac{1-\alpha}{\alpha+\zeta}\,\alpha\frac{Y}{K}\right)\hat{K}_t + \frac{Y}{K}\,\frac{1+\zeta}{\zeta+\alpha}\,\hat{z}_t - \left(\frac{C}{K} + \frac{1-\alpha}{\alpha+\zeta}\frac{Y}{K}\right)\hat{C}_t, \qquad (14.34)$$
$$\hat{C}_{t+1} - \hat{C}_t = -(1-\alpha)\beta\alpha\frac{Y}{K}\,\frac{1}{\alpha+\zeta}\,\hat{C}_{t+1} + \beta\alpha\frac{Y}{K}\,\frac{1+\zeta}{\alpha+\zeta}\,\hat{z}_{t+1} - \beta\alpha(1-\alpha)\frac{Y}{K}\,\frac{\zeta}{\alpha+\zeta}\,\hat{K}_{t+1}.$$
Figure 14.1: Phase diagram for the log-linearized basic RBC model.
First, consider the isocline $\Delta\hat{C}_t = \hat{C}_{t+1} - \hat{C}_t = 0$:
$$\hat{C}_{t+1} = \frac{1+\zeta}{1-\alpha}\hat{z}_{t+1} - \zeta\hat{K}_{t+1}.$$
Figure 14.1 plots this equation after a one-period lag. This line is downward sloping.
Next, consider the isocline $\Delta\hat{K}_t = \hat{K}_{t+1} - \hat{K}_t = 0$:
$$\hat{C}_t = \left(\frac{C}{K} + \frac{1-\alpha}{\alpha+\zeta}\frac{Y}{K}\right)^{-1}\left[\left(\frac{1}{\beta} - 1 + \frac{1-\alpha}{\alpha+\zeta}\,\alpha\frac{Y}{K}\right)\hat{K}_t + \frac{Y}{K}\,\frac{1+\zeta}{\zeta+\alpha}\,\hat{z}_t\right]. \qquad (14.35)$$
This line is upward sloping.
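As a numerical sanity check on the two isoclines (this is not part of the book's code), one can compute the slope coefficients under the baseline calibration and verify their signs: the $\Delta\hat{K}_t = 0$ line in (14.35) slopes up and the $\Delta\hat{C}_t = 0$ line slopes down.

```python
import numpy as np

alpha, beta, delta, N = 0.33, 0.99, 0.025, 0.33
zeta = N / (1 - N)

# Steady-state ratios implied by the baseline calibration.
k_n = (alpha / (1 / beta - (1 - delta))) ** (1 / (1 - alpha))
y_k = k_n ** (alpha - 1)           # Y/K
c_k = y_k - delta                  # C/K from the resource constraint

# Slope of the Delta K = 0 isocline (14.35): dC-hat/dK-hat holding z fixed.
slope_K = (1 / beta - 1 + (1 - alpha) / (alpha + zeta) * alpha * y_k) / (
    c_k + (1 - alpha) / (alpha + zeta) * y_k)

# Slope of the Delta C = 0 isocline: C-hat = const - zeta * K-hat.
slope_C = -zeta

print(f"Delta K = 0 slope: {slope_K:.3f} (> 0), Delta C = 0 slope: {slope_C:.3f} (< 0)")
```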
We now find the saddle path. When $\hat{K}_t$ is above the isocline $\hat{C}_t = \hat{C}_{t+1}$, consumption is expected to fall because high $\hat{K}_t$ implies a low marginal product of capital and hence a low interest rate. When $\hat{C}_t$ is above the isocline $\hat{K}_{t+1} = \hat{K}_t$, capital is expected to fall because households save less. Hence, there is a saddle path converging to the steady state. Given any initial state for $K_0$ and $z_0$, there is a unique solution $C_0$, which must lie on the saddle path. We can then determine the whole paths of $C_t$ and $K_t$. In addition, we can also determine the other equilibrium variables $N_t$, $w_t$, $R_{kt}$, $I_t$, and $Y_t$.
Consider first $N_t$. Instead of using (14.31), it is more instructive to look at the labor market. By equation (14.27), labor supply is given by
$$\frac{\chi C_t}{1-N_t} = w_t. \qquad (14.36)$$
Thus, labor supply increases with $w_t$ and decreases with $C_t$. The dependence on consumption captures the wealth effect. The log-linearized labor supply equation is given by
$$\frac{N}{1-N}\hat{N}_t = -\hat{C}_t + \hat{w}_t.$$
This curve is upward sloping. The log-linearized labor demand equation is given by
$$\hat{w}_t = \hat{z}_t + \alpha\hat{K}_t - \alpha\hat{N}_t. \qquad (14.37)$$
This curve is downward sloping. The intersection point gives the equilibrium $\hat{N}_t$ and $\hat{w}_t$, as illustrated in Figure 14.2. Once we know $\hat{C}_t$ and $\hat{z}_t$, we can determine $\hat{N}_t$ and $\hat{w}_t$, since $K_t$ is predetermined.
After deriving $\hat{N}_t$, we can determine equilibrium output using the log-linearized equation
$$\hat{Y}_t = \alpha\hat{K}_t + (1-\alpha)\hat{N}_t + \hat{z}_t.$$
We can also determine the rental rate $R_{kt}$. Since $R_{kt} = \alpha z_t K_t^{\alpha-1}N_t^{1-\alpha} = \alpha Y_t/K_t$, we get
$$\hat{R}_{kt} = \hat{Y}_t - \hat{K}_t.$$
Figure 14.2: Labor market equilibrium for the log-linearized basic RBC model.
Finally, we can solve for financial returns using the decentralization in which firms own capital and make investment. As in the deterministic case, we can show that the ex-dividend stock return is given by
$$R_{st} = R_{kt} + 1 - \delta,$$
so that its deviation from the steady state satisfies $dR_{st} = dR_{kt}$. We will come back to this equation in Section 14.2.2 below.
Turn to the risk-free rate, or the interest rate $r_{ft+1}$, which satisfies the Euler equation
$$\frac{1}{C_t} = \beta E_t\frac{1+r_{ft+1}}{C_{t+1}}.$$
Total differentiation yields
$$-\frac{dC_t}{C^2} = \beta E_t\left[\left(1+r_f\right)\frac{-dC_{t+1}}{C^2} + \frac{d\left(1+r_{ft+1}\right)}{C}\right].$$
For the interest rate, it is awkward to use the percentage deviation from the steady state, $dr_{ft+1}/r_f$. For example, when the interest rate goes up from 1% to 4%, it rises by 3 percentage points but by 300 percent. Thus, without risk of confusion, we define
$$\hat{r}_{ft+1} = r_{ft+1} - r_f \simeq \frac{d\left(1+r_{ft+1}\right)}{1+r_f}.$$
In terms of the gross interest rate $R_{ft+1} = 1 + r_{ft+1}$, we then have
$$\hat{R}_{ft+1} = \frac{dR_{ft+1}}{R_f} \simeq R_{ft+1} - R_f = \hat{r}_{ft+1}.$$
Given this definition, and noting that $r_{ft+1}$ is known at date $t$, we obtain the log-linearized Euler equation
$$\hat{r}_{ft+1} = E_t\hat{C}_{t+1} - \hat{C}_t.$$
This implies that the change in the interest rate reflects the expected change in consumption. Finally, we can derive investment simply from the resource constraint.
Business Cycle Statistics and Model Results. Business cycle statistics focus primarily on volatility, comovement (or cyclicality), and persistence. Volatility refers to the standard deviation of an HP-filtered series. Cyclicality refers to the contemporaneous correlation of a series with GDP. Persistence refers to the first-order autocorrelation of a series. It is also interesting to examine how strongly a series is correlated with output led or lagged a number of periods, so as to say something about which series are “lagging indicators” and which are “leading indicators.”
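The text takes the HP filter as given. As a reference point, here is a minimal dense-matrix implementation in Python (a sketch, not the book's code; production work would use a sparse solver or a packaged routine such as the one in statsmodels). The filter splits a series $y$ into a trend $\tau$ and a cycle $y - \tau$ by solving $(I + \lambda D'D)\tau = y$, where $D$ is the second-difference operator and $\lambda = 1600$ for quarterly data.

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Split a series into trend and cycle by solving (I + lam*D'D) tau = y,
    where D is the (T-2) x T second-difference operator."""
    T = len(y)
    D = np.diff(np.eye(T), n=2, axis=0)   # second differences of each column
    trend = np.linalg.solve(np.eye(T) + lam * D.T @ D, y)
    return trend, y - trend

# A linear trend plus a 32-quarter sine "cycle": the filter should leave a
# linear trend untouched and hand most of the sine back as the cycle.
t = np.arange(200)
y = 0.01 * t + 0.02 * np.sin(2 * np.pi * t / 32)
trend, cycle = hp_filter(y)
print(f"cycle std = {cycle.std():.4f}")
```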
The series we are most interested in looking at are output, consumption,
investment, total hours worked, the real wage rate, and the real interest
rate. In addition, we will look at average labor productivity (the ratio of
output to total hours worked), the price level, stock returns, and total factor
productivity (TFP). The price level is not in the basic RBC model, but will
appear in extended models. All series except for interest rate and stock
returns are HP filtered.
The sample period covers the first quarter of 1948 through the fourth
quarter of 2010. All series (with the exception of TFP, the price level, and
the interest rate) are expressed in per capita terms after dividing by the
civilian non-institutional population aged 16 and over. The civilian non-
institutional population aged 16 and over is downloaded from the Federal
Reserve Bank of St. Louis. All series are in logs. All nominal series (except
for the price level) are deflated by the price level. The price level is measured
as the implicit price deflator for GDP from Table 1.1.9 downloaded from the
Bureau of Economic Analysis (BEA). The nominal GDP, consumption, and
investment are taken from NIPA Table 1.1.5 downloaded from (BEA). Con-
sumption is measured as the sum of non-durable and services consumption.
Investment is measured as total private fixed investment plus consumption
expenditures on durable goods (durable goods should be thought of as in-
vestment because they provide benefits in the future, just like new physical
capital does).
Total hours are measured as total hours in the non-farm business sector,
from the Bureau of Labor Statistics (BLS). Labor productivity is output per
hour in the non-farm business sector, from the BLS. Wages are measured as
real compensation per hour in the non-farm business sector, from the BLS.
The nominal interest rate is measured as the three month Treasury Bill
rate. We compute the real interest rate as the nominal interest rate minus
the inflation rate, where inflation is measured from the GDP deflator.
TFP is measured by Solow residuals using the production function:
SRt = lnYt − α lnKt − (1− α) lnNt.
Constructing this series requires an empirical measure of the capital stock.
In practice, this is hard to measure and most existing capital stock series are
only available at an annual frequency. For example, BEA contains tables
of annual fixed assets. Typically researchers use the perpetual inventory
method to measure the capital stock. This method essentially takes data on
investment, an initial capital stock, and an estimate of the rate of depreciation, and constructs a series using the accumulation equation for capital:
Kt+1 = (1− δ)Kt + It.
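The perpetual inventory method is just this accumulation identity iterated forward. A minimal sketch (illustrative, not the book's code; in practice the initial capital stock is itself estimated, for example from initial investment relative to depreciation plus trend growth):

```python
import numpy as np

def perpetual_inventory(I, K0, delta=0.025):
    """Build a capital-stock series from investment data via
    K_{t+1} = (1 - delta) K_t + I_t."""
    K = np.empty(len(I) + 1)
    K[0] = K0
    for t, inv in enumerate(I):
        K[t + 1] = (1 - delta) * K[t] + inv
    return K

# Sanity check: with investment held at delta*K0, capital stays at K0.
K = perpetual_inventory(np.full(40, 0.025 * 10.0), K0=10.0)
print(K[-1])
```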
After constructing Solow residuals {SRt}, we can then use equations (14.29)
and (14.30) to estimate the TFP shock.
The selected moments from the model and from the data (in brackets)
are shown in Table 14.1. This table reveals the following stylized facts:
• Consumption is less volatile than output.
• Investment is about two times more volatile than output.
• Total hours worked have about the same volatility as output.
• Labor productivity is less volatile than output.
• The real wage is less volatile than output.
• All macroeconomic series are highly persistent.
• Most macroeconomic series are procyclical. Hours are strongly pro-
cyclical, but labor productivity and output are weakly correlated. The
real wage is essentially acyclical.
Table 14.1. Business Cycle Statistics

            Standard        Relative      1st Order     Contemp
            Deviation (%)   Std Dev       Autocorr      Corr with Y
Y           1.52 (1.85)     1.00 (1.00)   0.72 (0.85)   1 (1)
C           0.74 (1.17)     0.45 (0.63)   0.76 (0.86)   0.97 (0.80)
I           4.19 (4.41)     2.75 (2.38)   0.71 (0.84)   0.99 (0.62)
N           0.55 (1.95)     0.36 (1.05)   0.71 (0.90)   0.98 (0.82)
Y/N         0.99 (1.12)     0.65 (0.60)   0.74 (0.72)   0.99 (0.26)
w           0.99 (0.87)     0.65 (0.47)   0.74 (0.72)   0.99 (-0.06)
TFP         0.89 (1.10)     0.54 (0.60)   0.99 (0.80)   1.00 (0.68)
Price       NA (0.94)       NA (0.51)     NA (0.91)     NA (0.00)
ln Rf       0.05 (0.64)     0.04 (0.35)   0.71 (0.73)   0.95 (0.03)
ln Rs       0.05 (16.0)     0.03 (8.84)   0.71 (??)     0.96 (??)

Notes: All variables are in logarithms. The numbers in brackets are computed from the data. The other numbers are computed from the basic RBC model.
Table 14.1 also reveals that the basic RBC model explains U.S. business cycles reasonably well.4 However, the basic RBC model fails in many dimensions. First, consumption and hours are too smooth. Second, the real wage rate and labor productivity are too strongly procyclical. Finally, statistics related to the real interest rate and stock returns are far from the data.
Because the TFP shock is the driving force of business cycles, we next
examine the impact of the TFP shock more extensively below.
Impact of A Permanent TFP Shock. Suppose that there is an
unexpected permanent increase in TFP. In this case, both isoclines ΔCt = 0
and ΔKt = 0 shift up. Figure 14.3 plots this effect. After the permanent
shock, the economy moves to a new nonstochastic steady state. Figure 14.3
also plots the new saddle path. Initial consumption must jump to this path.
4The Dynare code is rbc1.mod.
Figure 14.3: Phase diagram after a permanent TFP shock for the log-linearized basic RBC model.
Since this path may be above or below the original steady state, the impact on initial consumption is ambiguous. An initial fall in consumption may seem counterintuitive, but it is possible if the initial rise in investment exceeds the increase in output. In the figure, we suppose that initial consumption jumps up. This rise is due to the wealth effect and is consistent with the permanent income hypothesis.
Turn to the labor market as illustrated in Figure 14.4. Higher TFP shifts
the labor demand curve out. High consumption shifts the labor supply curve
in. The net effect is to cause the wage rate to rise, but the impact on hours
is ambiguous. We suppose that hours rise.
Figure 14.4: The labor market responses following a permanent TFP shock for the log-linearized basic RBC model.
The interest rate must go up because consumption rises along the saddle
path. Since capital also rises along the saddle path, investment must rise
too.
Impact of A Temporary TFP Shock. Now, suppose that there is an unexpected, purely temporary TFP shock: $z_t$ increases only today and then remains at zero in the future. Because the changes in the isoclines last for only one period, the initial impact on consumption is very small. Consumption rises initially and then falls back to the original steady state along the saddle path. Thus, the labor supply curve changes very little initially. But the labor demand curve shifts up for one period, causing the
wage rate and hours to rise. The impact on hours is larger in the temporary
shock case than in the permanent shock case. Thus, the initial impact on
output is also larger in the temporary shock case than in the permanent
shock case. Since initial output goes up but consumption does not change
much, initial investment must rise. This rise is also higher than that in
the permanent shock case. The interest rate must fall on impact because
consumption is expected to fall. It then rises to the steady state level.
Effects of Persistence and Critiques of the RBC Model. As we have shown above, the persistence of the TFP shock plays an important role. Figures 14.5 and 14.6 show the impulse response functions for different values of the persistence of the TFP shock.5 These figures reveal that when the TFP shock is more persistent, consumption and real wages rise more on impact, but hours, investment and output rise less on impact. These figures also show that the RBC model generally lacks an amplification and propagation mechanism, in the sense that the impact effect on output is small and output does not exhibit a hump-shaped response (Cogley and Nason (1995)). The response of output essentially follows the same shape as the TFP shock. Much of the recent business cycle research has tried to extend the basic RBC model by introducing various new elements in order to provide an amplification and propagation mechanism. In addition, these extensions can overcome some of the limitations pointed out earlier.
14.2.2 Extensions
The basic RBC model has been extended in many ways. Since this model
has an explicit microfoundation, one may extend this model from either
the preference side or the technology side. We now discuss some extensions
along these two dimensions.
5The Matlab and Dynare codes are RBCrho.m and rbc2.mod.
[Figure 14.5 shows six panels (z, Y, C, N, I, K) of impulse responses for ρ = 0, 0.5, and 0.99.]
Figure 14.5: Comparison of impulse responses to a TFP shock with different levels of persistence.
Other Utility Functions

In the basic RBC model, we have adopted a utility specification within the class of models in King, Plosser and Rebelo (KPR for short) (1988). We now consider other utility specifications within this class. We also consider utility models outside the KPR class.
KPR Utility. The following specifications of KPR utility functions in (14.5) and (14.6) are widely adopted in the literature:
$$u(c,n) = \frac{\left(c\,(1-n)^{\chi}\right)^{1-\gamma}}{1-\gamma}, \qquad (14.38)$$
[Figure 14.6 shows four panels (W, Rk, Rs, risk-free rate) of impulse responses for ρ = 0, 0.5, and 0.99.]
Figure 14.6: Comparison of impulse responses to a TFP shock with different levels of persistence.
and
$$u(c,n) = \ln c - \chi\frac{n^{1+\nu}}{1+\nu}, \qquad (14.39)$$
where $\chi$, $\gamma$, and $1/\nu$ represent the utility weight on leisure, relative risk aversion, and the Frisch elasticity of labor supply, respectively. When $\gamma = 1$, (14.38) reduces to the utility in (14.22). When $\nu = 0$, (14.39) gives the indivisible labor model studied by Hansen (1985) and Rogerson (1988).
Using the previous calibration strategy, we still obtain χ = 1.75 in (14.38). Figure 14.7 plots the impulse responses for different values of γ in (14.38).6 This figure reveals that the impact effect on consumption is increasing in γ because households want smoother consumption. This means that the initial jump in labor is decreasing in γ. The intuition is as follows. When TFP increases, labor demand shifts right, independently of γ. When consumption increases, labor supply shifts left. The bigger the consumption increase, the bigger this inward shift in labor supply, and therefore the smaller the hours response in equilibrium and the larger the wage response. As a result, the initial jump in output is smaller. In addition, when γ is larger, the initial jumps in the stock return, the risk-free rate, and the capital rental rate are also smaller. This figure suggests that a smaller value of γ fits the data better. However, microeconomic evidence suggests that γ is around 2.
6The Matlab and Dynare codes are RBCgamma.m and rbc3.mod.
Figure 14.8 plots the impulse responses for different values of ν in (14.39).7 Note that when we change ν, we have to recalibrate the model to pin down χ, because the steady state changes with ν. Figure 14.8 reveals that the indivisible labor model (ν = 0) delivers much larger impact effects on output, hours, and investment. In addition, the initial jump in the wage is smaller. Thus, the indivisible labor model is very useful in improving the model's performance.
GHH Utility. Greenwood, Hercowitz and Huffman (1988) propose the following utility specification:
$$u(c,n) = U\left(c - G(n)\right), \qquad (14.40)$$
where $U$ and $G$ satisfy the usual assumptions. This utility function implies that the labor supply decision is determined by
$$G'(n) = w.$$
7The Matlab and Dynare codes are calibration2.m, RBCnu.m and rbc4.mod.
[Figure 14.7 shows nine panels (Y, C, N, I, K, W, Rk, Rs, risk-free rate) of impulse responses for γ = 0.75, 1, 5, and 10.]
Figure 14.7: Comparison of impulse responses to a TFP shock for different values of the risk aversion parameter.
Unlike with KPR utility, consumption does not affect labor supply. Thus, there is no wealth effect or intertemporal substitution effect in determining labor supply. The shift in labor demand alone determines the change in hours and the wage.

Habit Formation. The idea of the habit formation model is that people get utility from current consumption relative to past consumption. Habit formation can help resolve some empirical failings of the permanent income hypothesis. For example, it can help resolve the “excess smoothness” puzzle because the larger the degree of habit formation, the less consumption jumps in response to news about permanent income. Habit formation is also useful in resolving the equity premium puzzle. A large
[Figure 14.8 shows nine panels (Y, C, N, I, K, W, Rk, Rs, risk-free rate) of impulse responses for ν = 0, 1, 3, and 6.]
Figure 14.8: Comparison of impulse responses to a TFP shock for different values of the Frisch elasticity of labor supply.
degree of habit formation makes people behave as if they were extremely risk averse, and can thereby help explain a large equity premium without necessarily resorting to extremely large coefficients of relative risk aversion.
There are several different types of habit formation preferences. Here I consider the internal habit model widely used in macroeconomics:
$$E\sum_{t=0}^{\infty}\beta^t\left[\ln\left(C_t - bC_{t-1}\right) + \chi\ln\left(1-N_t\right)\right],$$
where $b$ is the habit parameter. In this case, the marginal utility of consumption is given by
$$\Lambda_t = \frac{1}{C_t - bC_{t-1}} - \beta b\,E_t\frac{1}{C_{t+1} - bC_t}.$$
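To see what this marginal utility looks like along a known path, here is a small sketch (a hypothetical helper, not from the text) that evaluates $\Lambda_t$ treating the expectation as the realized next-period value, as on a deterministic path. With $b = 0$ it collapses to the standard log-utility case $\Lambda_t = 1/C_t$.

```python
import numpy as np

def habit_marginal_utility(C, b, beta=0.99):
    """Lambda_t = 1/(C_t - b*C_{t-1}) - beta*b/(C_{t+1} - b*C_t) along a
    deterministic consumption path; returns Lambda_t for t = 1, ..., T-2."""
    C = np.asarray(C, dtype=float)
    return 1.0 / (C[1:-1] - b * C[:-2]) - beta * b / (C[2:] - b * C[1:-1])

C = np.linspace(1.0, 1.2, 10)          # a smoothly rising consumption path
lam_no_habit = habit_marginal_utility(C, b=0.0)
lam_habit = habit_marginal_utility(C, b=0.5)
print(lam_no_habit[:3])                # with b = 0 this is just 1/C_t
```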
[Figure 14.9 shows nine panels (Y, C, N, I, K, W, MU, Rs, risk-free rate) of impulse responses for b = 0, 0.5, and 0.8.]
Figure 14.9: Comparison of impulse responses to a TFP shock for different values of the habit formation parameter.
To study the impact of habit formation, I recalibrate the model and present impulse response functions for different values of b in Figure 14.9.8
This figure reveals that a higher degree of habit formation makes the initial responses of output, consumption and hours smaller, but makes the initial response of investment larger. The intuition is that the household cares about past consumption. The initial adjustment of consumption is more sluggish, but the fall in marginal utility is larger, when the degree of habit formation is larger. Hence, the negative wealth effect on labor supply is larger when b is larger. Habit formation is not useful for amplification. However,
8The Matlab and Dynare codes are clibration3.m, RBChabit.m, and rbc5.mod.
habit formation helps generate propagation. For example, for b = 0.8, both consumption and output follow hump-shaped responses. Figure 14.9 also shows that habit formation alone will not help generate a large equity premium, because the impulse responses of the stock return and the interest rate are similar for b = 0, 0.5, and 0.8. We will show below that introducing adjustment costs is important.
Capacity Utilization

The basic RBC model has weak amplification. Output can jump initially only because TFP and labor jump, since capital is predetermined. The idea of capacity utilization is to allow the intensity of the use of capital to vary in response to TFP shocks. The cost of high capacity utilization is that it makes capital depreciate faster. Assume that the dependence of capital depreciation on capacity utilization is modeled as
$$\delta(u_t) = \delta_0 u_t^{\delta_1}, \quad \delta_0 > 0,\ \delta_1 > 1,$$
where $u_t > 0$ represents the capacity utilization rate and the curvature parameter $\delta_1$ determines the intensity of capacity utilization. The capital accumulation equation becomes
$$K_{t+1} = \left(1-\delta(u_t)\right)K_t + I_t.$$
Let the production function be
$$Y_t = z_t\left(K_t u_t\right)^{\alpha} N_t^{1-\alpha}.$$
From the social planner's problem, we can derive the first-order conditions
$$u'(C_t) = \beta E_t\,u'(C_{t+1})\left(1 - \delta(u_{t+1}) + \alpha Y_{t+1}/K_{t+1}\right),$$
$$\delta'(u_t) = \alpha Y_t/\left(K_t u_t\right).$$
In a decentralized equilibrium in which households own capital, we assume that households rent capital $u_t K_t$ to the firm at the rental rate $R_{kt}$. The firm demands capital and labor by solving the problem
$$\max_{k_t,n_t}\; z_t k_t^{\alpha} n_t^{1-\alpha} - R_{kt}k_t - w_t n_t.$$
In equilibrium, $k_t = K_t u_t$ and $n_t = N_t$. We then obtain
$$R_{kt} = \frac{\alpha Y_t}{K_t u_t}.$$
Households choose capacity utilization optimally, so that
$$\delta'(u_t) = R_{kt}. \qquad (14.41)$$
The marginal benefit of an additional unit of capacity utilization is the rental rate $R_{kt}$, and the marginal cost is $\delta'(u_t)$. To calibrate this model, one may assume that the steady-state capacity utilization rate is equal to one. We can then use (14.41) to pin down $\delta_0\delta_1$. Setting $\delta_0 = 0.025$, we can then determine $\delta_1$.
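This calibration step is a one-liner. With $u = 1$ in steady state, the Euler equation gives $\alpha Y/K = 1/\beta - 1 + \delta_0$, and (14.41) then pins down $\delta_0\delta_1 = \alpha Y/K$. A sketch of the computation, using the baseline parameter values:

```python
# With u = 1 in steady state, the Euler equation implies
# alpha*Y/K = 1/beta - 1 + delta0, and delta'(1) = delta0*delta1 = alpha*Y/K.
alpha, beta, delta0 = 0.33, 0.99, 0.025

rk = 1 / beta - 1 + delta0        # steady-state rental rate alpha*Y/K
delta1 = rk / delta0              # from delta0*delta1 = rk

print(f"delta1 = {delta1:.3f}")   # about 1.40, satisfying delta1 > 1
```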
Capital or Investment Adjustment Costs

Without investment frictions, investment would respond too much to profitability shocks compared with the firm-level empirical evidence. A natural way to improve model performance is to introduce convex adjustment costs. These costs are important for generating movements in asset prices and hence are useful for understanding asset pricing phenomena. In addition, changes in asset prices provide a propagation mechanism in models with financial frictions. We now introduce convex adjustment costs into the basic RBC model.

There are two general ways to model convex adjustment costs. First, changes in the capital stock incur costs. In this case, adjustment costs can be modeled as a nonlinear function of investment. The capital stock is often
introduced in this function too. We have presented this modeling in partial
equilibrium in Section 8.5.2. Second, changes in investment incur costs. In
this case, adjustment costs can be modeled as a nonlinear function of the
change in investment. For both modeling approaches, one can introduce
adjustment costs in two ways. In the first way, adjustment costs reduce
profits directly and hence represent a loss of output in general equilibrium.
In the second way, adjustment costs enter the capital accumulation equation,
meaning that one unit of investment does not necessarily increase capital
stock by one unit.
We start with the usual case where adjustment costs are a function of investment and capital. We shall extend the Q-theory of Hayashi (1982) to general equilibrium. Assume that adjustment costs change the capital accumulation equation to
$$K_{t+1} = (1-\delta)K_t + \Phi\left(I_t,K_t\right), \qquad (14.42)$$
where $\Phi$ is the adjustment cost function satisfying $\Phi_I > 0$ and $\Phi_{II} < 0$. In applications, the following form is often used:
$$\Phi(I,K) = \phi\left(\frac{I}{K}\right)K, \qquad (14.43)$$
where $\phi'' < 0$, $\phi(\delta) = \delta$, $\phi'(\delta) = 1$, and $\phi''(\delta) = \kappa$. These assumptions ensure that the investment rate is $I/K = \delta$ and $Q = 1$ in the nonstochastic steady state, and that the parameter $\kappa$ controls the dynamic responses of investment to shocks. In the log-linearized solution, the parameter $\kappa$ does not affect the nonstochastic steady state, and no specific functional form is required.
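A concrete functional form helps fix ideas. The quadratic below is one illustrative choice consistent with the stated restrictions; the text deliberately leaves $\phi$ unspecified, and note that $\kappa < 0$ is required by $\phi'' < 0$.

```python
# A quadratic phi consistent with phi(delta) = delta, phi'(delta) = 1,
# phi''(delta) = kappa (kappa < 0). This is an illustrative choice only.
delta, kappa = 0.025, -2.0

def phi(x):
    return x + 0.5 * kappa * (x - delta) ** 2

# Verify the three restrictions with central finite differences.
h = 1e-5
phi_p = (phi(delta + h) - phi(delta - h)) / (2 * h)                  # phi'
phi_pp = (phi(delta + h) - 2 * phi(delta) + phi(delta - h)) / h**2   # phi''

print(phi(delta), phi_p, phi_pp)  # delta, 1, kappa (up to float noise)
```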
We first solve the social planner's problem
$$\max_{\{C_t,N_t,I_t,K_{t+1}\}}\; E\sum_{t=0}^{\infty}\beta^t u\left(C_t,N_t\right)$$
subject to (14.42) and the resource constraint
$$C_t + I_t = z_t F\left(K_t,N_t\right), \qquad (14.44)$$
where the utility function $u$ and the production function $F$ satisfy the usual smoothness and concavity assumptions. The first-order conditions are given by
$$C_t:\; u_C\left(C_t,N_t\right) = \Lambda_t,$$
$$N_t:\; -u_N\left(C_t,N_t\right) = \Lambda_t z_t F_N\left(K_t,N_t\right),$$
$$I_t:\; Q_t\Phi_I\left(I_t,K_t\right) = 1, \qquad (14.45)$$
$$K_{t+1}:\; \Lambda_t Q_t = \beta E_t\Lambda_{t+1}\left(z_{t+1}F_K\left(K_{t+1},N_{t+1}\right) + Q_{t+1}\left(1-\delta+\Phi_K\left(I_{t+1},K_{t+1}\right)\right)\right),$$
where $\Lambda_t$ and $\Lambda_t Q_t$ are the Lagrange multipliers associated with (14.44) and (14.42), respectively.
Turn to the decentralized market equilibrium. When households own capital and make investment, the representative household's problem is as follows:
$$\max_{\{C_t,N_t,I_t,K_{t+1}\}}\; E\sum_{t=0}^{\infty}\beta^t u\left(C_t,N_t\right)$$
subject to (14.42) and
$$C_t + I_t = R_{kt}K_t + w_t N_t. \qquad (14.46)$$
The first-order conditions are given by
$$C_t:\; u_C\left(C_t,N_t\right) = \Lambda_t,$$
$$N_t:\; -u_N\left(C_t,N_t\right) = \Lambda_t w_t,$$
$$I_t:\; Q_t\Phi_I\left(I_t,K_t\right) = 1,$$
$$K_{t+1}:\; \Lambda_t Q_t = \beta E_t\Lambda_{t+1}\left[R_{k,t+1} + Q_{t+1}\left(1-\delta+\Phi_K\left(I_{t+1},K_{t+1}\right)\right)\right],$$
where $\Lambda_t$ and $\Lambda_t Q_t$ are the Lagrange multipliers associated with (14.46) and (14.42), respectively.
The firm’s problem is given by
max ztF (Kt, Nt)−RktKt − wtNt.
First-order conditions are given by
Rkt = ztFK (Kt, Nt) , wt = ztFN (Kt, Nt) .
When firms own capital and make investment decisions, the household's problem is given by
$$\max_{\{C_t,N_t,\psi_{t+1},B_{t+1}\}}\; E\sum_{t=0}^{\infty}\beta^t u\left(C_t,N_t\right)$$
subject to
$$C_t + \psi_{t+1}\left(V_t - D_t\right) + B_{t+1} = \psi_t V_t + B_t R_{ft} + w_t N_t,$$
where $V_t$ represents the cum-dividend equity value, $D_t$ represents dividends, $R_{ft}$ represents the riskfree rate between periods $t-1$ and $t$, and $\psi_t$ and $B_t$ represent share and bond holdings, respectively. The first-order conditions are given by
$$C_t:\; u_C\left(C_t,N_t\right) = \Lambda_t,$$
$$N_t:\; -u_N\left(C_t,N_t\right) = \Lambda_t w_t,$$
$$\psi_{t+1}:\; \Lambda_t\left(V_t - D_t\right) = \beta E_t\Lambda_{t+1}V_{t+1},$$
$$B_{t+1}:\; \Lambda_t = \beta E_t\Lambda_{t+1}R_{f,t+1}.$$
The firm's optimization problem is given by
$$V_0 = \max_{\{I_t\}}\; E\sum_{t=0}^{\infty}\beta^t\frac{\Lambda_t}{\Lambda_0}D_t,$$
subject to (14.42) and
$$D_t = z_t F\left(K_t,N_t\right) - w_t N_t - I_t.$$
The first-order conditions are given by
$$I_t:\; Q_t\Phi_I\left(I_t,K_t\right) = 1, \qquad (14.47)$$
$$K_{t+1}:\; \Lambda_t Q_t = \beta E_t\Lambda_{t+1}\left[z_{t+1}F_K\left(K_{t+1},N_{t+1}\right) + Q_{t+1}\left(1-\delta+\Phi_K\left(I_{t+1},K_{t+1}\right)\right)\right], \qquad (14.48)$$
where $Q_t\beta^t\Lambda_t/\Lambda_0$ is the Lagrange multiplier associated with (14.42). Equation (14.47), combined with equation (14.42), can be interpreted as the capital supply equation, while (14.48) can be interpreted as the capital demand equation. In the case without capital adjustment costs (i.e., $\Phi(I,K) = I$), $Q_t = 1$ and hence capital supply is flat. A change in capital demand changes the quantity of capital, but not its shadow price.
As in the partial equilibrium model studied in Section 8.5.2, Qt is Tobin’s
marginal Q, which is unobservable. We want to link Qt to some observable
variable such as Tobin’s average Q. The following result is a generalization
of that in Hayashi (1982). Its proof is left as an exercise.
Proposition 14.2.1 If F and Φ are linearly homogeneous, then

Qt = Pt/Kt+1 = ∂Pt/∂Kt+1,

where Pt is the ex-dividend equity value defined as

Pt = Et [βΛt+1/Λt · Vt+1].
Define the investment return as

RI,t+1 = [Qt+1 (1 − δ + ΦK (It+1,Kt+1)) + Rk,t+1] / Qt.
Using the above result, we can prove that the stock return is equal to the investment return:

Rs,t+1 = (Pt+1 + Dt+1)/Pt = RI,t+1.
The proof is simple. First, we write dividends as
Dt+1 = Rk,t+1Kt+1 − It+1.
Since Φ is linearly homogeneous,
Φ (I,K) = ΦII +ΦKK.
Using these equations, equation (14.45), and Proposition 14.2.1, we can show
that
Rs,t+1 = (Pt+1 + Dt+1)/Pt
= [Qt+1Kt+2 + Rk,t+1Kt+1 − It+1] / (QtKt+1)
= [Qt+1 (1 − δ)Kt+1 + Qt+1Φ (It+1,Kt+1) + Rk,t+1Kt+1 − It+1] / (QtKt+1)
= [Qt+1 (1 − δ)Kt+1 + Qt+1ΦK (It+1,Kt+1)Kt+1 + Rk,t+1Kt+1] / (QtKt+1)
= [Qt+1 (1 − δ + ΦK (It+1,Kt+1)) + Rk,t+1] / Qt = RI,t+1.
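The key step in this derivation is Euler's theorem for the linearly homogeneous Φ, namely Φ = ΦI I + ΦK K, combined with QtΦI = 1. The homogeneity identity can be checked numerically; the quadratic adjustment-cost function below is an illustrative assumption, not a form imposed by the text.

```python
# Numerical check of Euler's theorem for a linearly homogeneous
# adjustment-cost function: Phi(I, K) = Phi_I*I + Phi_K*K.
# The quadratic specification below is an illustrative choice.

def phi(I, K, kappa=2.0, delta=0.025):
    return I - 0.5 * kappa * (I / K - delta) ** 2 * K

def deriv(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)  # central difference

I, K = 0.3, 10.0
phi_I = deriv(lambda x: phi(x, K), I)   # partial w.r.t. I
phi_K = deriv(lambda x: phi(I, x), K)   # partial w.r.t. K

# linear homogeneity implies the identity holds exactly
assert abs(phi(I, K) - (phi_I * I + phi_K * K)) < 1e-6
```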
Now, we consider the case where changes in investment incur adjustment
costs. Following Christiano, Eichenbaum and Evans (2005), we assume that
capital evolves according to
Kt+1 = (1− δ)Kt +G (It, It−1) ,
where G represents investment adjustment costs satisfying G1 > 0 and
G11 < 0. The firm’s first-order conditions are given by
It : QtG1 (It, It−1) + Et [βΛt+1/Λt · Qt+1G2 (It+1, It)] = 1, (14.49)
Kt+1 : ΛtQt = βEtΛt+1 [zt+1FK (Kt+1, Nt+1) +Qt+1 (1− δ)] . (14.50)
What is the relationship between Qt and Pt? Qt may not be equal to Tobin’s
average Q. Eberly, Rebelo and Vincent (2011) show that, in a log-linearized
system, one can replace marginal Q by average Q and hence current invest-
ment is related to past investment and Tobin’s average Q.
In applications, the following specification is often used:
G (It, It−1) = [1 − S (It/It−1)] It, (14.51)
where S satisfies S′′ > 0, S (1) = S′ (1) = 0, and S′′ (1) = κ > 0. These assumptions ensure that the steady-state investment rate I/K is equal to δ, that Q = 1, and that the parameter κ controls the dynamic responses of investment to shocks. In the log-linearized solution, the parameter κ does not affect the nonstochastic steady state, and no specific functional form is required.
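For concreteness, the quadratic specification S(x) = (κ/2)(x − 1)² satisfies all of these conditions; the sketch below verifies them numerically. The functional form is an illustration, since the text imposes only the restrictions at x = 1.

```python
# Verify S(1) = S'(1) = 0 and S''(1) = kappa for the quadratic
# specification S(x) = (kappa/2)*(x - 1)**2 (illustrative form).
kappa = 2.5

def S(x):
    return 0.5 * kappa * (x - 1.0) ** 2

h = 1e-4
S1 = (S(1 + h) - S(1 - h)) / (2 * h)          # S'(1) by central diff
S2 = (S(1 + h) - 2 * S(1) + S(1 - h)) / h**2  # S''(1) by second diff

assert S(1.0) == 0.0
assert abs(S1) < 1e-7
assert abs(S2 - kappa) < 1e-5
```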
Stochastic Trends
So far, we have assumed that the data and the model are trend stationary.
To incorporate a deterministic trend, one may assume that output satisfies

Yt = ztKt^α (AtNt)^(1−α),

where At = g_a^t A0. King et al. (1991) show that many macro data exhibit common stochastic trends. To capture this fact, we assume that TFP shocks follow a stochastic trend in the basic RBC model:

ln zt = μz + ln zt−1 + σεt,  z−1 = 1,

where μz represents the trend growth rate and εt is IID standard normal. In this case, we need to detrend variables. Define the detrended variables X̃t = Xt/zt^(1/(1−α)) for Xt = Ct, It, Yt, and wt. Since Kt+1 is chosen in period t, we define K̃t+1 = Kt+1/zt^(1/(1−α)). We then obtain the detrended system:
1/C̃t = Et [ (β/C̃t+1) exp((μz + σεt+1)/(α − 1)) (Rk,t+1 + 1 − δ) ],

χC̃t/(1 − Nt) = w̃t,

C̃t + Ĩt = Ỹt = exp(α(μz + σεt)/(α − 1)) K̃t^α Nt^(1−α),

K̃t+1 = exp((μz + σεt)/(α − 1)) (1 − δ)K̃t + Ĩt,

Rkt = (αỸt/K̃t) exp((μz + σεt)/(1 − α)),

w̃t = (1 − α) exp(α(μz + σεt)/(α − 1)) K̃t^α Nt^(−α).
We can solve the model using calibrated parameter values.9 Dynare will produce impulse responses of the detrended variables. To construct impulse responses of the original variables, we need to transform back, i.e., ln xt = ln x̃t + (ln zt)/(1 − α) for any variable xt. Figure 14.10 presents the results for

ln xt − ln xss = ln x̃t − ln x̃ss + (ln zt)/(1 − α),

in response to a 1% increase in ε0, where x̃ss is the steady state for the detrended system.
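The back-transformation can be sketched as follows; the detrended impulse-response path is made up for illustration (in practice it would be read off a Dynare IRF), and only the mapping ln xt = ln x̃t + (ln zt)/(1 − α) comes from the text.

```python
# Sketch: mapping a detrended impulse response back to levels via
# ln x_t = ln xtilde_t + ln z_t/(1 - alpha). After a 1% shock eps_0,
# ln z_t is permanently higher by sigma. The detrended path below is
# hypothetical.
alpha, sigma = 0.33, 0.01
T = 5

irf_detrended = [0.004 * 0.9 ** t for t in range(T)]  # made-up decay
irf_lnz = [sigma] * T                                 # permanent shift

irf_level = [d + z / (1 - alpha) for d, z in zip(irf_detrended, irf_lnz)]

# levels inherit a permanent component sigma/(1 - alpha)
assert all(x > sigma / (1 - alpha) for x in irf_level)
```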
This figure shows that the model with stochastic trend generates much
less volatility in hours than do transitory TFP shocks. The intuition for
this result is based on the permanent income hypothesis and our graphical
analysis of the basic RBC model. If the TFP shock is permanent, consump-
tion will jump up by more. Consumption jumping up by more means that
labor supply shifts in by more. Hence, hours rise by less and the real wage
rises by more. Since hours rise by less, output rises by less. Given a larger
increase in consumption, investment would also rise by less on impact.
Other Sources of Shocks
The main finding of the RBC theory is that technology shocks are the driving
force of business cycles. Recently, researchers have found that other shocks also play important roles. We now examine some widely studied shocks.
Investment-Specific Shocks Greenwood, Hercowitz, and Krusell (1997)
find that investment-specific technical change is the major source of eco-
9The Matlab and Dynare codes are calibtrend.m and stochtrend2.mod.
[Figure: six impulse-response panels for Y, C, I, N, Rk, and w.]
Figure 14.10: Impulse responses to a permanent TFP shock.
nomic growth. This suggests that it could be important for short-run fluc-
tuations as well. Fisher (2006) finds that combining investment-specific
technology shock with the TFP shock accounts for about 40-60% of the
fluctuations in output and hours at business cycle frequencies. We now introduce an investment-specific technology shock into the basic RBC model.
We replace the law of motion for capital (14.24) by
Kt+1 = (1− δ)Kt + vtIt,
where vt follows the process:
ln vt = (1− ρv) ln v + ρv ln vt−1 + σvεvt.
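Under this specification, the impulse response of ln vt to a one-standard-deviation innovation decays geometrically at rate ρv; a minimal sketch, using the values ρv = 0.90 and σv = 0.01 set later in the text:

```python
# Impulse response of ln v_t to a one-standard-deviation innovation:
# ln v_t - ln v = rho_v**t * sigma_v, decaying geometrically.
rho_v, sigma_v = 0.90, 0.01

irf = [sigma_v * rho_v ** t for t in range(40)]

assert abs(irf[0] - sigma_v) < 1e-15     # impact response equals sigma_v
assert abs(irf[1] / irf[0] - rho_v) < 1e-12  # geometric decay at rate rho_v
```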
Consider the decentralization where households own capital and make in-
vestment decisions. Then we have
It : Qt = 1/vt,
Kt+1 : QtΛt = βEtΛt+1 [Rk,t+1 + (1− δ)Qt+1] .
The other equations are identical. Here the capital price is equal to 1/vt. Set
v = 1, ρv = 0.90, σv = 0.01, ρ = 0.95, and σ = 0.01. The other parameter
values are the same as those in the basic RBC model. Figure 14.11 presents
the impulse responses to both the TFP shock and the investment-specific
shock.10
This figure reveals that in response to an investment-specific technology
shock, consumption, wage, labor productivity, and the capital price fall on
impact. The intuition is that a positive shock to the investment-specific
technology reduces the price of investment goods and raises the relative price
of consumption goods. Thus, consumption falls and investment rises on
impact. Because of the fall of consumption, labor supply rises on impact.
But labor demand does not change. Thus, hours rise and the wage rate falls
on impact. This causes labor productivity to fall on impact. Consequently,
combining the investment-specific shock with the TFP shock will help the
model match the cyclicality of labor productivity and the real wage rate in
the data.
Government Spending Shocks We now introduce government in
the basic RBC model. Suppose the government spending Gt is financed by
lump-sum taxes. The resource constraint becomes
Ct + It +Gt = ztKαt N
1−αt ,
10The Matlab and Dynare codes are RBCINshock.m, calibration.m and rbc6.mod.
[Figure: nine impulse-response panels for Y, C, I, N, Y/L, W, Q, Rs, and Rf.]
Figure 14.11: Impulse responses to a TFP shock and an investment-specific technology shock.
where Gt follows the process
lnGt = (1 − ρg) ln Ḡ + ρg lnGt−1 + σgεgt.
We recalibrate the model so that government spending accounts for 20% of
output in the nonstochastic steady state. Figure 14.12 presents the impulse
responses to a one percent increase in government spending for different
levels of persistence ρg.11
This figure reveals that consumption and investment fall on impact due to the wealth effect. But hours rise on impact because of the rise in labor supply.
Thus, output also rises on impact. The larger the persistence, the larger the
11The Matlab and Dynare codes are RBCgov.m, calibgov.m and rbc7.mod.
[Figure: nine impulse-response panels for G, Y, C, N, I, K, W, Rk, and Rf, for ρg = 0, 0.5, and 0.95.]
Figure 14.12: Impulse responses to a government spending shock for different levels of persistence.
impact on output, consumption and hours, but the smaller the impact on
investment.
Preference Shocks Finally, we introduce a preference shock so that the
utility function is given by
E ∑_{t=0}^∞ β^t φt (lnCt + χ ln (1−Nt)) ,
where the preference shock ϕt follows the process
lnϕt = ρp lnϕt−1 + σpεpt.
[Figure: nine impulse-response panels for the preference shock, Y, C, N, I, K, W, Rk, and Rf, for ρp = 0, 0.5, and 0.95.]
Figure 14.13: Impulse responses to a preference shock for different levels of persistence.
Figure 14.13 presents the impulse responses to a positive one percent shock
to preferences for different levels of persistence.12 It shows that when the
preference shock is more persistent, the initial rise in consumption is less, but
the initial drop in hours, investment, and output is larger. The intuition is
that a more persistent preference shock gives a smoother consumption path.
Thus the wealth effect on labor supply is smaller on impact.
12The Matlab and Dynare codes are RBCpref.m, calibration.m and rbc8.mod.
14.3 Exercises
1. Formulate the model in Section 14.1.2 using the Arrow-Debreu setup.
Do it again using the market arrangements in which households own
capital and make investment decisions. Show that these two formulations and the one in Section 14.1.2 yield identical allocations.
2. Derive the steady state equilibrium for the model in Section 14.1.2.
3. Calibrate the capacity utilization model and use Dynare to solve it.
Plot impulse responses to a positive one percent TFP shock for differ-
ent parameter values of δ0 and δ1.
4. Prove Proposition 14.2.1.
5. Assume risk neutral preferences. Derive the log-linearized system for
the investment dynamics for the model with capital adjustment costs
in (14.43) and investment adjustment costs (14.51). Discuss the dif-
ferences.
6. Calibrate two RBC models with capital adjustment costs (14.43) and
with investment adjustment costs (14.51), respectively. Use Dynare to
solve these two models and plot impulse responses to a positive one
percent TFP shock for different curvature parameter values κ.
Chapter 15
Bayesian Estimation of DSGE Models Using Dynare
There are several different formal and informal econometric procedures to
evaluate DSGE models quantitatively. Kydland and Prescott (1982) ad-
vocate a calibration procedure. This procedure is dominant in the early
RBC literature. Christiano and Eichenbaum (1992) use the generalized
method of moments (GMM) to estimate equilibrium relationships. Rotem-
berg and Woodford (1997) and Christiano, Eichenbaum, and Evans (2005)
use the minimum distance estimation method based on the discrepancy be-
tween VAR and DSGE model impulse response functions. Altug (1989),
McGrattan (1994), Leeper and Sims (1994), and Kim (2000) adopt the full-
information likelihood-based estimation method. The reader is referred to
Kydland and Prescott (1996), Hansen and Heckman (1996), and Sims (1996)
for discussions of methodological issues.
In this chapter, we focus on Bayesian estimation of DSGE models. This
method has several advantages over other methods. First, unlike GMM es-
timation based on equilibrium relationships such as the consumption Euler
equation, the Bayesian analysis is system-based and fits the solved DSGE
model to a vector of aggregate time series. Second, the estimation is based
on the likelihood function generated by the DSGE model rather than, for
instance, the discrepancy between DSGE model responses and VAR im-
pulse responses. Third, maximizing the likelihood function in the maximum
likelihood method is challenging. By contrast, computing posteriors in the
Bayesian method is much easier. Fourth, prior distributions can be used to
incorporate additional information into the parameter estimation.
We refer the reader to An and Schorfheide (2007), DeJong and Dave
(2007), and Fernandez-Villaverde (2010) for an introduction to Bayesian
estimation of DSGE models.
15.1 Principles of Bayesian Estimation
Suppose that we want to estimate a vector of parameters θ in some set Θ. Let the prior density be p (θ). The econometrician observes the data Y^T = {y1, y2, ..., yT}. The likelihood density is the density of the data Y^T given the parameter θ:

p(Y^T |θ) = p (y1|θ) ∏_{t=2}^T p(yt|Y^{t−1}, θ).
The likelihood function of the parameter is given by

L(θ; Y^T) ≡ p(Y^T |θ).

The maximum likelihood method of estimation searches for a parameter value that maximizes the likelihood function L(θ; Y^T).
Using Bayes' rule, we can compute the posterior as

p(θ|Y^T) = p(Y^T, θ)/p(Y^T) = p(Y^T |θ) p (θ) / ∫ p(Y^T |θ) p (θ) dθ.

Define the posterior kernel K(θ|Y^T) by

p(θ|Y^T) ∝ p(Y^T |θ) p (θ) ≡ K(θ|Y^T).
The Bayesian method is to choose a parameter value so as to maximize the posterior density p(θ|Y^T) or the posterior kernel K(θ|Y^T).
The difficulty of using the maximum likelihood method or the Bayesian method is that the likelihood function typically has no analytical expression. In addition, Bayesian analysis involves computing the conditional expectation of a function h (θ) of the parameters:

E[h (θ) |Y^T] = ∫ h (θ) p(θ|Y^T) dθ. (15.1)
Numerical integration is often needed. To implement the Bayesian method, we use a filtering procedure to evaluate the likelihood function. We then simulate the posterior kernel using a Markov chain Monte Carlo (MCMC) sampling method such as the Metropolis-Hastings algorithm.
We now use an example to illustrate Bayesian estimation. This example illustrates that Bayesian estimation is essentially between calibration and maximum likelihood estimation. Suppose that the data generating process is given by

yt = μ + εt,

where εt ∼ N (0, 1) is Gaussian white noise. We want to estimate the parameter μ from a sample of data Y^T. We can compute the likelihood density

p(Y^T |μ) = (2π)^{−T/2} exp( −(1/2) ∑_{t=1}^T (yt − μ)² ).
We then obtain the maximum likelihood estimate
μML = (1/T) ∑_{t=1}^T yt,
which is the sample average.
Suppose that the prior is Gaussian with mean μ0 and variance σμ². Then the posterior kernel is given by

(2πσμ²)^{−1/2} exp( −(μ − μ0)²/(2σμ²) ) × (2π)^{−T/2} exp( −(1/2) ∑_{t=1}^T (yt − μ)² ).
Thus, the Bayesian estimate is given by

μBE = (TμML + σμ^{−2} μ0) / (T + σμ^{−2}),

which is a linear combination of the prior mean and the maximum likelihood estimate. When we have no prior information (i.e., σμ² → ∞), the Bayesian estimate converges to the maximum likelihood estimate. When we are sure that the calibrated prior value is the truth (i.e., σμ² → 0), μBE → μ0.
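This shrinkage formula can be verified in a few lines; the simulated sample and the prior values below are illustrative.

```python
import random

# The Bayesian estimate is a precision-weighted average of the prior
# mean mu0 and the ML estimate (the sample mean). All values here are
# illustrative.
random.seed(0)
mu_true, T = 0.5, 200
data = [mu_true + random.gauss(0, 1) for _ in range(T)]

mu_ml = sum(data) / T                       # sample average

mu0, sigma_mu2 = 0.0, 0.1 ** 2              # prior N(mu0, sigma_mu2)
mu_be = (T * mu_ml + mu0 / sigma_mu2) / (T + 1 / sigma_mu2)

# the Bayesian estimate lies between the prior mean and the ML estimate
assert min(mu0, mu_ml) <= mu_be <= max(mu0, mu_ml)
```

Making the prior variance large pulls mu_be toward mu_ml; making it small pulls mu_be toward mu0, matching the two limits in the text.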
15.2 Bayesian Estimation of DSGE Models
As discussed in Section 2.5, the equilibrium system for a DSGE model can
often be summarized by the nonlinear system (2.33). If we useM+ andM−
to denote the matrices that extract variables y+t = M+yt and y−t = M−yt,
then we can rewrite this system as
Etf (yt−1, yt, yt+1, εt) = 0, (15.2)
and write the solution as
yt = g (yt−1, εt) ,
where we still use the notations f and g without risk of confusion. Here εt
is a vector of exogenous shocks with zero mean and constant variance and
yt is a vector of equilibrium variables and exogenous states. Our objective
is to estimate the vector of structural parameters θ of the model. Bayesian
estimation typically consists of the following steps.
15.2.1 Numerical Solution and State Space Representation
The first step is to solve for the decision rule numerically given a parameter
value θ:
yt = g (yt−1, εt; θ) . (15.3)
This equation is also called the transition equation. We can use many nu-
merical methods to solve the model. For estimation, we need to balance
between speed and accuracy. Linearization or log-linearization is the most
widely used method. The perturbation method is often used to derive non-
linear solutions.
Not all variables in yt are observable. The observable vector y∗t is a
subset of yt. We write its relation to yt as a measurement equation:
y∗t = h (yt, vt; θ) , (15.4)
where vt is a vector of shocks to the observables, like the measurement error.
Assume that vt and εt are independent.
While the transition equation is typically unique, the measurement equa-
tion depends on the selected observable variables, which allow for many de-
grees of freedom. For example, we can assume that we observe wages or
that we observe hours (or even that we have access to both series), since
the model has predictions regarding both variables. The only restriction is
that we can only select a number of series less than or equal to the number
of shocks in the model (εt and vt). Otherwise, the model would be, up to first order, stochastically singular. That is, the extra observables would be deterministic functions of the other observables and the likelihood would be −∞ with probability 1.
In general, researchers should focus on the time series that they find
particularly informative for their models. For instance, to study the impact
of monetary policy on output and investment, the federal funds rate, output
and investment should be included, but data on unemployment may not be
useful unless the focus is on the labor market.
Given (15.3), we can compute the density p (yt|yt−1, θ) . Given (15.4), we
can compute the density p (y∗t |yt, θ) . In addition, we can derive
y∗t = h (yt, vt; θ) = h (g (yt−1, εt; θ) , vt; θ) .
Thus, we can compute the density p (y∗t |yt−1, θ) .
15.2.2 Evaluating the Likelihood Function
The next step is to compute the likelihood density:

p(Y^{*T} |θ) = p(y*1|θ) ∏_{t=2}^T p(y*t |Y^{*t−1}, θ)
= ∫ p(y*1|y1, θ) p(y1, θ) dy1 × ∏_{t=2}^T ∫ p(y*t |Y^{*t−1}, yt, θ) p(yt|Y^{*t−1}, θ) dyt.

Note that p(y*t |Y^{*t−1}, yt, θ) = p(y*t |yt, θ). Assume that p(y1, θ) is exogenously given. Thus, we only need to compute {p(yt|Y^{*t−1}, θ)}_{t=1}^T.
Filtering theory gives a procedure to compute these conditional densities.
First, we use the Chapman-Kolmogorov equation:

p(yt+1|Y^{*t}, θ) = ∫ p(yt+1|yt, θ) p(yt|Y^{*t}, θ) dyt.

Second, we use Bayes' theorem to derive

p(yt|Y^{*t}, θ) = p(y*t |yt, θ) p(yt|Y^{*t−1}, θ) / p(y*t |Y^{*t−1}, θ),

where

p(y*t |Y^{*t−1}, θ) = ∫ p(y*t |yt, θ) p(yt|Y^{*t−1}, θ) dyt

is the conditional likelihood.
The Chapman-Kolmogorov equation says that the distribution of states tomorrow given observations up to today, p(yt+1|Y^{*t}, θ), is equal to the distribution today, p(yt|Y^{*t}, θ), times the transition probabilities p(yt+1|yt, θ), integrated over all possible states. Therefore, the Chapman-Kolmogorov equation gives us a forecasting rule for the evolution of states. Bayes' theorem updates the distribution of states p(yt|Y^{*t−1}, θ) when a new observation y*t arrives, given its probability p(y*t |yt, θ). By a recursive application of forecasting and updating, we can generate the complete sequence {p(yt|Y^{*t−1}, θ)}_{t=1}^T given an initial condition p(y1, θ).
Numerical computation of this equation is nontrivial because it involves
numerical integration. The simplest case is the linear Gaussian
case. We can then use the Kalman filter discussed in Section 10.1.1. We
first use the linear or log-linear approximation method to solve the DSGE
model. We then obtain a linear state space representation
ŷt = A (θ) ŷt−1 + C (θ) εt,
y*t = G ȳ (θ) + G ŷt + vt,

where

E εt = E vt = 0,  E εtεt′ = Q (θ),  E vtvt′ = V (θ),

and ŷt represents the deviation of yt from its steady state ȳ (θ).1 Note that the coefficients in the decision rules or the transition equation and the steady-state values depend on the structural parameter θ to be estimated. The matrix G extracts the vector of observable variables from the vector of all equilibrium variables. The error term vt may contain measurement errors if the dimension of εt is less than that of y*t.
Define

ŷt|t−1 = E[ŷt|Y^{*t−1}],  Pt = E[(ŷt − ŷt|t−1)(ŷt − ŷt|t−1)′ |Y^{*t−1}].

For t = 1, 2, ..., T and with initial values ŷ1|0 and P1 given, the Kalman filter discussed in Section 10.1.1 gives the following recursion:

at = y*t − y*t|t−1 = y*t − G ȳ (θ) − G ŷt|t−1,
Ωt|t−1 = G Pt G′ + V (θ),
Kt = A (θ) Pt G′ Ω^{−1}_{t|t−1},
Pt+1 = A (θ) Pt (A (θ) − KtG)′ + C (θ) Q (θ) C (θ)′,
ŷt+1|t = A (θ) ŷt|t−1 + Kt at.

1The variable yt could be in level or in log.
From the Kalman filter, we can compute the log likelihood density

ln p(Y^{*T} |θ) = ∑_{t=1}^T ln p(y*t |Y^{*t−1}, θ)
= −(n*T/2) ln (2π) − (1/2) ∑_{t=1}^T ln |Ωt|t−1| − (1/2) ∑_{t=1}^T at′ Ω^{−1}_{t|t−1} at,

where n* is the number of observables, at = y*t − y*t|t−1 is the innovation, and Ωt|t−1 is given by (10.8).
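A scalar sketch of this recursion and the implied log likelihood; all parameter values are illustrative, not taken from the model in the text.

```python
import math

# Scalar Kalman filter and log likelihood for the state space
#   state: yhat_t  = A*yhat_{t-1} + C*eps_t,  eps_t ~ N(0, Q)
#   obs:   ystar_t = G*yhat_t + v_t,          v_t  ~ N(0, V)
# (steady-state constant dropped; values are illustrative)
A, C, G, Q, V = 0.9, 1.0, 1.0, 0.01, 0.005

def kalman_loglik(data, yhat0=0.0, P0=0.1):
    yhat, P, ll = yhat0, P0, 0.0
    for ystar in data:
        a = ystar - G * yhat                  # innovation a_t
        omega = G * P * G + V                 # Omega_{t|t-1}
        ll += -0.5 * (math.log(2 * math.pi) + math.log(omega) + a * a / omega)
        K = A * P * G / omega                 # gain K_t
        yhat = A * yhat + K * a               # yhat_{t+1|t}
        P = A * P * (A - K * G) + C * Q * C   # P_{t+1}
    return ll

ll = kalman_loglik([0.05, 0.03, -0.02, 0.01])
assert math.isfinite(ll)
```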
Linear approximations are inadequate to address some economic prob-
lems such as precautionary savings, risk premium, asymmetric effects, etc.
For these problems, one has to use nonlinear solution methods to capture
the impact of higher order moments. In this case, the Kalman filter cannot
be used and a nonlinear filter is needed. Fernandez-Villaverde and Rubio-
Ramirez (2005, 2007) introduce particle filter to the analysis of DSGE mod-
els. The reader is referred to DeJong and Dave (2007) and
15.2.3 Computing the Posterior
The final step is to compute the posterior p(θ|Y^{*T}) or the posterior kernel K(θ|Y^{*T}). Because there is no analytical expression for p(θ|Y^{*T}), implementing Bayesian estimation is challenging. We first derive a Bayesian estimate. Given a parameter value for θ, we can use the previous step to compute p(θ|Y^{*T}) or K(θ|Y^{*T}). We can then find the posterior mode θm by a numerical optimization method. We may use this posterior mode as the Bayesian estimate of the parameter. We can estimate the covariance matrix by the negative inverse Hessian matrix evaluated at the posterior mode:

Σm = ( − ∂² ln K(θ|Y^{*T}) / ∂θ∂θ′ |θ=θm )^{−1}. (15.5)
The more difficult part is to find the posterior p(θ|Y ∗T ) for inference
and for computing integration of functions of the parameter. We can use
the MCMC method to produce a Markov chain (or a sequence of samples)
whose ergodic distribution is given by p(θ|Y ∗T ) . We then simulate from
this chain and approximate p(θ|Y ∗T ) by the empirical distribution gener-
ated by the chain. One efficient algorithm to generate the Markov chain is
the Metropolis-Hastings algorithm. This algorithm is quite general and
can be implemented in several ways. We shall focus on the Random Walk
Metropolis-Hastings (RWMH) algorithm. This algorithm builds on
the fact that under general conditions the distribution of the structural pa-
rameters will be asymptotically normal. The algorithm constructs a Gaus-
sian approximation around the posterior mode and uses a scaled version of
the asymptotic covariance matrix as the covariance matrix for the proposal
distribution. This allows for an efficient exploration of the posterior distri-
bution at least in the neighborhood of the mode. More precisely, the RWMH
algorithm implements the following steps:
Step 1 Choose a starting point θ0, which is typically the posterior mode, and run a loop over Steps 2, 3, and 4.

Step 2 Draw a proposal θ* from a jumping distribution

J(θ*|θt−1) = N(θt−1, cΣm),

where Σm is given by (15.5) and c is a scaling factor.

Step 3 Solve the model numerically given θ* and compute p(θ*|Y^{*T}) by the Kalman filter. Compute the acceptance ratio

r = p(θ*|Y^{*T}) / p(θt−1|Y^{*T}) = K(θ*|Y^{*T}) / K(θt−1|Y^{*T}).

Step 4 Jump from θt−1 to the proposal θ* with probability min (r, 1) and reject the proposal (i.e., θt = θt−1) with the remaining probability. Update the jumping distribution and go to Step 2.
The above algorithm can be intuitively explained as follows. After initializing in Step 1, in Step 2 draw a candidate parameter θ* from a normal distribution whose mean is set to θt−1. The scalar c is a scale factor for the variance that plays an important role, as explained below. In Step 3, after solving the model numerically given θ*, compute the acceptance ratio r as the ratio of the posterior kernel evaluated at θ*, K(θ*|Y^{*T}), to the posterior kernel evaluated at the mean of the jumping distribution, K(θt−1|Y^{*T}). In Step 4, decide whether or not to hold on to the candidate θ*. If the acceptance ratio r is greater than one, definitely keep θ*. Otherwise, go back to the previous candidate θt−1 with probability 1 − r. Then update the jumping distribution and go to Step 2.
After having repeated these steps to generate sufficiently many draws θ0, θ1, ..., θn, one can build a histogram of these draws and construct the empirical distribution. In addition, we can approximate the expectation in (15.1) by

E[h (θ) |Y^{*T}] ≈ (1/n) ∑_{i=1}^n h(θi).
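The four steps can be sketched on the Gaussian-mean example of Section 15.1, where with a diffuse prior the posterior centers near the ML estimate (the sample mean); the proposal scale c and the prior values are illustrative choices.

```python
import math, random

# Random Walk Metropolis-Hastings on the Gaussian-mean example.
# Diffuse prior, so the posterior is approximately N(mu_ML, 1/T).
random.seed(1)
data = [0.5 + random.gauss(0, 1) for _ in range(100)]
T = len(data)

def log_kernel(mu, mu0=0.0, sig2=100.0):    # log prior + log likelihood
    return (-0.5 * (mu - mu0) ** 2 / sig2
            - 0.5 * sum((y - mu) ** 2 for y in data))

c = 0.2                                     # proposal standard deviation
theta, draws = 0.0, []
for _ in range(5000):
    prop = theta + random.gauss(0, c)       # Step 2: draw proposal
    r = math.exp(log_kernel(prop) - log_kernel(theta))   # Step 3
    if random.random() < min(r, 1.0):       # Step 4: accept or stay
        theta = prop
    draws.append(theta)

post_mean = sum(draws[1000:]) / len(draws[1000:])   # drop burn-in
assert abs(post_mean - sum(data) / T) < 0.1         # centers near ML
```

Shrinking c raises the acceptance rate but slows mixing, while inflating it lowers the acceptance rate, illustrating the scale-factor trade-off discussed below.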
Why should one adopt the acceptance rule in Step 4? The reason is that this rule ensures that the entire domain of the posterior distribution can be visited. One should not be too quick to throw out a candidate giving a lower value of the posterior kernel. If the jump is "uphill," one always accepts. But if the jump is "downhill," one should reject with some probability. Of course, an important parameter in this procedure is the variance of the jumping distribution and in particular the scale factor c. If the scale factor is too small, the acceptance rate (the fraction of candidate parameters that are accepted in a window of time) will be too high and the Markov chain of candidate parameters will "mix slowly," meaning that the distribution will take a long time to converge to the posterior distribution, since the chain is likely to get stuck around a local maximum. On the other hand, if the scale factor is too large, the acceptance rate will be very low (as the candidates are likely to land in regions of low probability density) and the chain will spend too much time in the tails of the posterior distribution.
Several questions arise in implementing the RWMH algorithm in prac-
tice: How should we choose the scale factor c? What is a satisfactory ac-
ceptance rate? How many draws are ideal? How is convergence of the
Metropolis-Hastings iterations assessed? These are all important questions
that will come up in your usage of Dynare.
15.3 An Example
We use our basic RBC model studied in the previous chapter as an example
to illustrate how to use Dynare to conduct Bayesian estimation. We set the
parameter values of the model as
α = 0.33, β = 0.99, δ = 0.025, χ = 1.75, ρ = 0.95, σ = 0.01. (15.6)
Note that these parameter values imply that the steady state hours N =
0.33. We then solve the model using Dynare by log-linearization and obtain
simulated data in the file simuldataRBC.m.2 Suppose that we only observe
2The Dynare code to generate the data is rbcdata.mod.
the data of output in percentage deviations. The parameters of the model
are unknown. Our objective is to use these data to estimate the parameters
of the model using the Bayesian method.
Dynare Codes
A Dynare code consists of the following blocks.3 In the first block, one
defines endogenous variables, exogenous shocks, and parameters.
%----------------------------------------------------------------
% 1. Defining variables
%----------------------------------------------------------------
var ly lc lk li lh lw Rk z;
varexo e;
parameters beta delta chi alpha rho;
In the second block, one defines the observable variables.
%----------------------------------------------------------------
% 2. Declaring observable variables
%----------------------------------------------------------------
varobs ly;
In the third block, one inputs model equations.
%----------------------------------------------------------------
% 3. Model
%----------------------------------------------------------------
model;
//Consumption Euler equation
(1/exp(lc)) = beta*(1/exp(lc(+1)))*(1+alpha...
*(exp(lk)^(alpha-1))*exp(z(+1))*exp(lh(+1))^(1-alpha)-delta);
// Labor supply
3The Dynare estimation codes are rbcEST.mod and rbcEST2.mod.
chi*exp(lc)/(1-exp(lh)) = exp(lw);
// Labor demand
exp(lw) = exp(z)*(1-alpha)*exp(lk(-1))^alpha*exp(lh)^(-alpha);
//Resource constraint
exp(lc)+exp(li) = exp(ly);
//Production function
exp(ly) = exp(z)*(exp(lk(-1))^alpha)*(exp(lh))^(1-alpha);
//Capital accumulation equation
exp(li) = exp(lk)-(1-delta)*exp(lk(-1));
//Capital rental rate
exp(Rk) = alpha*exp(ly)/exp(lk(-1));
//TFP shock
z = rho*z(-1)+e;
end;
In the fourth block, one specifies the steady state. During estimation,
Dynare recalculates the steady state of the model at each iteration of op-
timization when parameter values are updated. This step typically takes a
significant amount of time. It will be much more efficient if one can solve
for the steady state separately either by hand or using a Matlab program.
If one asks Dynare to solve for the steady state, then the fourth block reads:
%----------------------------------------------------------------
% 4. Specifying Steady State
%----------------------------------------------------------------
initval;
lk = 2.2;
lc = -0.26;
lh = -1.098;
li = -1.44;
ly = 0.005;
lw= 0.7;
Rk= -3.3;
z = 0;
end;
If one solves for the steady state by hand as we described in the last
chapter, we then may write the fourth block as:
steady_state_model;
N = 1/3;
KoverN = (alpha/(1/beta-1+delta))^(1/(1-alpha));
Y = (KoverN)^alpha*N;
I = delta*KoverN*N;
K = KoverN*N;
C = Y-I;
w = (1-alpha)*K^alpha*N^(-alpha);
Rk1 = alpha*Y/K;
chi = (1-alpha)*(KoverN)^alpha*(1-N)/C; %for KPR0 and KPR3
ly = log(Y); lc = log(C); lk = log(K);
lh = log(N); li = log(I); lw = log(w);
Rk = log(Rk1); z = 0;
end;
Note that we essentially calibrate the parameter χ to target the steady-
state hours N = 1/3. In this case, the parameter χ must be excluded from
the list of parameters to be estimated. The key Dynare command in this block is steady_state_model. When the analytical solution of the model is
known, this command can be used to help Dynare find the steady state
in a more efficient and reliable way, especially during estimation where the
steady state has to be recomputed repeatedly when updating parameter
values. Each line of this block consists of a variable (either an endogenous variable, a temporary variable, or a parameter) which is assigned an expression (which
can contain parameters, exogenous at the steady state, or any endogenous
or temporary variable already declared above). Each line therefore looks
like:
VARIABLE_NAME = EXPRESSION;
Note that it is also possible to assign several variables at the same time,
if the main function on the right-hand side is a Matlab function returning
several arguments:
[ VARIABLE_NAME, VARIABLE_NAME,... ] = FUNCTION_NAME;
Dynare will automatically generate a steady state file using the informa-
tion provided in this block.
In the fifth block, one declares priors. The general syntax to introduce
priors in Dynare is the following:
estimated_params;
PARAMETER_NAME, PRIOR_SHAPE, PRIOR_MEAN, PRIOR_STANDARD_ERROR
[, PRIOR_THIRD_PARAMETER] [, PRIOR_FOURTH_PARAMETER];
The following table defines each term.
PRIOR SHAPE      DISTRIBUTION       RANGE
NORMAL_PDF       N(μ, σ)            R
GAMMA_PDF        G2(μ, σ, p3)       [p3, +∞)
BETA_PDF         B(μ, σ, p3, p4)    [p3, p4]
INV_GAMMA_PDF    IG1(μ, σ)          R+
UNIFORM_PDF      U(p3, p4)          [p3, p4]
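Dynare maps each (PRIOR_MEAN, PRIOR_STANDARD_ERROR) pair into the underlying shape parameters of the chosen distribution. The conversion for the beta and gamma families can be sketched as follows (this illustrates the standard moment-matching formulas; it is not Dynare's internal code):

```python
def beta_params(mu, sigma):
    """Shape parameters (a, b) of a Beta(a, b) with mean mu and std sigma."""
    nu = mu * (1.0 - mu) / sigma**2 - 1.0
    return mu * nu, (1.0 - mu) * nu

def gamma_params(mu, sigma):
    """Shape k and scale theta of a Gamma(k, theta) with mean mu and std sigma."""
    return (mu / sigma) ** 2, sigma**2 / mu

a, b = beta_params(0.35, 0.02)       # e.g. the beta prior on alpha below
k, theta = gamma_params(1.65, 0.02)  # e.g. the gamma prior on chi below
```

One can verify that the implied distributions reproduce the requested mean and standard error exactly.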
Here, μ is the PRIOR_MEAN, σ is the PRIOR_STANDARD_ERROR, p3 is the
PRIOR_THIRD_PARAMETER (default 0), and p4 is the PRIOR_FOURTH_PARAMETER
(default 1). Note that when specifying a uniform distribution between 0 and
1 as a prior for a parameter, say α, one has to leave the fields for μ and σ
empty, as in: alpha, uniform_pdf, , , 0, 1;. This block for our model is
given by:
%—————————————————————-
% 5. Declaring priors
%—————————————————————-
estimated_params;
alpha, beta_pdf, 0.35, 0.02;
beta, beta_pdf, 0.99, 0.002;
delta, beta_pdf, 0.025, 0.003;
chi, gamma_pdf, 1.65, 0.02;
rho, beta_pdf, 0.9, 0.05;
stderr e, inv_gamma_pdf, 0.01, inf;
end;
As mentioned before, the parameter χ need not be estimated when we
use the steady_state_model command. In this case, the line starting with
chi must be deleted from the estimated_params block.
In the final block, one launches estimation. The Dynare command is
estimation. It has many options. The reader is referred to the Dynare
Reference Manual or the Dynare User Guide.
estimation(datafile = simuldataRBC, nobs = 200, order = 1, first_obs =
500, mh_replic = 2000, mh_nblocks = 2, mh_drop = 0.45, mh_jscale = 0.8,
mode_compute = 6) ly;
Dynare Output
Dynare output consists of tables and graphs. There are two main tables.
The first table displays results from posterior maximization generated before
the MCMC part. When all parameters are estimated, this table is given by:
     prior mean   posterior mode   std      t-stat    prior      pstdev
α    0.350        0.3283           0.0038   85.347    Beta       0.020
β    0.990        0.9896           0.0003   3056.29   Beta       0.002
δ    0.025        0.0258           0.0006   43.846    Beta       0.003
χ    1.650        1.6513           0.0050   328.398   Gamma      0.020
ρ    0.900        0.9481           0.0095   100.218   Beta       0.050
σ    0.010        0.0096           0.0005   20.590    invGamma   Inf
In this table, “pstdev” represents prior standard deviation. When χ is
calibrated to target the steady state N = 1/3, the table becomes:
     prior mean   posterior mode   std      t-stat    prior      pstdev
α    0.350        0.3287           0.0037   88.8908   Beta       0.020
β    0.990        0.9899           0.0004   2212.18   Beta       0.002
δ    0.025        0.0259           0.0007   38.445    Beta       0.003
ρ    0.900        0.9573           0.0141   67.821    Beta       0.050
σ    0.010        0.01             0.0004   22.417    invGamma   Inf
Given other estimated parameter values in the table, the calibrated χ is
equal to 1.757. Comparing the above two tables, we can see that calibrating
χ to target steady-state hours N = 1/3 gives better estimates of the
parameters.
Dynare also presents a second table which displays estimation results
based on MCMC. We will not show the table here. Dynare output shows
the following graphs:
• Priors.
• MCMC univariate diagnostics.
• Multivariate diagnostics.
• Priors & posteriors.
• Shocks implied at the posterior mode.
• Observables and corresponding implied values.

Here we only show the figure for priors and posteriors.

Figure 15.1: Priors and Posteriors (panels for SE_e, alpha, beta, delta, chi, and rho).
Stochastic Trends
Many DSGE models feature stochastic trends. In this case, one has to
detrend the model as discussed in the last chapter. For example, suppose
that the technology shock follows a unit root process:
ln z_t = μ_z + ln z_{t−1} + σ ε_t,  z_{−1} = 1.
After detrending, Dynare solves the detrended system. We then have to link
detrended variables to observables. There are two ways to do this. First, we
can transform the data in terms of growth rates. Second, we can directly
work with the data in levels. Here we shall focus on the first approach
and refer the reader to the User Guide for an introduction to the second
approach. Consider the first approach and suppose that we observe output
Yt. We then transform the data to the observable ln(Y_t/Y_{t−1}), denoted by
gy_obs. We want to link this observable to the Dynare solution, the detrended
output Ỹ_t = Y_t/z_t^{1/(1−α)}. We may add the following code to the Dynare
model block:

dz = muz + e;
gy_obs = ly - ly(-1) + (1/(1-alpha))*dz;
The rest of the code is similar and is omitted here.
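As a sanity check on this link (a Python sketch using made-up illustrative series, not Dynare output), one can verify the identity ln(Y_t/Y_{t−1}) = (ln Ỹ_t − ln Ỹ_{t−1}) + Δln z_t/(1 − α) directly:

```python
import random

# Verify gy_obs = (ly_t - ly_{t-1}) + dz_t/(1 - alpha) on made-up series
# (alpha, muz, sigma below are illustrative assumptions).
random.seed(0)
alpha, muz, sigma = 0.36, 0.005, 0.01
T = 50

lnz, ly = [0.0], [0.1]
for t in range(1, T):
    lnz.append(lnz[-1] + muz + sigma * random.gauss(0.0, 1.0))  # unit-root shock
    ly.append(0.1 + 0.01 * random.gauss(0.0, 1.0))  # stand-in for log detrended output

# level output: ln Y_t = ln Ytilde_t + ln z_t/(1 - alpha)
lnY = [ly[t] + lnz[t] / (1.0 - alpha) for t in range(T)]

for t in range(1, T):
    dz = lnz[t] - lnz[t - 1]
    gy_obs = ly[t] - ly[t - 1] + dz / (1.0 - alpha)
    assert abs(gy_obs - (lnY[t] - lnY[t - 1])) < 1e-12
```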
15.4 Exercises
1. Derive the Kalman filter in Section 15.2.2 using the result in Section
10.1.1.
2. Consider the example in Section 15.3. Use the parameter values in
(15.6) to solve the model by linear approximation in levels. Generate
artificial data for consumption, and use these consumption data to
estimate the model by the Bayesian method.
3. Use the basic RBC model with a stochastic trend to generate artificial
data. Suppose the observable is output. Use Dynare to estimate this
model.
Chapter 16
Overlapping-Generations Models
An overlapping generations (OLG) model is a type of economic model in
which agents live a finite length of time and people from different generations
coexist at any point in time. In each period, some old generations die
and a new generation is born. This induces a natural heterogeneity across
individuals at a point in time, as well as nontrivial life-cycle considerations
for a given individual across time. These features of the model generate
different implications from models with infinitely-lived agents studied in
Chapters 13 and 14. In particular, competitive equilibria in the OLG model
may not be Pareto optimal. A natural question is how to remedy this
inefficiency. One solution is to rely on a government. Thus, the OLG model
is a workhorse in public finance. The other solution is to use a market
mechanism. If agents accept an intrinsically useless asset, e.g., fiat money, to
transfer resources across generations, then welfare can be improved. Thus,
the OLG model has a role for fiat money and can be used to address a
variety of substantive issues in monetary economics. Because an intrinsically
useless asset can be valued in an OLG model, it is also a workhorse model
for studying asset bubbles.
Samuelson (1958) introduces the OLG model in a pure-exchange econ-
omy. Building on this model, Diamond (1965) introduces production and
studies the role of national debt. Tirole (1985) uses the Diamond (1965)
model to study asset bubbles. Weil (1987) introduces stochastic bubbles
in the Samuelson-Diamond-Tirole model. In this chapter, we study these
models.
16.1 Exchange Economies
Time starts at date 1, lasts forever, and is denoted by t = 1, 2, .... There is no
uncertainty. The economy consists of overlapping generations of identical,
two-period lived agents. At any date t ≥ 1, the population consists of Nt
young agents and Nt−1 old agents, where Nt = gnNt−1, gn > 0. Output is
nonproduced and perishable. Each young agent and each old agent is endowed
with e_1 and e_2 units of output, respectively.
An agent of generation t ≥ 1 consumes c_t^t when young and c_{t+1}^t when
old. His preferences are represented by the utility function

u(c_t^t) + β u(c_{t+1}^t),
where β ∈ (0, 1], u′ > 0, u′′ < 0, and lim_{c→0} u′(c) = ∞. An initial old
agent at time 1 consumes c_1^0 and derives utility according to u(c_1^0). He
is also endowed with m_0^0 units of an intrinsically useless and unbacked
asset, called fiat money. Money is the only asset used by agents to smooth
consumption. The beginning-of-period aggregate nominal money balances in the
initial period 1 are M_0 = N_0 m_0^0. A monetary authority supplies money by
lump-sum transfers. For all t ≥ 1, the post-transfer time-t stock of money
M_t satisfies M_t = g_m M_{t−1}, g_m > 0. The time-t transfer (or tax),
(g_m − 1)M_{t−1}, is divided equally among the N_{t−1} members of the
current old generation.
The transfer (or tax) is taken as given and does not affect each agent’s
consumption/saving decision.
By buying m_t^t units of money at the price 1/p_t, a young agent acquires a
claim to m_t^t/p_{t+1} units of the consumption good next period. A young
agent will
buy money only if he believes that he will be able to resell it at a positive
price to the yet unborn young of the coming generation. The purchase of
money in this model is thus purely speculative, in Kaldor’s (1939) definition
of the term, “buying in order to resell.” The budget constraints of a young
agent of generation t ≥ 1 are given by

c_t^t + m_t^t/p_t ≤ e_1,  (16.1)

c_{t+1}^t ≤ e_2 + m_t^t/p_{t+1} + (g_m − 1)M_t/(N_t p_{t+1}),  (16.2)
where m_t^t ≥ 0, and p_t > 0 is the time-t price level. The budget constraint
of an initial old agent at date 1 is given by

c_1^0 ≤ e_2 + m_0^0/p_1 + (g_m − 1)M_0/(N_0 p_1).
A monetary equilibrium consists of sequences of finite and positive
prices {p_t} and allocations {c_1^0, (c_t^t, c_{t+1}^t, m_t^t), t ≥ 1} such
that, given the price sequence, the allocations maximize the utility of
agents in each generation and markets clear, so that M_t = N_t m_t^t and

N_t c_t^t + N_{t−1} c_t^{t−1} = N_t e_1 + N_{t−1} e_2,  t ≥ 1.  (16.3)
If p_t = ∞, then the real value of money is zero. No one except the initial
old will hold money: m_t^t = 0. It follows that autarky is an equilibrium.
We call this equilibrium a nonmonetary equilibrium.
In a monetary equilibrium, the consumption Euler equation gives

u′(c_t^t) / (β u′(c_{t+1}^t)) = p_t/p_{t+1} = R_{t+1},  (16.4)

where R_{t+1} is the gross interest rate or the return on money. We can also
define an implicit interest rate in the nonmonetary equilibrium as

u′(e_1) / (β u′(e_2)) = R_aut.  (16.5)
Let m̄_t = M_t/(N_t p_t) denote the equilibrium real money balances of a
young agent at time t. Imposing the market-clearing conditions in (16.1) and
(16.2) yields c_t^t = e_1 − m̄_t and c_{t+1}^t = e_2 + g_n m̄_{t+1}. In a
monetary equilibrium m̄_t > 0. It follows from (16.4) and (16.5) that

u′(e_1 − m̄_t) / (β u′(e_2 + g_n m̄_{t+1})) = R_{t+1} > R_aut.  (16.6)
A Special Case and Multiple Equilibria
Consider the special case in which there is no population growth and no
money supply growth. We use this special case to illustrate some peculiar
features of the OLG model. Suppose a unit measure of identical agents is
born each period. The money supply is constant at M.
Define the excess supply of goods (saving, or real balances) when young
as

m_t ≡ m_t^t/p_t = e_1 − c_t^t,

and the excess demand for goods when old as

z_t ≡ m_t^t/p_{t+1} = c_{t+1}^t − e_2.
Note that saving m_t is a function of the relative price, or the interest
rate, p_t/p_{t+1} = R_{t+1}. Since

z_t = M_t/p_{t+1} = (M_t/p_t)(p_t/p_{t+1}) = m_t p_t/p_{t+1},

we then obtain a relationship between z_t and m_t as p_t/p_{t+1} varies.
Figure 16.1 plots this relationship, which is called the offer curve. The
slope of the line from the origin to any point on the offer curve is equal
to p_t/p_{t+1}.
In addition, the offer curve must lie to the left of the vertical line at e_1.
The slope of the offer curve is important for the analysis. To derive the
slope, we use the Euler equation

u′(e_1 − m_t) = β u′(m_t R_{t+1} + e_2) R_{t+1},
Figure 16.1: Offer curve.
to derive

dm_t/dR_{t+1} = −[R_{t+1} m_t β u′′(c_{t+1}^t) + β u′(c_{t+1}^t)] / [u′′(c_t^t) + β R_{t+1}² u′′(c_{t+1}^t)].
Using z_t = m_t R_{t+1},

dz_t/dR_{t+1} = [m_t u′′(c_t^t) − β u′(c_{t+1}^t) R_{t+1}] / [u′′(c_t^t) + β R_{t+1}² u′′(c_{t+1}^t)] > 0.
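These slope formulas can be checked numerically. The following Python sketch assumes CRRA utility u(c) = c^{1−γ}/(1−γ) and illustrative values γ = 3, β = 1, e_1 = 3, e_2 = 1 (all assumptions, not part of the text): it solves the Euler equation for m_t by bisection and compares the analytic dm_t/dR_{t+1} with a finite difference. With these numbers (z_t/c_{t+1}^t)γ = 1.5 > 1, so saving falls as the interest rate rises:

```python
def m_of_R(R, gamma=3.0, beta=1.0, e1=3.0, e2=1.0):
    """Solve u'(e1 - m) = beta*R*u'(m*R + e2) for saving m by bisection."""
    g = lambda m: (e1 - m) ** (-gamma) - beta * R * (m * R + e2) ** (-gamma)
    lo, hi = 1e-9, e1 - 1e-9          # g is increasing in m on this interval
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma, beta, e1, e2, R = 3.0, 1.0, 3.0, 1.0, 1.0
m = m_of_R(R)                                  # here m = 1, so cy = co = 2
up = lambda c: c ** (-gamma)                   # u'
upp = lambda c: -gamma * c ** (-gamma - 1.0)   # u''
cy, co = e1 - m, m * R + e2
dm_analytic = -(R * m * beta * upp(co) + beta * up(co)) / (upp(cy) + beta * R**2 * upp(co))
dm_numeric = (m_of_R(R + 1e-5) - m_of_R(R - 1e-5)) / 2e-5
```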
Thus, the shape of the offer curve depends on the sign of dmt/dRt+1.
An increase in the interest rate decreases the price of second-period
consumption, leading agents to shift consumption from the first period to the
second period; this is the substitution effect. But it also expands the
feasible consumption set, making it possible to increase consumption in both
periods; this is the income effect. The net effect of these substitution and
income effects is ambiguous.
More formally, since u′′(c_t^t) + β R_{t+1}² u′′(c_{t+1}^t) < 0, we have
dm_t/dR_{t+1} < 0 if and only if

0 < −[R_{t+1} m_t β u′′(c_{t+1}^t) + β u′(c_{t+1}^t)] = β u′(c_{t+1}^t) [ (z_t/c_{t+1}^t) γ(c_{t+1}^t) − 1 ],
Figure 16.2: One stable nonmonetary steady state and one unstable monetary steady state.
where γ(c) = −c u′′(c)/u′(c) is the coefficient of relative risk aversion at
c. It is also equal to the inverse of the elasticity of substitution. Since
z_t < c_{t+1}^t, a necessary condition for the offer curve to be backward
bending is that γ(c_{t+1}^t) > 1. That is, the substitution effect must be
sufficiently small. If γ(c_{t+1}^t) ≤ 1, the offer curve is always upward
sloping.
In equilibrium, the demand for money is equal to the supply of money,
so that z_t = m_{t+1}. We then obtain the equilibrium system represented by
the dynamic relation between m_{t+1} and m_t, as illustrated in Figure 16.2.
The steady state of the economy is at the intersection of the offer curve
and the 45° line. Figure 16.2 shows that there are two steady states, one
nonmonetary at the origin and one monetary.
Whether there is a monetary equilibrium depends on the slope of the offer
curve at the origin. If the slope is less than one, a monetary equilibrium
exists. Proposition 16.1.1 below shows that this condition is necessary and
sufficient for a more general setup. A monetary steady state is unstable if
and only if |dmt+1/dmt| > 1. Figure 16.2 shows that at the monetary steady
state m∗, dm_{t+1}/dm_t > 1.

Figure 16.3: A locally stable monetary steady state.

Paths starting to the right of m∗ increase until
eventually mt becomes larger than e1, which is impossible. We can rule out
this solution. Paths starting to the left have steadily decreasing real money
balances, and all converge asymptotically to the nonmonetary steady state.
The rate of inflation increases and tends asymptotically to the inverse of the
slope of the offer curve at the origin minus one.
When the offer curve is sufficiently backward bending, we have
|dm_{t+1}/dm_t| < 1 and the monetary equilibrium is locally stable, as shown
in Figure 16.3.
In this case, the economy eventually converges to the monetary steady state
starting from any price level in a neighborhood of m∗. The equilibrium is
indeterminate since mt is a nonpredetermined variable. As analyzed earlier,
this case happens when the substitution effect is sufficiently small and the
income effect is sufficiently large.
When the monetary steady state is locally stable, cyclical solutions can
exist. Figure 16.4 plots the offer curve from Figure 16.3 and its mirror
image around the 45° line. The two curves intersect at the steady state on
the diagonal. Given that the slope of the offer curve at the steady state is
less than one in absolute value, they also intersect at two other points, A
and B.

Figure 16.4: A two-period cycle equilibrium.
We show that the economy may have a two-period cycle. Suppose that
real balances are equal to m_a in period 1. Then real balances are equal to
m_b in period 2. But in period 3, real balances are equal to m_a again. The
economy then repeats itself as if it were in period 1.
Example 16.1.1 Let u(c) = ln c and β = 1. Assume e_1 > e_2. From the
consumption Euler equation, we obtain the difference equation

p_t e_1 − M = p_{t+1} e_2 + M.

The general solution to this difference equation is given by

p_t = 2M/(e_1 − e_2) + c (e_1/e_2)^t,
where c is a constant to be determined. If c = 0, we obtain a stationary
monetary equilibrium with a sequence of positive constant prices. For any
c > 0, the price level goes to infinity and hence money has no value in the
long run. The economy approaches the autarky steady state. We cannot
rule out this explosive solution because there is no transversality condition
for the finitely-lived agents.
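The explosive behavior for c > 0 can be seen by iterating the difference equation p_{t+1} = (p_t e_1 − 2M)/e_2 directly. A minimal Python sketch with illustrative numbers (e_1 = 2, e_2 = 1, M = 0.5, all assumptions):

```python
# Iterate p_{t+1} = (p_t*e1 - 2M)/e2 from Example 16.1.1.
# Illustrative numbers: stationary price p* = 2M/(e1 - e2) = 1.
e1, e2, M = 2.0, 1.0, 0.5
p_star = 2.0 * M / (e1 - e2)

def path(p1, T=20):
    p = [p1]
    for _ in range(T):
        p.append((p[-1] * e1 - 2.0 * M) / e2)
    return p

stationary = path(p_star)         # stays at p* forever: money keeps its value
explosive = path(p_star + 0.01)   # any c > 0 sends the price level to infinity
```

Along the stationary path money retains value; along any explosive path real balances vanish and the economy approaches autarky.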
Example 16.1.2 Let β = 1, e_1 = 1 − ε, and e_2 = ε ∈ (0, 0.5). Assume
that the initial old has no endowment of any asset or money. The monetary
authority supplies money M = 0.5 − ε for all t ≥ 1. We construct two
stationary equilibria using a guess-and-verify method.
• A high-interest-rate equilibrium. Set p_t = 1, c_t^t = c_{t+1}^t = 0.5,
m_t^t = 0.5 − ε for all t ≥ 1, and c_1^0 = 0.5. This is a monetary
equilibrium. The interest rate is R_{t+1} = 1.
• A low-interest-rate equilibrium. Let p_1 = 1 and p_t/p_{t+1} =
u′(1 − ε)/u′(ε) < 1. The price level will approach infinity, and hence
money is worthless in the limit. Based on this belief, no one will hold
money. Given this price system, autarky is an equilibrium. In this
equilibrium, the present value of the aggregate endowment is infinite
because the interest rate R_aut = p_t/p_{t+1} < 1. Thus, an assumption
of the first fundamental theorem of welfare economics is violated. The
autarky equilibrium is not Pareto optimal.
Existence and Efficiency
Now, we study two related theoretical questions in the general setup. Under
what conditions does a monetary equilibrium exist? When it exists, under
what conditions does it improve welfare? The exposition below follows
Wallace (1980) and Ljungqvist and Sargent (2004).
Proposition 16.1.1 The condition Rautgm < gn is necessary and sufficient
for the existence of at least one monetary equilibrium.
Proof. We first prove necessity. Suppose to the contrary that there is
a monetary equilibrium with Rautgm ≥ gn. By definition,
R_{t+1} = p_t/p_{t+1} = [M_{t+1}/(g_m M_t)](p_t/p_{t+1}) = [N_{t+1}/(g_m N_t)](m̄_{t+1}/m̄_t) = (g_n/g_m)(m̄_{t+1}/m̄_t).  (16.7)
Since R_{t+1} > R_aut,

m̄_{t+1}/m̄_t = R_{t+1} (g_m/g_n) > R_aut (g_m/g_n) ≥ 1.  (16.8)
If R_aut g_m/g_n > 1, the growth rate of m̄_t is bounded uniformly above one.
Hence, the sequence {m̄_t} is unbounded, which is inconsistent with an
equilibrium because real money balances per capita cannot exceed the
endowment e_1 of a young agent.
If R_aut g_m/g_n = 1, the strictly increasing sequence {m̄_t} converges to
some limit m̄_∞. If this limit is finite, then taking limits in (16.6) and
(16.7) yields

u′(e_1 − m̄_∞) / (β u′(e_2 + g_n m̄_∞)) = R_∞ = g_n/g_m = R_aut.

Thus, m̄_∞ = 0, contradicting the existence of a strictly increasing sequence
of positive real balances in (16.8).
To show sufficiency, we prove the existence of a unique equilibrium with
constant per-capita real money balances when R_aut g_m < g_n. Substituting
the candidate equilibrium, m̄_{t+1} = m̄_t = m̄, into (16.6) and (16.7) yields

u′(e_1 − m̄) / (β u′(e_2 + g_n m̄)) = g_n/g_m > R_aut = u′(e_1)/(β u′(e_2)).

Thus, there exists a unique solution m̄ > 0 such that the first equality holds.
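For log utility this stationary condition has a closed form, and the existence condition R_aut g_m < g_n is exactly the condition for the solution to be positive. A Python sketch with illustrative numbers (all parameter values are assumptions):

```python
# With u(c) = ln c, the stationary condition (e2 + gn*m)/(beta*(e1 - m)) = gn/gm
# gives m = (beta*gn*e1 - gm*e2) / (gn*(gm + beta)).  Illustrative numbers:
beta, e1, e2, gn, gm = 0.95, 2.0, 0.5, 1.05, 1.0

R_aut = e2 / (beta * e1)    # u'(e1)/(beta*u'(e2)) for log utility
m = (beta * gn * e1 - gm * e2) / (gn * (gm + beta))
```

Note that m > 0 if and only if β g_n e_1 > g_m e_2, which is exactly R_aut g_m < g_n.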
Next, we turn to Pareto optimality.
Definition 16.1.1 An allocation C = {c_1^0, (c_t^t, c_{t+1}^t), t ≥ 1} for
the OLG model is feasible if (16.3) holds or, equivalently, if

g_n c_t^t + c_t^{t−1} = g_n e_1 + e_2,  t ≥ 1.  (16.9)
Definition 16.1.2 A feasible allocation C for the OLG model is Pareto
optimal if there is no other feasible allocation Ĉ = {ĉ_1^0, (ĉ_t^t,
ĉ_{t+1}^t), t ≥ 1} such that

ĉ_1^0 ≥ c_1^0,
u(ĉ_t^t) + β u(ĉ_{t+1}^t) ≥ u(c_t^t) + β u(c_{t+1}^t),

and at least one of these weak inequalities holds with strict inequality.
Definition 16.1.3 An allocation {c_1^0, (c_t^t, c_{t+1}^t), t ≥ 1} for the
OLG model is stationary if c_t^t = c_y and c_{t+1}^t = c_o for all t ≥ 1.
Note that we do not require that c_1^0 = c_o. We call an equilibrium with a
stationary allocation a stationary equilibrium.
Proposition 16.1.2 R_aut ≥ g_n is necessary and sufficient for the
optimality of the nonmonetary equilibrium (autarky).
Proof. To prove sufficiency, suppose to the contrary that R_aut ≥ g_n and
there exists another feasible allocation C that is Pareto superior to
autarky. Let t be the first period in which this alternative allocation C
differs from the autarkic allocation. The requirement that the old
generation in this period not be made worse off, c_t^{t−1} ≥ e_2, implies
that the first perturbation from the autarkic allocation must be c_t^t <
e_1, with the subsequent implication that c_{t+1}^t > e_2. It follows from
feasibility that the consumption of young agents at time t + 1 must also
fall below e_1, so that

ε_{t+1} ≡ e_1 − c_{t+1}^{t+1} > 0.  (16.10)
Now, given c_{t+1}^{t+1}, define c̄_{t+2}^{t+1} as the solution to the
equation

u(c_{t+1}^{t+1}) + β u(c̄_{t+2}^{t+1}) = u(e_1) + β u(e_2).

Clearly, c̄_{t+2}^{t+1} > e_2. Since the allocation C is Pareto superior to
autarky, we have c_{t+2}^{t+1} ≥ c̄_{t+2}^{t+1}.
We now derive an expression for c̄_{t+2}^{t+1}. Consider the indifference
curve of u(c_1) + β u(c_2) that yields a fixed utility equal to u(e_1) +
β u(e_2). Along this indifference curve, c_2 = h(c_1), where h′ =
−u′(c_1)/(β u′(c_2)) = −R < 0 and h′′ > 0. Thus, applying the intermediate
value theorem to h, we have

h(c_1) = h(e_1) + (e_1 − c_1)[−h′(e_1) + f(e_1 − c_1)],  (16.11)

where the function f is strictly increasing on [0, e_1] with f(0) = 0.

Now, since (c_{t+1}^{t+1}, c̄_{t+2}^{t+1}) and (e_1, e_2) are on the same
indifference curve, we can use (16.10) and (16.11) to write

c̄_{t+2}^{t+1} = e_2 + ε_{t+1}[R_aut + f(ε_{t+1})],

and after invoking c_{t+2}^{t+1} ≥ c̄_{t+2}^{t+1}, we have

c_{t+2}^{t+1} − e_2 ≥ ε_{t+1}[R_aut + f(ε_{t+1})].
Since C is feasible,

ε_{t+2} ≡ e_1 − c_{t+2}^{t+2} = (c_{t+2}^{t+1} − e_2)/g_n.

It follows that

ε_{t+2} ≥ ε_{t+1} [R_aut + f(ε_{t+1})]/g_n > ε_{t+1},

where the last inequality follows from the assumption R_aut ≥ g_n and
f(ε_{t+1}) > 0 for ε_{t+1} > 0. Continuing in this fashion yields

ε_{t+k} ≡ e_1 − c_{t+k}^{t+k} ≥ ε_{t+1} ∏_{j=1}^{k−1} [R_aut + f(ε_{t+j})]/g_n > ε_{t+1} { [R_aut + f(ε_{t+1})]/g_n }^{k−1},

for k > 2, where the last inequality follows from the fact that {ε_{t+j}} is
a strictly increasing sequence. Thus {ε_{t+k}} is unbounded, contradicting
the fact that ε_{t+k} < e_1.
To prove necessity, we show the existence of an alternative feasible
allocation that is Pareto superior to autarky when R_aut < g_n. Imagine that
a social planner transfers ε > 0 units of each young agent's consumption to
the old in the same period. Because there are N_t = g_n N_{t−1} young agents
and N_{t−1} old agents in period t, each old agent receives a transfer g_n ε.
Suppose that the social planner keeps doing this in every period. Then the
initial old agents are better off. Consider the utility of an agent in
generation t, f(ε) ≡ u(e_1 − ε) + β u(e_2 + g_n ε). Clearly, f(0) = u(e_1) +
β u(e_2). But f′(0) = −u′(e_1) + β g_n u′(e_2) > 0 as long as g_n > R_aut.
Thus, each agent in generation t is better off for small ε > 0.
What is the intuition behind the condition R_aut ≥ g_n? If R_aut < g_n,
then the interest rate is less than the growth rate of the economy, and hence
the present value of endowments is infinite. This violates an assumption
of the first welfare theorem. Note that in the above proof of the necessity
part, the construction of a Pareto improving allocation relies on the fact
that a planner can continually transfer resources from the young to the old
indefinitely. Because this inefficiency comes from the intertemporal structure
of the economy, it is often called dynamic inefficiency.
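The necessity argument can be checked numerically. The sketch below assumes log utility and illustrative endowments (all values are assumptions) with R_aut < g_n, and verifies that the constant transfer scheme raises every generation's lifetime utility for small ε:

```python
import math

# Constant transfer scheme from the necessity proof: take eps from each young
# agent and give gn*eps to each old agent (log utility, illustrative numbers).
beta, e1, e2, gn = 0.9, 2.0, 0.5, 1.1

R_aut = e2 / (beta * e1)    # u'(e1)/(beta*u'(e2)) with u = log

def lifetime(eps):
    """Lifetime utility of a generation under the transfer scheme."""
    return math.log(e1 - eps) + beta * math.log(e2 + gn * eps)
```

Since f′(0) = −u′(e_1) + β g_n u′(e_2) > 0 when g_n > R_aut, a small ε > 0 is a Pareto improvement, and the initial old gain trivially from the extra transfer.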
With a constant nominal money supply, gm = 1, the previous two propo-
sitions show that a monetary equilibrium exists if and only if the nonmone-
tary equilibrium is Pareto inefficient. In that case, the following proposition
shows that the stationary monetary equilibrium is Pareto optimal.
Proposition 16.1.3 Given Rautgm < gn, then gm ≤ 1 is necessary and
sufficient for the optimality of the stationary monetary equilibrium.
Proof. The class of feasible stationary allocations with (c_t^t, c_{t+1}^t)
= (c_1, c_2) for all t ≥ 1 is given by

c_1 + c_2/g_n = e_1 + e_2/g_n.  (16.12)

Consider the stationary monetary equilibrium with allocation (c_1, c_2) and
constant per capita real balances m̄. We can rewrite the budget constraints (16.1)
and (16.2) as

c_1 + (g_m/g_n) c_2 ≤ e_1 + (g_m/g_n) e_2 + (g_m/g_n)(g_m − 1)M_t/(N_t p_{t+1}),

where we have used the fact that p_t/p_{t+1} = g_n/g_m. It follows that

c_1 + (g_m/g_n) c_2 ≤ e_1 + (g_m/g_n) e_2 + (g_m − 1) m̄.  (16.13)
To prove necessity, Figure 16.5 plots the two curves (16.12) and (16.13)
when the condition g_m ≤ 1 fails to hold, i.e., when g_m > 1. The point that
maximizes utility subject to (16.12) is denoted (ĉ_1, ĉ_2). Transitivity of
preferences and the fact that the slope of the budget line (16.13) is flatter
than that of (16.12) imply that (c_1, c_2) lies southeast of (ĉ_1, ĉ_2). By
revealed preference, (ĉ_1, ĉ_2) is preferred to (c_1, c_2), and all
generations born in period t ≥ 1 are better off under the allocation Ĉ. The
initial old generation can also be made better off under this alternative
allocation, since it is feasible to strictly increase their consumption:

ĉ_1^0 = e_2 + g_n(e_1 − ĉ_1^1) > e_2 + g_n(e_1 − c_1^1) = c_1^0.

This proves necessity.
To prove sufficiency, note that the consumption Euler equation, (16.7),
and g_m ≤ 1 imply that

u′(c_1) / (β u′(c_2)) = R_{t+1} = g_n/g_m ≥ g_n.

We can then use the same argument as in the proof of the previous
proposition.
For the case of constant population, Balasko and Shell (1980) establish
a convenient general criterion for testing whether allocations are optimal.
Figure 16.5: Feasibility and budget lines.

Define the interest rate R_{t+1} = u′(c_t^t)/(β u′(c_{t+1}^t)). Then Balasko
and Shell show that an allocation {c_1^0, (c_t^t, c_{t+1}^t), t ≥ 1} is
Pareto optimal if and only if

∑_{t=1}^∞ ∏_{s=1}^t R_{s+1} = +∞.

Normalize p_1 = 1. It follows from R_{t+1} = p_t/p_{t+1} that the above
condition is equivalent to

∑_{t=1}^∞ 1/p_t = +∞.
The Balasko-Shell criterion for optimality shows that low-interest-rate economies
are typically not optimal.
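The criterion can be illustrated with the two price paths from Example 16.1.1 (the numbers e_1 = 2, e_2 = 1, M = 0.5, c = 0.01 below are illustrative assumptions): along the stationary path the partial sums of 1/p_t grow without bound, so the allocation is Pareto optimal, while along an explosive path 1/p_t decays geometrically and the sum converges:

```python
# Check the Balasko-Shell sum of 1/p_t on the two price paths of Example 16.1.1.
e1, e2, M = 2.0, 1.0, 0.5
p_star = 2.0 * M / (e1 - e2)     # stationary price level, here 1

def partial_sum(c, T):
    """Partial sum of 1/p_t along p_t = p* + c*(e1/e2)**t."""
    return sum(1.0 / (p_star + c * (e1 / e2) ** t) for t in range(1, T + 1))

s_stationary = partial_sum(0.0, 500)    # grows linearly in T: the sum diverges
s_explosive = partial_sum(0.01, 500)    # converges: low-interest-rate economy
```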
The proof of the Balasko and Shell (1980) result is quite involved. Here
we develop some intuition informally. Take the autarkic allocation e_1^0,
{(e_t^t, e_{t+1}^t), t ≥ 1}, and try to construct a Pareto improvement. In
particular, take δ_0 > 0 units of consumption from generation 1 and give
them to the initial old generation. This obviously improves the initial
old's utility. To make agents in generation 1 no worse off, they have to
receive δ_1 in additional consumption in their second period of life, with
δ_1 satisfying

δ_0 u′(e_1^1) = δ_1 β u′(e_2^1),
or

δ_1 = δ_0 u′(e_1^1)/(β u′(e_2^1)) = δ_0 R_2 > 0.
In general, the amount

δ_t = δ_0 ∏_{s=1}^t R_{s+1}

is the required transfer to an agent in generation t when old, to compensate
for the reduction of his consumption when young. Obviously, such a scheme
does not work if the economy ends at a finite time T since the last generation
(that lives only through youth) is worse off. But since our economy has an
infinite horizon, such an intergenerational transfer scheme is feasible pro-
vided that the δt’s do not grow too fast, i.e. if interest rates are sufficiently
small. If such a transfer scheme is feasible, then we have found a Pareto
improvement over the original autarkic allocation, and hence the autarkic
equilibrium allocation is not Pareto efficient.
The Balasko and Shell (1980) result can be generalized to the case with
production and with population growth, as will be seen in the next section.
16.2 Production Economies
We now introduce production to the economy studied in the previous section.
This is the model analyzed by Diamond (1965). The production technology
is represented by a constant-returns-to-scale aggregate production function,
F (K,N). Assume that F satisfies the assumptions given in Section 14.1.1.
Define f(k) = F(k, 1), where k = K/N. Then f satisfies f′ > 0, f′′ < 0,
f(0) = 0, f′(0) = ∞, and f′(∞) = 0.
The initial old agents are endowed with K_1 units of capital. Agents in
other generations do not have any endowment and work only when young,
supplying one unit of labor inelastically and earning a real wage w_t.
They consume part of their wage income and save the rest to finance their
retirement consumption when old. The saving of the young in period t
generates the capital stock that is used to produce output in period t + 1
in combination with the labor supplied by the young generation of period
t+ 1. Assume that there is no capital depreciation.
A young agent at date t solves the following problem:

max u(c_t^t) + β u(c_{t+1}^t),

subject to

c_t^t + s_t = w_t,
c_{t+1}^t = R_{t+1} s_t,

where s_t is saving in period t and R_{t+1} is the gross interest rate paid
on saving held from period t to period t + 1. The consumption Euler equation
gives

u′(w_t − s_t) = β u′(R_{t+1} s_t) R_{t+1}.

Solving yields a saving function:

s_t = s(w_t, R_{t+1}),  0 < s_w < 1.
Saving is an increasing function of the wage rate by the concavity of u.
The effect of an increase in the interest rate is ambiguous depending on
the income and substitution effects. As shown in Section 16.1, if the
coefficient of relative risk aversion is less than one, then the substitution
effect dominates, and an increase in the interest rate leads to an increase
in saving.
Firms act competitively, hiring labor and renting capital according to
the conditions

f(k_t) − k_t f′(k_t) = w_t,  (16.14)
1 + f′(k_t) = R_t,  (16.15)
where kt is the capital-labor ratio, t ≥ 1.
Capital market equilibrium implies that

K_{t+1} = N_t s(w_t, R_{t+1}).

Factor market equilibrium implies that K_t = k_t N_t. We then obtain

g_n k_{t+1} = s(w_t, R_{t+1}).

Substituting (16.14) and (16.15) into the above equation yields

k_{t+1} = s(f(k_t) − k_t f′(k_t), f′(k_{t+1}) + 1)/g_n,  k_1 given.  (16.16)
This equation characterizes the equilibrium of the Diamond model.
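For a concrete special case (an illustrative assumption, not the general model), log utility makes saving independent of the interest rate, s = βw/(1 + β), and with f(k) = k^α the map (16.16) becomes k_{t+1} = β(1 − α)k_t^α/(g_n(1 + β)), which converges monotonically to a unique positive steady state:

```python
# Diamond map (16.16) under log utility and f(k) = k**alpha (illustrative):
# saving is s = beta*w/(1 + beta), so k_{t+1} = beta*(1-alpha)*k_t**alpha/(gn*(1+beta)).
alpha, beta, gn = 0.3, 0.9, 1.1

def G(k):
    w = (1.0 - alpha) * k**alpha       # wage, eq. (16.14)
    return beta * w / ((1.0 + beta) * gn)

k_star = (beta * (1.0 - alpha) / (gn * (1.0 + beta))) ** (1.0 / (1.0 - alpha))

k = 0.01
for _ in range(200):                   # monotone convergence to k_star
    k = G(k)
```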
Multiple Equilibria
The dynamics of k_t depend on the slope

dk_{t+1}/dk_t = −s_w k_t f′′(k_t) / [g_n − s_R f′′(k_{t+1})].
Note that the numerator of this expression is positive because an increase
in the capital stock in period t increases the wage, which increases saving.
The sign of the denominator is ambiguous because the effects of an increase
in the interest rate on saving are ambiguous.
If sR > 0, then the denominator is positive. We can apply the implicit
function theorem to solve for kt+1 as a function of kt, kt+1 = G (kt) , where
G is an increasing function. We can show that G (0) = 0 and
lim_{k→∞} G(k)/k = 0.

The limit follows from the fact that

g_n k_{t+1}/k_t ≤ w_t/k_t  ⇒  lim_{k→∞} g_n G(k)/k ≤ lim_{k_t→∞} [f(k_t)/k_t − f′(k_t)] = 0.
Figure 16.6: Three positive steady states and G′(0) > 1.
As a result, kt+1/kt = G (kt) /kt < 1 for kt sufficiently large, implying that
the graph of G eventually falls below the diagonal.
The function G passes through the origin and lies below the 45° line for
large k. Whether additional steady states with k > 0 exist depends on the
slope of G at the origin. If G′ (0) > 1, then an odd number of positive steady
states exist, as illustrated in Figure 16.6. If G′ (0) < 1, then an even number
of positive steady states exist, possibly zero, as illustrated in Figures 16.7
and 16.8.
If sR < 0, then (16.16) may not give a unique solution for kt+1 given kt.
Figure 16.9 plots an example in which kt+1 is not uniquely determined by kt
when kt is between ka and kb. The intuition is the following: If agents believe
that kt+1 is high, then the interest rate is low and hence saving is high. This
can support the agents’ belief about high kt+1. On the other hand, if agents
believe that kt+1 is low, then the interest rate is high and hence saving is low.
This can also support the agents' belief about low k_{t+1}. Thus, both paths
can be equilibria. This is an example of self-fulfilling prophecies.
The economy can exhibit fluctuations even though there are no exogenous
shocks. Depending on agents' beliefs, various dynamics are possible.

Figure 16.7: Two positive steady states and G′(0) < 1.

Figure 16.8: No positive steady states and G′(0) < 1.

Figure 16.9: Indeterminacy.
The condition for the stability of a positive steady state k_{t+1} = k_t =
k∗ > 0 is given by

G′(k∗) = |−s_w k∗ f′′(k∗) / [g_n − s_R f′′(k∗)]| < 1.

We call the following conditions the Diamond conditions: (i) s_R > 0, (ii)
G′(0) > 1, and (iii)

0 < G′(k∗) = −s_w k∗ f′′(k∗) / [g_n − s_R f′′(k∗)] < 1.
Figure 16.10 plots the dynamics under these conditions. In this case, there
is only one positive steady state, k_D, which we call the Diamond steady state.
Dynamic Efficiency
Is the decentralized market equilibrium efficient? To answer this question,
we write down the aggregate resource constraint

K_t + F(K_t, N_t) = K_{t+1} + N_t c_t^t + N_{t−1} c_t^{t−1},  t ≥ 1.
Figure 16.10: A unique positive and stable steady state.
An allocation $(c_1^0, \{c_t^t, c_{t+1}^t, k_{t+1}\}_{t \ge 1})$ is feasible if it satisfies the above
resource constraint. Dividing the resource constraint by the number of workers N_t yields
$$k_t + f(k_t) = g_n k_{t+1} + c_t^t + g_n^{-1} c_t^{t-1}.$$
In the steady state, k_t = k, c_t^t = c^y, and c_{t+1}^t = c^o for all t ≥ 2. We then
rewrite the resource constraint as
$$c^y + g_n^{-1} c^o = f(k) - (g_n - 1)k. \quad (16.17)$$
Consider a social planner that solves the problem in the steady state:
$$\max_{c^y, c^o, k} \; \lambda_0 u(c_1^0) + \lambda_1 \left[u(c^y) + \beta u(c^o)\right]$$
subject to (16.17) and
$$c^y + g_n^{-1} c_1^0 = f(k_1) + k_1 - g_n k,$$
where λ_0 is the weight on the initial old and λ_1 is the weight on the other
generations. Let μ and θ be the Lagrange multipliers associated with the
two constraints, respectively. We may replace these two constraints with
inequality constraints. The first-order conditions are given by
$$\frac{u'(c^y)}{\beta u'(c^o)} = g_n,$$
$$\lambda_0 u'(c_1^0) = \theta g_n^{-1},$$
$$f'(k) = g_n - 1 + \frac{\theta}{\mu} \ge g_n - 1.$$
When the planner does not consider the initial old, λ_0 = 0, so that θ = 0. We
then obtain the Golden Rule level of capital stock k_G such that f'(k_G) =
g_n − 1. At this level, consumption per worker c^y + g_n^{-1} c^o is maximized and
the marginal product of capital or the (net) interest rate is equal to the (net)
population growth rate. When the marginal product of capital is smaller
than the population growth rate, the economy accumulates too much capital
k∗ > kG and hence it is dynamically inefficient. A planner can reduce saving
from k^* to k_G and make everyone better off.
We have shown that the steady-state capital stock k∗ ≤ kG is necessary
for the allocation to be dynamically efficient. Is it a sufficient condition?
Cass (1972) and Balasko and Shell (1980) show that a feasible allocation is
Pareto optimal (or dynamically efficient) if and only if
$$\sum_{t=1}^{\infty} (g_n)^{-t} \prod_{s=1}^{t} R_{s+1} = +\infty,$$
where R_t = 1 + f'(k_t). An immediate corollary is that a steady-state equilibrium is Pareto optimal (or dynamically efficient) if and only if k^* ≤ k_G
or f'(k^*) ≥ g_n − 1.
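As a numerical check of this corollary, the sketch below computes the Diamond steady state and the Golden Rule capital stock for the log-utility, Cobb-Douglas special case and tests whether f'(k*) ≥ g_n − 1; the parameter values are illustrative assumptions, not taken from the text.

```python
# f(k) = k^alpha with zero depreciation, u(c) = ln c; illustrative parameters
alpha, beta, gn = 0.3, 0.9, 1.02

# Diamond steady state (closed form in this special case)
k_star = (beta * (1 - alpha) / ((1 + beta) * gn))**(1 / (1 - alpha))
# Golden Rule stock: f'(k_G) = alpha * k_G^(alpha - 1) = g_n - 1
k_G = (alpha / (gn - 1))**(1 / (1 - alpha))

# Dynamic efficiency holds iff k_star <= k_G, i.e. f'(k_star) >= g_n - 1
efficient = alpha * k_star**(alpha - 1) >= gn - 1
print(k_star, k_G, efficient)
```

With these numbers k* < k_G, so the equilibrium is dynamically efficient; making α small enough or g_n large enough reverses the inequality and produces overaccumulation.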
16.3 Asset Bubbles
Suppose that the initial old holds B1 government debts. To repay principal
and interest, the government sells new debt and keeps rolling over old debt.
492 CHAPTER 16. OVERLAPPING-GENERATIONS MODELS
Suppose that there are no taxes or government spending. Here the government
bond is an intrinsically useless and unbacked asset. We will show that this
asset may be valued and hence represents a bubble asset.
A young agent can use saving to buy the government bond or to accu-
mulate capital. Thus, the capital market-clearing condition becomes
$$N_t s_t = B_{t+1} + K_{t+1}.$$
Let b_t = B_t/N_t denote per capita bonds. Then the above condition becomes
$$g_n (k_{t+1} + b_{t+1}) = s(w_t, R_{t+1}). \quad (16.18)$$
By no arbitrage, the return on the government bond must equal the interest rate,
$$B_{t+1} = R_t B_t.$$
We require B_t > 0 for all t ≥ 1. In per capita terms,
$$g_n b_{t+1} = R_t b_t. \quad (16.19)$$
Substituting (16.14) and (16.15) into equations (16.18) and (16.19) yields a
system of two difference equations for two unknowns kt and bt. For simplicity,
we write Rt as R (kt) and wt as w (kt) for some functions w and R. Note
that kt is a predetermined state variable, while bt is nonpredetermined. An
equilibrium for this economy is the sequence (kt, bt)t≥1 that satisfies (16.18)
and (16.19), where Rt = R (kt) and wt = w (kt) . Clearly, if bt = 0, then
(16.19) is redundant and (16.18) reduces to (16.16). We then obtain the
Diamond equilibrium studied earlier.
We are interested in the bubbly equilibrium in which bt > 0 for all t.
In the steady state, we then have
$$g_n (b + k) = s(w(k), R(k)),$$
$$g_n = R(k) = f'(k) + 1.$$
Figure 16.11: Locus Δkt = kt+1 − kt = 0 and vector fields.
Thus, the bubbly steady state capital stock is equal to the Golden Rule
capital stock. The steady state bubble b∗ is then determined by the first
equation. To study the stability of this steady state (kG, b∗) , we use the
phase diagram.
Figure 16.11 plots the locus where
$$k_{t+1} - k_t = 0 = \frac{s(w(k_t), R(k_t)) - R(k_t) b_t}{g_n} - k_t.$$
Solving yields
$$b_t = \frac{s(w(k_t), R(k_t)) - g_n k_t}{R(k_t)}.$$
This curve crosses the horizontal axis at the origin and the Diamond steady
state kD. We can show that
∂b
∂k=−swkf ′′ (k) + sRf
′′ (k)− gnR2 (k)
, for k = 0 and k = kD.
Under the Diamond conditions, we can check that
$$\frac{db}{dk}(0) > 0 \quad \text{and} \quad \frac{db}{dk}(k_D) < 0.$$
Figure 16.12: Locus Δbt = bt+1 − bt = 0 and vector fields.
In addition, we can show that
$$k_{t+1} > k_t \iff \frac{s(w(k_t), R(k_{t+1})) - R(k_t) b_t}{g_n} - k_t > 0$$
$$\iff \frac{s(w(k_t), R(k_t)) - R(k_t) b_t}{g_n} - k_t > 0$$
$$\iff b_t < \frac{s(w(k_t), R(k_t)) - g_n k_t}{R(k_t)}.$$
Thus, vector fields are as in Figure 16.11.
Next, Figure 16.12 plots the locus where
$$b_{t+1} - b_t = \left(\frac{R_t}{g_n} - 1\right) b_t = 0.$$
This is a vertical line at the Golden Rule capital stock k_G. We can show that
$$b_{t+1} > b_t \iff \frac{R_t}{g_n} - 1 > 0 \iff k_t < k_G.$$
Now, we combine Figures 16.11 and 16.12 to obtain Figure 16.13.
This figure shows that for a bubbly steady state to exist, we must have kG <
kD or RD < gn. That is, the bubbleless equilibrium must be dynamically
inefficient. In this case, the bubbly steady state (k_G, b^*) is at the intersection
of the loci k_{t+1} = k_t and b_{t+1} = b_t.

Figure 16.13: Phase diagram.

Figure 16.13 shows that this steady
state is a saddle point. For any given initial value k1, there is a unique
b1 = b (k1) on the saddle path such that (kt, bt) moves along the saddle path
and converges to the bubbly steady state. However, if b1 < b (k1), then
(kt, bt) will converge to the Diamond steady state without a bubble.
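The bubbly steady state is easy to compute in the log-utility, Cobb-Douglas special case. The sketch below uses illustrative parameters deliberately chosen so that the Diamond equilibrium is dynamically inefficient (k_D > k_G), which the phase diagram shows is required for b* > 0.

```python
# u(c) = ln c, f(k) = k^alpha, zero depreciation; illustrative parameters
# chosen to make the bubbleless equilibrium dynamically inefficient
alpha, beta, gn = 0.05, 1.0, 1.2

s_rate = beta / (1 + beta)                            # saving rate out of wage income
k_D = (s_rate * (1 - alpha) / gn)**(1 / (1 - alpha))  # Diamond steady state
k_G = (alpha / (gn - 1))**(1 / (1 - alpha))           # Golden Rule: f'(k_G) = gn - 1

# Bubbly steady state: k = k_G and gn (k_G + b) = s * w(k_G)
b_star = s_rate * (1 - alpha) * k_G**alpha / gn - k_G
print(k_D, k_G, b_star)  # here k_G < k_D, so b_star > 0
```

Reversing the parameter choice so that k_D ≤ k_G makes the implied b* nonpositive, confirming that a bubble cannot be valued in a dynamically efficient economy.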
Tirole (1985) introduces other assets with exogenously given dividends
or rents in the above model. He studies when a bubble can emerge in these
assets. Weil (1987) introduces a randomizing device into the Tirole (1985) model. A
bubble exists initially and agents believe that the bubble may collapse with
some probability each period. Weil (1987) shows that when this probability
is sufficiently small, an equilibrium with a stochastic bubble can exist. Santos
and Woodford (1997) study conditions for the existence of rational bubbles
in a general framework. Kocherlakota (1992, 2009) studies rational bubbles
in models with infinitely-lived agents. These studies analyze bubbles on
assets with exogenously given payoffs. Miao and Wang (2011a,b,c, 2012a,b)
study credit-driven stock market bubbles when dividends are endogenously
generated by production. They show that bubbles can help relax collateral
constraints and allow firms to borrow more and make more investments,
generating a high firm value.
16.4 Exercises
1. Consider the Samuelson (1958) model in Section 16.1. Suppose that
utility is ln(c_t^t) + ln(c_{t+1}^t). Population is given by N_t = g_n N_{t-1}. Let
e1 > e2. Each initial old is endowed with one unit of intrinsically
worthless fiat money. The stock of money is constant over time.
(a) Compute the nonmonetary equilibrium.
(b) Compute all stationary monetary equilibria. Can you rank these
equilibria according to the Pareto criterion?
(c) Find the limiting rates of return on money as t → ∞ for all
monetary equilibria.
2. Consider the model in Exercise 1. Endow the initial old with a tree
that yields a constant dividend of d > 0 units of the consumption good
for each t ≥ 1.
(a) Compute all the equilibria with valued fiat money.
(b) Compute all nonmonetary equilibria.
3. Consider the Diamond (1965) model with u (c) = c1−γ/ (1− γ) and
f (k) = kα. Solve for the saving function s (R). Characterize the prop-
erties of s (R) for different values of γ. Derive the equilibrium law of
motion for capital.
4. Consider the Diamond (1965) model with u (c) = ln c and f (k) = kα.
Derive the equilibrium law of motion for capital and the steady state
capital stock.
5. Consider the Diamond (1965) model with u(c) = ln c and
$$f(k) = A\left[a k^{\frac{\sigma-1}{\sigma}} + 1 - a\right]^{\frac{\sigma}{\sigma-1}},$$
where σ is the elasticity of substitution. Note that σ = 1 corresponds
to the Cobb-Douglas production function.
(a) Prove that if σ ≥ 1, then a positive steady state exists.
(b) Suppose σ < 1. Derive the equilibrium law of motion for capital.
Analyze the existence, uniqueness and stability properties of the
steady state. In particular, prove that for A sufficiently small,
there is no positive steady state.
6. In Section 16.2, we have assumed zero depreciation. Suppose that the
depreciation rate is δ. Derive the equilibrium law of motion for capital.
What is this law for δ = 1?
7. Government in the Diamond (1965) model.
Consider the model in Section 16.2. Let Gt denote the per capita
government spending. Suppose that government spending is financed
by lump-sum taxes levied on the young. Let u (c) = ln c and f (k) =
kα.
(a) Derive the equilibrium law of motion for capital.
(b) Analyze the impact of a permanent increase of government spend-
ing from GL to GH . Draw the path of the capital stock.
(c) Analyze the impact of a temporary increase of government spend-
ing from GL to GH . Draw the path of the capital stock.
8. Social security in the Diamond (1965) model.
Let u (c) = ln c and f (k) = kα.
(a) Pay-as-you-go social security. Suppose the government taxes
each young individual an amount dt and uses the proceeds to
pay benefits to the current old individuals. Thus each old person
receives gndt.
i. Derive the equilibrium law of motion for capital.
ii. What is the impact of introducing social security tax on the
equilibrium capital stock?
iii. If the economy is initially on a balanced growth path that
is dynamically efficient, how does a marginal increase in dt
affect the welfare of current and future generations? What
happens if the initial balanced growth path is dynamically
inefficient?
(b) Fully funded social security. Suppose each young individual contributes dt.
This amount is invested and returned with interest at time t+ 1
to the then old, Rt+1dt.
i. Derive the equilibrium law of motion for capital.
ii. What is the impact on capital and welfare? Explain the
intuition behind the result.
9. Linearize the equilibrium system for the model of Section 16.3. Use
this system to analyze the stability of the bubbly steady state.
Chapter 17
Incomplete Markets Models
Bewley, Huggett, and Aiyagari models
17.1 Exchange Economies
17.2 Production Economies
17.3 Exercises
Chapter 18
Search and Matching Models of Unemployment
So far, we have focused on the neoclassical theory of the labor market.
This means that the labor supply is determined by a household’s utility
maximization problem and the labor demand is determined by a firm’s profit
maximization problem. In both problems, the wage rate is taken as given
in a competitive labor market and the equilibrium wage is determined by
the market-clearing condition: The labor supply is equal to the labor de-
mand. This approach suffers from two major limitations. First, it cannot
explain the magnitude of the observed fluctuations in hours worked. By the
neoclassical theory, the marginal rate of substitution between leisure and
consumption is equal to the marginal product of labor, after adjusting for
labor and consumption taxes. However, this relationship does not hold in
the data (see, e.g., Rotemberg and Woodford (1991, 1999), Hall (1997),
Mulligan (2002), and Chari, Kehoe, and McGrattan (2007)). In particular, there
is a labor wedge between the marginal rate of substitution and the marginal
product of labor. Second, the neoclassical approach delivers predictions of
hours worked, but not unemployment. Thus, this approach cannot explain
the empirical fact that most cyclical movements in the aggregate amount
of hours worked are accounted for by movements between employment and
unemployment, not by movements in hours worked by employed workers.
In this chapter, we introduce the search and matching theory developed
by Diamond (1982), Mortensen (1982), and Pissarides (1985) (henceforth
DMP). This theory can potentially overcome the above two limitations. Its
key idea is that it takes time for an unemployed worker to find a job and for a
vacancy to be filled by a worker. In addition, the wage rate is not determined
in a centralized market. Rather, it is determined by bilateral bargaining.
This theory can be used to account for unemployment. However, it still faces
several difficulties in accounting for some labor market facts quantitatively.
For example, the standard DMP model assumes that job destruction is
exogenous. But empirical evidence shows that both job creation and job
destruction respond to exogenous shocks. Mortensen and Pissarides (1994)
introduce endogenous job destruction in the DMP framework. In addition,
the standard DMP model cannot explain the large cyclical movements of the
labor wedge, unemployment, and vacancies observed in the data (Shimer
(2005) and Hall (2005)). Thus, modification of this model is needed. As
Shimer (2005) and Hall (2005) argue, introducing wage rigidity is important.
We refer the reader to Shimer (2010) for a thorough treatment of this issue
and to Pissarides (2000) for a textbook treatment of the search and matching
theory of unemployment.
18.1 A Basic DMP Model
We first introduce a basic search and matching model with exogenous sepa-
ration based on Pissarides (1985, 2000). Consider an infinite-horizon setup
without aggregate uncertainty. There is a continuum of identical workers
of unit measure. Each worker is infinitely lived and derives utility from
consumption {C_t} according to the function $\sum_{t=0}^{\infty} \beta^t C_t$, where β ∈ (0, 1).
The production technology has constant returns to scale, with labor as
the only input. The marginal productivity of labor is denoted by A. Suppose
that each firm employs at most one worker. A firm can enter the economy
by posting a vacancy at the cost κ. Once a worker find a match with a firm,
the firm and the worker negotiate a wage wt and the firm generates profits
A − wt. All matches resolve in the end of the period with an exogenously
given probability ρ.
The measure of total successful matches in period t is given by a matching
function mt = M (ut, vt) , where ut and vt are the aggregate measures of
unemployed workers and vacancies. Assume that the matching function is
increasing in both its arguments, concave, and linearly homogenous. Then
the probability that an unemployed worker finds a job (the job finding rate)
is given by qut = mt/ut and the probability that an open vacancy is filled
(the job filling rate) is given by qvt = mt/vt. Assume that the matched
relationship starts to produce in the next period.
Define the market tightness by θ_t = v_t/u_t. Then by the linear homogeneity property,
$$q_t^v = \frac{M(u_t, v_t)}{v_t} \equiv q(\theta_t), \qquad q_t^u = \frac{M(u_t, v_t)}{u_t} \equiv q(\theta_t)\theta_t, \quad (18.1)$$
where it follows from the properties of the matching function M that q′(θ) < 0
and d ln q(θ)/d ln θ ∈ (−1, 0). Intuitively, when the labor market is tighter,
it is harder for a firm to meet a worker and easier for a worker to find
a job. Denote by N_t the aggregate employment rate. Employment then follows the dynamics
$$N_{t+1} = (1-\rho)N_t + M(u_t, v_t), \quad N_0 \text{ given}. \quad (18.2)$$
Define the unemployment rate as
ut = 1−Nt. (18.3)
Suppose that the wage rate wt is determined in a Nash bargain between
a matched firm and worker. To describe this bargaining problem, we need to
derive the surplus to the worker and the firm. Let J_t^F be the value of a filled
job and V_t be the value of a vacancy. They satisfy the asset-pricing equations
$$J_t^F = A - w_t + \beta\left(\rho V_{t+1} + (1-\rho)J_{t+1}^F\right), \quad (18.4)$$
$$V_t = -\kappa + \beta\left[q_t^v J_{t+1}^F + (1 - q_t^v)V_{t+1}\right]. \quad (18.5)$$
Competition drives the value of a vacancy to zero, so that V_t = 0 for all t. It
follows from (18.5) that
$$J_{t+1}^F = \frac{\kappa}{\beta q_t^v}. \quad (18.6)$$
Substituting into (18.4) yields
$$J_t^F = A - w_t + \frac{(1-\rho)\kappa}{q_t^v}. \quad (18.7)$$
Combining the above two equations yields
$$\frac{\kappa}{q_t^v} = \beta\left(A - w_{t+1} + \frac{(1-\rho)\kappa}{q_{t+1}^v}\right). \quad (18.8)$$
Let J_t^W and J_t^U denote the values of an employed and an unemployed worker,
respectively. They satisfy the asset-pricing equations
$$J_t^W = w_t + \beta\left(\rho J_{t+1}^U + (1-\rho)J_{t+1}^W\right), \quad (18.9)$$
$$J_t^U = b + \beta\left[q_t^u J_{t+1}^W + (1 - q_t^u)J_{t+1}^U\right], \quad (18.10)$$
where b denotes unemployment insurance benefits, which are financed by
lump-sum taxes. One may interpret b broadly as nonmarket income.
The Nash bargaining problem is given by
$$\max_{w_t}\left(J_t^W - J_t^U\right)^{\eta}\left(J_t^F - V_t\right)^{1-\eta}, \quad (18.11)$$
subject to J_t^W ≥ J_t^U and J_t^F ≥ V_t = 0, where η ∈ (0, 1) denotes the worker’s
bargaining power. The first-order condition gives
$$\eta J_t^F = (1-\eta)\left(J_t^W - J_t^U\right).$$
Define total surplus by
$$S_t = J_t^F + J_t^W - J_t^U.$$
It follows that
$$\eta S_t = J_t^W - J_t^U, \quad (18.12)$$
$$(1-\eta)S_t = J_t^F = \frac{\kappa}{\beta q_{t-1}^v}, \quad (18.13)$$
where the last equality in (18.13) follows from the free entry condition (18.6).
By (18.4), (18.9), and (18.10), S_t satisfies the asset-pricing equation
$$S_t = A - b + \beta(1-\rho)S_{t+1} - \beta q_t^u\left(J_{t+1}^W - J_{t+1}^U\right) = A - b + \beta\left(1 - \rho - \eta q_t^u\right)S_{t+1} = A - b + \frac{\kappa\left(1 - \rho - \eta q_t^u\right)}{(1-\eta)q_t^v}, \quad (18.14)$$
where the second equality follows from (18.12) and the last equality follows
from (18.13). Thus,
$$\eta S_t = J_t^W - J_t^U = w_t - b + \beta\left(1 - \rho - q_t^u\right)\left(J_{t+1}^W - J_{t+1}^U\right) = w_t - b + \beta\left(1 - \rho - q_t^u\right)\eta S_{t+1},$$
where the second equality follows from (18.9) and (18.10) and the last equality follows from (18.12). Substituting (18.14) for S_t and (18.13) for S_{t+1} into
the above equation and solving for w_t yields
$$w_t = \eta(A + \kappa\theta_t) + (1-\eta)b. \quad (18.15)$$
Note that this solution implies that every firm-worker match offers the
same wage w_t, because the wage depends only on aggregate variables.
Figure 18.1: Steady-state equilibrium market tightness and wage.
A search equilibrium consists of sequences of allocations {N_t, u_t, v_t, θ_t},
wages {w_t}, and job-filling rates {q_t^v} such that equations (18.1), (18.2),
(18.3), (18.8), and (18.15) are satisfied.
Steady State
Consider the steady state in which all variables are constant over time. We
write the steady-state version of equation (18.8) as
$$\frac{\kappa}{q(\theta)}\left(1 - \beta(1-\rho)\right) = \beta(A - w). \quad (18.16)$$
This equation gives a relationship between θ and w. We plot this relation in
Figure 18.1 and call it the job creation curve. In the (θ, w) space it slopes
downward: a higher wage rate makes job creation less profitable and so leads to a
lower equilibrium ratio of new hires to unemployed workers. It replaces the
demand curve of Walrasian economics.
We write the steady-state version of equation (18.15) as
$$w = \eta(A + \kappa\theta) + (1-\eta)b. \quad (18.17)$$
This equation also gives a relationship between θ and w. We plot this re-
lation in Figure 18.1 and call it the wage curve. This curve slopes up: At
higher market tightness the relative bargaining strength of market partici-
pants shifts in favor of workers. It replaces the supply curve of Walrasian
economics.
The intersection of the wage and job creation curves gives the steady-state
equilibrium (θ^*, w^*). In particular, eliminating w, we can show that
θ^* is the unique solution to the following equation:
$$\frac{\kappa}{q(\theta)}\left(1 - \beta(1-\rho)\right) + \beta\eta\kappa\theta = \beta(1-\eta)(A - b). \quad (18.18)$$
Given the solution θ, we plot the job creation curve v = θu in Figure
18.2. We write the steady-state version of the employment dynamics (18.2) as
$$\rho(1 - u) = M(u, v), \quad (18.19)$$
which gives the so-called Beveridge curve for (u, v) in Figure 18.2. By the
property of the matching function, it is downward sloping and convex. The
intersection of the job creation curve and the Beveridge curve determines
the equilibrium vacancy and unemployment rates.
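Equation (18.18) can be solved numerically once functional forms are chosen. The sketch below assumes a Cobb-Douglas matching function M = m̄ u^a v^{1−a}, so that q(θ) = m̄ θ^{−a}, and then recovers (u*, v*) from the Beveridge curve; all parameter values are illustrative, not a calibration from the text.

```python
# Illustrative calibration; none of these numbers come from the text
A, b, beta, rho, eta, kappa = 1.0, 0.4, 0.99, 0.1, 0.5, 0.2
mbar, a = 0.6, 0.5                  # matching M = mbar * u^a * v^(1-a)

def q(theta):                       # job-filling rate q(theta)
    return mbar * theta**(-a)

def excess(theta):                  # LHS minus RHS of equation (18.18)
    return (kappa / q(theta)) * (1 - beta * (1 - rho)) \
        + beta * eta * kappa * theta - beta * (1 - eta) * (A - b)

lo, hi = 1e-6, 50.0                 # excess is increasing in theta: bisect
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if excess(mid) > 0 else (mid, hi)
theta_star = 0.5 * (lo + hi)

u_star = rho / (rho + q(theta_star) * theta_star)  # Beveridge curve, v = theta u
v_star = theta_star * u_star
w_star = eta * (A + kappa * theta_star) + (1 - eta) * b  # wage curve (18.17)
print(theta_star, u_star, v_star, w_star)
```

Because both terms on the left of (18.18) increase in θ under this matching function, the bisection bracket is valid and the root is unique, mirroring the graphical argument in Figure 18.1.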
The above two figures are useful for understanding comparative statics
properties of the model. First, consider the impact of an increase in pro-
ductivity A. It shifts the job creation curve in Figure 18.1 to the right and
the wage curve up. Since 0 < η < 1, the job creation curve shifts by more, so
both wages and market tightness increase. In Figure 18.2, this rotates the
job creation line anticlockwise, increasing vacancies and reducing unemploy-
ment. Next, consider the impact of an increase in the unemployment benefit
b. It shifts the wage curve up, but does not change the job creation curve
in Figure 18.1. Thus, it raises the wage and lowers the market tightness.
Figure 18.2: Steady-state equilibrium vacancies and unemployment.

In addition, it rotates the job creation curve in Figure 18.2 down, thereby
increasing the unemployment rate and reducing the vacancy rate. The effect
of an increase in the worker bargaining power η has a similar result.
Transitional Dynamics
Now, we turn to transitional dynamics. Substituting equation (18.15) into
equation (18.8) yields the dynamic job creation equation:
$$\frac{\kappa}{q(\theta_t)} = \beta\left((1-\eta)(A - b) - \eta\kappa\theta_{t+1} + \frac{(1-\rho)\kappa}{q(\theta_{t+1})}\right). \quad (18.20)$$
Rewrite the employment dynamics equation (18.2) as
$$u_{t+1} = \left(1 - \rho - q(\theta_t)\theta_t\right)u_t + \rho, \quad u_0 \text{ given}. \quad (18.21)$$
We then obtain a system of two difference equations for two unknowns
(u_t, θ_t). To analyze the stability of the system, we linearize it around the
steady state (u^*, θ^*):
$$\begin{bmatrix} d\theta_{t+1} \\ du_{t+1} \end{bmatrix} = \begin{bmatrix} a_{11} & 0 \\ a_{21} & a_{22} \end{bmatrix}\begin{bmatrix} d\theta_t \\ du_t \end{bmatrix},$$
Figure 18.3: Phase diagram.
where
$$a_{11} = \frac{q'(\theta^*)/q^2(\theta^*)}{\beta(1-\rho)\,q'(\theta^*)/q^2(\theta^*) + \beta\eta},$$
$$a_{21} = u^* q(\theta^*)\left[1 - \xi(\theta^*)\right],$$
$$a_{22} = 1 - \rho - q(\theta^*)\theta^*,$$
and where we define the elasticity of q(θ) as ξ(θ) = −θ q'(θ)/q(θ). We can check
that ξ (θ) ∈ (0, 1) . Note that ut is predetermined but θt is nonpredetermined.
Suppose that 1− ρ− ηq (θ∗) θ∗/ξ (θ∗) > 0. Then a11 > 1. We can also check
that a22 ∈ (−1, 1) and a21 > 0. Thus, the steady state is a local saddle point.
Because the equation for θ_t does not depend on u_t, the solution for θ_t is
constant over time.
Figure 18.3 gives the phase diagram for (ut, θt). Clearly the locus Δut =
ut+1−ut = 0 is downward sloping and the locus Δθt = 0 is a horizontal line.
If the initial unemployment rate is u0, then the market tightness immediately
jumps to the saddle path on the locus Δθt = θt+1 − θt = 0. During the
transition path, θt is constant, but ut falls until reaching the steady state.
Figure 18.4 shows the adjustment of vacancies. Vacancies initially jump above
their steady-state value and then gradually fall, keeping the market
tightness constant over time.

Figure 18.4: Adjustments of vacancies and unemployment.
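Because θ_t jumps immediately to its constant equilibrium value, the transition is fully described by iterating (18.21) on u_t alone. A minimal sketch, with an assumed tightness and illustrative matching parameters:

```python
# Unemployment transition under a constant market tightness; all numbers
# are illustrative assumptions, not taken from the text
rho, mbar, a = 0.1, 0.4, 0.5
theta = 2.4                           # assumed (constant) equilibrium tightness
f_rate = mbar * theta**(1 - a)        # job-finding rate q(theta) * theta

u = 0.3                               # start above the steady state
path = [u]
for _ in range(200):
    u = (1 - rho - f_rate) * u + rho  # equation (18.21)
    path.append(u)

u_ss = rho / (rho + f_rate)           # steady-state unemployment rate
print(path[0], path[-1], u_ss)        # the path converges to u_ss
```

With these numbers the coefficient 1 − ρ − q(θ)θ lies in (0, 1), so u_t declines monotonically to its steady state, as in Figure 18.3; if the job-finding rate exceeded 1 − ρ, the convergence would instead be oscillatory.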
Large Firms
So far, we have assumed that each firm has only one worker. Now, we
assume that each firm has many workers. There is a continuum of firms
of measure unity. The timing is as follows: At the beginning of period t,
there are N jt workers in firm j. The firm produces output AN j
t and engages
in Nash bargains with each employee separately, by taking the wages of
all other employees as given. After production and paying the wage rate
wt, the job separation occurs so that the firm loses a fraction ρ of workers.
After separation, the firm posts v_t^j vacancies. These vacancies are filled with
probability q_t^v, so that at the end of period t the firm has N_{t+1}^j workers, given by
$$N_{t+1}^j = (1-\rho)N_t^j + q_t^v v_t^j. \quad (18.22)$$
Define aggregate employment and aggregate vacancies as N_t = ∫ N_t^j dj and
v_t = ∫ v_t^j dj. The job-filling rate q_t^v and the job-finding rate q_t^u are defined in
(18.1). The aggregate law of motion for N_t is given by (18.2).
Firm j’s problem is given by
$$\max_{\{N_t^j, v_t^j\}} \sum_{t=0}^{\infty}\beta^t\left(A N_t^j - w_t N_t^j - v_t^j\kappa\right),$$
subject to (18.22). The first-order conditions are given by
$$v_t^j:\quad -\kappa + \mu_t q_t^v = 0, \quad (18.23)$$
$$N_t^j:\quad (A - w_t) + \mu_t(1-\rho) - \beta^{-1}\mu_{t-1} = 0,$$
where μ_t is the Lagrange multiplier associated with (18.22). Combining the
above two equations yields
$$\frac{\kappa}{q_t^v} = \beta\left(A - w_{t+1} + \frac{(1-\rho)\kappa}{q_{t+1}^v}\right).$$
This equation is identical to (18.8). Note that equation (18.23) plays the
same role as the free-entry condition (18.6), which determines the equilib-
rium number of vacancies. In addition, the firm takes q_t^v = q(θ_t), or the
market tightness θ_t, as given when solving its individual problem. This
creates an externality on other firms.
We may also write the firm’s problem by dynamic programming:
$$J_t(N_t^j) = \max_{v_t^j}\; A N_t^j - w_t N_t^j - v_t^j\kappa + \beta J_{t+1}(N_{t+1}^j)$$
subject to (18.22). By the envelope condition,
$$\frac{\partial J_t(N_t^j)}{\partial N_t^j} = A - w_t + (1-\rho)\mu_t = A - w_t + \frac{(1-\rho)\kappa}{q_t^v} = J_t^F, \quad (18.24)$$
where the last equality follows from (18.7). Consider the firm’s marginal
value if it has additional ε workers at the end of period t − 1 before sepa-
ration occurs. Let w be the wage rate bargained between the firm and the
additional workers. The Bellman equation is given by
$$J_t(N_t^j + \varepsilon) = \max_{v_t^j}\; A(N_t^j + \varepsilon) - w_t N_t^j - (1-\rho)w\varepsilon - v_t^j\kappa + \beta J_{t+1}(N_{t+1}^j)$$
subject to
$$N_{t+1}^j = (1-\rho)(N_t^j + \varepsilon) + q_t^v v_t^j. \quad (18.25)$$
By the envelope condition,
$$\left.\frac{\partial J_t(N_t^j + \varepsilon)}{\partial \varepsilon}\right|_{\varepsilon=0} = A - w + (1-\rho)\beta\frac{\partial J_{t+1}(N_{t+1}^j)}{\partial N_{t+1}^j}.$$
The first-order condition with respect to v_t^j is given by
$$\kappa = \beta\frac{\partial J_{t+1}(N_{t+1}^j)}{\partial N_{t+1}^j}\, q_t^v.$$
Using the preceding two equations and (18.24) yields
$$\left.\frac{\partial J_t(N_t^j + \varepsilon)}{\partial \varepsilon}\right|_{\varepsilon=0} = (w_t - w) + \frac{\partial J_t(N_t^j)}{\partial N_t^j} \equiv J^n(w, t).$$
The values of an employed worker J_t^W(w) and an unemployed worker
J_t^U are the same as before, except that we replace w_t by w. The wage
rate w_t is determined by the Nash bargaining problem
$$w_t = \arg\max_w\left(J_t^W(w) - J_t^U\right)^{\eta} J^n(w, t)^{1-\eta}. \quad (18.26)$$
This problem is equivalent to (18.11). As a result, the large-firm formulation
and the previous formulation give an identical equilibrium outcome. Note
that one can also directly derive the wage rate w_t by solving the following
simpler problem:
$$\max_{w_t}\left(J_t^W - J_t^U\right)^{\eta}\left(\frac{\partial J_t(N_t^j)}{\partial N_t^j}\right)^{1-\eta}. \quad (18.27)$$
Efficiency
Is the decentralized search equilibrium efficient? The search equilibrium
suffers from trading externalities. The externalities are shown by the depen-
dence of the transition probabilities of unemployed workers and vacant firms
on the tightness of the labor market. Imagine that there is a social planner
that maximizes social welfare subject to the same search and matching
frictions as firms and workers. His problem is given by
$$\max_{\{N_t, v_t\}} \sum_{t=0}^{\infty}\beta^t\left(A N_t + b u_t - v_t\kappa\right),$$
subject to (18.21), (18.3), and θ_t = v_t/u_t. The social planner can internalize
externalities by adjusting the market tightness θ_t via the choice of v_t and u_t (or
N_t). The first-order conditions are given by
$$v_t:\quad -\kappa + \mu_t\left(q(\theta_t)\theta_t\right)' = 0,$$
$$N_t:\quad 0 = (A - b) - \beta^{-1}\mu_{t-1} + \mu_t\left[(1-\rho) - q(\theta_t)\theta_t + \left(q(\theta_t)\theta_t\right)'\theta_t\right],$$
where μ_t is the Lagrange multiplier associated with (18.21). Combining the
above two conditions, we can write the resulting steady-state equation as
$$\frac{\kappa}{q(\theta)}\left(1 - \beta(1-\rho)\right) + \beta\xi(\theta)\kappa\theta = \beta\left(1 - \xi(\theta)\right)(A - b). \quad (18.28)$$
Comparing with equation (18.18), we deduce that when ξ (θ) = η, the
steady-state search equilibrium is socially efficient.
To understand the intuition behind this result, notice that ξ (θ) is the
elasticity of the expected duration of a job vacancy with respect to the
number of job vacancies and 1−ξ (θ) is the elasticity of the expected duration
of unemployment with respect to unemployment. If ξ (θ) is high, then it
means that at the margin firms are causing more congestion to other firms
than workers are causing to other workers. Firms must be “taxed” by the
social planner by giving workers a higher share in the wage bargain. If and
only if ξ(θ) = η do the two congestion effects offset each other.
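The Hosios condition ξ(θ) = η is easy to verify with a Cobb-Douglas matching function, for which the elasticity ξ(θ) = a is constant: the planner's condition (18.28) then coincides term by term with the market condition (18.18) exactly when η = a. A sketch with illustrative numbers:

```python
# Cobb-Douglas matching M = mbar * u^a * v^(1-a), so xi(theta) = a everywhere;
# parameter values are illustrative assumptions
A, b, beta, rho, kappa = 1.0, 0.4, 0.99, 0.1, 0.2
mbar, a = 0.6, 0.5
eta = a                                 # the Hosios condition

def q(theta):
    return mbar * theta**(-a)

def market(theta):                      # LHS - RHS of (18.18)
    return (kappa / q(theta)) * (1 - beta * (1 - rho)) \
        + beta * eta * kappa * theta - beta * (1 - eta) * (A - b)

def planner(theta):                     # LHS - RHS of (18.28) with xi = a
    return (kappa / q(theta)) * (1 - beta * (1 - rho)) \
        + beta * a * kappa * theta - beta * (1 - a) * (A - b)

for theta in (0.5, 1.0, 2.0):
    print(market(theta) - planner(theta))   # zero at every theta when eta = a
```

Setting η ≠ a instead makes the two conditions differ for every θ, so the decentralized tightness no longer solves the planner's problem.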
18.2 Endogenous Job Destruction
So far, we have assumed that separation is exogenous. In this section, we
introduce endogenous separation as in Mortensen and Pissarides (1994).
Suppose that exogenous separation occurs at the end of each period with
probability ρx. After exogenous separation at the end of period t− 1, each
firm is subject to an idiosyncratic productivity shock zt in the beginning of
period t, which is drawn independently and identically over time and across
firms from the fixed distribution G (z) over (0,∞) . Let the mean of zt be 1.
After observing zt, the firm decides whether to stay with the hired worker
or to separate. If it separates from the worker, the job becomes vacant and
the firm obtains value Vt. It can pay a cost κ to post a vacancy and starts
producing in the next period. The new job’s idiosyncratic productivity is
also drawn from the distribution G. But if the employer stays with the
worker, it produces output Azt and bargains the wage wt. Then we move to
period t+ 1.
Let J_t^F(z_t) be the value of a filled job for a firm with idiosyncratic
shock z_t. It satisfies the following Bellman equation:
$$J_t^F(z_t) = A z_t - w_t + \beta E_t\left[(1-\rho^x)\max\left\{J_{t+1}^F(z_{t+1}), V_{t+1}\right\}\right].$$
Clearly, there is a cutoff value z_t^* such that the firm and the worker
separate when z_t ≤ z_t^*. Thus, the endogenous separation rate is given by
ρ_t^n = G(z_t^*) and the total separation rate is given by
$$\rho_t = \rho^x + (1-\rho^x)\rho_t^n.$$
Aggregate employment follows the dynamics
$$N_{t+1} = (1-\rho_t)N_t + q_t^v v_t, \quad N_0 \text{ given}, \quad (18.29)$$
where q_t^v = q(θ_t) is the job-filling rate and θ_t = v_t/u_t is the market tightness.
The unemployment rate is defined as u_t = 1 − N_t.
We can rewrite the value of a filled job as
$$J_t^F(z_t) = A z_t - w_t + \beta\left[(1-\rho^x)\int_{z_{t+1}^*}^{\infty}\left(J_{t+1}^F(z_{t+1}) - V_{t+1}\right)dG(z_{t+1}) + V_{t+1}\right].$$
The value of a vacancy V_t satisfies
$$V_t = -\kappa + \beta\left[q_t^v\int_{z_{t+1}^*}^{\infty}\left(J_{t+1}^F(z_{t+1}) - V_{t+1}\right)dG(z_{t+1}) + V_{t+1}\right], \quad (18.30)$$
where the vacancy is filled with probability q_t^v. Free entry implies that V_t = 0
for all t. Thus, we obtain
$$\beta\int_{z_{t+1}^*}^{\infty} J_{t+1}^F(z_{t+1})\, dG(z_{t+1}) = \frac{\kappa}{q_t^v}, \quad (18.31)$$
and
$$J_t^F(z_t) = A z_t - w_t + \frac{(1-\rho^x)\kappa}{q_t^v}. \quad (18.32)$$
In addition,
$$\frac{\kappa}{q_t^v} = \beta\int_{z_{t+1}^*}^{\infty}\left(A z_{t+1} - w_{t+1} + \frac{(1-\rho^x)\kappa}{q_{t+1}^v}\right)dG(z_{t+1}). \quad (18.33)$$
The reservation productivity z_t^* is determined by the condition J_t^F(z_t^*) = 0,
which implies that
$$A z_t^* - w_t + \frac{(1-\rho^x)\kappa}{q_t^v} = 0. \quad (18.34)$$
The values of an employed and an unemployed worker satisfy the Bellman equations
$$J_t^W(z_t) = w_t(z_t) + \beta\left[(1-\rho^x)\int_{z_{t+1}^*}^{\infty}\left(J_{t+1}^W(z_{t+1}) - J_{t+1}^U\right)dG(z_{t+1}) + J_{t+1}^U\right], \quad (18.35)$$
$$J_t^U = b + \beta\left[q_t^u\int_{z_{t+1}^*}^{\infty}\left(J_{t+1}^W(z_{t+1}) - J_{t+1}^U\right)dG(z_{t+1}) + J_{t+1}^U\right]. \quad (18.36)$$
The wage rate is determined by Nash bargaining:
$$\max_{w_t}\left(J_t^W(z_t) - J_t^U\right)^{\eta}\left(J_t^F(z_t)\right)^{1-\eta}.$$
The first-order condition gives
$$\eta S_t(z_t) = J_t^W(z_t) - J_t^U, \quad (18.37)$$
$$(1-\eta)S_t(z_t) = J_t^F(z_t) = A z_t - w_t + \frac{(1-\rho^x)\kappa}{q_t^v}, \quad (18.38)$$
where the matching surplus is S_t(z_t) = J_t^F(z_t) + J_t^W(z_t) − J_t^U. Using (18.37),
(18.35), and (18.36), we can derive
$$\eta S_t(z_t) = J_t^W(z_t) - J_t^U = w_t(z_t) - b + \beta\left(1 - \rho^x - q_t^u\right)\int_{z_{t+1}^*}^{\infty}\left(J_{t+1}^W(z_{t+1}) - J_{t+1}^U\right)dG(z_{t+1})$$
$$= w_t(z_t) - b + \beta\left(1 - \rho^x - q_t^u\right)\eta\int_{z_{t+1}^*}^{\infty} S_{t+1}(z_{t+1})\, dG(z_{t+1}).$$
Substituting (18.38) into the above equation yields
$$w_t(z_t) = (1-\eta)b + \eta\left(A z_t + \frac{(1-\rho^x)\kappa}{q_t^v}\right) - \eta\beta\left(1 - \rho^x - q_t^u\right)\int_{z_{t+1}^*}^{\infty}\left(A z_{t+1} - w_{t+1} + \frac{(1-\rho^x)\kappa}{q_{t+1}^v}\right)dG(z_{t+1}).$$
Plugging in equation (18.33) yields the wage equation
$$w_t(z_t) = \eta(A z_t + \theta_t\kappa) + (1-\eta)b. \quad (18.39)$$
Note that the Nash bargained wage depends on the specific firm-worker
match because it depends on z_t.
A search equilibrium consists of (N_t, u_t, v_t, z_t^*, w_t(z_t)) such that equations (18.29), (18.3), (18.33), (18.34), and (18.39) hold. To solve for the
equilibrium, we substitute the wage equation (18.39) into (18.32) to derive
$$J_t^F(z_t) = (1-\eta)(A z_t - b) - \eta\theta_t\kappa + \frac{(1-\rho^x)\kappa}{q_t^v}.$$
At the reservation productivity z_t^*,
$$0 = J_t^F(z_t^*) = (1-\eta)(A z_t^* - b) - \eta\theta_t\kappa + \frac{(1-\rho^x)\kappa}{q_t^v}. \quad (18.40)$$
Taking the difference between the above two equations yields
$$J_t^F(z_t) = (1-\eta)A(z_t - z_t^*).$$
Substituting this equation into (18.31) yields
$$\frac{\kappa}{q(\theta_t)} = \beta(1-\eta)A\int_{z_t^*}^{\infty}\left(z_{t+1} - z_t^*\right)dG(z_{t+1}), \quad (18.41)$$
where we have used the fact that q_t^v = q(θ_t). This equation gives the job
creation condition.
Substituting the preceding equation into (18.40) yields
$$\eta\theta_t\kappa = (1-\eta)(A z_t^* - b) + \beta(1-\rho^x)(1-\eta)A\int_{z_t^*}^{\infty}\left(z_{t+1} - z_t^*\right)dG(z_{t+1}). \quad (18.42)$$
We call this equation the job destruction condition. The job creation and
destruction conditions (18.41) and (18.42) determine the equilibrium reser-
vation productivity z_t^* and market tightness θ_t. These equilibrium values are
constant over time. The equilibrium unemployment rate u_t is then determined by equation (18.29), which can be rewritten as
$$u_{t+1} = \left(1 - \rho_t - q(\theta_t)\theta_t\right)u_t + \rho_t, \quad u_0 \text{ given}. \quad (18.43)$$
We introduce two new variables useful for empirical analysis. First,
the job destruction rate is defined as the ratio of total job destruction to
employment:
$$jd_t = \rho_x + (1-\rho_x)G(z^*_t).$$
Second, the job creation rate is defined as the ratio of total new matches
to employment:
$$jc_t = \frac{M(u_t, v_t)}{1-u_t} = \frac{q(\theta_t)\theta_t\,u_t}{1-u_t}.$$
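In the steady state of (18.43), flows into and out of unemployment balance, so the job destruction rate equals the job creation rate. A quick numerical check (a sketch with illustrative, uncalibrated numbers; the job-finding rate $f = q(\theta_t)\theta_t$ and the value of $G(z^*_t)$ are simply posited):

```python
# Steady state of u_{t+1} = (1 - rho - f) u_t + rho with f = q(theta)*theta:
# u_ss = rho / (rho + f). Parameter values below are illustrative only.
rho_x, G_zstar = 0.05, 0.10          # exogenous separation rate, G(z*)
f = 0.45                             # posited job-finding rate q(theta)*theta

rho = rho_x + (1 - rho_x) * G_zstar  # total separation rate = jd_t
u_ss = rho / (rho + f)               # steady-state unemployment rate

jd = rho                             # job destruction rate
jc = f * u_ss / (1 - u_ss)           # job creation rate M(u, v)/(1 - u)
```

In the steady state the two rates coincide; this flow-balance condition is what underlies the Beveridge curve below.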
CHAPTER 18. SEARCH AND MATCHING MODELS OF UNEMPLOYMENT
Steady State
In the steady state, all variables are constant over time. We can then write down the steady-state job creation equation as
$$\frac{\kappa}{q(\theta)} = \beta(1-\eta)A\int_{z^*}^{\infty}(z - z^*)\,dG(z). \tag{18.44}$$
It is downward sloping. The intuition is that a higher reservation produc-
tivity implies that the match is more likely to separate. This reduces the
expected present value of the job. Thus, the firm posts fewer vacancies. We
can also write the steady-state job destruction equation as
$$\eta\theta\kappa = (1-\eta)(A z^* - b) + \beta(1-\rho_x)(1-\eta)A\int_{z^*}^{\infty}(z - z^*)\,dG(z), \tag{18.45}$$
which is upward sloping. The intuition is that at higher θ, the worker’s
outside opportunities are better (and wages are higher) and so more marginal
jobs are destroyed. The intersection of these two curves gives the steady-state reservation productivity $z^*$ and market tightness $\theta^*$, as illustrated in Figure 18.5.
After we determine $(\theta^*, z^*)$, we use the Beveridge diagram to determine the steady state $(u^*, v^*)$. The Beveridge curve is still described by equation (18.19), except that the separation rate is given by
$$\rho = \rho_x + (1-\rho_x)G(z^*).$$
It is downward sloping and convex. Figure 18.6 plots the Beveridge curve and the job creation curve $v = \theta u$. The intersection of these two curves gives $(u^*, v^*)$.
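The steady state can also be computed numerically. The sketch below (illustrative, uncalibrated parameters; it assumes a uniform $G$ on $[0,1]$, so that $G(z) = z$ and $\int_{z^*}^{\infty}(z - z^*)\,dG = (1-z^*)^2/2$, and a matching function with $q(\theta) = m\theta^{-\alpha}$) solves (18.44)-(18.45) for $(z^*, \theta^*)$ by bisection and then reads $(u^*, v^*)$ off the Beveridge diagram:

```python
# Illustrative parameters (not calibrated).
A, b, eta, beta = 1.0, 0.4, 0.5, 0.96
rho_x, kappa, m, alpha = 0.05, 0.2, 0.7, 0.5

def theta_jc(zs):
    """Tightness implied by the job creation curve (18.44):
    kappa/q(theta) = beta*(1-eta)*A*(1-zs)**2/2, with q(theta)=m*theta**(-alpha)."""
    rhs = beta * (1 - eta) * A * (1 - zs) ** 2 / 2
    return (m * rhs / kappa) ** (1 / alpha)

def jd_residual(zs):
    """Job destruction curve (18.45) evaluated along the JC curve."""
    th = theta_jc(zs)
    surplus = beta * (1 - rho_x) * (1 - eta) * A * (1 - zs) ** 2 / 2
    return eta * th * kappa - (1 - eta) * (A * zs - b) - surplus

# Bisection for the reservation productivity z* in (0, 1):
# the residual is positive at 0 and negative at 1 for these parameters.
lo, hi = 0.0, 1.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if jd_residual(mid) > 0:
        lo = mid
    else:
        hi = mid
z_star = 0.5 * (lo + hi)
theta = theta_jc(z_star)

# Beveridge diagram: separation rate and steady-state (u*, v*).
rho = rho_x + (1 - rho_x) * z_star       # G(z*) = z* for uniform G
f = m * theta ** (1 - alpha)             # job-finding rate q(theta)*theta
u_star = rho / (rho + f)
v_star = theta * u_star
```

The bisection exploits the fact that, along the job creation curve, the job destruction residual crosses zero exactly once; the same logic applies for any distribution $G$ with a closed-form conditional mean.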
We now use the previous two curves to conduct a comparative statics
analysis. Consider a permanent increase in productivity A. An increase in
A shifts the job destruction curve in Figure 18.5 down and to the right: for a given market tightness, a higher productivity raises the value of a job and makes the job less likely to be destroyed. It also shifts the job creation curve to the right: a higher productivity induces the firm to post more vacancies, holding the reservation productivity fixed. The net effect is to raise market tightness, but the effect on the reservation productivity is ambiguous.

Figure 18.5: Equilibrium reservation productivity and market tightness (intersection of the job destruction (JD) and job creation (JC) curves).
One may assume that the vacancy posting cost is proportional to productivity, so that we replace $\kappa$ with $\kappa A$. This assumption is reasonable: it is intuitive that hiring a more productive worker is more costly. In addition, if there is growth in $A$, this assumption ensures the existence of a balanced growth path. We shall maintain this assumption in the analysis below. As a result, $A$ cancels out on the two sides of the job creation equation (18.41). We can then deduce that an increase in $A$ reduces the reservation productivity. At given unemployment, the job destruction rate
decreases, and the job creation rate increases (since market tightness rises). Unemployment must decrease until the job creation rate falls to the level of the lower job destruction rate. Thus, the steady-state effect of higher productivity is to reduce the job creation and job destruction rates and unemployment.

Figure 18.6: Effect of higher productivity on equilibrium vacancies and unemployment.

Figure 18.6 shows that an increase in $A$ rotates the job creation line anticlockwise. In addition, it shifts the Beveridge curve down. Thus, it reduces unemployment, but the effect on vacancies is ambiguous. When the effect on the job creation condition is larger, an increase in $A$ raises vacancies.
Transitional Dynamics
Consider the transitional dynamics following a permanent increase in aggregate productivity $A$. The equilibrium system consists of three equations, (18.41), (18.42), and (18.43), for three variables $(\theta_t, z^*_t, u_t)$. Note that $\theta_t$ and $z^*_t$ are jump variables and $u_t$ is the predetermined state variable. The steady state is a saddle point. In the absence of expected parameter changes, the variables $z^*_t$, $\theta_t$, and $w(z_t)$ must be at their steady-state values at all times. Following a permanent increase in aggregate productivity, the job creation rate jumps up and the job destruction rate jumps down. The job destruction rate remains at this constant lower level at all times. The unemployment rate falls until the job creation rate falls to the level of the job destruction rate, when the economy reaches a new steady state.
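Because $\theta_t$ and $z^*_t$ jump immediately to their new steady-state values, the transition is fully described by iterating (18.43) at the new constant rates. A minimal sketch, with illustrative rates standing in for the values implied by the higher $A$:

```python
# Permanent productivity increase: the separation rate rho falls and the
# job-finding rate f = q(theta)*theta rises to new constant levels.
# The numbers are illustrative, not calibrated.
rho_old, f_old = 0.15, 0.35
rho_new, f_new = 0.12, 0.45

u = rho_old / (rho_old + f_old)          # start at the old steady state
path = [u]
for _ in range(200):
    u = (1 - rho_new - f_new) * u + rho_new   # equation (18.43)
    path.append(u)

u_new_ss = rho_new / (rho_new + f_new)
# path falls monotonically from the old steady state to u_new_ss
```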
18.3 Unemployment and Business Cycles
In this section, we embed the basic DMP model in a real business cycle
framework. In particular, we incorporate risk aversion, capital accumu-
lation, and aggregate productivity shocks. Important early contributions
in this literature include Merz (1995), Andolfatto (1996), and den Haan, Ramey, and Watson (2000).
Households
There is a continuum of identical households of measure unity. Each household has a continuum of members of measure unity. Each member can be either employed or unemployed. An unemployed member earns nonmarket income $b$, and each employed member earns the wage $w_t$. The representative household accumulates capital and rents it to firms at the rental rate $R_{kt}$. It also owns firms and receives aggregate dividends $D_t$. Let $N_t$ denote the number of employed household members. Household members pool their income, and there is full risk sharing within the family. Thus, each member consumes the same level of consumption, denoted by $C_t$. The budget constraint for a representative household is given by
$$C_t + I_t = D_t + R_{kt}K_t + w_t N_t + b(1-N_t) - T_t, \tag{18.46}$$
where $T_t$ represents lump-sum taxes and $I_t$ represents investment. We interpret $b$ as unemployment benefits financed by lump-sum taxes. Capital evolves according to
$$K_{t+1} = (1-\delta)K_t + I_t, \quad K_0 \text{ given}. \tag{18.47}$$
The household derives utility according to
$$E\sum_{t=0}^{\infty}\beta^t\left(\frac{C_t^{1-\gamma}-1}{1-\gamma} - \frac{\chi N_t^{1+\phi}}{1+\phi}\right), \quad \gamma, \chi, \phi > 0.$$
We may give a microfoundation for the disutility of work by assuming that an employed household member $j \in [0, N_t]$ derives disutility of work $\chi j^{\phi}$. The total disutility of work is then
$$\int_0^{N_t}\chi j^{\phi}\,dj = \frac{\chi N_t^{1+\phi}}{1+\phi}.$$
The employment rate $N_t$ within the representative household follows the dynamics
$$N_{t+1} = (1-\rho)N_t + q^u_t u_t, \tag{18.48}$$
where $u_t = 1 - N_t$ and $q^u_t$ represents the job-finding rate. By the matching technology in Section 18.1, $q^u_t = q(\theta_t)\theta_t$, where $\theta_t$ represents market tightness. In equilibrium, since households are identical, $N_t$ and $u_t$ also represent the aggregate employment and unemployment rates. In addition, $\theta_t = v_t/u_t$, where $v_t$ represents aggregate vacancies.
We may write the representative household's problem by dynamic programming:
$$V_t(K_t, N_t) = \max_{C_t, I_t}\; \frac{C_t^{1-\gamma}}{1-\gamma} - \frac{\chi N_t^{1+\phi}}{1+\phi} + \beta E_t V_{t+1}(K_{t+1}, N_{t+1}),$$
subject to (18.46) and (18.47). Note that we have suppressed the aggregate state variables in the household value function $V_t(K_t, N_t)$. We can derive the Euler equation:
$$C_t^{-\gamma} = \beta E_t C_{t+1}^{-\gamma}\left(R_{kt+1} + 1 - \delta\right). \tag{18.49}$$
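In a deterministic steady state, consumption is constant, so (18.49) pins down the rental rate of capital. A two-line check (illustrative parameter values):

```python
# In a deterministic steady state C_{t+1} = C_t, so (18.49) reduces to
# 1 = beta * (R_k + 1 - delta), pinning down the rental rate of capital.
beta, delta = 0.99, 0.025            # illustrative values
R_k = 1 / beta - (1 - delta)

# Any constant consumption path then satisfies the Euler equation.
gamma, C = 2.0, 1.5
lhs = C ** (-gamma)
rhs = beta * C ** (-gamma) * (R_k + 1 - delta)
```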
By the envelope condition,
$$\frac{\partial V_t(K_t, N_t)}{\partial N_t} = \Lambda_t(w_t - b) - \chi N_t^{\phi} + \left(1-\rho - q(\theta_t)\theta_t\right)\beta E_t\frac{\partial V_{t+1}(K_{t+1}, N_{t+1})}{\partial N_{t+1}},$$
where $\Lambda_t = C_t^{-\gamma}$. Define the marginal household surplus (measured in consumption units) of getting one more household member hired as
$$S^H_t = \frac{1}{\Lambda_t}\frac{\partial V_t(K_t, N_t)}{\partial N_t}.$$
In this definition, we have assumed that the newly employed member and all other employed members earn the same wage $w_t$. We then have
$$S^H_t = w_t - b - \frac{\chi N_t^{\phi}}{\Lambda_t} + \left(1-\rho - q(\theta_t)\theta_t\right)E_t\frac{\beta\Lambda_{t+1}}{\Lambda_t}S^H_{t+1}. \tag{18.50}$$
We may instead assume that the newly employed member earns a possibly different wage $w$. We now derive the marginal household surplus. Consider the dynamic programming problem:
$$V_t(K_t, N_t+\varepsilon) = \max_{C_t, I_t}\; \frac{C_t^{1-\gamma}}{1-\gamma} - \frac{\chi(N_t+\varepsilon)^{1+\phi}}{1+\phi} + \beta E_t V_{t+1}(K_{t+1}, N_{t+1}),$$
subject to (18.47) and
$$C_t + I_t = D_t + R_{kt}K_t + w_t N_t + w\varepsilon + b(1-N_t) - T_t,$$
$$N_{t+1} = (1-\rho)(N_t+\varepsilon) + q(\theta_t)\theta_t\left(1-(N_t+\varepsilon)\right).$$
By the envelope condition,
$$\frac{\partial V_t(K_t, N_t+\varepsilon)}{\partial\varepsilon}\bigg|_{\varepsilon=0} = -\chi N_t^{\phi} + \Lambda_t w + \left(1-\rho - q(\theta_t)\theta_t\right)E_t\beta\frac{\partial V_{t+1}(K_{t+1}, N_{t+1})}{\partial N_{t+1}}$$
$$= \Lambda_t(w - w_t) + \frac{\partial V_t(K_t, N_t)}{\partial N_t} \equiv V_n(w, t).$$
Here, $V_n(w, t)$ denotes the marginal surplus to the household when an additional household member is hired at the wage rate $w$.
Firms
There is a continuum of firms of measure unity. Each firm rents capital $K^j_t$ and hires labor $N^j_t$ to produce output according to an identical constant-returns-to-scale production function $F$. Production is subject to an aggregate labor-augmenting productivity shock $A_t$, which follows a Markov process. Each firm $j$ chooses capital $K^j_t$ and vacancies $v^j_t$ to solve the following problem:
$$\max_{\{K^j_t, v^j_t\}}\; E\sum_{t=0}^{\infty}\frac{\beta^t\Lambda_t}{\Lambda_0}\left[F\left(K^j_t, A_t N^j_t\right) - w_t N^j_t - R_{kt}K^j_t - v^j_t\kappa\right],$$
subject to (18.22), where $q^v_t = q(\theta_t)$ is the job-filling rate. Note that since
households own the firm, we use the household pricing kernel to compute
firm value.
We can write down firm j’s problem by dynamic programming:
Jt
(N jt
)= max
Kjt ,v
jt
F(Kjt , AtN
jt
)−wtN j
t −RktKjt−v
jtκ+Et
βΛt+1
ΛtJt+1
(N jt+1
),
subject to (18.22), where we have suppressed aggregate state variables in
the value function Jt
(N jt
). The first-order conditions are given by
$$F_1\left(K^j_t, A_t N^j_t\right) = R_{kt}, \tag{18.51}$$
$$\kappa = E_t\frac{\beta\Lambda_{t+1}}{\Lambda_t}\frac{\partial J_{t+1}\left(N^j_{t+1}\right)}{\partial N^j_{t+1}}\,q^v_t. \tag{18.52}$$
Equation (18.51) implies that $K^j_t/N^j_t$ does not depend on the firm identity $j$.
By the envelope condition,
$$\frac{\partial J_t\left(N^j_t\right)}{\partial N^j_t} = A_t F_2\left(K^j_t, A_t N^j_t\right) - w_t + (1-\rho)E_t\frac{\beta\Lambda_{t+1}}{\Lambda_t}\frac{\partial J_{t+1}\left(N^j_{t+1}\right)}{\partial N^j_{t+1}}. \tag{18.53}$$
Define the marginal firm surplus as
$$S^F_t = \frac{\partial J_t\left(N^j_t\right)}{\partial N^j_t}.$$
Using this definition and equations (18.52) and (18.53), we derive
$$S^F_t = A_t F_2\left(K^j_t, A_t N^j_t\right) - w_t + \frac{(1-\rho)\kappa}{q^v_t}. \tag{18.54}$$
Substituting back into equation (18.52) yields
$$\frac{\kappa}{q(\theta_t)} = E_t\frac{\beta\Lambda_{t+1}}{\Lambda_t}\left\{A_{t+1}F_2\left(K^j_{t+1}, A_{t+1}N^j_{t+1}\right) - w_{t+1} + \frac{(1-\rho)\kappa}{q(\theta_{t+1})}\right\}. \tag{18.55}$$
Next, we compute the firm's value when firm $j$ hires an additional $\varepsilon$ workers at the beginning of period $t$. The firm pays these workers the wage rate $w$. It solves the following problem:
$$J_t\left(N^j_t+\varepsilon\right) = \max_{K^j_t, v^j_t}\; F\left(K^j_t, A_t(N^j_t+\varepsilon)\right) - w_t N^j_t - w\varepsilon - R_{kt}K^j_t - v^j_t\kappa + E_t\frac{\beta\Lambda_{t+1}}{\Lambda_t}J_{t+1}\left(N^j_{t+1}\right),$$
subject to
$$N^j_{t+1} = (1-\rho)\left(N^j_t+\varepsilon\right) + q^v_t v^j_t.$$
By the envelope condition,
$$\frac{\partial J_t\left(N^j_t+\varepsilon\right)}{\partial\varepsilon}\bigg|_{\varepsilon=0} = A_t F_2\left(K^j_t, A_t N^j_t\right) - w + (1-\rho)E_t\frac{\beta\Lambda_{t+1}}{\Lambda_t}\frac{\partial J_{t+1}\left(N^j_{t+1}\right)}{\partial N^j_{t+1}}$$
$$= w_t - w + \frac{\partial J_t\left(N^j_t\right)}{\partial N^j_t} \equiv J_n(w, t),$$
where the second equality follows from (18.53). The value $J_n(w, t)$ denotes the marginal surplus to the firm when it hires an additional worker at the wage rate $w$.
Nash Bargained Wages
The firm and the newly hired household member bargain over the wage rate $w$ using the Nash bargaining solution. The Nash bargaining problem is given by
$$w_t = \arg\max_w\; V_n(w, t)^{\eta}\, J_n(w, t)^{1-\eta},$$
where we impose that the Nash solution gives the same wage $w_t$ as that offered to the previously hired workers. By the first-order condition,
$$\eta\frac{\partial V_n(w, t)}{\partial w}J_n(w, t) + (1-\eta)\frac{\partial J_n(w, t)}{\partial w}V_n(w, t) = 0.$$
Setting $w = w_t$ and using the definitions of $J_n(w, t)$ and $V_n(w, t)$, we obtain
$$\eta\frac{\partial J_t\left(N^j_t\right)}{\partial N^j_t} = (1-\eta)\frac{1}{\Lambda_t}\frac{\partial V_t(K_t, N_t)}{\partial N_t}.$$
This condition can be equivalently derived from the following Nash bargaining problem:
$$\max_{w_t}\;\left(S^H_t\right)^{\eta}\left(S^F_t\right)^{1-\eta}.$$
The first-order condition is given by
$$(1-\eta)S^H_t = \eta S^F_t.$$
Using this condition, we can rewrite equation (18.50) as
$$\frac{\eta}{1-\eta}S^F_t = w_t - b - \frac{\chi N_t^{\phi}}{\Lambda_t} + \left(1-\rho - q(\theta_t)\theta_t\right)E_t\frac{\beta\Lambda_{t+1}}{\Lambda_t}\frac{\eta}{1-\eta}S^F_{t+1}.$$
Substituting equations (18.54) and (18.52) yields
$$\frac{\eta}{1-\eta}\left(A_t F_2\left(K^j_t, A_t N^j_t\right) - w_t + \frac{(1-\rho)\kappa}{q^v_t}\right) = w_t - b - \frac{\chi N_t^{\phi}}{\Lambda_t} + \left(1-\rho - q(\theta_t)\theta_t\right)\frac{\eta}{1-\eta}\frac{\kappa}{q^v_t}.$$
Solving for $w_t$ yields
$$w_t = (1-\eta)\left(b + \frac{\chi N_t^{\phi}}{\Lambda_t}\right) + \eta\left[A_t F_2\left(K^j_t, A_t N^j_t\right) + \kappa\theta_t\right]. \tag{18.56}$$
Since $K^j_t/N^j_t$ is identical across firms $j$, we deduce that the wage rate $w_t$ does not depend on $j$.
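The algebra from the surplus-splitting condition to (18.56) can be verified numerically. In the sketch below (random but otherwise arbitrary values), $X$ stands for the marginal product $A_tF_2$, $D$ for $\chi N_t^{\phi}/\Lambda_t$, $Q$ for $\kappa/q^v_t$, and $f$ for $q(\theta_t)\theta_t$, so that $\kappa\theta_t = fQ$; the check confirms that the wage (18.56) solves the equation preceding it:

```python
import random

# Hypothetical stand-ins: X = A_t*F_2, D = chi*N^phi/Lambda, Q = kappa/q^v,
# f = q(theta)*theta, so that kappa*theta = f*Q.
random.seed(0)
max_gap = 0.0
for _ in range(1000):
    eta = random.uniform(0.05, 0.95)   # bargaining power
    rho = random.uniform(0.0, 0.5)     # separation rate
    X = random.uniform(0.5, 2.0)
    D = random.uniform(0.0, 1.0)
    b = random.uniform(0.0, 0.5)
    Q = random.uniform(0.1, 2.0)
    f = random.uniform(0.0, 1.0)
    # Wage equation (18.56): w = (1-eta)*(b + D) + eta*(X + kappa*theta)
    w = (1 - eta) * (b + D) + eta * (X + f * Q)
    # Equation it was solved from:
    lhs = eta / (1 - eta) * (X - w + (1 - rho) * Q)
    rhs = w - b - D + (1 - rho - f) * eta / (1 - eta) * Q
    max_gap = max(max_gap, abs(lhs - rhs))
```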
Equilibrium
Define aggregate output, aggregate capital, aggregate employment, and aggregate vacancies as $Y_t = \int F(K^j_t, A_t N^j_t)\,dj$, $K_t = \int K^j_t\,dj$, $N_t = \int N^j_t\,dj$, and $v_t = \int v^j_t\,dj$. A search equilibrium consists of stochastic processes $\{Y_t, C_t, K_t, N_t, u_t, v_t, \theta_t, w_t, R_{kt}\}$ such that the equations $Y_t = F(K_t, A_t N_t)$, (18.49), (18.51), (18.48), $u_t = 1 - N_t$, $v_t = u_t\theta_t$, (18.55), (18.56), and the following resource constraint hold:
$$C_t + K_{t+1} - (1-\delta)K_t = F(K_t, A_t N_t) - \kappa v_t.$$
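To make the equilibrium system concrete, the following sketch computes a deterministic steady state under auxiliary assumptions that are not in the text: Cobb-Douglas production $F(K, AN) = K^a(AN)^{1-a}$, a matching function with $q(\theta) = m\theta^{-\alpha}$, risk neutrality ($\gamma = 0$), and no disutility of work ($\chi = 0$). All parameter values are illustrative, not a calibration.

```python
# Auxiliary assumptions (not in the text): F(K, AN) = K**a * (A*N)**(1-a),
# q(theta) = m * theta**(-alpha), gamma = 0 (risk neutrality), chi = 0.
a, A, delta, beta = 0.33, 1.0, 0.025, 0.99
rho, eta, b, kappa = 0.10, 0.5, 0.4, 0.3
m, alpha = 0.45, 0.5

# (18.49) in steady state pins down R_k; (18.51) then gives K/(A*N).
R_k = 1 / beta - (1 - delta)
k = (a / R_k) ** (1 / (1 - a))            # k = K / (A*N)
AF2 = A * (1 - a) * k ** a                # marginal product of labor

def jc_residual(theta):
    """Steady-state job creation condition (18.55) combined with the wage
    (18.56), which here reduces to w = (1-eta)*b + eta*(AF2 + kappa*theta)."""
    q = m * theta ** (-alpha)
    w = (1 - eta) * b + eta * (AF2 + kappa * theta)
    return kappa / q - beta * (AF2 - w + (1 - rho) * kappa / q)

# jc_residual is increasing in theta; bracket the root and bisect.
lo, hi = 1e-6, 50.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if jc_residual(mid) < 0:
        lo = mid
    else:
        hi = mid
theta = 0.5 * (lo + hi)

# Remaining steady-state objects.
f = m * theta ** (1 - alpha)              # job-finding rate q(theta)*theta
u = rho / (rho + f)                       # (18.48) in steady state
N = 1 - u
v = theta * u
K = k * A * N
Y = K ** a * (A * N) ** (1 - a)
C = Y - delta * K - kappa * v             # resource constraint
```

With risk aversion and a labor disutility term restored, consumption enters the wage equation through $\Lambda_t$, and the same computation becomes a joint fixed point in $(\theta, N, C)$; the one-dimensional bisection above is the degenerate special case.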
18.4 Exercises
1. For the model in Section 18.1, suppose that firms own capital and
make investment decisions. Households trade firm shares. Formulate
this market arrangement and define a search equilibrium. Analyze this
equilibrium.
2. Following a similar analysis in Section 18.1, derive the social planner’s
problem for the model in Section 18.2. Study efficiency properties of
the decentralized search equilibrium.
3. Suppose that there is a mean-preserving spread change in the distribu-
tion G. Study the steady-state effect on the reservation productivity
and the unemployment rate.
4. Following a similar analysis in Section 18.1, derive the social planner’s
problem for the model in Section 18.3. Study efficiency properties of
the decentralized search equilibrium.
5. Calibrate the model in Section 18.3 to match the US data. Solve the model numerically.
Chapter 19
New Keynesian Models
19.1 A Basic Model
19.2 A Medium-Scale DSGE Model
Smets and Wouters model
19.3 Exercises
Part IV
Further Topics
Chapter 20
Bandit Models
20.1 Basic Problem
20.2 Market Learning
20.3 Job Matching
20.4 Experimentation and Pricing
20.5 Exercises
Chapter 21
Recursive Utility
21.1 Utility Models
21.2 Optimal Growth
21.3 Portfolio Choice and Asset Pricing
21.4 Pareto Optimality
21.5 Exercises
Chapter 22
Recursive Contracts
22.1 Limited Commitment
22.2 Hidden Actions
22.3 Hidden Information
22.4 Exercises
Part V
Mathematical Appendix
Appendix A
Linear Algebra
A.1 Vector Spaces
A.2 Linear Transformations
A.3 Determinants
A.4 Eigentheory
A.5 Inner Product Spaces
A.6 Unitary Diagonalization Theorems
A.7 Quadratic Forms
A.8 Jordan Decomposition Theorem
Appendix B
Real and Functional Analysis
B.1 Metric Spaces and Normed Vector Spaces
B.2 The Contraction Mapping Theorem
B.3 Dual Spaces
B.4 Hahn-Banach Theorem
Appendix C
Convex Analysis
C.1 Convex Functions
C.2 The Theorem of the Maximum
C.3 Lagrange Theorem
C.4 Kuhn-Tucker Theorem
Appendix D
Measure and Probability Theory
D.1 Measurable Spaces
D.2 Measures
D.3 Measurable Functions
D.4 Integration
D.5 Product Spaces
D.6 The Monotone Class Lemma
D.7 Conditional Expectation
Definition D.7.1 Let $(X, \mathcal{X})$ be a measurable space and let $\{\lambda_n\}$ and $\lambda$ be measures in $P(X, \mathcal{X})$. Then $\{\lambda_n\}$ converges strongly to $\lambda$ if
$$\lim_{n\to\infty}\int f\,d\lambda_n = \int f\,d\lambda, \quad \text{all } f \in B(X, \mathcal{X}),$$
and if the rate of convergence is uniform for all $f$ such that $\|f\| = \sup_{x\in X}|f(x)| \le 1$.
Theorem D.7.1 The sequence of measures $\{\lambda_n\}$ converges strongly to $\lambda$ according to Definition D.7.1 if and only if one of the following two equivalent conditions holds:
$$\lim_{n\to\infty}\|\lambda_n - \lambda\|_{TV} = 0$$
or
$$\lim_{n\to\infty}|\lambda_n(A) - \lambda(A)| = 0, \quad A \in \mathcal{X},$$
where the latter convergence is uniform in $A$.
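To see that strong (total variation) convergence is strictly stronger than weak convergence, consider a standard example, stated here for illustration. Let $X = [0,1]$ with its Borel $\sigma$-algebra, let $\lambda_n = \delta_{1/n}$ be the point mass at $1/n$, and let $\lambda = \delta_0$. For every continuous $f$ we have $\int f\,d\lambda_n = f(1/n) \to f(0) = \int f\,d\lambda$, so $\lambda_n \to \lambda$ weakly. However, taking $A = \{0\}$,
$$\|\lambda_n - \lambda\|_{TV} \ge |\lambda_n(\{0\}) - \lambda(\{0\})| = 1 \quad \text{for all } n,$$
so $\{\lambda_n\}$ does not converge strongly to $\lambda$.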
References
Index
Markov, 3