
WORKING PAPER SERIES

NO. 478 / APRIL 2005

THE CONQUEST OF U.S. INFLATION

LEARNING AND ROBUSTNESS TO MODEL UNCERTAINTY

ECB CONFERENCE ON MONETARY POLICY AND IMPERFECT KNOWLEDGE

by Timothy Cogley and Thomas J. Sargent


In 2005 all ECB publications will feature a motif taken from the €50 banknote.

WORKING PAPER SERIES NO. 478 / APRIL 2005

This paper can be downloaded without charge from http://www.ecb.int or from the Social Science Research Network electronic library at http://ssrn.com/abstract_id=701291.

ECB CONFERENCE ON MONETARY POLICY AND IMPERFECT KNOWLEDGE

THE CONQUEST OF U.S. INFLATION

LEARNING AND ROBUSTNESS TO MODEL UNCERTAINTY 1

by Timothy Cogley 2

and Thomas J. Sargent 3

1 For comments and suggestions, we thank Willem Buiter, Bennett McCallum, Ulf Söderström, Harald Uhlig, an anonymous referee, and seminar participants at Georgetown, Harvard, Bocconi, the European Central Bank, the London Business School, and the University of Pennsylvania.

2 Department of Economics, University of California, Davis, One Shields Avenue, Davis, CA 95615, USA; e-mail: [email protected]

3 Corresponding author: Department of Economics, New York University, 269 Mercer Street, 7th Floor, New York, NY 10003, USA; Tel.: 212-998-3548; e-mail: [email protected]


© European Central Bank, 2005

Address: Kaiserstrasse 29, 60311 Frankfurt am Main, Germany

Postal address: Postfach 16 03 19, 60066 Frankfurt am Main, Germany

Telephone: +49 69 1344 0

Internet: http://www.ecb.int

Fax: +49 69 1344 6000

Telex: 411 144 ecb d

All rights reserved.

Reproduction for educational and non-commercial purposes is permitted provided that the source is acknowledged.

The views expressed in this paper do not necessarily reflect those of the European Central Bank.

The statement of purpose for the ECB Working Paper Series is available from the ECB website, http://www.ecb.int.

ISSN 1561-0810 (print)
ISSN 1725-2806 (online)

ECB Conference on “Monetary policy and imperfect knowledge”

This paper was presented at an ECB Conference on “Monetary policy and imperfect knowledge”, held on 14-15 October 2004 in Würzburg, Germany. The conference organisers were Vitor Gaspar, Charles Goodhart, Anil Kashyap and Frank Smets. The conference programme, including papers and discussions, can be found on the ECB’s web site (http://www.ecb.int/events/conferences/past/html/index.en.html).


CONTENTS

Abstract

1 Introduction

2 Learning and policy making with multiple approximating models
2.1 Preliminary adaptive setup with a single approximating model
2.2 Our three-model model
2.3 The central bank’s decision making process

3 A statistical history of thought and policy
3.1 Direction of fit
3.2 Sensitivity to discounting
3.3 Sensitivity to prior model weights
3.4 Anatomy of stabilization

4 ‘Triumph’ versus ‘vindication’

5 Conclusion

References

Tables and figures

European Central Bank working paper series


Abstract

Previous studies have interpreted the rise and fall of U.S. inflation after World War II in terms of the Fed’s changing views about the natural rate hypothesis but have left an important question unanswered. Why was the Fed so slow to implement the low-inflation policy recommended by a natural rate model even after economists had developed statistical evidence strongly in its favor? Our answer features model uncertainty. Each period a central bank sets the systematic part of the inflation rate in light of updated probabilities that it assigns to three competing models of the Phillips curve. Cautious behavior induced by model uncertainty can explain why the central bank presided over the inflation of the 1970s even after the data had convinced it to place by far the highest probability on the natural rate model.


JEL classification: E31, E58, E65.

Keywords: natural unemployment rate; Phillips curve; Bayes’ law; anticipated utility; robustness


1 Introduction

This paper uses a model of an adaptive monetary authority to interpret the rise and fall of U.S. inflation in the 1960s, 1970s, and 1980s.1 Following DeLong (1997), Taylor (1997), and Sargent (1999), we explore whether the rise and fall of U.S. inflation can be attributed to policy makers’ changing beliefs about the natural rate hypothesis. One story that emphasizes changing beliefs goes as follows. Samuelson and Solow (1960) taught policy makers that there was an exploitable long-run trade-off between inflation and unemployment, and inflation rose as the authorities tried to exploit the tradeoff. But that misguided policy experiment generated observations that taught the authorities the natural rate hypothesis, which eventually convinced them to reduce inflation.

The adverb “eventually” significantly qualifies this story because the data indicate that the authorities should have learned the natural rate model long before they acted on it. Sims (2001) and Cogley and Sargent (2001) demonstrate that the natural rate hypothesis should have been learned by the early 1970s, yet inflation remained high until the early 1980s.2 If the rise and fall of U.S. inflation reflected only changing beliefs about the natural rate hypothesis, and not, say, altered purposes or decision making arrangements, then it is puzzling that inflation remained high for a decade after substantial statistical evidence favoring the natural rate hypothesis had accumulated. By the early 1970s, average inflation was on the rise, yet average unemployment had not fallen, contrary to the Samuelson-Solow model. The events of the early 1970s turned the economics profession away from the Samuelson-Solow model. Why did policy makers wait a decade to act on this lesson?

A number of alternative explanations for the high U.S. inflation of the 1970s refrain from assigning an important role to policy makers’ changing beliefs about the natural rate hypothesis.

1 The model is what David Kreps (1998) called an anticipated utility model. In an anticipated utility model, a decision maker recurrently maximizes the expected utility of a stream of future outcomes with respect to a model that is recurrently reestimated. Although an anticipated utility agent optimizes and learns, he does not purposefully experiment. But as data accrue, he adapts and possibly even respecifies the model that he uses to evaluate expected utility.

2 Sims credits Albert Ando for first making this point.


For example, DeLong (1997) questions the motives of Arthur Burns. Parkin (1993) and Ireland (1999) say that discretionary policy making combined with a higher natural rate of unemployment resulted in a higher inflationary bias. Chari, Christiano, and Eichenbaum (1998) and Albanesi, Chari, and Christiano (2003) ascribe the high inflation to an expectations trap. Orphanides (2003), Lansing (1999), and Romer and Romer (2002) emphasize that the Federal Reserve was slow to detect the productivity slowdown and the rise in the natural rate of unemployment.3 Primiceri (2003) emphasizes evolution in the Fed’s estimates not only of the natural rate but also of the slope of the short-term trade-off between inflation and unemployment.4 We believe that there is some truth in all of these ideas but explore a different explanation. We show that concerns about the robustness of a proposed inflation-stabilization policy across alternative models would have induced policy makers to choose high inflation even though the data favored a model that recommended low inflation.5

Our calculations confirm that by the mid 1970s, zero inflation would have been optimal according to the model that was most probable among the ones that we consider. But because the optimal policy takes model uncertainty into account via Bayesian model averaging, that was not enough to convince our Bayesian policy maker to abstain from trying to exploit the Phillips curve. In two competing models that had smaller but still non-zero probability weights, a policy of quickly reducing inflation would have been calamitous. If very bad results are associated with a particular policy according to any of the models that retain a positive but small posterior probability, our Bayesian policy maker refrains from that policy. When outcomes from following a recommendation to stabilize inflation become worse under some particular worst-case model, the ultimate decisions tend to track the recommendations of that worst-case model.

3 See Christiano and Fitzgerald (2003) and Velde (2004) for critical surveys of these and other theories of the Great Inflation. See John Taylor (2002) for an account that emphasizes the Fed’s learning about theories of the natural unemployment rate.

4 Erceg and Levin (2003) investigate how the public’s learning about the Fed’s motives can explain the recession that accompanied the reduction of inflation that the Fed engineered under Paul Volcker.

5 Blinder (1998, pp. 12-13) recounts how he used multiple models to evaluate alternative policies when he was Vice Chairman of the Federal Reserve Board.


In this way, our statistical model rationalizes the idea that the high inflation of the 1970s reflected the Fed’s desire to guard against the bad outcomes that would have come from a low-inflation policy under models that by the 1970s should have had low posterior probabilities.

2 Learning and policy making with multiple approximating models

2.1 Preliminary adaptive setup with a single approximating model

We model the U.S. monetary authority as an adaptive decision maker like ones described by Kreps (1998) and Sargent (1999) in the context of models of the Phillips curve like ones in Sargent (1999), augmented with features that introduce a concern for robustness to model uncertainty. The models in Sargent’s monograph work as follows. In truth, but unbeknownst to the central bank, a natural rate version of a Phillips curve relates unemployment to surprise inflation,

$$u_t - u_t^* = -\theta(y_t - x_t) + \rho(L)u_{t-1} + \eta_t, \qquad (1)$$

where $u_t$ is unemployment, $u_t^*$ is the natural rate, $y_t$ is actual inflation, $x_t$ is both the systematic part of inflation and the rate of inflation expected by the public, and $\eta_t$ is an iid normal shock with mean zero and variance $\sigma_\eta^2$. The central bank sets $x_t$ with a decision rule that is described below; $x_t$ is related to actual inflation according to

$$y_t = x_t + \xi_t, \qquad (2)$$

where $\xi_t$ is an iid normal shock with mean zero and variance $\sigma_\xi^2$.

The central bank does not know equation (1), and instead bases its decisions on a statistical approximating model,

$$Y_t = X_t'\theta + \nu_t, \qquad (3)$$

where $X_t$ and $Y_t$ represent generic right- and left-hand variables in a regression, and $\theta$ is a vector of regression parameters.


The central bank starts date $t$ with an estimated approximating model carried over from date $t-1$. It forms a decision rule for $x_t$ by minimizing a discounted quadratic loss function,6

$$L = E_t \sum_{j=0}^{\infty} \beta^j \left(u_{t+j}^2 + \lambda y_{t+j}^2\right), \qquad (4)$$

subject to a constraint implied by its time $t-1$ estimates of model (3). This induces a best-response mapping from the loss function and the time $t-1$ estimates of the parameters of its approximating model to those of a time $t$ policy rule, the first period outcome of which is a time $t$ policy action $x_t$. To emphasize its dependence on predetermined state variables and parameter estimates, we denote it as $x_{t|t-1}$. The policy action $x_t = x_{t|t-1}$ influences outcomes through equation (1). After observing outcomes $Y_t$, $X_t$ at $t$, the central bank re-estimates the parameters of its policy model (equation (3)), preparing to repeat the same decision process in the next period.

As the central bank’s beliefs (i.e., the parameters of its approximating model) evolve with the accumulation of data, so too does its policy rule.
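To fix ideas, here is a minimal runnable sketch of this adapt-and-re-optimize loop. It is our own toy illustration, not the paper’s computation: the bank’s approximating model is collapsed to a static regression $u_t = a + b y_t + e_t$, the lag polynomial $\rho(L)$ in (1) is dropped, the bank re-optimizes myopically instead of solving the full discounted problem (4), and every numerical value is invented.

```python
# Toy anticipated-utility loop: act on inherited beliefs, observe outcomes
# from the true model (1)-(2), re-estimate, re-optimize. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
theta_true, u_star = 1.0, 5.0         # slope and natural rate in the true model (1)
sig_xi, sig_eta, lam = 0.3, 0.3, 1.0  # shock std devs; weight on inflation in (4)

beta_hat = np.array([0.0, -1.0])      # bank's estimates (a, b) of u = a + b*y + e
P = 10.0 * np.eye(2)                  # covariance scale of those estimates

def best_response(a, b, lam):
    # Myopic rule: minimize (a + b*x)^2 + lam*x^2, treating estimates as truth
    return -a * b / (b ** 2 + lam)

for t in range(500):
    x = best_response(*beta_hat, lam)            # x_{t|t-1}, fixed before shocks
    xi, eta = rng.normal(0, sig_xi), rng.normal(0, sig_eta)
    y = x + xi                                   # actual inflation, equation (2)
    u = u_star - theta_true * (y - x) + eta      # true Phillips curve, equation (1)
    # Recursive least squares update of the approximating model
    X = np.array([1.0, y])
    K = P @ X / (1.0 + X @ P @ X)
    beta_hat = beta_hat + K * (u - X @ beta_hat)
    P = P - np.outer(K, X @ P)

print(beta_hat, best_response(*beta_hat, lam))
```

Each pass through the loop mirrors the text: the action is committed on beliefs inherited from $t-1$, outcomes are generated by the true model, and only then are beliefs and the rule revised.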

2.2 Our three-model model

In contrast to Sargent (1999), we take no stand on the true data generating process. We drop Sargent’s (1999) assumption that the central bank has a unique approximating model and instead assume that the central bank acknowledges multiple models.7 Model uncertainty affects the bank’s deliberations and induces a Bayesian form of robustness of decision rules at least across the domain of alternative models that are on the table. Our central bank adopts a ‘blue-collar’ version of the approach to policy making presented in Brock, Durlauf, and West (2003).

We choose three approximating models to represent Phillips curve specifications that have been influential since the mid to late 1960s. The first model allows a permanently exploitable Phillips curve, the second a temporarily exploitable one, and the third a statistical Phillips curve that is not exploitable even temporarily.

6 Our linear-quadratic framework abstracts from concerns about the zero bound on nominal interest rates. That could make preferences asymmetric with respect to inflation and deflation and cause policy makers to set an inflation target above zero.

7 The sense in which the central bank has ‘a model’ is as a weighted average across several models.


The first statistical model is inspired by Samuelson and Solow (1960),

$$y_t = \gamma_0 + \gamma_1(L)y_{t-1} + \gamma_2(L)u_t + \eta_{1t}, \qquad (5)$$

where $\eta_{1t}$ is iid $N(0, \sigma_1^2)$. Its key feature is that it permits a long-run tradeoff unless the parameter values take special configurations. We assume that the central bank considers an unrestricted parameter configuration in order to allow the possibility of a long-run trade-off. The bank also makes the identifying assumption that $\eta_{1t}$ is orthogonal to the right-side variables, including current unemployment.8 This makes equation (5) a regression.
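To make the long-run tradeoff explicit, here is a steady-state calculation of our own (not an equation from the paper): set $y_t = \bar{y}$ and $u_t = \bar{u}$ in (5) and take expectations,

$$\bar{y} = \gamma_0 + \gamma_1(1)\,\bar{y} + \gamma_2(1)\,\bar{u} \quad\Longrightarrow\quad \bar{y} = \frac{\gamma_0 + \gamma_2(1)\,\bar{u}}{1 - \gamma_1(1)},$$

so the long-run Phillips curve has slope $\gamma_2(1)/(1 - \gamma_1(1))$ and ceases to offer a tradeoff only in the special configuration $\gamma_1(1) = 1$ (with $\gamma_0 = 0$).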

Model 2 is a restricted form of model 1, inspired by Solow’s (1968) and Tobin’s (1968) suggestion about how to represent the natural rate hypothesis:

$$\Delta y_t = \delta_1(L)\Delta y_{t-1} + \delta_2(L)(u_t - u_t^*) + \eta_{2t}. \qquad (6)$$

Like the Samuelson-Solow model, this features an exploitable short-run tradeoff between inflation and unemployment, but it recognizes a distinction between actual unemployment and the natural rate, and it imposes Solow and Tobin’s version of long-run neutrality. The restriction that the intercept is zero and that the sum of the lag weights on $y_t$ equals one enforces that the long-run Phillips curve is vertical, located at $u_t^*$ if the roots of $\delta_2(L)$ are outside the unit circle. We assume that the authorities make the identifying assumption that $\eta_{2t}$ is uncorrelated with current unemployment, so that equation (6) is also a regression.
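To see the verticality concretely (again our own steady-state arithmetic, not the paper’s): at a steady state with constant inflation, $\Delta y_t = 0$, so (6) collapses to

$$0 = \delta_2(1)\,(\bar{u} - u_t^*) \quad\Longrightarrow\quad \bar{u} = u_t^*,$$

for any constant inflation rate, provided $\delta_2(1) \neq 0$; equivalently, (6) is (5) with $\gamma_0 = 0$ and the lag weights on inflation summing to one.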

Our third model is inspired by Lucas (1972) and Sargent (1973), and it enforces both long-run neutrality and a policy ineffectiveness proposition,

$$u_t - u_t^* = \phi_1(y_t - x_{t|t-1}) + \phi_2(L)(u_{t-1} - u_{t-1}^*) + \eta_{3t}. \qquad (7)$$

8 Sargent (1999) had more success tracking actual inflation with this ‘Keynesian’ direction of fit. In contrast, a ‘classical’ direction of fit would put unemployment on the left-hand side, as in equation (1). See King and Watson (1994) and Sargent (1999) for background on how the direction of fit matters for the properties of Phillips curves. By adopting the model of government learning of Sargent and Williams (2003), Sargent, Williams, and Zha (2004) obtain much better econometric explanations of the U.S. inflation path than were obtained by Sargent (1999).


This model says that only unexpected inflation matters for unemployment. When updating estimates of this model, the latent variable $x_{t|t-1}$ is measured recursively as the optimal policy of a Bayesian linear regulator who solves a discounted quadratic control problem, taking all three approximating models into account. We describe how the bank chooses $x_{t|t-1}$ in detail below.

This model also reverses the direction of fit, putting unemployment on the left side and inflation on the right. When entertaining this representation, the central bank makes the identifying assumption that $\eta_{3t}$ is orthogonal to current inflation. This identifying assumption differs from those of the other two models, but what matters for statistical updating is the internal logic of each. Thus, we can proceed with regression updates for this model as well.

As we demonstrate later, the assumption that the Keynesian models are estimated with a Keynesian direction of fit is important for our story because it influences estimates of sacrifice ratios. Models estimated in the Keynesian direction imply a high sacrifice ratio in the 1970s, while those estimated in a classical direction predict a low sacrifice ratio. We could incorporate uncertainty about the direction of fit by expanding our three-model model to include versions of the Samuelson-Solow and Solow-Tobin models estimated with a classical direction of fit. But this seems revisionist, and in any case does not alter the results. Policy choices for the five-model model are essentially the same as those for the three-model model.9 Accordingly, we focus on the more parsimonious three-model representation.

Associated with each model and date is a posterior probability, $\alpha_{it}$, $i = 1, 2, 3$. We assume the authorities entertain no other possibilities, so $\alpha_{1t} + \alpha_{2t} + \alpha_{3t} = 1$ for all $t$. One obvious shortcoming of this approach is the assumption of an exhaustive list of explicitly specified possible models. This has led Hansen and Sargent (2001 and 2002) and others to specify a single approximating model, to surround it with an uncountable cloud of alternative models each of whose entropy relative to the approximating model is bounded, and to use a minimax decision rule because the decision maker declines to put a unique prior over that huge set of models.

9 This happens because the additional models do not alter the identity of the ‘worst-case’ model. The significance of this fact will become clear later on.


Relative to Hansen and Sargent’s robust decision maker and his continuum of vaguely specified models, our three-model decision maker knows much more. He can construct a posterior over his three models, then make decisions by an appropriate form of model averaging. Nevertheless, a minimax flavor emerges from our Bayesian calculations with three models.

The central bank’s decision process is similar to the one in Sargent’s (1999) model that we described above, but with an extra step. At each date $t$, the central bank implements the action $x_{t|t-1}$ recommended by last period’s policy rule. Next, it gets new data $y_t$, $u_t$ and updates the estimates of each of its three models according to Bayes’ theorem. It also recalculates the model weights $\alpha_{it}$ by evaluating the marginal likelihood associated with each model. Then it re-optimizes in light of its revised view of the world, preparing a contingency plan $x_{t+1|t}$ for the next period.10
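The weight-recalculation step can be sketched compactly. Because a model’s marginal likelihood is the product of its one-step-ahead predictive densities, the weights $\alpha_{it}$ can be updated recursively. The sketch below is our own illustration: Gaussian predictive densities stand in for the Student-$t$ predictives implied by the normal-inverse gamma posteriors described in Section 2.3, and the forecast numbers are invented.

```python
# Recursive model-weight update: prior weight times one-step-ahead
# predictive likelihood, renormalized so the weights sum to one.
import numpy as np
from scipy.stats import norm

def update_weights(alpha, predictions, y_new):
    """alpha: current weights; predictions: (mean, sd) forecasts per model."""
    likelihood = np.array([norm.pdf(y_new, m, s) for m, s in predictions])
    posterior = alpha * likelihood      # Bayes' theorem, model by model
    return posterior / posterior.sum()  # renormalize

alpha = np.array([1 / 3, 1 / 3, 1 / 3])              # equal initial weights
predictions = [(2.0, 1.0), (1.0, 0.5), (0.0, 1.5)]   # hypothetical forecasts
print(update_weights(alpha, predictions, y_new=0.8)) # mass moves to good forecasters
```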

Notice that with this timing protocol, the Lucas-Sargent model always recommends $x_{t|t-1} = 0$. There is no inflation bias because the central bank moves first, setting policy for date $t$ based on information available at $t-1$, and there is no reason to vary $x_{t|t-1}$ to stabilize $u_t$ because systematic policy is neutral. Because variation in $y_t$ is costly and there are no offsetting benefits in terms of reduced variation in $u_t$, the optimal policy within this model is $x^{LS}_{t|t-1} = 0$.11
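A one-line version of this argument (our gloss on the text): under (7) and (2), $y_t - x_{t|t-1} = \xi_t$, so the period loss expected at $t-1$ is

$$E_{t-1}\!\left[u_t^2 + \lambda y_t^2\right] = E_{t-1}\!\left[u_t^2\right] + \lambda\left(x_{t|t-1}^2 + \sigma_\xi^2\right),$$

where the first term does not depend on $x_{t|t-1}$ under (7); every term of the loss (4) is therefore minimized by $x_{t|t-1} = 0$.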

10 The observational equivalence of natural and unnatural rate models pointed out by Sargent (1976) pertains to our calculations in an important way. Using the Wold decomposition theorem, Sargent demonstrated that under a time-invariant decision rule for the systematic part of inflation, models that make unemployment depend on a distributed lag of inflation are observationally equivalent with ones that make unemployment depend on a distributed lag of surprises in inflation. That result seems to suggest that our three models should be difficult to distinguish, but in fact there are two sources of information that inform the posterior probabilities $\alpha_{it}$. One involves a distinction between variation within and across monetary regimes. Although our models are observationally equivalent within a monetary regime, they are not equivalent across regimes, because the Lucas-Sargent model says that the equilibrium law of motion for unemployment is invariant across policy rules while the Keynesian models say it is not. In the calculations reported below, $x_{t|t-1}$ is formed not from a time-invariant decision rule but from one that changes from period to period. The weight on the Lucas-Sargent model rises when new data support the invariance property, and it falls when they do not. The second force for change in $\alpha_{it}$ is parsimony. The Solow-Tobin model is nested within the Samuelson-Solow model but has fewer parameters, and our calculations credit that parsimony. If new data suggest the Solow-Tobin model fits about as well, then its posterior weight will rise relative to the Samuelson-Solow model. During an age before the advent of non-parametric (i.e., infinite-parametric) models, Arthur Goldberger is reputed often to have warned against proliferating free parameters. Robert Lucas (1981, p. 188) said that “. . . it is useful, in a general way, to be hostile toward theorists bearing free parameters.” Our Bayesian calculations are hostile toward additional free parameters.



2.3 The central bank’s decision making process

Now we turn to the details of the central bank’s decision process. The first of the bank’s three tasks is to update parameter estimates for each of its approximating models. Within each model, the central bank’s identifying assumptions make equations (5), (6), and (7) regressions. In each case, we assume the bank adopts a normal-inverse gamma prior,

$$p(\theta, \sigma^2) = p(\theta|\sigma^2)\,p(\sigma^2), \qquad (8)$$

where $\sigma^2$ is the variance of Phillips curve residuals. The marginal prior $p(\sigma^2)$ makes the error variance an inverse gamma variate, and the conditional prior $p(\theta|\sigma^2)$ makes the regression parameters a normal random vector. Along with this, we assume that the Phillips curve residuals $\eta_{it}$ are identically and independently distributed and conditionally normal given the right-hand variables. These assumptions make the conditional likelihood function normal. With a normal-inverse gamma prior and a Gaussian conditional likelihood, the posterior also belongs to the normal-inverse gamma family, and its parameters can be updated recursively.

In particular, let $Z^t$ summarize the joint history of $(X_t, Y_t)$ up to date $t$. Before seeing data at $t$, the central bank’s prior is

$$p(\theta|\sigma^2, Z^{t-1}) = N(\theta_{t-1},\, \sigma^2 P_{t-1}^{-1}), \qquad (9)$$
$$p(\sigma^2|Z^{t-1}) = IG(s_{t-1}, v_{t-1}),$$

where $\theta_{t-1}$, $P_{t-1}$, $s_{t-1}$, and $v_{t-1}$ represent estimates based on data through period $t-1$. The variable $P_{t-1}$ is a precision matrix, $s_{t-1}$ is a scale parameter for the inverse-gamma density, and $v_{t-1}$ counts degrees of freedom. The estimate of $\sigma^2$ is just $s_{t-1}/v_{t-1}$. After

11 See Stokey (1989) for a discussion of time inconsistency problems in terms of alternative timing protocols. See Sargent (1999, chapter 3) for an application of Stokey’s analysis to the Phillips curve.

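For concreteness, here is a minimal sketch of one step of the recursive updating behind (9). It is our own illustration under the usual conjugate normal-inverse gamma conventions; the recursions are the standard textbook ones rather than formulas quoted from the paper, and the prior values are invented.

```python
# One-observation update of the normal-inverse gamma posterior
# N(theta, sigma^2 * P^{-1}) x IG(s, v) for a regression Y = X'theta + error.
import numpy as np

def nig_update(theta, P, s, v, X_t, Y_t):
    e = Y_t - X_t @ theta                    # one-step-ahead forecast error
    P_new = P + np.outer(X_t, X_t)           # precision accumulates X_t X_t'
    theta_new = np.linalg.solve(P_new, P @ theta + X_t * Y_t)
    s_new = s + e ** 2 / (1.0 + X_t @ np.linalg.solve(P, X_t))
    return theta_new, P_new, s_new, v + 1    # one more degree of freedom

theta, P, s, v = np.zeros(2), np.eye(2), 1.0, 1.0   # illustrative prior
theta, P, s, v = nig_update(theta, P, s, v, np.array([1.0, 0.5]), 2.0)
print(theta, s / v)   # point estimates of theta and of sigma^2 = s_t / v_t
```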