Calculating Interval Forecasts
Chapter 7 (Chatfield)

Monika Turyna & Thomas Hrdina
Department of Economics, University of Vienna
Summer Term 2009

Outline: Preliminaries · PMSE · Calculating P.I.s · Concluding Remarks


Terminology

• An interval forecast consists of an upper and a lower limit between which a future value is expected to lie with a prescribed probability.

• Example: Inflation in the next quarter will lie in the interval [1%, 2.5%] with a 90% probability.

• The limits are called forecast limits or prediction bounds, while the interval is referred to as a prediction interval (P.I.).

• Note: the term confidence interval usually applies to estimates of fixed but unknown parameter values, while a P.I. is an estimate of an unknown future value of a random variable.


Focus of Attention

• In what follows we concentrate on computing P.I.s for a single variable at a single (future) time point

• We do not cover the more difficult problem of calculating a P.I. for a single variable over a longer time horizon or a P.I. for different variables at the same time point

• Furthermore, we will cover a variety of approaches for calculating P.I.s, since the various methods for forecasting time series require different approaches


Importance of P.I.s

• Predictions in the form of point forecasts provide no guidance as to their likely accuracy

• P.I.s, in contrast,
  • allow future uncertainty to be assessed,
  • enable different strategies to be planned for the different outcomes indicated by the P.I., and
  • make a thorough comparison of forecasts from different methods possible


Some Reasons for Disregard

• Rather neglected topic in the statistical literature, i.e. in textbooks and journal papers

• No generally accepted method of calculating P.I.s

• Theoretical P.I.s are difficult or impossible to evaluate (e.g. for some multivariate or non-linear models)

• Properties of empirically based methods, i.e. those based on within-sample residuals, have been little studied

• Software packages do not produce P.I.s at all or only on a limited scale


Density Forecasts

• Finding the entire probability distribution of a future value is called density forecasting

• For linear models with normally distributed innovations, the density forecast is usually a normal distribution with mean equal to the point forecast and variance equal to that used for computing the prediction interval

• Given the density forecast, one can construct P.I.s for any desired level of probability

• Problem: the forecast error distribution may not be normal


Fan Charts

• Idea: for consecutive future values, construct several P.I.s for different probabilities (e.g. from 10% to 90%) and plot them in the same graph, using different levels of shading for the different probabilities

• Usually the darkest shade covers the P.I.s with a 10% probability while the lightest shade covers the P.I.s with a 90% probability

• Intervals typically get wider with the horizon, indicating increasing uncertainty about future values, i.e. they fan out

• “Fan charts could become a valuable tool for presenting the uncertainty attached to forecasts [...]” (Chatfield, 2000)
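
The bands of a fan chart can be sketched numerically. The following is a minimal illustration, assuming a normal density forecast at every horizon; the function name `fan_chart_bands`, the forecasts and the standard deviations are hypothetical and not taken from the lecture.

```python
from statistics import NormalDist

def fan_chart_bands(point_forecasts, error_sds, coverages=(0.1, 0.5, 0.9)):
    """Return {coverage: [(lower, upper), ...]} with one interval per horizon.

    Assumes a normal density forecast at each horizon, centred on the point
    forecast with the given forecast-error standard deviation.
    """
    bands = {}
    for p in coverages:
        # Central interval with probability p: (1 - p) / 2 is cut off in each tail.
        z = NormalDist().inv_cdf(0.5 + p / 2)
        bands[p] = [(f - z * s, f + z * s)
                    for f, s in zip(point_forecasts, error_sds)]
    return bands

# Hypothetical inflation forecasts (%) for horizons 1..4 with growing uncertainty.
forecasts = [2.0, 2.1, 2.2, 2.2]
sds = [0.3, 0.5, 0.7, 0.9]
bands = fan_chart_bands(forecasts, sds)
for p, intervals in bands.items():
    print(p, [(round(lo, 2), round(hi, 2)) for lo, hi in intervals])
```

Plotting the nested intervals with darker shading for smaller probabilities would give the fan shape described above.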


Density Forecasts & Fan Charts: Example (figures not preserved)


Notation: Conditional Forecast Error

• An observed time series $(x_1, x_2, \ldots, x_N)$ is regarded as a finite realisation of a stochastic process $\{X_t\}$

• The point forecast of the random variable $X_{N+h}$, conditional on data up to time $N$, is denoted by $\hat{X}_N(h)$ when regarded as a random variable and by $\hat{x}_N(h)$ when regarded as a particular value

• The conditional forecast error, conditional on data up to time $N$ and on the particular forecast, is the random variable $e_N(h) = X_{N+h} - \hat{x}_N(h)$

• The observed value of the error, $x_{N+h} - \hat{x}_N(h)$, only becomes available at time $N + h$


Notation: Forecast Errors & Fitted Residuals

• It is important to differentiate between the (out-of-sample) conditional forecast errors $e_N(h)$, the fitted residuals (within-sample “forecast” errors) and the model innovations

• The out-of-sample observed forecasting errors $x_{N+h} - \hat{x}_N(h)$ are true forecasting errors

• The within-sample observed “forecasting” errors $[x_t - \hat{x}_{t-1}(1)]$, for $t = 2, \ldots, N$, are merely the residuals from the fitted model, i.e. the differences between observed and fitted values


Notation: Forecast Errors & Fitted Residuals (cont.)

• These fitted residuals will not be the same as the true model innovations because they depend on parameter estimates (and perhaps on estimated starting values)

• They are also not true forecasting errors if the parameters have been estimated using data up to $N$

• However, if one finds the “true” model and the latter does not change, then it is reasonable to expect the true forecast errors to have similar properties to the fitted residuals

• Practice: true forecast errors tend to be larger than expected from the within-sample fit (change in the model?)


Prediction Mean Square Error (PMSE)

• The uncertainty of an h-step-ahead forecast of a single variable is assessed with its prediction mean square error, given by $E[e_N(h)^2]$

• For an unbiased forecast, i.e. where the conditional expectation of $X_{N+h}$ given data up to time $N$ equals $\hat{x}_N(h)$, it holds that $E[e_N(h)] = 0$ and thus $E[e_N(h)^2] = \mathrm{Var}[e_N(h)]$

• Note: we are not interested in the variance of the forecast but in the variance of the forecast error (the particular value of the point forecast $\hat{x}_N(h)$ is determined exactly and has variance zero)


Prediction Mean Square Error (PMSE): Example

• Consider the zero-mean AR(1) process $X_t = \phi_1 X_{t-1} + \varepsilon_t$, where the $\{\varepsilon_t\}$ are independent $N(0, \sigma_\varepsilon^2)$

• Assuming that we know $\phi_1$ and $\sigma_\varepsilon^2$, it can be shown that the “true-model” PMSE, i.e. the true forecast error variance, is given by

$$E[e_N(h)^2] = \sigma_\varepsilon^2 \, \frac{1 - \phi_1^{2h}}{1 - \phi_1^2} \qquad (1)$$

• In practice we would have to replace $\phi_1$ and $\sigma_\varepsilon^2$ with sample estimates
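
Equation (1) is easy to check numerically. The sketch below compares the closed form with the sum of squared weights $\phi_1^{2j}$ that arises when the h-step error is written as a weighted sum of future innovations; the function names and parameter values are illustrative assumptions, and $|\phi_1| < 1$ is required.

```python
def ar1_pmse(phi, sigma2, h):
    """True-model PMSE of the h-step-ahead forecast, as in equation (1)."""
    return sigma2 * (1 - phi ** (2 * h)) / (1 - phi ** 2)

def ar1_pmse_by_sum(phi, sigma2, h):
    """Same quantity as a sum: the h-step error is sum_{j<h} phi^j * eps_{N+h-j}."""
    return sigma2 * sum(phi ** (2 * j) for j in range(h))

phi, sigma2 = 0.7, 1.0  # illustrative values, not from the lecture
for h in (1, 2, 5, 50):
    print(h, round(ar1_pmse(phi, sigma2, h), 4), round(ar1_pmse_by_sum(phi, sigma2, h), 4))
```

For large h the PMSE levels off at $\sigma_\varepsilon^2 / (1 - \phi_1^2)$, the variance of the process itself.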


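Prediction Mean Square Error (PMSE): Example (cont.)

• Assume that we want to calculate the PMSE of a one-step-ahead forecast, i.e. we want to know $E[e_N(1)^2]$

• Since we do not know the true parameter values, the one-step-ahead forecasting error is given by

$$e_N(1) = X_{N+1} - \hat{x}_N(1) = \phi_1 x_N + \varepsilon_{N+1} - \hat{\phi}_1 x_N = (\phi_1 - \hat{\phi}_1)\, x_N + \varepsilon_{N+1}$$

• If the estimate $\hat{\phi}_1$ is unbiased and its sampling variability is ignored, we get $E[e_N(1)^2] = E[\varepsilon_{N+1}^2] = \sigma_\varepsilon^2$, which is the same as for $h = 1$ in equation (1); the term $(\phi_1 - \hat{\phi}_1)\, x_N$ is where parameter uncertainty enters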
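
A short derivation, not part of the original slide, indicates the size of the term that is dropped when the sampling variability of the estimate is ignored. It treats $x_N$ as fixed and assumes that $\hat{\phi}_1$ is unbiased and independent of $\varepsilon_{N+1}$; the large-sample variance $(1 - \phi_1^2)/N$ used in the last step is the usual approximation for an AR(1) estimate and is an assumption here.

```latex
\begin{align*}
E[e_N(1)^2 \mid x_N]
  &= x_N^2\, E[(\phi_1 - \hat{\phi}_1)^2]
     + 2 x_N\, E[\phi_1 - \hat{\phi}_1]\, E[\varepsilon_{N+1}]
     + E[\varepsilon_{N+1}^2] \\
  &= x_N^2\, \mathrm{Var}[\hat{\phi}_1] + \sigma_\varepsilon^2
  \;\approx\; \sigma_\varepsilon^2 + x_N^2\, \frac{1 - \phi_1^2}{N}
\end{align*}
```

Under these assumptions the PMSE exceeds $\sigma_\varepsilon^2$ by a term of order $1/N$.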


Prediction Mean Square Error (PMSE): Bias Correction

• The previous example shows that analysts have to bear in mind the effects of parameter uncertainty on estimates of the PMSE

• Though PMSE estimates can be corrected to allow for parameter uncertainty, the formulas are complex and the corrections are merely of order $1/N$

• “Overall, the effect of parameter uncertainty seems likely to be of a smaller order of magnitude in general than that due to other sources, notably the effects of model uncertainty and the effects of errors and outliers [...].” (Chatfield, 2000)


Introduction

• General formula for a 100(1 − α)% P.I. for $X_{N+h}$:

$$\hat{x}_N(h) \pm z_{\alpha/2} \sqrt{\mathrm{Var}[e_N(h)]}$$

• Symmetric about $\hat{x}_N(h)$: assumes the forecast is unbiased, so that $E[e_N(h)^2] = \mathrm{Var}[e_N(h)]$

• Assumes the errors are normally distributed [for short series, $z_{\alpha/2}$ is sometimes replaced by the corresponding percentage point of a t-distribution]

• The latter assumption is usually violated, even for a linear model with Gaussian innovations, when the model parameters are estimated from the same data used to compute the forecasts

• For any method, the main problem is to evaluate $\mathrm{Var}[e_N(h)]$
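
As a minimal sketch of the general formula, assuming normal errors and a known forecast error variance, the interval can be computed with the standard library alone. The function name `prediction_interval` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def prediction_interval(point_forecast, error_variance, alpha=0.05):
    """100(1 - alpha)% P.I.: point forecast +/- z_{alpha/2} * sqrt(Var[e_N(h)])."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    half_width = z * sqrt(error_variance)
    return point_forecast - half_width, point_forecast + half_width

# Hypothetical point forecast 10 with forecast error variance 4.
lower, upper = prediction_interval(10.0, 4.0, alpha=0.05)
print(round(lower, 2), round(upper, 2))
```

Everything method-specific is hidden in `error_variance`, which is why evaluating $\mathrm{Var}[e_N(h)]$ is the main problem.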


P.I.s from a Probability Model

P.I.s derived from a probability model

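• Ignore parameter uncertainty

• Example (ARIMA forecasting):
  • $X_t = \varepsilon_t + \psi_1 \varepsilon_{t-1} + \psi_2 \varepsilon_{t-2} + \ldots$
  • then: $e_N(h) = [X_{N+h} - \hat{X}_{N+h}] = \varepsilon_{N+h} + \sum_{j=1}^{h-1} \psi_j \varepsilon_{N+h-j}$
  • thus: $\mathrm{Var}[e_N(h)] = [1 + \psi_1^2 + \cdots + \psi_{h-1}^2]\, \sigma_\varepsilon^2$

• In practice replace $\psi_i$ and $\sigma_\varepsilon^2$ with the estimates $\hat{\psi}_i$ and $\hat{\sigma}_\varepsilon^2$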
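
The ψ weights can be obtained recursively from the autoregressive and moving-average coefficients. The sketch below assumes the sign convention $X_t = \sum_i a_i X_{t-i} + \varepsilon_t + \sum_j b_j \varepsilon_{t-j}$, where for an ARIMA model the $a_i$ are the coefficients of the generalised autoregressive operator (differencing included). The function names are hypothetical, and parameter uncertainty is ignored as on the slide.

```python
def psi_weights(ar, ma, n):
    """First n psi weights (psi_0 .. psi_{n-1}) for
    X_t = sum_i ar[i-1] * X_{t-i} + eps_t + sum_j ma[j-1] * eps_{t-j}."""
    psi = [1.0]  # psi_0 = 1
    for j in range(1, n):
        value = ma[j - 1] if j <= len(ma) else 0.0
        for i in range(1, min(j, len(ar)) + 1):
            value += ar[i - 1] * psi[j - i]
        psi.append(value)
    return psi

def pmse(ar, ma, sigma2, h):
    """Var[e_N(h)] = (1 + psi_1^2 + ... + psi_{h-1}^2) * sigma2."""
    return sigma2 * sum(w * w for w in psi_weights(ar, ma, h))

# AR(1) with coefficient 0.5: psi_j = 0.5 ** j.
print(psi_weights([0.5], [], 4))
# ARIMA(0,1,1) written as X_t = X_{t-1} + eps_t + b * eps_{t-1} with b = -0.7.
print(round(pmse([1.0], [-0.7], 1.0, 4), 4))
```

In the ARIMA(0,1,1) case every weight after $\psi_0$ equals $1 + b$, so the variance grows linearly in h.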


P.I.s derived from a probability model cont’d

• Similar formulas are available for: vector ARMA models, structural state-space models and various regressions (the latter typically allow for parameter uncertainty and are conditional in the sense that they depend on the particular values of the explanatory variables from which a prediction is being made)

• Typically not available for: complicated simultaneous equation models, non-linear models, ARCH and other stochastic volatility models

• P.I.s are usually overoptimistic if the true model is not known and must be chosen from a class of models


Informal models

P.I.s without model identification

What to do when a forecasting method is selected without any formal model identification procedure?

• Assume that the method is optimal (in some sense)

• Apply some empirical procedure


P.I.s assuming a method is optimal

Example: Exponential smoothing

• Optimal for ARIMA(0,1,1)

• PMSE formula: $\mathrm{Var}[e_N(h)] = [1 + (h - 1)\alpha^2]\, \sigma_e^2$, where $\sigma_e^2 = \mathrm{Var}[e_N(1)]$ and $\alpha$ is the smoothing parameter

• Should one use this formula even if the model has not been formally identified?

• Reasonable if:
  • Observed one-step-ahead forecast errors show no obvious autocorrelation
  • No other obvious features of the data (e.g. trend) need to be modelled
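
A minimal sketch of how the exponential smoothing formula could be applied, assuming normal errors. The start-up choice (level set to the first observation), the mean-square estimate of $\mathrm{Var}[e_N(1)]$ and the function names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def ses_forecast_and_errors(x, alpha):
    """Simple exponential smoothing: return the final level (the forecast for
    every horizon) and the within-sample one-step-ahead errors."""
    level = x[0]  # assumed start-up value: the first observation
    errors = []
    for obs in x[1:]:
        e = obs - level          # one-step-ahead error
        errors.append(e)
        level += alpha * e       # update the level
    return level, errors

def ses_interval(x, alpha, h, coverage=0.95):
    """P.I. using Var[e_N(h)] = [1 + (h - 1) * alpha^2] * sigma_e^2."""
    level, errors = ses_forecast_and_errors(x, alpha)
    sigma2_e = sum(e * e for e in errors) / len(errors)  # estimate of Var[e_N(1)]
    var_h = (1 + (h - 1) * alpha ** 2) * sigma2_e
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return level - z * sqrt(var_h), level + z * sqrt(var_h)

# Hypothetical series.
series = [10.0, 12.0, 11.0, 11.5, 12.5, 12.0, 13.0]
for h in (1, 2, 4):
    print(h, tuple(round(v, 2) for v in ses_interval(series, 0.5, h)))
```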


Forecasting not based on a probability model

• Assume that the method is optimal in the sense that the one-step-ahead errors are uncorrelated

• Easy to check by looking at the correlogram of the one-step-ahead errors:
  • if there is correlation, there is more structure in the data, which could be used to improve the forecast

• If we also assume that the one-step-ahead errors have equal variance, it should be possible to evaluate $\mathrm{Var}[e_N(h)]$ in terms of $\mathrm{Var}[e_N(1)]$

• Example: applied to the Holt–Winters method with additive and multiplicative seasonality; in the additive case the results were found to be equivalent to those from the SARIMA model for which Holt–Winters is optimal


“Approximate” formulae

• Used if no theoretical formula is available

• Sometimes, due to their simplicity, used even if a theoretical formula exists

• Usually very inaccurate

• Examples:
  1. $\mathrm{Var}[e_N(h)] = h\, \sigma_e^2$, where $\sigma_e^2 = \mathrm{Var}[e_N(1)]$; only true in a random walk model
  2. $\mathrm{Var}[e_N(h)] = (0.659 + 0.341h)^2\, \sigma_e^2$
  3. Some formulas for the Holt–Winters method [Bowerman and O'Connell 1987]
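
To see how far the first two approximations can drift from a theoretical result, the sketch below compares the implied ratio of the h-step to the one-step error standard deviation with the ratio $\sqrt{1 + (h-1)\alpha^2}$ from the ARIMA(0,1,1) formula for exponential smoothing. The smoothing parameter value and the function names are illustrative assumptions.

```python
from math import sqrt

def sd_ratio_random_walk(h):
    """Approximation 1: Var[e_N(h)] = h * sigma_e^2."""
    return sqrt(h)

def sd_ratio_linear(h):
    """Approximation 2: Var[e_N(h)] = (0.659 + 0.341 h)^2 * sigma_e^2."""
    return 0.659 + 0.341 * h

def sd_ratio_ses(h, alpha):
    """ARIMA(0,1,1) result: Var[e_N(h)] = [1 + (h - 1) alpha^2] * sigma_e^2."""
    return sqrt(1 + (h - 1) * alpha ** 2)

alpha = 0.3  # hypothetical smoothing parameter
for h in (1, 2, 5, 10):
    print(h, round(sd_ratio_random_walk(h), 3),
          round(sd_ratio_linear(h), 3),
          round(sd_ratio_ses(h, alpha), 3))
```

With a small smoothing parameter both approximations give much wider intervals at long horizons than the ARIMA(0,1,1) formula; the three agree only at h = 1.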


Empirically based P.I.s

Methods based on the observed distribution

Method 1:
  • Apply the forecasting method to all the past data
  • Find the within-sample forecast errors at 1, 2, 3, ... steps ahead from all available time origins
  • Find the variance of these errors
  • Assuming normality, an approximate 100(1 − α)% P.I. for $X_{N+h}$ is $\hat{x}_N(h) \pm z_{\alpha/2}\, s_{e,h}$, where $s_{e,h}$ is the standard deviation of the h-steps-ahead errors
  • Values of $s_{e,h}$ can be unreliable for small $N$ and large $h$
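
The steps of this first method can be sketched as follows, using simple exponential smoothing as a stand-in forecasting method. The data, the smoothing parameter and the function names are hypothetical, and the sample standard deviation is one of several possible estimates of $s_{e,h}$.

```python
from statistics import NormalDist, stdev

def ses_levels(x, alpha):
    """levels[t] is the forecast of all future values made at time origin t."""
    levels = [x[0]]
    for obs in x[1:]:
        levels.append(levels[-1] + alpha * (obs - levels[-1]))
    return levels

def empirical_error_sd(x, alpha, h):
    """Standard deviation of within-sample h-step-ahead errors over all origins."""
    levels = ses_levels(x, alpha)
    errors = [x[t + h] - levels[t] for t in range(len(x) - h)]
    return stdev(errors)

def empirical_interval(x, alpha, h, coverage=0.95):
    """Point forecast +/- z * s_{e,h}, assuming normal errors."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    s = empirical_error_sd(x, alpha, h)
    forecast = ses_levels(x, alpha)[-1]
    return forecast - z * s, forecast + z * s

# Hypothetical series; few origins remain for large h, so s_{e,h} becomes unreliable.
x = [12.0, 13.5, 12.8, 14.1, 13.9, 15.2, 14.8, 15.9, 16.3, 15.7, 16.8, 17.4]
for h in (1, 2, 3):
    print(h, tuple(round(v, 2) for v in empirical_interval(x, 0.4, h)))
```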


Methods based on the observed distribution cont’d

Method 2:
  • Split the past data into two parts and fit the method to the first part
  • Make predictions of the second part
  • The resulting “errors” are much more like true forecast errors than those of the first method
  • Refit the model with one additional observation in the first part and one fewer in the second, and so on
  • Interestingly, it has been found that the errors follow a gamma rather than a normal distribution
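
The split-and-refit procedure can be sketched with a zero-mean AR(1) model fitted by least squares as a stand-in forecasting method. The model choice, the function names and the data are assumptions made for illustration only.

```python
def fit_ar1(x):
    """Least-squares estimate of phi in X_t = phi * X_{t-1} + eps_t (zero mean)."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x[:-1])
    return num / den

def rolling_origin_errors(x, first_size, h):
    """Fit on the first part, forecast h steps ahead, then move one
    observation from the second part into the first and refit."""
    errors = []
    for n in range(first_size, len(x) - h + 1):
        train = x[:n]                          # first part: x_1 .. x_n
        phi_hat = fit_ar1(train)               # parameters use the first part only
        forecast = phi_hat ** h * train[-1]    # h-step-ahead AR(1) forecast
        errors.append(x[n + h - 1] - forecast) # compare with the held-back value
    return errors

# Hypothetical zero-mean series.
x = [0.5, 0.9, 0.3, -0.2, 0.4, 0.8, 0.1, -0.5, -0.1, 0.6, 0.7, 0.2]
print([round(e, 3) for e in rolling_origin_errors(x, first_size=6, h=2)])
```

The empirical distribution of these errors (for example its quantiles) can then be used directly, without assuming normality.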


Simulation and resampling methods

Simulation – Monte Carlo approach

• Given the probability time-series model, simulate past and future behavior by generating a series of random variables

• Repeat many times to obtain a large set of outcomes, called pseudo-data

• Evaluate the P.I. by finding the interval within which the required percentage of future values lie

• Assumes the model has been correctly identified
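
A minimal Monte Carlo sketch for a known AR(1) model with Gaussian innovations. For brevity it simulates only future paths, conditional on the last observation; the model, the parameter values and the function name are assumptions for illustration.

```python
import random

def simulate_ar1_interval(x_last, phi, sigma, h, coverage=0.9,
                          n_sims=20000, seed=0):
    """Simulate many future paths of X_t = phi * X_{t-1} + eps_t and return
    the central interval containing the required share of values at horizon h."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_sims):
        x = x_last
        for _ in range(h):
            x = phi * x + rng.gauss(0.0, sigma)  # one simulated step ahead
        finals.append(x)
    finals.sort()
    tail = (1 - coverage) / 2
    lower = finals[int(tail * n_sims)]
    upper = finals[int((1 - tail) * n_sims) - 1]
    return lower, upper

print(simulate_ar1_interval(x_last=2.0, phi=0.7, sigma=1.0, h=3))
```

For this linear Gaussian case the result should be close to the interval from the theoretical PMSE; the simulation approach becomes useful when no such formula is available.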


Resampling – Bootstrapping

• Sample from the empirical distribution of past fitted “errors”

• The procedure approximates the theoretical distribution of the innovations by the empirical distribution of the observed residuals, making it a distribution-free approach

• Since successive observations in a time series are correlated over time, one bootstraps the fitted errors [which are hopefully approximately independent] rather than the observations themselves

• However: the procedure is highly dependent on the choice of fitted model
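
A residual bootstrap can be sketched as follows, again with a zero-mean AR(1) model fitted by least squares as a stand-in for the fitted model. The model, the data and the function names are illustrative assumptions; parameter uncertainty is ignored in this simple version.

```python
import random

def fit_ar1(x):
    """Least-squares estimate of phi in X_t = phi * X_{t-1} + eps_t (zero mean)."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x[:-1])
    return num / den

def bootstrap_interval(x, h, coverage=0.9, n_boot=5000, seed=0):
    """Build future paths by resampling the fitted residuals with replacement."""
    rng = random.Random(seed)
    phi_hat = fit_ar1(x)
    residuals = [b - phi_hat * a for a, b in zip(x[:-1], x[1:])]
    finals = []
    for _ in range(n_boot):
        value = x[-1]
        for _ in range(h):
            # An innovation is drawn from the empirical residual distribution.
            value = phi_hat * value + rng.choice(residuals)
        finals.append(value)
    finals.sort()
    tail = (1 - coverage) / 2
    return finals[int(tail * n_boot)], finals[int((1 - tail) * n_boot) - 1]

# Hypothetical zero-mean series.
x = [0.5, 0.9, 0.3, -0.2, 0.4, 0.8, 0.1, -0.5, -0.1, 0.6, 0.7, 0.2]
print(bootstrap_interval(x, h=2))
```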


The Bayesian approach


• Given a suitable model, allows computation of a complete probability distribution for a future value

• Once the probability distribution is known, compute P.I.s by a decision-theoretic approach or a Bayesian version of the general P.I. formula

• Alternatively: simulate the predictive distribution

• Bayesian model averaging
  • Natural if the analyst relies not on a single model but on a mixture
  • Use Bayesian methods to find a sensible set of models and to average over these in an appropriate way
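
One way to picture the averaging step: if each candidate model delivers a normal predictive distribution and the posterior model probabilities are taken as given, the averaged predictive distribution is a mixture, and a P.I. can be read off its quantiles. The sketch below makes exactly these assumptions; the weights, means, standard deviations and function names are hypothetical, and finding the weights is the hard part that is not shown.

```python
from statistics import NormalDist

def mixture_cdf(x, models):
    """models: list of (weight, mean, sd); weights are assumed to sum to one."""
    return sum(w * NormalDist(m, s).cdf(x) for w, m, s in models)

def mixture_quantile(p, models, tol=1e-9):
    """Quantile of the mixture found by bisection on its CDF."""
    lo = min(m - 10 * s for _, m, s in models)
    hi = max(m + 10 * s for _, m, s in models)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mixture_cdf(mid, models) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def averaged_interval(models, coverage=0.9):
    """Central interval of the model-averaged predictive distribution."""
    tail = (1 - coverage) / 2
    return mixture_quantile(tail, models), mixture_quantile(1 - tail, models)

# Two hypothetical models with posterior probabilities 0.6 and 0.4.
models = [(0.6, 2.0, 0.5), (0.4, 2.6, 0.8)]
print(tuple(round(v, 3) for v in averaged_interval(models)))
```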


P.I.s for transformed variables


• Non-linear transformation of the variable $X_t$: $Y_t = g(X_t)$ [e.g. logarithmic or Box–Cox transformation]

• P.I.s for $Y_{N+h}$ can be calculated in an appropriate way

• But: how to get back P.I.s for the original variable?

• Usually the point forecast for $X_{N+h}$, namely $g^{-1}[\hat{y}_N(h)]$, is biased since $E[g^{-1}(Y)] \neq g^{-1}[E(Y)]$

• E.g.: if the predictive distribution of $Y_{N+h}$ is symmetric with mean $\hat{y}_N(h)$, then $g^{-1}[\hat{y}_N(h)]$ is the median of $X_{N+h}$

• Fortunately: if the P.I. for $Y_{N+h}$ has probability $(1 - \alpha)$, then the retransformed P.I. for $X_{N+h}$ will have the same probability

• Often the P.I. for $X_{N+h}$ will be asymmetric
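
For the logarithmic case, $g = \log$ and $g^{-1} = \exp$, the points above can be illustrated in a few lines. The sketch assumes a normal predictive distribution on the log scale; the function name and the numbers are hypothetical.

```python
from math import exp
from statistics import NormalDist

def back_transformed_interval(y_forecast, y_error_sd, coverage=0.95):
    """Retransform a symmetric P.I. for Y = log(X) to the original scale."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    lower = exp(y_forecast - z * y_error_sd)      # same probability as on the log scale
    upper = exp(y_forecast + z * y_error_sd)
    median = exp(y_forecast)                      # naive back-transformed point forecast
    mean = exp(y_forecast + y_error_sd ** 2 / 2)  # lognormal mean, larger than the median
    return lower, median, mean, upper

lower, median, mean, upper = back_transformed_interval(1.0, 0.5)
print(round(lower, 3), round(median, 3), round(mean, 3), round(upper, 3))
print("upper half-width:", round(upper - median, 3),
      "lower half-width:", round(median - lower, 3))
```

The interval keeps its coverage but is no longer symmetric about the point forecast, and the naive point forecast estimates the median rather than the mean.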


Why are P.I.s too narrow?

• Empirical evidence shows that out-of-sample forecast errors tend to be larger than model-fitted residuals, implying that more than 5% of future observations will fall outside a 95% P.I. on average

• The various reasons why P.I.s are too narrow in general include
  1. parameter uncertainty
  2. non-normally distributed innovations
  3. identification of the wrong model
  4. changes in the structure underlying the model


Why are P.I.s too narrow? (cont.)

• ad 1) This problem can be mitigated by using theoretical PMSE formulae which account for parameter uncertainty by incorporating correction terms; however, corrections of this form seem to be negligible in comparison with those accounting for other sources of uncertainty

• ad 2) A common problem, often associated with asymmetry or heavy tails; the latter in particular, which are due to outliers and errors in the data, can have severe effects on model identification and on the resulting point forecasts and P.I.s


Why are P.I.s too narrow? (cont.)

• ad 3) This problem is related to model uncertainty and can be mitigated by applying appropriate diagnostic checks (e.g. checking whether the one-step-ahead fitted residuals are uncorrelated and have constant variance; if this is not the case, then there is more structure which should be exploited)

• ad 4) The underlying structure changes if it slowly evolves over time or if there are sudden shifts; in both cases the model parameters will change (Chatfield argues that an observation that falls outside a P.I. could indicate a change in the underlying model)


Summary

• The basic message from this lecture is that P.I.s are a reasonable extension to point forecasts

• Wherever applicable, P.I.s should be calculated by formulating a model that approximates the data-generating process (DGP) well and from which the PMSE can be used to obtain the interval

• To allow for parameter uncertainty, correction terms can be used when calculating the PMSE; since the correction is rather small, these terms are often omitted

• It is vital to note that the approach just described rests on the assumptions that the correct model has been fitted, that the errors are normally distributed and that the future is like the past


Summary (cont.)

• The various “approximate” formulae for calculating P.I.s can be very inaccurate and should therefore not be used

• If theoretical formulae are not available or there is doubt about the assumptions stated above, empirically based resampling methods are a good alternative

• One should always bear in mind that P.I.s tend to be too narrow in practice, in particular because the wrong model has been identified or the model has changed

• Alternatively, one could also consider different approaches to calculating P.I.s (e.g. Bayesian model averaging)
