Forecasting and VAR models

  • 8/8/2019 Forecasting and VAR models

    1/12

    Forecasting and VAR Models

    Franz Eigner

    UK Econometric Forecasting

    Prof. Kunst, SS09

    June 21st, 2009

    Contents

    1 Introduction in multivariate forecasting

    2 Vector autoregressive models (VAR)
      2.1 Definition of a stationary VAR(p)
      2.2 Definition of a stationary bivariate VAR(1)
      2.3 Specification, estimation and extensions
      2.4 Structural analyses of the VAR

    3 Forecasting with VAR models
      3.1 Naive forecast (MMSE)
      3.2 Simulation-based forecast
      3.3 Conditional forecast

    4 VAR models and Cointegration
      4.1 Integration and Cointegration
      4.2 Vector error correction model (VECM)

    5 Applications for VAR/VEC

    6 Final discussion about forecast quality

    1 Introduction in multivariate forecasting

    Multivariate data arise when observations are taken on two or more time series over the same time periods, for example measures of economic activity such as GDP or the inflation index. Multivariate modelling examines the structure, that is, the interrelationship among the series, in order to obtain more accurate forecasts of the series of interest. Economic series in particular often interact with one another, e.g. wages and prices depend on each other. One may therefore believe that neglecting the relationships between economic variables, as univariate forecasting does, is not adequate for forecasting economic series. As a consequence, multivariate forecasting, which aims at understanding the underlying structure of a given system, seems very appealing.

    The main focus of this paper lies in the description of multivariate forecasting procedures. To keep track of the large number of forecasting procedures, Figure 1 gives a brief overview in advance.

    [Figure 1: Overview of forecasting procedures. Univariate forecasting: model-free (smoothing, filtering) or model-based (ARIMA, GARCH). Multivariate forecasting, split by the question "feedback?": no = open-loop system (single equation): multiple regression, transfer function; yes = closed-loop system (multiple equations): stationary: VAR, SVAR; nonstationary: VEC, SVEC.]

    Whereas in univariate forecasting the distinction between model-free and model-based procedures is important, the crucial question in multivariate modelling is the presence of feedback.

    In a single equation system, the variation in a dependent or response variable is explained by the variation in one or several predictor (independent) variables, and no feedback is assumed, that is, there are no effects from the dependent variable (output) back to the predictor variables (input). A single equation system is therefore called an open-loop system. Popular open-loop systems are multiple regression models and transfer function models. In the presence of feedback, data are generated by a closed-loop system and a single equation system is no longer adequate. The most popular closed-loop systems nowadays are VAR models, which are the multivariate version of the univariate AR model. VAR models were popularized by Sims (1980), who advocated them as an alternative to simultaneous equations models, which do not focus on the dynamic structure of the variables. One advantage of VAR models may be that they treat all variables as endogenous, whereas in econometric modelling one generally needs to classify variables as exogenous, predetermined or endogenous. However, this classification is not always known, and theoretical considerations used to find the correct one may be wrong. There is a broad variety of VAR models, integrating moving average terms (VARMA), structural features (SVAR) or Bayesian methods (BVAR). Cointegration can be implemented in the form of the vector error correction model (VEC).

    Before modelling multivariate time-series data, the data set should be examined carefully. One important tool is the cross-correlation function (CCF), which is a generalization of the ACF to the multivariate case. It can be used to identify the model, that is, to find the optimal lag, and to identify the leading series by looking at the maximum cross-correlation. However, especially when there is time dependence within the component series and feedback between the series, conclusive statements from the empirical CCF are difficult to make.
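
    The use of the empirical CCF to find the leading series can be illustrated with a minimal sketch. The function name `cross_correlation`, the simulated data and the two-period delay are assumptions made only for this illustration; the input series is white noise, so the caveat about time dependence within the series does not apply here.

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Empirical cross-correlation corr(x_t, y_{t+k}) for k = -max_lag..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = n * x.std() * y.std()
    ccf = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            # pair x_t with the later value y_{t+k}
            ccf[k] = float(np.sum(xc[: n - k] * yc[k:]) / denom)
        else:
            # k < 0: pair x_t with the earlier value y_{t+k}
            ccf[k] = float(np.sum(xc[-k:] * yc[: n + k]) / denom)
    return ccf

# Hypothetical data: y follows x with a delay of two periods.
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = np.empty(500)
y[:2] = rng.standard_normal(2)
y[2:] = 0.8 * x[:-2] + 0.3 * rng.standard_normal(498)

ccf = cross_correlation(x, y, max_lag=5)
leading_lag = max(ccf, key=lambda k: abs(ccf[k]))  # lag with the largest |cross-correlation|
```

    A positive `leading_lag` indicates that x leads y by that many periods.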

    2 Vector autoregressive models (VAR)

    This chapter focuses on the analysis of covariance stationary multivariate time series using VAR models.

    2.1 Definition of a stationary VAR(p)

    A VAR consists of a set of K endogenous variables $y_t = (y_{1t}, \ldots, y_{kt}, \ldots, y_{Kt})'$ for $k = 1, \ldots, K$. The VAR(p) process is then defined as:

    $$y_t = A_1 y_{t-1} + \ldots + A_p y_{t-p} + u_t$$

    with $A_i$ as $(K \times K)$ coefficient matrices for $i = 1, \ldots, p$, and $u_t$ a K-dimensional white noise process, that is: $E(u_t) = 0$, $E(u_t u_t') = \Sigma_u$, $E(u_t u_s') = 0$ for $t \neq s$.

    The VAR(p) process is stable when it generates stationary time series, implying that the system returns to an equilibrium after a shock. This can be checked with the characteristic matrix polynomial:

    $$\det(I_K - A_1 z - \ldots - A_p z^p) \neq 0 \quad \text{for } |z| \leq 1 \qquad (1)$$

    The process is stable if all roots of the characteristic polynomial are larger than one in absolute value. If the polynomial has a root at z = 1, then some or all variables in the VAR(p) process are integrated of order one. The stability of a VAR(p) process can also be examined by considering the companion form and calculating the eigenvalues of its coefficient matrix. A VAR with p lags can always be rewritten equivalently as a VAR with only one lag (the so-called companion form) by appropriately redefining the dependent variable. The transformation amounts to stacking the lags of the VAR(p) variable in the new VAR(1) dependent variable.

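    $$Y_t = \mathbf{A} Y_{t-1} + v_t$$

    where

    $$Y_t = \begin{pmatrix} y_t \\ \vdots \\ y_{t-p+1} \end{pmatrix}, \quad \mathbf{A} = \begin{pmatrix} A_1 & A_2 & \cdots & A_{p-1} & A_p \\ I & 0 & \cdots & 0 & 0 \\ 0 & I & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & I & 0 \end{pmatrix}, \quad v_t = \begin{pmatrix} u_t \\ 0 \\ \vdots \\ 0 \end{pmatrix}$$

    It holds: if the moduli of the eigenvalues of $\mathbf{A}$ are less than one, the VAR process is stable.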
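
    The eigenvalue check on the companion form can be sketched as follows. The helper names and the numerical coefficient matrices (written `A1`, `A2`) are assumptions chosen only for illustration.

```python
import numpy as np

def companion_matrix(coefs):
    """Stack the (K x K) lag matrices [A_1, ..., A_p] into the (Kp x Kp) companion matrix."""
    p = len(coefs)
    K = coefs[0].shape[0]
    top = np.hstack(coefs)                                  # first block row: A_1 ... A_p
    if p == 1:
        return top
    # identity blocks that shift each lag down by one position
    bottom = np.hstack([np.eye(K * (p - 1)), np.zeros((K * (p - 1), K))])
    return np.vstack([top, bottom])

def is_stable(coefs):
    """True if all eigenvalues of the companion matrix have modulus below one."""
    eigenvalues = np.linalg.eigvals(companion_matrix(coefs))
    return bool(np.all(np.abs(eigenvalues) < 1.0))

# Hypothetical bivariate VAR(2) coefficients.
A1 = np.array([[0.5, 0.1],
               [0.4, 0.5]])
A2 = np.array([[0.0, 0.0],
               [0.25, 0.0]])

stable_var2 = is_stable([A1, A2])        # stationary example
random_walks = is_stable([np.eye(2)])    # two random walks: eigenvalues equal to one
```

    For the random-walk example the eigenvalues lie on the unit circle, so the check reports the process as not stable.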

    2.2 Definition of a stationary bivariate VAR(1)

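    Further descriptions and analyses are presented using the simple bivariate VAR(1) process.

    $$y_{1t} = a_{11} y_{1,t-1} + a_{12} y_{2,t-1} + u_{1t}$$

    $$y_{2t} = a_{21} y_{1,t-1} + a_{22} y_{2,t-1} + u_{2t}$$

    This can be rewritten as:

    $$y_t = A_1 y_{t-1} + u_t$$

    where $u_t' = (u_{1t}, u_{2t})$ and $A_1 = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$.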

    $u_t$ is bivariate white noise, which means that the innovations have zero mean and are uncorrelated through time, both within and between series. However, $u_{1t}$ and $u_{2t}$ may be correlated at the same point in time. Notice that all equations have the same regressors. Therefore the VAR(1) model, like the VAR(p) model, is just a seemingly unrelated regression (SUR) model with lagged variables as common regressors.

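    [Figure 2: Dependence within and between $y_t$. Path diagram with arrows from $y_{1,t-1}$ and $y_{2,t-1}$ to $y_{1,t}$ and $y_{2,t}$, labelled with the four coefficients of the bivariate VAR(1) (indices 11, 12, 21, 22).]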

    VAR models make it possible to analyse the relations between the variables involved. These relations, or dependences between and within the time series, are expressed in the coefficient matrix and are shown in Figure 2. A variable $y_{1t}$ is called causal for a variable $y_{2t}$ if the information in $y_{1t}$ helps to improve the forecasts of $y_{2t}$. In the bivariate VAR(1), $y_{1t}$ is not causal for $y_{2t}$ when the coefficient $a_{21}$ equals zero. Granger causality then runs in one direction only, from $y_{2t}$ to $y_{1t}$, which leads to an open-loop system with $y_{2t}$ following an AR(1) process. Because Granger non-causality is characterized by such zero restrictions on the levels VAR representation, standard F-tests can be applied for causality analysis. Unidirectional causality exists if the coefficient matrix $A_1$ can be reordered as a lower (upper) triangular matrix. If this is not the case, there is mutual dependency between the variables and the VAR model should be more adequate than transfer function models.
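
    The F-test for a zero restriction can be sketched for the bivariate VAR(1) as below. This is a minimal illustration under stated assumptions: one lag, no intercept, simulated data with the coefficient of $y_{1,t-1}$ in the second equation set to zero, and hypothetical helper names.

```python
import numpy as np

def residual_sum_of_squares(X, y):
    """RSS of the least-squares regression of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ beta) ** 2))

def granger_f_stat(cause, effect):
    """F statistic for H0: the first lag of `cause` does not help to predict `effect`."""
    target = effect[1:]
    X_restricted = effect[:-1].reshape(-1, 1)                     # own lag only
    X_unrestricted = np.column_stack([effect[:-1], cause[:-1]])   # own lag + lag of cause
    rss_r = residual_sum_of_squares(X_restricted, target)
    rss_u = residual_sum_of_squares(X_unrestricted, target)
    dof = len(target) - X_unrestricted.shape[1]
    return (rss_r - rss_u) / (rss_u / dof)   # one restriction: numerator df = 1

# Simulated VAR(1) in which y1 does not cause y2 (lower-left coefficient is zero),
# while y2 causes y1 (upper-right coefficient is 0.4).
rng = np.random.default_rng(2)
A = np.array([[0.5, 0.4],
              [0.0, 0.5]])
y = np.zeros((1000, 2))
for t in range(1, 1000):
    y[t] = A @ y[t - 1] + rng.standard_normal(2)

f_2_to_1 = granger_f_stat(cause=y[:, 1], effect=y[:, 0])
f_1_to_2 = granger_f_stat(cause=y[:, 0], effect=y[:, 1])
```

    With one restriction and a large sample, the statistic can be compared with roughly 3.84, the 5% critical value of the chi-square distribution with one degree of freedom. A large `f_2_to_1` rejects non-causality from $y_{2t}$ to $y_{1t}$.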

    2.3 Specification, estimation and extensions

    The unrestricted VAR(p) model may be inadequate to represent the main characteristics of the data sufficiently (Chatfield, 2001). For example, this simple VAR model assumes stationarity of all time series, which is rarely the case for original economic series. One can account for trends and seasonality by including deterministic elements in the equation, for instance dummy variables. Additionally, stochastic exogenous variables could be included as well. The data may also be transformed by taking differences (e.g. first differences or seasonal differences) in order to make them stationary. However, differencing nonstationary series is not without its weaknesses, which will be explained later.

    Multivariate least squares (MLS) can be used to estimate the coefficients. As the explanatory variables are the same in each equation, MLS is equivalent to the ordinary least squares estimator applied to each equation separately. The question of lag order selection can be addressed with an F-test, testing whether the additional explained sum of squares is significant. Nowadays, however, one mostly chooses the model with the optimal value of an information criterion. The maximum number of lags should be restricted in advance: in a VAR(4) model with 3 variables, 36 coefficients already have to be estimated. A large number of parameters reduces the precision with which the individual parameters are estimated. Furthermore, estimation will suffer from potential overfitting, delivering poor out-of-sample forecasts. Thus one often estimates restricted VAR models (e.g. SVAR), setting some coefficients to zero on the grounds of theoretical considerations. A special branch of restricted VAR models are Bayesian VAR models. They prevent overfitting by shrinking the coefficients on lags higher than first order towards zero, e.g. by using Minnesota priors. Other extensions are the VARX and VARMAX models, which allow for exogenous variables whose dynamics do not depend on the modelled endogenous variables. For forecasting, this may be inconvenient, because these exogenous variables require an extrapolation technique or assumptions on their future behaviour.
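
    Equation-by-equation least squares and lag selection by an information criterion can be sketched as follows. The function names, the simulated data and the choice of the Schwarz criterion are assumptions made for this illustration; no intercept is included, in line with the zero-mean VAR defined above.

```python
import numpy as np

def simulate_var1(A, n, rng, burn_in=100):
    """Simulate n observations from y_t = A y_{t-1} + u_t with standard normal u_t."""
    K = A.shape[0]
    y = np.zeros((n + burn_in, K))
    for t in range(1, n + burn_in):
        y[t] = A @ y[t - 1] + rng.standard_normal(K)
    return y[burn_in:]

def fit_var(y, p, start):
    """Least squares for a VAR(p) using observations t = start..N-1 (start >= p).
    Returns the lag matrices [A_1, ..., A_p] and the residual covariance matrix."""
    N, K = y.shape
    # the row for time t holds (y_{t-1}', ..., y_{t-p}'): the same regressors in every equation
    X = np.hstack([y[start - i : N - i] for i in range(1, p + 1)])
    Y = y[start:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)   # one column of B per equation
    resid = Y - X @ B
    sigma_u = resid.T @ resid / len(Y)
    coefs = [B[i * K : (i + 1) * K].T for i in range(p)]
    return coefs, sigma_u

def schwarz_criterion(y, p, max_lag):
    """log det(Sigma_u) + log(T) * (number of coefficients) / T on a common sample."""
    N, K = y.shape
    T = N - max_lag
    _, sigma_u = fit_var(y, p, start=max_lag)
    return float(np.log(np.linalg.det(sigma_u)) + np.log(T) * p * K * K / T)

rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.1],
                   [0.4, 0.5]])
y = simulate_var1(A_true, n=2000, rng=rng)

max_lag = 4
criteria = {p: schwarz_criterion(y, p, max_lag) for p in range(1, max_lag + 1)}
best_p = min(criteria, key=criteria.get)          # lag order with the smallest criterion
coefs_hat, sigma_hat = fit_var(y, best_p, start=max_lag)
```

    The penalty term counts $K^2 p$ coefficients, which gives the 36 coefficients mentioned above for K = 3 and p = 4. All candidate orders are fitted on the same sample so that the criteria are comparable.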

    2.4 Structural analyses of the VAR

    The general VAR(p) model has many parameters, and they may be difficult to interpret due to complex interactions and feedback between the variables in the model. As a result, the dynamic properties of a VAR(p) are often summarized using various types of structural analysis. [...] In structural analysis, certain assumptions about the causal structure of the data under investigation are imposed, and the resulting causal impacts of unexpected shocks or innovations to specified variables on the variables in the model are summarized. (Zivot/Wang, 2002). Granger causality, which was explained before, is such a summarization tool. Other summarization tools are the impulse response functions and the forecast error variance decomposition.

    Granger causality does not quantify the impact of the impulse variable on the response variable over time. Impulse response analysis can be used for this purpose. For meaningful results it is important to isolate the actual shocks of interest, which requires imposing some structure on the VAR.

    The resulting orthogonal shocks can then be used for the forecast error variance decomposition (FEVD), which answers the question: what portion of the variance of the forecast error in predicting $y_{i,t+1}$ is due to each structural shock?
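
    One common way to impose such structure is a recursive (Cholesky) ordering of the shocks. The sketch below uses it for a VAR(1); the ordering, the function names and the numerical values are assumptions for illustration, not the only possible identification scheme.

```python
import numpy as np

def orthogonal_irf(A, sigma_u, horizon):
    """Orthogonalized impulse responses of a VAR(1): Theta_h = A^h P, sigma_u = P P'.
    Entry [h, i, j] is the response of variable i, h periods after a unit shock j."""
    P = np.linalg.cholesky(sigma_u)        # lower triangular factor
    responses = []
    power = np.eye(A.shape[0])             # A^0
    for _ in range(horizon):
        responses.append(power @ P)
        power = A @ power
    return np.array(responses)

def fevd(irf):
    """Share of the forecast error variance of variable i (row) due to shock j (column),
    for a forecast horizon equal to the number of periods contained in `irf`."""
    contributions = np.sum(irf ** 2, axis=0)
    return contributions / contributions.sum(axis=1, keepdims=True)

# Hypothetical VAR(1) coefficients and innovation covariance matrix.
A = np.array([[0.5, 0.1],
              [0.4, 0.5]])
sigma_u = np.array([[1.0, 0.3],
                    [0.3, 1.0]])

irf = orthogonal_irf(A, sigma_u, horizon=10)
shares = fevd(irf)      # each row sums to one
```

    With this ordering, the second shock has no contemporaneous effect on the first variable, so the one-step forecast error variance of the first variable is attributed entirely to the first shock.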

    3 Forecasting with VAR models

    3.1 Naive forecast (MMSE)

    Forecasting from a VAR model is similar to forecasting from a univariate AR model; a brief description follows. As for AR models, minimum mean square error (MMSE) forecasts are easily computed for VAR models. Future values of $y_t$ are replaced by their MMSE forecasts and future error terms are set to zero, while past values of $y_t$ and $u_t$ are replaced by observed values. This is also called the naive forecast, because the presence of forecast error is ignored. For a VAR(1) process

    $$y_t = A_1 y_{t-1} + u_t$$

    the best one-step-ahead forecast is given by

    $$\hat{y}_N(1) = A_1 y_N$$

    For the two-step-ahead forecast it holds that

    $$\hat{y}_N(2) = A_1 \hat{y}_N(1) = A_1^2 y_N$$

    where the unknown value $y_{N+1}$ is replaced by its point forecast. Forecasts for longer horizons h (h-step forecasts) may be obtained using the chain rule of forecasting as

    $$\hat{y}_N(h) = A_1^h y_N$$

    Notice that forecasts are obtained by multiplying matrices $A_1$, as opposed to the AR(1) case, where scalars are multiplied.
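
    The chain rule of forecasting can be sketched in a few lines. The coefficient matrix (written `A`) and the last observation `y_N` are hypothetical values chosen for illustration.

```python
import numpy as np

def naive_forecast(A, y_last, horizon):
    """h-step forecasts of a VAR(1) for h = 1..horizon, with future errors set to zero."""
    forecasts = []
    current = np.asarray(y_last, dtype=float)
    for _ in range(horizon):
        current = A @ current          # the previous forecast replaces the unknown value
        forecasts.append(current)
    return np.array(forecasts)

A = np.array([[0.5, 0.1],
              [0.4, 0.5]])
y_N = np.array([1.0, 2.0])

path = naive_forecast(A, y_N, horizon=3)
direct = np.linalg.matrix_power(A, 3) @ y_N    # closed form A^h y_N for h = 3
```

    The last row of `path` and `direct` coincide, which is the statement that the recursion and the matrix power give the same h-step forecast.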

    3.2 Simulation-based forecast

    Because the naive method does not take forecast errors into account, its forecasts may be biased. One could then use bootstrap forecasting methods. Simulation-based forecasts are still obtained by replacing unknown lags with point forecasts, but they also incorporate forecast errors when computing multi-step forecasts, by adding a value drawn from the estimated error vectors $u_t$ of the VAR model. If the model is correctly specified, this approach is less biased but also much more computationally intensive, according to Zivot/Wang (2002).
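
    A minimal sketch of such a simulation-based forecast for a VAR(1) is given below. It resamples whole residual vectors with replacement; the function name is hypothetical, and the residuals are random stand-ins for the estimated residuals of a fitted model.

```python
import numpy as np

def bootstrap_forecast(A, residuals, y_last, horizon, n_paths, rng):
    """Simulate future paths of a VAR(1) by adding resampled residual vectors at each step.
    Returns an array of shape (n_paths, horizon, K)."""
    K = len(y_last)
    paths = np.empty((n_paths, horizon, K))
    for s in range(n_paths):
        current = np.asarray(y_last, dtype=float)
        for h in range(horizon):
            # draw one complete residual vector to keep the contemporaneous correlation
            shock = residuals[rng.integers(len(residuals))]
            current = A @ current + shock
            paths[s, h] = current
    return paths

rng = np.random.default_rng(4)
A = np.array([[0.5, 0.1],
              [0.4, 0.5]])
y_N = np.array([1.0, 2.0])
residuals = rng.standard_normal((200, 2))   # stand-in for estimated residuals (assumption)

paths = bootstrap_forecast(A, residuals, y_N, horizon=3, n_paths=500, rng=rng)
point = paths.mean(axis=0)                              # simulation-based point forecast
lower, upper = np.percentile(paths, [5, 95], axis=0)    # 90% forecast band per horizon
```

    The spread of the simulated paths gives a forecast band in addition to the point forecast, at the cost of simulating many paths. This sketch ignores the uncertainty in the estimated coefficients.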

    3.3 Conditional forecast

    Forecasts from VAR models are quite flexible because they can be made conditional on the potential future paths of specified variables in the model. For example, when forecasting multivariate macroeconomic variables using quarterly data from a VAR model, it may happen that some of the future values of certain variables in the VAR model are known, because data on these variables are released earlier than data on the other variables. By incorporating the knowledge of the future path of certain variables, in principle it should be possible to obtain more reliable forecasts of the other variables in the system. Another use of conditional forecasting is the generation of forecasts conditional on different policy scenarios. These scenario-based conditional forecasts allow one to answer the question: if something happens to some variables in the system in the future, how will it affect forecasts of other variables in the future? (Zivot/Wang, 2002).
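
    The idea can be illustrated for the simplest case, a one-step forecast of a VAR(1) in which the next value of one variable is already known. The sketch assumes jointly normal innovations (or, equivalently, uses the best linear projection); the function name and all numbers are hypothetical, and this is not the procedure of any particular software package.

```python
import numpy as np

def conditional_one_step(A, sigma_u, y_last, known_index, known_value):
    """One-step forecast of a VAR(1) given the next value of one variable."""
    base = A @ np.asarray(y_last, dtype=float)      # unconditional (naive) forecast
    surprise = known_value - base[known_index]      # implied innovation of the known variable
    # expected innovations of all variables given the innovation of the known variable
    adjustment = sigma_u[:, known_index] / sigma_u[known_index, known_index] * surprise
    return base + adjustment

A = np.array([[0.5, 0.1],
              [0.4, 0.5]])
sigma_u = np.array([[1.0, 0.5],
                    [0.5, 1.0]])
y_N = np.array([1.0, 2.0])

unconditional = A @ y_N
conditional = conditional_one_step(A, sigma_u, y_N, known_index=1, known_value=2.0)
```

    Here the second variable turns out higher than its naive forecast, and because the innovations are positively correlated, the forecast of the first variable is revised upwards as well. With uncorrelated innovations the forecast of the first variable would be unchanged.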

    4 VAR models and Cointegration

    The following chapter describes the analysis of nonstationary multivariate time series using VAR models that incorporate cointegration relationships.

    4.1 Integration and Cointegration

    So far it has been assumed that all time series are covariance stationary. If this assumption is relaxed to allow for non-stationarity in the data, VAR modelling is inadequate because the stability condition (1) is no longer fulfilled. One could get rid of the problem by differencing the data or by applying other transformations that make the time series stationary. However, sometimes the structure of the trend is of interest in itself, especially whether it is stochastic or deterministic. There are further reasons for not differencing. Most economic theories are expressed in levels, so being obliged to difference the data would be a substantial obstacle to testing economic theories. Furthermore, differencing eliminates important dynamic and long-term features from the data, and observations are lost. Being obliged to difference the data is therefore unsatisfactory. An alternative and more advanced approach is to use vector error correction models, which integrate cointegration techniques into the VAR. Ignoring cointegrating relationships by simply differencing the data would mean ignoring equilibrium conditions. Forecasts based on such misspecified models may then violate these theory-based plausibility conditions.

    The idea behind cointegration is to find a linear combination of two I(d) variables that yields a variable with a lower order of integration. Although the individual series are nonstationary, they are tied together by the cointegrating vector.

    Definition of integration: A time series $y_t \sim I(d)$ (integrated of order d) if $\Delta^d y_t$ is stable but $\Delta^{d-1} y_t$ is not.

    Definition of cointegration: $y_t \sim I(d)$ is cointegrated if there exists a $(K \times 1)$ fixed vector $\beta \neq 0$ such that $\beta' y_t$ is integrated of order $< d$. ($I(0)$ is stable.)

    Let us consider the bivariate cointegrated VAR(1) process. One may find a linear combination of $y_{1t}$ and $y_{2t}$, namely $(y_{1t} - k y_{2t})$, which is stationary. Then $y_{1t}$ and $y_{2t}$ are called cointegrated. The linear combination can be interpreted as a constraint implying a long-run relationship. The two variables are tied together in the long run by the cointegrating vector. This long-run relationship can be incorporated as a lagged cointegrating error term in a VECM, so that the long-term information in the levels is not lost, as it would be by simply differencing the data.
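
    A small simulation can illustrate the definition. The sketch assumes a random walk for the second series and k = 2; both choices, and the use of the lag-1 autocorrelation as an informal indicator of persistence, are assumptions made only for this example and do not replace a formal unit root or cointegration test.

```python
import numpy as np

def lag1_autocorrelation(x):
    """Sample autocorrelation at lag one."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.sum(x[1:] * x[:-1]) / np.sum(x * x))

rng = np.random.default_rng(3)
n, k = 2000, 2.0
y2 = np.cumsum(rng.standard_normal(n))      # random walk, integrated of order one
y1 = k * y2 + rng.standard_normal(n)        # shares the stochastic trend of y2
spread = y1 - k * y2                        # the linear combination (y1 - k * y2)

rho_level = lag1_autocorrelation(y1)        # close to one: highly persistent
rho_spread = lag1_autocorrelation(spread)   # close to zero: no persistence left
```

    Each series on its own wanders without returning to a fixed level, whereas the linear combination fluctuates around zero.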

    Before describing the VECM, one has to know in which cases the VECM is adequate.

    Consider the bivariate VAR(2)

    $$y_t = A_1 y_{t-1} + A_2 y_{t-2} + u_t$$

    with the matrix polynomial evaluated at z = 1 (compare the stability condition (1))

    $$A(1) = I - A_1 - A_2 = -\Pi$$

    where rank($\Pi$) equals the cointegration rank of the system $y_t$.

    0 ... no cointegration (use a VAR in differences)

    1 ... one cointegrating vector (use a VECM)

    2 ... process is stable (use a VAR in levels)

    When the cointegration rank equals zero, no cointegration can be found. One is then obliged to difference the data and to estimate a VAR in differences. The case of interest is the second one, in which the cointegration rank lies between 1 and K-1 (K, the number of variables, is 2 here). This means that one or several cointegrating vectors can be found, and therefore the VECM should be estimated.
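
    The three cases can be illustrated for known coefficient matrices, written `A1` and `A2` here. This is only a sketch: in practice the matrices are estimated and the rank has to be determined with a statistical test, which is not shown. All numerical values are hypothetical.

```python
import numpy as np

def long_run_matrix(A1, A2):
    """Pi = A1 + A2 - I, i.e. minus the VAR(2) matrix polynomial evaluated at z = 1."""
    return A1 + A2 - np.eye(A1.shape[0])

def cointegration_rank(A1, A2, tol=1e-10):
    """Rank of the long-run matrix for known coefficient matrices."""
    return int(np.linalg.matrix_rank(long_run_matrix(A1, A2), tol=tol))

zeros = np.zeros((2, 2))

# Case 0: two independent random walks, Pi = 0.
rank_random_walks = cointegration_rank(np.eye(2), zeros)

# Case 1: y2 is a random walk and y1 adjusts towards y2, so (y1 - y2) is stationary.
A1_coint = np.array([[0.5, 0.5],
                     [0.0, 1.0]])
rank_cointegrated = cointegration_rank(A1_coint, zeros)

# Case 2: a stable VAR(2), Pi has full rank.
A1_stable = np.array([[0.5, 0.1],
                      [0.4, 0.5]])
A2_stable = np.array([[0.0, 0.0],
                      [0.25, 0.0]])
rank_stable = cointegration_rank(A1_stable, A2_stable)
```

    In the cointegrated case the only nonzero row of the long-run matrix is proportional to (1, -1), the cointegrating vector of this example.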

    4.2 Vector error correction model (VECM)

    This cointegration relationship can be integrated into a bivariate VAR(2) process by subtracting $y_{t-1}$ on both sides and rearranging terms so as to obtain

    $$\Delta y_t = y_t - y_{t-1} = \Pi y_{t-1} + \Gamma_1 \Delta y_{t-1} + u_t$$

    which is the so-called VECM form, where $\Gamma_1 = -A_2$ is the transition matrix and $\Pi = \alpha \beta'$ holds, with

    $\alpha$ as the loading matrix (speed of adjustment)

    $\beta$ containing the independent cointegrating vectors

    $\beta' y_{t-1}$ as the lagged disequilibrium error

    $\Pi y_{t-1}$ as the error correction term (long-run part)

    (To catch the idea, consider the bivariate VAR(1) equation $\Delta y_{1t} = \alpha_1 (y_{1,t-1} - y_{2,t-1}) + u_{1t}$ with long-run equilibrium $y_{1t} = y_{2t}$.)

    The VECM form tells us that changes in $y_t$ can be explained by their own history, lagged changes of the other variables, and the error from the long-run equilibrium in the previous period. All variables in the VECM are stationary, including $\Pi y_{t-1}$, which is made stationary by $\beta$. The long-run relation is implemented in $\Pi y_{t-1}$, whereas the short-run coefficients are collected in $\Gamma_1$. To summarize, the long-run or cointegration relations are often associated with specific economic relations which are of particular interest, whereas the short-run dynamics describe the adjustment to the long-run relations when disturbances have occurred. (Lütkepohl, 2007).

    Estimation of such a VECM requires a specific procedure called reduced rank estimation. Forecasts can then be computed following the MMSE (naive forecasting) method for VAR models. One may forecast the changes in the variables, $\Delta y_t$, or the levels of the variables, $y_t$.
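
    The link between the levels VAR(2) and its VECM form, and the naive forecast of the levels, can be sketched as below. The coefficient matrices are taken as known, so the reduced rank estimation step is not shown; function names and numbers are assumptions for illustration.

```python
import numpy as np

def var2_to_vecm(A1, A2):
    """Rewrite y_t = A1 y_{t-1} + A2 y_{t-2} + u_t in error correction form.
    Returns (Pi, Gamma1) with Pi = A1 + A2 - I and Gamma1 = -A2."""
    Pi = A1 + A2 - np.eye(A1.shape[0])
    Gamma1 = -A2
    return Pi, Gamma1

def vecm_forecast(Pi, Gamma1, y_prev, y_last, horizon):
    """Naive level forecasts: predict the change with zero future errors, then cumulate."""
    prev = np.asarray(y_prev, dtype=float)
    last = np.asarray(y_last, dtype=float)
    levels = []
    for _ in range(horizon):
        change = Pi @ last + Gamma1 @ (last - prev)   # forecast of the next change
        prev, last = last, last + change              # add the change to the level
        levels.append(last)
    return np.array(levels)

# Hypothetical VAR(2) coefficients and the last two observations.
A1 = np.array([[0.5, 0.1],
               [0.4, 0.5]])
A2 = np.array([[0.0, 0.0],
               [0.25, 0.0]])
y_prev = np.array([1.0, 0.5])    # y_{N-1}
y_last = np.array([1.2, 0.4])    # y_N

Pi, Gamma1 = var2_to_vecm(A1, A2)
levels = vecm_forecast(Pi, Gamma1, y_prev, y_last, horizon=3)
```

    Since the VECM is only a rearrangement of the VAR(2), these level forecasts coincide with the forecasts obtained by iterating the VAR(2) in levels.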

    5 Applications for VAR/VEC

    Information, especially for VEC, is taken from Zivot/Wang (2002).

    VAR can be used for forecasting stationary time series, e.g. interest rates, some exchange rates and some asset returns. Many economic variables can also be analysed on the basis of their growth rates.

    VEC can be used whenever long-term equilibrium conditions can be derived from economic theory. Such cointegration relationships can be found in economics and finance.

    Economics:

    • Money demand models imply cointegration between money, income, prices and interest rates.

    • Growth theory models imply cointegration between income, consumption and investment, with productivity being the common trend.

    • Purchasing power parity implies cointegration between the nominal exchange rate and foreign and domestic prices.

    • The Fisher equation implies cointegration between nominal interest rates and inflation.

    Finance:

    • Cointegration at a high frequency is motivated by arbitrage arguments. The Law of One Price implies that identical assets must sell for the same price to avoid arbitrage opportunities. This implies cointegration between the prices of the same asset trading on different markets, for example.

    • Cointegration at a low frequency is motivated by economic equilibrium theories linking asset prices or expected returns to fundamentals. For example, the present value model of stock prices states that a stock's price is an expected discounted present value of its expected future dividends or earnings. This links the behavior of stock prices at low frequencies to the behavior of dividends or earnings.

    6 Final discussion about forecast quality

    Multivariate modelling seems appealing on theoretical grounds. Unfortunately, forecasting competitions show that the situation is not that clear. Whether VAR forecasts perform better than their univariate counterparts is controversial and varies across topics and data sets. According to Chatfield, one may have a slightly better than 50:50 chance of obtaining improved forecasts by using multivariate forecasting methods. This may be a result of the more complex model selection, which is susceptible to errors and can therefore lead to misspecified models. Regarding BVAR forecasts, Chatfield emphasizes the results of Boero (1990): the Bayesian VAR model is better than a large-scale econometric model for short-term forecasting, but not for long-term forecasts, where the econometric model can benefit from judgemental interventions by the model user and may be able to pick up non-linearities not captured by (linear) VAR models. As said before, (unrestricted) VAR model forecasts often seem to suffer from too many parameters, which give a spuriously good fit within the sample but lead to poor out-of-sample forecasts. Therefore, as Chatfield mentions, a main motivation for VAR modelling often lies in trying to get a better understanding of a given system, rather than in trying to get better forecasts.

    References

    Chatfield, C. (2001) Time-series Forecasting. Chapman & Hall.

    Lütkepohl, H. (2007) Econometric Analysis with Vector Autoregressive Models, Economics Working Papers ECO2007/11, European University Institute.

    Lesage, J.P. (1990) A comparison of the forecasting ability of the ECM and VAR models, Review of Economics and Statistics, 72, 664-71.

    Shoesmith, G.L. (1992) Cointegration, error correction and improved regional VAR forecasting, Journal of Forecasting, 11, 91-109.

    Shoesmith, G.L. (1995) Long term forecasting of noncointegrated and cointegrated regional and national models, Journal of Regional Science, 35, 43-64.

    Zivot, E., Wang, J. (2002) Modeling Financial Time Series with S-PLUS.
