Forecasting Techniques: Single Equation Regressions
Su, Chapter 10, section III
Regression Models
• Represent functional relationships between economic variables
• Usually estimated by OLS techniques
• General Form
$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \dots + \beta_k X_{kt} + u_t$
$Y_t$: Dependent Variable; $X_{it}$'s: Explanatory Variables
$\beta_i$'s: Parameters; $u_t$: Stochastic Term
Regression: Forecasting Ability
• Depends on the structure of the regression equation, including
– Degrees of Freedom: Should be > 30
– Statistical Significance and sign of parameters
– High Goodness of Fit
• Low Standard Error of Estimate
• High $R^2$
Forecasting with Regression Models
• Depends on choice of X's, which is generally guided by economic theory
– Example: According to the IS/LM model, what variables would be useful for forecasting GDP?
• Generally speaking, more data should be preferred
Some Useful Concepts I
• Ex Post Forecast: Extrapolation goes beyond sample period but not into future
– Example: Sample period for regression is 1970-1997, forecast through 2000
• Ex Ante Forecast: Extrapolation extends into future
– Example: Sample period is 1990:1-2001:1, forecast through 2002:1
Some Useful Concepts II
• Predictive power of a regression model depends on its lag structure
• Conditional Forecasts: Some contemporaneous explanatory variables appear on RHS
– Must also predict values for these contemporaneous explanatory variables
• Unconditional Forecasts: Only lagged explanatory variables appear on RHS
Some Useful Concepts III
• Point Forecast: Predicts a single number
– Example: The Dow will be 1100 on July 1
• Interval Forecast: Shows a numerical interval in which the actual value can be expected to fall
– Example: The Dow will be between 1000 and 2000 on July 1 with 99% probability
Example: Automobile Sales
• Want to replicate the regression results in section 4
• Use the regression data analysis tool to replicate the results on page 348
• Model: $Y_t = \alpha + \beta X_t + u_t$
• $Y$: Automobile Sales; $X$: New Car Price
• Linear Demand Curve
Demand for New Cars
[Figure: scatter plot with Sales (4000 to 10000) on the horizontal axis and Price (50.0 to 130.0) on the vertical axis]
Procedure
• Step 1: Copy the sales and price data to a new worksheet
• Step 2: Start the regression data analysis tool
• Specify correct ranges
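For readers working outside a spreadsheet, the same kind of fit can be sketched in plain Python. This is a minimal illustration, not the course's procedure: the function name `simple_ols` is an assumption, and the price and sales figures are hypothetical, so the results will not match the regression output in these slides.

```python
import math

def simple_ols(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns the intercept, slope, standard error of estimate and
    the t-statistic of the slope.
    """
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    see = math.sqrt(ssr / (n - 2))   # n - 2 residual degrees of freedom
    se_b = see / math.sqrt(sxx)      # standard error of the slope
    return {"intercept": a, "slope": b, "see": see, "t_slope": b / se_b}

# HYPOTHETICAL price and sales figures, for illustration only.
price = [60.0, 75.0, 90.0, 105.0, 120.0]
sales = [8600.0, 7700.0, 7900.0, 6800.0, 6500.0]

fit = simple_ols(price, sales)
print(fit)
```

Replacing the two hypothetical lists with the worksheet columns should give the intercept, slope, standard error and t-statistic that the regression tool reports.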
Regression Output
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.59
R Square             0.35
Adjusted R Square    0.31
Standard Error       1013.7142
Observations         20

ANOVA
              df    SS             MS
Regression    1     9794932.261    9794932
Residual      18    18497098.29    1027617
Total         19    28292030.55

                Coefficients    Standard Error    t Stat
Intercept       10200.23        887.95            11.49
X Variable 1    -30.2750        9.8062            -3.09
Interpreting Regression Results
• $Y_t = 10,200.23 - 30.275 X_t$ (10.20)
– Parameter on X: -30.275
– t-statistic: -3.09
Ex Post Point Forecasts
• To make an ex post forecast for 1991, simply plug the actual value of the price index for 1991 into (10.20) - Put in D22
• Yt = 10,200.23 - 30.275(125.3) = 6,406.77
• Note that ex post forecasts can be done for any year in the period for which data are available
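The substitution into (10.20) can be sketched in a few lines of Python. The coefficients and the 1991 price index of 125.3 come from the slides; the function name `point_forecast` is only an illustrative assumption.

```python
# Coefficients from equation (10.20).
INTERCEPT = 10200.23
SLOPE = -30.275

def point_forecast(price_index):
    """Return forecast sales for a given new car price index."""
    return INTERCEPT + SLOPE * price_index

# Ex post forecast for 1991, using the actual 1991 price index.
forecast_1991 = point_forecast(125.3)
print(round(forecast_1991, 2))
```

Calling `point_forecast` on the price index of any other year in the data gives that year's ex post forecast.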
Evaluation of Ex Post Forecasts
• Can also evaluate forecasts within sample
• Copy the formula from D22 into D21
• Where in the regression output can you find this number?
• Fill in the rest of column D with the Ex Post Forecasts and plot the actual sales and the Ex Post forecasts
Actual Sales and Ex Post Forecast
[Figure: actual sales and ex post forecasts plotted by year, 1971 to 1991; vertical axis 0 to 12000]
Summary Statistics
• Already know how to calculate, but in this case the regression function has already done some of the heavy lifting
• We saw where the Ex Post forecasts could be found; what about the forecast errors?
Residuals and Forecast Errors
• In the terminology of econometrics, ex post forecast errors are called residuals
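• The OLS estimator is designed to minimize the sum of the squared residuals, so within the sample OLS estimates also minimize MSE and RMSE
• To find the value of MSE, start from the sum of squared residuals on the ANOVA table (row labeled Residual, column labeled SS) and divide by the number of observations; the MS column divides by the residual degrees of freedom instead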
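As a worked check, the error measures can be computed from the ANOVA figures. This sketch assumes the usual definition of MSE as the sum of squared residuals divided by the number of observations. The Standard Error reported by the regression tool divides by the residual degrees of freedom instead, so it comes out slightly larger than RMSE.

```python
import math

# Values read from the regression output.
SS_RESIDUAL = 18497098.29   # ANOVA table: row Residual, column SS
N_OBS = 20                  # Observations
DF_RESIDUAL = 18            # ANOVA table: row Residual, column df

mse = SS_RESIDUAL / N_OBS                    # mean squared error over the sample
rmse = math.sqrt(mse)                        # root mean squared error
see = math.sqrt(SS_RESIDUAL / DF_RESIDUAL)   # standard error of estimate

print(round(mse, 2), round(rmse, 2), round(see, 2))
```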
Ex Ante Point Forecasts
• To generate these, must forecast X, as these forecasts are conditional on unknown future values (must pretend that the present is 1991 in this case)
• How should X be forecast?
Ex Ante Point Forecasts: Example
• Step 1: Extend the time column to 1994
• Step 2: Calculate the forecasted X’s using the same change naïve forecasting model in column C
• Step 3: Using the formula from above, calculate the Ex Ante forecasts for 1992 - 1994 and chart them
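The three steps can be sketched in Python as follows. The same change model repeats the most recent change in X for every future period. The 1991 price index of 125.3 and the coefficients of (10.20) come from the slides; the 1990 value is a hypothetical placeholder and the function name is an assumption.

```python
# Coefficients from equation (10.20).
INTERCEPT = 10200.23
SLOPE = -30.275

def same_change_forecasts(previous, latest, steps):
    """Extend a series by repeating its most recent change."""
    change = latest - previous
    return [latest + change * h for h in range(1, steps + 1)]

# 125.3 is the 1991 price index from the slides; 121.0 for 1990 is a
# HYPOTHETICAL placeholder, not a value from the data set.
price_1990, price_1991 = 121.0, 125.3

years = [1992, 1993, 1994]
price_forecasts = same_change_forecasts(price_1990, price_1991, len(years))

# Ex ante sales forecasts are conditional on the forecasted prices.
sales_forecasts = [INTERCEPT + SLOPE * x for x in price_forecasts]

for year, x, y in zip(years, price_forecasts, sales_forecasts):
    print(year, round(x, 1), round(y, 2))
```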
Ex Post and Ex Ante Forecasts
[Figure: ex post and ex ante forecasts plotted by year, 1971 to 1993; vertical axis 0 to 12000]
Interval Forecasts
• Instead of a line, can also display the range in which the forecast values will probably fall
• These are called interval forecasts and are based on the variance of the regression
• Based on (10.18)
Interval Forecasts: Example
• Must calculate average of X and sum of X - average(X) = x
• First term of (10.18) is just the ex ante forecast
• $t_{0.025}$ is just a value from a table in a statistics book
• e has already been calculated by the regression program
• Text has wrong numbers
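A rough sketch of the calculation follows. Equation (10.18) is not reproduced in these slides, so the code assumes the standard textbook forecast interval for a simple regression, which may differ in detail from the text's formula. The price sample is hypothetical; the intercept, slope and standard error come from the regression output, and 2.101 is the tabulated t value for 0.025 in each tail with 18 degrees of freedom.

```python
import math

def forecast_interval(x_sample, x0, intercept, slope, see, t_crit):
    """Return (lower, point, upper) for a simple-regression forecast at x0.

    Assumes the standard interval:
    point +/- t * see * sqrt(1 + 1/n + (x0 - mean)^2 / sum of squared deviations)
    """
    n = len(x_sample)
    x_bar = sum(x_sample) / n
    sum_sq_dev = sum((x - x_bar) ** 2 for x in x_sample)
    point = intercept + slope * x0
    half_width = t_crit * see * math.sqrt(
        1 + 1 / n + (x0 - x_bar) ** 2 / sum_sq_dev
    )
    return point - half_width, point, point + half_width

# HYPOTHETICAL price index sample (20 values); replace with the worksheet data.
prices = [60.0 + 3.5 * i for i in range(20)]

low, point, high = forecast_interval(
    prices, 125.3, 10200.23, -30.275, 1013.7142, 2.101
)
print(round(low, 2), round(point, 2), round(high, 2))
```

The interval widens as the forecast value of X moves away from the sample average, which is why ex ante intervals fan out in the chart.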
Forecast Interval
[Figure: Ex Post and Ex Ante Forecasts with forecast interval, plotted by year, 1971 to 1993; vertical axis 0 to 12000]
Autoregressive Models
• Even though they use sophisticated statistical techniques, these models are extrapolations
• The explanatory variables (X’s) are lagged values of the dependent variable
• Assumes that the time path of a variable is self-generating
• Also called the “Chain Principle”
AR Models: Functional Forms
• General: $X_t = f(X_{t-1}, X_{t-2}, X_{t-3}, \dots, \beta_1, \beta_2, \beta_3, \dots, u_t)$
– $u_t$: residual term, captures random components
– Must specify form and lag length
• Linear form, lag length k: $X_t = \beta_0 + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \dots + \beta_k X_{t-k} + u_t$
Note that both No Change and Same Change naïve forecasts are special cases of this
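To see why the naïve forecasts are special cases, a small sketch helps. With a zero intercept, a coefficient of 1 on the first lag gives the No Change forecast, and coefficients of 2 and -1 on the first two lags give the Same Change forecast. The function name `ar_forecast` is an assumption; the sample values are the first three U.S. population figures from the exercise at the end of the deck.

```python
def ar_forecast(history, coefficients, intercept=0.0):
    """One-step forecast from a linear AR(k) model.

    history: past values, oldest first.
    coefficients: [b1, ..., bk], applied to X[t-1], ..., X[t-k].
    """
    k = len(coefficients)
    lags = history[-1:-k - 1:-1]   # X[t-1], X[t-2], ..., X[t-k]
    return intercept + sum(b * x for b, x in zip(coefficients, lags))

series = [147.20, 149.77, 152.27]

# No Change: X[t] = X[t-1]
no_change = ar_forecast(series, [1.0])

# Same Change: X[t] = X[t-1] + (X[t-1] - X[t-2]) = 2*X[t-1] - X[t-2]
same_change = ar_forecast(series, [2.0, -1.0])

print(round(no_change, 2), round(same_change, 2))
```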
AR Models: Determining Lag Length
• The general form has an infinite number of parameters, but we never have that much data, so the model must be restricted before it can be used
• Assume that the impact of sufficiently distant $X_{t-j}$ is trivial and insignificant
• Rule of thumb: don't use k > 4 because of econometric problems
Dummy Variables
• Requires no additional economic data
• Was discussed in chapter 2
• Two Types:
– Trend
– Seasonal / annual
Dummy Variables: Trends
• Uses a time variable T (=1,2,3,…) and extrapolates X along its time path
Linear: $X_t = \alpha + \beta T_t$
Exponential: $X_t = e^{\alpha + \beta T_t}$
Reciprocal: $X_t = 1/[\alpha + \beta T_t]$
Parabolic: $X_t = \beta_0 + \beta_1 T_t + \beta_2 T_t^2$
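The four trend shapes can be compared with a short sketch. The function names and all parameter values below are hypothetical, chosen only to show how each form extrapolates as T grows. In practice the parameters would come from a regression on T.

```python
import math

def linear_trend(t, a, b):
    """X = a + b*T"""
    return a + b * t

def exponential_trend(t, a, b):
    """X = exp(a + b*T)"""
    return math.exp(a + b * t)

def reciprocal_trend(t, a, b):
    """X = 1 / (a + b*T)"""
    return 1.0 / (a + b * t)

def parabolic_trend(t, b0, b1, b2):
    """X = b0 + b1*T + b2*T^2"""
    return b0 + b1 * t + b2 * t ** 2

# HYPOTHETICAL parameter values, for illustration only.
for t in (1, 10, 50):
    print(
        t,
        round(linear_trend(t, 100.0, 2.0), 2),
        round(exponential_trend(t, 4.6, 0.02), 2),
        round(reciprocal_trend(t, 0.01, -0.0001), 2),
        round(parabolic_trend(t, 100.0, 2.0, 0.05), 2),
    )
```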
Dummy Variables: Seasonal
• These are “Intercept shifters”: they allow the intercept term $\beta_0$ to vary systematically
• Single Equation Model with Quarterly Dummies:
$Y_t = \alpha_1 Q_1 + \alpha_2 Q_2 + \alpha_3 Q_3 + \alpha_4 Q_4 + \beta_1 X_{1t} + \dots + \beta_k X_{kt} + u_t$
• Can also use monthly dummies if Y is monthly
• Get a different forecast for each quarter
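Building the quarterly dummy columns can be sketched as follows. The function name is an assumption. Because all four dummies are included, the model as written has no separate constant term: each quarter's coefficient acts as that quarter's intercept.

```python
def quarterly_dummies(quarters):
    """Map quarter numbers (1 to 4) to rows [Q1, Q2, Q3, Q4] of 0/1 indicators."""
    return [[1 if q == j else 0 for j in (1, 2, 3, 4)] for q in quarters]

# Quarter of each observation, e.g. six consecutive quarterly data points.
quarters = [1, 2, 3, 4, 1, 2]
rows = quarterly_dummies(quarters)

for q, row in zip(quarters, rows):
    print(q, row)
```

The same pattern extends to monthly data by using twelve indicator columns instead of four.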
Other Dummy Variables
• Dummy variables can be useful tools in forecasting
• Recall from the earlier section that the single equation forecast for new car sales was high for 1991 because it was a recessionary year
• Can use a dummy variable for recessions to improve this forecast
Example: Recession Dummy
• Model: $Y_t = \alpha + \beta X_t + \gamma D_R + u_t$
• $Y$: Automobile Sales; $X$: New Car Price
• $D_R$: Recession Dummy, = 1 in years with troughs
• Add new sheet to spreadsheet, copy Year, New Car Sales, New Car Price
• Look at Table 7.1, p. 236 to create dummy
Empirical Results
Yt = - 31.66Xt - DR
(571.918) (6.233) (360.237)
Forecast with Recession Dummy
[Figure: Actual and Forecast sales plotted by year, 1971 to 1991; vertical axis 0 to 12000]
Forecast Comparison
           No Dummy         Dummy
1991 F     6406.77          4839.97
SEE        1013.71          643.83
SSR        18,497,098.29    7,046,970
R^2        0.35             0.75
Exercise: AR Models
• Data: U.S. Population 1948-1990
• Available in a text file on Web page (tab2-1.txt)
• Step 1: Read file into Excel
Exercise: Creating Lag Variables
• Best way is with formulas, although could copy as well
• Population data are in column 2
• Step 2: Label columns 3-6 “Lag1”, “Lag2”, “Lag3” and “Lag4”
• What value goes in C3? D4? E5? F6?
Year    Pop       Lag1      Lag2      Lag3      Lag4
1948    147.20
1949    149.77    147.20
1950    152.27              147.20
1951    154.88                        147.20
1952    157.55                                  147.20
• C3 is the Lag1 value for 1949, which is the actual population in 1948 - population lagged one year
• D4 is the Lag2 value for 1950, which is the actual population in 1948 - population lagged two years
• Step 3: Fill in rest of lags using formulas
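The lag columns can also be sketched outside the spreadsheet. This is an illustrative equivalent of the fill-down formulas, with `make_lags` as an assumed name; the five population values are the ones shown in the table above.

```python
def make_lags(values, max_lag):
    """Return {lag: column}, where column[i] is values[i - lag].

    Entries with no earlier observation available are set to None.
    """
    return {
        lag: [values[i - lag] if i >= lag else None for i in range(len(values))]
        for lag in range(1, max_lag + 1)
    }

years = [1948, 1949, 1950, 1951, 1952]
pop = [147.20, 149.77, 152.27, 154.88, 157.55]

lags = make_lags(pop, 4)

for i, year in enumerate(years):
    print(year, pop[i], [lags[k][i] for k in (1, 2, 3, 4)])
```

Rows with a `None` in any lag column have to be dropped before running a regression that uses that lag, which is why the sample period shrinks as lags are added.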
Exercise: AR Regressions
• Step 4: Replicate the regression results on page 352. Note: Watch sample period
• Step 5: Calculate Ex Post forecasts for the sample period and RMSE for each method
– Which has the lowest RMSE?
• Step 6: Calculate Ex Ante population forecasts through 2025 and compare to Table 10.4
Exercise: Trend Forecasting
• Step 1: Create trend and trend squared variables in the spreadsheet
• Step 2: Replicate the three regression results shown on page 354
• Step 3: Calculate a 100-year-ahead Ex Ante forecast of U.S. population using each, and chart the time paths
• How accurate are these forecasts?