Dr. C. Ertuna 1
Statistical Forecasting Models
(Lesson - 07)
Best Bet to See the Future
Statistical Forecasting Models
• Time Series Models: the independent variable is time.
  – Moving Average
  – Exponential Smoothing
  – Holt-Winters Model
• Explanatory Methods: the independent variables are one or more explanatory factors.
  – Regression
Time Series Models
• Statistical Time Series Models are very useful for short range forecasting problems such as weekly sales.
• Time series models assume that whatever forces have influenced the variables in question (sales) in the recent past will continue into the near future.
Time Series Components
A time series can be described by models based on the following components:
  T_t  Trend Component
  S_t  Seasonal Component
  C_t  Cyclical Component
  I_t  Irregular Component
Using these components, we can define a time series as the sum of its components (an additive model):
$X_t = T_t + S_t + C_t + I_t$
Alternatively, in other circumstances we might define a time series as the product of its components (a multiplicative model). This is often represented as a logarithmic model, because taking logarithms turns the product into a sum:
$X_t = T_t \times S_t \times C_t \times I_t$
Components of Time Series Data
• A linear trend is any long-term increase or decrease in a time series in which the rate of change is relatively constant.
• A seasonal component is a pattern that is repeated throughout a time series and has a recurrence period of at most one year.
• A cyclical component is a pattern within the time series that repeats itself throughout the time series and has a recurrence period of more than one year.
Components of Time Series Data
• The irregular (or random) component refers to changes in the time-series data that are unpredictable and cannot be associated with the trend, seasonal, or cyclical components.
Stationary Time Series Models
Time series with constant mean and variance are called stationary time series.
When Trend, Seasonal, or Cyclical effects are not significant then
a) Moving Average Models and
b) Exponential Smoothing Models
are useful over short time periods.
Moving Average Models
• Simple Moving Average forecast is computed as the average of the most recent k-observations.
• Weighted Moving Average forecast is computed as the weighted average of the most recent k-observations where the most recent observation has the highest weight.
Moving Average Models
• Simple Moving Average Forecast
$F_t = E(Y_t) = \frac{1}{k} \sum_{i=t-k}^{t-1} Y_i$
• Weighted Moving Average Forecast
$F_t = E(Y_t) = \sum_{i=t-k}^{t-1} w_i Y_i$
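The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lesson; the function names are hypothetical, and the weights are assumed to be listed from the oldest to the most recent observation.

```python
def simple_moving_average(history, k):
    """Forecast the next period as the mean of the k most recent observations."""
    if k <= 0 or len(history) < k:
        raise ValueError("need k > 0 and at least k observations")
    return sum(history[-k:]) / k


def weighted_moving_average(history, weights):
    """Forecast the next period as a weighted mean of the most recent observations.

    Assumption: weights are ordered oldest -> most recent and sum to 1.
    """
    k = len(weights)
    if len(history) < k:
        raise ValueError("need at least len(weights) observations")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * y for w, y in zip(weights, history[-k:]))


# Burglaries for months 42, 43, 44 (from the lesson's data set).
recent = [88, 44, 60]
print(simple_moving_average(recent, 3))                          # 64.0
print(round(weighted_moving_average(recent, [0.1, 0.3, 0.6]), 1))  # 58.0
```

The weighted result, 58.0, is the forecast for month 45 when the weights 0.1, 0.3, and 0.6 are applied.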
Weighted Moving Average
• To determine best weights and period (k) we can use forecast accuracy.
• MSE = Mean Square Error is a good measure for forecast accuracy.
• RMSE is the square root of the MSE.
Month   Actual Burglaries   wMA(k=3)   Excel formula / note
                            100.00%    =SUM(C4:C6)  All weights should add up to exactly 1
42      88                  0.1        Weight: the further away from the forecast period, the lower the weight
43      44                  0.3        Weight
44      60                  0.6        Weight: the most recent observation has the highest weight
45      56                  58.0       =B5*$C$6+B4*$C$5+B3*$C$4
46      70                  56.0       =B6*$C$6+B5*$C$5+B4*$C$4
47      91                  64.8       =B7*$C$6+B6*$C$5+B5*$C$4
48      54                  81.2
49      60                  66.7
50      48                  61.3
51      35                  52.2
52      49                  41.4
53      44                  44.7
54      61                  44.6
55      68                  54.7
56      82                  63.5
57      71                  75.7
58      50                  74.0
59                          59.5       Preliminary forecasted number of burglaries
MSE  = 256.3    =SUMXMY2(B7:B20,C7:C20)/COUNT(B7:B20)
RMSE = 16.01    =SQRT(C22)
Data: Evens - Burglaries
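As a cross-check of the spreadsheet, the same forecasts and accuracy measures can be recomputed in Python. This is a sketch under the assumption that MSE is averaged over months 45 to 58 (the months that have both an actual value and a forecast); the helper name `wma_forecasts` is hypothetical.

```python
import math

# Actual burglaries for months 42..58, copied from the table above.
actual = [88, 44, 60, 56, 70, 91, 54, 60, 48, 35, 49, 44, 61, 68, 82, 71, 50]
weights = [0.1, 0.3, 0.6]  # oldest -> most recent


def wma_forecasts(series, weights):
    """One-step-ahead weighted moving average forecasts.

    The first forecast is for the period right after the first k observations;
    the last one is for the period after the end of the series.
    """
    k = len(weights)
    return [
        sum(w * y for w, y in zip(weights, series[t - k:t]))
        for t in range(k, len(series) + 1)
    ]


forecasts = wma_forecasts(actual, weights)            # months 45..59
errors = [y - f for y, f in zip(actual[3:], forecasts)]  # months 45..58 only
mse = sum(e * e for e in errors) / len(errors)
rmse = math.sqrt(mse)

print(round(mse, 1), round(rmse, 2))  # 256.3 16.01
print(round(forecasts[-1], 1))        # 59.5 (forecast for month 59)
```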
Weighted Moving Average
• Tools / Solver
• Set Target Cell: cell containing the RMSE value
• Equal to: Min
• By Changing Cells: cells containing the weights
• Subject to constraints: cell containing the sum of the weights = 1
• Options / (check) Assume Non-Negativity
• Solve, then Keep Solver Solution, then OK
Weighted Moving Average
• The best weights for a given “k” (in this case “3”) are determined by Solver through minimizing RMSE.
• The same procedure can be applied to models with different values of k; the one with the lowest RMSE can be considered the model with the best forecasting period.
Month   Actual Burglaries   wMA(k=3)
                            100.00%   (sum of weights)
42      88                  0.0285    (weight)
43      44                  0.2093    (weight)
44      60                  0.7622    (weight)
45      56                  57.5
46      70                  56.5
47      91                  66.8
48      54                  85.6
49      60                  62.2
50      48                  59.6
51      35                  50.7
52      49                  38.4
53      44                  46.0
54      61                  44.8
55      68                  57.1
56      82                  65.8
57      71                  78.5
58      50                  73.2
59                          55.3
MSE  = 250.6
RMSE = 15.83
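Outside Excel, the Solver step can be approximated with a simple search. The sketch below swaps Solver for a plain grid search over non-negative weights that sum to 1 (in steps of 0.01); that substitution, the step size, and the function names are assumptions made for illustration. It should land close to the Solver weights shown above, though not necessarily on the same digits.

```python
import math

# Actual burglaries for months 42..58.
actual = [88, 44, 60, 56, 70, 91, 54, 60, 48, 35, 49, 44, 61, 68, 82, 71, 50]


def wma_rmse(series, weights):
    """RMSE of one-step-ahead weighted moving average forecasts (weights oldest -> newest)."""
    k = len(weights)
    squared = [
        (series[t] - sum(w * y for w, y in zip(weights, series[t - k:t]))) ** 2
        for t in range(k, len(series))
    ]
    return math.sqrt(sum(squared) / len(squared))


def best_weights_k3(series, steps=100):
    """Grid search over (w1, w2, w3) with w_i >= 0 and w1 + w2 + w3 = 1."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            w = (i / steps, j / steps, (steps - i - j) / steps)
            score = wma_rmse(series, w)
            if best is None or score < best[0]:
                best = (score, w)
    return best


best_rmse, best_w = best_weights_k3(actual)
print("starting weights RMSE:", round(wma_rmse(actual, (0.1, 0.3, 0.6)), 2))
print("grid-search weights:", best_w, "RMSE:", round(best_rmse, 2))
```

To compare different values of k, the same idea can be repeated with a different number of weights and the lowest RMSE kept.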
Moving Average Models
• Tools / Data Analysis / Moving Average
• Input Range: observations with title (no time column)
• Output Range: select the column next to the input range, starting one row below the first observation
• The chart misaligns the forecasted values! The forecast for the 59th month is aligned with the 58th month.
Months Crime k = 3 errors
50 48
51 35 #N/A #N/A
52 49 #N/A #N/A
53 44 44.00 #N/A
54 61 42.67 #N/A
55 68 51.33 6.33
56 82 57.67 8.21
57 71 70.33 10.59
58 50 73.67 9.13
59 67.67 12.32
[Chart: Moving Average. Actual and Forecast crimes (vertical axis, 0 to 90) by month (horizontal axis, months 50 to 58).]
Exponential Smoothing
Exponential smoothing is a time-series smoothing and forecasting technique that produces an exponentially weighted moving average in which each smoothing calculation or forecast is dependent upon all previously observed values.
• The smoothing factor “α” is a value between 0 and 1, where α closer to 1 means more weight on the recent observations and hence a more rapidly changing forecast.
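To see why each forecast depends on all previously observed values, the smoothing recursion can be unrolled. This is a short worked derivation, assuming the forecast rule $F_t = \alpha Y_{t-1} + (1-\alpha) F_{t-1}$ and substituting it into itself n times:

```latex
\begin{aligned}
F_t &= \alpha Y_{t-1} + (1-\alpha) F_{t-1} \\
    &= \alpha Y_{t-1} + \alpha (1-\alpha) Y_{t-2} + (1-\alpha)^2 F_{t-2} \\
    &= \alpha \sum_{j=0}^{n-1} (1-\alpha)^j \, Y_{t-1-j} + (1-\alpha)^n F_{t-n}
\end{aligned}
```

The weights $\alpha (1-\alpha)^j$ shrink geometrically with the age j of the observation. With α = 0.7, for example, the three most recent observations receive weights 0.7, 0.21, and 0.063.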
Exponential Smoothing Model
$F_t = F_{t-1} + \alpha (Y_{t-1} - F_{t-1})$
or
$F_t = \alpha Y_{t-1} + (1 - \alpha) F_{t-1}$
where:
F_t = Forecast value for period t
Y_{t-1} = Actual value for period t-1
F_{t-1} = Forecast value for period t-1
α = Alpha (smoothing constant)
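The recursion above translates directly into a short loop. This is a minimal sketch; the function name is hypothetical, and the first forecast is seeded with the first actual observation, which is the convention used in the lesson's worksheet.

```python
def exponential_smoothing(actual, alpha):
    """Return one-step-ahead forecasts for periods 2 .. n+1.

    Assumption: the forecast for period 2 is seeded with the actual value of period 1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    forecasts = [actual[0]]
    for y in actual[1:]:
        # F_t = alpha * Y_{t-1} + (1 - alpha) * F_{t-1}
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


# Crimes for months 50..58 (from the lesson's data set).
crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]
forecasts = exponential_smoothing(crimes, alpha=0.7)  # months 51..59
print([round(f, 1) for f in forecasts])
# [48.0, 38.9, 46.0, 44.6, 56.1, 64.4, 76.7, 72.7, 56.8]
```

The last value, 56.8, is the forecast for month 59.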
Exponential Smoothing Model
• Tools/ Data Analysis / Exponential Smoothing.
• Input Range: Observations with title (No time)
• Output Range: Select next column to the input range and first Row of the first observation
• Damping Factor: 1-α (not α)
Month Crimes alpha=0.7
50 48 #N/A
51 35 48.0
52 49 38.9
53 44 46.0
54 61 44.6
55 68 56.1
56 82 64.4
57 71 76.7
58 50 72.7
59 ? 56.8
[Chart: Exponential Smoothing. Actual and Forecast crimes (vertical axis, 0 to 90) by month (horizontal axis, months 50 to 59).]
Exponential Smoothing Model
• To determine best “α” we can use forecast accuracy.
• MSE = Mean Square Error is a good measure for forecast accuracy.
      A       B       C       D
1     Month   Crime   0.7
2     50      48      #N/A
3     51      35      48.00   = actual observation B2
4     52      49      38.90
5     53      44      45.97
6     54      61      44.59
7     55      68      56.08
8     56      82      64.42
9     57      71      76.73
10    58      50      72.72
11    59      ?       56.82   =$C$1*B10+(1-$C$1)*C10
12
13            MSE =   193.0   =SUMXMY2(B3:B10,C3:C10)/COUNT(B3:B10)
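Choosing α by forecast accuracy can be sketched as a search over candidate values. The grid of 101 values from 0.00 to 1.00 is an assumption made for illustration (Excel's Solver could be used instead), and `es_mse` is a hypothetical helper name.

```python
# Crimes for months 50..58.
crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]


def es_mse(actual, alpha):
    """MSE of one-step-ahead exponential smoothing forecasts.

    The first forecast is seeded with the first actual value, as in the worksheet.
    """
    forecast = actual[0]
    squared_errors = []
    for y in actual[1:]:
        squared_errors.append((y - forecast) ** 2)
        forecast = alpha * y + (1 - alpha) * forecast
    return sum(squared_errors) / len(squared_errors)


print(round(es_mse(crimes, 0.7), 1))  # 193.0, as in the worksheet

# Try alpha = 0.00, 0.01, ..., 1.00 and keep the value with the lowest MSE.
alphas = [i / 100 for i in range(101)]
best_alpha = min(alphas, key=lambda a: es_mse(crimes, a))
print("best alpha on this grid:", best_alpha, "MSE:", round(es_mse(crimes, best_alpha), 1))
```

With only nine observations, the best α on the grid is sensitive to individual data points, so it is better read as an illustration of the procedure than as a stable estimate.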
Holt-Winters Model
The Holt-Winters forecasting model can be used to forecast series with a trend. The model consists of an exponential smoothing component (E, with smoothing factor w) and a trend component (T, with smoothing factor v), so it uses two different smoothing factors.
Holt-Winters Model
$E_t = w Y_t + (1 - w)(E_{t-1} + T_{t-1})$
$T_t = v (E_t - E_{t-1}) + (1 - v) T_{t-1}$
$F_{t+k} = E_t + k T_t$
where:
F_{t+k} = Forecast value k periods from t
Y_t = Actual value for period t
E_{t-1} = Estimated value for period t-1
T_t = Trend for period t
w = Smoothing constant for estimates
v = Smoothing factor for trend
k = Number of periods
Initialization:
1. E_1 and T_1 are not defined.
2. E_2 = Y_2
3. T_2 = Y_2 - Y_1
Holt-Winters Model
• E_2 = Y_2 and T_2 = (Y_2-Y_1)
• E_12 = $D$1*C14+(1-$D$1)*(D13+E13)
• T_12 = $E$1*(D14-D13)+(1-$E$1)*E13
• F_13 = D14+E14
w = 0.7 (cell D1), v = 0.5 (cell E1)
Row   Month   Sales   E      T      F
3     1       4.8     N/A    N/A
4     2       4.0     4.0    -0.8
5     3       5.5     4.8    0.0    3.2
6     4       15.6    12.4   3.8    4.8
7     5       23.1    21.0   6.2    16.1
8     6       23.3    24.5   4.8    27.2
9     7       31.4    30.8   5.6    29.3
10    8       46.0    43.1   8.9    36.3
11    9       46.1    47.9   6.9    52.1
12    10      41.9    45.8   2.4    54.8
13    11      45.5    46.3   1.4    48.1
14    12      53.5    51.8   3.5    47.7
15    13                            55.24
[Chart: Holt-Winter Forecasting. Sales and F (vertical axis, 0.0 to 60.0) by month.]
Holt-Winters Model
• Set up the E (smoothing component), T (trend component), and F (forecasted values) columns next to Y (actual observations), in that sequence
• Determine initial “w” and “v” values
• Leave E, T, and F blank for the base period (t=1)
• Set E2 = Y2
• Set T2 = Y2-Y1 Note: (F2 is blank)
Holt-Winters Model
• Formulate E3 = w*Y3 + (1-w)*(E2+T2)
• Formulate T3 = v*(E3-E2) + (1-v)*T2
• Formulate F3 = E2 + T2
• Copy the formulas down until reaching one cell further than the last observation (Yn).
• Compute MSE using Y’s and F’s
• Use Solver to determine optimal “w” and “v”.
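The worksheet steps above can be mirrored in Python. This is a sketch that follows the lesson's initialization (E2 = Y2, T2 = Y2 - Y1) and update rules; the function name and the use of dictionaries keyed by period number are choices made here for readability. This two-parameter form has no seasonal term and is often referred to as Holt's linear trend method.

```python
def holt_forecast(y, w, v):
    """Level/trend smoothing with the lesson's initialization.

    y[0] is Y_1. Returns dictionaries E, T, F keyed by the 1-based period number.
    F[t] is the one-step-ahead forecast for period t, made at period t-1.
    """
    n = len(y)
    E = {2: y[1]}          # E_2 = Y_2
    T = {2: y[1] - y[0]}   # T_2 = Y_2 - Y_1
    F = {}
    for t in range(3, n + 1):
        F[t] = E[t - 1] + T[t - 1]
        E[t] = w * y[t - 1] + (1 - w) * (E[t - 1] + T[t - 1])
        T[t] = v * (E[t] - E[t - 1]) + (1 - v) * T[t - 1]
    F[n + 1] = E[n] + T[n]  # forecast one period past the last observation
    return E, T, F


# Sales for months 1..12 (from the lesson's worksheet).
sales = [4.8, 4.0, 5.5, 15.6, 23.1, 23.3, 31.4, 46.0, 46.1, 41.9, 45.5, 53.5]
E, T, F = holt_forecast(sales, w=0.7, v=0.5)
print(round(F[13], 2))  # 55.24, the forecast for month 13
```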
Holt-Winters Model
Solver setup for Holt-Winters:
• Target Cell: MSE (min)
• Changing Cells: w and v
• Constraints: w <= 1, w >= 0, v <= 1, v >= 0
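The same setup can be approximated without Solver. The sketch below replaces Solver with a grid search over w and v in steps of 0.05, both constrained to the range 0 to 1; the grid, the step size, and the helper name `holt_mse` are assumptions made for illustration.

```python
# Sales for months 1..12 (from the lesson's worksheet).
sales = [4.8, 4.0, 5.5, 15.6, 23.1, 23.3, 31.4, 46.0, 46.1, 41.9, 45.5, 53.5]


def holt_mse(y, w, v):
    """MSE of one-step-ahead forecasts F_3 .. F_n, using E_2 = Y_2 and T_2 = Y_2 - Y_1."""
    level, trend = y[1], y[1] - y[0]
    squared_errors = []
    for obs in y[2:]:
        forecast = level + trend
        squared_errors.append((obs - forecast) ** 2)
        new_level = w * obs + (1 - w) * (level + trend)
        trend = v * (new_level - level) + (1 - v) * trend
        level = new_level
    return sum(squared_errors) / len(squared_errors)


# Candidate values 0.00, 0.05, ..., 1.00 satisfy the constraints 0 <= w, v <= 1.
grid = [i / 20 for i in range(21)]
best_w, best_v = min(
    ((w, v) for w in grid for v in grid),
    key=lambda pair: holt_mse(sales, pair[0], pair[1]),
)
print("MSE at w=0.7, v=0.5:", round(holt_mse(sales, 0.7, 0.5), 2))
print("best on grid:", best_w, best_v, "MSE:", round(holt_mse(sales, best_w, best_v), 2))
```

A finer grid or a numerical optimizer would refine the result; the coarse grid is only meant to show the objective and the constraints.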
Forecasting with Crystal Ball
• CBTools / CB Predictor
  – [Input Data] Select Range, First Row, First Column, then Next
  – [Data Attribute] Data is in periods, etc., then Next
  – [Method Gallery] Select All, then Next
  – [Results] Number of periods to forecast [1], Select Past Forecasts at cell, then Run
Forecasting with Crystal Ball
Actual Revenues of Eastman Kodak (Data: EASTMANK)
Year    Actual Revenue
1975    5.0
1976    5.4
1977    6.0
1978    7.0
1979    8.0
1980    9.7
1981    10.3
1982    10.8
1983    10.2
1984    10.6
1985    10.6
1986    11.5
1987    13.3
1988    17.0
1989    18.4
1990    18.9
1991    19.4
1992    20.2
1993    16.3
1994    13.7
1995    15.3
1996    16.2
1997    14.5
1998    13.4
1999    14.1
Forecasting with Crystal Ball
Forecast:
Date    Lower: 5%   Forecast   Upper: 95%
2000    11.9        14.4       17.0

Method Errors:
Rank   Method                         RMSE     MAD      MAPE
Best   Double Exponential Smoothing   1.5043   0.9871   7.68%
2nd    Single Exponential Smoothing   1.5147   1.1566   9.03%
3rd    Single Moving Average          1.5453   1.2042   9.40%
4th    Double Moving Average          2.0855   1.592    11.16%

Method Parameters:
Rank   Method                         Parameter   Value
Best   Double Exponential Smoothing   Alpha       0.999
                                      Beta        0.051
2nd    Single Exponential Smoothing   Alpha       0.999
3rd    Single Moving Average          Periods     1
4th    Double Moving Average          Periods     2

[Chart: Actual Revenue. Data, Fitted, Forecast, Upper: 95%, and Lower: 5% series (vertical axis, 0.0 to 25.0).]
Performance of a Model
Performance of a model is measured by Theil’s U.
The Theil's U statistic usually falls between 0 and 1; a value above 1 means the model forecasts worse than the naive approach.
When U = 0, the predictive performance of the model is excellent; when U = 1, the forecasting performance is no better than just using the last actual observation as a forecast.
Theil’s U versus RMSE
The difference between RMSE (or MAD or MAPE) and Theil’s U is that the former are measures of ‘fit’: they measure how well the model fits the historical data.
The Theil's U on the other hand measures how well the model predicts against a ‘naive’ model. A forecast in a naive model is done by repeating the most recent value of the variable as the next forecasted value.
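The comparison against a naive model can be made concrete with a small function. The slides do not give the formula, so the sketch below assumes one common form of the statistic (often called U2), which compares relative forecast errors with the relative errors of the naive "repeat the last value" forecast; Crystal Ball's exact definition may differ.

```python
import math


def theils_u(actual, forecast):
    """Theil's U in an assumed U2-style form.

    actual[i] and forecast[i] refer to the same period. The first period only
    serves as the starting point of the naive forecast, so forecast[0] is ignored.
    Assumes all actual values are non-zero.
    """
    numerator = 0.0
    denominator = 0.0
    for t in range(1, len(actual)):
        previous = actual[t - 1]
        numerator += ((forecast[t] - actual[t]) / previous) ** 2
        denominator += ((actual[t] - previous) / previous) ** 2
    if denominator == 0.0:
        raise ValueError("the series never changes; U is undefined")
    return math.sqrt(numerator / denominator)


# Crimes for months 50..58 and exponential smoothing forecasts with alpha = 0.7.
crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]
smoothed = [crimes[0], crimes[0]]  # index 0 is unused; forecast for month 51 = actual of month 50
for y in crimes[1:-1]:
    smoothed.append(0.7 * y + 0.3 * smoothed[-1])

naive = [crimes[0]] + crimes[:-1]  # repeat the most recent value
print(theils_u(crimes, naive))     # 1.0 by construction
print(round(theils_u(crimes, smoothed), 3))
```

By construction, the naive forecast scores exactly 1 and a perfect forecast scores 0, which matches the interpretation given in the lesson.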
Choosing Forecasting Model
The forecasting model should be the one with the lowest Theil’s U.
If the best Theil’s U model is not the same as the best RMSE model, then you need to run CB again with only the best Theil’s U model checked to obtain the forecasted value.
P.S. CB reports the forecast of the lowest-RMSE model (the best model according to CB)!
Determining Performance
Theil’s U determines the forecasting performance of the model.
The interpretation in everyday language is as follows: interpret (1 - Theil’s U)
1.00 – 0.80 High (strong) forecasting power
0.80 – 0.60 Moderately high forecasting power
0.60 – 0.40 Moderate forecasting power
0.40 – 0.20 Weak forecasting power
0.20 – 0.00 Very weak forecasting power
Regression or Time Series Forecast
Here is the guiding principle for when to apply regression and when to apply time series forecasting.
• How one thing (the dependent variable) changes as something else (one or more independent variables) changes is a question of directional relationship. For directional relationships we can use regression.
• If the independent variable is TIME (how a variable changes as time changes), then we can use either regression or time series forecasting models.
Explanatory Methods
Simple Linear Regression Model: The simplest inferential forecasting model is the simple linear regression model, where time (t) is the independent variable and the least-squares line is used to forecast the future values of Y_t.
Regression in Forecasting Trends
$F_t = E(Y_t) = \beta_0 + \beta_1 t$
where:
Y_t = Value of trend at time t
β_0 = Intercept of the trend line
β_1 = Slope of the trend line
t = Time (t = 1, 2, . . . )
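The trend line can be fitted with the usual least-squares formulas. The sketch below is a plain-Python illustration with hypothetical function names; the four data points are made up so that they lie exactly on the line 2 + 3t, which makes the result easy to check by hand.

```python
def fit_trend(y):
    """Least-squares fit of Y_t = b0 + b1 * t with t = 1, 2, ..., n."""
    n = len(y)
    times = range(1, n + 1)
    t_mean = (n + 1) / 2
    y_mean = sum(y) / n
    s_ty = sum((t - t_mean) * (value - y_mean) for t, value in zip(times, y))
    s_tt = sum((t - t_mean) ** 2 for t in times)
    b1 = s_ty / s_tt            # slope of the trend line
    b0 = y_mean - b1 * t_mean   # intercept of the trend line
    return b0, b1


def trend_forecast(b0, b1, t):
    """Forecast F_t = b0 + b1 * t."""
    return b0 + b1 * t


# Made-up example: observations for t = 1..4 that follow 2 + 3t exactly.
observations = [5, 8, 11, 14]
b0, b1 = fit_trend(observations)
print(b0, b1)                     # 2.0 3.0
print(trend_forecast(b0, b1, 5))  # 17.0
```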
Regression in Forecasting Seasonality
• Many time series have a distinct seasonal pattern. (For example, room sales are usually highest around summer periods.)
• Multiple regression models can be used to forecast a time series with seasonal components.
• The use of dummy variables for seasonality is common.
  – Dummy variables needed = number of seasons - 1
  – For example: Quarterly Seasonal: 3 dummies are needed; Monthly Seasonal: 11 dummies are needed, etc.
  – The load of each seasonal variable (dummy) is compared to the one that is hidden in the intercept.
Regression in Forecasting Seasonality
$F_t = E(Y_t) = \beta_0 + \beta_1 t + \beta_2 Q_1 + \beta_3 Q_2 + \beta_4 Q_3$
where:
Q_1 = 1 if quarter is 1, = 0 otherwise
Q_2 = 1 if quarter is 2, = 0 otherwise
Q_3 = 1 if quarter is 3, = 0 otherwise
β_2 = the load of Q1 above Q4
β_0 = the overall intercept + the load of Q4
t = Time (t = 1, 2, . . . )
Seasonal Regression
Power Load (MegaWatts)   Year     Q1   Q2   Q3
106.8                    1973.1   1    0    0
89.2                     1973.2   0    2    0
110.7                    1973.3   0    0    3
91.7                     1973.4   0    0    0
108.6                    1974.1   1    0    0
98.9                     1974.2   0    2    0
120.1                    1974.3   0    0    3
102.1                    1974.4   0    0    0
113.1                    1975.1   1    0    0
94.2                     1975.2   0    2    0
120.5                    1975.3   0    0    3
107.4                    1975.4   0    0    0
116.2                    1976.1   1    0    0
104.4                    1976.2   0    2    0
131.7                    1976.3   0    0    3
117.9                    1976.4   0    0    0
[Chart: Seasonal Regression. Predicted Power Load and Actual Power Load (vertical axis, Power, 80.00 to 135.00) by Year/Quarter.]
E(Y_Q1) = -10801.6 + 5.52 * Year.1 + 8.06
E(Y_Q2) = -10801.6 + 5.52 * Year.2 - 3.50
E(Y_Q3) = -10801.6 + 5.52 * Year.3 + 5.51
E(Y_Q4) = -10801.6 + 5.52 * Year.4
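A regression with a time trend and quarterly dummies can be sketched in plain Python by solving the normal equations. Several choices here are assumptions made for illustration: time is indexed as t = 1, 2, ..., 16, the dummies are coded 0/1 as in the definitions of Q1 to Q3, and the function names are hypothetical. Because the worksheet above uses the Year.Quarter value as the time variable and a different dummy coding, the coefficients from this sketch need not match the fitted equations shown above.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def design_row(t, quarter):
    """Regressors [1, t, Q1, Q2, Q3]; quarter 4 is the baseline hidden in the intercept."""
    return [1.0, float(t)] + [1.0 if quarter == q else 0.0 for q in (1, 2, 3)]


def fit_seasonal_trend(y, first_quarter=1):
    """Least-squares estimates of (b0, b1, b2, b3, b4) via the normal equations."""
    rows = [design_row(t, (first_quarter + t - 2) % 4 + 1) for t in range(1, len(y) + 1)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * value for r, value in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Power load for 1973 Q1 .. 1976 Q4, copied from the table above.
power = [106.8, 89.2, 110.7, 91.7, 108.6, 98.9, 120.1, 102.1,
         113.1, 94.2, 120.5, 107.4, 116.2, 104.4, 131.7, 117.9]
coefficients = fit_seasonal_trend(power)
next_row = design_row(17, 1)  # t = 17 falls in quarter 1 of the following year
forecast = sum(c * x for c, x in zip(coefficients, next_row))
print([round(c, 2) for c in coefficients], round(forecast, 1))
```

Each dummy coefficient is read relative to quarter 4, whose effect is absorbed by the intercept.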
Next Lesson
(Lesson - 09) Introduction to Optimization