Lecture 13 Forecasting Methods for Time Series Data


  • Lecture 13: Forecasting Methods for Time Series Data

  • Introduction

    Time-series data are data gathered on a single unit (person, firm, etc.) over a sequence of time periods.

    These time periods may be years, months or any other measure of time.

    Here we assume the data are gathered over only one type of interval (for example, we do not have a mix of quarterly and monthly data).

  • Forecasting is the Goal

    For this type of analysis, the primary goal is to forecast future values of the series.

    Two approaches to this problem are using a causal model or an extrapolative model.

  • Causal Forecasting Models

    A causal model is much like the regression models we have been working with in that the researcher tries to identify predictor variables for the series.

    An example might be to model sales as a function of advertising and competitors' actions, all over time.

    Of course, to forecast with such a model you also have to forecast the future of the predictor series.

  • Extrapolative Forecasting Models

    For these models, the researcher tries to identify patterns of behavior from the series' own history.

    These patterns could include trends and seasonal factors.

    Once identified, the researcher assumes history will repeat itself and extrapolates the patterns into the future.

  • Naive Forecasts

    Baseline forecasting methods are very easy to use and can serve as benchmarks for more elaborate techniques.

    The idea behind them is that a more complicated technique had better outperform them substantially, or it is not worth the effort.

    The simplest of these techniques is the naive method, which forecasts that the next time period will be the same as this one. That is:

    $$\hat{y}_{T+1} = y_T$$

  • Measuring Forecast Accuracy

    Once we have seen how well a forecasting model performed, it is useful to have some summary statistics of the model's accuracy.

    Three commonly used measures of forecast accuracy are:

    1. Mean squared deviation (MSD)
    2. Mean absolute deviation (MAD)
    3. Mean absolute percent error (MAPE)

  • Mean Squared Deviation

    Assume we have n forecasts and the same number of actual values. The forecast error is thus:

    $$(y_i - \hat{y}_i)$$

    The first of our measures computes average squared error:

    $$\mathrm{MSD} = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}$$

  • Absolute Error Measures

    Our second technique uses absolute instead of squared error:

    $$\mathrm{MAD} = \frac{\sum_{i=1}^{n} |y_i - \hat{y}_i|}{n}$$

    Finally, we look at absolute error as a percentage of the series value:

    $$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{y_i} \times 100\%$$
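    The three accuracy measures can be sketched in a few lines of Python. The function names and the sample numbers are made up for illustration. The error is taken as actual minus forecast, which matches the sign of the errors tabulated later in this lecture.

```python
def msd(actual, forecast):
    """Mean squared deviation: average of the squared forecast errors."""
    errors = [y - f for y, f in zip(actual, forecast)]
    return sum(e * e for e in errors) / len(errors)


def mad(actual, forecast):
    """Mean absolute deviation: average of the absolute forecast errors."""
    errors = [y - f for y, f in zip(actual, forecast)]
    return sum(abs(e) for e in errors) / len(errors)


def mape(actual, forecast):
    """Mean absolute percent error, in percent.

    Assumes every actual value is positive (true for a sales series);
    a zero actual value would make the ratio undefined.
    """
    ratios = [abs(y - f) / y for y, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up numbers, purely for illustration.
actual = [10.0, 20.0, 40.0]
forecast = [12.0, 18.0, 50.0]
print(msd(actual, forecast), mad(actual, forecast), mape(actual, forecast))
```

    Because MSD squares each error, the single miss of 10 dominates it, while MAD and MAPE treat that miss in proportion to its size.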

  • Using the Error Measures

    When choosing among two or more techniques, we obviously want to use the one with smallest error.

    Sometimes a technique that is best under one measure is not under another, requiring a judgment call.

    The best measure to use may depend on the type of series being forecast.

  • Moving Averages

    As its name implies, a moving average is just an average of several consecutive observations, computed over a rolling origin.

    An m-period moving average is:

    $$\hat{y}_{t+1} = \frac{1}{m}\left(y_{t-m+1} + y_{t-m+2} + \cdots + y_t\right)$$

  • Example: XYZ Sales

    There are 20 monthly observations for sales of the XYZ Corporation in Table 11.1.

    To help in production planning, they forecast using a 3-month moving average.

    The first forecast is generated for month 4, and is (4 + 13 + 9)/3 = 8.667.
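    A minimal sketch of the m-period moving-average forecast follows. The function name is hypothetical, and the six sales values are the first six months from the XYZ sales table.

```python
def moving_average_forecasts(series, m):
    """Return {period: forecast} for 1-based periods m+1 .. len(series)+1.

    Each forecast is the mean of the m observations just before that period.
    """
    forecasts = {}
    for t in range(m, len(series) + 1):
        window = series[t - m:t]            # observations for periods t-m+1 .. t
        forecasts[t + 1] = sum(window) / m
    return forecasts


# First six months of XYZ sales.
xyz_sales = [4, 13, 9, 11, 10, 3]
fc = moving_average_forecasts(xyz_sales, m=3)
print(round(fc[4], 3))    # forecast for month 4
```

    The last key in the result is one period beyond the data, which is the forecast a planner would actually use.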

  • XYZ Sales, Forecasts and Forecast Errors

    Time Period  SALES  Naive Forecast  Naive FC Error  MovAvg Forecast  MA FC Error
         1          4
         2         13
         3          9
         4         11          9               2              8.667          2.333
         5         10         11              -1             11.000         -1.000
         6          3         10              -7             10.000         -7.000
         7         15          3              12              8.000          7.000
         8          4         15             -11              9.333         -5.333
         9          4          4               0              7.333         -3.333
        10          9          4               5              7.667          1.333
        11          5          9              -4              5.667         -0.667
        12          7          5               2              6.000          1.000
        13          8          7               1              7.000          1.000
        14          8          8               0              6.667          1.333
        15         18          8              10              7.667         10.333
        16          8         18             -10             11.333         -3.333
        17          5          8              -3             11.333         -6.333
        18         15          5              10             10.333          4.667
        19          5         15             -10              9.333         -4.333
        20          7          5               2              8.333         -1.333

  • Forecast Errors

    The table also lists the naive model forecast for the same months, plus the forecast errors for both methods.

    Examination of the errors shows that the naive method was better in 5 months and tied in 3, so the moving average is not clearly better.

    The forecasts for both methods are on the next two graphs.

  • Figure: Naive Method Forecasts

    Time series plot of actual and predicted SALES against Time. Moving average length: 1. MSD: 45.7647, MAD: 5.2941, MAPE: 78.1727.

  • Figure: 3-Month Moving Average Forecasts

    Time series plot of actual and predicted SALES against Time. Moving average length: 3. MSD: 20.6078, MAD: 3.6275, MAPE: 56.5909.

  • Accuracy Statistics

    Although the moving average provided a poorer forecast on some occasions, on an overall basis it was superior.

    Method    MSD       MAD      MAPE
    Naive     45.765    5.294    78.173
    MA-3      20.608    3.627    56.591
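    The accuracy statistics for both methods can be recomputed from the 20 sales values. The slides do not say which months enter the averages. This sketch assumes months 4 through 20 for both methods, the months for which a 3-month moving average exists. The variable and function names are illustrative.

```python
sales = [4, 13, 9, 11, 10, 3, 15, 4, 4, 9, 5, 7, 8, 8, 18, 8, 5, 15, 5, 7]


def accuracy(actual, forecast):
    """Return (MSD, MAD, MAPE in percent) for paired actuals and forecasts."""
    errors = [y - f for y, f in zip(actual, forecast)]
    n = len(errors)
    msd = sum(e * e for e in errors) / n
    mad = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e) / y for e, y in zip(errors, actual)) / n
    return msd, mad, mape


# Assumed scoring window: months 4..20, i.e. list indices 3..19.
idx = range(3, len(sales))
actual = [sales[i] for i in idx]
naive = [sales[i - 1] for i in idx]               # last month's value
ma3 = [sum(sales[i - 3:i]) / 3 for i in idx]      # mean of the prior 3 months

for name, fc in [("Naive", naive), ("MA-3", ma3)]:
    msd, mad, mape = accuracy(actual, fc)
    print(f"{name:6s} MSD={msd:.3f} MAD={mad:.3f} MAPE={mape:.3f}")
```

    Scoring both methods on the same 17 months keeps the comparison fair, since the naive method could otherwise also be scored on months 2 and 3.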

  • Exponential Smoothing

    Although the moving average was clearly better than the naive approach, its errors were still over 50% of the actual sales.

    The technique assigned an equal weight to each of the three months, and it may have been better to assign more weight to the most recent month.

    Exponential smoothing models are a special type of moving average that does just that.

  • Single Exponential Smoothing (SES)

    This technique is best applied to a series with no trend or seasonality.

    Its forecast is a weighted average of all past data, with the largest weight on the most recent observation and then declining exponentially.

    $$\hat{y}_{T+1} = \alpha y_T + \alpha(1-\alpha) y_{T-1} + \alpha(1-\alpha)^2 y_{T-2} + \cdots + \alpha(1-\alpha)^{T-1} y_1$$

  • Choices for Weight

    The smoothing constant should be between 0 and 1.

    If it is close to 1 the model is more responsive to recent changes.

    If it is closer to 0 it tends to give a smoother forecast because more weight is given to past observations.

    Many programs have a way to choose an "optimal" weight.

  • Alternative Representation

    An alternative representation of the model is:

    The next forecast is a weighted average of the most recent observation and the most recent forecast.

    The recursive nature of the equation makes it easier to use for hand calculation or in a spreadsheet.

    $$\hat{y}_{T+1} = \alpha y_T + (1 - \alpha)\,\hat{y}_T$$
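    The link between the recursive form and the weighted-average form can be checked numerically. In this sketch the function names are hypothetical, and the starting forecast is an assumption supplied by the caller. Unrolling the recursion gives the exponentially weighted sum plus a leftover weight of (1 - alpha)^T on that starting forecast, which shrinks toward zero as the series gets longer.

```python
def ses_recursive(series, alpha, initial_forecast):
    """Forecast for the period after the last observation, by recursion."""
    forecast = initial_forecast
    for y in series:
        forecast = alpha * y + (1 - alpha) * forecast
    return forecast


def ses_weighted(series, alpha, initial_forecast):
    """The same forecast written as an explicit weighted sum."""
    T = len(series)
    total = 0.0
    for k, y in enumerate(reversed(series)):     # k = 0 is the newest value
        total += alpha * (1 - alpha) ** k * y
    # Leftover weight (1 - alpha)^T falls on the starting forecast.
    return total + (1 - alpha) ** T * initial_forecast


data = [4, 13, 9, 11, 10]                        # illustrative values
a = ses_recursive(data, alpha=0.3, initial_forecast=data[0])
b = ses_weighted(data, alpha=0.3, initial_forecast=data[0])
print(a, b)
```

    The recursion needs only the previous forecast and the newest observation, which is why it suits hand calculation or a spreadsheet.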

  • Compact Data Storage

    One feature of SES is that it requires very little data to be maintained in order to use it.

    For each series, all you really need to keep from month to month is the forecast you just made and the smoothing constant $\alpha$.

    This makes the technique a favorite of people who need to make a large number of forecasts every month.

  • Example: XYZ Sales

    For now, let's use $\alpha = .3$ as the smoothing constant. The first forecast we can make is for t=2:

    $$\hat{y}_2 = .3\,y_1 + (1 - .3)\,\hat{y}_1$$

    We have no forecast for the first period, so there is an initialization problem.

    Here we just take:

    $$\hat{y}_1 = y_1$$

  • Forecast and Accuracy

    With these assumptions, the forecast for period 2 is just $y_1 = 4$.

    Proceeding from there, we next get:

    $$\hat{y}_3 = .3\,y_2 + (1 - .3)\,\hat{y}_2 = .3(13) + .7(4) = 6.7$$

    Table 2 shows all of the forecasts and forecast errors. These can be used to compute MSD = 22.858, MAD = 3.949 and MAPE = 58.813%.

  • Table 2: Single Exponential Smoothing Forecasts and Errors

    Time Period  SALES  Exp. Smth. Forecast  Forecast Error
         1          4
         2         13          4.00               9.00
         3          9          6.70               2.30
         4         11          7.39               3.61
         5         10          8.47               1.53
         6          3          8.93              -5.93
         7         15          7.15               7.85
         8          4          9.51              -5.51
         9          4          7.85              -3.85
        10          9          6.70               2.30
        11          5          7.39              -2.39
        12          7          6.67               0.33
        13          8          6.77               1.23
        14          8          7.14               0.86
        15         18          7.40              10.60
        16          8         10.58              -2.58
        17          5          9.80              -4.80
        18         15          8.36               6.64
        19          5         10.35              -5.35
        20          7          8.75              -1.75
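    The SES forecasts in Table 2 can be regenerated with the recursion and a smoothing constant of 0.3. One assumption is needed for the accuracy figures: the slides do not state the scoring window, and this sketch scores periods 4 through 20, the same months used for the moving-average comparison. Variable names are illustrative.

```python
sales = [4, 13, 9, 11, 10, 3, 15, 4, 4, 9, 5, 7, 8, 8, 18, 8, 5, 15, 5, 7]
alpha = 0.3

# forecasts[i] is the forecast for period i + 1 (periods are 1-based).
forecasts = [sales[0]]                       # initialization: forecast 1 = y_1
for y in sales[:-1]:
    forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])

errors = [y - f for y, f in zip(sales, forecasts)]

# Assumed scoring window: periods 4..20, i.e. list indices 3..19.
window = range(3, len(sales))
n = len(window)
msd = sum(errors[i] ** 2 for i in window) / n
mad = sum(abs(errors[i]) for i in window) / n
mape = 100 * sum(abs(errors[i]) / sales[i] for i in window) / n

print(round(forecasts[2], 2), round(msd, 3), round(mape, 3))
```

    Including periods 2 and 3 in the averages would raise the MSD noticeably, because the period-2 error of 9 is one of the largest in the table.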

  • A Better Choice of α?

    Because we made no attempt at all to find the best smoothing constant, we might be able to do better.

    If doing this in a spreadsheet, you could try values from .1 to .9 and pick the value with the best accuracy statistics.

    Minitab finds the best constant (that minimizes MSD) and also has a better way of handling the initial conditions.
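    The spreadsheet-style search over .1 to .9 can be sketched as a small grid search that minimizes MSD. This is not Minitab's procedure. It keeps the simple initialization used in the XYZ example (first forecast equals the first observation) and assumes the same scoring window of periods 4 through 20. The function name is hypothetical.

```python
sales = [4, 13, 9, 11, 10, 3, 15, 4, 4, 9, 5, 7, 8, 8, 18, 8, 5, 15, 5, 7]


def ses_msd(series, alpha, first_scored=3):
    """MSD of one-step SES forecasts, scored from index first_scored onward."""
    forecast = series[0]                     # forecast for period 1 = y_1
    sq_errors = []
    for i, y in enumerate(series):
        if i >= first_scored:
            sq_errors.append((y - forecast) ** 2)
        forecast = alpha * y + (1 - alpha) * forecast   # forecast for next period
    return sum(sq_errors) / len(sq_errors)


grid = [round(0.1 * k, 1) for k in range(1, 10)]        # 0.1, 0.2, ..., 0.9
scores = {a: ses_msd(sales, a) for a in grid}
best_alpha = min(scores, key=scores.get)
print(best_alpha, round(scores[best_alpha], 3))
```

    A finer grid, or a different treatment of the starting forecast, can move the chosen constant, so the result of a coarse search like this need not agree with a package's optimizer.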

  • Minitab Output Using α = .065

    Figure: Single Exponential Smoothing. Time series plot of actual and predicted SALES against Time. Smoothing constant alpha: 0.065. MSD: 18.2536, MAD: 3.3833, MAPE: 48.3945.

  • A Hard Series to Forecast

    Minitab's optimal constant was $\alpha = .065$, which produced the minimum MSD of 18.254.

    Although this is the best forecast yet, we note that MAPE is still pretty high at 48.4%.

    It turns out that this series is pretty difficult to work with because it fluctuates up and down so much.

  • Double Exponential Smoothing

    If you apply SES to a series with an upward trend it will consistently underestimate yt because it will never "catch up" to the rise.

    Double exponential smoothing (DES) is an extension of SES that assumes there is a trend in the series.

    It uses a pair of smoothing equations.

  • The DES Equations

    First the level of the series is smoothed:

    $$L_t = \alpha y_t + (1 - \alpha)(L_{t-1} + T_{t-1})$$

    Then the trend in the series is smoothed:

    $$T_t = \gamma (L_t - L_{t-1}) + (1 - \gamma) T_{t-1}$$

    Finally the forecast is generated from:

    $$\hat{y}_{t+m} = L_t + m T_t$$

  • Notes on the DES Equations

    The forecast can be generated for any number of periods (m) ahead.

    The 1-step-ahead forecast is $L_t + T_t$, which says we expect the series to go up by one unit of trend from the current level.

    The $L_t$ equation is just a weighted average of the current observation and the previous 1-step-ahead forecast (as in SES).

    The $T_t$ equation is a weighted average of the previous trend and the change in level $(L_t - L_{t-1})$.

  • Choice of Constants

    Both $\alpha$ and $\gamma$ are usually between 0 and 1. The trend constant $\gamma$ is usually fairly small because you don't want the trend to change too rapidly.

    We also have to provide initial values for both L and T. A common practice is to estimate them from the first 6 observations.
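    The two DES smoothing equations and the forecast equation can be sketched as below, with alpha as the level constant and gamma as the trend constant. The function name and the test series are made up. The starting level and trend are passed in by the caller rather than estimated from the first 6 observations, which is a simplifying assumption of this sketch.

```python
def des_forecast(series, alpha, gamma, level0, trend0, horizon=1):
    """Double exponential smoothing; returns forecasts 1..horizon steps ahead.

    level0 and trend0 are the starting level and trend, i.e. the values
    in force just before the first observation.
    """
    level, trend = level0, trend0
    for y in series:
        prev_level = level
        # Level: blend the new observation with the previous 1-step forecast.
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # Trend: blend the latest change in level with the previous trend.
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
    return [level + m * trend for m in range(1, horizon + 1)]


# Hypothetical series that rises by exactly 2 each period.
series = [10, 12, 14, 16, 18]
fc = des_forecast(series, alpha=0.5, gamma=0.2, level0=8, trend0=2, horizon=3)
print([round(f, 6) for f in fc])
```

    On a perfectly linear series with matching starting values the smoothed level tracks the data exactly, so the forecasts simply continue the line. That is the behaviour SES cannot deliver on trending data.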

  • Example: New Construction

    A contractor wants a forecast of new construction in 2002 and 2003, and has 11 years of data on US construction to use in the analysis.

    First, we model the series with SES. Note that the model usually underforecasts because the smoothing equation tends to lag one period behind the upward trend.

    Also note the flat forecast. This does not look correct as construction increased every year over the 1991-2001 period.

  • Figure: SES Forecasts For New Construction

    Time series plot of actual, predicted and forecast NEWCON against Time. Smoothing constant alpha: 1.558. MSD: 1007.11, MAD: 25.66, MAPE: 4.20.

  • Applying DES

    Next we model the series with DES, using the "optimal" constants selected via Minitab.

    This works better. The fit no longer lags behind. The MAD and MAPE are about half what they were with SES.

    The forecast is for the upward trend to continue, which looks correct.

  • Figure: DES Forecasts For New Construction

    Time series plot of actual, predicted and forecast NEWCON against Time. Smoothing constants: Alpha (level) 0.783, Gamma (trend) 0.226. MSD: 214.709, MAD: 13.030, MAPE: 2.029.

  • Winters' Exponential Smoothing

    For series that have seasonality, neither SES nor DES will work well because neither can handle it.

    Winters' exponential smoothing method adds a seasonal adjustment mechanism and a third equation.

    There are both additive and multiplicative versions.

  • The Seasonals

    These are a set of numbers that reflect how the series, on a fairly regular basis, tends to increase or decrease at certain calendar periods.

    For example, in a quarterly (s=4) retail series we might see sales typically 25% higher than the rest of the year during the 4th quarter.

    Multiplicative model seasonal factors might look like (.85, .93, .96, 1.25) over four consecutive quarters.

    In an additive model this would look different; perhaps (-100, -85, -10, +160).

  • Winters' Method, Multiplicative Version

    A seasonal factor joins the level equation:

    $$L_t = \alpha \frac{y_t}{S_{t-s}} + (1 - \alpha)(L_{t-1} + T_{t-1})$$

    The trend is the same as in DES:

    $$T_t = \gamma (L_t - L_{t-1}) + (1 - \gamma) T_{t-1}$$

    The seasonal factors are smoothed:

    $$S_t = \delta \frac{y_t}{L_t} + (1 - \delta) S_{t-s}$$

    Finally the forecast is generated from:

    $$\hat{y}_{t+m} = (L_t + m T_t)\, S_{t-s+m}$$
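    A minimal sketch of the multiplicative version follows, using alpha, gamma and delta for the level, trend and seasonal constants (the names Minitab prints in its output). The function name, the starting values and the test series are assumptions made for illustration. In particular, the starting level, trend and seasonal factors are supplied by the caller rather than estimated from the data.

```python
def winters_multiplicative(series, s, alpha, gamma, delta,
                           level0, trend0, seasonals0, horizon):
    """Multiplicative Winters smoothing; returns forecasts 1..horizon ahead.

    seasonals0 lists the s starting seasonal factors in calendar order,
    standing in for the season just before the data begin.
    """
    level, trend = level0, trend0
    seasonals = list(seasonals0)      # slot t % s holds the factor from s periods ago
    for t, y in enumerate(series):
        old_s = seasonals[t % s]
        prev_level = level
        level = alpha * y / old_s + (1 - alpha) * (prev_level + trend)
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
        seasonals[t % s] = delta * y / level + (1 - delta) * old_s
    n = len(series)
    # Each forecast reuses the most recent factor for its own season.
    return [(level + m * trend) * seasonals[(n - 1 + m) % s]
            for m in range(1, horizon + 1)]


# Hypothetical quarterly series: level 100 rising by 5 per quarter, scaled by
# the illustrative seasonal factors (.85, .93, .96, 1.25).
factors = [0.85, 0.93, 0.96, 1.25]
series = [(100 + 5 * (t + 1)) * factors[t % 4] for t in range(8)]
forecasts = winters_multiplicative(series, s=4, alpha=0.2, gamma=0.2, delta=0.2,
                                   level0=100, trend0=5, seasonals0=factors,
                                   horizon=4)
print([round(f, 2) for f in forecasts])
```

    Because the test series is built exactly from a linear level times fixed factors, every smoothing step leaves the estimates unchanged and the forecasts continue the pattern. Real data would move the estimates at each step.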

  • Multiplicative Versus Additive

    Note that the term $(L_t + m T_t)$ is the same as it was in DES. This is a forecast of a non-seasonal series.

    The model is called multiplicative because in the forecast, the term above is multiplied by the seasonal factor $S_{t-s+m}$.

    In the additive form of the model, the seasonal is added to $(L_t + m T_t)$.

    If the seasonality tends to be a constant amount over time, use the additive form. Use the multiplicative form if the seasonal fluctuation seems to increase with the series level.

  • Winters' Method, Additive Version

    The seasonal factor is subtracted out:

    $$L_t = \alpha (y_t - S_{t-s}) + (1 - \alpha)(L_{t-1} + T_{t-1})$$

    The trend is the same as before:

    $$T_t = \gamma (L_t - L_{t-1}) + (1 - \gamma) T_{t-1}$$

    The seasonal factors are smoothed differently:

    $$S_t = \delta (y_t - L_t) + (1 - \delta) S_{t-s}$$

    The forecast has the seasonal added back in:

    $$\hat{y}_{t+m} = L_t + m T_t + S_{t-s+m}$$
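    The additive version differs from the multiplicative one only in replacing division and multiplication by subtraction and addition. The sketch below uses alpha, gamma and delta for the level, trend and seasonal constants. The function name, the caller-supplied starting values and the test series are all hypothetical.

```python
def winters_additive(series, s, alpha, gamma, delta,
                     level0, trend0, seasonals0, horizon):
    """Additive Winters smoothing; returns forecasts 1..horizon steps ahead.

    seasonals0 lists the s starting seasonal offsets in calendar order.
    """
    level, trend = level0, trend0
    seasonals = list(seasonals0)      # slot t % s holds the offset from s periods ago
    for t, y in enumerate(series):
        old_s = seasonals[t % s]
        prev_level = level
        level = alpha * (y - old_s) + (1 - alpha) * (prev_level + trend)
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
        seasonals[t % s] = delta * (y - level) + (1 - delta) * old_s
    n = len(series)
    # The seasonal offset is added back to the trend projection.
    return [level + m * trend + seasonals[(n - 1 + m) % s]
            for m in range(1, horizon + 1)]


# Hypothetical quarterly series: level 200 rising by 3 per quarter, plus
# made-up seasonal offsets that sum to zero.
offsets = [15, -13, -15, 13]
series = [200 + 3 * (t + 1) + offsets[t % 4] for t in range(12)]
forecasts = winters_additive(series, s=4, alpha=0.2, gamma=0.2, delta=0.2,
                             level0=200, trend0=3, seasonals0=offsets,
                             horizon=4)
print([round(f, 2) for f in forecasts])
```

    Here the seasonal swing stays the same size as the level rises, which is the situation the additive form is meant for.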

  • Example: ABX Company Sales

    We have seen the sales of this winter sports merchandise company in Chapters 3 and 7.

    Quarterly sales (in $1000s) are modeled over the period from 1994 through 2003.

    The seasonal swings seem to be constant over the years, implying we should use the additive model.

  • Figure: ABX Company Sales

    Time series plot of SALES against Date/Time. Seasonal swings do not increase with level. Use additive form.

  • Choice of Constants

    Now we have to supply three constants, one for each smoothing equation.

    Minitab defaults to .2 for all three and unfortunately does not have an "optimize" option here.

    Usually the trend and seasonal constants are small, but you might have to try several sets of values.

  • Figure: Winters' Method Using Default Constants

    Time series plot of actual and predicted SALES against Time. Smoothing constants: Alpha (level) 0.200, Gamma (trend) 0.200, Delta (season) 0.200. MSD: 68.9938, MAD: 6.9703, MAPE: 2.9101.

  • Example: Electrical and Appliance Sales

    Monthly sales amounts (in millions) from electrical and appliance stores are in the file ELECTAPP11.

    The data span from January of 1991 through December of 2002.

    This time the seasonal swings increase as the level goes up, so the multiplicative model should work.

  • Figure: Electrical and Appliance Sales, 1993-2002

    Time series plot of Sales against Index. Seasonal swing increases with level. Use multiplicative.

  • Multiplicative Better

    The next two slides show the Minitab analyses of the two forms of the model, using the default value of .2 for all the constants.

    The MAPE for the additive model is 4, higher than the multiplicative's 3.1.

    The MAD for additive is 247, higher than the other's 168.8.

  • Figure: Winters' Multiplicative Model Fit to Sales

    Time series plot of actual and predicted Sales against Time. Smoothing constants: Alpha (level) 0.200, Gamma (trend) 0.200, Delta (season) 0.200. MSD: 66091.5, MAD: 168.8, MAPE: 3.1.

  • Figure: Winters' Additive Model Fit to Sales

    Time series plot of actual and predicted Sales against Time. Smoothing constants: Alpha (level) 0.200, Gamma (trend) 0.200, Delta (season) 0.200. MSD: 147681, MAD: 247, MAPE: 4.

  • Decomposition

    Winters' method can be applied to a wide variety of series and its three smoothing equations can respond to many different types of behavior.

    Unfortunately, it can be a lot of work searching for the best smoothing constants.

    Decomposition models have the same components as Winters' method, but do not require such effort.

  • Component Models

    The series is viewed as either the sum or product of three components: trend, seasonal and error.

    The additive form is:

    $$y_t = T_t + S_t + E_t$$

    The multiplicative counterpart is:

    $$y_t = T_t \times S_t \times E_t$$

  • A Simplifying Assumption

    The main difference between Winters' method and decomposition is that here we assume that the trend and seasonal functions are constant over time.

    For example, for quarterly data, we assume $S_1 = S_5 = S_9$ and so on.

    The trend and seasonal components are estimated after a series of filtering steps.

  • Step 1: Smooth out seasonality

    We want to compute a centered moving average for each data point.

    For the ABX sales data, the first moving average we can compute is:

    (221 + 203.5 + 190 + 225.5)/4 = 210

    The middle of this time interval is in between the second and third quarters, which is not one of our time periods.

  • Centered Moving Average

    To correctly locate things, we get the next moving average using data from periods 2 through 5:

    (203.5 + 190 + 225.5 + 223)/4 = 210.5

    The time location for this one is between quarters 3 and 4.

    We average this one with the previous one to center the moving average on the third data point:

    $$CMA_3 = (210 + 210.5)/2 = 210.25$$
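    The centering step can be sketched for any even season length. The function name is hypothetical, and the five values are the first five ABX quarters quoted in the two moving-average calculations.

```python
def centered_moving_average(series, s):
    """Centered moving averages for an even season length s.

    Returns {period: CMA} with 1-based periods. The first s/2 and last
    s/2 periods get no CMA because the window would run off the data.
    """
    if s % 2 != 0:
        raise ValueError("this sketch handles even season lengths only")
    half = s // 2
    cma = {}
    for i in range(half, len(series) - half):            # 0-based centre index
        early = sum(series[i - half:i + half]) / s           # centred half a period early
        late = sum(series[i - half + 1:i + half + 1]) / s    # centred half a period late
        cma[i + 1] = (early + late) / 2
    return cma


abx_first_five = [221, 203.5, 190, 225.5, 223]
print(centered_moving_average(abx_first_five, s=4))
```

    With only five observations there is a single centered value, for period 3. On the full 40-quarter series the same function would cover periods 3 through 38.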

  • Step 2: Remove the trend

    Once we have a CMA associated with each data point, we can take the trend out of our data by:

    $$\mathrm{DeTrend}_t = y_t - CMA_t$$

    With the trend removed, our series now consists of seasonality and error.

    Step 3 estimates the seasonals.

  • Step 3: Find the seasonal indices

    We now want to find out what kind of seasonality occurs in each quarter.

    On the next slide we show the detrended data arrayed by quarter and year.

    As a measure of "typical seasonal activity" we compute the median for each quarter.

  • Computation of Seasonal Factors

    Year           1st Qtr    2nd Qtr    3rd Qtr    4th Qtr
    1994                                 -20.250     16.688
    1995            13.875    -21.250     -7.000      8.875
    1996            14.813     -9.063    -15.063     11.875
    1997            19.563    -29.563     -4.188     14.250
    1998            12.375     -6.438    -16.938      8.875
    1999            10.125    -12.563     -7.063      8.188
    2000            22.750    -13.063    -27.000     18.813
    2001            19.813    -20.688    -10.375     16.125
    2002            12.438    -13.438    -15.688     11.563
    2003            14.813    -10.813

    Medians         14.813    -13.063    -15.063     11.875
    Adj Medians     15.173    -12.704    -14.704     12.235

  • Adjusting the medians

    If you add up the four medians, their sum is -1.438.

    We would like the four seasonal indices to sum to zero, so we subtract one fourth of this sum (about -0.36) from each median, which raises each median by about 0.36.

    The adjusted medians now sum to zero and will be used as our seasonal indices.
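    The medians and their adjustment can be recomputed from the detrended values in the seasonal factor table. The variable names are illustrative. Because the tabulated values are rounded to three decimals, the results agree with the slide only to within rounding.

```python
from statistics import median

# Detrended ABX values by quarter, copied from the seasonal factor table.
detrended = {
    1: [13.875, 14.813, 19.563, 12.375, 10.125, 22.750, 19.813, 12.438, 14.813],
    2: [-21.250, -9.063, -29.563, -6.438, -12.563, -13.063, -20.688, -13.438, -10.813],
    3: [-20.250, -7.000, -15.063, -4.188, -16.938, -7.063, -27.000, -10.375, -15.688],
    4: [16.688, 8.875, 11.875, 14.250, 8.875, 8.188, 18.813, 16.125, 11.563],
}

medians = {q: median(values) for q, values in detrended.items()}

# Spread the non-zero sum evenly so the adjusted indices sum to zero.
offset = sum(medians.values()) / len(medians)
seasonal_indices = {q: m - offset for q, m in medians.items()}

for q in sorted(seasonal_indices):
    print(q, round(medians[q], 3), round(seasonal_indices[q], 4))
```

    The median is less sensitive than the mean to an unusual quarter such as the -29.563 in the second quarter of 1997.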

  • Step 4: Deseasonalize the original data

    Now go back to the original data and adjust it by subtracting out the seasonal factor. For example:

    $$SA_1 = y_1 - S_1 = 221 - 15.172 = 205.828$$

    $$SA_5 = y_5 - S_5 = 223 - 15.172 = 207.828$$

    Note that the same seasonal is used for both calculations because they are both first quarters.

    The resulting SA series contains only trend and error. We fit a trend regression to this data using the same method we used in Chapter 3.

  • Step 5: Fit a trend line

    We create a time index variable that ranges from 1 in first quarter 1994 through 40 in fourth quarter 2003.

    We then regress the SA series on this index and obtain the equation:

    $$T_t = 198.8094 + 2.566006\,t$$

    Note that we could have used one of the other trend functions mentioned in Chapter 5 if things were not linear.

  • Step 6: Compute the fits

    We now have both the trend and seasonals so can now obtain fitted values for each time point in our sample.

    We can then get the fit error and compute MAD, MAPE and MSD to compare with other techniques.

  • Step 7: Forecast

    To forecast ahead, we just extend the trend line and apply the seasonal factor.

    For fourth quarter 2004, the computations are:

    $$T_{44} = 198.8094 + 2.566006(44) = 311.714$$

    $$\hat{y}_{44} = 311.714 + 12.234 = 323.948$$
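    The forecast step amounts to evaluating the trend line and adding the seasonal index for the target quarter. In this sketch the function names are hypothetical. The coefficients are the fitted trend from Step 5, and the seasonal index 12.234 is the fourth-quarter value used in the computation above.

```python
def quarter_of(t):
    """Quarter (1..4) for a 1-based time index, where t = 1 is a first quarter."""
    return (t - 1) % 4 + 1


def additive_forecast(t, seasonal_index, intercept=198.8094, slope=2.566006):
    """Trend value at time index t plus the seasonal index for that quarter."""
    trend = intercept + slope * t
    return trend + seasonal_index


t = 44                                   # fourth quarter of 2004
print(quarter_of(t), round(additive_forecast(t, 12.234), 3))
```

    For a multiplicative decomposition the last line of the function would multiply the trend by the seasonal factor instead of adding an index.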

  • Minitab Decomposition

    On the next slide, we use the Minitab decomposition routine to model the ABX sales.

    The fit MAPE is 2.2003 and the fit MAD is 5.3111, both better than the Winters' model with the default constants.

  • Figure: Additive Decomposition for ABX Sales

    Time series plot of actual and predicted SALES against Time. MSD: 46.0583, MAD: 5.2962, MAPE: 2.1958.

  • Example: Electrical and Appliance Sales

    A multiplicative model should work better for this series because of the increasing seasonal swings.

    The filtering is a little different in Step 2 and Step 4 because of the multiplicative form.

    The decomposition model fit (next slide) did not work as well as the Winters' model, with larger MAD, MAPE and MSD.

  • Figure: Multiplicative Decomposition for Sales

    Time series plot of actual and predicted Sales against Time. MSD: 93237.7, MAD: 239.7, MAPE: 4.3.