The Islamic University of Gaza
Faculty of Science
Department of Mathematics
Electricity Consumption Forecasting in the Khan Younis Province
Using Exponential Smoothing and Box-Jenkins Methods:
A Modeling Viewpoint
Submitted By
RANA MAHMOUED ABU AL RISH
Supervised By
Dr. Bisher M. Iqelan
A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS
FOR THE DEGREE OF MASTER OF MATHEMATICS
June, 2015
To my parents...
To my son Ryad...
To my husband Ahmed...
And to all knowledge seekers...
Contents
Acknowledgments ix
Abbreviation x
Abstract 1
Literature Review 2
Introduction 3
I 5
1 Introduction 6
1.1 Examples Of Time Series . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Properties of Time series . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Stationary Time Series . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Box-Jenkins methodology 13
2.1 Models for Stationary Time Series . . . . . . . . . . . . . . . . . . . . 13
2.1.1 General Linear Processes . . . . . . . . . . . . . . . . . . . . . . 13
2.1.2 Autoregressive Process . . . . . . . . . . . . . . . . . . . . . . . 14
2.1.3 Moving Average Processes . . . . . . . . . . . . . . . . . . . . . 16
2.1.4 Autoregressive Moving Average Model . . . . . . . . . . . . . 17
2.2 Models for non Stationary Time Series . . . . . . . . . . . . . . . . . 21
2.2.1 Multiplicative Seasonal ARIMA Models . . . . . . . . . . . . 22
2.3 Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3.1 Model Identification . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3.2 Parameter Estimation of the SARIMA Model . . . . . . . . . 27
2.3.3 Diagnostics Checking Of The Fitted Model . . . . . . . . . . 27
2.3.4 Forecasting the study variable . . . . . . . . . . . . . . . . . . . . . 28
3 Exponential Smoothing 29
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.1.1 Classification of Exponential Smoothing Methods . . . . . . 30
3.1.2 Point Forecasts for the Best-Known Methods . . . . . . . . . 31
3.2 Simple Exponential Smoothing (N,N Method) . . . . . . . . . . . . . . . . 31
3.3 Holt Linear Method (A,N Method) . . . . . . . . . . . . . . . . . . . 34
3.4 Damped Trend Method (Ad, A Method) . . . . . . . . . . . . . . . . 35
3.4.1 Additive damped trend . . . . . . . . . . . . . . . . . . . . . . . 35
3.5 Holt-Winters Trend and Seasonality Method . . . . . . . . . . . . . 36
3.5.1 Additive Seasonality (A,A Method) . . . . . . . . . . . . . . . 37
3.6 General Point Forecasting Equations . . . . . . . . . . . . . . . . . . 38
3.7 Innovations state space models for exponential smoothing . . . . . 39
3.7.1 ETS(A,N,N): simple exponential smoothing with additive
errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.7.2 ETS(M,N,N): simple exponential smoothing with multi-
plicative errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.7.3 State Space Models for Holts Linear Method . . . . . . . . . 41
3.7.4 State Space Models for All Exponential Smoothing Methods 42
3.8 Initialization and Estimation . . . . . . . . . . . . . . . . . . . . . . . 44
3.8.1 Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.8.2 Estimation and model selection . . . . . . . . . . . . . . . . . . . . 45
3.9 Error Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
II Case Study 49
4 Analysis Data Using Box-Jenkins Method 50
4.1 Data Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.1.1 The Box-Jenkins Approach to Fitting ARIMA Model: . . . . . . . 51
4.2 Model Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5 Analysis Data Using Exponential Smoothing Methods 58
5.1 Exponential Smoothing Models . . . . . . . . . . . . . . . . . . . . . . . . 58
5.1.1 First Method : Simple Exponential Smoothing Model . . . 58
5.1.2 Second Method: Holt's Linear Trend Method . . . . . . . . . 61
5.1.3 Damped trend methods . . . . . . . . . . . . . . . . . . . . . . 63
5.1.4 Holt-Winters seasonal method . . . . . . . . . . . . . . . . . . 66
5.2 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
CONCLUSIONS 73
Recommendations 74
References 75
Appendix 77
List of Figures
1.1 Average Monthly Temperatures, Dubuque, Iowa . . . . . . . . . . . . . . . 7
1.2 Monthly Temperatures, Dubuque, Iowa . . . . . . . . . . . . . . . . . . . . 8
2.1 Simulated AR(1) process with φ = 0.9 . . . . . . . . . . . . . . . . . . . . 15
2.2 Simulated MA(1) process with θ = 0.9 . . . . . . . . . . . . . . . . . . . . 17
3.1 Oil production in Saudi Arabia from 1996 to 2007 . . . . . . . . . . . . . . 34
4.1 Time series plot of monthly electricity consumption in the Khan Younis
province . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2 Acf for monthly electricity consumption . . . . . . . . . . . . . . . . . . . 52
4.3 Pacf for monthly electricity consumption . . . . . . . . . . . . . . . . . . . 53
4.4 First difference of monthly electricity consumption . . . . . . . . . . . . . 54
4.5 Residuals from the fitted ARIMA(2, 1, 2)(1, 0, 1)12 model . . . . . . . . . . 55
4.6 Forecasts for monthly electricity consumption . . . . . . . . . . . . . . . . 56
5.1 Simple exponential smoothing applied to electricity consumption in province
Khan Younis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2 Forecasts from simple exponential smoothing . . . . . . . . . . . . . . . . . 61
5.3 Forecasts from Holt’s linear method . . . . . . . . . . . . . . . . . . . . . . 63
5.4 Forecasts from damped Holt's method with exponential trend . . . . . . . 65
5.5 Forecasting electricity data using Holt-Winters method with both additive
and multiplicative seasonality. . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.6 Estimated components for Holt-Winters method with additive and multi-
plicative seasonal components . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.7 Forecasting data using multiplicative seasonal components . . . . . . . . . 70
List of Tables
2.1 Behavior of the ACF and the PACF for ARMA Models . . . . . . . . . . 18
4.1 Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2 P-values of the Augmented Dickey-Fuller (ADF) test and the Kwiatkowski-
Phillips-Schmidt-Shin (KPSS) test for monthly electricity consumption . . 52
4.3 SARIMA model criteria for the monthly electricity consumption . . . . . . 55
4.4 Comparison between predicted data of 2011 using ARIMA(2, 1, 2)(1, 0, 1)12
and actual data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.1 Prediction data for year 2011 using simple exponential smoothing with
three different values for the smoothing parameter α. . . . . . . . . . . . . 59
5.2 Error measures for simple exponential smoothing models . . . . . . . . . 60
5.3 Comparison between actual data and predicted data of year 2011 using
Holt's linear method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.4 Error measures for Holt trend model . . . . . . . . . . . . . . . . . . . . . 62
5.5 Comparison between actual data and predicted data of year 2011 using
damped trend method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.6 Error measures for damped trend models . . . . . . . . . . . . . . . . . . 65
5.7 Predicted data for the months of year 2011 using Holt-Winters method
with both additive and multiplicative seasonality . . . . . . . . . . . . . . 67
5.8 Error measures for the additive and multiplicative seasonal models . . . . 68
5.9 Error measures for the fitted models of all methods . . . . . . . . . . . . . 71
5.10 Comparison between actual data and predicted data by ARIMA(2, 1, 2)(1, 0, 1)12
and ETS(M,A,M) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.11 Error measures for ETS(M,A,M) and ARIMA(2, 1, 2)(1, 0, 1)12 models . 72
Acknowledgments
Praise be to Almighty ALLAH, who has always helped and guided me in bringing this
work to light. I am also grateful to my supervisor, Dr. Bisher M. Iqelan, for suggesting the
topic of the thesis and for his tremendous support and healthy ideas; it has been a privilege
to work with him. My special thanks to all members of the Mathematics Department at the
Islamic University of Gaza for their help and teaching. Thanks also to my parents, son,
husband and family members, who have always surrounded me with love and fortitude.
Abbreviation
ACF Autocorrelation Function.
ACVF Autocovariance Function.
ADF Augmented Dickey-Fuller.
AR(p) Autoregressive model of order p.
ARMA(p, q) Autoregressive moving average model of order (p, q).
ARIMA(p, d, q) Integrated autoregressive moving average model of order (p, d, q).
AIC Akaike's Information Criterion.
AICc AIC, bias corrected.
BIC Bayesian Information Criterion.
KPSS Kwiatkowski-Phillips-Schmidt-Shin.
SARIMA Seasonal Autoregressive Integrated Moving Average model.
SES Simple Exponential Smoothing.
SIC Schwarz's Information Criterion.
MA(q) Moving average model of order q.
MAE Mean Absolute Error.
MAPE Mean Absolute Percentage Error.
MSE Mean Squared Error.
NID Normally and Independently Distributed.
PACF Partial Autocorrelation Function.
RMSE Root Mean Squared Error.
RSS Residual Sum of Squares.
WN White Noise.
Abstract
Time series analysis can be used to extract information hidden in data. The classical
techniques for time series data analysis are the linear time series models, including
the moving average (MA) models, the autoregressive (AR) models, the autoregressive
moving average (ARMA) models and the seasonal autoregressive integrated moving
average (SARIMA) models. We present these models in detail and show their
important characteristics and the methods for finding their parameters, autocovariance,
autocorrelation and partial autocorrelation functions. We also present the details of
exponential smoothing models and their methods, namely the simple exponential
smoothing model, Holt's linear method, the damped trend method and the Holt-Winters
trend and seasonality method.
In this thesis we have used Box-Jenkins models and exponential smoothing models to
analyze the electricity data of the Khan Younis province over the period 2000-2010, and
we compare the two approaches in order to choose the best-fitting model for forecasting
the period January 2011 to December 2011. After comparison, the best model is the
exponential smoothing model. All computations were carried out in the R environment.
Literature Review
Many scientists and researchers have studied time series. The mathematician Fourier
first touched on time series in 1807, when he represented a time series as an infinite series
of sine and cosine functions; this representation has been named the Fourier series, and it
was adopted by Schuster (1906) and Beveridge (1922). The theory and practice
of time series analysis have developed rapidly since the appearance in 1970 of the seminal
work of George E. P. Box and Gwilym M. Jenkins, Time Series Analysis: Forecasting and
Control, now available in its third edition (1994) with co-author Gregory C. Reinsel. Many
books on time series have appeared since then, but some of them give too little practical
application, while others give too little theoretical background; a good treatment presents
both application and theory at a level accessible to a wide variety of students and
practitioners, mixing the two as they are naturally needed. Shumway and Stoffer (1999)
presented examples of time series and Box-Jenkins models. Peter J. Brockwell and Richard
A. Davis (2001) studied time series and forecasting. In 2003, Degerine and Lambert-Lacroix
studied concepts of time series. In 1998, Makridakis et al. studied exponential smoothing
methods for time series; in 2002, Celia F., Balaji V., Les S., Asish G. and Amar R. applied
the simple exponential smoothing and Holt-Winters methods to sales of women's clothing.
Simon (2003) studied exponential smoothing methods for time series and derived formulas
to calculate the average and σ. In 2008, Rob J. Hyndman, Anne B. Koehler, J. Keith Ord
and Ralph D. Snyder studied exponential smoothing methods and forecasting.
Introduction
The prediction of the future behavior of a time series is an important issue in the statistical
sciences, since it is needed in all areas of life, such as the prediction of air temperatures.
Most countries rely in their plans and development programs on sound bases and advanced
methods in order to reach more effective results, and census data play a key role in building
these plans and programs. These studies have produced a range of statistical and
mathematical methods for prediction, and one of the most important problems facing
researchers when analyzing a time series is whether or not the series is stationary, which
affects the choice of mathematical model. In this thesis, we study the electricity
consumption over the period 2000-2010 in the Khan Younis province and treat it using
two approaches:
1. Box-Jenkins models
2. Exponential smoothing models
in order to choose the best model for forecasting the year 2011.
This thesis is organized as follows. We start by recalling background on time series and
their properties in Chapter 1, which contains three sections: Examples of Time Series,
Properties of Time Series and Stationary Time Series.
Chapter 2 introduces Box-Jenkins models and contains three sections: models for
stationary time series (moving average (MA), autoregressive (AR) and autoregressive
moving average (ARMA) models); models for nonstationary time series (seasonal
autoregressive integrated moving average (SARIMA) models); and forecasting.
Chapter 3 discusses exponential smoothing (ETS) models and the important methods of
this family, namely the simple exponential smoothing model, Holt's linear method, the
damped trend method and the Holt-Winters trend and seasonality method, and we study
the properties of all these methods. Chapters 4 and 5 give the results of applying the two
approaches to the electricity data.
Part I
Chapter 1
Introduction
In this chapter, we introduce some basic ideas of time series analysis, and we will study
some properties of time series in Section 1.2.
The purposes of time series analysis are generally:
1. to understand or model the stochastic mechanism that gives rise to an observed series;
2. to predict or forecast the future values of a series based on the history of that series
and, possibly, other related series or factors;
3. to describe the characteristics of the observed oscillations.
This chapter contains three sections: Examples of Time Series, Properties of Time Series
and Stationary Time Series.
Definition 1.0.1. A time series is a set of observations Yt, each one being recorded at
a specific time t; equivalently, it is a sequence of data points, typically measured at
successive time instants at uniform time intervals.
1.1 Examples Of Time Series
Example 1.1.1. Average Monthly Temperatures, Dubuque, Iowa
Figure 1.1 shows the average monthly temperatures in Dubuque, Iowa, from 1964 to 1976.
The climate is warm during summer, when temperatures tend to be in the 70s (degrees
Fahrenheit), and very cold during winter, when temperatures tend to be in the 20s. The
warmest month of the year is July, with an average maximum temperature of 82.8 degrees
Fahrenheit, while the coldest month of the year is January, with an average minimum
temperature of 16.9 degrees Fahrenheit. This time series displays a very regular pattern
called seasonality. Seasonality for monthly values occurs when observations twelve months
apart are related in some manner or another. All Januaries and Februaries are quite cold;
they are similar in value and different from the temperatures of the warmer months of
June, July and August, for example. There is still variation among the January values
and variation among the June values. Models for such series must accommodate this
variation while preserving the similarities. Here the reason for the seasonality is well
understood: the Northern Hemisphere's changing inclination toward the sun. For more
details see [7].
Figure 1.1: Average Monthly Temperatures, Dubuque, Iowa
Figure 1.2: Monthly Temperatures, Dubuque, Iowa
1.2 Properties of Time series
This section describes the fundamental concepts in the theory of time series models. In
particular, we introduce the concepts of the mean, variance and covariance functions,
stationary processes, autocorrelation functions and partial autocorrelation functions. For
more details see [15] and [16].
Definition 1.2.1. Mean Function
For any time series {Yt} the mean function denoted by µt is defined as
µt = E(Yt) (1.2.1)
Definition 1.2.2. The Auto Covariance Function
For any time series {Yt}, the autocovariance function (ACV F ) of {Yt}, denoted by
γY (t, s), is defined as the second moment product
γY (t, s) = cov(Yt, Ys) (1.2.2)
= E[(Yt − µt)(Ys − µs)] (1.2.3)
= E(YtYs) − µtµs (1.2.4)
for all time points s and t. When no confusion is possible about which time series
we are referring to, we will drop the subscript and write γY (t, s) as γt,s.
It is clear that, for s = t, the autocovariance reduces to the variance, because
γY (t, t) = V ar(Yt) (1.2.5)
= E[Yt − E(Yt)]² (1.2.6)
Note that γY (t, s) = γY (s, t) and |γt,s| ≤ √(γt,t γs,s).
Definition 1.2.3. The Autocorrelation Function
The autocorrelation function (ACF ) of the time series {Yt}, denoted by ρt,s, is defined
as follows:
ρt,s = Corr(Yt, Ys) (1.2.7)
= Cov(Yt, Ys) / √(V ar(Yt) V ar(Ys)) (1.2.8)
The ACF measures the linear predictability of the series at time t, say Yt, using only
the value Ys. We note that −1 ≤ ρt,s ≤ 1; values of ρt,s near ±1 indicate strong linear
dependence, whereas values near zero indicate weak linear dependence, and if ρt,s = 0 we
say that Yt and Ys are uncorrelated.
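In practice the ACF is estimated from data by replacing the moments above with their sample counterparts. The sketch below is an illustrative Python version (the thesis's own computations use R; the function name `sample_acf` is ours), estimating ρh at lags h = 0, 1, ..., max_lag:

```python
def sample_acf(y, max_lag):
    """Sample autocorrelation r_h = c_h / c_0, where
    c_h = (1/n) * sum_{t} (y[t] - ybar)(y[t+h] - ybar)."""
    n = len(y)
    ybar = sum(y) / n
    c0 = sum((v - ybar) ** 2 for v in y) / n
    return [
        sum((y[t] - ybar) * (y[t + h] - ybar) for t in range(n - h)) / n / c0
        for h in range(max_lag + 1)
    ]

# A strongly trending series is highly correlated with its own recent past,
# so its sample ACF stays near 1 at small lags:
r = sample_acf(list(range(1, 21)), 3)
```

For a white noise series, by contrast, all sample autocorrelations at lags h ≥ 1 would be close to zero.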
1.3 Stationary Time Series
The preceding definitions of the mean and auto covariance functions are completely gen-
eral. Although we have not made any special assumptions about the behavior of the time
series, many of the preceding examples have hinted that a sort of regularity may exist
over time in the behavior of a time series. We introduce the notion of regularity using a
concept called stationarity.[1]
Definition 1.3.1. Strict Stationarity
A time series Yt is said to be strictly stationary if the joint distribution of {Yt1 , Yt2 , . . . , Ytn}
is the same as the joint distribution of {Yt1+h, Yt2+h, . . . , Ytn+h} for every n and every shift h.
Definition 1.3.2. Weak Stationarity
A time series Yt is said to be weakly stationary if
1. the mean function µt is constant, and
2. the covariance function γs,t depends on s and t only through their difference |s − t|.
In the literature, stationarity usually means weak stationarity, unless otherwise specified.
One important case where weak stationarity implies strict stationarity is when the time
series is Gaussian, which means that the finite-dimensional distributions of {Yt} are all
multivariate Gaussian, i.e. the joint distribution FYt,Yt+j1,...,Yt+jn(yt, yt+j1, ..., yt+jn) is
Gaussian.
Example 1.3.1. Random walk
Let {St : t = 1, 2, . . .} be a sequence of independent identically distributed random
variables, each with zero mean and variance σ². The observed time series {Yt : t = 1, 2, . . .}
is constructed as follows:
Y1 = S1
Y2 = S1 + S2
...
Yt = S1 + S2 + · · · + St
Then E(Yt) = 0 and V ar(Yt) = tσ² for all t, and for h ≥ 0,
γY (t + h, t) = Cov(Yt+h, Yt)
= Cov(Yt + St+1 + St+2 + · · · + St+h, Yt)
= Cov(Yt, Yt)
= V ar(Yt)
= tσ²
Since γY (t + h, t) depends on t, the series {Yt} is not stationary.
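This nonstationarity is easy to confirm by simulation: the spread of the walk at time t grows like tσ². Below is an illustrative Python sketch (the helper name `simulate_random_walk` and the Gaussian choice of steps are our assumptions, not the thesis's):

```python
import random

def simulate_random_walk(n, sigma=1.0, seed=0):
    """Return Y_1, ..., Y_n with Y_t = S_1 + ... + S_t and S_i iid N(0, sigma^2)."""
    rng = random.Random(seed)
    y, total = [], 0.0
    for _ in range(n):
        total += rng.gauss(0.0, sigma)
        y.append(total)
    return y

# Var(Y_t) = t * sigma^2, so at t = 50 the variance across many independent
# replications of the walk should be close to 50:
finals = [simulate_random_walk(50, seed=s)[-1] for s in range(2000)]
mean = sum(finals) / len(finals)
var_hat = sum((v - mean) ** 2 for v in finals) / len(finals)
```

Repeating the experiment with a larger t gives a proportionally larger variance, which is exactly the failure of weak stationarity.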
Notation 1.3.1. Because the mean function µt = E(Yt) of a stationary time series is
independent of time t, we will write
µt = µ
Also, because the covariance function γs,t of a stationary time series depends on s and
t only through their difference |s − t|, we may simplify the notation. Let s = t + h, where
h represents the time shift or lag; then
γ(t+h,t) = cov(Yt+h, Yt)
= E[(Yt+h − µ)(Yt − µ)]
= E[(Yh − µ)(Y0 − µ)]
= γ(h,0)
does not depend on the time argument t. We have assumed that V ar(Yt) = γ(0,0) < ∞.
Henceforth, for convenience, we will drop the second argument of γ(h,0).
Definition 1.3.3. The auto covariance function (ACV F )
The auto covariance function of a stationary time series will be written as
γ(h) = Cov(Yt+h, Yt) (1.3.1)
= E[(Yt+h − µ)(Yt − µ)] (1.3.2)
A final useful property is that the autocovariance function of a stationary series is
symmetric around the origin, that is,
γh = γ−h (1.3.3)
Proposition 1.3.2. (Properties of the Autocovariance Function (ACV F ))
The autocovariance function (ACV F ) of a stationary time series Yt has the following
properties:
• Nonnegativity: γ0 ≥ 0
• Boundedness: |γh| ≤ γ0, for any h ∈ Z
• Symmetry: γh = γ−h
• γ(t,s) = γ(0,|s−t|)
Proof. See [16]
Definition 1.3.4. The Autocorrelation Function (ACF )
The autocorrelation function (ACF ) of a stationary time series will be written as
ρh = γ(t+h,t) / √(γ(t+h,t+h) γ(t,t)) (1.3.4)
= γh / γ0 (1.3.5)
Proposition 1.3.3. (Properties of Autocorrelation Function (ACF ))
The autocorrelation function ρ(h) of a stationary time series Yt has the following proper-
ties:
• ρ0 = 1
• |ρh| ≤ 1, for all h ∈ Z.
• ρh = ρ−h
Proof. See [16]
Definition 1.3.5. (The Partial Autocorrelation Function (PACF ))
The partial autocorrelation function (PACF ) of a time series {Yt}, denoted by φkk, is
defined as
φkk = corr(Yt, Yt−k | Yt−1, Yt−2, . . . , Yt−k+1) (1.3.6)
that is, the correlation between Yt and Yt−k after removing the linear effect of the
intervening observations.
In this chapter we have studied the basic notions of time series and their properties, such
as the mean, variance, covariance, autocovariance, autocorrelation and partial
autocorrelation functions, and we have studied types of time series. In the next chapter
we will study the Box-Jenkins models: the moving average, autoregressive and
autoregressive moving average models.
Chapter 2
Box-Jenkins methodology
2.1 Models for Stationary Time Series
This chapter discusses the basic concepts of a broad class of parametric time series models:
the autoregressive moving average (ARMA) models. These models have assumed great
importance in modeling real-world processes. For more details see [7].
2.1.1 General Linear Processes
We will study a class of linear models, called linear time series models, that are designed
specifically for modeling the dynamic behavior of time series. These include the moving
average (MA), autoregressive (AR) and autoregressive moving average (ARMA) models.
Definition 2.1.1. A time series {Yt} is a linear process if it has the representation
Yt = et + ψ1et−1 + ψ2et−2 + · · · (2.1.1)
or
Yt = ∑_{j=0}^{∞} ψj et−j
for all t, where the et have zero mean and variance σ² and {ψj} is a sequence of constants
with ∑_{j=1}^{∞} ψ²j < ∞ and ψ0 = 1.
Definition 2.1.2. (White Noise)
A time series Yt is said to be white noise with mean zero and variance σ², written as
Yt ∼ WN(0, σ²),
if and only if {Yt} has zero mean and covariance function
γh = σ² if h = 0, and γh = 0 if h ≠ 0.
It is clear that a white noise process is stationary.
Definition 2.1.3. (Back Shift Operator)
For any time series {Yt} the Back Shift Operator is defined by
BYt = Yt−1
and extend it to powers B2Yt = B(BYt) = BYt−1 = Yt−2 and so on. Thus
BkYt = Yt−k (2.1.2)
An important part of time series analysis is the selection of a suitable model for
data. These models are very important tool for forecasting. We will take three famous
models: Autoregressive (AR) model, Moving average (MA) model and Autoregression
Moving average (ARMA) model. These models are very important in modeling real
world processes. We can rewrite the time series models by simplified and useful formula
using Back Shift Operator B.
2.1.2 Autoregressive Process
Definition 2.1.4. Autoregressive Process
The autoregressive process of order p, denoted by AR(p), is defined as
Yt = φ1Yt−1 + φ2Yt−2 + ...+ φpYt−p + εt (2.1.3)
Where φ1, φ2, ..., φp are the parameters of the model and εt ∼ WN (0, σ2)
The mean of Yt in (2.1.3) is zero. If the mean µ of Yt is not zero, replace Yt by Yt − µ
in (2.1.3), i.e.
By using the backshift operator we can write the AR(p) model as
(1 − φ1B − φ2B² − · · · − φpB^p)Yt = εt (2.1.4)
or, even more concisely, as
φ(B)Yt = εt (2.1.5)
where φ(B) is called the characteristic polynomial:
φ(B) = 1 − φ1B − φ2B² − · · · − φpB^p
Figure 2.1 displays the time plot of a simulated AR(1) process with φ = 0.9
Figure 2.1: Simulated AR(1) process with φ = 0.9
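A series like the one in Figure 2.1 can be generated in a few lines of code. The following Python sketch is illustrative (the thesis's own figures are produced in R; the name `simulate_ar1` and the burn-in length are our choices), simulating Yt = 0.9Yt−1 + et:

```python
import random

def simulate_ar1(n, phi, sigma=1.0, seed=1, burn_in=100):
    """Simulate Y_t = phi * Y_{t-1} + e_t with e_t ~ N(0, sigma^2).
    The burn-in removes the influence of the arbitrary start Y_0 = 0."""
    rng = random.Random(seed)
    y, out = 0.0, []
    for t in range(n + burn_in):
        y = phi * y + rng.gauss(0.0, sigma)
        if t >= burn_in:
            out.append(y)
    return out

# With phi = 0.9, successive values are strongly positively correlated,
# which produces the smooth, wandering look of Figure 2.1:
series = simulate_ar1(500, phi=0.9)
```

The sample lag-1 autocorrelation of such a series is close to the theoretical value ρ1 = φ = 0.9.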
Definition 2.1.5. Causality
A linear process {Yt} is a causal function of {Wt} if there is a
ψ(B) = ψ0 + ψ1B + ψ2B² + · · ·
with ∑_{j=0}^{∞} |ψj| < ∞ and
Yt = ψ(B)Wt
2.1.3 Moving Average Processes
Definition 2.1.6. Moving Average
The moving average model of order q, denoted MA(q), is defined as
Yt = εt − θ1εt−1 − θ2εt−2 − · · · − θqεt−q (2.1.6)
where θ1, θ2, . . . , θq are parameters.
Some texts and software packages write the MA model with the opposite signs on the
coefficients, that is,
Yt = εt + θ1εt−1 + θ2εt−2 + · · · + θqεt−q
By using the backshift operator we can write the MA(q) model as
Yt = (1 − θ1B − θ2B² − · · · − θqB^q)εt (2.1.7)
or, even more concisely, as
Yt = θ(B)εt (2.1.8)
where θ(B) is called the characteristic polynomial:
θ(B) = 1 − θ1B − θ2B² − · · · − θqB^q
Figure 2.2 shows a time plot of a simulated MA(1) series with θ = 0.9.
Figure 2.2: Simulated MA(1) process with θ = 0.9
Definition 2.1.7. Invertibility
A linear process {Yt} is an invertible function of {Wt} if there is a
π(B) = π0 + π1B + π2B² + · · ·
with ∑_{j=0}^{∞} |πj| < ∞ and
Wt = π(B)Yt
2.1.4 Autoregressive Moving Average Model
Definition 2.1.8. Autoregressive Moving Average Model
The autoregressive moving average model, denoted ARMA(p, q), is defined as
Yt = φ1Yt−1 + φ2Yt−2 + · · · + φpYt−p + εt − θ1εt−1 − θ2εt−2 − · · · − θqεt−q (2.1.9)
with φp ≠ 0, θq ≠ 0 and σ²ε > 0; the parameters p and q are called the autoregressive
and the moving average orders, respectively.
By using the backshift operator, we can write the ARMA(p, q) model as
(1 − φ1B − φ2B² − · · · − φpB^p)Yt = (1 − θ1B − θ2B² − · · · − θqB^q)εt (2.1.10)
φ(B)Yt = θ(B)εt (2.1.11)
Definition 2.1.9. Characteristic Polynomials
The AR(p) and MA(q) characteristic polynomials are defined as
φ(x) = 1 − φ1x − φ2x² − · · · − φpx^p (2.1.12)
and
θ(x) = 1 + θ1x + θ2x² + · · · + θqx^q (2.1.13)
respectively, where x is a complex number.
Table 2.1: Behavior of the ACF and the PACF for ARMA Models
AR(p) MA(q) ARMA(p, q)
ACF Tails off Cuts off after lag q Tails off
PACF Cuts off after lag p Tails off Tails off
Remark 2.1.1. We say that φ(B)Yt = θ(B)εt defines an ARMA(p, q) process only if
there is no common factor between φ(x) and θ(x).
Example 2.1.1. Consider the process
Yt = 0.75Yt−1 − 0.125Yt−2 + εt − 0.5εt−1
or, in operator form,
(1 − 0.75B + 0.125B²)Yt = (1 − 0.5B)εt
At first Yt appears to be an ARMA(2, 1) process. But the associated polynomials
φ(z) = 1 − 0.75z + 0.125z² = (1 − 0.5z)(1 − 0.25z)
θ(z) = 1 − 0.5z
have a common factor that can be canceled. After cancelation, the model is reduced to
Yt = 0.25Yt−1 + εt
so the model is an AR(1).
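The cancelation can be checked mechanically by multiplying the candidate factors back together. A small illustrative sketch in Python (`poly_mul` is our helper; polynomial coefficients are listed by ascending power of z):

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, ascending powers."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# (1 - 0.5z)(1 - 0.25z) recovers phi(z) = 1 - 0.75z + 0.125z^2,
# so phi(z) and theta(z) = 1 - 0.5z share the factor (1 - 0.5z):
phi = poly_mul([1.0, -0.5], [1.0, -0.25])
```

After dividing out the common factor, the remaining AR polynomial 1 − 0.25z gives exactly the reduced AR(1) model above.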
Definition 2.1.10. An ARMA(p, q) model φ(B)Yt = θ(B)εt is said to be causal if the
time series (Yt) can be written as a one-sided linear process:
Yt = ∑_{j=0}^{∞} ψj εt−j = ψ(B)εt (2.1.14)
where ψ(B) = ∑_{j=0}^{∞} ψj B^j and ∑_{j=0}^{∞} |ψj| < ∞.
Causality of an ARMA(p, q) process
An ARMA(p, q) model is causal if and only if φ(z) ≠ 0 for |z| ≤ 1.
The coefficients of the linear process given in (2.1.14) can be determined by solving
ψ(z) = ∑_{j=0}^{∞} ψj z^j = θ(z)/φ(z), |z| ≤ 1
Another way of expressing this is that an ARMA process is causal only when the
roots of φ(z) lie outside the unit circle; that is,
φ(z) = 0 only when |z| > 1
Definition 2.1.11. Invertibility of an ARMA Model
An ARMA(p, q) model φ(B)Yt = θ(B)εt is said to be invertible if the time series (Yt)
can be written as
π(B)Yt = ∑_{j=0}^{∞} πj Yt−j = εt (2.1.15)
where π(B) = ∑_{j=0}^{∞} πj B^j and ∑_{j=0}^{∞} |πj| < ∞. See [16].
Invertibility of an ARMA(p, q) process
An ARMA(p, q) model is invertible if and only if θ(z) ≠ 0 for |z| ≤ 1. The coefficients
πj of π(B) given in (2.1.15) can be determined by solving
π(z) = ∑_{j=0}^{∞} πj z^j = φ(z)/θ(z), |z| ≤ 1
Another way of expressing this is that an ARMA process is invertible only when the
roots of θ(z) lie outside the unit circle; that is, θ(z) = 0 only when |z| > 1.
Example 2.1.2. Consider the process
Yt = 0.4Yt−1 + 0.45Yt−2 + εt + εt−1 + 0.25εt−2
or, in operator form,
(1 − 0.4B − 0.45B²)Yt = (1 + B + 0.25B²)εt
At first, Yt appears to be an ARMA(2, 2) process. But the associated polynomials
φ(z) = 1 − 0.4z − 0.45z² = (1 + 0.5z)(1 − 0.9z)
θ(z) = 1 + z + 0.25z² = (1 + 0.5z)²
have a common factor that can be canceled. After cancelation, the polynomials become
φ(z) = 1 − 0.9z and θ(z) = 1 + 0.5z, so the model is an ARMA(1, 1) model,
(1 − 0.9B)Yt = (1 + 0.5B)εt, or
Yt = 0.9Yt−1 + 0.5εt−1 + εt
The model is causal because φ(z) = 1 − 0.9z = 0 when z = 10/9, which is outside the
unit circle. The model is also invertible because the root of θ(z) = 1 + 0.5z is z = −2,
which is outside the unit circle.
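Both root conditions can be verified numerically with the quadratic formula. An illustrative Python sketch (`quadratic_roots` is our helper; its arguments are the coefficients of c0 + c1·z + c2·z²):

```python
import cmath

def quadratic_roots(c0, c1, c2):
    """Both roots of c0 + c1*z + c2*z^2 = 0."""
    disc = cmath.sqrt(c1 * c1 - 4.0 * c2 * c0)
    return (-c1 + disc) / (2.0 * c2), (-c1 - disc) / (2.0 * c2)

# phi(z) = 1 - 0.4z - 0.45z^2 has roots z = -2 and z = 10/9;
# theta(z) = 1 + z + 0.25z^2 has the double root z = -2.
phi_roots = quadratic_roots(1.0, -0.4, -0.45)
theta_roots = quadratic_roots(1.0, 1.0, 0.25)
causal = all(abs(z) > 1 for z in phi_roots)
invertible = all(abs(z) > 1 for z in theta_roots)
```

All roots lie outside the unit circle, so the model is both causal and invertible, in agreement with the discussion above.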
2.2 Models for non Stationary Time Series
In statistics, an autoregressive integrated moving average (ARIMA) model is a general-
ization of an autoregressive moving average or (ARMA) model. These models are fitted
to time series data either to better understand the data or to predict future points in the
series. The ARIMA model is applied in some cases where data show evidence of non
stationarity, where an initial differencing step (corresponding to the ”integrated” part of
the model) can be applied to remove the non stationarity. The model is generally referred
to as an ARIMA(p, d, q) model where p, d, and q are integers greater than or equal to
zero and refer to the order of the autoregressive, integrated, and moving average parts of
the model respectively. The first parameter p refers to the number of autoregressive lags
(not counting the unit roots), the second parameter d refers to the order of integration,
and the third parameter q gives the number of moving average lags. For more details see
[7] and [12]
Definition 2.2.1. Integrated Autoregressive Moving Average Model (ARIMA)
A process Yt is said to follow an integrated autoregressive moving average model,
abbreviated ARIMA(p, d, q), if
∇^d Yt = (1 − B)^d Yt (2.2.1)
is a stationary ARMA(p, q) process. In general we will write the model as
φ(B)(1 − B)^d Yt = θ(B)et (2.2.2)
If E(∇^d Yt) = µ, we write the model as
φ(B)(1 − B)^d Yt = α + θ(B)et (2.2.3)
where α = µ(1 − φ1 − φ2 − · · · − φp).
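The operator ∇^d = (1 − B)^d is simply d applications of first differencing, which is easy to sketch in code (Python rather than the R environment used in the thesis; the function name `difference` is ours):

```python
def difference(y, d=1):
    """Apply (1 - B)^d: take first differences y_t - y_{t-1}, d times.
    Each pass shortens the series by one observation."""
    for _ in range(d):
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return y

# A deterministic quadratic trend is reduced to a constant by d = 2,
# which is why differencing removes polynomial trends of matching degree:
second_diff = difference([t * t for t in range(10)], d=2)
```

In practice d = 1 or d = 2 is almost always enough; differencing more than necessary introduces artificial MA structure into the series.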
2.2.1 Multiplicative Seasonal ARIMA Models
In this section, we introduce several modifications made to the ARIMA model to account
for seasonal and non stationary behavior. Often, the dependence on the past tends to
occur most strongly at multiples of some underlying seasonal lag s.
Definition 2.2.2. Seasonal Time Series
Seasonal variation is a component of a time series which is defined as the repetitive and
predictable movement around the trend line in one year or less.
Some Examples of Seasonal Time Series:
• Monthly Carbon Dioxide Levels at Alert, Canada from January 1994 through De-
cember 2004.
• Monthly U.S. Retail and Food Service Sales from January 1992 to August 2008
in millions of dollars.
• Electricity consumption of an industrial sector of U.S.
Definition 2.2.3. Seasonal MA(Q) Model
A seasonal MA(Q) model of order Q with seasonal period s is defined by
Yt = et − Θ1et−s − Θ2et−2s − · · · − ΘQet−Qs (2.2.4)
with seasonal MA characteristic polynomial
Θ(x) = 1 − Θ1x^s − Θ2x^{2s} − · · · − ΘQx^{Qs}
Definition 2.2.4. Seasonal AR(P ) Model
A seasonal AR(P ) model of order P and seasonal period s is defined by
Yt = Φ1Yt−s + Φ2Yt−2s + · · · + ΦP Yt−Ps + et (2.2.5)
with seasonal AR characteristic polynomial
Φ(x) = 1 − Φ1x^s − Φ2x^{2s} − · · · − ΦP x^{Ps}
Definition 2.2.5. Multiplicative Seasonal ARIMA Model (SARIMA)
The pure seasonal ARMA model takes the form
$$\Phi_P(B^s) Y_t = \Theta_Q(B^s) e_t \tag{2.2.6}$$
with operators
$$\Phi_P(B^s) = 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \cdots - \Phi_P B^{Ps}$$
$$\Theta_Q(B^s) = 1 - \Theta_1 B^s - \Theta_2 B^{2s} - \cdots - \Theta_Q B^{Qs}$$
The multiplicative seasonal autoregressive integrated moving average model, or SARIMA
model, is given by
$$\phi(B)\Phi_P(B^s)\nabla^d \nabla_s^D Y_t = \theta(B)\Theta_Q(B^s) e_t \tag{2.2.7}$$
The general model is denoted ARIMA(p, d, q)(P, D, Q)_s.
Example 2.2.1. Consider the following model, which often provides a reasonable representation for seasonal, nonstationary economic time series. We display the equations for
the model, denoted ARIMA(0, 1, 1)(0, 1, 1)_{12}:
$$(1 - B^{12})(1 - B)Y_t = (1 + \Theta B^{12})(1 + \theta B)e_t$$
Expanding both sides gives
$$(1 - B - B^{12} + B^{13})Y_t = (1 + \theta B + \Theta B^{12} + \theta\Theta B^{13})e_t$$
See [16].
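The expansion above can be verified numerically: multiplying the two MA operators, written as coefficient vectors in powers of B, is just a convolution. A small sketch (the values of θ and Θ are illustrative, not estimates from any series):

```python
import numpy as np

# Illustrative parameter values (not estimated from any data set).
theta, Theta = 0.4, 0.6

# (1 + theta*B) and (1 + Theta*B^12) as coefficient vectors in powers of B.
nonseasonal = np.zeros(2)
nonseasonal[0], nonseasonal[1] = 1.0, theta
seasonal = np.zeros(13)
seasonal[0], seasonal[12] = 1.0, Theta

# Polynomial multiplication corresponds to convolution of coefficient vectors.
product = np.convolve(nonseasonal, seasonal)

# Expanded form: 1 + theta*B + Theta*B^12 + theta*Theta*B^13.
expected = np.zeros(14)
expected[0], expected[1], expected[12], expected[13] = 1.0, theta, Theta, theta * Theta
print(np.allclose(product, expected))  # True
```

The same convolution trick checks any multiplicative SARIMA operator expansion.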
2.3 Forecasting
In this section, we consider the calculation of forecasts and their properties for both
deterministic trend models and ARIMA models. Based on the available history of the
series up to time t, namely $Y_1, Y_2, \ldots, Y_t$, we would like to forecast the value of $Y_{t+L}$
that will occur L time units into the future. For more details see [12] and [15].
Definition 2.3.1. Minimum Mean Square Error Forecast
The minimum mean square error forecast is given by
$$\hat{Y}_t(L) = E(Y_{t+L} \mid Y_1, Y_2, \ldots, Y_t) \tag{2.3.1}$$
For ARIMA models, the forecasts can be expressed in several different ways. Each
expression contributes to our understanding of the overall forecasting procedure with
respect to computing, updating, assessing precision, or long-term forecasting behavior.
Definition 2.3.2. Akaike's Information Criterion (AIC)
$$\mathrm{AIC} = -2\log(\text{maximum likelihood}) + 2k \tag{2.3.2}$$
where k is the number of parameters in the model.
Akaike's Information Criterion also has a definition in terms of the residual sum of squares.
Definition 2.3.3. Akaike's Information Criterion (AIC)
$$\mathrm{AIC} = \ln \hat{\sigma}_k^2 + \frac{n + 2k}{n} \tag{2.3.3}$$
where k is the number of parameters in the model and
$$\hat{\sigma}_k^2 = \frac{\mathrm{RSS}_k}{n}$$
where RSSk denotes the residual sum of squares under the model with k regression coef-
ficients.
Definition 2.3.4. AIC, Bias Corrected (AICc)
$$\mathrm{AICc} = \ln \hat{\sigma}_k^2 + \frac{n + k}{n - k - 2} \tag{2.3.4}$$
Definition 2.3.5. Schwarz's Information Criterion (SIC)
$$\mathrm{SIC} = \ln \hat{\sigma}_k^2 + \frac{k \ln n}{n} \tag{2.3.5}$$
SIC is also called the Bayesian Information Criterion (BIC).
For more details see [16].
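To illustrate how these criteria trade goodness of fit against model size, here is a minimal sketch of the RSS-based forms of AIC and SIC above (the RSS value and sample size are made up for illustration):

```python
import numpy as np

def aic(rss, n, k):
    # AIC in the log-variance form: ln(RSS/n) + (n + 2k)/n.
    return np.log(rss / n) + (n + 2 * k) / n

def sic(rss, n, k):
    # SIC/BIC: ln(RSS/n) + k*ln(n)/n.
    return np.log(rss / n) + (k * np.log(n)) / n

# Illustrative values: same fit (same RSS), different parameter counts.
rss, n = 50.0, 100
# A better fit (smaller RSS) lowers both criteria for fixed k;
# adding parameters raises the penalty, and SIC penalizes harder
# than AIC whenever ln(n) > 2.
print(aic(rss, n, 3), sic(rss, n, 3))
```

Models are compared by computing the criterion for each candidate and choosing the smallest value, as done for the SARIMA candidates in Chapter 4.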
Example 2.3.1. AR(1)
Consider the AR(1) model with a nonzero mean that satisfies
$$Y_t - \mu = \phi(Y_{t-1} - \mu) + e_t$$
Replacing t by t + 1 in the last equation, we have
$$Y_{t+1} - \mu = \phi(Y_t - \mu) + e_{t+1} \tag{2.3.6}$$
Taking the conditional expectations of both sides gives
$$\hat{Y}_t(1) - \mu = \phi[E(Y_t \mid Y_1, \ldots, Y_t) - \mu] + E(e_{t+1} \mid Y_1, \ldots, Y_t) \tag{2.3.7}$$
Since $E(Y_t \mid Y_1, \ldots, Y_t) = Y_t$ and $e_{t+1}$ is independent of $Y_1, Y_2, \ldots, Y_t$,
we have $E(e_{t+1} \mid Y_1, \ldots, Y_t) = E(e_{t+1}) = 0$.
Thus Equation (2.3.7) can be written as
$$\hat{Y}_t(1) = \mu + \phi(Y_t - \mu)$$
Now consider a general lead time L. Iterating this relation produces
$$\hat{Y}_t(L) = \mu + \phi^L(Y_t - \mu) \quad \text{for } L \geq 1$$
Since $|\phi| < 1$, we have simply $\hat{Y}_t(L) \approx \mu$ for large L.
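The closed-form AR(1) forecast derived above is easy to sketch in code; the parameter values below are purely illustrative:

```python
def ar1_forecast(y_t, mu, phi, L):
    # Minimum MSE forecast for an AR(1): Y_hat_t(L) = mu + phi**L * (y_t - mu).
    return mu + phi ** L * (y_t - mu)

# Illustrative values: process mean 10, phi = 0.8, last observation 14.
mu, phi, y_last = 10.0, 0.8, 14.0
path = [ar1_forecast(y_last, mu, phi, L) for L in (1, 2, 5, 50)]
print(path)  # forecasts decay geometrically toward the mean
```

The forecast path decays geometrically from the last observation toward the process mean, which is the "mean reversion" behavior noted in the example.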
2.3.1 Model Identification
Definition 2.3.6. Identification:
Means to find out the appropriate values of p, q, d, P,Q and D of the order of general
SARIMA model, we will use ACF and PACF to find these values.
Stationarity and Seasonality
The first step in developing a Box-Jenkins model is to determine if the series is stationary
and if there is any significant seasonality that needs to be modeled.
Detecting seasonality
Seasonality (or periodicity) can usually be assessed from an autocorrelation plot, a sea-
sonal subseries plot, or a spectral plot.
Differencing to achieve stationarity
Box and Jenkins recommend the differencing approach to achieve stationarity. However,
fitting a curve and subtracting the fitted values from the original data can also be used
in the context of Box-Jenkins models.
Seasonal differencing
At the model identification stage, our goal is to detect seasonality, if it exists, and to
identify the order for the seasonal autoregressive and seasonal moving average terms. For
many series, the period is known and a single seasonality term is sufficient. For example,
for monthly data we would typically include either a seasonal AR term or a seasonal MA
term. For Box-Jenkins models, we do not explicitly remove seasonality before fitting the
model. Instead, we include the order of the seasonal terms in the model specification to the
ARIMA estimation software. However, it may be helpful to apply a seasonal difference to
the data and regenerate the autocorrelation and partial autocorrelation plots. This may
help in the model identification of the non-seasonal component of the model. In some
cases, the seasonal differencing may remove most or all of the seasonality effect.
Identify p and q
Once stationarity and seasonality have been addressed, the next step is to identify the
order of the autoregressive and moving average terms.
Detecting stationarity
Stationarity can be assessed from a run sequence plot. The run sequence plot should
show constant location and scale. It can also be detected from an autocorrelation plot.
Specifically, non-stationarity is often indicated by an autocorrelation plot with very slow
decay.
Order of Autoregressive Process (p)
For an AR(1) process, the sample autocorrelation function should have an
exponentially decreasing appearance. However, higher-order AR processes are often a
mixture of exponentially decreasing and damped sinusoidal components. For higher-
order autoregressive processes, the sample autocorrelation needs to be supplemented with
a partial autocorrelation plot. The partial autocorrelation of an AR(p) process becomes
zero at lag p+1 and greater, so we examine the sample partial autocorrelation function to
see if there is evidence of a departure from zero. This is usually determined by placing a
95% confidence interval on the sample partial autocorrelation plot (most software programs
that generate sample autocorrelation plots will also plot this confidence interval). If the
software program does not generate the confidence band, it is approximately $\pm 2/\sqrt{N}$, with N
denoting the sample size.
Order of Moving Average Process (q)
The autocorrelation function of a MA(q) process becomes zero at lag q+1 and greater, so
we examine the sample autocorrelation function to see where it essentially becomes zero.
We do this by placing the 95% confidence interval for the sample autocorrelation function on
the sample autocorrelation plot. Most software that can generate the autocorrelation plot
can also generate this confidence interval. The sample partial autocorrelation function is
generally not helpful for identifying the order of the moving average process.
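The identification recipe above, comparing the sample ACF against the approximate ±2/√N band, can be sketched as follows on a simulated MA(1) series (the θ value and random seed are arbitrary):

```python
import numpy as np

def sample_acf(y, max_lag):
    # Sample autocorrelations r_1, ..., r_max_lag.
    y = np.asarray(y, dtype=float) - np.mean(y)
    c0 = np.dot(y, y) / len(y)
    return np.array([np.dot(y[:-k], y[k:]) / (len(y) * c0)
                     for k in range(1, max_lag + 1)])

rng = np.random.default_rng(42)
n = 2000
e = rng.standard_normal(n + 1)
y = e[1:] + 0.8 * e[:-1]      # simulated MA(1) with theta = 0.8

r = sample_acf(y, 10)
band = 2 / np.sqrt(n)         # approximate 95% confidence band
print(r[0], band)             # lag-1 spike well outside the band
```

As the text predicts for an MA(1) process, the lag-1 autocorrelation lies far outside the band while higher lags sit close to zero, which is exactly the pattern one looks for when choosing q.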
2.3.2 Parameter Estimation of the SARIMA Model
After obtaining appropriate values of p, d, q, P, D and Q, the next stage is to estimate
the values of the parameters $\phi$, $\theta$, $\Phi$ and $\Theta$.
2.3.3 Diagnostics Checking of the Fitted Model
Diagnostic tests are applied to determine whether the estimated parameters and residuals
of the fitted SARIMA model are significant. We examine the ACF and PACF of the
residuals, which we hope will show the white noise (WN) pattern.
2.3.4 Forecasting the study variable
When the model is complete, it is used to forecast the future behavior of the electricity
consumption series.
Chapter 3
Exponential Smoothing
3.1 Introduction
Exponential smoothing is probably the most widely used class of procedures for smoothing
discrete time series in order to forecast the immediate future. This popularity can be
attributed to its simplicity, its computational efficiency, the ease of adjusting its respon-
siveness to changes in the process being forecast, and its reasonable accuracy. The idea of
exponential smoothing is to smooth the original series the way the moving average does
and to use the smoothed series in forecasting future values of the variable of interest. In
exponential smoothing, however, we want to allow the more recent values of the series to
have greater influence on the forecast of future values than the more distant observations.
Exponential smoothing is a simple and pragmatic approach to forecasting, whereby the
forecast is constructed from an exponentially weighted average of past observations. The
largest weight is given to the present observation, less weight to the immediately preceding observation, even less weight to the observation before that, and so on: the influence
of past data decays exponentially.
Historically, exponential smoothing describes a class of forecasting methods. In fact,
some of the most successful forecasting methods are based on the concept of exponen-
tial smoothing. There are a variety of methods that fall into the exponential smoothing
family, each having the property that forecasts are weighted combinations of past obser-
vations, with recent observations given relatively more weight than older observations.
The name "exponential smoothing" reflects the fact that the weights decrease exponentially
as the observations get older [17]. Exponential smoothing is a statistical technique for detecting significant changes in data by ignoring the fluctuations irrelevant to the purpose at
hand. In exponential smoothing (as opposed to moving-average smoothing), older
data are given progressively less relative weight (importance) whereas newer data are given
progressively greater weight. Also called averaging, it is employed in making short-term
forecasts. The 'wait-and-see' attitude to changes around them is the intuitive way people
employ exponential smoothing in their daily lives.
3.1.1 Classification of Exponential Smoothing Methods
In exponential smoothing, we always start with the trend component, which is itself
a combination of a level term (`) and a growth term (b). The level and growth can be
combined in a number of ways, giving five future trend types. Let (Th) denote the forecast
trend over the next h time periods, and let ϕ denote a damping parameter (0 < ϕ < 1).
Then the five trend types or growth patterns are as follows:
$$\text{None}: T_h = \ell$$
$$\text{Additive}: T_h = \ell + bh$$
$$\text{Additive damped}: T_h = \ell + (\varphi + \varphi^2 + \cdots + \varphi^h)b$$
$$\text{Multiplicative}: T_h = \ell b^h$$
$$\text{Multiplicative damped}: T_h = \ell b^{(\varphi + \varphi^2 + \cdots + \varphi^h)}$$
If the error component is ignored, then we have the fifteen exponential smoothing
methods given in the following table. Some of these methods are better known under
other names. For example, cell (N, N) describes the simple exponential smoothing (or
SES) method, cell (A, N) describes Holt's linear method, and cell (A_d, N) describes the
damped trend method. The Holt-Winters additive method is given by cell (A, A), and the Holt-Winters multiplicative method is given by cell (A, M). The other cells correspond to less
commonly used but analogous methods.
3.1.2 Point Forecasts for the Best-Known Methods
In this section, a simple introduction is provided to some of the best-known exponential
smoothing methods: simple exponential smoothing (N, N), Holt's linear method (A, N),
the damped trend method (A_d, N) and the Holt-Winters seasonal methods (A, A) and (A, M).
3.2 Simple Exponential Smoothing (N,N Method)
The simplest of the exponential smoothing methods is naturally called simple exponential smoothing (SES). (In some books [8], it is called single exponential smoothing.) This
method is used for short-range forecasting, usually just one month into the future. The
model assumes that the data fluctuate around a reasonably stable mean (no trend or
consistent pattern of growth). For more details see [11] and [17].
Definition 3.2.1. Simple Exponential Smoothing
The simple exponential smoothing equation is defined as
$$\hat{y}_{t+1} = \hat{y}_t + \alpha(y_t - \hat{y}_t) \tag{3.2.1}$$
where $\alpha$ is a constant between 0 and 1. Another way of writing the last equation is
$$\hat{y}_{t+1} = \alpha y_t + (1 - \alpha)\hat{y}_t \tag{3.2.2}$$
If this substitution process is repeated by replacing $\hat{y}_t$ with its components, $\hat{y}_{t-1}$
with its components, and so on, the result is
$$\hat{y}_{t+1} = \alpha y_t + \alpha(1-\alpha)y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \cdots + \alpha(1-\alpha)^{t-1}y_1 + (1-\alpha)^t \hat{y}_1 \tag{3.2.3}$$
So $\hat{y}_{t+1}$ represents a weighted moving average of all past observations with the weights
decreasing exponentially, hence the name exponential smoothing. We note that the weight
of $\hat{y}_1$ may be quite large when $\alpha$ is small and the time series is relatively short. For longer-range forecasts, it is assumed that the forecast function is flat. That is,
$$\hat{y}_{t+h|t} = \hat{y}_{t+1}, \quad h = 2, 3, \ldots$$
A flat forecast function is used because simple exponential smoothing works best for
data that have no trend, seasonality, or other underlying patterns. Another way of writing
this is to let $\ell_t = \hat{y}_{t+1}$; then (3.2.2) becomes
$$\ell_t = \alpha y_t + (1 - \alpha)\ell_{t-1}$$
The value of $\ell_t$ is a measure of the level of the series at time t.
Initial Value
The initial value $\hat{y}_1$ plays an important role in computing all the subsequent values.
Setting it to $y_1$ is one method of initialization. Another possibility is to average
the first four or five observations. The smaller the value of $\alpha$, the more important is the
selection of the initial value.
Component form
An alternative representation is the component form. For simple exponential smoothing,
the only component included is the level, $\ell_t$. (Other methods considered later in this chapter
may also include a trend $b_t$ and a seasonal component $s_t$.) Component form representations
of exponential smoothing methods comprise a forecast equation and a smoothing equation for each of the components included in the method. The component form of simple
exponential smoothing is given by:
$$\text{Forecast equation}: \hat{y}_{t+1|t} = \ell_t$$
$$\text{Smoothing equation}: \ell_t = \alpha y_t + (1 - \alpha)\ell_{t-1}$$
where $\ell_t$ is the level (or the smoothed value) of the series at time t. The forecast
equation shows that the forecast value at time t + 1 is the estimated level at time t.
The smoothing equation for the level (usually referred to as the level equation) gives the
estimated level of the series at each period t. Applying the forecast equation for time T
gives $\hat{y}_{T+1|T} = \ell_T$, the most recent estimated level. If we replace $\ell_t$ by $\hat{y}_{t+1|t}$ and $\ell_{t-1}$
by $\hat{y}_{t|t-1}$ in the smoothing equation, we recover the weighted average form of simple
exponential smoothing.
Error correction form
The third form of simple exponential smoothing is obtained by re-arranging the level
equation in the component form to get what we refer to as the error correction form
$$\ell_t = \ell_{t-1} + \alpha(y_t - \ell_{t-1}) = \ell_{t-1} + \alpha e_t$$
where $e_t = y_t - \ell_{t-1} = y_t - \hat{y}_{t|t-1}$ for $t = 1, \ldots, T$; that is, $e_t$ is the one-step within-sample forecast error at time t. The within-sample forecast errors lead to the adjustment/correction of the estimated level throughout the smoothing process for $t = 1, \ldots, T$.
For more details see [11] and [17]
Example 3.2.1. The data in Figure 3.1 do not display any clear trending behavior or
any seasonality, although the mean of the data may be changing slowly over time.
Figure 3.1: Oil production in Saudi Arabia from 1996 to 2007
3.3 Holt's Linear Method (A,N Method)
Holt (1957) extended simple exponential smoothing to linear exponential smoothing to
allow forecasting of data with trends. The forecast for Holt's linear exponential smoothing
method is found using two smoothing constants, $\alpha$ and $\beta^*$ (with values between 0 and 1),
and three equations:
Definition 3.3.1. Holt's linear method equations are defined as:
$$\text{Level}: \ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1}) \tag{3.3.1}$$
$$\text{Growth}: b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*)b_{t-1} \tag{3.3.2}$$
$$\text{Forecast}: \hat{y}_{t+h|t} = \ell_t + b_t h \tag{3.3.3}$$
Here $\ell_t$ denotes an estimate of the level of the series at time t and $b_t$ denotes an
estimate of the slope (or growth) of the series at time t.
One interesting special case of this method occurs when $\beta^* = 0$. Then the growth term stays constant, $b_t = b_{t-1}$, and
$$\text{Level}: \ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1})$$
$$\text{Forecast}: \hat{y}_{t+h|t} = \ell_t + b_t h$$
As with simple exponential smoothing, the level equation here shows that $\ell_t$ is a
weighted average of observation $y_t$ and the within-sample one-step-ahead forecast for
time t, here given by $\ell_{t-1} + b_{t-1}$. The trend equation shows that $b_t$ is a weighted average
of the estimated trend at time t based on $\ell_t - \ell_{t-1}$ and $b_{t-1}$, the previous estimate of the
trend. The forecast function is no longer flat but trending.
Error correction form
The error correction form of the level and the trend equations shows the adjustments in
terms of the within-sample one-step forecast errors:
$$\ell_t = \ell_{t-1} + b_{t-1} + \alpha e_t$$
$$b_t = b_{t-1} + \alpha\beta^* e_t$$
where $e_t = y_t - (\ell_{t-1} + b_{t-1}) = y_t - \hat{y}_{t|t-1}$.
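The error correction form of Holt's method translates directly into code. A sketch with a deliberately simple initialization (level set to the second observation, growth to the first difference; one of several reasonable choices, not the only one):

```python
def holt(y, alpha, beta_star, h):
    # Holt's linear method in error correction form.
    # Simple initialization: level = second observation, growth = first difference.
    level, growth = y[1], y[1] - y[0]
    for obs in y[2:]:
        error = obs - (level + growth)            # one-step forecast error e_t
        level = level + growth + alpha * error    # l_t = l_{t-1} + b_{t-1} + alpha*e_t
        growth = growth + alpha * beta_star * error
    return [level + growth * k for k in range(1, h + 1)]

# On an exactly linear series the method reproduces the line.
y = [1.0, 3.0, 5.0, 7.0, 9.0]
print(holt(y, alpha=0.8, beta_star=0.2, h=3))  # [11.0, 13.0, 15.0]
```

Because every one-step error is zero on a perfectly linear series, the level and growth pass through unchanged and the forecast function simply extends the line, illustrating the "constant trend" behavior noted in the next section.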
3.4 Damped Trend Method (A_d, N Method)
The forecasts generated by Holt's linear method display a constant trend (increasing or
decreasing) indefinitely into the future. Even more extreme are the forecasts generated
by the exponential trend method, which include exponential growth or decline. Empirical
evidence indicates that these methods tend to over-forecast, especially for longer forecast
horizons. Motivated by this observation, Gardner and McKenzie (1985) introduced a
parameter that dampens the trend to a flat line some time in the future. Methods that
include a damped trend have proven to be very successful and are arguably the most
popular individual methods when forecasts are required automatically for many series. [9]
3.4.1 Additive damped trend
Definition 3.4.1. Additive damped trend method equations are defined as:
$$\text{Level}: \ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + \phi b_{t-1}) \tag{3.4.1}$$
$$\text{Growth}: b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*)\phi b_{t-1} \tag{3.4.2}$$
$$\text{Forecast}: \hat{y}_{t+h|t} = \ell_t + (\phi + \phi^2 + \cdots + \phi^h)b_t \tag{3.4.3}$$
Thus, the growth for the one-step forecast of $y_{t+1}$ is $\phi b_t$, and the growth is dampened
by a factor of $\phi$ for each additional future time period.
Notation 3.4.1. • If $\phi = 1$, this method gives the same forecasts as Holt's linear
method.
• For $0 < \phi < 1$, as $h \to \infty$ the forecasts approach an asymptote given by $\ell_t + \phi b_t/(1-\phi)$.
We usually restrict $\phi > 0$ to avoid a negative coefficient being applied to $b_{t-1}$ in
(3.4.2), and $\phi < 1$ to avoid $b_t$ increasing exponentially.
Error correction form
The error correction form of the smoothing equations is
$$\ell_t = \ell_{t-1} + \phi b_{t-1} + \alpha e_t$$
$$b_t = \phi b_{t-1} + \alpha\beta^* e_t$$
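The damped forecast equation (3.4.3) and the asymptote ℓ_t + φb_t/(1−φ) noted above can be checked numerically; the state values below are illustrative:

```python
def damped_forecast(level, growth, phi, h):
    # y_hat_{t+h|t} = l_t + (phi + phi^2 + ... + phi^h) * b_t
    damp = sum(phi ** i for i in range(1, h + 1))
    return level + damp * growth

# Illustrative state: l_t = 100, b_t = 2, phi = 0.9.
level, growth, phi = 100.0, 2.0, 0.9
asymptote = level + growth * phi / (1 - phi)   # approaches 118 as h grows
print(damped_forecast(level, growth, phi, 1),
      damped_forecast(level, growth, phi, 200),
      asymptote)
```

With φ = 1 the damping sum reduces to h and the forecasts coincide with Holt's linear method, matching the first bullet of Notation 3.4.1.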
3.5 Holt-Winters Trend and Seasonality Method
If the data have no trend or seasonal patterns, then simple exponential smoothing is appropriate. If the data display a linear trend, Holt's linear method is appropriate. But
if the data are seasonal, these methods, on their own, cannot handle the problem well.
Holt's method was extended by Winters (1960) to capture seasonality directly; it is
based on three smoothing equations, one for the level, one for trend and one for seasonality, with smoothing parameters $\alpha$, $\beta^*$ and $\gamma$. We use m to denote the period of the
seasonality. It is similar to Holt's linear method, with one additional equation for dealing
with seasonality. There are two variations to this method that differ in the nature of the
seasonal component. The additive method is preferred when the seasonal variations are
roughly constant through the series, while the multiplicative method is preferred when
the seasonal variations are changing proportionally to the level of the series. With the
additive method, the seasonal component is expressed in absolute terms in the scale of
the observed series, and in the level equation the series is seasonally adjusted by subtracting the seasonal component; within each year the seasonal component will add up to
approximately zero. With the multiplicative method, the seasonal component is expressed
in relative terms (percentages) and the series is seasonally adjusted by dividing through
by the seasonal component; within each year, the seasonal component will sum to
approximately m. See [8] and [14].
3.5.1 Additive Seasonality (A,A Method)
The seasonal component in Holt-Winters method may also be treated additively, although
this is less common.
Definition 3.5.1. The basic equations for the Holt-Winters additive method are as follows:
$$\text{Level}: \ell_t = \alpha(y_t - s_{t-m}) + (1-\alpha)(\ell_{t-1} + b_{t-1}) \tag{3.5.1}$$
$$\text{Growth}: b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*)b_{t-1} \tag{3.5.2}$$
$$\text{Seasonal}: s_t = \gamma(y_t - \ell_{t-1} - b_{t-1}) + (1-\gamma)s_{t-m} \tag{3.5.3}$$
$$\text{Forecast}: \hat{y}_{t+h|t} = \ell_t + b_t h + s_{t-m+h_m^+} \tag{3.5.4}$$
where $h_m^+ = [(h-1) \bmod m] + 1$.
The equation for the seasonal component is often expressed as
$$s_t = \gamma^*(y_t - \ell_t) + (1-\gamma^*)s_{t-m}$$
If we substitute $\ell_t$ from the smoothing equation for the level of the component form
above, we get
$$s_t = \gamma^*(1-\alpha)(y_t - \ell_{t-1} - b_{t-1}) + (1 - \gamma^*(1-\alpha))s_{t-m}$$
which is identical to the smoothing equation for the seasonal component we specify
here with $\gamma = \gamma^*(1-\alpha)$. The usual parameter restriction is $0 \leq \gamma^* \leq 1$, which translates
to $0 \leq \gamma \leq 1-\alpha$.
The error correction form of the smoothing equations is
$$\ell_t = \ell_{t-1} + b_{t-1} + \alpha e_t$$
$$b_t = b_{t-1} + \alpha\beta^* e_t$$
$$s_t = s_{t-m} + \gamma e_t$$
where $e_t = y_t - (\ell_{t-1} + b_{t-1} + s_{t-m}) = y_t - \hat{y}_{t|t-1}$ are the one-step training forecast errors.
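A sketch of the additive Holt-Winters recursions in error correction form. The initialization used here (first-season mean as level, zero growth, first-season deviations as seasonal indices) is a crude stand-in for the heuristic scheme described later in this chapter:

```python
def holt_winters_additive(y, m, alpha, beta_star, gamma, h):
    # Additive Holt-Winters in error correction form, with a crude
    # initialization from the first season of data.
    level = sum(y[:m]) / m
    growth = 0.0
    season = [y[i] - level for i in range(m)]
    for t in range(m, len(y)):
        error = y[t] - (level + growth + season[t % m])  # one-step error e_t
        level, growth = (level + growth + alpha * error,
                         growth + alpha * beta_star * error)
        season[t % m] += gamma * error                   # s_t = s_{t-m} + gamma*e_t
    n = len(y)
    return [level + growth * k + season[(n - 1 + k) % m] for k in range(1, h + 1)]

# A purely seasonal series (period 4, no trend) is forecast exactly:
y = [10.0, 20.0, 15.0, 5.0] * 3
print(holt_winters_additive(y, m=4, alpha=0.5, beta_star=0.1, gamma=0.3, h=4))
```

On this toy series every one-step error is zero, so the seasonal indices are carried forward unchanged and the forecasts reproduce the seasonal pattern, which is the behavior the additive method is designed for.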
3.6 General Point Forecasting Equations
Table 3.2 gives recursive formulae for computing point forecasts h periods ahead for all
of the exponential smoothing methods. In each case, $\ell_t$ denotes the series level at time t,
$b_t$ denotes the slope at time t, $s_t$ denotes the seasonal component of the series at time t,
and m denotes the number of seasons in a year; $\alpha$, $\beta^*$, $\gamma$ and $\phi$ are constants, and
$$\phi_h = \phi + \phi^2 + \cdots + \phi^h \quad \text{and} \quad h_m^+ = [(h-1) \bmod m] + 1$$
3.7 Innovations state space models for exponential
smoothing
We now introduce the state space models that underlie exponential smoothing methods.
For each method, there are two models.
• Model with additive errors
• Model with multiplicative errors.
The point forecasts for the two models are identical (provided the same parameter values
are used), but their prediction intervals will differ.
To distinguish the models with additive and multiplicative errors, we add an extra letter to
the front of the method notation. The triplet (E, T, S) refers to the three components: error,
trend and seasonality. So the model ETS(A, A, N) has additive errors, additive trend and
no seasonality; in other words, this is Holt's linear method with additive errors. Similarly,
ETS(M, M_d, M) refers to a model with multiplicative errors, a damped multiplicative
trend and multiplicative seasonality. The notation ETS(·, ·, ·) helps in remembering the
order in which the components are specified. ETS can also be considered an abbreviation
of Exponential Smoothing. [6]
3.7.1 ETS(A,N,N): simple exponential smoothing with additive
errors
As discussed in Section 3.2, the error correction form of simple exponential smoothing
is given by
$$\ell_t = \ell_{t-1} + \alpha e_t$$
where $e_t = y_t - \ell_{t-1}$ and $\hat{y}_{t|t-1} = \ell_{t-1}$. Thus $e_t = y_t - \hat{y}_{t|t-1}$ represents a one-step forecast
error, and we can write
$$y_t = \ell_{t-1} + e_t$$
To make this into an innovations state space model, all we need to do is specify the
probability distribution for $e_t$. For a model with additive errors, we assume that the one-step
forecast errors $e_t$ are normally distributed white noise with mean 0 and variance $\sigma^2$, i.e.
$e_t = \varepsilon_t \sim \mathrm{NID}(0, \sigma^2)$.
Then the equations of the model can be written as
$$y_t = \ell_{t-1} + \varepsilon_t \tag{3.7.1}$$
$$\ell_t = \ell_{t-1} + \alpha\varepsilon_t \tag{3.7.2}$$
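Equations (3.7.1)-(3.7.2) can be simulated directly, and filtering the simulated series with the same α and initial level recovers the innovations ε_t, which is what makes this an innovations state space model. A sketch (the parameter values are arbitrary):

```python
import numpy as np

# Simulate ETS(A,N,N): y_t = l_{t-1} + eps_t, l_t = l_{t-1} + alpha*eps_t.
rng = np.random.default_rng(7)
alpha, sigma, n, l0 = 0.3, 1.0, 500, 10.0

eps = rng.normal(0.0, sigma, size=n)
y = np.empty(n)
level = l0
for t in range(n):
    y[t] = level + eps[t]
    level += alpha * eps[t]

# Filtering with the same alpha and initial level recovers the
# innovations: e_t = y_t - l_{t-1}, then l_t = l_{t-1} + alpha*e_t.
lvl = l0
recovered = np.empty(n)
for t in range(n):
    recovered[t] = y[t] - lvl
    lvl += alpha * recovered[t]
print(np.allclose(recovered, eps))  # True
```

The recovered innovations match the simulated ones because the filter is exactly the inverse of the generating recursion; this is the sense in which the one-step forecast errors are the model's only source of randomness.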
3.7.2 ETS(M,N,N): simple exponential smoothing with multiplicative errors
In a similar fashion, we can specify models with multiplicative errors by writing the one-step random errors as relative errors:
$$\varepsilon_t = \frac{y_t - \hat{y}_{t|t-1}}{\hat{y}_{t|t-1}}$$
Substituting $\hat{y}_{t|t-1} = \ell_{t-1}$ gives $y_t = \ell_{t-1} + \ell_{t-1}\varepsilon_t$ and $e_t = y_t - \hat{y}_{t|t-1} = \ell_{t-1}\varepsilon_t$.
Then we can write the multiplicative form of the state space model as
$$y_t = \ell_{t-1}(1 + \varepsilon_t)$$
$$\ell_t = \ell_{t-1}(1 + \alpha\varepsilon_t)$$
3.7.3 State Space Models for Holt's Linear Method
We can now explain the ideas using Holt's linear method.
Additive Error Model: ETS(A,A,N)
Let $\mu_t = \hat{y}_t = \ell_{t-1} + b_{t-1}$ denote the one-step forecast of $y_t$, assuming we know the values
of all parameters. Also let $\varepsilon_t = y_t - \mu_t$ denote the one-step forecast error at time t. From
(3.3.3) we find that
$$y_t = \ell_{t-1} + b_{t-1} + \varepsilon_t \tag{3.7.3}$$
and using (3.3.1) and (3.3.2) we can write
$$\ell_t = \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t \tag{3.7.4}$$
$$b_t = b_{t-1} + \beta^*(\ell_t - \ell_{t-1} - b_{t-1}) = b_{t-1} + \alpha\beta^*\varepsilon_t \tag{3.7.5}$$
We simplify the last expression by setting $\beta = \alpha\beta^*$.
Multiplicative Error Model: ETS(M,A,N)
A model with multiplicative error can be derived similarly, by first setting $\varepsilon_t = (y_t - \mu_t)/\mu_t$, so
that $\varepsilon_t$ is a relative error. Then, following a similar approach to that for additive errors,
we find
$$y_t = (\ell_{t-1} + b_{t-1})(1 + \varepsilon_t)$$
$$\ell_t = (\ell_{t-1} + b_{t-1})(1 + \alpha\varepsilon_t)$$
$$b_t = b_{t-1} + \beta(\ell_{t-1} + b_{t-1})\varepsilon_t$$
3.7.4 State Space Models for All Exponential Smoothing Methods
The underlying equations for the additive error models are given in Tables 3.3 and 3.4. We use $\beta = \alpha\beta^*$
to simplify the notation. Multiplicative error models are obtained by replacing $\varepsilon_t$ with
$\mu_t\varepsilon_t$ in these equations; the resulting multiplicative error equations are also
given in Tables 3.3 and 3.4.
3.8 Initialization and Estimation
In order to use these models for forecasting, we need to specify the type of model to be
used (model selection), the value of $x_0$ (initialization), and the values of the parameters
$\alpha$, $\beta$, $\gamma$ and $\phi$ (estimation). In this section, we discuss initialization and estimation, leaving
model selection for later.
3.8.1 Initialization
The non-linear optimization requires some initial values. We use $\alpha = \beta = \gamma = 0.5$ and
$\phi = 0.9$. The initial values of $\ell_0$, $b_0$ and $s_k$ $(k = -m+1, \ldots, 0)$ are obtained using the
following heuristic scheme.
• Initial seasonal component.
1. For seasonal data, compute a $2 \times m$ moving average through the first few years
of data (we use up to four years if the data are available). Denote this by $\{f_t\}$,
$t = \frac{m}{2} + 1, \frac{m}{2} + 2, \ldots$
2. For additive seasonality, we detrend the data to obtain $y_t - f_t$. For multiplicative seasonality, we detrend the data to obtain $y_t/f_t$. Then compute initial seasonal
indices $s_{-m+1}, \ldots, s_0$ by averaging the detrended data for each season. Normalize these seasonal indices so that they add to zero for additive seasonality, and
add to m for multiplicative seasonality.
• Initial level component.
1. For seasonal data, compute a linear trend using linear regression on the first ten
seasonally adjusted values (using the seasonal indices obtained above) against
a time variable $t = 1, \ldots, 10$.
2. For nonseasonal data, compute a linear trend on the first ten observations
against a time variable $t = 1, \ldots, 10$. Then set $\ell_0$ to be the intercept of the
trend.
• Initial growth component.
1. For additive trend, set $b_0$ to be the slope of the trend.
2. For multiplicative trend, set $b_0 = 1 + b/a$, where a denotes the intercept and b
denotes the slope of the fitted trend.
These initial states are then refined by estimating them along with the parameters, as
described below.
3.8.2 Estimation and model selection
Let
$$L^*(\theta, x_0) = n \log\left(\sum_{t=1}^{n} \frac{e_t^2}{k^2(x_{t-1})}\right) + 2\sum_{t=1}^{n} \log|k(x_{t-1})| \tag{3.8.1}$$
Then $L^*$ is equal to twice the negative logarithm of the conditional likelihood function
of the state space model (with constant terms eliminated). An alternative to estimating
the parameters by minimizing the sum of squared errors is to maximize the likelihood.
The likelihood is the probability of the data arising from the specified model, so a large
likelihood is associated with a good model. For an additive error model, maximizing the
likelihood gives the same results as minimizing the sum of squared errors. However, different results will be obtained for multiplicative error models. In this section, we estimate
the smoothing parameters $\theta = (\alpha, \beta, \gamma, \phi)$ and initial states $x_0 = (\ell_0, b_0, s_0, s_{-1}, \ldots, s_{-m+1})$
by maximizing the likelihood. The possible values that the smoothing parameters can take
are restricted. Traditionally, the parameters have been constrained to lie between 0 and 1
so that the equations can be interpreted as weighted averages; that is, $0 < \alpha, \beta^*, \gamma^*, \phi < 1$.
For the state space models, we have set $\beta = \alpha\beta^*$ and $\gamma = (1-\alpha)\gamma^*$. Therefore the
traditional restrictions translate to $0 < \alpha < 1$, $0 < \beta < \alpha$ and $0 < \gamma < 1-\alpha$. In practice,
the damping parameter $\phi$ is usually constrained further to prevent numerical difficulties
in estimating the model. A common constraint is to set $0.8 < \phi < 0.98$. Another way
to view the parameters is through a consideration of the mathematical properties of the
state space models. Then the parameters are constrained to prevent observations in the
distant past from having a continuing effect on current forecasts. This leads to some admissibility constraints on the parameters, which are usually (but not always) less restrictive
than the usual region. [11]
3.9 Measures of Forecast Error
Due to the fundamental importance of time series forecasting in many practical situations, proper care should be taken when selecting a particular model, estimating forecast
accuracy, and comparing different models. Each of the following measures is a function of the
actual and forecasted values of the time series. In each of the forthcoming definitions, $y_t$ is
the actual value, $f_t$ is the forecasted value, $e_t = y_t - f_t$ is the forecast error and n is the
size of the test set.
Definition 3.9.1. The Mean Absolute Error (MAE)
The Mean Absolute Error (MAE) is defined as
$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} |e_t|$$
• It measures the average absolute deviation of the forecasted values from the original ones.
• In the MAE, the effects of positive and negative errors do not cancel out.
Definition 3.9.2. The Mean Absolute Percentage Error (MAPE)
The Mean Absolute Percentage Error (MAPE) is defined as
$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n} \left|\frac{e_t}{y_t}\right| \times 100 \tag{3.9.1}$$
• This measure represents the percentage of average absolute error that occurred.
• It is independent of the scale of measurement, but affected by data transformation.
Definition 3.9.3. The Mean Squared Error (MSE)
The Mean Squared Error (MSE) is defined as
$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n} e_t^2$$
The MSE gives an overall idea of the error that occurred during forecasting.
Definition 3.9.4. The Root Mean Squared Error (RMSE)
The Root Mean Squared Error (RMSE) is defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} e_t^2}$$
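The four measures can be computed in a few lines; the actual and forecast values below are made up for illustration:

```python
import math

def error_measures(actual, forecast):
    # MAE, MAPE, MSE and RMSE as defined above; n is the test-set size.
    e = [a - f for a, f in zip(actual, forecast)]
    n = len(e)
    mae = sum(abs(x) for x in e) / n
    mape = 100.0 * sum(abs(x / a) for x, a in zip(e, actual)) / n
    mse = sum(x * x for x in e) / n
    rmse = math.sqrt(mse)
    return mae, mape, mse, rmse

# Illustrative test-set values (not from the thesis data).
actual = [100.0, 200.0, 50.0]
forecast = [110.0, 190.0, 60.0]
mae, mape, mse, rmse = error_measures(actual, forecast)
print(mae, mape, mse, rmse)
```

Note that the MAPE division requires nonzero actual values, and that RMSE is always at least as large as MAE since squaring weights large errors more heavily.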
Part II
Case Study
Chapter 4
Data Analysis Using the Box-Jenkins Method
4.1 Data Description
The data of our study are monthly observations of electricity consumption in the Khan Younis
province during the period from January 2000 to December 2010. The data were taken
at the end of every month, giving a time series with a total of 132 observations. An
overview of the data from January 2000 to December 2010 is plotted in Figure 4.1.
Figure 4.1: Time series plot of monthly electricity consumption in the Khan Younis province
To get a general idea of the data, we show some descriptive statistics of the
time series in Table 4.1.
Table 4.1: Descriptive Statistics
Statistics
Min 4.761
Median 11.9
Mean 11.38
Max 19.610
4.1.1 The Box-Jenkins Approach to Fitting an ARIMA Model
We can see from Figure 4.1 that there seems to be seasonal variation in electricity
consumption.
One way to determine more objectively whether differencing is required is to use a unit root
test. These are statistical hypothesis tests of stationarity that are designed to determine
whether differencing is required.
A number of unit root tests are available; they are based on different assumptions
and may lead to conflicting answers. One of the most popular tests is the Augmented
Dickey-Fuller (ADF) test. The null hypothesis for an ADF test is that the data are non-stationary, so large p-values are indicative of non-stationarity and small p-values suggest
stationarity. Another popular unit root test is the Kwiatkowski-Phillips-Schmidt-Shin
(KPSS) test. This reverses the hypotheses, so the null hypothesis is that the data are
stationary. In this case, small p-values (e.g., less than 0.05) suggest that differencing is
required.
It is known that:
• For the ADF test, if the p-value ≤ 0.05, the process is stationary.
• For the KPSS test, if the p-value ≥ 0.05, the process is stationary.
The results are shown in Table 4.2. For the KPSS test, the p-value is 0.01, which is less
than 0.05; for the ADF test, the p-value equals 0.3, which is greater than 0.05.
Table 4.2: p-values of the Augmented Dickey-Fuller (ADF) test and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test for monthly electricity consumption
test electricity consumption
ADF 0.3
KPSS 0.01
Both results indicate that the time series of monthly electricity consumption is not
stationary. This is also investigated by examining the autocorrelation and
partial autocorrelation functions, shown in Figures 4.2 and 4.3 respectively.
Figure 4.2: ACF for monthly electricity consumption
Figure 4.3: PACF for monthly electricity consumption
4.2 Model Specification
The data are clearly non-stationary, with some seasonality, so we first take a seasonal
difference. The seasonally differenced data are shown in Figure 4.4.
This illustrates one way to make a time series stationary: compute the differences between
consecutive observations, which is known as differencing.
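Differencing itself is a one-line operation. A minimal Python sketch (the thesis uses R's diff; the series below is illustrative):

```python
import numpy as np

def difference(y, lag=1):
    """Return y_t - y_{t-lag}: a first difference uses lag=1,
    a seasonal difference of monthly data uses lag=12."""
    y = np.asarray(y, dtype=float)
    return y[lag:] - y[:-lag]

y = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 18.0])
print(difference(y))         # consecutive (first) differences
print(difference(y, lag=3))  # a "seasonal" difference with period 3
```

Note that each differencing pass shortens the series by `lag` observations, which is why heavily differenced models have fewer effective data points.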
Our aim now is to find an appropriate ARIMA model based on the ACF and PACF of the
differenced series, shown in Figures 4.4 and 4.5.
Non-seasonal behavior:
The significant spikes at lags 2 and 3 in the ACF suggest a non-seasonal MA(1)
component. There are also significant spikes at lags 11, 12 and 13.
Seasonal behavior:
We look at what is going on around lags 12, 24 and so on. The ACF has significant
spikes at lags 1, 12 and 24, which suggests a seasonal MA(1) component. Consequently, this
initial analysis suggests that a possible model for these data is ARIMA(0, 1, 1)(0, 1, 1)12.
The AICc of this model is 409.13, but its residuals show a significant spike at lag 36,
which indicates some additional seasonal term; the AICc of ARIMA(0, 1, 1)(0, 1, 0)12 is
456.97. We tried other models with AR terms as well, and the smallest AICc was obtained
by ARIMA(2, 1, 2)(1, 0, 1)12, which we therefore choose. Its residuals are shown in
Figure 4.5. Table 4.3 displays the models considered.
Figure 4.4: First difference of monthly electricity consumption
Table 4.3: SARIMA Models Criteria for the monthly electricity consumption
model AIC AICc BIC
ARIMA(0,1,1)(0,1,1)12 408.94 409.13 417.59
ARIMA(0,1,1)(0,0,1)12 402.86 403.03 411.77
ARIMA(0,1,1)(0,1,0)12 456.88 456.97 462.64
ARIMA(2,1,1)(1,0,1)12 402 402.61 419.82
ARIMA(2,2,1)(1,0,1)12 400.89 401.51 418.67
ARIMA(2,1,2)(1,0,1)12 388.83 389.66 409.62
ARIMA(1,1,1)(1,0,1)12 399.28 399.72 414.13
After calculating the errors of the models, the best fitting model is ARIMA(2, 1, 2)(1, 0, 1)12.
We now have a seasonal ARIMA model that passes the required checks and is ready for
forecasting. Forecasts from the model for the following years are shown in Figure 4.6. Notice
how the forecasts follow the recent trend in the data. The large and rapidly widening
prediction intervals show that electricity consumption could start increasing or decreasing
at any time: while the point forecasts follow the recent trend, the prediction intervals allow
the data to move away from it during the forecast period.
Figure 4.5: Residuals from the fitted ARIMA(2, 1, 2)(1, 0, 1)12 model
Figure 4.6: Forecasts for monthly electricity consumption
Now we predict the data of year 2011 using the ARIMA(2, 1, 2)(1, 0, 1)12 model and
compare the predicted data with the actual data; see Table 4.4.
Table 4.4: Comparison of predicted and actual data for 2011 using ARIMA(2, 1, 2)(1, 0, 1)12
Month of year 2011 Actual value Predicted value
Jan 15.40 16.15
Feb 17.12 15.54
Mar 16.23 16.75
Apr 17.47 16.19
May 17.62 17.10
Jun 18.08 17.34
Jul 17.73 18.25
Aug 18.35 17.22
Sep 16.03 18.44
Oct 19.61 18.82
Nov 18.63 16.43
Dec 18.35 18.59
Chapter 5
Data Analysis Using Exponential Smoothing Methods
5.1 Exponential smoothing models
Our aim in this section is to analyze the electricity data using exponential smoothing models
in order to find the best fitting one. In Chapter 3 we presented several exponential smoothing
methods, such as simple exponential smoothing, Holt's linear method and the Holt-Winters
damped method.
5.1.1 First Method: Simple Exponential Smoothing Model
As shown in Chapter 3, this method is suitable for forecasting data with no trend or
seasonal pattern:
Forecast equation: ŷt+1 = ℓt
Smoothing equation: ℓt = αyt + (1 − α)ℓt−1
We apply simple exponential smoothing to forecast electricity consumption in Khan Younis,
predicting the data in the period from Jan 2011 to Dec 2011. By changing the value of α
we obtain three models:
• Model 1, with α = 0.2; the model equation is
ℓt = 0.2yt + 0.8ℓt−1
• Model 2, with α = 0.6; the model equation is
ℓt = 0.6yt + 0.4ℓt−1
• Model 3, with α = 0.89; the model equation is
ℓt = 0.89yt + 0.11ℓt−1
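The three models differ only in the value of α; the recursion itself is short. A Python sketch of the level update (the thesis computed the forecasts in R; the data below are illustrative):

```python
def ses_forecast(y, alpha, level0=None):
    """Simple exponential smoothing: l_t = alpha*y_t + (1-alpha)*l_{t-1}.
    Returns the one-step-ahead forecast, which equals the final level."""
    level = y[0] if level0 is None else level0  # simple initialization
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level

y = [16.0, 17.0, 15.0, 18.0]
for alpha in (0.2, 0.6, 0.89):
    print(alpha, round(ses_forecast(y, alpha), 4))
```

Larger α weights recent observations more heavily; with α = 1 the forecast collapses to the last observation.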
Using R, the predicted data for year 2011 are given in Table 5.1. Figure 5.1 plots the data
over the period Jan 2000 - Dec 2010, which show a changing level over time
but no obvious trending behavior.
Table 5.1: Predicted data for year 2011 using simple exponential smoothing with three
different values of the smoothing parameter α.
Month of year 2011 Actual Value α = 0.2 α = 0.6 α = 0.89
Jan 15.40 15.29 16.15 16.39
Feb 17.12 15.31 15.70 15.63
Mar 16.23 15.67 16.55 16.77
Apr 17.47 15.78 16.36 16.36
May 17.62 16.12 17.02 17.21
Jun 18.08 16.42 17.38 17.52
Jul 17.73 16.75 17.80 17.95
Aug 18.35 16.95 17.76 17.79
Sep 16.03 17.23 18.11 18.22
Oct 19.61 17.71 19.01 19.29
Nov 18.63 17.37 17.22 16.78
Dec 18.35 17.62 18.06 18.20
Figure 5.1: Simple exponential smoothing applied to electricity consumption in the
Khan Younis province
We now calculate error measures for the three models in order to choose the best fitting
model; see Table 5.2.
Table 5.2: Error measures for the simple exponential smoothing models
Model RMSE MAE MAPE
Model1 1.246 0.9436 8.4063
Model2 1.0478 0.7477 6.6897
Model3 1.0350 0.7334 6.49301
As Table 5.2 shows, Model 3, ℓt = 0.89yt + 0.11ℓt−1, is the best fitting model, since it has
the smallest errors. Having determined the fitting model, we plot it in Figure 5.2.
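The RMSE, MAE and MAPE reported in Table 5.2 follow the standard definitions. A Python sketch of the computation, using the first four months of Table 5.1 (actual values and the α = 0.89 predictions) as input:

```python
import math

def error_measures(actual, predicted):
    """Return (RMSE, MAE, MAPE in percent) for paired actual/predicted values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / len(errors)
    return rmse, mae, mape

actual = [15.40, 17.12, 16.23, 17.47]     # Jan-Apr 2011 actuals (Table 5.1)
predicted = [16.39, 15.63, 16.77, 16.36]  # alpha = 0.89 predictions (Table 5.1)
print(error_measures(actual, predicted))
```

Note that the table's reported values are computed in R over the full sample, so this four-month illustration will not reproduce them exactly.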
Figure 5.2: Forecasts from simple exponential smoothing
5.1.2 Second Method: Holt's Linear Trend Method
Holt (1957) extended simple exponential smoothing to allow forecasting of data with a
trend. This method involves a forecast equation and two smoothing equations (one for
the level and one for the trend), as discussed in Chapter 3.
We use this method to predict the data in the period Jan 2011 to Dec 2011 with
two models:
• Model 1, with α = 0.8, β∗ = 0.2:
ℓt = 0.8yt + 0.2(ℓt−1 + bt−1)
bt = 0.2(ℓt − ℓt−1) + 0.8bt−1
• Model 2, with α = 0.6, β∗ = 0.4:
ℓt = 0.6yt + 0.4(ℓt−1 + bt−1)
bt = 0.4(ℓt − ℓt−1) + 0.6bt−1
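The pairs of equations above can be coded directly. A Python sketch of Holt's recursions (the thesis used R's holt(); the simple initialization ℓ1 = y1, b1 = y2 − y1 and the data are illustrative assumptions):

```python
def holt_forecast(y, alpha, beta, h=1):
    """Holt's linear trend method:
       l_t = alpha*y_t + (1-alpha)*(l_{t-1} + b_{t-1})
       b_t = beta*(l_t - l_{t-1}) + (1-beta)*b_{t-1}
    Returns the h-step-ahead forecast l_n + h*b_n."""
    level, trend = y[0], y[1] - y[0]  # simple start-up values
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + h * trend

y = [15.0, 16.0, 16.5, 17.2]
print(round(holt_forecast(y, alpha=0.8, beta=0.2), 4))
```

Unlike simple exponential smoothing, the forecast function is a line with slope b_n rather than a flat level.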
Using R, we calculate the predicted data from Model 1 and Model 2 and compare them
with the actual data; see Table 5.3.
Table 5.3: Comparison between actual data and predicted data of year 2011 using Holt's
linear method.
Month of year 2011 Actual Value α = 0.8,β = 0.2 α = 0.6,β = 0.4
Jan 15.40 16.78 16.82
Feb 17.12 15.77 16.11
Mar 16.23 17.16 17.10
Apr 17.47 16.58 16.76
May 17.62 17.59 17.53
Jun 18.08 17.92 17.95
Jul 17.73 18.39 18.43
Aug 18.35 18.09 18.25
Sep 16.03 18.57 18.56
Oct 19.61 19.84 19.70
Nov 18.63 17.12 16.62
Dec 18.35 18.37 18.01
We calculate the errors of the two models to choose the best fitting model. See Table 5.4.
Table 5.4: Error measures for the Holt trend models
Model ME RMSE MAE MPE MAPE MASE
Fit1 -0.04 1.11 0.77 -0.65 6.84 0.47
Fit2 -0.04 1.17 0.82 -0.56 7.48 0.50
As Table 5.4 shows, Model 1, with α = 0.8 and β∗ = 0.2,
ℓt = 0.8yt + 0.2(ℓt−1 + bt−1)
bt = 0.2(ℓt − ℓt−1) + 0.8bt−1
is the best fitting model, since it has the smallest errors. Figure 5.3 displays the data.
Figure 5.3: Forecasts from Holt’s linear method
5.1.3 Damped trend methods
The forecasts generated by Holt's linear method display a constant trend (increasing or
decreasing) indefinitely into the future. Even more extreme are the forecasts generated
by the exponential trend method, which include exponential growth or decline. Damped
trend methods introduce a damping parameter φ that flattens the trend as the forecast
horizon grows.
We use this method to predict the data in the period Jan 2011 to Dec 2011 with
two models, and choose the best fitting one. Table 5.5 shows the data.
• Model 1 Additive Damped Trend (ETS(A,Ad,M))
with α = 0.76,β = 0.0001 and φ = 0.98
• Model 2 Multiplicative Damped Trend ETS(M,Md, N)
with α = 0.76,β = 0.0001 and φ = 0.98
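The damping parameter φ replaces the h-step trend contribution h·bn with (φ + φ² + ... + φ^h)·bn, so long-horizon forecasts level off instead of growing without bound. A Python sketch of the additive damped recursion (the initialization and data are illustrative assumptions; the thesis fit these models in R):

```python
def damped_holt_forecast(y, alpha, beta, phi, h=1):
    """Additive damped trend method:
       l_t = alpha*y_t + (1-alpha)*(l_{t-1} + phi*b_{t-1})
       b_t = beta*(l_t - l_{t-1}) + (1-beta)*phi*b_{t-1}
    Forecast: l_n + (phi + phi**2 + ... + phi**h) * b_n."""
    level, trend = y[0], y[1] - y[0]  # simple start-up values
    for obs in y[1:]:
        prev_level, prev_trend = level, trend
        level = alpha * obs + (1 - alpha) * (prev_level + phi * prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * prev_trend
    damp = sum(phi ** i for i in range(1, h + 1))
    return level + damp * trend

y = [15.0, 16.0, 16.5, 17.2]
# With phi < 1 the forecasts flatten out at long horizons.
print(round(damped_holt_forecast(y, alpha=0.76, beta=0.0001, phi=0.98, h=12), 3))
```

Because the geometric sum converges to φ/(1 − φ), forecasts at very distant horizons approach a fixed value, which is the "flat line in the future" behavior of damped methods.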
Table 5.5: Comparison between actual data and predicted data of year 2011 using the
damped trend method
Month of year Actual Value ETS(A,Ad,M) ETS(M,Md, N)
Jan 15.40 16.39 16.37
Feb 17.12 15.65 15.68
Mar 16.23 16.78 16.75
Apr 17.74 16.37 16.39
May 17.62 17.21 17.20
Jun 18.08 17.53 17.53
Jul 17.73 17.96 17.95
Aug 18.35 17.80 17.81
Sep 16.03 18.22 18.22
Oct 19.61 19.29 19.26
Nov 19.63 16.82 16.92
Dec 18.35 18.20 18.18
After predicting the data for year 2011, we calculate the error measures of the two models
to choose the fitting model with the smallest errors. See Table 5.6.
Table 5.6: Error measures for the damped trend models
Model ETS(A,Ad,M) ETS(M,Md, N)
ME 0.029 0.026
RMSE 1.030 1.034
MAE 0.730 0.731
MPE -0.38 -0.34
MAPE 6.51 6.54
MASE 0.45 0.446
As Table 5.6 shows, the best fitting model is Model 1; we plot it in Figure 5.4. With
the exception of the multiplicative damped trend method, the smoothing parameter for
the slope is estimated to be essentially zero, indicating that the trend is not changing over
time. Of course, the trend estimated by the damped trend methods will change in the
future due to the damping.
Figure 5.4: Forecasts from Damped Holts method with exponential trend
5.1.4 Holt-Winters seasonal method
We employ the Holt-Winters method with both additive and multiplicative seasonality
to forecast electricity consumption in Khan Younis. We discuss two models, additive
and multiplicative, as shown in Section 3.4, in order to choose the best fitting model and
forecast the data of 2011:
• Model 1: the Holt-Winters additive model ETS(A,A,A),
with α = 0.76, β = 0.002 and γ = 0.001
• Model 2: the Holt-Winters multiplicative model ETS(M,A,M),
with α = 0.51, β = 0.09 and γ = 0.002
We apply the method with both additive and multiplicative seasonality to our data and
compare the actual values with the predicted data of year 2011; see Table 5.7.
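As a concrete illustration of the recursions from Section 3.4, here is a Python sketch of the additive Holt-Winters variant (the thesis fit both variants with R's hw(); the seasonal-average initialization and the toy quarterly series are assumptions for illustration):

```python
def holt_winters_additive(y, m, alpha, beta, gamma, h=1):
    """Additive Holt-Winters method:
       l_t = alpha*(y_t - s_{t-m}) + (1-alpha)*(l_{t-1} + b_{t-1})
       b_t = beta*(l_t - l_{t-1}) + (1-beta)*b_{t-1}
       s_t = gamma*(y_t - l_{t-1} - b_{t-1}) + (1-gamma)*s_{t-m}
    Forecast: l_n + h*b_n + the matching seasonal component."""
    level = sum(y[:m]) / m                       # mean of the first season
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]    # simple seasonal start-up
    for t in range(m, len(y)):
        prev_level, prev_trend = level, trend
        level = alpha * (y[t] - season[t - m]) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        season.append(gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * season[t - m])
    return level + h * trend + season[len(y) - m + (h - 1) % m]

# Toy quarterly series (m = 4) with a clear additive seasonal pattern.
y = [10, 14, 8, 12, 11, 15, 9, 13]
print(round(holt_winters_additive(y, m=4, alpha=0.5, beta=0.1, gamma=0.1, h=2), 3))
```

The multiplicative variant replaces the additive seasonal adjustments with ratios, which is what lets seasonal swings grow with the level of the series.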
Table 5.7: Predicted data for the months of 2011 using the Holt-Winters method with
both additive and multiplicative seasonality
Month of year 2011 Actual Value ETS(A,A,A) ETS(M,A,M)
Jan 15.40 15.82 15.48
Feb 17.12 16.27 16.39
Mar 16.23 16.79 16.79
Apr 17.47 16.97 16.88
May 17.62 17.51 17.24
Jun 18.08 18.02 18.58
Jul 17.73 18.18 18.22
Aug 18.35 17.76 17.70
Sep 16.03 18.05 17.90
Oct 19.61 18.52 18.16
Nov 18.63 17.33 18.02
Dec 18.35 18.01 17.55
The results show that the method with multiplicative seasonality fits the data best.
This was expected, as the time plot shows that the seasonal variation in the data increases
with the level of the series. This is also reflected in the two sets of forecasts:
the forecasts generated by the method with multiplicative seasonality show larger
and increasing seasonal variation as the level of the forecasts increases, compared with the
forecasts generated by the method with additive seasonality. See Figures 5.5 and 5.6.
Figure 5.5: Forecasting electricity data using Holt-Winters method with both additive
and multiplicative seasonality.
After predicting the data of year 2011, we calculate the error measures of the two models;
see Table 5.8.
Table 5.8: Error measures for the models with additive and multiplicative seasonal components
Model ME RMSE MAE MPE MAPE MASE
Model 1 -0.0089 0.953 0.659 -0.524 6.069 0.402
Model 2 -0.0019 0.944 0.644 -0.540 5.926 0.393
Figure 5.6: Estimated components for the Holt-Winters method with additive and multiplica-
tive seasonal components.
The best fitting model is the multiplicative model ETS(M,A,M), since it has the smallest
errors; we plot the data using Model 2 in Figure 5.7.
Figure 5.7: Forecasting data using multiplicative seasonal components
5.2 Summary
In this section we compare the Box-Jenkins method with the exponential smoothing
models in order to choose the best model for forecasting the data in the period Jan 2011
to Dec 2011.
First, we compare the fitted models from all the exponential smoothing methods and
choose the best one by calculating the errors. Table 5.9 displays the error measures of all
the models:
• Model 1: simple exponential smoothing with α = 0.89; the model equation is
ℓt = 0.89yt + 0.11ℓt−1
• Model 2: Holt's linear method with α = 0.8, β∗ = 0.2:
ℓt = 0.8yt + 0.2(ℓt−1 + bt−1)
bt = 0.2(ℓt − ℓt−1) + 0.8bt−1
• Model 3: additive damped trend ETS(A,Ad,M)
• Model 4: Holt-Winters multiplicative ETS(M,A,M)
Table 5.9: Error measures for the fitted models of all methods
Model RMSE MAE MAPE
Fit1 1.0350 0.7334 6.49301
Fit2 1.0766 0.7524 6.6129
Fit3 1.03 0.73 6.54
Fit4 0.9442 0.6445 5.9268
From Table 5.9, the best fitting model is Model 4, ETS(M,A,M); that is, the data are
seasonal with period m = 12.
Second, we compare the best fitting exponential smoothing model with the fitted
SARIMA model in order to choose the best model overall, which is the aim of our study.
The fitted ARIMA model is ARIMA(2, 1, 2)(1, 0, 1)12 and the fitted exponential
smoothing model is the Holt-Winters multiplicative model. Table 5.10 displays the predicted
data of year 2011 from both models.
Table 5.10: Comparison between actual data and predicted data from
ARIMA(2, 1, 2)(1, 0, 1)12 and ETS(M,A,M)
Month of year 2011 Actual value ETS(M,A,M) ARIMA(2, 1, 2)(1, 0, 1)12
Jan 15.40 15.48 16.15
Feb 17.12 16.39 15.54
Mar 16.23 16.79 16.75
Apr 17.47 16.88 16.19
May 17.62 17.24 17.10
Jun 18.08 18.58 17.34
Jul 17.73 18.22 18.25
Aug 18.35 17.70 17.22
Sep 16.03 17.90 18.44
Oct 19.61 18.16 18.82
Nov 18.63 18.02 16.43
Dec 18.35 17.55 18.59
Finally, the best model for our data is the Holt-Winters multiplicative model, since it has
the smallest error measures; see Table 5.11.
Table 5.11: Measure Error for ETS(M,A,M) and ARIMA(2, 1, 2)(1, 0, 1)12 Models
Model RMSE MAE MAPE
ETS(M,A,M) 0.9442 0.6445 5.9268
ARIMA(2, 1, 2)(1, 0, 1)12 0.963 0.708 6.244
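The comparison can be reproduced directly from Table 5.10. The Python sketch below computes each model's RMSE over the 2011 hold-out period from the tabulated values (these are computed from the twelve 2011 months only, so they need not match Table 5.11 exactly):

```python
import math

# Actual and predicted 2011 values, copied from Table 5.10.
actual = [15.40, 17.12, 16.23, 17.47, 17.62, 18.08,
          17.73, 18.35, 16.03, 19.61, 18.63, 18.35]
ets =    [15.48, 16.39, 16.79, 16.88, 17.24, 18.58,
          18.22, 17.70, 17.90, 18.16, 18.02, 17.55]
arima =  [16.15, 15.54, 16.75, 16.19, 17.10, 17.34,
          18.25, 17.22, 18.44, 18.82, 16.43, 18.59]

def rmse(a, p):
    return math.sqrt(sum((x - z) ** 2 for x, z in zip(a, p)) / len(a))

print("ETS(M,A,M) RMSE:", round(rmse(actual, ets), 3))
print("SARIMA RMSE:   ", round(rmse(actual, arima), 3))
```

The ETS(M,A,M) predictions track the 2011 actuals more closely than the SARIMA ones, consistent with the conclusion drawn from Table 5.11.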
Notation 5.2.1. Before carrying out the calculations, the data were multiplied by 10−6,
because the raw values are very large.
CONCLUSIONS
From the discussion in this study the following conclusions can be drawn:
• The electricity consumption data form a non-stationary time series.
• After treating the non-stationarity, the best Box-Jenkins model for electricity consumption
is ARIMA(2, 1, 2)(1, 0, 1)12. This was supported by the ACF, AIC, BIC and AICc.
• Among the exponential smoothing models, the best model is ETS(M,A,M). This was
supported by the error measures.
• Comparing the two fitted models, ARIMA(2, 1, 2)(1, 0, 1)12 and ETS(M,A,M), the
latter is the best fitting model.
• This result confirms that exponential smoothing models treat these data well and improve
on the fitted ARIMA model.
• Choosing the best model depends on several criteria (ACF, AIC, BIC, AICc and error
measures), as well as on the researcher's experience and judgment.
Recommendations
Exponential smoothing (ETS) models provide an alternative methodology to ARIMA
models for forecasting. We therefore recommend the following:
• Conduct more research and comparisons to ascertain the extent to which ARIMA models
are suitable in this area.
• We recommend using R, which gives good results.
• We recommend that practitioners in the statistical area use exponential smoothing
methods, because of their efficiency in forecasting.
Bibliography
[1] Box, G. E. P. and Jenkins, G. M., Time Series Analysis: Forecasting and Control,
Holden-Day, San Francisco, (1970).
[2] Brown, R. G., Statistical Forecasting for Inventory Control, McGraw-Hill, New
York, (1959).
[3] Brown, R. G., Smoothing, Forecasting and Prediction of Discrete Time Series, Prentice
Hall, Englewood Cliffs, New Jersey, (1963).
[4] Celia, F., Balaji, V., Les, S., Asish, G. and Amar, R., "A Fuzzy Forecasting Model for
Women's Casual Sales", International Journal of Clothing Science and Technology 15(2),
107-125, (2004).
[5] Degerine, S. and Lambert-Lacroix, S., "Partial autocorrelation function of a nonstationary
time series", Journal of Multivariate Analysis 87, 46-59, (2003).
[6] Gardner, E. S., Jr. and McKenzie, E., "Forecasting trends in time series", Management
Science, (1985).
[7] Wang, George C. S., "A Guide to Box-Jenkins Modeling", The Journal of Business
Forecasting, Spring (2008).
[8] Gottman, John M., Time Series Analysis: A Comprehensive Introduction for Social
Scientists, Cambridge University Press, (1981).
[9] Holt, C. C., "Forecasting trends and seasonals by exponentially weighted averages",
O.N.R. Memorandum 52/1957, Carnegie Institute of Technology, (1957).
[10] Holt, C. C., "Forecasting seasonals and trends by exponentially weighted moving
averages", International Journal of Forecasting, (2004).
[11] Hyndman, Rob J. and Athanasopoulos, George, Forecasting: Principles and Practice,
(2013). Available at http://www.otexts.org/fpp.
[12] Cryer, Jonathan D. and Chan, Kung-Sik, Time Series Analysis: With Applications in R,
2nd edition.
[13] Makridakis, S., Wheelwright, S. and McGee, E., Forecasting: Methods and Applications,
2nd ed., John Wiley and Sons, New York, USA, (1983).
[14] Makridakis, S., Wheelwright, S. and Hyndman, R., Forecasting: Methods and
Applications, 3rd ed., John Wiley and Sons, New York, USA, (1998).
[15] Brockwell, Peter J. and Davis, Richard A., Introduction to Time Series and Forecasting,
2nd edition.
[16] Shumway, Robert H. and Stoffer, David S., Time Series Analysis and Its Applications:
With R Examples, 2nd edition.
[17] Hyndman, Rob J., Koehler, Anne B., Ord, J. Keith and Snyder, Ralph D., Forecasting
with Exponential Smoothing, (2008).
[18] Shaw, Simon, "Exponential Smoothing Example", [email protected],
2003/04 semester II, (2003).
[19] Taylor, J. W., "Exponential Smoothing with a Damped Multiplicative Trend",
International Journal of Forecasting, (2003a).
[20] Taylor, J. W., "Short-term electricity demand forecasting using double seasonal
exponential smoothing", Journal of the Operational Research Society, (2003b).
[21] Wedding II, D. K. and Cios, K. J., "Time series forecasting by combining RBF networks,
certainty factors, and the Box-Jenkins model", (1996).
[22] Winters, P. R., "Forecasting sales by exponentially weighted moving averages",
Management Science, (1960).
Appendix
R Codes Used in My Thesis
First part: SARIMA
# The forecast package is needed for Arima(), auto.arima() and tsdisplay():
library(forecast)
win.graph(width=4.875, height=2.5, pointsize=8)
data(electts)
plot(electts, ylab='electricity consumption', xlab='time', type='o')
par(mfrow=c(2,1))
acf(as.vector(electts), main="")
pacf(as.vector(electts), main="")
tsdisplay(electts, main="")
# Examples of ARIMA fitting:
fit <- Arima(as.vector(electts), order=c(0,1,3))
fit <- Arima(as.vector(electts), order=c(0,0,3))
fit <- auto.arima(as.vector(electts), seasonal=FALSE)
tsdisplay(diff(electts), main="")
Part 2: Exponential smoothing models
fit1 <- ses(electts, alpha=0.2, initial="simple", h=3)
fit2 <- ses(electts, alpha=0.6, initial="simple", h=3)
fit3 <- ses(electts, h=3)
plot(fit1, plot.conf=FALSE, ylab="electricity consumption",
xlab="Year", main="", fcol="white", type="o")
lines(fitted(fit1), col="blue", type="o")
lines(fitted(fit2), col="red", type="o")
lines(fitted(fit3), col="green", type="o")
lines(fit1$mean, col="blue", type="o")
lines(fit2$mean, col="red", type="o")
lines(fit3$mean, col="green", type="o")
legend("topleft",lty=1, col=c(1,"blue","red","green"),
c("data", expression(alpha == 0.2), expression(alpha == 0.6),
expression(alpha == 0.89)),pch=1)
elect <- window(electts,start=2000,end=2012)
fit1 <- holt(elect, alpha=0.8, beta=0.2, initial="simple", h=5)
fit2 <- holt(elect, alpha=0.4, beta=0.6, initial="simple", exponential=TRUE, h=5)
# Results for first model:
fit1$model$state
fitted(fit1)
fit1$mean
plot(fit2, type="o", ylab="electricity consumption", xlab="Year",
fcol="white", plot.conf=FALSE)
lines(fitted(fit1), col="blue")
lines(fitted(fit2), col="red")
lines(fitted(fit3), col="green")
lines(fit1$mean, col="blue", type="o")
lines(fit2$mean, col="red", type="o")
lines(fit3$mean, col="green", type="o")
legend("topleft", lty=1, col=c("black","blue","red","green"),
c("Data","Holt’s linear trend","Exponential trend","Additive damped trend"))
# Level and slope components for Holt's linear trend method and the
# additive damped trend method.
fit1 <- ses(electts)
fit2 <- holt(electts)
fit3 <- holt(electts, exponential=TRUE)
fit4 <- holt(electts, damped=TRUE)
fit5 <- holt(electts, exponential=TRUE, damped=TRUE)
# Results for first model:
fit1$model
accuracy(fit1) # training set
plot(fit2$model$state)
plot(fit4$model$state)
plot(fit1$model$state)
plot(fit3$model$state)
plot(fit3, type="o", ylab="elect",
flwd=1, plot.conf=FALSE)
lines(window(electts,start=2001),type="o")
lines(fit1$mean,col=2)
lines(fit2$mean,col=3)
lines(fit4$mean,col=5)
lines(fit5$mean,col=6)
legend("topleft", lty=1, pch=1, col=1:6,
c("Data","SES","Holt’s","Exponential",
"Additive Damped","Multiplicative Damped"))
elect <- window(electts,start=2000)
fit1 <- hw(electts,seasonal="additive")
fit2 <- hw(electts,seasonal="multiplicative")
plot(fit2,ylab="consumption",
plot.conf=FALSE, type="o", fcol="white", xlab="Year")
lines(fitted(fit1), col="red", lty=2)
lines(fitted(fit2), col="green", lty=2)
lines(fit1$mean, type="o", col="red")
lines(fit2$mean, type="o", col="green")
legend("topleft",lty=1, pch=1, col=1:3,
c("data","Holt Winters’ Additive","Holt Winters’ Multiplicative"))
states <- cbind(fit1$model$states[,1:3],fit2$model$states[,1:3])
colnames(states) <- c("level","slope","seasonal","level","slope","seasonal")
plot(states, xlab="Year")
fit1$model$state[,1:3]
fitted(fit1)
fit1$mean