The Islamic University of Gaza
Faculty of Science
Department of Mathematics
Electricity Consumption Forecasting in the Khan Younis Province
Using Exponential Smoothing and Box-Jenkins Methods:
A Modeling Viewpoint
Submitted By
RANA MAHMOUED ABU AL RISH
Supervised By
Dr. Bisher M. Iqelan
A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS
FOR THE DEGREE OF MASTER OF MATHEMATICS
June, 2015
To my parents...
To my son Ryad...
To my husband Ahmed...
And to all knowledge seekers...
Contents
Acknowledgments ix
Abbreviation x
Abstract 1
Literature Review 2
Introduction 3
I 5
1 Introduction 6
1.1 Examples Of Time Series . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Properties of Time series . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Stationary Time Series . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Box-Jenkins methodology 13
2.1 Models for Stationary Time Series . . . . . . . . . . . . . . . . . . . . 13
2.1.1 General Linear Processes . . . . . . . . . . . . . . . . . . . . . . 13
2.1.2 Autoregressive Process . . . . . . . . . . . . . . . . . . . . . . . 14
2.1.3 Moving Average Processes . . . . . . . . . . . . . . . . . . . . . 16
2.1.4 Autoregressive Moving Average Model . . . . . . . . . . . . . 17
2.2 Models for non Stationary Time Series . . . . . . . . . . . . . . . . . 21
2.2.1 Multiplicative Seasonal ARIMA Models . . . . . . . . . . . . 22
2.3 Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3.1 Model Identification . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3.2 Parameter Estimation of the SARIMA Model . . . . . . . . . 27
2.3.3 Diagnostics Checking Of The Fitted Model . . . . . . . . . . 27
2.3.4 Forecasting the study variable . . . . . . . . . . . . . . . . . . . . . 28
3 Exponential Smoothing 29
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.1.1 Classification of Exponential Smoothing Methods . . . . . . 30
3.1.2 Point Forecasts for the Best-Known Methods . . . . . . . . . 31
3.2 Simple Exponential Smoothing (N,N Method) . . . . . . . . . . . . . . . . 31
3.3 Holt Linear Method (A,N Method) . . . . . . . . . . . . . . . . . . . 34
3.4 Damped Trend Method (Ad, A Method) . . . . . . . . . . . . . . . . 35
3.4.1 Additive damped trend . . . . . . . . . . . . . . . . . . . . . . . 35
3.5 Holt-Winters Trend and Seasonality Method . . . . . . . . . . . . . 36
3.5.1 Additive Seasonality (A,A Method) . . . . . . . . . . . . . . . 37
3.6 General Point Forecasting Equations . . . . . . . . . . . . . . . . . . 38
3.7 Innovations state space models for exponential smoothing . . . . . 39
3.7.1 ETS(A,N,N): simple exponential smoothing with additive
errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.7.2 ETS(M,N,N): simple exponential smoothing with multi-
plicative errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.7.3 State Space Models for Holts Linear Method . . . . . . . . . 41
3.7.4 State Space Models for All Exponential Smoothing Methods 42
3.8 Initialization and Estimation . . . . . . . . . . . . . . . . . . . . . . . 44
3.8.1 Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.8.2 Estimation and model selection . . . . . . . . . . . . . . . . . . . . 45
3.9 Error Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
II Case Study 49
4 Analysis Data Using Box-Jenkins Method 50
4.1 Data Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.1.1 The Box-Jenkins Approach to Fitting ARIMA Model: . . . . . . . 51
4.2 Model Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5 Analysis Data Using Exponential Smoothing Methods 58
5.1 Exponential Smoothing Models . . . . . . . . . . . . . . . . . . . . . . . . 58
5.1.1 First Method : Simple Exponential Smoothing Model . . . 58
5.1.2 Second Method: Holt's Linear Trend Method . . . . . . . . . 61
5.1.3 Damped trend methods . . . . . . . . . . . . . . . . . . . . . . 63
5.1.4 Holt-Winters seasonal method . . . . . . . . . . . . . . . . . . 66
5.2 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
CONCLUSIONS 73
Recommendations 74
References 75
Appendix 77
List of Figures
1.1 Average Monthly Temperatures, Dubuque, Iowa . . . . . . . . . . . . . . . 7
1.2 Monthly Temperatures, Dubuque, Iowa . . . . . . . . . . . . . . . . . . . . 8
2.1 Simulated AR(1) process with φ = 0.9 . . . . . . . . . . . . . . . . . . . . 15
2.2 Simulated MA(1) process with θ = 0.9 . . . . . . . . . . . . . . . . . . . . 17
3.1 Oil production in Saudi Arabia from 1996 to 2007 . . . . . . . . . . . . . . 34
4.1 Time series plot of monthly electricity consumption in the Khan Younis
province . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2 Acf for monthly electricity consumption . . . . . . . . . . . . . . . . . . . 52
4.3 Pacf for monthly electricity consumption . . . . . . . . . . . . . . . . . . . 53
4.4 First difference of monthly electricity consumption . . . . . . . . . . . . . 54
4.5 Residuals from the fitted ARIMA(2, 1, 2)(1, 0, 1)12 model . . . . . . . . . . 55
4.6 Forecasts for monthly electricity consumption . . . . . . . . . . . . . . . . 56
5.1 Simple exponential smoothing applied to electricity consumption in province
Khan Younis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2 Forecasts from simple exponential smoothing . . . . . . . . . . . . . . . . . 61
5.3 Forecasts from Holt’s linear method . . . . . . . . . . . . . . . . . . . . . . 63
5.4 Forecasts from damped Holt's method with exponential trend . . . . . . . 65
5.5 Forecasting electricity data using Holt-Winters method with both additive
and multiplicative seasonality. . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.6 Estimated components for Holt-Winters method with additive and multi-
plicative seasonal components . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.7 Forecasting data using multiplicative seasonal components . . . . . . . . . 70
List of Tables
2.1 Behavior of the ACF and the PACF for ARMA Models . . . . . . . . . . 18
4.1 Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2 P-values of the Augmented Dickey-Fuller (ADF) test and the Kwiatkowski-
Phillips-Schmidt-Shin (KPSS) test for monthly electricity consumption . . 52
4.3 SARIMA model criteria for the monthly electricity consumption . . . . . . 55
4.4 Comparison between predicted data of 2011 using ARIMA(2, 1, 2)(1, 0, 1)12
and actual data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.1 Prediction data for year 2011 using simple exponential smoothing with
three different values for the smoothing parameter α. . . . . . . . . . . . . 59
5.2 Error measures for simple exponential smoothing models . . . . . . . . . 60
5.3 Comparison between actual data and predicted data of year 2011 using
Holt's linear method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.4 Error measures for Holt trend model . . . . . . . . . . . . . . . . . . . . . 62
5.5 Comparison between actual data and predicted data of year 2011 using
damped trend method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.6 Error measures for damped trend models . . . . . . . . . . . . . . . . . . 65
5.7 Predicted data for the months of year 2011 using Holt-Winters method
with both additive and multiplicative seasonality . . . . . . . . . . . . . . 67
5.8 Error measures for the additive and multiplicative seasonal models . . . . 68
5.9 Error measures for the fitted models of all methods . . . . . . . . . . . . . 71
5.10 Comparison between actual data and predicted data by ARIMA(2, 1, 2)(1, 0, 1)12
and ETS(M,A,M) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.11 Error measures for ETS(M,A,M) and ARIMA(2, 1, 2)(1, 0, 1)12 models . 72
Acknowledgments
Praise be to Almighty ALLAH, who has always helped and guided me in bringing this
work to light. I am also grateful to my supervisor, Dr. Bisher M. Iqelan, for suggesting the
topic of the thesis and for his tremendous support and healthy ideas; it has been a privilege
to work with him. My special thanks to all members of the Mathematics Department at the
Islamic University of Gaza for their help and teaching. Thanks also to my parents, son,
husband and family members, who have always surrounded me with love and fortitude.
Abbreviation
ACF Autocorrelation Function.
ACVF Autocovariance Function.
ADF Augmented Dickey-Fuller.
AR(p) Autoregressive model of order p.
ARMA(p, q) Autoregressive moving average model of order (p, q).
ARIMA(p, d, q) Integrated autoregressive moving average model of order (p, d, q).
AIC Akaike's Information Criterion.
AICc AIC, bias corrected.
BIC Bayesian Information Criterion.
KPSS Kwiatkowski-Phillips-Schmidt-Shin.
SARIMA Seasonal Autoregressive Integrated Moving Average model.
SES Simple Exponential Smoothing.
SIC Schwarz's Information Criterion.
MA(q) Moving average model of order q.
MAE Mean Absolute Error.
MAPE Mean Absolute Percentage Error.
MSE Mean Squared Error.
NID Normally and Independently Distributed.
PACF Partial Autocorrelation Function.
RMSE Root Mean Squared Error.
RSS Residual Sum of Squares.
WN White Noise.
Abstract
Time series analysis can be used to extract information hidden in data. The classical
techniques for time series data analysis are the linear time series models, including
the moving average (MA) models, the autoregressive (AR) models, the autoregressive
moving average (ARMA) models and the seasonal autoregressive integrated moving
average (SARIMA) models. We present these models in detail and show their
important characteristics and the methods for finding their parameters, autocovariance,
autocorrelation and partial autocorrelation functions. We also present the details of
exponential smoothing models and their methods, namely the simple exponential
smoothing model, Holt's linear method, the damped trend method and the Holt-Winters
trend and seasonality method.
In this thesis we have used Box-Jenkins models and exponential smoothing models to
analyze the electricity data of the Khan Younis province over the period 2000-2010, and
we compare the two approaches in order to choose the best-fitting model for forecasting
the period January 2011 to December 2011. After comparison, the best model is the
exponential smoothing model. All computations were carried out in the R environment.
Literature Review
Many scientists and researchers have studied time series. The mathematician Fourier
first touched on time series in 1807, when he represented a time series as an infinite series
of sine and cosine functions; this representation has been named the Fourier series, and it
was adopted by Schuster (1906) and Beveridge (1922). The theory and practice
of time series analysis have developed rapidly since the appearance in 1970 of the seminal
work of George E. P. Box and Gwilym M. Jenkins, Time Series Analysis: Forecasting and
Control, now available in its third edition (1994) with co-author Gregory C. Reinsel. Many
books on time series have appeared since then, but some of them give too little practical
application, while others give too little theoretical background; a good treatment presents
both application and theory at a level accessible to a wide variety of students and
practitioners, mixing the two as they are naturally needed. Shumway and Stoffer (1999)
presented examples of time series and Box-Jenkins models. Peter J. Brockwell and Richard
A. Davis (2001) studied time series and forecasting. In 2003, Degerine and Lambert-Lacroix
studied concepts of time series. In 1998, Makridakis et al. studied exponential smoothing
methods for time series; in 2002, Celia F., Balaji V., Les S., Asish G. and Amar R. applied
the simple exponential smoothing and Holt-Winters methods to sales of women's clothing.
Simon (2003) studied exponential smoothing methods for time series and derived formulas
to calculate the average and σ. In 2008, Rob J. Hyndman, Anne B. Koehler, J. Keith Ord
and Ralph D. Snyder studied exponential smoothing methods and forecasting.
Introduction
The prediction of the future behavior of a time series is an important issue in the statistical
sciences, since it is needed in all areas of life, such as the prediction of air temperatures.
Most countries rely in their plans and development programs on sound bases and advanced
methods in order to reach more effective results, and census data play a key role in building
these plans and programs. These studies have produced a range of statistical and
mathematical methods for prediction, and one of the most important problems facing
researchers when analyzing a time series is whether or not the series is stationary, which
affects the choice of mathematical model. In this thesis, we study the electricity
consumption over the period 2000-2010 in the Khan Younis province and treat it using
two approaches:
1. Box-Jenkins models
2. Exponential smoothing models
in order to choose the best model for forecasting the year 2011.
This thesis is organized as follows. We start by recalling background on time series and
their properties in Chapter 1, which contains three sections: Examples of Time Series,
Properties of Time Series and Stationary Time Series.
Chapter 2 introduces Box-Jenkins models and contains three sections: models for
stationary time series (moving average (MA), autoregressive (AR) and autoregressive
moving average (ARMA) models); models for nonstationary time series (seasonal
autoregressive integrated moving average (SARIMA) models); and forecasting.
Chapter 3 discusses exponential smoothing (ETS) models and the important methods of
this family, namely the simple exponential smoothing model, Holt's linear method, the
damped trend method and the Holt-Winters trend and seasonality method, and we study
the properties of all these methods. Chapters 4 and 5 give the results of applying the two
approaches to the electricity data.
Part I
Chapter 1
Introduction
In this chapter, we introduce some basic ideas of time series analysis, and we will study
some properties of time series in Section 1.2.
The purposes of time series analysis are generally:
1. to understand or model the stochastic mechanism that gives rise to an observed series;
2. to predict or forecast the future values of a series based on the history of that series
and, possibly, other related series or factors;
3. to describe the characteristics of the observed oscillations.
This chapter contains three sections: Examples of Time Series, Properties of Time Series
and Stationary Time Series.
Definition 1.0.1. A time series is a set of observations Yt, each one being recorded at
a specific time t; equivalently, it is a sequence of data points, typically measured at
successive time instants at uniform time intervals.
1.1 Examples Of Time Series
Example 1.1.1. Average Monthly Temperatures, Dubuque, Iowa
Figure 1.1 shows the average monthly temperatures in Dubuque, Iowa, from 1964 to 1976.
The climate is warm during summer, when temperatures tend to be in the 70s (degrees
Fahrenheit), and very cold during winter, when temperatures tend to be in the 20s. The
warmest month of the year is July, with an average maximum temperature of 82.8 degrees
Fahrenheit, while the coldest month of the year is January, with an average minimum
temperature of 16.9 degrees Fahrenheit. This time series displays a very regular pattern
called seasonality. Seasonality for monthly values occurs when observations twelve months
apart are related in some manner or another. All Januaries and Februaries are quite cold;
they are similar in value and different from the temperatures of the warmer months of
June, July and August, for example. There is still variation among the January values
and variation among the June values. Models for such series must accommodate this
variation while preserving the similarities. Here the reason for the seasonality is well
understood: the Northern Hemisphere's changing inclination toward the sun. For more
details see [7].
Figure 1.1: Average Monthly Temperatures, Dubuque, Iowa
Figure 1.2: Monthly Temperatures, Dubuque, Iowa
1.2 Properties of Time series
This section describes the fundamental concepts in the theory of time series models. In
particular, we introduce the concepts of the mean, variance and covariance functions,
stationary processes, autocorrelation functions and partial autocorrelation functions. For
more details see [15] and [16].
Definition 1.2.1. Mean Function
For any time series {Yt} the mean function denoted by µt is defined as
µt = E(Yt) (1.2.1)
Definition 1.2.2. The Auto Covariance Function
For any time series {Yt}, the autocovariance function (ACV F ) of {Yt}, denoted by
γY (t, s), is defined as the second moment product
γY (t, s) = cov(Yt, Ys) (1.2.2)
= E[(Yt − µt)(Ys − µs)] (1.2.3)
= E(YtYs) − µtµs (1.2.4)
for all time points s and t. When no confusion is possible about which time series
we are referring to, we will drop the subscript and write γY (t, s) as γt,s.
It is clear that, for s = t, the autocovariance reduces to the variance, because
γY (t, t) = V ar(Yt) (1.2.5)
= E[Yt − E(Yt)]² (1.2.6)
Note that γY (t, s) = γY (s, t) and |γt,s| ≤ √(γt,t γs,s).
Definition 1.2.3. The Autocorrelation Function
The autocorrelation function (ACF ) of the time series {Yt}, denoted by ρt,s, is defined
as follows:
ρt,s = Corr(Yt, Ys) (1.2.7)
= Cov(Yt, Ys) / √(V ar(Yt) V ar(Ys)) (1.2.8)
The ACF measures the linear predictability of the series at time t, say Yt, using only
the value Ys. We note that −1 ≤ ρt,s ≤ 1; values of ρt,s near ±1 indicate strong linear
dependence, whereas values near zero indicate weak linear dependence, and if ρt,s = 0 we
say that Yt and Ys are uncorrelated.
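In practice the ACF is estimated from data by replacing the moments above with their sample counterparts. The sketch below is an illustrative Python version (the thesis's own computations use R; the function name `sample_acf` is ours), estimating ρh at lags h = 0, 1, ..., max_lag:

```python
def sample_acf(y, max_lag):
    """Sample autocorrelation r_h = c_h / c_0, where
    c_h = (1/n) * sum_{t} (y[t] - ybar)(y[t+h] - ybar)."""
    n = len(y)
    ybar = sum(y) / n
    c0 = sum((v - ybar) ** 2 for v in y) / n
    return [
        sum((y[t] - ybar) * (y[t + h] - ybar) for t in range(n - h)) / n / c0
        for h in range(max_lag + 1)
    ]

# A strongly trending series is highly correlated with its own recent past,
# so its sample ACF stays near 1 at small lags:
r = sample_acf(list(range(1, 21)), 3)
```

For a white noise series, by contrast, all sample autocorrelations at lags h ≥ 1 would be close to zero.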
1.3 Stationary Time Series
The preceding definitions of the mean and auto covariance functions are completely gen-
eral. Although we have not made any special assumptions about the behavior of the time
series, many of the preceding examples have hinted that a sort of regularity may exist
over time in the behavior of a time series. We introduce the notion of regularity using a
concept called stationarity.[1]
Definition 1.3.1. Strict Stationarity
A time series Yt is said to be strictly stationary if the joint distribution of {Yt1 , Yt2 , . . . , Ytn}
is the same as the joint distribution of {Yt1+h, Yt2+h, . . . , Ytn+h} for every n and every shift h.
Definition 1.3.2. Weak Stationarity
A time series Yt is said to be weakly stationary if
1. the mean function µt is constant, and
2. the covariance function γs,t depends on s and t only through their difference |s − t|.
In the literature, stationarity usually means weak stationarity, unless otherwise specified.
One important case where weak stationarity implies strict stationarity is when the time
series is Gaussian, which means that the finite-dimensional distributions of {Yt} are all
multivariate Gaussian, i.e. the joint distribution FYt,Yt+j1,...,Yt+jn(yt, yt+j1, ..., yt+jn) is
Gaussian.
Example 1.3.1. Random walk
Let {St : t = 1, 2, . . .} be a sequence of independent identically distributed random
variables, each with zero mean and variance σ². The observed time series {Yt : t = 1, 2, . . .}
is constructed as follows:
Y1 = S1
Y2 = S1 + S2
...
Yt = S1 + S2 + · · · + St
Then E(Yt) = 0 and V ar(Yt) = tσ² for all t, and for h ≥ 0,
γY (t + h, t) = Cov(Yt+h, Yt)
= Cov(Yt + St+1 + St+2 + · · · + St+h, Yt)
= Cov(Yt, Yt)
= V ar(Yt)
= tσ²
Since γY (t + h, t) depends on t, the series {Yt} is not stationary.
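This nonstationarity is easy to confirm by simulation: the spread of the walk at time t grows like tσ². Below is an illustrative Python sketch (the helper name `simulate_random_walk` and the Gaussian choice of steps are our assumptions, not the thesis's):

```python
import random

def simulate_random_walk(n, sigma=1.0, seed=0):
    """Return Y_1, ..., Y_n with Y_t = S_1 + ... + S_t and S_i iid N(0, sigma^2)."""
    rng = random.Random(seed)
    y, total = [], 0.0
    for _ in range(n):
        total += rng.gauss(0.0, sigma)
        y.append(total)
    return y

# Var(Y_t) = t * sigma^2, so at t = 50 the variance across many independent
# replications of the walk should be close to 50:
finals = [simulate_random_walk(50, seed=s)[-1] for s in range(2000)]
mean = sum(finals) / len(finals)
var_hat = sum((v - mean) ** 2 for v in finals) / len(finals)
```

Repeating the experiment with a larger t gives a proportionally larger variance, which is exactly the failure of weak stationarity.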
Notation 1.3.1. Because the mean function µt = E(Yt) of a stationary time series is
independent of time t, we will write
µt = µ
Also, because the covariance function γs,t of a stationary time series depends on s and
t only through their difference |s − t|, we may simplify the notation. Let s = t + h, where
h represents the time shift or lag; then
γ(t+h,t) = cov(Yt+h, Yt)
= E[(Yt+h − µ)(Yt − µ)]
= E[(Yh − µ)(Y0 − µ)]
= γ(h,0)
does not depend on the time argument t. We have assumed that V ar(Yt) = γ(0,0) < ∞.
Henceforth, for convenience, we will drop the second argument of γ(h,0).
Definition 1.3.3. The auto covariance function (ACV F )
The auto covariance function of a stationary time series will be written as
γ(h) = Cov(Yt+h, Yt) (1.3.1)
= E[(Yt+h − µ)(Yt − µ)] (1.3.2)
A final useful property is that the autocovariance function of a stationary series is
symmetric around the origin, that is,
γh = γ−h (1.3.3)
Proposition 1.3.2. (Properties of the Autocovariance Function (ACV F ))
The autocovariance function (ACV F ) of a stationary time series Yt has the following
properties:
• Nonnegativity: γ0 ≥ 0
• Boundedness: |γh| ≤ γ0, for any h ∈ Z
• Symmetry: γh = γ−h
• γ(t,s) = γ(0,|s−t|)
Proof. See [16]
Definition 1.3.4. The Autocorrelation Function (ACF )
The autocorrelation function (ACF ) of a stationary time series will be written as
ρh = γ(t+h,t) / √(γ(t+h,t+h) γ(t,t)) (1.3.4)
= γh / γ0 (1.3.5)
Proposition 1.3.3. (Properties of Autocorrelation Function (ACF ))
The autocorrelation function ρ(h) of a stationary time series Yt has the following proper-
ties:
• ρ0 = 1
• |ρh| ≤ 1, for all h ∈ Z.
• ρh = ρ−h
Proof. See [16]
Definition 1.3.5. (The Partial Autocorrelation Function (PACF ))
The partial autocorrelation function (PACF ) of a time series {Yt}, denoted by φkk, is
defined as
φkk = corr(Yt, Yt−k | Yt−1, Yt−2, . . . , Yt−k+1) (1.3.6)
that is, the correlation between Yt and Yt−k after removing the linear effect of the
intervening observations.
In this chapter we have studied the basic notions of time series and their properties, such
as the mean, variance, covariance, autocovariance, autocorrelation and partial
autocorrelation functions, and we have studied types of time series. In the next chapter
we will study the Box-Jenkins models: the moving average, autoregressive and
autoregressive moving average models.
Chapter 2
Box-Jenkins methodology
2.1 Models for Stationary Time Series
This chapter discusses the basic concepts of a broad class of parametric time series models:
the autoregressive moving average (ARMA) models. These models have assumed great
importance in modeling real-world processes. For more details see [7].
2.1.1 General Linear Processes
We will study a class of linear models, called linear time series models, that are designed
specifically for modeling the dynamic behavior of time series. These include the moving
average (MA), autoregressive (AR) and autoregressive moving average (ARMA) models.
Definition 2.1.1. A time series {Yt} is a linear process if it has the representation
Yt = et + ψ1et−1 + ψ2et−2 + · · · (2.1.1)
or
Yt = ∑_{j=0}^{∞} ψj et−j
for all t, where the et have zero mean and variance σ² and {ψj} is a sequence of constants
with ∑_{j=1}^{∞} ψ²j < ∞ and ψ0 = 1.
Definition 2.1.2. (White Noise)
A time series Yt is said to be white noise with mean zero and variance σ², written as
Yt ∼ WN(0, σ²),
if and only if {Yt} has zero mean and covariance function
γh = σ² if h = 0, and γh = 0 if h ≠ 0.
It is clear that a white noise process is stationary.
Definition 2.1.3. (Back Shift Operator)
For any time series {Yt} the Back Shift Operator is defined by
BYt = Yt−1
and extend it to powers B2Yt = B(BYt) = BYt−1 = Yt−2 and so on. Thus
BkYt = Yt−k (2.1.2)
An important part of time series analysis is the selection of a suitable model for
data. These models are very important tool for forecasting. We will take three famous
models: Autoregressive (AR) model, Moving average (MA) model and Autoregression
Moving average (ARMA) model. These models are very important in modeling real
world processes. We can rewrite the time series models by simplified and useful formula
using Back Shift Operator B.
2.1.2 Autoregressive Process
Definition 2.1.4. Autoregressive Process
The autoregressive process of order p, denoted by AR(p), is defined as
Yt = φ1Yt−1 + φ2Yt−2 + ...+ φpYt−p + εt (2.1.3)
Where φ1, φ2, ..., φp are the parameters of the model and εt ∼ WN (0, σ2)
The mean of Yt in (2.1.3) is zero. If the mean µ of Yt is not zero, replace Yt by Yt − µ
in (2.1.3), i.e.
By using the backshift operator we can write the AR(p) model as
(1 − φ1B − φ2B² − · · · − φpB^p)Yt = εt (2.1.4)
or, even more concisely, as
φ(B)Yt = εt (2.1.5)
where φ(B) is called the characteristic polynomial:
φ(B) = 1 − φ1B − φ2B² − · · · − φpB^p
Figure 2.1 displays the time plot of a simulated AR(1) process with φ = 0.9
Figure 2.1: Simulated AR(1) process with φ = 0.9
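A series like the one in Figure 2.1 can be generated in a few lines of code. The following Python sketch is illustrative (the thesis's own figures are produced in R; the name `simulate_ar1` and the burn-in length are our choices), simulating Yt = 0.9Yt−1 + et:

```python
import random

def simulate_ar1(n, phi, sigma=1.0, seed=1, burn_in=100):
    """Simulate Y_t = phi * Y_{t-1} + e_t with e_t ~ N(0, sigma^2).
    The burn-in removes the influence of the arbitrary start Y_0 = 0."""
    rng = random.Random(seed)
    y, out = 0.0, []
    for t in range(n + burn_in):
        y = phi * y + rng.gauss(0.0, sigma)
        if t >= burn_in:
            out.append(y)
    return out

# With phi = 0.9, successive values are strongly positively correlated,
# which produces the smooth, wandering look of Figure 2.1:
series = simulate_ar1(500, phi=0.9)
```

The sample lag-1 autocorrelation of such a series is close to the theoretical value ρ1 = φ = 0.9.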
Definition 2.1.5. Causality
A linear process {Yt} is a causal function of {Wt} if there is a
ψ(B) = ψ0 + ψ1B + ψ2B² + · · ·
with ∑_{j=0}^{∞} |ψj| < ∞ and
Yt = ψ(B)Wt
2.1.3 Moving Average Processes
Definition 2.1.6. Moving Average
The moving average model of order q, denoted MA(q), is defined as
Yt = εt − θ1εt−1 − θ2εt−2 − · · · − θqεt−q (2.1.6)
where θ1, θ2, . . . , θq are parameters.
Some texts and software packages write the MA model with the opposite signs on the
coefficients, that is,
Yt = εt + θ1εt−1 + θ2εt−2 + · · · + θqεt−q
By using the backshift operator we can write the MA(q) model as
Yt = (1 − θ1B − θ2B² − · · · − θqB^q)εt (2.1.7)
or, even more concisely, as
Yt = θ(B)εt (2.1.8)
where θ(B) is called the characteristic polynomial:
θ(B) = 1 − θ1B − θ2B² − · · · − θqB^q
Figure 2.2 shows a time plot of a simulated MA(1) series with θ = 0.9.
Figure 2.2: Simulated MA(1) process with θ = 0.9
Definition 2.1.7. Invertibility
A linear process {Yt} is an invertible function of {Wt} if there is a
π(B) = π0 + π1B + π2B² + · · ·
with ∑_{j=0}^{∞} |πj| < ∞ and
Wt = π(B)Yt
2.1.4 Autoregressive Moving Average Model
Definition 2.1.8. Autoregressive Moving Average Model
The autoregressive moving average model, denoted ARMA(p, q), is defined as
Yt = φ1Yt−1 + φ2Yt−2 + · · · + φpYt−p + εt − θ1εt−1 − θ2εt−2 − · · · − θqεt−q (2.1.9)
with φp ≠ 0, θq ≠ 0 and σ²ε > 0; the parameters p and q are called the autoregressive
and the moving average orders, respectively.
By using the backshift operator, we can write the ARMA(p, q) model as
(1 − φ1B − φ2B² − · · · − φpB^p)Yt = (1 − θ1B − θ2B² − · · · − θqB^q)εt (2.1.10)
φ(B)Yt = θ(B)εt (2.1.11)
Definition 2.1.9. Characteristic Polynomials
The AR(p) and MA(q) characteristic polynomials are defined as
φ(x) = 1 − φ1x − φ2x² − · · · − φpx^p (2.1.12)
and
θ(x) = 1 + θ1x + θ2x² + · · · + θqx^q (2.1.13)
respectively, where x is a complex number.
Table 2.1: Behavior of the ACF and the PACF for ARMA Models
AR(p) MA(q) ARMA(p, q)
ACF Tails off Cuts off after lag q Tails off
PACF Cuts off after lag p Tails off Tails off
Remark 2.1.1. We say that φ(B)Yt = θ(B)εt defines an ARMA(p, q) process only if
there is no common factor between φ(x) and θ(x).
Example 2.1.1. Consider the process
Yt = 0.75Yt−1 − 0.125Yt−2 + εt − 0.5εt−1
or, in operator form,
(1 − 0.75B + 0.125B²)Yt = (1 − 0.5B)εt
At first Yt appears to be an ARMA(2, 1) process. But the associated polynomials
φ(z) = 1 − 0.75z + 0.125z² = (1 − 0.5z)(1 − 0.25z)
θ(z) = 1 − 0.5z
have a common factor that can be canceled. After cancelation, the model is reduced to
Yt = 0.25Yt−1 + εt
so the model is an AR(1).
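The cancelation can be checked mechanically by multiplying the candidate factors back together. A small illustrative sketch in Python (`poly_mul` is our helper; polynomial coefficients are listed by ascending power of z):

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, ascending powers."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# (1 - 0.5z)(1 - 0.25z) recovers phi(z) = 1 - 0.75z + 0.125z^2,
# so phi(z) and theta(z) = 1 - 0.5z share the factor (1 - 0.5z):
phi = poly_mul([1.0, -0.5], [1.0, -0.25])
```

After dividing out the common factor, the remaining AR polynomial 1 − 0.25z gives exactly the reduced AR(1) model above.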
Definition 2.1.10. An ARMA(p, q) model φ(B)Yt = θ(B)εt is said to be causal if the
time series (Yt) can be written as a one-sided linear process:
Yt = ∑_{j=0}^{∞} ψj εt−j = ψ(B)εt (2.1.14)
where ψ(B) = ∑_{j=0}^{∞} ψj B^j and ∑_{j=0}^{∞} |ψj| < ∞.
Causality of an ARMA(p, q) process
An ARMA(p, q) model is causal if and only if φ(z) ≠ 0 for |z| ≤ 1.
The coefficients of the linear process given in (2.1.14) can be determined by solving
ψ(z) = ∑_{j=0}^{∞} ψj z^j = θ(z)/φ(z), |z| ≤ 1
Another way of expressing this is that an ARMA process is causal only when the
roots of φ(z) lie outside the unit circle; that is,
φ(z) = 0 only when |z| > 1
Definition 2.1.11. Invertibility of an ARMA Model
An ARMA(p, q) model φ(B)Yt = θ(B)εt is said to be invertible if the time series (Yt)
can be written as
π(B)Yt = ∑_{j=0}^{∞} πj Yt−j = εt (2.1.15)
where π(B) = ∑_{j=0}^{∞} πj B^j and ∑_{j=0}^{∞} |πj| < ∞. See [16].
Invertibility of an ARMA(p, q) process
An ARMA(p, q) model is invertible if and only if θ(z) ≠ 0 for |z| ≤ 1. The coefficients
πj of π(B) given in (2.1.15) can be determined by solving
π(z) = ∑_{j=0}^{∞} πj z^j = φ(z)/θ(z), |z| ≤ 1
Another way of expressing this is that an ARMA process is invertible only when the
roots of θ(z) lie outside the unit circle; that is, θ(z) = 0 only when |z| > 1.
Example 2.1.2. Consider the process
Yt = 0.4Yt−1 + 0.45Yt−2 + εt + εt−1 + 0.25εt−2
or, in operator form,
(1 − 0.4B − 0.45B²)Yt = (1 + B + 0.25B²)εt
At first, Yt appears to be an ARMA(2, 2) process. But the associated polynomials
φ(z) = 1 − 0.4z − 0.45z² = (1 + 0.5z)(1 − 0.9z)
θ(z) = 1 + z + 0.25z² = (1 + 0.5z)²
have a common factor that can be canceled. After cancelation, the polynomials become
φ(z) = 1 − 0.9z and θ(z) = 1 + 0.5z, so the model is an ARMA(1, 1) model,
(1 − 0.9B)Yt = (1 + 0.5B)εt, or
Yt = 0.9Yt−1 + 0.5εt−1 + εt
The model is causal because φ(z) = 1 − 0.9z = 0 when z = 10/9, which is outside the
unit circle. The model is also invertible because the root of θ(z) = 1 + 0.5z is z = −2,
which is outside the unit circle.
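Both root conditions can be verified numerically with the quadratic formula. An illustrative Python sketch (`quadratic_roots` is our helper; its arguments are the coefficients of c0 + c1·z + c2·z²):

```python
import cmath

def quadratic_roots(c0, c1, c2):
    """Both roots of c0 + c1*z + c2*z^2 = 0."""
    disc = cmath.sqrt(c1 * c1 - 4.0 * c2 * c0)
    return (-c1 + disc) / (2.0 * c2), (-c1 - disc) / (2.0 * c2)

# phi(z) = 1 - 0.4z - 0.45z^2 has roots z = -2 and z = 10/9;
# theta(z) = 1 + z + 0.25z^2 has the double root z = -2.
phi_roots = quadratic_roots(1.0, -0.4, -0.45)
theta_roots = quadratic_roots(1.0, 1.0, 0.25)
causal = all(abs(z) > 1 for z in phi_roots)
invertible = all(abs(z) > 1 for z in theta_roots)
```

All roots lie outside the unit circle, so the model is both causal and invertible, in agreement with the discussion above.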
2.2 Models for non Stationary Time Series
In statistics, an autoregressive integrated moving average (ARIMA) model is a general-
ization of an autoregressive moving average or (ARMA) model. These models are fitted
to time series data either to better understand the data or to predict future points in the
series. The ARIMA model is applied in some cases where data show evidence of non
stationarity, where an initial differencing step (corresponding to the ”integrated” part of
the model) can be applied to remove the non stationarity. The model is generally referred
to as an ARIMA(p, d, q) model where p, d, and q are integers greater than or equal to
zero and refer to the order of the autoregressive, integrated, and moving average parts of
the model respectively. The first parameter p refers to the number of autoregressive lags
(not counting the unit roots), the second parameter d refers to the order of integration,
and the third parameter q gives the number of moving average lags. For more details see
[7] and [12]
Definition 2.2.1. Integrated Autoregressive Moving Average Model (ARIMA)
A process Yt is said to follow an integrated autoregressive moving average model,
abbreviated ARIMA(p, d, q), if
∇^d Yt = (1 − B)^d Yt (2.2.1)
is a stationary ARMA(p, q) process. In general we will write the model as
φ(B)(1 − B)^d Yt = θ(B)et (2.2.2)
If E(∇^d Yt) = µ, we write the model as
φ(B)(1 − B)^d Yt = α + θ(B)et (2.2.3)
where α = µ(1 − φ1 − φ2 − · · · − φp).
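The operator ∇^d = (1 − B)^d is simply d applications of first differencing, which is easy to sketch in code (Python rather than the R environment used in the thesis; the function name `difference` is ours):

```python
def difference(y, d=1):
    """Apply (1 - B)^d: take first differences y_t - y_{t-1}, d times.
    Each pass shortens the series by one observation."""
    for _ in range(d):
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return y

# A deterministic quadratic trend is reduced to a constant by d = 2,
# which is why differencing removes polynomial trends of matching degree:
second_diff = difference([t * t for t in range(10)], d=2)
```

In practice d = 1 or d = 2 is almost always enough; differencing more than necessary introduces artificial MA structure into the series.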
2.2.1 Multiplicative Seasonal ARIMA Models
In this section, we introduce several modifications made to the ARIMA model to account
for seasonal and non stationary behavior. Often, the dependence on the past tends to
occur most strongly at multiples of some underlying seasonal lag s.
Definition 2.2.2. Seasonal Time Series
Seasonal variation is a component of a time series which is defined as the repetitive and
predictable movement around the trend line in one year or less.
Some Examples of Seasonal Time Series:
• Monthly Carbon Dioxide Levels at Alert, Canada from January 1994 through De-
cember 2004.
• Monthly U.S. Retail and Food Service Sales from January 1992 to August 2008
in millions of dollars.
• Electricity consumption of an industrial sector of U.S.
Definition 2.2.3. Seasonal MA(Q) Model
A seasonal MA(Q) model of order Q with seasonal period s is defined by
Yt = et − Θ1et−s − Θ2et−2s − · · · − ΘQet−Qs (2.2.4)
with seasonal MA characteristic polynomial
Θ(x) = 1 − Θ1x^s − Θ2x^{2s} − · · · − ΘQx^{Qs}
Definition 2.2.4. Seasonal AR(P ) Model
A seasonal AR(P ) model of order P and seasonal period s is defined by
Yt = Φ1Yt−s + Φ2Yt−2s + · · · + ΦP Yt−Ps + et (2.2.5)
with seasonal AR characteristic polynomial
Φ(x) = 1 − Φ1x^s − Φ2x^{2s} − · · · − ΦP x^{Ps}
Definition 2.2.5. Multiplicative Seasonal ARIMA Model (SARIMA)
The pure seasonal ARMA model takes the form
$$\Phi_P(B^s) Y_t = \Theta_Q(B^s) e_t \tag{2.2.6}$$
with operators
$$\Phi_P(B^s) = 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \cdots - \Phi_P B^{Ps}$$
$$\Theta_Q(B^s) = 1 - \Theta_1 B^s - \Theta_2 B^{2s} - \cdots - \Theta_Q B^{Qs}$$
The multiplicative seasonal autoregressive integrated moving average model, or SARIMA
model, is given by
$$\phi(B)\Phi_P(B^s)\nabla^d \nabla_s^D Y_t = \theta(B)\Theta_Q(B^s) e_t \tag{2.2.7}$$
The general model is denoted ARIMA(p, d, q)(P, D, Q)_s.
Example 2.2.1. Consider the following model, which often provides a reasonable representation for seasonal, nonstationary economic time series. We display the equations for
the model, denoted ARIMA(0, 1, 1)(0, 1, 1)_{12}:
$$(1 - B^{12})(1 - B)Y_t = (1 + \Theta B^{12})(1 + \theta B)e_t$$
Expanding both sides gives
$$(1 - B - B^{12} + B^{13})Y_t = (1 + \theta B + \Theta B^{12} + \theta\Theta B^{13})e_t$$
See [16].
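The expansion above can be verified numerically: multiplying the two MA operators, written as coefficient vectors in powers of B, is just a convolution. A small sketch (the values of θ and Θ are illustrative, not estimates from any series):

```python
import numpy as np

# Illustrative parameter values (not estimated from any data set).
theta, Theta = 0.4, 0.6

# (1 + theta*B) and (1 + Theta*B^12) as coefficient vectors in powers of B.
nonseasonal = np.zeros(2)
nonseasonal[0], nonseasonal[1] = 1.0, theta
seasonal = np.zeros(13)
seasonal[0], seasonal[12] = 1.0, Theta

# Polynomial multiplication corresponds to convolution of coefficient vectors.
product = np.convolve(nonseasonal, seasonal)

# Expanded form: 1 + theta*B + Theta*B^12 + theta*Theta*B^13.
expected = np.zeros(14)
expected[0], expected[1], expected[12], expected[13] = 1.0, theta, Theta, theta * Theta
print(np.allclose(product, expected))  # True
```

The same convolution trick checks any multiplicative SARIMA operator expansion.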
2.3 Forecasting
In this section, we consider the calculation of forecasts and their properties for both
deterministic trend models and ARIMA models. Based on the available history of the
series up to time t, namely $Y_1, Y_2, \ldots, Y_t$, we would like to forecast the value of $Y_{t+L}$
that will occur L time units into the future. For more details see [12] and [15].
Definition 2.3.1. Minimum Mean Square Error Forecast
The minimum mean square error forecast is given by
$$\hat{Y}_t(L) = E(Y_{t+L} \mid Y_1, Y_2, \ldots, Y_t) \tag{2.3.1}$$
For ARIMA models, the forecasts can be expressed in several different ways. Each
expression contributes to our understanding of the overall forecasting procedure with
respect to computing, updating, assessing precision, or long-term forecasting behavior.
Definition 2.3.2. Akaike's Information Criterion (AIC)
$$\mathrm{AIC} = -2\log(\text{maximum likelihood}) + 2k \tag{2.3.2}$$
where k is the number of parameters in the model.
Akaike's Information Criterion also has a definition in terms of the residual sum of squares.
Definition 2.3.3. Akaike's Information Criterion (AIC)
$$\mathrm{AIC} = \ln \hat{\sigma}_k^2 + \frac{n + 2k}{n} \tag{2.3.3}$$
where k is the number of parameters in the model and
$$\hat{\sigma}_k^2 = \frac{\mathrm{RSS}_k}{n}$$
where RSSk denotes the residual sum of squares under the model with k regression coef-
ficients.
Definition 2.3.4. AIC, Bias Corrected (AICc)
$$\mathrm{AICc} = \ln \hat{\sigma}_k^2 + \frac{n + k}{n - k - 2} \tag{2.3.4}$$
Definition 2.3.5. Schwarz's Information Criterion (SIC)
$$\mathrm{SIC} = \ln \hat{\sigma}_k^2 + \frac{k \ln n}{n} \tag{2.3.5}$$
SIC is also called the Bayesian Information Criterion (BIC).
For more details see [16].
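To illustrate how these criteria trade goodness of fit against model size, here is a minimal sketch of the RSS-based forms of AIC and SIC above (the RSS value and sample size are made up for illustration):

```python
import numpy as np

def aic(rss, n, k):
    # AIC in the log-variance form: ln(RSS/n) + (n + 2k)/n.
    return np.log(rss / n) + (n + 2 * k) / n

def sic(rss, n, k):
    # SIC/BIC: ln(RSS/n) + k*ln(n)/n.
    return np.log(rss / n) + (k * np.log(n)) / n

# Illustrative values: same fit (same RSS), different parameter counts.
rss, n = 50.0, 100
# A better fit (smaller RSS) lowers both criteria for fixed k;
# adding parameters raises the penalty, and SIC penalizes harder
# than AIC whenever ln(n) > 2.
print(aic(rss, n, 3), sic(rss, n, 3))
```

Models are compared by computing the criterion for each candidate and choosing the smallest value, as done for the SARIMA candidates in Chapter 4.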
Example 2.3.1. AR(1)
Consider the AR(1) model with a nonzero mean that satisfies
$$Y_t - \mu = \phi(Y_{t-1} - \mu) + e_t$$
Replacing t by t + 1 in the last equation, we have
$$Y_{t+1} - \mu = \phi(Y_t - \mu) + e_{t+1} \tag{2.3.6}$$
Taking the conditional expectations of both sides gives
$$\hat{Y}_t(1) - \mu = \phi[E(Y_t \mid Y_1, \ldots, Y_t) - \mu] + E(e_{t+1} \mid Y_1, \ldots, Y_t) \tag{2.3.7}$$
Since $E(Y_t \mid Y_1, \ldots, Y_t) = Y_t$ and $e_{t+1}$ is independent of $Y_1, Y_2, \ldots, Y_t$,
we have $E(e_{t+1} \mid Y_1, \ldots, Y_t) = E(e_{t+1}) = 0$.
Thus Equation (2.3.7) can be written as
$$\hat{Y}_t(1) = \mu + \phi(Y_t - \mu)$$
Now consider a general lead time L. Iterating this relation produces
$$\hat{Y}_t(L) = \mu + \phi^L(Y_t - \mu) \quad \text{for } L \geq 1$$
Since $|\phi| < 1$, we have simply $\hat{Y}_t(L) \approx \mu$ for large L.
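The closed-form AR(1) forecast derived above is easy to sketch in code; the parameter values below are purely illustrative:

```python
def ar1_forecast(y_t, mu, phi, L):
    # Minimum MSE forecast for an AR(1): Y_hat_t(L) = mu + phi**L * (y_t - mu).
    return mu + phi ** L * (y_t - mu)

# Illustrative values: process mean 10, phi = 0.8, last observation 14.
mu, phi, y_last = 10.0, 0.8, 14.0
path = [ar1_forecast(y_last, mu, phi, L) for L in (1, 2, 5, 50)]
print(path)  # forecasts decay geometrically toward the mean
```

The forecast path decays geometrically from the last observation toward the process mean, which is the "mean reversion" behavior noted in the example.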
2.3.1 Model Identification
Definition 2.3.6. Identification:
Means to find out the appropriate values of p, q, d, P,Q and D of the order of general
SARIMA model, we will use ACF and PACF to find these values.
Stationarity and Seasonality
The first step in developing a Box-Jenkins model is to determine if the series is stationary
and if there is any significant seasonality that needs to be modeled.
Detecting seasonality
Seasonality (or periodicity) can usually be assessed from an autocorrelation plot, a sea-
sonal subseries plot, or a spectral plot.
Differencing to achieve stationarity
Box and Jenkins recommend the differencing approach to achieve stationarity. However,
fitting a curve and subtracting the fitted values from the original data can also be used
in the context of Box-Jenkins models.
Seasonal differencing
At the model identification stage, our goal is to detect seasonality, if it exists, and to
identify the order for the seasonal autoregressive and seasonal moving average terms. For
many series, the period is known and a single seasonality term is sufficient. For example,
for monthly data we would typically include either a seasonal AR term or a seasonal MA
term. For Box-Jenkins models, we do not explicitly remove seasonality before fitting the
model. Instead, we include the order of the seasonal terms in the model specification to the
ARIMA estimation software. However, it may be helpful to apply a seasonal difference to
the data and regenerate the autocorrelation and partial autocorrelation plots. This may
help in the model identification of the non-seasonal component of the model. In some
cases, the seasonal differencing may remove most or all of the seasonality effect.
Identify p and q
Once stationarity and seasonality have been addressed, the next step is to identify the
order of the autoregressive and moving average terms.
Detecting stationarity
Stationarity can be assessed from a run sequence plot. The run sequence plot should
show constant location and scale. It can also be detected from an autocorrelation plot.
Specifically, non-stationarity is often indicated by an autocorrelation plot with very slow
decay.
Order of Autoregressive Process (p)
For an AR(1) process, the sample autocorrelation function should have an
exponentially decreasing appearance. However, higher-order AR processes are often a
mixture of exponentially decreasing and damped sinusoidal components. For higher-
order autoregressive processes, the sample autocorrelation needs to be supplemented with
a partial autocorrelation plot. The partial autocorrelation of an AR(p) process becomes
zero at lag p+1 and greater, so we examine the sample partial autocorrelation function to
see if there is evidence of a departure from zero. This is usually determined by placing a
95% confidence interval on the sample partial autocorrelation plot (most software programs
that generate sample autocorrelation plots will also plot this confidence interval). If the
software program does not generate the confidence band, it is approximately $\pm 2/\sqrt{N}$, with N
denoting the sample size.
Order of Moving Average Process (q)
The autocorrelation function of a MA(q) process becomes zero at lag q+1 and greater, so
we examine the sample autocorrelation function to see where it essentially becomes zero.
We do this by placing the 95% confidence interval for the sample autocorrelation function on
the sample autocorrelation plot. Most software that can generate the autocorrelation plot
can also generate this confidence interval. The sample partial autocorrelation function is
generally not helpful for identifying the order of the moving average process.
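The identification recipe above, comparing the sample ACF against the approximate ±2/√N band, can be sketched as follows on a simulated MA(1) series (the θ value and random seed are arbitrary):

```python
import numpy as np

def sample_acf(y, max_lag):
    # Sample autocorrelations r_1, ..., r_max_lag.
    y = np.asarray(y, dtype=float) - np.mean(y)
    c0 = np.dot(y, y) / len(y)
    return np.array([np.dot(y[:-k], y[k:]) / (len(y) * c0)
                     for k in range(1, max_lag + 1)])

rng = np.random.default_rng(42)
n = 2000
e = rng.standard_normal(n + 1)
y = e[1:] + 0.8 * e[:-1]      # simulated MA(1) with theta = 0.8

r = sample_acf(y, 10)
band = 2 / np.sqrt(n)         # approximate 95% confidence band
print(r[0], band)             # lag-1 spike well outside the band
```

As the text predicts for an MA(1) process, the lag-1 autocorrelation lies far outside the band while higher lags sit close to zero, which is exactly the pattern one looks for when choosing q.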
2.3.2 Parameter Estimation of the SARIMA Model
After obtaining appropriate values of p, d, q, P, D and Q, the next stage is to estimate
the values of the parameters $\phi$, $\theta$, $\Phi$ and $\Theta$.
2.3.3 Diagnostics Checking of the Fitted Model
Diagnostic tests are applied to determine whether the estimated parameters and residuals
of the fitted SARIMA model are significant. We examine the ACF and PACF of the
residuals, which we hope will show the white noise (WN) pattern.
2.3.4 Forecasting the study variable
When the model is complete, it is used to forecast the future behavior of the electricity
consumption series.
Chapter 3
Exponential Smoothing
3.1 Introduction
Exponential smoothing is probably the most widely used class of procedures for smoothing
discrete time series in order to forecast the immediate future. This popularity can be
attributed to its simplicity, its computational efficiency, the ease of adjusting its respon-
siveness to changes in the process being forecast, and its reasonable accuracy. The idea of
exponential smoothing is to smooth the original series the way the moving average does
and to use the smoothed series in forecasting future values of the variable of interest. In
exponential smoothing, however, we want to allow the more recent values of the series to
have greater influence on the forecast of future values than the more distant observations.
Exponential smoothing is a simple and pragmatic approach to forecasting, whereby the
forecast is constructed from an exponentially weighted average of past observations. The
largest weight is given to the present observation, less weight to the immediately preceding observation, even less weight to the observation before that, and so on: the influence
of past data decays exponentially.
Historically, exponential smoothing describes a class of forecasting methods. In fact,
some of the most successful forecasting methods are based on the concept of exponen-
tial smoothing. There are a variety of methods that fall into the exponential smoothing
family, each having the property that forecasts are weighted combinations of past obser-
vations, with recent observations given relatively more weight than older observations.
The name "exponential smoothing" reflects the fact that the weights decrease exponentially
as the observations get older [17]. Exponential smoothing is a statistical technique for detecting significant changes in data by ignoring the fluctuations irrelevant to the purpose at
hand. In exponential smoothing (as opposed to moving-average smoothing), older
data are given progressively less relative weight (importance) whereas newer data are given
progressively greater weight. Also called averaging, it is employed in making short-term
forecasts. The 'wait-and-see' attitude to changes around them is the intuitive way people
employ exponential smoothing in their daily lives.
3.1.1 Classification of Exponential Smoothing Methods
In exponential smoothing, we always start with the trend component, which is itself
a combination of a level term (`) and a growth term (b). The level and growth can be
combined in a number of ways, giving five future trend types. Let (Th) denote the forecast
trend over the next h time periods, and let ϕ denote a damping parameter (0 < ϕ < 1).
Then the five trend types or growth patterns are as follows:
$$\text{None}: T_h = \ell$$
$$\text{Additive}: T_h = \ell + bh$$
$$\text{Additive damped}: T_h = \ell + (\varphi + \varphi^2 + \cdots + \varphi^h)b$$
$$\text{Multiplicative}: T_h = \ell b^h$$
$$\text{Multiplicative damped}: T_h = \ell b^{(\varphi + \varphi^2 + \cdots + \varphi^h)}$$
If the error component is ignored, then we have the fifteen exponential smoothing
methods given in the following table. Some of these methods are better known under
other names. For example, cell (N, N) describes the simple exponential smoothing (or
SES) method, cell (A, N) describes Holt's linear method, and cell (A_d, N) describes the
damped trend method. The Holt-Winters additive method is given by cell (A, A), and the Holt-Winters multiplicative method is given by cell (A, M). The other cells correspond to less
commonly used but analogous methods.
3.1.2 Point Forecasts for the Best-Known Methods
In this section, a simple introduction is provided to some of the best-known exponential
smoothing methods: simple exponential smoothing (N, N), Holt's linear method (A, N),
the damped trend method (A_d, N) and the Holt-Winters seasonal methods (A, A) and (A, M).
3.2 Simple Exponential Smoothing (N,N Method)
The simplest of the exponential smoothing methods is naturally called simple exponential smoothing (SES). (In some books [8], it is called single exponential smoothing.) This
method is used for short-range forecasting, usually just one month into the future. The
model assumes that the data fluctuate around a reasonably stable mean (no trend or
consistent pattern of growth). For more details see [11] and [17].
Definition 3.2.1. Simple Exponential Smoothing
The simple exponential smoothing equation is defined as
$$\hat{y}_{t+1} = \hat{y}_t + \alpha(y_t - \hat{y}_t) \tag{3.2.1}$$
where $\alpha$ is a constant between 0 and 1. Another way of writing the last equation is
$$\hat{y}_{t+1} = \alpha y_t + (1 - \alpha)\hat{y}_t \tag{3.2.2}$$
If this substitution process is repeated by replacing $\hat{y}_t$ with its components, $\hat{y}_{t-1}$
with its components, and so on, the result is
$$\hat{y}_{t+1} = \alpha y_t + \alpha(1-\alpha)y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \cdots + \alpha(1-\alpha)^{t-1}y_1 + (1-\alpha)^t \hat{y}_1 \tag{3.2.3}$$
So $\hat{y}_{t+1}$ represents a weighted moving average of all past observations with the weights
decreasing exponentially, hence the name exponential smoothing. We note that the weight
of $\hat{y}_1$ may be quite large when $\alpha$ is small and the time series is relatively short. For longer-range forecasts, it is assumed that the forecast function is flat. That is,
$$\hat{y}_{t+h|t} = \hat{y}_{t+1}, \quad h = 2, 3, \ldots$$
A flat forecast function is used because simple exponential smoothing works best for
data that have no trend, seasonality, or other underlying patterns. Another way of writing
this is to let $\ell_t = \hat{y}_{t+1}$; then (3.2.2) becomes
$$\ell_t = \alpha y_t + (1 - \alpha)\ell_{t-1}$$
The value of $\ell_t$ is a measure of the level of the series at time t.
Initial Value
The initial value $\hat{y}_1$ plays an important role in computing all the subsequent values.
Setting it to $y_1$ is one method of initialization. Another possibility is to average
the first four or five observations. The smaller the value of $\alpha$, the more important is the
selection of the initial value.
Component form
An alternative representation is the component form. For simple exponential smoothing,
the only component included is the level, $\ell_t$. (Other methods considered later in this chapter
may also include a trend $b_t$ and a seasonal component $s_t$.) Component form representations
of exponential smoothing methods comprise a forecast equation and a smoothing equation for each of the components included in the method. The component form of simple
exponential smoothing is given by:
$$\text{Forecast equation}: \hat{y}_{t+1|t} = \ell_t$$
$$\text{Smoothing equation}: \ell_t = \alpha y_t + (1 - \alpha)\ell_{t-1}$$
where $\ell_t$ is the level (or the smoothed value) of the series at time t. The forecast
equation shows that the forecast value at time t + 1 is the estimated level at time t.
The smoothing equation for the level (usually referred to as the level equation) gives the
estimated level of the series at each period t. Applying the forecast equation for time T
gives $\hat{y}_{T+1|T} = \ell_T$, the most recent estimated level. If we replace $\ell_t$ by $\hat{y}_{t+1|t}$ and $\ell_{t-1}$
by $\hat{y}_{t|t-1}$ in the smoothing equation, we recover the weighted average form of simple
exponential smoothing.
Error correction form
The third form of simple exponential smoothing is obtained by re-arranging the level
equation in the component form to get what we refer to as the error correction form
$$\ell_t = \ell_{t-1} + \alpha(y_t - \ell_{t-1}) = \ell_{t-1} + \alpha e_t$$
where $e_t = y_t - \ell_{t-1} = y_t - \hat{y}_{t|t-1}$ for $t = 1, \ldots, T$; that is, $e_t$ is the one-step within-sample forecast error at time t. The within-sample forecast errors lead to the adjustment/correction of the estimated level throughout the smoothing process for $t = 1, \ldots, T$.
For more details see [11] and [17]
Example 3.2.1. The data in Figure 3.1 do not display any clear trending behavior or
any seasonality, although the mean of the data may be changing slowly over time.
Figure 3.1: Oil production in Saudi Arabia from 1996 to 2007
3.3 Holt's Linear Method (A,N Method)
Holt (1957) extended simple exponential smoothing to linear exponential smoothing to
allow forecasting of data with trends. The forecast for Holt's linear exponential smoothing
method is found using two smoothing constants, $\alpha$ and $\beta^*$ (with values between 0 and 1),
and three equations:
Definition 3.3.1. Holt's linear method equations are defined as:
$$\text{Level}: \ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1}) \tag{3.3.1}$$
$$\text{Growth}: b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*)b_{t-1} \tag{3.3.2}$$
$$\text{Forecast}: \hat{y}_{t+h|t} = \ell_t + b_t h \tag{3.3.3}$$
Here $\ell_t$ denotes an estimate of the level of the series at time t and $b_t$ denotes an
estimate of the slope (or growth) of the series at time t.
One interesting special case of this method occurs when $\beta^* = 0$. Then the growth term stays constant, $b_t = b_{t-1}$, and
$$\text{Level}: \ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1})$$
$$\text{Forecast}: \hat{y}_{t+h|t} = \ell_t + b_t h$$
As with simple exponential smoothing, the level equation here shows that $\ell_t$ is a
weighted average of observation $y_t$ and the within-sample one-step-ahead forecast for
time t, here given by $\ell_{t-1} + b_{t-1}$. The trend equation shows that $b_t$ is a weighted average
of the estimated trend at time t based on $\ell_t - \ell_{t-1}$ and $b_{t-1}$, the previous estimate of the
trend. The forecast function is no longer flat but trending.
Error correction form
The error correction form of the level and the trend equations shows the adjustments in
terms of the within-sample one-step forecast errors:
$$\ell_t = \ell_{t-1} + b_{t-1} + \alpha e_t$$
$$b_t = b_{t-1} + \alpha\beta^* e_t$$
where $e_t = y_t - (\ell_{t-1} + b_{t-1}) = y_t - \hat{y}_{t|t-1}$.
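The error correction form of Holt's method translates directly into code. A sketch with a deliberately simple initialization (level set to the second observation, growth to the first difference; one of several reasonable choices, not the only one):

```python
def holt(y, alpha, beta_star, h):
    # Holt's linear method in error correction form.
    # Simple initialization: level = second observation, growth = first difference.
    level, growth = y[1], y[1] - y[0]
    for obs in y[2:]:
        error = obs - (level + growth)            # one-step forecast error e_t
        level = level + growth + alpha * error    # l_t = l_{t-1} + b_{t-1} + alpha*e_t
        growth = growth + alpha * beta_star * error
    return [level + growth * k for k in range(1, h + 1)]

# On an exactly linear series the method reproduces the line.
y = [1.0, 3.0, 5.0, 7.0, 9.0]
print(holt(y, alpha=0.8, beta_star=0.2, h=3))  # [11.0, 13.0, 15.0]
```

Because every one-step error is zero on a perfectly linear series, the level and growth pass through unchanged and the forecast function simply extends the line, illustrating the "constant trend" behavior noted in the next section.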
3.4 Damped Trend Method (A_d, N Method)
The forecasts generated by Holt's linear method display a constant trend (increasing or
decreasing) indefinitely into the future. Even more extreme are the forecasts generated
by the exponential trend method, which include exponential growth or decline. Empirical
evidence indicates that these methods tend to over-forecast, especially for longer forecast
horizons. Motivated by this observation, Gardner and McKenzie (1985) introduced a
parameter that dampens the trend to a flat line some time in the future. Methods that
include a damped trend have proven to be very successful and are arguably the most
popular individual methods when forecasts are required automatically for many series. [9]
3.4.1 Additive damped trend
Definition 3.4.1. Additive damped trend method equations are defined as:
$$\text{Level}: \ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + \phi b_{t-1}) \tag{3.4.1}$$
$$\text{Growth}: b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*)\phi b_{t-1} \tag{3.4.2}$$
$$\text{Forecast}: \hat{y}_{t+h|t} = \ell_t + (\phi + \phi^2 + \cdots + \phi^h)b_t \tag{3.4.3}$$
Thus, the growth for the one-step forecast of $y_{t+1}$ is $\phi b_t$, and the growth is dampened
by a factor of $\phi$ for each additional future time period.
Notation 3.4.1. • If $\phi = 1$, this method gives the same forecasts as Holt's linear
method.
• For $0 < \phi < 1$, as $h \to \infty$ the forecasts approach an asymptote given by $\ell_t + \phi b_t/(1-\phi)$.
We usually restrict $\phi > 0$ to avoid a negative coefficient being applied to $b_{t-1}$ in
(3.4.2), and $\phi < 1$ to avoid $b_t$ increasing exponentially.
Error correction form
The error correction form of the smoothing equations is
$$\ell_t = \ell_{t-1} + \phi b_{t-1} + \alpha e_t$$
$$b_t = \phi b_{t-1} + \alpha\beta^* e_t$$
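The damped forecast equation (3.4.3) and the asymptote ℓ_t + φb_t/(1−φ) noted above can be checked numerically; the state values below are illustrative:

```python
def damped_forecast(level, growth, phi, h):
    # y_hat_{t+h|t} = l_t + (phi + phi^2 + ... + phi^h) * b_t
    damp = sum(phi ** i for i in range(1, h + 1))
    return level + damp * growth

# Illustrative state: l_t = 100, b_t = 2, phi = 0.9.
level, growth, phi = 100.0, 2.0, 0.9
asymptote = level + growth * phi / (1 - phi)   # approaches 118 as h grows
print(damped_forecast(level, growth, phi, 1),
      damped_forecast(level, growth, phi, 200),
      asymptote)
```

With φ = 1 the damping sum reduces to h and the forecasts coincide with Holt's linear method, matching the first bullet of Notation 3.4.1.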
3.5 Holt-Winters Trend and Seasonality Method
If the data have no trend or seasonal patterns, then simple exponential smoothing is appropriate. If the data display a linear trend, Holt's linear method is appropriate. But
if the data are seasonal, these methods, on their own, cannot handle the problem well.
Holt's method was extended by Winters (1960) to capture seasonality directly; it is
based on three smoothing equations, one for the level, one for trend and one for seasonality, with smoothing parameters $\alpha$, $\beta^*$ and $\gamma$. We use m to denote the period of the
seasonality. It is similar to Holt's linear method, with one additional equation for dealing
with seasonality. There are two variations to this method that differ in the nature of the
seasonal component. The additive method is preferred when the seasonal variations are
roughly constant through the series, while the multiplicative method is preferred when
the seasonal variations are changing proportionally to the level of the series. With the
additive method, the seasonal component is expressed in absolute terms in the scale of
the observed series, and in the level equation the series is seasonally adjusted by subtracting the seasonal component; within each year the seasonal component will add up to
approximately zero. With the multiplicative method, the seasonal component is expressed
in relative terms (percentages) and the series is seasonally adjusted by dividing through
by the seasonal component; within each year, the seasonal component will sum to
approximately m. See [8] and [14].
3.5.1 Additive Seasonality (A,A Method)
The seasonal component in Holt-Winters method may also be treated additively, although
this is less common.
Definition 3.5.1. The basic equations for the Holt-Winters additive method are as follows:
$$\text{Level}: \ell_t = \alpha(y_t - s_{t-m}) + (1-\alpha)(\ell_{t-1} + b_{t-1}) \tag{3.5.1}$$
$$\text{Growth}: b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*)b_{t-1} \tag{3.5.2}$$
$$\text{Seasonal}: s_t = \gamma(y_t - \ell_{t-1} - b_{t-1}) + (1-\gamma)s_{t-m} \tag{3.5.3}$$
$$\text{Forecast}: \hat{y}_{t+h|t} = \ell_t + b_t h + s_{t-m+h_m^+} \tag{3.5.4}$$
where $h_m^+ = [(h-1) \bmod m] + 1$.
The equation for the seasonal component is often expressed as
$$s_t = \gamma^*(y_t - \ell_t) + (1-\gamma^*)s_{t-m}$$
If we substitute $\ell_t$ from the smoothing equation for the level of the component form
above, we get
$$s_t = \gamma^*(1-\alpha)(y_t - \ell_{t-1} - b_{t-1}) + (1 - \gamma^*(1-\alpha))s_{t-m}$$
which is identical to the smoothing equation for the seasonal component we specify
here with $\gamma = \gamma^*(1-\alpha)$. The usual parameter restriction is $0 \leq \gamma^* \leq 1$, which translates
to $0 \leq \gamma \leq 1-\alpha$.
The error correction form of the smoothing equations is
$$\ell_t = \ell_{t-1} + b_{t-1} + \alpha e_t$$
$$b_t = b_{t-1} + \alpha\beta^* e_t$$
$$s_t = s_{t-m} + \gamma e_t$$
where $e_t = y_t - (\ell_{t-1} + b_{t-1} + s_{t-m}) = y_t - \hat{y}_{t|t-1}$ are the one-step training forecast errors.
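A sketch of the additive Holt-Winters recursions in error correction form. The initialization used here (first-season mean as level, zero growth, first-season deviations as seasonal indices) is a crude stand-in for the heuristic scheme described later in this chapter:

```python
def holt_winters_additive(y, m, alpha, beta_star, gamma, h):
    # Additive Holt-Winters in error correction form, with a crude
    # initialization from the first season of data.
    level = sum(y[:m]) / m
    growth = 0.0
    season = [y[i] - level for i in range(m)]
    for t in range(m, len(y)):
        error = y[t] - (level + growth + season[t % m])  # one-step error e_t
        level, growth = (level + growth + alpha * error,
                         growth + alpha * beta_star * error)
        season[t % m] += gamma * error                   # s_t = s_{t-m} + gamma*e_t
    n = len(y)
    return [level + growth * k + season[(n - 1 + k) % m] for k in range(1, h + 1)]

# A purely seasonal series (period 4, no trend) is forecast exactly:
y = [10.0, 20.0, 15.0, 5.0] * 3
print(holt_winters_additive(y, m=4, alpha=0.5, beta_star=0.1, gamma=0.3, h=4))
```

On this toy series every one-step error is zero, so the seasonal indices are carried forward unchanged and the forecasts reproduce the seasonal pattern, which is the behavior the additive method is designed for.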
3.6 General Point Forecasting Equations
Table 3.2 gives recursive formulae for computing point forecasts h periods ahead for all
of the exponential smoothing methods. In each case, $\ell_t$ denotes the series level at time t,
$b_t$ denotes the slope at time t, $s_t$ denotes the seasonal component of the series at time t,
and m denotes the number of seasons in a year; $\alpha$, $\beta^*$, $\gamma$ and $\phi$ are constants, and
$$\phi_h = \phi + \phi^2 + \cdots + \phi^h \quad \text{and} \quad h_m^+ = [(h-1) \bmod m] + 1$$
3.7 Innovations state space models for exponential
smoothing
We now introduce the state space models that underlie exponential smoothing methods.
For each method, there are two models.
• Model with additive errors
• Model with multiplicative errors.
The point forecasts for the two models are identical (provided the same parameter values
are used), but their prediction intervals will differ.
To distinguish the models with additive and multiplicative errors, we add an extra letter to
the front of the method notation. The triplet (E, T, S) refers to the three components: error,
trend and seasonality. So the model ETS(A, A, N) has additive errors, additive trend and
no seasonality; in other words, this is Holt's linear method with additive errors. Similarly,
ETS(M, M_d, M) refers to a model with multiplicative errors, a damped multiplicative
trend and multiplicative seasonality. The notation ETS(·, ·, ·) helps in remembering the
order in which the components are specified. ETS can also be considered an abbreviation
of Exponential Smoothing. [6]
3.7.1 ETS(A,N,N): simple exponential smoothing with additive
errors
As discussed in Section 3.2, the error correction form of simple exponential smoothing
is given by
$$\ell_t = \ell_{t-1} + \alpha e_t$$
where $e_t = y_t - \ell_{t-1}$ and $\hat{y}_{t|t-1} = \ell_{t-1}$. Thus $e_t = y_t - \hat{y}_{t|t-1}$ represents a one-step forecast
error, and we can write
$$y_t = \ell_{t-1} + e_t$$
To make this into an innovations state space model, all we need to do is specify the
probability distribution for $e_t$. For a model with additive errors, we assume that the one-step
forecast errors $e_t$ are normally distributed white noise with mean 0 and variance $\sigma^2$, i.e.
$e_t = \varepsilon_t \sim \mathrm{NID}(0, \sigma^2)$.
Then the equations of the model can be written as
$$y_t = \ell_{t-1} + \varepsilon_t \tag{3.7.1}$$
$$\ell_t = \ell_{t-1} + \alpha\varepsilon_t \tag{3.7.2}$$
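Equations (3.7.1)-(3.7.2) can be simulated directly, and filtering the simulated series with the same α and initial level recovers the innovations ε_t, which is what makes this an innovations state space model. A sketch (the parameter values are arbitrary):

```python
import numpy as np

# Simulate ETS(A,N,N): y_t = l_{t-1} + eps_t, l_t = l_{t-1} + alpha*eps_t.
rng = np.random.default_rng(7)
alpha, sigma, n, l0 = 0.3, 1.0, 500, 10.0

eps = rng.normal(0.0, sigma, size=n)
y = np.empty(n)
level = l0
for t in range(n):
    y[t] = level + eps[t]
    level += alpha * eps[t]

# Filtering with the same alpha and initial level recovers the
# innovations: e_t = y_t - l_{t-1}, then l_t = l_{t-1} + alpha*e_t.
lvl = l0
recovered = np.empty(n)
for t in range(n):
    recovered[t] = y[t] - lvl
    lvl += alpha * recovered[t]
print(np.allclose(recovered, eps))  # True
```

The recovered innovations match the simulated ones because the filter is exactly the inverse of the generating recursion; this is the sense in which the one-step forecast errors are the model's only source of randomness.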
3.7.2 ETS(M,N,N): simple exponential smoothing with multiplicative errors
In a similar fashion, we can specify models with multiplicative errors by writing the one-step random errors as relative errors:
$$\varepsilon_t = \frac{y_t - \hat{y}_{t|t-1}}{\hat{y}_{t|t-1}}$$
Substituting $\hat{y}_{t|t-1} = \ell_{t-1}$ gives $y_t = \ell_{t-1} + \ell_{t-1}\varepsilon_t$ and $e_t = y_t - \hat{y}_{t|t-1} = \ell_{t-1}\varepsilon_t$.
Then we can write the multiplicative form of the state space model as
$$y_t = \ell_{t-1}(1 + \varepsilon_t)$$
$$\ell_t = \ell_{t-1}(1 + \alpha\varepsilon_t)$$
3.7.3 State Space Models for Holt's Linear Method
We can now explain the ideas using Holt's linear method.
Additive Error Model: ETS(A,A,N)
Let $\mu_t = \hat{y}_t = \ell_{t-1} + b_{t-1}$ denote the one-step forecast of $y_t$, assuming we know the values
of all parameters. Also let $\varepsilon_t = y_t - \mu_t$ denote the one-step forecast error at time t. From
(3.3.3) we find that
$$y_t = \ell_{t-1} + b_{t-1} + \varepsilon_t \tag{3.7.3}$$
and using (3.3.1) and (3.3.2) we can write
$$\ell_t = \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t \tag{3.7.4}$$
$$b_t = b_{t-1} + \beta^*(\ell_t - \ell_{t-1} - b_{t-1}) = b_{t-1} + \alpha\beta^*\varepsilon_t \tag{3.7.5}$$
We simplify the last expression by setting $\beta = \alpha\beta^*$.
Multiplicative Error Model: ETS(M,A,N)
A model with multiplicative error can be derived similarly, by first setting $\varepsilon_t = (y_t - \mu_t)/\mu_t$, so
that $\varepsilon_t$ is a relative error. Then, following a similar approach to that for additive errors,
we find
$$y_t = (\ell_{t-1} + b_{t-1})(1 + \varepsilon_t)$$
$$\ell_t = (\ell_{t-1} + b_{t-1})(1 + \alpha\varepsilon_t)$$
$$b_t = b_{t-1} + \beta(\ell_{t-1} + b_{t-1})\varepsilon_t$$
3.7.4 State Space Models for All Exponential Smoothing Methods
The underlying equations for the additive error models are given in Tables 3.3 and 3.4. We use $\beta = \alpha\beta^*$
to simplify the notation. Multiplicative error models are obtained by replacing $\varepsilon_t$ with
$\mu_t\varepsilon_t$ in these equations; the resulting multiplicative error equations are also
given in Tables 3.3 and 3.4.
3.8 Initialization and Estimation
In order to use these models for forecasting, we need to specify the type of model to be
used (model selection), the value of $x_0$ (initialization), and the values of the parameters
$\alpha$, $\beta$, $\gamma$ and $\phi$ (estimation). In this section, we discuss initialization and estimation, leaving
model selection for later.
3.8.1 Initialization
The non-linear optimization requires some initial values. We use $\alpha = \beta = \gamma = 0.5$ and
$\phi = 0.9$. The initial values of $\ell_0$, $b_0$ and $s_k$ $(k = -m+1, \ldots, 0)$ are obtained using the
following heuristic scheme.
• Initial seasonal component.
1. For seasonal data, compute a $2 \times m$ moving average through the first few years
of data (we use up to four years if the data are available). Denote this by $\{f_t\}$,
$t = \frac{m}{2} + 1, \frac{m}{2} + 2, \ldots$
2. For additive seasonality, we detrend the data to obtain $y_t - f_t$. For multiplicative seasonality, we detrend the data to obtain $y_t/f_t$. Then compute initial seasonal
indices $s_{-m+1}, \ldots, s_0$ by averaging the detrended data for each season. Normalize these seasonal indices so that they add to zero for additive seasonality, and
add to m for multiplicative seasonality.
• Initial level component.
1. For seasonal data, compute a linear trend using linear regression on the first ten
seasonally adjusted values (using the seasonal indices obtained above) against
a time variable $t = 1, \ldots, 10$.
2. For nonseasonal data, compute a linear trend on the first ten observations
against a time variable $t = 1, \ldots, 10$. Then set $\ell_0$ to be the intercept of the
trend.
• Initial growth component.
1. For additive trend, set $b_0$ to be the slope of the trend.
2. For multiplicative trend, set $b_0 = 1 + b/a$, where a denotes the intercept and b
denotes the slope of the fitted trend.
These initial states are then refined by estimating them along with the parameters, as
described below.
3.8.2 Estimation and model selection
Let
$$L^*(\theta, x_0) = n \log\left(\sum_{t=1}^{n} \frac{e_t^2}{k^2(x_{t-1})}\right) + 2\sum_{t=1}^{n} \log|k(x_{t-1})| \tag{3.8.1}$$
Then $L^*$ is equal to twice the negative logarithm of the conditional likelihood function
of the state space model (with constant terms eliminated). An alternative to estimating
the parameters by minimizing the sum of squared errors is to maximize the likelihood.
The likelihood is the probability of the data arising from the specified model, so a large
likelihood is associated with a good model. For an additive error model, maximizing the
likelihood gives the same results as minimizing the sum of squared errors. However, different results will be obtained for multiplicative error models. In this section, we estimate
the smoothing parameters $\theta = (\alpha, \beta, \gamma, \phi)$ and initial states $x_0 = (\ell_0, b_0, s_0, s_{-1}, \ldots, s_{-m+1})$
by maximizing the likelihood. The possible values that the smoothing parameters can take
are restricted. Traditionally, the parameters have been constrained to lie between 0 and 1
so that the equations can be interpreted as weighted averages; that is, $0 < \alpha, \beta^*, \gamma^*, \phi < 1$.
For the state space models, we have set $\beta = \alpha\beta^*$ and $\gamma = (1-\alpha)\gamma^*$. Therefore the
traditional restrictions translate to $0 < \alpha < 1$, $0 < \beta < \alpha$ and $0 < \gamma < 1-\alpha$. In practice,
the damping parameter $\phi$ is usually constrained further to prevent numerical difficulties
in estimating the model. A common constraint is to set $0.8 < \phi < 0.98$. Another way
to view the parameters is through a consideration of the mathematical properties of the
state space models. Then the parameters are constrained to prevent observations in the
distant past from having a continuing effect on current forecasts. This leads to some admissibility constraints on the parameters, which are usually (but not always) less restrictive
than the usual region. [11]
3.9 Measures of Forecast Error
Due to the fundamental importance of time series forecasting in many practical situations, proper care should be taken when selecting a particular model, estimating forecast
accuracy, and comparing different models. Each of the following measures is a function of the
actual and forecasted values of the time series. In each of the forthcoming definitions, $y_t$ is
the actual value, $f_t$ is the forecasted value, $e_t = y_t - f_t$ is the forecast error and n is the
size of the test set.
Definition 3.9.1. The Mean Absolute Error (MAE)
The Mean Absolute Error (MAE) is defined as
$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} |e_t|$$
• It measures the average absolute deviation of the forecasted values from the original ones.
• In the MAE, the effects of positive and negative errors do not cancel out.
Definition 3.9.2. The Mean Absolute Percentage Error (MAPE)
The Mean Absolute Percentage Error (MAPE) is defined as
$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n} \left|\frac{e_t}{y_t}\right| \times 100 \tag{3.9.1}$$
• This measure represents the percentage of average absolute error that occurred.
• It is independent of the scale of measurement, but affected by data transformation.
Definition 3.9.3. The Mean Squared Error (MSE)
The Mean Squared Error (MSE) is defined as
$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n} e_t^2$$
The MSE gives an overall idea of the error that occurred during forecasting.
Definition 3.9.4. The Root Mean Squared Error (RMSE)
The Root Mean Squared Error (RMSE) is defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} e_t^2}$$
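The four measures can be computed in a few lines; the actual and forecast values below are made up for illustration:

```python
import math

def error_measures(actual, forecast):
    # MAE, MAPE, MSE and RMSE as defined above; n is the test-set size.
    e = [a - f for a, f in zip(actual, forecast)]
    n = len(e)
    mae = sum(abs(x) for x in e) / n
    mape = 100.0 * sum(abs(x / a) for x, a in zip(e, actual)) / n
    mse = sum(x * x for x in e) / n
    rmse = math.sqrt(mse)
    return mae, mape, mse, rmse

# Illustrative test-set values (not from the thesis data).
actual = [100.0, 200.0, 50.0]
forecast = [110.0, 190.0, 60.0]
mae, mape, mse, rmse = error_measures(actual, forecast)
print(mae, mape, mse, rmse)
```

Note that the MAPE division requires nonzero actual values, and that RMSE is always at least as large as MAE since squaring weights large errors more heavily.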
Part II
Case Study
Chapter 4
Data Analysis Using the Box-Jenkins Method
4.1 Data Description
The data of our study are monthly observations of electricity consumption in the Khan Younis
province during the period from January 2000 to December 2010. The data were taken
at the end of every month, giving a time series with a total of 132 observations. An
overview of the data from January 2000 to December 2010 is plotted in Figure 4.1.
Figure 4.1: Time series plot of monthly electricity consumption in the Khan Younis province
To get a general idea of the data, we show some descriptive statistics of the
time series in Table 4.1.
Table 4.1: Descriptive Statistics
Statistics
Min 4.761
Median 11.9
Mean 11.38
Max 19.610
4.1.1 The Box-Jenkins Approach to Fitting an ARIMA Model
We can see from Figure 4.1 that there seems to be seasonal variation in electricity
consumption.
One way to determine more objectively whether differencing is required is to use a unit root
test. These are statistical hypothesis tests of stationarity that are designed to determine
whether differencing is required.
A number of unit root tests are available; they are based on different assumptions
and may lead to conflicting answers. One of the most popular tests is the Augmented
Dickey-Fuller (ADF) test. The null hypothesis for an ADF test is that the data are non-stationary, so large p-values are indicative of non-stationarity and small p-values suggest
stationarity. Another popular unit root test is the Kwiatkowski-Phillips-Schmidt-Shin
(KPSS) test. This reverses the hypotheses, so the null hypothesis is that the data are
stationary. In this case, small p-values (e.g., less than 0.05) suggest that differencing is
required.
It is known that:
• For the ADF test, if the p-value ≤ 0.05, the process is stationary.
• For the KPSS test, if the p-value ≥ 0.05, the process is stationary.
The results are shown in Table 4.2. For the KPSS test, the p-value is 0.01, which is less
than 0.05; for the ADF test, the p-value equals 0.3, which is greater than 0.05.
Table 4.2: p-values of the Augmented Dickey-Fuller (ADF) test and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test for monthly electricity consumption
test electricity consumption
ADF 0.3
KPSS 0.01
Both results indicate that the time series of monthly electricity consumption is not
stationary. This is also investigated by examining the autocorrelation and
partial autocorrelation functions, shown in Figures 4.2 and 4.3 respectively.
Figure 4.2: ACF for monthly electricity consumption
Figure 4.3: PACF for monthly electricity consumption
4.2 Model Specification
The data are clearly non-stationary, with some seasonality, so we first take a seasonal
difference. The seasonally differenced data are shown in Figure 4.4.
This illustrates one way to make a time series stationary: compute the differences between
consecutive observations, which is known as differencing.
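Differencing itself is a one-line operation. A minimal Python sketch (the thesis uses R's diff; the series below is illustrative):

```python
import numpy as np

def difference(y, lag=1):
    """Return y_t - y_{t-lag}: a first difference uses lag=1,
    a seasonal difference of monthly data uses lag=12."""
    y = np.asarray(y, dtype=float)
    return y[lag:] - y[:-lag]

y = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 18.0])
print(difference(y))         # consecutive (first) differences
print(difference(y, lag=3))  # a "seasonal" difference with period 3
```

Note that each differencing pass shortens the series by `lag` observations, which is why heavily differenced models have fewer effective data points.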
Our aim now is to find an appropriate ARIMA model based on the ACF and PACF of the
differenced series, shown in Figures 4.4 and 4.5.
Non-seasonal behavior:
The significant spikes at lags 2 and 3 in the ACF suggest a non-seasonal MA(1)
component. There are also significant spikes at lags 11, 12 and 13.
Seasonal behavior:
We look at what is going on around lags 12, 24 and so on. The ACF has significant
spikes at lags 1, 12 and 24, which suggests a seasonal MA(1) component. Consequently, this
initial analysis suggests that a possible model for these data is ARIMA(0, 1, 1)(0, 1, 1)12.
The AICc of this model is 409.13, but its residuals show a significant spike at lag 36,
which indicates some additional seasonal term; the AICc of ARIMA(0, 1, 1)(0, 1, 0)12 is
456.97. We tried other models with AR terms as well, and the smallest AICc was obtained
by ARIMA(2, 1, 2)(1, 0, 1)12, which we therefore choose. Its residuals are shown in
Figure 4.5. Table 4.3 displays the models considered.
Figure 4.4: First difference of monthly electricity consumption
Table 4.3: SARIMA Models Criteria for the monthly electricity consumption
model AIC AICc BIC
ARIMA(0,1,1)(0,1,1)12 408.94 409.13 417.59
ARIMA(0,1,1)(0,0,1)12 402.86 403.03 411.77
ARIMA(0,1,1)(0,1,0)12 456.88 456.97 462.64
ARIMA(2,1,1)(1,0,1)12 402 402.61 419.82
ARIMA(2,2,1)(1,0,1)12 400.89 401.51 418.67
ARIMA(2,1,2)(1,0,1)12 388.83 389.66 409.62
ARIMA(1,1,1)(1,0,1)12 399.28 399.72 414.13
After calculating the errors of the models, the best fitting model is ARIMA(2, 1, 2)(1, 0, 1)12.
We now have a seasonal ARIMA model that passes the required checks and is ready for
forecasting. Forecasts from the model for the following years are shown in Figure 4.6. Notice
how the forecasts follow the recent trend in the data. The large and rapidly widening
prediction intervals show that electricity consumption could start increasing or decreasing
at any time: while the point forecasts follow the recent trend, the prediction intervals allow
the data to move away from it during the forecast period.
Figure 4.5: Residuals from the fitted ARIMA(2, 1, 2)(1, 0, 1)12 model
Figure 4.6: Forecasts for monthly electricity consumption
Now we predict the data of year 2011 using the ARIMA(2, 1, 2)(1, 0, 1)12 model and
compare the predicted data with the actual data; see Table 4.4.
Table 4.4: Comparison of predicted and actual data for 2011 using ARIMA(2, 1, 2)(1, 0, 1)12
Month of year 2011 Actual value Predicted value
Jan 15.40 16.15
Feb 17.12 15.54
Mar 16.23 16.75
Apr 17.47 16.19
May 17.62 17.10
Jun 18.08 17.34
Jul 17.73 18.25
Aug 18.35 17.22
Sep 16.03 18.44
Oct 19.61 18.82
Nov 18.63 16.43
Dec 18.35 18.59
Chapter 5
Data Analysis Using Exponential Smoothing Methods
5.1 Exponential smoothing models
Our aim in this section is to analyze the electricity data using exponential smoothing models
in order to find the best fitting one. In Chapter 3 we presented several exponential smoothing
methods, such as simple exponential smoothing, Holt's linear method and the Holt-Winters
damped method.
5.1.1 First Method: Simple Exponential Smoothing Model
As shown in Chapter 3, this method is suitable for forecasting data with no trend or
seasonal pattern:
Forecast equation: ŷt+1 = ℓt
Smoothing equation: ℓt = αyt + (1 − α)ℓt−1
We apply simple exponential smoothing to forecast electricity consumption in Khan Younis,
predicting the data in the period from Jan 2011 to Dec 2011. By changing the value of α
we obtain three models:
• Model 1, with α = 0.2; the model equation is
ℓt = 0.2yt + 0.8ℓt−1
• Model 2, with α = 0.6; the model equation is
ℓt = 0.6yt + 0.4ℓt−1
• Model 3, with α = 0.89; the model equation is
ℓt = 0.89yt + 0.11ℓt−1
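The three models differ only in the value of α; the recursion itself is short. A Python sketch of the level update (the thesis computed the forecasts in R; the data below are illustrative):

```python
def ses_forecast(y, alpha, level0=None):
    """Simple exponential smoothing: l_t = alpha*y_t + (1-alpha)*l_{t-1}.
    Returns the one-step-ahead forecast, which equals the final level."""
    level = y[0] if level0 is None else level0  # simple initialization
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level

y = [16.0, 17.0, 15.0, 18.0]
for alpha in (0.2, 0.6, 0.89):
    print(alpha, round(ses_forecast(y, alpha), 4))
```

Larger α weights recent observations more heavily; with α = 1 the forecast collapses to the last observation.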
Using R, the predicted data for year 2011 are given in Table 5.1. Figure 5.1 plots the data
over the period Jan 2000 - Dec 2010, which show a changing level over time
but no obvious trending behavior.
Table 5.1: Predicted data for year 2011 using simple exponential smoothing with three
different values of the smoothing parameter α.
Month of year 2011 Actual Value α = 0.2 α = 0.6 α = 0.89
Jan 15.40 15.29 16.15 16.39
Feb 17.12 15.31 15.70 15.63
Mar 16.23 15.67 16.55 16.77
Apr 17.47 15.78 16.36 16.36
May 17.62 16.12 17.02 17.21
Jun 18.08 16.42 17.38 17.52
Jul 17.73 16.75 17.80 17.95
Aug 18.35 16.95 17.76 17.79
Sep 16.03 17.23 18.11 18.22
Oct 19.61 17.71 19.01 19.29
Nov 18.63 17.37 17.22 16.78
Dec 18.35 17.62 18.06 18.20
Figure 5.1: Simple exponential smoothing applied to electricity consumption in the
Khan Younis province
We now calculate error measures for the three models in order to choose the best fitting
model; see Table 5.2.
Table 5.2: Error measures for the simple exponential smoothing models
Model RMSE MAE MAPE
Model1 1.246 0.9436 8.4063
Model2 1.0478 0.7477 6.6897
Model3 1.0350 0.7334 6.49301
As Table 5.2 shows, Model 3, ℓt = 0.89yt + 0.11ℓt−1, is the best fitting model, since it has
the smallest errors. Having determined the fitting model, we plot it in Figure 5.2.
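The RMSE, MAE and MAPE reported in Table 5.2 follow the standard definitions. A Python sketch of the computation, using the first four months of Table 5.1 (actual values and the α = 0.89 predictions) as input:

```python
import math

def error_measures(actual, predicted):
    """Return (RMSE, MAE, MAPE in percent) for paired actual/predicted values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / len(errors)
    return rmse, mae, mape

actual = [15.40, 17.12, 16.23, 17.47]     # Jan-Apr 2011 actuals (Table 5.1)
predicted = [16.39, 15.63, 16.77, 16.36]  # alpha = 0.89 predictions (Table 5.1)
print(error_measures(actual, predicted))
```

Note that the table's reported values are computed in R over the full sample, so this four-month illustration will not reproduce them exactly.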
Figure 5.2: Forecasts from simple exponential smoothing
5.1.2 Second Method: Holt's Linear Trend Method
Holt (1957) extended simple exponential smoothing to allow forecasting of data with a
trend. This method involves a forecast equation and two smoothing equations (one for
the level and one for the trend), as discussed in Chapter 3.
We use this method to predict the data in the period Jan 2011 to Dec 2011 with
two models:
• Model 1, with α = 0.8, β∗ = 0.2:
ℓt = 0.8yt + 0.2(ℓt−1 + bt−1)
bt = 0.2(ℓt − ℓt−1) + 0.8bt−1
• Model 2, with α = 0.6, β∗ = 0.4:
ℓt = 0.6yt + 0.4(ℓt−1 + bt−1)
bt = 0.4(ℓt − ℓt−1) + 0.6bt−1
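The pairs of equations above can be coded directly. A Python sketch of Holt's recursions (the thesis used R's holt(); the simple initialization ℓ1 = y1, b1 = y2 − y1 and the data are illustrative assumptions):

```python
def holt_forecast(y, alpha, beta, h=1):
    """Holt's linear trend method:
       l_t = alpha*y_t + (1-alpha)*(l_{t-1} + b_{t-1})
       b_t = beta*(l_t - l_{t-1}) + (1-beta)*b_{t-1}
    Returns the h-step-ahead forecast l_n + h*b_n."""
    level, trend = y[0], y[1] - y[0]  # simple start-up values
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + h * trend

y = [15.0, 16.0, 16.5, 17.2]
print(round(holt_forecast(y, alpha=0.8, beta=0.2), 4))
```

Unlike simple exponential smoothing, the forecast function is a line with slope b_n rather than a flat level.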
Using R, we calculate the predicted data from Model 1 and Model 2 and compare them
with the actual data; see Table 5.3.
Table 5.3: Comparison between actual data and predicted data of year 2011 using Holt's
linear method.
Month of year 2011 Actual Value α = 0.8,β = 0.2 α = 0.6,β = 0.4
Jan 15.40 16.78 16.82
Feb 17.12 15.77 16.11
Mar 16.23 17.16 17.10
Apr 17.47 16.58 16.76
May 17.62 17.59 17.53
Jun 18.08 17.92 17.95
Jul 17.73 18.39 18.43
Aug 18.35 18.09 18.25
Sep 16.03 18.57 18.56
Oct 19.61 19.84 19.70
Nov 18.63 17.12 16.62
Dec 18.35 18.37 18.01
We calculate the errors of the two models to choose the best fitting model. See Table 5.4.
Table 5.4: Error measures for the Holt trend models
Model ME RMSE MAE MPE MAPE MASE
Fit1 -0.04 1.11 0.77 -0.65 6.84 0.47
Fit2 -0.04 1.17 0.82 -0.56 7.48 0.50
As Table 5.4 shows, Model 1, with α = 0.8 and β∗ = 0.2,
ℓt = 0.8yt + 0.2(ℓt−1 + bt−1)
bt = 0.2(ℓt − ℓt−1) + 0.8bt−1
is the best fitting model, since it has the smallest errors. Figure 5.3 displays the data.
Figure 5.3: Forecasts from Holt’s linear method
5.1.3 Damped trend methods
The forecasts generated by Holt's linear method display a constant trend (increasing or
decreasing) indefinitely into the future. Even more extreme are the forecasts generated
by the exponential trend method, which include exponential growth or decline. Damped
trend methods introduce a damping parameter φ that flattens the trend as the forecast
horizon grows.
We use this method to predict the data in the period Jan 2011 to Dec 2011 with
two models, and choose the best fitting one. Table 5.5 shows the data.
• Model 1 Additive Damped Trend (ETS(A,Ad,M))
with α = 0.76,β = 0.0001 and φ = 0.98
• Model 2 Multiplicative Damped Trend ETS(M,Md, N)
with α = 0.76,β = 0.0001 and φ = 0.98
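The damping parameter φ replaces the h-step trend contribution h·bn with (φ + φ² + ... + φ^h)·bn, so long-horizon forecasts level off instead of growing without bound. A Python sketch of the additive damped recursion (the initialization and data are illustrative assumptions; the thesis fit these models in R):

```python
def damped_holt_forecast(y, alpha, beta, phi, h=1):
    """Additive damped trend method:
       l_t = alpha*y_t + (1-alpha)*(l_{t-1} + phi*b_{t-1})
       b_t = beta*(l_t - l_{t-1}) + (1-beta)*phi*b_{t-1}
    Forecast: l_n + (phi + phi**2 + ... + phi**h) * b_n."""
    level, trend = y[0], y[1] - y[0]  # simple start-up values
    for obs in y[1:]:
        prev_level, prev_trend = level, trend
        level = alpha * obs + (1 - alpha) * (prev_level + phi * prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * prev_trend
    damp = sum(phi ** i for i in range(1, h + 1))
    return level + damp * trend

y = [15.0, 16.0, 16.5, 17.2]
# With phi < 1 the forecasts flatten out at long horizons.
print(round(damped_holt_forecast(y, alpha=0.76, beta=0.0001, phi=0.98, h=12), 3))
```

Because the geometric sum converges to φ/(1 − φ), forecasts at very distant horizons approach a fixed value, which is the "flat line in the future" behavior of damped methods.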
Table 5.5: Comparison between actual data and predicted data of year 2011 using the
damped trend method
Month of year Actual Value ETS(A,Ad,M) ETS(M,Md, N)
Jan 15.40 16.39 16.37
Feb 17.12 15.65 15.68
Mar 16.23 16.78 16.75
Apr 17.74 16.37 16.39
May 17.62 17.21 17.20
Jun 18.08 17.53 17.53
Jul 17.73 17.96 17.95
Aug 18.35 17.80 17.81
Sep 16.03 18.22 18.22
Oct 19.61 19.29 19.26
Nov 19.63 16.82 16.92
Dec 18.35 18.20 18.18
After predicting the data for year 2011, we calculate the error measures of the two models
to choose the fitting model with the smallest errors. See Table 5.6.
Table 5.6: Error measures for the damped trend models
Model ETS(A,Ad,M) ETS(M,Md, N)
ME 0.029 0.026
RMSE 1.030 1.034
MAE 0.730 0.731
MPE -0.38 -0.34
MAPE 6.51 6.54
MASE 0.45 0.446
As Table 5.6 shows, the best fitting model is Model 1; we plot it in Figure 5.4. With
the exception of the multiplicative damped trend method, the smoothing parameter for
the slope is estimated to be essentially zero, indicating that the trend is not changing over
time. Of course, the trend estimated by the damped trend methods will change in the
future due to the damping.
Figure 5.4: Forecasts from Damped Holts method with exponential trend
5.1.4 Holt-Winters seasonal method
We employ the Holt-Winters method with both additive and multiplicative seasonality
to forecast electricity consumption in Khan Younis. We discuss two models, additive
and multiplicative, as shown in Section 3.4, in order to choose the best fitting model and
forecast the data of 2011:
• Model 1: the Holt-Winters additive model ETS(A,A,A),
with α = 0.76, β = 0.002 and γ = 0.001
• Model 2: the Holt-Winters multiplicative model ETS(M,A,M),
with α = 0.51, β = 0.09 and γ = 0.002
We apply the method with both additive and multiplicative seasonality to our data and
compare the actual values with the predicted data of year 2011; see Table 5.7.
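As a concrete illustration of the recursions from Section 3.4, here is a Python sketch of the additive Holt-Winters variant (the thesis fit both variants with R's hw(); the seasonal-average initialization and the toy quarterly series are assumptions for illustration):

```python
def holt_winters_additive(y, m, alpha, beta, gamma, h=1):
    """Additive Holt-Winters method:
       l_t = alpha*(y_t - s_{t-m}) + (1-alpha)*(l_{t-1} + b_{t-1})
       b_t = beta*(l_t - l_{t-1}) + (1-beta)*b_{t-1}
       s_t = gamma*(y_t - l_{t-1} - b_{t-1}) + (1-gamma)*s_{t-m}
    Forecast: l_n + h*b_n + the matching seasonal component."""
    level = sum(y[:m]) / m                       # mean of the first season
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]    # simple seasonal start-up
    for t in range(m, len(y)):
        prev_level, prev_trend = level, trend
        level = alpha * (y[t] - season[t - m]) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        season.append(gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * season[t - m])
    return level + h * trend + season[len(y) - m + (h - 1) % m]

# Toy quarterly series (m = 4) with a clear additive seasonal pattern.
y = [10, 14, 8, 12, 11, 15, 9, 13]
print(round(holt_winters_additive(y, m=4, alpha=0.5, beta=0.1, gamma=0.1, h=2), 3))
```

The multiplicative variant replaces the additive seasonal adjustments with ratios, which is what lets seasonal swings grow with the level of the series.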
Table 5.7: Predicted data for the months of 2011 using the Holt-Winters method with
both additive and multiplicative seasonality
Month of year 2011 Actual Value ETS(A,A,A) ETS(M,A,M)
Jan 15.40 15.82 15.48
Feb 17.12 16.27 16.39
Mar 16.23 16.79 16.79
Apr 17.47 16.97 16.88
May 17.62 17.51 17.24
Jun 18.08 18.02 18.58
Jul 17.73 18.18 18.22
Aug 18.35 17.76 17.70
Sep 16.03 18.05 17.90
Oct 19.61 18.52 18.16
Nov 18.63 17.33 18.02
Dec 18.35 18.01 17.55
The results show that the method with multiplicative seasonality fits the data best.
This was expected, as the time plot shows that the seasonal variation in the data increases
with the level of the series. This is also reflected in the two sets of forecasts:
the forecasts generated by the method with multiplicative seasonality show larger
and increasing seasonal variation as the level of the forecasts increases, compared with the
forecasts generated by the method with additive seasonality. See Figures 5.5 and 5.6.
Figure 5.5: Forecasting electricity data using Holt-Winters method with both additive
and multiplicative seasonality.
After predicting the data of year 2011, we calculate the error measures of the two models;
see Table 5.8.
Table 5.8: Error measures for the models with additive and multiplicative seasonal components
Model ME RMSE MAE MPE MAPE MASE
Model 1 -0.0089 0.953 0.659 -0.524 6.069 0.402
Model 2 -0.0019 0.944 0.644 -0.540 5.926 0.393
Figure 5.6: Estimated components for the Holt-Winters method with additive and multiplica-
tive seasonal components.
The best fitting model is the multiplicative model ETS(M,A,M), since it has the smallest
errors; we plot the data using Model 2 in Figure 5.7.
Figure 5.7: Forecasting data using multiplicative seasonal components
5.2 Summary
In this section we compare the Box-Jenkins method with the exponential smoothing
models in order to choose the best model for forecasting the data in the period Jan 2011
to Dec 2011.
First, we compare the fitted models from all the exponential smoothing methods and
choose the best one by calculating the errors. Table 5.9 displays the error measures of all
the models:
• Model 1: simple exponential smoothing with α = 0.89; the model equation is
ℓt = 0.89yt + 0.11ℓt−1
• Model 2: Holt's linear method with α = 0.8, β∗ = 0.2:
ℓt = 0.8yt + 0.2(ℓt−1 + bt−1)
bt = 0.2(ℓt − ℓt−1) + 0.8bt−1
• Model 3: additive damped trend ETS(A,Ad,M)
• Model 4: Holt-Winters multiplicative ETS(M,A,M)
Table 5.9: Error measures for the fitted models of all methods
Model RMSE MAE MAPE
Fit1 1.0350 0.7334 6.49301
Fit2 1.0766 0.7524 6.6129
Fit3 1.03 0.73 6.54
Fit4 0.9442 0.6445 5.9268
From Table 5.9, the best fitting model is Model 4, ETS(M,A,M); that is, the data are
seasonal with period m = 12.
Second, we compare the best fitting exponential smoothing model with the fitted
SARIMA model in order to choose the best model overall, which is the aim of our study.
The fitted ARIMA model is ARIMA(2, 1, 2)(1, 0, 1)12 and the fitted exponential
smoothing model is the Holt-Winters multiplicative model. Table 5.10 displays the predicted
data of year 2011 from both models.
Table 5.10: Comparison between actual data and predicted data from
ARIMA(2, 1, 2)(1, 0, 1)12 and ETS(M,A,M)
Month of year 2011 Actual value ETS(M,A,M) ARIMA(2, 1, 2)(1, 0, 1)12
Jan 15.40 15.48 16.15
Feb 17.12 16.39 15.54
Mar 16.23 16.79 16.75
Apr 17.47 16.88 16.19
May 17.62 17.24 17.10
Jun 18.08 18.58 17.34
Jul 17.73 18.22 18.25
Aug 18.35 17.70 17.22
Sep 16.03 17.90 18.44
Oct 19.61 18.16 18.82
Nov 18.63 18.02 16.43
Dec 18.35 17.55 18.59
Finally, the best model for our data is the Holt-Winters multiplicative model, since it has
the smallest error measures; see Table 5.11.
Table 5.11: Measure Error for ETS(M,A,M) and ARIMA(2, 1, 2)(1, 0, 1)12 Models
Model RMSE MAE MAPE
ETS(M,A,M) 0.9442 0.6445 5.9268
ARIMA(2, 1, 2)(1, 0, 1)12 0.963 0.708 6.244
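The comparison can be reproduced directly from Table 5.10. The Python sketch below computes each model's RMSE over the 2011 hold-out period from the tabulated values (these are computed from the twelve 2011 months only, so they need not match Table 5.11 exactly):

```python
import math

# Actual and predicted 2011 values, copied from Table 5.10.
actual = [15.40, 17.12, 16.23, 17.47, 17.62, 18.08,
          17.73, 18.35, 16.03, 19.61, 18.63, 18.35]
ets =    [15.48, 16.39, 16.79, 16.88, 17.24, 18.58,
          18.22, 17.70, 17.90, 18.16, 18.02, 17.55]
arima =  [16.15, 15.54, 16.75, 16.19, 17.10, 17.34,
          18.25, 17.22, 18.44, 18.82, 16.43, 18.59]

def rmse(a, p):
    return math.sqrt(sum((x - z) ** 2 for x, z in zip(a, p)) / len(a))

print("ETS(M,A,M) RMSE:", round(rmse(actual, ets), 3))
print("SARIMA RMSE:   ", round(rmse(actual, arima), 3))
```

The ETS(M,A,M) predictions track the 2011 actuals more closely than the SARIMA ones, consistent with the conclusion drawn from Table 5.11.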
Notation 5.2.1. Before carrying out the calculations, the data were multiplied by 10−6,
because the raw values are very large.
CONCLUSIONS
From the discussion in this study the following conclusions can be drawn:
• The electricity consumption data form a non-stationary time series.
• After treating the non-stationarity, the best Box-Jenkins model for electricity consumption
is ARIMA(2, 1, 2)(1, 0, 1)12. This was supported by the ACF, AIC, BIC and AICc.
• Among the exponential smoothing models, the best model is ETS(M,A,M). This was
supported by the error measures.
• Comparing the two fitted models, ARIMA(2, 1, 2)(1, 0, 1)12 and ETS(M,A,M), the
latter is the best fitting model.
• This result confirms that exponential smoothing models treat these data well and improve
on the fitted ARIMA model.
• Choosing the best model depends on several criteria (ACF, AIC, BIC, AICc and error
measures), as well as on the researcher's experience and judgment.
Recommendations
Exponential smoothing (ETS) models provide an alternative methodology to ARIMA
models for forecasting. We therefore recommend the following:
• Conduct more research and comparisons to ascertain the extent to which ARIMA models
are suitable in this area.
• We recommend using R, which gives good results.
• We recommend that practitioners in the statistical area use exponential smoothing
methods, because of their efficiency in forecasting.
Bibliography
[1] Box, G. E. P. and Jenkins, G. M., Time Series Analysis: Forecasting and Control,
Holden-Day, San Francisco, (1970).
[2] Brown, R. G., Statistical Forecasting for Inventory Control, McGraw-Hill, New
York, (1959).
[3] Brown, R. G., Smoothing, Forecasting and Prediction of Discrete Time Series, Prentice
Hall, Englewood Cliffs, New Jersey, (1963).
[4] Celia, F., Balaji, V., Les, S., Asish, G. and Amar, R., "A Fuzzy Forecasting Model for
Women's Casual Sales", International Journal of Clothing Science and Technology 15(2),
107-125, (2004).
[5] Degerine, S. and Lambert-Lacroix, S., "Partial autocorrelation function of a nonstationary
time series", Journal of Multivariate Analysis 87, 46-59, (2003).
[6] Gardner, E. S., Jr. and McKenzie, E., "Forecasting trends in time series", Management
Science, (1985).
[7] Wang, George C. S., "A Guide to Box-Jenkins Modeling", The Journal of Business
Forecasting, Spring (2008).
[8] Gottman, John M., Time Series Analysis: A Comprehensive Introduction for Social
Scientists, Cambridge University Press, (1981).
[9] Holt, C. C., "Forecasting trends and seasonals by exponentially weighted averages",
O.N.R. Memorandum 52/1957, Carnegie Institute of Technology, (1957).
[10] Holt, C. C., "Forecasting seasonals and trends by exponentially weighted moving
averages", International Journal of Forecasting, (2004).
[11] Hyndman, Rob J. and Athanasopoulos, George, Forecasting: Principles and Practice,
(2013). Available at http://www.otexts.org/fpp.
[12] Cryer, Jonathan D. and Chan, Kung-Sik, Time Series Analysis: With Applications in R,
2nd edition.
[13] Makridakis, S., Wheelwright, S. and McGee, E., Forecasting: Methods and Applications,
2nd ed., John Wiley and Sons, New York, USA, (1983).
[14] Makridakis, S., Wheelwright, S. and Hyndman, R., Forecasting: Methods and
Applications, 3rd ed., John Wiley and Sons, New York, USA, (1998).
[15] Brockwell, Peter J. and Davis, Richard A., Introduction to Time Series and Forecasting,
2nd edition.
[16] Shumway, Robert H. and Stoffer, David S., Time Series Analysis and Its Applications:
With R Examples, 2nd edition.
[17] Hyndman, Rob J., Koehler, Anne B., Ord, J. Keith and Snyder, Ralph D., Forecasting
with Exponential Smoothing, (2008).
[18] Shaw, Simon, "Exponential Smoothing Example", [email protected],
2003/04 semester II, (2003).
[19] Taylor, J. W., "Exponential Smoothing with a Damped Multiplicative Trend",
International Journal of Forecasting, (2003a).
[20] Taylor, J. W., "Short-term electricity demand forecasting using double seasonal
exponential smoothing", Journal of the Operational Research Society, (2003b).
[21] Wedding II, D. K. and Cios, K. J., "Time series forecasting by combining RBF networks,
certainty factors, and the Box-Jenkins model", (1996).
[22] Winters, P. R., "Forecasting sales by exponentially weighted moving averages",
Management Science, (1960).
Appendix
R Codes Used in My Thesis
First part: SARIMA
# The forecast package is needed for Arima(), auto.arima() and tsdisplay():
library(forecast)
win.graph(width=4.875, height=2.5, pointsize=8)
data(electts)
plot(electts, ylab='electricity consumption', xlab='time', type='o')
par(mfrow=c(2,1))
acf(as.vector(electts), main="")
pacf(as.vector(electts), main="")
tsdisplay(electts, main="")
# Examples of ARIMA fitting:
fit <- Arima(as.vector(electts), order=c(0,1,3))
fit <- Arima(as.vector(electts), order=c(0,0,3))
fit <- auto.arima(as.vector(electts), seasonal=FALSE)
tsdisplay(diff(electts), main="")
Part 2: Exponential smoothing models
fit1 <- ses(electts, alpha=0.2, initial="simple", h=3)
fit2 <- ses(electts, alpha=0.6, initial="simple", h=3)
fit3 <- ses(electts, h=3)
plot(fit1, plot.conf=FALSE, ylab="electricity consumption",
xlab="Year", main="", fcol="white", type="o")
lines(fitted(fit1), col="blue", type="o")
lines(fitted(fit2), col="red", type="o")
lines(fitted(fit3), col="green", type="o")
lines(fit1$mean, col="blue", type="o")
lines(fit2$mean, col="red", type="o")
lines(fit3$mean, col="green", type="o")
legend("topleft",lty=1, col=c(1,"blue","red","green"),
c("data", expression(alpha == 0.2), expression(alpha == 0.6),
expression(alpha == 0.89)),pch=1)
elect <- window(electts,start=2000,end=2012)
fit1 <- holt(elect, alpha=0.8, beta=0.2, initial="simple", h=5)
fit2 <- holt(elect, alpha=0.4, beta=0.6, initial="simple", exponential=TRUE, h=5)
# Results for first model:
fit1$model$state
fitted(fit1)
fit1$mean
plot(fit2, type="o", ylab="electricity consumption", xlab="Year",
fcol="white", plot.conf=FALSE)
lines(fitted(fit1), col="blue")
lines(fitted(fit2), col="red")
lines(fitted(fit3), col="green")
lines(fit1$mean, col="blue", type="o")
lines(fit2$mean, col="red", type="o")
lines(fit3$mean, col="green", type="o")
legend("topleft", lty=1, col=c("black","blue","red","green"),
c("Data","Holt’s linear trend","Exponential trend","Additive damped trend"))
# Level and slope components for Holt's linear trend method and the
# additive damped trend method.
fit1 <- ses(electts)
fit2 <- holt(electts)
fit3 <- holt(electts, exponential=TRUE)
fit4 <- holt(electts, damped=TRUE)
fit5 <- holt(electts, exponential=TRUE, damped=TRUE)
# Results for first model:
fit1$model
accuracy(fit1) # training set
plot(fit2$model$state)
plot(fit4$model$state)
plot(fit1$model$state)
plot(fit3$model$state)
plot(fit3, type="o", ylab="elect",
flwd=1, plot.conf=FALSE)
lines(window(electts,start=2001),type="o")
lines(fit1$mean,col=2)
lines(fit2$mean,col=3)
lines(fit4$mean,col=5)
lines(fit5$mean,col=6)
legend("topleft", lty=1, pch=1, col=1:6,
c("Data","SES","Holt’s","Exponential",
"Additive Damped","Multiplicative Damped"))
elect <- window(electts,start=2000)
fit1 <- hw(electts,seasonal="additive")
fit2 <- hw(electts,seasonal="multiplicative")
plot(fit2,ylab="consumption",
plot.conf=FALSE, type="o", fcol="white", xlab="Year")
lines(fitted(fit1), col="red", lty=2)
lines(fitted(fit2), col="green", lty=2)
lines(fit1$mean, type="o", col="red")
lines(fit2$mean, type="o", col="green")
legend("topleft",lty=1, pch=1, col=1:3,
c("data","Holt Winters’ Additive","Holt Winters’ Multiplicative"))
states <- cbind(fit1$model$states[,1:3],fit2$model$states[,1:3])
colnames(states) <- c("level","slope","seasonal","level","slope","seasonal")
plot(states, xlab="Year")
fit1$model$state[,1:3]
fitted(fit1)
fit1$mean