Eötvös Loránd University
Corvinus University of Budapest
EMD and wavelet decomposition based denoising and
forecasting of crude oil prices
MSc thesis
Author: Supervisor:
Bálint Plangár Milán Csaba Badics
May 10, 2019
ACKNOWLEDGEMENT
Firstly, I would like to express my sincere gratitude to my advisor Milán Csaba Badics
for the continuous support of my research, for his patience, motivation, immense
knowledge and critical mindset. His guidance helped me throughout the research and
writing of this thesis. The door to Milán’s office was always open whenever I ran into a
trouble spot or had a question about my research or writing. He consistently allowed this
paper to be my own work, but steered me in the right direction whenever he thought I
needed it. I could not have imagined having a better advisor and mentor for my research
project.
DECLARATION
Name: Plangár Bálint
ELTE Faculty of Science, programme: Actuarial and Financial Mathematics (Biztosítási és pénzügyi matematika)
NEPTUN code: JL3QFB
Thesis title:
EMD and wavelet decomposition based denoising and forecasting of crude oil prices
As the author of this thesis, I declare, aware of my disciplinary liability, that the thesis is the result of my own independent work and my own intellectual product, that I have consistently applied the standard rules of referencing and citation, and that I have not used parts written by others without proper citation.
Budapest, 10 May 2019 ______________________
signature of the student
Table of Contents
1. Introduction
2. Literature review
3. Critical review
4. Research framework for financial time series forecasting
5. Possible research questions
6. Decomposition methods
   6.1. Empirical mode decomposition
   6.2. Discrete wavelet based decomposition
7. Data
8. Empirical analysis and results
   8.1. Prediction strategy
   8.2. Prediction model
   8.3. Results
9. Robustness check
10. Conclusion
References
List of figures
1. Figure: Simplified representation of the four broad research designs, Source: Own figure
2. Figure: Research framework of financial time series forecasting, Source: Own figure
3. Figure: Plotting the envelope and their mean, Source: MetaTrader, 2012
4. Figure: Comparison of transformations, Source: Uliha, 2016, p. 512
5. Figure: Process of wavelet decomposition, Source: Mirzaei et al., 2010, p. 303
6. Figure: Brent crude oil prices and returns for the entire sample, Source: Own figure
7. Figure: Number of IMFs during the estimation period using expanding window, Source: Own figure
8. Figure: Components of Brent crude oil generated by EMD on the entire sample, Source: Own figure
9. Figure: In-sample components of Brent crude oil generated by EMD, Source: Own figure
10. Figure: Components of Brent crude oil generated by EMD during the recession, 2006.09. – 2010.09., Source: Own figure
11. Figure: Prediction process, Source: Own figure
12. Figure: Selected research design for empirical mode decomposition, Source: Own figure
13. Figure: Ratio of significant lags in the first three IMFs, Source: Own figure
14. Figure: Typical values of permutation entropy estimated from denoised signals, Source: Own figure
15. Figure: Number of dropped IMFs based on sample entropy and number of generated IMFs using expanding window, Source: Own figure
16. Figure: Number of dropped IMFs based on Shannon entropy and number of generated IMFs using expanding window, Source: Own figure
17. Figure: Number of dropped detail components based on Shannon and sample entropy using expanding window, Source: Own figure
18. Figure: Denoised signals and their permutation entropy using wavelet decomposition, Source: Own figure
19. Figure: Cumulative RSE of the two best performing models throughout the out-of-sample period, Source: Own figure
20. Figure: Histograms of the number of IMFs using rolling window, Source: Own figure
21. Figure: Permutation entropy based noise selection in case of EMD, Source: Own figure
22. Figure: Number of dropped detail components based on Shannon and sample entropy using weekly data and rolling window, Source: Own figure
1. Introduction
Signal processing is a long-established technique for analyzing and detecting hidden components in a measured signal. It has been applied mainly in electrical engineering; however, it has several other fields of application, for example processing or interpreting spoken words (Smith et al., 2017) and processing pictures or videos (Baimbetov, 2015). It can also be used for image or video compression (Berres et al., 2017) and noise reduction (Boukhayma et al., 2016).
Signal decomposition is a useful technique for applying noise reduction or for analyzing the original time series in a simpler representation. The most commonly used strategy is the 'divide and conquer' strategy, a decomposition-ensemble learning paradigm: the original time series is divided into meaningful components, and the components are predicted instead of the original series. Decomposition-ensemble models show better performance than conventional single models. Signal decomposition is also useful for noise reduction, which helps to focus on the most important components of the time series. Noise reduction means dropping one or more components after decomposition. Studies have shown that noise reduction can substantially improve data fitting and thereby prediction performance (Jammazi & Aloui, 2012; Guo et al., 2012; Harris & Yilmaz, 2009). Signal processing (decomposition, noise reduction) can thus be thought of as the preprocessing stage of model building.
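To make the 'divide and conquer' strategy concrete, the following minimal sketch illustrates a one-step-ahead decomposition-ensemble forecast. It assumes the PyEMD and statsmodels packages; the autoregressive component models and the lag order are illustrative choices, not the method used by any particular paper.

```python
import numpy as np
from PyEMD import EMD                      # assumed: the PyEMD package
from statsmodels.tsa.ar_model import AutoReg

def decomposition_ensemble_forecast(series, lags=5):
    """'Divide and conquer': decompose, forecast each component, sum the forecasts."""
    series = np.asarray(series, dtype=float)
    emd = EMD()
    emd.emd(series)
    imfs, residue = emd.get_imfs_and_residue()       # extracted components plus residue
    forecasts = []
    for component in list(imfs) + [residue]:
        model = AutoReg(component, lags=lags).fit()  # simple AR model per component
        forecasts.append(model.forecast(steps=1)[0]) # one-step-ahead forecast of the component
    return sum(forecasts)                            # ensemble output by summation
```

Dropping noise-like IMFs before the loop would turn the same sketch into a denoising variant of the strategy.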
Financial time series are characterized by complex nonlinearity, dynamic variation, high irregularity and non-stationarity (Watkins and Plourde, 1994; Krichene, 2007; Zhang et al., 2015). This is why conventional financial econometric tools (ARIMA, GARCH, VAR etc.) are not efficient methods for describing financial time series, and even machine learning models often fail to fit the data and produce satisfying prediction results. Owing to the benefits of signal processing, several studies have therefore applied signal decomposition to economic/financial time series prediction.
The traditional forecasting strategies can be generally described as (Tang et al., 2014):

x_{t+h} = f(X_t) + ε_t    (1)
where x_t denotes the value of the time series at time t, h is the prediction horizon, X_t = {x_{t-1}, ..., x_{t-l}} collects the past values of the original series, and ε_t is the prediction error, assumed to be independent and identically distributed. Based on the design of the function f and the parameter estimation method, the existing models for crude oil price forecasting fall into three main types (Yu et al., 2015): (1) traditional econometric models with relatively simple fixed functional forms and strict data assumptions, for example the auto-regressive integrated moving average (ARIMA) (Xiang & Zhuang, 2013), generalized autoregressive conditional heteroscedasticity (GARCH) (Nomikos & Andriosopoulos, 2012), vector auto-regression (VAR) (Mirmirani & Li, 2005) or error correction models (ECM) (Lanza et al., 2005); (2) machine learning techniques with flexible functions and self-learning capability, such as artificial neural networks (ANN) (Guo et al., 2012), support vector machines (SVM) (Kim, 2003) or support vector regressions (SVR) (Lin et al., 2012); and (3) hybrid models combining several single models (Yu et al., 2015).
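As a point of reference for equation (1), a conventional econometric baseline of type (1) can be written in a few lines. The sketch below assumes the statsmodels package; the (1, 1, 1) order is an arbitrary illustrative choice, not a recommendation from the reviewed literature.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def arima_baseline_forecast(past_values, h=1, order=(1, 1, 1)):
    """Equation (1) with a fixed parametric f: an ARIMA model fitted on past values only."""
    fitted = ARIMA(np.asarray(past_values, dtype=float), order=order).fit()
    return fitted.forecast(steps=h)        # h-step-ahead point forecasts
```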
Nevertheless, signal processing techniques have gradually infiltrated the field of financial time series analysis, because these methods can represent both smooth and volatile functions in a way that captures the time and frequency information of a time series (Yousefi et al., 2005; Guo et al., 2012; Bekiros & Marcellino, 2013). The prediction of oil prices is even more challenging than that of other financial time series, since the price of oil is strongly influenced by many factors that can cause large-scale price movements, for example political events, investors' expectations about the future, the weather, or economic reports of the top oil producing countries.
Oil price forecasting receives great attention, since the oil price plays an important role in the world economy (Guan et al., 2016; H-Y. Zhang et al., 2015; Juvenal & Petrella, 2014). Crude oil is among the most important energy resources, as it is the world's most dominant fuel, making up just over a third of all energy consumed (BP, 2018). Furthermore, crude oil is also the world's largest and most actively traded commodity; Brent crude oil and West Texas Intermediate are among the top three most traded commodities in the world (FIA, 2018).
Although the current literature on traditional financial econometric forecasting has promising results, the tools of signal processing are not widespread in financial econometric research. Moreover, one can find several mistakes in the literature which make the reproduction and comparison of articles difficult. There is no general research framework that could help categorize the articles. Most of the papers cannot be recreated because they lack the necessary parameters, data or program code. It is a common mistake that papers ignore the look-ahead bias, since their models use future information. Some papers do not specify the window size or type (rolling or expanding) and the hyperparameter optimization method; furthermore, researchers rarely emphasize the sensitivity of the applied method to the window type and size. The selection of benchmark models is often not designed properly: frequently a flexible model is compared to a relatively simple one. It is a frequent mistake that the differences between EMD and wavelet decomposition are analysed with different prediction models, and consequently the partial effects (decomposition, noise reduction etc.) are not described thoroughly. Some papers overcomplicate their prediction models, using for example one neural network for the decomposed data and another neural network on the predictions of the first. There are papers which reconstruct the components into low-medium-high frequency components or low-medium-high-trend components, yet often without any analysis of the optimality of the reconstruction method. The model comparison often lacks statistical hypothesis testing. In most of the papers an economic evaluation based on the prediction model (portfolio selection, Sharpe ratio etc.) or a robustness check (different frequencies, volatile vs. calm periods) is missing. Some papers compare their prediction models on only one time series and choose a relatively short out-of-sample period.
Given the available literature, the paper's contribution is threefold: (1) it provides a thorough literature review based on the most important articles; (2) it introduces a general research framework which describes the possible research designs in decomposition-based economic/financial time series forecasting and classifies the articles introduced in the literature review with the help of this framework; and (3) it compares PACF-based, entropy-based and expert judgement based noise selection methods in terms of their contribution to prediction accuracy.
The remainder of this paper is organized as follows. Section 2 summarizes the most important studies that have been carried out in decomposition-based financial time series
forecasting. Section 3 provides a critical review of the literature, focusing on the factors
that restrain papers’ reproducibility and comparability. Section 4 describes the general
research framework for the current and future literature of financial time series
forecasting. Section 5 provides some of the possible research questions and designs that
can be formulated based on the framework. The applied methodologies including
advanced techniques and the research framework of the paper are described in section 6.
After that section 7 introduces the research design and the time series data, which is
followed by the description of the empirical analysis and results in section 8. Finally,
section 9 provides a robustness check for the prediction models and section 10 concludes
the paper.
2. Literature review
This section summarizes the most important studies that have been carried out in decomposition-based economic/financial time series forecasting. All the papers apply a decomposition method from the wavelet or the empirical mode decomposition family. The main purpose of this section is to describe the trends in financial time series forecasting, focusing particularly on the differences between signal processing methods.
Yousefi et al. (2005) illustrated an application of wavelets as a possible technique
for investigating the issue of market efficiency in futures markets for crude oil. They
introduced a wavelet-based prediction procedure to provide forecasts for the spot price
over the horizons of one, two, three and four months. The results of their models are
compared with data from the actual futures markets for oil. The relative performance of
this procedure is used to investigate whether futures markets are efficiently priced. They
used average monthly WTI spot prices and NYMEX futures prices; the data cover the period 1986-2003. A Daubechies wavelet of order seven and a five-level wavelet decomposition are applied as the prediction model. The predictions are calculated by extending the decomposed data on each level, after which the authors reconstruct the series with the inverse wavelet transform; a spline fit is used for the approximation level and a trigonometric fit for the lower detail levels. The authors conclude that the futures market might not be efficiently priced, since the wavelet-based predictions of spot prices were closer to the realized spot prices than the actual futures prices.
Jammazi and Aloui (2012) combined the dynamic properties of a multilayer back propagation neural network (MBPNN) with the Haar à trous wavelet and a six-level decomposition to achieve prominent predictions of crude oil prices. They use monthly WTI crude oil spot prices to generate out-of-sample forecasts; the data cover the period from 1988 to 2010. They choose a prediction horizon of 19 months and the conventional MBPNN as the benchmark model. To ameliorate the fitting ability of the MBPNN, the high frequency components (D1-D6) are dropped and only the smoothed signal is used for model building; the inverse wavelet transform is applied to the smoothed component to reconstruct the smoothed WTI prices. They conclude that removing excess noise from the WTI price can improve the fitting ability of the MBPNN, since the hybrid model outperformed the standard MBPNN model.
Bekiros and Marcellino (2013) used a shift-invariant wavelet transform to analyze the dependence structure and predictability of currency markets across different timescales. Their study attempts to probe the micro-foundations of across-scale causal heterogeneity on the basis of trader behavior over different time horizons. They use three series of daily closing currency rates, namely EUR/USD, JPY/USD and GBP/USD, compute the foreign exchange returns and realized volatility series for model building, and choose a random walk as the benchmark model. The data span the period from 1999.01.05. to 2010.05.10. (2960 observations). The researchers determined the optimal level of multiscale decomposition by minimizing a Shannon entropy-related criterion. They used different models for the approximation and the details: a cubic spline fit for the approximation level (A4) and ARIMA models for the details (D1-D4) to extend the decomposed signal. The prediction procedure includes the following steps: transformation with the SIDWT, boundary extension with the spline and ARIMA fits, reconstruction of the wavelet series with the inverse SIDWT, and finally the one- to five-day-ahead out-of-sample forecasts are obtained and compared to predictions calculated from a neural network. The authors showed that the combination of wavelet decomposition and artificial neural networks provided enhanced predictability.
Yu et al. (2008) proposed an empirical mode decomposition (EMD) based neural
network ensemble learning paradigm for forecasting crude oil spot prices. They used
daily WTI and Brent crude oil prices from the period 1986-2006. After the original crude oil
spot series were decomposed, a three-layer-feedforward neural network (FNN) model
was used to model each of the extracted IMFs. After that an adaptive linear neural
network (ALNN) was applied to formulate an ensemble output for the original crude oil
price series. The following models were used as benchmarks: EMD-FNN-Averaging,
EMD-ARIMA-ALNN, EMD-ARIMA-Averaging, Single FNN, and single ARIMA. The
authors’ results show that the decomposition-and-ensemble strategy can effectively
improve the prediction performance based on RMSE and deviation statistics. They also
show that EMD is a meaningful tool for prediction performance improvement.
Lin et al. (2012) proposed a hybrid forecasting model using EMD and least squares
support vector regression (LSSVR) for foreign exchange rate forecasting. An LSSVR model is constructed to forecast each IMF and the residual individually, and then all these forecasted values are aggregated to produce the final forecast for the foreign exchange rates. This is a typical application of the 'divide and conquer' strategy. Daily
USD/NTD, JPY/NTD and RMB/NTD exchange rates are used and the data covers the
period 2005.07.01. – 2009.12.31. The researchers use the following benchmark models:
EMD-ARIMA, single LSSVR and single ARIMA without time series decomposition.
Their results show that the proposed EMD-LSSVR model outperforms the benchmark
models based on various statistical performance measures.
Xiong et al. (2013) propose a hybrid model built on EMD and the feed-forward neural network (FNN) modeling framework, incorporating the slope-based method (SBM). The slope-based method is proposed to restrain the end effect that occurs during the sifting process of EMD. The authors examine the iterated, direct and multiple-input multiple-output (MIMO) forecasting strategies. After the original crude oil spot series is decomposed, a three-layer feed-forward neural network (FNN) model is used to model each of the extracted IMFs, followed by another FNN that formulates an ensemble output for the original crude oil price series. Weekly WTI crude oil spot price data are used for the period 2000.01.07. – 2011.12.30. They examine several prediction horizons, including 4, 8, 12, 16, 20 and 24 weeks. The researchers use the following benchmark models: a single FNN without EMD, a naïve random walk without EMD and EMD-FNN without SBM. The results indicate that the proposed EMD-SBM-FNN model using the MIMO strategy is the best in terms of prediction accuracy.
Shu-ping et al.'s (2014) study incorporates the idea of decomposition-reconstruction-ensemble. The new insight of their paper is to use the run length judgement method to reconstruct the component sequences based on the characteristics of the components. They build a multiscale combined forecasting model based on EMD and apply ANN and SVM as prediction models. The monthly spot price of WTI crude oil from January 1986 to November 2013 is selected. The oil price series is decomposed and reconstructed into high-, medium-, low-frequency and trend sequences; an ANN model is used for the high-frequency, SVM models for the medium- and low-frequency components individually, and ARIMA for the trend component. The authors apply another SVM to formulate an ensemble output for the original time series. In their analysis the researchers apply the run length judgement method, which is a potential tool for noise selection; however, they do not drop any components. Their model generates out-of-sample predictions 12 and 23 periods ahead. They conclude that the multiscale combined model obtains the best forecasting result compared with single ARIMA, Elman, SVM and GARCH models and with combined models including the ARIMA-SVM and EMD-SVM-SVM methods.
Yu et al. (2015) proposed a decomposition-ensemble methodology with data-
characteristic driven reconstruction for crude oil price forecasting to enhance prediction
accuracy and reduce computation complexity. Four main steps are involved in the study:
data decomposition for simplifying the complex data, component reconstruction based on
data-characteristic driven modeling, individual prediction for each reconstructed
component and ensemble prediction for the final output. The weekly crude oil prices of the WTI and Brent markets are used; the data cover the period from January 1986 to July 2014.
They analyze multiple reconstruction methods including run length judgement, fine to
coarse and sample entropy reconstruction. Besides, numerous benchmark models are
applied to test their proposed method including typical decomposition-ensemble models
without reconstruction and similar decomposition-ensemble models with existing
reconstruction strategies. The authors tested the proposed method with several prediction
horizons, including 1, 2, 3 and 4 weeks. The results indicate that the data-characteristic-
driven reconstruction approach improves the existing decomposition-ensemble
techniques based on statistical performance measures and computational time.
Zhu et al. (2016) developed an adaptive multiscale ensemble learning paradigm
incorporating ensemble empirical mode decomposition (EEMD), particle swarm
optimization and LSSVM with a kernel function prototype. Three main steps are involved in the study: the original oil price series is decomposed with the extrema symmetry expansion EEMD (ESE-EEMD); the fine-to-coarse reconstruction algorithm is then applied to identify the high-frequency, low-frequency and trend components; and different prediction models are used for the components, namely ARIMA for the high-frequency components and LSSVM for the low-frequency and trend components, after which the prediction results of all components are aggregated. The article analyzes three energy price series, including daily WTI crude oil prices.
The study applies the fine-to-coarse method which can be used for noise selection.
Numerous benchmark models are applied, including typical decomposition-ensemble
models without reconstruction and similar decomposition-ensemble models with existing
reconstruction strategies. The results indicate that the proposed method can significantly
improve the level and directional prediction accuracy.
Lahmiri (2016) presents a new time series forecasting model which integrates variational mode decomposition (VMD) and a general regression neural network (GRNN). Three benchmark models are applied: EMD-GRNN, FFNN and ARIMA. Daily data of WTI, CANUS and the volatility index from 2008.01.02. to 2013.12.16. are used to conduct the experiments. Two main steps are involved: EMD or VMD is applied to the original data to obtain the components, which are then fed to the GRNN for forecasting purposes. The researchers demonstrated the superiority of the VMD-based method over the three competing prediction approaches, suggesting that VMD is an effective technique for the analysis and prediction of economic and financial time series. VMD is able to separate tones of similar frequencies and is more robust to noisy data than EMD.
Table 1 summarizes the main articles mentioned in this section and gives further details of the research papers. Three rows of the table contain articles that are not discussed in the literature review; however, the decomposition methods they use can be useful for financial time series forecasting.
This section introduced the most important research papers that readers are likely to encounter. The review helps the reader become familiar with the current trends in financial time series forecasting, particularly with decomposition-based prediction strategies. In spite of the promising results found in the literature, signal processing methods are not widespread in economic/financial time series forecasting because the articles are not reproducible and comparable. A general research framework is missing from the literature, one that could help categorize the articles, determine the parameters necessary for reproduction and foster the comparability of studies.
1. Table: Details of the research papers described in the literature review, Source: Own table

| Author | Data | Frequency | Window | Decomposition method | Stopping criterion | Noise selection / noise reduction | Aggregation | Prediction horizon | Main prediction model |
|---|---|---|---|---|---|---|---|---|---|
| Yousefi et al. (2005) | WTI spot price, NYMEX futures | Monthly | 100 random samples | Daubechies' wavelet | Expert judgement | Not used | Signal processing inverse | 1, 2, 3, 4 | Spline, trigonometric fit |
| Yu et al. (2008) | WTI spot price, Brent spot price | Daily | NaN | EMD | Residual based | Not used | Learning | 1, 30 | FNN (ensemble: ALNN) |
| Jammazi & Aloui (2012) | WTI spot price | Monthly | NaN | Haar à trous wavelet | Expert judgement | Expert judgement / drop D1-D6 | Signal processing inverse | 19 | MBPNN |
| Lin et al. (2012) | FX rates | Daily | NaN | EMD | Residual based | Not used | Sum of components | NaN | LSSVR |
| Guo et al. (2012) | Wind speed | Monthly/Daily | NaN | Modified EMD | Residual based | Expert judgement / drop more freq. | Sum of components | 1, 18 | FNN |
| Bekiros & Marcellino (2013) | FX rates, volatility, return | Daily | Rolling | Shift invariant DWT | Expert judgement | Not used | Signal processing inverse | 1, 2, 3, 4, 5 | Spline, ARIMA |
| Xiong et al. (2013) | WTI spot price | Weekly | Multiple window types | SBM-EMD | NaN | Not used | Learning | 1, 4, 8, 12, 16, 20, 24 | FNN (ensemble: FNN) |
| Shu-ping et al. (2014) | WTI spot price | Monthly | NaN | EMD | NaN | Not used | Learning | 12, 23 | SVM, NN, ARIMA (ensemble: SVM) |
| Xiong et al. (2014) | NN3 competition | Monthly | NaN | EMD, Daubechies' wavelet | Expert judgement | Not used | Learning | 1 | SVR (ensemble: SVR) |
| Yu et al. (2015) | WTI spot price, Brent spot price | Weekly | NaN | EEMD | NaN | Not used | Sum of components | 1, 2, 3, 4 | LSSVR, ANN |
| Zhu et al. (2016) | WTI spot price, CO2 EUA | Daily | Rolling | ESE-EEMD | Expert judgement | Not used | Sum of components | 1 | ARIMA, LSSVM |
| Lahmiri (2016) | WTI spot price, FX rates, VIX | Monthly/Daily | NaN | VMD | Residual based | Not used | Learning | 1 | GRNN |
| Afanasyev & Fedorova (2016) | Power exchange | Daily | Rolling | CEEMDAN | Expert judgement | Not used | NaN | 1 | NaN |
3. Critical review
The main purpose of this section is to provide a critical review of the literature, focusing primarily on common mistakes. The criticisms are formulated based on
the research papers mentioned in the literature review. The section can help researchers
to form an opinion on the results of studies and it can foster the application of signal
processing in economic/financial time series forecasting.
First and foremost, there is no general research framework that summarizes the main results and conclusions of the studies, the relevant research questions and the possible research designs; nor is there a review paper in the decomposition-based forecasting literature. A general research framework could solve these problems. This study therefore provides such a framework for the current and future literature of economic/financial time series forecasting in section 4 and lists possible research questions and designs in section 5.
It is not possible to reproduce most of the papers because they lack the necessary model parameters, data or code. In spite of the fact that decomposition-based economic/financial time series forecasting studies have promising results, their external and internal validity is low. The most frequently missing elements of the research descriptions are the window size and type used for decomposition and prediction. The data, the packages/toolboxes used for the analysis and a model description with the parameter selection should be provided. If studies were easily reproducible, great progress could be made in applying these methods to economic/financial time series analysis.
The stopping criteria of decomposition methods are not analyzed thoroughly, and their instability is frequently ignored. Analyzing the connection between the number of components and the characteristics of a time series is a prerequisite to the spread of signal processing methods in economic/financial time series analysis. A well-designed robustness check, for example changing the data type (return or price), the window size and type, or the frequency of the data, can address this problem.
It is a common mistake that papers ignore the look-ahead bias, since their models use future information. This artificially improves prediction accuracy and gives results that are actually unreliable. The decomposition result is highly sensitive to the window; therefore decomposing the entire time series and then using the result for a prediction at an earlier point in time is a mistake, and such a model cannot be fairly compared to traditional econometric tools. Based on the descriptions and figures provided in the studies, researchers do not analyze the number of components during the prediction process, although this is a crucial element of an analysis. A thorough analysis should be made of the number of components, since it changes as the window used for decomposition rolls or expands (e.g. in the case of EMD). Because the wavelet transform has a predetermined number of components, their information content should instead be analyzed throughout the prediction horizon.
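As an illustration of how look-ahead bias can be avoided and how the instability of the component count can be monitored, the sketch below re-runs EMD on each expanding window before every forecast. It assumes the PyEMD package, and the initial window length of 500 observations is an arbitrary illustrative choice.

```python
import numpy as np
from PyEMD import EMD          # assumed: the PyEMD package

def imf_counts_without_lookahead(prices, start=500):
    """Decompose only the data available at each step and record the number of IMFs."""
    prices = np.asarray(prices, dtype=float)
    emd = EMD()
    counts = []
    for t in range(start, len(prices)):
        imfs = emd.emd(prices[:t])          # decomposition uses information up to t only
        counts.append(imfs.shape[0])        # the IMF count typically varies with the window
    return np.array(counts)
```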
It is rarely explained which model should be fitted to which component (low-, medium-, high-frequency etc.), and the statistical analysis of the components is often missing from studies. Reconstructing the components into low-, medium- and high-frequency components is a frequently used method; however, it is difficult to explain why this method should work in general. Researchers should pay more attention to the analysis of the components (complexity, nonlinearity, structural breaks etc.) and choose the reconstruction method and forecasting models accordingly. This can also foster the comparison of different decomposition methods.
It is a frequent mistake in the literature that the differences between decomposition methods are analyzed with different prediction models; consequently, the partial effects (decomposition, reconstruction, noise reduction) are not described properly. In this case
it is impossible to decide whether the decomposition or the noise reduction improved the
prediction accuracy. A properly designed study selects benchmark models in a way that
can separate the positive effects of decomposition and noise reduction. Consequently the
choice of benchmark models is crucial to the separation of partial effects.
It is also a mistake that the out-of-sample time period is often too short. The out-of-sample evaluation shows how good the applied method is, and the longer the out-of-sample period, the more reliable the results become. Consequently, it is worth choosing a long out-of-sample period and repeating the analysis with both rolling and expanding windows.

Statistical comparison of models is rarely carried out; therefore the significance of the difference between two prediction models is not checked. The Diebold-Mariano test is the most frequently used tool, although using the model confidence set is a better approach to testing the differences between models. Moreover, in most of the papers an economic evaluation based on the prediction models is missing (for example analyzing the differences between Sharpe ratios based on a portfolio selection). Statistical evaluation per se does not provide information about the economic efficiency and applicability of models.
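For the statistical comparison discussed above, a minimal Diebold-Mariano test on squared-error losses can be sketched as follows. The error arrays are hypothetical out-of-sample forecast errors of two competing models, and the simple long-run variance estimator is an illustrative simplification of the usual HAC correction.

```python
import numpy as np
from scipy import stats

def diebold_mariano(errors_a, errors_b, h=1):
    """Two-sided DM test on the squared-error loss differential of two forecast error series."""
    d = np.asarray(errors_a) ** 2 - np.asarray(errors_b) ** 2   # loss differential
    n = len(d)
    # autocovariances of d up to lag h-1 (long-run variance of the mean differential)
    gamma = [np.cov(d[k:], d[:n - k])[0, 1] for k in range(h)]
    var_mean = (gamma[0] + 2.0 * sum(gamma[1:])) / n
    dm_stat = d.mean() / np.sqrt(var_mean)
    p_value = 2.0 * stats.norm.sf(abs(dm_stat))                 # asymptotic normal p-value
    return dm_stat, p_value
```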
Studies usually lack a robustness check. It can easily be done by changing the frequency of the data (intraday, daily, weekly etc.), using price series instead of return series, or changing the window size and type. Applying both linear and nonlinear prediction models is also a good strategy for checking robustness. Articles often ignore the analysis of decomposition and noise reduction in periods with different characteristics (volatile, smooth, noisy periods). A robustness check can strengthen the reliability of the results and provides more information about the decomposition strategy.
There are no articles in the decomposition-based forecasting literature that apply simulation. A well-designed study analysing how noise reduction performs for time series with different characteristics is missing. With simulation the data generating process can be controlled, so a comprehensive analysis of decomposition and noise reduction can be performed.
The definition of noise and its representation in a time series are not described thoroughly, and the implementation of noise reduction can differ across prediction approaches, which makes the comparison of research papers more difficult. It is difficult to measure the partial effect stemming from noise reduction if the concept of noise is not described properly.
The criticisms expressed in this section are the author's own opinion and are not taken from any review paper. Nevertheless, avoiding the aforementioned mistakes has several benefits: it can foster the application of signal processing methods in economic/financial time series forecasting and strengthen the reliability and validity of the results.
4. Research framework for financial time series forecasting
The main purpose of this section is to provide a research framework for the current and future literature of financial time series forecasting. The section emphasizes how difficult the interpretation and reproduction of an article is without a proper general framework. Such a framework is currently missing from the literature despite its many advantages: it paves the way for comparing papers in the field of financial time series forecasting and gives a road map for future research.
The framework provides the following advantages: (1) it facilitates the aggregation of the results of the current literature, and consequently helps us better understand the efficiency of signal processing techniques in financial time series analysis; moreover, it paves the way for a meta-study in which the current results can be combined; (2) it helps with the formulation of the research design, since it is easier to design a study if the general framework of the field is known; (3) it facilitates the comparison and classification of research papers, since it provides the necessary groupings for classification; (4) it helps researchers specify all the necessary details and parameters of their research design, thereby facilitating the paper's reproduction; and (5) it helps determine the reliability of the results presented in a research paper. All in all, the framework fills a gap in the current literature and opens up opportunities for further research.
Based on the papers described in the literature review, there are four broad research designs. The designs are depicted in figure 1; although the figure simplifies the research approaches, its clarity makes them easy to understand. As a first step, all of the approaches involve the decomposition of the original signal into components. The first method applies noise selection and noise reduction in the second stage in order to enhance prediction performance, then applies the signal processing inverse in order to obtain the denoised signal. If the denoising method is well designed, the resulting signal should be less complex and hopefully easier to predict. The second approach also involves decomposition in the first stage; after that it predicts the future value of each component in the second stage, then applies the signal processing inverse and obtains the predicted values. This research design gives us the possibility to predict different components with different prediction methods (e.g. an ANN for highly irregular components and linear regression for a smooth component). Reconstructing the components into low-medium-high frequency components is a frequently applied technique in the literature. The third research design involves decomposition in the first stage and prediction of the components in the second stage (reconstruction of the components can be used here as well); however, instead of using the inverse of the signal processing method, it builds a new prediction model that uses the second-stage predictions as input variables. This research design can put different weights on the predicted values of the second stage, but it is a complex prediction approach and its application should be well-founded. The fourth research design first applies a decomposition method, then the components are used as input variables for a prediction model. Of course, the reconstruction of components into low-medium-high frequency components can be used here as well. Nevertheless, the fourth research design is the least frequently used design in the literature.
1. Figure: Simplified representation of the four broad research designs, Source: Own figure
The detailed representation of the research designs is given in figure 2. This paper proposes the structure introduced in the figure as the general research framework for decomposition-based financial time series forecasting. The dark green boxes represent the main stages of a prediction process. The first stage involves the data selection, the second stage is the decomposition, which is followed by the frequency selection. After the frequency selection is done, we arrive at the fourth stage, the reconstruction. After that researchers should choose the number of models in the fifth stage, design the prediction thoroughly in the sixth stage and finally select the aggregation method. This framework is sufficiently detailed to categorize research papers; moreover, it defines all the necessary parameters which should be given in any research paper in order to ensure comparability and replication. Furthermore, with the help of the framework, it is easier to determine what the research question of an article is.
The first stage involves the selection of the data type, data frequency and window. Returns and price levels should be treated separately, due to the different characteristics of the same data expressed in returns and in price levels. Another important issue with the data selection is the frequency, because a model that fits weekly data best is not necessarily the best on intraday data. The window size and type (rolling or expanding) should also be given, because some methods are highly sensitive to these parameters; the size and type of the window are the most frequently missing elements of a research design. Here bootstrap means selecting random samples of consecutive observations of equal length. The elements of the first stage (data type, frequency and window) can also be used for robustness checks, which are rarely performed.
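A minimal sketch of the window and bootstrap choices in this first stage is given below; the function names and parameters are illustrative and are not taken from any of the reviewed papers.

```python
import numpy as np

def window_bounds(n_obs, size, kind="rolling"):
    """Yield (start, end) estimation-window bounds for each out-of-sample step."""
    for end in range(size, n_obs):
        start = end - size if kind == "rolling" else 0   # expanding keeps the whole history
        yield start, end

def bootstrap_block(series, length, rng=None):
    """Draw one random sample of consecutive observations with the given length."""
    if rng is None:
        rng = np.random.default_rng()
    start = rng.integers(0, len(series) - length + 1)
    return series[start:start + length]
```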
The second stage is the decomposition. In this stage a broad method family, the exact decomposition method and the stopping criterion should be selected. In the current literature there are two frequently used approaches: empirical mode decomposition (EMD) and wavelet based decomposition. Both method families have improved modifications which have, in theory, better characteristics; however, there are few papers in the literature which analyze the partial effect of choosing an improved modification instead of the simplest version. These are listed in column II. b). The variational mode decomposition (VMD) method in column II. a) has been proposed as an alternative to EMD that can easily separate tones of similar frequencies in data where EMD fails. This paper lists VMD and EMD separately because VMD is based on a different algorithm. EMD is the simplest version of the decomposition family; it will be described later in this paper. The EMD modified with the slope based method intends to handle the end effect problem of the simple EMD, while the ensemble empirical mode decomposition (EEMD) intends to handle the potential mode mixing problem of EMD. The extrema symmetry expansion EEMD is a modified version of EEMD which gives a solution for both the mode mixing and the end effect problem. Nevertheless, EEMD introduces additional noise into the results of the decomposition and does not produce a stable number of IMFs when applied repeatedly to the same time series. The complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) was introduced in the literature to solve this problem. The third decomposition method described in this paper is the discrete wavelet transform (DWT); this method will also be described later in detail. The most frequently applied wavelets in the literature are the Daubechies and the Haar wavelets. However, the classical decimated DWT involves subsampling of the filter output to half the original length, which leads to a serious drawback: the transform is not shift invariant. Specifically, the DWT of a shifted signal is not the shifted version of the DWT of the signal. Nevertheless, an undecimated DWT can be implemented without the subsampling technique; such transforms are invariant to circular shifts of the time series. A new variation of the undecimated DWT, namely the shift invariant DWT (SIDWT), has been proposed in the literature. Besides being shift invariant, the SIDWT employs a specialized periodic extension pattern to deal with boundary effects. However, the SIDWT is not an orthogonal basis, since it produces an over-determined representation of the series. The SIDWT method will be described later in this paper. After the decomposition method has been chosen, a stopping criterion should be selected. Research papers which apply certain thresholds, use log2(N), determine the maximum number of sifting iterations or use a predetermined order as a stopping criterion are classified in the expert judgement group. The residual based stopping criterion is applicable only in the case of the EMD family; this will be described later in this paper. Some papers pursue an optimal decomposition with respect to the minimization of an entropy-related criterion, which describes the information-relevant properties of the representation of a signal.
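The wavelet variants in this stage can be contrasted with a short sketch, assuming the PyWavelets package. The input file name is hypothetical; the db7 wavelet and the five-level depth echo the choices reported by Yousefi et al. (2005).

```python
import numpy as np
import pywt

prices = np.loadtxt("brent_prices.txt")[:1024]   # hypothetical input; swt needs a length divisible by 2**level

# Decimated DWT: coefficients are subsampled, so the transform is not shift invariant
dwt_coeffs = pywt.wavedec(prices, wavelet="db7", level=5)        # [A5, D5, D4, D3, D2, D1]

# Undecimated (stationary) transform: every level keeps the original length and is shift invariant
swt_coeffs = pywt.swt(prices, wavelet="haar", level=5)           # [(A5, D5), ..., (A1, D1)]

# Simple wavelet denoising: keep the approximation, zero out all detail coefficients
denoised = pywt.waverec([dwt_coeffs[0]] + [np.zeros_like(c) for c in dwt_coeffs[1:]], "db7")
```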
The third stage is the frequency selection. This stage starts with noise selection. Here expert judgement contains all the papers that selected a certain component or components as noise without analysis (e.g. select the highest frequency component). A noise component can also be selected with the help of the partial autocorrelation function (PACF), which measures the linear relation of a time series with its own lagged values when the intermediate effects are filtered out. The run length judgement method is a tool for measuring the irregularity of a given signal: it assigns a run number to a signal, and the larger the number, the higher the volatility. Another way of selecting noise is to use an entropy-related approach; the permutation entropy, the sample entropy and the Shannon entropy are possible tools for noise selection. There could be other entropy definitions as well; however, column III. a) lists those that are mentioned in the papers of Table 1. Many papers do not apply any noise selection method; they are classified into the 'skip noise selection' category. After the noise component or components are selected, we can drop one or more components. Here a 'no drop' box is introduced in order to classify the papers which skipped the noise selection procedure.
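Two of the noise selection tools named above, the PACF-based criterion and the permutation entropy, can be sketched as follows. The lag count, the embedding order and the 95% band are illustrative parameter choices, and the permutation entropy follows the standard Bandt-Pompe definition rather than any specific paper's implementation.

```python
import math
import numpy as np
from statsmodels.tsa.stattools import pacf

def significant_lag_ratio(component, nlags=20):
    """Share of PACF lags outside the approximate 95% confidence band of white noise."""
    values = pacf(component, nlags=nlags)
    bound = 1.96 / np.sqrt(len(component))
    return np.mean(np.abs(values[1:]) > bound)      # close to zero suggests a noise-like component

def permutation_entropy(component, order=4):
    """Normalized permutation entropy: values near 1 indicate an irregular, noise-like component."""
    counts = {}
    for i in range(len(component) - order + 1):
        pattern = tuple(np.argsort(component[i:i + order]))
        counts[pattern] = counts.get(pattern, 0) + 1
    p = np.array(list(counts.values()), dtype=float)
    p /= p.sum()
    entropy = -np.sum(p * np.log2(p))
    return entropy / np.log2(math.factorial(order))
```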
The fourth stage is the reconstruction. In this stage one should select the reconstruction type and rule. Total aggregation means aggregating the components back to the original level; it is the box for those articles which drop a noise component, then aggregate the remaining components and analyze the denoised time series later on. Several papers reconstruct the components into low, medium and high frequency components in order to analyze different features of the signal separately and improve prediction performance. The 'no reconstruction' box is for those papers which do not use reconstruction. There are several reconstruction rules that can be applied. Expert judgement incorporates all the papers that use reconstruction without analysis. The run length judgement method is the same as in the case of noise selection and can also be used as a reconstruction rule. In the case of the data-characteristic driven reconstruction rule, the decomposed modes are thoroughly analyzed to explore the hidden data characteristics (complexity, cyclicity, mutability, tendency) and are reconstructed accordingly. The fine-to-coarse reconstruction rule can be described as follows: it is a high-pass filtering that adds up the fast oscillations (IMFs with smaller index) towards the slow ones (IMFs with larger index). The components are summed one by one and a t-test identifies how many components can be summed without the mean departing significantly from zero; these components are reconstructed into a high frequency component and the rest of the IMFs into a low frequency component. A clustering method can also be used on statistics calculated from each component.
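A minimal sketch of the fine-to-coarse rule described above is given below. The 5% significance level is an illustrative choice, and the exact split convention (which side the boundary IMF falls on) varies slightly across papers.

```python
import numpy as np
from scipy.stats import ttest_1samp

def fine_to_coarse_split(imfs, alpha=0.05):
    """Return the first index k at which the partial sum of IMFs 1..k departs from zero mean.

    IMFs before this index are grouped into the high-frequency component,
    the remaining IMFs (and the residue) into the low-frequency/trend component.
    """
    for k in range(1, len(imfs) + 1):
        partial_sum = np.sum(imfs[:k], axis=0)              # fast oscillations are added first
        if ttest_1samp(partial_sum, 0.0).pvalue < alpha:
            return k
    return len(imfs)
```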
In the fifth stage one should choose the number of models used for prediction. The 'one model' box typically contains those papers which decompose the original data, drop a noise component and then aggregate the components back to the original level. In the case of the 'same models' and 'different models' boxes, researchers build multiple prediction models. For example, a paper classified into the 'same models' box decomposes the original data and builds an ANN for each of the components, while a paper from the 'different models' box builds an ANN for one component, an SVM for another, and so on.
The sixth stage is the prediction. Here the window, the prediction horizon, the prediction model, the feature selection method and the hyperparameter optimization method should be selected. Researchers can select the 'same' window if they want to use the same type as in the case of column I. c), or they can choose a different one. The prediction horizon can be set to one or multiple periods. Column VI. c) lists all the prediction models that were used in the papers introduced in Table 1. Feature selection lists the methods that can be used for selecting input variables; here expert judgement contains all the papers that selected input variables without analysis. Using an ANN typically involves input selection through optimization on a validation set. It is important to point out that most of the papers do not select hyperparameters through optimization on a validation set; instead they use a predetermined model architecture.
The seventh stage is the aggregation. The 'no aggregation' box is created for those studies that aggregate the decomposed signal in an earlier stage (first method) or use the components as input variables to predict the future value of the original signal directly (fourth method). The 'prediction' box incorporates studies where a new model is fitted on the predicted values obtained from each of the components. Some papers apply the inverse of the decomposition method at the end to obtain the predicted values (second method); these are classified into the 'signal processing inverse' box. This means the inverse wavelet transform or, in the case of the EMD family, the summation of the components.
Table 2 classifies the studies described in the literature review based on the general research framework introduced in this section. This paper assigns seventeen numbers to each study based on its main prediction model. Every number represents a column of figure 2: the first number shows the data type, the second the data frequency, and so on, with the last number representing the aggregation method. A number should be selected from each column, which is why a zero value is given to a column when the researchers do not specify it. For example, the first three numbers would be 1-2-0 for a paper which analyzes daily return data but gives no information about the window. Some papers apply multiple prediction models or window types; in this case several box numbers are given for the same column.
This section of the paper introduced a general research framework which describes the possible research designs in decomposition-based economic/financial time series forecasting. The framework can help compare papers in the field of economic/financial time series forecasting and gives a road map for future research. Furthermore, it defines all the necessary parameters that should be given for replication. Besides, this section also classified those research papers that were introduced in the literature review. Based on Table 1, the window size and type and the stopping criterion are the parameters most frequently missing from a research design.
2. Figure: Research framework of financial time series forecasting, Source: Own figure
2. Table: Classification of researches introduced in the literature review (author, title, category), Source: Own table

Yousefi et al. (2005), "Wavelet-based prediction of oil prices": 2-4-3 | 3-6-1 | 7-1 | 3-7 | 1 | 1-12-12-1-1 | 3
Yu et al. (2008), "Forecasting crude oil price with an EMD-based neural network ensemble learning paradigm": 2-2-0 | 2-1-2 | 7-1 | 3-7 | 2 | 0-12-6-1-2 | 2
Lin et al. (2012), "Empirical mode decomposition–based least squares support vector regression for foreign exchange rate forecasting": 2-2-0 | 2-1-2 | 7-2 | 3-7 | 2 | 0-0-7-1-2 | 3
Jammazi & Aloui (2012), "Crude oil price forecasting: Experimental evidence from wavelet decomposition and neural network modeling": 2-4-0 | 3-7-1 | 1-2 | 3-7 | 1 | 0-2-6-1-2 | 3
Bekiros & Marcellino (2013), "The multiscale causal dynamics of foreign exchange markets": 1-2-1 | 3-8-1 | 7-1 | 3-7 | 1 | 1-12-24-3-1 | 3
Xiong et al. (2013), "Beyond one-step-ahead forecasting: Evaluation of alternative multi-step-ahead forecasting models for crude oil prices": 2-3-12 | 2-2-0 | 7-2 | 3-7 | 2 | 1-12-6-3-2 | 2
Shu-ping et al. (2014), "Multiscale Combined Model Based on Run-Length-Judgment Method and Its Application in Oil Price Forecasting": 2-4-0 | 2-1-0 | 7-2 | 2-2 | 3 | 0-2-468-3-2 | 2
Yu et al. (2015), "A decomposition–ensemble model with data-characteristic-driven reconstruction for crude oil price forecasting": 2-3-0 | 2-3-0 | 7-2 | 2-2345 | 2 | 0-12-67-1-12 | 3
Zhu et al. (2016), "An Adaptive Multiscale Ensemble Learning Paradigm for Nonstationary and Nonlinear Energy Price Time Series Forecasting": 2-2-2 | 2-4-1 | 7-2 | 2-4 | 3 | 1-1-47-1-2 | 3
Lahmiri (2016), "A variational mode decomposition approach for analysis and forecasting of economic and financial time series": 2-23-0 | 12-1-2 | 7-2 | 3-7 | 1 | 0-1-6-1-1 | 2
5. Possible research questions
This section provides some of the possible research questions and designs that can
be formulated based on the framework. The main purpose of this section is to briefly
introduce the research questions which can solve one of the problems mentioned in the
critical review. These are potential articles which could represent significant progress in the
economic/financial time series forecasting literature.
Analyzing the stopping criterion of the EMD model family. This problem is
important because the stopping criterion can influence the number of components. There
are cases where the spline fit cannot be done or the threshold should be changed in order
to ensure convergence. During the analysis the window size and type, data frequency,
volatile and smooth periods should be taken into account.
Investigating the sensitivity of the decomposition of economic/financial time series
to the selected window. In this case one should test how sensitive the EMD and wavelet
resolution is, how the number of components change when the window rolls or expands
in case of EMD and how the information content of the components change in case of
wavelet. One should select the number of components in advance, when applying wavelet
decomposition, that is why the information content of these components should be
considered.
Comparing noise selection methods and analyzing their effect on prediction
accuracy. In this research project one should summarize the possible noise definitions in
case of the four prediction methods (introduced in section 4.) and find which noise
selection method is appropriate to use in each of the cases.
Comparing different reconstruction methods. In this research project one should
test which of the reconstruction methods is the most efficient, how many components (low-,
medium-, high-frequency etc.) should be formed, and which characteristics should be used to
reconstruct the components.
Analyzing the partial effects of decomposition, reconstruction and noise reduction
separately. All of these methods can enhance prediction performance; however, their
individual contribution to prediction accuracy is rarely analyzed.
Testing the efficiency of prediction models in case the values of components are
predicted separately. Some papers apply different prediction models on components,
some choose only one model. Nevertheless, it should be investigated whether it is worth
choosing models based on the statistical properties of each component.
Perform an analysis based on simulated data. Define several data generating
processes which produce time series with different frequencies, then use them to create an
aggregated time series. Due to the controlled nature of the analysis, the results of the signal
processing methods will be more reliable.
Investigating whether there is a relationship between the amplitude of noise and
liquidity, volatility or volatility of volatility in case of time series which are from different
asset classes.
Analyze the contribution of decomposition methods to prediction accuracy in case of
economic/financial time series that are rarely analyzed with signal processing methods
(for example the volatility index, inflation or GDP).
Apply multi-dimensional decomposition, exploiting the relations between time
series. One should investigate whether the result of simultaneous decomposition can be
used to predict one or the other time series more accurately.
The list of potential research questions above does not claim to be exhaustive;
however, it illustrates well that several studies are missing from the literature.
Nevertheless, the author argues that the results of such studies could represent major
progress in the decomposition based economic/financial time series forecasting literature.
6. Decomposition methods
This section provides a detailed introduction to the decomposition methods,
namely, empirical mode decomposition (EMD) and wavelet based decomposition. These
are the two methods which are used to analyse the original signal in a new representation.
6.1. Empirical mode decomposition
The empirical mode decomposition (EMD) method first appeared in the article of
Huang et al. (1998). They introduced a new way to deal with both non-stationary and
nonlinear data by decomposing the signal first and analysing the physical meaning of the
decomposition later.
EMD has the characteristics of being intuitive, direct, a posteriori and adaptive with
the basis of the decomposition based on the data. The basic principle of EMD is to
decompose the signal into a sum of oscillatory functions, namely, intrinsic mode
functions (IMF). The decomposition is based on three assumptions: (1) the signal has at
least two extrema, one maximum and one minimum, (2) the characteristic time scale is
defined by the time lapse between the extrema, (3) if the data has no extrema but contains
only inflection points, then it can be differentiated once or more times to obtain the extrema
(Huang et al. 1998). Huang et al. (1999) introduced two requirements in order to get
meaningful IMFs: (1) in the whole data series, the number of extrema (sum of maxima
and minima) and the number of zero crossings, must be equal, or differ at most by one,
(2) the mean value of the envelopes defined by local maxima and minima must be zero at
all points. Nevertheless, the orthogonality of the components is not guaranteed theoretically.
For some data, neighbouring components can certainly contain sections carrying
the same frequency at different time durations. The amount of leakage usually depends
on the length of the data as well as on the decomposition results. However, Huang et al. (1998)
argue that orthogonality is a requirement only for linear decomposition systems; it would
not make physical sense for a nonlinear decomposition such as EMD.
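A simple way to gauge this leakage in practice is to inspect the normalized inner products between the extracted IMFs. The sketch below is illustrative only: it assumes the IMFs are available as equal-length NumPy arrays, and the cosine-similarity measure it uses is not the exact orthogonality index of Huang et al. (1998).

```python
# A minimal sketch for gauging leakage between IMFs, assuming NumPy;
# the cosine-similarity measure is illustrative, not Huang et al.'s index.
import numpy as np

def pairwise_overlap(imfs):
    """Matrix of normalized inner products between IMFs (0 means orthogonal)."""
    c = np.vstack(imfs)
    norms = np.linalg.norm(c, axis=1)
    gram = c @ c.T / np.outer(norms, norms)   # cosine similarity between components
    np.fill_diagonal(gram, 0.0)               # ignore trivial self-overlap
    return gram
```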
The different scales can be identified directly in two ways: first, by the time lapse
between the successive alternations of local maxima and minima; second, by the time
lapse between the successive zero crossings. Huang et al. (1998) adopted the time lapse
between successive extrema as the definition of the time scale for the intrinsic oscillatory
mode. This choice is beneficial because it gives a fine resolution of the oscillatory mode.
One can extract the scales by the sifting process. Any data series $x(t)$ $(t = 1, 2, \dots, n)$ can be decomposed according to the following sifting procedure (Yu et al., 2008):
1) Identify all the local extrema, including local maxima and local minima, of the time series $x(t)$.
2) Connect all local extrema by a cubic spline line to generate the upper and lower envelopes $X_{up}(t)$ and $X_{low}(t)$. In this step a cubic spline is fitted separately to the series of local minimum points and to the series of local maximum points.
3) Compute the point-by-point envelope mean $m(t)$ from the upper and lower envelopes: $m(t) = \frac{X_{up}(t) + X_{low}(t)}{2}$.
4) Extract the detail: $c(t) = x(t) - m(t)$. Steps 1) – 4) are illustrated on Figure 3.
5) Check the properties of $c(t)$: if $c(t)$ meets the two requirements of Huang et al. (1999), an IMF is derived and $x(t)$ should be replaced with the residual $r(t) = x(t) - c(t)$. In case $c(t)$ is not an IMF, then $x(t)$ should be replaced with $c(t)$.
One has to repeat steps 1) – 5) until a stopping criterion is satisfied; a minimal code sketch of this sifting loop is given below.
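The loop below is a minimal sketch of steps 1)–5), assuming NumPy and SciPy. The function and parameter names (envelope_mean, max_imfs, sd_threshold, max_sift) are illustrative, and the SD-based sifting stop anticipates the criterion discussed in the next paragraphs; it should be read as an illustration rather than the exact MATLAB routine used later in the thesis.

```python
# A minimal sketch of the sifting procedure in steps 1)-5); parameter names
# are illustrative, not the thesis' actual implementation.
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def envelope_mean(x):
    """Steps 1)-3): spline envelopes of the local extrema and their point-by-point mean."""
    t = np.arange(len(x))
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if len(maxima) < 4 or len(minima) < 4:
        return None                                   # too few extrema to fit the splines
    upper = CubicSpline(maxima, x[maxima])(t)
    lower = CubicSpline(minima, x[minima])(t)
    return (upper + lower) / 2.0

def emd(x, max_imfs=20, sd_threshold=0.2, max_sift=50):
    """Decompose x into IMFs and a residual, using an SD-based sifting stop."""
    residual, imfs = np.asarray(x, dtype=float).copy(), []
    for _ in range(max_imfs):
        c = residual.copy()
        for _ in range(max_sift):
            m = envelope_mean(c)
            if m is None:
                break
            c_new = c - m                              # step 4): extract the detail
            sd = np.sum((c - c_new) ** 2) / np.sum(c ** 2)
            c = c_new
            if sd < sd_threshold:                      # SD of two consecutive sifting results
                break
        imfs.append(c)                                 # step 5): store the IMF ...
        residual = residual - c                        # ... and continue with the residual
        if envelope_mean(residual) is None:            # residual-based stop: (near-)monotonic trend
            break
    return imfs, residual

# The original series is recovered by summing the IMFs and the residual.
```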
The typical stopping criteria can be classified into three groups: (1) residual based,
(2) expert judgement, (3) entropy based. According to the residual based criteria, the
above algorithm should be stopped when the final time series $r(t)$, the residual
component, becomes a monotonic function or has at most one local extremum. This
criterion is suggested by Huang et al. (1999). The following studies also used this
method as a stopping criterion: Lin et al. (2012), Yu et al. (2008), Guo et al. (2012).
Rilling et al. (2003) introduce the mode amplitude $a(t) := \frac{x_{up}(t) - x_{low}(t)}{2}$ and the
evaluation function $\sigma(t) := \left|\frac{m(t)}{a(t)}\right|$. The sifting is iterated until $\sigma(t) < \theta_1$ for some
prescribed fraction $(1 - \alpha)$ of the total duration, while $\sigma(t) < \theta_2$ holds for the remaining
fraction, where $\theta_1$ and $\theta_2$ are meant to guarantee globally small fluctuations in the mean
while allowing for locally large excursions. One can typically set $\theta_1 = 0.05$, $\theta_2 = 0.5$ and $\alpha = 0.05$.
There are several approaches in the literature where the stopping criterion includes
a certain threshold that is determined by the researcher. Lahmiri (2016) computed the
standard deviation (SD) from two consecutive sifting results. According to this approach
the sifting process should be stopped when the standard deviation falls below an arbitrarily
small number1. Huang et al. (1998) emphasize that carrying the sifting process to an
extreme could make the resulting IMF a pure frequency modulated signal of constant
amplitude. To guarantee that the IMF components have enough physical sense, one should
set the SD value between 0.2 and 0.3. Another stopping criterion can be defined by the
following three conditions: (1) at each point, (mean amplitude) < (threshold * envelope
amplitude), (2) the mean of the Boolean array ((mean amplitude)/(envelope amplitude) >
threshold) is smaller than the tolerance, and (3) the number of zero crossings and the number of extrema
is less than or equal to one (Lahmiri, 2016). In this case the threshold, threshold2 and the
tolerance values are set by the researcher; Lahmiri (2016) applied the values 0.5, 0.5 and 0.5. Zhu
et al. (2016) terminated the sifting process after a maximum of 10 sifting iterations.
In the paper of Xiong et al. (2014) the whole decomposition stops after $\log_2 N$ IMFs have
been extracted, where N is the length of the data series.

1 $SD(k) = \frac{\sum_{t=0}^{T}\left(c_{k-1}(t) - c_k(t)\right)^2}{\sum_{t=0}^{T} c_{k-1}^{2}(t)} < \varepsilon$
Tseng and Lee (2010) applied an entropic analysis strategy. They analyzed to what
extent information relevant to the underlying functions of x(t) is carried in the IMFs. They
defined a normalized information scale to measure the information extent. Their
numerical studies showed that the scale correctly quantifies the extent of information that
is codified in an IMF. Based on this scale, the IMFs that are information-free components
can be identified.
After the stopping criterion is satisfied, the original data series can be expressed as
$x(t) = \sum_{j=1}^{n} c_j(t) + r_n(t)$, where n is the number of IMFs, $r_n(t)$ is the final residual,
which is the main trend of $x(t)$, and $c_j(t)$ $(j = 1, \dots, n)$ are the IMFs. Thus, one can
achieve the decomposition of the data series into n empirical mode functions and one
residual. The IMF components cover different frequency bands and change with the
variation of the time series $x(t)$, while $r_n(t)$ represents the central tendency of the data (Yu
et al., 2008).
3. Figure: Plotting the envelope and their mean, Source: Metatrader, 2012
Empirical mode decomposition has several distinct advantages; however, it also has
some serious disadvantages. On the one hand, it is relatively easy to understand and
implement, the fluctuations within a time series are automatically and adaptively selected
from the time series, it is robust for nonlinear and nonstationary time series
decomposition, and EMD can adaptively decompose a time series into several IMF
components and one residual component. Unlike wavelet decomposition, EMD does not
require a filter base function to be determined before decomposition (Yu et al., 2008). On
the other hand, the decomposition results can suffer from mode mixing, which means that a single
IMF contains sparsely distributed timescales, or similar timescales are broken down into
different IMFs (i.e. the orthogonality condition is not satisfied) (Zhu et al., 2016).
Furthermore, EMD suffers from the end effect. The end effect refers to the situation in which,
when calculating the upper and lower envelopes with the cubic spline function in the sifting
process of EMD, divergence appears on both ends of the data series and gradually
influences the inside of the data series, greatly distorting the results (Deng et al., 2001).
6.2. Discrete wavelet based decomposition
Wavelet methodology, a refinement of Fourier analysis, is an alternative for
analyzing nonstationary data with high irregularities and cyclical pattern. The wavelet
multiscale decomposition allows for simultaneous analysis in the time and frequency
domain. It converts a signal into a series of wavelets and provides a way for analyzing
waveforms, bounded in both frequency and duration. That is why wavelet decomposition
could be a valuable means of exploring the complex dynamics of financial time series
(Bekiros, Marcellino, 2013). Figure 4. depicts the benefits of wavelet transform in
comparison with the time domain representation, Fourier transform and the short-time
Fourier transform.
4. Figure: Comparison of transformations, Source: Uliha, 2016, 512.p
Figure 4. highlights that in the case of a time domain representation we have no
frequency information; however, we do have information about the amplitude of the signal. The
Fourier transform uses a basis of sines and cosines of different frequencies to determine
how much of each frequency the signal contains. The Fourier transform does not allow
the frequency content of the signal to change over time; therefore it can tell us how much
of each frequency exists in the signal, but it does not tell us when in time these frequency
components occur. To overcome this limitation, the short-time Fourier transform has been
suggested. It consists of applying a short-time window to the signal and
performing the Fourier transform within this window as it slides across all the data.
However, any time-frequency analysis is limited by the Heisenberg uncertainty principle,
which states it is impossible to know simultaneously the exact frequency and the exact
time of occurrence of this frequency in a signal (i.e. there is a trade off between time and
frequency resolution). The problem with the short-time Fourier transform is that it uses
constant length windows. In contrast, the wavelet transform uses local base functions that
can be stretched and translated with a flexible resolution in both frequency and time. In
case of the wavelet transform, the time resolution is intrinsically adjusted to the frequency
with the window width narrowing when focusing on high frequencies while widening
when assessing low frequencies. Allowing for windows of different size makes it possible
to improve the frequency resolution of the low frequencies and the time resolution of the
high frequencies. This means that a given high or low frequency component can be
located better in time. Wavelets thus enable a more flexible approach to time series analysis,
which is why wavelet analysis is seen as a refinement of Fourier analysis (Rua, 2012; Uliha, 2016).
The signal $x[n]$ is a discrete time function, i.e. a sequence, where n is an integer.
The procedure starts with passing the sequence through a half-band digital lowpass filter
with impulse response $h[n]$. Signal filtering corresponds to the mathematical operation
of convolving the signal with the impulse response of the filter. The convolution is
defined as follows:

$x[n] * h[n] = \sum_{k=-\infty}^{\infty} x[k] \cdot h[n-k]$ (2)
A half band lowpass filter removes all frequencies2 that are above half of the highest
frequency in the signal. After passing the signal through a half band lowpass filter, half
of the samples can be eliminated. Discarding every other sample will subsample the signal
by two and the signal will then have half the number of points. The scale of the signal is
now doubled. The lowpass filtering removes the high frequency information, but leaves
the scale unchanged. Only the subsampling process changes the scale. However
resolution is related to the amount of information in the signal, and therefore, it is affected
by the filtering operations. Nevertheless the subsampling operation after filtering does
2 In discrete signals frequency is expressed in terms of radians.
not affect the resolution, half the samples can be discarded without any loss of
information. In summary, the lowpass filtering halves the resolution, but leaves the scale
unchanged. The signal is then subsampled by 2 since half of the number of samples are
redundant. This doubles the scale. This procedure can be expressed as:
$y[n] = \sum_{k=-\infty}^{\infty} h[k] \cdot x[2n-k]$ (3)
The DWT analyzes the signal at different frequency bands with different resolutions by
decomposing the signal into a coarse approximation and detail information. DWT
employs two sets of functions, called scaling functions and wavelet functions, which are
associated with low pass and highpass filters, respectively. The decomposition of the
signal into different frequency bands is simply obtained by successive highpass and
lowpass filtering of the time domain signal. In summary, the original signal is first passed
through a half-band highpass filter $g[n]$ and a lowpass filter $h[n]$; after the filtering, half of
the samples can be eliminated, so the signal can be subsampled by 2, simply by
discarding every other sample. This constitutes one level of decomposition and can be
expressed as follows:

$y_{high}[k] = \sum_{n} x[n] \cdot g[2k-n]$ (4)

$y_{low}[k] = \sum_{n} x[n] \cdot h[2k-n]$ (5)
where $y_{high}[k]$ and $y_{low}[k]$ are the outputs of the highpass and lowpass filters after
subsampling by 2. The decomposition halves the time resolution since only half the
number of samples now characterize the entire signal. However, this operation doubles
the frequency resolution, since the frequency band of the signal now spans only half the
previous frequency band, reducing the uncertainty in the frequency by half. This
procedure can be repeated for further decomposition. Figure 5. illustrates this procedure.
5. Figure: Process of wavelet decomposition, Source: Mirzaei et al., 2010, 303.p.
The highpass and lowpass filters are not independent of each other and they are related
by the following equation, where L is the filter length (in number of points):
$g[L-1-n] = (-1)^n \cdot h[n]$ (6)
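As an illustration of equations (4)–(6), the sketch below performs one decomposition level with plain NumPy; the Haar lowpass filter is used only because it is the shortest half-band pair, while the thesis itself later applies an order 7 Daubechies wavelet.

```python
# A minimal sketch of one decomposition level, equations (4)-(6), assuming
# NumPy only; the Haar lowpass filter is an illustrative choice.
import numpy as np

h = np.array([1.0, 1.0]) / np.sqrt(2.0)                               # half-band lowpass h[n]
L = len(h)
g = np.array([(-1) ** (L - 1 - m) * h[L - 1 - m] for m in range(L)])  # highpass g[n] from eq. (6)

def dwt_level(x, h, g):
    """Filter with h and g, then keep every second sample (equations (4) and (5))."""
    y_low = np.convolve(x, h)[::2]     # approximation coefficients
    y_high = np.convolve(x, g)[::2]    # detail coefficients
    return y_low, y_high

x = np.random.randn(64)                # toy signal
a1, d1 = dwt_level(x, h, g)            # level 1
a2, d2 = dwt_level(a1, h, g)           # repeating on a1 gives level 2
```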
The frequency bands that carry little information about the original signal will have very low
amplitudes; consequently, that part of the signal can be discarded without loss of
information, allowing data reduction. The reconstruction of the original signal is easy if
we use half-band filters, since they form an orthonormal basis. The reconstruction formula
can be expressed as:

$x[n] = \sum_{k=-\infty}^{\infty} \left( y_{high}[k] \cdot g[2k-n] + y_{low}[k] \cdot h[2k-n] \right)$ (7)

However, if the filters are not ideal half-band filters, then perfect reconstruction cannot be
achieved (Daubechies, 1992). The most famous wavelets are the Daubechies
wavelets; however, the Coiflet, Haar and Symlet wavelets are also frequently used types.
One of the most important benefits of wavelet decomposition is its strong
theoretical background and the possibility of applying wavelets that produce orthogonal
components. In comparison with the Fourier transform, the wavelet transform uses local base
functions that can be stretched and translated with a flexible resolution in both frequency
and time, resulting in more frequency and time domain information. Due to the filtering,
noise components can be selected easily. On the other hand, one should choose a base
function before the analysis, which can strongly affect the results, and there is no recipe book
for choosing the type of wavelet for a specific time series. Not only the base
function but also the order of the wavelet has to be chosen by the researcher in advance. The
classical, decimated discrete wavelet transform involves subsampling the output of the
high- and low-pass filters to half their original length. This leads to a serious drawback,
namely that the transform is not shift invariant along the real axis. Specifically, the DWT of a shifted
signal is not the shifted version of the DWT of the signal (Bekiros, Marcellino, 2013).
Furthermore, the wavelet transform, just like EMD, suffers from the boundary effect (Su et
al., 2012).
7. Data
This section introduces the data that is used for the research. First, the main
properties of the data will be presented, as well as some of its most important descriptive
statistics. Then the effect of empirical mode decomposition will be described on an
example.
In this study the daily Brent crude oil spot price is chosen as an experimental
sample. The data is available and can be downloaded from the website of Energy
Information Administration. The data span a time period from 2000.01.04. to 2019.03.14.
(4867 observations). The given sample length is chosen because it encompasses the most
relevant extreme events in the history of the oil price, for example the terrorist attacks of 2001,
the invasion of Iraq in 2003, the subprime crisis of 2008 and the OPEC decision of
2014. The original time series is non-stationary based on the KPSS and ADF tests, which is why
this study uses log returns for prediction purposes. Only the log returns are
decomposed and later predicted with the selected models. The data set is divided into two
parts: the in-sample period starts on 2000.01.04. and lasts until 2006.01.03. (1542
observations), while the out-of-sample period covers 2006.01.04. to 2019.03.14.
(3325 observations). The original Brent crude oil time series and the log returns are shown
on figure 6., where the dashed line separates the in-sample from the out-of-sample period. Table
3. describes some of the most important descriptive statistics of the log returns.
 | In-sample | Out-of-sample
Mean | 0.06089 % | 0.00153 %
Standard deviation | 0.02445 | 0.02137
Median | 0.13609 % | 0.02151 %
Min – max | -0.19891 – 0.12853 | -0.16832 – 0.18129
ADF statistic (p-value) | -38.91 (0.001) | -56.79 (0.001)
KPSS statistic (p-value) | 0.023 (0.1) | 0.067 (0.1)
Auto(1) – Auto(2) | 0.0079 – 0.0259 | 0.0149 – 0.0138
3. Table: Descriptive statistics of log returns calculated on observations from the in-sample and out-of-sample periods, Source: Own table
Based on the statistics in table 3., the two samples have relatively similar characteristics;
their mean returns can be regarded as equal based on a two-sample t-test. Both are
stationary and show no first or second order autocorrelation. Nevertheless, both contain
volatile and calm periods.
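The sketch below illustrates the data preparation and stationarity checks described above, assuming pandas and statsmodels; the file name 'brent.csv' and the column name 'price' are illustrative placeholders, not the actual EIA download.

```python
# A minimal sketch of log-return construction, sample split and the ADF/KPSS
# tests; file and column names are illustrative placeholders.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller, kpss

prices = pd.read_csv('brent.csv', parse_dates=['date'], index_col='date')['price']
log_ret = np.log(prices).diff().dropna()          # daily log returns

in_sample = log_ret.loc[:'2006-01-03']            # estimation / in-sample part
out_sample = log_ret.loc['2006-01-04':]           # out-of-sample part

adf_stat, adf_p = adfuller(in_sample)[:2]                              # H0: unit root
kpss_stat, kpss_p = kpss(in_sample, regression='c', nlags='auto')[:2]  # H0: stationarity
print(f'ADF {adf_stat:.2f} (p={adf_p:.3f}), KPSS {kpss_stat:.3f} (p={kpss_p:.2f})')
```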
Figure 6: Brent crude oil prices and returns for the entire sample, Source: Own figure
Figure 8-10. show the empirical mode decomposition of log returns in different periods.
Figure 8. shows a decomposition procedure which uses the entire sample. The red signal
is the original log return series, the green signal is the residual and the blue signals are the
IMFs. This figure describes how EMD decomposes the original signal into meaningful
components. The original signal can be reconstructed by simply summing up all the
components. The components contain high-, mid- and low-frequency information and
capture the complex characteristic of returns. The same decomposition for the in-sample
is described on figure 9. The figure shows us that the decomposition result is not
independent of the window size. When only the in-sample information is used, 15
components are generated, while 19 components are obtained from the entire sample. I
also calculated the components using data two years before and after the collapse of
Lehman Brothers. The results are shown on figure 10. In this case 13 components are
generated. The instability of components can be explained by the continuously changing
environment and by the occurrence of extreme events which can alter the data generating
process. Nevertheless the instability of components makes the result of noise selection
more difficult to interpret, since the number of selected components will also vary during
the analyzed period. Histograms of the number of EMD components can be seen on figure
7. An expanding window is applied for the decomposition: the first window covers
the in-sample period, then the window expands as new data are added to the sample on a
daily basis. Figure 7. shows that the number of IMFs gradually increases as the window
expands. It is important to emphasize that the decomposition process in case of EMD can
be time consuming if we apply an expanding window.
In this paper I used MATLAB R2016b software for my calculations, for wavelet
decomposition I used ‘wavedec’ function, while ‘emd’ function was applied for empirical
mode decomposition. Signal reconstruction can be done by summation in the case of EMD,
while the 'waverec' function can be used to reconstruct the signal from the wavelet coefficients.
7. Figure: Number of IMFs during the estimation period using expanding window, Source: Own
figure
8. Figure: Components of Brent crude oil generated by EMD on the entire sample, Source: Own figure
9. Figure: In-sample components of Brent crude oil generated by EMD, Source: Own figure
10. Figure: Components of Brent crude oil generated by EMD during the recession, 2006.09. – 2010.09., Source: Own figure
8. Empirical analysis and results
This section introduces the prediction strategy, briefly describes the ARIMA model
and presents the results. The prediction strategy will be described with the help of the
general research framework which was introduced in section 4. This paper applies
ARIMA as prediction model, because it is a simple model and its parameters can be
estimated relatively quickly. It is important to emphasize that the focus of this paper is on
the denoising ability of EMD and wavelet.
8.1. Prediction strategy
This study applies the first research design from the four broad designs which were
introduced in section 4 on figure 1. This involves decomposing the original data;
the approach then applies different noise selection methods to identify and drop the noise
components. The remaining components are aggregated with the help of the signal processing
inverse, which is summation in the case of EMD and the wavelet inverse in the case of wavelet
decomposition. The prediction process is shown on figure 11. This study defines noise as
follows: the component or components of an observable signal which, if dropped,
improve prediction accuracy.
11. Figure: Prediction process, Source: Own figure
The detailed version of the research design is depicted on figure 12.
12. Figure: Selected research design for empirical mode decomposition, Source: Own figure
Figure 12. helps better understand the selected research design for empirical mode
decomposition. This study analyzes daily log return data, using expanding window. The
first window is from 2000.01.04. to 2006.01.03. and it expands on a daily basis. Empirical
mode decomposition is selected as a decomposition method and a residual based stopping
criterion is chosen for terminating the algorithm. This paper applies the same stopping
criterion as Lahmiri (2016). Lahmiri (2016) computed the standard deviation (SD) from
two consecutive sifting results. The shifting process should be stopped if the standard
deviation is less than an arbitrary small number3. Huang et al. (1998) emphasize that
carrying the shifting process to an extreme could make the resulting IMF a pure frequency
modulated signal of constant amplitude. To guarantee that the IMF components have
enough physical sense one should set SD value between 0.2 – 0.3. This paper selected
𝜀 =0.2 as the stopping criterion, however several times the stopping criterion had to be
set 𝜀 =0.3 because the algorithm failed to converge.
This paper applies expert judgement, PACF, permutation entropy, sample entropy
and Shannon entropy as noise selection methods. In case of expert judgement the first,
the first two, the first three then the first four components are dropped. PACF approach
is based on the consideration that an uncorrelated identically distributed random sequence
with zero expected value can be regarded as white noise. The entropies are used as a tool
for optimal decomposition with respect to the minimization of an entropy, which
describes the information-relevant properties of the representation of a signal. The
entropy of each denoised signal is estimated step-wise and compared with the one
from the previous level. The procedure is the following: after decomposition the first
component is dropped, the rest of the signal is aggregated, and the entropy of the denoised
signal is estimated. After that the first two components are dropped, the rest of the signal
is aggregated, and the entropy of the denoised signal is estimated again, and so on. The optimal level
of decomposition is determined at the minimum value of the entropy; a minimal sketch of this loop is given below.
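The sketch below illustrates the entropy based noise selection loop, assuming the IMFs and residual come from an EMD routine such as the emd() sketch in section 6.1. The histogram-based Shannon entropy estimator is an illustrative choice, as the thesis does not spell out the exact estimator used.

```python
# A minimal sketch of entropy based noise selection: drop the first k IMFs,
# aggregate the rest, and keep the k with the minimum entropy.
import numpy as np

def shannon_entropy(signal, bins=50):
    """Shannon entropy (in bits) of a histogram estimate of the signal's distribution."""
    counts, _ = np.histogram(signal, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

def denoise(imfs, residual, k):
    """Reconstruct the signal after dropping the first k (highest-frequency) IMFs."""
    kept = imfs[k:]
    return (np.sum(kept, axis=0) if kept else 0.0) + residual

def select_noise_components(imfs, residual):
    """Return the k minimising the entropy of the denoised signal, and that signal."""
    entropies = [shannon_entropy(denoise(imfs, residual, k)) for k in range(1, len(imfs) + 1)]
    best_k = int(np.argmin(entropies)) + 1
    return best_k, denoise(imfs, residual, best_k)
```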
After the noise selection is done, a component or some components are dropped
and the rest of the components are aggregated. This paper applies only one prediction
model and uses the same window for prediction as for the decomposition (i.e. expanding
window). An ARIMA(p,0,q) model is selected for one-period-ahead prediction; the lag
parameters are p = 1, 2, 3, 4 and q = 0, 1, 2, 3, 4. The optimal parameters are chosen based on the
Bayesian information criterion. Due to the fact that the components were aggregated at
an earlier stage, there is no need for aggregation at the end of the process.
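A minimal sketch of the ARIMA(p,0,q) order search by BIC and the one-step forecast is given below, assuming statsmodels; 'denoised' stands for the aggregated signal produced by the noise selection step, and the helper name forecast_one_step is illustrative.

```python
# A minimal sketch of BIC-based ARIMA order selection and a one-step forecast.
import warnings
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def forecast_one_step(denoised):
    """Fit ARIMA(p,0,q) for p in 1..4, q in 0..4, keep the lowest-BIC fit, forecast one step."""
    best_bic, best_fit = np.inf, None
    for p in range(1, 5):
        for q in range(0, 5):
            try:
                with warnings.catch_warnings():
                    warnings.simplefilter('ignore')
                    fit = ARIMA(denoised, order=(p, 0, q)).fit()
            except Exception:
                continue                       # skip orders that fail to converge
            if fit.bic < best_bic:
                best_bic, best_fit = fit.bic, fit
    return best_fit.forecast(1)[0]             # one-period-ahead prediction
```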
The research design is the same in case of wavelet decomposition except for the
fact that a 10-level discrete wavelet decomposition is applied with the help of order 7
Daubechies wavelet. The level is selected based on Bekiros & Marcellino (2013), and the
order 7 Daubechies wavelet is applied in this study because it is one of the most popular
selections in the literature. Wavelet decomposition generates one approximation
component and ten detail components. The detail components contain the high frequency
information, therefore these are the components which are potentially selected as noise.
The noise selection procedure is the same as in the case of EMD. The expert judgement
approach involves dropping the first, the first two, the first three, then the first four detail
components. After decomposition, the PACF based noise selection reconstructs each of the D1–D10
components individually with the wavelet inverse; components that show no autocorrelation are
dropped. In case of the entropy statistics, after decomposition the D1 component is dropped,
the rest of the components (A10, D2–D10) are aggregated using the inverse wavelet
transform and the three entropies are estimated. After that the first two detail components
are dropped, the rest of the signal (A10, D3–D10) is aggregated using the wavelet inverse
and the entropy of the denoised signal is estimated again, and so on. The optimal level of
decomposition is determined at the minimum value of the entropy; a minimal sketch of the wavelet-based variant is given below.
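The sketch below shows the wavelet based denoising step, using the PyWavelets package as a stand-in for the MATLAB 'wavedec'/'waverec' functions mentioned in section 7; k, the number of dropped detail levels, would come from the entropy based selection described above.

```python
# A minimal sketch of wavelet based denoising: zero out the k highest-frequency
# detail components (D1..Dk) and reconstruct with the inverse transform.
import numpy as np
import pywt

def wavelet_denoise(x, k, wavelet='db7', level=10):
    """Drop the k highest-frequency detail components and reconstruct the signal."""
    coeffs = pywt.wavedec(x, wavelet, level=level)   # [A_level, D_level, ..., D1]
    for i in range(1, k + 1):
        coeffs[-i] = np.zeros_like(coeffs[-i])       # zero out D1..Dk
    return pywt.waverec(coeffs, wavelet)[:len(x)]    # trim possible padding
```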
To measure the forecasting performance, two main criteria are used for evaluation
of level prediction and directional forecasting, respectively. The root mean squared error
(RMSE) is selected as the evaluation of level prediction. RMSE can be defined as
$RMSE = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(\hat{x}(t) - x(t)\right)^2}$ (8)
where N is the number of predictions, $\hat{x}(t)$ is the predicted value and $x(t)$ is the observed
signal. Accuracy is one of the most important criteria for forecasting models, the other
being the decision improvements generated from directional predictions. From the
business point of view the latter is more important than the former. The ability to predict
movement direction can be measured by a directional statistic (𝐷𝑠𝑡𝑎𝑡) (Yu et al. 2008).
The statistic can be expressed as
$D_{stat} = \frac{1}{N}\sum_{t=1}^{N} a_t$ (9)
where $a_t = 1$ if $(y_{t+1} - x_t)(x_{t+1} - x_t) \geq 0$ and $a_t = 0$ otherwise. Here $y_{t+1}$ represents
the predicted value given by a model and $x_t$, $x_{t+1}$ are observed values.
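The two criteria of equations (8) and (9) can be computed as in the sketch below, assuming y_pred[t] is the model's forecast of the observed value y_true[t]; the function names are illustrative.

```python
# A minimal sketch of the RMSE and directional statistic, equations (8)-(9).
import numpy as np

def rmse(y_pred, y_true):
    """Root mean squared error of the level predictions."""
    return np.sqrt(np.mean((np.asarray(y_pred) - np.asarray(y_true)) ** 2))

def dstat(y_pred, y_true):
    """Share of periods where predicted and observed changes have the same sign."""
    y_pred, y_true = np.asarray(y_pred), np.asarray(y_true)
    hits = (y_pred[1:] - y_true[:-1]) * (y_true[1:] - y_true[:-1]) >= 0
    return hits.mean()
```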
8.2. Prediction model
This section briefly describes the prediction model, ARIMA. In this research,
ARIMA models are trained on the denoised signal in order to generate one-period-ahead out-of-
sample predictions. The literature is rich in descriptions of ARIMA, which is why only the
most important characteristics of the model are highlighted in this section.
In an ARIMA model (Box & Jenkins, 1970) the future value of a variable is
assumed to be a linear function of several past observations and random errors. The
underlying process that generates the time series takes the following form:
$\Phi(B)\, y_t = \theta(B)\, e_t$ (10)

where $y_t$ and $e_t$ are the actual value and the random error at time t, respectively. $B$ denotes
the backward shift operator, $B y_t = y_{t-1}$, $B^2 y_t = y_{t-2}$, etc., and $\Phi(B)$, $\theta(B)$ denote
the following:

$\Phi(B) = 1 - \Phi_1 B^1 - \Phi_2 B^2 - \dots - \Phi_p B^p$ (11)

$\theta(B) = 1 - \theta_1 B^1 - \theta_2 B^2 - \dots - \theta_q B^q$ (12)

where p and q are parameters, often referred to as the lag orders of the model. Random
errors are assumed to be independently and identically distributed with a mean of zero
and a constant variance. If the dth difference of $\{y_t\}$ is an ARMA process of order p and
q, then $y_t$ is called an ARIMA(p,d,q) process.
8.3. Results
This section summarizes the most important empirical results of the study. For the
sake of simplicity the section first introduces the results of the PACF noise selection,
followed by the results of the entropy based noise selection in cases when EMD was used
as decomposition method. After that, the results of wavelet based decomposition are
summarized in the same order, followed by the noise selection made with an expert
judgement approach. At the end of this section the results of EMD and wavelet
decomposition are compared.
Using the prediction strategy with PACF noise selection proved to be a weak
approach. If we want to drop components that are uncorrelated with their own lags
(i.e. whose PACF can be considered zero at all lags), then none of the components are selected
as noise. Even the highest frequency component has significant first, second and third
order autocorrelation. The first bar chart of figure 13.4 shows the ratio of first IMFs
that have statistically significant lags; based on the figure, all of the IMF1 series have statistically
significant first, second and third order autocorrelation (i.e. the PACF lags are not zero). The
results are the same for the IMF2 and IMF3 sequences. Therefore this approach
suggests that all the IMFs carry information content, and thus using the observed signal
(the return) is beneficial.

4 The signal was decomposed on a daily basis, thereby the number of decompositions was 3325. I collected all the IMF1, IMF2 and IMF3 sequences because these are the highest frequency components. The figure shows the ratio of IMFs that have statistically significant lags 1-6 based on the PACF.

13. Figure: Ratio of significant lags in the first three IMFs, Source: Own figure
Permutation entropy, a natural complexity measure for time series (Bandt & Pompe,
2002), also proved to be a weak approach in this study. The time delay was set to one and the
order of the ordinal patterns was set to three. This means that three consecutive
observations were grouped into embedded vectors5. The noise selection was not successful
with permutation entropy, since its value decreased as more and more IMFs were
dropped. Consequently, the minimum value of permutation entropy was calculated at the
trend component in 91% of all decompositions.

5 For more details see Riedl et al. (2013).
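For reference, the sketch below computes the permutation entropy statistic used above (order three, time delay one, following Bandt & Pompe, 2002); the normalisation by log2(order!) is omitted, since only the location of the minimum matters for the noise selection.

```python
# A minimal sketch of permutation entropy (order 3, delay 1), assuming NumPy.
import numpy as np

def permutation_entropy(x, order=3, delay=1):
    """Shannon entropy (in bits) of the distribution of ordinal patterns in x."""
    x = np.asarray(x)
    n = len(x) - (order - 1) * delay
    counts = {}
    for i in range(n):
        window = x[i : i + order * delay : delay]      # embedded vector of 'order' observations
        pattern = tuple(np.argsort(window, kind='stable'))
        counts[pattern] = counts.get(pattern, 0) + 1
    p = np.array(list(counts.values()), dtype=float) / n
    return -np.sum(p * np.log2(p))
```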
14. Figure: Typical values of permutation entropy estimated from denoised signals, Source: Own
figure
Figure 14. shows the effect of denoising on permutation entropy. The values on figure
14. were calculated using the entire sample and show how permutation entropy decreases
as more and more IMFs are dropped6. In spite of the fact that figure 14. shows the result
of one decomposition, the same pattern appeared in most of the cases. Therefore
permutation entropy suggests in 91% of the decompositions that all of the IMF components
should be dropped except the trend. Noise selection with sample and Shannon entropy
was more successful. Figures 15-16. show the result of noise selection based on sample
and Shannon entropy. The left histogram shows the number of dropped components based
on the entropy, while the right one shows the number of components generated with the expanding
window.
6 The first point on the figure was calculated after denoising the signal from IMF1. The second point on the
figure shows the value of permutation entropy when IMF1 and IMF2 are dropped etc.
15. Figure: Number of dropped IMFs based on sample entropy and number of generated IMFs
using expanding window, Source: Own figure
16. Figure: Number of dropped IMFs based on Shannon entropy and number of generated IMFs
using expanding window, Source: Own figure
Noise selection with sample entropy led to a similar, but not as radical, result as
permutation entropy. Sample entropy suggests dropping several components and using the
last two to three components for reconstruction. In case of sample entropy the embedding
dimension was set to 200 and the tolerance value to 0.37. Nevertheless, sample entropy
can be used for the analysis because most of the time it suggests keeping some of the
components, which we can use for reconstructing a signal. Shannon entropy based noise
selection has the most promising result: it suggests keeping on average seven to eight
components. Later in this section the prediction enhancing performance of sample and
Shannon entropy based noise selection is described.

7 For determining the parameters I used the study of Richman and Moorman (2000); however, the parameter selection involved several trials and errors.
Using level ten wavelet decomposition and PACF for noise selection gives the same
result as in the case of EMD: none of the components are selected as noise because all of them are
autocorrelated based on the PACF. Consequently, in this study PACF could not be used as a
noise selection tool.
Wavelet decomposition also led to the same conclusion in case of the three
entropies. The more detail components we drop, the lower the value of permutation entropy
becomes. In general, the reconstruction of the approximation coefficients is suggested based on
permutation entropy. Figure 18. shows how permutation entropy decreases as more and
more detail components are dropped. The top chart on the left side shows a signal that is
denoised from D1, the chart under it shows the signal that is denoised from D1 and D2,
and the last chart is the reconstructed approximation component. Their permutation entropy
values are presented on the right side of the figure. Figure 18. was created using the entire
sample, therefore it depicts a single decomposition; however, the pattern on the figure is similar
for the majority of decompositions. Sample and Shannon entropy led to a similar result as
in the case of empirical mode decomposition. Figure 17. shows the dropped detail
components based on Shannon and sample entropy. A level ten wavelet decomposition
was applied, therefore whenever an entropy suggests that all ten detail components should be
dropped, this is equivalent to reconstructing only the approximation coefficients. As in the case
of EMD, Shannon entropy based noise selection has the most promising result: it suggests
keeping two to three detail components.
17. Figure: Number of dropped detail components based on Shannon and Sample entropy
using expanding window, Source: Own figure
18. Figure: Denoised signals and their permutation entropy using wavelet decomposition, Source: Own figure
Due to the fact that sample and Shannon entropy suggest dropping several
components both in the case of EMD and of wavelet decomposition, an expert judgement
approach is applied for noise selection. This strategy involves dropping the first, the first
two, the first three and the first four components. These components are selected
arbitrarily; nevertheless, these are the 'high frequency' components and it stands to reason
that dropping them is beneficial. In the following part of this section the prediction
performances of the above methods are summarized, using the prediction strategy
described in section 8.1.
Table 4. shows the RMSE values for predicting the original log returns, the
prediction performance of using an expert judgement approach and the results of entropy
based denoising. It also shows the Diebold Mariano test statistics. Table 5. shows the
values of the direction statistic.
Original: 2.115
Method | Drop 1 | Drop 1-2 | Drop 1-3 | Drop 1-4 | Shannon | Sample
EMD | 2.812 | 3.445 | 3.550 | 3.402 | 3.191 | 3.040
Wavelet | 2.589 | 2.625 | 2.464 | 2.370 | 2.30 | 2.412
DM test | 5.84 | 11.29 | 14.54 | 14.96 | 14.28 | 11.23
4. Table: Prediction performance of the denoising methods based on RMSE (multiplied by 100), Source: Own table
Original: 75.42%
Method | Drop 1 | Drop 1-2 | Drop 1-3 | Drop 1-4 | Shannon | Sample
EMD | 64.18% | 45.65% | 30.86% | 29.41% | 37.62% | 36.92%
Wavelet | 67.88% | 65.92% | 70% | 71.22% | 74.11% | 62.33%
5. Table: Prediction performance of the denoising methods based on $D_{stat}$, Source: Own table
Based on tables 4-5., dropping IMF1 (i.e. the highest frequency component) is the best
prediction strategy for empirical mode decomposition, while selecting noise with
Shannon entropy gives the most accurate level prediction in case of wavelet
decomposition. The direction statistic led to the same conclusion. The Diebold-Mariano test
analyses the equivalence of two forecasts based on squared prediction errors. Every EMD
prediction strategy is compared to its wavelet counterpart. Based on the DM statistics, the
equivalence of the forecasts can be rejected. Based on tables 4-5., wavelet based decomposition
led to more accurate predictions relative to EMD. However, forecasting the original
signal is the most accurate prediction based on both the RMSE and the directional statistic. If we
compare the prediction accuracy of the original signal to the best performing EMD and
wavelet strategies, the DM statistic rejects their equivalence. Figure 19. shows the evolution of the
cumulative RSE throughout the out-of-sample period in case of the best performing EMD
and wavelet models.
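The sketch below shows the Diebold-Mariano comparison in its simplest form, assuming a squared-error loss and one-step-ahead forecasts (so the loss differential is treated as serially uncorrelated); it is a simplified illustration without small-sample corrections, not necessarily the exact variant used for the tables above.

```python
# A minimal sketch of the Diebold-Mariano statistic for squared-error loss.
import numpy as np
from scipy import stats

def diebold_mariano(errors_a, errors_b):
    """DM statistic and two-sided p-value for equal predictive accuracy."""
    d = np.asarray(errors_a) ** 2 - np.asarray(errors_b) ** 2   # loss differential
    dm = d.mean() / np.sqrt(d.var(ddof=1) / len(d))
    p_value = 2 * (1 - stats.norm.cdf(abs(dm)))                  # asymptotically N(0,1)
    return dm, p_value
```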
19. Figure: Cumulative RSE of the two best performing models throughout the out-of-sample
period, Source: Own figure
9. Robustness check
In the last section of this study, I perform robustness checks. In this section, I
test the validity of my results by recalculating the models with slightly different
settings. This way I can check how sensitive my results are. The general research
framework (figure 2.) provides a tool for robustness checks by selecting different settings in
the Data column. This study uses weekly log returns for the same period and applies a rolling
window for decomposition and prediction. Each window contains 250 observations and
the number of decompositions was 751.
The threshold selection for empirical mode decomposition was highly sensitive.
This study used $\varepsilon = 0.2$ as a threshold for terminating the sifting process. However, the
threshold had to be changed several times between the values 0.2 and 0.4 in order to ensure
convergence. Seemingly there is no connection between the threshold changes and volatile
periods. The root cause of the parameter change is unknown. I had the same problem in
case of the expanding window with daily data and the rolling window with weekly data.
I also had difficulties with the spline fit on the local minima and maxima series.
In case of using a rolling window (of 500, 1000, 1500, 1600, 2000 and 2500 observations) and daily log
returns the spline fit could not be applied, even though it is an important part of the sifting
process.
The number of IMFs gradually increased in case of the expanding window (figure
7.), while it remained stable for weekly data and the rolling window. This can be explained
by the change in the data generating process. When we use an expanding window, all the
past information is used, even observations made before a potential regime shift. A
rolling window that incorporates 250 observations at a time has less chance of using
information from multiple regimes. Another explanation for the stable IMF number stems
from the fact that weekly data are smoother than daily data; therefore high frequency
components are removed as we change from daily to weekly data.
20. Figure: Histograms of the number of IMFs using rolling window, Source: Own figure
The result of PACF noise selection remained the same both in case of EMD and
wavelet decomposition. The components are autocorrelated based on PACF,
consequently they cannot be considered as white noise.
Noise selection with permutation entropy remained the same both in case of EMD
and wavelet decomposition.
21. Figure: Permutation entropy based noise selection in case of EMD, Source: Own figure
Figure 21. shows the histogram of the number of IMFs on the left, using the rolling window, weekly
data and the results of all 751 decompositions. The histogram on the right shows the
number of IMFs that permutation entropy suggests dropping. The figure shows that
almost all of the IMFs should be dropped based on permutation entropy. Wavelet
decomposition led to the same conclusion. The more detail components we drop, the lower
the value of permutation entropy becomes. In general, the reconstruction of the approximation
coefficients is suggested based on permutation entropy. Sample entropy drops all the
detail components, while Shannon entropy gives similar results as in the case of the expanding
window and daily data.
22. Figure: Number of dropped detail components based on Shannon and Sample entropy using
weekly data and rolling window, Source: Own figure
Table 6. shows the RMSE values for predicting the original log returns, the prediction
performance of using an expert judgement approach and the results of entropy based
denoising. The best performing models changed: sample entropy based denoising is the
most accurate in case of EMD, while dropping the first two (highest frequency) components is the most
accurate strategy in case of wavelet based decomposition. Using a rolling window and weekly data
does not change the fact that the wavelet based denoising strategies are more accurate than the
empirical mode decomposition based strategies. All the wavelet strategies are better than
predicting the original log returns.
Original: 4.569
Method | Drop 1 | Drop 1-2 | Drop 1-3 | Drop 1-4 | Shannon | Sample
EMD | 7.468 | 7.049 | 6.986 | 6.133 | 5.365 | 5.217
Wavelet | 4.056 | 4.045 | 4.048 | 4.054 | 4.052 | 4.102
DM test | 22.23 | 19.38 | 18.73 | 14.72 | 12.68 | 12.74
6. Table: Prediction performance of the denoising methods based on RMSE (multiplied by 100) using rolling window and weekly data, Source: Own table
Original: 69.6%
Method | Drop 1 | Drop 1-2 | Drop 1-3 | Drop 1-4 | Shannon | Sample
EMD | 58.0% | 55.2% | 57.07% | 57.87% | 57.5% | 60.0%
Wavelet | 63.47% | 59.73% | 64.93% | 67.2% | 68.4% | 64.1%
7. Table: Prediction performance of the denoising methods based on $D_{stat}$ using rolling window and weekly data, Source: Own table
10. Conclusion
Given the available literature, the paper’s contribution is threefold: (1) the paper
introduced a general research framework which describes the possible research designs
in decomposition based financial time series forecasting, (2) the paper provided a
thorough literature review based on the most important articles and classified the reviewed papers with the
help of the general research framework, (3) the paper compared PACF, entropy and the
expert judgement based noise selection methods in terms of their contribution to
prediction accuracy.
The framework introduced in this paper has several advantages: it helps with the
formulation of the research design; it helps researchers specify all the necessary details
and parameters of their research design, thereby facilitating the reproduction of a paper; and the
framework represents substantial progress in comparing and classifying research papers, since it
provides the necessary groupings for classification. The proposed framework facilitates
the aggregation of the results of the current literature; consequently, it helps us better
understand the efficiency of signal processing techniques in financial time series analysis.
Moreover, it paves the way for a meta-study in which the current results can be combined.
It also helps determine the reliability of the results presented in a research paper. The
framework fills a gap in the current literature, which opens up opportunities for further
research.
This paper provided a thorough literature review on financial time series
forecasting. All the reviewed papers apply one of the decomposition methods from the wavelet or
empirical mode decomposition family, and they analyze oil price or foreign exchange
data. The main purpose of the literature review was to describe the trends in financial
time series forecasting, focusing particularly on characteristics such as decomposition
method, data, noise selection, reconstruction method, prediction models and the result of
their analysis.
Finally, the paper compared the PACF, entropy and expert judgement based noise
selection methods. The noise selection was based on the decomposition results of
empirical mode decomposition and wavelet decomposition. Dropping the highest
frequency component was the best strategy in case of EMD, while Shannon entropy based noise
selection resulted in the most accurate prediction in case of wavelet decomposition.
However, none of the strategies produced better performance than predicting the original
time series. This paper also performed a robustness check where the decomposition strategies
were recalculated with slightly different settings. Using weekly data and a rolling window,
the wavelet based denoising strategies produced more accurate forecasts than predicting
the original signal. This result emphasizes the sensitivity of the denoising methods to the
input data and the parameter settings.
The analysis also gives some reason for concern. First of all, the threshold selection for
empirical mode decomposition was highly sensitive. The convergence of the EMD
algorithm is not stable; it frequently stopped because it failed to fit a spline. Moreover, the
number of IMFs gradually increased when using an expanding window, which makes the
interpretation of the components more difficult. Using EMD for decomposition is time
consuming: the daily decomposition with an expanding window took about 4-5 hours. The
decomposition based forecasting strategies have promising results, as was shown in the
literature review; however, the decomposition strategies presented in this paper failed to
beat the strategy where no decomposition was involved in case of the expanding window.
The situation was different when the rolling window was applied. Finding a proper noise
selection method can enhance prediction performance, since it can help the models train
on the fundamental part of a signal and capture the most important factors.
References
A. Boukhayma, A. Peizerat, C. Enz, 2016, Noise Reduction Techniques and Scaling
Effects towards Photon Counting CMOS Image Sensors, Sensors, 2016. Apr. 09.
A. Chen, M.T. Leung, D. Hazem, 2003, Application of neural networks to an emerging
financial market: Forecasting and trading the Taiwan Stock Index, Computers &
Operations Research, Vol. 30, 901-923. p.
A. Mirzaei, A. Ayatollahi, P. Gifani, L. Salehi, 2010, Spectral Entropy for Epileptic
Seizures Detection, 2010 Second International Conference on Computational Intelligence
A. Lanza, M. Manera, M. Giovannini, 2005,Modeling and forecasting cointegrated
relationships among heavy oil and product prices, Energy Economics, Vol. 27, 831–48.
p.
A. Rua, 2012, Wavelets in economics, Banco de Portugal Economic Bulletin, Vol. 18,
No. 2, pp. 71–79.
A.C. Smith, P. Monaghan, F. Huettig, 2017, The multimodal nature of spoken word
processing in the visual world: Testing the predictions of alternative models of
multimodal integration, Journal of Memory and Language vol. 93, 2017 April, 276-303.
p.
A. S. Berres, T. L. Turton, M. Petersen, D. H. Rogers, J.P. Ahrens, 2017, Video
Compression for Ocean Simulation Image Databases, Workshop on Visualisation in
Environmental Sciences
BP, 2018, Statistical Review of World Energy, 2018 June, 67th edition
[Link:https://www.bp.com/content/dam/bp/business-
sites/en/global/corporate/pdfs/energy-economics/statistical-review/bp-stats-review-
2018-full-report.pdf]
B. Zhu, X. Shi, J. Chevallier, P. Wang, Y-M. Wei, 2016, An Adaptive Multiscale
Ensemble Learning Paradigm for Nonstationary and Nonlinear Energy Price Time Series
Forecasting, Journal of Forecasting
C. Bandt, B. Pompe, 2002, Permutation Entropy: A Natural Complexity Measure for
Time Series, Physical Review Letters, Vol. 88, No. 17
C-S. Lin, S-H. Chiu, T-Y. Lin, 2012, Empirical mode decomposition-based least squares
support vector regression for foreign exchange rate forecasting, Economic Modelling,
vol 29, 2583-2590. p.
C-Y. Tseng, HC Lee, 2010, Entropic interpretation of empirical mode decomposition and
its applications in signal decomposition, Advances in Adaptive Data Analysis, Vol. 2,
No. 4, 429-449. p.
FIA, 2018, Total 2017 volume 25.2 billion contracts, down 0.1% from 2016, 2018. jan.
24.
[Link: https://fia.org/articles/total-2017-volume-252-billion-contracts-down-01-2016]
J.L. Zhang, Y.J. Zhang, L. Zhang, 2015, A novel hybrid model for crude oil price
forecasting, Energy Economics, Vol. 49, 2015. May, 649-659. p.
N. Krichene, 2007, Recent Dynamics of Crude Oil Prices, International Monetary Fund,
Working Paper December 2006
N.E. Huang, Z. Shen, S.R. Long, M.C. Wu, H.H. Shih, Q. Zheng, N.C. Yen, C.C. Tung,
H.H. Liu,1998, The empirical mode decomposition and the Hilbert spectrum for
nonlinear and nonstationary time series analysis, Proceedings of the Royal Society A:
Mathematical, Physical & Engineering Sciences 454, 903–995.
L. Juvenal, I. Petrella, 2014, Speculation in the oil market, Journal of Applied
Econometrics, vol 30, 2015 June/July, 621-649. p.
L. Yu, Z. Wang, L. Tang, 2015, A decomposition-ensemble model with data-
characteristic-driven reconstruction for crude oil price forecasting, Journal of Applied
Energy, vol. 156, 251-267. p.
L.Yu, S. Wang, K.K. Lai, 2008, Forecasting crude oil price with an EMD-based neural
network ensemble learning paradigm, Energy Economics, vol. 30, 2623 – 2635. p.
L. Yu, W. Dai, L. Tang, 2016, A novel decomposition ensemble model with extended
extreme learning machine for crude oil price forecasting, Engineering Application of
Artificial Intelligence, Article in Press
M. Khashei, M. Bijari, 2010, An artificial neural network (p, d,q) model for timeseries
forecasting, Expert Systems with Applications, Vol. 37, 479-489. p.
G. Uliha, 2016, Az olajár és a makrogazdaság kapcsolatának elemzése folytonos wavelet
transzformáció segítségével, Statisztikai Szemle, Vol. 94, No. 5., 505 -534. p.
G.C. Watkins, A. Plourde, 1994, How volatile are crude oil prices?, OPEC Review, vol
18. 220-245.p.
G.E.P. Box, G. Jenkins, 1970, Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco, CA.
G.P. Zhang, B.E. Patuwo, M.Y. Hu, 2001, A simulation study of artificial neural networks for nonlinear time-series forecasting, Computers & Operations Research, Vol. 28, 381-396. p.
H. Su, Q. Liu, J. Li, 2012, Boundary Effects Reduction in Wavelet Transform for Time-frequency Analysis, WSEAS Transactions on Signal Processing, Vol. 8, Issue 4, 169-179. p.
H-Y. Zhang, Q. Ji, Y. Fan, 2015, What drives the formation of global oil trade patterns?, Energy Economics, Vol. 49, March 2015
I. Daubechies, 1992, Ten Lectures on Wavelets. Regional Conference Series in Applied
Mathematics (SIAM), vol. 61. Society for Industrial and Applied Mathematics,
Philadelphia, USA
J.S. Richman, J.R. Moorman, 2000, Physiological time-series analysis using approximate entropy and sample entropy, American Journal of Physiology, Vol. 278, 2039-2049. p.
K-J. Kim, 2003, Financial time series forecasting using support vector machines, Neurocomputing, Vol. 55, Issues 1-2, 307-319. p.
M. Riedl, A. Müller, N. Wessel, 2013, Practical considerations of permutation entropy, The European Physical Journal Special Topics, Vol. 222, June 2013, 249-262. p.
N. Nomikos, K. Andriosopoulos, 2012, Modelling energy spot prices: empirical evidence from NYMEX, Energy Economics, Vol. 34, 1153-1169. p.
Q. Guan, H. An, X. Gao, S. Huang, H. Li, 2016, Estimating potential trade links in the international crude oil trade: A link prediction approach, Energy, Vol. 102, 406-415. p.
R. Jammazi, C. Aloui, 2012, Crude oil price forecasting: Experimental evidence from wavelet decomposition and neural network modeling, Energy Economics, Vol. 34, 828-841. p.
R.D.F. Harris, F. Yilmaz, 2009, A momentum trading strategy based on the low frequency component of the exchange rate, Journal of Banking and Finance, Vol. 33, 1575-1585. p.
S. Bekiros, M. Marcellino, 2013, The multiscale causal dynamics of foreign exchange markets, Journal of International Money and Finance, Vol. 33, 282-305. p.
S. Lahmiri, 2016, A variational mode decomposition approach for analysis and forecasting of economic and financial time series, Expert Systems with Applications, Vol. 55, 268-273. p.
S. Mirmirani, H.C. Li, 2005, A comparison of VAR and neural networks with genetic algorithm in forecasting price of oil, Advances in Econometrics, Vol. 19, 203-223. p.
S. Yousefi, I. Weinreich, D. Reinarz, 2005, Wavelet based prediction of oil prices, Chaos, Solitons and Fractals, Vol. 25, 265-275. p.
L. Tang, L. Yu, K.J. He, 2014, A novel data-characteristic-driven modeling methodology for nuclear energy consumption forecasting, Applied Energy, Vol. 128, 1-14. p.
T. Xiong, Y. Bao, Z. Hu, 2013, Beyond one-step-ahead forecasting: Evaluation of alternative multi-step-ahead forecasting models for crude oil prices, Energy Economics, Vol. 40, 405-415. p.
T. Xiong, Y. Bao, Z. Hu, 2014, Does restraining end effect matter in EMD-based
modeling framework for time series prediction? Some experimental evidences,
Neurocomputing, Vol. 123, 174-184. p.
W. Shu-ping, H. Ai-mei, W. Zhen-xin, L. Ya-qing, B. Xiao-wei, 2014, Multiscale
Combined Model Based on Run-Length-Judgment Method and Its Application in Oil
Price Forecasting, Hindawi Publishing Corporation
Y. Baimbetov, I. Khalil, M. Steinbauer, G. Anderst-Kotsis, 2015, Using Big Data for
Emotionally Intelligent Mobile Services through Multi-Modal Emotion Recognition, In:
Geissbühler A., Demongeot J., Mokhtari M., Abdulrazak B., Aloulou H. (eds) Inclusive
Smart Cities and e-Health. ICOST 2015. Lecture Notes in Computer Science, vol 9102.
Springer, Cham
Y. Deng, W. Wang, C. Qian, Z. Wang, D. Dai, 2001, Boundary-processing-technique in EMD method and Hilbert transform, Chinese Science Bulletin, Vol. 46, 954-960. p.
Y. Xiang, H.X. Zhuang, 2013, Application of ARIMA model in short-term prediction of international crude oil price, Advanced Materials Research, Vol. 798, 979-982. p.
Z. Guo, W. Zhao, H. Lu, J. Wang, 2012, Multi-step forecasting for wind speed using a
modified EMD-based artificial neural network model, Renewable Energy, Vol. 37, 241-
249. p.
Z. Wu, N.E. Huang, 2009, Ensemble empirical mode decomposition: a noise assisted data analysis method, Advances in Adaptive Data Analysis, Vol. 1, 1-41. p.