Research Article
Data-Driven Machine-Learning Model in District Heating System for Heat Load Prediction: A Comparison Study
Fisnik Dalipi,1 Sule Yildirim Yayilgan,1 and Alemayehu Gebremedhin2
1Faculty of Computer Science and Media Technology, Norwegian University of Science and Technology, 2815 Gjøvik, Norway
2Faculty of Technology and Management, Norwegian University of Science and Technology, 2815 Gjøvik, Norway
Correspondence should be addressed to Fisnik Dalipi; fisnik.dalipi@ntnu.no
Received 21 February 2016; Revised 11 May 2016; Accepted 16 May 2016
Academic Editor: Shyi-Ming Chen
Copyright © 2016 Fisnik Dalipi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
We present our data-driven supervised machine-learning (ML) model to predict heat load for buildings in a district heating system (DHS). Even though ML has been used as an approach to heat load prediction in the literature, it is hard to select an approach that will qualify as a solution for our case, as existing solutions are quite problem specific. For that reason, we compared and evaluated three ML algorithms within a framework on operational data from a DH system in order to generate the required prediction model. The algorithms examined are Support Vector Regression (SVR), Partial Least Squares (PLS), and random forest (RF). We use the data collected from buildings at several locations for a period of 29 weeks. Concerning the accuracy of predicting the heat load, we evaluate the performance of the proposed algorithms using mean absolute error (MAE), mean absolute percentage error (MAPE), and correlation coefficient. In order to determine which algorithm had the best accuracy, we conducted a performance comparison among these ML algorithms. The comparison of the algorithms indicates that for DH heat load prediction the SVR method presented in this paper is the most efficient one out of the three, also compared to other methods found in the literature.
1. Introduction
As stated in the report of the European Commission strategy for energy, the continuous growth of energy demand worldwide has made energy security a major concern for EU citizens. This demand is expected to increase by 27% by 2030, with important changes to energy supply and trade [1]. Being the largest energy consumer and CO2 emitter in the EU, the building sector is responsible for 40–50% of energy consumption in Europe and about 30–40% worldwide [2]. The North European countries have proved themselves as forerunners in the development and application of clean and sustainable energy solutions. Their excellent performance in adopting such solutions enables them to achieve ambitious national climate objectives and requirements and to serve as key players in the entire European energy system [3].
A district heating (DH) system is an optimal way of supplying heat to various sectors of society, such as industrial, public, or private buildings. A DH network offers functional, economic, and ecological advantages and is also instrumental in reducing global and local CO2 emissions. It offers an enormous adaptability to combine different types of energy sources efficiently [4]. Considering the recent technological trends of progressing to smart energy infrastructures, the development of the fourth generation of district heating implies meeting the objective of more energy-efficient buildings. Moreover, this also envisions DH networks as an integrated part of the operation of smart energy systems, that is, integrated smart electricity, gas, and thermal grids [5]. The application of new and innovative technology in district heating is therefore considered essential to improve energy efficiency [6].
Hindawi Publishing Corporation, Applied Computational Intelligence and Soft Computing, Volume 2016, Article ID 3403150, 11 pages, http://dx.doi.org/10.1155/2016/3403150

The deregulation of the electricity market and the increasing share of energy-efficient buildings have put district heating in a more vulnerable position with regard to challenges in terms of cost effectiveness, supply security, and energy sustainability within the local heat market. Against this background, it is therefore important for the district heating sector to maintain an efficient and competitive district heating system which is able to meet the various requirements that characterize the heat market. In a flexible district heating system with multiple energy sources and production technologies, the need for accurate load forecasting has become more and more important. This is especially important in a district heating system with simultaneous production of heat, steam, and electricity.
In this paper, with the application of three different ML algorithms to predict heat consumption, we investigate the performance of the Support Vector Regression (SVR), Partial Least Squares (PLS), and random forest (RF) approaches to develop heat load forecasting models by making a comparative study. Our focus is on low error, high accuracy, and validating our approach with real data. We also compare the error analysis of each algorithm with existing techniques (models) and find the most efficient one out of the three.
The rest of the paper is organized as follows. Section 2 outlines the related work, where we provide an overview of many approaches to load prediction that are found in the literature. In Section 3 we provide some background information about DH concepts. This is followed by a presentation of the system framework and related prediction models in Section 4. Further, in Section 5, we present and discuss the evaluation and results. Finally, Section 6 concludes the paper.
2. Related Work
The state of the art in the area of energy (heating, cooling, and electric energy) demand estimation in buildings is classified into forward (classical) and data-driven (inverse) approaches [7]. While the forward modelling approach generally uses equations with physical parameters that describe the building as input, the inverse modelling approach uses machine-learning techniques. Here, the model takes the monitored building energy consumption data as inputs, which are expressed in terms of one or more driving variables and a set of empirical parameters, and is widely applied for various measurements and other aspects of building performance [8]. The main advantage of data-driven models is that they can also operate online, making the process very easily updatable based on new data. Considering the fact that ML models offer powerful tools for discovery of patterns from large volumes of data, and given their ability to capture the nonlinear behavior of the heat demand, they represent a suitable technique to predict the energy demand at the consumer side.
Numerous ML models and methods have been applied for heat load prediction during the last decade. A good overview of some recent references is given by Mestekemper [6, 9]. The former also built his own prediction models using dynamic factor models. A simple model proposed by Dotzauer [10] uses the ambient temperature and a weekly pattern for prediction of the heat demand in DH. The author makes the social component equal to a constant value for all days of the week. There is another interesting model which addresses the utilization of a grey box that combines physical knowledge with mathematical modelling [11]. Some approaches to predict the heat load discussed in the literature include artificial neural networks (ANN) [12–15]. In [12], a backpropagation three-layered ANN is used for the prediction of the heat demand of different building samples. The inputs of the network for training and testing are building
transparency ratio (%), orientation angles (degrees), and insulation thickness (cm), and the output is the building heating energy need (Wh). When the ANN's outputs of this study are compared with numerical results, an average accuracy of 94.8–98.5% is achieved. The authors have shown that ANN is a powerful tool for prediction of building energy needs. In [13], the authors discuss the way self-organizing maps (SOMs) and multilayer perceptrons (MLP) can be used to develop a two-stage algorithm for autonomous construction of prediction models. The problem of heat demand prediction in a district heating company is used as a case study, where SOM is used as a means of grouping similar customer profiles in the first stage and MLP is used for predicting heat demand in the second stage. However, the authors do not provide any information related to the error rates obtained during the predictions.
In [14], recurrent neural networks (RNNs) are used for heat load prediction in district heating and cooling systems. The authors compare their prediction results from the RNN with the prediction results obtained from a three-layered feedforward neural network (TLNN). The mean squared error between the TLNN and the stationary actual heat load is reported to be 21052, whereas it is 11822 between the RNN and the actual heat load data. In the nonstationary case, the RNN still provides a lower mean squared error. The use of RNNs raises the expectation of capturing the trend of the heat load, since it uses heat load data for several days as the input.
In [15], time, historical consumption data, and ambient temperatures were used as input parameters to forecast heat consumption for one week into the future. The authors compared the performances of three black-box modelling techniques, SVR, PLS, and ANN, for the prediction of heat consumption in the Suseo DH network and analyzed the accuracy of each method by comparing forecasting errors. The authors report that in one-day-ahead forecasting the overall average error of PLS is 3.87%, while that of ANN and SVR is 6.54% and 4.95%, respectively. The maximum error of SVR is 9.82%, which is lower than that of PLS (16.47%) and ANN (13.20%). In terms of the overall error, the authors indicate that PLS exhibits better forecasting performance than ANN or SVR.
In [16], a multiple regression (MR) model is used for heat load forecasting; the reported MAE is 9.30. The model described in [17] uses an online machine-learning approach named Fast Incremental Model Trees with Drift Detection (FIMT-DD) for heat load prediction, which allows the flexibility of updating the model when the distribution of the target variable changes. The results of the study indicate that MAE and MAPE for FIMT-DD (using Bagging) have lower values in comparison to Adaptive Model Rules (AMRules) and Instance Based Learner on Streams (IBLStreams).
The authors in [18] compare the performance of four supervised ML algorithms (MLR, FFN, SVR, and Regression Tree (RT)) by studying the effect of internal and external factors. The external factors include outdoor temperature, solar radiation, wind speed, and wind direction. The internal factors are related to the district heating system and include supply and return water pressure, supply and return water temperature, the difference between supply and return temperature, and circular flow. Their study shows that SVR showed the best accuracy on heat load prediction for 1- to 24-hour horizons. However, the prediction accuracy decreases with the rise in horizon from 1 to 18 hours.

Table 1: ML models for heat demand prediction in the literature.

Applied algorithms   [12] [13] [14] [15] [16] [17] [32] [33] [34] [35] [36] Our work
ANN/FNN/SOM/RNN       +    +    +    +    −    −    +    −    −    +    −    −
MLR/MR/PLS            −    −    −    +    +    −    +    −    −    −    −    +
SVM/SVR               −    −    −    +    −    −    +    +    −    −    +    +
BN                    −    −    −    −    −    −    −    −    +    −    −    −
DT/RF/RT              −    −    −    −    −    +    +    −    −    −    −    +
Ensembles             −    −    −    −    −    +    −    −    −    −    −    +
Wu et al. [19] discuss and implement SVR as a predictive model on the building's historical energy use. Their predictive model proved to approximate current energy use, with some seasonal and customer-specific variations in the approximations. Another work [20] discusses the importance of load prediction in a smart energy grid network; the authors propose a BN to predict the total consumer water heat consumption in households. Shamshirband et al. [21] construct an adaptive neurofuzzy inference system (ANFIS), which is a special case of the ANN family, to predict heat load for individual consumers in a DH system. Their result indicates that further improvements of the model are required for prediction horizons greater than 1 hour. Protic et al. [22] study the relevance of short-term heat load prediction for operation control in a DH network. Here, the authors apply SVR for heat load prediction for only one substation, with a time horizon of every 15 minutes. To improve the predictive model, the authors also add a dummy variable to define the state of DH operation.
In the literature, the research towards developing load forecasting models is also discussed from different perspectives and used in different energy-related applications, such as heat load in district heating, wind turbine reaction torque prediction [23], and wind power forecasting [24, 25].
In [23], SVR is employed for wind turbine torque prediction. The results show that an improvement in accuracy can be achieved and conclude that SVR can be considered a suitable alternative for prediction. It can also be seen that the proposed SVR prediction models produce higher accuracy compared to ANN and ANFIS (adaptive neurofuzzy inference system). The work discussed in [24] considers that the penetration of renewable energies in electrical power systems increases the level of uncertainty. In such situations, traditional methods for forecasting of load demand cannot properly handle these uncertainties. Hence, the authors implement a neural network method for constructing prediction intervals using a lower/upper bound estimation (LUBE) approach. The authors conduct a comparative analysis and show that this method can increase the prediction interval quality for load and wind power generation predictions.
Bhaskar and Singh [25] perform statistics-based wind power prediction using numerical weather prediction (NWP). In order to validate the effectiveness of the proposed method, the authors compared it with benchmark models such as persistence (PER) and new-reference (NR) and show that the proposed model outperforms these benchmark models.
Additionally, due to innovations in future sustainable and smart energy systems and recent technological trends with the IoT (Internet of Things), many research works [5, 26] consider DH systems as being an integral part of the Smart Grid within the smart city concept. Moreover, such a DH system model will require high computation time and resources for knowledge representation, knowledge inference, and operational optimization problems. Thus, in response to this, researchers are continuously focusing on the development and use of fast and efficient algorithms for real-time processing of energy and behavior related data.
As a summary, previous research on heat load prediction points to various training algorithms: ANN, including RNN, FFN (Feedforward Neural Network), MLP, and SOM; MR, including MLR and PLS; SVM, including SVR; Bayesian networks (BN); decision trees (DT); ensemble methods [27]; FIMT-DD; AMRules; and IBLStreams.
In spite of the interest and the considerable efforts of the research community so far, there is no consensus among researchers on selecting either the most suitable training model for heat load prediction or an appropriate set of input parameters to train the model with [16] in order to achieve a high level of prediction accuracy. This is due to the fact that the superiority of one model over another in heat load prediction cannot be asserted in general, because the performance of each model rather depends on the structure of the prediction problem and the type of data available. The comparison in [15] already pointed to the superior performance of SVR; however, as our problem structure and inputs are different from theirs, we chose to do a comparison of several up-to-date models to find the most promising approach for our case. Table 1 lists models from the literature. The "plus" sign indicates that a particular algorithm has been applied, while "minus" means the opposite. Based on the table, we concluded that SVR, PLS, and RF provide us with a unique combination of models to compare with each other. Simplicity and efficiency of each model in our combination are preferred, such that a rapid and simple assessment of energy demand with high accuracy can be obtained.
3. District Heating Systems
District heating is a well-proven technology for heat supply from different energy sources through heat generation and distribution to heat consumers. DH systems are valuable infrastructure assets which enable effective resource utilization by incorporating the use of various energy sources. One of the main advantages of a DH system is that it facilitates the use of combined heat and power (CHP) generation and thereby makes the overall system efficient.

Figure 1: District heating block diagram (energy source → heat generation plant → DH network → heat demand: space heating and domestic hot water, with supply and return pipes).
District heating can play a crucial role in reaching some of the energy and environmental objectives by reducing CO2 emissions and improving overall energy efficiency. In a district heating system, heat is distributed through a network of hot-water pipes from heat-supplying plants to end users. The heat is mostly used for space heating and domestic hot water. A simplified schematic picture of a DH system is shown in Figure 1.
The main components of a district heating system are heat generation units, the distribution network, and customer substations. The heat generation unit may use heat-only boilers, CHP plants, or a combination of these two for heat generation. Various types of energy sources, like biomass, municipal solid waste, and industrial waste heat, can be used for heat production. The heat is then distributed to different customers through a network of pipelines. In the customer substations, the heat energy from the network is transferred to the end user's internal heating system.
The heat-supplying units are designed to meet the heat demand. The heat output to the network depends on the mass flow of the hot water and the temperature difference between the supply and the return line. The supply temperature of the hot water is controlled directly from the plant's control room based on the outdoor temperature, and it mostly follows a given operation temperature curve. The return temperature, on the other hand, depends mainly on the customer's heat usage and also on other network-specific constraints. The level of the supply temperature differs from country to country. For instance, in Sweden the temperature level varies between 70 and 120°C, depending on season and weather [28].
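To make the idea of an operation temperature curve concrete, a minimal sketch follows. The linear ramp and the breakpoints are purely illustrative assumptions (real curves are plant-specific); only the 70–120°C band comes from the Swedish example in the text.

```python
def supply_temperature(outdoor_temp_c):
    """Hypothetical operation temperature curve: the supply setpoint rises
    as outdoor temperature falls, clamped to the 70-120 degC band mentioned
    in the text. The linear ramp below is an illustrative assumption."""
    # Assumed ramp: 70 degC at +15 degC outdoors, 120 degC at -20 degC outdoors.
    setpoint = 70.0 + (15.0 - outdoor_temp_c) * (120.0 - 70.0) / 35.0
    # Clamp to the operating band regardless of how extreme the weather gets.
    return min(120.0, max(70.0, setpoint))
```

In a real plant the curve is tuned per network and often includes wind and seasonal corrections; the clamp reflects that the setpoint never leaves the design band.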
The heat load in a district heating system is the sum of all heat loads that are connected to the network plus distribution and other losses in the network.
With increased concerns about the environment, climate change, and the energy economy, DH is an obvious choice. Nowadays, district heating systems are equipped with advanced and cutting-edge technology systems and sensors that monitor and control production units remotely from a control room. From a smart city perspective, one of the future challenges that now remains is to integrate district heating with the electricity sector as well as the transport sector. Heat load forecasting models with high accuracy are important to keep up with the rapid development in this direction.

Figure 2: Workflow scenario of the proposed heat load approach (collection of operational data → data preparation (aggregation and preprocessing) → machine-learning algorithms (SVR, PLS, and RF) → heat load forecast curves).
4. System Design and Application
As mentioned earlier, in this study we perform short-term prediction of heat consumption and evaluate the three ML methods. Building on the heat load system presented in our previous work [6], in this section we present and describe our heat load prediction approach in detail, as shown in Figure 2, which includes the collection of operational data, data preparation, and the examined ML algorithms.

In this work, there are two main tasks which are relevant in the system implementation: (a) data aggregation and preprocessing and (b) the ML application, where the heat load prediction is approached with the supervised ML algorithms.
4.1. Operational Data Collection. The data we have used in this study are provided by Eidsiva Bioenergi AS, which operates one of Europe's most modern waste incineration plants, located in the city of Hamar, Norway. The plant produces district heating, process steam, and electricity. These data are collected by regular measurements that are part of the control system in a DH plant. The measurements consist of 24 readings per day, that is, one every hour. The dataset contains values of the parameters time of day (tD), forward temperature (FT), return temperature (RT), flow rate (FR), and heat load (HL). The data were collected in the period between October 1st, 2014, and April 30th, 2015. In Table 2 we present a portion of typical data samples for one day.

Table 2: Typical data samples for one day from the Eidsiva dataset.

Time (h)   FT (°C)   RT (°C)   Flow (m³/h)   HL (MW)
1          100.9     60.4      316.5         14.6
2          99.4      57.5      279.1         13.4
3          99.6      58.3      228.3         13.2
4          100.3     59.8      261.9         15.1
5          100.5     59.6      276.9         14.9
6          100.1     58.9      270.5         12.8
...
20         98.7      59.7      353           14.8
21         101.4     58.9      330.9         16.1
22         101.1     59.1      250.8         14.2
23         100.8     59.8      260.2         15.4
24         101.9     59.3      247.9         14.1
4.2. Data Preparation. In this module, activities related to preparing the data to be compatible with the ML module are performed. The module includes data aggregation and preprocessing. During the process of data aggregation, we combine these data sources with the weather data (outdoor temperature), which are collected at the same interval as the previous parameters. Consequently, we obtain from the aggregation process the following output parameters: outdoor temperature (OT), heat load (HL), forward temperature (FT), time of day (tD), and the difference between forward temperature (FT) and return temperature (RT), namely, DT.
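The aggregation step above can be sketched in a few lines of pandas. The toy frames below stand in for the hourly plant and weather measurements (the paper does not publish its raw file layout, so the column names follow the abbreviations in the text and the values are illustrative):

```python
import pandas as pd

# Toy stand-ins for the hourly plant and weather measurements.
plant = pd.DataFrame({
    "timestamp": pd.to_datetime(["2014-10-01 01:00", "2014-10-01 02:00"]),
    "FT": [100.9, 99.4],   # forward temperature (degC)
    "RT": [60.4, 57.5],    # return temperature (degC)
    "FR": [316.5, 279.1],  # flow rate (m^3/h)
    "HL": [14.6, 13.4],    # heat load (MW)
})
weather = pd.DataFrame({
    "timestamp": pd.to_datetime(["2014-10-01 01:00", "2014-10-01 02:00"]),
    "OT": [5.2, 4.8],      # outdoor temperature (degC), illustrative values
})

# Join plant and weather readings on the shared hourly timestamp.
data = plant.merge(weather, on="timestamp", how="inner")

# Derived features named in the text: time of day and DT = FT - RT.
data["tD"] = data["timestamp"].dt.hour
data["DT"] = data["FT"] - data["RT"]

X = data[["OT", "FT", "tD", "DT"]]  # model inputs
y = data["HL"]                      # prediction target
```

The inner join drops any hour missing from either source, which mirrors the requirement that weather readings be collected at the same interval as the plant parameters.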
4.3. Machine-Learning Predictive Modelling. Machine learning (ML) is a very broad subject; it ranges from very abstract theory to extreme practice. It turns out that even amongst machine-learning practitioners there is no well-accepted definition of what machine learning is. As a subfield of artificial intelligence with the objective of building models that learn from data, machine learning has seen tremendous improvements and applications in the last decade.
In general, ML can be defined as a set of methods that can automatically detect and extract knowledge patterns in empirical data, such as sensor data or databases, and then use the discovered knowledge patterns to predict future data or execute other types of decision-making under uncertainty. ML is divided into three principal groups: supervised learning (the predictive learning approach), unsupervised learning (the descriptive learning approach), and reinforcement learning [29]. In supervised learning, the algorithm is given data in which the "correct answer" for each example is told, and the main property of supervised learning is that the target function y = f(x) is unknown. At a very high level, the two steps of supervised learning are as follows: (i) train a machine-learning model using labeled data that consist of N data pairs (x_1, y_1), ..., (x_N, y_N), called instances, and (ii) make predictions on new data for which the label is unknown. Each instance is described by an input vector x_i, which incorporates a set of attributes A = {A_1, A_2, ..., A_m}, and a label y_i of the target attribute that represents the wanted output. To summarize these two steps, the predictive model learns from past examples made up of inputs and outputs and then applies what is learned to future inputs in order to predict future outputs. Since we are making predictions on unseen data, that is, data not used to train the model, it is often said that the primary goal of supervised learning is to build models that generalize: the built machine-learning model accurately predicts the future rather than the past. Therefore, the goal is to train a model that can afterwards predict the label of new instances and figure out the target function.
Based on the type of the output variable y_i, supervised learning tasks are further divided into two types: classification and regression problems. In problems where the output variable y_i is categorical or nominal (or belongs to a finite set), the ML tasks are known as classification problems or pattern recognition, whereas in regression problems the output variable is a real-valued scalar or takes continuous values.
4.3.1. Support Vector Regression (SVR). Support vector machines (SVM), a set of supervised learning algorithms based on statistical learning theory, are one of the most successful and widely applied machine-learning methods for solving both regression and pattern recognition problems. Since the formulation of SVMs is based on structural risk minimization and not on empirical risk minimization, this algorithm shows better performance than the traditional ones. Support Vector Regression (SVR) is the method of SVM specifically for regression. In SVR, the objective function (e.g., the error function that may need to be minimized) is convex, meaning that the global optimum is always reached. This is sharply in contrast to artificial neural networks (ANNs), where, for instance, the classical backpropagation learning algorithm is prone to convergence to "bad" local minima [30, 31], which makes them harder to analyze theoretically. In practice, SVR has greatly outperformed ANNs in a wide range of applications [31].
In SVR, the input x is first mapped into an m-dimensional feature space using a nonlinear mapping. As a subsequent step, we construct a linear model in that feature space. Mathematically, the linear model f(x, w) is given by

    f(x, w) = Σ_{j=1}^{m} w_j g_j(x) + b,    (1)
where g_j(x), j = 1, ..., m, represents the set of nonlinear transformations, while b is the bias term, which most of the time is assumed to be zero; hence we omit this term.
The model obtained by SVR depends exclusively on a subset of the training data; at the same time, SVR tries to reduce model complexity by minimizing ‖w‖². Consequently, the objective of SVR is to minimize the following function [32]:

    min  (1/2)‖w‖² + C Σ_{i=1}^{n} (ξ_i + ξ_i*)

    such that  y_i − f(x_i, w) ≤ ε + ξ_i*,
               f(x_i, w) − y_i ≤ ε + ξ_i,
               ξ_i, ξ_i* ≥ 0,  i = 1, ..., n.    (2)

In these equations, ε is a new type of (insensitive) loss function, or a threshold, which denotes the desired error range for all points. The nonnegative variables ξ_i and ξ_i* are called slack variables; they measure the deviation of training samples outside ε, thus guaranteeing that a solution exists for all ε. The parameter C > 0 is a penalty term used to determine the tradeoff between data fitting and flatness, and w are the regression weights. In most cases, the optimization problem can be easily solved if transformed into a dual problem. By utilizing Lagrange multipliers, the dualization method is applied as follows:
    L = (1/2)‖w‖² + C Σ_{i=1}^{n} (ξ_i + ξ_i*)
        − Σ_{i=1}^{n} α_i* (ε + ξ_i* − y_i + f(x_i, w))
        − Σ_{i=1}^{n} α_i (ε + ξ_i + y_i − f(x_i, w))
        − Σ_{i=1}^{n} (λ_i ξ_i + λ_i* ξ_i*),    (3)

where L is the Lagrangian and α_i, α_i*, λ_i, λ_i* ≥ 0 are called the Lagrange multipliers.

Considering the saddle point condition, it follows that the partial derivatives of L with respect to the variables (w, b, ξ_i, ξ_i*) vanish at optimality. By proceeding with similar steps, we end up with the dual optimization problem. Finally, the solution of the dual problem is given by
    f(x) = Σ_{i=1}^{n_SV} (α_i − α_i*) K(x_i, x),    (4)

where 0 ≤ α_i* ≤ C, 0 ≤ α_i ≤ C, n_SV is the number of support vectors, and K is the kernel function, which, for two given vectors in input space, returns the dot product of their images in the higher-dimensional feature space. The kernel is given by

    K(x, x_i) = Σ_{j=1}^{m} g_j(x) g_j(x_i).    (5)

In order to map the input data to a higher-dimensional space and to handle nonlinearities between the input vectors and their respective outputs, we use as kernel the Gaussian radial basis function (RBF), which has γ as its kernel parameter. Once the kernel is selected, we use grid search to identify the best pair of the regularization parameters C and γ, that is, the pair with the best cross-validation accuracy.
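The RBF-kernel SVR with a cross-validated grid search over C and γ can be sketched with scikit-learn. The synthetic data (stand-ins for the OT, FT, tD, DT features), the grid values, and ε = 0.1 are illustrative assumptions, not the paper's actual settings:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data standing in for the four input features.
rng = np.random.default_rng(0)
X = rng.uniform(-10, 10, size=(200, 4))
y = 12.0 - 0.4 * X[:, 0] + 0.1 * X[:, 3] + rng.normal(0, 0.2, 200)

# RBF-kernel SVR; features are standardized first, which RBF kernels need.
search = GridSearchCV(
    make_pipeline(StandardScaler(), SVR(kernel="rbf", epsilon=0.1)),
    param_grid={"svr__C": [1, 10, 100], "svr__gamma": [0.01, 0.1, 1.0]},
    cv=5,  # 5-fold cross-validation scores each (C, gamma) pair
)
search.fit(X, y)
best = search.best_params_  # the (C, gamma) pair with the best CV score
```

`best_params_` is exactly the "best pair with the best cross-validation accuracy" described above; in practice the grid would be refined around the winning pair.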
4.3.2. Partial Least Squares (PLS). The Partial Least Squares (PLS) technique is a learning method based on a multiple linear regression model that takes into account the latent structure in both datasets. The dataset consists of explanatory variables X_i and dependent variables Y_j. The model is linear, as can be seen in (6): for each sample n, the value y_nj is

    y_nj = Σ_{i=0}^{k} β_i x_ni + ε_nj.    (6)
The PLS model is similar to a linear regression model; however, the way of calculating β_i is different. The principle of PLS regression is that the data tables or matrices X and Y are decomposed into latent structures in an iterative process. The latent structure corresponding to the most variation in Y is extracted and explained by the latent structure of X that explains it best.

PLS can be used for correlating the information in one data matrix X to the information in another matrix Y. More specifically, PLS is used to find the fundamental relations between the two matrices (X and Y), which are projected onto several key factors, such as T and U, and linear regression is performed for the relation between these factors. Factor U represents the most variation for Y, whereas factor T denotes the variation for X, without necessarily explaining the most variation in X.
The first results of PLS are the model equations showing the β coefficients that give the relationship between the variables X and Y. These model equations are as follows:

    Y = Xβ + ε,

    Y = T_h C′_h + ε_h = X W_h* C′_h + ε_h
      = X W_h (P′_h W_h)^{−1} C′_h + ε_h,    (7)

where Y is the matrix of dependent variables, X is the matrix of explanatory variables, T_h, C_h, W_h*, W_h, and P_h are the matrices generated by the PLS algorithm, and ε_h is the matrix of residuals. The matrix β of the regression coefficients of Y on X, with h components generated by the PLS algorithm, is given by

    β = W_h (P′_h W_h)^{−1} C′_h.    (8)
Figure 3: Random forest algorithm process flow. [Diagram: a sample of the data is split into train data (the dataset used to grow a single tree) and Out-Of-Bag (OOB) data (used to estimate the error of the grown tree). For each tree, feature selection (random selection of predictors) and tree growing (splitting the data using the best predictors) are repeated until the criteria for stopping tree growth are fulfilled; the OOB error is then estimated by applying the tree to the OOB data. The process repeats until the specified number of trees is obtained, and the random forest is the collection of all trees.]
The advantage of PLS is that this algorithm allows taking into account the data structure in both the X and Y matrices. It also provides great visual results that help the interpretation of data. Last but not least, PLS can model several response variables at the same time, taking into account their structure.
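To make the latent-structure extraction concrete, the first component and the resulting regression coefficients of equation (8) (with h = 1) can be sketched in pure Python. This is a minimal single-response PLS1 illustration under our own naming, not the paper's implementation:

```python
def pls1_first_component(X, y):
    """Extract the first PLS latent structure t = Xc*w from mean-centered
    data and return the regression coefficients beta. For the first
    component P'W = 1 holds, so eq. (8) reduces to beta = w*c."""
    n, p = len(X), len(X[0])
    # Mean-center X and y: PLS operates on centered data.
    xm = [sum(col) / n for col in zip(*X)]
    ym = sum(y) / n
    Xc = [[X[i][j] - xm[j] for j in range(p)] for i in range(n)]
    yc = [v - ym for v in y]
    # Weight vector w proportional to X'y: direction of maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5  # assumes X actually covaries with y
    w = [v / norm for v in w]
    # Scores t = Xc*w and the y-loading c from regressing y on t.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in t)
    c = sum(t[i] * yc[i] for i in range(n)) / tt
    beta = [wj * c for wj in w]
    return beta, xm, ym

def pls1_predict(beta, xm, ym, x):
    """Predict a new response from the centered linear model."""
    return ym + sum((x[j] - xm[j]) * beta[j] for j in range(len(x)))
```

On data with a single underlying factor, for example X = [[1, 2], [2, 4], [3, 6]] and y = [1, 2, 3], one component already reproduces the response exactly; real heat load data would need several components, extracted by deflating X and y and iterating.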
4.3.3. Random Forest (RF). The random forest algorithm, proposed by Breiman [33], is an ensemble-learning algorithm consisting of tree predictors, where the trees are built from various random features. It develops many decision trees based on random selection of data and random selection of variables, providing the class of the dependent variable based on many trees.
This method is based on the combination of a large collection of decorrelated decision trees (i.e., Classification and Regression Trees (CART) [34]). Since all the trees are based on random selection of data as well as variables, these are random trees, and many such random trees lead to a random forest. The name forest means that many decision trees are used to make a better classification of the dependent variable. The CART technique divides the learning sample using an algorithm known as binary recursive partitioning. This splitting or partitioning starts from the most important variable to the less important ones, and it is applied to each of the new branches of the tree [35].
In order to increase the algorithm's accuracy and reduce the generalization error of the ensemble of trees, another technique, called bagging, is incorporated. The estimate of the generalization error is obtained with the Out-Of-Bag (OOB) method. Bagging is used on the training dataset to generate many copies of it, where each copy corresponds to a decision tree.
With the RF algorithm, each tree is grown as follows [36]:

(a) If the number of cases (observations) in the training set is N, sample N cases at random but with replacement from the original data; this sample will be the training set for growing the tree.

(b) If there are M input variables, a number m ≪ M is specified such that at each node, m variables are selected at random out of the M, and the best split on these m is used to split the node. The value m is held constant during the forest growing.

(c) Each tree is grown to the largest extent possible.

The process flow in the random forest models is shown in Figure 3.
The error rate of RF primarily depends on the degree of correlation between any two trees and the prediction accuracy of the individual trees. The principal advantages of this method are the ease of parallelization, robustness, and indifference to noise and outliers in most datasets. Due to the unbiased nature of its execution, this method avoids overfitting.
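Steps (a) and (b) above — bootstrap sampling with replacement and random feature selection, with the never-drawn cases forming the OOB set — can be sketched as follows (function names are ours, for illustration only):

```python
import random

def bootstrap_with_oob(n_cases, seed=0):
    """Step (a): draw n_cases indices at random with replacement (the 'bag'
    used to grow one tree); indices never drawn form the Out-Of-Bag set
    used to estimate the error of that tree."""
    rng = random.Random(seed)
    bag = [rng.randrange(n_cases) for _ in range(n_cases)]
    in_bag = set(bag)
    oob = [i for i in range(n_cases) if i not in in_bag]
    return bag, oob

def random_feature_subset(n_features, m, seed=0):
    """Step (b): at each node, select m << M candidate variables at random;
    only these m are considered for the best split."""
    return random.Random(seed).sample(range(n_features), m)
```

On average roughly a third of the cases end up out-of-bag, which is what makes the OOB error a cheap built-in validation estimate.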
Table 3: Performance results for one week.

Algorithm | Training MAE (MW) | Training MAPE (%) | Training corr. coeff. | Testing MAE (MW) | Testing MAPE (%) | Testing corr. coeff.
SVR | 0.6491 | 3.11 | 0.90 | 0.7032 | 3.43 | 0.91
PLS | 2.6524 | 12.73 | 0.81 | 2.3868 | 10.42 | 0.79
RF | 1.7382 | 19.77 | 0.83 | 1.5932 | 18.61 | 0.84
Figure 4: Hourly heat load pattern in relation to outdoor temperature in the DH network. [Scatter plot of heat load measurements (Oct 1–Apr 30): heat output (MW) versus outdoor temperature (°C).]
5. Performance Evaluation and Results
The proposed approach is implemented in MATLAB R2014a [37] and executed on a PC with an Intel Core i7 processor with 2.7 GHz speed and 8 GB of RAM. In this work, as the training dataset we select the data measured during the first 28 weeks, which consist of 4872 instances. As the prediction period, we choose the 29th week as test data, that is, 148 instances. In order to evaluate the performance of the proposed algorithms in terms of accuracy of the results, we use the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the correlation coefficient, which measures the correlation between the actual and predicted heat load values. MAE and MAPE are defined as follows:
MAE = (1/n) Σ_{i=1}^{n} |p_i − y_i|,
MAPE = (1/n) Σ_{i=1}^{n} |(p_i − y_i) / p_i|,
(9)
where p_i is the actual value, y_i is the predicted value, and n is the number of samples in the training set. We apply 10-fold cross-validation to obtain valid evaluation metrics.

Figure 4 presents the heat load in the network with respect to the outdoor temperature. We see that the higher values of the heat load occur during the days with lower outdoor temperatures, which in fact reflects the increased consumption of heat.
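The evaluation recipe described above — a chronological 28-week/1-week split scored with MAE, MAPE (reported in percent), and the Pearson correlation coefficient — can be sketched in a few lines of Python (names are ours, not the paper's MATLAB code):

```python
def chronological_split(series, n_train):
    """Time-ordered split without shuffling: the first n_train hourly
    instances train the model, the remaining week is held out for testing."""
    return series[:n_train], series[n_train:]

def mae(actual, predicted):
    """Mean absolute error, as in equation (9)."""
    return sum(abs(p - y) for p, y in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error of equation (9), scaled to percent."""
    return 100.0 * sum(abs((p - y) / p) for p, y in zip(actual, predicted)) / len(actual)

def correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted loads."""
    n = len(actual)
    ma, mp = sum(actual) / n, sum(predicted) / n
    cov = sum((a - ma) * (b - mp) for a, b in zip(actual, predicted))
    sa = sum((a - ma) ** 2 for a in actual) ** 0.5
    sp = sum((b - mp) ** 2 for b in predicted) ** 0.5
    return cov / (sa * sp)
```

With the figures from the paper, chronological_split(loads, 4872) leaves the 148 hourly test instances of the 29th week.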
Figure 5(a) shows the actual heat load and the heat load predicted for one week with the SVR algorithm. In Figure 5(b), the heat load predicted for one week with PLS is shown together with the actual heat load, whereas the heat load forecast for one week with RF is shown in Figure 5(c).
As can be seen from Figure 5, the heat load predicted with SVR is closest to the actual energy consumption, with a MAPE value of about 3.43% and a correlation coefficient of 0.91. The correlation has been consistent throughout training and testing. On the other hand, the graphs presented in Figures 5(b) and 5(c), for PLS and RF, respectively, are less accurate, with higher errors. The performance of PLS is significantly lower compared to SVR. Concerning RF, as the trees became progressively uncorrelated, the error rate declined significantly. The superior performance of SVR over the other two methods is attributed to its efficient modelling of the feature space, to the fact that SVR is less prone to overfitting, and to the fact that it does not require any procedure to determine the explicit model form, as ordinary regression analysis does.
Table 3 outlines the results of the investigated ML algorithms for heat consumption prediction for both the training phase (where we developed the models using supervised learning schemes) and the testing phase (to generalize to newly unseen data). From Table 3 it is evident that SVR shows the best prediction performance in terms of average errors and correlation coefficient, confirming the superiority of SVR over the other machine-learning methods. Therefore, provided that the number and type of the operating facilities are determined, SVR can be effectively applied in the management of a DHS. The mean absolute percentage error of 3.43% obtained in our approach with SVR is lower than or equal to the mean absolute percentage error of state-of-the-art approaches for heat load prediction in DHS. Moreover, SVR is also better than PLS and RF in terms of mean average errors and correlation coefficient. Nevertheless, it is sometimes impossible to perform a direct comparison with other works due to different system implementations and input data and different structures of the experimental setup.
5.1. Comparison with State-of-the-Art Methods. Some prior research has been carried out to predict and analyze the heat demand in DH. However, due to different system designs and input data and different architecture implementations or structures of the experimental setup, it is sometimes difficult to perform a direct comparison. Our approach uses operational data from a DHS to model the heat demand, and since SVR exhibited the best prediction performance, we use this method to perform the comparison. In our case we obtained a smaller MAPE compared to [17], where the experimental results showed a MAPE of 4.77%, which is lower than or at least equal to the mean percentage error of state-of-the-art regression approaches that have been proposed for heat load forecasting in DH systems. Furthermore, reported results
Figure 5: Hourly heat load prediction based on SVR, PLS, and RF for one week. [Three panels plot actual versus predicted heat load (MW) over Monday–Sunday: (a) predicted versus actual demand with SVR; (b) predicted versus actual demand with PLS; (c) predicted versus actual demand with RF.]
from our study are also superior compared to the results presented in [15, 32]. More specifically, the SVR method we apply shows better results in terms of MAPE and correlation coefficient compared to [32], where the MAPE is 5.54% and the correlation coefficient is 0.87. As far as the comparison with [15] is concerned, in terms of MAPE our SVR method exhibits better hourly prediction performance, where their MAPE for one week is 5.63%. On the other hand, their PLS method performs better than in our case, having a MAPE value of about 8.99%.
6. Conclusion
The district heating (DH) sector can play an indispensable role in the current and future sustainable energy systems of the North European countries, where the share of DH in the total European heat market is significantly high. The innovations and emergence of new technology and the increasing focus on smart buildings force energy systems to face the challenge of adapting to customers' more flexible and individual solutions. Consequently, one needs to know more about what drives customers' requirements, choices, and priorities. Therefore, the creation and application of innovative IT ecosystems in DH are considered essential to improve energy efficiency.
Heat load prediction has attracted a lot of interest from researchers in the past decade, since it can assist the energy efficiency of the DHS, which also leads to cost reductions for heat suppliers and to many environmental benefits.
In this paper, three ML algorithms for heat load prediction in a DH network are developed and presented. The algorithms are SVR, PLS, and RF. The heat load prediction models
were developed using data from 29 weeks. The predicted hourly results were compared with actual heat load data. The performances of these three different ML algorithms were studied, compared, and analyzed. The SVR algorithm proved to be the most efficient one, producing the best performance in terms of average errors and correlation coefficient. Moreover, the prediction results were also compared against existing SVR and PLS methods in the literature, showing that the SVR presented in this paper produces better accuracy.
In conclusion, the comparison results validate the notion that the developed SVR method is appropriate for application in heat load prediction, or that it can serve as a promising alternative to existing models.
As for future work, apart from the outdoor temperature, we intend to incorporate other meteorological parameters influencing the heat load, such as wind speed and humidity.
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
The authors would like to thank Eidsiva Bioenergi AS, Norway, which kindly provided the data for this research work.
References
[1] "Report on European Energy Security Strategy," Brussels, May 2014, http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52014DC0330&from=EN.
[2] A. R. Day, P. Ogumka, P. G. Jones, and A. Dunsdon, "The use of the planning system to encourage low carbon energy technologies in buildings," Renewable Energy, vol. 34, no. 9, pp. 2016–2021, 2009.
[3] IEA (International Energy Agency), Nordic Energy Technology Perspectives, OECD/IEA, Paris, France, 2013.
[4] A. Gebremedhin and H. Zinko, "Seasonal heat storages in district heating systems," in Proceedings of the 11th International Thermal Energy Storage for Energy Efficiency and Sustainability Conference (Effstock '09), Stockholm, Sweden, June 2009.
[5] H. Lund, S. Werner, R. Wiltshire et al., "4th Generation District Heating (4GDH): integrating smart thermal grids into future sustainable energy systems," Energy, vol. 68, pp. 1–11, 2014.
[6] F. Dalipi, S. Y. Yayilgan, and A. Gebremedhin, "A machine learning approach to increase energy efficiency in district heating systems," in Proceedings of the International Conference on Environmental Engineering and Computer Application (ICEECA '14), pp. 223–226, CRC Press, Hong Kong, December 2014.
[7] N. Fumo, "A review on the basics of building energy estimation," Renewable and Sustainable Energy Reviews, vol. 31, pp. 53–60, 2014.
[8] D. E. Claridge, "Using simulation models for building commissioning," in Proceedings of the International Conference on Enhanced Building Operation, Energy Systems Laboratory, Texas A&M University, October 2004.
[9] T. Mestekemper, Energy demand forecasting and dynamic water temperature management [Ph.D. thesis], Bielefeld University, Bielefeld, Germany, 2011.
[10] E. Dotzauer, "Simple model for prediction of loads in district-heating systems," Applied Energy, vol. 73, no. 3-4, pp. 277–284, 2002.
[11] H. A. Nielsen and H. Madsen, "Modelling the heat consumption in district heating systems using a grey-box approach," Energy and Buildings, vol. 38, no. 1, pp. 63–71, 2006.
[12] B. B. Ekici and U. T. Aksoy, "Prediction of building energy consumption by using artificial neural networks," Advances in Engineering Software, vol. 40, no. 5, pp. 356–362, 2009.
[13] M. Grzenda and B. Macukow, "Demand prediction with multi-stage neural processing," in Advances in Natural Computation and Data Mining, pp. 131–141, 2006.
[14] K. Kato, M. Sakawa, K. Ishimaru, S. Ushiro, and T. Shibano, "Heat load prediction through recurrent neural network in district heating and cooling systems," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '08), pp. 1401–1406, Singapore, October 2008.
[15] T. C. Park, U. S. Kim, L.-H. Kim, B. W. Jo, and Y. K. Yeo, "Heat consumption forecasting using partial least squares, artificial neural network and support vector regression techniques in district heating systems," Korean Journal of Chemical Engineering, vol. 27, no. 4, pp. 1063–1071, 2010.
[16] T. Catalina, V. Iordache, and B. Caracaleanu, "Multiple regression model for fast prediction of the heating energy demand," Energy and Buildings, vol. 57, pp. 302–312, 2013.
[17] S. Provatas, N. Lavesson, and C. Johansson, "An online machine learning algorithm for heat load forecasting in district heating systems," in Proceedings of the 14th International Symposium on District Heating and Cooling, Stockholm, Sweden, September 2014.
[18] S. Idowu, S. Saguna, C. Åhlund, and O. Schelén, "Forecasting heat load for smart district heating systems: a machine learning approach," in Proceedings of the IEEE International Conference on Smart Grid Communications (SmartGridComm '14), pp. 554–559, IEEE, Venice, Italy, November 2014.
[19] L. Wu, G. Kaiser, D. Solomon, R. Winter, A. Boulanger, and R. Anderson, Improving Efficiency and Reliability of Building Systems Using Machine Learning and Automated Online Evaluation, Columbia University Academic Commons, 2012, http://hdl.handle.net/10022/AC:P:13213.
[20] M. Vlachopoulou, G. Chin, J. C. Fuller, S. Lu, and K. Kalsi, "Model for aggregated water heater load using dynamic Bayesian networks," Tech. Rep., Pacific Northwest National Laboratory (PNNL), Richland, Wash, USA, 2012.
[21] S. Shamshirband, D. Petković, R. Enayatifar et al., "Heat load prediction in district heating systems with adaptive neuro-fuzzy method," Renewable and Sustainable Energy Reviews, vol. 48, pp. 760–767, 2015.
[22] M. Protić, S. Shamshirband, M. H. Anisi et al., "Appraisal of soft computing methods for short term consumers' heat load prediction in district heating systems," Energy, vol. 82, pp. 697–704, 2015.
[23] S. Shamshirband, D. Petković, A. Amini et al., "Support vector regression methodology for wind turbine reaction torque prediction with power-split hydrostatic continuous variable transmission," Energy, vol. 67, pp. 623–630, 2014.
[24] H. Quan, D. Srinivasan, and A. Khosravi, "Short-term load and wind power forecasting using neural network-based prediction intervals," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 2, pp. 303–315, 2014.
[25] K. Bhaskar and S. N. Singh, "AWNN-assisted wind power forecasting using feed-forward neural network," IEEE Transactions on Sustainable Energy, vol. 3, no. 2, pp. 306–315, 2012.
[26] Heat Pump Centre Newsletters, International Energy Agency (IEA) Heat Pumping Technologies, vol. 30, no. 2, 2012.
[27] T. G. Dietterich, "An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization," Machine Learning, vol. 40, no. 2, pp. 139–157, 2000.
[28] Swedish District Heating Association, 2016, http://www.svenskfjarrvarme.se.
[29] K. P. Murphy, Machine Learning: A Probabilistic Perspective, Adaptive Computation and Machine Learning Series, MIT Press, 2012.
[30] L. Wang, Support Vector Machines: Theory and Applications, Springer, New York, NY, USA, 2005.
[31] D. Basak, S. Pal, and D. C. Patranabis, "Support vector regressions," Neural Information Processing—Letters and Reviews, vol. 11, no. 10, pp. 203–224, 2007.
[32] R. E. Edwards, J. New, and L. E. Parker, "Predicting future hourly residential electrical consumption: a machine learning case study," Energy and Buildings, vol. 49, pp. 591–603, 2012.
[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[34] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Wadsworth and Brooks, Monterey, Calif, USA, 1984.
[35] P. Vezza, C. Comoglio, M. Rosso, and A. Viglione, "Low flows regionalization in North-Western Italy," Water Resources Management, vol. 24, no. 14, pp. 4049–4074, 2010.
[36] L. Breiman and A. Cutler, "Random forests," January 2016, http://www.stat.berkeley.edu/users/breiman/RandomForests.
[37] MATLAB, Version 8.3.0, MathWorks, Natick, Mass, USA, 2014.
the need for accurate load forecasting has become more and more important. This is especially important in a district heating system with simultaneous production of heat, steam, and electricity.
In this paper, with the application of three different ML algorithms to predict heat consumption, we investigate the performance of Support Vector Regression (SVR), Partial Least Squares (PLS), and the random forest (RF) approach to develop heat load forecasting models by making a comparative study. Our focus is on low error, high accuracy, and validating our approach with real data. We also compare the error analysis of each algorithm with existing techniques (models) and find the most efficient one of the three.
The rest of the paper is organized as follows: Section 2 outlines the related work, where we provide an overview of many approaches to load prediction found in the literature. In Section 3, we provide some background information about DH concepts. This is followed by a presentation of the system framework and related prediction models in Section 4. Further, in Section 5, we present and discuss the evaluation and results. Finally, Section 6 concludes the paper.
2. Related Work
The state of the art in the area of energy (heating, cooling, and electric energy) demand estimation in buildings is classified into forward (classical) and data-driven (inverse) approaches [7]. While the forward modelling approach generally uses equations with physical parameters that describe the building as input, the inverse modelling approach uses machine-learning techniques. Here, the model takes the monitored building energy consumption data as input, expressed in terms of one or more driving variables and a set of empirical parameters, and such models are widely applied for various measurements and other aspects of building performance [8]. The main advantage of data-driven models is that they can also operate online, making the process easily updatable based on new data. Considering that ML models offer powerful tools for the discovery of patterns in large volumes of data, and given their ability to capture the nonlinear behavior of the heat demand, they represent a suitable technique to predict the energy demand at the consumer side.
Numerous ML models and methods have been applied to heat load prediction during the last decade. A good overview of some recent references is given by Mestekemper [6, 9], who also built his own prediction models using dynamic factor models. A simple model proposed by Dotzauer [10] uses the ambient temperature and a weekly pattern for prediction of the heat demand in DH. The author makes the social component equal to a constant value for all days of the week. There is another interesting model which addresses the utilization of a grey box that combines physical knowledge with mathematical modelling [11]. Some approaches to predict the heat load discussed in the literature include artificial neural networks (ANN) [12–15]. In [12], a backpropagation three-layered ANN is used for the prediction of the heat demand of different building samples. The inputs of the network for training and testing are building transparency ratio (%), orientation angles (degrees), and insulation thickness (cm), and the output is the building heating energy need (Wh). When the ANN outputs of this study are compared with numerical results, an average accuracy of 94.8–98.5% is achieved. The authors have shown that ANN is a powerful tool for the prediction of building energy needs. In [13], the authors discuss the way self-organizing maps (SOMs) and multilayer perceptrons (MLPs) can be used to develop a two-stage algorithm for the autonomous construction of prediction models. The problem of heat demand prediction in a district heating company is used as a case study, where SOM is used as a means of grouping similar customer profiles in the first stage and MLP is used for predicting heat demand in the second stage. However, the authors do not provide any information related to the error rates obtained during the predictions.
In [14], recurrent neural networks (RNNs) are used for heat load prediction in district heating and cooling systems. The authors compare their prediction results from the RNN with the prediction results obtained from a three-layered feedforward neural network (TLNN). The mean squared error between the TLNN and the stationary actual heat load is reported to be 21052, whereas it is 11822 between the RNN and the actual heat load data. In the nonstationary case, RNN still provides a lower mean squared error. The use of RNNs raises the expectation of capturing the trend of the heat load, since it uses heat load data for several days as the input.
In [15], historical consumption data and ambient temperatures were used as input parameters to forecast heat consumption one week into the future. The authors compared the performance of three black-box modelling techniques, SVR, PLS, and ANN, for the prediction of heat consumption in the Suseo DH network and analyzed the accuracy of each method by comparing forecasting errors. The authors report that for one-day-ahead forecasting the overall average error of PLS is 3.87%, while that of ANN and SVR is 6.54% and 4.95%, respectively. The maximum error of SVR is 9.82%, which is lower than that of PLS (16.47%) and ANN (13.20%). In terms of the overall error, the authors indicate that PLS exhibits better forecasting performance than ANN or SVR.
In [16], a multiple regression (MR) model is used for heat load forecasting; the reported MAE is 9.30. The model described in [17] uses an online machine-learning approach named Fast Incremental Model Trees with Drift Detection (FIMT-DD) for heat load prediction, which allows the flexibility of updating the model when the distribution of the target variable changes. The results of the study indicate that MAE and MAPE for FIMT-DD (using bagging) have lower values in comparison to Adaptive Model Rules (AMRules) and Instance Based Learner on Streams (IBLStreams).
The authors in [18] compare the performance of four supervised ML algorithms (MLR, FFN, SVR, and Regression Tree (RT)) by studying the effect of internal and external factors. The external factors include outdoor temperature, solar radiation, wind speed, and wind direction. The internal factors are related to the district heating system and include supply and return water pressure, supply and return water temperature, the difference between supply and return temperature, and circulation flow. Their study shows that SVR showed the best
Table 1: ML models for heat demand prediction in the literature.

Applied algorithms | [12] | [13] | [14] | [15] | [16] | [17] | [32] | [33] | [34] | [35] | [36] | Our work
ANN/FNN/SOM/RNN | + | + | + | + | − | − | + | − | − | + | − | −
MLR/MR/PLS | − | − | − | + | + | − | + | − | − | − | − | +
SVM/SVR | − | − | − | + | − | − | + | + | − | − | + | +
BN | − | − | − | − | − | − | − | − | + | − | − | −
DT/RF/RT | − | − | − | − | − | + | + | − | − | − | − | +
Ensembles | − | − | − | − | − | + | − | − | − | − | − | +
accuracy for heat load prediction on 1- to 24-hour horizons. However, the prediction accuracy decreases as the horizon rises from 1 to 18 hours.
Wu et al. [19] discuss and implement SVR as a predictive model on a building's historical energy use. Their predictive model proved to approximate current energy use, with some seasonal and customer-specific variations in the approximations. Another work [20] discusses the importance of load prediction in a smart energy grid network; the authors propose a BN to predict the total consumer water-heat consumption in households. Shamshirband et al. [21] construct an adaptive neuro-fuzzy inference system (ANFIS), which is a special case of the ANN family, to predict the heat load for individual consumers in a DH system. Their result indicates that further improvements of the model are required for prediction horizons greater than 1 hour. Protić et al. [22] study the relevance of short-term heat load prediction for operation control in a DH network. Here, the authors apply SVR to heat load prediction for a single substation with a time horizon of 15 minutes. To improve the predictive model, the authors also add a dummy variable to define the state of DH operation.
In the literature, the research towards developing load forecasting models is also discussed from different perspectives and used in different energy related applications, such as heat load in district heating, wind turbine reaction torque prediction [23], and wind power forecasting [24, 25].
In [23], SVR is employed for wind turbine torque prediction. The results show that an improvement in accuracy can be achieved and conclude that SVR can be considered a suitable alternative for prediction. It can also be seen that the proposed SVR prediction models produce higher accuracy compared to ANN and ANFIS (adaptive neuro-fuzzy inference system). The work discussed in [24] considers the penetration of renewable energies in electrical power systems, which increases the level of uncertainty. In such situations, traditional methods for forecasting load demand cannot properly handle these uncertainties. Hence, the authors implement a neural network method for constructing prediction intervals using a lower/upper bound estimation (LUBE) approach. The authors conduct a comparative analysis and show that this method can increase the quality of prediction intervals for load and wind power generation predictions.
Bhaskar and Singh [25] perform statistics-based wind power prediction using numerical weather prediction (NWP). In order to validate the effectiveness of the proposed method, the authors compare it with benchmark models such as persistence (PER) and new-reference (NR) and show that the proposed model outperforms these benchmarks.
Additionally, due to innovations in future sustainable and smart energy systems and recent technological trends with the IoT (Internet of Things), many research works [5, 26] consider DH systems an integral part of the Smart Grid within the smart city concept. Moreover, such a DH system model will require high computation time and resources for knowledge representation, knowledge inference, and operational optimization problems. Thus, in response to this, researchers are continuously focusing on the development and use of fast and efficient algorithms for real-time processing of energy and behavior related data.
In summary, previous research on heat load prediction points to various training algorithms: ANN, including RNN, FFN (Feedforward Neural Network), MLP, and SOM; MR, including MLR and PLS; SVM, including SVR; Bayesian networks (BN); decision trees (DT); ensemble methods [27]; FIMT-DD; AMRules; and IBLStreams.
In spite of the interest and the considerable efforts of the research community, there is so far no consensus among researchers on selecting the most suitable training model for heat load prediction, nor on selecting an appropriate set of input parameters for training the model [16], in order to achieve a high level of prediction accuracy. This is because the superiority of one model over another in heat load prediction cannot be asserted in general: the performance of each model depends on the structure of the prediction problem and the type of data available. The comparison in [15] already pointed to the superior performance of SVR; however, as our problem structure and inputs are different from theirs, we chose to compare several up-to-date models to find the most promising approach for our case. Table 1 lists models from the literature. The "plus" sign indicates that a particular algorithm has been applied, while "minus" means the opposite. Based on the table, we concluded that SVR, PLS, and RF provide us with a unique combination of models to compare with each other. Simplicity and efficiency of each model in our combination are preferred, such that a rapid and simple assessment of energy demand with high accuracy can be obtained.
3. District Heating Systems
District heating is a well-proven technology for heat supply from different energy sources through heat generation and distribution to heat consumers. DH systems are valuable
Figure 1: District heating block diagram. [Diagram: an energy source feeds a heat generation plant, which serves the heat demand (space heating and domestic hot water) through the DH network's supply and return pipes.]
infrastructure assets which enable effective resource utiliza-tion by incorporating the use of various energy sources Oneof the main advantages of DH system is that it facilitatesthe use of combined heat and power (CHP) generation andthereby makes the overall system efficient
District heating can play a crucial role in reaching some of the energy and environmental objectives by reducing CO2 emissions and improving overall energy efficiency. In a district heating system, heat is distributed through a network of hot-water pipes from heat-supplying plants to end users. The heat is mostly used for space heating and domestic hot water. A simplified schematic picture of a DH system is shown in Figure 1.
The main components of a district heating system are heat generation units, the distribution network, and customer substations. The heat generation unit may use heat-only boilers, CHP plants, or a combination of the two for heat generation. Various types of energy sources, like biomass, municipal solid waste, and industrial waste heat, can be used for heat production. The heat is then distributed to different customers through a network of pipelines. In the customer substations, the heat energy from the network is transferred to the end user's internal heating system.
The heat-supplying units are designed to meet the heat demand. The heat output to the network depends on the mass flow of the hot water and the temperature difference between the supply and the return line. The supply temperature of the hot water is controlled directly from the plant's control room based on the outdoor temperature, and it mostly follows a given operation temperature curve. The return temperature, on the other hand, depends mainly on the customer's heat usage and on other network-specific constraints. The level of the supply temperature differs from country to country; for instance, in Sweden the temperature level varies between 70 and 120°C depending on season and weather [28].
The heat load in a district heating system is the sum of all heat loads connected to the network plus distribution and other losses in the network.
With increased concerns about the environment, climate change, and the energy economy, DH is an obvious choice. Nowadays, district heating systems are equipped with advanced, cutting-edge technology and sensors that monitor and control production units remotely from a control room. From a smart city perspective, one of the remaining future challenges is to integrate district heating with the electricity sector as well as the transport sector. Heat load forecasting models with high accuracy are important to keep up with the rapid development in this direction.

Figure 2: Workflow scenario of the proposed heat load approach (collection of operational data; data preparation, i.e., aggregation and preprocessing; machine-learning algorithms: SVR, PLS, and RF; heat load forecast curves).
4. System Design and Application
As mentioned earlier, in this study we perform short-term prediction of heat consumption and evaluate the three ML methods. Building on the heat load system presented in our previous work [6], in this section we present and describe our heat load prediction approach in detail, as shown in Figure 2; it includes collection of operational data, data preparation, and the examined ML algorithms.
In this work, two main tasks are relevant to the system implementation: (a) data aggregation and preprocessing and (b) ML application, where the heat load prediction is approached with the supervised ML algorithms.
4.1. Operational Data Collection. The data used in this study are provided by Eidsiva Bioenergi AS, which operates one of Europe's most modern waste incineration plants, located in the city of Hamar, Norway. The plant produces district heating, process steam, and electricity. The data are collected by regular measurements that are part of the control system in a DH plant; the measurements are taken 24 times per day, that is, every hour. The dataset contains values of the parameters time of day (tD), forward temperature (FT), return temperature (RT), flow rate (FR), and heat load (HL). The data were collected in the period between October 1st, 2014, and April 30th, 2015. In Table 2 we present a portion of typical data samples for one day.

Table 2: Typical data samples for one day from the Eidsiva dataset.

Time (h)   FT (°C)   RT (°C)   Flow (m³/h)   HL (MW)
1          100.9     60.4      316.5         14.6
2           99.4     57.5      279.1         13.4
3           99.6     58.3      228.3         13.2
4          100.3     59.8      261.9         15.1
5          100.5     59.6      276.9         14.9
6          100.1     58.9      270.5         12.8
...
20          98.7     59.7      353.0         14.8
21         101.4     58.9      330.9         16.1
22         101.1     59.1      250.8         14.2
23         100.8     59.8      260.2         15.4
24         101.9     59.3      247.9         14.1
4.2. Data Preparation. In this module, activities related to preparing the data to be compatible with the ML module are performed; the module includes data aggregation and preprocessing. During data aggregation, we combine these sources with weather data (outdoor temperature), collected at the same interval as the other parameters. From the aggregation process we consequently obtain the following output parameters: outdoor temperature (OT), heat load (HL), forward temperature (FT), time of day (tD), and the difference between forward temperature (FT) and return temperature (RT), namely, DT.
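As an illustration of this preparation step, the sketch below shows how the aggregation and the derived DT feature could look in Python with pandas (a stand-in for the actual implementation; the column names and sample values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical hourly plant measurements (tD, FT, RT, HL as in the text;
# values are illustrative, not the Eidsiva data).
plant = pd.DataFrame({
    "tD": [1, 2, 3],             # time of day (hour)
    "FT": [100.9, 99.4, 99.6],   # forward temperature (deg C)
    "RT": [60.4, 57.5, 58.3],    # return temperature (deg C)
    "HL": [14.6, 13.4, 13.2],    # heat load (MW)
})
# Hypothetical outdoor temperature collected at the same hourly interval.
weather = pd.DataFrame({
    "tD": [1, 2, 3],
    "OT": [8.2, 7.9, 7.5],       # outdoor temperature (deg C)
})

# Aggregation: align the two hourly sources and derive DT = FT - RT.
data = plant.merge(weather, on="tD")
data["DT"] = data["FT"] - data["RT"]

features = data[["OT", "FT", "tD", "DT"]]  # input parameters named in the text
target = data["HL"]                        # quantity to predict
```

The merge step stands in for joining the plant's control-system log with the weather log on a common timestamp; in a real pipeline the join key would be a full datetime rather than an hour index.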
4.3. Machine-Learning Predictive Modelling. Machine learning (ML) is a very broad subject, ranging from abstract theory to very practical engineering, and even among machine-learning practitioners there is no widely accepted definition of what machine learning is. As a subfield of artificial intelligence whose objective is building models that learn from data, machine learning has made tremendous progress and found many applications in the last decade.
In general, ML can be defined as a set of methods that automatically detect and extract knowledge patterns in empirical data, such as sensor data or databases, and then use the discovered patterns to predict future data or to execute other types of decision-making under uncertainty. ML is divided into three principal groups: supervised learning (the predictive learning approach), unsupervised learning (the descriptive learning approach), and reinforcement learning [29]. In supervised learning, the algorithm is given data in which the "correct answer" for each example is known, while the target function $y = f(x)$ itself is unknown. At a very high level, the two steps of supervised learning are as follows: (i) train a machine-learning model using labeled data that consist of $N$ data pairs $(x_1, y_1), \ldots, (x_N, y_N)$, called instances, and (ii) make predictions on new data for which the label is unknown. Each instance is described by an input vector $x_i$, which incorporates a set of attributes $A = \{A_1, A_2, \ldots, A_m\}$, and a label $y_i$ of the target attribute that represents the wanted output. To summarize these two steps: the predictive model learns from past examples made up of inputs and outputs and then applies what is learned to future inputs in order to predict future outputs. Since we are making predictions on unseen data, that is, data not used to train the model, it is often said that the primary goal of supervised learning is to build models that generalize: the built machine-learning model should accurately predict the future rather than the past. The goal is therefore to train a model that can afterwards predict the label of new instances, approximating the target function.
Based on the type of the output variable $y_i$, supervised learning tasks are further divided into two types: classification and regression problems. In problems where the output variable $y_i$ is categorical or nominal (or belongs to a finite set), the ML tasks are known as classification or pattern recognition problems, whereas in regression problems the output variable is a real-valued scalar or takes continuous values.
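The two-step scheme described above can be sketched in a few lines; here scikit-learn and a toy linear target are used purely for illustration (the paper's models were built in MATLAB):

```python
from sklearn.linear_model import LinearRegression

# Step (i): fit a model on labelled pairs (x_i, y_i); the toy target is y = 2x.
X_train = [[0.0], [1.0], [2.0], [3.0]]   # instances x_i
y_train = [0.0, 2.0, 4.0, 6.0]           # labels y_i
model = LinearRegression().fit(X_train, y_train)

# Step (ii): predict the label of a new, previously unseen instance.
y_new = model.predict([[4.0]])           # approximates f(4) = 8
```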
4.3.1. Support Vector Regression (SVR). Support vector machines (SVM), a set of supervised learning algorithms based on statistical learning theory, are among the most successful and widely applied machine-learning methods for solving both regression and pattern recognition problems. Since the formulation of SVMs is based on structural risk minimization rather than empirical risk minimization, the algorithm often shows better performance than traditional ones. Support Vector Regression (SVR) is the variant of SVM designed specifically for regression. In SVR, the objective function (i.e., the error function to be minimized) is convex, meaning that the global optimum is always reached. This is in sharp contrast to artificial neural networks (ANNs), where, for instance, the classical backpropagation learning algorithm is prone to convergence to "bad" local minima [30, 31], which makes them harder to analyze theoretically. In practice, SVR has greatly outperformed ANNs in a wide range of applications [31].
In SVR, the input $x$ is first mapped into an $m$-dimensional feature space using a nonlinear mapping, and a linear model is then constructed in that feature space. Mathematically, the linear model $f(x, w)$ is given by

$$f(x, w) = \sum_{j=1}^{m} w_j g_j(x) + b, \qquad (1)$$

where $g_j(x)$, $j = 1, \ldots, m$, represents the set of nonlinear transformations and $b$ is the bias term, which is most of the time assumed to be zero; hence we omit this term.
The model obtained by SVR depends exclusively on a subset of the training data; at the same time, SVR tries to reduce model complexity by minimizing $\|w\|^2$. Consequently, the objective of SVR is to minimize the following function [32]:

$$\min \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$$
$$\text{such that} \quad y_i - f(x_i, w) \le \varepsilon + \xi_i^*,$$
$$f(x_i, w) - y_i \le \varepsilon + \xi_i,$$
$$\xi_i, \xi_i^* \ge 0, \quad i = 1, \ldots, n. \qquad (2)$$
In these equations, $\varepsilon$ is a new type of (insensitive) loss function, or a threshold, which denotes the desired error range for all points. The nonnegative variables $\xi_i$ and $\xi_i^*$ are called slack variables; they measure the deviation of training samples outside $\varepsilon$, guaranteeing that a solution exists for all $\varepsilon$. The parameter $C > 0$ is a penalty term used to determine the tradeoff between data fitting and flatness, and $w$ are the regression weights. In most cases, the optimization problem can be solved more easily if transformed into a dual problem. Utilizing Lagrange multipliers, the dualization method is applied as follows:
$$L = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) - \sum_{i=1}^{n} \alpha_i^* \left(\varepsilon + \xi_i^* - y_i + f(x_i, w)\right) - \sum_{i=1}^{n} \alpha_i \left(\varepsilon + \xi_i + y_i - f(x_i, w)\right) - \sum_{i=1}^{n} (\lambda_i \xi_i + \lambda_i^* \xi_i^*), \qquad (3)$$

where $L$ is the Lagrangian and $\alpha_i, \alpha_i^*, \lambda_i, \lambda_i^* \ge 0$ are called the Lagrange multipliers.

Considering the saddle point condition, it follows that the partial derivatives of $L$ with respect to the variables $(w, b, \xi_i, \xi_i^*)$ vanish at optimality. Proceeding with similar steps, we end up with the dual optimization problem. Finally, the solution of the dual problem is given by
$$f(x) = \sum_{i=1}^{n_{SV}} (\alpha_i - \alpha_i^*) \, K(x_i, x), \qquad (4)$$

where $0 \le \alpha_i^* \le C$, $0 \le \alpha_i \le C$, $n_{SV}$ is the number of support vectors, and $K$ is the kernel function, which, for two given vectors in input space, returns the dot product of their images in a higher-dimensional feature space. The kernel is given by
$$K(x, x_i) = \sum_{j=1}^{m} g_j(x) \, g_j(x_i). \qquad (5)$$

In order to map the input data to a higher-dimensional space and to handle nonlinearities between input vectors and their respective targets, we use as kernel the Gaussian radial basis function (RBF), which has $\gamma$ as its kernel parameter. Once the kernel is selected, we use grid search to identify the best pair of the regularization parameters $C$ and $\gamma$, that is, the pair with the best cross-validation accuracy.
4.3.2. Partial Least Squares (PLS). The Partial Least Squares (PLS) technique is a learning method based on a multiple linear regression model that takes the latent structure in both datasets into account. The dataset consists of explanatory variables $X_i$ and dependent variables $Y_j$. The model is linear, as can be seen in (6): for each sample $n$, the value $y_{nj}$ is

$$y_{nj} = \sum_{i=0}^{k} \beta_i x_{ni} + \varepsilon_{nj}. \qquad (6)$$
The PLS model is similar to a linear regression model; however, the way of calculating $\beta_i$ is different. The principle of PLS regression is that the data tables or matrices $X$ and $Y$ are decomposed into latent structures in an iterative process. The latent structure corresponding to the most variation of $Y$ is extracted and explained by the latent structure of $X$ that explains it best. PLS can thus be used for correlating the information in one data matrix $X$ to the information in another matrix $Y$. More specifically, PLS finds the fundamental relations between the two matrices, which are projected onto key factors such as $T$ and $U$, and linear regression is performed on the relation between these factors. Factor $U$ represents the most variation for $Y$, whereas factor $T$ denotes the variation for $X$, though it does not necessarily explain the most variation in $X$.
The first results of PLS are the model equations showing the $\beta$ coefficients that give the relationship between the variables $X$ and $Y$. These model equations are as follows:

$$Y = X\beta + \varepsilon,$$
$$Y = T_h C_h' + \varepsilon_h = X W_h^* C_h' + \varepsilon_h = X W_h (P_h' W_h)^{-1} C_h' + \varepsilon_h, \qquad (7)$$
where $Y$ is the matrix of dependent variables, $X$ is the matrix of explanatory variables, $T_h$, $C_h$, $W_h^*$, $W_h$, and $P_h$ are the matrices generated by the PLS algorithm, and $\varepsilon_h$ is the matrix of residuals. The matrix $\beta$ of the regression coefficients of $Y$ on $X$, with $h$ components generated by the PLS algorithm, is given by

$$\beta = W_h (P_h' W_h)^{-1} C_h'. \qquad (8)$$
Figure 3: Random forest algorithm process flow (the sample data is split into train data, used to grow the single trees, and Out-Of-Bag (OOB) data, used to estimate the error of each grown tree; predictors are selected at random and the data are split using the best predictors until the criteria for stopping tree growth are fulfilled; this is repeated until the specified number of trees is obtained, the random forest being the collection of all trees).
The advantage of PLS is that the algorithm takes the data structure in both the $X$ and $Y$ matrices into account. It also provides good visual results that help the interpretation of the data. Last but not least, PLS can model several response variables at the same time, taking their structure into account.
4.3.3. Random Forest (RF). The random forest algorithm, proposed by Breiman [33], is an ensemble-learning algorithm consisting of tree predictors, where the trees are built on various random features. It develops many decision trees based on random selections of data and of variables, providing the class of the dependent variable based on many trees.
The method is based on the combination of a large collection of decorrelated decision trees (i.e., Classification and Regression Trees (CART) [34]). Since all the trees are based on random selections of data as well as of variables, they are random trees, and many such random trees form a random forest. The name forest means that many decision trees are used to make a better classification of the dependent variable. The CART technique divides the learning sample using an algorithm known as binary recursive partitioning; this splitting or partitioning starts from the most important variable down to the less important ones and is applied to each of the new branches of the tree [35].
In order to increase the algorithm's accuracy and reduce the generalization error of the ensemble of trees, another technique, called bagging, is incorporated. The estimation of the generalization error is performed with the Out-Of-Bag (OOB) method. Bagging is applied to the training dataset to generate many copies of it, where each copy corresponds to a decision tree.
With the RF algorithm, each tree is grown as follows [36]:

(a) If the number of cases (observations) in the training set is $N$, sample $N$ cases at random, but with replacement, from the original data; this sample will be the training set for growing the tree.

(b) If there are $M$ input variables, a number $m \ll M$ is specified such that, at each node, $m$ variables are selected at random out of the $M$, and the best split on these $m$ is used to split the node. The value $m$ is held constant while the forest is grown.

(c) Each tree is grown to the largest extent possible.
The process flow of the random forest model is shown in Figure 3.
The error rate of RF primarily depends on the degree of correlation between any two trees and on the prediction accuracy of the individual trees. The principal advantages of the method are its ease of parallelization, its robustness, and its indifference to noise and outliers in most datasets. Due to the unbiased nature of its execution, the method also resists overfitting.
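The procedure above (bootstrap sampling, random predictor selection at each node, and OOB error estimation) can be sketched with scikit-learn's RandomForestRegressor on synthetic data; this is an illustrative stand-in for the paper's implementation, and the hyperparameter values are assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data with M = 3 input variables.
rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(300, 3))
y = 10 * X[:, 0] + 5 * X[:, 1] + rng.standard_normal(300)

forest = RandomForestRegressor(
    n_estimators=200,   # number of trees grown on bootstrap samples (bagging)
    max_features=2,     # m predictors tried at each node (step (b), m < M)
    oob_score=True,     # estimate generalization error from out-of-bag data
    random_state=0,
).fit(X, y)

oob_r2 = forest.oob_score_  # OOB estimate of R^2, no held-out test set needed
```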
Table 3: Performance results for one week.

                       Training                                  Testing
Algorithm   MAE (%)   MAPE (%)   Correlation coefficient   MAE (%)   MAPE (%)   Correlation coefficient
SVR          64.91      3.11            0.90                70.32      3.43            0.91
PLS         265.24     12.73            0.81               238.68     10.42            0.79
RF          173.82     19.77            0.83               159.32     18.61            0.84
Figure 4: Hourly heat load pattern in relation to outdoor temperature in the DH network (heat output in MW versus outdoor temperature in °C; heat load measurements, October 1 to April 30).
5. Performance Evaluation and Results
The proposed approach is implemented in MATLAB R2014a [37] and executed on a PC with an Intel Core i7 processor running at 2.7 GHz and 8 GB of RAM. As training dataset we select the data measured during the first 28 weeks, which consist of 4872 instances; as prediction period we choose the 29th week as test data, that is, 148 instances. In order to evaluate the performance of the proposed algorithms in terms of accuracy, we use the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the correlation coefficient, which measures the correlation between the actual and predicted heat load values. MAE and MAPE are defined as follows:
$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| p_i - y_i \right|,$$
$$\text{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{p_i - y_i}{p_i} \right|, \qquad (9)$$
where $p_i$ is the actual value, $y_i$ is the predicted value, and $n$ is the number of samples in the training set. We apply 10-fold cross-validation to obtain valid evaluation metrics. Figure 4 presents the heat load in the network with respect to the outdoor temperature; the higher values of the heat load occur during the days with lower outdoor temperatures, which reflects the increased consumption of heat.
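The metrics defined above are straightforward to compute directly; the sketch below does so with NumPy on a few hypothetical actual/predicted heat load pairs (illustrative values, not the paper's results):

```python
import numpy as np

# Hypothetical actual (p_i) and predicted (y_i) heat load values in MW.
p = np.array([14.6, 13.4, 13.2, 15.1])
y = np.array([14.2, 13.9, 13.0, 15.6])

mae = np.mean(np.abs(p - y))                  # MAE as in (9)
mape = np.mean(np.abs((p - y) / p)) * 100     # MAPE, expressed as a percentage
corr = np.corrcoef(p, y)[0, 1]                # correlation coefficient
```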
Figure 5(a) shows the actual heat load and the heat load predicted for one week with the SVR algorithm; Figure 5(b) shows the corresponding results for PLS, and Figure 5(c) shows the heat load forecast for one week based on RF, each together with the actual heat load data.
As can be seen from Figure 5, the heat load predicted with SVR is closest to the actual energy consumption, with a MAPE of about 3.43% and a correlation coefficient of 0.91; the correlation has been consistent throughout training and testing. The graphs presented in Figures 5(b) and 5(c), for PLS and RF, respectively, are less accurate, with higher errors. The performance of PLS is significantly lower than that of SVR. Concerning RF, as the trees became progressively less correlated, the error rate also declined significantly. The superior performance of SVR over the other two methods is attributed to its efficient modelling of the feature space and to the fact that SVR is less prone to overfitting and does not require any procedure to determine an explicit model form, as ordinary regression analysis does.
Table 3 outlines the results of the investigated ML algorithms for heat consumption prediction, for both the training phase (where we developed the models using supervised learning schemes) and the testing phase (to generalize to newly unseen data). From Table 3 it is evident that SVR shows the best prediction performance in terms of average errors and correlation coefficient, confirming the superiority of SVR over the other machine-learning methods. Therefore, provided that the number and type of the operating facilities are determined, SVR can be effectively applied in the management of a DHS. The mean absolute percentage error of 3.43% obtained in our approach with SVR is lower than or equal to the mean absolute percentage error of state-of-the-art approaches for heat load prediction in DHS. Moreover, SVR is also better than PLS and RF in terms of mean average errors and correlation coefficient. Nevertheless, direct comparison with other works is sometimes impossible due to different system implementations, input data, and experimental setups.
Figure 5: Hourly heat load prediction based on SVR, PLS, and RF for one week: (a) predicted versus actual demand with SVR; (b) predicted versus actual demand with PLS; (c) predicted versus actual demand with RF.

5.1. Comparison with State-of-the-Art Methods. Some prior research has been carried out to predict and analyze the heat demand in DH. However, due to different system designs, input data, and architecture implementations or experimental setups, it is sometimes difficult to perform a direct comparison. Our approach uses operational data from a DHS to model the heat demand, and since SVR exhibited the best prediction performance, we use this method for the comparison. In our case we obtained a smaller MAPE than [17], where the experimental results showed a MAPE of 4.77%, itself lower than or at least equal to the mean percentage error of state-of-the-art regression approaches that have been proposed for heat load forecasting in DH systems. Furthermore, the reported results from our study are also superior to the results presented in [15, 32]. More specifically, the SVR method we apply shows better results in terms of MAPE and correlation coefficient than [32], where the MAPE is 5.54% and the correlation coefficient is 0.87. As far as the comparison with [15] is concerned, our SVR method exhibits better hourly prediction performance in terms of MAPE, their MAPE for one week being 5.63%. On the other hand, their PLS method performs better than ours, with a MAPE value of about 8.99%.
6. Conclusion
The district heating (DH) sector can play an indispensable role in the current and future sustainable energy systems of the North European countries, where the share of DH in the total European heat market is significantly high. The innovation and emergence of new technology and the increasing focus on smart buildings challenge energy systems to adapt to customers' more flexible and individual solutions. Consequently, one needs to know more about what drives customers' requirements, choices, and priorities. Therefore, the creation and application of innovative IT ecosystems in DH are considered essential to improve energy efficiency.
Heat load prediction has attracted a lot of research interest in the past decade, since it can assist the energy efficiency of the DHS, which also leads to cost reductions for heat suppliers and to many environmental benefits.
In this paper, three ML algorithms for heat load prediction in a DH network are developed and presented: SVR, PLS, and RF. The heat load prediction models were developed using data from 29 weeks, and the predicted hourly results were compared with actual heat load data. The performances of the three ML algorithms were studied, compared, and analyzed. The SVR algorithm proved to be the most efficient one, producing the best performance in terms of average errors and correlation coefficient. Moreover, the prediction results were also compared against existing SVR and PLS methods in the literature, showing that the SVR presented in this paper produces better accuracy.

In conclusion, the comparison results validate the notion that the developed SVR method is appropriate for application in heat load prediction, or that it can serve as a promising alternative to existing models.
As for future work, apart from the outdoor temperature, we intend to incorporate other meteorological parameters influencing the heat load, such as wind speed and humidity.
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
The authors would like to thank Eidsiva Bioenergi AS, Norway, which kindly provided the data for this research work.
References
[1] Report on European Energy Security Strategy, Brussels, May 2014, http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52014DC0330&from=EN.
[2] A. R. Day, P. Ogumka, P. G. Jones, and A. Dunsdon, "The use of the planning system to encourage low carbon energy technologies in buildings," Renewable Energy, vol. 34, no. 9, pp. 2016-2021, 2009.
[3] IEA (International Energy Agency), Nordic Energy Technology Perspectives, OECD/IEA, Paris, France, 2013.
[4] A. Gebremedhin and H. Zinko, "Seasonal heat storages in district heating systems," in Proceedings of the 11th International Thermal Energy Storage for Energy Efficiency and Sustainability Conference (Effstock '09), Stockholm, Sweden, June 2009.
[5] H. Lund, S. Werner, R. Wiltshire et al., "4th Generation District Heating (4GDH): integrating smart thermal grids into future sustainable energy systems," Energy, vol. 68, pp. 1-11, 2014.
[6] F. Dalipi, S. Y. Yayilgan, and A. Gebremedhin, "A machine learning approach to increase energy efficiency in district heating systems," in Proceedings of the International Conference on Environmental Engineering and Computer Application (ICEECA '14), pp. 223-226, CRC Press, Hong Kong, December 2014.
[7] N. Fumo, "A review on the basics of building energy estimation," Renewable and Sustainable Energy Reviews, vol. 31, pp. 53-60, 2014.
[8] D. E. Claridge, "Using simulation models for building commissioning," in Proceedings of the International Conference on Enhanced Building Operation, Energy Systems Laboratory, Texas A&M University, October 2004.
[9] T. Mestekemper, Energy demand forecasting and dynamic water temperature management [Ph.D. thesis], Bielefeld University, Bielefeld, Germany, 2011.
[10] E. Dotzauer, "Simple model for prediction of loads in district-heating systems," Applied Energy, vol. 73, no. 3-4, pp. 277-284, 2002.
[11] H. A. Nielsen and H. Madsen, "Modelling the heat consumption in district heating systems using a grey-box approach," Energy and Buildings, vol. 38, no. 1, pp. 63-71, 2006.
[12] B. B. Ekici and U. T. Aksoy, "Prediction of building energy consumption by using artificial neural networks," Advances in Engineering Software, vol. 40, no. 5, pp. 356-362, 2009.
[13] M. Grzenda and B. Macukow, "Demand prediction with multi-stage neural processing," in Advances in Natural Computation and Data Mining, pp. 131-141, 2006.
[14] K. Kato, M. Sakawa, K. Ishimaru, S. Ushiro, and T. Shibano, "Heat load prediction through recurrent neural network in district heating and cooling systems," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '08), pp. 1401-1406, Singapore, October 2008.
[15] T. C. Park, U. S. Kim, L.-H. Kim, B. W. Jo, and Y. K. Yeo, "Heat consumption forecasting using partial least squares, artificial neural network and support vector regression techniques in district heating systems," Korean Journal of Chemical Engineering, vol. 27, no. 4, pp. 1063-1071, 2010.
[16] T. Catalina, V. Iordache, and B. Caracaleanu, "Multiple regression model for fast prediction of the heating energy demand," Energy and Buildings, vol. 57, pp. 302-312, 2013.
[17] S. Provatas, N. Lavesson, and C. Johansson, "An online machine learning algorithm for heat load forecasting in district heating systems," in Proceedings of the 14th International Symposium on District Heating and Cooling, Stockholm, Sweden, September 2014.
[18] S. Idowu, S. Saguna, C. Ahlund, and O. Schelen, "Forecasting heat load for smart district heating systems: a machine learning approach," in Proceedings of the IEEE International Conference on Smart Grid Communications (SmartGridComm '14), pp. 554-559, IEEE, Venice, Italy, November 2014.
[19] L. Wu, G. Kaiser, D. Solomon, R. Winter, A. Boulanger, and R. Anderson, Improving Efficiency and Reliability of Building Systems Using Machine Learning and Automated Online Evaluation, Columbia University Academic Commons, 2012, http://hdl.handle.net/10022/AC:P:13213.
[20] M. Vlachopoulou, G. Chin, J. C. Fuller, S. Lu, and K. Kalsi, "Model for aggregated water heater load using dynamic Bayesian networks," Tech. Rep., Pacific Northwest National Laboratory (PNNL), Richland, Wash, USA, 2012.
[21] S. Shamshirband, D. Petkovic, R. Enayatifar et al., "Heat load prediction in district heating systems with adaptive neuro-fuzzy method," Renewable and Sustainable Energy Reviews, vol. 48, pp. 760-767, 2015.
[22] M. Protic, S. Shamshirband, M. H. Anisi et al., "Appraisal of soft computing methods for short term consumers' heat load prediction in district heating systems," Energy, vol. 82, pp. 697-704, 2015.
[23] S. Shamshirband, D. Petkovic, A. Amini et al., "Support vector regression methodology for wind turbine reaction torque prediction with power-split hydrostatic continuous variable transmission," Energy, vol. 67, pp. 623-630, 2014.
[24] H. Quan, D. Srinivasan, and A. Khosravi, "Short-term load and wind power forecasting using neural network-based prediction intervals," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 2, pp. 303-315, 2014.
[25] K. Bhaskar and S. N. Singh, "AWNN-assisted wind power forecasting using feed-forward neural network," IEEE Transactions on Sustainable Energy, vol. 3, no. 2, pp. 306-315, 2012.
[26] Heat Pump Centre Newsletters, International Energy Agency (IEA) Heat Pumping Technologies, vol. 30, no. 2, 2012.
[27] T. G. Dietterich, "An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization," Machine Learning, vol. 40, no. 2, pp. 139-157, 2000.
[28] Swedish District Heating Association, 2016, http://www.svenskfjarrvarme.se.
[29] K. P. Murphy, Machine Learning: A Probabilistic Perspective, Adaptive Computation and Machine Learning Series, MIT Press, 2012.
[30] L. Wang, Support Vector Machines: Theory and Applications, Springer, New York, NY, USA, 2005.
[31] D. Basak, S. Pal, and D. C. Patranabis, "Support vector regressions," Neural Information Processing - Letters and Reviews, vol. 11, no. 10, pp. 203-224, 2007.
[32] R. E. Edwards, J. New, and L. E. Parker, "Predicting future hourly residential electrical consumption: a machine learning case study," Energy and Buildings, vol. 49, pp. 591-603, 2012.
[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[34] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Wadsworth and Brooks, Monterey, Calif, USA, 1984.
[35] P. Vezza, C. Comoglio, M. Rosso, and A. Viglione, "Low flows regionalization in North-Western Italy," Water Resources Management, vol. 24, no. 14, pp. 4049-4074, 2010.
[36] L. Breiman and A. Cutler, "Random forests," January 2016, http://www.stat.berkeley.edu/users/breiman/RandomForests.
[37] MATLAB, Version 8.3.0, MathWorks, Natick, Mass, USA, 2014.
Table 1: ML models for heat demand prediction in the literature.

Applied algorithms   [12] [13] [14] [15] [16] [17] [32] [33] [34] [35] [36] Our work
ANN/FNN/SOM/RNN        +    +    +    +    -    -    +    -    -    +    -    -
MLR/MR/PLS             -    -    -    +    +    -    +    -    -    -    -    +
SVM/SVR                -    -    -    +    -    -    +    +    -    -    +    +
BN                     -    -    -    -    -    -    -    -    +    -    -    -
DT/RF/RT               -    -    -    -    -    +    +    -    -    -    -    +
Ensembles              -    -    -    -    -    +    -    -    -    -    -    +
accuracy on heat load prediction for 1- to 24-hour horizons. However, the prediction accuracy decreases as the horizon grows from 1 to 18 hours.
Wu et al. [19] discuss and implement SVR as a predictive model for a building's historical energy use. Their predictive model proved to approximate current energy use, with some seasonal and customer-specific variations in the approximations. Another work [20] discusses the importance of load prediction in a smart energy grid network; the authors propose a BN to predict the total consumer water heat consumption in households. Shamshirband et al. [21] construct an adaptive neurofuzzy inference system (ANFIS), a special case of the ANN family, to predict heat load for individual consumers in a DH system. Their results indicate that further improvements of the model are required for prediction horizons greater than 1 hour. Protic et al. [22] study the relevance of short-term heat load prediction for operation control in a DH network. Here, the authors apply SVR to heat load prediction for a single substation at a time horizon of 15 minutes. To improve the predictive model, the authors also add a dummy variable defining the state of DH operation.
In the literature, research towards developing load forecasting models is also discussed from different perspectives and used in different energy-related applications, such as heat load in district heating, wind turbine reaction torque prediction [23], and wind power forecasting [24, 25].
In [23], SVR is employed for wind turbine torque prediction. The results show that an improvement in accuracy can be achieved, and the authors conclude that SVR can be considered a suitable alternative for prediction. It can also be seen that the proposed SVR prediction models produce higher accuracy compared to ANN and ANFIS (adaptive neurofuzzy inference system). The work discussed in [24] considers how the growing penetration of renewable energies in electrical power systems increases the level of uncertainty. In such situations, traditional methods for forecasting load demand cannot properly handle these uncertainties. Hence, the authors implement a neural network method for constructing prediction intervals using a lower/upper bound estimation (LUBE) approach. They conduct a comparative analysis and show that this method can increase the quality of prediction intervals for load and wind power generation predictions.
Bhaskar and Singh [25] perform statistics-based wind power prediction using numerical weather prediction (NWP). To validate the effectiveness of the proposed method, the authors compare it with benchmark models such as persistence (PER) and new-reference (NR) and show that the proposed model outperforms these benchmarks.
Additionally, due to innovations in future sustainable and smart energy systems and recent technological trends around the IoT (Internet of Things), many research works [5, 26] consider DH systems an integral part of the Smart Grid within the smart city concept. Moreover, such a DH system model will require substantial computation time and resources for knowledge representation, knowledge inference, and operational optimization problems. Thus, in response, researchers are continuously focusing on the development and use of fast and efficient algorithms for real-time processing of energy- and behavior-related data.
In summary, previous research on heat load prediction points to various training algorithms: ANN, including RNN, FFN (Feedforward Neural Network), MLP, and SOM; MR, including MLR and PLS; SVM, including SVR; Bayesian networks (BN); decision trees (DT); ensemble methods [27]; FIMT-DD; AMRules; and IBLStreams.
In spite of the interest and the considerable efforts of the research community so far, there is no consensus among researchers on selecting either the most suitable training model for heat load prediction or an appropriate set of input parameters to train it with [16] in order to achieve a high level of prediction accuracy. This is because the superiority of one model over another in heat load prediction cannot be asserted in general; the performance of each model depends on the structure of the prediction problem and the type of data available. The comparison in [15] already pointed to the superior performance of SVR; however, as our problem structure and inputs differ from theirs, we chose to compare several up-to-date models to find the most promising approach for our case. Table 1 lists models from the literature. The "plus" sign indicates that a particular algorithm has been applied, while "minus" means the opposite. Based on the table, we concluded that SVR, PLS, and RF provide a unique combination of models to compare with each other. Simplicity and efficiency of each model in the combination are preferred, such that a rapid and simple assessment of energy demand with high accuracy can be obtained.
3 District Heating Systems
District heating is a well-proven technology for heat supply from different energy sources through heat generation and distribution to heat consumers. DH systems are valuable
Figure 1: District heating block diagram (energy source → heat generation plant → DH network with supply and return pipes → heat demand: space heating and domestic hot water).
infrastructure assets which enable effective resource utilization by incorporating the use of various energy sources. One of the main advantages of a DH system is that it facilitates the use of combined heat and power (CHP) generation and thereby makes the overall system efficient.
District heating can play a crucial role in reaching some of the energy and environmental objectives by reducing CO2 emissions and improving overall energy efficiency. In a district heating system, heat is distributed through a network of hot-water pipes from heat-supplying plants to end users. The heat is mostly used for space heating and domestic hot water. A simplified schematic picture of a DH system is shown in Figure 1.
The main components of a district heating system are heat generation units, the distribution network, and customer substations. The heat generation unit may use heat-only boilers, CHP plants, or a combination of the two for heat generation. Various types of energy sources, such as biomass, municipal solid waste, and industrial waste heat, can be used for heat production. The heat is then distributed to different customers through a network of pipelines. In the customer substations, the heat energy from the network is transferred to the end user's internal heating system.
The heat-supplying units are designed to meet the heat demand. The heat output to the network depends on the mass flow of the hot water and the temperature difference between the supply and the return line. The supply temperature of the hot water is controlled directly from the plant's control room based on the outdoor temperature, and it mostly follows a given operation temperature curve. The return temperature, on the other hand, depends mainly on the customer's heat usage as well as other network-specific constraints. The level of the supply temperature differs from country to country. For instance, in Sweden the temperature level varies between 70 and 120 °C, depending on season and weather [28].
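The heat output relation just described, output equal to mass flow times specific heat times the supply-return temperature difference, can be checked numerically. The water properties below (density and specific heat capacity) are assumed textbook values, not figures from the paper, and the sample operating point is only illustrative:

```python
def heat_output_mw(flow_m3_per_h, supply_c, return_c, rho=960.0, cp=4.19):
    """Q = m_dot * cp * (T_supply - T_return).

    rho is an assumed water density (kg/m^3) near DH supply temperatures,
    and cp an assumed specific heat (kJ/(kg*K)); both are illustrative.
    """
    m_dot_kg_s = flow_m3_per_h / 3600.0 * rho          # mass flow, kg/s
    q_kw = m_dot_kg_s * cp * (supply_c - return_c)     # heat output, kW
    return q_kw / 1000.0                               # in MW

# An operating point of the magnitude seen in the dataset used later:
# roughly 316 m^3/h at 100.9/60.4 degC supply/return.
q_mw = heat_output_mw(316.5, 100.9, 60.4)
```

With magnitudes like these, the computed output lands in the low tens of megawatts, the same scale as the heat loads reported later in the paper.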
The heat load in a district heating system is the sum of all customer heat loads connected to the network plus distribution and other losses in the network.
With increased concerns about the environment, climate change, and the energy economy, DH is an obvious choice. Nowadays, district heating systems are equipped with advanced, cutting-edge technology systems and sensors that monitor and control production units remotely from a control room. From a smart city perspective, one of the remaining future challenges is to integrate district heating with the electricity sector as well as the transport sector. Heat load forecasting models with high accuracy are
Figure 2: Workflow scenario of the proposed heat load approach (collection of operational data → data preparation (aggregation and preprocessing) → machine-learning algorithms (SVR, PLS, and RF) → heat load forecast curves).
important to keep up with the rapid development in thisdirection
4 System Design and Application
As mentioned earlier, in this study we perform short-term prediction of heat consumption and evaluate the three ML methods. Building on the heat load system presented in our previous work [6], in this section we present and describe our heat load prediction approach in detail, as shown in Figure 2; it includes collection of operational data, data preparation, and the examined ML algorithms.
In this work, two main tasks are relevant to the system implementation: (a) data aggregation and preprocessing and (b) ML application, where heat load prediction is approached with the supervised ML algorithms.
4.1. Operational Data Collection. The data we have used in this study is provided by Eidsiva Bioenergi AS, which operates
Table 2: Typical data samples for one day from the Eidsiva dataset.

Time (h) | FT (°C) | RT (°C) | Flow (m³/h) | HL (MW)
   1     |  100.9  |  60.4   |    316.5    |  14.6
   2     |   99.4  |  57.5   |    279.1    |  13.4
   3     |   99.6  |  58.3   |    228.3    |  13.2
   4     |  100.3  |  59.8   |    261.9    |  15.1
   5     |  100.5  |  59.6   |    276.9    |  14.9
   6     |  100.1  |  58.9   |    270.5    |  12.8
   …     |    …    |   …     |      …      |   …
  20     |   98.7  |  59.7   |    353.0    |  14.8
  21     |  101.4  |  58.9   |    330.9    |  16.1
  22     |  101.1  |  59.1   |    250.8    |  14.2
  23     |  100.8  |  59.8   |    260.2    |  15.4
  24     |  101.9  |  59.3   |    247.9    |  14.1
one of Europe's most modern waste incineration plants, located in the city of Hamar, Norway. The plant produces district heating, process steam, and electricity. These data are collected through regular measurements that are part of the control system in a DH plant. The measurements consist of 24 readings per day, that is, one every hour. The dataset contains values of the parameters time of day (tD), forward temperature (FT), return temperature (RT), flow rate (FR), and heat load (HL). The data were collected in the period between October 1st, 2014, and April 30th, 2015. In Table 2 we present a portion of typical data samples for one day.
4.2. Data Preparation. In this module, activities related to preparing the data to be compatible with the ML module are performed. The module includes data aggregation and preprocessing. During the process of data aggregation, we combine the sources of these data with the weather data (outdoor temperature), which is collected at the same interval as the previous parameters. Consequently, we obtain from the aggregation process the following output parameters: outdoor temperature (OT), heat load (HL), forward temperature (FT), time of day (tD), and the difference between forward temperature (FT) and return temperature (RT), namely, DT.
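The aggregation step can be sketched as follows: each hourly operational record is joined with the outdoor temperature measured at the same timestamp, and the derived feature DT = FT − RT is added. The record layout and field names below are illustrative assumptions, not the plant's actual data schema:

```python
def aggregate(records, outdoor_temp):
    """Join hourly DH records with outdoor temperature and derive DT.

    records:      list of dicts with keys 'tD', 'FT', 'RT', 'HL'
    outdoor_temp: dict mapping 'tD' -> outdoor temperature (OT)
    Field names are illustrative placeholders.
    """
    rows = []
    for r in records:
        row = dict(r)
        row["OT"] = outdoor_temp[r["tD"]]       # attach weather data
        row["DT"] = r["FT"] - r["RT"]           # forward minus return temp
        rows.append(row)
    return rows

# One illustrative hourly record joined with an outdoor temperature reading
sample = aggregate(
    [{"tD": 1, "FT": 100.9, "RT": 60.4, "HL": 14.6}],
    {1: -5.2},
)
```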
4.3. Machine-Learning Predictive Modelling. Machine learning (ML) is a very broad subject, ranging from abstract theory to everyday practice, and even among machine-learning practitioners there is no single well-accepted definition of what machine learning is. As a subfield of artificial intelligence whose objective is building models that learn from data, machine learning has seen tremendous improvements and applications in the last decade.
In general, ML can be defined as a set of methods that automatically detect and extract knowledge patterns in empirical data, such as sensor data or databases, and then use the discovered patterns to predict future data or execute other types of decision-making under uncertainty. ML is divided into three principal groups: supervised learning (the predictive learning approach), unsupervised learning (the descriptive learning approach), and reinforcement learning [29]. In supervised learning, the algorithm is given data in which the "correct answer" for each example is told, and a defining property of supervised learning is that the target function y = f(x) is unknown. At a very high level, the two steps of supervised learning are as follows: (i) train a machine-learning model using labeled data that consist of N data pairs (x_1, y_1), ..., (x_N, y_N), called instances, and (ii) make predictions on new data for which the label is unknown. Each instance is described by an input vector x_i, which incorporates a set of attributes A = {A_1, A_2, ..., A_m}, and a label y_i of the target attribute that represents the desired output. To summarize these two steps: the predictive model learns from past examples made up of inputs and outputs and then applies what it has learned to future inputs in order to predict future outputs. Since we are making predictions on unseen data, that is, data not used to train the model, it is often said that the primary goal of supervised learning is to build models that generalize: the built machine-learning model accurately predicts the future rather than the past. The goal is therefore to train a model that can afterwards predict the label of new instances and approximate the target function.
Based on the type of the output variable y_i, supervised learning tasks are further divided into two types: classification and regression problems. In problems where the output variable y_i is categorical or nominal (or belongs to a finite set), the ML tasks are known as classification or pattern recognition problems, whereas in regression problems the output variable is a real-valued scalar or takes continuous values.
4.3.1. Support Vector Regression (SVR). Support vector machines (SVM), a set of supervised learning algorithms based on statistical learning theory, are among the most successful and widely applied machine-learning methods for solving both regression and pattern recognition problems. Since the formulation of SVMs is based on structural risk minimization rather than empirical risk minimization, the algorithm shows better performance than traditional ones. Support Vector Regression (SVR) is the SVM method specifically for regression. In SVR, the objective function (e.g., the error function to be minimized) is convex, meaning that the global optimum is always reached. This is in sharp contrast to artificial neural networks (ANNs), where, for instance, the classical backpropagation learning algorithm is prone to convergence to "bad" local minima [30, 31], which makes them harder to analyze theoretically. In practice, SVR has greatly outperformed ANNs in a wide range of applications [31].
In SVR, the input x is first mapped into an m-dimensional feature space by a nonlinear mapping, and a linear model is then constructed in that feature space. Mathematically, the linear model f(x, w) is given by

    f(x, w) = Σ_{j=1}^{m} w_j g_j(x) + b,    (1)
where g_j(x), j = 1, ..., m, represents the set of nonlinear transformations, while b is the bias term, which is most often assumed to be zero; hence we omit this term.
The model obtained by SVR depends exclusively on a subset of the training data; at the same time, SVR tries to reduce model complexity by minimizing ‖w‖². Consequently, the objective of SVR is to minimize the following function [32]:
    min (1/2)‖w‖² + C Σ_{i=1}^{n} (ξ_i + ξ*_i)

    such that  y_i − f(x_i, w) ≤ ε + ξ*_i,
               f(x_i, w) − y_i ≤ ε + ξ_i,
               ξ_i, ξ*_i ≥ 0,  i = 1, ..., n.    (2)
In these equations, ε is a new type of (insensitive) loss function, or a threshold, which denotes the desired error range for all points. The nonnegative variables ξ_i and ξ*_i are called slack variables; they measure the deviation of training samples outside ε, that is, guaranteeing that a solution exists for all ε. The parameter C > 0 is a penalty term used to determine the trade-off between data fitting and flatness, and w are the regression weights. In most cases, the optimization problem can be solved more easily if transformed into a dual problem. Using Lagrange multipliers, the dualization method is applied as follows:
    L = (1/2)‖w‖² + C Σ_{i=1}^{n} (ξ_i + ξ*_i)
        − Σ_{i=1}^{n} α*_i (ε + ξ*_i − y_i + f(x_i, w))
        − Σ_{i=1}^{n} α_i (ε + ξ_i + y_i − f(x_i, w))
        − Σ_{i=1}^{n} (λ_i ξ_i + λ*_i ξ*_i),    (3)
where L is the Lagrangian and α_i, α*_i, λ_i, λ*_i ≥ 0 are called the Lagrange multipliers.
Considering the saddle point condition, it follows that the partial derivatives of L with respect to the variables (w, b, ξ_i, ξ*_i) vanish at optimality. Proceeding with similar steps, we end up with the dual optimization problem. Finally, the solution of the dual problem is given by
    f(x) = Σ_{i=1}^{n_SV} (α_i − α*_i) K(x_i, x),    (4)
where 0 ≤ α*_i ≤ C, 0 ≤ α_i ≤ C, n_SV is the number of support vectors, and K is the kernel function, which, given two vectors in input space, returns the dot product of their images in the higher dimensional feature space. The kernel is given by
    K(x, x_i) = Σ_{j=1}^{m} g_j(x) g_j(x_i).    (5)
In order to map the input data to a higher dimensional space and to handle nonlinearities between the input vectors and their respective targets, we use as a kernel the Gaussian radial basis function (RBF), which has γ as its kernel parameter. Once the kernel is selected, we use grid search to identify the best pair of the regularization parameter C and γ, that is, the pair with the best cross-validation accuracy.
4.3.2. Partial Least Squares (PLS). The Partial Least Squares (PLS) technique is a learning method based on a multiple linear regression model that takes into account the latent structure in both datasets. The dataset consists of explanatory variables X_i and dependent variables Y_j. The model is linear, as can be seen in (6): for each sample n, the value y_nj is

    y_nj = Σ_{i=0}^{k} β_i x_ni + ε_nj.    (6)
The PLS model is similar to a linear regression model; however, the way of calculating β_i is different. The principle of PLS regression is that the data tables, or matrices, X and Y are decomposed into latent structures in an iterative process. The latent structure corresponding to the most variation of Y is extracted and explained by the latent structure of X that explains it best.
As a multivariate regression model, PLS can be used for correlating the information in one data matrix X with the information in another matrix Y. More specifically, PLS finds the fundamental relations between the two matrices X and Y, which are projected onto several key factors, such as T and U, and linear regression is performed on the relation between these factors. Factor U represents the most variation of Y, whereas factor T denotes the variation of X, though it does not necessarily explain the most variation in X.
The first results of PLS are the model equations showing the β coefficients that give the relationship between the variables X and Y. These model equations are as follows:

    Y = Xβ + ε,
    Y = T_h C′_h + ε_h = X W*_h C′_h + ε_h
      = X W_h (P′_h W_h)^{-1} C′_h + ε_h,    (7)
where Y is the matrix of dependent variables, X is the matrix of explanatory variables, T_h, C_h, W*_h, W_h, and P_h are the matrices generated by the PLS algorithm, and ε_h is the matrix of residuals. The matrix β of the regression coefficients of Y on X with h components generated by the PLS algorithm is given by

    β = W_h (P′_h W_h)^{-1} C′_h.    (8)
Figure 3: Random forest algorithm process flow. Sample data is split into train data (used to grow the single trees) and Out-Of-Bag (OOB) data (used to estimate the error of the grown tree). For each tree, predictors are randomly selected (feature selection) and the data is split using the best predictors (grow tree) until the criteria for stopping tree growth are fulfilled; the OOB error is then estimated by applying the tree to the OOB data. The process repeats until the specified number of trees is obtained, and the random forest is the collection of all trees.
The advantage of PLS is that the algorithm allows taking into account the data structure in both the X and Y matrices. It also provides helpful visual results that aid the interpretation of data. Last but not least, PLS can model several response variables at the same time, taking their structure into account.
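The iterative latent-structure extraction described above can be made concrete with a minimal NIPALS-style sketch for a single response variable (PLS1). This is an illustrative implementation under simplifying assumptions (mean-centered data, single y), not the code used in the study; with the full number of components it recovers the ordinary least squares solution, which the demo exploits:

```python
import numpy as np

def pls1(X, y, n_components):
    """Minimal NIPALS-style PLS1: returns regression coefficients beta."""
    X = X - X.mean(axis=0)                 # center X (simplification)
    y = y - y.mean()                       # center y
    W, P, Q = [], [], []
    for _ in range(n_components):
        w = X.T @ y                        # direction of max covariance with y
        w = w / np.linalg.norm(w)
        t = X @ w                          # scores
        p = X.T @ t / (t @ t)              # X loadings
        q = (y @ t) / (t @ t)              # y loading
        X = X - np.outer(t, p)             # deflate X
        y = y - q * t                      # deflate y
        W.append(w); P.append(p); Q.append(q)
    W, P = np.array(W).T, np.array(P).T
    # beta = W (P' W)^(-1) q, the same shape as Eq. (8)
    return W @ np.linalg.solve(P.T @ W, np.array(Q))

# Demo: noiseless linear data; full-rank PLS recovers the true coefficients.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
beta_true = np.array([2.0, -1.0, 0.5])
beta_hat = pls1(X, X @ beta_true, 3)
```

With fewer components than the rank of X, the same routine gives the regularized, latent-structure solution that distinguishes PLS from plain multiple regression.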
4.3.3. Random Forest (RF). The random forest algorithm, proposed by Breiman [33], is an ensemble-learning algorithm consisting of tree predictors, where the trees are formulated based on various random features. It develops many decision trees based on random selections of data and of variables, providing the value of the dependent variable based on many trees.
This method is based on the combination of a large collection of decorrelated decision trees (i.e., Classification and Regression Trees (CART) [34]). Since all the trees are based on random selections of data as well as variables, these are random trees, and many such random trees lead to a random forest. The name forest means that many decision trees are used to make a better prediction of the dependent variable. The CART technique divides the learning sample using an algorithm known as binary recursive partitioning. This splitting, or partitioning, starts from the most important variable and proceeds to the less important ones, and it is applied to each of the new branches of the tree [35].
In order to increase the algorithm's accuracy and reduce the generalization error of the ensemble of trees, another technique, called bagging, is incorporated. The estimation of the generalization error is performed with the Out-Of-Bag (OOB) method. Bagging is used on the training dataset to generate many copies of it, where each copy corresponds to a decision tree.
With the RF algorithm, each tree is grown as follows [36]:

(a) If the number of cases (observations) in the training set is N, sample N cases at random, but with replacement, from the original data; this sample will be the training set for growing the tree.

(b) If there are M input variables, a number m ≪ M is specified such that, at each node, m variables are selected at random out of the M, and the best split on these m is used to split the node. The value m is held constant during forest growing.

(c) Each tree is grown to the largest extent possible.
The process flow in the random forest models is shown inFigure 3
The error rate of RF depends primarily on the degree of correlation between any two trees and on the prediction accuracy of the individual trees. The principal advantages of this method are the ease of parallelization, robustness, and indifference to noise and outliers in most datasets. Due to the random, decorrelated nature of its construction, this method is resistant to overfitting.
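A deliberately tiny version of the procedure can make steps (a)-(c) concrete: bootstrap-sample the training set, restrict each tree to a random subset of m of the M features (a full implementation re-draws the m features at every node rather than once per tree), grow a greedy regression tree, and average the trees' predictions. This is an illustrative sketch on synthetic data, not the implementation evaluated in the paper:

```python
import numpy as np

def grow_tree(X, y, feats, depth=3, min_leaf=5):
    """Greedy regression tree restricted to the feature subset `feats`."""
    if depth == 0 or len(y) < 2 * min_leaf:
        return float(y.mean())
    best = None
    for j in feats:
        for s in np.unique(X[:, j])[:-1]:          # candidate split points
            left = X[:, j] <= s
            if min_leaf <= left.sum() <= len(y) - min_leaf:
                sse = (((y[left] - y[left].mean()) ** 2).sum()
                       + ((y[~left] - y[~left].mean()) ** 2).sum())
                if best is None or sse < best[0]:
                    best = (sse, j, s)
    if best is None:
        return float(y.mean())
    _, j, s = best
    left = X[:, j] <= s
    return (j, s,
            grow_tree(X[left], y[left], feats, depth - 1, min_leaf),
            grow_tree(X[~left], y[~left], feats, depth - 1, min_leaf))

def predict_tree(node, x):
    while isinstance(node, tuple):                 # descend to a leaf
        j, s, lo, hi = node
        node = lo if x[j] <= s else hi
    return node

def random_forest(X, y, n_trees=25, m=1, seed=0):
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(y), len(y))              # (a) bootstrap
        feats = rng.choice(X.shape[1], size=m, replace=False)  # (b) m of M
        trees.append(grow_tree(X[idx], y[idx], feats))         # (c) grow
    return trees

def predict_forest(trees, x):
    return float(np.mean([predict_tree(t, x) for t in trees]))

# Synthetic demo: feature 0 drives the target, feature 1 is pure noise.
rng = np.random.default_rng(3)
X = rng.uniform(-10.0, 10.0, (200, 2))
y = np.where(X[:, 0] > 0.0, 5.0, 15.0)
forest = random_forest(X, y, n_trees=25, m=1)
```

The samples not drawn in a tree's bootstrap would form its OOB set, on which the generalization error can be estimated as in Figure 3.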
Table 3: Performance results for one week.

          |              Training                  |               Testing
Algorithm | MAE (%) | MAPE (%) | Corr. coefficient | MAE (%) | MAPE (%) | Corr. coefficient
SVR       |  6.491  |   3.11   |       0.90        |  7.032  |   3.43   |       0.91
PLS       | 26.524  |  12.73   |       0.81        | 23.868  |  10.42   |       0.79
RF        | 17.382  |  19.77   |       0.83        | 15.932  |  18.61   |       0.84
Figure 4: Hourly heat load pattern in relation to outdoor temperature in the DH network (heat output in MW against outdoor temperature in °C, heat load measurements from October 1 to April 30).
5 Performance Evaluation and Results
The proposed approach is implemented in MATLAB R2014a [37] and executed on a PC with an Intel Core i7 processor at 2.7 GHz and 8 GB of RAM. In this work, as the training dataset we select the data measured during the first 28 weeks, which consist of 4872 instances. As the prediction period, we choose the 29th week as test data, that is, 148 instances. In order to evaluate the performance of the proposed algorithms in terms of accuracy of the results, we use the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the correlation coefficient, which measures the correlation between the actual and predicted heat load values. MAE and MAPE are defined as follows:
    MAE = (1/n) Σ_{i=1}^{n} |p_i − y_i|,

    MAPE = (1/n) Σ_{i=1}^{n} |(p_i − y_i) / p_i|,    (9)
where p_i is the actual value, y_i is the predicted value, and n is the number of samples in the training set.
We apply 10-fold cross-validation to obtain valid evaluation metrics. Figure 4 presents the heat load in the network with respect to the outdoor temperature. We see that the higher values of the heat load occur during the days with lower outdoor temperature values, which reflects the increased consumption of heat.
Figure 5(a) shows the actual heat load and the heat load prediction for one week based on the SVR algorithm. In Figure 5(b), results of heat load prediction for one week based on PLS are shown together with the actual heat load, whereas results of heat load forecasting for one week based on RF are shown in Figure 5(c).
As can be seen from Figure 5, the heat load predicted with SVR is closest to the actual energy consumption, with a MAPE value of about 3.43% and a correlation coefficient of 0.91. The correlation has been consistent throughout training and testing. On the other hand, the graphs presented in Figures 5(b) and 5(c) for PLS and RF, respectively, are less accurate, with higher errors. The performance of PLS is significantly lower compared to SVR. Concerning RF, as the trees became progressively less correlated, the error rate also declined significantly. The superior performance of SVR over the other two methods is attributed to its efficient modelling of the feature space and to the facts that SVR is less prone to overfitting and that it does not require any procedure to determine the explicit functional form, as ordinary regression analysis does.
Table 3 outlines the results of the investigated ML algorithms for heat consumption prediction for both the training phase (where we developed the models using supervised learning schemes) and the testing phase (to generalize to newly unseen data). From Table 3, it is evident that SVR shows the best prediction performance in terms of average errors and correlation coefficient, confirming the superiority of SVR over the other machine-learning methods. Therefore, provided that the number and type of the operating facilities are determined, SVR can be effectively applied in the management of a DHS. The mean absolute percentage error of 3.43% obtained in our approach with SVR is lower than or equal to the mean absolute percentage error of state-of-the-art approaches for heat load prediction in DHS. Moreover, SVR is also better than PLS and RF in terms of mean absolute error and correlation coefficient. Nevertheless, it is sometimes impossible to perform a direct comparison with other works due to different system implementations, input data, and structures of the experimental setup.
5.1. Comparison with State-of-the-Art Methods. Some prior research has been carried out to predict and analyze the heat demand in DH. However, due to different system designs, input data, and architecture implementations or structures of the experimental setup, it is sometimes difficult to perform a direct comparison. Our approach uses operational data from a DHS to model the heat demand, and since SVR exhibited the best prediction performance, we use this method to perform the comparison. In our case, we obtained a smaller MAPE compared to [17], whose experimental results showed a MAPE of 4.77%, itself lower than or at least equal to the mean percentage error of state-of-the-art regression approaches proposed for heat load forecasting in DH systems. Furthermore, the reported results
Applied Computational Intelligence and Soft Computing 9
Figure 5: Hourly heat load prediction based on SVR, PLS, and RF for one week (heat load in MW, hourly values from Monday to Sunday; actual HL versus predicted HL). (a) Predicted versus actual demand with SVR; (b) predicted versus actual demand with PLS; (c) predicted versus actual demand with RF.
from our study are also superior to the results presented in [15, 32]. More specifically, the SVR method we apply shows better results in terms of MAPE and correlation coefficient than [32], where the MAPE is 5.54% and the correlation coefficient is 0.87. As far as the comparison with [15] is concerned, in terms of MAPE our SVR method exhibits better hourly prediction performance, as the MAPE for one week there is 5.63%. On the other hand, the PLS method of [15] performs better than ours, having an MAPE value of about 8.99%.
6. Conclusion
The district heating (DH) sector can play an indispensable role in the current and future sustainable energy systems of the North European countries, where the share of DH in the total European heat market is significantly high. The innovations and emergence of new technology and the increasing focus on smart buildings force energy systems to face the challenge of adapting to customers' more flexible and individual solutions. Consequently, one needs to know more about what drives customers' requirements, choices, and priorities. Therefore, the creation and application of innovative IT ecosystems in DH are considered essential to improve energy efficiency.
Heat load prediction has attracted a lot of interest from researchers in the past decade, since it can assist the energy efficiency of a DHS, which in turn leads to cost reductions for heat suppliers and to many environmental benefits.
In this paper, three ML algorithms for heat load prediction in a DH network are developed and presented: SVR, PLS, and RF. The heat load prediction models
were developed using data from 29 weeks. The predicted hourly results were compared with actual heat load data. The performances of these three different ML algorithms were studied, compared, and analyzed. The SVR algorithm proved to be the most efficient one, producing the best performance in terms of average errors and correlation coefficient. Moreover, the prediction results were also compared against existing SVR and PLS methods in the literature, showing that the SVR presented in this paper produces better accuracy.
In conclusion, the comparison results validate the notion that the developed SVR method is appropriate for application in heat load prediction, and that it can serve as a promising alternative to existing models.
As for future work, apart from the outdoor temperature, we intend to incorporate other meteorological parameters influencing the heat load, such as wind speed and humidity.
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
The authors would like to thank Eidsiva Bioenergi AS, Norway, which kindly provided the data for this research work.
References
[1] Report on European Energy Security Strategy, Brussels, May 2014, http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52014DC0330&from=EN.
[2] A. R. Day, P. Ogumka, P. G. Jones, and A. Dunsdon, "The use of the planning system to encourage low carbon energy technologies in buildings," Renewable Energy, vol. 34, no. 9, pp. 2016–2021, 2009.
[3] IEA (International Energy Agency), Nordic Energy Technology Perspectives, OECD/IEA, Paris, France, 2013.
[4] A. Gebremedhin and H. Zinko, "Seasonal heat storages in district heating systems," in Proceedings of the 11th International Thermal Energy Storage for Energy Efficiency and Sustainability Conference (Effstock '09), Stockholm, Sweden, June 2009.
[5] H. Lund, S. Werner, R. Wiltshire et al., "4th Generation District Heating (4GDH): integrating smart thermal grids into future sustainable energy systems," Energy, vol. 68, pp. 1–11, 2014.
[6] F. Dalipi, S. Y. Yayilgan, and A. Gebremedhin, "A machine learning approach to increase energy efficiency in district heating systems," in Proceedings of the International Conference on Environmental Engineering and Computer Application (ICEECA '14), pp. 223–226, CRC Press, Hong Kong, December 2014.
[7] N. Fumo, "A review on the basics of building energy estimation," Renewable and Sustainable Energy Reviews, vol. 31, pp. 53–60, 2014.
[8] D. E. Claridge, "Using simulation models for building commissioning," in Proceedings of the International Conference on Enhanced Building Operation, Energy Systems Laboratory, Texas A&M University, October 2004.
[9] T. Mestekemper, Energy demand forecasting and dynamic water temperature management [Ph.D. thesis], Bielefeld University, Bielefeld, Germany, 2011.
[10] E. Dotzauer, "Simple model for prediction of loads in district-heating systems," Applied Energy, vol. 73, no. 3-4, pp. 277–284, 2002.
[11] H. A. Nielsen and H. Madsen, "Modelling the heat consumption in district heating systems using a grey-box approach," Energy and Buildings, vol. 38, no. 1, pp. 63–71, 2006.
[12] B. B. Ekici and U. T. Aksoy, "Prediction of building energy consumption by using artificial neural networks," Advances in Engineering Software, vol. 40, no. 5, pp. 356–362, 2009.
[13] M. Grzenda and B. Macukow, "Demand prediction with multi-stage neural processing," in Advances in Natural Computation and Data Mining, pp. 131–141, 2006.
[14] K. Kato, M. Sakawa, K. Ishimaru, S. Ushiro, and T. Shibano, "Heat load prediction through recurrent neural network in district heating and cooling systems," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '08), pp. 1401–1406, Singapore, October 2008.
[15] T. C. Park, U. S. Kim, L.-H. Kim, B. W. Jo, and Y. K. Yeo, "Heat consumption forecasting using partial least squares, artificial neural network and support vector regression techniques in district heating systems," Korean Journal of Chemical Engineering, vol. 27, no. 4, pp. 1063–1071, 2010.
[16] T. Catalina, V. Iordache, and B. Caracaleanu, "Multiple regression model for fast prediction of the heating energy demand," Energy and Buildings, vol. 57, pp. 302–312, 2013.
[17] S. Provatas, N. Lavesson, and C. Johansson, "An online machine learning algorithm for heat load forecasting in district heating systems," in Proceedings of the 14th International Symposium on District Heating and Cooling, Stockholm, Sweden, September 2014.
[18] S. Idowu, S. Saguna, C. Ahlund, and O. Schelen, "Forecasting heat load for smart district heating systems: a machine learning approach," in Proceedings of the IEEE International Conference on Smart Grid Communications (SmartGridComm '14), pp. 554–559, IEEE, Venice, Italy, November 2014.
[19] L. Wu, G. Kaiser, D. Solomon, R. Winter, A. Boulanger, and R. Anderson, Improving Efficiency and Reliability of Building Systems Using Machine Learning and Automated Online Evaluation, Columbia University Academic Commons, 2012.
[20] M. Vlachopoulou, G. Chin, J. C. Fuller, S. Lu, and K. Kalsi, "Model for aggregated water heater load using dynamic Bayesian networks," Tech. Rep., Pacific Northwest National Laboratory (PNNL), Richland, Wash, USA, 2012.
[21] S. Shamshirband, D. Petkovic, R. Enayatifar et al., "Heat load prediction in district heating systems with adaptive neuro-fuzzy method," Renewable and Sustainable Energy Reviews, vol. 48, pp. 760–767, 2015.
[22] M. Protic, S. Shamshirband, M. H. Anisi et al., "Appraisal of soft computing methods for short term consumers' heat load prediction in district heating systems," Energy, vol. 82, pp. 697–704, 2015.
[23] S. Shamshirband, D. Petkovic, A. Amini et al., "Support vector regression methodology for wind turbine reaction torque prediction with power-split hydrostatic continuous variable transmission," Energy, vol. 67, pp. 623–630, 2014.
[24] H. Quan, D. Srinivasan, and A. Khosravi, "Short-term load and wind power forecasting using neural network-based prediction intervals," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 2, pp. 303–315, 2014.
[25] K. Bhaskar and S. N. Singh, "AWNN-assisted wind power forecasting using feed-forward neural network," IEEE Transactions on Sustainable Energy, vol. 3, no. 2, pp. 306–315, 2012.
[26] Heat Pump Centre Newsletters, International Energy Agency (IEA) Heat Pumping Technologies, vol. 30, no. 2, 2012.
[27] T. G. Dietterich, "An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization," Machine Learning, vol. 40, no. 2, pp. 139–157, 2000.
[28] Swedish District Heating Association, 2016, http://www.svenskfjarrvarme.se.
[29] K. P. Murphy, Machine Learning: A Probabilistic Perspective, Adaptive Computation and Machine Learning Series, MIT Press, 2012.
[30] L. Wang, Support Vector Machines: Theory and Applications, Springer, New York, NY, USA, 2005.
[31] D. Basak, S. Pal, and D. C. Patranabis, "Support vector regressions," Neural Information Processing - Letters and Reviews, vol. 11, no. 10, pp. 203–224, 2007.
[32] R. E. Edwards, J. New, and L. E. Parker, "Predicting future hourly residential electrical consumption: a machine learning case study," Energy and Buildings, vol. 49, pp. 591–603, 2012.
[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[34] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Wadsworth and Brooks, Monterey, Calif, USA, 1984.
[35] P. Vezza, C. Comoglio, M. Rosso, and A. Viglione, "Low flows regionalization in North-Western Italy," Water Resources Management, vol. 24, no. 14, pp. 4049–4074, 2010.
[36] L. Breiman and A. Cutler, "Random forests," January 2016, http://www.stat.berkeley.edu/users/breiman/RandomForests.
[37] MATLAB Version 8.3.0, MathWorks, Natick, Mass, USA, 2014.
Figure 1: District heating block diagram. An energy source feeds a heat generation plant; heat is delivered over the DH network (supply pipe and return pipe) to meet the heat demand for space heating and domestic hot water.
infrastructure assets which enable effective resource utilization by incorporating the use of various energy sources. One of the main advantages of a DH system is that it facilitates the use of combined heat and power (CHP) generation and thereby makes the overall system efficient.
District heating can play a crucial role in reaching some of the energy and environmental objectives by reducing CO2 emissions and improving overall energy efficiency. In a district heating system, heat is distributed through a network of hot-water pipes from heat-supplying plants to end users. The heat is mostly used for space heating and domestic hot water. A simplified schematic picture of a DH system is shown in Figure 1.
The main components of a district heating system are heat generation units, the distribution network, and customer substations. The heat generation unit may use heat-only boilers, CHP plants, or a combination of the two for heat generation. Various types of energy sources, like biomass, municipal solid waste, and industrial waste heat, can be used for heat production. The heat is then distributed to different customers through a pipeline network. In the customer substations, the heat energy from the network is transferred to the end user's internal heating system.
The heat-supplying units are designed to meet the heat demand. The heat output to the network depends on the mass flow of the hot water and the temperature difference between the supply and the return line. The supply temperature of the hot water is controlled directly from the plant's control room based on the outdoor temperature, and it mostly follows a given operation temperature curve. The return temperature, on the other hand, depends mainly on the customer's heat usage and on other network-specific constraints. The level of the supply temperature differs from country to country. For instance, in Sweden the temperature level varies between 70 and 120°C depending on season and weather [28].
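As an illustration of such an operation temperature curve, the sketch below uses a purely hypothetical linear curve clamped to the 70-120°C band cited for Swedish systems; the slope and base value are invented for illustration only and are not taken from any real plant:

```python
def supply_temperature(outdoor_temp_c):
    # Hypothetical linear operation curve: the colder it is outside,
    # the hotter the supply water, clamped to the 70-120 C band [28].
    t = 90.0 - 1.5 * outdoor_temp_c  # illustrative slope and base value
    return max(70.0, min(120.0, t))
```

With this illustrative curve, an outdoor temperature of -10°C would map to a 105°C supply temperature, while very mild or very cold weather saturates at the band limits.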
The heat load in a district heating system is the sum of all heat loads that are connected to the network plus the distribution and other losses in the network.
With increased concerns about the environment, climate change, and the energy economy, DH is an obvious choice. Nowadays, district heating systems are equipped with advanced, cutting-edge technology systems and sensors that monitor and control production units remotely from a control room. From a smart city perspective, one of the remaining future challenges is to integrate district heating with the electricity sector as well as the transport sector. Heat load forecasting models with high accuracy are
Figure 2: Workflow scenario of the proposed heat load approach: collection of operational data, data preparation (aggregation and preprocessing), machine learning algorithms (SVR, PLS, and RF), and heat load forecast curves.
important to keep up with the rapid development in thisdirection
4. System Design and Application
As mentioned earlier, in this study we perform short-term prediction of heat consumption and evaluate the three ML methods. Building on the heat load system presented in our previous work [6], in this section we present and describe our heat load prediction approach in detail, as shown in Figure 2, which includes the collection of operational data, data preparation, and the examined ML algorithms.
In this work, two main tasks are relevant to the system implementation: (a) data aggregation and preprocessing and (b) the ML application, where the heat load prediction is approached with the supervised ML algorithms.
4.1. Operational Data Collection. The data used in this study are provided by Eidsiva Bioenergi AS, which operates
Table 2: Typical data samples for one day from the Eidsiva dataset.

Time (h)   FT (°C)   RT (°C)   Flow (m³/h)   HL (MW)
1          100.9     60.4      316.5         14.6
2           99.4     57.5      279.1         13.4
3           99.6     58.3      228.3         13.2
4          100.3     59.8      261.9         15.1
5          100.5     59.6      276.9         14.9
6          100.1     58.9      270.5         12.8
...
20          98.7     59.7      353.0         14.8
21         101.4     58.9      330.9         16.1
22         101.1     59.1      250.8         14.2
23         100.8     59.8      260.2         15.4
24         101.9     59.3      247.9         14.1
one of Europe's most modern waste incineration plants, located in the city of Hamar, Norway. The plant produces district heating, process steam, and electricity. These data are collected by regular measurements that are part of the control system in a DH plant. The measurements consist of 24 measurements per day, that is, one every hour. The dataset contains values of the parameters time of day (tD), forward temperature (FT), return temperature (RT), flow rate (FR), and heat load (HL). The data were collected in the period between October 1st, 2014 and April 30th, 2015. In Table 2 we present a portion of typical data samples for one day.
4.2. Data Preparation. This module performs the activities needed to make the data compatible with the ML module, namely data aggregation and preprocessing. During the process of data aggregation we combine these sources with weather data (outdoor temperature), which is collected at the same interval as the previous parameters. Consequently, the aggregation process yields the following output parameters: outdoor temperature (OT), heat load (HL), forward temperature (FT), time of day (tD), and the difference between forward temperature (FT) and return temperature (RT), namely DT.
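A minimal sketch of this aggregation step, assuming hourly records keyed by timestamp; the field names and sample values are illustrative, not the actual Eidsiva data:

```python
def aggregate(plant, weather):
    # Join plant measurements with outdoor temperature on the timestamp
    # and derive tD (hour of day) and DT = FT - RT.
    rows = []
    for ts, m in sorted(plant.items()):
        if ts in weather:  # keep only hours present in both sources
            rows.append({"tD": int(ts[11:13]), "OT": weather[ts],
                         "FT": m["FT"], "DT": m["FT"] - m["RT"], "HL": m["HL"]})
    return rows

# illustrative one-hour sample (values are placeholders)
plant = {"2014-10-01 01:00": {"FT": 100.9, "RT": 60.4, "HL": 14.6}}
weather = {"2014-10-01 01:00": 5.2}  # hypothetical outdoor temperature
```

The inner join on the timestamp drops hours where either source is missing a reading, which is one simple way to keep the aggregated dataset consistent.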
4.3. Machine-Learning Predictive Modelling. Machine learning (ML) is a very broad subject, ranging from very abstract theory to extreme practice. It turns out that even amongst machine-learning practitioners there is no widely accepted definition of what machine learning is. As a subfield of artificial intelligence with its objective of building models that learn from data, machine learning has made tremendous improvements and found numerous applications in the last decade.
In general, ML can be defined as a set of methods that can automatically detect and extract knowledge patterns in empirical data, such as sensor data or databases, and then use the discovered patterns to predict future data or execute other kinds of decision-making under uncertainty. ML is divided into three principal groups: supervised learning (the predictive learning approach), unsupervised learning (the descriptive learning approach), and reinforcement learning [29]. In supervised learning the algorithm is given data in which the "correct answer" for each example is told, and the main property of supervised learning is that the target function y = f(x) is unknown. At a very high level, the two steps of supervised learning are as follows: (i) train a machine-learning model using labeled data that consist of N data pairs (x_1, y_1), ..., (x_N, y_N), called instances, and (ii) make predictions on new data for which the label is unknown. Each instance is described by an input vector x_i, which incorporates a set of attributes A = {A_1, A_2, ..., A_m}, and a label y_i of the target attribute that represents the wanted output. To summarize these two steps: the predictive model learns from past examples made up of inputs and outputs and then applies what is learned to future inputs in order to predict future outputs. Since we are making predictions on unseen data, that is, data not used to train the model, it is often said that the primary goal of supervised learning is to build models that generalize, that is, models that accurately predict the future rather than the past. Therefore, the goal is to train a model that can afterwards predict the label of new instances and thereby approximate the target function.
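The two steps above can be illustrated with the simplest possible supervised learner, a one-nearest-neighbour model over scalar inputs; this is a toy sketch of the train/predict split, not one of the methods compared in this paper:

```python
def train(labeled_pairs):
    # Step (i): "training" a 1-nearest-neighbour model just memorises
    # the labelled instances (x_i, y_i).
    return list(labeled_pairs)

def predict(model, x):
    # Step (ii): label a new, unseen input with the label of the
    # closest training input.
    return min(model, key=lambda pair: abs(pair[0] - x))[1]
```

Even in this degenerate learner, the separation is the same as in SVR, PLS, and RF: a training step that consumes labelled pairs and a prediction step that maps new inputs to outputs.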
Based on the type of the output variable y_i, supervised learning tasks are further divided into two types: classification and regression problems. In problems where the output variable y_i is categorical or nominal (or belongs to a finite set), the ML tasks are known as classification or pattern recognition problems, whereas in regression problems the output variable is a real-valued scalar or takes continuous values.
4.3.1. Support Vector Regression (SVR). Support vector machines (SVM), a set of supervised learning algorithms based on statistical learning theory, are one of the most successful and widely applied machine-learning methods for solving both regression and pattern recognition problems. Since the formulation of SVMs is based on structural risk minimization rather than empirical risk minimization, the algorithm shows better performance than traditional ones. Support Vector Regression (SVR) is the SVM method for regression. In SVR the objective function (e.g., the error function that may need to be minimized) is convex, meaning that the global optimum is always reached. This is in sharp contrast to artificial neural networks (ANNs), where, for instance, the classical backpropagation learning algorithm is prone to convergence to "bad" local minima [30, 31], which makes them harder to analyze theoretically. In practice, SVR has greatly outperformed ANNs in a wide range of applications [31].
In SVR the input x is first mapped into an m-dimensional feature space by using a nonlinear mapping, and as a subsequent step a linear model is constructed in that feature space. Mathematically, the linear model f(x, w) is given by

f(x, w) = \sum_{j=1}^{m} w_j g_j(x) + b,   (1)
where g_j(x), j = 1, ..., m, represents the set of nonlinear transformations and b is the bias term, which is most of the time assumed to be zero; hence we omit this term.
The model obtained by SVR depends exclusively on a subset of the training data; at the same time, SVR tries to reduce model complexity by minimizing \|w\|^2. Consequently, the objective of SVR is to minimize the following function [32]:

\min \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)

such that

y_i - f(x_i, w) \le \varepsilon + \xi_i^*,
f(x_i, w) - y_i \le \varepsilon + \xi_i,
\xi_i, \xi_i^* \ge 0, \quad i = 1, \ldots, n.   (2)
In these equations, \varepsilon is a new type of (insensitive) loss function, or a threshold, which denotes the desired error range for all points. The nonnegative variables \xi_i and \xi_i^* are called slack variables; they measure the deviation of training samples outside \varepsilon, thereby guaranteeing that a solution exists for all \varepsilon. The parameter C > 0 is a penalty term used to determine the tradeoff between data fitting and flatness, and w are the regression weights. In most cases the optimization problem can be solved more easily if transformed into a dual problem. Utilizing Lagrange multipliers, the dualization method is applied as follows:
L = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)
    - \sum_{i=1}^{n} \alpha_i^* (\varepsilon + \xi_i^* - y_i + f(x_i, w))
    - \sum_{i=1}^{n} \alpha_i (\varepsilon + \xi_i + y_i - f(x_i, w))
    - \sum_{i=1}^{n} (\lambda_i \xi_i + \lambda_i^* \xi_i^*),   (3)

where L is the Lagrangian and \alpha_i, \alpha_i^*, \lambda_i, \lambda_i^* \ge 0 are called the Lagrange multipliers.

Considering the saddle point condition, it follows that the partial derivatives of L with respect to the variables (w, b, \xi_i, \xi_i^*) vanish at optimality. By proceeding with similar steps we end up with the dual optimization problem. Finally, the solution of the dual problem is given by
f(x) = \sum_{i=1}^{n_{SV}} (\alpha_i - \alpha_i^*) K(x_i, x),   (4)

where 0 \le \alpha_i^* \le C, 0 \le \alpha_i \le C, n_{SV} is the number of support vectors, and K is the kernel function, which for two given vectors in the input space returns the dot product of their images in a higher dimensional feature space. The kernel is given by

K(x, x_i) = \sum_{j=1}^{m} g_j(x) g_j(x_i).   (5)
In order to map the input data to a higher dimensional space and to handle nonlinearities between the input vectors and their respective targets, we use as kernel the Gaussian radial basis function (RBF), which has \gamma as its kernel parameter. Once the kernel was selected, we used grid search to identify the best pair of the regularization parameter C and \gamma, that is, the pair with the best cross-validation accuracy.
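Once the dual coefficients (\alpha_i - \alpha_i^*) and the support vectors are known, the prediction in (4) with the RBF kernel reduces to a weighted kernel sum. A self-contained sketch of that evaluation (function and variable names are ours, not from the paper):

```python
import math

def rbf_kernel(x, xi, gamma):
    # Gaussian RBF: K(x, x_i) = exp(-gamma * ||x - x_i||^2)
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))

def svr_predict(x, support_vectors, dual_coefs, gamma, bias=0.0):
    # Eq. (4): f(x) = sum_i (alpha_i - alpha_i^*) K(x_i, x), plus a bias term
    return bias + sum(c * rbf_kernel(sv, x, gamma)
                      for sv, c in zip(support_vectors, dual_coefs))
```

The grid search over (C, \gamma) happens outside this function: one model is trained per candidate pair and the pair with the best cross-validation score is kept.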
4.3.2. Partial Least Squares (PLS). The Partial Least Squares (PLS) technique is a learning method based on a multiple linear regression model that takes the latent structure of both datasets into account. The dataset consists of explanatory variables X_i and dependent variables Y_j. The model is linear, as can be seen in (6): for each sample n the value y_{nj} is

y_{nj} = \sum_{i=0}^{k} \beta_i x_{ni} + \varepsilon_{nj}.   (6)
The PLS model is similar to a linear regression model; however, the way of calculating \beta_i is different. The principle of PLS regression is that the data tables or matrices X and Y are decomposed into latent structures in an iterative process. The latent structure corresponding to the most variation of Y is extracted and explained by the latent structure of X that explains it best.

PLS can thus be viewed as a multivariate regression method for correlating the information in one data matrix X with the information in another matrix Y. More specifically, PLS finds the fundamental relations between the two matrices X and Y, which are projected onto several key factors such as T and U, and linear regression is performed on the relation between these factors. Factor U represents the most variation of Y, whereas factor T denotes the variation of X, although it does not necessarily explain the most variation in X.
The first results of PLS are the model equations showing the \beta coefficients that give the relationship between the variables X and Y. These model equations are as follows:

Y = X\beta + \varepsilon,
Y = T_h C'_h + \varepsilon_h = X W^*_h C'_h + \varepsilon_h
  = X W_h (P'_h W_h)^{-1} C'_h + \varepsilon_h,   (7)
where Y is the matrix of dependent variables, X is the matrix of explanatory variables, T_h, C_h, W^*_h, W_h, and P_h are the matrices generated by the PLS algorithm, and \varepsilon_h is the matrix of residuals. The matrix \beta of the regression coefficients of Y on X, with h components generated by the PLS algorithm, is given by

\beta = W_h (P'_h W_h)^{-1} C'_h.   (8)
Figure 3: Random forest algorithm process flow. From the sample data, train data are drawn to grow each single tree (random selection of predictors for feature selection; the data are split using the best predictors until the criteria for stopping tree growth are fulfilled), while the Out-Of-Bag (OOB) data are used to estimate the error of the grown tree; this is repeated until the specified number of trees is obtained, and the random forest is the collection of all trees.
The advantage of PLS is that the algorithm allows taking the data structure of both the X and Y matrices into account. It also provides great visual results that help the interpretation of data. Last but not least, PLS can model several response variables at the same time, taking their structure into account.
4.3.3. Random Forest (RF). The random forest algorithm proposed by Breiman [33] is an ensemble-learning algorithm consisting of tree predictors, where the trees are formulated based on various random features. It develops many decision trees based on random selections of data and of variables, providing the class of the dependent variable based on many trees.
This method is based on the combination of a large collection of decorrelated decision trees (i.e., Classification and Regression Trees (CART) [34]). Since all the trees are based on a random selection of data as well as variables, they are random trees, and many such random trees lead to a random forest. The name forest means that many decision trees are used to make a better classification of the dependent variable. The CART technique divides the learning sample using an algorithm known as binary recursive partitioning. This splitting or partitioning starts from the most important variable and proceeds to the less important ones, and it is applied to each of the new branches of the tree [35].
In order to increase the algorithm's accuracy and reduce the generalization error of the ensemble of trees, another technique, called bagging, is incorporated. The estimation of the generalization error is performed with the Out-Of-Bag (OOB) method. Bagging is used on the training dataset to generate many copies of it, where each one corresponds to a decision tree.
With the RF algorithm, each tree is grown as follows [36]:
(a) If the number of cases (observations) in the training set is N, sample N cases at random, but with replacement, from the original data; this sample will be the training set for growing the tree.

(b) If there are M input variables, a number m << M is specified such that, at each node, m variables are selected at random out of the M and the best split on these m is used to split the node. The value m is held constant while the forest is grown.

(c) Each tree is grown to the largest extent possible.
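Step (a), the bootstrap draw whose never-drawn cases form the Out-Of-Bag set, can be sketched as follows (function and variable names are ours):

```python
import random

def bootstrap_sample(n, rng):
    # Step (a): draw n case indices with replacement; the indices that
    # are never drawn form the Out-Of-Bag (OOB) set used to estimate
    # the error of the tree grown on the in-bag cases.
    in_bag = [rng.randrange(n) for _ in range(n)]
    oob = sorted(set(range(n)) - set(in_bag))
    return in_bag, oob
```

On average roughly a third of the cases end up in the OOB set, which is what makes the OOB error estimate possible without a separate validation split.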
The process flow in the random forest models is shown inFigure 3
The error rate of RF primarily depends on the degree of correlation between any two trees and on the prediction accuracy of the individual trees. The principal advantages of this method are its ease of parallelization, its robustness, and its indifference to noise and outliers in most datasets. Due to the unbiased nature of its execution, this method largely avoids overfitting.
Table 3: Performance results for one week.

                        Training                          Testing
Algorithm   MAE      MAPE (%)   Corr. coeff.   MAE      MAPE (%)   Corr. coeff.
SVR         0.6491    3.11       0.90          0.7032    3.43       0.91
PLS         2.6524   12.73       0.81          2.3868   10.42       0.79
RF          1.7382   19.77       0.83          1.5932   18.61       0.84
Figure 4: Hourly heat load pattern in relation to outdoor temperature in the DH network (heat output in MW and outdoor temperature in °C, for the heat load measurements from October 1 to April 30).
5. Performance Evaluation and Results
The proposed approach is implemented in MATLAB R2014a [37] and executed on a PC with an Intel Core i7 processor at 2.7 GHz and 8 GB of RAM. As the training dataset we select the data measured during the first 28 weeks, which consist of 4872 instances. As the prediction period we choose the 29th week as test data, that is, 148 instances. In order to evaluate the performance of the proposed algorithms in terms of accuracy, we use the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the correlation coefficient, which measures the correlation between the actual and predicted heat load values. MAE and MAPE are defined as follows:
MAE =sum
119899
119894=1
1003816
1003816
1003816
1003816
119901
119894minus 119910
119894
1003816
1003816
1003816
1003816
119899
MAPE =sum
119899
119894=1
1003816
1003816
1003816
1003816
1003816
(119901
119894minus 119910
1015840
119894) 119901
119894
1003816
1003816
1003816
1003816
1003816
119899
(9)
where 119901119894is the actual value 119910
119894is the predicted value and 119899 is
the number of samples in the training setWe apply a 10-fold cross-validation for obtaining the valid
evaluation metrics. Figure 4 presents the heat load in the network with respect to the outdoor temperature. We see that the higher heat load values occur during the days with lower outdoor temperatures, which in fact reflects the increased consumption of heat.
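The error metrics of eq. (9) and the correlation coefficient can be computed directly. A minimal sketch in Python (the paper itself uses MATLAB), with `actual` holding the p values and `predicted` the y values; the ×100 factor reports MAPE as a percentage, as in Table 3:

```python
def mae(actual, predicted):
    # Mean absolute error over n paired samples.
    return sum(abs(p - y) for p, y in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    # Mean absolute percentage error; actual values must be nonzero.
    # The x100 factor reports the result as a percentage.
    return 100.0 * sum(abs((p - y) / p)
                       for p, y in zip(actual, predicted)) / len(actual)

def correlation(actual, predicted):
    # Pearson correlation coefficient between the two series.
    n = len(actual)
    ma, mp = sum(actual) / n, sum(predicted) / n
    cov = sum((a - ma) * (b - mp) for a, b in zip(actual, predicted))
    sa = sum((a - ma) ** 2 for a in actual) ** 0.5
    sp = sum((b - mp) ** 2 for b in predicted) ** 0.5
    return cov / (sa * sp)
```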
Figure 5(a) shows the actual heat load together with the one-week heat load prediction based on the SVR algorithm. Figure 5(b) shows the one-week prediction based on PLS together with the actual heat load, whereas the one-week heat load forecast based on RF is shown in Figure 5(c).
As can be seen from Figure 5, the heat load predicted with SVR is closest to the actual energy consumption, with a MAPE value of about 3.43% and a correlation coefficient of 0.91; the correlation is consistent between training and testing. On the other hand, the graphs presented in Figures 5(b) and 5(c), for PLS and RF, respectively, are less accurate, with higher errors. The performance of PLS is significantly lower compared to SVR. Concerning RF, as the trees became progressively uncorrelated, the error rate also declined significantly. The better performance of SVR over the other two methods is attributed to its efficient modelling of the feature space, to the fact that SVR is less prone to overfitting, and to the fact that it does not require a procedure to determine an explicit functional form, as ordinary regression analysis does.
Table 3 outlines the results of the investigated ML algorithms for heat consumption prediction for both the training phase (where we developed the models using supervised learning schemes) and the testing phase (to generalize to newly unseen data). From Table 3 it is evident that SVR shows the best prediction performance in terms of average errors and correlation coefficient, confirming the superiority of SVR over the other machine-learning methods. Therefore, provided that the number and type of the operating facilities are determined, SVR can be effectively applied in the management of a DHS. The mean absolute percentage error of 3.43% obtained in our approach with SVR is lower than or equal to that of state-of-the-art approaches for heat load prediction in DHS. Moreover, SVR is also better than PLS and RF in terms of mean absolute errors and correlation coefficient. Nevertheless, direct comparison with other works is sometimes impossible due to differences in system implementation, input data, and experimental setup.
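The evaluation protocol used in this section, training on the first weeks and testing on the week that follows, is a chronological (non-shuffled) split. A minimal sketch, with the instance counts taken from the text:

```python
def chronological_split(samples, train_size=4872):
    # Time-ordered split: no shuffling, the test period follows the
    # training period; 4872 is the training count reported in the text.
    return samples[:train_size], samples[train_size:]

# Toy usage with the reported instance counts (4872 train, 148 test):
hourly = list(range(5020))
train, test = chronological_split(hourly)
# len(train) == 4872 and len(test) == 148
```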
5.1 Comparison with State-of-the-Art Methods. Some prior research has been carried out to predict and analyze the heat demand in DH. However, due to differences in system design, input data, architecture implementation, and experimental setup, direct comparison is sometimes difficult. Our approach uses operational data from a DHS to model the heat demand, and since SVR exhibited the best prediction performance, we use this method to perform the comparison. In our case we obtained a smaller MAPE compared to [17], whose experimental results showed a MAPE of 4.77%. Furthermore, the reported results
Figure 5: Hourly heat load prediction based on SVR, PLS, and RF for one week; heat load (MW), Monday through Sunday, actual versus predicted HL. (a) Predicted versus actual demand with SVR. (b) Predicted versus actual demand with PLS. (c) Predicted versus actual demand with RF.
from our study are also superior to the results presented in [15, 32]. More specifically, the SVR method we apply shows better results in terms of MAPE and correlation coefficient compared to [32], where the MAPE is 5.54% and the correlation coefficient is 0.87. As far as the comparison with [15] is concerned, our SVR method exhibits better hourly prediction performance in terms of MAPE, their MAPE for one week being 5.63%. On the other hand, their PLS method performs better than ours, with a MAPE value of about 8.99%.
6 Conclusion
The district heating (DH) sector can play an indispensable role in the current and future sustainable energy systems of the North European countries, where the share of DH in the total European heat market is significantly high. Innovation, the emergence of new technology, and the increasing focus on smart buildings challenge energy systems to adapt to customers' more flexible and individual solutions. Consequently, one needs to know more about what drives customers' requirements, choices, and priorities. Therefore, the creation and application of innovative IT ecosystems in DH are considered essential to improve energy efficiency.
Heat load prediction has attracted a lot of research interest in the past decade, since it can improve the energy efficiency of a DHS, which in turn leads to cost reductions for heat suppliers and to many environmental benefits.
In this paper, three ML algorithms for heat load prediction in a DH network are developed and presented: SVR, PLS, and RF. The heat load prediction models
were developed using data from 29 weeks. The predicted hourly results were compared with the actual heat load data, and the performances of the three ML algorithms were studied, compared, and analyzed. The SVR algorithm proved to be the most efficient one, producing the best performance in terms of average errors and correlation coefficient. Moreover, the prediction results were also compared against existing SVR and PLS methods in the literature, showing that the SVR presented in this paper produces better accuracy.
In conclusion, the comparison results validate the notion that the developed SVR method is appropriate for application in heat load prediction and can serve as a promising alternative to existing models.
As for future work, apart from the outdoor temperature, we intend to incorporate other meteorological parameters influencing the heat load, such as wind speed and humidity.
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
The authors would like to thank Eidsiva Bioenergi AS, Norway, which kindly provided the data for this research work.
References
[1] Report on European Energy Security Strategy, Brussels, May 2014, http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52014DC0330&from=EN.
[2] A. R. Day, P. Ogumka, P. G. Jones, and A. Dunsdon, "The use of the planning system to encourage low carbon energy technologies in buildings," Renewable Energy, vol. 34, no. 9, pp. 2016–2021, 2009.
[3] IEA (International Energy Agency), Nordic Energy Technology Perspectives, OECD/IEA, Paris, France, 2013.
[4] A. Gebremedhin and H. Zinko, "Seasonal heat storages in district heating systems," in Proceedings of the 11th International Thermal Energy Storage for Energy Efficiency and Sustainability Conference (Effstock '09), Stockholm, Sweden, June 2009.
[5] H. Lund, S. Werner, R. Wiltshire et al., "4th Generation District Heating (4GDH): integrating smart thermal grids into future sustainable energy systems," Energy, vol. 68, pp. 1–11, 2014.
[6] F. Dalipi, S. Y. Yayilgan, and A. Gebremedhin, "A machine learning approach to increase energy efficiency in district heating systems," in Proceedings of the International Conference on Environmental Engineering and Computer Application (ICEECA '14), pp. 223–226, CRC Press, Hong Kong, December 2014.
[7] N. Fumo, "A review on the basics of building energy estimation," Renewable and Sustainable Energy Reviews, vol. 31, pp. 53–60, 2014.
[8] D. E. Claridge, "Using simulation models for building commissioning," in Proceedings of the International Conference on Enhanced Building Operation, Energy Systems Laboratory, Texas A&M University, October 2004.
[9] T. Mestekemper, Energy demand forecasting and dynamic water temperature management [Ph.D. thesis], Bielefeld University, Bielefeld, Germany, 2011.
[10] E. Dotzauer, "Simple model for prediction of loads in district-heating systems," Applied Energy, vol. 73, no. 3-4, pp. 277–284, 2002.
[11] H. A. Nielsen and H. Madsen, "Modelling the heat consumption in district heating systems using a grey-box approach," Energy and Buildings, vol. 38, no. 1, pp. 63–71, 2006.
[12] B. B. Ekici and U. T. Aksoy, "Prediction of building energy consumption by using artificial neural networks," Advances in Engineering Software, vol. 40, no. 5, pp. 356–362, 2009.
[13] M. Grzenda and B. Macukow, "Demand prediction with multi-stage neural processing," in Advances in Natural Computation and Data Mining, pp. 131–141, 2006.
[14] K. Kato, M. Sakawa, K. Ishimaru, S. Ushiro, and T. Shibano, "Heat load prediction through recurrent neural network in district heating and cooling systems," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '08), pp. 1401–1406, Singapore, October 2008.
[15] T. C. Park, U. S. Kim, L.-H. Kim, B. W. Jo, and Y. K. Yeo, "Heat consumption forecasting using partial least squares, artificial neural network and support vector regression techniques in district heating systems," Korean Journal of Chemical Engineering, vol. 27, no. 4, pp. 1063–1071, 2010.
[16] T. Catalina, V. Iordache, and B. Caracaleanu, "Multiple regression model for fast prediction of the heating energy demand," Energy and Buildings, vol. 57, pp. 302–312, 2013.
[17] S. Provatas, N. Lavesson, and C. Johansson, "An online machine learning algorithm for heat load forecasting in district heating systems," in Proceedings of the 14th International Symposium on District Heating and Cooling, Stockholm, Sweden, September 2014.
[18] S. Idowu, S. Saguna, C. Ahlund, and O. Schelen, "Forecasting heat load for smart district heating systems: a machine learning approach," in Proceedings of the IEEE International Conference on Smart Grid Communications (SmartGridComm '14), pp. 554–559, IEEE, Venice, Italy, November 2014.
[19] L. Wu, G. Kaiser, D. Solomon, R. Winter, A. Boulanger, and R. Anderson, Improving Efficiency and Reliability of Building Systems Using Machine Learning and Automated Online Evaluation, Columbia University Academic Commons, 2012, http://hdl.handle.net/10022/AC:P:13213.
[20] M. Vlachopoulou, G. Chin, J. C. Fuller, S. Lu, and K. Kalsi, "Model for aggregated water heater load using dynamic Bayesian networks," Tech. Rep., Pacific Northwest National Laboratory (PNNL), Richland, Wash, USA, 2012.
[21] S. Shamshirband, D. Petkovic, R. Enayatifar et al., "Heat load prediction in district heating systems with adaptive neuro-fuzzy method," Renewable and Sustainable Energy Reviews, vol. 48, pp. 760–767, 2015.
[22] M. Protic, S. Shamshirband, M. H. Anisi et al., "Appraisal of soft computing methods for short term consumers' heat load prediction in district heating systems," Energy, vol. 82, pp. 697–704, 2015.
[23] S. Shamshirband, D. Petkovic, A. Amini et al., "Support vector regression methodology for wind turbine reaction torque prediction with power-split hydrostatic continuous variable transmission," Energy, vol. 67, pp. 623–630, 2014.
[24] H. Quan, D. Srinivasan, and A. Khosravi, "Short-term load and wind power forecasting using neural network-based prediction intervals," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 2, pp. 303–315, 2014.
[25] K. Bhaskar and S. N. Singh, "AWNN-assisted wind power forecasting using feed-forward neural network," IEEE Transactions on Sustainable Energy, vol. 3, no. 2, pp. 306–315, 2012.
[26] Heat Pump Centre Newsletters, International Energy Agency (IEA) Heat Pumping Technologies, vol. 30, no. 2, 2012.
[27] T. G. Dietterich, "An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization," Machine Learning, vol. 40, no. 2, pp. 139–157, 2000.
[28] Swedish District Heating Association, 2016, http://www.svenskfjarrvarme.se.
[29] K. P. Murphy, Machine Learning: A Probabilistic Perspective, Adaptive Computation and Machine Learning Series, MIT Press, 2012.
[30] L. Wang, Support Vector Machines: Theory and Applications, Springer, New York, NY, USA, 2005.
[31] D. Basak, S. Pal, and D. C. Patranabis, "Support vector regressions," Neural Information Processing–Letters and Reviews, vol. 11, no. 10, pp. 203–224, 2007.
[32] R. E. Edwards, J. New, and L. E. Parker, "Predicting future hourly residential electrical consumption: a machine learning case study," Energy and Buildings, vol. 49, pp. 591–603, 2012.
[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[34] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Wadsworth and Brooks, Monterey, Calif, USA, 1984.
[35] P. Vezza, C. Comoglio, M. Rosso, and A. Viglione, "Low flows regionalization in North-Western Italy," Water Resources Management, vol. 24, no. 14, pp. 4049–4074, 2010.
[36] L. Breiman and A. Cutler, "Random forests," January 2016, http://www.stat.berkeley.edu/users/breiman/RandomForests.
[37] MATLAB, version 8.3.0 (R2014a), MathWorks, Natick, Mass, USA, 2014.
Table 2: Typical data samples for one day from the Eidsiva dataset.

Time (h)   FT (°C)   RT (°C)   Flow (m³/h)   HL (MW)
1          100.9     60.4      31.65         1.46
2          99.4      57.5      27.91         1.34
3          99.6      58.3      22.83         1.32
4          100.3     59.8      26.19         1.51
5          100.5     59.6      27.69         1.49
6          100.1     58.9      27.05         1.28
...
20         98.7      59.7      35.3          1.48
21         101.4     58.9      33.09         1.61
22         101.1     59.1      25.08         1.42
23         100.8     59.8      26.02         1.54
24         101.9     59.3      24.79         1.41
one of Europe's most modern waste incineration plants, located in the city of Hamar, Norway. The plant produces district heating, process steam, and electricity. These data are collected through regular measurements that are part of the control system in a DH plant. The measurements are taken 24 times per day, that is, every hour. The dataset contains values of the parameters time of day (tD), forward temperature (FT), return temperature (RT), flow rate (FR), and heat load (HL). The data were collected in the period between October 1st, 2014 and April 30th, 2015. In Table 2 we present a portion of typical data samples for one day.
4.2 Data Preparation. In this module, activities related to preparing the data to be compatible with the ML module are performed, including data aggregation and preprocessing. During data aggregation, we combine the plant measurements with the weather data (outdoor temperature), which is collected at the same interval as the other parameters. From the aggregation process we thus obtain the following output parameters: outdoor temperature (OT), heat load (HL), forward temperature (FT), time of day (tD), and the difference between forward temperature (FT) and return temperature (RT), denoted DT.
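The aggregation step just described can be sketched as a keyed join. Field names follow the paper's abbreviations; everything else (timestamps as hour indices counted from the start of the period, the dict-based layout) is an illustrative assumption, not the authors' actual pipeline:

```python
def aggregate(plant_rows, outdoor_temp):
    # Join hourly plant measurements with outdoor temperature (OT) and
    # derive DT = FT - RT; timestamps are hour indices from the start
    # of the measurement period (an illustrative assumption).
    merged = []
    for ts, row in sorted(plant_rows.items()):
        if ts not in outdoor_temp:  # skip hours without a weather reading
            continue
        merged.append({
            "tD": ts % 24 + 1,            # hour of day, 1..24 as in Table 2
            "OT": outdoor_temp[ts],
            "FT": row["FT"],
            "DT": row["FT"] - row["RT"],  # forward minus return temperature
            "HL": row["HL"],
        })
    return merged
```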
4.3 Machine-Learning Predictive Modelling. Machine learning (ML) is a very broad subject, ranging from abstract theory to extreme practice; indeed, even among machine-learning practitioners there is no well-accepted definition of what machine learning is. As a subfield of artificial intelligence whose objective is building models that learn from data, machine learning has seen tremendous improvements and applications in the last decade.
In general, ML can be defined as a set of methods that can automatically detect and extract knowledge patterns in empirical data, such as sensor data or databases, and then use the discovered patterns to predict future data or execute other types of decision-making under uncertainty. ML is divided into three principal groups: supervised learning (the predictive learning approach), unsupervised learning (the descriptive learning approach), and reinforcement learning [29]. In supervised learning, the algorithm is given data in which the "correct answer" for each example is known, while the target function y = f(x) itself is unknown. At a very high level, the two steps of supervised learning are as follows: (i) train a machine-learning model using labeled data that consist of N data pairs (x_1, y_1), ..., (x_N, y_N), called instances, and (ii) make predictions on new data for which the label is unknown. Each instance is described by an input vector x_i, which incorporates a set of attributes A = {A_1, A_2, ..., A_m}, and a label y_i of the target attribute that represents the wanted output. To summarize these two steps, the predictive model learns from past examples made up of inputs and outputs and then applies what is learned to future inputs in order to predict future outputs. Since predictions are made on unseen data, that is, data not used to train the model, it is often said that the primary goal of supervised learning is to build models that generalize: the model should accurately predict the future rather than the past. The goal is therefore to train a model that can afterwards predict the label of new instances and approximate the target function.
Based on the type of the output variable y_i, supervised learning tasks are further divided into classification and regression problems. In problems where the output variable y_i is categorical or nominal (i.e., belongs to a finite set), the ML tasks are known as classification or pattern recognition problems, whereas in regression problems the output variable is a real-valued scalar or takes continuous values.
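The two-step view above (fit on labeled pairs, then predict unseen inputs) can be made concrete. In this sketch a deliberately trivial 1-nearest-neighbour "learner" stands in for SVR, PLS, or RF purely to show the train-then-predict interface; all names are illustrative:

```python
from typing import Callable

def train_and_predict(train_x, train_y, new_x, fit: Callable):
    # Step (i): learn a model from labeled pairs; step (ii): predict new data.
    model = fit(train_x, train_y)
    return [model(x) for x in new_x]

# Trivial stand-in learner: 1-nearest-neighbour regression.
def fit_1nn(xs, ys):
    def predict(x):
        best = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        return ys[best]
    return predict

# Regression: the target is real-valued (e.g. heat load in MW).
preds = train_and_predict([0.0, 10.0, 20.0], [2.0, 1.5, 0.5], [9.0], fit_1nn)
# preds == [1.5], since 9.0 is closest to the training input 10.0
```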
4.3.1 Support Vector Regression (SVR). Support vector machines (SVM), a family of supervised learning algorithms based on statistical learning theory, are among the most successful and widely applied machine-learning methods for solving both regression and pattern recognition problems. Since the formulation of SVMs is based on structural risk minimization rather than empirical risk minimization, the algorithm often shows better performance than traditional ones. Support Vector Regression (SVR) is the SVM method specifically for regression. In SVR, the objective function (e.g., the error function to be minimized) is convex, meaning that the global optimum is always reached. This is in sharp contrast to artificial neural networks (ANNs), where, for instance, the classical backpropagation learning algorithm is prone to convergence to "bad" local minima [30, 31], which makes them harder to analyze theoretically. In practice, SVR has greatly outperformed ANNs in a wide range of applications [31].
In SVR, the input x is first mapped into an m-dimensional feature space using a nonlinear mapping; as a subsequent step, we construct a linear model in that feature space. Mathematically, the linear model f(x, w) is given by

\[
f(x,w)=\sum_{j=1}^{m}w_{j}g_{j}(x)+b
\tag{1}
\]
where g_j(x), j = 1, ..., m, represents the set of nonlinear transformations, while b is the bias term, which is most often assumed to be zero and is hence omitted.
The model obtained by SVR depends exclusively on a subset of the training data; at the same time, SVR tries to reduce model complexity by minimizing \|w\|^2. Consequently, the objective of SVR is to minimize the following function [32]:

\[
\begin{aligned}
\min\;\; & \frac{1}{2}\|w\|^{2}+C\sum_{i=1}^{n}\left(\xi_{i}+\xi_{i}^{*}\right) \\
\text{such that}\;\; & y_{i}-f\left(x_{i},w\right)\le\varepsilon+\xi_{i}^{*}, \\
& f\left(x_{i},w\right)-y_{i}\le\varepsilon+\xi_{i}, \\
& \xi_{i},\,\xi_{i}^{*}\ge 0,\quad i=1,\ldots,n
\end{aligned}
\tag{2}
\]
In these equations, ε is a new type of (insensitive) loss function, a threshold that denotes the desired error range for all points. The nonnegative variables ξ_i and ξ_i^* are called slack variables; they measure the deviation of training samples outside ε, guaranteeing that a solution exists for all ε. The parameter C > 0 is a penalty term used to determine the tradeoff between data fitting and flatness, and w are the regression weights. In most cases, the optimization problem can be solved more easily if transformed into a dual problem. Utilizing Lagrange multipliers, the dualization method is applied as follows:
\[
\begin{aligned}
L={} & \frac{1}{2}\|w\|^{2}+C\sum_{i=1}^{n}\left(\xi_{i}+\xi_{i}^{*}\right) \\
& -\sum_{i=1}^{n}\alpha_{i}^{*}\left(\varepsilon+\xi_{i}^{*}-y_{i}+f\left(x_{i},w\right)\right) \\
& -\sum_{i=1}^{n}\alpha_{i}\left(\varepsilon+\xi_{i}+y_{i}-f\left(x_{i},w\right)\right) \\
& -\sum_{i=1}^{n}\left(\lambda_{i}\xi_{i}+\lambda_{i}^{*}\xi_{i}^{*}\right)
\end{aligned}
\tag{3}
\]

where L is the Lagrangian and α_i, α_i^*, λ_i, λ_i^* ≥ 0 are called the Lagrange multipliers.

Considering the saddle point condition, it follows that the partial derivatives of L with respect to the variables (w, b, ξ_i, ξ_i^*) vanish at optimality. Proceeding with similar steps, we end up with the dual optimization problem. Finally, the solution of the dual problem is given by

\[
f(x)=\sum_{i=1}^{n_{\mathrm{SV}}}\left(\alpha_{i}-\alpha_{i}^{*}\right)K\left(x_{i},x\right)
\tag{4}
\]

where 0 ≤ α_i^* ≤ C, 0 ≤ α_i ≤ C, n_SV is the number of support vectors, and K is the kernel function, which, for two given vectors in input space, returns the dot product of their images in the higher-dimensional feature space. The kernel is given by

\[
K\left(x,x_{i}\right)=\sum_{j=1}^{m}g_{j}\left(x\right)g_{j}\left(x_{i}\right)
\tag{5}
\]
In order to map the input data to a higher-dimensional space and to handle nonlinearities between the input vectors and their respective targets, we use as kernel the Gaussian radial basis function (RBF), which has γ as its kernel parameter. Once the kernel is selected, we use grid search to identify the best pair of the regularization parameters C and γ, that is, the pair with the best cross-validation accuracy.
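Training an actual SVR requires solving the quadratic program of eqs. (2)-(3). To keep a runnable illustration of the two ideas named above, the RBF kernel and the cross-validated grid search over (C, γ), the sketch below substitutes kernel ridge regression, which shares the kernel feature-space idea but has a closed-form solution; this is a stand-in, not the paper's method, and all names are my own:

```python
import math

def rbf(x, z, gamma):
    # Gaussian RBF kernel for scalar inputs.
    return math.exp(-gamma * (x - z) ** 2)

def solve(A, b):
    # Gauss-Jordan elimination with partial pivoting (small systems only).
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_krr(xs, ys, C, gamma):
    # Kernel ridge regression: solve (K + I/C) alpha = y, then
    # f(x) = sum_i alpha_i k(x_i, x); a closed-form stand-in for SVR.
    n = len(xs)
    K = [[rbf(xs[i], xs[j], gamma) + (1.0 / C if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, ys)
    return lambda x: sum(a * rbf(xi, x, gamma) for a, xi in zip(alpha, xs))

def grid_search(xs, ys, Cs, gammas, folds=5):
    # Pick the (C, gamma) pair with the lowest cross-validated error.
    best, best_err = None, float("inf")
    for C in Cs:
        for g in gammas:
            err = 0.0
            for f in range(folds):
                tr = [i for i in range(len(xs)) if i % folds != f]
                va = [i for i in range(len(xs)) if i % folds == f]
                model = fit_krr([xs[i] for i in tr], [ys[i] for i in tr], C, g)
                err += sum((model(xs[i]) - ys[i]) ** 2 for i in va)
            if err < best_err:
                best, best_err = (C, g), err
    return best
```

A larger C weakens the ridge term, mirroring how C in eq. (2) trades flatness against data fitting.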
4.3.2 Partial Least Squares (PLS). The Partial Least Squares (PLS) technique is a learning method based on a multiple linear regression model that takes the latent structure of both datasets into account. The dataset consists of explanatory variables X_i and dependent variables Y_j. The model is linear, as can be seen in (6): for each sample n, the value y_nj is

\[
y_{nj}=\sum_{i=0}^{k}\beta_{i}x_{ni}+\varepsilon_{nj}
\tag{6}
\]

The PLS model is similar to a linear regression model; however, the way of calculating β_i is different. The principle of PLS regression is that the data tables or matrices X and Y are decomposed into latent structures in an iterative process. The latent structure corresponding to the most variation of Y is extracted and explained by the latent structure of X that explains it best.

As a multivariate regression method, PLS correlates the information in one data matrix X with the information in another matrix Y. More specifically, PLS finds the fundamental relations between the two matrices, which are projected onto several key factors such as T and U, and linear regression is performed on the relation between these factors. Factor U represents the most variation for Y, whereas factor T denotes the variation for X, without necessarily explaining the most variation in X.
The first results of PLS are the model equations showing the β coefficients that give the relationship between the variables X and Y. These model equations are as follows:

\[
\begin{aligned}
Y &= X\beta+\varepsilon, \\
Y &= T_{h}C_{h}^{\prime}+\varepsilon_{h}
   = XW_{h}^{*}C_{h}^{\prime}+\varepsilon_{h}
   = XW_{h}\left(P_{h}^{\prime}W_{h}\right)^{-1}C_{h}^{\prime}+\varepsilon_{h},
\end{aligned}
\tag{7}
\]

where Y is the matrix of dependent variables, X is the matrix of explanatory variables, T_h, C_h, W_h^*, W_h, and P_h are the matrices generated by the PLS algorithm, and ε_h is the matrix of residuals. The matrix β of the regression coefficients of Y on X with h components generated by the PLS algorithm is given by

\[
\beta=W_{h}\left(P_{h}^{\prime}W_{h}\right)^{-1}C_{h}^{\prime}
\tag{8}
\]
Figure 3: Random forest algorithm process flow. From the sample data, a train dataset is drawn to grow each single tree (random selection of predictors, splitting the data on the best predictors, repeated until the criteria for stopping tree growth are fulfilled), while the Out-Of-Bag (OOB) data are used to estimate the error of the grown tree; the procedure is repeated until the specified number of trees is obtained, and the random forest is the collection of all trees.
The advantage of PLS is that the algorithm takes the data structure of both the X and Y matrices into account. It also provides helpful visual results that ease the interpretation of the data. Last but not least, PLS can model several response variables at the same time, taking their structure into account.
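The iterative extraction-and-deflation principle described above can be sketched for the single-response case (PLS1). The NIPALS-style loop below is a sketch that assumes mean-centered data and omits scaling and stopping heuristics; rather than forming β via eq. (8), it predicts by deflating the new input through the same latent components:

```python
def pls1_fit(X, y, n_components):
    # PLS1 (single response) via NIPALS-style deflation; assumes the data
    # are already mean-centered (no scaling or stopping heuristics).
    n, m = len(X), len(X[0])
    X = [row[:] for row in X]
    y = list(y)
    comps = []
    for _ in range(n_components):
        # Weight vector: the direction in X most covariant with y.
        w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
        t = [sum(X[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        c = sum(y[i] * t[i] for i in range(n)) / tt
        for i in range(n):  # deflate: remove the extracted latent structure
            for j in range(m):
                X[i][j] -= t[i] * p[j]
            y[i] -= c * t[i]
        comps.append((w, p, c))

    def predict(row):
        x, out = list(row), 0.0
        for w, p, c in comps:
            t = sum(x[j] * w[j] for j in range(m))
            out += c * t
            x = [x[j] - t * p[j] for j in range(m)]
        return out

    return predict
```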
4.3.3 Random Forest (RF). The random forest algorithm, proposed by Breiman [33], is an ensemble-learning algorithm consisting of tree predictors, where the trees are built from various random features. It develops many decision trees based on random selections of data and of variables and predicts the class of the dependent variable based on many trees.
This method is based on the combination of a large collection of decorrelated decision trees (i.e., Classification and Regression Trees (CART) [34]). Since all the trees are based on a random selection of data as well as variables, they are random trees, and many such random trees lead to a random forest. The name forest means that many decision trees are used to make a better classification of the dependent variable. The CART technique divides the learning sample using an algorithm known as binary recursive partitioning. This splitting or partitioning starts from the most important variable down to the less important ones, and it is applied to each of the new branches of the tree [35].
In order to increase the algorithm's accuracy and reduce the generalization error of the ensemble of trees, another technique called bagging is incorporated. The generalization error is estimated with the Out-Of-Bag (OOB) method. Bagging is used on the training dataset to generate many copies of it, each corresponding to a decision tree.
With the RF algorithm, each tree is grown as follows [36]:

(a) If the number of cases (observations) in the training set is N, sample N cases at random, but with replacement, from the original data; this sample will be the training set for growing the tree.

(b) If there are M input variables, a number m ≪ M is specified such that, at each node, m variables are selected at random out of the M and the best split on these m is used to split the node. The value m is held constant while the forest grows.

(c) Each tree is grown to the largest extent possible.
The process flow in the random forest models is shown inFigure 3
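Steps (a)-(c) can be sketched in miniature. To stay short and runnable, this sketch grows depth-1 trees (stumps) with squared-error splitting instead of fully grown CART trees, so it is a simplified illustration of bootstrap sampling plus random feature selection, not Breiman's full algorithm; all names are my own:

```python
import random

def best_stump(rows, ys, feature):
    # Best single split on one feature, minimizing summed squared error.
    pairs = sorted(zip((r[feature] for r in rows), ys))
    best = None
    for k in range(1, len(pairs)):
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - ml) ** 2 for y in left)
               + sum((y - mr) ** 2 for y in right))
        thr = (pairs[k - 1][0] + pairs[k][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, feature, thr, ml, mr)
    return best

def grow_forest(rows, ys, n_trees=25, m_features=1, seed=0):
    # Bagged depth-1 trees with random feature selection.
    rng = random.Random(seed)
    n, M = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        # (a) bootstrap: sample n cases with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        brows, bys = [rows[i] for i in idx], [ys[i] for i in idx]
        # (b) pick m << M candidate features at random, keep the best split
        feats = rng.sample(range(M), m_features)
        forest.append(min((best_stump(brows, bys, f) for f in feats),
                          key=lambda s: s[0]))
    return forest

def forest_predict(forest, row):
    # Average the per-tree predictions.
    vals = [(ml if row[f] <= thr else mr) for _, f, thr, ml, mr in forest]
    return sum(vals) / len(vals)
```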
The error rate of RF primarily depends on the correlationdegree between any two trees and the prediction accuracy ofan individual tree The principal advantages of this methodare the ease of parallelization robustness and indifference tonoise and outliers in most of the dataset Due to its unbiasednature of execution this method avoids overfitting
8 Applied Computational Intelligence and Soft Computing
Table 3 Performance results for one week
Algorithm Training TestingMAE () MAPE () Correlation coefficient MAE () MAPE () Correlation coefficient
SVR 6491 311 090 7032 343 091PLS 26524 1273 081 23868 1042 079RF 17382 1977 083 15932 1861 084
Heat output (MW)
Heat load measurements (Oct 1ndashApr 30)
minus500
000
500
1000
1500
2000
2500
3000
Outdoor temp (∘C)
Figure 4 Hourly heat load pattern in relation to outdoor tempera-ture in the DH network
5 Performance Evaluation and Results
The proposed approach is implemented in MATLAB R2014a[37] and executed in a PC with Intel Core i7 processor with27GHz speed and 8GB of RAM In this work as trainingdataset we select the datameasured during the first 28 weekswhich consist of 4872 instances As for prediction period wechoose the 29th week as test data that is 148 instances Inorder to evaluate the performance of the proposed algorithmsin terms of accuracy of the results we use the mean absoluteaverage (MAE) the mean average percentage error (MAPE)and the correlation coefficient which measures the correla-tion between the actual and predicted heat load values MAEand MAPE are defined as follows
MAE =sum
119899
119894=1
1003816
1003816
1003816
1003816
119901
119894minus 119910
119894
1003816
1003816
1003816
1003816
119899
MAPE =sum
119899
119894=1
1003816
1003816
1003816
1003816
1003816
(119901
119894minus 119910
1015840
119894) 119901
119894
1003816
1003816
1003816
1003816
1003816
119899
(9)
where 119901119894is the actual value 119910
119894is the predicted value and 119899 is
the number of samples in the training setWe apply a 10-fold cross-validation for obtaining the valid
evaluation metrics Figure 4 presents the heat load in thenetwork with respect to the outdoor temperatureWe see thatthe higher values of the heat load occur during the days withlower outdoor temperature values which in fact reflects theincreased consumption of heat
Figure 5(a) shows results of actual heat load and heatload prediction for one week based on SVR algorithm InFigure 5(b) results of heat prediction production for oneweek based on PLS are shown with data on actual heal load
whereas results of heat load forecasting for one week basedon RF are shown in Figure 5(c)
As can be seen from Figure 5 the predicted HL withSVR is closer to the actual energy consumption with anMAPE value of about 343 and a correlation coefficient of091 Correlation has been consistent throughout training andtesting On the other hand graphs presented in Figures 5(b)and 5(c) for PLS and RF respectively are less accurate withhigher errors The performance of the PLS is significantlylower compared to SVR Concerning RF as the trees startedto become progressively uncorrelated the error rate alsodeclined significantly The best performance of the SVR overthe other two methods is attributed to the efficient modellingof feature space and to the fact that SVR is less proneto overfitting and that does not require any procedure todetermine the explicit form as in ordinary regression analysis
Table 3 outlines the results of the investigated ML algorithms for heat consumption prediction, for both the training phase (where we developed the models using supervised learning schemes) and the testing phase (generalization to newly unseen data). From Table 3 it is evident that SVR shows the best prediction performance in terms of average errors and correlation coefficient, confirming the superiority of SVR over the other machine-learning methods. Therefore, provided that the number and type of the operating facilities are determined, SVR can be effectively applied in the management of a DHS. The MAPE of 3.43% obtained in our approach with SVR is lower than or equal to that of state-of-the-art approaches for heat load prediction in DHSs. Moreover, SVR is also better than PLS and RF in terms of mean errors and correlation coefficient. Nevertheless, direct comparison with other works is sometimes impossible owing to differences in system implementation, input data, and experimental setup.
5.1. Comparison with State-of-the-Art Methods. Some prior research has been carried out to predict and analyze the heat demand in DH. However, owing to differences in system design, input data, architecture, and experimental setup, a direct comparison is sometimes difficult. Our approach uses operational data from a DHS to model the heat demand, and since SVR exhibited the best prediction performance, we use this method to perform the comparison. We obtained a smaller MAPE than [17], whose experimental results showed a MAPE of 4.77%. Furthermore, reported results
Applied Computational Intelligence and Soft Computing 9
Figure 5: Hourly heat load prediction based on SVR, PLS, and RF for one week (Monday through Sunday; heat load in MW; actual versus predicted HL): (a) predicted versus actual demand with SVR; (b) predicted versus actual demand with PLS; (c) predicted versus actual demand with RF.
from our study are superior to the results presented in [15, 32]. More specifically, the SVR method we apply shows better results in terms of MAPE and correlation coefficient than [32], where the MAPE is 5.54% and the correlation coefficient is 0.87. As far as the comparison with [15] is concerned, our SVR method exhibits better hourly prediction performance in terms of MAPE, as the one-week MAPE reported there is 5.63%. On the other hand, the PLS method in [15] performs better than ours, with a MAPE of about 8.99%.
6 Conclusion
The district heating (DH) sector can play an indispensable role in the current and future sustainable energy systems of the North European countries, where the share of DH in the total European heat market is significantly high. The emergence of new technology and the increasing focus on smart buildings challenge energy systems to adapt to customers' more flexible and individual solutions. Consequently, one needs to know more about what drives customers' requirements, choices, and priorities. The creation and application of innovative IT ecosystems in DH are therefore considered essential to improve energy efficiency.
Over the past decade, heat load prediction has attracted a lot of interest from researchers, since it can improve the energy efficiency of a DHS, which in turn leads to cost reductions for heat suppliers and to many environmental benefits.
In this paper, three ML algorithms for heat load prediction in a DH network, namely, SVR, PLS, and RF, are developed and presented. The heat load prediction models
were developed using data from 29 weeks, and the predicted hourly results were compared with actual heat load data. The performance of the three ML algorithms was studied, compared, and analyzed. The SVR algorithm proved to be the most efficient one, producing the best performance in terms of average errors and correlation coefficient. Moreover, the prediction results were compared against existing SVR and PLS methods in the literature, showing that the SVR presented in this paper produces better accuracy.
In conclusion, the comparison results confirm that the developed SVR method is appropriate for application in heat load prediction and can serve as a promising alternative to existing models.
As future work, apart from the outdoor temperature, we intend to incorporate other meteorological parameters that influence the heat load, such as wind speed and humidity.
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
The authors would like to thank Eidsiva Bioenergi AS Nor-way which kindly provided the data for this research work
References
[1] Report on European Energy Security Strategy, Brussels, May 2014, http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52014DC0330&from=EN.
[2] A R Day P Ogumka P G Jones and A Dunsdon ldquoTheuse of the planning system to encourage low carbon energytechnologies in buildingsrdquo Renewable Energy vol 34 no 9 pp2016ndash2021 2009
[3] IEA (International Energy Agency) Nordic Energy TechnologyPerspectives OECDIEA Paris France 2013
[4] A Gebremedhin and H Zinko ldquoSeasonal heat storages indistrict heating systemsrdquo in Proceedings of the 11th InternationalThermal Energy Storage for Energy Efficiency and SustainabilityConference (Effstock rsquo09) Stockholm Sweden June 2009
[5] H Lund S Werner R Wiltshire et al ldquo4th Generation DistrictHeating (4GDH) integrating smart thermal grids into futuresustainable energy systemsrdquo Energy vol 68 pp 1ndash11 2014
[6] F Dalipi S Y Yayilgan and A Gebremedhin ldquoA machinelearning approach to increase energy efficiency in district heat-ing systemsrdquo in Proceedings of the International Conference onEnvironmental Engineering and Computer Application (ICEECArsquo14) pp 223ndash226 CRC Press Hong Kong December 2014
[7] N Fumo ldquoA review on the basics of building energy estimationrdquoRenewable and Sustainable Energy Reviews vol 31 pp 53ndash602014
[8] D E Claridge ldquoUsing simulation models for building com-missioningrdquo in Proceedings of the International Conferenceon Enhanced Building Operation Energy Systems LaboratoryTexas AampM University October 2004
[9] T Mestekemper Energy demand forecasting and dynamic watertemperature management [PhD thesis] Bielefeld UniversityBielefeld Germany 2011
[10] E Dotzauer ldquoSimple model for prediction of loads in district-heating systemsrdquo Applied Energy vol 73 no 3-4 pp 277ndash2842002
[11] HANielsen andHMadsen ldquoModelling the heat consumptionin district heating systems using a grey-box approachrdquo Energyand Buildings vol 38 no 1 pp 63ndash71 2006
[12] B B Ekici and U T Aksoy ldquoPrediction of building energyconsumption by using artificial neural networksrdquo Advances inEngineering Software vol 40 no 5 pp 356ndash362 2009
[13] M Grzenda and B Macukow ldquoDemand prediction with multi-stage neural processingrdquo in Advances in Natural Computationand Data Mining pp 131ndash141 2006
[14] K Kato M Sakawa K Ishimaru S Ushiro and T ShibanoldquoHeat load prediction through recurrent neural network indistrict heating and cooling systemsrdquo in Proceedings of the IEEEInternational Conference on SystemsMan andCybernetics (SMCrsquo08) pp 1401ndash1406 Singapore October 2008
[15] T C Park U S Kim L-H Kim B W Jo and Y K Yeo ldquoHeatconsumption forecasting using partial least squares artificialneural network and support vector regression techniques in dis-trict heating systemsrdquo Korean Journal of Chemical Engineeringvol 27 no 4 pp 1063ndash1071 2010
[16] T Catalina V Iordache and B Caracaleanu ldquoMultiple regres-sion model for fast prediction of the heating energy demandrdquoEnergy and Buildings vol 57 pp 302ndash312 2013
[17] S Provatas N Lavesson and C Johansson ldquoAn onlinemachinelearning algorithm for heat load forecasting in district heatingsystemsrdquo in Proceedings of the 14th International Symposium onDistrict Heating and Cooling Stockholm Sweden September2014
[18] S Idowu S Saguna C Ahlund and O Schelen ldquoForecastingheat load for smart district heating systems a machine learningapproachrdquo in Proceedings of the IEEE International Conferenceon Smart Grid Communications (SmartGridComm rsquo14) pp 554ndash559 IEEE Venice Italy November 2014
[19] L Wu G Kaiser D Solomon R Winter A Boulanger andR Anderson Improving Efficiency and Reliability of Build-ing Systems Using Machine Learning and Automated OnlineEvaluation Columbia University Academic Commons 2012httphdlhandlenet10022ACP13213
[20] M Vlachopoulou G Chin J C Fuller S Lu and K KalsildquoModel for aggregated water heater load using dynamicBayesian networksrdquo Tech Rep Pacific Northwest NationalLaboratory (PNNL) Richland Wash USA 2012
[21] S Shamshirband D Petkovic R Enayatifar et al ldquoHeat loadprediction in district heating systemswith adaptive neuro-fuzzymethodrdquoRenewable and Sustainable Energy Reviews vol 48 pp760ndash767 2015
[22] M Protic S Shamshirband M H Anisi et al ldquoAppraisal ofsoft computing methods for short term consumersrsquo heat loadprediction in district heating systemsrdquo Energy vol 82 pp 697ndash704 2015
[23] S Shamshirband D Petkovic A Amini et al ldquoSupport vec-tor regression methodology for wind turbine reaction torqueprediction with power-split hydrostatic continuous variabletransmissionrdquo Energy vol 67 pp 623ndash630 2014
[24] H Quan D Srinivasan and A Khosravi ldquoShort-term load andwind power forecasting using neural network-based predictionintervalsrdquo IEEE Transactions on Neural Networks and LearningSystems vol 25 no 2 pp 303ndash315 2014
[25] K Bhaskar and S N Singh ldquoAWNN-assisted wind power fore-casting using feed-forward neural networkrdquo IEEE Transactionson Sustainable Energy vol 3 no 2 pp 306ndash315 2012
[26] Heat Pump Centre Newsletter, International Energy Agency (IEA) Heat Pumping Technologies, vol. 30, no. 2, 2012.
[27] T G Dietterich ldquoAn experimental comparison of three meth-ods for constructing ensembles of decision trees baggingboosting and randomizationrdquo Machine Learning vol 40 no2 pp 139ndash157 2000
[28] SwedishDistrictHeatingAssociation 2016 httpwwwsvensk-fjarrvarmese
[29] K P Murphy Machine Learning A Probabilistic PerspectiveAdaptive Computation and Machine Learning Series MITPress 2012
[30] L Wang Support Vector Machines Theory and ApplicationsSpringer New York NY USA 2005
[31] D Basak S Pal and D C Patranabis ldquoSupport vector regres-sionsrdquoNeural Information ProcessingmdashLetters and Reviews vol11 no 10 pp 203ndash224 2007
[32] R E Edwards J New and L E Parker ldquoPredicting futurehourly residential electrical consumption a machine learningcase studyrdquo Energy and Buildings vol 49 pp 591ndash603 2012
[33] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[34] L Breiman J H Friedman R A Olshen and C J StoneClassification and Regression Trees Wadsworth and BrooksMonterey Calif USA 1984
[35] P Vezza C Comoglio M Rosso and A Viglione ldquoLowflows regionalization in North-Western Italyrdquo Water ResourcesManagement vol 24 no 14 pp 4049ndash4074 2010
[36] L. Breiman and A. Cutler, "Random forests," January 2016, http://www.stat.berkeley.edu/users/breiman/RandomForests.
[37] MATLAB Version 830 MathWorks Natick Mass USA 2014
where $g_j(x)$, $j = 1, \ldots, m$, represents the set of nonlinear transformations and $b$ is the bias term, which is most of the time assumed to be zero; hence we omit this term.
The model obtained by SVR depends exclusively on a subset of the training data; at the same time, SVR tries to reduce model complexity by minimizing $\|w\|^{2}$. Consequently, the objective of SVR is to minimize the following function [32]:
$$\begin{aligned}
\min\quad & \frac{1}{2}\|w\|^{2}+C\sum_{i=1}^{n}\left(\xi_{i}+\xi_{i}^{*}\right)\\
\text{such that}\quad & y_{i}-f\left(x_{i},w\right)\le\varepsilon+\xi_{i}^{*}\\
& f\left(x_{i},w\right)-y_{i}\le\varepsilon+\xi_{i}\\
& \xi_{i},\,\xi_{i}^{*}\ge 0,\quad i=1,\ldots,n. \tag{2}
\end{aligned}$$
In these equations, $\varepsilon$ is a new type of (insensitive) loss function, a threshold that denotes the desired error range for all points. The nonnegative variables $\xi_i$ and $\xi_i^{*}$ are called slack variables; they measure the deviation of training samples outside $\varepsilon$, thereby guaranteeing that a solution exists for all $\varepsilon$. The parameter $C > 0$ is a penalty term used to determine the tradeoff between data fitting and flatness, and $w$ are the regression weights. In most cases, the optimization problem can be solved more easily if transformed into a dual problem. Using Lagrange multipliers, the dualization method is applied as follows:
$$\begin{aligned}
L={}&\frac{1}{2}\|w\|^{2}+C\sum_{i=1}^{n}\left(\xi_{i}+\xi_{i}^{*}\right)\\
&-\sum_{i=1}^{n}\alpha_{i}^{*}\left(\varepsilon+\xi_{i}^{*}-y_{i}+f\left(x_{i},w\right)\right)\\
&-\sum_{i=1}^{n}\alpha_{i}\left(\varepsilon+\xi_{i}+y_{i}-f\left(x_{i},w\right)\right)\\
&-\sum_{i=1}^{n}\left(\lambda_{i}\xi_{i}+\lambda_{i}^{*}\xi_{i}^{*}\right), \tag{3}
\end{aligned}$$
where $L$ is the Lagrangian and $\alpha_i, \alpha_i^{*}, \lambda_i, \lambda_i^{*} \ge 0$ are called the Lagrange multipliers.

Considering the saddle point condition, it follows that the partial derivatives of $L$ with respect to the variables $(w, b, \xi_i, \xi_i^{*})$ vanish at optimality. Proceeding with similar steps, we end up with the dual optimization problem. Finally, the solution of the dual problem is given by
$$f(x)=\sum_{i=1}^{n_{\mathrm{SV}}}\left(\alpha_{i}-\alpha_{i}^{*}\right)K\left(x_{i},x\right), \tag{4}$$
where $0 \le \alpha_i^{*} \le C$, $0 \le \alpha_i \le C$, $n_{\mathrm{SV}}$ is the number of support vectors, and $K$ is the kernel function, which, for two given vectors in the input space, returns the dot product of their images in a higher dimensional feature space. The kernel is given by
$$K\left(x,x_{i}\right)=\sum_{j=1}^{m}g_{j}(x)\,g_{j}\left(x_{i}\right). \tag{5}$$
In order to map the input data to a higher dimensional space and to handle nonlinearities between the input vectors and their respective targets, we use as kernel the Gaussian radial basis function (RBF), which has $\gamma$ as its kernel parameter. Once the kernel is selected, we use grid search to identify the best pair of the regularization parameter $C$ and kernel parameter $\gamma$, that is, the pair with the best cross-validation accuracy.
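Once training has produced the multipliers, predicting with (4) and the RBF kernel reduces to a weighted sum of kernel evaluations at the support vectors. A minimal sketch in plain Python (our own illustration, not the authors' MATLAB code; the support vectors, coefficients, and γ below are made-up values, not fitted ones):

```python
import math

def rbf_kernel(x, x_i, gamma):
    """Gaussian RBF kernel: K(x, x_i) = exp(-gamma * ||x - x_i||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_i))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, coeffs, gamma, bias=0.0):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + bias, as in (4)."""
    return bias + sum(c * rbf_kernel(sv, x, gamma)
                      for c, sv in zip(coeffs, support_vectors))

# Hypothetical model: two support vectors (outdoor temperature, hour of day)
svs = [(-5.0, 8.0), (10.0, 14.0)]
coeffs = [3.2, -1.5]   # (alpha_i - alpha_i*) for each support vector
gamma = 0.05

# Querying at the first support vector yields a value close to its coefficient,
# since K is 1 there and the other support vector is far away in feature space.
print(svr_predict((-5.0, 8.0), svs, coeffs, gamma))
```

The grid search mentioned above would wrap this prediction step, scoring each candidate $(C, \gamma)$ pair by cross-validated error and keeping the best one.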
4.3.2. Partial Least Squares (PLS). The Partial Least Squares (PLS) technique is a learning method based on a multiple linear regression model that takes into account the latent structure in both datasets. The dataset consists of explanatory variables $X_i$ and dependent variables $Y_j$. The model is linear, as can be seen in (6): for each sample $n$, the value $y_{nj}$ is
$$y_{nj}=\sum_{i=0}^{k}\beta_{i}x_{ni}+\varepsilon_{nj}. \tag{6}$$
The PLS model is similar to a linear regression model; however, the way of calculating $\beta_i$ is different. The principle of PLS regression is that the data tables (matrices) $X$ and $Y$ are decomposed into latent structures in an iterative process. The latent structure corresponding to the most variation of $Y$ is extracted and explained by the latent structure of $X$ that explains it best.
PLS can be used for correlating the information in one data matrix $X$ with the information in another matrix $Y$. More specifically, PLS finds the fundamental relations between the two matrices $X$ and $Y$, which are projected onto several key factors such as $T$ and $U$, and linear regression is performed on the relation between these factors. Factor $U$ represents the most variation of $Y$, whereas factor $T$ denotes the variation of $X$, though it does not necessarily explain the most variation in $X$.
The first results of PLS are the model equations showing the $\beta$ coefficients that give the relationship between the variables $X$ and $Y$. These model equations are as follows:
$$\begin{aligned}
Y&=X\beta+\varepsilon,\\
Y&=T_{h}C_{h}^{\prime}+\varepsilon_{h}=XW_{h}^{*}C_{h}^{\prime}+\varepsilon_{h}
=XW_{h}\left(P_{h}^{\prime}W_{h}\right)^{-1}C_{h}^{\prime}+\varepsilon_{h}, \tag{7}
\end{aligned}$$
where $Y$ is the matrix of dependent variables, $X$ is the matrix of explanatory variables, $T_h$, $C_h$, $W_h^{*}$, $W_h$, and $P_h$ are the matrices generated by the PLS algorithm, and $\varepsilon_h$ is the matrix of residuals. The matrix $\beta$ of the regression coefficients of $Y$ on $X$ with $h$ components generated by the PLS algorithm is given by
$$\beta=W_{h}\left(P_{h}^{\prime}W_{h}\right)^{-1}C_{h}^{\prime}. \tag{8}$$
Figure 3: Random forest algorithm process flow. The sample data are split into train data, used to grow a single tree, and Out-Of-Bag (OOB) data, used to estimate the error of the grown tree. For each tree, predictors are selected at random, the tree is grown by splitting the data on the best predictors until the stopping criteria are fulfilled, and the OOB error is estimated by applying the tree to the OOB data; this is repeated until the specified number of trees is obtained, the random forest being the collection of all trees.
The advantage of PLS is that the algorithm takes into account the data structure in both the $X$ and $Y$ matrices. It also provides good visual results that help the interpretation of the data. Last but not least, PLS can model several response variables at the same time, taking their structure into account.
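For intuition about how (7)-(8) arise, here is a one-component PLS1 sketch for a single response (plain Python, our own illustrative code, assuming mean-centered data or a no-intercept fit): the weight vector $w$ is the normalized $X'y$, the score vector is $t = Xw$, and the loadings $p$ and $c$ yield $\beta = w(p'w)^{-1}c$, which is (8) with $h = 1$.

```python
def pls1_one_component(X, y):
    """One-component PLS1 for a single response; returns the coefficients beta."""
    n, k = len(X), len(X[0])
    # Weight vector: w = X'y, normalized to unit length
    w = [sum(X[r][j] * y[r] for r in range(n)) for j in range(k)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Score vector: t = X w
    t = [sum(X[r][j] * w[j] for j in range(k)) for r in range(n)]
    tt = sum(v * v for v in t)
    # Loadings: p = X't / (t't), c = y't / (t't)
    p = [sum(X[r][j] * t[r] for r in range(n)) / tt for j in range(k)]
    c = sum(y[r] * t[r] for r in range(n)) / tt
    # beta = w (p'w)^{-1} c; with one component, (p'w) is a scalar
    pw = sum(p[j] * w[j] for j in range(k))
    return [w[j] * c / pw for j in range(k)]

# Exactly linear toy data, y = 2x: the single component recovers beta = 2
print(pls1_one_component([[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0]))  # [2.0]
```

Further components are obtained by deflating $X$ and $y$ with the extracted latent structure and repeating the same steps.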
4.3.3. Random Forest (RF). The random forest algorithm, proposed by Breiman [33], is an ensemble-learning algorithm consisting of tree predictors, where the trees are built from various random features. It develops many decision trees based on random selections of data and of variables, providing the class of the dependent variable on the basis of many trees.
This method is based on the combination of a large collection of decorrelated decision trees (i.e., Classification and Regression Trees (CART) [34]). Since all the trees are based on random selections of data as well as variables, they are random trees, and many such random trees lead to a random forest. The name forest means that many decision trees are used to make a better classification of the dependent variable. The CART technique divides the learning sample using an algorithm known as binary recursive partitioning; this splitting, or partitioning, proceeds from the most important variable to the less important ones and is applied to each of the new branches of the tree [35].
In order to increase the accuracy of the algorithm and reduce the generalization error of the ensemble of trees, another technique, called bagging, is incorporated. The generalization error is estimated with the Out-Of-Bag (OOB) method. Bagging is applied to the training dataset to generate many copies of it, each corresponding to one decision tree.
With the RF algorithm each tree is grown as follows [36]
(a) If the number of cases (observations) in the training set is N, sample N cases at random, but with replacement, from the original data; this sample will be the training set for growing the tree.
(b) If there are M input variables, a number m ≪ M is specified such that, at each node, m variables are selected at random out of the M, and the best split on these m is used to split the node. The value of m is held constant while the forest is grown.
(c) Each tree is grown to the largest extent possible
The process flow in the random forest models is shown inFigure 3
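Steps (a)-(c) rest on two sources of randomness: bootstrap sampling of the N cases, whose left-out cases form the OOB set, and the random choice of m out of M predictors at each node. That sampling machinery can be sketched as follows (plain Python, illustrative only; the tree growing itself is omitted, and m = √M is just one common default, not prescribed by the paper):

```python
import math
import random

def bootstrap_with_oob(n, rng):
    """Step (a): draw n case indices with replacement; OOB = cases never drawn."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    oob = sorted(set(range(n)) - set(in_bag))
    return in_bag, oob

def random_feature_subset(M, m, rng):
    """Step (b): select m of the M predictor indices at random, without replacement."""
    return rng.sample(range(M), m)

rng = random.Random(42)      # fixed seed for reproducibility
n, M = 10, 9
m = int(math.sqrt(M))        # m << M, held constant while the forest grows
in_bag, oob = bootstrap_with_oob(n, rng)
features = random_feature_subset(M, m, rng)
print(sorted(set(in_bag)), oob, features)
```

On average about $1/e \approx 37\%$ of the cases end up out-of-bag for each tree, which is what makes the OOB error a cheap internal estimate of the generalization error.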
The error rate of RF primarily depends on the degree of correlation between any two trees and on the prediction accuracy of the individual trees. The principal advantages of this method are the ease of parallelization, robustness, and indifference to noise and outliers in most datasets. Owing to the unbiased nature of its execution, this method avoids overfitting.
Table 3: Performance results for one week.

Algorithm   Training                          Testing
            MAE      MAPE (%)  Corr. coeff.   MAE      MAPE (%)  Corr. coeff.
SVR         64.91    3.11      0.90           70.32    3.43      0.91
PLS         265.24   12.73     0.81           238.68   10.42     0.79
RF          173.82   19.77     0.83           159.32   18.61     0.84
Figure 4: Hourly heat load pattern in relation to outdoor temperature in the DH network; heat output (MW) versus outdoor temperature (°C), from heat load measurements taken October 1 to April 30.
[18] S Idowu S Saguna C Ahlund and O Schelen ldquoForecastingheat load for smart district heating systems a machine learningapproachrdquo in Proceedings of the IEEE International Conferenceon Smart Grid Communications (SmartGridComm rsquo14) pp 554ndash559 IEEE Venice Italy November 2014
[19] L Wu G Kaiser D Solomon R Winter A Boulanger andR Anderson Improving Efficiency and Reliability of Build-ing Systems Using Machine Learning and Automated OnlineEvaluation Columbia University Academic Commons 2012httphdlhandlenet10022ACP13213
[20] M Vlachopoulou G Chin J C Fuller S Lu and K KalsildquoModel for aggregated water heater load using dynamicBayesian networksrdquo Tech Rep Pacific Northwest NationalLaboratory (PNNL) Richland Wash USA 2012
[21] S Shamshirband D Petkovic R Enayatifar et al ldquoHeat loadprediction in district heating systemswith adaptive neuro-fuzzymethodrdquoRenewable and Sustainable Energy Reviews vol 48 pp760ndash767 2015
[22] M Protic S Shamshirband M H Anisi et al ldquoAppraisal ofsoft computing methods for short term consumersrsquo heat loadprediction in district heating systemsrdquo Energy vol 82 pp 697ndash704 2015
[23] S Shamshirband D Petkovic A Amini et al ldquoSupport vec-tor regression methodology for wind turbine reaction torqueprediction with power-split hydrostatic continuous variabletransmissionrdquo Energy vol 67 pp 623ndash630 2014
[24] H Quan D Srinivasan and A Khosravi ldquoShort-term load andwind power forecasting using neural network-based predictionintervalsrdquo IEEE Transactions on Neural Networks and LearningSystems vol 25 no 2 pp 303ndash315 2014
Applied Computational Intelligence and Soft Computing 11
[25] K Bhaskar and S N Singh ldquoAWNN-assisted wind power fore-casting using feed-forward neural networkrdquo IEEE Transactionson Sustainable Energy vol 3 no 2 pp 306ndash315 2012
[26] Heat Pump Centre Newsletters International Energy Agency(IEA) Heat Pumping Technologies Vol 30 Nr 22012
[27] T G Dietterich ldquoAn experimental comparison of three meth-ods for constructing ensembles of decision trees baggingboosting and randomizationrdquo Machine Learning vol 40 no2 pp 139ndash157 2000
[28] SwedishDistrictHeatingAssociation 2016 httpwwwsvensk-fjarrvarmese
[29] K P Murphy Machine Learning A Probabilistic PerspectiveAdaptive Computation and Machine Learning Series MITPress 2012
[30] L Wang Support Vector Machines Theory and ApplicationsSpringer New York NY USA 2005
[31] D Basak S Pal and D C Patranabis ldquoSupport vector regres-sionsrdquoNeural Information ProcessingmdashLetters and Reviews vol11 no 10 pp 203ndash224 2007
[32] R E Edwards J New and L E Parker ldquoPredicting futurehourly residential electrical consumption a machine learningcase studyrdquo Energy and Buildings vol 49 pp 591ndash603 2012
[33] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[34] L Breiman J H Friedman R A Olshen and C J StoneClassification and Regression Trees Wadsworth and BrooksMonterey Calif USA 1984
[35] P Vezza C Comoglio M Rosso and A Viglione ldquoLowflows regionalization in North-Western Italyrdquo Water ResourcesManagement vol 24 no 14 pp 4049ndash4074 2010
[36] L Breiman and A Cutler ldquoRandom forestsrdquo January 2016httpwwwstatberkeleyeduusersbreimanRandomForests
[37] MATLAB Version 830 MathWorks Natick Mass USA 2014
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing 7
Figure 3: Random forest algorithm process flow. From the sample data, a training set is drawn to grow each single tree; predictors are selected at random (feature selection) and the data are split using the best of them, repeated until the criteria for stopping tree growth are fulfilled. The held-out Out-Of-Bag (OOB) data are used to estimate the error of the grown tree, and the whole procedure repeats until the specified number of trees is obtained, the random forest being the collection of all trees.
The advantage of PLS is that the algorithm takes the data structure of both the X and Y matrices into account. It also produces informative visualizations that aid the interpretation of the data. Last but not least, PLS can model several response variables at the same time, taking their structure into account.
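As a concrete illustration (not the paper's MATLAB implementation), a one-component PLS1 fit can be sketched in a few lines: project X onto the weight direction X^T y of maximum covariance with the response, then regress y on the resulting scores.

```python
from math import sqrt

def pls1_fit(X, y):
    """One-component PLS1 sketch: the weight vector w points along X^T y,
    the direction of maximum covariance with the response; y is then
    regressed on the score vector t = Xw."""
    n, m = len(X), len(X[0])
    w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(m)]
    norm = sqrt(sum(wj * wj for wj in w))
    w = [wj / norm for wj in w]
    t = [sum(X[i][j] * w[j] for j in range(m)) for i in range(n)]
    b = sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)
    return w, b

def pls1_predict(x, w, b):
    # Project a new observation onto w and scale by the regression coefficient.
    return b * sum(xj * wj for xj, wj in zip(x, w))
```

On rank-one data (e.g. X = [[1, 2], [2, 4], [3, 6]], y = [5, 10, 15]) a single component already reproduces the response exactly; real district-heating data would need several components with deflation of both X and Y, which the full algorithm provides.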
4.3.3. Random Forest (RF). The random forest algorithm, proposed by Breiman [33], is an ensemble-learning algorithm built from tree predictors, where each tree is formulated from a different random selection of features. It develops many decision trees based on random selections of data and of variables, and it determines the class of the dependent variable by aggregating over the many trees.
The method combines a large collection of decorrelated decision trees (i.e., Classification and Regression Trees (CART) [34]). Since all the trees are based on random selections of data as well as of variables, they are random trees, and many such random trees make up a random forest; the name forest signals that many decision trees are used to classify the dependent variable better. The CART technique divides the learning sample with an algorithm known as binary recursive partitioning: the splitting starts from the most important variable and proceeds to the less important ones, and it is applied to each new branch of the tree [35].
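The split search at the heart of binary recursive partitioning can be sketched as follows (an illustrative regression-tree criterion, not the authors' code): for one variable, every candidate threshold is tried and the one minimising the children's summed squared error around their means is kept.

```python
def sse(ys):
    # Sum of squared errors around the mean (CART regression impurity).
    if not ys:
        return 0.0
    mu = sum(ys) / len(ys)
    return sum((v - mu) ** 2 for v in ys)

def best_split(xs, ys):
    """Exhaustive one-variable binary split: return the threshold that
    minimises the summed SSE of the two resulting children."""
    best_thr, best_score = None, float("inf")
    for thr in sorted(set(xs))[:-1]:          # last value cannot split
        left = [v for x, v in zip(xs, ys) if x <= thr]
        right = [v for x, v in zip(xs, ys) if x > thr]
        score = sse(left) + sse(right)
        if score < best_score:
            best_thr, best_score = thr, score
    return best_thr, best_score
```

Recursing on each child with this search, most-informative variable first, yields the binary recursive partitioning described above.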
To increase the accuracy of the algorithm and reduce the generalization error of the ensemble of trees, a further technique called bagging is incorporated; the generalization error is estimated with the Out-Of-Bag (OOB) method. Bagging generates many bootstrap copies of the training dataset, each of which is used to grow one decision tree.
With the RF algorithm, each tree is grown as follows [36]:

(a) If the number of cases (observations) in the training set is N, sample N cases at random, but with replacement, from the original data; this sample is the training set for growing the tree.

(b) If there are M input variables, a number m ≪ M is specified such that, at each node, m variables are selected at random out of the M and the best split on these m is used to split the node. The value of m is held constant while the forest grows.

(c) Each tree is grown to the largest extent possible.
The process flow of the random forest model is shown in Figure 3.
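Steps (a) and (b) above amount to bootstrap sampling plus random feature selection; a minimal sketch (function and variable names are ours, not the paper's):

```python
import random

def grow_tree_inputs(n_cases, n_features, m, rng):
    """Inputs for growing one tree: draw N cases with replacement (step (a));
    the cases never drawn form the Out-Of-Bag set used to estimate the tree's
    error; m of the M features are the split candidates at a node (step (b))."""
    in_bag = [rng.randrange(n_cases) for _ in range(n_cases)]   # step (a)
    oob = sorted(set(range(n_cases)) - set(in_bag))             # OOB error set
    node_features = rng.sample(range(n_features), m)            # step (b)
    return in_bag, oob, node_features
```

Repeating this for each of the specified number of trees, and averaging the per-tree OOB errors, reproduces the loop structure of Figure 3.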
The error rate of RF depends primarily on the degree of correlation between any two trees and on the prediction accuracy of the individual trees. The principal advantages of the method are its ease of parallelization, its robustness, and its indifference to noise and outliers in most datasets. Owing to the unbiased nature of its execution, the method resists overfitting.
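The dependence on between-tree correlation can be made precise with a standard result from the ensemble-learning literature (not derived in the paper): for B identically distributed trees with variance σ² and pairwise correlation ρ, the variance of the forest average is

```latex
\operatorname{Var}\Bigl(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\Bigr)
  = \rho\,\sigma^{2} + \frac{1-\rho}{B}\,\sigma^{2},
```

so growing more trees removes the second term, while decorrelating them through random feature selection shrinks the first.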
Table 3: Performance results for one week.

                     Training                         Testing
Algorithm   MAE      MAPE (%)   Corr. coeff.   MAE      MAPE (%)   Corr. coeff.
SVR         64.91    3.11       0.90           70.32    3.43       0.91
PLS         265.24   12.73      0.81           238.68   10.42      0.79
RF          173.82   19.77      0.83           159.32   18.61      0.84
Figure 4: Hourly heat load pattern in relation to outdoor temperature in the DH network (heat output in MW versus outdoor temperature in °C; heat load measurements from Oct 1 to Apr 30).
5. Performance Evaluation and Results
The proposed approach is implemented in MATLAB R2014a [37] and executed on a PC with an Intel Core i7 processor at 2.7 GHz and 8 GB of RAM. As the training dataset we select the data measured during the first 28 weeks, comprising 4872 instances; as the prediction period we choose the 29th week as test data, that is, 148 instances. To evaluate the accuracy of the proposed algorithms we use the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the correlation coefficient, which measures the correlation between the actual and predicted heat load values. MAE and MAPE are defined as follows:
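The chronological train/test split described above can be expressed as (a sketch; the variable names are ours, not the paper's):

```python
def chronological_split(hourly_load, n_train=4872, n_test=148):
    """Time-ordered split as in the paper: the first 28 weeks (4872 hourly
    instances) train the models; the following 29th week (148 instances)
    is the held-out test set. No shuffling, so the test week strictly
    follows the training period."""
    train = hourly_load[:n_train]
    test = hourly_load[n_train:n_train + n_test]
    return train, test
```

Keeping the test week after the training period avoids leaking future measurements into the models, which matters for operational forecasting.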
MAE = (1/n) ∑_{i=1}^{n} |p_i − y_i|,

MAPE = (1/n) ∑_{i=1}^{n} |(p_i − y_i) / p_i|, (9)

where p_i is the actual value, y_i is the predicted value, and n is the number of samples in the training set. We apply 10-fold cross-validation to obtain valid evaluation metrics. Figure 4 presents the heat load in the network with respect to the outdoor temperature: the higher heat load values occur on days with lower outdoor temperatures, reflecting the increased consumption of heat.
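The two error measures of (9) are straightforward to compute; a sketch in Python (the paper's own implementation is in MATLAB):

```python
def mae(actual, predicted):
    # Mean absolute error between actual p_i and predicted y_i.
    return sum(abs(p - y) for p, y in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    # Mean absolute percentage error, scaled to percent as reported in Table 3.
    return 100.0 * sum(abs((p - y) / p)
                       for p, y in zip(actual, predicted)) / len(actual)
```

Because MAPE divides each residual by the actual load, errors on low-load hours weigh more heavily than the same absolute error on peak hours.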
Figure 5(a) shows the actual heat load and the heat load predicted for one week with the SVR algorithm. Figure 5(b) shows the one-week heat load prediction based on PLS together with the actual heat load, whereas the one-week heat load forecast based on RF is shown in Figure 5(c).
As can be seen from Figure 5, the heat load predicted with SVR is closest to the actual energy consumption, with a MAPE of about 3.43% and a correlation coefficient of 0.91; the correlation is consistent across training and testing. The graphs in Figures 5(b) and 5(c), for PLS and RF, respectively, are less accurate, with higher errors. The performance of PLS is significantly lower than that of SVR. Concerning RF, as the trees became progressively uncorrelated, the error rate declined significantly. The superior performance of SVR over the other two methods is attributed to its efficient modelling of the feature space, to its lower propensity to overfit, and to the fact that it requires no procedure for determining an explicit functional form, as ordinary regression analysis does.
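Part of SVR's robustness to overfitting comes from its ε-insensitive loss, which ignores residuals inside a tube of width ε; a sketch (the ε value is illustrative, as the paper does not report its hyperparameters):

```python
def epsilon_insensitive_loss(actual, predicted, eps=0.5):
    """SVR's eps-insensitive loss: residuals smaller than eps cost nothing,
    so the fit is not pulled toward every noisy observation. The eps value
    is illustrative, not taken from the paper."""
    return sum(max(0.0, abs(a - p) - eps) for a, p in zip(actual, predicted))
```

Only observations falling outside the tube become support vectors and shape the regression function, which keeps the model sparse and less sensitive to measurement noise in the hourly load data.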
Table 3 outlines the results of the investigated ML algorithms for heat consumption prediction in both the training phase (where we developed the models using supervised learning schemes) and the testing phase (generalization to unseen data). From Table 3 it is evident that SVR shows the best prediction performance in terms of average errors and correlation coefficient, confirming the superiority of SVR over the other machine-learning methods. Therefore, provided the number and type of the operating facilities are determined, SVR can be applied effectively in the management of a DHS. The mean absolute percentage error of 3.43% obtained in our approach with the SVR is lower than or equal to that of state-of-the-art approaches for heat load prediction in DHS, and SVR also outperforms PLS and RF in terms of mean errors and correlation coefficient. Nevertheless, direct comparison with other works is sometimes impossible because of differences in system implementation, input data, and experimental setup.
5.1. Comparison with State-of-the-Art Methods. Some prior research has predicted and analyzed the heat demand in DH; however, differences in system design, input data, architecture, implementation, and experimental setup sometimes make direct comparison difficult. Our approach uses operational data from a DHS to model the heat demand, and since the SVR exhibited the best prediction performance, we use this method to perform the comparison. We obtained a smaller MAPE than [17], whose experimental results showed a MAPE of 4.77%, itself lower than or at least equal to the mean percentage error of state-of-the-art regression approaches proposed for heat load forecasting in DH systems. Furthermore, reported results
Figure 5: Hourly heat load prediction based on SVR, PLS, and RF for one week (Mon–Sun; heat load in MW, actual versus predicted HL). (a) Predicted versus actual demand with SVR. (b) Predicted versus actual demand with PLS. (c) Predicted versus actual demand with RF.
from our study are also superior to the results presented in [15, 32]. More specifically, the SVR method we apply shows better results in terms of MAPE and correlation coefficient than [32], where the MAPE is 5.54% and the correlation coefficient 0.87. As for the comparison with [15] in terms of MAPE, our SVR method exhibits better hourly prediction performance, the MAPE for one week there being 5.63%; on the other hand, the PLS method of [15] performs better than ours, with a MAPE of about 8.99%.
6. Conclusion
The district heating (DH) sector can play an indispensable role in the current and future sustainable energy systems of the North European countries, where the share of DH in the total European heat market is significantly high. The innovations and emergence of new technology and the increasing focus on smart buildings challenge energy systems to adapt to customers' more flexible and individual solutions. Consequently, one needs to know more about what drives customers' requirements, choices, and priorities; the creation and application of innovative IT ecosystems in DH are therefore considered essential for improving energy efficiency.
Over the past decade, heat load prediction has attracted considerable research interest, since it can improve the energy efficiency of the DHS, which in turn reduces costs for heat suppliers and yields many environmental benefits.
In this paper, three ML algorithms for heat load prediction in a DH network, SVR, PLS, and RF, are developed and presented. The heat load prediction models
were developed using data from 29 weeks, and the predicted hourly results were compared with actual heat load data. The performances of the three ML algorithms were studied, compared, and analyzed. The SVR algorithm proved the most efficient, with the best performance in terms of average errors and correlation coefficient. Moreover, the prediction results were also compared against existing SVR and PLS methods in the literature, showing that the SVR presented in this paper achieves better accuracy.
In conclusion, the comparison results confirm that the developed SVR method is appropriate for application in heat load prediction and can serve as a promising alternative to existing models.
As future work, apart from the outdoor temperature, we intend to incorporate other meteorological parameters that influence the heat load, such as wind speed and humidity.
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
The authors would like to thank Eidsiva Bioenergi AS, Norway, which kindly provided the data for this research work.
References
[1] Report on European Energy Security Strategy, Brussels, May 2014, http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52014DC0330&from=EN.
[2] A. R. Day, P. Ogumka, P. G. Jones, and A. Dunsdon, "The use of the planning system to encourage low carbon energy technologies in buildings," Renewable Energy, vol. 34, no. 9, pp. 2016–2021, 2009.
[3] IEA (International Energy Agency), Nordic Energy Technology Perspectives, OECD/IEA, Paris, France, 2013.
[4] A. Gebremedhin and H. Zinko, "Seasonal heat storages in district heating systems," in Proceedings of the 11th International Thermal Energy Storage for Energy Efficiency and Sustainability Conference (Effstock '09), Stockholm, Sweden, June 2009.
[5] H. Lund, S. Werner, R. Wiltshire et al., "4th Generation District Heating (4GDH): integrating smart thermal grids into future sustainable energy systems," Energy, vol. 68, pp. 1–11, 2014.
[6] F. Dalipi, S. Y. Yayilgan, and A. Gebremedhin, "A machine learning approach to increase energy efficiency in district heating systems," in Proceedings of the International Conference on Environmental Engineering and Computer Application (ICEECA '14), pp. 223–226, CRC Press, Hong Kong, December 2014.
[7] N. Fumo, "A review on the basics of building energy estimation," Renewable and Sustainable Energy Reviews, vol. 31, pp. 53–60, 2014.
[8] D. E. Claridge, "Using simulation models for building commissioning," in Proceedings of the International Conference on Enhanced Building Operation, Energy Systems Laboratory, Texas A&M University, October 2004.
[9] T. Mestekemper, Energy demand forecasting and dynamic water temperature management [Ph.D. thesis], Bielefeld University, Bielefeld, Germany, 2011.
[10] E. Dotzauer, "Simple model for prediction of loads in district-heating systems," Applied Energy, vol. 73, no. 3-4, pp. 277–284, 2002.
[11] H. A. Nielsen and H. Madsen, "Modelling the heat consumption in district heating systems using a grey-box approach," Energy and Buildings, vol. 38, no. 1, pp. 63–71, 2006.
[12] B. B. Ekici and U. T. Aksoy, "Prediction of building energy consumption by using artificial neural networks," Advances in Engineering Software, vol. 40, no. 5, pp. 356–362, 2009.
[13] M. Grzenda and B. Macukow, "Demand prediction with multi-stage neural processing," in Advances in Natural Computation and Data Mining, pp. 131–141, 2006.
[14] K. Kato, M. Sakawa, K. Ishimaru, S. Ushiro, and T. Shibano, "Heat load prediction through recurrent neural network in district heating and cooling systems," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '08), pp. 1401–1406, Singapore, October 2008.
[15] T. C. Park, U. S. Kim, L.-H. Kim, B. W. Jo, and Y. K. Yeo, "Heat consumption forecasting using partial least squares, artificial neural network and support vector regression techniques in district heating systems," Korean Journal of Chemical Engineering, vol. 27, no. 4, pp. 1063–1071, 2010.
[16] T. Catalina, V. Iordache, and B. Caracaleanu, "Multiple regression model for fast prediction of the heating energy demand," Energy and Buildings, vol. 57, pp. 302–312, 2013.
[17] S. Provatas, N. Lavesson, and C. Johansson, "An online machine learning algorithm for heat load forecasting in district heating systems," in Proceedings of the 14th International Symposium on District Heating and Cooling, Stockholm, Sweden, September 2014.
[18] S. Idowu, S. Saguna, C. Ahlund, and O. Schelen, "Forecasting heat load for smart district heating systems: a machine learning approach," in Proceedings of the IEEE International Conference on Smart Grid Communications (SmartGridComm '14), pp. 554–559, IEEE, Venice, Italy, November 2014.
[19] L. Wu, G. Kaiser, D. Solomon, R. Winter, A. Boulanger, and R. Anderson, Improving Efficiency and Reliability of Building Systems Using Machine Learning and Automated Online Evaluation, Columbia University Academic Commons, 2012, http://hdl.handle.net/10022/AC:P:13213.
[20] M. Vlachopoulou, G. Chin, J. C. Fuller, S. Lu, and K. Kalsi, "Model for aggregated water heater load using dynamic Bayesian networks," Tech. Rep., Pacific Northwest National Laboratory (PNNL), Richland, Wash, USA, 2012.
[21] S. Shamshirband, D. Petkovic, R. Enayatifar et al., "Heat load prediction in district heating systems with adaptive neuro-fuzzy method," Renewable and Sustainable Energy Reviews, vol. 48, pp. 760–767, 2015.
[22] M. Protic, S. Shamshirband, M. H. Anisi et al., "Appraisal of soft computing methods for short term consumers' heat load prediction in district heating systems," Energy, vol. 82, pp. 697–704, 2015.
[23] S. Shamshirband, D. Petkovic, A. Amini et al., "Support vector regression methodology for wind turbine reaction torque prediction with power-split hydrostatic continuous variable transmission," Energy, vol. 67, pp. 623–630, 2014.
[24] H. Quan, D. Srinivasan, and A. Khosravi, "Short-term load and wind power forecasting using neural network-based prediction intervals," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 2, pp. 303–315, 2014.
[25] K. Bhaskar and S. N. Singh, "AWNN-assisted wind power forecasting using feed-forward neural network," IEEE Transactions on Sustainable Energy, vol. 3, no. 2, pp. 306–315, 2012.
[26] Heat Pump Centre Newsletter, International Energy Agency (IEA) Heat Pumping Technologies, vol. 30, no. 2, 2012.
[27] T. G. Dietterich, "An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization," Machine Learning, vol. 40, no. 2, pp. 139–157, 2000.
[28] Swedish District Heating Association, 2016, http://www.svenskfjarrvarme.se.
[29] K. P. Murphy, Machine Learning: A Probabilistic Perspective, Adaptive Computation and Machine Learning Series, MIT Press, 2012.
[30] L. Wang, Support Vector Machines: Theory and Applications, Springer, New York, NY, USA, 2005.
[31] D. Basak, S. Pal, and D. C. Patranabis, "Support vector regressions," Neural Information Processing–Letters and Reviews, vol. 11, no. 10, pp. 203–224, 2007.
[32] R. E. Edwards, J. New, and L. E. Parker, "Predicting future hourly residential electrical consumption: a machine learning case study," Energy and Buildings, vol. 49, pp. 591–603, 2012.
[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[34] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Wadsworth and Brooks, Monterey, Calif, USA, 1984.
[35] P. Vezza, C. Comoglio, M. Rosso, and A. Viglione, "Low flows regionalization in North-Western Italy," Water Resources Management, vol. 24, no. 14, pp. 4049–4074, 2010.
[36] L. Breiman and A. Cutler, "Random forests," January 2016, http://www.stat.berkeley.edu/users/breiman/RandomForests.
[37] MATLAB, Version 8.3.0, MathWorks, Natick, Mass, USA, 2014.
8 Applied Computational Intelligence and Soft Computing
Table 3 Performance results for one week
Algorithm Training TestingMAE () MAPE () Correlation coefficient MAE () MAPE () Correlation coefficient
SVR 6491 311 090 7032 343 091PLS 26524 1273 081 23868 1042 079RF 17382 1977 083 15932 1861 084
Heat output (MW)
Heat load measurements (Oct 1ndashApr 30)
minus500
000
500
1000
1500
2000
2500
3000
Outdoor temp (∘C)
Figure 4 Hourly heat load pattern in relation to outdoor tempera-ture in the DH network
5 Performance Evaluation and Results
The proposed approach is implemented in MATLAB R2014a[37] and executed in a PC with Intel Core i7 processor with27GHz speed and 8GB of RAM In this work as trainingdataset we select the datameasured during the first 28 weekswhich consist of 4872 instances As for prediction period wechoose the 29th week as test data that is 148 instances Inorder to evaluate the performance of the proposed algorithmsin terms of accuracy of the results we use the mean absoluteaverage (MAE) the mean average percentage error (MAPE)and the correlation coefficient which measures the correla-tion between the actual and predicted heat load values MAEand MAPE are defined as follows
MAE =sum
119899
119894=1
1003816
1003816
1003816
1003816
119901
119894minus 119910
119894
1003816
1003816
1003816
1003816
119899
MAPE =sum
119899
119894=1
1003816
1003816
1003816
1003816
1003816
(119901
119894minus 119910
1015840
119894) 119901
119894
1003816
1003816
1003816
1003816
1003816
119899
(9)
where 119901119894is the actual value 119910
119894is the predicted value and 119899 is
the number of samples in the training setWe apply a 10-fold cross-validation for obtaining the valid
evaluation metrics Figure 4 presents the heat load in thenetwork with respect to the outdoor temperatureWe see thatthe higher values of the heat load occur during the days withlower outdoor temperature values which in fact reflects theincreased consumption of heat
Figure 5(a) shows results of actual heat load and heatload prediction for one week based on SVR algorithm InFigure 5(b) results of heat prediction production for oneweek based on PLS are shown with data on actual heal load
whereas results of heat load forecasting for one week basedon RF are shown in Figure 5(c)
As can be seen from Figure 5 the predicted HL withSVR is closer to the actual energy consumption with anMAPE value of about 343 and a correlation coefficient of091 Correlation has been consistent throughout training andtesting On the other hand graphs presented in Figures 5(b)and 5(c) for PLS and RF respectively are less accurate withhigher errors The performance of the PLS is significantlylower compared to SVR Concerning RF as the trees startedto become progressively uncorrelated the error rate alsodeclined significantly The best performance of the SVR overthe other two methods is attributed to the efficient modellingof feature space and to the fact that SVR is less proneto overfitting and that does not require any procedure todetermine the explicit form as in ordinary regression analysis
Table 3 outlines the results of the investigated ML algo-rithms for heat consumption prediction for both trainingphase (where we developed the models using supervisedlearning schemes) and testing phase (to generalize newlyunseen data) From Table 3 it is evident that SVR showsthe best prediction performance in terms of average errorsand correlation coefficient confirming the superiority of SVRover the other machine-learning methods Therefore basedon the assumption that the number and type of the operatingfacilities should be determined SVRcanbe effectively appliedin the management of a DHS The mean absolute percentageerror value of 343 obtained in our approachwith the SVR islower than or equal to the mean absolute percentage error ofstate-of-the-art approaches for heat load prediction in DHSMoreover SVR is also better than PLS and RF in terms ofmean average errors and correlation coefficient Neverthelesssometimes it is impossible to perform direct comparisonwith other works due to different system implementation andinput data and different structure of the experimental setup
51 Comparison with State-of-the-Art Methods Some priorresearch work is carried out to predict and analyze the headdemand in DH However due to different system designand input data and different architecture implementation orstructure of the experimental setup sometimes it is difficult toperform direct comparison Our approach uses operationaldata from DHS to model the heat demand and since theSVR exhibited the best prediction performance we use thismethod to perform the comparison In our case we obtainedsmaller MAPE compared to [17] where the experimentalresults showed MAPE of 477 which is lower than or atleast equal to the mean percentage error of state-of-the-artregression approaches that have been proposed for heat loadforecasting in DH systems Furthermore reported results
Applied Computational Intelligence and Soft Computing 9
[Figure 5: Hourly heat load prediction based on SVR, PLS, and RF for one week. Each panel plots actual HL against predicted HL in MW, Monday through Sunday: (a) predicted versus actual demand with SVR; (b) predicted versus actual demand with PLS; (c) predicted versus actual demand with RF.]
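A weekly actual-versus-predicted panel in the style of Figure 5 can be produced with a short plotting sketch. The two series below are hypothetical stand-ins for the actual and SVR-predicted loads, not the paper's data:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import numpy as np

hours = np.arange(168)  # one week of hourly values, Mon-Sun
# Hypothetical heat load (MW): a daily cycle plus noise for "actual"
actual = 15 + 4 * np.sin(2 * np.pi * hours / 24) + np.random.default_rng(1).normal(0, 0.6, 168)
predicted = 15 + 4 * np.sin(2 * np.pi * hours / 24)

plt.figure(figsize=(8, 3))
plt.plot(hours, actual, label="Actual HL")
plt.plot(hours, predicted, "--", label="Predicted HL")
plt.xticks(np.arange(0, 168, 24), ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"])
plt.ylabel("Heat load (MW)")
plt.title("Support Vector Regression (SVR)")
plt.legend()
plt.tight_layout()
plt.savefig("svr_week.png", dpi=100)
```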
from our study are also superior to the results presented in [15, 32]. More specifically, the SVR method we apply shows better results in terms of MAPE and correlation coefficient compared to [32], where the MAPE is 5.54% and the correlation coefficient is 0.87. As far as the comparison with [15] is concerned, in terms of MAPE our SVR method exhibits better hourly prediction performance, the MAPE for one week there being 5.63%. On the other hand, the PLS method of [15] performs better than in our case, where the MAPE value is about 8.99%.
6 Conclusion
The district heating (DH) sector can play an indispensable role in the current and future sustainable energy systems of the North European countries, where the share of DH in the total European heat market is significantly high. The innovations and emergence of new technology, together with the increasing focus on smart buildings, challenge energy systems to adapt to customers' more flexible and individual solutions. Consequently, one needs to know more about what drives customers' requirements, choices, and priorities. Therefore, the creation and application of innovative IT ecosystems in DH are considered essential to improving energy efficiency.
Heat load prediction has attracted a lot of interest from researchers over the past decade, since it can assist the energy efficiency of the DHS, which in turn leads to cost reductions for heat suppliers and to many environmental benefits.
In this paper, three ML algorithms for heat load prediction in a DH network are developed and presented. The algorithms are SVR, PLS, and RF. The heat load prediction models
were developed using data from 29 weeks. The predicted hourly results were compared with actual heat load data, and the performances of the three different ML algorithms were studied, compared, and analyzed. The SVR algorithm proved to be the most efficient one, producing the best performance in terms of average errors and correlation coefficient. Moreover, the prediction results were also compared against existing SVR and PLS methods in the literature, showing that the SVR presented in this paper produces better accuracy.
In conclusion, the comparison results validate the notion that the developed SVR method is appropriate for application in heat load prediction, or at least can serve as a promising alternative to existing models.
As for future work, apart from outdoor temperature, we intend to incorporate other meteorological parameters influencing the heat load, such as wind speed and humidity.
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
The authors would like to thank Eidsiva Bioenergi AS, Norway, which kindly provided the data for this research work.
References
[1] Report on European Energy Security Strategy, May 2014, Brussels, http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52014DC0330&from=EN.
[2] A. R. Day, P. Ogumka, P. G. Jones, and A. Dunsdon, "The use of the planning system to encourage low carbon energy technologies in buildings," Renewable Energy, vol. 34, no. 9, pp. 2016–2021, 2009.
[3] IEA (International Energy Agency), Nordic Energy Technology Perspectives, OECD/IEA, Paris, France, 2013.
[4] A. Gebremedhin and H. Zinko, "Seasonal heat storages in district heating systems," in Proceedings of the 11th International Thermal Energy Storage for Energy Efficiency and Sustainability Conference (Effstock '09), Stockholm, Sweden, June 2009.
[5] H. Lund, S. Werner, R. Wiltshire et al., "4th Generation District Heating (4GDH): integrating smart thermal grids into future sustainable energy systems," Energy, vol. 68, pp. 1–11, 2014.
[6] F. Dalipi, S. Y. Yayilgan, and A. Gebremedhin, "A machine learning approach to increase energy efficiency in district heating systems," in Proceedings of the International Conference on Environmental Engineering and Computer Application (ICEECA '14), pp. 223–226, CRC Press, Hong Kong, December 2014.
[7] N. Fumo, "A review on the basics of building energy estimation," Renewable and Sustainable Energy Reviews, vol. 31, pp. 53–60, 2014.
[8] D. E. Claridge, "Using simulation models for building commissioning," in Proceedings of the International Conference on Enhanced Building Operation, Energy Systems Laboratory, Texas A&M University, October 2004.
[9] T. Mestekemper, Energy demand forecasting and dynamic water temperature management [PhD thesis], Bielefeld University, Bielefeld, Germany, 2011.
[10] E. Dotzauer, "Simple model for prediction of loads in district-heating systems," Applied Energy, vol. 73, no. 3-4, pp. 277–284, 2002.
[11] H. A. Nielsen and H. Madsen, "Modelling the heat consumption in district heating systems using a grey-box approach," Energy and Buildings, vol. 38, no. 1, pp. 63–71, 2006.
[12] B. B. Ekici and U. T. Aksoy, "Prediction of building energy consumption by using artificial neural networks," Advances in Engineering Software, vol. 40, no. 5, pp. 356–362, 2009.
[13] M. Grzenda and B. Macukow, "Demand prediction with multi-stage neural processing," in Advances in Natural Computation and Data Mining, pp. 131–141, 2006.
[14] K. Kato, M. Sakawa, K. Ishimaru, S. Ushiro, and T. Shibano, "Heat load prediction through recurrent neural network in district heating and cooling systems," in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '08), pp. 1401–1406, Singapore, October 2008.
[15] T. C. Park, U. S. Kim, L.-H. Kim, B. W. Jo, and Y. K. Yeo, "Heat consumption forecasting using partial least squares, artificial neural network and support vector regression techniques in district heating systems," Korean Journal of Chemical Engineering, vol. 27, no. 4, pp. 1063–1071, 2010.
[16] T. Catalina, V. Iordache, and B. Caracaleanu, "Multiple regression model for fast prediction of the heating energy demand," Energy and Buildings, vol. 57, pp. 302–312, 2013.
[17] S. Provatas, N. Lavesson, and C. Johansson, "An online machine learning algorithm for heat load forecasting in district heating systems," in Proceedings of the 14th International Symposium on District Heating and Cooling, Stockholm, Sweden, September 2014.
[18] S. Idowu, S. Saguna, C. Ahlund, and O. Schelen, "Forecasting heat load for smart district heating systems: a machine learning approach," in Proceedings of the IEEE International Conference on Smart Grid Communications (SmartGridComm '14), pp. 554–559, IEEE, Venice, Italy, November 2014.
[19] L. Wu, G. Kaiser, D. Solomon, R. Winter, A. Boulanger, and R. Anderson, Improving Efficiency and Reliability of Building Systems Using Machine Learning and Automated Online Evaluation, Columbia University Academic Commons, 2012, http://hdl.handle.net/10022/AC:P:13213.
[20] M. Vlachopoulou, G. Chin, J. C. Fuller, S. Lu, and K. Kalsi, "Model for aggregated water heater load using dynamic Bayesian networks," Tech. Rep., Pacific Northwest National Laboratory (PNNL), Richland, Wash, USA, 2012.
[21] S. Shamshirband, D. Petkovic, R. Enayatifar et al., "Heat load prediction in district heating systems with adaptive neuro-fuzzy method," Renewable and Sustainable Energy Reviews, vol. 48, pp. 760–767, 2015.
[22] M. Protic, S. Shamshirband, M. H. Anisi et al., "Appraisal of soft computing methods for short term consumers' heat load prediction in district heating systems," Energy, vol. 82, pp. 697–704, 2015.
[23] S. Shamshirband, D. Petkovic, A. Amini et al., "Support vector regression methodology for wind turbine reaction torque prediction with power-split hydrostatic continuous variable transmission," Energy, vol. 67, pp. 623–630, 2014.
[24] H. Quan, D. Srinivasan, and A. Khosravi, "Short-term load and wind power forecasting using neural network-based prediction intervals," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 2, pp. 303–315, 2014.
[25] K. Bhaskar and S. N. Singh, "AWNN-assisted wind power forecasting using feed-forward neural network," IEEE Transactions on Sustainable Energy, vol. 3, no. 2, pp. 306–315, 2012.
[26] Heat Pump Centre Newsletters, International Energy Agency (IEA) Heat Pumping Technologies, vol. 30, no. 2, 2012.
[27] T. G. Dietterich, "An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization," Machine Learning, vol. 40, no. 2, pp. 139–157, 2000.
[28] Swedish District Heating Association, 2016, http://www.svenskfjarrvarme.se.
[29] K. P. Murphy, Machine Learning: A Probabilistic Perspective, Adaptive Computation and Machine Learning Series, MIT Press, 2012.
[30] L. Wang, Support Vector Machines: Theory and Applications, Springer, New York, NY, USA, 2005.
[31] D. Basak, S. Pal, and D. C. Patranabis, "Support vector regressions," Neural Information Processing—Letters and Reviews, vol. 11, no. 10, pp. 203–224, 2007.
[32] R. E. Edwards, J. New, and L. E. Parker, "Predicting future hourly residential electrical consumption: a machine learning case study," Energy and Buildings, vol. 49, pp. 591–603, 2012.
[33] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[34] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Wadsworth and Brooks, Monterey, Calif, USA, 1984.
[35] P. Vezza, C. Comoglio, M. Rosso, and A. Viglione, "Low flows regionalization in North-Western Italy," Water Resources Management, vol. 24, no. 14, pp. 4049–4074, 2010.
[36] L. Breiman and A. Cutler, "Random forests," January 2016, http://www.stat.berkeley.edu/users/breiman/RandomForests.
[37] MATLAB, Version 8.3.0, MathWorks, Natick, Mass, USA, 2014.
Applied Computational Intelligence and Soft Computing 9
Mon Tu
e
Wed Th
u Fri
Sat
Sun
Support Vector Regression (SVR)
Actual HLPredicted HL
000
500
1000
1500
2000
2500
Hea
t loa
d (M
W)
(a) Predicted versus actual demand with SVR
000
500
1000
1500
2000
2500
Mon Tu
e
Wed Thu Fri
Sat
Sun
Hea
t loa
d (M
W)
Partial Least Squares (PLS)
Actual HLPredicted HL
(b) Predicted versus actual demand with PLS
000
500
1000
1500
2000
2500
Hea
t loa
d (M
W)
Random forest (RF)
Mon Tu
e
Wed Th
u Fri
Sat
Sun
Actual HLPredicted HL
(c) Predicted versus actual demand with RF
Figure 5 Hourly heat load prediction based on SVR PLS and RF for one week
from our study are also superior compared to the resultspresented in [15 32] More specifically the SVR method weapply shows better results in terms of MAPE and correlationcoefficient compared to [32] where MAPE in this paperis 554 and the correlation coefficient is 087 As far asthe comparison with work [15] is concerned in terms ofMAPE our SVR method exhibits better hourly predictionperformance where theMAPE for one week is 563 On theother hand the PLSmethod performs better than in our casehaving the MAPE value of about 899
6 Conclusion
District heating (DH) sector can play an indispensable rolein the current and future sustainable energy system of theNorth European countries where the share of DH in the total
European heat market is significantly high The innovationsand emergence of new technology and the increasing focuson smart buildings impose energy systems to face a chal-lenge in adapting to customersrsquo more flexible and individualsolutions Consequently one needs to know more aboutwhat drives customersrsquo requirements choices and prioritiesTherefore the creation and application of innovative ITecosystems in DH are considered essential to improve energyefficiency
Heat load prediction in the past decade has attracted alot of interest to the researchers since it can assist in energyefficiency of the DHS which also leads to cost reduction forheat suppliers and to many environmental benefits
In this paper three ML algorithms for heat load predic-tion in a DH network are developed and presentedThe algo-rithms are SVR PLS andRFTheheat load predictionmodels
10 Applied Computational Intelligence and Soft Computing
were developed using data from 29 weeks The predictedhourly results were compared with actual heat load dataPerformances of these three different ML algorithms werestudied compared and analyzed SVR algorithm proved tobe the most efficient one producing the best performance interms of average errors and correlation coefficient Moreoverthe prediction results were also compared against existingSVR and PLS methods in literature showing that the SVRpresented in this paper produces better accuracy
In conclusion the comparison results validate the notionthat the developed SVRmethod is appropriate for applicationin heat load prediction or it can serve as a promisingalternative for existing models
As for the future work apart from outdoor temperaturewe intend to incorporate other meteorological parametersinfluencing the heat load such as wind speed and humidity
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
The authors would like to thank Eidsiva Bioenergi AS Nor-way which kindly provided the data for this research work
References
[1] Report on European Energy Security Strategy May 2014 Brus-sels httpeur-lexeuropaeulegal-contentENTXTPDFuri=CELEX52014DC0330ampfrom=EN
[2] A R Day P Ogumka P G Jones and A Dunsdon ldquoTheuse of the planning system to encourage low carbon energytechnologies in buildingsrdquo Renewable Energy vol 34 no 9 pp2016ndash2021 2009
[3] IEA (International Energy Agency) Nordic Energy TechnologyPerspectives OECDIEA Paris France 2013
[4] A Gebremedhin and H Zinko ldquoSeasonal heat storages indistrict heating systemsrdquo in Proceedings of the 11th InternationalThermal Energy Storage for Energy Efficiency and SustainabilityConference (Effstock rsquo09) Stockholm Sweden June 2009
[5] H Lund S Werner R Wiltshire et al ldquo4th Generation DistrictHeating (4GDH) integrating smart thermal grids into futuresustainable energy systemsrdquo Energy vol 68 pp 1ndash11 2014
[6] F Dalipi S Y Yayilgan and A Gebremedhin ldquoA machinelearning approach to increase energy efficiency in district heat-ing systemsrdquo in Proceedings of the International Conference onEnvironmental Engineering and Computer Application (ICEECArsquo14) pp 223ndash226 CRC Press Hong Kong December 2014
[7] N Fumo ldquoA review on the basics of building energy estimationrdquoRenewable and Sustainable Energy Reviews vol 31 pp 53ndash602014
[8] D E Claridge ldquoUsing simulation models for building com-missioningrdquo in Proceedings of the International Conferenceon Enhanced Building Operation Energy Systems LaboratoryTexas AampM University October 2004
[9] T Mestekemper Energy demand forecasting and dynamic watertemperature management [PhD thesis] Bielefeld UniversityBielefeld Germany 2011
[10] E Dotzauer ldquoSimple model for prediction of loads in district-heating systemsrdquo Applied Energy vol 73 no 3-4 pp 277ndash2842002
[11] HANielsen andHMadsen ldquoModelling the heat consumptionin district heating systems using a grey-box approachrdquo Energyand Buildings vol 38 no 1 pp 63ndash71 2006
[12] B B Ekici and U T Aksoy ldquoPrediction of building energyconsumption by using artificial neural networksrdquo Advances inEngineering Software vol 40 no 5 pp 356ndash362 2009
[13] M Grzenda and B Macukow ldquoDemand prediction with multi-stage neural processingrdquo in Advances in Natural Computationand Data Mining pp 131ndash141 2006
[14] K Kato M Sakawa K Ishimaru S Ushiro and T ShibanoldquoHeat load prediction through recurrent neural network indistrict heating and cooling systemsrdquo in Proceedings of the IEEEInternational Conference on SystemsMan andCybernetics (SMCrsquo08) pp 1401ndash1406 Singapore October 2008
[15] T C Park U S Kim L-H Kim B W Jo and Y K Yeo ldquoHeatconsumption forecasting using partial least squares artificialneural network and support vector regression techniques in dis-trict heating systemsrdquo Korean Journal of Chemical Engineeringvol 27 no 4 pp 1063ndash1071 2010
[16] T Catalina V Iordache and B Caracaleanu ldquoMultiple regres-sion model for fast prediction of the heating energy demandrdquoEnergy and Buildings vol 57 pp 302ndash312 2013
[17] S Provatas N Lavesson and C Johansson ldquoAn onlinemachinelearning algorithm for heat load forecasting in district heatingsystemsrdquo in Proceedings of the 14th International Symposium onDistrict Heating and Cooling Stockholm Sweden September2014
[18] S Idowu S Saguna C Ahlund and O Schelen ldquoForecastingheat load for smart district heating systems a machine learningapproachrdquo in Proceedings of the IEEE International Conferenceon Smart Grid Communications (SmartGridComm rsquo14) pp 554ndash559 IEEE Venice Italy November 2014
[19] L Wu G Kaiser D Solomon R Winter A Boulanger andR Anderson Improving Efficiency and Reliability of Build-ing Systems Using Machine Learning and Automated OnlineEvaluation Columbia University Academic Commons 2012httphdlhandlenet10022ACP13213
[20] M Vlachopoulou G Chin J C Fuller S Lu and K KalsildquoModel for aggregated water heater load using dynamicBayesian networksrdquo Tech Rep Pacific Northwest NationalLaboratory (PNNL) Richland Wash USA 2012
[21] S Shamshirband D Petkovic R Enayatifar et al ldquoHeat loadprediction in district heating systemswith adaptive neuro-fuzzymethodrdquoRenewable and Sustainable Energy Reviews vol 48 pp760ndash767 2015
[22] M Protic S Shamshirband M H Anisi et al ldquoAppraisal ofsoft computing methods for short term consumersrsquo heat loadprediction in district heating systemsrdquo Energy vol 82 pp 697ndash704 2015
[23] S Shamshirband D Petkovic A Amini et al ldquoSupport vec-tor regression methodology for wind turbine reaction torqueprediction with power-split hydrostatic continuous variabletransmissionrdquo Energy vol 67 pp 623ndash630 2014
[24] H Quan D Srinivasan and A Khosravi ldquoShort-term load andwind power forecasting using neural network-based predictionintervalsrdquo IEEE Transactions on Neural Networks and LearningSystems vol 25 no 2 pp 303ndash315 2014
Applied Computational Intelligence and Soft Computing 11
[25] K Bhaskar and S N Singh ldquoAWNN-assisted wind power fore-casting using feed-forward neural networkrdquo IEEE Transactionson Sustainable Energy vol 3 no 2 pp 306ndash315 2012
[26] Heat Pump Centre Newsletters International Energy Agency(IEA) Heat Pumping Technologies Vol 30 Nr 22012
[27] T G Dietterich ldquoAn experimental comparison of three meth-ods for constructing ensembles of decision trees baggingboosting and randomizationrdquo Machine Learning vol 40 no2 pp 139ndash157 2000
[28] SwedishDistrictHeatingAssociation 2016 httpwwwsvensk-fjarrvarmese
[29] K P Murphy Machine Learning A Probabilistic PerspectiveAdaptive Computation and Machine Learning Series MITPress 2012
[30] L Wang Support Vector Machines Theory and ApplicationsSpringer New York NY USA 2005
[31] D Basak S Pal and D C Patranabis ldquoSupport vector regres-sionsrdquoNeural Information ProcessingmdashLetters and Reviews vol11 no 10 pp 203ndash224 2007
[32] R E Edwards J New and L E Parker ldquoPredicting futurehourly residential electrical consumption a machine learningcase studyrdquo Energy and Buildings vol 49 pp 591ndash603 2012
[33] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[34] L Breiman J H Friedman R A Olshen and C J StoneClassification and Regression Trees Wadsworth and BrooksMonterey Calif USA 1984
[35] P Vezza C Comoglio M Rosso and A Viglione ldquoLowflows regionalization in North-Western Italyrdquo Water ResourcesManagement vol 24 no 14 pp 4049ndash4074 2010
[36] L Breiman and A Cutler ldquoRandom forestsrdquo January 2016httpwwwstatberkeleyeduusersbreimanRandomForests
[37] MATLAB Version 830 MathWorks Natick Mass USA 2014
Submit your manuscripts athttpwwwhindawicom
Computer Games Technology
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Distributed Sensor Networks
International Journal of
Advances in
FuzzySystems
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014
International Journal of
ReconfigurableComputing
Hindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Applied Computational Intelligence and Soft Computing
thinspAdvancesthinspinthinsp
Artificial Intelligence
HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014
Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Journal of
Computer Networks and Communications
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation
httpwwwhindawicom Volume 2014
Advances in
Multimedia
International Journal of
Biomedical Imaging
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
ArtificialNeural Systems
Advances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Computational Intelligence and Neuroscience
Industrial EngineeringJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Human-ComputerInteraction
Advances in
Computer EngineeringAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
10 Applied Computational Intelligence and Soft Computing
were developed using data from 29 weeks The predictedhourly results were compared with actual heat load dataPerformances of these three different ML algorithms werestudied compared and analyzed SVR algorithm proved tobe the most efficient one producing the best performance interms of average errors and correlation coefficient Moreoverthe prediction results were also compared against existingSVR and PLS methods in literature showing that the SVRpresented in this paper produces better accuracy
In conclusion the comparison results validate the notionthat the developed SVRmethod is appropriate for applicationin heat load prediction or it can serve as a promisingalternative for existing models
As for the future work apart from outdoor temperaturewe intend to incorporate other meteorological parametersinfluencing the heat load such as wind speed and humidity
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
The authors would like to thank Eidsiva Bioenergi AS Nor-way which kindly provided the data for this research work
References
[1] Report on European Energy Security Strategy May 2014 Brus-sels httpeur-lexeuropaeulegal-contentENTXTPDFuri=CELEX52014DC0330ampfrom=EN
[2] A R Day P Ogumka P G Jones and A Dunsdon ldquoTheuse of the planning system to encourage low carbon energytechnologies in buildingsrdquo Renewable Energy vol 34 no 9 pp2016ndash2021 2009
[3] IEA (International Energy Agency) Nordic Energy TechnologyPerspectives OECDIEA Paris France 2013
[4] A Gebremedhin and H Zinko ldquoSeasonal heat storages indistrict heating systemsrdquo in Proceedings of the 11th InternationalThermal Energy Storage for Energy Efficiency and SustainabilityConference (Effstock rsquo09) Stockholm Sweden June 2009
[5] H Lund S Werner R Wiltshire et al ldquo4th Generation DistrictHeating (4GDH) integrating smart thermal grids into futuresustainable energy systemsrdquo Energy vol 68 pp 1ndash11 2014
[6] F Dalipi S Y Yayilgan and A Gebremedhin ldquoA machinelearning approach to increase energy efficiency in district heat-ing systemsrdquo in Proceedings of the International Conference onEnvironmental Engineering and Computer Application (ICEECArsquo14) pp 223ndash226 CRC Press Hong Kong December 2014
[7] N Fumo ldquoA review on the basics of building energy estimationrdquoRenewable and Sustainable Energy Reviews vol 31 pp 53ndash602014
[8] D E Claridge ldquoUsing simulation models for building com-missioningrdquo in Proceedings of the International Conferenceon Enhanced Building Operation Energy Systems LaboratoryTexas AampM University October 2004
[9] T Mestekemper Energy demand forecasting and dynamic watertemperature management [PhD thesis] Bielefeld UniversityBielefeld Germany 2011
[10] E Dotzauer ldquoSimple model for prediction of loads in district-heating systemsrdquo Applied Energy vol 73 no 3-4 pp 277ndash2842002
[11] HANielsen andHMadsen ldquoModelling the heat consumptionin district heating systems using a grey-box approachrdquo Energyand Buildings vol 38 no 1 pp 63ndash71 2006
[12] B B Ekici and U T Aksoy ldquoPrediction of building energyconsumption by using artificial neural networksrdquo Advances inEngineering Software vol 40 no 5 pp 356ndash362 2009
[13] M Grzenda and B Macukow ldquoDemand prediction with multi-stage neural processingrdquo in Advances in Natural Computationand Data Mining pp 131ndash141 2006
[14] K Kato M Sakawa K Ishimaru S Ushiro and T ShibanoldquoHeat load prediction through recurrent neural network indistrict heating and cooling systemsrdquo in Proceedings of the IEEEInternational Conference on SystemsMan andCybernetics (SMCrsquo08) pp 1401ndash1406 Singapore October 2008
[15] T C Park U S Kim L-H Kim B W Jo and Y K Yeo ldquoHeatconsumption forecasting using partial least squares artificialneural network and support vector regression techniques in dis-trict heating systemsrdquo Korean Journal of Chemical Engineeringvol 27 no 4 pp 1063ndash1071 2010
[16] T Catalina V Iordache and B Caracaleanu ldquoMultiple regres-sion model for fast prediction of the heating energy demandrdquoEnergy and Buildings vol 57 pp 302ndash312 2013
[17] S Provatas N Lavesson and C Johansson ldquoAn onlinemachinelearning algorithm for heat load forecasting in district heatingsystemsrdquo in Proceedings of the 14th International Symposium onDistrict Heating and Cooling Stockholm Sweden September2014
[18] S Idowu S Saguna C Ahlund and O Schelen ldquoForecastingheat load for smart district heating systems a machine learningapproachrdquo in Proceedings of the IEEE International Conferenceon Smart Grid Communications (SmartGridComm rsquo14) pp 554ndash559 IEEE Venice Italy November 2014
[19] L Wu G Kaiser D Solomon R Winter A Boulanger andR Anderson Improving Efficiency and Reliability of Build-ing Systems Using Machine Learning and Automated OnlineEvaluation Columbia University Academic Commons 2012httphdlhandlenet10022ACP13213
[20] M Vlachopoulou G Chin J C Fuller S Lu and K KalsildquoModel for aggregated water heater load using dynamicBayesian networksrdquo Tech Rep Pacific Northwest NationalLaboratory (PNNL) Richland Wash USA 2012
[21] S Shamshirband D Petkovic R Enayatifar et al ldquoHeat loadprediction in district heating systemswith adaptive neuro-fuzzymethodrdquoRenewable and Sustainable Energy Reviews vol 48 pp760ndash767 2015
[22] M Protic S Shamshirband M H Anisi et al ldquoAppraisal ofsoft computing methods for short term consumersrsquo heat loadprediction in district heating systemsrdquo Energy vol 82 pp 697ndash704 2015
[23] S Shamshirband D Petkovic A Amini et al ldquoSupport vec-tor regression methodology for wind turbine reaction torqueprediction with power-split hydrostatic continuous variabletransmissionrdquo Energy vol 67 pp 623ndash630 2014
[24] H Quan D Srinivasan and A Khosravi ldquoShort-term load andwind power forecasting using neural network-based predictionintervalsrdquo IEEE Transactions on Neural Networks and LearningSystems vol 25 no 2 pp 303ndash315 2014
Applied Computational Intelligence and Soft Computing 11
[25] K Bhaskar and S N Singh ldquoAWNN-assisted wind power fore-casting using feed-forward neural networkrdquo IEEE Transactionson Sustainable Energy vol 3 no 2 pp 306ndash315 2012
[26] Heat Pump Centre Newsletters International Energy Agency(IEA) Heat Pumping Technologies Vol 30 Nr 22012
[27] T G Dietterich ldquoAn experimental comparison of three meth-ods for constructing ensembles of decision trees baggingboosting and randomizationrdquo Machine Learning vol 40 no2 pp 139ndash157 2000
[28] SwedishDistrictHeatingAssociation 2016 httpwwwsvensk-fjarrvarmese
[29] K P Murphy Machine Learning A Probabilistic PerspectiveAdaptive Computation and Machine Learning Series MITPress 2012
[30] L Wang Support Vector Machines Theory and ApplicationsSpringer New York NY USA 2005
[31] D Basak S Pal and D C Patranabis ldquoSupport vector regres-sionsrdquoNeural Information ProcessingmdashLetters and Reviews vol11 no 10 pp 203ndash224 2007
[32] R E Edwards J New and L E Parker ldquoPredicting futurehourly residential electrical consumption a machine learningcase studyrdquo Energy and Buildings vol 49 pp 591ndash603 2012
[33] L Breiman ldquoRandom forestsrdquoMachine Learning vol 45 no 1pp 5ndash32 2001
[34] L Breiman J H Friedman R A Olshen and C J StoneClassification and Regression Trees Wadsworth and BrooksMonterey Calif USA 1984
[35] P Vezza C Comoglio M Rosso and A Viglione ldquoLowflows regionalization in North-Western Italyrdquo Water ResourcesManagement vol 24 no 14 pp 4049ndash4074 2010
[36] L Breiman and A Cutler ldquoRandom forestsrdquo January 2016httpwwwstatberkeleyeduusersbreimanRandomForests
[37] MATLAB Version 830 MathWorks Natick Mass USA 2014