
    Household Electricity Demand Forecasting: Benchmarking State-of-the-Art Methods

    Andreas Veit, Christoph Goebel, Rohit Tidke, Christoph Doblander and Hans-Arno Jacobsen

    Department of Computer Science, Technische Universität München

    ABSTRACT

    The increasing use of renewable energy sources with variable output, such as solar photovoltaic and wind power generation, calls for Smart Grids that effectively manage flexible loads and energy storage. The ability to forecast consumption at different locations in the distributed power system will be a key capability of Smart Grids. The goal of this paper is to benchmark state-of-the-art methods for forecasting electricity demand on the household level across different granularities and time scales in an explorative way, thereby revealing potential shortcomings and finding promising directions for future research in this area. We compare a number of forecasting methods including ARIMA, neural networks, and exponential smoothing using several strategies for training data selection, in particular sliding window based and day type strategies. We consider forecasting horizons ranging from 15 minutes to 24 hours. Our evaluation is based on two data sets containing the power usage of individual appliances at second-level time granularity collected over the course of several months. Our results indicate that, without further refinement, the considered advanced state-of-the-art forecasting methods rarely beat corresponding persistence forecasts. Furthermore, the achievable accuracy is surprisingly low, with the Mean Absolute Percentage Error (MAPE) ranging from 5% to 150%. Therefore, we also contribute a detailed discussion of promising future research based on the identified trends and experiences from our experiments.

    1. INTRODUCTION

    According to the US Department of Energy, creating a sustainable and energy-efficient society is one of the greatest challenges of this century, as traditional non-renewable sources of energy are depleting and adverse effects of carbon emissions are being felt [18]. To help achieve this goal, the authors of [6] outline a computer science research agenda. To achieve a reliable operation of the electricity distribution system, supply and load have to be balanced within a tight tolerance in real time. Today, with increasing decentralized generation of electricity, there is a need to control smaller zones of the electric grid. The ability to forecast local power consumption in advance is a vital factor, because load forecasts can greatly enhance the micro-balancing capabilities of smart grids if they are utilized for control operations and decisions like dispatch, unit commitment, fuel allocation and off-line network analysis [2]. The better the local forecasts, the more efficient the power networks can be. Whereas forecasts on the transmission level have been performed for a while [13], the introduction of smart meters and the installation of smart appliances allow forecasts on the household and even the device level [22]. Recently, there have been many studies on the disaggregation of the electricity consumption of households into individual appliances [3]. However, the short-term forecasting of individual household consumption has not been evaluated to a satisfactory extent.

    Considering the importance of short-term load forecasting in demand and supply balancing, we conduct experiments to compare which forecasting methods work best. To analyze and benchmark possible forecasting methods and strategies, we use two electricity consumption data sets: one collected by researchers at the Technische Universität München and one from the Massachusetts Institute of Technology. Since these data sets do not provide the same features in each household, we transform the data and use different granularities of consumption data, i.e., sampling intervals from 15 up to 60 minutes, and time horizons for the forecasts from very short-term forecasts of 15 minutes up to forecasts of 24 hours. In our experiment, we compare Autoregressive Integrated Moving Average (ARIMA), exponential smoothing and neural networks. Further, we apply three different strategies for training data selection: a sliding window approach, a day type approach and a hierarchical day type approach. To compare the results, we measure the accuracy of the forecasts with the Mean Absolute Percentage Error (MAPE).

    Overall, we observed that the considered advanced forecasting methods only rarely beat the accuracy of persistence forecasts. In addition, most of the methods benefit from larger training sets, from splitting the data into sets of particular day types, and from predicting based on disaggregated data from individual appliances. Furthermore, the achievable accuracy is surprisingly low, with the average MAPE ranging between 5% and 150%. Therefore, we also provide an exploration of promising directions for future research. These experimental results and the exploration of future research directions are the primary contributions of this paper.

    This paper is organized as follows: In Section 2, we review related literature and identify the research gap. In Section 3, we describe the electricity consumption data we use in our experiments and explain all performed transformations. Subsequently, in Section 4, we describe our experimental setup and the forecasting methods and strategies used in our experiments. Section 5 presents the results of our experiments and Section 6 discusses our findings and explores directions for future research.

    2. RELATED WORK

    Demand side management and demand response receive increasing attention from research and industry. The published research includes a variety of directions from direct load control or targeted customer interaction to indirect incentive-based control (see [14] for an overview).

    The approaches for demand side management focus on different levels of the power system. On the grid operator level, studies focus, for example, on the minimization of power flow fluctuations [17] or the integration of renewable energy [20]. The distribution grid operator uses consumption forecasts to balance grids with a high penetration of decentralized generation of renewable energy [11]. Others look at the level of groups of consumers, e.g., with a focus on virtual price signals [19]. Most work, however, focuses on the end consumers, in particular on the use of variable price signals for individual consumers. These dynamic tariffs penalize consumption during certain times with increased electricity prices, so that consumers can respond by adjusting their consumption [7]. However, as [16] points out, variable price signals for end consumers can cause instabilities through load synchronization. To avoid uncontrolled behavior, accurate consumption forecasts can help utilities balance demand and supply and select the consumers that are most suitable for a demand response program.

    Initial studies have analyzed the potential of consumption forecasts for individual households and presented first prototypes (e.g., [21, 22]). However, most work on household consumption focuses on the disaggregation of electricity consumption. Examples include [8, 10, 12]. In this paper, we benchmark state-of-the-art forecasting models for household consumption and also evaluate how the disaggregation of consumption data influences the forecast accuracy.

    3. ELECTRICITY CONSUMPTION DATA

    The data collected by smart meters or smart home infrastructures include differing sets of attributes. The most common metrics are wattage readings or accumulated energy at discrete time steps. While some consumption data sets are univariate time series consisting only of the overall electricity consumption reading from a household, other data sets consist of multivariate data including, for example, readings from sensors distributed over a household. In our experiment we use data sets from the second category.

    3.1 Data Sets

    We use two different data sets for our experiments. To perform the same experiments using both data sets, a transformation of the data is necessary. These transformations are explained in Section 3.2.

    3.1.1 The TUM Home Experiment Data Set

    In the TUM Home Experiment, a single household in Germany, in the state of Bavaria, was equipped with a distributed network of sensors measuring power in Watt, on/off status and energy in kWh each second from several appliances. The measured appliances include lights, the fridge, washing machine, office and entertainment devices. The data used for this experiment was collected from February 4th 2013 to October 31st 2013. Figure 1 shows in the lower graph the demand profile from February 21st to March 5th 2013. From the figure it can be seen that the demand is flat for long time intervals, with occasional peaks, especially in the evenings. In particular, 70% of all power readings lie between 25 and 30 W. Figure 2 shows the empirical cumulative distribution function of the power readings and illustrates the steep increase of power readings at 25 to 30 W.

    Figure 1: Demand profiles from the data sets. (Upper panel: REDD, power axis 0 to 2000; lower panel: TUM, power axis 0 to 500; horizontal axis: hours, 0 to 312.)
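    The share of readings in a power band and the empirical cumulative distribution function mentioned above can be computed along the following lines. This is a minimal sketch in plain Python; the function names `ecdf` and `share_between` and the sample readings are made up for illustration and are not taken from the TUM data.

```python
from bisect import bisect_right


def ecdf(readings):
    """Return a function F where F(x) is the share of readings <= x."""
    ordered = sorted(readings)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted readings are <= x
        return bisect_right(ordered, x) / n

    return F


def share_between(readings, low, high):
    """Share of readings r with low <= r <= high."""
    return sum(low <= r <= high for r in readings) / len(readings)


# Hypothetical power readings in Watt: mostly base load, two peaks.
readings = [26, 27, 27, 28, 29, 26, 27, 150, 400, 28]
F = ecdf(readings)
band_share = share_between(readings, 25, 30)
print(F(30), band_share)
```

    A steep step in `F` around the base load corresponds to the behaviour described for the TUM readings, while more spread-out readings give a flatter curve.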

    3.1.2 The Reference Energy Disaggregation Data Set

    The Reference Energy Disaggregation Data Set (REDD) is a public data set for energy disaggregation research [12]. The REDD data set is provided by the Massachusetts Institute of Technology and contains power consumption measurements of 6 US households recorded for 18 days between April 2011 and June 2011. The data set contains high frequency and low frequency readings and includes readings from the main electrical circuits as well as readings from individual appliances such as lights, microwave and refrigerator. For the experiments presented in this paper, we use the low frequency readings of the individual appliances, which are sampled at intervals of 3 seconds. Figure 1 shows in the upper graph the demand profile for house 1 from April 19th to May 1st 2011. From the figure it can be seen that, in contrast to the TUM data set, the REDD aggregate demand has more frequent and higher fluctuations. As a result, the cumulative distribution of the power readings shown in Figure 2 has a flatter slope.

    3.2 Data Transformation

    The two data sets come in different formats. To obtain uniformity and to achieve comparable results, we transform them in three steps:

    Step 1: The data sets are transformed into a common format. Since the readings are recorded at different frequencies, we convert the time indicators into UNIX timestamps and resample both data sets to a granularity of one minute.

    Step 2: Statistical time series forecasting relies on the assumption that time series are equally spaced. In [5], the author explains that in case of unequal spacing, interpolation methods should be used to generate equally spaced intervals. Usually, linear interpolations are performed for this transformation. In the two data sets, several breaks in the time series exist, due to meters or sensors not providing measurements. These breaks cannot all be interpolated, because the interpolation of long intervals can have a significant influence on the statistical forecasting model. As the main seasonality in electricity consumption data is one day, longer breaks of several hours can no longer be interpolated. Determining the optimal length of interpolation intervals is itself an optimization problem. In this paper, we interpolate intervals up to a length of two hours using linear interpolation.

    Figure 2: Cumulative distribution of power. (Panels: REDD and TUM; horizontal axis: power, 0 to 500; vertical axis: empirical cumulative distribution function, 0.00 to 1.00.)

    Step 3: The different sampling strategies require different data formats. First, the sliding window strategy, where we select training data windows of specific lengths to predict future load, requires a continuous time series. Therefore, we select the longest period without breaks longer than 2 hours. Second, in day type strategies, the forecasting models are trained using data from similar days of the week. Thus, we create a cross-sectional data set joining each day of the week, e.g., Mondays of consecutive weeks, into one data set.
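    The gap handling in Steps 2 and 3 can be sketched as follows. This is an illustrative sketch, not the code used for the experiments: it assumes the series already sits on a one-minute grid with `None` marking missing readings, interprets the two-hour limit as at most 120 consecutive missing samples, and uses the hypothetical helper names `interpolate_short_gaps` and `longest_complete_run`.

```python
def interpolate_short_gaps(values, max_gap=120):
    """Linearly fill runs of None that are at most max_gap samples long.

    values: readings on an equally spaced grid, None marks a missing reading.
    Gaps touching the start or end of the series lack a neighbour on one
    side and are left as None.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # first index after the gap
        gap = end - start
        if start > 0 and end < n and gap <= max_gap:
            left, right = filled[start - 1], filled[end]
            step = (right - left) / (gap + 1)
            for k in range(gap):
                filled[start + k] = left + step * (k + 1)
    return filled


def longest_complete_run(values):
    """Return (start, end) of the longest stretch without None, end exclusive."""
    best = (0, 0)
    start = None
    # A trailing None sentinel closes a run that reaches the end of the series.
    for idx, v in enumerate(list(values) + [None]):
        if v is not None and start is None:
            start = idx
        elif v is None and start is not None:
            if idx - start > best[1] - best[0]:
                best = (start, idx)
            start = None
    return best


# Toy series with a short gap (2 samples) and a long gap (4 samples).
series = [10, None, None, 40, None, None, None, None, 90, 100]
filled = interpolate_short_gaps(series, max_gap=2)
run = longest_complete_run(filled)
print(filled, run)
```

    With `max_gap=2` the short gap is filled with 20.0 and 30.0, the long gap stays open, and the longest complete stretch covers indices 0 to 3.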

    3.2.1 Transformation of TUM Data Set

    The TUM data set as introduced above is a multivariate data set containing measurements from several appliances in the experiment house. Since the readings from all appliances have been stored in one big data set, we extract the power readings and split the data set into individual channels for the different appliances and subsequently interpolate gaps of up to two hours. Then, for the hierarchical strategy we create a separate data set for each appliance and for each day of the week. At several times, data is missing for some appliances but available for others. Such incomplete data could disrupt forecasts, because the forecasting model would assume the appliance to be switched off although it is running. To obtain a consistent data set, we only consider intervals where data is available from all appliances. Then, we aggregate the different appliance channels for the day type and the sliding window strategies. To get a continuous time series for the sliding window strategy, we choose the longest consistent distinct data set, which is the data of the period from Feb 20th 2013 09:13:00 GMT to Apr 5th 2013 05:44:00 GMT.
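    Restricting the data to intervals covered by all appliances and then aggregating the channels can be sketched as below. The data layout (one dictionary per appliance, mapping a timestamp to a power reading) and the names `consistent_timestamps` and `aggregate` are assumptions made for this illustration.

```python
def consistent_timestamps(channels):
    """Timestamps for which every appliance channel has a reading."""
    common = None
    for readings in channels.values():
        stamps = set(readings)
        common = stamps if common is None else common & stamps
    return sorted(common or [])


def aggregate(channels):
    """Sum all appliance channels over the timestamps they share."""
    return {
        t: sum(readings[t] for readings in channels.values())
        for t in consistent_timestamps(channels)
    }


# Hypothetical channels: the lights sensor has no reading at timestamp 60.
channels = {
    "fridge": {0: 80, 60: 82, 120: 0},
    "lights": {0: 20, 120: 25},
}
household = aggregate(channels)
print(household)
```

    Timestamp 60 is dropped instead of being summed as if the lights were off, which is the inconsistency the paragraph above warns about.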

    3.2.2 Transformation of REDD Data Set

    The REDD data set also contains measurements from several appliances. They are already divided into separate channels for the individual appliances. Although the data set contains readings from six different houses, we only use the data from house 1, as it contains a long enough period of measurements. For the other houses we can perform neither the day type nor the hierarchical forecasting strategy, as they only contain data from two to three days for each day of the week. For the data from house 1 we then interpolate gaps of up to two hours and aggregate the different appliance channels for the day type and the sliding window strategies. To get a continuous time series for the sliding window strategy, we choose the longest consistent distinct data set, i.e., the data from Apr 18th 2011 22:00:00 GMT to May 2nd 2011 21:59:00 GMT.

    4. EXPERIMENTS

    In this section, we introduce the different forecasting methods and strategies we use in our experiment.

    4.1 Forecasting Methods

    For reproducibility, we use forecasting methods that are all provided by the R forecast package. As a benchmark, we use the persistence method, where forecasts equal the last observation. We will refer to this method as PERSIST. For short forecasting horizons, persistence forecasts are known to be hard to beat.

    Further, we use the Autoregressive Integrated Moving Average (ARIMA) model, denoted as ARIMA(p, d, q)(P, D, Q), where the non-seasonal components are defined in the first parentheses and the seasonal components in the second. The parameters (p, P) denote the number of lagged values of the series (the autoregressive order), (d, D) denote the degree of differencing necessary to make the time series stationary, and (q, Q) denote the order of the moving average part. To find the optimal parameters, we use the auto.arima() method, which selects the model that minimizes the Akaike information criterion with a correction for finite sample sizes (AICc) [9].

    Third, we use an exponential smoothing state space model (BATS) with Box-Cox transformation, ARMA errors as well as trend and seasonal components. The model is denoted as BATS(ω, φ, p, q, m1, m2, ..., mt), where ω is the Box-Cox parameter, φ the damping parameter, (p, q) the ARMA parameters, and (m1, m2, ..., mt) the seasonal periods. We also apply the TBATS model, which uses trigonometric functions for the seasonal decomposition. The BATS and TBATS approaches are explained in [4].

    Lastly, we also use feed-forward neural networks with a single hidden layer and lagged inputs for forecasting univariate time series. The model is denoted as NNAR(p, P, k)m, where p is the number of non-seasonal lags, P is the number of seasonal lags, k is the number of nodes in the hidden layer and m is the seasonal period. The model is analogous to an ARIMA(p, 0, 0)(P, 0, 0) model, but with nonlinear functions. The network is trained for one-step forecasts, and longer forecasting horizons are computed recursively.
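    Two of the ideas above are simple enough to sketch directly: the persistence benchmark and the recursive computation of multi-step forecasts from a one-step model. The sketch below is plain Python and does not use the R forecast package; `mean_of_last_two` is a deliberately trivial stand-in for a trained one-step model and is not the NNAR model.

```python
def persistence_forecast(history, horizon):
    """PERSIST: every forecast equals the last observation."""
    return [history[-1]] * horizon


def recursive_forecast(one_step, history, horizon):
    """Build a multi-step forecast by feeding each forecast back as input."""
    window = list(history)
    forecasts = []
    for _ in range(horizon):
        nxt = one_step(window)
        forecasts.append(nxt)
        window.append(nxt)  # the forecast becomes the newest "observation"
    return forecasts


def mean_of_last_two(window):
    """Toy one-step model used only to demonstrate the recursion."""
    return (window[-1] + window[-2]) / 2


history = [30, 30, 120, 60]  # hypothetical load readings in Watt
persist = persistence_forecast(history, 3)
recursive = recursive_forecast(mean_of_last_two, history, 3)
print(persist)    # [60, 60, 60]
print(recursive)  # [90.0, 75.0, 82.5]
```

    Because each recursive step consumes earlier forecasts, errors can compound as the horizon grows, whereas the persistence forecast stays fixed at the last observation.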

    4.2 Sampling Strategies

    We use three different strategies to sample the training and test data. First, we use the sliding window strategy, where the data set is divided into windows of smaller parts. Each window of training data has a corresponding test window for cross validation to measure the accuracy of the prediction. The approach is illustrated in Figure 3a. The main reason for using the sliding window approach is that it allows comparing the results from data sets of different lengths. After a prediction model has been trained and tested, the window moves forward on the data set.

    Second, we use a day type strategy. While the sliding window approach considers the data to be a continuous time series, the day type approach uses cross-sectional data. The strategy is to join each day of the week of consecutive weeks into separate data sets, for example the Mondays of consecutive weeks. The approach is illustrated in Figure 3b. The training and test data are then sampled from the individual data sets.

    Third, we use a hierarchical day type strategy. A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure. Hierarchical forecasting methods allow the forecasts at each level to be summed up in order to provide a forecast for the level above. Approaches to hierarchical time series include top-down, bottom-up and middle-out approaches. In the top-down approach, the aggregated series is forecasted and then disaggregated based on historical proportions. The bottom-up approach first forecasts all the individual channels on the bottom level and then aggregates the forecasts to create the aggregated forecast. The middle-out approach combines both approaches. We apply a bottom-up approach, i.e., we use the individual appliance channels to create forecasts for the individual appliances. Similar to the day type approach, we join each day of the week of consecutive weeks into separate data sets for each appliance. The approach is illustrated in Figure 3c. Finally, we aggregate the individual forecasts to a forecast of the entire household and test it against the test window.

    Figure 3: Different sampling strategies. (a) Sliding Window Strategy. (b) Day Type Strategy. (c) Hierarchical Day Type Strategy.
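    The three sampling strategies can be sketched as small helper functions. These are illustrative assumptions rather than the experimental code: in particular, the text does not state how far the window moves forward, so the `step` parameter (defaulting to the horizon) is a guess, and all function names and toy values are hypothetical.

```python
def sliding_windows(series, train_len, horizon, step=None):
    """Yield (train, test) pairs while moving the window over the series."""
    step = step or horizon  # assumption: advance by one test window
    start = 0
    while start + train_len + horizon <= len(series):
        train = series[start:start + train_len]
        test = series[start + train_len:start + train_len + horizon]
        yield train, test
        start += step


def group_by_day_type(days):
    """Join the profiles of equal weekdays of consecutive weeks.

    days: chronological list of (weekday_name, daily_profile) tuples.
    """
    groups = {}
    for weekday, profile in days:
        groups.setdefault(weekday, []).append(profile)
    return groups


def bottom_up(appliance_forecasts):
    """Sum per-appliance forecasts element-wise into a household forecast."""
    return [sum(values) for values in zip(*appliance_forecasts.values())]


pairs = list(sliding_windows(list(range(10)), train_len=4, horizon=2))
groups = group_by_day_type([("Mon", [1, 2]), ("Tue", [3, 4]), ("Mon", [5, 6])])
total = bottom_up({"fridge": [80, 0], "lights": [20, 25]})
print(len(pairs), groups["Mon"], total)
```

    In the hierarchical day type strategy, `group_by_day_type` would be applied per appliance, each group would be forecast separately, and `bottom_up` would combine the per-appliance forecasts before comparing them with the test window.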

    4.3 Granularities

    While both data sets contain readings at a granularity of 1 to 3 seconds, other data sets and meters offer measurements of different granularities. Therefore, we want to understand the effect of different measurement granularities on the performance of the different forecasting methods. In particular, we transformed the data into granularities of 15, 30 and 60 minute intervals. The power reading of each interval is defined as the mean of the power readings in the interval.
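    The change of granularity amounts to averaging blocks of consecutive readings. A minimal sketch, assuming the input is already a gap-free list of one-minute readings; the name `downsample_mean` and the choice to drop a trailing incomplete block are assumptions of this example.

```python
def downsample_mean(minute_values, interval):
    """Average consecutive blocks of `interval` one-minute readings.

    A trailing block with fewer than `interval` readings is dropped.
    """
    full = len(minute_values) - len(minute_values) % interval
    return [
        sum(minute_values[i:i + interval]) / interval
        for i in range(0, full, interval)
    ]


# One hypothetical hour: 15 min at 30 W, 15 min at 90 W, 30 min at 60 W.
hour = [30] * 15 + [90] * 15 + [60] * 30
print(downsample_mean(hour, 15))  # [30.0, 90.0, 60.0, 60.0]
print(downsample_mean(hour, 30))  # [60.0, 60.0]
print(downsample_mean(hour, 60))  # [60.0]
```

    The example shows how coarser granularities smooth the series: the jump from 30 W to 90 W is visible at 15 minutes but disappears at 30 and 60 minutes.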

    4.4 Training Window Sizes

    Another parameter for demand forecasting is the window length of the training data. The different data sets and forecasting methods limit the length of training sets. For example, for the day type strategy only training sets of 3 days can be used, since the REDD data set only contains four days for each day of the week. Further, the ARIMA method of the R forecast package cannot handle models with seasonal periods of more than 350 data points. Considering these restrictions, we use a training window size of 3 days for the day type and the hierarchical day type approach and vary the training window length for the sliding window approach between 3, 5 and 7 days.
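    To make the seasonal-period restriction concrete, the following sketch counts the data points in one seasonal period for each granularity. The daily period follows the main seasonality named in Section 3.2; the weekly period is included only as an assumed comparison case and is not something the experiments are stated to use.

```python
MAX_SEASONAL_POINTS = 350  # limit stated in the text for the ARIMA method

MINUTES_PER_DAY = 24 * 60
MINUTES_PER_WEEK = 7 * MINUTES_PER_DAY


def points_per_period(period_minutes, granularity_minutes):
    """Number of observations that make up one seasonal period."""
    return period_minutes // granularity_minutes


def fits_limit(period_minutes, granularity_minutes):
    """True if the seasonal period stays within the stated limit."""
    return points_per_period(period_minutes, granularity_minutes) <= MAX_SEASONAL_POINTS


for granularity in (15, 30, 60):
    daily = points_per_period(MINUTES_PER_DAY, granularity)
    weekly = points_per_period(MINUTES_PER_WEEK, granularity)
    print(granularity, daily, weekly, fits_limit(MINUTES_PER_WEEK, granularity))
```

    Under these assumptions a daily period always fits (96, 48 and 24 points), while a weekly period at 15-minute granularity would need 672 points and exceed the limit.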

    4.5 Forecasting Horizons

    The forecasting horizon is the number of point forecasts the algorithm predicts into the future. In this experiment, the horizon is given in minutes, i.e., how far into the future the load is predicted. The focus of this work lies on short-term forecasts. Hence, the horizons range between 15 minutes and 24 hours. Note that the granularity of the forecast cannot be finer than the granularity of the training data. For instance, with a training data set of 15-minute intervals, the earliest prediction will be 15 minutes into the future and all predictions will be at intervals of 15 minutes.
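    The relation between a horizon in minutes and the number of point forecasts can be sketched as follows. The horizon values are the ones shown on the axis of Figure 5; the function names are hypothetical.

```python
HORIZONS_MIN = [15, 30, 60, 180, 720, 1440]  # horizons on the Figure 5 axis


def horizon_steps(horizon_min, granularity_min):
    """Number of point forecasts needed to reach the horizon."""
    if horizon_min < granularity_min or horizon_min % granularity_min:
        raise ValueError("horizon must be a positive multiple of the granularity")
    return horizon_min // granularity_min


def feasible_horizons(granularity_min):
    """Horizons that can be expressed at the given data granularity."""
    return [
        h for h in HORIZONS_MIN
        if h >= granularity_min and h % granularity_min == 0
    ]


for granularity in (15, 30, 60):
    steps = [(h, horizon_steps(h, granularity)) for h in feasible_horizons(granularity)]
    print(granularity, steps)
```

    With 60-minute data only four of the six horizons are feasible, and a 24-hour forecast needs 96 steps at 15-minute granularity but only 24 steps at 60-minute granularity.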

    4.6 Model Quality Measure

    We require a statistical quality measure that can compare the different forecasting methods and strategies. The present experiment uses the Mean Absolute Percentage Error (MAPE) as the accuracy measure. The reason for this choice is that MAPE can be used to compare the performance on different data sets, because it is a relative measure. MAPE is defined as the mean of the absolute difference between the actual value and the forecast value, relative to the actual value, expressed in percent:

    \[ \mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{x_t - \hat{x}_t}{x_t} \right| \]

    where $x_t$ is the actual value and $\hat{x}_t$ is the forecast value. For example, with an actual load of 100 Watt and a corresponding forecast of 150 Watt, the MAPE would be 50%.
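    The definition translates directly into code. A minimal sketch in plain Python; the factor of 100 expresses the result in percent, and the function name `mape` is chosen for this example.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error in percent.

    Raises ZeroDivisionError if an actual value is zero, because the
    relative error is undefined in that case.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


print(mape([100], [150]))  # 50.0
```

    Since the error is divided by the actual value, the same absolute miss weighs far more during low base load than during a peak, which is worth keeping in mind when reading MAPE values for households with long flat periods.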

    5. EXPERIMENTAL RESULTS

    In this section, we present the influence of the defined parameters on the accuracy of the forecasting methods and strategies. For the evaluation, we performed a total of 16038 different forecasts. Overall, we observe that the considered advanced state-of-the-art forecasting methods rarely beat corresponding persistence forecasts. This is especially true for the TUM data set, which has long and frequent periods with constant consumption. Further, our results differ largely between the two data sets, i.e., the accuracy on the TUM data set is almost always higher than on the REDD data set. This could be due to the more stable consumption pattern in the TUM data set, which is easier to predict.

    In addition, Figure 4 shows boxplots of the distribution of MAPE and linear trend lines of MAPE for different window sizes in the sliding window strategy. The results are split by data set as well as by the applied forecasting method. The results indicate that increasing window sizes improve the results of the ARIMA, NNET and TBATS methods on the REDD data set, but not on the TUM data set. A possible explanation is that in the consumption profile of the TUM data set the consumption of every day is very similar and shows a constant pattern. Thus, an additional day of training data does not provide the forecasting models with new important information.

    Further, Figure 5 shows heatmaps of MAPE from the sliding window and day type strategies for different granularities and forecasting horizons. The results indicate that for almost every method, longer forecasting horizons lead to lower accuracy. However, the exponential smoothing methods BATS and TBATS seem more robust against increasing horizons than the other methods. Further, especially on the REDD data set, coarser granularities (longer sampling intervals) lead to better accuracy. In particular, the exponential smoothing methods BATS and TBATS and the neural network outperform the persistence method for granularities of 30 and 60 minutes. Furthermore, Figure 5 shows that for almost every method a division of the data into day type windows improves the forecast accuracy. Lastly, Figure 6 compares all three strategies for the ARIMA method, indicating that the hierarchical strategy can greatly improve accuracy on the TUM data set. This is surprising, as generally the prediction of aggregated loads tends to result in higher precision.

    Figure 4: MAPE for varying window sizes. (Sliding Window Strategy; columns: ARIMA, BATS, NNET, PERSIST, TBATS; rows: REDD and TUM; horizontal axis: window size of 3, 5 and 7 days; vertical axis: MAPE, 0 to 150.)

    6. DISCUSSION

    We have evaluated a wide range of state-of-the-art methods and strategies for short-term forecasting of household electricity consumption based on actual consumption data. Although our current data is limited, we were able to gain useful insights. Overall, the considered advanced forecasting methods only rarely beat the accuracy of persistence forecasts. Further, most of the methods benefit from larger training sets, from splitting the data into sets of particular day types, and from predicting based on disaggregated data from individual appliances. Moreover, the achievable accuracy is surprisingly low, with the average MAPE ranging from 5% to 150%. Our work thus motivates more research investigating how accuracy can be increased.

    Figure 5: MAPE for varying horizons and granularities. (Heatmaps for the Day Type Strategy and the Sliding Window Strategy; columns: ARIMA, BATS, NNET, PERSIST, TBATS, each at sampling granularities of 15, 30 and 60 minutes; rows: REDD and TUM, with forecasting horizons of 15, 30, 60, 180, 720 and 1440 minutes; color scale: MAPE, 0 to 200.)

    Figure 6: Mean MAPE for ARIMA with different strategies. (Panels: day type, hierarchical day type, sliding window; bars: REDD and TUM; axis: MAPE, 0 to 200; bar labels as listed: 76.3, 43.3, 78, 28.7, 93.9, 42.7.)

    First, introducing further features, e.g., from occupancy, temperature or brightness sensors, could provide additional information that allows prediction algorithms to react faster when consumption changes. For example, when a device is switched on or off, it takes some time until the average wattage of the time interval accounts for the change. The TUM data set contains additional sensors which are not yet considered in our experiments. However, it has to be taken into account that many experiments do not measure the same features in each household and are thus not directly comparable.

    Second, we also expect error reduction when the consumption patterns of the appliances themselves are considered. This is supported by the results for the hierarchical strategy (cf. Figure 6). Thermal devices like fridges, freezers, boilers and heat pumps have a very predictable consumption pattern. Other devices like washing machines, dishwashers and laundry dryers have a known consumption pattern once switched on.

    Third, when looking at individual appliances, another direction worth investigating would be event detection. Instead of predicting solely based on continuous wattage readings, it could be beneficial to detect concrete events (e.g., on/off) and derive a future consumption pattern from them. A sequence of events could be used to train a Markov model [15] that predicts future events, which could in turn be used for the consumption forecast. We think that this could reduce the prediction error, especially for short-term forecasts.

    Fourth, we only considered consistent data sets. However, in real-world settings, load forecasts need to be performed even in situations with missing data. Future work should investigate how to handle temporary sensor outages, which could disrupt the prediction algorithms.

    Last, our results differ largely between the two data sets. It is unclear how common the characteristics of these data sets are. However, the necessary data for carrying out more representative studies is currently missing, although more data sets are currently being published [1]. In summary, this study should be considered as an exploration of promising directions for future research rather than yielding final results on the viability of local electricity demand forecasting.
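The event-based direction mentioned above can be sketched as follows. This is only an illustration under stated assumptions: on/off states are derived with a fixed wattage threshold, and a first-order Markov chain is estimated by counting state transitions. The threshold, the function names and the sample readings are hypothetical and are not part of the evaluated methods.

```python
from collections import Counter, defaultdict


def to_states(wattage, threshold):
    """Map each reading to 'on' or 'off' using a fixed wattage threshold."""
    return ["on" if w >= threshold else "off" for w in wattage]


def detect_events(states):
    """Return (index, new_state) for every interval where the state changes."""
    return [
        (i, current)
        for i, (previous, current) in enumerate(zip(states, states[1:]), start=1)
        if previous != current
    ]


def fit_transitions(states):
    """Estimate first-order Markov transition probabilities by counting."""
    counts = defaultdict(Counter)
    for previous, current in zip(states, states[1:]):
        counts[previous][current] += 1
    return {
        state: {nxt: c / sum(successors.values()) for nxt, c in successors.items()}
        for state, successors in counts.items()
    }


def most_likely_next(transitions, state):
    """Return the successor state with the highest estimated probability."""
    return max(transitions[state], key=transitions[state].get)


# Hypothetical per-interval wattage of a single appliance (illustration only).
wattage = [2, 3, 150, 160, 155, 3, 2, 2, 148, 152, 2, 3]

states = to_states(wattage, threshold=50)
events = detect_events(states)
transitions = fit_transitions(states)

print("events:", events)
print("P(off | on):", transitions["on"]["off"])
print("next after 'off':", most_likely_next(transitions, "off"))
```

A forecast could then map the predicted state sequence back to a typical wattage per state. How well such a model captures real appliance behavior would have to be evaluated on actual data.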

    7. CONCLUSIONS

    We have evaluated a wide range of state-of-the-art methods and strategies for short-term forecasting of household electricity consumption, a key capability in many smart grid applications, based on actual data. Although our current database is limited, we were able to gain useful insights. Overall, we showed that without further refinement of advanced methods such as ARIMA and neural networks, the persistence forecasts are hard to beat in most situations. Since advanced forecasting methods provide little value if they are not embedded into a framework that adapts their use to individual household attributes, we also provided an exploration of promising directions for future research. Future work will focus on the design of such frameworks and on their evaluation based on representative data.

    8. REFERENCES

    [1] S. Barker, A. Mishra, D. Irwin, E. Cecchet, P. Shenoy, and J. Albrecht. Smart*: An open data set and tools for enabling research in sustainable homes. In Proceedings of the 2012 Workshop on Data Mining Applications in Sustainability (SustKDD 2012), 2012.

    [2] D. Bunn and E. Farmer. Review of short-term forecasting methods in the electric power industry. Comparative models for electrical load forecasting, pages 13–30, 1985.

    [3] K. Carrie Armel, A. Gupta, G. Shrimali, and A. Albert. Is disaggregation the holy grail of energy efficiency? The case of electricity. Energy Policy, 2012.

    [4] A. M. De Livera, R. J. Hyndman, and R. D. Snyder. Forecasting time series with complex seasonal patterns using exponential smoothing. Journal of the American Statistical Association, 106(496):1513–1527, 2011.

    [5] A. Eckner. A framework for the analysis of unevenly-spaced time series data. Tech. report, 2012.

    [6] C. Goebel, H.-A. Jacobsen, V. Razo, C. Doblander, J. Rivera, J. Ilg, C. Flath, H. Schmeck, C. Weinhardt, D. Pathmaperuma, H.-J. Appelrath, M. Sonnenschein, S. Lehnhoff, O. Kramer, T. Staake, E. Fleisch, D. Neumann, J. Struker, K. Erek, R. Zarnekow, H. Ziekow, and J. Lassig. Energy Informatics. Business & Information Systems Engineering, pages 1–7, 2013.

    [7] K. Herter, P. McAuliffe, and A. Rosenfeld. An exploratory analysis of California residential customer response to critical peak pricing of electricity. Energy, 32(1):25–34, 2007.

    [8] S. Humeau, T. K. Wijaya, M. Vasirani, and K. Aberer. Electricity load forecasting for residential customers: Exploiting aggregation and correlation between households. In Sustainable Internet and ICT for Sustainability (SustainIT), 2013. IEEE, 2013.

    [9] R. J. Hyndman, Y. Khandakar, et al. Automatic time series for forecasting: The forecast package for R. 2007.

    [10] W. Kleiminger, C. Beckel, T. Staake, and S. Santini. Occupancy Detection from Electricity Consumption Data. In Proceedings of the 5th ACM Workshop on Embedded Systems For Energy-Efficient Buildings, pages 1–8. ACM, 2013.

    [11] K. Kok. Dynamic pricing as control mechanism. In Power and Energy Society General Meeting, 2011 IEEE, pages 1–8. IEEE, 2011.

    [12] J. Z. Kolter and M. J. Johnson. REDD: A public data set for energy disaggregation research. In Proceedings of the SustKDD Workshop on Data Mining Applications in Sustainability, pages 1–6, 2011.

    [13] K. Lee, Y. T. Cha, and J. Park. Short-term load forecasting using an artificial neural network. IEEE Transactions on Power Systems, 7(1):124–132, 1992.

    [14] J. Medina, N. Muller, and I. Roytelman. Demand response and distribution grid operations: Opportunities and challenges. IEEE Transactions on Smart Grid, 1(2):193–198, 2010.

    [15] V. Muthusamy, H. Liu, and H.-A. Jacobsen. Predictive Publish/Subscribe Matching. In Proceedings of the Fourth ACM International Conference on Distributed Event-Based Systems, DEBS '10, pages 14–25, New York, NY, USA, 2010. ACM.

    [16] S. D. Ramchurn, P. Vytelingum, A. Rogers, and N. Jennings. Agent-based control for decentralised demand side management in the smart grid. In The 10th International Conference on Autonomous Agents and Multiagent Systems-Volume 1, pages 5–12, 2011.

    [17] K. Tanaka, K. Uchida, K. Ogimi, T. Goya, A. Yona, T. Senjyu, T. Funabashi, and C. Kim. Optimal operation by controllable loads based on smart grid topology considering insolation forecasted error. IEEE Transactions on Smart Grid, 2(3):438–444, 2011.

    [18] US Department of Energy. Grid 2030: A National Vision For Electricity's Second 100 Years, 2003.

    [19] A. Veit, Y. Xu, R. Zheng, N. Chakraborty, and K. Sycara. Multiagent coordination for energy consumption scheduling in consumer cooperatives. In Proceedings of 27th AAAI Conference on Artificial Intelligence, pages 1362–1368, July 2013.

    [20] C. Wu, H. Mohsenian-Rad, and J. Huang. Wind power integration via aggregator-consumer coordination: A game theoretic approach. In Innovative Smart Grid Technologies (ISGT), 2012 IEEE PES, pages 1–6. IEEE, 2012.

    [21] H. Ziekow, C. Doblander, C. Goebel, and H.-A. Jacobsen. Forecasting household electricity demand with complex event processing: Insights from a prototypical solution. In Proceedings of the Industrial Track of the 13th ACM/IFIP/USENIX International Middleware Conference, page 2. ACM, 2013.

    [22] H. Ziekow, C. Goebel, J. Struker, and H.-A. Jacobsen. The potential of smart home sensors in forecasting household electricity demand. In 2013 IEEE International Conference on Smart Grid Communications (SmartGridComm), pages 229–234. IEEE, 2013.