Development of Artificial Neural Network Models to Predict Daily Gas Consumption


  • 7/27/2019 Development of Artificial Neural Network Models to Predict Daily Gas Consumption


    Development of Artificial Neural Network Models to Predict Daily Gas Consumption

    Ronald H. Brown and Iftekhar Matin

    Department of Electrical and Computer Engineering, Marquette University, Milwaukee, WI 53201-1881

    Abstract - The development of feed-forward artificial neural network (ANN) models to predict daily gas consumption is the subject of this paper. A methodology based on network sensitivities and intuition is discussed. The methodology is applied to two regions in Wisconsin served by the Wisconsin Gas Company (WGC). Training results show that ANN models reduce prediction root mean squared errors by more than half when compared with linear regression models. The ANN predictions are compared with predictions made by WGC gas controllers for the first 97 days of the 1994-1995 heating season. The ANN prediction errors are 82.2% and 69.7% of the WGC estimate errors for the two regions.

    1. INTRODUCTION AND OVERVIEW

    Local gas companies face many challenges in the business of supplying gas to their customers while encouraging their customers to participate in conservation efforts. One such challenge is forecasting total daily gas consumption (known as daily sendout) for a given region. Daily, each local distribution company (LDC) notifies its pipeline company of the amount of gas the LDC will use the next gas day. When the errors in this estimate exceed certain limits, the LDC can be penalized.

    Two significant sources of error in gas consumption predictions are errors in the weather forecast and errors in the mathematical model. The focus of this work is directly on the latter source of error. An increase in the number of variables, and the use of nonlinear relationships between predicted sendouts, independent variables, and behavioral aspects of consumption, result in improved prediction models that can also reduce the effect of the former source of error. Models based on feed-forward artificial neural networks (ANN) are well suited to this task, because ANNs can model general nonlinear relationships.

    Initial results indicate that ANN models have training set root mean squared errors (RMSE) that are as low as 48% of the RMSE of linear regression models using the same inputs, and are well under 50% of the RMSE of mathematical models typically used by the LDCs today. ANN models for two regions in Wisconsin are now being used daily. The ANN predictions are being compared to the predictions being made independently by WGC gas controllers. The model prediction RMSE over the first 97 test days of the 1994-1995 heating season is 82.2% and 69.7% of the WGC prediction RMSE for the two regions.

    This work was supported in part by a grant from the Wisconsin Center for Demand-Side Research.

    2. FACTORS CONTRIBUTING TO GAS CONSUMPTION

    Gas consumption depends on many factors. The most significant factor is temperature, as most gas is used for residential, commercial, and industrial heating. The scaled gas consumption for a region in metropolitan Milwaukee, WI, and the average daily temperature of Milwaukee, WI, are shown in Figure 1. (The gas consumption has been scaled to be in the range from 0 to 1000 units to protect proprietary information.) Another important factor is wind, because buildings lose more heat on a windy day than on a calm day. Heat loss is also a dynamic process. Therefore, weather characteristics from previous days are contributing factors to predicting gas consumption. Many industrial customers and some commercial customers shut down over weekends. Thus, the day of the week is also important. Many other potential factors exist, such as hours of sunshine, direction of the wind, tap water temperature, accounting for holidays, bill shock, etc.

    The most obvious characteristic that can be seen from Figure 1 (and is common knowledge) is that gas consumption decreases with increasing temperature up to a point. When the outside temperature reaches a certain range, heating no longer takes place and gas consumption is at some near-constant baseline value. This nonlinear characteristic was observed long ago and led to the heating degree day (HDD), defined as:

    [Figure 1: two panels plotted against year, 1989 to 1996. Upper panel: "Weather for Milwaukee, WI" (average daily temperature). Lower panel: "Natural Gas Usage for Milwaukee, WI" (scaled sendout, 0 to 1000).]

    Figure 1. Scaled daily sendout and average temperature for Milwaukee, WI

    0-7803-3026-9/95 $4.00 © 1995 IEEE


    $$\mathrm{HDD} = \max(0,\; T_{\mathrm{ref}} - T) \qquad (1)$$

    where $T$ is the average temperature for a day and $T_{\mathrm{ref}}$ is the reference temperature, historically set to 65 °F. By using a fixed reference temperature as an intercept point, however, the nonlinear aspect of the outside temperature range is ignored. This is a source of prediction model error that must be addressed if the prediction model is to be improved. Similarly, all other contributing factors mentioned above must also be allowed to vary nonlinearly with respect to gas consumption.
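The heating degree day definition in equation (1) can be illustrated with a few lines of Python. This is only a sketch: the function name `heating_degree_days` and the sample temperatures are assumptions chosen for illustration, and none of the values come from the WGC data.

```python
def heating_degree_days(avg_temp_f, ref_temp_f=65.0):
    """Equation (1): HDD = max(0, T_ref - T), with temperatures in degrees F."""
    return max(0.0, ref_temp_f - avg_temp_f)


# Illustrative temperatures only (not data from the paper).
for temp in (20.0, 50.0, 65.0, 80.0):
    print(temp, heating_degree_days(temp))
```

Every day at or above the reference temperature maps to zero HDDs, which is the hard cut-off the text identifies as a source of model error.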

    3. THE FACTOR SELECTION AND THE TRAINING PROCESS

    Many ANNs were trained using various inputs. Daily consumption for the current day and the next day were the outputs. This was done in order to determine which factors were the most significant. The analysis of the models is presented in this section.

    A note on how error is measured: the root mean squared error is measured as:

    $$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} e_i^2} = \sqrt{\mathrm{std}^2 + \mathrm{mean}^2}$$

    where $e_i$ is the error of the $i$-th of $N$ estimates, std is the error standard deviation, and mean is the mean estimate error. The RMSE reported in this paper is calculated on scaled data with peak sendout set to 1000.
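The relationship between RMSE, the error standard deviation, and the mean error can be checked numerically. The sketch below assumes that std is the population standard deviation (dividing by N), under which the identity holds exactly. The error values are hypothetical and the helper name `rmse` is an assumption.

```python
import math
import statistics


def rmse(errors):
    """Root mean squared error of a sequence of estimate errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical estimate errors on the 0-to-1000 scaled sendout.
errors = [3.0, -1.0, 4.0, -2.0, 1.0]

mean_err = statistics.fmean(errors)
std_err = statistics.pstdev(errors)  # population std: divides by N, not N - 1

direct = rmse(errors)
decomposed = math.sqrt(std_err ** 2 + mean_err ** 2)
print(direct, decomposed)
```

With the sample standard deviation (dividing by N - 1) the two numbers would differ slightly, so the choice of convention matters when reproducing reported figures.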

    A. An 11-input ANN for the 1993-1994 Heating Season

    An 11-input ANN that achieved an RMSE of 17.67, 68% of the linear regression RMSE of 25.26 using the same factors, was trained in 51 epochs using an extended Kalman filter based algorithm [1]. This network was trained on WGC SEW data in the Fall of 1993 for predicting gas consumption for the 1993-1994 heating season.

    The problem is to predict sendout for today and tomorrow. These days are denoted with the subscripts k and k+1 respectively. The prediction is done slightly before today starts. Thus actual factors for yesterday, subscripted k-1, are not yet known. The inputs were:

    dd(k+1), the forecasted HDDs for the next day;
    dd(k), the forecasted HDDs for the current day;
    dd(k-1), the forecasted HDDs for the previous day;
    dd(k-2), the actual HDDs for the day before yesterday;
    wind(k+1), the forecasted wind in mph for the next day;
    wind(k), the forecasted wind for the current day;
    wind(k-1), the forecasted wind for the previous day;
    wind(k-2), the actual wind for the day before yesterday;
    SO(k-2), the actual sendout for the day before yesterday;
    sdayofweek, the sine of 2π times the day-of-the-week, where Sat=0, Sun=1, ..., Fri=6;
    cdayofweek, the cosine of 2π times the day-of-the-week, where Sat=0, Sun=1, ..., Fri=6.

    In order to ascertain which factors are most important (and how well the network was trained), the input sensitivity of the ANN was determined. This was accomplished by adding offsets from -5% to +5% of full range to each input (one at a time), and measuring the change in the outputs. 5% of full range is denoted as 5%FR. This is shown in Figure 2. Ideally, the minimum error should occur when no offset is added to the input. That is not so in Figure 2. Thus, the ANN is not ideally trained. More training will reduce the training set error, but may lead to other problems (such as overtraining).

    The sensitivity to +/-5%FR input variation was calculated as the average of the +5%FR and the -5%FR input sensitivity. The results are shown in Figure 3. As expected, the HDD factor for the current day is most important, followed by SO(k-2) and dd(k-2). The latter two factors together form an indicator of the current state of the system, i.e. thermostat settings, occupancy rate, etc. The lagged HDD is next most important. It indicates the dynamic behavior of the system. The weather factors for day k+1 are not that important. This indicates that the system is causal. Weather factors for k+1 are actually very important to predict the sendout for day k+1.

    [Figure 2: training set sensitivity curves, one per input (MKE dd(k+1), MKE dd(k), MKE dd(k-1), MKE dd(k-2), MKE wind(k+1), MKE wind(k), MKE wind(k-1), MKE wind(k-2), SO(k-2), sdayofweek, cdayofweek), plotted against percent change in input.]

    Figure 2. Training set sensitivities for +/-5%FS variation for the 11-input ANN.

    [Figure 3: bar chart titled "sensitivity to +/-5%FS input variation".]

    Figure 3. Training set sensitivities in rank order for the 11-input ANN.

    B. A 20-input ANN for the 1993-1994 Heating Season

    Further study of the 11-input ANN indicates that the network has a tendency of under-predicting in the Fall and over-predicting in the Spring. This is explained as a behavioral factor, that is, in general, people turn down their thermostats as the heating season progresses. Thus we hypothesized that day-of-the-year indicators were needed. Furthermore, when the gas consumption over a year is studied, it has the appearance of a half-wave rectified sine wave with a period of one year. The Fourier series of a half-wave rectified sine wave contains a fundamental and even harmonic terms. Using Fourier analysis and linear regression, we determined that using the fundamental and second harmonic terms would significantly reduce the error. We included these indicator variables in an ANN. To further reduce the RMSE, we hypothesized that weather and consumption for the same day of the previous week would be good factors.

    A 20-input ANN was trained using a neuron decoupled extended Kalman filter based algorithm [2] on data for the WGC SEW region. This network also used temperature in degrees Fahrenheit instead of HDDs, since the actual nonlinearity can be modeled better by the network than by equation (1). The RMSE for this network was 11.08. The training set data contained one training vector per day for each day from 21-Dec-89 to 31-May-93. Each epoch consists of training the ANN with all training vectors in the training set. The order of the training vectors was random within each epoch. The network was trained for 950 epochs. The test set data contained one vector per day for each day from 01-Jun-93 to 29-Mar-94. The statistics for this network are shown in Table 1.

    1) Observations of the training results: A linear regression (LR) model was developed using the same training data as the ANN (except using heating degree days (HDDs) instead of temperatures). The LR model RMSE was 23.14, compared to 11.08 for the ANN. Thus the ANN training error was 48.2% of the LR training error. Figure 4a shows the RMSE for each month in the training set for both the ANN and the LR models. The ANN

    [Figure 4: two panels, (a) "RMSE by Month" and (b) "Mean Error by Month".]

    Figure 4. The 20-input ANN and equivalent LR model training characteristics: (a) RMSE for each month; (b) mean error for each month.

    Table 1. The Training Set and Testing Set Results of the WGC 20-input ANN.
    Training Data Dates: 21-Dec-89 to 31-May-93
    Testing Data Dates: 01-Jun-93 to 29-Mar-94

    RMSE for SO(k) (peak = 1000)    Training    Testing
    linear regression               23.14       17.63
    FFN model                       11.08       14.38
    reduction ANN/LR                48.2%       81.6%
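The cyclic indicator inputs described in subsections A and B (sine and cosine of the day-of-the-week, and the fundamental and second harmonic of the yearly cycle) might be constructed as in the following Python sketch. It is not the authors' code. The helper names `day_of_week_inputs` and `day_of_year_inputs` are hypothetical, dividing by 7 (and by an assumed year length of 365.25 days) before taking sine and cosine is an assumption about how the angle was normalized, and using a sine/cosine pair for each annual harmonic is likewise assumed.

```python
import math
from datetime import date


def day_of_week_inputs(d):
    """Return (sdayofweek, cdayofweek) with Sat=0, Sun=1, ..., Fri=6."""
    # date.weekday() gives Mon=0 ... Sun=6; shift so that Saturday maps to 0.
    idx = (d.weekday() + 2) % 7
    angle = 2.0 * math.pi * idx / 7.0  # assumed normalization by 7 days
    return math.sin(angle), math.cos(angle)


def day_of_year_inputs(d, harmonics=(1, 2), year_length=365.25):
    """Sine/cosine pairs for the requested harmonics of the yearly cycle."""
    day = d.timetuple().tm_yday
    features = []
    for h in harmonics:
        angle = 2.0 * math.pi * h * day / year_length
        features.extend([math.sin(angle), math.cos(angle)])
    return features


example_day = date(1994, 1, 15)  # a Saturday
print(day_of_week_inputs(example_day))
print(day_of_year_inputs(example_day))
```

Encoding a cyclic quantity as a sine/cosine pair keeps Friday and Saturday (or 31 December and 1 January) close together in input space, which a single integer index would not.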
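The +/-5%FR input sensitivity procedure used for the 11-input ANN in subsection A can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `model` is any callable that maps one input vector to a prediction (a toy linear function stands in for a trained network), `full_range` is the assumed full range of each input, and sensitivity is taken to be the change in RMSE, averaged over the +5%FR and -5%FR offsets.

```python
import math


def rmse(pred, target):
    """Root mean squared difference between predictions and targets."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def input_sensitivity(model, rows, targets, full_range, offset=0.05):
    """Average RMSE change when each input is shifted by +/- offset * full range."""
    base = rmse([model(r) for r in rows], targets)
    result = []
    for j, fr in enumerate(full_range):
        changes = []
        for sign in (1.0, -1.0):
            # Offset only input j, leaving every other input untouched.
            shifted = [r[:j] + [r[j] + sign * offset * fr] + r[j + 1:] for r in rows]
            changes.append(rmse([model(r) for r in shifted], targets) - base)
        result.append(sum(changes) / 2.0)
    return result


def toy_model(x):
    """Stand-in for a trained network; the weights are hypothetical."""
    return 10.0 * x[0] + 1.0 * x[1]


rows = [[0.0, 0.0], [1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
targets = [toy_model(r) for r in rows]  # perfect fit, so the base RMSE is 0
sens = input_sensitivity(toy_model, rows, targets, full_range=[3.0, 3.0])
ranked = sorted(range(len(sens)), key=lambda j: sens[j], reverse=True)
print(sens, ranked)
```

Sorting the averaged sensitivities gives a rank order of inputs of the kind shown in Figure 3. For a well-trained model the error should be lowest at zero offset, which is the check the text applies to Figure 2.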



    [Figure 7: two panels by month, Oct-94 to Jan-95: "RMSE by Month" and "Mean Error by Month", each showing the gas controller, linear regression, and ANN estimates made with forecast data.]

    Figure 7. Monthly RMSE for the gas controller estimates, the linear regression model estimates, and the ANN estimates for the first 97 days of the 1994-95 heating season for the WGC SEW region

    filter algorithm [2] with periodic sensitivity adjustments on WGC SEW data. The RMSE for this ANN is 10.67. The training set data contained one training vector per day for each day from 08-Jan-90 to 31-May-94. The network was trained for 1000 epochs in a fashion similar to the ANNs discussed previously. We are obtaining the real test set data as we go through the 1994-1995 heating season.

    This ANN was implemented in a spreadsheet for use by the gas controllers at WGC. Each morning, using weather forecasts, estimated sendouts from previous days, actual recent weather data, etc., the gas controllers make the load estimate using their usual methods. Then, using the same data, they run the spreadsheet to calculate the ANN sendout estimate. The performance statistics of this ANN using weather forecast data and estimated previous sendout data are shown in Figure 7. The performance statistics of the estimates made by the human experts are also included in Figure 7. A first observation to make is that the errors are considerably larger than for the 11-input and 20-input networks in the previous two subsections. This is completely explained by data quality. The two earlier networks were evaluated using actual weather and actual previous sendouts. This test is being performed in real time, in that the data being used are weather forecasts and estimated previous sendouts. Overall, and on a month by month basis, the ANN is making more accurate estimates than the human experts. In the first 97 days of the test, the ANN has been more accurate 58 times, the human experts have been more accurate 29 times, and there have been 10 ties. The peak errors for the human expert estimates and the ANN estimates are about the same, in that the peak errors are caused by inaccurate weather forecasts. The ANN estimates are more accurate following large temperature swings, whereas the human expert estimates are more accurate around and immediately after the holidays.

    [Figure 8: two panels by month, Oct-94 to Jan-95: "RMSE by Month" and "Mean Error by Month".]

    Figure 8. Monthly RMSE for the gas controller estimates, the linear regression model estimates, and the ANN estimates for the first 97 days of the 1994-95 heating season for the WGC ANR Districts region

    D. Transferability: a 23-input ANN for a Different Region for the 1994-1995 Heating Season

    To test transferability of this technology to other regions, an ANN was trained on data for the WGC ANR Districts region using the same input factors and training algorithm as the WGC SEW ANN. The RMSE for this ANN is 17.83. The training set data contained one training vector per day for each day from 15-Jan-90 to 14-May-94.

    The ANN was also implemented in a spreadsheet for use by the gas controllers at WGC. Each morning, using weather forecasts, estimated sendouts from previous days, actual recent weather data, etc., the gas controllers make the load estimate using their usual methods. Then, using the same data, they also run the spreadsheet to calculate the ANN sendout estimate. The performance statistics of this ANN using weather forecast data and estimated previous sendout data are shown in Figure 8. The performance statistics of the estimates made by the human experts are also included in Figure 8. The errors for this region are larger than the errors for the WGC SEW region, although this region is several times smaller than the other region. The difference between the ANN estimate statistics and the human expert statistics is smaller than for the WGC SEW region, with the human expert statistics surpassing the ANN estimate statistics for November. Similar trends to those of the other region were observed in the gas load estimate errors, with one additional trend: the ANN estimates during the mild October and November were usually low, that is, under-predicted, indicating a possible change in the customer base or consumption behavior. This trend seems to diminish once the cold weather arrived.
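The head-to-head comparison reported for the two regions (an RMSE ratio, plus counts of days on which each estimator was closer) could be computed along the following lines. The numbers in this Python sketch are made-up illustrative values, not WGC data, and the function name `compare_estimates` and the tie tolerance `tie_tol` are assumptions.

```python
import math


def rmse(errors):
    """Root mean squared error of a sequence of estimate errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def compare_estimates(actual, ann, human, tie_tol=0.0):
    """Compare two sets of daily estimates against the actual sendout."""
    ann_err = [a - y for a, y in zip(ann, actual)]
    human_err = [h - y for h, y in zip(human, actual)]
    ann_closer = human_closer = ties = 0
    for ea, eh in zip(ann_err, human_err):
        diff = abs(ea) - abs(eh)
        if abs(diff) <= tie_tol:
            ties += 1
        elif diff < 0:
            ann_closer += 1
        else:
            human_closer += 1
    return {
        "rmse_ratio": rmse(ann_err) / rmse(human_err),
        "ann_closer": ann_closer,
        "human_closer": human_closer,
        "ties": ties,
    }


# Illustrative scaled sendouts for four days (hypothetical values).
actual = [500.0, 620.0, 710.0, 640.0]
ann = [510.0, 600.0, 700.0, 640.0]
human = [520.0, 610.0, 730.0, 640.0]

result = compare_estimates(actual, ann, human)
print(result)
```

An `rmse_ratio` below 1 corresponds to figures such as the 82.2% and 69.7% quoted in the paper, where the ANN error is expressed as a fraction of the gas controllers' error.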

    4. SUMMARY AND RECOMMENDATIONS FOR FURTHER WORK

    The development of feed-forward artificial neural network based models for the prediction of gas consumption on a daily basis has been presented for two regions served by WGC. The results indicate that the feed-forward ANN based models reduce the residual predicted consumption RMSE by more than half when compared to models based on linear regression using the same input factors. Two ANNs, for two regions served by WGC, are being used to predict gas consumption for the 1994-1995 heating season. For the first 97 days of the analysis, both models have smaller estimate errors than the human experts.

    This study indicates that improvements to this technology are needed. A mechanism to track growth, demand-side management, and behavioral influences on consumption needs to be developed. A mechanism for improved gas load estimation on and around holidays is also needed. A mechanism to ensure better extrapolation on peak days is needed. This is not an issue for the next few winters since we have January 1994 to train on, but as this data becomes old, we need to implement a mechanism for peak days so estimates are better the next time a peak day occurs.

    Although this work was developed for the WGC SEW region in metropolitan Milwaukee, WI, the technology was transferred to another region served by WGC. The models can be developed for a particular customer base provided that the appropriate historical weather and sendout information is available.

    5. REFERENCES

    [1] K. Watanabe, T. Fukuda, and S. G. Tzafestas, "Learning algorithms of layered neural networks via extended Kalman filters", Int. J. of Sys. Sci., vol. 22, no. 4, April 1991, pp. 753-768.
    [2] G. V. Puskorius and L. A. Feldkamp, "Decoupled extended Kalman filter training of feedforward layered networks", IJCNN (Baltimore, MD), vol. I, June 1991, pp. 771-777.
