

Renewable Energy 101 (2017) 526–536

Contents lists available at ScienceDirect

Renewable Energy

journal homepage: www.elsevier.com/locate/renene

Short-term probabilistic forecasts for Direct Normal Irradiance

Yinghao Chu, Carlos F.M. Coimbra*

Department of Mechanical and Aerospace Engineering, Jacobs School of Engineering, Center of Excellence in Renewable Resource Integration and Center for Energy Research, University of California, 9500 Gilman Drive, La Jolla, CA 92093, USA

Article info

Article history:
Received 24 June 2016
Received in revised form 3 September 2016
Accepted 8 September 2016

Keywords:
Solar forecasting
k Nearest neighbor
Probabilistic forecast
Direct Normal Irradiance
Ensemble predictions

* Corresponding author. E-mail address: [email protected] (C.F.M. Coimbra).

http://dx.doi.org/10.1016/j.renene.2016.09.012
0960-1481/© 2016 Elsevier Ltd. All rights reserved.

Abstract

A k-nearest neighbor (kNN) ensemble model has been developed to generate Probability Density Function (PDF) forecasts for intra-hour Direct Normal Irradiance (DNI). This probabilistic forecasting model, which uses diffuse irradiance measurements and cloud cover information as exogenous feature inputs, adaptively provides arbitrary PDF forecasts for different weather conditions. The proposed models have been quantitatively evaluated using data from different locations characterized by different climates (continental, coastal, and island). The performance of the forecasts is quantified using metrics such as Prediction Interval Coverage Probability (PICP), Prediction Interval Normalized Averaged Width (PINAW), Brier Skill Score (BSS), the Continuous Ranked Probability Score (CRPS), and other standard error metrics. A persistence ensemble probabilistic forecasting model and a Gaussian probabilistic forecasting model are employed to benchmark the performance of the proposed kNN ensemble model. The results show that the proposed model significantly outperforms both reference models in terms of all evaluation metrics for all locations when the forecast horizon is greater than 5 min. In addition, the proposed model shows superior performance in predicting DNI ramps.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

Global market penetration of centralized solar production, particularly Concentrated Solar Power (CSP) plants, has been growing rapidly due to the increasing demand for clean and carbon-free energy [1,2]. Direct Normal Irradiance (DNI), which is the sole energy source for CSP generation, is sensitive to the circumsolar cloud cover and therefore is highly variable at the ground level [3]. As a result, the variability of ground-level CSP production imposes serious challenges on electrical transmission grids, which need to be balanced in real time but have limited storage capacity [4,5]. Quantitative forecasting of DNI provides important information for inverter control, plant management, unit commitment, and real-time dispatch operations [6,7]. Therefore, solar forecasting models are widely recognized as key components of a smart grid to mitigate the instabilities of centralized solar power generation [4,8–10].

Many effective solar forecasting models have been developed for different temporal horizons based on data-driven, physical, or hybrid methods [8,11–25]. Most of these available solar forecasting models generate deterministic point predictions without quantified uncertainty [26,27]. Point predictions are associated with inherent and irreducible forecasting errors because of the chaotic atmospheric processes, regardless of the mechanism of the model or the methods of data processing [28–30]:

$$I(t) = f(t) + \varepsilon(t), \tag{1}$$

where $I(t)$ represents the measured value at time $t$, $f(t)$ represents the optimal prediction, and $\varepsilon(t)$ represents the white noise. Therefore, probabilistic forecasts, which provide the Probability Density Function (PDF) of forecast variables, are recommended for real-world forecasting applications in the literature [26,28,29,31–33].

Probabilistic solar/solar power forecasts have been proposed in the literature based on analog ensembles of Numerical Weather Prediction (NWP) models [34–38]. The analog ensemble is usually defined as a set of historical instances from an NWP model for a given location and forecast horizon. These historical instances have features similar to those of the current instance from the same NWP model. The actual observations of the historical instances are used to estimate the PDF of the future state for various weather conditions. For hourly forecasts, these analog ensemble models have shown superior performance over reference models, such as the Persistence Ensemble (PeEn) model [37] or the quantile regression model [38], when validated using months of historical data collected from multiple locations, particularly during hours of low solar elevation [35–38]. However, the temporal and spatial resolutions of NWP-based methods are not appropriate for intra-hour forecasts of DNI [3,4].

Nomenclature

ANN: Artificial neural network
BS: Brier score
BSS: Brier skill score
CDF: Cumulative distribution function
CRPS: Continuous ranked probability score
DIF: Diffuse irradiance
DNI: Direct Normal Irradiance
kNN: k-nearest neighbor
kNNEn: kNN ensemble model
kNNGD: kNN Gaussian model
MAE: Mean absolute error
MBE: Mean bias error
MRE: Missing rate error
NRBR: Normalized red to blue ratio
NWP: Numerical Weather Prediction
PDF: Probability density function
PeEn: Persistence ensemble model
PI: Prediction interval
PICP: Prediction interval coverage probability
PINAW: Prediction interval normalized averaged width
RMSE: Root mean square error
B: Beam/Direct irradiance
clr: Clear-sky condition
FH: Forecast horizon
I: Irradiance
k: Clear-sky index
M: Number of ranks
N: Number of instances
P: Probability
p: Persistence
s: Forecast skill
t: Time instance
V: Irradiance variability

A few intra-hour forecasting models that provide Prediction Intervals (PIs) for DNI are available in the literature [24,33,39,40]. Nevertheless, these models either provide empirical PIs without underlying PDFs [33,41] or construct PIs based on the assumption that forecast errors are Gaussian-distributed [26,28,29]. The PDF of DNI forecast errors may not follow a Gaussian or other common distribution. For example, Gaussian, Logistic, and Kernel functions are used to fit the distributions of the persistence DNI forecast errors in Fig. 1. The bandwidths of the distribution fittings are selected using the exhaustive method [33]. The persistence errors are obtained by evaluating a persistence model (discussed in Section 3.4) using the training data collected in Folsom and Oahu when the solar elevation angle is greater than 10°. More details of the data are given in Section 2. In addition to visual inspection, the goodness of fit [42] of each PDF is assessed using the Kolmogorov-Smirnov test [43]. All of the applied PDFs are rejected at the 5% significance level. In addition, time series of DNI usually behave differently under different weather conditions [3,22]. For example, the DNI variability is much higher under partially cloudy skies than under clear skies [8]. Ideally, probabilistic forecasts for DNI should be adaptive to different weather conditions [26].

Fig. 1. Probability density functions generated based on the persistence forecast errors. The forecast errors are obtained by assessing the persistence model on the (a) Folsom and (b) Oahu training sets.

Therefore, in this work, a probabilistic forecasting model is developed based on k-nearest neighbor (kNN) ensemble predictions to generate arbitrary PDFs of DNI for intra-hour forecast horizons of 5, 10, 15, and 20 min. kNN searches for and identifies k historical time instances whose weather features are close to the weather features of the current time instance [17,44]. With the identified historical time instances and corresponding DNI behaviors, the kNN generates unique PDF forecasts for different weather conditions. The proposed model is developed and evaluated using high-quality data collected in locations with different climates. The quantitative evaluation of the proposed model is based on the Prediction Interval Coverage Probability (PICP), the Prediction Interval Normalized Averaged Width (PINAW), the Brier Skill Score (BSS), the Continuous Ranked Probability Score (CRPS), and statistical consistency. A Persistence Ensemble (PeEn) model is employed as a reference model. Gaussian PDF forecasts are also computed using the same kNN ensemble predictions to assess the advantages of using arbitrary and adaptive PDF forecasts. Details of these methods are presented in Section 3.

The paper is organized as follows: the data used for model training and testing are presented in Section 2. The detailed methods of the persistence and kNN models are presented in Section 3, together with the metrics and methods used to assess both point and PDF forecasts. The results are presented and discussed in Section 4, and conclusions are presented in Section 5.

2. Data

In this work, the data used to develop and validate the PDF forecasts are collected from four locations characterized by different climates: Folsom, California (Mediterranean; latitude = 38.643°N, longitude = 121.149°W), Merced, California (close to continental; latitude = 37.363°N, longitude = 120.429°W), Oahu, Hawaii (island; latitude = 21.326°N, longitude = 150.043°W), and San Diego, California (coastal; latitude = 32.881°N, longitude = 117.238°W).

Rotating Shadowband Radiometers (Augustyn RSR-2, manufactured by Irradiance, Inc.) and Multi-Filter Rotating Shadowband Radiometers (MFR-7, manufactured by Yankee Environmental Systems) are employed to measure the DNI and DIF components of broadband solar irradiance. Both the RSR-2 and the MFR-7 are first-class radiometers that meet the accuracy requirements of this work. Two RSR-2s are employed in Folsom and Oahu, and two MFR-7s are installed in Merced and San Diego. Both the RSR-2 and the MFR-7 can simultaneously measure the DNI and DIF every minute, and the measured irradiance values are logged using Campbell Scientific CR1000 data loggers.

Vivotek FE8171 fish-eye network cameras are installed at each of the four locations next to the irradiance radiometers. These cameras are employed to collect 8-bit RGB sky images (1536 × 1536 pixels) every minute using a 3.1 MP CMOS sensor and a 360° panoramic view lens. The RGB sky images are transferred via FTP to a UCSD server for further processing and analysis (presented in Section 3.2). The glass domes of the cameras are cleaned regularly to maintain the quality of the sky images, and sky images with an excessive amount of dust are manually discarded.

The sampling intervals of both irradiance and sky image measurements are 1 min. Therefore, the irradiance data and sky images are paired as data instances according to their time labels. Ground obstacles, such as buildings and trees, adversely affect the measurements when the solar elevation is low [26]. Therefore, this work only considers data instances when the local solar elevation angle is greater than 10°. Data instances from each of the four locations are divided into disjoint training sets for model training and testing sets for model validation. To avoid significant seasonal differences, the training sets are defined as the first three weeks of each month and the testing sets are defined as the last week of each month. The details of the data sets are presented in Table 1.
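The filtering and splitting rules above can be sketched in a few lines of Python. This is only an illustration: the text does not say which calendar days make up "the first three weeks", so the cutoff at day 21 and the function name `assign_split` are assumptions made here.

```python
from datetime import datetime

MIN_ELEVATION_DEG = 10.0  # threshold stated in the text
LAST_TRAINING_DAY = 21    # assumption: "first three weeks" means days 1-21


def assign_split(timestamp, solar_elevation_deg):
    """Return 'train', 'test', or None when the instance is discarded."""
    # Instances at low solar elevation are not used at all.
    if solar_elevation_deg <= MIN_ELEVATION_DEG:
        return None
    # Early days of the month go to training, the remainder to testing.
    return "train" if timestamp.day <= LAST_TRAINING_DAY else "test"


print(assign_split(datetime(2013, 1, 5, 12, 0), 30.0))
print(assign_split(datetime(2013, 1, 28, 12, 0), 30.0))
```

Splitting within every month, rather than by season, keeps both sets exposed to the same range of seasonal conditions.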

Table 1. Data used to train and validate forecasting models.

| Location | Total data instances | Overall V | Period |
|---|---|---|---|
| Folsom | 879,734 | 0.098 | 2013-01-01 to 2014-10-01 |
| Merced | 248,903 | 0.127 | 2012-10-16 to 2013-04-10 |
| Oahu | 71,536 | 0.229 | 2012-06-21 to 2012-08-12 |
| San Diego | 234,076 | 0.139 | 2014-01-01 to 2014-06-15 |

Irradiance variability (V) [45] is calculated for each of the four locations (shown in Table 1). V quantifies the variability of an irradiance time series and is an indicator of local sky conditions. A small V generally indicates relatively stationary weather conditions, such as clear sky. For instance, Oahu has the highest proportion of cloudy and partly cloudy periods among the four locations, resulting in the largest V. In this work, the variability V for DNI is defined as:

$$V = \sqrt{\frac{1}{N}\sum \left(k(t) - k(t-1)\right)^{2}}, \tag{2}$$

where $k(t)$ is the DNI clear-sky index at time $t$:

$$k(t) = \frac{B(t)}{B_{\mathrm{clr}}(t)}, \tag{3}$$

where $B_{\mathrm{clr}}$ is the clear-sky DNI.

Clear-sky DNI ($B_{\mathrm{clr}}$) is defined as the DNI in the absence of clouds for a given location and time. $B_{\mathrm{clr}}$ is influenced by atmospheric conditions (e.g. aerosol content), which can be quantified using the atmospheric turbidity. The clear-sky model used in this work was proposed by Ineichen and Perez [46,47]. It is considered to be both one of the simplest clear-sky models and one of the best performing [4]. This clear-sky model uses solar elevation and Linke turbidity [48] as inputs. The solar elevation is derived from three parameters: time, latitude, and longitude. The Linke turbidity is available to the public as worldwide monthly averaged maps, which are derived from global beam radiation measurements based on the algorithm proposed by Remund et al. [49]. For more details about the Linke turbidity values, please see Ref. [46].
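As a minimal sketch of Eqs. (2) and (3), the clear-sky index and the variability can be computed from paired DNI and clear-sky DNI series as follows. The sample values are hypothetical, the clear-sky series would in practice come from a clear-sky model, and taking N as the number of consecutive differences is an assumption made here.

```python
import math


def clear_sky_index(dni, dni_clear):
    """Eq. (3): k(t) = B(t) / B_clr(t) for two equal-length series."""
    return [b / b_clr for b, b_clr in zip(dni, dni_clear)]


def variability(k):
    """Eq. (2): root mean square of the step changes k(t) - k(t-1)."""
    steps = [k[t] - k[t - 1] for t in range(1, len(k))]
    # Assumption: N is the number of step changes in the series.
    return math.sqrt(sum(s * s for s in steps) / len(steps))


# Hypothetical one-minute samples in W/m^2 with a passing cloud at the third step.
dni = [800.0, 790.0, 400.0, 780.0]
dni_clear = [850.0, 850.0, 850.0, 850.0]
k = clear_sky_index(dni, dni_clear)
v = variability(k)
print(round(v, 3))
```

A perfectly steady clear-sky index gives V = 0, while each cloud passage adds a large squared step, which is why cloudier sites show larger V.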

3. Methods

3.1. k-nearest neighbors

The k-nearest neighbors (kNN) method is considered one of the simplest pattern recognition methods [50,51]. A kNN model treats its training set as a feature space and classifies patterns by comparing the current instance with training (historical) instances in the feature space [17]. In addition to pattern classification, kNN is also employed to forecast solar irradiance [9,17,33,44]. In a forecasting scenario, the kNN searches its feature space to identify the k training instances whose features most closely match the features of the current conditions. These k training instances are called the neighbors. Each neighbor contributes an individual prediction, namely the subsequent values in its time series, and the collective prediction from the k neighbors is taken as the kNN prediction.

In this work, the feature space of the kNN model assembles all training instances from the training set in a matrix $A_{ij}$. Each row of $A_{ij}$ is therefore a feature vector representing a historical time instance. At a new time $t$, the feature vector $p_j$ at $t$ is obtained and compared with all feature vectors in the feature space. The Euclidean distances $d_i$ between $p_j$ and every row of $A_{ij}$ are calculated:

$$d_i = \sqrt{\sum_{j}\left(A_{ij} - p_j\right)^{2}}. \tag{4}$$

Afterward, the distance values are sorted in ascending order and the first k training instances are selected as the k nearest neighbors. In this work, k is set to 30 as suggested by Pedro and Coimbra [33]. The individual predictions from the k nearest neighbors (the kNN ensemble) are weighted and summed to provide a collective prediction:

$$\hat{B} = \frac{\sum_{i=1}^{k} w_i \hat{B}_i}{\sum_{i=1}^{k} w_i}, \tag{5}$$

where $w_i$ are weights derived from the distances $d_i$:

$$w_i = \left(1 - \frac{d_i}{\max(d) - \min(d)}\right)^{n}, \quad i = 1, \ldots, k, \tag{6}$$

where $n$ is a positive parameter. The nearest neighbors are equally weighted if $n$ equals 0 and linearly weighted if $n$ equals 1. In this work, $n$ is set to 1 as suggested in Ref. [33].
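The neighbor search and weighted average of Eqs. (4) to (6) can be sketched as below. One deliberate deviation is labeled in the code: the sketch subtracts min(d) from each distance before applying the weighting, an assumption made here so that the weights stay within [0, 1], whereas Eq. (6) as printed uses $d_i$ directly. The function name `knn_forecast` and the toy data are hypothetical.

```python
import math


def knn_forecast(features, targets, query, k=30, n=1):
    """Weighted kNN point forecast plus the individual neighbor predictions.

    features: historical feature vectors (rows of A)
    targets:  DNI value that followed each historical instance
    query:    feature vector p of the current instance
    """
    # Eq. (4): Euclidean distance from the query to every training row.
    dists = [math.dist(row, query) for row in features]
    # Sort by distance and keep the k closest training instances.
    nearest = sorted(range(len(dists)), key=lambda i: dists[i])[:k]
    d = [dists[i] for i in nearest]
    members = [targets[i] for i in nearest]
    d_min, spread = min(d), max(d) - min(d)
    if spread == 0.0:
        # All neighbors equally distant: fall back to equal weights.
        weights = [1.0] * len(d)
    else:
        # Assumption: distances are shifted by min(d) so weights lie in [0, 1].
        weights = [(1.0 - (di - d_min) / spread) ** n for di in d]
    # Eq. (5): weighted average of the individual predictions.
    forecast = sum(w * b for w, b in zip(weights, members)) / sum(weights)
    return forecast, members


features = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0]]
targets = [100.0, 200.0, 300.0, 900.0]
forecast, members = knn_forecast(features, targets, [0.0, 0.0], k=3, n=1)
print(forecast, members)
```

With n = 0 every weight becomes 1 and the forecast reduces to the plain mean of the k neighbors, matching the equal-weighting case described above.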

3.2. kNN feature space

The feature space of the kNN model includes time-lagged DNI and DIF measurements and three sky image features [26,33]. Time-lagged DNI measurements (from 0 to 30 min in steps of 5 min) are used as endogenous features because the latest measured DNI values are highly informative for machine-learning models [3,9]. Time-lagged DIF measurements (from 0 to 30 min in steps of 5 min) are used as exogenous features because significant variations of DIF are usually associated with possible future DNI ramp events [52,53]. Sky image features are used because they provide cloud cover information, which is considered a useful exogenous input to DNI forecasts at the ground level [4,22,54].

As proposed by Pedro and Coimbra, three image features are calculated from the normalized red to blue ratio (NRBR) [22,55] of all pixels in the sky region of a sky image: mean ($\mu$), standard deviation ($\sigma$), and entropy ($e$).

$$\mu = \frac{1}{N}\sum_{i=1}^{N} \mathrm{NRBR}_i, \tag{7}$$

where $N$ is the number of pixels in a sky image. Standard deviation:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{NRBR}_i - \mu\right)^{2}}. \tag{8}$$

Entropy:

$$e = -\sum_{j=1}^{N_B} p_j \log_2\left(p_j\right), \tag{9}$$

where $p_j$ is the relative frequency of the $j$-th bin (out of $N_B = 256$ bins). Compared with common cloud detection methods, the calculation of these three image features requires substantially less time and results in minimum latency in real-time forecasting [26].
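A minimal sketch of Eqs. (7) to (9) follows. The text does not spell out the NRBR formula or the histogram range, so two assumptions are made and labeled: NRBR is taken as (R − B)/(R + B), and the 256 bins are spread evenly over [−1, 1]. The function names are hypothetical.

```python
import math


def nrbr(red, blue):
    """Assumed definition: normalized red to blue ratio (R - B) / (R + B)."""
    return (red - blue) / (red + blue)


def image_features(values, n_bins=256):
    """Mean, standard deviation and entropy of NRBR values for the sky pixels."""
    n = len(values)
    mu = sum(values) / n                                        # Eq. (7)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)   # Eq. (8)
    # Histogram over [-1, 1]; this binning range is an assumption.
    counts = [0] * n_bins
    for v in values:
        j = min(int((v + 1.0) / 2.0 * n_bins), n_bins - 1)
        counts[j] += 1
    # Eq. (9): empty bins are skipped because 0 * log2(0) is taken as 0.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)
    return mu, sigma, entropy


# Hypothetical image with half bluish and half reddish sky pixels.
pixels = [-0.5, -0.5, 0.5, 0.5]
mu, sigma, entropy = image_features(pixels)
print(mu, sigma, entropy)
```

All three features need only a single pass over the pixels, which is consistent with the low latency noted above.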

3.3. kNN probabilistic forecasts

Weather conditions of the selected neighbors (training instances) are assumed to closely resemble the current weather condition for the same location. Therefore, the relative frequency from the k individual predictions (kNN ensemble) can be used to generate the probability distribution of the future DNI for the current conditions [37,38]. The forecast of the probability density function (PDF) for DNI calculated from the kNN ensemble is denoted as kNNEn and is mathematically defined as [35]:

$$P\left(B(t) \le x\right) = \frac{1}{k}\sum_{i=1}^{k} I\left[\hat{B}_i(t), x\right], \tag{10}$$

where $x$ represents a possible level of DNI, $t$ is the current time, $B(t)$ represents the observation (the target DNI), $\hat{B}_i(t)$ is the $i$-th individual prediction from the k nearest neighbors, and $I[\hat{B}_i(t), x]$ equals 1 if $\hat{B}_i(t) \le x$ and 0 otherwise. $P(B(t) \le x)$ is a cumulative distribution function (CDF) and can be easily converted into a PDF with the same resolution in $x$. The DNI PDF generated from ensemble forecasts (denoted as kNNEn) may not perfectly match the true DNI PDF due to model errors and deficiencies in the method of constructing the ensemble [35].
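Eq. (10) amounts to an empirical CDF over the k individual predictions. The sketch below evaluates it and reads off ensemble quantiles; the quantile convention (smallest member whose CDF reaches q) and the function names are assumptions made here, not something the text specifies.

```python
def ensemble_cdf(members, x):
    """Eq. (10): fraction of the k individual predictions that are <= x."""
    return sum(1 for b in members if b <= x) / len(members)


def ensemble_quantile(members, q):
    """Smallest ensemble member whose empirical CDF reaches q (0 < q <= 1)."""
    ranked = sorted(members)
    for b in ranked:
        if ensemble_cdf(ranked, b) >= q:
            return b
    return ranked[-1]


# Hypothetical individual predictions from four neighbors (W/m^2).
members = [300.0, 100.0, 400.0, 200.0]
print(ensemble_cdf(members, 250.0))
print(ensemble_quantile(members, 0.05), ensemble_quantile(members, 0.95))
```

Because the CDF is built directly from the neighbors, its shape changes with the weather regime of the retrieved instances rather than being fixed in advance.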

A kNN PDF forecasting model based on the Gaussian distribution (kNNGD) is developed as a reference model to benchmark the kNNEn model. The kNNGD constructs the PDF for DNI using a mean and a standard deviation derived from the k individual predictions. The forecast uncertainty at time $t$ of the kNN forecasting model is estimated as the standard deviation of the predictions from the k nearest neighbors:

$$\sigma(t) = \sqrt{\frac{1}{k}\sum_{i=1}^{k}\left(\hat{B}_i(t) - \hat{\mu}(t)\right)^{2}}, \tag{11}$$

where $\hat{\mu}(t)$ is the average prediction of the k nearest neighbors:

$$\hat{\mu}(t) = \frac{1}{k}\sum_{i=1}^{k}\hat{B}_i(t). \tag{12}$$

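Once $\hat{\mu}$ and $\sigma$ are obtained, the kNNGD PDFs can be generated using the Gaussian distribution critical values $z$:

$$\hat{\mu} \pm z_{1-0.5\alpha}\,\sigma(t), \tag{13}$$

where $\alpha$ is the significance level, so that $1 - \alpha$ is the nominal confidence level.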
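The Gaussian interval of Eqs. (11) to (13) can be sketched with the Python standard library, where `statistics.NormalDist().inv_cdf` supplies the critical value. The function name `gaussian_interval` and the sample ensemble are hypothetical; `alpha=0.10` corresponds to a 90% nominal confidence level.

```python
import math
from statistics import NormalDist


def gaussian_interval(members, alpha=0.10):
    """Lower and upper bounds of the kNNGD prediction interval."""
    k = len(members)
    mu = sum(members) / k                                        # Eq. (12)
    sigma = math.sqrt(sum((b - mu) ** 2 for b in members) / k)   # Eq. (11)
    z = NormalDist().inv_cdf(1.0 - 0.5 * alpha)                  # z_{1 - 0.5*alpha}
    return mu - z * sigma, mu + z * sigma                        # Eq. (13)


# Hypothetical individual predictions from three neighbors (W/m^2).
lower, upper = gaussian_interval([400.0, 500.0, 600.0])
print(round(lower, 1), round(upper, 1))
```

The interval is always symmetric about the ensemble mean, which is the restriction the ensemble-based PDF is meant to avoid.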

3.4. Persistence model

The persistence model is the simplest forecasting model and is frequently used as a baseline/reference model. The persistence model assumes that the clear-sky index $k(t)$ remains constant within the forecast horizon. Persistence of the clear-sky index $k(t)$ removes the diurnal DNI variation and therefore achieves higher accuracy, particularly for longer horizons. The persistence forecast is mathematically defined as:

$$\hat{B}_p(t + FH) = \frac{B(t)}{B_{\mathrm{clr}}(t)} \times B_{\mathrm{clr}}(t + FH), \tag{14}$$

where $FH$ is the forecast horizon and $\hat{B}_p$ is the persistence prediction.

The persistence ensemble (PeEn) method [38] is commonly used to provide reference probabilistic forecasts. In this work, the PeEn considers the lagged DNI measurements in the most recent 1 h. These lagged measurements are ranked to define quantile intervals for DNI in the same way as the kNNEn [35]. Both the persistence point forecast and the PeEn probabilistic forecast are expected to achieve excellent performance during highly stationary weather conditions, such as clear-sky conditions [3,38].
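A short sketch of the two reference models follows: Eq. (14) for the point forecast, and a ranked window of recent measurements for the PeEn. The window of 60 samples assumes the one-minute sampling described in Section 2, and both function names are hypothetical.

```python
def persistence_forecast(b_now, b_clr_now, b_clr_future):
    """Eq. (14): hold the clear-sky index constant over the forecast horizon."""
    return b_now / b_clr_now * b_clr_future


def peen_ensemble(dni_history, window=60):
    """Ranked DNI measurements from the most recent `window` samples.

    Assumption: with one-minute sampling, 60 samples cover the last hour.
    """
    return sorted(dni_history[-window:])


# Hypothetical values in W/m^2: the sky is half as bright as clear-sky DNI.
print(persistence_forecast(400.0, 800.0, 820.0))
print(peen_ensemble([5.0, 3.0, 4.0, 1.0, 2.0], window=3))
```

The ranked members returned by `peen_ensemble` can be turned into quantile intervals with the same empirical-CDF logic used for the kNN ensemble.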

3.5. Forecast assessments

In this section, the metrics and methods used to quantitatively assess the performance of the forecasts are discussed. The point forecasts are assessed using Mean Bias Error (MBE), Root Mean Square Error (RMSE), and forecasting skill ($s$). Prediction intervals with a nominal confidence level can be derived from probabilistic forecasts and are usually assessed using Prediction Interval Coverage Probability (PICP) and Prediction Interval Normalized Averaged Width (PINAW) [29,56]. The predicted probability density function can be assessed using the Brier Skill Score (BSS), the Continuous Ranked Probability Score (CRPS), and statistical consistency.

Mean Bias Error (MBE):

$$\mathrm{MBE} = \frac{1}{n}\sum_{t=1}^{n}\left(\hat{B}(t) - B(t)\right). \tag{15}$$

Root Mean Square Error (RMSE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(\hat{B}(t) - B(t)\right)^{2}}. \tag{16}$$

Forecasting skill ($s$), which measures the improvement of the proposed forecast model (the kNN models) over the persistence model in terms of RMSE:

$$s = 1 - \frac{\mathrm{RMSE}}{\mathrm{RMSE}_p}. \tag{17}$$
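The three point-forecast metrics in Eqs. (15) to (17) translate directly into code. The sketch below uses hypothetical forecast, persistence, and observation series of equal length.

```python
import math


def mbe(pred, obs):
    """Eq. (15): mean of forecast minus observation."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Eq. (16): root mean square of the forecast errors."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def forecast_skill(pred, persistence, obs):
    """Eq. (17): improvement over the persistence model in terms of RMSE."""
    return 1.0 - rmse(pred, obs) / rmse(persistence, obs)


# Hypothetical DNI series in W/m^2.
obs = [500.0, 600.0, 700.0]
pred = [510.0, 590.0, 700.0]
pers = [520.0, 580.0, 720.0]
print(mbe(pred, obs), rmse(pers, obs), round(forecast_skill(pred, pers, obs), 3))
```

A skill of 0 means the model is no better than persistence, and a negative skill means it is worse.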

Prediction Interval Coverage Probability (PICP), which measures whether the target values are covered by the PIs:

$$\mathrm{PICP} = \frac{1}{n}\sum_{i=1}^{n} c_i, \tag{18}$$

where $c_i = 1$ indicates that the measured DNI value is within the PI; otherwise $c_i = 0$.

Prediction Interval Normalized Averaged Width (PINAW), which measures the informativeness of PIs:

$$\mathrm{PINAW} = \frac{1}{n}\sum_{i=1}^{n}\frac{W_i}{B_{\mathrm{clr}}}, \tag{19}$$

where $W_i$ is the width between the upper and lower bounds of the PIs.

Brier score (BS), which is considered the mean square error for probabilistic forecasts, measures the similarity between the predictions and the observations [37,38]. The Brier score in its original form is applicable to probabilistic forecasts that assign probabilities to a set of mutually exclusive discrete categories [57]:

$$\mathrm{BS} = \frac{1}{N}\sum_{t}^{N}\sum_{i}^{M}\left(p_{ti} - T_{ti}\right)^{2}, \tag{20}$$

where $N$ is the number of forecasting instances, $M$ is the number of possible categories in which the observation can fall (in this work $M = 60$), $p$ is the predicted probability of a categorical event, and $T$ is the categorical observation/target. In this work, the observation is a point DNI value. Therefore $T_{ti} = 1$ if the event occurs in the $i$-th category at $t$; otherwise $T_{ti} = 0$. Similar to the mean square error, a lower Brier score indicates better performance of probabilistic forecasts. Because the Brier score has multiple definitions and decompositions, it is usually used to calculate the Brier skill score (BSS), which measures the performance of an evaluated model relative to a baseline/reference model:

$$\mathrm{BSS} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{\mathrm{ref}}}, \tag{21}$$

where $\mathrm{BS}_{\mathrm{ref}}$ is the Brier score achieved by a reference model, which is the PeEn model in this work. A positive value of BSS indicates that the evaluated model achieves a lower Brier score than the reference PeEn model. Therefore, contrary to the Brier score, a higher Brier skill score indicates better performance of probabilistic forecasts.

Continuous ranked probability score (CRPS) compares the cumulative distribution functions (CDFs) of predicted probabilistic distributions and observations [38,58]. CRPS shares the same dimension as the observations and is derived mathematically as:

$$\mathrm{CRPS} = \frac{1}{N}\sum_{t}^{N}\int\left(P\left(\hat{B}(t) \le x\right) - P\left(B(t) \le x\right)\right)^{2} dx, \tag{22}$$

where $P(\hat{B}(t) \le x)$ are the CDFs of the probabilistic forecasts, and $P(B(t) \le x)$ are the "CDFs" of the observations. In this work, the definition of CRPS does not assume a particular distributional model family. Therefore, the $P(B(t) \le x)$ are estimated by the empirical CDF, which takes the form of a step function because the observations are deterministic point values. If the probabilistic forecasts are reduced to deterministic point forecasts ($P(\hat{B}(t) \le x)$ become step functions), the CRPS is equivalent to the mean absolute error (MAE). As with the Brier score, a lower CRPS indicates better performance of probabilistic forecasts.
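The Brier score and CRPS of Eqs. (20) to (22) can be sketched for ensemble forecasts as follows. The sketch assumes the forecast CDF is the empirical step function of the ensemble members, so the integral in Eq. (22) can be summed exactly between breakpoints; the function names and the encoding of the target as a category index are choices made here for illustration.

```python
def brier_score(probs, outcomes):
    """Eq. (20). probs[t][i] is the forecast probability of category i at
    instance t; outcomes[t] is the index of the category that occurred."""
    total = 0.0
    for p_t, hit in zip(probs, outcomes):
        total += sum((p - (1.0 if i == hit else 0.0)) ** 2
                     for i, p in enumerate(p_t))
    return total / len(probs)


def brier_skill_score(bs, bs_ref):
    """Eq. (21): relative improvement over the reference Brier score."""
    return 1.0 - bs / bs_ref


def crps_single(members, obs):
    """Integral in Eq. (22) for one instance, both CDFs being step functions."""
    points = sorted(set(members) | {obs})
    k = len(members)
    total = 0.0
    for left, right in zip(points, points[1:]):
        f = sum(1 for b in members if b <= left) / k  # forecast CDF on [left, right)
        h = 1.0 if left >= obs else 0.0               # observation step on [left, right)
        total += (f - h) ** 2 * (right - left)
    return total


def crps(ensembles, observations):
    """Eq. (22): average of the per-instance integrals."""
    scores = [crps_single(m, o) for m, o in zip(ensembles, observations)]
    return sum(scores) / len(scores)


print(crps_single([1.0, 3.0], 2.0))
print(crps_single([5.0], 3.0))
```

The second call illustrates the point-forecast limit: a single member at 5 against an observation of 3 gives the absolute error of 2.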

Statistical consistency [59,60] evaluates whether the ensemble predictions are statistically indistinguishable from the observations. The analysis of statistical consistency first sorts the $M$ ensemble predictions ($\hat{B}_i$; $i = 1, \ldots, M$) and the observation ($B$) together from lowest to highest. If the ensemble forecast and the observation are statistically consistent, the observation is equally likely to take any of the $M + 1$ ranks:

$$E\left[P\left(\hat{B}_{i-1} \le B < \hat{B}_i\right)\right] = \frac{1}{M + 1}. \tag{23}$$

Evaluation of statistical consistency is usually performed using a rank histogram. A rank histogram is the distribution of observation ranks relative to the sorted ensemble predictions over a large sample of independent instances (in this work, the testing set). Rank histograms are designed to analyze the statistical consistency of ensemble forecasts [37]. However, the kNNGD model provides a continuous probability density function instead of discrete ensemble predictions. To evaluate the statistical consistency of the kNNGD model, the kNNGD rank histogram is created by dividing the probability density function of kNNGD into 20 equal-probability bins ($M = 19$ and the rank probability is 1/20 for each bin).

Ensemble predictions that are statistically consistent have observations equally distributed in the rank histogram with uniform rank probability ($1/(M + 1)$) [38,60]. Bias in the ensemble prediction will result in a sloped rank histogram. Ensemble predictions that are under-dispersive have a concave rank histogram, while ensemble predictions that are over-dispersive have a convex rank histogram [61].

A rank histogram and the associated Missing Rate Error (MRE) provide insights about the performance of probabilistic forecasts, and are therefore useful to recalibrate the ensemble forecasts and to improve the analyzed probabilistic forecasts [61]. More details on the implementation of statistical consistency analysis can be found in Refs. [38,60,61]. The fraction of observations that are higher/lower than the highest/lowest ranked prediction is calculated as the missing rate error:

$$\mathrm{MRE} = f_1 + f_M - \frac{2}{M + 1}, \tag{24}$$

where, f1 and fM are the relative frequencies of the first and the lastbins in the histogram. Positive and negative missing rate errorusually indicate under-dispersion and over-dispersion in theensemble predictions, respectively [38].
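A rank histogram and its missing rate error can be computed for a discrete ensemble along the following lines. This is a hedged sketch on synthetic data: the function names `rank_histogram` and `missing_rate_error` are hypothetical, the rank follows the ordering of Eq. (23), and the two outermost histogram bins play the role of the first and last bins in Eq. (24).

```python
import numpy as np

def rank_histogram(ensemble, obs):
    """ensemble: (N, M) array, obs: (N,) array. Returns frequencies of the M + 1 ranks."""
    ensemble = np.asarray(ensemble, dtype=float)
    obs = np.asarray(obs, dtype=float)
    n, m = ensemble.shape
    # Rank of the observation = number of members not exceeding it (0..M).
    ranks = (ensemble <= obs[:, None]).sum(axis=1)
    counts = np.bincount(ranks, minlength=m + 1)
    return counts / n

def missing_rate_error(freq):
    # Frequency in the two outermost bins minus the expected value 2 / (M + 1).
    n_bins = len(freq)  # n_bins = M + 1
    return freq[0] + freq[-1] - 2.0 / n_bins

rng = np.random.default_rng(1)
n, m = 4000, 9
obs = rng.normal(size=n)
consistent = rng.normal(size=(n, m))     # same spread as the observations
narrow = 0.5 * rng.normal(size=(n, m))   # under-dispersive ensemble

mre_consistent = missing_rate_error(rank_histogram(consistent, obs))
mre_narrow = missing_rate_error(rank_histogram(narrow, obs))
print(mre_consistent, mre_narrow)
```

In this synthetic setting the consistent ensemble yields an MRE near zero, while the ensemble with too little spread yields a clearly positive MRE, in line with the interpretation given above.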


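Table 3
PICP and PINAW of all PDF forecasting models with 90% nominal confidence interval for different locations and forecast horizons.

| Forecast horizon (min) | Metric | Folsom PeEn | Folsom kNNGD | Folsom kNNEn | Merced PeEn | Merced kNNGD | Merced kNNEn |
|---|---|---|---|---|---|---|---|
| 5 | PICP | 0.83 | 0.89 | 0.93 | 0.87 | 0.93 | 0.96 |
| 5 | PINAW | 0.17 | 0.22 | 0.22 | 0.16 | 0.21 | 0.22 |
| 10 | PICP | 0.86 | 0.91 | 0.94 | 0.89 | 0.93 | 0.95 |
| 10 | PINAW | 0.23 | 0.28 | 0.26 | 0.22 | 0.25 | 0.26 |
| 15 | PICP | 0.87 | 0.92 | 0.94 | 0.88 | 0.93 | 0.94 |
| 15 | PINAW | 0.29 | 0.32 | 0.29 | 0.28 | 0.29 | 0.29 |
| 20 | PICP | 0.86 | 0.92 | 0.93 | 0.87 | 0.92 | 0.93 |
| 20 | PINAW | 0.33 | 0.35 | 0.31 | 0.32 | 0.33 | 0.32 |

| Forecast horizon (min) | Metric | Oahu PeEn | Oahu kNNGD | Oahu kNNEn | San Diego PeEn | San Diego kNNGD | San Diego kNNEn |
|---|---|---|---|---|---|---|---|
| 5 | PICP | 0.79 | 0.87 | 0.94 | 0.74 | 0.90 | 0.94 |
| 5 | PINAW | 0.49 | 0.59 | 0.57 | 0.21 | 0.23 | 0.26 |
| 10 | PICP | 0.84 | 0.88 | 0.93 | 0.79 | 0.90 | 0.93 |
| 10 | PINAW | 0.65 | 0.71 | 0.64 | 0.29 | 0.28 | 0.31 |
| 15 | PICP | 0.84 | 0.89 | 0.93 | 0.80 | 0.89 | 0.92 |
| 15 | PINAW | 0.76 | 0.78 | 0.68 | 0.34 | 0.32 | 0.34 |
| 20 | PICP | 0.84 | 0.88 | 0.92 | 0.80 | 0.89 | 0.91 |
| 20 | PINAW | 0.85 | 0.82 | 0.70 | 0.38 | 0.36 | 0.37 |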

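Table 4
Brier skill score (BSS) and continuous ranked probability score (CRPS) of all PDF forecasting models for different locations and forecast horizons.

| Forecast horizon | Metric | Folsom PeEn | Folsom kNNGD | Folsom kNNEn | Merced PeEn | Merced kNNGD | Merced kNNEn |
|---|---|---|---|---|---|---|---|
| 5-min | BSS | N/A | -0.206 | -0.045 | N/A | -0.071 | 0.059 |
| 5-min | CRPS | 0.042 | 0.035 | 0.031 | 0.039 | 0.036 | 0.031 |
| 10-min | BSS | N/A | -0.128 | 0.021 | N/A | -0.035 | 0.107 |
| 10-min | CRPS | 0.050 | 0.046 | 0.040 | 0.047 | 0.045 | 0.038 |
| 15-min | BSS | N/A | -0.068 | 0.064 | N/A | -0.015 | 0.136 |
| 15-min | CRPS | 0.056 | 0.053 | 0.046 | 0.054 | 0.053 | 0.044 |
| 20-min | BSS | N/A | -0.019 | 0.100 | N/A | -0.007 | 0.150 |
| 20-min | CRPS | 0.061 | 0.059 | 0.050 | 0.060 | 0.059 | 0.049 |

| Forecast horizon | Metric | Oahu PeEn | Oahu kNNGD | Oahu kNNEn | San Diego PeEn | San Diego kNNGD | San Diego kNNEn |
|---|---|---|---|---|---|---|---|
| 5-min | BSS | N/A | -0.031 | 0.043 | N/A | -0.199 | -0.038 |
| 5-min | CRPS | 0.117 | 0.106 | 0.098 | 0.058 | 0.050 | 0.046 |
| 10-min | BSS | N/A | 0.001 | 0.006 | N/A | -0.157 | 0.002 |
| 10-min | CRPS | 0.131 | 0.126 | 0.115 | 0.067 | 0.063 | 0.057 |
| 15-min | BSS | N/A | 0.032 | 0.074 | N/A | -0.104 | 0.041 |
| 15-min | CRPS | 0.144 | 0.138 | 0.127 | 0.077 | 0.072 | 0.065 |
| 20-min | BSS | N/A | 0.052 | 0.085 | N/A | -0.072 | 0.070 |
| 20-min | CRPS | 0.154 | 0.148 | 0.137 | 0.087 | 0.082 | 0.074 |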

Y. Chu, C.F.M. Coimbra / Renewable Energy 101 (2017) 526e536 531

4. Results and discussion

4.1. Deterministic point forecasts

The point forecasts from the kNN model and the reference persistence model are evaluated using the testing sets of all four locations. The results of the point forecasts in terms of MBE, RMSE, and forecast skill s (defined in Section 3.5) are presented in Table 2. Both models show small MBEs, with magnitudes below 15 W/m². In general, the persistence model shows the smaller bias for most locations and horizons. However, the kNN model significantly outperforms the reference persistence model in terms of RMSE and s for nearly all locations and forecast horizons. For instance, the 20-min kNN forecasts achieve forecast skills of 12.5%, 8.0%, 14.9%, and 6.7% for Folsom, Merced, Oahu, and San Diego, respectively. DNI is usually more variable under partly cloudy skies than under other sky conditions, such as clear or overcast [22,24]. Oahu has a higher occurrence of partly cloudy sky conditions. Therefore, large DNI ramps occur significantly more often in Oahu than in the other three locations, resulting in larger variability. Consequently, both models show relatively higher error metrics in terms of MBE and RMSE for Oahu than for the other locations, regardless of forecast horizon.
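The quoted 20-min forecast skills can be cross-checked against the RMSE values of Table 2. The sketch below assumes that the forecast skill of Section 3.5 takes the common form s = 1 − RMSE_kNN / RMSE_persistence; that section is not reproduced here, so the formula is an assumption, and the helper name `forecast_skill` is hypothetical.

```python
# RMSE values (W/m^2) for the 20-min horizon, copied from Table 2
# as (persistence, kNN) pairs.
rmse_20min = {
    "Folsom": (134.5, 117.8),
    "Merced": (124.4, 114.4),
    "Oahu": (258.1, 219.7),
    "San Diego": (150.2, 140.1),
}

# Forecast skills (%) reported in Table 2 for the same horizon.
reported = {"Folsom": 12.5, "Merced": 8.0, "Oahu": 14.9, "San Diego": 6.7}

def forecast_skill(rmse_persistence, rmse_model):
    # Assumed definition: s = 1 - RMSE_model / RMSE_persistence, in percent.
    return 100.0 * (1.0 - rmse_model / rmse_persistence)

skills = {site: round(forecast_skill(p, k), 1) for site, (p, k) in rmse_20min.items()}
for site, s in skills.items():
    print(f"{site}: computed {s}%, reported {reported[site]}%")
```

Under this assumption the recomputed skills agree with the tabulated ones to within about 0.1 percentage points, a difference that is compatible with the RMSEs being rounded to one decimal place.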

The error metrics of both the persistence and the kNN models increase with the forecast horizon (shown in Table 2). For example, in San Diego, the RMSEs of both the persistence and the kNN models increase from ~95 W/m² to ~145 W/m² when the forecast horizon increases from 5 to 20 min. This is expected because forecasts for longer horizons are usually more difficult [8] than those for shorter horizons. The persistence model is unable to predict future irradiance ramps. Therefore, the RMSEs of the persistence model tend to increase faster as the forecast horizon increases [23,24,62]. As a result, the kNN model achieves relatively higher forecast skills for longer horizons in all four locations. For example, in Merced, the forecast skills of the kNN model are 2.9% and 8.0% for 5-min and 20-min forecasts, respectively.

4.2. Prediction intervals

Prediction intervals are generated from the probability density functions of the PeEn, kNNGD, and kNNEn models using a nominal confidence level of 90%, which is commonly applied in renewable forecasting studies [26,28,29,31,32]. The performance metrics in terms of PICP and PINAW are presented in Table 3. Prediction intervals with higher PICPs are considered superior [26,29]. For instance, the kNNEn prediction intervals with a 90% nominal confidence level achieve the highest PICP of 0.93 for the 5-min forecast at Folsom and therefore show better coverage probability than the two reference models. The coverage probabilities (PICPs) of PeEn are mostly less than the pre-defined nominal confidence level. Both the kNNGD and kNNEn models achieve PICPs mostly greater than their nominal confidence levels. The coverage probabilities of the kNNEn model are slightly higher than those of the kNNGD model for all locations and forecast horizons.

Table 2
MBE, RMSE and forecast skill s of the kNN and the reference persistence models for different locations and forecast horizons.

| Forecast horizon | Metric | Folsom Persistence | Folsom kNN | Merced Persistence | Merced kNN | Oahu Persistence | Oahu kNN | San Diego Persistence | San Diego kNN |
|---|---|---|---|---|---|---|---|---|---|
| 5-min | MBE (W/m²) | 0.2 | -2.6 | -0.8 | 4.5 | -2.4 | -0.9 | -1.7 | -1.8 |
| 5-min | RMSE (W/m²) | 83.3 | 78.1 | 81.2 | 78.9 | 190.0 | 168.8 | 95.8 | 95.8 |
| 5-min | s (%) | N/A | 6.2 | N/A | 2.9 | N/A | 11.1 | N/A | -0.1 |
| 10-min | MBE (W/m²) | 0.6 | -2.5 | -2.3 | 7.3 | -2.0 | -6.2 | -3.3 | -3.1 |
| 10-min | RMSE (W/m²) | 111.6 | 98.4 | 98.8 | 94.6 | 217.1 | 193.6 | 122.1 | 116.1 |
| 10-min | s (%) | N/A | 11.8 | N/A | 4.3 | N/A | 10.8 | N/A | 4.9 |
| 15-min | MBE (W/m²) | 1.0 | -2.3 | -3.5 | 9.3 | -1.5 | -10.7 | -4.6 | -4.9 |
| 15-min | RMSE (W/m²) | 125.2 | 109.3 | 116.6 | 107.0 | 244.2 | 208.5 | 136.8 | 128.9 |
| 15-min | s (%) | N/A | 12.7 | N/A | 8.2 | N/A | 14.6 | N/A | 5.8 |
| 20-min | MBE (W/m²) | 1.4 | -2.0 | -4.6 | 11.5 | -1.0 | -13.2 | -5.8 | -6.6 |
| 20-min | RMSE (W/m²) | 134.5 | 117.8 | 124.4 | 114.4 | 258.1 | 219.7 | 150.2 | 140.1 |
| 20-min | s (%) | N/A | 12.5 | N/A | 8.0 | N/A | 14.9 | N/A | 6.7 |

Ideally, prediction intervals should have high PICPs and low PINAWs, indicating high coverage probability of target values and high informativeness, respectively [26]. At the same confidence level, the kNNGD intervals usually have significantly higher PICPs and PINAWs than the PeEn intervals (shown in Table 3). These results match earlier studies of probabilistic forecasts that are based on the Gaussian distribution [26]. At the same confidence level, the kNNEn prediction intervals achieve higher PICPs but similar or lower PINAWs than the two reference models. Therefore, the kNNEn model achieves the lowest PINAW/PICP ratios, providing valid and the most informative predictions.
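The two interval metrics and the PINAW/PICP ratio can be illustrated with a short sketch. The definitions below are the commonly used ones (coverage as the fraction of observations inside the interval, width as the mean interval width divided by the range of the target); the exact normalization used in this work is defined in an earlier section and is assumed here. The function names and the synthetic data are hypothetical, while the three (PICP, PINAW) pairs are the 20-min Folsom row of Table 3.

```python
import numpy as np

def picp(lower, upper, obs):
    # Fraction of observations that fall inside their prediction interval.
    return float(np.mean((obs >= lower) & (obs <= upper)))

def pinaw(lower, upper, target_range):
    # Mean interval width normalised by the range of the target (assumed normalisation).
    return float(np.mean(upper - lower) / target_range)

# Synthetic check: a central 90% interval of a standard normal variable.
rng = np.random.default_rng(0)
obs = rng.normal(size=10_000)
lower = np.full_like(obs, -1.645)
upper = np.full_like(obs, 1.645)
coverage = picp(lower, upper, obs)
width = pinaw(lower, upper, obs.max() - obs.min())
print(coverage, width)

# (PICP, PINAW) for the 20-min horizon at Folsom, copied from Table 3.
table3_folsom_20min = {"PeEn": (0.86, 0.33), "kNNGD": (0.92, 0.35), "kNNEn": (0.93, 0.31)}
ratios = {model: round(w / c, 3) for model, (c, w) in table3_folsom_20min.items()}
print(ratios)
```

For this particular row the kNNEn ratio is the smallest of the three; the sketch only reproduces the arithmetic for one row and does not check the other locations or horizons.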

4.3. PDF forecasts

The forecasts of probability density functions from the PeEn, the kNNGD, and the kNNEn models are evaluated on the testing sets of all four locations for 5-, 10-, 15-, and 20-min horizons. The performance metrics in terms of Brier skill score and continuous ranked probability score are presented in Table 4. The Brier skill scores of the PeEn are not applicable because the PeEn is the reference model. For all locations, when the forecast horizon is ≥ 10 min, the kNNEn model achieves significant positive Brier skill scores over the PeEn model. The Brier skill scores of the kNNGD model are negative for most locations except Oahu. In terms of continuous ranked probability score, the kNNEn model consistently and significantly outperforms both the PeEn model and the kNNGD model regardless of forecast horizon and location.

For all models, both the Brier skill score and the continuous ranked probability score tend to be larger for longer forecast horizons in all locations. For example, in Merced, the Brier skill score and the continuous ranked probability score of the kNNEn model increase from 0.059 to 0.150 and from 0.031 to 0.049, respectively, when the forecast horizon increases from 5 min to 20 min. The increases in continuous ranked probability score are expected for the same reason that forecasts for longer horizons are usually more difficult [8] than those for shorter horizons. The PeEn Brier score increases faster than the kNNEn Brier score when the forecast horizon increases. Therefore, the Brier skill score of the kNNEn model increases with the forecast horizon. These results show that the kNNEn model is superior to the PeEn model for longer horizon forecasts.

Fig. 2. Plots of relative frequency of CRPSs for different horizons and locations. The size of bins is 0.025. Lighter tails of the frequency distribution of CRPSs indicate less occurrence of large CRPS values. In general, kNNEn forecasts have lower frequencies of large CRPSs but higher frequencies of small CRPSs than both PeEn and kNNGD forecasts. Therefore, the distributions of CRPSs for both PeEn and kNNGD forecasts tend to have heavier tails than that of kNNEn forecasts, and the kNNEn model achieves the lowest overall CRPS.
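The relationship between a Brier score and a Brier skill score relative to a reference model can be sketched as below. This is an illustration under stated assumptions: the Brier score is taken in its standard form for a binary event [57], the skill score as 1 − BS_model / BS_reference, and the event, the probabilities, and the names `brier_score` and `brier_skill_score` are hypothetical rather than taken from this work.

```python
import numpy as np

def brier_score(prob, outcome):
    # Mean squared difference between forecast probability and the 0/1 outcome.
    prob = np.asarray(prob, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    return float(np.mean((prob - outcome) ** 2))

def brier_skill_score(bs_model, bs_reference):
    # Positive when the model has a lower Brier score than the reference.
    return 1.0 - bs_model / bs_reference

# Synthetic binary events with known occurrence probabilities.
rng = np.random.default_rng(2)
p_true = rng.uniform(0.0, 1.0, size=20_000)
outcome = rng.random(20_000) < p_true

bs_model = brier_score(p_true, outcome)                      # well-informed forecast
bs_reference = brier_score(np.full(20_000, 0.5), outcome)    # uninformative reference
bss = brier_skill_score(bs_model, bs_reference)
print(bs_model, bs_reference, bss)
```

Because the skill score is a ratio against the reference, it rises whenever the reference Brier score grows faster than the model Brier score, which is the mechanism described above for longer horizons.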

To further analyze the performance of the models, the relative-frequency distributions of the continuous ranked probability scores of all forecast instances are plotted in Fig. 2 for different locations and forecast horizons. In general, the distributions of continuous ranked probability score tend to shift right as the forecast horizon increases because longer horizons are more difficult to forecast [4]. However, all three evaluated models have very high relative frequencies (mostly > 60%) for very small continuous ranked probability scores (< 0.05). Most of these small scores are observed during stationary weather conditions (clear periods), which are the dominant weather conditions for most locations. Oahu has a smaller proportion of clear weather conditions and therefore has shorter peaks (~40%) and heavier tails in the distributions of continuous ranked probability score.
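A per-instance continuous ranked probability score and its relative-frequency distribution can be sketched as follows. The sketch assumes an ensemble forecast whose CDF is the empirical step function of its members and uses the well-known identity CRPS = E|X − y| − ½ E|X − X′| for that case; it is not necessarily the decomposition of Ref. [58] used in this work. The normalized synthetic data, the ensemble size, and the name `crps_ensemble` are hypothetical; only the bin width of 0.025 is taken from the caption of Fig. 2.

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of one ensemble forecast, using the empirical CDF of its members."""
    members = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(members - obs))
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return float(term1 - term2)

# Two members at 0 and 1 with the observation halfway: the integral of the
# squared CDF error is 0.5 * 0.25 + 0.5 * 0.25 = 0.25.
example = crps_ensemble([0.0, 1.0], 0.5)

# Synthetic normalized forecasts: 10 members scattered around each observation.
rng = np.random.default_rng(3)
obs = rng.uniform(0.0, 1.0, size=2000)
ens = np.clip(obs[:, None] + rng.normal(0.0, 0.1, size=(2000, 10)), 0.0, 1.0)
scores = np.array([crps_ensemble(e, y) for e, y in zip(ens, obs)])

# Relative frequency of the scores with a bin width of 0.025.
edges = np.linspace(0.0, 1.0, 41)
counts, _ = np.histogram(scores, bins=edges)
rel_freq = counts / counts.sum()
print(example, rel_freq[:4])
```

A distribution with most of its mass in the first bins and little mass beyond them corresponds to the light-tailed behaviour discussed above.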

Persistence errors equate to the magnitude of step changes in the net load representing the levels of irradiance variability V. Therefore, the persistence models usually achieve excellent performance during stationary weather conditions [8,24]. However, the persistence models are unable to forecast the non-diurnal irradiance ramps [22], and persistence errors increase rapidly with the variability V during cloudy periods [22]. Consequently, the PeEn model tends to have a relatively heavy tail for continuous ranked probability scores > 0.4. For all locations, the distributions of continuous ranked probability score for both the kNNGD and kNNEn models have substantially fewer occurrences of large scores (continuous ranked probability score > 0.4) than that of the PeEn model, resulting in lighter tails (close to zero). Compared with the kNNGD model, the kNNEn model generally has lower frequencies of moderate scores and higher frequencies of small scores. Therefore, the kNNEn model achieves the lowest overall continuous ranked probability score (shown in Table 4) for all locations and forecast horizons.

Rank histograms and the corresponding missing rate errors of the 10- and 20-min PDF forecasts are plotted in Figs. 3 and 4, respectively. Forecasts for longer time horizons are usually associated with higher uncertainties [8]. The histograms of the 20-min PDF forecasts have higher missing rate errors than those of the 10-min forecasts. The PeEn and the kNNEn histograms have different numbers of bins depending on the sizes of their ensemble forecasts.

The PeEn model tends to underestimate the frequency of both the highest and the lowest bins regardless of location. The frequencies of the middle bins are mostly notably below the uniform rank probability 1/(M + 1). These results indicate that the PeEn ensemble predictions are under-dispersive, resulting in large positive missing rate errors. These results match previous research [26] showing that the persistence model tends to underestimate the uncertainty of DNI and that persistence prediction intervals usually have both low coverage probability and low average interval width. Compared with the PeEn model, the kNNGD model tends to overestimate the frequency of the highest bins. As a result, the middle bins of the kNNGD histogram usually have frequencies higher than the uniform rank probability. Therefore, the rank histograms of the kNNGD model PDF forecasts mostly have negative missing rate errors. These results indicate that the PDF forecasts of kNNGD tend to be overspread. The kNNEn model shows the most uniform rank histograms overall, with the lowest absolute values of missing rate error for most locations. The kNNEn model slightly underestimates the frequency of the highest bins for Folsom, Merced, and San Diego. In general, the kNNEn model shows better statistical consistency than the kNNGD and the PeEn models for both 10- and 20-min horizons.

Fig. 3. Rank histograms of PeEn (top row), kNNGD (mid row), kNNEn (bottom row) for 10-min horizon PDF forecasts. Each column refers to a different location. Observations equally distributed in the rank histograms indicate that the ensemble predictions are statistically consistent and have uniform rank probability. The kNNEn predictions have shown the best statistical consistency, achieving the lowest absolute values of Missing Rate Error (MRE) for most locations.

Fig. 4. Rank histograms of PeEn (top row), kNNGD (mid row), kNNEn (bottom row) for 20-min horizon PDF forecasts. Each column refers to a different location. Observations equally distributed in the rank histograms indicate that the ensemble predictions are statistically consistent and have uniform rank probability. The kNNEn predictions have shown the best statistical consistency, achieving the lowest absolute values of Missing Rate Error (MRE) for most locations.

Example time series of the 10-min PDF forecasts for Folsom are plotted in Fig. 5. The PDF forecasts are illustrated using prediction intervals with four nominal confidence levels (25%, 50%, 75%, 95%) [26,29,32,33]. During the clear period when irradiance variability V is low (shown in Fig. 5, top row), the PDFs of both the PeEn model and the kNNEn model have the least spread (statistical dispersion), while the PDFs of the kNNGD model have the highest spread. The large spread of the kNNGD model is expected due to the assumption that the errors are Gaussian distributed, which is over-dispersive when applied to fit the forecasting errors as shown in Fig. 1. In addition, one or two outlier values in the kNN ensemble predictions may yield relatively large standard deviations and produce less-peaked PDFs. The kNNEn model is able to adaptively provide arbitrary PDFs that better represent the actual uncertainty of the DNI forecast. Therefore, the kNNEn PDFs have smaller deviations and are more informative [26,29] than the kNNGD PDFs during stationary periods.

During the cloudy period when the irradiance variability V is high (shown in Fig. 5, bottom row), the PeEn forecasts react with delay to the actual ramp events. Therefore, DNI measurements frequently fall outside the 95% prediction intervals of the PeEn PDFs, resulting in very large Brier scores and continuous ranked probability scores. Both the kNNGD and the kNNEn models show superior performance compared with the PeEn model during the cloudy period in terms of coverage probability [33] for irradiance ramps. The kNNGD PDFs are symmetric and may stretch beyond the possible range of DNI, particularly when the sun is not obscured during partly cloudy periods. For instance, in Fig. 5, the upper bounds of the 95% confidence intervals of the kNNGD forecasts occasionally exceed 1050 W/m², which is significantly greater than the maximum DNI value (1004 W/m²) recorded in the entire analysis period (2013-1-1 to 2014-10-1) of Folsom. The stretch of the kNNEn PDFs is bounded by the maximum/minimum predictions of the identified neighbors and therefore will not exceed the historically possible range of DNI. Therefore, the kNNEn models provide asymmetric but physically more plausible PDFs for future DNI, particularly when DNI is close to the clear-sky level.
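The contrast between a symmetric Gaussian interval and an interval bounded by the neighbor predictions can be made concrete with a small numerical sketch. It assumes that the Gaussian PDF is parameterized by the mean and standard deviation of the kNN predictions and that the ensemble PDF is summarized by empirical percentiles of the same predictions; both are simplifying assumptions, and the eight member values are hypothetical numbers chosen to mimic a mostly clear instant with one outlier.

```python
import numpy as np

# Hypothetical DNI predictions (W/m^2) from k = 8 neighbours; the single low
# value mimics one neighbour that experienced a cloud passage.
members = np.array([980.0, 990.0, 995.0, 1000.0, 1004.0, 1002.0, 998.0, 700.0])

# Gaussian interval: symmetric about the ensemble mean.
mean, std = members.mean(), members.std(ddof=1)
z = 1.959964  # standard normal quantile for a central 95% interval
gauss_lo, gauss_hi = mean - z * std, mean + z * std

# Empirical interval: percentiles of the members themselves, which can never
# leave the range spanned by the smallest and largest member.
emp_lo, emp_hi = np.percentile(members, [2.5, 97.5])

print(f"Gaussian 95% interval:  [{gauss_lo:.0f}, {gauss_hi:.0f}] W/m^2")
print(f"Empirical 95% interval: [{emp_lo:.0f}, {emp_hi:.0f}] W/m^2")
```

With these numbers the single outlier inflates the standard deviation, so the Gaussian upper bound lies well above the largest member, whereas the percentile-based interval stays within the member range and is asymmetric about the mean.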

5. Conclusions

A probabilistic forecasting model is developed for Direct Normal Irradiance (DNI) based on the k-Nearest Neighbor Ensemble (kNNEn) method. This probabilistic forecasting model provides adaptive and arbitrary Probability Density Function (PDF) forecasts for 5-, 10-, 15-, and 20-min horizons. The kNN feature space includes diffuse irradiance measurements (DIF) and sky image features as exogenous features to enhance the performance of the kNN forecasts. The proposed model is developed and validated at multiple locations with different local climates: Folsom, CA; Merced, CA; Oahu, HI; and San Diego, CA.

Fig. 5. Sample time series of 10-min horizon PDF forecasts from (a) the PeEn, (b) the kNNGD, and (c) the kNNEn models for a clear day (top row, 2015-6-13) and a cloudy day (bottom row, 2015-5-1) in Folsom. The color bar represents the confidence levels.

A persistence ensemble model (PeEn) and a reference kNN model (kNNGD) that provides Gaussian PDF forecasts are developed to benchmark the kNNEn model. All three models are quantitatively assessed on independent testing datasets collected in each of the four locations using the Prediction Interval Coverage Probability (PICP), the Prediction Interval Normalized Averaged Width (PINAW), the Brier Skill Score (BSS), the Continuous Ranked Probability Score (CRPS), and statistical consistency. The assessment results show the following:

1. The point forecasts of the proposed kNN model achieve positive forecast skills up to 14.9%, depending on location and forecast horizon.

2. At similar levels of PINAW, the kNNEn achieves the highest PICPs among all tested models.

3. The kNNEn model outperforms the reference PeEn and kNNGD models in terms of both BSS and CRPS for all locations when the forecast horizon is > 5 min.

4. In general, the kNNEn model shows the most uniform rank histograms and the smallest missing rate errors, indicating the best statistical consistency among all tested probabilistic forecasting models.

This work demonstrates that the performance of intra-hour DNI probabilistic forecasting can be both significantly and consistently improved using (1) the kNN method and (2) the adaptive ensemble PDF. The proposed kNNEn model successfully shows superior performance in quantifying the uncertainty of the intra-hour DNI and therefore has high potential to benefit the integration of concentrated solar power production.

Acknowledgments

The authors gratefully acknowledge the partial support provided by the California Energy Commission PIER EPC-14-008 project, which is managed by Dr. Silvia Palma-Rojas.

References

[1] R. Perez, M. Perez, A Fundamental Look at Energy Reserves for the Planet, vol. 50, The IEA SHC Solar Update, 2009, pp. 2–3.

[2] IEA, Technology Roadmap: Solar Photovoltaic Energy, Tech. Rep., IEA, 2014.

[3] Y. Chu, H.T.C. Pedro, C.F.M. Coimbra, Hybrid intra-hour DNI forecasts with sky image processing enhanced by stochastic learning, Sol. Energy 98 (2013) 592–603.

[4] R.H. Inman, H.T.C. Pedro, C.F.M. Coimbra, Solar forecasting methods for renewable energy integration, Prog. Energy Combust. Sci. 39 (6) (2013) 535–576.

[5] G.K. Singh, Solar power generation by PV (photovoltaic) technology: a review, Energy 53 (2013) 1–13.

[6] J. Zhang, B. Hodge, A. Florita, S. Lu, H.F. Hamann, V. Banunarayanan, Metrics for Evaluating the Accuracy of Solar Power Forecasting, Tech. Rep., National Renewable Energy Laboratory, 2013. NREL/CP-5500-60142.

[7] A. Florita, B. Hodge, K. Orwig, Identifying wind and solar ramping events, in: Green Technologies Conference, 2013 IEEE, IEEE, 2013, pp. 147–152.

[8] S. Quesada-Ruiz, Y. Chu, J. Tovar-Pescador, H.T.C. Pedro, C.F.M. Coimbra, Cloud-tracking methodology for intra-hour DNI forecasting, Sol. Energy 102 (2014) 267–275.

[9] Y. Chu, B. Urquhart, S.M.I. Gohari, H.T.C. Pedro, J. Kleissl, C.F.M. Coimbra, Short-term reforecasting of power output from a 48 MWe solar PV plant, Sol. Energy 112 (2015) 68–77.

[10] W.K. Yap, V. Karri, An off-grid hybrid PV/diesel model as a planning and design tool, incorporating dynamic and ANN modelling techniques, Renew. Energy 78 (0) (2015) 42–50.

[11] M. Lave, J. Kleissl, Solar variability of four sites across the state of Colorado, Renew. Energy 35 (12) (2010) 2867–2873.

[12] L. Martín, L.F. Zarzalejo, J. Polo, A. Navarro, R. Marchante, M. Cony, Prediction of global solar irradiance based on time series analysis: application to solar thermal power plants energy production planning, Sol. Energy 84 (10) (2010) 1772–1781.

[13] A. Mellit, A.M. Pavan, A 24-h forecast of solar irradiance using artificial neural network: application for performance prediction of a grid-connected PV plant at Trieste, Italy, Sol. Energy 84 (5) (2010) 807–821.

[14] A. Mellit, H. Eleuch, M. Benghanem, C. Elaoun, A.M. Pavan, An adaptive model for predicting of global, direct and diffuse hourly solar irradiance, Energy Convers. Manag. 51 (4) (2010) 771–782.

[15] R. Perez, S. Kivalov, J. Schlemmer, K. Hemker, D. Renne, T.E. Hoff, Validation of short and medium term operational solar radiation forecasts in the US, Sol. Energy 84 (5) (2010) 2161–2172.

[16] C.W. Chow, B. Urquhart, M. Lave, A. Dominguez, J. Kleissl, J. Shields, B. Washom, Intra-hour forecasting with a total sky imager at the UC San Diego solar energy testbed, Sol. Energy 85 (11) (2011) 2881–2893.

[17] H.T.C. Pedro, C.F.M. Coimbra, Assessment of forecasting techniques for solar power production with no exogenous inputs, Sol. Energy 86 (7) (2012) 2017–2028.

[18] R. Marquez, C.F.M. Coimbra, Intra-hour DNI forecasting methodology based on cloud tracking image analysis, Sol. Energy 91 (2013) 327–336.

[19] R. Marquez, V.G. Gueorguiev, C.F.M. Coimbra, Forecasting of global horizontal irradiance using sky cover indices, ASME J. Sol. Energy Eng. 135 (2013) 0110171–0110175.

[20] R. Marquez, H.T.C. Pedro, C.F.M. Coimbra, Hybrid solar forecasting method uses satellite imaging and ground telemetry as inputs to ANNs, Sol. Energy 92 (2013) 176–188.

[21] L. Nonnenmacher, C.F.M. Coimbra, Streamline-based method for intra-day solar forecasting through remote sensing, Sol. Energy 108 (2014) 447–459.

[22] Y. Chu, H.T.C. Pedro, L. Nonnenmacher, R.H. Inman, Z. Liao, C.F.M. Coimbra, A smart image-based cloud detection system for intra-hour solar irradiance forecasts, J. Atmos. Ocean. Technol. 31 (2014) 1995–2007.

[23] L. Nonnenmacher, A. Kaur, C.F.M. Coimbra, Verification of the SUNY direct normal irradiance model with ground measurements, Sol. Energy 99 (2014) 246–258.

[24] Y. Chu, H.T.C. Pedro, M. Li, C.F.M. Coimbra, Real-time forecasting of solar irradiance ramps with smart image processing, Sol. Energy 114 (2015) 91–104.

[25] M. Li, Y. Chu, H.T. Pedro, C.F. Coimbra, Quantitative evaluation of the impact of cloud transmittance and cloud velocity on the accuracy of short-term DNI forecasts, Renew. Energy 86 (2016) 1362–1371.

[26] Y. Chu, M. Li, H.T.C. Pedro, C.F.M. Coimbra, Real-time prediction intervals for intra-hour DNI forecasts, Renew. Energy 83 (2015) 234–244.

[27] Y. Chu, M. Li, C.F. Coimbra, Sun-tracking imaging system for intra-hour DNI forecasts, Renew. Energy 96 (2016) 792–799.

[28] J.G. Carney, P. Cunningham, U. Bhagwan, Confidence and prediction intervals for neural network ensembles, in: Neural Networks, 1999. IJCNN'99. International Joint Conference on, Vol. 2, IEEE, 1999, pp. 1215–1218.

[29] A. Khosravi, S. Nahavandi, D. Creighton, Prediction intervals for short-term wind farm power generation forecasts, IEEE Trans. Sustain. Energy 4 (3) (2013) 602–610.

[30] E.B. Iversen, J.M. Morales, J.K. Møller, H. Madsen, Probabilistic forecasts of solar irradiance using stochastic differential equations, Environmetrics 25 (3) (2014) 152–164.

[31] P. Pinson, H.A. Nielsen, J.K. Møller, H. Madsen, G.N. Kariniotakis, Non-parametric probabilistic forecasts of wind power: required properties and evaluation, Wind Energy 10 (6) (2007) 497–516.

[32] A. Bracale, P. Caramia, G. Carpinelli, A.R. Di Fazio, G. Ferruzzi, A Bayesian method for short-term probabilistic forecasting of photovoltaic generation in smart grid operation and control, Energies 6 (2) (2013) 733–747.

[33] H.T.C. Pedro, C.F.M. Coimbra, Nearest-neighbor methodology for prediction of intra-hour global horizontal and direct normal irradiances, Renew. Energy 80 (2015) 770–782.

[34] H.M. Van den Dool, A new look at weather forecasting through analogues, Mon. Weather Rev. 117 (10) (1989) 2230–2247.

[35] T.M. Hamill, J.S. Whitaker, Probabilistic quantitative precipitation forecasts based on reforecast analogs: theory and application, Mon. Weather Rev. 134 (11) (2006) 3209–3229.

[36] L. Delle Monache, T. Nipen, Y. Liu, G. Roux, R. Stull, Kalman filter and analog schemes to postprocess numerical weather predictions, Mon. Weather Rev. 139 (11) (2011) 3554–3570.

[37] L. Delle Monache, F.A. Eckel, D.L. Rife, B. Nagarajan, K. Searight, Probabilistic weather prediction with an analog ensemble, Mon. Weather Rev. 141 (10) (2013) 3498–3516.

[38] S. Alessandrini, L. Delle Monache, S. Sperati, G. Cervone, An analog ensemble for short-term probabilistic solar power forecast, Appl. Energy 157 (2015) 95–110.

[39] M. David, F. Ramahatana, P.-J. Trombe, P. Lauret, Probabilistic forecasting of the solar irradiance with recursive ARMA and GARCH models, Sol. Energy 133 (2016) 55–72.

[40] A. Grantham, Y.R. Gel, J. Boland, Nonparametric short-term probabilistic forecasting for solar radiation, Sol. Energy 133 (2016) 465–475.

[41] A. Khosravi, S. Nahavandi, D. Creighton, A.F. Atiya, Lower upper bound estimation method for construction of neural network-based prediction intervals, Neural Netw. IEEE Trans. 22 (3) (2011) 337–346.

[42] M.A. Stephens, EDF statistics for goodness of fit and some comparisons, J. Am. Stat. Assoc. 69 (347) (1974) 730–737.

[43] F.J. Massey Jr., The Kolmogorov-Smirnov test for goodness of fit, J. Am. Stat. Assoc. 46 (253) (1951) 68–78.

[44] C. Paoli, C. Voyant, M. Muselli, M.-L. Nivet, Forecasting of preprocessed daily solar radiation time series using neural networks, Sol. Energy 84 (12) (2010) 2146–2160.

[45] R.H. Inman, Y. Chu, C.F. Coimbra, Cloud enhancement of global horizontal irradiance in California and Hawaii, Sol. Energy 130 (2016) 128–138.

[46] P. Ineichen, R. Perez, A new airmass independent formulation for the Linke turbidity coefficient, Sol. Energy 73 (3) (2002) 151–157.

[47] P. Ineichen, Comparison of eight clear sky broadband models against 16 independent data banks, Sol. Energy 80 (4) (2006) 468–478.

[48] C.A. Gueymard, S.M. Wilcox, Assessment of spatial and temporal variability in the US solar resource from radiometric measurements and predictions from models using ground-based or satellite data, Sol. Energy 85 (5) (2011) 1068–1084.

[49] J. Remund, L. Wald, M. Lefevre, T. Ranchin, J. Page, Worldwide Linke turbidity information, in: ISES Solar World Congress 2003, Vol. 400, International Solar Energy Society (ISES), 2003, 13 p.

[50] B.D. Ripley, Pattern Recognition and Neural Networks, first ed., Cambridge University Press, 1996.

[51] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, Wiley, 2001.

[52] G.H. Yordanov, O. Midtgård, T.O. Saetre, H.K. Nielsen, L.E. Norum, Over-irradiance (cloud enhancement) events at high latitudes, IEEE J. Photovolt. 3 (5) (2013) 271–277.

[53] J. Luoma, J. Kleissl, K. Murray, Optimal inverter sizing considering cloud enhancement, Sol. Energy 86 (1) (2012) 421–429.

[54] R. Tapakis, A.G. Charalambides, Equipment and methodologies for cloud detection and classification: a review, Sol. Energy 95 (0) (2013) 392–430.

[55] Q. Li, W. Lu, J. Yang, A hybrid thresholding algorithm for cloud detection on ground-based color images, J. Atmos. Ocean. Technol. 28 (2011) 1286–1296.

[56] A. Khosravi, S. Nahavandi, D. Creighton, Construction of optimal prediction intervals for load forecasting problems, Power Syst. IEEE Trans. 25 (3) (2010) 1496–1503.

[57] G.W. Brier, Verification of forecasts expressed in terms of probability, Mon. Weather Rev. 78 (1) (1950) 1–3.

[58] H. Hersbach, Decomposition of the continuous ranked probability score for ensemble prediction systems, Weather Forecast. 15 (5) (2000) 559–570.

[59] J.L. Anderson, A method for producing and evaluating probabilistic forecasts from ensemble model integrations, J. Clim. 9 (7) (1996) 1518–1530.

[60] T.M. Hamill, Interpretation of rank histograms for verifying ensemble forecasts, Mon. Weather Rev. 129 (3) (2001) 550–560.

[61] F.A. Eckel, M.K. Walters, Calibrated probabilistic quantitative precipitation forecasts based on the MRF ensemble, Weather Forecast. 13 (1998) 1132–1147.

[62] A. Kaur, H.T.C. Pedro, C.F.M. Coimbra, Ensemble re-forecasting methods for enhanced power load prediction, Energy Convers. Manag. 80 (2014) 582–590.