
IEEE INTERNET OF THINGS JOURNAL, VOL. 5, NO. 6, DECEMBER 2018 4589

Prediction of Frost Events Using Machine Learning and IoT Sensing Devices

Ana Laura Diedrichs, Facundo Bromberg, Diego Dujovne, Keoma Brun-Laguna, and Thomas Watteyne, Senior Member, IEEE

Abstract—Internet of Things (IoT) applications in agriculture have evolved to solve several relevant problems for producers. Here, we describe a component of an IoT-enabled frost prediction system. We follow current approaches for prediction that use machine learning algorithms trained on past readings of temperature and humidity sensors to predict future temperatures. However, contrary to current approaches, we assume that the surrounding thermodynamical conditions are informative for prediction. For that, a model was developed for each location, including in its training the sensor readings of all other locations and autonomously selecting the most relevant ones (algorithm dependent). We evaluated our approach by training regression and classification models using several machine learning algorithms, many already proposed in the literature for the frost prediction problem, over data from five meteorological stations spread across the Mendoza Province of Argentina. Given the scarcity of frost events, the data was augmented using the synthetic minority oversampling technique (SMOTE). The experimental results show that selecting the most relevant neighbors and training the models with SMOTE reduces the prediction errors of both regression predictors for all five locations, increases the performance of random forest classification predictors for four locations while keeping it unchanged for the remaining one, and produces inconclusive results for the logistic regression predictor. These results demonstrate the main claim of this work: that thermodynamic information of neighboring locations is informative for improving both regression and classification predictions. They are also good enough to suggest that the present

Manuscript received November 30, 2017; revised August 14, 2018; accepted August 18, 2018. Date of publication August 27, 2018; date of current version January 16, 2019. This work was supported in part by the STIC-AmSud Program through the PEACH Project under Grant 16STIC-08, in part by the Aprendizaje automático aplicado a problemas de Visión Computacional through Universidad Tecnológica Nacional (UTN), Argentina, under Grant EIUTIME0003601TC, in part by the Predicción localizada de heladas en la provincia de Mendoza mediante técnicas de aprendizaje de máquinas y redes de sensores through UTN under Grant PID UTN 25/J077, and in part by Peach: Predicción de hEladas en un contexto de Agricultura de precisión usando maCHine learning through UTN under Grant EIUTNME0004623. (Corresponding author: Ana Laura Diedrichs.)

A. L. Diedrichs is with GridTICs, Dpto. de Electrónica, Universidad Tecnológica Nacional, Mendoza 5500, Argentina, with the Laboratorio de Inteligencia Artificial DHARMa, Dpto. de Sistemas de la Información, Universidad Tecnológica Nacional, Mendoza 5500, Argentina, and also with CONICET, Mendoza 5500, Argentina (e-mail: [email protected]).

F. Bromberg is with the Laboratorio de Inteligencia Artificial DHARMa, Dpto. de Sistemas de la Información, Universidad Tecnológica Nacional, Mendoza 5500, Argentina, and also with CONICET, Mendoza 5500, Argentina (e-mail: [email protected]).

D. Dujovne is with the Escuela de Informática y Telecomunicaciones, Universidad Diego Portales, Santiago 8370190, Chile (e-mail: [email protected]).

K. Brun-Laguna and T. Watteyne are with the EVA Team, Inria, 75012 Paris, France (e-mail: [email protected]; [email protected]).

Digital Object Identifier 10.1109/JIOT.2018.2867333

approach is a valid and useful resource for decision makers and producers.

Index Terms—Bayesian networks (BNs), frost prediction, machine learning, precision agriculture, random forest (RF), synthetic minority oversampling technique (SMOTE).

I. INTRODUCTION

IN MENDOZA, Argentina, one of the most relevant wine production regions in Latin America [1], [2], frost events resulted in a loss of 85% of the peach production during 2013, and affected more than 35 thousand hectares of vineyards. Furthermore, research conducted by the Karlsruhe Institute of Technology [3] warns that vineyards in Mendoza and San Juan (Argentina) represent the highest-risk regions in the world for extreme weather and natural hazards, mainly due to wide diurnal temperature variation, over 20 °C in winter, as Antivilo et al. describe in [4]. This reality quantifies one of the effects a frost event can generate, but the socio-economic consequences hit not only the producers but also transport, commerce, and general services, which take long periods to recover. Plants and fruits suffer from frost events as a consequence of water icing inside the internal tissues present in the trunk, branches, leaves, flowers, and fruits. However, water content and distribution differ among them, generating different damage levels. The most sensitive parts are leaves and fruits. Leaves provide the photosynthesis surface, while fruits collect nutrients and water from the plant. Individual damage levels can be assessed by studying the effects of freezing those parts under controlled conditions, but an integral view of the plant is necessary to measure the economic loss at the end of the harvest period.

Frost events are difficult to predict given that they are a localized phenomenon. Frost can result in partial damage in different areas of the same crop field, yet it has the capacity to destroy the entire production in a matter of hours: even if the damage is not visible just after the event, the effects can surface at the end of the season, reducing both the quantity and the quality of the harvest.

There are several countermeasures for frost events, which include air heaters burning gas, petrol, or other fuels, moving the air with large fans distributed along the field, or turning on sprinklers. However, each of these countermeasures is expensive each time it is used. As a consequence, it is critical to predict frost events with the highest possible accuracy, so as to initiate the countermeasures at the right time, reducing the possibility of false negatives (a frost event

2327-4662 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


was not predicted and it happened) or false positives (a frost event was predicted and did not happen). In the first case, the production or part of it may be lost. In the second case, the burned fuel will be wasted. Both situations lead to reduced yield or complete production loss.

Given the small number of frost events during the year, the available data is too scarce to build an accurate forecasting system, which defines an unbalanced dataset problem. The more data machine learning models have, the better their accuracy can become. This is a relevant problem in regions where the meteorological record is not continuous or has a short history. In these cases, there is too little data to build a predictive model with high accuracy and/or precision.

This paper is part of an Internet of Things (IoT)-based system aimed at predicting frost events in peach orchards, as described in the work of Watteyne et al. [5]. This system is composed of three stages. In the first stage, historical meteorological data is gathered from a number of selected Internet-enabled weather stations. In the second stage, the data collected from the weather stations is used to train a frost-prediction engine. Finally, in the third stage, local field data is collected from a network of IoT sensors and fed into the model generated by the prediction engine to provide a frost forecast output.

In this paper, we describe the second stage of the IoT frost prediction system. Instead of using traditional formula-based predictions, which are general for wide regions, we propose a different approach: we use machine learning algorithms for regression based on Bayesian networks (BNs) and random forest (RF), and for classification based on RF, logistic regression, and binary trees, all run over a balanced training set augmented with new samples produced using the synthetic minority oversampling technique (SMOTE) [6]. This technique increases the rate of frost detection (sensitivity).
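As a rough illustration of the oversampling idea (not the authors' implementation, which relies on existing SMOTE tooling), the following pure-Python sketch generates synthetic minority samples by interpolating between a frost-day sample and one of its nearest frost-day neighbors. The `smote` helper, the toy `frost_rows` data, and all numbers are hypothetical.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples, SMOTE-style:
    each new point is a random interpolation between a minority
    sample and one of its k nearest minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbors of x within the minority class (excluding x)
        neighbors = sorted((p for p in minority if p is not x),
                           key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(xi + gap * (ni - xi)
                               for xi, ni in zip(x, nb)))
    return synthetic

# Toy minority class: (min_temperature, min_humidity) rows of frost days
frost_rows = [(-1.2, 40.0), (-0.5, 35.0), (-2.0, 50.0), (-0.8, 42.0)]
new_rows = smote(frost_rows, n_new=8)
```

Because each synthetic point lies on a segment between two real minority samples, the augmented set stays inside the region occupied by observed frost days rather than duplicating them exactly.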

Traditional approaches provide analytical solutions which rely on several calibration parameters from the field and offer low prediction accuracy. Although machine learning was the natural evolution of such solutions, existing solutions from the literature did not provide enough accuracy for producers to make well-founded decisions. The novelty of this paper is twofold. First, we provide IoT-based machine learning models which combine data from nearby IoT-enabled meteorological stations, sharply improving frost prediction performance. Second, the model provided is amenable to integration into an IoT system within the field to provide localized frost prediction, which greatly increases its value for producers, enabling them to take action and mitigate frost events more efficiently.

We chose these machine learning algorithms because they are widely used for decision support applications. Logistic regression and binary trees were chosen for comparison against recent competitors [7], whereas RF was chosen because it is resistant to overfitting and has demonstrated very good performance in classification problems. However, RF is a black-box model, which means that it is difficult for end-users to understand how the model makes decisions. In contrast, BNs provide a complete framework for inference

and decision making by modeling cause-consequence relationships between the variables. A nice property of a BN is that it can be queried: we can ask for the probability of an event given the current evidence (sensor values). In real IoT applications, a sensor may break or lose its connection with the gateway, forcing the network to deal with the problem of processing results with incomplete information to make decisions. BNs are also one of the most commonly used methods for modeling uncertainty.

We are interested in evaluating the possibility of building good predictors using only temperature and relative humidity variables. These sensors are very common in most IoT platforms and data loggers used for environmental measurements. There are situations where sensor networks cannot access a gateway or central server, so we want to know not only whether temperature and humidity values are enough to obtain an accurate predictor, but also whether the sensor's neighbors are informative (or not) for the prediction, as a first step toward prototyping an in-network frost forecast.

Finally, we build a regression model to predict the minimum temperature for the following day using historical information from previous days, including a set of variables such as temperature and humidity from the site itself and from neighboring sites. It is important to have an accurate prediction at least 24 h ahead because of the many logistic issues farmers must resolve to apply countermeasures against frost (gasoline stock for heaters, irrigation permits to feed sprinklers, and temporary employee schedules). The rest of this paper is organized as follows. Section II locates this paper within the IoT-based frost prediction system. In Section III, we discuss previous works on daily minimum temperature (frost) prediction. We introduce BNs in Section IV-A, RF in Section IV-B, and logistic regression in Section IV-C. In Section V, we describe our scenario of interest and the datasets used to train the models. Then, we explain our experimental setup in Section VI, followed by the results in Section VII. Finally, we share our findings and future work in Section VIII.

II. THE IOT FROST PREDICTION SYSTEM

The IoT frost prediction system, depicted in Fig. 1, is composed of three integrated stages.

1) A group of Internet-enabled weather data collection devices composed of weather stations, which are used first to provide historical data to the prediction engine and second to generate new data to continuously improve the prediction model.

2) A prediction engine based on the techniques and results described in detail in this paper, which is trained using the data gathered from the weather stations.

3) A frost forecast module based on the model obtained from the prediction engine. This module obtains its data from an IoT-based sensor network distinct from the weather stations. The IoT-based sensor network first provides intrafield data from strategically selected locations as input to the model, and second is used to calibrate temperatures on the field against those captured by remote sensing devices.


Fig. 1. IoT-based frost prediction system.

III. RELATED WORK

Current frost detection methods can be classified by the data processing they use to generate the forecast: 1) empirical; 2) numerical simulation; and 3) machine learning.

A. Empirical Methods

Empirical methods are based on the use of algebraic formulas derived from graphical statistical analysis of a number of selected parameters. The result is the minimal expected temperature, as in the work of Brunt [8], which is applied in [9], and the work of Snyder and Davis [10]. A complete review of classical frost prediction methods can be found in [11], where the common pattern among them is the estimation of the minimal temperature during the night. Furthermore, Burgos [11] highlights the works of Smith [12] and Young [13], comparing their minimal-temperature prediction accuracy. As a matter of fact, the United States National Weather Service has extensively used Young's equation, with specific calibration to local conditions and time of the year, for frost forecasting.

The Allen method, created in 1957, is still recommended by the Food and Agriculture Organization of the United Nations to predict frost events. This formula requires the dry- and wet-bulb temperatures at 3 P.M. of the current day as an estimation of relative humidity and dew point, together with atmospheric pressure and temperature.

All the former models must be adapted to local conditions by calculating a number of constants that characterize each geographical location. The result is the prediction of the minimal temperature for the current night only. A number of these formulas suffer restrictions since they apply only to radiative (temperature-based) frost events.

B. Numerical Simulation Methods

Numerical simulations are widely used to predict weather behavior. Prabha and Hoogenboom [14] have shown the use of weather research and forecasting (WRF) models for the study of two specific frost events in Georgia, USA. The authors used the Advanced Research WRF model with a 1 km resolution, scaled to the region of interest, with a set of initial values, land use characteristics, soil data, physical parametrization, and a specific topographic map resolution. The resulting model obtains accuracies between 80% and 100% and a root mean square error (RMSE) between 1.5 and 4, depending on the use case.

Wen et al. [15] also base their study on WRF; however, the authors integrate as inputs a number of weather observations from the MODIS database, composed of multispectral satellite images. Wen et al. highlight that the model improves when they include local model observations. This model predicts caloric balance flows, such as net radiation, latent heat, sensible heat, and soil heat flow.

Although this is a valuable modeling tool, numerical simulations and empirical formulas require a number of measurements and parameters which are not always available to the producer, such as solar radiation and soil humidity at different depths.

C. Machine Learning Methods

There have been several pioneering efforts to apply machine learning techniques to frost prediction [9], [16]–[18]. However, newer approaches have taken advantage of the evolution of machine learning techniques and massive data processing facilities to obtain higher accuracy in their results.

Maqsood et al. [19] provided a 24-h weather prediction for southern Saskatchewan, Canada, creating seasonal models. The authors used multilayered perceptron networks (MPLNs), Elman recurrent neural networks, radial basis function networks, and Hopfield models, all trained with temperature, relative humidity, and wind speed data.

Another example of machine learning applied to frost prediction is the work of Ghielmi and Eccel [20]. In that paper, the authors build a minimal temperature prediction engine in northern Italy. The aim of their paper is to predict spring frost events, using temperature at dawn, relative humidity, soil temperature, and night duration from weather stations. Ghielmi and Eccel consider input data from six sources to an MPLN and compare its behavior against Brunt's model and those of other authors.

Eccel et al. [9] have also studied minimal temperature prediction in the Italian Alps using numerical models combined with linear and multiple regression, artificial neural networks, and RF. The most relevant finding of this publication is the ability of the RF method to provide the most accurate frost event prediction.

Ovando et al. [21] and Verdes et al. [22] built frost prediction systems based on time series of temperature-correlated thermodynamic variables, such as dew point, relative humidity, wind speed and direction, and cloud cover, among others, using neural networks.

Lee et al. [7] used logistic regression and decision trees to estimate the minimal temperature from eight weather variables for each station in South Korea, for frost events between 1973 and 2007, with the following results: average recall between 75% and 80% and an average false alarm rate between 22% and 28%.

We can observe that the currently proposed machine learning-based methods for frost prediction concentrate on the use of a single weather station to provide input to the model, without considering variables from other neighboring weather stations. All the former proposals have used long periods of captured data for training purposes, ranging from 8 to 30 years, highlighting the local nature of the frost phenomenon. It is also noticeable from the literature that the most relevant parameters found as inputs to the models are temperature and relative humidity.

IV. MACHINE LEARNING METHODS

A. Bayesian Networks

The work of Aguilera et al. [23] on BNs for environmental modeling stresses the benefits of using BNs for inference, knowledge discovery, and decision making applications. A BN is a type of probabilistic graphical model whose set of random variables and the conditional independences among them can be represented as a directed acyclic graph (DAG), whose set of nodes V = {X1, X2, ..., XM} represents the random variables, and each directed edge represents a direct, i.e., nonmediated, probabilistic influence between the target variable and the originating variable, referred to as its parent. This results in each random variable being independent of all its nondescendants given its parents, a property known as the Markov property. If we denote by πXi the set of parents of variable Xi, the Markov property results in the following factorization of the joint probability distribution:

P(X1, ..., XM) = ∏_{i=1}^{M} P(Xi | πXi).

This factorization is the main advantage of BNs, as in practice it can result in drastic dimensionality reductions, from M down to the cardinality of the largest parent set.
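The factorization above can be made concrete with a toy discrete network. The paper itself uses a Gaussian BN; this sketch, with hypothetical variables and probabilities, only illustrates how the factorization reduces the number of free parameters.

```python
from itertools import product

# Toy discrete BN with DAG:  FrostYesterday -> HighHumidity -> FrostToday
# (hypothetical variables and numbers, chosen only to illustrate
# P(X1, ..., XM) = prod_i P(Xi | parents(Xi))).
p_fy = {0: 0.9, 1: 0.1}                # P(FrostYesterday)
p_h = {0: {0: 0.7, 1: 0.3},           # P(HighHumidity | FrostYesterday)
       1: {0: 0.4, 1: 0.6}}
p_ft = {0: {0: 0.95, 1: 0.05},        # P(FrostToday | HighHumidity)
        1: {0: 0.8, 1: 0.2}}

def joint(fy, h, ft):
    # Chain factorization: only 1 + 2 + 2 = 5 free parameters
    # instead of 2**3 - 1 = 7 for the full joint table.
    return p_fy[fy] * p_h[fy][h] * p_ft[h][ft]

# The factored product is a proper distribution: it sums to 1.
total = sum(joint(*xs) for xs in product((0, 1), repeat=3))
```

The saving grows quickly with M: a full joint over M binary variables needs 2^M − 1 parameters, while a sparse DAG needs only one conditional table per node.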

BNs can be built autonomously using structure learning algorithms (SLAs) that elicit both the network structure and its numerical parameters entirely from input data. In this paper, we take the score-based approach to structure learning, which, as its name implies, assigns a score to each candidate BN structure to evaluate how well it fits the data, and then searches the space of DAGs for a structure with maximal score using a heuristic search algorithm. Typical score measures are some variation of the likelihood, defined as the probability of the (observed) data D given a BN consisting of a DAG M and numerical parameters θ, i.e., L(θ) = P(D | θ, M) = P(x1, x2, ..., xm | θ). Greedy search algorithms [such as hill-climbing (HC) or tabu search] are a common choice for the search procedure, but almost any kind of search procedure can be used.

We propose to model a state-based Gaussian BN which, on the one hand, models the local distributions of the factorization as normal distributions linked by linear constraints and, on the other, represents the state of each variable at discrete time intervals, resulting in a series of time slices, each indicating the value of every variable at time t.

In our experiments, we considered different learning scenarios, including HC [24] and tabu search (Tabu) for the search procedure, both provided by the R packages [25], [26], as well as maximum likelihood for fitting the parameters of the Gaussian BN, using the regression coefficients of each variable against its parents. As scores, we considered the following.

1) The multivariate Gaussian log-likelihood score.
2) The corresponding Akaike information criterion score.
3) The corresponding Bayesian information criterion score.
4) A score-equivalent Gaussian posterior density.

B. RF

RF [27] is a machine learning method that, like BNs, can be applied to both regression and classification problems. The RF algorithm is a very well-known ensemble learning method which involves the creation of multiple decision tree models. Each tree is built as follows [28].

1) Build the training set for RF by sampling the training cases at random with replacement from the original data. About one-third of the cases are left out of the sample. This out-of-bag data is used to get a running unbiased estimate of the classification error as trees are added to the forest. It is also used to get estimates of variable importance.

2) If there are M input variables, a number m ≪ M is specified such that at each node, m variables are selected at random out of the M, and the best split on these m is used to split the node (deciding the parent node and leaves). The value of m, also known as mtry, is held constant while the forest grows.

3) The best split over the m variables is calculated using the Gini importance criterion.

4) Each tree is grown until there are no more m variables to add to the tree. The algorithm continues until a constant number ntree of trees has been created. No pruning is performed.

The RF algorithm can also be used to select the most relevant features from the training dataset by evaluating the Gini impurity criterion at the nodes (variables).
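As a rough sketch of the split criterion used in steps 2) and 3), the following hypothetical example scores a candidate split by weighted Gini impurity; the best split at a node is the one minimizing this value. All names and numbers are illustrative, not the authors' implementation.

```python
def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_impurity(xs, ys, threshold):
    """Weighted Gini impurity after splitting on x <= threshold;
    candidate splits with lower values separate the classes better."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Hypothetical node: previous-day minimum temperature vs. frost label
temps = [-3.0, -1.0, 0.5, 2.0, 4.0, 6.0]
frost = [1, 1, 1, 0, 0, 0]
```

Here `split_impurity(temps, frost, 1.0)` is zero because the threshold separates the two classes perfectly; a forest accumulates such impurity decreases per variable to rank feature importance.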

C. Logistic Regression

Binary logistic regression is a generalized linear model with the Bernoulli distribution, which is a particular case of the binomial distribution where n = 1. We define frost events as a binary variable (Y = 0 for no frost occurrence and Y = 1 for a frost event), so the logistic regression equation for the frost event probability p(xi) can be defined as

logit[p(xi)] = β0 + β′xi (1)

where β′ is the coefficient vector corresponding to the independent variables xi, with i = 1, ..., N, N is the number of variables, and β0 is the intercept of the linear term. The β′ and β0 parameters can be calculated using maximum likelihood estimation. The probability of a frost event, Y = 1, is defined as

P(Y = 1) = exp(β0 + β′xi) / (1 + exp(β0 + β′xi)). (2)
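Equation (2) can be evaluated directly once the coefficients are fitted. The sketch below uses hypothetical coefficients purely for illustration; the function and numbers are not from the paper.

```python
import math

def frost_probability(x, beta0, beta):
    """P(Y = 1) from Eq. (2): the logistic function applied to
    the linear predictor beta0 + beta . x."""
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return math.exp(z) / (1.0 + math.exp(z))

# Hypothetical coefficients for two predictors
# (previous-day minimum temperature in deg C, relative humidity in %):
# colder previous days should raise the frost probability.
beta0, beta = 0.5, [-0.8, -0.01]
p = frost_probability([2.0, 60.0], beta0, beta)
```

A classification is then obtained by thresholding `p`, typically at 0.5, with the frost event as the positive class.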

V. OUR DATASETS

Fig. 2. Map of the DACC's stations located in Mendoza, Argentina.

We worked with data from the Dirección de Agricultura y Contingencias Climáticas (DACC) [29] of Mendoza, Argentina. DACC provided data from five meteorological stations located in Mendoza Province, as depicted in Fig. 2 and listed below.

1) Junín (33°6′57.5″ S, 68°29′4″ W).
2) Agua Amarga (33°30′57.7″ S, 69°12′27″ W).
3) La Llave (34°38′51.7″ S, 68°00′57.6″ W).
4) Las Paredes (34°31′35.7″ S, 68°25′42.8″ W).
5) Tunuyán (33°33′48.8″ S, 69°01′11.7″ W).

For each location, we have data from temperature and relative humidity sensors spanning the period from 2001 to 2016. Our dataset uses this information to record, for each day, the average, minimum and maximum of both temperature and humidity, resulting in six variables per location per sampled day.
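A minimal sketch of this aggregation step, assuming hypothetical hourly readings and pandas rather than the authors' R tooling (column names are ours):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical hourly sensor readings for one station:
# temperature in degrees C, relative humidity in %.
idx = pd.date_range("2016-07-01", periods=96, freq="h")
raw = pd.DataFrame({
    "temperature": 5 + 8 * np.sin(np.linspace(0, 8 * np.pi, 96)),
    "humidity": rng.uniform(30, 90, size=96),
}, index=idx)

# Collapse to the six variables per location per day used in the paper:
# min, mean (average) and max of both temperature and humidity.
daily = raw.resample("D").agg(["min", "mean", "max"])
daily.columns = ["_".join(c) for c in daily.columns]
print(daily.shape)  # 4 days x 6 variables
```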

Our datasets reflect the type of prediction intended, where the learned model should accurately predict the minimum temperature for the next day using information from previous days. Each dataset is therefore built from lagged variables, with a lag of T = 1, 2, 3, 4 previous days. This results in each labeled data point containing T times six variables corresponding to the T previous days, plus, for the training dataset only, the label variable reporting the next day's minimum temperature at the location of the station. For the classification models, we discretized the label variable as frost or no frost, defining a frost event as a minimum temperature below 0 °C, and consider the frost event the positive class.

As explained before, we considered two cases: one in which we take only the variables of the station itself, referred to as local or config-local, and one in which we consider the variables of all other stations as well, referred to as all or config-all. To illustrate, we present some example rows for different cases of the training dataset. In the examples, we denote by x^t_{i,j} the value of the jth variable, with j = 1, . . . , 6, of the ith station, with i = 1, . . . , 5, of t days in the past. For the case of station i = 2, T = 1 and config-local, an example row of the training dataset would contain [x^1_{2,1}, x^1_{2,2}, x^1_{2,3}, x^1_{2,4}, x^1_{2,5}, x^1_{2,6}, Y_2], where Y_i is the minimum temperature for the next day at station i. For the case of station i = 2, T = 1 and config-all, an example row would contain [x^1_{1,1}, x^1_{1,2}, . . . , x^1_{1,6}, x^1_{2,1}, x^1_{2,2}, . . . , x^1_{2,6}, Y_2]. Finally, for the case of i = 2, T = 2 and config-local, the example row would contain [x^1_{2,1}, x^1_{2,2}, . . . , x^1_{2,6}, x^2_{2,1}, x^2_{2,2}, . . . , x^2_{2,6}, Y_2].
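The lagged-variable construction can be sketched as follows, here for a single variable, one station and T = 2 (a simplification of config-local; the column names are ours):

```python
import pandas as pd

# 'tmin' stands in for the six per-day variables; the label Y is the
# minimum temperature of the day being predicted.
df = pd.DataFrame({"tmin": [3.0, -1.0, 0.5, 4.0, -2.0, 1.0]},
                  index=pd.date_range("2016-07-01", periods=6))

T = 2
lagged = pd.DataFrame({
    f"tmin_lag{t}": df["tmin"].shift(t) for t in range(1, T + 1)
})
lagged["Y"] = df["tmin"]           # label: the day's minimum temperature
lagged = lagged.dropna()           # the first T days lack full history

# For classification, discretize the label: frost iff below 0 degrees C.
lagged["frost"] = (lagged["Y"] < 0).astype(int)
print(lagged.shape)
```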

VI. EXPERIMENT SETUP

We conducted several experiments to validate our central claim that measurements at nearby locations can help improve the prediction of frost events. For that, we compare our approach over several classification and regression algorithms. We considered regression models that predict the minimum temperature for the next day, namely BN and RF. We train the BN models using HC and Tabu from the bnlearn R package [25], [26] with their default values for all the selected scores. To contrast our approach with previous works [7], we train classification models using binary trees (recursive partitioning (Rpart) [30], [31] and C5.0 without boosting [32]), logistic regression, and RF [33]. C5.0 incorporates boosting and a feature selection approach (winnowing); to adapt it to a binary tree similar to the one presented in [7], we did not train C5.0 with boosting, and for Rpart we chose an entropy split.

In order to train the regression and classification models, we split the dataset into training and test sets. The training set is used by the algorithms to fit the model parameters to the data, while the test set is used to validate the performance of the models under unseen conditions. In our experiments, we used the first 68% of the dataset for the training phase and the rest for testing purposes.
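The chronological split can be sketched as:

```python
def chronological_split(data, train_frac=0.68):
    """Split time-ordered data: the first 68% for training, the rest for
    testing. No shuffling, so the test set stays strictly in the future
    of the training set, as in the setup described above."""
    cut = int(len(data) * train_frac)
    return data[:cut], data[cut:]

train, test = chronological_split(list(range(100)))
print(len(train), len(test))  # 68 32
```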

To tackle the scarcity of frost events, we evaluated not only the original datasets but also datasets with an augmented number of frost events, obtained using the SMOTE resampling methodology [34], [35]. SMOTE combines minority-class over-sampling with majority-class under-sampling, resulting in a dataset with the same number of data points. In our experiments, we chose a three-fold over-sampling of the minority class, which triples the number of frost events.
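A minimal sketch of the SMOTE interpolation idea [6] on a toy set of frost rows (temperature, humidity); the paper itself relies on the DMwR/caret implementations [34], [35], which also under-sample the majority class.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE sketch: each synthetic frost example is a random
    interpolation between a minority sample and one of its k nearest
    minority neighbors. Not the full implementation used in the paper."""
    rng = np.random.default_rng(rng)
    synthetic = np.empty((n_new, X_min.shape[1]))
    for s in range(n_new):
        i = rng.integers(len(X_min))
        # k nearest minority neighbors of sample i (excluding itself)
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]
        j = rng.choice(neighbors)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic[s] = X_min[i] + u * (X_min[j] - X_min[i])
    return synthetic

# Triple the number of frost events, as in the experiments (3x oversampling).
frost = np.array([[-1.0, 80.0], [-2.5, 75.0], [-0.5, 85.0], [-3.0, 70.0]])
extra = smote_oversample(frost, n_new=2 * len(frost), rng=0)
augmented = np.vstack([frost, extra])
print(augmented.shape)  # (12, 2): three times the original 4 frost rows
```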

The method used to select the optimal values of the user-given parameters in the training phase is cross-validation on a rolling basis, better known as time-series cross-validation [36]. The ranges of values used for tuning each parameter are listed in Table I. This method consists of K training and testing events over the same dataset, reported through the average of the performance measurement over all cases. The kth case, k = 1, . . . , K, consists of training over the first ((k − 1) × horizon) + windowSize data points of the original dataset, and testing over the following horizon data points. Note that the data points tested in one fold are then included in the training data of the next fold. In our experiments we used windowSize = 300 and horizon = 100. We implemented this procedure using the caret R package [35].

TABLE I
TUNING PARAMETERS FOR CLASSIFICATION AND REGRESSION MODELS

TABLE II
CONFUSION MATRIX FOR A BINARY CLASSIFIER
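The fold layout just described can be sketched as index ranges (a sketch of the scheme, not caret's implementation):

```python
def rolling_origin_folds(n, window_size=300, horizon=100):
    """Index ranges for time-series cross-validation on a rolling basis:
    fold k trains on the first (k - 1) * horizon + window_size points and
    tests on the following horizon points, as described in the text."""
    folds = []
    k = 1
    while True:
        train_end = (k - 1) * horizon + window_size
        test_end = train_end + horizon
        if test_end > n:
            break
        folds.append((range(0, train_end), range(train_end, test_end)))
        k += 1
    return folds

folds = rolling_origin_folds(n=600)
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```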

VII. RESULTS

The experimental results consist of a number of different performance measures computed over the test set, which differ for classification and regression models. For regression models, we report the mean absolute error (MAE) and the RMSE. If Yreal denotes the actual minimum temperature as reported in the test set, and Ypred denotes the value predicted by the model, the metrics are defined as follows.

1) RMSE = √[ (1/n) ∑_{t=1}^{n} (Ypred − Yreal)² ].

2) MAE = (1/n) ∑_{t=1}^{n} |Ypred − Yreal|.
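Both error measures translate directly to code (the example values are illustrative only):

```python
import numpy as np

def mae(y_pred, y_real):
    """Mean absolute error: (1/n) * sum of |Ypred - Yreal|."""
    return np.mean(np.abs(y_pred - y_real))

def rmse(y_pred, y_real):
    """Root mean squared error: sqrt of the mean squared difference."""
    return np.sqrt(np.mean((y_pred - y_real) ** 2))

y_real = np.array([-1.0, 2.0, 0.5])  # observed minimum temperatures (C)
y_pred = np.array([0.0, 1.0, 0.5])   # predicted minimum temperatures (C)
print(mae(y_pred, y_real), rmse(y_pred, y_real))
```

Note that RMSE penalizes large individual errors more heavily than MAE, which is why both are reported.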

For classification models, all the performance measurements considered are computed over a data structure that summarizes the prediction quality of classification problems, called the confusion matrix. Table II shows the schema of a confusion matrix for a binary classifier. Based only on the four quantities defined, namely true and false positives, denoted TP and FP, respectively, and true and false negatives, denoted TN and FN, respectively, we can compute the following metrics.

1) Sensitivity: TP/(TP + FN), also known as true positive rate, probability of detection, or recall. Higher values of sensitivity indicate a good predictor of the positive class.

2) Precision: TP/(TP + FP) reflects how accurate the predictor is for the positive class. The higher this value, the lower the chance of false positives.

3) F1-Score: F1 = (2 · precision · sensitivity)/(precision + sensitivity), which represents a balance between precision and sensitivity.
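The three metrics follow directly from the confusion-matrix counts of Table II; the example counts below are hypothetical:

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity (recall), precision and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, precision, f1

# Example: 30 frost days correctly flagged, 10 missed, 20 false alarms.
sens, prec, f1 = classification_metrics(tp=30, fp=20, tn=940, fn=10)
print(sens, prec, f1)
```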

Fig. 3. Performance of classification algorithms.

Fig. 4. Regression results: local versus all.

Figs. 3–7 synthesize all the results we obtained from the experiments.

Fig. 4 shows the results for MAE (top) and RMSE (bottom) for the two regression models: BNs (left) and RF (right). As shown, the config-all case (labeled all, and shown in the lower box) presents the best performance in all cases, with lower values for both errors, both algorithms, and all locations, demonstrating the benefit of adding information from neighboring stations. These figures also show that, for both the config-all and local configurations, the MAE and RMSE are below 3 °C, a small value when compared to the large thermal variability in Mendoza during winter days, which can reach up to 20 °C [4].

Fig. 3 shows a comparison of the F1-scores of all four classification algorithms. As observed, RF and logistic regression outperform both C5.0 and Rpart in all cases. Thus, to simplify the following analysis, we decided to filter the latter two out.

Fig. 5. RF: SMOTE versus normal.

Fig. 6. Logistic regression: SMOTE versus normal.

Fig. 7. RF and logistic regression models: local versus all.

A known effect of SMOTE on the performance of classification algorithms is to increase sensitivity while reducing precision, a pattern our data follows, as shown in Figs. 5 and 6 for RF and logistic regression, respectively: the SMOTE results (top box) are clearly higher for sensitivity (left column) at all five locations, while for precision (right column) they are clearly lower.

The F1-score results for the logistic regression and RF models are shown in Fig. 7, comparing the config-all (top box) and local (bottom box) cases. We can see that the behavior differs between stations (Figs. 3 and 7), due to the fact that they are located far apart, in different microclimate zones: Agua Amarga and Tunuyán are in the Uco Valley, Junín is in the East Valley, and Las Paredes and La Llave are in the South Valley. We can observe that the Tunuyán station presents less variability in its F1 results. Therefore, adding information from neighbors does not improve the F1 metric at every station. For RF, config-all increases the average F1 for Tunuyán, Las Paredes, Junín, and Agua Amarga. Agua Amarga improves its F1 score with config-all for both models, RF and logistic regression, with RF performing better. In the case of logistic regression, Junín and Tunuyán slightly increase the average F1 by using config-local. Although our recall results are similar to or better than those of the previous approach by Lee et al. [7], where the recall value is between 0.7 and 0.8, there are two main differences between the studies: first, the work of Lee et al. uses a longer period, 40 years of weather data, for training; and second, the two scenarios are geographically different. Finally, we observe that farmers would prefer a higher recall value in the presence of lower precision, because the cost of crop losses may exceed the price of frost protection, even including false positives. In this case, a combination of config-all and SMOTE could help to build a better-fitted forecast engine.

VIII. CONCLUSION

In this paper, we have created a forecasting engine, part of an IoT-enabled frost prediction system, which gathers environmental data to predict frost events using machine learning techniques. We have shown that our prediction capability outperforms current proposals in terms of sensitivity, precision, and F1.

In particular, the application of SMOTE during the training phase has shown improved performance in terms of recall for both the RF and logistic regression models.

We have also observed that, in specific relevant cases, the inclusion of neighbor information helps to improve the precision or recall of the classification model. Regression models, in turn, show lower error when neighbor information is included. In these cases, including the spatial relationships yields an improvement in model performance. We hope to contrast this approach with other scenarios in the future.

ACKNOWLEDGMENT

The authors would like to thank the DACC for sharing datawith them to make this paper possible.

REFERENCES

[1] L. Saieg. (Nov. 2016). Casi 35 Mil Hectáreas Afectadas Por Heladas. Accessed: Nov. 20, 2016. [Online]. Available: http://www.losandes.com.ar/article/casi-35-mil-hectareas-de-vid-afectadas-por-heladas

[2] A. Barnes. (May 2017). El Nino Hampers Argentina's 2016 Wine Harvest. Accessed: May 23, 2017. [Online]. Available: http://www.decanter.com/wine-news/el-nino-argentina-2016-wine-harvest-305057/

[3] M. Lehné. (Apr. 2017). Winemakers Lose Every Year Millions of Dollars Due to Natural Disasters. Accessed: Apr. 26, 2017. [Online]. Available: http://www.kit.edu/kit/english/pi_2017_051_winemakers-lose-billions-of-dollars-every-year-due-to-natural-disasters.php

[4] F. G. Antivilo et al., "Thermal history parameters drive changes in physiology and cold hardiness of young grapevine plants during winter," Agric. Forest Meteorol., vol. 262, pp. 227–236, Nov. 2018.

[5] T. Watteyne et al., "PEACH: Predicting frost events in peach orchards using IoT technology," EAI Endorsed Trans. Internet Things, vol. 2, no. 5, Dec. 2016. [Online]. Available: http://eudl.eu/doi/10.4108/eai.1-12-2016.151711, doi: 10.4108/eai.1-12-2016.151711.

[6] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "SMOTE: Synthetic minority over-sampling technique," J. Artif. Intell. Res., vol. 16, no. 1, pp. 321–357, 2002.

[7] H. Lee, J. A. Chun, H.-H. Han, and S. Kim, "Prediction of frost occurrences using statistical modeling approaches," Adv. Meteorol., vol. 2016, Mar. 2016, Art. no. 2075186. [Online]. Available: https://doi.org/10.1155/2016/2075186

[8] D. Brunt, Physical and Dynamical Meteorology. Cambridge, U.K.: Cambridge Univ. Press, 2011.

[9] E. Eccel et al., "Prediction of minimum temperatures in an alpine region by linear and non-linear post-processing of meteorological models," Nonlin. Processes Geophys., vol. 14, no. 3, pp. 211–222, 2007.

[10] R. L. Snyder and C. Davis, Principles of Frost Protection Long Version–Quick Answer FP005, Univ. California at Davis, Davis, CA, USA, 2000.

[11] J. J. Burgos, Las Heladas En La Argentina, Ministerio de Agricultura, Ganadería y Pesca, Presidencia de la Nación, 2011.

[12] J. W. Smith, Predicting Minimum Temperatures From Hygrometric Data. Washington, DC, USA: Govt. Printing Office, 1920.

[13] F. D. Young, "Forecasting minimum temperatures in Oregon and California," Monthly Weather Rev., vol. 16, pp. 53–60, 1920.

[14] T. Prabha and G. Hoogenboom, "Evaluation of the weather research and forecasting model for two frost events," Comput. Electron. Agric., vol. 64, no. 2, pp. 234–247, 2008.

[15] X. Wen, S. Lu, and J. Jin, "Integrating remote sensing data with WRF for improved simulations of oasis effects on local weather processes over an arid region in Northwestern China," J. Hydrometeorol., vol. 13, no. 2, pp. 573–587, 2012.

[16] R. J. Kuligowski and A. P. Barros, "Localized precipitation forecasts from a numerical weather prediction model using artificial neural networks," Weather Forecasting, vol. 13, no. 4, pp. 1194–1204, 1998.

[17] I. Maqsood, M. R. Khan, and A. Abraham, "Intelligent weather monitoring systems using connectionist models," Neural Parallel Sci. Comput., vol. 10, no. 2, pp. 157–178, 2002.

[18] I. Maqsood, M. R. Khan, and A. Abraham, "Neurocomputing based Canadian weather analysis," in Proc. 2nd Int. Workshop Intell. Syst. Design Appl., Atlanta, GA, USA, 2002, pp. 39–44.

[19] I. Maqsood, M. R. Khan, and A. Abraham, "An ensemble of neural networks for weather forecasting," Neural Comput. Appl., vol. 13, no. 2, pp. 112–122, 2004.

[20] L. Ghielmi and E. Eccel, "Descriptive models and artificial neural networks for spring frost prediction in an agricultural mountain area," Comput. Electron. Agric., vol. 54, no. 2, pp. 101–114, 2006.

[21] G. Ovando, M. Bocco, and S. Sayago, "Redes neuronales para modelar predicción de heladas," Agricultura Técnica, vol. 65, no. 1, pp. 65–73, 2005.

[22] P. F. Verdes, P. M. Granitto, H. Navone, and H. A. Ceccatto, "Frost prediction with machine learning techniques," in Proc. VI Congreso Argentino de Ciencias de la Computación, 2000, pp. 1423–1433.

[23] P. A. Aguilera, A. Fernández, R. Fernández, R. Rumí, and A. Salmerón, "Bayesian networks in environmental modelling," Environ. Model. Softw., vol. 26, no. 12, pp. 1376–1388, 2011.

[24] D. Margaritis, "Learning Bayesian network model structure from data," School Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA, Rep. TR-CMU-CS-03-153, 2003.

[25] M. Scutari, "bnlearn: Bayesian network structure learning, parameter learning and inference, R package version 4.2," 2015.

[26] M. Scutari, "Learning Bayesian networks with the bnlearn R package," J. Stat. Softw., vol. 35, no. 3, pp. 1–22, 2010.

[27] L. Breiman, "Random forests," Mach. Learn., vol. 45, no. 1, pp. 5–32, 2001.

[28] L. Breiman and A. Cutler. (Nov. 2017). Random Forests Leo Breiman and Adele Cutler Website. Accessed: Nov. 29, 2017. [Online]. Available: https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

[29] Gobierno de Mendoza. (Apr. 2017). Dirección de Agricultura y Contingencias Climáticas. Accessed: Nov. 27, 2017. [Online]. Available: http://www.contingencias.mendoza.gov.ar

[30] T. M. Therneau and E. J. Atkinson, "An introduction to recursive partitioning using the RPART routines," Rpart Package Vignettes, 2018.

[31] T. Therneau, B. Atkinson, and B. Ripley. (2018). Rpart: Recursive Partitioning. R Package Version 4.1-13. [Online]. Available: http://cran.r-project.org/web/packages/rpart/index.html

[32] M. Kuhn, S. Weston, N. Coulter, and R. Quinlan. (2014). C50: C5.0 Decision Trees and Rule-Based Models, R Package Version 0.1.2. [Online]. Available: https://CRAN.R-project.org/package=C50

[33] L. Breiman, A. Cutler, A. Liaw, and M. Wiener, "randomForest: Breiman and Cutler's random forests for classification and regression, R package version 4.6-12," 2015.

[34] L. Torgo, "DMwR: Functions and data for data mining with R, version 0.4.1," 2010.

[35] M. Kuhn, J. Wing, and S. Weston, "Package 'caret,'" 2015.

[36] C. Bergmeir and J. M. Benítez, "On the use of cross-validation for time series predictor evaluation," Inf. Sci., vol. 191, pp. 192–213, May 2012.

Ana Laura Diedrichs received the Information System Engineering degree from the Facultad Regional Mendoza, Universidad Tecnológica Nacional (FRM UTN), Mendoza, Argentina, in 2009, where she is currently pursuing the Ph.D. degree under the supervision of Dr. F. Bromberg and Dr. D. Dujovne, focusing on "Prediction of Frost Location Using Machine Learning and Internet of Things (IoT)."

During her studies, she applied for a DAAD-UTN Scholarship and was an exchange student with the Technische Universität Carolo-Wilhelmina zu Braunschweig, Braunschweig, Germany, taking part in the "WISEBED–Future Internet" project from 2008 to 2009. She worked with IT companies involved in software development in Mendoza. Since 2012, she has been lecturing on teleinformatics/IoT (Electronic Engineering Department) and machine learning (Information Systems Department) as a Teacher Assistant with FRM UTN. She took part in the PEACH Project. She is an active member of the GridTICs and DHARMa Lab, UTN, Mendoza. She is the founder and an organizer of the R-Ladies Mendoza Chapter. Her current research interests include machine learning and IoT.

Facundo Bromberg received the undergraduate degree in physics from the Instituto Balseiro, Universidad Nacional de Cuyo, Bariloche, Argentina, in 1998, and the Ph.D. degree from the Computer Science Department, Iowa State University, Ames, IA, USA, in 2007.

For the last ten years, he has been a full-time Professor with the Universidad Tecnológica Nacional, Mendoza, Argentina, founding and developing the DHARMa Research Lab by successfully advancing a wide range of research projects in the artificial intelligence and machine learning disciplines, both academic and for the industry. His current research interests include computer vision applied mainly to viticulture and biomechanics, machine learning applications, and probabilistic graphical models.

Diego Dujovne received the electronic engineering degree from the Universidad Nacional de Córdoba (UNC), Córdoba, Argentina, in 1999, and the Ph.D. degree from the Université de Nice Sophia Antipolis, Nice, France, in 2009.

He developed his Ph.D. project (Equipe-Projet PLANETE) with INRIA Sophia Antipolis, Valbonne, France. He developed several university-industry collaboration projects on instrumentation and communications with LIADE, UNC, over five years. He is currently a full-time academic with the Universidad Diego Portales, Santiago, Chile. His current research interests include IPv6 over LLN routing and scheduling, deterministic networking, Internet of Things (IoT) applications and architecture, machine learning applications for IoT, and wireless experimental platform development and measurement methodologies.


Keoma Brun-Laguna received the M.Sc. degree in networking from UPMC (Paris 6), Paris, France, in 2015. He is currently pursuing the Ph.D. degree at the Inria-Paris Research Center, Paris, involved with research on deterministic networking for the industrial IoT under the supervision of Prof. T. Watteyne.

He has participated in the deployment of four real-world wireless sensor networks, from sensor placement to the back-end system.

Thomas Watteyne (GS'06–M'09–SM'15) received the M.Eng. degree in telecommunications, the M.Sc. degree in networking, and the Ph.D. degree in computer science from INSA Lyon, Villeurbanne, France, in 2005, 2005, and 2008, respectively.

He was a Post-Doctoral Research Lead with Prof. K. Pister's team, University of California at Berkeley, Berkeley, CA, USA. He founded and co-leads Berkeley's OpenWSN project, an open-source initiative to promote the use of fully standards-based protocol stacks for the Internet of Things (IoT). From 2005 to 2008, he was a Research Engineer with France Telecom, Orange Labs, Paris, France. He is an insatiable enthusiast of low-power wireless mesh technologies. He holds an advanced research position with the EVA Research Team, Inria, Paris, where he designs, models, and builds networking solutions based on a variety of IoT standards. He is a Senior Networking Design Engineer with the Dust Networks Product Group, Analog Devices, the undisputed leader in supplying low-power wireless mesh networks for demanding industrial process automation applications. Since 2013, he has co-chaired the IETF 6TiSCH Working Group, which standardizes how to use IEEE802.15.4e TSCH in IPv6-enabled mesh networks. He is a member of the IETF IoT Directorate.