Machine Learning based Context-predictive Car-to-cloud ...€¦ · network. In this paper, we bring together mobility prediction and machine learning based channel quality estimation

Machine Learning based Context-predictiveCar-to-cloud Communication Using Multi-layerConnectivity Maps for Upcoming 5G Networks

Benjamin Sliwa1, Robert Falkenberg1, Thomas Liebig2, Johannes Pillmann1 and Christian Wietfeld11Communication Networks Institute, 2Department of Computer Science VIII

TU Dortmund University, 44227 Dortmund, Germanye-mail: {Benjamin.Sliwa, Robert.Falkenberg, Thomas.Liebig, Johannes.Pillmann, Christian.Wietfeld}@tu-dortmund.de

Abstract—While cars were only considered as means ofpersonal transportation for a long time, they are currentlytranscending to mobile sensor nodes that gather highly up-to-date information for crowdsensing-enabled big data servicesin a smart city context. Consequently, upcoming 5G commu-nication networks will be confronted with massive increasesin Machine-type Communication (MTC) and require resource-efficient transmission methods in order to optimize the overallsystem performance and provide interference-free coexistencewith human data traffic that is using the same public cellularnetwork. In this paper, we bring together mobility predictionand machine learning based channel quality estimation in orderto improve the resource-efficiency of car-to-cloud data transferby scheduling the transmission time of the sensor data withrespect to the anticipated behavior of the communication context.In a comprehensive field evaluation campaign, we evaluatethe proposed context-predictive approach in a public cellularnetwork scenario where it is able to increase the average datarate by up to 194% while simultaneously reducing the meanuplink power consumption by up to 54%.

I. INTRODUCTION

Massive MTC is one of the three foundations of upcoming5G cellular networks. In the vehicular context, cars willbe exploited as mobile sensor nodes that gather data forcrowdsensing-based cloud-applications like intelligent trafficforecast and distributed weather prognosis [1]. With the re-sulting increase of Machine-to-machine (M2M) communica-tion, resource-efficient data transmission methods are highlydemanded in order to satisfy the application demands and min-imize the interference with Human-to-human (H2H) networktraffic. A promising technique to address these challengesis the integration of the transmission context in the actualtransmission process in order to exploit favorable channel con-ditions for transmitting data in a more efficient way [2] withrespect to different performance indicators such as data rate,energy-efficiency and transmission duration. In earlier work,we presented the probabilistic transmissions schemes Channel-aware Transmission (CAT) and predictive CAT (pCAT) [3]that exploit knowledge about the Signal-to-noise-ratio (SNR)in order to leverage connectivity hotspots for achieving reliabletransmissions of vehicular sensor data. In this paper, we moveanother step further and transit from context-awareness tocontext prediction. The pCAT scheme is extended analyticallyusing a multi-metric approach allowing the joint considerationof multiple network quality indicators. Data obtained fromprevious measurements is exploited to construct connectivitymaps (see Fig. 1) that are used as a priori information aboutthe channel environment. With this knowledge and the use ofmobility prediction, vehicles schedule their data transmissions

with respect to the anticipated development of the radiochannel conditions. Furthermore, the achievable data rate ispredicted with machine learning methods and used itself asa transmission metric for the extended pCAT scheme. Thepaper is structured as follows: after discussing the relatedwork, we present the analytical model of the extended pCAT aswell as multiple mobility prediction schemes. Afterwards, thesetup for the empirical evaluation is described. Finally, detailedresults of the transmission schemes are presented for multipleparameterizations and application scenarios. Furthermore, theachieved accuracy for mobility prediction schemes as well asthe transmission metrics is discussed. In order to achieve ahigh level of transparency and reproducibility, the source codeof the developed measurement application as well as the rawdata obtained from the experimental evaluation is provided inan Open Source way.

0

300300

200200

5

100100

00

10

15

CQ

I

-14-12-10-8-6-4

RS

RQ

[d

B]

-120

-100

-80

RS

RP

[dB

m]

-10

0

10

20

30

SN

R [

dB

]S

treet

Layer

Predicted

Channel Context

Measured

Channel Context

Current

Position

Predicted

Position

Cell y Cell x

Fig. 1. Multi-layer connectivity maps as an enabler for anticipatory commu-nication. With knowledge about the spacial dependencies of the connectivityindicators being available as a priori information, the measured channelcontext can be set into relation to its predicted future state.

II. RELATED WORKThe provision of broadband services for emerging smart

city applications and Intelligent Transportation Systems (ITS)crucially depends on the real-time data exchange over a so-phisticated network infrastructure. In order to ensure a reliablecommunication, even though the device density and traffic loadcontinuously increase, many researchers integrate versatileapproaches of data analysis into the communication systemsand evolve reactive networks into intelligent and anticipatingcompound structures [2]. Multi-layer approaches, such as [4],provide frameworks to aggregate crowd-sensing data aboutroad traffic, traffic lights and network conditions to guidevehicles to appropriate access networks in order to deliverservices more efficiently, e.g., broadband infotainment. Otherapproaches utilize context information from the network forsmart-parking services and road-traffic prediction to performa context-aware dynamic vehicle routing [5].

Furthermore, the radio resource management of the Ra-dio Access Network (RAN) requires anticipatory methodsfor proper handling of heterogeneous traffic types to satisfyparticular needs, e.g., latency and throughput, in a large scale.Optimizing the mixture of H2H and MTC in 5G cellular net-works is addressed in [6]. The main challenge is dealing withmassive MTC without degrading the performance of H2H,however, still maintaining the required degree of stability forMTC. A biologically inspired approach, which models MTCand H2H as populations of predators and prey, respectively,provides a stable equilibrium for both applications. A detaileddata-driven analysis of anticipatory networking strategies isgiven in [7]. The authors provide an optimization frameworkto allow network operators to save up to 50 % or resources byusing either centralized or distributed prediction approaches intheir base stations. In contrast to that, the proposed machinelearning based extension for pCAT works decentralized on theclient side and does not involve additional downlink commu-nication for coordinating uplink transmissions. Therefore, itcan be easily integrated into existing measurement applicationswithout requiring changes in the network infrastructure.While in theory the resulting throughput of a data transmissionis the result of a deterministic process, it is a challengingtask to predict the resulting value proactively in a live-systemdue to a wide range of hidden and dynamic variables (e.g.packet loss, channel stability, transport-layer interdependenciesand available spectral resources), especially in the presence ofmobility. In the recent years, great advancements have beenachieved by bringing together engineering expertise for dataacquisition and identification of relevant features and machinelearning based data analysis [8]. In [9] the authors utilizemachine learning for high-accuracy data rate predictions,based on passive connectivity indicators in conjunction withthe estimated activity of other participants in the same cell. Thedata is obtained by blind-decoding the Long Term Evolution(LTE) Physical Downlink Control Channel (PDCCH) in realtime, which provides the User Equipment (UE) a globaloverview of the current cell load and the momentary degreeof competition for radio resources. Other approaches exploitconnectivity maps for an optimized utilization of spectralresources, especially for vehicular applications [10]. Suchmaps can be obtained from sparse sampling the environmentand performing a kernel-based reconstruction of coveragemaps [11], which can be downloaded to the mobile devicefor client-based decision-making.

RS

RP

RS

RQ

SN

RC

QICh

an

ne

lC

on

tex

t Passive ProbingLTE Downlink

Pa

ylo

ad

Da

tara

te

Active ProbingHTTP POST

Ve

loc

ity

Po

sitio

nD

ire

ctio

nT

raje

cto

ry

Mo

bil

ity

Co

nte

xt Navigation System

Sy

ste

mP

ara

me

ters

OfflineTraining

Online Datarate Prediction

MobilityPrediction

Trained Model Future Position

Context-aware Transmission Scheme

Transmission Probability

Metric Current Datarate Future Datarate

ConnectivityMap

Future ChannelContext

AP

rio

riIn

form

ati

on Crowdsensing

Data (cell-wisemean)

RS

RP

RS

RQ

SN

RC

QI

only used for predictive CAT

only used for goodput-prediction metric

Feature - known during the application phaseLabel - hidden during the application phase

Le

arn

ing

Se

tup ANN

SVMM5T

LR

Model

Fig. 2. Architecture model of the proposed transmission scheme. Differentcontext layers are brought together with a priori information about the radiochannel obtained from previous measurements. The dashed connections areonly required during the training phase of the selected machine learningmodel.

III. SOLUTION APPROACH

The overall architecture model of the proposal that operateson the application layer is illustrated in Fig. 2. Obtained sensordata is buffered locally until a transmission decision is madefor the whole buffer. The decision itself is based upon currentlymeasured information and predicted data obtained from theconnectivity map and can either use a single context parameteror machine learning based data rate prediction as a metric.Fig. 3 provides an example that compares naive periodic datatransfer to the proposed transmission scheme.In the first step, a mechanism to predict the achievable datarate S(t) based on the measured channel context parameters isderived by training different machine learning models (Artifi-cial Neural Network (ANN), Support Vector Machine (SVM),M5 Decision Tree (M5T) and Linear Regression (LR)) (seeSec. III-1). The channel context is formed by the reportedvalues of the passive LTE downlink indicators ReferenceSignal Received Power (RSRP), Reference Signal ReceivedQuality (RSRQ), SNR, Channel Quality Indicator (CQI) aswell as the payload size of the data packet, which represent thefeatures of the machine learning process. During the trainingphase, the achieved data rate of active Hypertext TransferProtocol (HTTP) POST transmissions is monitored and usedas the prediction label.

0 50 100 150 200 250 300 350 400

Time [s]

-10

0

10

20

30

40

SN

R [dB

]

High

Medium

LowPeriodic

pCAT

Connectivity Valleys

Connectivity Hotspots

Predictionhorizon

1

Fig. 3. Example temporal behavior of the different transmission schemes.While naive periodic transmissions are executed regardless of the channelquality, the proposed pCAT determines a transmission probability that takesthe current and predicted channel quality into account. At 1 , pCAT transmitsdata at medium network quality as it expects the vehicle to pass a connectivityvalley in the near future.

0 2 4 6 8 10 12 14 16 18 20

Predicted Goodput [MBit/s]

0

2

4

6

8

10

12

14

16

18

20M

ea

su

red

Go

od

pu

t [M

Bit/s

]

Underestimation

Overestimation

Fig. 4. Accuracy of the data rate prediction with the M5 decision tree basedon [12]. The blue dots illustrate the behavior of an optimal prediction as areference. The black curve shows the 0.95 confidence-interval of the meanvalue.

For obtaining knowledge about the future channel context fromthe connectivity map, vehicles need to first determine theirfuture position, which is then used to find the matching cellentry of the map that contains the aggregated context informa-tion. For this purpose, mobility prediction (cf. Sec. III-A) isapplied to estimate the future position ~P (t+ τ) for a definedprediction horizon τ , based on the current mobility parametersand optional trajectory knowledge.The online data rate prediction is then performed for the mea-sured channel quality parameters as well as for the predictedchannel context parameters. For this purpose, the machinelearning model with the highest achieved prediction accuracyis applied. With the availability of estimations about thecurrently achievable data rate S(t) and the future data rateS(t + τ), the context-aware transmission (cf. Sec. III-B) isthen performed based on a defined metric Φ, which computesa transmission probability for the current time t. The processis periodically evaluated with a fixed update interval tp.If no valid location information is available or if one of theprediction steps fails, the pure probabilistic basic CAT schemeis used as a fallback mechanism that is only considering thecurrently measurable context information. In the followingsubsections, the individual components and their relevantparameters are described in details.

1) Machine Learning based Data Rate Prediction: In [12],we have compared the accuracy of the data rate prediction formultiple machine learning models (ANN, SVM, M5T, LR).In the following, only the best performing model (M5T) isapplied in the further evaluations. Fig. 4 shows the behaviorof the prediction accuracy with Mean Absolute Error (MAE)= 1.33 MBit/s and Root Mean Square Error (RMSE) = 1.81MBit/s. The blue curve represents a perfect prediction, wherethe predicted values are equal to the achieved measured values.While there is a two-sided error range, only the overestimationarea (lower triangle of the plot) is considered harmful for thedata transmission as underestimations (upper triangle of theplot) even result in a higher gain in the data rate than expected.The great advantage of using machine learning for channel

quality assessment is that it does not only consider the explic-itly defined features RSRP, RSRQ, SNR, CQI, payload sizeand velocity but also implicitly integrates hidden parametersinto the evaluation process. With considering the payload size,the time-stability of the communication channel and cross-layer dependencies such as the slow start of the TransmissionControl Protocol (TCP) and the payload-overhead-ratio havean impact on the estimated data rate. Similarly, the RSRQvalue contains hidden information about the available cellresources as well as the interference situation.

A. Mobility PredictionFor the estimation of the future channel context, knowledge

about the future location is a basic requirement. Therefore,different approaches for predicting the future vehicle position~P (t+ τ) for a defined prediction horizon τ are applied whichdiffer in the utilized sensor information. For simplicity, thefollowing calculations are presented in a cartesian coordinatesystem. For using raw Global Positioning System (GPS)coordinates, all calculations have to be performed in theorthodromic dimension [13].

1) GPS-based Extrapolation: The simplest prediction ap-proach utilizes the available information from the GPS moduleof the UE and uses the current values for position ~P (t),angular direction λ and the velocity v to extrapolate the futurelocation with Eq. 1

~P (t+ τ) = ~P (t) +

(sin(Π

2 ) · cos(λ·Π180

)sin(Π

2 ) · sin(λ·Π180

)) · τ · v (1)

Since the prediction relies of the assumption of a staticdirection for the duration of τ , it is severely impacted by theturns the vehicle performs on its route, especially in urbanenvironments with many road junctions.

2) Leveraging Trajectory-knowledge from the NavigationSystem: Precise predictions of the vehicle’s future locationcan be achieved by integrating knowledge about the plannedtrajectory into the prediction process. Even if this informationis not explicitly available, it may be derived exploiting the reg-ularities of human behavior [14], since most human trajectoriescan be predicted with a high accuracy as people usually followdaily routines and visit the same places regularly. A single tripis described by its starting point and destination (e.g. the wayfrom home to work). It is assumed that multiple traces for thesame trip are already available and that the network qualityhas been monitored continuously and was used to build theconnectivity map along the track (cf. Sec. IV-A). The meantrajectory over all traces for the same trip is then computedbased on the algorithm proposed in [15]. With this information,the predicted position is computed with an iterative process.At first, the distance potential D = v · τ is determined.Afterwards, the distance di,j between consecutive waypoints~Wi and ~Wj=i+1 on the trajectory T is computed and thetravelled distance D is incremented by di,j . Once D exceedsD, the final position is computed by interpolation with Eq. 2.

~P (t+ τ) = ~Wi +~Wj − ~Wi

|| ~Wj − ~Wi||·(D − D − di,j

)(2)

For the lookup of the context entry from the connectivity map,the position is converted to a map index m = b ~P (t+τ)

c c witha defined cell size c that describes the aggregation area.

3) Prediction based on a Reference Trace: A simplerapproach is to declare a single trip as a reference and thenapply the same prediction mechanism as in Sec. III-A2. Incontrast to the other approaches, the measured values arenot grouped in cell-structures but mapped to discrete points(with respect to the context sampling frequency fcontext) onthe geographical track. The channel quality indicators areobtained from the reference measurement and not from theconnectivity map. While this approach is easy to implement, itdoes not perform any cell-wise data aggregation. In Sec. V-Athe different approaches are compared with respect to theirachievable prediction accuracy for the individual indicators.

B. Probabilistic Context-predictive Transmission of VehicularSensor Data

Once the future vehicle location is known and the predictedcontext information has been obtained from the connectivitymap, the probabilistic transmission scheme is applied. In orderto allow the combination of multiple metrics and provide amechanism for comparing metrics that have different valueranges, a generic metric Φ is introduced, which is definedby a maximum value Φmax and a minimum value Φmin.Furthermore, the metric exponent α controls how much theresulting transmission probability should be depending on highmetric values.

In the first step, the measured metric value Φ(t) is trans-formed into the normed current metric value Θ(t) with Eq.3.

Θ(t) =Φ(t)− ΦminΦmax − Φmin

(3)

With Φ(t + τ) describing the future metric value obtainedfrom the connectivity map, the anticipated gain ∆Φ(t) is thencomputed using Eq. 4.

∆Φ(t) = Φ(t+ τ)− Φ(t) (4)

Finally, the transmission probability pΦ(t) is computed withrespect to the defined timeout values tmin and tmax that guar-antee a minimum required packet size and a maximum allowedbuffering delay with Eq. 5. The pCAT-specific exponent z (c.f.Eq. 6) is used to increase the transmission probability if thechannel quality decreases and to delay transmissions if thechannel quality improves.

pΦ(t) =

{0 : ∆t ≤ tminΘ(t)α·z : tmin < ∆t < tmax1 : ∆t > tmax

(5)

0 5 10 15 20 25 300

0.2

0.4

0.6

0.8

1

Tra

nsm

issio

n P

robabili

ty

Channel gets worse

Channel gets better

Fig. 5. Analytical behavior of the probabilistic pCAT transmission scheme.Transmissions are performed with a higher probability if the channel qualitydecreases (∆Φ ≤ 0) and are delayed if the anticipated channel quality is high(∆Φ > 0).

z =

{max (|∆Φ(t) · (1−Θ(t)) · γ| , 1) : ∆Φ > 0

(max (|∆Φ(t) ·Θ(t) · γ| , 1))−1

: ∆Φ ≤ 0(6)

If the connectivity map does not contain an entry for theestimated position ~P (t + τ), the non-predictive CAT schemeas described in [12] is applied as a fallback mechanism. Fig. 5shows the behavior of the analytical pCAT model for differentvalues of ∆Φ. For ∆Φ = 0, the transmission schemes behavesequally to simple non-predictive CAT.

IV. SETUP OF THE EMPIRICAL EVALUATION

In this section, the setup for the empirical evaluation cam-paign and the methodology of the data analysis are presented.

The default parameters for the evaluation scenario aredefined in Tab. I and the parametrization of the consideredtransmission metrics is shown in Tab. II. A virtual sensorapplication generates 50 kByte of data per second and thetransmission decision for the whole local data buffer is per-formed second-wise.

TABLE IPARAMETERS OF THE REFERENCE SCENARIO

Parameter ValueSensor frequency fsensor 1 HzSensor payload size ssensor 50 kByteTransmission probability up-date interval tp

1 s

tmin 10 stmax 120 sPrediction horizon τ {10, 30, 60} sConnectivity map cell size c 25 m2

Context sampling frequencyfcontext

1 Hz

TABLE IIPARAMETERS FOR THE CONSIDERED METRICS

ΦRSRP ΦRSRQ ΦSNR ΦCQI ΦM5T

min -120dBm

-11 dB 0 dB 2 0 MBit/s

max -70 dBm -4 dB 30 dB 16 {15, 18}MBit/s

α 8 8 8 8 8γ 0.3 2.14 0.5 1.07 1

A. Evaluation ScenarioThe map of the real-world scenario used for the perfor-

mance evaluation is shown in Fig. 6, consisting of a highwaytrack and a suburban track. For each parametrization of thetransmission schemes, the performance was evaluated 5 timeson each of the tracks. Overall, more than 6000 HTTP POSTtransmissions were performed within a total driven distance ofmore than 2000 km. The actual transmissions were performedwith a Samsung Galaxy S5 Neo (Model SM-G903F) smart-phone in a public cellular network with a developed Android-based measurement application that is available in an OpenSource manner1. Furthermore, the raw data obtained from themeasurements is provided at [16].

The channel quality data to build the connectivity mapand the mobility data for the trajectory-based prediction is

1Available at https://github.com/BenSliwa/MTCApp

A 45

Track 2: Highway14 km : 130 km/h

Track 1: Suburban9 km : 50-70 km/h

Fig. 6. Overview of the 23km long test route containing different streettypes and speed limitations. (Map: c©OpenStreetMap contributors, CC BY-SA.). Source: Adapted from [12].

based on measurements from 90 previous drive tests that wereperformed in the same scenario in [12].

The values for the weighting factor γ have to be chosenwith respect to the metric’s value range and its granularity.For allowing a fair comparison among the different metrics, γhas been chosen with respect to the definition of ΦSNR metricand it’s respective value range with Eq. 7.

Φi,γ = ΦSNR,γ ·ΦSNR,max − ΦSNR,min

Φi,max − Φi,min(7)

B. Analysis of the Uplink Power Consumption with theContext-aware Power Consumption Model (CoPoMo)

The accurate determination of the isolated power consump-tion caused by the uplink data transmissions is a challengingtask and is most likely performed in a laboratory environmentand not within a mobile setup. However, for estimating theuplink power consumption in the highly-mobile scenario, theCoPoMo can be applied that has been proposed and validatedin [17]. The model takes the different states of the poweramplifier into account and maps the transmission power todiscrete power states. Since those states are highly specificfor the actual used device, the device characteristics need tobe obtained using a laboratory measurement setup. Fig. 7shows the device characteristics for the measurement deviceat 800 MHz and 2600 MHz. It can be seen that both curvescan be approximated by two linear functions that are separatedby the parameter γ. In the following, the model is appliedto estimate the overall power consumption for the performeduplink transmissions.

V. RESULTS OF THE EXPERIMENTAL PERFORMANCEEVALUATION

In this section, the results of experimental performanceevaluation are presented. At first, the prediction accuracyof relevant performance indicators is evaluated. Afterwards,a performance comparison is provided for the consideredcontext-predictive transmissions schemes that use either a sin-gle indicator or the machine learning based data rate predictionas a metric.

-10 -5 0 5 10 15 20 25

Transmission Power [dBm]

0.5

1

1.5

2

Po

we

r C

on

su

mp

tio

n [

W]

low high max

800 MHz

2600 MHz

Fig. 7. CoPoMo device characteristics for the measurement device obtainedfor laboratory measurements. The colored boxes above the plot illustrate theCoPoMo states with respect to the transmission power.

A. Evaluation of the Prediction Accuracy using ConnectivityMaps

Since the transmission schemes base their decisions on thepredicted context information, the ability to obtain accuratepredictions is crucial for the overall system performance. Mul-tiple errors are system-immanent and should be minimized:

• Positioning inaccuracy of GPS, especially in the presenceof obstacles and urban canyons.

• Non-static influences like traffic signals and obstructionscaused by other traffic participants are not considered bythe mobility prediction.

• Aggregation loss due to cell-wise aggregation of mea-sured data within the connectivity maps, especially forindicators that are highly influenced by dynamic effectslike multipath-fading and interference.

Fig. 8 shows the achieved prediction accuracy for the con-sidered indicators with different prediction schemes and pre-diction horizons. It is obvious that the integration of trajectoryinformation from the navigation system reduces the magnitudeof the position prediction error significantly. The pure GPS-based extrapolation suffers severely from direction changesof the vehicle. For the speed-dependency of the predictionerror, three different regions can be identified. Up to about 70km/h (urban/suburban road traffic 1 ), the error is proportionalto the velocity. For higher velocities, the road characteristicschange due to the highway scenario 2 and the direction ofroad segments is constant for longer time periods, resulting ina decrease phase of the error. However, for velocities above 90km/h 3 , the higher prediction distance becomes the dominanterror source again. The achievable accuracy has to be regardedin relation with the prediction success ratio since the forecast isuseless for the actual transmission decision if the connectivitymap does not contain context information for the falselypredicted cell. Therefore, in the following considerations, theGPS-based prediction is not evaluated anymore, as its predic-tion failure ratio is unacceptable high. While the RSRP andthe RSRQ only show slight differences between the trajectory-based and the reference-track-based prediction, this does notapply to the SNR and the CQI that have a high dependencyto the dynamics of the channel environment. Consequently,both values achieve an aggregation gain through the cell-wiseaveraging by the connectivity maps.

0 20 40 60 80 100 120

Velocity [km/h]

0

100

200

300

400

500

600

700

Me

an

Ab

so

lute

Dis

tan

ce

Err

or

[m]

= 60s

= 30s

= 10s

1 2 3

ExtrapolationReference traceTrajectory-based

Significant accuracy improvementby integrating trajectory knowledge

Low velocity dependency oftrajectory-aware approaches

0 20 40 60 80 100 120

Velocity [km/h]

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Pre

dic

tio

n F

ailu

re R

atio

Trajectory-aware approaches enable reliablecontext-lookup from the connectivity map

Pure GPS-based extrapolation has unacceptable highfailure ratio and is not considered further

0 20 40 60 80 100 120

Velocity [km/h]

2

3

4

5

6

7

8

9

10

11

12

Mean

Absolu

te R

SR

PE

rror

[dB

] High similarity of single tracemeasurements and cell-wise averaging

0 20 40 60 80 100 120

Velocity [km/h]

0.8

1

1.2

1.4

1.6

1.8

2

2.2

Mean

Absolu

te R

SR

Q E

rror

[dB

]

0 20 40 60 80 100 120

Velocity [km/h]

4

5

6

7

8

9

10

Mean

Absolu

te S

NR

Err

or

[dB

]

0 20 40 60 80 100 120

Velocity [km/h]

3

4

5

6

7

8

9

Mean

Absolu

te C

QI E

rror

[dB

]

High aggregation gain bycell-wise averaging

Fig. 8. Prediction accuracy for the vehicle’s mobility and the passive downlink indicators of the LTE signal for the considered prediction methods withdifferent predictions horizons in relationship to the vehicle’s velocity. All curves show the 0.95 confidence interval of the mean value.

B. Performance Evaluation of the Proposed TransmissionScheme with Different Metrics

Fig. 9 provides a comparison of the resulting data rate, ageof information and uplink power consumption for differentvariants of the proposed pCAT that use a single downlinkindicator as the transmission metric. The highest data rateand energy-efficiency gains are achieved for the SNR- andthe RSRP-based approaches. While the regular CAT has topay for the increased data rate with an increased age ofinformation, the trade-off between data rate and introduceddelay is significantly better for the predictive scheme. Never-theless, the results have to be set into relation to the actualapplication requirements. While crowdsensing-based servicesrely on up-to-date information for analysis and prognosis,often, real-time capabilities are not required and the data ageis allowed to be in the range of several minutes. Fig. 10

provides a performance comparison for the machine learningbased data rate prediction metric ΦM5T with different valuesfor Φmax and τ . For Φmax = 18 MBit/s and t = 30 s, thehighest performance increase is achieved with data rate gainof 194% and reduction of the mean uplink power consumptionby 54%. Interestingly, the pCAT approach is able to achievehigh throughput values even for higher values of τ , where thecontext prediction is significantly distorted by uncertainties ofthe mobility prediction (cf. Fig. 8). While the computation ofthe transmission probability only evaluates the context for twodiscrete points of time t and t + τ , the computation itself isperformed second-wise with tp. Therefore, it behaves similarto a moving window with the car itself moving forward onthe trajectory. Consequently, the transmission scheme is muchmore influenced by the proability of the car experiencing thepredicted context within any point of time in the remainingtime until tmax is reached than to have a perfect prediction

Transmission Scheme

5

6

7

8

9

10

11

12

13

14

15

Mean G

oodput per

Drive [M

Bit/s

]

Suburban Highway

Periodic SNR SNR RSRP RSRQ CQICAT Periodic SNR SNR RSRP RSRQ CQICAT

Probabilistic transmission scheme showslower performance for metrics with

non-continuous value ranges

Significant gain bycontext-prediction

for the same metric

6.49 MBit/s

+ 94 %

3.77 MBit/s

+ 55 %

Non-predictivereferences

Transmission Scheme

10

20

30

40

50

60

Mean

Age o

f In

form

ation [s]

Suburban Highway


All predictive metrics outperformthe periodic scheme on highways

Significantly better trade-off between data rateand introduced delay through context prediction

Transmission Scheme

60

80

100

120

140

160

180

200

220

240

260

Mean U

plin

k P

ow

er

Consum

ption [m

W]


Suburban Highway

Less timeout triggered transmissionson highway scenarios due to

frequently changing channel conditions

Gain achieved byintegrating context prediciton

Fig. 9. Performance comparison of the proposed transmission scheme with different single indicator metrics for a fixed prediction horizon τ = 10 s. Theresults for periodic transmission and non-context-predictive SNR-based CAT are shown as reference.

6

8

10

12

14

16

18

20

22

Mean G

oodput per

Drive [M

Bit/s

]

Suburban Highway

13.87 MBit/s+ 194 %

11.1 MBit/s+ 162 %

Transmission Scheme

Periodic 0 10 30 60 0 10 30 60Periodic 0 10 30 60 0 10 30 60

= 15 MBit/s = 18 MBit/s = 15 MBit/s = 18 MBit/s

Significant improvement of the absolute data rate as wellas the lower bound by using valueshigh prediction horizon

15

20

25

30

35

40

45

Me

an

Ag

e o

f In

form

atio

n [

s]

Suburban Highway


Transmission Scheme

predictive CATHigher transmission frequencydue to proactive detectionof connectivity valleys

Regular CATTransmissions areoften forced by


Baseline of periodic approach is not reachedM5T has a clear focus on data rate optimization

Transmission Scheme

50

100

150

200

250

Mean U

plin

k P

ow

er

Consum

ption [m

W]

Suburban Highway



63 mW

- 54 %

Similar performance of all machinelearning based schemes on highways

92 mW

- 61 %

Enhanced awareness of connectivityvalleys reduces occurences of

energy-intense long transmissions

Fig. 10. Performance comparison of the proposed transmission scheme with machine learning based data rate prediction using the ΦM5T metric. The resultsfor periodic transmission and non-context-predictive M5T-based CAT (τ = 0 s) are shown as reference.

for the actual time point.Since the machine learning based approach integrates thepayload size of the data packet into the prediction process,it implicitly considers effects that have a strong dependencyto the packet size and the resulting transmission durationinto the actual data transmission decision. In particular, thepayload-overhead-ratio and the slow start of TCP benefit fromlarger packets, whereas the time stability of the communicationchannel can be elaborated with a higher significance for shortertransmission durations.

VI. CONCLUSIONIn this paper, we presented a machine learning enabled ap-

proach for context-predictive transmission of vehicular sensordata that relies on mobility prediction and channel qualityanticipation. Since the properties of the radio communicationchannel have a severe impact on the resource-efficiency of thedata transfer, naive periodic data transmissions suffer frompacket loss which results in low data rate and high powerconsumption. Within our comprehensive empirical evaluationsin public cellar networks, we were able to show that context-predictive communication is able to significantly increasethe end-to-end goodput while simultaneously saving energydue to exploitation of connectivity hotspots and avoidanceof retransmissions. While there is a trade-off between theachievable gains and the introduced additional buffering delay,the abstract metric definition allows to configure the proposedtransmission scheme for prioritising one of the aspects withrespect to the application requirements. The raw measurementdata as well as the developed measurement application isprovided in an Open Source way in order to increase the trans-parency and reusability of the performed evaluation campaign.In future work, we will further optimize the transmissionscheme by optimizing the prediction accuracy for the mobilitybehavior as well as the channel quality estimation.

ACKNOWLEDGMENTPart of the work on this paper has been supported by Deutsche Forschungs-

gemeinschaft (DFG) within the Collaborative Research Center SFB 876“Providing Information by Resource-Constrained Analysis”, projects A4 andB4 and has been conducted within the AutoMat (Automotive Big Data Mar-ketplace for Innovative Cross-sectorial Vehicle Data Services) project, whichreceived funding from the European Union’s Horizon 2020 (H2020) researchand innovation programme under the Grant Agreement No 644657. ThomasLiebig received funding from the European Union Horizon 2020 Programme(Horizon2020/2014-2020), under grant agreement number 688380 “VaVeL:Variety, Veracity, VaLue: Handling the Multiplicity of Urban Sensors”.

REFERENCES

[1] A. Zanella, N. Bui, A. Castellani, L. Vangelista, and M. Zorzi, “Internetof things for smart cities,” IEEE Internet of Things Journal, vol. 1, no. 1,pp. 22–32, Feb 2014.

[2] N. Bui, M. Cesana, S. A. Hosseini, Q. Liao, I. Malanchini, andJ. Widmer, “A survey of anticipatory mobile networking: Context-basedclassification, prediction methodologies, and optimization techniques,”IEEE Communications Surveys & Tutorials, 2017.

[3] C. Ide, B. Dusza, and C. Wietfeld, “Client-based control of the inter-dependence between LTE MTC and human data traffic in vehicularenvironments,” IEEE Transactions on Vehicular Technology, vol. 64,no. 5, pp. 1856–1871, May 2015.

[4] C. Chen, T. H. Luan, X. Guan, N. Lu, and Y. Liu, “Connected vehiculartransportation: Data analytics and traffic-dependent networking,” IEEEVehicular Technology Magazine, vol. 12, no. 3, pp. 42–54, Sept 2017.

[5] J. Wan, D. Zhang, S. Zhao, L. T. Yang, and J. Lloret, “Context-aware vehicular cyber-physical systems with cloud support: architecture,challenges, and solutions,” IEEE Communications Magazine, vol. 52,no. 8, pp. 106–113, Aug 2014.

[6] Z. Feng, Z. Feng, and T. A. Gulliver, “Biologically inspired two-stage resource management for machine-type communications in cellularnetworks,” IEEE Transactions on Wireless Communications, vol. 16,no. 9, pp. 5897–5910, Sept 2017.

[7] N. Bui and J. Widmer, “Data-driven evaluation of anticipatory network-ing in LTE networks,” IEEE Transactions on Mobile Computing, vol. PP,no. 99, pp. 1–1, 2018.

[8] C. Jiang, H. Zhang, Y. Ren, Z. Han, K. C. Chen, and L. Hanzo,“Machine learning paradigms for next-generation wireless networks,”IEEE Wireless Communications, vol. 24, no. 2, pp. 98–105, April 2017.

[9] R. Falkenberg, K. Heimann, and C. Wietfeld, “Discover your com-petition in LTE: Client-based passive data rate prediction by machinelearning,” in IEEE Globecom, Singapore, dec 2017.

[10] T. Pogel and L. Wolf, “Optimization of vehicular applications andcommunication properties with connectivity maps,” in 2015 IEEE 40thLocal Computer Networks Conference Workshops (LCN Workshops), Oct2015, pp. 870–877.

[11] M. Kasparick, R. L. G. Cavalcante, S. Valentin, S. Stanczak, andM. Yukawa, “Kernel-based adaptive online reconstruction of coveragemaps with side information,” IEEE Transactions on Vehicular Technol-ogy, vol. 65, no. 7, pp. 5461–5473, July 2016.

[12] B. Sliwa, T. Liebig, R. Falkenberg, J. Pillmann, and C. Wietfeld,“Efficient machine-type communication using multi-metric context-awareness for cars used as mobile sensors in upcoming 5G networks,”in IEEE Vehicular Technology Conference (VTC-Spring), Porto,Portugal, Jun 2018, accepted for presentation. [Online]. Available:https://arxiv.org/abs/1801.03290

[13] A. Weintrit and P. Kopacz, “Computational algorithms implemented inmarine navigation electronic systems,” in Telematics in the TransportEnvironment, J. Mikulski, Ed. Berlin, Heidelberg: Springer BerlinHeidelberg, 2012, pp. 148–158.

[14] C. Song, Z. Qu, N. Blumm, and A.-L. Barabasi, “Limits of predictabilityin human mobility,” Science, vol. 327, no. 5968, pp. 1018–1021, 2010.

[15] B. Niehofer, R. Burda, C. Wietfeld, F. Bauer, and O. Lueert, “GPS com-munity map generation for enhanced routing methods based on trace-collection by mobile phones,” in 2009 First International Conferenceon Advances in Satellite and Space Communications, July 2009, pp.156–161.

[16] B. Sliwa, “Raw experimental cellular network quality data,” Mar 2018.[Online]. Available: http://doi.org/10.5281/zenodo.1205778

[17] B. Dusza, C. Ide, L. Cheng, and C. Wietfeld, “CoPoMo: A context-aware power consumption model for LTE user equipment,” Transactionson Emerging Telecommunications Technologies (ETT), Wiley, vol. 24,no. 6, pp. 615–632, October 2013.

Documents

Machine Learning based Context-predictive Car-to-cloud ...€¦ · network. In this paper, we bring together mobility prediction and machine learning based channel quality estimation