
J Grid Computing (2019) 17:677–698
https://doi.org/10.1007/s10723-019-09493-z
Published online: 5 November 2019

Addressing Application Latency Requirements through Edge Scheduling

Atakan Aral · Ivona Brandic · Rafael Brundo Uriarte · Rocco De Nicola · Vincenzo Scoca

Received: 14 December 2018 / Accepted: 21 August 2019. © The Author(s) 2019

Abstract Latency-sensitive and data-intensive applications, such as IoT or mobile services, are leveraged by Edge computing, which extends the cloud ecosystem with distributed computational resources in proximity to data providers and consumers. This brings significant benefits in terms of lower latency and higher bandwidth. However, by definition, edge computing has limited resources with respect to cloud counterparts; thus, there exists a trade-off between proximity to users and resource utilization. Moreover, service availability is a significant concern at the edge of the network, where extensive support systems as in cloud data centers are not usually present. To overcome these limitations, we propose a score-based edge service scheduling algorithm that evaluates network, compute, and reliability capabilities of edge nodes. The algorithm outputs the maximum scoring mapping between resources and services with regard to four critical aspects of service quality. Our simulation-based experiments on live video streaming services demonstrate significant improvements in both network delay and service time. Moreover, we compare edge computing with cloud computing and content delivery networks within the context of latency-sensitive and data-intensive applications. The results suggest that our edge-based scheduling algorithm is a viable solution for high service quality and responsiveness in deploying such applications.

All authors have contributed equally and are listed alphabetically.

A. Aral (✉) · I. Brandic · R. B. Uriarte
Vienna University of Technology, Vienna, Austria
e-mail: [email protected]

I. Brandic
e-mail: [email protected]

R. B. Uriarte
e-mail: [email protected]

R. De Nicola · V. Scoca
IMT School for Advanced Studies Lucca, Lucca, Italy
e-mail: [email protected]

V. Scoca
e-mail: [email protected]

Keywords Edge computing · Scheduling · Live streaming

1 Introduction

Cloud Computing is currently the predominant hosting solution for internet services due to the economical and infrastructural advantages it provides. Indeed, the massive pool of redundant resources characterizing cloud data centers benefits from significantly lower marginal costs due to economies of scale and guarantees a high level of reliability and dynamism, which allows providers to scale the allocated resources up or down based on current needs. As a significant challenge to this centralized paradigm, rapid progress in smart devices and network technologies has enabled new categories of internet-based services with strict end-to-end latency requirements and with



a vast amount of data to process. Such near real-time and data-intensive services include live and 360-degree video streaming, online gaming, intelligent traffic control, and smart power grids. Since centralized deployment solutions fail to satisfy strict latency requirements and network architectures will soon become incapable of handling such a massive amount of data communication, more geographically distributed approaches are necessary [31]. Indeed, many cloud providers already enrich their offers with distributed deployment solutions including Content Delivery Networks (CDN) and Edge Computing [36].

Edge Computing proposes to place computation and storage capabilities at the edge of the network in the form of micro-scale data centers as an extension of massive-scale cloud data centers [37]. This is a highly promising solution for the aforementioned latency-sensitive services since it allows the deployment of services in close proximity to their end users to meet response time requirements. Early work on the impact of Edge Computing on service deployment [9, 18] provides insights into its effectiveness in terms of end-to-end service latency, which ensures a higher quality of service for end users. However, such benefits can be achieved only if service instances are effectively scheduled on the nodes that satisfy their requirements on latency, bandwidth, computation capacity, and reliability. In this regard, contextual information (i.e., user, application, computation, and network) must be taken into account to find an optimal service placement [51]. This would not only maximize the quality of user experience by decreasing end-to-end latency, but also minimize the backbone network traffic since most data communication would be local.

Currently, only a few works, mainly in the area of Internet-of-Things, consider service scheduling for Edge Computing. Existing scheduling solutions borrowed from similar paradigms, in particular cloud and CDN, are subject to infrastructure limitations in the context of Edge Computing, which renders them unsuitable for meeting the requirements of latency-sensitive services. More specifically, cloud-based solutions depend on a massive pool of resources, homogeneous network conditions for compute nodes, as well as high reliability and scalability [15]. The CDN paradigm is similar to Edge Computing in the sense that resources are distributed and close to users [35]. However, CDNs are designed for data-intensive services rather than for compute-intensive ones, and computations are still offloaded to the cloud [8]. Hence, CDN scheduling solutions ignore processing capabilities.

In this paper, we propose a score-based edge scheduling framework specifically designed for latency-sensitive, computation- and data-intensive services at the edge of a network. The proposed scheduling mechanisms are applicable to virtual machines (VMs) as well as to more light-weight implementations such as containers; for the sake of brevity, in the rest of the paper we use the term VM for both. For each service instance, our approach identifies the VM type with the computational and network capabilities that minimizes the response time for end users without reserving resources in excess. The algorithm first evaluates the eligibility of the available VM types on the edge nodes for hosting a given service by considering network latency, bandwidth, processing power, and reliability. Then, it schedules services on the most appropriate VMs according to the overall eligibility scores, to guarantee optimal service quality. We validate our approach by considering a live video streaming scenario and evaluating how the different deployment solutions, namely Cloud, CDN, and Edge, affect user response time. Our results clearly demonstrate that an Edge-based deployment can effectively improve user response time. Our main contributions are: (i) a novel Edge scheduling framework for latency-sensitive services; and (ii) an evaluation of different deployment solutions and scheduling approaches for latency-sensitive services.

This paper is an extended and revised version of [33]. Here, the scheduling algorithm takes availability into account, and the proposed solution also considers the integration of Edge with Cloud, which turns out to be useful especially when Edge data centers are saturated. All experiments are revised to take these new perspectives into account. In the following section, we motivate our work by discussing the benefits of edge-based deployment for latency-sensitive live video streaming services. In Section 3, we present our scheduling approach, and in Section 4 we introduce the experimental setup for its evaluation. We present and discuss numerical results in Section 5, whereas we survey the related literature in Section 6. We conclude the paper and discuss future directions in Section 7.


2 Motivation

2.1 Use Case Scenario

Smart-phones are widely used for recording and sharing live events thanks to their ubiquitousness and improved video recording capabilities. In the mobile live video streaming scenario, a video recorded by a mobile device is live streamed globally to a mobile or desktop audience of any size. The success of pioneer services such as Meerkat, which have been used by journalists, lecturers, marketing firms, event organizers, etc., has attracted the interest of large companies such as Twitter (Periscope), Facebook (Facebook Live), Google (Youtube Mobile Live), and IBM (Ustream, IBM Cloud Video). Figure 1 depicts the major components and workflow of a live video streaming service, which is discussed in detail in the rest of this section.

As the foremost operation, raw input media from the camera is encoded into a high-quality stream by either the local camera encoder or an external one, running on a server preferably close to the streaming venue. Transcoding is the next step, where new stream instances in different resolutions and bit-rates are created on the fly. As a result, audience members with different device configurations in terms of network bandwidth, screen resolution, and size can select a suitable stream and experience a high quality of service. Each transcoded stream then undergoes the packaging operation. In this step, the streams are split into chunks of a size that is manageable from the network communication perspective. The packaging operation is carried out based on a streaming protocol (e.g., HLS or HDS), which defines the chunk length, codecs, container, etc. Finally, the video chunks are delivered to the corresponding audience devices, which are responsible for sorting and merging them, and playing back the video content through a media player.

Deployment of the above described operations on the widely distributed architecture of Cloud, CDN, and Edge nodes plays a critical role in the engagement of the audience with the video stream. More specifically, it has a significant impact on various user engagement metrics such as join time (i.e., the time between the connection establishment and the beginning of the playback), buffering ratio (i.e., the percentage of buffering time in a video), and video quality [12]. Less computation capability at the deployment servers and a longer geographical distance translate to either a higher join time and buffering ratio or a lower video quality, which decreases user engagement.

The most widely implemented solution by service providers at the moment is the cloud-based deployment. It allows fast scalability of resources in the face of volatile workloads caused by, for example, flash crowds. Figure 2 depicts the simplest form of cloud-based deployment, where transcoding and packaging operations are carried out in a cloud data center and the stream is directly delivered to the audience from there.

Fig. 1 Live video streaming workflow


Fig. 2 Cloud-based solution for live streaming services

As a result, transcoding and packaging operations enjoy the seemingly unlimited resource capacity and elasticity of the cloud data center. Even though video can be processed quite efficiently in this scenario, delivery of the media content between a centralized cloud data center and possibly globally distributed users is the bottleneck. This is because current wide area networks can provide only best-effort guarantees for packet delivery. Considering the geographical distance and the high number of network hops along the route, link congestion is highly probable, which leads to packet loss, jitter, and low throughput, and consequently a low quality of video experience.

To overcome the above described issues, state-of-the-art cloud deployment strategies involve the use of content delivery networks (CDNs), as demonstrated in Fig. 3. Here, another level of caching is added between the audience and the cloud data center. Whereas media processing still occurs in the cloud, distribution is offloaded to a CDN provider, which maintains a network of cache servers in close proximity to the end users. Each user request is then redirected to the closest cache server with the required version of the video stream. As a result, bandwidth consumption and network delay between the video processing and the audience are reduced. Cloud-based and CDN-supported video streaming are successfully employed by many video providers that have fairly static media content in centralized repositories. However, this approach has significant shortcomings when it comes to live videos originating from end users. First, problems with the network communication between the recording device and the cloud data center are still present. Especially in cases where the recorder and audience are in close proximity (e.g., within the same city), transferring the stream to a remote data center causes unnecessary network traffic. Second, the continuous update of the CDN servers due to the transient nature of live video streaming is inefficient in terms of network bandwidth and monetary cost. We argue that the CDN architecture is insufficient for highly demanding live video streaming usage.

In this work, we propose an edge-based live video streaming architecture, as given in Fig. 4, in order to eliminate the above described disadvantages of cloud- and CDN-based deployments. Here, raw video is encoded either locally or at the edge node that is closest to the recorder. Afterwards, it is distributed to a set of edge nodes in close proximity to the audience for the execution of transcoding and packaging operations. Different from cloud-based video processing, where streams in all possible combinations of bit-rates, resolutions, and protocols have to be generated, only the requirements of the local audience, who are serviced by a particular edge node, need to be satisfied. This results in efficient utilization of the limited computing capacity of edge servers.


Fig. 3 Delivery networks for live streaming services

Moreover, the edge nodes have the supplementary task of caching the video chunks, similar to the CDN servers. Hence, each audience member is served by the closest edge node, which generates the stream in the required format.

In summary, encoding, transcoding, and packaging of the live video stream can be carried out close to the recorder and the audience.

Fig. 4 Edge-based platform for live streaming services


Consequently, additional network delays and detours from the shortest route are mostly eliminated. Moreover, media processing performance similar to that of cloud computing can be achieved by means of parallelization. The load on the wide area network is also alleviated, since the majority of the data transfer occurs in the local area, where edge nodes are usually accessible by the user devices through high-bandwidth wireless connections. All these improvements result in a higher quality of experience in playback and increase user engagement.

2.2 Edge Business Models

Despite the potential of Edge computing, the business models of this emerging paradigm are complex because of the different types of providers, the mobile nature of the services, and their limited scope (e.g., ephemeral or spike processing demands). Many different players can take part in this market: cloud providers, which want to expand their services and infrastructure; Internet service providers (ISPs), which can install computational resources in their existing infrastructure; and so-called prosumers, i.e., users with spare computational resources, who can, at the same time, consume and provide resources. All these actors face significant challenges to actually enter this market. Cloud providers profiting from the economy of scale and the centralization of services may have to change their business model to interact with geographically distributed small data centers. ISPs that do not have the provisioning of computational resources as their core business need to consider it. Prosumers of edge computing need to acquire the skills and the tools to properly manage services and avoid resorting to larger providers because of the guarantees they offer.

In addition, service models for edge computing are manifold. The infrastructure and platform as a service models may work well in scenarios where the same service is provided to many users, like, e.g., the live video streaming scenario covered in this paper. In these scenarios, VMs can host services for groups of users. However, their use is not ideal for applications related to a single user, since VMs' initialization and migration times are relatively high. For example, when edge computing is employed to complement the computational power of a user's device to deal with the CPU-intensive part of an application, users can share the same resources (bare-metal servers, VMs, etc.). Alternatively, providers need to deploy lighter solutions, e.g., containers.

Since in edge environments users can be mobile or require seamless service handover, the single-provider model used in the cloud is only applicable if large providers own geographically distributed resources covering the needs of most of their consumers. A solution to this problem could be that brokers agree with local resource providers (e.g., private cloud owners) to acquire a share of their infrastructure and create a resource network to be sold to edge consumers. An alternative could be the creation of open edge markets with low market entry barriers to increase the number of providers and leverage the adoption of edge computing, while providing a single interface for edge consumers. To this end, the capacity of blockchain and smart contracts to create a trust layer between participants could be very useful. Indeed, blockchain and smart contracts have attracted attention from industry and academia. Several projects are emerging in the area, e.g., [43, 45], but there are still many open gaps, such as defining QoS and pricing policies [32, 47, 49], comparing resource offers, and providing seamless handover. For an overview of the challenges and advantages of these solutions, we refer to [46]. The scheduling algorithm we are proposing can be used in any of these models and by any type of provider. It might only be necessary to introduce some small extensions to take into account specific aspects; e.g., brokers might need to also consider costs.

3 A Scheduling Framework for Edge Computing

This work uses the terminology of [19], which defines edge networks as geographically distributed nodes, relatively close to Radio Access Technology (RAT) Base Stations (BSs). These stations provide users and edge nodes with access to the core network. Each node, in our context, is a micro data center that grants access to its resources through different types of Virtual Machines (VMs).

Considering that we focus on latency-sensitive services, the most important service requirement is response time, i.e., the time between a user request and its reply. In this scenario, the response time is mainly composed of network delay and processing time. The former refers to the time elapsed between the user request and its arrival at the edge node providing the service. The queuing, user-node distance,


and the overhead of each hop in the network route, together with the available bandwidth, are the main metrics that compose the network delay. Processing time, instead, refers to the time necessary to compute a user request, which depends on the type of VM on which it is executed and on the service specifications.

We design a resource reservation and scheduling framework which considers the edge computing characteristics in the scheduling process to improve the experience of latency-sensitive application users. Algorithm 1 and Fig. 5 illustrate the main steps of our methodology. For each service, it groups the users based on their location and requirements; evaluates the network quality of each node, in particular its connectivity and bandwidth; evaluates the VMs' resources and availability with respect to the service requirements; and finally, based on the combination of these evaluations, schedules the service. These steps are detailed in the rest of this section with references to the line numbers in Algorithm 1.

3.1 User Clustering

Considering the large audience of live video streams, it would be extremely time- and resource-consuming to compute scores for each individual user. Fortunately, consumers and producers of many edge services are known to exhibit geographical locality [5], and thus their compute and network requirements are similar. In this work, users with similar requirements in close proximity are clustered into user groups in order to reduce the complexity, and consequently the time, of VM evaluation. In this way, computing only a single overall score for each group of users is sufficient. We first assign each user to the geographically closest edge node, and then we cluster the users who are assigned to the same node and have similar compute requirements into a group (line 2). We use three levels of CPU consumption (low, medium, and high), which represent, for example, the different encoding length and video resolution requirements of smartphones, tablets, and laptops in the live streaming use case.

3.2 Network Evaluation: Connectivity and Bandwidth

The network evaluation is divided into three main phases: calculating the latency between the previously defined groups and the edge nodes; defining a connectivity score; and defining a bandwidth score.

In our solution, we assume that edge nodes are connected to each other through a network and that users connect to the closest BS. In many cases, however, the service may not be scheduled on the edge node closest to its user. For example, the edge node might not have the type of resources required by the service, or it might be overloaded. We therefore need to take into account the latency between groups and the edge nodes. To this aim, we: (i) represent the edge network as a weighted graph, where each node is a data center in the network, the links between nodes define the connections between data centers, and the link weight describes the estimated network latency between them; and (ii) devise a user-based approach that, for every group of users g, detects the lowest-latency path from a node n to g by relying on Dijkstra's algorithm (line 5). This method is valid since network latency can be estimated using many different techniques, which range from active monitoring, by sending probes between servers and then measuring the latency, to passive monitoring, which captures information from the network devices about network paths [52].
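A possible rendering of the path-estimation step (line 5) is sketched below, assuming the edge network is encoded as an adjacency map whose weights are the estimated link latencies.

import heapq

def lowest_latency(graph, source):
    # graph: {node: [(neighbor, latency), ...]}. Returns the minimum
    # network latency from 'source' to every reachable node, i.e.
    # Dijkstra's algorithm as invoked in line 5 of Algorithm 1.
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path was found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(queue, (nd, v))
    return dist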

With the connectivity score q_{n,l} ∈ [0, 1] we define the quality of the connectivity between each user and a VM v by evaluating the network routes between the user groups and the node n hosting v. The input for this process is the previously computed latency.

Fig. 5 The scheduling framework


Algorithm 1 Latency-sensitive scheduling.

Data: Service s to be scheduled
Data: Service provider's latency constraint t
Data: Location of the users U = {u_1, u_2, ..., u_m}
Data: Nodes N = {n_1, n_2, ..., n_k}
Data: Location of the nodes L = {l_1, l_2, ..., l_k}
Data: Types of VMs VM = {v_{i,1}, v_{i,2}, ..., v_{i,q}, ∀i ∈ N}
Data: Array Q of quality scores for each VM v ∈ VM
Result: The vms scheduled for s.

 1 begin
 2   define the set of user groups G = {g_i : |u_j − l_i| < ε, ∀u_j ∈ U, ∀l_i ∈ L};
 3   forall n ∈ N do
 4     forall g ∈ G do
 5       estimate the network path p_{g,n} with the lowest latency t_{g,n} between g and n;
 6       estimate the available bandwidth b̃w_{g,n} between the users in g and the node n, and the required bandwidth bw_g on p_{g,n};
 7       calculate the connectivity score q_{n,l,g} = u;
 8       calculate the bandwidth score q_{n,bw,g} = BandwidthScore(bw_g, b̃w_{g,n});
 9     q_{n,l} = Σ_i |g_i| q_{n,l,g_i} / Σ_i |g_i|;
10     q_{n,bw} = Σ_i |g_i| q_{n,bw,g_i} / Σ_i |g_i|;
11     forall v ∈ n do
12       q_{v,res} = ComputingScore(w, w_v);
13       q_{v,av} = AvailabilityScore(a_v, av_v);
14       calculate the quality score Q[q_v] = 4 / (1/q_{v,res} + 1/q_{v,av} + 1/q_{n,l} + 1/q_{n,bw});
15   vms = Scheduler(Q);
16   return vms;

The evaluation of the delay, executed for every path connecting a user group g to the node n, is computed by a utility function previously defined by the provider, for example, a sigmoidal function that assigns values close to 1 if the network delay of the path is under a given value and close to 0 if over. The path-based evaluation produces a set of quality scores {q_{n,l,g_1}, q_{n,l,g_2}, ..., q_{n,l,g_N}}, later used to calculate the final connectivity score q_{n,l} as the mean value of the scores of the groups using this path (line 9).

One of the main factors affecting the overall service quality, as mentioned in Section 2, is the available bandwidth of the paths between the users and the VM. The bandwidth score q_{n,bw} ∈ [0, 1] represents the available bandwidth for each n. We calculate a bandwidth score q_{n,bw,g_i} ∈ [0, 1] for each shortest path from n to each user group g_i. Similar to the delay evaluation process, we calculate the final bandwidth quality score q_{n,bw} by averaging the single-path quality scores q_{n,bw,g_i}, weighted by the number of users in each group (line 10).

3.3 Virtual Machine Evaluation: Resources and Availability

For the VM resource evaluation, we take into account the expected overall service load and the load a VM type can handle. For latency-sensitive contexts, provisioning fewer resources than actually required by the application execution may considerably increase the overall response and processing times. Our approach computes a score q_{v,res} ∈ [0, 1] to evaluate the computational resources of a VM v, similarly to the network evaluation process, by using a utility function defined by the provider, which compares the number of user requests that v can cope with to the overall number of requests expected (line 12).


Other utility functions can also be used, for example, the similarity of the new service with other services running on the same host [48] or the performance of the new service type on the target hardware.

Another critical factor for service performance at the edge of the network is the reliability of computational resources. Edge servers are known to be more prone to failures in comparison to their cloud counterparts [2]. In this work, we consider transient failures at edge nodes, such as power outages, system restarts, memory overflows, etc. We model the reliability of each node as its historical availability rate, which is shown to be sufficiently accurate assuming the failures are independent [3]. The desired availability is taken as 100% in order to guarantee the computation capacity promised by the VM resource evaluation. A user-defined utility function computes an availability score q_{v,av} ∈ [0, 1] (line 13).

3.4 Scheduling Latency-Sensitive Services

Given the set S of services, the provider P schedules each s ∈ S on the VM type v ∈ V that can guarantee the end-user service quality. First, we compute for each VM type an overall quality score q_{s,v} by calculating the harmonic mean of the connectivity, bandwidth, resource, and availability scores (line 14). Although the harmonic mean is similar to the arithmetic mean in that they both give the same weight to two scores when they are similar, it increases the importance of the smaller value as the gap between the two values increases. This guarantees that VMs with a mix of very high and very low scores are penalized, which favors a better balance between the network and computational resources. The output of the VM evaluation is the set Q = {q_{v_1}, ..., q_{v_k}} of quality scores (line 15).
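As a minimal sketch, the combination rule of line 14 can be transcribed directly; the argument names mirror the four scores.

def overall_quality(q_res, q_av, q_l, q_bw):
    # Harmonic mean of the four [0, 1] scores (line 14 of Algorithm 1).
    # A single low score pulls the mean down sharply, penalizing VMs
    # that are strong on one dimension but weak on another.
    scores = (q_res, q_av, q_l, q_bw)
    if any(s == 0 for s in scores):
        return 0.0  # a zero score disqualifies the VM outright
    return 4.0 / sum(1.0 / s for s in scores)

For instance, overall_quality(0.9, 0.9, 0.9, 0.1) ≈ 0.30, whereas the arithmetic mean of the same scores is 0.7; this is precisely the penalization of imbalanced VMs described above.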

Then, the scheduling process maximizes the overall quality of the selected VMs, guaranteeing the service quality to end users. This process is defined as a binary integer programming optimization problem, as shown in the formulation below. The x_{s,v} are binary variables that take the value 1 only if service s is scheduled on VM type v. The coefficients q_{s,v}, on the other hand, are the VM quality scores calculated by the VM evaluation algorithm, which represent the suitability of VM type v for service s. The quality of the final scheduling is maximized by the cost function (I).

maximize_x   Σ_{s∈S} Σ_{v∈V} q_{s,v} x_{s,v}          (I)

subject to   Σ_{v∈V} x_{s,v} = 1   ∀s ∈ S             (II)

             Σ_{s∈S} x_{s,v} ≤ k_v   ∀v ∈ V_n          (III)

where x_{s,v} = 1 if s is scheduled on v, and 0 otherwise.

Additionally, constraint (II) guarantees that each service is scheduled on a single VM, whereas (III) guarantees that the number of provisioned VMs of type v cannot exceed the number of available instances of that type, k_v.
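This formulation maps directly onto an off-the-shelf integer programming solver. The sketch below uses the PuLP library; the score and capacity inputs are placeholders, and the model illustrates (I)-(III) rather than reproducing the exact implementation used in the experiments.

from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary

def schedule(services, vm_types, q, k):
    # q[(s, v)]: quality score of VM type v for service s (Algorithm 1).
    # k[v]: number of available instances of VM type v.
    prob = LpProblem("edge_scheduling", LpMaximize)
    x = LpVariable.dicts("x", (services, vm_types), cat=LpBinary)
    # Objective (I): maximize the total quality of the assignments.
    prob += lpSum(q[(s, v)] * x[s][v] for s in services for v in vm_types)
    for s in services:   # (II): each service on exactly one VM type
        prob += lpSum(x[s][v] for v in vm_types) == 1
    for v in vm_types:   # (III): respect instance availability k_v
        prob += lpSum(x[s][v] for s in services) <= k[v]
    prob.solve()
    return {s: v for s in services for v in vm_types
            if x[s][v].value() == 1}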

4 Simulation Environment

We developed a simulation environment by extending the EdgeCloudSim framework [41], which itself is an edge computing extension of the widely used CloudSim toolkit [10]. We experiment with several deployment scenarios and scheduling policies to cover different aspects of the problem. Moreover, we employ real-world resource and workload characteristics, where possible, in order to obtain realistic results. When real-world traces are unavailable, we resort to synthetic generation via methods that have been shown to be effective by previous studies. The rest of this section describes the simulation environment in detail.

4.1 Deployment Scenarios

We simulate the live video streaming scenario described in Section 2 as a use case for the proposed scheduling framework. In this scenario, the computation capability and network latency characteristics of the nodes chosen for the transcoding, packaging, and delivery of live video have a strong impact on the QoS perceived by the users. We compare the performance of Cloud, CDN, and Edge deployments, and we also consider a Hybrid solution, where both Cloud and Edge resources are available.


Cloud This is the centralized deployment, where the original video is sent to a Cloud data center and all video processing (i.e., encoding, transcoding, packaging) is carried out on powerful servers. Video segments are also distributed from the data center to the users who request the live stream. Considering the huge computation capacity of a Cloud data center, resources are modeled as an infinite set of VM instances with specifications taken from the Amazon Web Services (AWS) m2.4xlarge type, as reported in Table 1. The network communication between the user and the Cloud data center is modeled with a single 1 Gbps link. This represents the aggregation of multiple links from the users' access BSs to the Cloud. To estimate the communication delay of this link, we made ICMP requests from hosts in Lucca, Italy to AWS instances in the same time zone and averaged the experienced latency. In our experiments, we use the communication delay δ_{u,c} = 0.09s to model the latency due to the queuing and processing operations as well as the physical distance between the user u and the Cloud data center c.

Content Delivery Network (CDN) Also in this scenario, the live video is transmitted to and processed in the Cloud data center. However, distributed cache servers store the video segments for future requests. Content distribution follows the policy proposed by [27]. Here, a user request is first redirected to the closest cache server, which returns the content if it is already cached. As a result, the additional delay to the Cloud is avoided. If the requested content is not available in the cache, the request is forwarded to the Cloud. In this case, the response is both sent to the user and stored at that cache server. We consider a three-tier network architecture of the users, geographically distributed CDN nodes, and an origin server in the Cloud data center. In this scenario, the origin and replica (CDN) servers have different purposes and correspondingly different hardware configurations. The origin server is of type m2.4xlarge, as in the Cloud scenario, whereas the replica servers, being intended for delivering content, have storage-optimized i3.large specifications, as shown in Table 1. Regarding the network characteristics, CDN nodes feature substantial inter-node and node-to-cloud distances because they are geographically distributed and co-located with network nodes, such as ISP points of presence (PoPs) or internet exchange points (IXPs). However, they are in close proximity to the users, which reduces the user-replica network latency. Hence, we define a communication delay of δ_{u,r} = 0.013s between the user and the CDN node, along with δ_l = 0.03s for each hop on the path between the CDN and the origin server. δ_{u,r} is approximated by sending a set of ICMP requests from a host to a server around 300 km away over a 4-hop connection. Access bandwidths of CDN servers are generated from a Pareto distribution with a mean value of μ = 500 Mbps, whereas the links between the Cloud and CDN nodes are of higher capacity with μ = 750 Mbps, since we assume that they are directly connected to the ISP backbone. The underlying network topology is described in Section 4.3.

Edge In this scenario, the entire service is deployed on the edge computing servers, as described in Section 2. Hence, the cloud data center is only responsible for the management of the edge nodes, which execute both video processing and distribution operations. We deploy 20 Edge nodes that are co-located with BSs. At each node, we allow 10 VMs of type either m1.large or m1.xlarge to be instantiated, in order to reflect the limited computing power of Edge nodes. The access bandwidths of edge nodes are modeled as a Pareto distribution with average value μ = 375 Mbps. Considering the locality of Edge nodes, we assume that Edge nodes are in close proximity not only to the users but also to other nodes.

Table 1 Virtual machine specifications

                   m1.large   m1.xlarge   m2.4xlarge   i3.large
Number of CPUs     4          8           26           2
CPU (MIPS)         2400       4800        20000        2400
RAM (GB)           8          16          70           15
Storage (GB)       2x420      4x420       2x840        unlimited
Price (USD/h)      0.17       0.35        0.98         0.15


Hence, we define high-speed connections through one-hop dedicated links between Edge nodes, where bandwidth values are drawn from a Pareto distribution with a mean value of μ = 400 Mbps, and latency from a uniform distribution U[0.006, 0.009]. In this case, we estimated the δ_{e_i,e_j} values as the average latency measured from a set of ICMP requests sent over a single-hop connection between two hosts within 40 km distance.

Hybrid This scenario extends the Edge one by merging it with the Cloud approach. It considers, besides the 20 nodes available in the edge network, also a cloud data center in the actual service deployment. Intuitively, if there are not enough resources available in the reserved VM, our solution checks the availability of other VMs on the same host. If none is available, instead of using other edge data centers, which could affect the reservation scheme provided by the scheduler, it sends the service to a cloud data center. Overall, this approach provides a hybrid solution that integrates the virtually infinite computation power of the cloud with the edge, which is used only in cases where the edge micro data centers are saturated. Computation and communication capabilities of the Edge and Cloud nodes are the same as in their respective scenarios described above.

4.2 Scheduling Scenarios

In addition to the Edge-based scheduling approach proposed in Section 3.4, we also implement a state-of-the-art Cloud-based scheduling approach [13] as a baseline.

Edge-Based We use the utility functions in (1)-(4) for the four scores described in Section 3.3, namely the network delay, available bandwidth, VM resources, and VM availability. The utility score for the network delay between a user group g and a virtual machine v (1) is computed via a sigmoidal function S. The domain of the function is the ratio of the delay δ_{g,v} to the maximum tolerable delay δ, which is set to 50 ms. The additive inverse indicates that smaller δ_{g,v} values are preferred. As a result, delays δ_{g,v} that exceed the requirement δ receive scores close to 0, whereas those within the limit receive scores close to 1. The use of the sigmoidal function ensures that delays shorter than the requirement are not further incentivized. The other functions are defined in a similar fashion. For the bandwidth score (2), the available bandwidth B_{g,v} is compared to the bandwidth demand B, whereas for the VM resources (3), the computing capability of v, expressed in terms of requests per second RPS_v, is compared to the expected number of requests W based on the expected workload. The utility function for the availability (4), on the other hand, compares the availability rate of the node, A_v, to the optimal case of 100% availability. In Fig. 6, the functions are plotted for varying resource availabilities, given a request of maximum 50 ms latency, minimum 300 Mbps bandwidth, and the capability to process at least 25 requests per second with 95% availability.

u_{v,δ}(δ_{g,v}, δ) = S(1 − δ_{g,v}/δ)        (1)

u_{v,B}(B_{g,v}, B) = S(B_{g,v}/B − 1)         (2)

u_{v,RPS}(RPS_v, W) = S(RPS_v/W − 1)           (3)

u_{v,A}(A_v) = S(A_v/100 − 1)                  (4)

Cloud-Based There exists a plethora of task scheduling algorithms in the context of cloud computing. We implement the algorithm by [13], called the FIXED provisioning algorithm, as a representative baseline for cloud-based scheduling. This algorithm estimates the minimum number of streaming engine instances required to meet a tolerable user waiting time. It uses an M/M/V queue to accurately model the system and proactively adjusts the number of instances according to the workload prediction. We provide the pseudo-code of this algorithm in Algorithm 2. Given the estimations for the inter-arrival times and durations of the streaming requests, it begins with calculating an initial set of VMs with the minimum size (line 2). Then, t is defined as the minimum waiting time to be guaranteed based on the QoS requirements of the ASP (line 3). The algorithm increases the number of VM instances (line 10) as long as the probability of an unacceptable waiting time (calculated in lines 7 and 8) is higher than a predefined threshold p (lines 6 to 10). In our edge computing implementation of the algorithm, the calculated set of streaming engines is randomly instantiated at the edge nodes.


Algorithm 2 Cloud-based scheduling.

Data: Set of QoS requirements offered by the ASP: S = {QoS(x), x ≥ 0}
Data: Estimated mean inter-arrival time: 1/λ
Data: Estimated mean request duration: 1/μ
Result: Number of VMs to acquire: V

 1 begin
 2   calculate the smallest number V such that λ/(Vμ) ≤ 1;
 3   t = min(QoS(x)), QoS(x) ∈ S;
 4   P(w(r_i) > t) = 1;
 5   ρ = λ/(Vμ);
 6   while P(w(r_i) > t) > p do
 7     calculate Π_W = ((Vρ)^V / V!) · ((1 − ρ) Σ_{n=0}^{V−1} (Vρ)^n / n! + (Vρ)^V / V!)^{−1};
 8     calculate P(w(r_i) > t) = Π_W · e^{−Vμ(1−ρ)t};
 9     if P(w(r_i) > t) > p then
10       V = V + 1
11   return V
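A direct transcription of Algorithm 2 in Python is sketched below; Π_W is the standard Erlang-C waiting probability, and the threshold p is a provider-chosen parameter.

import math

def fixed_provisioning(lam, mu, t, p):
    # lam: mean request arrival rate; mu: mean service rate per VM;
    # t: maximum tolerable waiting time; p: acceptable probability of
    # exceeding t. Returns the number of VM instances V to acquire.
    V = max(1, math.ceil(lam / mu))           # line 2: smallest stable V
    while True:
        rho = lam / (V * mu)                  # line 5: utilization
        # line 7: Erlang-C probability that a request has to wait
        a = (V * rho) ** V / math.factorial(V)
        b = (1 - rho) * sum((V * rho) ** n / math.factorial(n)
                            for n in range(V))
        pi_w = a / (a + b)
        # line 8: probability that the wait exceeds t
        p_wait = pi_w * math.exp(-V * mu * (1 - rho) * t)
        if p_wait <= p:
            return V
        V += 1                                # line 10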

4.3 Network Topology

Figure 7 demonstrates the three-tier network, which consists of the cloud, the edge or CDN nodes, and the user groups. We generate an undirected random network topology graph between the CDN and Edge nodes (shown with dashed lines in Fig. 7) using the BRITE topology generator [24]. We employ the Barabasi–Albert scale-free network generation model [6], which is known to accurately represent human-made systems such as the Internet. It mimics the incremental growth and preferential node connectivity behaviors of such systems: nodes are added gradually, and new ones link with higher probability to well-connected nodes called hubs.

The nodes are placed on a 25 × 25 km grid uniformly at random, as shown with black filled circles in Fig. 8a. We assume that these locations are also points of interest (POIs), such as faculties on a campus, or commercial buildings and recreational areas in a city. Hence, the users, represented with grey circles, are clustered around these POIs. We use a Gaussian distribution to generate x and y coordinates relative to the coordinates of the POIs, assign each user to the closest node, and consequently obtain the user groups. Note that the closest node may not always be the POI for which the user coordinates were generated.
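A sketch of this topology setup follows; the Gaussian spread sigma around each POI is an assumed value, since the paper does not report it.

import random

def place_nodes_and_users(n_nodes=20, n_users=1000, grid=25.0, sigma=1.5):
    # Nodes (POIs) are placed uniformly at random on a grid x grid km
    # area; each user is scattered around a randomly chosen POI with a
    # Gaussian offset (sigma is an assumption, not from the paper).
    nodes = [(random.uniform(0, grid), random.uniform(0, grid))
             for _ in range(n_nodes)]
    users = []
    for _ in range(n_users):
        px, py = random.choice(nodes)
        users.append((random.gauss(px, sigma), random.gauss(py, sigma)))
    # Each user is then assigned to the *closest* node, which is not
    # necessarily the POI its coordinates were generated around.
    return nodes, users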

4.4 Mobility

There is a large body of literature regarding human mobility based on observed movements [17].

Fig. 6 Visualization of the utility functions for the requirements of (a) 50 ms latency, (b) 300 Mbps bandwidth, (c) 25 requests/second processing capability, and (d) 95% availability


Fig. 7 Edge network infrastructure [color online]

Considering the POI-based edge node placement and user distribution, we implement a mobility model that is based on real-world check-ins [14]. The authors of this work analyzed a massive dataset of 37 million check-in records to understand the movement behavior of users. They determined the distribution of transition probabilities between POIs, which we utilize to represent the probability that a user leaves an edge node and joins another one.

Specifically, we consider the probabilities that a user stays at the same POI (P_stay), moves to a new one (P_new), and returns to the previous one (P_return).

Fig. 8 A schematic representation of edge nodes (a) with network topology and user distribution, and (b) with mobility paths


Based on the findings of [14], P_stay is independent of time and follows a normal distribution with an average value of 0.445. P_new, on the other hand, decays as a power law of time, as given in (5). Here, −0.3 is the decay rate of P_new, whereas a is the rate of exploration, that is, the extent to which a user explores new places. Finally, P_return is calculated as given in (6).

P_new = a · t^{−0.3}                    (5)

P_return = 1 − (P_new + P_stay)          (6)

Since the transition functions are estimated by fitting to real data, it is possible that P_new + P_stay > 1 in some cases. When this happens, we take P_new = 1 − P_stay instead of (5). In our implementation, all POIs have the same attraction level, and hence the same transition probability. However, we restrict the transitions based on geographical proximity. Figure 8b presents the available mobility paths between edge nodes, determined by a distance threshold of 8 units. When a user takes the action to move to a new node, the destination is randomly selected among the neighbors of the current node, except the previous one, as sketched below.
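The sketch assumes a precomputed neighbor map (the mobility paths of Fig. 8b); the capping of P_new and the uniform choice among neighbors follow the description above.

import random

def mobility_step(current, previous, t, a, neighbors, p_stay=0.445):
    # One mobility decision for a user who has spent time t at node
    # 'current'; a is the exploration rate, neighbors an adjacency map.
    p_new = min(a * t ** -0.3, 1.0 - p_stay)  # eq. (5), capped so that
                                              # the probabilities sum to 1
    r = random.random()
    if r < p_stay:
        return current                        # stay at the same POI
    if r < p_stay + p_new:
        # move to a random geographic neighbor, excluding the previous one
        options = [n for n in neighbors[current] if n != previous]
        return random.choice(options) if options else current
    return previous                           # eq. (6): return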

4.5 Workload

Since, to the best of our knowledge, there is no publicly available data set with information about the number of viewers joining video streams, we create one based on real-world information. The main parameters considered in our scheduler/simulation are: the maximum number of concurrent clients; the streaming duration; the user arrival rate; and the required video quality.

Live video streams can be classified with respect to different parameters, such as the number and arrival process of viewers and the stream duration, as reported in the analysis of live video streaming workloads carried out in [42]. Our experiments focus on small/medium streams, the most common type of stream in use (in particular in social networks), which have a peak of less than 1000 concurrent clients and are short, with a duration of less than one hour. We model the user arrival process with an exponential distribution whose mean time between two arrivals (1/λ) is 5 seconds. The time each user spends watching the stream is defined, instead, by a heavy-tailed distribution modeled as a "truncated" version of the Pareto distribution, characterized by a cut-off point at 40 minutes, where the tail drastically drops off [42].
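The two processes can be sampled as follows; the Pareto shape and scale are assumptions for the sketch, since only the mean inter-arrival time and the 40-minute cut-off are stated above.

import random

def next_arrival_gap(mean_gap=5.0):
    # Exponential inter-arrival times with mean 1/lambda = 5 seconds.
    return random.expovariate(1.0 / mean_gap)

def viewing_duration(shape=1.2, scale=1.0, cutoff_min=40.0):
    # "Truncated" Pareto watch time in minutes: heavy-tailed, with a
    # hard cut-off at 40 minutes. shape and scale are assumed values.
    return min(scale * random.paretovariate(shape), cutoff_min)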

The audience uses various end devices to access the stream, namely smartphones, tablets, and laptops; therefore, the video quality (i.e., resolution and bitrate) changes according to the device used and the bandwidth conditions. A smartphone video is encoded in standard definition in two different resolutions and bitrates, 640x354@640Kbps and 416x234@400Kbps. A tablet video, instead, is encoded in high definition in two different formats, 1280x720@4400Kbps and 1280x720@2500Kbps, having the same resolution but different bitrates. Finally, a laptop video is encoded in two different full-HD formats, 1920x1080@3000Kbps and 1920x1080@5000Kbps. All these video parameters were retrieved from one of the most widely used commercial services for live video streaming, namely Youtube.¹

The availability of Edge servers is taken from real-world failure traces for Local Domain Name Servers (LDNS) [25]. This conforms to our model that Edge servers are deployed on networking hardware such as ISP points of presence. The data set contains ping probes to 62,201 LDNS servers, which were initiated at exponential intervals with a mean of 1 hour. The traces are dated between March 17 and March 24, 2004. We calculate the observed availability percentage from these traces and also account for the re-initialization period after each failure.

5 Numerical Results and Discussion

5.1 Comparison of Deployment Scenarios

The aforementioned scenarios are run 10 times with the number of mobile devices ranging from 400 to 1400. The results in Fig. 9a show that the edge platform (Edge and Hybrid scenarios) considerably reduces network delay with respect to the other deployment solutions. As the workload increases, the Hybrid approach tends to offload more applications to the cloud, which incurs a higher average network delay. In the experiments, the delay of the Edge scenario is 4 to 5 times less than that of the Cloud scenario, and around half that of CDN.

Processing time represents another critical factor in the total service time. The results of our simulations are depicted in Fig. 9b.

¹ https://support.google.com/youtube/answer/2853702?hl=en


Fig. 9 Average network delay (a), average processing time (b), and average service time (c) experienced by the users in the scenarios described in Section 4.1 [color online]

In the live streaming scenario, processing time is mainly composed of the time needed to prepare the video content to be delivered. Due to the limited processing capabilities of edge nodes, they have the highest processing time and become the system bottleneck as the load increases. Essentially, in the proposed approach, where each streaming engine instance has to process all the incoming requests from the associated user devices, we experience a faster rise of the processing time, since the scaling possibilities are rather limited due to the constrained resource capabilities of the edge nodes. Therefore, predicting a suitable number of VMs in advance may lead to better results. In the CDN scenario, we obtain a smaller processing time and a less steep rise than in the Edge scenario, since most of the processing is done in the cloud servers. We observe a sudden decrease in processing time for the Hybrid scenario when the edge nodes are overloaded and some processing is offloaded to the resource-rich


cloud servers. The Cloud scenario, as expected, results in the lowest processing time.

As demonstrated by the overall results in Fig. 9c, the Hybrid approach outperforms the other scenarios in most cases. It provides results similar to the Edge scenario when the nodes are not overloaded, and addresses the problem of overloaded nodes in the edge network by dispatching applications to the cloud. It merges the advantages of both the edge and cloud scenarios. Clearly, CDN is not the best solution for latency-sensitive applications if they also require processing power (e.g., video encoding). Yet, it is still a valid solution in other scenarios, for example, if only videos with the same characteristics (bitrate, etc.) are present, as in offline streaming. Cloud likewise performs poorly, since network latency is highly disruptive in the live streaming scenario.

5.2 Comparison of Scheduling Approaches

The use of Edge solutions, however, requires a suitable service scheduling algorithm, as shown in Fig. 10a. Here, both scheduling algorithms are evaluated on the edge deployment; however, edge-based scheduling significantly outperforms the cloud-based baseline in terms of experienced network delay. This is due to the joint consideration of network conditions and user requirements. Figure 10b demonstrates that proactive estimation of the number of VM instances results in lower processing time. However, the network delay improvement overshadows processing time on average, and the service time is shorter in nearly all cases in Fig. 10c. The only exception is when the edge nodes are saturated, as in the case of 1400 users. These results underline the need for scheduling solutions that take edge-specific features into account to obtain an optimal scheduling.

5.3 Analysis of the Impact of User Mobility

In this final experiment, we evaluate the extent to which service quality worsens as users move over time. We particularly focus on the average network delay, which is the metric most severely affected by the increased distance between the user and a static VM placement. We present the results for the edge deployment scenario with the edge-based scheduler and 1000 users in Fig. 11. We consider five mobility scenarios, from very low exploration (a = 0.05), which results in 4.47 transitions per minute, to very high exploration (a = 0.50), which results in 45.09 transitions per minute on average.

Beginning from 42 ms, an increase in network delay is observed for all scenarios, albeit relatively slowly in the low-mobility scenarios. Moreover, the high-mobility scenarios also stabilize under 57 ms of delay, due to the returning behavior of the users as time passes. Based on our analysis, the point of stabilization roughly corresponds to the time taken by every user to make at least one transition (e.g., 22.2 minutes for a = 0.50 or 37.5 minutes for a = 0.30). Despite mobility, the average network delay of the proposed algorithm is significantly better than cloud-based scheduling (around 75 ms) or CDN deployment (around 89 ms). For optimal performance, and particularly for delay-sensitive services, we recommend periodic re-execution of the scheduling algorithm, which would yield an updated mapping between resources and services. The execution time of Algorithm 1 is negligible (on the order of seconds) with respect to the re-execution period (possibly on the order of minutes or hours). However, for environments with a very high level of mobility and with network delays between the edge nodes (e.g., vehicular services), other scheduling techniques that focus on elasticity and user mobility would be more suitable. We discuss such works from the literature in Section 6.

6 Related Work

Edge computing refers to the set of technologies (i.e., Mobile Edge Computing [19], Fog Computing [22], and Cloudlets [50]) that perform computation offloading and data storage at the edge of the network, aiming at the reduction of end-user network delay, bandwidth consumption in the core network, and energy consumption [37]. The development of these new technologies is mainly driven by the advent of Internet of Things (IoT) services (e.g., real-time analytics and smart city platforms), which generate a massive volume of data to be analyzed that can overwhelm current cloud-based solutions and are characterized by very strict latency requirements [11]. The edge computing paradigm has proven effective in many scenarios [51]; however, there still exist open research challenges, such as the scheduling of edge computing services.

692

Page 17: Addressing Application Latency Requirements …...such Twitter (Periscope), Facebook (Facebook Live), Google (Youtube Mobile Live), and IBM (Ustream, IBMCloudVideo).Figure1depictsthemajorcompo-

Addressing Application Latency Requirements through Edge Scheduling

Fig. 10 Comparison of the average network delay a average processing time b and average service time c in the edge deploymentscenario using a cloud scheduler and the edge scheduler we developed [color online]

to schedule a task either on the mobile device orlocal/internet cloud. Nevertheless, there is no studythat deals with the actual service scheduling on edgenodes, to the best of our knowledge. In the edge area,instead, some preliminary works have been carried outas reported by [39] but it focuses on the optimiza-tion of response time between service components,

without taking into account service users. Aazam andHuh [1] define a resource estimation and pricingmodel for IoT, still without providing a schedulingapproach for the service instance placement.

A preliminary evaluation and future research directions for scheduling approaches in edge scenarios are presented by [9]. That work mainly focuses on user mobility and on the consequent elasticity requirements for service placement, proposing three mobility-aware scheduling algorithms; it also proposes prioritizing low-delay applications in order to improve application execution. Other works in this direction also consider handover mechanisms, coping with user mobility and unnecessary handovers via probabilistic [54] and fuzzy logic-based [7] approaches. Sun et al. [44], on the other hand, add energy efficiency to the equation, considering the limited battery power of typical mobile users; they propose near-optimal mobility management and handover algorithms based on Lyapunov optimization. As distinct from the aforementioned works, Plachy et al. [30] utilize mobility prediction of individual users for the proactive provisioning of VMs and communication paths. The live video streaming use case is considered in [34], where the authors aim to optimize bandwidth availability in community networks through a heuristic algorithm. Our approach differs in that we optimize not only the delivery of the data but also their processing (e.g., transcoding and packaging operations for the video streaming use case); consequently, our model accounts for the resource capacity of the nodes. We also consider user mobility in our evaluation.

The service scheduling problem, on the other hand, is widely studied in the cloud computing context [38]. Several scheduling approaches have been applied to cloud computing, e.g., [40]; however, they focus on single providers and rely on a reduced number of variables. In the area of distributed clouds, Papagianni et al. [26] propose a framework for the efficient mapping of VM user requests onto an aggregate set of connected clouds. They model the mapping problem as a mixed integer program aiming to minimize the mapping costs and the number of hops among the VMs. Again in the area of distributed clouds, Konstanteli et al. [21] propose a novel approach to the problem of service allocation in a federated cloud environment for horizontally scalable services. They define a method that allocates service component replicas taking into account the maximum service requirements in terms of computing, networking, and storage resources; a set of affinity and anti-affinity rules for deploying replicas in the same node or subnet; and the federated infrastructure costs. They model the allocation problem as a Mixed-Integer Linear Programming optimization problem and then implement a heuristic solver that yields near-optimal solutions in a very short time. Pittaras et al. [29] develop an approach for the efficient mapping of virtual networks onto a multi-domain network substrate. They devise a semantic-based approach for the mapping of requests to real network subnets, which minimizes the number of hops between selected nodes. Aral and Ovatman [4] propose a novel approach to the problem of mapping virtual networks, defined by a set of interconnected VMs (i.e., service replicas), onto a real infrastructure in a distributed cloud. Their solution strives to reduce network latency and optimize bandwidth utilization. It follows a topology-based mapping approach and allocates the virtual network, defined by the connections among service components, onto a cloud subnet whose topology is isomorphic to the virtual one.

Many works also target QoS-aware service scheduling in the context of a single cloud provider and are thus more focused on the optimization of service placement within a single data center. Duong et al. [13] propose an integrated approach that combines provisioning and scheduling techniques in order to satisfy both the provider's revenue and consumers' requirements. They devise a provisioning algorithm, based on queuing theory, for determining the number of VMs to be deployed in order to minimize user waiting time. Similarly, a rule-based resource provisioning algorithm that employs fuzzy logic is proposed in [20] to enable qualitative specification of elasticity rules; this work also features an elasticity controller, which predicts and copes with changes in the workload. Zeng et al. [53] develop an approach that optimizes content distribution among servers in a data center by jointly optimizing request routing and content placement. Essentially, they devise a method that optimizes the usage of storage and network capacity based on blocking probability, so as to avoid the starvation of service users. Piao and Yan [28], instead, present a VM placement approach for data-intensive applications that aims to minimize the transfer time between the data centers hosting the data and the VMs running the services; their solution accounts for the data transfer time between the hosting data centers and the VMs, and defines a VM placement that minimizes the overall system transfer time.

However, none of these approaches can be directly applied to VM placement in edge computing, since they do not support the specific characteristics of the edge paradigm. For example, VM placement solutions for federated clouds that minimize inter-node network latency represent only a partial solution for the placement of latency-sensitive applications at the edge. Indeed, these solutions are not location-aware, that is, they do not take into account user location, which is a critical feature of the edge paradigm. A VM placement that minimizes only the inter-node delay yields sub-optimal results, since the user-node latency, which accounts for a large part of the overall service delay, impedes user experience. Moreover, they are not resource-aware; given the limited capabilities of edge nodes, placing a VM on an edge node that cannot provide enough resources to cope with workload peaks incurs high service time, due either to high processing time or to the additional delay added by a high rate of VM migrations.

7 Conclusion

Effective scheduling of services in edge computing scenarios is vital due to strict latency requirements and limited resources. Sub-optimal service placement and scheduling may result in significantly lower quality of service and lower utilization of resources. To cope with these issues, we focus on the edge service scheduling problem and develop a service-driven approach that maximizes the service quality experienced by users by deploying services on the most suitable VMs in terms of computational and network resources.

Our proposal is a score-based algorithm that works in two stages. First, it verifies the eligibility of each available VM type according to the service's network and computational requirements as well as its reliability, and assigns each VM type a quality score denoting its suitability to host the service to be scheduled. Then, the services are assigned so as to maximize the total score of the chosen VMs, thus improving service quality for end users. To validate the proposed approach, we evaluated the average response and processing times as well as the network delay experienced by users in the Edge, CDN, Cloud, and Hybrid (Edge and Cloud) scenarios. The results demonstrate the validity of our scheduling approach within the scope of the edge computing paradigm. Indeed, the promised benefits of edge computing can only be achieved when effective scheduling algorithms that consider its peculiar features are implemented. We believe that our work will bring more industry and research attention to the barriers to the adoption of edge computing.
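
As an illustration of the two-stage structure described above, the following Python sketch filters eligible VM types and then assigns services greedily by score. The eligibility predicate, the scoring weights, and the greedy assignment are hypothetical simplifications for exposition only; the actual scoring and the optimal assignment are those defined by Algorithm 1.

```python
from typing import Dict, List

def schedule(services: List[dict], vms: Dict[str, dict]) -> Dict[str, str]:
    """Two-stage score-based scheduling sketch.

    Stage 1: keep only VM types that satisfy a service's network,
    computational, and reliability requirements, and score each pair.
    Stage 2: assign services to VMs to maximize the total score of the
    chosen VMs (approximated greedily here; solved optimally in the paper).
    """
    def eligible(svc: dict, vm: dict) -> bool:
        return (vm["latency_ms"] <= svc["max_latency_ms"]
                and vm["cpu"] >= svc["min_cpu"]
                and vm["reliability"] >= svc["min_reliability"])

    def score(svc: dict, vm: dict) -> float:
        # Hypothetical quality score rewarding low latency, spare CPU
        # capacity, and high reliability relative to the requirements.
        return (0.5 * svc["max_latency_ms"] / vm["latency_ms"]
                + 0.3 * vm["cpu"] / svc["min_cpu"]
                + 0.2 * vm["reliability"])

    candidates = sorted(
        ((score(s, v), s["id"], vid)
         for s in services for vid, v in vms.items() if eligible(s, v)),
        reverse=True)

    assignment: Dict[str, str] = {}
    used: set = set()
    for _, sid, vid in candidates:
        if sid not in assignment and vid not in used:
            assignment[sid] = vid
            used.add(vid)
    return assignment
```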

As future work, we will enhance the developed solution by decentralizing the optimization process, adding another decision output for the number of service instances, and considering vertical and horizontal scaling. Moreover, we plan to investigate integrating our solution with Software Defined Networks to better cope with the latency and bandwidth requirements of edge applications.

Acknowledgements This work has been partially funded by the Rucon project (Runtime Control in Multi Clouds), FWF Y904 START-Programm 2015, by the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 838949, and by the Italian National Interuniversity Consortium for Informatics (CINI).

Funding Information Open access funding provided by Austrian Science Fund (FWF).

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

1. Aazam, M., Huh, E.N.: Fog computing micro datacenter based dynamic resource estimation and pricing model for IoT. In: IEEE International Conference on Advanced Information Networking and Applications (AINA), pp. 687–694. IEEE (2015)

2. Aral, A., Brandic, I.: Quality of service channelling for latency sensitive edge applications. In: IEEE International Conference on Edge Computing (EDGE), pp. 166–173. IEEE (2017)

3. Aral, A., Brandic, I.: Dependency mining for service resilience at the edge. In: ACM/IEEE Symposium on Edge Computing, pp. 228–242. ACM (2018)

4. Aral, A., Ovatman, T.: Network-aware embedding of virtual machine clusters onto federated cloud infrastructure. J. Syst. Softw. 120, 89–104 (2016)

5. Aral, A., Ovatman, T.: A decentralized replica placement algorithm for edge computing. IEEE Trans. Netw. Serv. Manag. 15(2), 516–529 (2018)

6. Barabasi, A.L., Albert, R.: Emergence of scaling in random networks. Science 286(5439), 509–512 (1999)

7. Basic, F., Aral, A., Brandic, I.: Fuzzy handoff control in edge offloading. In: IEEE International Conference on Fog Computing. IEEE (2019)

8. Bilal, K., Erbad, A.: Edge computing for interactive media and video streaming. In: International Conference on Fog and Mobile Edge Computing (FMEC), pp. 68–73. IEEE (2017)

9. Bittencourt, L.F., Diaz-Montes, J., Buyya, R., Rana, O.F., Parashar, M.: Mobility-aware application scheduling in fog computing. IEEE Cloud Computing 4(2), 26–35 (2017)

10. Calheiros, R.N., Ranjan, R., Beloglazov, A., De Rose, C.A., Buyya, R.: CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Software: Practice and Experience 41(1), 23–50 (2011)

11. Dastjerdi, A.V., Gupta, H., Calheiros, R.N., Ghosh, S.K., Buyya, R.: Fog computing: principles, architectures, and applications. In: Internet of Things, pp. 61–75. Elsevier (2016)

12. Dobrian, F., Sekar, V., Awan, A., Stoica, I., Joseph, D., Ganjam, A., Zhan, J., Zhang, H.: Understanding the impact of video quality on user engagement. ACM SIGCOMM Computer Communication Review 41(4), 362–373 (2011)

13. Duong, T.N.B., Li, X., Goh, R.S.M., Tang, X., Cai, W.: QoS-aware revenue-cost optimization for latency-sensitive services in IaaS clouds. In: IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications (DS-RT), pp. 11–18. IEEE (2012)

14. Fan, C., Huang, J., Yang, D., Rong, Z.: Modeling POI transition network of human mobility. In: International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, pp. 364–367. IEEE (2016)

15. Gill, S.S., Buyya, R.: Resource provisioning based scheduling framework for execution of heterogeneous and clustered workloads in clouds: from fundamental to autonomic offering. Journal of Grid Computing 17(3), 385–417 (2018)

16. Guo, X., Singh, R., Zhao, T., Niu, Z.: An index based task assignment policy for achieving optimal power-delay tradeoff in edge cloud systems. In: IEEE International Conference on Communications (ICC), pp. 1–7. IEEE (2016)

17. Hess, A., Hummel, K.A., Gansterer, W.N., Haring, G.: Data-driven human mobility modeling: a survey and engineering guidance for mobile networking. ACM Computing Surveys (CSUR) 48(3), 38 (2016)

18. Hu, W., Gao, Y., Ha, K., Wang, J., Amos, B., Chen, Z., Pillai, P., Satyanarayanan, M.: Quantifying the impact of edge computing on mobile applications. In: Proceedings of the 7th ACM SIGOPS Asia-Pacific Workshop on Systems, p. 5. ACM (2016)

19. Hu, Y.C., Patel, M., Sabella, D., Sprecher, N., Young, V.: Mobile edge computing—a key technology towards 5G. ETSI White Paper 11(11), 1–16 (2015)

20. Jamshidi, P., Ahmad, A., Pahl, C.: Autonomic resource provisioning for cloud-based software. In: International Symposium on Software Engineering for Adaptive and Self-Managing Systems, pp. 95–104. ACM (2014)

21. Konstanteli, K., Cucinotta, T., Psychas, K., Varvarigou, T.A.: Elastic admission control for federated cloud services. IEEE Transactions on Cloud Computing 2(3), 348–361 (2014)


22. Mahmud, R., Kotagiri, R., Buyya, R.: Fog computing: a taxonomy, survey and future directions. In: Internet of Everything, pp. 103–130. Springer (2018)

23. Mao, Y., Zhang, J., Letaief, K.B.: Dynamic computation offloading for mobile-edge computing with energy harvesting devices. IEEE Journal on Selected Areas in Communications 34(12), 3590–3605 (2016)

24. Medina, A., Lakhina, A., Matta, I., Byers, J.: BRITE: an approach to universal topology generation. In: Ninth International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp. 346–353. IEEE (2001)

25. Pang, J., Hendricks, J., Akella, A., De Prisco, R., Maggs, B., Seshan, S.: Availability, usage, and deployment characteristics of the domain name system. In: Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement, pp. 1–14. ACM (2004)

26. Papagianni, C., Leivadeas, A., Papavassiliou, S., Maglaris, V., Cervello-Pastor, C., Monje, A.: On the optimal allocation of virtual resources in cloud computing networks. IEEE Trans. Comput. 62(6), 1060–1071 (2013)

27. Pathan, A.M.K., Buyya, R.: A taxonomy and survey of content delivery networks. Grid Computing and Distributed Systems Laboratory, University of Melbourne, Technical Report (2007)

28. Piao, J.T., Yan, J.: A network-aware virtual machine placement and migration approach in cloud computing. In: International Conference on Grid and Cooperative Computing (GCC), pp. 87–92. IEEE (2010)

29. Pittaras, C., Papagianni, C., Leivadeas, A., Grosso, P., van der Ham, J., Papavassiliou, S.: Resource discovery and allocation for federated virtualized infrastructures. Futur. Gener. Comput. Syst. 42, 55–63 (2015)

30. Plachy, J., Becvar, Z., Strinati, E.C.: Dynamic resource allocation exploiting mobility prediction in mobile edge computing. In: IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications, pp. 1–6. IEEE (2016)

31. Satyanarayanan, M.: The emergence of edge computing. Computer 50(1), 30–39 (2017)

32. Scoca, V., Uriarte, R.B., De Nicola, R.: Smart contract negotiation in cloud computing. In: 2017 IEEE 10th International Conference on Cloud Computing (CLOUD), pp. 592–599. IEEE (2017)

33. Scoca, V., Aral, A., Brandic, I., De Nicola, R., Uriarte, R.B.: Scheduling latency-sensitive applications in edge computing. In: CLOSER, pp. 158–168 (2018)

34. Selimi, M., Cerda-Alabern, L., Freitag, F., Veiga, L., Sathiaseelan, A., Crowcroft, J.: A lightweight service placement approach for community network micro-clouds. Journal of Grid Computing 17(1), 169–189 (2019)

35. Shamsi, J., Khojaye, M.A., Qasmi, M.A.: Data-intensive cloud computing: requirements, expectations, challenges, and solutions. Journal of Grid Computing 11(2), 281–310 (2013)

36. Sharifi, L., Cerda-Alabern, L., Freitag, F., Veiga, L.: Energy efficient cloud service provisioning: keeping data center granularity in perspective. Journal of Grid Computing 14(2), 299–325 (2016)

37. Shi, W., Cao, J., Zhang, Q., Li, Y., Xu, L.: Edge computing: vision and challenges. IEEE Internet of Things Journal 3(5), 637–646 (2016)

38. Singh, S., Chana, I.: A survey on resource scheduling in cloud computing: issues and challenges. Journal of Grid Computing 14(2), 217–264 (2016)

39. Skarlat, O., Nardelli, M., Schulte, S., Dustdar, S.: Towards QoS-aware fog service placement. In: IEEE 1st International Conference on Fog and Edge Computing (ICFEC), pp. 89–96. IEEE (2017)

40. Song, W., Xiao, Z., Chen, Q., Luo, H.: Adaptive resource provisioning for the cloud using online bin packing. IEEE Trans. Comput. 63(11), 2647–2660 (2014)

41. Sonmez, C., Ozgovde, A., Ersoy, C.: EdgeCloudSim: an environment for performance evaluation of edge computing systems. In: International Conference on Fog and Mobile Edge Computing (FMEC), pp. 39–44. IEEE (2017)

42. Sripanidkulchai, K., Maggs, B., Zhang, H.: An analysis of live streaming workloads on the internet. In: ACM SIGCOMM Conference on Internet Measurement, pp. 41–54. ACM (2004)

43. Stanciu, A.: Blockchain based distributed control system for edge computing. In: 2017 21st International Conference on Control Systems and Computer Science (CSCS), pp. 667–671. IEEE (2017)

44. Sun, Y., Zhou, S., Xu, J.: EMM: energy-aware mobility management for mobile edge computing in ultra dense networks. IEEE J. Sel. Areas Commun. 35(11), 2637–2646 (2017)

45. Tuli, S., Mahmud, R., Tuli, S., Buyya, R.: FogBus: a blockchain-based lightweight framework for edge and fog computing. J. Syst. Softw. 154, 22–36 (2019)

46. Uriarte, R.B., De Nicola, R.: Blockchain-based decentralised cloud/fog solutions: challenges, opportunities and standards. IEEE Communications Standards Magazine 2(3), 22–28 (2018)

47. Uriarte, R.B., Tiezzi, F., De Nicola, R.: Dynamic SLAs for clouds. In: European Conference on Service-Oriented and Cloud Computing, pp. 34–49. Springer (2016)

48. Uriarte, R.B., Tiezzi, F., Tsaftaris, S.A.: Supporting autonomic management of clouds: service clustering with random forest. IEEE Trans. Netw. Serv. Manag. 13(3), 595–607 (2016)

49. Uriarte, R.B., De Nicola, R., Scoca, V., Tiezzi, F.: Defining and guaranteeing dynamic service levels in clouds. Futur. Gener. Comput. Syst. 99, 27–40 (2019)

50. Verbelen, T., Simoens, P., De Turck, F., Dhoedt, B.: Cloudlets: bringing the cloud to the mobile user. In: ACM Workshop on Mobile Cloud Computing and Services, pp. 29–36. ACM (2012)

51. Wang, S., Zhang, X., Zhang, Y., Wang, L., Yang, J., Wang, W.: A survey on mobile edge networks: convergence of computing, caching and communications. IEEE Access 5, 6757–6779 (2017)

52. Yu, C., Lumezanu, C., Sharma, A., Xu, Q., Jiang, G., Madhyastha, H.V.: Software-defined latency monitoring in data center networks. In: International Conference on Passive and Active Network Measurement, pp. 360–372. Springer (2015)


53. Zeng, L., Veeravalli, B., Wei, Q.: Space4time: optimization latency-sensitive content service in cloud. J. Netw. Comput. Appl. 41, 358–368 (2014)

54. Zhang, H., Qiu, Y., Chu, X., Long, K., Leung, V.C.: Fog radio access networks: mobility management, interference mitigation, and resource optimization. IEEE Wirel. Commun. 24(6), 120–127 (2017)

55. Zhao, T., Zhou, S., Guo, X., Zhao, Y., Niu, Z.: A cooperative scheduling scheme of local cloud and internet cloud for delay-aware mobile cloud computing. In: IEEE Globecom Workshops, pp. 1–6. IEEE (2015)

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
