IEEE TRANSACTIONS ON SERVICES COMPUTING, MANUSCRIPT ID 1 Energy Efﬁcient Scheduling ...bbcr.uwaterloo.ca › ~xshen › paper › 2015 › eesamf.pdf · 2020-01-23 · request scheduling

1939-1374 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/TSC.2015.2444845, IEEE Transactions on Services Computing

IEEE TRANSACTIONS ON SERVICES COMPUTING, MANUSCRIPT ID 1

Energy Efficient Scheduling and Managementfor Large-Scale Services Computing Systems

Ying Chen, Student Member, IEEE , Chuang Lin, Senior Member, IEEE ,Jiwei Huang, Member, IEEE , Xudong Xiang, Student Member, IEEE ,

and Xuemin (Sherman) Shen, Fellow, IEEE

Abstract —With the increasing popularity of services published online, energy consumption of services computing systems isgrowing dramatically. Besides Quality of Service (QoS), energy efficiency has become an important issue and drawn significantattention. However, energy efficient request scheduling and service management for large-scale services computing systemsface challenges because of the high dynamics and unpredictability of request arrivals. In this paper, we jointly consider theconflicting metrics of performance, queue congestion and energy consumption. We propose a distributed online schedulingand management algorithm which does not require any priori statistical knowledge of request arrivals. Mathematical analysisis conducted which demonstrates that our algorithm can achieve arbitrary tradeoff between performance and energy efficiency.Numerical and real trace data based experiments are carried out to validate the effectiveness of our algorithm in optimizingenergy efficiency while stabilizing the system.

Index Terms —services computing; request scheduling; service management; energy efficiency; QoS

✦

1 INTRODUCTION

S ERVICES represent a type of relationships-basedinteractions between service providers and service

consumers to achieve a certain business goal or solu-tion objective [1]. Services Computing, as an emergingcross-discipline covering various aspects of businessand IT services, is to create, operate, manage andoptimize these services in a well-defined architecturefor higher flexibility facing future business dynamics.With the increasing presence and adoption of serviceson the World Wide Web, there is a growing popularityof third-party commodity clusters and cloud environ-ments providing services with different functionali-ties. As the number of functionally similar servicesavailable on the Internet increases rapidly, Qualityof Service (QoS) is employed for describing non-functional characteristics of services [2]. Therefore,upon arrival of service requests, how to select theoptimal execution plan that maximizes the end-to-

• Ying Chen and Chuang Lin are with the Department of ComputerScience and Technology, Tsinghua University, Beijing 100084, China.E-mail: [email protected], [email protected]

• Jiwei Huang is with the State Key Laboratory of Networking andSwitching Technology, Beijing University of Posts and Telecommu-nications, Beijing 100876, China.E-mail: [email protected]

• Xudong Xiang is with the Department of Computer Science andTechnology, University of Science and Technology Beijing, Beijing100083, China.E-mail: [email protected]

• Xuemin (Sherman) Shen is with the Department of Electrical andComputer Engineering, University of Waterloo, Waterloo, ON N2L3G1, Canada.E-mail: [email protected]

end QoS has drawn significant attention, and therehave been many research efforts with regard to suchproblem in services computing systems [3], [4].

The development and management of services com-puting systems has been focused on the improvementof performance, such as response time, throughput,reliability, etc. Recently, however, more and moreattention has been paid to energy consumption, es-pecially in large-scale computing infrastructures suchas service systems, cloud systems and data centers [5],[6]. It is reported that the energy consumed by datacenters in the US is more than 1.5 percent of the totalenergy consumption in the country, and the numberis still increasing [7]. It is also estimated that theenergy consumption of a Google search is 0.0003kWhon average, resulting in more than 32 million kWhenergy consumed every year [8]. How to improveenergy efficiency has become an important issue inservice systems.

There are several approaches for reducing energyconsumption in services computing systems: (1) thefirst approach is energy efficient request dispatch.It has been shown in [9] and [10] that, there is asignificant opportunity promoting energy efficiencythrough effective request dispatch and allocation; (2)the second approach to reduce energy consumptionis service management at service level, which mon-itors the workload and utilization information, andswitches the service or its hosting virtual machinebetween active and idle states accordingly [11]; and(3) the third approach is power management at thehardware level, e.g., Dynamic Voltage and FrequencyScaling (DVFS). DVFS is one of the most popular



2 IEEE TRANSACTIONS ON SERVICES COMPUTING, MANUSCRIPT ID

techniques for power management in computer sys-tems [12]. In this paper, we jointly consider the threeapproaches to improve energy efficiency in servicescomputing system, which is unexplored in previousliterature.

Furthermore, existing research works generally pro-mote QoS and energy efficiency based on assumptionor prediction of request arrival or QoS properties [13],[14], [15]. In reality, the accuracy of such predictionscan hardly be guaranteed. Because of the burstinessand fluctuations of request arrivals, it is extremelyhard or even impossible to precisely predict the ar-rival statistics of service requests [11], [16]. Moreover,with the growing popularity of services on the In-ternet, traditional centralized optimization techniquessuch as combinatorial optimization and dynamic pro-gramming may not be applicable in large-scale servicesystems because of their high time-complexity.

To address these challenges, we introduce a dis-tributed online energy efficient request schedulingand service management mechanism without requir-ing a priori knowledge of any statistical informa-tion of requests arrivals. In specific, we formulatethe request scheduling and service management asan optimization problem, which can be widely ap-plied to classical research problems, such as serviceselection. We develop a general model consideringthe decisions of request dispatch, service manage-ment and DVFS to improve energy efficiency froma systematical point of view. Based on Lyapunovoptimization techniques, we propose a DistributedOnline Scheduling and Management algorithm calledDOSM. With an arbitrary parameter V to control thetradeoff between energy efficiency and queue length,the DOSM algorithm is proven to be O(1/V )-optimalwith respect to the average energy efficiency whilebounding the queue length by O(V ). Furthermore, weconduct both numerical experiments and real tracebased simulations to demonstrate the effectivenessand efficiency of DOSM algorithm. Besides, sensitivityanalysis is performed which can provide guidance forsystem management and optimization.

Our preliminary work was presented in the ICWS2014 [17]. The main differences are as follows. (1) Ourrequest scheduling and service management model ismore generic than the conference version. The appli-cability of our model in services computing systemis discussed in detail. Specifically, the relationshipbetween our model and the traditional service selec-tion is studied. (2) We combine dynamic voltage andfrequency scaling which reduces the energy consump-tion at the hardware level to achieve a comprehensiveoptimization of energy efficiency in service systems.(3) The algorithm has been modified. In the worstcase, the time complexity of the algorithm has beenreduced from O(n + 2m) to O(n + Nm), which ispolynomial time. (4) We redo the experiments andconduct real trace based simulations to further val-

idate the effectiveness of our model and algorithmin complex and real world situations. (5) Sensitivityanalysis based on real trace is carried out whichcan find effective ways for system management andoptimization.

The remainder of the paper is organized as follows.Section 2 introduces related work. In Section 3, weformulate the energy efficient request scheduling andservice management problem. In Section 4, based onLyapunov optimization techniques, we propose a dis-tributed online algorithm. Optimality analysis of ouralgorithm is also provided. In Section 5, we conductnumerical experiments and real trace based simula-tions to verify its effectiveness. Sensitivity analysis isalso performed. We conclude the paper in Section 6.

2 RELATED WORK

With the increasing energy consumption associatedwith IT and service systems, energy efficiency hasdrawn significant attention in both the computing andmobile services [6], [9], [18]. There are several ap-proaches to reduce energy consumption. In this paper,we consider three approaches, i.e., request dispatch,service management and DVFS [5], [10], [19].

Effective request dispatch is to make full use ofservices diversity, and distribute the requests to theappropriate services in order to meet the requirementand reduce energy consumption at the same time [20].It has been shown that the energy efficiency can beimproved significantly by effective request dispatchand allocation [9]. Gao et al. [21] proposed a flow op-timization based framework for request dispatch andengineering, which monitored the request workloadinformation and dynamically controlled the dispatchof user requests. Also, in our previous work [14],we proposed an energy efficient approach for servicerequest dispatch, and proved that an effective requestdispatch mechanism can significantly reduce energyconsumption in large-scale service systems. Lin et al.[22] discussed environmental potential of workloaddispatch and balancing based on the “receding hori-zon control” (RHC) algorithm, and studied how tointroduce variants of RHC in the face of heterogeneityof delays, servers, and electricity prices. Adnan et al.[23] took advantage of geographical load balancingand combined the flexibility from the Service LevelAgreements (SLAs) to differentiate among workloadsfor cost savings. They showed that significant costsavings can be achieved by dispatch of workload withfuture electricity price prediction. For our work, we donot require the prediction on electricity price, since theaccuracy of prediction can hardly be guaranteed.

Service management is an effective approach toreduce energy consumption at service level. It controlsthe states of services on servers by switching the ser-vices (or the virtual machines that host them) betweenactive and idle states according to their workload



CHEN et al.: ENERGY EFFICIENT SCHEDULING AND MANAGEMENT FOR LARGE-SCALE SERVICES COMPUTING SYSTEMS 3

and utilization [16]. It has been demonstrated thatservice management can improve system utilizationand achieve the desired tradeoff between energy costand performance. Zhou et al. [24] took advantage ofservice management and presented an optimizationframework for service platforms, which was able toreduce energy and achieve performance guarantees.Beloglazov et al. [25] proposed service managementand allocation algorithm for energy-efficient manage-ment of service systems. The algorithm could reducepower consumption while satisfying QoS constraints.Jang et al. [26] proposed techniques for reducingenergy consumption by tuning the services with lowworkload into standby state.

Another way to reduce energy consumption isDVFS at the hardware level [5]. Equipped by mod-ern hardware computer components, dynamic speedscaling is helpful for improving energy efficiency,which adjusts the CPU frequencies of the server [12].Another popular technique, i.e., dynamic voltage scal-ing, is commonly used in conjunction with frequencyscaling [5]. It has been shown in [12] that power con-sumption can be dramatically cut down by decreas-ing the CPU voltage and frequency simultaneously.Such combined technique, called DVFS, has beenwidely used and supported by most modern CPUs.For example, most Intel CPUs have been equippedwith the techniques of SpeedStep and TurboBoost,adjusting CPU’s voltage and speed to its utilization[27]. The other famous CPU manufacturer, AMD, hasalso equipped their CPUs with the similar mecha-nisms, such as PowerNow! [28]. Guerra et al. [29]used dynamic voltage scaling techniques and Vary-OnVary-Off configuration to improve energy efficiency.Wierman et al. [15] discussed how to reduce powerconsumption by DVFS method. They assumed therequest arrival as a Poisson process and modeled thesystem by an M/GI/1 queue. In our work, we do notrequire any statistical information of requests arrivals.

Although the request dispatch, service managementand DVFS approaches have been well studied sepa-rately by the research community, our work jointlyconsiders the three approaches in services computingsystems and present a comprehensive optimizationframework of energy efficiency in service systems. Wedesign efficient algorithms providing reference valuefor system management and optimization. Extendedfrom our previous work [17], we develop a morecomprehensive model to study energy efficiency. Ourmodel is applicable to the traditional service selectionwhich is one of the most popular problems in the ser-vices computing community. Different from previousworks which focused on local optimization at eachtime period without considering the variations acrosstime [30], we guarantee the average performance andcost over a long-time period.

Fig. 1. Framework for energy efficient scheduling andmanagement.

3 SYSTEM MODEL

In this section, we formulate the request schedulingand service management problem and discuss theapplicability of our formulation in services computingsystems.

3.1 System Architecture

Fig. 1 shows a framework for energy efficient schedul-ing and management in a general services computingsystem. There are multiple services deployed on dif-ferent servers, and many of them are duplicated forthe following two reasons. First, some of the serviceswith the same functionality are published by differentservice providers. Second, even one service providermay deploy multiple duplications of a service ondifferent servers to guarantee the service availabilityto the users [31].

The brokers are responsible for service discovery,status monitor and QoS management. Upon arrival ofuser requests, the brokers search published servicesthat can fulfill the functionalities and QoS require-ments of these requests, and dispatch requests tothe appropriate services [32]. There is a managerfor each of the servers, managing the status of theservices while determining the operational status ofthe servers (such as CPU voltage and server frequencyusing DVFS techniques).

3.2 Problem Formulation

Consider the services computing system with m cate-gories of services denoted by a set I . Each category ofservices has candidates deployed on different serversand the set of all service candidates is denoted asC. The set of service candidates that are appropriateto accomplish service category i ∈ I is expressed




TABLE 1Notations and Definitions.

Notation Definition

I set of services.

C set of service candidates.

J set of servers.

κ service request.

Ai(t) number of requests for service i that arrive in slott.

Dij(t) number of requests dispatched to service candi-date i on server j in slot t.

yij(t) binary variable denoting the state of service can-didate i on server j in slot t.

uj(t) running frequency of server j in slot t.

QoSwκ (t) w-th QoS requirement of request κ in slot t.

qoswκ (t) w-th QoS value of request κ in slot t.

f(t) system profit in slot t.

rij(t) reward of a request on service candidate i onserver j in slot t.

Pj(t) power consumption of server j in slot t.

ϕ(t) price of unitary energy consumption in slot t.

τ slot length.

Qij(t) queue length of service candidate i on server j inslot t.

by Ci. Let J = {1, 2, . . . , n} denote the set of allphysical servers in the services computing system. Inorder to make our approach more general, we assumethe physical servers to be heterogeneous. The mainnotations in this paper are listed in Table 1.

3.2.1 Service Request Model

Each service request is characterized by a tuple κ =〈i, di, Oi〉, where i ∈ I denotes the category of servicethat the request belongs to, and di denotes the de-mand (i.e., job length or hardware resources) of suchrequest. Oi is the set of servers where the services aredeployed. K represents the set of all the requests.

We model the system as a discrete time-slottedsystem where the length of each time period τ canrange from milliseconds to minutes. Let Ai(t) denotethe number of requests for service i arrived in timeslot t.

We highlight the generality and practicability ofour model that it does not necessarily require a prioriknowledge of any statistical information of Ai(t), whichis generally unpredictable and difficult to obtain inpractice.

3.2.2 Request Scheduling and Service Management

Let Bm×n = [bij ]m×n represent the mapping matrixof services on servers, where bij = 1 represents thatservice i is deployed on server j, otherwise bij = 0.The problem of service deployment or placement isout of the scope of this paper, and the mapping matrixBm×n is assumed to be predefined. In each time slott, the service brokers make decisions to select the

appropriate service candidates to serve the requests,and decide how many requests to be distributed tothe services. Let Dij(t) denote the number of requestsdispatched to service candidate i on server j. Forthose j with bij = 0, the variables Dij(t) are set tobe zero.

Our approach also determines the state of servicei (or its hosting virtual machine) on server j ineach time slot t, and can switch it between activestate and idle state in response to time-varying sys-tem workload, in order to arbitrate the performance-energy tradeoff. Let yij(t) be the indicator variableof the state of service i on server j in slot t, whereyij(t) = 1 means service i on server j is active,otherwise yij(t) = 0.

As the voltages of computer components are closelyrelated to the server speed and dynamic voltagescaling is usually applied in conjunction with speedscaling [5], we let uj(t) be the DVFS decision variable,standing for the running frequency of server j in slott. The set of all available frequencies for server j isrepresented by Uj .

In services computing systems, users usually havepreferences and QoS requirements towards the ser-vices they are offered, and thus Service Level Agree-ment (SLA) should be carefully considered. Let Wrepresent the set of indices referring to different QoSproperties, including reliability, availability, integrity,etc. We use the superscript w to index QoS properties.Let QoSw

κ (t) represent the w-th QoS requirement ofrequest κ in slot t. qoswκ (t) stands for the w-th QoSvalue of request κ. We only consider the cases thatlarger QoS values represent better quality, since theother cases can be transformed by taking the oppositevalues. Thus it should be satisfied that qoswκ (t) ≥QoSw

κ (t), ∀κ ∈ K, ∀w ∈ W .We aim to maximize the system profit in the ser-

vices computing system, which is a function of thedecision variables Dij(t), yij(t) and uj(t), denotedby f(Dij(t), yij(t), uj(t)). Then the problem can beformulated as

maxDij(t),yij(t),uj(t)

f(Dij(t), yij(t), uj(t)); (1)

subject to∑

j∈J

Dij(t) = Ai(t), ∀i ∈ I, ∀t ∈ N; (2)

yij(t) ∈ {0, 1}, ∀i ∈ I, ∀j ∈ J, ∀t ∈ N; (3)

uj(t) ∈ Uj , ∀j ∈ J, ∀t ∈ N; (4)

qoswκ (t) = qoswκ (Dij(t), yij(t), uj(t))

≥ QoSwκ (t), ∀κ ∈ K, ∀w ∈ W. (5)

3.3 Model Applicability

Many service providers offer services that are func-tionally similar but differ in non-functional properties.When a request is submitted, how to choose the




appropriate candidate service to provide the requiredfunctionality and satisfy the requirements has becomean important problem. This issue, known as “serviceselection”, has been widely studied by the servicescomputing community [1]. Here, we discuss howservice selection is related to our request schedulingand service management problem.

Consider a discrete time-slotted system and thetotal requests are classified into |I| categories. Letxκij(t) represent the service selection decision variable

of request κ in time slot t, where xκij(t) = 1 means that

service candidate i on server j is selected, otherwisexκij(t) = 0. The profit of request κ after service se-

lection is denoted by g(xκij(t)) and the total requests’

profit is∑

i

∑

j

∑

κ

g(xκij(t)). Then the service selection

problem can be formulated as

maxxκij(t)

∑

i∈I

∑

j∈J

∑

κ∈K

g(xκij(t)); (6)

subject to

xκij(t) ∈ {0, 1}, ∀κ ∈ K, ∀i ∈ I, ∀j ∈ J, ∀t ∈ N; (7)n∑

j=1

xκij(t) = 1, ∀κ ∈ K, ∀i ∈ I, ∀t ∈ N; (8)

qoswκ (t) = qoswκ (xκij(t)) ≥ QoSw

i (t), ∀κ ∈ K, ∀w ∈ W.(9)

We elucidate that service selection can be applicablein our model. When considering service selectionwithout service management and DVFS, the objectiveis deduced to maximize the system profit f(Dij(t))with single argument Dij(t). Let h(Dij(t)) denote theprofit obtained by each service i on server j in timeslot t, so we have f(Dij(t)) =

∑

i

∑

j

h(Dij(t)). We

present Theorem 1 which elucidates the connectionof our model with service selection.

THEOREM 1: Our request scheduling model isequivalent to service selection if the functions h(·) andg(·) satisfy (10),

h(kx) = kg(x). (10)

Therefore, it can be seen that our request schedul-ing and service management model is general andthe traditional service selection can be regarded aspart of the considered problem. Interested readers arereferred to our detailed proof for Theorem 1 in theappendix.

3.4 Model Specification

In this subsection, we refine our request schedulingand service management problem in detail and de-velop a more specific model to describe the systemproperties. We try to maximize energy efficiency whilebounding the queue length.

The reward of services is a critical concern in ser-vices computing systems. Let rij(t) denote the reward

obtained by a request served by service i on server j intime slot t, and thus we obtain ri(t) =

∑

j rij(t)Dij(t)as the total reward of service i.

According to [5], the power consumption of aserver consists of two parts, static power and dynamicpower. For server j, the power consumed in time slott can be expressed by (11), where Psj represents thestatic power, k and α (α ≥ 1) are constant, ρj ∈ [0, 1]and uj represent the utilization and the frequency ofthe server, respectively.

Pj(t) = Psj + kρj(t)uαj (t). (11)

Define sij ∈ [0, 1] as the ratio of the processingrequirement from service i to the the total processingcapacity of its hosting server j. It should be satisfiedthat

∑

i sij = 1, ∀j ∈ J . The utilization of server j intime slot t can be expressed as ρj(t) =

∑

i sijyij(t).Besides physical servers, other facilities such as

cooling systems also consume a fraction of powerin large-scale service systems. The ratio of powerconsumed by the entire service system to powerconsumed by servers is denoted by Power Usage Effi-ciency (PUE), which is commonly a constant in a cer-tain large-scale computing system [33]. So the powerconsumption of the whole services computing systemin time slot t can be expressed as PUE·

∑

j Pj(t).Define ϕ(t) as the price of unitary energy consump-tion in time slot t, and the energy cost of the wholeservice system in slot t can be approximated asϕ(t) · PUE

∑

j Pj(t)τ . The profit of the service systemin slot t is expressed as

f(t) =∑

i

∑

j

rij(t)Dij(t)−ϕ(t) ·PUE∑

j

Pj(t)τ. (12)

Unlike previous work focusing on the instanta-neous system profit, we study the average propertywhich can guarantee long-term profit over a largetime horizon. The average profit of the services com-puting system over time slots t = {0, 1, 2, . . . , T − 1}is expressed as below.

f = limT→∞

1

T

T−1∑

t=0

E{f(t)}.

Queueing delay is an important metric and caninfluence the degree of customer satisfaction. Accord-ing to Little’s Law, the queueing delay of a service isin proportion to the number of requests waiting inthe queue, i.e., the queue length. Therefore, we aimto reduce the queue length and achieve low systemcongestion. Optimization of queue length has beenconsidered in many previous works [11], [16], [19].In this paper, we use Qij(t) to represent the queuelength of service i on server j in time slot t.

Let u0j represent the base frequency of server j,

which is assumed to be fixed for each server j. lij isdefined as the number of requests that can be servedin each slot by service i on server j running at base




frequency u0j , and lij is a constant. In each slot t,

for server j running in frequency uj(t), assume thenumber of requests it can serve for service i can be

calculated by lij ·uj(t)

u0

j

. The queue length is updatedas

Qij(t+ 1) = max

[

Qij(t)− lijyij(t)uj(t)

u0j

, 0

]

+Dij(t).

(13)To reduce queueing delay, we try to push the

queues under low congestion states and bound theaverage queue length qij . Let ξ be the bound of queuelength, then

qij = limT→∞

1

T

T−1∑

t=0

E{Qij(t)} ≤ ξ, ∀i ∈ I, ∀j ∈ J, ∃ ξ ∈ R+.

(14)The optimization framework can be formulated as

maxDij(t),yij(t),uj(t)

limT→∞

1

T

T−1∑

t=0

E{f(t)}; (15)

subject to (2), (3), (4), (5), (14).Solving the problem offline is problematic since it

involves unpredictable future information, e.g., ser-vice request arrivals. Besides, as the scale of the ser-vice system increases, it is computationally complexand challenging using centralized solutions. There-fore, we propose a distributed and online algorithmwhich can solve the problem efficiently and effec-tively.

4 OPTIMIZING ENERGY EFFICIENCY

By taking advantage of Lyapunov optimization tech-niques [34], we propose a Distributed Online Schedul-ing and Management algorithm called DOSM, whichmakes control decisions in each time slot to optimizethe long-term average system profit.

4.1 Problem Transformation

Let Θ(t) = (Qij(t)) denote the queue matrix of ser-vices on servers. We define the Lyapunov function as

L(Θ(t)) =1

2

∑

i∈I

∑

j∈J

Q2ij(t), (16)

where L(Θ(t)) reflects the queue congestion state ofthe services computing system at the beginning oftime slot t. To maintain system stability and reducequeueing delay, we try to reduce L(Θ(t)) and push theservices computing system towards a low congestionstate.

We define the conditional Lyapunov drift as follows.

∆(Θ(t)) = E{L(Θ(t+ 1))− L(Θ(t))|Θ(t)}. (17)

Minimizing the drift ∆(Θ(t)) alone would tend topush the queues towards a low congestion state

and reduce the queue length, but the resulting aver-age power expenditure might be unnecessarily large,which would incur a large cost and reduce the totalprofit [11]. Thus there exists tradeoff between thesefactors [34]. Following the Lyapunov framework, wedefine the drift minus profit expressed as ∆(Θ(t)) −VE{f(t)|Θ(t)}. The parameter V > 0 represents thetradeoff between queueing delay and system profit,and can be regarded as a weight placed on the profit.It can be determined by service providers or usersaccording to their requirements in real applications.

In Theorem 2, we show that the drift minus profitcan be upper bounded.

THEOREM 2: (Bounding Drift Minus Profit) In eachslot t, under any algorithm, for all possible values ofΘ(t) and any parameter value of V , if there exits apeak value Amax

i that upper bounds the number ofrequests arrived in each slot, the drift minus profit canbe upper bounded by

∆(Θ(t))− VE{f(t)|Θ(t)}

≤B −∑

i

∑

j

E{V rij(t)Dij(t)−Qij(t)Dij(t)|Θ(t)}

−∑

j

E

{(

∑

iQij(t)lijyij(t)uj(t)

u0

j

−V ϕ(t)PUEPj(t)τ

)∣

∣

∣

∣

∣

Θ(t)

}

,

(18)

where B = 12 [∑

i

∑

j(lijuj

u0

j

)2 +∑

i(Amaxi )2] is a con-

stant.The proof of Theorem 2 is provided in the appendix.

4.2 Distributed Online Algorithm for Energy Effi-cient request scheduling and Service management

We aim to minimize the upper bound of drift minusprofit, as shown by the right-hand-side (R.H.S.) of(18). We propose an online algorithm DOSM, whichcan decompose the problem into subproblems, andsolve the problem in a distributed way. Furthermore,it can be proven that DOSM can approach the opti-mal profit while still bounding the queue length inSection 4.3.

In each slot t, DOSM observes the current queuelength Θ(t) and makes the following control decisionsto minimize the upper bound of drift minus profit:(1) request dispatch, which is to decide how manyrequests to dispatch to each service candidate, and (2)service management and DVFS, which switch servicesbetween active and idle states as well as adjust theserver frequencies. After the decisions are made, thequeue length evolves as (13).

(1) Request Dispatch. In each slot t, DOSM performsrequest dispatch to minimize the second term of theR.H.S of (18), which is equivalent to maximize theopposite of it. Furthermore, the dispatch decisionsof requests for different categories of services areindependent from each other, and the subproblem can




be reduced to the following problem for each i ∈ I ,which can be solved concurrently.

maxDij(t)

∑

j

(V rij(t)−Qij(t))Dij(t); (19)

subject to (2), (5).(19) can be viewed as a generalized max-weight

problem, where in each time slot t the number ofrequests dispatched to service is weighted by V rij(t)−Qij(t). User preferences and QoS requirements arealso considered, and let O′

i(t) be the set of candidatesthat satisfy users requirements. O′

i(t) can be obtainedby comparing each candidate’s QoS values with user’sQoS requirements. We call the candidates that belongto O′

i(t) the potential candidates. So the optimal solutionis to dispatch all the requests to the one with themaximum value of V rij(t) − Qij(t) in O′

i, which isgiven by

Dij(t) =

{

Ai(t), j = j∗i ;

0, otherwise,(20)

where j∗i = argmaxj∈O′

i(t)(V rij(t) − Qij(t)) denotes

the index of the server with the maximum profit minusqueue.

The result elucidates that the optimal policy isto distribute all the requests to a single candidateservice i on server j∗i , which indicates that only oneof the candidate services is selected to handle all therequests. Therefore, at service layer viewpoint, ourscheduling problem is equivalent to service selection,which also proves the applicability of our model.

(2) Service Management and DVFS. DOSM makesthe service management decision yij(t) and DVFSdecision uj(t) in each slot t in order to minimizethe third term of the R.H.S. of (18). Similarly, thiscan be transformed to maximizing its opposite. Thevariables yij(t) and uj(t) are independent amongdifferent servers, and this subproblem can be reducedto solving problem (21) concurrently for each serverj ∈ J .

maxyij(t),uj(t)

∑

i

Qij(t)lijyij(t)uj(t)

u0j

− V ϕ(t)PUEPj(t)τ ;

(21)subject to (3) and (4).

Problem (21) involves integer variables yij(t) andit is a generalized mixed integer programming (MIP)problem. In the following, we explore the charac-teristics of the problem and solve it in polynomialtime, which is efficient and can guarantee to find theoptimal solutions.

For each given uj(t) ∈ Uj , the original objective of(21) can be rewritten as

Obj =∑

i

(Qij(t)lijuj(t)

u0j

− V ϕ(t)PUEτkuαj (t)sij)yij(t)

− V ϕ(t)PUEτPsj .

Algorithm 1 Distributed Online Scheduling and Man-agement (DOSM)

1: In the beginning of each time slot t, observe thecurrent queue length Qij(t).

2: Solve the Max-Weight problem (19) for each cate-gory of service i ∈ I in parallel

3: Solve the MIP problem (21) for each server j ∈ Jin parallel

4: t = t+ 15: Repeat steps 1-4

Algorithm 2 Max-Weight Problem Solving for Servicei

1: for all Potential candidates do2: Calculate V rij(t)−Qij(t)3: Search for the candidate on server j∗ with the

maximum value of V rij(t)−Qij(t)4: Set Dij(t) according to (20)

Algorithm 3 MIP Problem Solving for Server j

1: maxObj = −∞2: for all uj(t) ∈ Uj do3: for all candidates i on the server do4: if γij(t) > 0 then5: yij(t) = 16: else7: yij(t) = 08: if Obj > maxObj then9: maxObj = Obj

10: maxuj = uj(t)11: for all candidates i on the server do12: maxyij(t) = yij(t)13: uj(t) = maxuj

14: for all candidates i on the server do15: yij(t) = maxyij(t)

Let γij(t) = Qij(t)lijuj(t)

u0

j

− V ϕ(t)PUEτkuαj (t)sij .

Considering that the expression of −V ϕ(t)PUEτPsj isa constant to the decision variables, so for each givenuj(t) ∈ Uj , (21) can be reduced to solving (22) for eachserver j ∈ J .

maxyij(t)

∑

i

γij(t)yij(t); (22)

subject to (3).(22) is a binary integer programming (BIP) prob-

lem. The decision variables yij(t) are set to 1 whenγij(t) > 0, and 0 otherwise. So (21) can be solvedby enumerating each uj(t) ∈ Uj and comparing eachoptimal result of (22).

The detailed algorithm is shown in Algorithm 1-3.The time complexity of our DOSM algorithm for

the worst case can be divided into the followingtwo parts: (1) request dispatch, and (2) service man-agement and DVFS. In the request dispatch part,




since our algorithm can solve it concurrently for eachservice i ∈ I , and the max-weight problem can besolved by exploring every value (V rij(t) − Qij(t))once, the time complexity of this part is O(n). Forsolving the MIP problem in the service managementand DVFS part, similarly, our DOSM algorithm cansolve it concurrently. Let N be the maximum numberof possible solutions of uj(t) for all j ∈ J . For eachgiven uj(t), it needs O(m) operations to solve thereduced BIP problem, so the time complexity of thispart is O(Nm). The overall time complexity of ourDOSM algorithm is O(n+Nm), which is polynomialtime.

4.3 Optimality Analysis

We show that DOSM can also obtain arbitrary trade-offs between system profit and queue length, ap-proaching the optimal profit while still bounding thequeue length. Note that there exists upper boundf and lower bound f of the objective f under theassumption that the number of requests arrived ineach slot can be bounded by Amax

i . We prove thatthe average system profit and queue length under ourDOSM algorithm are bounded as in Theorem 3.

THEOREM 3: If there exists ε satisfying λ + ε ∈ Λ,where Λ is the capacity region of the system, thenunder the DOSM algorithm, for any value of theparameter V , the average queue length is boundedby (23).

Q = limT→∞

1

T

T−1∑

t=0

∑

i

∑

j

E{Qij(t)}

≤B + V (f − f)

ε.

(23)

Furthermore, the average system profit can bebounded by (24), which indicates the profit derivedby our algorithm can approach the optimal value byincreasing the parameter V . Here B is the constantdefined in Theorem 2.

fDOSM ≥ f∗ −B

V. (24)

Theorem 3 indicates a [O(1/V ), O(V )] tradeoff be-tween the average system profit and queue lengthunder the DOSM algorithm. In addition, it can reachto the optimal system profit by increasing the param-eter V . Although this would also increase the averagequeue length, the queue length qij of each service

can still be bounded, since letting ξ = B+V (f−f)ε

,(14) can be satisfied. In practice, service operators canchoose the parameter V according to actual demandsand make the desired choice. The detailed proof ofTheorem 3 can be found in the appendix.

5 EVALUATION

In this section, we conduct numerical experimentsand real trace based simulation to validate the ef-fectiveness of our DOSM algorithm. We also presentsensitivity analysis to help identify the bottleneck ofthe services computing system and provide guidancefor system optimization.

5.1 Numerical Experiments

5.1.1 Experiment Setup

In the numerical experiments, there are n = 400servers and m = 400 categories of services in thesystem. Each service has 10 candidates that are func-tionally similar but differ in QoS properties. Eachserver holds 10 candidates. The reward of each re-quest is generated randomly between 0.1 and 0.2.Two QoS properties are considered, i.e., reliability andavailability. The reliability and availability values ofeach candidate are generated randomly between 0.4and 1. Users’ reliability and availability requirementsare randomly set between 0.5 and 0.9.

The slot length τ is set as 300 seconds. We willdiscuss the details on τ later. The service rate of eachservice i ∈ I , i.e., the number of requests that can beserved in each time slot is set randomly between 60and 300. The arrival rate λi is set to be the service ratemultiplied by a factor β ≤ 1. The requests arrivingprocesses are generated according to Poisson distri-bution [15]. Note that our approach is more generaland is able to deal with any type of requests arrivals.

We use power-measuring equipments to collectpower consumption data from servers with Intel Ne-halem Bloomfield processor. The average static powerconsumption is ps = 140 watt and the coefficient k isset to be 100 based on the data. The parameter α is setto be 3 according to [5]. For each server j ∈ J , the basefrequency is 133MHz, and there are four multipliers,i.e., 0, 12, 20, 22. PUE is set to be 1.2 based on [24].

5.1.2 Parameter Analysis

1) Impact of tradeoff parameter V .Fig. 2 shows the average system profit and queue

length with different values of V . It can be seen thatas V increases, the average system profit improvesunder our DOSM algorithm, and tends to the opti-mum when V becomes sufficiently large. However,the improvement of profit starts to diminish with farmore increase of V . This is consistent with (24) in The-orem 3, which shows that our DOSM algorithm canapproach the maximum profit with a gap of O(1/V ).The queue length also grows with the increase of V .This is consistent with (23) in Theorem 3 that thequeue length is upper bounded by O(V ). Along withsystem profit, it shows the tradeoff between averagesystem profit and queue congestion.

2) Impact of unitary energy price parameter ϕ.




0 0.5 1 1.5 2

x 106

17.5

17.9

18.3

18.7

19.1

19.5

V

prof

it ($

)

Profit

0 0.5 1 1.5 2

x 106

5.2

5.3

5.4

5.5

5.6

5.7

5.8x 10

5

V

queu

e le

ngth

Queue Length

Fig. 2. Average system profit and queue length withdifferent V.

0 40 80 120 160 200 240 28016.2

16.7

17.2

17.7

18.2

ϕ ($/MWHr)

prof

it ($

)

Profit

0 40 80 120 160 200 240 280

5.3

5.4

5.5

5.6

5.7x 10

5

ϕ ($/MWHr)

queu

e le

ngth

Queue Length

Fig. 3. Average system profit and queue length withdifferent ϕ.

We evaluate the impact of parameter ϕ on systemprofit and queue length. Fig. 3 shows the averagesystem profit and queue length with different valuesof ϕ. It can be seen that the system profit decreasesas ϕ increases, due to the rising of energy cost. Thequeue length also increases for the same reason. OurDOSM algorithm trys to reduce the energy consump-tion by adjusting the servers to run at low speed andswitching some services to idle state, thus increasingthe total queue length. However, the rise of queuelength will start to diminish with far more increase ofϕ, which shows that the queue length will stabilizegradually.

3) Impact of arrival rate λi.To evaluate the impact of arrival rate on the system,

we scale the arrival rate up or down to ϑ · λi. Threeexperiments with ϑ= 0.5, 1 and 1.5 are discussed.The experiment is carried out for 1,000 time slotsfor each setting. Fig. 4 shows the average sum ofall servers’ frequencies and number of active servicesunder different ϑ. As it can be seen, both serversspeed and number of active services increase when ϑbecomes larger. This shows that our DOSM algorithmcan adapt to different arrival rates and adjust thedecisions accordingly. Besides, both the servers speedand number of active services will become stable astime goes by.

4) Impact of slot length τ .To evaluate the impact of slot length τ on average

system profit (in second) and queue length, we varyτ from 100 seconds to 900 seconds. The experiment iscarried out for 5,000 slots for each setting. Fig. 5 showsthat when τ becomes larger, the average profit in eachsecond drops slightly, and the queue length increases.This is because a larger τ makes it hard to adapt to

0 200 400 600 800 10000

200

400

600

800

1000

1200

t

serv

er fr

eque

ncie

s (G

Hz)

Server Frequencies

ϑ=0.5ϑ=1ϑ=1.5

0 200 400 600 800 10000

300

600

900

1200

1500

1800

t

num

ber

of a

ctiv

e se

rvic

es

Active Servies

ϑ=0.5ϑ=1ϑ=1.5

Fig. 4. Average server frequencies and number ofactive services under different ϑ.

100 300 500 700 9000

0.02

0.04

0.06

τ (s)

prof

it ($

/s)

100 300 500 700 9000

2

4

6

8x 10

5

τ (s)

queu

e le

ngth

Fig. 5. Average profit and queue length under differentτ .

the system dynamics as the system variables wouldvary during a long time slot. So a smaller τ can help toadapt to changes and achieve better results. However,when τ is too small, it would lead to high overheadsof estimating or obtaining the system variables. Onetypical way is to set τ corresponding to the priceupdating frequencies in electricity markets [11], [16].

5.2 Simulation Experiments Based on Real TraceData

5.2.1 Simulation SetupWorkload Data: In the simulation experiments, we usetwo real world workload traces to verify our approachunder different services computing contexts.

The first workload trace is from LHC ComputingGrid (LCG). The project is a global collaboration ofmore than 170 computing centres in 40 countries, andprovides SaaS services for data storage and analysis[35]. The LCG data was provided by the e-ScienceGroup of HEP at Imperial College London. The timeperiod of the whole data lasts for 11 days from Nov20th to 30th, 2005. The submission time, duration timeand resource demand of the service requests can befound in the trace [36]. We classify the requests into 20categories of services according to their duration timeand resource demand. Each service has 50 candidatesand the whole candidates are deployed on 200 serversin the system.

The second workload trace is from LPC of BlaisePascal University of Clermont-Ferrand. LPC is part ofthe EGEE project (Enabling Grids for E-science in Eu-rope), which develops an infrastructure and providesIaaS services to users [37]. We choose 11 days of datafrom the LPC trace, in accordance with LCG trace.The requests are also classified into 20 categories of




0 500 1000 1500 2000 2500 30000

50

100

150

200

250

300

t

num

ber

of r

eque

sts

LCG Workload

Fig. 6. LCG Workload trace.

0 500 1000 1500 2000 2500 30000

100

200

300

400

t

num

ber

of r

eque

sts

LPC Workload

Fig. 7. LPC Workload trace.

services according to the duration time and resourcedemand.

The period τ of each time slot t is set to be 300 sec-onds corresponding to the power price data updatingfrequency, which will be introduced later. Thus, thereare 3,168 time slots in our simulation. Based on thesubmission time logs, we can obtain the number ofrequests for each service that arrived in each time slott. Fig. 6 shows the number of requests that arrived inthe LCG system over time slots. It can be seen thatthe workload fluctuates over time. Fig. 7 shows theworkload trace of LPC system (scaled up by a factor10). The service rates are set inversely proportional tothe duration times.

Price Data: We collect the real world locationalmarginal price (LMP, in unit of $/MWHr) from thewebsite of New York independent system operator(ISO) [38], which publishes the updated power pricedata every 300 seconds. We choose 3,168 price datain accordance to the workload data. Fig. 8 shows thepower price trace over the 3,168 time slots. It can beseen that the price fluctuations periodically and thetime of each period is about 250 time slots, roughlyequal to one day. The reward data are set to be thesame as that in the numerical experiments.

QoS Data: We adopt the updated QWS Datasetwhich includes 2,507 real Web services. The QoSparameters of these services were measured using theWeb Service Broker (WSB) framework over a six-dayperiod [39]. We choose 5 QoS properties from thedataset, including availability, successability, reliabil-ity, compliance and best practices. For all these QoSproperties, a higher value stands for a better service.We randomly select 1,000 services from the dataset.Users’ QoS preferences and requirements are gener-ated based on the value ranges of the correspondingQoS property in the dataset.

0 500 1000 1500 2000 2500 30000

50

100

150

200

250

t

pow

er p

rice

($/M

WH

r)

Power Price

Fig. 8. Power price trace.

0 500 1000 1500 2000 2500 30000

5

10

15

20

25

30

t

prof

it ($

)

Profit

DOSMBest EffortLoad−balancedRandomized

0 500 1000 1500 2000 2500 30000

100

200

300

400

500

600

700

t

queu

e le

ngth

Queue Length


Fig. 9. Average system profit and queue length of theLCG system under different algorithms.

5.2.2 Comparative AnalysisBased on the trace data, we compare our DOSMalgorithm with three other algorithms: (1) Best Effortalgorithm, where the server runs at the maximumspeed and the service is switched active as long as thecorresponding queue is not empty, while we use thesame method as our DOSM algorithm for request dis-patch; (2) Load-balanced algorithm, where the requestsare dispatched to each service based on the servingcapacity, while the service management and DVFS arethe same as our DOSM algorithm; and (3) Randomizedalgorithm, where the requests are dispatched to eachservice randomly, while the rest is the same as DOSMalgorithm.

Fig. 9 shows the average system profit and queuelength of the LCG system under different algo-rithms. It can be seen that our DOSM algorithmachieves the highest profit among the four algorithms,which shows the effectiveness of DOSM. The BestEffort algorithm achieves higher profit than the Load-balanced and Randomized algorithms, since it canserve more requests in each time slot, thus reduceenergy cost. The queue length of the Best Effortalgorithm is the lowest, since the the actual servingrate of the Best Effort algorithm is the largest. Thequeue length of our DOSM algorithm is smaller thanthose of Load-balanced and Randomized algorithms.Together with system profit, it is shown that DOSMalgorithm outperforms the Load-balanced and Ran-domized algorithms in both system profit and queuecongestion, and DOSM achieves the highest systemprofit with a little sacrifice of queue congestion.

Fig. 10 shows the average system profit and queuelength of the LPC system under different algorithms.It can be seen that the profit under DOSM algorithmis the highest among all algorithms. The queue lengthunder DOSM algorithm is slightly larger than that of




0 500 1000 1500 2000 2500 3000

4

6

8

10

12

14

16

18

20

t

prof

it ($

)

Profit


0 500 1000 1500 2000 2500 30000

20

40

60

80

100

120

140

160

180

t

queu

e le

ngth

Queue Length


Fig. 10. Average system profit and queue length of theLPC system under different algorithms.

TABLE 2Execution Time of Different Algorithms

SystemAlgorithms Execution Time (s)

DOSM Best Effort Load-balanced RandomizedLCG 0.953 0.907 0.947 0.949LPC 0.955 0.901 0.946 0.948

the Best Effort algorithm, but smaller than those ofLoad-balanced and Randomized algorithms. Similarresults have also been obtained for the LCG system.

Table 2 gives the execution times (in second) ofthe four algorithms in the LCG and LPC systems.For each experiment, we run 20 times to calculatethe average result. It can be seen that the executiontime differences among our DOSM algorithm, Load-balanced and Randomized algorithms are small. Theexecution time of the Best Effort algorithm is slightlysmaller than the other three algorithms.

5.3 Sensitivity Analysis

Sensitivity analysis is often performed in the processof system optimization, which can help to find bot-tlenecks and provide guidance for system design andmanagement [40]. We conduct parametric sensitivityanalysis and discuss the sensitivity of system profitand queue length based on the real trace data pre-sented in Section 5.2.

Let F be the objective function and η be one of theparameters. In practise, the sensitivity can be obtainedby the definition of partial differential coefficient de-scribed in (25).

Sη(F ) =

∣

∣

∣

∣

F (η +∆η)− F (η)

∆η

∣

∣

∣

∣

. (25)

Fig. 11 shows system profit sensitivity and queuelength sensitivity of the LCG and LPC systems interms of V over the 3,168 time slots. The parameterV is set to be 60 and ∆V is set as 0.001× V = 0.06. Itcan be seen that the profit sensitivities of both systemsfluctuate over time. The profit sensitivity of the LCGsystem reaches to the maximum value at t1 = 2,500,i.e., the profit can reach to the maximum gain whenchanging V at time slot t1, which provides guidancefor profit optimization. The queue length sensitivity ofthe LCG system increases sustainably during the last200 time slots. The profit sensitivity of the LPC system

0 500 1000 1500 2000 2500 30000

0.01

0.02

0.03

0.04

0.05

t

LPC

pro

fit s

ensi

tivity

($)

Profit Sensitivity

0 500 1000 1500 2000 2500 30000

0.02

0.04

0.06

0.08

0.1

LCG

pro

fit s

ensi

tivity

($)

LPC profit sensitivityLCG profit sensitivity

0 500 1000 1500 2000 2500 30000

3

6

9

12

t

LPC

que

ue le

ngth

sen

sitiv

ity

Queue Length Sensitivity

0 500 1000 1500 2000 2500 30000

2

4

6

8

LCG

que

ue le

ngth

sen

sitiv

ity

LPC queue length sensitivityLCG queue length sensitivity

Fig. 11. Profit sensitivity and queue length sensitivitywith V in the LCG and LPC systems.

0 500 1000 1500 2000 2500 30000.138

0.139

0.14

0.141

0.142

0.143

0.144

0.145

t

LPC

pro

fit s

ensi

tivity

($)

Profit Sensitivity

0 500 1000 1500 2000 2500 30000

0.05

0.1

0.15

0.2

0.25

0.3

0.35

LCG

pro

fit s

ensi

tivity

($)

LPC profit sensitivityLCG profit sensitivity

0 500 1000 1500 2000 2500 30000

2

4

6

8

10

12

t

LPC

que

ue le

ngth

sen

sitiv

ity

Queue Length Sensitivity

0 500 1000 1500 2000 2500 30000

7

14

21

28

35

42

LCG

que

ue le

ngth

sen

sitiv

ity

LPC queue length sensitivityLCG queue length sensitivity

Fig. 12. Profit sensitivity and queue length sensitivitywith ϕ in the LCG and LPC systems.

reaches to the maximum value at the 1,600 slot andthe queue length sensitivity of the LPC system reachesto the highest value at the 2,300 slot.

Fig. 12 shows system profit sensitivity and queuelength sensitivity of the LCG and LPC systems withϕ over time slots. The parameter ϕ is set to be thesame value as the trace in Section 5.2 and ∆ϕ =0.3$/MWHr. It can be seen that the profit sensitivityof the LCG system reaches to the maximum value att2 = 2490, which indicates that the profit optimizationby changing ϕ is the most effective at time slot t2. Thequeue length sensitivity of the LCG system becomesthe maximum at t3 = 3050. It can be seen that t3is a good point to adjust ϕ to optimize both systemprofit and queue length of the LCG system. The profitsensitivity of the LPC system reaches to the maximumvalue at t4 = 1700. The queue length sensitivity of theLPC system is also large at t4. Thus t4 is a good pointto adjust ϕ to optimize both system profit and queuelength of the LPC system.

6 CONCLUSION

In this paper, we study energy efficient schedulingand management for large-scale services computingsystems. We jointly consider the metrics of energy ef-ficiency, performance and queue congestion. We com-bine the three approaches of request dispatch, servicemanagement and DVFS to improve energy efficiencyfrom a systematical point of view. A distributed on-line algorithm based on Lyapunov optimization tech-niques is proposed, which can achieve near optimalsystem profit while bounding the queue length. Ouralgorithm does not require any arrival statistics orany prediction on future information. Mathematical




analysis is provided to evaluate the effectiveness ofour algorithm, which is further validated by both nu-merical experiments and real trace based simulation.

ACKNOWLEDGMENTS

This work is supported by the National GrandFundamental Research 973 Program of China (No.2010CB328105, No. 2013CB329105), the National Nat-ural Science Foundation of China (No. 61020106002),and Tsinghua University Initiative Scientific ResearchProgram (No. 20121087999).

REFERENCES

[1] L.-J. Zhang, J. Zhang, and H. Cai, Services Computing. Springerand Tsinghua University Press, 2007.

[2] P. Xiong, Y. Fan, and M. Zhou, “QoS-aware web service con-figuration,” IEEE Transactions on Systems, Man and Cybernetics,Part A: Systems and Humans, vol. 38, no. 4, pp. 888–895, 2008.

[3] K. Bessai, S. Youcef, A. Oulamara, C. Godart, and S. Nurcan,“Bi-criteria workflow tasks allocation and scheduling in cloudcomputing environments,” in 2012 IEEE 5th International Con-ference on Cloud Computing (CLOUD ’12), June 2012, pp. 638–645.

[4] T. He, S. Chen, H. Kim, L. Tong, and K.-W. Lee, “Schedul-ing parallel tasks onto opportunistically available cloud re-sources,” in 2012 IEEE 5th International Conference on CloudComputing (CLOUD ’12), June 2012, pp. 180–187.

[5] A. Beloglazov, R. Buyya, Y. Lee, and A. Zomaya, “A taxonomyand survey of energy-efficient data centers and cloud comput-ing systems,” Advances in Computers, vol. 82, no. 2, pp. 47–111,2011.

[6] D. Ardagna, B. Panicucci, M. Trubian, and L. Zhang, “Energy-aware autonomic resource allocation in multitier virtualizedenvironments,” IEEE Transactions on Services Computing, vol. 5,no. 1, pp. 2–19, 2012.

[7] W.-c. Feng, X. Feng, and R. Ge, “Green supercomputing comesof age,” IT professional, vol. 10, no. 1, pp. 17–23, 2008.

[8] J. G. Koomey, “Estimating regional power consumption byservers: A technical note,” Lawrence Berkeley National Labora-tory, Oakland, CA, 2007.

[9] C. Wang, M. de Groot, and P. Marendy, “A service-orientedsystem for optimizing residential energy use,” in 2009 IEEE7th International Conference on Web Services (ICWS ’09), 2009,pp. 735–742.

[10] P. Bartalos and M. Blake, “Green web services: Modeling andestimating power consumption of web services,” in 2012 IEEE19th International Conference on Web Services (ICWS ’12), Jun.2012, pp. 178–185.

[11] Z. Zhou, F. Liu, H. Jin, B. Li, B. Li, and H. Jiang, “On arbitratingthe power-performance tradeoff in SaaS clouds,” in 2013 IEEEINFOCOM, 2013, pp. 872–880.

[12] L. Minas and B. Ellison, Energy efficiency for information technol-ogy: How to reduce power consumption in servers and data centers.Intel Press, 2009.

[13] Z. Chen, C. Liang-Tien, B. Silverajan, and L. Bu-Sung, “Ux-anarchitecture providing QoS-aware and federated support forUDDI,” in 2003 IEEE 1st International Conference on Web Services(ICWS ’03), 2003.

[14] J. Huang and C. Lin, “Improving energy efficiency in webservices: An agent-based approach for service selection anddynamic speed scaling,” International Journal of Web ServicesResearch (IJWSR), vol. 10, no. 1, pp. 29–52, 2013.

[15] A. Wierman, L. Andrew, and A. Tang, “Power-aware speedscaling in processor sharing systems,” in 2009 IEEE INFO-COM, April 2009, pp. 2007–2015.

[16] S. Ren, Y. He, and F. Xu, “Provably-efficient job schedulingfor energy and fairness in geographically distributed data cen-ters,” in 2012 IEEE 32nd International Conference on DistributedComputing Systems (ICDCS ’12), 2012, pp. 22–31.

[17] Y. Chen, J. Huang, X. Xiang, and C. Lin, “Energy efficientdynamic service selection for large-scale web service systems,”in 2014 IEEE 21st International Conference on Web Services (ICWS’14), Jun. 2014, pp. 558–565.

[18] X. Zhang, X. Shen, and L.-L. Xie, “Joint subcarrier andpower allocation for cooperative communications in lte-advanced networks,” IEEE Transactions on Wireless Communi-cations, vol. 13, no. 2, pp. 658–668, February 2014.

[19] Z. Zhou, F. Liu, Y. Xu, R. Zou, H. Xu, J. Lui, and H. Jin,“Carbon-aware load balancing for geo-distributed cloud ser-vices,” in 2013 IEEE 21st International Symposium on Modeling,Analysis Simulation of Computer and Telecommunication Systems(MASCOTS ’13), Aug 2013, pp. 232–241.

[20] S. Liu, G. Quan, and S. Ren, “On-line real-time service allo-cation and scheduling for distributed data centers,” in 2011IEEE International Conference on Services Computing (SCC ’11),July 2011, pp. 528–535.

[21] P. X. Gao, A. R. Curtis, B. Wong, and S. Keshav, “It’s not easybeing green,” SIGCOMM Comput. Commun. Rev., vol. 42, no. 4,pp. 211–222, Aug. 2012.

[22] M. Lin, Z. Liu, A. Wierman, and L. Andrew, “Online algo-rithms for geographical load balancing,” in 2012 InternationalGreen Computing Conference (IGCC), June 2012, pp. 1–10.

[23] M. Adnan, R. Sugihara, and R. Gupta, “Energy efficient geo-graphical load balancing via dynamic deferral of workload,”in 2012 IEEE 5th International Conference on Cloud Computing(CLOUD), June 2012, pp. 188–195.

[24] Z. Zhou, F. Liu, H. Jin, B. Li, B. Li, and H. Jiang, “On arbitratingthe power-performance tradeoff in saas clouds,” in 2013 IEEEINFOCOM, 2013, pp. 872–880.

[25] A. Beloglazov, J. Abawajy, and R. Buyya, “Energy-aware re-source allocation heuristics for efficient management of datacenters for cloud computing,” Future Generation Computer Sys-tems, vol. 28, no. 5, pp. 755 – 768, 2012.

[26] J.-W. Jang, M. Jeon, H.-S. Kim, H. Jo, J.-S. Kim, and S. Maeng,“Energy reduction in consolidated servers through memory-aware virtual machine scheduling,” IEEE Transactions on Com-puters, vol. 60, no. 4, pp. 552–564, April 2011.

[27] Intel turbo boost technology - on-demand processorperformance. [Online]. Available: http://www.intel.com/content/www/us/en/architecture-and-technology/turbo-boost/turbo-boost-technology.html?wapkw=Turbo

[28] AMD PowerNow! Technology. [Online].Available: http://www.amd.com/us/products/technologies/amd-powernow-technology/Pages/amd-powernow-technology.aspx

[29] R. Guerra, J. Leite, and G. Fohler, “Attaining soft real-timeconstraint and energy-efficiency in web servers,” in 2008 ACMsymposium on Applied computing. ACM, 2008, pp. 2085–2089.

[30] D. Lin, C. Shi, and T. Ishida, “Dynamic service selectionbased on context-aware QoS,” in 2012 IEEE 9th InternationalConference on Services Computing (SCC ’12), 2012, pp. 641–648.

[31] L. Qi, Y. Tang, W. Dou, and J. Chen, “Combining local opti-mization and enumeration for qos-aware web service compo-sition,” in 2010 IEEE 8th International Conference on Web Services(ICWS ’10), July 2010, pp. 34–41.

[32] W.-T. Tsai, X. Sun, and J. Balasooriya, “Service-oriented cloudcomputing architecture,” in 2010 Seventh International Confer-ence on Information Technology: New Generations (ITNG), April2010, pp. 684–689.

[33] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel, “The costof a cloud: research problems in data center networks,” ACMSIGCOMM Computer Communication Review, vol. 39, no. 1, pp.68–73, 2008.

[34] M. J. Neely, Stochastic Network Optimization With Application toCommunication and Queueing Systems. Morgan & Claypool,2010.

[35] Worldwide LHC Computing Grid. [Online]. Available: http://wlcg.web.cern.ch/search/node/LCG

[36] H. Li, M. Muskulus, and L. Wolters, “Modeling job arrivals ina data-intensive grid,” in Job Scheduling Strategies for ParallelProcessing, ser. Lecture Notes in Computer Science, E. Fracht-enberg and U. Schwiegelshohn, Eds. Springer Berlin Heidel-berg, 2007, vol. 4376, pp. 210–231.

[37] K. Deng, J. Song, K. Ren, and A. Iosup, “Exploring portfolioscheduling for long-term execution of scientific workloads iniaas clouds,” in Proceedings of SC13: International Conference for




High Performance Computing, Networking, Storage and Analysis.ACM, 2013, pp. 55–66.

[38] New York ISO Real-time Market LBMP. [On-line]. Available: http://www.nyiso.com/public/marketsoperations/market data/pricing data/index.jsp

[39] E. Al-Masri and Q. H. Mahmoud, “Qos-based discovery andranking of web services,” in Computer Communications andNetworks, 2007. ICCCN 2007. Proceedings of 16th InternationalConference on, Aug 2007, pp. 529–534.

[40] J. T. Blake, A. L. Reibman, and K. S. Trivedi, “Sensitivity anal-ysis of reliability and performability measures for multipro-cessor systems,” in Proceedings of the 1988 ACM SIGMETRICSConference on Measurement and Modeling of Computer Systems,ser. SIGMETRICS ’88, 1988, pp. 177–186.

Ying Chen received the B.Eng. degree inSchool of Computer Science from BeijingUniversity of Posts and Telecommunications,Beijing, China, in 2012. She is currentlyworking towards the Ph.D. degree with theDepartment of Computer Science and Tech-nology, Tsinghua University. Her researchinterests are modeling, performance evalu-ation and optimization of web services andservices computing. She is a student mem-ber of the IEEE.

Chuang Lin is a professor of the Depart-ment of Computer Science and Technology,Tsinghua University, Beijing, China. He re-ceived the Ph.D. degree in Computer Sci-ence from the Tsinghua University in 1994.His current research interests include com-puter networks, performance evaluation, net-work security analysis, and Petri net theoryand its applications. He has published morethan 300 papers in research journals andIEEE conference proceedings and has pub-

lished five books. He is a member of ACM Council, a senior memberof the IEEE and the Chinese Delegate in TC6 of IFIP. He serves asthe Associate Editor of IEEE Transactions on Vehicular Technology,the Area Editor of Journal of Computer Networks and the Area Editorof Journal of Parallel and Distributed Computing.

Jiwei Huang is an assistant professor inthe State Key Laboratory of Networking andSwitching Technology at Beijing University ofPosts and Telecommunications. He receivedthe Ph.D. degree and B.Eng. degree both incomputer science and technology from Ts-inghua University in 2014 and 2009, respec-tively. He was a visiting scholar at Georgia In-stitute of Technology. His research interestsare in services computing and performanceevaluation. He has published more than 15

papers in international journals and conference proceedings, e.g.,IEEE Transactions on Services Computing, SIGMETRICS, ICWS,SCC, etc. He is a member of the IEEE.

Xudong Xiang received the B.Sc. in informa-tion management and information systemsand M.Sc. degree in management scienceand engineering from Beijing InformationScience and Technology University, Beijing,China, in 2008 and 2011, respectively. Heis currently working toward the Ph.D. degreein the Department of Computer Science andTechnology, University of Science and Tech-nology Beijing. His current research focuseson the stochastic modeling, optimal control

and performance evaluation of computer networks and systems. Heis a student member of the IEEE.

Xuemin (Sherman) Shen received the B.Sc.degree from Dalian Maritime University,Dalian, China, in 1982, and the M.Sc. andPh.D. degrees from Rutgers University, NewBrunswick, NJ, USA, in 1987 and 1990, re-spectively, all in electrical engineering. He isa Professor and University Research Chairwith the Department of Electrical and Com-puter Engineering, University of Waterloo,Waterloon, ON, Canada. He was the Asso-ciate Chair for Graduate Studies from 2004

to 2008. His research focuses on resource management in intercon-nected wireless/wired networks, wireless network security, wirelessbody area networks, and vehicular ad hoc and sensor networks.Heis a Distinguished Lecturer of the IEEE Communications Society.He received the Excellent Graduate Supervision Award in 2006 andthe Outstanding Performance Award in 2004 and 2008 from theUniversity of Waterloo, the Premiers Research Excellence Award(PREA) in 2003 from the Province of Ontario, Canada, and theDistinguished Performance Award in 2002 and 2007 from the Facultyof Engineering, University of Waterloo. He is a registered Profes-sional Engineer of Ontario, Canada, a fellow of the IEEE, a fellowof Canadian Academy of Engineering and a fellow of EngineeringInstitute of Canada.

Documents

IEEE TRANSACTIONS ON SERVICES COMPUTING, MANUSCRIPT ID 1 Energy Efﬁcient Scheduling ...bbcr.uwaterloo.ca › ~xshen › paper › 2015 › eesamf.pdf · 2020-01-23 · request scheduling