SLA-aware Virtual Resource Management for Cloud Infrastructures. On the Management and Efficiency of Cloud Based Services. Eitan Rosenfeld, December 8th, 2010

SLA-aware Virtual Resource Management for Cloud Infrastructures

On the Management and Efficiency of Cloud Based Services

Eitan Rosenfeld December 8th, 2010

Problem Space

Automate the management of virtual servers

Why this is challenging:

Must take into account high-level SLA requirements of hosted applications

Must take into account resource management costs

When applications change state, their resource demands are likely to change

High Level Solution

Generate a Global Utility function
  Constraint Programming approach
  Degree of SLA fulfillment
  Operating costs

Autonomic resource manager built on utility function
  Decouple resource provisioning and VM placement

Why use (and automate) the Cloud?

Static allocation of resources results in 15-20% utilization

VMs allow decoupling of applications from physical servers

Automation of the management process (scale up and scale down) can reduce cost, and boost/maintain performance

Decoupling in two stages

Provisioning stage
  Allocate resource capacity, in the form of virtual machines, to a given application
  Driven by performance goals associated with the business-level SLAs of the hosted applications

Placement stage
  Map Virtual Machines to Physical Machines
  Driven by data center policies regarding resource management costs. A typical example is to lower energy consumption by minimizing the number of active physical servers.

Allocating new resources with state changes

[Figure: Application Environments A, B, and C; VM1 through VM7; Physical Machines 1 through 4; shown for State = 1, State = 2, and State = 3]

Automation criteria

What are the requirements for successful automation?

Dynamic provisioning and placement

Support for
  online applications with stringent QoS requirements
  batch-oriented CPU-intensive applications

Support for a variety of application topologies

Provisioning maps to application specific functions

Placement maps to a global decision layer

Utility function is their means of communication.
  Utility function returns a scalar value from 0 (unsatisfied) to 1 (satisfied)
  Application state: Workload, Resource Capacity, SLA

Both provisioning and placement are mapped as Constraint Satisfaction Problems

Some Definitions

Satisfaction – whether an application is achieving its performance goals

Constraint Programming – solve a problem by stating constraints (relations between variables) that any solution must satisfy
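
To make the definition concrete, here is a minimal sketch of constraint satisfaction by exhaustive enumeration. The variable names (`vms_a`, `vms_b`) and the three constraints are hypothetical and chosen only for illustration. A real solver such as Choco, which these slides mention later, would normally prune the search with constraint propagation rather than enumerate every assignment.

```python
from itertools import product


def solve(domains, constraints):
    """Yield every assignment (a dict) that satisfies all constraints."""
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints):
            yield assignment


# Hypothetical toy problem: choose VM counts for two applications.
domains = {"vms_a": range(0, 4), "vms_b": range(0, 4)}
constraints = [
    lambda s: s["vms_a"] + s["vms_b"] <= 4,  # at most 4 VMs in total
    lambda s: s["vms_a"] >= s["vms_b"],      # A gets at least as many as B
    lambda s: s["vms_b"] >= 1,               # B gets at least one VM
]
solutions = list(solve(domains, constraints))
```

The problem is stated only as variables, domains, and relations; the search procedure is generic and knows nothing about VMs.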

Assumptions

Physical machines can run multiple VMs

Application Environment (AE)
  AEs can span multiple VMs
  SLAs apply to AEs

A VM can only run one AE at a time

High Level Architecture

Local Decision Module (LDM) for each AE
  Compute satisfaction with current resources, workload, and service metrics (utility function)
  Evaluate the opportunity of allocating more VMs or releasing existing VMs to/from the AE

Global Decision Module (GDM)
  Arbitrates resource requirements based on utility functions and performance of VMs and PMs
  Notifies LDMs of changes to VMs and manages the VM lifecycle (start, stop, migrate)

Local Decision Module

LDM is associated with two utility functions
  (1) Fixed service-level: maps the service level to a utility value
  (2) Dynamic resource-level: maps a resource capacity to a utility value, communicated to GDM

Variables
  Let $A = (a_1, a_2, \ldots, a_i, \ldots, a_m)$ denote the set of AEs,
  $P = (p_1, p_2, \ldots, p_j, \ldots, p_q)$ denote the set of PMs in the datacenter,
  $S = (s_1, s_2, \ldots, s_k, \ldots, s_c)$ denote the set of $c$ classes of VMs, where $s_k = (s_k^{cpu}, s_k^{ram})$ specifies the VM CPU capacity in MHz and the VM memory capacity in megabytes

LDM (cont’d)

Utility function (2) $u_i$ for application $a_i$:

$u_i = f_i(N_i)$, where $N_i$ is the VM allocation vector of application $a_i$:

$N_i = (n_{i1}, n_{i2}, \ldots, n_{ik}, \ldots, n_{ic})$, where $n_{ik}$ is the number of VMs of class $s_k$ attributed to application $a_i$.
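
One way the two LDM utility functions could fit together is sketched below. Everything here is an assumption made for illustration: the slides do not give the shape of either function, so the piecewise-linear service-level utility, the `inverse_cpu_model` performance model, and all parameter names are hypothetical.

```python
def service_level_utility(response_time_ms, target_ms, worst_ms):
    """Utility (1): map a response time to a value in [0, 1] (assumed shape)."""
    if response_time_ms <= target_ms:
        return 1.0
    if response_time_ms >= worst_ms:
        return 0.0
    return (worst_ms - response_time_ms) / (worst_ms - target_ms)


def allocated_cpu(n_i, vm_cpu_mhz):
    """Total CPU (MHz) granted by allocation vector N_i."""
    return sum(n * s for n, s in zip(n_i, vm_cpu_mhz))


def resource_level_utility(n_i, vm_cpu_mhz, predict_response_ms, target_ms, worst_ms):
    """Utility (2): u_i = f_i(N_i), via an assumed performance model."""
    cpu = allocated_cpu(n_i, vm_cpu_mhz)
    return service_level_utility(predict_response_ms(cpu), target_ms, worst_ms)


def inverse_cpu_model(cpu_mhz):
    """Hypothetical model: response time is inversely proportional to CPU."""
    return float("inf") if cpu_mhz == 0 else 400_000 / cpu_mhz


vm_cpu_mhz = (1000, 2000)  # hypothetical CPU sizes of two VM classes
u_small = resource_level_utility((1, 1), vm_cpu_mhz, inverse_cpu_model, 100, 200)
u_large = resource_level_utility((2, 1), vm_cpu_mhz, inverse_cpu_model, 100, 200)
```

In this sketch the resource-level utility changes whenever the workload changes the performance model, while the service-level utility stays fixed, which matches the "fixed" and "dynamic" labels on the slide.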

Application Constraints

Each application also provides upper bounds on the VMs that it is willing to accept.
  Each VM class: $N_i^{max} = (n_{i1}^{max}, n_{i2}^{max}, \ldots, n_{ik}^{max}, \ldots, n_{ic}^{max})$
  Total: $T_i^{max}$

(1) $n_{ik} \le n_{ik}^{max}$, for $1 \le i \le m$ and $1 \le k \le c$

(2) $\sum_{k=1}^{c} n_{ik} \le T_i^{max}$, for $1 \le i \le m$
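
Constraints (1) and (2) can be checked for a single application with a few lines of Python. This is a sketch; the function name and the example bounds are hypothetical, and the non-negativity check on each count is an added assumption that the slide leaves implicit.

```python
def within_application_bounds(n_i, n_i_max, t_i_max):
    """Check constraints (1) and (2) for one application.

    n_i     -- allocation vector; n_i[k] is the number of VMs of class k
    n_i_max -- per-class upper bounds, same length as n_i
    t_i_max -- upper bound on the total number of VMs
    """
    # (1) per-class bound; the lower bound of 0 is an assumption.
    per_class_ok = all(0 <= n <= n_max for n, n_max in zip(n_i, n_i_max))
    # (2) bound on the total number of VMs across all classes.
    total_ok = sum(n_i) <= t_i_max
    return per_class_ok and total_ok


ok = within_application_bounds([2, 1, 0], [3, 2, 1], 4)
too_many_total = within_application_bounds([2, 2, 1], [3, 2, 1], 4)
too_many_in_class = within_application_bounds([4, 0, 0], [3, 2, 1], 4)
```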

Global Decision Module

Duties (and Constraint Satisfaction Problems)

Determine VM allocation vectors $N_i$ for each application (Provisioning)

Place VMs on PMs such that the number of active PMs is minimized (Packing)

Provisioning

VMs allocated to all applications are constrained by the total capacity of the physical servers

(3) CPU capacity: $\sum_{i=1}^{m} \sum_{k=1}^{c} n_{ik} \cdot s_k^{cpu} \le \sum_{j=1}^{q} C_j^{cpu}$

    RAM capacity: $\sum_{i=1}^{m} \sum_{k=1}^{c} n_{ik} \cdot s_k^{ram} \le \sum_{j=1}^{q} C_j^{ram}$

where $C_j$ is the capacity of PM $p_j$
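
The aggregate capacity check of constraint (3) might look like the sketch below. The PM sizes follow the simulation setup in these slides (4 PMs, each with 4000 MHz and 4000 MB); the four VM class sizes are hypothetical, since the slides only state that there are four classes.

```python
def fits_datacenter(allocations, vm_classes, pm_capacities):
    """Check constraint (3): total VM demand against total PM capacity.

    allocations   -- list of allocation vectors N_i, one per application
    vm_classes    -- list of (cpu_mhz, ram_mb), one per VM class s_k
    pm_capacities -- list of (cpu_mhz, ram_mb), one per PM p_j
    """
    for r in (0, 1):  # 0 = CPU, 1 = RAM
        demand = sum(
            n * vm_classes[k][r]
            for n_i in allocations
            for k, n in enumerate(n_i)
        )
        capacity = sum(c[r] for c in pm_capacities)
        if demand > capacity:
            return False
    return True


pms = [(4000, 4000)] * 4  # from the simulation setup
classes = [(500, 500), (1000, 1000), (2000, 2000), (4000, 4000)]  # hypothetical
feasible = fits_datacenter([[0, 2, 1, 0], [0, 0, 0, 2]], classes, pms)
infeasible = fits_datacenter([[0, 0, 0, 3], [0, 0, 0, 2]], classes, pms)
```

Note that this is a necessary condition only. It compares sums, so an allocation can pass here and still not fit on individual machines: five VMs of 2500 MHz total 12500 MHz, under the 16000 MHz aggregate, yet each 4000 MHz PM can host only one of them. The packing stage has to catch that case.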

Provisioning Output

Provisioning phase output
  Set of vectors $N_i$ for which constraints (1), (2), and (3) are satisfied

Comparing the new $N_i$ to the existing $N_i$ tells the GDM which VMs will be created, destroyed, or resized.

Global utility $U_{global}$ is maximized as a weighted sum of application utilities minus operating costs:

$U_{global} = \text{maximize} \sum_{i=1}^{m} \left( \alpha_i \times u_i - \varepsilon \cdot cost(N_i) \right)$

where $\alpha_i$ is the weight of utility function $u_i$ for application $i$, and $\varepsilon$ is a coefficient that allows the admin to trade performance goals against the operating cost of $N_i$.
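
The provisioning objective can be illustrated with a small brute-force search, sketched below under several assumptions. It enumerates allocation vectors instead of using a constraint solver, considers CPU only, and uses an assumed cost (the allocated CPU as a fraction of total CPU). The linear utilities, weights, bounds, and VM class sizes are all hypothetical.

```python
from itertools import product


def provision(apps, vm_classes, total_cpu, epsilon):
    """Search all allocation vectors N_i and return (best value, allocation).

    apps       -- list of dicts with keys: weight, n_max, t_max, utility
    vm_classes -- CPU capacity (MHz) of each VM class s_k
    total_cpu  -- aggregate CPU capacity of all PMs (MHz)
    epsilon    -- operating-cost coefficient
    """
    def cpu(n_i):
        return sum(n * s for n, s in zip(n_i, vm_classes))

    # Candidate vectors per application that satisfy constraints (1) and (2).
    candidates = []
    for app in apps:
        ranges = [range(bound + 1) for bound in app["n_max"]]
        candidates.append(
            [n for n in product(*ranges) if sum(n) <= app["t_max"]]
        )

    best_value, best_alloc = float("-inf"), None
    for alloc in product(*candidates):
        if sum(cpu(n) for n in alloc) > total_cpu:  # constraint (3), CPU only
            continue
        # Weighted utility minus epsilon times the assumed cost of N_i.
        value = sum(
            app["weight"] * app["utility"](cpu(n)) - epsilon * cpu(n) / total_cpu
            for app, n in zip(apps, alloc)
        )
        if value > best_value:
            best_value, best_alloc = value, alloc
    return best_value, best_alloc


def linear_utility(demand_mhz):
    """Hypothetical utility: grows linearly with CPU until demand is met."""
    return lambda cpu_mhz: min(1.0, cpu_mhz / demand_mhz)


apps = [
    {"weight": 0.7, "n_max": (2, 2), "t_max": 3, "utility": linear_utility(3000)},
    {"weight": 0.3, "n_max": (2, 2), "t_max": 3, "utility": linear_utility(2000)},
]
value, alloc = provision(apps, vm_classes=(1000, 2000), total_cpu=4000, epsilon=0.05)
```

With only 4000 MHz to share, the higher-weighted application is fully served (3000 MHz) and the other receives the remaining 1000 MHz. Raising `epsilon` makes each allocated MHz more expensive and pushes the search toward smaller allocations.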

Packing (Placement)

$V = (vm_1, vm_2, \ldots, vm_l, \ldots, vm_v)$ lists all VMs running at the current time.

For each PM $p_j \in P$, bit vector $H_j = (h_{j1}, h_{j2}, \ldots, h_{jl}, \ldots, h_{jv})$ denotes the set of VMs assigned to $p_j$

Example: $h_{jl} = 1$ if $p_j$ is hosting $vm_l$

$R = (r_1, r_2, \ldots, r_l, \ldots, r_v)$ is the resource capacity (CPU, RAM) of all VMs, where $r_l = (r_l^{cpu}, r_l^{ram})$

Packing (physical resource) Constraints

The sum of the resource capacities of the VMs on PM $p_j$ must be less than or equal to the resource capacity of $p_j$.

$\sum_{l=1}^{v} r_l^{cpu} \cdot h_{jl} \le C_j^{cpu}$, for $1 \le j \le q$

$\sum_{l=1}^{v} r_l^{ram} \cdot h_{jl} \le C_j^{ram}$, for $1 \le j \le q$
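
A sketch of how the per-PM constraints could be verified for a candidate placement follows. The function name and the example VM and PM sizes are hypothetical. The check covers only the two capacity constraints on this slide; it does not verify that each VM is assigned to exactly one PM.

```python
def placement_is_feasible(H, R, C):
    """Check the per-PM CPU and RAM capacity constraints.

    H -- H[j][l] is 1 if PM j hosts VM l, else 0
    R -- R[l] = (cpu_mhz, ram_mb) of VM l
    C -- C[j] = (cpu_mhz, ram_mb) of PM j
    """
    for h_j, c_j in zip(H, C):
        for r in (0, 1):  # 0 = CPU, 1 = RAM
            used = sum(R[l][r] * h for l, h in enumerate(h_j))
            if used > c_j[r]:
                return False
    return True


R = [(2000, 1000), (2000, 2000), (1000, 1000)]  # three hypothetical VMs
C = [(4000, 4000), (4000, 4000)]                # two hypothetical PMs
fits = placement_is_feasible([[1, 1, 0], [0, 0, 1]], R, C)
overloaded = placement_is_feasible([[1, 1, 1], [0, 0, 0]], R, C)
```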

Packing Output

Packing produces VM placement vectors $H_j$

GDM is run periodically – uses previous $H_j$ to determine which VMs need to be migrated

Goal is to minimize the number of active PMs $X$:

$X = \sum_{j=1}^{q} u_j$, where $u_j = \begin{cases} 1 & \exists\, vm_l \in V \mid h_{jl} = 1 \\ 0 & \text{otherwise} \end{cases}$
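
The two outputs described here, the active PM count X and the set of migrations implied by comparing the previous placement with the new one, can be sketched as follows. The function names and the example placements are hypothetical, and the sketch assumes each VM appears on at most one PM in each placement.

```python
def active_pms(H):
    """X: the number of PMs hosting at least one VM."""
    return sum(1 for h_j in H if any(h_j))


def migrations(H_prev, H_new):
    """List (vm, from_pm, to_pm) for every VM whose host changed."""
    def host_of(H):
        # Map each VM index l to the PM index j that hosts it.
        return {l: j for j, h_j in enumerate(H) for l, h in enumerate(h_j) if h}

    before, after = host_of(H_prev), host_of(H_new)
    return [
        (l, before[l], after[l])
        for l in sorted(before)
        if l in after and before[l] != after[l]
    ]


H_prev = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # one VM on each of three PMs
H_new = [[1, 1, 0], [0, 0, 1], [0, 0, 0]]   # consolidated onto two PMs
x_before, x_after = active_pms(H_prev), active_pms(H_new)
moves = migrations(H_prev, H_new)
```

Here consolidation frees one PM at the price of two migrations, which is the trade-off a migration-cost term would need to capture.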

Simulation Environment

4 PMs, each with 4000 MHz, 4000 MB

2 applications

Cost function:

$Cost(CPU) = \frac{CPU_{demand}}{CPU_{total}}$

Simulation 1

Minimize operating cost impact: ε = 0.05

4 VM classes

Given Table II below, A is given priority

[Figure: simulation results. Panels: Demand (DA and DB), CPUs (RA and RB), #Physical Machines, Global Utility, Response times (TA and TB)]

Simulation (cont’d)

Simulation 2: operating cost factor ε increases to 0.3

Simulation 3

New utility function for both A and B

Simulation 3 results

At t4 and t5, the CPU resource for B descends faster than in the first test

Simulation 4: Changing weight factors

αA = 0.3, αB = 0.7

B obtains enough CPU; A does not and fails to meet its SLA

Recommendations

Constraint Solver for optimizing provisioning and packing is not discussed.*

No mention of any overheads of migrating to a new PM or allocating a new VM.

Simulations do not dive into N vectors for VM provisioning

No discussion of cost or frequency of running GDM

*Choco open source constraint solver is used

Conclusions

Dynamic placement and attention to application-specific goals are valuable

Modeling on top of Constraints allows for flexibility

Utility functions provide a uniform way for applications to self-optimize.