Autonomic SLA-driven Provisioning for Cloud Applications


Nicolas Bonvin, Thanasis Papaioannou, Karl Aberer

CCGRID 2011, May 23-26 2011, Newport Beach, CA, USA

[email protected] - EPFL

Cloud Apps – Issue #1 : Placement

● A distributed, component-based application running on an elastic infrastructure

EPFL – LSIR – Nicolas Bonvin

[Figure: components C1, C2, C3 and C4 deployed on VM1, VM2 and VM3]


● Performance of C1, C2 and C3 is probably lower than that of C4

● No info on other VMs colocated on the same server !



[Figure: components C1 to C4 on VM1, VM2 and VM3, which run on Server 1 and Server 2]





No control over placement


Cloud Apps – Issue #2 : Instability

● Load-balanced traffic to 4 identical components on 4 identical VMs


[Figure: four replicas of C1 on VM1, VM2, VM3 and VM4, each labelled 100 ms]

● Physical server, hypervisor, storage, ...

– VM performance can vary by a factor of up to 4 ! [Dej2009]

[Figure: VM1 100 ms, VM2 140 ms, VM3 100 ms, VM4 100 ms]

● Component overloaded

[Figure: VM1 130 ms, VM2 140 ms, VM3 100 ms, VM4 100 ms]

● Component bug, crash, deadlock, ...

[Figure: VM1 130 ms, VM2 140 ms, VM3 100 ms, VM4 infinity]

● Failure of C1 on VM4 -> load is rebalanced


Application should react early !

Cloud Apps – Overview

● Build for failures

– Do not trust the underlying infrastructure

– Do not trust your components either !

● Components should adapt to changing conditions

– Quickly

– Automatically

– e.g. by replacing a wonky VM with a new one


Scarce: a framework to build scalable cloud applications

Architecture Overview


[Figure: servers, each running an agent and hosting components (A, B, E); agents communicate by gossiping and broadcast]

● An agent on each server / VM

– Starts, stops and monitors the components

– Takes decisions on behalf of the components

● An agent communicates with other agents

– Routing table

– Status of the server (resource usage)
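
The status exchange between agents can be pictured as a gossip round in which each agent keeps the newest entry it has seen for every server. The following is a minimal sketch under assumptions: the `ServerStatus` record, its `version` counter and the `merge_views` helper are hypothetical names, and the slides do not specify the message format or how peers are chosen.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ServerStatus:
    """Hypothetical status record an agent advertises for its server."""
    server: str
    cpu_usage: float   # fraction in [0, 1]
    io_usage: float    # fraction in [0, 1]
    version: int       # assumed to increase with every local update


def merge_views(local: dict[str, ServerStatus],
                remote: dict[str, ServerStatus]) -> dict[str, ServerStatus]:
    """Return a view holding, for every server, the entry with the highest version."""
    merged = dict(local)
    for name, status in remote.items():
        known = merged.get(name)
        if known is None or status.version > known.version:
            merged[name] = status
    return merged


# Two agents exchange their views during one gossip round.
view_a = {"s1": ServerStatus("s1", 0.70, 0.20, version=3)}
view_b = {"s1": ServerStatus("s1", 0.40, 0.10, version=2),
          "s2": ServerStatus("s2", 0.25, 0.65, version=5)}
view_a = merge_views(view_a, view_b)
```

After the merge, `view_a` keeps its own fresher entry for `s1` and learns about `s2`, without any central registry.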


An economic approach

● Time is split into epochs (no synchronization between servers)

● Servers charge a virtual rent for hosting a component according to

– Current resource usage (I/O, CPU, ...) of the server

– Technical factors (HW, connectivity, ...)

– Non-technical factors (country stability, ...)

● Components

– Pay virtual rent at each epoch

– Gain virtual money by processing requests

– Take decisions based on balance ( = gain – rent ) : replicate, migrate, suicide or stay

● Virtual rents are updated by gossiping (no centralized board)
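
The per-epoch decision of a component can be sketched as follows. The slides only state that the balance is gain minus rent and that the possible actions are replicate, migrate, suicide and stay; the concrete rules, the `replicate_margin` parameter and the `cheapest_rent_elsewhere` input are illustrative assumptions, not the framework's actual policy.

```python
from enum import Enum


class Action(Enum):
    REPLICATE = "replicate"
    MIGRATE = "migrate"
    SUICIDE = "suicide"
    STAY = "stay"


def decide(gain: float, rent: float, cheapest_rent_elsewhere: float,
           replicas: int, replicate_margin: float = 2.0) -> Action:
    """Pick an action for one epoch from the component's balance.

    The rule set and `replicate_margin` are illustrative assumptions.
    """
    balance = gain - rent
    if balance < 0:
        # Losing money: move to a cheaper server if one exists,
        # otherwise drop this replica unless it is the last one.
        if cheapest_rent_elsewhere < rent:
            return Action.MIGRATE
        return Action.SUICIDE if replicas > 1 else Action.STAY
    if gain >= replicate_margin * rent:
        # Earning far more than the rent suggests a busy component.
        return Action.REPLICATE
    return Action.STAY
```

Because every component evaluates this rule locally at each epoch, no central scheduler is needed; the gossiped rents are the only shared input.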

Economic model (i)


● The rent of a server is different for each component !

Economic model (ii)


● VM1 and VM2 have an « identical » resource usage : 45%

● Server rent = server's resource usage weighted by the component's weights

– Rent for C1 @ VM1 > rent for C1 @ VM2

[Figure: multiplexing of server resources, C1 placed on VM1 or VM2 ?]

C1 : CPU 30%, I/O 5%

VM1 : CPU 70%, I/O 20%

VM2 : CPU 25%, I/O 65%
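
The numbers on this slide can be checked with a short calculation. The sketch below assumes that a component's weights are its own resource usage normalised to sum to 1; the slides do not give the exact weighting, so `component_weights` and `rent` are hypothetical helpers.

```python
def component_weights(usage: dict[str, float]) -> dict[str, float]:
    """Normalise a component's resource usage into weights that sum to 1 (assumption)."""
    total = sum(usage.values())
    return {resource: value / total for resource, value in usage.items()}


def rent(server_usage: dict[str, float], weights: dict[str, float]) -> float:
    """Server usage as seen by one component: a weighted sum over resources."""
    return sum(weights[r] * server_usage[r] for r in weights)


c1 = {"cpu": 30.0, "io": 5.0}      # C1 is mostly CPU-bound
vm1 = {"cpu": 70.0, "io": 20.0}    # plain average: 45%
vm2 = {"cpu": 25.0, "io": 65.0}    # plain average: 45%

w = component_weights(c1)
rent_vm1 = rent(vm1, w)   # about 62.9
rent_vm2 = rent(vm2, w)   # about 30.7
```

Although both VMs show the same average usage, the CPU-heavy C1 sees VM1 as roughly twice as expensive as VM2, which matches the statement that the rent for C1 @ VM1 exceeds the rent for C1 @ VM2.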

Economic model (iii)


● Choosing a candidate server j during replication/migration of a component i

– Net benefit maximization

● 2 optimization goals :

– High availability through geographical diversity of replicas

– Low latency by grouping related components

● gj : weight related to the proximity of the server location to the geographical distribution of the client requests to the component

● Si : the set of servers hosting a replica of component i

SLA Performance Guarantees (i)


● Each component has its own SLA constraints

● SLA derived directly from entry components

● Resp. Time = Service Time + max (Resp. Time of Dependencies)

[Figure: dependency graph of components C1 to C5; C1 carries an SLA of 500 ms]
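
The response-time relation above can be evaluated recursively over the dependency graph. This is a minimal sketch: the service times and the call graph below are made-up values for illustration (the slides do not give them), and the graph is assumed to be acyclic.

```python
def response_time(component: str,
                  service_time: dict[str, float],
                  dependencies: dict[str, list[str]]) -> float:
    """Resp. time = service time + max(resp. time of dependencies)."""
    children = dependencies.get(component, [])
    slowest_child = max(
        (response_time(c, service_time, dependencies) for c in children),
        default=0.0,  # a leaf component has no dependencies to wait for
    )
    return service_time[component] + slowest_child


# Hypothetical service times (ms) and call graph; not taken from the slides.
service_time = {"C1": 50, "C2": 100, "C3": 80, "C4": 120, "C5": 60}
dependencies = {"C1": ["C2", "C3"], "C2": ["C4"], "C3": ["C5"]}

total = response_time("C1", service_time, dependencies)  # 50 + max(220, 140) = 270
```

With these example values the entry component C1 answers in 270 ms, well inside a 500 ms SLA; the slowest dependency chain (C2 then C4) dominates.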

SLA Performance Guarantees (ii)


● SLA propagation from parents to children

● Parent j sends its performance constraints (e.g. response time upper bound) to its dependencies D(j) :

● Child i computes its own performance constraints :

● : group of constraints sent by the replicas of the parent g
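
One way to picture the propagation step is shown below. It is a sketch under stated assumptions, not the formulas from the slide: a parent is assumed to grant its dependencies whatever remains of its own bound after its service time (which follows from the response-time relation), and a child is assumed to keep the tightest bound it receives. Both helper names and the service times are hypothetical.

```python
def constraint_for_children(parent_bound_ms: float, parent_service_ms: float) -> float:
    """Upper bound a parent can grant to each of its dependencies.

    Derived from: resp. time = service time + max(resp. time of dependencies).
    """
    return max(parent_bound_ms - parent_service_ms, 0.0)


def own_constraint(received_ms: list[float]) -> float:
    """A child must satisfy every parent replica, so it keeps the tightest bound (assumption)."""
    return min(received_ms)


# C1 has a 500 ms SLA; assume two C1 replicas with service times of 120 ms and 150 ms.
sent = [constraint_for_children(500, 120), constraint_for_children(500, 150)]
child_bound = own_constraint(sent)  # min(380, 350) = 350
```

Applying the same two steps at every level pushes the entry component's SLA down to the leaves of the dependency graph.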

SLA Performance Guarantees (iii)


● SLA propagation from parents to children

Automatic Provisioning


● Usage of allocated resources is maximized :

– autonomic migration / replication / suicide of components

– not enough to ensure end-to-end response time

● Cloud resources managed by framework via cloud API

● Each individual component has to satisfy its own SLA

– SLA easily met -> decrease resources (scale down)

– SLA not met -> increase resources (scale up, scale out)
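
The scale-down / scale-up / scale-out rule above can be sketched as a single decision function. The `low_watermark` threshold, the per-server core limit as an input, and the returned action names are illustrative assumptions; the slides do not state when an SLA counts as "easily met".

```python
def provisioning_action(p95_ms: float, sla_ms: float,
                        cores: int, max_cores: int,
                        low_watermark: float = 0.5) -> str:
    """Map the observed response time of a component to a provisioning action.

    `low_watermark` and the action names are illustrative assumptions.
    """
    if p95_ms > sla_ms:
        # SLA not met: add a core if the server allows it, else add a server.
        return "scale_up" if cores < max_cores else "scale_out"
    if p95_ms < low_watermark * sla_ms and cores > 1:
        # SLA easily met: release resources.
        return "scale_down"
    return "keep"
```

A real deployment would translate the returned action into a call to the cloud provider's API; that part is left out here because the slides do not name a provider.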

Adaptivity to slow servers


● Each component keeps statistics about its children

– e.g. 95th percentile response time

● A routing coefficient is computed for each child at each epoch

– Send more requests to better-performing children
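
A routing coefficient of this kind could be computed as below. Weighting each child by the inverse of its 95th percentile response time is an assumption made for illustration; the slides do not give the formula, and `routing_coefficients` is a hypothetical helper.

```python
def routing_coefficients(p95_ms: dict[str, float]) -> dict[str, float]:
    """Weight each child inversely to its 95th percentile response time.

    Inverse-latency weighting is an assumption; the coefficients sum to 1.
    """
    inverse = {child: 1.0 / latency for child, latency in p95_ms.items()}
    total = sum(inverse.values())
    return {child: value / total for child, value in inverse.items()}


# One child replica sits on a slow server (300 ms instead of 100 ms).
coeffs = routing_coefficients(
    {"replica_a": 100.0, "replica_b": 100.0, "replica_c": 300.0}
)
```

Here the slow replica receives one seventh of the requests and each fast replica three sevenths. A parent could then pick a child per request with `random.choices`, passing the coefficients as `weights`.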

Evaluation

Evaluation: Setup


● 5 components, mostly CPU-intensive (wc >> wm, wn, wd)

● 8 servers, 8 cores each (Intel Core i7 920, 2.67 GHz, 8 GB, Linux 2.6.32-trunk-amd64)

● d = 0, C = 110, k = 10000, xs* = 25%

[Figure: dependency graph of components C1 to C5; C1 carries an SLA of 500 ms]

Adaptation to Varying Load (i)


● 5 rps to 60 rps at minute 8, step 5 rps/min

● Static setup : 2 servers with 2 cores

Adaptation to Varying Load (ii)


● 5 rps to 60 rps at minute 8, step 5 rps/min

● Static setup : 2 servers with 2 cores

Adaptation to Slow Server


● Max 2 cores/server, 25 rps

● At minute 4, a server gets slower (200 ms delay)

Scalability


● Add 5 rps per minute until 150 rps

● Max 6 cores/server

Conclusion



● Framework for building cloud applications

● Elasticity : add/remove resources

● High Availability : software, hardware, network failures

● Scalability : growing load, peaks, scaling down, ...

– Quick replication of busy components

● Load Balancing : load has to be shared by all available servers

– Replication of busy components

– Migration of less busy components

– Reach equilibrium when load is stable

● SLA performance guarantees

– Automatic provisioning

● No synchronization, fully decentralized

Thank you !
