Cloud Resource management problems

7/29/2019 Cloud Resource management problems


Resource Management in Cloud Computing

Abstract

This paper reviews selected papers on resource management in cloud computing. Algorithms and frameworks for
resource allocation, resource calculation, resource provisioning, and resource discovery and selection are described
and summarized.

1 Introduction

Resource management is critical in cloud computing. With improper resource management, applications might
experience network congestion, long wait times, wasted CPU cycles, overused CPU and memory, and security
problems. To maximize utilization of the cloud computing infrastructure and minimize the total cost of both the
infrastructure and the running applications, resources need to be managed properly.

In this report, selected papers on resource management in cloud computing are reviewed. Four types of resource
management are described: resource allocation, resource calculation, resource provisioning, and resource
discovery and selection.

2 Problems of Resource Management

In cloud computing, the underlying large-scale computing infrastructure is often heterogeneous, not only because
it is neither economical nor reliable to procure all the servers, network devices and power supply devices in one
size and at one time, but also because different applications require different computer hardware. For example,
workflow-intensive computing might need standard, cheap hardware, while scientific computing might need
specific hardware other than CPUs, such as GPUs or ASICs.

Several kinds of resources in the large-scale computing infrastructure need to be managed: CPU load, network
bandwidth, disk quota, and even the type of operating system. To provide better quality of service, resources are
provisioned to users or applications via load balancing, high availability, and security and authorization
mechanisms. To maximize cloud utilization, the capacity required by applications must be calculated so that a
minimal set of cloud computing infrastructure devices is procured and maintained. Given access to the cloud
computing infrastructure, applications must allocate proper resources to perform their computation with minimal
time cost and infrastructure cost. Proper resources must also be selected for specific applications. In other words,
resource management in cloud computing falls into four categories: resource allocation, resource calculation,
resource provisioning, and resource discovery and selection.

3 Resource Management Review



Distributed resource management has been a research topic since (Injong 1995), in which resource allocation in
distributed systems is formulated as three kinds of problems: the dining philosophers problem, the drinking
philosophers problem and the dynamic resource allocation problem.

3.1 Resource Allocation

In a distributed computing environment, e.g. a distributed cloud application based on the CORBA infrastructure,
resource allocation (Carter, St. Louis & Andert 1998) can help with component migration, scheduling, and load
balancing across dynamic computational resources. In such an environment, applications run on collections of
interconnected computers, also known as networks of workstations (NOW). The work presents heterogeneous
CORBA job migration and demonstrates a prioritized set of job queues built on that migration. To provide fault
tolerance, application checkpoints are described, so that a system failure is detected automatically and the system
is restarted without loss of client jobs.

An efficient online algorithm for co-allocating resources and making advance resource reservations is presented in
(Claris, George & Khaled 2009). The algorithm uses specific data structures to store the availability of resources,
and efficient range searches are designed to identify all available resources. The overall complexity of a successful
scheduling attempt is expressed in terms of the spatial size of the reservation, the number of servers N, and the
number of partitions Q of the system.
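
To make the idea of searching stored availability for a reservation window concrete, here is a minimal sketch. It assumes each server keeps a sorted list of busy intervals and that a co-allocation succeeds only if enough servers are free for the whole window. This is not the data structure or range search of the cited paper; the names `is_free` and `co_allocate` are hypothetical.

```python
from bisect import bisect_right, insort


def is_free(busy, start, end):
    """Return True if the half-open window [start, end) overlaps no busy interval.

    `busy` is a sorted list of non-overlapping (start, end) tuples.
    """
    # Number of busy intervals that begin at or before `start`.
    i = bisect_right(busy, (start, float("inf")))
    # The interval just before position i may still be running at `start`.
    if i > 0 and busy[i - 1][1] > start:
        return False
    # The interval at position i begins after `start`; it must not begin before `end`.
    if i < len(busy) and busy[i][0] < end:
        return False
    return True


def co_allocate(servers, start, end, count):
    """Reserve `count` servers for [start, end), or return None if too few are free."""
    free = [name for name, busy in servers.items() if is_free(busy, start, end)]
    if len(free) < count:
        return None  # all-or-nothing: every requested server is needed at once
    chosen = free[:count]
    for name in chosen:
        insort(servers[name], (start, end))  # keep each busy list sorted
    return chosen


# Hypothetical availability table: server name -> busy intervals.
servers = {"s1": [(0, 5)], "s2": [(3, 8)], "s3": []}
first = co_allocate(servers, 5, 8, 2)
second = co_allocate(servers, 6, 7, 2)
```

In this example the first request reserves `s1` and `s3`, and the second request is rejected because no server is free in the window. Scanning every server is linear in the number of servers; a production scheduler would use an index over the intervals instead.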

The combination of Service Oriented Architecture (SOA) and virtualization of physical resources has given rise to
the Service Oriented Infrastructure (SOI), which provides a flexible solution for on-demand access to components.
The problem of determining the optimum resource allocation for virtual machines is addressed in (Danilo et al.
2009), in which the resource allocation problem is formulated as a non-linear model that can be solved optimally.

For asynchronous distributed environments, two algorithms, RBA* and OBA, are introduced in (Hegazy &
Ravindran 2002) to maximize application QoS and minimize the deadline miss ratio. Both algorithms build
distributed application models based on Jensen benefit functions to model timely resource requirements in the
distributed applications. Adaptation functions are proposed to predict resource requirements in the near future.
Application adaptation models are built to accommodate dynamic application replication for sharing increased
workload. The underlying network is modeled as a switched real-time Ethernet network. The difference between
the two algorithms is that RBA* allocates resources by analyzing the process response times of each application,
while OBA makes allocation decisions based on processor workloads. The time cost of both algorithms is
expressed as a function of n, the size of the task set, and the time cost of OBA is relatively lower. However, the
experiments show that RBA* provides better QoS with fewer missed deadlines.

A multiple-degree load balancing resource allocation scheme is introduced in (Lee 2004). Each load balance
degree maps to a resource management configuration and reduces data traffic among distributed components. A
High Level Architecture (HLA) bridge middleware environment is also introduced for data bridging among multiple
federations.

An autonomous decentralized resource allocation scheme is presented in (Masuishi et al. 2005). A system
architecture using this scheme is designed, in which a subsystem has a production server to process requests and a
coordination server to adapt resources to load changes. Three architectures are compared and simulated in the
paper: a centralized control architecture, an autonomous decentralized architecture with independent adaptation,
and an autonomous decentralized architecture with coordinated adaptation. The results show that autonomous
decentralized resource allocation is better for server availability, and that the tracking accuracy of autonomous
decentralized resource allocation with coordinated adaptation is comparable to that of the centralized control
architecture.

The problem of apportioning multiple resources to satisfy a single QoS dimension is addressed in (Rajkumar et al.

1998). This paper first introduces Q-RAM, the QoS-based resource allocation model, and then discusses allocation

of a single resource with multiple QoS dimensions. After the discussion of multiple QoS dimensions, the paper

analyzes the allocation problem of multiple resources with single QoS dimension. The paper shows that the problem

of finding optimal resource allocation is NP-hard. However, a simple polynomial algorithm based on computational

geometry helps find a solution very near to the optimal solution of the resource allocation problem.
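
The intuition of spending a resource where it adds the most utility can be sketched as follows. This is a deliberately simplified illustration, not the algorithm of the cited paper: it assumes a single resource handed out in whole units and concave (diminishing-returns) utility tables, for which a greedy choice by marginal utility is sufficient. The function name `greedy_allocate` and the utility values are hypothetical.

```python
import heapq


def greedy_allocate(utilities, total_units):
    """Hand out `total_units` one at a time to the largest marginal utility.

    `utilities[app][k]` is the utility of `app` when it holds k units, so
    `utilities[app][0]` is its utility with nothing allocated.
    """
    alloc = {app: 0 for app in utilities}
    heap = []  # max-heap emulated with negated marginal gains
    for app, table in utilities.items():
        if len(table) > 1:
            heapq.heappush(heap, (-(table[1] - table[0]), app))
    for _ in range(total_units):
        if not heap:
            break
        neg_gain, app = heapq.heappop(heap)
        if -neg_gain <= 0:
            break  # no application benefits from another unit
        alloc[app] += 1
        k = alloc[app]
        table = utilities[app]
        if k + 1 < len(table):
            heapq.heappush(heap, (-(table[k + 1] - table[k]), app))
    return alloc


# Hypothetical utility tables with diminishing returns.
utilities = {"video": [0, 5, 8, 9], "audio": [0, 4, 5, 5]}
allocation = greedy_allocate(utilities, 3)
```

With three units available, the sketch gives two units to `video` and one to `audio`. With multiple resources, the setting analyzed in the paper, such a simple greedy rule no longer guarantees an optimal allocation.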

A decentralized market-based approach to allocating resources in a heterogeneous overlay network is presented in

(Smith et al. 2008). In this paper, a resource allocation strategy of the overlay network resources is defined to assign

traffic dynamically based on the current utilization, thus enabling the system to accommodate fluctuating network

demands. A mathematical model of the resource allocation environment is presented and the problem is regarded as

a constrained optimization problem.

A resource allocation evaluation framework in Application Layer Networks is presented in (Streitberger et al. 2006).

A pyramid of metrics is defined to evaluate resource allocation methods. The metrics fall into two layers:
technical parameters and economic parameters. Technical parameters are technical metrics of the system, such as
Discovery Time, Message Latency, Message Size, Service Provisioning Time, and Negotiation Time. Economic
parameters, in the upper layer of the pyramid, are derived from the technical parameters.

3.2 Resource Calculation

In the Software as a Service (SaaS) paradigm, the cost of deployment, customization and hosting of applications can

be reduced by sharing with multiple tenants. However, to maximize user experience and minimize the cost, the

service provider has to calculate how the resources are allocated. In the work of (Kwok & Mohindra 2008), the

resource calculation is addressed and solutions are provided. A multi-tenant placement model is given to place

multiple applications in a set of servers.
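
As a rough illustration of what a placement calculation involves, the sketch below packs tenants onto as few servers as it can with the classic first-fit-decreasing heuristic. It assumes a single scalar demand per tenant and identical servers; it is not the placement model of the cited work, and `place_tenants` and the demand figures are hypothetical.

```python
def place_tenants(demands, capacity):
    """Assign each tenant to the first server with enough free capacity.

    `demands` maps tenant name -> resource demand; every server offers `capacity`.
    Tenants are considered from largest to smallest demand (first-fit decreasing).
    """
    servers = []  # each entry: {"free": remaining capacity, "tenants": [names]}
    ordered = sorted(demands.items(), key=lambda item: item[1], reverse=True)
    for tenant, demand in ordered:
        if demand > capacity:
            raise ValueError(f"{tenant} needs more than one server can offer")
        for server in servers:
            if server["free"] >= demand:
                break  # reuse the first server that still fits this tenant
        else:
            server = {"free": capacity, "tenants": []}
            servers.append(server)  # no existing server fits: open a new one
        server["free"] -= demand
        server["tenants"].append(tenant)
    return [server["tenants"] for server in servers]


# Hypothetical tenant demands, in arbitrary capacity units.
placement = place_tenants({"a": 4, "b": 3, "c": 3, "d": 2}, capacity=6)
```

Here four tenants fit on two servers. First-fit decreasing is a heuristic, so it does not always find the minimum number of servers.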

For satellite networks, maximizing the use of the satellite is important. A model (Petraki et al. 2007) has been
built for MF-TDMA satellite networks to help allocate the minimum number of timeslots while still guaranteeing
the QoS required in the network. Three algorithms are given in this work: an SIT-side algorithm, an NCS-side
algorithm and a multiple subset sum algorithm.

3.3 Resource Provisioning

Clouds such as Amazon EC2 and S3 offer storage and computational resources that can be used on demand for a
fee. Different usages of the resources in such a cloud have different impacts on the final cost. The paper (Deelman
et al. 2008) addresses this problem with the aim of minimizing the total cost of cloud utilization. The paper first
introduces the scientific application Montage and its computational workflow; it then presents the computational
models and cost models (Amazon EC2); finally, simulations are conducted based on the models. Results show that
for a data-intensive application with a small computational granularity, the storage costs are insignificant
compared to CPU costs.
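
A pay-per-use cost model of this kind can be sketched as a sum of CPU, storage and data transfer charges. The breakdown below is only an assumption about the general shape of such a model, not the cost model of the cited paper. The function `workflow_cost`, the rate names and all rate values are placeholders and do not reflect any provider's actual prices.

```python
def workflow_cost(cpu_hours, storage_gb_months, gb_in, gb_out, rates):
    """Break a workflow's bill into CPU, storage and transfer components."""
    cpu = cpu_hours * rates["cpu_per_hour"]
    storage = storage_gb_months * rates["storage_per_gb_month"]
    transfer = (gb_in * rates["transfer_in_per_gb"]
                + gb_out * rates["transfer_out_per_gb"])
    return {"cpu": cpu, "storage": storage, "transfer": transfer,
            "total": cpu + storage + transfer}


# Placeholder rates in dollars; NOT real provider prices.
rates = {
    "cpu_per_hour": 0.10,
    "storage_per_gb_month": 0.15,
    "transfer_in_per_gb": 0.10,
    "transfer_out_per_gb": 0.20,
}
bill = workflow_cost(cpu_hours=100, storage_gb_months=2, gb_in=10, gb_out=5,
                     rates=rates)
```

Keeping the components separate makes it easy to compare, for a given execution plan, how much of the bill comes from computation and how much from storage or transfer.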

In Service Oriented Architecture (SOA), a set of low-level atomic services is composed to form high-level
services, for which capacity planning and resource provisioning are important. In (Chun et al. 2008), a black-box
method for estimating the CPU resources demanded by service requests is presented. The method is based on linear
regression between the observed requests and resource utilization. Discovery of service composition relationships
is also introduced; the discovered relationships can further be used to improve the quality of the CPU demand
estimates.
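
The regression idea can be illustrated with an ordinary least-squares fit of CPU utilization against request rate. The sketch assumes a single request type and a linear relationship, utilization = intercept + slope * rate, where the slope is read as the CPU demand per request and the intercept as background load. It is a simplification rather than the method of the cited paper, and `fit_cpu_demand` and the sample observations are hypothetical.

```python
def fit_cpu_demand(request_rates, cpu_utilization):
    """Least-squares fit of utilization = intercept + slope * request_rate.

    Returns (slope, intercept): slope estimates the CPU demand per request,
    intercept the utilization that remains when no requests arrive.
    """
    n = len(request_rates)
    if n < 2 or n != len(cpu_utilization):
        raise ValueError("need at least two paired observations")
    mean_x = sum(request_rates) / n
    mean_y = sum(cpu_utilization) / n
    sxx = sum((x - mean_x) ** 2 for x in request_rates)
    if sxx == 0:
        raise ValueError("request rates must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(request_rates, cpu_utilization))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: requests per second and CPU utilization (0..1).
rates = [10, 20, 30, 40]
utilization = [0.15, 0.25, 0.35, 0.45]
demand_per_request, background = fit_cpu_demand(rates, utilization)
```

For these observations the fit yields a demand of about 0.01 CPU per request per second and a background load of about 0.05. With several request types, the same approach becomes a multiple regression with one coefficient per type.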

An absolute-delay-guaranteed resource provisioning technique for real-time applications is presented in (Chunfeng
et al. 2003). Uneven distribution of the runtime network workload is taken into account when configuring the
resources provided. A resource provisioning model for real-time applications is built, in which the end-to-end
worst-case delay for real-time traffic, the bandwidth constraint, the delay constraint, and the admission probability
are formulated. The results show that the overall resource utilization, measured by admission probability, is higher
than that of the uniform UBAC method.

A general complex resource provisioning (CRP) model and the major requirements on the authorization service
(AuthZ) infrastructure to support multi-domain CRP are defined in (Demchenko, Gommans & de Laat 2007). Two
main issues are addressed: AuthZ session support and policy expression for CRP models. The eXtensible Access
Control Markup Language (XACML) and its special profiles for specifying access control policies for complex
resources are described. An XML-based AuthZ ticket format is proposed to support extended AuthZ session
context. Specific functionality is added to the gLite Java Authorization Framework (gJAF) to handle dynamic
security context, e.g. the AuthZ sessions.

On-demand resource provisioning for application-level BPEL workflows is introduced in (Dornemann, Juhnke &
Freisleben 2009), in which BPEL service calls are scheduled according to the load of the target hosts. The solution
schedules workflow steps on underutilized servers and, at service load peaks, allocates extra hosts from a Cloud
computing infrastructure. The BPEL standard does not need to be modified in this approach. The Cloud computing
infrastructure used in the paper is Amazon's Elastic Compute Cloud (EC2), and the BPEL implementation is the
ActiveBPEL engine.
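
The scheduling policy described above, which prefers the least loaded host and adds a host only at load peaks, can be sketched as follows. The load threshold, the `schedule_call` function and the `provision` callback are assumptions made for illustration; a real system would start a cloud virtual machine where the stub below merely returns a new host name.

```python
def make_provisioner(prefix="cloud"):
    """Return a stub that 'starts' hosts by inventing sequential names."""
    counter = 0

    def provision():
        nonlocal counter
        counter += 1
        return f"{prefix}-{counter}"

    return provision


def schedule_call(loads, call_load, threshold, provision):
    """Place one service call on the least loaded host, or on a new host.

    `loads` maps host name -> current load and is updated in place.
    A new host is provisioned when even the least loaded host would
    exceed `threshold` after accepting the call.
    """
    host = min(loads, key=loads.get) if loads else None
    if host is None or loads[host] + call_load > threshold:
        host = provision()
        loads[host] = 0.0
    loads[host] += call_load
    return host


# Hypothetical starting loads for two local hosts.
loads = {"h1": 0.5, "h2": 0.25}
provision = make_provisioner()
placements = [schedule_call(loads, 0.25, 0.75, provision) for _ in range(4)]
```

The first three calls fill the two local hosts up to the threshold, and the fourth call triggers the provisioning of an extra host.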



A Universal Factory Service (UFS) that provides dynamic resource deployment and a resource broker named the
Door service is introduced in (Eun-Kyu et al. 2005).

An Autonomous Decentralized Community System (ADCS) is proposed to adapt transportation services (Filali,
Hafid & Gendreau 2008), in which global mobile communication technologies are used and service conditions
change dynamically. In ADCS, resources are shared among neighboring transportation nodes, with each node
providing resources in response to requests from other nodes. An autonomous collaboration technology is
proposed to support cooperation among computational nodes. The effectiveness of ADCS is demonstrated through
the evaluation of a taxi dispatching application.

For large-scale scientific workflows, three techniques for resource provisioning are presented in (Juve & Deelman
2008): advance reservations, multi-level scheduling, and infrastructure as a service (IaaS). With advance
reservations, users request slots from batch schedulers, specifying the number of resources to reserve, the duration
of the reservation, and its start and end times. An alternative to scheduler-based advance reservations is
probabilistic advance reservation, in which reservations are made based on statistical estimates of queue times. In
multi-level scheduling, the allocation of resources and the management of application tasks are separated:
application tasks are submitted using standard mechanisms, but the node managers are in charge of contacting an
external resource manager. IaaS enables users to run applications on remote servers by configuring and launching
virtual machines on these servers. The virtual machines can be allocated certain amounts of CPU, disk space and
memory, and configured with particular operating systems, computing frameworks and application software.
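
The probabilistic advance reservation mentioned in this paragraph relies on statistical estimates of queue times. One simple way to turn such estimates into a submission time, sketched here as an assumption and not as the method of the cited paper, is to take an empirical quantile of past queue waits and submit that much earlier than the desired start. The function names and the sample waits are hypothetical.

```python
import math


def queue_time_quantile(samples, p):
    """Nearest-rank empirical quantile of observed queue waiting times."""
    if not samples or not 0 < p <= 1:
        raise ValueError("need samples and a probability in (0, 1]")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


def latest_submit_time(target_start, samples, p):
    """Latest submission time so that, judging by past waits, the job has
    started by `target_start` in about a fraction `p` of cases."""
    return target_start - queue_time_quantile(samples, p)


# Hypothetical past queue waits, in minutes.
waits = [30, 10, 20, 40]
median_wait = queue_time_quantile(waits, 0.5)
submit_at = latest_submit_time(100, waits, 0.5)
```

Choosing a higher probability `p` moves the submission earlier and makes a late start less likely, at the price of possibly starting before the resources are needed.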

A framework for dynamically provisioning virtual machines is introduced in (Kusic et al. 2009) to reduce power
consumption in large data centers. By sharing servers among multiple online services through virtualization
technology, it achieves higher server utilization and energy efficiency while still maintaining the desired quality
of service. A limited lookahead control (LLC) framework is implemented and validated. The switching costs and
the notion of risk are explicitly encoded in the optimization problem. The experiments show that the LLC
framework can save 22% on average in power consumption cost.
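
The lookahead idea with explicit switching costs can be sketched as follows: at each step, every plan of server counts over a short forecast horizon is evaluated, and only the first action of the cheapest plan is applied. This is a toy model under stated assumptions and not the controller of the cited paper: the linear cost terms, the exhaustive search and the function `lookahead_step` are hypothetical, and the risk term of the paper is omitted.

```python
from itertools import product


def lookahead_step(current, forecast, max_servers, capacity,
                   power_cost, switch_cost, sla_penalty):
    """Return the number of servers to run next.

    Every plan over the forecast horizon is scored by power cost per active
    server, a penalty per unit of unmet demand, and a cost per server
    switched on or off. The first action of the cheapest plan is returned.
    """
    best_cost, best_first = None, current
    for plan in product(range(max_servers + 1), repeat=len(forecast)):
        cost, previous = 0, current
        for active, demand in zip(plan, forecast):
            unmet = max(0, demand - active * capacity)
            cost += active * power_cost + unmet * sla_penalty
            cost += abs(active - previous) * switch_cost
            previous = active
        if best_cost is None or cost < best_cost:
            best_cost, best_first = cost, plan[0]
    return best_first


# Hypothetical parameters: 3 servers running, each serving 100 requests.
params = dict(max_servers=3, capacity=100, power_cost=10,
              switch_cost=8, sla_penalty=1)
myopic = lookahead_step(3, [100], **params)
with_lookahead = lookahead_step(3, [100, 300], **params)
```

With a one-step horizon the sketch shuts down two servers during the dip in demand, whereas a two-step horizon keeps all three running because switching them off and on again would cost more than the power saved. The exhaustive search grows exponentially with the horizon, so it is only practical for very short horizons.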
