7/28/2019 Software Quality of Service in Composite Applications Built with Web Services PhD Thesis
Software Quality of Service
in Composite Applications
Built with Web Services
Shelly Saunders
A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy of
Nottingham Trent University and Southampton Solent University
November 2010
Abstract
Over recent years businesses have evolved their enterprise architectures to include
packaged applications, legacy systems, and bespoke line-of-business (LOB) applications that
are integrated using an architectural approach called service-oriented architecture (SOA).
The architecture promotes agile and reconfigurable application development, which is well
suited to 21st-century businesses. Modern SOAs now extend into the Cloud using pay-per-use
Software as a Service (SaaS).
Composite applications built using Web Services are one way in which SOA principles are
being introduced into enterprise architectures. Although there are many business reasons
for adopting this type of architecture, methods and techniques for estimating the overall
Quality of Service (QoS) of a composite application built using Web Services do not
currently exist.
This thesis attempts to address a number of questions in this area. Firstly, how can we
predict the performance of a composite application, for example, what is the effect on
performance of replacing one Web Service with another one of equivalent functionality, or
by dynamically changing the steps in a workflow? Secondly, how can we maximise the
performance of that application through effective use and exploitation of the resources
available to it? Thirdly, what strategies are there for improving our ability to meet QoS
metrics in a composite application? As the provider of a composite application to one or
more clients, how can we manage situations where resources are overloaded? Under these
conditions it would be useful to be able to selectively admit or reject requests from clients
based on criteria that maximise the provider's profits or business objectives.
Fourthly, how can we define and manage SLAs for performance metrics in composite
applications?
This thesis makes the following five contributions. The first is detailed test results
demonstrating that the Mean Value Analysis (MVA) algorithm can be applied to a queuing
network description of a composite application using Web Services. The second is the
demonstration of the MVA algorithm as the fitness function for a Genetic Algorithm (GA).
The third is a practical example of applying a GA to dynamic management of a real workflow
implemented as a set of Web Services across multiple servers. The fourth is the
demonstration of strategies for meeting QoS metrics under a number of different real-life
overload conditions. The fifth is a proposal for improvements that could be made to existing
SLA design methodologies and SLA languages to define QoS metrics for composite applications.
Dedication
To Greta 1993-2008 and Jenny 1970-2009
May flights of angels sing thee to thy rest.
Acknowledgements
I would like to acknowledge the encouragement and support of my supervisors at Southampton
Solent University: Eur Ing Professor Margaret Ross MBE, Eur Ing Geoff Staples and Dr Sean
Wellington, Head of the Technology Research Centre. I would also like to thank Professor Mike
Barnett and Professor John Rees who both provided helpful advice and direction.
The final version of this thesis benefited greatly from the comments and input made by Edwin Gray
during the viva.
Steve White, of IBM's Autonomic Computing laboratory at the Thomas J Watson Research Centre in
Hawthorne, gave up time to read and comment on some of this work in a very valuable session.
I have also had useful conversations about SOA in general with colleagues at my former employer,
ACE Group, as well as IBM staff from the Hursley labs near Winchester. I would also like to thank
Marlborough Stirling plc, who gave permission for me to use the results of a performance testing
exercise conducted on their systems.
Table of Contents
Chapter 1 Introduction
1.1 Motivation
1.2 Hypothesis
1.3 Methodology
1.4 Contributions
1.5 Thesis Roadmap
Chapter 2 Analysis of the Problem
Chapter 3 Related Work
3.1 Background
3.1.1 Discovery and Negotiation
3.1.2 Service Level Agreement
3.1.3 Service Provision
3.1.4 Monitoring
3.2 Adaptive Control of Web Applications and Services
3.3 Queuing Theory
3.3.1 Dynamic Resource Configuration
3.3.2 Admission Control
3.3.3 Dynamic Provisioning of Idle Resources
3.3.4 Extending to Multiple Tiers
3.4 Control Theory
3.4.1 Admission Control
3.4.2 Degraded Service
3.4.3 Extending to Multiple Tiers
3.4.4 Fuzzy Controllers
3.5 Combined Approaches
3.6 Solving Optimization Problems
3.6.1 Utility Functions
3.6.2 Integer Linear Programming
3.6.3 Genetic Algorithms
3.7 Concluding Remarks
3.8 Publications
Chapter 4 Designing SLAs for Composite Applications
4.1 Service Level Management
4.1.1 Service Monitoring
4.1.2 Key Quality Indicators and Key Performance Indicators
4.2 Service Level Agreement Design
4.2.1 COSMA
4.2.2 MoDe4SLA
4.2.3 Differential QoS Support
4.3 Proposals
Chapter 5 An MVA Performance Model for a SOA
5.1 Introduction
5.2 Performance Requirements
5.3 Business Demand Modelling
5.4 Workload Characterisation
5.4.1 Task Distribution
5.4.2 Arrival Time Distribution
5.4.3 Service Time Distribution
5.4.4 Load-Dependence of Service Times
5.5 Modelling the Application
5.5.1 Mean Value Analysis
5.5.2 The Queuing Network Model of an N-Tier Application
5.6 Management Software
5.6.1 Capturing Application Metrics
5.6.2 Statistical Analysis of Raw Metrics
5.6.3 MVA Modeller
5.7 Results
5.7.1 Accuracy of the Model
5.8 Using the Model to Predict the Performance of a New Workflow
5.9 Summary and Discussion
5.10 Publications
Chapter 6 A Genetic Algorithm with an MVA Fitness Function for Runtime Performance Improvements of a Composite Application Built Using Web Services
6.1 Introduction to Genetic Algorithms
6.2 Comparison with Other Techniques
6.3 A GA for the Sample Application
6.3.1 Chromosome Encoding
6.3.2 Initial Population
6.3.3 Fitness Evaluation
6.3.4 Fitness Selector
6.3.5 Constraints
6.3.6 Crossover
6.3.7 Mutation
6.3.8 Population Evolution
6.4 Technical Design of the Management Solution
6.4.1 The ESB
6.4.2 Dynamic Routing
6.4.3 Logical Design
6.5 Test Harness
6.5.1 Sample Workflows
6.5.2 Baseline Performance Test Results
6.5.3 Post GA Results
6.6 Summary and Discussion
Chapter 7 Strategy for a QoS-aware Composite Applications in the Cloud
7.1 Enterprise SOA and Cloud Computing
7.1.1 Cloud Computing and Software as a Service
7.1.2 A Unified Architecture
7.1.3 Challenges with SLA Management
7.2 Modelling a Composite Application in the Cloud
7.3 Strategies for Automated QoS Control in the Cloud
7.3.1 Changes in Workload
7.3.2 Loss of Service
7.3.3 Increases in Latency
7.3.4 Differentiated Services
7.4 Conclusions
7.5 Publications
Chapter 8 Evaluation and Conclusions
8.1 Discussion of Results
8.2 Evaluation of Results and Methodologies
8.3 Contributions of this Thesis
8.4 Limitations of this Thesis
8.5 Future Work
References
Appendix A Publications Linked to This Thesis
Journals
Conferences
Appendix A2 Other Research Outputs Not Directly Relevant To This Thesis
Software Engineering
Optoelectronics
Patents
List of Figures
Figure 1.1 Service-oriented application integration.
Figure 1.2 A Virtual Enterprise.
Figure 4.1 Service Level Management
Figure 4.2 Composite Web Services
Figure 4.3 Simplified COSMAdoc schema
Figure 5.1 Probability of client requesting each job class.
Figure 5.2 Distribution of tasks across three tiers for each of the 26 classes of work.
Figure 5.3 Inter-arrival time distribution from web logs.
Figure 5.4 The tail of the distribution for inter-arrival times beyond two seconds.
Figure 5.5 Inter-arrival times of tasks on the job queue.
Figure 5.6 Service time distributions for all tasks executing in under 2 sec.
Figure 5.7 Service time distributions for all tasks executing in over 2 sec
Figure 5.8 Increase in task service time with load
Figure 5.9 Queuing Network Model of an N-tier Application
Figure 5.10 An ESB executing a sequence of tasks via Web Services
Figure 5.11 Execution times of each class in a simple workflow, together with the total response
time
Figure 5.12 Comparison of the response times predicted by the model and the actual response times
at different loads.
Figure 5.13 Accuracy of the model
Figure 5.14 The differences between the predicted and observed results when the model is used in a
predictive manner
Figure 6.1 Layers of services become increasingly more coarse-grained, with the top layer of
orchestration services providing a standards based aggregation and process framework.
Figure 6.2 Logical Design
Figure 6.3 Sample Workflows
Figure 6.4 Baseline Results
Figure 7.1 SOA and SaaS used to create a composite application
Figure 7.2 Generalised example queuing network model including SaaS services
List of Tables
Table 5.1 MVA Algorithm
Table 6.1 Example Chromosome Encoding
Table 6.2 Logical Design
Table 6.3 Sample Workload
Table 6.4 Job Distribution
Table 6.5 Measured Execution Times
Table 6.6 Optimised Job Distribution
Glossary of Terms
Admission Control - a QoS procedure which determines the rate at which jobs are accepted into a
system or network, or indeed, whether the jobs will be accepted at all.
Artificial Intelligence (AI) - a branch of computer science dealing with simulating intelligent
behaviour in computers.
Autonomic Computing - a term created by IBM to describe self-managing computer systems.
BPEL - an XML language for describing business processes.
Cloud Computing - the provision of dynamically scalable and often virtualised resources as a
service over the Internet. Cloud computing services often provide common business applications
online that are accessed from a web browser, while the software and data are stored on the servers.
Control Theory - a technique from engineering whereby one or more input variables are tracked by
a controller in order to manipulate one or more output variables.
Decision Theory - a branch of AI concerned with decision making. In particular, decision theory
addresses problems such as how to measure the outcome of a decision to ensure that it is optimal
and how to make decisions with incomplete knowledge (choice under uncertainty).
E-Commerce Transaction - a business transaction that occurs over a network between two
partners. The transaction is likely to consist of a number of discrete business processes that
automatically engage other IT systems.
ESB (Enterprise Service Bus) - a layer of abstraction on top of a messaging service stack that
supports Web services standards, synchronous and asynchronous messaging patterns,
content-based routing, rules-based content filtering or enrichment, XML transformation services,
and standards-based adapters (such as JCA, JMS).
Event-driven Architecture (EDA) - a software architecture pattern promoting the production,
detection, consumption of, and reaction to events. Event-driven architecture can complement
service-oriented architecture (SOA) because services can be activated by triggers fired on incoming
events.
Fuzzy Logic - a form of multi-valued logic derived from fuzzy set theory to deal with
reasoning that is approximate rather than precise.
Genetic Algorithm (GA) - a search technique used in computing to find exact
or approximate solutions to optimization and search problems. Genetic algorithms are categorized
as global search heuristics. Genetic algorithms are a particular class of evolutionary algorithms (also
known as evolutionary computation) that use techniques inspired by evolutionary biology such as
inheritance, mutation, selection, and crossover (also called recombination).
Grid Computing - an emerging architecture whereby many networked computers are used to
process work in parallel by packaging the work up into many small jobs.
J2EE - a multi-platform framework that provides software developers with a large number of
pre-coded solutions for common tasks in the Java language.
Kendall Notation - a system for describing the characteristics of a queuing system. Letters are used
to describe the shape of a distribution: M-Markovian, G-general. The first letter defines the job
interarrival distribution and the second letter describes the service time distribution. A number
is then used to give the number of servers, so M/M/1 is a queue where job interarrival and service
times have a Markovian distribution and there is a single server.
Linear Programming (LP) - in mathematics, linear programming problems involve the optimization
of a linear objective function, subject to linear equality and inequality constraints.
Mean Value Analysis (MVA) - a technique for analysing closed multichain queuing networks.
.NET - a framework for the Windows operating system that provides software developers with a
large number of pre-coded solutions for common tasks. It supports development in multiple
languages, but C#.NET and VB.NET are the most popular.
PI Controller - in control theory, a controller with both proportional and integral feedback control.
It is popular because its output can hold a nonzero constant value under steady-state conditions
even when the error signal is zero.
Queuing Theory - the mathematical analysis of queues.
QoS - Quality of Service, in a software application sense, refers to non-functional attributes such as
response time, availability, and reliability. QoS attributes are used to provide measurable constraints
in an SLA.
SLA - a Service Level Agreement is a formal contract between an IT service provider and a service
consumer.
Service-oriented Architecture (SOA) - "a style of multi-tier computing that helps organizations
share logic and data among multiple applications and usage modes." [Natis and Schulte, 1996]
SOAP - a protocol for exchanging XML-based messages between software components.
Software as a Service (SaaS) - an element of Cloud Computing, SaaS is a model of software
deployment whereby a provider licenses an application to customers for use as a service on
demand. SaaS software vendors may host the application on their own web servers or download the
application to the consumer device, disabling it after use or after the on-demand contract expires.
UDDI - an XML-based registry for listing the WSDL and URLs of Web services.
UML - Unified Modelling Language.
Utility Functions - utility is a measure of preference, expressed through utility functions. Utility
functions assign numbers to members of a choice set in order to rank the choices.
URL (URI) - a unique identifier for the location of a Web service, application or site (on a corporate
network or on the Internet).
Web Service - commonly defined as a software service that uses WSDL to define its
interface and SOAP envelopes for message exchange.
Workflow - a business process implemented as a composite Web Service comprising a number of
steps, each consuming finer-grained Web Services.
Workload - in e-commerce terms, the rate at which requests are made to system resources.
WSDL - an XML format for describing the public interface of Web Services.
Chapter 1
Introduction
It is commonplace for organisations to automate complex business processes using service-oriented
architectures (SOA) [Erl, 2008]. A service-oriented architecture is a distributed architecture that
models components as services. It is built upon a collection of open standards, including the Web
Services Description Language (WSDL) [W3C, 2001], SOAP [W3C, 2007], WS-Security [OASIS, 2006a],
WS-Policy [IBM, 2006], WS-ReliableMessaging [OASIS, 2006b] and the Business Process Execution
Language (BPEL) [OASIS, 2007].
A SOA encourages enterprise application integration and composite application development by
virtue of the fact that it is intrinsically loosely-coupled [Erl, 2008]. For example, Web Services can
provide wrappers to applications built on legacy systems, allowing the functionality of those legacy
applications to be integrated with new functionality, all delivered via a single portal
(Figure 1.1).
Figure 1.1 Service-oriented application integration. The legacy functionality of back-end systems is
exposed via Web Services. An integration layer provides business process orchestration and the portal
layer provides the user interface.
This type of composite application built using Web Services not only helps companies maximise their
investment in legacy systems, but also helps streamline business processes. The business and
technical benefits of such applications collectively offer what is often called a virtual enterprise or
virtual organisation [Khoshafian, 2002] (Figure 1.2).
[Figure 1.2 diagram labels: SOAP, WS-Addressing; BPEL/OWL-S; WSDL, WS-Policy; WS-Transaction, WS-Security; UDDI; Service Consumer; Intranet; Internet; CRM; ERP; Legacy Apps; Service Providers; Enterprise Application Integration; B2B Integration via On-Demand Service Providers; In-house services]
Figure 1.2 A Virtual Enterprise (adapted from [Khoshafian, 2002]). A service-oriented architecture can
flexibly integrate applications, functionality and data not only across legacy applications on the
organisation's own intranet, but also across enterprise boundaries to consume external third-party
services. Furthermore, the organisation can expose its own composite services to its own
clients.
Cloud Computing allows SOAs to reach out across the globe, consuming software services from
around the world, a concept usually referred to as Software as a Service (SaaS) [Lakshmanan,
2009].
The ability to automatically discover services, compose those services into a business process and
invoke them as part of a workflow in an on-demand fashion opens up some of the most exciting
features of dynamic e-business. The Universal Description, Discovery and Integration (UDDI) initiative
Managing Overload Conditions - it is a common scenario when providing workflows to
multiple clients that resources become overloaded. Under these conditions it would be
useful to be able to selectively admit or reject requests from clients based on criteria
that maximise the provider's profits or business objectives.
Performance Prediction - how can we predict the effect on performance of replacing one
Web Service with another one of equivalent functionality, or of dynamically changing the
steps in a workflow?
Performance Improvement - how can an organisation improve the performance of a composite
application through effective use and exploitation of the resources available to it?
Service Level Agreement (SLA) Management - how can we define and manage SLAs for
performance metrics in composite applications?
SLA Strategies - what strategies are there for improving our ability to meet SLA performance
targets in a composite application?
1.2 Hypothesis
This thesis examines the following hypothesis:
There exist solutions and strategies that will allow providers of composite applications built using
Web Services to manage the QoS metrics of that application in such a way that they can ensure they
meet SLA targets containing those metrics.
We make no attempt to determine the best solutions in this thesis as the scope of the work involved
would be too broad for a thesis. Instead we attempt to provide evidence that such solutions exist.
As suggested by the Software Engineering Institute, this is itself extremely valuable. Within the
financial services industry, in which the thesis author has worked since 1997, there is widespread
belief among system management professionals that these are potentially intractable problems.
This thesis aims to demonstrate that solutions do exist.
1.3 Methodology
It is the aim of this research to explore how the scenarios outlined above could be addressed at the
application level through the construction of QoS-aware software components that could be offered
as a generic management service in a typical composite application built using Web Services.
Fundamental to this effort are two major pieces of work: firstly, the creation of a model for
measuring and predicting the essential QoS metrics: response time and throughput, and secondly
the development of a methodology for efficiently solving the optimization problem that results.
We will use a qualitative, empirical approach for the major pieces of software engineering involved,
in which we will attempt to apply candidate solutions to a real insurance application built using Web
Services. In electing to use this application we have chosen to follow an exploratory case-study
methodology. The results of exploratory research such as this are not useful for decision-making by
themselves, but they can provide significant insight into a given situation and this is therefore
considered a good approach to address our hypothesis. We also believe that the composite
application used in the study is very typical of a general class of applications used in the insurance
and financial services industries. This view is based on the thesis author's many years' experience
working as a technical architect in this sector. In selecting the software engineering aspects of this
thesis we are attempting to generate ideas for a design space and evaluate our design choices
through prototyping the proposed design solutions in real use with actual components of the
case-study application.
1.4 Contributions
This thesis makes the following main contributions to the subject:
1. This is the first time that detailed test results have been published to prove that the Mean
Value Analysis (MVA) algorithm can be applied to a queuing network description of a
composite application using Web Services.
2. This thesis is the first published work to use MVA as the fitness function for a Genetic
Algorithm (GA).
3. This thesis is the first published work to apply a GA to dynamic run-time QoS management
of a real insurance application implemented as a set of Web Services across multiple
servers. Previous published work on using GAs to optimize service composition has
restricted itself to numerical simulations.
4. This thesis demonstrates strategies for meeting QoS targets under a number of different
real-life overload conditions.
5. This thesis discusses improvements that could be made to existing SLA design
methodologies and SLA languages to incorporate QoS metrics for composite applications.
1.5 Thesis Roadmap
Chapter 2 provides a background discussion to the issues of QoS in service-oriented architectures as
well as introducing related work in the field to the two main components of the thesis: the model
and the optimization methodology.
In order to derive and use a model for a composite application using Web Services we need to
undertake the following steps:
1. Define the performance requirements of the composite application.
2. Model the business demand of the composite application.
3. Build a performance model by characterising the workload of real systems.
Chapter 3 describes these steps as applied to a real insurance application and shows how a
performance model was developed based on queuing network theory. The report then shows how
the model can be used for adaptive control of a composite application using Web Services by
addressing some simple performance prediction problems.
In Chapter 4, a Genetic Algorithm is introduced for performance management of composite
applications using Web Services. The key aspects of the GA are the chromosome encoding and the
crossover strategy. It is shown how the GA can optimize the overall response times of workflows
using the MVA model as a fitness function to identify whether the workflow suggested by each
chromosome will meet the QoS targets defined.
In Chapter 5 we demonstrate from an architectural perspective how the models and optimization
techniques introduced in this thesis can be applied to workflows for enterprise applications built
using Service-oriented architectures that extend beyond the local enterprise and consume third-
party services in the Cloud.
Finally, in Chapter 6 we review the most recent proposals for SLA management of composite
applications and identify areas where these proposals could be extended to include provision for the
adaptive strategies described in the previous chapters.
Chapter 2
Analysis of the Problem
The Software Engineering Institute reported in their review paper on Service Level Agreements in
Service-Oriented Architectures that one of the most important areas for further research is the need
to understand and determine the QoS of composite services [Bianco et. al. 2008]. The same problems
have been raised with respect to applications built using services in the Cloud by Panzieri et. al.
[2010], who observe that QoS in clouds has not yet been sufficiently investigated, although there is
growing interest in both industry and academia.
In terms of the SOA methodology, composition of services allows the business to realize flexibility,
reusability and adaptability of its software assets. However, the application must still meet such
important QoS attributes as performance. Since the components may be provided by multiple
stakeholders and the configuration could change at run-time these are important additional issues
to consider over a more traditional distributed architecture.
Menasce [2002] first highlighted the need for a QoS definition in Web Services and identified the
need to take into consideration both the needs of the service provider and the service consumer.
QoS requirements for Web Services include the following [Yu et. al., 2007]: Performance, Reliability,
Scalability, Transactions, Capacity, Accuracy and Integrity, Regulatory, Availability, Interoperability
and Security.
Performance:
Service time is the length of time a service takes to provide a response to various
types of requests [Bhoj et al, 2000; Chandrasekaran et al, 2002; Menasce, 2002;
Agarwal et al, 2005].
Response time is the total time required to complete a service request [Mani and
Nagarajan, 2002; Papazoglou and Georgakopoulos, 2003; Looker et al, 2004;
D'Ambrogio, 2006].
Reliability refers to the capability of maintaining the service and service quality [Jin et al,
2002; Silver et al, 2003; Cardoso et al, 2004; Burstein et al, 2005].
Security refers to authentication mechanisms, message encryption and access control,
confidentiality, non-repudiation and resilience to denial-of-service attacks [Sahai et al, 2002;
Ran, 2003; Wang et al, 2004; D'Ambrogio, 2006].
Accessibility refers to the capability of satisfying a web service request [Gu et al, 2002; Mani
and Nagarajan, 2002; Looker et al, 2004; Mathijssen, 2005].
Transactions relates typically to properties such as the transactional durability and
consistency of results [Mani and Nagarajan, 2002; Menasce, 2002; Ran, 2003; Schmit and
Dustdar, 2005].
Capacity is the maximum number of concurrent requests that a server can process while
guaranteeing performance, or the number of concurrent connections that is permitted by the
service [Al-Ali et al, 2002; Ran, 2003; Mathijssen, 2005].
Accuracy and Integrity refers to the maintaining of correct and consistent interaction [Mani
and Nagarajan, 2002; Papazoglou and Georgakopoulos, 2003; Looker et al, 2004].
Regulatory refers to conformance and compliance with rules, laws, standards and
specifications [Mani and Nagarajan, 2002; Ran, 2003; Looker et al, 2004].
Availability is the time, as a percentage, that the composite application is available to service
requests [Hu et.al. 2009].
Interoperability is the ability of the composite application to interoperate with systems in a
way that is agnostic of the platform they run on or the programming language used to write
them.
Many of these are now well addressed, for example, Security through WS-Security [OASIS, 2006a]
and Interoperability [OASIS 2010]. Performance in the context of Composite Web Services remains a
challenge, however [Dyachuk et. al. 2007]. This is reflected in the fact that within the SaaS (Software
as a Service) industry many vendors, e.g. Amazon, only make SLA statements that cover availability
and reliability. SLA assurances about performance metrics such as response times are not widely
available. The thesis author raised this topic on the discussion forum of the SaaS group on the
LinkedIn business networking site. Despite the fact that this group has almost 6000 members
worldwide (as of December 2009), just two SaaS vendors voluntarily offered performance-related
SLA metrics for their services. Of these two companies only one (Intactt) publicly displays those
figures on its website.
Within the financial services industry, the author has noted through her work as a consultant that
many companies recognise this as a problem without any readily available automation solution.
Instead, the state of the art today is to monitor each individual resource in a composite application
for its availability on a large monitor visually inspected by Help Desk staff, whilst performance
metrics of individual resources are only analysed offline on a periodic basis (daily, weekly) from web
logs. There is no published literature on this issue from these companies as the subject is, for obvious
reasons, commercially sensitive.
Where solutions exist to monitor performance metrics and to pro-actively take remedial action,
these are based primarily on the use of redundant virtual machines. Lodi et. al. [2007] is an example
of an approach using large-scale clustering of available Virtual Machines (VMs) and adaptive
load-balancing that has been trialled on J2EE application servers. Two particular problems with this
approach are:
A large number of VMs may give rise to scalability problems in collateral subsystems (e.g. a
shared database may become a bottleneck).
VM allocation time may cause SLA violations.
An alternative solution would be to make better use of the resources that are available. There are
two aspects to this. Firstly, monitoring individual resources to capture live performance metrics
and using a model of the composite application to understand the impact on
the workflows being executed in real time. Secondly, using this data to automatically take remedial
action where SLA targets are in danger of not being met. We attempt to address our hypothesis
with a focus on these two pieces of software engineering.
There have been many published strategies for modelling the performance of QoS of Composite
Web Services, for example, through the use of integer programming [Cardoso, 2002; Zeng, et. al.,
2004; Gao et. al., 2005; Kelly, 2003], as a multiple choice knapsack problem [Yu, et al. 2007],
probability theory [Hwang et. al., 2007], event-driven rule-based programming [Zeng et. al., 2010],
numerical simulation [Silver et. al., 2003], game theory [Esmaeilsabzali et. al., 2005], layered
queuing network models [D'Ambrogio et. al., 2007], fuzzy logic [Lin et. al., 2005; Diao et. al. 2002b,
2003], analytical models using queuing networks and hill-climbing algorithms [Menasce and Bennani
2003], utility functions based on simple queuing networks [Pacifici et. al., 2003], approximate Mean
Value Analysis [Menasce et. al., 2004], exact Mean Value Analysis [Urgaonkar et. al. 2005a], job
scheduling [Urgaonkar and Shenoy 2005], control theory [Abdelzaher et. al. 2001; Lu et. al., 2002;
Diao et. al., 2002a; Lu et. al., 2004; Wand et. al., 2004] and genetic algorithms [Canfora et. al. 2005;
Zomaya and Teh 2001; Page and Naughton 2006; Canfora et. al. 2008]. There is no published work
that attempts to compare and contrast these different approaches.
Chapter 3
Related Work
3.1 Background
Although there has been an enormous amount of published material regarding SOA, the quality of
service aspects have only more recently been addressed [Bichler and Lin, 2006]. Furthermore, the
typical software QoS challenges of any Web Service [Mani and Nagarajan, 2002], namely availability,
integrity, performance (throughput and latency), reliability, interoperability, regulatory factors and
security, are compounded in more advanced SOA by their dynamic nature. Some of these issues are
introduced below.
3.1.1 Discovery and Negotiation
For all but the simplest of agreements, some form of negotiation is required if services are to be
consumed on demand [Stantchev et. al. 2009]. WS-Negotiation was proposed [Hung et. al., 2004]
with the principal goals of describing a negotiation process and publishing an XML negotiation
language through the use of Web Services architecture technologies.
The proposed standard leaves the negotiation decision-making process to some internal algorithm
that could be based on metrics such as price, service level objectives, or business policy. From the
service provider's perspective, a decision must also take into consideration the resources that are
available at the time. Many proposals have recently appeared regarding this complex problem.
Suggested solutions include modelling the problem as a multi-constraint knapsack [Yu and Lin,
2005], as a fuzzy constraint problem [Lin et. al., 2005] or using integer programming models [Gao
et. al., 2005].
3.1.2 Service Level Agreement
The first attempt at describing a machine-readable (i.e. XML) language for specifying service-level
agreements was IBM's Web Service Level Agreement (WSLA) language [Ludwig et. al., 2003]. The
rationale behind the specification [Keller and Ludwig, 2003] was the desire to create a flexible but
formal language that could be applied end-to-end. WSLA is still the most actively cited SLA
specification [Patel et. al. 2009] for researchers in the area of SOA and Cloud Computing.
Another commonly cited specification is a joint proposal put forward via the Global Grid Forum,
entitled WS-Agreement [Andrieux et. al., 2007]. However, WS-Agreement's primary focus is the
management of Grid architectures [Foster et. al., 2004]. It does not, therefore, directly address all of
the needs of SLA definition from a business user's perspective. Recently, modifications to
WS-Agreement were proposed to allow it to better model composite business services [Di
Modica et. al. 2009].
A much simpler approach is described by the Web Services Offering Language (WSOL) [Tosic et. al.,
2003]. The objective of WSOL is to create a series of classes of service in a standard format that
would sit alongside a service's WSDL file. The WSOL descriptions act as advertisements for a
matchmaking engine to examine. The service consumer, via their matchmaking engine, selects the
offering that is most appropriate. The WSOL team suggest that classes of service could differ in
terms of usage privileges, priorities or response times. The value of WSOL is that it vastly simplifies
the negotiation process. Additionally, the authors suggest that the management infrastructure
required to support WSOL is also much simplified [Tosic et. al., 2004].
Many other SLA languages have been suggested [Greiner and Rahm, 2004, Tian et. al., 2004, Sahai et. al., 2002 and Lamanna et.
al., 2003], each with their own relative merits. More recently these initial attempts at SLA definition
have been enhanced to cater for composite applications. Two such examples are MoDe4SLA
[Bodenstaff et. al. 2008] and COSMA [Ludwig et. al. 2008]. These two specifications will be described
in more detail in the following chapter.
3.1.3 Service Provision
Resource provisioning is one of the biggest challenges for a service provider in an on-demand
environment. Amongst the considerations are the identity of the client, the service being requested,
the SLA, business policy, the measurement service, specific service provisioning operations, and the
concurrent activities of requests from other clients [Dan et. al., 2004]. In an on-demand,
e-commerce environment, where requests are stochastic and the physical resources
satisfying those requests behave non-linearly and are not only spread across multiple application
tiers but may also be distributed around the globe, this is potentially a major exercise. Workflow
management must be able to distinguish requests based on performance objectives [Dan et. al.,
2003]. Service provisioning is at the heart of autonomic computing [IBM, 2003] and much of the
recent research into these two fields is related. A primer on control theoretic techniques for
resource management can be found in Diao et. al. [2004]. A detailed overview of the work that has been
conducted in this area is provided in section 2.2.
3.1.4 Monitoring
The monitoring of QoS metrics must consider what to measure, how to measure, who does it
(service provider, consumer etc.), and where the measurements are taken [Menasce, 2004]. The task
of monitoring is implicit to the task of provisioning and it is assumed that a provider will need to
have in place mechanisms for efficiently collecting and storing resource metrics as a basis for any
adaptive provisioning. However, it is also in the consumer's interest to undertake monitoring
activities: the question of trust is one issue, but perhaps more importantly, the consumer might also
be acting as a composite service provider to someone else. In this scenario, the service consumer
must not only monitor the quality of the third-party services he/she consumes but must also monitor
the quality of his/her own offering. Monitoring could not only become a fairly complex activity; it
could in itself become a resource-intensive activity. To this end, it has been proposed that a new
breed of service provider could become a reality: one that exists to provide monitoring services
independently of both service provider and consumer, easing trust issues and limiting performance
penalties [Benjamin et. al., 2004].
An example of how monitoring and provisioning can be used to solve dynamic resource allocation
problems is the WebQ framework developed by Patel et. al. [2004]. WebQ dynamically monitors
QoS parameters from more than one provider of a given service. To begin with, the framework
distributes requests equally across all of the providers. As the metrics database grows, the
framework dynamically shifts load to the better performing services using a weighted algorithm.
Since it continues to send some of its load to the slower services (these could, in fact, be test
messages), the framework can re-adapt itself should the performance of a slower service improve
again. The authors have used multi-level rule modelling in OWL-S to create a flexible framework that
can manage complex QoS requirements involving large numbers of parameters.
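The load-shifting behaviour described above can be illustrated with a minimal sketch. It is not the WebQ implementation: the class name, the inverse-mean-response-time weighting and the `floor` share reserved for slower providers are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


class WeightedProviderSelector:
    """Illustrative only: shifts load towards faster providers.

    The inverse-mean-response-time weighting and the `floor` parameter are
    assumptions for this sketch, not details taken from WebQ.
    """

    def __init__(self, providers, floor=0.05, seed=None):
        self.providers = list(providers)
        self.floor = floor                  # approximate minimum share per provider
        self.samples = defaultdict(list)    # provider -> observed response times
        self.rng = random.Random(seed)

    def record(self, provider, response_time):
        """Store one observed response time (assumed to be positive)."""
        self.samples[provider].append(response_time)

    def weights(self):
        # Until every provider has some history, distribute requests equally.
        if any(not self.samples[p] for p in self.providers):
            return {p: 1.0 / len(self.providers) for p in self.providers}
        # Speed is the reciprocal of the mean observed response time.
        speed = {p: len(self.samples[p]) / sum(self.samples[p])
                 for p in self.providers}
        total = sum(speed.values())
        # Slow providers keep a small share so that recovery can be detected.
        raw = {p: max(speed[p] / total, self.floor) for p in self.providers}
        norm = sum(raw.values())
        return {p: w / norm for p, w in raw.items()}

    def choose(self):
        w = self.weights()
        return self.rng.choices(self.providers,
                                weights=[w[p] for p in self.providers])[0]
```

Keeping a non-zero share for every provider is what allows such a scheme to notice when a previously slow service has improved.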
3.2 Adaptive Control of Web Applications and Services
Research into adaptive management has concentrated on two main modelling techniques: those
using queuing theory [Kleinrock, 1976] and those using feedback control theory [Franklin et. al.,
2002]. Combining the two, it is also possible to use queuing theory to derive the system model for a
control theoretic approach.
Drawing on research into decision theory in artificial intelligence [Russell and Norvig, 2003],
optimization problems can be approached using utility functions to mathematically model
preference. Given a certain event, the action to take is determined by ranking all possible actions
and choosing the action with the best expected outcome. An alternative is the use of genetic
algorithms (GAs) [Holland, 1992]. GAs are adaptive techniques for search and optimization problems
that were inspired by some of the processes involved in natural evolution and specifically the notion
of survival of the fittest.
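To make the GA idea concrete, the following is a minimal sketch of a genetic search that picks one candidate service for each step of a workflow. It is not the algorithm developed in this thesis: the function name, the tournament selection, the single-point crossover, the parameter defaults and the simple additive cost used in the usage example are all assumptions for illustration.

```python
import random


def genetic_search(candidates, fitness, generations=40, pop_size=20,
                   mutation_rate=0.1, seed=0):
    """Minimise `fitness` over chromosomes that pick one option per step.

    candidates: candidates[i] is the number of alternative services
                available for workflow step i (hypothetical encoding).
    fitness:    function mapping a chromosome (list of ints) to a cost.
    """
    rng = random.Random(seed)

    def random_chromosome():
        return [rng.randrange(n) for n in candidates]

    def tournament(population):
        # Survival of the fittest: the cheaper of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    population = [random_chromosome() for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best chromosome forward
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            cut = rng.randrange(1, len(candidates)) if len(candidates) > 1 else 0
            child = p1[:cut] + p2[cut:]            # single-point crossover
            for i, n in enumerate(candidates):     # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n)
            children.append(child)
        population = children
        best = min(population, key=fitness)
    return best


# Hypothetical mean response times (seconds) of the alternatives for each step.
times = [[0.30, 0.12, 0.25], [0.08, 0.20], [0.40, 0.35, 0.10, 0.22]]


def cost(chromosome):
    return sum(times[i][g] for i, g in enumerate(chromosome))


best = genetic_search([len(t) for t in times], cost)
```

Because of elitism, the cost of the best chromosome never increases from one generation to the next; any fitness function with the same signature could be substituted for the additive cost used here.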
The goals of published work on adaptive management fall into one of four categories:
Dynamically reconfigure available resources to optimise the throughput or response time of the
current workload. This is a common goal of queuing-related techniques and utility-driven
approaches.
Admission control through the rejection of excess requests. Ideally the controller should
also attempt to only service the important ones, rather than rejecting requests at random.
Both queuing and control theoretic techniques have used this approach.
Dynamically provision idle resources, if these are available. Queuing models have been used
to predict when to do this, based on a demand threshold being exceeded.
Degrade the performance of admitted requests, possibly paying penalties to the client.
Control theoretic techniques have been applied to providing relative guarantees between
service classes, rather than absolute guarantees.
3.3 Queuing Theory
3.3.1 Dynamic Resource Configuration
An example of the use of queuing theory for adaptive resource configuration is Welsh's SEDA
architecture [Welsh et. al., 2001]. SEDA applications consist of event-driven stages connected by
queues. Dynamic resource controllers keep stages within their operating regimes during load
changes via thread pool management. A stage is a self-contained component consisting of an event
handler, a simple, incoming event queue and a thread pool. The handler processes events and
dispatches them onto successive stages. Multiple stages can share the same thread pool; hence the
architecture can dynamically adjust the pool based on the load at each stage. SEDA can be found at
the heart of the Mule [Mule, 2010] open source message bus. Dynamic thread adaptation was
shown to be effective in dealing with bursty Internet traffic but had limitations when it came to
dealing with overload conditions.
Menasce and Bennani [2003] used an analytical performance model to design controllers that run
periodically to determine the best current resource configuration of a web server given its current
workload. The authors used a QoS controller that monitors system performance, including the
utilisation of server resources, and periodically executes an algorithm to determine the
appropriate reconfiguration commands. Data is collected from metrics such as CPU utilisation, which
allows the current service demand to be calculated as the ratio of the resource utilisation to the
system throughput. Mean Value Analysis, MVA, [Lazowska et. al., 1984] is used to create a model in
which average response time, the probability of rejection, and average throughput can be predicted.
In this particular paper, the network model is extremely simple, assuming only that an incoming
request is serviced by one of m threads. When all m threads are busy, the request is rejected. A hill-
climbing search algorithm is used to find a close-to-optimal configuration by constantly re-applying
the algorithm to all possible configurations.
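The service demand calculation mentioned above corresponds to the Service Demand Law of operational analysis, D = U / X, where U is the utilisation of a resource and X is the system throughput. A minimal sketch follows; the resource names and measurement values are hypothetical.

```python
def service_demand(utilisation, throughput):
    """Service Demand Law: D = U / X (seconds of resource time per request)."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return utilisation / throughput


# Hypothetical monitoring sample: utilisation per resource, and system
# throughput in requests per second over the same measurement interval.
utilisations = {"cpu": 0.60, "disk": 0.30}
throughput = 20.0

demands = {name: service_demand(u, throughput)
           for name, u in utilisations.items()}

# The resource with the largest demand is the bottleneck; it bounds the
# achievable throughput at 1 / max(D).
max_throughput = 1.0 / max(demands.values())
```

Demands computed in this way are the inputs that an MVA model needs for each resource.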
Pacifici et. al. [2003] extended this work to clusters of servers, supporting multiple classes of web traffic.
The content of the inbound request's SOAP header is examined to determine its class of service and
the server farm is partitioned into clusters, each one managing different classes of traffic. A utility
function is defined for each class of traffic, which is simply a construct to weight the deviation of the
actual response times from the desired response time. A combined utility function is also derived to
calculate how to allocate tasks across all of the resources in the farm. In this work, a traffic class set
is created for each tuple . In Kendall notation an M/M/1
queuing model is used to predict the average response time. The dynamic model is used to allocate
resources and dynamically load-balance work across the available resources.
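As a small illustration of the two ingredients described above, the sketch below computes the mean response time of an M/M/1 queue, R = 1 / (mu - lambda), and a per-class utility that weights the deviation of the actual response time from the target. The form of the utility function is an assumption made for this example and is not taken from Pacifici et. al.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean response time of an M/M/1 queue: R = 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def class_utility(actual, target, weight=1.0):
    """Assumed form: weighted relative deviation from the target response time.

    Positive when the class is faster than its target, negative when slower.
    """
    return weight * (target - actual) / target


# Hypothetical traffic class: 8 requests/s offered to a server that can
# complete 10 requests/s, with a 1 second response time target.
predicted = mm1_response_time(arrival_rate=8.0, service_rate=10.0)
utility = class_utility(predicted, target=1.0, weight=2.0)
```

An allocator could compare such utilities across traffic classes when deciding where to direct resources.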
3.3.2 Admission Control
Menasce et. al. [2004a] used an analytical model to make real-time admission control decisions.
Every time a new request is received the performance network model, using approximate MVA,
solves a closed loop queuing network. An algorithm then determines whether the request can be
serviced or not based on the current commitments and the possible solutions suggested by the
model. Each client session is modelled as an individual class.
Urgaonkar and Shenoy [2005] discuss a policy mechanism that emphasises the need to ensure that
the policy mechanism itself does not create a significant performance overhead. Requests are
mapped to a service class and then scheduled either FIFO (first-in-first-out) or shortest job first.
Requests of lower class are deliberately delayed. This prevents them from denying access to more
important requests. Requests of a higher class are subject to the admission control tests first. If the
highest class fails, there is obviously no need to test lower classes. Requests are admitted so long as
the system believes it has sufficient capacity to meet the SLA. Furthermore, batching requests
reduces policing overhead. Buckets are defined in each class, with a range of service times. All
requests in a bucket are then treated as equal. When admission control is invoked it considers each
non-empty bucket in the class it is testing and conducts an all-or-nothing test on those requests. A
predictive technique is also used to further reduce overhead. The number of requests to admit can
be pre-computed given a good estimate of how many requests will arrive in the next time
interval.
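The class-ordered, bucket-based test described above might be sketched as follows. This is a simplified illustration rather than the authors' algorithm: the data layout, the use of each bucket's upper service-time bound as its cost, and the single capacity figure are assumptions.

```python
def admit(classes, capacity):
    """Class-ordered, all-or-nothing admission test (illustrative only).

    classes:  list ordered from highest to lowest priority; each class is a
              list of buckets, and each bucket is a tuple
              (service_time_upper_bound, number_of_requests).
    capacity: seconds of service time available in the next interval.
    Returns the number of admitted requests per bucket, in the same shape.
    """
    admitted = []
    higher_class_failed = False
    for buckets in classes:
        decisions = []
        failed_here = False
        for upper_bound, count in buckets:
            cost = upper_bound * count  # pessimistic: charge the upper bound
            if higher_class_failed or count == 0:
                decisions.append(0)     # not tested at all
            elif cost <= capacity:
                capacity -= cost
                decisions.append(count)  # the whole bucket is admitted
            else:
                failed_here = True
                decisions.append(0)      # the whole bucket is rejected
        # Once a class fails its test, lower classes are not tested.
        higher_class_failed = higher_class_failed or failed_here
        admitted.append(decisions)
    return admitted


# Hypothetical workload: a gold class with two buckets, then a silver class.
workload = [[(0.1, 10), (0.5, 4)], [(0.2, 5)]]
tight = admit(workload, capacity=2.0)
roomy = admit(workload, capacity=4.0)
```

Treating a bucket as a unit means one comparison per bucket rather than one per request, which is the source of the reduced policing overhead.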
3.3.3 Dynamic Provisioning of Idle Resources
Urgaonkar and Shenoy [2005] also use a G/G/1 queuing model in conjunction with online
measurements to determine the need to replicate applications across idle, virtual servers if the
number of requests gets so high that a threshold is breached. This threshold is simply the known
bound on the job arrival rate of a G/G/1 queue [Kleinrock, 1976].
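The text does not state the bound itself. As a sketch, assume it is derived from the standard Kingman upper bound on the mean waiting time of a G/G/1 queue, W <= lambda (var_a + var_s) / (2 (1 - rho)), where var_a and var_s are the variances of the inter-arrival and service times and rho = lambda s for mean service time s. Requiring s + W to stay at or below a target response time d and solving for lambda gives lambda <= 1 / (s + (var_a + var_s) / (2 (d - s))). The function name and the numbers below are hypothetical.

```python
def max_arrival_rate(mean_service, var_interarrival, var_service, target_response):
    """Largest arrival rate for which the Kingman upper bound on the mean
    G/G/1 response time stays at or below `target_response`.

    The variances are treated as measured inputs (an assumption of this sketch).
    """
    if target_response <= mean_service:
        raise ValueError("target must exceed the mean service time")
    slack = target_response - mean_service
    return 1.0 / (mean_service + (var_interarrival + var_service) / (2.0 * slack))


# Hypothetical online measurements for one server (times in seconds).
limit = max_arrival_rate(mean_service=0.1, var_interarrival=0.01,
                         var_service=0.01, target_response=0.5)
```

Under these assumptions, an observed arrival rate above `limit` would indicate that the server can no longer be expected to meet its response time target on its own.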
3.3.4 Extending to Multiple Tiers
Urgaonkar et. al. [2005a] have specifically considered the problems associated with tackling
bottlenecks in a multi-tier distributed application. They show that independent per-tier provisioning
is not sufficient as it can fail to capture the way in which bottlenecks can shift across tiers. In a
related paper, Urgaonkar et. al. [2005b] present a multi-tier model based on MVA. The model
deals with scenarios such as a single request in the web tier spawning multiple tasks on the
application tier through the use of closed loops creating multiple visits to each resource. They also
deal with long-lived sessions using an infinite queue at the front of the model, which also serves as
the re-entry point for requests that have been completed, thus forming a completely closed loop.
This models think time at the client. The model uses an exact MVA algorithm. They suggest it can
be extended in several ways: to deal with scenarios where service times increase with load, where
resources are replicated on the same tier (load-balancing), for overload conditions at a given tier
causing dropped requests and for multiple session classes, but provide no specific details. Liu et. al.
[2005] developed an approximate MVA model for a three-tiered architecture that uses a multi-
station queuing centre to model the ability of web servers to multi-thread incoming requests.
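For reference, the textbook exact MVA recursion for a single-class closed network with client think time can be sketched as below. This is the standard algorithm rather than the specific multi-tier model of Urgaonkar et. al.; the per-tier demands and session counts in the usage lines are hypothetical.

```python
def exact_mva(demands, n_customers, think_time=0.0):
    """Exact single-class MVA for a closed network of queueing stations.

    demands:     service demand (seconds per request) at each station.
    n_customers: number of concurrent sessions circulating in the network.
    think_time:  mean client think time, modelled as a pure delay.
    Returns (throughput, response_time, queue_lengths).
    """
    queue = [0.0] * len(demands)
    throughput, response = 0.0, 0.0
    for n in range(1, n_customers + 1):
        # An arriving request sees the queue left by a network of n - 1 customers.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)   # Little's Law on the whole loop
        queue = [throughput * r for r in residence]
    return throughput, response, queue


# Hypothetical three-tier application: web, application and database demands.
x, r, q = exact_mva([0.02, 0.05, 0.03], n_customers=50, think_time=1.0)
```

Each iteration adds one session to the network, so the cost of the recursion grows linearly with the number of sessions and the number of stations.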
3.4 Control Theory
3.4.1 Admission Control
The first published work on the use of control theory for admission control appears to be
Abdelzaher et. al. [2001]. In this paper, the authors attempted to keep the utilisation of
a web server at a fixed percentage at which the web server was known to achieve optimum
performance. A simple linear expression was derived relating the utilisation to the number of
admitted job requests and the bandwidth of pages being served. A PI (proportional-integral)
controller was used. Also using
a PI controller, Kihl et. al. [2003] used an M/G/1 queue for their model, with a non-linear
approximation for the utilisation of the server expressed in terms of the number of requests in the
system and the service time distribution.
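The following sketch shows how a discrete PI controller can hold utilisation at a set point by adjusting the fraction of requests admitted. It does not reproduce either paper: the toy plant (utilisation equal to offered load multiplied by admitted fraction), the gains and the set point are assumptions chosen for illustration.

```python
class PIController:
    """Discrete PI controller in incremental (velocity) form."""

    def __init__(self, setpoint, kp, ki, output=1.0, lo=0.0, hi=1.0):
        self.setpoint, self.kp, self.ki = setpoint, kp, ki
        self.output, self.lo, self.hi = output, lo, hi
        self.prev_error = 0.0

    def update(self, measurement):
        error = self.setpoint - measurement
        # Proportional term acts on the change in error, integral term on the error.
        delta = self.kp * (error - self.prev_error) + self.ki * error
        self.output = min(self.hi, max(self.lo, self.output + delta))
        self.prev_error = error
        return self.output


def simulate(offered_load=0.95, setpoint=0.70, steps=60):
    """Toy plant (an assumption): utilisation = offered_load * admitted fraction."""
    controller = PIController(setpoint, kp=0.1, ki=0.3)
    admit_fraction = controller.output
    utilisation = offered_load * admit_fraction
    for _ in range(steps):
        admit_fraction = controller.update(utilisation)
        utilisation = offered_load * admit_fraction
    return admit_fraction, utilisation


admit_fraction, utilisation = simulate()
```

The integral action is what removes the steady-state error: the admitted fraction keeps changing until the measured utilisation matches the set point.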
Lu et. al. [2002] presented an approach using two SISO (single-input single-output) controllers. The
controlled variables were the deadline miss ratio and the CPU utilisation. The adaptive system is
characterised in terms of the following performance metrics: stability (the miss ratio and utilisation
are bounded at all times), transient state response (overshoot and settling time), steady-state error
and sensitivity to workload variations.
As an alternative to using multiple SISO controllers, Diao et. al. [2002a] constructed a true MIMO
(multiple-input multiple-output) controller. They controlled the Keep Alive and Max Clients
parameters of an Apache Web Server in order to optimise its CPU and memory utilisation. They
conclude that MIMO design techniques such as the Linear Quadratic Regulator, LQR [Franklin et. al.,
2002], are beneficial for balancing design trade-offs.
3.4.2 Degraded Service
An alternative approach is to degrade the service levels of admitted requests. Instead of offering
customers absolute delay guarantees, Lu et. al. [2001] describe an approach that offers a
differentiated service. Only the relative delay between two service classes is guaranteed, e.g. the
ratio of gold response time to silver response time. They point out that under conditions of heavy
load many of the reported approaches will only ensure better service to premium customers, but do
not provide any guarantees as to how much better the service will be. Their proportional delay
model specifies a fixed ratio between the delays seen by each service class. They also introduce a
hybrid policy, one that uses proportional delay in normal operating conditions and switches to
absolute delay under very heavy load. This is because extreme load could lead to very long response
times even for the premium customers if the target is simply to maintain a fixed proportional delay.
They show how the relative delays can be used as the control variable for a proportional feedback
controller.
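As an illustration of how a relative delay can drive a proportional feedback loop, the sketch below performs one control step. It assumes that the actuator is the share of server capacity given to the premium (gold) class; that actuator, the gain and the clipping limits are hypothetical and are not taken from Lu et. al.

```python
def adjust_premium_share(share, gold_delay, silver_delay, target_ratio,
                         gain=0.1, min_share=0.05, max_share=0.95):
    """One step of a proportional controller on the relative delay.

    share: fraction of server capacity currently given to the gold class.
    target_ratio: desired gold_delay / silver_delay, e.g. 0.5.
    """
    measured_ratio = gold_delay / silver_delay
    # A positive error means gold requests are slower than promised
    # relative to silver requests, so gold needs more capacity.
    error = measured_ratio - target_ratio
    new_share = share + gain * error
    # Never starve either class completely.
    return min(max_share, max(min_share, new_share))
```

For example, with a target ratio of 0.5, a measured gold delay of 2.0 s against a silver delay of 2.5 s gives a ratio of 0.8, so the gold share is increased. The hybrid policy described above would replace this step with an absolute delay target under very heavy load.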
3.4.3 Extending to Multiple Tiers
Lu et. al. [2004] extended control theoretic techniques to multi-tier distributed systems. Their paper
presents the EUCON (End-to-end Utilization CONtrol) algorithm, which adaptively manages CPU
utilisation using feedback control and a MIMO model predictive controller. They point out that most
papers on feedback control methods assume a single CPU operating on a single task while most
applications consist of tasks spawning multiple other tasks and are deployed on multi-CPU
platforms. The performance of one task is coupled to the performance of other tasks. Changing the
rate of one task affects the utilisation of dependent tasks on the processors that they are using. This
paper derived a dynamic model to capture coupling amongst processors, developed a model
predictive controller approach for QoS control and designed a distributed MIMO feedback control
loop. When the number of servers is large, the overhead of a centralised controller could become
significant. For this reason, an enhanced version, DEUCON, was presented in [Wand et. al., 2004].
This is the distributed controller version of EUCON. A peer-to-peer control structure and localised
utilisation control algorithm are used based on distributed model predictive controller theory where
a controller for each CPU cooperates only with local neighbours, i.e. only those that are executing
sub-tasks. Simulation results show that the overhead compared to a centralized solution is much
lower.
3.4.4 Fuzzy Controllers
One of the major limitations of control theoretic approaches is the need to derive a suitable model
of the system. Diao et. al. [2002b, 2003] demonstrated that fuzzy controllers offer significant
advantages. They defined a set of simple business related metrics to describe revenue, cost
(penalty), and profit and then adapted their MIMO controller [2002a] to use a set of fuzzy rules,
such as:
IF change_in_MaxUsers IS neglarge AND change_in_profit IS neglarge THEN
next_change_in_MaxUsers IS poslarge.
They show that a PI controller achieves better results in the region of the workload for which it was
designed, but for all other workloads, the fuzzy controller outperforms it. It is frequently the case, in
any engineering discipline, that the derivation of a suitable model [Franklin et. al., 2002] can be a
challenging task. In the case of complex distributed IT systems, the results presented demonstrate
that it is particularly challenging. The conclusion is that the less rigorous demands of deriving a fuzzy
model mean that this could be an extremely valuable approach.
3.5 Combined Approaches
Although control theory has been successfully used to provide improvements in the throughput and
response times of web applications, the technique is limited due to the highly non-linear behaviour
of such systems. Queuing models, on the other hand, are very good at modelling these systems due to
their statistical approach. Liu et. al. [2006] applied a simple queuing model to an adaptive control
algorithm [Astrom and Wittenmark, 1994] to demonstrate its applicability as an admission control in
an overloaded web site. They compare their technique with three other approaches:
A queuing model only
Adaptive control only
A queuing model with a PI controller (an approach proposed by Kamra et. al. [2004]).
They show how their approach provided the smallest difference between the target response time
and the actual response time. Their controller did not exploit their previous work using MVA [Liu et.
al., 2005] and they express their intent to extend it with this in mind.
3.6 Solving Optimization Problems
3.6.1 Utility Functions
Utility functions have been commonly used in artificial intelligence (AI) as a means of expressing
preference. Recent research has begun to explore their application to self-optimisation problems in
autonomic computing [Walsh et. al., 2004]. Utility is the measure of the desirability of an outcome.
It is usually measured in terms of the cost, benefit or risk of an action. A utility function assigns a
cardinal number to the desirability of an outcome and can depend on one or more dimensions.
These could be related to business level objectives as well as service level objectives. Expected utility
is the combined utility of combinations of actions. By defining the optimal decision to be when the
maximum expected utility is achieved, the regret of a decision can be defined as the difference
between the maximum expected utility and the actual expected utility. A common AI algorithm
known as minimax attempts to minimise the maximum possible regret [Wang and Boutilier, 2003].
One of the key advantages of this approach is that decisions can be taken in the absence of a
complete description of the constraints that define a utility function [Boutilier et. al., 2004]. The
methodology has been demonstrated in an autonomic, self-optimising application architecture at
IBM [Tesauro, et. al., 2004].
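The minimax regret idea can be sketched in a few lines. The example below uses the common scenario-based formulation, in which the regret of an action in a given scenario is the gap between the best utility achievable in that scenario and the utility the action actually delivers. The action names, scenario names and utility values are invented for illustration only.

```python
def minimax_regret(utilities):
    """utilities[action][scenario] -> utility.

    Returns the action whose worst-case regret is smallest, together
    with that worst-case regret.
    """
    scenarios = list(next(iter(utilities.values())))
    # Best achievable utility in each scenario, over all actions.
    best_per_scenario = {
        s: max(u[s] for u in utilities.values()) for s in scenarios
    }
    # Worst-case regret of each action across the scenarios.
    worst_regret = {
        action: max(best_per_scenario[s] - u[s] for s in scenarios)
        for action, u in utilities.items()
    }
    best_action = min(worst_regret, key=worst_regret.get)
    return best_action, worst_regret[best_action]


# Hypothetical utilities of three management actions under two workloads.
example = {
    "add_server": {"low_load": 2, "high_load": 9},
    "do_nothing": {"low_load": 5, "high_load": 1},
    "throttle":   {"low_load": 4, "high_load": 7},
}
```

Here `minimax_regret(example)` selects `"throttle"`: it is never the best action in either scenario, but its worst-case regret is the smallest of the three.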
3.6.2 Integer Linear Programming
Several papers have been published that propose the use of Integer Linear Programming (IP)
methods for web service composition and resource allocation problems [see for example Gao et. al.,
2005 and Kelly, 2003]. In terms of the current discussion, the most relevant is the work of Zeng et. al.
[2004] who have applied IP to the problem of finding an optimal execution plan for a sequence of
tasks in a composite web service. IP problems are a form of linear programming where the variables
are integers (usually 0 and 1). In this case, the variables are 1 if a service x can execute task y and 0
otherwise. The objective function is a linear weighted calculation of the QoS using parameters such
as price, availability, service time etc. IP attempts to maximize or minimize the value of the objective
function by adjusting the values of the variables while enforcing any known constraints. The output
of an IP problem is the maximum (or minimum) value of the objective function and the values of
variables at this maximum (minimum). As the authors discuss, though, IP has a large computation
cost, especially as the number of services and tasks increases, because IP problems are generally
NP-hard.
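The structure of the selection problem can be illustrated without an IP solver. The sketch below simply enumerates every 0/1 assignment of one service per task and keeps the best feasible plan, so it is exhaustive search rather than integer programming, and it makes the exponential growth of the search space obvious. The weights, the price budget and the use of summed log-availability to keep the objective linear are assumptions made for this example.

```python
import math
from itertools import product


def best_plan(candidates, weights, max_price):
    """candidates: {task: {service: {"price", "time", "availability"}}}.

    Returns (plan, score), where plan maps each task to one service.
    """
    tasks = list(candidates)
    best, best_score = None, float("-inf")
    # Each tuple in the product picks exactly one service per task.
    for choice in product(*(candidates[t] for t in tasks)):
        plan = dict(zip(tasks, choice))
        qos = [candidates[t][s] for t, s in plan.items()]
        price = sum(q["price"] for q in qos)
        if price > max_price:  # constraint: stay within the budget
            continue
        time = sum(q["time"] for q in qos)
        log_avail = sum(math.log(q["availability"]) for q in qos)
        # Linear weighted objective: reward availability, penalise price and time.
        score = (weights["availability"] * log_avail
                 - weights["price"] * price
                 - weights["time"] * time)
        if score > best_score:
            best, best_score = plan, score
    return best, best_score
```

With m candidate services for each of n tasks the loop visits m^n plans, which is why a dedicated solver or a heuristic search becomes necessary as the problem grows.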
3.6.3 Genetic Algorithms
Whilst GAs have been applied to multiple and diverse applications [Goldberg, 1989], their use in
software systems optimization problems is currently quite limited, focussing primarily on the rather
different problem of Job Shop Scheduling [see for example Mahmood, 2000, Fayad and Petrovic,
2005, Petrovic and Fayad, 2005, Montana et. al., 1998, Wang et. al. 1997]. An exception is the work
of Canfora et. al. [2005] who attempt to solve an optimization problem for a set of Web Services
comprising a complex workflow using a GA. To evaluate the fitness of their solution, they use a
weighted combination of parameters including Availability, Response Time, Cost and Reliability,
although there is no discussion of how these parameters might be evaluated in real-time, based on
real measurements. Using a numerical simulation, they do, however, provide an interesting
comparison with the Integer Programming method of Zeng et. al. described above, and demonstrate
that the GA provides a faster solution as the number of Web Services increases.
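A GA for service selection can be sketched as follows. Each genome holds, for every task in the workflow, the index of the candidate service chosen for it. The fitness weights, one-point crossover, elitism and parameter values are assumptions made for illustration and are not the configuration used by Canfora et. al.

```python
import random


def fitness(genome, services, weights):
    """Higher is better: reward availability, penalise time and cost."""
    total = 0.0
    for task, idx in enumerate(genome):
        s = services[task][idx]
        total += (weights["availability"] * s["availability"]
                  - weights["time"] * s["time"]
                  - weights["cost"] * s["cost"])
    return total


def evolve(services, weights, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """services[task] is a list of candidate QoS dicts for that task."""
    rng = random.Random(seed)
    n = len(services)

    def random_genome():
        return [rng.randrange(len(services[t])) for t in range(n)]

    def score(genome):
        return fitness(genome, services, weights)

    population = [random_genome() for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        history.append(score(population[0]))
        next_gen = [population[0][:]]        # elitism: keep the best genome
        parents = population[: pop_size // 2]
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]        # one-point crossover
            for t in range(n):               # per-gene mutation
                if rng.random() < mutation_rate:
                    child[t] = rng.randrange(len(services[t]))
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    history.append(score(best))
    return best, history
```

Because the best genome is always carried forward, the recorded best fitness never decreases from one generation to the next. In a live system the QoS values would have to come from measured data, which is the gap noted above.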
In a related paper [Canfora et. al., 2004] the authors take a step to considering the dynamic use of
GAs for service composition by discussing the question of service re-planning and propose adding a
trigger to the workflow engine to re-evaluate the optimum service composition.
A similar piece of work was presented by Jaeger and Mühl [2007] who also use numerical simulation
to describe the effectiveness of GAs in this problem domain. They provide detailed results
comparing the impact of different parameters (e.g. mutation rate, fitness function) on the
optimisation capability of the genetic algorithm.
From the world of task scheduling two papers provide useful background into the problem of using a
GA in a dynamic scenario. Zomaya and Teh [2001] used a cycle crossover to load balance a discrete
set of tasks over a set of resources, whilst Page and Naughton [2006] added a heuristic on the
mutation operator to improve the performance.
3.7 Concluding Remarks
In conclusion, there is a large body of previous work to draw inspiration from. Approaches using
control theory appear to be difficult to apply to distributed systems, as is evidenced by the
complexity of the DEUCON model [Wand et. al., 2004]. Statistical approaches using queuing theory
have proven successful in related disciplines such as network performance analysis and
telecommunications queuing. For this reason, the queuing model approach is used in this thesis.
In terms of choosing an optimization strategy, the discussion has focussed on general search and
optimization techniques rather than heuristic methods confining themselves to a narrow domain.
The results of Canfora et. al. [2005] suggest that GAs offer a promising candidate, particularly
compared with Integer Programming. GAs are traditionally strong in problem spaces where heuristic
approaches are too complex to be practical. Weise et al [2007], in a review of web service
composition challenges, conclude that "especially in practical applications, additional requirements
will be imposed onto a service composition engine. Such requirements could include quality of
service (QoS) ... or the generation of complete BPEL processes ... In this case, heuristic search will
most probably become insufficient but genetic algorithms and genetic programming will still be able
to deliver good results".
3.8 Publications
Parts of this chapter were published in the Software Quality Journal:
Shelly Saunders, Margaret Ross, Geoff Staples, and Sean Wellington, 2006. The Software Quality
Challenges of Service Oriented Architectures in e-Commerce, Software Quality Journal 14 (1),
65-76, March 2006.
This article has been cited at least 11 times by the start of 2011.
It was also presented at the SQM 2005 conference as:
Shelly Saunders, Margaret Ross, Geoff Staples, and Sean Wellington, 2005. The Software Quality
Challenges of Service Oriented Architectures in e-Commerce, In: Current Issues in Software
Quality, Thirteenth International Conference on Software Quality Management (SQM 2005),
pp87-100.
Figure 4.1 Service Level Management (diagram elements: Service Level Agreement, Service Level Monitoring, Application (KQI), Service Performance Indicators (KPI), Monitoring Instrumentation)
Key Quality Indicators (KQI) of the application are derived from the performance metrics of the
underlying composite services. These performance metrics are known as Key Performance
Indicators (KPI). For each service these will be obtained from monitoring instrumentation which is
the core of the whole process.
4.1.1 Service Monitoring
Many Cloud vendors who offer Web Services that can be composed into composite enterprise
applications are only just beginning to provide actual data about the performance of their services.
Even where this data is provided, the question of trust will always be an issue. Furthermore, the
complete round trip time of a particular service from a particular vendor is also dependent on
multiple elements lying between the composite application and the service itself, including ISPs,
hardware, communication links etc. For these reasons, we propose that even if all services provide
their own performance metrics, service monitoring is also performed centrally. In our work we have
used the Enterprise Service Bus to achieve this. Apart from raw data obtained from live monitoring
of data, we can also use information from service registries, test data, SLA statements on contracted
QoS values and feedback from other service consumers. However, the most weight should be put on
the service execution history data as the most reliable source of information. This process has been
termed service profiling [Abramowicz et. al., 2006].
4.1.2 Key Quality Indicators and Key Performance Indicators
Whilst there are many QoS metrics applicable to services [Kritikos and Plexousakis, 2009] the Key
Quality Indicator of interest to this thesis is primarily the end-to-end time required to execute a
particular workflow. This KQI can be mapped directly into the SLA. The KQI is derived from KPIs of
each service used by that workflow. The KPIs we are interested in are the execution times of each
service call. For differentiated services based on priority sessions, we would also be interested in the
cost of each service as another KPI. Further, from a business intelligence perspective, understanding
the costs involved in operating a composite service is also very desirable.
Once KPIs are defined, we can generate our end-to-end workflow execution time KQIs from the KPIs
for example using the techniques described by Menascé [2004]. In the example of Figure 4.2, Service
A invokes B with probability p1 and it invokes C with probability p2 = 1 - p1.
Likewise C invokes D with probability p3 and it invokes E with probability p4 = 1 - p3. Finally F is
invoked when either D or E finish, or when B finishes. In this example the total execution time, T, is
given by:
T = tA + p1 tB + p2 (tC + p3 tD + p4 tE) + tF
where tA to tF are the execution times of services A to F and each p is the probability of that execution path being chosen.
Figure 4.2 Composite Web Services
Likewise, the total cost will have exactly the same form. The KQI is based on the value of T that the
application is expected to meet. Likewise KPIs are based on values of tn that each service is expected
to be able to meet. Since the KQI is a composite measure each individual KPI can be defined with a
certain degree of tolerance. An individual service could exceed its KPI whilst the overall application
execution time remains within its SLA targets. This allows us to add flexibility to the KPIs by adding
performance thresholds. There could be a warning threshold as well as an error threshold.
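The calculation of the KQI from the KPIs, and the use of warning and error thresholds, can be sketched as follows for the workflow of Figure 4.2. The function names, the threshold semantics (a measured value at or above a threshold triggers it) and the example timings are assumptions for illustration.

```python
def expected_time(t, p1, p3):
    """Expected end-to-end time T for the workflow of Figure 4.2.

    t maps each service name ("A" to "F") to its execution time.
    """
    p2 = 1.0 - p1  # probability that A invokes C rather than B
    p4 = 1.0 - p3  # probability that C invokes E rather than D
    return (t["A"] + p1 * t["B"]
            + p2 * (t["C"] + p3 * t["D"] + p4 * t["E"])
            + t["F"])


def kpi_status(measured, warning, error):
    """Classify a measured service time against its two KPI thresholds."""
    if measured >= error:
        return "error"
    if measured >= warning:
        return "warning"
    return "ok"


# Hypothetical execution times in seconds.
times = {"A": 1.0, "B": 2.0, "C": 1.0, "D": 3.0, "E": 1.0, "F": 0.5}
total = expected_time(times, p1=0.4, p3=0.5)
```

With these values the expected workflow time is 1.0 + 0.8 + 0.6 × 3.0 + 0.5 = 4.1 seconds. A single service can drift into its warning band while the total still remains within the SLA target, which is the tolerance described above.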
4.2 Service Level Agreement Design
4.2.1 COSMA
The collection of service execution data and its use in the definition of KPIs has also been suggested
as an important method of service profiling of composite services [Ludwig et. al. 2009a] based on
COSMA, an approach for managing SLAs in composite services [Ludwig et. al. 2008]. The concept
therefore, find it very useful to review MoDe4SLA and compare it with the work we have presented
in this thesis so far.
The MoDe4SLA approach begins with a dependency model which for our purposes is similar to what
we have produced already in Figure 4.2. We can produce models of this sort not only for response
times but also for cost dependencies.
Next the approach advocates that we analyse the impact the dependent services have on the
composite service. An example is a service that is called repeatedly. If a workflow calls service A
three times and its response time is 3 seconds and it calls service B once and its response time is 4
seconds we could represent the impact on the workflow of service A as 3 × 3 s = 9 s and the impact of
service B as 1 × 4 s = 4 s.
Additional measures of impact might also be desirable. We mentioned in section 5.3.4 that some
services could have a far greater impact on our composite service than other services, for example,
if only one external vendor could supply that service. The MoDe4SLA approach does not cover this
kind of scenario, so we propose that a "uniqueness" impact is also derived for each service. If a
service can only be sourced from one location it has a uniqueness of one. If we can source the
service from 2 locations it has a uniqueness of 0.5.
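Both measures are simple enough to compute directly from monitoring data. In the sketch below the impact is the number of calls multiplied by the response time, as in the example above, and the uniqueness is taken to be the reciprocal of the number of sources. The reciprocal is an assumption that generalises the two values given in the text (1 for one source, 0.5 for two). The function names and the ranking rule are also illustrative only.

```python
def impact(calls, response_time_s):
    """Total time a service contributes to one workflow execution."""
    return calls * response_time_s


def uniqueness(n_sources):
    """1.0 for a single supplier, 0.5 for two, and so on (assumed 1/n)."""
    if n_sources < 1:
        raise ValueError("a service needs at least one source")
    return 1.0 / n_sources


def rank_services(profile):
    """profile: {name: (calls, response_time_s, n_sources)}.

    Returns names ordered by impact, then by uniqueness, highest first.
    """
    def key(name):
        calls, rt, sources = profile[name]
        return (impact(calls, rt), uniqueness(sources))
    return sorted(profile, key=key, reverse=True)
```

For the two services discussed above, `impact(3, 3)` gives 9 and `impact(1, 4)` gives 4, so service A ranks above service B. Only the ordering matters here, not the absolute values.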
In both the impact derivation and the uniqueness, the important thing to understand is that we are
at this stage simply creating a method which allows us to rank services as being important or less
important to us in meeting our own service level objectives. The actual values are of no importance
as long as we are consistent with how we derive them. Note also that MoDe4SLA was extended in a
recent paper [Bodenstaff et. al., 2009a] to study availability as a metric alongside response time and
cost, rather than as an impact. This is also an important consideration, especially if it is combined with
uniqueness.
Next, MoDe4SLA suggests that we structure our monitoring results. All of the data indicated by
MoDe4SLA is captured by our management solution described in section 4.3.3 and it consists of the
following:
An audit trail of all the messages exchanged
The services invoked
Which workflow the service invocation belonged to e.g. New Business, Renewal.