Design and Implementation of a Prototype Toolset
for Full Life-Cycle Management of Web-Based Applications
TR-29.3610
By
Joseph G. Gulla
A dissertation submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy
Graduate School of Computer and Information Sciences
Nova Southeastern University
2002
We hereby certify that this dissertation, submitted by Joseph G. Gulla, conforms to acceptable standards and is fully adequate in scope and quality to fulfill the dissertation requirements for the degree of Doctor of Philosophy.

______________________________________________   2/5/2003
John A. Scigliano, Ed.D.   Date
Chairperson of Dissertation Committee

______________________________________________   1/22/2003
Maxine S. Cohen, Ph.D.   Date
Dissertation Committee Member

______________________________________________   1/24/2003
Sumitra Mukherjee, Ph.D.   Date
Dissertation Committee Member

Approved:

______________________________________________   2/7/2003
Edward Lieblein, Ph.D.   Date
Dean, Graduate School of Computer and Information Sciences
Graduate School of Computer and Information Sciences Nova Southeastern University
2002
Abstract
An Abstract of a Dissertation Submitted to Nova Southeastern University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Design and Implementation of a Prototype Toolset
for Full Life-Cycle Management of Web-Based Applications
by Joseph G. Gulla
November 2002
The goal in this study was the design and prototype implementation of procedures, programs, views, schema, and data (toolset) for the management of Web applications. This toolset pertained to all phases of the Web application's life including design, construction, deployment, operation, and change. The toolset built upon key functional perspectives including accounting, administration, automation, availability, business, capacity, change, configuration, fault, operations, performance, problem, security, service level, and software distribution. The main problems addressed by the researcher through the toolset were the lack of support in a number of key areas such as keeping applications available and performing well, making applications easy to fix when they fail, making applications easier to change and maintain, and ensuring that applications are secure. The toolset addressed these challenges and at the same time reduced the impact of application complexity, the labor needed, and the skill required to achieve Web application manageability. Joint application design techniques were used for requirements and design activities. A rapid application design approach was used for toolset implementation, planning, and construction. Evaluation was done using a five-question survey that focused on input about the toolset's software attributes and technology, level of satisfaction with the toolset, and perceived contribution of the toolset to the organization. It is expected that this research project will be used as input for future service-based offerings for IBM's e-Business Hosting line of business.
Acknowledgements

I would like to thank Dr. Scigliano for his patient and skillful guidance of my dissertation project. Over the last two years, I have appreciated his consistent encouragement and celebratory messages as I completed each milestone including the idea paper, preliminary and formal proposals, and final report. I would also like to thank Drs. Cohen and Mukherjee for their helpful comments and suggestions regarding my Formal Dissertation Proposal and Final Dissertation Report. Finally, I would like to thank my wife Rosemarie for making it possible for me to complete this dissertation without taking a leave of absence from IBM.
Table of Contents

Abstract iii
List of Tables viii
List of Figures xii

Chapters

1. Introduction 1
  Statement of the Problem Investigated and Goal Achieved 1
  Statement of the Problem 1
  Goal 10
  Relevance and Need for the Study 12
  Barriers and Issues 14
  Applications and Their Environments are Complex 14
  Making Applications Management Ready is Labor Intensive 15
  Management Solutions Require a High Skill Level 15
  There is a Lack of Focus on the Manageability of Applications 16
  Hypotheses and Research Questions Investigated 16
  Limitations and Delimitations of the Study 22
  Definition of Terms 22
  Summary 27

2. Review of the Literature 30
  Introduction 30
  Historical Overview of the Theory and Research Literature 30
  Application Management as a Discipline 31
  History of Applications Management 36
  Major Research Efforts and Projects 40
  The Theory and Research Literature Specific to Application Management 45
  Management Infrastructure 46
  Management Standards 52
  Management Information Repository 63
  Classes of Products 77
  Summary of What is Known and Unknown About this Topic 88
  Accounting 93
  Administration 95
  Automation 96
  Availability 98
  Business 99
  Capacity 101
  Change 103
  Configuration 105
  Fault 108
  Operations 111
  Performance 114
  Problem 116
  Security 118
  Service Level 120
  Software Distribution 122
  The Contribution This Study Makes to the Field 123
  Expand Knowledge and Capability in Full Life-Cycle Management of Applications 124
  Provide the Design of an Innovative Toolset for the Management of Applications 125
  Expand the Capabilities of 15 Key Functional Perspectives in Applications Management 126
  Integrate with Existing Products in a Seamless Fashion 126
  Summary 127

3. Methodology 130
  Research Methods Employed 130
  Specific Procedures Employed 130
  Design the Toolset 130
  Implement the Toolset 134
  Evaluate the Toolset 135
  Formats for Presenting Results 139
  Projected Outcomes 140
  Resource Requirements 141
  Hardware 141
  Software 142
  Data 143
  Procedures 144
  People 144
  Reliability and Validity 144
  Summary 145

4. Results 147
  Introduction 147
  Presentation of Results 148
  Analysis 149
  Toolset Design 149
  Overall System Summary 149
  Subsystem Summary 156
  Support for the Accounting Functional Perspective 156
  Support for the Administration Functional Perspective 160
  Support for the Automation Functional Perspective 166
  Support for the Availability Functional Perspective 171
  Support for the Business Functional Perspective 174
  Support for the Capacity Functional Perspective 177
  Support for the Change and Configuration Functional Perspectives 180
  Support for the Fault Functional Perspective 185
  Support for the Operations Functional Perspective 187
  Support for the Performance Functional Perspective 189
  Support for the Problem Functional Perspective 192
  Support for the Security Functional Perspective 194
  Support for the Service Level Functional Perspective 197
  Support for the Software Distribution Functional Perspective 200
  Other Support for the Functional Perspectives 206
  Application Segment Strategy and Planning for Scenario Development 205
  Web Application Operational Fault 207
  Web Application Deployment is Unsuccessful 209
  Web Application Change Results in Poor Performance 211
  Web Application Experiencing Bottlenecks as Some Queries Take a Long Time 212
  Overall Response for the Web Application is Slow but the Application is Still Functional 214
  Toolset Implementation Using the Segment Strategy 217
  Toolset Evaluation 222
  Findings from the Survey 222
  Profile of Participants 223
  Responses to the Toolset Survey 226
  Written Comments on the Strengths and Weaknesses of the Toolset 238
  Summary of Results 243

5. Conclusions, Implications, Recommendations, and Summary 246
  Introduction 246
  Conclusions 246
  Conclusions for the Primary Research Questions and the First Hypothesis 247
  Conclusions for the Secondary Research Questions 252
  Conclusions for Hypotheses 2, 3, and 4 275
  Strengths, Weaknesses, and Limitations of the Study 280
  Implications 283
  Recommendations 284
  Summary 287

Appendixes 291
A. Functional Perspectives Analysis Tables 292
B. Toolset Evaluation Survey 297
C. Institutional Review Board Documents 300
D. Tivoli Management Applications 308
E. Survey Materials From the Toolset Evaluation 312
F. Background and Brainstorming Materials 362
G. Comment Sheet Detail 383
H. Data Dictionary for Full Life Cycle Toolset 399

Reference List 428
List of Tables

Tables
1. Messages Extracted From a Log File 48
2. Reference Architecture Components for CORDS Project 64
3. MIR Product Implementations Discussed 68
4. NetView Data Including Type, Table Name, and Description 73
5. Summary of Five Framework Products 83
6. Different Views of Application-Management Functional Perspectives 90
7. Management Products and the Function That They Perform 92
8. The Applications Dependency Stack and Application-Management Support 124
9. Proposed Category and Subcategory Elements of the TeamRoom 133
10. Hardware Used for the Creation of the Toolset 142
11. Software Used for the Creation of the Toolset 142
12. Primary Inputs to Design Sessions 150
13. Functional Perspectives and Related Subsystems 152
14. Resource Modeling Component Summary 157
15. Resource Accounting Component Summary 159
16. Automated Installation and Configuration Component Summary 161
17. Configuration Verification Component Summary 165
18. Template Creation Component Summary 167
19. Component Comparison Component Summary 170
20. Deep View Component Summary 173
21. Business Views Component Summary 175
22. Application Capacity Bottlenecks Component Summary 178
23. Unauthorized Change Detection Component Summary 181
24. Change-Window Awareness Component Summary 184
25. Smart Fault Generation Component Summary 186
26. Integrated Operations View Component Summary 188
27. Intimate Performance Component Summary 190
28. Detailed Data Component Summary 193
29. Interface Monitoring Component Summary 195
30. SLO/SLA Data Component Summary 198
31. Deployment Monitoring Component Summary 201
32. MIR Creation Component Summary 204
33. Subsystems and Related Scenarios 217
34. Subsystems and Related Tables to Support the Prototype 221
35. Summary of Participant Profile Information 224
36. Scenario 1 Summary 227
37. Scenario 2 Summary 229
38. Scenario 3 Summary 231
39. Scenario 4 Summary 232
40. Scenario 5 Summary 234
41. Ranking of Scenarios 235
42. Summary of Question-by-Question Analysis 237
43. Ranking of Questions 238
44. Informal Strengths Summary 239
45. Informal Weakness Summary 241
46. Other Informal Comments 242
47. Data Sources Used in the Toolset Scenarios 276
48. Three Attributes of Significance to Hypothesis 3 278
49. Availability and Performance Focus by Scenario 279
50. Standards Organizations and Support for Fifteen Functional Perspectives 292
51. Researchers, Research and Consulting Organizations, and Vendors and Support for Fifteen Functional Perspectives 293
52. Systems Management Products and Support for Fifteen Functional Perspectives (First Six) 294
53. Systems Management Products and Support for Fifteen Functional Perspectives (Last Seven) 295
54. Application Capacity Log 399
55. Application Definition 400
56. Automated Installation and Configuration Log 402
57. Business Systems Definitions 404
58. Change-Window Operations Log 404
59. Configuration Verification Log 406
60. Deep View Application Resources 407
61. Deployment Status Log 414
62. Detailed Data 415
63. Resource Modeling Log 416
64. Resource Monitoring Input 417
65. SLO/SLA Definitions 418
66. SLO/SLA Log 418
67. Specific Fault Data 419
68. Unauthorized Change Detection Log 426
List of Figures

Figures
1. The Toolset as Integrator of Existing Views or Tools with an Application Management Layer 7
2. The Toolset as Management Functions with Actions that Support Application Management 8
3. The Toolset and Its Components Consisting of Procedures, Programs, Views, Schema, and Data 11
4. Applications Management as Part of a Comprehensive Approach 13
5. Categorization Grid Showing Business Impact of Applications 35
6. MANDAS Architecture 42
7. Concepts Important to the Management of Applications 45
8. Builders and Users of the GEM MIR 71
9. NetView, Sources of Data, and its Relational Database Support 72
10. The NetView OS/390 MIR 74
11. Solstice Enterprise Agents and Other Components 75
12. The Toolset and its Relationship to the Management and Application Domains 127
13. Layout of Typical Web Page 219
14. Layout of Typical Frameset 220
15. Results of the Survey Regarding Participants' Job Family 225
16. Results of the Survey Regarding Participants' Focus Area 226
17. Cover page from the JAD Kickoff Presentation 362
18. Agenda from the JAD Kickoff Presentation 363
19. Background from the JAD Kickoff Presentation 364
20. Design background from the JAD Kickoff Presentation 365
21. Implementation background from the JAD Kickoff Presentation 366
22. Toolset Information from the JAD Kickoff Presentation 367
23. Procedures Information from the JAD Kickoff Presentation 368
24. View Information from the JAD Kickoff Presentation 369
25. Program Information from the JAD Kickoff Presentation 370
26. MIR Information from the JAD Kickoff Presentation 371
27. Schema Information from the JAD Kickoff Presentation 372
28. Data and Information for the MIR from the JAD Kickoff Presentation 373
29. Brainstorm page from the JAD Kickoff Presentation 374
30. Phases and Toolset Information from the JAD Kickoff Presentation 375
31. Functional Perspectives Information from the JAD Kickoff Presentation 376
32. Design Brainstorming Template from the JAD Kickoff Presentation 377
33. Construction Brainstorming Template from the JAD Kickoff Presentation 378
34. Deployment Brainstorming Template from the JAD Kickoff Presentation 379
35. Operations Brainstorming Template from the JAD Kickoff Presentation 380
36. Change Brainstorming Template from the JAD Kickoff Presentation 381
37. Next Steps from the JAD Kickoff Presentation 382
Chapter 1
Introduction

Statement of the Problem Investigated and Goal Achieved
Statement of the Problem

The problem investigated in this study was the management of Web applications.
Strategists, like those who participated in a recent Washington DC technology conference
(Understanding the Digital Economy, 1999), make assumptions about infrastructure
stability, network availability, and application performance, but this does not mean that
the current situation is without significant challenges. Many companies have Web sites
that are important sources of revenue, but lack tools to keep the Web applications
available and performing well (Aldrich, 1998). Failures of applications, systems, and
networks can be costly for these companies. On August 6, 1996, AOL experienced a 24-hour outage because of human error during a maintenance period. The cost of this failure
was $3 million in rebates. At that time, AOL announced an $80 million program for new
infrastructure investment. E*TRADE has also felt the pain of costly failures. From
February 3, 1999 through March 3, 1999, E*TRADE experienced four outages of at least
five hours. The direct cost of these failures is not known, but the company's stock price
declined 22% on February 5, just two days after the initial failure (Frick, 2000).
It would be an overstatement to say that a toolset for the management of Web
applications would eliminate all of these problems. However, a toolset could have a
significant impact on many aspects of these problems, thereby reducing their severity. According to Hurwitz (1996), an application-focused management toolset
would be very useful. Hurwitz defined application management as the task of
guaranteeing the availability, reliability, and performance of applications. Therefore, this project is important at this time because it will help to address the lack of tools by
providing a prototype toolset based on a design that is centered on the management of the
Web application.
The situation with the management of Web applications may be an even greater
challenge than that for other types of distributed applications. Many Web applications
require access to existing applications and data from client/server, distributed and
mainframe systems. This is the case for one of the Web applications recently deployed at
a leading insurance company (Turner, 1998). Since the Web implies availability 24 hours a day, seven days a week, a toolset is needed to deal with challenges like minimizing planned downtime for application upgrades and database backups (Mason, 1998). A
toolset is also needed to help answer the question--is it the network or the application
(Garg, 1998)? Many of the application-response measurement techniques described by
Garg and Schmidt (1999) are needed for Web applications as they suffer from the same
problems that affect client/server applications.
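The network-or-application question lends itself to a simple timing illustration: compare the time to establish a TCP connection (dominated by the network) with the time to receive a full response (network plus application processing). The sketch below is illustrative only and is not part of the toolset; the URL and host are hypothetical placeholders.

```python
import socket
import time
import urllib.request

def probe(url, host, port=80):
    """Separate network latency from application latency for one request."""
    # TCP connect time approximates the pure network round trip.
    t0 = time.perf_counter()
    socket.create_connection((host, port), timeout=10).close()
    connect_ms = (time.perf_counter() - t0) * 1000

    # Full response time adds server-side application processing.
    t1 = time.perf_counter()
    urllib.request.urlopen(url, timeout=30).read()
    total_ms = (time.perf_counter() - t1) * 1000
    return connect_ms, total_ms

# A large gap between the two numbers points at the application, not the network.
connect_ms, total_ms = probe("http://www.example.com/", "www.example.com")
print(f"network (connect): {connect_ms:.1f} ms; full response: {total_ms:.1f} ms")
```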
In 1997, the White House of the United States published a report titled "A Framework
for Global Electronic Commerce." This important document focused on principles
regarding the way that electronic commerce should develop in the United States and
around the world. The principles explained included:
1. The private sector should lead.
2. Governments should avoid undue restrictions on electronic commerce.
3. Where government involvement is needed, its aim should be to support and enforce a predictable, minimalist, consistent and simple legal environment for commerce.
4. Governments should recognize the unique qualities of the Internet.
5. Electronic Commerce over the Internet should be facilitated on a global basis. (A
Framework, 1997, p. 1)
In the United States, the execution of these principles, which started in 1995, has
fostered growth in Web sites. This policy, in addition to a very active private sector,
created a booming economy around the Internet and Internet-related technologies. The
authors of "A Framework for Global Electronic Commerce" remarked that the Internet
was already having a profound impact on the global trade in services and accounted for
well over $40 billion of U.S. exports. The report also discussed the importance of security
and reliability. The report indicated that secure and reliable telecommunications networks
and infrastructure are essential if Internet users are to have confidence. In general, the
U.S. Government documents on global electronic commerce are concerned with broad
matters related to commerce like customs and taxation. However, they consistently
acknowledge the importance of the availability and performance of global electronic
commerce infrastructure and applications.
A year later, the U.S. Working Group on Electronic Commerce published its first
annual report. In this report, the authors indicated that since the release of the framework
document, the number of Internet users has more than doubled to over 140 million people
worldwide. The report also stated that information technology industries were responsible
for over one third of the real growth of the U.S. economy and were driving productivity
improvements in almost every sector of the economy. Other important Internet-related
information was reported including information technology spending as a share of
business equipment spending, salaries of information technology workers versus the
private sector, and the growth of Internet hosts. This information built upon the work of
the original framework document. However, this report began new initiatives like
ensuring adequate bandwidth and access, consumer protection, the Internet and
developing countries, and understanding the digital economy (United States Government,
1998).
In May 1999, a conference was held in Washington, DC at the U.S. Department of
Commerce. The conference was titled "Understanding the Digital Economy." This
conference was a direct result of a working group initiative and covered a broad range of
topics. The topics included macroeconomic assessment, organizational change,
measuring the digital economy, small business access, market structure and competition,
and employment and the workforce. There were more than 35 speakers from government,
universities, and technology companies (Understanding the Digital Economy, 1999).
Most of the focus was on strategic issues, but throughout there was the assumption that
the Web would be up, available, and performing well.
All of the elements of a Web site must work correctly for it to be useful. Welter (1999)
put this in a business context by indicating that this would help maximize a company's
investment. Many interrelated elements must work together to support the Web
application. Web sites can be very complex in the way they are constructed. Jutla, Ma,
Bodorik, and Wang (1999) described the components of their Web-based system that
included browsers, database servers, Web servers, firewalls, network protocols, and SSL
components from a trusted third-party. Other Web-based systems have an even longer list
of interrelated elements. The USAA Internet Member Services system contained many of
the elements of the Web-based Order Management System, but also included legacy
components like database systems on mainframe computers. Member Services was also
closely related to systems owned and maintained by banks that were used by the USAA
Internet Member Services management system (Turner, 1998). Another example was the
IBM REQCAT Web application. This commercially available application had still other
components like mail servers, eNet Dispatchers (used for application-level load
balancing), gateway servers, and interfaces to SAP (Turner, 1999). SAP, which stands for Systems, Applications, and Products in Data Processing, is an industry-leading enterprise resource planning application. The large number of elements or components that make up these systems dramatically increases the challenge of maintaining their availability.
This toolset helps to address the challenge of keeping Web elements available by
reporting their availability in an application context. This context, which can also be
called business-system management, provides an alternative to the technology-based
approaches that dominate the systems-management field today. This prototype toolset
provides a way to anticipate failures and to automatically correct them when possible.
Automation is important for providing timely responses to problem situations just as it has
proven indispensable in other areas like the automatic creation of instrumentation for the
management of systems, networks, and applications (Hong, Gee, & Bauer, 1995) and
policy driven fault management in distributed systems (Katchabaw, Lutfiyya, Marshall, &
Bauer, 1996).
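The anticipate-and-correct behavior described above can be sketched in a few lines: poll an application-level health check and, on repeated failure, run a corrective action automatically. This is a minimal illustration of the pattern, not the prototype's code; the health URL and restart script are hypothetical placeholders.

```python
import subprocess
import urllib.request

def check_and_correct(health_url, restart_cmd, retries=3):
    """Report availability in an application context; attempt an automated fix."""
    for _ in range(retries):
        try:
            if urllib.request.urlopen(health_url, timeout=10).status == 200:
                return "available"
        except OSError:
            pass  # transient failure; retry before correcting
    # Automated correction: restart the failed application component.
    subprocess.run(restart_cmd, check=False)
    return "restart attempted"

print(check_and_correct("http://webapp.example.com/health",
                        ["/opt/webapp/bin/restart.sh"]))
```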
Future growth and stability are difficult without a well-managed site. According to
Hurwitz (1998), the bookseller Amazon.com recently lost the availability of its Web servers, which are the only way Amazon.com customers can place orders. When this happens, customers can wait until the system comes back or try another on-line bookseller. Amazon.com hopes to turn a profit in 2002, so retaining its customers and
attracting new ones is very important (Sandoval & Dignan, 2001). Other e-commerce
companies have experienced problems like Amazon.com. On June 12, 1999, eBay
experienced a 22 hour operating system failure that cost between $3 and $5 million.
eBay's stock price suffered a 26% decline that was attributed to the failure. Between
February 24, 1999 and April 21, 1999, Charles Schwab & Company experienced four
outages of at least four hours in duration. The direct cost of these failures is not known,
but shortly after the problems, the company announced a $70 million investment in new
infrastructure (Frick, 2000).
This toolset helps promote a deeper understanding of how Web system and application availability can be improved. With Web application management as a focus, this toolset explores a new and different way to make Web sites more stable and better managed. Initially, two overall approaches were considered for the toolset. One approach was that the toolset would unify or integrate components of the
application into a new discipline by adding a new management layer to the management
layers that already exist. An example of an existing discipline is Network Management
used for the management of networks. The proposed discipline or way of thinking called
Web application management was in addition to the technology-based views, approaches,
and tools that are currently used to manage the components used by applications. The
new discipline of Web application management would work in cooperation with the other
management disciplines. This approach is shown in Figure 1.
________________________________________________________________________

[Figure: a layered diagram. A new layer, Web Application Management (the proposed toolset), sits above the existing managed layers: application programs, middleware, database, network, operating system, and hardware systems, each covered today by existing tools.]

Figure 1. The toolset as integrator of existing views or tools with an application management layer
________________________________________________________________________
Today, tools exist for middleware, database, network and operating system
management. These tools are shown in the middle of Figure 1. The application itself often
exists in the system as an unmanaged collection of resources like tasks and programs.
Hardware systems, which are shown at the bottom of Figure 1, are the server machines,
switches and routers that run the site's programs and infrastructure. Hardware systems are
not a focus of this study, but were included to provide a complete context for the
discussion. The availability and performance of these components is generally collected
and represented using software tools. In Figure 1, Web application management is shown
as a new management layer with a focus on application-specific elements like tasks and
programs. This layer is an addition to the existing management layers involving database,
8
middleware, network and the operating system. This new layer complements the other
layers with a specific focus on the Web application.
The second approach that was considered was one where the toolset was management-
function based, but did not seek to unify or integrate the existing technology-based
information. Examples of management functions are accounting, business, fault, and
performance. This approach is shown in Figure 2, below, and is explained in detail in the
paragraphs that follow the figure.
________________________________________________________________________

[Figure: the proposed toolset shown as Web Application Management with per-function views and actions (an accounting* view, a business view, a fault view, and a performance view, each with actions) drawing on the existing layers and tools: application programs, middleware, database, network, operating system, and hardware systems.

* Disciplines to be explored include accounting, administration, automation, availability, business, capacity, change, configuration, fault, operations, performance, problem, security, service level, and software distribution.]

Figure 2. The toolset as management functions with actions that support application management
________________________________________________________________________
Instead of integrating existing tools and technology with a new management layer, this
approach was based on management functions or disciplines like accounting and
9
performance. Each management function had a management view of the Web application
and a set of actions that could be taken from the view such as start, stop, restart, and show
events. The accounting view could be used to show real-time charge back information for
the application. The accounting actions could be used to start or stop the accounting
recording function or generate an ad hoc bill for a department or division.
The business view could be used to monitor a collection of applications that indicated
the status and relationship of the components that made up the business system. The
business actions could include dynamically allocating resources to certain components of
the business system or restarting application components that have failed. The fault view
could be used to monitor for application-specific faults like application errors or
terminations. The fault actions could be used to repair errors using the guidance provided
in the recommended actions for the specific fault. The performance view could be used to
monitor for application performance bottlenecks. When these problems are detected,
performance action could be taken such as assigning additional threads or reducing the
number of concurrent users.
In summary, the management function software interfaced with existing layers and
tools, but operated in the context of a well-defined management function. This approach is
strongly tied to the ability of the computer management specialist to model the application
using views and to manage the application using monitors and commands.
The requirements and design activities for the toolset resulted in a management-
function based toolset. During the joint application design activities, one or more
subsystems were developed in support of each functional perspective. For example, the
accounting functional perspective had two subsystems to support its functional needs. The
10
Resource Modeling and Resource Accounting subsystems were developed to support the
accounting requirements of a Web application. The management function approach was
used because it offered the best opportunity to link application management with a set of
disciplines, such as administration and performance, which have long-standing
importance to users and the systems-management community. The management function
approach also offered the best opportunity to innovate and create an exciting design and
prototype toolset.
Goal

The goal of the researcher in this dissertation was to reduce the barriers to the
successful implementation and operation of Web applications by providing full life-cycle
management support of these applications. In general, there are significant problems in
managing distributed applications. Bauer, Bunt, El Rayess, Finnigan, Kunz, Lutfiyya,
Marshall, Martin, Oster, Powley, Rolia, Taylor, and Woodside (1997) stated “the design,
development, and management of distributed applications presents many difficult
challenges. As these systems grow to hundreds or even thousands of devices and similar
or even greater magnitude of software components, it will become increasingly difficult to
manage them without appropriate support tools and frameworks” (p. 508). According to
the Hurwitz Consulting Group, "The lack of manageability has led to a crisis in enterprise
computing" (Application Management: A Crisis, 1996, p.3). Martin (1996) pointed out
that management support is often cited by users of a system as a very important aspect of
the distributed system.
In this dissertation, the researcher designed and implemented a prototype toolset for
the management of Web applications. The toolset included procedures, views, programs,
schema, and data as part of a system to improve the monitoring and control of Internet
applications (see Figure 3). Procedures were used to define tasks performed by operators
or system administrators. Views were used to monitor and manage the application using
graphical depictions of application components. Programs performed tasks and operations
with a minimum of human interaction. The schema defined the layout of the application-
management data. Management data was stored in a database and consisted of items like
run-time parameters, profiles, alerts, and log files.
________________________________________________________________________

[Figure: the five toolset components (procedures, programs, views, schema, and data) shown as interlocking parts of a single system.]

Figure 3. The toolset and its components consisting of procedures, programs, views, schema, and data
________________________________________________________________________
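To illustrate how the schema and data components fit together, the fragment below creates one hypothetical MIR table for alert records and stores a row in it. The prototype's actual schema is documented in Appendix H; the table and column names here are invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect("mir.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS alert_log (
        alert_id    INTEGER PRIMARY KEY,
        application TEXT NOT NULL,   -- which Web application raised the alert
        component   TEXT NOT NULL,   -- e.g. 'web server', 'database'
        severity    TEXT NOT NULL,   -- e.g. 'warning', 'critical'
        message     TEXT,
        raised_at   TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")
conn.execute(
    "INSERT INTO alert_log (application, component, severity, message) "
    "VALUES (?, ?, ?, ?)",
    ("order-entry", "database", "critical", "connection pool exhausted"))
conn.commit()
```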
The design of the toolset was based on input gathered from system administrators,
developers, and systems-management personnel. A Joint Application Design approach
was used. Design collaboration was fostered with a document-based electronic database called a TeamRoom. The prototype toolset was developed using Rapid Application Development techniques. The toolset leveraged existing technology like a database management system to store application-management data and hypertext markup language to build management views. The toolset evaluation used a methodology based on a framework from Boloix and Robillard (1995).
Relevance and Need for the Study

Web application management is a necessary part of a management system that ensures
the viability of the Web site. Presently, there is an intense focus on the availability of the
servers and infrastructure such as switches and routers, but management of the e-business
application itself is also needed (Gillooly, 1999). Management of the Web application is
often neglected because it is considerably more challenging than management of a
common set of components like servers. Results of this study contribute to the discipline
of systems management by making it possible to consider comprehensive management of
a site by including the key discipline of managing the Web application itself.
Results of this study produced work products that will foster the use of application
management instrumentation. Application management instrumentation is a key part of
Figure 4, which shows an original two-part model. Instrumentation makes a comprehensive
applications-management approach meaningful to the enterprise because it provides
detailed data about the application. For example, this detail can be used to answer specific
questions about the impact of application faults on the performance of the application.
________________________________________________________________________

[Figure: a two-dimension model. The software monitoring dimension (left) builds from basic monitoring and control (1), to additional component monitoring (2), to application instrumentation (3). The Web application sampling dimension (right) builds from manual execution of application tests (4), to automated execution of application tests (5), to automated execution with problem management (6), to automated execution of sample tests with full systems-management integration (7). The combined levels range from basic, through more complete and more effective, to comprehensive.]

Figure 4. Applications management as part of a comprehensive approach
________________________________________________________________________
The model shown in Figure 4 includes the dimensions of software monitoring and Web
application sampling. Software monitoring (left side of Figure 4) has three parts that build
upon one another. These are (1) basic monitoring and control, (2) additional component
monitoring, and (3) application instrumentation. The numbers in the preceding sentence
refer to labels in Figure 4. The first two parts are used today, but the third, application
instrumentation is often neglected. Results of this study helped to address this need. This
model also includes Web application sampling (right side of Figure 4) that has four parts
that build upon one another. These are (4) manual execution of application tests, (5)
automated execution of application tests, (6) integration with problem management, and
(7) full systems management integration with other perspectives like change and
performance. This study addressed how to provide more complete systems-management
integration in direct support of Web applications. Using this model, the results of this
study help make it possible to provide a level of support (basic, more complete or
comprehensive) that meets the needs and budget of the application provider.
Barriers and Issues

The challenge of managing applications is surfacing because more and more Web
applications are being developed and deployed. The situation is now more urgent because
many companies are using the Web for commerce. The U.S. Commerce Department estimates that by 2003, business-to-consumer e-commerce will likely be in the range of $75 to $144 billion, and business-to-business e-commerce could reach between $634 billion and $3.9 trillion (Leadership for the New Millennium, 2001).
Four barriers and issues were addressed by the proposed toolset. The challenges are
significant. Some challenges, like application complexity and high skill level
requirements, are growing more difficult over time as Web sites reference more legacy
data and systems. The lack of focus on manageability is a barrier that will require a
change in thinking. Making applications manageable involves a labor challenge that can
perhaps best be addressed by leveraging automated software capabilities. These barriers
and issues are explored in more detail below.
Applications and Their Environments are Complex

Applications management is difficult because applications and the environments they
run in are complex. Simply put, the complexity of applications is making them difficult
to manage (Application Management: A Crisis, 1996). This toolset addressed this
complexity by providing procedures, programs, views, schema, and data to help manage
these applications in the same way regardless of application environment or platform. In
so doing, the complexity of an applications-management implementation for developers
and systems-management administrators is reduced.
Making Applications Management Ready is Labor Intensive

Another barrier is the labor-intensive nature of the effort required to put the
management system in place. Labor is needed to plan the effort. It is also needed to
design the management solution for the application. Planning and design activities are
only two of the many steps that are required to implement an applications-management
solution. In 1998, consultants from Tivoli Systems assisted a number of companies in
making their applications manageable. These efforts ranged from as few as 60 to as many
as 300 days of planning, design, and implementation. In 1998, a large effort was
undertaken for the Pentagon to instrument an important operational system. The pilot for
this effort took more than 250 days to complete (Tisdale, 1998). This toolset would have
helped to reduce the amount of labor required to implement a full life-cycle management
solution by providing procedures, programs, views, schema, and data that are ready to
implement and use across the application's life cycle.
Management Solutions Require a High Skill Level

Another barrier is the high skill level required of the personnel who implement the
management solution. These individuals are required to be skilled in activities as diverse
as planning, debugging, design, and system testing. These individuals are also required to
know how to work with different operating systems, network protocols, databases, and
applications. Because of this high skill requirement, some analysts suggest that the
instrumentation should come from the vendors of application development tools
(Application Management: A Crisis, 1996). This toolset helped reduce the skill
requirements by assisting individuals with planning, design, and implementation activities
over all life-cycle phases. This toolset also helped by providing management components
like monitors and tasks that run on any platform and do not require detailed platform
knowledge on the part of the personnel creating the management solution.
There is a Lack of Focus on the Manageability of Applications

Another challenge to be overcome is the lack of focus on the management of the
application. Developers are primarily focused on the creation of the application’s useful
function and are often not concerned with how the application will be deployed and
managed after it is written. According to the Seybold Group, the solution is for
developers to participate in application management (Rymer, 1995). This toolset helped to
give management of the application the focus that it requires without mandating a high
degree of developer involvement. This toolset also helped to make the application
manageable by providing easy-to-use interfaces to popular application development
languages and environments. In addition, this toolset made it easier to define the
management characteristics of an application. These characteristics, once stored in a
machine-readable format, were used to distribute or monitor the availability of the
application.
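One way to picture management characteristics "stored in a machine-readable format" is a small descriptor file that distribution and monitoring tools can parse. The field names below are hypothetical, chosen only to illustrate the idea, not taken from the prototype.

```python
import json

# Hypothetical management characteristics for one Web application.
characteristics = {
    "application": "order-entry",
    "components": ["web server", "application programs", "database"],
    "distribution": {"package": "order-entry-1.0.tar", "targets": ["web01"]},
    "monitoring": {"health_url": "http://webapp.example.com/health",
                   "interval_seconds": 60},
}

# Written once at design time, then read back by distribution and
# availability-monitoring tools across the life cycle.
with open("order-entry.mgmt.json", "w") as f:
    json.dump(characteristics, f, indent=2)
```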
Hypotheses and Research Questions Investigated

Four hypotheses were explored in this study. The hypotheses are described below.
- Hypothesis 1 - The manageability of Web-based applications is improved by a
toolset (procedures, programs, views, schema and data) implemented in a full life-
cycle context, aligned with key functional perspectives.
- Hypothesis 2 - Existing data sources like alerts, traps, and messages are sufficient
to build and maintain an effective management information repository for the
management of Web-based applications.
- Hypothesis 3 - Problem determination is significantly improved by a toolset that
utilizes views to display information from a comprehensive management
information repository of data about the Web-based application.
- Hypothesis 4 - Availability and performance faults are more easily detected and
corrected using a comprehensive toolset.
Hypothesis 1 is related to the first three research questions. These questions are the
primary research questions. They are related to the first hypothesis because they explore a
specific aspect of the hypothesis such as the components that make up the toolset, the life
cycle context, and the appropriate functional perspectives. The primary research questions
are described below.
- Question 1 - What are the appropriate procedures, programs, views, schema, and
data that would improve the manageability of Web-based applications?
- Question 2 - How do these toolset components fit in the context of the
application's life cycle including design, construction, deployment, operation, and
change?
- Question 3 - How do these toolset components round out the functional
perspectives of accounting, administration, automation, availability, business,
capacity, change, configuration, fault, operations, performance, problem, security,
service level, and software distribution?
Hypotheses 2, 3, and 4 are associated with the secondary research questions. These
research questions are numbered 4 through 23. The secondary research questions explore
concepts specific to the functional perspectives that were examined in the study. The
secondary research questions are described below.
- Question 4, Part A - For the accounting functional perspective (as it relates to Web
application management), is it possible to instrument an application whereby the
developer or user specifies the resources they intend to use and the toolset alerts
them when the limit is exceeded? Part B - Are simple messages the appropriate
alert mechanism for this tool?
- Question 5, Part A - Another accounting research question is--is it possible to
instrument an application for accountability? Part B - Could this instrumentation
be used for the charge back of the Web site to the internal groups that use it?
- Question 6, Part A - For the administration functional perspective, is it possible to
completely automate the key administration activities for the installation of a Web
application? Part B - Is it possible to install a Web application without human
intervention?
- Question 7 - Another administration research question is--in a problem-solving
context, is it possible to verify the administrative settings of key Web application
software parameters using previously stored values?
- Question 8 - For the automation functional perspective, is it possible to read
design-phase work products and automatically produce templates to be used in
subsequent phases? Examples might include start, stop, and restart scripts or
schema that describes the key Web application components that make up the Web
site.
- Question 9 - Another automation research question is--is it possible to create a tool
that automatically compares designed versus actual installed Web application
components?
- Question 10, Part A - For the availability functional perspective, what are the
characteristics of "deep" availability? Often, availability is centered on the
management of the state of a logical resource--the symbolic representation of a
system or a user. Part B - How would a deeper treatment of availability be
managed? Would it automatically include responsiveness, stability, and usage
measurements?
- Question 11 - For the business functional perspective, what additional substance or
depth can be created in support of business-systems views, in addition to the
current focus on specific component monitors and commands?
- Question 12, Part A - For the capacity functional perspective, from the point of
view of the application (not the server), is it possible to determine the components
of the application that are important to understanding its potential for capacity
bottlenecks? Part B - Which application, middleware, and database components
are essential to understanding the capacity of the application and how does that
relate to server and network-based models and approaches?
- Question 13, Part A - For the change and configuration functional perspective, is it possible for an application to detect unauthorized changes to itself? Part B - What would be required to detect and report these unauthorized modifications? (A hashing sketch follows this list of questions.)
- Question 14, Part A - Another change and configuration question is--would
application-level change-window awareness be useful to the team or process
making the changes? Part B - Would this make possible the suppression of certain
kinds of application-generated faults, that often occur during planned change
periods?
- Question 15, Part A - For the fault functional perspective--is there an optimal
technique for generating application faults? Part B - Is a smart fault-generating
module possible? A smart module might be one that takes minimal input from the
application and makes intelligent choices regarding selections for the target-
systems.
- Question 16 - For the operations functional perspective, is there a way to have an
application view for the helpdesk that integrates key functions like job scheduling,
backup status and history, and the status of key print or file outputs?
- Question 17, Part A - For the performance functional perspective, is there an
alternative to gathering intimate application performance data by modifying the
application itself to insert calls to a performance-measurement tool? Part B - Is
there a proxy for this that is possible using an instrumented application robot?
- Question 18, Part A - For the problem functional perspective, most of the focus is
on the problem-management tools. Is it possible to instrument an application to
provide more meaningful and detailed data to the problem management system?
Part B - What would the instrumentation be that would minimize the programming
burden yet maximize the data collected and recorded?
- Question 19 - For the security functional perspective, is it possible to build a view
(with probes) that would be used to monitor key security interfaces for an
application? These interfaces might include traditional access points like
application sign on attempts, failures, and retries as well as information from
application dedicated routers, firewalls, and network interface cards.
- Question 20 - For the service level functional perspective, is it possible to architect
a service-level management tool that is independent of the application, yet it
records specific information, that can be used for both service-level objective and
service-level agreement reporting?
- Question 21 - Another service level question is--is it possible for a toolset to
gather availability and performance metrics as they relate to service level?
- Question 22 - For the software distribution functional perspective, is it possible to
create deployment-phase views that allow software distribution to be monitored on
an application component-by-component basis? Would it be helpful for the
monitoring of mission-critical distributions?
- Question 23 - Another software distribution question is--would it be useful to have
a tool that reads a directory structure and builds schema and data to populate the
Management Information Repository? These data, once loaded, could be used to
build packages for distribution, objects for distribution views, and storage for data
or information relating to distributions.
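As a concrete reading of Question 13, unauthorized changes can be detected with file fingerprints: hash every application file at deployment time, then compare the current state against that baseline. A minimal sketch, assuming the application lives under a single directory; the path is hypothetical.

```python
import hashlib
from pathlib import Path

def fingerprint(app_dir):
    """Hash every file under the application's directory."""
    return {str(p): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in Path(app_dir).rglob("*") if p.is_file()}

def detect_changes(baseline, app_dir):
    """Compare the current state against the deployment-time baseline."""
    current = fingerprint(app_dir)
    changed = [f for f in baseline if current.get(f) != baseline[f]]
    added = [f for f in current if f not in baseline]
    return changed + added   # candidates for an unauthorized-change alert

baseline = fingerprint("/opt/webapp")        # captured at deployment
print(detect_changes(baseline, "/opt/webapp"))
```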
In summary, there are four hypotheses and twenty-three research questions. The first
hypothesis is associated with the primary research questions. Hypotheses 2, 3, and 4 are
associated with the secondary research questions. The results, which are explained in
Chapter 5 of this Final Dissertation Report, specifically address these hypotheses and
research questions in significant detail.
Limitations and Delimitations of the Study

In this research, the author focused on the creation of a prototype, not a product. This
reflects the idea that the most important aspects of this work were the requirements
gathering and design work necessary to create a prototype of a management tool. The
design was comprehensive, but the implementation focused on a subset of functions that support the five scenarios explained in Appendix B - Toolset Evaluation Survey. This
prototype toolset has value as it could be used to assist with the development of service-
based offerings for an organization looking to develop or purchase an applications-
management solution. Kroenke and Dolan (1987) pointed out that prototyping is a
requirements-determination tool that is used like an architect's scale model.
This scale model focused specifically on the management of Web applications, not applications in general. Because of this, the toolset focused on Web-specific aspects of
monitoring, commands, operations interface, automation, and interface to management
systems like problem and change. The researcher did not focus on the management of
servers, networks, or hardware. These components are an important part of many Web
sites, but adequate management solutions are already in place to address these elements. A
management focus is missing at the top layer in the application-dependency stack. This
top layer is focused on the application itself and its supporting middleware and database
components.
Definition of Terms

A number of key terms are defined below.
Application Programming Interface (API) - a formally defined programming language
interface between a program and the user of the program (Dictionary of Computing,
1987).
Application topology - used when describing application components and their
relationship to one another. With Tivoli software, this relationship is defined using the
enhanced relationship group that is defined in the Applications Management Specification
(Applications Management Specification, 1997).
Availability - has to do with monitoring an application and its environment while it is
running (Sturm & Bumpus, 1999).
Business to Business E-Commerce - where businesses sell to other businesses. For
example, when a shop orders new products for its shelves or a factory orders new steel to
make its products (Dr. Ecommerce, 2000).
Business to Consumer E-Commerce - where businesses set up a Web-based storefront
to reach a global market. The benefits to consumers are greater convenience, easy access
to a wide variety of goods and services, and savings in money and time (First Annual
Report, 1998).
Change - after an application is deployed and is running, it is often changed. These
modifications take place during a period of time that is sometimes called the change
phase. These changes are often managed using change management practices and tools
(Harikian, Blust, Campbell, Cooke, Foley, Gulla, Gayo, Howlette, Mosher, and O'Mara,
1996).
Construction - a phase or period of time when the application is created. Some vendors
call this phase the assemble phase (CONTROL: Enterprise Web, 1999).
Data - a component of the toolset that is described by schema and resides in a
Management Information Repository (MIR) or file that is referenced by the MIR (Martin,
1996). Records in a Web-server log are typical of the data used in this project.
Deployment - a set of activities during the software life cycle where a software feature is
distributed and put in an installable state (DMTF Standards, 1998).
Design - a time during the creation of an application when the process of defining the
hardware and software architecture, components, modules, interfaces, and data for the
system is conceptualized and documented (Dictionary of Computing, 1987).
Desktop Management Task Force (DMTF) - this industry organization is leading the
development, adoption, and unification of management standards. The focus of this work
is broad – desktop, enterprise, and Internet environments (Distributed Management Task
Force, 1999).
Functional perspective - this term was used by Sturm and Bumpus (1999) for the
management functions needed to support an application. Examples include fault,
performance, configuration, and security.
Hyper Text Markup Language (HTML) - a markup language that uses text and
tags to format a page or document on the World Wide Web (SCIS Help, 1999).
International Organization for Standardization (ISO) – work from this organization results in
international agreements that are published as international standards. ISO achievements
include the film speed code, the standardization of telephone and banking cards, and ISO
9000, which is used by businesses to provide a framework of quality management and
quality assurance (ISO - International Organization, 1999).
Information Technology Infrastructure Library (ITIL) - describes the organization of
service delivery in the area of automated information technology systems (Bladergroen,
Maas, Dullaart, Kalfsterman, Koppens, Mameren, & Veen, 1998).
Information Technology Process Model (ITPM) - an IBM process model consisting of
8 process groups, 41 processes, and 176 sub processes (Harikian et al., 1996).
Java Management Extensions (JMX) - an architecture, components, protocols, and
APIs that make it possible to manage Java applications through Java technology (JAVA
Management Extensions White, 1999).
Joint Application Design (JAD) - an approach that involves heavy client participation in
the development of formal requirement specifications (Jackson & Embley, 1996).
Management Information Repository (MIR) - tool to integrate management
applications and the data they require. A logically centralized database that is at the heart
of the management system (Martin, 1996).
Management Infrastructure - software, hardware, and procedures that are used to
support the management needs of an application. These needs cover activities like
application distribution, application installation, dependency checking, application
monitoring, application configuration, operational control, and deploying updates and new
releases. (Applications Management Specification, 1995).
Monitor - a program that examines specific applications, or the systems upon which applications rely. Typical monitor programs examine available disk space or application errors and use thresholds to determine when conditions require the attention of an administrator (Tivoli Manager for Oracle, 2000). (A threshold sketch follows this list of terms.)
Operation - a time in the application life cycle (phase) when a software feature is running
and being monitored (DMTF Standards, 1998). It can also refer to data or tasks performed
by people.
Open Software Foundation (OSF) / Distributed Computing Environment (DCE) -
OSF is now The Open Group and they provide DCE which is a robust, network-centric
computing environment that includes system services like remote procedure call, directory
services, time services, security services, and thread services (DCE Overview, 1996).
Procedure - a description of the course of action to be followed as a solution to a problem
(Dictionary of Computing, 1987).
Program - a sequence of instructions for processing by a computer (Dictionary of
Computing, 1987).
Rapid Application Development (RAD) - this approach is used in software development
as a means of delivering maximum functionality in the shortest time (Carter, Whyte,
Birchall, & Swatman, 1997).
Schema - a set of statements that describe the structure of the database (Dictionary of
Computing, 1987).
Secure Socket Layer (SSL) - a mechanism for securing Web transactions. This protocol
consists of an initial phase, called a handshake, during which secure communications are
established; a period of application-to-application communication where encryption is
applied to the data; and finally an exchange of data to close the dialog (Rubin, Geer, &
Ranum, 1997).
Simple Network Management Protocol (SNMP) - a simple set of network
communication specifications that covers the basics of network management in a way that
does not stress an existing network (Vallillee, n.d.).
Transmission Control Protocol/Internet Protocol (TCP/IP) - a protocol suite named
for two of its most important protocols: Transmission Control Protocol and Internet
Protocol. The suite is over 25 years old and is still evolving (Feit, 1996).
UNIX - this operating system was created in the late 1960s to provide a multi-user,
multitasking system for use by programmers. It consists of a kernel, standard utility
programs, and system configuration files (Byrd, 1997).
Windows New Technology (NT) - this operating system was released in 1993 and was
built by former designers and developers of VMS, an operating system from Digital
Equipment. NT has similar goals to UNIX--portability, extensibility, and support for a
broad range of hardware from laptops and desktops to servers that support an entire
department (Russinovich, 1999).
Summary

In this dissertation, the researcher designed and implemented a prototype toolset
consisting of procedures, views, programs, schema, and data. The toolset components
were chosen because they made possible a complete approach to the management of
applications. For example, procedures provided a thorough list of tasks to be performed by
the administrator whereas programs were used to automate steps and activities. Views
were provided so that the administrator could more easily grasp the meaning of the
management data. Schema was created so that the data would be organized in a
meaningful and orderly manner.
This toolset pertains to all phases of the Web application's creation and use including
design, construction, deployment, operation, and change. A full life-cycle approach was
chosen so that the application could be effectively managed through multiple phases, not
just the operation phase that typically receives most of the focus. The toolset built upon
key functional perspectives including accounting, administration, automation, availability,
business, capacity, change, configuration, fault, operations, performance, problem,
security, service level, and software distribution. These functional perspectives were
chosen because they are important to the successful management of Web applications, and
provide a clear focus to the toolset functions within the life cycle context.
This toolset was designed to be used for the management of Web applications and
focused on improving their availability through effective monitoring, control, operations
interface, automation, and problem management. Four barriers and issues were addressed
by this toolset. The challenges included application complexity, the high skill level
required to create management solutions, the lack of focus on manageability, and the
challenge of making applications management ready.
Four hypotheses were examined in this study. Twenty-three research questions were
also explored. The first three research questions were the primary research questions and
the remainder were the secondary research questions. The secondary research questions
had an almost one-to-one relationship with the subsystems that were developed in support
of the 15 functional perspectives. Nineteen subsystems were developed in support of the
15 functional perspectives that are important to the management of applications.
A prototype toolset was developed and evaluated using a survey instrument. The
instrument made use of a framework that consisted of the system dimension and the
environment domain. Factors such as toolset understandability, technology, compliance,
performance, and contribution were considered. The evaluation aspect of the
research was based on ideas from Boloix and Robillard (1995).
Chapter 2
Review of the Literature
Introduction

This chapter contains a review of the literature focused on systems, network, and
applications management. The historical overview discusses applications management as
an emerging discipline, the history of applications management, and major research
efforts and projects in the area of the management of applications. The theory portion of
the survey pertains to management infrastructure such as alerts and toolkits, management
standards such as CORBA and SNMP, management information repositories, and classes
of products such as point or framework.
This chapter also contains a summary of what is known and unknown about the
management of applications. This information is organized by functional perspective
starting with accounting and administration and ending with service level and software
distribution. The last major part of this chapter discusses the contribution that this study
makes to the field of applications management. The main contributions are in the areas of
expanding knowledge and capability in full life-cycle management, providing the design
of an innovative toolset, expanding the capabilities of 15 key functional perspectives such
as accounting and service level, and integrating with existing management products.
Historical Overview of the Theory and Research Literature

Today, applications management is considered an emerging discipline. However,
managing applications has been done by computer professionals from the 1960s through
the 1990s (Sturm & Bumpus, 1999). Presently, the discipline is being defined by the
work of standards organizations, product manufacturers, systems-management
consultants, and professional-services personnel. The efforts are not coordinated so there
are differences in the terminology used and the scope of the efforts surrounding
applications management initiatives. In spite of the confusion, there is a recognizable and
developing discipline.
Application Management as a Discipline

As a formal discipline, applications management emerged as a response to the
management challenges of client/server applications. One of the first groups to focus on
the manageability of applications was the Desktop Management Task Force. The DMTF
was formed in 1992 by companies including Intel Corp., Microsoft Corp., Novell, Inc.,
SunSoft, and SynOptics Communications. Hewlett-Packard, IBM, and Digital Equipment
Corp. also actively participated in the group (Applications Management Specification,
1997). During that time, software companies began to produce programs to manage
client/server applications.
Consultants from the Patricia Seybold Group began to write white papers in support of
the new products being developed. An early report (Rymer, 1995) was written in
response to the management challenges created by new applications written using the
client/server model. For some, the report might be suspect as it was written to support a
software product called AppMan by Unify Corporation. The report was not pure research,
but rather a tool used to sell products. Regardless of its purpose, this twenty-six-page
report was remarkable because it identified and explained many of the core application
management issues. In the report, the researcher defined direct application management as
the "monitoring, control, and tuning of the software modules that make up client/server
applications" (Rymer, 1995, p. 1). Five disciplines of direct application management were
also defined. These included:
1. Fault management
2. Performance management
3. Application configuration management including software distribution
4. Security management
5. Accounting management including software asset management (Rymer, 1995)
Rymer also wrote of the need to build management right into the application programs.
He explained that developers must become participants in application management. This
is still an important issue today, as many applications are not instrumented for
manageability, and reworking applications after they have been written is a significant
challenge. Also, it is difficult to provide meaningful instrumentation without modifying
the application itself.
Starting in 1996, and continuing to the present, an important series of reports and
articles began to be available from Hurwitz Consulting. Some of these documents were
available in magazines, others on the Web. An early white paper (Application
Management: A Crisis, 1996) was also written to support the AppMan product. It too
was written for Unify Corporation. Another early article on managing applications
(Foote, 1997a) was published in DBMS Magazine. It focused on the disciplines
appropriate for applications management. A second article by the same author (Foote,
1997b) was published by Hurwitz Associates. That article focused on the management of
applications and databases. In that article, Foote explained the relationships between the
different components in the distributed computing environment. He explained the
application dependency stack that contained network, hardware, operating system,
database, application services, and application elements. He also identified the
appropriate disciplines--distribution, configuration, operations, event/problem,
performance, storage, and security. Finally, Foote added four environments to the
discussion that he named desktops, departmental servers, enterprise servers, and
mainframes. This entire discussion was framed in the context of the service-level
agreement. The article was notable for its description of the colliding trends in application
management. In Foote's view, the colliding trends included:
• Too frequent releases
• Componentization of applications (trend to create self-contained groups of
application function)
• Too many application sources (each provider has its own service and management
requirements)
• Constantly changing content
• Increasing availability requirements
• Component dependencies, incompatibility, collisions, and availability
requirements (Foote, 1997b, p. 6)
Several Hurwitz reports focused on the importance of an applications-management
strategy. Hurwitz (1997) explained the key organizational issues in developing an
applications management strategy. The researcher explained the importance of
establishing procedures, as well as the role of the help desk/service center in managing
user perceptions. Hurwitz also pointed out that the strategy was not just for the IT
department--it must take into account the needs and ideas from the user community. Later,
Hurwitz (1998) wrote a white paper to support the product strategy of Full Time Software
Corporation, a developer of application-availability products. The focus of the paper was
on the demand for 100% application availability.
Still other papers were available for a fee directly from Hurwitz Consulting. These
were some of the strongest reports available on the management of applications.
Geschickter (1996b) wrote a five-page report that defined applications as the intersection
of technology and business. It included a discussion of the application-dependency stack
and the vendors who have products to monitor the components of the stack. Geschickter
(1996a) also wrote a 36-page report that explained the results of a phone survey that was
conducted by the Hurwitz Consulting Group to validate applications management issues
and needs. In 1996, Sobel wrote a series of application management white papers for
Hurwitz Associates.
Sobel (1996d) wrote a seven-page report that discussed the limitations of network and
systems management, applications-management technology issues and methodologies,
standards, and APIs. The focus of that paper was to help IT managers separate the
important issues from the ideas being sold by the software industry at the time. Sobel
(1996b) wrote a five-page report that had as its focus guidelines for the creation of a user
strategy for the management of applications that support the business. In this report, Sobel
used a tool to determine the impact of an application on the business. This tool, in the
form of a categorization grid, was useful in helping to identify the applications that will
shut down the business if they are not available. Personal productivity programs are
important to individuals and are often used to support management and planning.
Although these programs matter to the people who use them, their failure is unlikely to shut
down the business. Mission critical applications are those programs that involve core
business processes like payroll, accounts receivable, and accounts payable. These
programs are used in an enterprise-wide manner and would have a very negative impact
on the business if the organization were to experience a catastrophic failure. For this
reason, these applications should be the focus of the strategic application-management
activities. Sobel's grid is shown in Figure 5.
________________________________________________________________________

                                   Constituency
   Function                     Individual               Enterprise
   Core Business Processes      Task Critical            Mission Critical
   Advising/Planning            Personal Productivity    Decision Critical

Figure 5. Categorization grid showing business impact of applications
________________________________________________________________________
Sobel (1996a) continued to write on applications management with a four-page report
that expanded on the ideas discussed in "Creating an applications management strategy"
and gave detailed suggestions about managing user perceptions, service-level agreements,
and the role of the help desk or service center. Sobel (1997) also wrote a five-page report
that described the key applications-management standards activities including the work of
the DMTF (Desktop Management Interface standard), Tivoli (Applications Management
Specification), and the IETF (Application Management MIB). Gillooly (1999) wrote a
six-page report that built upon the previous Hurwitz applications-management reports and
explained that e-business had made management much more critical and valuable to the
organization. The report described the problem, the requirement, and possible solutions to
the challenges of business-to-consumer and business-to-business e-commerce.
Taken as a body of work, the articles, reports, and white papers from Hurwitz
Consulting and Hurwitz Associates captured the problems, requirements, and strategies associated
with the management of client-server applications. They also form a good foundation
upon which to explore the management of business-to-consumer and business-to-business
Web applications.
History of Applications Management

In the 1950s and early 1960s, computers made calculations in milliseconds. In one day,
ENIAC performed as many calculations as it would take a human to perform in 300 days
(Hussain & Hussain, 1985). At that time, managing applications was largely manual
labor. The management discipline was not identified or defined so anything that was done
to support an application could loosely be called an applications-management activity.
Computer operators worked with the applications and performed activities like
maintaining a log of when jobs started and ended. Operators also scheduled jobs based on
variables like when the input would be ready, what forms were required for the printer,
and what resources were needed like tape drives and card readers (Sturm & Bumpus,
1999). At the time, media was bulky, taking about 100 cubic feet to store one million
characters of data (Hussain & Hussain, 1985).
In the 1970s, punched card and tape media gave way to disk, a direct-access media, and
Cathode-Ray Tubes (CRTs). A variety of other media were used including paper-tape
readers and punches, magnetic ink character readers, optical mark and character readers,
line printers, character printers, computer-output microfilm, direct-entry consoles and
recorders, graph plotters, and audio response units (Daniels & Yeates, 1971). Computer
applications were batch (like the 1960s) and on-line where users could get immediate
access to information like account balances. Programs were stored in disk libraries instead
of on cards and application software became more flexible and functional (Sturm &
Bumpus, 1999). By 1979, the space required to store one million characters had
dropped to 0.03 cubic feet (Hussain & Hussain, 1985). Managing applications was still
largely a manual process, although some software features were developed to assist the
operators, like a Job Entry Subsystem that made operating system software easier to use
for managing the application workload. This was the case for IBM's operating system of
the day, Multiple Virtual Storage (MVS), which was both a batch operating system and one that
supported hundreds of concurrent users (Kronke & Dolan, 1987). MVS had two job
subsystems--JES2 and JES3. Each offered different workload management capabilities
(Gulla, 1991).
Mainframe computers were not the only computers used in the 1970s. At this time, the
minicomputer became popular and created a set of challenges associated with distributed
or departmental computing. Digital Equipment Corporation (DEC) produced the first
commercially successful minicomputer in the mid-1960s. By the 1970s, Hewlett-Packard,
Data General, Texas Instruments, Honeywell, Burroughs, Wang, IBM, and Prime all
entered the market (Szymanski, Szymanski, Morris, & Pulschen, 1988). Minicomputers
provided flexibility to their users and when these systems began to be used in a branch
bank or regional office, it became cost effective to use automated tools to manage these
systems (Sturm & Bumpus, 1999).
In the 1980s, the personal computer became a business tool. Invented in the late 1970s
by Jobs and Wozniak, the first personal computers had limited computing and storage
capability (Szymanski et al., 1988). However, they very rapidly grew in functionality, and
soon, they were linked to the mainframe. In some organizations, microcomputers were
clustered in Information Centers (ICs). The IC was a hands-on facility where
microcomputers, software tools, and training were made available to users. Individual
productivity was enhanced through word processing, mail merge, desktop publishing,
electronic spreadsheet, and presentation graphics software (Long, 1989). IC specialists
managed the hardware and software resources of the center and assisted the user
community.
Later, microcomputers were combined in departmental local-area networks. These
networks allowed for the sharing of files and resources like printers and plotters. It was
during the 1980s that management software started to be developed to manage the
infrastructure necessary to support the application. Vendors, like 3Com, IBM, Bay
Networks, Fluke, and HP developed management utilities that were used by technicians to
manage the availability of network infrastructure. These utilities took many forms from
stand-alone test instruments to UNIX-based management applications. Some were device
dependent applications whereas others were device independent. The scope of their
functionality included availability, performance, measurement, monitoring, and reporting
(Gulla, 1997). Additional software was developed to manage the other components upon
which the application depended. Database management vendors, like Informix,
developed utilities to manage the database. The System Monitoring Interface (SMI) was
used by management applications to get information about bottlenecks, resource usage,
performance profiling, lock usage, and other key values that are useful for managing the
database (Mattison, 1997). Management tools began to mature that were useful in
managing functions like backup and recovery and network management (Kronke &
Dolan, 1987).
In the 1990s, the management of applications became a topic of increased importance.
This topic became significant due to the proliferation of client-server applications and the
many distributed systems needed to support them. At this time, application management
research projects like the project that created the Modular Advanced Re-configurable
Integrated Architecture (MARIA) toolkit (Atkinson, Hawkins, Hills, Woollons,
Clearwaters, & Czaja, 1994), the Consortium for Research on Distributed System
(CORDS) project (Bauer, Finnigan, Hong, Rolia, Teorey, & Winters, 1994), and
Management of Distributed Applications and Systems (MANDAS) project (Martin, 1996)
appeared in the literature.
Applications-management products also appeared in the marketplace. Early examples
included Tivoli AMS-based products like Distributed Monitoring and Software
Distribution (Lendenmann, Nelson, Lara, & Selby, 1997). Since the late 1990s, the World
Wide Web has had a major impact on application development and delivery of new
function to users. Many different products are available to help solve problems found in
the Web environment, but no full life-cycle toolset has yet been developed.
Major Research Efforts and Projects

Starting in the mid-1990s, application-management projects were being proposed and
carried out by a number of researchers around the world. Atkinson, Hawkins, Hills,
Woollons, Clearwaters, and Czaja (1994) wrote about an application management project
that had as it focus the implementation, re-configuration, monitoring, and control of an
application. The researchers, from the University of Exeter and Helitune Limited,
identified a broad number of requirements that included communication support,
distribution transparency, computer-aided development and tool support, structure
support, allocation support, change support, and fault tolerance. In addition, configuration
management was identified as a requirement for both planned and unplanned changes.
The researchers responded to the requirements by developing the Modular Advanced Re-
configurable Integrated Architecture (MARIA) toolkit. Most of the support was for
developers, but management of the application was woven into the fabric of the toolkit.
Bauer, Coburn, Erickson, Finnigan, Hong, Larson, Pachi, Slonim, Taylor, and Teorey
(1994) explained the Consortium for Research on Distributed System research project.
The scope of this project was to develop new techniques for developing distributed
applications and for understanding the services required for distributed applications and
the associated tools. The researchers identified requirements that included support for
peer-to-peer development, accommodation of legacy systems, accommodation of
emerging applications, support for security and privacy, and manageability. They also
identified other requirements including data access, support for role-specific transparency,
support for visualization, support for application development languages and tools,
support for distributed debugging and testing, and the accommodation of evolving service
requirements.
The CORDS functional framework had a very broad scope that included system management
and network management, which is how that project satisfied the manageability
requirement. In detail, system and network management included services to configure,
monitor, and control managed objects. The objects represented real components like
applications, services, networks, and devices. The researchers organized management
services by subsystems. These subsystems included management information repository,
configuration, monitoring, and control subsystems. Management agents were also part of
this structure. The project team was composed of individuals from four IBM research
groups, six Canadian universities, four American universities, and other international
research entities.
Bauer, Finnigan et al. (1994) used prototypes from the CORDS project to focus almost
exclusively on system and application management issues. Their main proposal was a
reference architecture for distributed systems management that utilized system
monitoring, information management, and system modeling techniques. Within their
scope were three classes of system management. These included network services and
devices, operating system services and resources, and user applications. The focus on
user applications was most interesting as the network and system components were
reasonably well studied and understood. The services they identified--monitoring, control,
and management information--were all designed to interact with managed objects that
represented real components. In addition to network and system components, application
components were given a real focus. The application components included instances like
databases, files, programs, tasks, application clients, application servers, queues, and
processes.
Bauer, Lutfiyya, Black, Kunz, Taylor, Bunt, Eager, Rolia, Woodside, Hong, Martin,
Finnigan, and Teorey (1995) formalized the work that began with the CORDS project.
This research was supported by the IBM Center for Advanced Studies and the National
Sciences and Engineering Research Council of Canada. The MANDAS architecture
consisted of four parts (see Figure 6).
______________________________________________________________________

   Management Applications:  Configuration Management, Fault Management,
                             Performance Management, and Modelling
   Management Services:      Configuration, Monitoring, Control, and
                             Repository Subsystems
   Management Agents
   Managed Objects

Figure 6. MANDAS architecture
________________________________________________________________________

The first part was the management applications that included configuration
management, performance management, fault management, and modeling. Modeling was
the most interesting of the management applications because it was a tool to help predict
the needs and behavior of the programs that it supported. The second part of MANDAS
was a set of management services that communicated with agents to manage network,
system, and application objects. A key part of the management services was its use of a
repository subsystem. This subsystem utilized data in a management information
repository. The third part of MANDAS was its management agents. These agents
executed on the systems that were being managed and communicated with the
management applications through the management services. The fourth part of
MANDAS was the managed objects. These objects represented the real components like
servers and applications. Object technology was used to take advantage of productivity
characteristics like inheritance.
Bauer et al. (1997) discussed the follow-on activities of the MANDAS research.
MANDAS and MANDAS-related prototypes were also discussed. The MANDAS review
was detailed in a way that reflected that there were working prototypes of the subsystems.
For the first time, there was a detailed information model class hierarchy and a
management data warehouse. Registration and query services were named and explained.
An instrumentation architecture and environment were also depicted. The proposed
instrumentation was intrusive to the application, thereby availing itself of application
details unavailable to external monitors. The article also contained a comprehensive list
of related work including other management frameworks, information models, systems for
the monitoring and control of distributed applications, and configuration services.
Some of the MANDAS researchers focused on specific management architectures and
protocols. Hong, Katchabaw, Bauer, and Lutfiyya (1995) discussed the use of the Open
Systems Interconnection (OSI) management framework for monitoring, analyzing, and
controlling networks and their devices. The OSI management framework used an object-
oriented methodology that employed agents. The framework came with guidelines for the
definition of managed objects, covering methods and notational techniques for depicting
the classes of managed objects that represent real resources. Other systems-management
researchers took an interest in the OSI
management platform. Maltinti, Mandorino, Mbeng, and Sgamma (1996) discussed a
system that was built for an Italian Public Administration department. The scope of this
management system included system and application resources. The application
resources were largely services like TP monitor, file transfer, software distribution,
activity scheduler, and terminal emulator.
Endler and Souza (1996) took a different approach from the MANDAS researchers.
Sampa was a system for the availability management of process-based applications.
Sampa was designed to support DCE-based applications and was based on an
application-specific availability specification. Simply put, Sampa was built to detect and
automatically react to failures like node crashes, process crashes, and hang-ups. Sampa
used non-intrusive ways to monitor the application and required no changes to the
application code. The researchers pointed out that this was a benefit of the design
approach taken by the developers. MANDAS was mentioned in the related-work section
of the article and influenced the Sampa architecture with its monitoring and configuration
control components.
Yucel and Anerousis (1999) explored the challenges of managing, filtering, and
aggregating events from the managed elements in a Web-based system. The system they
developed, called Marvel, was a distributed computing environment for the creation and
management of events. Marvel used an object-oriented information model and included a
number of tools called views. The monitoring view was a high-level view of the network.
The control view was used to manage network management services. The event view
displayed the notifications associated with the managed element. Although this work was
largely related to network management, it could have implications for applications
management.
The Theory and Research Literature Specific to Application Management

This part of the chapter contains information about management infrastructure,
management standards, management information repositories, and classes of products (see
Figure 7). Each of these topics is pertinent to the study of applications management.
________________________________________________________________________

Applications Management
   Management Infrastructure: toolkits, instrumentation, events, alarms,
      and messages
   Management Standards: CORBA, OSI, RM-ODP, SNMP, and JMX
   Management Information Repository
   Classes of Products: point products (targeted or general) and
      framework products (applications, systems, or networks)

Figure 7. Concepts important to the management of applications
________________________________________________________________________
Management infrastructure is important because it is a key source of data about the
application being managed. Infrastructure topics include instrumentation, events, alarms,
toolkits, and messages. These infrastructure components are important sources of
information for a management system. Management standards are important because
standards make interoperability possible. Standards can also be used as building blocks.
Over time, tools are developed and individuals become skilled with important standards.
These software and human resources can be utilized on projects, increasing the likelihood
of success in these activities. Examples of standards that will be covered include CORBA,
OSI, RM-ODP, SNMP, and JMX.
Management information repositories are important because they contain well ordered
information about systems, networks, and applications. Knowledge of the classes of
products is important because it is important to know the software in the marketplace so it
can be leveraged as appropriate. Management infrastructure, management standards,
management information repository, and classes of products are now discussed in detail.
Management Infrastructure

Management infrastructure includes a wide variety of items from simple application
programming interfaces to complex architectures that include manager and agent roles.
Some items are free and available through downloads whereas others are products that are
costly and have significant vendor support. Many data sources provide input to
management systems. Alarms, events, and messages are some of the most basic ways that
systems, networks, and applications share data and information about problems and
steady-state operations. As such, these sources are important to the management system.
In general, an alarm is a warning signal (Webster's New International Dictionary,
1955). An alarm might be an event whose characteristics cause it to be given the
designation of an alarm. Examples of this are the communicationsAlarm,
environmentalAlarm, and equipmentAlarm (Solstice Enterprise Manager, 2001).
Sometimes alarms are called alerts as in the case of the Intrusion Detection Exchange
Format Data Model (Debar, Huang, and Donahoo, 1999). A management system that can
trap and interpret alarms has access to a significant information source about the system,
network, and application.
An event is something that comes, arrives, or happens (Webster's New International
Dictionary, 1955). SNMP events are unusual conditions that occur in the SNMP device.
The information for these events is represented in trap messages. Examples of these
messages include link up or link down, cold start or warm start, authentication failure, and
loss of EGP neighbor (Siyan, 2000). Events occur with great frequency in many
networked systems. Often, multiple events are generated by different components for the
same problem. This situation has led to research activities and products that handle both
high event volumes and event correlation. Yemini, Kliger, Mozes, Yemini, and Ohsie
(1996) described a network management system that polls devices and accepts
asynchronous events. This system also correlates events to the same root cause in a very
efficient manner.
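In implementation terms, traps arrive as UDP datagrams, conventionally on port 162, so even a small program can receive them. The Java sketch below is illustrative only: it counts and reports arrivals but omits the ASN.1/BER decoding of the trap payload that a real SNMP manager would perform, and binding to port 162 typically requires administrative privileges.

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;

    public class TrapListener {
        public static void main(String[] args) throws Exception {
            // SNMP traps conventionally arrive on UDP port 162.
            DatagramSocket socket = new DatagramSocket(162);
            byte[] buffer = new byte[4096];
            while (true) {
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                socket.receive(packet); // blocks until a trap datagram arrives
                // A real manager would BER-decode the SNMP message here.
                System.out.println("trap from " + packet.getAddress().getHostAddress()
                    + " (" + packet.getLength() + " bytes)");
            }
        }
    }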
There are a number of products that focus on event management. An example is BMC
Software's Patrol Enterprise Manager. This product has a three-tier architecture that
includes a graphical interface, a manager component, and agents. Its main features
include filtering, business views, correlation, data collection, and recovery (Event
Management, 2000).
Messages are important sources of data to the management system. Almost every
program that one can think of uses messages to communicate with its user. Messages can
be written by the application to a computer window or an application log or both. These
messages can indicate normal processing or can indicate a problem. Since there is no
global architecture for message format, messages can be, and often are, free form in content.
Table 1 contains several messages extracted from a log file with an interpretation of the
significance of the message to a management system.
Table 1. Messages Extracted From a Log File

Message text                                  Significance
The date is Saturday, August 05, 2000.        Gives context to the other messages
                                              in the file since most do not have a
                                              date or time associated with them.
08-05-2000 22:28:31.49 - Interpreted          Response from the call to another
response: Ok                                  function completed successfully.
Authentication completed successfully.        Security message.
Internal state information is                 Could be used for debugging internal
'L=436CL:1:1:32:1:8432161366:28800:3:1'       program problems.
Name server (9.37.0.5) pinged in 281ms.       Performance message.
Log files closed.                             Normal completion message.
It is easy to understand how messages are a key source of information to a
management system and could be used to make high-level determinations about the state
of system and application resources. Certainly, up and down states could be interpreted
from messages like "Authentication completed successfully". Performance level could be
implied by a message like "Name server (9.37.0.5) pinged in 281ms". If 281ms is
deemed slow then the name server resource in a network view might be set to yellow
(performance degraded).
Messages can be presented to the management system through an API or the messages
can be collected from a log file by a utility program. An example of the API technique is
the Write to Operator (WTO) macro interface used by the IBM OS/390 system. This
macro allows the application program to present a message of up to 122 characters to the
console (McQuillen, 1975). The message is presented to the operator and subject to
automation processing that is standard on mainframe systems (Irlbeck, 1992). An example
of the log file utility program approach is the Tivoli Event Adapter. This utility receives
log messages from the syslogd daemon running on a host computer. The utility reformats
the messages into Tivoli Event Console events and forwards them to the event server for
processing (Lendenmann et al., 1997). The reformatted events can be used to create
problem records or can trigger automated actions.
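The adapter approach can be approximated in very few lines. The Java sketch below tails a log file and forwards each new line to an event server; it is an illustration of the pattern only, with an assumed host, port, and file name rather than Tivoli's actual interfaces, and it forwards raw lines instead of reformatting them into a product-specific event format.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;

    public class LogAdapter {
        public static void main(String[] args) throws Exception {
            // Assumed event-server address, port, and log-file name.
            InetAddress server = InetAddress.getByName("eventserver.example.com");
            DatagramSocket socket = new DatagramSocket();
            BufferedReader log = new BufferedReader(new FileReader("app.log"));
            while (true) {
                String line = log.readLine();
                if (line == null) {       // end of file: wait for new lines
                    Thread.sleep(1000);
                    continue;
                }
                // A production adapter would reformat the line into the event
                // server's native event format; this sketch forwards raw text.
                byte[] payload = line.getBytes();
                socket.send(new DatagramPacket(payload, payload.length, server, 5529));
            }
        }
    }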
Instrumentation is a key applications-management concept, one being actively developed
and discussed by researchers and product developers. Schade, Trommler, and Kaiserswerth
(1996) proposed a method to support the development of manageable distributed
applications. This method relied on a formal management interface built on
CORBA-compliant systems and DME. It also leveraged the capabilities of an
instrumentation library, called a management adaptation library, that made management of
the application more straightforward when it was executing. Support was included for two
functions: initialize, which registered the object with the management system, and state,
which updated the MIB to reflect the new state and triggered any pending management actions. Hong, Gee,
and Bauer (1995) supported the idea of instrumentation as a tool for managing systems
and applications. They too were interested in automation for the instrumentation so little
needed to be done by the developers to set up the application for manageability. The
authors defined the process of instrumenting the application as having four steps. The
steps were defining the management data, defining the management operations,
generating the management interface code, and building the instrumented software
resource. The tool they created, the Management Interface Instrumentation Tool (MIIT),
was designed to take the resource to be managed, such as an application, and add the
management interface so it could be managed. This is a simple idea that minimized the
burden on the developer to instrument the application for manageability.
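The shape of the result can be pictured with a small sketch. The Java fragment below shows an application class exposing management data (a request counter) and a management operation (a counter reset) through a separate management interface; all names are illustrative assumptions, and the code is not actual MIIT output.

    // Steps 1 and 2: define the management data and the management operations.
    interface OrderServiceManagement {
        long getRequestCount();   // management data
        void resetCounters();     // management operation
    }

    // Steps 3 and 4: the management interface is built into the resource.
    public class OrderService implements OrderServiceManagement {
        private long requestCount;

        public void handleRequest() {
            requestCount++;       // normal application work would go here
        }

        public long getRequestCount() { return requestCount; }
        public void resetCounters()   { requestCount = 0; }
    }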
Many software companies and some researchers make toolkits available to support a set
of activities like OSI management alarm handling (Compaq TeMIP, 2000) or the
exchange of management data (Integration Overview, 2001). Although few of these
toolkits are application-management specific, many could be useful if the management
platforms that they support were used as the platform for an application-management
toolset. An example of a potentially useful toolkit is the Tivoli Multi-Platform Manager
API Software Developer Kit. This toolkit was developed for management applications to
perform functions like handle alerts and discovery of devices. This toolkit also supports
remote file operations and program execution (Integration Overview, 2001). It is easy to
see how these functions could be useful to a management program that was specifically
focused on applications. Using this toolkit, the management application could browse a
remote log to display application messages or execute a command to restart a failed
application component.
Another interesting toolkit is the Telecommunications Management Information
Platform (TeMIP) management toolkit. This toolkit provides alarm handling, event
logging, and problem-ticket support in a heterogeneous, distributed environment. The
toolkit supports SNMP and CMIP management protocols (TeMIP OSI Management
Toolkit, 1999). This toolkit also contains utilities that would be useful to an application-
management program running in a Digital UNIX environment. Another example of a
toolkit is Firmato, a firewall-management toolkit. This toolkit, which was developed by
researchers at Bell Laboratories, included an entity-relationship model, a model definition
language, a model compiler, and a graphic firewall rule illustrator (Bartal, Mayer, Nissim,
and Wool, 1999). It might be possible and interesting to invoke the firewall rule
illustrator from the context of an application-management program.
At this time, there are few application management toolkits. Examples include the
Tivoli Module Designer and Tivoli Developer Kit for PowerBuilder. The Tivoli Module
Designer uses a graphical user interface to capture key management data about an
application including the names of the directories where the programs reside, the files that
make up the application, information on its dependencies, and information to support the
installation and removal of the application. It also includes information like the
monitors and tasks needed to support application availability and the relationships
between
components (Tivoli Module Designer, 1998). The Tivoli Developer Kit for PowerBuilder
is a developer kit specific to PowerSoft Corporation's PowerBuilder application
development program. This toolkit makes it easier for PowerBuilder application
developers to define the manageability characteristics of their applications. The outputs of
the toolkit are used to distribute, monitor, and control the customer-developed
PowerBuilder application program (Tivoli Developer Kit, 1996). There are no
other commercially available application-management toolkits.
Management Standards

Management standards that apply to application management are discussed in the
sections that follow. These standards have come from a variety of sources including
private companies like Tivoli Systems and Hewlett-Packard Company and standards
organizations like the Open Group and the Internet Engineering Task Force.
In 1996, Tivoli Systems and Hewlett-Packard Company announced an open API for
end-to-end applications management. It was called the Application Response
Measurement (ARM) API (Snell, 1997). At that time, Sobel from Hurwitz Consulting
wrote a Balanced View bulletin on the subject. In that report, Sobel (1996c) wrote that
the API was a good start, but it was limited to six basic commands and it could not
determine response time for transactions across a distributed network. In 1999, the Open
Group adopted the ARM API Version 2 as its technical standard for application
performance instrumentation (The Open Group Adopts, 1999).
The ARM API was developed to measure responsiveness of client-server applications.
Applications that are instrumented using the API make it possible to answer questions like
"is the applications working correctly?" or "how is the application performing?" This
53
capability was made possible using a shared library of function calls. It also used a
management infrastructure that includes a measurement agent and a server-based manager
that can store the collected data (Systems Management: Application Response, 1998).
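The calling pattern is simple: a transaction is named once, then bracketed with start and stop calls each time it executes. The Java sketch below mirrors the pattern of the ARM 2.0 C calls (arm_getid, arm_start, arm_stop), but the class itself is a hypothetical stand-in that prints elapsed times; it is not an implementation of the standard.

    // Hypothetical wrapper mirroring the ARM calling pattern; every name
    // here is an assumption made for illustration.
    public class ArmSketch {
        static final int GOOD = 0, FAILED = 2;
        private long startedAt;

        int getId(String txName) { return txName.hashCode(); }  // stand-in id
        int start(int txId) {
            startedAt = System.currentTimeMillis();  // response-time clock starts
            return txId;
        }
        void stop(int handle, int status) {
            long elapsed = System.currentTimeMillis() - startedAt;
            // A real measurement agent would record this; the sketch prints it.
            System.out.println("tx " + handle + " status " + status
                + " elapsed " + elapsed + "ms");
        }

        public static void main(String[] args) {
            ArmSketch arm = new ArmSketch();
            int txId = arm.getId("checkout");  // name the transaction once
            int handle = arm.start(txId);      // bracket the unit of work
            // ... the instrumented transaction would execute here ...
            arm.stop(handle, GOOD);            // clock stops; status recorded
        }
    }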
The ARM API could be implemented for a Web application. Many Web applications
have a client-server relationship with other components in the Web site and the
information collected by ARM would be useful for understanding the inter-application
response. ARM might also be useful for applications that reach out across the Web.
However, since ARM is not strong in the measurement of applications with a broad
distributed scope, the data collected might not help solve the detailed questions that often
arise when investigating response time problems.
The Common Object Request Broker Architecture (CORBA) is a computing
infrastructure that has been used by management applications. The architecture, although
not specifically targeted at management applications, is supported and promoted by the
Object Management Group (OMG). CORBA is a tool that makes it possible to realize the
benefits of object technology rapidly. CORBA automates many common programming
tasks like object registration, location and activation. It also manages error handling and
has interfaces for common facilities like object services that make it possible to link a
spreadsheet object into a report document (Schmidt, 2001). CORBA is popular with
management applications because object-oriented technology is valued by some in the
network-management community. Researchers and developers, in their search for a
productive way to use objects, turned to CORBA for help. Many implementations of
CORBA are available to be used by any application. Examples include ILU from Xerox
Parc (free implementation), Orbix from Iona (fully compliant commercial
implementation), ObjectBroker from Digital, and HP's Distributed Smalltalk (Keahey,
2000).
It is not clear that CORBA or object technology in general has any special implications
for applications management. Object technology emerged in the late 1980s as an
important new way to develop applications. Object technology can be very
complex to implement and that is why, even today, it is avoided by some in the system
management community. Awareness of CORBA and object-technology is nevertheless
important because a new application-management toolset will likely need to interact with
legacy systems that use CORBA like Tivoli's Distributed Monitoring (Gaffaney and
Carlin, 1998) or network management systems that are object-based like IBM's Resource
Object Data Manager (RODM) implementation with NetView (Finkel and Calo, 1992).
The Common Information Model (CIM) is a standard that is a work product of the
DMTF. CIM is a follow-on standard to the DMTF's earlier work that used Management
Information Files (MIFs) to define and capture the management characteristics of an
application. CIM is a conceptual model that is not tied to a particular implementation. It
allows for the interchange of management information between management system and
applications (Cover, 2000). Because of its focus on applications, CIM is potentially a
very significant standard for application management.
CIM is deliberately narrow in its focus. CIM operates in an application context that
supports six steps in the life cycle of an application. These steps are purchase, deploy,
advertise, configure, execute, and remove/uninstall. Both installation and operational data
are maintained about the application. Installation data includes information about the
product, the software features, and the elements that make it up. Operational data
includes settings, start and stop information, and information about associations. Some
CIM implementations support views that result from queries on classes that are
represented in tables (Applications and Namespaces, 2001). This capability makes it
possible to view and change the data that supports the life cycle of the application.
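The idea can be pictured as a class that carries both kinds of data through the life-cycle steps. The Java sketch below is only an illustration of the concept; CIM's real schema is far richer and is defined by the DMTF, and every name here is an assumption.

    import java.util.Date;

    public class ManagedApplication {
        // The six life-cycle steps in CIM's application context.
        enum State { PURCHASE, DEPLOY, ADVERTISE, CONFIGURE, EXECUTE, REMOVE }

        // Installation data: the product and the elements that make it up.
        String productName;
        String[] softwareElements;

        // Operational data: settings and start/stop information.
        State state = State.PURCHASE;
        Date lastStarted;

        void start() {
            state = State.EXECUTE;
            lastStarted = new Date();
        }
    }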
CIM is designed in a way that makes immediate implementation possible. Microsoft
has an implementation that is based on CIM version 2 and is intended to represent the
state of the local environment (Applications and Namespaces, 2001). Intel's Wired for
Management (WfM) initiative includes CIM as a key tool of its asset-management
approach. WfM's baseline version 2.0 specifies that a server system must support CIM,
DMI or SNMP. It considers all three as key management information frameworks
(Overview of Wired, 2000). CIM is a key part of other products like Manage.Com's
FrontLine Manager. FrontLine Manager uses CIM to support functions like automatic
discovery and location. FrontLine Manager also diagnoses and corrects problems like
electronic-commerce transaction bottlenecks (Horwitt, 2000).
The Lightweight Directory Access Protocol (LDAP) is rooted in an overall directory
service called X.500. X.500 is an OSI entity that consists of a namespace and protocol for
querying and updating it. The protocol is called the Directory Access Protocol (DAP). DAP
requires the OSI protocol stack, which results in a rather large client due to the richness of
the implementation. LDAP is both an information model and protocol and it runs directly
over the TCP/IP protocol stack (Hodges, 2000). LDAP contains strongly typed and
structured information that can be provided in a highly distributed manner. Its core
schema is fixed and usually controls the directory hierarchy, whereas the schema for individual
objects is highly extensible. In addition, LDAP is powerful because it can be integrated
with other technologies like relational databases (Kille, 1998). An applications-
management toolset should be aware of LDAP since it is being used as a part of more and
more Web sites.
Ensuring the availability of LDAP services and namespaces should be the task of a
management toolset. In addition, the toolset itself might consider the use of LDAP in the
same way a management tool like NetView uses a RDBMS to support its management
information repository. LDAP is not a RDBMS, but it is suited for high-performance
access to hierarchical data. LDAPv3 is specifically targeted at management applications
and browser applications that provide read/write access to directories. LDAPv3 is
designed to provide key function while not incurring the resource requirements of the
X.500 Directory Access Protocol (Wahl, Howes, & Kille, 1997).
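A Java-based toolset could reach an LDAP directory through JNDI, the directory interface that has shipped with the standard Java platform since version 1.3. The sketch below searches a directory for the entry describing a managed host; the server URL, search base, and filter are assumptions made for illustration.

    import java.util.Hashtable;
    import javax.naming.Context;
    import javax.naming.NamingEnumeration;
    import javax.naming.directory.DirContext;
    import javax.naming.directory.InitialDirContext;
    import javax.naming.directory.SearchControls;
    import javax.naming.directory.SearchResult;

    public class LdapLookup {
        public static void main(String[] args) throws Exception {
            Hashtable<String, String> env = new Hashtable<>();
            env.put(Context.INITIAL_CONTEXT_FACTORY,
                    "com.sun.jndi.ldap.LdapCtxFactory");
            env.put(Context.PROVIDER_URL, "ldap://directory.example.com:389");

            DirContext ctx = new InitialDirContext(env);
            SearchControls controls = new SearchControls();
            controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

            // Find the entry for a managed host anywhere under the base DN.
            NamingEnumeration<SearchResult> results =
                ctx.search("ou=hosts,o=example", "(cn=webserver1)", controls);
            while (results.hasMore()) {
                System.out.println(results.next().getNameInNamespace());
            }
            ctx.close();
        }
    }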
The Reference Model for Open Distributed Processing (RM-ODP) is a standard that is
a joint effort of ISO and ITU-T. This standard serves as a framework for the specification
for various aspects of an open distributed system that is useful for systems management.
Related to RM-ODP is a specification language called Viewpoint. The concepts and
structures of Viewpoint support five areas of interest for enterprise modelers. These
include enterprise, informational, computational, engineering, and technology views
(Enterprise Distributed Computing, 2000).
RM-ODP and Viewpoint are useful for application management because the standard
has the flexibility required to handle the complexities of managing applications. RM-ODP
and Viewpoint were used by Neumair (1998) to provide an umbrella-management
approach to complex systems, networks, and applications. Neumair used RM-ODP and
Viewpoint, as well as Generic Application Managed Objects Classes (OSI constructs) to
interface with and manage applications that executed in environments supported by
CORBA, SNMP, and OSI/TNM agents. The umbrella-management idea is very useful for
providing an overall management framework where different legacy management systems
are in use. In this situation, the alternative to the umbrella-management approach is to
convert the legacy management systems to a single standard system. In many
circumstances, this is a very costly and impractical alternative.
The International Organization for Standardization conceived and implemented the
Open System Interconnection Protocols (Open System Interconnection, 1999). There are
many Open System Interconnection (OSI) protocols that are organized in suites. An
example is the CMOT protocol suite that contains seven protocols including ISO ASCE,
ISO DIS ROSE, ISO CMIP, the lightweight presentation protocol (LPP), UDP, TCP, and
IP (Warrier, Besaw, LaBarre, & Handspicker, 1990). These protocols were developed to
facilitate multivendor equipment interoperability. They grew out of the need for
communication between different hardware and software systems even when the
underlying architectures were different. One of the ISO protocols is the Common
Management Information Protocol (CMIP). CMIP is a management protocol that is
similar to SNMP (Open Systems Interconnection, 1999). Actually, CMIP was designed to
replace SNMP by making up for its shortcomings. CMIP was designed to be a more
robust and detailed manager containing complex and sophisticated data structures with
many attributes suited to the management of diverse networks. Compared to SNMP,
CMIP is a more efficient network management system requiring less work on the part of
the user to keep updated on the status of the network (Vallillee, n.d.).
The CMIP protocol has been used as the architectural underpinnings for some
applications-management research projects. Hong, Katchabaw et al. (1995) used the OSI
management framework for monitoring, analyzing, and controlling networks and their
devices. Their scope included management of the applications running in a networked
environment. Maltinti, Mandorino, Mbeng, and Sgamma (1996) discussed a system that
was built for an Italian Public Administration department whose scope included system
functions and application services like TP monitor, file transfer, software distribution,
activity scheduler, and terminal emulator.
The Java Management Extensions are a product of Sun Microsystems and leading
companies in the management field. The extensions manage
Java applications using Java technology (Java Management Extensions Home, 1999).
The Java language was introduced by Sun Microsystems in 1994. Sun Microsystems
claimed that Java, because of its ability to embed applications (applets) into a Web page,
would make the content of Web pages alive and dynamic (Yang, Linn, & Quadrato,
1998). Since that time, the language has grown and acceptance has been widespread. The
functional capabilities of the language have grown to include internationalization, 2D
graphics, sound, JavaBeans, JDBC database access, servlets, security, and the extension
mechanism. These language capabilities are "specialized trails" in the Java Tutorial that
can be accessed on the Sun Microsystems Web site (The Java Tutorial, 1999).
Key elements of the Java Management Extensions include its architecture,
components, and APIs. JMX Architecture is organized using a three-level model. The
levels include manager, agent, and instrumentation. The JMX components work within
this architecture. The main components include a JMX manageable resource, a JMX
Agent, and a JMX Manager. The Java Management Extensions also include services for
management. These include support for polling and forwarding information between
agents and managers. APIs are also a key part of JMX. APIs are included so that there is
a standard way for Java management agents to work with existing management
technologies like SNMP, WBEM, and TMN. Also included are APIs to generate alarms
and to provide topology information (Java Management Extensions White, 1999). JMX is
a natural choice when developing a toolset to manage Java applications.
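At the instrumentation level, a resource is made manageable by pairing it with a standard MBean interface and registering it with an agent's MBean server. The Java sketch below follows that JMX convention; the object name and the attribute are chosen for illustration only.

    import javax.management.MBeanServer;
    import javax.management.MBeanServerFactory;
    import javax.management.ObjectName;

    // By JMX convention, the management interface name ends in "MBean".
    interface AppMonitorMBean {
        int getActiveSessions();
    }

    public class AppMonitor implements AppMonitorMBean {
        public int getActiveSessions() { return 42; }  // stand-in value

        public static void main(String[] args) throws Exception {
            MBeanServer server = MBeanServerFactory.createMBeanServer();
            ObjectName name = new ObjectName("toolset:type=AppMonitor");
            server.registerMBean(new AppMonitor(), name);
            // A JMX manager could now read ActiveSessions through the agent.
            System.out.println(server.getAttribute(name, "ActiveSessions"));
        }
    }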
SNMP is a network-management protocol that has implications for application
management. SNMP was developed by the Internet Engineering Task Force (IETF) that
defines itself as "a large open international community of network designers, operators,
vendors, and researchers concerned with the evolution of Internet architecture and smooth
operation of the internet" (Joining the IETF, 2000, p.1). The IETF deals largely with
network and security topics through working groups, but since 1997 has had some focus
on applications. The area directors define applications as "things that are not security
(part of the security area), nor networks (most of the other areas), but rather things that
use the networks and security services to provide things of benefit to the end-user" (The
IETF Application Area, 2000, p.1).
The IETF has had a number of efforts specific to the management of applications. Not
all of them have resulted in standards. RFC 1514, Host Resources MIB was an early
applications-management effort. This RFC did not become a standards-track document.
RFC 1697, Relational Database Management MIB, was another early effort. RFC 1697
was focused on database management, but it did not become a standards-track document
(Sturm & Bumpus, 1999). RFC 2248, Network Services Monitoring MIB, contained two
applications-management groups. As of January 1998, it was still a standards-track
document (Freed & Kille, 1998). RFC 2287, Definition of System-Level Managed
Objects for Applications, was proposed as a standard in 1998 (Krupczak & Superia,
1998). RFC 2564, Application Management MIB, was a standards track RFC that had to
do with managing applications using SNMP and a MIB that includes considerable
capabilities (Kalbfleisch, Krupczak, & Presuhn, 1999). Since 1999, the IETF has started
to move beyond SNMP as a management information structure. A recent draft, SMIng -
Next Generation Structure of Management Information, was focused on an object-
oriented data definition language for the specification of various kinds of management
information (Straus, Schoenwaelder, Braunschweig, & McCloghrie, 2001).
As early as 1995, researchers Sturm and Weinstock (1995) tested a prototype
application MIB with a variety of SNMP managers including OpenView and NetView.
Bellcore, a communications research company, was planning to introduce products that
used the MIB in 1996. Whether the MIB was adopted by the IETF or not, the
attractiveness of the Bellcore implementation was that it used standard SNMP facilities
with its own information base. This is just what hardware vendors do when they
introduce a new device to be managed--they supply a MIB and use their existing SNMP
manager. Currently, a number of products manage applications using a MIB. An example
is the SNMP MIB support for the IBM HTTP Server. This server uses three MIBs--
SNMPv2-MIB, WWW-MIB, and APACHE-MIB--to support the server software and its
related applications (SNMP MIB Support, 2001).
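To suggest what an SNMP query for management data looks like in practice, the following Java sketch reads the standard MIB-II sysDescr.0 object from an agent. It uses the open-source SNMP4J library, which postdates the products surveyed here and serves purely as an illustration; the community string and agent address are placeholders.

    import org.snmp4j.CommunityTarget;
    import org.snmp4j.PDU;
    import org.snmp4j.Snmp;
    import org.snmp4j.event.ResponseEvent;
    import org.snmp4j.mp.SnmpConstants;
    import org.snmp4j.smi.GenericAddress;
    import org.snmp4j.smi.OID;
    import org.snmp4j.smi.OctetString;
    import org.snmp4j.smi.VariableBinding;
    import org.snmp4j.transport.DefaultUdpTransportMapping;

    public class SnmpGetSketch {
        public static void main(String[] args) throws Exception {
            Snmp snmp = new Snmp(new DefaultUdpTransportMapping());
            snmp.listen();

            // Placeholder community and address for a real managed node.
            CommunityTarget target = new CommunityTarget();
            target.setCommunity(new OctetString("public"));
            target.setAddress(GenericAddress.parse("udp:127.0.0.1/161"));
            target.setVersion(SnmpConstants.version2c);
            target.setRetries(1);
            target.setTimeout(2000);

            // sysDescr.0 from the standard MIB-II system group; an application
            // MIB variable would be queried the same way with a different OID.
            PDU pdu = new PDU();
            pdu.add(new VariableBinding(new OID("1.3.6.1.2.1.1.1.0")));
            pdu.setType(PDU.GET);

            ResponseEvent response = snmp.get(pdu, target);
            if (response.getResponse() != null) {
                System.out.println(response.getResponse().get(0).getVariable());
            }
            snmp.close();
        }
    }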
The original Portable Operating System Interface (POSIX) standard was published in
1986 and was actually called IEEE-IX, a name that reflected the strong UNIX influence
on the standard. The convention at the time for naming UNIX-related software was to
have the name end in an X like HPUX, AIX, and PNX (The Portable Application
Standards Committee, 2000). The base of POSIX standards contains over 70 documents in
various states from Project Authorization Request (PAR) approval to final IEEE
approved. The scope of these standards is application service interfaces and includes
documents covering system interfaces, real-time extensions, threads, security, protocol
independent interfaces, fault tolerance, checkpoint/restart, tracing, and utilities. This is a
small subset of the POSIX titles.
POSIX standard 1387.2 (Information Technology - Portable, 1995) is an important
applications-management specification. This standard, approved in March of 1997, had a
scope that included a standard layout for software, a definition for information about
installed software, and a standard set of commands for manipulating software (DCE-RPC
Interoperability, 1997). POSIX 1387.2 makes possible the orderly and automated
management of software. This powerful standard has found acceptance in the marketplace
as it is not hard to find products that advertise 1387.2 compliance. Software Distributor
from HP (Software Distributor, 2001) and SysMan Software Manager from DEC
(Overview and Installation, 2001) are two examples of products that proudly announce
their acceptance of this Open Group specification.
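HP's Software Distributor exposes the 1387.2 command set through utilities such as swlist, which reports installed software. As a rough sketch of how a management tool might gather that inventory, the following Java fragment invokes the command and reads its output; it assumes an HP-UX host with the SD-UX tools on the path, and the -l product option (which lists software at the product level) reflects HP's implementation.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    public class SwlistSketch {
        public static void main(String[] args) throws Exception {
            // List installed software at the product level.
            Process p = Runtime.getRuntime().exec(
                    new String[] {"swlist", "-l", "product"});
            BufferedReader out = new BufferedReader(
                    new InputStreamReader(p.getInputStream()));
            String line;
            while ((line = out.readLine()) != null) {
                System.out.println(line); // one installed product per line
            }
            p.waitFor();
        }
    }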
The Tivoli Application Management Specification (AMS) provided a way to specify
the information about an application that is required for its management. The data
was in a standard format that was machine-readable and supported a number of life-cycle
related tasks including application distribution and installation. It also supported
monitoring and operations control of the application including support for the
visualization of application component relationships (Applications Management
Specification, 1997). AMS had support from several Tivoli products like Distributed
Monitoring and Software Distribution and provided a toolkit called Tivoli Module
Designer that was used to build the definition files (Tivoli Module Designer, 1998). AMS
is less important than it was just a few years ago as it has failed to get support from major
software development companies other than Tivoli. Tivoli's own products never fully
embraced AMS, and unlike the Application Response Measurement standard, which was
adopted by the Open Group, AMS has not been adopted by any standards group.
Web-Based Enterprise Management (WBEM) is an initiative of the Distributed
Management Task Force to define a set of management and Internet standard technologies
to make it easier to manage computing environments in the enterprise (WBEM Initiative,
2001). Specifically, WBEM is an open industry standard for enterprise-wide systems
management that is rooted in existing Web technology. Its goal is to deliver management
functionality for systems, networks, and applications independent of protocol or supporting
management framework. WBEM has three basic components--schemata, instrumentation,
and clients. The WBEM schemata are provided by the Common Information Model (CIM). The instrumentation can take
many forms. Typically, the data model is populated by software agents provided by
vendors like BMC, Cisco, Intel, and Microsoft. The clients provide the console capability
that is needed to manage the applications. Typically, the management data is displayed
using a Web page (Spuler, 2000).
WBEM appears to have a good chance of providing a significant management base to
support applications. It has a schema that is specific to applications, as well as systems
and networks, and it has the ability to utilize data from legacy sources like SNMP and
ARM. It also leverages existing Web technology for the presentation of data. Finally,
there is considerable vendor support with marketplace products from BMC Software,
Cisco Systems, Compaq Computer Corporation, IBM Corporation, Intel Corporation and
Microsoft Corporation.
Management Information Repository
A MIR provides database support for management applications and supports their
integration into a single management environment (Martin, 1996). The MIR is the heart of
the management system. The management data support can involve the use of a database
management system like Oracle or DB2 or can be provided by one or more simple files.
The database management systems used are often relational and sometimes tied to a
specific product implementation. Such was the case for LAN Network Manager for OS/2
and IBM's DATABASE 2 (DB2) product (LAN Network Manager, 1997). A MIR can also be
called a Management Database or a Management Information File.
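To make the MIR idea concrete, the following JDBC sketch creates and queries a simple event table. The table name, columns, and connection URL are invented for illustration and are not taken from any product discussed here; the event-log shape is loosely modeled on the logs such products keep.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class MirSketch {
        public static void main(String[] args) throws Exception {
            // Any JDBC-accessible database will do; this URL is a placeholder.
            Connection con = DriverManager.getConnection(
                    "jdbc:yourdb:mir", "user", "password");

            // A minimal event table in the repository.
            Statement s = con.createStatement();
            s.executeUpdate("CREATE TABLE event_log (event_id INTEGER, " +
                    "source VARCHAR(64), severity INTEGER, message VARCHAR(255))");

            // An agent records an event.
            PreparedStatement insert = con.prepareStatement(
                    "INSERT INTO event_log VALUES (?, ?, ?, ?)");
            insert.setInt(1, 1);
            insert.setString(2, "webserver01");
            insert.setInt(3, 3);
            insert.setString(4, "HTTP server restarted");
            insert.executeUpdate();

            // A management application queries for high-severity events.
            ResultSet rs = s.executeQuery(
                    "SELECT source, message FROM event_log WHERE severity >= 3");
            while (rs.next()) {
                System.out.println(rs.getString(1) + ": " + rs.getString(2));
            }
            con.close();
        }
    }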
Management Databases often have specific names like Generic Topology Database.
They can also have generic names like Object Database. The specific and generic names
can be combined in the same product to name different physical components of the
Management Database. This is the case for the Generic Topology and Object Database
examples used with the LAN Management Utilities product (AIX LAN Management,
1995). The Management Information File (MIF) is an open standard from the DMTF.
The DMTF MIF has a specific format that can be used by any vendor, thereby encouraging
ease of integration between different management products (Tivoli Module Builder,
1998).
The CORDS project at Queen’s University Database Systems Laboratory had a
research focus on MIRs. The main theme of the research conducted by the database group
was efficient access to distributed data. The research had previously covered various
aspects of this problem including remote procedure calls, distributed full text retrieval,
distributed query processing, and multi-database systems. CORDS research projects
included Management Information Repository for Distributed Applications Management,
Data Warehouse for Distributed Applications and Systems Management, Networked
Multimedia Systems, Dynamic Tuning of DBMSs, and WWW-CM: Querying the Web
using a Conceptual Model (Queen’s University Database, n.d.).
Bauer, Finnigan et al. (1994) defined a distributed-systems management architecture
with four sets of functional components (see Table 2). The functions used tools such as
fault viewing programs, provided services such as configuration management, leveraged
management agents that had monitoring and control capabilities, and managed resources
such as applications using object data stored in a MIR.
Table 2. Reference Architecture Components for CORDS Project

Function | Consists of
Operation and management tools | Configuration, performance, fault, modeling and simulation, report generation, and visualization
Management services | Configuration, monitoring, control, and MIR (X.500, databases, files, etc.) subsystems
Management agents | System and network layers with monitoring and control capabilities
Managed resources | Managed objects, for example, servers and applications
The MIR was an important subsystem of the CORDS architecture. The repository
dealt with static and dynamic management information. Static information included
SNMP or CMIP data whereas dynamic information included performance information or
fault (events or traps) data. The repository integrated this data and information to provide
meaningful support to the management applications.
MIRs have often been discussed in the context of the Managing Distributed
Applications and Systems (MANDAS) project. The MANDAS project was documented
on the Web (Queen’s University MANDAS, n.d.) and in a number of detailed articles.
Bauer et al. (1997) identified the MIR as an important part of the management-services
component, where it is called the repository subsystem. The repository subsystem was
explained in the context of an overall system that contained management applications,
management services, management agents, and managed objects.
The focus of the MIR work for MANDAS was understanding the requirements of
management applications, including configuration, performance, fault, and modeling, and
building prototypes that exploit current technology to address those needs. The MIR
work in MANDAS was focused on the integration of data that is needed by the
management application. Implementation of management services was also important in
this work.
MIRs are discussed by software vendors when they are describing the components of
their products. They are also discussed as a feature of their systems as data collected in
MIRs can be used to generate reports. A good example is Digital Equipment Corp.’s
Polycenter Framework (Muller, 1998). Polycenter’s MIR is an object-oriented database
system that contains real-time status and performance information. The Polycenter MIR
supports a large number of applications including Network Management, Storage
Management, Configuration/Change Management, Fault/Problem Management,
Performance/Capacity Management, Automation, Security Management, and
Accounting/Billing Management. Digital’s TeMIP is another example of a product where
the MIR is discussed by the vendor. TeMIP provides a platform for the integrated
management of heterogeneous networks. Its architecture allows the platform to support,
simultaneously or separately, the element management, network management and service
management dimensions of a Telecommunications Management Network. The TeMIP
MIR is a component of this architecture called common and basic services. Common and
basic services include security services, TeMIP name service, data dictionary (a metadata
repository), management information repository, and distributed event forwarding
(TeMIP OSS Framework, 2001).
Product information manuals are a great source of information about a given product’s
implementation of the MIR. The LAN Network Manager for OS/2 Reference (1997) is a
detailed reference that contains a chapter on the LAN Network Manager database. All
thirty-two product tables are explained. The NetView Database Guide (1997) discusses
the relational database that supports TME 10 NetView and its network-management
applications. Fifteen database tables are used by the product. AIX LAN Management
Utilities (1995) describes various aspects of this utility and its use of a MIR. The AIX
LAN Management Utilities product supports configuration, performance, and fault data.
Most standards efforts in network, systems, or application management define a MIR.
The Applications Management Specification, which is focused on the emerging discipline
of Applications Management, defines a way to capture, in one place, information about an
application that is useful in managing its deployment, availability, and change. AMS
captures information about the application in files called Application Definition Files
(ADFs) that use a MIF format (Applications Management Specification, 1995). These
files are placed in a MIR that is used by a number of different management applications.
Recently, the DMTF announced the introduction of the Common Information Model. The
main idea of the Model is to help with the exchange of management information between the
management applications and the resources they are managing. CIM is organized in such
a way that the managed environment can be viewed as a collection of interrelated systems.
CIM data is defined and stored in objects (Thompson & Sweitzer, 1997).
The Open Systems Interconnection (OSI) management framework supports the use of a
MIR. With OSI, a set of managed objects within a system, together with their attributes,
constitutes a system’s management information base (Hong, Katchabaw et al., 1995).
SNMP is a management system that uses a MIB. Tschichholz, Hall, Abeck, and Wies
(1995) pointed out that SNMP has a much greater market share as compared to OSI
implementations like CMIP because SNMP is easier to implement. The MIB is stored
locally as a file that can be queried and updated through a simple programming interface.
The SNMP MIB can be adapted for a variety of uses. Sturm and Weinstock (1995) made a
convincing case for using the SNMP MIB for applications-management uses. In their
implementation, the MIB would focus on information like installed unit, installed process,
distributed application, configured unit, realizable process, business function, process,
files, and mailbox variables.
The MIR implementations of a number of products are now discussed. A summary of
the products can be found in Table 3. The management applications were chosen because
they offered contrast with other tools. Some implementations are simple, whereas others
are complex. Some tools have many database tables, others few; some have no tables at all.
The products that do not use relational databases use a variety of other file
types, including simple sequential files and indexed files.
Table 3. MIR Product Implementations Discussed

Product | Management Focus | Platform | MIR Implementation
HP OpenView | Network, server, client, and peripheral management | Workstation, mid-range | Management database with open data model
LAN Network Manager | Network management and problem determination aid for local-area networks | Workstation, mid-range | Relational tables using DB/2
LAN Management Utilities | Monitor and manage IP, IPX, and NetBios devices including problem determination and event processing | Workstation, mid-range | Proprietary database
TME 10 NetView for UNIX | Network management | Mid-range | Relational database tables
TME 10 NetView for OS/390 | Network and system management | Large systems, mid-range, and workstation | Sequential files, keyed files, object-oriented data cache backed by DASD
Solstice Enterprise Agents | SNMP for DMI-based management applications | Independent of any specific operating system, hardware platform, or management protocol | MIF database with install/delete services and notification to all registered applications
HP OpenView is a tool for network, server, client, and peripheral management. HP
OpenView supports a variety of platforms including HP-UX, Sun Solaris, Microsoft
Windows 95 and Windows NT systems. The common services provided by HP
OpenView include a user interface, event management, discovery, management database
(storage of network data), communications infrastructure, integration services, and node
management. The management database is a central repository for storing network-
management data. HP OpenView uses a MIR that is open, not proprietary, thus allowing
users to utilize the database of their choice. The database holds real-time and historical
data. The real-time data supports availability management of the systems whereas the
historical data can be used to graph and analyze information and produce reports
(Network Management, 2001).
LAN Network Manager is a network management and problem determination aid for
local-area networks. The MIR implementation for this product consists of thirty-two
tables that contain information on alerts, events, and network resources like bridges, rings,
and controlled access units. The tables are designed to serve different roles. Some tables,
like the Alert Cause Text Table and Alert Filters Table, contain static data. For the static
tables, some of the data is supplied by the software manufacturer whereas other data is
defined when the management software is configured. Other tables contain dynamic data
like the Event Log Table and Bridge Performance Data. In the case of the Event Log
Table, one entry is created for each event that is generated in the system. There are many
event sources in a system including bridges, routers, and hubs. Application programs can
also be sources for events.
There are many sources of performance data in the system. LAN Network Manager
tables are used to capture this data. The chief performance-related tables are the Bridge
Performance Table, Multiport Bridge Performance Table, and Ring Performance Tables 1
& 2. LAN Network Manager has many commands that interact with the MIR. There are
a number of event-related commands that work with the Event Log Table. Event Delete is
used to remove unwanted events. To set up filters to eliminate the recording of certain
kinds of events, several commands are supplied, including log filter add, delete, list, query,
and set (LAN Network Manager, 1997).
LAN Management Utilities (LMU) is a tool to monitor and manage IP, IPX, and
NetBios devices from a single workstation. Problem determination and event processing
are also centralized with this product. The MIR implementation for this product has three
parts. The components are an object database, a topology database, and a MIB. The
object database contains global object information used by the graphical user interface.
The topology database, called Generic Topology Database, stores LMU topology
information, as well as information about submap groupings and content. The XXMAP
application queries both this database and the object database. The MIB database is
called the LMU Subagent MIB. This collection of management information contains
system configuration data, performance data, and PF2 data. PF2 data is data collected by
the System Performance Monitor/2 product (AIX LAN Management, 1995).
TME 10 Global Enterprise Manager (GEM) is a tool to monitor and manage
applications and business systems. The MIR for GEM is a collection of files in DMTF
MIF format that conform to AMS. AMS is an open standard that defines the management
characteristics of applications. This information in the MIF is used by the management
tool to monitor and operate the application system. GEM has many utilities that
contribute information to the MIR. Module Builder is a tool that is used to create AMS-
based management files. One important management file is the Component Description
File (CDF) that contains information about the components that make up the system to be
managed, like IP hosts, daemons, and routers. Other management files are used to show
the content of and relationships between the business components.
Figure 8 indicates the main sources necessary to build and use the GEM MIR. Two
main utilities are used to create files in the MIR. Module Builder and Module Designer
are used to create application definition files and executable instrumentation. Both
application definition files and executables can be built manually without the use of the
Builder and Designer utilities.
_______________________________________________________________________
Figure 8. Builders and users of the GEM MIR
________________________________________________________________________
The files in the MIR are used by a variety of management applications. The GEM
Server is used to build business system views that monitor the availability of applications.
Software distribution is used to distribute applications to target servers and clients.
Operational tasks are used to manage applications, for example, start, stop, backup, and
recover an application and its key components. Distributed monitoring is used to
proactively monitor server and client resources like CPU utilization, file-system
utilization, and memory usage (Gaffaney & Carlin, 1998).
The NetView product uses a relational database as its MIR. NetView supports four
commercial database products: DB2/6000, Informix, Oracle, and Sybase.
The relationship between NetView, its sources of data, and its database is shown in Figure
9.
________________________________________________________________________

[Figure body not reproduced: NetView's IP Topology, trapd log, and snmpcollect data feed a relational database--DB2/6000, Informix, Oracle, or Sybase.]

Figure 9. NetView, sources of data, and its relational database support
________________________________________________________________________
NetView stores three kinds of data in 13 tables (see Table 4). The main types of data
are Internet Protocol (IP) Topology, Trapd Log, and SNMP Collect. IP Topology data
covers a variety of aspects of the network such as information about each network and
segment that is managed by NetView. Trapd Log data is typically exception information
such as a node down alert. SNMP Collect data pertains to the MIB variables that are
managed and controlled by NetView. The tables are created using utilities supplied with
the product. The product also supplies commands that are used to create reports.
Table 4. NetView Data Including Type, Table Name, and Description

Type of data | Table name | Description
IP Topology | Topoinfo | Summary information about the entire IP topology.
IP Topology | Networkclass | Information about each network in the IP topology.
IP Topology | Segmentclass | Information about each segment in the IP topology.
IP Topology | Nodeclass | Information about each node in the IP topology.
IP Topology | Interfaceclass | Information about each interface in the IP topology.
IP Topology | Objecttable | Information about the objects of the network, segment, node, and interface classes.
IP Topology | Classtable | Information about each objectclass.
IP Topology | Memberof | Information about objects in a one-way member-of relationship.
IP Topology | Coupledwith | Information about objects in a two-way coupled-with relationship.
Trapd Log | Trapdlog | Describes the types of information found in the trapdlog table.
SNMP Collect | ColData | Information about data collection activities.
SNMP Collect | Varinfo | Information about MIB variable data.
SNMP Collect | Expinfo | Information about MIB expression data.
Procedures are delivered with the product to work with IP Topology, Trapdlog, and
snmpCollect data. Procedures are used to diagnose problems, search for data, and delete
data (NetView Database Guide, 1997). These procedures are written in a task-oriented
style that makes it easier for system administrators to perform activities quickly.
TME 10 NetView for OS/390 is a comprehensive product for network and systems
management. The product was created in 1989 by combining and enhancing a number of
IBM network and systems-management offerings. Partly due to its history, NetView for
OS/390 has a complex and comprehensive set of files (called Legacy Repository in Figure
10) that make up its MIR. Its files utilized a wide variety of access methods without the
use of a DBMS. The NetView for OS/390 files are depicted in Figure 10.
________________________________________________________________________

[Figure body not reproduced: NetView for OS/390 with initialization parameters, a legacy repository (session history, event history, message log, and support files), and a high-speed data cache.]

Figure 10. The NetView OS/390 MIR
________________________________________________________________________
Parts of the NetView MIR are legacy components. These files are used to store data
like session history and event history information. Messages, in the form of logs, are also
stored in files (NetView for OS/390, 1997). Recently, NetView has been enhanced with a
high-speed cache. This enhancement to the NetView MIR is called the Resource Object
Data Manager (RODM). RODM is a high-speed, in-memory repository that contains
information about resources that is used to support automated actions and to support
graphical views. These views are used to manage the availability of resources and to start,
stop, and recover them. RODM is an object-oriented system (Finkel & Calo, 1992).
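The following toy Java sketch illustrates the warm-start idea behind a RODM-like repository, though not RODM's actual interfaces: objects live in an in-memory map, reach disk only on an explicit commit, and a restart reloads the committed file instead of rediscovering state.

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;
    import java.util.HashMap;

    public class InMemoryRepository {
        private HashMap<String, Serializable> objects = new HashMap<String, Serializable>();
        private final File store;

        public InMemoryRepository(File store) throws Exception {
            this.store = store;
            if (store.exists()) { // warm start: reload the committed state
                ObjectInputStream in = new ObjectInputStream(new FileInputStream(store));
                objects = (HashMap<String, Serializable>) in.readObject();
                in.close();
            }
        }

        // Reads and writes touch only memory, never the disk.
        public void put(String name, Serializable value) { objects.put(name, value); }
        public Serializable get(String name) { return objects.get(name); }

        // State reaches disk only on an explicit commit request.
        public void commit() throws Exception {
            ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(store));
            out.writeObject(objects);
            out.close();
        }
    }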
Solstice Enterprise Agents is a tool that works as part of a system that includes
management applications, subagents, and a MIF database. The relationship between the
components is shown in Figure 11.
________________________________________________________________________

[Figure body not reproduced: a DMI-based management application, an SNMP master agent with SNMP subagents, and a DMI mapper and subagent connected to the MIF database.]

Figure 11. Solstice Enterprise Agents and other components
________________________________________________________________________
The DMI-based Management Application is used to display network topology and to
take management actions against network resources. The SNMP Master Agent is a
process on a node that exchanges protocol messages with managers and its subagents to
monitor resources. The Mapper and DMI Subagent use the Desktop Management
Interface (DMI) to interface with the Management Application, Master Agent, and MIF
database. The MIF database is associated with each Mapper and DMI Subagent as these
components contain a function called the Service Provider (SP). The SP controls all
access to the MIF database (Solstice Enterprise Manager 2.1, 1997).
Most of the product publications contain some information on maintaining or
improving MIR performance. NetView for UNIX contains a chapter on performance. In
addition to explaining how to increase table size, the chapter explains the importance of
updating table statistics to improve data retrieval from the database (NetView Database
Guide, 1997). LAN Network Manager contains more information on improving database
performance including reorganizing data, optimizing the DB2/2 configuration, isolating
database log files, avoiding database maintenance when LAN Network Manager is
running, and backing up the database (LAN Network Manager, 1997).
Improving performance is not the same as having an architecture that is high-speed
and in-memory as part of its initial design. RODM has several features that make it
unique among MIRs. RODM runs as a privileged OS/390 subsystem, keeps its objects
and classes stored in data spaces, and does not commit all changes to disk. A commit
request is supported, and this capability makes possible a warm start that is much quicker
than a cold-start operation (Finkel & Calo, 1992). Mohan, Pirahesh, Tang, and Wang
(1994) discussed parallelism, an important issue when considering the performance
needs of large MIRs.
MIRs are an interesting area of study in systems, network, and application
management. MIRs were once proprietary and are now evolving toward open systems embracing
relational technology. Some MIRs are object oriented in their structure and exploit high-
speed, in-memory implementations. The schema and data for this toolset's MIR will be the
heart of the Web application management toolset.
Classes of Products
A four-point classification system was developed by Sturm and Bumpus (1999) for
understanding the depth and functionality of existing applications-management products.
The classification system includes:
1. Point products - perform a specialized function
2. Targeted products - broader than point, but focused at a specific environment
3. General solutions - broad stand-alone or integrated product suite
4. Framework solutions - platform and components
This classification system offers a real-world mechanism to discuss the existing
products. However, software vendors have somewhat different models. OpenVision
Technologies has a strategy that segments its products into four tiers. Their point product
is called PointSolutions. The other three classifications are product suites
(SuiteSolutions); products designed for specific third-party business applications
(SolutionsPlus); and combined products and services offerings for database, network, and
systems management (QuickConnect Services). Professional services are often a
component of the highest tier offerings (OpenVision Tech Unveil, 1994). A survey of
applications-management products follows that is based on a review of the literature and a
detailed examination of specific products.
Point solutions or products are discussed extensively in the literature. The term point
product is widely accepted. Richardson (1998) discussed Picture Taker, a product from
Lanovation that takes a snapshot of a Windows 95 or NT system's setup. From this we
learn that point products perform a specific and often narrow function. Another point
solution from the literature is Full Armor software. This product prevents desktop users
from changing settings, installing software, and deleting components (Mazurek, 1998).
Both of these products could fit into the change dimension of a broader applications
management strategy. Change control or management is a vital applications management
functional perspective.
Point solutions are available that support other functional perspectives like performance
and configuration. Allot Communication's AC 2000 product is a network performance
tool that helps Web managers manage expensive wide-area network bandwidth. This
point product can be configured to control traffic by source address, destination address,
time-of-day, access control, or class of service (Anderson & James, 1998). This
capability could be exploited to give preferred service to certain application servers,
thereby ensuring better performance for that application. Another point product in this
category is the offering from Copper Mountain Networks and Xedia Corporation. This
network performance offering is a tool that can be used to offer improved performance to
Internet applications through a technology called Class-Based Queuing (CBQ). This
technology makes it possible to allocate and prioritize bandwidth according to subnet, IP
address, port number or URL. This capability could be used to assist in meeting service
level agreements (DSL: Copper Mountain, 1998).
Configuration management is a great challenge and one that is costly in time and
effort. In Ready, set, deploy! (Sturdevant, 1999) the author detailed how GTE
Internetworking deployed ON Technology's Comprehensive Client Manager product to
automate the entire life cycle of PC software. With that point product, GTE is able to
remotely configure and manage a variety of desktop and laptop systems. They use it to
install the operating system and application software and to manage the software through
its life cycle. This tool provided direct application support and could be a key part of the
configuration component of an applications-management strategy.
Targeted products are broader in scope than point products, but focused at a specific
environment. A targeted product might focus on the end-to-end management of SAP R/3
or the Windows operating system. NetIQ's AppManager Suite is a tool to proactively
manage the performance and availability of Windows NT and 2000 systems. The
product has a central console that can be used to monitor many components of these
systems ranging from physical hardware to applications like Microsoft Exchange, Citrix
WinFrame, Lotus Domino, and Oracle (NetIQ AppManager Suite Overview, 2001). In
addition to the central console, the architecture of the product includes a Repository,
Manager Server, Agent, and Web Management Server (NetIQ AppManager Suite
Architecture, 2001).
Patrol for SAP R/3 is a product from BMC Software. This product manages SAP with
four key components including a knowledge module, a console, agents, and an event
manager. The product's knowledge module monitors critical processes like dialog, batch,
enqueue, update, and spool. It does this using a server profile that makes it possible to
automatically discover all processes defined on the SAP server. The product also
monitors and manages database servers, CCMS alerts, response time, and users (BMC
Solutions, 2001).
General solutions are broad stand-alone products or integrated product suites. Many
product suites in this category are close to the framework products in approach and
marketing. Platinum Technology's ProVision was labeled "The Unframework" by
Information Week (Gallagher, 1998) because its products share a common set of services
yet each component can stand on its own. Its products are like a series of point solutions,
yet they combine to create an integrated solution. ProVision is a general-purpose network
and systems-management tool. It handles security administration, help desk, desktop
configuration management, database, and application management (Platinum Technology
Emerges, 1998). As if to cover all possibilities, ProVision integrates with the framework
solutions including Hewlett-Packard's OpenView and Tivoli Systems' TME 10
(Gallagher, 1998). At least one publication, Network Computing, named ProVision a
finalist in its enterprise systems management framework category, noting the addition of a
common graphical user interface (Boardman, 1999).
BMC's Patrol is another example of a general solution. BMC's approach with Patrol
was to create knowledge modules that specialize in the monitoring and management of a
specific area like network management or SAP systems. The knowledge modules
leverage the basic Patrol architecture that includes a common user interface and an agent-
manager structure. Knowledge modules (sometimes called Solutions) were available for
network products like Cabletron hubs, ERP products like SAP and PeopleSoft, servers
like Digital Equipment's AlphaServers, and databases like Informix, Sybase, and Adabas
(Patrol Enterprise Manager, 2001). The wide variety and depth of the knowledge modules
make Patrol an attractive tool to manage key technology areas.
The systems management field is changing. Nash (1999) indicated that IT
organizations have a choice of implementing solutions that use point products or
framework-based offerings; however, there are fewer vendors to choose from these days, as
there has been considerable consolidation in the industry. He also points out that new
technologies like the Internet are driving the need for organizations to offer their users
complete solutions. This drive favors the suppliers of framework-based products.
However, the situation is not completely favorable for these software companies.
Boardman (1999) commented on recent tests that his lab completed on both framework
and point products. Regarding the framework products, he said, "the cost in equipment
and human resources to implement these systems is still daunting" (Boardman, 1999,
p.26). He also mentioned other challenges with the implementation of framework
products including the need to abandon familiar and well-liked tools.
The framework-based products provide management functions that make use of a
functional framework. What is a functional framework? Bauer, Coburn et al. (1994,
p.405) defined it as "the definition and organization of logical services and functions that
satisfy a set of requirements for a system". What services and functions make up the
frameworks of a typical systems management software product?
The Computer Associates product Unicenter TNG is a good one to examine and
describe. It contains common services and cross-platform support. The common services
are a common GUI, object repository, distributed services, communication facilities, and
event services. The cross-platform support includes Novell NetWare, IBM OS/400, MVS,
HP/UX, Sun Solaris, IBM AIX, Digital UNIX, ICL Unixware, Sequent Dynix/PTX, DG-
UX, NCR MP-RAS, SCO Unixware, SGI Irix, Tandem/NSK, Digital OpenVMS,
Windows NT, Java, Linux and other platforms. Other technologies include a cross-
enterprise calendar, virus detection, reporting, hands-free management, and partnerships
with software, hardware, and services organizations (Karpowski, 1999). This functional
framework provides services that the management applications use. For example, the
event viewing application uses the same GUI services as the network management
application. This not only reduces the development and maintenance cost of the
application for the vendor, but also makes the product easier to use because there is one
graphical user interface convention.
What functional perspectives can you expect a framework-based management solution
to support? Using Unicenter as an example, the scope is broad. Unicenter has support for
traditional disciplines like database, network, security, operations, help desk, storage, and
desktop and server management. It also supports applications, Internet, and real world
management. Real world management includes the management of devices like vending
machines, vehicle fleets, and environmental control systems (Karpowski, 1999). What are
some of the other important framework-based products and how are they organized?
Table 5 contains summary information for five important framework products
including Hewlett Packard's OpenView, Solstice Enterprise Manager, Spectrum
Enterprise Manager, Tivoli Management Software, and Computer Associates' Unicenter
TNG.
Table 5. Summary of Five Framework Products

Product | Characteristics of framework | Support for functional perspectives
OpenView | Data collection and action execution, management data, service-oriented management applications, and Web-based GUI | Service desk, SLA, change, asset, storage, network, performance, desktop management, software distribution, network analysis, and administration
Solstice | Distributed applications, management information services, management communication infrastructure, and management protocol adapters | Availability and event/alarm management
Spectrum | System structure with client-server components, knowledge base, application programming interface, device communication manager, and model-type editor | Availability, performance, fault, and usage; by extension, facilities management, factory automation, and software management (OS, database, and application software)
Tivoli Management Software | Graphical user interface, command-line interface, communication service, databases, installation service, and application services | Asset, availability, change, network, operations, security, service, storage management, medium-sized businesses, small to medium sized businesses, e-business, and OS/390
Unicenter TNG | Common services and cross-platform support | Database, network, security, operations, help desk, storage, desktop and server, applications, Internet, and real world management
OpenView is a key component of Hewlett Packard's service management initiative.
The software, along with professional services, systems integrators, developer partners,
and outsourcing providers is part of a strategy that focuses on technology, processes, and
people. The OpenView framework consists of data collection and action execution,
management data, service-oriented management applications, and Web-based GUI. The
framework is used to manage networks, servers, desktops, database, and applications.
The management approach is end-to-end management of technology. The framework
supports three levels of abstraction--element, task, and service. The framework also
works within a service-management life cycle that includes commit, deploy, and operate
dimensions. This framework supports management applications that support 11
functional perspectives. These include service desk, SLA, change, asset, storage,
network, performance, and desktop management, software distribution, network analysis,
and administration (HP OpenView Directions, 1998).
Like Spectrum, Solstice is a sophisticated network-management tool that can be
extended to other disciplines. Its framework has four components--distributed
applications, distributed management information services, management communication
infrastructure, and management protocol adapters. It supports the availability and
event/alarm management functional perspectives (Solstice Enterprise Manager 2.1, 1997).
Spectrum is primarily advanced network-management software (Spectrum Enterprise
Manager, 1998). Additionally, Spectrum can be extended to provide broader functional
support. Spectrum is a client-server system that includes a knowledge base. This
knowledge base is object-oriented and is built upon a tool called db_VISTA from Raima
Corporation. Spectrum has an application-programming interface that can be used to
broaden the support for the software beyond its network focus. Other key components
include a device communication manager, model type editor, and programming language
that can be used to create reports. The out-of-the-box functional scope of the product is
somewhat narrow. It includes support for availability, performance, fault, and resource
usage functional perspectives. The product publications point out that the product can be
extended to support other areas like facilities management and factory automation.
Software management can also be supported including management of the operating
system, database, and application software (Spectrum Concepts, 1996).
Tivoli management software consists of a framework and management applications.
The Tivoli Framework consists of a graphical user interface, command-line interface,
communication service, databases, installation service, and application services. The
software and hardware platform support for the manager is broad including AIX, HP-UX,
Solaris, SunOS, and Windows. The management agent support includes DOS, NetWare,
OS/2, and Windows (Lendenmann et al., 1997). The management application support is
impressive with sixty-five products that leverage the framework. Appendix D contains
the complete list.
The functional perspectives supported by the Tivoli management applications includes
asset, availability, change, network, operations, security, service, storage management,
medium-sized businesses, small to medium sized businesses, e-business, and OS/390
management (Tivoli Solutions, 2001). This is a change for Tivoli; previously, the
framework and management applications were discussed in a four-discipline model that
included:
1. Deployment management,
2. Availability management,
3. Security management, and
4. Operations and Administration (Lendenmann et al., 1997).
An example of a Tivoli management application that builds on the framework is Tivoli
Manager for MCIS. According to the Tivoli Manager for MCIS (1998), this product
provides comprehensive management of Microsoft's Commercial Internet Servers.
Comprehensive means support for IIS, Proxy, News, Mail and Directory components that
includes monitoring resources, managing events, automating routine tasks, and deploying
software (software distribution) for Microsoft Internet Explorer (browser) clients. This
product utilizes the Tivoli framework and leverages other framework-based products like
Distributed Monitoring and Software Distribution.
Many system, network, and application management-product vendors have
professional service personnel that implement their products for customers for a fee. For
most, this is not a requirement when you purchase the product. Tivoli has sixty-five
partners listed on their partner Web page (Tivoli Business Partners, 2001). The list of
Computer Associates Consulting partners is too big to count (Consulting Partners, 2000).
Hewlett-Packard has a link from their home page that makes it very easy to find a reseller
that performs implementation services. They will even provide a map with directions that
will get you from your place to theirs (Welcome to Hewlett-Packard, 2000). There are a
number of reasons why products are implemented with services. In some cases, the
products are immature and successful implementation would be impossible without highly
skilled and experienced implementation personnel. This situation was discussed in The
Double Edged Side of ESM (Boardman, 1999). In some cases, the involvement of service
personnel is part of a strategic set of activities whose goal is to control the customer
choices and influence where the implementation budget is spent.
Groupe Bull and its approach to implementation of the Integrated System Management
(ISM) product involves the strategic use of personnel. ISM was a collection of integrated
system and network management products developed and managed by Evidian, a
Groupe Bull subsidiary (Evidian Products, 2001). When combined with professional
services, these products cover a broad set of functions including systems, network, PC
workgroup, application, database, security, and telco management. Groupe Bull has its
worldwide headquarters in Paris, France. Groupe Bull operates in 100 countries and has
approximately 27,900 employees who provide the consulting and implementation services
(Integrated Systems Management, 2000).
ISM is standards-based including functions defined by Network Management Forum's
Omnipoints, Open Software Foundation's Distributed Management Environment, Object
Management Group's CORBA, X-Open's XMP-API, and Telco-defined TMN standards.
ISM has a programming language environment called System Management Language that
is used to describe the management objects and actions to be taken for specific exceptions
like faults and performance problems (Miller, 1994). ISM is an unusually complete
network and systems management product that is backed by a large group of consultants
and implementation personnel.
Keynote Perspective is a Web site performance tool that is bundled with professional
services. It requires no specialized software on the customer machines or network.
Keynote provides a worldwide network of monitoring agents to keep track of response
times and it provides daily emails that contain useful comparison information
(Performance Monitoring Software, 2001). Customers can engage Keynote to perform
detailed analysis of the performance data and prepare reports that can be used in a variety
of ways. The analysis reports could be used to determine the best city from which to host
a Web application. The reports could also be used to uncover configuration problems in the
setup of an application hosted in a dual-site mode. Keynote's focus is performance, as
slow response is costing e-commerce Web sites as much as $4.35 billion annually in lost
revenue (Keynote Perspective, 2000).
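At its simplest, response-time monitoring of the kind Keynote performs reduces to timing an HTTP request from a probe location. The following minimal Java sketch illustrates the idea; the target URL is a placeholder, and a production monitor would probe from many locations, repeat on a schedule, and record the results for reporting.

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ResponseTimeProbe {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://www.example.com/"); // placeholder target
            long start = System.currentTimeMillis();
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            int status = con.getResponseCode(); // forces the request
            long elapsed = System.currentTimeMillis() - start;
            con.disconnect();
            System.out.println("HTTP " + status + " in " + elapsed + " ms");
        }
    }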
Summary of What is Known and Unknown About this Topic
Much of what is known about applications management can be discussed in the context
of a functional perspective. One of the challenges in using this approach is to devise a
commonly understood list of functions. As is the case with so many aspects of
information technology, there are many different sources of information from which to
derive a list. These sources include standards organizations, systems-management
software companies, systems-management process consultants, and researchers.
According to Sturm and Bumpus (1999), the list of functions that is needed should
include fault, performance, configuration, security, and accounting. Their list is taken
directly from the ISO work on the subject called the ISO Management Model. This model
is widely known and referenced in network-management articles and books. An example
is a book on network management by Udupa (1996). This list of functions is a good
starting point, but it has a network-management bias. What about the applications
management standards organizations? What do the thought leaders in this emerging
discipline think regarding the functional perspective? The three main organizations that
have applications-management as a focus are the IETF, the DMTF, and POSIX of the
IEEE Computer Society. In general, these organizations are creating standards that are at
an implementation level. These standards can be used as a starting point by researchers or
used in software products and offerings.
The IETF has included an application focus in its standards work since 1993 (Sturm &
Bumpus, 1999). Application MIBs or application components embedded in other types of
MIBs have been the focus of activity and some approved standards. An example of this
work, Application Management MIB (1999), defines objects used for the management of
applications. A functional perspective is not stated in the standard, but can be derived
through careful examination of the scope of the document. The scope includes throughput
measurements, support for units of work, application response time monitoring and
support, resource management (files in use, I/O statistics, etc.) and control of applications.
The implied scope of this standard is application availability and performance. One
reason for this limited functional perspective is the design decision that the management
of the application would be done without the cooperation of the software being managed.
The DMTF, through the CIM, provides an application life cycle that has characteristics
like a functional perspective. The CIM life cycle includes six stages--purchase, deploy,
advertise, configure, execute, and remove. Each stage has an associated state, for
example, the purchase stage has an associated state of deployable. The stage/state
relationship is important, as the focus of the CIM standard is the support of installation
and operational data. The scope of the data includes product, software features, and
software elements. Other data includes configuration and service point that supports
initiate, start, and stop functions (Learn CIM, 1999). The implied scope of this model is
administration, configuration, availability, and change. Like the IETF Application
Management MIB standard, CIM provides a model that is ready to be used by researchers
and software developers.
The POSIX committee was focused on creating formal standards regarding the
administration of software. The POSIX standard (Information Technology - Portable,
1995) focused on administration of software across distributed systems. This included
packaging of software for distribution, distribution of software to systems, installation and
configuration using utilities, and finally removal of that software from the system. The
functional perspective of this standard includes administration, configuration, software
distribution, and change.
What about others in the systems and network-management community? The
community is large and contains many software companies with service delivery and
consulting organizations. Tivoli Systems, Computer Associates, BMC Software, and
Compuware are a few examples of the strong companies active in systems management
today. Table 6 contains a summary of some of the different views that have been developed by
systems-management software companies, systems-management process consultants, and
researchers.
Table 6. Different Views of Application-Management Functional Perspectives

Source | Language used | Names given to groupings
Best practices in enterprise management (1998) | Management Domains | Service, security, storage, desktop and server, network, internet, and application/database (7)
Information systems management design guidelines and strategy a practical approach (Harikian et al., 1996) | Functions in a task or process view (also known as SystemView Disciplines) | Business, change, configuration, operations, performance, and problem (6)
Distributed systems management design guidelines: The smart way to design (Harikian et al., 1996) | Tivoli Disciplines | Deployment, availability and security, operations & administration (4)
Distributed systems management design guidelines: The smart way to design (Harikian et al., 1996) | Information Systems Management Architecture (ISMA) Processes | Business, data, service level, recovery, security, audit, capacity, problem, and distribution (9)
Distributed systems management design guidelines: The smart way to design (Harikian et al., 1996) | Information Technology Process Model (ITPM) Processes | Business, data, service level, recovery, security, audit, capacity, problem, and distribution (9); note: this subset is derived from a complete list of 42 processes
Delivering IT Services (Bladergroen et al., 1998) | Information Technology Infrastructure Library (ITIL) Services | Configuration, help desk, problem management, change, software control and distribution, service level, capacity, availability, contingency, and cost (10)
Distributed Computing Environment (Cerutti & Pierson, 1993) | Systems | Fault, performance, configuration, accounting, billing, and software distribution (6)
These functional perspectives (disciplines, processes, services, or systems), although
written about widely, are not implemented in any uniform way within management
products in the marketplace. This statement can easily be proven by gathering
information from systems, network, and application management software companies.
Few of them use precise terminology to categorize their products. Certainly, one can
perform analysis regarding some products and put them in a category like availability or
security. Others, however, are hard to categorize using a conventional functional label.
Table 7 contains a list of products along with the function that each performs. The
information about function comes directly from the software companies.
Table 7. Management Products and the Function That They Perform

Company: Product name | Company description: Implied functional perspective
Resonate: Central Dispatch (Central Dispatch, n.d.) | Service level control: Service level
IBM: Client Response Time (Client Response Time Monitoring, 1998) | Response time measurement: Performance
WebManage: ContentMover (ContentMover, 1999) | Deployment automation & Web content distribution: Automation & software distribution
Tivoli: Distributed Monitoring (Tivoli Distributed Monitoring, 1999) | Server monitoring: Automation and availability
WebManage: Enterprise Reporter (Enterprise Reporter, 1999) | Response time monitoring: Performance
Trend Micro: InterScan WebManager (Interscan Webmanager, 2000) | Monitor and control internet access: Security
Keynote: Perspective (Keynote Perspective, 2000) | Response time monitoring: Fault and performance
BMC: Patrol (Patrol 2000, 2000) | Enterprise management, performance and capacity, and service management: Automation, availability, performance, capacity, & service level
IBM: PCPMM: Port Checking Pattern Matching Monitor (Woodruff, 1999) | URL and port monitoring: Availability
WebManage: Service Level Reporter (Service Level Reporter, 1999) | Performance measurement, service level reporting, and Web site activity analysis: Performance, service level, Internet
Platform: SiteAssure (Platform SiteAssure, 2000) | Availability: Availability
IBM: System Resource Management (Server Resource Management, 2000) | Performance monitoring: Performance
In spite of the challenges discussed, a functional perspective list was developed by this
researcher and is used as the basis for the information contained in this part of the
document. Numeric analysis was done using the 85 function-perspective observations
gathered from 23 sources. The sources included 4 standards organizations; 6 groups of
researchers, research and consulting organizations, and vendors; and a survey of 13
sample products. Tables that support the selection of the 15 functional perspectives in this
section can be found in Appendix A, Functional Perspectives Analysis Tables.
Accounting
The accounting functional perspective is rooted in the ISO Management Model (ISO
DP 7489/4, 1986). It is also described by Sturm and Bumpus (1999), Cerutti and Pierson
(1993), and Udupa (1996). Accounting pertains to how much of the resources are being
used and how much must be charged for using them (Udupa, 1996). The accounting
function requires the collection of data and the generation of reports. Data collection is
typically used to capture usage information that in turn is used to generate usage and
potentially billing reports. Accounting is associated with the idea that resources like CPU
cycles, network bandwidth, and direct-access storage space are expensive and have to be
allocated and managed. This functional perspective is rooted in the mainframe era,
when many users shared one machine and its associated resources. Interestingly, today's
high-end Web servers are so expensive that it is starting to make sense to IT departments
to manage them like mainframes. This phenomenon was reported by Olsen (1998), where the
author described how UNIX JobAcct software was used to bill users for their CPU
connect time and disk activity. Olsen's research involved an implementation with the
Army Corps of Engineers. What is the relevance of the accounting perspective to the
management of applications?
Most of the references to the accounting perspective in the research literature stem
from the initial ISO work and the network management discipline. Numerous articles
simply explain the ideas that can be found in the original ISO standard (ISO DP 7489/4)
from 1986. There are, however, a number of product implementations. EcoTools from
Compuware Corporation has a resource accounting capability that is linked to charge
back. The software handles a somewhat diverse set of application products and runtime
environments including various Unix platforms running Oracle, Sybase, and Informix
database software (Systems Management Tools, 1996).
Olsen (1998) reported that the Army Corps of Engineers uses UniSolutions Associates
JobAcct software to bill its districts for their CPU connect time and disk activity on their
Unix servers. The utility is just one of several components of a layered charge-back
system. Other components include Awk and Unix shell scripts that create sequential files
that are used to supply information needed by the Army's charge-back and billing
application. Aragon (1997), Rennhackkamp (1997), and Fosdick (1998) highlighted the
functionality of Computer Associates Unicenter TNG that is attractive to many
companies. Jones International chose Unicenter TNG due to its strengths in resource
accounting, as well as software distribution, asset management, event management,
workload management, and report management (Aragon, 1997). Fosdick (1998) discussed
the comprehensive list of Unicenter TNG's capabilities including resource accounting--a
major functional capability.
Another example of resource accounting software is Platinum Technology's CIMS, a
multi-platform product for enterprise resource management. The product focus was
enterprise-wide resource management including charge back and capacity planning
reporting supporting MVS, VSE, UNIX, Windows NT and OpenVMS systems (System
Software, 1997).
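
The mechanics these products share are straightforward: collect usage records, aggregate them by user, and apply billing rates. The sketch below is a minimal illustration of such a chargeback calculation in Python; the record fields, users, and rates are hypothetical and are not a reconstruction of JobAcct or CIMS.

    # Each record notes who used a resource and how much (hypothetical fields).
    usage_records = [
        ("district-a", 1200.0, 500.0),   # user, CPU seconds, disk MB-hours
        ("district-b", 300.0, 2500.0),
        ("district-a", 450.0, 120.0),
    ]

    # Hypothetical rates, in dollars per unit of resource consumed.
    CPU_RATE = 0.02       # per CPU second
    DISK_RATE = 0.001     # per MB-hour of disk occupancy

    def chargeback(records):
        """Aggregate usage by user and apply rates to produce a bill."""
        totals = {}
        for user, cpu, disk in records:
            c, d = totals.get(user, (0.0, 0.0))
            totals[user] = (c + cpu, d + disk)
        return {user: cpu * CPU_RATE + disk * DISK_RATE
                for user, (cpu, disk) in totals.items()}

    for user, amount in sorted(chargeback(usage_records).items()):
        print("%-12s $%8.2f" % (user, amount))
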
Administration

The administration functional perspective is the focus of a POSIX standard. POSIX
centered on standardizing system administration utilities--an area where there had been no
formal standards. The narrow focus was software administration, a subset of the tasks and
tools used by the system administrator. Within this focus, POSIX defined a software-
packaging layout, a set of information maintained about the software, and a set of utility
programs to work with that software and information (Information Technology - Portable,
1995). More than a decade earlier, IBM dealt with the issue of mainframe software
administration by creating a system of procedures, information, and data utilizing a tool
that it created called System Modification Program (OS/VS2 MVS Overview, 1980). This
utility, with its standard sequence of receive, apply, and accept, became a de facto standard for
large systems; virtually every software vendor used the utility to install and administer
their products.
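
The receive-apply-accept sequence amounts to a small state machine with a back-out path. The following Python sketch illustrates the sequence in outline only; the class, state names, and fix identifier are hypothetical and do not reproduce the actual utility.

    class SoftwareModification:
        """Tracks a software fix through receive, apply, and accept,
        mirroring the standard large-systems sequence in outline."""

        # Legal transitions: a fix must be received before it is applied,
        # and applied before it is accepted; an applied fix can be backed
        # out, returning it to the received state.
        TRANSITIONS = {
            "new": ["received"],
            "received": ["applied"],
            "applied": ["accepted", "received"],  # "received" = backed out
        }

        def __init__(self, name):
            self.name = name
            self.state = "new"

        def move_to(self, state):
            if state not in self.TRANSITIONS.get(self.state, []):
                raise ValueError("cannot go from %s to %s" % (self.state, state))
            self.state = state

    fix = SoftwareModification("UQ12345")   # hypothetical fix identifier
    fix.move_to("received")   # copy the fix into the holding library
    fix.move_to("applied")    # install it into the target system
    fix.move_to("accepted")   # make it permanent once verified
    print(fix.name, fix.state)
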
Administration is also an aspect of the DMTF's CIM. This model has a life-cycle state
named deploy that means installing the application on a server so it can be administered
over the network (Applications and Namespaces, 2001). Deploying the application is an
administration activity in the same way that POSIX frames it. CIM has a bigger focus
than just administration. Intel's Wired for Management initiative is looking to CIM to
help with asset management. CIM 2.0 has the ability to receive and manage
instrumentation from devices, like USB cards, and to maintain information about static
resources like platform BIOS (Overview of Wired, 2000).
The software developer Tivoli, which constructed a new set of disciplines from
SystemView, has Administration as one of its primary focus areas (Lendenmann et al.,
1997). Tivoli, an IBM company, is not the thought leader in the marketplace that
SystemView was. The Tivoli Management Environment is the management framework that
replaced SystemView in the IBM portfolio. Instead of architectural leadership, Tivoli focuses on products like
Workload Scheduler, Operations Planning and Control, Remote Control, Distributed
Storage Manager, and modules that provide integration of third-party products (Tivoli
Product Index, 2001).
Automation

The automation functional perspective is complex to describe, as it is both a stand-
alone function and one that is embedded in other functional perspectives like operations
and problem. As a stand-alone function, automated operations has been in the literature
for over 10 years. Irlbeck (1992) wrote an important article in the IBM Systems Journal
that described automation in the context of network, system and remote system operation.
This article announced new capabilities in IBM's NetView product.
Many vendors, including IBM (Irlbeck's employer), have developed products to
automate the processing of large-scale systems (Desmond, 1990). OPS/MVS, AutoMate
(Anthes, 1992), and Operations Planning and Control (Tivoli Operations, 2001) are
products that place the operations of large systems under software control, including
subsystems like the Customer Information Control System (CICS) and the Information
Management System (IMS). These subsystems support the daily operations of hundreds of
thousands of users (Ryan, 1993). Researchers like Flanagan (1996) wrote about the
importance of automation for effective network management. Linked with a policy-
directed approach, automation could be used to manage legacy environments, intelligent
agents, and application integration.
Products are beginning to become available to automate activities on smaller systems
like NT clusters and UNIX complexes. Welter (1999) used the Summit OnLine forum to
explain how the automation capabilities of Freshwater Software's family of products
leverages automation to watch local processes, network connections, and machine
resources to prevent and detect problems. As these smaller systems get larger and more
expensive, the economic benefits of labor savings become more attractive.
Automation has a relationship to other functional perspectives in that it is often a key
supporting activity. For operations, automation products and thinking have made it possible
to automate the startup, shutdown, and restart of system, network, and application-
support resources. Free from some routine activities, personnel are available to handle
operational exceptions that cannot be easily automated (Day, 1992). For problem
management, automation is important to the creation and updating of problem records.
Many software tools exist to use problem event data to automatically create problem
records (Universal Server Farm, 2000). Automation is used to support configuration
management by soliciting inventory scans from all user devices that are part of a network
(Remedy Discovery Services, 2000). Automation supports security management when a
procedure executes and compares the security profile on a server to a predefined
specification and resets it, if necessary, to the required configuration (Windows NT,
2000).
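
The last of these patterns can be made concrete with a short sketch. The Python fragment below, in which the profile settings are hypothetical, compares observed security settings against a required specification and resets any setting that has drifted.

    # Required security profile for a server (hypothetical settings).
    REQUIRED = {"telnet_enabled": False, "min_password_length": 8,
                "guest_account": False}

    def enforce_profile(observed, required=REQUIRED):
        """Return corrected settings plus a list of drift items found."""
        drift = [(key, observed.get(key), value)
                 for key, value in required.items() if observed.get(key) != value]
        corrected = dict(observed)
        corrected.update(required)     # reset drifted settings to the spec
        return corrected, drift

    server = {"telnet_enabled": True, "min_password_length": 8}
    fixed, drift = enforce_profile(server)
    for key, actual, wanted in drift:
        print("reset %s: %r -> %r" % (key, actual, wanted))
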
Availability

Bladergroen et al. (1998) defined availability as part of a process called availability
management. This process makes possible the optimum utilization of resources, methods,
and techniques to achieve the agreed upon level of service. In contrast, researchers like
Hariri and Mutlu (1995) have examined the topic of availability and have modeled it in a
numeric fashion. Their attention to availability treated it as one of several important
parameters to evaluate and optimally engineer in regards to cost effective distributed
systems.
Availability was also the focus of the DMTF through projects like CIM. This project
represents an approach to the management of systems, software, users, and networks that
use a model that is object-oriented. In the model, availability is a state construct that is
managed at the logical-device level along with device identification, last error code, and
other key device-level fields. It is ironic that one of the major uses of CIM is to manage
the availability of managed objects, yet the word availability hardly appears in a 54-page
white paper on the model (Westerinen & Strassner, 2000).
There are a large number of availability products in the marketplace. Tivoli Systems
and predecessor IBM products have a long history of focus on system and network
availability. NetView, which appeared in the marketplace in 1989, was a tool to manage
the availability of network devices like terminals, control units, lines, and front-end processors,
as well as the application systems that made these devices useful, including software like
CICS and IMS (Szabat & Meyer, 1992). CICS and IMS are still used today as
application environments for banking, insurance, and many other industries. Netscape,
through a programming extension, supported seamless integration of its application server
with legacy CICS and IMS application systems. This integration allowed customers to
leverage their existing investments in the new Internet and Extranet application
environments that use Netscape's technology (New Netscape Extension, 1998).
Many other Web products were centered on availability. Platform SiteAssure used a
view containing Web, application, and database servers as a tool to monitor the
availability of these resources. Actions were taken directly from the availability views
(Platform - SiteAssure, 2000). The Tivoli Distributed Monitoring product had a "plug-in"
module called Tivoli Manager for MCIS that was an availability tool specifically targeted
at Microsoft's Commercial Internet Servers. With this module, availability was managed
as a life-cycle activity. Automation and software distribution were also key functional
capabilities (Tivoli Manager for MCIS, 1998). These are a few of the legacy and
emerging availability products.
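
Underneath such tools is a simple periodic probe. The following Python sketch (host and URL are illustrative) checks availability at two levels: a TCP connection to a service port, in the spirit of the PCPMM monitor listed in Table 7, and an HTTP request for a URL.

    import socket
    import urllib.request

    def port_available(host, port, timeout=3.0):
        """True if a TCP connection to host:port succeeds."""
        try:
            with socket.create_connection((host, port), timeout):
                return True
        except OSError:
            return False

    def url_available(url, timeout=3.0):
        """True if an HTTP GET for the URL returns a 2xx status."""
        try:
            with urllib.request.urlopen(url, timeout=timeout) as reply:
                return 200 <= reply.status < 300
        except OSError:
            return False

    # Illustrative targets; a real monitor would loop over these on a schedule.
    print("port 80:", port_available("www.example.com", 80))
    print("url    :", url_available("http://www.example.com/"))
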
Business

The business functional perspective is one that is typically broad in its capability. In
fact, it has overlap with other functional perspectives depending on what definition is
used. Mangold and Brandner (1993) described the SystemView definition of business
management. The scope included accounting, security, service agreement planning and
control, and service marketing. Most of these activities are administrative in nature with
some support being provided by software. The SystemView discipline of business
management was derived from the broader set of functions described in the
Information Technology Process Model (The Information Technology, 1995).
Harikian et al. (1996) had a somewhat different definition of the business functional
perspective. Their definition of business management had a scope that included inventory
and security management, as well as financial administration, business planning, and
management services. Implementation of the Harikian definition is a management
function, again, supported by key software for inventory and security management.
The business functional perspective is one area where there is a significant difference
between the strategist's view of the function and the implementations of the perspective as
evidenced in vendor products. There are a number of products that advertise business
management functionality. Tivoli's Global Enterprise Manager product focused on the
management of a business system that was described as a logical collection of
management-ready applications (Gulla & Warren, 1998). The GEM paradigm for
management of the business system consisted of business-system views, application
monitors, and commands to take actions to control the business system. The views
contained a variety of objects that represented the components of the business system.
With these components as a launching point, the IT specialist could graphically monitor
the status of components (up, down, degraded, etc.), issue commands like start, stop, and
restart, and view events (Tivoli Global Enterprise, 1998).
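
The view-and-status paradigm is easy to illustrate. In the Python sketch below, a business system is a named collection of components, and the status shown for the business system is the worst status of any component; the component names and statuses are hypothetical.

    # Status values ordered from best to worst.
    SEVERITY = {"up": 0, "degraded": 1, "down": 2}

    def business_system_status(components):
        """Roll component statuses up into one business-system status."""
        return max(components.values(), key=lambda status: SEVERITY[status])

    order_fulfillment = {            # hypothetical business system
        "web-server": "up",
        "order-db": "degraded",
        "payment-gateway": "up",
    }
    print("order fulfillment is", business_system_status(order_fulfillment))
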
Like Tivoli, Computer Associates has business systems management as a key part of
their generally available products. The flagship systems management product to support
the business management functional perspective is Unicenter TNG (Karpowski, 1999).
Their implementation had tools that focused on business process views. These views
helped to answer questions relevant to the business community like "why is order
fulfillment processing slow?" The technology they used logically overlapped resource
views like those containing applications, databases, systems, and networks with business
views organized by geography, application, or functional role (Computer Associates:
Enterprise, 1997). Tivoli and Computer Associates used their business systems
management software as a basis for their competition for new customers in the late 1990s
(Kay, 1999).
Capacity

Capacity management is a long-standing functional perspective. Capacity
management, which has as its goal the effective and efficient use of resources, is often
linked to performance management. This is the case with the IBM IT Process Model
where the Manage Performance and Capacity Process is a component of the Support IT
Services and Solutions Process Group (Harikian et al., 1996). Capacity management is
also part of ITIL Services. It is also linked to performance. In addition to performance,
the ITIL capacity service also includes modeling of resources, demand, and workload
management. It also includes application sizing. The sizing of the application is a
statement of minimums. Examples include "at least 4 MB RAM" and "a VGA screen".
However, the ITIL method includes an examination of these minimums to make sure that
they continue to result in satisfactory performance. This is very important when the
application or workload changes (Bladergroen et al., 1998).
The literature is dominated by papers on Web server capacity. Menasce and Almeida
(1999) focused on cost-effective configurations for Web servers based on a formal
capacity planning approach. Banga and Druschel (1999) concentrated on a new method
for Web traffic generation that generated bursty traffic with peaks that exceeded the
capacity of the servers. Their focus was realistic loads. Christensen and Javagal (1997)
focused on understanding traffic as it relates to capacity planning. Their work made
it possible to determine if additional network capacity was needed to support the
application. Other capacity papers exist that focus higher in the application-dependency
stack, but there are far fewer of them compared to Web server capacity papers. The
capacityplanning.com Web site contains thirty-four white papers--only one directly relates
to application sizing. The paper (Application sizing, capacity planning, 1996) was a
detailed description of a proposed research project. It is not clear if the project was
funded or completed, but the scope and tasks outlined were substantial.
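
Formal capacity planning of this kind rests on elementary queueing formulas. The Python sketch below uses the standard M/M/1 approximation--utilization is the arrival rate times the service time, and mean response time is the service time divided by one minus the utilization--to estimate how many servers a workload needs; the workload numbers are illustrative.

    def servers_needed(arrival_rate, service_time, target_response):
        """Smallest server count whose per-server M/M/1 response time
        meets the target; load is assumed to be split evenly."""
        n = 1
        while True:
            utilization = (arrival_rate / n) * service_time
            if utilization < 1.0:
                response = service_time / (1.0 - utilization)
                if response <= target_response:
                    return n, utilization, response
            n += 1

    # 120 requests/sec, 20 ms mean service time, 50 ms response target.
    n, util, resp = servers_needed(120.0, 0.020, 0.050)
    print("servers=%d utilization=%.0f%% response=%.0f ms"
          % (n, util * 100, resp * 1000))
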
Many capacity products are available, and a large number of them are narrowly focused
point products. An example of the point products available was described in Capacity
Management Software (Loyola, 1998). The author reviewed four different capacity-
management tools that ran under Windows NT and were designed to provide support for
network-capacity analysis. Capacity management products and simple tools available with
many software systems are often used by capacity planners. These capacity-planning
professionals perform services internally for their companies and externally for customers.
An example of this is a white paper where BMC professional services personnel explain
how they use BMC capacity planning tools to consolidate multiple servers into fewer
instances to save costs and to reduce points of failure (Server Consolidation Methodology,
2001). The main product used was BMC Patrol. The consultants focused on CPU,
transaction I/O rates, and disk utilization.
Many companies with capacity planning software utilize their own professional
services groups and partners to deploy and use the software to provide value to their
customers. Examples include Unisys, Hewlett Packard, Oracle, and IBM. Many other
capacity-planning consultants, like CapaciTeam Inc. and Veritechs Solutions, have no
built-in affinity to a software company (Yahoo! Search Results, 2001).
Change

Change management has been an important focus in information technology since
mainframes started to provide services that companies found critical. Since change
introduces risk to a system, change management was developed as a tool to reduce the
risk of failed changes by requiring a written back-out procedure. IBM's change-
management focus was developed during the early years of the mainframe and has
continued to remain important. Specifically, for the IBM IT process model, change is
reflected in the Deploy Solutions Process Group. This group has several change-related
components including Define Change Management Practices and Administer Changes
and Plan Change Deployment. These processes are explained in detail and have a broad
scope including management of changes to software, hardware, control mechanisms,
configurations, environments, facilities, databases, and business applications (Harikian et
al., 1996). The ITIL methodology, another key process methodology like the IBM IT
Process Model, identified change management as a key support service along with
configuration, helpdesk, problem, and software control and distribution (Bladergroen et
al., 1998).
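
The back-out requirement can be expressed as a simple rule: a change is not schedulable unless it documents how to undo itself, and a failed change triggers that procedure. The Python sketch below illustrates the rule; the change, its steps, and the simulated failure are hypothetical.

    class ChangeRecord:
        """A change is schedulable only if it documents how to back out."""

        def __init__(self, description, implement, back_out):
            if back_out is None:
                raise ValueError("change rejected: no back-out procedure")
            self.description = description
            self.implement = implement
            self.back_out = back_out

        def execute(self):
            try:
                self.implement()
                print("change succeeded:", self.description)
            except Exception as error:
                print("change failed (%s); backing out" % error)
                self.back_out()

    def install(): raise RuntimeError("install step failed")   # simulated failure
    def restore(): print("previous version restored")

    ChangeRecord("upgrade order-entry application", install, restore).execute()
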
The focus of the literature on change management is the challenges created by Web
applications with frequently changing content. Cochran (2000) explained the alternatives
that companies have regarding changing content. The author indicated that companies can
implement a change-management system or almost guarantee an influx of corrupt and
unauthorized data. Huh and Bae (1999) proposed a Web-based change management
framework that can verify change success and support synchronous collaboration. The
prototype of the framework implementation supported a commercial object-oriented
database management system and utilized programs written in C++. A number of change-
management applications were developed specifically for the Web. Huang, Yee, and Mak
(2001) developed a system to manage engineering changes in forms, fits, functions,
materials, and dimensions. Their system, actually a framework, was designed to provide
better information sharing, simultaneous data access, and more timely communication.
There are many change-management products in the marketplace. Some legacy
software, like IBM's Information Management product (now called Service Desk for
OS/390), is still used for its change-management application. The program is over 20
years old (Tivoli Service Desk - INFOMAN, 2001). This tool has the ability to support
thousands of concurrent users--an ability lacking in Windows NT- or UNIX-based
solutions. It has recently been updated to support up to 400 gigabytes of data storage and
now has a Web interface (Tivoli Service Desk - Datasheet, 2001). Newer products are
focused on change management of applications with integration of other tools. An
example is StarSync from Starbase Inc. StarSync is used to deploy digital components to
servers as part of a managed change. It can work with StarSweeper, another Starbase
product, to sweep Web content for defects. It can also work with StarTeam Test Director
to synchronize defects between the change system and the defect-tracking module
(Starbase Corporation, 2001).
Change management is like capacity management in that it often takes a skilled
professional to implement it in an effective manner. Because of this, many software firms
have highly skilled professional services personnel to assist customers with their product
implementations. A good example is Telelogic, a global provider of solutions for
advanced software development. Telelogic has an integrated change and configuration
management product suite and utilizes their professional-services personnel to deliver the
solutions. Telelogic consultants present frequently at conferences like the Annual
Workshop on Software Configuration Management and have published many white
papers. Weber (1999) explained that software configuration management (SCM) provided many
well-known benefits for traditional software development. However, RAD users are
reluctant to use SCM because it slows them down. Weber's paper explained strategies for
successfully using SCM with RAD, providing practical and strategic direction to the reader.
In another paper (Continuus/CM: Change Management, 2001) the author explained the
drivers for change management including the growing diversity of development and
delivery platforms, the use of parallel development strategies, application size and
complexity, team size, distributed development teams, quality and time to market
demands, and modern integrated development environments.
Configuration

Configuration management is one of the OSI Systems Management Functional Areas
(SMFAs). Configuration is important with regard to networks because it can support the
location of resources like routers, hubs, switches, and hosts. Details about these resources
can be modeled as objects and these objects can be stored in directories. Configuration
management is also important because the information it maintains is needed to start, stop,
add, and delete resources from the network (Udupa, 1996). ITIL views configuration
management as a key support service like helpdesk, problem management, change
management, and software control and distribution (Bladergroen et al., 1998). For the
IBM IT Process model, configuration management (called maintain configuration
information) is part of the Support IT Services and Solutions Process Group. The broad
goal of Maintain Configuration Information is to identify, capture, organize, and maintain
configuration information for use by other processes. One of the big challenges associated
with configuration management is the dynamic nature of systems, networks, and
applications (Harikian et al., 1996).
Some of the key configuration-management issues in the literature involved
automation in network configuration, the role of software distribution for Web sites, and
Web-based configuration applications. Ku, Forslow, and Park (2000) discussed the
importance of some level of automation in network configuration and management. Their
Java-based tool automatically populated a centralized database with key network-
configuration data. Leoni, Trainotti, and Valerio (1999) explained the results of an
experiment designed to understand the impact of a process-improvement activity
involving configuration management in the software development process. The focus of
this study was development projects for Web sites. A number of Web-based configuration
applications were discussed. Attardi, Cisternino, and Simi (1998) wrote of a Web-based
configuration assistant useful in electronic commerce and information services. In their
paper, the researchers described a generalized approach for building Web-based
application assistants. Curtis (1997) described a Web-based application to
manage the configuration of team programming projects. Web-based tools are described
by Hahn and Bruck (1999) in the context of micro electromechanical systems process
configuration. PCFONFIG is a tool used to manage initial configurations for build-to-
order products. It is especially suited for machines with complex configuration
requirements (Slater, 1999).
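
The automated-population idea can be sketched in a few lines of Python. A real discovery tool would probe remote devices with SNMP or agents; the sketch below only scans the local machine and stores the result in an in-memory repository keyed by host name.

    import platform
    import socket

    def scan_local_host():
        """Gather a few configuration facts about this machine.
        A real tool would probe remote devices with SNMP or agents."""
        return {
            "hostname": socket.gethostname(),
            "os": platform.system(),
            "os_release": platform.release(),
        }

    repository = {}                       # central configuration database
    record = scan_local_host()
    repository[record["hostname"]] = record
    for name, facts in repository.items():
        print(name, "->", facts)
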
Like so many other functional perspectives, configuration management software got its
start on mainframe systems. Initially, mainframe configuration information was
maintained by a programmer, processed by the system control program, and stored in
machine-readable Unit Control Blocks (Elder-Vass, 2000). Over time, more of the
configuration information was delivered as part of the microcode or logic included with
the device itself (Microcode, 2001). When networks were attached to mainframe
computers, these devices already had considerable functional capability including some
configuration-management capability. An example is the Vital Product Data (VPD)
command that was used to query the device and store the results in a file that could be
used to produce configuration reports. The data returned included general product data
(hardware or software), data for modems, data for DSU/CSUs, link configuration data,
sense data, attached device configuration data, and product set attributes (NetView for
OS/390 Application, 2001). VPD is still used today by IBM and products from other
networking companies like Cisco (Router Products, 2001).
Today, configuration software is much broader in scope. In addition to host and
network configuration, products like Code Co-op from Reliable Software provide a
distributed software version control system. Version-control software is used to manage
application configurations (Reliable Software, 2001). Other software configuration
management products include PCMS from SQL Software (Merant PVCS, 2001) and
Continuus/CM (Text-based configuration, 2001). Telelogic North America Inc. (formerly
Continuus Software Corporation) employees have authored many white papers on
configuration management that are available on the Continuus Web site. Dart and
Krasnov (1995) explained how the discipline associated with configuration management
can significantly reduce the risk associated with the adoption of a new tool or process
reengineering. Dart (1994) also wrote about the challenges of adopting an automated
configuration-management solution and the strategies to successfully implement the
technology. These employee-written documents are used by customers to help with their
own software implementations. They also demonstrate the skills and previous success of
the company's professional-services team.
Fault

Fault management is rooted in OSI. It is one of the five SMFAs. The other SMFAs
include configuration, performance, accounting, and security. Fault management is
generally concerned with detection, isolation, and correction of unusual operation of
systems (Udupa, 1996). Fault management is not a specific focus of ITIL, but a part of
availability management where it is used as part of a methodology called fault tree
analysis (FTA). FTA is used to examine the sequence of events disrupting an IT service
(Bladergroen et al., 1998). In the IBM IT Process Model, fault management, also called
alert or trap management, is part of a subgroup called Manage Problems and is part of the
support IT Service and Solutions process group. The scope of Manage Problems is
detection, analysis, recovery, resolution, and tracking (Harikian et al., 1996).
The literature on fault is like a tree with a number of strong branches. One branch is
centered on policy and reasoning approaches to fault management. Katchabaw, Lutfiyya,
Marshall, and Bauer (1996) defined a fault as a violation of policy. This definition is in
contrast to the domain-specific work that dominates the literature and product
implementations. Network-level faults are an example of a domain-specific exception. In
this work, the researchers proposed a policy-driven system that detects application faults
and then isolates and corrects them using a predefined policy. The researchers
implemented a prototype that utilized the configuration-management components of
MANDAS. The prototype required instrumentation that, when added to the application
processes, provided detailed information about application faults.
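
Treating a fault as a policy violation lends itself to a direct sketch. In the Python fragment below, each policy pairs a metric with a test of acceptability and a corrective action, and any observation that fails its test is raised as a fault; the metrics, thresholds, and actions are hypothetical.

    # Each policy: metric name, test of acceptability, corrective action.
    policies = [
        ("process_count", lambda v: v >= 1, "restart application process"),
        ("queue_depth",   lambda v: v < 500, "add a consumer"),
    ]

    def detect_faults(observations):
        """Return (metric, value, action) for every policy violation."""
        return [(name, observations[name], action)
                for name, acceptable, action in policies
                if not acceptable(observations[name])]

    observed = {"process_count": 0, "queue_depth": 120}
    for metric, value, action in detect_faults(observed):
        print("fault: %s=%s -> %s" % (metric, value, action))
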
Yun, Ahn, and Chung (2000) defined fault conditions and rules and the related recovery
mechanisms to handle them. They investigated fault conditions like process fault, server
overload, network interface fault, and configuration and performance fault. Another
branch of the fault tree has as its focus fault tolerance in Web servers. Aghdaie and Tamir
(2001) made this a focus of their work, which was centered on client-transparent recovery
actions. Their work concentrated on completing the user's transaction at the time of
the failure instead of the typical outcome, which is to leave the user unsure whether the process
completed successfully. Chin, Ramachandran, and Chong (2000) and Yang and Luo
(2000) also focused on server fault tolerance.
There are many fault-management products in the marketplace. Fault-management
software originated with mainframe computers. Early examples were the Environmental
Record Editing and Printing Program (EREP) (Environmental Record Editing, 1999).
EREP was a tool that reported on hardware faults that were recorded in a key system file
called SYS1.LOGREC. EREP and SYS1.LOGREC are still in use today (Elder-Vass,
2000). When networks were connected to mainframes, devices were outfitted with logic to
detect errors and report those errors to "upstream" devices in the network. Eventually, the
data made its way to the mainframe computer. Initially, this data was stored in system
files, but new tools like Network Problem Determination Aid (NPDA) were developed to
collect the data, store it on-line, and offer operators assistance with the faults it recorded
(called events and alerts) in the form of probable cause and recommended action advice
(NetView User's Guide, 2001).
Today, NPDA remains a common interface point for software vendors who want to
consolidate fault data, from a variety of devices, in one place with considerable storage
and processing capacity. A typical example is the Spectrum/NV-S Gateway software
from Cabletron systems. The product user's guide explains how to forward Spectrum-
detected faults to NPDA for viewing and reporting (Spectrum/NV-S, 1998). Fault
detection and management is a challenge in distributed environments, as evidenced by the
large number of products produced by major software companies. Veritas Software
Corporation produces NerveCenter, a fault management tool targeted at UNIX and NT
systems (Veritas NerveCenter, 2001). Tivoli has a number of modules for systems and
applications like CATIA, BEA Tuxedo, Domino, R/3, DB2, MCIS, and others that have a
strong focus on fault (event) management (Tivoli Product Index, 2001).
Quest Software has I/Watch for Oracle event management. A focus on Oracle and the
databases that support an application gives the management software a window into many
applications. Computer Associates has legacy mainframe fault-management products, but
also has Unicenter TNG with a number of fault-management and automation plug-in
modules (Mason, 2001). BMC has Patrol for event management and, like Tivoli, has plug-
in modules (called knowledge modules) to handle the fault management requirements for
a variety of environments like SAP R/3 and Oracle (Event Management, 2000).
Operations

Operations management was not the focus of any major standards efforts. OSI, for
example, did not include operations as one of its SMFAs. ITIL does not identify it as a
specific discipline, but rather discusses operations management as part of availability
management--"Good availability management is founded on adequate performance of the
operational management processes" (Bladergroen et al., 1998, p.49). Operations was a
major focus of phase four of the IBM Systems Management Solution Life Cycle. The
main goal of the operation phase was to ensure that the ongoing delivery of IT services is efficient,
effective, and consistent. The other phases of the Systems Management Solution Life
Cycle were define the solution approach, design the solution, and implement the solution.
The methodology was independent of the tools used to deliver the service. This life-cycle
approach comes directly from the IBM IT Process Model and is focused exclusively on
systems-management activities (Fearn, Berlen, Boyce, & Krupa, 1999).
Tivoli Systems combined operations and administration. The scope of the products
they produced included job scheduling, help desk, backup/restore, and output
management (Lendenmann et al., 1997). Tivoli inherited operations management from
SystemView when IBM purchased Tivoli and the two architectures and products merged.
SystemView's focus was purely related to the operational aspects of computing resources
(Udupa, 1996).
The literature contains the work of a number of researchers active in operations
management. Tsaoussidis and Liu (1998) explained a knowledge-based management
system that offers dynamic service to distributed applications. The services included
determining the processor allocation approach to be used, choosing applications to be
executed next, and invoking the best parallel processing method. Tanaka and Ishii (1995)
explained a service-management architecture they developed that was focused on
providing reliable telecommunications and operations services. The architecture consisted
of application software elements, a manager, a database, and an installer and remover. The
focus was smooth operation of the target application. Shukla and McCann (1998) created
an operation support system focused on systems management using the World Wide Web
and intelligent agents. The main goals of the system were to provide continuous
monitoring of critical computing resources and problem detection and notification
utilizing intelligent agents. Other goals focused on the use of a Web browser like
configuration management and centralized control of applications.
Operation-management software is an especially broad category. It is broad because so
many different products support IT operations. There are many representative job-
scheduling programs. Job Scheduling Server for Windows 2000 from Microworks handles
the scheduling of work on a single machine or a networked version that can serve multiple
machines including load balancing. It has a rich set of features including a monitor-
program service, job logging, a schedule window, a calendar, conditional job scheduling,
and job priority (Job Scheduling Server, 2001).
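
Conditional job scheduling of this kind reduces to releasing each job only when its prerequisites have completed, and running ready jobs in priority order. The Python sketch below illustrates the idea with hypothetical jobs; a production scheduler would add calendars, logging, and failure handling.

    # name -> (priority, prerequisites); lower number = higher priority.
    jobs = {
        "extract": (1, []),
        "load":    (2, ["extract"]),
        "report":  (3, ["load", "backup"]),
        "backup":  (1, []),
    }

    def run_schedule(jobs):
        """Run jobs in priority order once their prerequisites are done.
        Assumes the prerequisite graph has no cycles."""
        done = set()
        while len(done) < len(jobs):
            ready = [name for name, (prio, prereqs) in jobs.items()
                     if name not in done and all(p in done for p in prereqs)]
            if not ready:
                raise ValueError("unsatisfiable or circular prerequisites")
            ready.sort(key=lambda name: jobs[name][0])   # honor job priority
            for name in ready:
                print("running", name)
                done.add(name)

    run_schedule(jobs)
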
Sys*ADMIRAL, from TIDAL Software, has a rich set of functions that operate on a
broad set of platforms--UNIX, Windows, AS/400, and OS/390 machines. Its wide
platform support makes seamless management of applications possible using a centralized
approach. Sys*ADMIRAL also has integration with applications including SAP,
PeopleSoft, and Oracle applications (TIDAL Software, 2001). For help desk, another
focus of operations-management software, a large number of products are available.
Footprints, a product from UniPress Software, was 100% Web-based. Its features
provided the ability to centralize tracking, improve workflow, and enable Web-based
collaboration (UniPress Software, 2001).
SDS HelpDesk V4, from SDS HelpDesk Solutions, was a feature-laden tool built for
the Windows platform. It utilized Microsoft Access, a database program, to support issue
management, service contracts, contract management, time tracking, work-group
management, and reporting (SDS HelpDesk Software, 2001). There are several
alternatives to these traditional software products. Host Help Desk, from Hostedware
Corporation, is a Web-based product that is completely hosted by the software vendor--
there is no hardware to purchase (Hosted Help Desk, 2000). Another alternative is open-
source software. FREE DESK, offered through http://freedesk.wlu.edu, is completely free
and can be used with few stipulations (FREE DESK, 2000).
Many backup/restore products are available to support operations management.
NovaStor Backup software is a solution for Windows, Novell, DOS, and Mac platforms
(NovaStor, 2001). Amanda Backup Software, an open-source archiving and compression
program from the University of Maryland, can be used to administer a local-area network
from a single master backup software tool (Amanda, 2002). UNIX backup solutions are
available from Syncsort, a company that got its start by creating a high-performance
mainframe sort program (UNIX and Windows, 2001). For output management, the final
operations-management area, there is a rich collection of products. TUSS, from Square
Software, is a program that makes printers symbolically available for both TCP/IP and
Windows/NT networks. This NT-based tool supports printers on a wide variety of OS
platforms (TUSS System, 2001).
Whereas TUSS is a generalized tool, other products like SAP Output Management
from Cypress Software target the specific needs of the SAP application and operating
environment (SAP Output Management, 2001). StreamServe Corporation, like Cypress,
has output-management tools for specific applications. StreamServe has modules for
Oracle and QAD, as well as SAP (StreamServe Overview, 2001).
Performance

Performance is one of the OSI SMFAs. It is a key component of the ISO Reference
Model Entities (Modiri, 1991) and has as its focus the effectiveness of communication
activities. Udupa (1996) pointed out that through performance reports, the utilization of a
station can be observed. Once detected, performance problems can be addressed by adding
capacity or making other adjustments. The ITIL literature does not single out performance
as a discipline; however, it is identified as a key part of capacity management (Bladergroen
et al., 1998). It is not unusual to find performance and capacity in close association. The
IBM IT Process model also links performance and capacity management. The Manage
Performance and Capacity process is part of the Support IT Service and Solutions Process
Group (Harikian et al., 1996).
The performance literature is vast! Performance management topics offer researchers
many opportunities to create and test models. Rhee, Park, and Kim (2000) proposed a
heuristic connection-management approach that maximizes the use of key server
resources. The work was started when the HTTP 1.1 standard reduced the closing and
reestablishing of connections by supporting persistent connections as a default. As with
some of the other functional perspectives, the literature has examples of the Web as a new
area of focus for performance management. Ahn, Yoo, & Chung (1999) focused on the
analysis of data from Internet networks (TCP/IP) in conjunction with other Web
technologies like Java. A Web-based tool was created to view and analyze the data
collected in the MIB. Goedicke and Meyer (1999) focused their research on a lightweight
approach to using multiple distributed collaborating agents to improve the real-time
performance of Web-based applications.
There are many performance products in the marketplace. Jander (1998) surveyed the
types of products available. Some of these products measured the speed of traffic,
whereas others simulated traffic; still others ran from within the
application itself. This last approach is what the industry calls intrusive. BMC Software's products
include Patrol and Best/1 Performance Assurance Series. These tools gather data and
provide performance reports that can be used by performance professionals to detect
problems and to assist longer-term with capacity planning activities (Patrol 2000, 2000).
Candle Corporation has legacy mainframe products like Omegamon. Recently, they have
developed tools like ETEWatch for the performance management of distributed systems.
This tool measures application performance from a customer perspective. ETEWatch has
many features including application response-time monitoring, real-time alerts, and
application usage reports (CandleNet ETEWatch, n.d.).
Tivoli Systems had a performance-monitoring tool that used the ARM API to gather
data about the application's performance. ARM is an example of an intrusive approach.
With ARM, the application is instrumented to include API calls that interface directly
with the management system to create response-time data (System Management:
Application Response, 1998). Measuring and improving the performance of Web sites
has been a focus as more and more companies are engaged in commerce over the Web.
Keynote Perspective is a service offering available from the Internet Performance
Authority that can be used for simple performance measurement and for diagnosing
performance problems. The Internet Performance Authority has probes all over the globe
that collect real-time performance data for the customers that pay for the service. Daily
reports are emailed to the administrators of the Web applications, who can also view
Web-based performance reports (Keynote Systems Services, 2000).
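
Whatever the product--probe-based like Keynote's service or instrumented like ARM--the underlying measurement is the same: bracket a unit of work with a start and a stop and record the elapsed time. The Python sketch below illustrates the bracketing pattern; the transaction name and workload are placeholders, and the real ARM interface is a C API with its own call sequence.

    import time

    measurements = []          # (transaction name, elapsed seconds)

    def timed(name, work):
        """Run one transaction, recording its response time."""
        start = time.monotonic()       # analogous to an ARM start call
        try:
            return work()
        finally:
            measurements.append((name, time.monotonic() - start))

    timed("checkout", lambda: time.sleep(0.05))   # placeholder workload
    for name, elapsed in measurements:
        print("%s took %.0f ms" % (name, elapsed * 1000))
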
Problem

Problem management is not one of the OSI SMFAs. However, managing problems is
an action that is commonly associated with networks that are the focus of the OSI SMFAs.
For ITIL, problem management is an important service support element. ITIL is
concerned with overall IT service management; therefore, managing problems is important
because timely correction of problems is a typical user expectation (Bladergroen et al.,
1998). The IBM IT Process model includes a significant focus on problem management.
Manage Problems is part of the Support IT Service and Solutions Process Group. For the
IT Process Model, the goals are to reduce problem quantity, impact, and costs (Harikian et
al., 1996). Problem management became a focus when companies started to depend on IT
services to run their businesses. Minimizing the impact of problems became a goal that
was measured (and still is) in many organizations.
The problem-management literature contains some research on projects that are using
the Web for problem-solving research. For example, Hellerstein, Zhang, and
Shahabuddin (1998) described a systematic, statistical approach to characterizing normal
system operation. The researchers' interface to problem management used the
characterizations to remove known behavior thus better detecting anomalies. These
anomalies are the true problems to be managed. In general, however, little research is
being done to improve problem management as it is a mature discipline. Kundtz (1996)
found that companies could realize a greater return on investment from a helpdesk solution if
they applied the business process method to the implementation of the problem
management process. Talluru and Deshmukh (1995) took a knowledge-based approach as
they created a problem-management model in the context of a decision support system.
The model was built using Prolog and Visual Basic and utilized a natural-language
interface.
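
The statistical idea can be sketched simply: characterize normal operation by the mean and standard deviation of a metric, then flag observations that fall outside a band of a few standard deviations. The Python fragment below illustrates this with hypothetical baseline numbers; it is not a reconstruction of the authors' method, which is considerably more sophisticated.

    import statistics

    # Historical observations of a metric during known-normal operation.
    baseline = [102, 98, 97, 105, 99, 101, 103, 100, 96, 104]
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)

    def is_anomaly(value, sigmas=3.0):
        """True when a value lies outside the normal band."""
        return abs(value - mean) > sigmas * sd

    for value in (101, 99, 140):
        print(value, "anomaly" if is_anomaly(value) else "normal")
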
The marketplace offers many problem-management products. Some products offer
problem-management as one of many integrated functional perspectives. Impact, from
Allen Systems, offers problem management as well as service desk, change, asset, and
service level management--all in the same product. The software also has automation
capabilities that can be used to detect and resolve problems (Products - ASG, 2001). The
Action Request System software from Remedy Corporation offers problem management
in the context of a workflow automation tool (Remedy Action Request, 2001).
Compuware offers VantageView, which is part of a family of products called Vantage.
VantageView is focused on the disciplines of availability, performance, and serviceability.
Problem management is improved by rapid detection and recovery of application failures
(Compuware Vantage, 2002). Support.com is an example of a non-traditional problem-
management tool. The support.com healing system utilized probe technology to detect and
repair problems without user intervention. Support.com's problem-management assistance
is provided by its remote help desk. The product is both software and remote service
(Help desk, 2001).
Security

Security management is one of the OSI SMFAs. Udupa (1996) pointed out that
security management has become an important issue because of the thousands of
workstations that have come about due to the developments associated with distributed
computing. Security management was not identified as an ITIL service support element
(Bladergroen et al., 1998). It is not even mentioned in the basic literature. The IBM IT
Process model includes Manage IT Security as part of the Manage IT Assets and
Infrastructure Group. Three important areas were identified for security--security within a
local system, security of distributed processes and data, and security of networks and
communications (Harikian et al., 1996).
The literature on security has vitality. Security research regarding the Web is
significant in size and scope as it is generally recognized that security concerns are one of
the major inhibitors to the use of the Internet for commerce (Rubin, Geer, & Ranum,
1997). Its importance is so great that security training is required for the growing
profession. Horrocks (2001) noted that the Security Industries Training Organization, the
International Institute of Security, and the Society for Industrial Security Safety are the
leaders in the development of education for professionals. He noted that some in the IT
industry do not yet believe that security management is a profession.
Researchers are exploring many aspects of the security-management challenge.
Barruffi, Milano, and Montanari (2001) focused on intrusion-detection systems. Their
tool, called PlanNet, was a constraint-based system that used artificial intelligence
techniques to perform security-management functions in a network of computer systems.
Devices are not the only focus in the security literature. Eloff and Von Solms (2000)
focused on information, not devices or networks. Security management is the focus of
both industry and government. Verton (2000) explained what the Department of Defense
(DOD) was doing in the area of security training. The DOD offered more than 1,044 IT
and security-related courses at the time of the paper.
There are a large number of security products and many of them are updated frequently
to keep up with security threats. In April 2001, Symantec software, maker of Norton
AntiVirus and Enterprise Security Manager, updated its programs to deal with a false
digital certificate. The certificate, which was issued in error by VeriSign, was given to an
individual who fraudulently claimed to be a representative of Microsoft Corporation. The
story was documented in a Software Industry Report (Symantec First to Provide, 2001).
Other security changes and updates are needed because of new technologies. Check Point
software, a company that makes firewall software, has had to improve its firewall security
because of the emergence of Virtual Private Networks (VPNs). The emergence of VPNs
as an alternative to older forms of back-end connectivity has caused vendors of software
for key devices like firewalls to expand security policies to handle the unique security
exposures associated with these types of networks (Krapf, 2001). Technologies like VPNs
emerge rapidly and create challenges for the security software and consulting
communities and the customers that they support.
Service Level

The service level functional perspective has as its focus the quality of IT services
provided to the users of the computer system. Service level is described in the Merit
Project Report (Best Practices, 1998). It is a key part of the Information Systems
Management Architecture Processes (Harikian et al., 1996) and Information Technology
Infrastructure Library Services (Bladergroen et al., 1998). Both are methodologies that
focus on providing comprehensive management services. Service level management is
also part of Integrated System Management--software and services from Groupe Bull.
Integrated System Management defines service management as the handling of resources
like central and distributed processors, networks, and related technologies. It also
includes services like voice, data, and video. Both resources and services are managed
using a service level agreement that is focused on the client's business and expectations.
The service level agreement measures the performance while budgets measure the cost
(Miller, 1994). Service level agreements are part of a broader concept called service level
management. Lewis and Ray (1999) described a framework that can serve as a baseline
against which one can situate and evaluate service-level agreement proposals. The
framework integrates business processes, service, service parameters, service level,
service level profile, service level agreement, and service level management into a
coherent system. In this framework, a service level "is some mark by which to qualify
acceptability of a service parameter" (Lewis & Ray, 1999, p.1974). The authors noted that
the marking schemes can be binary (language like "is acceptable only if never more than
40%") or fuzzy (language like "is acceptable only if it is very good to excellent"). Service
level management is sometimes administered in very specialized domains. Puka, Penna,
and Prodocimo (2000) wrote about a management system for ATM networks that is
tightly aligned with technology characteristics like quality of service parameters and
ATM traffic management specifications. The service level model created by the authors
could be applied to other networking technologies, but is not a good match for the
management of applications.
For Web applications, both service level agreements and service level objectives apply.
Among Web hosting service providers, most contracts specify service level objectives.
An example of the language used might include the following: "Web hosting provider strives to make
the Web Hosting Environment available for access by Internet users and the customer’s
authorized representatives or agents 24 hours per day, each day of the year, except during
periods of scheduled maintenance. Web hosting provider's availability objective for the
Web Hosting Environment is less than four hours per calendar month of downtime,
subject to specific exclusions" (Universal Server Farm, 2000, p. 17). Although the
language is specific, it is still just an objective and carries with it no penalties for not
meeting the objectives.
Service level agreements contain very specific language that often includes a penalty
notice for failure to perform. UUNet, an Internet Service Provider, has a powerful SLA.
The commitments are in three areas--network quality ("the latency (speed) of core
sections of UUNET's network will not fall below specified levels"), service quality ("the
UUNET network will be available 100% of the time"), and customer care quality (should
a fault cause the network to become unavailable, UUNET will notify the customer within
specified timescales. In certain countries, circuit install time commitments are also
available"). UUNet also states "should these specified levels of service fail to be
achieved, UUNET will credit the customer's account" (Service Level Agreements, 2001,
p.1).
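
Evaluating either an objective or an agreement reduces to comparing measured downtime against the stated level and, for an agreement, computing the credit. The Python sketch below uses the four-hour monthly objective quoted above; the outage log and credit schedule are hypothetical.

    # Outage log for one calendar month: minutes of downtime per incident.
    outages = [35, 90, 140]

    OBJECTIVE_MINUTES = 4 * 60       # "less than four hours per month"
    CREDIT_PER_EXCESS_HOUR = 500.00  # hypothetical penalty schedule

    downtime = sum(outages)
    met = downtime < OBJECTIVE_MINUTES
    print("downtime %d min; objective %s" % (downtime, "met" if met else "missed"))

    if not met:
        excess_hours = (downtime - OBJECTIVE_MINUTES) / 60.0
        print("credit owed: $%.2f" % (excess_hours * CREDIT_PER_EXCESS_HOUR))
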
Software Distribution

Software distribution is about getting files from one place to another. It is made
complicated by different machine configurations running different operating systems,
brief update time windows, and frequency of software maintenance and updates (TME 10
Software Distribution, 1998). Software distribution became an important functional
perspective in the 1990s after de-centralization of computing resources became an
established trend. At that time, reliable distribution of software was accomplished using
different approaches like manual distribution, ad hoc solutions, and electronic
distribution. Electronic distribution became popular because many distributions involved
a broad geography and required timeliness of execution (Vangala, Cripps, & Varadarajan,
1992).
Software distribution is a focus of POSIX. Its software administration standard
includes utilities that facilitate software distribution including swcopy (copy distribution),
swpackage (package distribution), and swverify (verify software). These utilities along
with software structures like bundles and filesets make standardized software distribution,
used for software installation, possible (Information Technology - Portable, 1995).
Software distribution is used by ITIL as part of its Service Support Set. It is a peer activity
to change, configuration, helpdesk, and problem (Bladergroen et al., 1998). Software
distribution has been the focus of many research activities. Gumbold (1996) described
software distribution by reliable multicast that involves an end-to-end application layer
protocol built on top of a thin transport layer (UDP) and a best effort network layer
multicast service (IP). Osel and Gansheimer (1995) described the use of the OpenDist
toolset to synchronize file servers. They described the difficulties associated with
performing daily updates on a large number of servers. Since network bandwidth can be
an important consideration, data compression for software distribution can help lessen the
impact of large distributions and improve the effectiveness of the management activity
(Tong, 1996).
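
A distribution pipeline combines steps that are easy to sketch: compress the package to lessen bandwidth impact, copy it to the target, and verify the copy with a checksum--the role the POSIX swverify utility plays. The Python fragment below performs these steps on local files; the paths are illustrative.

    import hashlib
    import os
    import shutil
    import tempfile
    import zlib

    def checksum(path):
        """SHA-1 digest of a file, used to verify a copy."""
        digest = hashlib.sha1()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def distribute(source, target):
        """Copy a package to a target and verify the copy by checksum."""
        shutil.copyfile(source, target)
        if checksum(source) != checksum(target):
            raise IOError("distribution to %s failed verification" % target)

    # Compression lessens the bandwidth impact of large distributions.
    package = __file__                  # this script stands in for a package
    data = open(package, "rb").read()
    print("compresses to %d%% of original size"
          % (100 * len(zlib.compress(data)) // len(data)))

    target = os.path.join(tempfile.gettempdir(), "package.copy")
    distribute(package, target)
    print("verified copy at", target)
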
Start and Patel (1995) focused their work on the distribution of telecommunications
service software. Software distribution is the focus for software companies like Tivoli
Systems. This company has software that is designed to distribute an application to its
clients and servers (TME 10 Software Distribution, 1998). Tivoli's software distribution
service is also used to support system-management products by distributing their elements
like commands and monitors. The popularity of the Web has created an explosion in
software-distribution utilities. Levitt (1997) discussed nine products that stream updated
Web-page content from a Web server to a browser. The products are used to deliver news
and information.
The Contribution This Study Makes to the Field

This study makes a contribution to the field of systems management in three significant
ways. This study expands knowledge and capability in the full life-cycle management of
applications, it provides the design of an innovative toolset for the management of
applications, and it expands the capabilities of 15 key functional perspectives in the area
of application management.
Expand Knowledge and Capability in Full Life-Cycle Management of Applications

One focus of the researcher in this study was on applications-management support for
the full life cycle of the application. The life cycle includes design, construction,
deployment, operation, and change. Creating an application with management support has
considerations throughout its design, construction, deployment, and operation.
Management support is also a consideration when the application undergoes change.
Design, construction and operation activities were described by Kramer, Magee, Ng, and
Sloman (1993). It is helpful to explain the application operation phase by using a concept
called the application stack, which is shown in Table 8.
Table 8. The Applications Dependency Stack and Application-Management Support

Stack component: Application
Description: The programs and processes that make up the application.
Application-management support: Deployment of the application; monitoring and operation of application tasks.

Stack component: Database
Description: The files and database used by the application.
Application-management support: Monitoring and operation of database resources that are used by the application.

Stack component: Network
Description: The network components, like protocols and services, used by the application.
Application-management support: Monitoring and operation of network resources that support the application.

Stack component: Operating System
Description: The operating system and its services used by the application.
Application-management support: Monitoring and operation of operating-system resources that support the application.
This construct provides a framework for the monitoring of the application and its
components. The application stack was described by Hurwitz (1996) and is a collection
of resources that provide support to the application. The main stack components are the
operating system, network, database, and the application itself.
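
Monitoring against the stack is straightforward to sketch: examine the stack bottom-up, and let the first failing layer localize the problem, since every layer above depends on it. In the Python fragment below the layer checks are placeholder functions, with a simulated database outage.

    # Bottom-up checks for each stack layer (placeholder functions).
    stack_checks = [
        ("operating system", lambda: True),
        ("network",          lambda: True),
        ("database",         lambda: False),   # simulated database outage
        ("application",      lambda: True),
    ]

    def diagnose(checks):
        """Return the lowest failing layer, or None if all layers pass."""
        for layer, healthy in checks:
            if not healthy():
                return layer
        return None

    failing = diagnose(stack_checks)
    print("all layers healthy" if failing is None
          else "problem localized to the %s layer" % failing)
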
Change is a phase that has activities that are used to coordinate the handling of
modifications to the application after it has been deployed. These activities are often
assisted by software. An example of this software support is CONTROL from Eventus
Software. This product is designed to address Web application management challenges
and has strong change-management capabilities including real-time Web application
metadata. This metadata includes information on link-management problems and
deployment history, as well as the ability to check in and check out application content.
Also supported are move, rename, and delete capabilities (CONTROL Overview, 1999).
These capabilities provide basic support for a Web application's change-management
process. The toolset design and prototype implementation will expand the body of
knowledge in all life-cycle areas.
Provide the Design of an Innovative Toolset for the Management of Applications

The toolset is an innovative exploration of the management of applications. Toolset
components include procedures that are used to give the task-oriented steps to complete
application-management activities. Programs are also part of this management toolset.
They perform task-oriented activities with little or no human intervention. Views are also a
key part of the toolset. Views are used to assist system administrators and operators in
visualizing activities like deploying a new application or monitoring a running
application. Schema and Data/Information are tightly linked as components of the toolset.
Schema defines the application management data and information that is stored in the
MIR. This innovative approach expands the knowledge and practice associated with the
management of applications.
Expand the Capabilities of 15 Key Functional Perspectives in Application Management
Fifteen functional perspectives were explored deeply in this study. The perspectives
included Accounting, Administration, Automation, Availability, Business, Capacity,
Change, Configuration, Fault, Operations, Performance, Problem, Security, Service Level,
and Software Distribution. Many of these functional perspectives are legacy network and
systems management disciplines. Some of these perspectives have received recent focus
with applications. Some software vendors, for example Continuus Corporation, provide
powerful change and configuration software backed by professional-services personnel
who work directly with customers, write detailed papers, and present at conferences to
bring their story to the marketplace. This study
expands the uses of the functional perspectives and links them directly to the management
of applications.
Integrate with Existing Products in a Seamless Fashion
The toolset is part of a computing environment that consists of two domains. Figure 12
shows the application and the management domains. The application domain consists of
application clients and servers running Web applications. These applications use a
browser as a front-end Web interface for HTML documents. Often these applications
use back-end database systems rooted in legacy, sometimes mainframe systems (Turner,
1998). The management domain exists to provide management support to the
applications to make it easier to deploy and change them. This management infrastructure
can also help improve the application's availability. The management domain contains
management clients that are used to gather availability and performance data. An example
of a product implementation is Global Enterprise Manager (Tivoli Global Enterprise,
1998). Typically, management clients can also issue commands and receive responses.
Management servers provide support for this function.
[Figure 12 shows the application domain (application clients with browser, Archie, Gopher, and other front ends, connected to application servers) and the management domain (a management client and a management server hosting the toolset for Web applications management and the MIR). Arrows indicate the flows between the domains: issue commands and receive responses, deploy and pull instrumentation, store instrumentation, and install tools.]

Figure 12. The toolset and its relationship to the management and application domains
The toolset integrates with existing management servers and frameworks. It also uses a
Management Information Repository. The MIR provides database support for the
management applications and supports their integration into a single management
environment (Martin, 1996). The database support for this toolset was a relational database
that supports SQL. The toolset works with the management servers to install application
instrumentation in a push or pull fashion. It also stores the instrumentation and other
management information like message logs in the MIR.
Summary
This chapter contained a review of the literature focused on systems, network, and
applications management. The systems and network management literature provides a
launching point for defining, redefining, and expanding the role of applications
management. The historical overview discussed applications management as an emerging
discipline, the history of applications management, and major research efforts and projects
in the area of the management of applications. The theory portion of the survey explained
management infrastructure such as alerts and toolkits, management standards such as
CORBA and SNMP, management information repositories, and classes of products such
as point or framework. Many standards efforts carried out by the IETF and DMTF are
focused on aspects of applications management, but no comprehensive research project,
consulting methodology or vendor product is available that addresses full life-cycle
management of applications. Some groups and companies have placed focus on the
management of applications during key life-cycle phases like operations and change. ITIL
and the IBM IT Process Model are examples of methodologies that focus on these phases.
This chapter also contained a summary of what is known and unknown about the
management of applications, organized by functional perspectives, starting with
accounting and administration and ending with service level and software distribution.
Some vendor products provide strong availability support for the application. BMC Patrol
has knowledge modules that provide deep availability support for a wide variety of
applications like SAP and PeopleSoft. Many other products support key functional
perspectives like configuration, change, problem, security, and service level.
The last major part of this chapter discussed the contribution that this study makes to
the field of applications management. The main contributions are in the areas of
expanding knowledge and capability in full life cycle management, providing the design
of an innovative toolset, expanding the capabilities of 15 functional perspectives such as
accounting and service level, and integrating with existing management products. The
design and prototype work for this toolset is built upon the work of researchers,
consultants, and software vendors.
Chapter 3
Methodology

Research Methods Employed
This study involved the design and implementation of a prototype toolset. The scope of
the design included procedures, programs, views, schema, and data for the research
questions identified in Chapter 1. The implementation of the prototype included the five
detailed scenarios that are documented below and supported the Toolset Evaluation
Survey in Appendix B. This project used ideas from Joint Application Design and Rapid
Application Development methodologies. JAD was used to devise and validate a list of
toolset components that were developed. RAD was used to determine the manner that the
prototype toolset components were developed. The use of these two techniques fostered
an effective design and rapid development experience. After the toolset was designed and
implemented, it was evaluated using data collected from 33 participants.
Specific Procedures Employed
This research was completed using three main steps. During the design step, a
comprehensive design of the toolset was created. Next, the prototype toolset was created
and tested during the implementation step. Finally, the toolset was assessed during the
evaluation step. Each of these steps is explained in detail in the paragraphs that follow.
Design the Toolset
Using JAD techniques described in Chin (1995), a comprehensive design of the toolset
was created. The design was created through JAD sessions conducted by the researcher
with participants skilled in application, middleware, and database management. The JAD
technique, as reported by Purvis and Sambamurthy (1997), was used because that
methodology was expected to promote richer design interactions among the participants
during the design process. Jackson and Embley (1996) incorporated the use of an analysis
model, specification language, and tools to create a JAD variant with a technology focus.
Structured analysis techniques were used to explore and analyze the requirements for
the prototype toolset using the research questions documented in the section labeled
Hypotheses and Research Questions to Be Investigated in Chapter 1. Specifically, there
were twenty-three research questions that were explored during the design step. These
research questions served as a starting point for the design and were expanded upon as
required. The design sessions were conducted using a series of conference calls. The
sessions leveraged an electronic TeamRoom that was a repository for all design step work
products. The TeamRoom fostered collaboration among the participants. The TeamRoom
was the repository of documents in the following five key categories:
1. Design meeting agenda
Two design meetings were conducted. Each meeting had a detailed agenda. The focus
of the first meeting was to introduce the team to the basic project concepts and to review
the primary and secondary research questions. The research questions were used to
identify toolset components that were later designed. The second meeting was used to
continue the work started in the first session and to complete the conversion of research
questions into toolset components. At the completion of the design meetings, a
comprehensive list of toolset components was identified. These programs, procedures, and
views were the toolset components that were designed by this researcher in a detailed
manner.
2. Design meeting minutes
Minutes were documented after each of the two design meetings and distributed for
review to the design team. The minutes contained a record of the detailed ideas and
recommendations that were discussed during the meetings. The minutes served as a tool to
connect the meetings until the preliminary toolset components were completely identified.
3. Design documentation for the toolset
The design documentation consisted of one document in the TeamRoom for each
toolset component. This design documentation was created by the researcher and
reviewed by the design team until it reached its final form. The researcher collaborated
on the design with the design team members and typically received input from at least one
design team member for each component before it was finalized. A summary table for
each type of toolset component was also created. The summary table included all the
toolset components and was organized by subsystem within functional perspective, as this
was the main organizational approach used for the development of the toolset.
4. Comprehensive design document
After the toolset component documentation was completed, it was organized into a
single comprehensive document. The document was called Comprehensive Design for a
Prototype Toolset for Full Life-Cycle Management of Web-Based Applications. This
design document is summarized in Chapter 4. The design document included the detailed
design of the toolset components as well as other information that is usually found in a
design document including system background, system overview and technical
architecture (External Design, 1996).
5. Log of key correspondence and other work products
A log of important e-mail correspondence was also stored in the TeamRoom (see Table
9). This included copies of e-mail associated with the design of specific toolset
components and other intermediate work products associated with the design of the
toolset.
Table 9. Proposed Category and Documents for the TeamRoom

Design meeting agenda: Agenda for the first design session and the second design session.
Design meeting minutes: Minutes from the first and second design sessions.
Design documentation for the toolset: Summary and detailed documents for programs, procedures, views, schema, and data.
Log of key correspondence and other work products: E-mail correspondence associated with toolset elements, the presentation used for the first design meeting, and the presentation used for the second design meeting.
Additional design work products were stored there including materials used for the design
meetings and reviews. A summary of the categories used and key document names are
shown in Table 9.
After the design was completed, the next step was to transform the design into logical
entities called application segments. Segmentation is focused on function and sequence
and allows the application to be used in the order delivered (Hough, 1993). A segment
strategy was developed because it was needed to determine the order in which the
prototype toolset components were developed. This was particularly important because
only a subset of the toolset components designed were developed in a prototype manner.
The subset of toolset components were those toolset components necessary to complete
the toolset evaluation goals of the project. Application segmentation provided a way to
organize the components that were developed into groups that were straightforward to
develop, test, and deploy. Hough (1993) recommended that the user interface be built first
to reduce the risk associated with changing requirements. This advice was followed and
other activities like database design followed the development of the user interface. The
segment strategy is discussed in Chapter 4.
Implement the Toolset
The first implementation step was to create a plan for the development steps. The
researcher was the primary developer of the prototype toolset so an elaborate plan was not
required. However, the importance of project planning and management in the success of
a project is well documented in the literature (Cleland & Gareis, 1994) so this step was
not eliminated. Next, RAD techniques, as described by Hough (1993) were used to
develop, deliver, and integrate the procedures, programs, views, schema, and data for the
prototype toolset. Initially, it was anticipated that the toolkit programs would be
developed using scripting languages to handle grouping of commands and platform-
dependent operations. It was anticipated that a scripting language such as Awk would be
used (Gilly, 1994). It was also expected that some programs would be developed in Java
because Java is built on familiar constructs and is a portable language (Jackson &
McClellan, 1996). Almost no scripting was required to complete the implementation
work because of the built-in capabilities of the database software and the HTML editor
used to build the toolset. Procedures were developed using plain text and placed directly
into the HTML views. For a commercial implementation of this toolset they would be
placed in the Application MIR. It was originally anticipated that the toolset views would
be prototyped using a NetView client system. Although NetView is a network-
management tool, the graphical user interface had the flexibility necessary to build
application views. This approach was discarded and an HTML editor was substituted so
that the browser that is built into every computer system today could be used. Schema was
developed using relational database software. The dictionary for the schema was
documented and can be found in Appendix H. Data and information were stored in tables
and accessed using SQL. The toolkit components of each segment were tested
in a unit fashion as testing is an important step in the development process (Kroenke &
Dolan, 1987). The toolset components developed to support the five scenarios and related
toolset evaluation are summarized in Chapter 4.
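As an illustration of this style of schema development, the statement below sketches a simplified log table of the kind a relational MIR could hold. It assumes the Microsoft Access SQL dialect, in which COUNTER is the AutoNumber type; the table and column names are illustrative and are not taken from the schema dictionary in Appendix H.

    -- Illustrative only; the actual schema dictionary appears in Appendix H.
    CREATE TABLE EventLog (
        event_id       COUNTER CONSTRAINT pk_event PRIMARY KEY,  -- entry-sequenced key
        application_id TEXT(32) NOT NULL,   -- ties the row to an Application Definition
        event_time     DATETIME NOT NULL,
        severity       TEXT(8),             -- for example 'INFO', 'WARN', or 'FAULT'
        message        TEXT(255)            -- output message text
    );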
Evaluate the Toolset
Evaluation of the toolset was an important task in this project. The evaluation
methodology for the project was based on concepts adapted from Boloix and Robillard
(1995). The researchers' article explained an evaluation approach that involved three
dimensions--project, system, and environment. The project dimension characterized
project efficiency using three factors. The factors were process, agent, and tool. Since the
development of the toolset was not a large development effort involving many
individuals, the project dimension was not used in this project's evaluation survey.
The system dimension focused on the product, its performance, and the technology it
utilizes. The first question in the evaluation survey for this project was derived from the
product factor. The product factor was used to assess the intrinsic software system
attributes regarding product understandability. The first survey question was:
1. Which best characterizes how easy it was to understand how the toolset handles this
scenario?
_ A lot of effort to understand
_ A moderate amount of effort to understand
_ A minimum effort to understand
The second question in the evaluation survey for this project was derived from the
technology factor. The technology factor was used to assess how well operators and
system administrators have mastered the technology used by the system. The second
survey question was:
2. Which best characterizes the level of sophistication of the toolset in the way it
handled this scenario?
_ Low
_ Sufficient
_ High
The last dimension discussed in the paper was the environment dimension. This
dimension focused on the level of satisfaction with the software system and the
contribution that the system can make to the organization. An important factor is
compliance, which measures how well the software system meets requirements. The third
survey question in the evaluation survey for this project was derived from the compliance
factor. The third survey question was:
3. Which best characterizes how well the toolset met the requirements of handling this
scenario?
_ Partially fulfills requirements
_ Meets requirements
_ Completely fulfills requirements
Usability assesses the adequacy and learnability of the software system. The fourth survey
question in the evaluation survey for this project was derived from the usability factor.
The fourth question was:
4. Which best characterizes how usable the toolset was when handling this scenario?
_ Not easy to understand
_ Easy to understand, but there are some usability concerns
_ User friendly and efficient to use
Contribution assesses the benefit of the system to the organization. The fifth question in
the evaluation survey for this project was derived from the contribution factor. The fifth
question was:
5. Which best characterizes the impact that the toolset might have on the organization
because of the way it handled this scenario?
_ No major impact on the users and their productivity
_ Will have an impact, but improvements are needed
_ Will have a major impact
The evaluation survey for this study was reviewed and approved by the IRB
representative (J. Cannady, personal communication, September 24, 2001). The complete
survey can be found in Appendix B. Appendix C contains the Institutional Review Board
Documents including the letter of approval and consent form that was used with each
participant. The survey was administered to personnel who are familiar with the operation
and management of Web sites. Prior to completing the survey, participants were shown
a collection of materials for each scenario, such as Web application operational fault, and
were then asked to complete a series of questions. The materials were taken directly from
the toolset procedures, program outputs, views, and data. The five scenarios included:
1. Web application operational fault
In this scenario, the Web application experiences a database failure and an event is
generated and captured by the toolset. The failing application has been instrumented to
invoke a toolset program that gathers failure data and makes it available to the system
administrator and support personnel.
2. Web application deployment is unsuccessful
In this scenario, the deployment of the Web application is initiated by the
administrator. The deployment is unsuccessful and the failure is detected by the toolset.
After detection, the toolset procedure guides the administrator through the steps to resolve
the problem and transfers the fault to the problem-management system as a closed
problem.
3. Web application change results in poor performance
In this scenario, new functionality is installed for a Web application. After the change,
the new function is operational, but poor application performance results. The
administrator uses toolset views to get a clear understanding of the problem and transfers
the problem to the development team for resolution.
4. Web application experiencing bottlenecks as some queries take a long time
In this scenario, certain inquiry functions of the Web application are taking a long time
to complete. The toolset is used to detect the database functions that were performing
poorly and the problem is transferred to the developers who must change the application
source code or modify the underlying database structure to improve the performance of
the SQL commands.
5. Overall response for the Web application is slow, but the application is still functional
In this scenario, the Web application is performing slowly, but all components are
available. The toolset's deep availability capability is used to determine the root cause of
the overall poor performance. In this scenario, several simultaneous problems are the
cause of the slow overall response.
Formats for Presenting Results
The results of this study are included in four main work products. The first work
product is the primary design document of this project called Comprehensive Design for a
Prototype Toolset for Full Life-Cycle Management of Web-Based Applications. This
document is summarized in Chapter 4 of this Final Dissertation Report. This document
contains the design for all toolset components that resulted from the primary and
secondary research questions. The second work product, the segment strategy, which
explains what components were developed and in what order, is also included in Chapter
4. The segment strategy narrative was an important RAD work product in this project.
The third work product is the toolset components developed to support the five
scenarios and related toolset evaluation. These items are also summarized in Chapter 4.
The fourth and last work product of this study is the analysis of the data collected during
the toolset evaluation survey. The survey contained five questions that were administered
to participants each time they reviewed one of the five scenarios. The data for the survey
are also discussed in Chapters 4 and 5.
Projected Outcomes
It is expected that the toolset components that resulted from this study will be used as
the basis for several service capabilities to be offered as part of IBM's Web Hosting
offerings. These offerings will focus on improving the availability of Web applications,
middleware, and database components of a customer's Web site. The work to improve the
availability of components high in the application dependency stack (Hurwitz, 1996) is
also expected to be linked to problem-determination procedures and tools.
It is also expected that results of this research will continue to generate technical
papers that will improve the way that Web sites are monitored and managed. Early in
2001, two papers were published that were the direct result of this research. Gulla and
Hankins (2001) defined a framework that can be used to evaluate the quality and
completeness of the monitoring and management of a Web site. The approach, which was
supported by a methodology, was based on a series of "perspectives" that incorporated a
comprehensive view of tools, processes, organizational structure, and staff skills. In
another paper, Gulla and Siebert (2001) explained an activity that makes it possible to
plan for the successful implementation of monitoring for a customer's Web site. The
method, called Monitoring Implementation Planning, was put into practice in the South
Service Delivery Center for fully managed Universal Server Farm customers.
In 2002, two other papers were published based on ideas from this research. Ahrens,
Birkner, Gulla, and McKay (2002) documented case studies in Web application
availability and problem determination. The simple tools used in those case studies are
similar to the toolset programs and procedures used for the full life-cycle prototype. Gulla
and Hankins (2002) expanded the ideas contained in Figure 4, applications management
as part of a comprehensive approach, into a framework to address the challenges of
managing high availability environments.
Resource Requirements
The facilities that were used to complete the dissertation included hardware, software,
data, procedures, and people. The method used to describe the required resources is based
on Kroenke and Dolan (1987).
Hardware
The hardware that was used to complete the project included a personal computer to
develop and run the toolset and function as both an application and management client
(see Table 10). The personal computer was a ThinkPad 760 EL machine with 64
megabytes of RAM and a 2.1 Gigabyte disk (IBM ThinkPad, 1996). A Sun workstation
was used as a management server. The Sun workstation was a SPARCstation 5 with 64
megabytes of RAM and a 2-Gigabyte disk (SPARCstation 5, 1996).
Table 10. Hardware Used for the Creation of the Toolset

ThinkPad: used for unit development and testing; role of management client and server.
Sun workstation: used for integration and testing; role of management server and application server.

Software
A variety of software was used in this dissertation project. A summary of the software
used is shown in Table 11. Microsoft Word was used to document requirements and to
create design documentation. For toolset design, Microsoft PowerPoint was used to create
drawings that were imbedded in Word documents. For toolset development, a Web
development tool which included an HTML editor was used. For database development,
database utilities that are part of Microsoft Access were used. TCP/IP was used for the
network protocol, as it is native to Web applications and contains many powerful
commands like Telnet and FTP.
Table 11. Software Used for the Creation of the Toolset

Microsoft Word: design through evaluation; used to create documentation.
Microsoft PowerPoint: design through evaluation; used to create drawings.
Scripting Tool: development; used to create the toolset.
TCP/IP: development through integration; used to transfer files and for other network utilities.
Tivoli Framework: development; toolset support.
Tivoli Distributed Monitoring: development; toolset support.
Tivoli Software Distribution: development; toolset support.
Microsoft Access: development; toolset support.
Domino Server: development; toolset support.
E-Commerce Construction Kit: development; toolset support.
Netscape Navigator: development; toolset support.
The Tivoli Framework and systems management applications were used including
Distributed Monitoring and Software Distribution. These applications provided a base for
the toolset development. Distributed Monitoring was used as a general purpose
monitoring engine. Software Distribution was used as the utility that supported the
Automated Installation and Configuration subsystem. Microsoft Access, a relational
database was used for the MIR. For the Web environment, the server software used was
Domino Server. Application pages were composed using the E-Commerce Construction
Kit and displayed using Netscape Navigator.
Data
Data for this project were stored in a repository called the Full Life-Cycle Toolset
MIR. Microsoft Access, a relational database, was used for this purpose. Data stored in
the database included toolset components, event logs, component descriptions, component
relationships, and other items identified during the design sessions. The use of this
database allowed SQL queries to be written to extract data from the MIR.
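A minimal example of such an extraction query follows, reusing the hypothetical EventLog names sketched earlier; the actual table and column names are defined in the Appendix H dictionary.

    -- Pull recent fault events for review; #..# is the Access date-literal syntax.
    SELECT application_id, event_time, message
    FROM EventLog
    WHERE severity = 'FAULT'
      AND event_time >= #01/01/2002#
    ORDER BY event_time DESC;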
Procedures
Procedures were developed and used in the prototype toolset scenarios. These
procedures were used in each of the five scenarios that can be found in Appendix E,
Survey Materials Used for the Toolset Evaluation. The scope of the procedures included
the full life-cycle support of the management of the application, as well as toolset
operation, integration with other software, and problem determination. Procedure
requirements and design were documented in the Comprehensive Design for a Prototype
Toolset for Full Life-Cycle Management of Web-Based Applications, which is
summarized in Chapter 4.
People
For this project, a team of Web professionals assisted through participation in the JAD
sessions. The design team also provided support during the implementation of the
prototype toolset. The design team was skilled in application, middleware, and database
availability and problem determination. Web operations support personnel and system
administrators were also the subjects who participated in the toolset evaluation.
Reliability and Validity
A statistically significant number of observations were gathered during the toolset
evaluation step. Thirty-three of 40 participants responded to the survey. The sample
population was diverse and experienced as shown by Figures 15 and 16 as well as Table
35 which can be found in Chapter 4. Analysis of the sample was performed using
descriptive statistics for an opinion/fact survey as part of a summative evaluation. Two
approaches were taken to the organization of the data for analysis. The first approach
examined the data in a scenario-by-scenario manner. Descriptive statistics were used in
the analysis including count, cumulative percentages, and ranking. Ranking was used to
determine the toolset prototype scenarios that were more successful than others. This
analysis can be found in Tables 36, 37, 38, 39, and 40 in Chapter 4.
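A query in this spirit, assuming a hypothetical SurveyResponse table with one row per participant per scenario, produces the raw counts from which the cumulative percentages and rankings can then be derived in the report layer.

    -- Tally the answers to survey question 1 for each scenario.
    SELECT scenario_id, q1_answer, COUNT(*) AS answer_count
    FROM SurveyResponse
    GROUP BY scenario_id, q1_answer
    ORDER BY scenario_id, q1_answer;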
The second approach that was used examined the data based on the survey questions
across all the scenarios. This approach made it possible to examine the participant's
response to questions about the toolset's ease of understanding, level of sophistication,
meeting of requirements, usability, and potential impact of its use independent of the
scenario that was used to demonstrate toolset functionality. Descriptive statistics were
used in the analysis including average, minimum, maximum, and ranking. Ranking was
used to determine the toolset characteristics, like ease of understanding, that were more
successful than others. This analysis can be found in Table 42 in Chapter 4.
Summary
The main steps used to carry out the project included design, implementation, and
evaluation. The design activities leveraged JAD techniques and involved the researcher
leading several design sessions with participants skilled in application, middleware, and
database management. After the JAD activity was completed, a RAD approach was used
to plan and implement the toolset prototype that was used in the evaluation step. A RAD
planning tool called a segment strategy was used to organize the work that would be done
to implement the toolset. The segment strategy made it straightforward to use the JAD
design outputs to create a prototype toolset that was a meaningful subset of the
comprehensive design that was created. The design work products are summarized in
Chapters 4 and 5 along with the segmentation strategy that was developed during the
implementation step.
After the segment strategy was completed, toolset components were developed to
support the five scenarios and the related toolset evaluation. Web pages (toolset views)
used for the management of Web applications were built using an HTML editor called E-
Commerce Construction Kit. Toolset procedures were developed and documented in the
Web pages that were used for the prototype toolset. These procedures provided structure
and organization to the toolset scenarios. The procedures were used early in each scenario
to explain the approach to be taken and then were used throughout the scenario. See
Appendix E, Survey Materials Used for the Toolset Evaluation, for a view of the
procedures developed for the prototype. A MIR was built using Microsoft Access, a SQL
database that contained the 15 database tables that were needed to support the prototype.
The prototype was developed on a personal computer (ThinkPad) that was supported by a
Sun server running the UNIX operating system. The project's hardware provided the
necessary support for the software chosen and was sufficiently powerful to develop and
test a prototype with a large number of Web pages, framesets, and database tables.
A survey was used during the evaluation step to gather data from the survey
participants about their understanding of the prototype toolset. The data collected were
analyzed in a scenario-by-scenario approach and a question-by-question approach to
determine the scenarios and characteristics of the toolset that were more effective than
others. The results of the analysis are discussed in detail in Chapter 4. Chapter 5 also uses
the results of the analysis in a detailed way in the context of the research questions and
hypotheses of this study.
Chapter 4
Results

Introduction
This study was conducted during the period from February 2001 to May 2002. During
that period, activities were carried out and work products were produced that focused on
the design, development, testing, and evaluation of a prototype toolset for the full life-
cycle management of Web applications. The prototype toolset that was developed was a
subset of the full toolset that was designed during this project. Specifically, the toolset
components that were developed were those components needed to support one of the five
scenarios that were used during the toolset-evaluation phase of this project. Although only
a subset of the toolset components were developed and tested, all programs, procedures,
views, and schema were designed.
The complete toolset design was documented in a design notebook called
Comprehensive Design for a Prototype Toolset for the Full Life-Cycle Management of
Web-Based Applications. This design document is summarized in this chapter. The
participants in the study evaluated the prototype toolset that was developed and tested.
More than 40 study participants were asked to review five scenarios and complete a five-
question survey for each of the five scenarios. The review of the scenarios was completed
using a package of materials that consisted of printed versions of toolset components
including procedures, views, and data. The results of the survey are discussed in detail in
this chapter. The key parts of the survey materials are included in Appendix E of this
document.
In order to manage the transition from comprehensive design to subset implementation,
a segment strategy was developed. The segment strategy is a RAD technique used to
develop the approach to deliver a usable segment or subset of a system’s functionality to a
community of users (Hough, 1993). For this development project, a subset was needed
which specifically met the needs of the study scenarios. The segment strategy for this
project also explained the development sequence that was used and identified the toolset
components needed for each study scenario. As described by RAD researchers, one of the
first activities typically involves the creation of the user interface (Carter, Whyte,
Birchall, & Swatman, 1997). The complete segment strategy is documented in this chapter
of the document.
Presentation of Results
The presentation of results for this dissertation is directly related to the specific
procedures employed in Chapter 3. The toolset design portion of this chapter documents
and summarizes the specific procedure from Chapter 3 called Design the Toolset. The
summary explains the programs, procedures, views, schema, and data that were designed
to create a comprehensive toolset for the full life-cycle management of Web applications.
The toolset implementation portion of this chapter documents the implementation
activities and the toolset components that were developed to support the five scenarios.
This is directly related to the procedure in Chapter 3 called Implement the Toolset. This
portion of the report also documents the experiences and techniques using commands,
view-building tools, and database software.
The toolset evaluation portion of this chapter documents and explains the results of the
activity that was focused on gathering data from survey participants regarding toolset
performance, technology, and level of satisfaction with the software and the contribution
that the system will make to the organization. This part is directly related to the procedure
in Chapter 3 called Evaluate the Toolset. This chapter also contains a detailed discussion
of the findings of this study. The chapter concludes with a summary of results.
Analysis
This part of the chapter contains three sections. The three sections are toolset design,
toolset implementation, and toolset evaluation. Each section relates to the specific
procedures employed as described in Chapter 3.
Toolset Design
Three parts make up this summary of the toolset design. Part 1, Overall System
Summary, explains the toolset components in the context of a single system containing 19
subsystems providing functionality in support of 15 functional perspectives. Part 2,
Subsystem Summary, explains in detail the components that make up the individual
subsystems. This part is organized by functional perspective as each perspective is
supported by one or more subsystems. Part 3, Segment Strategy, contains the segment
strategy that was used to guide the development and testing of the prototype toolset.
Overall System Summary
The design for the toolset was created using a series of Joint Application Design
sessions. The design sessions were conducted with a number of participants who are
familiar with the challenges of developing and managing applications. Prior to the design
sessions, research was done to identify the functional perspectives that would be the focus
of the design activities. The term functional perspective was used by Sturm and Bumpus
(1999) and means roughly the same thing as discipline and process. Both discipline and
process are common systems-management terms. The focus list of functional perspectives
was explored and documented in Chapter 2 in the section Summary of What is Known
and Unknown About the Topic. The role of the functional perspectives as a primary
design input is summarized in Table 12.
Table 12. Primary Inputs to Design Sessions

Functional Perspectives: provided focus to the functions to be designed.
Toolset Components: provided a list of the types of components to be developed.
Research Questions: provided specific concepts to explore within the functional perspectives.
Life Cycle: provided a framework in which to define the usefulness of a toolset component.
Also brought to the design sessions was a well-developed concept of a toolset. The
toolset components that were identified consisted of programs, procedures, views,
schema, and data. The idea of creating a toolset was influenced by the literature. Firmato,
a firewall management toolkit developed by Bartal, Mayer, Nissim, and Wool (1999), was a
good model for this project. Also influential were PCFONFIG, a Web-based toolset (Slater,
1999), and the OpenDist toolkit from researchers Osel and Gansheimer (1995), which was
used to synchronize file servers. The TeMIP OSI Management Toolkit (1999) also
provided a good model from the software development community.
The toolset components themselves were chosen for their appropriateness for the
management of Web applications. The toolset components and their relationship to one
another were explained in Chapter 1 and summarized in Figure 3. The role of the toolset
components as a primary design input is summarized in Table 12. The other important
inputs that helped to structure the design activities were the twenty-three research
questions documented in Chapter 1. These research questions, in particular the secondary
research questions numbered 4 through 23, were used to explore the creation of the
subsystems which provided support to the functional perspectives. The role of the
research questions as a primary design input is summarized in Table 12.
Finally, the well-known concept of the application life cycle was considered during the
design sessions. The life cycle explored consisted of design, construction, deployment,
operations, and change phases. These life-cycle steps are sometimes given different
names, but the activities are fundamentally the same. During the design phase, the
application is explained in a detailed design document. Sometimes, a prototype is created
to make it easier to understand the proposed application. During construction the
application is developed and tested. During deployment the application is installed on
servers and made available to its users after training is conducted. The operations phase
begins when the application is available for regular use. Finally, the change phase is
invoked when an operational application is changed to fix errors or to make new
application function available to its users. The role of the application life cycle as a
primary design input is summarized in Table 12.
The design sessions were conducted as a series of meetings with the design
participants. These meetings were JAD sessions. Between meetings, communication was
facilitated by electronic mail that made use of a shared documentation database used
exclusively for this project. Project communications, meeting agendas, meeting minutes,
presentations, and design work elements were stored in a document database called a
TeamRoom. Using the important ideas from the design sessions, a comprehensive design
was created. An example of the JAD materials that were used in the first JAD session
called Background and Brainstorming JAD Materials can be found in Appendix F.
The comprehensive design for the toolset was collected in a design notebook called
Comprehensive Design for a Prototype Toolset for Full Life-Cycle Management of Web-
Based Applications. From a system point of view, the design contains a large number of
toolset components organized by subsystems within functional perspectives. Some
functional perspectives are supported by more than one subsystem. The subsystem
approach was used because a subsystem is usually capable of operating independently or
asynchronously (Dictionary of Computing, 1987), and this common way of organizing a
system was a particularly good match for this toolset.
The system that resulted from the design was called the toolset for full life-cycle
management of Web-based applications. The system contains 19 subsystems. Table 13
summarizes the subsystems in support of the 15 functional perspectives.
Table 13. Functional Perspectives and Related Subsystems

Accounting: Resource Modeling; Resource Accounting
Administration: Automated Installation and Configuration; Configuration Verification
Automation: Template Creation; Component Comparison
Availability: Deep View
Business: Business Views
Capacity: Application Capacity Bottlenecks
Change and Configuration: Unauthorized Change Detection; Change-Window Awareness
Fault: Smart Fault Generation
Operations: Integrated Operations
Performance: Intimate Performance
Problem: Detailed Data
Security: Interface Monitoring
Service Level: SLO/SLA Data
Software Distribution: Deployment Monitoring; MIR Creation
The 19 subsystems are made up of 43 procedures, 78 programs, 25 views, and a
database that contains 59 tables. The Accounting functional perspective contains two
related subsystems. The Resource Modeling subsystem was focused on matching actual
resource use with predicted or desired resource use and alerting the developer when there
is a mismatch. Table 14 contains a summary of the toolset components that make up the
Resource Modeling subsystem. The Resource Accounting subsystem had as its goal
providing instrumentation for charge-back of a Web site (see Table 15).
The Administration functional perspective has two subsystems. The Automated
Installation and Configuration subsystem was designed to completely automate the
installation and configuration of a Web application (see Table 16). The Configuration
Verification subsystem was designed to verify the administrative settings of a Web
application in support of problem solving (see Table 17). The Automation functional
perspective has two subsystems. The Template Creation subsystem was focused on
productivity through the creation of operational templates like start, stop, and restart
scripts and management schema (see Table 18). The Component Comparison subsystem
was designed to compare designed to actually installed components as an aid in finding
implementation errors or omissions (see Table 19).
The Availability functional perspective has one subsystem. The Deep View subsystem
was designed to provide a deep treatment of availability to include responsiveness,
stability, and usage measurements (see Table 20). The Business functional perspective has
one subsystem. The Business Views subsystem was focused on representing a logical
collection of applications as a business system (see Table 21). The Capacity functional
perspective has one subsystem. The Application Capacity Bottlenecks subsystem
examined the application, database, and middleware components necessary to determine
application capacity (see Table 22).
The Change and Configuration functional perspectives have two subsystems. The
Unauthorized Change Detection subsystem was centered on creating a capability for the
application to detect unauthorized changes to itself (see Table 23). The Change-Window
Awareness subsystem was designed to make it possible for an application to suppress
certain kinds of application-generated faults (see Table 24). The Fault functional
perspective has one subsystem. The Smart Fault Generation subsystem was designed to
optimize the creation of application faults utilizing minimal inputs (see Table 25).
The Operations functional perspective has one subsystem. The Integrated Operations
subsystem was designed to have an application view for helpdesk personnel that included
job scheduling, backup status and history, and status of key print and file outputs (see
Table 26). The Performance functional perspective has one subsystem. The Intimate
Performance subsystem utilized a proxy to gather performance data instead of modifying
the application to make calls to a performance-measurement tool (see Table 27). The
Problem functional perspective has one subsystem. The Detailed Data subsystem was
focused on providing meaningful and detailed data to the problem-management system
(see Table 28).
The Security functional perspective has one subsystem. The Interface Monitoring
subsystem was designed to provide a view with supporting probes to monitor and report
on key security interfaces (see Table 29). The Service Level functional perspective has
one subsystem. The SLO/SLA Data subsystem was designed to provide an application-
independent tool to collect and report service-level objective and service-level agreement
information (see Table 30). The Software Distribution functional perspective has two
subsystems. The Deployment Monitoring subsystem was centered on monitoring mission-
critical distributions (see Table 31). The MIR Creation subsystem was a design for a set of
tools to populate the MIR with information in support of package distributions (see Table
32).
Subsystem Summary
This part of the report contains a summary of each subsystem that was created as part
of the comprehensive toolset design. Each subsystem is summarized in the context of the
functional perspective that it supports. A description is included of each program,
procedure, view, and table component that was part of the design of the subsystem.
Support for the Accounting Functional Perspective
The Resource Modeling and the Resource Accounting subsystems support the
Accounting perspective. The purpose of the Resource Modeling subsystem is to allow a
developer or user to specify the resources they intend the Web application to use; the
toolset then alerts them when a threshold is exceeded. In this way, this subsystem
supports a high-level modeling activity involving resources like memory and processors.
Fosdick (1998) described the resource accounting capability of a commercial product
called Unicenter TNG. The Resource Modeling subsystem is similar to Unicenter TNG in
that it tracks resource usage, but the focus is workload modeling, not simple accounting for
resource consumption for charge-back purposes.
Toolset components from this subsystem can be used during design, construction,
deployment, operations, and change phases. For example, the Resource Utilization
Optimization procedure is useful for providing strategies for making the best use of disk,
memory, processor, and I/O at the design phase of developing the Web application. The
Resource Modeling subsystem is supported by eight toolset components including two
programs, two procedures, one view, and three required database tables. Two optional
database tables are also included as part of this subsystem. A summary of the components
that were designed for this subsystem is shown in Table 14.
Table 14. Resource Modeling Component Summary

Programs: Resource Modeling Monitoring; Resource Modeling Reporting
Procedures: Resource Modeling Utilization Optimization Guide; Resource Modeling Utilization Analysis
View: Resource Modeling Monitoring
Tables: Application Definition; Resource Modeling Monitoring Input; Resource Modeling Log; Resource Modeling Log Summary (optional); Resource Modeling View History (optional)
The Resource Modeling Monitoring program utilizes input values from the user and
monitors for those limits to be exceeded. The Resource Modeling Reporting program
produces an output report from log file data collected by this subsystem. The Resource
Modeling Utilization Optimization Guide is a procedure that provides strategies for
making best use of disk, memory, processor, and I/O resources. The Resource Modeling
Utilization Analysis procedure provides steps to perform to interpret the data in the output
report. The Resource Modeling Monitoring view displays an output report that contains a
list of exceptions, ordered by server. The exceptions are related to disk, memory,
processor, and I/O resources.
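A report query of this shape, using illustrative names for the Resource Modeling Log columns rather than the documented schema, might look like the following.

    -- Exceptions ordered by server, as presented in the Resource Modeling Monitoring view.
    SELECT server_name, resource_type, exception_text, logged_at
    FROM ResourceModelingLog
    WHERE resource_type IN ('DISK', 'MEMORY', 'PROCESSOR', 'IO')
    ORDER BY server_name, logged_at;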
The Application Definition database table contains information that identifies the
application that will be managed. By design, only applications that are defined to the
toolset will be subject to management actions. The Resource Modeling Monitoring Input
database table contains the data that the Resource Modeling Monitoring program uses to
control its operations. The Resource Modeling Log database table is used as a repository
of output messages that contain the details associated with modeling exceptions.
Two optional database tables are included as part of this subsystem and are represented
in many of the other toolset subsystems. The Resource Modeling Log Summary database
table contains information about the data contained in the Resource Modeling Log. This
summary information is updated periodically by a toolset support utility that makes it
possible to query about the start and end dates of data in the MIR, as well as the number
of specific types of exceptions, by simply accessing the summary information. The
Resource Modeling View History database table is another optional database table. This
database table contains the information that was displayed in previously examined HTML
views. This database table functions as a view history mechanism. A view history is
automatically saved upon each use of the reporting programs. The default number of
saved views is 99. If necessary, as a new view is added, an older saved view is discarded.
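One way the 99-view limit could be enforced is sketched below, assuming an Access-style SQL dialect with the TOP keyword; the table and column names are illustrative.

    -- Discard the oldest saved views beyond the 99 most recent.
    DELETE FROM ResourceModelingViewHistory
    WHERE view_id NOT IN (
        SELECT TOP 99 view_id
        FROM ResourceModelingViewHistory
        ORDER BY saved_at DESC
    );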
The second subsystem, Resource Accounting, is used to instrument an application for
accountability so that the application usage can be charged back to the internal groups
that use the application. A variety of event types can be used to generate charge-back data
including log on, log off, query, update, and browse. User-defined event types are also
supported. The idea of creating a utility to support charge back was influenced by the
Application Response Measurement API as described by Sobel (1996c). The Application
Response Measurement API is a performance tool that provided a model for how a simple
API could be used to gather application-specific data. Other products influenced the
design of the Resource Accounting subsystem like UniSolutions Associates JobAcct
software. This product and its use by the Army Corps of Engineers were described by
Olsen (1998).
As was the case for the Resource Modeling subsystem, toolset components of the
Resource Accounting subsystem can be used during design, construction, deployment,
operations, and change phases. An example is the Resource Accounting view that is
useful during construction, deployment, operations, and change phases. The Resource
Accounting subsystem consists of eight toolset components including two programs, two
procedures, one view, and three required database tables. Two optional database tables are
also included as part of this subsystem. A summary of the components that were designed
for this subsystem is shown in Table 15.
Table 15. Resource Accounting Component Summary

Programs: Resource Accounting Resource Usage Recording; Resource Accounting Reporting
Procedures: Resource Accounting Usage Recording Guide; Resource Accounting Report Analysis
View: Resource Accounting
Tables: Application Definition; Charge-Back Definitions; Resource Accounting Log; Resource Accounting Log Summary (optional); Resource Accounting View History (optional)
The Resource Accounting Resource Usage Recording program accepts an input
parameter list from a calling program and generates a usage record in the MIR for charge-
back purposes. The Resource Accounting Reporting program is used to generate the
output reports that support the subsystems. The Resource Accounting Usage Recording
Guide is a procedure that assists with the understanding and use of the Resource
Accounting Resource Usage Recording program. The Resource Accounting Report
Analysis procedure is used to assist the administrator with interpreting the five output
reports. The reports order the data by application; by user within application; by event
type; by user within event type, and by charge back. The Resource Accounting view is
used to display and manipulate the output reports that are part of this subsystem.
The Application Definition database table is used to identify the application being
managed and the event types being used by the instrumented application. The Charge-
Back Definitions database table contains the charge to be applied for each event type. The
Resource Accounting Log database table contains an entry-sequenced collection of rows
containing the events that were generated by the application.
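A charge-back rollup over these two tables might be sketched as follows. The column names are assumptions, but the join reflects the design described above, in which each logged event is priced according to its event type.

    -- Price each application's events using the Charge-Back Definitions rates.
    SELECT l.application_id, l.event_type,
           COUNT(*) AS event_count,
           COUNT(*) * d.unit_charge AS total_charge
    FROM ResourceAccountingLog AS l
    INNER JOIN ChargeBackDefinitions AS d
        ON l.event_type = d.event_type
    GROUP BY l.application_id, l.event_type, d.unit_charge
    ORDER BY l.application_id;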
Support for the Administration Functional Perspective
The Automated Installation and Configuration and Configuration Verification
subsystems support the Administration perspective. The purpose of the Automated
Installation and Configuration subsystem is to completely automate the installation and
configuration of a Web application. Installing applications is an important activity that
was a focus of POSIX standard 1387.2 (Information Technology - Portable, 1995).
Automated application installation was also a primary focus of the System Modification
Program (OS/VS2 MVS Overview, 1980). However, this tool did not provide a way to
manage the configuration of the application after it was installed. This subsystem expands
upon the System Modification Program by including configuration activities within its
scope. CIM also influenced this subsystem design. Settings and configurations, which are
part of the CIM Core Model (Westerinen & Strassner, 2000), are the kind of definitions
that are the primary focus of the automated configuration portion of this toolset.
In addition to providing completely automated installation support, this subsystem
also supports hybrid installation activities like those that automate the application
installation, but use manual configuration tasks. Application developers and
administrators often need this flexibility. Automated Installation and Configuration
programs have implications for the design and construction phases because the Web
application should be designed and constructed for automated installation and
configuration. Modules, for example, should be placed in libraries that are platform
specific. This makes distributing them for installation more straightforward. Automated
Installation and Configuration programs are useful for the deployment, operations, and
change phases.
The Automated Installation and Configuration subsystem consists of twelve toolset components including
four programs, three procedures, two views, and three required tables. Two optional tables
are also included as part of this subsystem. A summary of the components that were
designed for this subsystem is shown in Table 16.
Table 16. Automated Installation and Configuration Component Summary

Component Type   Component Name
Program          Automated Installation (Silent Install)
                 Automated Configuration (Silent Configuration)
                 Automated Application Installation Reporting
                 Automated Application Configuration Reporting
Procedure        Automated Installation and Configuration: Guide to Designing an Application For Distribution
                 Automated Installation Set Up and Use Guide
                 Automated Configuration Set Up and Use Guide
View             Automated Application Installation
                 Automated Application Configuration
Table            Application Definition
                 Source and Target Installation and Configuration Definitions
                 Automated Installation and Configuration Log
                 Automated Installation and Configuration Log Summary (optional)
                 Automated Installation and Configuration View History (optional)
The Automated Installation (Silent Install) program copies an application's components
to one or more target systems. The Automated Configuration (Silent Configuration)
program administers the definition files that make up the configuration of the application.
The Automated Application Installation Reporting program creates a report on the
installation status of an application. The Automated Application Configuration Reporting
program creates a report on the configuration status of an application.
The procedure called Automated Installation and Configuration: Guide to Designing an
Application For Distribution is used to help the designer avoid the pitfalls that will make
it difficult to distribute, install, and configure an application in an automated fashion. The
Automated Installation Set Up and Use Guide is a procedure that explains how to use the
Automated Installation program, set up definition files, and install Web applications
including optional services like simulating a test installation. The Automated
Configuration Set Up and Use Guide is a procedure that explains how to use the
Automated Configuration program, set up definition files, and configure applications
including optional services like deleting a target component when an error occurs.
Two views are part of this subsystem design. The Automated Application Installation
view is used to display the reports associated with the installation process. The reports
contain primary installation information like target system name and target servers as well
as many details associated with the installation activities like options used and log file
data. The Automated Application Configuration view is used to display the reports
associated with the configuration process.
The Application Definition database table is used to associate the application and its
servers. The Source and Target Installation and Configuration Definitions database table
is used to store definitions that identify source file name and location. It also contains
installation options that determine the actions of the programs for this subsystem. An
example is what actions are taken when an installation error occurs. The actions could be
to keep or delete the target files. The Source and Target Installation and Configuration
Definitions database table also contains definitions specific to the configuration of the
application. The Automated Installation and Configuration Log database table is used for
messages associated with the installation and configuration processes.
The second subsystem that supports the administration perspective is the Configuration
Verification subsystem. The Configuration Verification subsystem was designed to verify
the administrative settings of a Web application. This verification allows for the
comparison of the settings of the same application in different environments or domains
like test, verification, and production. Since inconsistent or incorrect configuration
settings can cause problems with Web applications, this subsystem could be helpful in
problem-solving situations when application behavior differs between test and production
systems. It could also be helpful in solving problems after a recent application change. The
idea of a subsystem to verify configurations is rooted in several sources. Start and Patel
(1995) included configuration management as a key part of their distribution management
service. Their configuration management system was essential to verifying the current
state of the service node in the network.
The MVS utility program IEBCOMPR was used to compare two files and report on the
differences (Elder-Vass, 2000). The Configuration Verification subsystem expands this
idea from the file to the application level. Arcade Skipper, a software configuration and
verification management tool for AS/400 systems also influenced this subsystem through
its source compare and merge utility. This utility is used in application environments
where modifications of commercial packages are maintained separate from the base
software (AS/400 and iSeries, 2001). The Configuration Verification subsystem uses this
utility idea and leverages it across all file types that make up the application.
Toolset components from this subsystem can be used during the deployment phase to
ensure that environments are the same. They can be used to answer the question: is the
test system the same as the verification system? They can also be used during operations
when an application does not exhibit the same behavior that it did in the test system.
Additionally, toolset components can be used during the change phase to make sure that
important components were not omitted during the change window. The Configuration
Verification subsystem is supported by seven toolset components including two programs,
one procedure, one view, and three required database tables. Two optional database tables
are also included as part of this subsystem. A summary of the components that were
designed for this subsystem is shown in Table 17.
Table 17. Configuration Verification Component Summary

Component Type   Component Name
Program          Configuration Verification
                 Configuration Verification Reporting
Procedure        Configuration Verification Install, Configure, and Use Guide
View             Configuration Verification
Table            Application Definition
                 Source and Target Installation and Configuration Definitions
                 Configuration Verification Log
                 Configuration Verification Log Summary (optional)
                 Configuration Verification View History (optional)
The Configuration Verification program is used to physically compare, file by file, the
configuration of one application domain with that of another. The Configuration
Verification Reporting program creates reports that summarize an individual environment
or compares two environments and reports on the differences using data collected in the
MIR. The Configuration Verification Install, Configure, and Use Guide is a procedure that
explains how to install and configure (set up) the Configuration Verification program. The
Configuration Verification view is used to display configuration reports.
Three database tables are important to the operation of the subsystem. The Application
Definition database table is used to identify the application specific servers that make up
the test, verification, and production environments. The Source and Target Installation
and Configuration Definitions database table is the same database table that was used for
the Automated Installation and Configuration Subsystem. It is used to define the baseline
configuration to be verified for each application domain. The Configuration Verification
Log database table is used to store verification messages, both normal and exception in
nature.
Support for the Automation Functional Perspective

The Automation perspective is supported by the Template Creation and Component
Comparison subsystems. The purpose of the Template Creation subsystem is to improve
the productivity of the developer and operations support staff by automatically creating
operational tools to be used in the day-to-day management of the Web application. An
example of an operational tool is a script that could be used to create standard file systems
for verification and production environments. The Template Creation subsystem design
was influenced by the productivity commands and automated capabilities of NetView.
Irlbeck (1992) wrote of the network and system automation capabilities of NetView
Version 2 and the impact on network operations. The Template Creation subsystem
extends the NetView models to include application-specific commands.
The Global Enterprise Manager product also had an influence on the Template
Creation subsystem as this product supplied generic commands like start, stop, and restart
that could be tailored for the management of an application (Tivoli Global Enterprise,
1998). The Template Creation subsystem extends the Global Enterprise Manager
examples to include a broader set of application-specific functions.
This subsystem is useful during the construction, deployment, operations, and change
phases. For example, the outputs of the Template Creation Template Build program can
be used to operate the Web application during the construction, deployment, operations,
and change phases. It should be noted that this subsystem has an impact on the design
phase of the application in two ways. The Web application design must be placed in a
common format whereby formal sections of the design document can be used to extract a
list of application components. The design must also be placed in the design documents
table and a design definition record must be created in the MIR to identify the document.
The Template Creation subsystem is supported by thirteen toolset components
including four programs, two procedures, two views, and five required database tables.
Two optional database tables are also included as part of this subsystem. A summary of
the components that were designed for this subsystem is shown in Table 18.
Table 18. Template Creation Component Summary

Component Type   Component Name
Program          Template Creation Design Extract
                 Template Creation Template Build
                 Template Creation Extract Reporting
                 Template Creation Build Reporting
Procedure        Template Creation: Utilizing Common Components
                 Template Creation Install and Use Guide
View             Template Creation Extract
                 Template Creation Build
Table            Application Definition
                 Design Definitions
                 Design Documents
                 Template Models
                 Template Creation Log
                 Template Creation Log Summary (optional)
                 Template Creation View History (optional)
The Template Creation Design Extract program is used to read the Web application
design document and extract the names of components that will be the target of template
operational commands. The Template Creation Template Build program is used to create
templates for components that were discovered to be part of the Web application design.
The Template Creation Extract Reporting program is used to create a report that lists the
components that were isolated from the Web application design. The Template Creation
Build Reporting program is used to create a report that lists the templates that were built
for each application. The procedure called Template Creation: Utilizing Common
Components describes what templates are created and how these templates can be used in
daily operations. The Template Creation Install and Use Guide explains how to install and
configure the Template Creation programs. It also discusses the design extract
prerequisites.
Five database tables are important to the function of this subsystem. The Application
Definition database table is used to identify and name the application as well as to assign
a unique build target file name for this application. The build target file is the repository
for the template commands that will be used to assist in the operation of the Web
application. The Design Definitions database table is used to contain the names of the
design documents that should be used by the Template Creation subsystem to generate
commands and scripts. The Design Documents database table is used to hold the physical
design documents named in the Design Definitions database table. The Template Models
database table contains initial files or starting points for operational commands like
display, start, and stop. The Template Creation Log database table is used to store
messages, both normal and exception in nature, that are generated during the process of
automatically generating operational tools.
The second subsystem to support the automation perspective is the Component
Comparison subsystem. The purpose of the Component Comparison subsystem is to
compare designed components (using the design extract provided by the Template
Creation subsystem) with the components deployed on an operational system. Information
about the deployed components is provided by the Automated Installation and
Configuration subsystem. The idea of comparing designed components with those that
were deployed was influenced by a number of existing utilities and products. Visio, a
well-known tool for creating Web application architecture diagrams, has a documented
capability of storing data from diagrams in a database (Developing Visio Solutions,
1997). This built-in capability to export component information makes it easier to devise a
list of elements to compare to elements in a target Web application environment.
In recent years, many commercial products have greatly enhanced their capability to
read and write files in many formats. Microsoft Word, a tool used by many Web
application designers, has extensive capability to exchange information with other
applications (Microsoft Word, 1994). The Component Comparison subsystem exploits
this capability that is built into Word and other popular programs. Models also exist to
extract information from running systems. The Tivoli Inventory program is an example of
a tool that is used to extract data about the components that are installed on a device like a
Web server (TME 10 Inventory, 1998). The Component Comparison subsystem extends
this capability by linking it directly to the Web application. The Component Comparison
subsystem is useful for comparing what was designed with what was deployed, so it is by
nature a cross-life-cycle tool.
Toolset components from this subsystem can be used during the deployment,
operations, and change phases. It is important that the Web application be designed using
the information from the Component Comparison Install and Use Guide as restrictions
apply regarding the use of design tools. The Component Comparison subsystem is
supported by eight toolset components including two programs, one procedure, one view,
and four required database tables. Two optional database tables are also included as part
of this subsystem. A summary of the components that were designed for this subsystem is
shown in Table 19.
Table 19. Component Comparison Component Summary

Component Type   Component Name
Program          Component Comparison
                 Component Comparison Reporting
Procedure        Component Comparison Install and Use Guide
View             Component Comparison
Table            Application Definition
                 Template Creation Log
                 Automated Installation and Configuration Log
                 Component Comparison Log
                 Component Comparison Log Summary (optional)
                 Component Comparison View History (optional)
The Component Comparison program is used to match designed and installed
components and record the results in the Component Comparison Log table. The
Component Comparison Reporting program generates reports, by application, from the
Component Comparison Log table. The Component Comparison Install and Use Guide is
a procedure designed to explain how to install, set up, and use the Component
Comparison program. The Component Comparison view is used to display the reports that
are generated as part of the Component Comparison subsystem.
The Application Definition database table is used to describe the applications that are
candidates for component comparison and reporting. The Template Creation Log database
table, which was created by the Template Creation subsystem, contains information about
application components that are used by this subsystem. The Automated Installation and
Configuration Log database table is also used by this subsystem because it contains
information about installed and configured application components. The Component
Comparison Log database table is used to store the results of comparisons that were done.
The data in this database table is stored in the form of messages.
Support for the Availability Functional Perspective

The Availability perspective is supported by the Deep View subsystem. The purpose of
the Deep View subsystem is to provide a comprehensive treatment of availability to
include information pertaining to responsiveness, stability, and usage measurement. This
subsystem uses data and information from all the functional perspectives explored in this
toolset design. Typically, availability information is limited to information about the
logical state of a resource or component. Examples of network resource states reported by
the NetView program include ACTIV, NORMAL, and INACT (NetView User's Guide,
2001). The Deep View subsystem was designed to be the subsystem where all availability
information about the Web application could be examined in one place using one view.
The range of availability information included covered as many functional
perspectives as possible. For example, for the change perspective, information about the
last change is displayed. Change information can help an operator understand something
about the stability of the Web application. For the problem perspective, the most recent
problem record is displayed. Problem records are another stability factor that can yield
information on the availability of the Web application. For the automation perspective, the
most recent information about attempted recovery actions is displayed. Automation
history, especially information about failed attempts to recover resources like network
interface cards or message queues, is an important availability information source. In
summary, data associated with many of the perspectives has information that contributes
to the full understanding of the availability of a Web application.
Toolset components from this subsystem can be used during design, construction,
deployment, operations, and change phases. For example, the Deep View Application
Design Considerations procedure is useful for providing strategies for making the best use
of the Deep View subsystem in the construction through change phases. The Deep View
Real-Time Update program is useful during construction and deployment phases to assist
the developers and support personnel with understanding the root cause of problems during
application testing or application deployment to other domains.
The Deep View subsystem is supported by eight toolset components including three
programs, two procedures, one view, and two required database tables. Two optional
database tables are also included as part of this subsystem. A summary of the components
that were designed for this subsystem is shown in Table 20.
Table 20. Deep View Component Summary

Component Type   Component Name
Program          Deep View Initialize
                 Deep View Real-Time Update
                 Deep View Application Resource Reporting
Procedure        Deep View Application Design Considerations
                 Deep View Install, Definition, and Use
View             Application Resources
Table            Application Definition
                 Deep View Application Resources
                 Application Resources Log Summary (optional)
                 Application Resources View History (optional)
The Deep View Initialize program is used to discover initial values for the perspectives
that apply to this Web application and update the MIR with these initial settings. The
Deep View Real-Time Update program is used to gather availability data and update the
MIR in real time. It supports requests for data on individual aspects, for example,
automation status. It can also perform mass update operations for all availability data for a
specific application. The Deep View Application Resource Reporting program is used to
create a report containing an availability snapshot using data stored by the other programs
in this subsystem. The Deep View Application Design Considerations procedure
explains the prerequisites for using Deep View facilities to manage an application and the
impact its use has on the design of the application. The Deep View Install, Definition, and
Use procedure describes how to install, set up, and operate the Deep View subsystem.
The Application Definition table is used to identify the applications that are subject to
Deep View processing. A detailed set of Deep View definitions is also added to the
Application Definition database table that is used by most of the other subsystems. The
Deep View Application Resources database table contains definitions of the application-
specific resources that are reported on by the subsystem. Examples of application
resources are application programs, application processes, and other (user specified)
application components.
Support for the Business Functional Perspective

The Business perspective is supported by the Business Views subsystem. The purpose
of the Business Views subsystem is to explore the possibilities of managing a collection
of related applications in a business-system context as compared to the usual network or
server centric focus. The Unicenter TNG and Tivoli Global Enterprise Manager products
first defined the business-system context. The Unicenter product utilized a graphical user
interface that garnered a lot of attention because it was different from anything else in the
marketplace and could be used to depict objects in context like computers and buildings
(Karpowski, 1999). Unlike Unicenter TNG, the Business Views subsystem is focused on
depicting relationships like parents, children, and peers in a tabular fashion. It also
supports different view types like logical and physical views. The subsystem is different
from Unicenter in that it is mainly centered on the relationship between applications that
make up a business system.
The Business Views subsystem is related to Tivoli Global Enterprise Manager in that
both can depict application, database, middleware, network, and server resources. Tivoli
Global Enterprise Manager is capable of representing an extremely broad list of resources
as shown by the tremendous variety that is part of the USAA Internet Member Services
management design (Turner, 1998). The Business Views subsystem does not have the
graphical richness of Tivoli Global Enterprise Manager. However, when paired with the
Template Creation subsystem the Business Views subsystem can draw upon a large
number and variety of commands and monitors.
Toolset components from this subsystem can be used during the deployment,
operations, and change phases. For example, the Business Systems view is useful during
the operations phase after a network or switch problem is resolved. The view indicates the
status of all resources in a business system in one view. It is easy to understand which
resources are active, degraded, or down. The Business Views subsystem is supported by
twelve toolset components including five programs, three procedures, one view, and three
required database tables. Two optional database tables are also included as part of this
subsystem. The components that were designed for this subsystem are shown in Table 21.
Table 21. Business Views Component Summary

Component Type   Component Name
Program          Business View Physical Initialize
                 Business View Physical Update
                 Business View Logical Initialize
                 Business View Logical Update
                 Business View Reporting
Procedure        Designing Applications for Business Views
                 Using Business Views
                 Extensions Guide–Enhancing Supplied Commands and Monitors
View             Business Systems
Table            Application Definition
                 Business Systems Definitions
                 Application Resources Log
                 Business Systems Log Summary (optional)
                 Business Systems View History (optional)

Four programs are used to maintain the data in the MIR that is used in the reports. The
Business View Physical Initialize and the Business View Physical Update programs are
responsible for initializing and maintaining information on the physical devices associated
with Web applications that are shown in the Business Views. The Business View Logical
Initialize and Business View Logical Update programs do the same to manage the logical
relationships like parent, child, and peer that exist between applications and their
components. The Business View Reporting program is used to create the reports for this
subsystem.
The Designing Applications for Business Views procedure explains the items that
should be considered during the design of the Web application. For example,
consideration should be given to the relationship of a new Web application with others
that are already part of the business suite of applications. The Using Business Views
procedure is an operator’s guide to the subsystem. The Extensions Guide–Enhancing
Supplied Commands and Monitors is a procedure that can be used to better understand
how to make changes to the commands and monitors that are part of the subsystem.
Application specific commands and monitors can significantly enhance the usability of
this subsystem especially if they are powerful and specific to the Web application. The
Business Systems view is used to display the main Business Views report and to move
between the report selections.
The Application Definition database table is used to define the applications that can be
monitored using Business System views. The Business Systems Definitions database table
is used to contain the parameters that define the parent and peer relationships between the
applications that make up the Business System. The Application Resources Log database
table is used as a repository for key messages that are created during Business View
subsystem processing. The data in this log is used by the subsystem's reporting program.
Support for the Capacity Functional Perspective

The Capacity perspective is supported by the Application Capacity Bottlenecks
subsystem. The purpose of the Application Capacity Bottlenecks subsystem is to provide
a focus on the demanding area of application, database, and middleware capacity with
specific attention paid to detecting operational limitations or bottlenecks. There are
significant challenges with designing a subsystem with this focus. What are the typical
bottleneck areas? Is there technology that can be exploited to monitor these areas? Britton
(2001) identified eight elements that make up a middleware system. They include the
communication link, the middleware protocol, the API, a common data format, server
process control, naming/directory services, security, and administration.
Of these eight, only a subset can be monitored on a regular basis without adding
significant overhead to the overall operation of the application. Calling an API simply to
check that the API is available illustrates the point: the act of monitoring can itself cause
the monitored resource to back up or slow down. Database management systems present
similar challenges.
Informix, a commercial database product, provides a management interface called the
System Monitoring Interface (SMI). The SMI can be used to get information regarding
processing bottlenecks, resource usage, session, and server activity (Mattison, 1997).
Using this interface to gather information results in database queries to the system master
database which, as with middleware monitoring, may add an unintentional burden to the
very subsystem being checked for bottlenecks.
Toolset components from this subsystem can be used during design, construction,
deployment, operations, and change phases. For example, the Strategies to Reduce
Application Capacity Limits Guide is a procedure that explains what can be done to
minimize the impact of known capacity limitations with Web applications and the
supporting middleware and database systems. The Application Capacity Bottlenecks
subsystem is supported by twelve toolset components including five programs, three
procedures, one view, and three required database tables. Two optional database tables are
also included as part of this subsystem. A summary of the components that were designed
for this subsystem is shown in Table 22.
Table 22. Application Capacity Bottlenecks Component Summary

Component Type   Component Name
Program          Application Bottlenecks
                 Database Bottlenecks
                 Middleware Bottlenecks
                 Application, Database, and Middleware Alerts
                 Application, Database, and Middleware Reporting
Procedure        Strategies to Reduce Application Capacity Limits Guide
                 Capacity Limits Definition Guide
                 Capacity Limits Subsystem Guide
View             Application Capacity
Table            Application Definition
                 Bottleneck Definitions
                 Application Capacity Log
                 Application Capacity Log Summary (optional)
                 Application Capacity View History (optional)
Three programs focus on monitoring for specific bottlenecks. The Application
Bottlenecks program monitors for missing and stuck application processes. It also
monitors for the situation where more than a specified number of processes are running
simultaneously. The Database Bottlenecks and Middleware Bottlenecks programs perform
monitoring functions for their respective areas. In addition to process monitoring, both
programs can also detect long running activities like SQL queries and queue puts or gets.
The Application, Database, and Middleware Alerts program is a common interface
module used by the other programs in the subsystem to format and present alerts to the
management infrastructure. The Application, Database, and Middleware Reporting
program produces the application-focused bottleneck report. The report contains data
gathered by the monitoring programs like application and database process status. It also
indicates if the application utilizes voice, video, audio, graphics, email, FTP, and HTTP
services.
The Strategies to Reduce Application Capacity Limits Guide is a procedure for
application designers that helps them reduce or eliminate practices that artificially inhibit
the effective operation of the Web application. The Capacity Limits Definition Guide
explains the parameters that guide the operation of the programs in the subsystem. The
Capacity Limits Subsystem Guide is a procedure that is an operational guide to the use of
the subsystem. The Application Capacity view is used to display the subsystem's main
report.
The Application Definition database table contains the parameters that define the
application to be monitored for bottlenecks. The Bottleneck Definitions database table
contains definitions of the bottlenecks and thresholds that are specific to the application.
The toolset administrator places these definitions in the database table. It is anticipated
that after a period of trial and error, the threshold values will change to better match the
operational profile of the Web application. The Application Capacity Log database table is
used to hold the data that is used in the reports of this subsystem.
Support for the Change and Configuration Functional Perspectives

The Change and Configuration perspectives are supported by the Unauthorized Change
Detection and Change-Window Awareness subsystems. The purpose of the Unauthorized
Change Detection subsystem is the detection of changes made to an application that are
not formally authorized. Authorized changes apply to applications that are deployed,
operational, or subject to change. A formally authorized change is one that is approved
through the organization's change control process. Ideas for this subsystem are based on
functionality that is built into all operating systems. UNIX systems, for example,
maintain a time stamp that is the date and time when the file was last modified (UNIX
Unleashed, 1994). Windows 95 maintains three dates that can be viewed by displaying the
properties of the selected item. The dates are when the resource was created, modified,
and last accessed (Introducing Windows 95, 1995).
This subsystem compares MIR data that defines approved change windows with the
change dates for key application resources like files. Changes to resources on specified
systems that are outside approved change windows are considered unauthorized. Toolset
components from this subsystem can be used during the construction, deployment,
operations, and change phases. For example, the Change Detection program can be used
on a monthly basis during the operations phase as a method to meet security audit
requirements. The Building and Maintaining Applications Configuration Guide can be
used during the change phase to add new resources to the definition list for monitoring of
unauthorized changes.
The Unauthorized Change Detection subsystem is supported by sixteen toolset
components including five programs, five procedures, two views, and four required tables.
Four optional tables are also included as part of this subsystem. A summary of the
components that were designed for this subsystem is shown in Table 23.
Table 23. Unauthorized Change Detection Component Summary

Component Type   Component Name
Program          Change Definition
                 Change Detection
                 Application Configuration
                 Change Reporting
                 Application Configuration Reporting
Procedure        Change Definition Guide
                 Understanding Change Detection
                 Building and Maintaining Applications Configuration Guide
                 Using Application Configuration
                 Using Change Detection
View             Application Configuration
                 Change Detection
Table            Application Definition
                 Application Configuration
                 Change Definition
                 Unauthorized Change Detection Log
                 Application Configuration Log Summary (optional)
                 Unauthorized Change Detection Log Summary (optional)
                 Application Configuration View History (optional)
                 Change Detection View History (optional)
The Change Definition program is used to build and maintain data that identifies
authorized change periods. The Change Detection program analyzes predefined physical
elements on a periodic basis to see if they have been changed during periods that are not
authorized. The Application Configuration program is used to define resources for an
application that is subject to change monitoring. There are two reporting programs that are
part of this subsystem. The Change Reporting program is used to produce the report for
the Change Detection view. The Application Configuration Reporting program is used to
produce the report for the Application Configuration view.
The Change Definition Guide is a procedure that explains how the Change Definition
program utilizes existing change-management records to create a change definition. The
Understanding Change Detection procedure describes how change detection compares the
last changed dates of physical elements of an application to authorized changes. The
Building and Maintaining Applications Configuration Guide is a procedure that explains
how this subsystem utility builds a useful application configuration using existing MIR
data. Also explained is how unique application-specific components can be supplied. The
Using Application Configuration procedure explains how to use the data and information
supplied in the Application Configuration view. The Using Change Detection procedure
explains how to use the data and information supplied in the Change Detection view.
The Application Configuration view is used to display the configuration data for a
specific application. This view can also be used to update the settings displayed. The
Change Detection view is used to display the unauthorized changes that have been
detected and stored in the MIR. The Application Definition database table is used to
identify the application for which unauthorized change detection will be performed. The
Application Configuration database table is used to contain the list of application
resources that will be monitored by the subsystem. The Change Definition database table
is used to contain the valid change periods that will be used in the analysis to determine if
a change is authorized. The Unauthorized Change Detection Log database table is a data
repository that contains data on all the unauthorized changes that have been discovered
by the subsystem.
The second subsystem in support of the Change and Configuration perspectives is the
Change-Window Awareness subsystem. The purpose of the Change-Window Awareness
subsystem is to maintain normal operations and management of an application while it is
undergoing change that is taking place during a predefined period called a change
management window. A change management window is a common term used by
individuals in the IT industry as well as hardware and software vendors. The hardware
vendor Cisco suggested that polling MIB objects after a change management window
was an ideal way to get an up-to-date and accurate inventory of the network (How To Collect,
2002). The change management window is an element of the change management process
that is one of the disciplines described in ITIL (Bladergroen et al., 1998) and IBM
(Harikian et al., 1996).
Toolset components from this subsystem are used primarily during the change phase. If
change management disciplines are used during the construction phase, this subsystem
could be used there as well. The Change-Window Awareness subsystem is supported by
eight toolset components including two programs, one procedure, two views, and three
required database tables. Two optional database tables are also included as part of this
subsystem. A summary of the components that were designed for this subsystem is shown
in Table 24.
Table 24. Change-Window Awareness Component Summary

Component Type   Component Name
Program          Administer Change Window Definitions
                 Manage Active Window
Procedure        How to Administer and Manage Change Window and Its Definitions
View             Change-Window Definitions
                 Change-Window Operations
Table            Application Definition
                 Change-Window Definitions
                 Change-Window Operations Log
                 Change-Window Operations Log Summary (optional)
                 Change-Window Operations View History (optional)
The Administer Change Window Definitions program is used to add and update
change definitions in the MIR. The Manage Active Window program is used to
manipulate the change window in real time. For example, this program can be used to
close a change management window before its scheduled completion. The How to
Administer and Manage Change Window and Its Definitions procedure is a
comprehensive administration and operations document for the subsystem. The Change-
Window Definitions view is used to display and update this subsystem's operational
parameters or definitions. The Change-Window Operations view is used, in conjunction
with the Manage Active Window program, to display and manipulate a scheduled or
active change window.
The Application Definition database table is used to identify the applications that are
valid for this subsystem's actions. The Change-Window Definitions database table
contains the change definitions for this subsystem. The Change-Window Operations Log
database table is used to contain a history of the change window activities.
Support for the Fault Functional Perspective

The Fault perspective is supported by the Smart Fault Generation subsystem. The
purpose of the Smart Fault Generation subsystem is to optimize the creation of application
faults, events, alarms, or alerts utilizing minimal inputs. Typically, fault data consists of
the source of the event, IP address of the source system, hostname, status, severity, date,
and message text. Often, it is desirable to have additional information that adds context to
the primary fault. This subsystem adds this additional detail by harvesting information
that is available at the time of the fault. The structure of the faults from this subsystem
was influenced by the content of Tivoli Event Console events (Lendenmann et al., 1997).
The notion of gathering addition data at the time of the error came from a Compuware
product called Abend-AID. The tools is used to quickly resolve and manage the
application failure resolution process. It is designed to reduce critical application
downtime and achieve service level agreements (Compuware Abend-AID, 2002).
Toolset components from this subsystem can be used during the construction,
deployment, operations, and change phases. For example, the Smart Fault View is useful
in providing application development support during the construction of the application.
Since the Fault Generation program is called by the Web application, its use has
implications on the design of the application as the designer must decide when and under
what circumstances the program is to be invoked. The Smart Fault Generation subsystem
is supported by seven toolset components including two programs, two procedures, one
view, and two required database tables. Two optional database tables are also included as
part of this subsystem. The components that were designed for this subsystem are shown in
Table 25.
Table 25. Smart Fault Generation Component Summary

Component Type   Component Name
Program          Fault Generation
                 Fault Reporting
Procedure        How to Use Smart Fault Generation
                 Working With Specific Fault Data
View             Specific Fault
Table            Application Definition
                 Specific Fault Data
                 Specific Fault Data Summary (optional)
                 Specific Fault View History (optional)
The Fault Generation program is called by the Web application with a parameter list
that directs the program's processing. The Fault Reporting program is used to retrieve and
format the information contained in the fault. How to Use Smart Fault Generation is a
procedure that explains how to call the Fault Generation program and how to request
support for specific data like application, database, and middleware. The Working With
Specific Fault Data procedure explains how to analyze and manipulate the specific fault
data collected by the Fault Generation module.
The Specific Fault view is used to display a report that contains primary fault
information and additional detailed information if it has been requested by the calling
Web application. The Application Definition database table is used to name the Web
applications that can use the Smart Fault subsystem. It also contains application specific
parameters that are used to control some aspects of the subsystem's processing. The
Specific Fault Data database table is the main output repository for the faults that are
generated by the subsystem.
Support for the Operations Functional Perspective

The Operations perspective is supported by the Integrated Operations subsystem. The
purpose of the Integrated Operations subsystem is to provide an integrated operational
framework for the activities that are central to computer operations. These activities focus
on job scheduling; backup and restore status and history; print output status; and other
outputs like file transfers and print-to-file operations. The notion of integrating these
activities into a single framework is the main idea behind this subsystem. This focus was
strongly influenced by the IBM IT Process Model processes that are centered on systems-
management activities (Fearn, Berlen, Boyce, and Krupa, 1999).
Integration has not taken place in the marketplace because the operations software
products are produced by a wide variety of vendors that do not share a common approach
or framework. The Integrated Operations subsystem was designed to be independent of a
specific vendor helpdesk or print output utility. Toolset components from the Integrated
Operations subsystem can be used during construction, deployment, operations, and
change phases. For example, the Integrated Operations view can be used during the
change phase to monitor the backup jobs that are often run early in a change window to
provide a data or software copy to be used if a restore operation is needed. This subsystem
should be installed and used during the construction phase to make sure that it is useful and
can be maintained during later life cycle phases.
The Integrated Operations subsystem is supported by ten toolset components including
five programs, two procedures, one view, and two required database tables. Two optional
database tables are also included as part of this subsystem. The components that were
designed for this subsystem are shown in Table 26.
Table 26. Integrated Operations Component Summary

Component Type   Component Name
Program          Job Monitor
                 Backup/Recover Interface
                 Print Interface
                 Other Output Interface
                 Integrated Operations Reporting
Procedure        How to Set up Integrated Operations
                 How to Use Integrated Operations
View             Integrated Operations
Table            Application Definition
                 Integrated Operations Data
                 Integrated Operations Data Summary (optional)
                 Integrated Operations View History (optional)
The Job Monitor program is used to track the execution of Web application
background programs. The Backup/Recover Interface program is used to monitor backup
and recover operations. The program manages active operations and keeps history
information from recent executions in the MIR. The Print Interface program tracks real-
time print activities for queues it has been defined to manage. The Other Output Interface
program monitors and manages non-print output including print-to-file and file transfer
objects. The Integrated Operations Reporting program is used to create reports that
include primary subsystem information from job, task, and other services.
Two procedures are part of this subsystem. The How to Set up Integrated Operations
procedure explains the steps that are required to set up the subsystem using its main
configuration definitions which are stored in the Application Definition database table.
The How to Use Integrated Operations procedure explains how to use the Integrated
Operations view to manage an application's background units of work as well as
backup/restore, print, and other output resources. The Integrated Operations view is used
to display and manipulate the data that is gathered to support the subsystem.
The Application Definition database table contains the parameters that identify the
Web application and its components that are to be managed by this subsystem. The
Integrated Operations Data database table stores the operational data that is required to
support the processing needs of the subsystem in its management of the Web application.
Support for the Performance Functional Perspective

The Performance perspective is supported by the Intimate Performance subsystem. The
purpose of the Intimate Performance subsystem is to explore both intrusive and non-
intrusive techniques to instrument an application for manageability. Intrusive techniques
like ARM are currently available (System Management: Application Response, 1998).
ARM is effective, but costly to implement, as it requires source code changes to the
application. An alternative to modifying the application is to create a proxy for the
application and instrument the proxy. The instrumented proxy would be used in place of
managing the application itself. This technique was explored during a project to manage
the USAA Internet Member Services management application (Turner, 1998).
Toolset components from this subsystem can be used during design, construction,
deployment, operations, and change phases. For example, the Non-Intrusive Performance
Techniques and Intrusive Performance Programming and Testing procedures are useful
for providing strategies that can be used during the design of the Web application. The
Intimate Performance subsystem is supported by twelve toolset components including
four programs, three procedures, two views, and three required database tables. Three
optional database tables are also included as part of this subsystem. The components that
were designed for this subsystem are shown in Table 27.
Table 27. Intimate Performance Component Summary

Component Type   Component Name
Program          Proxy Performance
                 Proxy Reporting
                 Non-Intrusive Performance
                 Non-Intrusive Reporting
Procedure        Non-Intrusive Performance Techniques
                 Intrusive Performance Programming and Testing
                 How to Set Up and Use Proxy Performance
View             Intimate Performance
                 Proxy Performance
Table            Application Definition
                 Performance Definitions
                 Performance Data
                 Performance Data Summary (optional)
                 Intimate Performance View History (optional)
                 Proxy Performance View History (optional)
The Proxy Performance program supports the execution of the proxy application. This
program is used to schedule the execution of the proxy application and to store its data in
the MIR. The Proxy Reporting program builds and manages the reports associated with
the proxy application. The Non-Intrusive Performance program schedules non-intrusive
performance probes and monitors. It also records the appropriate data in the MIR. The
Non-Intrusive Reporting program builds and manages the reports associated with the data
collected from the probes and monitors. The Non-Intrusive Performance Techniques
procedure explains how to make use of monitors and commands to better understand the
availability and performance of the Web application. The procedure references already-
existing, non-intrusive tools that are available from software vendors. The Intrusive
Performance Programming and Testing procedure explains how to modify and test the
Web application program that has been instrumented with calls to the Intimate
Performance subsystem. The How to Set Up and Use Proxy Performance procedure is a
guide to the administration of the software associated with the Intimate Performance
subsystem.
The Intimate Performance view is used to display performance data from instrumented
Web applications. The Proxy Performance view is used to display performance data from
instrumented proxies for Web applications. The Application Definition database table
contains the parameters that identify the Web application that is to be managed by this
subsystem. The Performance Definitions database table contains the definitions that
support the subsystem, for example relationships between Web applications and their
proxies. The Performance Data database table contains the data that is collected and used
to create the subsystem reports.
Support for the Problem Functional Perspective

The Problem perspective is supported by the Detailed Data subsystem. The purpose of
the Detailed Data subsystem is to provide a data repository of information for operators
and administrators that is related to problem solving. This subsystem contains static
problem solving information including a cause description, specified action, long term
recommendation, and contact information. Although the information is static, links, in the
form of URLs, are placed in the information to reference material that is more dynamic in
nature. The design of this subsystem was influenced by NetView's Network Problem
Determination Application (NetView User's Guide, 2001). This application contained a
description of both the probable cause and recommended actions for a problem. However,
unlike the Network Problem Determination Application this subsystem's information is
focused on solving application, middleware, and database problems.
Toolset components from this subsystem can be used during the construction,
deployment, operations, and change phases. For example, the Detailed Data view can be
used during the change phase to diagnose a problem that may have occurred as a result of
a difference between the test and production application domains. The Detailed Data view
is useful because the large number of elements that make up and support an application
are a significant challenge to compare without the support of a software tool. The Detailed
Data subsystem is supported by eight toolset components including two programs, two
procedures, one view, and three required database tables. Two optional database tables are
also included as part of this subsystem. A summary of the components that were designed
for this subsystem is shown in Table 28.
Table 28. Detailed Data Component Summary

Component Type   Component Name
Program          Generate Detailed Problem Handling Data
                 Detailed Problem Handling Reporting
Procedure        How to Define Cause and Action Information
                 How to Utilize Detailed Problem Handling Data
View             Detailed Data
Table            Application Definition
                 Detailed Data Definitions
                 Detailed Data
                 Detailed Data Summary (optional)
                 Detailed Data View History (optional)
The Generate Detailed Problem Handling Data program is used to place information in
the MIR about a specific problem situation. The Detailed Problem Handling Reporting
program is used to retrieve detailed problem information from the MIR and present it in
the Detailed Data view. The How to Define Cause and Action Information procedure
describes the steps to use to define the information in the MIR. The How to Utilize
Detailed Problem Handling Data procedure is an operation guide for the subsystem. The
Detailed Data view is designed to be used in a context where faults are being investigated
and resolved. It is designed to be used in conjunction with the Specific Fault view.
The Application Definition database table identifies applications for which detailed
data can be displayed. The Detailed Data Definitions database table contains the
parameters which control the operation of the subsystem. The Detailed Data database
table is the main repository for this subsystem containing all the detailed data for each
known problem.
Support for the Security Functional Perspective

The Security perspective is supported by the Interface Monitoring subsystem. The
purpose of the Interface Monitoring subsystem is to provide a single point from which to
view all security related exceptions. The security exceptions are collected by this
subsystem from all main points of potential exposure including the application, database,
middleware, firewall, network protocol, back-end, front-end, load balancing, gateway, and
mail interfaces. The basic idea for this subsystem is that it is more straightforward to
collect and report all security exceptions using one MIR and one report instead of many
different log files and databases.
To implement this idea, utilities are needed to gather security exceptions and transport
them to a fault management system. Many management frameworks have utilities to
extract key messages from files and insert them into the critical message flow or
databases. An example is the Tivoli Event Adapter which is a utility that receives log
messages from various sources like the syslogd daemon running on a mail or load
balancing computer. This utility reformats the messages into Tivoli Event Console events
and forwards them to the event server for processing (Lendenmann et al., 1997). This
same interface can be exploited by the Interface Monitoring subsystem. The Load-
Balancing Data Collection and Mail Data Collection programs can gather these exceptions
and store the information in the subsystem's MIR.
The toolset components from this subsystem can be used during the construction,
deployment, operations, and change phases. For example, the Interface Monitoring view
can be used to see if a development system has been hacked during the construction
phase. Viruses embedded during program construction might inflict their worst damage
during later phases like operations or change. The Interface Monitoring subsystem is
supported by eighteen toolset components including eleven programs, three procedures,
one view, and three required database tables. Two optional database tables are also
included as part of this subsystem. A summary of the components that were designed for
this subsystem is shown in Table 29.
Table 29. Interface Monitoring Component Summary

Component Type   Component Name
Program          Application Data Collection
                 Database Data Collection
                 Middleware Data Collection
                 Firewall Data Collection
                 Network Protocol Data Collection
                 Back-End Data Collection
                 Front-End Data Collection
                 Load-Balancing Data Collection
                 Gateway Data Collection
                 Mail Data Collection
                 Interface Monitoring Reporting
Procedure        Planning for Interface Monitoring
                 Installing and Configuring Interface Monitoring Modules
                 Using Interface Monitoring Data
View             Interface Monitoring
Table            Application Definition
                 Interface Monitoring Definitions
                 Interface Monitoring Log
                 Interface Monitoring Log Summary (optional)
                 Interface Monitoring View History (optional)
Ten resource-specific data collection programs are part of the subsystem including the
Application Data Collection, Database Data Collection, Middleware Data Collection,
Firewall Data Collection, Network Protocol Data Collection, Back-End Data Collection,
Front-End Data Collection, Load-Balancing Data Collection, Gateway Data Collection,
and Mail Data Collection programs. Each of these programs works with a specific type of
resource and collects security data using published interfaces or less direct means like
harvesting messages from program log files. The Interface Monitoring Reporting program
is used to produce reports from the MIR data collected by the other subsystem programs.
Three procedures are part of this subsystem. The Planning for Interface Monitoring
procedure is used during the design phase to ensure that the Web application is built to
generate application security messages if it detects such an exception. The Installing and
Configuring Interface Monitoring Modules procedure explains how to install and set up
the subsystem's data collection programs. The Using Interface Monitoring Data procedure
explains the meaning of the data that is collected from the different collection points like
firewalls and back-end network connections. The Interface Monitoring view is the single
view that is used to display the information from all the collection modules. This view
displays the report data for specific Web applications.
Three database tables are used by this subsystem. The Application Definition database
table identifies the application that is to be monitored by this subsystem. The Interface
Monitoring Definitions database table contains the parameters that control the processing
actions of the subsystem. The Interface Monitoring Log database table is used to contain
the messages associated with the subsystem's monitoring processes. This database table is
the primary source of data for the subsystem's reports.
Support for the Service Level Functional Perspective

The service-level perspective is supported by the SLO/SLA Data subsystem. The
purpose of the SLO/SLA Data subsystem is to gather and report information that
describes the level of service that the application is providing to its users. In the Web
industry, two levels of service are usually supported. An application with a service level
objective is one that provides application availability or performance at a level that is not
guaranteed to its users. The service level objective might be stated in terms like--"Web
hosting provider's availability objective for the Web Hosting Environment is less than
four hours per calendar month of downtime, subject to specific exclusions" (Universal
Server Farm, 2000, p. 17). This objective is simply a goal. If the goal is not achieved,
there are typically no penalties.
A service level agreement is more like a contract than a service level objective. The
agreement specifies a concrete goal like--"the UUNET network will be available 100% of
the time". In this situation, UUNET also stated "should these specified"
levels of service fail to be achieved, UUNET will credit the customer's account" (Service
Level Agreements, 2001, p.1). In other cases, if agreements are not met then penalties
might be paid to the organization that relies on the application.
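The distinction between the two levels of service can be expressed directly in code. The
sketch below, with assumed threshold and credit values, evaluates one calendar month of
downtime against an objective (a goal carrying no penalty) and an agreement (a contract
that credits the customer when the target is missed).

    def evaluate_service_level(downtime_hours, threshold_hours=4.0,
                               is_agreement=False, credit_percent=10.0):
        """Return an outcome description for one calendar month of downtime."""
        if downtime_hours < threshold_hours:
            return "service level met"
        if is_agreement:
            # An SLA miss carries a contractual consequence.
            return ("SLA missed: credit the customer %.1f%% of the monthly "
                    "charge" % credit_percent)
        # An SLO is only a goal; a miss carries no penalty.
        return "SLO missed: no penalty applies"

    # Example: 5.5 hours of downtime against a four-hour monthly threshold.
    print(evaluate_service_level(5.5))                     # SLO missed
    print(evaluate_service_level(5.5, is_agreement=True))  # SLA missed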
Tools and products in the marketplace influenced the design for this subsystem. The
notion of monitoring Web URLs in a service level context was influenced by the Port
Checking Pattern Matching Monitor program, which can be invoked on a periodic basis to
launch a URL and load the first 256 characters of the page (Woodruff, 1999). The
Keynote Perspective service also influenced the design of this subsystem as it has a
number of built-in functions that support its use as a service level tool (Keynote
Perspective, 2000). The SLO/SLA subsystem extends the ideas from the Port Checking
Pattern Matching Monitor program and Keynote Perspective product by not only
integrating URL availability and performance, but also application, middleware, network,
operating system and hardware data in the same MIR that can be used for reporting.
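A minimal sketch of the URL collection idea follows: retrieve a page, read only its first
256 bytes, much as the Port Checking Pattern Matching Monitor loads the first 256
characters, and record the availability and elapsed time in the MIR. The slo_sla_log table
and its columns are assumptions introduced for illustration.

    import sqlite3
    import time
    from urllib.request import urlopen

    def collect_url_data(url, db_path):
        """Time the retrieval of the first 256 bytes of a page and log it."""
        start = time.time()
        available = 1
        try:
            with urlopen(url, timeout=10) as page:
                page.read(256)  # sample only the beginning of the page
        except OSError:
            available = 0  # the URL could not be launched or loaded
        elapsed = time.time() - start
        conn = sqlite3.connect(db_path)
        conn.execute(
            """CREATE TABLE IF NOT EXISTS slo_sla_log (
                   url TEXT, available INTEGER, response_seconds REAL)""")
        conn.execute("INSERT INTO slo_sla_log VALUES (?, ?, ?)",
                     (url, available, elapsed))
        conn.commit()
        conn.close()

    # Example: collect_url_data("http://www.example.com/", "mir.db")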
Toolset components from this subsystem can be used during operations and change
phases. For example, during the change phase, the SLO or SLA data can be used to verify
that a recent change to the application did not have a negative impact on its service level
objective. The SLO/SLA Data subsystem is supported by sixteen toolset components
including nine programs, two procedures, two views, and three required database tables.
Two optional database tables are also included as part of this subsystem. A summary of
the components that were designed for this subsystem is shown in Table 30.
Table 30. SLO/SLA Data Component Summary

Component Type    Component Name
Program           Retrieve SLO/SLA Definitions
                  Collect and Record URL Data
                  Collect and Record Application Data
                  Collect and Record Middleware Data
                  Collect and Record Network Data
                  Collect and Record OS Data
                  Collect and Record Hardware Data
                  SLO Reporting
                  SLA Reporting
Procedure         How to Define SLO and SLA Customers
                  How to Interpret and Use SLO/SLA Information
View              SLO Information
                  SLA Information
Table             Application Definition
                  SLO/SLA Definitions
                  SLO/SLA Log
                  SLO/SLA Log Summary (optional)
                  SLO/SLA View History (optional)
The Retrieve SLO/SLA Definitions program performs a control function for the other
programs in this subsystem by coordinating and synchronizing their collection operations.
The actions of this program are controlled by the contents of the SLO/SLA Definitions
database table. There are six programs that are used to collect and record specific data.
These programs are the Collect and Record URL Data program, the Collect and Record
Application Data program, the Collect and Record Middleware Data program, the Collect
and Record Network Data program, the Collect and Record OS Data program, and the
Collect and Record Hardware Data program. The specific data each program records is
indicated by its name. Two other programs, the SLO Reporting and SLA Reporting
programs, generate the reports that are needed to make use of the data from this subsystem.
The How to Define SLO and SLA Customers procedure describes the basic definitions
that are needed by the subsystem. The How to Interpret and Use SLO/SLA Information
procedure is focused on the operational aspects of the subsystem. A significant focus of
this procedure is understanding and using the information that is in the subsystem reports.
Two views are supplied with this subsystem. The SLO Information and SLA Information
views display reports that contain information about a specific service level objective or
service level agreement Web application.
The Application Definition database table is used to identify the SLO or SLA Web
application. The SLO/SLA Definitions database table is used to contain the parameters
that the subsystem needs to support its operation. The SLO/SLA Log database table
contains message data that documents the operation of the subsystem over a period of
time.
Support for the Software Distribution Functional Perspective

The software distribution perspective is supported by the Deployment Monitoring and
MIR Creation subsystems. The purpose of the Deployment Monitoring subsystem is to
monitor and manage mission-critical software distributions of Web applications. The
concept of deployment that is used in this subsystem makes it possible to take an already
installed and configured application and copy it into a completely new environment. For
example, an application that is running in a verification environment could be copied with
a single deployment action into a new or existing production environment.
Several existing products influenced the design of this subsystem. The POSIX
standard, which includes utilities that facilitate software distribution, was a building block
for this subsystem. POSIX includes copy distribution, package distribution, and verify
software utilities. It also has software structures like bundles and filesets that make
standardized software distribution possible (Information Technology - Portable, 1995).
This subsystem can use the POSIX utilities as it manages the deployment of an
application to a target domain. Gumbold (1996) described software distribution by
reliable multicast that involved an end-to-end application layer protocol. That researcher's
work utilized a thin transport layer and a best effort network layer multicast service which
could be integrated into a deployment approach that ensured successful delivery to the
target system.
Osel and Gansheimer (1995) described the use of the OpenDist toolset to synchronize
file servers. OpenDist is another example of a utility that could be used and managed as
part of a Web application deployment. The added value of domain management provided
by the Deployment Monitoring subsystem could heighten the usefulness of utilities like
OpenDist. Toolset components from this subsystem are most useful during deployment,
operations, and change phases. However, the procedure How to Design for Distribution
would be useful to application developers during the design phase.
The Deployment Monitoring subsystem is supported by twelve toolset components
including five programs, three procedures, one view, and three required database tables.
Two optional database tables are also included as part of this subsystem. A summary of
the components that were designed for this subsystem is shown in Table 31.
Table 31. Deployment Monitoring Component Summary

Component Type    Component Name
Program           Start Deployment
                  Stop Deployment
                  Restart Deployment
                  Coordinate Deployment
                  Deployment Monitoring
Procedure         How to Set Up for Deployments
                  How to Manage Deployments
                  How to Design for Distribution
View              Deployment Management
Table             Application Definition
                  Deployment Definitions
                  Deployment Status Log
                  Deployment Status Log Summary (optional)
                  Deployment Status View History (optional)
The Start Deployment, Stop Deployment, and Restart Deployment programs are all
utilities that support the actions implied by their program names. These actions are
selected from an operator-action view supported by the toolset. The Coordinate
Deployment program is the main module that supports the subsystem. It invokes the other
utility functions as required and ensures that actions are logical. For example, the
Coordinate Deployment program will not start an already started deployment and will
only restart a deployment whose previous start operation was unsuccessful. The
Deployment Monitoring program is used to proactively support active deployments by
continuously examining the utility logs of dependent programs for the messages it relies
upon to determine the success or failure of software distributions.
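The coordination rules described above amount to a small state check performed before
any utility is invoked. The following sketch assumes three deployment states kept in the
Deployment Status Log; the state names are illustrative.

    # Assumed deployment states recorded in the Deployment Status Log.
    STARTED, FAILED_START, COMPLETED = "started", "failed_start", "completed"

    def coordinate(action, current_state):
        """Reject illogical deployment actions before invoking a utility."""
        if action == "start" and current_state == STARTED:
            raise ValueError("deployment is already started")
        if action == "restart" and current_state != FAILED_START:
            raise ValueError("only a deployment with an unsuccessful start "
                             "operation can be restarted")
        return "invoke the %s utility" % action

    print(coordinate("restart", FAILED_START))  # allowed
    # coordinate("start", STARTED) would raise ValueError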
The How to Set Up for Deployments procedure explains the main concepts regarding
deployment and how to set up subsystem files to best support the needs of the Web
application. The How to Manage Deployments procedure is a guide to operations for the
subsystem. The target audience is support personnel. The How to Design for Distribution
procedure is written for the application designer and focuses on key items that should be
considered during the design of the Web application. The Deployment Management view
is the focal point for information about a specific deployment of a Web application. The
report in the Deployment Management view explains if the deployment was successful
and provides the ability to browse logs to examine key messages associated with the
utilities that were used.
Three database tables are used by the Deployment Monitoring subsystem. The
Application Definition database table is used to identify the applications that can be used
by this subsystem. The Deployment Definitions database table is used to hold the
parameters that control the subsystem. An example is the setting to simulate deployment.
This setting has a yes or no value. If set to yes, the Deployment Monitoring subsystem
will try all operation and utility functions in a non-permanent way, making it possible for
the administrator to discover and fix problems before they occur during a real change
window. The Deployment Status Log database table is used to store messages that
indicate the success or failure of deployment actions.
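The simulate-deployment setting suggests a conventional dry-run pattern. The sketch
below, whose function and step names are assumptions, shows how each utility action
might honor the yes/no value before doing permanent work.

    def run_utility_action(name, simulate):
        """Perform, or merely report, one deployment action."""
        if simulate:
            # Non-permanent path: show what would happen, change nothing.
            print("SIMULATE: would run %s" % name)
            return
        print("Running %s" % name)
        # ... real copy, package, or verify work would happen here ...

    # The POSIX-style utility steps named earlier in this section.
    for step in ("copy distribution", "package distribution", "verify software"):
        run_utility_action(step, simulate=True)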
The MIR Creation subsystem is the second subsystem in support of the software
distribution perspective. The purpose of the MIR Creation subsystem is to populate the
MIR with information in support of packaged distributions. This subsystem directly
supports other subsystems like Automated Installation and Configuration by reducing the
burden of predefining the files, programs, and other objects needed to support automated
actions. The idea of a MIR creation utility is based on a number of other utilities and this
researcher's experiences working with software tools. In the early 1990s, a powerful
object-based system called the Resource Object Data Manager was delivered by IBM as a
component of NetView (Finkel and Calo, 1992). The Resource Object Data Manager was
complex to use, and it was not until utilities were supplied to populate its MIR that it
became popular and more useful. BLDVIEWS is an example of a Resource Object Data
Manager utility that was a direct influence upon this subsystem (NetView for OS/390
Application, 2001).
Toolset components from this subsystem can be used during construction, deployment,
operations, and change phases when MIR definitions need to be built or rebuilt in
association with the development of a new or changed function. For example, the MIR
Creation Reporting program is useful for verifying the list of components that will be
distributed as part of a change to be deployed to Web application or database servers. The
MIR Creation subsystem is supported by eleven toolset components including four
programs, one procedure, one view, and five required database tables. Two optional
database tables are also included as part of this subsystem. A summary of the components
that were designed for this subsystem is shown in Table 32.
Table 32. MIR Creation Component Summary

Component Type    Component Name
Program           MIR Creation
                  Inventory Scan
                  Inventory Associations
                  MIR Creation Reporting
Procedure         How to Set Up and Use Inventory, Scan, and Associations Guide
View              MIR Creation
Table             Application Definition
                  Predefined Associations
                  Deployment Definitions (from Deployment Monitoring)
                  Test/Verification/Production Library Definitions
                  MIR Creation Log
                  MIR Creation Log Summary (optional)
                  MIR Creation View History (optional)
The MIR Creation program is the main utility of this subsystem; it is used to build the
tables that are used by other subsystems in the toolset. The history of creation activities is
kept in the MIR Creation Log database table. The Inventory Scan program is used to read
files, based on predefined library definitions that contain Web application elements. This
program provides a utility function to the other programs in the subsystem. The Inventory
Associations program is another utility program. This program is used to define
associations between elements at the file level and a specific Web application. This
program uses input specifications that are flexible; for example, a specification of
BLG*/B2B-EzTran can be used to associate any file name beginning with BLG and
ending with any characters with the B2B-EzTran Web application. The MIR Creation
Reporting program uses the data in the MIR Creation Log database table to create reports
about MIR creation activity. The How to Set Up and Use Inventory, Scan, and
Associations Guide is used to explain how to use the MIR Creation subsystem. The MIR
Creation view is used to display the reports associated with the subsystem.
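The flexible specification format can be implemented with ordinary file-name globbing.
The sketch below applies the BLG*/B2B-EzTran example from the text; the convention of
a pattern, a slash, and an application name is taken from that example, and the file names
are invented for illustration.

    from fnmatch import fnmatch

    def associate(spec, file_names):
        """Map file names matching a 'pattern/application' spec to the application."""
        pattern, application = spec.split("/", 1)
        return {name: application for name in file_names if fnmatch(name, pattern)}

    files = ["BLG001.jsp", "BLGmain.class", "ORD100.jsp"]
    print(associate("BLG*/B2B-EzTran", files))
    # {'BLG001.jsp': 'B2B-EzTran', 'BLGmain.class': 'B2B-EzTran'}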
Five database tables are used with this subsystem. The Application Definition database
table is used here as it is used with other subsystems to identify Web applications upon
which the toolset can perform actions. The Predefined Associations database table is used
to associate files and other Web application components with a specific Web application.
The Deployment Definitions database table, which was created by the Deployment
Monitoring subsystem, is used by this subsystem as it contains some key data that can be
used for MIR creation activities. The Test/Verification/Production Library Definitions
database table is used to identify files or collections of files (libraries) that are needed to
support the MIR Creation programs. The MIR Creation Log database table is used to
contain messages that record the results of the MIR Creation subsystem's activities.
Other Support for the Functional Perspectives

In addition to the 19 subsystems, support was needed for functions like data transfer to
legacy problem-management systems and seamless interface to existing tools and
products. Support for these functions was created for the prototype toolset as needed and
was not the focus of significant design activity.
Application Segment Strategy and Planning for Scenario Development

This part of the chapter describes the application segment strategy and scenario
planning that was developed for the five scenarios for the prototype and related study.
Hough (1993) explained that RAD or Rapid Delivery is a method for developing
applications that can evolve over time. A key step in the Rapid Delivery approach is
application segmentation, a technique that makes it possible to break applications into a
variety of functional capabilities. The application segment strategy for this toolset was to
develop a subset of the fully designed toolset. The subset to be developed was determined
by the functions required by the five scenarios. Unlike some segment strategies, this
strategy fixed the amount of function to be developed, so that exactly the prescribed
functionality was created for evaluation by the survey participants. The other aspect
of this strategy was the approach to be used to develop the prototype. The approach that
was used involved the creation of the user interface, the creation of the supporting
database tables, and the exploitation of the database and data through the user interface.
These three steps in this sequence were used to create the prototype toolset. Each scenario
is now explored in detail. The information in this part of the segment strategy acted as a
roadmap for the development of the HTML views. This part of the application segment
strategy focuses primarily on the user interface.
Web Application Operational Fault
In this first scenario, the Web application failed and an event was generated and
captured by the toolset. The failing application, which supports the General Ledger
function, has been instrumented to gather fault data and make it available to the system
administrator and support personnel. The root cause of the application failure was a
database problem that first evidenced itself with a Structured Query Language (SQL)
error. The application responded to the application error by invoking the Smart Fault
Generation subsystem.
A number of toolset subsystems were used in this scenario. Smart Fault Generation
was used to examine the primary fault data as well as additional information captured by
that subsystem. The Detailed Data subsystem was used to examine the cause description,
specified action, long-term recommendation, and contact information associated with
the fault record. The Resource Modeling subsystem was used to determine if the Web
application was exceeding disk, memory, processor, and I/O resources. Finally, the
Administrator’s Action view of the Specific Fault subsystem was used to transfer the fault
to the problem management system.
The planned flow of activities that supported the development of this scenario was as
follows:
- An event was generated by the application and appeared on the Specific Fault View.
This view was part of the Smart Fault Generation subsystem.
- The Detailed Data subsystem was invoked from the Specific Fault View to
determine what actions were to be taken.
- Vendor recommended actions were taken (see details on the specific error, below)
using the Telnet utility to access the database server.
- Additional site-specific actions were taken. Since this error was probably memory
related, the Resource Modeling subsystem was checked to see if exceptions were
recorded. Finally, the actions taken were recorded using the Administrator’s
Action view of the Specific Fault subsystem and the fault data was transferred to
the problem-management system.
The specific error that was the root cause of the fault is SQL10003C. This error is
generated when there are not enough system resources to process the request. The request
is aborted because it cannot be processed. The cause is that the database manager could
not process the request due to insufficient system resources. The resources that can cause
this error are the amount of memory in the system or the number of message queue
identifiers available in the system. The action to be taken when this error occurs is to stop
the application. Possible solutions to this problem include removing background processes
or terminating other applications that are using the needed resources. If Remote Data
Services are in use, it is recommended to increase the Remote Data Services heap size
(rsheapsz) in the server and client configuration because at least one block is used per
application. It is also recommended to decrease the values of the configuration parameters
that define allocation of memory, including udf_mem_sz if UDFs are involved in the
failing statement. When
this error occurs, an sqlcode of -10003 and an sqlstate of 57011 are also created and returned
to the program (Database 2 Messages Reference, 1995).
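The application's response in this scenario, detecting the SQL error and invoking Smart
Fault Generation, can be sketched as follows. The DatabaseError class and the
invoke_smart_fault_generation helper are hypothetical stand-ins; a real database driver
surfaces the sqlcode and sqlstate in its own driver-specific way.

    class DatabaseError(Exception):
        """Hypothetical driver error carrying DB2-style diagnostics."""
        def __init__(self, message, sqlcode, sqlstate):
            super().__init__(message)
            self.sqlcode = sqlcode
            self.sqlstate = sqlstate

    def invoke_smart_fault_generation(fault):
        print("fault recorded:", fault)  # stand-in for the subsystem call

    def run_general_ledger_query(execute):
        try:
            return execute("SELECT * FROM ledger")
        except DatabaseError as err:
            if err.sqlcode == -10003 and err.sqlstate == "57011":
                # SQL10003C: not enough system resources to process the request.
                invoke_smart_fault_generation(
                    {"error": "SQL10003C", "sqlcode": err.sqlcode,
                     "sqlstate": err.sqlstate, "application": "General Ledger"})
            raise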
Web Application Deployment is Unsuccessful
In this scenario, the deployment of the Web application was unsuccessful and the
failure was detected by the toolset. The deployment was part of a sequence of activities
that included application installation and configuration activities initiated by the
administrator. The application that failed supported Human Resources Benefits
Administration. After detection of the deployment failure, the administrator worked with
the development team to correct the problem and then transferred the fault to the problem
management system for problem tracking and close out. Several reports were created that
contain information about the installation, configuration, and deployment outcomes.
A number of subsystems were used in this scenario. Automated Installation and
Configuration was used to install and configure the Web application. Deployment
Monitoring was used to initiate the deployment and to monitor and manage the Web
application software distribution including the possibility of restoring the previous version
of the application if appropriate. The Specific Fault and Detailed Data subsystems were
used to view specific exceptions and detail data. Finally, actions taken by the
administrator were recorded and the fault was transferred to the problem management
system.
The planned flow of activities that supported the development of this scenario was as
follows:
- The Automated Installation view was used to verify a successful installation of the
application. This view was part of the Automated Installation and Configuration
subsystem.
- Next, the Automated Configuration View was used to see if configuration actions
previously performed were successful. Successful installation and configuration
were required before deployment could begin.
- Deployment Monitoring was used to initiate the deployment to the verification
system, but it failed.
- Vendor recommended actions were taken by the support team (see details on the
specific error, below) using the Telnet utility to access an application server and to
fix the problem. Deployment Monitoring was used to restart the deployment to the
verification system, and this time it was successful.
- The actions taken were recorded in the Specific Fault view and the fault data
(generated when the original deployment failed) was transferred to the problem-
management system as a closed record.
The specific error that was the root cause of the fault is--DIS:SENG:0033 Error:
Cannot create temporary file 'path'. The explanation for this error is that the system is not
able to create a temporary or backup file in the service area. The variable 'path' in the message
will contain the actual directory path name of the temporary or backup file which failed to
be created. The system action is that the operation failed. The suggested operator response
is to check for space availability in the service area or the existence of a file with the same
name, then try the operation again (TME 10 Software Distribution, 1998).
Web Application Change Results in Poor Performance

In this scenario, new functionality was installed for a Web application. The application
was a Business-to-Business site called OrderMarketplace. After the application change,
poor application performance was detected by the toolset and a problem component was
identified. The root cause of the problem was a middleware definition file that was
changed and tested for this release of the application, but not migrated as part of the
planned change.
A number of subsystems were used in this scenario. The Change-Window Awareness
subsystem was used to see if there was an active change window or to examine the status
of the last change. The Unauthorized Change subsystem was used to check for
unauthorized changes. The Configuration Verification subsystem was used to check if the
system with the problem matched the verification system after which it was modeled. The
Deep View subsystem was used to look for other potential impacts of the problem.
The planned flow of activities was as follows:
- Web application monitoring revealed that the Web application was operating, but
performing more slowly than expected after an application change.
- The Change-Window Awareness view was used to see if there was an active
change window. The administrator was wondering--perhaps the change was still in
progress?
- Next, the Unauthorized Change view was used to check for any unauthorized
changes. An unauthorized change might be the reason for the performance
problems.
- The Configuration Verification view was used to see if the system with the
problem was configured properly as compared to other versions of the application
installed in other domains.
- When the configuration problem was found, the application support analyst was
notified and took responsibility for correcting the problem. The actions taken were
recorded in the Specific Fault view and the fault data was transferred to the
problem-management system as an open record.
This error was related to the WebLogic Properties file and the WebLogic System
Execute Thread Count. This value in the WebLogic Properties file equals the number
of simultaneous operations that can be performed by the WebLogic Server. As work
enters a WebLogic Server, it is placed on an execute queue while waiting to be
performed. This work is then assigned to a thread that performs the work. In this
problem scenario, the thread count in the active definition was too low for the manner
in which the application was designed, and work was backing up due to this artificially
low setting (Tuning the WebLogic Server, 2000).
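A site-specific check for this class of problem could compare the active properties file
against the values verified in test. The sketch below assumes the weblogic.properties
key/value format of that era; the property name and the sample values are illustrative.

    def read_execute_thread_count(lines):
        """Extract the execute thread count from properties-style lines."""
        for line in lines:
            key, _, value = line.partition("=")
            if key.strip() == "weblogic.system.executeThreadCount":
                return int(value)
        return None

    active = ["weblogic.system.executeThreadCount=15"]
    verified = ["weblogic.system.executeThreadCount=60"]
    if read_execute_thread_count(active) < read_execute_thread_count(verified):
        print("active thread count is below the value verified in test")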
Web Application Experiencing Bottlenecks as Some Queries Take a Long Time

In this scenario, certain inquiry functions of the Web application were taking a long
time to complete. The application in this scenario supported business-to-business
transactions. The brand name of the application was B2B-EzTran. The toolset was used to
detect and correct the database functions that were performing poorly. The database was
performing poorly because of the way that the application programmer had written the
SQL statement. In the test system, this coding technique did not present a problem
because the volume of application data was low. When the program was migrated into the
production system it performed poorly due to the greatly increased volume of application
data in the database tables.
A number of subsystems were used in this scenario. The Application Bottleneck
subsystem was used to look for specific application, database, and middleware
bottlenecks. Support services were used to interface with a database deep-analysis tool.
This tool used DB2 Event Monitor trace records to identify long running SQL statements.
The Fault Generation subsystem was used to manage the fault detected by the database
deep-analysis tool, and the Detailed Data subsystem was used to get detailed information
on the fault. The SLO/SLA subsystem was used to determine the impact of the bottleneck on
customer satisfaction (and contractual implications). Finally, support services were used
to transfer the fault to the problem-management system.
The planned flow of activities was as follows:
- Software monitoring the customer experience with the application recorded long
response times for some transactions within the application.
- The Application Bottleneck view was used to see if the toolset had detected any
slowdowns with the application itself or the supporting database or middleware
systems.
- Support services were used to interface with a database deep-analysis tool. This
tool was used because application bottlenecks indicated that there was a problem
in the database.
- The Fault Generation view was used to manage the fault detected by the database
deep-analysis tool. The fault contained considerable detail on the database-related
problem.
- The Detailed Data view was used to examine and understand the details associated
with the fault.
- The SLO/SLA subsystem was used to determine the contractual impact of the
bottleneck and then the actions taken were recorded in the Specific Fault view and
the fault data was transferred to the problem-management system as an open record.
This error was caused when a recently implemented SQL query in the Web application
performed poorly because a search field was not indexed. The problem did not evidence
itself in testing because the system contained a small volume of data. The problem was
solved quickly using several Create Index commands. A Create Index command is
typically executed when the database is defined, but the dynamic nature of today's
database systems allows this function and others to be performed on a running production
system (Pratt, 1990).
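The repair described here can be illustrated with a short script run against a live
database. The table and column names below are assumptions standing in for the actual
B2B-EzTran schema, and SQLite is only a stand-in for the production database system.

    import sqlite3

    conn = sqlite3.connect("b2b_eztran.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS orders (
                        order_id INTEGER, customer_name TEXT)""")
    # Index the search field so the slow inquiry no longer scans every row.
    conn.execute("""CREATE INDEX IF NOT EXISTS idx_orders_customer
                        ON orders (customer_name)""")
    conn.commit()
    conn.close()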
Overall Response for the Web Application is Slow, but the Application is Still Functional

In this scenario, the Web application was performing slowly, but all components were
available. The toolset's deep availability capability was used to determine the root cause
of the overall poor performance. This problem situation was particularly challenging
because several problems were detected at about the same time. The problems included
degraded application and database functions involving stalled application programs and
database table deadlocks. Sample transactions that were used to simulate human users
were running slowly and completing in times that were greater than the thresholds that
were set for them.
A number of subsystems were used in this scenario. The Deep View subsystem was
used to take a comprehensive look at the operational status of the application. Next, the
Business Views subsystem was used to see what business system the application was part
of and what applications may be affected. The Intimate Performance subsystem was used
to examine both application-specific and proxy performance data. The Smart Fault
Generation subsystem was used to manage the fault detected by the Intimate Performance
subsystem. Also, the Detailed Data subsystem was used to get details on the fault. Finally,
support services were used to transfer the fault to the problem-management system.
The planned flow of activities was as follows:
- Users of the Value Market application reported slow response with the application.
They told the help desk that "everything is working, but taking much more time
than usual".
- The Deep Information view was used to get a comprehensive view of the
application--everything that has been recorded for availability, automation,
capacity, and the other perspectives. This view gave some clues regarding what to
examine next to determine the root cause of the problem.
- Business Views was used to look for degraded application resources. Degraded
indicates that servers or databases are working, but something is limiting their
productivity or throughput.
- The Intimate Performance view was used to confirm what was found in the other
views about degraded resources.
- The Specific Fault view was used to manage the fault detected by the Intimate
Performance subsystem. Also, the Detailed Data view was used to examine and
understand the details associated with the fault.
- Since there were so many problems--failed restarts, switch faults, and processor
faults--the development team was contacted and the actions taken were recorded in
the Specific Fault view and the fault data was transferred to the problem-
management system as an open record.
These errors were caused by a number of unrelated problems. Although unrelated,
they nevertheless had a collectively negative impact on the application. The
failed restarts were related to a resource shortage that was inhibiting the restarting of a
failed application task or supporting middleware or database component. The notion of
automatically restarting failed components is well established in some domains. In
parallel systems with high levels of concurrent processing, the first monitor that finds a
component missing restarts it (Overview of Parallel, 2002). Only recently have software
developers begun to implement automation at the application level. The switch faults in
this scenario were the result of human error in the configuration of these network devices.
Devices on the same network interface card should be set to the same speed. If they are
not, errors will result on that port. This can impact the performance of the site, especially
if that is the only connection from the Web servers to the Internet (I. Ahad, personal
communication, December 15, 2000). The processor faults were generated because the
CPUs on the Web servers were running at greater than 95% utilization for an extended
period of time. Menasce and Almeida (1999) wrote about the capacity challenges
associated with the unpredictable characteristics of Web service requests. The processor
faults in this situation could be the result of a peak load or a basic mismatch between the
everyday needs of the application and the CPU capacity of its servers.
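The processor-fault condition lends itself to a simple threshold rule over a sampling
window. The sketch below assumes utilization samples collected at a fixed interval; the
threshold and window length are illustrative.

    def processor_fault(samples, threshold=95.0, sustained=6):
        """Fault if utilization exceeds the threshold for `sustained`
        consecutive samples (an extended period at the sampling interval)."""
        run = 0
        for value in samples:
            run = run + 1 if value > threshold else 0
            if run >= sustained:
                return True
        return False

    print(processor_fault([97, 98, 96, 99, 97, 98]))  # True: sustained high load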
Toolset Implementation Utilizing the Segment Strategy

The toolset components developed to support the five scenarios and related toolset
evaluation are summarized in this part of the chapter. The discussion focuses on the
subsystems, graphical interface, and database implementation used in the five scenarios.
The five scenarios exploited the functionality of 15 of the 19 subsystems that made up the
toolset design (see Table 33).
Table 33. Subsystems and Related Scenarios

Subsystem Name                              Scenario Number
Resource Modeling                           1
Resource Accounting                         N/A
Automated Installation and Configuration    2
Configuration Verification                  3
Template Creation                           5
Component Comparison                        N/A
Deep View                                   5
Business Views                              5
Application Bottlenecks                     4
Unauthorized Change Detection               3
Change-Window Awareness                     3
Smart Fault Generation                      1, 2, 3, 4, 5
Integrated Operations                       N/A
Intimate Performance                        5
Detailed Data                               1, 2, 3, 4, 5
Interface Monitoring                        N/A
SLO/SLA Data                                4
Deployment Monitoring                       2
MIR Creation                                3, 4
Subsystem support facilities for the functional perspectives were also exploited. These
facilities were used to interface with utilities and other systems like a legacy problem
management system. Table 33 contains a list of the subsystems and the scenarios that used
them. The Smart Fault and Detailed Data subsystems were used in all five scenarios.
The graphical user interface that was developed consisted of 12 independent Web
pages and 37 framesets consisting of 74 frames. The Web pages were designed using a
consistent layout throughout. For the Web pages that were not part of a frameset, the
layout consisted of the full life-cycle graphic in the upper left, with left-justified text
prominently displayed on the page. All the Web pages contained the current date and time
and used a consistent color scheme. A sample Web page is shown in Figure 13.
Figure 13. Layout of typical Web page
The layout of the framesets employed a scheme similar to the stand-alone Web pages
regarding the use of the full life-cycle graphic and the date and time on each frame. In
addition, each frameset consisted of two frames. The frame on the left was used for
navigation since it contained links to one or more reports for each subsystem. The frame
on the right contained information that was typically in the form of a report. In a few
cases, the frames contained fields that required inputs before a program could be invoked.
An example of an input frame is the Deployment Monitoring view that is used to start the
deployment of an application to a target system. A sample Frameset is shown in Figure
14.
Figure 14. Layout of typical frameset
The database for the toolset was developed using a relational database management
system. Fifteen unique tables were defined to support the prototype toolset. A data
dictionary was created in support of these tables and can be found in Appendix H - Data
Dictionary for Full Life-Cycle Toolset. The Data Dictionary contains definitions for the
dozens of numeric, text, and date/time fields that were created to support the prototype.
The tables that were created in support of the prototype are summarized in Table 34.
Table 34. Subsystems and Related Tables to Support the Prototype

Subsystem Name                              Table Name
Resource Modeling                           Application Definition
                                            Resource Modeling Log
                                            Resource Modeling Monitoring Input
Resource Accounting                         N/A
Automated Installation and Configuration    Application Definition
                                            Automated Installation and Configuration Log
Configuration Verification                  Application Definition
                                            Configuration Verification Log
Template Creation                           Application Definition
Component Comparison                        N/A
Deep View                                   Application Definition
                                            Deep View Application Resources
Business Views                              Application Definition
                                            Business Systems Definitions
                                            Application Resources Log
Application Bottlenecks                     Application Definition
                                            Application Capacity Data
Change-Window Awareness                     Application Definition
                                            Change-Window Operations Log
Smart Fault Generation                      Application Definition
                                            Specific Fault Data
Integrated Operations                       N/A
Intimate Performance                        Application Definition
Detailed Data                               Application Definition
                                            Detailed Data
SLO/SLA Data                                Application Definition
                                            SLO/SLA Definitions
                                            SLO/SLA Log
Deployment Monitoring                       Application Definition
                                            Deployment Status Log
Toolset Evaluation

The evaluation of the toolset is discussed in this part of the chapter. The focus is on the
data collected during the evaluation step. The findings from the survey are discussed,
including a profile of the participants, responses to the survey, and
written comments on the strengths and weaknesses of the toolset.
Findings from the Survey

The development effort to create the toolset began using the software tools described
in Chapter 3. The first challenge was developing a large number of framesets and stand-
alone HTML Web pages. Initially, Netscape Composer was used for this purpose, but the
tool's limitations for a prototype of this size were quickly reached. Missing from
Netscape Composer were facilities that would automatically name and store objects, a
database interface, and a library structure. To address the needs of a development project
of this size, a tool that provided a productive development environment was needed. E-
Commerce Construction Kit was chosen because it had a frame generator, database
support, and site promotion capability, and it cost around $50. The program is
distributed by Macmillan Software and is developed by Boomerang Software of Belmont,
MA (E-Commerce Construction Kit, 2001). Although there were some technical problems
with this software, it was useful in producing the prototype HTML pages and framesets
that were needed to satisfy the technical requirements of the design.
The toolset evaluation involved collecting data from the survey participants using two
instruments. The first instrument, the toolset survey, contained five questions for each
scenario. There were five scenarios in the survey, resulting in twenty-five data elements
collected from each participant. The survey also contained three open-ended questions
that were included at the back of the survey to solicit information that might provide some
additional insights into the toolset's strengths and weaknesses. The second instrument, the
survey participant profile questions, was given to the participants after they returned the
toolset survey. In total, the data collected included the participants’ profile information,
responses to the survey questions, and written comments on the strengths and weaknesses
of the toolset. The findings for all of this data are now discussed.
Profile of Participants

Thirty-three of the 40 individuals who were asked to complete the survey actually
completed it, for a participation rate of 83%. The participants were given two weeks to
complete the survey. One out of three participants had to be reminded that the results of
the survey were overdue. The participants were chosen from the large group of
professionals who specialize in the hosting of customer Web applications. The
participants were geographically located in North Carolina, Illinois, Florida, and Georgia
although they generally work in support of the same line of business called IBM e-
Business Hosting. Experienced participants were selected from a broad set of professions
including project management, database support, offering management, system
administration, and others. A summary of the participants’ profile information is shown in
Table 35.
Table 35. Summary of Participant Profile Information

Profile Variable                 Summary
Years in IT                      16.39 average, 7.78 standard deviation
Years in Web-related work        4.09 average, 2.13 standard deviation
Job Family                       7 job families
Focus Area                       14 focus areas
Systems Management Specialist    16 are systems management specialists; 17 are not
For the 33 participants, the average number of years in IT was 16.39 years. The
minimum experience was two years and the maximum was 33 years. The standard
deviation of the sample was 7.78, indicating significant spread in the number of years of
experience. The average number of years engaged in Web-related
work was far less than the average number of years in IT. This reflects the fact that use of
the World Wide Web is a recent practice for many companies and their employees. The
average number of years performing Web-related work was 4.09. The minimum
experience was .5 years and the maximum was 10 years. The standard deviation of the
sample was 2.13 indicating relatively low spread.
The participants were from seven job families. The families were IT
architect/specialist, technical project manager, technical services, marketing, software
engineer, exempt professional, and consultant. The largest group of participants consisted
of IT architects/specialists. Individuals in this job family design and plan Web site
implementations and also work closely with Web software and hardware. The second
largest group included technical project managers. These project managers plan and
manage the deployment of new Web sites or significant changes to existing sites. The
results of the survey regarding participants' job families are shown below in Figure 15.
Figure 15. Results of the survey regarding participants' job families (IT
Architect/Specialist 27.3%, Technical Project Manager 21.2%, Technical Services 15.2%,
Software Engineer 12.1%, Marketing 12.1%, Exempt Professional 9.1%, Consultant 3.0%)
The survey participants were also asked to specify a focus area within their job family.
Fourteen focus areas were presented to the participants (see Figure 16). The focus areas
were offering development, systems management, architecture support, system
administration, middleware support, process engineering, software development, software
support, software testing, Web measurements, program management, marketing,
infrastructure, and database support. The largest focus areas were offering development
(18.8%) and systems management (12.5%). Individuals working in offering development
define and help to create packages of technical capabilities called offerings that are sold to
customers. Individuals working in systems management implement software that is used
to monitor and control the hardware and software that makes up a Web site. The results of
the survey regarding participants' focus areas are shown in Figure 16.
Figure 16. Results of the survey regarding participants' focus areas (Offering Development
18.8%, Systems Management 12.5%, System Administration 9.4%, Middleware Support
9.4%, Architecture Support 9.4%, Software Development 6.3%, Software Testing 6.3%,
Process Engineering 6.3%, Software Support 6.3%, Web Measurements 3.1%, Program
Management 3.1%, Database Support 3.1%, Infrastructure 3.1%, Marketing 3.1%)
The last question in the profile pertained to a specialization. The participants were
asked if they considered themselves a systems management specialist. Sixteen participants
considered systems management a primary skill, whereas 17 did not.
Responses to the Toolset Survey

Each participant answered five questions for each scenario. The survey instrument was
constructed so that the answers for each question are ordered least favorable, favorable,
and most favorable. Question 4 from the survey demonstrates this clearly. The specific
question is--Which best characterizes how usable the toolset was when handling this
scenario? The first answer is the least favorable from the point of view of the toolset
researcher. The answer is "Not easy to understand". The second answer, "Easy to
understand, but there are some usability concerns", is better than the first so it is
characterized as favorable. The last choice, "User friendly and efficient to use", is clearly
the best answer or most favorable. This ordering of least favorable, favorable, and most
favorable was used to rank the results of the surveys to see which scenarios did better
than others. The methodology combined the percentage of respondents that selected the
favorable and most favorable answers for each question. A summary of percentages for
each scenario was used to determine what characteristics of each scenario were most
successful. The characteristics associated with the questions included ease of
understanding, level of sophistication, meeting of requirements, usability, and potential
impact of its use. The combined response percentages for the five questions were summed to
produce a total score for that scenario. The total score for the scenario was then compared
to a total score for the other scenarios to determine the ranking.
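Expressed as a computation, the ranking methodology reduces to the small sketch below,
shown with Scenario 1's favorable-plus-most-favorable percentages from Table 36.

    def scenario_score(question_percentages):
        """Sum the favorable-plus-most-favorable percentage of each question."""
        return sum(question_percentages)

    # Scenario 1: 94 + 100 + 100 + 97 + 85 = 476 out of a possible 500.
    print(scenario_score([94, 100, 100, 97, 85]))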
Scenario 1 received the best overall score of all the scenarios. The score was 476 out of
a possible score of 500. The overall score is the sum of the combined response
percentages for each question of the scenario. Put another way, only eight of the 165
responses were not in the favorable or most favorable answer range. A summary of the
responses is shown in Table 36.
Table 36. Scenario 1 Summary

(Answers: A1 = least favorable, A2 = favorable, A3 = most favorable)

Question and Focus            Count (A1/A2/A3)   Percentage (A1/A2/A3)    Favorable + Most Favorable   Question Rank
1. Ease of Understanding      2 / 11 / 20        6.06 / 33.33 / 60.61     94%                          3
2. Level of Sophistication    0 / 20 / 13        0 / 60.61 / 39.39        100%                         1
3. Meeting of Requirements    0 / 24 / 9         0 / 72.73 / 27.27        100%                         1
4. Usability                  1 / 13 / 19        3.03 / 39.39 / 57.58     97%                          2
5. Potential Impact           5 / 14 / 14        15.15 / 42.42 / 42.42    85%                          4
All Questions                 Total of Favorable/Most Favorable: 476
The participants indicated through their responses that the strengths of this scenario
were the level of sophistication and ability to meet requirements. Both of these questions
were ranked first within the scenario. The lowest ranking characteristic within this
scenario was the potential impact from the use of the toolset (ranked 4 of 4).
Scenario 2 received the second best overall score of 467 out of a possible score of 500.
It trailed the first-place scenario by nine cumulative percentage points. A summary
of the responses is shown in Table 37 below.
Table 37. Scenario 2 Summary

(Answers: A1 = least favorable, A2 = favorable, A3 = most favorable)

Question and Focus            Count (A1/A2/A3)   Percentage (A1/A2/A3)    Favorable + Most Favorable   Question Rank
1. Ease of Understanding      1 / 9 / 23         3.03 / 27.27 / 69.70     97%                          1
2. Level of Sophistication    1 / 18 / 14        3.03 / 54.55 / 42.42     97%                          1
3. Meeting of Requirements    2 / 15 / 16        6.06 / 45.45 / 48.48     94%                          2
4. Usability                  1 / 10 / 22        3.03 / 30.30 / 66.67     97%                          1
5. Potential Impact           6 / 14 / 13        18.18 / 42.42 / 39.39    82%                          3
All Questions                 Total of Favorable/Most Favorable: 467
The participants indicated through their responses that the strengths of this scenario
were the ease of understanding, the level of sophistication, and usability. The responses to
these three questions were tied for the first ranking. The lowest ranking was the potential
impact of the use of the toolset (ranked 3 of 3).
Scenario 3 received the fourth best overall score of 443 out of a possible score of 500.
It was 33 cumulative percentage points less favorable than the first-place scenario. A
summary of the responses is shown in Table 38.
Table 38. Scenario 3 Summary

(Answers: A1 = least favorable, A2 = favorable, A3 = most favorable)

Question and Focus            Count (A1/A2/A3)   Percentage (A1/A2/A3)    Favorable + Most Favorable   Question Rank
1. Ease of Understanding      3 / 16 / 14        9.09 / 48.48 / 42.42     91%                          1
2. Level of Sophistication    5 / 4 / 24         15.15 / 12.12 / 72.73    85%                          3
3. Meeting of Requirements    4 / 13 / 16        12.12 / 39.39 / 48.48    88%                          2
4. Usability                  4 / 11 / 18        12.12 / 33.33 / 54.55    88%                          2
5. Potential Impact           3 / 10 / 20        9.09 / 30.30 / 60.61     91%                          1
All Questions                 Total of Favorable/Most Favorable: 443
The participants indicated through their responses that the strengths of this scenario
were the ease of understanding and potential impact from the use of the toolset. This is the
first scenario where the potential impact from the use of the toolset did not receive the
lowest ranking. In this scenario, the lowest ranking was the answer to the question on
level of sophistication (ranked 3 of 3).
Scenario 4 received the third best overall score of 458 out of a possible score of 500.
The total score of this scenario was 18 points below the first-place scenario. A summary of
the responses is shown in Table 39.
Table 39. Scenario 4 Summary

(Answers: A1 = least favorable, A2 = favorable, A3 = most favorable)

Question and Focus            Count (A1/A2/A3)   Percentage (A1/A2/A3)    Favorable + Most Favorable   Question Rank
1. Ease of Understanding      3 / 12 / 18        9.09 / 36.36 / 54.55     91%                          1
2. Level of Sophistication    1 / 6 / 26         3.03 / 18.18 / 78.79     97%                          1
3. Meeting of Requirements    1 / 15 / 17        3.03 / 45.45 / 51.52     97%                          1
4. Usability                  3 / 12 / 18        9.09 / 36.36 / 54.55     91%                          2
5. Potential Impact           6 / 10 / 17        18.18 / 30.30 / 51.52    82%                          3
All Questions                 Total of Favorable/Most Favorable: 458
The participants indicated through their responses that the strengths of this scenario
were ease of understanding, the level of sophistication, and the ability to meet
requirements. The answers to all three questions were tied for the first ranking. The lowest
ranking answer was the response to the question on the potential impact of the toolset's
use (ranked 3 of 3).
Scenario 5 received the worst overall score of 428 out of a possible score of 500. It was
48 points less favorable than the first-place scenario. In the sequence of scenarios, this
scenario was the last one administered. The participants may have been experiencing
some mental fatigue by the time they reached this scenario. This scenario was also more
complex than the other scenarios in that the toolset detected multiple problems that were
the potential root cause of the Web application problem. A summary of the responses is
shown in Table 40.
Table 40. Scenario 5 Summary

(Answers: A1 = least favorable, A2 = favorable, A3 = most favorable)

Question and Focus            Count (A1/A2/A3)   Percentage (A1/A2/A3)    Favorable + Most Favorable   Question Rank
1. Ease of Understanding      2 / 11 / 20        6.06 / 33.33 / 60.61     94%                          3
2. Level of Sophistication    0 / 20 / 13        0 / 60.61 / 39.39        100%                         1
3. Meeting of Requirements    0 / 24 / 9         0 / 72.73 / 27.27        100%                         1
4. Usability                  1 / 13 / 19        3.03 / 39.39 / 57.58     97%                          2
5. Potential Impact           5 / 14 / 14        15.15 / 42.42 / 42.42    85%                          4
All Questions                 Total of Favorable/Most Favorable: 428
The participants indicated through their responses that the strengths of this scenario were
the level of sophistication and the meeting of requirements. The lowest ranking answer for
this scenario related to the question on the potential impact of the toolset's use
(ranked 4 of 4).
A summary of the ranking and total scores for the five scenarios is shown in Table 41.
Table 41. Ranking of Scenarios

Rank    Scenario    Total Score
1       1           476
2       2           467
3       4           458
4       3           443
5       5           428
The first few scenarios were the most straightforward from the point of view of simplicity
of the problem and the prototype solution. Scenario 1, for example, was a simple
operational fault where a SQL error was detected by the Web application and the Smart
Fault subsystem was invoked to record the error and gather context information. The
scenario was easy to understand and the toolset displayed seven HTML views or
framesets that explained the scenario. This scenario received the best overall score and
ranked first among the scenarios. In contrast, Scenario 5 was a much more complex
problem situation that used ten HTML views or framesets to show the problem solution.
The root cause of the problem was complex and this probably contributed to the lower
score. Unlike the first scenario where the problem was a straightforward SQL error,
Scenario 5 involved unsuccessful automated actions, switch faults, processor faults, table
deadlocks, and slow-performing buy transactions.
A second approach was taken to the analysis of the data using a question-by-question
approach. The goal of this analysis was to rank the responses to the questions,
independent of the scenarios, to determine what characteristics of the toolset as a whole
were more highly valued by the survey participants. The technique used was similar to
the approach taken for the scenario-by-scenario analysis. The responses were totaled by
question and the average, minimum, and maximum values were computed. Next, on a
question-by-question basis the count of favorable plus most favorable was computed as
well as the favorable plus most favorable percentage. This percentage was then used to
rank the questions. A summary of this analysis is shown below in Table 42.
Table 42. Summary of Question-by-Question Analysis
                                Scenario                      Row
Question: Focus       Answer    1    2    3    4    5        Total   Avg   Min   Max
1: Ease of            A         2    1    3    3    6        15      3     1     6
   Understanding      B         11   9    16   12   10       58      12    9     16
                      C         20   23   14   18   17       92      18    14    23
   Favorable + Most Favorable: 150 of 165 (90.91%), Rank 3
2: Level of           A         0    1    5    1    4        11      2     0     5
   Sophistication     B         20   18   4    6    8        56      11    4     20
                      C         13   14   24   26   21       98      20    13    26
   Favorable + Most Favorable: 154 of 165 (93.33%), Rank 1
3: Meeting of         A         0    2    4    1    5        12      2     0     5
   Requirements       B         24   15   13   15   21       88      18    13    24
                      C         9    16   16   17   7        65      13    7     17
   Favorable + Most Favorable: 153 of 165 (92.73%), Rank 2
4: Usability          A         1    1    4    3    8        17      3     1     8
                      B         13   10   11   12   14       60      12    10    14
                      C         19   22   18   18   11       88      18    11    22
   Favorable + Most Favorable: 148 of 165 (89.70%), Rank 4
5: Potential Impact   A         4    6    3    6    3        22      4     3     6
   of its Use         B         15   14   10   10   15       64      13    10    15
                      C         14   13   20   17   15       79      16    13    20
   Favorable + Most Favorable: 143 of 165 (86.67%), Rank 5
(A = Answer 1, Least Favorable; B = Answer 2, Favorable; C = Answer 3, Most Favorable.
Favorable + Most Favorable sums are out of 165 possible points; percentages are out of 100.)
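To make the tabulation concrete, the following minimal Python sketch (illustrative only; it is not part of the toolset, and the names are hypothetical) reproduces the favorable-plus-most-favorable percentages and the question ranking from the row totals shown in Table 42.

    # Hypothetical sketch of the question-by-question tabulation.
    # counts[question] = (least, favorable, most favorable) summed over all
    # five scenarios; 33 participants x 5 scenarios = 165 responses per question.
    counts = {
        "Ease of Understanding":       (15, 58, 92),
        "Level of Sophistication":     (11, 56, 98),
        "Meeting of Requirements":     (12, 88, 65),
        "Usability":                   (17, 60, 88),
        "Potential Impact of its Use": (22, 64, 79),
    }

    results = []
    for question, (least, fav, most) in counts.items():
        total = least + fav + most           # 165 for every question
        fav_plus_most = fav + most           # combined favorable categories
        pct = 100.0 * fav_plus_most / total  # percentage used for the ranking
        results.append((pct, fav_plus_most, question))

    # Rank the questions by the favorable-plus-most-favorable percentage.
    for rank, (pct, points, question) in enumerate(sorted(results, reverse=True), 1):
        print(f"{rank}. {question}: {points} of 165 ({pct:.2f}%)")

Running this sketch reproduces the ordering shown in Table 43 below.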
A summary of the ranking and total scores for the five questions is shown in Table 43.
Table 43. Ranking of Questions
Rank   Question   Total Score (out of 100 possible % points)
1      2          93.33
2      3          92.73
3      1          90.91
4      4          89.70
5      5          86.67
Regarding the question ranking, question 2, which related to level of sophistication,
received the highest ranking among the participants. This question received 154 out of
165 possible points in the combined favorable plus most favorable categories. There are
165 possible points because each of the 33 participants answered 5 questions each for a
total of 165 possible points. Question 3, regarding the meeting of requirements, ranked
second among the questions. The least successful aspect of the toolset, based on the question ranking, was question 5, which related to the potential impact of the toolset's use.
This question received 11 fewer responses from the combined favorable plus most
favorable categories as compared to question 2.
Written Comments on the Strengths and Weaknesses of the Toolset
This summary of the written comments on the strengths and weaknesses of the toolset
was taken from the comments that are documented in their entirety in Appendix G -
Comment Sheet Details for Full Life-Cycle Toolset. Seven attributes were the focus of the
comments on the strengths of the toolset. The attributes were integrated data and
information, ease of use, improvements to problem determination, process assistance,
comprehensiveness, straightforwardness of the user interface, and integration with other
systems. The strengths of the toolset are summarized in Table 44.
Table 44. Informal Strengths Summary
Attribute                                     Count
Integrated Data and Information               15
Ease of Use                                    9
Improvements to Problem Determination          8
Process Assistance                             8
Comprehensiveness                              6
Straightforwardness of the User Interface      5
Integration with Other Systems                 2
There were 15 comments on the integrated data and information. The benefits
associated with an integrated repository were the most mentioned of all the toolset
benefits. Typical phrases from the participants were "provided a great deal of information
and pulled in information from numerous sources" and "tremendous amount of
information on system and application components". Providing a single repository was a
major goal of this toolset. This was made possible through the toolset's MIR. There were
nine comments on the ease of use of the toolset. Representative comments were "easier to
understand and use" and "easy to use help desk personnel to do preliminary problem
determination".
There were eight comments on how the toolset would help improve problem
determination. Typical phrases were "Makes problem determination easier since data
required to debug is available without additional runs to capture the data" and "I like the
concept of having some of the problem determination assistance views (like Check for
Configuration Differences/Mismatches) generating faults which can then be investigated
further using the mainline processing views (Specific Fault and Detailed Data)". A focus
of the toolset was improving problem determination and this is reflected in the comments.
There were eight comments relating to the usefulness of the process assistance provided
by the toolset. One participant commented "leads the technician in a methodical way to
evaluate the situation" whereas another wrote, "Having the procedures page initially to
guide the support personnel through the toolset is very helpful".
There were six comments on the comprehensiveness of the toolset. "Covers most of the
common Web application issues" and "Very sophisticated and comprehensive" were
typical comments. There were five comments regarding the user interface. Most
commented that it was straightforward. One reviewer commented "Relatively intuitive,
common look and feel". There were two comments on the toolsets ability to integrate with
other systems like the problem management systems.
Five attributes were the focus of the comments on the weaknesses of the toolset. The
attributes were suggestions for improvement, information overload, difficult to follow,
information maintenance burden, and process deficiencies. The weaknesses of the toolset
are summarized in Table 45.
Table 45. Informal Weakness Summary
Attribute                          Count
Suggestions for Improvement        12
Information Overload                6
Difficult to Follow                 5
Information Maintenance Burden      4
Process Deficiencies                3
There were 12 general suggestions for improvement. Two typical comments were
"why not have the tool evaluate and report mistakes" and "in some situations it appears as
though information should have been prioritized better". Regarding information overload,
there were six comments. Clearly, the volume of data and information overwhelmed some
of the participants. Comments included "there may be too much information for the user
to digest" and "quantity of information can be overwhelming". There were five comments
indicating that the toolset was hard to follow. Comments included "I found the different
views that are not common between all the scenarios somewhat difficult to
understand/follow" and "data shown was not always easy to understand". There was no
training of the participants on the toolset, so it is not surprising that they had difficulties
understanding some of what they saw in the scenarios.
There were four comments regarding the information maintenance burden. There is no
question that it is a great challenge to create and maintain the MIR. Failures to capture key information would quickly erode confidence in the toolset. Comments on the maintenance burden included "maintenance of information sources would be high" and
"may be difficult to update tool based upon new software/documentation". Lastly, there
were three comments about process deficiencies. Samples include "a more detailed
breakdown of the possible problems" and "maybe a little more explanation on
functionality".
Four attributes were the focus of the other comments or observations
on the toolset. The attributes were positive endorsement, suggestions for improvement,
higher skill level requirement, and confusing interface with other systems. One participant
mentioned the performance impact of the toolset and another brought up the missing
definition of support roles. The additional informal comments and observations are
summarized in Table 46.
Table 46. Other Informal Comments and Observations Summary
Attribute                                  Count
Positive Endorsement                       12
Suggestions for Improvement                 7
Higher Skill Level Requirement              3
Confusing Interface With Other Systems      2
There were 12 comments that are best characterized as positive endorsements of the
toolset. Examples include "Overall, even including the more complicated scenarios, it is
easy to pinpoint problems or problem areas" and "This is a well thought out
comprehensive set of tools. The level of sophistication is definitely leading edge". There
were seven comments that were suggestions for improving the toolset. "When more
than one possible solution is available, I'd like to see the tool recommend a course of
action" and "I'd suggest you distinguish between proprietary applications, especially those
owned by the customer and shrink-wrap applications like MS Outlook" are both examples
of comments that represent suggestions for improvements. There were three comments
indicating that the toolset required a higher skill level of the operations staff. "The
sophistication of this toolset seems to imply a higher level skill in the operations role than
traditional" was a typical comment.
There were two comments indicating that the interface between the toolset and other
systems was confusing. The comment "The tie-in with the problem management systems
is a bit confusing" points out the difficulty. The relationship between the toolset and other
systems like legacy problem-management systems is well defined, but it is difficult to
completely understand this relationship from the scenarios. In summary, the toolset
gathers faults. These faults are transferred to the problem-management system for
reporting and tracking after investigation of the fault has begun. Faults can be transferred
to the problem management system as open or closed problems.
Summary of Results
In this chapter, the researcher discussed the results of the research project that was
conducted during the period February 2001 to May 2002. The results of the project
focused on the work products that were produced during the design, development, testing,
and evaluation of a prototype toolset for the full life-cycle management of Web
applications. The design of the toolset was summarized in this chapter. The design was
brought about using JAD activities. The JAD sessions resulted in a comprehensive design
for 19 subsystems that consisted of 43 procedures, 78 programs, 25 views, and a database
containing 59 tables. The design was used as an input to help create a segment strategy.
The segment strategy, an important RAD tool, was used to develop the prototype. The
development and testing of a prototype toolset was completed so that survey participants
could evaluate the work. The prototype toolset was developed using a framework that
consisted of five scenarios. The scenarios offered the opportunity to develop toolset
components for 15 of the 19 subsystems. A user interface was developed which consisted
of 12 Web pages and 37 framesets.
The evaluation of the toolset was carried out using an instrument that gathered data on
the ease of understanding, level of sophistication, meeting of requirements, usability, and
potential impact of the toolset's use. Two techniques were used to analyze the data
collected for the survey. The first technique analyzed the data using a scenario-by-
scenario approach. The data revealed that the simpler scenarios were rated more highly than the later ones, which were larger in size and more complex in problem
structure. The second technique used to analyze the data utilized a question-by-question
approach. The data also revealed that the toolset was considered to be sophisticated and
that it met the requirements of managing Web applications. The toolset was weakest when
it came to the question of the potential impact of its use on an organization.
A wide variety of informal comments were gathered and discussed pertaining to the
toolset's strengths and weaknesses. Seven attributes were the focus of the comments on
the strengths of the toolset including integrated data and information, ease of use,
improvements to problem determination, process assistance, comprehensiveness,
straightforwardness of the user interface, and integration with other systems. These are
summarized in Table 44. Five attributes were the focus of the comments on the
weaknesses of the toolset including suggestions for improvement, information overload,
difficult to follow, information maintenance burden, and process deficiencies. These
comments are summarized in Table 45. Other informal comments and suggestions were
collected and are summarized in Table 46. The various data gathering tools made it
possible to collect a variety of information about the prototype toolset.
Chapter 5
Conclusions, Implications, Recommendations, and Summary
Introduction
In this chapter, the dissertation research is presented in the context of four main
sections. The sections are conclusions, implications, recommendations, and summary. In
the conclusions section, 23 research questions and four hypotheses are stated and the
conclusions associated with each are presented. The conclusions are stated based on the
analysis performed for the study. In the implications section, the contribution of the work
to the field of study is presented. The potential applications of the research are also
presented. In the recommendations section, the suggestions for future research and
changes in academic and professional practice are presented. In the summary section, at
the end of the chapter, the entire dissertation project and study are reviewed.
Conclusions
The conclusions of the study are presented below using the research questions and
hypotheses as a framework. The first three research questions are the primary research
questions and are as follows:
1. What are the appropriate procedures, programs, views, schema, and data that would
improve the manageability of Web-based applications?
2. How do these toolset components fit in the context of the application's life cycle
including design, construction, deployment, operation, and change?
3. How do these toolset components round out the functional perspectives of
accounting, administration, automation, availability, business, capacity, change,
configuration, fault, operations, performance, problem, security, service level, and
software distribution?
These three primary research questions are summarized by the first hypothesis. The
first hypothesis contains all the elements of the primary research questions including
toolset components, life-cycle context, and functional perspectives. The first hypothesis is
as follows:
The manageability of Web-based applications is improved by a toolset (procedures,
programs, views, schema and data) implemented in a full life-cycle context, aligned
with key functional perspectives.
The conclusions for the primary research questions and first hypothesis are stated below.
Conclusions for the Primary Research Questions and the First Hypothesis
The first primary research question is focused on the specific components that make up
a toolset that will improve how a Web application is managed. It is centered on the type of
components that are shown in Figure 3 that can be found in Chapter 1. The question is as
follows:
Question 1 - What are the appropriate procedures, programs, views, schema, and data
that would improve the manageability of Web-based applications?
One of the outcomes of this research was to define a well-balanced and comprehensive collection of toolset components that, when used as a system, would have a significant and positive impact on the manageability of the Web application. The challenge of managing Web applications is so broad that it is difficult to confirm with certainty that a given solution like this toolset is an appropriate one. (Conclusion 1) It was confirmed during the prototype toolset development that more than
one solution can be designed and implemented to meet the challenge of managing Web
applications.
(Conclusion 2) The fact that this toolset represents a reasonable approach is supported
by the survey data collected during the evaluation phase. Skilled IT professionals,
averaging 16 years of experience, gave "most favorable" evaluations 50% of the time.
This percentage uses an average for all questions for all scenarios. Integration of data and
information and ease of use were the top two strengths noted about the toolset. These
strengths are documented in Table 44. (Conclusion 3) There are data to support findings
that the volume of information and the burden of maintaining that information are
weaknesses in the toolset. The weaknesses are documented in Table 45. In these two
areas, the toolset may have fallen short regarding the appropriate level of data, but the
support for the toolset was nevertheless strong.
The second primary research question focused on the relationship of the toolset
components to the life cycle of the Web application. Design, construction, deployment,
operation, and change were identified as life-cycle phases and described in the Definition
of Terms that can be found in Chapter 1. The question is as follows:
Question 2 - How do these toolset components fit in the context of the application's life
cycle including design, construction, deployment, operation, and change?
Another outcome in this research was to carefully identify the toolset components that
could have a powerful impact on all phases of the application including design,
construction, deployment, operation, and change. Historically, focus has been on
managing the application after it is deployed. Deployment is an important phase because
it is during this phase that the users of the system gain from its use. (Conclusion 4)
However, other phases are important as well and should receive an appropriate level of
the benefits of improved applications management.
The design of the toolset and prototype toolset implementation were deliberately
focused on all phases of the application's life cycle. A considerable number of procedures
were designed that provided support to the design phase so that the application that was
developed would be more manageable. Many examples are described in Chapter 4. One
example is the Resource Utilization Optimization procedure which is part of a subsystem
in support of the Accounting functional perspective. Another example is the Strategies to
Reduce Application Capacity Limits Guide which is from a subsystem in support of the
Capacity functional perspective.
Several toolset programs were planned to work with design-phase work products with
a goal of reducing the burden associated with integrating the life cycle phases. For
example, definitions needed in the construction phase could be extracted from design documents, eliminating the need to redefine them when the phase boundaries were
crossed. This example is from Chapter 4. Please see the MIR Creation subsystem which is
summarized in Table 32. This subsystem is in support of the Software Distribution
functional perspective.
The third primary research question focused on the relationship of the toolset
components to the functional perspectives that have been identified as appropriate to the
management of Web applications. Accounting, administration, automation, availability,
business, capacity, change, configuration, fault, operations, performance, problem,
security, service level, and software distribution were used as key functional perspectives
and were described in the Summary of What is Known and Unknown About this Topic
which can be found in Chapter 2. The question is as follows:
Question 3 - How do these toolset components round out the functional perspectives of
accounting, administration, automation, availability, business, capacity, change,
configuration, fault, operations, performance, problem, security, service level, and
software distribution?
Another important outcome in this research was to identify and address gaps that exist
for the management of applications in the identified functional perspectives. (Conclusion
5) It was found that some of the perspectives like automation, which have subsystems
summarized in Tables 18 and 19, had a long history and needed only a modest amount of
work to give that perspective an application focus. (Conclusion 6) Other perspectives like
capacity (see Table 22) needed much more analysis and investigation to make them a
compelling perspective for application management. For the capacity functional
perspective, the application bottleneck subsystem was a challenge to design and
implement requiring a complex monitoring strategy. (Conclusion 7) In an unexpected
way, the integration of these functional perspectives or disciplines was also explored and
this resulted in some powerful combinations. For example, once the data were collected
relating to all the functional perspectives, it was very useful to have summary or key
indicators presented in one toolset view. This was the approach taken with the Deep View
subsystem (see Table 20 for a summary of this subsystem) with its focus on availability
management of the Web application. This subsystem used data and information collected
by many of the other subsystems like change and problem and brought this data together
in one view to give depth and texture to the issue of application availability.
The first hypothesis is a summary or synthesis of the primary research questions. This
hypothesis includes the notion of the toolset (from the first primary research question),
full life-cycle focus (from the second primary research question), and functional
perspectives (from the third primary research question). The hypothesis is as follows:
Hypothesis 1 - The manageability of Web-based applications is improved by a toolset
(procedures, programs, views, schema and data) implemented in a full life-cycle
context, aligned with key functional perspectives.
The data collected from survey question 5 (Which best characterizes the impact that
the toolset might have on the organization because of the way it handled this scenario?)
reveal that 87% of the participants responded favorably to the toolset (see Table 42). They
responded either that the toolset will have an impact, but improvements are needed or that
the toolset will have a major impact. Only 13% of the participants responded that the
toolset would have no major impact on the users and their productivity. This favorable
response from the participants, regarding this specific survey question, supports the
hypothesis that the manageability of Web-based applications is improved by a toolset. By
nature, this toolset is a full life cycle entity and its design and prototype implementation
leveraged a context that included 15 functional perspectives.
Furthermore, considering the averages over all scenarios, 9% of the participants rated the toolset least favorable, whereas favorable responses were given 41% of the time and
most favorable responses were given 50% of the time. These averages were computed
using the data summarized in Tables 36, 37, 38, 39, and 40 which can be found in Chapter
4. It is clear from these survey results that the responses to the toolset were generally in the
favorable or most favorable category. (Conclusion 8) This favorable response from the
participants, for the overall toolset, supports the hypothesis that the manageability of
Web-based applications is improved by a toolset.
Conclusions for the Secondary Research Questions
Research questions 4 through 23 are the secondary questions. These secondary
research questions were explored during the design and implementation of the prototype
toolset. The secondary research questions have an almost one-to-one relationship to the
subsystems that were designed in support of the 15 functional perspectives. Please see
Table 13 in Chapter 4 for a complete list of the toolset subsystems. The nineteen
secondary research questions are summarized by hypotheses 2, 3, and 4. In this section,
the strengths, weaknesses, and limitations of the study are also discussed.
Question 4, a secondary research question, is focused on the accounting functional
perspective. The question examines if it is possible to instrument or change an application
in order to understand details of its operation such as the resources that it expends during
operation. The question is as follows:
Question 4, Part A - For the accounting functional perspective (as it relates to Web
application management), is it possible to instrument an application whereby the
developer or user specifies the resources they intend to use and the toolset alerts them
when the limit is exceeded? Part B - Are simple messages the appropriate alert
mechanism for this tool?
To address this question, including parts A and B, a detailed design was completed for
the Resource Modeling subsystem. This subsystem, which is summarized in Table 14,
was designed in support of the accounting functional perspective. In addition, this
subsystem was used in Scenario 1 with an implementation of the Resource Modeling
view. Scenario 1 can be found in Appendix E. The subsystem design focused on disk,
memory, processor, and input/output resources and used simple monitors to periodically
compare the actual state of these resources with the thresholds specified by the
programmer or administrator.
Specific to part A of the research question, it was discovered that it is not necessary to instrument the Web application, as the information required to support this function can be retrieved externally through non-intrusive monitors. (Conclusion 9) No changes to
the application programs are required. However, the programmer or administrator does
need an accurate understanding of the performance characteristics of the Web application; otherwise, a lack of knowledge may result in false indications that the application is
exceeding normal thresholds. False indications or alerts are a significant challenge for
many system management teams. For some Web sites, as many as three out of four alerts
are false (N. Knight, personal communication, June 10, 2002).
Relative to part B of this research question, simple messages were generated when a
threshold was triggered and these messages were used to supply the data displayed in the
subsystem reports. (Conclusion 10) The simple message mechanism was sufficient to
meet the needs of the subsystem.
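As an illustration of this non-intrusive approach, the minimal Python sketch below (hypothetical; it is not the subsystem's actual code, and the threshold value is invented) periodically compares the actual state of a disk resource with an administrator-specified threshold and issues a simple message when the threshold is exceeded.

    import shutil
    import time

    # Hypothetical threshold, specified by the programmer or administrator.
    DISK_USED_PCT_LIMIT = 85.0

    def disk_used_pct(path="/"):
        """Read actual disk utilization externally; no application change needed."""
        usage = shutil.disk_usage(path)
        return 100.0 * usage.used / usage.total

    def check_once():
        actual = disk_used_pct()
        if actual > DISK_USED_PCT_LIMIT:
            # A simple message serves as the alert mechanism (part B).
            print(f"ALERT: disk used {actual:.1f}% exceeds {DISK_USED_PCT_LIMIT:.1f}%")

    for _ in range(3):        # a real monitor would loop continuously
        check_once()
        time.sleep(5)         # sample period, shortened for illustration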
Question 5, a secondary research question, is another two-part question relating to the
accounting functional perspective. This question is focused on accounting and charge
back of the application and its supporting resources. The question is as follows:
Question 5, Part A - Another accounting research question is--is it possible to
instrument an application for accountability? Part B - Could this instrumentation be
used for the charge back of the Web site to the internal groups that use it?
To address this question, a detailed design, which addressed both parts A and B of the
research question, was completed for the Resource Accounting subsystem. This
subsystem, which is summarized in Table 15, was designed in support of the accounting
functional perspective. Specific to part A of the research question, a notion that was
explored in the subsystem design was the idea that charge back could be achieved using
an event model. The main idea for the event model was that certain key application events
like sign on, sign off, browse, and update would each be assigned a cost. This cost would
be used as a basis to charge the organization for use of the Web application servers,
network connectivity, and backup and restore. (Conclusion 11) The event model
demonstrated that it is possible to instrument an application for accountability in this
fashion.
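A minimal sketch of the event model follows (hypothetical; the event names come from the design discussion above, but the costs and code are invented for illustration). Each key application event carries a cost, and charges accumulate for the organization that generated the event.

    # Hypothetical event costs; adjustable in trial-and-error fashion (part B).
    EVENT_COSTS = {
        "sign_on":  0.02,
        "sign_off": 0.01,
        "browse":   0.005,
        "update":   0.03,
    }

    charges = {}  # organization -> accumulated charge-back amount

    def record_event(organization, event):
        """Record one application event and accrue its charge-back cost."""
        charges[organization] = charges.get(organization, 0.0) + EVENT_COSTS[event]

    record_event("HR", "sign_on")
    record_event("HR", "browse")
    record_event("HR", "update")
    print(charges)  # {'HR': 0.055}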
Specific to part B of the research question, flexibility was built into the design so that the
costs associated with each event could be adjusted in a trial-and-error fashion until the
desired level of charge back was achieved. (Conclusion 12) Based on this design, it is
feasible that this instrumentation could be used for the charge back of the Web site to the
internal groups that use it. Although the design was completed, this subsystem was not
explored in detail in the prototype. However, charge-back data was displayed in Scenario
5 (see Appendix E) on the Deep Information view, but the specific Resource Accounting
views were not developed.
Question 6, a secondary research question, is a two-part question relating to the
administration functional perspective. This question is focused on automating the
installation of a Web application. It also explores the notion of doing it completely
without human intervention. The question is as follows:
Question 6, Part A - For the administration functional perspective, is it possible to
completely automate the key administration activities for the installation of a Web
application? Part B - Is it possible to install a Web application without human
intervention?
To address this question, a detailed design was completed for the Automated
Installation and Configuration subsystem. This design, which is summarized in Table 16,
addresses both parts of the research question. This subsystem was designed in support of
the administration functional perspective. In addition, this subsystem was explored in
detail in Scenario 2 (see Appendix E) where the HR Benefits Plus application was
installed, configured, and deployed using the Automated Installation and Configuration
subsystem.
(Conclusion 13) Specific to part A of the research question, it is possible to
completely automate the key administration activities for the installation of a Web
application. A prototype of this automation was shown in Scenario 2 where views
containing status information for installation, configuration, and deployment activities
were shown. In this prototype, the first two steps were automated and the third step
(deployment) required manual intervention although this activity could also have been
automated.
(Conclusion 14) Specific to part B of the research question, it is possible to install an
application without human intervention, but it was discovered through the design process
that human intervention is a desired feature in some circumstances. A developer may
install a Web application to a number of systems using automated procedures, but the
administrator may desire to complete the process by configuring the application manually.
Question 7, a secondary research question, is also focused on the administration
functional perspective. This question explores a different approach to solving problems
with Web applications that are related to administrative or configuration settings. The
question is as follows:
Question 7 - Another administration research question is--in a problem-solving
context, is it possible to verify the administrative settings of key Web application
software parameters using previously stored values?
To address this question, a design was completed for the Configuration Verification
subsystem. Table 17 summarizes the design for this subsystem. Additionally, this
subsystem was explored in detail in Scenario 3 (see Appendix E) where the configurations
for two domains were compared to look for differences in the Order Marketplace
application. The Configuration Verification subsystem was designed to support the
administration functional perspective.
(Conclusion 15) It is not necessary, but may be desirable, to verify administrative settings using previously stored values. Doing so can dramatically
reduce the time it takes to perform the verification. However, using previously stored
values can raise the question--do these previously stored values reflect the current
configuration of the administrative settings? Achieving a balance between accuracy and
performance is a significant challenge particularly when the Web application has a large
number of components.
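The verification idea can be sketched as follows (a hypothetical illustration, not the subsystem's code, with invented setting names): previously stored values are compared, key by key, with the current settings, and any mismatch is reported for investigation.

    # Hypothetical stored (baseline) and current administrative settings.
    stored  = {"max_connections": "200", "session_timeout": "30", "log_level": "warn"}
    current = {"max_connections": "200", "session_timeout": "60", "log_level": "warn"}

    def verify(stored, current):
        """Return settings whose current value differs from the stored value."""
        mismatches = {}
        for key, expected in stored.items():
            actual = current.get(key, "<missing>")
            if actual != expected:
                mismatches[key] = (expected, actual)
        return mismatches

    print(verify(stored, current))  # {'session_timeout': ('30', '60')}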
Question 8, a secondary research question, is the first of two questions focused on the
automation functional perspective. The question explores the idea of using design phase
work products like design documents as input to a utility that would produce elements like
templates that would be used in later life-cycle phases. The question is as follows:
Question 8 - For the automation functional perspective, is it possible to read design-
phase work products and automatically produce templates to be used in subsequent
phases? Examples might include start, stop, and restart scripts or schema that describe
the key Web application components that make up the Web site.
To address this question, a detailed design was completed for the Template Creation
subsystem. This subsystem, which is summarized in Table 18, was one of the two
subsystems designed to support the automation functional perspective. (Conclusion 16) It
was confirmed through the design process that it is possible to electronically read design
documents, extract lists of application components, and match components with templates
to create functional programs or scripts.
Of these three activities, the greatest technical challenge is creating a functional
program or script because of the large number of variables needed by the program or
script. A design decision was made to use subsystem definitions to supply this input.
This creates a time burden for the administrator. The alternative approach, which was not
explored, was to create a program or script that requires administration and tailoring
before it can be used. In the case where the administrator is supplying definitions, the
administrator has the option to use the generated program or script as a starting point that
can be enhanced or extended through modification and testing.
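The three activities can be sketched in miniature as follows (hypothetical; the design-document format, component names, and template are invented for illustration). Administrator-supplied definitions provide the variables, as in the design decision described above.

    import string

    # Hypothetical start-script template; $component, $command, and $args
    # are filled in from administrator-supplied definitions.
    START_TEMPLATE = string.Template(
        "#!/bin/sh\n# start script generated for $component\n$command $args\n"
    )

    definitions = {
        "order_db":  {"command": "db_start", "args": "--instance order_db"},
        "msg_queue": {"command": "mq_start", "args": "--qmgr msg_queue"},
    }

    def extract_components(design_text):
        """Extract component names from a (simplified) design document."""
        return [line.split(":", 1)[1].strip()
                for line in design_text.splitlines()
                if line.startswith("component:")]

    design_text = "component: order_db\ncomponent: msg_queue\n"
    for name in extract_components(design_text):
        print(START_TEMPLATE.substitute(component=name, **definitions[name]))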
This subsystem was not explored in detail in the prototype; however, automation information about the programs created from templates was displayed in Scenario 5 on the Deep Information view. Scenario 5 can be found in Appendix E.
Question 9, a secondary research question, is also focused on the automation functional
perspective. As was the case with research question 8, this research question is examining
the possibility of crossing life-cycle boundaries in order to gain some benefit during
problem determination or during the analysis activities associated with a major change.
The question is as follows:
Question 9 - Another automation research question is--is it possible to create a tool
that automatically compares designed versus actual installed Web application
components?
To address this question, a detailed design was completed for the Component
Comparison subsystem that supports the automation functional perspective. (Conclusion
17) The design for this subsystem, which is summarized in Table 19, indicated that it is
possible to compare designed components with those that are actually installed.
The programming support exists that makes it possible to electronically read design
documents, extract lists of application components, and match these components with the
components from an actual running system. The volume of data processing required to do
this comparison makes it unlikely that this activity could be achieved in real time. In
addition, if the system that is the subject of the comparison were a production system, the comparison process would create contention that might impact the performance of the
system being used by humans.
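Reduced to its essentials, the comparison is a set difference computed offline, as in the following hypothetical sketch (the component names are invented):

    # Components extracted from the design documents versus components
    # discovered on the running system; both are treated as sets.
    designed  = {"web_server", "order_db", "msg_queue", "report_job"}
    installed = {"web_server", "order_db", "msg_queue", "debug_proxy"}

    missing    = designed - installed   # designed but never installed
    unexpected = installed - designed   # installed but absent from the design

    print("missing from installation:", sorted(missing))
    print("not present in the design:", sorted(unexpected))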
Although this subsystem explored bridging the design and construction life cycle phases with technology, an obstacle remains in that designers do not always update the design of an application with changes made during the construction phase. (Conclusion
18) Discipline over multiple phases would be required in order for this subsystem to
produce useful results. Perhaps this functionality would be useful to an audit team for
financial Web applications. Although this subsystem was part of the comprehensive
toolset design, this subsystem was not explored in detail in the prototype.
Question 10, a two-part secondary research question, is focused on the availability
functional perspective. The question explores availability measures that are more detailed
than the logical state of the resource. The question is as follows:
Question 10, Part A - For the availability functional perspective, what are the
characteristics of "deep" availability? Often, availability is centered on the
management of the state of a logical resource--the symbolic representation of a system
or a user. Part B - How would a deeper treatment of availability be managed? Would
it automatically include responsiveness, stability, and usage measurements?
To address both parts of this question, a detailed design was completed for the Deep
View subsystem. This subsystem design is summarized in Table 20. Additionally, this
subsystem was explored in detail in Scenario 5 (see Appendix E) where availability
information was displayed for the Value Market application using information for the
fourteen other functional perspectives as the context. To address part A of the research
question, information about the Web application was utilized as characteristics or
dimensions of availability. This information was supplied by the subsystems supporting
the accounting, administration, automation, business, capacity, change, configuration,
fault, operations, performance, problem, security, service level, and software distribution
functional perspectives. (Conclusion 19) This information provided a sufficiently detailed
understanding of the actual availability of the Web application.
Specific to part B of the research question, linking the other functional perspectives
with availability supplied the deeper treatment of availability and automatically included
measures of responsiveness, stability, and usage. For example, responsiveness measures
were supplied through performance data including the value of the current and previous
performance indicators. Stability measures were supplied through automation data
including automation actions attempted, both successful and unsuccessful. Usage
measures were supplied through fault data like total faults, transferred closed, transferred
open, examined, not yet examined, and average per day.
Question 11, a secondary research question, is focused on the business functional
perspective. This question is seeking functional capabilities beyond simply visualizing the
collection of related applications in a view supported by monitors and commands. The
question is as follows:
Question 11 - For the business functional perspective, what additional substance or
depth can be created in support of business-systems views, in addition to the current
focus on specific component monitors and commands?
To address this question, a detailed design was completed for the Business Views
subsystem. A summary of this design can be found in Table 21. This subsystem was also
explored in detail in Scenario 5 where business system information was displayed for the
Value Market application. The complete scenario can be found in Appendix E.
The focus of the design and scenario was to show relationships, both peer and parent,
and the status of the application including its middleware and database components. The
subsystem design included both physical and logical components in an attempt to provide
flexibility and richness in the functional possibilities of the subsystem. If a Web
administrator wanted to focus primarily on the tracking of physical components like servers, they could use the subsystem, whereas another Web administrator might prefer to organize the views logically using a business system hierarchy. The design was completed in such a way as to satisfy the needs of both kinds of administrators.
The additional substance or depth, which was sought in the research question, was
supplied by three key design elements. These elements included relationships (peer and
parent); application, middleware, and database status; and the depiction of both physical
and logical components. (Conclusion 20) These design elements were provided in
addition to monitors and commands and provided more robust capabilities than are
typically available with business views.
Question 12, a secondary research question, is focused on the capacity functional
perspective. This multiple-part question concentrates on application capacity bottlenecks
that are in the middleware and database layers. The question is also concerned with how
these bottlenecks evidence themselves when server-centric and network-centric capacity
models are used. The question is as follows:
Question 12, Part A - For the capacity functional perspective, from the point of view of
the application (not the server), is it possible to determine the components of the
application that are important to understanding its potential for capacity bottlenecks?
Part B - Which application, middleware, and database components are essential to
understanding the capacity of the application and how does that relate to server and
network-based models and approaches?
To address this question, a detailed design was completed for the Application Capacity
Bottleneck subsystem. The design for this subsystem is summarized in Table 22. This
subsystem was explored in detail in Scenario 4, which can be found in Appendix E, where
bottlenecks were detected and displayed for application, database, middleware
components.
Specific to part A of the research question, the main bottlenecks analyzed were process
and input/output related. Specifically, five bottlenecks were explored including process
hung, too many processes, long SQL query, long get/put request, and missing processes.
(Conclusion 21) This collection of capacity bottlenecks was identified as the key
inhibitors to the capacity throughput of an application regarding middleware and database
operations.
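The five checks can be sketched as follows (hypothetical; the limits, field names, and sample values are invented, and the actual subsystem relied on a more complex monitoring strategy):

    # Hypothetical limits for the five bottleneck checks.
    LIMITS = {
        "max_processes":   50,    # "too many processes"
        "sql_seconds":     5.0,   # "long SQL query"
        "get_put_seconds": 2.0,   # "long get/put request"
    }
    REQUIRED = {"app_server", "db_listener"}   # "missing processes"

    def find_bottlenecks(sample):
        """Return the capacity bottlenecks present in one monitoring sample."""
        found = []
        if sample["hung_processes"]:
            found.append("process hung: " + ", ".join(sample["hung_processes"]))
        if sample["process_count"] > LIMITS["max_processes"]:
            found.append("too many processes")
        if sample["longest_sql_seconds"] > LIMITS["sql_seconds"]:
            found.append("long SQL query")
        if sample["longest_get_put_seconds"] > LIMITS["get_put_seconds"]:
            found.append("long get/put request")
        missing = REQUIRED - set(sample["running_processes"])
        if missing:
            found.append("missing processes: " + ", ".join(sorted(missing)))
        return found

    sample = {
        "hung_processes": [],
        "process_count": 62,
        "longest_sql_seconds": 7.3,
        "longest_get_put_seconds": 0.4,
        "running_processes": ["app_server"],
    }
    print(find_bottlenecks(sample))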
Specific to part B of the research question, this subsystem design was developed to
complement server and network-based models and approaches, not to replace them.
Current approaches to capacity analysis do not typically include a detailed examination of
application processes, database queries, and queue responses. (Conclusion 22) Since this
is the specific focus of this subsystem, it makes an attractive and useful addition to the
capacity functional perspective.
Question 13, a secondary research question, is focused on the combined change and
configuration functional perspectives. Change and configuration perspectives are often
combined in practice because their functions are closely related in that there is typically a
configuration update for every change. The main focus of the question has to do with
authorized changes, but the question has a security dimension as well, because some
unauthorized changes may also be violations of security policy. The question is as
follows:
Question 13, Part A - For the change and configuration functional perspectives, is it
possible for an application to detect unauthorized changes to itself? Part B - What
would be required to detect and notify these unauthorized modifications?
To address this question, a detailed design was completed for the Unauthorized
Change Detection subsystem. The design for this subsystem is summarized in Table 23.
This subsystem was explored in detail in Scenario 3 where all unauthorized changes were
examined for the Order Marketplace application. This scenario is included with the other
scenarios in Appendix E.
(Conclusion 23) Specific to part A of the research question, it is possible for an
application to detect unauthorized changes to itself if an authorization mechanism is used.
The design for this subsystem used the concept of the change window to implement the
authorization-mechanism idea. Using the change window as a rule, all changes to
application components could be evaluated based on if they occurred within or outside of
a change window. Changes to application components outside of a change window would
automatically be considered unauthorized.
Specific to part B of the research question, to detect and notify regarding these
unauthorized modifications, the design for the subsystem relied heavily on the time
stamps maintained by the operating system for files, directories, and programs.
(Conclusion 24) Change mechanisms, like time stamps, are sufficient to support the needs
of change and configuration subsystems.
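The combination of change windows and operating-system time stamps can be sketched as follows (hypothetical; the window dates and function names are invented):

    import os
    from datetime import datetime

    # Hypothetical authorized change windows: (start, end) pairs.
    CHANGE_WINDOWS = [
        (datetime(2002, 5, 4, 22, 0), datetime(2002, 5, 5, 2, 0)),
    ]

    def in_change_window(when):
        return any(start <= when <= end for start, end in CHANGE_WINDOWS)

    def unauthorized_changes(paths):
        """Flag files whose time stamps fall outside every change window."""
        flagged = []
        for path in paths:
            modified = datetime.fromtimestamp(os.path.getmtime(path))
            if not in_change_window(modified):
                flagged.append((path, modified))
        return flagged

    # Example: unauthorized_changes(["/app/config.xml", "/app/order.jsp"])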
Question 14, a secondary research question, is also focused on the change and
configuration functional perspectives. This question probes the usefulness of change-
window awareness to teams that support Web applications. The question is as follows:
Question 14, Part A - Another change and configuration question is--would
application-level change-window awareness be useful to the team or process making
the changes? Part B - Would this make possible the suppression of certain kinds of
application-generated faults that often occur during planned change periods?
To address this question, a detailed design was completed for the Change-Window
Awareness subsystem which is summarized in Table 24. The Change-Window Awareness
subsystem was explored in detail in Scenario 3 where previous, current, and proposed
change-window data was displayed for the Order Marketplace application. This scenario
can be found in Appendix E.
Specific to part A of the research question, the notion of a change window is useful
because most Web applications are not continuously available, so an authorized period is needed when the application can be maintained and updated. Continuous availability is possible, but expensive to implement, so its use is limited. It is also useful for the application to have knowledge of the window so that it can modify its processing accordingly; administrators would not want an application to generate alerts indicating that the database is down when the application could be made aware that the database is unavailable due to scheduled maintenance.
Specific to part B of the research question, the administrators of the problem
management system would want to suppress problem records during a change window for
an application in order to reduce costs. Researching and closing problem records is time
consuming and usually results in a charge to the owners of the application. A mechanism
is needed to make suppression possible during a change window. The Change-Window
Awareness subsystem created and maintained data in the MIR that made it possible for an
application to determine if a change window was in effect. (Conclusion 25) In the
prototype (demonstrated in Scenario 3), the implementation of this subsystem made
possible the suppression of certain kinds of application-generated faults that occur during
planned change periods.
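At fault-creation time, the awareness data makes a suppression check possible, as in this hypothetical sketch (the fault kinds and names are invented):

    from datetime import datetime

    # Hypothetical fault kinds that may be suppressed inside a change window.
    SUPPRESSIBLE = {"database_down", "middleware_down"}

    def record_fault(application, kind, mir, window_in_effect):
        """Record a fault in the MIR unless it is suppressible in a window."""
        now = datetime.now()
        if kind in SUPPRESSIBLE and window_in_effect(application, now):
            return None           # suppressed: no problem record is opened
        fault = {"application": application, "kind": kind, "time": now}
        mir.append(fault)         # later transferred to problem management
        return fault

    def window_in_effect(application, when):
        return True               # stand-in for a lookup of MIR window data

    mir = []
    print(record_fault("Order Marketplace", "database_down", mir, window_in_effect))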
Question 15, a secondary research question, is focused on the fault functional
perspective. The question explores what could be done to improve fault creation and
management. Specifically, the question examines what could be done to improve the
quality of fault data without putting an increased burden on the application developer. The
question is as follows:
Question 15, Part A - For the fault functional perspective--is there an optimal
technique for generating application faults? Part B - Is a smart fault-generating
module possible? A smart module might be one that takes minimal input from the
application and makes intelligent choices regarding selections for the target systems.
To address this question, a detailed design was completed for the Smart Fault
subsystem. The design for this subsystem is summarized in Table 25. The Smart Fault
subsystem was explored in detail in Scenario 1 (see Appendix E) where context data was
gathered to support a database error that stopped the execution of a Web application.
Specific to part A of the research question, there are several challenges to generating
application faults in an optimal manner. These challenges were noted during the
implementation of the Smart Fault subsystem. One significant challenge is gathering
problem context data from multiple servers in real time and recording that data in a single
MIR record. At times, this process cannot be completed in a timely enough manner
because of the delay associated with inter-server communication. (Conclusion 26) When
this data-gathering process is not completed quickly, the data gathered is not current enough to be helpful to the person who will later use it to solve a problem.
Another significant challenge is the volume of context data. Many Web applications
rely on a large number of supporting middleware and database components and these
components have many tasks and processes. For example, a middleware messaging
component may have 15 queues and 30 processes to manage those queues. When the
context data is gathered, 45 resource names, including status, are gathered by the toolset.
This data may be useful, but it can put a strain on the administrator who is examining the
data looking for information. Several survey participants remarked that the toolset
generated too much data for the administrator to review and understand. (Conclusion 27)
Balance is needed between the data that is possible to gather and display and the ability of
the administrator to understand and make use of it.
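The timeliness challenge can be illustrated with a hypothetical sketch (the server names, deadline, and use of thread-based concurrency are all invented for illustration) that requests context data from all servers in parallel under a deadline and records which servers failed to respond in time:

    from concurrent.futures import ThreadPoolExecutor, as_completed
    from concurrent.futures import TimeoutError as DeadlineExceeded

    def gather_context(servers, collect, timeout_seconds=5.0):
        """Gather per-server context into one record, under a deadline."""
        record = {"context": {}, "stale": []}
        pool = ThreadPoolExecutor(max_workers=max(1, len(servers)))
        futures = {pool.submit(collect, server): server for server in servers}
        try:
            for future in as_completed(futures, timeout=timeout_seconds):
                record["context"][futures[future]] = future.result()
        except DeadlineExceeded:
            # Servers that missed the deadline are noted in the record.
            record["stale"] = [s for f, s in futures.items() if not f.done()]
        pool.shutdown(wait=False)
        return record

    print(gather_context(["web1", "db1"], lambda server: {"status": "ok"}))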
The Smart Fault subsystem was also used in all the scenarios as a common service for
gathering and formatting ordinary fault data. This common service was useful as the fault
data created for any problem situation formed a foundation for the problem solving
activities. Each scenario documented in Appendix E concludes with views that show the
fault data being transferred to the problem management system where it can be used for
problem tracking and reporting.
Question 16, a secondary research question, is focused on the operations functional
perspective. This question examines the idea of making it possible for the helpdesk
personnel to improve their ability to manage an application. This improved ability would
be made possible through a view and related tools that integrate functions that currently
lack integration like job scheduling, data backups, and print outputs. These functions are
related to one another, but are difficult to manage as a collection due to current tool
limitations. The question is as follows:
Question 16 - For the operations functional perspective, is there a way to have an
application view for the helpdesk that integrates key functions like job scheduling,
backup status and history, and the status of key print or file outputs?
To address this question, a detailed design was completed for the Integrated Operations
subsystem which is summarized in Table 26. Although this subsystem was designed it
was not explored in detail in the scenarios. However, job scheduling, output management,
helpdesk, and backup/restore information was displayed in Scenario 5 (see Appendix E)
on the Deep Information view.
The design explored this research question in detail. The technical challenge of
displaying this data in one application-specific view stems from the fact that most of the commercial products used by customers in this area come from multiple vendors who primarily use non-standard interfaces. (Conclusion 28) This problem is overcome when a common MIR is used and HTML is used for presenting data in application-specific views. The primary strength of this subsystem was collecting the key
operational data in a MIR and displaying the data on a view that used the specific Web
application as the context.
Question 17, a secondary research question, is focused on the performance functional
perspective. This question examines the alternatives to modifying the Web application
directly and explores the notion of a performance proxy that might be in the form of an
instrumented application robot. The question is as follows:
Question 17, Part A - For the performance functional perspective, is there an
alternative to gathering intimate application performance data by modifying the
application itself to insert calls to a performance-measurement tool? Part B - Is there
a proxy for this that is possible using an instrumented application robot?
A detailed design was completed for the Intimate Performance subsystem (see Table
27) and this subsystem was explored in detail in Scenario 5 (see Appendix E) where proxy
performance data was examined for the Value Market application. The Intimate
Performance View was effective in displaying the past, present, and future schedule of
operations of the robots or proxies. It also displayed the result of the proxy execution in
the same view.
Specific to part A of the research question, work on the Intimate Performance
subsystem was centered on the use of a performance proxy as modifications to the Value
Market application were not possible. Source code changes to the application to support the gathering of performance data were determined to be too risky. During the design
sessions, JAD participants were supportive of the approach to using a performance proxy
and research after the design sessions did not discover other approaches that could be
explored.
(Conclusion 29) Specific to part B of the research question, the approach of using a
proxy is technically feasible and resulted in performance data that was more compelling
than the data produced by simple non-intrusive performance monitors. The proxy-based
approach has the benefit of producing application-specific performance data without
modifying the Web application itself. One limitation of this approach is the labor it takes
to both design and implement the performance proxy application. It is anticipated that this
burden can be overcome to a high degree by good procedures, models, and templates to
assist with the creation of the performance proxy programs.
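A minimal sketch of a performance proxy follows (hypothetical; the URL is a placeholder and the code is not the prototype's robot). The robot executes a representative transaction and records the response time without any change to the application itself.

    import time
    import urllib.request

    TARGET = "http://example.com/buy"   # placeholder transaction URL

    def run_proxy_transaction(url=TARGET):
        """Execute one proxy transaction; return (HTTP status, seconds)."""
        start = time.monotonic()
        with urllib.request.urlopen(url, timeout=10) as response:
            response.read()
            status = response.status
        return status, time.monotonic() - start

    samples = []                        # in the toolset, stored in the MIR
    status, seconds = run_proxy_transaction()
    samples.append({"status": status, "seconds": seconds})
    print(f"proxy transaction: HTTP {status} in {seconds:.3f}s")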
Question 18, a secondary research question, is focused on the problem functional
perspective. This question examines the challenge of improving the quality of the data
that is provided to the problem-management system. Like Question 15, this research
question is concerned with improving the situation without increasing the burden on the
developer of the Web application. The question is as follows:
Question 18, Part A - For the problem functional perspective, most of the focus is on
the problem-management tools. Is it possible to instrument an application to provide
more meaningful and detailed data to the problem management system? Part B - What
would the instrumentation be that would minimize the programming burden yet
maximize the data collected and recorded?
To address this question, a detailed design was completed for the Detailed Data
subsystem. The design for this subsystem is summarized in Table 28. In addition, this
subsystem was explored in detail in all five scenarios (see Appendix E). Detailed data was
collected and stored in the MIR and was displayed in views that supported each of the
scenarios.
(Conclusion 30) Specific to part A of the research question, it is not appropriate to
instrument an application to provide real-time detailed data. This data needs to be
collected before the Web application is placed in use and maintained during the life of the
application. This data should be static data that is supported by detailed operations
procedures when human activities are required to take a remedial action. For this toolset,
the Smart Fault subsystem was designed to help create and store more meaningful and
detailed event data for the Web application. The Detailed Data subsystem was created, as
a complement to the Smart Fault subsystem, to provide assistance and analysis to the
human operators and administrators who are viewing the event data.
Specific to part B of the research question, the Smart Fault subsystem should be
effective in lowering the burden on the programmer. However, the Detailed Data
subsystem creates a considerable challenge, as it requires data in the MIR for the common
problems that can occur during the normal operation of a Web application. Much of this
detailed data is available from the vendors who create the common support software
systems like relational databases, but it is not in a form that can be directly imported into
the MIR. (Conclusion 31) An industry-standard approach is needed to make this data
accessible to the systems and users that need it.
Question 19, a secondary research question, is focused on the security functional
perspective. This question explores the idea of a comprehensive approach to monitoring
security for an application. This approach examines the common software interfaces like
sign on and sign off and hardware interface points like routers. The question is as follows:
Question 19 - For the security functional perspective, is it possible to build a view
(with probes) that would be used to monitor key security interfaces for an application?
These interfaces might include traditional access points like application sign on
attempts, failures, and retries as well as information from application dedicated
routers, firewalls, and network interface cards.
To address this question, a detailed design was completed for the Interface Monitoring
subsystem (see Table 29), but this subsystem was not explored in detail in the scenarios.
However, security violations, unauthorized changes, and front-end and back-end administrative access information were displayed in Scenario 5 (see Appendix E) on the Deep Information view.
It is technically feasible to gather security information from systems like routers and firewalls; however, most of these operations are privileged in nature and require special
authorizations. Access to data from security devices like firewalls is typically controlled
closely in most organizations. (Conclusion 32) The privileges required are a significant
barrier to gathering the data that is necessary to support an Interface Monitoring
subsystem. In spite of these challenges, it would be useful to have the ability to display, in one view, security information from multiple sources for a specific Web application.
Since this data is privileged and sensitive in nature, using a view like the Interface
Monitoring view (see Table 29) would require special access privileges.
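Even without the required privileges being granted broadly, part of the Interface Monitoring idea can be illustrated with ordinary log analysis. The sketch below summarizes denied connections for one application's address from a firewall log; the log path, line format, and address are assumptions, and in practice the privileged access discussed above would still be required.

```python
# Hedged sketch: summarizing firewall log lines for one application's
# Interface Monitoring view. The log path, the DENY line format, and the
# application address are assumptions for the example.
import re
from collections import Counter

DENY = re.compile(r"DENY .* dst=(?P<dst>\S+) dpt=(?P<port>\d+)")
APP_ADDRESSES = {"10.1.2.3"}  # assumed front-end address of the application

counts = Counter()
with open("firewall.log") as log:
    for line in log:
        m = DENY.search(line)
        if m and m.group("dst") in APP_ADDRESSES:
            counts[m.group("port")] += 1

for port, n in counts.most_common():
    print(f"denied connections to port {port}: {n}")
```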
Question 20, a secondary research question, is focused on the service level functional
perspective. This question examines the possibility of creating a non-intrusive service
level capability that can be useful for reporting both service-level objective and service-
level agreement data. The question is as follows:
Question 20 - For the service level functional perspective, is it possible to architect a
service-level management tool that is independent of the application, yet records specific information that can be used for both service-level objective and service-level
agreement reporting?
To address this question, a detailed design was completed for the SLO/SLA Data
subsystem (see Table 30) and this subsystem was explored in detail in Scenario 4 (see
Appendix E) where history and statistics data was examined for the b2b-EzTran
application.
(Conclusion 33) It is possible to architect a tool that is independent of the application
itself using a collection of application-centered monitors involving URLs and key
application, middleware, database, network, and operating system processes. Special
functions like file transfers can also be included in the monitoring activities. The output of
the monitoring activity can be stored in the MIR as a sample and can be used for SLO or
SLA evaluation as needed. (Conclusion 34) The sample should include as many kinds of
monitors as possible; otherwise, a service failure in an area where there is no monitoring
will go undetected. An undetected service failure for a service level agreement customer
could result in a dispute for a refund between a customer and a service provider.
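The following sketch illustrates one way such an application-independent sampler could work: it probes a list of URLs, records availability and response time, and produces samples that could be stored in the MIR for later SLO or SLA evaluation. The URL list and sample layout are assumptions made for the example.

```python
# Minimal sketch of an application-independent SLO/SLA sampler in the
# spirit of Conclusions 33 and 34. The monitored URLs and the sample
# dictionary layout are assumptions.
import time
import urllib.request

MONITORED_URLS = ["http://localhost:8080/ledger",
                  "http://localhost:8080/login"]

def take_sample(url, timeout=5.0):
    start = time.time()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            ok = 200 <= resp.status < 400
    except OSError:  # covers URLError, HTTPError, and timeouts
        ok = False
    return {"url": url, "available": ok,
            "elapsed_ms": (time.time() - start) * 1000.0,
            "taken_at": time.time()}

# One polling cycle; a real monitor would run on a schedule and store
# each sample in the MIR for later SLO or SLA evaluation.
for url in MONITORED_URLS:
    print(take_sample(url))
```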
Question 21, a secondary research question, is also focused on the service level
functional perspective. This question examines the possibility of gathering both
availability and performance metrics in the service-level context. Gathering availability
data is commonly done, but gathering performance data could add another dimension to
the understanding of the service level of the Web application. The question is as follows:
Question 21 - Another service level question is--is it possible for a toolset to gather
availability and performance metrics as they relate to service level?
The detailed design that was completed for the SLO/SLA Data subsystem also
addresses this question. As previously described, this subsystem was explored in detail in
Scenario 4 where history and statistics data were examined for the b2b-EzTran
application. (Conclusion 35) It is possible to gather both availability and performance
data for service level use. Availability data is easier to gather as the monitors that gather
this data are simply testing to determine if a resource is up or down. Performance data is
more of a challenge because the monitors that gather this data need to be more detailed
and may have to execute Web application functions to take the necessary measurements.
(Conclusion 36) It would be less costly to use a robot or proxy, like those from the
Intimate Performance subsystem, to gather the performance data, as this approach would
put less of a burden on the system supporting the Web application.
Question 22, a secondary research question, is focused on the software distribution
functional perspective. This question is directed at the challenge of providing a useful
mechanism for monitoring the distributions of the software for Web applications. The
question explores what can be done to make this monitoring a simple and straightforward
activity. The question is as follows:
Question 22 - For the software distribution functional perspective, is it possible to
create deployment-phase views that allow software distribution to be monitored on an
application component-by-component basis? Would it be helpful for the monitoring of
mission-critical distributions?
A detailed design was completed for the Deployment Monitoring subsystem (See
Table 31), and this subsystem was explored in Scenario 2 (see Appendix E). In that
scenario, the deployment function was started and a problem with the deployment was
reported on the Deployment Monitoring view. The deployment was restarted after the
problem was resolved. The deployment function was useful and allowed the administrator
to start, stop, or restart a deployment for a Web application. The subsystem provided the
flexibility to handle multiple domains with a variety of different options to handle
situations that might arise including deployment errors. (Conclusion 37) This approach
was successful with products like the System Modification Program (OS/VS2 MVS
Overview, 1980), and it would be useful for Web applications as well, where there is a need to transfer an installed and configured application to a target system for day-to-day
operations.
Question 23, a secondary research question, is focused on another aspect of the
software distribution functional perspective. This question examines the usefulness of
productivity tools that would save administration and setup time by gathering data and
building packages in anticipation of their use in distribution of the Web application. The
question is as follows:
Question 23 - Another software distribution question is--would it be useful to have a
tool that reads a directory structure and builds schema and data to populate the
Management Information Repository? These data, once loaded, could be used to build
packages for distribution, objects for distribution views, and storage for data or
information relating to distributions.
To address this question, a detailed design was completed for the MIR Creation
subsystem (see Table 32), but this subsystem was not explored in detail in the scenarios.
This subsystem was designed to perform a utility service for the software distribution
functional perspective and for the other subsystems by making it easier to populate the
MIR with the required data. Considering the number of elements that are part of a typical Web application, it is impractical not to have a MIR Creation subsystem.
The MIR Creation subsystem would decrease the labor required to meet the definition
needs of the toolset. One of the challenges of the toolset is configuring it to operate
effectively. (Conclusion 38) Some manual configuration is required for all software, but
leveraging the utility program from this subsystem would achieve a balance between high
labor costs, the need for timely implementation, and the software's need for application-
specific input specifications.
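A minimal sketch of such a utility is shown below: it walks an application's directory tree and records one row per file so that packages and views can later be built from the MIR. The table name, its columns, and the install root are invented for the example.

```python
# Sketch of the MIR Creation idea in Question 23: walk an application's
# directory tree and describe each file as a repository row. The
# app_element table and the install root are assumptions.
import os
import sqlite3

mir = sqlite3.connect("mir.db")
mir.execute("""CREATE TABLE IF NOT EXISTS app_element
               (application TEXT, path TEXT, bytes INTEGER, modified REAL)""")

def load_elements(application, root):
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            mir.execute("INSERT INTO app_element VALUES (?, ?, ?, ?)",
                        (application, path, info.st_size, info.st_mtime))
    mir.commit()

load_elements("General Ledger", "/opt/apps/ledger")  # assumed install root
```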
Conclusions for Hypotheses 2, 3, and 4

The second, third, and fourth hypotheses will now be discussed. Hypothesis 1 can be
found earlier in this chapter as it is associated with the primary research questions. The
second hypothesis is a summary or synthesis of a number of secondary research questions.
This hypothesis includes the notion of data sources, which were an aspect of secondary
research questions 5, 15, 17, 18, 19, 20, and 21. This hypothesis also relates to the use of
the MIR. The effective use of the MIR was a consideration of secondary research
questions 4 through 23.
As previously stated, the secondary research questions were explored during the design
and implementation of the prototype toolset. The secondary research questions have an
almost one-to-one relationship to the subsystems that were designed in support of the 15
functional perspectives. Every subsystem that was designed made use of the MIR as its
primary data repository. The hypothesis is as follows:
Hypothesis 2 - Existing data sources like alerts, traps, and messages are sufficient to
build and maintain an effective management information repository for the
management of Web-based applications.
(Conclusion 39) This hypothesis is partially supported by the data from the survey, but
some participants questioned the viability of maintaining up-to-date information from
these data sources in the toolset's MIR. Some participants also commented that the data
maintenance burden might prove overwhelming. See Table 45 in Chapter 4 for a summary
of the weaknesses that were noted by the survey participants.
Existing data sources were an important source of information for the Full Life-Cycle
Toolset MIR. The sources of data for the prototype toolset are shown in Table 47.
Table 47. Data Sources Used in the Toolset Scenarios
Scenario Primary Data Source Existing Source Only?
1 Messages and Monitors Yes
2 Messages No, mixed
3 Faults, Problems, and Command Responses Yes
4 Monitors and Specialized Commands No, mixed
5 Monitors, Faults, Problems, and Command Responses No, mixed
In Scenario 1, an existing database message was used by the Web application to invoke
the Smart Fault subsystem. Other data for this scenario, specifically exception data, was
generated from monitors that detected exceptions with disk, memory, processors, and I/O
using conventional means. In Scenario 2, the primary data source was a message from a
software-distribution utility. Secondary sources were toolset-generated and cannot be characterized as existing data sources. In Scenario 3, the primary data sources were generated by faults, problems, and command responses. These data sources are all from existing methods and interfaces.
In Scenario 4, the data sources were monitors, but some specialized commands were
used. Existing data sources were important, but not used exclusively. In Scenario 5, data
was gathered using monitors, faults, problems, and command responses. As was the case
with Scenarios 2 and 4, existing data sources were utilized, but not exclusively. Appendix E contains the narrative and views that were presented to the
survey participants for all of the scenarios.
(Conclusion 40) The toolset data is inconclusive regarding the hypothesis that existing
sources are sufficient, as the prototype toolset did not rely exclusively on existing sources, though it leveraged them to a significant extent. In a number of instances, toolset-unique functions generated the data necessary to support the management function.
The third hypothesis is also a summary or synthesis of a number of secondary research
questions. This hypothesis is focused on problem determination which is an important
dimension of secondary research questions 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 19,
and 22. Every subsystem that was designed made use of the MIR as its primary data
repository, but these fourteen secondary research questions involve subsystems that have
a strong potential use for solving problems. The hypothesis is as follows:
Hypothesis 3 - Problem determination is significantly improved by a toolset that
utilizes views to display information from a comprehensive management information
repository of data about the Web-based application.
This hypothesis links three variables--improved problem determination, views to
display information, and a comprehensive information repository. These three attributes
with their count and rank are shown in Table 48 below.
Table 48. Three Attributes of Significance to Hypothesis 3
Attribute Count Rank (1-7)
Improvements to problem determination 8 3
Straightforwardness of the user interface 5 6
Integrated data and information 15 1
Table 44, the informal strength summary from Chapter 4, contains information that was gathered on all three variables. Improvements to problem determination were indicated by 8 of the 33 survey participants as a strength of the toolset. This was the third most common benefit noted by the participants. Straightforwardness of the user interface was indicated by 5 of the 33 survey participants as a strength of the toolset. This was the sixth most common benefit noted by the participants. Lastly, integrated data and information, which was the highest-ranking comment from the participants, was noted by 15 of the 33 survey participants as a strength of the toolset.
The 28 comments about these three attributes represent 53% of the strengths that were
indicated by the participants in the informal data gathering process. (Conclusion 41) The
data supports the hypothesis that problem determination is significantly improved by a
toolset that utilizes views to display information from a comprehensive management
information repository of data about the Web-based application.
The fourth hypothesis is also a summary or synthesis of a number of secondary research questions. This hypothesis is focused on availability and performance, which are important dimensions of secondary research questions 4, 7, 9, 10, 11, 12, 15, 17, and 21.
The hypothesis is as follows:
Hypothesis 4 - Availability and performance faults are more easily detected and
corrected using a comprehensive toolset.
Of the five scenarios, Scenarios 1 and 4 are exclusively focused on both availability
and performance and for that reason are the best scenarios to use for testing this
hypothesis (see Table 49). These scenarios ranked high in the survey results, first and third respectively. Scenario 2 was focused on the administration and software distribution
functional perspectives and did not include any availability or performance focus.
Scenario 3 included availability and performance as a focus, but also included change,
configuration, and security. Like Scenario 3, Scenario 5 also included availability and
performance as a focus, but it included other perspectives. The relationship between the
scenarios and the availability and performance perspectives is shown in Table 49 below.
Table 49. Availability and Performance Focus by Scenario
Scenario Availability Performance Other (excluding fault & problem)
1 Yes Yes
2 No No Administration & Software Distribution
3 Yes Yes Change, Configuration & Security
4 Yes Yes
5 Yes Yes Accounting, Administration, Automation, Business, Capacity, Change, Configuration, Operations, Security, Service Level, & Software Distribution
(Conclusion 42) Since the scenarios that were primarily focused on availability and
performance ranked highly, the data supports the hypothesis that availability and
performance faults are more easily detected and corrected using a comprehensive toolset.
Strengths, Weaknesses, and Limitations of the Study

The results from this research increase the focus on the management of applications. In so doing, they help to develop and grow the emerging discipline of applications management by expanding the body of knowledge. The results of this research will foster a change in approach from a narrow focus, like application availability or distribution, to a broader, full life-cycle approach to address the challenges of managing applications. This research, with its broader view, has focused on the connections and relationships between life-cycle phases such as design and operations and functional perspectives like business and service level.
This toolset helps to improve the productivity of application developers, operations
personnel, and management software administrators by providing them with ready-made
toolset components or samples that can easily be adapted for specialized use. Toolset
procedures provide a framework to handle the management of applications. These
procedures can be used to help design more manageable applications when used early in
the application life cycle. During operation and change phases, procedures help maintain
smooth day-to-day operations.
Toolset views provide improved ways to understand the availability, capacity, performance, and service level perspectives of applications management. The schema that were developed defined the collection of data and information that is essential to full life-cycle management of applications. It is expected that
effective use of the data and information that pertains to the management of applications
will enhance activities associated with many phases of the life cycle of a Web application.
It is also expected that application design will improve, since it will include design
points for functional perspectives like availability, change, configuration, and service
level. Operation phase activities will be more effective because applications that have
been designed for manageability will be easier to keep available and performing well for
the users who need them.
Problem resolution will be streamlined because diagnostic capabilities are improved
due to methods explored in the toolset. The application MIR will make it easier to find
data associated with failures by providing the data itself or a reference to the location that
contains the data. Information contained in the MIR will make it easier to answer
questions like--how has the application been performing over the last week, or what is the average number of problem records created per month for this application? This
information will save time for personnel working in performance, capacity, or problem
management roles.
Over the long term, the toolset will save labor by providing models and by reducing the skill level needed to instrument and operate a Web application. It will also result in higher availability of Web applications, more rapid understanding of the impact of component failures, more timely resolution of problems, and a higher success rate for Web application
changes. Finally, the toolset will be the basis for improvements to existing products or a
new product or service offering.
There are some limitations associated with this research. This project resulted in a
prototype and not a production system. The focus of the prototype toolset effort was on
creating a significant number and variety of toolset elements so that the overall system
could be evaluated. Because of this focus, the development tools employed favored ease
of development and not the creation of a robust production-ready solution.
The prototype toolset implementation focused specifically on the management of Web
applications, not applications in general. Because of this, the toolset focused on Web-
specific aspects of monitoring, commands, operations interface, automation, and interface
to management systems like problem and change. The research did not focus on the
management of servers, networks, or hardware. A limitation exists in that the work of this
prototype implementation did not fully integrate data from these other important Web site
components.
Implications

This section of the Final Report includes the implications of the results for the field of study. (Implication 1) The schema for the application MIR could be used as a basis for a
Web application MIR. Many commercial products have data repositories, but few
products work with one another in an integrated manner. The schema from this toolset
design, when combined with data-gathering utilities, could be used by a variety of tools
and products to provide a powerful data repository to support the management of Web
applications.
(Implication 2) Data reduction techniques used in conjunction with the MIR could
provide a useful solution to many of the challenges of working with high-volume Web
files like activity logs. The number of logs and the volume of data they contain make it
difficult to use them. Several toolset components from this study can provide help in
transforming log data into meaningful information stored in a central repository. This
transformation of data would improve the availability and usefulness of these logs.
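As a simple illustration of this kind of data reduction, the sketch below collapses a Web server access log in the common log format into hourly hit counts, a summary compact enough to keep in a central repository. The log file name is an assumption made for the example.

```python
# Sketch of the data-reduction idea in Implication 2: reduce a high-volume
# access log (common log format assumed) to hourly hit counts.
from collections import Counter

hits_per_hour = Counter()
with open("access.log") as log:
    for line in log:
        try:
            # e.g. 10.0.0.1 - - [12/Mar/2002:14:03:55 -0500] "GET / HTTP/1.0" 200 512
            timestamp = line.split("[", 1)[1].split("]", 1)[0]
            hits_per_hour[timestamp[:14]] += 1  # day/mon/year:hour
        except IndexError:
            continue  # skip malformed lines

for hour, count in sorted(hits_per_hour.items()):
    print(hour, count)
```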
(Implication 3) Toolset prototype views, and the tools used to create and maintain
them, are useful to researchers who are working with user interfaces and the difficulties of
displaying large numbers of related components. Although this project was not focused on
the human-computer interface of a management system for Web applications, it did
reinforce the importance of surveying the users of a prototype before it is used as the basis for a fully developed system. The views developed for this prototype toolset contained usability problems that would need to be addressed before the toolset could be used as the basis for a comprehensive system for the management of Web applications.
(Implication 4) The exploration of the relationship between procedures and programs
is helpful to researchers working to understand and improve approaches to documenting
activities and maintaining accuracy between task-oriented procedures, their related
programs, and the ability of humans to back up automated procedures when needed. The
toolset that was designed for the full life-cycle management of Web applications made use
of automation whenever possible yet included procedures to keep the human operator
informed and knowledgeable about the management activities.
(Implication 5) The toolset evaluation in this study is useful to researchers who focus on techniques that improve the effectiveness of systems developed for human use. The evaluation reinforces the benefits of presenting a prototype to the participants who stand to benefit from the use of the new system.
Recommendations

This section includes recommendations for future research and for changes in
academic and professional practice. Additional research has already begun to flow from
this dissertation. As discussed in Chapter 3, Projected Outcomes, two papers were
published in 2001 on topics explored in this research. Please see Gulla and Hankins
(2001) and Gulla and Siebert (2001). Two more papers were published in 2002 on ideas
developed while completing this research. Please see Ahrens, Birkner, Gulla, and McKay
(2002) and Gulla and Hankins (2002). The research topics discussed below are proposed
as a continuation of this initial research activity.
The impact of visualization tools on application management: Does seeing the application-management data really help?

(Recommendation 1) For this research topic, a study would be developed that would
focus on making application-management views available to key deployment, operation,
or change activity personnel and measuring the effect of the views on the efficiency and
effectiveness of the personnel. It is expected that use of application support views would
reduce problems or lessen their impact. It is also expected that the use of application
support views would improve the success rate of changes.
The application MIR as a real-time repository: Challenges of providing a high-performance facility for applications management for users and programs

(Recommendation 2) For this research topic, a study would be conducted that would
build upon this dissertation research by making the application MIR available in a high-
performance environment by leveraging a data-in-virtual tool like RODM (Finkel and
Calo, 1992) or an RDBMS with significant high-performance facilities. With this
environment in place, the researcher would test various situations where a real-time MIR
was needed and document the results of the efforts including lessons learned. It is
anticipated that making this data available in a high performance environment would
make the data useful in a broader number of circumstances.
Procedures and the programs that support them: Ideas on how to better integrate manual and automated technologies and the people who use them

(Recommendation 3) For this research topic, a study would be conducted that would
build upon the dissertation project by exploring automated ways to combine written
procedures with the programs that support them. For example, a tool might be created that
automatically updates a procedure when the program it uses is changed. Another example is a tool that creates a manual backup procedure and automatically schedules a test by a human for an automated program-based task.
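A very small sketch suggests how such a tool might detect staleness: compare the modification times of a procedure document and the program that supports it. The pairing of procedures to programs shown here is an invented example.

```python
# Hedged sketch for Recommendation 3: flag procedures whose supporting
# program changed after the procedure was last updated. The file pairing
# is hypothetical.
import os

PAIRS = [("procedures/restart_app.html", "programs/restart_app.sh")]

for procedure, program in PAIRS:
    if os.path.getmtime(program) > os.path.getmtime(procedure):
        print(f"{procedure} may be stale: {program} changed more recently")
```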
Marketplace view: The emerging disciplines for the management of applications

(Recommendation 4) For this research topic, an investigation would be conducted and a report written that would examine the product marketplace, derive a set of new and emerging application-management disciplines, and discuss the key technologies used by the most interesting products in these emerging areas.
The role of application management in the automated recovery of Web failures

(Recommendation 5) For this research topic, a study would be conducted that would design, implement, and test automated application recovery scenarios for a variety of Web failures. Toolset work products could be leveraged to define a common approach to handling these failures, including automated actions like recovery and notification.
(Recommendation 6) Regarding changes in academic and professional practice,
researchers are encouraged to explore the management of applications using a full life-
cycle approach as this has the greatest potential to have an impact on the manageability of
Web applications. The full life-cycle approach is a new area of focus. (Recommendation
7) Researchers are also encouraged to continue to explore approaches that are focused on
disciplines or functional perspectives as this also holds great promise to more completely
address the challenges of managing Web applications.
This research supports the idea that administering a survey at the completion of a
development project can yield important and useful data about prototypes of production
systems. (Recommendation 8) Developers of application-management solutions are
encouraged to create prototypes and to measure their usefulness with the users of those
solutions. (Recommendation 9) Application developers should be encouraged to consider
the manageability of the Web application during the design process. Many of the
procedures developed as part of the toolset were focused on helping Web application
developers design manageability into their applications.
(Recommendation 10) Designers and developers should do a more effective job of
collecting and providing fault data to operators and administrators when Web applications
experience significant failures. A small investment in the design of the Web application
can have an important impact on the availability of the application after it is deployed.
(Recommendation 11) Management software companies should consider a full range
of management perspectives when developing products, including accounting,
administration, automation, availability, business, capacity, change, configuration, fault,
operations, performance, problem, security, service level, and software distribution. This
scope will improve the effectiveness of products through increased depth and
functionality. (Recommendation 12) They should also build functionality into their
products that addresses the challenges associated with the management of an application during all the phases of the application's life cycle, including design, construction, deployment, operation, and change. Consideration should be given to other life-cycle phases
besides operation, which has historically received the most focus.
Summary

In this research, the author completed a study that incorporated a broad range of
activities including design, development, testing, and evaluation. The design was
completed using a JAD approach that leveraged the skills of a group of individuals
experienced in the management and support of Web applications. Appendix F contains the
materials that were used in the first of two JAD sessions. After the JAD sessions, a
comprehensive design was completed that consisted of 19 subsystems. The subsystems
were supported by 43 procedures, 78 programs, 25 views, and a database that contained
59 tables.
The scope of the design included the accounting, administration, automation,
availability, business, capacity, change, configuration, fault, operations, performance,
problem, security, service level, and software distribution functional perspectives. The
scope of the design also embraced the application life cycle including design,
construction, deployment, operation, and change. This design was summarized in Chapter
4. Tables 13 through 32 contain the comprehensive list of procedures, programs, views,
and database tables that make up the 19 subsystems.
A RAD segment strategy was developed and documented and used as a framework to
translate the comprehensive design into a prototype containing a representative subset of
the total functionality of the toolset. The segment strategy was used to group the toolset
components into scenarios that would make them easier to develop and evaluate. The
segment strategy is contained in Chapter 4. Using the segment strategy, the researcher
developed the prototype toolset using an HTML generator for the toolset views and
procedures. The graphical interface that was developed consisted of 12 independent Web
pages and 37 framesets comprising 74 frames. Figure 13 contains an example of the
layout of a typical Web page.
A relational database was used for the MIR, and a subset of the database tables was created in support of the prototype toolset. Appendix H contains the data dictionary that
was used for the development of the prototype toolset. The toolset views were collected
into a document that explained the scenarios in detail, and this package was given to 33
survey participants. The complete toolset scenarios can be found in Appendix E. The
survey participants answered five questions for each scenario and supplied written
comments on the strengths and weaknesses of the toolset. Appendix B contains the toolset
evaluation survey questions.
The survey participants also supplied demographic information about themselves like
years of IT experience and their current job responsibility. A summary of this profile
information can be found in Table 35. The data collected was analyzed to determine if the
toolset was effective and to see if the data supported the hypotheses. Two approaches
were used to analyze the data. The approaches were the scenario-by-scenario approach
and the question-by-question approach.
For the scenario-by-scenario approach, the data revealed that the simpler scenarios
were more successful than the larger and more complex scenarios. The data for the
scenario-by-scenario approach can be found in Tables 36 through 40. For the question-by-
question approach, the data revealed that the participants thought that the toolset was
sophisticated and that it met the requirements for the management of Web applications. The
data for the question-by-question approach can be found in Tables 42 and 43. However,
the toolset appears to have fallen short regarding its potential impact on the organization
that it was intended to support.
The toolset research explored three primary research questions and 20 secondary
research questions. The complete discussion for the research questions can be found in
this chapter. The research also examined four hypotheses. Regarding the hypotheses, the
data from the study supported hypotheses 1, 3, and 4, but was inconclusive regarding
hypothesis 2. The author has published four papers relating to this research and there are a
number of opportunities for additional papers and studies. It is expected that this
dissertation will have an impact on the management of applications because it has
explored and demonstrated the usefulness of a full life-cycle approach to the management
of applications. It also examined the association of 15 functional perspectives within the
context of the application life cycle while leveraging a toolset that consisted of
procedures, programs, views, schema, and data.
Appendix A
Functional Perspectives Analysis Tables

A functional perspective list was developed by this researcher and is used as the basis for the information contained in Chapter 2 of this document. Numeric analysis was done using the 85 functional-perspective observations gathered from 23 sources. The sources
included 4 standards organizations; 6 groups of researchers, research and consulting
organizations, and vendors; and a survey of 13 sample products. Tables that support the
selection of the specific functional perspectives selected for this study are included below.
An "X" in the column across from the functional perspective label indicates that the
standards organization, research and consulting group, vendor, or product supports the
functional perspective. In some cases, the researcher attributed the functional perspective
based on a review of the available documentation. Some of the groups do not express
their efforts in the context of a discipline, task, process, domain, service, or system. Table
50 indicates which standards organizations support the 15 functional perspectives that are
the focus of this project.
Table 50. Standards Organizations and Support for 15 Functional Perspectives
Functional perspective ISO IETF DMTF POSIX
Accounting X
Administration X X
Automation
Availability X X
Business
Capacity X
Change X
Configuration X X X
Fault X
Operations
Performance X X
Problem
Security X
Service level
Software distribution X

Table 51 indicates which researchers, research and consulting organizations, and vendors support the 15 functional perspectives that are the focus of this project.
Table 51. Researchers, Research and Consulting Organizations, and Vendors and Support for 15 Functional Perspectives
Functional perspective Merit SysView Tivoli ISMA ITIL DCE
Accounting X
Administration X
Automation
Availability X X
Business X X
Capacity X X
Change X X
Configuration X X X
Fault X
Operations X X
Performance X X
Problem X X X
Security X X X
Service level X X X
Software distribution X

Table 52 indicates which software products support the 15 functional perspectives that are the focus of this project. Table 52 contains information on the first 6 of a total of 13 products from a sample of management products.
Table 52. Systems Management Products and Support for 15 Functional Perspectives (first six)

Functional perspective Resonate Central Dispatch IBM Client Response Time WebManage Content Mover Tivoli Distributed Monitoring WebManage Enterprise Reporter WebManage Inter Scan
Accounting
Administration
Automation X X X
Availability X X
Business
Capacity
Change
Configuration
Fault
Operations X
Performance X X
Problem
Security
Service level X X
Software distribution X
Table 53 indicates which software products support the 15 functional perspectives that
are the focus of this project. Table 53 contains information on the final seven of a total of
13 products from a sample of management products.
Table 53. Systems Management Products and Support for 15 Functional Perspectives (last seven)

Functional perspective Trend Micro IS Web Manager Keynote Perspective BMC Patrol IBM PCPMM WebManage Service Level Report Platform SiteAssure IBM Server Resource Management
Accounting
Administration
Automation X
Availability X X X
Business
Capacity X X
Change
Configuration
Fault X
Operations
Performance X X X X
Problem
Security X
Service level X X
Software distribution
Appendix B
Toolset Evaluation Survey
This five-question survey was administered to participants who were familiar with the
development, administration, deployment, and operations of Web applications. The
questions were answered after a storyboard of a scenario was shown to the participant.
The scenarios are based on the research questions in Chapter 1. A survey was
administered for each of the five scenarios that follow. These scenarios are explained in
detail in Chapter 3, Methodology.
1. Web application operational fault
2. Web application deployment is unsuccessful
3. Web application change results in poor performance
4. Web application experiencing bottlenecks as some queries take a long time
5. Overall response for the Web application is slow, but the application is still functional
Instructions
For each scenario, please record the scenario number at the top of the page and check
the box next to the choice that best answers the question. Please read the wording of the
choices carefully as they are different from question to question. Please answer all five
questions for each scenario.
Survey

Scenario number ____
1. Which best characterizes how easy it was to understand how the toolset handles this
scenario?
_ A lot of effort to understand
_ A moderate amount of effort to understand
_ A minimum effort to understand
2. Which best characterizes the level of sophistication of the toolset in the way it handled
this scenario?
_ Low
_ Sufficient
_ High
3. Which best characterizes how well the toolset met the requirements of handling this
scenario?
_ Partially fulfills requirements
_ Meets requirements
_ Completely fulfills requirements
4. Which best characterizes how usable the toolset was when handling this scenario?
_ Not easy to understand
_ Easy to understand, but there are some usability concerns
_ User friendly and efficient to use
5. Which best characterizes the impact that the toolset might have on the organization
because of the way it handled this scenario?
_ No major impact on the users and their productivity
_ Will have an impact, but improvements are needed
_ Will have a major impact
Appendix C
Institutional Review Board Documents
Three forms are included in this appendix--Submission, Research Protocol, and Consent. These forms were submitted to and approved by the Nova Southeastern University Institutional Review Board representative for the SCIS. The electronic letter of approval is also included in this appendix.

Submission Form
Institutional Review Board for Research with Human Subjects (IRB)
Submission Form

To be completed by IRB/Center/College Representative:
Date Received _______ Center/College ___________________________________
Representative ________________________________________________________
*Protocol Number _______________________________________________________
*(To be assigned by the Office of Grants & Contracts)
Protocol Qualifies for: Full Review____ Expedited Review____ Exemption____

Instructions: In order to comply with federal regulations as well as to conform with guidelines of the University's Institutional Review Board (IRB), the principal investigator is required to complete all of the following items contained in the Submission Form and the IRB Protocol. Upon completion of all information, the principal investigator must submit the original Submission Form and one copy of the IRB Protocol, including all consent forms and research instruments (questionnaires, interviews, etc.) to the appropriate IRB College/Center Representative for review and action. Once reviewed and signed off by the Center Representative, the principal investigator is responsible for submitting the original Submission Form along with 22 copies of the Submission Form, IRB Protocol, and consent forms to the Office of Grants and Contracts. In addition, one copy of all research instruments (questionnaires, interviews, etc.) must be submitted to the Office of Grants and Contracts. The completed package must be received by the Office of Grants and Contracts by the last business day of the month prior to the next scheduled IRB meeting. The Office of Grants and Contracts' Web site should be consulted for IRB meeting dates. Incomplete forms may delay review by the IRB. For further information, refer to the Policy and Procedure Manual for Research with Human Subjects.

I. General Information
A. Project Title: This study is the evaluation activity for the dissertation titled Design and Implementation of a Prototype Toolset for Full Life-Cycle Management of Web-Based Applications
New __X__ Continuation/Renewal _____ Revision _____
Proposed Start Date: November 15, 2001
Proposed Duration of Research: One month
Performance Site(s): IBM Web-Hosting Facility in Research Triangle Park, North Carolina
B. Principal Investigator: Joseph Gulla
Faculty _____ Staff _____ Student __X__
Center/College/Department: PhD Candidate, Graduate School of Computer and Information Sciences
Home Mailing Address: 201 Orchard Lane
City: Carrboro State: NC Zip: 27510
Home Phone Number: (919) 968-6101 Office Phone Number: (919) 254-4683
Co-Investigator(s): Under the guidance of Dissertation Chair John A. Scigliano, Professor, School of Computer and Information Sciences
Principal Investigator's Signature: (Signed electronically by) Joseph Gulla Date: 09/19/2001

II. Funding Information
If this protocol is part of an application to an outside agency, please provide:
A. Source of Funding: N/A
B. Project Title (if different from above): ___________________
C. Principal Investigator (if different from above): __________
D. Type of Application:
E. Grant _____ Subcontract _____ Contract _____ Fellowship _____
F. Date of Submission: ______________________________________

III. Cooperative Research
Cooperative research projects are those that involve more than one institution and can be designed to be both multi-site and multi-protocol in nature. Each participating institution is responsible for safeguarding the rights and welfare of human subjects and for complying with all regulations. If this proposal has been submitted to another Institutional Review Board please provide:
Name of Institution: N/A
Date of Review ___________ Contact Person __________________
IRB Recommendation __________________________________________

IV. Subject/Patient Information
A. Types of Subjects/Patients (check all that apply)
Fetus in Utero/non-viable fetuses/abortuses
Newborns/Infants
Children (aged 2-12)
Adolescents (aged 13-18)
X Adults (over 18)
Pregnant Women
Special populations (e.g., prisoners, mentally disabled) Specify ____________
B. Other (Check all that apply)
Use of investigational drugs or devices
Information to be collected may require special sensitivity (e.g., substance abuse, sexual behavior)
C. Number of Subjects/Patients: 40
D. Approximate time commitment for each subject/patient: 60 minutes
E. Compensation to subjects/patients: Yes _____ No __X__
F. Form (e.g., cash, taxi fare, meals) _____ Amount _____

V. Continuation or Renewals
A. Attach a copy of the original IRB protocol
B. Indicate all proposed changes in the IRB protocol affecting subjects
C. Progress Report
* Indicate the number of subjects entered in the study, including their group status, whether they are active or completed, the number of subjects still pending, and the time frame of subject participation.
* Indicate adverse or unexpected reactions or side effects that have occurred or are expected. If none, state none.
* Summarize the results of the investigation to date (in terms of subjects entered, in process, completed, and pending).
D. Attach consent form(s) to be used and indicate if any changes have been made.
Research Protocol Form
Institutional Review Board for Research with Human Subjects (IRB)
Research Protocol
Description of Study Purpose and Potential Benefits: The purpose of this study is to gather data about the understandability, technology, and environmental characteristics of a prototype toolset for the full life-cycle management of Web applications (Boloix and Robillard, 1995). This study is the evaluation activity for the dissertation titled Design and Implementation of a Prototype Toolset for Full Life-Cycle Management of Web-Based Applications. There are a number of potential benefits of this study. The results of this dissertation and study are expected to increase the focus on the management of applications. In so doing, it will help to develop and grow the emerging discipline of applications management by expanding the body of knowledge. This research will foster a change in approach from a narrow focus, like application availability or distribution, to a broader, full life-cycle approach to address the challenges of managing applications. Finally, this research and study, with its broader view, will focus more fully on the connections and relationships between life-cycle phases such as design and operations and functional perspectives like business and service level.

Location of Study: The study will be conducted in Research Triangle Park, North Carolina, at the IBM Web-Hosting Center.

Dates of Study: The study will be conducted March 1 through March 29, 2002.

Subjects: The subjects in this study will be recruited from the IBM Web Hosting community. Subjects will be drawn from the groups that perform system administration of servers, middleware and database support, and account management. Forty subjects will be asked to complete the five-question survey for each of five different operational scenarios.

Methods and Procedures: The subjects in this study will be asked to complete a survey consisting of five questions. These questions were adapted from an article titled A Software System Evaluation Framework (Boloix and Robillard, 1995). The questions will be answered immediately after the participant reviews a multiple-page storyboard presentation that shows how the toolset prototype handles a specific operational scenario. The participant will be asked to review five different scenarios and complete the survey for each scenario, for a total of twenty-five questions. The specific steps are as follows:
1. View Web application operational fault storyboard then complete the survey for this scenario
2. View Web application deployment is unsuccessful storyboard then complete the survey for this scenario
3. View Web application change results in poor performance storyboard then complete the survey for this scenario
4. View Web application availability is limited (some functions not working) storyboard then complete the survey for this scenario
5. View Web application experiencing bottlenecks (some queries take a long time) storyboard then complete the survey for this scenario

The scenarios and survey questions will be available electronically or in paper form to support the preferences of the participant. Participants who are part of the Web-hosting community, but in locations other than Research Triangle Park, North Carolina, will also be invited to participate using electronic or paper form.

Participant Payments or Costs: There will be no cost to the participant to participate in the survey. Participants will not receive payment to complete the survey; however, they will receive a token of appreciation for participating in the survey.

Subject Confidentiality: Confidentiality will be maintained by the numeric coding and stripping of identifying information from all data. All subjects will be assigned ID numbers which will be used in place of names on all assessment materials. The list linking the ID numbers and names will be maintained in locked and secured files by the primary investigator. Additionally, all data will be stored in locked file drawers at each stage of data transfer. Moreover, all data obtained will be accessible only to the researcher, and no subject will be identified in any report of the project.

Potential Risks to Subjects: The likelihood of loss of confidentiality and privacy to the subjects is rare. Techniques to minimize this risk are explained in Subject Confidentiality above.

Risk/Benefit Ratio (if required for funded project): This project is not funded.

Informed Consent: Subjects will be shown an informed consent form. The form is included in this package.

Reference: Boloix, G. & Robillard, P. (1995). A software system evaluation framework. Computer, 28(12), 17-26.
Consent Form
Informed Consent Form for a Study Supporting the Dissertation Titled
Design and Implementation of a Prototype Toolset for Full Life-Cycle Management of Web-Based Applications
Funding Source: Not a funded project as study is in support of PhD dissertation
IRB approval # _________________
Joseph Gulla
201 Orchard Lane
Carrboro, NC 27510

Institutional Review Board, Office of Grants and Contracts, Nova Southeastern University: (954) 262-5369

Description of the Study: The purpose of this study is to gather data about the understandability, technology, and environmental characteristics of a prototype toolset for the full life-cycle management of Web applications. This study is the evaluation activity for the dissertation titled Design and Implementation of a Prototype Toolset for Full Life-Cycle Management of Web-Based Applications. There are a number of potential benefits of this study. The results of this dissertation and study are expected to increase the focus on the management of applications. In so doing, it will help to develop and grow the emerging discipline of applications management by expanding the body of knowledge. This research will foster a change in approach from a narrow focus, like application availability or distribution, to a broader, full life-cycle approach to address the challenges of managing applications. Finally, this research and study, with its broader view, will focus more fully on the connections and relationships between life-cycle phases such as design and operations and functional perspectives like business and service level.

Costs and Payments to the Participant: There is no cost for participation in this study. The participant will, however, receive a token of appreciation for participating in the study. There is no penalty for withdrawal from the study.

Risks/Benefits to the Participant: There are no risks involved with your participation in the study. You will be completing a survey with five questions after you review each of five Web application operational scenarios. The main benefit to be derived from your involvement is the satisfaction that might come from helping a student complete the assessment aspects of his research. This research and study may influence future IBM products, but there is no guarantee that the prototype toolset developed for this study will result in an IBM product, offering, or service.
Confidentiality: Information obtained in this study is strictly confidential. You will be assigned a study number, and this number, rather than your name, will be recorded on the various assessments you receive. Only Joseph Gulla will have a record of which person has been assigned what number, and this information will be secured in a locked filing cabinet in his office. Your name will not be used in the reporting of information in publications or conference presentations. Your anonymity and confidentiality will be protected.

Participant's Right to Withdraw from the Study: You may choose not to participate or to stop participation in the research program at any time without penalty. If you choose not to participate, the information collected about you will be destroyed.

Voluntary Consent by Participant: Participation in this research project is voluntary, and your consent is required before you can participate in the research program. I have read the preceding consent form, or it has been read to me, and I fully understand the contents of this document and voluntarily consent to participate. All of my questions concerning the research have been answered. I hereby agree to participate in this research study. If I have any questions in the future about this study, Joseph Gulla, who can be reached at (919) 413-3274, will answer them. A copy of this form has been given to me.

Participant's Signature: __________________________ Date: __________________
Witness's Signature: _____________________________ Date: __________________
Electronic Letter of Approval

"James Cannady" <j.cannady@computer.org>
09/24/2001 09:43 AM
Please respond to j.cannady
To: Joseph Gulla/Raleigh/IBM@IBMUS
cc: "John Scigliano" <scigl@nova.edu>
Subject: IRB Documentation

Joe,

After reviewing your revised IRB Submission Form, Research Protocol, and the additional documentation that you submitted I have approved your proposed research for IRB purposes. Your research has been determined to be exempt from further IRB review based on the following conclusion: Research using survey procedures or interview procedures where subjects' identities are thoroughly protected and their answers do not subject them to criminal and civil liability.

Please note that while your research has been approved, additional IRB reviews of your research will be required if any of the following circumstances occur:
1. If you, during the course of conducting your research, revise the research protocol (e.g., making changes to the informed consent form, survey instruments used, or number and nature of subjects).
2. If the portion of your research involving human subjects exceeds 12 months in duration.

Please feel free to contact me in the future if you have any questions regarding my evaluation of your research or the IRB process.

Dr. James Cannady
Assistant Professor/IRB Representative
School of Computer and Information Sciences
Nova Southeastern University
Appendix D
Tivoli Management Applications
This appendix contains a comprehensive list of applications that are part of the Tivoli
management suite. This appendix is referenced in Chapter 2 of this report.
- Tivoli Application Performance Management
- Tivoli Applications Management Suite
- Tivoli Asset Management
- Tivoli Availability Management Suite
- Tivoli Cable Data Services Manager
- Tivoli Change Management
- Tivoli Change Management Suite
- Tivoli Cross-Site™ for Availability
- Tivoli Cross-Site for Deployment
- Tivoli Cross-Site for Security
- Tivoli Data Protection for Workgroups
- Tivoli Database Management
- Tivoli Device Manager for Palm™ Computing Platform
- Tivoli Decision Support for OS/390® (formerly Tivoli Performance Reporter for
OS/390) Accounting Feature
- Tivoli Decision Support for OS/390 AS/400® System Performance Feature
- Tivoli Decision Support for OS/390 CICS® Performance Feature
- Tivoli Decision Support for OS/390 Distributed System Feature
- Tivoli Decision Support for OS/390 IMS Performance Feature
- Tivoli Decision Support for OS/390 Network Performance Feature
- Tivoli Decision Support for OS/390 Performance Reporter Base
- Tivoli Decision Support for OS/390 System Performance Feature
- Tivoli Distributed Monitoring
- Tivoli Distributed Monitoring for Windows NT®/2000
- Tivoli Enterprise Console
- Tivoli Global Enterprise Manager
- Tivoli Inventory
- Tivoli IT Director
- Tivoli Manager for BEA Tuxedo
- Tivoli Manager for CATIA
- Tivoli Manager for Domino
- Tivoli Manager for Domino - IT Director Edition
- Tivoli Manager for MCIS
- Tivoli Manager for Microsoft® Exchange
- Tivoli Manager for Microsoft Exchange - IT Director Edition
- Tivoli Manager for Microsoft SQL Server - IT Director Edition
- Tivoli Manager for MQSeries®
- Tivoli Manager for Network Connectivity
- Tivoli Manager for OS/390®
- Tivoli Manager for PeopleSoft
- Tivoli Manager for Retail
- Tivoli Manager for R/3
- Tivoli Manager for SuiteSpot
- Tivoli Manager for Network Hardware
- Tivoli NetView®
- Tivoli NetView - IT Director Edition
- Tivoli NetView for OS/390
- Tivoli NetView Performance Monitor (NPM)
- Tivoli Operations Planning and Control
- Tivoli Output Manager
- Tivoli Problem Management
- Tivoli Remote Control
- Tivoli SANergy™ File Sharing
- Tivoli SecureWay® Global Sign-On
- Tivoli SecureWay Policy Director
- Tivoli SecureWay Risk Manager
- Tivoli SecureWay Security Manager
- Tivoli SecureWay User Administration
- Tivoli Security Management Suite
- Tivoli Service Desk
- Tivoli Service Desk for OS/390
- Tivoli Software Distribution
- Tivoli Storage Manager
- Tivoli Workload Scheduler for Baan
- Tivoli Workload Scheduler for Oracle
- Tivoli Workload Scheduler for R/3 (Tivoli Product Index, 2001)
Appendix E
Survey Materials Used for the Toolset Evaluation

This appendix contains the snapshot material that was used for the toolset evaluation.
This material is ordered by scenario. Each scenario was given to the participants as part of
a package that contained a cover letter, an informed consent form, survey questions, and
the scenario materials in two parts. The first part contained narrative and screen captures
of the toolset views. This part was used to help the participant to understand the "big
picture" of each scenario. The second part of the scenario package contained a print out of
the right side of each frameset. The right side contained more detail than could be seen in
a screen capture and was easier to read. In this appendix, only the material for the first part is included.
Scenario 1 - Web Application Operational Fault
This Web page is the starting point for all the scenarios to be evaluated. Scenario 1,
Web application operational fault, is the first link on the Web page below.
The General Ledger application has been instrumented to generate faults when it
detects a significant problem. In this scenario, a database fault (SQL error) is experienced
by the application and a fault is created to help the database administrator to diagnose and
fix the problem.
This Procedure Page Guides the Administrator's Actions
This page is taken from the toolset procedures. It guides the person handling the
problem through the steps to take to manage the fault that has been generated by the
application.
This procedure outlines the main steps and views used to handle the fault:

1. The Specific Fault view is used to get a snapshot of information about the fault.
2. The Detailed Data view contains a description, action, recommendation, and contact information for this specific fault.
3. Vendor Recommended Actions (not explored here) can be used to take recovery actions.
4. The Resource Modeling view is used to see if any disk, memory, processor, or I/O exceptions have been reported.
5. The Administrator Action view is used to record actions and transfer the fault to the problem-management system.
Step 1 - Examine the Specific Fault View
The frame on the right side of this Web page contains primary and additional fault
information. The information was gathered by a subsystem called Specific Fault. The
information is designed to make it clear to individuals how to handle the fault. The fault
itself determines the group that should handle it, for example, Database Administration.
The primary fault information is the information that is usually part of a Tivoli Enterprise Console (T/EC) event. The additional information was gathered by the Smart Fault
subsystem and includes detailed information to help the database administrator diagnose
the root cause of the fault. This information may also be useful if the root cause of the
fault is a defect in the application or database software.
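The two-part structure of the fault information can be made concrete with a short sketch. The fragment below is a minimal illustration, not the toolset's actual schema: it models a fault record with primary T/EC-style fields plus a free-form section for the additional diagnostic data. All field names and sample values are assumptions.

```python
# Minimal sketch of a fault record: primary T/EC-style fields plus the
# additional diagnostic detail gathered for the database administrator.
# All field names and sample values are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class FaultRecord:
    fault_id: str        # identifier assigned when the fault is created
    application: str     # e.g., "General Ledger"
    source: str          # component that raised the fault
    severity: str        # e.g., "CRITICAL"
    message: str         # primary fault text (the SQL error)
    assigned_group: str  # group the fault routes itself to
    additional: dict = field(default_factory=dict)  # diagnostic detail

fault = FaultRecord(
    fault_id="F-0001",
    application="General Ledger",
    source="DB2",
    severity="CRITICAL",
    message="SQL error detected by application instrumentation",
    assigned_group="Database Administration",
    additional={"sqlcode": -968, "host": "glprod01", "tablespace": "GL_DATA"},
)
print(f"Route fault {fault.fault_id} to {fault.assigned_group}")
```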
Step 2 - Examine the Detailed Data for this Fault
The frame on the right side of this Web page contains detailed information about the
fault like cause description and the technical actions to be taken. This data is from a
subsystem called Detailed Data. The Smart Fault Generation and Detailed Data
subsystems were designed to work together. Every fault has associated detail data.
Step 3 - Vendor Recommended Actions
The handling of recommended actions is not shown here, as the person handling the
fault at this time is not a database subject matter expert. When the database administrator
takes over the resolution of the fault, they will use the specific action (above) as a starting
point for handling the problem.
Step 4 - Site-Specific Actions Require the use of the Resource Modeling View
The frame on the right side of this Web page displays information about exceptions
that have been recorded for the servers that make up the General Ledger application. This
information was gathered by a subsystem called Resource Modeling.
This view was used because the SQL error involved a problem with system resources.
The resource modeling view displays exceptions that were detected from the production
General Ledger system. These exceptions were recorded when the production system did
not operate in the way that the developer modeled it to run. Put another way, its operation
was inconsistent with its operational model.
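A small sketch can illustrate this exception mechanism. Assuming the operational model is expressed as an acceptable range per metric (an assumption for illustration; the toolset's real model format is not shown here), an exception is recorded whenever an observed value falls outside its modeled range:

```python
# Illustrative resource-modeling check: compare observed metrics against
# the ranges in the developer's operational model and record exceptions.
# Metric names, ranges, and observations are assumed for the example.
modeled_ranges = {
    "cpu_busy_pct": (0, 80),
    "memory_used_pct": (0, 90),
    "free_disk_mb": (500, float("inf")),
}

observed = {"cpu_busy_pct": 97, "memory_used_pct": 71, "free_disk_mb": 120}

exceptions = [
    (metric, value, modeled_ranges[metric])
    for metric, value in observed.items()
    if not modeled_ranges[metric][0] <= value <= modeled_ranges[metric][1]
]

for metric, value, bounds in exceptions:
    print(f"Exception: {metric}={value} outside modeled range {bounds}")
```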
Step 5 - Make the Required Updates Then Transfer Fault Data to the Problem-
Management System
The frame on the right side of this Web page is used to update information about the
fault and to transfer the data to the problem-management system.
Since the person handling the fault at this time was the primary customer-care person
and not the database administrator for General Ledger, their responsibility was to record the
information found and any actions taken and transfer the fault to the problem-management
system as quickly as possible. The database administrator for the application was also
paged for a quick response to the problem.
Confirmation Web page
This Web page indicates that the fault data has been successfully transferred to the
problem-management system. The database administrator for General Ledger will use the
problem-management system to close the problem after the problem that caused the fault
is resolved.
This is the end of Scenario 1. Please review the detailed views that follow and complete
the first survey. The detailed views are printed versions of the right-side frames of all the
views used in this scenario. The left-side frames are used only for navigation and contain
links to additional reports and views.
Scenario 2 - Web Application Deployment is Unsuccessful
This Web page is the starting point for all the scenarios to be evaluated. Scenario 2,
Web Application Deployment is Unsuccessful, is the second link on the page below.
The HR Benefits application is tested and ready to be installed, configured, and
deployed to the verification domain. In this scenario, a problem is detected during the
deployment of the application. The specific fault and detailed data views are used to
understand the fault and take the specified actions. The fault is transferred to the problem-
management system as a closed problem.
This Procedure Page Guides the Administrator's Actions
This page is taken from the toolset procedures. It guides the person handling the problem
through the steps to take to monitor the installation, configuration, and deployment of the
application. It is during the deployment that a problem occurs that the administrator needs
to handle.
This procedure outlines the main steps and views to be used to deploy the application
and handle any problems that may arise. To deploy the application, the authorized
installation, automated configuration, and deployment monitoring action views are used.
To handle the deployment failure, the specific fault, detailed data, and administrator
action views are used.
Step 1 - Check on the Status of the Automated Installation
The frame on the right side of this Web page contains information on the status of the
automated installation that was run by the development group. The installation of HR
Benefits on the target systems was a success.
Automated installation and automated configuration are part of the same subsystem.
The convention is to install the application, configure that application, and then deploy it.
The steps were divided in this way to provide processing flexibility, for example, automated installation with manual configuration, or manual installation with automated
configuration. In all cases, the deployment to build the application domain (test,
verification, and production) is done using automated processes.
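The install-configure-deploy convention might be sketched as follows. This is a simplified illustration rather than the toolset's implementation: the function names are hypothetical, the install and configure steps are interchangeable between automated and manual variants, and the deployment step is always automated.

```python
# Hedged sketch of the install/configure/deploy convention. Any mix of
# automated and manual install/configure steps is allowed, but the
# deployment that builds the application domain is always automated.
def automated_install(app):
    print(f"[auto] installing {app}")
    return True

def manual_install(app):
    print(f"[manual] operator installs {app}")
    return True

def automated_config(app):
    print(f"[auto] configuring {app}")
    return True

def automated_deploy(app, domain):
    print(f"[auto] deploying {app} to the {domain} domain")
    return True

def run_pipeline(app, domain, install_step, config_step):
    # Deploy only after install and configure both succeed.
    if install_step(app) and config_step(app):
        return automated_deploy(app, domain)
    return False

run_pipeline("HR Benefits", "verification", automated_install, automated_config)
```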
Step 2 - Check on the Status of the Automated Configuration
The frame on the right side of this Web page contains detailed information about the
status of the configuration of the HR Benefits application. Like the installation, the
configuration actions were successful.
From this view, it is possible to browse the configuration definitions. Also, this view
can be used to browse the logfile that was built during the automated configuration
actions. Taking a detailed look at the definitions and the logfile can give the administrator
a deeper understanding of what actions took place during the configuration step.
Step 3 - Start the Deployment of the Application to the Target Systems
This view is used to start the deployment of the application to the target domain. The
administrator initiates this action as the installation and configuration activities were a
success. The administrator selects the action (start), the domain to deploy into
(verification), and the options to use during the deployment.
This view can also be used to restart or stop a deployment. The list of domains can vary
depending on the application configuration. The options selected influence the creation of
faults and the detail level of the logging. The simulate option can be used to test the
deployment without actually performing the deployment. This is useful to determine if
there is sufficient space on the target domain before actually attempting a deployment.
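As an illustration of the simulate option's space check, the sketch below uses the Python standard library to compare the free space at a target path against a hypothetical package size; the path, size, and function name are assumptions, not the toolset's actual code.

```python
# Illustrative simulate-mode check: verify free space at the target
# before attempting a real deployment. shutil.disk_usage reports actual
# filesystem statistics; the package size and path are placeholders.
import shutil

def simulate_deployment(package_size_bytes, target_path="/tmp"):
    usage = shutil.disk_usage(target_path)
    if usage.free >= package_size_bytes:
        print(f"Simulation OK: {usage.free} bytes free at {target_path}")
        return True
    print(f"Simulation failed: need {package_size_bytes} bytes, "
          f"only {usage.free} free at {target_path}")
    return False

simulate_deployment(250 * 1024 * 1024)  # a hypothetical 250 MB application
```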
Step 4 - View the Confirmation Message and Continue
This view indicates the results of the previous actions to start deployment monitoring
for the verification domain.
The actions were successful so the instruction is to close this view and continue
processing using the Deployment Monitoring View. It is on this view that the status of the
deployment can be determined.
Step 5 - Monitor the Deployment to the Verification Domain
The frame on the right side of this Web page displays information about the
deployment of the HR Benefits application. The information indicates that the deployment
was unsuccessful. Specifically, there was a copy problem building the application on
server hrveras002.
This view also gives other important information including the name of the target
system (there can be a variety of different target systems like test or production) and
details on the last operation that was completed for the deployment.
Step 6 - View the Fault Generated During the Unsuccessful Deployment
The frame on the right side of this Web page displays information about the fault that
was generated during the deployment of the HR Benefits application.
The fault text is DIS SENG 0033 Error: Cannot create temporary file. The Tivoli
utility program that was being used to support the deployment of the application created
the message.
Step 7 - Examine the Detail Data for the Fault
The frame on the right side of this Web page displays detailed information about the
fault that was generated during the deployment of the HR Benefits application.
This Web page gives a long-term recommendation as well as a direct link to the Tivoli
book with more information and contact information for the development team in White
Plains.
Step 8 - Make the Required Updates Then Transfer Fault Data to the Problem-
Management System
The frame on the right side of this Web page is used to update information about the
fault and to transfer the data to the problem-management system.
The HR Benefits support person contacted the development team in White Plains, and they used SMIT to increase the size of /tmp. After that, he successfully restarted the deployment and the rollout was successful. The fault data is being transferred to the Prod-US problem-management system as a closed record.
Confirmation Web page
This Web page indicates that the fault data has been successfully transferred to the
problem-management system.
This is the end of Scenario 2. Please review the detailed views that follow and complete
the second survey. The detailed views are printed versions of the right-side frames of all
the views used in this scenario. The left-side frames are used only for navigation and
contain links to additional reports and views.
Scenario 3 - Web Application Change Results in Poor Performance
This Web page is the starting point for all the scenarios to be evaluated. Scenario 3,
Web Application Change Results in Poor Performance, is the third link on the page below.
In this scenario, poor performance results when a new function is installed for a Web
application. For the most part, the new function is operational, but it turns out that an
important definition file was missed in the migration to the new system. This situation is
detected through the use of a number of subsystems including change-window awareness,
unauthorized change, and configuration verification. The subsystems generate a fault that
is used to track the problem and then transfer the fault to the formal problem-management
system.
This Procedure Page Guides the Administrator's Actions
This page is taken from the toolset procedures. It guides the person handling the
problem through the steps to take to understand why the application is performing poorly
after a recent change.
This procedure outlines the main steps to follow to handle the problem with
performance. Since the problem happened after a recent change, the starting point for the
procedure is to check on the status of the last change window. Next, a check is made for
unauthorized changes. The configuration of the application is then checked to see if
something was missed during the change that may have caused a problem. When the
problem is found, the fault and detailed data views are used to better understand the
problem and transfer the fault to the problem-management system.
Step 1 - Check to See if There is an Active Window
The frame on the right side of this Web page contains information on the status of the
change windows for this application. This view indicates that there is no active window
and the previous window, which ended on 12/23/2001, completed normally.
The previous three windows and the next three planned change windows are shown.
Also shown are counts of faults that occurred during the windows and problem records
that were suppressed.
Step 2 - Check for any Unauthorized Changes
The frame on the right side of this Web page contains detailed information about the
number and details on changes that may have happened outside the authorized change
windows.
At this time, 14 changes were identified as having been made outside a change
window. The detail views that follow this snapshot will show that none were related to
this recent change.
Step 3 - Check for Configuration Differences/Mismatches
This view is used to check whether the system having the performance problem is significantly different from the other systems to which it is related. The comparison covers the test, verification, and production systems.
One difference, a file mismatch, is found. The mismatch is between verification and
production. Details regarding the difference are shown in the next view.
Step 4 - Examine the Configuration Verification Detail
The frame on the right side of this Web page displays information about the specific
error that was found during the configuration verification check.
It appears that the difference is in regard to an important file, the WebLogic properties
file. The ThreadCount parameter relates to the number of simultaneous operations
performed by the WebLogic server. This could be the cause of the performance problem.
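The kind of check that surfaces such a mismatch can be sketched briefly. The fragment below, a minimal sketch rather than the toolset's configuration-verification subsystem, parses two Java-style properties files (such as the WebLogic properties copies from the verification and production domains) and reports keys whose values differ; the key name and values are assumptions.

```python
# Illustrative configuration-verification check: compare two Java-style
# properties files and report keys whose values differ or are missing.
def load_properties(text):
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            props[key.strip()] = value.strip()
    return props

# Hypothetical contents of the two domains' properties files.
verification = load_properties("ThreadCount=15\n")
production = load_properties("ThreadCount=60\n")

for key in sorted(set(verification) | set(production)):
    v, p = verification.get(key), production.get(key)
    if v != p:
        print(f"Mismatch: {key} verification={v!r} production={p!r}")
```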
Step 5 - View the Configuration Verification Fault
The frame on the right side of this Web page displays detailed information about the
fault that was generated during the configuration verification check that found that the
verification application domain did not match the production system.
This fault is used to gather the key information about the problem so that it can be
resolved on the spot or transferred to another team to investigate the cause of the problem
and fix it.
Step 6 - View the Detailed Data for the Configuration Verification Fault
The detailed data for this fault gives both an action to take and a long-term
recommendation.
The detailed data confirms the potential seriousness of the difference and gives
information on how to contact the development team in Boulder.
Step 7 - Make the Required Updates Then Transfer Fault Data to the Problem-
Management System
The frame on the right side of this Web page is used to update information about the
fault and to transfer the data to the problem-management system.
The actions taken show that investigation was done because the newly changed
production system was experiencing performance problems. The development team was
contacted using a pager and key information was transferred to the problem-management
system.
Confirmation Web page
This Web page indicates that the fault data has been successfully transferred to the
problem-management system.
This is the end of Scenario 3. Please review the detailed views that follow and complete
the third survey.
Scenario 4 - Web Application Experiencing Bottlenecks
This Web page is the starting point for all the scenarios to be evaluated. Scenario 4,
Web Application Experiencing Bottlenecks as Some Queries Take a Long Time, is the
fourth link on the page below.
The b2b-EzTran application is running in production mode, but certain transactions
that use the database are taking a long time to complete. In this scenario, various
subsystems of the toolset are used to detect the specific component of the application that
are experiencing bottlenecks. The Application Bottleneck View, DB2 Statement Event
Monitor Analysis Program, and SLO/SLA Views are key to understanding the problem
and understanding its impact. At the end of the scenario, the fault is transferred to the
problem-management system as an open problem for the development DBA to resolve
using the detailed data that has been collected.
This Procedure Page Guides the Administrator's Actions
This page is taken from the toolset procedures. It guides the person handling the
problem through the steps to take to detect the bottleneck, invoke the correct utility, and
manage the fault that is generated to help the DBA get a deeper understanding of the root
cause of the slow performance.
After the bottleneck is detected and the database data gathering utility is used, the
specific fault, detailed data, and administrator action views are used to transfer the fault to
the production problem-management system to be handled and closed by the development
DBA team who supports the b2b-EzTran application.
Step 1 - Look for a Bottleneck That is the Cause of the Slow Response
The frame on the right side of this Web page contains information on the status of any application bottlenecks. Bottlenecks are defined as conditions involving the
application, database, or middleware that are keeping the application from processing
successfully.
Primary information is gathered about conditions like hung processes, too many processes, missing processes, long queue gets and puts, long SQL queries, and long reads
and writes. This data is collected through sampling and monitoring techniques and
reported in the management repository. In this situation, monitoring has detected 231 long
SQL queries.
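A simplified sketch of how sampled monitoring might flag this condition follows: count the sampled queries whose elapsed time exceeds a threshold and report a bottleneck when any are found. The threshold and sample data are assumptions for illustration, not values from the toolset.

```python
# Hedged sketch of long-SQL-query detection over sampled timings.
# The threshold and the sampled queries are illustrative assumptions.
LONG_QUERY_SECONDS = 5.0

sampled_queries = [
    {"sql": "SELECT * FROM orders WHERE cust_id = ?", "elapsed": 12.4},
    {"sql": "SELECT name FROM customers WHERE id = ?", "elapsed": 0.2},
    {"sql": "SELECT * FROM line_items WHERE order_dt > ?", "elapsed": 8.9},
]

long_queries = [q for q in sampled_queries if q["elapsed"] > LONG_QUERY_SECONDS]
if long_queries:
    print(f"Bottleneck: {len(long_queries)} long SQL queries detected")
    for q in long_queries:
        print(f"  {q['elapsed']:6.1f}s  {q['sql']}")
```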
Step 2 - Use the Subsystem Service View to Create Data and Invoke the DB2
Statement Event Monitor Analysis Program
The frame on the right side of this Web page contains actions that can be selected to
start the DB2 Event Monitor Trace for a specific domain and then to start the utility with
options.
The key options pertain to generating faults if there is an error, reporting, performing
analysis, and logging of details that can be used for browsing. Logging facilitates a closer
examination of the details of a related series of exceptions.
Step 3 - View the Confirmation Message and Continue
This view indicates the results of the previous actions to start tracing and the DB2
Event Monitor Analysis Program for the production domain.
The actions were successful so the instruction is to close this view and continue
processing using the Specific Fault View. It is on this view that any exceptions can be
managed. Since the trace was already started, the utility simply uses the live trace data.
Step 4 - Examine the Fault from the DB2 Event Monitor Analysis Program
This view is used to examine the fault that was created by the DB2 Event Monitor Analysis
program. The utility found one or more SQL queries that were running longer than the
threshold-specified limit.
From this view, the DB2 Statement Monitor Analysis View can be selected to examine
the detail that was collected and analyzed for the b2b-EzTran application.
Step 5 - Examine the DB2 Statement Analysis View
The frame on the right side of this Web page displays detailed information about the
fault including elapsed time, user CPU, system CPU, fetches, sorts, sort time, overflows,
rows read, rows written, SQLcode, SQLstate, timestamp, operation, and the text of the
actual SQL statement.
This data would be helpful to a development DBA to determine if there is a real
problem or just a long-running SQL statement. A real problem might be defined as a
query that is running long because a database index is missing. This situation can be
corrected through the creation of the required index.
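One heuristic a development DBA might apply to these metrics can be sketched as follows: a statement that reads far more rows than it returns is a common sign of a missing index. The field names and the 100:1 ratio below are illustrative assumptions, not rules from the toolset.

```python
# Hedged sketch of a missing-index heuristic over statement metrics.
# A query that reads vastly more rows than it returns often indicates a
# table scan that an index could avoid. The ratio is an assumed threshold.
def looks_like_missing_index(stmt, ratio=100):
    fetched = max(stmt["rows_fetched"], 1)  # avoid division by zero
    return stmt["rows_read"] / fetched > ratio

stmt = {
    "text": "SELECT * FROM orders WHERE order_dt > ?",  # hypothetical query
    "elapsed": 12.4, "rows_read": 1_500_000, "rows_fetched": 42,
}
if looks_like_missing_index(stmt):
    print(f"Candidate for an index: {stmt['text']} "
          f"(read {stmt['rows_read']} rows to return {stmt['rows_fetched']})")
```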
Step 6 - Examine the Detail Data for the Fault
The frame on the right side of this Web page displays detailed information about the
fault that was generated by the DB2 Statement Analysis utility.
This Web page gives information about the fault including a long-term
recommendation to continue sampling on a regular basis. This sampling can give
important perspective on the nature of the SQL statements that are used to access the
application database.
Step 7 - Examine the SLO/SLA Data for the Application
The frame on the right side of this Web page tells whether the b2b-EzTran application is an SLO or SLA application. The view states that the application is an SLO application with a 95% goal.
The recent history covers 8 weeks and includes detailed information on the collections defined and the log records that are available for browsing.
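The SLO computation itself is straightforward. The sketch below, with made-up weekly figures and a deliberately simplified averaging scheme, compares attainment over the 8-week history against the 95% goal:

```python
# Illustrative SLO check: compare average attainment over the recent
# history with the goal. The weekly figures are invented for the example.
SLO_GOAL = 0.95

weekly_success = [0.97, 0.96, 0.99, 0.93, 0.98, 0.95, 0.96, 0.94]  # 8 weeks
attained = sum(weekly_success) / len(weekly_success)

print(f"8-week attainment: {attained:.1%} (goal {SLO_GOAL:.0%})")
if attained < SLO_GOAL:
    print("SLO missed: raise a service-level fault for investigation")
```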
Step 8 - Make the Required Updates Then Transfer Fault Data to the Problem-
Management System
The frame on the right side of this Web page is used to update information about the
fault and to transfer the data to the problem-management system.
The b2b-EzTran support person contacted the development team in Charlotte and they
will use the detailed data collected to handle the long SQL queries by changing the
database structure or working with the application developers to rework the SQL
statements so they perform better.
Confirmation Web page
This Web page indicates that the fault data has been successfully transferred to the
problem-management system.
This is the end of Scenario 4. Please review the detailed views that follow and complete
the fourth survey. The detailed views are printed versions of the right-side frames of all
the views used in this scenario. The left-side frames are used only for navigation and
contain links to additional reports and views.
Scenario 5 - Overall Response for the Application is Slow, but the Application is Still
Functional
This Web page is the starting point for all the scenarios to be evaluated. Scenario 5,
Overall Response for the Application is Slow, but the Application is Still Functional, is
the fifth link on the page below.
The Value Market Web application is performing slowly, but all components are
available. The toolset's deep availability capability is used to determine the root cause of
the overall poor performance. Deep View is used to take a comprehensive look at the
operational status of the application. Business Views are used to see what business system
the application is part of and what applications may be affected. Intimate Performance is used to examine both application-specific and proxy performance data.
This Procedure Page Guides the Administrator's Actions
This page is taken from the toolset procedures. It guides the person handling the
problem through the steps to take to detect the root cause of the performance problems
that are being experienced with the Value Market Application.
The scenario begins with a call to the Customer Care Center (CCC). The CCC
personnel are told that this performance problem has been going on for some time. They
have tried to get the development team to look into the problems and there has been no
progress, so now they are asking for help from the CCC.
Step 1 - Take a Broad Look at the Value Market Application using the Deep
Information View
The frame on the right side of this Web page contains information on a variety of
perspectives for the application including accounting, administration, automation,
availability, business, capacity, change, fault, operations, performance, problem, security,
service level, and software distribution.
In this situation, there are several problems with the site including automation faults
(unsuccessful recoveries), availability problems (switch faults), and capacity problems (processor
faults).
Step 2 - Use the Business View to Gather More Information on the Application's
Context
The frame on the right side of this Web page contains summary information about the
business systems with which Value Markets is associated. The parent view is VMS
Systems Limited, which contains the status of hundreds of application, database, and
middleware resources.
The Value Markets business system contains 31 application, 26 database, and 6 middleware resources. Twenty application resources are in a degraded status. On the next page,
the details for the Value Markets logical view are displayed.
Step 3 - Examine the Value Markets Logical Details
This view displays the application, database, and middleware resources that are in up,
degraded, and down status. The messages in the table are created through monitoring the
resources that are key to the application like processes, URLs, and tables.
In this situation, there are 20 messages reporting that an application program was
detected in a stalled state. This is a good indicator that there have been performance
problems with the application.
Step 4 - Examine the Intimate Performance Data
This view is used to examine the two kinds of performance data that are available for applications--application-specific and proxy. Application-specific data comes directly
from an application that is instrumented to create its own performance data. Proxy data is
often created by a robot application that is a stand-in or substitute for the actual
application.
In this situation, there is only proxy data. The proxy data implies that some executions
of the BuyRobotDaily proxy transaction have been experiencing long response times.
This is a good indication that there are response-time problems with the application.
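A proxy probe of this kind can be sketched in a few lines: time a representative request against the application and flag responses that exceed a threshold. This is a minimal sketch; the URL, threshold, and function name are placeholders, not the Value Market system's actual values.

```python
# Hedged sketch of a robot (proxy) probe: issue a timed request against
# the application and flag long response times. URL and threshold are
# hypothetical placeholders for the example.
import time
import urllib.request

def run_proxy_probe(url, threshold_seconds=3.0):
    start = time.monotonic()
    try:
        urllib.request.urlopen(url, timeout=30).read()
    except OSError as exc:  # covers network errors and timeouts
        return {"url": url, "ok": False, "error": str(exc)}
    elapsed = time.monotonic() - start
    return {"url": url, "ok": elapsed <= threshold_seconds, "elapsed": elapsed}

result = run_proxy_probe("http://valuemarket.example.com/buy")  # hypothetical
print(result)
```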
Step 5 - Examine the Fault
The frame on the right side of this Web page displays detailed information about the
fault including its source (Multiple Sources Impacting Performance) and Sub Source
(Failed Restarts, Switch Faults, and Processor Faults).
This data indicates that this is a complex problem. Independent of one another, these
problems are potentially serious to the application. These problems may also be related to
one another and may need to be resolved through careful analysis in order not to cause
even more serious problems with the application.
Step 6 - Examine the Detail Data for the Fault
The frame on the right side of this Web page displays detailed information about the
fault that was generated, which includes information from multiple sources affecting the
performance of the application.
This Web page gives information about three different aspects of the fault that include
failed restarts, switch faults, and processor faults. As these problems may be related, they
should be investigated more fully and the root causes should be identified and resolved.
Step 7 - Make the Required Updates Then Transfer Fault Data to the Problem-Management System
The frame on the right side of this Web page is used to update information about the
fault and to transfer the data to the problem-management system.
The development team in India was contacted by email and asked to look into the
various faults that are contributing to the performance problems with the application. The
fault is being transferred to the problem-management system as an open problem.
Confirmation Web page
This Web page indicates that the fault data has been successfully transferred to the
problem-management system.
This is the end of Scenario 5. Please review the detailed views that follow and complete
the survey.
Appendix F
Background and Brainstorming JAD Materials

This appendix contains materials that were used in the first JAD session. This is the
first presentation page in the set of materials. This chart was used to launch the JAD
session. The session was used to share some background materials on the project and then
to brainstorm with the participants to get their best ideas.
_______________________________________________________________________
Design and Implementation of a
Prototype Toolset for Full Life-Cycle Management of Web-Based Applications
Background and Brainstorming JAD Materials
Figure 17. Cover page from the JAD kickoff presentation
_______________________________________________________________________
Agenda From the JAD Kickoff Presentation

The agenda indicated that there were two major topic areas. The participants were experienced with Web-application management; however, the conventions of the project needed to be explained and discussed.
________________________________________________________________________
The agenda includes a background topic which is needed so we can brainstorm the toolset components
1. Background
2. Brainstorming
Figure 18. Agenda from the JAD kickoff presentation
________________________________________________________________________
Background From the JAD Kickoff Presentation

This presentation page was used to indicate the beginning of the background
materials.
________________________________________________________________________
Background
Figure 19. Background from the JAD kickoff presentation
________________________________________________________________________
Design Background From the JAD Kickoff Presentation

This page was used to share information with the participants about how the
design for the toolset was going to be managed during this phase. The basic approach was
JAD activities that leveraged technology like conference calls, email, and a document
database.
________________________________________________________________________
Design is an important first step for this project -- your ideas and input are key
Joint Application Design (JAD)
Leverage technology:
Conference calls
Documentation database
Electronic collaboration via Notes
Figure 20. Design background from the JAD kickoff presentation
________________________________________________________________________
Implementation Background From the JAD Kickoff Presentation

This page was used to share information with the participants about how the
implementation for the toolset was going to be handled. The basic approach was to create
working versions of the elements needed to support the toolset scenarios.
________________________________________________________________________
Implementation is limited to a prototype of working versions of toolset components
Rapid Application Design (RAD)
Prototype
Working versions of most designed elements of the toolset
Toolset elements include procedures, programs, views, schema, and data/information
Figure 21. Implementation background from the JAD kickoff presentation
________________________________________________________________________
Toolset Information From the JAD Kickoff Presentation

This page was used to explain to the participants how the toolset components work
together to solve the challenge of managing Web applications.
________________________________________________________________________
Schema
Views
Data
Procedures
Programs
Figure 22. Toolset information from the JAD kickoff presentation
________________________________________________________________________
Procedures Information From the JAD Kickoff Presentation

This page was used to share information with the participants about the role of
procedures. The basic idea is that procedures are for humans to use in the management of
an application. Procedures can be automated by turning them into programs.
________________________________________________________________________
Manual procedures can be automated with programs and can then be used for manual fallback
Manual procedures -- used to direct human activity
Automatic procedures -- programs to perform manual function
Figure 23. Procedures information from the JAD kickoff presentation
________________________________________________________________________
View Information From the JAD Kickoff Presentation

This page was used to share information with the participants about the
purpose of views. Views are for humans to look at data in the MIR that is useful during
various life cycle phases.
________________________________________________________________________
View creation and use requires support during the full life cycle
Views for design, construction, deployment, operations, and change
Key questions for each phase:
what components? what support instrumentation? what MIR support? how to use view? what command support? what monitor support?
Figure 24. View information from the JAD kickoff presentation
________________________________________________________________________
Program Information From the JAD Kickoff Presentation

This page was used to explain the role and importance of programs to the toolset.
________________________________________________________________________
Programs are an important part of the toolset and are used in every life-cycle phase
Life cycle examples:
Design - script to load MIR with component information from design documents
Construction - script to test Web application function exceptions
Deployment - script to distribute Web application
Operation - script to monitor Web application
Change - script to stop/start Web application components
Figure 25. Program information from the JAD kickoff presentation
________________________________________________________________________
MIR Information From the JAD Kickoff Presentation

This page explains that the MIR is the heart of the management system.
________________________________________________________________________
Management Information Repository (MIR) is the heart of the Web application management system
Used during full Web application life cycle
Contains all life cycle work products:
Designs, procedures, programs, mapping of key software logs, summary information from key sources, application component information, exception messages, events, and alarms, ...
Figure 26. MIR information from the JAD kickoff presentation
________________________________________________________________________
Schema Information From the JAD Kickoff Presentation

This page explains the relationship of schema to the MIR.
________________________________________________________________________
The schema provides the mapping and definitions that make the MIR useful
Data definitions -- used by programs and views
Dictionary -- used by humans
Figure 27. Schema information from the JAD kickoff presentation
________________________________________________________________________
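The pairing of machine-readable data definitions with a human-readable dictionary might look like the following sketch. The table and column names are invented for illustration and are not the MIR's actual schema.

```python
# Minimal sketch of the schema idea: data definitions consumed by
# programs and views, paired with a dictionary written for humans.
# All names and types below are illustrative assumptions.
data_definitions = {
    "fault": {
        "fault_id": "CHAR(12)",
        "application": "VARCHAR(64)",
        "severity": "VARCHAR(16)",
        "created": "TIMESTAMP",
    }
}

dictionary = {
    "fault.fault_id": "Unique identifier assigned when the fault is created",
    "fault.application": "Web application that generated the fault",
    "fault.severity": "Business impact of the fault (e.g., CRITICAL)",
    "fault.created": "Time the fault was recorded in the MIR",
}

for column, col_type in data_definitions["fault"].items():
    print(f"{column:12} {col_type:12} -- {dictionary['fault.' + column]}")
```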
Data and Information for the MIR Concepts From the JAD Kickoff Presentation

This page was used to explain that data and information will be stored in the MIR.
________________________________________________________________________
Both detail data and information will be stored in the MIR
Full life-cycle data and information will be stored for the Web application
Operations-phase challenges will require transforming log data into summary information
Figure 28. Data and information for the MIR concepts from the JAD kickoff presentation
________________________________________________________________________
Brainstorm Page From the JAD Kickoff Presentation

This presentation page was used to indicate the beginning of the brainstorming
materials.
________________________________________________________________________
Brainstorm
Figure 29. Brainstorm page from the JAD kickoff presentation
________________________________________________________________________
Phases and Toolset Information From the JAD Kickoff Presentation

This page was used to review the phases and toolset components that are pertinent to
this project and this brainstorming activity.
________________________________________________________________________
Before getting started, review the phases and toolset components we will use
Phases: Design, Construction, Deployment, Operation, and Change
Toolset components: Procedures, Programs, Views, Schema, and Data/Information
Figure 30. Phases and toolset information from the JAD kickoff presentation
________________________________________________________________________
Functional Perspectives Information From the JAD Kickoff Presentation

This page was used to review the functional perspectives that are pertinent to this
project and this brainstorming activity.
________________________________________________________________________
Also, consider these functional perspectives
Accounting, Administration, Automation, Availability, Business, Capacity, Change, Configuration, Fault, Operations, Performance, Problem, Security, Service Level, and Software Distribution
Figure 31. Functional perspectives information from the JAD kickoff presentation
________________________________________________________________________
Design Brainstorming Template From the JAD Kickoff Presentation

This page was used to support the brainstorming activity for design-phase toolset
components.
________________________________________________________________________
Design brainstorming template
Name (Perspective)    Function
Procedure
Program               Script to load MIR with component information from design documents (3,4,9)
View
Schema                Mapping and dictionary for component information from design documents (3,4,9)
Data/Information
Perspectives: (1) Accounting, (2) Administration, (3) Automation, (4) Availability, (5) Business, (6) Capacity, (7) Change, (8) Configuration, (9) Fault, (10) Operations, (11) Performance, (12) Problem, (13) Security, (14) Service Level, and (15) Software Distribution
Figure 32. Design brainstorming template from the JAD kickoff presentation
________________________________________________________________________
Construction Brainstorming Template From the JAD Kickoff Presentation

This page was used to support the brainstorming activity for construction-phase
toolset components.
________________________________________________________________________
Construction brainstorming template
Name (Perspective)    Function
Procedure
Program
View
Schema
Data/Information
Perspectives: (1) Accounting, (2) Administration, (3) Automation, (4) Availability, (5) Business, (6) Capacity, (7) Change, (8) Configuration, (9) Fault, (10) Operations, (11) Performance, (12) Problem, (13) Security, (14) Service Level, and (15) Software Distribution
Figure 33. Construction brainstorming template from the JAD kickoff presentation
________________________________________________________________________
Deployment Brainstorming Template From the JAD Kickoff Presentation

This page was used to support the brainstorming activity for deployment-phase toolset
components.
________________________________________________________________________
Deployment brainstorming template
Name (Perspective)    Function
Procedure
Program
View
Schema
Data/Information
Perspectives: (1) Accounting, (2) Administration, (3) Automation, (4) Availability, (5) Business, (6) Capacity, (7) Change, (8) Configuration, (9) Fault, (10) Operations, (11) Performance, (12) Problem, (13) Security, (14) Service Level, and (15) Software Distribution
Figure 34. Deployment brainstorming template from the JAD kickoff presentation
________________________________________________________________________
Operations Brainstorming Template From the JAD Kickoff Presentation

This page was used to support the brainstorming activity for operation-phase toolset
components.
________________________________________________________________________
Operation brainstorming template
Name (Perspective)    Function
Procedure
Program
View
Schema
Data/Information
Perspectives: (1) Accounting, (2) Administration, (3) Automation, (4) Availability, (5) Business, (6) Capacity, (7) Change, (8) Configuration, (9) Fault, (10) Operations, (11) Performance, (12) Problem, (13) Security, (14) Service Level, and (15) Software Distribution
Figure 35. Operations brainstorming template from the JAD kickoff presentation
________________________________________________________________________
Change Brainstorming Template From the JAD Kickoff Presentation

This page was used to support the brainstorming activity for change-phase toolset
components.
________________________________________________________________________
Change brainstorming template
Name (Perspective)    Function
Procedure
Program
View
Schema
Data/Information
Perspectives: (1) Accounting, (2) Administration, (3) Automation, (4) Availability, (5) Business, (6) Capacity, (7) Change, (8) Configuration, (9) Fault, (10) Operations, (11) Performance, (12) Problem, (13) Security, (14) Service Level, and (15) Software Distribution
Figure 36. Change brainstorming template from the JAD kickoff presentation
________________________________________________________________________
Next Steps From the JAD Kickoff Presentation
This page was used to explain the next steps for the JAD activities.
________________________________________________________________________
Here are the next steps
Document session, post in documentation database, distribute for:
Correctness, clarity, and detail
More ideas are welcome!
Follow up session in 2 weeks
Figure 37. Next steps from the JAD kickoff presentation
________________________________________________________________________
Appendix G
Comment Sheet Details for Full Life-Cycle Toolset

This appendix contains the comments from the survey participants. The survey participants wrote the comments in response to three open-ended questions. These three
questions appeared on a page after the five sets of survey questions for the toolset
scenarios. The questions and the comments from the survey participants are included
below.
Question 1 - What were the strengths of the toolset as implemented in these scenarios?

Survey Participant 1
Makes problem determination easier since data required to debug is available without making additional runs to capture the data.

Survey Participant 2
Provided a great deal of detail on problems and pulled in information from numerous sources.

Survey Participant 3
Easier to understand and use, documents faults to production problem system.

Survey Participant 4
Significant amount of analysis data collected and available-simplifies the effort needed to correlate cause and effect-leads the technician in a methodical way to evaluate the situation.

Survey Participant 5
Walks you through the repair or possible resolutions. Gets very detailed. Covers most of the common Web application issues.
Survey Participant 6
Tremendous amount of information on system and application components. Built in analysis capability and detailed recommendation for immediate and long term actions very helpful for persons when understand the application or OS environment.

Survey Participant 7
Having the procedure page initially to guide the support personnel through the toolset is very helpful. Being able to get at different statuses and records from disparate sources and bringing them all together, accessible from one common toolset, is very beneficial. Tying in detailed error message descriptions (like the DB2 message explanations) and presenting them via the toolset (so the support people don’t have to go off to other manuals) is useful. I like the concept of having some of the “problem determination assistance” views (like Check for Configuration Differences/Mismatches) generating faults that can then be investigated further using the mainline processing views (Specific Fault and Detailed Data).

Survey Participant 8
Relatively intuitive, common look and feel. Easy to use help desk personnel to do preliminary problem determination. A good job of compiling all of this data into one place!

Survey Participant 9
No comments.

Survey Participant 10
Easy to navigate from start to closure. Consistency among scenarios.

Survey Participant 11
Very sophisticated and comprehensive. Usability levels were appropriate for the intended audience.

Survey Participant 12
The idea that the tools and problem management can be connected automatically. As well as the integration of the log information.

Survey Participant 13
No comments.
Survey Participant 14
I think the toolset would be helpful for more simple, typical problems that come in when installing, configuring, using, etc software. Includes easy to follow instructions, with information on how it came to the conclusions it did. Even if it does not solve the problem, the suggestions and information provided could be used to find what is causing the problem. Any user with some knowledge of development, DB, or the actual application being used might be able to at least see the error(s) and get a good start to finding the problem and/or who to contact for assistance.

Survey Participant 15
The screens are pretty much self-explanatory, but unless you have an experienced person at the helm, I am not convinced that problem cycle time would be reduced. This would be advantageous for a large customer with a dedicated support staff that was familiar with the applications and able to draw conclusions from the data.

Survey Participant 16
The tool set provides a comprehensive set of steps/procedures that are easy to use. Very good relationship with various reference tools. Nice to have answers in a centralized location.

Survey Participant 17
The ability of the toolset provided to determine root cause by having all of the appropriate information in one place was the toolset’s strongest asset in my view.

Survey Participant 18
Knowledge–base in handling error messages, where to find proper team to handle problem. More convoluted as complexity of problem increases- but that is to be expected.

Survey Participant 19
No comments.

Survey Participant 20
Consistent user interface provides easy navigation.

Survey Participant 21
Standards implemented across accounts. “One Stop Shop” for monitoring, and gathering details, for procedures and problems and change tool-link.
Survey Participant 22
Comprehensive and thorough. I thought the scenarios were well thought out and represented what would actually happen in reality. The amount of information and the user interface is consistent and easily understood. The design also flowed in a logical pattern.

Survey Participant 23
No comments.

Survey Participant 24
Comprehensive and easy to understand and flowing user interface.

Survey Participant 25
The toolset is user friendly, easy to navigate and does not require technical expertise.

Survey Participant 26
Documentation to support operations-detailed. Identification of the application, sub-component and failing component clearly. Integrated information on application, infrastructure and human resources.

Survey Participant 27
The strengths included easy to follow instructions, quick “at a glance” summaries, as well as detail when the administrator or end user needs to see them. Also, sophisticated features such as long term recommendations, actions to take, as well as integration with problem management system.

Survey Participant 28
The step by step approach was very good. The steps were clearly delineated.

Survey Participant 29
Logical approach to analyzing data extracted by toolset allowing for quicker reaction and recovery. Will be a very good toolset to more proactively detect problems that plague Web hosted applications.

Survey Participant 30
No comments.
Survey Participant 31
No comments.

Survey Participant 32
It is robust, and organized well to allow for ease of use. It is effective, and employs problem determination techniques that allow for systematic resolution. It adds value to the business by connecting an administrator to other appropriate teams via links; this yields interdependence and team work.

Survey Participant 33
The strengths of the toolset as the inherent benefits of an integrated system that covers the aspects of a full-life cycle enterprise. If used correctly the tool would be good for keeping track of the status of a problem through resolution.

Question 2 - What were the weaknesses of the toolset as implemented in these scenarios?

Survey Participant 1
I was unclear how the information being displayed was gathered.

Survey Participant 2
Maintenance of information sources would be high. For instance, updating the SLO information.

Survey Participant 3
There may be to much information for the user to digest.

Survey Participant 4
Appears that a tremendous amount of customization will be needed to make the toolset functions work for each application environment. Also the tools seem to have the capability of identifying defects and in a pro-active fashion. Example is last scenario- tool identifies high number of switch faults only when queried as a result of customer complaint, then suggests possible speed mismatch. Why not have tool either a. evaluate and report on mismatches or b. generate a fault whenever high number of switch errors are detected (before customer has to complains)?

Survey Participant 5
My only complaint is format is busy. In some cases, I just want to see the root of the problem bullet pointed, and drill down to the solution. I would like to see an external system connectivity breakage scenario.

Survey Participant 6
Quantity of information can be overwhelming. In some situations it appeared as though information would have been prioritized better (i.e. SLA/SLO info.)

Survey Participant 7
I found the different views that are not common between all the scenarios somewhat difficult to understand/follow (i.e. the views other than Procedure Page, Specific Fault, Detailed Data and Administrator Action). This might just be my lack of understanding of the overall architecture and what all the components/views are that make up the toolset. Having the procedure page definitely overcomes this weakness and points out when it is necessary or desirable to go to a view that is not one of the mainline views. Some of the scenarios seem like they should have been triggered via automation instead of “a call to the help desk.” For example, in Scenario 5, if there were multiple faults generated, a monitoring threshold should probably have been exceeded somewhere indicating a problem much earlier than having to have someone call the help desk and complain of a performance problem. Not understanding fully the methodology of how these views and procedures are created, I guess its possible that after someone discovers this data and “pattern” of problems, in addition to looking into the problems themselves, someone puts in automation and/or monitors to watch for those conditions automatically going forward. I think it would be useful to “run” certain of the views on a regular basis via automation. For example, the Check for Configuration Differences/Mismatches view should probably be run every day or every few days to try to catch those unauthorized configuration changes before they are a problem in the system.

Survey Participant 8
Some highly detailed screens were hard to understand.

Survey Participant 9
No comments.
Survey Participant 10
None that I saw.

Survey Participant 11
Scenario 5- Perhaps more analysis info would be helpful for the development team.

Survey Participant 12
I am concerned that the alerts in the first example will get to the administration before the record is recorded. The usability needs vast improvement, layout is not well throughout (tables and rows). Requiring a close to continue is not usable. Logs can get rather large, concerned that will have a negative impact in the real world.

Survey Participant 13
No comments.

Survey Participant 14
Why is calling necessary? Why doesn't the tool open a problem ticket with the appropriate group and give the phone number to follow-up if necessary? Difficult issues like performance issues typically are very difficult to find, and they are not usually one problem. In these examples, only one resolution is given for each problem. Is it possible for the tool to give multiple possible resolutions?

Survey Participant 15
Scenario 1- The actual SQL statement isn’t displayed- that would be of value. The fault should initiate a problem ticket as soon as the fault is detected. That doesn’t appear to be the case. Scenario 2- Why not have the tool check space prior to installing the application? Scenario 4- Limit who can start the event monitor- should be a DBA.

Survey Participant 16
May be difficult to update tool set based upon new software/new documentation.
Survey Participant 17
Ensuring that the data being references by the toolset is accurate and complete seems almost impossible.

Survey Participant 18
How/who would decide on action/domain/options screens? Operations? Would they have knowledge/skill set to make determination?

Survey Participant 19
No comments.

Survey Participant 20
No comments.

Survey Participant 21
Could enhance to allow for automation of fix or page-out (if possible for a scenario- not appropriate for all problems). Could include call lists with page-out option built in or could automatically page-out (for appropriate problems). How to keep updated?

Survey Participant 22
I thought the scenarios could have benefited from a more detailed breakdown of the possible problems. For example, there was no distinction between “application” and “content”. It’s entirely possible in the Web environment to have the application server and content server be physically different. This amount of detail would have been beneficial in Scenarios 1 and 2. The assumption was that the problem was with the application, but my experience is that content also contributes. The other weakness is the implementation of a closed loop process. When confirmation is given that the fault was sent to the problem management system, how do we know the root cause is actually fixed. Confirmation of the fix or notification to the tool set that a particular problem has been updated would be helpful to the user, especially if they see the problem recurring.

Survey Participant 23
No comments.
Survey Participant 24
Data shown not always easy to understand. Likely to require end users to be heavily trained in interpreting the data.

Survey Participant 25
No comments.

Survey Participant 26
Did not perform auto-correct and restart (e.g., for Scenario 2, the idea tool would have invoked SMIT, resized and retried.)

Survey Participant 27
Very Few! The only suggestion would be on complex problems, further breaking down the components with more intricate detail. But currently it’s sufficient.

Survey Participant 28
No comments.

Survey Participant 29
For some user audiences (the Help Desk or Level 1.5 types) it may be difficult for them to follow/fully understand the toolset navigation and results.

Survey Participant 30
I didn’t fully understand the domains areas. Maybe a little more explanation on this functionality. Lastly, the output seems to point you in one direction. It does give recommendations, etc.; however, if the direction I am given is incorrect, I am in the same situation as without the tool it seems.

Survey Participant 31
No comments.
392
Survey Participant 32 There were no glaring problems to render the tool unsuccessful by any means. However, there is a lot of information on the screen at one time, which might be put in pull-down format to allow someone to see only what they need at one time. This makes it even more accessible to an administrator (i.e. top gun, contractor, etc.) who is less familiar with its format.
Survey Participant 33 The tool does not seem to account for problems that do not have a specific error code from an application, or problems that result from a combination of problems. As with Tivoli, the expert will still be in demand to make use of the system.

Any other comments or observations?
Survey Participant 1 What is the performance impact of the toolkit gathering this real-time data?
Survey Participant 2 No comments.
Survey Participant 3 Nice piece of work. Wish I had this type of tool when supporting errant Web applications.
Survey Participant 4 The sophistication of this toolset seems to imply a higher level of skill in the operations role than traditional. The direction has been to simplify and put the lowest skill possible at the monitor console. Does this toolset imply a paradigm shift?
Survey Participant 5 I recommend explaining SLA/SLO if you are going to use those concepts. Not a problem for IBMers. Although I understand why these packages were distributed the way they were, the use of color greatly enhances understanding and impact of the toolset. Assumption: Tivoli Framework is running underneath. What would happen if there were no Tivoli? What other products could interface with the toolset? Assumption: Each of these products (e.g. DB2) can send the types of alerts that the toolset is looking for, with adequate detail.
Survey Participant 6 Nicely integrated with other systems, communications tools (email, pager) and external information resources.
Survey Participant 7 I found it confusing that the toolset didn’t really distinguish what each support role was supposed to do and how each person performing each role would know. Is that something in their work instructions or job description? E.g. for scenario 1, the customer care person was working the problem and was supposed to know not to do the vendor-recommended actions because they were not a database SME. The Specific Fault view does identify the problem as needing to go to the database group to be worked, but the customer care person also has to go through the scenario to gather data and transfer the problem to the problem management system. The tie-in with the problem management system is a bit confusing. In cases where the customer care person researches the problem and gathers info and then opens a problem ticket with that data…I understand. But in other cases the problem is seemingly fixed before a problem record is opened (or opened as a closed problem).
Survey Participant 8 Overall, even including the more complicated scenarios, it was easy to pinpoint problems or problem areas. Utilizing this tool will/would enable companies to leverage lower cost personnel to achieve problem resolution. A side benefit would be freeing the more experienced, highly trained personnel from performing late night problem resolutions.
Survey Participant 9 No comments.
Survey Participant 10 Where’s my Demo?
Survey Participant 11 This is a well thought-out, comprehensive set of tools. The level of sophistication is definitely leading edge. In today’s competitive Web-hosting environment it provides an excellent solution for driving down cost (through labor) while improving quality and availability.
Survey Participant 12 Overall I understand the concept and its goals, but more work needs to be done in two areas:
− Process management
− Usability
Survey Participant 13 No comments.
Survey Participant 14 I would have liked to have a little more info on how the toolset worked to get this information. (i.e. Was this supposed to be Tivoli installing, running, etc. the applications it was diagnosing the problem for? I am not sure if my Tivoli & DB background made this easier or harder for me to judge. :-) I am curious how you are going to use this info in your final project.
Survey Participant 15 I think this would be a benefit for large, complex customers. The Web Application deployment could probably be expanded upon to hold a version library- that would be nice. It appears that there is quite a bit of initial set-up required (servers, IPs, etc)- is this done through Tivoli, or is some of it manual? Seems like a major effort to set up all the specified actions and recommendation texts, and they’d need to be maintained. Change management is a hot issue- reports that show change history by server would be a value add. Scenario 5 is too complex for an operator, unless he’s very familiar with this particular customer’s environment. (Because I don’t know the history of the applications, how would I draw any conclusions about the completion times?)
Survey Participant 16 The first 3 tools were very powerful and easy to understand for the help desk level personnel. The first 3 tools also provided enough detail for the admin to easily resolve the issue. The last 2 tools would be too complex for a help desk level person to use easily.
Survey Participant 17 I believe the toolset can be very valuable if the data can be collected in a reasonable fashion.
Survey Participant 18 Excellent tool for Ops/others to teach reasons for problems- can see as usable for root cause analysis!
Survey Participant 19 No comments.
Survey Participant 20 No comments.
Survey Participant 21 Operations has a procedure database with this data, but this tool would pull all of that data together with the (SMC) tool and call list database. FYI- A company called 7th wave has a resource/request tool that is similar. Instead of being driven on faults, it is driven on requests, determines the resource (person) work queue to assign the request, and then provides the procedure to perform/do the task. You can see a demo of the tool on the Web. (seventh wave) Would have chosen “will have impact” even without the phrase “but improvements are needed.” I am not sure of the degree of impact on operations or maintenance teams due to current toolsets, but feel it would have impact. I like the “standard” procedures approach. Probably could lower Band level using tool in operations.
Survey Participant 22 When more than one possible solution is available, I’d like to see the tool set recommend a course of action; for example, do procedure 1, if OK then go to procedure 2, if not, then go to procedure 3. I’d suggest you distinguish between proprietary applications, especially those owned by the customer, and “shrink-wrap” applications like MS Outlook. The problem determination and the course of action could be customized for each. In the scenarios where the performance is slow, you might want to think about distinguishing technical problems from those problems caused by an increase in usage, or traffic, at the Website or the Website page design itself (performance problems could be caused by having large graphics files).
Survey Participant 23 No comments.
Survey Participant 24 No comments.
Survey Participant 25 No comments.
Survey Participant 26 Although I noted a desire for more automation, this is a giant leap forward from today’s operational environments. The integration of the various data stores, coupled with powerful views supporting the problem determination effort, enables lower skilled people to solve more problems at the point of interrupt, significantly improving service levels by dramatically reducing MTTR.
Survey Participant 27 Excellent tool set, and it appears to be greatly beneficial to any team or organization that would use it! I wish I had these tools when I was a system administrator.
Survey Participant 28 Since I am unfamiliar with Help Desk fault analysis processes and procedures, I could not compare with other existing processes, and in addition, it took moderate effort for me to understand the steps. Since the toolset is very consistent throughout and takes a well defined step by step process, I believe someone with a little training could very easily use this tool set.
Survey Participant 29 Need to check spelling within context in scenarios. Found several errors. (i.e. ‘end’ where it should have been ‘and’, etc.)
Survey Participant 30 I’m worried about the usability of the tool as compared to the complexity of the issue. Will the tool truly point one in a direction when the issue is very complex? Another concern is the knowledge of each user in regard to the information presented with the tool. Does the user fully understand the output to direct it to the appropriate ‘fixers’?
Survey Participant 31 Why is the fault transferred as closed in scenarios? What happens if the installation/configuration stalls or fails?
Survey Participant 32 Outstanding toolset; it was presented well and solves real problems in an efficient and effective manner.
Survey Participant 33 You used the Tivoli product set extensively to get your point across. Tivoli already has a host of products to do exactly what your toolset does. Will your toolset be used to integrate other vendor products that may not integrate as well as the Tivoli product set? Is your toolset merely proposing a methodology or is it proposing a new product line? The full Life Cycle Approach can be very effective with buy-in from all involved stakeholders. Also, it would require that this groundwork be laid before any other systems/applications are in place. This would make the system most effective. The problem you are solving with this toolset is not clearly defined. Also, since other products exist that solve the same problem, how does your product differ? The impact the toolset would have on an organization would not be limited to the efficiency of the tool itself. However, the tool does a good job of tying multiple events of the same problem together: correlation. The interface is very simple, and it is easy for a user to grasp the functionality. I am sure that any graphical features were missed because of the black and white copies, but if not, it could be more exciting using various color schemes. The requirements for the overall product need to be clarified. The scenarios were realistic and straightforward. But in the scope of the whole system, I am unable to quantify whether the requirements are met. How is the toolset configured to detect various vendor/application errors? Is there some way to update the system when new tools/apps/hardware/software, etc. are added to the environment? Addressing these questions will speak to the adaptability of the toolset. This is crucial for the dynamic world to appreciate your toolset. Good Work!
Appendix H
Data Dictionary for Full Life-Cycle Toolset

This appendix contains the data dictionary for the full life-cycle toolset. The scope of
this appendix is limited to the database tables and fields that were used to implement the
prototype toolset scenarios. The database tables named in this appendix are part of a
single database called the Full Lifecycle Toolset MIR.
Table 54. Application Capacity Log
Field Name Description Type
Application Names the application for which this data applies. For example, b2b-EzTran. This is the first of five fields that make up the primary key.
Text
Domain Names the domain for which this data applies. For example, Verification. This is the second of five fields that make up the primary key.
Text
Type of Bottleneck
Indicates the general kind of bottleneck data. Values include: - Application - Database - Middleware - Network - System This is the third of five fields that make up the primary key.
Text
Verification Date Date for this verification data. This is the fourth of five fields that make up the primary key.
Date/Time
Verification Time Time for this verification data. This is the last of five fields that make up the primary key.
Date/Time
Bottleneck Subtype
Indicates the specific kind of data. Values include: - Long Read - Long Queue Get/Put - Long SQL Query - Long Write - Process Hung - Process Missing
Text
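To make the key structure concrete, the sketch below shows how the Application Capacity Log might be declared in SQL. This is a minimal illustration only: the dialect is DB2-style, and the table name, underscore column names, and length limits are assumptions rather than part of the toolset definition.

-- Illustrative DDL for the Application Capacity Log (Table 54).
-- The five-part composite primary key mirrors the field notes above;
-- the CHECK constraint enumerates the documented bottleneck types.
CREATE TABLE APP_CAPACITY_LOG (
    APPLICATION        VARCHAR(64) NOT NULL,
    DOMAIN             VARCHAR(32) NOT NULL,
    TYPE_OF_BOTTLENECK VARCHAR(16) NOT NULL
        CHECK (TYPE_OF_BOTTLENECK IN
               ('Application', 'Database', 'Middleware', 'Network', 'System')),
    VERIFICATION_DATE  DATE        NOT NULL,
    VERIFICATION_TIME  TIME        NOT NULL,
    BOTTLENECK_SUBTYPE VARCHAR(32) NOT NULL,
    PRIMARY KEY (APPLICATION, DOMAIN, TYPE_OF_BOTTLENECK,
                 VERIFICATION_DATE, VERIFICATION_TIME)
);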
Table 55. Application Definition
Field Name Description Type
Application Name Names the application that can be managed using the full life-cycle toolset. This is the primary key.
Text
Accounting Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Administration Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Automation Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Availability Support
Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Business Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Capacity Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Change Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Fault Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Operations Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Performance Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Problem Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Security Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Service Level Support Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
Software Distribution Support
Indicates if this application has support for the specified subsystem. Values are yes or no.
Text
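Because every support flag in the Application Definition table is a simple yes/no value, the pattern can be captured with CHECK constraints. The sketch below is hedged in the same way as the previous one: the names are assumed, and only three of the fourteen flags are written out.

-- Illustrative DDL for the Application Definition table (Table 55).
CREATE TABLE APP_DEFINITION (
    APPLICATION_NAME       VARCHAR(64) NOT NULL PRIMARY KEY,
    ACCOUNTING_SUPPORT     VARCHAR(3)  NOT NULL
        CHECK (ACCOUNTING_SUPPORT IN ('Yes', 'No')),
    ADMINISTRATION_SUPPORT VARCHAR(3)  NOT NULL
        CHECK (ADMINISTRATION_SUPPORT IN ('Yes', 'No')),
    AUTOMATION_SUPPORT     VARCHAR(3)  NOT NULL
        CHECK (AUTOMATION_SUPPORT IN ('Yes', 'No'))
    -- The remaining eleven support flags (Availability through
    -- Software Distribution) follow the same Yes/No pattern.
);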
Table 56. Automated Installation and Configuration Log
Field Name Description Type
Automated Record Type Contains the type of data. Values are Installation or Configuration. This is the first of four fields that make up the primary key.
Text
Target System Names the application for which this data applies. For example, HR Benefits. This is the second of four fields that make up the primary key.
Text
Date Date for this installation or configuration activity. This is the third of four fields that make up the primary key.
Date/Time
Time Time for this installation or configuration activity. This is the last of four fields that make up the primary key.
Date/Time
Target Servers Host names of the servers that make up the target system.
Text
Status Indicates the standing of this activity. Values include: - Successful - activity completed normally. - Unknown - status of this activity is unknown. - Unsuccessful - activity did not complete normally.
Text
Definitions Used Key of the definitions associated with this installation or configuration activity.
Text
Options Used Key of the options associated with this installation or configuration activity.
Text
Message Text Text of the message associated with this installation or configuration activity.
Text
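A typical use of the Automated Installation and Configuration Log is to list failed activities for follow-up. The query below is one plausible formulation; the underscore identifiers (including ACTIVITY_DATE and ACTIVITY_TIME standing in for the Date and Time fields) are assumed mappings, not names taken from the prototype.

-- Illustrative query: unsuccessful installation or configuration
-- activities, newest first, with their associated message text.
SELECT AUTOMATED_RECORD_TYPE, TARGET_SYSTEM,
       ACTIVITY_DATE, ACTIVITY_TIME, MESSAGE_TEXT
FROM AUTO_INSTALL_CONFIG_LOG
WHERE STATUS = 'Unsuccessful'
ORDER BY ACTIVITY_DATE DESC, ACTIVITY_TIME DESC;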
Table 57. Business Systems Definitions
Field Name Description Type
Business System
Names the business system for which this data applies, for example, VMS Systems Limited. This is the primary key.
Text
Application Names the applications that are part of this business system, for example, two applications would be defined--Value Markets/Markets Management.
Text
Views Names the application views that are part of this business system.
Text
Table 58. Change-Window Operations Log
Field Name Description Type
Name Contains the descriptive name given to the defined change window. This is the first of three fields that make up the primary key.
Text
Window Start Date Date for the start of this change window. This is the second of three fields that make up the primary key.
Date/Time
Window Start Time Time for the start of this change window. This is the last of three fields that make up the primary key.
Date/Time
Window End Date Date for the end of this change window.
Date/Time
Window End Time Time for the end of this change window.
Date/Time
Change Details Gives a brief description for the change window, for example, Content Push or Program Changes.
Text
Status Indicates the standing of this change window. Values include: - Cancelled - the planned change window was abandoned. - Completed - change window completed normally. - Extended - the change window completed, but the period for the window was extended beyond the planned time. - Planned - this change window will take place at a future date and time.
Text
Definitions Used Key of the definitions associated with this change window.
Text
Options Used Key of the options associated with this change window.
Text
Message Text Text of the message associated with this change window.
Text
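One survey participant asked for reports showing change history; the Change-Window Operations Log supports exactly that kind of report. A possible query, under the same illustrative naming assumptions as the earlier sketches (WINDOW_NAME stands in for the Name field):

-- Illustrative change-history report: completed (or extended) change
-- windows, most recent first.
SELECT WINDOW_NAME, WINDOW_START_DATE, WINDOW_START_TIME,
       WINDOW_END_DATE, WINDOW_END_TIME, CHANGE_DETAILS, STATUS
FROM CHANGE_WINDOW_OPS_LOG
WHERE STATUS IN ('Completed', 'Extended')
ORDER BY WINDOW_START_DATE DESC, WINDOW_START_TIME DESC;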
Table 59. Configuration Verification Log
Field Name Description Type
Application Names the application for which this data applies. For example, Order Marketplace. This is the first of four fields that make up the primary key.
Text
Domain Pair Names the domains for which this data applies. For example, Test-Production. This is the second of four fields that make up the primary key.
Text
Verification Date Date for this verification data. This is the third of four fields that make up the primary key.
Date/Time
Verification Time Time for this verification data. This is the last of four fields that make up the primary key.
Date/Time
Total Number of Exceptions
Count of the total number of exceptions.
Number
Program Mismatch Number of programs that do not match between these domains.
Number
File Mismatch Number of files that do not match between these domains.
Number
Directory Mismatch Number of Directories that do not match between these domains.
Number
Other Mismatch Number of other resource types that do not match between these domains.
Number
Message Text Text of the message associated with this configuration verification data.
Text
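Since the Total Number of Exceptions field should equal the sum of the four mismatch counts, a simple integrity query can flag inconsistent verification records. The sketch below again assumes illustrative identifiers:

-- Illustrative integrity check: verification records whose stored
-- exception total disagrees with the sum of the mismatch counts.
SELECT APPLICATION, DOMAIN_PAIR, VERIFICATION_DATE, VERIFICATION_TIME
FROM CONFIG_VERIFICATION_LOG
WHERE TOTAL_EXCEPTIONS <>
      PROGRAM_MISMATCH + FILE_MISMATCH + DIRECTORY_MISMATCH + OTHER_MISMATCH;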
Table 60. Deep View Application Resources
Field Name Description Type
Application Names the application for which this data applies. For example, Value Market. This is the first of two fields that make up the primary key.
Text
Domain Names the domain for which this data applies. For example, Verification. This is the second of two fields that make up the primary key.
Text
Accounting: Primary Billing Application
Names the primary application used to bill for this application and domain.
Text
Accounting: Accounting Options
Lists the accounting options in effect for this application and domain.
Text
Accounting: Last Usage Record
Identifies the most recent usage record written for this application.
Text
Accounting: Major Account Names the major account to which this application's usage is charged.
Text
Administration: Last Install Date
Date of last installation activity.
Date/Time
Administration: Last Install Time
Time of last installation activity.
Date/Time
Administration: Last Install Status
Status of last installation activity, for example, Successful.
Text
Administration: Last Configuration Date
Date of last configuration activity.
Date/Time
Administration: Last Configuration Time
Time of last configuration activity.
Date/Time
Administration: Last Deployment Date
Date of last deployment activity.
Date/Time
Administration: Last Deployment Time
Time of last deployment activity.
Date/Time
Administration: Last Deployment Status
Status of last deployment activity, for example, Unsuccessful.
Text
Automation: Actions Enabled
Defines if automation is desired, for example, Yes or No.
Text
Automation: Attempted Actions
Number of automation attempts.
Number
Automation: Successful Actions
Number of successful actions.
Number
Availability: Application Faults
Number of application faults.
Number
Availability: Database Faults
Number of database faults. Number
Availability: Middleware Faults
Number of middleware faults.
Number
Availability: Network Faults
Number of network faults.
Number
Availability: Switch Faults Number of switch faults.
Number
Availability: Operating System Faults
Number of operating system faults.
Number
Availability: Hardware Faults
Number of hardware faults.
Number
Availability: User Defined Faults
Number of user defined faults.
Number
Business: Part of What Business System
Name of the business systems of which this application is a part.
Text
Business: Logical View Name of the logical view that contains resources for this application.
Text
Business: Physical View Name of the physical view that contains resources for this application.
Text
Business: Application Components
Number of application components that are depicted on business systems views.
Number
Business: Database Components
Number of database components that are depicted on business systems views.
Number
Business: Middleware Components
Number of middleware components that are depicted on business systems views.
Number
Business: Current Status Information about the status of business system views, for example, Two Views Active.
Text
Capacity: Disk Faults Count of disk faults for this application.
Number
Capacity: Memory Faults Count of memory faults for this application.
Number
Capacity: Processor Faults Count of processor faults for this application.
Number
Capacity: I/O Faults Count of I/O faults for this application.
Number
Change: Type Description of the kind of change, for example, Previous or Next.
Text
Change: Name Name given to the change when it was created, for example, C030802SpecReq.
Text
Change: Window Start Date Date that the change started or is scheduled to start.
Date/Time
Change: Window Start Time
Time that the change started or is scheduled to start.
Date/Time
Change: Window End Date Date that the change ended or is scheduled to end.
Date/Time
Change: Window End Time Time that the change ended or is scheduled to end.
Date/Time
Change: Details Narrative that explains the reason for the change, for example, Content Push.
Text
Change: Status Status of this change window, for example Completed or Planned.
Text
Fault: Total Faults Total number of faults for this application.
Number
Fault: Transferred Closed Number of faults that were transferred to the problem management system in closed status.
Number
Fault: Transferred Open Number of faults that were transferred to the problem management system in open status.
Number
Fault: Examined Number of faults examined using the Specific Fault view.
Number
Fault: Not Yet Examined Number of faults not yet examined using the Specific Fault view.
Number
Fault: Average Per Day Average number of faults per day for this application.
Number
Operations: Job Scheduling Name of job scheduling software in use.
Text
Operations: Output management
Name of output management software in use.
Text
Operations: Help Desk Name of primary help desk view.
Text
Operations: Backup and Restore
Status of last three backup and restore operations.
Text
Performance: Current Indicator
Summarizes the current performance status, for example, Switch+Processor, which indicates that there have been recent performance exceptions regarding switches and processors.
Text
Performance: Previous Indicator
Summarizes the previous performance status.
Text
Performance: Previous -1 Indicator
Summarizes the performance status from 2 periods past.
Text
Performance: Previous -2 Indicator
Summarizes the performance status from 3 periods past.
Text
Problem: Current Problem Record
Supplies the name of the most recent problem record for this application.
Text
Problem: Previous Problem Record
Supplies the name of the second most recent problem record for this application.
Text
Problem: Previous -1 Problem Record
Supplies the name of the third most recent problem record for this application.
Text
Problem: Previous -2 Problem Record
Supplies the name of the fourth most recent problem record for this application.
Text
Problem: Previous -3 Problem Record
Supplies the name of the fifth most recent problem record for this application.
Text
Security: Classification Classification for this application, for example, Private or Public.
Text
Security: Violations Number of detected violations.
Number
Security: Unauthorized Changes
Number of detected unauthorized changes.
Number
Security: Back End Administrative Access
Indicates if administrative access is utilized over a dedicated leased line or Virtual Private Network (VPN).
Text
Service Level: Type of Application
Indicates if the agreement for this application is a service level objective or service level agreement contract.
Text
Software Distribution: Usage
Specifies if software distribution is supported for this application.
Text
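Because the Deep View table consolidates one row of summary data per application and domain, a single SELECT can answer questions that would otherwise require joining several logs. For example, a hedged sketch (identifiers assumed) that surfaces applications with unexamined faults in a Production domain:

-- Illustrative query against the Deep View table: applications in the
-- Production domain with faults not yet examined, worst first.
SELECT APPLICATION, FAULT_TOTAL_FAULTS, FAULT_NOT_YET_EXAMINED,
       FAULT_AVERAGE_PER_DAY
FROM DEEP_VIEW_APP_RESOURCES
WHERE DOMAIN = 'Production'
  AND FAULT_NOT_YET_EXAMINED > 0
ORDER BY FAULT_NOT_YET_EXAMINED DESC;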
Table 61. Deployment Status Log
Field Name Description Type
Application Names the application for which this data applies. For example, HR Benefits. This is the first of four fields that make up the primary key.
Text
Domain Names the domain for which this data applies. For example, Verification. This is the second of four fields that make up the primary key.
Text
Date Date for this deployment activity. This is the third of four fields that make up the primary key.
Date/Time
Time Time for this deployment activity. This is the last of four fields that make up the primary key.
Date/Time
Domain Servers Host names of the servers that make up the deployment domain.
Text
Status Indicates the standing of this activity. Values include: - Successful - activity completed normally. - Unknown - status of this activity is unknown. - Unsuccessful - activity did not complete normally.
Text
Utility Name of the utility that is associated with this deployment operation.
Text
Operation Operation performed by the utility program associated with this deployment activity.
Text
Message Text Text of the message associated with this deployment activity.
Text
Table 62. Detailed Data
Field Name Description Type
Identifier Contains the fault text of the related fault. This is the primary key.
Text
Cause Description Describes the cause of the fault.
Text
Specified Action Indicates the action that the administrator should take.
Text
Long Term Recommendation
Indicates the actions that are strategic to fixing the root cause of the fault.
Text
Contact Information Describes the contacts for the application and also typically contains a URL that can be used to make specific contacts with the support team.
Text
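Since the Detailed Data identifier is the fault text itself, the administrator guidance for a given fault is a direct keyed lookup, roughly as below. The fault text literal is purely hypothetical.

-- Illustrative keyed lookup of administrator guidance for one fault.
SELECT CAUSE_DESCRIPTION, SPECIFIED_ACTION,
       LONG_TERM_RECOMMENDATION, CONTACT_INFORMATION
FROM DETAILED_DATA
WHERE IDENTIFIER = 'DB2 deadlock detected: SQLCODE -911';  -- hypothetical fault text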
Table 63. Resource Modeling Log
Field Name Description Type
Server Name Contains the host name of the server. This is the first of four fields that make up the primary key.
Text
IP Address Internet Protocol address of the server. This is the second of four fields that make up the primary key.
Text
Exception Date Date when the exception was detected. This is the third of four fields that make up the primary key.
Date/Time
Exception Time Time when the exception was detected. This is the last of four fields that make up the primary key.
Date/Time
Exception Type Indicates the specific type of exception. Values include: - Disk - Memory - Processor - I/O
Text
Table 64. Resource Modeling Monitoring Input
Field Name Description Type
Application Name Contains the name of the application for which this definition applies. This is the primary key.
Text
Disk Exception Definition Describes a disk exception, for example, Total Of 80 Percent Full.
Text
Memory Exception Definition
Describes a memory exception, for example, Total Of 90 Percent Full.
Text
Processor Exception Definition
Describes a processor exception, for example, Total Of 50 Percent Busy.
Text
I/O Exception Definition Describes an I/O exception, for example, Total Of 50 Percent Busy.
Text
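Exceptions written to the Resource Modeling Log against the thresholds defined above can then be summarized per server, for example as follows (DB2-style date arithmetic; identifiers assumed):

-- Illustrative query: count of resource-modeling exceptions by server
-- and exception type over the last 7 days.
SELECT SERVER_NAME, EXCEPTION_TYPE, COUNT(*) AS EXCEPTION_COUNT
FROM RESOURCE_MODELING_LOG
WHERE EXCEPTION_DATE >= CURRENT DATE - 7 DAYS
GROUP BY SERVER_NAME, EXCEPTION_TYPE
ORDER BY EXCEPTION_COUNT DESC;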
Table 65. SLO/SLA Definitions
Field Name Description Type
Application Name Contains the name of the application for which this definition applies. This is the primary key.
Text
Application Type Defines the kind of support, SLO or SLA, for this application.
Text
Collections Defined Indicates which data collections are defined for this application. Values include: - URL=Yes/No - Application=Yes/No - Database=Yes/No - Middleware=Yes/No - Network=Yes/No - Server=Yes/No - Hardware=Yes/No - Detailed Logging=Yes/No - External Logging=Yes/No
Text
Table 66. SLO/SLA Log
Field Name Description Type
Application Name Contains the name of the application for which this data applies. This is the first of four fields that make up the primary key.
Text
Date Date of this observation. This is the second of four fields that make up the primary key.
Date/Time
Time Time of this observation. This is the third of four fields that make up the primary key.
Date/Time
Observation Type Indicates the type of this observation. Values include: - URL - Application - Database - Middleware - Server - Hardware This is the last of four fields that make up the primary key.
Text
Observation State Indicates the status of this observation. Values include: - Available - Unavailable - Unknown
Text
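Availability against an SLO or SLA can be computed directly from the SLO/SLA Log by taking the share of observations in the Available state. One plausible formulation, with assumed identifiers:

-- Illustrative availability calculation: percentage of URL observations
-- in the Available state, per application.
SELECT APPLICATION_NAME,
       100.0 * SUM(CASE WHEN OBSERVATION_STATE = 'Available'
                        THEN 1 ELSE 0 END) / COUNT(*) AS PCT_AVAILABLE
FROM SLO_SLA_LOG
WHERE OBSERVATION_TYPE = 'URL'
GROUP BY APPLICATION_NAME;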
Table 67. Specific Fault Data
Field Name Description Type
Application Contains the name of the Web application. This is the first of three fields that make up the primary key.
Text
Fault Date Date when the fault was detected. This is the second of three fields that make up the primary key.
Date/Time
Fault Time Time when the fault was detected. This is the third of three fields that make up the primary key.
Date/Time
Source Indicates the primary basis for the fault.
Text
Sub Source Indicates the secondary basis for the fault.
Text
IP Origin Specifies the primary Internet Protocol address of the host for this fault.
Text
IP Sub Origin Specifies the secondary Internet Protocol address of the host for this fault.
Text
Repeat Count Specifies the number of duplicate faults of the same type for this application.
Number
Status Indicates the standing of this fault. Values include: - Closed - fault is no longer active. - Open - fault is actively being investigated. - TSME - fault has been transferred to a subject matter expert. - Unassigned - fault has not yet been reviewed.
Text
Administrator Contains the name of the individual or group that is handling the fault.
Text
Severity Indicates the importance of this fault. Values include: - Critical - fault is a significant problem for the application. - Important - fault is most probably a significant problem for the application. - Informational - fault is presented as general interest information. - Unknown - the importance of this fault to the application has not been ranked.
Text
Fault Text Descriptive text that explains the fault.
Text
Application Programs Contains 5 text sub fields including: - Name of first active program. - Name of second active program. - Name of third active program. - Name of fourth active program. - Name of fifth active program.
Text
Application Processes
Contains 5 text sub fields including: - Process name/status of first process. - Process name/status of second process. - Process name/status of third process. - Process name/status of fourth process. - Process name/status of fifth process.
Text
Database Programs Contains 5 text sub fields including: - Name of first active program. - Name of second active program. - Name of third active program. - Name of fourth active program. - Name of fifth active program.
Text
Database Processes Contains 5 text sub fields including: - Process name/status of first process. - Process name/status of second process. - Process name/status of third process. - Process name/status of fourth process. - Process name/status of fifth process.
Text
Database Tables Contains 5 text sub fields including: - Name/status of first table. - Name/status of second table. - Name/status of third table. - Name/status of fourth table. - Name/status of fifth table.
Text
Database Rows Contains 5 text sub fields including: - Key of first active row. - Key of second active row. - Key of third active row. - Key of fourth active row. - Key of fifth active row.
Text
Middleware Programs Contains 5 text sub fields including: - Name of first active program. - Name of second active program. - Name of third active program. - Name of fourth active program. - Name of fifth active program.
Text
Middleware Processes Contains 5 text sub fields including: - Process name/status of first process. - Process name/status of second process. - Process name/status of third process. - Process name/status of fourth process. - Process name/status of fifth process.
Text
Middleware Queue Contains 5 text sub fields including: - Name/status of first active queue. - Name/status of second active queue. - Name/status of third active queue. - Name/status of fourth active queue. - Name/status of fifth active queue.
Text
Middleware Records Contains 5 text sub fields including: - Key of first active queue record. - Key of second active queue record. - Key of third active queue record. - Key of fourth active queue record. - Key of fifth active queue record.
Text
Network Programs and Processes
Contains 5 text sub fields including: - Process name/status of first process. - Process name/status of second process. - Process name/status of third process. - Process name/status of fourth process. - Process name/status of fifth process.
Text
Operating Systems Information
Contains 2 text sub fields including: - Level of OS. - Overall System Status.
Text
System Resources Contains 10 numeric sub fields including: - Percentage of CPU Being Utilized. - Packets Per Second. - Pages Per Second. - Swaps Per Second. - Interrupts Per Second. - Disk Transfers Per Second. - Context Switches Per Second. - Runnable Processes Last Minute. - Collisions Per Second. - Errors Per Second.
Number
Name of Problem Management System
Specifies the symbolic name of the problem management system to which this record was transferred. This data is supplied when administrative actions are taken.
Text
Actions Taken Summarizes the actions taken by the administrator.
Text
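Because Fault Text in this table matches the Detailed Data identifier, the Specific Fault view can be backed by a join that pairs each open fault with its cause and recommended action. A hedged sketch, with assumed identifiers and aliases:

-- Illustrative join: open critical faults together with their cause
-- description and specified action from the Detailed Data table.
SELECT F.APPLICATION, F.FAULT_DATE, F.FAULT_TIME, F.FAULT_TEXT,
       D.CAUSE_DESCRIPTION, D.SPECIFIED_ACTION
FROM SPECIFIC_FAULT_DATA F
JOIN DETAILED_DATA D ON D.IDENTIFIER = F.FAULT_TEXT
WHERE F.STATUS = 'Open'
  AND F.SEVERITY = 'Critical';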
Table 68. Unauthorized Change Detection Log
Field Name Description Type
Application Names the application for which this data applies. For example, Order Marketplace. This is the first of four fields that make up the primary key.
Text
Domain Names the domain for which this data applies. For example, Production. This is the second of four fields that make up the primary key.
Text
Date Date for this unauthorized change. This is the third of four fields that make up the primary key.
Date/Time
Time Time for this unauthorized change. This is the last of four fields that make up the primary key.
Date/Time
Type Indicates the kind of resource associated with this unauthorized change. Values include: - Dir - Program - File - User supplied
Text
Name Physical name of the Dir, Program, File, or User supplied resource.
Text
Created/Modified/Accessed Date
Dates, if available, from the operating system for the specific resource.
Date/Time
Comment Brief narrative manually added after research performed on this unauthorized change.
Text
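Finally, the Unauthorized Change Detection Log lends itself to a periodic security report, for example listing recent unauthorized changes in the Production domain. As before, the identifiers are assumed; CHANGE_DATE and CHANGE_TIME stand in for the Date and Time fields, and RESOURCE_TYPE and RESOURCE_NAME for the Type and Name fields.

-- Illustrative security report: unauthorized changes detected in the
-- Production domain during the last 30 days, newest first.
SELECT APPLICATION, CHANGE_DATE, CHANGE_TIME, RESOURCE_TYPE, RESOURCE_NAME
FROM UNAUTH_CHANGE_LOG
WHERE DOMAIN = 'Production'
  AND CHANGE_DATE >= CURRENT DATE - 30 DAYS
ORDER BY CHANGE_DATE DESC, CHANGE_TIME DESC;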
428
Reference List
A framework for global electronic commerce. United States White House. (1997). Retrieved November 24, 2002, from http://www.w3.org/TR/NOTE-framework- 970706.html
Aghdaie, N., & Tamir, Y. (2001). Client transparent fault tolerant Web service. Proceedings of the 20th IEEE International Performance, Computing, and Communications Conference (IPCCC 2001), Phoenix, AZ, 209-216.
Ahn, S., Yoo, S., & Chung, J. (1999). Design and implementation of a Web based Internet
performance management system using SNMP MIB II. International Journal of Network Management, 9(5), 309-321.
Ahrens, K., Birkner, G., Gulla, J., & McKay, J. (2002). Web Application Availability and
Problem Determination - Case Studies from the South Delivery Center. Proceedings of the Academy of Technology High Availability Best Practices Conference, July 10-12, 2002, Ontario, Canada.
AIX LAN Management Utilities User’s Guide. (1995). Research Triangle Park, NC: IBM
Corporation. Aldrich, S. (1998). Application management? Open Information Systems, 13(6), 1 Amanda, the advanced Maryland automatic network disk archiver. (2002). Retrieved
October 6, 2002 from http://www.amanda.org/ Anderson, G., & James, P. (1998). Rules of the WAN. Network World, 41(15), 48. Anthes, G. (1992). Legent reveals plan for merging Goal products. Computerworld, 26
(33), 14. Application management: A crisis in enterprise client/server computing. (1996). White
paper. Framington, MA: Hurwitz Consulting Group, Inc. Application management MIB, IETF request for comments: 2564. (1999). Retrieved October 6, 2002, from http://www.ietf.org/rfc/rfc2564.txt?number=2564 Application sizing, capacity planning and data placement tools for parallel
databases. (1996). Retrieved October 6, 2002 from http://www.cee.hw.ac.uk/Databases/tools/
Applications and namespaces. (2001). Retrieved October 6, 2002, from
http://www.dmtf.org/education/cimtutorial/extend/apps.php Applications Management Specification. (1995). Austin, TX: Tivoli Systems.
429
Applications Management Specification Version 2.0: A DMTF Common Information Model Based Approach to Application Management. (1997). Austin, TX: Tivoli Systems. Aragon, L. (1997). Take the plunge with ESD. PC Week, 32A(14), 5-6. AS/400 and iSeries. (2001). Retrieved October 6, 2002, from http://www.arcadsoftware.com/index.php?lang=en&page=skipper_suite Atkinson, R., Hawkins, P., Hills, P., Woollons, D., Clearwaters, W., & Czaja, R. (1994). Application management in a distributed, object-oriented condition monitoring system.
Proceedings of the 1994 Engineering Systems Design and Analysis Conference, London, England, 64(6), 167-174.
Attardi, G., Cisternino, A., & Simi, M. (1998). Web based configuration assistants.
Artificial Intelligence Engineer Design Analysis and Manufacturing, 12(4), 321-331. Banga, G., & Druschel, P. (1999). Measuring the capacity of a Web server under realistic
loads. World Wide Web, 2(1-2), 69-83. Barruffi, R., Milano, M., & Montanari, R. (2001). Planning for security management.
IEEE Intelligent Systems, 16(1), 74-80.
Bartal, Y., Mayer, A., Nissim, K., & Wool, A. (1999). Firmato: A novel firewall management toolkit. Proceedings of the 19th IEEE Computer Society Symposium on Security and Privacy, Oakland, CA, 17-31.
Bauer, M., Bunt, R., El Rayess, A., Finnigan, P., Kunz, T., Lutfiyya, H., Marshall, A., Martin, P. Oster, G., Powley, W., Rolia, J., Taylor, D., & Woodside, M. (1997). Services supporting management of distributed applications and systems. IBM Systems Journal, 36(4), 508-526. Bauer, M., Coburn, N., Erickson, D., Finnigan, P., Hong, J., Larson, P., Pachi, J., Slonim, J., Taylor, D., & Teorey, T. (1994). A distributed system architecture for a distributed application environment. IBM Systems Journal, 33(3), 399-425. Bauer, M., Finnigan, P., Hong, J., Rolia, J., Teorey, T., & Winters, G. (1994).
Reference architecture for distributed systems management. IBM Systems Journal, 33(3), 426-444.
Bauer, M., Lutfiyya, H., Black, J., Kunz, T., Taylor, D., Bunt, R., Eager, D., Rolia, J., Woodside, C., Hong, J., Martin, T., Finnigan, P. & Teorey, T. (1995) MANDAS: Management of distributed applications. Proceedings of the 5th IEEE Computer Society Workshop on Future Trends of Distributed Computing Systems, Cheju Island, Republic of Korea, 200-206.
430
Best practices in enterprise management. (1998). White paper. Retrieved October 6, 2002 from http://www.meritproject.com/white_papers.htm
Bladergroen, D., Maas, B., Dullaart, L., Kalfsterman, J., Koppens, A., Mameren, A., &
Veen, R. (1998). Delivering IT Services. Utrecht, The Netherlands: Kluwer BedrijfsInformatie.
BMC software solutions for SAP environments. (2001). White paper. Houston, TX: BMC Software, Inc. Boardman, B. (1999). Managing your enterprise piece by piece. (1999). Retrieved
October 6, 2002, from http://www.networkcomputing.com/1010/1010f177.html Boardman, B. (1999). The double edged side of ESM. CMP Media Ltd., 4(20), 26, 28-29. Boloix, G., & Robillard, P. (1995). A software system evaluation framework. Computer,
28(12), 17-26. Britton, C. (2001). IT Architectures and Middleware. Upper Saddle River, NJ: Addison
Wesley. Byrd, J. (1997). A basic UNIX tutorial. Retrieved October 6, 2002, from
http://www.isu.edu/comcom/workshops/unix/index.html CandleNet ETEWatch. (n.d.). Retrieved October 6, 2002, from
http://www.candle.com/www1/cnd/portal/CNDportal_Channel_Master/0,2179,2683_2919,00.html
Carter, C., Whyte, I., Birchall, S., & Swatman, D. (1997). Rapid integration and testing of business solutions. BT Technology, 15(3), 37-47. Central dispatch. (n.d.). Retrieved October 6, 2002 from
http://www.resonate.com/news/press_releases/10_23_01cd40.php Cerutti, D., & Pierson, D. (1993). Distributed Computing Environments. New York:
McGraw-Hill, Inc. Chin, K. (1995). JAD experience. Proceedings of the ACM SIGCPR Conference, New
York, 235-236. Chin, W., Ramachandran, V., & Cheng, C. (2000). Evidence sets approach for Web fault
diagnosis. Malaysian Journal of Computer Science, 13(1), 84-89.
Christensen, K., & Javagal, N. (1997). Prediction of future world wide Web traffic characteristics for capacity planning. International Journal of Network Management, 7(5), 264-276.
431
Cleland, D., & Gareis, R. (1994). Global Project Management Handbook. New York: McGraw-Hill, Inc.
Client response time monitoring world-wide world-class performance measurement
tool. (1998). Research Triangle Park, NC: IBM Corporation. Unpublished manuscript. Cochran, H. (2000). Web developers: Manage change or fail. Application Development
Trends, 7(12), 59-62.
Compaq TeMIP version 4.0 for Tru64 UNIX. (2000). Retrieved October 6, 2002, from http://www.compaq.com/info/SP5417/SP5417PF.PDF
Computer associates: Enterprise management strategy white paper. (1997). Retrieved October 6, 2002 from http://www.cai.com/products/unicent/whitepap.htm Compuware Abend-AID products home page. (2002). Retrieved October 6, 2002, from http://www.compuware.com/products/abendaid/ Compuware Vantage products home. (2002). Retrieved October 6, 2002, from http://www.compuware.com/products/vantage/ Consulting partners a to z. (2000). Retrieved October 6, 2002, from
http://www.ca.com/services/partners/partner_az.htm
ContentMover. (1999). Chelmsford, MA: WebManage Technologies, Inc. Continuus/CM: Change management for software development. (2001). White paper.
Malmö, Sweden: Telelogic AB. CONTROL: Enterprise Web application management. (1999). White paper. San
Francisco, CA: Eventus Software, Inc. CONTROL overview. (1999). White paper. San Francisco, CA: Eventus Software, Inc. Cover, R. (2000). DMTF common information model (CIM). Retrieved October 6,
2002, from http://www.oasis-open.org/cover/dmtf-cim.html Curtis, R. (1997). A Web based configuration control system for team projects.
Proceedings of 28th SIGCSE Technical Symposium on Computer Science Education, San Jose, CA, 29(1), 189-193.
Daniels, A., & Yeates, D. (1971). Systems Analysis. Palo Alto, CA: Science Research
Associates, Inc. Dart, S. (1994). Adopting an automated configuration management solution.
Paper presented at Software Technology Center '94, Salt Lake City, UT.
432
Dart, S. & Krasnov, J. (1995). Experiences in risk mitigation with configuration management. Paper presented at the 4th Conference on Software Risk, Monterey, CA.
Database 2 Messages Reference. (1995). New York: IBM Corporation.
Day, B. (1992). Implementing automated operations - A user's experience. Capacity Management Review, 20(11), 5-7.
DCE-RPC interoperability (XDSA-DCE) - Introduction. (1997). Retrieved October 6,
2002 from http://www.opengroup.org/onlinepubs/009656999/chap1.htm DCE overview. (1996). Retrieved October 6, 2002, from
http://www.opengroup.org/dce/info/papers/tog-dce-pd-1296.htm Debar, H., Huang, M., & Donahoo, D. (1999). Intrusion detection exchange format
data model. Retrieved October 6, 2002, from http://www.ietf.org/proceedings/99nov/I-D/draft-ietf-idwg-data-model-00.txt
Desmond, J. (1990). Culture clash at Research Triangle Park. Software Magazine, 10(14), 47-51.
Developing Visio Solutions. (1997). Seattle, WA: Visio Corporation. Dictionary of Computing. (1987). Poughkeepsie, NY: IBM Corporation. Distributed management task force, inc. (1999). Retrieved November 24, 2002, from http://www.dmtf.org/about/index.php DMTF standards and specifications: Understanding the application management
model. (1998). Retrieved February 24, 2002, from http://www.dmtf.org/standards/index.php
Dr. ecommerce answers your questions - European commission information society
directorate general. (2000). Retrieved February 24, 2002, from http://europa.eu.int/ISPO/ecommerce/drecommerce/answers/000014.html
DSL: Copper mountain and Xedia partner to deliver traffic shaping and management control in digital subscriber line networks. Offers carriers and DSL service providers IP quality of service for Internet access. (1998). EDGE: Work-Group Computing Report, 31(8), 1-3.
E-Commerce Construction Kit User's Guide. (2001). Belmont MA: Boomerang
Software, Inc.
433
Elder-Vass, D. (2000). MVS Systems Programming. Retrieved February 24, 2002, from http://www.mvsbook.fsnet.co.uk/index.htm
Eloff, M., & Von Solms, S. (2000). Information security management: A hierarchical
framework for various approaches. Computers and Security, 19(3), 243. Endler, M., & Souza, A. (1996) Supporting distributed application management in Sampa.
Proceedings of the Third International Conference on Configurable Distributed Systems, Annapolis, MD, 177-184.
Enterprise distributed computing. (2000). Retrieved February 24, 2002, from http://www.dstc.edu.au/Research/Projects/EDOC/ODPEL.html Enterprise reporter. (1999). Chelmsford, MA: WebManage Technologies, Inc. Environmental Record Editing and Printing Program. (1999). Poughkeepsie, NY: IBM
Corporation.
Event management and notification - White paper. (2000). Houston, TX: BMC Software, Inc. Evidian products. (2001). Retrieved February 24, 2002 from
http://www.ism.bull.net/products/
Fearn, P., Berlen, A., Boyce, B. & Krupa, D. (1999). The Systems Management Solution Life Cycle. Research Triangle Park, NC: IBM Corporation.
Feit, S. (1996). TCP/IP Architecture, Protocols, and Implementation with IP v6 and IP
Security. New York: McGraw-Hill. Finkel, A., & Calo, S. (1992). RODM: A control information base. IBM Systems Journal,
31(2), 252-269. First annual report. United States Government Working Group on Electronic Commerce (1998). Retrieved February 20, 2002, from http://www.ecommerce.gov/usdocume.htm Flanagan, P. (1996). 10 hottest technologies in telecom. Telecommunications, 30(5),
29-38. Foote, S. (1997a). Managing applications. DBMS, 10(11), 52-54, 56, 60, 62. Foote, S. (1997b). Managing applications in a wired world. Retrieved November 14,
2002 from http://www.novadigm.com/pdf/hurwitz.pdf Fosdick, H. (1998). Performance monitoring's cutting edge. Database Programming & Design, 5(11), 50-56.
434
FREE DESK - Help desk software. (2000). Retrieved February 24, 2002, from http://freedesk.wlu.edu/
Freed, N., & Kille, S. (1998). Network services monitoring MIB. Retrieved February 24,
2002 from http://www.faqs.org/rfcs/rfc2248.html Frick, Vaughn. (2000). Transforming the enterprise to embrace e-business. (Available
from Gartner, Inc. 56 Top Gallant Road, Stamford, CT 06904) Gaffaney, N., & Carlin, N. (1998). Implementing business systems management with
global enterprise manager. The Managed View, 2(2), 55-70. Gallagher, S. (1998). ProVision: The unframework. Retrieved February 28, 2002, from
http://www.informationweek.com/673/73olplt.htm Garg, A. (1998). Is it the network or the application? (Available from Enterprise
Management Associates 2108 55th Street, Suite 110 Boulder, CO 80301) Garg, A., & Schmidt, R. (1999). Get proactive (network performance management).
Communication News, 36(9), 73-74. Geschickter, C. (1996a). Applications management: An unmet user requirement. Framington, MA: Hurwitz Group, Inc. Geschickter, C. (1996b). Application management defined: The application dependency
stack. Framington, MA: Hurwitz Group, Inc. Gillooly, C. (1999). E-business management: Solving the next management challenge.
White paper. Framington, MA: Hurwitz Consulting Group, Inc. Gilly, D. (1994). UNIX in a Nutshell. Sebastopol, CA: O'Reilly & Associates. Goedicke, M., & Meyer, T. (1999). Web based tool support for dynamic management of
distribution and parallelism in integrating architectural design and performance evaluation. Proceedings of the International Symposium on Software Engineering for Parallel and Distributed Systems, Los Angeles, CA, 156-163.
Gulla, J. (1991). Multiple virtual storage (MVS) concepts, job control language (JCL),
and utilities: Student handout. Wayne, PA: International Business Machines. Unpublished educational materials.
Gulla, J. (1997). Ethernet local area networks - Their relationship to network management in the context of a computer network. Unpublished manuscript.
Gulla, J., & Hankins, J. (2001). Web site monitoring and management perspectives: A
readiness-evaluation methodology. Retrieved November 24, 2002, from http://www.isoc.org/inet2001/CD_proceedings/index.shtml
435
Gulla, J. & Hankins, J. (2002). Ensuring High Web Application Availability Through An Effective Monitoring Framework That Combines Both Empirical and Experiential Dimensions. Proceedings of the IBM Academy of Technology High Availability Best Practices Conference, July 10-12, 2002, Ontario, Canada.
Gulla, J., & Siebert, E. (2001). Monitoring implementation planning: A key activity to
bridge engagement and transition for the monitoring of a customer's Web site. Poster Session presented at the IBM Professional Leadership Technical Exchange, San Francisco, CA.
Gulla, J., & Warren, R. (1998). Deploying a business system solution. Proceedings of the
Planet Tivoli Conference, May 18-21, 1998, Orlando, FL. Gumbold, M. (1996). Software distribution by reliable multicast. Proceedings of LCN -
21st Annual Conference on Local Computer Networks, Minneapolis, MN, 222-231. Hahn, K., & Bruck, R. (1999). Web based design tools for MEMS process configuration.
Proceedings of International Conference on Modeling and Simulation of Microsystems, San Juan, Puerto Rico, 346-349.
Harikian, V., Blust, B., Campbell, M., Cooke, S., Foley, R., Gulla, J., Gayo, F., Howlette, M., Mosher, L., & O'Mara, M. (1996). Distributed Systems Management Design Guidelines: The Smart Way to Design. Research Triangle Park: International Business Machines. Hariri, S., & Mutlu, H. (1995). Hierarchical modeling of availability in distributed systems. IEEE Transactions on Software Engineering, 21(1), 50-56. Hellerstein, J., Zhang, F., & Shahabuddin, P. (1998). Characterizing normal operation of a
Web server: Application to workload forecasting and problem detection. CMG Proceedings, Turnersville, NJ, 1, 150-160.
Help desk, Web call center and diagnostic software. (2001). Retrieved February 27,
2002, from http://www.support.com/solutions/products/productsoverview.asp Hodges, J. (2000). An LDAP roadmap & FAQ. Retrieved February 20, 2002, from
http://www.kingsmountain.com/ldapRoadmap.shtml Hong, J., Gee, G., & Bauer, M. (1995). Towards automating instrumentation of systems
and applications for management. Proceedings of GLOBECOM '95, Communications for Global Harmony, Singapore, 1, 107-111.
436
Hong, J., Katchabaw, M., Bauer, M., & Lutfiyya, H. (1995). Modeling and management of distributed applications and services using the OSI management framework. Proceedings of Information Highways from a Smaller World and Better Living '95, 12th Annual International Conference on Computer Communication, Seoul, South Korea, 215-220.
Horrocks, I. (2001). Security training: Education for an emerging profession (Is security
management a profession?). Computers and Security, 20(3), 219-226. Horwitt, E. (2000). CIM creeps ever closer. Retrieved February 26, 2002, from
http://www.nwfusion.com/news/1999/0621cim.html Hosted help desk - Web-based help desk software application service provider.
(2000). Retrieved February 26, 2002, from http://www.hostedhelpdesk.com/ Hough, D. (1993). Rapid delivery: an evolutionary approach for application development. IBM Systems Journal, 32(3), 397-419. How to collect Chassis information (including the Chassis serial number) for routers and
Catalyst switches using SNMP. (2002). Retrieved May 18, 2002, from http://www.cisco.com/warp/public/477/SNMP/chassis.shtml
HP OpenView directions. (1998). Retrieved February 26, 2002, from
http://www.openview.hp.com:80/pdfs/22.pdf Huang, G., Yee, W., & Mak, K. (2001). Development of a Web-based system for
engineering change management. Robotic Computer Integrated Manufacturing, 17(3), 255-267.
Huh, S., & Bae, K. (1999). Dynamic Web server construction on the Internet using a
change management framework. International Journal of Intelligent Systems in Accounting, Finance and Management, 8(1), 45-60.
Hurwitz, J. (1996). The application dependency stack. DBMS, 13(9), 8-9. Hurwitz, J. (1997). Not just technology: Organizational issues in developing an
applications management strategy. DBMS, 3(10), 10, 12. Hurwitz, J. (1998). World class business requires 100% application availability. White
paper. Framington, MA: Hurwitz Consulting Group, Inc. Hussain, D., & Hussain, K. (1985). Information Processing Systems for Management.
Homewood, IL: Richard D. Irwin, Inc. IBM Thinkpad 760 EL User's Guide. (1996). New York: IBM Corporation.
437
Information technology - Portable operating system interface - POSIX (r) - System administration. Part 2: Software administration. (1995). Retrieved February 20, 2002, from http://standards.ieee.org/reading/ieee/std_public/description/posix/1387.2-1995_desc.html
Integrated systems management. (2000). (Available from Global Communications,
Hanson Cooke, 1-3 Highbury Station Road, London N1 1SE, UK) Integration overview. (2001). Retrieved February 27, 2002
from http://www.tivoli.com/products/documents/whitepapers/io.html International Business Machines. (1996, May). External design specification. (Document
Number: D138.DOC). Quezon City, Philippines: Angelica Salvacion-Bala. Interscan Webmanager. (2000). Retrieved February 27, 2002, from
http://www.antivirus.com/products/Webmanager Introducing Windows 95. (1995). Seattle, WA: Microsoft Corporation. Irlbeck, B. (1992). Network and system automation and remote system operation. IBM
Systems Journal, 31(2), 206-222. ISO DP 7489/4: Information processing systems--ISO reference model--Part 4:
Management framework. (1986). Geneva, Switzerland: International Organization for Standardization.
ISO - International organization for standardization. (1999). Retrieved February 25,
2002, from http://www.iso.ch/iso/en/aboutiso/introduction/achievements.html Jackson, J., & McClellan, A. (1996). JAVA by Example. Mountain View, CA: Sun Microsystems, Inc. Jackson, R., & Embley, D. (1996). Using joint application design to develop readable formal specifications. Information and Software Technology, 38(10), 615-631. Jander, M. (1998). Clock watchers. Data Communications, 27(13), 75-80. JAVA management extensions home page. (1999). Retrieved February 20, 2002, from http://java.sun.com/products/JavaManagement/ JAVA management extensions white paper. (1999). Retrieved February 27, 2002, from http://java.sun.com/products/JavaManagement/wp/ Job scheduling server for Windows NT. (2001). Retrieved February 27, 2002, from
http://members.home.net/microwork/
Joining the IETF. (2000). Retrieved February 27, 2002, from http://www.ietf.org/join.html Jutla, D., Ma, S., Bodorik, P., & Wang, Y. (1999). WebTP: A benchmark for Web-based order management systems. Proceedings of the 32nd Hawaii International
Conference on System Sciences, HICSS-32, 341-351. Kalbfleisch, C., Krupczak, C., & Presuhn, R. (1999). Application management MIB.
Retrieved February 27, 2002, from http://www.faqs.org/rfcs/rfc2564.html Karpowski, W. (1999). Computer Associates Unicenter TNG framework. White paper.
Retrieved February 20, 2002, from http://www.neccomp.com/servers/osservmanage/TNGFrameworkWhitePaper.pdf
Katchabaw, M., Lutfiyya, H., Marshall, H., & Bauer, M. (1996). Policy-driven fault management in distributed systems. Proceedings of the Seventh International Symposium on Software Reliability Engineering, White Plains, NY, 236-245. Kay, A. (1999, September 13). Bottom-line management--Companies seek a business-
oriented view of their enterprise information systems. Information Week, 124. Keahey, K. (2000). A brief tutorial on CORBA. Retrieved February 20, 2002, from
http://www.cs.indiana.edu/hyplan/kksiazek/tuto.html Keynote perspective brochure: Assuring peak Web-site performance and quality of
service. (2000). Retrieved February 20, 2002, from http://www.keynote.com/solutions/html/resource_product_research_libr.html
Keynote systems services. (2000). Retrieved February 27, 2002, from
http://www.keynote.com/services/downloads/whitepapers/wp_ecommerce.html Kille, S. (1998). Why do I need a directory when I could use a relational database?
Retrieved February 27, 2002, from the Stanford University Web site: http://www.stanford.edu/%7Ehodges/talks/EMA98-DirectoryServicesRollout/Steve_Kille/index.htm
Kramer, J., Magee, J., Ng, K., & Sloman, M. (1993). The system architect’s assistant for design and construction of distributed systems. Proceedings of the Fourth Workshop on Future Trends of Distributed Computing Systems, Lisbon, Portugal,
284-290. Krapf, E. (2001). Check Point beefs up security software. Business Communications
Review, 31(4), 77. Kroenke, D., & Dolan, K. (1987). Business Computer Systems. Santa Cruz, CA: Mitchell
Publishing, Inc.
Krupczak, C., & Saperia, J. (1998). Definition of system-level managed objects for applications. Retrieved February 20, 2002, from ftp://ftp.isi.edu/in-notes/rfc2287.txt
Ku, H., Forslow, J., & Park, J. (2000). Web based configuration management architecture
for router networks. Proceedings of Network Operations and Management Symposium, The Networked Planet: Management Beyond 2000, Honolulu, HI, 173-186.
Kundtz, J. (1996). Implementing problem management processes at the helpdesk using the
business process method. Proceedings of the 1996 8th Annual Quest for Quality and Productivity in Health Service Conference, Norcross, GA, 350-356.
LAN Network Manager for OS/2 Reference. (1997). Research Triangle Park, NC: IBM
Corporation.
Leadership for the new millennium, delivering on digital progress and prosperity. The third annual report of the electronic commerce working group. (2001). Retrieved February 28, 2002, from http://www.ecommerce.gov/
Learn CIM. (1999). Retrieved February 27, 2002, from http://www.dmtf.org/education/cimtutorial.php Lendenmann, R., Nelson, J., Lara, C., & Selby, J. (1997). An Introduction to Tivoli's
TME 10. Austin, TX: IBM Corporation. Leoni, M., Trainotti, M., & Valerio, A. (1999). Applying software configuration
management in Web site development. Proceedings of Systems Configuration Management, 9th International Symposium, Toulouse, France, 1675, 34-37.
Levitt, J. (1997). Rating the push products. Informationweek, 628(28), 53-59. Lewis, L., & Ray, P. (1999). Service level management definition, architecture, and
research challenges. Proceedings of Global Telecommunications Conference, GLOBECOM'99, Rio de Janeiro, Brazil, 1974-1978.
Long, L. (1989). Management Information Systems. Englewood Cliffs, NJ: Prentice Hall. Loyola, R. (1998). Capacity management software. Retrieved February 27, 2002, from
http://www.ntsystems.com/db_area/archive/1998/9807/207r2.shtml Maltinti, P., Mandorino, D., Mbeng, M., & Sgamma, M. (1996). OSI system and
application management: an experience in a public administration context. Proceedings of the 1996 IEEE Network Operations and Management Symposium, Kyoto, Japan, 2, 492-500.
Mangold, B., & Brandner, R. (1993). Systems and Network Management in Distributed
Environments. Research Triangle Park, NC: IBM Corporation.
Mack, R., & Nielsen, J. (1994). Usability Inspection Methods. New York: John Wiley & Sons.
Martin, P. (1996). A management information repository for distributed applications management. Proceedings of the 1996 International Conference on Parallel and Distributed Systems, ICPADS 1996, Tokyo, Japan, 472-477. Mason, R. (1998). WebSpectrive. White paper. Framingham, MA: International Data
Corporation. Mason, R. (2001). Enterprise management becomes infrastructure management, an IDC
white paper. Framingham, MA: International Data Corporation. Mattison, R. (1997). Understanding Database Systems. New York: McGraw-Hill. Mazurek, G. (1998, August 31). Real reseller opportunity lies in services. Computer
Reseller News, 71. McQuillen, K. (1975). System/360-370 Assembler Language (OS). Fresno, CA:
Mike Murach & Associates, Inc. Menasce, D., & Almeida, V. (1999). Evaluating Web server capacity. Web Technology,
4(4), 47-51. Merant PVCS. (2001). Retrieved February 27, 2002, from http://www.merant.com/products/pvcs/ Microcode - Webopedia definition and links. (2001). Retrieved February 27, 2002, from http://Webopedia.internet.com/TERM/M/microcode.html Microsoft Word User's Guide. (1994). Seattle, WA: Microsoft Corporation. Miller, P. (1994). Integrated system management design considerations for a
heterogeneous network and system management product. IEEE Symposium Record on Network Operations and Management, NOMS'94, Kissimmee, FL, 2, 555-575.
Modiri, N. (1991). The ISO reference model entities. IEEE Network, 5(4), 24-33. Mohan, C., Pirahesh, H., Tang, W., & Wang, Y. (1994). Parallelism in relational database
management systems. IBM Systems Journal, 33(2), 349-371. Muller, N. (1998). Digital equipment corp.’s Polycenter framework. Retrieved
February 27, 2002, from http://www.ddx.com/polyctr.html Nash, E. (1999). Catch of the day. Unix NT News, 26, 39-40, 42.
NetIQ AppManager suite architecture overview. (2001). Retrieved February 20, 2002, from http://www.express.com.au/software/netiq/netiq_appmanager_architectural.html
NetIQ AppManager suite overview. (2001). Retrieved February 20,
2002, from http://www.netiq.com/products/am/default.asp NetView Database Guide. (1997). Research Triangle Park, NC: Tivoli Systems. NetView for OS/390 Application Programmer's Guide. (2001). Research Triangle Park,
NC: Tivoli Systems. NetView for OS/390 Planning Guide. (1997). Research Triangle Park, NC: Tivoli
Systems. NetView User's Guide. (2001). Research Triangle Park, NC: Tivoli Systems. Network management. (2001). Retrieved February 28, 2002, from
http://www.managementsoftware.hp.com/solutions/categories/networkmgmt/index.asp Neumair, B. (1998). Distributed application management based on ODP viewpoint
concepts and CORBA. IEEE Symposium Record on Network Operations and Management, NOMS'98, New Orleans, LA, 2, 559-569.
New Netscape extension enables seamless integration of Netscape application server with existing enterprise applications and systems. (1998). Retrieved February 28, 2002,
from http://home.netscape.com/newsref/pr/newsrelease606.html NovaStor - Backup, encryption, and data interchange software. (2001). Retrieved
February 28, 2002, from http://www.novastor.com/ Olsen, F. (1998). Army Corps of Engineers keeps tabs on data use. Government Computer
News, 29(17), 1. Open system interconnection (OSI) protocols. (1999). Retrieved February 28, 2002,
from http://www.cisco.com/univercd/cc/td/doc/cisintwk/ito_doc/osi_prot.htm OpenVision tech unveils C/S strategy. (1994, August 22). Computer Reseller News, 12. OS/VS2 MVS Overview. (1980). New York: International Business Machines. Osel, P., & Gansheimer, W. (1995). OpenDist incremental software distribution. Proceedings of the Ninth Systems Administration Conference, LISA IX, Monterey, CA,
181-193.
Overview and installation of the SysMan software manager. (2001). Retrieved February 28, 2002, from http://gatekeeper.dec.com/pub/SysManSwMgr/00README.txt
Overview of parallel concurrent processing. (2002). Retrieved May
17, 2002, from http://sandbox.aiss.uiuc.edu/oracle/nca/fnd/parallel.htm
Overview of wired for management (WfM) baseline 2.0. (2000). Retrieved February 27, 2002, from http://developer.intel.com/ial/WfM/wfmover.htm
Patrol enterprise manager - White paper. (2001). Retrieved February 27, 2002, from
http://www.bmc.com/products/document/00039757/09003201804f386f.html Patrol 2000 by BMC software. (2000). Retrieved February 28, 2002, from
http://www.bmc.com/products/esm/index.html Platform SiteAssure. (2000). Retrieved February 28, 2002, from http://www.platform.com/products/rm/siteassure/index.asp Platinum technology emerges as enterprise management leader with major expansion
of platinum provision. (1998). Retrieved February 28, 2002, from http://ca.com/press/platinum_archive/provpf.htm
Pratt, P. (1990). A Guide to SQL. Boston, MA: Boyd & Fraser Publishing Company. Products - ASG-impact product details. (2001). Retrieved February 28, 2002, from
http://www.asg.com/products/product_details.asp?id=31 Puka, D., Penna, M., & Prodocimo, V. (2000). Service level management in ATM
networks. Proceedings of the International Information Technology Conference on Coding and Computing, Las Vegas, NV, 324-329.
Purvis, R., & Sambamurthy, V. (1997). Examination of designer and user perceptions of JAD and the traditional IS design methodology. Information and Management, 32(3), 123-135. Queen’s University database systems laboratory. (n.d.). Retrieved March 1,
2002, from the Queen's University Web site: http://www.qucis.queensu.ca/home/cords/database_lab.html
Queen’s University MANDAS research WWW server. (n.d.). Retrieved March 1,
2002, from the Queen's University Web site: http://www.qucis.queensu.ca/home/cords/mandas-queens.html
Reliable software. (2001). Retrieved February 28, 2002, from http://www.relisoft.com/ Remedy action request system. (2001). Retrieved February 28, 2002, from
http://www2.remedy.com/solutions/core/datasheets/arsystem.htm Remedy discovery services for Intel LANdesk. (2000). Retrieved February 28, 2002,
from http://www2.remedy.com/solutions/ebis/itsm/datasheets/discovery-landesk.htm
Rennhackkamp, M. (1997). System sprawl: New tools for managing distributed enterprises. DBMS, 5(10), 67-75. Rhee, Y., Park, N., & Kim, T. (2000). Heuristic connection management for improving
server side performance. Proceedings of Open Hypermedia Systems and Structural Computing, San Antonio, TX, 1903, 31-37.
Richardson, R. (1998, April 1). Software distribution: Does it deliver? Network World, 1-9. Router products configuration guide. (2001). Retrieved February 28, 2002, from
http://www.cisco.com/univercd/cc/td/doc/product/software/ios11/cbook/cdspu.pdf Rubin, A., Geer, D., & Ranum, M. (1997). Web Security Sourcebook. New York:
John Wiley & Sons, Inc. Russinovich, M. (1999). NT vs. UNIX: Is one substantially better? Retrieved
February 28, 2002, from http://www.winntmag.com/Articles/Index.cfm?IssueID=97&ArticleID=4500
Ryan, K. (1993). Six ways to boost mainframe productivity. Datamation, 39(10), 72-74. Rymer, J. (1995). Direct application management: Direct management of application modules solves a crucial need. Boston, MA: Patricia Seybold Group. Sandoval, G., & Dignan, L. (2001). Amazon analysts say fourth quarter real test. Retrieved
February 28, 2002, from http://www.zdnet.com/zdnn/stories/news/0,4586,5098607,00.html
SAP output management, forms overlay, report distribution, archiving, and retrieval.
(2001). Retrieved February 28, 2002, from http://www.cypressdelivers.com/sap.htm Schade, A., Trommler, P., & Kaiserswerth, M. (1996). Object instrumentation for
distributed applications management. Proceedings of the IFIP/IEEE International Conference on Distributed Platforms: Client/Server and Beyond: DCE, CORBA, ODP, and Advanced Distributed Applications, ICDP'96, Dresden, Germany, 173-185.
SCIS help: HTML and your Web page. (1999). Retrieved February 28, 2002, from the Nova Southeastern University Web site:
http://scis.nova.edu/NSS/Help/Webpage.html SDS helpdesk software. (2001). Retrieved February 28, 2002, from
http://www.ScottDataSystems.com/ Server consolidation methodology. (2001). White paper. Houston, TX: BMC Software,
Inc. Server resource management fact sheet. (2000). Retrieved February 28, 2002, from
http://srmWeb.raleigh.ibm.com/servlet/com.ibm.srm.servlet.gui.SrmBeginHere? Service level agreements. (2001). Retrieved February 28, 2002, from
http://www.uu.net/customer/sla/ Service level reporter. (1999). Chelmsford, MA: WebManage Technologies, Inc. Schmidt, D. (2001). Overview of CORBA. Retrieved February 28, 2002, from
http://www.cs.wustl.edu/~schmidt/corba-overview.html Shukla, R., & McCann, J. (1998). TOSS: TONICS for operation support systems: System
management using the world wide Web and intelligent software agents. Proceedings of Network Operations and Management Symposium, NOMS'98, New Orleans, Louisiana, 1, 100-109.
Siyan, K. (2000). Network management for Microsoft networks using SNMP. Retrieved
February 28, 2002, from http://www.microsoft.com/technet/treeview/default.asp?url=/TechNet/prodtechnol/winntas/maintain/featusability/networkm.asp
Slater, P. (1999). PCCONFIG: A Web based configuration tool for build to order
products. Proceedings of ES98, the 18th Annual International Conference of the British Computer Society Specialist Group on Expert Systems, Applications and Innovations in Expert Systems, Cambridge, England, 27-41.
Snell, M. (1997). Spec puts applications management in arm's reach. LANTIMES, 14(15),
1. SNMP MIB support: IBM HTTP server. (2001). Retrieved February 20, 2002, from
http://www-4.ibm.com/software/Webservers/appserv/doc/v35/ae/infocenter/ihssun/9acmib.htm
Sobel, K. (1996a). Application management: It's not just technology. Framingham, MA:
Hurwitz Group, Inc.
Sobel, K. (1996b). Creating an applications management strategy. Framingham, MA: Hurwitz Group, Inc.
Sobel, K. (1996c). HP and Tivoli announce performance management API. Framingham,
MA: Hurwitz Group, Inc. Sobel, K. (1996d). Navigating the application management hype. Framingham, MA:
Hurwitz Group, Inc. Sobel, K. (1997). Application management standards: Instrumenting applications for
management. Framingham, MA: Hurwitz Group, Inc.
http://docs.hp.com/hpux/onlinedocs/B2355-90740/B2355-90740.html Solstice Enterprise Manager. (2001). Mountain View, CA: SunSoft, Inc. Solstice enterprise manager 2.1: A technical white paper. (1997). Palo Alto, CA: Sun Microsystems, Inc. SPARCstation 5 Installation Guide. (1996). Palo Alto, CA: Sun Microsystems. Spectrum Concepts Guide. (1996). Rochester, NH: Cabletron Systems. Spectrum Enterprise Manager: Getting Started with Spectrum for Operators. (1998). Rochester, NH: Cabletron Systems. Spectrum/NV-S Gateway User's Guide. (1998). Rochester, NH: Cabletron Systems. Spuler, D. (2000). Web-based enterprise management for a standardized world.
Retrieved February 28, 2002, from http://www.bmc.com/products/whitepapers.cfm Starbase corporation: Configuration management. (2001). Retrieved March 3,
2002, from http://www.starbase.com/products/starteam/ Start, K., & Patel, A. (1995). The distribution management of service software. Computer Standards & Interfaces, 17(3), 291-301. Straus, F., Schoenwaelder, J., Braunschweig, T., & McCloghrie, K. (2001). SMIng - Next generation structure of management information. Retrieved February 28, 2002,
from http://search.ietf.org/internet-drafts/draft-ietf-sming-02.txt StreamServe overview. (2001). Retrieved February 28, 2002, from http://www.streamserve.com/default.asp?ItemID=498 Sturdevant, C. (1999). Ready, set, deploy! PC Week, 22(16), 70-75.
Sturm, R., & Bumpus, W. (1999). Foundations of Application Management. New York: John Wiley & Sons.
Sturm, R., & Weinstock, J. (1995). Application MIBs: Taming the software beast. Data
Communications, 24(15), 85-92. Symantec first to provide anti-virus and enterprise security management protection against recently issued fraudulent VeriSign digital certificates. (2001). Software Industry Report, 33(7), 1. System management: Application response measurement (ARM) API. (1998). Retrieved
February 20, 2002, from http://www.opengroup.org/publications/catalog/c807.htm System management tools. (1996). DBMS, 6(9), 87-88. System software: Product intros. (1997, October 8). ENT, 2(15), 42. Szabat, M., & Meyer, G. (1992). IBM network management strategy. IBM Systems
Journal, 31(2), 154-160. Szymanski, R., Szymanski, D., Morris, N., & Pulschen, D. (1988). Introduction to
Computers and Information Systems. Columbus, OH: Merrill Publishing Company. Talluru, L., & Deshmukh, A. (1995). Problem management in decision support systems: a
knowledge-based approach. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Vancouver, BC, 3, 1957-1962.
Tanaka, H., & Ishii, H. (1995). Service operation and management architecture using
surveillance of application software elements. Proceedings of Global Telecommunications Conference, GLOBECOM'95, Singapore, 3, 13-17.
TeMIP OSS framework overview. (2001). Retrieved November 24, 2002, from http://www.openview.hp.com/products/tmpfw/index.asp TeMIP OSI management toolkit. (1999). Retrieved February 28, 2002, from
http://apache.ethz.ch/docu/sysman/dllgzaa3.html Text-based configuration management. (2001). White paper. Malmö, Sweden: Telelogic
AB. The IETF application area. (2000). Retrieved February 28, 2002, from
http://www.apps.ietf.org/apps-area.html The Information Technology Process Model: A New Model for Managing the
Information Technology Business. (1995). New York: IBM Corporation.
The Java tutorial. (1999). Retrieved February 28, 2002, from http://www.javasoft.com/docs/books/tutorial/index.html
The Open Group adopts the ARM API as its standard for application instrumentation. (1999). Retrieved February 28, 2002, from
http://www.tivoli.com/news/press/pressreleases/en/1999/opengroup_adopts_arm.html The portable applications standards committee. (2000). Retrieved February 28, 2002,
from http://www.pasc.org Thompson, P., & Sweitzer, J. (1997). Successful practices in developing a complex
information model. Proceedings of the Conceptual Modeling – ER ’97 Conference, 16th International Conference on Conceptual Modeling, Los Angeles, CA, 376-393.
TIDAL software - Sys*ADMIRAL. (2001). Retrieved February 27, 2002,
from http://www.tidalsoft.com/products/sysadmiral/index.htm Tisdale, C. (1998). JOPES design and project plan. Research Triangle Park, NC: Tivoli
Systems, Inc. Unpublished manuscript. Tivoli business partners. (2001). Retrieved February 27, 2002, from
http://www.tivoli.com/Tivoli_Channels/WebPartners.nsf/Tivoli+Partner?OpenForm Tivoli developer kit for PowerBuilder concepts and facilities white paper. (1996). Austin, TX: Tivoli Systems, Inc. Tivoli distributed monitoring. (1999). Retrieved February 27, 2002, from
http://www.tivoli.com/products/index/distmon/ Tivoli Global Enterprise Manager Instrumentation Guide. (1998). Raleigh, NC: Tivoli Systems. Tivoli Manager for MCIS User's Guide. (1998). Austin, TX: Tivoli Systems. Tivoli Manager for Oracle Reference Guide V2.0. (2000, December). Austin, TX: Tivoli
Systems. Tivoli Module Builder User’s Guide. (1998). Research Triangle Park, NC: IBM
Corporation. Tivoli module designer. (1998). Retrieved February 27, 2002, from http://www.tivoli.com/products/index/module_designer/ Tivoli operations planning and control. (2001). Retrieved February 17, 2002, from
http://www.tivoli.com/products/index/opc/index.html
Tivoli product index. (2001). Retrieved February 27, 2002, from http://www.tivoli.com/products/index/
Tivoli service desk for OS/390 - INFOMAN. (2001). Retrieved February 27, 2002, from
http://www.tivoli.com/products/index/service_desk_390/infoman.html Tivoli service desk for OS/390 - Datasheet. (2001). Retrieved February 27, 2002, from
http://www.tivoli.com/products/index/service_desk_390/sd390_driection.html Tivoli solutions. (2001). Retrieved March 3, 2002, from
http://www.tivoli.com/products/solutions/ TME 10 Inventory User's Guide. (1998). Austin, TX: Tivoli Systems. TME 10 Software Distribution. (1998). Austin, TX: Tivoli Systems.
Tong, L. (1996). Data compression for PC software distribution. Software: Practice and Experience, 26(11), 1181-1195. Tsaoussidis, V., & Liu, K. (1998). Network management and operations: Application
oriented management in distributed environments. Proceedings of Third Symposium on Computers and Communications, ISCC'98, Athens, Greece, 130-134.
Tschichholz, M., Hall, J., Abeck, S., & Wies, R. (1995). Information aspects and future
directions in an integrated telecommunications and enterprise management environment. Journal of Network and Systems Management, 3(1), 111-138.
Tuning the WebLogic Server. (2000). Retrieved May 14, 2002, from http://www.Weblogic.com/doc51/admindocs/tuning.html Turner, R. (1998). USAA Internet member services business system specification.
Research Triangle Park, NC: Tivoli Systems, Inc. Unpublished manuscript. Turner, R. (1999). IBM REQCAT Web business system. Research Triangle Park, NC:
Tivoli Systems, Inc. Unpublished manuscript. TUSS system specifications. (2001). Retrieved November 24, 2002, from
http://www.outputmanagement.com/html/aboutproduct12.html Udupa, D. (1996). Network Management Systems Essentials. New York: McGraw-
Hill. Understanding the digital economy. (1999). Retrieved February 28, 2002, from
http://www.digitaleconomy.gov/
UniPress software - Web-based help desk software, CRM and issue management software, development tool. (2001). Retrieved February 27, 2002, from http://www.unipress.com/footprints/
Universal server farm base services. (2000). Somers, NY: IBM Global Services. UNIX and Windows centralized backup solutions for restoring and recovering data on
the network. (2001). Retrieved February 27, 2002, from http://www.syncsort.com/bex/infobex.htm
UNIX Unleashed. (1994). Indianapolis, IN: Sams Publishing. Using Tivoli software installation service for mass installation. (1998). Retrieved
February 27, 2002, from http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg245109.html?Open
Vallillee, T. (n.d.). SNMP & CMIP: An introduction to network management.
Retrieved November 24, 2002, from http://www.geocities.com/SiliconValley/Horizon/4519/snmp.html
Vangala, R., Cripps, M., & Varadarajan, R. (1992). Software distribution and
management in a networked environment. Proceedings of the Sixth System Administration Conference (LISA VI), Long Beach, CA, 163-170.
Veritas NerveCenter. (2001). Retrieved March 3, 2002, from
http://techupdate.cnet.com/enterprise/0-6133362-720-1885931.html Verton, D. (2000). Security survival training. Federal Computer Week, 14(8), 30-31. Wahl, M., Howes, T., & Kille, S. (1997). Lightweight directory access protocol (V3).
Retrieved November 24, 2002, from ftp://ftp.isi.edu/in-notes/rfc2251.txt Warrier, U., Besaw, L., LaBarre, L., & Handspicker, B. (1990). The common
management information services and protocols for the Internet. Retrieved February 27, 2002, from http://andrew2.andrew.cmu.edu/rfc/rfc1189.html
Weber, D. (1999). CM strategies for RAD. Proceedings of System Configuration Management, 9th International Symposium, SCM 9 (Lecture Notes in Computer Science, Vol. 1675), 204-216.
Webster's New International Dictionary. (1955). Springfield, MA: G. & C. Merriam
Company. WBEM initiative. (2001). Retrieved February 27, 2002, from http://www.dmtf.org/standards/standard_wbem.php/index.html
Welcome to Hewlett-Packard. (2000). Retrieved November 24, 2002, from http://www.hp.com/country/us/eng/welcome.htm
Welter, P. (1999). Web server monitoring white paper. (Available from Enterprise Management Associates, 2108 55th Street, Suite 110, Boulder, CO 80301)
Westerinen, A., & Strassner, J. (Eds.). (2000). CIM core model white paper. Retrieved February 17, 2002, from http://www.dmtf.org/var/release/Whitepapers/DSP0111.pdf
Windows NT and Windows 2000 FAQ - How do I use the security configuration and analysis snap-in? (2000). Retrieved November 24, 2002, from
http://www.windows2000faq.com/Articles/Print.cfm?ArticleID=15290 Woodruff, S. (1999). PCPMM: Port checking and pattern matching monitor documentation and configuration. Schaumburg, IL: IBM Corporation. Yahoo! search results for "capacity management services". (2001). Retrieved November
24, 2002, from http://google.yahoo.com/bin/query?p=%22capacity+management+services%22&hc=0&hs=0
Yang, A., Linn, J., & Quadrato, D. (1998). Developing integrated Web and database applications using JAVA applets and JDBC drivers. Proceedings of the 1998 29th SIGCSE Technical Symposium on Computer Science Education, SIGCSE, New York, 302-306.
Yang, C., & Luo, M. (2000). Building an adaptable, fault tolerant, and highly manageable
Web server on clusters of non-dedicated workstations. Proceedings of 2000 International Conference on Parallel Processing, Toronto, Canada, 413-420.
Yemini, A., Kliger, S., Mozes, E., Yemini, Y., & Ohsie, D. (1996). High speed and robust event correlation. IEEE Communications Magazine, 34(5), 82-90. Yucel, S., & Anerousis, N. (1999). Event aggregation and distribution in Web-based management systems. Proceedings of the Sixth International Symposium on Integrated Network Management, IM'99, Boston, MA, 35-48. Yun, J., Ahn, S., & Chung, J. (2000). Fault diagnosis and recovery scheme for Web
server using case based reasoning. Proceedings of the IEEE ICON International Conference on Networks 2000, ICON'2000, Singapore, 495.