
Design for Operations: Designing Manageable Applications

August 2008

Table of Contents

Introduction
  Intended Audiences
  How This Guide Is Organized
  Chapter Outline
  Scenarios Discussed in This Guide
  Worked Example Used in This Guide
    Northern Electronics Shipping Application
  The Dynamic Systems Initiative (DSI)
  Patterns and Practices
  Feedback and Support
  Acknowledgments

Section 1: Introduction to Manageable Applications
  Chapter 1: Understanding Manageable Applications
    Application Perspectives
    Operating Business Applications
    Application Dependencies
    Core Principles for Designing Manageable Applications
    Northern Electronics Scenario
      Operations Challenges
      Development Challenges
    Summary
  Chapter 2: A High-Level Process for Manageable Applications
    Roles Participating in the High-Level Process
    Understanding the Process
      Designing the Manageable Application
      Developing the Manageable Application
      Deploying the Manageable Application
      Operating the Manageable Application
    Facilitating the Process - Guidance and Artifacts
    Northern Electronics Scenario
    Summary

Section 2: Architecting for Operations
  Chapter 3: Architecting Manageable Applications
    Designing Manageable Applications
    Representing Applications as Managed Entities
    Advantages of Using Managed Entities
      Providing an Operations View of an Application
      Ensuring That Instrumentation Is Sufficient
      Close Mapping to Configuration
    Benefits of Defining a Management Model for the Application
    Designing, Developing, Deploying, and Maintaining Manageable Applications: Refining the Process
    Northern Electronics Scenario
    Summary
  Chapter 4: Creating Effective Management Models
    Benefits of Using Management Models
    Management Model Views
    Comprehensive Management Models
      Configuration Modeling
      Task Modeling
      Instrumentation Modeling
      Health Modeling
      Performance Modeling
    Modeling Instrumentation and Health
      Effective Instrumentation Modeling
        Types of Instrumentation
        Performance Counters
        Events
      Determining What to Instrument
        Granularity of Instrumentation
        Performance Considerations
    Building Effective Health Models
      Health States
      Health State Hierarchies
        Managed Entity Hierarchies
        Aggregate Aspects
        Rolling Up Aspects into Managed Entities
      Monitoring and Troubleshooting Workflow
        Detection
        Verification
        Diagnostics
        Resolution
        Re-verification
      Structure of a Health Model
      Mapping Requirements to Individual Indicators
      Multiple Distributed Managed Entities
    Northern Electronics Scenario
      Instrumentation Model
      Health Model
    Summary
  Chapter 5: Proven Practices for Application Instrumentation
    Events and Metrics
    Architectural Principles for Effective Instrumentation
      Create a Flexible Instrumentation Architecture
      Create Instrumentation That Operations Staff Easily Understands
      Support Existing Operations Processes and Tools
      Create Applications That Are Not Self-Monitoring
      Support Flexible Configuration of Instrumentation
        Using Instrumentation Levels to Specify Instrumentation Granularity
        Using Infrastructure Trust Levels to Specify Instrumentation Technologies
    Designing Application Instrumentation
      Use the Capabilities of the Underlying Platform
      Provide Separate Instrumentation for Each Purpose
      Isolate Abstract Instrumentation from Specific Instrumentation Technologies
      Create an Extensible Instrumentation Architecture
      Use Base Events for Instrumentation
      Use Event Names and Event IDs Consistently
      Ensure Events Provide Backward Compatibility
      Support Logging to Remote Sources
      Consider Distributed Event Correlation
    Developing the Instrumentation
      Minimize Resource Consumption
      Consider the Security of the Event Information
      Supply Appropriate Context Data
      Record the Times Events Are Generated
      Provide Resolution Guidance
    Building and Deploying Instrumentation
      Automate Implementation of Instrumentation
      Automate the Build and Deploy Process
      Monitor Applications Remotely
    Summary
  Chapter 6: Specifying Infrastructure Trust Levels
    Infrastructure Model Scenarios
      In-House Application Scenario
      ISV or Shrink-Wrap Application Scenario
    Privilege and Trust Considerations
    Tools for Infrastructure Modeling
      Standalone Tools
      Integrated Tools
      Infrastructure Modeling with the TSMMD
        Instrumentation Technologies Supported by the TSMMD
    Northern Electronics Scenario
    Summary
  Chapter 7: Specifying a Management Model Using the TSMMD Tool
    Requirements for the TSMMD
    Creating a Management Model
      The TSMMD Guided Experience
      Creating the TSMMD File
      Graphically Modeling an Operations View of the Application
        Executable Application
        Windows Service
        ASP.NET Application
        ASP.NET Web Service
        Windows Communication Foundation (WCF) Service
      Defining Target Environments for the Application
      Defining Instrumentation for the Application
        Defining Abstract Instrumentation
        Defining Instrumentation Implementations
      Discovering Existing Instrumentation in an Application
      Creating Health Definitions
      Validating the Management Model
    Management Model Guidelines
    Northern Electronics Scenario
    Summary

Section 3: Developing for Operations
  Chapter 8: Creating Reusable Instrumentation Helpers
    Creating Instrumentation Helper Classes
    Instrumentation Solution Folder
      API Projects
      Technology Projects
        Event Log Project
        Windows Eventing 6.0 Project
        WMI Project
        Performance Counter Project
    Using the Instrumentation Helpers
    Verifying That Instrumentation Code Is Called from the Application
    Summary
  Chapter 9: Event Log Instrumentation
    Installing Event Log Functionality
      Event Sources
      Using the EventLogInstaller Class
    Writing Events to an Event Log
      Using the WriteEntry Method
      The WriteEvent Method
    Reading Events from Event Logs
      Creating and Configuring an Instance of the EventLog Class
      Using the Entries Collection to Read the Entries
    Clearing Event Logs
    Deleting Event Logs
    Removing Event Sources
    Creating Event Handlers
    Using Custom Event Logs
      Writing to a Custom Log
        Installing the Custom Log
        Writing Events to the Custom Log
      Other Custom Log Tasks
    Summary
  Chapter 10: WMI Instrumentation
    WMI and the .NET Framework
      Benefits of WMI Support in the .NET Framework
      Limitations of WMI in the .NET Framework
      Using WMI.NET Namespaces
    Publishing the Schema for an Instrumented Assembly to WMI
      Republishing the Schema
      Unregistering the Schema
    Instrumenting Applications Using WMI.NET Classes
      WMI .NET Classes
    Accessing WMI Data Programmatically
    Summary
  Chapter 11: Windows Eventing 6.0 Instrumentation
    Windows Eventing 6.0 Overview
      Reusable Custom Views
      Command Line Operations
      Event Subscriptions
      Integration with Task Scheduler
      Online Event Information
    Publishing Windows Events
      Event Types and Event Channels
        Event Types and Channel Groups
        Serviced Channel
        Direct Channel
        Channels Defined in the Winmeta.xml File
      Creating the Instrumentation Manifests
        Elements in the Instrumentation Manifest
        Using Templates for Events
      Using the Message Compiler to Produce Development Files
      Writing Code to Raise Events
      Compiling and Linking Event Publisher Source Code
      Installing the Publisher Files
    Consuming Event Log Events
      Querying for Events
      Querying Over Active Event Logs
      Querying Over External Files
      Reading Events from a Query Result Set
      Subscribing to Events
      Push Subscriptions
      Pull Subscriptions
    Summary
  Chapter 12: Performance Counters Instrumentation
    Performance Counter Concepts
      Categories
      Instances
      Types
    Installing Performance Counters
    Writing Values to Performance Counters
    Connecting to Existing Performance Counters
    Performance Counter Value Retrieval
      Raw, Calculated, and Sampled Data
    Comparing Retrieval Methods
    Summary
  Chapter 13: Building Install Packages

Section 4: Managing Operations
  Chapter 14: Deploying and Operating Manageable Applications
    Deploying the Application Instrumentation
    Running the Instrumented Application
      Event Log Instrumentation
      Performance Counter Instrumentation
      WMI
      Trace File Instrumentation
    Summary
  Chapter 15: Monitoring Applications
    Distributed Monitoring Applications
    Management Packs
    Rules and Rule Groups
    Monitoring the Example Application
    Monitoring the Remote Web Service
    Summary
  Chapter 16: Creating and Using Microsoft Operations Manager 2005 Management Packs
    Importing a Management Model from the MMD into Operations Manager 2005
      Viewing the Management Pack
      Guidelines for Importing a Management Model from the Management Model Designer
    Creating and Configuring a Management Pack in the Operations Manager 2005 Administrator Console
      Guidelines for Creating and Configuring a Management Pack in the Operations Manager 2005 Administrator Console
    Editing an Operations Manager 2005 Management Pack
      Editing Rule Groups and Subgroups
      Editing Event Rules, Alert Rules, and Performance Rules
      Editing Computer Groups and Rollup Rules
      Creating and Editing Operators, Notification Groups, and Notifications
      Viewing and Editing Global Settings
      Guidelines for Editing an Operations Manager 2005 Management Pack
    Create an Operations Manager 2005 Computer Group and Deploy the Operations Manager Agent and Rules
      Guidelines for Creating an Operations Manager 2005 Computer Group and Deploying the Operations Manager Agent and Rules
    View Management Information in Operations Manager 2005
      Guidelines for Viewing Management Information in Operations Manager 2005
    Create Management Reports in Operations Manager 2005
      Guidelines for Creating Management Reports in Operations Manager 2005
    Summary
  Chapter 17: Creating and Using System Center Operations Manager 2007 Management Packs
    Convert and Import a Microsoft Operations Manager 2005 Management Pack into Operations Manager 2007
      Guidelines for Converting and Importing a Microsoft Operations Manager 2005 Management Pack into Operations Manager 2007
    Creating a Management Pack in the Operations Manager 2007 Operations Console
      Guidelines for Creating a Management Pack in the Operations Manager 2007 Operations Console
    Editing an Operations Manager 2007 Management Pack
      Guidelines for Editing an Operations Manager 2007 Management Pack
    Deploying the Operations Manager 2007 Agent
      Best Practices for Deploying the Operations Manager 2007 Agent
    Viewing Management Information in Operations Manager 2007
      Guidelines for Viewing Management Information in Operations Manager 2007
    Creating Management Reports in Operations Manager 2007
      Guidelines for Creating Management Reports in Operations Manager 2007
    Summary

Section 5: Technical References
  Appendix A: Building and Deploying Applications Modeled with the TSMMD
    Consuming the Instrumentation Helper Classes
    Verifying Instrumentation Coverage
    Deploying the Application Instrumentation
      Installing Event Log Functionality
      Installing Windows Eventing 6.0 Functionality
      Publishing the Schema for an Instrumented Assembly to WMI
      Installing Performance Counters
      Using a Batch File to Install Instrumentation
      Using the Event Messages File
    Specifying the Runtime Target Environment and Instrumentation Levels
    Generating Management Packs for System Center Operations Manager 2007
    Importing a Management Pack into System Center Operations Manager 2007
      Prerequisite Management Packs
    Creating a New Distributed Application
  Appendix B: Walkthrough of the Team System Management Model Designer Power Tool
    Building a Management Model
    Generating the Instrumentation Code
    Testing the Model with a Windows Forms Application
    Generating an Operations Manager 2007 Management Pack
  Appendix C: Performance Counter Types


Copyright Information

Information in this document, including URL and other Internet Web site references, is subject

to change without notice. Unless otherwise noted, the companies, organizations, products,

domain names, e-mail addresses, logos, people, places, and events depicted in examples herein

are fictitious. No association with any real company, organization, product, domain name, e-

mail address, logo, person, place, or event is intended or should be inferred. Complying with all

applicable copyright laws is the responsibility of the user. Without limiting the rights under

copyright, no part of this document may be reproduced, stored in or introduced into a retrieval

system, or transmitted in any form or by any means (electronic, mechanical, photocopying,

recording, or otherwise), or for any purpose, without the express written permission of

Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual

property rights covering subject matter in this document. Except as expressly provided in any

written license agreement from Microsoft, the furnishing of this document does not give you

any license to these patents, trademarks, copyrights, or other intellectual property.

Microsoft, Windows, System Center Operations Manager, C#, Visual Basic, Visual Studio, and

Team System are trademarks of the Microsoft group of companies.

All other trademarks are property of their respective owners.

© 2008 Microsoft Corporation. All rights reserved.


Introduction

Welcome to Design for Operations: Designing Manageable Applications – August 2008 Release.

This guide describes how to create applications that are easier to manage than existing

applications. When used alongside the associated code artifacts, this guide should help

dramatically simplify the process of creating manageable applications, and therefore reduce the

costs associated with application operations.

Intended Audiences

This guide is designed for people involved in designing, developing, testing, deploying, and

operating business applications. These include people in the following roles:

• Solutions architects

• Infrastructure architects

• Developers

• Senior operators

People in each role are likely to use the guide in different ways; different sections are suitable

for different roles. For more information about which sections are appropriate for particular

roles, see the next section, "How This Guide Is Organized."

How This Guide Is Organized

This guide provides comprehensive guidance for designing manageable applications. The guide is divided into five sections. The following list describes each section and its primary audience:

• "Understanding Manageable Applications" (primary audience: solutions architects and infrastructure architects). Defines manageable applications, explains their benefits, and defines a high-level process for designing manageable applications.

• "Architecting for Operations" (primary audience: solutions architects and infrastructure architects). Examines the architectural principles that should be followed to design manageable applications, explains management models, and shows how these models can be defined.

• "Developing for Operations" (primary audience: developers). Examines the development tasks that must be performed to create manageable applications, and shows how developers can consume the management model to make developing manageable applications easier.

• "Managing Operations" (primary audience: operators). Explains how to take manageable applications and use them in operations.

• "Technical References" (primary audience: developers). Includes technical resources that provide additional information about the process of creating manageable applications.

Chapter Outline

This guide includes the following chapters and appendices:

• "Introduction"

• Section 1, "Understanding Manageable Applications"

◦ Chapter 1, "Understanding Manageable Applications"

◦ Chapter 2, "A High-Level Process for Manageable Applications"

• Section 2, "Architecting for Operations"

◦ Chapter 3, "Architecting Manageable Applications"

◦ Chapter 4, "Creating Effective Management Models"

◦ Chapter 5, "Proven Practices for Application Instrumentation"

◦ Chapter 6, "Specifying Infrastructure Trust Levels"

◦ Chapter 7, "Specifying a Management Model Using the TSMMD Tool"

• Section 3, "Developing for Operations"

◦ Chapter 8, "Creating Reusable Instrumentation Helpers"

◦ Chapter 9, "Event Log Instrumentation"

◦ Chapter 10, "WMI Instrumentation"

◦ Chapter 11, "Windows Eventing 6.0 Instrumentation"

◦ Chapter 12, "Performance Counters Instrumentation"

◦ Chapter 13, "Building Install Packages"

• Section 4, "Managing Operations"

◦ Chapter 14, "Deploying and Operating Manageable Applications"

◦ Chapter 15, "Monitoring Applications"

◦ Chapter 16, "Creating and Using Microsoft Operations Manager 2005

Management Packs"

◦ Chapter 17, "Creating and Using System Center Operations Manager 2007

Management Packs"

• Section 5, "Technical References"

◦ Appendix A, "Building and Deploying Applications Modeled with the TSMMD"

◦ Appendix B, "Walkthrough of the TSMMD Tool"

◦ Appendix C, "Performance Counter Types"

The technical reference section chapters are included in this outline for the sake of

completeness. However, these chapters are scheduled for inclusion in a later revision of the


guide. The plans for the final version of this guide are subject to change, based on feedback

from the community.

Scenarios Discussed in This Guide

The principles discussed in this guide apply to a wide range of applications. However, the

specific guidance may vary according to the particular type of application being used. This guide

specifically considers a number of different types of applications and offers targeted guidance

where appropriate. The types of applications this guide considers include the following:

• Line-of-business (LOB) applications

• Web services

• Smart client applications

• Mobile client applications

Worked Example Used in This Guide

Most chapters in this guide include references to a scenario that helps explain the concepts of

that chapter. The entire guide uses a single worked example, the Northern Electronics shipping

application, instead of using different scenarios for each chapter. This example is refined in each

chapter, according to the requirements of that chapter.

Northern Electronics Shipping Application

Northern Electronics is an electronics maker based in Everett, Washington, with a partly owned

manufacturing subsidiary based in Nanjing, China.

Product shipping is a core business process for Northern Electronics. However, the company has

had ongoing problems with the product shipping process. Trucks do not always arrive on time; even when they do, the requirements for each truck are not always met; and the wrong cargo shows up at the loading dock more often than it should. These logistical problems result in much higher overhead costs and, above all, in delays for the customers expecting on-time arrival.

To improve the situation, the Chief Operations Officer (COO) of Northern Electronics has

approved a plan to overhaul the product shipping process. This plan includes the development

of a new product shipping application.

The product shipping application is critical to the continued success of Northern Electronics.

From his previous experience in other companies, the Chief Information Officer (CIO) of

Northern Electronics is aware that business applications can prove less reliable and more costly

to operate than expected and is looking to avoid those problems with this application. He has

asked the solutions architect to carefully consider operations costs when designing this

application.


The solutions architect has committed to work with others in the organization to address

operations costs as he evaluates how to design this application.

The Dynamic Systems Initiative (DSI)

The Dynamic Systems Initiative (DSI) is a proposal from Microsoft and its partners to deliver

self-managing dynamic systems. Organizations can use the technologies that form part of DSI to

automate many of the ongoing operations tasks that are currently manually performed. This

results in reduced costs and more time to proactively focus on what is most important to the

organization. Designing manageable applications now represents an important step toward the

goal of providing fully dynamic systems later.

For more details about DSI, see "Dynamic Systems Initiative" on the Microsoft

Business & Industry Web site at http://www.microsoft.com/business/dsi/default.mspx.

Patterns and Practices

Microsoft patterns & practices are recommendations for how to design, develop,

deploy, and operate architecturally sound applications for the Microsoft application platform.

There are four types of patterns & practices guidance:

• Software factories

• Application blocks

• Reference implementations

• Guides

Microsoft patterns & practices contain deep technical guidance and tested source code based

on real-world experience. The technical guidance is created, reviewed, and approved by

Microsoft architects, product teams, consultants, product support engineers, and by Microsoft

partners and customers. The result is a thoroughly engineered and tested set of

recommendations that you can follow with confidence when building your applications.

Feedback and Support

This version of the guide represents preliminary thinking from the Design for Operations (DFO)

team; as such, it is subject to change resulting from feedback. To provide feedback to the DFO

team, please send an e-mail message to [email protected].

Acknowledgments

Thanks to the following individuals who assisted with content development, code development, testing, and documentation:

Core Development Team


• William Loeffler, Microsoft Corporation

• Keith Pleas, Keith Pleas and Associates

• Fernando Simonazzi, Clarius Consulting

• Vanesa Cillo, Clarius Consulting

• Peter Clift, Tek Systems

• Alex Homer, Content Master Ltd

• Paul Slater, Wadeware LLC

Test and Edit Team

• Lakshmi Prabha Vijaya Sundiram, Infosys Technologies Ltd

• Sateesh Venkata Surya Nadupalli, Infosys Technologies Ltd

• Eric Blanchet, VMC Consulting Corporation

• Tina Burden McGrayne, TinaTech Inc

Reviewers

• David Aiken, Microsoft Corporation

• Mary Gray, Microsoft Corporation

• Peter Costatini, Microsoft Corporation

• Marty Hough, Microsoft Corporation

• Kyle Bergum, Microsoft Corporation

• Alex Torone, Microsoft Corporation

• David Trowbridge, Microsoft Corporation

• Tim Sinclair, Microsoft Corporation

• Jeff Levinson, Boeing Corporation


Section 1

Introduction to Manageable Applications

This section defines manageable applications and explains their benefits to operators, developers, and architects. It also defines a high-level process for designing, developing, deploying, and operating manageable applications.

This section should be of use primarily to solutions architects and infrastructure architects.

However, it also provides useful background information to developers and operators.

Chapter 1, "Understanding Manageable Applications"

Chapter 2, "A High-Level Process for Manageable Applications"


Chapter 1

Understanding Manageable Applications

Hardware and software costs form only a small percentage of the total cost of ownership (TCO)

for enterprise applications. Over time, the costs of managing, maintaining, and supporting those

applications are far more significant.

A large portion of day-to-day running costs is attributable to application failures, performance

degradation, intermittent faults, and operator error. The resultant downtime can severely

impact business processes throughout an organization.

Many of these problems can be mitigated by ensuring that the enterprise applications are

designed to be manageable. At a minimum, a manageable application must meet the following

criteria:

• It is compatible with the target deployment environment.

• It works well with operational tools and processes.

• It provides visibility into the health of the application.

• It is dynamically configurable at run time.
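The last two criteria are the ones most often overlooked. As a minimal C# sketch of what they can mean in practice (illustrative only, not part of the guide's code artifacts; the class name, "NorthernShipping" event source, and "InstrumentationLevel" configuration key are all hypothetical), an application can surface health information through the Windows event log, where standard operations tools can see it, and read its instrumentation verbosity from configuration so operators can adjust it at run time:

using System.Configuration;
using System.Diagnostics;

// Illustrative sketch only. Assumes a reference to System.Configuration.dll
// and that the "NorthernShipping" event source was registered when the
// application was installed (see Chapter 9, "Event Log Instrumentation").
public static class HealthReporter
{
    // Reading the level on each call lets operators change the verbosity
    // at run time by editing the application configuration file.
    private static int InstrumentationLevel
    {
        get
        {
            int level;
            string raw = ConfigurationManager.AppSettings["InstrumentationLevel"];
            return int.TryParse(raw, out level) ? level : 1;
        }
    }

    public static void ReportDegraded(string component, string detail)
    {
        if (InstrumentationLevel < 1)
        {
            return; // level 0 suppresses warnings; errors would still be logged
        }

        // Writing to the event log makes the health state visible to standard
        // operations tools such as Event Viewer and Operations Manager.
        EventLog.WriteEntry(
            "NorthernShipping",
            string.Format("Component '{0}' is degraded: {1}", component, detail),
            EventLogEntryType.Warning,
            2001);
    }
}

Later chapters refine this idea; in particular, Chapters 5 and 8 replace direct calls to a specific technology such as the event log with reusable, technology-neutral instrumentation helpers.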

Manageable applications make day-to-day operations a more predictable, efficient process.

However, the benefits of manageable applications are not restricted to the operations team.

With many existing applications, when a problem occurs, the operator attempts to diagnose it and may solve it by either modifying the configuration of the application or modifying the system at a lower level (for example, by making changes to the operating system, the hardware, or the network).

If the operator is unable to diagnose or fix an application problem, the problem may have to be reported to the development team so a fix can be produced. One of the main reasons this happens is insufficient or irrelevant instrumentation. If architects and developers

create manageable applications, they can reduce the number of times they are called upon to

fix problems through additional development.

This chapter demonstrates how to understand applications from different perspectives and

describes how knowledge of the operations perspective can lead to applications that are

designed to be manageable.


Application Perspectives

Depending on their relationship to an application, different people in an organization have different perspectives on it. These perspectives include the following:

• User. The user can be thought of as the consumer of the application. From

the user perspective, an application is responsible for meeting user requirements.

Requirements such as security, performance, and availability are typically defined in a

service-level agreement (SLA).

• Operator. The operator can be thought of as the facilitator of the

application. From the operator perspective, the application must be provided to the

user, according to the requirements of the application SLA. The operator is responsible

for ensuring that the requirements of the user are being met and taking appropriate

action if they are not being met. Appropriate action includes troubleshooting problems,

providing the user with feedback, and providing the developer with feedback that may

lead to further development.

• Developer. The developer perspective can be thought of as the creator of the

application. From the developer perspective, the application must be designed and built

to meet the needs defined by the user. However, when creating manageable

applications, the developer perspective should also capture the needs of the operator

and the tasks the operator must perform.

Each of these perspectives is held by multiple job roles, all of whom should be involved in

developing and consuming a manageable application. For example, the developer perspective

will typically be held by one or more architect roles, along with the application developers. For

more details about the specific job roles involved in a manageable application, see Chapter 3,

"Architecting Manageable Applications."

Operating Business Applications

Before developing manageable applications, it is important to understand the challenges that

operations teams typically face when managing applications.

Operations consists of a series of interrelated tasks, including the following:

• Monitoring applications and services

• Starting and stopping applications and services

• Detecting and resolving failures

• Monitoring performance

• Monitoring security

• Performing change and configuration management

• Protecting data

The operations team is responsible for ensuring day-to-day availability of the application, yet

they are often provided with applications that are difficult to effectively manage. This often

results in a number of problems, including the following:

• An inability to determine the consequences of problems when they occur

• Insufficient run-time configurability of applications


• Poor understanding of interdependencies between the hardware and software

elements that make up a system

• Poorly designed administration tools that do not reflect the way the IT administrator

views the application

• Changes in one part of a system creating significant impact on the overall environment.

The intent of the administrator and the dependencies among the various components

often cannot be determined by looking at how the resources were deployed in the

environment.

• IT administrators providing the only points of integration across different subsystems.

System configuration rules often reside only in someone's head. Typically, there are no

formal records of either the configuration itself or of the changes that have been made

to it.

• Social processes being responsible for achieving coordination of systems.

Administrators have hallway conversations, send e-mail, or write on sticky notes to

remind each other of issues, changes, and so on.

These problems limit the operations team's ability to manage the application efficiently and can

ultimately affect the experience of the users consuming the application.

To solve these problems, the work of the operations team needs to be considered throughout

application design, development, test, and deployment. In many cases, this will be an iterative

process. For example, the experience gained from the day-to-day operation of the system

should guide improvements to the application design over time. With manageable applications,

it is generally easier to transfer system knowledge between all phases of the IT life cycle.

Application Dependencies

Figure 1 illustrates a typical three-tiered architecture for an application.

Figure 1 Application three-tier architecture

From an operations perspective, applications always execute on a platform and generally

communicate over a network. Applications are dependent on their own underlying system and

network layers, but they may also communicate with, and be dependent on, other applications

and services.

Figure 2 illustrates the application from the perspective of an operations team.


Figure 2 Applications from an operations perspective

Operators collect information that corresponds to each of these layers, using the information to

ensure that applications continue to run smoothly. Understanding each layer as a separate

entity, and understanding the relationships between the layers, often allows the operations

team to quickly isolate the source of any problem.

For example, if a computer running a SQL database that provides data to an application

becomes unavailable, the functionality of the application could be affected. In this situation, the

operator needs to know several things:

• What has caused the SQL Server to become unavailable? Typically, this is exposed in the

form of instrumentation at the system tier and network tier. For example, the computer

running SQL Server may have shut down or a network cable may have been removed.

• What are the consequences to the application? Typically, this is exposed in the form of

instrumentation at the application tiers. For example, some functionality of the

application may be lost or performance of the application may be affected.

• What are the consequences to the business operations of the company? Typically, this

can be exposed in the form of instrumentation at the application business logic tier and

may depend on factors outside the application itself. For example, if a business

operation that occurs once a month is affected, and the problem occurs when there are


25 days before the operation occurs again, the problem is less critical than if the

operation must occur every day.

Typically, developers are not concerned with the details of the lower layers. However, an

architect who is designing for operations should have a greater awareness of these

details, because issues at a lower level can lead to problems with the health of the application

itself.

Core Principles for Designing Manageable Applications

If you are going to design manageable business applications, you must consider manageability

as an integral part of the initial design of the application; it should not be just an afterthought.

Manageability should also be refined and improved through feedback from the operations team

as you gain insight into how the application behaves after deployment. The process of

designing manageable applications is the result of collaboration between multiple parties who

must agree to a number of core principles, including the following:

• Applications will provide comprehensive, configurable instrumentation that is

relevant to the IT team. Instrumentation is a very important tool that helps you

understand how an application functions and whether it is functioning as expected.

Instrumentation can also form the basis for determining the resolution to problems (a code sketch appears after this list).

• Applications will have a health state that varies according to their ability to perform

operations as expected. A healthy application is an application that is performing as

expected. By setting certain parameters for an application, and measuring whether the

application is functioning within those parameters, you can determine the health of an

application and take corrective measures when an application is unhealthy. For more

information about application health, see Chapter 4, "Creating Effective Management

Models."

• Application development must remain independent of the underlying platform.

Problems with the underlying platform can affect the health of an application (for

example, a DNS issue may prevent an application from functioning as expected), and it

is often necessary to capture these dependencies in tools such as System Center

Operations Manager 2007. However, this should not prevent developers from using the

proven practice of developing applications that are independent of the underlying

platform.

• Applications will be managed according to proven practices. Operations teams

currently use a series of practices to manage applications. These practices are

determined by experience and the capabilities and limitations of the available

management tools. Manageable applications should provide an operations experience

similar to the best examples of current manageable server applications.

• Operations will use existing standard management tools to manage applications.

There are many existing tools available for operating applications, including built-in

Microsoft Management Console (MMC) tools such as Event Viewer and Performance

Logs and Alerts. For more sophisticated operations management, there are tools

available such as System Center Operations Manager 2007. Creating new tooling for

managing applications further increases the operations team's workload, so wherever

effective existing tooling is available, it should be used.


This list of core principles is not comprehensive. In many cases, additional core principles will be

established to cover areas such as task management and configuration management.
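
To make the instrumentation principles above concrete, the following minimal sketch shows one way a .NET component might raise events that are relevant to operations and configurable at run time. The class name, switch name, and event source are illustrative assumptions, not artifacts from this guide's worked example; the event source would also need to be registered at installation time.

```csharp
using System;
using System.Diagnostics;

// Illustrative sketch only: raises operations-relevant events through a
// standard TraceSwitch so that verbosity is configurable without a rebuild.
public class OrderProcessor
{
    // The switch value is read from the application configuration file,
    // so the operations team controls how much information is reported.
    private static readonly TraceSwitch Verbosity =
        new TraceSwitch("OrderProcessing", "Order processing instrumentation");

    public void ProcessOrder(int orderId)
    {
        try
        {
            // ... business logic would run here ...

            if (Verbosity.TraceInfo)
            {
                // "NorthernShipping" is an assumed event source name.
                EventLog.WriteEntry("NorthernShipping",
                    String.Format("Order {0} processed.", orderId),
                    EventLogEntryType.Information);
            }
        }
        catch (Exception ex)
        {
            // Failures are always reported; they feed the health model.
            EventLog.WriteEntry("NorthernShipping",
                String.Format("Order {0} failed: {1}", orderId, ex.Message),
                EventLogEntryType.Error);
            throw;
        }
    }
}
```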

Northern Electronics Scenario

The solutions architect of Northern Electronics has suggested a product shipping solution

centered around three Web services:

• ShippingService Web service. This is the supplier's Web service that is used to send and

receive the details of the shipment pickup.

• PickupService Web service. This is the supplier's Web service that is used internally to

receive notification of product pickup and to confirm that the shipment was picked up.

• TransportService Web service. This is the transport consolidator's Web service that is

used by the supplier to initially order the transport and finally to confirm that the

shipment was picked up.

The TransportService Web service is not directly implemented by Northern Electronics.

However, it still needs to be considered as part of the overall design because it forms part of the

overall functionality of the application.

Figure 3 illustrates the planned application flow between the Web services, databases, and

workstations used in the application.


Figure 3 Application flow for the Northern Electronics shipping application

Operations Challenges

A number of the problems with product shipping faced by Northern Electronics stem from the

existing product shipping application. The operations team for this application faces the following

challenges:

• They rely on users to detect and report faults. Sometimes, users cannot provide

sufficient or accurate information; this makes diagnosis and resolution of faults difficult,

costly, and time-consuming.

• They may have to visit the computer to investigate issues. The information they

receive or can extract from the event logs or performance counters may not provide the

appropriate data required to resolve the fault.

• They cannot easily detect some problems early. These problems include impending

failure of a connection to a remote service caused by a failing network connection or

lack of disk space on the server. They are unlikely to monitor performance counters and

event logs continuously and, instead, use them solely as a source of information for

diagnosing faults.


Development Challenges

The solutions architect is committed to making the new product shipping solution a manageable

application. However, he faces several challenges in achieving this goal:

• The development team has no experience in developing manageable applications, and

there is no budget for using external developer resources.

• Northern Electronics is planning to modify the design of its infrastructure, and these

plans are currently not finalized.

• Northern Electronics is planning to migrate early to Windows Vista and Windows Server

2008.

The solutions architect plans to use a management model for the application to help him

overcome these challenges.

Summary

This chapter examined the different perspectives that interact with an application and focused

more closely on the operations perspective, which must be well understood to design

manageable applications. It introduced some core principles that should be followed when

designing manageable applications. It also provided more details about the Northern Electronics

scenario.


Chapter 2

A High-Level Process for Manageable Applications

The high-level process for manageable applications defines four interconnected stages that

capture the application through design, development, deployment, and operations, as shown in

Figure 1.

Figure 1 High-level process for manageable applications

This chapter describes each stage and demonstrates how the stages are used together in

manageable applications. As illustrated in Figure 1, the stages are the following:

• Design. A management model is used to define how the application will function in

operations. The management model captures, at an abstract level, the entities that

make up the application, the dependencies between them, the deployment model for

the application, and an abstract representation of the health and instrumentation in the

application.

• Develop. A manageable application will include extensive health and instrumentation

artifacts represented in the management model. Information contained in the

management model is used to help determine the specifics of the health and


instrumentation implementation. Instrumentation will include event IDs, performance

counters, categories, and messages. The application may also perform additional health

checks, such as synthetic transactions.

• Deploy. After the application is developed, it must be deployed. The infrastructure

model (defined as part of the management model) for the application affects the

specific environment that the application runs in, which in turn, affects the health and

instrumentation technologies that can be used. For example, an application deployed in

a low trust environment may not be able to log to a Windows Event Log.

• Operate. After the application is deployed, it must be operated on a day-to-day basis.

Typically, the operations team uses management tools to consume the health and

instrumentation information provided by the application in daily operations and makes

necessary changes to application configuration.

The order of these stages is important: adding the appropriate instrumentation to an

application on an as-needed basis at the end of the development process (or, even worse, after

completing testing and deployment) is unlikely to produce a manageable application. However,

in many cases, feedback during the cycle leads to further development of the management

model.

Roles Participating in the High-Level Process

The following four roles are primarily involved in the high-level process:

• Solutions architect. The solutions architect is responsible for defining the application at

the logical level. This involves determining how the application should be structured,

how health can be determined for the application (in an abstract sense), and the

instrumentation that is necessary to make that determination.

To help define the various manageability requirements of an application, the solutions

architect should create a management model; typically, this is created in collaboration

with the infrastructure architect.

• Developer. The developer is responsible for consuming the model created by the

solutions architect and creating the application, along with appropriate health,

instrumentation, and configuration artifacts, as defined in the model.

• Infrastructure architect. The infrastructure architect is responsible for specifying the

environment in which the application will run. This information may be specified in an

infrastructure model, which may affect decisions made by the solutions architect (for

example, the trust environment into which the application will be deployed). The

infrastructure architect must also ensure that the application can be deployed in the

environment; if it cannot be deployed in the environment, the infrastructure architect

must ensure that the appropriate changes are made to the application or the

environment.

• Operator. The operator is responsible for the ongoing running of the application and

responds to application and system alerts using a variety of operations tools. The

operator may also adjust run-time configuration of the application in response to

certain events.

Figure 2 illustrates how these job roles participate in the high-level process.


Figure 2 High-level process showing job roles

Many additional job roles participate at some point in the life cycle of a manageable application.

The following table lists these roles and the perspectives that they would hold on the

application. For more information about application perspectives, see Chapter 1,

"Understanding Manageable Applications."

Role | Perspective | Description

User | User | Uses the application.

User Product Manager (User PM) | User | Defines user needs and required features of the application. Works with the solutions architect and infrastructure architect to define the service-level agreement (SLA) for the application.

Helpdesk | User or Operator | Responds to user problems. Works with operations to troubleshoot application problems. Records information that will assist operations and future development.

User Education | Developer | Responsible for content in error messages, events, and Help files.

Test | Developer | Provides feedback to the developer during development cycles.

In many cases, individuals are responsible for more than one role in a project.

Understanding the Process

To understand how manageable applications are designed, implemented, deployed, and

managed, it is important to look at the process in more detail.

Designing the Manageable Application

The management model forms the starting point for a manageable application. One of the great

challenges of creating manageable applications is determining, at design time, the needs for the

application in daily operations. By investing time in creating an effective management model

early, you can dramatically increase the likelihood that your application is manageable later.

Creating a management model for the application does not prevent you from using an

iterative approach when designing your application—the model should be flexible enough to

be altered as changes occur in later iterations.

Typically, the infrastructure architect and the solutions architect are the main roles involved in

creating a management model. The infrastructure architect provides input about the

environment in which the application will be deployed, which may include factors such as

network connectivity, network zones, and allowed protocols. This information is critical to the

overall design, because it can affect the way instrumentation will be implemented in the

application. For example, if the application is to be deployed in a low-trust environment, it is

typically not possible to write events to an event log. In some cases (for example, for a shrink-

wrapped application), it may not be possible to determine in advance what the deployment

environment will be, so multiple trust levels may have to be supported.

Generally, the solutions architect is responsible for the specifics of the management model. The

management model defines how the application is broken into manageable operational units

(known as managed entities). It also contains abstract information about the application, which

defines how the application is developed, deployed, and, ultimately, how it is managed. This

information includes an instrumentation model, which indicates all the instrumentation points

for the application, and a health model, which indicates the various health states for the

application.

For more information about creating a management model, including information about how to

use the Team System Management Model Designer Power Tool (TSMMD) tool, see Chapter 4,


"Creating Effective Management Models," Chapter 5, "Instrumentation Best Practices," and

Chapter 6, "Specifying Infrastructure Requirements."

Developing the Manageable Application

After an effective management model is created for the application, the application itself needs

to be developed, and the information contained in the model must be incorporated. The

developer is responsible for taking the abstract elements in the model and generating concrete

artifacts in the code. In particular, the developer typically incorporates specific instrumentation,

such as the following:

• Event log events

• WMI events

• Performance counters

• Event traces

The developer may also need to incorporate specific health indicators, which are used to

determine the health of an application, and configurability support, which is used to modify

what instrumentation is used at run time.
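
As a sketch of how one of these artifacts might look in code, a developer could publish a custom performance counter as follows. The category and counter names are assumptions for illustration, and the counter category would normally be created by the application's installer.

```csharp
using System.Diagnostics;

// Illustrative sketch: publishing a custom performance counter. The
// category must already exist (typically created at installation time
// with PerformanceCounterCategory.Create).
public static class ShippingCounters
{
    private static readonly PerformanceCounter OrdersPerSecond =
        new PerformanceCounter(
            "Northern Shipping",     // assumed custom category name
            "Orders Processed/sec",  // assumed counter name
            false);                  // false = writable, not read-only

    public static void OrderProcessed()
    {
        OrdersPerSecond.Increment();
    }
}
```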

Deploying the Manageable Application

After the application is developed, it must be deployed (for simplicity, testing is intentionally

omitted from this process). For an application to be truly manageable, you should have a high

degree of control over the deployment of that application. This allows you to more easily

manage the process of changes to the application. Also, during deployment, specific

configuration settings for the application may be chosen.

Typically, manageable applications should be deployed using a redistributable Microsoft

Windows Installer (.msi) package or SMS package. However, the specifics of the deployment

model are beyond the scope of this guide.

Operating the Manageable Application

After the application is deployed to its target environment, it must be operated. The operator is

responsible for managing the application, using an administrative console and supporting tools

such as Event Viewer and Performance Logs and Alerts. The operator may also use more

advanced tooling, such as System Center Operations Manager 2007, with the application.

In many cases, information contained in the management model can be consumed at run time

by operations. This may be as simple as the operator using a report generated from the original

model to understand the workings of the application, or it may be a Management Pack

automatically generated from the original management model.


Facilitating the Process – Guidance and Artifacts

The following tools and artifacts can be used to facilitate the process of designing,

implementing, deploying, and managing applications:

• Guidance. This guidance can be used at all stages of the application life cycle. Chapter 4

includes detailed architectural guidance for designing manageable applications.

Chapters 8–15 provide developer and deployment guidance. Chapters 16 and 17

provide detailed guidance for operating manageable applications.

• TSMMD. This tool is integrated with Visual Studio; it supports many of the

requirements involved in developing manageable applications. The feature set of

TSMMD includes the following:

◦ Modeling capabilities. You can use the tool to model many of the artifacts

required in a manageable application. TSMMD represents the application as a

series of related managed entities. By defining different properties of the

managed entities, you can create an abstract representation of application

health, instrumentation, and the target infrastructure.

◦ Automated generation of instrumentation code. TSMMD includes recipes for

automatically generating instrumentation code from the information in the

management model. Instrumentation code is generated in the form of

instrumentation helpers, which separate the process of instrumentation from

the application itself. This means that the application can call abstract

instrumentation, and the application developer does not have to worry about

the specifics of the instrumentation technologies being used.

◦ Validation. TSMMD supports two forms of validation. It ensures that the model

is internally consistent and does not contain orphaned elements. It also

validates that defined instrumentation is called from the application. If

instrumentation represented in the management model is not included in the

application code, the tool generates warnings in Visual Studio.

◦ Management Pack Generation. The TSMMD can generate Management Packs

for System Center Operations Manager directly, using the information about the

instrumentation stored in the Management Model.

• MMD. The Management Model Designer (MMD) is a standalone tool that can be used

to create a hierarchy of managed entities and define a health model for the application.

The MMD can also be used to create Management Packs for Microsoft Operations

Manager (MOM) 2005 and System Center Operations Manager 2007.

• Trust levels. In some cases, the architect will not know the specifics of the

deployment environment for the application. By specifying multiple trust levels for an

application, the application can support multiple deployment environments, and the decision

about which trust level to use can be deferred until run time. For more details, see

Chapter 6, "Specifying Infrastructure Requirements."

• Run-time configuration. At an architectural level, it is usually not possible to be sure

exactly how the application will be used in daily operations. Therefore, the architect

should support flexible operations by providing run-time configuration of the

application. Typically, manageable applications need to support run-time configuration

of instrumentation so the operations team can turn instrumentation on and off in

real time and modify the granularity level of instrumentation (a sketch appears after this list).


• Management Packs. Management Packs provide a predefined, ready-to-run set of

rules, monitoring scripts, and reports that encapsulate the knowledge required to

monitor, manage, and report about a specific service or application. A Management

Pack monitors events that are placed in the application event log, system event log, and

directory service event log by various components of an application or subsystem. The

rules and monitoring scripts also can monitor the overall health of an application or

system and alert you to critical performance issues in several ways:

◦ They can monitor all aspects of the health of that application or system and its

components.

◦ They can monitor the health of vital processes that the application or system

depends on.

◦ They can monitor service availability.

◦ They can collect key performance data.

◦ They can provide comprehensive reports, including reports about service

availability and service health and reports that you can use for capacity

planning.
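
As an illustration of the run-time configuration point above, the following minimal sketch uses the standard System.Diagnostics switch mechanism; the switch and member names are assumptions. The operations team sets a level in the switches section of the configuration file, and the application re-reads the switch values on demand, so verbosity changes take effect without a restart.

```csharp
using System.Diagnostics;

// Illustrative sketch: re-reading trace switch values at run time so the
// operations team can change instrumentation granularity while the
// application is running. "OrderProcessing" is an assumed switch name
// defined in the application configuration file.
public static class InstrumentationConfig
{
    private static readonly TraceSwitch Verbosity =
        new TraceSwitch("OrderProcessing", "Order processing instrumentation");

    // Called when the operator edits the configuration file (for example,
    // in response to a FileSystemWatcher event); Trace.Refresh re-reads
    // the switch values without restarting the application.
    public static void Reload()
    {
        Trace.Refresh();
    }

    public static bool VerboseEnabled
    {
        get { return Verbosity.TraceVerbose; }
    }
}
```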

Figure 3 illustrates how the guidance and other artifacts provided can be used to facilitate the

process.


Figure 3 The process showing guidance and artifacts

Northern Electronics Scenario

The solutions architect has decided on the overall design of the Northern Electronics shipping

application and plans to use a management model to help him ensure that the application is

manageable by the operations team.

The solutions architect has read this guide and has decided to apply its guidance when creating

the Northern Electronics shipping application. He has identified the other key stakeholders

involved in the application and needs input from them to create the management model. The

operator provides information about the manageability requirements for the application, and

the infrastructure architect helps the solutions architect define the likely target deployment

environment.


The four key roles involved in designing, developing, deploying, and operating the Northern

Electronics shipping application are as follows:

• Solutions architect. The solutions architect will use the information obtained from the

operator and the infrastructure architect to create a management model for the

application. The solutions architect has decided in this case to use both the TSMMD and

MMD. These tools will provide the solutions architect with the following functionality:

◦ Modeling. The Northern Electronics shipping application will be represented as

managed entities, which map to the operations view of the application. The

solutions architect will also model health and instrumentation for the

application, using a combination of the TSMMD and MMD tools.

◦ Model validation. The validation feature of the TSMMD tool will help the

solutions architect ensure that he has created an internally consistent

management model for the Northern Electronics application.

• Developer. The senior developer was worried about his lack of experience in creating

manageable applications, but he is now confident that the TSMMD tool will help in

development. He will use the following functionality provided with the tool:

◦ Code generation. Automatic generation of code for instrumentation helps

ensure that the code for the application has no errors and conforms to the

requirements of the solutions architect.

◦ Code validation. This helps the developer ensure that he calls the

instrumentation code from the application.

• Infrastructure architect. The infrastructure in which the Northern Electronics shipping

application will run is changing, and the infrastructure architect is currently unsure about the

trust level the application should support. Therefore, he has specified a requirement

that the application should run successfully in both a low trust environment and a high

trust environment. The decision about which trust level to use in the application will be

made at deployment time.

• Operator. The senior operator has experience operating business applications at

Northern Electronics; as such, he has some requirements that he has communicated to

the solutions architect. These include the following:

◦ The shipping application must be developed with the operations team in mind.

◦ Events must be relevant to the operations team.

◦ Instrumentation must be configurable at run time.

◦ The application must be manageable from System Center Operations Manager

2007. The MMD tool can be used to generate a Management Pack for the

application that is useable in System Center Operations Manager 2007.

Summary

This chapter examined a high-level process for designing, developing, deploying, and operating

manageable applications. It examined the roles that participate in that process and the

responsibilities that each role holds. It also examined the artifacts that are available to facilitate

the process of designing manageable applications.


Section 2

Architecting for Operations

This section examines the architectural principles that should be followed when designing

manageable applications. It examines management models and looks in detail at modeling

health and instrumentation. It also captures best practices for instrumenting applications, and it

discusses how to instrument applications that may be deployed to different infrastructures.

Lastly, it shows how to use the Team System Management Model Designer Power Tool

(TSMMD) to create a management model for an application.

This section should be of use primarily to solutions architects and infrastructure architects.

Chapter 3, "Architecting Manageable Applications"

Chapter 4, "Creating Effective Management Models"

Chapter 5, "Proven Practices for Application Instrumentation"

Chapter 6, "Specifying Infrastructure Trust Levels"

Chapter 7, "Specifying a Management Model Using the TSMMD Tool"


Chapter 3

Architecting Manageable Applications

There are a number of significant challenges the architect faces when determining how to

design a manageable application. This chapter examines the fundamental design principles that

must be addressed. It then demonstrates the structure of a manageable application. Finally, it

shows how creating a management model for the application can simplify the work of the

architect and other members of the development team.

Designing Manageable Applications

To design manageable applications, architects should adhere to a number of fundamental

design principles, including the following:

• A management model should be defined for the application. A management model

provides a single authoritative source of knowledge about an application. As a

minimum, the management model you define should capture the dependencies

between different parts of the application, the logical flow of the application, the

instrumentation that will be used to support effective operations, and artifacts that will

be used to measure application health. The architect of the application is the individual

most likely to understand how the application operates and, more importantly, fails to

operate. By describing this in a management model, the architect can communicate to

developers what instrumentation is required and communicate to operations how the

application can be managed.

• The application should expose comprehensive, relevant instrumentation.

Instrumentation should provide information that is consistent with the operations view

of the application. Coarse-grained instrumentation can be provided to indicate the

health state of the application, and additional fine-grained instrumentation can provide

supporting diagnostic information to help troubleshoot application problems. Where

possible, instrumentation should also reflect the relationship between the application,

the platform, and the underlying hardware; this allows the operator to relate problems

at a lower level with changes in the health of the application.

• The application should be designed so that its health can be accurately determined. A

number of factors contribute to accurately determining the health of an application. A

well instrumented application will provide information that can be evaluated against a

rule set to determine application health. In some cases, it may be necessary to provide

additional health indicators, such as an application heartbeat or support for a


comprehensive health check of a managed entity. However, while these indicators may

be part of the application itself, the entity responsible for measuring application health

should be separate from the application.

• The design of the application should support separation of concerns. The solutions

architect, infrastructure architect, and developer will all have a role in designing a

manageable application. Therefore, it is very important for the design of the application

to allow each role to separate their specific concerns from other roles.

• The application instrumentation should be isolated from the rest of application code.

The architect should make informed choices about the instrumentation technologies to

use, and enforce the use of those technologies by isolating the instrumentation code in

an instrumentation helper. In this case, the application developer calls only abstract

instrumentation code, and this is mapped to concrete instrumentation technologies. For

more details, see Chapter 5, "Proven Practices for Application Instrumentation." A sketch of this

pattern appears after this list.

• The application should be designed with the target environment (or environments) in

mind. Some instrumentation technologies cannot be used in low trust environments

because they require the application to have a higher level of trust than is available.

Abstracting the specifics of instrumentation can allow for increased flexibility in this

area. In cases where the architect knows the nature of the deployment environment

ahead of time, the appropriate concrete instrumentation can be mapped to the

abstract representation of the instrumentation. In other cases, increased flexibility will

be needed, and the decision about the specific instrumentation technology used must

be deferred until the application is deployed.

• The application should provide configuration options useful to the operations team.

The information provided by extensive instrumentation is of use to the operations team

only if they can perform an action based on that information. In some cases, the

operations team will need to restart the application or individual services. In other

cases, it may be possible to make other real-time changes to application configuration

to solve a problem. Instrumentation information that is closely related to configuration

options is more relevant to operations. It should also be possible to configure the

instrumentation options themselves—for example, to increase the amount of

information that is reported when troubleshooting a problem. Where possible,

configuration settings should be constrained to ensure that the operations team does

not create incorrect settings or change the wrong settings.

• Application code should be auto-generated where appropriate. Auto-generating code

according to the requirements defined by the application architect can increase

efficiency by saving time and reducing errors. The architect can use the management

model to define in some detail the instrumentation requirements for the application;

this allows much of the instrumentation code in the application to be auto-generated

from the pre-defined requirements.


• The application design should conform to effective, proven design principles. When

designing a manageable application, architects have to consider additional factors that

affect the design of the application. However, this does not prevent the architect from

designing the application so that it adheres to existing, proven design patterns. You

should always make sure your application is well designed throughout, and considering

key design patterns during the application design process helps ensure this.
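
The instrumentation isolation principle described in the list above can be sketched as follows. The interface, its members, and the event source name are illustrative assumptions; this is a hand-written equivalent of the kind of instrumentation helper that the TSMMD can generate.

```csharp
using System.Diagnostics;

// Illustrative sketch of isolating instrumentation: application code
// raises abstract events through an interface, and the mapping to a
// concrete technology lives in one replaceable helper class.
public interface IShippingInstrumentation
{
    void OrderReceived(int orderId);
    void TransportServiceUnavailable(string reason);
}

// One possible mapping: the Windows Event Log. A low-trust deployment
// could substitute an implementation based on trace listeners instead.
public class EventLogInstrumentation : IShippingInstrumentation
{
    private const string Source = "NorthernShipping"; // assumed event source

    public void OrderReceived(int orderId)
    {
        EventLog.WriteEntry(Source,
            "Order " + orderId + " received.",
            EventLogEntryType.Information);
    }

    public void TransportServiceUnavailable(string reason)
    {
        EventLog.WriteEntry(Source,
            "TransportService unavailable: " + reason,
            EventLogEntryType.Error);
    }
}
```

Because the application depends only on the interface, the architect can change or add concrete instrumentation technologies without touching application code.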

Representing Applications as Managed Entities

A managed entity is any logical part of an application that a system administrator needs to

configure, monitor, and create reports about while managing that application or service.

Examples of managed entities are a Web service, a database, an Exchange routing group, an

Active Directory site, a computer, a server role, a network device, a hardware component, or a

subnet.

It is important to understand that administrators will evaluate what to monitor and what actions

to take, based on the importance of an application in meeting the business needs of their

organization. They will not base these decisions on how the software is physically built or on any

internal organizational divisions that may have impacted its design. For these reasons, when

defining a managed entity, it is a good practice to use internal architectural design documents as

a starting point, focusing on the logical objects and relationships that operators will understand.

Every managed entity that makes up your application performs a discrete set of operations.

These operations are interesting from a monitoring, dependency, and troubleshooting

perspective, but they are not objects that an administrator would think of as being able to

configure or manipulate like a managed entity. The collection of these subdivisions is referred to

as the aspects of a managed entity.

Typically, managed entities have relationships with other managed entities. The relationships

indicate the dependencies that exist between different managed entities. For example, consider

an application that consists of a number of Web services and databases. Figure 1 illustrates the

dependencies between the different entities that make up the application.


Figure 1 Dependencies between entities

Relationships between managed entities are very important in a management model. These

relationships can directly affect the health, instrumentation, and performance of an entire

system. For example, in Figure 1, the Products Web service is dependent on the Products

database and the Transport Web service. This means that a change in the health state of the

Transport Web service may affect the health of the Products Web service.

The way these relationships are specified depends on the tooling used to represent the

management model. For example, the Management Model Designer (MMD) tool (which focuses

predominantly on health) enforces a parent-child hierarchy between managed entities and uses

the relationship to determine the health of a managed entity. In this case, the health of child

managed entities is rolled up to provide an indication of the health of a parent managed entity.

By contrast, the Team System Management Model Designer Power Tool (TSMMD) tool (which

focuses predominantly on instrumentation) does not use a parent-child relationship.
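
To illustrate how a parent-child health rollup of the kind the MMD enforces might work, here is a simplified sketch (not the MMD's actual implementation): the effective state of a parent entity is the worst state reported by the entity itself or by any of its children.

```csharp
using System.Collections.Generic;

// Simplified sketch of parent-child health rollup: a parent managed
// entity is only as healthy as its least healthy descendant.
public enum HealthState { Healthy = 0, Degraded = 1, Failed = 2 }

public class ManagedEntity
{
    private readonly List<ManagedEntity> children = new List<ManagedEntity>();

    public string Name { get; set; }
    public HealthState OwnState { get; set; }

    public void AddChild(ManagedEntity child)
    {
        children.Add(child);
    }

    // Rolls the worst child state up into this entity's effective state.
    public HealthState EffectiveState()
    {
        HealthState worst = OwnState;
        foreach (ManagedEntity child in children)
        {
            HealthState childState = child.EffectiveState();
            if (childState > worst)
            {
                worst = childState;
            }
        }
        return worst;
    }
}
```

In a model like the one in Figure 1, a Failed state reported by the Transport Web service would then surface in the rolled-up state of the Products Web service.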

Advantages of Using Managed Entities

Dividing applications into managed entities brings a number of advantages to an application

architect creating manageable applications, including the following:

• It provides an operations view of the application.

• It ensures that instrumentation is sufficient.

• It provides a close mapping to configuration.


The next sections describe each of these items in more detail.

Providing an Operations View of an Application

Effective operations often rely on a "divide and conquer" approach. By dividing the operations

environment into a series of interdependent managed entities, the operations team can quickly

diagnose the source of any problem and determine the effects of the problem.

This approach has proven effective for the infrastructure on which applications run, but

applications themselves are often viewed as single, irreducible units. If the application itself can

be represented as discrete managed entities, supported with comprehensive instrumentation,

the architect allows the operations team to isolate the particular managed entity that has the

problem. If the developer also allows the service to be configured, the operations team may be

able to fix problems without contacting the developer.

Separating the application into managed entities also allows you to determine and document

additional information that is highly useful to the operations team. For example, managed

entities will often have dependencies on one another and on other external managed entities

(such as a partner Web service). By isolating these dependencies, you can determine the effect

that the failure of a managed entity will have on the functionality of the entire application.

It should also be possible to determine a logical flow for the application, which shows how the

managed entities communicate with one another in the course of normal business operations.

This information will also help the operations team determine the effects of a failure in a managed

entity.

Ensuring That Instrumentation Is Sufficient

One of the great challenges of designing manageable applications is providing comprehensive,

relevant instrumentation for the application. That task is made significantly easier in

applications represented as managed entities. The managed entities and relationships between

them should already reflect an operations view of the application, so events associated with

managed entities should be relevant to operations personnel. If events are associated with all

managed entities and with communications between managed entities, those events should

capture all the operations of the application and can be used as a basis for determining the

health state of the application.

Close Mapping to Configuration

Configuration provides a way for the operations team to diagnose and troubleshoot problems,

to improve application performance, to specify an appropriate level of instrumentation at run

time, and to alter configuration to reflect changes in the underlying application environment. As

already discussed, managed entities provide an operations view of an application, so the

developer should ensure that each managed entity has its own configuration settings. This helps


to ensure that the operator is provided with a consistent view of the application, centered on

managed entities, and does not need to understand a different application model for

configuration purposes.

Benefits of Defining a Management Model for the Application

Nothing about the design of a manageable application compels you to create a management

model. However, using a management model as the starting point dramatically simplifies the

process of application design. The management model allows the architect to capture important

information about the application, which can then be used as the basis for further development

of a manageable application. For more information about management models, see Chapter 4,

"Creating Effective Management Models."

The artifacts discussed in this chapter can all be modeled using the TSMMD tool, including the

managed entities, the abstract instrumentation, and the mappings between abstract

instrumentation and concrete instrumentation. The tool can then be used to generate

instrumentation code from the model. The standalone MMD tool can also be used to define

aspects of a management model. For more information about using each of these tools, see

Chapter 7, "Specifying a Management Model Using the TSMMD Tool."

Designing, Developing, Deploying, and Maintaining Manageable Applications: Refining the Process

Chapter 2, "A High-Level Process for Manageable Applications," outlined a high-level process for

designing, developing, deploying, and maintaining manageable applications. However, this

process can be refined using the additional guidance and tooling introduced in this chapter. The

process can now be summarized as follows:

1. Use the architectural guidance contained in this guide to determine how to design

your application.

2. Use the TSMMD tool to create an operations view of the application and to model

health and instrumentation artifacts for the application.

3. Generate instrumentation code for the application from the model.

4. Call abstract events from the application code.

5. Build the application, including the instrumentation helper.

6. Test the application.

7. Deploy the application.

8. Manage the application.


More information about the stages of this process can be found throughout the rest of this

guide. For a detailed walkthrough of using the TSMMD tool, see Chapter 7, "Specifying a

Management Model Using the TSMMD Tool."

Northern Electronics Scenario

The solutions architect of Northern Electronics has defined the following managed entities for

the application:

• One managed entity for each Web service in the application:

◦ ShippingService

◦ PickupService

◦ TransportService

• One managed entity for each database used by the application:

◦ Transport

◦ Shipping

• One or more managed entities for each workstation application that communicates

with a Web service:

◦ WarehouseClient. This corresponds to the application running on the

warehouse workstation.

◦ PickupConfirmationClient. This corresponds to the application running on the

loading dock workstation.

◦ OrderClient. This corresponds to the application running on the shipping clerk

workstation.

◦ PickupNotificationClient and ProcessOrderClient. These correspond to the

application running on the transport office workstation.

The application running on the Transport Office workstation has two distinct pieces of

functionality of concern to the operations team. The solutions architect has decided to reflect

this by representing the application as two separate managed entities.

The solutions architect plans to use these managed entities as the basis for an application

management model. As a minimum, he plans to define abstract events and measures for each

managed entity, along with default instrumentation levels for each abstract event, and

mappings to concrete instrumentation technologies. He will also define trust levels for the

application and health states for the application.


Summary

This chapter examined the overall design of a manageable application and discussed the design

principles that should be adhered to when architecting manageable applications. It also used

these principles to refine the high-level process previously discussed in Chapter 2, "A High-Level

Process for Manageable Applications," and provided additional information about the Northern

Electronics Scenario.


Chapter 4

Creating Effective Management Models

Creating a management model is a key part of designing manageable applications.

Comprehensive management models provide an abstract representation of all knowledge about

the application; they do this by capturing information that is relevant to the successful

management of the application. Management models ensure that manageability is built into

every service and application; they also ensure that management features are aligned with the

needs of the administrator who will be running the application. As a result, they can

dramatically simplify the deployment and maintenance of applications in a distributed IT

environment.

Information contained in a comprehensive management model for an application has a number

of uses for the operations team, including the following:

• It provides operations with a broader view of the applications they need to maintain by

encapsulating all the information about an application in a coherent, organized manner.

• It provides an abstraction of day-to-day operations from low-level technologies. For

example, if a database that forms part of a business application fails, the operations

team will often have to examine low-level events in a SQL log to determine the cause of

a problem. However, if the management model encapsulates the functionality of the

application, a management tool can be used to diagnose and correct the problem.

• It demonstrates how the various technologies that form a solution relate to one

another in operations.

• It predicts the impact of proposed changes to the environment.

• It provides effective troubleshooting information and a detailed view of issues,

including the impact of any problem.

• It provides well-defined, prescriptive configurations for deployment.

• It automates operations with pre-defined command line tools and scripting.

The output from a management model can form the basis for the definition of many artifacts

required during development, including instrumentation and health artifacts. This ultimately

leads to well-designed application instrumentation that supports full monitoring, diagnosis, and

troubleshooting by IT operations staff. Effective management models can also reduce the time

needed to adopt a new application, because operations staff will have a more thorough

understanding of the application architecture.

Management models should represent the application as comprehensively as possible.

However, even a partial management model can be very useful in creating a manageable


application. This chapter discusses the elements that make up a comprehensive management

model, and then it discusses in more detail two of the key areas that the rest of this guide will

focus on: instrumentation and health.

Benefits of Using Management Models

Creating comprehensive management models provides a total system view and many

benefits, including the following:

• All interrelated software and hardware components managed by the administrator can

be captured in a single source.

• Prescriptive configurations and best practices can be captured in a single knowledge

base; this allows changes to the system to be tested before the changes are

implemented.

• The infrastructure that holds the system model captures and tracks the configuration

state, so administrators do not have to maintain it in their heads.

• Administrators do not have to operate directly on the real-world systems; instead, they

can model changes before committing them. This allows "what if" questions to be tried

out without impacting the business.

• Knowledge of the total system view can improve over time. When the system is

developed, basic rules and configurations are defined. As the system is deployed, the

details of the configuration and environmental constraints or requirements are added.

As operational best practices are developed or enhanced, they can also affect the

model.

• The management model becomes the point of coordination and consistency across

administrators who have separate but interdependent responsibilities.

Management Model Views

Typically, management models are consumed in different ways, so a comprehensive

management model must capture the elements of a system from a number of different views.

The management model should encapsulate the following two common views:

• Layered view. Applications have dependencies on many other layers of technology,

including databases, operating systems, and hardware. These layers should all be

captured in a comprehensive management model, so the impact of changes in a

particular layer can be understood.

• Administrative view. Typically, administrative responsibilities are split between

different administrative roles. It is important for the management model to capture

these roles, so a problem can be assigned to the appropriate team for resolution.

Administrative responsibilities could include the client computer desktop, network,

Active Directory, database, and application.

After a comprehensive management model is in place, management of the complete system can

be performed through the model.


Comprehensive Management Models

Creating a comprehensive management model consists of modeling in a variety of different

areas to provide a total system view, including the following:

• Configuration modeling. This involves encapsulating all the settings that control the

behavior or functionality of an application or system component.

• Task modeling. This involves cataloging the complete list of tasks that administrators

have to perform to administer and manage a software system or application.

• Instrumentation modeling. This involves capturing the instrumentation used to record

the operations of a system or application. Instrumentation provides information to the

operations team to increase understanding about how the application functions, and to

diagnose problems with an application.

• Health modeling. This involves defining what it means for a system or application to be

healthy (operating normally) or unhealthy (operating in a degraded condition or not

working at all). A health model represents logically the parts of an application or service

the operations team is responsible for keeping operational.

• Performance modeling. This involves capturing the expected baseline performance of

an application. Performance counters can then be used to report and expose

performance on an ongoing basis, and a monitoring tool can compare this performance

to the expected performance.
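
As a simplified illustration of the performance modeling bullet, the following sketch compares a live counter against a baseline captured in the model. The counter, thresholds, and names are assumptions, and, as Chapter 3 noted, the entity that evaluates application health in this way would typically be a monitoring tool rather than the application itself.

```csharp
using System.Diagnostics;

// Illustrative sketch: deriving a health indication from the gap between
// measured performance and the modeled baseline.
public enum PerformanceHealth { Healthy, Degraded, Failed }

public static class PerformanceBaseline
{
    // Expected average order-processing time from the performance model
    // (the value is an assumption for illustration).
    private const float BaselineMs = 500f;

    public static PerformanceHealth Evaluate(PerformanceCounter avgOrderTimeMs)
    {
        float current = avgOrderTimeMs.NextValue();

        if (current <= BaselineMs)
        {
            return PerformanceHealth.Healthy;
        }

        // Within twice the baseline counts as degraded; beyond that, failed.
        return current <= BaselineMs * 2
            ? PerformanceHealth.Degraded
            : PerformanceHealth.Failed;
    }
}
```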

The next sections describe each of these tasks in more detail.

Configuration Modeling

In a corporate setting, system administrators frequently have to configure thousands of client

computers and hundreds of servers in their organizations. Standardizing and locking down

configurations for client computers and servers helps simplify this complexity. Recent studies on

total cost of ownership (TCO) identify loss of productivity at the desktop as one of the largest

costs for corporations. Lost productivity is frequently attributed to user errors, such as

modifying system configuration in ways that render applications unworkable, or to complexity,

caused by non-essential applications and features on the desktop. Configuration modeling

attempts to address this problem by capturing all the settings that control the behavior or

functionality of an application or system component.

Configuration modeling addresses only those settings that are controllable by an administrator

or an agent. Typically, a configuration model captures the valid configuration settings for client

computers and users, and also for member servers and domain controllers in an Active Directory

forest.

In many cases, configurations will be standardized and centrally managed using technologies

such as Group Policy or Systems Management Server (SMS).
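To make this concrete, a configuration model can be expressed in code as a catalog of administrator-controllable settings, each with a default value and a validation rule. The following C# sketch is illustrative only; the setting names and valid ranges are hypothetical:

using System;
using System.Collections.Generic;

// A single administrator-controllable setting, as captured by a configuration model.
public sealed class ConfigurationSetting
{
    public readonly string Name;
    public readonly object DefaultValue;
    private readonly Predicate<object> isValid;

    public ConfigurationSetting(string name, object defaultValue, Predicate<object> isValid)
    {
        Name = name;
        DefaultValue = defaultValue;
        this.isValid = isValid;
    }

    // Returns true if a proposed value is within the valid range for this setting.
    public bool Validate(object value)
    {
        return isValid(value);
    }
}

public static class SampleConfigurationModel
{
    // The model enumerates every setting an administrator (or an agent) may change.
    public static readonly List<ConfigurationSetting> Settings = new List<ConfigurationSetting>
    {
        new ConfigurationSetting("MaxQueuedRequests", 500,
            delegate(object v) { return v is int && (int)v > 0 && (int)v <= 10000; }),
        new ConfigurationSetting("RetryIntervalSeconds", 30,
            delegate(object v) { return v is int && (int)v >= 5 && (int)v <= 3600; })
    };
}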


For an application to be managed using Group Policy, that application must have built-in

support for Group Policy. Applications built with Enterprise Library can be managed through

Group Policy by using the Group Policy support that Enterprise Library provides.

Task Modeling

Administrators typically must learn to use multiple tools to achieve a single administrative task.

Task modeling helps address this problem by enumerating the activities that are performed

when managing a system as defined tasks. These may be maintenance tasks, such as backup,

event-driven tasks, such as adding a user, or diagnostic tasks performed to correct system

failures. Defining these tasks guides the development of administration tools and interfaces and

becomes the basis for automation. The task model can also drive self-correcting systems when

used in conjunction with instrumentation and health models.

Task models describe administration of a component or application in terms of tasks. Tasks

are defined as complete

actions that accomplish a goal that has a direct value to the administrator. They enable task-

based administration; this makes it easier to define, enforce, and delegate responsibilities to

different system administrators. In the future, task models will provide a foundation for role-

based access control.

Building all command-line and GUI administration tools based on the same task model can

dramatically lower the time and effort required to learn how to manage Windows operating

systems, server applications, and client applications. It also enables automation of system

administration tasks.

The following are the most important benefits of building a task-based administration model:

• Administrative tasks can more closely reflect the operations experience. The

administration of applications is described in terms of tasks that are understandable by

system administrators instead of simply reflecting the way in which the application was

developed.

• User experiences are consistent with the administrative tools. Administrative tools

may be GUI-based snap-ins, command line-based utilities, or scripts (for example,

Powershell scripts). Consistency between all these administrative tools allows

administrators to start working with the system using easy-to-understand GUI tools,

and then directly use this knowledge to manage applications with command-line tools

and build automated management scripts.

• Role-based administration is easier to implement. Task models can be the foundation

for implementing role-based administration for your application. Role-based

administration allows you to simplify the access control list (ACL) complexity that exists

today. Task models provide a simplified method for assigning and grouping

responsibilities and access rights. A user role can then be defined as a collection of

tasks. Being a member of a particular user role simply implies being allowed to perform

a set of tasks, as the sketch after this list illustrates.

• System management costs for your software are easier to estimate. Each task in the

task model has an associated cost. The cost of executing a


task depends on different factors, such as how frequently the task should be

performed, how long it takes to do it, the skill level of the person who runs it, and so on.

It currently takes a substantial amount of effort to gather these statistics. Capturing this

data in task models allows you to do the following:

◦ Calculate the management cost for your product.

◦ Compare it to the management cost of the previous version or a competitor’s

product.

◦ Show your customers the financial benefits of migrating to the new version.

◦ See what tasks cost your customers the most to perform.
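As a minimal C# sketch of the role-based idea above, a role can be modeled as a named collection of tasks, and authorization reduces to checking task membership. The role and task names here are hypothetical:

using System;
using System.Collections.Generic;

// A user role defined as a collection of tasks; membership of the role
// simply implies being allowed to perform those tasks.
public sealed class AdminRole
{
    public readonly string Name;
    private readonly HashSet<string> tasks;

    public AdminRole(string name, IEnumerable<string> tasks)
    {
        Name = name;
        this.tasks = new HashSet<string>(tasks);
    }

    public bool CanPerform(string task)
    {
        return tasks.Contains(task);
    }
}

public static class TaskModelExample
{
    public static void Main()
    {
        AdminRole databaseAdmin = new AdminRole("DatabaseAdministrator",
            new[] { "BackupDatabase", "RestoreDatabase", "AddUser" });

        Console.WriteLine(databaseAdmin.CanPerform("BackupDatabase"));    // True
        Console.WriteLine(databaseAdmin.CanPerform("DeployApplication")); // False
    }
}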

Instrumentation Modeling

Applications often contain minimal instrumentation or instrumentation that is not relevant to

operations. This results in applications that are difficult to manage, because the operations staff

is not provided with the information it needs to manage the application on a daily basis or to

troubleshoot issues as they occur.

Instrumentation modeling helps to ensure that appropriate instrumentation is built into the

application from the beginning. An instrumentation model allows you to discover the

appropriate instrumentation requirements and then implement this instrumentation within the

application.

Benefits of instrumentation modeling include the following:

• It makes the task of developing the instrumented application more straightforward

for the application developer. The application architect can create the instrumentation

model in abstract form in advance of the development process, clearly defining the

nature of instrumentation required in the application.

• It provides relevant feedback about the application to the operations staff. Well-

designed instrumentation will correlate closely to the operations view of the application

and assist in daily operations tasks. In other words, it will correspond directly to the

configuration and task models. At a deeper level, instrumentation will provide

diagnostic information that the operations team can use to troubleshoot applications

problems.

• It provides feedback about the application to the application developers.

Instrumentation can also provide information to a developer that is directly relevant to

the design of the application. This makes application testing easier, and it reduces the

costs of future development cycles. This type of administration is generally hidden from

operations.

Health Modeling

Health modeling defines what it means for a managed entity to be healthy or unhealthy. Good

information about the health state of an application or system is necessary for maintaining,

diagnosing, and recovering from errors in applications and operating systems deployed in

production environments.


Health modeling uses instrumentation as the basis on which monitoring and automated

recovery are built. Frequently, information is supplied in a way that has meaning for developers

but does not reflect the user experience of the administrator who manages, monitors, and

repairs the application or system day to day. Health models allow you to define both what kinds

of information should be provided and how the administrator and the application or system

should respond.

When customers are evaluating a new application, they expect to receive important information

about its capabilities, along with deployment and setup instructions. However, they frequently

are never given the guidance or tools to operate that software on a daily basis after it is

deployed.

Providing the correct view of an application, what it looks like when it is functioning normally

and when it is not functioning normally, and providing the correct knowledge to troubleshoot

issues to IT and operations customers allows them to meet their service level agreements (SLAs)

to their own customers. Troubleshooting guidance and automated monitoring capabilities

delivered to customers when an application is released will substantially improve the adoption

and deployment rates for any new or updated application. Customers will be more comfortable

and confident in deploying new technology when they can monitor how it is performing in

production and know how to get out of trouble quickly when something goes wrong.

Most problems that impact the service delivery of an application could be fixed before the

problem is visible to end users. Effective health modeling ensures that the operations team

thoroughly understands what affects the health of their system, so problems can be detected

before service is impacted and troubleshooting and resolution can be automated as much as

possible. When a problem is detected, the management model facilitates a thorough diagnosis

and a proper solution. Health modeling also enables the operator to take preventive-care

measures before problems occur to maximize system up-time.

Performance Modeling

Performance modeling is used to capture the expected performance of a system, defining a

baseline that can be measured against in the future. Performance modeling is closely related to

instrumentation modeling (performance counters are a form of instrumentation) and health

modeling (an application that is performing poorly compared to a pre-determined baseline is

typically considered to be unhealthy).

Performance modeling is useful in capacity planning because it can be used to help determine

expected performance when a system is put under stress or when the configuration of a system

is changed in some way.

A monitoring tool is normally used to measure an application against the performance

information in a management model. When the monitoring tool detects that the application is

not responding or is failing to meet the expected performance level, it can raise an alert to the

operations staff and send an e-mail message. Operators can check the performance and event


logs to get diagnostic information about the problem that will help them recover the application

in the shortest possible time.
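The comparison that such a monitoring tool performs can be sketched in a few lines. The following C# fragment assumes the performance model baseline is captured as named counter thresholds; the counter names and values are hypothetical:

using System;
using System.Collections.Generic;

public static class BaselineMonitor
{
    // Expected baseline from the performance model: counter name -> maximum healthy value.
    private static readonly Dictionary<string, float> Baseline = new Dictionary<string, float>
    {
        { "RequestExecutionTimeMs", 500f },
        { "QueuedRequests", 100f }
    };

    public static void Check(string counterName, float observedValue)
    {
        float expectedMax;
        if (Baseline.TryGetValue(counterName, out expectedMax) && observedValue > expectedMax)
        {
            // A real monitoring tool would raise an alert and notify operations staff,
            // for example by e-mail; writing to the console stands in for that here.
            Console.WriteLine("ALERT: {0} = {1} exceeds baseline {2}",
                counterName, observedValue, expectedMax);
        }
    }
}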

Modeling Instrumentation and Health

Ideally, a management model should encapsulate all knowledge of a system. However, in cases

where this is not possible, as a minimum the management model should include health and

instrumentation information. Health and instrumentation are intrinsically linked together,

because instrumentation is important to determining the health of any system. Therefore, this

section starts with how to define an effective instrumentation model, and then it moves on to

discuss health models.

Effective Instrumentation Modeling

Instrumentation is responsible for exposing the internal state of an application or system.

Instrumentation, along with additional health indicators, can be used to reveal the health of an

application by capturing a transition between a healthy and an unhealthy state.

However, not all instrumentation is directly related to health. One service sending a service

request to another service may be instrumented, but this alone does not indicate the health state of the

application. Instrumentation can reveal more detailed information that may be used for a

number of purposes, including the following:

• It can provide performance information for the application.

• It can demonstrate usage trends for an application (for example, to support capacity

planning).

• It can show whether service-level agreements (SLAs) have been met.

• It can provide a basis for usage charges.

• It can provide information that can be used to troubleshoot application problems.

• It can reveal security breaches.

Your management models should capture the abstract instrumentation requirements for your

application. The developer can then use these requirements to create the corresponding

instrumentation artifacts.

Types of Instrumentation

Typically, instrumentation takes one of two forms in an application:

• Performance counters

• Events

When determining how to support manageability, you should consider how operations will

consume the instrumentation you create. Instrumentation created by the developer may be

consumed in a relatively raw form by the operator—for example, by examining event logs or by

using a low-level tool or script to examine Windows Management Instrumentation (WMI)

events. However, particularly in larger organizations, the operator may have access to a tool


such as Microsoft Operations Manager (MOM), which allows him or her to see the information

in a more structured way, and can automate many of the processes of effective operations, such

as creating rule sets and issuing alerts.

Performance Counters

Performance counters provide continuous metrics for specific processes or situations within the

system. For example, a performance counter may indicate the current processor usage as a

percentage of its maximum capacity or the percentage of memory available. The metric can also

be an absolute value instead of a percentage, such as the number of current connections to a

database, or the number of queued requests for a Web server.

The operating system and the default services, such as Internet Information Server (IIS) and the

Common Language Runtime (CLR), expose built-in performance counters. In general, you should

aim to use these where possible, complementing them with custom performance counters only

where necessary. For example, your management model should specify use of the built-in IIS

Request Execution Time counter if this can provide the information required by the

management model. In this case, adding an equivalent custom counter will simply add to the

load on the server; it will not achieve anything extra.

Built-in counters cover a wide range of processes in IIS, ASP.NET, the CLR, and SQL Server. For

a complete list of these counters, see "Windows Server 2003 Performance Counters

Reference" on Microsoft TechNet at

http://technet2.microsoft.com/WindowsServer/en/library/3fb01419-b1ab-4f52-a9f8-09d5ebeb9ef21033.mspx.
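The following C# sketch illustrates this guidance with the standard System.Diagnostics types: it samples a built-in counter and registers a custom counter only for an application-specific metric. The custom category and counter names are hypothetical, and creating a category requires administrative rights, so it is normally done at installation time:

using System.Diagnostics;
using System.Threading;

public static class CounterExample
{
    public static void Main()
    {
        // Built-in counter: no custom code is needed to expose this metric.
        PerformanceCounter cpu = new PerformanceCounter("Processor", "% Processor Time", "_Total");
        cpu.NextValue();       // the first sample always returns 0
        Thread.Sleep(1000);
        System.Console.WriteLine("CPU: {0:F1}%", cpu.NextValue());

        // Custom counter: only where no built-in counter provides the information.
        const string category = "Sample Shipping Application";
        if (!PerformanceCounterCategory.Exists(category))
        {
            CounterCreationDataCollection counters = new CounterCreationDataCollection();
            counters.Add(new CounterCreationData("Shipping Requests/sec",
                "Rate of shipping requests received.",
                PerformanceCounterType.RateOfCountsPerSecond32));
            PerformanceCounterCategory.Create(category, "Shipping application counters",
                PerformanceCounterCategoryType.SingleInstance, counters);
        }

        PerformanceCounter requests = new PerformanceCounter(category, "Shipping Requests/sec", false);
        requests.Increment(); // call once per request handled
    }
}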

Events

Monitoring tools can read the event logs of each server in a distributed application and use this

information to raise alerts and send e-mail messages to specified groups of operators when

problems occur. They can also indicate recovery from a problem, which allows operators to

verify that resolution of a problem was successful. Events may take many forms, including

Windows Event Log events, WMI events, and trace statement file entries.

You should consider specifying events for all possible state transitions; operators can filter those

that are of interest. To allow filtering to take place in the monitoring environment, events must

specify a severity and a category in addition to the description and, where possible, recovery

information. Events can also specify security levels; in this case, filtering can take place based on

an operator's security status.

You can use events to indicate non-error conditions if this is appropriate for your application, or

if it is necessary to indicate state changes. To indicate a state change, you can arrange for a

service to raise an event when it starts and again when it completes processing of each Web

service request. An event handler can then be used to determine the average number of

requests in a particular period, the average request time, and the total number of requests. If

these values reach some pre-defined threshold, the event handler then raises another event.


In this case, you are using events to implement a counter, and then monitoring the counter

within your code. However, the overall result is that, in line with the principles of health

monitoring, your application raises an event to the monitoring system that indicates a state

transition. Your management model will simply indicate that the specified process can undergo

a state transition and the parameters that indicate when this state transition takes place.
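As a minimal illustration, the following C# sketch raises a Windows Event Log event carrying a severity, an event ID, and a category so that operators and monitoring tools can filter it. The source name, event ID, and category value are hypothetical:

using System.Diagnostics;

public static class StateTransitionEvents
{
    private const string Source = "SampleShippingService";

    public static void ReportQueueLatencyCritical(int queuedRequests)
    {
        // Creating a source requires administrative rights and is normally
        // performed once, at installation time.
        if (!EventLog.SourceExists(Source))
        {
            EventLog.CreateEventSource(Source, "Application");
        }

        // Severity (entry type), event ID, and category support filtering.
        EventLog.WriteEntry(Source,
            "Queue latency critical: " + queuedRequests + " requests queued.",
            EventLogEntryType.Warning, 2001, 3);
    }
}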

Determining What to Instrument

Determining what to instrument in an application is a critical factor in application design.

Specifying and applying the wrong types of instrumentation, at the wrong places and in

inappropriate numbers, may provide a wealth of information but can seriously affect application

performance. The alternative, specifying too few indicators or indicators of the wrong type, can

result in state changes occurring without operators being aware of them.

Instrumentation information that is relevant to operators includes the following:

• Instrumentation directly related to actions the operator can perform to fix a problem.

This type of instrumentation is normally related to the configuration settings within an

application, or reveals a dependency causing the problem.

• Supporting information that helps the operator to diagnose a problem. For example,

performance counter information can help an operator see that a service is being

underutilized.

• Instrumentation at multiple levels that allows application issues to be related to

problems in underlying platform or hardware. In many cases, application problems can

be caused by lower-level issues. Instrumenting at multiple levels helps the operations

team to determine that this is happening.

• Information that allows the operator to determine the urgency of a task. An operator

may have 20 tasks to perform, and operations will be more efficient if priority is given

to urgent tasks.

In general, instrumentation should be comprehensive but relevant. One approach is to

instrument everything that could possibly change state in some way. However, this has a

number of disadvantages:

• It requires increased development time.

• Much of the instrumentation is irrelevant to operators.

• It can negatively impact performance of an application.

• It requires additional resources from the operations team to determine which

instrumentation is relevant (although tooling such as System Center can help in filtering

events).

Another approach is to focus instrumentation on elements that are relevant to operations. If the

application is structured according to the principles outlined earlier in this chapter, the services

that make up the application will correspond to those defined in the management model; they

will also correspond to the units of operation that can be seen and, in many cases, configured by

operations. This approach offers a number of advantages:


• It provides information directly relevant to operations, which can be acted upon.

• It requires less development time (although potentially more initial time in determining

the services).

The difficulty of the second approach is that it requires the developer to map instrumentation to

the operator's view of the application. In some cases, it may not be possible to determine

exactly what instrumentation will be most useful at run time, although involving the

infrastructure architect and administrators in the creation of the management model should

help.

The recommended approach when determining what to instrument is to perform extensive

instrumentation of all elements that could be of use to operations, and provide run-time

configurability to allow operators fine-grained control over instrumentation.

Granularity of Instrumentation

After you decide what to instrument, you need to determine the appropriate level of granularity

for instrumentation. Normally, the appropriate level of instrumentation will depend on the

current health state of an application. When an application is functioning normally, the operator

will require minimal detail to indicate successful operations. However, if the application has a

problem, or is about to have a problem, more detailed and granular information is useful.

To support this, you should consider instrumenting the application at a fine-grained level but

allowing the operator to configure the level of instrumentation that is exposed at application

runtime.

Performance Considerations

When determining how to instrument your application, you should bear in mind that monitoring

performance counters and raising events absorbs resources from the system being monitored.

As a general rule, you should ensure that monitoring does not consume more than 10 percent of

the available resources on the host.

Building Effective Health Models

An application is considered healthy if it is operating within a series of defined parameters. A

number of factors may result in a change in application health, including the following:

• Change in application configuration

• An application update

• A change in an external dependency

• A hardware change

• A network change

• Bad input to the application

• Scalability problems

• Operator error

• Change in deployment

• Malicious attack


You should always design your applications and services in a way that maximizes performance

and minimizes resource usage. Part of the definition of a healthy application is that it maximizes

performance by making appropriate use of the available infrastructure. It should do the

following:

• Release memory and resources as soon as possible.

• Minimize its footprint on the operating system and available hardware.

• Make best use of the environment and other systems and features.

• Not adversely affect other applications and services.

Health States

The overall health of an application or system is determined by the health of the managed

entities that make up the application. A managed entity is typically considered to be in any one

of three health states:

• RED. This corresponds to a failed state.

• YELLOW. This corresponds to a less than fully operational state.

• GREEN. This corresponds to normal operation within expected performance

boundaries.

In some cases, it is considered beneficial to differentiate between a failed state and an offline

state. In this case, the failed state is represented by RED and the offline state is represented by

BLACK.

Information about the health state of managed entities can be manually gathered by operators

or by management tools that allow operators to do the following:

• Detect a problem.

• Verify that the problem still exists.

• Diagnose the cause(s) of the problem.

• Resolve the problem.

• Verify that the problem was resolved.

Designing a health model entails the following:

• Build the correct application structure, which is made up of components derived from

appropriately predefined components (base classes as defined in the Common Model

Library [CML]) and the relationships between them.

• Build a hierarchy of managed entities that represent the logical services and objects the

application exposes—in a way IT professionals can understand.

• Identify the functional aspects for each managed entity that are of interest for

monitoring. For more information about aspects, see the definition of a managed entity

in Chapter 3 of this guide.

• Identify all the health states that are possible for the application.


• Identify the verification steps that need to be taken to confirm or refute whether an

aspect is in a particular health state.

• Provide the instrumentation required to detect each health state.

• Identify the diagnostic steps needed to determine the root causes for each aspect's

health state.

• Identify the recovery steps that need to be taken to resolve each root cause and return

an aspect and its parent managed entity to full health.

Health State Hierarchies

In a health model, managed entities are arranged in a hierarchical form. This allows the health

state of a child managed entity to affect the health state of a parent managed entity. Aspects

can also be collected together into aggregate aspects. Aspects may also form parent-child

relationships with managed entities.

Managed Entity Hierarchies

The managed entity hierarchy is the starting point for any health model; its structure drives the

definition, connection, and relationships for all the other concepts in the health model.

For example, consider an application that consists of a number of Web services and databases.

Figure 1 illustrates the dependencies between the different entities that make up the

application.

Figure 1 Dependencies in the example application

The following table indicates the health states of the low-level entities illustrated in Figure 1.


Entity                State     Description and effect
CustomerDatabase      GREEN     Working normally
CustomerDatabase      YELLOW    Degraded, will impact CustomerWebService
CustomerDatabase      RED       Failed, CustomerWebService will fail
ProductsDatabase      GREEN     Working normally
ProductsDatabase      YELLOW    Degraded, will impact ProductsWebService
ProductsDatabase      RED       Failed, ProductsWebService will fail
TransportWebService   GREEN     Working normally
TransportWebService   YELLOW    Degraded, will not directly impact ProductsWebService, but operators should receive a warning
TransportWebService   RED       Failed, ProductsWebService will not fail, but operators should receive a warning

The customer Web service and products Web service both have dependencies on these low-

level entities and a corresponding dependency on the health state, as shown in the following

table.

Entity               Dependencies                                                                     State     Description and effect
CustomerWebService   CustomerDatabase GREEN                                                           GREEN     Working normally
CustomerWebService   CustomerDatabase YELLOW or GREEN                                                 YELLOW    Degraded, will impact OrderApplication and ExtranetWebSite
CustomerWebService   CustomerDatabase RED, YELLOW or GREEN                                            RED       Failed, OrderApplication and ExtranetWebSite will fail
ProductsWebService   ProductsDatabase and TransportWebService both GREEN                              GREEN     Working normally
ProductsWebService   ProductsDatabase YELLOW or GREEN; TransportWebService RED, YELLOW or GREEN       YELLOW    Degraded, will impact OrderApplication and ExtranetWebSite
ProductsWebService   ProductsDatabase RED, YELLOW or GREEN; TransportWebService RED, YELLOW or GREEN  RED       Failed, OrderApplication and ExtranetWebSite will fail


The health state of these entities eventually has an effect on the health state of the business

processes that they enable, as shown in the following table.

Entity             Dependencies                                                                        State     Description and effect
ExtranetWebSite    CustomerWebService GREEN and ProductsWebService GREEN                               GREEN     Working normally
ExtranetWebSite    CustomerWebService YELLOW or GREEN; ProductsWebService YELLOW or GREEN              YELLOW    Degraded
ExtranetWebSite    CustomerWebService RED, YELLOW or GREEN; ProductsWebService RED, YELLOW or GREEN    RED       Failed
OrderApplication   CustomerWebService GREEN and ProductsWebService GREEN                               GREEN     Working normally
OrderApplication   CustomerWebService YELLOW or GREEN; ProductsWebService YELLOW or GREEN              YELLOW    Degraded
OrderApplication   CustomerWebService RED, YELLOW or GREEN; ProductsWebService RED, YELLOW or GREEN    RED       Failed

Aggregate Aspects

Different audiences or consumers of an application or service require different views of the

health of a managed entity. An aggregate aspect provides a higher-level view of a health state

by aggregating health state information from different aspects (and potentially other aggregate

aspects). A common scenario for using aggregate aspects is when you need to represent the

health state for a particular functional area of an application at the managed entity level and a

managed entity has multiple instances (multi-instance managed entity). Figure 2 illustrates a

case where health state can be aggregated at different levels.


Figure 2 Health state of an aggregate aspect

In this case, Web service A is a parent of multiple instances of Web service B (residing on a Web

farm). Web service B has an aspect named connectivity, which corresponds to connectivity to a

database. An administrator wanting to monitor Web service A looks at the connectivity aspect,

which turns yellow if 50 percent of the instances of Web service B have no connectivity and red

if 75 percent of the instances of Web service B have no connectivity.
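The threshold logic in this example takes only a few lines of C#. The 50 and 75 percent thresholds come from the scenario above; the type names are hypothetical:

using System.Collections.Generic;

public enum HealthState { Green, Yellow, Red }

public static class AggregateConnectivityAspect
{
    // Aggregates the connectivity of all instances of Web service B:
    // YELLOW when 50 percent have no connectivity, RED at 75 percent.
    public static HealthState Evaluate(IList<bool> instanceHasConnectivity)
    {
        if (instanceHasConnectivity.Count == 0)
        {
            return HealthState.Green; // no instances to aggregate
        }

        int failed = 0;
        foreach (bool connected in instanceHasConnectivity)
        {
            if (!connected) failed++;
        }

        double fraction = (double)failed / instanceHasConnectivity.Count;
        if (fraction >= 0.75) return HealthState.Red;
        if (fraction >= 0.50) return HealthState.Yellow;
        return HealthState.Green;
    }
}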

Rolling Up Aspects into Managed Entities

Ultimately, the two questions you should ask when determining how to roll up aspects into

managed entities are the following:

• What impact does the health state of an aspect have on its managed entity?

• What impact does a managed entity have on its parent?

In some cases, a RED (failure) for one aspect may cause its managed entity, and perhaps even

the entire application, to fail. In other cases, a RED (failure) for one aspect may cause only

degraded performance of one managed entity (YELLOW). There are no definite rules to help you

decide, because each application is unique and the effects of each component and managed

entity will vary. However, there are two general rules:

• If any child of a parent is RED, the parent should be RED or YELLOW. Otherwise,

operators viewing only roll-up indicators will not realize that there is a failure

somewhere in the application.

• If a managed entity is vital for operation of the application, a RED state must cause all

parents and ancestors to be RED, indicating at the top level of the monitoring tree that

the application has failed.


Exactly how roll-ups are used in determining health will depend on the technology used. For

example, System Center Operations Manager (SCOM) 2007 uses roll-ups differently than

the MMD tool does.
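The two general rules can be expressed as a simple roll-up function. The following C# sketch is illustrative only; monitoring tools apply their own roll-up policies, and the type names here are hypothetical:

using System.Collections.Generic;

public enum HealthState { Green, Yellow, Red }

public sealed class ManagedEntity
{
    public HealthState State;
    public bool IsVital; // vital for operation of the application
}

public static class HealthRollup
{
    public static HealthState RollUp(IEnumerable<ManagedEntity> children)
    {
        HealthState parent = HealthState.Green;
        foreach (ManagedEntity child in children)
        {
            if (child.State == HealthState.Red)
            {
                // Rule 2: a RED vital entity forces the parent (and, applying
                // the same rule upward, every ancestor) to RED.
                if (child.IsVital) return HealthState.Red;
                // Rule 1: any RED child makes the parent at least YELLOW.
                parent = HealthState.Yellow;
            }
            else if (child.State == HealthState.Yellow)
            {
                if (parent == HealthState.Green) parent = HealthState.Yellow;
            }
        }
        return parent;
    }
}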

Monitoring and Troubleshooting Workflow

When defining and consuming health models, it is important to determine, both in automatable

and human readable form, how to detect, verify, diagnose, and recover from errors. This is the

monitoring and troubleshooting workflow, which defines the logical stages of the monitoring

process and problem recovery process. The stages of this process are illustrated in Figure 3.

Figure 3 Troubleshooting workflow

Detection

A monitoring agent defines how the health states of a particular aspect can be detected.

Typically, there are multiple ways to detect a problem with a managed entity. To detect a

problem, a monitoring agent can do the following:

• Listen for events related to the health of the managed entity.

• Poll and compare performance counters against the specified thresholds as the basis to

detect a problem.

• Scan trace logs for information used to detect a problem.

• Use health indicators, such as heartbeats or synthetic transactions, to determine health.

For more details about the specific health indicators you can use, see Chapter 6,

"Specifying Infrastructure Requirements."

The instrumentation listed within a single detector is linked by the OR logic operator—that is,

the application enters the health state if any of the items is detected. If multiple detectors are

present, they are linked by the AND operator, and all the conditions need to be detected

simultaneously for the application to enter the health state.

A detector can also be given a NOT flag, in which case the health state is signaled by the absence

of the instrumentation listed within the detector within a particular timeframe.


When the defined problem signatures are detected, a problem associated with the operational

condition and health state is indicated. Until the condition is verified, the associated health state

is not updated, and diagnosis and recovery steps should not be attempted.
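The combination logic described above can be sketched as follows; the types stand in for whatever representation a monitoring agent actually uses and are hypothetical:

using System.Collections.Generic;

public sealed class Detector
{
    public bool Negated; // the NOT flag: signaled by absence within a timeframe
    public List<bool> ItemsDetected = new List<bool>(); // one entry per instrumentation item

    // Items within a single detector are combined with OR.
    public bool IsSatisfied()
    {
        bool anyItem = false;
        foreach (bool detected in ItemsDetected)
        {
            if (detected) { anyItem = true; break; }
        }
        return Negated ? !anyItem : anyItem;
    }
}

public static class HealthStateDetection
{
    // Multiple detectors are combined with AND: all must be satisfied
    // simultaneously for the application to enter the health state.
    public static bool StateDetected(IEnumerable<Detector> detectors)
    {
        foreach (Detector detector in detectors)
        {
            if (!detector.IsSatisfied()) return false;
        }
        return true;
    }
}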

Verification

After a problem is detected, it is often necessary to verify that it actually still exists. This step is

critical to make sure the problem was not simply a surge in resource consumption, spike in

workload, or simply a transient issue that has gone away. Verification is basic confirmation that

the application is in a particular operational condition without trying to diagnose why or to

recover from it.

The logic that verifies whether an aspect is in a red or yellow operational condition should be

placed in a separate external verifier that will simply return which of the three possible

conditions is in effect at the time. Verifiers should not attempt any kind of diagnosis because

they need to be lightweight and their job is only to confirm whether or not the loss of

functionality (such as "Queue Latency Critical" or "Can't Print") is still observed. Having a verifier

that is built as an external script or executable file will allow the same piece of code to be used

to do the following:

• Confirm whether or not a particular aspect is in an unhealthy health state.

• Verify that recovery actions were successful at resolving a particular problem with an

aspect.

• Perform on-demand detection of the condition of an aspect even in the absence of an

event being logged. Scheduled execution of verifiers can be used as health "pings"

specific to the service even if no user has yet noticed a problem.

• Provide troubleshooting tools that can be used in multiple environments and

applications.

In proactive monitoring environments, such as Microsoft Operations

Manager, the verification step is frequently combined with other parts of the

monitoring workflow. This can be done in these environments because there is not

usually a delay between detection and the start of diagnosis where the problem may

have gone away on its own.
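A verifier built as an external executable might look like the following C# sketch, which reports the observed condition through its process exit code so the same code can serve detection, re-verification, and scheduled health pings. The host name, port, and exit-code convention are hypothetical:

using System.Net.Sockets;

public static class ConnectivityVerifier
{
    // Exit codes: 0 = GREEN, 1 = YELLOW, 2 = RED.
    public static int Main()
    {
        try
        {
            // Lightweight confirmation only; no diagnosis is attempted here.
            using (TcpClient client = new TcpClient("shipping-queue-host", 5000))
            {
                return 0; // reachable: the loss of functionality is not observed
            }
        }
        catch (SocketException)
        {
            return 2; // unreachable: the unhealthy condition is confirmed
        }
    }
}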

Diagnostics

After a negative health state has been detected within an aspect and confirmed to still exist, it

may be necessary to perform diagnosis to determine the root cause of the problem so the

appropriate recovery actions can be taken. Wherever possible, you should try to have

instrumentation that is specific enough to lead directly to resolution, thereby avoiding this

step. Even if you do not know the exact root cause of a problem, there is usually a good

indication of where to start diagnosis based on the context of how the problem was detected.

In many cases, further analysis is required during diagnosis. For example, it may be known that

there is a network connectivity problem of some kind because of an error code that was

returned to the application. However, until it has been determined that the IP address lease

from the DHCP server was lost, the steps needed to fix it (attempting to renew the lease) are not

clear. Additional trace logs may have to be examined, correlation of information from other


events may have to be done, or even querying the live run-time state may be necessary to

determine the true root cause of a problem.

The diagnostics step uses all forms of available instrumentation, such as events, performance

counters, WMI providers, and activity traces, to correlate information and determine the root

cause. The diagnostics step can take a long time and further disrupt service while it is

happening. It can be necessary to inspect a much broader set of internal state parameters and

correlate between applications to perform the diagnostics step.

The diagnosis step captures the step-by-step instructions of what someone needs to do to

diagnose the root cause of a problem, and may also include script or code to automate this

diagnosis. It can be thought of as a function that takes a general high-level indication of what is

causing a particular aspect of health state as input and returns a specific root cause that can

then be used to take the appropriate recovery steps. The event or performance counter that

leads to the detection of the health state will usually indicate where to start diagnosis for the

problem.

Resolution

After the root cause is identified, the next step is to attempt to resolve the problem. This

process can involve reconfiguration of the application, restarting a service, manipulating internal

state by calling some management API, or performing some other administrative task.

Resolution may also be in the form of a code or a script that will attempt to automate the

resolution steps. It can also reference the GUID of another "blame" managed entity that is

failing to provide the needed services; that entity then becomes the new starting point for diagnosis.

Re-verification

The same verification procedure that was used to verify the existence of the operational

condition is used to re-verify that the operational condition has indeed been corrected. When

the issue is successfully resolved, the verifier returns false, indicating that the problem condition no longer exists.

Structure of a Health Model

You can create a health model following your own custom structure, and use any tool or

application of your choice (for example, a spreadsheet such as Microsoft Excel). However, specialist tools

can make the process much easier. They help you to generate a model that conforms to the

accepted formats and schemas. Some tools can also generate Management Packs aimed at the

common monitoring applications, such as Microsoft Operations Manager (MOM). For example,

AVIcode produces a range of tools and integration kits specially designed to meet the Design for

Operations (DFO) vision and work with .NET Framework applications and Microsoft tools. For

more details, see http://www.avicode.com/.

The Management Model Designer (MMD) makes it easy to generate a health model and export

it as a MOM Management Pack. You can also use the Health Model Editor within MOM to create

a model of the instrumentation exposed by an existing assembly, edit this to create a health


model, and then generate the corresponding Management Pack. For more information about

Management Packs in both MOM 2005 and SCOM 2007, see Chapters 15–17 of this guide.

A comprehensive health model can contain state transitions that fall into three categories, each

aimed at a specific area of the application requirements:

• Business Operations health model. This defines the requirements in terms of business

rules and SLAs, and it includes contingency information.

• Application health model. This defines the requirements for the processing and

performance of the application as a whole, its related services and components, and

associated services, such as databases.

• System health model. This defines the requirements for the underlying operating

system processes used by the application.

Figure 4 illustrates the structure and content sections of typical health models for the System,

Application, and Business Operations categories.

Figure 4 The structure and content sections of a typical health model

As shown in Figure 4, each section contains the following:

• Requirement. This includes a series of rule definitions in appropriate terms for the

section in which it resides—for example, a rule in the Business Operations section that

all orders marked as "urgent" should be processed and completed within four hours.

• Detection Information. This includes a series of rules or functions that implement the

detection information. The rules or functions indicate the health state or condition of

the application (such as "offline" or "failed"), the criticality (RED, YELLOW, or GREEN),

the alerts to send to the operator and monitoring system, and a series of indicators that

define the Health and Diagnostics Workflow sections to which this rule applies. There


may also be a contingency plan that describes workaround procedures while awaiting

rectification of the fault.

• Health and Diagnostics Workflow. This describes the steps to verify the fault, diagnose

the causes, resolve the problem, and re-verify the solution afterward.

Mapping Requirements to Individual Indicators

In some cases, the rules in the health model do not map directly to the set of indicators required

by the application. The following are some examples:

• The health model may define requirements for which measurement is possible only

through using indicators built into the operating system or some underlying application

code, such as one of the Enterprise Library application blocks. In this case, you would

take advantage of the indicators provided by the operating system or underlying

application instead of creating a specific indicator. However, the rule is still part of the

health model for the application you are designing.

• A state transition in specific components may reflect one of a set of different underlying

problems. Correct implementation of the instrumentation will include indicators for

each detectable condition, such as failure to open a connection, failure to update data,

or failure to commit a transaction. Each indicator will return a RED, YELLOW, or GREEN

status, allowing operators to see if the failure is because of, for example, an incorrect

connection string, incorrect permissions within the database, or failure of another of

the series of data access operations.

• Some failures may have more than one cause but only one effect that is detectable

within the application. For example, failure to access a database may be the result of a

network failure, an incorrect connection string, or a database server failure. However,

indicators for this aspect within the application will probably not be able to detect the

actual cause and will just return RED (failed). To diagnose this failure requires other

indicators within the database system, which are not part of the health model for this

application.

A state transition defined in the health model will not always map directly to a single indicator

status change; in some cases, it will reflect the overall results from a combination of settings.

Multiple Distributed Managed Entities

Another common scenario is a controller that manages multiple instances of managed entities.

For example, a Web service may reside on an array of four servers fronted by a router that

distributes requests amongst them based on their current response times. You may be tempted

to think of the Web service as a single managed entity, and implement instrumentation to

measure overall response times, but this may not provide the optimum solution. When the

response times degrade beyond the acceptable threshold, and a YELLOW or RED state occurs, all you can

determine is that the Web service farm is at fault. You cannot determine which, if any, server is

at fault, or if the router or a network connection has failed.


Instead, you can think of each Web service as a separate managed entity that you create

instrumentation for, as shown in Figure 5. Rolling up or combining these managed entities

allows administrators to obtain an overall view that is, in some respects, an average of

performance or an indicator of the overall health of the complete application or a subset of

processes. When a YELLOW or RED indication appears, operators can drill down to the next level

and get extra information that allows them to isolate more quickly the source of the problem.

Figure 5 Rolling up managed entities from a Web server farm

The example in Figure 5 illustrates how you can circumvent some of the issues you may

encounter when mapping a health model to the managed entities it describes. To create an

indicator for the overall state of this section of the application in terms of the number of servers

online (if this was a requirement of the health model), you would have to use a probe, such as a

ping request to each server through its IP address, to detect individual server failures.

The alternative of rolling up the individual aspects from the servers through a rule in the health

model makes more sense and reduces the impact on the servers. In the monitoring application,

you would implement the health model rule using the tools that monitor the event logs of the

individual servers and a roll-up rule or combining rule that produces the overall indication based

on the definition in your health model of the minimum number of servers to be online.

Northern Electronics Scenario

The solutions architect has considered creating a fully comprehensive management model for

the Northern Electronics shipping application, but he currently lacks the resources to do so.

However, he is committed to modeling health and instrumentation for this application.


Instrumentation Model

The application will provide comprehensive instrumentation in the form of events and

performance counters. The solutions architect has defined the following abstract events for the

application:

• PickupServiceNETGeneralError

• PickupServiceSOAPError

• ShippingServiceConfigurationError

• ShippingServicePickupFault

• ShippingServiceSOAPError

• ShippingServiceSQLException

• ShippingServiceSuggestedDateMismatch

• ShippingServiceTruckArrivalDelayed

• TransportServiceConfigurationError

• TransportServiceOrderFault

• TransportServiceSOAPError

• TransportServiceSQLException

• TransportServiceUnknownError

• OrderClientConfigurationError

• OrderClientSQLError

• OrderClientUnknownEvent

The solutions architect has also defined the following abstract measures for the application:

• PickupServiceConfirmPickup

• ConfirmShippingService

• ConfirmShippingServiceResponse

• DelayedShippingService

• DelayedShippingServiceResponse

• ShippingRequestPerSecond

• TransportServiceOrderTransport

• TransportServiceOrderTransportResponse

Health Model

The solutions architect has defined the following operational requirements for the application:

• The Transport Order Web service must be available at all times.

• The Transport Order application must be available at all times.

• The Warehouse Management application must have an availability exceeding 90

percent.

• The Shipping Service must be available at all times. However, if the Transport Order

Web service is not available, it will store transport requests until it can pass them to the

Transport Order Web service and must store them for a maximum of two hours.

The solutions architect will use the health model for the application to help determine whether

these requirements are being met. The solutions architect does not consider it necessary to


differentiate between offline and failed health states, so he defines the following three health

states for each managed entity:

• Green. This indicates the managed entity is working normally.

• Yellow. This indicates the functionality of the managed entity is degraded.

• Red. This indicates the managed entity is either unavailable or offline.

The solutions architect has defined the following aspects for each managed entity:

• Connectivity

• Data access

Other documentation that references Northern Electronics differentiates between a failed and

offline state, so it uses the four health states: Green, Yellow, Red, and Black.

The Transport Order Web service depends on the Transport Order application and the

Warehouse Management application. This means that the health of the Transport Order Web

service is affected by the health of these other managed entities and their corresponding

aspects. Figure 6 illustrates how problems with the Transport Order Web service and

Warehouse Management applications affect the health of the Transport Order application and

the Shipping Service.

Figure 6 Rolling up health states

In this case, the connectivity aspect of the Transport Order Web service is RED, indicating a

failure. This means that the Transport Order application is also RED because it cannot process


requests, even though its own Data Access aspect is GREEN. The Warehouse Management

application is GREEN because its Data Access aspect has only just transitioned to YELLOW, and

this situation is within the operating parameter defined in the health model (90 percent

availability).

The health model also defines the contingency situation for the Shipping Service in that it will

store requests for the Transport Order application for a maximum of two hours. Assuming that

this period has not yet passed, the Shipping Service is YELLOW, indicating a pending problem but

not a failure.

Summary

This chapter has described how to create effective management models that capture the

knowledge about an application. In cases where all the knowledge cannot be captured, it is still

effective to use a management model to capture health and instrumentation information, which

are critical to designing manageable applications.


Chapter 5

Proven Practices for Application Instrumentation

Software instrumentation provides information about executing applications. This information

can be used for a number of purposes, such as troubleshooting, capacity planning, business

monitoring, optimizing development, and security auditing. In this guide, instrumentation is

created to support the management of software applications by operations staff, where their

primary concern is the health of the application—for example, the response time for specific

operations, the availability of key resources, or the status of integration points. This chapter

provides a number of proven practices for the architecture and design of software

instrumentation in general, but it focuses on those aspects of instrumentation that assist

operations staff in determining application health.

Other proven practices regarding the general principles of architecting manageable

applications are discussed in Chapter 3, "Architecting Manageable Applications."

Events and Metrics

All data collected from instrumentation falls into one of two categories:

• Events. These are raised when specific things happen in a running software application,

or when things fail to happen as expected. Events provide contextual information about

the occurrence, including data such as machine name, process name, user context, and

date/time information. For example, if an application cannot read from a database, an

event could be raised that details the application process, the data connection

parameters, and even the exact syntax of the database query.

• Metrics. These represent measurement of a variable and the units associated with it.

For example, an application might choose to expose the amount of physical storage

being used, the available percentage of a critical resource, such as network capacity, or

the number of orders in a queue. Metrics may be used by themselves, or they may form

the basis of more complex measurements.

Architectural Principles for Effective Instrumentation

If you are responsible for architecting a well-instrumented application, you should adhere to the

following proven practices:


• Create a flexible instrumentation architecture.

• Create instrumentation that operations staff easily understands.

• Support existing operations processes and tools.

• Create applications that are not self-monitoring.

• Support flexible configuration of instrumentation.

The next sections describe each of these principles in more detail.

Create a Flexible Instrumentation Architecture

This guide is primarily concerned with creating instrumentation that supports application health.

However, developers can use the same instrumentation architecture to support capacity

planning, business monitoring, code optimization, and support debugging. You should design

your instrumentation architecture flexibly, so the same principles can be used across all these

areas.

The Team System Management Model Designer Power Tool (TSMMD) tool can be used to

design instrumentation for any purpose. If your management model extends to capture these

different areas, you may want to consider using the TSMMD tool for these purposes.

Create Instrumentation That Operations Staff Easily Understands

As discussed in Chapter 1, "Understanding Manageable Applications," in many cases

applications are designed with little regard to the operations perspective. This can lead to

incomplete instrumentation, but it can also lead to instrumentation that is relevant to only the

development team.

Where possible, the application should generate events and metrics around business entities,

such as customers, orders, and product queries, instead of the technical entities, such as

threads, stacks, and collections of internal objects. When designing the application, start with

basic instrumentation that supports the operational view of the system, and then add more

detailed metrics that provide further insight into your systems and applications. This will

promote a strong implementation that focuses on well-defined and relevant data.

Support Existing Operations Processes and Tools

If you require operations staff to adopt new procedures and use new tools, you may limit their

adoption. For example, if your operations staff uses a lot of scripting, you might consider

generating sample scripts along with your application that expose the naming and syntax for

your events. Instead of developing a custom tool for operations, it is often preferable to create a

Microsoft Management Console (MMC) snap-in.

Creating MMC snap-ins has been dramatically simplified with the release of MMC 3.0, which

ships with Windows Vista and Windows Server 2008, but it can also be installed on Windows XP

and Windows Server 2003. Another new platform capability to consider is Windows Eventing


6.0. When used on Windows Vista or Windows Server 2008, Windows Eventing 6.0 allows the

operations team to filter events, correlate events across computers, and create and save custom

views.

Create Applications That Are Not Self-Monitoring

It is desirable for applications to detect and recover from unhealthy conditions. However, those

applications cannot always be trusted to reliably monitor themselves. If the health of an

application deteriorates, it may be unable to accurately monitor its performance or even

perform any monitoring at all.

Even in cases where an application is providing reliable information about its own health, users

of the application are likely to perceive self-reported status from an application as biased.

Instead, wherever possible, you should generate instrumentation from the application and use

other external tools to correlate the events and determine the health of the application—for

example, by recording the number of failures in a defined period.
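As a minimal sketch of such external correlation, the following C# fragment, running in a separate monitoring process, counts the failure events an application wrote to the Windows Event Log within a defined period. The event source name is hypothetical:

using System;
using System.Diagnostics;

public static class FailureRateMonitor
{
    // Counts Error entries from the given source within the time window,
    // so health is judged outside the application itself.
    public static int CountRecentFailures(string source, TimeSpan window)
    {
        DateTime cutoff = DateTime.Now - window;
        int failures = 0;

        using (EventLog log = new EventLog("Application"))
        {
            foreach (EventLogEntry entry in log.Entries)
            {
                if (entry.Source == source &&
                    entry.EntryType == EventLogEntryType.Error &&
                    entry.TimeGenerated >= cutoff)
                {
                    failures++;
                }
            }
        }
        return failures;
    }
}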

Support Flexible Configuration of Instrumentation

Not all instrumentation needs to be used at all times. For example, very low-level

instrumentation may be required by a developer resolving a bug in an application but may have

no use to an operator. An operator may also require much more information from an

application that is unhealthy than from one that is functioning normally. Different

instrumentation is also required in different deployment environments, and it may be necessary

to choose the appropriate instrumentation at deployment time or at run time. For this

reason, a manageable application should be well instrumented and it should allow that

instrumentation to be configured at design time, at deployment time, and at run time.

Using Instrumentation Levels to Specify Instrumentation Granularity

To support flexible granularity of instrumentation, you should define instrumentation levels for

each abstract event at application design time. The possible instrumentation levels may include

the following:

• Coarse. This level indicates the event is raised during all operations.

• Fine. This level indicates the event is raised during diagnostic and debug operations.

• Debug. This level indicates the event is raised only during debug operations.

• Off. This level indicates the event is not raised at all.

Do not use the granularity level for anything other than verbosity control. Configuration levels

should be inclusive; if you require different behavior, you should use different events.

After levels are defined for each event, an overall instrumentation level can be specified for

each managed entity in the configuration file for that managed entity. Whether a particular


event is raised is dependent on comparing these two values. For example, an architect could

specify that if an event is specified as fine, and the overall instrumentation level for the

managed entity is specified in configuration as coarse, the event will not be raised. However, in

this case, if the overall instrumentation level in the configuration file is changed to fine or

debug, the event will be raised.

The information in the following table shows in more detail how these rules are used.

Instrumentation level (event)   Overall instrumentation level   Event raised?
Coarse                          Coarse                          Yes
Coarse                          Fine                            Yes
Coarse                          Debug                           Yes
Coarse                          Off                             No
Fine                            Coarse                          No
Fine                            Fine                            Yes
Fine                            Debug                           Yes
Fine                            Off                             No
Debug                           Coarse                          No
Debug                           Fine                            No
Debug                           Debug                           Yes
Debug                           Off                             No
Off                             Coarse                          No
Off                             Fine                            No
Off                             Debug                           No
Off                             Off                             No

By specifying an overall instrumentation level in the configuration file for each managed entity,

it is possible to change the instrumentation level at run time in response to a change in

circumstances. For example, if a stop event is raised during coarse grain monitoring, the level of

instrumentation may be changed to fine, so that more information about the application can be

gained.

You may require additional configurability of instrumentation at run time. At a minimum, you

should be able to turn on or turn off instrumentation at run time. However, in many cases, in

addition to granularity control, you will also need to define other settings, such as designating a

remote source for logging.


The default overall instrumentation level will normally be set to Coarse in production

environments to maximize application performance.

Using Infrastructure Trust Levels to Specify Instrumentation Technologies

Instrumentation may run well in the development environment, but virtually all production

servers run with a reduced set of privileges. Commercial hosting providers may provide even

more limited capabilities than in-house operations. In particular, it may be necessary to

propagate message and trace files using FTP. Without access to platform capabilities, such as

the Windows Event Log, an application may be limited to generating messages or trace files

as the only way to record production events.

In some cases, specific information about the deployment environment for an application is not

known in advance, or the application must be designed to support multiple target

environments. In these cases, you will need to design the application to support configuration of

instrumentation technologies at deployment time or run time, using trust levels for the different

target environments. For more information about how to achieve this, see Chapter 6,

"Specifying Infrastructure Trust Levels."

Designing Application Instrumentation

You should consider the following proven instrumentation design practices when designing the

instrumentation for an application:

• Use the capabilities of the underlying platform.

• Provide separate instrumentation for each purpose.

• Isolate abstract instrumentation from specific instrumentation technologies.

• Create an extensible instrumentation architecture.

• Use base events for instrumentation.

• Use event names and event IDs consistently.

• Ensure events provide backward compatibility.

• Support logging to remote sources.

• Consider distributed event correlation.

The next sections describe each of these design practices in more detail.

Use the Capabilities of the Underlying Platform

In many cases, the platform will already provide some of the instrumentation required by an

application. For example, adding a performance counter that shows the number of open

database connections may not provide any additional value over the existing ADO.NET counter,

or logging that a Windows service has started may provide no more information than that

provided by the Windows Service Control Manager.

In some cases, your application may need to be supported on earlier versions of the operating

system, and you should consider this when determining which technologies to use. For example,


if you are running an application on Windows Vista, but the application will also run on

Windows XP, you should ensure that you do not exclusively use Windows Eventing 6.0.
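
For example, a run-time guard can confirm that the operating system supports Windows Eventing 6.0 (which shipped with Windows Vista) before selecting it, and otherwise fall back to an older technology such as the classic event log. The following C# check is a sketch; the surrounding provider-selection code is assumed.

    using System;

    static class PlatformCapabilities
    {
        // Windows Eventing 6.0 is available from Windows Vista (NT 6.0) onward;
        // on earlier systems such as Windows XP, a fallback must be used instead.
        public static bool SupportsWindowsEventing6()
        {
            OperatingSystem os = Environment.OSVersion;
            return os.Platform == PlatformID.Win32NT && os.Version.Major >= 6;
        }
    }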

Provide Separate Instrumentation for Each Purpose

Instrumentation can have a number of different uses, including determining health,

performance analysis, capacity planning, and developer debugging. Wherever possible, you

should use discrete sets of instrumentation for each purpose. Doing this allows you to configure

the application instrumentation on a case-by-case basis. For example, you may not want to

enable the instrumentation for capacity planning because it may affect the performance of the

application, but you may consider the instrumentation that indicates application health to be

required at all times.

Isolate Abstract Instrumentation from Specific Instrumentation Technologies

In traditional applications, where instrumentation exists, it is generally coded directly into the

application. This means the application developer must have a good understanding of that

specific form of instrumentation to be able to write the application. If run-time configuration of

instrumentation technologies is required (as suggested previously in this chapter), the

application developer must write code to implement each instrumentation technology and

support run-time configuration to choose between them.

Isolating instrumentation code from application code helps to solve many of these problems. In

this case, the architect describes abstract events, and a mapping between those events and

concrete instrumentation technologies. The developer can then call the abstract events in code,

and an instrumentation helper is responsible for the actual instrumentation using concrete

instrumentation technologies. This simplifies the development of instrumentation and does not

require developers to learn multiple instrumentation technologies. If the instrumentation

helpers are automatically generated, the overall development effort can be streamlined.

In cases where run-time configuration of instrumentation technology is required using trust

levels, isolating instrumentation code also helps. In this case, more than one instrumentation

helper can be deployed (one for each trust level) and the trust level can be chosen through

configuration. This allows the instrumentation technologies used to change without altering the

application itself.

Figure 1 illustrates an application designed to use abstract events for instrumentation.


Figure 1 Using abstract events for instrumentation
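
The following C# sketch shows the shape of this separation. The interface and class names are illustrative assumptions, not the helpers the TSMMD generates; the point is that application code calls only the abstract event, while pluggable providers supply the concrete technology.

    using System.Collections.Generic;

    // Contract that each concrete instrumentation technology implements.
    public interface IInstrumentationProvider
    {
        void WriteEvent(string eventName, params object[] args);
    }

    // Instrumentation helper: the only type that application code calls.
    public class InstrumentationHelper
    {
        private readonly List<IInstrumentationProvider> providers;

        public InstrumentationHelper(IEnumerable<IInstrumentationProvider> configured)
        {
            // The provider set is chosen through configuration (for example,
            // by trust level), so it can change without altering the application.
            providers = new List<IInstrumentationProvider>(configured);
        }

        // An abstract event; callers never see the underlying technology.
        public void OrderProcessingFailed(string orderId, string reason)
        {
            foreach (IInstrumentationProvider provider in providers)
            {
                provider.WriteEvent("OrderProcessingFailed", orderId, reason);
            }
        }
    }

Because the helper owns the provider list, swapping one technology for another becomes a configuration change rather than a code change, which is also the basis of the extensible architecture described next.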

Create an Extensible Instrumentation Architecture

Defining an abstract representation of instrumentation is truly useful only if the instrumentation

helper supports the specific instrumentation technology that you use in your environment. By

defining an extensible architecture using pluggable instrumentation providers, you can ensure

that instrumentation helpers can be developed to support other technologies (such as log4net)

when they are required.

Use Base Events for Instrumentation

A number of technologies may be used for collecting events raised by an application, including

Windows Event Logs, Windows Management Instrumentation (WMI), and Windows Eventing

6.0. Events may also be written to trace files, placed in a database, or sent as messages to local

or remote services. The structure and calling syntax may be different between these

technologies, but they consist of mostly common elements. Using a common base class can

minimize the effect on the application of changes in the instrumentation technology. In

addition, a structured approach to building the event will make your events more consistent,

which will help with automation (through scripting or Windows PowerShell) and search-aware

tools, such as the Event Viewer that ships with Windows Eventing 6.0.
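
As a sketch, a base event might capture the elements these technologies share; the class and property names are assumptions for illustration, not the guide's actual base class.

    using System;

    // Illustrative base class holding the elements common to most eventing
    // technologies; concrete events add their own strongly typed fields.
    public abstract class BaseEvent
    {
        public int EventId { get; private set; }
        public string EventName { get; private set; }
        public DateTime TimestampUtc { get; private set; }
        public string MachineName { get; private set; }

        protected BaseEvent(int eventId, string eventName)
        {
            EventId = eventId;
            EventName = eventName;
            TimestampUtc = DateTime.UtcNow;       // captured once, at raise time
            MachineName = Environment.MachineName;
        }
    }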

Use Event Names and Event IDs Consistently

The consistent use of names for events and their parameters enables subsequent filtering of

events to focus on only the relevant events. Windows Eventing 6.0 provides built-in support for

filtering, but a consistent use of event string parameters can enable a script or Windows

PowerShell command to process an equivalent subset of recorded events.


Each event ID should uniquely identify an occurrence or failure in your application. You should

not try to use the same ID to report multiple different events, because this can prove confusing

to the operations team.
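
A simple way to enforce this consistency, sketched below with hypothetical values, is to declare every event ID once in a shared type so that numeric literals are never scattered through the code.

    // Hypothetical registry of well-known event IDs; each ID identifies
    // exactly one occurrence or failure and is never reused.
    public static class EventIds
    {
        public const int ApplicationStarted    = 1000;
        public const int ApplicationStopped    = 1001;
        public const int DatabaseUnavailable   = 2001;
        public const int OrderProcessingFailed = 3001;
    }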

Ensure Events Provide Backward Compatibility

Many tools rely on particular events to function correctly. If you change the meaning of an event

or delete an event, you can cause errors in a tool that monitors for that event. For events used

for purposes other than developer debugging, you should only add new parameters to an event

and make them optional. If you need to remove parameters, or make the parameters required,

you should define new events for this purpose.

Support Logging to Remote Sources

In some cases, it may be most efficient for an application to asynchronously log events

directly to a remote computer. In other cases, the process of collecting the data must be done

locally, but for performance reasons the processing of the data may be done remotely on

dedicated computers.

If the data must be collected locally, it is often more efficient to compress the events first, and

then send them to a centralized location as a batch.

Windows Vista supports event forwarding, which allows the application to write events locally

and then have the events forwarded to a centralized location automatically.
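
A minimal sketch of the compress-then-send approach: batch the locally recorded events into a file and compress it with GZipStream before transfer. The file paths are placeholders, and the transfer step itself (for example, FTP in a low-trust environment) is assumed to happen elsewhere.

    using System.IO;
    using System.IO.Compression;

    static class EventBatchShipper
    {
        public static void CompressEventBatch(string sourcePath, string destinationPath)
        {
            // Compress a locally collected batch of events so it can be sent
            // to the central collector as a single, smaller payload.
            using (FileStream source = File.OpenRead(sourcePath))
            using (FileStream destination = File.Create(destinationPath))
            using (GZipStream gzip = new GZipStream(destination, CompressionMode.Compress))
            {
                byte[] buffer = new byte[8192];
                int bytesRead;
                while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
                {
                    gzip.Write(buffer, 0, bytesRead);
                }
            }
        }
    }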

Consider Distributed Event Correlation

Many applications today are composed of multiple pieces running on different computers. In

these cases, and when logging to remote sources is enabled (as explained in the previous

section), it is important to be able to generate an integrated picture. The use of an application-

generated unique key may be necessary to understand the relationship between distributed

events. In particular, you must include sufficient data—such as network address, cluster ID, and

so on—to allow others to correlate with multiple sources. Windows Eventing 6.0 supports

ActivityID (and RelatedActivityID), which can be used for event correlation.
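
As a sketch, an application-generated correlation key can be created once at the entry point of a request and stamped onto every event raised while servicing it; the type and property names below are assumptions.

    using System;

    // One key per logical request; every event raised while servicing the
    // request carries it, so events from different machines can be joined.
    public class RequestContext
    {
        public Guid CorrelationId { get; private set; }
        public string NodeAddress { get; private set; }

        public RequestContext(string nodeAddress)
        {
            CorrelationId = Guid.NewGuid();
            NodeAddress = nodeAddress;   // enough context to correlate sources
        }
    }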

Developing the Instrumentation

If you are responsible for programming the application code that ultimately raises

instrumentation events, you should consider the following proven instrumentation

development practices:

• Minimize resource consumption.

• Consider the security of the event information.

• Supply appropriate context data.

• Record the times events are generated.


• Provide resolution guidance.

The next sections describe these development practices in more detail.

Minimize Resource Consumption

There is a direct tradeoff between how much information can be collected and the application's

ability to service requests. Wherever possible, instrumentation should take up less than 10

percent of total processing of the application. It is desirable that the instrumentation does not

itself degrade the application, but if critical information is not being logged, the instrumentation

is not sufficient to support its primary goal.

In cases where complex calculations must be performed on instrumentation, these should

normally be deferred to an external process, so that they do not adversely affect the

performance of the application. The calculations may be performed at a later time, or even on a

different computer.

Logging to remote sources, as discussed in the previous section, can also help to improve

performance. Providing instrumentation that is configurable at run time is another practice that

can help improve performance. Run-time configuration allows you to define extensive

instrumentation for your applications and to enable that instrumentation only when required,

ensuring the best available performance in your applications.
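
One common way to honor this tradeoff is to guard any expensive data gathering behind a cheap check of the current configuration, so the cost is paid only when the verbose level is actually enabled. The sketch below reuses the hypothetical LevelPolicy type shown earlier in this chapter; the console call stands in for a real instrumentation helper.

    static class DiagnosticsGuard
    {
        public static void CaptureIfEnabled(InstrumentationLevel overallLevel)
        {
            // Cheap check first: skip the costly data gathering entirely
            // unless the configured level asks for debug-grade detail.
            if (!LevelPolicy.ShouldRaise(InstrumentationLevel.Debug, overallLevel))
            {
                return;
            }

            string snapshot = string.Format(
                "managed heap: {0} bytes", System.GC.GetTotalMemory(false));

            System.Console.WriteLine(snapshot);  // stand-in for the real helper
        }
    }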

Consider the Security of the Event Information

You must be extremely careful if you are logging any sensitive data such as passwords, account

names, SIDs, user data, or events that track end-user activity that impacts privacy. Putting

sensitive information about an exception into an event mechanism on the server may seem

more secure than returning it through the call stack, but you need to understand the security

policies and restrictions of that event mechanism. For example, when applications are hosted on

third-party servers, the reduced trust levels on those servers may result in the instrumentation

being recorded into a text file in the application directory.
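
A small defensive-logging sketch, with hypothetical names: mask sensitive values before they enter an event, rather than relying on the access controls of whatever store the event eventually lands in.

    static class EventSanitizer
    {
        // Mask all but the last four characters of a sensitive value so the
        // event stays useful for support without exposing the secret itself.
        public static string Mask(string sensitive)
        {
            if (string.IsNullOrEmpty(sensitive) || sensitive.Length <= 4)
            {
                return "****";
            }
            return new string('*', sensitive.Length - 4)
                   + sensitive.Substring(sensitive.Length - 4);
        }
    }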

Supply Appropriate Context Data

The machine name, user name, and process ID are all examples of contextual

information that could be critical to isolating a problem. Eventing technologies may automatically

supply some of these values, but they may represent the identity servicing the request as

distinct from the identity making the request. When supplying context data, you should also

consider the security of the event information, as mentioned earlier.

Record the Times Events Are Generated

The system time when an event is generated may differ from the time when it is received or

processed, particularly when events are logged to a centralized location. In many cases, the


order in which events are received, and how they match observed behavior in an application, is

critical to troubleshooting problems with an application. Therefore, it is very important that

events include accurate information about the time they are raised. Where possible, each of the

systems raising events should be time synchronized. This is particularly challenging in distributed

systems where systems that are off even by a few milliseconds may make it very hard to

correlate based on time. In those systems, it is sometimes easier to correlate on a

synthetically generated key that is invariant.

Provide Resolution Guidance

The event string should contain sufficient information to give the operator guidance on how to

proceed in resolving the issue. This may include guidance on what other managed entities may

be affected by a problem, along with contextual information. In some cases, the event will

include a URL that points to centralized documentation or a Help topic ID for a Help file.

Building and Deploying Instrumentation

If you are responsible for building and deploying manageable applications, you should consider

the following instrumentation guidelines:

• Automate implementation of instrumentation.

• Automate the build and deploy process.

• Monitor applications remotely.

The next sections describe these instrumentation guidelines in more detail.

Automate Implementation of Instrumentation

Generating instrumentation code from a model or template will make the structure and calling

syntax more consistent and robust. It will also enable generating meta-information about the

instrumentation, such as the meta-information that might be used by an installer, as discussed

in the next section.

Automate the Build and Deploy Process

For instrumentation, automating the build and deploy process involves creating an installation

package that uses the Installer tool (Installutil.exe) to handle counters and event logs that need

to be registered at install time. Automating this process saves considerable effort and makes the

process more reliable.
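
The sketch below shows what such an installer class might look like using the .NET Framework installer classes that Installutil.exe runs with elevated privileges; the source, category, and counter names are hypothetical.

    using System.ComponentModel;
    using System.Configuration.Install;
    using System.Diagnostics;

    [RunInstaller(true)]
    public class InstrumentationInstaller : Installer
    {
        public InstrumentationInstaller()
        {
            // Creating event sources and counter categories requires
            // administrative privileges, so it belongs in the install
            // step rather than in the running application.
            EventLogInstaller logInstaller = new EventLogInstaller();
            logInstaller.Source = "NorthernShipping";            // hypothetical source
            logInstaller.Log = "Application";

            PerformanceCounterInstaller counterInstaller = new PerformanceCounterInstaller();
            counterInstaller.CategoryName = "Northern Shipping"; // hypothetical category
            counterInstaller.Counters.Add(new CounterCreationData(
                "Orders Processed",
                "Total orders processed by the shipping application.",
                PerformanceCounterType.NumberOfItems32));

            Installers.Add(logInstaller);
            Installers.Add(counterInstaller);
        }
    }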

Monitor Applications Remotely

The process of collecting, interpreting, and displaying the instrumentation information increases

the load on the application server. In most cases, you should monitor from a remote computer.


Network activity is normally monitored on the local computer, because monitoring the network

remotely can affect performance and therefore lead to unreliable data.

Summary

This chapter has examined proven practices that can be used for instrumentation when

architecting, designing, and deploying applications. You should use this chapter in conjunction

with Chapter 2, "Architecting Manageable Applications," to ensure that you create instrumented

applications that support the overall goal of designing manageable applications.


Chapter 6

Specifying Infrastructure Trust Levels

It is important to understand the infrastructure to which an application is going to be deployed

when the application is first developed. This allows the solutions architect to ensure that the

application will work as expected in the target environment; it also allows the application to

take full advantage of the existing infrastructure.

If different deployment infrastructures are not considered at design time, the application will

not function as expected in some cases; this results in increased costs from change requests at

staging or deployment time. However, it is not always possible to determine the exact nature of

the deployment infrastructure at design time. In many cases, the full details of each datacenter

are not known. Even when the deployment infrastructure is well known at design time, there

may still be a requirement to support multiple infrastructure environments.

In some cases, changes will occur to the target infrastructure during application development.

The deployment environment may also change during deployment or after deployment in a way

that affects the application. If these changes are communicated back to the application

architect, appropriate changes can be made to the application design.

Understanding the target infrastructure is particularly important when designing manageable

applications. The environment into which an application is deployed can have a significant effect

on the instrumentation technologies that can be used in that application. For example, if an

application cannot be installed or run with administrator privileges, that application cannot

write to an event log.

The developer architect must ensure that the application uses instrumentation that will work in

the target environment. In cases where the target environment is not known, the architect will

typically need to support multiple forms of instrumentation, and allow the specific technologies

to be configured at deployment time or run time.

You should deploy applications into the lowest trust environment that still allows proper

execution. The trust environment used leads to design decisions that a developer architect

must make early in the development life cycle.

This chapter describes different infrastructure model scenarios, and then it examines tools that

can be used to create an infrastructure model. The chapter then describes how the Team


System Management Model Designer Power Tool (TSMMD) can be used to capture information

about an infrastructure pertinent to manageable applications.

Infrastructure Model Scenarios

It can be important to model the infrastructure of a target environment in many cases.

However, in creating the guidance included in this book, two main scenarios have been

considered:

• Development of an in-house application

• Development of an ISV or shrink-wrapped application

The next sections describe these two scenarios.

In-House Application Scenario

In this scenario, the solutions architect and infrastructure architect have access to detailed

information about the corporate datacenter into which the application will be deployed. Often,

they will also have gained insight from past deployments that may not have gone well or as

intended. This knowledge can be represented in an infrastructure model that is used by the

developer architect when defining the specifications for the application.

The in-house infrastructure model definition should include named instances of the operational

environments contained in the test datacenter, staging datacenter, and production datacenter.

These names can be descriptive of the actual datacenter, such as Production, or they may

represent a security level within the datacenter, such as Medium. In many cases, it will be a

combination of the two, such as PROD_MED.

Security level specifications may be mapped to a platform specification. For example, the

Microsoft Windows Vista operating system contains well-defined operating system trust levels

named High trust, Medium trust, and Low trust.

Even though the target environment is generally well known in this scenario, it may still be

necessary to support multiple infrastructures. For example, the application may be deployed in

several different data centers with different infrastructures or the infrastructure may differ at

different stages of the application life cycle. When a developer architect is designing a

manageable application in this scenario, he or she must ensure that the instrumentation

technologies used are compatible with the target environment or environments that the model

defines. The developer architect must also allow the application to be configured at deployment

time (and possibly at run time) to support the particular environment used.

ISV or Shrink-Wrap Application Scenario

In this scenario, the deployment environment is unknown, or a wide range of deployment

scenarios must be supported. To address these concerns, the developer architect must design

the application with flexibility in mind, so that the infrastructure architect or the operations


team can configure the application at deployment or run time to support the requirements of

the specific infrastructure used.

In many ways, this scenario can be thought of as the inverse of the preceding scenario. In the

preceding scenario, the deployment infrastructure is well known at the beginning of the

development cycle, and a comprehensive infrastructure model can be used to influence the

development of the application. In this scenario, the deployment infrastructure is not well

known, so the infrastructure model that is used influences the infrastructures that can be

supported. In this case, the more flexible the model, the more environments the application can

successfully be deployed into.

When a solutions architect is designing a manageable application in this scenario, he or she

must ensure that instrumentation is provided for each of the proposed target infrastructures.

He or she must also provide configurability so that those deploying the application can ensure

that the correct instrumentation is used in a particular scenario.

Privilege and Trust Considerations

Privileges are an important consideration when defining trust levels (called target environments

in the Team System Management Model Designer Power Tool). Privileges to perform specific

operations, such as those used by the instrumentation technologies discussed in this guide, may

be allowed or denied at three distinct levels—the operating system, the application host (most

notably IIS), and by .NET Framework security. These levels build on each other, and the

operating system has the ultimate role in denying access to resources. For example, a .NET

Framework Web application may be running under a .NET Framework security policy that allows

writing to the file system, which defers to the IIS file permissions for the site's home directory,

and then defers to any file access or deny permission set in the Windows file system. When

defining instrumentation technologies for a managed entity, the privileges associated with the

run-time process context that will host the managed entity will likely decide what

instrumentation options are available.

Privileges are particularly important as an application moves from a developer's environment

(which typically runs as Administrator with correspondingly high trust), through testing, and into

production (where the security principle of "least privilege" often results in access being denied

to required platform capabilities that were present during development).

Also, the privileges that are required to run the application and emit instrumentation from a

managed entity may be different from the privileges that are required to install the application

and its associated instrumentation. For example, many infrastructure configurations may allow

an application to write to an event log or increment a performance counter, but the event log itself

and any named performance counters (and message queues) can only be created by code with

Administrator privileges. For this reason, it is common practice to move such high trust

operations outside the execution of the application and into a separate install process, which is


more commonly run with elevated privileges. For more information, see chapters 9–12 of this

guide.

Security in the .NET Framework is often less well understood by operations staff than platform

and core services security. The principle of least privilege in this context means that while .NET

Framework applications are typically developed with the full trust permission set, they will

normally run in production environments using the LocalIntranet or LowTrust permission sets.

Named permission sets may be specified at the computer level or the "enterprise" level, and

custom permission sets may be created to reflect combinations of permissions that are not

offered by one of the built-in permission sets.

The following table lists built-in permission sets, and their corresponding effects on

instrumentation.

LocalIntranet: Writable only to isolated storage, not the file system.

LowTrust: Writable to isolated storage, the local event log, and Web access.

MedTrust: Writable to isolated storage, the local event log, Web access, the file system, and the system registry.

Everything: Unrestricted access to performance counters, event logs, message queues, and the service controller, in addition to the file system, system registry, and the Web.

Several of these capabilities have some overlap. For example, creating a performance counter

requires administrator privileges because it essentially writes to a secure part of the system

registry.

Figure 1 illustrates the .NET Framework Configuration tool that can be used to specify the

permission set used for an application.


Figure 1 .NET Framework Configuration tool

The Team System Management Model Designer Power Tool (TSMMD) does not attempt to

enforce selections made when defining trust levels or the instrumentation technologies that are

available within the trust level. This is because each deployment infrastructure is potentially

unique.

This means the TSMMD is flexible in that it may be adapted to a wide range of infrastructure

definitions represented as trust levels. But with this flexibility comes the opportunity to make

mistakes when defining available instrumentation technologies for a target deployment

environment.

Tools for Infrastructure Modeling

Software tools for infrastructure modeling can take a number of forms, including the following:

• Standalone tools

• Tools integrated with the development environment

The next sections describe each of these types of tools in more detail.


Standalone Tools

Standalone tools, such as Microsoft Visio, can be used to create a comprehensive model of each

infrastructure that should be supported by an application. In its simplest form, the model would

consist of a named instance of each infrastructure element, with a number of properties

defined. This model can be modified over time as the infrastructure changes.

In some cases, a standalone tool may be designed to export information directly to a

development environment, allowing the developer architect to directly use the information in

his or her design. In other cases, the architect would simply read the model and use it to make

decisions on the design of the application.

Integrated Tools

Integrated tools, such as the TSMMD, are used to make the infrastructure model part of the

overall design of the application. This allows the developer architect to directly model elements

of the infrastructure at design time, and it ensures that the infrastructure considerations are not

missed when the application is developed. The TSMMD also allows the instrumentation

technologies for each infrastructure scenario to be automatically generated. The next section

provides more details about the TSMMD.

Infrastructure Modeling with the TSMMD

The TSMMD allows you to define infrastructure models in the context of creating manageable

applications. This means defining one or more target environments, specifying instrumentation

that will be useable in those environments, and ensuring that the correct instrumentation is

mapped to the correct target environment. The TSMMD does not consider other aspects of

application functionality that may be affected in different target environments. For example,

using a different communication mechanism for a different infrastructure would not form part

of a management model defined in the TSMMD.

The TSMMD uses target environments to represent the different infrastructures that an

application may be deployed into. Target environments can be given any name, but typically

they will have names such as high, medium, or low, to reflect the level of trust in the target

environment. Target environments are defined across all managed entities that make up an

application, and each managed entity can be configured to use one or more of the defined

target environments.

To create a target environment in Management Model Explorer, use the New Management

Model Wizard or simply right-click the Management Model root and then click Add New Target

Environment to specify the named instance and its properties.

Specifying target environments in the TSMMD tool forces the application architect to consider

the instrumentation options available when defining the abstract instrumentation at design

time. This should lead to predictable instrumentation implementation when developers call the

instrumentation helpers within the solution.


The TSMMD tool can also be used to generate instrumentation code. This simplifies changes to

the infrastructure later in the application life cycle. Any changes can be reflected by simply

updating the model definition and regenerating the instrumentation implementations.

Instrumentation Technologies Supported by the TSMMD

The following instrumentation technologies are supported by the TSMMD:

• Enterprise Library Logging. Enterprise Library, from the Microsoft patterns & practices

division, contains the Logging Application Block that allows developers to perform a

wide range of logging tasks using a standardized and easy-to-use interface. The TSMMD

integrates with the Logging Application Block to allow architects to model and specify

events that the Logging Application Block will handle, and which it will write to the

configured target. By default, the TSMMD configures the Logging Application Block to

write events to the Windows Event Log, but administrators can change the

configuration to send events to any target medium supported by the Logging

Application Block (such as email, database, MSMQ, or text files).

• Windows Event Logging. The Windows Event Log service enables an application to

publish, access, and process events. Events are stored in event logs, which can be

routinely checked by an administrator or monitoring tool to detect occurrences of

problems on a computer. The Windows Event Log SDK allows users to query for events

in an event log, receive event data as events occur (subscribe to events), create event

data and raise events (publish events), and display event data in a readable format

(render events). For more information about Windows Event Logging, see Chapter 9,

"Event Log Instrumentation."

• Windows Eventing 6.0 events. The Windows Eventing 6.0 service added to Windows

Vista and Windows Server 2008 extends the capabilities of the event logging system

while still providing familiar access to Windows Event Logs. Applications can publish,

access, and process events, and administrators or monitoring tools can use the logs

to detect occurrences of problems on a computer. For more information about Windows

Eventing 6.0 Event Logging, see Chapter 11, "Windows Eventing 6.0 Instrumentation."

• Event Tracing for Windows. Event Tracing for Windows (ETW) provides application

programmers the ability to start and stop event tracing sessions, instrument an

application to provide trace events, and consume trace events. Trace events contain an

event header and provider-defined data that describes the current state of an

application or operation. You can use the events to debug an application and perform

capacity and performance analysis.

• WMI events. Windows Management Instrumentation (WMI) is the instrumentation

standard used by management applications such as Microsoft Operations Manager

(MOM), Microsoft Application Center, and many third-party management tools. The

Windows operating system is instrumented with WMI, but developers who want their

own products to work with management tools must provide instrumentation in their

own code. WMI in the .NET Framework is built on the original WMI technology and

allows the same development of applications and providers with the advantages of

programming in the .NET Framework. For more information about WMI events, see

Chapter 10, "WMI Instrumentation."


• Performance counters. Windows collects performance data on various system

resources using performance counters. Windows contains a pre-defined set of

performance counters with which you can interact; you can also create additional

performance counters relevant to your application. For more information about how to

programmatically create performance counters and how to read performance counters,

see Chapter 12, "Performance Counter Instrumentation."

Northern Electronics Scenario

The infrastructure architect has provided information to the solutions architect about the

potential deployment environment for the application. As a result, the solutions architect has

determined the following:

• Each workstation application will log events to the event log.

• Each Web service may log events to the event log and generate performance counters.

However, the Web service may also need to run in a lower trust environment, where it

can only log events to a trace file.

To support this scenario, the solutions architect decides to define two target environments:

• Low trust will be used when a managed entity is writing to a trace file. Low trust will be

defined as a target environment for all the Web service managed

entities.

• High trust will be used when the managed entity is writing to the event log and creating

performance counters. High trust will be defined as a target environment

for both the Web service and the workstation application managed

entities.

Summary

This chapter discussed the use of infrastructure trust levels, which are used to support different

target deployment environments. If an architect can define different infrastructure models that

the application supports, the decision about which instrumentation technologies to use can be

deferred until the application is deployed or run. This helps to ensure that the application will

function as expected in the target environment, without requiring changes to the underlying

application code.


Chapter 7

Specifying a Management Model

Using the TSMMD Tool

Creating a management model for your application can be somewhat challenging. To simplify

the process, this guide includes a tool, known as the Team System Management Model Designer

Power Tool (TSMMD), which allows you to graphically model an operations view of the

application. You can use the tool to apply instrumentation and some basic health

artifacts to this model.

This chapter describes the requirements for the TSMMD, and then it demonstrates how to use it

to create a management model for your application.

Requirements for the TSMMD

To install the TSMMD tool, your computer should meet the following software prerequisites:

• Windows XP, Windows Vista, or Windows Server 2003

• One of the following versions of Visual Studio 2008:

◦ Visual Studio Team System 2008 Architecture Edition

◦ Visual Studio Team System 2008 Database Edition

◦ Visual Studio Team System 2008 Developer Edition

◦ Visual Studio Team System 2008 Test Edition

◦ Visual Studio Team System 2008 Team Suite

◦ Visual Studio Team System 2008 Team Foundation Server

You must install both the C# and C++ languages when you install Visual Studio 2008.

The TSMMD requires C++ to generate instrumentation for Windows Eventing 6.0

events.

• Guidance Automation Extensions (GAX) version 1.4 or later

• Enterprise Library version 4.0

To obtain the Team System Management Model Designer Power Tool, visit the Design For

Operations community Web site at http://www.codeplex.com/dfo/.


Creating a Management Model

The following are the high-level steps for creating a management model with the TSMMD tool:

1. Create a TSMMD file.

2. Graphically model an operations view of the application.

3. Define Target Environments for the application.

4. Define instrumentation for the application.

5. Create health definitions for the application.

6. Validate the model.

The TSMMD Guided Experience

One of the major features of the TSMMD that saves architects and developers time, simplifies

the process of building a management model, and reduces the opportunities for errors, is a

series of Wizards that make up the guided experience for the TSMMD.

The Team System Management Model Designer Power Tool includes the following guided

experience wizards that help you to configure individual parts of a management model:

• New Managed Entity wizard. This wizard helps you to create a new managed entity,

and set its properties.

• New Aspect wizard. This wizard helps you to create a new health state aspect for a

managed entity, and define the abstract instrumentation that implements the health

transition indicators for the new aspect.

• New Event Implementation wizard. This wizard helps you to create implementations of

an abstract event, including specific implementations for each of the instrumentation

technologies you specify in the target environments of the model.

• New Measure Implementation wizard. This wizard helps you to create

implementations of an abstract measure, including specific implementations for each of

the instrumentation technologies you specify in the target environments of the model.

• Discover Instrumentation wizard. This wizard helps you to locate and identify existing

instrumentation in the assemblies of your application, and import the definitions into

the model so that you can associate them with abstract events and measures.

Creating the TSMMD File

The TSMMD file is used to hold the information contained within the model. It can be used to

generate instrumentation artifacts in the code; potentially, it can be used to provide information

for a System Center Operations Manager Management Pack.

The following procedure assumes that you already have a Visual Studio solution in place.

The model is created at the solution level because it will commonly span more than one project.

To create the .tsmmd file and start using the TSMMD tool


1. Start Visual Studio 2008 Team System Edition, click the File menu, point to New, and

then click Project.

2. In the New Project dialog box, click TSMMD Project in the list of project types, and then

click TSMMD Project in the list of projects. Enter a name and location for the new

project and click OK. This creates a new TSMMD project containing a new management

model named operations.tsmmd. The Management Model Explorer window appears

showing this new empty model, and the blank model designer surface appears in the

main window.

If you cannot see the Management Model Explorer window, click the View menu,

point to Other Windows, and click Management Model Explorer.

3. Ensure that the guidance packages for the TSMMD are loaded. To do this, click

Guidance Package Manager on the Visual Studio Tools menu. If the list of recipes in the

Guidance Package Manager dialog box does not contain any entries that apply to Team

System Management Model, follow these steps to enable the recipes:

◦ Click the Enable/Disable Packages button.

◦ Select the two guidance packages named Team System MMD Instrumentation

and Team System MMD Management Pack Generation.

◦ Click OK to return to the Guidance Package Manager dialog box.

◦ Click Close to close the Guidance Package Manager dialog box.

If you do not see the two guidance packages in the list, you may need to reinstall the

TSMMD guidance package.

4. In Management Model Explorer, select the top-level item named Operations. In the

Visual Studio Properties window, enter values for the Description, Knowledgebase,

Name, and Version. If you cannot see the Properties window, press F4.


5. In Management Model Explorer, expand the Target Environments node and select the

target environment named Default. Change the values of properties to indicate

instrumentation technologies you want to use in the default target environment.

6. Right-click the top-level model entry in Management Model Explorer and click Add

New Target Environment if you want to add more target environments, setting the

appropriate instrumentation technology check boxes for each one.

You use the properties of a target environment to specify that you require any

combination of Enterprise Library Logging events, Windows Event Log events, trace file

events, Windows Eventing 6.0 events, Windows Management Instrumentation (WMI)


events, and Windows performance counters for that target environment. You can also

add more than one target environment to a model to describe different deployment

scenarios.

7. On the File menu, click Save All to save the entire solution.

Graphically Modeling an Operations View of the Application

The TSMMD designer allows you to create a graphical operations view of the application. This

model can then be added to, both by the TSMMD tool and by other designers. The TSMMD

designer allows you to graphically model the following artifacts:

• Executable Application managed entities

• Windows Service managed entities (including data services such as databases)

• ASP.NET Application managed entities

• ASP.NET Web Service managed entities

• Windows Communication Foundation (WCF) services

• External Managed Entities

• Connections between managed entities

The following procedures detail how to model these artifacts in the designer.

To create managed entities using the Wizard:

1. Right-click the designer surface of the management model diagram or right-click the

top-level item in Management Model Explorer, and then click New Managed Entity

Wizard.

2. Enter the required information into the pages of the wizard. The wizard allows you to

specify the name, type, description, discovery type and target, and enable model

extenders for the new managed entity.

To create managed entities using the Toolbox:

1. Open the Visual Studio Toolbox, and then drag one of the managed entity types onto

the designer surface. You can choose Executable Application, Windows Service,

ASP.NET Application, or ASP.NET Web Service as the managed entity type.

2. In the designer, select the managed entity and modify the properties that specify the

discovery method (for use in a management pack), the description, the name, and the

type of the entity. If you cannot see the Properties window, press F4 or right-click the

managed entity, and then click Properties.


3. In the Properties window, modify any extended properties specific to the type of the

selected managed entity. For example, if you added an ASP.NET Application or ASP.NET

Web Service to the designer surface, you can specify the exception and performance

counter thresholds, sample times, and warning levels.

4. Repeat steps 1 through 3 for all the other managed entities in the application.

Each managed entity must have a name that is unique among the other managed entities and

external managed entities in the management model. Validation code checks for this and

prompts you with a dialog box if two entities are identically named.

For a local managed entity (an entity from the list above that is part of the application, but

excluding the External Managed Entity), you can specify values for the properties shown in the

following table.

Description: This property contains the description of the entity.

Discovery Target: This property, in conjunction with the Discovery Type, defines the way that the monitoring system will locate the entity to check whether it exists on a monitored server; in other words, whether this part of the application is deployed on that server.

Discovery Type: This property defines where the monitoring system should look for the Discovery Target value. Depending on the type of entity, the options are FilePath, RegistryValue, ServiceName, and IISApplicationName.

Name: This property contains the name of the entity.

Executable Application

The Executable Application entity represents a Windows Forms application, a console

application, or any other type of application that is not a Windows Service or an ASP.NET based

application or service. You can specify a Discovery Type of either FilePath or RegistryValue for

this type of managed entity. There are no additional properties for an Executable Application

entity.

Windows Service

The Windows Service entity represents a Windows Service that (usually) has no runtime user

interface. You can specify a Discovery Type of only ServiceName for this type of managed entity.

The Windows Service entity has one additional property shown in the following table.

Windows Service Extension Enabled: This Boolean property specifies whether the process that generates management packs will add a specific extender monitor to the management pack that checks the status of a Windows service by querying WMI at timed intervals. The monitor will raise an alert if the service is configured to start automatically and is not currently running. The monitor will not raise an alert if the service is disabled, if it is configured to start manually and is not running, or when it is stopped.

ASP.NET Application

The ASP.NET Application entity represents an ASP.NET application that runs on an Internet

Information Services (IIS) Web server. You can specify a Discovery Type of only

IISApplicationName for this type of managed entity. The ASP.NET Application entity has the

additional properties shown in the following table.

ASP.NET Extension Enabled: This Boolean property specifies whether the process that generates management packs will add specific extender monitors to the management pack. The default setting is False. When set to True, the following properties specify the parameters for the extender monitors.

Exception Error Threshold: This property defines the threshold at which the extender monitor will change the health state of the entity to Critical (RED) for exceptions generated by the entity within the time specified by the Exception Sample Time Interval property. The default value is 50.

Exception Sample Time Interval: This property defines the duration in seconds over which the extender monitor counts exceptions occurring in the entity, and matches this figure to the Exception Error Threshold and Exception Warning Threshold values. The default value is 30 seconds.

Exception Warning Threshold: This property defines the threshold at which the extender monitor will change the health state of the entity to Warning (YELLOW) for exceptions generated by the entity within the time specified by the Exception Sample Time Interval property. The default value is 30.

Performance Error Threshold: This property defines the threshold at which the extender monitor will change the health state of the entity to Critical (RED) for degraded performance measures incurred by the entity within the time specified by the Performance Sample Time Interval property. The default value is 50.

Performance Sample Time Interval: This property defines the duration in seconds over which the extender monitor counts degraded performance measures incurred by the entity, and matches this figure to the Performance Error Threshold and Performance Warning Threshold values. The default value is 30 seconds.

Performance Warning Threshold: This property defines the threshold at which the extender monitor will change the health state of the entity to Warning (YELLOW) for degraded performance measures incurred by the entity within the time specified by the Performance Sample Time Interval property. The default value is 30.

Response Time (ms): This property defines the maximum time within which the application must respond to a request. The default value is 5000 (5 seconds).

These extended properties allow you to specify the behavior of the application in terms of the

intrinsic performance and internal errors that it generates. This is useful for monitoring and

reporting scenarios that ensure the application meets business requirements and Service Level

Agreements (SLAs).

ASP.NET Web Service

The ASP.NET Web Service entity represents a Web service implemented by ASP.NET as an ASMX

service and running on an Internet Information Services (IIS) Web server, or implemented as a

WCF Web service. You can specify a Discovery Type of only IISApplicationName for this type of

managed entity. The ASP.NET Web Service entity has the additional properties shown in the

following table.

ASP.NET Extension Enabled: This Boolean property specifies whether the process that generates management packs will add specific extender monitors to the management pack. The default setting is False. When set to True, the following properties specify the parameters for the extender monitors.

Exception Error Threshold: This property defines the threshold at which the extender monitor will change the health state of the entity to Critical (RED) for exceptions generated by the entity within the time specified by the Exception Sample Time Interval property. The default value is 50.

Exception Sample Time Interval: This property defines the duration in seconds over which the extender monitor counts exceptions occurring in the entity, and matches this figure to the Exception Error Threshold and Exception Warning Threshold values. The default value is 30 seconds.

Exception Warning Threshold: This property defines the threshold at which the extender monitor will change the health state of the entity to Warning (YELLOW) for exceptions generated by the entity within the time specified by the Exception Sample Time Interval property. The default value is 30.

Performance Error Threshold: This property defines the threshold at which the extender monitor will change the health state of the entity to Critical (RED) for degraded performance measures incurred by the entity within the time specified by the Performance Sample Time Interval property. The default value is 50.

Performance Sample Time Interval: This property defines the duration in seconds over which the extender monitor counts degraded performance measures incurred by the entity, and matches this figure to the Performance Error Threshold and Performance Warning Threshold values. The default value is 30 seconds.

Performance Warning Threshold: This property defines the threshold at which the extender monitor will change the health state of the entity to Warning (YELLOW) for degraded performance measures incurred by the entity within the time specified by the Performance Sample Time Interval property. The default value is 30.

Response Time (ms): This property defines the maximum time within which the service must respond to a request. The default value is 5000 (5 seconds).

These extended properties allow you to specify the behavior of the service in terms of the

intrinsic performance and internal errors that it generates. This is useful for monitoring and

reporting scenarios that ensure the application meets business requirements and Service Level

Agreements (SLAs).

Windows Communication Foundation (WCF) Service

The WCF Service entity represents a Windows Communication Foundation (WCF) Web service.

For this type of managed entity, the architect can specify a Discovery Type of FilePath or

RegistryValue (if the service is self-hosting), IISApplicationName (if hosted by IIS), or

ServiceName (if hosted by a Windows service). There are no additional properties for a WCF

Service entity.

To create external managed entities for the application

1. Open the Visual Studio Toolbox, and then drag an External Managed Entity onto the

designer surface.

2. In the designer, select the external managed entity, and then modify its Name property.

If you cannot see the Properties window, press F4 or right-click the external managed

entity, and then click Properties.

3. Repeat steps 1 and 2 for all the other external managed entities used by the

application.

To model connections between managed entities

1. Open the Visual Studio Toolbox, and then click Connection.

2. Click the managed entity where the connection will start.

3. Click the input port where the connection will end.

4. Edit the Text property of the connection as required.

5. Repeat steps 1 through 4 for all other connections between entities.

You can use External Managed Entities to model services that your application consumes, but

which are not part of your model. You can also use External Managed Entities to split a large

model into smaller models. In this case, the External Managed Entity simply represents the


section that does not appear in the current diagram. It is important to avoid repeating

Managed Entities in more than one section of a management model.

However, you can add only one management model to a solution, and so—in this release—the

management model equates to the solution. If you need to divide your application into

multiple management models, you must create multiple solutions.

Everything outside of the model is classed as external. It is likely that external entities such as

databases, Web sites, and Web services will already be instrumented and managed by other

tools, such as existing management packs (for example, the System Center Operations

Manager pack for SQL Server).

Defining Target Environments for the Application

Target Environments are used to model different deployment environments for the application.

You can associate different types of concrete instrumentation with each target environment. For

example, in a low trust environment, it may not be possible to write to an event log, so writing

events to a file may be the preferred option.

To define target environments for the model:

1. In Management Model Explorer, right-click the top level Management Model entry, and

then click Add New Target Environment. If you cannot see Management Model

Explorer, point to Other Windows on the View menu, and then click Management

Model Explorer.

2. The new target environment appears in the Target Environments section of

Management Model Explorer. Select the new target environment, and then modify its

Name property. If you cannot see the Properties window, press F4 or right-click the

target environment, and then click Properties.

3. Change the values for the types of instrumentation you will use in this target

environment. You can enable each type by setting the value of the corresponding

property to True. You can enable Enterprise Library Logging events, Windows Event Log

events, trace file events, Windows Eventing 6.0 events, Windows Management

Instrumentation (WMI) events, and Windows performance counters.

4. Repeat steps 1 through 3 for all other target environments that the model must

support.

For instrumentation helpers to be generated for managed entities, you must define target

environments and associate them with those managed entities. Validation code checks to


ensure that you have at least one target environment defined for each managed entity, and it

displays a warning if not.

Defining Instrumentation for the Application

The management model defines instrumentation in two ways:

• Abstract instrumentation. This instrumentation is an abstraction of the specific

instrumentation technology being used. By defining abstract events and measures, the

developer can call the abstract instrumentation rather than the specific technology.

• Implementations of the abstract instrumentation. This instrumentation corresponds to

the specific instrumentation technologies and is mapped to the abstract

instrumentation.

Defining Abstract Instrumentation

You can create abstract events and measures as you specify the aspects (health states) of a

model by using the New Aspect Wizard, instead of creating each artifact manually. The New

Aspect Wizard makes it easy to define the health states for an aspect, and then create or select

abstract events and measures that indicate changes to the health state. If you use the New

Aspect Wizard, you will perform the following steps for each aspect:

1. Use the wizard to create the aspect and specify the type of instrumentation (events or a

measure) that provides the state change information.

2. Create the implementations of the events or the measure you specify in the wizard. You

can use the New Event Implementation Wizard or the New Measure Implementation

Wizard to create the implementations, or you can create them manually.

3. Create any parameters you need to pass application-specific values to the

instrumentation that the TSMMD creates. You must do this manually using the

procedure "Modeling abstract event parameters" described later in this topic.

If you decide to manually create the instrumentation for your model by adding each abstract

and concrete implementation individually, you will perform the following steps:

1. Specify the abstract instrumentation (events and measures) for each entity.

2. Specify the implementations of these events and measures for each target environment

for each entity.

3. Map the event and measure implementations to each aspect in the health model.

This section describes how to create both forms of instrumentation (abstract and implemented) individually, and how to map one to the other. It consists of the following procedures:

• Modeling abstract events

• Modeling abstract event parameters

• Modeling Enterprise Library logging events

• Modeling event log events

• Modeling Windows Eventing 6.0 events

• Modeling trace files

• Modeling Windows Management Instrumentation (WMI) events

• Modeling an abstract measure

• Modeling performance counters

To model an abstract event

1. In Management Model Explorer, click to expand the node of the managed entity for

which you want to add instrumentation.

2. Right-click the Management Instrumentation node, and then click Add New Event.

3. Select the new event in the Events section of the model.

4. In the Properties window, select a value for the Instrumentation Level property of the

event; you can select Coarse, Fine, or Debug. If you cannot see the Properties window,

press F4 or right-click the new event, and then click Properties.

5. Modify the value of the Name property of the event.

6. Repeat steps 2 through 5 for any other abstract events you require.

To model an abstract event parameter

1. In Management Model Explorer, click to expand the Management Instrumentation

node of the managed entity, and expand the Events node within it.

2. Right-click the abstract event, and then click Add New Event Parameter.

3. In the Properties window, modify the value of the Index property for the event. This

specifies the index of the placeholder (starting at 1) in the message template for the

value of this parameter. If you cannot see the Properties window, press F4 or right-click

the new event parameter, and then click Properties.

4. Modify the Name property for the event parameter.

5. Select the data type for the event parameter. You can select DateTime, Double, Int32,

Int64, or String.


6. Repeat steps 2 through 5 for any other event parameters you require.

An abstract event has two properties that you can set:

• Instrumentation Level. This property specifies the level at which the entity will raise the event. The options are Coarse (all operations, the default), Fine (diagnostic and debug operations only), and Debug (debug operations only). For information about how this setting affects the behavior of an application, see Appendix A.

• Name. This property contains the name of the abstract event definition.

In addition, you will define one or more parameters for each abstract event. For each parameter, architects set the three properties shown in the following list:

• Name. This property contains the name of the event parameter.

• Index. This property is an integer value that specifies which placeholder in the message template the value of the parameter will replace.

• Type. The data type of the parameter. The available types are DateTime, Double, Int32, Int64, and String (the default).

Event parameter names should use title-style capitalization (the first letter must be

capitalized). Validation code checks for this and displays an error message if this is not the

case. Also, if multiple event parameters are used, they should be numbered in increasing order

from 1, with no duplicates and no missing integers. Again, validation code checks for this.

To model an abstract measure

1. In Management Model Explorer, click to expand the node of the managed entity for

which you want to add instrumentation.

2. Right-click the Management Instrumentation node, and then click Add New Measure.

3. Select the new measure in the Measures section of the model.

4. Select a value for the Instrumentation Level property of the measure; you can select

Coarse, Fine, or Debug. If you cannot see the Properties window, press F4 or right-click

the new measure, and then click Properties.

5. Modify the Name property for the measure.

6. Repeat steps 2 through 5 for any other abstract measures you need to model.

An abstract measure has two properties that you can set:

• Instrumentation Level. This property specifies the level at which the entity will update the counter. The options are Coarse (all operations, the default), Fine (diagnostic and debug operations only), and Debug (debug operations only). For information about how this setting affects the behavior of an application, see Appendix A.

• Name. This property contains the name of the abstract measure definition.

Defining Instrumentation Implementations

You can create the concrete implementations of the abstract events and measures you defined

in the model using the New Event Implementation Wizard or the New Measure Implementation

Wizard. To start the Wizard, right-click on an existing abstract event or measure in the

Management Model Explorer window, then click New Event Implementation Wizard or New

Measure Implementation Wizard.

Alternatively, you can create them manually as described in the following procedures.

To model concrete event instrumentation

1. If you need to model an Enterprise Library Log Entry event, right-click the abstract

event, and then click Add New Enterprise Library Log Entry.

2. In the Configurable Implementations section, select the Enterprise Library Log Entry

event you created.

3. Modify the properties of the event, specifying values for the Categories, Event ID, Message, Name, Priority, Severity, and Title.

4. Repeat steps 1 through 3 for any other Enterprise Library Log Entry events you need to

model.

5. If you need to model an Event Log event, right-click the abstract event, and then click

Add New Event Log Event.

6. Click Configurable Implementations, and then select the Event Log event you created.

7. Modify the properties of the Event Log event, specifying the Category, Event ID, Log, Name, Severity, and Source.

8. Specify a value for the Message Template property of the event. This is a template

containing placeholders (such as %1) for the values of the event parameters. You must

include the same number of placeholders as there are parameters for the abstract

event, and number them in increasing order starting from 1 with no duplicates and no

missing integers.

9. Repeat steps 5 through 8 for the other Event Log events you need to model.


10. If you need to model a Windows Eventing 6.0 event, right-click the abstract event, and

then click Add New Windows Eventing 6 Event.

11. In the Configurable Implementations section, select the Windows Eventing 6.0 Event

you created.

12. Modify the properties of the event, specifying values for the Channel, Level, Name,

Operation, Provider, Task, and Value.

13. Specify a value for the Message Template property of the event. This is a template

containing placeholders (such as %1) for the values of the event parameters. You must

include the same number of placeholders as there are parameters for the abstract

event, and number them in increasing order starting from 1 with no duplicates and no

missing integers.

14. Repeat steps 10 through 13 for any other event log events you need to model.

15. If you need to model a trace file entry, right-click the abstract event, and then click Add

New Trace File Entry.

16. Click Configurable Implementations, and then select the trace file entry you created.

17. Modify the properties of the trace file entry, specifying the Name property.

18. Specify a value for the Message Template property of the event. This is a template

containing placeholders (such as %1) for the values of the event parameters. You must

include the same number of placeholders as there are parameters for the abstract

event, and number them in increasing order starting from 1 with no duplicates and no

missing integers.

19. Repeat steps 15 through 18 for the other trace file entries you need to model.

20. If you need to model a WMI event, right-click the abstract event, and then click Add

New WMIEvent.

21. Click Configurable Implementations, and then select the WMI event you created.

22. Modify the properties of the WMI event, specifying the Name property.

23. Repeat steps 20 through 22 for the other WMI events you need to model.

The following list shows the properties of an Enterprise Library Event implementation:

• Categories. This property specifies a list of categories that allows you to filter logging events using a Category Filter in the Enterprise Library Logging Application Block configuration. Separate each category name with a carriage return.

• Event ID. This property specifies the identifier for the event, and should be different from any existing events.

• Message. This property specifies the text that the Enterprise Library Logging Application Block will include in the log message it generates.

• Name. This property contains the name of the Enterprise Library Logging Event implementation. The name should start with a capital letter, and can contain only alphanumeric characters (letters and numbers) and underscores.

• Priority. This property specifies the priority of the event using a positive or negative numeric value. The priority allows you to filter logging events using a Priority Filter in the Enterprise Library Logging Application Block configuration.

• Severity. This property specifies the severity of the event. You can select Critical, Error (these are equivalent to Windows Event Log Error events), Information, Resume, Start, Stop, Suspend, Transfer, Verbose (these are equivalent to Windows Event Log Information events), or Warning (equivalent to Windows Event Log Warning events).

• Title. This property specifies the text that the Enterprise Library Logging Application Block will use as the title of the log message it generates.

The following list shows the properties of an Event Log Event implementation:

• Category. This property contains a value list that allows you to filter individual events.

• Event ID. This property specifies the identifier for the event, and should be different from any existing events.

• Log. This property specifies the target Windows Event Log name, such as Application, or the name of a custom Event Log.

• Name. This property contains the name of the Event Log Event implementation. The name should start with a capital letter, and can contain only alphanumeric characters (letters and numbers) and underscores.

• Severity. This property specifies the severity of the error, which sets the type of icon shown in Windows Event Log and is useful for filtering events in a monitoring tool. The options available are Error, Warning, Information, SuccessAudit, and FailureAudit.

• Source. This property contains the name to pass to the event system as the source of the error or event.

• Message Template. This property is a template containing placeholders where the event system will insert the values from event parameters when raising the event. If the abstract event defines any parameters, you must include placeholders for the value of each parameter. The placeholders must start with %1 and run consecutively to the number of parameters defined for the event.
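To make the placeholder mechanism concrete, consider a hypothetical abstract event with two parameters, OrderId with Index 1 and Carrier with Index 2. A message template of "Order %1 could not be shipped via carrier %2." produces, when the event is raised with the values "ORD-42" and "Contoso Freight", the logged message "Order ORD-42 could not be shipped via carrier Contoso Freight."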


The following list shows the properties of a Windows Eventing 6.0 Event implementation:

• Channel. This property specifies the channel to use to deliver the event. The channels you can use are Operational, TraceClassic, System, Application, Security, Analytic, and Debug. Generally, you should use the three channels that target the Event Log. These are Application, System, and Security.

• Level. This property specifies the severity or importance of the event. The values you can select are Error, Critical, Warning, Informational, and Verbose. The usual approach is to select Error for events that cause a transition to a Red (failed) state, Warning for events that cause a transition to a Yellow (degraded) state, and Informational for events that cause a transition to a Green (working normally) state.

• Message Template. This property is a template containing placeholders where the event system will insert the values from event parameters when raising the event. If the abstract event defines any parameters, you must include placeholders for the value of each parameter. The placeholders must start with %1 and run consecutively to the number of parameters defined for the event.

• Name. This property contains the name of the Windows Eventing 6.0 Event implementation. The name should start with a capital letter, and can contain only alphanumeric characters (letters and numbers) and underscores.

• Operation. This property indicates the type of low-level operation the application was executing when the event occurred. The values you can choose are Info, Start, Stop, DC_Start, DC_Stop, Extension, Reply, Resume, Suspend, and Send.

• Provider. This property contains the value passed to the event system to indicate the provider, and provides an indication to administrators and operators of the source of the event. The default value is a combination of the name of the model and the name of the current managed entity.

• Task. This property contains additional information that may be useful to administrators and operators to indicate what the application was doing when the event occurred; for example, "Create Order", "Import Data", or "Application Starting".

• Value. This property is a unique identifier for the event, and should therefore be different from any other events so that the monitoring system can filter on this value.

The following list shows the single property of a Trace File Entry implementation:

• Name. This property contains the name of the Trace File Entry implementation.

The following list shows the properties of a WMI Event implementation:

• Name. This property contains the name of the WMI Event implementation.

• Namespace. This property contains the WMI namespace within which the event will reside.

• Query. This property contains a query that identifies the event.

To model a performance counter

1. Right-click the abstract measure, and then click Add New Performance Counter.

2. Click Configurable Implementations, and then click the performance counter you

created.

3. Modify the properties of the performance counter, specifying the Counter Category

Name and Counter Object Name properties. The counter name must start with a

capital letter.

4. Select a value for the Counter Type property of the new performance counter that

indicates the way that it exposes the data, such as ElapsedTime or

RateOfCountsPerSecond32.

5. Modify the values of the Name property of the new performance counter.

6. Repeat steps 1 through 5 for all other performance counters you need.

The following list shows the properties of a Performance Counter implementation:

• Counter Category Name. This optional property contains the category name of the Windows Performance Counter that supplies the values for this measure.

• Counter Object Name. This property contains the name of the Windows Performance Counter that supplies the values for this measure. It must start with a capital letter.

• Counter Type. This property specifies the type of counter to use in terms of the way that it aggregates or measures the target object, such as AverageBase or ElapsedTime.

• Name. This property contains the name of the Performance Counter implementation.

Now that you have modeled concrete events and performance counters, you can map them to the target environments (such as trust levels) associated with the managed entity.


Discovering Existing Instrumentation in an Application

The Team System Management Model Designer Power Tool can discover instrumentation in

assemblies that are part of an existing application solution. The process will discover most

common instances of Windows Event Log Events, WMI Events, Enterprise Library Logging

entries, and Windows Performance Counters. The assemblies must reside in one or more

projects located in the Solution Items folder of the Management Model solution. The TSMMD

will compile the projects automatically when required.

To discover existing instrumentation:

1. Open the TSMMD solution that contains the application project(s) from which you want

to discover existing instrumentation. If you have not yet created a TSMMD solution

containing the application project(s), do the following:

a. Create a new TSMMD solution by following the steps in the topic "Creating a

New Management Model."

b. In Solution Explorer, right-click the Solution Items folder, point to Add, and then

click Existing Project.

c. Navigate to the existing project, and then click Open to add it to the TSMMD

solution.

d. Repeat steps b and c to add any more required projects.

2. Ensure that the TSMMD guidance package is enabled:

a. On the Tools menu, click Guidance Package Manager.

b. In the Guidance Package Manager dialog box, click the Enable/Disable

Packages button.

c. In the Enable and Disable Packages dialog box, select the TSMMD

Instrumentation and TSMMD Management Pack Generation check boxes.

d. In the Enable and Disable Packages dialog box, click OK, and then click Close in

the Guidance Package Manager dialog box.

3. Open an existing .tsmmd management model file in the designer, and then open Management Model Explorer. If you cannot see Management Model Explorer, point to Other Windows on the View menu, and then click Management Model Explorer.

4. Right-click the top-level node in Management Model Explorer, and then click Discover

Instrumentation.

5. The Discover Instrumentation Wizard opens, showing a list of all assemblies in all

projects with a check box next to each one. The check boxes for assemblies that will be

searched are already set. You can change the settings to add or remove individual

assemblies from the discovery process as required.


6. Select the type of instrumentation you want to discover in the Instrumentation Type

option list under the list of assemblies. You can select Event Log Event, WMI Event,

Performance Counter Measure, or Enterprise Library Logging, depending on whether

the assemblies you select contain instances of these types of instrumentation. Figure 2

shows the Discover Instrumentation Wizard.

Figure 2 The Discover Instrumentation Wizard

7. Click the Discover button. The Discovery Results window opens in Visual Studio, showing a list of all the discovered instrumentation.

After you discover the instrumentation within one or more projects, you must map that

instrumentation to the appropriate managed entities in the management model. The following

procedure describes this process.

To map discovered instrumentation to a model:

1. Perform the steps of the previous procedure to generate a list of discovered

instrumentation using the TSMMD Discover Instrumentation recipe.


2. Locate the rows containing the instrumentation you want to import. You can filter the

list of instrumentation rows using the drop-down lists at the top of some of the columns

to help locate rows, and then click a column heading to sort the rows based on the

values in that column.

3. If you are not sure of the actual implementation of an instrumentation item, such as an

event or performance counter, right-click that item in the list of rows, and then click

one of the Go To options. For example, with Enterprise Library Logging, you can go to

the source code line that makes the call into the Logging Application Block or go to the

line that writes the logging entry.

4. Some of the instrumentation rows may contain one or more values that the discovery

process could not resolve. It marks these values as <Not Resolved>. Some of the

unresolved values may be optional (such as the instance name of some performance

counters), while others are mandatory. You must provide these values as part of the

mapping process.

5. Select the rows in the Discovery Results window that contain the discovered

instrumentation items you want to import into your management model. You can hold down CTRL or SHIFT while clicking the list to select multiple items.

6. Now you can specify the mapping between the selected instrumentation items in the

Discovery Results window and the management model entities. To map one or more

instrumentation items to a specific managed entity, right-click the selected item rows,

click Quick Map, and then click the name of the entity. If none of the rows contains

unresolved mandatory items, you will see the managed entity name appear in the

Mapped To column.

7. If any row contains an unresolved mandatory item, you will see a dialog box that asks if

you want to resolve mandatory properties. Click Yes to display a dialog box where you

can provide values to override those in all the selected rows in the discovered

instrumentation list. For example, Figure 3 shows the Event Details dialog box, where

you specify the mandatory Source, Severity, and Log Name properties for an Enterprise

Library Logging event.


Figure 3 The Event Details dialog box for specifying unresolved mandatory instrumentation properties

8. Alternatively, you can force the TSMMD to display the mapping details window;

perhaps because you want to change some values for the properties of the discovered

instrumentation or there are unresolved mandatory properties for which you know you

must provide values. In these cases, right-click the selected rows in the Discovery

Results window, click Map to open the mapping details window and enter the relevant

values, and then click OK.

9. The TSMMD adds the instrumentation to the Discovered Instrumentation section of

the selected management entity in the Management Model Explorer. Open the

Discovered Instrumentation section in Management Model Explorer to see the result,

to rename events or measures, and to make any remaining edits you require to the

properties.

The following lists describe the properties that you can set or edit for discovered instrumentation. The Events section can contain definitions of Event Log Events and WMI Events. For an existing or imported Event Log Event, the architect defines or edits the properties shown in the following list:

• Description. This property contains a description of the existing Event Log Event.

• Event ID. This property specifies the identifier for the event, and should be different from any existing events.

• IsDiscovered. This Boolean property indicates if the Event Log Event was discovered by the TSMMD or entered manually into the model.

• Log. This property specifies the target Windows Event Log name, such as Application, or the name of a custom Event Log.

• Message. This property contains the error message for this event.

• Name. This property contains the name of the existing Event Log Event.

• Severity. This property specifies the severity of the error, which sets the type of icon shown in Windows Event Log and is useful for filtering events in a monitoring tool. The options available are Error, Warning, Information, SuccessAudit, and FailureAudit.

• Source. This property contains the name to pass to the event system as the source of the error or event.

For an existing or imported WMI Event, the architect defines or edits the properties shown in the following list:

• Description. This property contains a description of the existing WMI Event.

• IsDiscovered. This Boolean property indicates if the WMI Event was discovered by the TSMMD or entered manually into the model.

• Name. This property contains the name of the existing WMI Event.

• Namespace. This property contains the WMI namespace within which the event will reside.

• Query. This property contains a query that identifies the event.

The Measures section can contain only definitions of Performance Counters. For an existing or imported Performance Counter, the architect defines or edits the properties shown in the following list:

• Counter Category Name. This optional property contains the category name of the Windows Performance Counter that supplies the values for this measure.

• Counter Instance Name. This property contains the instance name of the Windows Performance Counter that supplies the values for this measure.

• Counter Object Name. This property contains the name of the Windows Performance Counter that supplies the values for this measure. It must start with a capital letter.

• Counter Object Type. This property specifies the type of counter to use in terms of the way that it aggregates or measures the target object, such as AverageBase or ElapsedTime.

• Description. This property contains a description of the existing Performance Counter.

• IsDiscovered. This Boolean property indicates if the Performance Counter was discovered by the TSMMD or entered manually into the model.

• Name. This property contains the name of the existing Performance Counter.

• Visible Name. This property indicates the name of the counter as seen by the operating system.

Discovered instrumentation (either manually defined or automatically discovered by the

TSMMD) cannot be mapped to a target environment.

Creating Health Definitions

You can use health definitions to provide additional information about the model. The

information can be used to create an Operations Manager Management Pack.


Creating health definitions for each managed entity relies on creating aspects, each one of

which can have a health state. You can create aspects for a model using the New Aspect Wizard,

which simplifies the process of defining the aspect and health states, and specifying or creating

the abstract events or measure for each aspect. To start the Wizard, right-click on a managed

entity in the designer window or in the Management Model Explorer window, then click New

Aspect Wizard.

Alternatively, you can create new aspects directly in the Management Model Explorer window

by defining each aspect and health state individually, then selecting the abstract events or

measure that provides the information about state changes for the new aspect. The following

procedure explains how to add aspects manually to a TSMMD model.

To model health definitions for a managed entity

1. In Management Model Explorer, expand the managed entity node for which you want

to define aspects. If you cannot see Management Model Explorer, point to Other

Windows on the View menu, and then click Management Model Explorer.

2. Right-click the Health Definition node, and then click Add New Aspect.

3. In the Properties window, modify the value of the Name property for the aspect and

enter information for the Knowledgebase property that will assist operators and

administrators. If you cannot see the Properties window, press F4 or right-click the

Health Definition node, and then click Properties.

4. System Center Operations Manager categorizes aspects into four categories:

Availability, Configuration, Security, and Performance. In the Properties window, select

the appropriate category as the value of the Type property for the aspect.

5. In the Aspects section of Management Model Explorer, right-click the new aspect node,

and then click Add New Green Health State.

6. Expand the new aspect node to show the three health states, and then expand the

Green Health State node to show the Health Formula (which is currently empty).

7. If the indicators for state transitions for this aspect are events, right-click the Green

Health State node, and then click Add New Event Formula.

8. If the indicators for state transitions for this aspect are measures (performance

counters), right-click the Green Health State node, and then click Add New Measure

Formula.

You cannot mix events and measures in an aspect. All the states you define for an

aspect must be either events or measures (performance counters).

9. If you added a new Event Formula, use the Event property to specify the event that will

act as the indicator for this state transition. You can select an abstract event that you

previously defined in the Management Instrumentation section of this entity.


Alternatively, you can select an event discovered by the TSMMD or defined in the

Discovered Instrumentation section.

10. If you added a new Measure Formula, use the Measure Formula property to specify the

measure that will act as the indicator for this state transition. You can select an abstract

measure that you previously defined in the Management Instrumentation section of

this entity. Alternatively, you can select a performance counter discovered by the

TSMMD or defined in the Discovered Instrumentation section.

11. For a Measure Formula, you must also specify the conditions that trigger a state

transition. Select the Measure Formula node and specify values for the Upper Bound

and Lower Bound properties.

12. Repeat steps 5 through 11 to specify the yellow and red states for the aspect. In

addition to the mandatory green health state, you can specify either or both of the

yellow and red health states for a managed entity.

13. Repeat steps 2 through 12 to add any other aspects you require to the managed entity.

14. Repeat the complete procedure to add aspects to all other managed entities in the

model.
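As an illustration (the aspect name, measure name, and threshold values here are hypothetical, not taken from the worked example): an aspect named Database Connectivity of type Availability might use measure formulas over an abstract measure such as FailedConnectionsPerSecond, with the green state bounded by Lower Bound 0 and Upper Bound 1, the yellow state bounded by 1 and 5, and the red state covering anything above 5.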

Validating the Management Model

Creating a management model using the TSMMD tool can be a fairly lengthy and complex

process. Before the model can be used by others, such as the development team for the

application, you must ensure that it is complete. The TSMMD tool can perform a number of

checks on the model to ensure that it is internally consistent.

To validate the management model

1. To validate the complete model, right-click the model designer surface or on any of the

nodes in the model in Management Model Explorer, and then click Validate All. If you

cannot see Management Model Explorer, point to Other Windows on the View, and

then click Management Model Explorer.

2. To validate a section of the model (useful as you define sections of instrumentation or

individual health aspects), right-click the parent node of the section you want to

validate, and then click Validate.

3. The TSMMD validates the model, or the selected node and its child nodes, and reports

the result in the Visual Studio Output window. If there are validation errors and/or

warnings, they appear in the Visual Studio Error List window.


In addition to validating the management model, you can also verify, after you create the application, that the application code calls the instrumentation code. For more information, see Chapter 8 of this guide.

Management Model Guidelines

When creating management models using the TSMMD tool, you should consider the following

guidelines:

• All managed entities and external managed entities must have unique names. An

external managed entity cannot have the same name as a managed entity.

• Entry points to the model from other managed entities not represented in the model

should be shown as unmanaged entities.

• If multiple models are used to represent a system, each managed entity should only be

represented in one model; this managed entity can be represented as an external

managed entity in other models.

Northern Electronics Scenario

The solutions architect is now in a position to define the management model for the application.

The architect decides that the application will consist of two solutions and will split the model

across those two solutions, as shown in Figure 1.


Figure 1 Solutions used in the Northern Electronics example


The solutions architect then creates management models for each solution, as shown in Figures

2 and 3.

Figure 2 The Transport Consolidation Solution


Figure 3 The Shipping Solution

Summary

This chapter described how to use the TSMMD tool to create a management model, and it

provided guidelines for effective use of the TSMMD tool. It also showed how the TSMMD tool

was used to model the solutions in the Northern Electronics Scenario.


Section 3

Developing for Operations

This section focuses on the developer tasks necessary for creating well-instrumented

manageable applications. It describes how to create reusable instrumentation helpers from the

model defined in the Team System Management Model Designer Power Tool (TSMMD) and

discusses the instrumentation artifacts that are generated. It examines the developer tasks that

are necessary to create and manage event log, Windows Management Instrumentation (WMI),

Eventing 6.0, and performance counter instrumentation. The section also includes a chapter

about building install packages for instrumentation; however, this chapter is not complete in the

preliminary version of this guide.

This section should be of use primarily to application and instrumentation developers.

Chapter 8, "Creating Reusable Instrumentation Helpers"

Chapter 9, "Event Log Instrumentation"

Chapter 10, "WMI Instrumentation"

Chapter 11, "Windows Eventing 6.0 Instrumentation"

Chapter 12, "Performance Counters Instrumentation"

Chapter 13, "Building Install Packages"


Chapter 8

Creating Reusable Instrumentation Helpers

After the architect defines the management model for the application, it is up to the developer

to write instrumentation code that reflects the management model. It is recommended that you

isolate instrumentation in an instrumentation helper. This chapter describes how to use the

guidance automation supplied with the Team System Management Model Designer Power Tool

(TSMMD) to automatically create the instrumentation helper, and it includes details about the

artifacts that are created. It then discusses how to consume the instrumentation from an

application.

The guidance automation included with the TSMMD tool simplifies the process of creating

instrumentation helper artifacts. However, you can use the information contained in this

chapter to manually create your own instrumentation helper classes.

Creating Instrumentation Helper Classes

After you determine that the model has no errors (as shown in Chapter 7 of this guide), you can

generate the instrumentation helper classes.

To generate instrumentation helper classes

1. If you have previously generated the instrumentation code from your model, you

should delete it before you regenerate the code. In Visual Studio Solution Explorer,

select and delete the Instrumentation subfolder and all its contents.

2. In Visual Studio, make sure that the TSMMD guidance package is enabled:

a. On the Tools menu, click Guidance Package Manager.

b. In the Guidance Package Manager dialog box, click the Enable/Disable Packages

button.

c. In the Enable and Disable Packages dialog box, select the TSMMD

Instrumentation and TSMMD Management Pack Generation check boxes.

d. In the Enable and Disable Packages dialog box, click OK, and then click Close in

the Guidance Package Manager dialog box.

3. In Management Model Explorer, right-click the top-level entry, and then click

Generate Instrumentation Helper. Alternatively, right-click anywhere on the model

designer surface, and then click Generate Instrumentation Helper.

4. The guidance recipe first validates the entire model and then (providing there are no

errors) automatically generates the instrumentation projects and artifacts in the


Instrumentation solution folder. Finally, it opens the file

InstrumentationConfiguration.config in the editor window so that you can specify

the run-time target environments and instrumentation granularities for each

managed entity in the application.

Figure 1 Instrumentation Helper code generated by the TSMMD guidance automation

Instrumentation Solution Folder

A fundamental principle behind the design of manageable applications is to abstract

instrumentation, meaning that applications call abstract events and measures, which are


mapped to concrete implementations of these events and measures. This abstraction is

reflected in the code, with a separate solution folder named Instrumentation. This folder

captures the instrumentation defined in the management model, and after the artifacts in this

folder have been created, it should not be necessary to modify them. This allows you to

separate application design from the instrumentation, and in some cases have a separate

instrumentation developer responsible for creating this solution artifact.

The Instrumentation solution folder contains instrumentation projects and a lib folder, which

contains the file Microsoft.Practices.DFO.Guidance.Configuration.dll.

Three types of instrumentation projects are created as artifacts in the Instrumentation solution

folder:

• API projects

• Implementation projects

• Technology projects

The next sections describe each of these types of projects in more detail.

API Projects

One API project is created for each managed entity. Each of these projects contains an abstract

class. The abstract class is a helper class that defines the following:

• It defines one protected constructor receiving a ManagedEntityHealthElement as a parameter.

• It defines a static GetInstance method that returns an instance of this API class.

• It defines one public CanRaise method (named "CanRaise" + <EventName>) for each

abstract event defined in the managed entity. This method returns true if the event can

be raised according to the instrumentation; otherwise, it returns false.

• It defines one public Raise method (named "Raise" + <Event Name>) for each abstract

event defined in the managed entity. This method calls the concrete instrumentation if

the event can be raised according to the configuration.

• It defines one protected abstract DoRaise method (named "DoRaise" + <Event Name>)

for each abstract event defined in the managed entity. This method is overridden on

each concrete instrumentation class.

• It defines one public CanIncrement method (named "CanIncrement" + <Measure

Name>) for each abstract measure defined in the managed entity. This method returns

true if the measure can be incremented according to the instrumentation; otherwise, it returns

false.

• It defines one public Increment method (named "Increment" + <Measure Name>) and

one public IncrementBy method (named "IncrementBy" + <Measure Name>) for

each abstract measure defined in the managed entity. These methods call the concrete instrumentation if the measure can be incremented according to the configuration. The difference between the Increment and IncrementBy methods is that IncrementBy accepts an integer parameter that specifies the amount by which to increment the measure.


• It defines one protected abstract DoIncrement method (named "DoIncrement" +

<Measure Name>) and one protected abstract DoIncrementBy method (named

"DoIncrementBy" + <Measure Name>) for each abstract measure defined in the

managed entity. These methods are overridden on each concrete instrumentation class.

The guidance automation in the TSMMD names the API projects ManagedEntityNameAPI.
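To make the pattern concrete, the following minimal sketch suggests the shape of a generated API helper class for a hypothetical managed entity named Shipping with a single abstract event named OrderShipped. It is an illustration of the pattern described above, not the exact generated code; in particular, the real generated class receives a ManagedEntityHealthElement in its constructor and reads the instrumentation configuration at run time.

C#

// Simplified sketch of the generated helper pattern (hypothetical names).
public abstract class ShippingAPI
{
    // The real generated code returns the concrete implementation class
    // selected by the run-time configuration for the target environment.
    public static ShippingAPI GetInstance()
    {
        throw new System.NotImplementedException("Placeholder for the generated factory logic.");
    }

    // The real generated code compares the event's instrumentation level
    // (Coarse, Fine, or Debug) with the configured granularity.
    public bool CanRaiseOrderShipped()
    {
        return true; // placeholder for the configuration check
    }

    // Raises the abstract event only if configuration allows it.
    public void RaiseOrderShipped(string orderId)
    {
        if (CanRaiseOrderShipped())
        {
            DoRaiseOrderShipped(orderId);
        }
    }

    // Overridden by the concrete implementation class for each target
    // environment (for example, an event log or WMI implementation).
    protected abstract void DoRaiseOrderShipped(string orderId);
}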

Implementation Projects

One implementation project is created for each managed entity’s trust level. Each of these

projects contains one class as the concrete implementation of the API class previously

explained. This concrete helper class extends the API class and defines the following:

• It defines one public constructor calling the base constructor.

• It defines all the implementations of the abstract methods defined on the base class.

The implementation of these methods depends on the type of concrete event or measure implementation associated with the trust level.

As an example, suppose you have a managed entity with one abstract event defined, and you add two concrete implementations for this event: a WMIEvent and an EventLogEvent. You then define two trust levels named Medium Trust and High Trust, associate the concrete WMIEvent with the Medium Trust level, and associate the concrete EventLogEvent with the High Trust level. In this case, one API project is created containing the API class with the corresponding methods for the abstract event, and two implementation projects are created, each containing a class with a different concrete implementation of the abstract event defined in the API class.

The guidance automation in the TSMMD names the implementation project

ManagedEntityName.TargetEnvironmentName.Impl.

Technology Projects

One technology project is created for each technology used. Exactly what each technology

project contains depends on the technology. This section describes the four technologies currently represented in the TSMMD tool: event logs, Windows Eventing 6.0 events, Windows Management Instrumentation (WMI) events, and performance counters.

For more information about how these technologies are used, see Chapters 9 through 12 of this guide.

There is no technology project for Enterprise Library Logging events. The TSMMD generates

the code required to create logging entries within the API helper classes.

Event Log Project

An event log project contains the following:


• It contains one *.mc file for each source defined on eventLogEvents across all entities.

Each of these files contains one entry for each eventLogEvent defined for that source.

• It contains one EventMessages.cmd file.

• It contains one EventLogEventsInstaller class.

The guidance automation provided with this guide names the event log project

EventLogEventsInstaller.

Windows Eventing 6.0 Project

A Windows Eventing 6.0 project contains the following:

• It contains one EventingResourceComplier.cmd file.

• It contains one EventsDeclaration.man XML manifest file.

The guidance automation provided with this guide names the Windows Eventing 6.0 project

WindowsEventing6EventsInstaller.

The TSMMD can create a Windows Eventing 6.0 View file that administrators can use to create

a custom view in Windows Event Log in Windows Vista and Windows Server 2008 to view

events generated by a TSMMD-based application.

WMI Project

A WMI project contains the following:

• It contains one class for each WMI event defined across managed entities.

• It contains one WmiEventsInstaller class.

The guidance automation provided with this guide names the WMI project

WmiEventsInstaller.

Performance Counter Project

A performance counter project contains the following:

• It contains one PerformanceCountersInstaller class.

The guidance automation provided with this guide names the performance counter project

PerformanceCountersInstaller.

Using the Instrumentation Helpers

After the helper classes are created, you can use the instrumentation code by calling

instrumentation methods for the generated API classes. At run time, configuration of the

application will determine which implementation of instrumentation should be used. You do not


need to be aware of the application's configuration during development; instead, the logic to

apply configuration is in the API helper classes that are generated.

Your application code should only call abstract events and measures and the instrumentation

helper code will ensure that the corresponding events and performance counters are used, as

defined in the instrumentation model.

Abstract events have three methods in their corresponding API class:

• DoRaise<eventName>(<eventParameters>). This is an abstract method that should be

implemented by subclasses. The implementation depends on the type of event (event

log, WMI, or trace file entry).

• CanRaise<eventName>(). This method returns true or false, depending on settings in

the configuration file. For example, if an event is defined as fine, and the

instrumentation level in the configuration file is set to coarse, this method returns false.

• Raise<eventName>(<eventParameters>). If configuration settings allow the event to be

raised, this method raises the event by calling the concrete implementation.

Abstract measures have five methods in their corresponding API class:

• DoIncrement<measureName>(). This is an abstract method that should be

implemented by subclasses.

• CanIncrement<measureName>(). This method returns true or false, according to

settings in the configuration file.

• Increment<measureName>(). If configuration settings allow the measure to be

incremented, this method increments the measure by calling the concrete

implementation.

• DoIncrementBy<measureName>(<incrementQuantity>). This is an abstract method that

should be implemented by subclasses.

• IncrementBy<measureName>(<incrementQuantity>). If configuration settings allow the

measure to be incremented, this method increments the measure by the quantity

defined in the <incrementQuantity> parameter.
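As a minimal usage sketch (the OrderProcessingAPI class and its OrderReceived event and OrdersProcessed measure are hypothetical placeholders; the actual class and method names are derived from your model), application code obtains the helper instance and calls only the abstract methods:

C#

// Obtain the helper; run-time configuration selects the concrete implementation.
OrderProcessingAPI api = OrderProcessingAPI.GetInstance();

// Raise an abstract event; the helper checks the configured
// instrumentation level before calling the concrete instrumentation.
api.RaiseOrderReceived("ORD-42");

// Increment an abstract measure by a specific amount.
if (api.CanIncrementOrdersProcessed())
{
    api.IncrementByOrdersProcessed(5);
}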

Verifying That Instrumentation Code Is Called from the Application

After you call the instrumentation code from your application, you can perform a validation

check in Visual Studio to verify that the instrumentation methods of the generated API helper

classes are called from the application code.

To validate instrumentation

1. In Visual Studio, click the Solution Explorer tab.

2. Right-click the model file, and then click Verify Instrumentation Coverage.

The results of the validation check appear in the Output window. Figure 2 shows a case where

helper methods are not called from the application.


Figure 2 Error list generated when Verify Instrumentation Coverage runs

You can use the validation check to provide a checklist of tasks when instrumenting your

application. The TSMMD can verify coverage for applications written in Visual Basic and C#. If

you create your application using any other language, the TSMMD will not be able to locate

calls to the instrumentation, and will report an error.

An additional limitation in this release is that the TSMMD cannot discover instrumentation

calls made from an ASP.NET Web application written in Visual Basic.

Summary

This chapter described how to generate instrumentation helper classes for an application, and

how to call the instrumentation code from the application. By starting with a management model

defined in TSMMD, you can automatically create the instrumentation code you require, and

then call the abstract events from your application code. The instrumentation helpers ensure

that the correct instrumentation technologies are used.


Chapter 9

Event Log Instrumentation

In Windows, an event is defined as any significant occurrence—whether in the operating system

or in an application—that requires users to be notified. Critical events are sent to the user in the

form of an immediate message on the screen. Other event notifications are written to one of

several event logs that record the information for future reference.

Event logging in Microsoft Windows provides a standard, centralized way for you to have your

applications record important software and hardware events. Operations staff can access events

written to the event logs using the Event Viewer and use them to diagnose application

problems.

This chapter focuses on the eventing mechanism used in versions of Windows earlier than Windows Vista. Windows Vista uses a different eventing mechanism, Eventing 6.0, as

will future versions of Windows. For information about Eventing 6.0, see Chapter 11 of this

guide.

By default, there are three event logs available:

• System log. This tracks events that occur on system components—for example, a

problem with a driver.

• Security log. This tracks security changes and possible breaches.

• Application log. This tracks events that occur in an application.

In addition to these logs, other programs, such as Active Directory, may create their own default

logs. You can also create your own custom logs for use with your own applications.

This chapter demonstrates how developers can create event log events in code and ensure that

they are written to the appropriate event log. Where appropriate, code examples reflect the

code used in the Northern Electronics Transport Consolidation Solution.

Not all of the event log instrumentation code described in this chapter is implemented in the

instrumentation helpers generated by the TSMMD tool. For example, no code is generated to

clear existing event logs or to delete event logs. However, it is still included in this chapter

because it may be required.


Installing Event Log Functionality

Before you can write event log entries, you must specify settings for the event log in the

Windows registry. These changes require administrative rights over the local computer, so they

should usually be performed when the application is installed instead of at run time. This section

describes how to use the EventLogInstaller class to install event log functionality for your

application.

Event Sources

One of the primary responsibilities of the EventLogInstaller class is to create an event source for

the application. Event sources are used to uniquely identify a source of events in the event log.

They are defined in the registry under

HKLM\System\CurrentControlSet\Services\EventLog\EventLogName.

Typically, an event source will be named after the application or managed entity that the event

arose from. Figure 1 shows an event in Event Viewer, with the source value for the event

highlighted.


Figure 1 Event log entry with the event source highlighted

By default, an event source for an application is defined in the Windows Application log.

However, it is possible to specify different logs, including custom event logs. For more

information, see "Using Custom Event Logs" later in this chapter.

The EventLogInstaller class can install event logs only on the local computer.

It is common for the source to be the name of the application or another identifying string. Any

attempt to create a duplicated Source value will result in an exception. However, a single event

log can be associated with multiple sources.
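Although the recommended approach is to register sources from an installer, the standard System.Diagnostics APIs also let you check for and create a source programmatically. The following minimal sketch uses a hypothetical source name and, like the installer, requires administrative rights:

C#

using System.Diagnostics;

// Minimal sketch: register a hypothetical event source at run time.
// This writes to the registry and therefore requires administrative
// rights, which is why this guide recommends doing it at install time.
if (!EventLog.SourceExists("NorthernShipping"))
{
    EventLog.CreateEventSource("NorthernShipping", "Application");
}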


Using the EventLogInstaller Class

To install an event log, you should create a project installer class that inherits from Installer and

set the RunInstallerAttribute for the class to true. Within your project, create an

EventLogInstaller instance for each event source and add the instance to your project installer

class.

When the install utility is called, it looks at the RunInstallerAttribute. If this attribute is set to

true, the utility installs all the items in the Installers collection associated with your project

installer. If RunInstallerAttribute is false, the utility ignores the project installer.

You modify properties of an EventLogInstaller instance either before or after adding the

instance to the Installers collection of your project installer. You must set the Source property if

your application will be writing to the event log.

If the specified source already exists when you set the Source property, EventLogInstaller

deletes the previous source and recreates it, assigning the source to the log you specify in the

Log property.

Typically, you would set the following additional properties:

• Log. This property is the event log that events will be written to. If it is not set, the

event source is registered to the Application log.

• UninstallAction. This property gets or sets a value that indicates whether the installer

tool (Installutil.exe) should remove the event log or leave it in its installed state at

uninstall time.

• CategoryResourceFile. This property identifies a category resource file, which is used to

write events with localized category strings. It should only be used if you are creating

events with categories.

• CategoryCount. This property sets (and gets) the number of categories in the category

resource file. It should only be used if you are creating events with categories.

• ParameterResourceFile. This property gets or sets the path of the resource file that

contains message parameter strings for the source. It is used when you want to

configure an event log source to write localized event messages with inserted

parameter strings.

• MessageResourceFile. This gets or sets the path of the resource file that contains

message formatting strings for the source. It is used when you want to configure an

event log source to write localized event messages.

These last four properties in the preceding list provide a lot of flexibility in creating events that

are useful for manageability purposes. By using message resource files, categories, and inserting

parameters, you can create messages with more useful information, and manageability

applications can perform automated processes based on particular parameters. For more

information about how these properties are used, see "Writing Events to an Event Log" later in

this chapter.


Typically, you should not call the methods of the EventLogInstaller class from within your code;

they are generally called only by the InstallUtil.exe installation utility. The utility automatically

calls the Install method during the installation process. It backs out failures, if necessary, by

calling the Rollback method for the object that generated the exception.
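For example, assuming the compiled installer assembly is named EventLogEventsInstaller.dll, running installutil EventLogEventsInstaller.dll from a Visual Studio command prompt registers the event sources, and running installutil /u EventLogEventsInstaller.dll removes them.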

The following code example shows EventLogInstaller.

C#

using System;
using System.Management.Instrumentation;
using System.ComponentModel;
using System.Diagnostics;
using System.Configuration.Install;
using System.IO;
using System.Text;

namespace EventLogEvents.InstrumentationTechnology
{
    [RunInstaller(true)]
    public class EventLogEventsInstaller : Installer
    {
        // constructor
        public EventLogEventsInstaller()
        {
            // Installer for events with source name: PS
            EventLogInstaller myEventLogInstallerPS = new EventLogInstaller();
            string resourceFilePS = Path.Combine(Environment.CurrentDirectory,
                "EventMessagesPS.dll");
            myEventLogInstallerPS.Source = "PS";
            myEventLogInstallerPS.Log = "Application";
            myEventLogInstallerPS.CategoryCount = 0;
            myEventLogInstallerPS.CategoryResourceFile = resourceFilePS;
            myEventLogInstallerPS.MessageResourceFile = resourceFilePS;
            Installers.Add(myEventLogInstallerPS);

            // Installer for events with source name: SS
            EventLogInstaller myEventLogInstallerSS = new EventLogInstaller();
            string resourceFileSS = Path.Combine(Environment.CurrentDirectory,
                "EventMessagesSS.dll");
            myEventLogInstallerSS.Source = "SS";
            myEventLogInstallerSS.Log = "Application";
            myEventLogInstallerSS.CategoryCount = 0;
            myEventLogInstallerSS.CategoryResourceFile = resourceFileSS;
            myEventLogInstallerSS.MessageResourceFile = resourceFileSS;
            Installers.Add(myEventLogInstallerSS);

            // Installer for events with source name: TS
            EventLogInstaller myEventLogInstallerTS = new EventLogInstaller();
            string resourceFileTS = Path.Combine(Environment.CurrentDirectory,
                "EventMessagesTS.dll");
            myEventLogInstallerTS.Source = "TS";
            myEventLogInstallerTS.Log = "Application";
            myEventLogInstallerTS.CategoryCount = 0;
            myEventLogInstallerTS.CategoryResourceFile = resourceFileTS;
            myEventLogInstallerTS.MessageResourceFile = resourceFileTS;
            Installers.Add(myEventLogInstallerTS);

            // Installer for events with source name: WSTransport
            EventLogInstaller myEventLogInstallerWSTransport = new EventLogInstaller();
            string resourceFileWSTransport = Path.Combine(Environment.CurrentDirectory,
                "EventMessagesWSTransport.dll");
            myEventLogInstallerWSTransport.Source = "WSTransport";
            myEventLogInstallerWSTransport.Log = "Application";
            myEventLogInstallerWSTransport.CategoryCount = 0;
            myEventLogInstallerWSTransport.CategoryResourceFile = resourceFileWSTransport;
            myEventLogInstallerWSTransport.MessageResourceFile = resourceFileWSTransport;
            Installers.Add(myEventLogInstallerWSTransport);
        }
    }
}

Writing Events to an Event Log
After the event log functionality is installed, you can write events to the event log. You have two choices in writing events to an event log:

• WriteEntry method
• WriteEvent method

The next sections describe each of these methods.

Using the WriteEntry Method
After the EventLog component is appropriately configured, you can use the WriteEntry overloaded method to write the event to the appropriate event log.

The following code shows one of the overloads used.

C#
byte[] myByte = new byte[10];
for (int i = 0; i < 10; i++)
{
    myByte[i] = (byte)(i % 2);
}
// Write an entry to the event log. The overload that takes a source name is
// static on the EventLog class; myEventID and myCategory are assumed to be
// defined elsewhere.
Console.WriteLine("Write from second source ");
EventLog.WriteEntry("SecondSource", "Writing warning to event log.",
    EventLogEntryType.Error, myEventID, myCategory, myByte);

The WriteEvent Method
The WriteEvent method is a more flexible alternative to the WriteEntry method. This method is used in the instrumentation helpers generated by the TSMMD tool.

The WriteEvent method can be used to write a localized entry with additional event-specific data to the event log, using a source already registered as an event source for the appropriate log. You specify the event properties with resource identifiers rather than string values.

The Event Viewer uses the resource identifiers to display the corresponding strings from the localized resource file for the source. You must register the source with the corresponding resource file before you write events using resource identifiers.

The instance input specifies the event message and properties. You should set the InstanceId of the instance input to the identifier of the message defined in the source's message resource file. Optionally, you can set the CategoryId and EntryType of the instance input to define the category and event type of your event entry. You can also specify an array of language-independent strings to insert into the localized message text.

The following code shows the WriteEvent method.

C#
protected override void DoRaisePickupServiceSOAPError(string errorMessage)
{
    string source = "SS";
    string logName = "Application";
    string machineName = ".";
    long eventId = 2003;
    int categoryId = 0;
    Object[] values = new Object[1];
    values[0] = errorMessage;
    EventLogEntryType entryType = EventLogEntryType.Error;

    EventLog eventLog = new EventLog();
    eventLog.Source = source;
    eventLog.Log = logName;
    eventLog.MachineName = machineName;

    EventInstance eventInstance = new EventInstance(eventId, categoryId);
    eventInstance.EntryType = entryType;
    eventLog.WriteEvent(eventInstance, values);
}

Set values to a null reference if the event message does not contain formatting placeholders for replacement strings.

You can specify binary data with an event when it is necessary to provide additional details for the event. For example, use the data parameter to include information about a specific error. The Event Viewer does not interpret the associated event data; it displays the data in a combined hexadecimal and text format. You should use event-specific data sparingly; include it only if you are sure it will be useful. You can also use event-specific data to store information the application can process independently of the Event Viewer.

The specified source must be registered for an event log before you use WriteEvent. The specified source must be configured for writing localized entries to the log; at minimum, the source must have a message resource file defined.

If your application writes entries using both resource identifiers and string values, you must register two separate sources. For example, configure one source with resource files, and then use that source in the WriteEvent method to write entries using resource identifiers to the event log. Then create a different source without resource files, and use that source in the WriteEntry method to write strings directly to the event log using that source.
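The following is a minimal sketch of this split, assuming a using directive for System.Diagnostics. The source names SSLocalized and SSPlain are illustrative and are assumed to have been registered already, the first with a message resource file and the second without.

C#
// Source registered with a message resource file: write with resource identifiers.
EventLog localizedLog = new EventLog("Application", ".", "SSLocalized");
// 2003 is an illustrative message identifier assumed to exist in the resource file.
EventInstance instance = new EventInstance(2003, 0, EventLogEntryType.Error);
localizedLog.WriteEvent(instance, "Connection refused");

// Source registered without resource files: write literal strings.
EventLog plainLog = new EventLog("Application", ".", "SSPlain");
plainLog.WriteEntry("Literal message text.", EventLogEntryType.Error);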

Reading Events from Event Logs
It is not necessary to use the EventLogInstaller class to read events from event logs. Instead, you should perform the following high-level tasks:

1. Create and configure an instance of the EventLog class.
2. Use the Entries collection to read the entries in the log.

Reading events from event logs is not included in the functionality of the instrumentation helper classes automatically generated by the TSMMD tool.

You should treat the data from an event log as you would any other input coming from outside your system. Your application may need to validate the data in the event log before using it as input. Another process, possibly a malicious one, may have accessed the event log and added entries.

Creating and Configuring an Instance of the EventLog Class
An instance of the EventLog class is created in the following code.

C#
EventLog eventLog = new EventLog();

There are three major properties involved in configuring an instance of the EventLog class:

• Log. This property indicates the log with which you want to interact.
• MachineName. This property indicates the computer on which the log resides.
• Source. This property indicates the source string that will be used to identify your component when it writes entries to a log. In this case, you are reading from a log, so you do not need to specify this property.

To read from an event log, you must specify the Log and MachineName properties, so that the component is aware of which log to read from. The following code shows the Log and MachineName properties specified.

C#
eventLog.Log = logName;
eventLog.MachineName = machineName;

Visual Basic
Dim log, machine As String
...
Dim EventLog1 As New EventLog
EventLog1.Log = log
EventLog1.MachineName = machine

Using the Entries Collection to Read the Entries
You use the Entries collection to look at the entries in a particular event log. You can use standard collection properties such as Count and Item to work with the elements the collection contains. You might read event log entries to learn more about a problem that occurred in your system, to identify usage patterns, or to identify problems (such as a failing hard drive) before they cause damage.

The Entries collection is read-only, so it cannot be used to write to the event log.

The following example shows how to retrieve all of the entries from a log.

C#
foreach (System.Diagnostics.EventLogEntry entry in EventLog1.Entries)
{
    Console.WriteLine(entry.Message);
}

If you ask for the count of entries in a new custom log that has not yet been written to, the system returns the count of the entries in the Application log on that server. To avoid this problem, make sure that the logs you are counting have been created and written to.

Clearing Event Logs
Event logs are set to a maximum size that determines how many entries each log can contain. When an event log is full, it either stops recording entries or begins overwriting the oldest entries with new entries, depending on the settings specified in the Windows Event Viewer. In either case, you can clear the log of its existing entries to free the log and allow it to start recording events again. You must have Administrator rights to the computer on which the log resides in order to clear entries.

Clearing event logs is not included in the functionality of the instrumentation helper automatically generated by the TSMMD tool.

By default, the Application log, System log, and Security log have a maximum size of 4992 K. Custom logs have a default maximum of 512 K.

You can also use the Windows Event Viewer to free up space on a log that has become full. You can set the log to overwrite existing events, you can write log entries to an external file, or you can increase the maximum size of the log. However, you cannot remove only some of the entries in a log; when you clear a log, you remove all of its contents. For more information, see "How to: Launch Event Viewer" on MSDN or your Event Viewer documentation.

You use the Clear method to clear the contents of an event log. The following code is used to clear the events from EventLog1.

C#
EventLog1.Clear();

Deleting Event Logs
You can delete any event log on your local computer or a remote server if you have the appropriate registry rights. When you delete a log, the system first deletes the file that contains the log's contents, and then accesses the registry and removes the registration for all of the event sources that were registered for that log. Even if you re-create the log at a later point, this process will not create the sources by default, so some applications that previously were able to write entries to that log may not be able to write to the new log.

Deleting event logs is not included in the functionality of the instrumentation helper automatically generated by the TSMMD tool.

To delete an event log, you should use the Delete method and specify the name of the log you want to delete. The Delete method is static, so you do not need to create an instance of the EventLog component before you call the method; instead, you can call the method on the EventLog class itself, as shown in the following code.

C#
System.Diagnostics.EventLog.Delete("MyCustomLog");

Re-creating an event log can be a difficult process. It is good practice not to delete any of the system-created event logs, such as the Application log. You can delete your custom logs and re-create them as needed.

The following code shows an example of verifying that a log exists and deleting the log if it does. This code assumes that an Imports or using statement exists for the System.Diagnostics namespace.

C#
if (System.Diagnostics.EventLog.Exists("MyCustomLog"))
{
    System.Diagnostics.EventLog.Delete("MyCustomLog");
}

Removing Event Sources
You can remove your source if you no longer need to use it to write entries to that log. Doing this affects all components that used that source to write to the log. For example, if you have two Web services that write to a log using the source name "mysource," removing "mysource" as a valid source of events affects both Web services.

Removing event sources is not included in the functionality of the instrumentation helper automatically generated by the TSMMD tool.

To remove an event source, you should call the DeleteEventSource method, specifying the source name to remove. The following code shows an event source named MyApp1 being removed from the local computer.

C#
System.Diagnostics.EventLog.DeleteEventSource("MyApp1");

The following code removes an event source from a remote computer.

C#
System.Diagnostics.EventLog.DeleteEventSource("MyApp1", "myserver");

Removing a source does not remove the entries that were written to that log using this source. However, it does affect the entries by adding information to them indicating that the source cannot be found.

Creating Event Handlers
You can create event handlers for your EventLog components. These can be used to determine when an event has been raised. Notifications can then be raised, or code can be run to automatically correct a problem.

The instrumentation helper automatically generated by the TSMMD tool does not create event handlers.

To programmatically create a handler

1. Attach an event handler of type EntryWrittenEventHandler to your component so that the eventLog1_EntryWritten procedure is called when an entry is written to the log. Your code should look like the following.

   this.eventLog1.EntryWritten += new
       System.Diagnostics.EntryWrittenEventHandler(
       this.eventLog1_EntryWritten);

   For more information about this syntax, see "Event Handlers in Visual Basic and Visual C#" on MSDN at http://msdn2.microsoft.com/en-us/library/aa984105(VS.71).aspx.

2. Create the EntryWritten procedure and define the code you want to process the entries.

3. Set the EnableRaisingEvents property to true.
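The following is a minimal sketch of all three steps in one place. The LogWatcher class, the eventLog1 field, and the choice of the Application log are illustrative assumptions; they are not produced by the TSMMD tool.

C#
using System;
using System.Diagnostics;

public class LogWatcher
{
    // Assumed field: an EventLog bound to the local Application log.
    private EventLog eventLog1 = new EventLog("Application", ".");

    public void Start()
    {
        // Step 1: attach the handler.
        this.eventLog1.EntryWritten +=
            new EntryWrittenEventHandler(this.eventLog1_EntryWritten);
        // Step 3: enable event raising so that EntryWritten fires.
        this.eventLog1.EnableRaisingEvents = true;
    }

    // Step 2: the procedure that processes each new entry.
    private void eventLog1_EntryWritten(object sender, EntryWrittenEventArgs e)
    {
        Console.WriteLine("Entry written: " + e.Entry.Message);
    }
}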

Using Custom Event Logs
You should use a custom log if you want to organize events in a more granular way than is allowed when your components write entries to the default Application log. For example, suppose you have a component named OrderEntry that writes events to an event log. You are interested in backing up and saving these entries for a longer period of time than some other entries in the Application log. Instead of registering your component to write to the Application log, you can create a custom log named OrdersLog and register your component to write entries to that log instead. That way, all of your order information is stored in one place and will not be affected if the entries in the Application log are cleared.

You may also use custom event logs in situations where you do not have rights to write to the Application event log.

Writing to a Custom Log
Typically, writing to a custom log consists of two high-level tasks:

• Installing the custom log (only necessary if the log does not already exist)
• Writing events to the custom log

The next sections describe these tasks in more detail.

Installing the Custom Log
You can use the EventLogInstaller class to create a custom log. To do this, you set the Log property to the name of a log that does not already exist. The system then automatically creates the custom log for you and registers an event source for that log.

The following example shows how to create a custom log named MyNewLog on the local computer.

C#
using System;
using System.Management.Instrumentation;
using System.ComponentModel;
using System.Diagnostics;
using System.Configuration.Install;
using System.IO;
using System.Text;

namespace EventLogEvents.InstrumentationTechnology
{
    [RunInstaller(true)]
    public class EventLogEventsInstaller : Installer
    {
        // constructor
        public EventLogEventsInstaller()
        {
            // Installer for events with source name: PS
            EventLogInstaller myEventLogInstallerPS = new EventLogInstaller();
            string resourceFilePS = Path.Combine(Environment.CurrentDirectory,
                "EventMessagesPS.dll");
            myEventLogInstallerPS.Source = "PS";
            myEventLogInstallerPS.Log = "MyNewLog";
            myEventLogInstallerPS.CategoryCount = 0;
            myEventLogInstallerPS.CategoryResourceFile = resourceFilePS;
            myEventLogInstallerPS.MessageResourceFile = resourceFilePS;
            Installers.Add(myEventLogInstallerPS);
        }
    }
}

Writing Events to the Custom Log
Writing events to a custom log is the same process as writing events to any other log. For more details, see "Writing Events to an Event Log" earlier in this chapter.
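As a brief sketch, assuming the MyNewLog log and the PS source from the preceding installer are in place, and a using directive for System.Diagnostics:

C#
// Write through the PS source, which the installer registered to MyNewLog.
EventLog customLog = new EventLog("MyNewLog", ".", "PS");
// 2001 is an illustrative message identifier assumed to exist in EventMessagesPS.dll.
EventInstance orderEvent = new EventInstance(2001, 0, EventLogEntryType.Information);
customLog.WriteEvent(orderEvent, "Order 12345");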

Other Custom Log Tasks
Other tasks you may perform with custom logs, such as reading from the log or clearing entries from the log, are the same as performing the tasks on built-in logs. For more details, see the corresponding sections earlier in this chapter.

Summary
This chapter has demonstrated many of the developer tasks associated with event log instrumentation. Many of the developer tasks you will need to perform are automated by the TSMMD tool. However, it is still important for developers to understand the work performed by the TSMMD tool when developing instrumented applications.

Chapter 10

WMI Instrumentation

Windows Management Instrumentation (WMI) is the Microsoft implementation of Web-based Enterprise Management (WBEM), which is an industry initiative developed to standardize the technology for managing enterprise computing environments. WMI uses classes based on the Common Information Model (CIM) industry standard to represent systems, processes, networks, devices, and other enterprise components.

WMI supplies a pre-installed class schema that allows scripts or applications written in scripting languages, Visual Basic, or C++ to monitor and configure applications, system or network components, and hardware in an enterprise. For example, instances of the Win32_Process class represent all the processes on a computer, and the Win32_LogicalDisk class can represent any disk device. For more information, see "Win32 Classes" in the Windows Management Instrumentation documentation in the MSDN Library at http://msdn.microsoft.com/library.

The WMI architecture consists of the following tiers:

• Client software components. These perform operations using WMI, such as reading management details, configuring systems, and subscribing to events.
• Object manager. This is a broker between providers and clients that provides some key services, such as standard event publication and subscription, event filtering, a query engine, and other services.
• Provider software components. These capture and return live data to the client applications, process method invocations from the clients, and link the client to the infrastructure being managed.

Not all the WMI instrumentation code described in this chapter is implemented in the instrumentation helpers generated by the Team System Management Model Designer Power Tool (TSMMD). However, it is still included in this chapter as it may be required.

WMI and the .NET Framework
WMI is the instrumentation standard used by management applications such as Microsoft Operations Manager, Microsoft Application Center, and many third-party management tools. The Windows operating system is instrumented with WMI, but developers who want to generate instrumentation for their own applications must write their own instrumentation code.

WMI in the .NET Framework is built on the original WMI technology and allows the same development of applications and providers with the advantages of programming in the .NET Framework.

The classes in the System.Management.Instrumentation namespace allow managed code developers to surface information to WMI-enabled tools. The goal in creating this namespace was to minimize the work involved in enabling an application for management. The namespace also makes it easy to expose events and data. Exposing an application's objects for management should be intuitive for .NET Framework developers; the WMI schema is object-oriented and has many traits in common with the .NET Framework metadata: code classes map to schema classes, properties on code objects map to properties on WMI objects, and so on. Therefore, it is easy to instrument managed code applications to provide manageability. Developers who are already familiar with writing managed code have many of the skills required to provide instrumentation through WMI, so there is almost no learning curve.

You can expose application information for management by making declarations; no extensive extra coding is required. The developer marks the objects as manageable by using the .NET Framework classes and defines how they map to the corresponding WMI classes.

The developer can also derive the class from a common System.Management.Instrumentation schema class, in which case the attribution and mapping is already done. The InstrumentedAttribute and InstrumentationClassAttribute classes are the primary means of instrumenting your code.

After your application is instrumented, other applications can discover, monitor, and configure objects and events through WMI and the management applications developed by the extensive WMI customer base (such as Computer Associates, Tivoli Systems, Inc., BMC Software, Hewlett-Packard, and so on). The managed-code events marked for management are raised as WMI events when the WMI event raising API is invoked.

Security support in System.Management is tightly linked to security in WMI. In WMI, client access to information is controlled using namespace-based security. For more information, see "Security for WMI in .NET Framework" on MSDN at http://msdn2.microsoft.com/en-us/library/ms186154.aspx.

Benefits of WMI Support in the .NET Framework
Writing a client application or provider using WMI support in the .NET Framework provides several advantages over original WMI. In this case, writing a provider means adding instrumentation to an application written in managed code.

WMI support in the .NET Framework offers the following advantages for writing client applications and providers:

• Use of common language runtime features, such as garbage collection, custom indexers, and dictionaries. It also offers other common language runtime features such as automatic memory management, efficient deployment, an object-oriented framework, evidence-based security, and exception handling.
• Definition of classes and publication of instances entirely with .NET Framework classes to instrument applications so the applications can provide data to WMI. The classes in System.Management.Instrumentation allow you to register a new provider, create new classes, and publish instances without the developer having to use Managed Object Format (MOF) code.
• Simplicity of use. Applications for WMI are sometimes difficult or lengthy to develop. The class structure of System.Management brings more script-like simplicity to applications developed in the .NET Framework. The development of both applications and providers can be done more quickly, with easier debugging.
• Access to all WMI data. Client applications have the same access to, and can do all the same operations with, WMI data as in the original WMI. Provider-instrumented applications are somewhat more restricted. For more information, see "Limitations of WMI in .NET Framework" on MSDN at http://msdn2.microsoft.com/en-us/library/ms186136.aspx.

Limitations of WMI in the .NET Framework
An instrumented application can exist only as a decoupled provider, out of process to WMI. Objects exposed through native WMI providers can still expose the full range of WMI features, which are accessible from managed code through the System.Management classes, and a client application can still do most of the original WMI client operations.

You will encounter most of the limitations of WMI in the .NET Framework when writing provider-instrumented applications. The limitations include the following:

• Managed code providers cannot define methods. Instrumented applications running on the .NET Framework and providing data to WMI cannot use the System.Management or System.Management.Instrumentation class methods to define and implement WMI methods. A client application can still invoke the methods of an original WMI provider.
• Instrumented applications cannot expose writable properties on new classes that are not wrappers of underlying unmanaged WMI classes. A client of a WMI class exposed by an instrumented managed application cannot change instance data and then write the data back using a Put operation.
• You cannot create qualifiers on instrumented classes. Instead, managed code defines several operative attributes in System.Management.Instrumentation that indicate how the mapping between WMI classes and managed code classes is performed.
• You cannot define properties of instrumented objects as key properties.
• Although WMI supports both embedded objects and references to other objects, using WMI in the .NET Framework you can only use embedded objects when defining new classes.
• You cannot create an event consumer provider in managed code. For more information, see "Writing an Event Consumer Provider" in the Windows Management Instrumentation documentation in the MSDN Library at http://msdn.microsoft.com/library. However, managed client applications can still access existing unmanaged code and WMI consumer providers, such as the Standard Consumers. For more information, see "Monitoring and Responding to Events with Standard Consumers" in the Windows Management Instrumentation documentation in the MSDN Library at http://msdn.microsoft.com/library.

• WMI in the .NET Framework does not support refreshers. If you want to retrieve data from the Win32_FormattedData_* classes, you can use the System.Diagnostics.PerformanceCounter class instead of using refreshers with the Win32_FormattedData_* classes, or you can get the raw counter samples from the Win32_PerfRawData_* classes at the desired interval and calculate the result yourself using the last two samples. (A sketch of the PerformanceCounter alternative appears after this list.) For more information about these Win32 classes, see "Win32 Classes" in the Windows Management Instrumentation documentation in the MSDN Library at http://msdn.microsoft.com/library.
• The System.Management.Instrumentation namespace does not support the inheritance of classes if the derived class is in a different namespace than the parent class.
• The WMI infrastructure and providers on native and managed (.NET) stacks have not been verified for use in a cluster environment. This means that the WMI infrastructure and providers are not supported by Microsoft in a cluster environment.
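The following is a minimal sketch of the PerformanceCounter alternative mentioned in the first bullet; the processor counter used here is only an illustrative choice.

C#
using System;
using System.Diagnostics;
using System.Threading;

public class CounterSample
{
    public static void Main()
    {
        // Read a formatted value directly instead of using a WMI refresher
        // with the Win32_FormattedData_* classes.
        PerformanceCounter cpu = new PerformanceCounter(
            "Processor", "% Processor Time", "_Total");

        cpu.NextValue();       // The first call primes the counter and returns 0.
        Thread.Sleep(1000);    // Wait one sample interval.
        Console.WriteLine("CPU: {0:F1}%", cpu.NextValue());
    }
}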

Using WMI.NET Namespaces
WMI organizes its preinstalled classes into namespaces. You should use the following recommendations when defining WMI namespaces:

• As a convenience during development, and if it is not otherwise specified by the InstrumentedAttribute class on the assembly, instrumentation data is published to the root\default namespace. However, you should normally override this default and define a specific namespace for your application, so that it can be managed independently.
• Create a separate namespace for your particular assembly, group of assemblies, or application that has similar security requirements. Use the company name and software product name in your namespace definition to ensure uniqueness. For example, instrumentation from your application can be published into the root\<your company name>\<your product name> namespace. Potentially, the namespace hierarchy can also contain version information (see more about versioning in the schema registration section).
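As a minimal sketch, the namespace can be defined with the assembly-level InstrumentedAttribute; the company and product names below are placeholders.

C#
using System.Management.Instrumentation;

// Publish this assembly's instrumentation into a product-specific namespace
// instead of root\default. Company and product names are placeholders.
[assembly: Instrumented(@"root\NorthernElectronics\Shipping")]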

Administrators can use the WMI Control to specify security constraints for a specific namespace. For more information, see "Locating the WMI Control" in the Windows Management Instrumentation documentation in the MSDN Library at http://msdn.microsoft.com/library.

The WMI namespaces, such as root\cimv2 and root\default, are not to be confused with the .NET Framework namespaces System.Management and System.Management.Instrumentation. The System.Management namespace contains the WMI in .NET Framework classes used to perform WMI operations. The System.Management.Instrumentation namespace contains the classes for adding instrumentation to your application.

Administrators and IT developers can use the classes in System.Management to write applications that access WMI data in any .NET Framework language, such as C#, Visual Basic .NET, or J#. These applications can do the following:

• Enumerate or retrieve a collection of instance property data, such as the FreeSpace property of all the instances of Win32_LogicalDisk on all the computers of a network. For more information, see "Win32_LogicalDisk" in the Windows Management Instrumentation documentation in the MSDN Library at http://msdn.microsoft.com/library.
• Query for selected instance data. WMI in the .NET Framework uses the original WMI WQL query language, a subset of SQL. For more information about WQL, see "WQL query language" in the Windows Management Instrumentation documentation in the MSDN Library at http://msdn.microsoft.com/library.
• Subscribe to events, defined as instances of event classes. An event occurs when an instrumented application (provider) creates an instance of one of its event classes.

Publishing the Schema for an Instrumented Assembly to WMI
An instrumented application must undergo a registration stage, in which its schema is registered in the WMI repository. Schema publishing is required on a per-assembly basis. Any assembly that declares instrumentation types (events or instances) must have its schema published to WMI. This is done using the standard installer mechanisms in the .NET Framework.

As a convenience for developers at design time, the schema is automatically published the first time an application raises an event or publishes an instance. This avoids having to declare a project installer and run the InstallUtil.exe tool during rapid prototyping of an application. However, this registration will succeed only if the user invoking it is a member of the local Administrators group, so you should not rely on this as a mechanism for publishing the schema.

The event (or instance) class schema resides in the assembly and is registered in the WMI repository during installation.

To publish a schema to WMI, you must first define an installer for the project. You can use the ManagementInstaller class provided in the System.Management.Instrumentation namespace; the DefaultManagementProjectInstaller base class shown in the following example already hosts a ManagementInstaller for the assembly.

C#
[RunInstaller(true)]
public class WmiEventsInstaller : DefaultManagementProjectInstaller
{
    // constructor
    public WmiEventsInstaller()
    {
    }
}
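Alternatively, if your project already defines its own installer class, a minimal sketch of hosting the ManagementInstaller explicitly might look like the following (the class name WmiSchemaInstaller is illustrative).

C#
using System.ComponentModel;
using System.Configuration.Install;
using System.Management.Instrumentation;

[RunInstaller(true)]
public class WmiSchemaInstaller : Installer
{
    public WmiSchemaInstaller()
    {
        // Registers this assembly's instrumentation schema with WMI
        // when InstallUtil.exe calls the Install method.
        Installers.Add(new ManagementInstaller());
    }
}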

Typically, you should not call the methods of the ManagementInstaller class from within your code; they are generally called only by the InstallUtil.exe installation utility. The utility automatically calls the Install method during the installation process. It backs out failures, if necessary, by calling the Rollback method for the object that generated the exception.

Republishing the Schema
In some cases, you will make changes to an application, and it will need to be reinstalled. In this case, you should perform the following actions:

• In the case of schema changes, reinstall the assembly.
• Reinstall the assemblies for all classes derived from the changed class (if there are any).
• Recompile client applications.

If the currently registered schema becomes corrupted for any reason, there might be cases in which re-running InstallUtil.exe will not detect the need to re-register the original schema. In this case, it is possible to force the installer to re-install the schema using the /f or /force switch.

It is not always necessary to recompile the client application when the schema changes. If the event schema has been changed by adding properties and methods, and none of the earlier defined properties or methods were removed, you can move the application's instrumentation to a different WMI namespace and not recompile the client application.

Unregistering the Schema
The ManagementInstaller class does not perform any operations at uninstall; specifically, it does not unregister the schema. The reason is that more than one WMI provider could use the same schema, and there is no mechanism in place for identifying whether a particular schema is not being used by any other entity and can be safely removed. If you need to unregister the schema, you can use the WBEMTest utility.

Instrumenting Applications Using WMI.NET Classes
Developers can use the classes in System.Management.Instrumentation to instrument their application so that it provides data to WMI about the behavior of the application. Instrumenting an application involves defining classes and then setting attributes on those classes to designate them for instrumentation. The running application creates instances of those classes and publishes them to WMI using the services provided by the WMI.NET classes. For example, your application may expose data about its health. An instrumented application is a provider of data to WMI in the same way that providers work in the original WMI.
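The following is a minimal sketch of this pattern. The class and member names are illustrative assumptions drawn from the shipping scenario; the attributes designate the classes for instrumentation, and the Instrumentation class surfaces the data to WMI.

C#
using System.Management.Instrumentation;

// Hypothetical instance class: a snapshot of service health published to WMI.
[InstrumentationClass(InstrumentationType.Instance)]
public class ShippingServiceHealth
{
    public string ServiceName;
    public int QueueLength;
}

// Hypothetical event class: raised as a WMI event.
[InstrumentationClass(InstrumentationType.Event)]
public class OrderRejectedEvent
{
    public string OrderId;
    public string Reason;
}

public static class HealthInstrumentation
{
    public static void Report()
    {
        ShippingServiceHealth health = new ShippingServiceHealth();
        health.ServiceName = "PickupService";
        health.QueueLength = 0;
        Instrumentation.Publish(health);   // Make the instance visible to WMI consumers.

        OrderRejectedEvent rejected = new OrderRejectedEvent();
        rejected.OrderId = "12345";
        rejected.Reason = "Invalid address";
        Instrumentation.Fire(rejected);    // Raise the event through WMI.
    }
}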

WMI .NET Classes
The following tables list the main classes that must be implemented for each of the specified task areas. Where relevant, the associated interfaces and configuration elements are also listed. This is not a comprehensive list of all the classes in each namespace, but it includes all classes demonstrated in the How-to topics.

Classes in the System.Management Namespace

• Gathering WMI class information: ManagementObject, ManagementClass
• Querying for data, and querying for data asynchronously: SelectQuery, ManagementObjectSearcher, WqlObjectQuery, ObjectQuery; ManagementObjectCollection, ManagementOperationObserver
• Executing methods, and executing methods asynchronously: ManagementBaseObject; ManagementOperationObserver
• Receiving events, and receiving events asynchronously: WqlEventQuery, ManagementEventWatcher; EventArrivedEventArgs, EventArrivedEventHandler, CompletedEventArgs, CompletedEventHandler
• Connecting to a remote computer: ConnectionOptions, ManagementScope

Classes in the System.Management.Instrumentation Namespace

• Creating data providers: Instance, InstrumentationClassAttribute, InstrumentedAttribute
• Creating event providers: BaseEvent, Instrumentation
• Registering a provider: ManagementInstaller

Accessing WMI Data Programmatically
You can create queries for WMI data in the .NET Framework, specified as a string in the WMI-supported WQL format, or constructed using a query class from the System.Management namespace. The WqlEventQuery class is used for event queries, and the WqlObjectQuery class is used for data queries.

The instrumentation helpers automatically generated by the TSMMD tool do not perform queries for WMI data.

Creating targeted queries can noticeably increase the speed with which data is returned, and make it easier to work with the returned data. Targeted queries can also cut down on the amount of data that is returned, an important consideration for scripts that run over the network.

The following code example shows how a query can be invoked using the ManagementObjectSearcher class. In this case, the SelectQuery class is used to specify a request for environment variables under the System user name. The query returns results in a collection.

C#
using System;
using System.Management;

// This example demonstrates how to perform an object query.
public class QueryInstances
{
    public static int Main(string[] args)
    {
        // Create a query for system environment variables only.
        SelectQuery query = new SelectQuery("Win32_Environment",
            "UserName=\"<SYSTEM>\"");

        // Initialize an object searcher with this query.
        ManagementObjectSearcher searcher = new ManagementObjectSearcher(query);

        // Get the resulting collection and loop through it.
        foreach (ManagementObject envVar in searcher.Get())
        {
            Console.WriteLine("System environment variable {0} = {1}",
                envVar["Name"], envVar["VariableValue"]);
        }
        return 0;
    }
}

The preceding example requires references to the System and System.Management namespaces.
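For event queries, the following is a minimal sketch using WqlEventQuery and ManagementEventWatcher; the process-creation event and the five-second polling interval are illustrative choices.

C#
using System;
using System.Management;

public class WatchProcessCreation
{
    public static void Main()
    {
        // Watch for new Win32_Process instances, polled every five seconds.
        WqlEventQuery query = new WqlEventQuery(
            "__InstanceCreationEvent",
            TimeSpan.FromSeconds(5),
            "TargetInstance ISA 'Win32_Process'");
        ManagementEventWatcher watcher = new ManagementEventWatcher(query);

        // Block until one matching event arrives, then report it.
        ManagementBaseObject arrived = watcher.WaitForNextEvent();
        ManagementBaseObject target =
            (ManagementBaseObject)arrived["TargetInstance"];
        Console.WriteLine("Process started: {0}", target["Name"]);
        watcher.Stop();
    }
}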

Summary
This chapter has demonstrated many of the developer tasks associated with WMI instrumentation. Most of the developer tasks you will need to perform are automated by the TSMMD tool. However, it is still important for developers to understand the work performed by the TSMMD tool when developing applications with WMI instrumentation.

Chapter 11

Windows Eventing 6.0 Instrumentation

In versions of the Windows operating system earlier than Windows Vista, you would use either Event Tracing for Windows (ETW) or event logging to log events. Windows Vista introduces a new eventing model that unifies both the ETW and Windows Event Log APIs.

The new model uses an XML manifest to define the events that you want to publish. Events can be published to a channel or an ETW session. You can publish the events to the following types of channels:

• Admin
• Operational
• Analytic
• Debug

This chapter provides an introduction to the Windows Eventing 6.0 mechanism, and describes the tasks that must be performed when developing Eventing 6.0 instrumentation. For more information about the Windows Event Log, see "Windows Event Log" on MSDN (http://msdn.microsoft.com/en-us/library/aa385780(VS.85).aspx).

Windows Eventing 6.0 Overview
Windows Vista and future versions of Windows Server incorporate an enhanced, XML-based Event Log and the associated administrative tools. The new event log format allows events to include additional fields for keywords, activity correlation between machines, and a link to additional information. Events can be persisted into a hierarchical log structure, forwarded between machines, and linked to scheduled tasks. Of course, backward compatibility with traditional Windows events is also supported.

Reusable Custom Views
A new Event Viewer provides graphical access to this richer event information (see Figure 1). The new viewer provides capabilities for creating rule-based filters, searching across multiple logs, and attaching tasks to specific events. Events can be displayed as XML, and administrators can create custom filters using XPath queries. Custom views can also be exported for use on other computers, or shared with other administrators.

Figure 1

Windows Event Viewer in Windows Vista and Windows Server 2008

Command-Line Operations
The new WevtUtil.exe utility is used to access event log information from the command line. It supports the following command parameters:

• al (archive-log). Archive an exported log.
• cl (clear-log). Clear a log.
• el (enum-logs). List log names.
• ep (enum-publishers). List event publishers.
• epl (export-log). Export a log.
• gl (get-log). Get log configuration information.
• gli (get-log-info). Get log status information.
• gp (get-publisher). Get publisher configuration information.
• im (install-manifest). Install event publishers and logs from manifest.
• qe (query-events). Query events from a log or log file.
• sl (set-log). Modify configuration of a log.
• um (uninstall-manifest). Uninstall event publishers and logs from manifest.

Administrators can execute scripts against the event log if Windows PowerShell is installed.

Event Subscriptions
An instance of the Event Viewer enables administrators to view events on a single local or remote computer. However, some troubleshooting scenarios may involve examining filtered events stored in logs on multiple computers. The new Windows Eventing system includes the ability to forward copies of events from multiple remote computers and collect them on a single computer.

Administrators create event subscriptions to specify exactly which events will be collected and in which log they will be stored. Forwarded events can be viewed and manipulated in the same way as any other local events. Event subscriptions require configuring the Windows Remote Management (WinRM) service and the Windows Event Collector (Wecsvc) service on the participating forwarding and collecting computers.

Integration with Task Scheduler
Using the new Event Viewer, administrators have the ability to associate configurable tasks with specific events (see Figure 2). The new Task Scheduler manages the tasks created through the new Event Viewer.

Figure 2

Specifying an Edit Trigger for a Windows Eventing 6.0 event

Online Event Information
Ideally, an event should contain enough context information to allow an administrator to diagnose, correct, and verify exceptional conditions. Often this diagnostic information is not known during development, and it is necessary to put a pointer to an external resource into the event. Classic Windows events often include this link in the message body, but there is no standard mechanism for handling it.

The new Windows event format includes a Uniform Resource Locator (URL) link to a Web site (Microsoft or developer defined) that allows administrators to jump quickly to additional, up-to-date information. URLs can be defined either in the Windows registry or in an instrumentation manifest. Registry-based URLs are common across all published events on the computer where the registry is located. Registry-based URLs also override any URLs defined in instrumentation manifests.

Publishing Windows Events
Publishing Windows events consists of six high-level steps:

1. Decide the type of event to raise and where to publish the events (that is, which channel).
2. Define the publisher, events, channels, and metadata in an instrumentation manifest.
3. Execute the Message Compiler utility (MC.exe) against the manifest to generate a header file and binary resource files.
4. Write code to raise the events.
5. Compile and link your event publisher source code.
6. Install the publisher files.

The next sections describe each of these steps in more detail.

Event Types and Event Channels
A channel is a named stream of events that transports events from an event publisher to an event log file, where an event consumer can get an event. Event channels are intended for specific audiences and have different types for each audience.

While most channels are tied to specific event publishers (they are created when publishers are installed and deleted when publishers are uninstalled), there are a few channels that are independent of any event publisher. System event log channels and event logs, such as System, Application, and Security, are installed with the operating system and cannot be deleted.

A channel can be defined on any independent Event Tracing for Windows (ETW) session. Such channels are not controlled by Windows Event Log; they are controlled by the ETW consumer that creates them.

Channels defined by event publishers are identified by a name, and the name should be based on the publisher name.

There are restrictions on channel naming. Channel names can contain spaces, but a channel name cannot be longer than 255 characters, and cannot contain '>', '<', '&', '"', '|', '\', ':', '`', '?', '*', or characters with codes less than 31. Additionally, the name must follow the general constraints on file and registry key names.

The following example shows a valid default channel name.

XML
Company-Product-Component/ChannelName

Event Types and Channel Groups
Event types and channel types can be considered the same thing, because the type of channel defines the type of event that travels through the channel to an event log. Each channel group contains two event (or channel) types: serviced and direct.

Serviced Channel
You can subscribe to a serviced channel in addition to querying the channel. Event consumer subscriptions to a serviced channel are based on XPath queries; thus, only events that match the query are delivered to the subscribers. Events in a serviced channel can be forwarded to another system. Forwarding is subscription-based, and selected events can be forwarded from any number of channels. Serviced channels have the following types:

• Administrative. These events are primarily targeted at administrators and support staff, and indicate a serious fault or issue that requires intervention. These types of events should be relatively infrequent, and indicate a problem with a well-defined solution that an administrator can act on. An example of an admin event is an event that occurs when an application fails to connect to a printer. These events are either well-documented or have a message associated with them that gives the reader direct instructions about what must be done to rectify the problem.
• Operational. Operational events are used for more mundane tasks such as reporting status, or to assist in analyzing and diagnosing a problem or occurrence. They can be used to trigger tools or tasks based on the problem or occurrence. An example of an operational event is an event that occurs when a printer is added to or removed from a system.

Direct Channel
You cannot subscribe to a direct channel, but you can query it. A direct channel is performance-oriented; events are not processed in any way by the eventing system, which allows the direct channel to support high volumes of events. Direct channels have the following types:

• Analytic. Analytic events are published in high volume. They describe program operation and indicate problems that cannot be handled by user intervention.
• Debug. Debug events are used solely by developers to diagnose a problem for debugging.

Channels Defined in the Winmeta.xml File
Some channels are already defined in the Winmeta.xml file that is included in the Windows SDK. These channels can be imported using the importChannel element. The following list describes these channels.

• TraceClassic. Type: Debug. Symbol: WINEVENT_CHANNEL_CLASSIC_TRACE (value: 0). Events for classic ETW event tracing.
• System. Type: Admin. Symbol: WINEVENT_CHANNEL_GLOBAL_SYSTEM (value: 8). This channel is used by applications running under system service accounts (installed system services), drivers, or a component or application that has events that relate to the health of the computer system.
• Application. Type: Admin. Symbol: WINEVENT_CHANNEL_GLOBAL_APPLICATION (value: 9). Events for all user-level applications. This channel is not secured and is open to any application. Applications that log extensive information should define an application-specific channel.
• Security. Type: Admin. Symbol: WINEVENT_CHANNEL_GLOBAL_SECURITY (value: 10). The Windows Audit Log. This event log is for the exclusive use of the Windows Local Security Authority. User events may appear as audits if supported by the underlying application.

Creating the Instrumentation Manifests
Instrumentation manifests contain the event publisher metadata, event definitions and templates, channel definitions, and localized event messages.

Instrumentation manifests are created in a particular structure and can be validated against the EventManifest schema.

The instrumentation manifest includes the following information:

• The identity of the publisher and the location of the publisher's resources.
• The definition and settings of any channels that the application creates. For more information about channels, see "Event Logs and Channels in Windows Event Log" on MSDN at http://msdn2.microsoft.com/en-us/library/aa385225.aspx.
• The definition, XML shape, message text, and destination channel of the events reported by the publisher.
• Localized event messages.

Elements in the Instrumentation Manifest
Provider metadata and event information are found in the manifest in the following elements:

• <instrumentationManifest>. This is the top-level element in an instrumentation manifest. This element contains the elements that configure event publishers, create and configure new channels, disclose what events a publisher is planning to publish (and into which channels the events are published), and provide localized strings to be used in event rendering (displaying the event message).
• <instrumentation>. This contains the elements that configure event publishers and disclose what events a publisher is planning to publish. This element contains a list of all the publishers in the manifest.
• <events> (parent element: <instrumentation>). This defines the list of event publishers that is defined in the manifest. This element also allows you to create a list of event messages.
• <provider>. This contains provider metadata for an event publisher. The metadata contains information such as the provider's name, channels that are used by the provider, opcodes, and other data in the provider. For more information about the metadata that can be defined, see "ProviderType Complex Type" on MSDN at http://msdn2.microsoft.com/en-us/library/aa384018.aspx.
• <channels>. This contains the list of channels into which this provider publishes events. Channels help define the event log into which the events will be "channeled." The channels that are referenced in the event definitions must be declared in the manifest. This allows you to obtain all the channels that are used in the manifest and by the event publisher, and it allows system tools to verify that there are no spelling mistakes in the channel names used in the event definitions. When a channel is referenced by an event definition, the event will be published into this channel.
• <opcodes>. This contains the definitions of opcodes to be used by the events published by this provider. For more information about opcodes, see "OpcodeType Complex Type" on MSDN at http://msdn2.microsoft.com/en-us/library/aa383956.aspx.
• <keywords>. This contains the definitions of keywords to be used by the events published by this provider. For more information about keywords, see "KeywordType Complex Type" on MSDN at http://msdn2.microsoft.com/en-us/library/aa382786.aspx.
• <templates>. This contains the data-rendering templates used by the events published by this provider. For more information about templates, see "Using Templates for Events" later in this section.
• <events> (parent element: <provider>). This contains the definitions of the events published by a provider. Each event has a 16-bit integer ID associated with it. Additionally, each event has a set of classifiers that are present even when they are not explicitly identified in the event definition (there are default values for all classifiers): task, opcode, keywords, version, and level. The combination of the value and version of the event uniquely identifies an event.
• <stringTable>. This specifies a list of event messages or references to strings in the localization section of the manifest. The event message is a readable description of the event. This description is localized. The message can also contain substitution parameters (similar to a template) that specify user-supplied values from the event to substitute into the message, so that the full description suitable for display to the user can be formed.

The following XML example shows how to use substitution parameters in event messages. A printer name value can be substituted into the message (as it is the first parameter) during an event.

XML
Print Spooler has failed to connect to %1 printer.
All further print jobs to this printer will fail.
Ping the printer to check if it is online.

The following XML example shows an instrumentation manifest with each of the preceding elements defined.

XML
<!-- <?xml version="1.0" encoding="UTF-16"?> -->
<instrumentationManifest
    xmlns="http://schemas.microsoft.com/win/2004/08/events">
  <instrumentation xmlns:xs="http://www.w3.org/2001/XMLSchema"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:win="http://manifests.microsoft.com/win/2004/08/windows/events">
    <events xmlns="http://schemas.microsoft.com/win/2004/08/events">
      <!-- Publisher info -->
      <provider name="Microsoft-Windows-EventLogSamplePublisher"
                guid="{1db28f2e-8f80-4027-8c5a-a11f7f10f62d}"
                symbol="MICROSOFT_SAMPLE_PUBLISHER"
                resourceFileName="C:\temp\Publisher.exe"
                messageFileName="C:\temp\Publisher.exe">
        <!-- Channels to which this publisher can publish -->
        <channels>
          <!-- A pre-existing channel can be imported, but this is not required. -->
          <importChannel chid="C1" name="Application"/>
          <!-- A new channel can be declared for this publisher. -->
          <channel chid="MyChannel"
                   name="Microsoft-Windows-SamplePublisher/Operational"
                   type="Operational"
                   symbol="SAMPLE_PUBLISHER"
                   isolation="Application" enabled="true" />
        </channels>
        <!-- Event templates -->
        <templates>
          <template tid="MyEventTemplate">
            <data name="Prop_UnicodeString" inType="win:UnicodeString" />
            <data name="Prop_AnsiString" inType="win:AnsiString" outType="xs:string" />
            <data name="Prop_Int8" inType="win:Int8" />
            <data name="Prop_UInt8" inType="win:UInt8" />
            <data name="Prop_Int16" inType="win:Int16" />
            <data name="Prop_UInt16" inType="win:UInt16" />
            <data name="Prop_Int32" inType="win:Int32" />
            <data name="Prop_UInt32" inType="win:UInt32" />
            <data name="Prop_Int64" inType="win:Int64" />
            <data name="Prop_UInt64" inType="win:UInt64" />
            <data name="Prop_Float" inType="win:Float" />
            <data name="Prop_Double" inType="win:Double" />
            <data name="Prop_Boolean" inType="win:Boolean" />
            <data name="Prop_GUID" inType="win:GUID" />
            <data name="Prop_Pointer" inType="win:Pointer" />
            <data name="Prop_FILETIME" inType="win:FILETIME" />
            <data name="Prop_SYSTEMTIME" inType="win:SYSTEMTIME" />
            <data name="Prop_SID_Length" inType="win:UInt32" />
            <data name="Prop_SID" inType="win:SID" length="Prop_SID_Length"/>
            <data name="Prop_Binary" inType="win:Binary" length="11" />
            <UserData>
              <MyEvent2 xmlns="myNs">
                <Prop_UnicodeString> %1 </Prop_UnicodeString>
                <Prop_AnsiString> %2 </Prop_AnsiString>
                <Prop_Int8> %3 </Prop_Int8>
                <Prop_UInt8> %4 </Prop_UInt8>
                <Prop_Int16> %5 </Prop_Int16>
                <Prop_UInt16> %6 </Prop_UInt16>
                <Prop_Int32> %7 </Prop_Int32>
                <Prop_UInt32> %8 </Prop_UInt32>
                <Prop_Int64> %9 </Prop_Int64>
                <Prop_UInt64> %10 </Prop_UInt64>
                <Prop_Float> %11 </Prop_Float>
                <Prop_Double> %12 </Prop_Double>
                <Prop_Boolean> %13 </Prop_Boolean>
                <Prop_GUID> %14 </Prop_GUID>
                <Prop_Pointer> %15 </Prop_Pointer>
                <Prop_FILETIME> %16 </Prop_FILETIME>
                <Prop_SYSTEMTIME> %17 </Prop_SYSTEMTIME>
                <Prop_SID_Length> %18 </Prop_SID_Length>
                <Prop_SID> %19 </Prop_SID>
                <Prop_Binary> %20 </Prop_Binary>
              </MyEvent2>
            </UserData>
          </template>
        </templates>
        <!-- All the events that can be published by this publisher -->
        <events>
          <event value="1"
                 level="win:Informational"
                 template="MyEventTemplate"
                 opcode="win:Info"
                 channel="MyChannel"
                 symbol="PROCESS_INFO_EVENT"
                 message="$(string.Publisher.EventMessage)"/>
        </events>
      </provider>
    </events>
  </instrumentation>
  <localization>
    <resources culture="en-US">
      <stringTable>
        <!-- This is how event data can be used as part of the message string. -->
        <string id="Publisher.EventMessage"
                value="Prop_UnicodeString=%1;%n
                       Prop_AnsiString=%2;%n
                       Prop_Int8=%3;%n
                       Prop_UInt8=%4;%n
                       Prop_Int16=%5;%n
                       Prop_UInt16=%6;%n
                       Prop_Int32=%7;%n
                       Prop_UInt32=%8;%n
                       Prop_Int64=%9;%n
                       Prop_UInt64=%10;%n
                       Prop_Float=%11;%n
                       Prop_Double=%12;%n
                       Prop_Boolean=%13;%n
                       Prop_GUID=%14;%n
                       Prop_Pointer=%15;%n
                       Prop_FILETIME=%16;%n
                       Prop_SYSTEMTIME=%17;%n
                       Prop_SID_Length=%18;%n
                       Prop_SID=%19;%n
                       Prop_Binary=%20"/>
      </stringTable>
    </resources>
  </localization>
</instrumentationManifest>

You can also create event descriptions in multiple languages by adding the localized strings to

the localization element of the instrumentation manifest.
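For example, the following sketch shows a localization element with string tables for two cultures (the de-DE entry and the message text are purely illustrative):

XML

<localization>
<resources culture="en-US">
<stringTable>
<string id="Publisher.EventMessage" value="Order received: %1"/>
</stringTable>
</resources>
<resources culture="de-DE">
<stringTable>
<string id="Publisher.EventMessage" value="Auftrag empfangen: %1"/>
</stringTable>
</resources>
</localization>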

Using Templates for Events

Templates specify the names and the types of data that the event publisher supplies with an
event. Additionally, a template may specify the XML structure of the event (the static content
of the event, together with insertions for the event's dynamic content).

If an XML template is attached to the event, the event can be represented as an XML fragment.
In the XML representation, each event attribute value is labeled with its semantic meaning,
which allows queries and analysis to be performed on the events later.

Using the Message Compiler to Produce Development Files

MC.exe is used to produce the development files that are required for compiling the source files
that raise events. It creates the following files:

• .h. This is the header file that contains the definitions for the event provider, event

attributes, channels, and events. These values are referenced when creating a handle

for the provider and when publishing events.


• .rc. This is a resource compiler script that can be used to include the generated

resources. This script is included into the component's main resource file.

• .bin. This suffix is used for two different types of bin files (typically, multiple bin files are

created). The first type of bin file is a culture-independent resource that contains the

provider and event metadata. This is the template resource, which is signified by the

TEMP suffix of the base name of the file. The second type of bin file is a culture-

dependent (localizable) resource that contains a message table. This is the message

resource and its name by default starts with the MSG prefix that is followed by a

number. The number starts with one and is incremented for each additional language

defined in the manifest.

An event publisher application uses these files along with the Windows Event Log API to publish

events to an event channel.

If MC.exe is used on the instrumentation manifest shown in the previous section, the following

Publisher.h file is generated.

C++

// publisher.h

#pragma once

__declspec(selectany) GUID MICROSOFT_SAMPLE_PUBLISHER = {0x1db28f2e, 0x8f80,

0x4027, {0x8c, 0x5a,0xa1,0x1f,0x7f,0x10,0xf6,0x2d}};

#define SAMPLE_PUBLISHER 0x10

__declspec(selectany) EVENT_DESCRIPTOR PROCESS_INFO_EVENT = {0x1, 0x0, 0x10,

0x4, 0x0, 0x0, 0x8000000000000000};

#define MSG_Publisher_EventMessage 0x00000000L

// end of publisher.h

The Publisher.h header file contains an EVENT_DESCRIPTOR variable definition that was defined

in the instrumentation manifest. This variable will be used in the EventWrite function call to

publish the event.

Writing Code to Raise Events

The following example shows code that raises events from an event publisher.

C++

// publisher.cpp

#include <windows.h>

#include <comdef.h>

#include <sddl.h>

#include <iostream>

#include <tchar.h>

#include <string>

#include <vector>


#include <evntprov.h> // ETW Publishing header

# pragma comment(lib, "advapi32.lib")

#include <winevt.h> // EventLog Header

# pragma comment(lib, "wevtapi.lib")

#include "publisher.h" // Header generated by mc.exe

// from manifest (publisher.man)

using namespace std;

void __cdecl wmain()

{

REGHANDLE hPublisher = NULL; //Handle to Publisher

wprintf(L"Publishing Event to Microsoft-Windows-

EventLogSamplePublisher/Operational Channel... \n");

// Register a Publisher

ULONG ulResult = EventRegister(

&MICROSOFT_SAMPLE_PUBLISHER, // provider guid

NULL, // callback; unused for now

NULL, // context

&hPublisher); // handle required to unregister

if ( ulResult != ERROR_SUCCESS)

{

wprintf(L"Publisher Registration Failed!. Error = 0x%x", ulResult);

return;

}

// EventData

std::vector<EVENT_DATA_DESCRIPTOR> EventDataDesc;

EVENT_DATA_DESCRIPTOR EvtData;

// inType="win:UnicodeString"

PWSTR pws = L"Sample Unicode string";

EventDataDescCreate(&EvtData, pws, ((ULONG)wcslen(pws)+1)*sizeof(WCHAR));

EventDataDesc.push_back( EvtData );

// inType="win:AnsiString"

CHAR * ps = "Sample ANSI string";

EventDataDescCreate(&EvtData, ps, ((ULONG)strlen(ps)+1)*sizeof(CHAR));

EventDataDesc.push_back( EvtData );

// inType="win:Int8"

INT8 i8 = 0x7F;

EventDataDescCreate(&EvtData, &i8, sizeof(i8));

EventDataDesc.push_back( EvtData );

// inType="win:UInt8"

UINT8 ui8 = 0xFF;


EventDataDescCreate(&EvtData, &ui8, sizeof(ui8));

EventDataDesc.push_back( EvtData );

// inType="win:Int16"

INT16 i16 = 0x7FFF;

EventDataDescCreate(&EvtData, &i16, sizeof(i16));

EventDataDesc.push_back( EvtData );

// inType="win:UInt16"

UINT16 ui16 = 0xFFFF;

EventDataDescCreate(&EvtData, &ui16, sizeof(ui16));

EventDataDesc.push_back( EvtData );

// inType="win:Int32"

INT32 i32 = 0x7FFFFFFF;

EventDataDescCreate(&EvtData, &i32, sizeof(i32));

EventDataDesc.push_back( EvtData );

// inType="win:UInt32"

UINT32 ui32 = 0xFFFFFFFF;

EventDataDescCreate(&EvtData, &ui32, sizeof(ui32));

EventDataDesc.push_back( EvtData );

// inType="win:Int64"

INT64 i64 = 0x7FFFFFFFFFFFFFFFi64;

EventDataDescCreate(&EvtData, &i64, sizeof(i64));

EventDataDesc.push_back( EvtData );

// inType="win:UInt64"

UINT64 ui64 = 0xFFFFFFFFFFFFFFFFui64;

EventDataDescCreate(&EvtData, &ui64, sizeof(ui64));

EventDataDesc.push_back( EvtData );

// inType="win:Float"

FLOAT f = -3.1415926e+23f;

EventDataDescCreate(&EvtData, &f, sizeof(f));

EventDataDesc.push_back( EvtData );

// inType="win:Double"

DOUBLE d = -2.7182818284590452353602874713527e-101;

EventDataDescCreate(&EvtData, &d, sizeof(d));

EventDataDesc.push_back( EvtData );

// inType="win:Boolean"

BOOL b = TRUE;

EventDataDescCreate(&EvtData, &b, sizeof(b));

EventDataDesc.push_back( EvtData );

// inType="win:GUID"


GUID guid = {0}; // A zero GUID is sufficient for this sample.

EventDataDescCreate(&EvtData, &guid, sizeof(guid));

EventDataDesc.push_back( EvtData );

// inType="win:Pointer"

PVOID p = NULL;

EventDataDescCreate(&EvtData, &p, sizeof(p));

EventDataDesc.push_back( EvtData );

// inType="win:FILETIME"

SYSTEMTIME st;

FILETIME ft;

GetSystemTime(&st);

SystemTimeToFileTime(&st, &ft);

EventDataDescCreate(&EvtData, &ft, sizeof(ft));

EventDataDesc.push_back( EvtData );

// inType="win:SYSTEMTIME"

GetSystemTime(&st);

EventDataDescCreate(&EvtData, &st, sizeof(st));

EventDataDesc.push_back( EvtData );

// inType="win:SID"

PSID pSid = NULL;

ConvertStringSidToSidW(L"S-1-5-19", &pSid); // LocalService

UINT32 sidLength = GetLengthSid(pSid);

EventDataDescCreate(&EvtData, &sidLength, sizeof(sidLength));

EventDataDesc.push_back( EvtData );

EventDataDescCreate(&EvtData, pSid, GetLengthSid(pSid));

EventDataDesc.push_back( EvtData );

// inType="win:Binary"

// Note: if you change the size of this array you'll have to change the

// "length" attribute in the manifest too.

BYTE ab[] = {0,1,2,3,4,5,4,3,2,1,0};

EventDataDescCreate(&EvtData, ab, sizeof(ab));

EventDataDesc.push_back( EvtData );

if ( EventEnabled(hPublisher, &PROCESS_INFO_EVENT) )

{

ulResult = EventWrite(hPublisher,

&PROCESS_INFO_EVENT,

(ULONG)EventDataDesc.size(),

&EventDataDesc[0]

);

if (ulResult != ERROR_SUCCESS)


{

//Get Extended Error Information

wprintf(L"EvtWrite Failed. Not able to fire event. Error = 0x%x",

ulResult);

LocalFree(pSid);

// Close the Publisher Handle

EventUnregister(hPublisher);

return;

}

}

else {

wprintf(L"Disabled");

}

wprintf(L"Success\n");

LocalFree(pSid);

// Close the Publisher Handle

EventUnregister(hPublisher);

}

// end of publisher.cpp

Compiling and Linking Event Publisher Source Code

The resource script that is generated by the Message Compiler tool is included in the resource

script of the program built, and the result is compiled by the Resource Compiler (RC.exe) tool to

produce .res files. These files are then linked into a project binary during its link phase (using

CL.exe or Link.exe).

The commands for this step are as follows:

• rc.exe publisher.rc

• cl.exe publisher.cpp /link publisher.res

The Publisher.cpp file is shown in the preceding example (it includes the generated Publisher.h).

The Publisher.res file is the resource file generated from the Publisher.rc file.

Installing the Publisher Files

Publisher files, including the manifest, must be installed on the target system. You install the

manifest using the Wevtutil.exe utility:

wevtutil install-manifest publisher.man

This command is usually limited to members of the Administrators group and must be run with

elevated privileges. As a result, this step will typically occur when the application is installed.
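When the application is uninstalled, you can remove the manifest with the corresponding uninstall command:

wevtutil uninstall-manifest publisher.man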


Consuming Event Log Events

Eventing 6.0 includes a number of mechanisms for consuming event log events, such as

querying, reading, and subscribing. This section outlines those mechanisms.

Querying for Events

A user can query either over active event logs (logs that are still maintained within the system) or over

an external event log that was previously exported from the system. Events can still be written

to the log while the user is querying it. A user can also query over event logs on a remote

computer or the local computer. The examples in this section show how to query over active
logs and over external log files.

An event query can be created by using an XPath query or an XML-formatted query.
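For example, the following sketch shows an XML-formatted (structured) query that selects the Level 2 events from the Application channel; the equivalent XPath form, *[System/Level=2], is used in the subscription examples later in this chapter.

XML

<QueryList>
  <Query Id="0" Path="Application">
    <Select Path="Application">*[System[Level=2]]</Select>
  </Query>
</QueryList>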

Querying Over Active Event Logs

A user queries over active event logs by specifying an event query and then obtaining a query

result set, which is used to enumerate the results. Note that the registering of an active log

query does not cause the system to return a snapshot of events at the time of the query.

Instead, the system generates the result set as the user traverses through it so that events that

are generated during the query are not lost.

The following C++ example shows how to query over active event logs using the EvtQuery

function to obtain a handle to the query result set for use later to enumerate through the result

set.

C++

EVT_HANDLE queryResult = EvtQuery (

NULL,

L"Application",

L"*",

EvtQueryChannelPath | EvtQueryReverseDirection );

if ( queryResult == NULL )

return GetLastError();

Querying Over External Files

To query over an exported event log file, .evt file, or .etl file, use the same function (EvtQuery

function) that is used when querying over active logs, but pass in a path to the external file and

the appropriate flags to the function. The query is executed on the external file.

Relative paths and environment variables cannot be used when specifying an exported event log

file. A Universal Naming Convention (UNC) path can be used to locate the file. Any relative path

and environment variable expansion needs to be done prior to making API calls, as shown in the

following C++ code example.


C++

EVT_HANDLE queryResult = EvtQuery (

NULL,

L"c:\\temp\\MyExportedLog.log",

L"*",

EvtQueryFilePath | EvtQueryForwardDirection );

if ( queryResult == NULL )

return GetLastError();

Reading Events from a Query Result Set

The process for obtaining events from a query result set is the same whether the original source of the

events was an active log or an exported log. You obtain an enumeration object over the result

set and use its methods to retrieve the event instances. The system supports simple forward-

only navigation over the result set in direct logs and .evt files. Other logs can be read forward

and backward. Event instances can be fetched from the log files in batches to improve the

performance.

The following C++ example shows how to use the EvtNext function to obtain event instances

from the query result set. For efficiency reasons, it is recommended that the user specify a

batch size much greater than 1 for enumerating large result sets.

C++

const int BatchSize = 10;

DWORD numRead = 0;

EVT_HANDLE batch[BatchSize];

if (!EvtNext(queryResult, BatchSize, batch, -1, 0, &numRead))

return GetLastError();

for (DWORD i = 0; i < numRead; i++)

{

// Render event instance here

EvtClose(batch[i]);

}

Subscribing to Events

Subscribing to events involves receiving notifications when selected events are raised. To

select events for a subscription, an event query is applied to events that are logged in one or

more channels. For information about creating a query, see "Event Selection" on MSDN at

http://msdn2.microsoft.com/en-us/library/aa385231.aspx. Because a data stream is logged,


subscriptions can get events that occur during periods when the subscriber is not connected. A

subscriber does not miss events that occur during down times (computer startup or shutdown).

Not only can log subscribers get the live events that pass the subscription filter, they can also get

the events that occurred before they were connected. At the time the subscription starts, any

events in the log that match the subscription start criteria are queued first and then live events

are added to the queue as they occur.

The subscription start criteria include the following:

• Subscribing to future events (events not currently in the event log)

• Subscribing to events since the oldest event in the log

• Subscribing to events since a bookmark (marking some other event)

If a subscriber wants to ensure that it never misses a record and does not get repeat events,

then the subscriber indicates the last record that it received, which is marked by a bookmark.

The starting criteria for a subscription are specified in the Flags parameter of the EvtSubscribe

function by passing in a value from the EVT_SUBSCRIBE_FLAGS enumeration.
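The following fragment is a sketch (it is not part of the generated example code) showing how a bookmark might be used to resume a subscription. It assumes that bookmarkXml holds the XML persisted from a previous session (obtained by rendering a bookmark with EvtRender and the EvtRenderBookmark flag) and that signalEvent is an event handle created for a pull subscription, as in the pull subscription example later in this chapter.

C++

// Sketch: resume a subscription after the last event that was processed.
// Assumes bookmarkXml (persisted earlier) and signalEvent already exist.
EVT_HANDLE hBookmark = EvtCreateBookmark(bookmarkXml);
if (hBookmark != NULL)
{
EVT_HANDLE hSub = EvtSubscribe(
NULL, // Session.
signalEvent, // Signal event (pull subscription).
L"Application", // Channel.
L"*", // XPath query.
hBookmark, // Bookmark marking the last event read.
NULL, // CallbackContext.
NULL, // Callback.
EvtSubscribeStartAfterBookmark); // Start after the bookmarked event.
// Read events with EvtNext as usual, calling
// EvtUpdateBookmark(hBookmark, batch[i]) for each event processed,
// and persist the rendered bookmark again before exiting.
if (hSub != NULL)
{
EvtClose(hSub);
}
EvtClose(hBookmark);
}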

Push Subscriptions

In the push subscription model, events are delivered asynchronously to the callback function

that is provided to the EvtSubscribe function.

The following C++ example shows how to set up a push subscription by passing a callback

function into the Callback parameter of the EvtSubscribe function. The example subscribes to all

the Level 2 events in the Application channel.

C++

#include <windows.h>

#include <iostream>
#include <conio.h> // For _getwch().

#include <winevt.h> // EventLog Header

# pragma comment(lib, "wevtapi.lib")

using namespace std;

// Callback to receive RealTime Events.

DWORD WINAPI SubscriptionCallBack(

EVT_SUBSCRIBE_NOTIFY_ACTION Action,

PVOID Context,

EVT_HANDLE Event );

void __cdecl wmain()

{

EVT_HANDLE hSub = NULL; // Handle to the event subscriber.

wchar_t *szChannel = L"Application"; // Channel.


wchar_t *szQuery = L"*[System/Level=2]"; // XPATH Query to specify which

// events to subscribe to.

wprintf(L"Subscribing to all level 2 events from the Application channel...

\n");

wprintf(L"NOTE: Hit 'Q' or 'q' to stop the event subscription\n");

// Register the subscription.

hSub = EvtSubscribe( NULL, // Session

NULL, // Used for pull subscriptions.

szChannel, // Channel.

szQuery, // XPath query.

NULL, // Bookmark.

NULL, // CallbackContext.

(EVT_SUBSCRIBE_CALLBACK) SubscriptionCallBack, // Callback.

EvtSubscribeToFutureEvents // Flags.

);

if( !hSub )

{

wprintf(L"Couldn't Subscribe to Events!. Error = 0x%x", GetLastError());

return;

}

else

{

// Keep listening for events until 'q' or 'Q' is hit.

WCHAR ch = L'0';

do

{

ch = _getwch();

ch = towupper( ch );

Sleep(100);

} while( ch != 'Q' );

}

// Close the subscriber handle.

EvtClose(hSub);

wprintf(L"Event Subscription Closed !\n");

}

/**********************************************************************

Function: SubscriptionCallBack

Description: This function is called by EventLog to deliver RealTime Events.

Once the event is received it is rendered to the console.

Return: DWORD is returned. 0 if succeeded, otherwise a Win32 errorcode.

***********************************************************************/

DWORD WINAPI SubscriptionCallBack(


EVT_SUBSCRIBE_NOTIFY_ACTION Action,

PVOID Context,

EVT_HANDLE Event )

{

WCHAR *pBuff = NULL;

DWORD dwBuffSize = 0;

DWORD dwBuffUsed = 0;

DWORD dwRes = 0;

DWORD dwPropertyCount = 0;

// Get the XML EventSize to allocate the buffer size.

BOOL bRet = EvtRender(

NULL, // Session.

Event, // HANDLE.

EvtRenderEventXml, // Flags.

dwBuffSize, // BufferSize.

pBuff, // Buffer.

&dwBuffUsed, // Buffersize that is used or required.

&dwPropertyCount);

if (!bRet)

{

dwRes = GetLastError();

if( dwRes == ERROR_INSUFFICIENT_BUFFER )

{

// Allocate the buffer needed for the XML event.

dwBuffSize = dwBuffUsed;

pBuff = new WCHAR[dwBuffSize/sizeof(WCHAR)];

// Get the Event XML

bRet = EvtRender(

NULL, // Session.

Event, // HANDLE.

EvtRenderEventXml, // Flags.

dwBuffSize, // BufferSize.

pBuff, // Buffer.

&dwBuffUsed, // Buffer size that is used or required.

&dwPropertyCount);

if( !bRet )
{
dwRes = GetLastError();
wprintf(L"Couldn't render events. Error = 0x%x", dwRes);
delete[] pBuff;
return dwRes;
}

}
else
{
// Some other error occurred; don't try to display the event.
return dwRes;
}
}

// Display the Event XML on console


wprintf(L"The following Event is received : \n %s \n\n", pBuff);

// Cleanup

delete[] pBuff;

return dwRes;

}

Pull Subscriptions

The pull subscription model is used to control the delivery of events by allowing the caller to

decide when to get an event from the queue.

To create a pull model subscription, the caller must provide an event to the SignalEvent

argument in the EvtSubscribe function. The event that is provided in the SignalEvent argument

is set when the first event arrives in the queue. The event is also set when an event arrives after

the client has attempted to read an empty queue.

A client can wait on an event until it is set. After the event is set, the client can read the

subscription results using the EvtNext function until the EvtNext function fails because of an

empty queue (in which case the client can start waiting again).

In the pull subscription model, the user obtains an enumeration object over the result set and

uses its methods to retrieve the event instances.

The following C++ example shows how to subscribe to events from event log channels using a

pull subscription. It registers a subscriber by providing an XPATH query, and then if any events

are received, the event XML is displayed on the console using EvtRender.

C++

#include <windows.h>

#include <wchar.h>

#include <winevt.h> // EventLog Header

# pragma comment(lib, "wevtapi.lib")

void __cdecl wmain()

{

// Channel.

PWSTR szChannel = L"Application";

// XPATH Query to specify which events to subscribe to.

PWSTR szQuery = L"*";

wprintf(L"Subscribing to all events from the Application channel... \n");

const int BatchSize = 10;

DWORD numRead = 0;

EVT_HANDLE batch[BatchSize];

HANDLE signalEvent = CreateEventW(NULL, false, false, NULL);


// Register the subscription.

EVT_HANDLE subscription = EvtSubscribe(

NULL, // Session

signalEvent, // Used for pull subscriptions.

szChannel, // Channel.

szQuery, // XPath query.

NULL, // Bookmark.

NULL, // CallbackContext.

NULL, // Callback.

EvtSubscribeToFutureEvents // Flags.

);

if( subscription == NULL )

{

wprintf(L"Couldn't subscribe to events. Error = 0x%x", GetLastError());

return;

}

else

{

DWORD result = ERROR_SUCCESS;

while( result == ERROR_SUCCESS )

{

if( EvtNext( subscription, BatchSize, batch, -1, 0, &numRead) )

{

// Do something with numRead event handles in the batch array.

// For example, render the events.

for( DWORD i=0; i < numRead; i++)

{

// Render the events in the array

WCHAR *pBuff = NULL;

DWORD dwBuffSize = 0;

DWORD dwBuffUsed = 0;

DWORD dwRes = 0;

DWORD dwPropertyCount = 0;

// Get the XML EventSize to allocate the buffer size.

BOOL bRet = EvtRender(

NULL, // Session.

batch[i], // EVT_HANDLE.

EvtRenderEventXml, // Flags.

dwBuffSize, // BufferSize.

pBuff, // Buffer.

&dwBuffUsed, // Buffer size used.

&dwPropertyCount);

if (!bRet)

{


dwRes = GetLastError();

if( dwRes == ERROR_INSUFFICIENT_BUFFER )

{

// Allocate the buffer size needed for the XML event.

dwBuffSize = dwBuffUsed;

pBuff = new WCHAR[dwBuffSize/sizeof(WCHAR)];

// Get the Event XML

bRet = EvtRender(

NULL, // Session.

batch[i], // EVT_HANDLE.

EvtRenderEventXml, // Flags.

dwBuffSize, // BufferSize.

pBuff, // Buffer.

&dwBuffUsed, // Buffer size used.

&dwPropertyCount);

if( !bRet )

{

wprintf(L"Couldn't render events. Error = 0x%x",

GetLastError());

delete[] pBuff;

// Close the remaining event handles for this batch.

for(DWORD j=i; j < numRead; j++)

{

EvtClose(batch[j]);

}

break;

}

}
else
{
// Some other error occurred; skip rendering this event.
wprintf(L"Couldn't render events. Error = 0x%x", dwRes);
EvtClose(batch[i]);
continue;
}
}

// Display the event XML on console

wprintf(L"The following event is received : \n %s \n\n", pBuff);

// Cleanup

delete[] pBuff;

EvtClose(batch[i]);

}

}

else

{

DWORD waitResult = 0;

result = GetLastError();

if( result == ERROR_NO_MORE_ITEMS )

{

// Wait for the subscription results

waitResult = WaitForSingleObject( signalEvent, INFINITE );

if( waitResult == WAIT_OBJECT_0 )


{

result = ERROR_SUCCESS;

}

else

{

result = GetLastError();

break;

}

}

}

}

}

// Close the subscriber handle.

EvtClose(subscription);

CloseHandle(signalEvent);

wprintf(L"Event Subscription Closed !\n");

}

Summary

This chapter provides information about the Windows Eventing 6.0 mechanism. It describes how

to view and handle events in Windows Vista and Windows Server 2008. It also contains technical

information about the way that Windows Eventing 6.0 works and how you can interact with the

mechanism in your own programs. You can define and implement Windows Eventing 6.0 events

in an application using the Team System Management Model Designer.


Chapter 12

Performance Counters Instrumentation

Windows collects performance data about various system resources using performance

counters. Windows contains a pre-defined set of performance counters with which you can

interact; you can also create additional performance counters relevant to your application. This

chapter describes how to install performance counters, how to write to them, and how to read

existing performance counters.

Example code automatically generated by the Team System Management Model Designer

Power Tool (TSMMD) for the Northern Electronics scenario is used in this chapter to illustrate

its concepts.

Performance Counter Concepts

To work effectively with performance counters, it is important to understand some key

concepts, including the following:

• Categories

• Instances

• Types

The next sections describe each of these in more detail.

Categories

Performance counters monitor the behavior of performance objects on a computer. Performance
objects include physical components (such as processors, disks, and memory), system objects
(such as processes and threads), and application objects (such as databases and Web services).

Counters that are related to the same performance object are grouped into categories that

indicate their common focus. When you create an instance of the PerformanceCounter object,

you first indicate the category (for example, the Memory category) for the object and then

choose a counter to interact with from within that category (for example Cached Bytes).


If you create new performance counter objects for your application, you cannot associate them

with existing categories. Instead, you must create a new category for the performance counter

object.

Instances

In some cases, categories are further subdivided into instances. If multiple instances are defined

for a category, each performance counter in the category also has those instances defined. For

example, the Process category contains instances named "Idle" and "System." Each counter

within the Process category specifies data in these two ways, showing information about either

idle processes or system processes. Figure 1 illustrates the structure of the category and

counters.

Figure 1 Performance counter categories and instances

Although instances are applied to the category, you create an instance by specifying an

instanceName on the PerformanceCounter constructor. If the instanceName already exists,

the new object will reference the existing category instance.
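For example, the following fragment is a sketch that connects to, or creates, a named instance of the ConfirmPickup counter used later in this chapter; the instance name is illustrative, and the category must have been created to support multiple instances.

C#

// Connects to (or creates) the "PickupService" instance of the custom
// ConfirmPickup counter in the WSPickupService category.
PerformanceCounter counter = new PerformanceCounter(
    "WSPickupService",  // Category name.
    "ConfirmPickup",    // Counter name.
    "PickupService",    // Instance name.
    false);             // ReadOnly = false, so the instance can be written to.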

Types

There are many different types of performance counters. Each type is distinguished by how the

performance counter performs calculations. For example, there are counters that are used to

calculate average values over a period of time, and counters that measure the difference

between a current value and a previous value.

The following are the most commonly used counter types:

• NumberOfItems32. Maintains a simple count of items, operations, and so on. You might use
this counter type to track the number of orders received, stored as a 32-bit number.

• NumberOfItems64. Maintains a simple count with a higher capacity. You might use this
counter type to track orders for a site that experiences very high volume, stored as a 64-bit
number.

• RateOfCountsPerSecond32. Tracks the amount per second of an item or operation. You might
use this counter type to track the orders received per second on a retail site, stored as a
32-bit number.

• RateOfCountsPerSecond64. Tracks the amount per second with a higher capacity. You might
use this counter type to track the orders per second for a site that experiences very high
volume, stored as a 64-bit number.

• AverageTimer32. Calculates the average time to perform a process or to process an item. You
might use this counter type to calculate the average time an order takes to be processed,
stored as a 32-bit number.

Some performance counter types rely on an accompanying base counter that is used in the

calculations. The following list shows each base counter type with its corresponding
performance counter types:

• AverageBase. Used by AverageTimer32 and AverageCount64.

• CounterMultiBase. Used by CounterMultiTimer, CounterMultiTimerInverse,
CounterMultiTimer100Ns, and CounterMultiTimer100NsInverse.

• RawBase. Used by RawFraction.

• SampleBase. Used by SampleFraction.

For a detailed description of all the Performance Counter types available, see "Appendix B.

Performance Counter Types."

Installing Performance Counters

Your application cannot increment built-in performance counters, so if you want to have your

application write to performance counters, you will need to create these counters yourself.

Administrative rights are required to install performance counters, so you should install

performance counters before run time. Typically, performance counters are installed when the

application responsible for using them is installed.

You must create counters in a user-defined category instead of in the categories defined by

Windows. That is, you cannot create a new counter within the Processor category or any other

system-defined categories. Additionally, you must create a counter in a new category; adding a

counter to an existing user-defined category will raise an exception.


To install performance counters, you should create a project installer class that inherits from

Installer, and set the RunInstallerAttribute for the class to true. Within your project, create a

PerformanceCounterInstaller instance for each performance counter category, and then add

the instance to your project installer class.

Now you can specify each of the individual custom counters. You should use the

CounterCreationData class to set attributes for each custom counter. This class has the

following properties:

• CounterName. This property is used to get or set the name of the custom counter.

• CounterHelp. This property is used to get or set the description of the custom counter.

• CounterType. This property is used to get or set the type of the custom counter.

If the performance counter relies on a base counter, the performance counter creation data

must be immediately followed by the base counter creation data in code. If it is not, the two

counters will not be linked properly.

If you do not specify a counter type when creating the counter, it defaults to

NumberOfItems32.

Now that the individual custom counters are created, they can be added to the

PerformanceCounterInstaller collection. The following code shows a performance counter

named ConfirmPickup in the category WSPickupService being added to the

PerformanceCounterInstaller collection.

C#

using System;

using System.Management.Instrumentation;

using System.ComponentModel;

using System.Diagnostics;

using System.IO;

using System.Text;

using System.Configuration.Install;

namespace PerformanceCounters.InstrumentationTechnology

{

[RunInstaller(true)]

public class PerformanceCountersClass : Installer

{

// constructor

public PerformanceCountersClass()

{

// Installer for performanceCounters with category name: WSPickupService

PerformanceCounterInstaller WSPickupServicePerfCountInstaller = new

PerformanceCounterInstaller();

WSPickupServicePerfCountInstaller.CategoryName = "WSPickupService";


// CounterCreation for event ConfirmPickup

CounterCreationData confirmPickupCounterCreation = new

CounterCreationData();

confirmPickupCounterCreation.CounterName = "ConfirmPickup";

confirmPickupCounterCreation.CounterHelp = "Counter Help"; //n/a now

confirmPickupCounterCreation.CounterType =

PerformanceCounterType.NumberOfItemsHEX32;

WSPickupServicePerfCountInstaller.Counters.Add(confirmPickupCounterCreation);

Installers.Add(WSPickupServicePerfCountInstaller);

}

}

}

Typically, you should not call the methods of the PerformanceCounterInstaller class from within

your code; they are generally called only by the InstallUtil.exe installation utility. The utility

automatically calls the Install method during the installation process. It backs out failures, if

necessary, by calling the Rollback method for the object that generated the exception.

Writing Values to Performance Counters

You can write a value to a performance counter in a number of ways:

• You can increment a counter by one using the Increment method on the

PerformanceCounter class.

• You can increment by incrementing the counter's current raw value by a positive or

negative number using the IncrementBy method on the PerformanceCounter class.

• You can set a particular value for a performance counter by using the RawValue

property on the PerformanceCounter class.

Incrementing by a negative number decrements the counter by the absolute value of the

number. For example, incrementing with a value of 3 will increase the counter's raw value by

three. Incrementing with a value of –3 will decrease the counter's raw value by three.

You can only increment values on custom counters; by default, your interactions with system

counters via a PerformanceCounter component instance are restricted to read-only mode.

Before you can increment a custom counter, you must set the ReadOnly property of the
component instance that you use to access it to false.

There are security restrictions that affect your ability to use performance counters. For more

information, see "Introduction to Monitoring Performance Thresholds" on MSDN at

http://msdn.microsoft.com/library/en-us/vbcon/html/vbconintroductiontomonitoringperformancethresholds.asp.


To write values to performance counters

1. Create a PerformanceCounter instance and configure it to interact with the desired

category and counter.

2. Write the value using one of the following methods:

• To increase the raw value by one, call Increment (no parameter).

• To decrease the raw value by one, call Decrement (no parameter).

• To increase the raw value by more than one, call IncrementBy with a positive integer.

• To decrease the raw value by more than one, call IncrementBy with a negative integer.

• To reset the raw value to any integer, instead of incrementing it, set RawValue to a
positive or negative integer.

The following code shows how to write values to a counter in various ways. The methods shown
are taken from the TSMMD-generated instrumentation for the Northern Electronics scenario;
the first increments the raw value of the ConfirmPickup counter by an arbitrary amount, and
the second increments it by one.

C#

protected override void DoIncrementByPickupServiceConfirmPickup(int increment)

{

using (PerformanceCounter counter

= new PerformanceCounter("WSPickupService", "ConfirmPickup", false))

{

counter.IncrementBy(increment);

}
}

protected override void DoIncrementPickupServiceConfirmPickup()

{

using (PerformanceCounter counter

= new PerformanceCounter("WSPickupService", "ConfirmPickup", false))

{

counter.Increment();

}

}

Main and base counters must be updated independently.


Connecting to Existing Performance Counters

When you connect to an existing performance counter, you do so by specifying the computer on

which the counter exists, the category for the counter, and the name of the counter itself.

Additionally, you have the option of specifying the instance of the counter you want to use, if

the counter contains more than one instance. You can then read any and all data from the

counter. You can also enumerate the existing categories, counters, and instances on the

computer by using code, or you can use Server Explorer to see a list of existing counters on the

computer.

When you create custom performance counters, you may have to restart the Performance
Monitor (Perfmon.exe) utility that is installed with Windows before you can see the custom
counters in that application.
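As a sketch, the following example connects to one of the built-in Processor counters (which should exist on most systems) and reads its value:

C#

using System;
using System.Diagnostics;
using System.Threading;

class CpuReader
{
    static void Main()
    {
        // Connect to the built-in Processor category on the local computer.
        using (PerformanceCounter cpu =
            new PerformanceCounter("Processor", "% Processor Time", "_Total"))
        {
            // The first NextValue call returns 0 for rate counters;
            // sample twice with a short delay to get a meaningful value.
            cpu.NextValue();
            Thread.Sleep(1000);
            Console.WriteLine("CPU usage: {0:F1}%", cpu.NextValue());
        }
    }
}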

Performance Counter Value Retrieval

There are several ways you can read performance counter values:

• You can retrieve a raw value from a counter using the RawValue property on the

PerformanceCounter class.

• You can retrieve the current calculated value for a counter using the NextValue method

on the PerformanceCounter class.

• You can retrieve a set of samples using the NextSample method on the

PerformanceCounter class and compare their values using the

CounterSample.Calculate method.

There are security restrictions that affect your ability to use performance counters. For more

information, see "Introduction to Monitoring Performance Thresholds" on MSDN at

http://msdn.microsoft.com/library/en-us/vbcon/html/vbconintroductiontomonitoringperformancethresholds.asp.
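The following minimal sketch (reusing the WSPickupService category from the installer example earlier in this chapter, and assuming it has been installed) shows the three retrieval approaches side by side:

C#

using System;
using System.Diagnostics;

class CounterReader
{
    static void Main()
    {
        // Connect read-only to the custom counter from the installer example.
        using (PerformanceCounter counter =
            new PerformanceCounter("WSPickupService", "ConfirmPickup", true))
        {
            Console.WriteLine("Raw value: {0}", counter.RawValue);
            Console.WriteLine("Calculated value: {0}", counter.NextValue());

            // Freeze a snapshot for later comparison.
            CounterSample sample = counter.NextSample();
            Console.WriteLine("Sample raw value: {0}", sample.RawValue);
        }
    }
}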

Raw, Calculated, and Sampled Data

Performance counters record values about various parts of the system. These values are not

stored as entries; instead, they are persisted for as long as a handle remains open for the

particular category in memory. The process of retrieving data from a performance counter is

referred to as sampling. When you sample, you either retrieve the immediate value of a counter

or a calculated value.

Depending on how a counter is defined, its value might be the most recent aspect of resource

utilization, also referred to as the instantaneous value, or it might be the average of the last two

measurements over the period of time between samples. For example, when you retrieve a

value from the Process category's Thread Count counter, you retrieve the number of threads for

a particular process as of the last time this was measured. This is an instantaneous value.


However, if you retrieve the Memory category's Pages/Sec counter, you retrieve a rate per

second based on the average number of memory pages retrieved during the last two samples.

Resource usage can vary dramatically, based on the work being done at various times of day.

Because of this, performance counters that show usage ratios over an interval are a more

informative measurement than averages of instantaneous counter values. Averages can include

data for service startup or other events that might cause the numbers to go far out of range for

a brief period, thereby skewing results.

The PerformanceCounter component provides facilities for the most common Windows

performance monitoring requirement, namely, connecting to an existing counter on the server

and reading and writing values to it. Additional functionality, such as complex data modeling, is

available directly through Windows Performance Monitor. For example, you can use

Performance Monitor to chart the data a counter contains, run reports on the data, set alerts,

and save data to a log.

The interaction between raw values, next (or calculated) values, and samples is fairly

straightforward after you understand that raw and calculated values shift constantly, whereas

samples allow you to retrieve a static snapshot of the counter at a particular point in time.

Figure 2 illustrates the relationship between raw value, next value, and samples.

Figure 2 Performance counter values: raw, calculated, and sampled

The diagram in Figure 2 shows a representation of the data contained in a counter named

Orders Per Second. The raw values for this counter are individual data points that vary by

second, where the calculated average is represented by the line showing an increasing order

receipt over time. In this chart, the following data points have been taken:

• The user has used the NextValue method to retrieve the calculated value at three

different times, represented by NV1, NV2, and NV3. Because the next value is

constantly changing, a different value is retrieved each time without specifying any

additional parameters.

• The user has used the NextSample method to take two samples, indicated by S1 and S2.

Samples freeze a value in time, so the user can then compare the two sample values

and perform calculations on them.


Comparing Retrieval Methods

Retrieving a raw value with the RawValue property is very quick, because no calculations or

comparisons are performed. For example, if you are using a counter simply to count the number

of orders processed in a system, you can retrieve the counter's raw value.

Retrieving a calculated value with the NextValue method is often more useful than retrieving

the raw value, but this value may also give you an unrealistic view of the data because it can

reflect unusual fluctuations in the data at the moment when the value is calculated. For

example, if you have a counter that calculates the orders processed per second, an unusually

high or low amount of orders processed at one particular moment will result in an average that

is not realistic over time. This may provide a distorted view of the actual performance of your

system.

Samples provide the most realistic views of the data in your system by allowing you to retrieve,

retain, and compare various values over time. You would retrieve a sample, using the

NextSample method, if you needed to compare values in different counters or calculate a value

based on raw data. This may be slightly more resource-intensive, however, than a NextValue

call.

The NextSample method returns an object of type CounterSample. When you retrieve a

sample, you have access to properties on the CounterSample class, such as RawValue,

BaseValue, TimeStamp, and SystemFrequency. These properties let you get a very detailed look

at the data that makes up the sample data.
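For example, the following sketch (again assuming the WSPickupService category from the installer example is installed) freezes two samples one second apart and uses CounterSample.Calculate to compute the value for the interval between them:

C#

using System;
using System.Diagnostics;
using System.Threading;

class SampleComparer
{
    static void Main()
    {
        using (PerformanceCounter counter =
            new PerformanceCounter("WSPickupService", "ConfirmPickup", true))
        {
            // Freeze two snapshots one second apart...
            CounterSample s1 = counter.NextSample();
            Thread.Sleep(1000);
            CounterSample s2 = counter.NextSample();

            // ...and let the counter type's formula calculate the value
            // for the interval between them.
            float value = CounterSample.Calculate(s1, s2);
            Console.WriteLine("Calculated value over the interval: {0}", value);
        }
    }
}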

Summary

This chapter demonstrated how to create performance counters in custom performance

categories and how to connect to existing performance counters so that performance counter

data can be retrieved. For more detailed information about specific performance counter types,

see "Appendix C: Performance Counter Types."


Chapter 13

Building Install Packages

In this preliminary version of the guide, this chapter is still under development. It is

anticipated that a future release of the guide will include detailed information about building

install packages for instrumented applications.


Section 4

Managing Operations

This section focuses on the tasks performed by the operations team when managing

applications. It demonstrates an application in use and describes event log events, performance

counters, Windows Management Instrumentation (WMI) events, and event trace entries for the

application. It examines some of the important concepts involved in creating Management

Packs for Microsoft Operations Manager (MOM) 2005 and System Center Operations Manager

2007, and it describes the tasks involved in creating those Management Packs, including

importing Management Packs from the Management Model Designer (MMD) tool and the

TSMMD.

This section should be of use primarily to the operations team and to Management Pack

developers.

Chapter 14, "Deploying and Operating Manageable Applications"

Chapter 15, "Monitoring Applications"

Chapter 16, "Creating and Using Microsoft Operations Manager 2005 Management Packs"

Chapter 17, "Creating and Using System Center Operations Manager 2007 Management Packs"


Chapter 14

Deploying and Operating Manageable Applications

After you define your application in the TSMMD, generate instrumentation for the application,

and call the instrumentation code from your application, you are ready to deploy the application

in a test, and ultimately a production environment. It is at this point that the instrumentation

you have created can be used by the test and operations teams.

This chapter uses the Transport Order application and the Transport Order Web service, two

parts of the Northern Electronics example, to illustrate how instrumentation can be created for

an application.

Deploying the Application Instrumentation

When deploying an application instrumented using the TSMMD tool, in addition to installing the
application itself, you will also need to install the instrumentation used by the application. This

involves building the solutions and then running the installation utility, Installutil.exe, against

each instrumentation technology DLL. You may encounter the following technology DLLs:

• EventLogEventsInstaller.dll

• WindowsEventing6EventsInstaller.dll

• PerformanceCountersInstaller.dll

• WmiEventsInstaller.dll
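For example, the instrumentation might be installed from a Visual Studio command prompt as follows (the DLL names match the preceding list; the paths depend on your build output):

installutil.exe EventLogEventsInstaller.dll
installutil.exe WindowsEventing6EventsInstaller.dll
installutil.exe PerformanceCountersInstaller.dll
installutil.exe WmiEventsInstaller.dll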

For more information about application instrumentation and how it is installed, see Chapters

8–12 of this guide.

Running the Instrumented Application

In this scenario, the Transport Order application presents a page where the user can select an

order placed by a customer, edit the delivery details, and send the order to the transport

service. Figure 1 illustrates this page.


Figure 1 First page of the example Transport Order application

The dates shown in this and other screenshots in this chapter are shown in dd/mm/yyyy

format.

However, in operation, the Transport Order application depends on configuration values such as

the URL of the Transport Order Web service. If this configuration information is incorrect,

perhaps because the operations team has been reorganizing the servers or the Transport Order

Web service has experienced a failure, posting the order results in an error. The application

detects that the order submission failed and displays a message on the left side of the page, as

shown in Figure 2.


Figure 2 The result when the application cannot contact the Transport Order Web service

Event Log Instrumentation

Comprehensive instrumentation is included for this application, meaning that the operations

team, when informed of the problem, can open the Windows Event Log (either locally or

remotely), and see the event message shown in Figure 3.


Figure 3 Details of the application failure in the Windows Event Log

In this case, the error message probably does not provide sufficient information to help the

operations team. It simply contains the same text as the message displayed on the Web page.

The developer has correctly caught the exception and written it to the event log, but there is no

way of knowing why the remote server that implements the Transport Order Web service did

not respond, unless the operations team is aware of a particular cause of this error. To solve this

problem, more information about this error should be included in the TSMMD model for the

application, and the instrumentation should then be regenerated.

If the operations team is aware of the error or a specific application dependency, they can open

the Web.config file for this application to investigate further. In this case, they find that, as

shown in Figure 4, the value for the TOA.WebServiceProxies.Transport key contains the

incorrect port number, which they can easily correct.


Figure 4 The Web service configuration value in the Web.config file

After the operations team corrects the port number, the application behaves correctly. After

posting the order to the Transport Order Web service and receiving an indication that it

succeeded, the Web page clears the existing values from the controls and allows the user to

select another order.

Performance Counter Instrumentation

The instrumentation that the developers implemented includes performance counters that

provide a record of the execution of the Transport Order Web service. Operations staff can use

the Performance utility within the Windows operating system to examine the current values and

history for the average document processing time, and the total number of requests placed over

a period, as shown in Figure 5.

Figure 5 The performance counters exposed by the Transport Order Web service


WMI

The instrumentation for the application includes Windows Management Instrumentation (WMI)

events. Figure 6 shows the WMI event generated when the database is stopped.

Figure 6 WMI event raised from the Transport Order application

Trace File Instrumentation

The instrumentation for the application also includes trace file entries. Figure 7 shows the trace

file entry generated when the database is stopped.


Figure 7 Trace file entry raised from the Transport Order application

Summary

This chapter showed how to install the application instrumentation and demonstrated the use

of application instrumentation for the Transport Order application, which forms part of the

Northern Electronics example.


Chapter 15

Monitoring Applications

Defining a management model for an application and using the management model to ensure

that your application is well instrumented and can report health state is an important

requirement for designing manageable applications. However, a manageable application is not

all that is required to ensure that an application is easy to manage by the operations team. You

must also have a solution for monitoring the application, such as Microsoft Operations Manager

2005 or System Center Operations Manager 2007.

This chapter examines how to monitor applications; as an example, it uses Operations Manager

2005 monitoring the Transport Order Web service (part of the Northern Electronics Scenario).

Distributed Monitoring Applications

Most monitoring applications and environments, such as IBM Tivoli, CA Unicenter, Operations

Manager 2005 and Operations Manager 2007, use a central server to collect, store, and expose

information about remote applications and systems using agents installed on these remote

computers. In larger installations, there may be several services collecting information from

their own subsets of remote client computers, collating the data, and passing it back to the

central monitoring server. Figure 1 illustrates a simple Operations Manager installation.


Figure 1 The main components of a typical Microsoft Operations Manager environment

The agents installed on the remote client computers that collect the information can be

managed agents, which run on computers running Windows, or unmanaged agents, which run

on computers running other operating systems. These agents collect and send back information

about their own performance (so the central monitoring server can detect them and check that

they are executing correctly), the basic parameters of the host system (such as processor and

memory usage), and any other information specified by the rules within the Management Packs

installed on the central monitoring server.

In some cases, it is not possible to install an agent on remote computers. When this is the case,

you can take advantage of Operations Manager's support for agent-less monitoring. Using

remote procedure calls (RPC) and DCOM method invocation, the management server can

provide most of the same monitoring features as monitoring through a remote agent. This requires

the providers used within the application to support remote access, and the account used by the

Operations Manager server must have administrative permissions on the remote computers.

Most monitoring systems, including MOM, also provide a connector framework that allows

other monitoring systems to interact with the central server(s) and custom clients to provide

information about remote computers.


To view the information collected by the remote client computers, and to monitor application

and system performance, monitoring systems, such as Operations Manager, provide remote

consoles that operations staff can use to administer the system and view the state of monitored

applications. As you administer the system by adding, editing, and removing rules, the central

server sends these as configuration changes to all the remote agents, which store this

configuration locally and use it to determine the events and other data to send to the central

server.

Most monitoring systems also provide a reporting feature that allows administrators and

operators to create historical reports from the data collected by the agents. These reports are

often useful in providing indications of performance degradation over time and detecting

impending issues.

Management Packs

Microsoft Operations Manager, like most other monitoring applications and environments,
relies on a series of Management Packs that define the rules, views, and alerts for a specific set

of monitoring processes. Each Management Pack contains a rule group, which contains the set

of rules applicable to the monitored application or system.

The usual approach is to create a Management Pack that matches the management model you

create for your application and install it along with the standard Management Packs that

monitor other features and systems. For example, the Management Pack generated by the

Management Model Designer (MMD) and used with the Northern Electronics application

contains a series of rules and alerts that map directly to the instrumentation within the

application. Separate Management Packs (provided with Operations Manager) monitor the

basic features of the remote computers as sent by the agents installed on the remote

computers.

This division of monitoring tasks into separate functional areas means that your Management

Pack should only include rules that directly relate to your application and that measure

features your application can influence. As an example, you should not include a rule to

monitor the amount of free memory in your application Management Pack, because this does

not directly relate to your application processes. Instead, you install and use the Management

Pack that contains the remote server information and use this to monitor all the non-

application related features, such as processor loading and memory usage.

Rules and Rule Groups

Subsequent chapters include detailed procedures that describe the process of creating a

Management Pack. This brief overview introduces you to Management Packs and rules. This will

help you to understand the way you can map the instrumentation of an application to a set of

monitoring rules.


Figure 2 illustrates the Administrator Console in Microsoft Operations Manager 2005 with a

Management Pack for the Transport Order application installed. This is a very simple example of

a Management Pack that consists of only a rule group named TransportOrderApplication that

defines three event rules and two performance-processing rules.

Figure 2 The MOM 2005 Administrator Console showing the TransportOrderApplication rule group

You can use a rule group to associate a set of rules with an individual server or a group of

servers. You can enable and disable a complete group, and display and work with just a single

group, without associating each rule that you add or modify directly with one or more servers.

This makes management and monitoring much easier, especially as the architecture and

deployment of an application change over time.

Figure 3 illustrates some of the properties of the TransportWebServiceFailed event rule. This

rule takes as its source the application event log on the monitored server (where the main

Transport Order application runs) and uses criteria to select the event log entries that

correspond to the "Unable to connect to the remote server" error. When it detects this event, it

generates a critical error alert within the monitoring system, using the source and description of

the original event log entry for the new alert.


Figure 3 The event rule for the TransportWebServiceFailed event

Figure 4 illustrates the TransportServiceResponseTime performance rule. In this case, the

source of the values for the rule (the provider) is the AverageDocumentProcessingTime

performance counter implemented by the instrumentation within the Transport Order Web

service. In this example, the monitoring system interrogates the counter every minute and

stores the values so it can present a graph of performance over time.


Figure 4 The Performance rule for the Transport Order Web service response time

With these rules in place, the Operator Console will display the overall state of the application

(based, of course, on the defined rules) by rolling up the individual values of each alert raised by

the event and performance rules. You can specify how these rules roll up (how they combine

when there is more than one monitored entity). In this example with only a single monitored

instance, the overall state directly reflects the worst case; this is described in greater detail in
Chapters 16 and 17 of this guide.

Figure 5 illustrates the state view in the Operations Manager 2005 Operator Console. You can

use the Group list on the toolbar at the top of the window to select the displayed scope; in this

case, it is set to show only the TransportApplication rule group. This rule group is associated

with only a single server named DELMONTE that (in this simple scenario demonstration)

implements both the main Web application and the Transport Order Web service. You can see

that there are no open (in other words, unresolved) alerts for the entire application running on

this server.


Figure 5 Monitoring the overall state of the Transport Order application running on a single remote server

Monitoring the Example Application

The Transport Order application contains extensive instrumentation for many different kinds of

exceptions that might occur, including general errors, such as the user entering invalid values for

the order parameters. For example, as illustrated in Figure 6, omitting the Expected Weight

value when submitting the order raises an error indicated by the message on the left side of the

page.


Figure 6 The result of a missing order parameter in the example Transport Order application

This is not strictly a failure of the application, but neither is it an event that is intended to occur. The

instrumentation in the application writes an entry in Windows Event Log indicating the error. (A

similar error occurs if the value is not numeric or if the user enters incompatible values for other

parameters, such as From and To dates that do not define a valid period.)
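The instrumentation for such an error can be as simple as a single event log call. The following minimal sketch, in which the source name and event ID are illustrative assumptions, writes the entry as a Warning so that the corresponding rule can map it to a Warning (YELLOW) state:

    using System.Diagnostics;

    // Minimal sketch of the validation instrumentation described above.
    // The source name "TransportOrder" and event ID 2001 are illustrative
    // assumptions, not values from the actual example application.
    public static class OrderValidationInstrumentation
    {
        public static void LogInvalidOrderInput(string fieldName)
        {
            EventLog.WriteEntry(
                "TransportOrder",
                "Invalid order parameter: " + fieldName + " is missing or invalid.",
                EventLogEntryType.Warning,
                2001);
        }
    }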

When this event occurs, and the application adds an entry to the event log, the local Operations

Manager agent running on this computer passes details of the event log entry to the Operations

Manager central server. This causes the state for the application to change to that specified in

the rule that applies to this event—in this example, it changes to a Warning state. As you can

see in Figure 7, the Operator Console reflects this change in the application state and displays

the rolled-up state as a Warning.


Figure 7 The State view of the monitored application, showing a Warning state

The operations team is now immediately aware that the application state has changed

(Operations Manager can send an e-mail message, a system alert, or a pager message when an

alert occurs), and they can investigate. Switching to Alerts view (or double-clicking the entry for

the server) displays details of all the current alerts. In this case, the issue is not critical and

simply indicates that the user did not enter valid values. However, as you can see in Figure 8, the

Properties view for the alert contains a great deal of useful information about the event that

caused the alert.


Figure 8 The Properties view for an alert, showing the values from the event log and other useful diagnosis information

One of the core tenets of management modeling is that your management model should

incorporate knowledge that helps operations staff to diagnose a problem, resolve it, and verify

the resolution. This knowledge contains both application-specific content (usually provided by

the architect and developer), and company-specific knowledge (some of which is generated

during and after installation of the application).

Each rule you create in Operations Manager can store both product knowledge and company

knowledge, and the Operator Console presents this knowledge as you view each alert. For

example, Figure 9 illustrates the product knowledge incorporated into the rule, including a

summary, likely causes (the diagnostic information), and resolution information.


Figure 9 The Product Knowledge view for an alert, showing diagnosis and resolution details

Figure 10 illustrates the Company Knowledge view, which contains an Edit button that allows

operators with the appropriate access permission to edit the knowledge. There is also a view

that displays alert history, allowing operators to quickly see how and when this alert occurs.


Figure 10 The Company Knowledge view for an alert, which you can edit to provide up-to-date information

One useful feature of monitoring and recording user errors (as opposed to application faults) is

that you gain an insight into the usability of the application and the kinds of problems that

users face when using it. In this particular example, you may decide that some mechanism that

prevents users from submitting orders with no Expected Weight value (such as client-side

validation) would reduce the loading on the servers and make the application easier to use.

This is useful feedback for the architect and developer; it is automatically collected and reflects

actual usage instead of user perception and opinion.
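In an ASP.NET Web Forms page, for example, this kind of client-side validation can be added declaratively. The following is a minimal sketch in which the control names are illustrative assumptions:

    <%-- Assumes the weight field is a TextBox named ExpectedWeight (an
         illustrative name). The validator blocks the postback on the client
         when the field is left empty. --%>
    <asp:TextBox ID="ExpectedWeight" runat="server" />
    <asp:RequiredFieldValidator ID="ExpectedWeightRequired" runat="server"
        ControlToValidate="ExpectedWeight"
        ErrorMessage="Expected Weight is required."
        Display="Dynamic" />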

Monitoring the Remote Web Service

Collecting events that correspond to user errors is useful, but the fundamentally more

important aspect of monitoring is to be able to detect failures, application performance issues,

and other problems that directly affect business processes.

The example Management Pack for the Transport Order application contains two rules that

detect failure of the Transport Order Web service:


• TransportOrderServiceFailed. This rule maps event log entries created by the main

Web application when it fails to connect to the remote Web service to an alert in MOM.

• TransportWebServiceFailed. This rule maps event log entries created by the Transport

Order Web service when it encounters an error to an alert in MOM.

As soon as the Transport Order Web service fails, the Operations Manager agent on the Web

server sends details of the event log entry to the central monitoring server, which automatically

changes the state to the worst of all currently unresolved alerts. The

TransportOrderServiceFailed rule maps event log entries to a critical error alert, so this is the

state displayed in the Operator Console (as shown in Figure 11).

Figure 11 The critical error state caused by failure to connect to the Transport Order Web service

Viewing details of the alert provides very little useful information—mainly the information that

is visible in Windows Event Log, plus a count of the number of times that Operations Manager

detected this event, the period within which they occurred, and the mapped rule name (see

Figure 12).


Figure 12 Alert details and summary for the alert raised by the TransportOrderServiceFailed rule

However, the TransportOrderServiceFailed rule specifies both product knowledge and company

knowledge that is directly useful for diagnosing and resolving the problem indicated by this

alert. For example, as you can see in Figure 13, the CAUSES and RESOLUTIONS sections identify

the configuration error that causes this failure to connect to the remote service and provide

the correct value (or point to where the operator could obtain the current value).


Figure 13 The product knowledge provided by the TransportOrderServiceFailed rule

Figure 14 illustrates the company knowledge for this rule. Operators could use this editable area

to store the correct current value for the Transport Order Web service or notes that indicate

how to deduce or discover its location if it changes on a regular basis.


Figure 14 The company knowledge provided by the TransportOrderServiceFailed rule

After correcting the configuration value, the operations staff can reattempt the

request to ensure that it completes successfully. As in the example application, the target

Web service may expose a simple method that performs no processing and simply indicates

successful connection to the service. In this case, the knowledge will include details of how to

execute this method to verify resolution of the connection problem.
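A verification method of this kind can be trivial to implement. The following sketch shows the general idea as an ASP.NET (.asmx) Web service method; the class and method names are illustrative assumptions rather than the actual names used by the example application.

    using System.Web.Services;

    // Minimal sketch of a connectivity-verification operation. Reaching the
    // method at all proves that the caller resolved and connected to the
    // service; no processing is performed.
    public class TransportOrderService : WebService
    {
        [WebMethod(Description = "Performs no processing; confirms connectivity.")]
        public bool VerifyConnection()
        {
            return true;
        }
    }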

In the example application, having resolved the connection problem, the operations staff might

now discover that the Transport Order Web service itself is failing. However, this is not directly

obvious because the only indication is that the controls on the Web page remain populated with

the original values, even after posting the order to the Transport Order Web service, as shown in

Figure 15. In normal circumstances, as demonstrated earlier, code in the application clears the

controls and allows the user to select another order.


Figure 15 Failure of the Transport Order Web service is not directly obvious in the Transport Order application

However, the monitoring system shows the real situation, because the

TransportWebServiceFailed rule maps event log entries created by the Transport Order Web

service when it encounters an error to an alert in Operations Manager. Figure 16 illustrates the

new critical error alert at the top of the list and the event log message in the lower-right part of

the window. In this case, there is much more information in the error message, including the

useful fact that the code detected that the DataBaseName key is missing from the application

configuration file.


Figure 16 The alert created by the failure of the Transport Order Web service

The product knowledge stored within the rule indicates in more detail why this error occurred

and how to resolve it. The RESOLUTIONS section indicates that the DataBaseName key should

have the value "Transport", as shown in Figure 17.


Figure 17 The product knowledge provided by the TransportWebServiceFailed rule

Looking at the Web.config file for the Transport Order Web service, it becomes obvious why this

error occurred—someone has commented out the DataBaseName key, as shown in Figure 18.

Removing the enclosing comment markers "<!--" and "-->" and running the application again

results in successful execution of the Transport Order Web service.

Figure 18 The error arises because the DataBaseName key is commented out
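Although the figure is not reproduced here, the relevant section of the Web.config file presumably resembles the following sketch (the surrounding appSettings element is an assumption based on standard .NET configuration practice):

    <appSettings>
      <!-- The commented-out entry that causes the failure: -->
      <!--<add key="DataBaseName" value="Transport" />-->

      <!-- Removing the comment markers restores the key: -->
      <add key="DataBaseName" value="Transport" />
    </appSettings>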


From this simple example, you can see just how the combination of a suitable health model with

the appropriate instrumentation and application monitoring makes it much easier to detect,

diagnose, and resolve problems in complex and distributed applications. It addresses the

three issues described at the start of this chapter:

• The operations team no longer needs to rely on users to detect and report faults.

Sufficient and accurate information in the form of knowledge stored within the health

model and the monitoring rules makes diagnosis and resolution of faults easier, less

costly, and less time-consuming.

• The operations team does not have to visit the computer to investigate, nor do they

have to depend on scant information they may extract from the event logs or

performance counters. The health model knowledge provides the detailed data

required to resolve the fault.

• The operations team can easily detect problems early, such as impending failure of a

connection to a remote service caused by a failing network connection or lack of disk

space on the server, without having to continuously monitor performance counters and

event logs or use them as the sole sources of information for diagnosing faults.

Summary

Defining a management model for an application is very important in ensuring that it can be

managed by the operations team. However, without an effective way of monitoring the

instrumentation that your application generates, the application may still prove difficult to

manage. This chapter explained the benefits of monitoring applications and described some of

the most important components of monitoring software, using Operations Manager 2005 as an

example. The following chapters examine the use of Operations Manager 2005 and

Operations Manager 2007 Management Packs in more detail.


Chapter 16

Creating and Using Microsoft Operations Manager 2005 Management Packs

As discussed in Chapter 15, the basis for monitoring applications in Microsoft Operations

Manager 2005 is the Management Pack, which describes the rules, views, and alerts for a specific set

of monitoring processes. This chapter describes a number of scenarios for creating and using

Management Packs. It includes detailed information about the following:

• Importing a Management Model from the Management Model Designer

• Creating a Management Pack in the Operations Manager 2005 Administrator Console

• Editing an Operations Manager 2005 Management Pack

• Viewing Management Information in Operations Manager 2005

• Creating Management Reports in Operations Manager 2005

The Transport Order application is used as a running example throughout this chapter. This

application forms part of the Shipping solution in the Northern Electronics worked example

used throughout this guide.

Importing a Management Model from the MMD into Operations Manager 2005

Operations Manager 2005 provides many manageability benefits to the operations team, but it

does not allow you to define directly the type of management model for your application

previously described in this guide. In fact, Operations Manager 2005 does not allow you to

specify health state information such as the RED, YELLOW, and GREEN health aspects commonly

used in a management model. As a result, in many cases, you are likely to design your

application management model using the Team System Management Model Designer Power

Tool (TSMMD) or the Management Model Designer (MMD) tool.

As shown in Chapter 7 of this guide, an application management model defined in the TSMMD

tool can be exported to a management pack for Operations Manager 2005, and edited within

Operations Manager as required. Finally, you can specify the associations between the rules and

the computers that will run the application, and deploy the rules to the Operations Manager

Agents on those computers.


To import a Management model from the Management Model Designer into Operations Manager 2005

1. In the tree-view pane of the Administrator Console, expand the tree until

Management Packs is visible. Right-click Management Packs, and then click

Import/Export Management Pack.

If the tree-view pane is not visible, click Customize on the View menu. In the

Customize View dialog box, select the Console tree check box, and then click OK.

2. On the first page of the wizard, click Next.

3. On the Import or Export Management Packs page, select the Import Management

Packs and/or Reports radio button, and then click Next.

4. On the Select Folder and Choose Import Type page, select the folder that contains

the Management Pack (the .akm file) you exported from the Management Model

Designer. Under Type of import, select the Import Management Packs only radio

button, and then click Next.

5. The Select Management Packs page displays a list of the Management Packs located

in the folder you specified on the previous page. Select the Management Pack you

created with the Management Model Designer. If you want to import more than one

Management Pack, press and hold the SHIFT key or CTRL key while clicking each

Management Pack you want to import.

6. On the same page of the wizard, under Import Options, select the way you want

Operations Manager to update any existing Management Packs with the same

name:

◦ If you want to update an existing Management Pack, select the first option.

Operations Manager will retain any custom settings and knowledge from

existing rules in this Management Pack and import only changes to the

Management Pack.

◦ If you want to replace the existing Management Pack, select the second option.

If you are importing a new Management Pack that does not already exist in

Operations Manager, you can use the default "update" setting because there

will be no existing rules to update, so Operations Manager will create a

completely new set.

7. If you want Operations Manager to create backups of the Management Packs it

updates, select the Backup existing Management Pack check box. In the

Management Pack backup directory text box, type the name of the folder you want

the backup file to be placed in or use the Browse button to select the folder, and

then click Next.

8. On the Confirmation page of the wizard, click Finish. The Import Status dialog box

shows the status of the import process. When the import process completes, it

provides details of each stage of the process and indicates success or failure. It

shows the backup and import operations, detailed information about the operation,

the name of the imported file, the status, and a description. To create a file that

includes all the import steps, click the Create Log File button, as shown in Figure 1.


Figure 1 The status report displayed after importing a Management Pack

Viewing the Management Pack

You can now view the new Management Pack in the Operations Manager Administrator

Console. The Management Model Designer (MMD) generates a Management Pack that contains

multiple nested rule groups corresponding to the individual components and levels of the

original management model defined in the MMD, as shown in Figure 2. Within each level,

Operations Manager generates the event rules, alert rules, and performance rules defined in the

management model. It also automatically generates an alert rule at the top level that creates a

notification response to the network administrators when any rule with a severity of Error or

higher causes the state of the application to change.


Figure 2 The Management Pack imported into the Operations Manager 2005 Administrator Console

For more details about rule groups and rules, see "Creating and Configuring a Management

Pack in the Operations Manager 2005 Administrator Console".

As an example of the way that the MMD translates a management model into a Management

Pack, Figure 3 illustrates the General page of the Properties dialog box for the

TransportOrderUIErrors event rule. This rule uses Warning entries in Windows Application

Event Log to detect input errors by the user. In the management model, this causes a state

change to YELLOW for the user interface application, and the MMD appends this state to the

event name.


Figure 3 The General page of the Properties dialog box for a newly imported event rule

The MMD uses the values you enter when specifying the detector in the management model (in

this case, the TransportOrderUIErrors event) to generate the appropriate criteria for the event

rule. As shown in Figure 4, the MMD sets the Source of the event, and generates a regular

expression for the event ID to match that specified in the management model.


Figure 4 The Criteria page of the Properties dialog box for a newly imported event rule

The Alert page of the Properties dialog box shows that the MMD set the severity to Warning

(equivalent to the YELLOW health state), and specified that Operations Manager should create

an alert when this event occurs. It uses the values of the Source and Description from the event

to populate these fields of the alert, as shown in Figure 5.


Figure 5 The Alert page of the Properties dialog box for a newly imported event rule

The MMD uses the knowledge included in the management model to generate information for

the rules it generates. Figure 6 shows the Knowledge Base page for the new

TransportOrderUIErrors event rule, which contains sections labeled Summary, Diagnose,

Resolve, and Verify. These correspond directly to the steps defined in the management model

for monitoring and maintaining the application.


Figure 6 The Knowledge Base page of the Properties dialog box for a newly imported event rule

Figure 7 shows the Threshold page of the Properties dialog box for a newly imported

performance rule. The MMD sets the Threshold value and Match when the threshold meets

the following condition options for the rule based on the aspects you specified when creating

the management model. In this case, the rule will generate an alert and indicate a state change

when the value of the performance counter that measures the Transport Order Web service

response time exceeds 4999 (milliseconds).


Figure 7 The Threshold page of the Properties dialog box for a newly imported performance rule

After importing a Management Pack from the Management Model Designer, you may need to

edit it, add new rules, or change the behavior of some sections. The remaining procedures in

this chapter show these processes in detail.

You also need to generate computer groups in the Administrator Console that correspond to the

sets of computers that will run the application and deploy the rules to these computers. For

more information, see Create an Operations Manager 2005 Computer Group and Deploy the

Operations Manager Agent and Rules.

Guidelines for Importing a Management Model from the Management Model Designer

When importing a management model from the MMD, you should consider the following

proven practices:

• Ensure that you provide all the required information, including the knowledge that

describes Diagnosis, Resolution, and Verification procedures, when you create your

management model in the Management Model Designer (MMD).


• Use the validation features of the MMD to make sure that there are no errors or

missing information before generating the Management Pack.

• Use the import options in Operations Manager 2005 to generate a backup of the

existing Management Pack if you are updating it or replacing it.

• After importing the Management Pack into Operations Manager 2005, make sure that

there are no conflicts and edit the rules as required.

• Create the required computer groups and associate and deploy the rule groups in the

new Management Pack to the appropriate computers.

Creating and Configuring a Management Pack in the Operations Manager 2005 Administrator Console

If you have not used the MMD tool (or another tool that can export *.akm files) as part of

defining the management model for your application, you will need to use the Administrator

Console in Operations Manager 2005 to create a Management Pack for your application. This

can be complex because, as mentioned earlier, there are no direct mappings between many

of the concepts contained in a management model and an Operations Manager 2005

Management Pack. However, by using rules to detect events and performance counters, you

can update state variables that correspond to the health state of an application. The rules may

also send a message to operators through e-mail or pager. The monitoring system also allows

you to create alert rules that combine different events and performance counters to tailor the

alert to match exactly the requirements of both the application and the operations staff.

By assigning individual rules to groups, you can associate a group with a specific section or

component of the application, which may correspond to a managed entity. This makes it easier

to update the monitoring configuration when the physical layout of the monitored application

and its components changes over time. You can also assign knowledge from the management

model that is common to a set of rules to the rule group; this reduces duplication of effort and

makes knowledge updates easier.

This section contains the following procedures:

• To create a new Management Pack and rule group

• To create an event rule for a rule group

• To create an alert rule for a rule group

• To create a performance rule for a rule group

This section contains only enough information to create a Management Pack with rule groups

and rules in place. In many cases, you will need to perform additional editing to the

Management Pack. For more information about these other tasks, see "Editing an Operations

Manager 2005 Management Pack" later in this chapter.

To create a new Management Pack and rule group


1. In the tree view pane of the Administrator Console, expand the list until Rule Groups

is visible under Management Packs, and then right-click Rule Groups. If the shortcut

menu contains Enable Authoring mode, click it, and then click Yes in the

confirmation dialog box. If the shortcut menu contains Disable Authoring mode, you

are already in authoring mode.

If the tree-view pane is not visible, click Customize on the View menu. In the

Customize View dialog box, select the Console tree check box, and then click OK.

2. If this is the first rule group for your application, you must create a top-level (parent)

group. To create a top-level rule group, right-click Rule Groups, and then click Create

Rule Group.

3. If you have already created a top-level rule group for your application, you can

create nested (child) rule groups within that top-level group. To create a child rule

group, right-click the top-level rule group entry, and then click Create Rule Group.

4. On the General page of the Rule Group Properties Wizard, type a name for the rule

group and, optionally, a description. Make sure the Enabled check box is selected,

and then click Next.

5. On the Knowledge Base page, click Edit, and then enter any company-specific

knowledge for this rule group. For example, you can record the name and location of

the application, the application purpose and owner, or any other information that is

common to all the rules you will create for this group, and which may be useful to

operators and administrators. Click OK, and then click Next.

6. On the Advanced page, you can reconcile Management Packs imported from

Operations Manager 2000 with the changes to the way Operations Manager 2005

handles rule groups when you come to export the rule group. When you create a

new rule group, you only need to select the way Operations Manager will export the

rules in this group. By selecting the first or second option in the Rule Group

ownership options section, you ensure that exported Management Packs contain all

the rules, not just the rules you have modified since you installed or imported the

rule group. And, because you will usually not want to preserve deleted rules in the

exported Management Pack, select the default option Export as a vendor produced

rule. If the rule is disabled, then do not export. Also, make sure the Mark this rule

group as deleted check box is clear. Click Next.

A rule group may contain child (nested) rule groups, which can make it easier to

administer monitoring for large or very complex applications by providing a facility to

set the parameters of rules in multiple child groups in one operation. Operations

Manager 2000 allows a child rule group to link to multiple parent rule groups. Later

versions of Operations Manager do not support this, so you must disable the links

between parent and child groups that violate this condition in imported Management

Packs using the list at the top of this page.

7. On the Knowledge Authoring page, you can provide content for the product

knowledge of a rule group. Under Sections, click Purpose, and then enter

information about the purpose of the application and the rule group. Repeat the

process by clicking Features and Configuration and entering the appropriate

information.

8. Click Finish to create the new rule group. You will be prompted to deploy the rules in

the new rule group to a group of computers. Click No because there are no rules in

the new group.


9. The new rule group appears in the left-side tree view. Select it and expand the nodes

below it to see the three rule categories that Operations Manager automatically

adds to each rule group: Event Rules, Alert Rules, and Performance Rules. These are

all empty. The right pane of the Administrator Console shows a summary of the

properties and Company Knowledge for the Rule Group (see Figure 8).

Figure 8 A new rule group in Microsoft Operations Manager 2005

To create an event rule for a rule group

1. In the left tree view of the Administrator Console, expand the list until Rule Groups

is visible (it is under the Management Packs entry). Expand the rule group to which

you want to add the new event rule.

2. In the tree view, right-click Event Rules, and then click Create Event Rule to open

the Select Event Rule Type dialog box.

3. In the Select Event Rule Type dialog box, select the type of rule you want to create

from the list. The rule types allow you to do the following:


◦ Alert on or Respond to Event. The rule will generate a single alert, or perform a

process you define one time, for each occurrence of the specified event.

Operations Manager will not process any more rules that may match this event

occurrence.

◦ Filter Event. The rule will generate a single alert, or perform a process you

define one time, for each occurrence of the specified event. Operations

Manager will then continue to process other rules that match this event

occurrence.

◦ Detect Missing Event. The rule will generate a single alert, or perform a process

you define one time, if an event you specify does not occur during a specified

period on specified days. You can use this rule to detect failures where the

component or application generates a "heartbeat" event on a regular basis.

◦ Consolidate Similar Events. The rule will generate a single alert, or perform a

process you define one time, for a specified combination of events that occur

within a duration you specify. You consolidate the events by specifying the event

fields that will have identical values.

◦ Collect Specific Events. The rule will generate a single alert, or perform a

process you define one time, for a specified combination of events that occur

within a duration that you specify. You can choose whether to store some or all

of the event parameters or to just discard them.

4. After selecting the required rule type, click Next to open the Data Provider page of

the Event Rule Properties Wizard. Here, you select the source of the event in the

Provider name drop-down list box. For example, the Operations Manager agent may

collect events from a specified Windows Event Log (such as Application, System,

DNS Server), detect an SNMP event through Windows Management

Instrumentation (WMI), detect an internally generated or a script-generated event

(a generic event), or raise an event on a fixed schedule that you choose (a timed event).

5. If the event source or data source you require is not in the drop-down list box, you

can specify your own event provider by clicking New and then entering the required

data. For example, you can specify an IIS Web server or FTP server log file, a custom

log file, a custom timed event period, or a custom WMI provider. After you specify

the required information in the Provider name and Provider type boxes, click Next

to open the Criteria page.

6. On the Criteria page, specify the criteria that will match the event you want to

handle. Select the check boxes that correspond to the criteria you want to specify,

such as the text values of the Source, ID, Type, and Description of the event. If you

are handling events from Windows Event Log, you can usually obtain these values by

examining the event in the Event Log viewer (see Figure 9).


Figure 9 The Source, ID, Type, and Description values for a Windows Event Log entry

7. As you enter criteria, the Criteria description section shows a summary of these

criteria. If you need to apply more specific criteria, click the Advanced button to

open the Advanced Criteria dialog box, where you can select any of the fields for an

event (including parameters for a WMI or custom event, log file names, and more),

and specify criteria to select only the events you want. You can also use the

Advanced Criteria dialog box to match any values in the event using different

conditions, such as partial string matching, regular expressions, and numerical value

order comparisons. After you construct the required criteria combination, click Close

in the Advanced Criteria dialog box. On the Criteria page, click Next.

8. If you are creating a Collect Specific Events rule, the next page is the Parameter

Storage page. Here, you can specify whether Operations Manager should collect the

parameters of each event. The default is to store no parameters, but you can use the

option buttons on this page to specify that you want to store all the parameters

from all matching events or just specific named parameters. After you select the

required option, click Next to show the Schedule page.

9. By default, Operations Manager will always process events, but you can change this

behavior on the Schedule page by specifying time spans when Operations Manager

will or will not process events. In the drop-down list box, click either Only process

data during the specified time or Process data except during the specified time,

and then specify the start time and end time. Use the check boxes below the drop-

down list box to select the days of the week to which this setting applies, and then

click Next.

If you are creating a Detect Missing Event rule, you use the Schedule page to specify

the period during each day when you expect the event to occur. You must specify a

schedule for this type of event rule.

10. The page you see next depends on the type of rule you are creating and the source of

the event for this rule. If you specified as the source of this rule a mechanism that

creates an event, such as Windows Event Log or a WMI event, and you are creating

an Alert on or Respond to Event rule or a Detect Missing Event rule, the next page is

the Alert page. On this page, you specify whether the event(s) detected by this rule

will generate an alert (which will appear in the Operator Console) and details about

the alert. The Alert page allows you to do the following:

◦ Specify whether the event will generate an alert (for which you will create an

alert rule) by selecting the Generate alert check box.


◦ Turn on display of health state for this rule and alert severity condition checking

by selecting the Enable state alert properties check box.

◦ Specify the alert severity (such as Critical Error, Warning, or Success) if you want

to always generate the same severity alert for this event. Alternatively, click the

Edit button to open the Alert Severity Calculation for State Rule dialog box,

where you specify a series of If conditions and an Else condition so the severity

(and therefore the health state displayed in the console) depends on the

parameter values for the event. This allows you to define, for example, that a

particular event will generate a Service Unavailable condition for specific values

of the parameters, and a Success condition for other values. You must specify at

least one condition that causes a RED state (Critical Error, Service Unavailable),

and one that causes a GREEN state (Success). You can also specify conditions

that cause a YELLOW state (Warning).

◦ Specify the name of the person responsible for tracking and resolving the alert

as the Owner. This allows Operations Manager to direct the alert to the

appropriate administrators and operators listed in the Notification Groups

section of the Administrator Console.

◦ Specify the Resolution state for the alert. By default, this is New, but you can set

it to Assigned to a group of people, such as a helpdesk or vendors, or mark it as

requiring scheduled maintenance. You can use the Global Settings section of the

Administrator Console to modify or define new resolution states. For more

information, see the later section "Viewing and Editing Global Settings."

◦ Specify the value for the Alert source. This is the text displayed as the Source in

the Operator Console when this alert occurs. You can enter custom text or select

from any of the fields in the event that causes this alert. The default is to use the

Source field value.

◦ Specify the value for the Description. This is the text displayed as the

Description in the Operator Console when this alert occurs. You can enter

custom text or select from any of the fields in the event that causes this alert.

The default is to use the Description field value.

◦ Specify details of the role of the server in the alert process using the Server role,

Instance, Component, and Custom Fields options.

Not all of the controls on the Alert page are available for every type of event

rule. Depending on the type of rule and the provider source, some of the

controls may be disabled.

11. If you are creating a Consolidate Similar Events rule, the next page is the Consolidate

page. Use the check boxes in the list of event fields to specify those that must have

identical values in order for Operations Manager to consolidate multiple events into

a single alert. You can also specify the period within which the multiple events must

occur as a number of seconds. Operations Manager will only raise one alert in the

Operator Console for any number of consolidated events within this period.

12. If you are creating a Filter Event rule, the next page is the Filter page. Select the

option required for the way you want Operations Manager to evaluate other rules

that match the source event. You can specify if it should add matching events to the

database or ignore them as it continues evaluating rules.

13. If you specified as the source of this rule a timed event (a regular scheduled

occurrence) or if you are creating an Alert on or Respond to Event or a Detect


Missing Event rule, the next page is the Alert Suppression page. Use the check boxes

in the list of alert fields to specify the repeated alerts that must have identical values

in order for Operations Manager to ignore (suppress) them.

14. Click Next. If you are not creating a Consolidate Similar Events or Collect Specific

Events rule, the next page is the Responses page. Here, you specify the actions

Operations Manager should perform when a matching event occurs. If you do not

specify any response, Operations Manager simply generates an alert (provided you

have specified this on the Alerts page), and changes the state displayed in the

Operator Console. You can specify the following types of response:

◦ Launch a Script. This opens a dialog box where you select an existing Operations

Manager script or create a new script. You also specify whether the script should

run on the remote computer (where the Operations Manager agent resides) or

on the Operations Manager management server, the script timeout, and any

parameters required by the script.

◦ Send an SNMP trap. This opens a dialog box where you choose whether to

generate the trap on the remote computer that raised the alert (SNMP must be

installed and enabled there) or on the Operations Manager management server.

◦ Send a Notification to a Notification Group. This opens a multi-tabbed dialog

box. On the Notification tab, select an existing notification group, modify an

existing notification group, or create a new notification group. On the Email

Format tab, you can accept the standard format for a notification e-mail or edit

this to create a custom format using placeholder variables. On the Page Format

tab, you can accept the standard format for a pager notification message or edit

this to create a custom format using placeholder variables. On the Command

Format tab, you can accept the standard command to run another application

or batch file, or you can edit this to create a custom format using placeholder

variables.

◦ Execute a command or batch file. This opens a dialog box where you can select

the Application and/or the Command Line, and the Initial directory. You also

specify whether the command or batch file should run on the remote computer

(where the Operations Manager agent resides) or on the Operations Manager

management server and the command timeout.

◦ Update state variable. This opens a dialog box where you can add state

variables that correspond to specific actions based on the values of fields in the

source event. Click the Add button in this dialog box and select an action (such

as incrementing the value of the variable or storing the last n occurrences), and

then select the field from the source event that provides the value for this

action. You also specify whether the operation is performed on the remote

computer (where the Operations Manager agent resides) or on the Operations

Manager management server.

◦ Transfer a file. This opens a dialog box where you specify a virtual directory for

the transferred file, whether to upload or download files, and the source and

destination file names. You can use values in the source event fields to select

the appropriate file, and use the standard Windows environment variables (such

as %WINDIR%) to specify the paths.

◦ Call a method on a managed code assembly. This opens a dialog box where you

specify the assembly name and type name for the managed code assembly you


want to execute. You must also enter the method name within that assembly

you want to call, specify whether it is a Static method or an Instance method,

and provide any parameters required for the method. You also specify whether

the assembly is located on the remote computer (where the Operations

Manager agent resides) or on the Operations Manager management server and

the response timeout. (A minimal sketch of such a method appears after this procedure.)

15. Click Next to display the Knowledge Base page, click Edit, and enter any company-

specific knowledge appropriate for the rule that may be useful to operators and

administrators.

16. Click Next to display the Knowledge Authoring page. Click Summary in the Sections

list at the top of the page and enter summary information for this event. Repeat the

process by clicking Causes, Resolutions, and the other available categories and

entering the appropriate information. For each entry, you can specify the GUID of

another rule that shares this entry—this reduces the duplication that may occur if

many rules require the same knowledge.

17. Click Next to display the Advanced page, where you can mark this rule as deleted,

and specify the way it will be exported within a Management Pack. For a new rule,

leave the values set to the defaults.

18. Click Next to display the General page, where you provide a name for the new rule.

By default, the rule is enabled but you can disable it using the check box in this page.

You can also allow overrides of the rule by specifying the override name.

19. Finally, click Finish to create the new rule, which appears in the Event Rules section

of the left-side tree view. If you want to immediately force the new rule (or any

updated rules) through to the Operations Manager agents on remote computers,

instead of waiting for the scheduled update cycle, right-click the Management Packs

entry in the left-hand tree view, and then click Commit Configuration Change.

By default, Operations Manager pushes rule changes to all remote agents every five

minutes. To change this value, right-click Global Settings in the left-side tree view, click

Management Server Properties, click the Rule Change Polling tab, and then select the

required value.
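As referenced in step 14, a managed code response simply calls a method that you name in an assembly that you provide. The following minimal sketch shows such an assembly; the class name, method name, parameters, and log path are all illustrative assumptions.

    using System;
    using System.IO;

    // Minimal sketch of a managed code assembly invoked as a rule response.
    // In the response dialog box, this would be configured with the type name
    // "TransportOrderResponses", the method name "RecordAlert", and the
    // Static method option. All names and the log path are illustrative.
    public class TransportOrderResponses
    {
        public static void RecordAlert(string alertSource, string description)
        {
            File.AppendAllText(
                @"C:\OpsLogs\TransportOrderAlerts.log",
                DateTime.Now + " " + alertSource + ": " + description +
                Environment.NewLine);
        }
    }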

To create an alert rule for a rule group

1. In the left-side tree view of the Administrator Console, expand the list until Rule

Groups is visible (it is under the Management Packs entry). Expand the rule group to

which you want to add the new alert rule.

2. In the tree view, right-click Alert Rules, and then click Create Alert Rule to open the

Alert Rule Properties Wizard.

3. On the Alert Criteria page, specify any criteria required to match the event rule or

performance rule that generates this alert. You can match the alert using the values

for the alert source, the severity (such as Error or Warning), and confine the match

to alerts generated within a rule group that you select. As you enter criteria, the

Criteria description section shows a summary of these criteria.

4. If you need to apply more specific criteria, click the Advanced button to open the

Advanced Criteria dialog box, where you can select any of the fields for an alert

(such as the Description, Domain, and Owner), and specify criteria to select just the

alerts you want. You can also use the Advanced Criteria dialog box to match any

values in the alert using different conditions, such as partial string matching, regular


expressions, and numerical value order comparisons. After you construct the

required criteria combination, click Close in the Advanced Criteria dialog box. On the

Alert Criteria page, click Next.

By default, event rules will generate alerts that have the same Source and Description

values as the original event. By default, performance rules will generate alerts that

have the Source set to a combination of the Object, Counter, and Instance values from

the original performance counter, and a Description set to the Source value plus the

text "value = " and the current counter value.

5. By default, Operations Manager will always process alerts, but you can change this

behavior on the Schedule page by specifying time spans where Operations Manager

will or will not process alerts. In the drop-down list box, select Only process data

during the specified time or Process data except during the specified time, and

specify the start times and end times for the period. Then use the check boxes below

the drop-down list box to select the days of the week to which this setting applies,

and then click Next.

6. The next page is the Responses page. Here, you specify the actions Operations

Manager should perform when a matching alert occurs. If you do not specify any

response, Operations Manager simply changes the State displayed in the Operator

Console. You can specify the following types of response:

◦ Launch a Script. This opens a dialog box where you select an existing Operations

Manager script or create a new script. You also specify whether the script should

run on the remote computer (where the Operations Manager agent resides) or

on the Operations Manager management server, the script timeout, and any

parameters required by the script.

◦ Send an SNMP trap. This opens a dialog box where you specify where to

generate the trap: on the remote computer (where the Operations Manager

agent resides) or on the Operations Manager management server. You can use

SNMP responses to communicate alerts to other computers and systems that

run a wide variety of operating systems.

◦ Send a notification to a Notification Group. This opens a multi-tabbed dialog

box. On the Notification tab, select an existing notification group, modify an

existing notification group, or create a new notification group. On the Email

Format tab, you can accept the standard format for a notification e-mail or edit

this to create a custom format using placeholder variables. On the Page Format

tab, you can accept the standard format for a pager notification message or edit

this to create a custom format using placeholder variables. On the Command

Format tab, you can accept the standard command to run another application

or batch file or edit this to create a custom format using placeholder variables.

◦ Execute a command or batch file. This opens a dialog box where you can select

the Application and/or the Command Line, and the Initial directory. You also

specify if the command or batch file should run on the remote computer (where

the Operations Manager agent resides) or on the Operations Manager

management server and the command timeout.

◦ Update state variable. This opens a dialog box where you can add state

variables that correspond to specific actions based on the values of fields in the

source alert. Click the Add button in this dialog box, select an action (such as

incrementing the value of the variable, or storing the last n occurrences), and


then select the field from the source alert that provides the value for this action.

You also specify if the operation is performed on the remote computer (where

the Operations Manager agent resides) or on the Operations Manager

management server.

◦ Transfer a file. This opens a dialog box where you specify a virtual directory for

the transferred file, whether to upload or download files, and the source and

destination file names. You can use values in the source alert fields to select the

appropriate file, and use the standard Windows environment variables (such as

%WINDIR%) to specify the paths.

◦ Call a method on a managed code assembly. This opens a dialog box where you

specify the Assembly name and Type name for the managed code assembly you

want to execute. You must also enter the Method name within that assembly

you want to call, specify whether it is a Static or an Instance method, and

provide any Parameters required for the method. You also specify if the

assembly is located on the remote computer (where the Operations Manager

agent resides) or on the Operations Manager management server, and the

response timeout.

7. Click Next to display the Knowledge Base page, click Edit, and then enter any

company-specific knowledge appropriate for the rule that may be useful to

operators and administrators.

8. Click Next to display the Knowledge Authoring page. In the Sections list at the top of

the page, click Summary, and then enter summary information for this alert. Repeat

the process by clicking the Causes, Resolutions, and the other available categories

and entering the appropriate information. For each entry, you can specify the GUID

of another rule that shares this entry—this reduces the duplication that may occur if

many rules require the same knowledge.

9. Click Next to display the Advanced page, where you can mark this rule as deleted,

and specify the way it will be exported within a Management Pack. For a new rule,

leave the values set to the defaults.

10. Click Next to display the General page, where you provide a name for the new rule.

By default, the rule is enabled but you can disable it using the check box on this

page. You can also allow overrides of the rule by specifying the Override Name.

11. Finally, click Finish to create the new rule, which appears in the Alert Rules section

of the left-side tree view. If you want to immediately force the new rule (or any

updated rules) through to the Operations Manager agents on remote computers,

instead of waiting for the scheduled update cycle, right-click the Management Packs

entry in the left-side tree view, and then click Commit Configuration Change.

By default, Operations Manager pushes rule changes to all remote agents every five

minutes. To change this value, right-click Global Settings in the left-side tree view, click

Management Server Properties, click the Rule Change Polling tab, and then select the

required value.

To create a performance rule for a rule group

1. In the left-side tree view of the Administrator Console, expand the list until Rule

Groups is visible (it is under the Management Packs entry). Expand the rule group to

which you want to add the new performance rule. If the tree-view pane is not


visible, click Customize on the View menu. In the Customize View dialog box, select

the Console tree check box, and then click OK.

2. In the tree view, right-click Performance Rules, and then click Create Performance

Rule to open the Performance Rule Type dialog box.

3. In the Performance Rule Type dialog box, select the type of rule you want to create

from the list. The rule types allow you to do the following:

◦ Sample Performance Data. This is a "measuring" rule that causes Operations

Manager to collect numeric values from the Windows performance counter or

WMI counter you specify and store them in the database for viewing and

reporting. You can also generate a response each time the rule collects a value.

◦ Compare Performance Data. This is a "threshold" rule that generates an alert

and/or a response when the sampled value falls outside a specified range or

crosses a defined threshold.

4. After selecting the required rule type, click Next to open the Data Provider page.

Here, you select the source of the data in the Provider name drop-down list box. The

list includes all the standard and custom performance counters defined on the

computers where the Operations Manager agents reside, as well as the counters on

the Operations Manager server itself. Alternatively, for a Compare Performance Data

(threshold) rule, you can select an internally generated or a script-generated

(Generic) event.

5. If the data source you require is not in the drop-down list box, you can specify your

own data provider by clicking the New button and entering the data required. You

can specify an application log file, a Windows performance counter, or a WMI

numeric event. Depending on which option you choose, you see a dialog box or

wizard that allows you to specify details of the data source. You can specify a custom

provider for an application log file, any of the available performance counters on

remote computers, or details of a custom class and methods for a WMI provider.

The Data Provider dialog box also contains a Modify button that opens a dialog box

where you can change the parameters for the selected performance counter. After

you enter information in the required Provider name and Provider type boxes, click

Next to open the Schedule page.

6. By default, Operations Manager will always process performance counters at the

frequency you specify on the Data Provider page, but you can change this behavior

on the Schedule page by specifying time spans where Operations Manager will or

will not collect counter values. In the drop-down list box, select Only process data

during the specified time or Process data except during the specified time, and

specify the start and end times for the period. Then use the check boxes below the

drop-down list box to select the days of the week to which this setting applies, and

then click Next.

7. The page you see next depends on the type of performance rule you are creating.

For a Sample Performance Data (measuring) rule, the next page is the Responses

page, discussed in step 13. If you are creating a Compare Performance Data

(threshold) rule, the next page you see is the Criteria page.

8. On the Criteria page, specify any criteria required to match the counter that provides

the values for this rule. You can match using the values of the fields for a counter:

Counter (name), Instance, Object, Source Computer, and Source Domain. You can

specify match values using the controls from instance, from computer, and from


domain on the Criteria page. As you enter criteria, the Criteria description shows a

summary of these criteria.

9. If you need to apply more specific criteria, or match on the Counter (name) or

Object fields, click the Advanced button to open the Advanced Criteria dialog box

where you can select any of the fields for a counter and specify criteria to select only

the counter you want. You can also use the Advanced Criteria dialog box to match

any values for the counter using different conditions, such as partial string matching,

regular expressions, and numerical value order comparisons. After you construct the

required criteria combination, click Close in the Advanced Criteria dialog box. On the

Criteria page, click Next to show the Threshold page.

10. On the Threshold page, specify the conditions under which the values collected from

the counter will raise an alert or cause a response action to occur. In the Threshold

value section of this page, select an option so that Operations Manager uses just the

current value of the counter, the average value of a specified number of samples, or

the change in the value over a specified number of samples. In the Match when the

threshold meets the following condition: section of this page, select an option so

that Operations Manager will respond to a sample value that is greater than a

specified value, less than a specified value, or always respond. You can also allow

overrides of the threshold values for this rule, and specify the Override Name. Click

the Set Criteria button to specify the target computer or group, and the override

values.

An override defines a specific computer or computer group. You can create an

override that changes the settings of rules for a specific target computer or group

without having to create custom rules for that target. Overrides allow you to disable a

rule, override the threshold value of a performance threshold rule, override a script

parameter value, and override an existing override parameter in the advanced alert

severity formula. You replace a value in any of the relevant property settings for the

rule by the name of the override. The Override Criteria section of the tree view in the

Administrator Console shows the overrides you create.

11. Click Next to show the Alert page. On this page, you specify if the counters for this

rule will generate an alert (which will appear in the Operator Console) when they

cross a threshold value, and details of the alert. This complex page allows you to do

the following:

◦ Specify that a counter threshold event will generate an alert, by selecting the

Generate alert check box.

◦ Turn on alert severity condition checking, by selecting the Enable state alert

properties check box.

◦ Specify the alert severity (such as Critical Error, Warning, or Success) if you

always want to generate the same severity for this counter threshold event.

Alternatively, you can specify a series of If conditions and an Else condition so

that the severity depends on the parameter values for the counter threshold

event. This allows you to define, for example, that a particular counter threshold

event will generate a Service Unavailable condition for specific values of the

parameters, and a Success condition for other values. Click the Edit button to

enter the condition criteria.

◦ Specify the name of the person responsible for tracking and resolving the

counter threshold event as the Owner. This allows Operations Manager to direct


the alert to the appropriate administrators and operators listed in the

Notification Groups section of the Administrator Console.

◦ Specify the Resolution state for the counter threshold event. By default, this is

New, but you can set it to a state that assigns it to a group of people (such as a helpdesk or vendors) or marks it as requiring scheduled maintenance. You can use the Global

Settings section of the Administrator Console to modify or define new

resolution states.

◦ Specify the value for the Alert source. This is the text displayed as the Source in

the Operator Console when this counter threshold event occurs. You can enter

custom text or select from any of the fields in the counter threshold event. The

default is to use the Object, Counter, and Instance field values.

◦ Specify the value for the Description. This is the text displayed as the

Description in the Operator Console when this counter threshold event occurs.

You can enter custom text, or select from any of the fields in the counter

threshold event. The default is to use the Object, Counter, and Instance field

values followed by the text "value = " and the counter value.

◦ Specify details of the role of the server in the alert process using the Server role,

Instance, Component, and Custom Fields options.

12. Click Next to show the Alert Suppression page. Use the check boxes in the list of

alert fields to specify the fields in which repeated alerts must have identical values in order for

Operations Manager to ignore (suppress) them, and then click Next.

13. The next page you see, for both types of Performance Rule, is the Responses page.

Here, you specify the actions Operations Manager should perform for a matching

counter. You can specify the following types of response:

◦ Launch a Script. This opens a dialog box where you select an existing Operations

Manager script or create a new script. You also specify if the script should run on

the remote computer (where the Operations Manager agent resides) or on the

Operations Manager management server, the script timeout, and any

parameters required by the script.

◦ Execute a command or batch file. This opens a dialog box where you can specify

the Application and/or the Command Line, and the Initial directory. You also

specify if the command or batch file should run on the remote computer (where

the Operations Manager agent resides) or on the Operations Manager

management server, and the command timeout.

◦ Update a state variable. This opens a dialog box where you can add state

variables that correspond to specific actions based on the values of fields for the

counter. Click the Add button in this dialog box and select an action (such as

incrementing the value of the variable or storing the last n occurrences), and

then select the field from the source counter that provides the value for this

action. You also specify if the operation is performed on the remote computer

(where the Operations Manager agent resides) or on the Operations Manager

management server.

◦ Transfer a file. This opens a dialog box where you specify a virtual directory for

the transferred file, whether to upload or download files, and the source and

destination file names. You can use values in the source counter fields to select

the appropriate file, and use the standard Windows environment variables (such

as %WINDIR%) to specify the paths.


◦ Call a method on a managed code assembly. This opens a dialog box where you

specify the Assembly name and Type name for the managed code assembly you

want to execute. You must also enter the Method name within that assembly

you want to call, specify whether it is a Static or an Instance method, and

provide any Parameters required for the method. You also specify if the

assembly is located on the remote computer (where the Operations Manager

agent resides) or on the Operations Manager management server, and the

response timeout.

14. Click Next to display the Knowledge Base page, click Edit, and enter any company-

specific knowledge appropriate for the rule that may be useful to operators and

administrators.

15. Click Next to display the Knowledge Authoring page. In the Sections list at the top of

the page, click the Summary entry, and then enter summary information for this

counter. Repeat the process by clicking the Causes, Resolutions, and the other

available categories and entering the appropriate information. For each entry, you

can specify the GUID of another rule that shares this entry—this reduces the

duplication that may occur if many rules require the same knowledge.

16. Click Next to display the Advanced page, where you can mark this rule as deleted,

and specify how it will be exported within a Management Pack. For a new rule, leave

the values set to the defaults.

17. Click Next to display the General page, where you provide a name for the new rule.

The rule is enabled by default, but you can disable it using the check box in this page.

You can also allow overrides of the rule by specifying the Override Name.

18. Finally, click Finish to create the new rule, which appears in the Performance Rules

section of the left-side tree view. If you want to immediately force the new rule (or

any updated rules) through to the Operations Manager agents on remote

computers, instead of waiting for the scheduled update cycle, right-click the

Management Packs entry in the left-side tree view, and then click Commit

Configuration Change.

By default, Operations Manager pushes rule changes to all remote agents every five

minutes. To change this value, right-click Global Settings in the left-side tree view, click

Management Server Properties, click the Rule Change Polling tab, and then select the

required value.
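The threshold rules described in this procedure can only sample performance counters that the monitored application (or Windows itself) actually publishes. The following C# sketch shows one way an application such as the Northern Electronics shipping application might publish a custom counter for a Compare Performance Data rule to sample. The category and counter names are illustrative assumptions, not names used elsewhere in this guide, and creating a counter category requires administrative rights.

using System.Diagnostics;

// A minimal sketch of publishing a custom performance counter that a
// threshold rule could sample. The category and counter names are
// hypothetical.
class CounterPublisher
{
    const string Category = "Northern Electronics Shipping";
    const string Counter = "Orders Pending";

    static void Main()
    {
        // Create the category and its counter once, typically during
        // application setup; this call requires administrative rights.
        if (!PerformanceCounterCategory.Exists(Category))
        {
            PerformanceCounterCategory.Create(
                Category,
                "Counters published by the shipping application.",
                PerformanceCounterCategoryType.SingleInstance,
                Counter,
                "Number of orders waiting to be shipped.");
        }

        // Open the counter in writable mode (readOnly: false) and
        // publish the value that the threshold rule will sample.
        using (PerformanceCounter pending =
            new PerformanceCounter(Category, Counter, false))
        {
            pending.RawValue = 42;
        }
    }
}

On the Data Provider page of the wizard, you would then select this counter as the source for the rule.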

Guidelines for Creating and Configuring a Management Pack in the

Operations Manager 2005 Administrator Console

When creating and configuring a management pack in the Operations Manager 2005

Administrator Console, you should consider the following proven practices:

• Use the management model you developed for your application to help you decide

what rules and performance counters you need to create.

• Create a top-level rule group that corresponds to the application, using a name for the

group that makes it easy to identify. You will later be able to use this top-level rule

group to expose the overall rolled-up state of the entire application. Then create child

rule groups to build a multi-level hierarchy that mirrors that of the management model,

adding the appropriate rules into each child group.


• Create only rules directly relevant to your application. Avoid duplicating rules that are

available in built-in Management Packs, such as measuring processor usage or free

memory.

• Use alerts to raise urgent issues to operations staff immediately, perhaps through e-

mail or pager.

• Take advantage of specific features of the monitoring application, such as timed events

that can provide heartbeat monitoring of remote services, or the ability to run scripts or

commands in response to alerts (for example, to query values or call a method that

provides diagnostic information, and then generates a suitable alert).

• Provide as much useful company-specific and application-specific knowledge as possible

for each rule group and rule to make problem diagnosis, resolution, and verification

easier for operators and administrators.

Editing an Operations Manager 2005 Management Pack

After creating or importing a Management Pack in Operations Manager 2005, you will typically

need to perform additional actions to fine-tune the Management Pack or respond to changes in

the operations environment and your management model.

If you have imported a management pack from the MMD, changes are quite

commonly required, because the import process does not always generate the ideal

combination of rules and rule groups. For example, the MMD generates an alert that creates a

notification to members of the administration group. However, this group has no members by

default, so you may want to edit this notification, add members to the various notification

groups, or create new notification groups.

This section discusses a number of actions that may be necessary when editing an Operations

Manager 2005 Management Pack, including the following:

• Editing rule groups and subgroups

• Editing event rules, alert rules, and performance rules

• Editing computer groups and rollup rules

• Creating and editing operators, notification groups, and notifications

• Viewing and editing global settings

Editing Rule Groups and Subgroups

To edit rule groups, you must right-click the Rule Groups entry in the tree pane of the

Administrator console, expand the list of rule groups, right-click the entry for the rule group you

want to edit, and then click Properties. The Properties dialog box (see Figure 10) contains six

tabs that allow you to edit individual features and settings for this rule group.


Figure 10 The Properties dialog box for a rule group

The resulting dialog box includes the following tabs:

• General. On this page, you can edit the name, description, and version of the group. To

disable all the rules in the group, or re-enable them, clear or select the Enabled check

box.

• Knowledge Base. On this page, you can view the Knowledge Base content (the Purpose,

Features, and Configuration knowledge), and the Company Knowledge Base content for

this rule group. If you want to edit the Company Knowledge Base content, click Edit.

You cannot edit the overall Knowledge Base content on this page—you must use the

Knowledge Authoring page for this.

• Knowledge Authoring. On this page, you can edit the overall Knowledge Base content.

It displays a list of knowledge sections (Purpose, Features, and Configuration). Select a

section in this list and then edit the knowledge content for that section in the text box

in this page. When complete, click the Generate Knowledge button to create the

formatted content. To see the result, go back to the Knowledge Base page.

• Advanced. On this page, you can reconcile Management Packs imported from

Operations Manager 2000 with the changes to the way Operations Manager 2005

handles rule groups. You can also use this page to specify the way that Operations

Manager 2005 will structure rule groups when it exports them, and mark a rule group

as deleted:


◦ In the upper section of the Advanced page is a list of any child rule groups for

this rule group. Select the check box next to any that you want to mark as

deleted—these are usually groups that are also children of other parent rule

groups.

Operations Manager 2000 allows a child rule group to link to multiple parent

rule groups. Later versions of Operations Manager do not support this, so you

must mark the links between parent and child groups that violate this

condition as deleted.

◦ In the lower section of the Advanced page, select from the three options that

govern the export of this rule group. The default option is Export as a vendor

produced rule. If the rule is disabled, then do not export. If you want to include

the child group (in order to import the rules into Operations Manager 2000),

select Export as a vendor produced rule. Export rule if it is enabled or disabled.

If you want to export the rule as a modified rule, which Operations Manager will

not overwrite when importing Management Packs, select Export as a customer

created/modified rule.

◦ If you want to mark the current rule group as deleted (as opposed to marking

child groups as deleted), select the Mark this rule group as deleted check box.

• Computer Groups. On this page, you can deploy the rules in the current rule group to

one or more computer groups. Click the Add button, select an existing computer group

in the Select Item dialog box, and then click OK to deploy the rules to the selected

group. Repeat to deploy the rules to more groups. To remove the rules from a

computer group, select the group in the list on the Computer Groups page, and then

click Remove.

You can double-click the computer group in the Select Item dialog box to open the

Properties dialog box for that computer group, and use it to edit the properties of the

group. For details about editing computer groups, see the later section, "Editing Computer Groups and Rollup Rules."

• Parent Rule Groups. On this page, you can see a list of the rule groups for which the

current rule group is a child. In Operations Manager 2005, each rule group can have

only one parent, so there should be only one rule group shown. The exception is in

Management Packs imported from Operations Manager 2000, where you must use the

Advanced page to mark the relevant child groups as deleted.

After making the required changes to the properties of the rule group, click Apply or OK in the

Properties dialog box. If you want to immediately force the changes through to the Operations

Manager agents on remote computers, instead of waiting for the scheduled update cycle, right-

click the Management Packs entry in the left-side tree view, and then click Commit

Configuration Change.

By default, Operations Manager pushes rule changes to all remote agents every five minutes. To

change this value, right-click Global Settings in the left-side tree view, click Management Server

Properties, click the Rule Change Polling tab, and then select the required value.


Editing Event Rules, Alert Rules, and Performance Rules

To edit rules in the Administrator Console, you must expand the list of rule groups under the

Rule Groups entry (which is under the Management Packs entry) to show all the currently

configured groups. Then expand the group that contains the rule you want to edit, and select

the appropriate rule type (Event Rules, Alert Rules, or Performance Rules). The right window

shows a list of the rules in the selected section. Right-click the rule you want to edit, and then

click Properties (or double-click the rule).

You can search for rules that meet specific criteria if you cannot remember where a rule

resides, or if you want to find rules that have specific properties. Right-click Rule Groups (or

any group under the Rule Groups entry) in the tree view, and then click Find Rules. This opens

the Rule Search Wizard; in it, specify the criteria, such as the location, name, type, or response.

When you click Finish, a new console window appears containing all the matching groups.

The Properties dialog box for an event rule (see Figure 11) contains ten tabs that allow you to

edit individual features and settings for this rule.

Figure 11 The Properties dialog box for an event rule


The Properties dialog box for an alert rule (see Figure 12) contains seven tabs that allow you to

edit individual features and settings for this rule.

Figure 12 The Properties dialog box for an alert rule

The Properties dialog box for a performance threshold rule (see Figure 13) contains eleven tabs

that allow you to edit individual features and settings for this rule. The Properties dialog box for

a performance measuring rule is similar, but it does not have the Criteria, Threshold, Alert, and

Alert Suppression tabs.


Figure 13 The Properties dialog box for a performance threshold rule

Many of the pages in the Properties dialog box are common across the three types of rules:

• General. On this page, you can edit the name of the rule. To disable this rule, or re-

enable it, clear or select the This rule is enabled check box. If you want to override this

rule with another rule defined elsewhere, select the Enable rule-disable overrides for

this rule check box, click the Set Criteria button, and then click the Add button in the

Set Override Criteria dialog box. Select a computer or a computer group, and then

specify Enable (0) or Disable (1) in the Edit Override Criteria dialog box. This allows you

to specify whether this rule will apply to the selected computer or group.

• Data Provider. (This page is not available for an alert rule.) On this page, you can select

the source of the event or the performance counter that acts as the data source for the

rule:

◦ For an event rule, you can select a Windows Event Log, a scheduled (timed)

event, a WMI event, or a custom script event. To specify a source not in the list,

click the New button, select an event type in the Select Provider Type dialog

box, click OK, and then specify the details for this source.

◦ For a performance rule, you can select any of the performance counters

exposed by Operations Manager and the Operations Manager agent installed on

monitored computers, or a script-generated or internally generated event. To


specify a source not in the list, click the New button, select a performance

counter type in the Select Provider Type dialog box, click OK, and then specify

the details for this source. To edit the properties, such as the counter location or

synchronization, click the Modify button and edit the values as required.

• Schedule. On this page, you can specify the periods when the rule is active. By default,

the rule is active at all times. To specify the active periods, select either Only process

data during the specified time or Process data except during the specified time, select

the start and end times, and then select the check boxes for the days of the week to

which this period applies.

• Criteria. (For an alert rule, this page is labeled Alert Criteria; this page is available for all

rule types except for a performance measuring rule.) On this page, you can specify how

an event rule or a performance threshold rule matches the source event or

performance counter, or how an alert rule matches the source alert:

◦ For an event rule, you can specify the Source, ID, Type, and Description

properties of the source event. Alternatively, click the Advanced button to

specify individual criteria for matching on any of the fields of the source event,

using a range of string matching, regular expressions, and numerical order

matching operations.

◦ For a performance threshold rule, you can specify the Instance, Domain, and

Computer properties of the source counter. Alternatively, click the Advanced

button to specify individual criteria for matching on any of the fields of the

source counter, using a range of string matching, regular expressions, and

numerical order matching operations.

◦ For an alert rule, you specify the Alert source and Severity properties of the

source alert generated by an event rule or a performance rule. If you only want

to match alerts from rules in a specific rule group, select the only match alerts

generated by rules in the following groups: check box, click the Browse button,

and then select the appropriate rule group. You can also click the Advanced

button to specify individual criteria for matching on any of the fields of the

source alert, using a range of string matching, regular expressions, and

numerical order matching operations.

• Threshold. (This page is available for only a performance threshold rule.) On this page,

you can specify the way that the rule samples the counter values, and the way that it

matches the sampled values. You can specify that the rule should calculate the

Threshold value using a single counter value, the average of a specified number of

values, or a specified change in the values. You can also specify that the threshold value

must be greater than or less than a value you provide, or that it should raise an alert for all

values. Finally, you can use this page to enable an override for this rule, and specify the

overriding rule.

• Alert. (This page is not available for an alert rule or a performance measuring rule.) On

this page, you can turn on and turn off generation of an alert when this event rule or

performance rule is activated and set the properties for the alert it generates. The

controls on this page allow you to do the following:

◦ Specify whether the event or counter will generate an alert by selecting the

Generate alert check box.

◦ Turn on alert severity condition checking by selecting the Enable state alert

properties check box.


◦ Specify the Alert severity (such as Critical Error, Warning, or Success) if you

always want to generate the same severity alert for this event or counter.

Alternatively, you can specify a series of If conditions and an Else condition so

that the severity depends on the parameter values for the event or counter. This

allows you to define, for example, that a particular event or counter will

generate a Service Unavailable condition for specific values of the parameters,

and a Success condition for other values. Click the Edit button to enter the

condition criteria.

◦ Specify the name of the person responsible for tracking and resolving the alert

as the Owner. This allows Operations Manager to direct the alert to the

appropriate administrators and operators listed in the Notification Groups

section of the Administrator Console.

◦ Specify the Resolution state for the alert. By default, this is New, but you can set

it to a state that assigns it to a group of people (such as a helpdesk or vendors) or marks it as requiring scheduled maintenance. You can use the Global

Settings section of the Administrator Console to modify or define new

resolution states.

◦ Specify the value for the Alert source. This is the text displayed as the Source in

the Operator Console when this alert occurs. You can enter custom text or select

from any of the fields in the event or counter that causes this alert. The default

is to use the Source field value.

◦ Specify the value for the Description. This is the text displayed as the

Description in the Operator Console when this alert occurs. You can enter

custom text, or select from any of the fields in the event or counter that causes

this alert. The default is to use the Description field value.

◦ Specify details of the role of the server in the alert process using the Server role,

Instance, Component, and Custom Fields options.

Not all the controls on the Alert page are available for every type of event rule

or performance rule. Depending on the type of rule and the provider source,

some of the controls may be disabled.

• Alert Suppression. (This page is not available for an alert rule or a performance

measuring rule.) On this page, you can compound multiple events or counter samples

into a single alert; this prevents the generation of duplicate alerts for the same source

condition. Turn on alert suppression using the check box at the top of this page, and

then select the check boxes in the list of alert fields below for those that must be

identical to suppress duplicated alerts.

• Responses. On this page, you can specify the actions that should occur when the event

rule, alert rule, or performance rule is activated. Click the Add button to show a list of

the available responses and click the one you require. Alternatively, click the Edit

button to edit an existing response selected in the list, or click the Remove button to

remove the selected response. The response actions available are the following:

◦ Launch a Script. This opens a dialog box where you select an existing Operations

Manager script or create a new script. You also specify if the script should run on

the remote computer (where the Operations Manager agent resides) or on the

Operations Manager management server, the script timeout, and any

parameters required by the script.


◦ Send an SNMP trap. This opens a dialog box where you specify where to

generate the trap: on the remote computer (where the Operations Manager

agent resides) or on the Operations Manager management server. You can use

SNMP responses to communicate alerts to other computers and systems that

run a wide variety of operating systems.

◦ Send a notification to a notification group. This opens a multi-tabbed dialog

box. On the Notification tab, select an existing notification group, modify an

existing notification group, or create a new notification group. On the Email

Format tab, you can accept the standard format for a notification e-mail or edit

this to create a custom format using placeholder variables. On the Email Format

tab, you can accept the standard format for a pager notification message or edit

this to create a custom format using placeholder variables. On the Command

Format tab, you can accept the standard command to run another application

or batch file or edit this to create a custom format using placeholder variables.

◦ Execute a command or batch file. This opens a dialog box where you can specify

the Application and/or the Command Line, and the Initial directory. You also

specify if the command or batch file should run on the remote computer (where

the Operations Manager agent resides) or on the Operations Manager

management server, and the command timeout.

◦ Update a state variable. This opens a dialog box where you can add state

variables that correspond to specific actions based on the values of fields for the

counter. Click the Add button in this dialog box to select an action (such as

incrementing the value of the variable or storing the last n occurrences), and

then select the field from the source counter that provides the value for this

action. You also specify if the operation is performed on the remote computer

(where the Operations Manager agent resides) or on the Operations Manager

management server.

◦ Transfer a file. This opens a dialog box where you specify a virtual directory for

the transferred file, whether to upload or download files, and the source and

destination file names. You can use values in the source counter fields to select

the appropriate file, and use the standard Windows environment variables (such

as %WINDIR%) to specify the paths.

◦ Call a method on a managed code assembly. This opens a dialog box where you

specify the Assembly name and Type name for the managed code assembly you

want to execute. You must also enter the Method name within that assembly

you want to call, specify whether it is a Static or an Instance method, and

provide any Parameters required for the method. You also specify if the

assembly is located on the remote computer (where the Operations Manager

agent resides) or on the Operations Manager management server, and the

response timeout. (A sketch of such an assembly appears at the end of this list.)

• Advanced. On this page, you can specify how Operations Manager 2005 will structure

rule groups when it exports them. If you want to mark the rule as deleted, select the

Mark this rule as deleted check box. Select from the three options that govern the

export of this rule group. The default option is Export as a vendor produced rule. If the

rule is disabled, then do not export. If you want to include the child group (in order to

import the rules into Operations Manager 2000), select the Export as a vendor

produced rule. Export rule if it is enabled or disabled check box. If you want to export


the rule as a modified rule, which Operations Manager will not overwrite when

importing Management Packs, select the Export as a customer created/modified rule

check box.

• Knowledge Base. On this page, you can view the Knowledge Base content and the

Company Knowledge Base content for this rule. Click the Edit button if you want to edit

the Company Knowledge Base content. You cannot edit the overall Knowledge Base

content in this page—you must use the Knowledge Authoring page for this.

• Knowledge Authoring. On this page, you can edit the overall Knowledge Base content.

It displays a list of knowledge Sections (such as Summary, Causes, and Resolutions).

Select a section in this list and then edit the knowledge content for that section in the

text box in this page. You can also specify that each knowledge section is shared with

other rules by clicking the Share new button and entering the sharing rule ID. This

reduces duplication of content and makes updates easier. When complete, click the

Generate Knowledge button to create the formatted content. To see the result, go back

to the Knowledge Base page.
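To make the Call a method on a managed code assembly response more concrete, the following C# sketch shows the general shape of such an assembly. The namespace, type, and method names are hypothetical assumptions for this illustration; you would enter them, together with any parameter values, in the response dialog box.

using System.Diagnostics;

// Hypothetical assembly that a response could invoke; none of the
// names below come from the guide or from Operations Manager itself.
namespace NorthernElectronics.Responses
{
    public class DiagnosticResponses
    {
        // A static method for the response to call. The response dialog
        // box supplies the parameter value, for example a field from
        // the source event or counter.
        public static void RecordDiagnostics(string details)
        {
            // Write the diagnostic information to the Application event
            // log (the event source shown here is hypothetical and must
            // already be registered on the computer).
            EventLog.WriteEntry(
                "NorthernShipping",
                "Response diagnostics: " + details,
                EventLogEntryType.Information);
        }
    }
}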

After making the required changes to the properties of the rule, click Apply or OK in the

Properties dialog box. If you want to immediately force the changes through to the Operations

Manager agents on remote computers, instead of waiting for the scheduled update cycle, right-

click the Management Packs entry in the left-side tree view, and then click Commit

Configuration Change.

By default, Operations Manager pushes rule changes to all remote agents every five minutes.

To change this value, right-click Global Settings in the left-side tree view, click Management

Server Properties, click the Rule Change Polling tab, and then select the required value.

Editing Computer Groups and Rollup Rules

To edit computer groups and rollup rules, in the Administrator Console, expand the list of

computer groups under the Computer Groups entry (which is under the Management Packs

entry) to show all of the currently configured computer groups. Right-click the entry for the

computer group you want to edit, and then click Properties to open the Properties dialog box,

as illustrated in Figure 14.


Figure 14 The Properties dialog box for a computer group

The Properties dialog box contains ten tabs that allow you to edit individual features and

settings for this computer group:

• General. On this page, you can edit the name and description for the group.

• Included Subgroups. (This page displays a list of the subgroups within this group.) On

this page, you can add and remove subgroups. Click the Add button to open the Add

Subgroup dialog box, select an existing computer group, and then click OK to move it

from its current position in the computer groups hierarchy to become a child of the

current group. To remove a subgroup from the current computer group, select it in the

list on the Included Subgroups page, and then click the Remove button.

• Included Computers. (This page displays a list of the computers within the current

computer group.) On this page, you can add a new computer to the group. Click the

Add button to open the Add Computer dialog box, which shows a list of computers that

have an Operations Manager agent installed. Select the check box next to computers in

the list that you want to add to this group, and then click OK. To add a computer that is

not listed, click New in the Add Computer dialog box, enter the domain name and

computer name, and then click OK. To remove a computer from the current computer

group, select it in the list on the Included Computers page, and then click the Remove

button.

• Excluded Computers. (This page displays a list of the computers that are always

excluded from the current computer group, even if they are listed on the Included

Computers page.) On this page, you can exclude a computer. Click the Add button to

open the same Add Computer dialog box as used on the Included Computers page


(described earlier). Alternatively, click the Search button to open the Computer dialog

box where you can specify computers to exclude using wildcard strings or regular

expressions to match on the domain name or the computer name. Select a computer

on the Excluded Computers page, and then click Edit to edit an existing computer or

Remove to remove the selected computer.

• Search for Computers. On this page, you can specify criteria that select computers to

add to this computer group. You can search for different types of computer (such as

servers, clients, and domain controllers), and use wildcard strings or regular expressions

to match on the domain name or the computer name.

• Formula. On this page, you can specify a formula that selects computers based on the

criteria entered on the Search for Computers page. You can generate the formula using

a range of attributes for the target computers, such as the IP address, subnet, operating

system, fully qualified domain name, and more. You can also use a range of operators

and string matching functions, and select from lists of other computer groups.

• State Rollup Policy. On this page, you can specify how the overall state for a computer

group will reflect the states of individual members of the group. The members can be

the subgroups included within this group and/or the individual computers in the group.

The three options on this page (see Figure 15) are the following:

◦ The worst state of any member computer or subgroup. If you select this option,

Operations Manager will set the State value displayed in the Operator Console

to that specified for the Severity for the worst of the current unresolved alerts

for the members of this group. The alert Severity states range from Success

(best) to Service Unavailable (worst). You can see a list of these states on the

Alert page of the Properties dialog box for any of your existing event rules,

performance rules, or alert rules.

◦ The worst state from the specified percentage of best states in the computer

group. If you select this option, you must specify a percentage that defines the

proportion of the group that will act as the state indicator for the group. Operations

Manager will select a set of members from the group that consists of the

computers with the best health state up to the percentage you specified of the

total group membership. In other words, if there are 10 computers and you

specify 60%, Operations Manager will select the six members of the group that

currently have the least severe state. It then uses the worst (the most severe)

state of the subset it selects as the overall (rolled-up) state for the group, and

displays this in the Operator Console as the State value for this computer group. (A sketch of this calculation appears at the end of this list.)

◦ The best state of any member or subgroup. If you select this option, Operations

Manager will set the State value displayed in the Operator Console to that

specified for the Severity for the best of the current unresolved alerts for the

members of this group. It is unlikely that you will use this option very often,

because it effectively hides the state of most of the members of the group as

long as one member is performing correctly.


Figure 15 The State Rollup Policy page of the Properties dialog box for a computer group

• Console Scopes. (This page displays a list of the scopes where the current computer

group is used. By default, every group is a member of every scope.) On this page,

administrators can specify custom sets of computer groups for each scope (Operations

Manager Users, Operations Manager Authors, and Operations Manager Administrators)

using the Console Scopes options within the main Administration section of the

Administrator Console.

• Parents. This page displays a list of the parent computer groups for this group, if it is a

child (nested) group.

• Rules. On this page, you can enable and disable the rules in this computer group and its

child subgroups. Select the check box at the top of the page to disable all the rules in

this group and all its child subgroups (if any). The Rules page also shows a list of any

rule groups associated with parent computer groups that this computer group inherits.

At the bottom of the page, a list shows the rule groups already associated with this

computer group, which its child computer groups will inherit. To add a rule group to this

list, click the Add button to open the Select Rule Group dialog box, select the required

rule group, and then click OK. To remove a rule group from the list, select it, and then

click the Remove button.
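The arithmetic behind the second State Rollup Policy option is easy to misread, so the following C# sketch reproduces the calculation described above: order the members from best to worst, keep the specified percentage of best states, and report the worst state within that subset. The numeric severity scale is an assumption made for this illustration, with lower values standing for healthier states.

using System;
using System.Collections.Generic;

// Sketch of the "worst state from the specified percentage of best
// states" rollup. Severity values are illustrative: 0 = Success
// (best); larger numbers indicate more severe states.
class RollupSketch
{
    static int RollUp(List<int> memberSeverities, double bestFraction)
    {
        // Order the members from best (lowest severity) to worst.
        List<int> ordered = new List<int>(memberSeverities);
        ordered.Sort();

        // Keep only the specified percentage of best states; with ten
        // members and 60 percent, that is the six healthiest members.
        int keep = (int)(ordered.Count * bestFraction);

        // The rolled-up state is the worst state within that subset.
        return ordered[keep - 1];
    }

    static void Main()
    {
        List<int> severities =
            new List<int> { 0, 0, 0, 1, 1, 2, 2, 3, 4, 4 };
        Console.WriteLine(RollUp(severities, 0.6)); // prints 2
    }
}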

After making the required changes to the properties of the computer group, click Apply or OK in

the Properties dialog box. If you want to immediately force the changes through to the

Operations Manager agents on remote computers, instead of waiting for the scheduled update


cycle, right-click the Management Packs entry in the left-side tree view, and then click Commit

Configuration Change.

By default, Operations Manager pushes rule changes to all remote agents every five minutes.

To change this value, right-click Global Settings in the left-side tree view, click Management

Server Properties, click the Rule Change Polling tab, and then select the required value.

For details about how to create a computer group, see the later section, "Create an Operations

Manager 2005 Computer Group and Deploy the Operations Manager Agent and Rules."

Creating and Editing Operators, Notification Groups, and Notifications

Rules can create notifications that consist of e-mail or pager alerts, or they can run commands

to perform custom notification tasks. You first create notification groups and add individual

members to these groups. Then you can specify the group(s) to which a rule will send

notifications:

To create and edit operators, notification groups, and notifications

1. In the Administrator Console, expand the Notification entry (which is under the

Management Packs entry), and then expand the Notification Groups entry to show all

of the currently configured notification groups. The Operators entry contains a list of

all operators configured for Operations Manager 2005. The Notification Groups

entry contains a list of all configured Notification Groups. Select a group entry in the

left-side tree view to display a list of the operators within that group in the right

pane. Select the Notification entry to see a summary of the number of configured

operators and groups and links to view and create operators and groups (see Figure

16).


Figure 16 The Notification and Notification Groups section of the Administrator Console

2. To create a new operator, click the Create Operator link in the right pane, or right-

click Operators in the left pane, and then click Create Operator. In the Operator

Properties Wizard that opens, enter the name of the operator and specify whether

this operator is enabled by selecting or clearing the Enabled check box, and then

click Next to show the Email page.

3. On the Email page, select the Email this operator check box if you want Operations

Manager to send email alerts to this operator. Enter the e-mail address, and then

select either Always email this operator (to send e-mail messages at any time) or

Email this operator at the specified times. If you select the second option, enter the

start and end times for the period, and then select the days of the week to which it

applies. Click Next to show the Page(r) page.

4. On the Page(r) page, select the Page this operator check box if you want Operations

Manager to send pager alerts to this operator. Enter the pager address, and

then select either Always page this operator (to send pager alerts at any time) or

Page this operator at the specified times. If you select the second option, enter the

start and end times for the period, and then select the days of the week to which it

applies. Click Next to show the Command page.


5. On the Command page, select the Notify this operator by external command check

box if you want Operations Manager to alert this operator by running an external

command that you specify in the Global Settings for Operations Manager 2005. You

must enter an operator ID string value that is passed to the command. For details

about editing the Global Settings, see "Viewing and Editing Global Settings" later in this chapter. Click Finish to create the new operator.

6. To edit the properties for an existing operator, select the Operators entry in the left-

side tree view. In the list of operators in the right pane, right-click the one you want

to edit, and then click Properties. The Properties dialog box has the following tabs:

◦ General. On this page, you can edit the operator name and enable or disable

this operator.

◦ Email. On this page, you can enable or disable the sending of email alerts to this

operator, edit the e-mail address, and specify the periods when e-mail alerts can

be sent.

◦ Page(r). On this page, you can enable or disable the sending of pager alerts to

this operator, edit the pager address, and specify the periods when pager alerts

can be sent.

◦ Command. On this page, you can enable or disable the execution of an external

command that sends alerts to this operator, edit the operator ID that is passed

to the command, and specify the periods when commands can be executed.

◦ Notification Groups. (This page displays a list of the groups of which this

operator is a member.) On this page, you can add this operator to another

group. Click the Add button to open the Notification Groups dialog box, select a

group, and then click OK. To remove this operator from a notification group,

select the group in the list on the Notification Groups page, and then click the

Remove button.

7. To delete an existing operator, right-click it in the right pane, and then click Delete.

Click Yes in the confirmation dialog box that appears.

8. To create a new notification group, right-click Notification Groups in the left-side

tree view, and then click Create Notification Group. In the Notification Group

Properties dialog box that opens, enter the name for the new group.

9. To add members to the new group, select them in the right-side list of Available

operators and click the "<-" button. To create a new operator to add to the group,

click the New Operator button to start the Operator Properties Wizard and follow

the steps shown earlier in this procedure (steps 2 through 5). After adding or

creating all the required operators, click Finish in the Notification Group Properties

dialog box.

10. To edit an existing notification group, right-click it in the left-side tree view, and then

click Properties to open the Notification Group Properties dialog box. Edit the name

of the group in the text box at the top of the dialog box, and edit the list of members

for the group by selecting them in the lists and clicking the "<-" and "->" buttons.

11. To have Operations Manager generate a notification when an event, performance

counter threshold, or alert occurs, you specify one or more notification groups in the

properties of that rule. Select the required rule from the Event Rule, Performance

Rule, or Alert Rule section of the appropriate rule group, right-click it, and then click

Properties. For details about the properties of a rule, see the earlier section, "Editing Event Rules, Alert Rules, and Performance Rules."


12. In the Properties dialog box for the rule, open the Responses page, and then click

the Add button. Select Send a notification to a Notification Group, and then select

the group from the drop-down list box. Repeat to add more notification groups as

required, and then click OK to close the rule Properties dialog box.

Viewing and Editing Global Settings

Many of the features in Operations Manager 2005 depend on configuration settings and

properties defined as global settings. To view global settings in the Administrator Console,

expand the Administration entry, and then select Global Settings. The right pane shows the

global settings for Operations Manager 2005. Although there are thirteen entries for the global

settings, there are only three different Properties dialog boxes. You can open the main global

settings Properties dialog box by double-clicking (or right-clicking and then clicking Properties)

any of the entries except Management Servers and Agents. The main Properties dialog box is

shown in Figure 17.

Figure 17 The main Properties dialog box for the Global Settings in Operations Manager 2005


The main Properties dialog box contains eleven tabbed pages:

• Notification Command Format. On this page, you can specify a custom application that

you want to execute in response to an operator alert. You can specify the command line

for the application and include placeholders that Operations Manager replaces with

values when it executes the command. These placeholders include the Operator ID you

specify in the Properties dialog box for each operator. (A sketch of such a command appears at the end of this list.)

• Knowledge Base Template. On this page, you can edit the HTML template Operations

Manager uses to generate the multi-section knowledge base content for items in a

Management Pack. The template contains placeholders of the form <!section-name>

that indicate where Operations Manager will insert the separate sections of knowledge

content text.

• Database Grooming. On this page, you can specify how Operations Manager will

automatically mark alerts as resolved after a certain period, removing them from the

Operator Console display.

• Operational Data Reports. On this page, you can automatically send reports to Microsoft about the way you use Operations Manager 2005; this provides valuable

feedback to the development team about typical usage patterns.

• Custom Alert Fields. On this page, you can change the names of the five custom fields

displayed in alerts. You can use these if you want to add application-specific or

company-specific information to every alert.

• Alert Resolution States. On this page, you can modify the existing alert resolution

states or add new ones. The default states include Acknowledged, Assigned to xxx, and

Resolved. You can also specify the service level interval within which each state should

be resolved, the shortcut key assigned to this state, and whether users can set the state

within the Operator Console and the Web Console.

• Email Server. On this page, you can configure the settings used to send e-mail alerts

through your SMTP mail server.

• Licenses. On this page, you can manage the number of management licenses for

remote managed clients.

• Web Addresses. On this page, you can specify the URL of the Operations Manager Web

Console and the URL used for online product knowledge (the default is the Microsoft

Support Web site). You can also specify custom Web addresses for your file server (for

transferring files to clients), and for your company knowledge base.

• Communications. On this page, you can specify the port Operations Manager uses for

encrypted communication with remote managed computers.

• Security. On this page, you can specify features of the authentication and the response

execution for the Operations Manager server and remote managed computers.
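To make the Notification Command Format page more concrete, the following C# sketch shows what a custom notification command might look like. The argument layout and the log file path are assumptions made for this illustration; the command line you configure would pass placeholder values, such as the operator ID, as arguments to the command.

using System;
using System.IO;

// Hypothetical external notification command. The argument order and
// the log file path below are illustrative assumptions.
class NotifyOperator
{
    static void Main(string[] args)
    {
        string operatorId = args.Length > 0 ? args[0] : "(unknown)";
        string alertText = args.Length > 1 ? args[1] : "(no alert text)";

        // Append the notification to a file that a custom paging or
        // messaging system could watch and act upon.
        File.AppendAllText(@"C:\NotifyQueue\alerts.log",
            DateTime.Now + " operator=" + operatorId +
            " alert=" + alertText + Environment.NewLine);
    }
}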

After making the required changes to the global settings, click Apply or OK in the Properties

dialog box.

In the Management Servers Properties dialog box and the Agents Properties dialog box, you

can fine-tune the behavior of the Operations Manager server and the Operations Manager

agents installed on the Operations Manager server and on remote computers. You will usually

not need to change these settings, and this chapter does not describe them in detail. For more


information, examine the Operations Manager 2005 Help file or click the Help button in the

relevant Properties dialog box.

After you finish editing your Management Pack(s) and settings, you can turn off

Authoring mode. Right-click Rule Groups in the left-side tree view, click Disable

Authoring mode, and then click Yes in the confirmation dialog box.

Guidelines for Editing an Operations Manager 2005 Management Pack

When editing an Operations Manager 2005 Management Pack, consider the following proven

practices:

• Ensure that your Management Pack contains the appropriate rule groups and rules to

match the management model and the instrumentation exposed by the application.

Keep the hierarchy of the rule groups and the properties of the rules as simple as

possible, while following the structure and the requirements of the management

model.

• Ensure that your Management Pack contains the appropriate computer groups and

subgroups that match the physical layout of the computers that will run the application.

Use subgroups to provide the appropriate rollup features for each subset of computers

that run each component or section of the application.

• Create the appropriate notification groups containing operators that manage the

application and other people that have an interest in its operation (such as business

owners and application developers). Configure responses for the rules and alerts that

send operational alerts to the appropriate groups.

• Modify any of the global settings that affect your application. For example, you may

want to use the custom alert fields for company-specific information or modify the

alert resolution states and service level periods to suit your requirements.

Create an Operations Manager 2005 Computer Group and Deploy the Operations Manager Agent and Rules

When creating or editing a Management Pack for Microsoft Operations Manager (or the

equivalent sets of rules and knowledge for other monitoring applications and environments), it

is sometimes hard to relate the architecture of a management model to the features provided

by the monitoring application. For example, a management model as defined in the MMD tool

uses a hierarchical structure of application components and services, using a simple three-state

(RED, YELLOW, and GREEN) indicator paradigm for the health of each section. Most monitoring

applications provide a wide range of features, but they do not relate directly to this simple

approach.

To match the management model to the capabilities of the monitoring application, you can

create groups of computers that perform similar or related tasks, and then combine these


groups in a hierarchical way that mirrors the structure of the management model. Each group

exposes a rolled-up state indication that depends on the state of its members, according to the

rules contained in the management model.

The set of rules specified in the management model for each component or section of the

application, implemented as a rule group in the monitoring application, then corresponds to a

group of computers, and you can associate and deploy the appropriate set of rules to each

group.

This section contains the following procedures:

• To deploy the Operations Manager agent to remote computers

• To create a computer group

• To associate a rule group with a computer group and deploy the rules

To deploy the Operations Manager agent to remote computers

1. In the left-side tree view of the Administrator Console, expand the list to show the

Computers entry (which is under the Administration entry). If the tree-view pane is

not visible, click Customize on the View menu. In the Customize View dialog box,

select the Console tree check box, and then click OK.

2. Right-click Computers, and then click Install/Uninstall Agents Wizard. Alternatively,

you can click the Install/Uninstall Agents Wizard link in the right pane of the

Administrator Console after selecting the Computers entry in the tree view.

3. Click Next on the introduction page of the wizard. On the next page, select the

Install Agents option (you can also use this wizard to remove installed agents from

specific computers by selecting the Uninstall Agents option).

4. Click Next, and then select one of the following two options for discovering

computers:

◦ Browse for or type in specific computer names. If you select this option, the

next page you see displays an empty list and a Browse button that opens the

Select Computers dialog box. The Select Computers dialog box is the same as

you use to find objects in Active Directory or within a domain (see Figure 18).

You can use the Advanced button in this dialog box to search for computers

based on a range of criteria and conditional matching methods.


Figure 18 Selecting computers from the domain using the Select Computers dialog box

◦ Search criteria. If you select this option, the next page you see displays an

empty list for the discovery rules. Click Add to open the Computer Discovery

Rule dialog box. In this dialog box, you specify the Domain name containing the

computers you want to discover, a condition and a text string for the Computer

name, and select a Computer type. You can use partial string matches, including

wildcards, or a regular expression to match the computer name, and specify that

the computer type is a server, a client, or both servers and clients. You

can also apply the discovery rule to domain controllers if you want. By default,

the wizard will contact each computer in turn to verify that it exists, though you

can disable this feature using the check box at the bottom of the Computer

Discovery Rule dialog box. After creating a discovery rule, click OK to add it to

the list of rules in the wizard, and repeat the process to add more rules as

required.

5. Click Next, and then specify the account the wizard will use to install the agents. The

default is to use the Management Server Action Account created when you installed

Operations Manager. However, if required, you can select Other, and then specify a

user name and a password for the account you want to use.

6. Click Next, and then specify the Agent Action Account. This is the account that the

agent will run under on the remote computers. The default is the Local System

account. However, if required, you can select Other and specify a user name and a

password for the account you want to use.

7. Click Next, and then specify the folder on the remote computers where the wizard

will install the agent. The default is a subfolder of the Program Files directory,

though you can select other environment variables (such as %SYSTEMDRIVE% and

%PROGRAMFILES%) in the drop-down list box, and then add a custom path if

required.

8. Click Next to see a summary of the actions the wizard will perform. By default, the

wizard will display the progress of each action, though you can clear the check box

on this page to prevent this if you prefer. Click Finish to install the agents.

9. After the wizard completes, you can use the links in the right pane of the

Administrator Console or the entries below the Computers entry in the tree view to

see a list of the computers that have the agent installed (see Figure 19).


Figure 19 The Computers page shows the installed agents and a link to install/uninstall the agent

To create a computer group

1. In the left-side tree view of the Administrator Console, expand the list to show the

Computer Groups entry (which is under the Management Packs entry). If the tree-

view pane is not visible, click Customize on the View menu. In the Customize View

dialog box, select the Console tree check box, and then click OK.

2. If this is the first computer group for your application, you must create a top-level

(parent) group (you can add child groups to this group as you create it if required).

To create a top-level computer group, right-click Computer Groups in the tree view,

and then click Create Computer Group to start the wizard.

3. If you have already created a top-level computer group for your application, you can

create nested (child) groups within that top-level group. To create a child computer

group, right-click the top-level group entry, and then click Create Computer Group

to start the wizard.

4. Click Next to open the General page. Enter a name for the computer group, and then

enter a description that will help operators to identify the purpose of this group.

5. Click Next to open the Included Subgroups page. If you want to add existing groups

as children of the new group (subgroups), click the Add button to open the Add

Subgroup dialog box. Select the check box for each existing group you want to add,

and then click OK. The wizard displays the new group and the subgroups you

selected.


6. Click Next to open the Included Computers page. Click Add to open the Add

Computer dialog box. This shows the Windows domain that contains the Operations

Manager management server and all computers discovered in that domain

(computers that have the Operations Manager agent installed). Select the check box

next to the domain name to include all listed computers or select the check boxes

next to individual computers you want to include. To add computers not already

listed, click New, enter the domain name and computer name, and then click OK.

After you select the computers you want to include, click OK to close the Add

Computer dialog box. The wizard displays a list of the computers you selected.

7. Click Next to open the Excluded Computers page. A computer group can include all

the computers in a domain or computers found using a search or formula process (as

you will see later). You can exclude specific computers on the Excluded Computers

page by clicking the Add button to open the Add Computer dialog box or search for

computers to exclude by clicking the Search button to open the Computer dialog

box. The Computer dialog box allows you to select computers using a range of

criteria, such as partial and full string matching on the name, wild-card string

matching, and regular expressions.

8. Click Next to open the Search for Computers page. This page allows you to search for

computers to add to the group based on their function (such as server, client, or

domain controller), or based on the name using similar options as in the Computer

dialog box discussed in the previous step of this procedure. If you do not want to add

more computers to this group, select the Do not search for computers option.

If you want to select computers based on a formula in the next step of this procedure,

you must provide the relevant criteria on the Search for Computers page of the wizard.

9. Click Next to open the Formula page. Here, you can specify a formula that selects

computers based on the criteria you entered in the previous page. You can generate

the formula using a range of attributes for the target computers, such as the IP

address, subnet, operating system, fully qualified domain name, and more. You can

also use a range of operators and string matching functions, and select from lists of

other computer groups. If you do not want to add more computers to this group,

select the Do not use a formula to determine membership for this computer group

option.

You can use a custom registry key located on the remote computer as an attribute

instead of one of the existing attributes created by Operations Manager and the

Management Packs you install. Click Attribute, click New on the Set Attribute page,

and then specify details of the registry key and value as required.

10. Click Next to open the State Rollup Policy page. Here, you specify how the overall

state for a computer group will reflect the states of individual members of the group.

The members can be the subgroups included within this group and/or the individual

computers in the group. The three options on this page (see Figure 20) are the

following:

◦ The worst state of any member computer or subgroup. If you select this option,

Operations Manager will set the State value displayed in the Operator Console

to that specified for the Severity for the worst of the current unresolved alerts

for the members of this group. The alert Severity states range from Success

(best) to Service Unavailable (worst). You can see a list of these states in the


Alert page of the Properties dialog box for any of your existing event rules,

performance rules, or alert rules.

◦ The worst state from the specified percentage of best states in the computer

group. If you select this option, you must specify a percentage that defines the

proportion of the group that will act as the state indicator for the group. Operations

Manager will select a set of members from the group that consists of the

computers with the best health state up to the percentage you specified of the

total group membership. In other words, if there are 10 computers and you

specify 60%, Operations Manager will select the six members of the group that

currently have the least severe state. It then uses the worst (the most severe)

state of the subset it selects as the overall (rolled-up) state for the group, and

displays this in the Operator Console as the State value for this computer group.

◦ The best state of any member or subgroup. If you select this option, Operations

Manager will set the State value displayed in the Operator Console to that

specified for the Severity for the best of the current unresolved alerts for the

members of this group. It is unlikely that you will use this option very often,

because it effectively hides the state of most of the members of the group as

long as one member is performing correctly.

Figure 20 Specifying the State Rollup Policy for a computer group

11. Click Next to open the Confirmation page, which provides a summary of the options

you have set in the wizard. To change any settings, click the Back button to return to

the relevant page.


12. If you are happy with the settings shown, click Next, and then click Finish. The new

computer group appears in the Administrator Console tree view. If you specified any

existing groups as subgroups of the new group, they move to appear under the new

group in the tree view.

To associate a rule group with a computer group and deploy the rules

1. In the left-side tree view of the Administrator Console, expand the list to show the

Rule Groups entry (which is under the Management Packs entry). If the tree-view

pane is not visible, click Customize on the View menu. In the Customize View dialog

box, select the Console tree check box, and then click OK.

2. Expand the list of rule groups, and then right-click the group of rules you want to

deploy to a specific set of computers. On the shortcut menu, click Associate with

Computer Group to open the Properties dialog box for this rule group with the

Computer Groups page selected. Alternatively, you can right-click the rule group,

select Properties, and then select the Computer Groups tab.

3. On the Computer Groups page, click Add to open the Select Item dialog box. Select

the computer group to which you want to deploy the rules in this rule group, and

then click OK. Repeat the process if you want to deploy the rules to more than one

computer group.

4. Back in the Properties dialog box for the rule group, click OK.

5. If you want to immediately force the rules in this rule group through to the

Operations Manager agents on remote computers, instead of waiting for the

scheduled update cycle, right-click Management Packs in the left-side tree view, and

then click Commit Configuration Change.

By default, Operations Manager pushes rule changes to all remote agents every five

minutes. To change this value, right-click Global Settings in the left-side tree view, click

Management Server Properties, click the Rule Change Polling tab, and then select the required value.

Guidelines for Creating an Operations Manager 2005 Computer Group and Deploying the Operations Manager Agent and Rules

When creating an Operations Manager 2005 computer group and deploying the Operations

Manager agent and rules, you should consider the following proven practices:

• Create a top-level computer group that includes all the computers that will execute the

application, and which you want to monitor. If the application has distinct separate

sections, such as separate Web services running on different computers or separate groups of servers that may be in use at different times, create separate child computer groups for each set of computers within a parent (top-level) computer group.

• Use the state rollup options for the top-level computer group to specify the overall

state for all the computers involved in the application, so the console displays the

appropriate state indication to operators. Use the appropriate severity settings for each

rule to represent the three basic states: RED ("failed" or "unavailable"), YELLOW

("degraded"), and GREEN ("working normally" or "available").


• Combine the state of each subgroup using the same approach as for the top-level

group, so operators can drill down, monitor, and see the state of individual components

or sections of the application. This makes diagnosis of problems easier.

View Management Information in Operations Manager 2005

After you import or create a Management Pack for your application, you can use it to monitor

your application. You will also usually take advantage of existing Management Packs, provided

with Microsoft Operations Manager 2005 or downloaded from TechNet at

http://www.microsoft.com/technet/prodtechnol/mom/mom2005/catalog.aspx. These

additional Management Packs allow you to detect faults in the underlying infrastructure, such as

performance degradation or core operating system service failures, and monitor services such

as Microsoft Exchange and SQL Server.

This section includes procedures for using both the Operator Console and the Web Console. The

Operator Console allows you to view the state of an application and drill down to see details of

the events, alerts, performance counters, and computers that run the application. The Web

Console has less functionality, but it can still be of great use to operators, particularly when the

Operator Console is not installed.

To view state information, alerts, events, and computers in Operations Manager 2005 using the Operator Console

1. Open the Operations Manager 2005 Operator Console, and use the Group: drop-

down list at the top of the window to select the computer group for which you want

to view information. Click the State link in the Navigation pane at the lower-right section

of the window to show the overall health state for the application you selected in

the Group: drop-down list (see Figure 21).


Figure 21 The State view of an application in the Operations Manager 2005 Operator Console

If you cannot see all of the panes shown in Figure 21, on the View menu, select the

pane you want to open (Navigation Pane or Detail Pane).

2. Figure 21 indicates that the overall state for this computer group (all the computers

running this application) is Critical Error. The lower section of the window shows the

computers in this computer group (in this case, there is only one), and indicates the

total number of open or unresolved alerts, and the total number of events.

3. Click the Alerts link in the navigation pane to see all the open alerts for the computer

group. The lower window now shows details of the selected alert, including the

properties (field values) of the event or counter threshold that caused the alert (see

Figure 22). The Alert Details section in the lower window also displays the product-

specific and company-specific knowledge for the rule that detected the problem.

This knowledge assists in diagnosing, resolving, and verifying resolution of the

problem that originally caused the alert (see Figure 23).


Figure 22 The list of all alerts for the computer group and details about the selected alert


Figure 23 Viewing the product knowledge for the alert

4. To view only the alerts for a specific computer within the computer group, go back

to the State view and double-click an alert in the State upper window for the

computer you want to view, or double-click the computer in the State Details lower

window. You see the same view as in Figure 23, but it contains a list of only the

alerts for the selected computer.

5. Click the Events link in the navigation pane to see all the events from the Windows

Event Log for computers in the computer group. The list shows the domain and

computer names, and the lower window contains the values of the event fields for

the event selected. You can view a list of alerts raised by this event on the Alerts

tabbed page in the lower window, and the parameters of the event on the

Parameters tabbed page (see Figure 24).


Figure 24 The list of events for all computers within the selected computer group

Right-click the upper window in any view, and then click Personalize View to select the

columns displayed in the list or to change the order of the columns.

6. Click the Performance link in the navigation pane to see a list of all the computers

within the currently selected Group: scope. Select a computer in the list, and then

click the Select Counters button to display a list of all the performance counters for

that computer. This includes the standard operating system counters implemented

by the built-in Management Packs in Operations Manager 2005, such as processor

usage and elapsed time (see Figure 25).


Figure 25 Selecting a performance counter to view

7. Select the check boxes next to the counters you want to view results for, and then

click the Draw Graph button. In Figure 26, you can see the results for the

WSTransport Service counter implemented in an example application.


Figure 26 A chart showing performance counter data samples collected by Operations Manager

8. Click the Computers and Groups link in the navigation pane to see a list of all the

subgroups within the current group (the group selected in the Group: drop-down list

at the top of the window) and the state of each one. Double-click a subgroup to

navigate to that group and view the state and details of the group.

9. Click the Diagram link in the navigation pane to see a schematic diagram of the

current computer group, its subgroups, and the computers within each group. It also

displays the current health state of each group and computer (see Figure 27). This

makes it easy for operators to grasp visually the overall state of the application and

the individual components.


Figure 27 A computer group in Diagram view showing the state of each computer

10. Double-click a computer (not a computer group) in the right window in Diagram

view to switch to Alerts view for that computer.

11. You can use the My Views link in the navigation pane to create custom views of the

monitoring information. You can also define custom Public Views for viewing in the

Operator Console using the Console Scopes section within the Administration

section of the Administrator Console. For more details, see the Operations Manager

Help file.

Operations Manager 2005 also installs a Web-based Operator Console. While this has fewer

features, it can be used for remote monitoring and problem diagnosis from locations outside

your own network.

To use the Web Console for remote monitoring and problem diagnosis

1. To open the Web Console from the Administrator Console, select the Microsoft

Operations Manager entry directly below the console root in the left-side tree view.

On the right-side Home page, click the Start Web Console link in the Operations

section of the page.

2. To discover the URL of the Web Console, expand the Administration section of the

left-side tree view in the Administrator Console, and then select the Global Settings


entry. Double-click Web Addresses in the right window to see the Web Console

Address. This is, by default, a non-standard port on the local computer, such as

http://machine-name:1272. Enter this URL into your Web browser.

The Web Console provides three views of the monitoring information: Alerts, Computers, and

Events. These are very similar to the views you see in the Operator Console. For example, Figure

28 shows the Alerts view in the Web Console. You can select an alert and view the properties,

events, knowledge, and history just as you can in the Operator Console.

Figure 28 The Alerts view in the Operations Manager 2005 Web Console showing the product knowledge

Guidelines for Viewing Management Information in Operations Manager 2005

When viewing management information in Operations Manager 2005, consider the following

proven practices:

• If you connect directly to the management domain, use the Operator Console to

monitor applications and computers. If you connect from a remote location over the

Internet or an intranet, use the Web Console to monitor applications and computers.


• Use the Group: drop-down list to limit your view to the appropriate computer group

and its subgroups, unless you want to see alerts raised by all the managed computers

for all events.

• Use the State view and the Diagram view to provide an overall picture of the health

state of the application. In Diagram view, you can also see the state of the subgroups

and individual computers.

• Use the Alerts view to obtain a list of alerts, generally sorted by descending severity,

which is useful in prioritizing diagnosis and resolution requirements, and the

corresponding actions.

• Use the Events view to see the details of source events, and use the Performance view

to see the values and history of performance counter samples. Both are useful in

diagnosing problems and verifying resolution.

• Use the Administrator Console to create custom views if you want to restrict the

features of the Operator Console available to specific groups of users, or to all users.

Create Management Reports in Operations Manager 2005

Regular and continuous monitoring makes it easier to detect application failures, problems, or

unsatisfactory performance, but the actions taken by administrators and operations staff are

usually short-term in nature. They tend to concentrate on the present, and they may not take

into account historical events and performance over longer periods that indicate fundamental

issues or causes.

However, business owners and hosting services must often conform to specified service level

agreements (SLAs) on performance and availability. The data required to support these

agreements only appears over longer periods and requires access to historical information.

Data captured in summary reports can also be vital to operations staff in detecting missed

computers, or incorrectly configured application or computer groups, particularly in large and

complex installations. These reports may be the only way that operations staff can match

monitoring infrastructure to the physical hardware.

Microsoft Operations Manager 2005 includes a report generator that uses SQL Server Reporting

Services to publish performance and error details it collects while monitoring applications. This

can provide useful summaries of application performance, and the history of issues encountered

with an application. You can use the reports to view the overall performance over time and

detect specific problem areas within your application.

To view monitoring and management reports in Operations Manager 2005

1. Start the Reporting Console from the Microsoft Operations Manager 2005 section of

your Start menu. Alternatively, unless you specified a different location when

installing SQL Server Reporting Services, you can open the SQL Server Reporting

Console in a Web browser by entering the address http://localhost/Reports.

The Reporting Console and Web Console are optional features that you must choose to install when you install Operations Manager 2005. If you encounter problems when opening the reports, check that the SQL Server Reporting Services service is running.


2. When prompted, enter the user name and password of an account that has

permission to access the Operations Manager reporting data in SQL Server Reporting

Services. Usually this is an administrator-level account for the monitoring domain.

3. If you opened the Reporting Console from your Start menu, you will see options to

view reports for Microsoft Operations Manager, Operational Data Reporting, and

Operational Health Analysis. If you opened SQL Server Reporting Services in your

browser, you must click the Microsoft Operations Manager Reporting link in the

SQL Report Manager Home page to get to this menu.

4. Click the Microsoft Operations Manager link to see a menu of the operational

reports. These include details of the Operations Manager agents installed on the

management server and the remote computers and a summary of the health and

performance of the management group.

5. Select a report and enter the criteria in the controls at the top of the report page,

then click the View Report button. For example, Figure 29 shows the Management

Group Agents report, with the Management Group: drop-down set to the name of

the required management group.

Figure 29 Viewing the Management Group Agents report for an Operations Manager management group

The other two links on the Microsoft Operations Manager Reporting page open submenus

containing a range of other pre-defined reports. The Operational Data Reporting page contains


links to view all alerts and events, as well as the general health and a report listing any script or

response errors.

The Operational Health Analysis page contains a number of more detailed reports that drill

down into the operational history of the management group. These include analysis of alerts,

events, and performance by type, severity, time, frequency, and computer group. You can also

view reports on the association between rule groups, computer groups, and individual

computers.

Guidelines for Creating Management Reports in Operations Manager 2005

When creating management reports in Operations Manager 2005, consider the following

proven practices:

• Use the Operations Manager Reporting Console to view the historical performance of

an application to ensure it performs within the service level agreements or the

parameters defined by business rules.

• Use the reports to discover inconsistencies in performance, check overall reliability, and

detect problematic situations such as unreliable networks—and the times when these

issues most commonly arise.

• Use the reports to confirm management coverage of the computers running the

application and deployment of the appropriate sets of rules to each group.

Summary

Management Packs can be a very useful tool for the operations team in managing applications.

This chapter demonstrated how to create and import Management Packs in Operations

Manager 2005 and then showed how to edit the Management Packs to provide the functionality

required when monitoring an application.


Chapter 17

Creating and Using System Center

Operations Manager 2007

Management Packs

Chapter 16 of this guide described creating and authoring Management Packs in Microsoft

Operations Manager 2005. This chapter describes how to perform the same tasks using System

Center Operations Manager 2007. This chapter discusses the same scenarios for creating and

using Management Packs. It describes in detail the following:

• Converting and importing a management model from Operations Manager 2005

• Creating a management pack in the Operations Manager 2007 Operations Console

• Editing an Operations Manager 2007 Management Pack

• Viewing management information in Operations Manager 2007

• Creating management reports in Operations Manager 2007

The Transport Order application is used as a running example throughout this chapter. This

application forms part of the shipping solution in the Northern Electronics worked example

used throughout this guide.

Convert and Import a Microsoft Operations Manager 2005 Management Pack into Operations Manager 2007

The format of Management Packs differs between Microsoft Operations Manager 2005 and System Center Operations Manager 2007. Therefore, you cannot import Microsoft Operations

Manager 2005 Management Packs directly into Operations Manager 2007. Instead, you must

convert these to the appropriate format, or recreate them using the Operations Manager 2007

tools.

To convert and import a Microsoft Operations Manager 2005 Management Pack for Operations Manager 2007

1. Obtain the Microsoft Operations Manager 2005 Management Pack (.akm file) that

contains the rules, alerts, notifications, and computer groups you want to implement

in Operations Manager 2007 by doing the following:

◦ Export a management model from your management model designer (such as

the Microsoft Management Model Designer) as a Microsoft Operations Manager

2005 Management Pack.


◦ Export an existing Management Pack from Microsoft Operations Manager 2005

using the Management Pack Export Wizard.

◦ Acquire the appropriate Management Pack from a third-party provider.

2. Copy the .akm file into the folder where you installed Operations Manager 2007. By

default, this is %ProgramFiles%\System Center Operations Manager 2007\.

3. Open a Command window from your Start menu, navigate to the Operations

Manager 2007 folder where you placed the .akm file, and use the MP2XML tool to

convert the Microsoft Operations Manager 2005 Management Pack to an

Operations Manager 2007-compatible XML file. The syntax is the following:

mp2xml [folder_name\]source_file.akm [destination_folder\]destination_file.xml

4. Use the MPConvert tool to convert the XML file into an Operations Manager 2007

Management Pack file. The syntax is the following:

mpconvert [folder_name\]source_file.xml [destination_folder\]new_filename.xml
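For example, assuming the Microsoft Operations Manager 2005 Management Pack for the Transport Order application was exported as TransportOrder.akm (the file names here are illustrative), the two-step conversion might look like the following:

mp2xml TransportOrder.akm TransportOrder.xml

mpconvert TransportOrder.xml TransportOrder2007.xml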

5. On the taskbar, click Start, point to System Center Operations Manager 2007, and

then click Operations Console. In the navigation pane, click the Administration

button. If the navigation pane is not visible, click Navigation Pane on the View

menu.

6. In the left-side tree view, right-click Administration (at the top of the tree), and then

click Import Management Pack(s).

7. In the Select Management Pack(s) to import dialog box, select the .mp or .xml file

for the Management Pack you want to import. You can hold down the SHIFT or CTRL

keys while clicking to select more than one file.

Files with the .mp file name extension are Sealed Management Packs that you cannot

edit. Files with the .xml file name extension are Unsealed Management Packs that you

can edit.

8. Operations Manager imports the Management Packs you selected and installs them.

A dialog box reports the results, indicating any that it cannot import. When you close

the dialog box, Operations Manager 2007 begins monitoring; it collects the same

data as in Microsoft Operations Manager 2005.

Guidelines for Converting and Importing a Microsoft Operations Manager 2005 Management Pack into Operations Manager 2007

When converting and importing a Microsoft Operations Manager 2005 Management Pack into

Operations Manager 2007, you should consider the following guidelines:

• Ensure that you maintain all current information in your management model, so you

can export it from your management model editor and it is ready to use in Operations

Manager 2007.

• Apply any changes you make to the application and the monitoring environment

following feedback or run-time experience, especially where this affects the

instrumentation of the application.


• If you are already using Microsoft Operations Manager 2005, back up your

Management Packs by exporting them using the Export Wizard in Microsoft Operations

Manager 2005, whenever you make changes to them or add customizations.

• Use the conversion tools provided with Operations Manager 2007 to convert your

Microsoft Operations Manager 2005-format Management Pack (.akm) files to the

correct Operations Manager Management Pack format.

Creating a Management Pack in the Operations Manager 2007 Operations Console

If you have not imported the Management Pack from another source, you will need to use the

Operations Console in System Center Operations Manager 2007 to create a Management Pack

for your application. Although the modeling concepts in Operations Manager 2007 more closely

match the management model concepts contained in this guide, there are still complications in

mapping the two. By using rules and monitors, you can update state variables that

correspond to the health state of an application. You can also define tasks and probes that

execute scripts or commands at specified intervals.

Rules that detect an event can generate an alert to display in the console, and they can send a

message to operators via e-mail or pager. You can create notification groups and add operators

to these groups to make it easier to manage notification. Rules that collect data and store it in

the Operations Manager 2007 database do not create alerts directly. However, you can

associate unit monitors with one or more event or performance rules so that the monitor raises

an alert when Operations Manager collects an event or a performance counter value that

matches the criteria for that monitor.

Other types of monitors you can create include probe monitors that check the status of a

process, such as a Web application or a database, at pre-defined intervals and raise an alert if

the target component fails; and Rollup Monitors that determine the overall health state

exposed by a set of rules and monitors.

By assigning individual rules and monitors to groups, you can associate these rules and monitors

with a specific section or component of the application. This makes it easier to update the

monitoring configuration as the physical layout of the monitored application and its

components change over time. You can also assign knowledge from the management model

that is common to a set of rules to the group, reducing duplication of effort and making

knowledge updates easier.

In Operations Manager 2007, a monitoring group is referred to as an instance group, and can

contain nested subgroups. You can also create a distributed application that consists of a series

of related managed objects, so that it contains all the services and components of your

application. A distributed application consists of a series of nested instance groups, and you can

use templates provided with Operations Manager 2007 to create distributed applications of

various types.


After creating the groups, you can create the rules and monitors for the application—associating

these rules and monitors with the appropriate groups as you create them. You can then handle

the monitored application as one entity. This makes it much easier to manage the creation,

monitoring, and editing of the rules and groups.

This section contains the following procedures:

• To create a new Management Pack in the Operations Manager 2007 Operations

Console

• To create a new distributed application in the Operations Manager 2007 Operations

Console

• To create a new monitoring group in the Operations Manager 2007 Operations Console

• To create a new rule for a group in the Operations Manager 2007 Operations Console

• To create a probe monitor in the Operations Manager 2007 Operations Console

• To create a unit monitor in the Operations Manager 2007 Operations Console

• To create a health rollup monitor in the Operations Manager 2007 Operations Console

To create a new Management Pack in the Operations Manager 2007 Operations Console

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and

then click Operations Console. In the navigation pane, click the Administration

button. If the navigation pane is not visible, click Navigation Pane on the View

menu.

2. Expand the tree view in the left pane of the main window to show the Management

Packs node if it is not already visible. Right-click the Management Packs node, and

then click Create Management Pack.

3. Enter the name for the Management Pack and the version number. For a new

Management Pack, use the version number 1.0.0.0. Also, enter a description that

will help administrators and operators to identify the Management Pack.

4. Click Next to display the Knowledge Article page. Click the Edit button, and enter the

knowledge that will help administrators and operators to diagnose, resolve, and

verify resolution of errors.

You must have Microsoft Office and the Visual Studio Tools for Office runtime on the

computer where you want to create and edit the knowledge for Management Packs.

5. Click Create, and the new Management Pack appears in the list of all Management

Packs in the main window of the Operations Console.

To create a new distributed application in the Operations Manager 2007 Operations Console

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and

then click Operations Console. In the navigation pane, click the Authoring button. If

the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane of the main window to show the Distributed

Applications node if it is not already visible. Right-click the Distributed Applications

node, and then click Create a new distributed application.

3. In the Distributed Application Designer dialog box that opens, enter the name for

the new distributed application and a description that will help administrators and

operators to identify the application.


4. Select a template that will help you to define the application from the Template list,

such as Line of Business Web Application or a Messaging Application. As you select

each template, the dialog box displays a description of the target application type.

To see more details of the selected type, click the View Details link next to the

Template list. If you want to create the distributed application hierarchy yourself,

select Blank (Advanced) in the Template list.

5. Specify the Management Pack to which you want to add the new distributed

application. Select the Management Pack you created in the previous procedure. If

you have not already created a Management Pack, click the New button and follow

the instructions in the earlier procedure, "To create a new Management Pack in the

Operations Manager 2007 Operations Console."

6. Click OK in the Distributed Application Designer dialog box. You now see the

Distributed Application Designer window, where you can design the distributed

application model.

7. If you selected one of the existing templates, you will see the objects and

relationships from that template in the Designer window. For example, Figure 1

shows the result of selecting the Line of Business Web Application template.

Figure 1 The Distributed Application Designer

8. If you selected the Blank (Advanced) option in the Distributed Application Designer

dialog box, you will see an empty designer surface. To add items to the designer,


click the Add Component button in the toolbar at the top of the window to open the

Create Component Group dialog box, where you specify the type of component you

want to add. Enter a name for the new component, and select the Objects of the

following type(s) option. Then select the component type in the tree view at the

bottom of the Create Component Group dialog box.

The list contains a wide selection of possible component types. For a Web-based

application or Web service, expand the Application Component node of the tree view

to see components such as Database and Web Site. For a Windows-based application,

expand the Local Application node of the tree view and then expand the Windows

Local Application node to see the various user and local application types.

These include Health Service components such as a Management Server, Notification

Server, Windows Cluster Service, Windows Local Service, and Windows User

Application.

9. To create a relationship between the items you add to the designer, click the Create

Relationship button in the toolbar at the top of the window, click the source item in

the relationship, and then click the target item. This creates a relationship such that

the source item "uses" (depends on) the target item and the arrow points towards

the target item. Click the Create Relationship button again to switch out of the

Create Relationship mode and return to the normal "arrow" mouse pointer.

You use component groups and relationships to separate the sets of rules for each

component into logical groups that correspond to the separation between the

components of the application. You can apply rollup rules to the overall health state of

the component groups to generate the appropriate health state indication at higher

levels of the application structure.

10. Click the Save button in the toolbar at the top of the window and close the

Distributed Application Designer window.

To create a new monitoring group in the Operations Manager 2007 Operations Console

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and

then click Operations Console. In the navigation pane, click the Authoring button. If

the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane of the main window to show the Groups node

if it is not already visible. Right-click the Groups node, and then click Create a new

Group.

3. In the Create Group Wizard dialog box, enter a name for the group and type in a

description that will help administrators and operators to identify the group.

4. Select the Management Pack to which you want to add the new group in the drop-

down list at the bottom of the dialog box. If you have not already created a

Management Pack, click the New button and follow the instructions in the earlier

procedure, "To create a new Management Pack in the Operations Manager 2007

Operations Console."

5. Click Next in the Create Group Wizard dialog box to show the Choose Members from

a List page. On this page, you can explicitly choose the members for the new group.

To add a member, click the Add/Remove Objects button to open the Object

Selection dialog box. Select the type of entity you want to add in the Search for

drop-down list or leave the list set to Entity to search for all suitable objects. Enter


all or part of the name of the items you want to find in the Filter by part of name

text box or leave it blank to search for all items of the selected type.

6. Click the Search button to display the items that match your selection in the

Available items list. The list shows the available entities (computers, databases,

sites, and applications) based on a range of features, such as the name, operating

system, or status within the Operations Manager 2007 environment (such as

Notification Server). Select the individual items you want to add, and then click the

Add button. You can hold down the SHIFT and CTRL keys while clicking to select

more than one item. To remove an item selected in the Selected objects list, click

the Remove button.

7. Click OK in the Object Selection dialog box to return to the Create Group Wizard

dialog box, and then click Next to show the Create a Membership Formula page. On

this page, you can create rules and use formulae to automatically select computers

to add to the new group. Click the Create/Edit rules button to open the Query

Builder dialog box, and then select the type of items you want to add to the group in

the drop-down list at the top of the window.

8. Click the Add button to create a row in the grid where you can specify an expression

for selecting items. In the first column of the conditional expression row, select a

property for the item you added, such as the Display Name, and then select the

criteria for matching the property in the second column of the grid. You can use a

range of criteria, such as partial and full string matching on the name, wild-card

string matching, and regular expressions. Enter the criteria value for this row in the

third column of the grid. Then repeat the process to add more conditional

expressions to the grid as required.

Clicking the Insert button adds a conditional expression row to the grid. However, if

you click the small "down arrow" next to the Insert button, you can create a series of

AND and OR groups containing conditional expressions. Select an expression row and

click the Formula button to view the conditional expression for that row, or click the

Delete button to remove any row from the grid.
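For example, to select every computer whose name begins with NE-, you might add a row that matches the Display Name property against the wildcard pattern NE-*. The property name and pattern here are illustrative; use the properties and naming conventions that apply in your own environment.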

9. After you create any rules you require for selecting objects, click Next in the Create

Group Wizard dialog box to show the Choose Optional Subgroups page. On this

page, you can select other groups you have already created to build a hierarchy of

groups that allows you to use rollup rules to expose the health state of the group

members as a whole. Click the Add/Remove Subgroups button to open the Group

Selection dialog box, and enter any part of the name of the group(s) you want to add

in the text box at the top of the window. If you want to see a list of all groups, leave

the text box empty.

10. Click the Search button, and the Available items list shows all available groups.

Select the groups you want to add as children of the new group, and then click the

Add button. You can hold down the SHIFT and CTRL keys while clicking to select

more than one item. To remove an item selected in the Selected objects list, click

the Remove button.

11. Click OK in the Group Selection dialog box to return to the Create Group Wizard

dialog box, and then click Next to show the Specify Exclude List page. Here, you can

specify any objects that the rules you set up earlier in the Create Group Wizard would otherwise include but that you do not want in the group.


12. Click the Exclude Objects button to open the Object Exclusion dialog box, and select

the type of entity you want to exclude in the Search for drop-down list or leave the

list set to Entity to search for all suitable objects. Enter all or part of the name of the

items you want to find in the Filter by part of name text box or leave it blank to

search for all items of the selected type.

13. Click the Search button to display the items that match your selection in the

Available items list, select the individual items you want to exclude, and then click

the Add button to add them to the Selected objects list. You can hold down the

SHIFT and CTRL keys while clicking to select more than one item. To remove an item

selected in the Selected objects list, click the Remove button.

14. Click OK in the Object Exclusion dialog box to return to the Create Group Wizard

dialog box, and then click Create. The new group appears in the Groups list in the

Operations Console.

To create a new rule for a group in the Operations Manager 2007 Operations Console

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and

then click Operations Console. In the navigation pane, click the Authoring button. If

the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane of the main window to show the Rules node

(which is under the Management Pack Objects node) if it is not already visible, and

then click the Rules node to select it. The main window shows a list of all the rules

installed in Operations Manager 2007, grouped by type.

3. Click the Change Scope hyperlink in the small notification area above the list to open

the Scope MP Objects by target(s) dialog box. You can use this feature to limit the

list of items to those within a particular scope (such as a Management Pack, group,

or distributed application); which makes it easier to find and work with the rules and

other objects you create.

4. Type part of the name of the group or Management Pack you want to scope in the

Look for text box at the top of the Scope MP Objects by target(s) dialog box. The list

changes to reflect matching items, and a tick appears in the check boxes of these

matching items. To see all the items, select the View all targets option button, select

the check boxes of any other targets you want to include, and then click OK.

Alternatively, use the Look for text box and the Find Now button below the

notification area to select specific rules that match a search string.

5. Right-click the Rules node in the left-side tree view, and then click New rule, or click

the New rule link on the toolbar or in the Actions window at the right of the main

window to start the Create Rule Wizard. If you cannot see the Actions window, click

Actions on the View menu.

6. The Select a Rule Type page of the Create Rule Wizard allows you to select the type

of rule you want to create. You can create an alert generating rule based on an

event; a collection rule based on an event, a performance counter, or a probe; or a

timed command that executes a command or a script. Figure 2 shows the rule type

selection page of the wizard.


Figure 2 The different types of rule available in the Create Rule Wizard

• For an alert generating rule, you can select the following:

◦ Generic CSV Text Log (Alert). This rule type matches against the entries stored

in a Comma-Separated-Values log file and generates an alert when a value that

you specify using a pattern matches an entry in the log file.

◦ Generic Text Log (Alert). This rule type matches against the entries stored in a

generic text log file and generates an alert when a value that you specify using a

pattern matches an entry in the log file.

◦ NT Event Log (Alert). This rule type matches against the properties of events in

Windows Event Log of the monitored computers and generates an alert when a

matching event occurs. You can match on any of the fields of an event log entry,

such as the name, computer name, event number, category, and description.

◦ SNMP Trap (Alert). This rule type listens for events generated by specific classes

and traps of an SNMP provider on the monitored computers and generates an

alert when a matching event occurs.

◦ Syslog. This rule type matches against syslog entries forwarded to the

monitored computers, and generates an alert when a matching event occurs.

You can match on any of the values in the incoming syslog entry.


◦ WMI Event (Alert). This rule type uses a Windows Management Instrumentation

(WMI) query within a namespace you specify, which runs at intervals you define,

to query WMI objects and generate an alert when a query match occurs.
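For example, an alert-generating WMI Event rule in the root\cimv2 namespace might use a WMI event query similar to the following minimal, illustrative sketch, which matches whenever a new process named TransportOrder.exe starts on a monitored computer (the process name is hypothetical):

SELECT * FROM __InstanceCreationEvent WITHIN 60 WHERE TargetInstance ISA 'Win32_Process' AND TargetInstance.Name = 'TransportOrder.exe'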

• For a collection rule, you can select from three categories of rule type, located in the

three folders named Event Based, Performance Based, and Probe Based. The collection

rule types are the following:

◦ Generic CSV Text Log (event-based rule). This rule type collects and logs to the

Operations Manager database entries stored in a comma-separated values log

file, using pattern matching to locate entries in the log file.

◦ Generic Text Log (event-based rule). This rule type collects and logs to the

Operations Manager database entries stored in a generic text log file, using

pattern matching to locate entries in the log file.

◦ NT Event Log (event-based rule). This rule type collects and logs to the

Operations Manager database events occurring in the Windows Event Log of the

monitored computers.

◦ SNMP Event (event-based rule). This rule type collects and logs to the

Operations Manager database events from a specified SNMP provider on the

monitored computers.

◦ SNMP Trap (Event) (event-based rule). This rule type collects and logs to the

Operations Manager database event traps from a specified SNMP provider on

the monitored computers.

◦ Syslog (event-based rule). This rule type collects and logs to the Operations

Manager database syslog entries forwarded to the monitored computers.

◦ WMI Event (event-based rule). This rule type uses a WMI query within a

namespace you specify, which runs at intervals you define, to collect and log

results to the Operations Manager database.

◦ SNMP Performance (performance-based rule). This rule type collects and logs to

the Operations Manager database performance counters exposed by a specified

SNMP provider on the monitored computers.

◦ WMI Performance (performance-based rule). This rule type collects and logs to

the Operations Manager database performance counters exposed through WMI

on the monitored computers.

◦ Windows Performance (performance-based rule). This rule type collects and

logs to the Operations Manager database values from Windows performance

counters defined on the monitored computers.

◦ Script (Event) (probe-based rule). This rule type runs a specified script at scheduled intervals on the monitored computers and collects and logs to the Operations Manager database the event details that the script returns.

◦ Script (Performance) (probe-based rule). This rule type runs a specified script at scheduled intervals on the monitored computers and collects and logs to the Operations Manager database the performance counter values that the script returns.

• For a timed command, the rule types are the following:

◦ Execute a Command. This rule type runs a specified command using the

Operations Manager Windows command shell at the intervals you specify.

◦ Execute a Script. This rule type runs a specified script, either VBScript or JScript,

at the intervals you specify.


7. While you are still on the Select a Rule Type page, select the Management Pack to

which you want to add the new rule in the drop-down list at the bottom of the

dialog box. If you have not already created a Management Pack, click the New

button and follow the instructions in the earlier procedure, "To create a new

Management Pack in the Operations Manager 2007 Operations Console."

8. Click Next, and enter a name for the new rule and a description that will help

administrators and operators to identify the rule. Then click the Select button to

open the Select a Target Type dialog box. This dialog box shows a list of all the types

of object to which you can apply the new rule. Type part of the name of the entity

(group, computer, or Management Pack) you want to apply the rule to in the Look

for text box at the top of the dialog box. The list changes to reflect matching items

and the check boxes of these matching items become selected. To see all the

available entities, select the View all targets option button, select the check boxes of

any other targets you want to include, and then click OK.

9. Make sure that the Rule is enabled check box is selected (unless you want to create

the new rule but not enable it yet), and then click Next. The page you see next

depends on the type of rule you are creating:

◦ For a rule that uses a Generic Text (or CSV) Log as its source, you see the

Application Log Data Source page where you specify the source log file path and

name, and the pattern you want to use to match values in the log file. You can

also specify if the log file is UTF8 format instead of the more usual UTF16

format. Then click Next to show the Build Event Expression page, where you

specify how the rule will map to values your pattern selects from the log file.

You can use a range of criteria, such as partial and full string matching on the

name, wild-card string matching, and regular expressions. Enter the criteria

value for this row in the third column of the grid. Click the Insert button to add

more conditional expressions to the grid as required.

Clicking the Insert button adds a conditional expression row to the grid.

However, if you click the small "down arrow" next to the Insert button, you

can create a series of AND and OR groups containing conditional expressions.

Select an expression row and click the Formula button to view the conditional

expression for that row, or click the Delete button to remove any row from

the grid.

◦ For an NT Event Log or an NT Event Log (Alert) rule, you see the Event Log Name

page where you specify the source event log (such as Application, System, or

Security). Click the ellipsis button (...) to open a dialog box where you can select

a computer, and then select from the list of all available Windows Event Logs on

that computer. Click OK to return to the Event Log Name page, and then click

Next to show the Build Event Expression page where you specify how the rule

will map to events in the Windows Event Log. You can match on the standard

event properties, or use a numbered parameter, and specify a conditional

expression to match to that property value. You can use a range of criteria, such

as partial and full string matching on the name, wild-card string matching, and

regular expressions (see Figure 3). Enter the criteria value for this row in the

third column of the grid. Click the Insert button to add more conditional

expressions to the grid as required.


Figure 3 Specifying the mapping between a Windows Event and an event rule
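For example, a rule for the Transport Order application might match Windows Event Log entries where the Event Source equals the source name the application uses when it writes to the event log and the Event ID equals a specific error number, such as 1001. The source name and event number here are illustrative; use the values defined by your application's instrumentation.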

◦ For a rule that uses SNMP as its data source, you see an SNMP object identifier

configuration page. Here, you must specify the community string that identifies the SNMP provider. If you are creating a collection rule, you can

also change the collection frequency using the drop-down list on this page. Then

specify the object identifier properties for each property you want to access.

Alternatively, if you are creating an alert generating rule, you can select the All

Traps check box.

◦ For a rule that uses a forwarded syslog entry as its data source, you see a Build

Event Expression page similar to that for the NT Event Log rule types. You can

use a range of criteria to match the value your pattern selects from the log

entry, such as partial and full string matching on the name, wild-card string

matching, and regular expressions. Enter the criteria value for this row in the

third column of the grid. Click the Insert button to add more conditional

expressions to the grid as required. Click the small "down arrow" next to the

Insert button to create AND and OR groups containing conditional expressions.

◦ For a rule that uses WMI as its data source, you see the Configure WMI Settings

page. Here, you specify the WMI namespace and the query. You can change the

polling interval using the drop-down list in this page.

◦ For a Windows performance rule, you see the Performance Object, Counter, and

Instance page. Click the Browse button to display the Select Performance


Counter dialog box and select the source computer, the performance counter

object (either a built-in object such as .NET CLR Data or your application

performance counter object), and the actual counters contained in this counter

object. Click the Explain button to see the explanatory text for the selected

counter. Then click OK to automatically populate the Object, Counter, and

Instance text boxes. Alternatively, you can use pattern matching strings for

these values to select multiple counters. You can also select the check box below

the text boxes to specify that the rule should include all instances of the

specified counter. Finally, change the collection Interval settings as required,

and click Next to show the Optimized Performance Collection Settings page.

Here, you must specify a tolerance for changes in the sample values collected

from the data source. Low tolerance (low optimization) means that small

changes in the values will cause Operations Manager to create a database entry,

while high tolerance (high optimization) provides information on changes in

performance that is more granular but stores less data. You can also specify an

absolute tolerance value or a percentage (see Figure 4).

Figure 4 Specifying the Optimized Performance Collection Settings

◦ For a Script (Event) or a Script (Performance) rule, you see the Schedule page, where you specify how often the script should execute. The default is every 15 minutes, and you can enter a specific synchronization time from which the intervals are measured. Then click Next to open the Script page, where you enter the name of the script to execute, specify the script timeout, and select the language (VBScript or JScript). Then edit the script in the window or click the Edit in full screen button, and then type (or copy and paste) the script you require (a minimal script sketch appears after this list). If your script requires parameters, click the Parameters button and enter the parameter names. You can click the Target button next to the Parameters list to insert property placeholders such as the display name or ID of the computer or management group. Click OK and, back on the Script page, click Next. If you are creating a Script (Event) rule, you see the Event Mapper page. Use the ellipsis buttons (...) next to each text box to specify the Computer, Event source, Event log, Event ID, and Category of the event that will cause the script to execute, and select the Level (such as Information, Warning, or Error) from the drop-down list. If you are creating a Script (Performance) rule, you see the Performance Mapper page. Use the ellipsis buttons (...) next to each text box to specify the Object, Counter, Instance, and Value of the counter that will cause the script to execute.

◦ For the Execute a Command rule, you see the Specify your Schedule Settings page, where you can specify execution of the command or script on a simple recurring interval basis or create a weekly schedule to execute the command or script. After creating a suitable schedule, click Next to show the Configure Command Line Execution Settings page. Here, you specify the full path and name of the program to execute and any parameters you want to pass to that program. You can click the arrow button next to the Parameters text box to insert value placeholders, such as the display name or ID of the computer or management group. In the Additional settings section of this page, you can specify the working directory for the program, whether to capture the program output, and the timeout for program execution.

◦ For the Execute a Script rule, you see the Specify your Schedule Settings page, where you can execute the command or the script at a simple recurring interval, or you can create a weekly schedule to execute the command or script. After creating a suitable schedule, click Next to show the Script page. This is the same page that appears for the Script (Event) or Script (Performance) rules discussed earlier, and it allows you to create the script to execute for this rule.
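As an example of the query for a WMI rule, the following WQL statement returns any instance of a service that is not running. This is a minimal sketch only; the namespace is the standard WMI namespace for the Win32 classes, but the service name (NorthernShipping) is a hypothetical value chosen for illustration:

    Namespace: root\cimv2
    Query:     SELECT Name, State FROM Win32_Service
               WHERE Name = 'NorthernShipping' AND State <> 'Running'

A collection rule configured with a query like this returns a result only while the service is stopped, so the rows it collects can drive an event or an alert.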
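The script for a Script (Event) or Script (Performance) rule is ordinary VBScript or JScript. The sketch below, in VBScript, assumes the MOM.ScriptAPI scripting object that the Operations Manager agent exposes to rule scripts; the script name, event ID, severity value, and message text are illustrative values only, not product defaults:

    ' Minimal sketch of a script for a Script (Event) rule.
    ' The event ID (101), severity (2), and message are example values.
    Option Explicit
    Dim oAPI
    Set oAPI = CreateObject("MOM.ScriptAPI")
    ' LogScriptEvent(script name, event ID, severity, description);
    ' verify the severity-to-level mapping against the product documentation.
    oAPI.LogScriptEvent "CheckShippingQueue.vbs", 101, 2, _
        "The shipping queue length exceeded the configured threshold."

If you use an approach like this, keep the event details the script logs consistent with the values you specify on the Event Mapper page.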

10. If you are creating an alert-generating rule, you now see the Configure Alerts page. On this page, you must specify the Name, Description, Priority, and Severity of the alert that the rule will generate. Select Low, Medium, or High in the Priority drop-down list, and Warning, Information, or Critical in the Severity drop-down list. If you want to suppress repeated occurrences of this alert, click the Alert suppression button to open the Alert Suppression dialog box, and select the check boxes next to the fields of the source event that must have identical values for the alert to be considered as a duplicate and suppressed.

You can use custom fields to pass values from event rules to alerts and monitors. Click the Custom alert fields button and enter the values for any of these fields you want to use, or click the ellipsis button (...) next to a field text box and select a value from the target entity or the source alert in the lists available in the Alert Description dialog box that appears.

11. Click Create on the final page of the Create Rule Wizard, and the new rule appears in the list in the main window of the Operations Console.


To create a probe monitor in the Operations Manager 2007 Operations Console

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Authoring button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane of the main window to show the Management Pack Templates node if it is not already visible, and then expand this node to show a list of available templates. Right-click the Management Pack Templates node or one of the template nodes, and then click Add Monitoring Wizard.

3. On the Select Monitoring Type page of the Add Monitoring Wizard, select the type of probe monitor you want to create from the list. The four template types are the following:

◦ OLE DB Data Source. This probe monitor tests the connectivity to any OLE DB-compliant database at the specified intervals.

◦ TCP Port. This probe monitor sends a "ping" to the specified port on a specified computer at the specified intervals.

◦ Web Application. This probe monitor sends one or more HTTP requests to a specified Web site at the specified intervals.

◦ Windows Service. This probe monitor sends commands to a specified Windows service at the specified intervals.

4. Click Next to show the General Properties page, and enter a name for the probe monitor. Enter a description that will help administrators and operators to identify the monitor. Then select the Management Pack to which you want to add the new monitor in the drop-down list at the bottom of the dialog box. If you have not already created a Management Pack, click the New button and follow the instructions in the earlier procedure, "To create a new Management Pack in the Operations Manager 2007 Operations Console."

5. The page you see next depends on the type of monitor you are creating:

◦ OLE DB Data Source. For this type of probe monitor, you see a page where you specify the connection details for the database. You can specify a Simple Configuration using the Provider name, the IP address or device name, and the name of the Database. Alternatively, you can select Advanced Configuration and provide the full connection string (a hypothetical example appears after this list). Click the Test button to check the connection.

◦ TCP Port. For this type of probe monitor, you see a page where you specify the IP address or device name and the Port number of the target computer you want to probe. Click the Test button to check the availability of the specified port.

◦ Web Application. For this type of probe monitor, you see a page where you specify the URL of the Web application or Web page you want to probe. Click the Test button to check the availability of the specified URL.

◦ Windows Service. For this type of probe monitor, you see a page where you specify the service name. Click the ellipsis button (...) to open the Select Windows Service dialog box, where you can select a computer and see a list of the available services on that computer. Then go directly to step 9.
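For the Advanced Configuration option of the OLE DB Data Source probe monitor, the connection string uses the standard OLE DB format. The following is a hypothetical example for a SQL Server database; the provider keyword is the standard SQL Server OLE DB provider, while the server and database names are placeholders:

    Provider=SQLOLEDB;Data Source=SHIPDB01;Initial Catalog=Shipping;Integrated Security=SSPI

Integrated Security=SSPI uses Windows authentication, which avoids embedding credentials in the monitor configuration.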

6. Click Next and, for all types except the Windows Service monitor, you see the Choose Watcher Nodes page. This displays a list of all computers running the Operations Manager remote agent. Select the check box next to the computer(s) that you want to execute this probe monitor. You can execute it from the Operations Manager management server or any of the remote agent-managed computers in the management group.

7. Use the controls at the bottom of the Choose Watcher Nodes page to change the frequency at which the probe monitor executes to the required value. The default is every two minutes.

8. Click Next to see a summary of your settings. If you are creating a Web Application probe monitor, you can select the check box at the bottom of this page to start the Web Application Editor, where you can specify exact details of the request, create group requests, and even record navigation using your Web browser.

9. Click Create to create the new monitor and close the wizard. Then expand the tree view in the left pane of the main window, and then select the Monitors node (which is under the Management Pack Objects node) if it is not already visible. The main section of the window shows a list of all the monitors. You can use the Change Scope link in the toolbar to limit the list of items to those within a particular scope (such as a Management Pack, Group, or Distributed Application), which makes it easier to find and work with the monitors you create.

To create a unit monitor in the Operations Manager 2007 Operations Console

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Authoring button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane of the main window to show the Monitors node (which is under the Management Pack Objects node) if it is not already visible, and select it. Use the Change Scope link in the toolbar to limit the list of items to those within the required scope, and expand the nodes below the Entity Health node in the main Operations Console window (see Figure 5).


Figure 5 The list of monitors for a distributed application

3. For each entity (such as a distributed application or group), you can create monitors for the four categories: Availability, Configuration, Performance, and Security. Right-click the category node to which you want to add a new monitor, click Create a Monitor, and then click Unit Monitor to start the Create Monitor Wizard.

4. On the Select a Monitor Type page of the Create Monitor Wizard, select the type of monitor you want to create. There are many different types available, organized in folders denoting the type (see Figure 6). These monitor types equate to the rule types described in more detail in the earlier procedure, "To create a new rule for a group in the Operations Manager 2007 Operations Console."


Figure 6 Some of the different types of unit monitor you can create

There are five basic types of unit monitor. You can create a monitor that reacts to an event, to changes in a performance counter value, to the result of executing a custom script, or to an SNMP event or trap, or one that monitors a Windows service.

For an event monitor, you can detect one or more events that match specified criteria (a correlated event), a combination of different events occurring over a specified period, a missing event that you expect to occur, or a series of repeated events. You can also specify whether the operator must reset the state manually, or whether another event or a timer can reset the state.

For a performance monitor, you can detect specific values or specify threshold value ranges, and expose a two-state (RED and GREEN) or a three-state (RED, YELLOW, and GREEN) health status. You can also create a baseline performance monitor that measures average performance over time. This is useful for measuring adherence to service level agreements (SLAs) and estimating the performance capabilities of the application and its individual components.

For a script monitor, you can expose a two-state or a three-state health status.

For an SNMP monitor, you can detect a combination of different events occurring over a specified period.

For a Windows service monitor, you can detect changes to the state and operation of the service.

5. The following pages of the Create Monitor Wizard collect all the information required for the specific monitor type you select. Each asks first for the name and description of the monitor, and the Management Pack to add it to. Then there is a different series of pages, but all follow the same basic pattern. The first steps help you to set up the correlation (mapping) between one or more source events, counters, or script executions and the new monitor:

◦ Event Monitor. For this type of monitor, you specify the name of the correlated event logs, and expressions that match the events you want to monitor. This is a similar process to that described in the earlier procedure for creating an event rule.

◦ Performance Monitor. For this type of monitor, you specify the counter name and location, and the threshold values. You can also use this type of monitor to create baseline information (including varying the "learning rate") that indicates the average performance of the monitored application or its individual components over long or short business cycles (see Figure 7).


Figure 7 Specifying the threshold and learning cycle values for a baseline performance monitor

◦ Script Monitor. For this type of monitor, you specify the script to execute, and any parameters it requires.

◦ SNMP Monitor. For this type of monitor, you specify one or more expressions that match the SNMP traps or probes.

◦ Windows Service Monitor. For this type of monitor, you specify the location and name of the service you want to monitor.


6. Complete the remaining pages of the Create Monitor Wizard. These pages include the following:

◦ The Configure Health page that allows you to specify the health states that Operations Manager will display when the correlated event, counter threshold, or script execution occurs. You assign a Critical (RED), Warning (YELLOW), or Healthy (GREEN) health state to each occurrence or value of the correlated event, counter, or script execution.

◦ The Configure Alerts page that allows you to specify if changes to the state detected by this monitor will raise an alert to display in the console (and, optionally, send it to operators as an e-mail or pager message). You also specify the severity of the alert here.

7. Click Create to create the new monitor, and you will see it appear in the list in the main window of the Operations Console.

8. To add product or company knowledge to a monitor, select it in the list in the main window, right-click, and then click Properties. Open the Product Knowledge page, click the Edit button, and enter the required information that helps operators and administrators to diagnose, resolve, and verify resolution of the problem.

To create a health rollup monitor in the Operations Manager 2007 Operations Console

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Authoring button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane of the main window to show the Monitors node (which is under the Management Pack Objects node) if it is not already visible, and select it. Use the Change Scope link in the toolbar to limit the list of items to those within the required scope, and expand the nodes below the Entity Health node in the main Operations Console window to see the four categories: Availability, Configuration, Performance, and Security.

3. If you want to create a rollup monitor that reflects the health state of a complete distributed application or a top-level group, select the Entity Health node for that distributed application or group. If you want to create a rollup monitor that reflects the health state of one of the four categories below the Entity Health node (Availability, Configuration, Performance, and Security), select that node instead.

4. Right-click the selected node, click Create a Monitor, and then click either Dependency Rollup Monitor or Aggregate Rollup Monitor to start the wizard.

A dependency rollup monitor allows you to specify the rollup policy based on subsets of computers or components within the same group, and to specify what state to expose when monitoring is unavailable or the computers are in maintenance mode (temporarily disconnected from the monitoring system). An aggregate rollup monitor simply exposes the best or worst state of all the computers or components within the group.

5. In the General Properties page of the wizard, enter a name for the monitor and a description that will help administrators and operators to identify the monitor. Then click the Select button to open the Select a Target Type dialog box. This dialog box shows a list of all the types of object to which you can apply the new monitor. Type part of the name of the entity (group, computer, or Management Pack) you want to apply the monitor to in the Look for text box at the top of the dialog box. The list changes to reflect matching items, and the check boxes of these matching items become selected. To see all of the available entities, click the View all targets option button and select the check boxes of any other targets you want to include.

6. Click OK to close the Select a Target Type dialog box, and select the appropriate parent monitor that will act as a rollup for this monitor from the list on the main wizard page.

7. Select the Management Pack to which you want to add the new monitor in the drop-down list at the bottom of the dialog box. If you have not already created a Management Pack, click the New button and follow the instructions in the earlier procedure, "To create a new Management Pack in the Operations Manager 2007 Operations Console." Also make sure that the Monitor is enabled check box is selected, unless you want to leave the monitor disabled for now. Then click Next.

8. If you are creating an aggregate rollup monitor, the Health Rollup Policy page you see next allows you to specify if the health state exposed by the monitor is that of the worst state of any member of the group or the best state of any member of the group. Select the required option, and then go to step 12 of this procedure.

As an example, if you select Worst state of any member while one computer has a Warning state, one has a Critical state, and the rest have a Healthy state, the monitor will show Critical. If you select Best state of any member while one computer has a Warning state, one has a Critical state, and the rest have a Healthy state, the monitor will show Healthy.

9. If you are creating a dependency rollup monitor, the next wizard page contains a tree-view list of the entities related to the current entity for which you are creating the monitor. These relationships match those that you (or the template you used in the Distributed Application Designer) created. You also see all the subgroups within the current group. Expand the target entity or group for which you are creating a monitor, and you see the Entity Health node and the four category nodes: Availability, Configuration, Performance, and Security. Within each of these nodes are any monitors you have already created, and any default monitors created by the Distributed Application Designer (see Figure 8).


Figure 8 Selecting the target entity for a Dependency Rollup Monitor

10. Select the node for which you want to roll up the state of the members, and click Next to show the "Configure Health Rollup Policy" page. This page allows you to specify how the overall state for the group will reflect the states of individual members of the group. The three options in this page (see Figure 9) are:

◦ Worst state of any member. If you select this option, Operations Manager will set the State value displayed in the Operations Console to that specified for the Severity for the worst of the current unresolved alerts for the members of this group.

◦ Worst state of the specified percentage of members in good health state. If you select this option, you must specify a percentage that defines the proportion of the group that will act as the state indicator for the group. Operations Manager will select a set of members from the group that consists of the computers with the best health state, up to the percentage you specified of the total group membership. In other words, if there are 10 computers and you specify 60%, Operations Manager will select the six members of the group that currently have the least severe state. It then uses the worst (the most severe) state of the subset it selects as the overall (rolled-up) state for the group, and displays this in the Operations Console as the State value for this group.


◦ Best state of any member. If you select this option, Operations Manager will set the State value displayed in the Operations Console to that specified for the Severity for the best of the current unresolved alerts for the members of this group. It is unlikely that you will use this option very often, as it effectively hides the state of most of the members of the group as long as one member is performing correctly.

Figure 9 Configuring the Health Rollup Policy for a Dependency Rollup Monitor

11. In the lower section of the "Configure Health Rollup Policy" page, use the two drop-down lists to specify what state you want to assume for unavailable members of the group (members where monitoring has failed, or members in maintenance mode). In the first drop-down list, specify if the Rollup Monitor should treat a failed member's state as either a Warning or an Error, or just ignore the failed member. In the second drop-down list, specify if the Rollup Monitor should treat a member in maintenance mode as either a Warning or an Error, or just ignore this member.

12. Click Next to show the "Configure Alerts" page (for both a Dependency Rollup Monitor and an Aggregate Rollup Monitor). Set or clear the check box at the top of the Alert Settings section of the page to specify if this monitor will create an alert to display in the console and send to operators when the health state changes. If you turn on alerts, use the drop-down list below this check box to specify generation of an alert for both a Critical state and a Warning state, or just for a Critical state. If you require the alert to be automatically resolved when the monitor returns to a Healthy state, select the check box below the drop-down list.


13. In the Alert Properties section of the page, enter a name and description for the alert, and select the Priority and Severity you want to assign to it. The available values for Priority are Low, Medium, and High. The available values for Severity are Critical, Warning, and Information.

14. Click Create, and you will see the new monitor appear in the list in the main window of the Operations Console.

15. To add product or company knowledge to a monitor, select it in the list in the main window, right-click, and then click Properties. Open the Product Knowledge page, click the Edit button, and enter the required information that helps operators and administrators to diagnose, resolve, and verify resolution of the problem.

Guidelines for Creating a Management Pack in the Operations Manager 2007 Operations Console

When creating a management pack in the Operations Manager 2007 Operations Console, you should consider the following proven practices:

• Use the management model you developed for your application to help you decide what rules and performance counters you need to create.

• Either use the Operations Manager 2007 Distributed Application Designer to create a multi-level hierarchy of related groups as a distributed application that mirrors that of the management model, or create a multi-level hierarchy of instance monitoring groups and subgroups that mirrors that of the management model.

• Use a name for the distributed application or the top-level monitoring group that makes it easy to identify. You will later be able to use the top-level group to expose the overall rolled-up state of the entire application. Each subgroup will expose the rolled-up state of the members of that group.

• Create only rules and monitors directly relevant to your application. Avoid duplicating rules and monitors that are available in built-in Management Packs, such as measuring processor usage or free memory.

• Use the available health configuration settings for monitors to display the health state of individual components of the application, and use the rolled-up health state indicators for sections of the application and the application as a whole.

• Use alerts to immediately raise urgent issues to operations staff, perhaps through e-mail or pager.

• Take advantage of specific features of the monitoring application, such as probes that can provide heartbeat monitoring of remote services, or the ability to run scripts or commands in response to alerts (for example, to query values or call a method that provides diagnostic information and then generates a suitable alert).

• Provide as much useful company-specific and application-specific knowledge as possible for each group and rule to make problem diagnosis, resolution, and verification easier for operators and administrators.


Editing an Operations Manager 2007 Management Pack

After creating or importing a Management Pack in Operations Manager 2007, you will typically need to perform additional actions to fine-tune the management pack or to respond to changes in the operations environment and your management model.

In the case where a Management Pack originated in the Management Model Designer (MMD), changes are quite commonly required, because the import process does not always generate the ideal combination of rules and rule groups. For example, the MMD generates an alert that creates a notification to members of the administration group. However, this group has no members by default, so you may want to edit this notification, add members to the various notification groups, or create new notification groups.

This section contains the following procedures:

• To edit the properties of a Management Pack
• To edit a distributed application
• To edit a rule
• To edit a monitor
• To create and edit notification channels and recipients
• To create and edit notification subscriptions
• To view and edit the global settings

To edit the properties of a Management Pack

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Administration button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane of the main window and select the Management Packs node. In the main window, right-click the Management Pack you want to edit, and then click Properties.

3. The Properties dialog box for a Management Pack contains three tabbed pages:

◦ Properties. On this page, you can edit the Name, Version, and Description (except for a built-in or sealed Management Pack).

◦ Knowledge. On this page, you can click the Edit button to edit the product and company-specific knowledge for the Management Pack.

◦ Dependencies. On this page, you can see which other Management Packs depend upon this one, and which other Management Packs this one depends upon. You cannot edit these lists.

4. Click OK or Apply to save your changes to the Management Pack properties.

To edit a distributed application

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Authoring button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane of the main window to show the Distributed Applications node (which is under the Management Pack Objects node) if it is not already visible, and select it. In the main window, right-click the distributed application you want to edit, and then click Edit to open the Distributed Application Designer.

3. Use the Distributed Application Designer to modify your distributed application as required. You can add and remove components, change their properties, and add and remove relationships between the components. For more details about working with the Distributed Application Designer, see the earlier procedure, "To create a new distributed application in the Operations Manager 2007 Operations Console."

4. Click the Save button on the toolbar of the Distributed Application Designer when you finish editing your distributed application.

To edit a rule

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Authoring button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane of the main window to show the Rules node (which is under the Management Pack Objects node) if it is not already visible, and select it. Use the Change Scope link in the toolbar to limit the list of items to those within the required scope.

3. Select the rule you want to edit, right-click it, and then click Properties (or double-click the rule). The Properties dialog box contains the following tabbed pages:

◦ General. On this page, you can edit the Rule name and the Description. However, you cannot change the rule target in this dialog box. To enable or disable this rule, select or clear the Rule is enabled check box.

◦ Configuration. On this page, you can see the details of the source for the rule, such as an event log, WMI query, or a performance counter. If the details are available for editing, you will see an Edit button that opens a source type-specific dialog box that allows you to change the settings for the source of this rule. If the details are not editable, you will see a View button that opens a source type-specific dialog box that allows you to view the settings for the source of this rule.

◦ Configuration. On this page, you can also see a list of any Responses defined for this rule, such as creating an alert or running a script. If the details are available for editing, you can click a response in the list and click the Edit button to view and edit the properties of the selected response. You can also add new responses or remove existing responses. If the details are not editable, you will see just a View button that opens a dialog box that allows you to view any properties of the selected response for this rule.

◦ Product Knowledge and Company Knowledge (for built-in rules). On this page, you can see the knowledge associated with this rule. Click the Edit button to edit the company knowledge for a built-in rule.

4. Click OK or Apply to save your changes to the rule properties.

To edit a monitor

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Authoring button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane of the main window to show the Monitors node (which is under the Management Pack Objects node) if it is not already visible, and select it. Use the Change Scope link in the toolbar to limit the list of items to those within the required scope.

3. Select the monitor you want to edit, right-click it, and then click Properties (or double-click the monitor). The Properties dialog box contains the following tabbed pages:

◦ General. On this page, you can edit the Name and the Description. Although you cannot change the monitor target in this dialog box, you can select a different parent monitor if you want a different roll-up monitor to handle state changes for this monitor. To enable or disable this monitor, select or clear the Monitor is enabled check box.

◦ Product Knowledge and Company Knowledge (for roll-up monitors). On this page, you can click the Edit button and edit the product and company-specific knowledge for this monitor.

◦ Health. On this page, you can specify the health state (Critical, Warning, or Healthy) for each monitor condition. For example, you can map the Degraded monitor state to a Warning health state.

◦ Alerting. On this page, you can edit the settings for an alert that this monitor will generate.

◦ Diagnostic and Recovery. On this page, you can add, modify, and remove diagnostic and recovery tasks that will execute when the state of the monitor changes to Critical or Warning. For example, you can configure a script or a command to execute.

4. Depending on the type of monitor you are editing, you may see other pages in the Properties dialog box. These include details of the event, script, counter, WMI query, log file, Windows service, or other source for the monitor. Each allows you to modify the settings for this monitor source. There are also pages, depending on the monitor type, for the schedule to execute a script or command, and the actions that reset the monitor when the state changes.

5. Click OK or Apply to save your changes to the monitor properties.

To create and edit notification channels and recipients

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Administration button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. If you previously specified and configured the notification channels that you want to be available for sending alerts, go directly to step 10 of this procedure. However, if you have not already specified and configured the notification channels, you must do so before you can create recipients and subscriptions.

3. Expand the tree view in the left pane of the main window and select the Settings node. Right-click the Notification setting in the main window, and then click Properties. The Notification settings dialog box contains four tabbed pages where you configure the parameters for Email, Instant Messaging, Short Message Service, and Command notification channels.


4. On the Email page, select the Enable e-mail notifications check box if you want to enable e-mail notifications, and then click the Add button to open the Add SMTP Server dialog box. Specify the fully qualified domain name of the mail server to use, the Port number (the default is 25), and the Authentication method (Anonymous or Windows Integrated).

You must configure your mail server to allow the Operations Manager management server to relay through it if you are not using the local SMTP service.

5. Back in the main settings dialog box, specify a valid Return address for messages, and a retry interval for failed messages. Then use the two text boxes at the bottom of the page to specify the Email subject and the Email message to send. Click the "arrow" button next to these text boxes to insert placeholder strings (replaced by the actual value when Operations Manager generates the e-mail alert) into the subject or message body. Finally, specify the required encoding for the message from the Encoding drop-down list.

6. On the Instant Messaging page, select the Enable instant messaging notifications check box if you want to enable instant messaging notifications, and then enter the name of the IM server and the Return address in the two text boxes below this. Next, specify the IM port number, the Authentication method to use, and the Protocol option (TCP or TLS).

7. Now use the text box at the bottom of the page to specify the IM message to send. Click the "arrow" button next to the text box to insert placeholder strings (replaced by the actual value when Operations Manager generates the IM alert) into the message body. Finally, specify the required encoding for the message from the Encoding drop-down list.

8. On the Short Message Service page, select the Enable short message service notifications check box if you want to enable SMS notifications. Now use the text box to specify the SMS message to send. Click the "arrow" button next to the text box to insert placeholder strings (replaced by the actual value when Operations Manager generates the SMS alert) into the message body. Finally, specify the required encoding for the message from the Encoding drop-down list.

9. On the Command page, if you want to enable command notifications, click the Add button to open the Notification Command Channel dialog box. Enter a name and description for the channel, and enter the full path to the executable file that implements the channel or command you want to execute for this notification. If you need to pass parameters to the command, add these in the next text box. Click the "arrow" button next to this text box to insert placeholder strings (replaced by the actual value when Operations Manager executes the command) into the command parameters. Finally, specify the initial directory and click OK to return to the Command page of the Properties dialog box. You can also edit and remove existing commands in the list on this page of the Properties dialog box. (A hypothetical example of a command channel configuration appears at the end of this procedure.)

10. After you configure the available notification channels, you can create and configure recipients and subscriptions. Expand the tree view in the left pane of the main window and select the Notifications node. Expand this node to see the Subscriptions and Recipients nodes.

11. Right-click the Recipients node, and then click New Notification Recipient. On the General page of the Notification Recipient Properties dialog box, specify the display name for the recipient. You can click the ellipsis button (...) to open the Select User or Group dialog box that allows you to select from all the users and groups configured on the machine or the domain.

12. Back in the Notification Recipient Properties dialog box, specify if Operations Manager should send notifications at any time (when they occur) or only during specified times. If you select the "specified times" option, you can specify periods when Operations Manager can send messages, and when it cannot. Click the Add button above the Schedules to send or Exclude schedules list and use the Specify Schedule Period dialog box that opens to specify the period in terms of the days, and the start and end times. You can add multiple schedules to both lists.

13. Open the Notification Devices page of the Notification Recipient Properties dialog box to see a list of channels (if any) already configured for sending notifications to this user. Click the Add button above the list to open the Create Notification Device Wizard.

14. On the first page of the wizard, specify the channel (E-mail, IM, SMS, or Custom Command) and edit the delivery address for this recipient if required.

15. Click Next and enter a name that identifies this channel for this recipient, and then click Finish to create the new device. You will see it appear in the Notification Devices list in the Notification Recipient Properties dialog box. Repeat this process to add a notification device that maps this user to every notification channel you want them to be able to use. You can add more than one device for each channel (such as multiple email accounts) if required.

16. Click OK or Apply to save the new recipient, and you will see it appear in the main window when you select the Recipients node in the tree view.

17. To edit the properties for a recipient, double-click the recipient entry in the main window to open the Notification Recipient Properties dialog box. To delete a recipient, right-click the recipient entry in the main window, and then click Delete.
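As an illustration of the command notification channel described in step 9, the following settings invoke a hypothetical batch file that forwards alert details to an on-call paging system. The path, file name, and directory are placeholders, and the quoted parameters stand in for whatever placeholder strings you insert with the "arrow" button:

    Full path to the executable file: C:\OpsMgrTools\NotifyOnCall.cmd
    Command parameters:               "<alert name placeholder>" "<alert severity placeholder>"
    Initial directory:                C:\OpsMgrTools

Keeping such commands in a dedicated tools directory on the management server makes them easier to secure and audit.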

To create and edit notification subscriptions

Subscriptions link an alert generated by Operations Manager with one or more recipients that will receive the alert. If you have not previously specified and configured the notification channels that you want to be available for sending alerts, you must do so before you can perform this procedure to create subscriptions. For more details, see the earlier procedure, "To create and edit notification channels and recipients."

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Administration button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. Right-click the Subscriptions node (which is under the Notifications node) and click New Notification Subscription to start the Create Notification Subscription Wizard.

3. Click Next to show the General page, and provide a name and description for this subscription. Then click the Add button to open the Add Notification Recipient dialog box that shows a list of all configured recipients. Select the check box of all those you want Operations Manager to notify as part of this subscription, and click OK to return to the wizard.

4. Click Next in the wizard to show the User Role Filter page. If you have configured filtering based on user roles (in the User Roles section of the Security configuration), you can specify the group or role using the drop-down list in this page. Leave it empty if you do not want to use role filtering.

5. Click Next to show the Groups page, which contains a list of all the groups you have configured. Select the check boxes in the list for the groups whose alerts will trigger notifications through this subscription.

6. Click Next to show the Classes page, which allows you to specify which classes are "approved" for activating a subscription based on an alert. The default is all classes. However, you can select the Only classes specifically added option and build a list of approved classes if you want. Click the Add button to open the Select Management Pack Objects dialog box and select the Management Packs that you want to include, and then click OK. Back in the wizard, you can continue to use the Add and Remove buttons to create the list of approved classes you need.

7. Click Next to show the Alert Criteria page, where you specify which alerts will activate this subscription. You can select properties from four lists: Severity, Priority, Resolution State, and Category. Select the check box next to all the criteria you require. For example (see Figure 10), you can specify that the alert must be an error or a warning and it must meet the following criteria:

◦ It must have a priority of Medium or High (using the second list).

◦ It must be a New (not a Closed) alert (using the third list).

◦ It must be an alert caused by a Performance Collection or an Event Collection monitor rule.


Figure 10 Specifying the criteria for activation of a subscription

8. Click Next to show the Alert Aging page, where you specify if the subscription will respond to the aging of alerts, sending repeated notifications at each stage of the alert aging process. You can change the period between alerts in this page.

9. Click Next to show the Format page. By default, the subscription will use the settings specified in the notification channel you created earlier, and which you specified for this subscription. However, you can change the options from Use the default and specify a custom subject and message for e-mail, IM, and SMS notifications if required. Click the "arrow" button next to each text box to insert placeholder strings (replaced by the actual value when Operations Manager generates the notification) into the subject or message body.

10. Click Finish to create the new subscription, which appears in the list in the main window.

11. To edit the properties for a subscription, double-click the subscription entry in the main window to restart the Notification Subscription Properties Wizard, where you can modify the settings. To delete a subscription, right-click the subscription entry in the main window, and then click Delete.


To view and edit the global settings

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Administration button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane and select the Settings node. A list of configuration settings appears in the main window under the three categories Agent, General, and Server.

3. To edit the settings for the Operations Manager remote agents installed on monitored computers, double-click the Heartbeat entry in the Agent section. Heartbeat checking ensures that a remote agent is available. Change the value for the Heartbeat interval from its default value of 60 seconds as required.

4. To specify how Operations Manager will minimize the size of the database that stores operational data, double-click the Database Grooming entry in the General section. Select the type of data you want to change the setting for, such as Resolved Alerts, and click the Edit button to specify the number of days before Operations Manager will remove this information from the database.

5. To edit the settings for notifications sent to recipients via subscriptions, double-click the Notification entry in the General section. For details about the settings available in the Global Management Group Settings – Notification dialog box, see the previous procedure, "To create and edit notification channels and recipients."

6. To edit the settings for participation in feedback programs and error reporting, double-click the Privacy entry in the General section. The Global Management Group Settings – Privacy dialog box has four tabbed pages that allow you to do the following:

◦ CEIP. On this page, you can specify if you want to participate in the Customer Experience Improvement Program by providing feedback to Microsoft about how you use Operations Manager 2007.

◦ Error Reporting. On this page, you can specify if you want to send Operational Data Reports to Microsoft about how you use the product so it can be improved in accordance with customer requirements.

◦ Error Transmission. On this page, you can specify if Operations Manager will send error reports to Microsoft when an error occurs within the software, and whether it will prompt before sending them.

◦ Operational Data Reports. On this page, you can specify a filter for the types and sources of errors that Operations Manager will send to Microsoft, what information to include in the error reports, and whether to display links to possible solutions.

7. To edit the settings for viewing and generating reports, double-click the Reporting entry in the General section and specify the URL of the reporting server. The Operations Console uses this information to launch reports.

8. To edit the settings for accessing the Web Console and your own custom online product knowledge, double-click the Web Addresses entry in the General section. Specify the URL of the Operations Manager 2007 Web Console, and the URL of the start page for your product knowledge Help pages or Web site. The Operations Console uses these values to launch the Web Console and access online knowledge.

9. To edit the way that Operations Manager reacts to failed remote agents, double-click the Heartbeat entry in the Server section and change the value for the number of missing heartbeats allowed before Operations Manager will create a failure alert. The default is three.

10. To edit the security setting for installation of remote agents, double-click the Security entry in the Server section. For maximum security, select the option to reject manually installed remote agents. If you want to allow manual installation of the Operations Manager agent on remote computers, select the second option. You can then manually approve newly installed agents. If you want Operations Manager to approve new remote agent installations automatically, select the check box below this option.

11. Click OK or Apply in each of the settings dialog boxes to save the changes you made.

Guidelines for Editing an Operations Manager 2007 Management Pack

When editing an Operations Manager 2007 Management Pack, you should consider the following proven practices:

• Ensure that your Management Pack contains the appropriate distributed applications, groups, rules, and monitors to match the management model and the instrumentation exposed by the application. Keep the hierarchy of the groups and the properties of the rules and monitors as simple as possible, while following the structure and the requirements of the management model.

• Ensure that you assign the elements of your Management Packs to the appropriate groups and subgroups that match the physical layout of the computers that will run the application. Use rollup monitors to expose the rolled-up overall health state for a subgroup, or a specific category (such as Availability or Performance) within that subgroup.

• Create the appropriate recipients and subscriptions for operators that manage the application, and other people that have an interest in its operation (such as business owners and application developers), so that you can automatically send alerts to them. Configure responses for the rules and alerts that send operational alerts to the appropriate groups.

• Modify any of the global settings that affect your application. For example, you may want to use the custom alert fields for company-specific information, or modify the alert resolution states and service level periods to suit your requirements.

Deploying the Operations Manager 2007 Agent

After you import a management model to generate a Management Pack, or after you create a new Management Pack, and designate the servers that will run the application and its components using one or more instance groups or a distributed application, you must install the Operations Manager agent on the target computers if you have not already done so.

To deploy the Operations Manager agent to remote computers


1. On the taskbar, click Start, point to System Center Operations Manager 2007, and then click Operations Console. In the navigation pane, click the Administration button. If the navigation pane is not visible, click Navigation Pane on the View menu.

2. Expand the tree view in the left pane and select the Device Management node. Expand the Device Management node to see the nodes that contain the different types of computer and device discovered on your network (such as management servers, agent managed, and agent-less managed computers).

3. Right-click the Device Management node, and then click Discovery Wizard to open the Computer and Device Management Wizard. This wizard helps you to install the Operations Manager agent on remote computers.

4. Click Next on the Introduction page to show the Auto or Advanced page. You can select the first option and allow the wizard to scan your entire domain, but this can be a long process. Instead, select the second option, Advanced discovery, and select a value in the drop-down list that corresponds to the types of computers you want to discover. You can select Servers & Clients (the default), Servers Only, Clients Only, or Network Devices. Ensure that the correct management server is selected in the Management Server drop-down list. If you want to ensure that connections can be made to remote computers, select the check box below this list.

5. Click Next to show the Discovery Method page. Select the first option if you want to allow the wizard to scan Active Directory to find suitable computers. Click the Configure button to open the Find Computers dialog box, open the Advanced tabbed page, and specify a field name, condition, and value that selects the computers that you want to include (see Figure 11, and the example after this procedure). Remember also to select the appropriate target domain in the Domain drop-down list.


Figure 11 Specifying computers in Active Directory using the Find Computers dialog box

6. Alternatively, you can simply browse or type in the names of the computers you want to include. In this case, select the second option on the Discovery Method page. Either type a list of names directly into the text box at the bottom of the page, or, to select computers, click the Browse button to open the Active Directory Select Computers dialog box.

7. Click Next to show the Administrator Account page. Here, you must provide the credentials of an account that has administrator-level privileges on the target computers, and which Operations Manager will use when installing the remote agents on these computers. You can specify the existing action account that Operations Manager creates during installation if you first ensure this has the relevant privileges. However, it is usually better to select the Other user account option and enter the credentials of a domain-level administrative account. After installation completes, the account you specify here is no longer used.

8. Click Discover to start the discovery process. You will see a dialog box that indicates the status. The Pending Actions node in the tree view (below the Device Management node) also lists all pending processes such as this discovery process.

9. The status dialog box indicates when discovery and installation are complete.
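As an example of the Advanced criteria described in step 5, settings along the lines of the following find computers whose names start with "WEB", matching the naming-convention approach recommended in the best practices later in this section. The field and condition names are typical of the Find Computers dialog box, and the value is a placeholder for your own naming convention:

    Field:     Computer name
    Condition: Starts with
    Value:     WEB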


You can install the Operations Manager agent on a remote computer by running the Operations Manager setup on that computer, using your setup CD or by loading the setup .msi file from a network drive. In this case, ensure that you edit the Operations Manager management server security settings for installation of remote agents. Go to the Settings section in the Administration view in the Operations Console and double-click the Security entry in the Server section to open the Global Management Server Settings – Security dialog box that contains the options to accept remote agent installation.

Best Practices for Deploying the Operations Manager 2007 Agent

When deploying the Operations Manager 2007 agent, you should consider the following proven practices:

• Use the Discovery Wizard to find remote computers where you want to install the Operations Manager agent. You can scan Active Directory, but it is usually quicker to specify a condition and search for computers that match this condition—for example, computers whose name starts with a specific string such as "WEB" or "SALES".

• Specify the types of device you want to discover, such as servers and clients, servers only, or clients only, to narrow the search.

Viewing Management Information in Operations Manager 2007

After you import or create a Management Pack for your application, you can use it to monitor your application. You will usually also take advantage of the existing Management Packs provided with System Center Operations Manager 2007; these allow operators to detect faults in the underlying infrastructure, such as performance degradation or core operating system service failures, in addition to errors and performance issues directly associated with the application, and to monitor services such as Microsoft Exchange and SQL Server.

This section includes procedures for using both the Operations Console and the Web Console. The Operations Console allows you to view the state of an application and drill down to see details of the events, alerts, performance counters, and computers that run the application. The Web Console has less functionality, but it can still be of great use to operators, particularly when the operator console is not installed.

To view management information in Operations Manager 2007

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and

then click Operations Console. In the navigation pane, click the Monitoring button.

If the navigation pane is not visible, click Navigation Pane on the View menu.

2. The first section of the tree view in the left pane displays the five basic monitoring

information views:

◦ Active Alerts. In this view, the main window shows a list of all alerts raised by

the rules and monitors. Select an alert to see the details and the knowledge for


verifying, resolving, and re-verifying this problem in the lower details pane (see

Figure 12). If the details pane is not visible, click Detail Pane on the View menu.

Double-click an alert to see its properties, including a summary of the

knowledge, history, and context. Right-click an alert, click Open on the

shortcut menu, and then select from the four available views: Diagram View,

Event View, Performance View, or State View. You can alternatively open the

Health Explorer for the selected computer from here or a PowerShell command

prompt. Right-click an alert, click Set Resolution State, and then click either New

or Closed to change the resolution state for this alert. If you click Closed,

Operations Manager removes it from the list.

Figure 12 Viewing the knowledge for an unresolved alert in Active Alerts view

◦ Computers. In this view, the main window shows a list of computers within the

current scope, and the health state of each one—including the monitored

features it supports such as Agent, Management Server, or Windows Operating

System. Right-click a computer, click Open on the shortcut menu, and then

select from the five available views: Alert View, Diagram View, Event View,

Performance View, or State View. You can alternatively open the Health

Explorer for the selected computer from here or a PowerShell command

prompt.

◦ Discovered Inventory. In this view, the main window shows the overall state,

display name, and the path to each computer in the current scope. Double-click

a computer to see the properties of that computer. Right-click a computer, click


Open on the shortcut menu, and then select from the four available views: Alert

View, Diagram View, Event View, or Performance View. You can alternatively

open the Health Explorer for the selected computer from here or a PowerShell

command prompt.

◦ Distributed Applications. In this view, the main window shows the distributed

applications in the current scope and the overall state for each one. Right-click

an application, click Open on the shortcut menu, and then select from the five

available views: Alert View, Diagram View, Event View, Performance View, or

State View. You can alternatively open the Health Explorer for the selected

application from here or a PowerShell command prompt.

◦ Task Status. In this view, the main window shows all the tasks that Operations

Manager carried out, such as discovering computers, installing agents, and

executing monitoring probes. The details pane shows the output from each task

as you select it in the main Task Status list. If the details pane is not visible, click

Detail Pane on the View menu. Right-click a task, and then click Health Service

Tasks to see a list of the many tasks you can execute. These include a range of

configuration, probe, discovery, recovery, and execution tasks.

3. The first four of the basic monitoring categories listed in the previous step provide

views containing more detailed information:

◦ Alert view. This shows a list of active alerts for only the selected computer or

application.

◦ Diagram view. This shows a schematic representation of this computer or

application. This is a useful view for understanding the structure of a distributed

application or series of hierarchical groups (see Figure 13). It shows the overall

health state for each component as well as the application as a whole, and you

can expand and collapse the nodes to explore where any problems or

performance issues exist. Right-click any of the components, and then click

Health Explorer to open a window that contains a tree view where you can

explore the individual rules and monitors for the entire application; you can also

see details of the state of each one and the associated knowledge that helps to

verify, diagnose, resolve, and re-verify any problems.


Figure 13 Viewing the schematic structure and state of a distributed application in Diagram view

◦ Event view. This shows details of the source events for the computer or

application. Right-click an event, and then select Show associated rule

properties to see the rules associated with the event. The details pane shows

the properties of the selected event.

◦ Performance view. This shows the performance counters available for a

computer or application. Select a counter from the list to see a graph of the

values over time (see Figure 14). This window contains commands on the

Actions menu that allow you to select the time range, copy or save the graph

image, and copy the source data to the clipboard for further examination and

analysis. For a baseline counter, you can also pause or restart a collection, or

you can reset the baseline values.

Figure 14 Viewing the history for a performance counter in Performance view

◦ State view. This shows the overall state of the computer or application. This

window shows the state and properties of the selected computer or application,

and contains commands to show the Health Explorer window and view reports.

◦ Other options available in the views listed in steps 2 and 3 allow you to start Maintenance mode for an application or a computer, or to create personalized views with specific columns, grouping, and ordering to suit your requirements.

4. In any view, select a distributed application or a computer and double-click to open

the Health Explorer window. In the left pane tree view, expand the nodes to show

the overall state for the application or computer (the Entity Health node). Within

this node, depending on the structure of your application, you see the rolled-up

health state and the individual category health states for each component. Figure 15

shows the health state for an example distributed application.

Figure 15 The Health Explorer window for a distributed application

5. As you select each node in the Health Explorer tree view, the Knowledge tabbed

page in the right pane shows the product and company-specific knowledge for that

node. The State Change Events tab page shows a list of the events that caused

changes to the state, and the event context.

6. Examine the other views available in Monitoring mode to see a high-level view of the

computers and the applications they are running, and the overall health state of

each one. Expand the nodes in the tree view in the left pane of the main Operations

Console window for the category of information you are interested in. Available

categories include agentless monitored computers, Windows client computers,

Windows Server computers, Web applications, network devices, Operations

Manager itself, and any application groups you have created. Select the State node

within a category group to see an overall view of the state for that category. You can

then right-click entries in the main window to see the different views (described in

step 3), or open Health Explorer or the PowerShell command prompt.

To use the Web Console to connect over an external network such as the Internet


1. On the taskbar, click Start, point to System Center Operations Manager 2007, and

then click Web Console. The Web Console provides only monitoring features and

displays a much simpler interface for selecting and viewing information (see Figure

16).

Figure 16 The Web Console provided with System Center Operations Manager 2007

2. The left pane tree view displays only four basic categories and a reduced set of other

monitoring categories. However, it still provides a wealth of monitoring capabilities,

and works in much the same way as the standard Operations Console.

Guidelines for Viewing Management Information in Operations Manager 2007

When viewing management information in Operations Manager 2007, consider the following

proven practices:

• If you connect directly to the management domain, use the Operations Console to

monitor applications and computers. If you connect from a remote location over the

Internet or an intranet, use the Web Console to monitor applications and computers.

• Use the Scope option on the View menu to limit your view to the appropriate

distributed applications or groups and subgroups, unless you want to see alerts raised

by all the managed computers for all events.

• Use the State view and the Diagram view to provide an overall picture of the health

state of the application. In Diagram view, you can also see the state of the subgroups

and individual computers.


• Use the Alerts view to obtain a list of alerts, generally sorted by descending severity,

which is useful in prioritizing diagnosis and resolution requirements, and the

corresponding actions.

• Use the Events view to see the details of source events, and use the Performance view

to see the values and history of performance counter samples. Both are useful in

diagnosing problems and verifying resolution.

• Use the Health Explorer to see the state of individual components, individual categories

(such as Configuration or Performance), and individual monitors and rules.

• Create personalized views if you want to see information displayed in a different order

or in different groups.

Creating Management Reports in Operations Manager 2007

Regular and continuous monitoring makes it easier to detect application failures, problems, or

unsatisfactory performance, but the actions taken by administrators and operations staff are

usually short-term in nature. They tend to concentrate on the present, and may not take into

account historical events and performance over longer periods that indicate fundamental issues

or causes.

However, business owners and hosting services must often conform to specified service level

agreements (SLAs) about performance and availability. The data required to support these

agreements only appears over longer periods and requires access to historical information.

Data captured in summary reports can also be vital to operations staff in detecting missed

computers, or incorrectly configured application or computer groups, particularly in large and

complex installations. These reports may be the only way that operations staff can match

monitoring infrastructure to the physical hardware.

System Center Operations Manager 2007 includes a report generator that uses SQL Server

Reporting Services to publish the stored performance and error details within its own database.

This can provide useful summaries of application performance, and the history of issues

encountered with an application. You can use the reports to view the overall performance over

time and detect specific problem areas with your application.

The reporting feature for System Center Operations Manager 2007 is a separate installation

from the monitoring system. You must rerun the setup for Operations Manager and select

Operations Manager 2007 Reporting to install the reporting feature.

To view monitoring and management reports in Operations Manager 2007

1. On the taskbar, click Start, point to System Center Operations Manager 2007, and

then click Operations Console. There are two ways to create a report. To view

information for a single computer or a single distributed application, go to step 2 of

this procedure. To view information for multiple computers, distributed applications,

or other entities, go to step 5 of this procedure.

2. To view information for a single computer or a single distributed application, click

the Monitoring button in the navigation pane at the lower-left of the window. If the

navigation pane is not visible, click Navigation Pane on the View menu.


3. In the left pane tree view, select either the computer you want to view information

for in the Computers section or the application you want to view information for in

the Distributed Applications section.

4. The Actions pane to the right of the main window contains a series of links to the most popular types of report. If you cannot see the Actions pane, click Actions on the View menu. Click the report you want to generate to open the Report Viewer window. Now go to step 7 of this procedure.

5. To view information for multiple computers, distributed applications, or other

entities, click the Reporting button in the navigation pane at the lower-left of the

window. Note that the Reporting button is not available until you install the

reporting feature for Operations Manager 2007.

6. In the left pane tree view, expand the Reporting node, and then click Microsoft

Generic Report Library. Right-click a report in the list in the main window, and then

click Open to open the Report Viewer window.

7. The Report Viewer window contains a series of controls where you specify the

period for the report, the objects to include, and any other parameters specific to

that report. For example, when you open the Alerts report, you can specify the

severity and priority of alerts that the report will include. Figure 17 shows these

parameter settings and the other Report Viewer controls, and the way that you can

select the period for the report.

Figure 17 The Report Viewer showing the controls for the parameters for the report

8. If you specified a computer or a distributed application and opened Report Viewer

from the Monitoring section of the Operations Manager console, the Objects list in

Report Viewer will contain the item you selected. If you opened Report Viewer from

the Reporting section of the Operations Manager console, the Objects list will be

empty.

9. To add items to the Objects list, click the Add Group or Add Object button. In the

dialog box that opens, select a search option in the drop-down list, such as Contains

or Begins with, and then enter the text part of the name of the object(s) or group(s)

you want to find. If you want to specify the dates between which objects or groups


were created, or the management group they belong to, click the Options button,

and then enter the relevant details.

10. Click the Search button to view all matching items in the Available items list. Select

those you want to include in the report (you can hold down the SHIFT and CTRL keys

to select multiple items in the list), and then click the Add button to add them to the

Selected objects list. Then click OK to return to Report Viewer.

11. Set any other parameter values you require in the controls at the top of the Report

Viewer window, and then click the Run button on the main toolbar to start the

report running. After a few moments, the report appears (see Figure 18).

Figure 18 The results of running the Event Analysis report for two computers

The reports included with Operations Manager 2007 allow you to view alerts and alert latency;

availability and health; custom configuration and configuration changes; event analysis, most

common events, and custom events; and performance and health details. You can also author

your own reports, and set up scheduled reporting. Figure 19 shows the graphical reports for

alert latency over 1 second.


Figure 19 The results of running the Alert Latency report for all alerts during one day

Guidelines for Creating Management Reports in Operations Manager 2007

When creating management reports in Operations Manager 2007, you should consider the

following guidelines:

• Use System Center Operations Manager Report Viewer to examine the historical

performance of an application to ensure that it performs within the service level

agreements (SLAs) or the parameters defined by business rules.


• Use the reports to discover inconsistencies in performance, check overall reliability, and

detect problematic situations such as unreliable networks—and the times when these

issues most commonly arise.

• Use the reports to confirm management coverage of the computers running the

application, and deployment of the appropriate sets of rules to each group.

Summary

Management Packs can be a very useful tool for the operations team in managing applications.

This chapter demonstrated how to create and import Management Packs in Operations

Manager 2007, and then it described how to edit the Management Packs to provide the

functionality required when monitoring an application.


Section 5

Technical References

This section provides additional technical resources that can be of use when designing and

developing manageable applications. Chapter 18, "Design of the DFO Artifacts," is incomplete in

the preliminary version of this guide. Chapter 19 describes how to create or modify a guidance

package to modify the application management model defined in the Team System

Management Model Designer Power Tool (TSMMD).

This section is aimed primarily at solutions architects and application developers.

Appendix A, "Building and Deploying Applications Modeled with the TSMMD"

Appendix B, "Walkthrough of the TSMMD Tool"

Appendix C, "Performance Counter Types"


Appendix A

Building and Deploying Applications

Modeled with the TSMMD

In this preliminary version of the guide, this chapter provides guidance on how you can consume

the instrumentation artifacts generated by the Team System Management Model Designer

Power Tool (TSMMD) in your applications, and how you can deploy the applications complete

with the appropriate instrumentation. This chapter also explains how you can generate

Management Packs for System Center Operations Manager using the TSMMD. The topics in this

chapter are:

• Consuming the Instrumentation Helper Classes

• Verifying Instrumentation Coverage

• Removing Obsolete Events

• Deploying the Application Instrumentation

• Specifying the Runtime Target Environment and Instrumentation Levels

• Generating Management Packs for System Center Operations Manager 2007

• Importing a Management Pack into System Center Operations Manager 2007

• Using Management Packs with System Center Operations Manager

• Creating a New Distributed Application

Consuming the Instrumentation Helper Classes

After you generate the instrumentation helper classes for a model, you can make calls to these

classes in your application. The abstraction of the instrumentation into separate classes makes it

easier to focus on the application code without having to worry about the instrumentation

requirements. If the model changes, you can regenerate the helper classes and use them

without requiring changes to the application code (providing that the existing instrumentation

still exists in the model).


If you change the name of a managed entity and regenerate the instrumentation helper

classes, you must update the references and your application code to match the new

instrumentation helper class names.
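The exact names depend on the managed entities in your model, but the generated API classes follow a common singleton shape. The following is a minimal sketch, not the generated code itself, using the DatabaseEntity example from the procedure below; the method body is illustrative only.

C#

// Sketch of the shape of a generated instrumentation API class (illustrative).
public sealed class DatabaseEntityAPI
{
    private static readonly DatabaseEntityAPI instance = new DatabaseEntityAPI();

    private DatabaseEntityAPI() { }

    // Application code always goes through the singleton instance.
    public static DatabaseEntityAPI GetInstance()
    {
        return instance;
    }

    // The real generated method forwards the call to the concrete event
    // implementation (Event Log, WMI, and so on) selected at runtime by the
    // targetEnvironment setting in the configuration file.
    public void RaiseDatabaseFailedEvent(string databaseName)
    {
    }
}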

To call the instrumentation helper classes

1. Open the TSMMD solution in Visual Studio 2008, and then open Solution Explorer. If

you cannot see Solution Explorer, click Solution Explorer on the View menu.

2. In Solution Explorer, right-click the top level solution entry, point to Add, and then click

New Project. Select the required project type, such as Windows Forms Application, and

then enter the name for the project and any other required information.

3. In Solution Explorer, right-click the new project, and then click Add Reference. In the

Add Reference dialog box, click the Projects tab, and then select the [entity-name].API

and [entity-name].[target-environment].Impl projects for all of the managed entities in

your model. For example, if you have two managed entities named DatabaseEntity and

WebsiteEntity and two target environments named HighTrust and MediumTrust, you

would select the following projects:

◦ DatabaseEntity.API

◦ DatabaseEntity.HighTrust.Impl

◦ DatabaseEntity.MediumTrust.Impl

◦ WebsiteEntity.API

◦ WebsiteEntity.HighTrust.Impl

◦ WebsiteEntity.MediumTrust.Impl

4. In the code of the application that consumes the instrumentation, call the methods of

the instrumentation helper classes to raise events or increment performance counters.

For example, to raise an event named DatabaseFailedEvent that takes as a parameter

the name of the database, you can use code like the following.

C#

DatabaseEntity.API.DatabaseEntityAPI.GetInstance().RaiseDatabaseFailedEvent("SalesDatabase");

Visual Basic

DatabaseEntity.API.DatabaseEntityAPI.GetInstance().RaiseDatabaseFailedEvent("SalesDatabase")

5. To increment a performance counter, you call either the Increment[measure-name] or

the IncrementBy[measure-name] method of the instrumentation helper class. For

example, to increment a counter named OrdersProcessedCounter, you can use code

like the following.


C#

// increment counter by the default value
WebsiteEntity.API.WebsiteEntityAPI.GetInstance().IncrementOrdersProcessedCounter();

// increment counter by a specified value
WebsiteEntity.API.WebsiteEntityAPI.GetInstance().IncrementByOrdersProcessedCounter(5);

Visual Basic

' increment counter by the default value
WebsiteEntity.API.WebsiteEntityAPI.GetInstance().IncrementOrdersProcessedCounter()

' increment counter by a specified value
WebsiteEntity.API.WebsiteEntityAPI.GetInstance().IncrementByOrdersProcessedCounter(5)

For a detailed description of the instrumentation projects and artifacts, see Chapter 8 "Creating

Reusable Instrumentation Helpers".

Verifying Instrumentation Coverage

After developers add the instrumentation classes to a project and call them from the

application, they can perform a validation check in Visual Studio to ensure that the application

code does in fact call the instrumentation methods of the generated API classes. The verification

check confirms that the code makes at least one call to an overload of every method defined in

the instrumentation classes. Developers can also use the verification process to provide a

checklist of tasks when instrumenting applications. Figure 1 shows a case where the application

does not make calls to the helper methods.


Figure 1

The error list generated by the Verify Instrumentation Coverage recipe

To verify instrumentation coverage for a project

1. In Visual Studio, ensure that the TSMMD guidance package is enabled:

a. On the Tools menu, click Guidance Package Manager.

b. In the Guidance Package Manager dialog box, click the Enable/Disable

Packages button.

c. In the Enable and Disable Packages dialog box, select the TSMMD

Instrumentation and TSMMD Management Pack Generation check boxes.

d. In the Enable and Disable Packages dialog box, click OK, and then click Close in

the Guidance Package Manager dialog box.

2. In Solution Explorer, right-click the .tsmmd model file, and then click Verify

Instrumentation Coverage (C#).

3. The TSMMD looks in your solution projects for calls to all of the abstract

instrumentation methods defined in the instrumentation helper classes. Any missing

calls (instrumentation methods that you do not call from the application) appear in the

Visual Studio Error List window.


In the current release of the TSMMD, you can only verify coverage for applications written in

Visual Basic and C#. If you create your application using any other language, the TSMMD will

not be able to locate calls to the instrumentation, and will report an error.

An additional limitation in this release is that the instrumentation discovery process will not

locate instrumentation in an ASP.NET Web application written in Visual Basic.

Deploying the Application Instrumentation

When deploying and installing an application instrumented using the TSMMD tool, you must

also install the instrumentation used by the application. You achieve this by building your

solution in the usual way, and then running the installation utility against each instrumentation

technology DLL. The installation utility InstallUtil.exe is part of the default installation of the

.NET Framework.

Depending on the instrumentation defined in your management model, you may need to install

one or more of the following technology DLLs:

• EventLogEventsInstaller.dll

• WindowsEventing6EventsInstaller.dll

• WmiEventsInstaller.dll

• PerformanceCountersInstaller.dll

If you include Enterprise Library Log Events in your model, the configuration file created by the

TSMMD will contain the configuration information that Enterprise Library requires. You must

copy this into your application configuration file, as described in the section Specifying the

Runtime Target Environment and Instrumentation Levels. You must also ensure that Enterprise

Library is installed on the target computer(s) where you deploy your application.

Installing Event Log Functionality

Before your application can write event log entries, you must specify settings for the event log in

the Windows registry. These changes require administrative rights over the local computer, so

should occur when the application is installed, and not at runtime.

The instrumentation generation process creates a suitable EventLogEventsInstaller class in the EventLogEventsInstaller subfolder. You can use the InstallUtil.exe utility with the EventLogEventsInstaller class to install the event logs with your application.

The EventLogEventsInstaller class can install event logs only on the local computer.
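For illustration, the following minimal sketch shows the kind of installer class involved, built on the standard .NET EventLogInstaller; the class, source, and log names here are hypothetical, not the generated code.

C#

using System.ComponentModel;
using System.Configuration.Install;
using System.Diagnostics;

[RunInstaller(true)]
public class SampleEventLogInstaller : Installer
{
    public SampleEventLogInstaller()
    {
        // EventLogInstaller writes the log and event source registration to
        // the registry, which is why administrative rights are required and
        // why this runs at installation time rather than at runtime.
        var logInstaller = new EventLogInstaller
        {
            Source = "CustomerApplication",  // hypothetical event source
            Log = "Application"
        };
        Installers.Add(logInstaller);
    }
}

Running InstallUtil.exe against the assembly that contains a class such as this creates the registry entries; running it with the /u option removes them.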


Installing Windows Eventing 6.0 Functionality

The TSMMD creates a Windows Eventing 6.0 manifest file if the model defines any Windows

Eventing 6.0 events. Before your application can write event log entries, you must install the

publisher file, including the manifest, on the target system. To do this, you use the Wevtutil.exe

utility. The command you must execute on the target system is:

wevtutil install-manifest EventsDeclaration.man

You will usually execute this command during the installation process for your application. The

Wevtutil utility can usually be executed only by members of the Administrators group, and must

run with elevated privileges.

The TSMMD can also generate a Windows Eventing 6.0 view that allows you to display events

from your application in a custom view of the Event Log. To create a Windows Eventing 6.0

view, right-click on the top-level entry in the Management Model Explorer window and click

Generate Windows Eventing 6.0 View. The TSMMD creates a new XML view file named [model-

name]View.xml and opens it in Visual Studio.

Publishing the Schema for an Instrumented Assembly to WMI

If your application instrumentation definition includes any WMI Events, you must register the

appropriate WMI schema in the WMI repository. The instrumentation generation process

creates a suitable WmiEventsInstaller class in the WmiEventsInstaller subfolder. You can use the

InstallUtil.exe utility with the WmiEventsInstaller class to install the events with your

application.

As a convenience for developers at design time, Windows automatically publishes a WMI

schema the first time an application raises an event or publishes an instance. This avoids the

requirement to declare a project installer and run the InstallUtil.exe utility during prototyping

of an application. However, this registration will succeed only if the user invoking it is a

member of the Local Administrators group, and therefore you should not rely on this approach

as a mechanism for publishing the schema.
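To illustrate the mechanism, the following sketch uses the System.Management.Instrumentation types to declare a WMI event schema and fire an event; the namespace, class, and field names are hypothetical, and this is not the generated code.

C#

using System.Management.Instrumentation;

// Declares the WMI namespace into which the schema is published (hypothetical).
[assembly: Instrumented(@"root\NorthernElectronics")]

// Marks this class as a WMI event; an installer class derived from
// DefaultManagementProjectInstaller (the role the generated WmiEventsInstaller
// plays) registers the schema when run with InstallUtil.exe.
[InstrumentationClass(InstrumentationType.Event)]
public class DatabaseFailedWmiEvent
{
    public string DatabaseName;
}

public static class WmiEventSample
{
    public static void RaiseDatabaseFailed(string databaseName)
    {
        // Fires the event to any WMI subscribers at runtime.
        Instrumentation.Fire(new DatabaseFailedWmiEvent { DatabaseName = databaseName });
    }
}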

Installing Performance Counters

If your application instrumentation definition includes any Windows Performance Counters, you

must register these before the application can use them. The instrumentation generation

process creates a suitable PerformanceCountersInstaller class in the

PerformanceCountersInstaller subfolder. You should edit this file before use to set the value of

the CounterHelp property for each counter instance. This value determines the help text shown

in the Performance Counter viewer and monitoring tools that display the counter values. You

can use the InstallUtil.exe utility with the PerformanceCountersInstaller class to install the

counters with your application.
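As an illustration of what such an installer does, the following minimal sketch registers a single counter with the standard .NET PerformanceCounterInstaller; the category and counter names are hypothetical, and the second argument to CounterCreationData is the CounterHelp text described above.

C#

using System.ComponentModel;
using System.Configuration.Install;
using System.Diagnostics;

[RunInstaller(true)]
public class SampleCountersInstaller : Installer
{
    public SampleCountersInstaller()
    {
        var counterInstaller = new PerformanceCounterInstaller
        {
            CategoryName = "WebsiteEntity"  // hypothetical counter category
        };

        // The CounterHelp text (second argument) appears in Performance
        // Monitor and other tools that display the counter.
        counterInstaller.Counters.Add(new CounterCreationData(
            "OrdersProcessedCounter",
            "Total number of orders processed by the Web site.",
            PerformanceCounterType.NumberOfItems32));

        Installers.Add(counterInstaller);
    }
}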


Using a Batch File to Install Instrumentation

The easiest way to install instrumentation using the classes described in the previous sections is

with a batch file that executes the Installutil.exe utility. The following listing shows an example

batch file.

InstallUtil Instrumentation\EventLogEventsInstaller\bin\Debug\EventLogEventsInstaller.dll

InstallUtil Instrumentation\PerformanceCountersInstaller\bin\Debug\PerformanceCountersInstaller.dll

InstallUtil Instrumentation\WmiEventsInstaller\bin\Debug\WmiEventsInstaller.dll

Using the Event Messages File

In order for the text of event messages to appear in the Windows Event Log, you must register the

assembly containing these messages on the target system. You can use the event messages file

that the TSMMD generates.

To install the Event Messages file

1. Add a reference to the EventLogEventsInstaller.dll located in the Instrumentation folder

to your application project. This allows Visual Studio to copy this assembly, which

contains install information, into the execution directory of the application.

2. In Solution Explorer, right-click on the top-level solution entry and click Rebuild All.

3. In Windows Explorer, navigate to the Installation subfolder of your solution, and copy the Source1_Messages.dll file from the output directory of the EventLogEventsInstaller project into the execution folder of your application.

4. Open a Visual Studio Command Prompt window, navigate to the execution folder of

your application, and register the event messages assembly by executing the following

command:

InstallUtil EventLogEventsInstaller.dll /i

Specifying the Runtime Target Environment and Instrumentation Levels

The code generation process creates the instrumentation code configuration file, named

InstrumentationConfiguration.config, in the Instrumentation folder of the project. This file,

shown in the following listing, contains the information that developers or operators will copy

into their application configuration files, and edit to specify the target environment under which

the application will run and the required instrumentation granularity.

<configuration>

  <configSections>

    <section name="tsmmd.instrumentation"
             type="Microsoft.Practices.DFO.Guidance.Configuration.ApplicationHealthSection, Microsoft.Practices.DFO.Guidance.Configuration"/>

    <!-- this section is included if the model contains Enterprise Library Log Events -->
    <section name="loggingConfiguration"
             type="Microsoft.Practices.EnterpriseLibrary.Logging.Configuration.LoggingSettings, Microsoft.Practices.EnterpriseLibrary.Logging, Version=3.1.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />

  </configSections>

  <tsmmd.instrumentation>

    <!--
      Attribute "targetEnvironment" can have the values:
        Extranet
        LocalIntranet
      Attribute "instrumentationLevel" can have the values:
        Off
        Coarse
        Fine
        Debug
    -->
    <add name="CustomerDatabase" targetEnvironment="LocalIntranet"
         instrumentationLevel="Coarse"/>
    <add name="CustomerWebService" targetEnvironment="LocalIntranet"
         instrumentationLevel="Coarse"/>
    ...
    ... all other managed entities in the model are listed here ...
    ...

  </tsmmd.instrumentation>

  <!-- this section is included if the model contains Enterprise Library Log Events -->
  <loggingConfiguration name="Logging Application Block" tracingEnabled="true"
                        defaultCategory="General"
                        logWarningsWhenNoCategoriesMatch="true">
    ...
    ... default logging configuration here ...
    ...
  </loggingConfiguration>

</configuration>

Developers copy the contents of this file (excluding the <configuration> element) into their

application configuration file and edit the values as required. The <tsmmd.instrumentation>

element contains an <add> element for each managed entity in the model, identified by the

entity name. Each <add> element defines two other attributes:

• targetEnvironment. This is one of the target environments defined in the model, and

controls which of the concrete event and measure (counter) implementations the

abstract API class methods will use in the application at runtime. It defines mapping

between the target environments in the model and the concrete event and measure

implementations.

• instrumentationLevel. This indicates the level at which the instrumentation will raise

events or increment counters. Every abstract event and measure in the original model

defines a value for its Instrumentation Level property. The options are Coarse (all

operations), Fine (diagnostic and debug operations only), Debug (debug operations

only), and Off (instrumentation disabled).

The following table shows how the combination of the Instrumentation Level property of an

event and the setting of the instrumentationLevel attribute in the configuration file affects the

raising of events.

Instrumentation level (event)    Overall instrumentation level    Event raised?

Coarse                           Coarse                           Yes
Coarse                           Fine                             Yes
Coarse                           Debug                            Yes
Fine                             Coarse                           No
Fine                             Fine                             Yes
Fine                             Debug                            Yes
Debug                            Coarse                           No
Debug                            Fine                             No
Debug                            Debug                            Yes
Off                              Coarse                           No
Off                              Fine                             No
Off                              Debug                            No
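In other words, an event is raised only when neither level is Off and the event's own level is no finer than the overall level configured for the entity. The following minimal sketch expresses the rule in the table; the type and member names are assumed for illustration and do not come from the TSMMD-generated code.

C#

public enum InstrumentationLevel { Off = 0, Coarse = 1, Fine = 2, Debug = 3 }

public static class InstrumentationFilter
{
    // Returns true when an event with the given level should be raised under
    // the overall instrumentation level configured for the managed entity.
    public static bool ShouldRaise(InstrumentationLevel eventLevel,
                                   InstrumentationLevel overallLevel)
    {
        return eventLevel != InstrumentationLevel.Off
            && overallLevel != InstrumentationLevel.Off
            && eventLevel <= overallLevel;
    }
}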


The instrumentation configuration file, named InstrumentationConfiguration.config, resides with the generated instrumentation classes in the Instrumentation folder of the TSMMD solution. Its <tsmmd.instrumentation> section contains an <add> element for each managed entity in your application, which defines the target environment within which that entity will execute and the granularity of its instrumentation. When you deploy your application, you must copy the contents of this file (excluding the <configuration> element) into your application configuration file and edit the values to specify the appropriate settings.

If you specified any Enterprise Library Log Events in your model, you must also copy the entire

<loggingConfiguration> section, and the corresponding <section> element from the

<configSections> section, into your application configuration file.

Remember that the term "target environment" refers to the capability for specifying multiple

events or performance counters for an aspect of an entity, and having the entity use a specific

one of these events or counters at runtime depending on the requirements of the application,

the execution permissions available, and the limitations of the runtime environment.

To specify runtime target environment and instrumentation levels

1. Open the file named InstrumentationConfiguration.config in Visual Studio or any other

text or XML editor.

2. Open your application configuration file (usually Web.config or App.config) in Visual

Studio or any other text or XML editor.

3. If your application configuration file already contains a <configSections> element, copy

only the <section name="tsmmd.instrumentation" ... /> element (including all of its

attributes) from the InstrumentationConfiguration.config file into the <configSections>

element in your application configuration file. If you are using any Enterprise Library Log

Events in your model, you must also copy the <section name="loggingConfiguration"

... /> element (including all of its attributes) from the

InstrumentationConfiguration.config file into the <configSections> element in your

application configuration file.

If your application configuration file does not contain a <configSections> element, copy

the entire <configSections> element from the InstrumentationConfiguration.config file

into your application configuration file.


4. Copy the entire <tsmmd.instrumentation> element from the

InstrumentationConfiguration.config file into your application configuration file, and

place it within the root <configuration> element but outside all other elements.

5. If you are using any Enterprise Library Log Events in your model, you must also copy the

entire <loggingConfiguration> element from the InstrumentationConfiguration.config

file into your application configuration file, and place it within the root <configuration>

element but outside all other elements.

6. Close the InstrumentationConfiguration.config file.

7. In the <tsmmd.instrumentation> section of your application configuration file, locate

the <add> element for the managed entity you want to configure.

8. Edit the value of the targetEnvironment attribute to specify the name of one of the

target environments defined in the management model. A target environment maps

one of the concrete instrumentation implementations to its abstract event or measure.

Therefore, setting this attribute specifies which of the concrete implementations the

instrumentation helper code will call when the application code executes the methods

of the abstract instrumentation class.

9. Edit the value of the instrumentationLevel attribute to specify the granularity of the

instrumentation defined in the management model. The values you can use are Coarse,

Fine, Debug, and Off. Setting this attribute to one of these values controls how the

instrumentation will behave.

10. Save and close your application configuration file.

Generating Management Packs for System Center Operations Manager 2007

If you intend to monitor your application using Microsoft System Center Operations Manager

2007, you can use your management model to generate a management pack that you can install

into Operations Manager. The management pack will contain a template for a distributed

application; base classes and components for each managed entity in your model; and the

monitors and rules required to monitor the application.

The TSMMD can generate management packs automatically when you build an application, or

you can generate management packs from the command line. For more information about

generating management packs from the command line, see the documentation installed with

the TSMMD.

To generate a management pack for System Center Operations Manager 2007

1. In Visual Studio, ensure that the TSMMD guidance package is enabled:

a. On the Visual Studio Tools menu, click Guidance Package Manager.


b. In the Guidance Package Manager dialog box, click Enable/Disable Packages.

c. In the Enable and Disable Packages dialog box, make sure that the Team

System MMD Instrumentation and TSMMD Management Pack Generation for

OpsMgr 2007 check boxes are selected.

d. Click OK and Close to return to the management model designer.

2. In Management Model Explorer, right-click the top level node, and then click Generate

Management Pack for OpsMgr2007. Alternatively, right-click anywhere on the model

designer surface, and then click Generate Management Pack for OpsMgr2007. If you

cannot see Management Model Explorer, point to Other Windows on the View menu,

and then click Management Model Explorer.

3. By default, unless you changed the name of the model or the settings in the properties

page for the project, the TSMMD generates a management pack named

Application.Operations.xml in a folder named ManagementPack within the same folder

as your project. Open this folder in Windows Explorer, and open the new management

pack in a text editor to view the contents.

4. To import a TSMMD-generated management pack into System Center Operations

Manager, you must also import the standard management packs upon which the

generated management pack depends.

You can also specify the properties required for the management pack in the properties window

for the TSMMD project, and then have the TSMMD create the management pack automatically

when you build the project.

To change the settings for automatic System Center Operations Manager management pack generation

1. In Solution Explorer, right-click the TSMMD project entry and click Properties to open

the project properties window in the main Visual Studio editor pane. The Settings page

of the project properties determines if the TSMMD will automatically generate a

management pack when you build the TSMMD project, and the parameters for the

management pack generation process.

2. To enable automatic generation of a management pack, select the check box named

Enable Microsoft SCOM 2007 management pack generation at the top of the General

section.

3. Edit the default values in the text boxes below this check box as required. You can

specify the following properties:

◦ Management pack ID. This setting is the fully qualified identifier for the

management pack that the TSMMD will automatically generate. The default is

Application.[model-name]. The name must start with a letter or a number, and

contain only letters, numbers, periods, and underscore symbols. The total length must be less than 255 characters, and the value must be unique within the

scope of the System Center Operations Manager server to which you will import

the management pack.

◦ Management pack display name. This setting is the name for the management

pack. The default is the current management model name.

◦ Default namespace. This setting is the namespace in which the management

pack will reside. The default is Application.

◦ Output path. This setting is the full path for the generated management pack.

Click the Browse button next to the Output path text box and select the folder

where you want to create the management pack. The default is a folder named

ManagementPack within your project folder.

Importing a Management Pack into System Center Operations Manager 2007

After you create a management pack using the management model designer, you can import

the management pack into Microsoft System Center Operations Manager 2007.

To import a management pack into System Center Operations Manager

1. In the System Center Operations Manager Operations Console, click the Administration

tab in the left pane.

2. Right-click the top-level Administration item in the left tree view pane, and then click

Import Management Packs.

3. In the Select Management Packs to Import dialog box, navigate to the folder that

contains the management pack you created in the TSMMD, select the management

pack, and then click Open.

4. The Import Management Packs dialog box shows the management pack you selected. If

there are any prerequisites or referenced management packs that are not already

installed, the dialog box displays a warning and details of the required packs (for more

information, see the next section, "Prerequisite Management Packs"). If this warning

appears, click the Add button at the top of the dialog box, and then locate and select

the required management packs.

5. The Import Management Packs dialog box analyzes the selected management pack(s)

and indicates whether it can successfully import them. After the dialog box shows that

the analysis succeeded (every management pack has a green check mark in the list),

click Import.

6. In the Administration tree view, click the Management Packs node to see a list of all

installed management packs.


Prerequisite Management Packs

If your management pack includes references to ASP.NET applications or Web services, you

must also import the following system infrastructure management packs if you have not already

done so:

• Microsoft.Windows.InternetInformationServices.CommonLibrary.mp

• Microsoft.Windows.InternetInformationServices.2003.mp

• Microsoft.SystemCenter.ASPNET20.2007.mp

The first two of these management packs are part of the Microsoft Windows Server 2000/2003

Internet Information Services Management Pack, which you can obtain from the Microsoft

Download Center. The third management pack is provided with

System Center Operations Manager, and can be found in the %Program Files%\System Center

Operations Manager 2007\Health Service State\Management Packs folder.

Creating a New Distributed Application

Mapping a management model to a distributed application creates one instance of a distributed

application. This instance contains all instances of the managed entities. However, in some

cases, a distributed application should contain only a subsection of the managed application,

and possibly another distributed application should contain the remaining subsections. To

achieve this separation, administrators and operators must create several identical distributed

applications using application components from different classes.

It is possible to create several distributed applications with the same architecture and attach

different instances of classes (managed entities) to different distributed applications (for

example, separate environments for testing and production). To achieve this, you create a new

distributed application based on the distributed application template created by the TSMMD.

To create distributed applications using the management model template

1. Import the management pack generated by the TSMMD into System Center Operations

Manager as described in the earlier procedure "Importing a Management Pack".

2. In the System Center Operations Manager Operations Console, click the Authoring tab

in the left pane. If you see the Overview page describing tasks required, click the Go to

Distributed Applications link.

3. Click the Create a new distributed application icon on the toolbar or the Create a new

distributed application link in the Actions pane on the right of the main console

window.

4. This starts the Distributed Application Designer wizard. Enter a name and description

for the distributed application. Then, in the Template list, select the template that the

TSMMD generated within the management pack. The name of this template is

"Template for your-model-name".


5. Specify the location to store the management pack for the new distributed application.

This should be the management pack generated by the TSMMD, which contains the

template. If this management pack is sealed, select an existing management pack that is

not sealed.

6. Click OK. Then, when the wizard finishes, the Operations Manager Distributed

Application Designer window contains a diagram similar to the TSMMD model. This

diagram shows the components of the distributed application, and the left pane of the

window shows lists of class instances by component type.

7. Drag a class instance from the list onto the component in the diagram. You can only

attach one component type to any instance of an Executable Application component in

the designer. However, the Windows Service, ASP.NET Web Application, and ASP.NET

Web Service components have extensions. You can attach the base class and the

extension class to these types of component, as shown in Figure 2.

Figure 2

Specifying class instances for components in the Distributed Application Designer

8. The designer will create the common dependency and roll-up monitors for the

distributed application. However, you can delete some components if required; for

example, if you are creating separate environments for testing and production.

9. Click Save to create the new distributed application, or to save your changes if you are

editing an existing distributed application.

10. Unlike the original distributed application, you can modify this new distributed application afterward, if required, by using the Operations Manager management console.

For details of how to edit and use a management pack for an application, see Chapter 17

"Creating and Using System Center Operations Manager 2007 Management Packs".


Appendix B

Walkthrough of the Team System

Management Model Designer Power

Tool

This topic contains a simple hands-on demonstration of the Team System Management Model

Designer Power Tool (TSMMD) that will help you understand what it does and how you can use

it.

Note that this walkthrough describes the minimum set of steps required to build a management

model and health definition, generate instrumentation, and generate a System Center

Operations Manager 2007 management pack. It does not implement good programming

practices, but it will serve as a valuable starting point for understanding the DFO process and the

TSMMD.

The process divides into discrete sections, so that you can complete as many as you want.

However, you must complete the first section if you want to generate the instrumentation code

and an Operations Manager management pack. The following are the four sections:

• Building a Management Model

• Generating the Instrumentation Code

• Testing the Model with a Windows Forms Application

• Generating an Operations Manager 2007 Management Pack

Building a Management Model

The first task is to build a management model using the Team System Management Model

Designer Power Tool (TSMMD). You will create a new TSMMD solution, and then you will create

the graphical model and specify the instrumentation for it.

To create the new TSMMD solution

1. Start Visual Studio 2008 Team System Edition, click the File menu, point to New, and

then click Project.

2. In the New Project dialog box, click TSMMD Project in the list of project types, and then

click TSMMD Project in the list of projects. Enter a name and location for the new

project and click OK. This creates a new TSMMD project containing a new management


model named operations.tsmmd. The Management Model Explorer window appears

showing this new empty model, and the blank model designer surface appears in the

main window.

If you cannot see the Management Model Explorer window, click the View menu,

point to Other Windows, and click Management Model Explorer.

3. Ensure that the guidance packages for the TSMMD are loaded. To do this, click

Guidance Package Manager on the Visual Studio Tools menu. If the list of recipes in the

Guidance Package Manager dialog box does not contain any entries that apply to Team

System Management Model, follow these steps to enable the recipes:

◦ Click the Enable/Disable Packages button.

◦ Select the two guidance packages named Team System MMD Instrumentation

and Team System MMD Management Pack Generation.

◦ Click OK to return to the Guidance Package Manager dialog box.

◦ Click Close to close the Guidance Package Manager dialog box.

If you do not see the two guidance packages in the list, you may need to reinstall the

TSMMD guidance package.

4. In Management Model Explorer, select the top-level item named Operations. In the

Visual Studio Properties window, change the Name property to MyTestModel, and then

enter some text for the Description and Knowledgebase properties. If you cannot see

the Properties window, press F4.

5. In Management Model Explorer, expand the Target Environments node and select the

target environment named Default. Change the value of the Event Log property to True

to indicate that you require instrumentation that writes to the Windows Event Log.

You use the properties of a target environment to specify that you require any

combination of Enterprise Library Logging events, Windows Event Log events, trace file

events, Windows Eventing 6.0 events, Windows Management Instrumentation (WMI)

events, and Windows performance counters for that target environment. You can also

add more than one target environment to a model to describe different deployment

scenarios.

The next stage is to create the graphical representation of the application entities.

To create the new management model

1. In Management Model Explorer, right-click the top-level MyTestModel entry, then click

New Managed Entity Wizard. Enter the name CustomerApplication for this entity,

select Executable Application in the drop-down list, type a description for this entity in

the Description box, as shown in Figure 1, and then click Next.


Figure 1

First page of the Add New Managed Entity wizard

Alternatively, you can right-click the top-level MyTestModel entry and then click Add

New Executable Application or you can drag an Executable Application control from

the Toolbox onto the designer surface and then edit the properties in the Properties

window.

2. On the Specify Managed Entity properties page of the wizard, make sure FilePath is

selected in the Discovery Type box, then type %Program

Files%\CustomerApplication.exe in the Discovery Target box as shown in Figure 2.

Monitoring systems such as System Center Operations Manager use the settings on this

page (which are exposed in the management pack you generate) to check whether the

application is installed on a specific target computer. Click Finish to create the new

CustomerApplication managed entity, which appears on the designer surface. The

Properties window shows the settings and values you entered in the wizard.


Figure 2

Last page of the Add New Managed Entity wizard

Some managed entity types, such as ASP.NET Application and ASP.NET Web Service,

have extender properties that specify additional settings for the management pack

generated by the TSMMD.

3. Drag an External Managed Entity control from the Toolbox onto the designer surface.

In the Properties window, change the value of the Name property to

CustomerDatabase.

The wizard does not allow you to create unmanaged entities because the only

property they have is the name. Unmanaged entities act as connectors or placeholders

for parts of the overall application or system that are outside the management scope.

4. In the Toolbox, click the Connection control, click the CustomerApplication entity, and

then click the CustomerDatabase entity. This creates the connection between the two

entities. You can edit or delete the Text property for the connection in the Visual Studio

Properties window.


5. In Management Model Explorer, expand the Managed Entities node to see the two

entities you added to the diagram. Notice that the External Managed Entity (named

CustomerDatabase) has no instrumentation or health sections. You do not create

instrumentation or health definitions for External Managed Entities. Figure 3 shows the

model at this stage.

6. On the Visual Studio File menu, click Save All.

Figure 3

The graphical representation of the application entities

The next stage is to populate the health definition and instrumentation sections of the

management model. The health model defines the health states for each entity as a series of

aspects and the indicators (the instrumentation) that cause a transition in these health states.

You can add events, measures, and aspects to the model individually and set their properties as

you develop and fine-tune your model. However, the TSMMD provides a wizard that helps you

create a new aspect and specify the associated instrumentation. When you build a complex

model, you will probably have to iterate through the process of using the wizard, and then

manually add and edit items in the graphical model as it evolves. However, the wizard makes it

easy to start adding instrumentation and health definitions to the model.
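Before you run the wizard, it may help to see the two-state idea that the aspect you are about to create will capture: one event drives the aspect to its RED state, and another returns it to GREEN. The sketch below is purely illustrative; the helper and method names are invented for this example, and the real instrumentation helpers are generated later in the walkthrough.

C#

using System;

// Purely illustrative: a two-state (Green-Red) availability aspect is
// driven by two events. TryOpenConnection stands in for real connection
// logic; the Raise methods stand in for generated instrumentation helpers.
class TwoStateAspectSketch
{
    static void Main()
    {
        string database = "CustomerDatabase";

        if (TryOpenConnection(database))
            RaiseDatabaseConnectionRestored(database); // GREEN transition
        else
            RaiseNoDatabaseConnection(database);       // RED transition
    }

    static bool TryOpenConnection(string database)
    {
        return false; // stub for the real connection attempt
    }

    static void RaiseNoDatabaseConnection(string database)
    {
        Console.WriteLine("RED: no connection to " + database);
    }

    static void RaiseDatabaseConnectionRestored(string database)
    {
        Console.WriteLine("GREEN: connection to " + database + " restored");
    }
}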

To add a health definition aspect and the associated instrumentation to the management model

1. In Management Model Explorer, right-click the top-level MyTestModel entry, and then click Validate All. The Visual Studio Error List window will show a warning indicating that you must define at least one event or measure for the managed entity (CustomerApplication).

This is a useful way to check that your model is valid as you work with it. You can also validate individual sections of the model. For example, to check only the managed instrumentation for this entity, right-click the Managed Instrumentation child node of the CustomerApplication node in Management Model Explorer, and then click Validate.

2. In Management Model Explorer, right-click the CustomerApplication node, and then click New Aspect Wizard. Specify the following values in the first page of the wizard, as shown in Figure 4, and then click Next:

◦ Type NoDatabaseConnection in the Aspect name box

◦ Select Availability in the Aspect category drop-down list

◦ Type some explanatory text for the aspect in the Aspect knowledgebase box

◦ Click Event in the Aspect based on list

◦ Click Green-Red in the Aspect States list

Figure 4

First page of the Add New Aspect wizard

These settings specify that you want to implement a two-state health indicator for this aspect, which will be driven by two events: one that indicates the connection failed (RED), and one that indicates the connection is available or restored (GREEN). If you want to implement instrumentation that displays a warning, select Green-Red-Yellow instead; you will then need to specify three events. Alternatively, you can base an aspect on a performance counter by selecting Measure instead of Event.

3. On the next page of the wizard, you specify the events for the NoDatabaseConnection aspect. Click the ellipsis button (...) next to the Green Health State Event text box to open the Browse Events dialog box (shown in Figure 5). The dialog box is currently empty because your model does not define any events yet.

Figure 5

The Browse Events dialog box where you select an existing event or create a new event

4. In the Browse Events dialog box, click New.

5. In the Create New Event dialog box, type NoDatabaseConnection in the Event Name box, select Coarse in the Level drop-down list (if it is not already selected), and then click OK, as shown in Figure 6. This adds the new event to the Browse Events list.


Figure 6

Create New Event dialog box

6. In the Browse Events dialog box, click NoDatabaseConnection in the Events list, and then click OK. This adds the NoDatabaseConnection event to the Green Health State Event box of the Add New Aspect wizard.

7. Repeat the process for the Red Health State. To do this:

◦ Click the ellipsis button (...) next to the Red Health State Event text box

◦ Click New in the Browse Events dialog box

◦ Type DatabaseConnectionRestored in the Event Name box and select Coarse in the Level drop-down list

◦ Click OK in the Create New Event dialog box

◦ In the Browse Events dialog box, click DatabaseConnectionRestored in the Events list and then click OK

8. This adds the DatabaseConnectionRestored event to the Red Health State Event box of the Add New Aspect wizard. You now have events defined for both of the states of the NoDatabaseConnection aspect, as shown in Figure 7.


Figure 7

The two new events specified for the NoDatabaseConnection aspect

9. Click Finish. The wizard creates the new aspect named NoDatabaseConnection and the abstract event implementations NoDatabaseConnection and DatabaseConnectionRestored. You can examine the new aspect and events in Management Model Explorer, as shown in Figure 8.


Figure 8

The new aspect and events in Management Model Explorer

You can define parameters for events, which the instrumentation will populate and expose to the Windows event system when the event is raised. In this example, the two events will pass the name of the database as a parameter to the event system. Therefore, the next step is to define these parameters.

10. In Management Model Explorer, right-click the NoDatabaseConnection node, and then click Add New Event Parameter. In the Properties window for the new parameter, make sure the Index property is set to 1 and the Type property is set to String. Change the Name property to DatabaseName.

11. Repeat this process for the DatabaseConnectionRestored event by adding a new parameter and changing the Name property to DatabaseName. Figure 9 shows the result.


Figure 9

The events and their parameters shown in Management Model Explorer

The two events you have defined are abstract events. Now you must create the concrete implementations of these events. Each abstract event and measure must have an implementation for every target environment in the model. In this example, you use just the default target environment; therefore, you require only one concrete implementation of each event.

12. In Management Model Explorer, right-click the NoDatabaseConnection node, and then click New Event Implementation Wizard. The first page of the wizard shows any discovered (existing) events in your application and the managed implementations you must create for each target environment (see Figure 10). There are no discovered events in this example, and the wizard shows only the single implementation technology you specified when you created the model (Event Log), which is selected by default.


Figure 10

First page of the New Event Implementation Wizard

13. On this page of the wizard, click Next.

14. On the next page of the wizard, you specify the properties for the Event Log event, as shown in Figure 11. For the example application:

◦ Type NoDatabaseConnectionEvent in the Name box

◦ Type Application in the Log Name box (if it is not already there)

◦ Select Error in the Severity list box (if it is not already selected)

◦ Leave the default setting in the Source box

◦ Type 9000 in the Event Id box

◦ Type Database name: %1 in the Message Template box

◦ Click Finish


Figure 11

Last page of the New Event Implementation Wizard

The value Database name: %1 in the Message Template box is a string that will be passed to the event system; it must contain a placeholder (%1) for the event parameter you defined when you created the abstract event definition. In general, a Message Template string must include a placeholder for each parameter you define for an event. The placeholders start at "%1" and run consecutively up to the number of parameters you define for that abstract event. (The code sketch after this procedure shows the effect of the placeholder.)

The wizard creates the configurable implementation of the event. You can view this event in Management Model Explorer and see the property values you specified in the Properties window. Figure 12 shows both of these windows at this stage.


Figure 12

The concrete implementation of the NoDatabaseConnection event

15. Repeat this process to create an Event Log Event implementation for the DatabaseConnectionRestored event. To do this, run the New Event Implementation Wizard again, change the event name to DatabaseConnectionRestoredEvent, change the event ID to 9001, set the severity to Information, and enter the value Database name: %1 in the Message Template box.

If you have more than one target environment in the model, the wizard will display a dialog box to collect information for the appropriate event or measure implementation(s) for each target environment.

16. To confirm that you have created a valid model, right-click anywhere in Management Model Explorer, and then click Validate All. You should see the following message in the Visual Studio Output window:

---- Validation started: Model elements validated: 58 ----
==== Validation complete: 0 errors, 0 warnings, 0 information messages =====

17. On the Visual Studio File menu, click Save All.
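As a side note on the Message Template placeholders, the following sketch approximates what a generated Event Log implementation does when the NoDatabaseConnection event is raised. It is illustrative only; the actual generated code is structured differently, and the event source shown here is the one this walkthrough produces later. The point is that the replacement string supplied with the event fills the %1 placeholder in the Database name: %1 template.

C#

using System.Diagnostics;

// Illustrative only: roughly what the generated Event Log implementation
// does for the NoDatabaseConnection event (ID 9000, severity Error).
class MessageTemplateSketch
{
    static void Main()
    {
        const string source = "MyTestModel_CustomerApplication";

        // Creating an event source requires administrative rights.
        if (!EventLog.SourceExists(source))
            EventLog.CreateEventSource(source, "Application");

        // The single replacement string ("CustomerDatabase") fills the
        // %1 placeholder in the "Database name: %1" message template.
        EventInstance instance = new EventInstance(9000, 0, EventLogEntryType.Error);
        EventLog.WriteEvent(source, instance, "CustomerDatabase");
    }
}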

You have now completed the simple health model definition and instrumentation definition for the application, and you have validated the model. Figure 13 shows the model at this stage.


Figure 13

The management model showing the complete managed instrumentation definition

Of course, you will usually add more aspects to the model and specify the appropriate events and measures (performance counters). Remember that the correct approach during application and system design is to identify the health states and transitions first, which leads to the definition of the instrumentation required to surface these transitions. This simple walkthrough is designed to help you gain experience with the Team System Management Model Designer Power Tool, so it assumes that you have previously identified the health states.

Generating the Instrumentation Code

With the management model now complete, you can use the Team System Management Model Designer Power Tool to create the instrumentation code for your application. In this section of the walkthrough, you will use a recipe within the TSMMD guidance package to create the helper classes, instrumentation implementations, and the configuration information required for the simple management model you created in the previous sections of the walkthrough.

To generate the instrumentation code for the application

1. In Management Model Explorer, right-click the top-level MyTestModel entry, and then click Generate Instrumentation Helper. This first validates the model, and then generates the instrumentation projects, classes, and artifacts. When it is complete, you will see the file InstrumentationConfiguration.config open in the main Visual Studio edit pane. Close this file before you continue with the walkthrough.

2. Open the Visual Studio Solution Explorer window. You will see a new folder named Instrumentation that contains the instrumentation projects, classes, and artifacts.

3. Now you can use the TSMMD to verify the instrumentation coverage in the application. In the Visual Studio Solution Explorer window, double-click the management model file operations.tsmmd (in the project folder of your solution) to show the model designer and Management Model Explorer.

4. In Management Model Explorer, right-click the top-level Management Model entry and click Verify Instrumentation Coverage. You will see two errors in the Visual Studio Error List window indicating that your application code does not invoke the two methods defined in your generated instrumentation code. This is the expected result at this stage because you have not yet built the application. You will do this in the next stage of the walkthrough.

5. On the Visual Studio File menu, click Save All.

As you saw in this section of the walkthrough, the TSMMD can create the instrumentation helper classes for an application and verify that your application actually does invoke all of the instrumentation in the model. In other words, the application should raise every abstract event and increment each abstract counter in at least one location in the code. Figure 1 shows the instrumentation generated at this stage of the walkthrough.


Figure 1

The instrumentation projects, classes, and artifacts generated by the TSMMD

Testing the Model with a Windows Forms Application

With the instrumentation classes now available, you are ready to create a minimal application, connect the instrumentation, and verify instrumentation coverage. Then you can configure the application, compile it, and run it to ensure that the instrumentation code works correctly.

To create the test application and verify the instrumentation coverage

1. On the Visual Studio File menu, point to Add, and then click New Project.

2. In the Add New Project dialog box, expand the Visual C# entry in the project types list, and then click Windows.

3. In the list of templates, click Windows Forms Application. Change the name to CustomerApplication and change the location to specify a new subfolder named CustomerApplication in your solution folder, as shown in Figure 1. Then click OK.

Figure 1

Adding a new application project to the solution

4. In Solution Explorer, right-click the References node in the new CustomerApplication project, and then click Add Reference.

5. On the Projects tab of the Add Reference dialog box, hold down the CTRL key and click CustomerApplication.API and CustomerApplication.Default.Impl (so that both are selected), and then click OK.

6. In the designer for Form1.cs, drag two Button controls from the Common Controls section of the Toolbox onto the form. Change the Text property of the first button to Database Connected, and change the Text property of the second button to Connection Lost. Resize the buttons and the form so that you can see the captions.

7. Double-click the Database Connected button to open the code editor with the insertion point in the button1_Click method. Add the following line of code to the method.

C#

CustomerApplication.API.CustomerApplicationAPI.GetInstance().RaiseDatabaseConnectionRestored("CustomerDatabase");

Visual Studio's IntelliSense feature will help you to enter the code quickly and easily.

8. Double-click the Connection Lost button to open the code editor with the insertion point in the button2_Click method. Add the following line of code to the method.

C#

CustomerApplication.API.CustomerApplicationAPI.GetInstance().RaiseNoDatabaseConnection("CustomerDatabase");

Notice that these events expect you to specify a parameter: the DatabaseName parameter that you defined in the model. For details of the methods exposed by the instrumentation helper classes, see Using the Generated Instrumentation Code in Applications. (A rough sketch of the shape of these helpers appears after this procedure.)

9. On the Visual Studio File menu, click Save All, and then close the code editor and Form1 designer windows.

10. In Management Model Explorer, right-click the top-level MyTestModel entry, and then click Verify Instrumentation Coverage. You should see that the Visual Studio Error List window now contains no errors or warnings, because your code now invokes all the abstract events defined in the model.

Figure 2 shows the completed test application in the Visual Studio designer.


Figure 2

The completed test application
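The two methods you called in steps 7 and 8 are part of the instrumentation helper API that the TSMMD generated earlier. The following is a hand-written approximation of its shape, inferred from those calls; the actual generated code is more elaborate (for example, it reads the configuration file to select the concrete implementation for the configured target environment and honors the configured instrumentation level).

C#

using System.Diagnostics;

// A hand-written approximation of the generated helper API; it is not
// the actual TSMMD output. The Event Log details (source, IDs, severities)
// match the concrete implementations defined earlier in the walkthrough.
namespace CustomerApplication.API
{
    public class CustomerApplicationAPI
    {
        private static readonly CustomerApplicationAPI instance =
            new CustomerApplicationAPI();

        private CustomerApplicationAPI() { }

        public static CustomerApplicationAPI GetInstance()
        {
            return instance;
        }

        // Abstract event: NoDatabaseConnection. The concrete Event Log
        // implementation uses ID 9000 with severity Error.
        public void RaiseNoDatabaseConnection(string databaseName)
        {
            EventLog.WriteEntry("MyTestModel_CustomerApplication",
                "Database name: " + databaseName,
                EventLogEntryType.Error, 9000);
        }

        // Abstract event: DatabaseConnectionRestored. The concrete Event
        // Log implementation uses ID 9001 with severity Information.
        public void RaiseDatabaseConnectionRestored(string databaseName)
        {
            EventLog.WriteEntry("MyTestModel_CustomerApplication",
                "Database name: " + databaseName,
                EventLogEntryType.Information, 9001);
        }
    }
}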

You are now ready to run the application, but first you must configure it. The instrumentation generation routines in the TSMMD create a configuration file that allows administrators to specify the target environment and the granularity of the instrumentation. You must copy the contents of this file into your application configuration file (App.config or Web.config) and edit the contents before you run the application.

To configure and run the test application

1. In Solution Explorer, right-click the CustomerApplication project entry, point to Add, and then click New Item.

2. In the Add New Item dialog box, select Application Configuration File, and then click Add.

3. In Solution Explorer, double-click the InstrumentationConfiguration.config file (located in the Instrumentation folder of the main solution) to open it in the editor. Select the entire contents of the <configuration> element, excluding the opening and closing <configuration> tags, and copy it into the App.config file between the opening and closing <configuration> tags.

By default, the configuration settings you just added to the App.config file specify the Default target environment for the CustomerApplication entity, with the instrumentation level set to Coarse. These are the values you need. If you created other target environments, or specified different levels for the instrumentation, you would edit the values of the targetEnvironment and instrumentationLevel attributes of the <add> element for each of the managed entities in your model. (A sketch of these settings appears at the end of this section.)

4. In Solution Explorer, right-click the CustomerApplication project entry, and then click Set as Startup Project.

5. Press F5 to run the test application. Click the Connection Lost button to raise the NoDatabaseConnection event, click the Database Connected button to raise the DatabaseConnectionRestored event, and then close the test application.

6. In Control Panel, open Windows Event Viewer from the Settings item or Administrative Tools item, and view the contents of the Application log. You will see the two events (with the Source set to MyTestModel_CustomerApplication) raised by the instrumentation in the test application. (The short program that follows shows how to list these entries programmatically.)
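If you prefer to check the result in code rather than in Event Viewer, a small console program along the following lines (not part of the walkthrough itself) lists the entries the test application wrote:

C#

using System;
using System.Diagnostics;

// Lists the Application log entries written by the test application.
class ReadTestEvents
{
    static void Main()
    {
        EventLog log = new EventLog("Application");

        foreach (EventLogEntry entry in log.Entries)
        {
            if (entry.Source == "MyTestModel_CustomerApplication")
            {
                Console.WriteLine("{0} {1}: {2}",
                    entry.TimeGenerated, entry.EntryType, entry.Message);
            }
        }
    }
}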

Notice that, although you raised the abstract events in your test application (NoDatabaseConnection and DatabaseConnectionRestored), the settings in the application configuration file specify that the instrumentation helpers should raise the concrete implementations of these events that you mapped to the Default target environment (the Event Log Events named NoDatabaseConnectionEvent and DatabaseConnectionRestoredEvent).
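This selection is driven by the configuration you copied in step 3, which looks something like the following. This is a sketch only: the targetEnvironment and instrumentationLevel attributes are the settings described above, but the surrounding section and element names are assumptions, so treat the generated InstrumentationConfiguration.config file as authoritative.

XML

<!-- Sketch only: the attribute names are as described in step 3; the
     section and element names here are illustrative assumptions. -->
<configuration>
  <instrumentation>
    <managedEntities>
      <add name="CustomerApplication"
           targetEnvironment="Default"
           instrumentationLevel="Coarse" />
    </managedEntities>
  </instrumentation>
</configuration>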

Generating an Operations Manager 2007 Management Pack

If you intend to monitor your application using Microsoft System Center Operations Manager 2007, you can use your management model to generate a management pack that you can install into Operations Manager. The management pack will contain a template for a distributed application; base classes and components for each managed entity in your model; and the monitors and rules required to monitor the application.

To generate a management pack for System Center Operations Manager 2007

1. In Management Model Explorer, right-click the top-level MyTestModel entry, and then click Generate Management Pack for OpsMgr 2007.

2. By default, the TSMMD generates a management pack named Application.Operations.xml in a folder named ManagementPack within the same folder as your project. Open this folder in Windows Explorer, and then open the new management pack in a text editor to view the contents, or import it into System Center Operations Manager to see the artifacts it contains.

To import a TSMMD-generated management pack into System Center Operations Manager, you must also import the standard management packs upon which the generated management pack depends.

The TSMMD can generate management packs automatically when you build an application, or you can generate management packs from the command line. For more information, see the documentation installed with the TSMMD.

Appendix C

Performance Counter Types

This appendix lists the performance counter types available in the .NET Framework 2.0. It is reproduced directly from MSDN. A short code sketch at the end of the appendix shows how one of these counter types is used.

AverageBase
A base counter that is used in the calculation of time or count averages, such as AverageTimer32 and AverageCount64. Stores the denominator for calculating a counter to present "time per operation" or "count per operation".

AverageCount64
An average counter that shows how many items are processed, on average, during an operation. Counters of this type display a ratio of the items processed to the number of operations completed. The ratio is calculated by comparing the number of items processed during the last interval to the number of operations completed during the last interval.
Formula: (N1 - N0) / (B1 - B0), where N1 and N0 are performance counter readings, and B1 and B0 are their corresponding AverageBase values. Thus, the numerator represents the number of items processed during the sample interval, and the denominator represents the number of operations completed during the sample interval.
Counters of this type include PhysicalDisk\Avg. Disk Bytes/Transfer.

AverageTimer32
An average counter that measures the time it takes, on average, to complete a process or operation. Counters of this type display a ratio of the total elapsed time of the sample interval to the number of processes or operations completed during that time. This counter type measures time in ticks of the system clock.
Formula: ((N1 - N0) / F) / (B1 - B0), where N1 and N0 are performance counter readings, B1 and B0 are their corresponding AverageBase values, and F is the number of ticks per second. The value of F is factored into the equation so that the result can be displayed in seconds. Thus, the numerator represents the number of ticks counted during the last sample interval, F represents the frequency of the ticks, and the denominator represents the number of operations completed during the last sample interval.
Counters of this type include PhysicalDisk\Avg. Disk sec/Transfer.

CounterDelta32
A difference counter that shows the change in the measured attribute between the two most recent sample intervals.
Formula: N1 - N0, where N1 and N0 are performance counter readings.

CounterDelta64
A difference counter that shows the change in the measured attribute between the two most recent sample intervals. It is the same as the CounterDelta32 counter type except that it uses larger fields to accommodate larger values.
Formula: N1 - N0, where N1 and N0 are performance counter readings.

CounterMultiBase
A base counter that indicates the number of items sampled. It is used as the denominator in the calculations to get an average among the items sampled when taking timings of multiple, but similar, items. Used with CounterMultiTimer, CounterMultiTimerInverse, CounterMultiTimer100Ns, and CounterMultiTimer100NsInverse.

CounterMultiTimer
A percentage counter that displays the active time of one or more components as a percentage of the total time of the sample interval. Because the numerator records the active time of components operating simultaneously, the resulting percentage can exceed 100 percent.
This counter is a multitimer. Multitimers collect data from more than one instance of a component, such as a processor or disk. This counter type differs from CounterMultiTimer100Ns in that it measures time in units of ticks of the system performance timer, rather than in 100 nanosecond units.
Formula: ((N1 - N0) / (D1 - D0)) x 100 / B, where N1 and N0 are performance counter readings, D1 and D0 are their corresponding time readings in ticks of the system performance timer, and the variable B denotes the base count for the monitored components (using a base counter of type CounterMultiBase). Thus, the numerator represents the portions of the sample interval during which the monitored components were active, and the denominator represents the total elapsed time of the sample interval.

CounterMultiTimer100Ns
A percentage counter that shows the active time of one or more components as a percentage of the total time of the sample interval. It measures time in 100 nanosecond (ns) units.
This counter type is a multitimer. Multitimers are designed to monitor more than one instance of a component, such as a processor or disk.
Formula: ((N1 - N0) / (D1 - D0)) x 100 / B, where N1 and N0 are performance counter readings, D1 and D0 are their corresponding time readings in 100-nanosecond units, and the variable B denotes the base count for the monitored components (using a base counter of type CounterMultiBase). Thus, the numerator represents the portions of the sample interval during which the monitored components were active, and the denominator represents the total elapsed time of the sample interval.

CounterMultiTimer100NsInverse
A percentage counter that shows the active time of one or more components as a percentage of the total time of the sample interval. Counters of this type measure time in 100 nanosecond (ns) units. They derive the active time by measuring the time that the components were not active and subtracting the result from 100 percent multiplied by the number of objects monitored.
This counter type is an inverse multitimer. Multitimers are designed to monitor more than one instance of a component, such as a processor or disk. Inverse counters measure the time that a component is not active and derive its active time from the measurement of inactive time.
Formula: (B - ((N1 - N0) / (D1 - D0))) x 100, where the denominator represents the total elapsed time of the sample interval, the numerator represents the time during the interval when monitored components were inactive, and B represents the number of components being monitored, using a base counter of type CounterMultiBase.

CounterMultiTimerInverse
A percentage counter that shows the active time of one or more components as a percentage of the total time of the sample interval. It derives the active time by measuring the time that the components were not active and subtracting the result from 100 percent multiplied by the number of objects monitored.
This counter type is an inverse multitimer. Multitimers monitor more than one instance of a component, such as a processor or disk. Inverse counters measure the time that a component is not active and derive its active time from that measurement.
This counter differs from CounterMultiTimer100NsInverse in that it measures time in units of ticks of the system performance timer, rather than in 100 nanosecond units.
Formula: (B - ((N1 - N0) / (D1 - D0))) x 100, where the denominator represents the total elapsed time of the sample interval, the numerator represents the time during the interval when monitored components were inactive, and B represents the number of components being monitored, using a base counter of type CounterMultiBase.

CounterTimer
A percentage counter that shows the average time that a component is active as a percentage of the total sample time.
Formula: (N1 - N0) / (D1 - D0), where N1 and N0 are performance counter readings, and D1 and D0 are their corresponding time readings. Thus, the numerator represents the portions of the sample interval during which the monitored components were active, and the denominator represents the total elapsed time of the sample interval.

CounterTimerInverse
A percentage counter that displays the average percentage of active time observed during the sample interval. The value of these counters is calculated by monitoring the percentage of time that the service was inactive and then subtracting that value from 100 percent.
This is an inverse counter type. Inverse counters measure the time that a component is not active and derive the active time from that measurement. This counter type is the same as CounterTimer100NsInv except that it measures time in units of ticks of the system performance timer rather than in 100 nanosecond units.
Formula: (1 - ((N1 - N0) / (D1 - D0))) x 100, where the numerator represents the time during the interval when the monitored components were inactive, and the denominator represents the total elapsed time of the sample interval.

CountPerTimeInterval32
An average counter designed to monitor the average length of a queue to a resource over time. It shows the difference between the queue lengths observed during the last two sample intervals divided by the duration of the interval. This type of counter is typically used to track the number of items that are queued or waiting.
Formula: (N1 - N0) / (D1 - D0), where the numerator represents the number of items in the queue and the denominator represents the time elapsed during the last sample interval.

CountPerTimeInterval64
An average counter that monitors the average length of a queue to a resource over time. Counters of this type display the difference between the queue lengths observed during the last two sample intervals, divided by the duration of the interval. This counter type is the same as CountPerTimeInterval32 except that it uses larger fields to accommodate larger values. This type of counter is typically used to track a high-volume or very large number of items that are queued or waiting.
Formula: (N1 - N0) / (D1 - D0), where the numerator represents the number of items in a queue and the denominator represents the time elapsed during the sample interval.

ElapsedTime
A difference timer that shows the total time between when the component or process started and the time when this value is calculated.
Formula: (D0 - N0) / F, where D0 represents the current time, N0 represents the time the object was started, and F represents the number of time units that elapse in one second. The value of F is factored into the equation so that the result can be displayed in seconds.
Counters of this type include System\System Up Time.

NumberOfItems32
An instantaneous counter that shows the most recently observed value. Used, for example, to maintain a simple count of items or operations.
Formula: None. Does not display an average, but shows the raw data as it is collected.
Counters of this type include Memory\Available Bytes.

NumberOfItems64
An instantaneous counter that shows the most recently observed value. Used, for example, to maintain a simple count of a very large number of items or operations. It is the same as NumberOfItems32 except that it uses larger fields to accommodate larger values.
Formula: None. Does not display an average, but shows the raw data as it is collected.

NumberOfItemsHEX32
An instantaneous counter that shows the most recently observed value in hexadecimal format. Used, for example, to maintain a simple count of items or operations.
Formula: None. Does not display an average, but shows the raw data as it is collected.

NumberOfItemsHEX64
An instantaneous counter that shows the most recently observed value. Used, for example, to maintain a simple count of a very large number of items or operations. It is the same as NumberOfItemsHEX32 except that it uses larger fields to accommodate larger values.
Formula: None. Does not display an average, but shows the raw data as it is collected.

RateOfCountsPerSecond32
A difference counter that shows the average number of operations completed during each second of the sample interval. Counters of this type measure time in ticks of the system clock.
Formula: (N1 - N0) / ((D1 - D0) / F), where N1 and N0 are performance counter readings, D1 and D0 are their corresponding time readings, and F represents the number of ticks per second. Thus, the numerator represents the number of operations performed during the last sample interval, the denominator represents the number of ticks elapsed during the last sample interval, and F is the frequency of the ticks. The value of F is factored into the equation so that the result can be displayed in seconds.
Counters of this type include System\File Read Operations/sec.

RateOfCountsPerSecond64
A difference counter that shows the average number of operations completed during each second of the sample interval. Counters of this type measure time in ticks of the system clock. This counter type is the same as the RateOfCountsPerSecond32 type, but it uses larger fields to accommodate larger values to track a high-volume number of items or operations per second, such as a byte-transmission rate.
Formula: (N1 - N0) / ((D1 - D0) / F), where N1 and N0 are performance counter readings, D1 and D0 are their corresponding time readings, and F represents the number of ticks per second. Thus, the numerator represents the number of operations performed during the last sample interval, the denominator represents the number of ticks elapsed during the last sample interval, and F is the frequency of the ticks. The value of F is factored into the equation so that the result can be displayed in seconds.
Counters of this type include System\File Read Bytes/sec.

RawBase
A base counter that stores the denominator of a counter that presents a general arithmetic fraction. Check that this value is greater than zero before using it as the denominator in a RawFraction value calculation.

RawFraction
An instantaneous percentage counter that shows the ratio of a subset to its set as a percentage. For example, it compares the number of bytes in use on a disk to the total number of bytes on the disk. Counters of this type display the current percentage only, not an average over time.
Formula: (N0 / D0) x 100, where D0 represents a measured attribute (using a base counter of type RawBase) and N0 represents one component of that attribute.
Counters of this type include Paging File\% Usage Peak.

SampleBase
A base counter that stores the number of sampling interrupts taken and is used as a denominator in the sampling fraction. The sampling fraction is the number of samples that were 1 (or true) for a sample interrupt. Check that this value is greater than zero before using it as the denominator in a calculation of SampleCounter or SampleFraction.

SampleCounter
An average counter that shows the average number of operations completed in one second. When a counter of this type samples the data, each sampling interrupt returns one or zero. The counter data is the number of ones that were sampled. It measures time in units of ticks of the system performance timer.
Formula: (N1 - N0) / ((D1 - D0) / F), where the numerator (N) represents the number of operations completed, the denominator (D) represents elapsed time in units of ticks of the system performance timer, and F represents the number of ticks that elapse in one second. F is factored into the equation so that the result can be displayed in seconds.

SampleFraction
A percentage counter that shows the average ratio of hits to all operations during the last two sample intervals.
Formula: ((N1 - N0) / (D1 - D0)) x 100, where the numerator represents the number of successful operations during the last sample interval, and the denominator represents the change in the number of all operations (of the type measured) completed during the sample interval, using counters of type SampleBase.
Counters of this type include Cache\Pin Read Hits %.

Timer100Ns
A percentage counter that shows the active time of a component as a percentage of the total elapsed time of the sample interval. It measures time in units of 100 nanoseconds (ns). Counters of this type are designed to measure the activity of one component at a time.
Formula: ((N1 - N0) / (D1 - D0)) x 100, where the numerator represents the portions of the sample interval during which the monitored components were active, and the denominator represents the total elapsed time of the sample interval.
Counters of this type include Processor\% User Time.

Timer100NsInverse
A percentage counter that shows the average percentage of active time observed during the sample interval.
This is an inverse counter. Counters of this type calculate active time by measuring the time that the service was inactive and then subtracting the percentage of active time from 100 percent.
Formula: (1 - ((N1 - N0) / (D1 - D0))) x 100, where the numerator represents the time during the interval when the monitored components were inactive, and the denominator represents the total elapsed time of the sample interval.
Counters of this type include Processor\% Processor Time.
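The sketch below shows how one of these counter types is typically used from .NET 2.0 code: an AverageTimer32 counter paired with its AverageBase denominator, updated once per operation. It is a minimal, hypothetical example; the category and counter names are invented for illustration, and creating a performance counter category requires administrative rights.

C#

using System.Diagnostics;

// Hypothetical example: an AverageTimer32 counter ("time per operation")
// paired with its AverageBase denominator ("operations completed"). The
// category and counter names are invented for illustration.
class AverageTimerSketch
{
    static void Main()
    {
        const string category = "MyAppCounters";

        if (!PerformanceCounterCategory.Exists(category))
        {
            CounterCreationDataCollection counters = new CounterCreationDataCollection();
            counters.Add(new CounterCreationData("Avg. sec/Operation",
                "Average time per operation",
                PerformanceCounterType.AverageTimer32));
            // The AverageBase counter must immediately follow the
            // average counter it supports.
            counters.Add(new CounterCreationData("Avg. sec/Operation base",
                "Denominator for Avg. sec/Operation",
                PerformanceCounterType.AverageBase));
            PerformanceCounterCategory.Create(category, "Example counters",
                PerformanceCounterCategoryType.SingleInstance, counters);
        }

        PerformanceCounter timer =
            new PerformanceCounter(category, "Avg. sec/Operation", false);
        PerformanceCounter timerBase =
            new PerformanceCounter(category, "Avg. sec/Operation base", false);

        long start = Stopwatch.GetTimestamp();
        // ... perform the operation being timed ...
        timer.IncrementBy(Stopwatch.GetTimestamp() - start); // ticks: N1 - N0
        timerBase.Increment();                               // operations: B1 - B0
    }
}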