Monitoring: Know Your Audience
Best Practice for Opsview Business Service Monitoring and Keywords
Opsview Technical Overview
Contents
Summary
Modeling Your Business in Opsview
BSM Components
Keywords
BSM Services
Suitable Visibility for Separate Target Audiences
Dashboards
Notifications
Reporting
Conclusion
Opsview and the Opsview logo are trademarks or registered trademarks of Opsview Ltd. All other product names may be trademarks or
registered trademarks of their respective companies. © 2014 Opsview Ltd. All rights reserved.
t: +1 866 662 4160 e: [email protected] www.opsview.com
Summary
Monitoring data, like all operations data, is at its most valuable when a presentation layer puts the
information in the proper context for each audience. When QoS values like uptime and throughput make up strict
SLA requirements, it becomes essential to make sure that the correct metrics and status information go to the
right audience.
Complex IT environments use redundancy as a way of creating resilience and of improving
overall performance. While high availability, disaster recovery, and load balancing clusters provide invaluable peace
of mind for stakeholders, they also significantly complicate SLA reporting. How can this
qualitative peace of mind be translated into a quantitative and reportable SLA value?
At the same time, it is important to make sure that isolated outages are still resolved before they can propagate
into a true loss of service. The more efficiently isolated outages can be prioritized and resolved, the better the
overall SLA report will be in the end. It is, therefore, important to stay ahead of isolated outages in order to meet
SLA requirements for redundant and resilient IT offerings.
It is worth pointing out that these two levels of granularity are valuable to two completely separate target
audiences. Real-life, end-user availability is valuable for service consumers such as customers, executive
stakeholders, or compliance departments. SLA reporting against individual IT services running on hosts is best
utilized by administrators and team leads. The primary focus of this white paper is that building business
rules into a monitoring solution, and reporting appropriately to both application consumers and application
administrators, can improve visibility and communication between the two parties and contribute to the overall
success of the business.
This guide will go over the use cases for the Opsview BSM feature and for Opsview Keywords and will demonstrate
how to get the most out of both features. It will cover the creation of BSM components and matching keywords
to be used as a means of drilling into the component for extra information. It will then cover some effective
dashboard configurations, notification rules, and reporting practices.
Modeling Your Business in Opsview
The presentation layer is a vital element of any business intelligence platform. Without being able to decode
the underlying data into the business terms that it represents, the information would provide little benefit to
anyone. Therefore, like any other business intelligence tool, Opsview must provide a presentation layer so that
real architecture rules can be modeled in the tool, providing accurate end-user availability information. By being
thorough and doing this correctly, risk can be identified and resolved before problems affect the service consumer,
rather than allowing a costly outage to happen and then tracing the failure back to a root cause.
BSM Components
The first essential things to define in Opsview are the BSM components. A BSM component is a functional grouping
of hosts and services, together with an operational zone that is used to determine the overall health and priority
of the grouping. Components are commonly used to define clusters, farms, or failovers.
Components are a way to simplify the complicated architecture rules of an application by approaching it in pieces.
Defining a component starts with the Opsview Host Template feature. This is the same feature that allows
service checks to be applied to hosts in bulk by function. It is similarly able to group the check results by
function, and it serves as the starting point for creating any component.
Selecting the desired Host Template filters the host selection box to display only hosts that are currently using that
template. These hosts are all monitored in the same way because functionally they are all nearly the
same. It is then the job of the Opsview administrator to determine which of these common hosts contribute
to a shared goal, as a cluster of Solaris servers would. An arbitrary number of hosts can be grouped
together, with an operational zone applied to the newly created BSM component. This operational zone indicates
the total percentage of the component that needs to be healthy for the entire component to be considered
effective. This means that small failures within a cluster are flagged as a potential impact to service rather
than immediately being marked as a failure.
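The operational zone rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Opsview's implementation: the `Component` class, the state names, and the exact threshold comparison are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    """Hypothetical stand-in for a BSM component; not an Opsview API object."""
    name: str
    operational_zone_pct: float  # percentage of members that must be healthy
    members_healthy: List[bool]  # one flag per host in the component


def component_state(component: Component) -> str:
    """Classify a component as 'operational', 'impacted', or 'failed'."""
    total = len(component.members_healthy)
    if total == 0:
        raise ValueError("a component needs at least one member")
    healthy = sum(component.members_healthy)
    healthy_pct = 100.0 * healthy / total
    if healthy == total:
        return "operational"
    # Assumption: meeting the zone exactly still counts as inside the zone.
    if healthy_pct >= component.operational_zone_pct:
        return "impacted"
    return "failed"


# A four-node cluster with a 50% operational zone and one node down.
solaris = Component("solaris-cluster", 50.0, [True, True, True, False])
print(component_state(solaris))  # impacted
```

With one of four nodes down the component is only flagged as impacted. It would not be treated as failed until fewer than half of its nodes were healthy.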
Keywords
Before BSM components, keywords were the only way to group services together to evaluate the impact of a critical
event. The figure below shows the process for creating a keyword that complements the BSM component
made in the previous step. It is best practice to create these complementary keyword/component pairs for every
reportable component. The reason for this will become clearer in later stages of this exercise.
To complement a BSM component, it is best to make a keyword with the same name as the component and have it
grouped by service. This way each of the functions the cluster needs to provide is accounted for and can be broken
down by node.
Configuring this complementary keyword to reflect the same hosts and services as the component is a manual
process. The hosts are selected first. These are the same hosts selected to create the component. Next there is a
check box to “Filter by selected hosts” so that choosing the correct service checks is less daunting. By selecting the
same checks that would be included in the host template, Opsview will now have a grouping that is effectively the
component minus the resiliency rules.
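As a rough illustration of what "grouped by service, broken down by node" means, the sketch below regroups a flat list of check results. The host names, check names, and the `group_by_service` helper are invented for this example and are not part of Opsview.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (host, service check, status) tuples; hosts and checks are made-up examples.
CheckResult = Tuple[str, str, str]


def group_by_service(results: List[CheckResult]) -> Dict[str, Dict[str, str]]:
    """Group check results by service check, then break each one down by host."""
    grouped: Dict[str, Dict[str, str]] = defaultdict(dict)
    for host, service, status in results:
        grouped[service][host] = status
    return dict(grouped)


results = [
    ("sol01", "CPU load", "OK"),
    ("sol02", "CPU load", "CRITICAL"),
    ("sol01", "Disk /", "OK"),
    ("sol02", "Disk /", "OK"),
]
by_service = group_by_service(results)
print(by_service["CPU load"])  # {'sol01': 'OK', 'sol02': 'CRITICAL'}
```

Unlike the component, this view applies no resiliency rule: every failing check on every node stays visible.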
BSM Services
The next step is to model the consumable service. Examples of these consumables could include VoIP phone
systems, email, the company website, collaboration portals, and other various applications. The status of these
consumables is of particular interest to anyone who is looking from the outside in. The list of those interested
could include customers, executive stakeholders, compliance departments, and auditors. In order to provide an
accurate status of these consumables, they must first be properly modeled within the monitoring software. By
understanding the anatomy of the application, website, or workflow and representing its uptime needs properly, it
is possible to foster a culture that concentrates on proactive troubleshooting rather than fighting fires. This may
appear to be an intimidating task, but the difficult part has already been accomplished. These consumable business
services are simply a grouping together of BSM components, which were already assigned priority and itemized
SLA requirements when the operational zone was defined.
When creating a BSM service, there will be a section named the “Components Drawer”. This can be filtered with
the text box adjacent to it to make components easier to find. To create a BSM service, simply click and drag
components from the drawer into the BSM service.
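One way to picture a BSM service as a grouping of components is a "worst state wins" roll-up. The paper does not state how Opsview derives a service's status from its components, so the rule below is an assumption, and the component names are hypothetical.

```python
from typing import Dict

# Higher number = worse. The state names follow the wording of this paper.
SEVERITY = {"operational": 0, "impacted": 1, "failed": 2}


def service_state(component_states: Dict[str, str]) -> str:
    """Return the worst state among the components of a BSM service."""
    if not component_states:
        raise ValueError("a BSM service needs at least one component")
    return max(component_states.values(), key=lambda state: SEVERITY[state])


# Hypothetical email service built from two components.
email = {"mail-frontends": "impacted", "mailbox-cluster": "operational"}
print(service_state(email))  # impacted
```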
Suitable Visibility for Separate Target Audiences
Monitoring is a three-phase effort: data collection, presentation, and response. Now that Opsview has been
configured to model consumable business services, both in BSM and keywords, it is time to move on to the second
and third phases of the project. First, real-time presentation views will be created. Then notification rules will be
put in place for keywords, BSM components, and BSM services according to the recipient. Finally, historical reporting
will be defined for both administrators and external audiences.
Dashboards
Dashboards should be valuable both for tactical and strategic audiences. High-level views are often the most
important things for customers or executives to see. Such a view has very little detail and will rarely, if ever,
use keywords. One of the most powerful tools at your disposal for executive dashboards is the filtering feature
on the BSM Summary dashlet. By placing a BSM Summary dashlet and configuring the settings correctly, it is
possible to view a subset of services that may be of importance to you, and it is also possible to filter by status. By
unchecking the “Operational” box, the widget is set to be a traffic light for application-level statuses. If any
BSM services appear in the widget at all, it means that they are either in scheduled maintenance mode, impacted by
an underlying problem, or in a full failure. This is an ideal way to see IT operations at the highest level possible.
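The effect of unchecking the “Operational” box can be pictured as a simple status filter. The sketch below is illustrative only: the status names are taken from the paragraph above, while `dashlet_rows` and the service names are hypothetical and do not represent the dashlet's real configuration.

```python
from typing import Dict, Set

# Statuses named in the text above; the lowercase spelling is an assumption.
ALL_STATUSES = {"operational", "impacted", "failed", "maintenance"}


def dashlet_rows(service_states: Dict[str, str], shown: Set[str]) -> Dict[str, str]:
    """Keep only the services whose status is ticked in the filter."""
    return {name: state for name, state in service_states.items() if state in shown}


services = {"Email": "operational", "Website": "impacted", "VoIP": "failed"}
shown = ALL_STATUSES - {"operational"}  # the "Operational" box is unchecked
print(dashlet_rows(services, shown))  # {'Website': 'impacted', 'VoIP': 'failed'}
```

An empty result is the "green light": nothing is in maintenance, impacted, or failed.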
The next task is to create dashboards that show details about the BSM service or BSM component for application
owners or administrators respectively. A good practice for dashboarding is to create a user for the purpose of
holding shared dashboards for other contacts. For the purpose of displaying Business Service views, the contact
“BSM” with local authentication should be created. This user should have VIEWALL access, CONFIGUREBSM
access, and DASHBOARDEDIT access at a minimum.
In the dashboard tab for this user, the following layout can be created per BSM service to provide some detailed
value for the top-level Business Service.
For larger environments with many applications and services it may be a good idea to create an additional contact
named “COMPONENTS” to hold component level dashboards to be shared with others. Dashboards for this contact
may look like the example below.
Each of these contacts should share their dashboards with the roles that might find value in them. This practice
essentially provides saved dashboards that can be pulled up and deleted as needed by other Opsview contacts.
This saves dashboard space, and the views do not have to be recreated from scratch every time they are needed.
The list of shared dashboards that are available should display BSM: <dashboard name> or Component: <dashboard
name> for each choice, making each one easy to find.
Notifications
With the addition of the BSM feature in Opsview, it is now possible to set a notification rule for BSM services or
components rather than the previous choices of host groups, service groups, and keywords. As with everything
else, the audience dictates the alerting requirements. BSM service notifications are ideal for application owners
and should then escalate to management levels. BSM component notifications are directed towards team leads
of various disciplines such as database, server, and network teams, with an escalation to the appropriate architect.
This leaves the legacy host group, service group, and keyword based notifications for the front lines of monitoring,
where every new issue should be investigated as quickly as possible.
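The audience mapping above can be written down as a small routing table. The sketch below only restates that mapping; the `Route` type, the key names, and the `recipient` helper are hypothetical and are not Opsview notification profile settings.

```python
from typing import Dict, NamedTuple, Optional


class Route(NamedTuple):
    first_line: str
    escalation: Optional[str]  # None where the text names no escalation


# Audience per object type, as described in the paragraph above.
ROUTES: Dict[str, Route] = {
    "bsm_service": Route("application owner", "management"),
    "bsm_component": Route("team lead", "architect"),
    "host_group": Route("front-line operator", None),
    "service_group": Route("front-line operator", None),
    "keyword": Route("front-line operator", None),
}


def recipient(object_type: str, escalated: bool = False) -> str:
    """Pick who is notified for an object type, optionally after escalation."""
    route = ROUTES[object_type]
    if escalated and route.escalation is not None:
        return route.escalation
    return route.first_line


print(recipient("bsm_component"))                  # team lead
print(recipient("bsm_component", escalated=True))  # architect
```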
Reporting
The ultimate goal of reporting as it relates to monitoring is to tell a story that is both accurate and puts IT
operations in a positive light. Accuracy can actually be flattering when it comes to SLAs in such complicated
architectures. High availability, load balancing, site failovers, and other architecture concepts are implemented
because they make for a more stable environment. This means that SLA reporting against a BSM service
is a real depiction of end-user availability. Opsview now provides SLA reporting against the application
or business service so that this information can be sent to management or a customer.
It is important, however, to make sure that this report will be telling a positive story at the end of the day, week,
month, or year. An ideal way to make sure that the final, often automated, report is going to satisfy the SLA
requirements is to stay in front of application outages by maintaining the components that make them up. To do
this, Daily Service Level Reports and Daily Performance Reports can be run against the keywords that were created to
complement the components that make up the BSM service.
These can be regularly scheduled to be sent to the appropriate administrators so that individual service checks
and host failures can be corrected before they exceed the component’s operational zone. By reporting both at the
BSM service level and at the component/keyword level, different audiences can be provided the correct level of
granularity for their individual needs.
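A small worked example shows why the two reporting levels give different numbers. The sample data, the 50% operational zone, and both helper functions are assumptions made for illustration; Opsview's reports are not computed by this code.

```python
from typing import List


def host_availability(samples: List[List[bool]]) -> float:
    """Percentage of all per-host samples that were healthy (keyword-level view)."""
    flat = [ok for interval in samples for ok in interval]
    return 100.0 * sum(flat) / len(flat)


def component_availability(samples: List[List[bool]], operational_zone_pct: float) -> float:
    """Percentage of intervals in which enough hosts were healthy (service-level view)."""
    ok_intervals = sum(
        1
        for interval in samples
        if 100.0 * sum(interval) / len(interval) >= operational_zone_pct
    )
    return 100.0 * ok_intervals / len(samples)


# Four sampling intervals for a hypothetical four-node cluster.
samples = [
    [True, True, True, True],
    [True, True, True, False],   # one node down
    [True, True, False, False],  # two nodes down
    [True, True, True, True],
]
print(host_availability(samples))             # 81.25
print(component_availability(samples, 50.0))  # 100.0
```

The consumer-facing figure stays at 100% because the cluster never dropped below its operational zone, while the 81.25% host-level figure tells administrators that nodes needed attention.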
Conclusion
Monitoring is made up of three factors: data collection, presentation, and action. In order to use this information
for high-value business intelligence, it is important to always keep in mind, “Who is my audience?”
Monitoring data can be used for an immediate response by an administrator who specializes in networks,
servers, or hardware as appropriate. This same data can be rolled up for team leads, architects, and application
owners. Finally, at the highest level, transparency should be provided for executive stakeholders and the
expected consumers of these applications and business services. This way risks can be identified at every stage
and prevented, instead of root causes being determined after outages occur. This promotes a culture of better
communication and better relationships between Information Technology and other business units in the
organization, leading to better overall operations.