TOSCA Monitoring Straw-man for Initial Minimal Monitoring Use Case
Roger Dev, CA Technologies
Revision 3, May 21, 2015
TOSCA Monitoring Use Cases (full – From Arch Ref Strawman)
• As an Application Architect, I want to know which metrics I can expect to be available from a given component.
• As an Application Architect, I want to define, within the Service Template, the Metrics to be collected for a component, as well as how they are to be collected and managed (thresholded, etc.).
  – Additionally, I may want to use my favorite monitoring tools rather than those provided by the Service Provider.
• As an Application Operator, I want to be able to access collected metrics and events for any or all of my deployed components, either:
  – Interactively
  – Programmatically
• As an Application Developer, I want to be able to produce custom metrics from my application(s) and have them stored and accessed along with any standard metrics.
• As a Service Provider, I may want to define Monitoring Policies for component types that may be different from those designated by the Application Architect.
• As a Service Provider, I want to be able to utilize my favorite monitoring tools rather than those supplied with an orchestration framework.
• As a Service Provider, I want to be able to access a robust set of Metrics and Events about the orchestration framework, since that is a critical component of my infrastructure.
• As a Service Provider, I want to be able to utilize the full set of topological information provided by the Nested Service Template(s) to enhance my knowledge of running applications. This includes the output sections of the Templates.
Revised approach based on feedback and discussion to date
• Simplify the initial use-case to the bare minimum
• Use that to work through the basic mechanisms
• Define and agree upon the fundamental mechanisms
• Expand from there
TOSCA Monitoring Reference Diagram
[Figure: reference diagram of the Monitoring Automation Point (MAP)]
Interfaces shown:
• MAD (Monitoring Act / De-act)
• MIA (Monitoring Info Access)
• MEA (Monitoring Extension Advert)
• OM (Orchestrator Monitoring)
Parties shown: Service Template, External Process, Internal Process, External Monitoring System
Data labels shown:
– Monitoring Template / Policy
– Management Communication Info
– Metric Availability, Metric Time Series, Events?
– Metric Values, Events?
– Metric Time Series, Events
Focus on Subset of MAD
Initial Minimal Use Case (1 of 2)
• Assume we have a mechanism for defining the metrics associated with a given component-type:
  – This is a tractable problem, so let’s come back to it after solving more fundamental issues
• Assume that we want to monitor all the metrics that each component can produce
  – Defer defining the mechanism whereby the Application Architect can define the monitoring policy
  – Defer defining the finer points of policy (e.g. events, actions, transformations, etc.)
Initial Minimal Use Case (2 of 2)
• Scenario:
  – A Service Template is deployed with a single SoftwareComponent running on a single ComputeNode (virtual).
    • Metrics (Capabilities) are defined for both SoftwareComponent and ComputeNode types.
      – ComputeNode:
        » PercentCpuUtilization, IoBytesIn, IoBytesOut
      – SoftwareComponent:
        » PercentCpuUtilization, TransactionsProcessed, ErrorsEncountered
  – Metrics are collected for some time
  – The Service Template is removed (de-deployed)
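The scenario's data can be sketched in a few lines of Python. This is only an illustration: the dictionary layout, the function name `metrics_to_collect`, and the component ids are assumptions. Only the component-type and metric names come from the scenario.

```python
# Hypothetical catalog: metric names per component type, taken from the scenario.
METRICS_BY_TYPE = {
    "ComputeNode": ["PercentCpuUtilization", "IoBytesIn", "IoBytesOut"],
    "SoftwareComponent": [
        "PercentCpuUtilization",
        "TransactionsProcessed",
        "ErrorsEncountered",
    ],
}


def metrics_to_collect(deployed):
    """Return (component_id, metric) pairs for every deployed component.

    `deployed` maps a component id to its component type. Per the minimal
    use case, every metric the type can produce is monitored.
    """
    return [
        (component_id, metric)
        for component_id, component_type in deployed.items()
        for metric in METRICS_BY_TYPE[component_type]
    ]


# One SoftwareComponent hosted on one virtual ComputeNode (ids are made up).
deployed = {"vm-1": "ComputeNode", "app-1": "SoftwareComponent"}
pairs = metrics_to_collect(deployed)
```

Because the minimal use case defers monitoring policy, the sketch has no include/exclude filtering. A policy layer could later narrow the returned pairs.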
Minimal Use Case Constraints
• The components themselves (e.g. the ComputeNode, via virtualization software, and the SoftwareComponent) will not be required to implement a new monitoring protocol
  – One of several existing monitoring protocols could be used (SNMP, WMI, proprietary, etc.) depending on the service provider and the underlying technologies used
• The Monitoring Sub-System (MSS) is not required to be embedded within the Orchestrator.
  – An off-the-shelf monitoring system may be employed
• The MSS is running and attached to the Orchestrator before the Service Template is deployed
Notes on Monitoring Agent
• We should consider that there is always a Monitoring Agent (or give it a new name)
  – So-called “Agentless Monitoring” just means that the Monitoring Agent role is baked into the component and doesn’t have to be explicitly added in.
  – From the MSS side, the only difference is the particular protocol used, and elimination of the need for an agent deployment step.
• Coordination is, in any case, still needed between the MSS and the Agent role within the component (address, port, creds, and other identifiers).
• In many cases, the Agent capability of one component is used to monitor a different component (e.g. one might use the hypervisor’s Agent to monitor a VM; one might use the Host OS’s agent to monitor an application process)
Diagram for Scenario 1
[Figure: Scenario 1, with numbered callouts 1 to 3]
• Service Template causes: Virtual Machine (new), Software Component (new)
• Software Component is HostedOn the Virtual Machine
• Monitoring Sub-System (MSS):
  – Notify State Change: Create, Modify, Destroy
  – Deliver Metrics (any existing push or pull protocol)
Notes for Scenario 1 Diagram
• (1) New components are created. In some cases, there must be relationship information for components that are created outside of the ST (such as hypervisor or physical system; see 2 below)
• (2) MSS is notified of new components to be monitored. MSS needs:
  – Service Template meta-data in order to know the ID and Type of the component
  – Instance Model in order to know the address, port, and credentials needed in order to collect metrics
  – Possibly the relationship to components not in the ST (e.g., the hypervisor) if info about the component is provided by that outside component.
• (3) If push protocol, the monitoring agent within the component must be configured with the address of the MAP and the TOSCA id of the component. If pull, then there must be coordination between the Orchestrator and the monitoring agent, or the details must be explicitly defined in the ST, so that the create notification can carry, e.g., the agent’s address, port, and creds.
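The push/pull distinction in the third note can be sketched as two small configuration records. This is an illustrative Python sketch only: the class names, field names, and the `build_monitoring_config` helper are assumptions and not part of any TOSCA specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PushConfig:
    """Hypothetical settings given to the agent inside the component."""
    map_address: str  # where the agent delivers metrics (the MAP)
    tosca_id: str     # identifies the component in each delivery


@dataclass
class PullConfig:
    """Hypothetical settings carried in the create notification to the MSS."""
    agent_address: str
    agent_port: int
    credentials: Optional[str] = None


def build_monitoring_config(mode, **info):
    """Pick the record that matches the delivery mode ("push" or "pull")."""
    if mode == "push":
        return PushConfig(map_address=info["map_address"], tosca_id=info["tosca_id"])
    if mode == "pull":
        return PullConfig(
            agent_address=info["agent_address"],
            agent_port=info["agent_port"],
            credentials=info.get("credentials"),
        )
    raise ValueError(f"unknown delivery mode: {mode!r}")
```

In the push case the information flows toward the component. In the pull case it flows toward the MSS, which is why the two records hold different fields.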
Minimal Scenario Questions to Answer
• What information is needed by the Monitoring Sub-System (MSS) in order to activate monitoring when the Service Template is deployed?
• What mechanisms could be used to notify the MSS of the significant state changes for the components?
  – Activate
  – Modify
  – Deactivate
• What is the simplest mechanism that could handle this scenario?
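One very simple notification mechanism is a single call from the Orchestrator into the MSS for each state change. The Python sketch below is an assumption made for discussion, not a proposed interface: the `MonitoringSubSystem` class, its `notify` method, and the example address and ports are all hypothetical.

```python
from enum import Enum


class StateChange(Enum):
    ACTIVATE = "activate"
    MODIFY = "modify"
    DEACTIVATE = "deactivate"


class MonitoringSubSystem:
    """Toy MSS that only tracks which components are currently monitored."""

    def __init__(self):
        self.monitored = {}  # component id -> latest info from the orchestrator

    def notify(self, change, component_id, info=None):
        if change is StateChange.ACTIVATE:
            self.monitored[component_id] = dict(info or {})
        elif change is StateChange.MODIFY:
            if component_id not in self.monitored:
                raise KeyError(f"{component_id} is not monitored")
            self.monitored[component_id].update(info or {})
        elif change is StateChange.DEACTIVATE:
            # Deactivating an unknown component is treated as a no-op.
            self.monitored.pop(component_id, None)
        else:
            raise ValueError(f"unsupported change: {change!r}")


mss = MonitoringSubSystem()
mss.notify(StateChange.ACTIVATE, "vm-1", {"address": "10.0.0.5", "port": 161})
mss.notify(StateChange.MODIFY, "vm-1", {"port": 1161})
snapshot = dict(mss.monitored["vm-1"])
mss.notify(StateChange.DEACTIVATE, "vm-1")
```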
What information does MSS need?
• Agent Address and Protocol (How to talk to agent)
• Component Identifier (How to ask the question about the correct component)
• Credentials to access Agent
  – Might be able to set up a closed management network and not need creds???
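The three items above can be captured as one record per component. The following Python sketch is illustrative: the `MonitoringTarget` class, its field names, and the `closed_network` flag are assumptions that model the open question about a closed management network.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MonitoringTarget:
    """Hypothetical record of what the MSS needs for one component."""
    agent_address: str   # how to reach the agent
    protocol: str        # how to talk to the agent (e.g. "SNMP", "WMI")
    component_id: str    # which component to ask about
    credentials: Optional[str] = None

    def missing_fields(self, closed_network=False):
        """List what the MSS still lacks before it can start collecting."""
        required = ("agent_address", "protocol", "component_id")
        missing = [name for name in required if not getattr(self, name)]
        # On a closed management network, credentials may be unnecessary.
        if not closed_network and not self.credentials:
            missing.append("credentials")
        return missing


target = MonitoringTarget(agent_address="10.0.0.5", protocol="SNMP", component_id="vm-1")
```

With no credentials supplied, `target.missing_fields()` reports `["credentials"]`, while `target.missing_fields(closed_network=True)` reports nothing missing.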
Potential Mechanisms
• M1 – When a component is activated, make correlated Template Model and Instance Model available to MSS. MSS figures out how to monitor based on this info:
  – Proximate portions of Models are extracted and passed or an interactive API for browsing relationships is provided
  – Assume that Agents are either baked into the components, or are explicitly deployed by Service Template.
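One possible reading of "proximate portions" in M1 is the set of components within a few relationship hops of the activated component. The Python sketch below assumes that reading. The function name, the `(source, target)` pair representation, and the component ids are hypothetical.

```python
from collections import deque


def proximate_components(relationships, start, max_hops=1):
    """Return ids within `max_hops` relationship steps of `start` (inclusive)."""
    # Treat relationships as undirected for browsing purposes (an assumption).
    neighbours = {}
    for source, target in relationships:
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in neighbours.get(node, ()):
            if other not in hops:
                hops[other] = hops[node] + 1
                queue.append(other)
    return set(hops)


# app-1 is HostedOn vm-1; the hypervisor and host lie outside the Service Template.
relationships = [
    ("app-1", "vm-1"),
    ("vm-1", "hypervisor-1"),
    ("hypervisor-1", "host-1"),
]
near_app = proximate_components(relationships, "app-1", max_hops=2)
```

A hop limit of 2 reaches the hypervisor from the application, which matches the case where an outside component's agent reports on a component inside the template.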
• Others???
What do we need to specify?
• What metrics are available for each Component Type?
  – Metric Type ID
  – Description
  – Data Type (e.g. Numeric, String, etc.)
  – Units (e.g. Volts, Megabytes, Percent, etc.)
  – Constraints (Min, Max, Enumerated Values, etc.)
• Monitoring Policy (controlled via Service Template):
  – Monitoring Disposition:
    • Required – Don’t deploy if you can’t monitor
    • Best Effort – Deploy anyhow but enable monitoring if available
    • None
  – Metrics to Include? Exclude?
  – Components to Include? Exclude?
  – Minimum Sample Frequency?
  – Action Conditions (e.g. If A then Do B)
    • Not in this phase?
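The metric attributes and the three monitoring dispositions can be sketched as follows. This Python sketch is an illustration under assumptions: the class and function names are hypothetical, only Min/Max constraints are modeled, and the `(deploy, monitor)` return shape is one possible way to express the disposition rules.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Disposition(Enum):
    REQUIRED = "required"        # don't deploy if you can't monitor
    BEST_EFFORT = "best_effort"  # deploy anyhow, monitor if available
    NONE = "none"                # no monitoring


@dataclass
class MetricDefinition:
    """Hypothetical metric description for a component type."""
    type_id: str
    description: str
    data_type: str
    units: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def accepts(self, value):
        """Check a numeric sample against the Min/Max constraints."""
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


def deployment_decision(disposition, monitoring_available):
    """Return (deploy, monitor) for a component under the given disposition."""
    if disposition is Disposition.REQUIRED:
        return (monitoring_available, monitoring_available)
    if disposition is Disposition.BEST_EFFORT:
        return (True, monitoring_available)
    return (True, False)
```

For example, `deployment_decision(Disposition.REQUIRED, False)` yields `(False, False)`, while the same call with `Disposition.BEST_EFFORT` yields `(True, False)`.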
Metric Types
• Availability:
  – Percent Available
• Performance:
  – CPU Usage
  – I/O
  – Memory
• Workload:
  – Units Processed
  – Failed Units
  – Bytes Processed
• Security:
  – Access Attempts
  – Failed Access Attempts
• Locally Defined?