NM functions Configuration, Performance, Fault, Accounting, Security


Configuration Management
• Middle- and long-range activities for
    • Controlling physical, electrical and logical inventories
    • Maintaining vendor files and trouble tickets
    • Supporting provisioning and order processing
    • Defining and supervising service level agreements
    • Managing changes
    • Distributing software

• Configuration management is central to all other network management functions
    • All other management functions are supported by configuration details
    • Enhances control over configuring the network and devices
    • Gives quick access to vital configuration data
    • Helps with initialization, maintenance and shutdown of individual components and logical subsystems

Primary Information
• Actual configuration
• Attributes of network elements
• Generated configuration
• Status indicators of network elements
• Vendor data
• Change requests and records
• Order data
• Actual inventory
• Status of service-level indicators

Secondary Information
• Traffic volumes

• More details on indicators

• Performance indicators of the network elements

• etc


Configuration management functions

• Inventory management

• Network topology services

• Service Level agreements

• Designing, implementing and processing trouble tickets

• Order processing and provisioning

• Change Management

Inventory management
• Automated inventory: an online record of
    • Currently implemented components and spares
    • Vendor contacts
    • Location of components
    • Maintenance requirements for certain equipment classes
    • Service statistics such as
        • Number of outages
        • Response time for repair
        • Repair time distribution

Good Inventory Management
• Less redundancy
    • If the same information is stored in different databases, resources and processing time are wasted backing up those databases
• Synchronized change management
• Unique names and addresses
    • Helps during troubleshooting
• Efficient troubleshooting
• Better capacity and contingency planning

Network Topology Services
• Requires current and historical configurations
• Layered configuration displays, at network and component level, of the electrical, physical and logical layouts

Display of configuration details

[Figure: network backbone map with links labeled T1, T1/T3 and T3]

Network details: click on icon

[Figure: a node icon expanded into its network details view; elements labeled M]

Protocol level

[Figure: protocol-level display showing the PHY and DLC layers]

Auto Discovery tool
• An auto-discovery tool can discover devices on the network (periodically)
• Auto mapping produces the network map
• Both take up network bandwidth when they run

SLA
• Need to evaluate long-term service levels
• Consistency in customer service level
• Increased planning and decreased crisis management
• Service levels
    • Responsiveness, accuracy, availability
• Performance reporting
    • Planned and actual workload characteristics and service levels during the report period

Trouble tickets
• Linking trouble tickets
• Information in a trouble ticket
    • Time reported
    • Time received by responsible group
    • Time network service restored
    • Time vendor notified
    • Time vendor responded
    • Time vendor restored service
    • Total vendor time
    • Total user non-availability
    • Total service outage
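As a rough illustration of how the totals in a trouble ticket could be derived from its timestamps, here is a minimal Python sketch. The class name, field names and sample times are hypothetical. The two formulas are assumptions, since the slides do not define them: vendor time is taken to run from vendor notification to vendor restoration, and service outage from first report to service restoration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TroubleTicket:
    """Timestamps recorded on a trouble ticket (field names are illustrative)."""
    reported: datetime
    received_by_group: datetime
    vendor_notified: datetime
    vendor_responded: datetime
    vendor_restored: datetime
    service_restored: datetime

    def total_vendor_time(self) -> timedelta:
        # Assumption: vendor time spans notification to vendor restoration.
        return self.vendor_restored - self.vendor_notified

    def total_service_outage(self) -> timedelta:
        # Assumption: the outage spans first report to service restoration.
        return self.service_restored - self.reported


# Hypothetical ticket, all times on the same day.
ticket = TroubleTicket(
    reported=datetime(2024, 1, 8, 9, 0),
    received_by_group=datetime(2024, 1, 8, 9, 10),
    vendor_notified=datetime(2024, 1, 8, 9, 30),
    vendor_responded=datetime(2024, 1, 8, 10, 0),
    vendor_restored=datetime(2024, 1, 8, 11, 30),
    service_restored=datetime(2024, 1, 8, 11, 45),
)
print(ticket.total_vendor_time(), ticket.total_service_outage())
```

Storing raw timestamps and deriving the totals keeps the ticket consistent if a time is corrected later.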


Change Management

User request

Study Impact

Plan Change

Schedule

Request OK

Execute

Document

Configuration and inventory database

Tools for configuration management
• Simple tools
    • Provide simple storage for all network-related information
    • Data is collected and entered manually
• Complex tools
    • Automatically gather data: the latest information on configuration
    • Compare the current configuration with the stored configuration
    • Change a device's configuration while it is running
    • Specify configuration errors that should generate warning messages
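Comparing the current configuration with the stored one can be sketched in a few lines of Python. This is only an illustration: it assumes a configuration can be represented as a flat dictionary of settings, and the function name and sample values are hypothetical.

```python
def diff_config(stored: dict, current: dict) -> dict:
    """Return {setting: (stored_value, current_value)} for every difference.

    A setting missing on one side is reported with None on that side.
    """
    changes = {}
    for key in sorted(stored.keys() | current.keys()):
        old, new = stored.get(key), current.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical stored and live configurations for one device.
stored = {"hostname": "router1", "mtu": 1500, "snmp_community": "ops"}
current = {"hostname": "router1", "mtu": 9000, "ntp_server": "10.0.0.5"}

drift = diff_config(stored, current)
for setting, (old, new) in drift.items():
    print(f"{setting}: stored={old!r} current={new!r}")
```

A real tool would also need to tell a setting that is absent from one whose value is genuinely empty, which this sketch does not attempt.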

Performance Management
• Activities required to continuously evaluate principal performance indicators, in order to
    • Check service level maintenance
    • Identify potential bottlenecks
    • Establish trend reports
    • Track network utilization and error rates

Performance Management (contd.)
• Involves
    • Collecting data on current utilization of network devices and links
    • Analyzing data to discern high-utilization trends
    • Setting utilization thresholds
    • Using off-line simulation and/or analytical studies on how to maximize performance

Primary Information
• Actual configuration
• Generated configuration
• Performance indicators in real time or near-real time
    • Response time
    • Congested channels
    • Resource utilization
• Selected vendor data
• Performance histories for selected facilities
• Operational procedures

Performance Indicators
• Availability
• Response time
• Throughput
• Utilization (channel occupancy)
• Grade of service
• Transmission volumes
• Offered load
• Accuracy

Indicators
• Service oriented indicators
    • Have priority

• Efficiency oriented indicators

Service Oriented Indicators
• Availability
    • From the customer's perspective, depends on the technical reliability of components
    • Redundancy?
• Cost-benefit
    • Total costs = cost of redundancy + cost of consequences

Availability

$$\text{Availability} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTD} + \text{MTTR} + \text{MTOR}}$$

• MTBF: mean time between failures
• MTTD: mean time to diagnose
• MTTR: mean time to repair (or report)
• MTOR: mean time of repair
• For better availability, keep MTTD, MTTR and MTOR low
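The availability ratio can be checked numerically with a short Python sketch. The function name and the sample figures (in hours) are assumptions chosen for illustration. It shows that halving MTTD, MTTR and MTOR raises availability even when MTBF is unchanged.

```python
def availability(mtbf: float, mttd: float, mttr: float, mtor: float) -> float:
    """Availability = MTBF / (MTBF + MTTD + MTTR + MTOR), all in the same time unit."""
    return mtbf / (mtbf + mttd + mttr + mtor)


# Hypothetical figures in hours.
baseline = availability(mtbf=1000.0, mttd=2.0, mttr=4.0, mtor=4.0)
improved = availability(mtbf=1000.0, mttd=1.0, mttr=2.0, mtor=2.0)

print(f"baseline: {baseline:.4f}, improved: {improved:.4f}")
```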

Response Time
• Propagation delays, processing delays, transmission delays, protocol delays

[Figure: timeline between user and system showing think time, enter time, network and system response time, output response time, and the overall end-user response time]

Response Time (contd.)
• Total response time
• Network delays
• Processing delays
• Protocol delays: time-outs
• Response time considerations depend on
    • Protocols and their behavior
    • Job priorities
    • Loads in the system

Accuracy
• Accuracy can be affected by
    • Erroneous transmission (wireless & fiber)
    • Characters transmitted but not delivered
    • Characters received which were not sent
    • Characters duplicated

Residual Error Rate

$$\text{Residual error rate} = \frac{\text{CHE} + \text{CHV} + \text{CHN} + \text{CHD}}{\text{CHT}}$$

• CHE = erroneous characters due to media & processing

• CHV = transmitted but not received

• CHN = extra characters received

• CHD = duplicated characters

• CHT = total characters
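A small Python sketch of the residual error rate, using the character counts defined above. The function name and the sample counts are hypothetical.

```python
def residual_error_rate(che: int, chv: int, chn: int, chd: int, cht: int) -> float:
    """(CHE + CHV + CHN + CHD) / CHT, using the counts defined in the list above."""
    if cht <= 0:
        raise ValueError("CHT (total characters) must be positive")
    return (che + chv + chn + chd) / cht


# Hypothetical counts observed over one measurement period.
rate = residual_error_rate(che=12, chv=5, chn=2, chd=1, cht=1_000_000)
print(f"residual error rate: {rate:.2e}")
```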

Efficiency oriented indicators
• Efficiency oriented indicators represent the interests of the organization
• Do service oriented monitoring and efficiency oriented monitoring conflict?

Efficiency vs service

[Figure: service (levels L1 to L3) plotted against efficiency (marks at 30%, 40% and 70%), with curves labeled CPU Busy, Channel Busy and Line Busy]

Throughput
• Measure of a server's capacity: MIPS
• Line throughput: kilobits/sec
• Application oriented
    • Number of transactions per unit time
    • Number of customer sessions per application
    • Number of calls serviced
    • Number of jobs provided by a node

Utilization
• Dynamic measure of resources used
• Puts a practical limit on the throughput under operational conditions
• Helps study overlap among component processing, mutual waits, etc.

Utilization
• Utilization vs accuracy
• Utilization vs throughput
• Utilization vs goodput

[Figure: two plots against time in seconds, one of link utilization and one of errors per second]

Overlap effects

[Figure: block diagram with input subsystem, CPU, output subsystem and output link; annotation: slow link?]

Availability
• Availability of a system depends on the availability of individual components (it is very difficult to measure and report on availability)
    • Check on each component and compare with the configuration
    • Depends on how components are connected

Example
• Each component availability = 0.98
• Availability of the serial combination is 0.98 * 0.98 = 0.9604 (about 0.96)

Example: 2 modems, serial processing of data

[Figure: Configuration 1, two components A connected in series]

• Probability that one link is not available = 0.02
• Probability that both links are not available = 0.02 * 0.02 = 0.0004
• Availability = 1 - 0.0004 = 0.9996

[Figure: Configuration 2, two components A connected in parallel]
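The two configurations can be reproduced with a short Python sketch. It assumes that component failures are independent, which the arithmetic in the example implies but the slides do not state. The function names are hypothetical.

```python
import math


def series_availability(components):
    """All components must work: multiply the individual availabilities."""
    return math.prod(components)


def parallel_availability(components):
    """The system fails only if every component fails (independence assumed)."""
    return 1.0 - math.prod(1.0 - a for a in components)


config1 = series_availability([0.98, 0.98])    # two components in series
config2 = parallel_availability([0.98, 0.98])  # two components in parallel

print(f"Configuration 1: {config1:.4f}")
print(f"Configuration 2: {config2:.4f}")
```

Both functions accept any number of components, so the same sketch covers longer chains or more redundant links.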

Performance measurements
• Data gathering
    • Exhaustive
    • Statistical
• Distribution for sampling times
• Correlation effects
• Performance analysis
    • Data presentation
    • Interpretation

Performance measurements (contd.)
• Historical trends
• Real-time trends
• Graphical presentation and comparison
• Linking different performance indicators
    • Then set thresholds

Simulation studies

• To improve performance or identify bottlenecks, model the network and its components (primary)
    • Study the effects of changes in the model
    • Target optimal performance
    • Requires synthetic traffic generation

• Analytical and simulation tools

Simple tools for PM
• Provide real-time information on network components
    • Graphical: bars, histograms
• Can help find bottlenecks
• Main information
    • Processor utilization
    • Memory utilization
    • Link: pkts/sec, bits/sec
    • Bit error rates

Complex Tools
• Set thresholds
• Take action once thresholds are exceeded
    • Alarm
    • Enable backup
• Near-threshold warning
• Store historical data
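The threshold behavior of a complex tool might look like the following Python sketch. The class name, the 90% warning fraction and the sample utilization values are assumptions made for illustration. The sketch only classifies each sample and leaves the actual action (raising an alarm, enabling a backup) to the caller.

```python
from collections import deque


class ThresholdMonitor:
    """Classifies samples against a threshold and keeps a bounded history."""

    def __init__(self, threshold: float, warn_fraction: float = 0.9,
                 history_size: int = 1000):
        self.threshold = threshold
        # Assumption: "near threshold" means within 10% below the threshold.
        self.warn_level = threshold * warn_fraction
        self.history = deque(maxlen=history_size)  # stored historical data

    def observe(self, value: float) -> str:
        self.history.append(value)
        if value > self.threshold:
            return "alarm"    # threshold exceeded: caller takes action
        if value >= self.warn_level:
            return "warning"  # near-threshold warning
        return "ok"


# Hypothetical link utilization samples in percent, threshold at 80%.
monitor = ThresholdMonitor(threshold=80.0)
states = [monitor.observe(u) for u in (35.0, 74.0, 91.0)]
print(states)
```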

A complex tool at work
• Performance problem
• Brief periods of interrupted service between systems (no information passes through) at 3 pm and 12 am

[Figure: systems Daisy and Gatsby, and a mainframe]

PM tool at work
• Check error rates in the network
    • Normal
• Check utilization
    • Peaks at 3 pm and 12 am, the times of backup
• Check Gatsby and Daisy utilization
    • Peaked to 100% at the specified times
• Check for processor-intensive applications
    • Negative

PM tool at work (contd.)
• Check network traffic type
    • Located an unknown protocol packet
    • Flooding the network: locating servers
    • Check the originator
    • Send a message to the originator
    • Or block the originator's traffic

Fault management
• Activities needed to dynamically maintain the network service level

• High network availability

Primary Information
• Actual configuration
• Generated configuration
• Event reports and alarms
• Status indicators of network elements
• Performance indicators
• Spare components and their status
• Backup routes and their status
• Vendor data for problem dispatch
• Global traffic volumes
• Progress of trouble resolution

Steps in FM
• Identify the occurrence of fault

• Isolate the cause of fault

• Correct the fault if possible

• First is difficult, second is very difficult!

Network Status Supervision
• Layered configuration maps (status)
    • Tightly coupled to topology display
• Zoom in on parts to isolate problems
• Real-time traffic status displays
• Good monitoring devices/sensors
• Monitored information to be passed on to agents or management elements
• Process and distribute messages, events and alarms

Status
• A measurement of the behavior of an object at a specific instant in time
    • Represented by a set of status information items and their values at a specific time

[Figure: network status derived from element status; elements labeled Element 0, Element 1 and Element 2, with statuses CSU1 down, CSU2 down and No Carrier]

Event
• A change in the status of an element that justifies notification, i.e. is significant to fault management
• An event report can be generated, containing
    • Type of event
    • Change in status
    • Time stamp
    • Reporting entity: object or process that generated the event
    • Managed object whose status changed
    • Managed object information
    • Probable cause
    • Effect of the event on the managed object
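The contents of an event report map naturally onto a record type. The Python sketch below is one possible shape, not a standard format: the class name, field names, field types and sample values are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class EventReport:
    """One field per item of an event report (names and types are illustrative)."""
    event_type: str
    status_change: tuple       # (old_status, new_status)
    reporting_entity: str      # object or process that generated the event
    managed_object: str        # managed object whose status changed
    managed_object_info: dict
    probable_cause: str
    effect: str                # effect of the event on the managed object
    # Time stamp defaults to the moment the report is created (UTC).
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


# Hypothetical report for a failed element.
report = EventReport(
    event_type="link_state",
    status_change=("up", "down"),
    reporting_entity="agent-csu1",
    managed_object="CSU1",
    managed_object_info={"site": "example-site"},
    probable_cause="no carrier",
    effect="services stopped",
)
print(report.managed_object, report.status_change, report.timestamp.isoformat())
```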

Event Filtering
• Multi-layered filtering

[Figure: events (E) from activity on the network pass through a threshold filter, a grouping filter and a prioritizing filter, producing prioritized problems 1, 2, 3]
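A threshold filter, a grouping filter and a prioritizing filter can be chained as in the Python sketch below. The slides do not say how each stage decides, so the rules used here are assumptions: events carry a numeric severity, grouping is by network element, and problems are ranked by their highest severity. All names and sample events are hypothetical.

```python
from collections import defaultdict


def threshold_filter(events, min_severity):
    """Drop events below the severity threshold."""
    return [e for e in events if e["severity"] >= min_severity]


def grouping_filter(events):
    """Group the surviving events by the element that raised them."""
    groups = defaultdict(list)
    for event in events:
        groups[event["element"]].append(event)
    return dict(groups)


def prioritizing_filter(groups):
    """Turn each group into one problem and rank by highest severity."""
    problems = [
        {"element": element,
         "severity": max(e["severity"] for e in evs),
         "events": len(evs)}
        for element, evs in groups.items()
    ]
    return sorted(problems,
                  key=lambda p: (-p["severity"], -p["events"], p["element"]))


# Hypothetical raw activity on the network.
events = [
    {"element": "router-a", "severity": 2},
    {"element": "router-a", "severity": 5},
    {"element": "modem-b", "severity": 1},
    {"element": "modem-c", "severity": 3},
    {"element": "modem-c", "severity": 3},
]
problems = prioritizing_filter(
    grouping_filter(threshold_filter(events, min_severity=2)))
for rank, problem in enumerate(problems, start=1):
    print(rank, problem)
```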

Filtering Process

[Figure: bit errors plotted over time, with annotations: investigated, no action; investigated; action taken; action effective]

Filtering Process
• Global filtering
    • The first process applied to an event: is the event serious, and does it have to be processed?
    • Uses a set of criteria for this assessment
    • Cannot be function specific

Filtering Process (contd.)
• Distribution filtering
    • An event processor selects the events it wishes to receive
    • Various event processes run simultaneously
• Event process filtering
    • Filtering done by the event processor
    • Specific to the function

Event Processor
• Examines and processes event reports
• Passive processing
    • Sampling and logging
• Proactive processing
    • Takes automatic corrective action

Process for filtering

[Figure: event reports enter the event distribution unit, which applies global filtering and, through distribution and subscription services, places events on queues (Q) that feed several event processors]
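An event distribution unit with global filtering and per-processor subscriptions could be sketched as follows in Python. The class name, the method names and the use of predicates and in-memory queues are assumptions for illustration. A real system would probably use persistent or network queues.

```python
from collections import deque


class EventDistributionUnit:
    """Applies a global filter, then queues each event for its subscribers."""

    def __init__(self, global_filter):
        self.global_filter = global_filter  # decides if an event is processed at all
        self.subscriptions = []             # (predicate, queue) per event processor

    def subscribe(self, predicate):
        """An event processor selects the events it wishes to receive."""
        queue = deque()
        self.subscriptions.append((predicate, queue))
        return queue

    def publish(self, event) -> int:
        """Return the number of queues the event was delivered to."""
        if not self.global_filter(event):
            return 0
        delivered = 0
        for predicate, queue in self.subscriptions:
            if predicate(event):
                queue.append(event)
                delivered += 1
        return delivered


# Hypothetical setup: two processors, global filter on severity.
unit = EventDistributionUnit(global_filter=lambda e: e["severity"] >= 2)
fault_queue = unit.subscribe(lambda e: e["kind"] == "alarm")
perf_queue = unit.subscribe(lambda e: e["kind"] == "utilization")

unit.publish({"kind": "alarm", "severity": 4, "element": "CSU1"})
unit.publish({"kind": "utilization", "severity": 2, "element": "link-3"})
unit.publish({"kind": "alarm", "severity": 1, "element": "CSU2"})  # globally filtered

print(len(fault_queue), len(perf_queue))
```

Each event processor can then apply its own function-specific filtering when it reads from its queue.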

Event effect
• Permanent – external action required

• Temporary – will correct automatically

• Impending – will result in failure soon

• Impaired – services can be provided at reduced levels

• Inhibited – services stopped

Dynamic Troubleshooting
• Opens trouble tickets, links them, dispatches them to the proper vendors, and checks the on-line progress of trouble tickets
• Problem detection
    • Is something wrong?
• Problem determination
    • What is wrong and where is the problem in the network?
• Problem diagnosis & resolution
    • To isolate, fix, or provide backup and fix

End-to-end testing
• To dynamically verify correct network operation
    • Conducted during normal network operation, without affecting it
• Can we have overhead-free testing?
• What components should be tested?
• How should tasks be assigned?
    • Local sites
    • Central sites

End-to-end testing (contd.)
• When to monitor and test?
    • Continually, periodically, on demand
• How to monitor and test?
    • Disruptive, non-disruptive
• What indicators to monitor and test?
    • Service level, efficiency, loops, circuits
• What instruments to use?
    • Hardware, software, analog, digital
• What reports are to be generated?
    • Standard, ad hoc with special evaluations
• What are the triggering events?
    • Time, single or combined events, alarms

Types of faults
• Unobservable
    • Deadlocks between processes
    • Instrument not capable of recording the events
• Partially observable
    • Node failure; actual reason: low-level protocol
• Uncertainty in observation
    • Lack of device response
        • Device is down, network partitioned, congestion delays, local timer faulty

Issues in isolating faults
• Multiple potential faults
    • Number of elements failing
• Too many related observations
    • One fault manifests itself as various events
• Interference between diagnosis and local recovery procedures
    • Error recovery sets in before diagnosis
• Absence of automated tools

Example FM
• Problem scenario: Sergeant fails due to buffer overflow

[Figure: nodes Sergeant and Pepper, LAN1, LAN2 and LAN3, and the network management system]

Example FM (contd.)
• The buffer in Sergeant is well provisioned
    • Fails due to traffic surge
• Pepper reports failure of the link to LAN3
    • Message sent to the NM system
• NMS asks Pepper to check for carrier presence on the link to LAN3
    • Carrier absence reported
• NMS asks Pepper to perform a loopback on link 3
    • OK

Example FM (contd.)
• NMS resets Sergeant

• ?

• Actual reason for failure not identified

• This could have been avoided if Sergeant had sent an event when utilization exceeded 80% or 90%

Simple tool
• Points out the existence of a problem
    • E.g. ICMP ping tells you about the existence of a system
• A complex tool may perform all the functions shown in the previous example