Emulex Confidential - © 2014 Emulex Corporation NetPod Overview Boni Bruno, Technical Director, Emulex

Netpod - The Merging of NPM & APM

Page 1: Netpod - The Merging of NPM & APM

Emulex Confidential - © 2014 Emulex Corporation

NetPod Overview

Boni Bruno, Technical Director, Emulex

Page 2: Netpod - The Merging of NPM & APM

What is NetPod?

NetPod is an Application Awareness & Network Performance Monitoring solution designed to help the problem response teams responsible for resolving network and application issues…

Page 3: Netpod - The Merging of NPM & APM

NetPod Application Decodes

• Web

• SAP

• Oracle Forms

• Exchange

• VoIP

• XML/SOAP and WCF

• Websphere MQ and XML over MQ

• Tuxedo Jolt

• LDAP

• DRDA (DB2), Informix, TDS (MSSQL, Sybase), Oracle SQL*Net, MySQL

• SSL

• Citrix ICA

• TCAM Thin Client Analysis Module (Citrix & WTS)

Page 4: Netpod - The Merging of NPM & APM

Immediate Benefits of NetPod

Page 5: Netpod - The Merging of NPM & APM

Stages of Incident Investigation

[Figure: cumulative cost to the organization plotted against time. Stages marked along the time axis: incident occurrence, problem reported (by user or monitoring tool), fault domain identified, root cause found, repair.]

• Costs begin accumulating: lost end-user productivity, lost revenue, etc.

• The cost rate increases as IT and network staff work on the incident, seek a workaround, etc.

• Once the location of the faulty equipment or application is found, the cost rate may decrease: a smaller team focuses on root cause analysis.

• Repair begins once the root cause is found.
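The cost curve on this slide can be sketched numerically. The following is a minimal illustration, assuming hypothetical stage durations and cost rates (the `Stage` type, the stage names, and every figure are made up for the example and do not come from the deck). It only shows how the cumulative cost builds up stage by stage and how shortening each stage lowers the total.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    hours: float          # assumed duration of the stage
    cost_per_hour: float  # assumed cost rate while the stage lasts


def cumulative_cost(stages):
    """Return (stage name, running total) pairs in stage order."""
    total = 0.0
    running = []
    for stage in stages:
        total += stage.hours * stage.cost_per_hour
        running.append((stage.name, total))
    return running


def shorten(stages, factor):
    """Scale every stage duration by the same factor (for example 0.5)."""
    return [Stage(s.name, s.hours * factor, s.cost_per_hour) for s in stages]


# Illustrative figures only; none of them come from the deck.
baseline = [
    Stage("occurrence to report", 1.0, 100.0),
    Stage("report to fault domain identified", 4.0, 500.0),
    Stage("fault domain to root cause found", 3.0, 300.0),
    Stage("root cause to repair", 2.0, 300.0),
]

if __name__ == "__main__":
    for name, total in cumulative_cost(baseline):
        print(f"{name}: {total:.0f}")
    print("halved stages:", cumulative_cost(shorten(baseline, 0.5))[-1][1])
```

With these assumed numbers the most expensive stage is the one in which many teams search for the fault domain, which is why shortening that stage has the largest effect on the total.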

Page 6: Netpod - The Merging of NPM & APM

Goal is to Decrease Time-to-Resolution

[Figure: cumulative cost to the organization plotted against time. Stages marked along the time axis: incident occurrence, reported (via AA-NPM), responsible technology identified, root cause found, repair.]

NetPod reduces each stage of the investigation workflow.

Page 7: Netpod - The Merging of NPM & APM

Incident Complexity

[Matrix: columns "Single Incident" and "Recurring Problem"; rows "Cause Unidentified" and "Cause Identified". The entries follow.]

• Change-related cause

• Hardware Failure

• Software Failure

• Misconfiguration

• Operations Error

• User Error

• Operations Error

• User Error

• H/W “hiccup”

• No idea … happened once and has never happened again!

• Intermittent H/W Failure

• Intermittent S/W Failure

• Known Error

• Common user error

• Application Logic

• Transient overload

• Unexpected interactions

• Incorrect failover

• Keeps happening but then correcting itself!

Page 8: Netpod - The Merging of NPM & APM

Problem Resolution Process

[Matrix: columns "Single Incident" and "Recurring Problem"; rows "Cause Unidentified" and "Cause Identified". The entries follow.]

• Change-related cause

• Hardware Failure

• Software Failure

• Misconfiguration

• Operations Error

• User Error

• Operations Error

• User Error

• H/W “hiccup”

• No idea … happened once and has never happened again!

• Intermittent H/W Failure

• Intermittent S/W Failure

• Known Error

• Common user error

• Application Logic

• Transient overload

• Unexpected interactions

• Incorrect failover

• No idea … keeps happening but then correcting itself!

Annotations on the matrix:

• Resolution is within scope of standard operations processes and using existing tools

• Resolution is handled by diagnostics team on a per-incident basis

Page 9: Netpod - The Merging of NPM & APM

The IT Nightmare: Recurring Grey Problems

[Matrix: columns "Single Incident" and "Recurring Problem"; rows "Cause Unidentified" and "Cause Identified". The entries follow.]

• Change-related cause

• Hardware Failure

• Software Failure

• Misconfiguration

• Operations Error

• User Error

• Operations Error

• User Error

• H/W “hiccup”

• No idea … happened once and has never happened again!

• Intermittent H/W Failure

• Intermittent S/W Failure

• Known Error

• Common user error

• Application Logic

• Transient overload

• Unexpected interactions

• Incorrect failover

• No idea … keeps happening but then correcting itself!

[Callout: Recurring Grey Problem]

Page 10: Netpod - The Merging of NPM & APM

Why “Grey” is so challenging (and expensive)

[Diagram labels: Grey Problem, Problem Manager, Data Networks, Server Support, Database Support, Solution Architects, Desk Support, App Support.]

Pass the Hot (Grey) Potato!

Too long!

Page 11: Netpod - The Merging of NPM & APM

Outsourcing Makes it Worse

[Matrix: columns "Single Incident" and "Recurring Problem"; rows "Cause Unidentified" and "Cause Identified"; a dividing line separates the two resolution regions.]

• Resolution is within scope of standard operations processes and using existing tools

• Resolution is handled by diagnostics team on a per-incident basis

Outsourcing moves the line up… Distributing operational responsibility across multiple organizations means fewer incidents can be handled via standard operating procedures.

Page 12: Netpod - The Merging of NPM & APM

How Monitoring and Analysis Tools Help

[Matrix: columns "Single Incident" and "Recurring Problem"; rows "Cause Unidentified" and "Cause Identified"; a dividing line separates the two resolution regions.]

• Resolution is within scope of standard operations processes and using existing tools

• Resolution is handled by diagnostics team on a per-incident basis

AA-NPM via NetPod moves the line down. More incidents can be handled via standard operating procedures.

Page 13: Netpod - The Merging of NPM & APM

Typical Application Transactions

[Diagram labels: End User, Web and App Servers, Databases, Request, Response, Logical Transaction Flow, Visible Transaction Flow, Network Tap Points.]

Tap points deployed between each service tier make the diagnostic data available to the incident response team.

Page 14: Netpod - The Merging of NPM & APM

Tap Points Determine What is Visible

[Diagram labels: End User, Network, Servers, Other servers, NetPod, Tap Points, Visible Transaction Flow. Callouts: "Only visible as a single functional unit" and "Visible as two functional units".]

Page 15: Netpod - The Merging of NPM & APM

Increasing Visibility

[Diagram labels: End User, Network, Servers, Other servers, NetPod, Tap Points, Tapped Transaction Flow; units numbered 1, 2, 3. Callout: "Now visible as three functional units".]

Page 16: Netpod - The Merging of NPM & APM

Adding Visibility “Inside the Tiers”

[Diagram labels: End Users, Network, Web and App Servers, Databases, Switch (x3), Tap (x4), Taps, Agent (x3), NetPod, Inter-Technology, Intra-Technology.]

Page 17: Netpod - The Merging of NPM & APM

Why Packet-Based Analysis?

• Network connections exist between functional units

• Packets (and the transactions they create) provide visibility deep into actual service behaviour

– Especially when the service is misbehaving!

• Traffic monitoring is passive and has no impact on application or network performance

• Packets enable per-incident transaction analysis

– This greatly simplifies problem investigations by reducing their scope

– Puts everything into as simple a context as possible

• Transaction analysis enables Fault Domain Isolation

• Packets provide deep insight into root cause, whether it be application or network based
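To make "packets and the transactions they create" concrete, here is a minimal sketch of deriving per-transaction response times from passively captured packets. It assumes a simplified, hypothetical `Packet` record (timestamp, source, destination, and a request/response flag) and at most one outstanding request per host pair; it is not NetPod's decode logic, which the deck does not describe.

```python
from collections import namedtuple

# Hypothetical, simplified packet record; real captures carry far more detail.
Packet = namedtuple("Packet", "ts src dst kind")  # kind: "request" or "response"


def response_times(packets):
    """Pair each request with the next response on the same host pair.

    Returns (client, server, seconds) tuples in the order the responses arrive.
    Assumes at most one outstanding request per (client, server) pair.
    """
    pending = {}
    results = []
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        if p.kind == "request":
            pending[(p.src, p.dst)] = p.ts
        elif p.kind == "response":
            key = (p.dst, p.src)  # the response travels server -> client
            start = pending.pop(key, None)
            if start is not None:
                results.append((key[0], key[1], round(p.ts - start, 6)))
    return results


if __name__ == "__main__":
    capture = [
        Packet(0.0, "client", "web", "request"),
        Packet(0.1, "web", "db", "request"),
        Packet(0.4, "db", "web", "response"),
        Packet(0.5, "web", "client", "response"),
    ]
    for client, server, seconds in response_times(capture):
        print(f"{client} -> {server}: {seconds} s")
```

Because the timings are computed from observed traffic only, nothing is installed on or asked of the monitored hosts, which is the sense in which the slide calls the monitoring passive.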

Page 18: Netpod - The Merging of NPM & APM

NetPod Physical Architecture

• NetPod agentless monitoring & packet recording:

– Forwards metadata to NetPod Analysis Server

– Stores all packets to local storage

– Forwards requested packets to Analysis Server for in-depth back-in-time analysis

• NetPod Analysis Server:

– Performs Transaction Analysis

– Performance measurement and reporting

– Visualized via browser-based NetPod GUI

• NetPod DNA:

– Ultra deep packet analysis on desktop

– Also 3rd party pcap analyzers (e.g. Wireshark)

[Diagram labels: Internet/Intranet, Load balancer, Web server, Application servers, Database and back-end servers, NetPod AMD and packet store, NetPod Analysis Server, NetPod Analyst, NetPod DNA.]

Page 19: Netpod - The Merging of NPM & APM

NetPod Enabled Incident Investigation Workflow

Entry points:

• Reactive Path: the end user contacts the help desk with a complaint that includes time of day and application name.

• Proactive Path: NetPod flags a specific Application, Transaction, Operation, or User performance problem.

Investigation steps:

• Zero in on the specific time of day when the issue is occurring.

• Use the NetPod GUI to work edge-in, identifying the application operation(s) that fully explain the performance issue.

• The Data Center Analysis screen provides the visibility to perform Fault Domain Isolation.

• Operations are associated with a specific client (user) – server pair. Download the packets between that pair and inspect them in DCA or Wireshark to confirm what you are seeing.
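The last steps of this workflow (pick a time window, find the operation that explains the slowdown, then pull the packets for its client-server pair) can be sketched as follows. The `Operation` record, its field names, and the dictionary-based packet records are hypothetical stand-ins; the deck does not document NetPod's data model or any export API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    ts: float       # start time, seconds since some reference point
    client: str
    server: str
    name: str
    seconds: float  # measured duration of the operation


def slowest_in_window(ops, start, end):
    """Return the slowest operation with start <= ts < end, or None."""
    in_window = [o for o in ops if start <= o.ts < end]
    return max(in_window, key=lambda o: o.seconds, default=None)


def packets_for_pair(packets, client, server):
    """Keep packets exchanged between the two hosts, in either direction."""
    pair = {client, server}
    return [p for p in packets if {p["src"], p["dst"]} == pair]


if __name__ == "__main__":
    ops = [
        Operation(10.0, "10.0.0.5", "app1", "login", 0.4),
        Operation(12.0, "10.0.0.7", "app1", "submit payment", 3.9),
        Operation(40.0, "10.0.0.5", "app1", "search", 6.0),  # outside window
    ]
    packets = [
        {"src": "10.0.0.7", "dst": "app1", "len": 512},
        {"src": "app1", "dst": "10.0.0.7", "len": 1400},
        {"src": "10.0.0.5", "dst": "app1", "len": 300},
    ]
    worst = slowest_in_window(ops, 0.0, 30.0)
    print(worst.name, len(packets_for_pair(packets, worst.client, worst.server)))
```

Narrowing to one host pair and one time window is what keeps the packet set small enough to inspect by hand in a pcap analyzer.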

Page 20: Netpod - The Merging of NPM & APM

Network Landing Page

Page 21: Netpod - The Merging of NPM & APM

Network Overview

Page 22: Netpod - The Merging of NPM & APM

End-to-End Application & User Health Visibility

[Screenshot callouts: Affected Users, Cross-Tier FDI, Network Health, Transaction Health, Response Time Spike.]

Page 23: Netpod - The Merging of NPM & APM

Domain Isolation Across Multi-Tiered Applications

[Screenshot callouts: Zoom to any time period; Most often slower than benchmark; Slowest operations.]

Page 24: Netpod - The Merging of NPM & APM

EUE Overview

[Screenshot callout: Sites]

Page 25: Netpod - The Merging of NPM & APM

Client Site = Bangalore, All Users

Page 26: Netpod - The Merging of NPM & APM

DNA (running on Desktop) – Conversation Map

Page 27: Netpod - The Merging of NPM & APM

DNA (running on Desktop) – CNS Breakdown

Page 28: Netpod - The Merging of NPM & APM

DNA (running on Desktop) – Thread Analysis

Page 29: Netpod - The Merging of NPM & APM

DNA – Transaction Bounce Diagram

Page 30: Netpod - The Merging of NPM & APM

Transaction Expert Report

Page 31: Netpod - The Merging of NPM & APM

Transaction Expert Report

Page 32: Netpod - The Merging of NPM & APM

In Summary…

• Integrated, easy to use business and network transaction analysis solution

• Proactive visibility into network & application performance

• Unique ability to decode applications

• Reduce network and application downtime with immediate fault domain isolation

• Fully triage issues quickly and effectively via rapid access to underlying packets

• Track performance and availability on a user-by-user basis

Page 33: Netpod - The Merging of NPM & APM

Thank You

Page 34: Netpod - The Merging of NPM & APM

Fault Domain Isolation using NetPod

Additional slides…

Page 35: Netpod - The Merging of NPM & APM

A “Normal” Payment Transaction

[Sequence diagram across Browser, Load Balancer, Front Ends, Business Backend, Payment Backend, and Payment DB, with U and S markers on the tiers. Messages: Page Request, Mouse Click, Page Request, Payment Request, Payment Confirmed, Final Response; cookie labels on the first hops. Timing labels: 2 s, 0.5 s.]

Page 36: Netpod - The Merging of NPM & APM

Fault Domain Isolation: Slow User Browser

[Sequence diagram across Browser, Load Balancer, Front Ends, Business Backend, Payment Backend, and Payment DB, with U and S markers on the tiers. Messages: Page Request, Mouse Click, Page Request, Payment Request, Payment Confirmed, Final Response. Timing labels: 4 s, 0.5 s, 0.5 s, 0.7 s, 0.4 s, 0.4 s.]

Page 37: Netpod - The Merging of NPM & APM

Fault Domain Isolation: Problem with Load Balancer

[Sequence diagram across Browser, Load Balancer, Front Ends, Business Backend, Payment Backend, and Payment DB, with U and S markers on the tiers. Messages: Page Request, Mouse Click, Page Request, Payment Request, Payment Confirmed, Final Response. Timing labels: 4 s, 0.5 s, 0.4 s, 0.4 s, 0.4 s, 0.4 s, 0.4 s.]

Page 38: Netpod - The Merging of NPM & APM

Fault Domain Isolation: Slow Database Tier

[Sequence diagram across Browser, Load Balancer, Front Ends, Business Backend, Payment Backend, and Payment DB, with U and S markers on the tiers. Messages: Page Request, Mouse Click, Page Request, Payment Request, Payment Confirmed, Final Response. Timing labels: 4 s, 2.5 s, 2 s.]

Page 39: Netpod - The Merging of NPM & APM

Fault Domain Isolation: Slow due to Network

[Sequence diagram across Browser, Load Balancer, Front Ends, Business Backend, Payment Backend, and Payment DB, with U and S markers on the tiers. Messages: Page Request, Mouse Click, Page Request, Payment Request, Payment Confirmed, Final Response. A run of lost packets is marked "TCP ACK not received". Timing labels: 4 s, 2.5 s, 0.5 s.]
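The comparison these fault-domain slides walk through (a normal transaction against a slow one, tier by tier) can be sketched as a simple per-tier difference. This is a minimal illustration, not NetPod's algorithm: the function name and the per-tier timings are assumptions, chosen only so the totals match the 2 s normal and 4 s slow cases on the slides; the slides do not give a complete per-tier breakdown.

```python
def isolate_fault_domain(baseline, observed):
    """Return the tier whose time grew the most, and by how many seconds.

    Both arguments map tier name -> seconds spent in that tier.
    """
    excess = {tier: observed[tier] - baseline[tier] for tier in baseline}
    tier = max(excess, key=excess.get)
    return tier, excess[tier]


# Assumed per-tier timings (seconds); only the totals echo the slides.
normal = {
    "browser": 0.5,
    "load balancer": 0.1,
    "front ends": 0.4,
    "business backend": 0.4,
    "payment backend": 0.4,
    "payment db": 0.2,
}
slow = dict(normal, **{"payment db": 2.2})  # same transaction, slow database

if __name__ == "__main__":
    tier, extra = isolate_fault_domain(normal, slow)
    print(f"fault domain: {tier} (+{extra:.1f} s)")
```

The same function covers the other cases on these slides if the network hops between tiers are added as their own entries, so that time lost to retransmissions shows up against a link rather than a server.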

Page 40: Netpod - The Merging of NPM & APM

How NetPod Simplifies the Workflow

• Simple Fault Domain Isolation requires:

– An ability to decode applications and their unique transactions

– Correlation of end-user experience to specific transactions

– Fine-grained transaction monitoring

   • Averages hide intermittent problems

   • Problem investigations sometimes require visibility into individual transactions

• Fast Root Cause Analysis requires:

– Visibility into the individual operations that comprise high-level transactions

– Detailed network performance metrics, error condition monitoring, etc.
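The point that averages hide intermittent problems is easy to show with arithmetic. The sketch below uses made-up response times and a hypothetical `summarize` helper with an assumed 2-second threshold: 99 transactions behave normally and one stalls, yet the mean barely moves.

```python
from statistics import mean


def summarize(times, threshold):
    """Summarize response times (seconds) against a slow-transaction threshold."""
    slow = [t for t in times if t > threshold]
    return {"mean": mean(times), "max": max(times), "slow_count": len(slow)}


if __name__ == "__main__":
    # Made-up sample: 99 normal transactions at 0.5 s and one 8 s stall.
    samples = [0.5] * 99 + [8.0]
    print(summarize(samples, threshold=2.0))
```

A dashboard that tracks only the mean would report about 0.58 s here, while the per-transaction view exposes the single 8-second outlier that a user actually complained about.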

Page 41: Netpod - The Merging of NPM & APM

Application Detection via Deep Packet Inspection

Page 42: Netpod - The Merging of NPM & APM

Traffic Filtering and Analysis

Page 43: Netpod - The Merging of NPM & APM

Visualizing Traffic Type by Bandwidth

Page 44: Netpod - The Merging of NPM & APM

Visualizing Traffic Type by IP Address

Page 45: Netpod - The Merging of NPM & APM

NetPod Components

Product name: function (appliance model)

• NetPod X (5RU): Capture (CWP-7024), AMD (4x) (CWP-4004), CAS (CWP-4000)

• NetPod I (3RU): Capture/AMD (CWP-8004), CAS (CWP-4000)

• NetPod CAS (1RU): CAS (CWP-4000)

• NetPod IU (2RU): Capture/AMD (CWP-8004)

• NetPod XS (1RU): AMD (4x) (CWP-4004)

• NetPod XU (3RU): Capture (CWP-7024)

• NetPod ADS (1RU): ADS (CWP-4000)

[Grouping labels on the slide: Bundles, Upgrades, 10G, 1G.]