Emulex Confidential - © 2014 Emulex Corporation
NetPod Overview
Boni Bruno, Technical Director, Emulex
What is NetPod?
NetPod is an Application Awareness & Network Performance Monitoring (AA-NPM) solution designed to help problem response teams responsible for resolving network and application issues…
Page 3
NetPod Application Decodes
• Web
• SAP
• Oracle Forms
• Exchange
• VoIP
• XML/SOAP and WCF
• WebSphere MQ and XML over MQ
• Tuxedo Jolt
• LDAP
• DRDA (DB2), Informix, TDS (MSSQL, Sybase), Oracle SQL*Net, MySQL
• SSL
• Citrix ICA
• TCAM (Thin Client Analysis Module) for Citrix & WTS
Immediate Benefits of NetPod
Page 5
Stages of Incident Investigation
[Figure: cumulative cost to the organization (vertical axis) versus time (horizontal axis), with stages marked: incident occurrence, problem reported (by user or monitoring tool), fault domain identified, root cause found, repair.]
• Costs begin accumulating: lost end-user productivity, lost revenue, etc.
• Cost rate increases as IT and network staff work on the incident, seek to find a workaround, etc.
• Once the location of the faulty equipment or application is found, the cost rate may decrease: a smaller team focuses on root cause analysis
• Repair begins once the root cause is found
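The cost curve on this slide can be worked through as simple arithmetic: each stage of the investigation contributes its duration multiplied by the cost rate during that stage. The sketch below is only an illustration; the function name `cumulative_cost`, the stage durations, and the hourly rates are all hypothetical and do not come from the slides.

```python
def cumulative_cost(stages):
    """Return the running cost after each stage.

    `stages` is a list of (milestone, duration_hours, cost_per_hour) tuples,
    where the duration is the time spent reaching that milestone.
    """
    total = 0.0
    running = []
    for milestone, hours, rate in stages:
        total += hours * rate
        running.append((milestone, total))
    return running


# Hypothetical durations and rates, chosen only to mirror the curve's shape:
# the rate rises while staff hunt for the fault domain, then drops once a
# smaller team takes over root cause analysis.
incident = [
    ("problem reported", 1.0, 1000.0),
    ("fault domain identified", 4.0, 3000.0),
    ("root cause found", 2.0, 1500.0),
    ("repaired", 1.0, 1500.0),
]

for milestone, cost in cumulative_cost(incident):
    print(f"{milestone}: {cost:.0f}")
```

With these made-up numbers, most of the cost accrues before the fault domain is identified, so shortening that stage has the largest effect on the total.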
Page 6
Goal is to Decrease Time-to-Resolution
[Figure: cumulative cost to the organization (vertical axis) versus time (horizontal axis), with stages marked: incident occurrence, reported (via AA-NPM), responsible technology identified, root cause found, repair.]
NetPod reduces each stage of the investigation workflow
Page 7
Incident Complexity
[Matrix: columns Single Incident and Recurring Problem; rows Cause Unidentified and Cause Identified]
• Change-related cause
• Hardware Failure
• Software Failure
• Misconfiguration
• Operations Error
• User Error
• Operations Error
• User Error
• H/W “hiccup”
• No idea … happened once and has never happened again!
• Intermittent H/W Failure
• Intermittent S/W Failure
• Known Error
• Common user error
• Application Logic
• Transient overload
• Unexpected interactions
• Incorrect failover
• Keeps happening but then correcting itself!
Page 8
Problem Resolution Process
[Matrix: columns Single Incident and Recurring Problem; rows Cause Unidentified and Cause Identified]
• Change-related cause
• Hardware Failure
• Software Failure
• Misconfiguration
• Operations Error
• User Error
• Operations Error
• User Error
• H/W “hiccup”
• No idea … happened once and has never happened again!
• Intermittent H/W Failure
• Intermittent S/W Failure
• Known Error
• Common user error
• Application Logic
• Transient overload
• Unexpected interactions
• Incorrect failover
• No idea … keeps happening but then correcting itself!
Resolution is within scope of standard operations processes and using existing tools
Resolution is handled by diagnostics team on a per incident basis
Page 9
The IT Nightmare: Recurring Grey Problems
[Matrix: columns Single Incident and Recurring Problem; rows Cause Unidentified and Cause Identified; highlighted region labeled Recurring Grey Problem]
• Change-related cause
• Hardware Failure
• Software Failure
• Misconfiguration
• Operations Error
• User Error
• Operations Error
• User Error
• H/W “hiccup”
• No idea … happened once and has never happened again!
• Intermittent H/W Failure
• Intermittent S/W Failure
• Known Error
• Common user error
• Application Logic
• Transient overload
• Unexpected interactions
• Incorrect failover
• No idea … keeps happening but then correcting itself!
Page 10
Why “Grey” is so challenging (and expensive)
Pass the Hot (Grey) Potato!
[Diagram: a Grey Problem surrounded by the Problem Manager, Data Networks, Server Support, Database Support, Solution Architects, Desk Support, and App Support. Annotation: Too long!]
Page 11
Outsourcing Makes it Worse
[Matrix: columns Single Incident and Recurring Problem; rows Cause Unidentified and Cause Identified]
Resolution is within scope of standard operations processes and using existing tools
Resolution is handled by diagnostics team on a per incident basis
Outsourcing moves the line up… Distributing operational responsibility across multiple organizations means fewer incidents can be handled via standard operating procedures
Page 12
How Monitoring and Analysis Tools Help
[Matrix: columns Single Incident and Recurring Problem; rows Cause Unidentified and Cause Identified]
Resolution is within scope of standard operations processes and using existing tools
Resolution is handled by diagnostics team on a per incident basis
AA-NPM via NetPod moves the line down. More incidents can be handled via standard operating procedures
Page 13
Typical Application Transactions
[Diagram: End User, Web and App Servers, and Databases, with Request and Response arrows. Labels: Logical Transaction Flow, Visible Transaction Flow, Network Tap Points.]
Tap points deployed between each service tier make the diagnostic data available to the incident response team
Page 14
Tap Points Determine What is Visible
[Diagram: End User, Network, Servers, Other servers, and NetPod, with Tap Points and the Visible Transaction Flow marked.]
Only visible as a single functional unit
Visible as two functional units
Page 15
Increasing Visibility
[Diagram: End User, Network, Servers, Other servers, and NetPod, with Tap Points and the Tapped Transaction Flow marked; numbered markers 1, 2, 3.]
Now visible as three functional units
Page 16
Adding Visibility “Inside the Tiers”
[Diagram: End Users, Network, Web and App Servers, and Databases, with Switches, Taps, Agents, and NetPod. Labels: Inter-Technology, Intra-Technology.]
Page 17
Why Packet-Based Analysis?
• Network connections exist between functional units
• Packets (and the transactions they create) provide visibility deep into actual service behaviour
– Especially when the service is misbehaving!
• Traffic monitoring is passive and has no impact on application or network performance
• Packets enable per-incident transaction analysis
– This greatly simplifies problem investigations by reducing their scope
– Puts everything into as simple a context as possible
• Transaction analysis enables Fault Domain Isolation
• Packets provide deep insight into root cause whether it be application or network based
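As a rough illustration of how packets become transactions, the sketch below pairs each request with the next response seen on the same client-server pair and reports the elapsed time. It is a minimal sketch under stated assumptions, not NetPod's method: the function name `response_times`, the tuple layout, and the sample hosts are hypothetical, and a real decoder would match requests to responses using protocol fields rather than simple ordering.

```python
from collections import defaultdict, deque


def response_times(packets):
    """Pair each request with the next response on the same client-server pair.

    `packets` is an iterable of (timestamp_s, client, server, kind) tuples,
    where kind is "request" or "response". Returns a list of
    (client, server, seconds) tuples, one per completed transaction.
    """
    pending = defaultdict(deque)  # (client, server) -> request timestamps
    results = []
    for ts, client, server, kind in sorted(packets):
        key = (client, server)
        if kind == "request":
            pending[key].append(ts)
        elif kind == "response" and pending[key]:
            started = pending[key].popleft()
            results.append((client, server, round(ts - started, 3)))
    return results


# Hypothetical capture: two users talking to the same web tier.
capture = [
    (0.0, "user-1", "web", "request"),
    (0.4, "user-1", "web", "response"),
    (1.0, "user-2", "web", "request"),
    (3.5, "user-2", "web", "response"),
]
print(response_times(capture))
```

Because each result is tied to one client-server pair, a slow transaction can be followed up for that pair alone, which is the scope reduction the bullets describe.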
Page 18
NetPod Physical Architecture
• NetPod agentless monitoring & packet recording:
– Forwards metadata to NetPod Analysis Server
– Stores all packets to local storage
– Forwards requested packets to Analysis Server for in-depth back-in-time analysis
• NetPod Analysis Server:
– Performs Transaction Analysis
– Performance measurement and reporting
– Visualized via browser-based NetPod GUI
• NetPod DNA:
– Ultra deep packet analysis on desktop
– Also 3rd party pcap analyzers (e.g. Wireshark)
[Diagram: Internet/Intranet, load balancer, web server, application servers, and database and back-end servers; NetPod AMD and packet store; NetPod Analysis Server; NetPod Analyst; NetPod DNA.]
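Since the slide mentions third-party pcap analyzers, a short reader for the classic libpcap file layout may help show what such a file contains: a 24-byte global header followed by records, each with a 16-byte header (seconds, microseconds, captured length, original length) and the packet bytes. This is a sketch that assumes the packets are in the classic microsecond pcap format rather than pcapng; the slides do not say which format NetPod uses, and the name `read_pcap` is hypothetical.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap with microsecond timestamps


def read_pcap(stream):
    """Yield (timestamp_seconds, packet_bytes) for each record in a classic pcap stream."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number is written in the capturing host's byte order.
    if struct.unpack("<I", header[:4])[0] == PCAP_MAGIC:
        order = "<"
    elif struct.unpack(">I", header[:4])[0] == PCAP_MAGIC:
        order = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")
    while True:
        record = stream.read(16)
        if len(record) < 16:
            return  # end of file
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(order + "IIII", record)
        yield ts_sec + ts_usec / 1_000_000, stream.read(incl_len)


# Build a tiny in-memory capture with one 3-byte record to exercise the reader.
sample = (
    struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
    + struct.pack("<IIII", 10, 500000, 3, 3)
    + b"abc"
)
print(list(read_pcap(io.BytesIO(sample))))
```

For real investigations a maintained tool such as Wireshark is the better choice; the reader above only illustrates the file structure.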
Page 19
NetPod Enabled Incident Investigation Workflow
• Reactive Path: End-user contacts help desk with complaint, includes time-of-day and application name
• Proactive Path: NetPod flags a specific Application, Transaction, Operation, or User performance problem
• Zero in on a specific time-of-day when issue is occurring
• Use the NetPod GUI to work edge-in, identifying the application operation(s) that fully explain the performance issue
• The Data Center Analysis screen provides the visibility to perform Fault Domain Isolation
• Operations are associated with a specific client (user) – server pair. Download the packets between that pair and inspect in DCA or Wireshark to confirm what you are seeing
Page 20
Network Landing Page
Page 21
Network Overview
Page 22
End-to-End Application & User Health Visibility
[Screenshot callouts: Affected Users; Cross-Tier FDI; Network Health; Transaction Health; Response Time Spike]
Page 23
Domain Isolation Across Multi-Tiered Applications
[Screenshot callouts: Zoom to any time period; Most often slower than benchmark; Slowest operations]
Page 24
EUE Overview
Sites
Page 25
Client Site = Bangalore, All Users
Page 26
DNA (running on Desktop) – Conversation Map
Page 27
DNA (running on Desktop) – CNS Breakdown
Page 28
DNA (running on Desktop) – Thread Analysis
Page 29
DNA – Transaction Bounce Diagram
Page 30
Transaction Expert Report
Page 31
Transaction Expert Report
Page 32
In Summary…
• Integrated, easy to use business and network transaction analysis solution
• Proactive visibility into network & application performance
• Unique ability to decode applications
• Reduce network and application downtime with immediate fault domain isolation
• Fully triage issues quickly and effectively via rapid access to underlying packets
• Track performance and availability on a user-by-user basis
Thank You
Fault Domain Isolation using NetPod
Additional slides…
Page 35
A “Normal” Payment Transaction
[Diagram: timing diagram across Browser, Load Balancer, Front Ends, Business Backend, Payment Backend, and Payment DB. Events: Mouse Click, Page Request, Payment Request, Payment Confirmed, Final Response. Timing labels: 2 s, .5 s. Other labels: S, U, cookie.]
Page 36
Fault Domain Isolation: Slow User Browser
[Diagram: timing diagram across Browser, Load Balancer, Front Ends, Business Backend, Payment Backend, and Payment DB. Events: Mouse Click, Page Request, Payment Request, Payment Confirmed, Final Response. Timing labels: 4 s, .5 s, .5 s, .7 s, .4 s, .4 s.]
Page 37
Fault Domain Isolation: Problem with Load Balancer
[Diagram: timing diagram across Browser, Load Balancer, Front Ends, Business Backend, Payment Backend, and Payment DB. Events: Mouse Click, Page Request, Payment Request, Payment Confirmed, Final Response. Timing labels: 4 s, .5 s, .4 s, .4 s, .4 s, .4 s, .4 s.]
Page 38
Fault Domain Isolation: Slow Database Tier
[Diagram: timing diagram across Browser, Load Balancer, Front Ends, Business Backend, Payment Backend, and Payment DB. Events: Mouse Click, Page Request, Payment Request, Payment Confirmed, Final Response. Timing labels: 4 s, 2.5 s, 2 s.]
Page 39
Fault Domain Isolation: Slow due to Network
[Diagram: timing diagram across Browser, Load Balancer, Front Ends, Business Backend, Payment Backend, and Payment DB. Events: Mouse Click, Page Request, Payment Request, Payment Confirmed, Final Response. Annotation: TCP ACK not received. Timing labels: 4 s, 2.5 s, .5 s.]
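The fault domain isolation scenarios on these slides come down to a comparison: measure the response time at the tap point in front of each tier, subtract what the tiers behind it account for, and see which tier's own time has grown relative to a baseline. The sketch below is one way to express that idea and is not NetPod's algorithm; the function name `isolate_fault_domain`, the tier names, and all timings are hypothetical.

```python
def isolate_fault_domain(observed, baseline):
    """Return (tier, excess_seconds) for the tier whose own time grew the most.

    `observed` and `baseline` are ordered lists of (tier, seconds), front tier
    first, where seconds is the response time measured at the tap point in
    front of that tier and therefore includes all tiers behind it.
    """
    def own_times(measurements):
        times = {}
        for i, (tier, total) in enumerate(measurements):
            # Time spent behind this tier is what the next tap point measures.
            downstream = measurements[i + 1][1] if i + 1 < len(measurements) else 0.0
            times[tier] = total - downstream
        return times

    obs, base = own_times(observed), own_times(baseline)
    tier = max(obs, key=lambda name: obs[name] - base[name])
    return tier, round(obs[tier] - base[tier], 3)


# Made-up timings: every tap point sees 2 s of extra delay, but only the
# database tier's own share has changed.
baseline = [("load balancer", 2.0), ("front end", 1.8),
            ("payment backend", 1.2), ("payment db", 0.5)]
observed = [("load balancer", 4.0), ("front end", 3.8),
            ("payment backend", 3.2), ("payment db", 2.5)]
print(isolate_fault_domain(observed, baseline))
```

The design choice worth noting is the subtraction of downstream time: without it, every tier in front of the slow one would look equally slow.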
Page 40
How NetPod Simplifies the Workflow
• Simple Fault Domain Isolation requires:
– An ability to decode applications and their unique transactions
– Correlation of end-user experience to specific transactions
– Fine-grained transaction monitoring
• Averages hide intermittent problems
• Problem investigations sometimes require visibility into individual transactions
• Fast Root Cause Analysis requires:
– Visibility into the individual operations that comprise high-level transactions
– Detailed network performance metrics, error condition monitoring, etc.
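The point that averages hide intermittent problems is easy to show with numbers. In the sketch below, a handful of slow transactions barely move the mean, while the worst case and the count of slow transactions make them obvious. The function name `summarize`, the threshold, and the sample data are hypothetical and chosen only for illustration.

```python
import statistics


def summarize(response_times_s, slow_threshold_s):
    """Compare the average with the worst case and the count of slow transactions."""
    slow = [t for t in response_times_s if t > slow_threshold_s]
    return {
        "mean_s": round(statistics.mean(response_times_s), 3),
        "worst_s": max(response_times_s),
        "slow_count": len(slow),
    }


# Hypothetical sample: 98 normal transactions and 2 intermittent slow ones.
samples = [0.5] * 98 + [4.0] * 2
print(summarize(samples, slow_threshold_s=2.0))
```

Here the mean stays at 0.57 s even though two users waited 4 s, which is why per-transaction visibility matters for this kind of problem.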
Page 41
Application Detection via Deep Packet Inspection
Page 42
Traffic Filtering and Analysis
Page 43
Visualizing Traffic Type by Bandwidth
Page 44
Visualizing Traffic Type by IP Address
Page 45
NetPod Components
Legend: product name (rack units), then function and appliance model
• NetPod X (5RU)
– Capture: CWP-7024
– AMD (4x): CWP-4004
– CAS: CWP-4000
• NetPod I (3RU)
– Capture/AMD: CWP-8004
– CAS: CWP-4000
• NetPod CAS (1RU)
– CAS: CWP-4000
• NetPod IU (2RU)
– Capture/AMD: CWP-8004
• NetPod XS (1RU)
– AMD (4x): CWP-4004
• NetPod XU (3RU)
– Capture: CWP-7024
• NetPod ADS (1RU)
– ADS: CWP-4000
[Other chart labels: Bundles, Upgrades, 10G, 1G]