SLA Compliance Assurance
Charles Wheelus
Senior Data Scientist, Cequint
Splunk .conf 2013
October 2nd, 2013
Thursday, October 3, 13
About me: Charles Wheelus, MSCS
• Senior Data Scientist, Cequint
• Ph.D. Candidate, Florida Atlantic University (research interests: Data Mining and Machine Learning)
• 2012 Splunk Ninja Revolution award recipient
• Splunk Certified Architect
• Technology consultant for 20 years
• Splunk user and evangelist for three years
• Started with version 4.3
About Cequint
Cequint provides handset and carrier data services to most major wireless carriers in the U.S.
http://cequint.com
Service Level Agreement (SLA) Compliance Assurance
...or
How to kill a flock of birds with one stone
Disclaimer: No birds were injured during the production of this presentation. :)
SLA Compliance (on a Wireless Carrier network)
The project:
Develop a system that proves our SLA compliance to our carrier customer
Time is of the essence!
What is the significance of a Service Level Agreement?
degraded performance + extended period = unhappy customer
[Slide graphic, final frame: happy customer]
Step 1: Determine the Key Performance Indicators
What are Key Performance Indicators (KPIs)?
Metrics used to evaluate factors that are critical to the optimal performance of an organization, project, or system
Challenges
• Numerous subsystems
• Different development teams
• Different programming languages
• Different operating systems
• Wide variety of hardware types
Determine what data to get
Study the SLA
Engage others in the process
• Developers
• Management
• Product team
• Operations
Determine the best place(s) to get the data from
Establish best practice for data input
What simple step can you take in the beginning that will save time later?
Best practices document
Verify the data is in the expected format!
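One way to make the "verify the data is in the expected format" step concrete is a small check that runs before a log line is ever sent to Splunk. The sketch below assumes a key=value log layout; the field names (`subsystem`, `kpi`, `value`) and the layout itself are hypothetical, since the talk does not say what the best practices document prescribed.

```python
import re

# Hypothetical required fields; the real list would come from the best practices document.
REQUIRED_FIELDS = ("subsystem", "kpi", "value")

# Matches key=value pairs where the value is either bare or double-quoted.
PAIR_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_kpi_line(line):
    """Return the key=value pairs found in a log line as a dict."""
    return {key: val.strip('"') for key, val in PAIR_RE.findall(line)}


def validate_kpi_line(line):
    """Return a list of problems; an empty list means the line looks well formed."""
    fields = parse_kpi_line(line)
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in fields]
    if "value" in fields:
        try:
            float(fields["value"])
        except ValueError:
            problems.append(f"value is not numeric: {fields['value']!r}")
    return problems


good = '2013-10-02T10:15:00Z subsystem=gateway kpi=provision_ok value=42'
bad = '2013-10-02T10:15:00Z subsystem=gateway value=forty-two'
print(validate_kpi_line(good))
print(validate_kpi_line(bad))
```

A check like this could run in each subsystem's test suite, so that format drift is caught by the team that owns the log line rather than at report time.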
Determine transport method for getting the data into Splunk
syslog (UDP)
Universal Forwarder
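The syslog-over-UDP option can be sketched with Python's standard `logging.handlers.SysLogHandler`. This is an illustration only, not the speaker's setup: the logger name `kpi_demo` and the key=value message are hypothetical, and a local UDP socket stands in for a Splunk UDP data input so the example runs on its own. UDP is fire-and-forget, so datagrams can be lost without the sender noticing.

```python
import logging
import logging.handlers
import socket


def make_kpi_logger(host, port):
    """Build a logger that ships each record as one syslog datagram over UDP."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port), socktype=socket.SOCK_DGRAM
    )
    handler.setFormatter(logging.Formatter("kpi_demo: %(message)s"))
    logger = logging.getLogger("kpi_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    return logger


# Stand-in for a Splunk UDP data input: a local socket on an OS-assigned port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
port = receiver.getsockname()[1]

logger = make_kpi_logger("127.0.0.1", port)
logger.info("subsystem=gateway kpi=provision_ok value=42")

# Read back the datagram exactly as the receiving side would see it.
datagram, _ = receiver.recvfrom(4096)
received = datagram.decode("utf-8")
print(received)

# Release the sockets used by the demo.
for handler in list(logger.handlers):
    handler.close()
    logger.removeHandler(handler)
receiver.close()
```

In a real deployment the host and port would point at whatever syslog listener or Splunk input the environment provides.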
SLA report (RECAP):
• Establish KPI
• Get KPI data into Splunk
• KPI counter aggregation and reconciliation
• Use the Splunk REST API to build the report
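As a rough illustration of the aggregation and reconciliation step in the recap, the sketch below sums KPI counters per subsystem, compares two subsystems' views of the same counter, and checks the result against a target. The subsystem names, counter names, counts, and the 99.5% target are all hypothetical; the talk does not give them. In the talk's setup the input rows would be retrieved from Splunk through its REST API, whereas here they are hard-coded.

```python
from collections import defaultdict

# Hypothetical per-interval counters as they might be retrieved from Splunk.
# Each record: (subsystem, counter name, count).
records = [
    ("gateway", "requests", 1000),
    ("gateway", "requests", 1200),
    ("backend", "requests", 2195),
    ("backend", "successes", 2190),
]

SLA_TARGET = 0.995  # hypothetical success-rate target, not taken from the talk


def aggregate(rows):
    """Sum counts per (subsystem, counter)."""
    totals = defaultdict(int)
    for subsystem, counter, count in rows:
        totals[(subsystem, counter)] += count
    return dict(totals)


def reconcile(totals, counter, a, b):
    """Return the difference between two subsystems' views of the same counter."""
    return totals.get((a, counter), 0) - totals.get((b, counter), 0)


totals = aggregate(records)
gap = reconcile(totals, "requests", "gateway", "backend")
success_rate = totals[("backend", "successes")] / totals[("gateway", "requests")]

print(f"requests seen by gateway but not backend: {gap}")
print(f"success rate: {success_rate:.4f}")
print("SLA met" if success_rate >= SLA_TARGET else "SLA missed")
```

A non-zero gap is the interesting part of reconciliation: it flags requests that one subsystem counted and another did not, which is worth explaining before the report goes to the customer.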
“Black-box” testing
The problem:
Performance information about the carrier’s self-provisioning gateway is unavailable. We have to run our own tests to determine the expected performance.
Time is of the essence!
Load test results analysis
The problem:
We need a quick way to evaluate the results of load testing.
Time is of the essence!
Event Reporting
The problem:
Thousands of subsystem events may be written to the log files, and some events are interdependent. We need a comprehensive and robust system for detecting, correlating, and reporting these events to the correct development team.
Time is of the essence!
The solution:
Splunk saved and scheduled searches!
With very brief training, the developers are building, saving, and scheduling their own queries
Event Monitoring and Alarming
The problem:
The operations team requires that the KPIs produce alarm output into their pre-existing monitoring and alarm system.
Time is of the essence!
• Operations has pre-existing alarming software
• Splunk was connected to the Operations alarm system using the Splunk API
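The talk does not describe the interface of the Operations alarm system, only that the Splunk API was the bridge. The sketch below therefore covers just the evaluation step: turning KPI readings into alarm records that a connector could forward. The KPI names, thresholds, severity labels, and the JSON record shape are all hypothetical, and the readings are hard-coded rather than fetched from Splunk.

```python
import json

# Hypothetical alarm thresholds: KPI name -> (warning level, critical level).
# Values at or above a level raise that severity.
THRESHOLDS = {
    "provision_latency_ms": (500, 1000),
    "error_rate_pct": (1.0, 5.0),
}


def evaluate(kpi, value):
    """Return 'critical', 'warning', or None for one KPI reading."""
    warning, critical = THRESHOLDS[kpi]
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return None


def build_alarms(readings):
    """Turn KPI readings into alarm records, skipping readings within limits."""
    alarms = []
    for kpi, value in readings.items():
        severity = evaluate(kpi, value)
        if severity is not None:
            alarms.append({"kpi": kpi, "value": value, "severity": severity})
    return alarms


# Readings as they might come back from a Splunk search (hard-coded here).
readings = {"provision_latency_ms": 730, "error_rate_pct": 0.4}
alarms = build_alarms(readings)
print(json.dumps(alarms))
```

Keeping the thresholds in one table makes it easy for Operations to review and adjust them without touching the evaluation logic.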
Performance Analysis
The problem:
The entire team needs to have up-to-the-minute business intelligence.
Time is of the essence!
The answer:
Splunk Dashboards and Apps!
• Customized tools for Developers
• Dashboards for Operations
• Troubleshooting for Developers and Operations
• Business Intelligence for Management
Cut to the chase
Splunk’s greatest benefits:
• Time savings
• Ability to react quickly (SPL)
• Real-time analytics
• Rapid dashboard production
What’s next?
• Deeper analytics
• New metrics & dashboards
• Modular inputs
• More use of Splunk Apps
Charles Wheelus
http://about.me/charleswheelus
http://facebook.com/charleswheelus