Copyright ©2013 Ping Identity Corporation. All rights reserved.
Roy Christensen: 35 years
3C > Honeywell > Digital Equipment Corp
Mike Christensen: 40 years
DEC > Compaq > Hewlett Packard
Beau Christensen: 17 years
AstraZeneca > Send.com > DirecTV > Ping Identity
3rd Generation Computer Geek
92 years family experience (hah)
• We believe secure professional and personal identities underlie human progress in a connected world. Our purpose is to enable and protect identity, defend privacy and secure the Internet.
• Over 1,000 companies, including over half of the Fortune 100, rely on our award-winning products to make the digital world a better experience for hundreds of millions of people.
• Denver, Colorado. Est. 2003
About Ping Identity
Site Reliability Engineering (SRE)
Production Web Operations
Ops to On-Demand Dev

Configuration Engineering (CFE)
Automation, Deployment, Labs
Ops to On-Premise Dev

Infrastructure Engineering (IFE)
Iron, Network, Security
Ops to Support & Helpdesk
Infrastructure Operations @Ping
(Quickie) Architecture
• Hybrid Cloud Application
• VMware/AWS
• SOA (lots of little services)
• Red/Black Deployment
• Cassandra & Galera MySQL
Splunk Systems
• 7 Indexers
• Distributed search across 3 regional data centers
• ~40Gib/day
• No clustering (yet)
• Universal Forwarder installed in all templates
• Moving all Splunk to clustered architecture in VPCs
Because uptime is dead.
http://www.kitchensoap.com/2013/01/03/availability-nuance-as-a-service/
Lots of Things…
Honesty
Integrity
Reliability
Dependability
Security
Transparency
Maturity
We rolled those up into two.
Reliability & Transparency
Keeping Shit Running
• Architecture
  – SOA (duh!!!!)
  – Highly Automated Deployments
  – Active/Active, quick failover, multi-region
• Tools
  – We use a ton of tools
  – We constantly question the tools we use
  – We are always looking for new tools
• Security
  – Eyes always on
  – Scanners running continuously
  – Constant Remediation
Talking About Shit
• Talk Publicly
  – Talk about decisions we make
  – Talk about architecture
  – Talk to vendors
  – Talk to customers
• When Stuff Breaks, Talk More
  – Quick & dirty status updates
  – Verbose post mortems
  – Be honest
• Metrics
  – Use 3rd Parties
  – Expose as much monitoring as you can
  – Make it relevant to customers
T = r + t
Trust = reliability + transparency
Monitoring is the foundation of Trust.
[Logo] is the backbone of our monitoring strategy.
<3 Monitoring
The right tool for the right job = Lots of tools.
Lots of tools = Best of Breed, or Stack Monitoring
Transport & Network Layer.
Fundamental interactions between services. Extremely important to understand, forms the basis of troubleshooting efforts.
“Can you ping it? Tracert? Tcpdump?”
Instance Resource Monitoring
Historical resource consumption, collected and correlated with incidents and system events. Scalability intelligence.
“Traditional” systems monitoring. Nagios, Zenoss, Zabbix, etc.
Machine Data, Logs.
Extraordinary versatility: events, alerting, reporting, security, business metrics, and operational intelligence (OI). Lets you create dashboards and event types specific to your own environment.
Events, logs, traditional “syslog stuff.”
Application Performance Monitoring
Detailed view of how the applications are running on top of the rest of the stack. Identify bottlenecks, architecture issues, and accurately model user experience with RUM.
“Page loads are slow.” “Why are there SQL queries that return 30,000 rows?”
Availability Monitoring
External agents monitor services from multiple locations around the globe. Use application heartbeats, not ICMP or TCP. Provides near real-time alerting and is immediately visible to the customer.
“The site is down.”
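The difference between an application heartbeat and an ICMP or TCP probe can be sketched roughly as follows. This is only an illustration, not Ping Identity's actual check: the JSON body shape (`{"status": "ok"}`), the latency budget, and the function names are all assumptions made for the example.

```python
import json
import time
import urllib.request
from dataclasses import dataclass


@dataclass
class HeartbeatResult:
    healthy: bool
    reason: str


def evaluate_heartbeat(status_code: int, body: str, elapsed_s: float,
                       max_latency_s: float = 2.0) -> HeartbeatResult:
    """Judge health from what the application says, not from reachability alone."""
    if status_code != 200:
        return HeartbeatResult(False, f"unexpected HTTP status {status_code}")
    try:
        payload = json.loads(body)
    except ValueError:
        return HeartbeatResult(False, "heartbeat body is not valid JSON")
    # Hypothetical contract: the service reports {"status": "ok"} when healthy.
    if not isinstance(payload, dict) or payload.get("status") != "ok":
        return HeartbeatResult(False, "application reported a non-ok status")
    if elapsed_s > max_latency_s:
        return HeartbeatResult(False, f"slow response: {elapsed_s:.2f}s")
    return HeartbeatResult(True, "ok")


def check_heartbeat(url: str, timeout_s: float = 5.0) -> HeartbeatResult:
    """Fetch a heartbeat URL and evaluate it; network errors count as unhealthy."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            status = resp.status
    except OSError as exc:  # URLError, HTTPError and socket timeouts are OSErrors
        return HeartbeatResult(False, f"request failed: {exc}")
    return evaluate_heartbeat(status, body, time.monotonic() - start)
```

A host that answers ping or accepts a TCP connection but serves an error page would pass a network-level probe and fail this one, which is the case the slide is warning about.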
Splunk is the system’s mortar.
Binding systems together, filling gaps, creating stability.
Security
Quick visual impact. Easy to identify location, frequency, and network address of aggressor. Quick drill down into detailed Splunk searches.
Traffic
Real-time traffic maps are immediately recognizable to everyone in the organization.
Operations
Operational dashboards provide a quick overview of system health by displaying error correlation with traffic levels and event type heat maps.
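One way to read "error correlation with traffic levels" is to look at error rate rather than raw error count, so that errors rising in step with traffic are not mistaken for an incident. The sketch below assumes per-minute request and error counts are already available; the `Bucket` type, the sample numbers, and the 5% threshold are invented for illustration.

```python
from typing import NamedTuple


class Bucket(NamedTuple):
    minute: str
    requests: int
    errors: int


def error_rates(buckets):
    """Return (minute, error_rate) pairs; buckets with no traffic get 0.0."""
    return [(b.minute, b.errors / b.requests if b.requests else 0.0)
            for b in buckets]


def anomalous_minutes(buckets, max_rate=0.05):
    """Minutes whose error rate exceeds max_rate, regardless of raw error count."""
    return [minute for minute, rate in error_rates(buckets) if rate > max_rate]


sample = [
    Bucket("12:00", 1000, 10),   # 1% errors at normal traffic
    Bucket("12:01", 5000, 50),   # more errors, but only in step with traffic
    Bucket("12:02", 1000, 200),  # 20% errors at normal traffic
]
print(anomalous_minutes(sample))  # ['12:02']
```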
Global Load Balance
Shows traffic distribution between production data centers.
Maximize Machine Data Layer Versatility
Build your own dashboards (it’s easy).
Maintain proper naming conventions: na-den-www-13-8-17-0ec5e8da-128-193
Design for wetware.
Ask for feedback.
Maintain event types.
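A consistent naming convention pays off because host names can be split into fields and used for grouping in searches and dashboards. The slides do not say what each segment of the example name means, so the sketch below only guesses that the first three segments are region, site, and role, and leaves the remaining segments uninterpreted. The `HostName` type and `parse_hostname` function are hypothetical.

```python
from typing import NamedTuple, Tuple


class HostName(NamedTuple):
    region: str            # assumed meaning of the 1st segment, e.g. "na"
    site: str              # assumed meaning of the 2nd segment, e.g. "den"
    role: str              # assumed meaning of the 3rd segment, e.g. "www"
    rest: Tuple[str, ...]  # remaining segments, deliberately left uninterpreted


def parse_hostname(name: str) -> HostName:
    """Split a hyphen-delimited host name into its leading fields."""
    parts = name.split("-")
    if len(parts) < 3 or not all(parts[:3]):
        raise ValueError(f"host name does not follow the convention: {name!r}")
    return HostName(parts[0], parts[1], parts[2], tuple(parts[3:]))


host = parse_hostname("na-den-www-13-8-17-0ec5e8da-128-193")
print(host.region, host.site, host.role)  # na den www
```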
Having tons of tools is great. Each layer of the stack is matched well with the system’s requirements…
Aggregation of metrics is the new challenge. How do you see it all?
eventtype=critical.*
Application integration gives faster access to the event timeline and the context of an alarm by using markers. It lets SRE gather relevant information more quickly, using the same tools with fewer screens.
Stack Integration = MTTR Speed
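The selection step behind the markers can be sketched as follows: keep only events whose event type matches the `critical.*` wildcard and order them by time. How the markers are then pushed into another tool is not shown in the slides and is left out here. The event dictionaries and the `critical_markers` function are assumptions made for this example, not Splunk's API.

```python
from fnmatch import fnmatchcase


def critical_markers(events, pattern="critical.*"):
    """Return (time, eventtype) pairs for matching events, oldest first."""
    return sorted(
        (e["time"], e["eventtype"])
        for e in events
        if fnmatchcase(e["eventtype"], pattern)
    )


events = [
    {"time": "2013-06-01T12:05:00Z", "eventtype": "critical.db.down"},
    {"time": "2013-06-01T12:01:00Z", "eventtype": "warning.disk"},
    {"time": "2013-06-01T12:02:00Z", "eventtype": "critical.auth.failure"},
]
for marker in critical_markers(events):
    print(marker)
```

Sorting the matches gives a timeline that can be laid over performance graphs, which is the context the slide says shortens time to repair.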
Thanks!
[email protected]
@beauchristensen
www.pingidentity.com/blogs
http://status.pingidentity.com