The Unrealized Role of: Monitoring & Alerting @jasonhand | VictorOps | #DevOpsDays

Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Page 1: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

The Unrealized Role of:

Monitoring & Alerting

@jasonhand | VictorOps | #DevOpsDays

Page 2: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Jason Hand
DevOps Evangelist

VictorOps

Page 3: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

SCaLE 14x

Southern California Linux Expo

Page 5: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

2015 Monitoring Survey

Page 7: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Why Are You Collecting This Data?
NOTE: You may choose more than one

» Performance analysis and trending

» Fault and Anomaly detection

» Capacity Planning

» A/B Testing

» We don’t do anything with collected metrics

Page 8: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

The Results
NOTE: Respondents may have chosen more than one

» Performance analysis and trending - 63%

» Fault and Anomaly detection - 53%

» Capacity Planning - 45%

» A/B Testing - 11%

» We don’t do anything with collected metrics - 3%

Page 9: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Tyranny of the

S.L.A. (Service Level Agreement)

Page 10: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

High Availability

Prediction & Prevention

Page 13: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

That's Important

... but ...

Page 16: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Business

Objectives?

Page 17: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Happy Camper

Page 18: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Customers want more than just

99.999% Uptime
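
For a sense of scale, the downtime budget implied by an availability target is simple arithmetic. The sketch below is an illustration added here, not something from the slides; the function name `downtime_budget_minutes` and the 365-day year are assumptions.

```python
def downtime_budget_minutes(availability_pct: float, days: float = 365.0) -> float:
    """Minutes of downtime allowed over `days` at the given availability target."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    # Compare a few common targets; 99.999% is the "five nines" on the slide.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_budget_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes per year")
```

Under these assumptions, five nines leaves a little over five minutes of downtime per year, which is why a target like this tends to crowd out everything else.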

Page 20: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Where's the

Innovation?

Page 21: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

? =

Page 22: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

? = Continuous Improvement

Page 23: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

How Important is

Learning & Innovation?

Page 28: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

The result of underutilizing monitoring & alerting is that the IT department and the organization have no chance to...

learn, improve, or innovate.

Page 29: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Continually understanding & responding to the feedback from

monitoring, logging, & alerting

allows you to use information about events in the past to drive future actions.

Page 30: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Switching Gears

Page 33: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

It's not just about

Prediction & Prevention

Page 34: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Respond & Repair... Quickly

Page 35: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Nope

Page 36: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

MTTR

Rather Than

MTBF
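
To make the two terms concrete, here is a minimal sketch of how MTTR (mean time to repair) and MTBF (mean time between failures) can be computed from an incident log, using the standard steady-state relation availability = MTBF / (MTBF + MTTR). The `Incident` type, the hour-based timestamps, and the function names are assumptions made for illustration, not anything shown in the talk.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    start_hour: float  # hours since the start of the observation window
    end_hour: float    # hour at which service was restored


def _downtime(incidents: list[Incident]) -> float:
    """Total hours of outage across all incidents."""
    return sum(i.end_hour - i.start_hour for i in incidents)


def mttr(incidents: list[Incident]) -> float:
    """Mean time to repair: average outage duration in hours."""
    if not incidents:
        raise ValueError("need at least one incident")
    return _downtime(incidents) / len(incidents)


def mtbf(incidents: list[Incident], window_hours: float) -> float:
    """Mean time between failures: total uptime divided by failure count."""
    if not incidents:
        raise ValueError("need at least one incident")
    return (window_hours - _downtime(incidents)) / len(incidents)


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability, MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)
```

One consequence of the formula: halving MTTR raises availability by exactly as much as doubling MTBF. Recovering faster is usually the more reachable of the two.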

Page 37: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Failure Is Inevitable

Page 38: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

us·er /ˈyoozər/
Distributed fault injection test suite for production.

credit: Leon Fayer (@papa_fire)

Page 39: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Success is a result of

Failure

Page 40: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Understand

Learn

Innovate

Page 41: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

re·sil·ient /rəˈzilyənt/
The ability to resist, absorb, recover from or successfully adapt to adversity or a change in conditions

Page 42: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Change can cause failure

but innovation requires

Change

Page 43: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Conflict

Page 44: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Change Required

Page 45: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

“Without deviation from the norm, progress is not possible.”
Frank Zappa

Page 46: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

What Did You

Learn

From the Recovery Efforts?
(including monitoring & alerting)

Page 47: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Postmortems / Learning Reviews:
Stories of:

What took place leading up to & during the disruption & recovery efforts

Page 48: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Who was

involved?

Page 49: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

What did they

see?

Page 50: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

What was

said?

Page 51: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

What

actions were taken?

jhand.co/chatopsbook

Page 52: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

How do events & actions

correlate over time?
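
One way to look at that correlation is to interleave what the monitoring system reported with what people said and did into a single timeline. The sketch below assumes each source can be exported as a list of timestamped events; the `Event` type, the source labels, and `merge_timeline` are hypothetical names used only for this illustration.

```python
import heapq
from datetime import datetime
from typing import NamedTuple


class Event(NamedTuple):
    at: datetime
    source: str  # e.g. "monitoring" or "chat"; labels are illustrative
    text: str


def merge_timeline(*streams: list[Event]) -> list[Event]:
    """Merge per-source event lists into one list ordered by timestamp."""
    # Sort each stream first so heapq.merge receives ordered inputs.
    ordered = [sorted(stream, key=lambda e: e.at) for stream in streams]
    return list(heapq.merge(*ordered, key=lambda e: e.at))


def render(timeline: list[Event]) -> str:
    """Format the merged timeline as one line per event."""
    return "\n".join(f"{e.at:%H:%M} [{e.source}] {e.text}" for e in timeline)
```

Reading alerts and human actions side by side makes it easier to see how long each signal took to reach someone and what they did with it.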

Page 53: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

5 Why's

Page 54: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

What is the "cause" of the Problem?

Root Cause is ...

Page 55: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Our...

obsession with

"Root Cause"

Page 56: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Asking "why".. leads to ..

Blame

Page 57: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Blaming leads to..

operators hiding relevant & important information

Page 58: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

We must

believe that our operators are doing their best given the constraints of the "system"

Page 59: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

"We are here to"

Learn

From Failure
(and success)

Page 60: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Rather than ..

Page 61: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Avoid Failure

Page 62: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

What's the

Story?

Page 63: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Innovate

Learning from both success & failure to develop & implement small incremental improvements is critical.

Page 64: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Learning Organization

Page 65: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Learning does NOT come from

Reading & Listening

Page 66: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Learning comes from

Doing

Page 67: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Real Learning comes from:

Observing
Orienting
Deciding
Acting

John Boyd's OODA Loop

Page 68: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Example:

Learning to play the

Dobro Guitar

Page 70: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Learning

Page 71: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Why?

Go from knowing... to understanding... to learning

NOTE: (Requires making mistakes)

Page 73: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

“We will trade some uptime in exchange for innovation”
- Dave Hahn (Netflix)

DevOpsDays Boise 2016 (today)

Page 74: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Are We Doing

it Right?

Page 75: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

What do your

Postmortems look like?

Are they setting you up to learn?

Page 76: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

"The Story"
- Timeline
- Who Was Involved
- Context
  (Seeing, Saying, Executing)
- Action Items
  (Small Incremental Improvements)
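
The outline above can be captured as a small record type so that every review collects the same pieces. This is only a sketch of one possible shape: the class names, the field names, and the `"saw"` / `"said"` / `"did"` labels are assumptions chosen to mirror the slide's "Seeing, Saying, Executing", not a format the talk prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class TimelineEntry:
    when: str    # timestamp as recorded, e.g. "10:05"
    who: str     # person or system involved
    kind: str    # "saw", "said", or "did" (illustrative labels)
    detail: str


@dataclass
class LearningReview:
    title: str
    timeline: list[TimelineEntry] = field(default_factory=list)
    action_items: list[str] = field(default_factory=list)

    def people_involved(self) -> list[str]:
        """Everyone in the timeline, in order of first appearance."""
        return list(dict.fromkeys(entry.who for entry in self.timeline))

    def context(self, kind: str) -> list[TimelineEntry]:
        """Entries of one kind: what was seen, said, or done."""
        return [entry for entry in self.timeline if entry.kind == kind]
```

Note that the record has no field for blame or a single root cause; it holds the story and the small improvements that follow from it.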

Page 77: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Shift our gaze from:

maintaining & protecting

Page 78: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Learning

Which leads to...

Improving & Innovating

Page 79: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

we increase value

of monitoring & alerting
of the IT teams
of Products & Services
& of the Organization.

Page 80: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Hypothesize
Explore
Stretch
Experiment
Fail
Learn
Try Again

Page 82: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Learning & Innovating leads to uncovering new ways of

building, deploying, and maintaining software & infrastructure

Which leads to...

Page 83: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Resilient Systems

Page 84: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

The

By-product

of a highly

resilient system is ...

Page 86: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Highly Available system

Page 87: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

The Unrealized Role of:

Monitoring & Alerting is ....

Page 88: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Learning &

Innovation

Page 89: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

Thank You

Be Victorious!

Page 90: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

References:

Monitoring Survey: https://kartar.net/2015/08/monitoring-survey-2015---metrics/
Firefighter: https://www.learyfirefighters.org/wp-content/uploads/2013/09/cover-slide-1.jpg
Mechanic: https://upload.wikimedia.org/wikipedia/commons/4/4b/Flickr_-_Israel_Defense_Forces_-_Airplane_Technician,_March_2010.jpg
Gnome Plan: http://www.nerdfitness.com/wp-content/uploads/2012/04/Screen-Shot-2012-03-30-at-3.15.38-AM-1024x7591.jpg
NOC: https://upload.wikimedia.org/wikipedia/commons/

Page 91: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

References:

Kodak: http://file.answcdn.com/answ-cld/image/upload/v1/tk/brand_image/b59911fc/91d6e71d30a0878dfe3cb30a22751cb874a3ea8c.jpeg
VW Camper: https://upload.wikimedia.org/wikipedia/commons/d/d7/VW_Camper.jpg
Blockbuster: https://jordanandeddie.files.wordpress.com/2013/11/blockbuster-feature.jpg
Borders: http://smashingtops.com/wp-content/uploads/2012/06/borders_logo1.jpg

Page 92: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

References:

Chained Hands: https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&ved=0ahUKEwjgrNCDh5TMAhXJs4MKHaoZDssQjBwIBA&url=http%3A%2F%2Fwww.publicdomainpictures.net%2Fdownload-picture.php%3Fadresar%3D50000%26soubor%3Dhands-in-chains.jpg%26id%3D40426&bvm=bv.119745492,d.amc&psig=AFQjCNFIdnDPzSqiLA-znIW5SCTCUHhqEw&ust=1460926880336203
Inevitable: http://vignette4.wikia.nocookie.net/matrix/images/5/51/SMITH.png/revision/latest?cb=20110214092002

Page 93: Icinga Camp San Diego 2016 - Unrealized Role of Monitoring

References:

Accident Free: http://www.compliancesigns.com/media/digital-scoreboard/1000/Safety-Awareness-Sign-DSE-195271000.gif
Stewie: http://chroniclesofredmark.com/wp-content/uploads/2014/01/Stewie.gif
change: http://i.imgur.com/EQyC6N3.gif
Hard drive: https://i.imgur.com/pWsKSEf.gif
Change: https://farm6.staticflickr.com/5208/5270199049df99b234e9od.jpg
Value: https://d13yacurqjgara.cloudfront.net/users/
