2015 ROBERT BIGOS
Cloud capacity planning & monitoring challenge
Robert Bigos
[email protected] +48 665-168-240
http://www.slideshare.net/RobertBigos
https://pl.linkedin.com/in/robertbigos
@bigosr
Blog: bigosr.com

Microservices monitoring challenge


Cloud Apps DevOps culture



Requirements

Functional requirement: A wants to have sex with B

Non-functional: once per day, maybe twice, not too short, not too long, safe but not too safe, 24x7 with 99.9% monthly availability

Decisions: price? cost? candles? …, how long?
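The availability target above translates into a concrete downtime budget. The following is a minimal sketch, assuming a 30-day month (the slide does not say how a month is measured); the function name `downtime_budget_minutes` is illustrative and not from the deck.

```python
def downtime_budget_minutes(availability_pct: float, days_in_month: int = 30) -> float:
    """Minutes of allowed downtime per month for a given availability target.

    Assumption: the service is expected 24x7, so every minute of the month counts.
    """
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


budget = downtime_budget_minutes(99.9)
print(f"99.9% monthly availability allows about {budget:.1f} minutes of downtime")
```

Under these assumptions the budget is roughly 43 minutes per month, which is the number a non-functional requirement like this one actually commits the team to.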


Architecture: a set of decisions affecting functionality…

A documented part of the communication between stakeholders, helping them understand and balance functional and non-functional requirements


Conflict by design

DevOps

Owner

“…Divide et impera…”


Solution?

DevOps DevOps DevOps


Conway's Law

Team A

TEST

STAGE kingdom/silo

Team B-X

PROD

“Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations.”


WHY?


Why is it so important? Capacity = fuel, performance = speed and altitude

Capacity and performance management helps you understand how quickly and safely you can transport your customers to their planned destinations

Source:http://s134.photobucket.com/user/charlesfrith/media/disaster.gif.html


Queueing

From an operational perspective, a cloud system uses “the same” formulas to manage capacity as a queue in a town office

Source: Kanal von Ferdinand Lutz, "Stay in queue", youtube.com

Cloud: it is all about cloud scale… everything is interconnected and instrumented…
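The slide does not name the formulas it has in mind. As an illustration only, the sketch below assumes the simplest textbook model, an M/M/1 queue (one server, Poisson arrivals, exponential service times), together with Little's law. The clerk and the rates are hypothetical.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state metrics of an M/M/1 queue.

    Both rates use the same unit, e.g. visitors (or requests) per hour.
    """
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    utilization = arrival_rate / service_rate                # rho = lambda / mu
    avg_time_in_system = 1 / (service_rate - arrival_rate)   # W = 1 / (mu - lambda)
    avg_in_system = arrival_rate * avg_time_in_system        # L = lambda * W (Little's law)
    return {
        "utilization": utilization,
        "avg_time_in_system": avg_time_in_system,
        "avg_in_system": avg_in_system,
    }


# Hypothetical town office clerk: 10 visitors arrive per hour, 12 can be served per hour.
m = mm1_metrics(arrival_rate=10, service_rate=12)
print(f"utilization: {m['utilization']:.0%}")
print(f"average time in system: {m['avg_time_in_system'] * 60:.0f} minutes")
print(f"average number in system: {m['avg_in_system']:.1f}")
```

In this model, raising the arrival rate from 10 to 11 per hour doubles the average time in the system (from 0.5 to 1 hour), which is why headroom matters as much for a service as for a clerk's desk.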


Monitoring faces

…don’t forget about Conway’s Law…

Logs Metrics Alerts Trends Thresholds Events


Backup plan? try:

Ctrl+Z / Cmd+Z

• blue/green deployment gives possibilities, not availability
• snapshot is not a backup


There is no magic button

Source:http://make-everything-ok.com/


You will fail for sure!

Source: presenter studies for top enterprises in Poland.
Source: http://www.skybrary.aero/index.php/James_Reason_HF_Model
Source: Józef Tischner, "The Highlander's History of Philosophy"

"the truth, the whole truth and the bullshit truth!"

Postmortem is part of DevOps culture. Recovery has to be part of design.


WHAT?


IoT / monitoring / Big Data

Monitoring uses tools to collect, process and visualize logs and metrics so that you understand your system better and can close the development and operational feedback loop. It is not a kindergarten for trying out nice tools.

Today everything is instrumented and interconnected; be sure you collect the right data at the right scale to be able to get information from it. Your laptop can generate Big Data volumes…

VOLUME VARIETY

VELOCITY


Source: Łukasz Piskorz, IBM SWG Lab

Death or lost signal?


Computer vs human scale

5 min = 5*60 / 10^-9 / (60*60*24*365) ≈ 9512 years
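One way to read the formula above, offered as an interpretation rather than the author's stated method: if every nanosecond of machine time were stretched to one second of human time, five minutes of machine activity would take thousands of years to watch. A minimal sketch of that arithmetic, with illustrative names:

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365  # non-leap year, as in the slide's formula


def human_scale_years(duration_s: float, machine_tick_s: float = 1e-9) -> float:
    """Years needed if every machine tick within duration_s took one full second."""
    ticks = duration_s / machine_tick_s  # number of nanosecond ticks in the interval
    return ticks / SECONDS_PER_YEAR      # one second per tick, converted to years


years = human_scale_years(5 * 60)
print(f"5 minutes at nanosecond resolution is about {years:,.0f} years at one tick per second")
```

The result is a little over 9512 years, matching the slide's figure once the fraction is dropped. A five-minute sampling interval therefore hides a very large number of machine-scale events.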

few objects, few variables, no dependency, no relations…

Peeping through the keyhole


HOW?


Tools

…you need tools, but a tool is not a solution…


Typical “time-centric” dashboard

Source: demo site dashboard, grafana.org


“Threshold violation” troubleshooting

Source: Łukasz Piskorz, IBM SWG Lab


Big picture?

try to understand, go deeper


Known unknowns and unknown unknowns

“…Reports that say that something hasn't happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know…”

Donald Rumsfeld, February 12th, 2002, DoD News Briefing

Source:http://www.defense.gov/transcripts/transcript.aspx?transcriptid=2636


Visualisation: Reality Games

http://www.wearerealitygames.com/


Visualisation: Cloud Foundry


Summary


Lessons learned?

• suggested approach
  • WHY
  • WHAT
  • HOW
• most popular
  • HOW
  • HOW
  • HOW

…try to understand the big picture, go deeper, focus on details


Lessons learned?

• There is No Single Version of the Truth… be open to communication
• There are NO perfect tools… we can build better teams and communicate better to close the feedback loop
• Cloud-scale design, operational design, not just easy to consume…
• There is no GO@FASTER option in the Cloud


Lessons learned?

• Keep standards: for example, ISO 8601 for dates
• Keep what you need… not just everything
• Keep slim: JSON is nice but…
• Keep information, not just numbers/hashtags
• Automation requires standardisation
• Microservices architecture assumes: team = silo/kingdom.
• Don’t use wrong patterns!
• If you don't understand your system, your microservice will not work at scale.

Avoid: designed by programmers for programmers; keep balance
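The ISO 8601 point in the list above can be illustrated with Python's standard library. This is a minimal sketch; the helper name `iso_timestamp` and the policy of rejecting naive datetimes are assumptions made for the example, not something the deck prescribes.

```python
from datetime import datetime, timezone


def iso_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 string in UTC."""
    if moment.tzinfo is None:
        # Assumption for this sketch: refuse ambiguous local times outright.
        raise ValueError("naive datetime: attach a timezone before logging")
    return moment.astimezone(timezone.utc).isoformat(timespec="seconds")


stamp = iso_timestamp(datetime(2015, 6, 1, 12, 30, 0, tzinfo=timezone.utc))
print(stamp)

# The same string parses back into an equal, timezone-aware datetime.
parsed = datetime.fromisoformat(stamp)
print(parsed == datetime(2015, 6, 1, 12, 30, 0, tzinfo=timezone.utc))
```

When every service emits timestamps in one such format and one offset, log lines from different teams sort chronologically as plain strings, which is one concrete way standardisation makes automation easier.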


never deploy on Friday

!whatever


If you need more…

There are no infinite resources. There are no perfect resources. Monitoring is part of a capacity management process, which is only a part of operations management. Heading for simplicity of usage, we accept complexity of the solution and sometimes blindness, hoping that when the time comes magic words will solve all the problems. Hope always dies last, and silence after failure gives us a lesson in humility: we have to learn how to interpret monitoring data, as this is engineering, not magic. Process is more than just tools and people, and management is inspiration and determination to achieve at least the goals that were defined. More failures, less time to learn - this is part of:

”You build it, you run it”

See more:

http://www.slideshare.net/RobertBigos

https://pl.linkedin.com/in/robertbigos

Blog: bigosr.com