
KV Store: Hammer Time


Page 1: KV Store: Hammer Time

Copyright © 2016 Splunk Inc.

Nadine Miller, Technical Support Engineer, Splunk (aka 'vraptor' on IRC and Slack)

Disclaimer

During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

Who am I?

Senior Technical Support Engineer

Search Head Cluster & Index Cluster SME

Keeper of Support KV Store Troubleshooting Docs

SplunkTrust Member

Senior UNIX Systems Administrator in a Previous Life

What are we talking about?

Focused on KV Store within the context of a search head cluster (SHC)

Discussion of backup/restore, which is also important for standalone search heads

Where to disable KV Store

Merging!

No discussion of development/custom use

What is this beast?

KV Store is:

A database (MongoDB)

Storage for user-created data in Splunk that can be linked to events via searches

Another method for lookups

Scheduler

Why is KV Store important?

ES, ITSI, and other premium apps: user-created data is stored in KV Store

SHC: keeps track of completed jobs to prevent duplicate jobs if an SHC member goes offline

Standalone SH: tracking of completed jobs in case of downtime

KV Store 101

The MongoDB project has good docs; do not hesitate to refer to them

Logs:

$SPLUNK_HOME/var/log/splunk/mongodb.log

$SPLUNK_HOME/var/log/introspection/kvstore.log

KV Store is wholly unnecessary on IDX & HWF by default, unless you explicitly use KV Store or a SH is enabled on the instance (it may not be necessary even then, depending on use)

Note: setting "replicate = true" (in collections.conf) creates CSVs that replicate to the IDX; it is not MongoDB replication to the IDX

KV Store 101

Searches to get KV Store status

SHC status commands

DMC/MC info: collection counts, size, etc.

REST:

curl -ku admin https://<host>:<mPort>/services/kvstore/status

Others in the REST endpoint docs
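For scripted checks, the curl call above can be approximated in Python with only the standard library. This is a minimal sketch, not an official client: the function names are made up for illustration, the host and credentials in the demo are hypothetical, and certificate verification is turned off only to mirror curl's -k flag.

```python
import base64
import ssl
import urllib.request


def build_status_request(host, mgmt_port, user, password):
    """Build a GET request for the KV Store status endpoint named on the slide."""
    url = f"https://{host}:{mgmt_port}/services/kvstore/status"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def fetch_status(request, timeout=10):
    """Send the request and return the raw response body as text.

    Certificate verification is disabled, mirroring curl's -k flag; that is
    only reasonable against a host you already trust.
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical host and credentials, for illustration only.
    req = build_status_request("sh1.example.com", 8089, "admin", "changeme")
    print(req.full_url)
```

Building the request separately from sending it keeps the URL and header logic checkable without a running Splunk instance.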

#1 Failure

No Backups!

Folks don't realize this data is not in an index

If you don't take regular backups, it is easy to lose all data in the KV Store

Protip: RAID is not a backup

Backup Methods

OS level

SHC: first make sure all SHs are in sync

6.5 or later: splunk show kvstore-status

Before 6.5: curl -s -k https://localhost:8089/services/server/info | grep kvStoreStatus

Check DMC/MC for collection counts

Take the SH offline

Take a filesystem snapshot or copy the entire KV Store directory

Bring the SH back online
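The "copy the entire KV Store directory" step can be scripted. The sketch below assumes Splunk has already been stopped on the search head and that you pass in the real location of the kvstore directory; the slides do not name the path (commonly it sits under $SPLUNK_HOME/var/lib/splunk, but confirm on your own system). The function name and archive naming scheme are illustrative choices.

```python
import tarfile
import time
from pathlib import Path


def archive_kvstore(kvstore_dir, backup_dir):
    """Write a timestamped .tar.gz of kvstore_dir into backup_dir and return its path."""
    src = Path(kvstore_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"KV Store directory not found: {src}")
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"kvstore-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths relative, so the archive unpacks as "<dirname>/..."
        tar.add(src, arcname=src.name)
    return archive
```

A filesystem snapshot is usually faster for a large KV Store; an archive like this is mainly convenient for moving a copy to a test instance.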

Backup Methods

ITSI

Use backup/restore in the Web UI

Use kvstore_to_json.py (repurpose?)

Starcher's backup script

https://github.com/georgestarcher/Splunk-backupkvstore

Outputlookup

A large KV Store could be a problem: high memory consumption

Multi-kv data could also be a problem: it gets flattened

Restoring Ze Whole Enchilada

Protip: TEST in a dev environment first!

OS level, standalone SH:

Take Splunk offline

Copy the kvstore directory back into the correct location

Restart Splunk
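The "copy the kvstore directory back" step is where an overwrite mistake is most likely, so a small guard can help. This sketch assumes the backup is a .tar.gz whose top-level entry is named kvstore (a hypothetical layout, not something the slides prescribe) and that Splunk is offline while it runs. The function name is illustrative.

```python
import tarfile
from pathlib import Path


def restore_kvstore(archive, parent_dir):
    """Unpack a kvstore archive under parent_dir; refuse to overwrite existing data."""
    parent = Path(parent_dir)
    target = parent / "kvstore"
    if target.exists() and any(target.iterdir()):
        raise FileExistsError(f"{target} is not empty; move it aside first")
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        # Only accept archives whose members all live under "kvstore/".
        inside = all(n == "kvstore" or n.startswith("kvstore/") for n in names)
        escapes = any(".." in Path(n).parts for n in names)
        if not inside or escapes:
            raise ValueError("archive does not look like a kvstore backup")
        tar.extractall(parent)
    return target
```

Refusing to write into a non-empty directory forces an explicit decision about the old data, which fits the "test in a dev environment first" advice.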

Restoring in SHC

SHC version 6.3 or later:

Shut down the SHC members

splunk clean kvstore --cluster

Copy the kvstore directory into place on one member

Verify the KV Store is good

Restart the other SHC members

SHC prior to 6.3: complicated

Re-bootstrap the cluster from scratch after temporarily setting replication_factor=1

Advise opening a support case

SHC: When Everything is a Nail...

KV Store Complications

KV Store is a separate cluster within the SHC

The KV Store captain is probably different from the SHC captain

Monitor KV Store:

Staleness

Collection counts: out of sync = problem

Monitor your SHs; if one is offline for any significant time, it is likely to lose sync
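The collection-count check is easy to automate once the counts have been gathered per member, for example by reading them off the Monitoring Console. The sketch below only does the comparison; the member and collection names in the usage note are hypothetical, and how you collect the counts is left open.

```python
def find_out_of_sync(counts_by_member):
    """Return {collection: {member: count}} for collections whose counts differ.

    counts_by_member maps member name -> {collection name: record count}.
    A collection missing on a member is treated as a count of 0.
    """
    collections = set()
    for counts in counts_by_member.values():
        collections.update(counts)
    mismatches = {}
    for name in sorted(collections):
        per_member = {m: c.get(name, 0) for m, c in counts_by_member.items()}
        if len(set(per_member.values())) > 1:
            mismatches[name] = per_member
    return mismatches
```

Calling it with `{"sh1": {"a": 10, "b": 5}, "sh2": {"a": 10, "b": 4}}` reports only collection `b`. Counts can differ briefly while replication is in flight, so a mismatch that persists across several checks is the more meaningful signal.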

Use the Monitoring Console, Luke

Especially in 6.5 and later

KV Store Basic Status

MC -> Search -> KV Store: Deployment

(Screenshot here)

KV Instance Collection Metrics

MC -> Search -> KV Store: Instance

KV Store Replication Latency

MC -> Search -> KV Store: Deployment

OpLog

MongoDB uses a circular operations log, aka "oplog"

If a KV Store member cannot keep up with the transactions in this log, eventually it will lose sync

Examples:

An SHC member is offline for longer than the oplog window (the span of time the oplog covers at its current size)

Large KV Store changes are performed in a non-transactional way (e.g. create temp table => replace existing table)

The SHC is so busy that the oplog head is overwritten faster than the KV Store can replicate across all members
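A toy model makes the "circular log" failure mode concrete. This is not how MongoDB implements its oplog (the real one is sized in bytes, not entries); it is a simplified sketch in which a fixed-capacity log overwrites its oldest entries and a member can only catch up if the next operation it needs is still present.

```python
from collections import deque


class Oplog:
    """Toy fixed-capacity operations log; the oldest entries are overwritten."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)
        self.next_seq = 0

    def write(self, op):
        self.entries.append((self.next_seq, op))
        self.next_seq += 1

    def can_catch_up(self, last_applied_seq):
        """True if the entry right after last_applied_seq is still in the log."""
        if last_applied_seq + 1 >= self.next_seq:
            return True  # nothing new to apply
        oldest_seq = self.entries[0][0]
        return last_applied_seq + 1 >= oldest_seq


if __name__ == "__main__":
    log = Oplog(capacity=5)
    for i in range(3):
        log.write(f"op{i}")
    # A member goes offline after applying sequence number 2.
    for i in range(7):
        log.write(f"later{i}")
    print(log.can_catch_up(2))
```

With a capacity of 5 and seven writes during the outage, the entry the member needs next has already been overwritten, so it cannot catch up and would need a resync. A larger capacity, or fewer writes while it was offline, would have let it recover.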

KV Store Basic Status

Out-of-the-box (OotB) oplog size = 1 GB

But why?

MongoDB by default sets the oplog size to 50 GB on large partitions

1 GB is generally fine for a normal SHC that's just keeping track of scheduler jobs

1 GB even works fine for a while with premium apps, until it doesn't

How big should "oplogSize" be?

"It depends"

Look at the size of the "kvstore" directory on disk or in the MC

Has there been a recent resync (compaction only occurs on resync)?

Latency?

Any oddities in window size?

Doing anything unusual (temp tables/replace)?

Generally something between 10 and 20 GB should work unless you're doing something unusual
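One rough way to reason about "it depends" is to relate oplog size to write volume. The sketch below assumes a steady write rate, which real deployments rarely have, and the rate used in the usage note is a made-up number rather than a measurement. Treat the result as a starting point to compare against the oplog window shown in the MC, not as a sizing rule.

```python
def oplog_window_hours(oplog_size_gb, write_rate_gb_per_hour):
    """Hours of history the oplog holds at a steady write rate."""
    if write_rate_gb_per_hour <= 0:
        raise ValueError("write rate must be positive")
    return oplog_size_gb / write_rate_gb_per_hour


def oplog_size_for_window(target_hours, write_rate_gb_per_hour):
    """Oplog size in GB needed to cover target_hours at the given write rate."""
    if target_hours <= 0 or write_rate_gb_per_hour <= 0:
        raise ValueError("inputs must be positive")
    return target_hours * write_rate_gb_per_hour
```

At a hypothetical 0.5 GB of oplog writes per hour, the default 1 GB covers about 2 hours, and covering a 24-hour outage would take about 12 GB, which falls inside the 10 to 20 GB range suggested above.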

KV OpLog Window

MC -> Search -> KV Store: Deployment

Other OpLog Impacts

How big is the SHC?

Slow disk I/O

Network latency within the SHC

If you are having problems replicating config changes inside the SHC, that can impact the KV Store

May need tuning in the SHC (hit me up in Slack/IRC)

Merging KV Stores

Thanks to Splunk, it is possible to merge KV Stores, unlike raw MongoDB

Caveats:

Must be done on a collection-by-collection basis

Have to understand the key fields in the collection

Multi-kv flattened

Example:

| inputlookup incident_review_lookup | table _key,time,rule_id,owner,urgency,status,comment,user,rule_name | outputlookup new_data.csv

General Method

Back up the KV Store you want to merge into (the "good" KV Store)

Perform the previous search on the incomplete "bad" KV Store to get a CSV file

Copy both to a test instance

Restore the "good" KV Store into the test instance

Perform whatever surgery is necessary on the CSV file to remove unwanted records

"Merge" using another search:

| inputlookup new_data.csv | outputlookup append=true incident_review_lookup
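The "surgery" and "merge" steps can be rehearsed offline before touching a real collection. The sketch below filters rows out of an exported CSV and then models the merge as an upsert on _key. That upsert behaviour is an assumption about how the append search treats matching keys, so verify it on the test instance; the function names and sample field values in the usage note are illustrative.

```python
import csv
import io


def filter_rows(csv_text, keep):
    """Return the rows (as dicts) of csv_text for which keep(row) is true."""
    return [row for row in csv.DictReader(io.StringIO(csv_text)) if keep(row)]


def merge_by_key(good_records, new_rows):
    """Merge new_rows into good_records, matching on the _key field.

    Assumption: a row whose _key already exists replaces that record, and a
    row with a new _key is added. Verify this against your Splunk version.
    """
    merged = {rec["_key"]: dict(rec) for rec in good_records}
    for row in new_rows:
        merged[row["_key"]] = dict(row)
    return list(merged.values())
```

For example, `filter_rows(text, lambda r: r["owner"] != "")` drops rows with an empty owner, and passing the result to `merge_by_key` shows which existing records would be replaced. This is why understanding the key fields matters before running the append.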

Other Suggestions

Duane and George's .conf2016 talk: "Shop Smart at the KV Store: Best Value Tricks from the Splunk KV Store and REST API"

https://conf.splunk.com/files/2016/slides/shop-smart-at-the-kv-store-best-value-tricks-from-the-splunk-kv-store-and-rest-api.pdf

George's TA: TA-SyncKVStore

https://splunkbase.splunk.com/app/3519/

Gemini KV Store Tools for Splunk

Looks interesting, haven't tested

https://splunkbase.splunk.com/app/3536/

Errata

On a standalone out-of-the-box SH, KV Store is only used for users' search history; no scheduled job status is tracked using KV Store

Addendum #1

You can do a "round robin resync" to increase opLogSize

Slow and tedious if you have a large cluster

Addendum #2

Find out if KV Store is in use on your IDX or HWF:

curl -k -u admin https://127.0.0.1:8089/servicesNS/nobody/search/storage/collections/config | egrep -i "\<title\>"

Should only return:

<title>collections-conf</title>

<title>SavedSearchHistory</title>

Caveats: only admins can log in to the IDX/HWF; no apps are installed; you used a clean instance as an IDX
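When checking many indexers or forwarders, the egrep step can be replaced by a small filter that reports only the unexpected titles. This sketch works on response text you have already fetched and uses a regular expression in the same spirit as the egrep above, not a full XML parser. The two expected titles come from the slide; the function name is illustrative.

```python
import re

EXPECTED_TITLES = frozenset({"collections-conf", "SavedSearchHistory"})


def unexpected_collections(atom_xml, expected=EXPECTED_TITLES):
    """Return the sorted <title> values in atom_xml that are not in expected."""
    titles = re.findall(
        r"<title>(.*?)</title>", atom_xml, flags=re.IGNORECASE | re.DOTALL
    )
    return sorted({t.strip() for t in titles} - set(expected))
```

An empty list suggests KV Store is not in use on that instance, subject to the same caveats listed above. Any other title names a collection worth investigating before disabling KV Store there.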

Questions?

THANK YOU