MONITORING GERRIT

STEPHEN V WILLIAMS

INTRODUCTION

• Git & Gerrit Admin at Qualcomm

• Admin Team:
  • Earl Gresh, Aaron Crossland

• Developer Team:
  • Nasser Grainawi, Martin Fick, Kevin Degi, Craig Chappel, Raviteja Sunkara

The usual blurb

MONITORING PHILOSOPHY

• Inventory Supplement

• Operation Center

• Fast and Loud

• 1% is better than 0%

To err is human … to really foul things up requires a computer

MONITORING SOLUTIONS

• Check_MK / Nagios

• Splunk

• Elasticsearch

• Cron Jobs

a little bit of everything

CRON JOBS

• Basic / Rudimentary / Step 1

• Email Warnings

# gerrit user crontab
MAILTO=lords-of-[email protected]

# check for corrupt repositories daily
15 5 * * * /srv/admin/gerrit-fsck.sh
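The slides do not show what `gerrit-fsck.sh` contains. As a rough illustration only, a daily corruption check of this kind might look like the sketch below. It assumes bare repositories whose directory names end in `.git` under a root directory passed in by the caller, and it treats a non-zero `git fsck` exit status as corruption. The function names are hypothetical and not taken from the talk.

```python
import subprocess
from pathlib import Path


def find_bare_repos(root):
    """Yield directories under root whose names end in .git (assumed bare repos)."""
    for path in sorted(Path(root).rglob("*.git")):
        if path.is_dir():
            yield path


def fsck_repo(repo, run=subprocess.run):
    """Return (ok, output) for one repository; `run` is injectable for testing."""
    result = run(
        ["git", "--git-dir", str(repo), "fsck", "--no-dangling"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, (result.stdout + result.stderr).strip()


def check_all(root, run=subprocess.run):
    """Return a list of (repo, output) for every repository that failed fsck."""
    failures = []
    for repo in find_bare_repos(root):
        ok, output = fsck_repo(repo, run)
        if not ok:
            failures.append((repo, output))
    return failures


def main(root, run=subprocess.run):
    """Print one block per corrupt repository and return a shell-style exit code."""
    failures = check_all(root, run)
    for repo, output in failures:
        print(f"CORRUPT: {repo}\n{output}\n")
    return 1 if failures else 0
```

A small wrapper could call `sys.exit(main(...))` with the site's repository root. Because cron mails any output to the `MAILTO` address, staying silent on success keeps the mailbox quiet until something is actually wrong.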

CHECK_MK / NAGIOS

• Standard Monitoring of Hardware

• CPU Usage

• Memory

• Disk

• I/O

• Network

• Better form of cron job monitors

Std Admin Toolkit

CHECK_MK / NAGIOS

• Greedy Upload-Packs

• Stuck Threads

• Long Running Jobs

Gerrit Admins Toolkit

Developer: “hmmm I need this really fast, I’m in a hurry… repo update -j 100000000000”

Developer: “I don’t like my regional mirror… let’s clone from the other side of the globe”

Developer: “I’m going to create a crontab to sync code every hour… no need to make sure it completes”
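The checks listed above are named but not shown in the slides. Nagios-compatible plugins report state through an exit code (0 OK, 1 WARNING, 2 CRITICAL) plus a single status line, so a long-running-job check could be sketched roughly as follows. The input format (pairs of task name and runtime in seconds) and the thresholds are assumptions for illustration, not values from the talk.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def check_long_running(tasks, warn_seconds=600, crit_seconds=3600):
    """Classify tasks by runtime and return (exit_code, status_line).

    tasks: iterable of (name, runtime_seconds) pairs gathered elsewhere.
    The thresholds are illustrative defaults, not values from the talk.
    """
    tasks = list(tasks)
    crit = [name for name, secs in tasks if secs >= crit_seconds]
    warn = [name for name, secs in tasks if warn_seconds <= secs < crit_seconds]
    # Text after the pipe is performance data in the usual plugin convention.
    perfdata = f"tasks={len(tasks)} warn={len(warn)} crit={len(crit)}"
    if crit:
        return CRITICAL, f"CRITICAL - long running: {', '.join(crit)} | {perfdata}"
    if warn:
        return WARNING, f"WARNING - slow: {', '.join(warn)} | {perfdata}"
    return OK, f"OK - {len(tasks)} tasks within limits | {perfdata}"
```

A plugin wrapper would print the status line and exit with the returned code. The same shape works for greedy upload-packs or stuck threads by swapping what is counted.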

SPLUNK & ELK STACK

Data Mining Logs

MONITORING SLAVES

• Runtime

• Upload Pack Count

• Receive Pack Count

• Runtime vs Session Time

RUN TIME

host=git-sd* sourcetype=ssherror_gerrit | search cmd_gerrit!="gerrit ls-projects" | timechart span=15min avg(runtime_seconds) by host
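For readers without Splunk at hand, the aggregation in the query above (drop `gerrit ls-projects`, bucket events into 15-minute windows, average `runtime_seconds` per host) can be sketched in plain Python. The field names come from the query. The event dictionaries and the `time` key are assumptions about how parsed log records might be represented.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def avg_runtime_by_host(events, span=timedelta(minutes=15)):
    """Average runtime_seconds per (time bucket, host).

    events: iterable of dicts with 'time' (datetime), 'host', 'cmd_gerrit'
    and 'runtime_seconds' keys, mirroring the fields used in the query.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (bucket, host) -> [total, count]
    for ev in events:
        if ev["cmd_gerrit"] == "gerrit ls-projects":
            continue  # same exclusion as cmd_gerrit!="gerrit ls-projects"
        # Floor the timestamp to the start of its bucket using exact
        # timedelta arithmetic (no floating point involved).
        bucket = datetime.min + ((ev["time"] - datetime.min) // span) * span
        entry = sums[(bucket, ev["host"])]
        entry[0] += ev["runtime_seconds"]
        entry[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

Each key of the result corresponds to one point on one host's line in the Splunk chart.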

UPLOAD-PACK COUNT

host=git-sd* sourcetype=ssherror_gerrit cmd_gerrit="git-upload-pack" | stats count as gerrit-fetch by host

RECEIVE-PACK COUNT

host=git-sd-* sourcetype="syslog" "Request receive-pack for" | timechart count as receive-pack by host

RUNTIME VS SESSION TIME

host=git-sd* sourcetype=ssherror_gerrit cmd_gerrit="git-upload-pack" | timechart span=5m avg(runtime_seconds) as runtime_duration, avg(runtime_total) as session_duration

MASTER MONITORING

• Commands

• Query Wait Times

• Connection Rates

• Connection Runtimes

• Java GC

• Repacking

COMMANDS

host=master sourcetype=ssherror_gerrit cmd_gerrit=* | timechart span=5min count by cmd_gerrit

QUERY WAIT TIMES

host=master sourcetype=ssherror_gerrit | search cmd_gerrit="*" | search cmd_gerrit!="gerrit stream-events" | timechart max(wait_seconds) avg(wait_seconds)

CONNECTION RATES

host=master sourcetype=ssherror_gerrit | search cmd_general="LOGIN" | timechart count by username

CONNECTION RUNTIMES

host=master sourcetype=ssherror_gerrit | search cmd_gerrit="*" | search cmd_gerrit!="gerrit stream-events" | timechart sum(runtime_seconds) by username

JAVA GC

host=master sourcetype=jvm_gerrit "Full GC" | timechart span=1m max(YoungGen_before) max(YoungGen_after) max(OldGen_before) max(OldGen_after) max(PermGen_before) max(PermGen_after)

-verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails
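The query above relies on fields such as `YoungGen_before` being extracted from the GC log, but the slides do not show how. One plausible extraction, sketched here under assumptions, is a regular expression over the per-generation `before->after(capacity)` triples that `-XX:+PrintGCDetails` writes on older JVMs with the parallel collector. The sample line is illustrative rather than taken from the talk, and the mapping from `PSYoungGen`, `ParOldGen` and `PSPermGen` to the field names in the query is a guess.

```python
import re

# Matches e.g. "[PSYoungGen: 10752K->0K(141824K)]" and captures the
# generation name plus the before/after sizes in KB.
GEN_RE = re.compile(
    r"\[(?:PS|Par)?(YoungGen|OldGen|PermGen): (\d+)K->(\d+)K\(\d+K\)\]"
)

# Illustrative log line, not from the talk.
SAMPLE = (
    "2015-11-09T10:15:30.123-0800: [Full GC "
    "[PSYoungGen: 10752K->0K(141824K)] "
    "[ParOldGen: 213644K->215361K(459264K)] 224396K->215361K(601088K) "
    "[PSPermGen: 2555K->2555K(21504K)], 0.2231340 secs]"
)


def parse_full_gc(line):
    """Return a dict like {'YoungGen_before': ..., 'YoungGen_after': ...} in KB.

    Returns None for lines that are not Full GC events.
    """
    if "Full GC" not in line:
        return None
    fields = {}
    for name, before, after in GEN_RE.findall(line):
        fields[f"{name}_before"] = int(before)
        fields[f"{name}_after"] = int(after)
    return fields
```

Calling `parse_full_gc(SAMPLE)` yields the six before/after values that the `timechart` then takes the per-minute maximum of.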

REPACKING

host=master sourcetype=repacking_git repack_keeps_before=* OR repack_packs_before=* | timechart span=1d min(repack_keeps_after),max(repack_keeps_before),min(repack_packs_after),max(repack_packs_before)