DevOps Days Tel Aviv 2013: Ignite Talk: Monitoring Patterns with Riemann - Itai Frenkel & Eli Polonsky

Riemann aggregates events from your servers and applications with a powerful stream processing language, which enables concise monitoring rule declarations. This 5-minute ignite talk gives a taste of common monitoring pattern implementations (heartbeat, statistics, event enrichment, state-based filters, multi-tenant monitoring) and reviews what you can do with Riemann after processing these patterns. Speakers: Itai Frenkel and Eli Polonsky, GigaSpaces. Eli Polonsky and Itai Frenkel work at GigaSpaces, developing the Cloudify open source devops and cloud automation suite. Part of their work includes evaluating open source devops tools such as Riemann.

  • Built for monitoring distributed systems

  • Event Stream Processing (like Esper/Drools Fusion)

  • Shared-State (index)

  • Open Source (written by aphyr (Kyle Kingsbury))

2. Concepts

    event

    host      A
    service   req_latency
    state     ok
    metric    1
    ttl       60
    tags      important

3. Concepts

    index: (host, service) -> last event

    (A, req_latency)
    (B, req_latency)
    (C, req_latency)
    (D, req_latency)

4. Heartbeat Trigger

5. Heartbeat Trigger

    (expired
      (tagged "keep_alive"
        (email "alert@devops.tlv")))

6. Threshold Trigger

7. Threshold Trigger

    (where (and (service "req_latency")
                (> metric 10))
      (email "alert@devops.tlv"))

8. Change State

    (host, service)         metric   state
    ('A', 'req_latency')    20       error
    ('B', 'req_latency')    1        ok
    ('C', 'req_latency')    5        error
    ('D', 'req_latency')    5        ok

9. Change State

    (where (service "req_latency")
      (split
        (< metric 2)  (with :state "ok" index)
        (> metric 10) (with :state "error" index)))

    (changed-state {:init "ok"}
      (email "alert@devops.tlv"))

10. Time Window Statistics

11. Cluster Statistics

12. Cluster Statistics

    (def index-max-of-median
      (smap folds/maximum index))

    (by [:host]
      (where (service "req_latency")
        (percentiles 60 [0.5]
          index-max-of-median)))

13. Event Storm Filtering

14. Event Storm Filtering

    (def alert-devops
      (throttle 100 3600
        (rollup 3 3600
          (email "alert@devops.tlv"))))

    (where (tagged "db-connection-exception")
      alert-devops)

15. Event Enrichment

    host      A
    service   req_latency
    state     ok
    metric    1
    ttl       60
    tags      important
    tenant    1

16. Event Enrichment

    (defn change-event [my-key my-value & children]
      (fn [event]
        (let [my-event (assoc event my-key my-value)]
          (call-rescue my-event children))))

    (change-event :tenant "1" index)

17. Tenant 1, Tenant 2, Tenant 3

18. Multi-Tenancy

    (def riemann-agg (tcp-client :host "agg-hostname"))

    (changed-state
      (change-event :tenant "1"
        (forward riemann-agg)))
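The slide snippets use `index` and `email` without showing where they come from. The following is a minimal riemann.config sketch of the surrounding setup they appear to assume, wired to the heartbeat trigger from slide 5. It is not taken from the talk: the `:from` address, the 5-second expiry scan, and the 60-second default TTL are illustrative assumptions, and option syntax may differ between Riemann versions.

```clojure
; Sketch of a riemann.config; values marked "assumed" are not from the talk.

; Listen for events from clients on the default TCP port.
(tcp-server)

; Scan the index every 5 seconds (assumed interval). Events whose TTL has
; lapsed are re-injected into the streams with :state "expired", which is
; what the (expired ...) stream matches on.
(periodically-expire 5)

(let [index (index)                                        ; last event per (host, service)
      email (mailer {:from "riemann@example.com"})]        ; assumed sender address
  (streams
    ; Index every event, giving it a 60-second TTL (assumed) if it has none.
    (default :ttl 60 index)

    ; Heartbeat trigger: alert when a keep_alive event expires.
    (expired
      (tagged "keep_alive"
        (email "alert@devops.tlv")))))
```

Binding `index` and `email` in a `let` rather than with top-level `def`s avoids shadowing Riemann's own `index` function when the config is reloaded; the other slide snippets would go inside the same `(streams ...)` form.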