#vmworld

Billion Lyft Rides, Half Million IoT Users

How to Scale SaaS with Analytics Insights

Rob Fisher, SRE, Centrica Hive

Yash Kumaraswamy, Sr. Software Engineer, Lyft

Stela Udovicic, Wavefront Product Marketing, VMware

MGT1402BE

#MGT1402BE

VMworld 2018 Content: Not for publication or distribution

Disclaimer

©2018 VMware, Inc.

This presentation may contain product features or functionality that are currently under development.

This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

Technical feasibility and market demand will affect final delivery.

Pricing and packaging for any new features/functionality/technology discussed or presented, have not been determined.

Agenda

Introduction

What is New with Wavefront

Centrica Hive

Lyft

SaaS is Growing

$60 Billion: 2017 SaaS revenue worldwide

$117 Billion: 2021 SaaS revenue forecast worldwide

*Source: Gartner press release, April 12, 2018

But Scaling Cloud Applications is Not Easy

Visibility Issues

• Containers rapid churn

• Serverless speed

• Cloud-scale

Security risks of public cloud applications

Wavefront Cloud-Native Analytics and Monitoring Platform

UI and API Backend

Trend & Alert on Anomalies

Troubleshoot Issues

Visualize Metrics at Scale

Self-Service Metrics Analytics for All

Advanced Analytics Engine

Metrics Collection and Storage

Cloud-Native Analytics and Monitoring Platform

Wavefront by VMware: Reliably Scale Your Digital Business

Massive Container Scalability

Serverless Application Instrumentation and Monitoring

Enriched AWS Dashboards - New UI for Faster Troubleshooting

Enhanced Security Access Control

NEW!

Wavefront's Massive Container Scalability Helps Easily Grow Digital Business – No Blind Spots

Ingesting, Analyzing, Visualizing Metrics from 100,000 Concurrently Running Containers

Deep Wavefront PKS Integration for Holistic Kubernetes Monitoring

Kubernetes Health Monitoring

Resource Consumption

Programmatic Alerting

Deliver Serverless Code Faster Using Wavefront Serverless Instrumentation and Monitoring

Wavefront AWS Lambda Functions SDK - Python, Go, Node.js

– Faster: send metrics from functions directly, bypass AWS CloudWatch Lambda

– Better granularity: 1 sec compared to 5 min

Wavefront AWS Functions Dashboards

– At-a-glance serverless health monitoring

– Easy customization

– Native correlation with custom metrics

Wavefront Delta Counters

– More accurate reporting prevents metric loss

– Aggregates metric counters from various sources

Visualize Health of Serverless Environment with New Dashboards

At-a-Glance Health with Aggregated Visibility

Detailed Per Function Dashboard

Troubleshoot Cloud Environments Faster with Enriched Wavefront Dashboards for AWS

Holistic AWS Monitoring

Map and view the status of your global AWS cloud estate to see where problems are emerging system-wide

Host Maps with Easy Drill-Downs

Click and link to cloud assets to seamlessly drill down across regions, zones, and instances

New Dashboards & Widgets

Use prebuilt dashboards with new metric widgets to accelerate incident resolution

Intuitive Security Access Controls for Protecting Digital Insights

User Groups Management

Create user groups and assign permissions to easily manage dashboard accessibility

Programmatic Controls

Enhanced usability with easy manipulation of user groups helps manage user growth

ACL on Entities

Set ACLs on dashboards to isolate access by user group and help avoid malicious data integrity attacks

Some Wavefront Customers Monitoring Cloud-Native Applications and Infrastructure

Centrica Hive

About Centrica Hive

Largest IoT platform in the United Kingdom

Over 500,000 customers

Entirely cloud-native

Part of Centrica group, but grew like a startup

Taking Control

The Estate

Configuration Management

The Platform

Security and Compliance

Alerting (and sleeping all night)

Cost

Wavefront at Centrica Hive: Future Plans

Moving to Serverless and Kubernetes

Integrating all our devices

Staying in control

Improving Lives with the World’s Best Transportation

About Me

HOBBIES: Guitar, Golf, Skateboarding, Cooking/Baking, and Automobiles

FORMER: Tech Lead of Lyft Observability

CURRENTLY: Working closely with the Express Drive team – Lyft’s vehicle rental program for drivers

OVER 9 years in tech

2010-2014: Zynga

EARLY: Lyft Infrastructure (DevOps) Engineer

One Billion Rides

2018

About Lyft

• Transportation as a service

• “Your friend with a car,” redefines personal transportation

• Founded in San Francisco in 2012

• Currently serving the US and Canada

• Available in 300+ cities, with 1,500 drivers at any minute

Lyft – More Fun Facts

• 250,000 Lyft community members gave up their cars at the beginning of 2017

• The Lyft community will take 1 million cars off the road by the end of 2019

• Autonomous vehicle fleets will become widespread & will account for the majority of Lyft rides within five years

• By 2025, private car ownership will all but end in major US cities

• Lyft rides are carbon-neutral

• Lyft Bikes and Scooters will be our solution to last mile commute

Lyft Stats – in 2017

Annual Rides: … MM

New Year's Eve Rides: … MM

Employees: 2K+

Halloween Drop-Offs/sec: … K+

Microservices: 200+

EC2 instances: 10,000+

Lots* of logs and metrics

Observability Team at Lyft

Founded in early 2016, a small and cohesive team of 5 engineers

Team collectively owns

• Client and Server logging infrastructure

• Metric ingest pipeline and real-time aggregation

• Distributed Tracing

• PagerDuty interactions and integrations

• The real-time business metric framework

• Dashboards and user experience with monitoring and alarming setup

• Logging and metric-based alerting

• Baseline monitoring systems for all microservices

• Core libraries

Metrics at Lyft: The Before Times

Before Wavefront by VMware: Challenges with Open Source Tooling

Reliability

• Hard to scale

• Sharding handled externally

Performance

• Query performance issues

• Ingest performance issues

Maintainability

• Manual maintenance

• Resource-hungry, which drives cost

Observability Challenges Early in 2015

• Lyft used Graphite (and whisper files) located on i2 instances

• Hard to scale; we handled sharding externally

• Relays provided poor control for fan-out of data to alternate destinations

• We computed top-level aggregates from the one already existing

• This stack processed locally aggregated, per-minute samples

Observability Challenges Early in 2016

• Replace the poorly scaling Python-based intermediaries with more efficient components

• Reduce end-to-end latency for the site from > 3 min to < 2 min

• Produce improved and accurate top-level aggregates - p95/99/999/9999

Early 2016 – Enter Wavefront

• Node.js based StatsD replaced by C implementation of StatsD server – lower overhead, better data quality

• Added fan-out for StatsD traffic to other clusters or receivers, e.g., Wavefront

• Wrote cluster-wide aggregated metrics to the existing Graphite cluster under a new namespace to allow comparisons of latency and accuracy

• Aggregated StatsD packets over time in several dimensions, including per-host and per-cluster

Wavefront started serving 20% of read traffic in March 2016

• Time series ingestion

• Integrated alarms

• Wavefront salt module for alert, dashboard and user management

• Grafana integration
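
The fan-out and aggregation steps on this slide can be pictured with a minimal sketch. It is not Lyft's statsrelay or statsite code: the `FanOutRelay` class, the host and cluster names, and the in-memory "receivers" are all hypothetical, and only StatsD counters (`|c`) are aggregated here.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


def parse_statsd(line: str) -> Tuple[str, float, str]:
    """Parse a StatsD line such as 'rides.requested:1|c' into (name, value, type).

    Any trailing sample-rate field ('|@0.1') is ignored in this sketch.
    """
    name, rest = line.split(":", 1)
    value, metric_type = rest.split("|")[:2]
    return name, float(value), metric_type


class FanOutRelay:
    """Hypothetical relay: forwards raw lines to every receiver and keeps
    counter sums in two dimensions, per host and per cluster."""

    def __init__(self, cluster: str, receivers: List[Callable[[str], None]]):
        self.cluster = cluster
        self.receivers = receivers
        self.per_host: Dict[Tuple[str, str], float] = defaultdict(float)
        self.per_cluster: Dict[Tuple[str, str], float] = defaultdict(float)

    def handle(self, host: str, line: str) -> None:
        # Fan-out: every receiver gets an identical copy of the raw line.
        for send in self.receivers:
            send(line)
        # Aggregation: only counters are summed in this sketch.
        name, value, metric_type = parse_statsd(line)
        if metric_type == "c":
            self.per_host[(host, name)] += value
            self.per_cluster[(self.cluster, name)] += value


# Two in-memory lists stand in for the existing cluster and a second receiver.
graphite_buffer: List[str] = []
wavefront_buffer: List[str] = []
relay = FanOutRelay("api", [graphite_buffer.append, wavefront_buffer.append])
relay.handle("host-1", "rides.requested:1|c")
relay.handle("host-2", "rides.requested:2|c")
relay.handle("host-1", "request.latency:12.5|ms")
```

Because both receivers see the same raw traffic, the old and new backends can be compared side by side before any read traffic is moved.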

So Many Metrics!

System metrics

• Collected

• Custom scripts

• Bash functions

Application metrics

Core libraries instrumentation

Scraper scripts - pull metrics

• CloudWatch metrics

• Google Cloud Platform metrics

• Mongo telemetry

Container-generated parameters (future Kubernetes)

So Many Metrics!

Billions per second, even with aggregation and sampling

Graphite meltdown!

Opt-in mechanism for per-host and per-second data

Only ~300K metrics per second, thanks to rollups

Per-instance cardinality limits
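
One way to picture the per-instance cardinality limit named on this slide is a small admission check that runs on each instance. This is only a sketch under stated assumptions: the `CardinalityLimiter` class, its `max_series` parameter, and the metric names are hypothetical and are not taken from Lyft's pipeline.

```python
from typing import Set


class CardinalityLimiter:
    """Hypothetical limiter: admits at most `max_series` distinct metric
    names per instance and drops samples for any name beyond that."""

    def __init__(self, max_series: int):
        self.max_series = max_series
        self.seen: Set[str] = set()
        self.dropped = 0

    def admit(self, metric_name: str) -> bool:
        # Names that are already tracked are always accepted.
        if metric_name in self.seen:
            return True
        # New names are accepted only while there is room left.
        if len(self.seen) < self.max_series:
            self.seen.add(metric_name)
            return True
        # Over the limit: count the drop so it can itself be reported.
        self.dropped += 1
        return False


limiter = CardinalityLimiter(max_series=2)
names = ["cpu.user", "cpu.system", "cpu.user", "rides.by_user.12345"]
results = [limiter.admit(name) for name in names]
```

A limit of this kind keeps a single service that embeds unbounded values (such as user IDs) in metric names from inflating the total number of time series.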

Wavefront by VMware at Lyft Today

• System Monitoring

• Application monitoring

• > 500,000 metrics/second - peaked at 800,000

• 1,000+ engineers using Wavefront

• 1,000+ Wavefront dashboards

• 18,000+ Wavefront alerts

Today Lyft Relies on Wavefront for Time Series and Alarming

Python and Golang

• Common base libraries for each language

• Hundreds of microservices, one monorepo (that is getting decomposed)

• Frequent deploys

• Common “base” deploy, Salt (masterless), AWS public cloud

• The DevOps (Infrastructure) team's role is to enable others, not to operate their services

• Teams are responsible for their services

• No SRE

How Does the Metrics Aggregation Pipeline at Lyft Work? Cascaded Approach

github.com/lyft/statsrelay.git

github.com/lyft/statsite.git

Data Aggregation

Per-host aggregates are computed locally

Service-level aggregates are computed centrally, for correct histograms

Default metrics are aggregated at a 60s interval

A 1-second interval is possible with a whitelist
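
The reason service-level histograms need central aggregation can be shown with a small worked example. The latency numbers and host names below are made up for illustration, and the nearest-rank `percentile` helper is a generic definition, not the one used by statsite.

```python
import math
from typing import Dict, List


def percentile(samples: List[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical per-host request latencies in milliseconds.
latencies_by_host: Dict[str, List[float]] = {
    "host-1": [10.0] * 99 + [20.0],  # healthy host serving most requests
    "host-2": [500.0, 900.0],        # slow host serving very few requests
}

# Local-only approach: take each host's p95, then average the results.
# Every host counts equally, no matter how little traffic it served.
per_host_p95 = {host: percentile(v, 95) for host, v in latencies_by_host.items()}
averaged_p95 = sum(per_host_p95.values()) / len(per_host_p95)

# Central approach: merge the raw samples first, then compute p95 once.
merged = [x for values in latencies_by_host.values() for x in values]
service_p95 = percentile(merged, 95)

print(per_host_p95)   # {'host-1': 10.0, 'host-2': 900.0}
print(averaged_p95)   # 455.0
print(service_p95)    # 10.0
```

Averaging the per-host values reports 455 ms, while the p95 over all 102 requests is 10 ms; percentiles cannot be combined after the fact, which is why the service-level ones have to be computed where all samples are visible.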

Transitioning from Graphite to Wavefront Format Is Easy
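
As an illustration of why the format change can be small, here is a hedged sketch that rewrites a Graphite plaintext line (`path value timestamp`) into a Wavefront data format line (`metric value timestamp source=... key="value"`). The `graphite_to_wavefront` helper, the host name, and the `env` tag are hypothetical; real migrations may also rename metrics or derive tags from path segments.

```python
from typing import Dict, Optional


def graphite_to_wavefront(
    line: str, source: str, tags: Optional[Dict[str, str]] = None
) -> str:
    """Rewrite one Graphite plaintext line as one Wavefront-format line."""
    # Graphite plaintext protocol: "<metric path> <value> <timestamp>".
    path, value, timestamp = line.split()
    # Wavefront adds a required source and optional key="value" point tags.
    parts = [path, value, timestamp, f"source={source}"]
    for key, val in sorted((tags or {}).items()):
        parts.append(f'{key}="{val}"')
    return " ".join(parts)


converted = graphite_to_wavefront(
    "api.requests.count 42 1535500800", "host-1", {"env": "prod"}
)
print(converted)
# api.requests.count 42 1535500800 source=host-1 env="prod"
```

The metric name, value, and timestamp carry over unchanged; the host moves out of the name into `source`, and other dimensions can become point tags.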

Lyft Business Metrics in Wavefront

Passenger metrics

• New user signups / installs / activations

• Current passengers with the app open

Driver metrics

• New driver applications / activations

• Current drivers with the app open

Ride metrics

• Rides requested / accepted / dropped off / canceled / lapsed

• Lyft Line rides dropped off

• Paid vs. Couponed rides dropped off

Marketplace metrics

• Drivers available

• Drivers en route

• Driver utilization %

Passenger - PAX Client Metrics - Wavefront Integration with Grafana

Techniques Used at Lyft to Avoid Production Incidents with Hundreds of Microservices

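Easy Application Metrics Collection - Python Metrics Library

from lyft_stats import stats

# Obtain a stats handler; 'test_prefix' is presumably prepended to metric names.
handler = stats.get_stats('test_prefix')

# Renamed from `map` to avoid shadowing the Python built-in.
lookup = {'foo': 'bar'}

try:
    # Time everything inside the block and report it as 'sample.timer'.
    with handler.timer('sample.timer'):
        # do other things
        print(lookup['test'])  # 'test' is not a key, so this raises KeyError
except KeyError:
    # Count the failed lookup instead of letting the error propagate.
    handler.incr('illegal.access')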

Easy Metrics Collection - Go Metrics Library

https://github.com/lyft/gostats

Observability in the Age of Microservice Mesh

Envoy Primer

• Envoy Proxy - modern, high-performance, small-footprint edge and service proxy designed for cloud-native applications

• Out-of-process architecture (sidecar)

• C++11 code base

• Service discovery and active/passive health checking

• Advanced load balancing

• Edge and service proxy

• HTTP L7 filter architecture

• Best in class Observability (tracing, logging, and stats)

Measure Everything!

Managed Dashboards and Alarms Hub

• Monolithic repository for managing dashboards

• Close integration with our Salt infrastructure

• Grafana and Wavefront modules for dashboard/alert management

• Dashboards/alerts defined as Salt states (Jinja2 + YAML)

• Rigorous code review process

• Consistent look and feel

• Distributed ownership

Consistent Look and Feel Across All Our Microservices

Envoy Global Health Dashboard - Wavefront Integration with Grafana

Metrics-Based Alerting Using Wavefront

Enrichment

Finding a Needle in a Haystack

Help Us Arrive at Root Cause Quickly

Tight Coupling

Benefits of Wavefront for Lyft

• Multiple-system syndrome

- Fewer tools for triage, better and faster resolution

- Context switching is expensive

- Wavefront puts metrics and data from numerous sources up front and makes them available in a single click

• Real-time visibility into the performance of our key services

• Highly efficient Alert Engine

- Lyft relies on Wavefront to create smart alerts that dynamically filter noise and capture genuine anomalies

• Powerful metrics explorer and chart view

Big Wins with Wavefront

Ability to monitor releases to help engineers make accurate decisions

Predict the future

Empirical data to guide decision making

Robust alerting - for when you’re not watching

A first-class citizen for answering questions such as “Is Lyft up?” or “How many rides did we complete?”

Intuitive yet powerful query language

THANK YOU!
