
©2019 VMware, Inc.

Behind the Scenes: How VMware is Transforming to a SaaS Company

VMware Cloud on AWS, Project Dimension, and More

Kit Colbert

CTO, Cloud Platform, VMware

March 21, 2019

Agenda

VMware’s Hybrid Cloud Vision

VMware Cloud on AWS, Project Dimension, and More

A Fundamental Shift in Delivery Model

VMware Cloud on AWS Operations

Summary

VMware Hybrid Cloud Vision

Consistent Infrastructure and Consistent Operations from Private Cloud to Public Cloud to Edge

[Figure: sites labeled Branch, Datacenter, Telco/NFV, and Edge/IoT]

VMware Delivers Across the Public Cloud, Data Center, and Edge

Consistent Infrastructure and Operations to Speed Innovation

Public Cloud Private Cloud Compute Edge

The VMware Cloud Solution

Management Compute Storage Networking

Evolution of our infrastructure abstractions

Abstracting Away Complexity

[Figure: four generations of abstraction: v1 Hypervisor, v2 Virtual Infrastructure, v3 Software-defined Datacenter, v4 Hybrid Cloud. Components shown: ESX, ESXi, vCenter, vSphere, vSAN, NSX, vRealize, VMware Cloud Foundation, VMware Cloud, VMware Cloud Services, Virtual Cloud Network, and SDDCs (DC, AWS, VCPP, IBM, edge)]

VMware Cloud™ on AWS

Extend your VMware environment to the world-class AWS Cloud

[Figure labels: Private Cloud, AWS Global Infrastructure, vCenter, VMware Cloud Foundation (Network, Compute, Storage), Automation & Operations, Intrinsic Security & Lifecycle Automation, VM, Integrated with AWS Services]

Rich VMware SDDC delivered as a cloud service on AWS

Consistency and workload portability across clouds

Direct access to the power of native AWS services

VMware Cloud™ on AWS

Powered by VMware Cloud Foundation

[Figure labels: Customer Data Center (On-Premises), AWS Global Infrastructure (AWS services), VMware vCenter® on each side, vSphere, vSAN, NSX, vRealize Suite, ISV ecosystem]

VMware Cloud on AWS

Fully configured VMware software stack running on state-of-the-art infrastructure provisioned on-demand in minutes

VMware operated, supported, and maintained

Single tenant (dedicated) bare metal Amazon EC2 hardware

[Figure labels: Gateway, NSX Manager, vCenter Server, ESXi hosts]

Latest software

• VCSA, ESXi, NSX, vSAN, H5 client

Dynamic capacity

• DRS/HA compute cluster (Intel x86)

• vSAN storage cluster (SSD)

• NSX network virtualization (10 Gbps+)

Flexible topology

• Standalone cloud cluster

• Cloud-to-cloud connectivity

Hybrid Cloud Extension (HCX)

• Large-scale rapid migration - no retrofit

• 10X faster than conventional methods

• Hybrid connectivity to on-premises sites


Operations and Support

Provisioning

• API allows automated account creation and environment provisioning

• Automated interconnection between VMware and AWS customer accounts

Operations

• VMware directly supported

• AWS portion of infrastructure managed by VMware

• Ongoing infrastructure monitoring

Maintenance

• Ongoing, VMware managed stack maintenance

• Upgrade implementation and execution

The fully configured VMware Cloud software stack will be provisioned, operated, and maintained directly by VMware

What is Project Dimension?

Deliver VMware Cloud Simplicity to Data Center and Edge

• VMware Cloud Foundation in a Hyper-Converged Appliance

• Hybrid Cloud Control Plane

• VMware-Operated End-to-End


What’s Included in the Dimension Service?

HW and SW

• Project Dimension software stack

• Cluster: 3+ nodes (node: OEM server)

• 1 switch

• 1 VeloCloud

Services (provided by VMware and OEM)

• Hybrid Cloud Control Plane

• VeloCloud

• Lifecycle Management

• Delivery / Installation

• Break-fix

• Hardware Collection

• VMware Support

• OEM Support

Fundamental Change: Software -> Service

Towards “Closed Loop” Service Ownership

Software (time scale: months to years)

1. Software developer writes product code and delivers software binary

2. Shrink-wrapped software package delivered to customer

3. Customer deploys and operates software

4. VMware support assists customer, sometimes escalates to R&D

5. Development team tries to repro, goes back and forth with customer

6. If a fix is found, developer fixes it and releases in next release

Service (time scale: days to weeks)

1. Software developer writes product code, automation code, and health and alerts

2. Product code automatically deployed to production, health and alerts automatically configured

3. Customer consumes service as a user

4. Alert automatically triggered when problem is detected

5. Development team who wrote code triages and fixes problem


1. Move at the speed of Cloud – quarterly or faster releases for VMC Service & SDDCs

2. Building differentiated solutions with AWS – e.g. stretched clusters, vSAN on EBS, NSX & DirectConnect, SnowMotion

3. Enabling self-service customer onboarding – e.g. single node SDDCs, guided tours

4. Transparency – e.g. public roadmap at https://cloud.vmware.com/vmc-aws/roadmap

5. Focus on making customers successful – e.g. chat support, customer success, NPS

6. Real time business powered by data – e.g. Slingshot (big data & analytics for VMC)

What Are We Doing Differently?


These are now all handled by VMware:

• SDDC infrastructure availability – VMW is contractually accountable for SDDC availability

• SDDC provisioning

• SDDC patching and upgrade

• Host remediation

• Capacity planning

• Large scale mobility with HCX

• Global representation of SDDCs on a single console

We Are Simplifying Complex Problems

Agenda

VMware’s Hybrid Cloud Vision

VMware Cloud on AWS, Project Dimension, and More

A Fundamental Shift in Delivery Model

VMware Cloud on AWS Operations

• Overview

• Support and Operating Model

• Data and Analytics

Summary


Continuously Expanding Global Footprint of AWS Regions

March 2019 Q2 2019 Q3 2019 Q4 2019

Asia Pacific (Singapore) *South America (Sao Paulo)* Europe (Sweden) Bahrain

*Canada (Central)* *Asia Pacific (Seoul)* China (Hong Kong) Gov Cloud US East

Europe (Paris) **Asia Pacific (Osaka-Local)**

*Asia Pacific (Mumbai)*

* Stretched cluster not supported (2 AZs)

** Disaster Recovery site only, gated entry

Available Regions and # of Availability Zones (AZs)

US West (Oregon)

US East (N. Virginia)

Europe (London)

Europe (Frankfurt)

Asia Pacific (Sydney)

Europe (Ireland)

*US West (N. California)*

US East (Ohio)

Asia Pacific (Tokyo)

Gov Cloud US West


Last updated: March 8, 2019

Continuous Service Enhancements

• 83 Upgrades (SDDC infrastructure, management and control plane, DRaaS, and other services)

• 19 Repairs (Examples: disable edge flow cache, fix for packet fragment leak, disable SNAT rule)

• 12 Maintenance Tasks (Examples: refresh tokens, update certs, auto-scaler restart, migrate to k8s)

• 10 New Functionality or Conversions (Examples: i3.metal conversion, adding namespace to config service)

• 99.99% Overall SLA Availability

Over a 3 month period
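To put the 99.99% figure in perspective, the sketch below converts an availability percentage into the downtime it allows over a window. The 90-day window is an assumption standing in for the "3 month period", and `downtime_budget` is an illustrative helper, not part of any VMware tooling.

```python
from datetime import timedelta


def downtime_budget(availability_pct: float, window: timedelta) -> timedelta:
    """Return the maximum downtime allowed in `window` at the given availability."""
    unavailable_fraction = 1.0 - availability_pct / 100.0
    return timedelta(seconds=window.total_seconds() * unavailable_fraction)


# Assumption: the "3 month period" is approximated as 90 days.
window = timedelta(days=90)
budget = downtime_budget(99.99, window)
budget_minutes = budget.total_seconds() / 60
print(f"Allowed downtime: {budget_minutes:.2f} minutes")
```

Under that assumption, 99.99% availability leaves roughly 13 minutes of total downtime across the whole period.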

VMware Cloud on AWS: Architecture

[Architecture diagram. On-premises Datacenter: vCenter. Hybrid Networking: VPN/Direct Connect, Hybrid Linked Mode (Single Pane of Glass UI, Hybrid VM Provisioning). AWS Cloud Region: VMware-Owned “Shadow” VPC with bare metal ESX hosts running VMs, HA/DRS, vCenter, vSAN, NSX, and a Skyscraper PoP. SaaS: VMware Cloud Service (AWS Driver; Provisioning, Lifecycle, Operations), CSP (Identity, Billing, Subscription), and SRE/LI/FM (Metrics, Logs, Alerts). Data flows labeled: Metrics, Logs, Events, Billing]

Support for VMC on AWS

VMware has implemented an organizational structure to support the health of the VMware Cloud on AWS service 24/7, 365 days of the year. It is staffed by Support, SRE, Security Operations, Incident Managers, and Service Owners, who focus on detecting and remediating service issues to maintain SLAs.

Teams

• Customer Success Managers and Architects: customer onboarding, technical guidance

• VMC Support: customer support (customer-initiated chat and phone escalations)

• Site Reliability Engineering (SRE) and Service Owners (DRaaS, HCX, NSX, VSAN, VC, ESX, VMC teams, LCM): service response (service alerts and Customer Success escalations), 24/7 365, with daily Service Watch shifts staffed by 2 engineers per Service Owner team

• Incident Management, Operations Management, Security (SOC)

3rd Party Applications and Function

• Salesforce, Jira Service Desk, PagerDuty: escalation, incident tracking

• Statuspage.io, Slack: external/internal communications

• Logz.io, Splunk: log aggregation, monitoring

• Catchpoint: VMC Console monitoring

• BigPanda: alert correlation and triage

SRE Applications and Function

• RTS (Remediation and Troubleshooting Service): service optics, runbook execution

• Wavefront: service optics, monitoring

• LINT (Log Intelligence): log aggregation, monitoring

• SDDC Dashboard: service optics, SDDC info

• SDDC Monitor (APE, Metrics Collector): utilizes native vSphere monitoring

• ACE (Alert Correlation Engine): alert enhancement, correlation

• Alert Dashboard: alert triage and configuration


Site Reliability Engineering (SRE) Charter

Provide the horizontal platforms/tools, operational processes, and operational response needed to maintain the desired customer experience with minimum toil

Key SRE Pillars:

• Health (Availability and Reliability)

• Monitoring

• Automation

• Reporting and Analytics

• Service Response and Operational Support

Service Health Monitoring and Automation

Detect service infrastructure issues, automated remediation, and escalation

Multi-layer monitoring implementation. The SRE SDDC Dashboard, Wavefront dashboards, and Alert Dashboard provide SDDC state and optics.

Monitoring methods

• Blackbox Monitoring
    Function: testing externally visible behavior as a user or service would see it
    Implementation: SaaS and SDDC components are queried by SRE agents for uptime at regular intervals
    Example: is VMON running, vCenter VM up, ESX hostd up, management VMs, RTS agent running

• Whitebox Monitoring
    Function: monitoring based on metrics exposed by the internals of the system
    Implementation: SaaS and SDDC components push service telemetry
    Example: CPU and memory usage of management VMs, processes, disk latency, disk I/O, performance …

• SDDC Event Aggregation
    Function: vSphere events and metrics as seen by vCenter (like on-prem)
    Implementation: SRE agent collects SDDC events and metrics created by components
    Example: vCenter reports host disconnect, VSAN rebalance needed, …

• Log-based Alerting
    Function: monitoring of log-based events and errors
    Implementation: SDDC and SaaS services stream logs to an aggregation service
    Example: HA event for VM, VM port disconnect

Automation

• Autoscaler (Safe Hardware Remediation)
    Example: replace failed ESX hosts or hosts with failed disks if there is no risk to data integrity

• RTS/ACE
    Based on specific alerts, restarts services or executes a script for remediation
    Example: restart Vpxd, VC perfcharts, VMCD, …
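As a rough illustration of the blackbox method described for this slide (an agent querying components for uptime and flagging failures), here is a minimal sketch. The check functions, target names, and `ProbeResult` type are all hypothetical; real SRE agents would call actual component endpoints rather than the stand-in lambdas used here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProbeResult:
    target: str
    healthy: bool
    detail: str


def run_blackbox_probes(checks: Dict[str, Callable[[], bool]]) -> List[ProbeResult]:
    """Run each externally visible check once and record whether it passed."""
    results = []
    for target, check in checks.items():
        try:
            healthy = bool(check())
            detail = "ok" if healthy else "check returned false"
        except Exception as exc:
            # A probe that cannot reach its target counts as unhealthy.
            healthy = False
            detail = f"probe error: {exc}"
        results.append(ProbeResult(target, healthy, detail))
    return results


def _unreachable() -> bool:
    raise ConnectionError("connection refused")


# Hypothetical checks standing in for "vCenter VM up", "ESX hostd up", etc.
checks = {
    "vcenter-vm": lambda: True,
    "esx-hostd": lambda: False,
    "rts-agent": _unreachable,
}
results = run_blackbox_probes(checks)
alerts = [r for r in results if not r.healthy]
for alert in alerts:
    print(f"ALERT {alert.target}: {alert.detail}")
```

In practice such probes would run on a schedule, and each unhealthy result would be raised as an alert.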

Monitoring and Automation Flows

[Diagram. SDDC: bare metal ESX hosts running VMs, HA/DRS, vCenter, vSAN, NSX, and SRE agents. SRE SaaS Services: data to LINT and Wavefront; alerts to the Alert Processing Engine (Alert Correlation Engine or BigPanda); then RTS Automated Action, Autoscaler, and PagerDuty and Jira Service Desk]

Various agents in each SDDC collect logging, telemetry, performance data, and service events. This data is sent to various SRE monitoring services for processing and alerting.

Alerts are filtered, deduplicated, correlated, and enhanced by ACE and BigPanda.

Based on configuration, some alerts are sent to the Remediation and Troubleshooting Service (RTS) to initiate automated workflows, some are sent directly to Jira Service Desk for Service Watch response, and specific hardware events are sent to Autoscaler for hardware remediation.

If auto-remediation fails, an alert is sent to Jira Service Desk to engage a Service Watch engineer.
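The flow on this slide (deduplicate, route by configuration, fall back to a human when automation fails) can be sketched as follows. This is only an illustration: the `Alert` fields, the alert names, the `AUTO_REMEDIABLE` set, and the destination strings are hypothetical and do not reflect the actual ACE, BigPanda, or RTS interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Alert:
    sddc_id: str
    name: str
    kind: str  # hypothetical categories: "hardware" or "service"


# Hypothetical configuration: alerts that have an automated workflow.
AUTO_REMEDIABLE = {"vpxd_down", "perfcharts_down"}


def dedupe(alerts: Iterable[Alert]) -> List[Alert]:
    """Keep the first alert for each (SDDC, alert name) pair."""
    seen = set()
    unique = []
    for alert in alerts:
        key = (alert.sddc_id, alert.name)
        if key not in seen:
            seen.add(key)
            unique.append(alert)
    return unique


def route(alert: Alert) -> str:
    """Pick a destination for an alert based on configuration."""
    if alert.kind == "hardware":
        return "autoscaler"
    if alert.name in AUTO_REMEDIABLE:
        return "rts"
    return "jira_service_desk"


def handle(alert: Alert, remediate: Callable[[Alert], bool]) -> str:
    """Route the alert; if automated remediation fails, escalate to a person."""
    destination = route(alert)
    if destination != "jira_service_desk" and not remediate(alert):
        return "jira_service_desk"
    return destination


incoming = [
    Alert("sddc-1", "vpxd_down", "service"),
    Alert("sddc-1", "vpxd_down", "service"),  # duplicate of the first alert
    Alert("sddc-2", "host_disk_failed", "hardware"),
    Alert("sddc-3", "nsx_edge_unreachable", "service"),
]
unique = dedupe(incoming)
destinations = [route(a) for a in unique]
print(destinations)
```

Keeping routing a pure function of the alert and the configuration makes the rules easy to test; the `remediate` callable stands in for whichever automation actually runs.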

Service Ownership Concept

End-to-end ownership of a service, from creation to deployment to running in production to upgrade to decommission

Software Delivery (Product Ownership)

• VMware Engineering: Coding, Testing

• Customer: Deployment, Monitoring, Upgrade, Remediation, Availability

Service Delivery (Service Ownership)

• VMware Engineering: Coding, Testing, Deployment, Monitoring, Upgrade, Remediation, Availability

What Exactly Does Service Ownership Entail?

Owning the business-oriented service definition

• What does the service do?

• Metrics to report to the business

Owning the customer experience

• Drive 100% end-to-end (includes alerts [w/ help from SRE], phone calls [w/ help from CS])

• A gap in any other team doesn’t excuse the service team from accountability

• Usability

Owning the implementation and delivery

• Technology choice independence

• Ship whenever the team is ready

• SDLC that matches service (not product) requirements and team style


Steps to Service Ownership

Base Requirements

1. Define the service

• Define the service’s functionality and interface (API)

• Create business KPIs, include measure of SLAs

2. Define service health

• What measurements define the service’s health? Those are the service level indicators (SLIs)

• What are the ideal targets / ranges for those SLIs? Those are the service level objectives (SLOs)

3. Own and be accountable for service health

• How are SLOs monitored?

• How are SLOs maintained?

• Auto-remediation, run books, and supporting the service directly in production 24x7
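To make the SLI/SLO distinction in step 2 concrete, the sketch below treats each SLO as a target range for one named SLI and reports which objectives are breached. The SLI names and target values are hypothetical examples, not figures from the talk.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SLO:
    """A target range for one service level indicator (SLI)."""
    sli_name: str
    minimum: float
    maximum: float

    def is_met(self, value: float) -> bool:
        return self.minimum <= value <= self.maximum


def breached_slos(slos: List[SLO], measurements: Dict[str, float]) -> List[str]:
    """Return the names of SLIs whose objective is not met."""
    breached = []
    for slo in slos:
        value = measurements.get(slo.sli_name)
        # A missing measurement counts as a breach: an unmonitored SLI
        # cannot be shown to be healthy.
        if value is None or not slo.is_met(value):
            breached.append(slo.sli_name)
    return breached


# Hypothetical SLIs and targets, for illustration only.
slos = [
    SLO("api_success_ratio", minimum=0.999, maximum=1.0),
    SLO("p95_latency_ms", minimum=0.0, maximum=500.0),
    SLO("provisioning_success_ratio", minimum=0.99, maximum=1.0),
]
measurements = {"api_success_ratio": 0.9995, "p95_latency_ms": 620.0}
breached = breached_slos(slos, measurements)
print(breached)
```

Treating a missing measurement as a breach is a design choice that ties into step 3: the service owner is accountable for monitoring as well as for the targets themselves.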

Advanced

4. Own lifecycle and growth

• Lifecycle: Deployment / pipelines, upgrade / patch, backup / restore / recovery, decommission

• Rollout and scale: % of customers using service, understand scale parameters as adoption grows

• Evolve service architecture to facilitate the above

5. Manage runtime service configuration

• Is configuration in known good state?

• Detect and remediate configuration drift

• Control configuration change across instances

6. Manage capacity and cost

• Understand service resource requirements

• Ensure usage is within defined parameters & costs
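Step 5 above (known good state, drift detection, remediation) can be illustrated by comparing a desired configuration with what an instance actually reports. This is a simplified sketch; the configuration keys and values are invented for the example, and remediation is reduced to resetting the instance to the desired state.

```python
from typing import Any, Dict, Tuple

_MISSING = "<missing>"  # marker for a key absent on one side


def detect_drift(desired: Dict[str, Any], actual: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (desired_value, actual_value)} for every key that differs."""
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key, _MISSING)
        have = actual.get(key, _MISSING)
        if want != have:
            drift[key] = (want, have)
    return drift


def remediate(desired: Dict[str, Any]) -> Dict[str, Any]:
    """Simplest possible remediation: reset the instance to the desired state."""
    return dict(desired)


# Hypothetical configuration keys, for illustration only.
desired = {"ntp_server": "ntp.example.internal", "syslog_enabled": True, "ssh_enabled": False}
actual = {"ntp_server": "ntp.example.internal", "syslog_enabled": True,
          "ssh_enabled": True, "debug_mode": True}

drift = detect_drift(desired, actual)
print(drift)
```

An empty result means the instance is in the known good state; running the same comparison across all instances gives a fleet-wide view of configuration change.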


Operations Process

Change Management

The VMware Cloud on AWS Change Management process ensures that changes being introduced to the SDDC have been planned, tested, and reviewed to minimize the risk of negative customer impact.

• Change tickets are used both to track approval and to provide an audit trail.

• Changes require the approval of the Change Advisory Board (CAB)

Internal Incident Reviews

The objective of the VMware Cloud on AWS Incident review process is to ensure the same incident does not reoccur. These reviews are required to be completed within 7 days of the incident resolution.

VMC requires incident reviews for:

• All P0 incidents

• Incidents that impact customer SLA or internal SLOs

• Any customer incident that resulted in a post to the VMware external status page
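The review criteria above amount to a simple predicate plus a deadline, which could be encoded along these lines. The `Incident` fields are hypothetical; only the three triggering conditions and the 7-day window come from the slide.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

REVIEW_DEADLINE = timedelta(days=7)  # reviews are due within 7 days of resolution


@dataclass
class Incident:
    priority: str                 # e.g. "P0", "P1", "P2"
    impacted_sla_or_slo: bool     # customer SLA or internal SLO was affected
    posted_to_status_page: bool   # resulted in an external status page post
    resolved_at: datetime


def requires_review(incident: Incident) -> bool:
    """An incident review is required if any one of the three criteria holds."""
    return (
        incident.priority == "P0"
        or incident.impacted_sla_or_slo
        or incident.posted_to_status_page
    )


def review_due_by(incident: Incident) -> datetime:
    return incident.resolved_at + REVIEW_DEADLINE


status_page_incident = Incident("P2", False, True, datetime(2019, 3, 8, 12, 0))
minor_incident = Incident("P1", False, False, datetime(2019, 3, 8, 12, 0))

print(requires_review(status_page_incident), review_due_by(status_page_incident))
print(requires_review(minor_incident))
```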


Daily Service Review (DSR)

• Twice daily tactical review of open incidents and customer escalations

• Review of upcoming maintenance schedules and change requests

Weekly KPI Reporting and Review

• Weekly SRE review of KPIs to identify trends and assign actions if needed

Monthly Service Review (MSR)

• Executive level review of the operations and health of the service

• Focus on service availability SLAs, customer ticket information and trends, patch/upgrade effectiveness, escalations

Service Review and Open Incidents

Page 43: Behind the Scenes: How VMware is Transforming to …...x n Support e Support M Cluster ©2019 VMware, Inc. Agenda 21 VMware’s Hybrid Cloud Vision VMware Cloud on AWS, Project Dimension,

©2019 VMware, Inc. 43

Top KPIs

Service KPIs

KPIs to determine service health and support effectiveness

Reviewed daily, weekly and monthly to drive appropriate actions that result in a highly reliable service with minimal human touch

• Service Availability: Measure of service uptime for each component.

• Customer Escalations: Track the number of escalations per component and how they were addressed. Identify monitoring gaps/effectiveness.

• Incident KPIs: Track incident volume, automation effectiveness, MTTR, and component reliability trends for product improvement. These KPIs are used internally to improve service watch and SRE tools effectiveness.

• Mean Time to Resolution: For each incident, the measure of the time it takes to restore service. Focus is to reduce MTTR.

• Upgrade / Patch KPIs: Track the success and performance of upgrade/patch
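As a rough illustration of how two of these KPIs could be computed, the sketch below derives MTTR and availability from incident start and restore timestamps. The function names and the input shape are assumptions made for this example, not the service's actual reporting code.

```python
from datetime import datetime, timedelta
from typing import Sequence, Tuple

Interval = Tuple[datetime, datetime]  # (incident start, service restored)


def mean_time_to_resolution(incidents: Sequence[Interval]) -> timedelta:
    """Average of (restored - start) over all incidents in the period."""
    if not incidents:
        raise ValueError("no incidents in the period")
    total = sum((end - start for start, end in incidents), timedelta())
    return total / len(incidents)


def availability(period: timedelta, outages: Sequence[Interval]) -> float:
    """Fraction of the period a component was up.

    Assumes the outage intervals do not overlap and fall inside the period.
    """
    downtime = sum((end - start for start, end in outages), timedelta())
    return 1.0 - downtime / period
```

Tracking both per component, as the slide suggests, would show whether reliability work is reducing the number of outages, their duration, or both.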


Service-Ready Solution Scorecard

Process to ensure all capabilities enabled on VMware Cloud on AWS comply with common architectural, design, and operational readiness requirements

Service Architecture Validation (-3-6 mo)

• Service Description and Business Justification

• Planned Compliance with VMC Architecture Requirements

• Interop/Integration Definition ✓

• Support Strategy ✓

• Implementation and Roadmap Proposal ✓

Purpose: Introduce service; confirm architecture requirements and support strategy requirements met.

When: 3-6 months prior to VMC-ready solution introduction.

Prerequisite for: Participation in VMC planning cycle.

Design Validation (-3-6 mo)

POR

• Implementation Design ✓

• Interop/Integration Test Strategy ✓

• User Experience ✓

• Deployment Plan ✓

• Service and Operational Readiness Criteria Confirmation

Purpose: Agree on integration design and test strategy; confirm readiness criteria.

When: 3-6 months prior to VMC-ready solution introduction or enhancement.

Service and Operational Readiness (-2 wks)

POR

• Test Plan Execution ✓

• Quality ✓

• Customer Feedback ✓

• Documentation ✓

• Operations ✓

• Support Plan ✓

• Cross Functional Approvals ✓

Purpose: Confirm readiness to enable VMC-ready solution.

When: Just prior to enablement of service or enhancement.

Prerequisite for: Early Access Enablement; Availability Enablement.


SDDC Upgrade Overview

VMC has a continual upgrade process with the goal of zero-impact upgrades to customer workloads

VMC SDDC Upgrade Cadence

• Major milestone release once per quarter

• Patch bundles released monthly

Upgrade Execution Strategy

• 24x7 full service owner upgrade support

• Upgrades scheduled for 10:00am, 3:00pm, and 8:30pm (US Pacific time) to account for customer business hours

• Upgrade rollout performed in a strategic cycle to reduce the risk to customer SDDCs

– Internal SDDCs

– Partner SDDCs

– When known, customer dev/test or non-prod SDDCs

– External customers, with a minimal number of large or strategic SDDCs per day

Post upgrade review

• Upgrade KPIs reviewed

• Postmortem of any upgrade requiring manual intervention
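The staged rollout order described on this slide can be sketched as a small planner. Everything here is illustrative: the tier labels, the `Sddc` type, the one-wave-per-day simplification, and the default cap of one strategic SDDC per day are assumptions, since the slide only says "minimal".

```python
from dataclasses import dataclass
from typing import List

# Wave order follows the slide; the tier labels themselves are hypothetical.
WAVE_ORDER = ["internal", "partner", "non_prod", "external"]


@dataclass
class Sddc:
    name: str
    tier: str                # one of WAVE_ORDER
    strategic: bool = False  # large or strategic customer SDDC


def plan_rollout(sddcs: List[Sddc], max_strategic_per_day: int = 1) -> List[List[str]]:
    """Return a list of days, each a list of SDDC names to upgrade that day.

    Waves run in WAVE_ORDER, each starting on a new day. Within a wave,
    at most max_strategic_per_day strategic SDDCs share a day; the rest
    spill over to following days.
    """
    if max_strategic_per_day < 1:
        raise ValueError("max_strategic_per_day must be at least 1")
    days: List[List[str]] = []
    for tier in WAVE_ORDER:
        wave = [s for s in sddcs if s.tier == tier]
        if not wave:
            continue
        regular = [s.name for s in wave if not s.strategic]
        strategic = [s.name for s in wave if s.strategic]
        chunks = [
            strategic[i:i + max_strategic_per_day]
            for i in range(0, len(strategic), max_strategic_per_day)
        ]
        days.append(regular + (chunks[0] if chunks else []))
        days.extend(chunks[1:])
    return days
```

The point of the ordering is that a defect in a bundle surfaces on internal and partner SDDCs before it can reach a production customer environment.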


VMC Fleet Rollout workflow

[Diagram. Components: SDDC CI/CD Pipeline (Bundle Manifest, Bundle Package), S3, Operator UI, Release Coordination Engine, Notification Service, Backup and Restore Service, Autoscaler Service, Remediation & Troubleshooting, PoP, Upgrade Svc, and the SDDC (vCenter, VSAN, NSX, ESX hosts). Actors: Service Operator, DevOps, Customer.]

1. Bundle pushed to S3 in all AWS Regions

2. Service operator schedules rollout and conducts preflight checks

3. Notification is sent to customers

4. Release Coordination Engine starts the rollout

5. Backup is taken by Backup & Restore Service

6. PoP and Upgrade Svc updated

7. Upgrade Svc updates Control Plane

8. Host is added

9. Upgrade Svc updates Data Plane

10. Host is removed
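The ten numbered steps run strictly in sequence, which can be sketched as an ordered step runner that stops at the first failure. This is a minimal illustration only: the step identifiers and the handler interface are hypothetical and do not describe the actual Release Coordination Engine.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Step names paraphrase the ten numbered steps; the identifiers are hypothetical.
ROLLOUT_STEPS = [
    "push_bundle_to_s3",           # 1. bundle pushed to S3 in all AWS Regions
    "schedule_and_preflight",      # 2. operator schedules rollout, runs preflight checks
    "notify_customers",            # 3. notification sent to customers
    "start_rollout",               # 4. Release Coordination Engine starts the rollout
    "take_backup",                 # 5. backup taken by Backup & Restore Service
    "update_pop_and_upgrade_svc",  # 6. PoP and Upgrade Svc updated
    "update_control_plane",        # 7. Upgrade Svc updates the control plane
    "add_host",                    # 8. host is added
    "update_data_plane",           # 9. Upgrade Svc updates the data plane
    "remove_host",                 # 10. host is removed
]


def run_rollout(
    handlers: Dict[str, Callable[[], bool]],
) -> Tuple[List[str], Optional[str]]:
    """Run every step in order and stop at the first one that reports failure.

    Returns (completed step names, name of the failed step or None).
    """
    completed: List[str] = []
    for step in ROLLOUT_STEPS:
        if not handlers[step]():
            return completed, step
        completed.append(step)
    return completed, None
```

Returning the failed step alongside the completed ones gives an operator a clear starting point for remediation, and a record for the postmortem that the upgrade overview requires after any manual intervention.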


Agenda


VMware’s Hybrid Cloud Vision

VMware Cloud on AWS, Project Dimension, and More

A Fundamental Shift in Delivery Model

VMware Cloud on AWS Operations

• Overview

• Support and Operating Model

• Data and Analytics

Summary


We launched the internal beta of VMware Cloud on AWS…


And had one critical question on that day:

How many people logged in?


We had lots of logs, but no practical way to answer the question


We had our first question answered two weeks later

Leveraging 10 lines of code and two SaaS services: Segment.io and Google Analytics
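The original ten lines are not shown in the deck. As a stand-in, the sketch below shows the kind of query that becomes trivial once login events are captured as structured records rather than free-form logs. The event shape (`event`, `user_id`) is an assumption for illustration and is not Segment's or Google Analytics' API.

```python
from typing import Dict, Iterable


def count_unique_logins(events: Iterable[Dict[str, str]]) -> int:
    """Count distinct users that have at least one login event."""
    return len({e["user_id"] for e in events if e.get("event") == "login"})


# Hypothetical tracked events for one day.
events = [
    {"event": "login", "user_id": "u1"},
    {"event": "login", "user_id": "u2"},
    {"event": "login", "user_id": "u1"},       # repeat login, counted once
    {"event": "sddc_created", "user_id": "u3"},
]
unique_logins = count_unique_logins(events)
```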


Which led us to ask many, many more questions…


How many SDDCs have been created?

What is our SDDC failure rate?

Who are our most active users?

How much capacity do we have left?

Are page load times getting better or worse?

What is our NPS score?

What screens are causing users problems?

What is our most popular help topic?

Are customers creating VMs?


Slingshot’s goal is to enable anyone to answer any VMC question

Design principles focus on velocity, self service, and openness

Optimize for Velocity

• Leverage Cloud Services

• Experiment and Fail Fast

• Extensibility over Efficiency

Enable Self Service

• Data Ingestion

• Report Creation

• Application Development

Open Platform

• Data Democratization

• Modern APIs

• Simple Integrations


Slingshot – Typical View


Customer Success Dashboard – Overview


Customer Success Dashboard – Product Usage


Customer Success Dashboard – Time to Onboard


Slingshot Provides a Comprehensive View Across Data Sources


Agenda


VMware’s Hybrid Cloud Vision

VMware Cloud on AWS, Project Dimension, and More

A Fundamental Shift in Delivery Model

VMware Cloud on AWS Operations

Summary


Summary

VMware is all-in on delivering a fundamentally better experience for you

https://cloud.vmware.com/vmc-aws