©2019 VMware, Inc.
Behind the Scenes: How VMware is Transforming to a SaaS Company
VMware Cloud on AWS, Project Dimension, and More
Kit Colbert
CTO, Cloud Platform, VMware
March 21, 2019
Agenda
VMware’s Hybrid Cloud Vision
VMware Cloud on AWS, Project Dimension, and More
A Fundamental Shift in Delivery Model
VMware Cloud on AWS Operations
Summary
[Diagram: sites labeled Branch, Datacenter, Telco/NFV, and Edge/IoT]
VMware Hybrid Cloud Vision
Consistent Infrastructure and Consistent Operations from Private Cloud to Public Cloud to Edge
VMware Delivers Across the Public Cloud, Data Center, and Edge
Consistent Infrastructure and Operations to Speed Innovation
Public Cloud Private Cloud Compute Edge
The VMware Cloud Solution
Management Compute Storage Networking
Evolution of our infrastructure abstractions
Abstracting Away Complexity
v1 Hypervisor, v2 Virtual Infrastructure, v3 Software-defined Datacenter, v4 Hybrid Cloud
[Diagram labels: ESX, ESXi, vCenter, vSphere, vSAN, NSX, vRealize, VMware Cloud Foundation, VMware Cloud, VMware Cloud Services, Virtual Cloud Network, SDDC (DC), SDDC (AWS), SDDC (VCPP), SDDC (IBM), SDDC (edge)]
Extend your VMware environment to the world-class AWS Cloud
VMware Cloud™ on AWS
[Diagram: Private Cloud and AWS Global Infrastructure, both built on VMware Cloud Foundation (network, compute, storage; automation and operations; intrinsic security and lifecycle automation), managed through vCenter, with VMs integrated with AWS services]
Rich VMware SDDC delivered as a cloud service on AWS
Consistency and workload portability across clouds
Direct access to the power of native AWS services
VMware Cloud™ on AWS
Powered by VMware Cloud Foundation
[Diagram: a customer data center (on-premises) and AWS Global Infrastructure (with AWS services), each running vSphere, vSAN, and NSX under VMware vCenter®, with the vRealize Suite and ISV ecosystem spanning both]
VMware Cloud on AWS
Fully configured VMware software stack running on state-of-the-art infrastructure provisioned on-demand in minutes
VMware operated, supported, and maintained
[Diagram: VMware Cloud on AWS SDDC with vCenter Server, NSX Manager, gateway, and ESXi hosts]
Single tenant (dedicated) bare metal Amazon EC2 hardware
Latest software
• VCSA, ESXi, NSX, VSAN, H5 client
Dynamic capacity
• DRS/HA compute cluster (Intel x86)
• VSAN storage cluster (SSD)
• NSX network virtualization (10 Gbps+)
Flexible topology
• Standalone cloud cluster
• Cloud-to-cloud connectivity
Hybrid Cloud Extension (HCX)
• Large-scale rapid migration - no retrofit
• 10X faster than conventional methods
• Hybrid connectivity to on-premises sites
Operations and Support
Provisioning
• API allows automated account creation and environment provisioning
• Automated interconnection between VMware and AWS customer accounts
Operations
• VMware directly supported
• AWS portion of infrastructure managed by VMware
• Ongoing infrastructure monitoring
Maintenance
• Ongoing, VMware managed stack maintenance
• Upgrade implementation and execution
The fully configured VMware Cloud software stack will be provisioned, operated, and maintained directly by VMware
What is Project Dimension?
Deliver VMware Cloud Simplicity to Data Center and Edge
VMware Cloud Foundation in a Hyper-Converged Appliance
Hybrid Cloud Control Plane
VMware-Operated End-to-End
What’s Included in the Dimension Service?
HW and SW
Project Dimension software stack
Node: OEM server
Cluster: 3+ nodes, 1 switch, 1 VeloCloud
Services (VMware, OEM)
Hybrid Cloud Control Plane
VeloCloud
Lifecycle Management
Delivery / Installation
Break-fix
Hardware Collection
Support
VMware Support
OEM Support
Fundamental Change: Software -> Service
Towards “Closed Loop” Service Ownership
Software (time scale: months to years)
• Software developer writes product code and delivers software binary
• Shrink-wrapped software package delivered to customer
• Customer deploys and operates software
• VMware support assists customer, sometimes escalates to R&D
• Development team tries to repro, goes back and forth with customer
• If a fix is found, developer fixes it and releases in next release
Service (time scale: days to weeks)
• Software developer writes product code, automation code, and health and alerts
• Product code automatically deployed to production, health and alerts automatically configured
• Customer consumes service as a user
• Alert automatically triggered when problem is detected
• Development team who wrote code triages and fixes problem
1. Move at the speed of Cloud – quarterly or faster releases for VMC Service & SDDCs
2. Building differentiated solutions with AWS – e.g. stretched clusters, vSAN on EBS, NSX & DirectConnect, SnowMotion
3. Enabling self-service customer onboarding – e.g. single node SDDCs, guided tours
4. Transparency – e.g. public roadmap at https://cloud.vmware.com/vmc-aws/roadmap
5. Focus on making customers successful – e.g. chat support, customer success, NPS
6. Real time business powered by data – e.g. Slingshot (big data & analytics for VMC)
What Are We Doing Differently?
These are now all handled by VMware:
• SDDC infrastructure availability – VMW is contractually accountable for SDDC availability
• SDDC provisioning
• SDDC patching and upgrade
• Host remediation
• Capacity planning
• Large scale mobility with HCX
• Global representation of SDDCs on a single console
We Are Simplifying Complex Problems
Agenda
VMware’s Hybrid Cloud Vision
VMware Cloud on AWS, Project Dimension, and More
A Fundamental Shift in Delivery Model
VMware Cloud on AWS Operations
• Overview
• Support and Operating Model
• Data and Analytics
Summary
Continuously Expanding Global Footprint of AWS Regions
March 2019 Q2 2019 Q3 2019 Q4 2019
Asia Pacific (Singapore) *South America (Sao Paulo)* Europe (Sweden) Bahrain
*Canada (Central)* *Asia Pacific (Seoul)* China (Hong Kong) Gov Cloud US East
Europe (Paris) **Asia Pacific (Osaka-Local)**
*Asia Pacific (Mumbai)*
[Map: regions marked with their number of Availability Zones]
* Stretched cluster not supported (2 AZs) ** Disaster Recovery site only, gated entry
Available Regions and # of Availability Zones (AZs)
US West (Oregon)
US East (N. Virginia)
Europe (London)
Europe (Frankfurt)
Asia Pacific (Sydney)
Europe (Ireland)
*US West (N. California)*
US East (Ohio)
Asia Pacific (Tokyo)
Gov Cloud US West
Last updated: March 8, 2019
Continuous Service Enhancements
83 Upgrades (SDDC infrastructure, management and control plane, DRaaS, and other services)
19 Repairs (Examples: disable edge flow cache, fix for packet fragment leak, disable SNAT rule)
12 Maintenance Tasks (Examples: refresh tokens, update certs, auto-scaler restart, migrate to k8s)
10 New Functionality or Conversions (Examples: i3.metal conversion, adding namespace to config service)
99.99% Overall SLA Availability
Over a 3 month period
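As a rough worked example of what a 99.99% availability figure permits, the sketch below converts an availability percentage into a downtime budget. The 90-day window is an assumption standing in for the "3 month period" above, and `downtime_budget_minutes` is a hypothetical helper, not part of any VMware tooling.

```python
def downtime_budget_minutes(availability_pct: float, window_days: int) -> float:
    """Return the minutes of downtime allowed in the window at the given availability."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# Assumption: 90 days approximates "a 3 month period"
budget = downtime_budget_minutes(99.99, 90)
print(f"{budget:.2f} minutes")  # prints "12.96 minutes"
```

Under that assumption, 99.99% over the window leaves roughly 13 minutes of total downtime.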
VMware Cloud on AWS: Architecture
[Diagram: an on-premises datacenter (vCenter) connects to an AWS Cloud Region through hybrid networking (VPN/Direct Connect) and Hybrid Linked Mode (single pane of glass UI, hybrid VM provisioning). In the AWS region, a VMware-owned “shadow” VPC holds the SDDC: bare metal hosts running ESX and VMs, with HA/DRS, vCenter, vSAN, NSX, and the Skyscraper PoP. SaaS components: CSP (identity, billing, subscription), VMware Cloud Service, AWS Driver, and SRE/LI/FM (metrics, logs, alerts). Flows between SaaS and SDDC: provisioning, lifecycle, operations; metrics, logs, events, billing.]
Support for VMC on AWS
Customer Success Manager Architects
VMC Support
Site Reliability Engineering (SRE)
Service Owners
DRaaS, HCX, NSX, VSAN, VC, ESX, VMC teams, LCM
24/7 365 (Daily Service Watch Shifts staffed by 2 engineers per Service Owner Team)
Incident Management
Operations Management
Security (SOC)
Customer Onboarding Technical Guidance
Customer Support
(Customer initiated Chat and Phone escalations)
Service Response
(Service Alerts and Customer Success Escalations)
VMware has implemented an organizational structure to support the health of the VMware Cloud on AWS service 24/7, 365 days a year. The service is staffed by Support, SRE, Security Operations, Incident Managers, and Service Owners, who focus on detecting and remediating service issues to maintain SLAs.
3rd Party Applications | Function
Salesforce, Jira Service Desk, PagerDuty | Escalation, Incident Tracking
Statuspage.io, Slack | External/Internal Communications
SRE Applications | Function
RTS, LINT, Wavefront, SDDC Dashboard | Service Optics
3rd Party Applications | Function
Jira Service Desk, PagerDuty | Escalation, Incident Tracking
Statuspage.io, Slack | External/Internal Communications
Logz.io, Splunk | Log Aggregation, Monitoring
Catchpoint | VMC Console Monitoring
BigPanda | Alert Correlation and Triage
SRE Applications | Function
RTS (Remediation Troubleshooting Service) | Service Optics, Runbook Execution
Wavefront | Service Optics, Monitoring
LINT (Log Intelligence) | Log Aggregation, Monitoring
SDDC Dashboard | Service Optics, SDDC Info
SDDC Monitor (APE, Metrics Collector) | Utilize Native vSphere Monitoring
ACE (Alert Correlation Engine) | Alert Enhancement, Correlation
Alert Dashboard | Alert Triage and Configuration
Site Reliability Engineering (SRE) Charter
Provide the horizontal platforms/tools, operational processes, and operational response needed to maintain the desired customer experience with minimum toil
Key SRE Pillars:
• Health (Availability and Reliability)
• Monitoring
• Automation
• Reporting and Analytics
• Service Response and Operational Support
Service Health Monitoring and Automation
Detect service infrastructure issues, automated remediation, and escalation
Multi-layer monitoring implementation.
SRE SDDC Dashboard, Wavefront Dashboards, and Alert Dashboard provide SDDC state and optics.
Blackbox Monitoring
• Function: Testing externally visible behavior as a user or service would see it
• Implementation: SaaS and SDDC components are queried by SRE agents for uptime at regular intervals
• Example: Is VMON running, vCenter VM up, ESX hostd up, management VMs, RTS agent is running
Whitebox Monitoring
• Function: Monitoring based on metrics exposed by the internals of the system
• Implementation: SaaS and SDDC components push service telemetry
• Example: CPU, memory usage of management VMs, processes, disk latency, disk I/O, performance …
SDDC Event Aggregation
• Function: vSphere events and metrics as seen by vCenter (like on-prem)
• Implementation: SRE agent collects SDDC events and metrics created by components
• Example: vCenter reports host disconnect, vSAN rebalance needed, …
Log-based Alerting
• Function: Monitoring of log-based events and errors
• Implementation: SDDC and SaaS services stream logs to aggregation service
• Example: HA event for VM, VM port disconnect
Automation
• Autoscaler (Safe Hardware Remediation). Example: replace failed ESX hosts or hosts with failed disks if there is no risk to data integrity
• RTS/ACE: based on specific alerts, restarts services or executes a script for remediation. Example: restart vpxd, VC perfcharts, VMCD, …
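The blackbox method above (agents querying components for uptime at a regular interval) can be sketched roughly as follows. This is a minimal illustration, not the SRE agents' actual implementation: `run_blackbox_checks` and the probe names are hypothetical, and the lambdas stand in for real network queries.

```python
from typing import Callable, Dict, List


def run_blackbox_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each uptime probe once and return the names of components that failed."""
    failed = []
    for name, probe in checks.items():
        try:
            healthy = probe()
        except Exception:
            # A probe that raises is treated as a failed check
            healthy = False
        if not healthy:
            failed.append(name)
    return failed


# Hypothetical probes; real ones would query the component from the outside
example_checks = {
    "vcenter_vm_up": lambda: True,
    "esx_hostd_up": lambda: False,
    "rts_agent_running": lambda: True,
}
failed_components = run_blackbox_checks(example_checks)
print(failed_components)  # prints ['esx_hostd_up']
```

A scheduler would call `run_blackbox_checks` at the chosen interval and raise an alert for each failed component.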
Monitoring and Automation Flows
[Diagram: in each SDDC, SRE agents run alongside vCenter, HA/DRS, vSAN, NSX, and ESX hosts on bare metal. Data flows to the SRE SaaS services LINT and Wavefront; alerts flow to the alert processing engine (Alert Correlation Engine or BigPanda), and from there to RTS (automated action), Autoscaler, or PagerDuty and Jira Service Desk.]
Various agents in each SDDC collect logging, telemetry, performance data, and service events. This data is sent to various SRE monitoring services for processing and alerting.
Alerts are filtered, deduplicated, correlated, and enhanced by ACE and BigPanda.
Based on configuration, some alerts are sent to the Remediation and Troubleshooting Service (RTS) to initiate automated workflows, some are sent directly to Jira Service Desk for Service Watch response, and specific hardware events are sent to Autoscaler for hardware remediation.
If auto-remediation fails, an alert is sent to Jira Service Desk to engage a service watch engineer.
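The alert flow described above (dedupe, then route to RTS, Jira Service Desk, or Autoscaler, with a ticket as the fallback when auto-remediation fails) can be sketched as below. The `Alert` fields, the three `kind` categories, and the destination strings are assumptions made for illustration; the real routing is configuration-driven inside ACE and BigPanda.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Alert:
    sddc_id: str
    name: str
    kind: str  # assumed categories: "hardware", "auto_remediable", "other"


def dedupe(alerts: Iterable[Alert]) -> List[Alert]:
    """Drop repeated alerts, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for alert in alerts:
        if alert not in seen:
            seen.add(alert)
            unique.append(alert)
    return unique


def route(alert: Alert, remediate: Callable[[Alert], bool]) -> str:
    """Return the destination that ends up handling the alert."""
    if alert.kind == "hardware":
        return "autoscaler"
    if alert.kind == "auto_remediable":
        # Fall back to a ticket when the automated workflow fails
        return "rts" if remediate(alert) else "jira_service_desk"
    return "jira_service_desk"


alerts = [
    Alert("sddc-1", "host_disk_failed", "hardware"),
    Alert("sddc-1", "vpxd_down", "auto_remediable"),
    Alert("sddc-1", "vpxd_down", "auto_remediable"),  # duplicate
    Alert("sddc-2", "unknown_error", "other"),
]
# Simulate a remediation workflow that always fails
routed = {a.name: route(a, remediate=lambda _a: False) for a in dedupe(alerts)}
print(routed)
```

With the failing remediation stub, the `vpxd_down` alert ends up in Jira Service Desk, mirroring the fallback path to a service watch engineer.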
Service Ownership Concept
Stage | Software Delivery (Product Ownership) | Service Delivery (Service Ownership)
Coding | VMware Engineering | VMware Engineering
Testing | VMware Engineering | VMware Engineering
Deployment | Customer | VMware Engineering
Monitoring | Customer | VMware Engineering
Upgrade | Customer | VMware Engineering
Remediation | Customer | VMware Engineering
Availability | Customer | VMware Engineering
End-to-end ownership of a service, from creation to deployment to running in production to upgrade to decommission
What Exactly Does Service Ownership Entail?
Owning the business-oriented service definition
• What does the service do?
• Metrics to report to the business
Owning the customer experience
• Drive 100% end-to-end (includes alerts [w/ help from SRE], phone calls [w/ help from CS])
• A gap in any other team does not excuse the service team from accountability
• Usability
Owning the implementation and delivery
• Technology choice independence
• Ship whenever the team is ready
• SDLC that matches service (not product) requirements and team style
Steps to Service Ownership
Base Requirements
1. Define the service
• Define the service’s functionality and interface (API)
• Create business KPIs, include measure of SLAs
2. Define service health
• What measurements define the service’s health? Those are the service level indicators (SLIs)
• What are the ideal targets / ranges for those SLIs? Those are the service level objectives (SLOs)
3. Own and be accountable for service health
• How are SLOs monitored?
• How are SLOs maintained?
• Auto-remediation, run books, and supporting the service directly in production 24x7
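To make the SLI/SLO distinction in step 2 concrete, here is a minimal sketch: an SLI is a measurement and an SLO is the target it is compared against. The `Slo` class, the availability formula, and the targets (99.9% availability, 500 ms latency) are hypothetical values chosen for illustration, not VMware's actual objectives.

```python
from dataclasses import dataclass


@dataclass
class Slo:
    name: str
    target: float           # objective the SLI is compared against
    higher_is_better: bool  # True for availability, False for latency

    def is_met(self, sli_value: float) -> bool:
        if self.higher_is_better:
            return sli_value >= self.target
        return sli_value <= self.target


def availability_sli(successful: int, total: int) -> float:
    """Fraction of successful requests; treated as 1.0 when there was no traffic."""
    return successful / total if total else 1.0


# Hypothetical objectives
availability_slo = Slo("availability", target=0.999, higher_is_better=True)
latency_slo = Slo("p95_latency_ms", target=500.0, higher_is_better=False)

sli = availability_sli(successful=9_985, total=10_000)
print(availability_slo.is_met(sli))  # prints False (0.9985 is below 0.999)
print(latency_slo.is_met(320.0))     # prints True
```

Monitoring an SLO then amounts to evaluating `is_met` on each fresh SLI sample and alerting on a miss.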
Advanced
4. Own lifecycle and growth
• Lifecycle: Deployment / pipelines, upgrade / patch, backup / restore / recovery, decommission
• Rollout and scale: % of customers using service, understand scale parameters as adoption grows
• Evolve service architecture to facilitate the above
5. Manage runtime service configuration
• Is configuration in known good state?
• Detect and remediate configuration drift
• Control configuration change across instances
6. Manage capacity and cost
• Understand service resource requirements
• Ensure usage is within defined parameters & costs
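Step 5's "detect and remediate configuration drift" can be illustrated with a small comparison between a known good configuration and what is observed on an instance. This is a sketch under assumptions: the configuration keys are invented, and `find_drift` is a hypothetical helper rather than a VMware API.

```python
from typing import Any, Dict, Tuple


def find_drift(desired: Dict[str, Any], actual: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each drifted key to (desired_value, actual_value).

    A key missing on one side shows up as None on that side.
    """
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want, have = desired.get(key), actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical configuration keys
known_good = {"ntp_server": "ntp.example.com", "ssh_enabled": False}
observed = {"ntp_server": "ntp.example.com", "ssh_enabled": True, "debug": True}

drift = find_drift(known_good, observed)
print(drift)  # prints {'debug': (None, True), 'ssh_enabled': (False, True)}
```

Remediation would then reset each drifted key to its desired value (or remove keys absent from the known good state), ideally through the same change process used for intentional configuration changes.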
Operations Process
Change Management
The VMware Cloud on AWS Change Management process ensures that changes being introduced to the SDDC have been planned, tested, and reviewed to minimize the risk of negative customer impact.
• Change tickets are used both to track approval and to provide an audit trail.
• Changes require the approval of the Change Advisory Board (CAB)
Internal Incident Reviews
The objective of the VMware Cloud on AWS Incident review process is to ensure the same incident does not reoccur. These reviews are required to be completed within 7 days of the incident resolution.
VMC requires incident reviews for:
• All P0 incidents
• Incidents that impact customer SLA or internal SLOs
• Any customer incident that resulted in a post to the VMware external status page
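The incident review criteria above reduce to a simple rule, sketched below. The `Incident` fields and the priority labels other than P0 are assumptions for illustration; only the three triggers and the 7-day window come from the slide.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

REVIEW_WINDOW_DAYS = 7  # reviews are due within 7 days of incident resolution


@dataclass
class Incident:
    priority: str                # "P0", "P1", ... (labels other than P0 are assumed)
    impacted_sla_or_slo: bool
    posted_to_status_page: bool
    resolved_on: date


def review_due_date(incident: Incident) -> Optional[date]:
    """Return the review deadline, or None if no review is required."""
    required = (
        incident.priority == "P0"
        or incident.impacted_sla_or_slo
        or incident.posted_to_status_page
    )
    if not required:
        return None
    return incident.resolved_on + timedelta(days=REVIEW_WINDOW_DAYS)


status_page_incident = Incident("P2", False, True, date(2019, 3, 1))
minor_incident = Incident("P2", False, False, date(2019, 3, 1))
print(review_due_date(status_page_incident))  # prints 2019-03-08
print(review_due_date(minor_incident))        # prints None
```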
Daily Service Review (DSR)
• Twice daily tactical review of open incidents and customer escalations
• Review of upcoming maintenance schedules and change requests
Weekly KPI Reporting and Review
• Weekly SRE review of KPIs to identify trends and assign actions if needed
Monthly Service Review (MSR)
• Executive level review of the operations and health of the service
• Focus on service availability SLAs, customer ticket information and trends, patch/upgrade effectiveness, escalations
Service Review and Open Incidents
Service KPIs
KPIs to determine service health and support effectiveness
Reviewed daily, weekly and monthly to drive appropriate actions that result in a highly reliable service with minimal human touch
Service Availability: Measure of service uptime for each component.
Customer Escalations: Track the number of escalations per component and how they were addressed. Identify monitoring gaps/effectiveness.
Incident KPIs: Track incident volume, automation effectiveness, MTTR, and component reliability trends for product improvement. These KPIs are used internally to improve service watch and SRE tools effectiveness.
Mean Time to Resolution: For each incident, the measure of the time it takes to restore service. Focus is to reduce MTTR.
Upgrade / Patch KPIs: Track the success and performance of upgrade/patch.
Top KPIs
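As an illustration of the Mean Time to Resolution KPI, the sketch below averages the time to restore service across incidents. Representing each incident as a `(detected, restored)` pair is an assumption; the source does not specify how incident timestamps are recorded.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def mean_time_to_resolution(incidents: List[Tuple[datetime, datetime]]) -> timedelta:
    """Average of (restored - detected) across incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((restored - detected for detected, restored in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incidents: one took 30 minutes to restore, the other 90 minutes
incidents = [
    (datetime(2019, 3, 1, 10, 0), datetime(2019, 3, 1, 10, 30)),
    (datetime(2019, 3, 2, 14, 0), datetime(2019, 3, 2, 15, 30)),
]
mttr = mean_time_to_resolution(incidents)
print(mttr)  # prints 1:00:00
```

Tracking this value per week or per component gives the trend lines that the weekly and monthly reviews look at.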
Service-Ready Solution Scorecard
Process to ensure all capabilities enabled on VMware Cloud on AWS comply with common architectural, design, and operational readiness requirements
Service Architecture Validation (-3-6 mo)
• Service Description and Business Justification ✓
• Planned Compliance with VMC Architecture Requirements ✓
• Interop/Integration Definition ✓
• Support Strategy ✓
• Implementation and Roadmap Proposal ✓
Purpose: Introduce service; confirm architecture requirements and support strategy requirements met.
When: 3-6 months prior to VMC-ready solution introduction.
Prerequisite for: Participation in VMC planning cycle.
Design Validation (-3-6 mo)
POR
• Implementation Design ✓
• Interop/Integration Test Strategy ✓
• User Experience ✓
• Deployment Plan ✓
• Service and Operational Readiness Criteria Confirmation ✓
Purpose: Agree on integration design and test strategy; confirm Readiness Criteria.
When: 3-6 months prior to VMC-ready solution introduction or enhancement.
Service and Operational Readiness (-2 wks)
POR
• Test Plan Execution ✓
• Quality ✓
• Customer Feedback ✓
• Documentation ✓
• Operations ✓
• Support Plan ✓
• Cross Functional Approvals ✓
Purpose: Confirm readiness to enable VMC-ready solution.
When: Just prior to enablement of service or enhancement.
Prerequisite for: Early Access Enablement; Availability Enablement.
SDDC Upgrade Overview
VMC has a continual upgrade process with the goal of zero-impact upgrades to customer workloads
VMC SDDC Upgrade Cadence
• Major milestone release once per quarter
• Patch bundles released monthly
Upgrade Execution Strategy
• 24x7 full service owner upgrade support
• Upgrades scheduled for 10:00am, 3:00pm, and 8:30pm (US Pacific time) to account for customer business hours
• Upgrade rollout performed in a strategic cycle to reduce the risk to customer SDDCs
– Internal SDDCs
– Partner SDDCs
– When known, customer dev/test or non-prod SDDCs
– External customers, with a minimal number of large or strategic SDDCs per day
Post upgrade review
• Upgrade KPIs reviewed
• Postmortem of any upgrade requiring manual intervention
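The staged rollout order listed above (internal, then partner, then customer non-production, then external customers) can be sketched as a wave planner. The category identifiers and `plan_rollout` are hypothetical, and the per-day limit on large or strategic SDDCs is not modeled here.

```python
from typing import Dict, List

# Wave order from the slide; the category labels are hypothetical identifiers
WAVE_ORDER = ["internal", "partner", "customer_nonprod", "customer"]


def plan_rollout(sddcs: Dict[str, str]) -> List[List[str]]:
    """Group SDDC ids into waves, lowest-risk first. `sddcs` maps id -> category."""
    waves: List[List[str]] = [[] for _ in WAVE_ORDER]
    for sddc_id, category in sorted(sddcs.items()):
        waves[WAVE_ORDER.index(category)].append(sddc_id)
    # Skip categories that have no SDDCs in this fleet
    return [wave for wave in waves if wave]


fleet = {
    "sddc-c1": "customer",
    "sddc-i1": "internal",
    "sddc-p1": "partner",
    "sddc-i2": "internal",
}
waves = plan_rollout(fleet)
print(waves)  # prints [['sddc-i1', 'sddc-i2'], ['sddc-p1'], ['sddc-c1']]
```

Upgrading wave by wave means a defect is most likely to surface on internal or partner SDDCs before any external customer is touched.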
VMC Fleet Rollout workflow
[Diagram: the SDDC CI/CD Pipeline publishes the Bundle Manifest and Bundle Package to S3. The Release Coordination Engine works with the Notification Service, Upgrade Svc, Operator UI, Backup and Restore Service, Remediation & Troubleshooting, and Autoscaler Service to upgrade each SDDC (PoP, vCenter, vSAN, NSX, ESX hosts). Actors: Service Operator, DevOps, Customer.]
1. Bundle pushed to S3 in all AWS Regions
2. Service operator schedules rollout and conducts preflight checks
3. Notification is sent to customers
4. Release Coordination Engine starts the rollout
5. Backup is taken by Backup & Restore Service
6. PoP and Upgrade Svc updated
7. Upgrade Svc updates Control Plane
8. Host is added
9. Upgrade Svc updates Data Plane
10. Host is removed
We launched the internal beta of VMware Cloud on AWS…
And had one critical question on that day:
How many people logged in?
We had lots of logs, but no practical way to answer the question
We had our first question answered two weeks later
Leveraging 10 lines of code and two SaaS services – Segment.io, Google Analytics
Which led us to ask many, many more questions…
How many SDDCs have been created?
What is our SDDC failure rate?
Who are our most active users?
How much capacity do we have left?
Are page load times getting better or worse?
What is our NPS score?
What screens are causing users problems?
What is our most popular help topic?
Are customers creating VMs?
Slingshot’s goal is to enable anyone to answer any VMC question
Design principles focus on velocity, self service, and openness
Optimize for Velocity
• Leverage Cloud Services
• Experiment and Fail Fast
• Extensibility over Efficiency
Enable Self Service
• Data Ingestion
• Report Creation
• Application Development
Open Platform
• Data Democratization
• Modern APIs
• Simple Integrations
Slingshot – Typical View
Customer Success Dashboard – Overview
Customer Success Dashboard – Product Usage
Customer Success Dashboard – Time to Onboard
Slingshot Provides a Comprehensive View Across Data Sources
VMware is all-in on delivering a fundamentally better experience for you
https://cloud.vmware.com/vmc-aws
Summary