Andrew Miller, Rebecca Fitzhugh
MGT3342BUS
#VMworld #MGT3342BUS
Architecting Data Protection with Rubrik
VMworld 2017 Content: Not for publication or distribution
Disclaimer
• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.
#MGT3342BUS CONFIDENTIAL
Rebecca Fitzhugh
Tweet: @rebeccafitzhugh
Blogger: @ technicloud.com
Co-Host: @ vbrownbag.com
I have a job!: @ Rubrik.com
Author: vSphere Virtual Machine Management; Learning VMware vSphere
VMware: VCDX #243
Andrew Miller
Tweet: @andriven
Blogger: @ thinkmeta.net
TMM: @ Rubrik.com
Background: 7 years customer, 8 years partner.
Certs: Lots of Random Ones
VMware: vExpert (6x)
Agenda? Nah…
Share Data Protection Architecture Knowledge
(more than half)
Show Where Rubrik Fits Technically + Demo
(less than half)
Fair?
(Q&A Too)
Why bother? One big reason…
Business Expectations of Disaster Recovery / Data Protection
!=
IT Capabilities for Disaster Recovery / Data Protection
What Are You Really Protecting Yourself Against?
• Lost or postponed sales and income
• Regulatory fines
• Delay of new business plans
• Loss of contractual bonuses
• Customer dissatisfaction
• Timing and duration of disruption
• Increased expenses such as overtime labor and outsourcing
• Employee Burnout
What is a Disaster?
Disaster: An event that affects a service or system such that significant effort is required to restore the original performance level.
• But what does that look like IN OUR ENVIRONMENT?
• What disaster and recovery scenarios should we plan for?
Sabotage!
Natural Disaster
Power Loss
What is the most common scenario for disaster?
What is a Disaster?
Disaster: An event that affects a service or system such that significant effort is required to restore the original performance level.
• But what does that look like IN OUR ENVIRONMENT?
• What disaster and recovery scenarios should we plan for?
• Where do we begin?
• How do we do it?
What is a Business Impact Analysis (BIA)?
• A process to understand:
– What is the monetary impact of a disaster or failure?
– What are the most time-critical and information-critical business processes?
– How does the business REALLY rely upon IT Service and Application availability?
– What availability or recoverability capabilities are justifiable based on these requirements, potential impact, and costs?
• Composed of two components
– Technical Discovery – Data Gathering
– Human Conversation – Talk to People!
Example Output – Priority Tiers
Priority Tier: Description
Priority 1 (High Availability / Immediate Recovery): Services whose unavailability for more than a brief period can have a severe impact on customers or time-critical business operations.
Priority 2 (1-2 day recovery): Services whose unavailability significantly impacts customers or business operations.
Priority 3 (3-5 day recovery): Services which can tolerate up to five days of disruption in a disaster.
Priority 4 (6-10 day recovery): Services which can tolerate up to ten days of disruption in a disaster.
Priority 3 and 4 systems may be restored in less time, depending on the situation. However, higher priority functions will be restored first.
Priority 5 (“Best effort” recovery): Non-critical services which can tolerate two weeks or more of disruption in a disaster. These systems will be restored on a best-effort basis, after other more critical systems have been restored and ongoing operations have resumed.
Priority 5 systems may be restored in less time, depending on the situation. However, higher priority functions will be restored first. In some cases, systems deemed to not be required for continued operations may not be restored.
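As a rough illustration of how the tiers above could drive restore ordering, here is a minimal Python sketch. The day limits come from the tier descriptions; modeling Priority 1 as "0 days", the helper names `tier_for` and `restore_order`, and the service names are all assumptions made for this example, not part of the original BIA output.

```python
# Upper bound of each tier's recovery window in days, taken from the tier list.
# Assumption: Priority 1 ("immediate") is modeled as 0 days.
# Anything beyond 10 days falls through to Priority 5 (best effort).
TIER_MAX_DAYS = {1: 0, 2: 2, 3: 5, 4: 10}


def tier_for(tolerable_days):
    """Return the priority tier for a service that tolerates this many days of outage."""
    for tier, max_days in TIER_MAX_DAYS.items():
        if tolerable_days <= max_days:
            return tier
    return 5


def restore_order(services):
    """Sort service names so that higher-priority tiers are restored first."""
    return sorted(services, key=lambda name: (tier_for(services[name]), name))


# Hypothetical services and their tolerable outage in days.
services = {
    "archive-portal": 14,
    "reporting": 5,
    "order-entry": 0,
    "email": 2,
}

for name in restore_order(services):
    print(f"Priority {tier_for(services[name])}: {name}")
```

In practice the tolerable outage for each service would come out of the BIA conversations rather than a hard-coded dictionary.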
What is an SLA?
• A contract between an external service provider and its customers or between an IT department and the internal business units it serves.
What is an SLA?
• Two 9’s – 99% = 3.65 days of downtime per year (easy to achieve, less expensive)
• Three 9’s – 99.9% = 8.76 hours of downtime per year
• Four 9’s – 99.99% = 52.6 minutes of downtime per year
• Five 9’s – 99.999% = 5.26 minutes of downtime per year (difficult to achieve, expensive!)
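The downtime figures above follow directly from the availability percentage. A small Python sketch of the arithmetic, assuming a 365-day year (the function name is illustrative):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability_percent):
    """Return the yearly downtime budget, in minutes, for an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


targets = [("Two 9's", 99.0), ("Three 9's", 99.9), ("Four 9's", 99.99), ("Five 9's", 99.999)]
for label, pct in targets:
    minutes = downtime_minutes_per_year(pct)
    print(f"{label}: {pct}% -> {minutes:.2f} min/year ({minutes / 60:.2f} h, {minutes / 1440:.2f} days)")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why the cost of achieving it tends to climb so steeply.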
Disaster Recovery: Key Measures
Recovery Point Objective (RPO): Amount of data lost from a failure, measured as the amount of time from a disaster event.
Recovery Time Objective (RTO): Targeted amount of time to restart a business service after a disaster event.
[Figure: hourly timeline from 5 a.m. to 7 p.m. with "Declare Disaster" marked at 10 a.m.]
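To make the two measures concrete, the sketch below computes the achieved recovery point and recovery time for one incident and compares them with targets. The 10 a.m. disaster matches the timeline; the last backup time, the restore time, the date, and both targets are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta


def recovery_measures(last_recovery_point, disaster, service_restored):
    """Return (data-loss window, downtime) for a single incident."""
    data_loss_window = disaster - last_recovery_point  # compare against the RPO
    downtime = service_restored - disaster             # compare against the RTO
    return data_loss_window, downtime


# Hypothetical incident: last good backup at 6 a.m., disaster at 10 a.m.,
# service back at 2 p.m.
last_backup = datetime(2017, 8, 28, 6, 0)
disaster = datetime(2017, 8, 28, 10, 0)
restored = datetime(2017, 8, 28, 14, 0)

achieved_rpo, achieved_rto = recovery_measures(last_backup, disaster, restored)

# Hypothetical objectives agreed with the business.
rpo_target = timedelta(hours=4)
rto_target = timedelta(hours=2)

print(f"Data lost: {achieved_rpo}, RPO met: {achieved_rpo <= rpo_target}")
print(f"Downtime:  {achieved_rto}, RTO met: {achieved_rto <= rto_target}")
```

Here the recovery point target is met but the recovery time target is not, which is the kind of gap between expectation and capability a BIA is meant to surface.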
Disaster Recovery: Key Measures
[Figure: Cost plotted against Recovery Point and Recovery Time. Each axis is scaled in weeks, days, hours, minutes, and seconds, with Real Time at the center.]
BC vs DR vs OR – Say What?
• Business Continuity
– All goes on as normal despite an incident
– Could lose a site and have no impact on business operations (active/active sites)
• Disaster Recovery
– Coping with and recovering from an IT crisis by moving work to an alternative system in a non-routine way
– A real “disaster” is large in scope and impact
– DR typically implies failure of the primary data center and recovery to an alternate site
• Operational Recovery
– Addresses more “routine” types of failures (server, network, storage, etc.)
– Events are smaller in scope and impact than a full disaster
– Typically implies recovering to alternate equipment within the primary data center
• Each should have its own clearly defined objectives – at minimum know the difference.
Where Rubrik Helps
Let’s keep it architecture focused.
Complexity is the Enemy
Whatever you do. Whatever you buy.
Simplify your Architecture & Expect More
Key Evaluation Criteria
What we’ve seen that makes a difference…
1. Reliability of Data Recovery
a. Simplicity of Setup and Day 2 Operations – SLA Policies!
2. Speed of Data Recovery
Data Management: 1990s to Present
Data Management: 2000s to Present
[Diagram labels: Backup & Replication Software, Backup Storage, Backup Software, Backup Servers, Backup Proxies, Replication, Catalog Database, Tape, Off-site Archive, Dedupe, Metadata.]
In Two Words
Sad Panda
Meet Rubrik Cloud Data Management
[Diagram labels: Backup Software, Backup Servers, Backup Proxies, Replication, Catalog Database, Tape, Off-site Archive, Backup Storage, Dedupe, Metadata, Private, Public.]
Software fabric for orchestrating apps and data across clouds. No forklift upgrades.
How It Works
Quick Start: Rack and go. Auto-discovery.
Rapid Ingest: Flash-optimized, parallel ingest accelerates snapshots and eliminates stun. Content-aware dedupe. One global namespace.
Automate: Intelligent SLA policy engine for effortless management.
Instant Recovery: Live Mount VMs & SQL. Instant search and file restore.
Secure: End-to-end encryption. Immutability to fight ransomware.
Cloud: “CloudOut” instantly accessible with global search. Launch apps with “CloudOn” for DR or test/dev. Run apps in cloud.
[Diagram labels: Primary Environment, SLA Policy Engine, Log Management, Private, Public, NAS, AHV, Hyper-V, VMware.]
Your Data Center Today
[Diagram labels: Production Servers, SAN, Backup Proxy, Backup Server, Search Server, Disk-Based Backup, Tape Archive, Offsite Tape Vault.]
Rubrik Simplifies Your Data Center
[Diagram labels: Production Servers, SAN, Scale Out Rubrik, Replication + Long-Term Retention + Search, Private.]
Data Management in the Cloud
[Diagram labels: On-Premises Applications & Data, Storage, Rubrik, Backup, Replication, Archival, Analytics, Cloud-Native Applications & Data, Azure Instance, Blob Storage, EC2 Instance.]
SLA
• Recovery Point Objective (RPO)
• Availability Duration (Retention)
• When to Archive (RTO)
• Replication Schedule (DR)
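One way to picture an SLA policy is as a small record holding the four elements named on this slide. The Python sketch below is only an illustrative data structure: the class, its field names, and the "Gold" values are assumptions for this example and do not reflect Rubrik's actual policy schema.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class SlaPolicy:
    """Hypothetical SLA policy record; field names are illustrative only."""
    name: str
    snapshot_every: timedelta                    # recovery point objective
    keep_for: timedelta                          # availability duration (retention)
    archive_after: Optional[timedelta] = None    # when to archive
    replicate_every: Optional[timedelta] = None  # replication schedule (DR)

    def snapshots_retained(self):
        """Naive upper bound on snapshots held, ignoring any consolidation."""
        return self.keep_for // self.snapshot_every


# Hypothetical policy: snapshot every 4 hours, keep 30 days,
# archive after 7 days, replicate every 4 hours.
gold = SlaPolicy(
    name="Gold",
    snapshot_every=timedelta(hours=4),
    keep_for=timedelta(days=30),
    archive_after=timedelta(days=7),
    replicate_every=timedelta(hours=4),
)

print(gold.name, "retains up to", gold.snapshots_retained(), "snapshots")
```

Declaring the desired outcome once and assigning it to many workloads is the idea behind a policy-driven approach, as opposed to scheduling individual backup jobs.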
Let’s Demo!
What does it look like?
Key Evaluation Criteria
What we’ve seen that makes a difference…
1. Reliability of Data Recovery
a. Simplicity of Setup and Day 2 Operations – SLA Policies!
b. Immutability – is your data there when you need it?
Under the Hood
“The Interface”
“The Logic”
“The Core”
Distributed Task Framework
Callisto: Distributed Metadata Service
Cluster Management
Global Search
Cerebro: Data Management
Crystal: UI / API
Infinity: Ecosystem Integration
Thor: Cloud Connect
Atlas: Cloud-Scale File System
NFS
Key Evaluation Criteria
What we’ve seen that makes a difference…
1. Reliability of Data Recovery
a. Simplicity of Setup and Day 2 Operations – SLA Policies!
b. Immutability – is your data there when you need it?
2. Speed of Data Recovery
a. Search + Live Mount
Let’s Demo!
What does it look like?
Rubrik Backup / Recovery + DR
[Diagram labels: Production Servers, SAN, Rubrik: Backup S/W + Dedupe Storage, Rubrik: Replication & DR, DR Servers, Replication + Long-Term Retention + Search, Private.]
Key Evaluation Criteria
What we’ve seen that makes a difference…
1. Reliability of Data Recovery
a. Simplicity of Setup and Day 2 Operations – SLA Policies!
b. Immutability – is your data there when you need it?
2. Speed of Data Recovery
a. Search + Live Mount
b. API Usage / Automation to enhance restore capabilities
Oh… By the Way
Your App
Use an API-first platform to create powerful automation workflows that can be integrated with any service that supports outbound REST
Now OpenAPI
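As a sketch of what "outbound REST" automation can look like from "Your App", the Python below builds (but does not send) an authenticated POST request using only the standard library. The host name, the URL path, the JSON body, and the bearer-token scheme are placeholders invented for this example; they are not taken from Rubrik's API documentation, which should be consulted for the real endpoints.

```python
import json
import urllib.request


def build_snapshot_request(base_url, vm_id, token, sla_name):
    """Build a POST request asking a backup service for an on-demand snapshot.

    The path and payload are hypothetical placeholders, not a documented API.
    """
    url = f"{base_url.rstrip('/')}/api/example/v1/vm/{vm_id}/snapshot"
    body = json.dumps({"sla": sla_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values; nothing is sent over the network here.
req = build_snapshot_request("https://backup.example.com", "vm-123", "TOKEN", "Gold")
print(req.get_method(), req.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(req)`. Keeping request construction separate makes the workflow easy to test and to plug into whatever orchestration tool triggers it.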
One More Demo!
Wait a minute…we’ve been doing them already.
What did you see?
Easy Integration with vSphere
Working with an SLA Policy
Real-time Data Search
Don’t Backup. Go Forward.
Andrew Miller | [email protected] | @andriven
Rebecca Fitzhugh | [email protected] | @rebeccafitzhugh