Architecting Data Protection with Rubrik

Andrew Miller, Rebecca Fitzhugh

MGT3342BUS

#VMworld #MGT3342BUS

VMworld 2017 Content: Not for publication or distribution

Disclaimer

• This presentation may contain product features that are currently under development.

• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

• Technical feasibility and market demand will affect final delivery.

• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Rebecca Fitzhugh

• Tweet: @rebeccafitzhugh

• Blogger: @ technicloud.com

• Co-Host: @ vbrownbag.com

• I have a job!: @ Rubrik.com

• Author: vSphere Virtual Machine Management; Learning VMware vSphere

• VMware: VCDX #243

Andrew Miller

• Tweet: @andriven

• Blogger: @ thinkmeta.net

• TMM: @ Rubrik.com

• Background: 7 years customer, 8 years partner.

• Certs: Lots of Random Ones

• VMware: vExpert (6x)

Agenda? Nah…

• Share Data Protection Architecture Knowledge (more than half)

• Show Where Rubrik Fits Technically + Demo (less than half)

Fair? (Q&A Too)

Why bother? One big reason…

Business Expectations of Disaster Recovery / Data Protection

!=

IT Capabilities for Disaster Recovery / Data Protection

What Are You Really Protecting Yourself Against?

• Lost or postponed sales and income

• Regulatory fines

• Delay of new business plans

• Loss of contractual bonuses

• Customer dissatisfaction

• Timing and duration of disruption

• Increased expenses such as overtime labor and outsourcing

• Employee Burnout

What is a Disaster?

Disaster: An event that affects a service or system such that significant effort is required to restore the original performance level.

• But what does that look like IN OUR ENVIRONMENT?

• What disaster and recovery scenarios should we plan for?

Sabotage!

Natural Disaster

Power Loss

What is the most common scenario for disaster?

What is a Disaster?

Disaster: An event that affects a service or system such that significant effort is required to restore the original performance level.

• But what does that look like IN OUR ENVIRONMENT?

• What disaster and recovery scenarios should we plan for?

• Where do we begin?

• How do we do it?

What is a Business Impact Analysis (BIA)?

• A process to understand:

– What is the monetary impact of a disaster or failure?

– What are the most time-critical and information-critical business processes?

– How does the business REALLY rely upon IT Service and Application availability?

– What availability or recoverability capabilities are justifiable based on these requirements, potential impact, and costs?

• Composed of two components

– Technical Discovery – Data Gathering

– Human Conversation – Talk to People!

Example Output – Priority Tiers

Priority Tier and Description

• Priority 1 – High Availability / Immediate Recovery

– Services whose unavailability for more than a brief period can have a severe impact on customers or time-critical business operations.

• Priority 2 – 1-2 day recovery

– Services whose unavailability significantly impacts customers or business operations.

• Priority 3 – 3-5 day recovery

– Services which can tolerate up to five days of disruption in a disaster.

• Priority 4 – 6-10 day recovery

– Services which can tolerate up to ten days of disruption in a disaster.

– Priority 3 and 4 systems may be restored in less time, depending on the situation. However, higher priority functions will be restored first.

• Priority 5 – “Best effort” recovery

– Non-critical services which can tolerate two weeks or more of disruption in a disaster. These systems will be restored on a best-effort basis, after other more critical systems have been restored and ongoing operations have resumed.

– Priority 5 systems may be restored in less time, depending on the situation. However, higher priority functions will be restored first. In some cases, systems deemed to not be required for continued operations may not be restored.
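
One way to put the tiers to work is to record them as data and derive a restore order from them. The following is a minimal Python sketch, not part of the original session: the service names are hypothetical, and the day limits are simply the upper bounds from the tier descriptions above (Priority 1 is treated as immediate, Priority 5 as best effort).

```python
from dataclasses import dataclass
from typing import List, Optional

# Upper bound on recovery time, in days, per priority tier.
# 0 stands for "immediate recovery"; None stands for "best effort".
MAX_RECOVERY_DAYS = {1: 0, 2: 2, 3: 5, 4: 10, 5: None}


@dataclass(frozen=True)
class Service:
    name: str      # hypothetical service name
    priority: int  # tier 1 (most critical) through 5 (best effort)


def restore_order(services: List[Service]) -> List[Service]:
    """Sort services so that higher-priority tiers are restored first."""
    return sorted(services, key=lambda s: s.priority)


def recovery_target_days(service: Service) -> Optional[int]:
    """Return the tier's recovery limit in days, or None for best effort."""
    return MAX_RECOVERY_DAYS[service.priority]


# Hypothetical inventory, as it might come out of a BIA.
services = [
    Service("intranet wiki", 5),
    Service("order entry", 1),
    Service("payroll", 3),
]

for s in restore_order(services):
    target = recovery_target_days(s)
    if target is None:
        label = "best effort"
    elif target == 0:
        label = "immediate"
    else:
        label = f"within {target} days"
    print(f"Priority {s.priority}: {s.name} ({label})")
```

Keeping the tier table in one place means the restore order and the recovery targets cannot drift apart when a service is re-tiered.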

What is an SLA?

• A contract between an external service provider and its customers or between an IT department and the internal business units it serves.

• Two 9’s – 99% = 3.65 days of downtime per year (easy to achieve, less expensive)

• Three 9’s – 99.9% = 8.76 hours of downtime per year

• Four 9’s – 99.99% = 52.6 minutes of downtime per year

• Five 9’s – 99.999% = 5.26 minutes of downtime per year (difficult to achieve, expensive!)
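
The downtime figures above follow directly from the availability percentage. As a quick check, here is a small Python sketch of the arithmetic; it assumes a 365-day year, and the function name is illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of allowed downtime per year at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% -> {hours / 24:.2f} days, {hours:.2f} hours, {hours * 60:.2f} minutes")
```

Each additional nine divides the allowed downtime by ten, which is why the cost of meeting the target climbs so quickly.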

Disaster Recovery: Key Measures

Recovery Point Objective (RPO): Amount of data lost from failure, measured as the amount of time from a disaster event.

Recovery Time Objective (RTO): Targeted amount of time to restart a business service after a disaster event.

[Figure: hourly timeline from 5 a.m. to 7 p.m., with "DECLARE DISASTER" marked at 10 a.m.]
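
To make the two measures concrete, the Python sketch below computes the achieved recovery point and recovery time for a single incident and compares them with targets. Only the 10 a.m. disaster declaration comes from the slide; the backup time, restore time, and both targets are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_good_copy: datetime, disaster: datetime) -> timedelta:
    """Data-loss window: time between the last recoverable copy and the disaster."""
    return disaster - last_good_copy


def achieved_rto(disaster: datetime, service_restored: datetime) -> timedelta:
    """Outage window: time between the disaster and the service being back."""
    return service_restored - disaster


disaster = datetime(2017, 8, 28, 10, 0)      # disaster declared at 10 a.m.
last_backup = datetime(2017, 8, 28, 6, 0)    # hypothetical: last good copy at 6 a.m.
restored = datetime(2017, 8, 28, 15, 0)      # hypothetical: service back at 3 p.m.

rpo_target = timedelta(hours=4)              # hypothetical objective
rto_target = timedelta(hours=4)              # hypothetical objective

rpo = achieved_rpo(last_backup, disaster)
rto = achieved_rto(disaster, restored)

print(f"RPO achieved: {rpo}, target met: {rpo <= rpo_target}")
print(f"RTO achieved: {rto}, target met: {rto <= rto_target}")
```

In this made-up incident the recovery point objective is met but the recovery time objective is missed, which shows why the two need separate targets.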

Disaster Recovery: Key Measures

[Figure: Cost against Recovery Point and Recovery Time. Each axis is scaled in weeks, days, hours, minutes, and seconds, with "Real Time" marked.]

BC vs DR vs OR – Say What?

• Business Continuity

– All goes on as normal despite an incident

– Could lose a site and have no impact on business operations (active/active sites)

• Disaster Recovery

– To cope with & recover from an IT crisis that moves work to an alternative system in a non-routine way.

– A real “disaster” is large in scope and impact

– DR typically implies failure of the primary data center and recovery to an alternate site

• Operational Recovery

– Addresses more “routine” types of failures (server, network, storage, etc.)

– Events are smaller in scope and impact than a full disaster

– Typically implies recovering to alternate equipment within the primary data center

• Each should have its own clearly defined objectives – at minimum know the difference.

Where Rubrik Helps

Let’s keep it architecture focused.

Complexity is the Enemy

Whatever you do. Whatever you buy.

Simplify your Architecture & Expect More

Key Evaluation Criteria

What we’ve seen that makes a difference…

1. Reliability of Data Recovery

a. Simplicity of Setup and Day 2 Operations – SLA Policies!

2. Speed of Data Recovery

Data Management: 1990s to Present

[Figure: legacy data management stack, labeled "1990s – Present" and "2000s – Present". Components shown: Backup & Replication Software, Backup Storage, Backup Software, Backup Servers, Backup Proxies, Replication, Catalog Database, Tape, Off-site Archive, Dedupe, Metadata.]

In Two Words

Sad Panda

Meet Rubrik Cloud Data Management

[Figure: the legacy components (Backup Software, Backup Servers, Backup Proxies, Replication, Catalog Database, Tape, Off-site Archive, Backup Storage, Dedupe, Metadata) shown with Private and Public cloud.]

Software fabric for orchestrating apps and data across clouds. No forklift upgrades.

How It Works

Quick Start: Rack and go. Auto-discovery.

Rapid Ingest: Flash-optimized, parallel ingest accelerates snapshots and eliminates stun. Content-aware dedupe. One global namespace.

Automate: Intelligent SLA policy engine for effortless management.

Instant Recovery: Live Mount VMs & SQL. Instant search and file restore.

Secure: End-to-end encryption. Immutability to fight ransomware.

Cloud: “CloudOut” instantly accessible with global search. Launch apps with “CloudOn” for DR or test/dev. Run apps in cloud.

[Figure labels: Primary Environment (VMware, AHV, Hyper-V, NAS), SLA Policy Engine, Log Management, Private, Public.]

Your Data Center Today

[Figure labels: Production Servers, SAN, Backup Proxy, Backup Server, Search Server, Disk-Based Backup, Tape Archive, Offsite Tape Vault.]

Rubrik Simplifies Your Data Center

[Figure labels: Production Servers, SAN, Scale Out Rubrik, Replication + Long-Term Retention + Search, Private.]

Data Management in the Cloud

[Figure labels: On-Premises, Cloud-Native, Applications & Data, Storage, Azure Instance, Blob Storage, EC2 Instance, Rubrik, Backup, Replication, Archival, Analytics.]

SLA

• Recovery Point Objective (RPO)

• Availability Duration (Retention)

• When to Archive (RTO)

• Replication Schedule (DR)
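
The four SLA elements above can be thought of as one declarative policy object. The Python sketch below illustrates that idea only; the class, its field names, and the `snapshot_action` helper are hypothetical and are not Rubrik's actual schema or API. Replication is reduced to a simple on/off flag for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class SlaPolicy:
    """Hypothetical policy holding the four SLA elements listed above."""
    name: str
    snapshot_every: timedelta            # recovery point objective
    retain_for: timedelta                # availability duration (retention)
    archive_after: Optional[timedelta]   # when to archive; None = never
    replicate: bool                      # replication schedule, simplified


def next_snapshot_due(policy: SlaPolicy, last_snapshot: datetime) -> datetime:
    """Latest time the next snapshot can run without exceeding the RPO."""
    return last_snapshot + policy.snapshot_every


def snapshot_action(policy: SlaPolicy, taken_at: datetime, now: datetime) -> str:
    """Decide what should happen to one snapshot given its age."""
    age = now - taken_at
    if age > policy.retain_for:
        return "expire"
    if policy.archive_after is not None and age > policy.archive_after:
        return "archive"
    return "keep local"


gold = SlaPolicy(
    name="gold",
    snapshot_every=timedelta(hours=4),
    retain_for=timedelta(days=30),
    archive_after=timedelta(days=7),
    replicate=True,
)

now = datetime(2017, 8, 28, 12, 0)
for days_old in (1, 10, 31):
    action = snapshot_action(gold, now - timedelta(days=days_old), now)
    print(f"{days_old}-day-old snapshot: {action}")
```

Assigning a workload to a policy like this, rather than scheduling individual jobs, is the design choice the slide is pointing at.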

Let’s Demo!

What does it look like?

Key Evaluation Criteria

What we’ve seen that makes a difference…

1. Reliability of Data Recovery

a. Simplicity of Setup and Day 2 Operations – SLA Policies!

b. Immutability – is your data there when you need it?

Under the Hood

[Figure: platform layers labeled “The Interface”, “The Logic”, and “The Core”, with these components:]

• Distributed Task Framework

• Callisto: Distributed Metadata Service

• Cluster Management

• Global Search

• Cerebro: Data Management

• Crystal: UI / API

• Infinity: Ecosystem Integration

• Thor: Cloud Connect

• Atlas: Cloud-Scale File System

• NFS

Key Evaluation Criteria

What we’ve seen that makes a difference…

1. Reliability of Data Recovery

a. Simplicity of Setup and Day 2 Operations – SLA Policies!

b. Immutability – is your data there when you need it?

2. Speed of Data Recovery

a. Search + Live Mount

Let’s Demo!

What does it look like?

Rubrik Backup / Recovery + DR

[Figure labels: Production Servers, SAN, Rubrik (Backup S/W + Dedupe Storage), Replication + Long-Term Retention + Search, DR Servers, Rubrik (Replication & DR), Private.]

Key Evaluation Criteria

What we’ve seen that makes a difference…

1. Reliability of Data Recovery

a. Simplicity of Setup and Day 2 Operations – SLA Policies!

b. Immutability – is your data there when you need it?

2. Speed of Data Recovery

a. Search + Live Mount

b. API Usage / Automation to enhance restore capabilities

Oh… By the Way

Your App

Use an API-first platform to create powerful automation workflows that can be integrated with any service that supports outbound REST.

Now OpenAPI
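
As a rough illustration of what driving a backup platform over REST looks like from "your app", the Python sketch below builds (but does not send) an authenticated POST request using only the standard library. Everything specific in it is an assumption made for the example: the base URL, the `/vm/{id}/snapshot` path, the token, and the JSON body are hypothetical placeholders, not documented Rubrik endpoints. Consult the product's published API reference for the real ones.

```python
import json
import urllib.request


def build_snapshot_request(base_url: str, vm_id: str, token: str) -> urllib.request.Request:
    """Build a POST request asking for an on-demand snapshot of one VM.

    The URL path and JSON body are hypothetical placeholders.
    """
    url = f"{base_url.rstrip('/')}/vm/{vm_id}/snapshot"
    body = json.dumps({"reason": "pre-change backup"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical endpoint and credentials; nothing is sent over the network here.
req = build_snapshot_request("https://backup.example.com/api", "vm-123", "TOKEN")
print(req.get_method(), req.full_url)

# To actually send it, a caller would use: urllib.request.urlopen(req)
```

Because the request is plain HTTP plus JSON, the same call can be issued from a ticketing system, a CI pipeline, or any other tool that supports outbound REST.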

One More Demo!

Wait a minute…we’ve been doing them already.

What did you see?

• Easy Integration with vSphere

• Working with an SLA Policy

• Real-time Data Search

Don’t Backup. Go Forward.

Andrew Miller | [email protected] | @andriven

Rebecca Fitzhugh | [email protected] | @rebeccafitzhugh
