Architect’s Guide to Designing Integrated Multi-Product HA-DR-BC Solutions. John Sing, Executive Strategy, IBM. Session E10

2012_Architects_Guide_Designing_Integrated_Multi-Product_HA_DR_BC_Solutions_v2


DESCRIPTION

In today's sophisticated IT Cloud world, how do I fuse multiple technologies, products, and clouds together to create a 2012 integrated High Availability, Disaster Recovery, Business Continuity IT solution? This session complements product-specific and overview HA/DR/BC sessions by providing a proven, product-agnostic methodology to architect such a solution, including petabyte-level considerations. We provide a pragmatic, industry-proven, step-by-step methodology and toolset for you to use to work directly with clients to a) crisply elicit and distill HA/DR/BC requirements, b) efficiently organize and map those requirements, c) design an integrated, multi-product, phased-approach IT HA/DR/BC solution which properly combines backup/restore software, tape, tape libraries, de-dup, point-in-time and continuous disk replication, and storage virtualization products, and d) provide a template to clearly communicate the solution and gain consensus across multiple levels of operations and management. John Sing is the author of 3 IBM Redbooks, including SG24-6547-03 IBM System Storage Planning for Business Continuity. My only request when referencing this material in your work is that you give full credit to me, John Sing, and IBM, as the authors of this material, research, and methodology. That having been said, please spread the good word.


Page 1: 2012_Architects_Guide_Designing_Integrated_Multi-Product_HA_DR_BC_Solutions_v2


Architect’s Guide to Designing Integrated Multi-Product HA-DR-BC Solutions. John Sing, Executive Strategy, IBM. Session E10

Page 2

John Sing • 31 years of experience with IBM in high end servers, storage, and software

– 2009 - Present: IBM Executive Strategy Consultant: IT Strategy and Planning, Enterprise Large Scale Storage, Internet Scale Workloads and Data Center Design, Big Data Analytics, HA/DR/BC

– 2002-2008: IBM IT Data Center Strategy, Large Scale Systems, Business Continuity, HA/DR/BC, IBM Storage

– 1998-2001: IBM Storage Subsystems Group - Enterprise Storage Server Marketing Manager, Planner for ESS Copy Services (FlashCopy, PPRC, XRC, Metro Mirror, Global Mirror)

– 1994-1998: IBM Hong Kong, IBM China Marketing Specialist for High-End Storage

– 1989-1994: IBM USA Systems Center Specialist for High-End S/390 processors

– 1982-1989: IBM USA Marketing Specialist for S/370, S/390 customers (including VSE and VSE/ESA)

[email protected]

• IBM colleagues may access my webpage: http://snjgsa.ibm.com/~singj/

• You may follow my daily IT research blog: http://www.delicious.com/atsf_arizona

Page 3

Agenda

• Understand today’s challenges and best practices

– for IT High Availability and IT Business Continuity

• What has changed? What is the same?

• Strategies for:
– Requirements, design, implementation

• Step by step approach
– Essential role of automation
– Accommodating petabyte scale
– Exploiting Cloud


2012 Cloud deployment options

Page 4

Agenda

1. Solving Today’s HA-DR-BC Challenges

2. Guiding HA-DR-BC Principles to mitigate chaos

3. Traditional Workloads vs. Internet Scale Workloads

4. Master Vision and Best Practices Methodology

Page 5

Recovering today’s real-time massive streaming workflows is challenging

Chart in public domain: IEEE Massive File Storage presentation, author: Bill Kramer, NCSA: http://storageconference.org/2010/Presentations/MSST/1.Kramer.pdf:


Page 6

Today’s Data and Data Recovery Conundrum:

Page 7

Many options, including many non-traditional alternatives for user deployments, workload hosting, and recovery models

Traditional alternatives:

• Other platforms

• Other vendors

• Non-traditional alternatives: – The Cloud, the Developing World

Illustrative Cloud examples only; no endorsement is implied or expressed.

Inter-Disciplinary

Page 8

Finally, we have this ‘little’ problem regarding Mobile proliferation

• From an IT standpoint, we are clearly seeing the “consumerization of IT”

• Key is to recognize and exploit the hyper-paced reality of BYOD’s associated data

• Not just the technology

• Also the recovery model (“cloud”), the business model, and the required ecosystem

Clayton ChristensenHarvard Business School

http://en.wikipedia.org/wiki/Disruptive_innovation

Page 9

So how do we affordably architect HA / BC / DR in 2012?

Page 10

What has remained the same?

Data Protection Service Management Storage Efficiency

(Continued good Guiding Principles that mitigate HA/DR/BC chaos)

Page 11

(Chart: business processes A-G mapped across Business, Application, and Infrastructure layers; Applications 1-3, Analytics, management reports, http://xyz.xml, decision point, MQSeries, WebSphere, SQL, DB2)

1. An error occurs on a storage device that correspondingly corrupts a database

2. The error impacts the ability of two or more applications to share critical data

3. The loss of both applications affects two distinctly different business processes

IT Business Continuity must recover at the business process level

The Business Process is still the Recoverable Unit

Page 12

(Chart: the business-process-to-infrastructure mapping from Page 11, now with part of the stack hosted in the Cloud)

1. Data input to the cloud

2. Cloud provider outage

3. The loss of Cloud output affects two distinctly different business processes

Cloud is simply another deployment option

But it doesn’t change the fundamental HA/BC approach

Cloud does not change the business process; it is still the recoverable unit

Page 13

When can Cloud recovery provide extremely fast time to project completion?

• Where entire business process recoverable units can be out-sourced to a Cloud provider

– Production example: out-sourcing production, backup/restore, or an integrated, standalone application to a provider

– Cloud application-as-a-service (AaaS) example: Salesforce.com, etc.

(Chart: business processes A-G mapped across Business, Application, and Technical layers; Applications 1-3, Analytics, management reports, http://xyz.xml, decision point, MQSeries, WebSphere, SQL, DB2)

Page 14

The trick to leveraging Cloud is:

Understanding that Cloud is simply another (albeit powerful) deployment choice

Good news:

Fundamental principles for HA/DR/BC haven’t changed

It’s only the deployment options that have changed

Page 15

Still true: synergistic overlap of valid data protection techniques

• Protection of critical business data; operations continue after a disaster
• Costs are predictable and manageable; recovery is predictable and reliable

1. High Availability: fault-tolerant, failure-resistant, streamlined infrastructure with an affordable cost foundation

2. Continuous Operations: non-disruptive backups and system maintenance coupled with continuous availability of applications

3. Disaster Recovery: protection against unplanned outages such as disasters through reliable, predictable recovery

IT Data Protection

Page 16

Four Stages of Data Center Efficiency: (pre-req’s for HA/BC/DR)

http://public.dhe.ibm.com/common/ssi/ecm/en/rlw03007usen/RLW03007USEN.PDF http://www-935.ibm.com/services/us/igs/smarterdatacenter.html

April 2012

Page 17

Still true: Timeline of an IT Recovery

(Chart: recovery timeline from “Outage!” to “Now we're done!”)

• Recovery Point Objective (RPO): how much data must be recreated?

• Operations and network staff recover physical facilities, telecom network, and management control, then execute hardware, operating system, and data integrity recovery: the Recovery Time Objective (RTO) of hardware data integrity

• Applications staff then perform application transaction integrity recovery: the Recovery Time Objective (RTO) of transaction integrity, returning to production

• Telecom bandwidth is still the major delimiter for any fast recovery
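The chart's two objectives reduce to simple arithmetic. A minimal Python sketch (the phase names and hour figures are illustrative examples, not values from the chart):

```python
# Illustrative only: RPO is bounded by how stale the last good copy is;
# RTO is the sum of the recovery phases on the timeline above.

def rpo_hours(replication_interval_hours: float) -> float:
    """Worst case: the outage hits just before the next copy is taken."""
    return replication_interval_hours

def rto_hours(phase_hours: dict) -> float:
    """Total elapsed time from 'Outage!' to 'Now we're done!'."""
    return sum(phase_hours.values())

# Example: nightly backup, phased manual recovery (invented numbers)
phases = {
    "assess_and_declare": 2.0,
    "facilities_network_os_data": 6.0,  # RTO of hardware data integrity
    "application_transaction": 4.0,     # RTO of transaction integrity
}
print(rpo_hours(24))      # up to 24 hours of data to recreate
print(rto_hours(phases))  # 12.0 hours to restored production
```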

Page 18

Still true: value of Automation for real-time failover

(Chart: the Page 17 recovery timeline, compressed end to end by automating hardware recovery and transaction recovery)

• Recovery Point Objective (RPO): how much data must be recreated?

• Value of automation: reliability, repeatability, scalability, frequent testing

Page 19

Still true: Organize High Availability, Business Continuity Technologies, balancing recovery time objective with cost / value

Recovery Time Objective (guidelines only): 15 min, 1-4 hr, 4-8 hr, 8-12 hr, 12-16 hr, 24 hr, days; cost / value rises as RTO shrinks. Recovery from a disk image vs. recovery from tape copy.

BC Tier 1 – Restore from Tape
BC Tier 2 – Tape libraries + Automation
BC Tier 3 – VTL, Data De-Dup, Remote vault
BC Tier 4 – Add Point in Time replication to Backup/Restore
BC Tier 5 – Add Application/database integration to Backup/Restore
BC Tier 6 – Add real-time continuous data replication, server or storage
BC Tier 7 – Add Server or Storage replication with end-to-end automated server recovery
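The chart's selection logic (lowest-cost tier that still meets the RTO target) can be sketched in a few lines. The tier-to-RTO hours below are rough illustrative readings of the chart axis, not product claims:

```python
# Hedged sketch: pick the lowest-numbered (lowest-cost) BC tier whose
# typical best-case RTO still meets the target. RTO hours are invented.
BC_TIERS = [  # (tier, description, typical_best_rto_hours)
    (7, "Server/storage replication + end-to-end automated recovery", 0.25),
    (6, "Real-time continuous data replication", 1),
    (5, "Application/database integration added to backup/restore", 4),
    (4, "Point-in-time replication added to backup/restore", 8),
    (3, "VTL, de-dup, remote vault", 12),
    (2, "Tape libraries + automation", 24),
    (1, "Restore from tape", 72),
]

def cheapest_tier_for(rto_target_hours: float):
    """Lowest tier number (lowest cost) that can still meet the RTO target."""
    candidates = [t for t in BC_TIERS if t[2] <= rto_target_hours]
    return min(candidates, key=lambda t: t[0]) if candidates else BC_TIERS[0]

tier, desc, rto = cheapest_tier_for(6)  # a 6-hour RTO target
print(tier, desc)  # prints: 5 Application/database integration added to backup/restore
```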

Page 20

Still true: Replication Technology Drives RPO

For example (recovery point axis, from weeks down to seconds):

• Tape backup: recovery point of days to weeks
• Periodic replication: recovery point of hours
• Asynchronous replication: recovery point of seconds to minutes
• Synchronous replication / HA: recovery point near zero
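The chart's message is a lookup: the replication technology you choose bounds the RPO you can promise. A small sketch (band labels are illustrative readings of the Secs..Wks axis, not vendor figures):

```python
# Illustrative mapping from replication technology to the RPO band it bounds.
def rpo_band(technology: str) -> str:
    bands = {
        "synchronous": "seconds (near zero data loss)",
        "asynchronous": "seconds to minutes",
        "periodic": "minutes to hours",
        "tape backup": "hours to days (the backup interval)",
    }
    return bands.get(technology.lower(), "unknown technology")

print(rpo_band("asynchronous"))  # seconds to minutes
```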

Page 21

Recovery Time includes:

– Fault detection

– Recovering data

– Bringing applications back online

– Network access

Still true: Recovery Automation Drives Recovery Time

For example (recovery time axis, from weeks down to seconds):

• Manual tape restore: recovery time of days to weeks
• Storage automation: recovery time of hours
• End-to-end automated clustering: recovery time of seconds to minutes
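The four recovery-time components listed above can be summed to show where automation pays off. A hedged sketch; the minute figures are invented examples, not benchmarks:

```python
# Automation mainly removes human-paced steps from each recovery component.
# The four components mirror the bullets above; all minutes are illustrative.
def recovery_time_minutes(automated: bool) -> int:
    fault_detection = 1 if automated else 30
    data_recovery   = 20 if automated else 120
    app_restart     = 10 if automated else 60
    network_access  = 1 if automated else 30
    return fault_detection + data_recovery + app_restart + network_access

print(recovery_time_minutes(False))  # 240 (manual)
print(recovery_time_minutes(True))   # 32 (end-to-end automated)
```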

Page 22

(Chart: IBM Business Continuity program lifecycle: Business Prioritization, Strategy Design, Program Design, Implement, Manage / Integration into IT)

• Risk assessment: risks, vulnerabilities and threats
• Business impact analysis: impacts of outage, RTO/RPO
• Program assessment: current capability; maturity model, measured ROI, roadmap for program
• Program validation: estimated recovery time
• Resilience Program Management: awareness, regular validation, change management, quarterly management briefings

Business processes drive strategies and they are integral to the Continuity of Business Operations. A company cannot be resilient without having strategies for alternate workspace, staff members, call centers and communications channels.

• Strategy layers: crisis team, business resumption, disaster recovery, high availability
• Covering: 1. People 2. Processes 3. Plans 4. Strategies 5. Networks 6. Platforms 7. Facilities
• High availability design: database and software design, high availability servers, storage and data replication

Source: IBM STG, IBM Global Services

Still true: “ideal world” construct for IT High Availability and Business Continuity

Page 23

The 2012 Bottom line: (IT Business Continuity Planning Steps)

For today’s real-world environment:

(Chart: the same “ideal world” BC planning lifecycle as Page 22)

i.e. how to streamline this “ideal” process?
1. Collect information for prioritization
2. Vulnerability, risk assessment, scope
3. Define BC targets based on scope
4. Solution option design and evaluation
5. Recommend solutions and products
6. Recommend strategy and roadmap

2012 key #1: need a basic Data Strategy
2012 key #2: Workload type

We need a faster way than even this simplified 2007 version:

Page 24

Streamlined BC Actions (Input → Output)

1. Collect info for prioritization. Input: business processes, key performance indicators, IT inventory. Output: scope, resource business impact, component effect on business processes.

2. Vulnerability / risk assessment. Input: list of vulnerabilities. Output: defined vulnerabilities.

3. Define desired HA/BC targets based on scope. Input: existing BC capability, KPIs, targets, and success rate. Output: defined BC baseline targets, architecture, decision and success criteria.

4. Solution design and evaluation. Input: technologies and solution options. Output: business process segments and solutions.

5. Recommend solutions and products. Input: generic solutions that meet criteria. Output: recommended IBM solutions and benefits.

6. Recommend strategy and roadmap. Input: budget, major project milestones, resource availability, business process priority. Output: baseline business continuity strategy, roadmap, benefits, challenges, financial implications and justification.

2005 version

Page 25

Streamlined BC Actions (Input → Output)

1. Collect info for prioritization. Input: business processes, key performance indicators, IT inventory. Output: scope, resource business impact, component effect on business processes.

2. Vulnerability / risk assessment. Input: list of vulnerabilities. Output: defined vulnerabilities.

3. Define desired HA/BC targets based on scope. Input: existing BC capability, KPIs, targets, and success rate. Output: defined BC baseline targets, architecture, decision and success criteria.

4. Solution design and evaluation. Input: technologies and solution options. Output: business process segments and solutions.

5. Recommend solutions and products. Input: generic solutions that meet criteria. Output: recommended IBM solutions and benefits.

6. Recommend strategy and roadmap. Input: budget, major project milestones, resource availability, business process priority. Output: baseline business continuity strategy, roadmap, benefits, challenges, financial implications and justification.

2012 version: additionally, do a basic HA/DR Data Strategy and exploit Workload Type.

Page 26

How do we get there in 2012?

Bottom line #1: have a basic Data Strategy

Bottom line #2: Exploit Workload type

Data Protection Service Management Storage Efficiency

Page 27

i.e. #1: It’s all about the Data

Now, what do I mean by that?

Page 28

What is a basic Data Strategy? Specify data usage over its lifespan.

(Chart: frequency of access and use declining over time; applications create data, information and data management governs it, then information archive / retain / delete)

Page 29

Business processes drive strategies and they are integral to the Continuity of Business Operations. A company cannot be resilient without having strategies for alternate workspace, staff members, call centers and communications channels.

(Chart: the “ideal world” BC planning lifecycle from Page 22, with the Business Prioritization stage highlighted)

Data strategy = collecting information, prioritizing, vulnerability/risk, scope

Source: IBM STG, IBM Global Services

Page 30

Data Strategy: relationship to Business, IT Strategies

(Chart: strategic alignment model. Business Strategy (business scope, distinct competencies, business governance) aligns with IT Strategy (technology scope, system competencies, IT governance); each maps onto organization, infrastructure, processes, skills and tools.)

Data Strategy defined: the Data Strategy sits at the intersection of Business Strategies, IT Strategy, Enterprise IT Architecture, and IT Infrastructure, spanning People, Process, Structure, Data, and Technology.

Page 31

The role of the basic “Data Strategy” for HA / BC purposes

• Define major data types “good enough”– i.e. by major application, by business line….– An ongoing journey

• For each data type:– Usage– Performance and measurement– Security– Availability– Criticality– Organizational role– Who manages– What standards for this data

• What type storage deployed on• What database • What virtualization

• Be pragmatic– Create a basic, “good enough” data strategy for HA/BC purposes

• Acquire tools that help you know your data
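The per-data-type questions above amount to a classification record. A minimal sketch; the field names follow the bullets, and every value below is an invented example, not an IBM schema:

```python
# Illustrative "good enough" data-type record for HA/BC classification.
from dataclasses import dataclass

@dataclass
class DataType:
    name: str            # by major application / business line
    usage: str
    performance: str     # performance and measurement
    security: str
    availability: str    # e.g. target uptime
    criticality: str     # drives the HA/BC tier chosen later
    owner: str           # who manages it
    storage_class: str   # what type of storage it is deployed on
    database: str
    virtualization: str

# Example entry (all values hypothetical)
payroll = DataType("payroll", "batch + OLTP", "moderate IOPS", "confidential",
                   "99.9%", "mission critical", "HR IT", "replicated SAN",
                   "DB2", "virtualized storage pool")
print(payroll.criticality)  # mission critical
```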

(Chart: the Data Strategy positioning model from Page 30)

You have to know your data, and have a basic strategy for it.

Page 32

Here’s the major difference for 2012: there are two major types of workloads, Traditional IT and Internet Scale Workloads.

• HA, Business Continuity, Disaster Recovery characteristics. Traditional IT: HA/DR/BC can be done “agnostic / after the fact” using replication. Internet Scale: HA/DR/BC must be “designed into the software stack from the beginning”.

• Data Strategy. Traditional IT: use traditional tools/concepts to understand / know data; storage/server virtualization and pooling. Internet Scale: proven Open Source toolset to implement failure tolerance and redundancy in the application stack.

• Automation. Traditional IT: end-to-end automation of server / storage virtualization. Internet Scale: end-to-end automation of the application software stack providing failure tolerance.

• Commonality. Both: apply the master vision and lessons learned from internet scale data centers.

Page 33

Choices for high availability and replication architectures

(Chart: Production Site (site load balancer, web server clusters, application / DB server clusters, server clusters, disk) connected to Other Site(s) through a geographic load balancer and workload balancer; replication options between sites: local backup, application or database replication, server replication, storage replication, PIT image / tape backup)

Page 34

Comparing IT BC architectural methods

• Application / database / file system replication / workload balancer (file system, DB, application aware)
– Typically requires the least bandwidth
– May be required if the scale of storage is very large (i.e. internet scale)
– Span of consistency is that application, database, or file system only
– Well understood by database, application, and file system administrators
– Can be a more complex implementation; must be implemented for each application

• Replication – Server (traditional IT)
– Well understood by operating system administrators
– Storage and application independent; uses server cycles
– Span of recovery limited to that server platform

• Replication – Storage (traditional IT) (file system, DB, application agnostic)
– Can provide common recovery across multiple application stacks and multiple server platforms
– Usually requires more bandwidth
– Requires storage replication skill set
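The trade-offs above can be condensed into a first-cut decision function. A hedged sketch only; the thresholds and inputs are invented, and a real choice would weigh skills, consistency span, and cost too:

```python
# First-cut replication-layer chooser distilled from the bullets above.
# All thresholds are illustrative, not design rules.
def replication_layer(scale_pb: float, spans_platforms: bool,
                      bandwidth_constrained: bool) -> str:
    if scale_pb >= 1 or bandwidth_constrained:
        # application/db/file-system replication needs the least bandwidth
        # and may be the only option at very large (internet) scale
        return "application/database"
    if spans_platforms:
        # storage replication gives one consistency span across stacks
        return "storage"
    return "server"

print(replication_layer(0.1, spans_platforms=True, bandwidth_constrained=False))  # storage
```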

(Chart: the replication-options diagram from Page 33, spanning the Production Site and Multiple Site(s))

Page 35

Principles for Internet Scale Workloads

Page 36

Internet Scale Workload Characteristics - 1

• Embarrassingly parallel Internet workload
– Immense data sets, but relatively independent records being processed
• Example: billions of web pages; billions of log / cookie / click entries
– Web requests from different users are essentially independent of each other
• Creating natural units of data partitioning and concurrency
• Lends itself well to cluster-level scheduling / load balancing
– Independence means peak server performance is not important; there is very low inter-process communication
– What’s important is the aggregate throughput of 100,000s of servers

• Workload churn
– Well-defined, stable high-level APIs (i.e. simple URLs)
– Software release cycles on the order of every couple of weeks
• Means Google’s entire core of search services was rewritten in 2 years
– Great for rapid innovation
• Expect significant software re-writes to fix problems on an ongoing basis
– New products emerge hyper-frequently, often with workload-altering characteristics (example: YouTube)
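"Embarrassingly parallel" can be shown in miniature: independent records mean the work can be split across workers with no inter-process communication, so any ordering gives the same answer. A toy sketch (the log entries and scoring function are invented):

```python
# Independent records: per-record work needs no coordination, so aggregate
# throughput scales with worker count and result order doesn't matter.
from concurrent.futures import ThreadPoolExecutor

def score(entry: str) -> int:
    # stand-in for per-record work (e.g. parsing a log/click/cookie entry)
    return len(entry)

entries = [f"user{i} clicked page{i % 7}" for i in range(1000)]

with ThreadPoolExecutor(max_workers=8) as pool:
    total = sum(pool.map(score, entries))

# No cross-record dependencies: parallel and sequential answers agree.
sequential = sum(map(score, entries))
assert total == sequential
print(total)
```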

Page 37

Internet Scale Workload Characteristics - 2

• Platform homogeneity
– A single company owns, has the technical capability for, and runs the entire platform end-to-end, including an ecosystem
– Most web applications are more homogeneous than traditional IT, with an immense number of independent worldwide users

• 1% - 2% of all Internet requests fail*; users can’t tell the difference between the Internet being down and your system being down. Hence 99% is good enough.

*The Data Center as a Computer: Introduction to Warehouse Scale Computing, p. 81, Barroso, Holzle: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006

• Fault-free operation via application middleware
– Some type of failure every few hours, including software bugs
– All hidden from users by fault-tolerant middleware
– Means hardware and software don’t have to be perfect

• Immense scale
– Workload can’t be held within 1 server, or within a maximum-size tightly-clustered memory-shared SMP
– Requires clusters of 1000s or 10,000s of servers, with corresponding PBs of storage, network, power, cooling, software
– Scale of compute power also makes possible apps such as Google Maps, Google Translate, Amazon Web Services EC2, Facebook, etc.
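The arithmetic behind "something is always broken, hide it in software" is worth a line of code: replicate a modestly available server and ensemble availability climbs fast. A small sketch (the 99% figure is an illustrative input, echoing the callout above):

```python
# Availability of n independent replicas in parallel:
# the ensemble is down only if every replica is down simultaneously.
def parallel_availability(a_single: float, n: int) -> float:
    """P(at least one of n independent replicas is up)."""
    return 1 - (1 - a_single) ** n

print(round(parallel_availability(0.99, 1), 6))  # 0.99
print(round(parallel_availability(0.99, 3), 6))  # 0.999999
```

This is why imperfect commodity hardware plus redundancy-aware middleware can still present a service that looks continuously up.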

Page 38

IT architecture at internet scale

• Internet scale architectures’ fundamental assumptions:
– Distributed aggregation of data
– High availability and failure-tolerance functionality is in software on the server
– Time to market is everything; breakage is “OK” if I can insulate it from the user
– Affordability is everything; use open source software wherever possible
– Expect that something somewhere in the infrastructure will always be broken
– Infrastructure is designed top-to-bottom to address this

• All other criteria are driven off of these

Criteria: cost, plus extreme scale, parallelism, performance, real time, and time to market

Page 39

For Internet Scale workloads, an Open Source based internet-scale software stack. Example shown is the 2003-2008 Google version:

(Chart: Google stack: server hardware, RHEL 2.6.x PAE, rack, interior network (IPv6), exterior network, data center; GFS / GFS II, BigTable, MapReduce, Chubby lock, GWQ; Google App Engine (Python, Java, C++, Sawzall, other); Google apps: search, index, crawl, Gmail, ...)

1. Google File System Architecture – GFS II
2. Google Database – Bigtable
3. Google Computation – MapReduce
4. Google Scheduling – GWQ

The OS or HW doesn’t do any of the redundancy; reliability and redundancy are all in the “application stack”.
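MapReduce, the computation layer named above, is easy to show in miniature. This is a single-process teaching sketch of the model (map, shuffle/group, reduce), not Google's implementation, and the word-count job is an invented example:

```python
# MapReduce in miniature: map emits (key, value) pairs, the shuffle groups
# them by key, and reduce aggregates each group independently.
from collections import defaultdict

def map_phase(doc: str):
    for word in doc.split():          # map: emit (word, 1) per occurrence
        yield word.lower(), 1

def reduce_phase(pairs):
    groups = defaultdict(int)         # shuffle: group values by key
    for key, value in pairs:
        groups[key] += value          # reduce: sum each key's values
    return dict(groups)

docs = ["the web is big", "the web is parallel"]
pairs = (pair for doc in docs for pair in map_phase(doc))
counts = reduce_phase(pairs)
print(counts["the"])  # 2
```

Because every map call and every per-key reduction is independent, the real system can fan the same logic out across thousands of servers and re-run any piece that fails.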

Page 40

Internet-scale IT infrastructure

(Chart: input from the Internet, your customers, flowing through HA/DR/BC for Internet Scale Workloads)

Each red block is an inexpensive server with plenty of power for its portion of the workflow.

Page 41

Warehouse Scale Computer programmer productivity framework example

• Hadoop – overall name of the software stack
• HDFS – Hadoop Distributed File System
• MapReduce – software compute framework: Map = queries; Reduce = aggregates answers
• Hive – Hadoop-based data warehouse
• Pig – Hadoop-based language
• HBase – non-relational database for fast lookups
• Flume – populates Hadoop with data
• Oozie – workflow processing system
• Whirr – libraries to spin up Hadoop on Amazon EC2, Rackspace, etc.
• Avro – data serialization
• Mahout – data mining
• Sqoop – connectivity to non-Hadoop data stores
• BigTop – packaging / interop of all Hadoop components

http://wikibon.org/wiki/v/Big_Data:_Hadoop%2C_Business_Analytics_and_Beyond

Page 42

Summary: two major types of approaches, depending on workload type, Traditional IT vs. Internet Scale Workloads.

• HA, Business Continuity, Disaster Recovery characteristics. Traditional IT: HA/DR/BC can be done “agnostic / after the fact” using replication. Internet Scale: HA/DR/BC must be “designed into the software stack from the beginning”.

• Data Strategy. Traditional IT: use traditional tools/concepts to understand / know data; storage/server virtualization and pooling. Internet Scale: proven Open Source toolset to implement failure tolerance and redundancy in the application stack.

• Automation. Traditional IT: end-to-end automation of server / storage virtualization. Internet Scale: end-to-end automation of the application software stack providing failure tolerance.

• Commonality. Both: apply the master vision and lessons learned from internet scale data centers.

Page 43

Principles for Architecting IT HA / DR / Business Continuity

Page 44

Key strategy: segment data into logical storage pools by appropriate Data Protection characteristics (animated chart)

• Continuous Availability (CA): end-to-end automation enhances RDR
– RTO = near continuous; RPO = as small as possible (Tier 7)
– Priority = uptime, with high-value justification

• Rapid Data Recovery (RDR): enhance backup/restore
– For data that requires it
– RTO = minutes to (approximate range) 2 to 6 hours
– BC Tiers 6, 4
– Balanced priorities = uptime and cost/value

• Backup/Restore (B/R): assure an efficient foundation
– Standardize the base backup/restore foundation
– Provide universal 24-hour to 12-hour (approx.) recovery capability
– Address requirements for archival, compliance, green energy
– Priority = cost

(Chart axis: Mission Critical at the top, lower cost toward the bottom; enabled by virtualization)

Know and categorize your data: it provides the foundation for affordable data protection.
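The segmentation above amounts to routing each data class to the cheapest pool that still meets its RTO. A minimal sketch; the hour thresholds follow the ranges on this chart loosely, and the catalog entries are invented:

```python
# Route each data class to the cheapest protection pool meeting its RTO.
# Thresholds are illustrative readings of the chart, not design rules.
def protection_pool(rto_hours: float) -> str:
    if rto_hours < 0.5:
        return "Continuous Availability (Tier 7)"
    if rto_hours <= 6:
        return "Rapid Data Recovery (Tiers 6, 4)"
    return "Backup/Restore (12-24 hour foundation)"

# Hypothetical data catalog: name -> required RTO in hours
catalog = {"order-entry db": 0.1, "email": 4, "archive scans": 48}
pools = {name: protection_pool(rto) for name, rto in catalog.items()}
print(pools["order-entry db"])  # Continuous Availability (Tier 7)
```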

Page 45

Virtualization is fundamental to addressing today’s IT diversity

Virtualization

Page 46

Virtualized IT infrastructure and Business Processes

Virtualized systems become the resource pools that enable recoverability: consolidated virtualized systems become the Recoverable Units for IT Business Continuity

Virtualization

Page 47

High Availability, Business Continuity: a step-by-step virtualization journey, balancing recovery time objective with cost / value

Recovery Time Objective: 15 min, 1-4 hr, 4-8 hr, 8-12 hr, 12-16 hr, 24 hr, days; cost / value rises as RTO shrinks. Recovery from a disk image vs. recovery from tape copy.

BC Tier 1 – Restore from Tape
BC Tier 2 – Tape libraries + Automation
BC Tier 3 – VTL, Data De-Dup, Remote vault
BC Tier 4 – Add Point in Time replication to Backup/Restore
BC Tier 5 – Add Application/database integration to Backup/Restore
BC Tier 6 – Add real-time continuous data replication, server or storage
BC Tier 7 – Add Server or Storage replication with end-to-end automated server recovery

Foundation: storage pools

Page 48

Storage PoolsApply appropriate server,

storage technology

Real Time replication(storage or server or

software)

Real Time replication(storage or server or

software)

Periodic PiT replication:-File System

- Point in Time Disk- VTL to VTL with Dedup

Periodic PiT replication:-File System

- Point in Time Disk- VTL to VTL with Dedup

- Foundation backup/restore- Physical or electronic transport

- Foundation backup/restore- Physical or electronic transport

PetaByteUnstructured

PetaByteUnstructured

PetabyteUnstructured

PetabyteUnstructured

Petabyte unstructured, due to usage and large scale, typically uses

application level intelligent redundancyfailure toleration design

Petabyte unstructured, due to usage and large scale, typically uses

application level intelligent redundancyfailure toleration design

Real-time replication

Point in time

Removable media

File, application, or disk-to-disk

periodic replication

Add automated failover to replicated storage
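One way to read the storage-pool picture is as a routing decision: each workload class maps to one protection technique. The helper below sketches that mapping; the workload class names and the 1 PB threshold are hypothetical, chosen only to illustrate the decision:

```python
def protection_strategy(workload: str, size_pb: float) -> str:
    """Pick a data-protection approach per the storage-pool view:
    petabyte-scale unstructured data relies on application-level
    redundancy; other pools use storage-level techniques.
    Class names and the 1 PB threshold are illustrative assumptions."""
    if workload == "unstructured" and size_pb >= 1.0:
        return "application-level intelligent redundancy / failure toleration"
    strategies = {
        "critical": "real-time replication (storage, server, or software)",
        "important": "periodic PiT replication (file system, PiT disk, VTL-to-VTL dedup)",
        "standard": "foundation backup/restore, physical or electronic transport",
    }
    # Unknown classes fall back to the foundation technique.
    return strategies.get(workload, strategies["standard"])
```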

Page 49

Methodology: Traditional IT HA / BC / DR in stages, from the bottom up

Horizontal axis: Recovery Time Objective. Vertical axis: Cost.

• Foundation: standardized, automated tape backup (Tiers 1, 2)
• Foundation: electronic vaulting, automation, tape library (Tier 3)
• Add: point-in-time copy, disk to disk, tiered storage (Tier 4)

Building blocks: SAN; disk; VTL/de-dup.

• IBM FlashCopy, SnapShot
• IBM XIV, SVC, DS, SONAS
• IBM Tivoli Storage Productivity Center 5.1
• IBM ProtecTier
• IBM Virtual Tape Library
• IBM Tivoli Storage Manager backup/restore
• VTL, de-dup, remote replication at the tape level
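Behind the VTL/de-dup bullet is a simple idea: store each unique data chunk once and let backups reference chunks by content hash. The toy class below illustrates that concept only; it is not how ProtecTier or any IBM product is implemented:

```python
import hashlib

class DedupStore:
    """Toy content-addressed store illustrating block-level
    de-duplication: identical chunks are kept once and referenced
    by their SHA-256 hash. Concept sketch only."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks = {}    # hash -> chunk bytes, stored exactly once
        self.backups = {}   # backup name -> ordered list of chunk hashes

    def backup(self, name: str, data: bytes) -> None:
        refs = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            h = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(h, chunk)  # duplicates dedup here
            refs.append(h)
        self.backups[name] = refs

    def restore(self, name: str) -> bytes:
        return b"".join(self.chunks[h] for h in self.backups[name])
```

Two 8 KB backups sharing a 4 KB run of identical data end up storing that run once, which is the space saving de-dup trades against hash bookkeeping.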

Page 50

Methodology: Traditional IT HA / BC / DR in stages, from the bottom up

Horizontal axis: Recovery Time Objective. Vertical axis: Cost.

• Foundation: standardized, automated tape backup (Tiers 1, 2)
• Foundation: electronic vaulting, automation, tape library (Tier 3)
• Add: point-in-time copy, disk to disk for backup/restore (Tier 4)
• Automate applications and databases for replication and automation (Tier 5)
• Consolidate and implement real-time data availability (Tier 6)
• End-to-end automated site failover: servers, storage, applications (Tier 7)

Building blocks: SAN; disk; VTL/de-dup; application integration; data replication; dynamic end-to-end automated failover of servers, storage, and applications.

If storage-based replication:
• Metro Mirror, Global Mirror, Hitachi UR
• XIV, SVC, DS, other storage
• TPC 5.1
Also:
• VMware
• PowerHA on p
• Tivoli FlashCopy Manager
• Server virtualization
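Tier 7's end-to-end automation is essentially an ordered runbook: quiesce the primary, promote replicated storage, boot standby servers, start applications, redirect clients. The schematic orchestrator below uses placeholder actions standing in for real product calls; the step names paraphrase the slide and nothing here is an actual product API:

```python
# Hypothetical sketch of a Tier 7 end-to-end site failover runbook.
def failover(site, steps=None):
    """Run the failover steps in order, stopping at the first failure
    so operators know exactly where the runbook stalled."""
    steps = steps or [
        ("freeze-primary-io", lambda s: True),        # quiesce writes if primary is alive
        ("promote-replica-storage", lambda s: True),  # make DR copies writable
        ("boot-standby-servers", lambda s: True),
        ("start-applications", lambda s: True),
        ("redirect-clients", lambda s: True),         # DNS / network cutover
    ]
    completed = []
    for name, action in steps:
        if not action(site):
            return completed, name  # report the failed step
        completed.append(name)
    return completed, None
```

The point of the sketch is the ordering and the stop-on-failure contract, which is what "end-to-end automated" buys over hand-run procedures.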

Page 51

Technology deployments in Cloud

1. Private Cloud: enterprise data center
• Client-managed cloud
• Internal or partner implementation services

2. Managed Private Cloud: enterprise data center, co-lo operated
3. Hosted Private Cloud: co-lo owned and operated
• Consumption models including client-owned and provider-owned assets
• Delivery options including client premise and hosted
• Strategic Outsourcing clients with standardized services

4. Shared Cloud Services: shared by Enterprises A, B, and C
• Standardized, multi-tenant service
• Pay-per-usage model with provider-owned assets

5. Public Cloud Services: Users A through E, pay-per-usage
• Supporting compute-centric workloads
• Finer granularity in multi-tenancy model
• Provider-owned assets
• Compute cloud and persistent storage
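The five deployment models differ mainly in who owns the assets, who operates them, and how tenancy is shared. A small lookup table makes the chart queryable; the attribute strings are shorthand paraphrases of the slide, not a formal taxonomy:

```python
# Attribute strings paraphrase the slide; illustrative only.
CLOUD_MODELS = {
    1: ("Private Cloud", "enterprise data center", "client-managed"),
    2: ("Managed Private Cloud", "co-lo operated", "client premise or hosted"),
    3: ("Hosted Private Cloud", "co-lo owned and operated", "standardized services"),
    4: ("Shared Cloud Services", "multi-tenant", "pay-per-usage, provider-owned"),
    5: ("Public Cloud Services", "fine-grained multi-tenant", "provider-owned"),
}

def models_with(keyword: str):
    """Return the model names whose attributes mention a keyword."""
    return [name for name, *attrs in CLOUD_MODELS.values()
            if any(keyword in a for a in attrs)]
```

Querying for "multi-tenant" picks out models 4 and 5, the same split the slide draws between enterprise-private and shared/public services.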

Page 52

Cloud as remote-site deployment options

- Real-time replication (storage, server, or software)
- Periodic PiT replication: file system, point-in-time disk, VTL to VTL with dedup
- Point-in-time copies: physical or electronic transport
- Petabyte unstructured: petabyte-level storage typically uses intelligent file or application replication, due to large scale and usage patterns

Production recovery in Cloud

Page 53

Virtualized storage: data strategy with a remote cloud

- Real-time replication (storage, server, or software)
- Periodic PiT replication: file system, point-in-time disk, VTL to VTL with dedup
- Point-in-time copies: physical or electronic transport
- Petabyte unstructured: petabyte-level storage typically uses intelligent file or application replication, due to large scale and usage patterns

Legend: real-time replication; point in time; removable media; disk-to-disk replication; automated failover.

Page 54

Local Cloud deployment from a data standpoint

Petabyte unstructured

Page 55

Cloud provider responsibility for HA and BC

- Real-time replication (storage, server, or software)
- Periodic PiT replication: file system, point-in-time disk, VTL to VTL with dedup
- Point-in-time copies: physical or electronic transport
- Petabyte unstructured: petabyte-level storage typically uses intelligent file or application replication, due to large scale and usage patterns

Your production in Cloud; recovery by the Cloud provider

Page 56

Today's world: High Availability, Business Continuity is a step-by-step data strategy / workload journey.
Balancing recovery time objective with cost / value.

Horizontal axis: Recovery Time Objective: 15 min. | 1-4 hr. | 4-8 hr. | 8-12 hr. | 12-16 hr. | 24 hr. | days.
Vertical axis: Cost / Value.

BC Tier 1 – Restore from tape
BC Tier 2 – Tape libraries + automation
BC Tier 3 – VTL, data de-dup, remote vault
BC Tier 4 – Add point-in-time replication to backup/restore
BC Tier 5 – Add application/database integration to backup/restore
BC Tier 6 – Add real-time continuous data replication, server or storage
BC Tier 7 – Add server or storage replication with end-to-end automated server recovery

Tiers 4-7: recovery from a disk image. Tiers 1-3: recovery from a tape copy.

Workload types | Data strategy | Cloud deployment if needed

Page 57

Step-by-step virtualization, High Availability, Business Continuity data strategy.
Balancing recovery time objective with cost / value.

Horizontal axis: Recovery Time Objective: 15 min. | 1-4 hr. | 4-8 hr. | 8-12 hr. | 12-16 hr. | 24 hr. | days.
Vertical axis: Cost / Value.

BC Tier 1 – Restore from tape
BC Tier 2 – Tape libraries + automation
BC Tier 3 – VTL, data de-dup, remote vault
BC Tier 4 – Add point-in-time replication to backup/restore
BC Tier 5 – Add application/database integration to backup/restore
BC Tier 6 – Add real-time continuous data replication, server or storage
BC Tier 7 – Add server or storage replication with end-to-end automated server recovery

Tiers 4-7: recovery from a disk image. Tiers 1-3: recovery from a tape copy.

Bands: Backup/Restore; Rapid Data Recovery; Continuous Availability.

Workload types | Data strategy | Cloud deployment if needed

Page 58

Summary
• Understand today's best practices for IT High Availability and IT Business Continuity
• What has changed? What is the same?
– Principles for requirements: no change
• Data strategy
– Deployment for true internet-scale workloads: application-level redundancy
• Strategies for:
– Requirements, design, implementation
– In-house vs. outsourcing
• Step-by-step approach
– Automation and virtualization are essential
– Segment workloads: traditional vs. petabyte scale
– Exploit Cloud

Data strategy | Workload types | Cloud deployment options

Page 59