Oow 2013 cloning_kyle_final copy


Agile Data Platform: Revolutionizing Database Cloning

Kyle Hailey, http://kylehailey.com

• Mind-meld with the illustrious members of the OakTable
• 32 talks over 2 days, right next door to Oracle OpenWorld

More info. and registration: http://oaktableworld.com

OakTable World: Sept 23 & 24, San Francisco

NOW

Problem in IT

Get the right data
To the right people
At the right time

Part I : Cloning Technology

Virtual Thin Clone Physical

Part II : Agile Data Acceleration

Database Cloning Challenge

If you can’t satisfy the business demands then your process is broken.

Problem

Developers

QA and UAT

Reports

First copy

Production

• CERN -  European Organization for Nuclear Research

• 145 TB database
• 75 TB growth each year
• Dozens of developers want copies

Tradeoff: Speed, Quality, Cost

What We’ve Seen

1. Inefficient QA: Higher costs of QA
2. QA Delays: Greater re-work of code
3. Sharing DB Environments: Bottlenecks
4. Using DB Subsets: More bugs in Prod
5. Slow Environment Builds: Delays

“If you can't measure it, you can't manage it”

1. Inefficient QA: Long Build times

Build Time

QA Test

96% of QA time was building environment
$.04/$1.00 actual testing vs. setup

Build

2. QA Delays: bugs found late require more code re-work

[Figure: timeline over Sprints 1 to 3 (Build QA Env, then QA) with the buggy code marked X]

[Figure: cost to correct (y-axis, 10 to 70) versus delay in fixing the bug (x-axis, 0 to 7)]

Software Engineering Economics – Barry Boehm (1981)

3. Full Copy Shared : Bottlenecks

Frustration Waiting

Old Unrepresentative Data

4. Subsets : cause bugs

Production

Classic problem is that queries that run fast on subsets hit the wall in production.

Developers are unable to test against all data

The Production ‘Wall’

5. Slow Environment Builds: 3-6 Months to Deliver Data

Management

DBA

System Admin

Storage Admin

Developers Submit Request

Disk Capacity?

Approve Request $$ (2 Weeks)

Approve Request $$

(1 Week)

Request Additional Storage?

Provision Capacity

File System Configured?

Configure LUNS & Build File System

Coordinate Replication w/ Infrastructure

Re-Parameterize & Configure DB

Mount Recovery DB to Specific PIT

Begin Work

Approve Request $$ (2 Weeks)

(3 Days)

(3 Days)

(2 Days)

(3 Days)

(3 Days)

…….1-2 Weeks of Approvals, Delays, and Provisioning……


5. Slow Environment Builds: culture of no

DBA Developer

Never enough environments

bottlenecks

What We’ve Seen

1. Inefficient QA: Higher costs
2. QA Delays: Increased re-work
3. Sharing DB: Bottlenecks
4. Subset DB: Bugs
5. Slow Environment Builds: Delays

Clone 1 Clone 2 Clone 3

99% of blocks are identical

Clone 1 Clone 2 Clone 3

Thin Clone
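If 99% of blocks are identical across clones, the storage saving follows from simple arithmetic. The sketch below is an illustration only: the 1000 GB source size and the function names are assumptions, and it ignores metadata overhead and compression.

```python
def full_copy_storage(source_gb: float, clones: int) -> float:
    """Storage used when every clone is a full physical copy."""
    return source_gb * clones


def thin_clone_storage(source_gb: float, clones: int,
                       changed_fraction: float = 0.01) -> float:
    """Storage used when clones share one base image.

    Each clone stores only its own changed blocks; changed_fraction=0.01
    mirrors the slide's "99% of blocks are identical".
    """
    return source_gb + clones * source_gb * changed_fraction


# Hypothetical 1000 GB source with three clones.
print(full_copy_storage(1000, 3))   # three full copies
print(thin_clone_storage(1000, 3))  # one shared base plus per-clone changes
```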

2. Thin Cloning

I. Clonedb Oracle
II. EMC
  • Copy on first write (COFW)
III. Netapp
  • write anywhere file system (WAFL)
  • & EMC VNX redirect on write (ROW)
IV. ZFS

I. clonedb

[Figure: RMAN backup served over dNFS, with a sparse file holding each clone's changes]

CloneDB

1. dNFS 11.2.0.2+
  – cd $ORACLE_HOME/rdbms/lib
  – make -f ins_rdbms.mk dnfs_on

2. Clonedb.pl initSOURCE.ora output.sql
  – MASTER_COPY_DIR="/rman_backup"
  – CLONE_FILE_CREATE_DEST="/nfs_mount"
  – CLONEDB_NAME="clone"

3. sqlplus / as sysdba @output.sql
  – startup nomount PFILE=initclone.ora
  – Create control file backup location
  – dbms_dnfs.clonedb_renamefile ('/backup/file.dbf' , '/clone/file.dbf');
  – alter database open resetlogs;

Tim Hall, www.oracle-base.com/articles/11g/clonedb-11gr2.php
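The rename step in the CloneDB procedure has to be repeated for every datafile in the backup. As a rough sketch, the helper below only builds the text of those `dbms_dnfs.clonedb_renamefile` calls; it does not connect to a database. The helper name and the example paths are assumptions for illustration, and the script that Clonedb.pl actually generates contains more than this.

```python
from pathlib import PurePosixPath


def clonedb_rename_block(backup_files, clone_dir):
    """Build a PL/SQL block that points each backup datafile at a clone file.

    Hypothetical helper: it returns text only and runs nothing.
    """
    lines = ["BEGIN"]
    for src in backup_files:
        # Keep the file name, swap the directory for the clone location.
        dest = PurePosixPath(clone_dir) / PurePosixPath(src).name
        lines.append(f"  dbms_dnfs.clonedb_renamefile('{src}', '{dest}');")
    lines.extend(["END;", "/"])
    return "\n".join(lines)


# Example paths taken from the slide; the second file is made up.
print(clonedb_rename_block(["/backup/file.dbf", "/backup/users01.dbf"], "/clone"))
```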


II. EMC Copy on Write

[Figure: active file system and snapshot over blocks A, B, C, D]

• Write penalty (read and two writes)
• Limit 16 snapshots
• No branching (snapshots of snapshots)
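The "read and two writes" penalty comes from preserving the old block before overwriting it. The toy model below is a sketch of copy on first write in general, not of EMC's implementation; the class and its counters are invented for illustration.

```python
class CopyOnFirstWriteVolume:
    """Toy volume with a single snapshot and simple I/O counters."""

    def __init__(self, blocks):
        self.live = dict(blocks)
        self.snapshot_area = None  # stays None until a snapshot is taken
        self.reads = 0
        self.writes = 0

    def take_snapshot(self):
        # Taking the snapshot copies nothing; old blocks are saved lazily.
        self.snapshot_area = {}

    def write(self, block_id, data):
        first_write = (self.snapshot_area is not None
                       and block_id not in self.snapshot_area)
        if first_write:
            old = self.live.get(block_id)       # read the original block
            self.reads += 1
            self.snapshot_area[block_id] = old  # write it to the snapshot area
            self.writes += 1
        self.live[block_id] = data              # write the new data in place
        self.writes += 1

    def read_snapshot(self, block_id):
        if self.snapshot_area is None:
            raise RuntimeError("no snapshot taken")
        if block_id in self.snapshot_area:
            return self.snapshot_area[block_id]
        return self.live[block_id]              # unchanged since the snapshot


vol = CopyOnFirstWriteVolume({"A": "a0", "B": "b0", "C": "c0", "D": "d0"})
vol.take_snapshot()
vol.write("D", "d1")  # first write after the snapshot: 1 read + 2 writes
print(vol.reads, vol.writes, vol.read_snapshot("D"))
```

Only the first write to a block after the snapshot pays the penalty; later writes to the same block cost a single write.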


III. Netapp and EMC VNX

[Figure: root pointing to data blocks]

• 255 snapshots
• Branching possible


IV. ZFS Allocate on Write

[Figure: snapshot root and live root over shared data blocks; ZIL (intent log)]

• Unlimited instantaneous snapshots
• Unlimited instantaneous clones
• Branching easy and unlimited
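Allocate on write makes snapshots and clones cheap because new data always goes to a fresh block, so a clone only needs a copy of the block map. The model below is a simplified sketch under that assumption; the `Pool` class is invented for illustration, and unlike real ZFS it never frees blocks and does not distinguish read-only snapshots from writable clones.

```python
class Pool:
    """Toy allocate-on-write pool: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = []    # physical blocks, append-only in this sketch
        self.datasets = {}  # dataset name -> {logical block -> physical index}

    def create(self, name, data):
        self.datasets[name] = {}
        for key, value in data.items():
            self.write(name, key, value)

    def write(self, name, key, value):
        # Allocate a new physical block and repoint only this dataset.
        self.blocks.append(value)
        self.datasets[name][key] = len(self.blocks) - 1

    def clone(self, source, new_name):
        # Copies the block map only; no data blocks are duplicated.
        self.datasets[new_name] = dict(self.datasets[source])

    def read(self, name, key):
        return self.blocks[self.datasets[name][key]]


pool = Pool()
pool.create("prod", {"A": "a0", "B": "b0", "C": "c0", "D": "d0"})
pool.clone("prod", "dev")
pool.clone("dev", "qa")       # branching: a clone of a clone
pool.write("dev", "D", "d1")  # one new block; prod and qa are untouched
print(len(pool.blocks), pool.read("prod", "D"), pool.read("dev", "D"))
```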

FS vs. ZFS

FS:
• FS per volume
• FS limited bandwidth
• Storage stranded

ZFS:
• Many FS in a pool
• Grow automatically
• All bandwidth

[Figure: three volumes, each with its own FS, versus three ZFS file systems sharing one storage pool]

I. Clonedb Oracle II. EMC III. Netapp IV. ZFS

2. Thin Cloning

[Figure: database LUNs on the production filer; snapshots and clones exported to Targets A, B, and C]

1. Put database in hot backup
2. Take snapshot
3. Clone snapshot (ZFS & Netapp)
4. Export clone
5. Mount on target host
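The five steps map onto a short command sequence on a ZFS filer. The sketch below only builds the commands as strings and runs nothing. Every dataset, host, and mount-point name is hypothetical, the NFS export path assumes the default ZFS mountpoint, and the `END BACKUP` step is an addition not listed on the slide.

```python
def thin_clone_plan(source_fs, snap_name, clone_fs, filer_host, mount_point):
    """Return (step, command) pairs for a ZFS-based thin clone.

    Illustration only: commands are returned as text, not executed.
    """
    snapshot = f"{source_fs}@{snap_name}"
    return [
        ("1. hot backup on", "ALTER DATABASE BEGIN BACKUP;"),
        ("2. take snapshot", f"zfs snapshot {snapshot}"),
        ("   hot backup off (assumed)", "ALTER DATABASE END BACKUP;"),
        ("3. clone snapshot", f"zfs clone {snapshot} {clone_fs}"),
        ("4. export clone", f"zfs set sharenfs=on {clone_fs}"),
        ("5. mount on target",
         f"mount -t nfs {filer_host}:/{clone_fs} {mount_point}"),
    ]


# Hypothetical names throughout.
for step, command in thin_clone_plan(
        "pool/oradata", "snap1", "pool/clone_a", "filer1", "/u02/oradata"):
    print(f"{step:28} {command}")
```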


Problem: How do you get data off Production?

[Figure: source instance and database LUNs on the production filer; snapshots and clones on a development filer serving Targets A, B, and C]

Three Core Parts

[Figure: production instance and file system, storage, and development instances]

1. Copy, Sync, Snapshots, Purge, Time Flow
2. Clone (snapshot), Compress, Share Cache, Storage Agnostic
3. Mount, recover, rename; Self Service, Roles & Security; Rollback & Refresh; Branch & Tag

Three Core Parts

1. Copy, Sync, Snapshots, Purge, Time Flow

1. Netapp (tr-3761.pdf)

[Figure: NetApp filer (production) and NetApp filer (development) with database LUNs linked by Snap Mirror; components shown: Snap Manager for Oracle, Snap Manager repository database, Protection Manager, Snap Drive, Flex Clone, RMAN repository; roles: DBA and Storage Admin; clones serve Targets A, B, and C]

Three Core Parts

3. Mount, recover, rename; Self service, refresh & rollback; Branch & tag; Roles & security

3. Oracle EM 12c Snap Clone

EM 12c

Agents instance

• Register Netapp or ZFS with storage credentials
• Install agents on a Linux machine to manage the Netapp or ZFS storage
• Register test master database
• Enable Snap Clone for the test master database
• Set up a zone: set max CPU and memory and the roles that can see these zones
• Set up a pool: a pool is a set of machines where databases can be provisioned
• Set up a profile: a source database that can be used for thin cloning
• Set up a service template: init.ora values

[Figure: source instance, test master, and clone instances on Linux hosts backed by ZFS or NetApp storage; labels: Pool, Profile, Zone, Template]

Where we are

[Figure: Production, Development, QA, and UAT, each with its own instances, database, and file systems]

Where we want to be

[Figure: Production, Development, QA, and UAT instances and databases sharing snapshots]

EM 12c: Snap Clone

Production Development

Flexclone Flexclone

Netapp Snap Manager for Oracle

Thin Cloning

3. Database Virtualization

Three Physical Copies
Three Virtual Copies

Data Virtualization Appliance

Choose your virtualization layer:
• Delphix and Oracle SMU

Oracle 12c SMU: Oracle Snap Management Utility for ZFS Appliance

• Requires ZFS Appliance
• Supports Linux, Solaris 10+, Windows 2008+
• GUI
  – snapshot source databases
  – provision virtual databases

Install Delphix on x86 hardware

Intel hardware

Allocate Any Storage to Delphix

Allocate Storage
Any type

One time backup of source database

Database

Production

Instance

File system

File systemBeta or in Development

Production

DxFS (Delphix) Compress Data

Database

Production

Instance

Data is compressed, typically to 1/3 of its original size

File system

Incremental forever change collection

Database

Production

Instance

File system

Changes

• Collected incrementally forever
• Old data purged

File system Time Window

Typical Architecture

[Figure: Production, Development, QA, and UAT, each with its own instances, database, and file systems]

With Delphix

[Figure: Production, Development, QA, and UAT databases and instances sharing a single file system]

Three Core Parts

[Figure: production instance, time window, and self-service development instances]

1. Source Syncing
2. Storage (DxFS)
3. Self Service

Fast, Fresh, Full

Instance

Time Window

Instance

Development VDB

Source

Free

Instance

Time Window

Instance

Instance

Instance

gif by Steve Karam

Source

Source

Self Service

Branching

Instance Instance

Instance

Time Window

Time Window

Dev1 VDB

Source

Source Dev VDB

QA VDB (branched from Dev)

End of Sprint or a Code Freeze

Federated Cloning

Federated

Instance

Time Window

Instance

Instance

Instance

Time Window

Source1

Source2Source1

Source2

“I looked like a hero”
Tony Young, CIO Informatica

DevOps

DevOps With Delphix

1. Efficient QA: Low cost, high utilization
2. Quick QA: Fast Bug Fix
3. Every Dev gets DB: Parallelized Dev
4. Full DB: Fewer Bugs
5. Fast Builds: Fast Dev, Culture of Yes

1. Efficient QA: Lower cost

Build Time

QA Test

1% of QA time was building environment
$.99/$1.00 actual testing vs. setup

Build Time

QA Test

Build

Rapid QA via Branching

2. QA Immediate: bugs found fast and fixed

Sprint 1 Sprint 2 Sprint 3

Bug CodeX

QA QA

Build QA Env QA Build QA Env QA

Sprint 1 Sprint 2 Sprint 3

Bug CodeX

3. Private Copies: Parallelize

4. Full Size DB : Eliminate bugs

Production

5. Self Service: Fast, Efficient. Culture of Yes!

[Figure: the request, approval, and provisioning workflow from "5. Slow Environment Builds", repeated for comparison]

Quality

• Forensics
• A/B testing
• Recovery

Investigate Production Bugs

Instance

Time Window

Instance

Development

Anomaly on Prod
Possible code bug
At noon yesterday

Spin up VDB of Prod as it was during anomaly

Rewind for patch and QA testing

Instance

Time Window

Instance

Development

Time Window

Prod

A/B testing

Instance

Time Window

Instance

Instance

• Keep tests for compare
• Production vs Virtual
  – invisible index on Prod
  – Creating index on virtual
• Flashback vs Virtual

Test A with Index 1

Test B with Index 2

Surgical recovery of Production

Instance Instance

Development

Time Window

Spin VDB up Before drop

Problem on Prod
Dropped Table Accidentally

Source

Time Window

Surgical or Full Recovery on VDB

Instance

Instance

Dev1 VDB

Time Window

Dev1 VDB

InstanceSource

Source

Dev2 VDB Branched

Virtual to Physical

Instance Instance

VDB

Source

Time Window

Spin VDB up Before drop

Corruption

Recovery

Business Intelligence

ETL and Refresh Windows

1pm 10pm 8am noon

ETL and DW refreshes taking longer

1pm 10pm 8am noon

2011 2012 2013 2014 2015

Database going Global

Globalization Reduces Windows

2011 2012 2013 2014 2015

1pm 10pm 8am noon

10pm 8am noon 9pm

6am 8am 10pm

ETL and Refresh Windows

2011 2012 2013 2014 2015

1pm 10pm 8am noon

10pm 8am noon 9pm

6am 8am 10pm

ETL and DW Refreshes

Instance

Prod

Instance

DW & BI

Data Guard – requires full refresh if used
Active Data Guard – read only, most reports don’t work

Fast Refreshes

• Collect only changes
• Refresh in minutes

Instance Instance

BI / DW

Prod

Temporal Data

BI

a) Fast refreshes

b) Temporal queries

c) Confidence testing

Review: Use Cases

1. Development Acceleration
   a) Full, Fresh, Fast, Self Serve
   b) QA Branching
   c) Federated

2. Quality
   a) Forensics
   b) Testing: A/B, upgrade, patch
   c) Recovery: logical, physical

3. BI
   a) Fast refresh
   b) Temporal Data
   c) Confidence testing

over 10 times

perhaps the single largest storage consolidation opportunity history“

Oracle 12c

80MB buffer cache ?

200GB Cache

[Figure: two panels plotting transactions per minute and latency against number of users (1, 5, 10, 20, 30, 60, 100, 200); axis maxima are 5000 tnxs/min with 300 ms latency in one panel and 8000 tnxs/min with 600 ms latency in the other]

Database Virtualization

Memory Prices

• EMC sells $1000/GB
• X86 memory $30/GB

• TB RAM on an x86 costs around $32,000
• TB RAM on a VMAX 40K costs around $1,000,000
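The per-terabyte figures follow from the per-gigabyte prices. A quick check, assuming 1 TB = 1024 GB (the slide does not say which convention it uses, and the function name is illustrative):

```python
GB_PER_TB = 1024  # assumption: binary terabyte


def cost_per_tb(price_per_gb: float) -> float:
    """Scale a per-GB memory price to a per-TB price."""
    return price_per_gb * GB_PER_TB


x86 = cost_per_tb(30)    # slide: "around $32,000"
emc = cost_per_tb(1000)  # slide: "around $1,000,000"
print(f"x86: ${x86:,.0f}  EMC: ${emc:,.0f}  ratio: {emc / x86:.1f}x")
```

The computed values land close to the rounded numbers on the slide; the ratio between them depends only on the per-GB prices.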

$1,000,000

$6,000

ERP Project Failures 2011

• NYC CityTime: delays $63 M => $760 M
• Montclair Uni: delays, sues PeopleSoft
• Idaho: delays, ERP cost millions

Standish : IT Project Failure Rate

1994 1996 1998 2000 2002 2004 2009

31% 40% 28% 23% 15% 18% 24%

★ http://www.galorath.com/wp/software-project-failure-costs-billions-better-estimation-planning-can-help.php
* http://www.pcworld.com/article/246647/10_biggest_erp_software_failures_of_2011.html

[Figure: Dev, QA, UAT, and Production environments holding copies at versions v2.6, v2.7, and v2.8]

Source Control for the database data

[Figure: build-up showing Prod with Dev, QA, and UAT branches for versions 2.6, 2.7, and 2.8]

Data Control = Source Control for the Database

Production Time Flow

Who is Kyle Hailey

• 1990 Oracle
  – 90 support
  – 92 Ported v6
  – 93 France
  – 95 Benchmarking
  – 98 ST Real World Performance
• 2000 Dot.Com
• 2001 Quest
• 2002 Oracle OEM 10g
  – Success! First successful OEM design
• 2005 Embarcadero
  – DB Optimizer
• 2010 Delphix

When not being a geek: have a little 4 year old boy who takes up all my time

NoCOUG board
IOUG Liaison

About Delphix

• Founded in 2008, launched in 2010
• CEO Jedidiah Yueh (founder of Avamar: >$1B revenue)
• Based in Silicon Valley, global operations
• 10% of Fortune 500

$40M

$75M

$850M

$27,000M

Storage

IT

Develop

Business

Good, Cheap, Fast : choose two

Fast

Good

Cheap

DevOps With Delphix

1. Efficient QA: Low cost, high utilization
2. Quick QA: Fast Bug Fix
3. Every Dev gets DB: Parallelized Dev
4. Full DB: Fewer Bugs
5. Fast Builds: Fast Dev, Culture of Yes