© 2004 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Clusters and Disaster Tolerant Solutions
Jaime Blasco, HP/Oracle CTC (Cooperative Technology Center), EMEA Competence Center
Storage Grid Seminar, October 2005: Frankfurt, München, Hamburg
Our Agenda Today
• Cluster Solutions from HP
• Introduction to Disaster Tolerance
• Solutions for Disaster Tolerance
• Continuous Access EVA with Oracle Databases
• Oracle Data Guard
• Positioning Continuous Access and Oracle Data Guard
• CA-EVA Demo Scenario
• Summary
Fundamental questions to ask yourselves (DR design)

• Protection level (distance): local disaster (e.g. water, AC), regional disaster, logical errors
• RPO (data currency): synchronous? loss of data in the DR case? data consistency!
• Data volume & performance: gigabytes or TBs; latency considerations
• Investment: hardware & software, bandwidth, processes
• RTO (failover time): manual failover vs. fully automated stretched clusters
Disaster Tolerant Solutions
Business continuity solutions

[Chart: business continuity solutions positioned by recovery time (days / hours / minutes) against deployment cost, wrapped by HP Services. From lowest to highest cost: infrastructure and enterprise backup solutions; rapid backup and rapid recovery solutions (operational data protection); local and remote clustering; business-critical remote mirroring; fault-tolerant solutions.]
Data replication solutions

Host-based data replication
• Benefits: uses existing servers; simple to implement; any server-supported storage; some have integrated clustering software
• But: uses server and front-end network bandwidth; licensed per server

Array-based mirroring
• Benefits: high performance; server- and application-independent; no host license costs (capacity-based licensing)
• But: similar arrays only

Transaction-based solution
• Benefits: disaster tolerance integrated at the transaction level; minimal data transfer between sites; multiple-site capability
• But: transaction-level only; only for specific database applications
CA EVA – Synchronous replication

[Diagram: a host at Site A writes to the local EVA cache; the write is replicated over the FC SAN from the source DR groups to the destination DR groups in the Site B cache; only after the remote acknowledgement is I/O completion returned to the host.]
Continuous Access for EVA
Synchronous and asynchronous copy methods
• Synchronous mode
− Remote site data is constantly kept 100% consistent with local site data
− I/O completion status is not returned to the host until both the local and remote writes are complete (i.e. in cache)
− Provides real-time mirroring of data
− In-order delivery is guaranteed
Continuous Access for EVA
Synchronous and asynchronous copy methods

• Synchronous (cont.)
− Appropriate when exact data consistency is crucial to the application
− Synchronous mode can affect overall performance by increasing response time on writes
− The higher the latency of the link, the more impact on write performance

• No loss of transactions
− Database & file system are always consistent
− Site failure: basically the same application recovery as a power failure
CA EVA – Future Asynchronous replication (Write Behind)

[Diagram: a host at Site A writes to the local EVA cache and receives I/O completion immediately; the write is then forwarded over the SAN/ATM link from the source DR groups to the destination DR groups at Site B and acknowledged later.]
Continuous Access for EVA
Synchronous and asynchronous copy methods

• Asynchronous mode
− Remote site data continuously lags behind the local site data by a number of write I/Os
− In-order delivery is guaranteed
− I/O completion status is returned to the host when the local write completes; remote writes are deferred until a later time
− Typically used on high-latency links
− Check whether the business application can accommodate lost transactions in a disaster scenario
Asynchronous Replication: Benefits

• Maintains production system performance
− Remote replication can add significant latency to I/O over long distances

• Performance over much greater distances (a myth…)
− You still have remote data replication
− The application must be able to handle possible inconsistencies
Latency considerations

• 5 nanoseconds/m: speed of light in a fibre-optic cable
• 2 microseconds per switch
• up to 1 ms: write into cache memory
• up to 30 ms: write to back-end disk

≈1 ms per local cache write, i.e. about 1,000 serialized write IOPS
Synchronous considerations

Per synchronous write (local cache write, transfer to remote cache, acknowledgement back):
• 5 ns/m fibre latency, 2 µs per switch
• 1 ms local cache write + 0.3 ms cache read
• remote link latency
• 1 ms remote cache write

Typical link latencies (rules of thumb):
• 100 km FC direct: ~1 ms (ROT)
• DWDM: ~250 ns/device
• FCIP Europe: ~19 ms
• FCIP transatlantic: ~81 ms
• FCIP North America: ~44 ms
• FCIP Singapore<->US: ~210 ms

≈2.3 ms total → about 435 IOPS @ 0 km
≈3.3 ms total → about 303 IOPS @ 100 km
≈83 ms total → about 12 IOPS @ IP London-NY
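To make the IOPS figures above concrete, here is the arithmetic behind them (a back-of-the-envelope sketch using the latencies listed on this slide, assuming strictly serialized single-threaded writes):

    t_sync(0 km)   ≈ 1 ms (local cache write) + 0.3 ms (cache read) + 1 ms (remote cache write) ≈ 2.3 ms
                   → IOPS ≈ 1000 / 2.3 ≈ 435

    t_sync(100 km) ≈ 2.3 ms + ~1 ms round-trip link latency (ROT) ≈ 3.3 ms
                   → IOPS ≈ 1000 / 3.3 ≈ 303

    t_sync(IP London-NY) ≈ 2.3 ms + ~81 ms FCIP latency ≈ 83 ms
                   → IOPS ≈ 1000 / 83 ≈ 12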
DWDM proof-of-concept: communication company, UK

[Test setup: XP1024 and XP128 disk arrays (RAID5), HP 9000 L-class servers, hp fc16b (SilkWorm 3800) FC switches, Cisco 15540 DWDM multiplexers, one optical cable of 5 km or 55 km.]

[Chart: write I/O latency (8 KB) in ms, scale 0 to 2.5, by CA type/distance: simplex, sync CA @ 0 km, async CA @ 0 km, sync CA @ 5 km, async CA @ 5 km, sync CA @ 50 km, async CA @ 50 km.]
HP's full range disaster tolerant solutions

• local cluster: cluster and data within one site
• campus cluster: one cluster spanning SAN 1 and SAN 2; up to 100 km using optical multiplexers; single cluster management; server-based data replication
HP's full range disaster tolerant solutions

• local cluster with DR option: data mirrored to a remote array via Continuous Access XP
• metro cluster / CLX: single cluster with a Continuous Access XP data mirror between sites
• continental cluster: two clusters with data replication between them
HP Serviceguard
Foundation high availability clustering: loosely-coupled cluster

Shared centralized or shared distributed storage: file systems (exclusive access) or raw volumes (shared access with SGeRAC)

[Diagram: up to 16 nodes (A-D shown) on a cluster interconnect, attached through FC switches and controllers to shared /apps & data volumes. Each node has its own root file system (/root, /usr, /apps); applications may be installed once or on each node.]
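For orientation, day-to-day Serviceguard package operations look roughly like this (a sketch; the package name oracle_pkg and the node names node1/node2 are hypothetical):

    # View cluster, node and package status
    cmviewcl -v

    # Start the (hypothetical) Oracle package on a specific node
    cmrunpkg -n node1 oracle_pkg

    # Halt the package, e.g. before planned maintenance
    cmhaltpkg oracle_pkg

    # Move the package: halt it, then start it on the other node
    cmhaltpkg oracle_pkg && cmrunpkg -n node2 oracle_pkg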
HP Serviceguard Extension for RAC (SGeRAC)
• SGeRAC provides clustering services for Oracle 10g and 9i Real Application Clusters (RAC) and 8i Parallel Server (OPS), to meet the high availability and scalability requirements of modern enterprise mission-critical computing.
• Customers can easily add SGeRAC to their Mission Critical Operating Environment (MCOE)
• Same protection & functionality for applications as Serviceguard
Increase availability & scalability with Oracle Real Application Cluster
[Diagram: four 9i RAC instances on SGeRAC/SG-OPS nodes, linked by Cache Fusion.]
Extended Serviceguard Cluster with Oracle (aka Campus Cluster)

• Single cluster over two data centers: Active/Active
• Disaster tolerance, as servers and storage reside in two separate data centers
− up to 10 km with Oracle RAC and FC switch/hub
− up to 100 km with Oracle RAC and DWDM
• Automatic failover to the second data center
• Software mirroring
− 2 nodes supported with SLVM and MirrorDisk/UX
− 8 nodes supported with Veritas VxVM/CVM up to 10 km; 2 nodes up to 100 km

[Diagram: Node A (VM) in Data Center 1 and Node B (VM) in Data Center 2, each with local storage, running RAC.]
Supported with Oracle single instance and Oracle RAC
MetroCluster / Cluster Extension XP with Oracle

• Single cluster over two data centers: Active/Passive
• Disaster tolerance, as servers and storage reside in two separate data centers
• Rapid, automatic site recovery without human intervention
• Storage hardware mirroring with XP CA; EVA CA currently being tested
• Separate arbitrator for split-brain situations
• The system connected to the mirror has read-only access

[Diagram: Node A (Data Center 1) and Node B (Data Center 2) in one MC/SG cluster, XP arrays linked via CA/SRDF, plus Node C as arbitrator.]
Supported with Oracle single instance
Cluster Extension XP

Seamless integration of remote mirroring with industry-leading heterogeneous server clustering

Features
− extends local cluster solutions up to the distance limits of the software
− rapid, automatic application recovery
− integrates with multi-host environments
• Serviceguard for Linux
• VERITAS Cluster Server on Solaris
• HACMP on AIX
• Microsoft Cluster service on Windows

Benefits
− enhances overall solution availability
− ensures continued operation in the event of metropolitan-wide disasters

[Diagram: a cluster spanning Datacenter A and Datacenter B, with a data mirror between two XP arrays over Continuous Access XP and Cluster Extension XP coordinating failover.]
HP ContinentalClusters with Oracle

• Single clusters in separate data centers: Active/Passive
• Data replication:
− Single instance: the data replication method can be software based (like Oracle Standby) or disk-array based (XP CA or EMC SRDF), with no limitation in distance
− RAC: XP CA or EMC SRDF with SLVM up to 100 km
• Bidirectional failover capabilities
− Either data center is capable of supporting the other in the event of a disaster
• "Push-button" failover

[Diagram, before failure: Instance1 and Instance2 run against the active DB on the primary cluster, replicated via XP CA to the recovery cluster. After failure: Instance1 and Instance2 run against the now-active DB on the recovery cluster.]

Supported with Oracle single instance and Oracle RAC (*)
(*) RAC10g expected to be supported end of CY2005
HP ContinentalClusters with Oracle: Customer Example

[Diagram: two local SGeRAC clusters joined in a ContinentalClusters configuration with intercluster monitoring. Nodes RAC11 & RAC12 access DB1; nodes RAC21 & RAC22 access DB2. Each site also holds the other site's copy (DB1'/DB2') via bidirectional HP CA XP/EVA, synchronous or asynchronous, between HP StorageWorks XP12000 disk arrays; application servers front each RAC node.]
Oracle 10g and Oracle Data Guard
• What is Oracle Data Guard (ODG)?
• Why Oracle Data Guard?
• Physical or Logical - Standby DB Considerations
• Sync or Asynchronous - Distance Considerations
• Oracle Data Guard Services
• Oracle Data Guard Protection Modes
What is Oracle Data Guard (ODG)?

• An efficient disaster recovery solution native to Oracle Databases

• Provides the ability to fail over and restore access to users within minutes
• Can be configured to guarantee zero data loss and maintain transactionally consistent copies of the primary/production database
• Better resource utilization and load balancing by offloading reports and backups to standby site
Why Oracle Data Guard?
• Built by Oracle from the ground up for exclusive use with Oracle Databases

• Improved performance on the primary database

• Supports a one (primary) to many (standby) relationship model for Oracle Database replication

• Ability to create, maintain, manage and monitor the configuration within the same framework as Oracle Databases

• The same Oracle DBAs can manage and maintain the replication of the database

• Automatic gap detection and resolution
Physical or Logical - Standby Database Considerations
• Physical Standby:
− Provides a physically identical copy of the primary DB, with on-disk database structures identical on a block-for-block basis
− Uses media recovery to apply the redo data generated and transmitted from the primary database
− Protection from user and logical errors using the provision for delayed redo apply
− Fast and efficient failover
− Ability to open the standby database in read-only mode for reporting and backups
Physical or Logical - Standby Database Considerations
• Physical Standby using Redo Apply [architecture diagram]
Physical or Logical - Standby Database Considerations
• Logical Standby:
− Provides a logically similar but physically distinct database structure at the standby location
− Transforms the redo data from the primary into logical SQL statements and applies those SQL statements to the standby database
− Available for user access while applying changes/data from the primary
− Ability to add new data structures on the standby to better facilitate reporting or added functionality
− Enables rolling upgrades of the primary and standby databases
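As a rough illustration of the last two points, SQL apply on a logical standby is started and stopped with statements of this shape (standard SQL*Plus syntax):

    -- On the logical standby: start applying SQL mined from primary redo
    ALTER DATABASE START LOGICAL STANDBY APPLY;

    -- Stop SQL apply, e.g. before adding local reporting structures
    ALTER DATABASE STOP LOGICAL STANDBY APPLY;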
Physical or Logical - Standby Database Considerations
• Logical Standby using SQL apply
Synchronous or Asynchronous: Distance Considerations

− Determined by the parameter values assigned to LOG_ARCHIVE_DEST_n (see the sketch below)
− Default is transmission using the ARC process
• Happens only when there is a log file switch on the primary database
• The log file is first written to the local destination and then transmitted to the remote destination
− LGWR
• LGWR writes to the primary online redo log and transmits to the standby redo log files
• Can be synchronous or asynchronous
• Determined by the distance between primary & standby
• Commit behavior on the primary depends on sync or async mode
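A minimal sketch of how these choices surface in the initialization parameters (the service alias STBY is hypothetical; the attributes follow the standard LOG_ARCHIVE_DEST_n syntax):

    -- Default: archiver-based shipping at log switch
    LOG_ARCHIVE_DEST_2 = 'SERVICE=STBY ARCH'

    -- LGWR shipping, synchronous (zero data loss, latency-sensitive)
    LOG_ARCHIVE_DEST_2 = 'SERVICE=STBY LGWR SYNC AFFIRM'

    -- LGWR shipping, asynchronous (for higher-latency links)
    LOG_ARCHIVE_DEST_2 = 'SERVICE=STBY LGWR ASYNC NOAFFIRM'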
Oracle Data Guard Services
• Log Transport Services
− Controls the automated transfer of redo data from the primary database to one or more archival (standby) destinations
− Uses either the ARC or the LGWR process

• Log Apply Services
− Applies the redo data received from the primary to the standby database to maintain consistency
− Uses redo apply (physical standby) or SQL apply (logical standby)

• Role Management Services
− Switchover (planned role transition)
− Failover (unplanned)
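For context, a planned switchover is driven by two statements of this shape (standard syntax; run on the primary first, then on the standby):

    -- On the primary: convert it to a physical standby
    ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY;

    -- On the (old) standby: convert it to the primary role
    ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;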
Oracle Data Guard Protection Modes

• Maximum Protection
− Guarantees no data loss in case of failure
− Needs confirmed redo log writes to both local & remote destinations
− The primary shuts down if the remote write fails

• Maximum Availability
− Similar to maximum protection, but the primary does not shut down on a failed redo write to the remote destination

• Maximum Performance
− Default protection mode
− Transactions commit as soon as they are written to a local redo log group
− Remote redo logs are written asynchronously
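The mode is set with a single statement of this form (standard syntax; the database must be mounted to raise the mode to maximum protection):

    -- Pick exactly one of the three modes
    ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PROTECTION;
    ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;
    ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE;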
ODG - Creating a Physical Standby
• Enable Archive mode on PRIMARY
• Set up remote passwords
• Force Logging on Primary
• Create Standby redo log(s)
• Update init.ora
• Create STANDBY control files
• Copy DB, logs, control files to STANDBY
• Set up listeners and tnsnames aliases
• Mount STANDBY
• Ship redo logs to standby
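The checklist above maps onto SQL*Plus statements roughly like this (a sketch, not a complete procedure; file paths and sizes are hypothetical):

    -- On the primary
    ALTER DATABASE FORCE LOGGING;
    ALTER DATABASE ADD STANDBY LOGFILE ('/u01/oradata/stby_redo01.log') SIZE 100M;
    ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/tmp/standby.ctl';

    -- On the standby, after copying datafiles, logs and control files
    STARTUP NOMOUNT;
    ALTER DATABASE MOUNT STANDBY DATABASE;
    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;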
Positioning CA and Data Guard

Short to medium distance (i.e. < 100 km)
− CA-XP replication: CA Synchronous is a very good choice. Offers a high level of data consistency; updates to the replicated volume are real-time; provides crash-consistent recovery.
− Data Guard: Physical standby provides log-based recovery. A robust network connection is required; updates work through logs.

Long distance (i.e. > 100 km)
− CA-XP replication: CA Asynchronous uses sidefile technology to maintain performance on the primary side. The rate of update must be managed, based on the latency of the long-distance link.
− Data Guard: Updates are dependent on the server throughput capabilities and the latency of the network; this creates server overhead.

General characteristics
− CA-XP replication: Operating-system agnostic. Performance monitoring is done primarily on the array. Offline reporting/backups are performed using a Business Copy of the secondary CA volumes.
− Data Guard: High level of transactional consistency. Performance monitoring is done primarily on the network and within the database structures. Offline reporting/backups are performed through a read-only connection on the standby database.
Continuous Access XP and Oracle 10g

• What is CA-XP?
• Why CA-XP?
• Sync or Asynchronous - Distance Considerations
• Command View XP for Setup & Management
• Performance Advisor XP
• Raid Manager XP for Replication Control
• I/O throughput Graphical Observations
• Benchmark Factory (User loads)
What is CA XP?
High-performance real-time remote data mirroring

Features
− real-time remote data mirroring between local and remote XP disk arrays
− fast failover/failback for seamless, reliable mirroring recovery operations
− flexible host agent for solution integration
− enables a wide range of remote mirroring solutions
− synchronous and asynchronous copy modes

Benefits
− reliable and easy to manage
− offers geographically dispersed disaster recovery

[Diagram: hosts at two sites with data mirrored over Fibre Channel/DWDM, and a further mirror cascaded over continental distances.]
Why CA-XP?
• Platform, operating system & application agnostic

• Common replication solution for multiple applications and a variety of data

• No impact on applications in async mode; only very minimal noticeable impact in sync mode

• Re-syncs only the changes made after failure/restore

• Hardware/array-based, block-level replication: faster, with a high level of data consistency
Continuous Access connectivity
The XP12000 is capable of running at nearly full speed while remote-mirroring the entire contents of the array

Synchronous
• Host writes are not acknowledged until the data is in both arrays, for maximum data integrity
• Flexible interconnect
− Fibre Channel up to 10 km
− ESCON with repeaters up to 43 km
− DWDM up to 100 km

Asynchronous
• Improved performance over longer distances, as writes are acknowledged as soon as the data is in the primary array
• Remote connections over common carrier (T3, OC3, etc.)
Continuous Access XP Asynchronous

Operation
1. The primary-site server writes to cache and data volumes.
2. The I/O is immediately acknowledged by the array to the server.
3. Cache data is written asynchronously to the remote array.
4. The remote-site XP array acknowledges the write to the primary array, clearing the cache.

What Continuous Access XP Extension (async copy) offers
• Low response-time impact on primary-site operations
• Remote sequence-stamping to ensure continuous remote mirror consistency
• Efficient, link-optimized remote copy operations

Key solutions
• Cost-effective continental-distance data replication/migration & disaster recovery/clustering
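Since the earlier agenda names Raid Manager XP for replication control, here is a rough sketch of the command-line flow (hedged: the device-group name oradb is hypothetical, and exact options vary by RAID Manager version):

    # Create a CA pair for the (hypothetical) device group "oradb"
    paircreate -g oradb -f async -vl

    # Check pair status (PAIR, COPY, PSUS, ...)
    pairdisplay -g oradb

    # Split the pair, e.g. to take a consistent backup at the remote site
    pairsplit -g oradb

    # Re-synchronize after the split
    pairresync -g oradb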
Example Solution Overview:
• Rapid Recovery for large Oracle database environments in a disaster situation
• Basis for validating the various technologies available for replicating large Oracle Database Clusters in a SAN environment
− Continuous Access XP (array-based replication)
− Oracle Data Guard (software replication)
• Best Fit Scenarios for reference configurations
• Best Practices recommendations
Worldwide clustering with ContinentalClusters (HP-UX)

HP-UX disaster tolerant solution for global distances

• no geographic distance limitations
• push-button failover across thousands of kilometers
• bi-directional failover
• support of all TCP/IP networks
• failover between two MC/SG clusters of the same or different sizes, transparent to the application
• logical & physical data replication
• asynchronous and synchronous physical data replication with fast-failback
• HP-UX servers and HP's XP disk array family

[Diagram: two local clusters with a data mirror and cluster-failure detection between them.]
Continental Cluster architecture

[Diagram: two local HP-UX clusters (ContinentalClusters version A.04.00) running 9i RAC instances 1.1 and 1.2 plus Apache, each with its own SAN (XP128, XP1024). The clusters share a heartbeat & client LAN, are watched by ccmon cluster monitors, and replicate data via bidirectional Continuous Access XP, synchronous or asynchronous.]
Cluster Extension XP: How does it work? (1)

Continuous Access XP
• synchronous or asynchronous physical data replication
• ensures data consistency and concurrency
• systems are not connected to both replicas of the data
• the system connected to the mirror has read-only access

Cluster Solution
• integrates application resources
• monitors status of the cluster
• provides automatic application failover mechanisms
• cannot handle read-only disks
• provides interfaces or APIs to extend cluster functionality

[Diagram: an application and its data mirror across XP Array A (Datacenter A) and XP Array B (Datacenter B, read-only) over Continuous Access XP, with one cluster spanning campus, metropolitan or continental distances; Cluster Extension XP bridges the two.]
Continuous Access connectivity

The XP is the first array capable of running at nearly full speed while remote-mirroring its entire contents

[Diagram: Site A (P-VOL) replicating to Site B (S-VOL) over private fibre or common carrier and other network devices.]

[Chart: write I/O latency (8 KB) in ms, scale 0 to 2.5, by CA type/distance: simplex, sync CA @ 0 km, async CA @ 0 km, sync CA @ 5 km, async CA @ 5 km, sync CA @ 50 km, async CA @ 50 km.]
Extended clustering with MetroCluster (HP-UX)

HP-UX disaster tolerant solution for metropolitan-wide distances

• rapid, automatic site recovery without human intervention
• distance limited by network and data replication link to 100 km
• bi-directional failover within a single, geographically dispersed MC/SG cluster
• transparency to the application
• asynchronous and synchronous data replication with fast-failback
• MC/SG technology: single cluster, online reconfiguration, 16 nodes, rolling upgrade
• HP-UX servers and HP's XP disk array family

[Diagram: Site A and Site B, each with an XP disk array linked by HP Continuous Access XP, plus arbitrator Site C.]