MICROSOFT SQL SERVER HIGH AVAILABILITY USING VMAX AND SRDF/METRO
VMAX® Engineering White Paper
ABSTRACT
With the introduction of the VMAX All Flash and VMAX SRDF/Metro remote high
availability, administrators have a new way to protect their applications effectively and
efficiently with unprecedented scale, data protection, and high availability. This white
paper discusses VMAX® SRDF/Metro functionality in the context of deploying, planning,
and protecting Microsoft SQL Server.
November 2016
The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect
to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
EMC2, EMC, the EMC logo, AppSync, HYPERMAX, PowerPath, SnapVX, SRDF, Unisphere, VMAX, and VMAX3 are registered
trademarks or trademarks of EMC Corporation in the United States and other countries. All other trademarks used herein are the
property of their respective owners. © Copyright 2016 EMC Corporation. All rights reserved. Published in the USA. 11/16 White Paper
H15529
EMC believes the information in this document is accurate as of its publication date. The information is subject to change without
notice.
EMC is now part of the Dell group of companies.
Part Number H15529
TABLE OF CONTENTS
EXECUTIVE SUMMARY
VMAX3 AND VMAX ALL FLASH OVERVIEW
VMAX SRDF REMOTE REPLICATIONS OVERVIEW
SRDF/METRO OVERVIEW
SQL SERVER FAILOVER CLUSTER USING SRDF/METRO TEST CASES
CONCLUSION
REFERENCES
APPENDIX
EXECUTIVE SUMMARY
The EMC® VMAX® family of storage systems is built on the strategy of simple, intelligent, modular storage, and incorporates a Dynamic
Virtual Matrix interface that connects and shares resources across all VMAX engines. The EMC VMAX3™ family encompasses three
new array models: VMAX 100K, VMAX 200K, and VMAX 400K. The VMAX3 family offers hybrid storage arrays with pre-defined service
levels for diverse application workloads that use Fully Automated Storage Tiering (FAST) to place the most active data in the flash tier.
FAST places the least active data in lower performance tiers according to the user-provided service levels. The VMAX family offers the
highest levels of performance, scale, availability, and disaster recovery (DR) with new hardware and software capabilities.
In 2016, Dell EMC announced brand new VMAX All Flash products: VMAX 250F1, VMAX 450F, and VMAX 850F. The new VMAX
architecture is designed to take advantage of the latest, most cost-efficient 3D NAND Flash drive technology. It features multi-
dimensional scale, large write-cache buffering, back-end write aggregation, high bandwidth, and low latency.
VMAX uses Symmetrix Remote Data Facility (SRDF) for remote replications. SRDF offers many DR topologies for Microsoft SQL
Server databases, including two, three, and four site replication. SRDF can replicate in the following modes:
Synchronous mode
Asynchronous mode
Cascaded mode
Zero-data-loss at any distance with SRDF/Star
SRDF/Metro (SRDF/M), which creates an active/active clustered topology across metro distances
With SRDF/M, reads are serviced from the local VMAX cache, and writes are sent synchronously to the remote array cache. As VMAX
cache is persistent, writes are acknowledged immediately upon being received in the remote cache, making SRDF/M an extremely fast
and performant solution for SQL Server databases running at either location, and allowing them to fail back and forth without any
changes to the storage layer. In the event of a true cluster partition, SRDF/M uses bias (the designated "winner") or a Witness to determine the
surviving location and automatically resumes I/O operations there without compromising data integrity or availability.
SQL Server applications can run on either side of the SRDF/M configuration in clustered or standalone configurations, offering the
highest level of scale, lower recovery time objective (RTO), and efficient resource utilization. Microsoft Windows Server Failover
Clustering (WSFC) can benefit from SRDF/M by having storage readily available for both reads and writes and in sync on all cluster
nodes at both active and standby sites. This improves the RTO for cluster resources in the event of failover. In addition, when using
SRDF/M for SQL Server with WSFC, active cluster nodes can have storage accessed from either side of the VMAX SRDF/M
configuration, which offers better utilization of resources and load balancing for SQL Server databases.
This white paper explains SRDF/M deployment and use cases with SQL Server failover clusters to offer improved RTO and ease of
management for cluster-based failover for SQL Server databases with active storage visibility for all cluster nodes.
AUDIENCE
This white paper is intended for database and system administrators, storage administrators, and system architects who are
responsible for implementing, managing, and maintaining SQL Server databases and VMAX storage systems. Readers should have
some familiarity with SQL Server and the VMAX family of storage arrays, and be interested in achieving higher database availability,
performance, and ease of storage management.
VMAX3 AND VMAX ALL FLASH OVERVIEW
All VMAX All Flash systems provide great simplicity for planning, ordering, and management and enable scale out using V-Bricks. V-Bricks
are composed of a VMAX engine and 53 TBu of capacity. You can scale V-Bricks with Flash Capacity Packs in increments of 13
TBu. VMAX All Flash systems can be ordered with a pre-packaged software bundle using the entry “F” package or the more
encompassing “FX” package. They also come standard with embedded EMC Unisphere® for VMAX management and monitoring.
VMAX All Flash systems provide better total cost ownership (TCO) because they use less floor space with high-capacity solid-state
drives (SSD). VMAX All Flash systems use less power than hard-disk-drive-based systems. Flash storage also provides consistent performance
whether the I/O profile is random, sequential, intermittent, or continuous. SQL Server databases can greatly benefit from both server-side
cache and flash storage. As large as the server-side cache is, database capacity is often larger. While frequently accessed data
fits in the database cache, there are always queries for less-frequently needed data. Finally, in a cluster, server-side cache is not
cumulative and each cluster node caches its own data regardless of others. VMAX All Flash is even faster, due to the high-capacity
VMAX cache. When the requested data is not in cache, flash storage enables I/O to complete quickly. VMAX All Flash complements the
use of database cache with faster I/O requests for blocks that are not already in cache. VMAX All Flash also provides a storage system
that allows for high performance, consolidation, and ease of data replication for backup, HA, or DR.

1 VMAX 450F and 850F were introduced in Q1 2016, and VMAX 250F was introduced in Q4 2016.
The VMAX All Flash systems also provide automatic, scheduled, and application consistent snapshots for SQL Server and other
applications for creating point-in-time copies for backup, reporting, or test/dev natively using EMC SnapVX™. EMC AppSync™
provides single-pane-of-glass management of application snapshots with tight integration between VMAX SnapVX, Microsoft
Volume Shadow Copy Service (VSS), and the SQL Server Virtual Device Interface (VDI).
Figure 1 shows the VMAX All Flash multi-dimensional scalability.
Figure 1. VMAX All Flash multi-dimensional scalability
Key VMAX All Flash and VMAX3 family of storage system benefits for SQL Server databases include:
Large, dynamic random-access memory (DRAM)-based cache enables all writes, including SQL Server writes, checkpoints, or
batch-loads, to complete extremely fast. The VMAX cache is large and mirrored, which allows for write buffering. As a result, writes
to VMAX are faster than writes to other flash arrays that write directly to SSD.
The Write Folding feature of VMAX reduces the number of writes to the SSDs, which helps to extend the life of these drives. This
feature also helps to improve I/O performance on database updates, as the data is still resident in cache.
The FlashBoost feature helps to improve the response time on random read misses. With FlashBoost, small-block read I/Os
bypass the VMAX cache as the data is transferred directly from back-end (SSD) to front-end (host ports). The I/O is then staged in
the VMAX cache for future access.
TimeFinder® SnapVX provides storage-consistent snapshots. SnapVX offers new levels of scale, efficiency, and simplicity. It uses
redirect-on-write for added performance and pointer-based management for dedup-like data reduction. In addition, snapshots have
names, versions, dates, and automatic expirations. Snapshots are protected, so they can be re-used regardless of changes by the
application. They can also cascade any number of times. With SnapVX, SQL Server database replicas can be created for
purposes such as gold copies, test/dev environment creation, or backups. Snapshots can be restored in seconds, and read/write
access to their data is always immediate.
SRDF offers many DR topologies for SQL Server databases, including two, three, and four sites. SRDF can replicate in different
modes to offer a variety of HA and DR solutions, including fast recovery from a remote copy, backup offload, and more.
SRDF/Metro (SRDF/M) extends core SRDF functionality to offer active/active access of storage devices by presenting devices on
local and remote VMAX systems with the same identity to the clustered hosts for active/active data access at synchronous
distances. Cluster aware applications can leverage SRDF/M for HA and load balancing. This also improves the application
recovery point objective (RPO) in the event of failover as storage is readily available on the surviving site.
With AppSync, you can manage the protection, replication, and cloning of applications and databases. AppSync automatically
discovers application databases, learns the database structure, and maps it through the virtualization layer to the underlying
storage LUN. Application owners and database administrators can ensure they get the right copies to the right stakeholders like
analytics, test, and development without worrying about underlying storage intricacies.
T10-DIF (Data Integrity Field) is a core feature of VMAX that benefits all user data from the moment it enters the storage system.
T10-DIF offers data integrity checks as user data travels among the front end, cache, back end, and disk. T10-DIF is also used by
SnapVX and SRDF replications. T10-DIF does not have a performance overhead as it is part of the VMAX architecture.
Note: Unless explicitly stated, the use cases described in this paper are applicable to both VMAX All Flash and VMAX3
storage systems. From this point forward, this paper uses “VMAX family” to refer to both VMAX All Flash and VMAX3
storage systems.
VMAX SRDF REMOTE REPLICATIONS OVERVIEW
The SRDF family of software is the gold standard for remote replications in mission-critical environments. Built for the industry leading,
high-end VMAX storage array, the SRDF family is trusted for DR and business continuity (BC). SRDF offers a variety of replication
modes that can be combined in different topologies, including two, three, and even four sites. SRDF and TimeFinder are closely
integrated to offer a combined solution for local and remote replications.
Built on an industry standard, data protection solution, SRDF/M extends the performance and scale of VMAX storage systems to the
remote site by offering active/active device access to clustered or non-clustered hosts. SRDF/M does this by presenting the remote
devices with the same identity (geometry and world wide name, or WWN) to hosts and keeping them synchronized for HA. Applications
can access storage from either site simultaneously for load balancing, active availability, and quick RPO in the event of a failover.
WSFC and SQL Server failover cluster work well with SRDF/M for HA and quick RTO for clustered resources.
Some of the main SRDF capabilities include:
SRDF modes of operation:
o SRDF Synchronous (SRDF/S) mode, which is used to create a solution with no data loss of committed transactions. The
target devices are an exact copy of the source devices (Production).
o SRDF Asynchronous (SRDF/A) mode, which is used to create consistent replicas at unlimited distances without write
response time penalty to the application. The target devices are typically seconds to minutes behind the source devices
(Production), though consistent (“restartable”).
o SRDF Adaptive Copy (SRDF/ACP) mode, which allows bulk transfers of data between source and target devices without
write-order fidelity and without write performance impact to source devices. SRDF/ACP is typically used for data
migrations as a point-in-time data transfer. It is also used to catch up after a long period where replication was suspended
and many changes are owed to the remote site. SRDF/ACP can be set to continuously send changes in bulk until the
delta between the source and target is reduced to a specified “skew.” At this time, SRDF/S or SRDF/A mode can resume.
SRDF groups:
o An SRDF group is a collection of matching devices in two VMAX storage systems together with the SRDF ports used to
replicate these devices between the arrays. HYPERMAX OS allows up to 250 SRDF groups per SRDF director. The
source devices in the SRDF group are called R1 devices, and the target devices are called R2 devices.
o SRDF operations are performed on a group of devices contained in the SRDF group. This group is defined by using
either a text file specifying the list of devices, a device group (DG), a composite/consistency group (CG), or a storage
group (SG). The recommended way is to use a storage group.
SRDF consistency:
o An SRDF CG is an SRDF group for which consistency has been enabled.
o Consistency can be enabled for either synchronous or asynchronous replication mode.
o An SRDF CG always maintains write-order fidelity (also called dependent-write consistency) to ensure that the target
devices always provide a restartable replica of the source application.
Note: Even when consistency is enabled, the remote devices may not yet be consistent while SRDF is in the
sync-in-progress state. This occurs during SRDF initial synchronization, before SRDF enters a consistent
replication state.
o SRDF consistency also implies that if a single device in a CG cannot replicate, then the whole group stops replicating to
preserve the target devices' consistency.
o Multiple SRDF groups set in SRDF/A mode can be combined within a single array or across arrays. Such grouping of
consistency groups is called multi-session consistency (MSC). MSC maintains dependent-write consistent replications
across all participating SRDF groups.
SRDF sessions:
o An SRDF session is created when replication starts between R1 and R2 devices in an SRDF group.
o The SRDF session can establish replication from R1 to R2 devices. Only on the first establish do R1 and R2 devices need
a full copy. Any subsequent establish (for example, after an SRDF split or suspend) is incremental, passing only changed
data.
o The SRDF session can restore the content of R2 devices back to R1. Restore is incremental, moving only changed data
across the links. TimeFinder and SRDF can restore in parallel (for example, bring back a remote backup image).
o During replications, the devices to which data is replicated are write-disabled (read-only).
o The SRDF session can be suspended, temporarily halting replication until a resume command is issued.
o The SRDF session can be split, which not only suspends the replication, but also makes the R2 devices read-writable.
o An SRDF checkpoint command does not return the prompt until the content of the R1 devices has reached the R2
devices. This option helps create remote database backups in SRDF/A mode.
o An SRDF swap changes R1 and R2 personality, and the replication direction for the session.
o An SRDF failover makes the R2 devices writable. R1 devices, if still accessible, change to Write_Disabled (read-only).
The SRDF session is suspended and application operations can go on the R2 devices.
o An SRDF failback copies changed data from R2 devices back to R1, and makes the R1 devices writable. R2 devices are
made Write_Disabled (read-only).
o SRDF replication sessions can go in either direction (bi-directional) between the two arrays, where different SRDF groups
can replicate in different directions.
o For more information about SRDF, refer to the VMAX Product Guide.
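The session operations above map to Solutions Enabler (SE CLI) commands. The following is a sketch only, assuming SE 8.x storage-group syntax; the array ID (0572), SRDF group number (10), and storage group name (sql_sg) are placeholders, and exact options should be verified against the SE documentation for your release.

```shell
# Illustrative SE CLI session operations against placeholder objects.
symrdf -sid 0572 -sg sql_sg -rdfg 10 set mode sync   # switch to SRDF/S
symrdf -sid 0572 -sg sql_sg -rdfg 10 establish       # start or resume replication (incremental after the first time)
symrdf -sid 0572 -sg sql_sg -rdfg 10 suspend         # temporarily halt replication
symrdf -sid 0572 -sg sql_sg -rdfg 10 split           # suspend and make R2 devices read/write
symrdf -sid 0572 -sg sql_sg -rdfg 10 failover        # make R2 writable after an R1-side outage
symrdf -sid 0572 -sg sql_sg -rdfg 10 failback        # copy changes back and resume operations on R1
symrdf -sid 0572 -sg sql_sg -rdfg 10 swap            # exchange R1/R2 personalities and replication direction
```

These commands act on all device pairs in the storage group, consistent with the group-based operation model described above.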
SRDF/METRO OVERVIEW
SRDF/Metro (SRDF/M) is available on the VMAX All Flash and VMAX3 storage systems. SRDF/M provides active/active access to the
R1 and R2 devices in an SRDF configuration. In traditional SRDF configurations, R1 devices are read/write accessible, while R2
devices are read only/ write disabled. In SRDF/M configurations, both the R1 and R2 devices are read/write accessible. The R2 device
takes on the personality of the R1 device in terms of geometry, and most importantly, the WWN. By sharing a WWN, the R1 and R2
devices appear as a shared virtual device across the two VMAX storage systems for host presentation. A host or typically multiple hosts
in a cluster can read and write to both devices. SRDF/M ensures that each copy remains current and consistent and addresses any
write conflicts that might arise. A diagram of the feature is shown in Figure 2.
Figure 2. SRDF/Metro Overview
SRDF/M for Microsoft failover clusters is supported on VMAX storage systems running HYPERMAX OS 5977 Q1 2016 SR2. This
feature provides the following benefits:
An HA solution at Metro distances by using and extending SRDF/S functionality.
The R2 device is read/write accessible to the host, and both sides of the SRDF device pair appear to one or more hosts as the
same device.
Active/active replication capabilities on both the source and target sites.
Witness support to enable full HA, resiliency, and seamless failover.
SRDF/METRO RESILIENCY
SRDF/M maintains consistency between the R1 and R2 devices during normal operation. If, however, a device or devices move to the
not ready (NR) state or connectivity is lost between the arrays, SRDF/M selects one side of the environment, known as the “winner,”
and makes the other side inaccessible to one or more hosts. There are two ways that SRDF/M can determine a winner: bias or
Witness. The bias or Witness prevents any data inconsistencies that might result from the two arrays being unable to communicate.
SRDF/Metro bias
With or without Witness, bias is a required component of SRDF/M. Witness builds on the bias functionality; in essence, bias becomes
the failsafe if Witness is unavailable or fails. The initial createpair operation of SRDF/M assigns bias to the R1 site, though it is possible
to change it to the R2 site after initial synchronization. Note that changing the bias turns the R2 site into the R1 site. In the event of a
failure, SRDF/M makes the non-biased side inaccessible to one or more hosts, while the bias site (R1) survives. Bias is denoted by the
state of ActiveBias on a device pair or SRDF group, as shown in Figure 3. Note that the bias location is Local, meaning this is the R1
site.
If the bias site (R1) experiences the failure, the entire SRDF/M cluster becomes unavailable to the hosts and requires user intervention
to rectify. To avoid these types of failures, EMC offers the Witness.
2 See E-Lab™ for the currently supported hosts and multi-pathing software, along with any additional ePacks required on top of the Q3 2015 SR release.
The E-Lab Interoperability Navigator can be found at: https://elabnavigator.emc.com.
Figure 3. SRDF/Metro ActiveBias configuration
Alternatively, if the groups are added after synchronization, at the next re-establish, the Witness takes effect. It is also possible to
configure multiple Witnesses if multiple arrays are available. In such cases, SRDF/M handles the use of multiple Witnesses so that if
the initial one fails, no user intervention is required to enable a secondary one.
Note: As this white paper does not detail all failure scenarios of an SRDF/M configuration, refer to EMC VMAX3
SRDF/Metro Overview and Best Practices for more information.
SRDF/Metro Witness
As noted previously, SRDF/M by default uses site bias to define how a site or link failure should be handled in an
SRDF/M configuration. If two clusters lose contact, the bias defines which cluster continues operation and which suspends I/O. Bias is
defined at the SRDF group level. The use of bias to control which site is a winner, however, adds unnecessary complexity in case of a
site failure since it may be necessary to manually intervene to resume I/O to the surviving site. SRDF/M has the capability to use a
Witness, instead of bias. The Witness is an external arbiter running on a separate VMAX or VMAX3 system in a physical witness configuration.
The Witness can also be a VMware virtual machine (vWitness) with the same functionality as the physical witness, except that it is packaged
to run in a virtual appliance. The use of the Witness supersedes the bias functionality. If the SRDF groups to the Witness are present
before the createpair command is executed, one or more device pairs automatically enter a Witness Protected state upon
synchronization and the state is ActiveActive.
Multiple virtual witnesses can be configured. The Witness VMAX can reside in a failure domain physically separate from either VMAX storage
system in the metro configuration, if desired. The Array Witness, if configured, takes precedence in arbitration for the surviving site.
Witness provides the following features:
Active/active use of both data centers
HA for applications (no single points of storage failure, auto-restart)
Fully automatic failure handling
Load balancing and better resource utilization
Lower capital expenditures and lower operational expenditures as a result
Note: The fault domain is decided by the customer and can range from different racks in the same data center all the way up to 5
ms of distance from each SRDF/M cluster (5 ms measured latency, a typical synchronous distance).
A configuration that uses SRDF/M with Witness allows both sides to provide coherent read/write access to the same volume. This
means that on the remote site, the paths are up and the storage is available even before any failover happens. SRDF/M combined with
host failover clustering technologies such as WSFC provides fully automatic application restart for any site-level disaster. The system
rides through component failures within a site, including the failure of an entire array.
In the event of a complete site failure, as shown in Figure 4, Witness automatically assigns the R2 array as the winner, rather than
following the R1 site bias. VMware HA detects the failure of the virtual machines and restarts the virtual machines automatically at the
surviving site with no user intervention.
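The arbitration rules described above can be summarized as a small decision function. This is a simplified illustrative model of winner selection, not the actual HYPERMAX OS algorithm; the function name and inputs are hypothetical.

```shell
# Simplified model of SRDF/Metro winner selection during a partition.
# Inputs: witness reachable (yes/no), bias (R1) side up, non-bias (R2) side up.
pick_winner() {
  local witness=$1 r1_up=$2 r2_up=$3
  if [ "$witness" = yes ]; then
    # Witness arbitration: a surviving side keeps servicing I/O,
    # even when it is the non-bias (R2) side.
    if [ "$r1_up" = yes ]; then echo R1
    elif [ "$r2_up" = yes ]; then echo R2
    else echo none; fi
  else
    # Bias fallback: only the bias (R1) side may continue; if R1 is
    # down, the SRDF/M cluster requires user intervention.
    if [ "$r1_up" = yes ]; then echo R1; else echo none; fi
  fi
}

pick_winner yes no yes   # Witness present, R1 failed -> prints R2
pick_winner no no yes    # no Witness, R1 failed -> prints none
```

This makes the value of the Witness concrete: without it, an R1-side failure leaves no automatic survivor, which is exactly the case Witness was introduced to handle.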
Figure 4. SC with SRDF/Metro Witness
SRDF/METRO AND SRDF WITNESS FAILURE HANDLING
There are EMC documents available that cover all the various failure scenarios that can impact an SRDF/M environment. As this paper
does not specifically address those scenarios, see the References section for more information.
SRDF/METRO SUPPORT MATRIX
The SRDF/M (active mode) support matrix is actively updated with new host environments, qualifications, inter-operability, and
information regarding enhancements to base products. Refer to SRDF/Metro simple support matrix for the most up-to-date information
pertaining to your environment. This matrix also discusses some specific requirements for SRDF/M configuration with WSFC services.
SRDF/METRO STORAGE CONFIGURATION
SRDF/M can be configured using the Solutions Enabler CLI (SE CLI) or Unisphere for VMAX. EMC recommends using Unisphere to
configure and manage SRDF/M. The Appendix describes configuring SRDF/M using Unisphere in more detail, along with corresponding
screen shots, and also covers SRDF/M configuration using SE CLI.
To set up the SRDF/M configuration:
1. Set up the SRDF connection between R1 and R2 arrays.
2. Create storage groups on R1 and R2 sites.
3. Create SRDF groups and use them to create SRDF device pairs across those storage groups.
4. Establish SRDF/M relationships and synchronize both R1 and R2 storage groups.
5. Incorporate Witness as needed using Witness or vWitness capabilities.
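As a sketch, steps 3 and 4 above might look like the following in SE CLI. This assumes SE 8.x syntax; the array ID, SRDF group number, and storage group names are hypothetical, and the exact flags should be confirmed against the SE documentation for your release.

```shell
# 3. Create SRDF/Metro device pairs between existing storage groups
#    (assumes SRDF group 10 already connects the two arrays).
symrdf createpair -sid 0572 -rdfg 10 -sg sql_fc_sg -remote_sg sql_fc_sg -metro -establish

# 4. Monitor the pairs until the state reports ActiveActive (Witness) or ActiveBias.
symrdf -sid 0572 -rdfg 10 -sg sql_fc_sg query
```

Once the pair state is ActiveActive or ActiveBias, both sides present the same device identity and the cluster hosts can be zoned and masked to either array.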
SRDF/METRO CONFIGURATION FOR SQL SERVER FAILOVER CLUSTERS
WSFC is a group of independent servers that work together to increase the availability of applications and services. SQL Server 2016
takes advantage of WSFC services and capabilities to support SQL Server failover cluster.
SRDF/M enables Read/Write storage access for all nodes of WSFC to improve RTO for the SQL Server failover cluster, even prior to
actual failover. Active cluster nodes can also access either side of the SRDF/M configuration, which offers better utilization of resources
and load balancing for SQL Server databases.
To configure SQL Server failover cluster using SRDF/M:
1. Set up WSFC using multi-path enabled nodes.
2. Provision VMAX devices for failover cluster needs on both R1 and R2 sites, and set up SRDF/M using the storage groups with full synchronization
of both storage groups.
3. Use multi-pathing solutions like Windows MPIO or EMC PowerPath® to support cross connect. This ensures storage visibility to all failover cluster
nodes from both SRDF/M managed storage groups. Multi-pathing solutions should use configurations that allow load balancing across all available
paths for optimal performance. Refer to Windows MPIO KB article for more information about Windows MPIO configuration. When using
PowerPath, the default VMAX path policy is suitable for most configurations.
4. Set up Witness or vWitness to detect cluster split-brain situations.
5. Discover the disks using Windows disk manager. Bring the disks online and format them.
6. Add the disks as cluster resources.
7. Set up SQL Server failover cluster on the clustered disks.
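Steps 3, 5, and 6 above can be scripted on each cluster node from an elevated prompt. The following is a sketch using standard Windows tooling (mpclaim and the FailoverClusters PowerShell module); the load-balance policy value is an assumption to validate against your multi-pathing documentation and the SRDF/Metro support matrix.

```shell
# 3. Set the default MPIO load-balance policy to round robin (policy 2)
#    so I/O is spread across all available paths to both arrays.
mpclaim -L -M 2

# 5-6. Bring eligible shared disks into the cluster as resources.
powershell -Command "Get-ClusterAvailableDisk | Add-ClusterDisk"
```

When PowerPath is used instead of native MPIO, the default VMAX path policy noted above generally makes the mpclaim step unnecessary.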
Note: The example here shows WSFC, but SRDF/M can also be used for SQL Server nodes running on either side of the
configuration for load balancing and better utilization of resources.
Figure 5 shows the configuration.
Figure 5. SQL Server Failover Cluster using SRDF/Metro
SQL SERVER FAILOVER CLUSTER USING SRDF/METRO TEST CASES
This section presents the SQL Server failover cluster using SRDF/M test cases.
HARDWARE AND SOFTWARE CONFIGURATION
The test environment consists of two Cisco C240M3 servers in a WSFC configuration with storage provisioned from two VMAX 850F
arrays. SRDF/M is configured between the SQL Server failover cluster storage groups. The online transaction processing (OLTP)
workload is run on the active cluster node before and after failover.
Table 1. Test case configuration
Category        | Type                                                          | Quantity/Size                  | Version/Release
Storage system  | EMC VMAX 850F                                                 | 2 x V-Brick (R1: 572, R2: 569) | HYPERMAX OS 5977
Database server | Cisco UCS C240-M3, supporting Microsoft Windows Failover Cluster | 2 servers                  | Microsoft Windows Server 2012
SQL Server      | SQL Server failover cluster instances on stand-alone servers | 2 SQL Servers                  | Microsoft SQL Server 2016
TEST CASES
TEST 1: RUNNING SQL SERVER WORKLOAD ON R1 SIDE OF SRDF/METRO CONFIGURATION
TEST OBJECTIVES
This test shows a scenario with a SQL Server database OLTP workload running on the SRDF/M R1 side with consistent low latency and a
high transaction rate while maintaining remote HA with active availability of the storage.
TEST CONFIGURATION
We ran a single SQL Server database OLTP workload on a 128 GB C240-M3 server with a SQL Server buffer pool size of 4 GB. SRDF/M
HA for VMAX storage devices was maintained to ensure low RTO in the event of failover.
TEST RESULTS
As shown in Figure 6, SRDF/M maintained consistent IOPS for the application at <1 ms latency. At the same time, SQL Server
maintained a consistent transaction rate of 540K while maintaining HA of storage devices at the remote site, as shown in Figure 7.
Figure 6. VMAX Performance Profile for SQL Server workload running on R1 side before failover
Figure 7. SQL Server transaction rate for workload running on R1 side before failover
TEST 2: RUNNING SQL SERVER WORKLOAD ON REMOTE SIDE OF SRDF/METRO CONFIGURATION
TEST OBJECTIVES
This test shows a scenario with a SQL Server database OLTP workload running on the SRDF/M remote side (R2) after cluster failover, with
consistent low latency and a high transaction rate while maintaining HA with active availability of the storage.
TEST CONFIGURATION
The SQL Server resource group on WSFC was failed over to another node using a move operation in WSFC Manager, and the application
was restarted. A workload identical to that in Test Case 1 was run on the new owner node, and performance metrics were
collected.
TEST RESULTS
As shown in Figure 8, VMAX SRDF/M supported the workload on the remote site after cluster
failover with <1 ms latency and similar IOPS. At the same time, SQL Server maintained a consistent transaction rate of 540K on the
new owner node, as shown in Figure 9.
Figure 8. VMAX Performance Profile for SQL Server workload running on R2 side after failover
Figure 9. SQL Server transaction rate for workload running on R2 side after failover
CONCLUSION
SRDF/M running HYPERMAX OS is an enterprise-class technology that dissolves distance by providing active/active access to
dispersed VMAX All Flash and VMAX3 storage systems, enhancing availability and mobility. Using SRDF/M with WSFC and SQL
Server failover cluster technologies provides new levels of availability suitable for the most mission-critical environments without
compromise.
These technologies provide the basis by which a customer can ensure HA at both the hardware and software level – through the nature
of SRDF/M and WSFC. With active storage availability from both sites on SRDF/M clusters, applications can run on local and remote
sites simultaneously, improving resource utilization and offering load balancing across the distance.
REFERENCES
The following documentation provides additional and relevant information. Access to these documents depends on your login
credentials. If you do not have access to a document, contact your EMC representative.
VMAX and SRDF Product Guide
EMC Simple Support Matrix (includes SRDF and requirements for Microsoft failover clusters)
APPENDIX
CONFIGURATION OF SRDF/METRO USING UNISPHERE
This section covers the configuration of SRDF/M using Unisphere. The following objects are assumed to exist as their creation is
independent of SRDF/M:
Devices on each site created and placed in a single SG on each array
Initiator groups for each WSFC node on each site
Port groups for each site
Masking views for cluster nodes on each site
SETTING UP SRDF CONNECTION AND CREATING SRDF GROUP
If both R1 and R2 VMAX systems are not visible to Unisphere, which is the case when the VMAX eManagement guest is used as the
Unisphere server, the SRDF connection must be established between the local and remote VMAX systems. SAN connectivity between
both VMAX systems must be in place prior to running this task so that the remote VMAX system can be discovered.
To set up the SRDF connection:
1. From Replication Groups and Pools, select SRDF and choose Create SRDF Connection, as shown in Figure 10.
2. Enter the SRDF group label and number, and select the local VMAX RDF ports.
3. Scan for the remote SRDF VMAX system, enter the SRDF group number, and choose the SRDF ports on that VMAX system, as shown in Figure 11
and Figure 12.
Figure 10. Setting up SRDF connection
Figure 11. SRDF connection R1 side
Figure 12. SRDF connection R2 side
CREATING SRDF/METRO GROUP OR WITNESS GROUP
Once the SRDF connection is established between R1 and R2 VMAX systems, the SRDF group can be created using the wizard.
To configure the SRDF/M groups:
1. Select the local and remote VMAX systems and corresponding SRDF ports, as shown in Figure 13.
2. Type the SRDF group label and number for both VMAX systems.
The Witness group can also be created using the same wizard by selecting the checkbox for SRDF/Metro witness group.
Figure 13. SRDF/Metro group creation wizard
CREATING SRDF/METRO DEVICE PAIRS
Once the SRDF/M group is created, devices can be added to the group.
To configure SRDF/M device pairs:
1. Select the SRDF group and select Create Pairs, as shown in Figure 14.
2. Identify the R1 side of the SRDF mirror to use as the source device for SRDF replication. Select Active for the SRDF Mode.
3. Specify the starting device IDs for the R1 and R2 sides and the number of volumes to include in the SRDF/M group. Additional or
non-sequential devices can also be added separately.
4. If the devices are not already in the storage group, choose R1 and R2 storage groups to add the devices for application provisioning. By default, a
subsequent SRDF/M storage group establish operation invalidates R2 devices. This can be changed using the Advanced option. Devices can also
be formatted to delete existing data prior to application provisioning.
Figure 14. SRDF/Metro device pair creation
SYNCHRONIZING SRDF/METRO DEVICE PAIRS FOR ACTIVE/ACTIVE AVAILABILITY
Once SRDF/M device pairs are created, the SRDF/M group remains in the Suspended state.
To establish SRDF/M groups for active availability:
1. Select the SRDF/Metro storage group and choose Establish, as shown in Figure 15.
2. Choose Incremental or Full to synchronize device tracks from R1 to R2, which invalidates all existing tracks on R2. When the establish
completes, the storage group is actively available on both sides of the SRDF/M group, as shown in Figure 16.
Figure 15. SRDF/Metro storage group establish for active availability
Figure 16. SRDF/Metro storage group with active availability
SRDF/METRO WITNESS CONFIGURATION
This section covers the Witness configuration for the SRDF/M group.
SETTING UP SRDF/METRO WITNESS CONFIGURATION
Witness configuration requires creating SRDF groups from the R1 and R2 VMAX systems to a third VMAX system in a separate failure
domain. Complete the steps in the SRDF/Metro group creation section and select the SRDF/Metro Witness group checkbox to designate
the new group as a Witness group.
SETTING UP SRDF/METRO VWITNESS CONFIGURATION
To configure SRDF/M vWitness:
1. Select the SRDF Virtual Witnesses configuration wizard, as shown in Figure 17.
2. Add vWitness by providing the name, IP address, or DNS-based hostname information for the virtual machine, as shown in Figure 18.
3. Review the vWitness configuration and state, as shown in Figure 19.
Figure 17. Configuration of vWitness
Figure 18. vWitness IP configuration
Figure 19. vWitness configuration and state
WINDOWS SERVER FAILOVER CLUSTER SETUP USING SRDF/METRO
To configure WSFC using SRDF/M active availability after the SRDF/M group is created:
1. Set up cross connect for SRDF/Metro group device visibility to all cluster nodes by creating masking views on the R1 and R2 arrays, as shown
in Figure 20 and Figure 21.
2. Rescan the disks from Windows Disk Management and bring them online, as shown in Figure 22.
3. In WSFC Manager, browse to Storage and add the discovered disks to the cluster, as shown in Figure 23.
4. In Disk Management, partition the disks and assign drive letters or mount the volumes, as shown in Figure 24.
5. Install the SQL Server failover cluster using the clustered disks and set up the SQL Server role, as shown in Figure 25.
CROSS CONNECT CONFIGURATION FOR VISIBILITY TO ALL SRDF/METRO STORAGE GROUP DEVICES
WSFC requires visibility to all the devices to detect split-brain situations. The SRDF/M storage group devices on both the R1 and R2
arrays should be provisioned to all nodes of the cluster. As shown in Figure 20 and Figure 21, masking views are created so that both
cluster nodes have the devices visible to them.
Figure 20. SRDF/Metro cross connect R1 site setup
Figure 21. SRDF/Metro cross connect R2 site setup
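The cross-connect masking views can equivalently be created from the Solutions Enabler CLI. This is an illustrative sketch; the WWNs and group names are placeholders, and the key point is that the initiator group contains the HBAs of all cluster nodes so that every node sees the devices from both sites.

```shell
# Illustrative sketch: WWNs and object names are placeholders.
# On the R1 array (SID 572), expose the SRDF/Metro storage group to an
# initiator group containing the HBAs of ALL cluster nodes (cross connect).
symaccess -sid 572 create -name WSFC_AllNodes_IG -type initiator
symaccess -sid 572 -name WSFC_AllNodes_IG -type initiator add -wwn 10000000c9000001
symaccess -sid 572 -name WSFC_AllNodes_IG -type initiator add -wwn 10000000c9000002
symaccess -sid 572 create view -name WSFC_R1_MV -sg DSIB1118_9_SQLFC_SG \
  -ig WSFC_AllNodes_IG -pg Site1_PG

# Repeat on the R2 array (SID 569) with its own port group so the same
# cluster nodes also see the R2 copies of the devices.
```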
Figure 22. Windows server disk management for cluster disks
Figure 23. Add disks to cluster using Windows server failover cluster manager
Figure 24. Partition the disks and assign drive letter or mount the volume
Figure 25. SQL Server failover role created using cluster disks
SQL SERVER FAILOVER MANAGEMENT
This section covers various operations for SQL Server failover management using WSFC Manager. Because the storage is accessible to
all cluster nodes, planned failover simply requires moving the ownership of the cluster role. In the event of an unplanned failover,
cluster resources are restarted on the surviving site.
1. As shown in Figure 26, ensure that the SQL Server role has proper node ownership identifying nodes on the R1 and R2 site.
2. As shown in Figure 27, select the node to move the cluster role to.
Figure 26. SQL server role ownership for SRDF/Metro
Figure 27. Moving the cluster role
CONFIGURATION OF SRDF/METRO USING SOLUTIONS ENABLER CLI
SAN DISCOVERY FOR AVAILABLE VMAX ARRAYS
# symsan list -sanrdf -sid 569 -dir ALL
Symmetrix ID: 000197200569
               Flags                 Remote
        -----------   ------------------------------------
         Dir Prt Lnk
 Dir:P   CS  S   S    Symmetrix ID  Dir:P   WWN
 ------  --- --- ---  ------------  ------  ----------------
 01F:08  GO  O   C    000197200572  01F:08  500009737808F008
 01F:08  GO  O   C    000197200572  02F:08  500009737808F048
 01F:09  GO  -   I    -             -       0000000000000000
Legend:
Director:
(C)onfig : S = Fibre-Switched, H = Fibre-Hub
G = GIGE, - = N/A
(S)tatus : O = Online, F = Offline, D = Dead, - = N/A
Port:
(S)tatus : O = Online, F = Offline, - = N/A
Link:
(S)tatus : C = Connected, P = ConnectInProg
D = Disconnected, I = Incomplete, - = N/A
CREATING SRDF/METRO GROUP OR WITNESS GROUP
# symrdf -sid 572 addgrp -label SQL_M -rdfg 100 -dir 1F:08,2F:08 -remote_rdfg 100 -remote_sid 569 -
remote_dir 1F:08,2F:08 -nop {-witness}
Successfully Added Dynamic RDF Group 'SQL_M' for Symm: 000197200572
CREATING SRDF/METRO DEVICE PAIRS USING THE STORAGE GROUPS
# symrdf -sid 572 -sg DSIB1118_9_SQLFC_SG -rdfg 100 createpair -type R1 -remote_sg
DSIB1118_9_SQLFC_SG -rdf_metro -invalidate R2 -nop
An RDF 'Create Pair' operation execution is
in progress for storage group 'DSIB1118_9_SQLFC_SG'. Please wait...
Create RDF Pair in (0572,100)....................................Started.
Create RDF Pair in (0572,100)....................................Done.
Mark target device(s) in (0572,100) for full copy from source....Started.
Devices: 0059-0064 in (0572,100).................................Marked.
Mark target device(s) in (0572,100) for full copy from source....Done.
The RDF 'Create Pair' operation successfully executed for
storage group 'DSIB1118_9_SQLFC_SG'.
ESTABLISHING SRDF/METRO GROUP
# symrdf -sid 572 -sg DSIB1118_9_SQLFC_SG -rdfg 100 establish -nop
An RDF 'Incremental Establish' operation execution is
in progress for storage group 'DSIB1118_9_SQLFC_SG'. Please wait...
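The progress of the establish can be monitored from the same CLI. A hedged sketch follows; the interval and count values are illustrative.

```shell
# Query the SRDF/Metro storage group every 30 seconds, up to 10 times,
# to watch the pair states move from SyncInProg to an Active state.
symrdf -sid 572 -sg DSIB1118_9_SQLFC_SG -rdfg 100 query -i 30 -c 10
```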
CHECKING STATE OF SRDF/METRO GROUP
# symrdf -sid 572 list -rdf_metro
Symmetrix ID: 000197200572
Local Device View
---------------------------------------------------------------------------
                          STATUS     MODES          RDF  S T A T E S
 Sym    Sym    RDF       ---------  -----  R1 Inv   R2 Inv ----------------------
 Dev    RDev   Typ:G     SA RA LNK  MDATE  Tracks   Tracks  Dev RDev  Pair
 -----  -----  --------  ---------  -----  -------  ------  --- ----  -------------
 00064  0009C  R1:100    RW RW RW   T..1.  0        0       RW  RW    ActiveBias
 00065  0009D  R1:100    RW RW RW   T..1.  0        0       RW  RW    ActiveBias

 Total                                     -------  ------
   Track(s)                                0        0
   MB(s)                                   0.0      0.0
Legend for MODES:
M(ode of Operation) : A = Async, S = Sync, E = Semi-sync, C = Adaptive Copy
: M = Mixed, T = Active
D(omino) : X = Enabled, . = Disabled
A(daptive Copy) : D = Disk Mode, W = WP Mode, . = ACp off
(Mirror) T(ype) : 1 = R1, 2 = R2
(Consistency) E(xempt): X = Enabled, . = Disabled, M = Mixed, - = N/A
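The list output above lends itself to scripted health checks. The sketch below (assuming the column layout shown above) parses the device rows and reports OK only when every pair is in an Active state with zero invalid tracks; the sample rows are embedded for illustration, and in practice the symrdf command itself would be piped in.

```shell
#!/bin/sh
# Sketch: verify that all SRDF/Metro pairs in `symrdf list -rdf_metro`
# output are Active with zero invalid tracks. Device rows have 12 fields:
# Dev RDev Typ:G SA RA LNK MDATE R1InvTrk R2InvTrk Dev RDev PairState
check_metro_state() {
  awk '
    NF == 12 && $1 ~ /^[0-9A-F]+$/ {
      pairs++
      if ($12 !~ /^Active/) bad++        # pair state must be Active*
      if ($8 != 0 || $9 != 0) bad++      # no invalid tracks on R1 or R2
    }
    END {
      if (pairs > 0 && bad == 0) print "OK"; else print "NOT-READY"
    }
  '
}

# Sample rows from the output above; in practice use:
#   symrdf -sid 572 list -rdf_metro | check_metro_state
result=$(check_metro_state <<'EOF'
00064 0009C R1:100 RW RW RW T..1. 0 0 RW RW ActiveBias
00065 0009D R1:100 RW RW RW T..1. 0 0 RW RW ActiveBias
EOF
)
echo "$result"   # prints "OK"
```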