EMC Business Continuity for Microsoft SQL Server 2008
Enabled by EMC Symmetrix V-Max with
SRDF/CE, EMC Replication Manager, and Enterprise Flash Drives
Proven Solution Guide
Copyright © 2010 EMC Corporation. All rights reserved.
Published January, 2010
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.
Benchmark results are highly dependent upon workload, specific application requirements, and system design and implementation. Relative system performance will vary as a result of these and other factors. Therefore, this workload should not be used as a substitute for a specific customer application benchmark when critical capacity planning and/or product evaluation decisions are contemplated.
All performance data contained in this report was obtained in a rigorously controlled environment. Results obtained in other operating environments may vary significantly. EMC Corporation does not warrant or represent that a user can or will achieve similar performance expressed in transactions per minute. No warranty of system performance or price/performance is expressed or implied in this document.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners.
Part number: H6574
Table of Contents
Chapter 1: About this Document
  Overview
  Audience and purpose
  Scope
  Business challenge
  Technology solution
  Objectives
  Reference architecture
  Validated environment profile
  Hardware and software resources
  Prerequisites and supporting documentation
  Terminology
Chapter 2: SQL Server 2008 on the Symmetrix V-Max Design Overview
  Overview
  Server architecture
  Key elements of the storage design layout
  Best practices for storage design
  Storage design layout
Chapter 3: Disaster Recovery Design
  Overview
  Deploying Windows 2008 failover clustering and SRDF/CE in synchronous mode
  Production site protection
  DR site protection
Chapter 4: Replication Management and Design
  Overview
  Replication Manager design
  TimeFinder/Snap and TimeFinder/Clone design
Chapter 5: Storage Optimization
  Overview
  EFDs
  Storage tiering
Chapter 6: Test and Validation
  Overview
  Section A: Testing methodology
    Overview
    Generating the workload for testing
    Database components and test configuration
  Section B: Baseline performance summary
    Overview
    Baseline performance test profile
    Baseline performance test results
  Section C: OLTP application migration test results summary and recommendations
    Overview
    Summary of test results
    Recommendations
  Section D: Storage tiering test results summary and recommendations
    Overview
    Summary of test results
    Recommendations
  Section E: Replication Manager test results summary and recommendations
    Summary of test results
    Recommendations
  Section F: Failover clustering with the Symmetrix V-Max SRDF/CE integrated software test results summary and recommendations
    Overview
    Summary of test results
    Recommendations
Chapter 7: Conclusion
  Overview
Chapter 1: About this Document
Overview
Introduction
This Proven Solution Guide summarizes a series of best practices that were discovered, validated, or otherwise encountered during the validation of a solution using:
EMC® Symmetrix® V-Max™ with EMC SRDF®/CE
EMC Replication Manager
Enterprise Flash Drives (EFDs)
This Proven Solution Guide helps turn plans for a highly available Microsoft SQL Server 2008 online transaction processing (OLTP) environment into reality. It does so by using EMC Replication Manager for simplified data protection, tiered storage for optimizing resources, and Symmetrix Remote Data Facility/Cluster Enabler (SRDF/CE) for effective handling of planned failovers and unplanned site outages.
EMC's commitment to consistently maintain and improve quality is led by the Total Customer Experience (TCE) program, which is driven by Six Sigma methodologies. As a result, EMC has built Customer Integration Labs (CIL) in its Global Solutions Centers to reflect real-world deployments in which TCE use cases are developed and executed. These use cases provide EMC with an insight into the challenges currently facing its customers.
Use case definition
A use case reflects a defined set of tests that validates the reference architecture for a customer environment. This validated architecture can then be used as a reference point for a Proven Solution.
Contents
This chapter includes the following topics:
Audience and purpose
Scope
Business challenge
Technology solution
Objectives
Reference architecture
Validated environment profile
Hardware and software resources
Prerequisites and supporting documentation
Terminology
Audience and purpose
Audience
The intended audience for this Proven Solution Guide is:
Internal EMC personnel
EMC partners
Customers
Purpose
The key purpose of this solution is to validate an efficient, remote storage replication process for disaster recovery (DR) and business continuity in a high-volume SQL Server OLTP environment. This is accomplished using synchronous replication (SRDF/CE in synchronous mode) for automated site failover with Microsoft failover clusters across the SRDF link.
In this solution, the Symmetrix V-Max array is used for storage consolidation while SQL Server works as the relational database management system supporting a multifaceted OLTP environment.
This solution takes advantage of EMC TimeFinder® replication technology within the Symmetrix V-Max to protect data by creating consistent snapshots at various points throughout the day. Both TimeFinder/Snap and TimeFinder/Clone are asynchronous technologies that do not require downtime to perform backups and protect data, which is a significant advantage. Also, in comparison to host-based replication, this solution has a negligible performance impact on the host.
Additionally, this solution employs a tiered storage infrastructure, which utilizes EFDs to accelerate access to critical data and low-cost, high-capacity Serial Advanced Technology Attachment (SATA) drives to store historical information.
The purpose of this solution is to:
Demonstrate an effective DR solution for geographically dispersed failover clusters enabled by SRDF/CE software.
Demonstrate simplified application protection using Replication Manager to rapidly protect SQL Server databases in a very active OLTP environment.
Validate the benefit of EFD performance for SQL Server OLTP workloads in comparison to traditional Fibre Channel (FC) drive performance.
Demonstrate how to migrate SQL Server table partitions between storage tiers including EFDs, FC, and SATA drives.
Show the benefits of storage tiering for Microsoft SQL Server OLTP-type applications.
Scope
Scope
This document focuses on the design, configuration, and validation of a data protection solution for SQL Server databases hosted on the Symmetrix V-Max platform. The simulated testing environment reflects a high-volume, real-world SQL Server workload.
To meet both the performance and cost-efficiency demands placed on critical SQL Server databases, this proven solution combines the benefits of tiered storage with the benefits of advanced storage protection by incorporating:
Symmetrix V-Max for highly available, shared storage
EMC Replication Manager to create and mount Symmetrix V-Max TimeFinder clones to a mount host
Storage tiering using:
  EFDs
  FC disk drives
  SATA drives
Additionally, tiered storage significantly reduces costs compared to provisioning large amounts of any one tier of disk for the entire environment. While results vary with each customer environment, EFDs have shown performance improvements of up to 30 times in typical OLTP workloads. EFDs represent a crucial element in this solution by helping to eliminate potential performance bottlenecks.
Not in scope
Basic Microsoft SQL Server application functionality and best practices are outside the scope of this testing.
Business challenge
Overview
SQL Server often forms the foundation for today’s most demanding, enterprise-level, transaction-based companies with its rich feature set and ability to store data from structured, semi-structured, and unstructured documents. OLTP systems running on a SQL Server platform represent one of the most common data processing systems in today's enterprises.
The availability requirements of OLTP systems are very demanding. Downtime can represent failure for critical business processes, effectively halting business operations. It is vital that OLTP systems remain online during backups so that customers can continue to access the system. SQL Server administrators need to ensure that a plan is in place that does not introduce major performance degradation to the environment. A business whose very existence relies on 24x7 availability can succeed or fail depending on the database recovery infrastructure in place.
SQL Server database administrators (DBAs) want to design and deploy a SQL-based OLTP infrastructure that:
Reduces the cost of storing vast amounts of data
Provides redundancy and high availability throughout the entire system
Reduces I/O and locking contention for better application performance
Ensures 24x7 access to critical business data
Achieves enterprise-level performance for transactional latency and user concurrency (the key success criteria for OLTP database systems)
Provides nondisruptive storage tiering to enable cost-effective information lifecycle management (ILM)
Technology solution
Overview
To meet both the performance and cost-efficiency demands placed on critical SQL Server OLTP databases, this proven solution combines the benefits of tiered storage with the benefits of advanced storage protection.
Tiered storage
This solution utilizes the three types of storage media available on the Symmetrix V-Max platform:
EFDs
FC disk drives
SATA drives
The environment also utilizes both RAID 1 mirroring and RAID 5 striping with parity. This ensures that the most active areas (tables) receive the storage tier best suited to the performance requirements of the database.
EFDs
EFDs can dramatically increase performance for demanding Microsoft SQL Server OLTP database applications because they can deliver single-millisecond application response times and significantly higher IOPS than traditional FC disk drives. EFDs can also significantly reduce energy consumption.
The high-performance characteristics of EFDs reduce the need for organizations to purchase large numbers of traditional hard disk drives and then use only a small portion of their capacity to satisfy IOPS and latency requirements.
Advanced storage protection
In addition to using the latest available technologies for database storage, this solution also utilizes advanced array-based replication technology for both local and remote protection. Array-based replication technology has the advantage of being host-agnostic; any concerns over existing host-based SQL Server protection technologies will not affect the array-based replication of SQL databases. Local data protection is provided by Replication Manager for array-based cloning technology. Replication Manager’s integration with Microsoft’s Virtual Device Interface (VDI) is a significant enhancement for most OLTP-based environments, as it:
Creates application-consistent copies of production data in minutes
Produces zero production host overhead (in-array clone processing)
Enables off-host backup, data mining, repurposing, and data validity checking
Remote protection in this solution is provided by EMC’s SRDF replication technology in conjunction with Microsoft Failover Clustering extended by EMC’s CE software. SRDF/CE is a Microsoft Windows failover cluster extension utility that stretches a typical active/passive cluster across geographically dispersed sites. The combination of SRDF and CE (SRDF/CE) not only handles unplanned site outages with quick, automated failover but is also a helpful utility for handling planned site or host-level outages. SRDF/CE ensures that DR failover is repeatable and predictable, while significantly reducing the effort of managing DR failover.
Objectives
Objectives
The solution focuses on the following objectives.
Objective: Describe the baseline performance results generated using a TPC-E-like testing tool.
Details: This solution first establishes a performance baseline using simulated loads, database maintenance, and Replication Manager jobs.
Objective: Validate the benefit of EFD performance for SQL Server OLTP workloads in comparison to traditional FC drive performance.
Details: The initial configuration places all of the database files on FC drives. During testing, the database files and log files are moved to EFDs to demonstrate the performance benefits.
Objective: Demonstrate Replication Manager functionality using local clones and snapshots.
Details: Replication Manager is used to create database replicas using snapshots to provide point-in-time recovery and clones for daily backup. The impact on daily activity is monitored and documented.
Objective: Demonstrate Replication Manager server DR capabilities.
Details: Outline the steps to provide DR capabilities for Replication Manager. Provide guidelines and considerations.
Objective: Perform VLUN migrations to appropriate tiers under load and document the impact for both EFD and FC hosted databases.
Details: Depending on the user activity, the database files are moved to EFDs, FC drives, or SATA drives.
Objective: Validate the SQL Server application’s availability and recovery time with both planned and unplanned failure scenarios under simulated load with SRDF/Synchronous (SRDF/S) automated by SRDF/CE.
Details: Failover cluster functionality is tested and recovery time is measured. The impact of a geographically dispersed node enabled by SRDF/CE is tested in both planned failovers and unexpected site failures.
Reference architecture
Corresponding Reference Architecture
The corresponding Reference Architecture document for this use case is available on Powerlink® and EMC.com. Refer to EMC Business Continuity for Microsoft SQL Server 2008 Enabled by EMC Symmetrix V-Max with SRDF/CE, EMC Replication Manager, and Enterprise Flash Drives Reference Architecture for details. If you do not have access to this content, contact your EMC representative.
Reference Architecture diagram
The following diagram depicts the overall physical architecture of the use case.
Validated environment profile
Profile characteristics
The use case was validated with the following environment profile.
Profile characteristic: Value
OLTP database: Supporting 75,000 users with 1 percent concurrency rate
OLTP database size: 1.7 TB
SQL storage type (high frequency data): RAID 5 (7+1), 400 GB EFDs
SQL storage type (medium frequency data): RAID 1, 450 GB, 15k rpm FC drives
SQL storage type (low frequency, historical data): RAID 5 (3+1), 1,000 GB, 7.2k rpm SATA drives
SQL TimeFinder storage: RAID 5 (3+1), 400 GB, 10k rpm FC drives
Site link characteristics
The solution was validated using the following site link configuration.
Site link characteristic: Configuration
Link type: OC-3 (155 Mb/s); 1 Gigabit Ethernet (stretched VLAN)
Distances tested for synchronous replication: 10 km; 200 km
Data transmission mechanism: FCIP
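As a rough illustration of what these link characteristics imply, the following Python sketch estimates the fiber propagation delay for the two tested distances and the raw ceiling of the OC-3 link. The figure of about 5 microseconds per kilometer one way in optical fiber is a general rule of thumb and an assumption here, not a value measured in this use case. The function names are hypothetical, and the estimate ignores switch, FCIP, and array processing time.

```python
# Assumption: light travels through optical fiber at roughly 200,000 km/s,
# which is about 5 microseconds per kilometer in one direction.
FIBER_US_PER_KM = 5.0

# OC-3 line rate as listed in the site link table (megabits per second).
LINK_MBIT_PER_S = 155


def round_trip_ms(distance_km):
    """Estimate the round-trip propagation delay in milliseconds."""
    return 2 * distance_km * FIBER_US_PER_KM / 1000.0


def link_mbyte_per_s(mbit_per_s=LINK_MBIT_PER_S):
    """Convert the raw line rate to megabytes per second (no protocol overhead)."""
    return mbit_per_s / 8


for km in (10, 200):
    print(f"{km} km: about {round_trip_ms(km):.1f} ms round trip")
print(f"Raw link ceiling: about {link_mbyte_per_s():.1f} MB/s")
```

Under these assumptions, a synchronously replicated write waits for at least one round trip before it is acknowledged, so the propagation component of write latency grows in proportion to the distance between sites.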
Hardware and software resources
Production site hardware
The Production site hardware used to validate the solution is listed below.
Note: Testing performed during this use case used one FC switch only. However, most Production environments require two FC switches for redundancy.
Equipment at the Production site (quantity): Configuration
Storage array (1): EMC Symmetrix V-Max; 4 V-Max Engines; 9 x 400 GB EFDs; 213 x 450 GB, 15k rpm FC disks; 18 x 1 TB, 7.2k rpm SATA drives
Fibre Channel switch (1): 4 Gb/s enterprise-class Fibre Channel switch (requires a minimum of 48 ports)
Ethernet network switch (1): Gigabit Ethernet network switch (requires a minimum of 32 ports)
SQL Server active node (1): 4 quad-core 2.93 GHz Intel X7350 processors with 64 GB of 667 MHz FBD-DIMM
SQL Server local passive node (1): 4 quad-core 2.93 GHz Intel X7350 processors with 64 GB of 667 MHz FBD-DIMM
Replication Manager server (1): 2 CPU quad-core, 4 GB RAM
EMC SMC server (1): 2 CPU quad-core, 4 GB RAM
DR site hardware
The DR site hardware used to validate the solution is listed below.
Equipment at the DR site (quantity): Configuration
Storage array (1): EMC Symmetrix V-Max; 4 V-Max Engines; 221 x 450 GB, 15k rpm FC disks; 18 x 1 TB, 7.2k rpm SATA drives
Fibre Channel switch (1): 4 Gb/s enterprise-class Fibre Channel switch (requires a minimum of 48 ports)
Ethernet network switch (1): Gigabit Ethernet network switch (requires a minimum of 32 ports)
SQL Server remote passive node (1): 4 CPU quad-core, 64 GB RAM
Replication Manager server (1): 2 CPU quad-core, 32 GB RAM
Software
The software used to validate the solution is listed below.
Software: Version
Windows Server 2008, x64 Enterprise Edition: SP2
Microsoft SQL Server 2008, x64 Enterprise Edition: SP1
EMC Enginuity™: 5874.157.129
EMC Solutions Enabler: 7.0
EMC SRDF/CE: 3.1
EMC Replication Manager: 5.2, SP1
EMC Symmetrix Management Console (SMC): 7.0.0.5
Prerequisites and supporting documentation
Technology It is assumed the reader has a general knowledge of:
Microsoft SQL Server 2008 Enterprise Edition
EMC Symmetrix V-Max
EMC Replication Manager
EMC SRDF/CE
EMC Solutions Enabler
EMC Symmetrix Management Console (SMC)
Supporting documents
The following documents, located on Powerlink.com, provide additional, relevant information. Access to these documents is based on your login credentials. If you do not have access to the following content, contact your EMC representative.
EMC Business Continuity for Microsoft SQL Server 2008 Enabled by EMC Symmetrix V-Max with SRDF/CE, EMC Replication Manager, and Enterprise Flash Drives Reference Architecture (companion document to this Proven Solution Guide)
EMC Replication Manager 5.2 Administrator’s Guide
EMC SRDF/Cluster Enabler Version 3.1 Product Guide
EMC Symmetrix DMX-4 Enterprise Flash Drives with Microsoft SQL Server Databases—Applied Technology white paper
Terminology
Terms and definitions
This section defines terms used in this document.
Term: Definition
Cluster: Multiple physical servers that act as a single, logical server.
Metavolume: A series of smaller disk devices combined to form a LUN.
Meta member: One of several disk devices that make up a metavolume.
Node majority mode: One of the quorum models available for failover clusters. In this model, each node in the cluster has a "vote." A majority of votes must be present to provide cluster services. Node majority mode is commonly used for clusters with an odd number of nodes.
Planned failover: Cluster services (SQL Server) are moved from a node on the Production site to a node on the DR site (or remote site) in a controlled manner.
Preferred owner: A list of server nodes for a cluster service. The preferred owner list is accessed through the Properties of a cluster service. The cluster reviews the list of nodes in the order presented on the property sheet for the first available node to host the service.
Quorum mode: This setting ensures that when a cluster is running, enough members of the distributed system are operational and communicating, and that at least one replica of the current state is guaranteed or accessible. The quorum mode is set using the Failover Cluster Manager GUI, provided by Microsoft.
R1: Represents the local (source) copy of the data at the Production site.
R2: Represents the remote (target) copy of the data at the DR site.
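The node majority and preferred owner definitions above can be illustrated with a short Python sketch. This is a simplified model for illustration only, not the Windows failover clustering implementation; the function names and node names are hypothetical.

```python
def votes_needed(total_nodes):
    """Return the smallest number of votes that forms a majority."""
    return total_nodes // 2 + 1


def has_quorum(total_nodes, nodes_online):
    """Return True when the online nodes hold a majority of the votes."""
    return nodes_online >= votes_needed(total_nodes)


def pick_owner(preferred_owners, available):
    """Return the first node in the preferred owner list that is available.

    Returns None when no listed node is available.
    """
    for node in preferred_owners:
        if node in available:
            return node
    return None


# Hypothetical three-node cluster: one active and two passive nodes.
preferred = ["prod-active", "prod-passive", "dr-passive"]
online = {"prod-passive", "dr-passive"}  # the active node has failed

print(has_quorum(len(preferred), len(online)))
print(pick_owner(preferred, online))
```

In this model a three-node cluster needs two votes, so it keeps providing services after losing any single node, and the service moves to the next available node in the preferred owner list.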
Chapter 2: SQL Server 2008 on the Symmetrix V-Max Design Overview
Overview
Introduction
The following sections detail the key server architecture and storage design elements for the test OLTP application. The main questions that need to be answered in determining an appropriate storage design layout for this environment include:
How many IOPS will the SQL Server databases generate on the storage system?
What is the maximum acceptable LUN response time (latency) in milliseconds (ms)?
Contents
This chapter contains the following topics:
Server architecture
Key elements of the storage design layout
Best practices for storage design
Storage design layout
Server architecture
Key elements
The following sections detail the key elements of the server architecture used for the test OLTP application.
Microsoft SQL Server failover cluster
The SQL Server application is hosted on a Microsoft SQL Server failover cluster. This cluster consists of one active and two passive nodes. The Production site contains one active node and one passive node dedicated for local failover. The DR site contains one passive node for site failover.
Physical servers
In addition to the servers that comprise the failover cluster, the test environment utilizes the following physical servers:
One server is deployed at each site (Production and DR) to:
  Manage Replication Manager
  Act as a mount host to enable tasks such as database consistency checks against the replica volumes
One server is deployed at the Production site to host the SMC application
Physical server connections
The design implements the following physical connections:
The Production site and the DR site each contain a 4 Gb/s FC SAN switch and a Gigabit Ethernet LAN switch.
The sites are connected by an OC-3 155 Mb/s link between the LAN switches. This link carries FC traffic as well as LAN traffic.
The SAN switches connect to each other through an interswitch link (ISL) that carries traffic across the OC-3 link using FCIP.
The servers in the failover cluster each have four HBA ports connected to the SAN switch.
The Replication Manager servers each have two HBA ports connected to the SAN switch.
The SMC server has two HBA ports connected to the SAN switch.
Each Symmetrix V-Max is connected to the SAN through 20 of its 4 Gb/s front-end ports.
Each server has one LAN connection to a gigabit Ethernet switch.
The SQL Server failover cluster nodes have one cluster private network connection to the gigabit Ethernet switch providing heartbeat for the cluster nodes.
The server VLAN and cluster private network are stretched across the OC-3 connection to the DR site.
Key elements of the storage design layout
OLTP application storage requirements
Capacity, I/O throughput, and latency requirements are the primary drivers in determining the appropriate storage design for OLTP databases. Capacity refers to how much data needs to be stored, that is, the size of the database. I/O throughput refers to the number of IOPS required to deliver the expected response times. Verify that the storage design:
Distributes storage resources efficiently to prevent bottlenecks.
Accounts for additional storage to support potential database growth.
Supports the application I/O throughput and latency requirements.
Considers the impact of overheads of different RAID protection levels on IOPS.
Formula for calculating the number of disks required
It is critical that data centers supporting high volume SQL Server databases identify the correct number of disks and IOPS required to deliver target response times. Use the following formula to calculate the adjusted IOPS requirement, from which the number of disks is derived:
IOPS = ((Total_IO * Read_IO%) - Read_hit) + ((Total_IO * Write_IO%) * RAID_Factor)
Where:
Total_IO = Anticipated database workload
Read_IO% = Percentage of Total_IO that is read requests
Read_hit = Amount of read workload that is serviced from the array cache
Write_IO% = Percentage of Total_IO that is write requests
RAID_Factor = RAID protection overhead (for example, RAID 1 is indicated by 2, RAID 5 is indicated by 4, RAID 6 is indicated by 6)
IOPS = The adjusted IOPS requirement
Target disks calculated for this use case
The following example shows how to identify the number of disks required to support this solution’s high-capacity OLTP test environment. See Chapter 2>SQL Server 2008 on the Symmetrix V-Max Design Overview>Formula for calculating the number of disks required, for more information on the formula.
Example
IOPS = ((20,000 * 0.80) - 0) + ((20,000 * 0.20) * 2) = 24,000
In testing, the target drives (450 GB, 15k rpm drives) produced 200 IOPS each. Dividing the adjusted IOPS by the expected IOPS per disk (24,000/200) shows that 120 disks are required to support this workload.
Recommendation
The number of disks is rounded up from 120 to 128 to balance the workload across the Symmetrix V-Max back-end I/O modules. Because the back end of the storage array contains 32 physical disk controllers, it is good practice to use a number of disks that is divisible by 32. This maximizes the use of the Symmetrix V-Max Engines and back-end controllers.
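The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than an EMC sizing tool: the function names are hypothetical, percentages are passed as whole numbers, and the rounding step simply applies the divisible-by-32 recommendation.

```python
import math

def adjusted_iops(total_io, read_pct, write_pct, read_hit, raid_factor):
    """Adjusted IOPS per the formula in this guide.

    read_pct and write_pct are whole-number percentages (80 means 80%).
    """
    reads = total_io * read_pct / 100 - read_hit
    writes = total_io * write_pct / 100 * raid_factor
    return reads + writes

def disks_required(iops, iops_per_disk, controllers=32):
    """Return (raw disk count, count rounded up to a multiple of controllers)."""
    raw = math.ceil(iops / iops_per_disk)
    balanced = math.ceil(raw / controllers) * controllers
    return raw, balanced

# Values from the example: 20,000 IOPS, 80% reads, 20% writes, no cache hits, RAID 1
iops = adjusted_iops(total_io=20_000, read_pct=80, write_pct=20, read_hit=0, raid_factor=2)
raw, balanced = disks_required(iops, iops_per_disk=200)
print(iops, raw, balanced)
```

With the example inputs this reproduces the 24,000 adjusted IOPS, the 120 disks, and the rounded figure of 128 disks.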
Best practices for storage design
Microsoft SQL Server
The OLTP application is hosted on a Microsoft SQL Server 2008 failover cluster. Review the storage-related best practices presented in the table below.
Area Best practice
Disk latency
Should not exceed 10 ms for best performance.
1 to 5 ms for log files
4 to 20 ms for database files on OLTP systems (ideally, 10 ms or less)
Log files Place on RAID 1 or RAID 10 storage (as they are write-intensive).
TempDB files
Consider the following:
The number of files should equal the number of physical CPU sockets. This practice increases parallel access to the database.
Pre-allocate files to avoid the overhead of autogrowth.
Place TempDB files on their own LUNs using RAID 1 or RAID 10 protection.
Database files
Consider the following:
Configure database files across multiple LUNs to take advantage of parallel access and to minimize I/O contention (where necessary).
Keep database files and log files on separate LUNs. Log files represent a sequential write workload, whereas database files supporting OLTP applications represent random read/write activity. Combining heterogeneous workloads can have a negative effect on overall database performance.
Make sure that all database files in the same file group are equal in size. SQL Server uses a proportional fill algorithm that favors allocation to files with more free space. Keeping the files the same size distributes data more evenly.
Set the Enable Autogrowth parameter in the Microsoft SQL Server Management Studio graphical user interface (GUI) to expand the database file in case of unanticipated growth. For detailed information on this setting, refer to the appropriate Microsoft SQL Server Management Studio documentation.
Pre-allocate the data files.
Storage design layout
Introduction
Review the following sections to learn how the storage layout was designed to support the high-capacity OLTP workloads in this use case.
Physical disk distribution at the Production site
The following image represents the physical disk distribution for the Symmetrix V-Max at the Production site.
Physical disk distribution at the DR site
The following image represents the physical disk distribution for the Symmetrix V-Max at the DR site.
LUN distribution in the test environment
LUNs were distributed in the test environment as follows:
A total of 32 LUNs are used to support the OLTP application.
The LUNs are in a disk group of 128 disk drives (450 GB, 15k rpm) that are evenly distributed across all of the available back-end I/O modules.
Most of the LUNs are metavolumes. A metavolume is a series of smaller disk devices combined to form a LUN. The disk devices are called meta members.
Additionally, the following table lists the members for each LUN.
LUNs Number of members
22.5 GB LUNs 2
45 GB LUNs 4
90 GB LUNs 8
135 GB LUNs 8
Chapter 3: Disaster Recovery Design
Overview
Using Windows 2008 failover clustering extended by EMC SRDF/CE
It is imperative that SQL Server DBAs running high-capacity OLTP workloads build the following three key elements into the environment’s DR design:
Repeatability
Predictability
Reduction in failover management
This solution combines all three of these aspects by leveraging the integrated remote data protection features of the Symmetrix V-Max array. SRDF/CE combines Microsoft failover clusters with SRDF/S to automate the failover. If a failure occurs at the Production site, SRDF/CE is automatically triggered to move services either laterally within the Production site or, in case of a full site failure, to the remote DR site.
Contents
This chapter contains the following topics:
Deploying Windows 2008 failover clustering and SRDF/CE in synchronous mode
Production site protection
DR site protection
Deploying Windows 2008 failover clustering and SRDF/CE in synchronous mode
Distances targeted during failover clustering testing
This solution is built around geographically dispersed failover clustering targeting the multi-site data center. The geographically dispersed cluster allows organizations to position the individual nodes in separate data centers miles away from one another. In order to make the test environment as realistic as possible, this use case established a baseline using two simulated distances:
10 km represents a campus-like environment or metro area
200 km represents the longest recommended distance between geographically dispersed sites
SRDF/S and SRDF/CE functionality
The Symmetrix V-Max storage array is the main component of this solution, integrating the most powerful suite of remote storage replication technologies for superior failover/failback performance—EMC SRDF software. More specifically:
SRDF/S: Maintains real-time synchronous remote data replication from the Production site to the DR site, providing a recovery point objective (RPO) of zero data loss.
SRDF/CE: Works with Microsoft failover clusters to leverage the SRDF/S link for accessing the remote DR site. In addition, SRDF/CE enables consistent replication that virtually eliminates the need to perform full resynchronizations of the environment.
See Chapter 6>Section F> Failover clustering with the Symmetrix V-Max SRDF/CE integrated software test results summary and recommendations for findings.
SRDF/S configuration details
Review the following prior to configuring SRDF/S:
SRDF is configured between two Symmetrix V-Max arrays.
Synchronous replication is performed at simulated distances of 10 km and 200 km between the Production site and the DR site.
The data transmission mechanism is FCIP.
Each array dedicates eight front-end I/O modules (two per engine) for SRDF communication.
The link between the Production site and the DR site is an OC-3 (155 Mb/s) connection between the Gigabit Ethernet LAN switches; it carries the stretched VLAN.
The SQL cluster is connected to four of the eight Symmetrix V-Max front-end I/O modules. The remaining four front-end modules can be used for additional connectivity capacity.
All of the SQL application devices (LUNs) are in a single RDF 1 type group.
See Chapter 6>Section F> Failover clustering with the Symmetrix V-Max SRDF/CE integrated software test results summary and recommendations for findings.
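Distance matters for synchronous replication because a host write is not acknowledged until the remote array has confirmed it. The sketch below estimates only the added propagation delay for the two simulated distances. The figure of 5 microseconds per kilometer for light in optical fiber and the single round trip per write are assumptions made for illustration, not values from this guide. Switch, FCIP, and array processing time are ignored.

```python
FIBER_DELAY_US_PER_KM = 5  # assumed one-way propagation delay in optical fiber

def added_write_latency_ms(distance_km, round_trips=1):
    """Estimate propagation delay added to each synchronous write, in ms."""
    one_way_us = distance_km * FIBER_DELAY_US_PER_KM
    return 2 * one_way_us * round_trips / 1000

for km in (10, 200):
    print(km, "km:", added_write_latency_ms(km), "ms")
```

Under these assumptions the 200 km case adds about 20 times the propagation delay of the 10 km case. Measured results for the tested configuration are in the Chapter 6 findings.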
SRDF/CE prerequisites
Complete the following tasks prior to configuring SRDF/CE. See the EMC SRDF/Cluster Enabler Version 3.1 Product Guide for detailed procedures.
SRDF/CE prerequisite: Details
Licenses required: Install these Solutions Enabler licenses:
Base kit
SRDF
Cluster Enabler
Failover cluster configuration: Configure the failover cluster prior to installing SRDF/CE.
Zoning: Cluster nodes are zoned only to their local storage.
Microsoft Cluster Validate: Microsoft Cluster Validate must pass all tests except for storage. This test procedure is part of the Microsoft Failover Cluster installation.
Mapping R1 and R2 disk devices: Map the R1 disk devices to all nodes at the Production site and the R2 disk devices to peer nodes at the DR site.
Write enable devices in a cluster group: Ensure that all of the devices in a cluster group are write-enabled on the node that owns the group in the cluster.
SRDF/CE configuration details
Review the following prior to configuring SRDF/CE. See the EMC SRDF/Cluster Enabler Version 3.1 Product Guide for detailed procedures.
Note
The failover cluster must be configured prior to installing SRDF/CE.
Use the Configure CE Cluster wizard in the EMC Cluster Enabler Manager GUI to:
Detect the current cluster.
Validate the appropriate software versions.
Perform a storage discovery for each cluster node.
When the wizard successfully completes, Cluster Enabler displays the components of the current cluster in the navigation tree.
See Chapter 6>Section F> Failover clustering with the V-Max SRDF/CE integrated software test results summary and recommendations for findings.
Cluster Enabler Manager GUI
The Cluster Enabler Manager GUI is the user interface that manages SRDF/CE activity. This GUI allows you to configure disk-based resources to automatically move between geographically dispersed sites. The managed cluster objects are represented by folders, as detailed in the following image and table. In the example, the resources for the Production site (site 1) are shown.
Folder: Details
Groups: Displays the Services and Applications from the Failover Cluster Manager.
Storage: Displays the storage systems (two Symmetrix V-Max arrays).
Sites: Displays the geographically dispersed locations.
Nodes: Displays the cluster nodes.
Production site protection
Production site protection details
Review the following section prior to implementing geographically dispersed failover clusters in the SQL environment.
The failover cluster is managed primarily through the Failover Cluster Management GUI provided by Microsoft.
The failover cluster consists of three nodes, two at the Production site and one at the DR site.
A planned failover is managed by moving services to the second node at the Production site.
The first preferred owner for all cluster services is the first node at the Production site. The second preferred owner is the second node at the Production site, and the third preferred owner is the node at the DR site.
Note
The preferred owner list is managed by SRDF/CE. Manually changing the preferred owner is not recommended.
Executing a planned failover
Use the Failover Cluster Management GUI to initiate a service failover, as follows:
Right-click the service to move.
Select Move this service or application to another node.
Select the node to which the service is to be moved. The service is then brought online on the selected node.
DR site protection
DR site protection details
Review the following details prior to implementing geographically dispersed failover clusters into the SQL environment.
The failover cluster uses one node at the DR site. This node’s function is to run the SQL Server service in the case of a site failure at the Production site.
A planned failover to the DR site is managed by moving services to the node at the DR site. This can be useful for testing DR procedures, which regulatory agencies sometimes require.
Preferred owner list order in the test environment
SRDF/CE honors the predefined preferred owner list and manages the storage resources accordingly. The following preferred owner list is implemented in this use case:
The primary owner is represented by the first node at the Production site.
The secondary owner is represented by the lateral (second) node at the Production site.
The third owner is represented by the single node at the DR site.
SRDF/CE also implements a delay failback function. Delay failback will automatically modify the preferred owner list so that a failover to a lateral node (cluster node connected to the same storage array) is a higher priority than a failover to a peer node (cluster node connected to a different storage array).
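The lateral-before-peer priority described above can be illustrated with a short Python sketch. It is a conceptual model only, not SRDF/CE code: the `Node` class, the `failover_order` function, and the node and array names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    array_id: str  # storage array the node is connected to (hypothetical label)

def failover_order(current, candidates):
    """Order failover targets: lateral nodes (same array) before peer nodes."""
    # sorted() is stable, so the existing preferred-owner order is kept
    # within the lateral group and within the peer group.
    return sorted(
        (n for n in candidates if n != current),
        key=lambda n: n.array_id != current.array_id,
    )

prod1 = Node("PROD-NODE1", "ARRAY-PROD")
prod2 = Node("PROD-NODE2", "ARRAY-PROD")
dr1 = Node("DR-NODE1", "ARRAY-DR")

# Even if the DR node is listed first, the lateral node is tried first.
print([n.name for n in failover_order(prod1, [dr1, prod2, prod1])])
```

Here the lateral node at the Production site is ordered ahead of the peer node at the DR site, matching the preferred owner list used in this use case.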
Node majority selected as the quorum mode
The quorum mode used for this failover cluster is node majority. In order to keep costs low at the DR site, only one peer node is installed. In order to maintain a majority of nodes in the event of a site failure at the Production site, two more "votes" (through fileshare witness or peer nodes) would need to be established. Under node majority, each node in the cluster has one vote. A majority of votes (two in this configuration) must be present to initiate cluster services.
See http://technet.microsoft.com/en-us/library/cc770830(WS.10).aspx for more information on quorum modes.
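The vote arithmetic behind this section can be checked with a short Python sketch. It assumes one vote per node and a strict majority rule. The function names are hypothetical and the code models only the counting, not the cluster service.

```python
def votes_needed(total_votes):
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1

def has_quorum(surviving_votes, total_votes):
    return surviving_votes >= votes_needed(total_votes)

def extra_votes_for_dr_quorum(prod_nodes, dr_nodes):
    """Extra votes outside the Production site needed to survive its loss."""
    extra = 0
    while not has_quorum(dr_nodes + extra, prod_nodes + dr_nodes + extra):
        extra += 1
    return extra

print(votes_needed(3))                  # votes required in the three-node cluster
print(has_quorum(1, 3))                 # only the DR node survives
print(extra_votes_for_dr_quorum(2, 1))  # extra votes needed to avoid a force start
```

For the tested configuration (two Production nodes, one DR node) the sketch gives a majority of two votes, no quorum when only the DR node survives, and two additional votes needed to keep a majority after a Production site failure.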
Force starting the cluster is required
A minimum number of cluster nodes was used in this configuration, as detailed in Chapter 3>Disaster Recovery Design>Node majority selected as the quorum mode. Since only one peer node would remain at the DR site in the event of a Production site failure, a node majority no longer exists. In this case the cluster must be force started to provide services. Before force starting the cluster, verify that the storage components have failed over to the DR site. Use Cluster Enabler Manager to verify that the storage has failed over, as follows:
Click the appropriate SQL Server Group.
Verify the Owner Storage ID and the Owning Node.
Next, force start the cluster. Open a command window and type:
Chapter 4: Replication Management and Design
Overview
Introduction to the backup infrastructure design
Applying this solution’s design principles and product recommendations will help to establish a reliable, highly efficient replication process. The replication model uses Replication Manager and EMC TimeFinder technology to create snapshots and clones of the database LUNs at regular intervals within the Symmetrix V-Max. The validated design presented here demonstrates that:
There was negligible impact to user experience on the simulated SQL Server OLTP environment during replication with Replication Manager and TimeFinder software.
There is little performance impact on the host (as compared to host-based replication).
Contents
This section contains the following topics:
Replication Manager design
TimeFinder/Snap and TimeFinder/Clone design
Replication Manager design
How Replication Manager was used in this solution
Replication Manager was used to coordinate the data protection and recovery of the Production SQL Server databases. This solution’s Replication Manager-based testing targeted:
75,000 users with a 1.7 TB OLTP database
An 8-hour performance test window to represent a typical production day
A Replication Manager server configured as a primary server at the Production site, and a Replication Manager server configured as a secondary server at the DR site
VDI snapshot replication
Replication Manager utilizes the VDI framework to obtain application-consistent snapshots of active databases for both Snap and Clone jobs. Generally, a VDI snapshot backup has minimal impact on database performance during its execution. Most importantly, user connections to the SQL Server are not broken during this process. Read access is unaffected during the VDI backup window and while the VSS snapshot is performed on the underlying filesystems. Database write operations continue to occur in the transaction logs, and the writes are temporarily held in memory for a maximum of 10 seconds. Transactions that execute a commit operation while the VDI backup is being processed may be suspended because of their write requirement. Most VDI backup operations execute within a matter of seconds, although VDI itself does not implement a timeout value.
See http://technet.microsoft.com/en-us/library/ms175536.aspx for more detailed information on using VDI in a SQL Server context.
VSS implementation
The Volume Shadow Copy Service (VSS) implementation used with SQL Server 2008 provides a structured framework for executing SQL Server disk-based backup operations. The VSS framework has requirements similar to those of VDI. The threshold value for VSS operations is 10 seconds. If a disk mirror-based backup exceeds this threshold, the backup aborts and I/O operations proceed as normal.
Replication Manager configuration highlights
The test environment uses Replication Manager to create replicas of the SQL Server application database for rapid recovery. It is important to note that:
Each Replication Manager server uses four 6-cylinder gatekeeper devices assigned to it from its respective Symmetrix V-Max array.
In order to effectively replicate the database, 24 LUNs are copied.
A Replication Manager storage pool is established (for replica creation) at each site.
A storage pool with 192 virtual devices (24 LUNs x 8 sessions) is created to support daily snapshots.
A storage pool of 72 standard devices (24 LUNs x 3 sets) is created to support clone operations. This allows for three sets of clones.
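The pool sizes above follow from multiplying the number of replicated LUNs by the number of copies to retain. A minimal Python sketch of that arithmetic, with a hypothetical function name, can be reused when the LUN count or retention changes:

```python
def pool_devices(replicated_luns, copies_retained):
    """Devices needed in a storage pool: one per LUN per retained copy."""
    return replicated_luns * copies_retained

REPLICATED_LUNS = 24  # LUNs copied in the test environment

snap_pool = pool_devices(REPLICATED_LUNS, copies_retained=8)   # snapshot sessions
clone_pool = pool_devices(REPLICATED_LUNS, copies_retained=3)  # clone sets
print(snap_pool, clone_pool)
```

With the test environment values this gives the 192 virtual devices and 72 standard devices listed above.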
See Chapter 6>Section E> Replication Manager test results summary and recommendations for findings.
Replication Manager design considerations
Consider the following before implementing Replication Manager into the SQL Server environment. See the EMC Replication Manager 5.2 Administrator’s Guide for detailed configuration information. See Chapter 6>Section E> Replication Manager test results summary and recommendations for findings.
A method to provide server name resolution to an IP address must be available. In the test environment, name resolution is provided through the domain name server (DNS).
Install the Replication Manager server on the secondary server before installing on the primary server. During the primary server installation, you will be prompted for the name of the secondary server.
If a Replication Manager server already exists, install the secondary server and stop the Replication Manager Server service. Upgrade the Replication Manager server on the primary node. Start the Replication Manager service on the secondary node. For more information, see the EMC Replication Manager 5.2 Administrator’s Guide.
Provide an IP communications port for the servers to be able to synchronize the Replication Manager database. The default port is 1964.
A standalone Replication Manager server can be converted to a DR server configuration. See the EMC Replication Manager Administrator’s Guide for details.
Each site should have a Replication Manager storage pool to create replicas.
Create site-specific replication tasks. When a secondary server becomes a primary server, the name of the server will change. Pre-configuring tasks for each site will make DR easier.
When a failover occurs between sites, the replicas at the now-secondary site are no longer valid for restore. The replicas must be manually expired and deleted.
Replication Manager and SQL Server observations
Replication Manager is very effective for local data protection in a high-volume OLTP environment because:
Replication Manager integrates a scheduling feature to automate replica creation. DBAs can schedule consistent SQL replication to occur at regular intervals and manage the lifecycle of those replicas.
Replicas can be automatically mounted to alternate hosts for SQL consistency checks or to be transferred to offline storage media such as disk libraries or tape.
Replicas can be automatically mounted so that copies of the files can be used for data mining activities, offloading these functions from the Production host.
Replication Manager will rotate through the sets of storage in the storage pool, automatically expiring the oldest set.
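The rotation behavior in the last item can be pictured as a fixed-size queue in which creating a new replica expires the oldest one once every set in the pool is in use. The following Python sketch is a conceptual model only. The `ReplicaRotation` class and its method names are hypothetical and are not part of Replication Manager.

```python
from collections import deque

class ReplicaRotation:
    """Model of rotating replicas through a fixed number of storage sets."""

    def __init__(self, sets_in_pool):
        self.sets_in_pool = sets_in_pool
        self.active = deque()  # oldest replica on the left

    def create_replica(self, label):
        """Record a new replica; return the label of the expired one, if any."""
        expired = None
        if len(self.active) == self.sets_in_pool:
            expired = self.active.popleft()  # reuse the oldest set
        self.active.append(label)
        return expired

rotation = ReplicaRotation(sets_in_pool=3)
for day in ("mon", "tue", "wed", "thu"):
    expired = rotation.create_replica(day)
    print(day, "expired:", expired)
```

With three sets in the pool, the fourth replica is the first one that causes an expiry, and the replica it replaces is the oldest.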
Replication Manager and SQL Server best practices
Consider the following prior to introducing Replication Manager into the SQL Server environment. See Chapter 6>Section E> Replication Manager test results summary and recommendations for findings.
Because this is a DR configuration, configure storage pools at each site.
Each site requires a set of application sets and jobs specific to the array at the site.
In this use case the application set is specified to perform database replication. This setting will copy all data and related transaction logs. Choosing Filegroup replication does not replicate the active transaction logs.
Replication Manager does not truncate transaction logs for SQL Server. A database maintenance plan is needed to back up and truncate the transaction logs. In this environment, log backup and truncation are scheduled after the full clone replication of the database.
When setting up the SQL Server replication jobs, select the Full, Online with advanced recovery (using VDI) setting as the consistency method. This setting enables log replay upon recovery.
Replication Manager Server DR functionality
Replication Manager auto-discovers the SQL Server from an application perspective, then identifies the Production host’s defined storage, enabling DBAs to quickly devise a backup strategy. See Chapter 6>Section E> Replication Manager test results summary and recommendations for findings. The replication model is described next:
Replication Manager server DR is implemented in a primary server/secondary server model. The primary server controls replication management. All configuration or schedule edits are performed on the primary server. The secondary server is a read-only configuration that is kept synchronized with the primary server.
Should the primary server become unavailable, the secondary server can be designated as the primary server to take over and manage replications.
Replication Manager Server failover
Use Replication Manager’s command line interface (CLI) to initiate the failover as follows.
Note
The steps outlined here are high-level in nature and should be read in conjunction with the EMC Replication Manager 5.2 Administrator’s Guide.
Step Action
1 Type the following at the command prompt:
The system responds with a prompt.
2 Log in to the system:
3 Type the following command to designate the secondary server as the primary server:
4 Restart the Replication Manager server service for the change to take effect.
Note
The replicas at the Production site are no longer valid for recovery.
Replication Manager server failback
Use Replication Manager’s command line interface (CLI) to initiate the failback as follows.
Note
The steps outlined here are high-level in nature and should be read in conjunction with the EMC Replication Manager 5.2 Administrator’s Guide. See Chapter 6>Section E> Replication Manager test results summary and recommendations for findings.
Step Action
1 Start the Replication Manager server service on the original primary server at the Production site.
Note
This server now becomes the secondary server.
2 Stop the Replication Manager server service on the original secondary server at the DR site.
3 Restart the Replication Manager server service on the secondary server at the DR site.
Note
The replicas at the DR site are no longer valid for recovery.
4
TimeFinder/Snap and TimeFinder/Clone design
Introduction
In the test environment, both TimeFinder technologies are leveraged through Replication Manager. TimeFinder/Snap sessions taken at several points during the day provide a point-in-time rollback should corruption occur during peak usage times. If corruption occurs, the database can be rapidly rolled back to the last point-in-time copy. SQL log files can then be applied to return the data to the point of failure, providing rapid recovery.
TimeFinder/Snap in the test environment
Replication Manager integrates the following TimeFinder/Snap capabilities in the test environment. See Chapter 6>Section E> Replication Manager test results summary and recommendations for findings.
Database snapshots are taken hourly to provide point-in-time copies for recovery should corruption occur.
TimeFinder/Snap allows customers to make multiple pointer-based copies of source data simultaneously on multiple target devices from a single source device. This results in point-in-time copies that can be accessed immediately.
TimeFinder/Snap does not create a full copy of the data, and therefore does not consume as much space as a TimeFinder/Clone. It is an asynchronous copy-on-first-write (ACOFW) technology that copies blocks as they are changed. Blocks that do not change are read from the source volume.
The ACOFW feature improves host performance by eliminating the need to intercept I/O and copy data before the I/O can complete. With ACOFW, the cache slot for the source disk track is marked as "versioned" and the host write is allowed to continue. The data is copied to the snap or clone device after the write completes.
Snapshot sessions are created immediately and are maintained until they are stopped. Creation of a snapshot causes a small spike in latency at creation time, but has no impact on overall performance.
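The pointer-based behavior described above can be illustrated with a small Python model of copy-on-first-write. This is a conceptual sketch, not TimeFinder code. The `Volume` and `Snapshot` classes are hypothetical, and the sketch saves the original track at write time, so it models only the bookkeeping and not the asynchronous timing of ACOFW.

```python
class Snapshot:
    """Point-in-time view that stores only tracks changed since creation."""

    def __init__(self, source):
        self.source = source
        self.saved = {}  # track index -> data as of snapshot time

    def read(self, index):
        # Changed tracks come from the save area; unchanged ones from the source
        if index in self.saved:
            return self.saved[index]
        return self.source.tracks[index]

class Volume:
    def __init__(self, tracks):
        self.tracks = list(tracks)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Preserve the original track only on the first write after a snapshot
        for snap in self.snapshots:
            snap.saved.setdefault(index, self.tracks[index])
        self.tracks[index] = data

vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")
vol.write(1, "B2")  # second write to the same track saves nothing new
print(snap.read(0), snap.read(1), vol.tracks[1], len(snap.saved))
```

Only the first write to a track after the snapshot preserves the original data, which is why a snapshot consumes space in proportion to the changed tracks rather than to the size of the source volume.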
TimeFinder/Clone in the test environment
Replication Manager integrates the following TimeFinder/Clone capabilities in the test environment. See Chapter 6>Section E> Replication Manager test results summary and recommendations for findings.
TimeFinder clones of the database LUNs are taken at the end of the day for a daily point-in-time recovery and for copying to other media.
Produces a full image copy of the data. Because it is a full copy, there is no dependence on the source volume as in the TimeFinder/Snap feature.
Produces a SQL consistent full image copy of the database at the end of the day to provide an independent copy for full database recovery.
Subsequent TimeFinder clone copies are incremental syncs, and take less time to complete.
Provides the ability to repurpose the data for SQL consistency checks, offloading the function from the Production host.
Chapter 5: Storage Optimization
Overview
Optimizing storage resources
High-transaction OLTP environments require a built-in plan for storage optimization. Read this section to learn how the Symmetrix V-Max storage system works in the test environment to achieve a high level of storage efficiency through:
Moving the OLTP application in its entirety to high-performing EFDs
Distributing database files across storage types within the Symmetrix V-Max array (EFDs, FC, and SATA drives)
Contents
EFDs
Storage tiering
EFDs
Increased performance levels with EFDs
This solution demonstrates how leveraging the Enterprise Flash Drive (EFD) technology available in the Symmetrix V-Max storage array achieves increased performance levels and energy efficiency. Because EFDs contain no moving parts, much of the latency associated with traditional magnetic disk drives no longer exists. A Symmetrix V-Max with integrated EFDs can deliver single-millisecond application response times and up to 30 times more IOPS than traditional FC disk drives. Additionally, because there are no mechanical components, EFDs consume significantly less energy than hard disk drives. Energy consumption can be reduced by up to 98 percent for a given IOPS workload by replacing disk drives with fewer EFDs. For example, in some workload scenarios, it would take 30 or more 15k rpm FC disk drives to deliver the same performance as a single EFD. See Chapter 6>Section C>OLTP Application Migration test results summary and recommendations for findings.
Storage tiering
Three types of storage tiering provided by the Symmetrix V-Max
This solution utilizes the three types of storage media available on the Symmetrix V-Max platform:
EFDs
FC disk drives
SATA drives
The environment also uses both RAID 1 mirroring and RAID 5 striping so that the most active areas (tables) receive the tier of storage best suited to their performance requirements. Tiering storage has proven to reduce costs significantly compared with provisioning large amounts of a single storage type for the entire environment. While results will vary with each customer environment, EFDs have shown performance improvements of up to 30 times in typical OLTP workloads. See Chapter 6>Section D> Storage Tiering test results summary and recommendations for findings.
Virtual LUN (VLUN) migration
The Virtual LUN migration feature introduced with Symmetrix V-Max gives SQL Server storage DBAs the ability to transparently migrate database volumes between different storage types and between different tiers of protection. Database volumes can be migrated either to unallocated space (also referred to as unconfigured space) or to configured space, which is defined as existing Symmetrix volumes that are not currently assigned to a host within the same subsystem. The data on the original source volumes is cleared using instant volume table of contents (VTOC) once the migration has completed. The migration does not require swap or driver (DRV) space, and is nondisruptive to the attached SQL application systems and to other internal Symmetrix applications such as TimeFinder and SRDF. All migration combinations of drive types and protection types are valid except for unprotected volumes.
As demonstrated in this proven solution, the database files move to EFD, FC, or SATA drives depending on user activity. The device migration is completely transparent to the host operating system and the SQL application because the migration operation is executed against the Symmetrix device, so the host address of the device does not change and database operations are uninterrupted. Furthermore, in SRDF environments (like this validated solution), customers do not need to re-establish their DR protection after the migration.
Storage tiering considerations
Consider the following before implementing storage tiering:
Performance improves when all of the database and log files with RAID 1 data protection are placed on EFD drives. However, this may not be the best use of the storage resource. Some database files will be more active than others and require higher performing drives.
Database applications tend to display workload skew, where most of the I/O demand is focused on a small number of LUNs and the remaining LUNs have much lower demand. Applications that behave this way are good candidates for storage tiering.
See Chapter 6>Section D> Storage Tiering test results summary and recommendations for findings.
Monitor the I/O activity pattern of the OLTP database
Monitoring the I/O pattern of the database over time is a good first step for identifying the most active databases. Usage patterns can change depending on the day of the week, or the week of the month. Gather enough performance data to understand utilization patterns.
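One way to look for such patterns is to average the collected samples per LUN and per day of the week. The following Python sketch is illustrative only: the sample values, the tuple layout, and the function name `average_iops_by_weekday` are assumptions made for this example and are not part of the tested solution or of any EMC or Microsoft tool.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical samples (timestamp, LUN, IOPS); these are not measured values.
SAMPLES = [
    ("2010-01-04 10:00", "Broker\\B1", 1300.0),
    ("2010-01-04 14:00", "Broker\\B1", 1340.0),
    ("2010-01-09 10:00", "Broker\\B1", 400.0),
    ("2010-01-04 10:00", "Customer\\C1", 50.0),
    ("2010-01-09 10:00", "Customer\\C1", 55.0),
]


def average_iops_by_weekday(samples):
    """Return {lun: {weekday name: mean IOPS}} for (timestamp, lun, iops) samples."""
    buckets = defaultdict(lambda: defaultdict(list))
    for stamp, lun, iops in samples:
        parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        buckets[lun][DAY_NAMES[parsed.weekday()]].append(iops)
    return {lun: {day: mean(values) for day, values in days.items()}
            for lun, days in buckets.items()}


if __name__ == "__main__":
    for lun, days in average_iops_by_weekday(SAMPLES).items():
        for day, iops in days.items():
            print(f"{lun:<14} {day:<9} {iops:8.1f}")
```

A large gap between weekday and weekend averages for the same LUN would suggest that a short collection window is not enough to characterize its activity.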
Test OLTP application activity patterns
The test environment’s OLTP application I/O patterns and IOPS values are detailed in the chart and table below. Also, see Chapter 6>Section A> Database components and test configuration for a detailed graphic that clearly illustrates the three database tables used in the test configuration:
Customer data
Broker data, and
Market data
Disk IOPS
Broker\B10 1463.1
Broker\B9 1322.3
Broker\B8 1320.4
Broker\B5 1320.2
Broker\B2 1319.3
Broker\B7 1318.5
Broker\B3 1314.7
Broker\B4 1314.7
Broker\B6 1313.3
Broker\B1 1312.6
Broker\B0 769.9
Customer\C0 311.8
Customer\C10 184.7
MKT DB 90.8
Customer\C9 58.5
Customer\C7 58.0
Customer\C5 57.5
Customer\C3 55.5
Customer\C2 54.7
Customer\C4 53.1
Customer\C6 52.0
Customer\C1 51.9
Customer\C8 51.8
Total IOPS: 15,169
Notes
Based on the data collected, most of the broker file group should be migrated to EFDs, and most of the customer file group should be migrated to SATA drives.
Some of the tables in the broker file group are more active than those in the customer file group.
There is a wide range of activity from the most active to the least active file groups.
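The reasoning in these notes can be sketched as a simple threshold rule applied to the measured IOPS. The Python example below is a minimal sketch, not the method used in the tested solution: the IOPS figures are copied from the table in this section, but the two cut-off values and the function names are hypothetical and chosen only for illustration.

```python
# Baseline IOPS per LUN, copied from the table in this section.
BASELINE_IOPS = {
    "Broker\\B0": 769.9, "Broker\\B1": 1312.6, "Broker\\B2": 1319.3,
    "Broker\\B3": 1314.7, "Broker\\B4": 1314.7, "Broker\\B5": 1320.2,
    "Broker\\B6": 1313.3, "Broker\\B7": 1318.5, "Broker\\B8": 1320.4,
    "Broker\\B9": 1322.3, "Broker\\B10": 1463.1,
    "Customer\\C0": 311.8, "Customer\\C1": 51.9, "Customer\\C2": 54.7,
    "Customer\\C3": 55.5, "Customer\\C4": 53.1, "Customer\\C5": 57.5,
    "Customer\\C6": 52.0, "Customer\\C7": 58.0, "Customer\\C8": 51.8,
    "Customer\\C9": 58.5, "Customer\\C10": 184.7,
    "MKT DB": 90.8,
}

# Hypothetical cut-offs, assumed for this example only.
EFD_MIN_IOPS = 700.0
FC_MIN_IOPS = 150.0


def suggest_tier(iops):
    """Map a LUN's average IOPS to a candidate storage tier."""
    if iops >= EFD_MIN_IOPS:
        return "EFD"
    if iops >= FC_MIN_IOPS:
        return "FC"
    return "SATA"


def share_of_total(iops_by_lun, prefix):
    """Fraction of total IOPS served by LUNs whose name starts with prefix."""
    total = sum(iops_by_lun.values())
    part = sum(v for name, v in iops_by_lun.items() if name.startswith(prefix))
    return part / total


if __name__ == "__main__":
    for lun, iops in sorted(BASELINE_IOPS.items(), key=lambda kv: -kv[1]):
        print(f"{lun:<14} {iops:8.1f}  {suggest_tier(iops)}")
    print(f"Broker share of IOPS: {share_of_total(BASELINE_IOPS, 'Broker'):.1%}")
```

With these assumed thresholds, every broker LUN is a candidate for EFDs, Customer\C0 and Customer\C10 remain on FC drives, and the remaining LUNs are candidates for SATA drives. The broker LUNs account for roughly 93 percent of the measured IOPS, which illustrates the workload skew described earlier.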
Migrating the OLTP application data
The following details how OLTP application data is migrated in the test environment. See Chapter 6>Section C>OLTP Application Migration test results summary and recommendations for findings.
The SQL database and log files used in the test environment were placed on 128 FC drives with RAID 1 protection.
The data and log files were moved nondisruptively to EFDs in the same array leveraging the VLUN migration feature.
The RAID type is also changed during the migration process, allowing the data to fit onto eight EFDs. For example:
Testing started with 8+8 RAID 1 on FC drives
Testing ended with 7+1 RAID 5 on EFDs
In this solution, LUNs are migrated as a member of a storage group. However, LUNs may also be:
Ungrouped
Members of a device group
The migration is performed using the LUN Migration Wizard available on the Tasks view of the EMC Symmetrix Management Console (SMC). See Chapter 5>Storage Optimization>Migrating the LUNs using the LUN Migration wizard in SMC.
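Part of the reason the data fits on fewer drives after the RAID type changes is that RAID 5 (7+1) spends less raw capacity on protection than RAID 1. The following Python sketch shows only that ratio. The source does not state drive capacities, so the 1,000 GB figure and the function names are assumptions used purely for illustration.

```python
def raid1_usable_fraction():
    """RAID 1 mirrors every block, so half of the raw capacity is usable."""
    return 0.5


def raid5_usable_fraction(data_members, parity_members=1):
    """RAID 5 (N+1) keeps one member's worth of parity per group."""
    return data_members / (data_members + parity_members)


def raw_capacity_needed(usable_gb, usable_fraction):
    """Raw capacity required to present usable_gb to the host."""
    return usable_gb / usable_fraction


if __name__ == "__main__":
    usable_gb = 1000.0  # hypothetical amount of data, not from the source
    raid1_raw = raw_capacity_needed(usable_gb, raid1_usable_fraction())
    raid5_raw = raw_capacity_needed(usable_gb, raid5_usable_fraction(7))
    print(f"RAID 1:       {raid1_raw:.1f} GB raw")
    print(f"RAID 5 (7+1): {raid5_raw:.1f} GB raw")
    print(f"Ratio:        {raid5_raw / raid1_raw:.3f}")
```

Under these assumptions, RAID 5 (7+1) needs a little over half the raw capacity that RAID 1 needs for the same usable space. The rest of the drive-count reduction depends on the capacity and performance of the individual drives.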
Migrating the LUNs using the LUN Migration wizard in SMC
Use the LUN Migration wizard available in SMC to configure the LUNs for migrating to the appropriate storage tier.
Step Action
1 Click the LUN Migration wizard hyperlink from the Task view in SMC to start the LUN Migration wizard. A Welcome screen appears.
2 Follow the onscreen prompts. Type the data required to migrate the LUNs.
3 The Select Source Devices screen appears:
Select the Symmetrix ID.
Type a Session Name. The Session Name is used by SMC to track migration progress.
Select a Group Type.
Select a Group Name.
Click on the appropriate storage group and click Ok to confirm the selection.
Select the LUNs to be migrated from the Available Devices pane and move them to the Selected Source Devices pane on the screen.
Click Next.
4 The Select Migration Type screen appears:
Note: These settings enable SMC to create appropriate target devices prior to migration.
Select the Unconfigured Devices radio button.
Select the RAID type.
Select the Disk Group.
Click Next.
5 The Summary screen indicates that the LUNs are migrating to the SATA drives using a RAID 6 14+2 protection type. Click Finish to start the migration.
Checking the migration status
Use the LUN Migration wizard in SMC to monitor the migration.
Step Action
1 Navigate to the Symmetrix Arrays Migration Session folder. Select the appropriate session.
2 Click the Properties tab to view device information.
3 Click the RAID Group Info tab to view the properties of the device being migrated. Properties include the source RAID type and the destination RAID type.
In the example shown below, you can see a primary mirror of RAID 1 and a secondary mirror of RAID 5 (7+1). This is normal during a migration. When the migration completes, there will only be a primary mirror of RAID 5 (7+1).
Chapter 6: Test and Validation
Overview
Introduction
End-to-end testing of the entire infrastructure was performed to validate the achievable performance levels for this solution. Performance is measured in five key phases:
Baseline performance
OLTP application migration
Storage tiering
Replication Manager job cycle
SQL Server databases restore and recovery with V-Max SRDF/CE integrated software
Contents
Section A: Testing methodology
Section B: Baseline performance summary
Section C: OLTP application migration test results summary and recommendations
Section D: Storage tiering test results summary and recommendations
Section E: Replication Manager test results summary and recommendations
Section F: Failover clustering with the Symmetrix V-Max SRDF/CE integrated software test results summary and recommendations
Section A: Testing methodology
Overview
Introduction
This section describes the key components of the test configuration.
Contents
Generating the workload for testing
Database components and test configuration
Generating the workload for testing
SQL load test tool
The SQL load test tool used in this environment simulates an OLTP workload. It comprises a set of transactional operations designed to exercise system functionality in a manner representative of a complex OLTP application environment.
OLTP workloads
The OLTP application used to generate user load in this test environment is based on the TPC Benchmark-E (TPC-E) standard. TPC-E testing is composed of a set of transactions that represent the processing activities. The database schema, data population, transactions, and implementation rules have been designed to be broadly representative of modern OLTP systems. The TPC-E application models the activity of a brokerage firm by:
Managing customer accounts
Executing customer trade orders
Tracking customer activity with financial markets
For further clarification, see Chapter 6>Section A> Database components and test configuration for a detailed graphic that clearly illustrates the three database tables used in the test configuration (customer data, broker data, and market data).
Database components and test configuration
Key components of the OLTP test environment
This benchmark is composed of a set of transactions that are executed against three sets of database tables that represent market data, customer data, and broker data. A fourth set of tables contains generic dimension data such as zip codes. The following diagram illustrates the key components of the test environment.
Logical drive functionality and configuration
The following table details how the logical drives function in the test environment.
Note: The standard LUN size configured on the Symmetrix V-Max used in solution testing was 22.5 GB. However, it is possible to configure the LUNs using a smaller size.
Function Size Number of LUNs RAID type
Filesystem mount points 1.9 GB 2 1
MSDTC storage 22.5 GB 1 1
SQL system databases 22.5 GB 1 1
SQL system logs 22.5 GB 1 1
TempDB data files 22.5 GB 4 1
TempDB log 22.5 GB 1 1
Application database LUNs
The test environment integrates 23 LUNs that contain database application files, as detailed in the following table. In addition, see Chapter 6>Section A> Database components and test configuration for a detailed graphic that clearly illustrates the three database tables used in the test configuration (customer data, broker data, and market data).
Function Size Number of LUNs RAID type
Application Database Transaction Log file 90 GB 1 1
Database File Broker B0 135 GB 1 1
Database File Broker B1 90 GB 1 1
Database File Broker B2 90 GB 1 1
Database File Broker B3 90 GB 1 1
Database File Broker B4 90 GB 1 1
Database File Broker B5 90 GB 1 1
Database File Broker B6 90 GB 1 1
Database File Broker B7 90 GB 1 1
Database File Broker B8 90 GB 1 1
Database File Broker B9 90 GB 1 1
Database File Broker B10 90 GB 1 1
Database File Customer C0 45 GB 1 1
Database File Customer C1 22.5 GB 1 1
Database File Customer C2 22.5 GB 1 1
Database File Customer C3 22.5 GB 1 1
Database File Customer C4 22.5 GB 1 1
Database File Customer C5 22.5 GB 1 1
Database File Customer C6 22.5 GB 1 1
Database File Customer C7 22.5 GB 1 1
Database File Customer C8 22.5 GB 1 1
Database File Customer C9 22.5 GB 1 1
Database File Customer C10 22.5 GB 1 1
Partitioning the SQL database
SQL table partitioning is used to segment data into smaller, more manageable sections. Table partitioning can lead to better performance through parallel operations. The performance of large-scale operations across extremely large data sets (for instance many millions of rows) can benefit by performing multiple operations against individual subsets in parallel. The number of table partitions to allocate depends on:
Table size
LUN utilization
The broker and customer file groups for this application are the largest and best candidates for partitioning. For more information on the file groups used in testing, see Chapter 6>Section A> Database components and test configuration for a detailed graphic that shows the three sets of database tables used (customer data, broker data, and market data).
Broker and customer file groups
The following table details the file groups used in testing. For further clarification, see Chapter 6>Section A> Database components and test configuration for a detailed graphic that clearly illustrates the three database tables used in the test configuration (customer data, broker data, and market data).
Note: The OLTP application’s storage is configured across the FC drives using RAID 1 protection.
File group name  Drive (directory with mount point)  Table names
broker_fg1-10  M:\Broker\B1~B10  CASH_TRANSACTION, SETTLEMENT, TRADE, TRADE_HISTORY
customer_fg1-10  M:\Customer\C1~C10  HOLDING, HOLDING_HISTORY
broker_fg  M:\Broker\B0  CHARGE, COMMISSION_RATE, TRADE_TYPE, TRADE_REQUEST, BROKER
customer_fg  M:\Customer\C0  ACCOUNT_PERMISSION, CUSTOMER, CUSTOMER_ACCOUNT, CUSTOMER_TAXRATE, HOLDING_SUMMARY
market_fg  M:\MKT DB  EXCHANGE, INDUSTRY, SECTOR, STATUS_TYPE, COMPANY, COMPANY_COMPETITOR, DAILY_MARKET, FINANCIAL, LAST_TRADE, NEWS_ITEM, NEWS_XREF, SECURITY, WATCH_ITEM, WATCH_LIST
misc_fg  M:\MKT DB  TAXRATE, ZIP_CODE, ADDRESS
Tempdb  Y:\TEMPDB1~4
Transaction Log  L:\
Logical drive configuration
The following table details the logical drives used in the application test environment. See Chapter 6>Section A> Database components and test configuration for a detailed graphic that clearly illustrates the three database tables used in the test configuration (customer data, broker data, and market data).
Disk File Group Size
M:\Broker\B0 broker_fg 135 GB
M:\Broker\B1-B10 broker_fg1-10 90 GB each
M:\Customer\C0 customer_fg 45 GB
M:\Customer\C1-C10 customer_fg1-10 22.5 GB each
M:\MKT DB misc_fg 45 GB
Section B: Baseline performance summary
Overview
Introduction
A performance baseline of the SQL Server application is measured prior to introducing SRDF/CE, storage tiering, and EFDs into the environment. The following baseline performance metrics are identified:
CPU utilization
Database TPS
Read/write activity
IOPS and latency values for each database LUN
Contents
Baseline performance test profile
Baseline performance test results
Baseline performance test profile
Summary
A performance baseline of the SQL Server application is determined using the following profile:
Simulated user load is provided from a utility server that initiates transactions against the database.
The database contains 1.7 TB of data supporting 75,000 users.
Simulated workload with 1 percent concurrency rate and zero think time consistent with the Microsoft testing framework.
Baseline performance test results
Baseline performance—CPU utilization
CPU utilization averages 73 percent, as shown in the following chart.
Baseline performance—database activity
The following details database activity:
The database is processing 2,180 TPS
A high percentage of activity focuses on broker data
80 percent Read activity
20 percent Write activity
Baseline performance—IOPS activity for database LUNs
The following chart shows IOPS activity for the database LUNs.
Baseline performance—IOPS and latency values
The average IOPS and latency values for each database LUN are listed below.
Disk IOPS Latency
Broker\B0 769.9 9 ms
Broker\B1 1312.6 5 ms
Broker\B10 1463.1 5 ms
Broker\B2 1319.3 4 ms
Broker\B3 1314.7 4 ms
Broker\B4 1314.7 4 ms
Broker\B5 1320.2 5 ms
Broker\B6 1313.3 5 ms
Broker\B7 1318.5 5 ms
Broker\B8 1320.4 4 ms
Broker\B9 1322.3 5 ms
Customer\C0 311.8 6 ms
Customer\C1 51.9 5 ms
Customer\C10 184.7 8 ms
Customer\C2 54.7 5 ms
Customer\C3 55.5 5 ms
Customer\C4 53.1 5 ms
Customer\C5 57.5 5 ms
Customer\C6 52.0 5 ms
Customer\C7 58.0 5 ms
Customer\C8 51.8 5 ms
Customer\C9 58.5 5 ms
MKT DB 90.8 1 ms
Total IOPS: 15169.3
Section C: OLTP application migration test results summary and recommendations
Overview
Introduction
The following sections detail the effect of moving the OLTP application database files to EFDs.
Contents
Summary of test results
Recommendations
Summary of test results
CPU utilization after moving the application database to EFDs
Moving the application database to EFDs increased performance. The same user load applied in the baseline performance testing is run against the relocated application database. The CPU percent utilization increased from 73 percent to 95 percent, as shown in the following image. This shows that the storage is now capable of processing more TPS, which translates to increased CPU load on the SQL Server.
Increased IOPS and improved disk latency
The table below indicates a dramatic difference in IOPS for the database LUNs after migration completes. Average disk latency is between 2 ms and 4 ms.
LUNs on FC drives LUNs on EFDs
LUN IOPS Latency IOPS Latency
Broker\B0 769.9 9 ms 1427.0 2 ms
Broker\B1 1312.6 5 ms 3083.5 2 ms
Broker\B2 1319.3 4 ms 2083.5 2 ms
Broker\B3 1314.7 4 ms 2380.5 2 ms
Broker\B4 1314.7 4 ms 2179.3 2 ms
Broker\B5 1320.2 5 ms 2485.0 3 ms
Broker\B6 1313.3 5 ms 2535.1 2 ms
Broker\B7 1318.5 5 ms 2877.8 2 ms
Broker\B8 1320.4 4 ms 2755.3 2 ms
Broker\B9 1322.3 5 ms 3083.5 3 ms
Broker\B10 1463.1 5 ms 1427.0 3 ms
Customer\C0 311.8 6 ms 147.3 3 ms
Customer\C1 51.9 5 ms 157.2 3 ms
Customer\C10 184.7 8 ms 683.4 2 ms
Customer\C2 54.7 5 ms 157.2 2 ms
Customer\C3 55.5 5 ms 173.9 2 ms
Customer\C4 53.1 5 ms 157.5 3 ms
Customer\C5 57.5 5 ms 144.2 3 ms
Customer\C6 52.0 5 ms 112.8 2 ms
Customer\C7 58.0 5 ms 149.4 2 ms
Customer\C8 51.8 5 ms 122.2 2 ms
Customer\C9 58.5 5 ms 122.5 2 ms
MKT DB 90.8 1 ms 203.3 1 ms
Total IOPS: 15169.3 28648.4
Reduction in physical disks
The number of physical disks is reduced from 128 FC drives to 8 EFDs with an increased transaction rate of 2,753 TPS.
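The figures reported in this section can be combined into a few derived ratios. The following Python sketch is a worked calculation only: the inputs are the totals and drive counts stated in this section and the baseline rate of 2,180 TPS from Section B, while the function names are assumptions made for this example. The per-drive figure is host-level IOPS divided by the number of physical drives, not a measurement of back-end disk I/O.

```python
FC_DRIVES = 128
EFD_DRIVES = 8
FC_TOTAL_IOPS = 15169.3    # database LUNs on FC drives
EFD_TOTAL_IOPS = 28648.4   # database LUNs on EFDs
BASELINE_TPS = 2180
EFD_TPS = 2753


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def iops_per_drive(total_iops, drives):
    """Host-level IOPS divided by the number of physical drives."""
    return total_iops / drives


if __name__ == "__main__":
    fc = iops_per_drive(FC_TOTAL_IOPS, FC_DRIVES)
    efd = iops_per_drive(EFD_TOTAL_IOPS, EFD_DRIVES)
    print(f"IOPS change: {percent_change(FC_TOTAL_IOPS, EFD_TOTAL_IOPS):.1f}%")
    print(f"TPS change:  {percent_change(BASELINE_TPS, EFD_TPS):.1f}%")
    print(f"IOPS per drive: FC {fc:.1f}, EFD {efd:.1f} ({efd / fc:.1f}x)")
```

On these numbers, total IOPS rises by about 89 percent and TPS by about 26 percent, while host IOPS per physical drive is roughly 30 times higher on the EFDs.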
Recommendations
OLTP application migration to EFDs
OLTP application migration testing indicates that customers should:
Use EFDs to increase IOPS by consolidating data with a smaller footprint.
Use EFDs for read-intensive database partitions, but not for all database partitions. For example, excellent performance can be achieved by placing less utilized database partitions on less expensive media, while reserving EFDs for heavily utilized partitions.
Analyze OLTP workload activity to identify the partitions that would benefit from using EFDs the most.
Section D: Storage tiering test results summary and recommendations
Overview
Introduction
The following sections detail the effect of moving the application database files to EFDs, FC, and SATA drives.
Contents
Summary of test results
Recommendations
Summary of test results
Performance results—CPU utilization
The following chart details CPU utilization after storage tiering is introduced into the test environment. The same user load applied to baseline performance testing is run against the relocated application database, where the CPU utilization percentage increased from 73 percent to 81 percent.
IOPS and latency
The following table compares IOPS and disk latency for each LUN before and after storage tiering is implemented in the test environment.
Before Tiering After Tiering
LUN IOPS Latency Storage type IOPS Latency Storage type
Broker\B0 [1] 769.9 9 ms FC 928.1 2 ms Flash Drive
Broker\B1 [1] 1312.6 5 ms FC 1596.8 3 ms Flash Drive
Broker\B2 1319.3 4 ms FC 1512.7 3 ms Flash Drive
Broker\B3 1314.7 4 ms FC 1600.3 3 ms Flash Drive
Broker\B4 [1] 1314.7 4 ms FC 1404.1 2 ms Flash Drive
Broker\B5 1320.2 5 ms FC 1515.4 3 ms Flash Drive
Broker\B6 1313.3 5 ms FC 1497.7 2 ms Flash Drive
Broker\B7 1318.5 5 ms FC 1499.7 2 ms Flash Drive
Broker\B8 1320.4 4 ms FC 1494.3 2 ms Flash Drive
Broker\B9 1322.3 5 ms FC 1600.1 4 ms Flash Drive
Broker\B10 1463.1 5 ms FC 1644.2 4 ms Flash Drive
Customer\C0 [2] 311.8 6 ms FC 299.1 5 ms FC
Customer\C1 51.9 5 ms FC 52.3 5 ms SATA
Customer\C10 [2] 184.7 8 ms FC 204 6 ms FC
Customer\C2 54.7 5 ms FC 46.5 5 ms SATA
Customer\C3 55.5 5 ms FC 47.1 5 ms SATA
Customer\C4 53.1 5 ms FC 58.8 6 ms SATA
Customer\C5 57.5 5 ms FC 59.6 6 ms SATA
Customer\C6 52.0 5 ms FC 44.3 5 ms SATA
Customer\C7 58.0 5 ms FC 53.8 5 ms SATA
Customer\C8 51.8 5 ms FC 46.5 5 ms SATA
Customer\C9 58.5 5 ms FC 59.8 5 ms SATA
MKT DB 90.8 1 ms FC 93.1 2 ms SATA
Total IOPS: 15169.3 17358.3
[1] Note the improvement in latency by moving these LUNs to EFDs.
[2] Note the improvement in latency on FC drives after the busier LUNs are moved to EFDs. Increased performance for the remaining LUNs is observed (as there is less demand on the FC drives).
Performance results—number of physical disks reduced
Solution testing validated that the number of physical disks is reduced after storage tiering is implemented. The original 128 FC drive count is now reduced to 56 drives, as follows:
8 EFDs
16 FC drives
32 SATA drives
Storage tiering consolidated resources significantly. Additionally, the migration to EFDs revealed a marked increase in performance levels with an improved transaction rate of 2,605 TPS. This represents an increase of 19 percent (425 TPS).
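The drive counts and transaction rates quoted above can be checked with a short calculation. The Python sketch below uses only figures stated in this guide (128 FC drives and 2,180 TPS at baseline, and 8 EFDs, 16 FC drives, 32 SATA drives, and 2,605 TPS after tiering). The variable and function names are assumptions made for this example.

```python
BASELINE_DRIVES = {"FC": 128}
TIERED_DRIVES = {"EFD": 8, "FC": 16, "SATA": 32}
BASELINE_TPS = 2180
TIERED_TPS = 2605


def total_drives(drives_by_type):
    """Total number of physical drives across all tiers."""
    return sum(drives_by_type.values())


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    before = total_drives(BASELINE_DRIVES)
    after = total_drives(TIERED_DRIVES)
    print(f"Drives: {before} -> {after} ({percent_change(before, after):.2f}%)")
    print(f"TPS:    {BASELINE_TPS} -> {TIERED_TPS} "
          f"(+{TIERED_TPS - BASELINE_TPS}, {percent_change(BASELINE_TPS, TIERED_TPS):.1f}%)")
```

The tiered layout uses 56 drives, a reduction of about 56 percent from the baseline, and the gain of 425 TPS works out to approximately 19.5 percent, consistent with the rounded figure given in the text.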
Recommendations
Storage tiering summary
Storage tiering testing indicates that customers should:
Understand utilization patterns prior to introducing storage tiering into their environment.
Use EFDs to support the busiest database partitions.
Use SATA drives to support the less frequently accessed partitions.
Leverage the available storage tiers to achieve higher performance levels than a single FC tier provides.
Section E: Replication Manager test results summary and recommendations
Introduction
The following sections detail the effect of using Replication Manager to maintain storage replicas of the OLTP application database, providing recovery points in case of site failure or data corruption.
Contents
Summary of test results
Recommendations
Summary of test results
8-hour operational test cycle
Testing was performed to validate a full 8-hour cycle that consisted of:
User activity (typical user load, accessing and updating databases)
An hourly point-in-time snapshot of the database
Regular database maintenance that:
Reorganizes the index
Updates statistics
Note: After the 8-hour test cycle, a full copy clone of the database is taken and mounted to an alternate host.
Impact on daily application activity is monitored
The 8-hour operational test is monitored during the Replication Manager cycle. Minimal impact to the SQL Server application performance is observed during this timeframe.
Replication Manager job performance results
Replication Manager job performance test results presented in the table below indicate that replicas are created and recovered within a two-hour window:
Item Result
SQL Server data set size 1.7 TB
Average time for Replication Manager Snap job 10 min
Average time for Replication Manager Clone job 65 min
SQL recovery from a snapshot job 12 min to restore the data and replay the logs
SQL recovery from a clone job 1 hour, 55 min to restore the data and replay the logs
TimeFinder/Snap and TimeFinder/Clone performance
The following chart represents the impact of TimeFinder/Snap snapshots on a LUN’s response time during snapshot activity. The greatest impact to performance occurs during:
Snapshot activation
Snapshot termination
This is significant because performance is affected only when a snapshot is activated or terminated. While the snapshot is active, no impact to performance is observed.
Recommendations
Replication Manager summary
Based on observations, replication testing indicates that:
Recovery point snapshots can be taken at regular intervals without a great impact on the database performance.
A clone is necessary to have a full, independent copy of the database in the event of loss of the source device.
A transaction log backup with the truncate option must be performed after the clone/snap.
Section F: Failover clustering with the Symmetrix V-Max SRDF/CE integrated software test results summary and recommendations
Overview
Introduction
Testing is based on a typical, high transaction-based OLTP environment with planned failovers introduced in a controlled manner. Simulated site failures were introduced to the test environment by interrupting service. All nodes at the Production site were easily moved to the remote DR site, and produced excellent recovery times.
Contents
Summary of Test Results
Recommendations
Summary of Test Results
SRDF/CE local site RTO results
Item RTO
Planned failover (moving the cluster services between the two Production site nodes) 41 seconds
Node failure at the local site 45 seconds
SRDF/CE remote site RTO results
Testing was performed at a distance of 10 km and 200 km between the Production and DR sites.
Item RTO
Planned failover (moving the cluster services between the Production site and the DR site) 1 min, 37 sec (10 km); 1 min, 43 sec (200 km)
Site failure at the Production site 1 min, 54 sec (10 km); 2 min, 3 sec (200 km)
Validate SQL Server availability
The SQL Server remained available during the planned failover, but required manual intervention during a site failure. During planned failovers the SQL Server cluster remains available because all three nodes continue to be available on the network. Since a node majority can be maintained, services continue to remain online. During a Production site failure, only one node continues to be available on the network. Therefore, manual intervention (force start) is required to restart the cluster at the DR site.
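The node-majority behavior described above can be illustrated with a simple vote count. This is a sketch under stated assumptions, not the cluster's actual logic: it assumes one vote per node and a strict-majority rule, and the function name `has_quorum` is hypothetical and not part of any Microsoft or EMC API.

```python
def has_quorum(total_votes: int, votes_online: int) -> bool:
    """Strict majority: more than half of all configured votes must be online."""
    return votes_online > total_votes // 2

# Tested cluster: two Production site nodes plus one DR site node.
TOTAL_NODES = 3

# Planned failover: all three nodes stay on the network.
planned = has_quorum(TOTAL_NODES, votes_online=3)

# Production site failure: only the DR site node remains.
site_failure = has_quorum(TOTAL_NODES, votes_online=1)

print(f"planned failover keeps quorum: {planned}")
print(f"site failure keeps quorum: {site_failure}")
```

With one of three votes remaining, the majority is lost, which is why the cluster must be force started at the DR site.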
Impact of SRDF/CE within distances of 10 km and 200 km
Testing showed that SRDF/CE had a negligible impact on the environment:
At a distance of 10 km, the average disk latency for database LUNs was 5.2 ms, with all of the databases on FC disk devices.
When the distance was increased to 200 km, the average latency increased to 5.7 ms.
Storage tiering had no effect on this latency change, suggesting that the delay is due to the extended distance.
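The size of the latency change can be worked out directly from the two reported averages. A minimal sketch; the helper name `latency_change` is an assumption introduced here for illustration.

```python
def latency_change(baseline_ms: float, new_ms: float) -> tuple:
    """Return (absolute increase in ms, relative increase in percent)."""
    delta = new_ms - baseline_ms
    return delta, delta / baseline_ms * 100

# Reported average disk latencies at 10 km and 200 km.
delta_ms, delta_pct = latency_change(5.2, 5.7)
print(f"+{delta_ms:.1f} ms ({delta_pct:.1f} percent)")  # +0.5 ms (9.6 percent)
```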
Recommendations
SRDF/CE summary
Based on observations, SRDF/CE testing indicates that:
Microsoft failover clusters can be effectively stretched across data centers for site failure protection.
The impact of site replication on daily operations is small enough to meet standard requirements for good performance.
The cluster has to be force started when the Production site fails. Failover can be made automatic by adding more cluster resources: an additional node at the DR site and a file share witness at a third site.
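The effect of the recommended additions can be checked with a simple vote count. This is a minimal sketch, assuming that each node and the file share witness carries one vote and that quorum requires a strict majority; the site labels and the `survives_site_loss` helper are hypothetical.

```python
def survives_site_loss(votes_by_site: dict, failed_site: str) -> bool:
    """True if the votes outside failed_site still form a strict majority."""
    total = sum(votes_by_site.values())
    remaining = total - votes_by_site.get(failed_site, 0)
    return remaining > total // 2

# Tested configuration: two Production site nodes and one DR site node.
tested = {"production": 2, "dr": 1}

# Recommended configuration: a second DR site node plus a file share
# witness at a third site (assumed to hold one vote).
recommended = {"production": 2, "dr": 2, "witness_site": 1}

print(survives_site_loss(tested, "production"))       # False: force start needed
print(survives_site_loss(recommended, "production"))  # True: quorum is retained
```

In the recommended layout, three of five votes remain after a Production site failure, so the surviving nodes keep quorum without manual intervention.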
Chapter 7: Conclusion
Overview
Introduction
This use case addresses the many challenges that OLTP administrators continually encounter when trying to establish a high-performing, highly available yet cost-effective DR strategy for data centers situated in remote locations. Both performance and cost-efficiency demands are met by hosting the backup and recovery solution on the Symmetrix V-Max platform. The solution provides many benefits to customers on a number of levels:
Redundancy and high availability throughout the entire OLTP system
Superior failover/failback performance to geographically dispersed sites
Optimal backup rates with minimal user impact
Nondisruptive storage tiering to enable cost-effective information lifecycle management (ILM)
Significant cost savings through resource optimization and reduced energy consumption
Redundancy and high availability throughout the entire OLTP system
Redundancy and high availability are provided by a geographically distributed cluster across two sites. In this use case, Microsoft Failover Clustering provides a redundant server at the Production site that is able to run the SQL Server services in the event of a hardware failure. With the addition of SRDF/CE, Microsoft Failover Clustering is enhanced to operate across geographically dispersed sites, so the OLTP application can sustain a site failure as well as a hardware failure at the Production site. Beyond application high availability, the Replication Manager server can itself be configured for DR requirements. In this use case, a Replication Manager server is placed at both sites to continue to provide local database protection.
Superior failover/failback performance to geographically dispersed sites
SRDF/CE technology provides cross-volume and storage system consistency, tight integration with the SQL Server application, and simplified usage through automated management. As testing proves, SRDF/CE ensures that DR failover is repeatable and predictable, while significantly reducing man hours required for DR failover management.
Optimal backup rates with minimal user impact
Replication Manager with TimeFinder/Clone and TimeFinder/Snap proves an ideal mechanism for backup automation and acceleration in creating replicas of the Production site databases for instant restore. The simulated, very active OLTP test environment rapidly backed up and recovered the SQL Server databases in the environment. Replication performance statistics are recorded based on a typical 8-hour test cycle. Testing produced excellent backup rates while user impact was less than 10 seconds.
Nondisruptive storage tiering to enable ILM
Organizations can apply the storage tiering guidelines provided to achieve varied cost/performance initiatives. Storage tiering nondisruptively moves lower-priority OLTP databases to less-expensive storage, or conversely, moves critical application data to higher-performing storage types as needs dictate.
EFDs provide significant cost savings
EFDs represent a critical element in this solution by providing significant cost savings through resource optimization and reduced energy consumption. The high-performance characteristics of EFDs eliminate the need for organizations to purchase large numbers of traditional hard disk drives and then use only a small portion of their capacity in order to satisfy the IOPS requirements of the database.
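The reasoning above, that an IOPS-bound workload forces the purchase of far more hard disk drives than its capacity requires, can be sketched as a sizing calculation. Every number below is a hypothetical placeholder chosen for illustration; none of them are measured values or drive specifications from this solution, and the helper names are assumptions.

```python
import math

def drives_required(iops_needed, capacity_gb_needed, drive_iops, drive_capacity_gb):
    """Drives needed to satisfy both the IOPS and the capacity requirement."""
    for_iops = math.ceil(iops_needed / drive_iops)
    for_capacity = math.ceil(capacity_gb_needed / drive_capacity_gb)
    return max(for_iops, for_capacity)

def capacity_used(capacity_gb_needed, drives, drive_capacity_gb):
    """Fraction of the purchased raw capacity that the workload actually uses."""
    return capacity_gb_needed / (drives * drive_capacity_gb)

# Hypothetical workload and per-drive characteristics (illustration only).
WORKLOAD_IOPS = 20_000
WORKLOAD_GB = 2_000
HDD = {"iops": 180, "capacity_gb": 300}
EFD = {"iops": 5_000, "capacity_gb": 400}

hdd_count = drives_required(WORKLOAD_IOPS, WORKLOAD_GB, HDD["iops"], HDD["capacity_gb"])
efd_count = drives_required(WORKLOAD_IOPS, WORKLOAD_GB, EFD["iops"], EFD["capacity_gb"])

print(f"HDDs: {hdd_count}, capacity used: "
      f"{capacity_used(WORKLOAD_GB, hdd_count, HDD['capacity_gb']):.0%}")
print(f"EFDs: {efd_count}, capacity used: "
      f"{capacity_used(WORKLOAD_GB, efd_count, EFD['capacity_gb']):.0%}")
```

With these placeholder figures the hard disk drive count is dictated by IOPS rather than capacity, so most of the purchased capacity sits idle, whereas the EFD count is dictated by capacity.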
Objectives and results
The following table details the objectives of this use case and the results achieved.
Objective: Describe the baseline performance results generated using a TPC-E-like testing tool.
Result: The baseline performance results established that:
The OLTP application’s I/O pattern was 80 percent read and 20 percent write
The CPU was 73 percent utilized under normal load
Average disk latency was 4 ms to 5 ms
The system supported 2,180 TPS
Objective: Validate the benefit of EFD performance for SQL Server OLTP workloads in comparison to traditional FC drive performance.
Result: Migrating the database LUNs to EFDs resulted in:
Doubling the IOPS produced by the same user load
Reducing the disk latencies to the 2 ms to 4 ms range
A reduction in physical disks (from 128 to 8)
Objective: Demonstrate Replication Manager functionality using local clones and snapshots.
Result: Eight-hour test cycles with hourly snapshots and a full clone at the end of the cycle were performed, which resulted in:
A short spike in disk latency at snapshot creation
Recovery from a snapshot averaging 12 min
Creation of a full clone averaging 1 hour 10 min
Recovery from a clone averaging 1 hour 55 min
Objective: Demonstrate Replication Manager server DR capabilities.
Result: Replication Manager Server failover was implemented and tested in this use case. The Replication Manager internal database remained synchronized. The process of swapping between primary and secondary servers is documented in Chapter 4 > Replication Management design > Replication Manager Server failover.
Objective: Perform VLUN migrations to appropriate tiers under load, and document the impact for both EFD-hosted and FC-hosted databases.
Result: Migrating the database LUNs across EFDs, FC, and SATA drives resulted in:
A less dramatic overall IOPS increase of 2,000 IOPS produced by the same user load
Reduced disk latencies (2 ms to 6 ms range)
Improved performance with CPU utilization reported at 81 percent
A reduction of physical disks from 128 to 56
Objective: Validate SQL Server availability and recovery time with both planned and unplanned failure scenarios under simulated load with SRDF/S and SRDF/CE.
Result: Planned failovers, equipment failures, and site failures were tested in this use case:
Failovers between the Production site cluster nodes took 41 seconds to recover
Moving services to the DR site took 1 min, 37 sec at 10 km, and 1 min, 43 sec at 200 km
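The physical disk reductions reported in the results above (128 to 8 with the database LUNs on EFDs, and 128 to 56 with the LUNs tiered across EFD, FC, and SATA drives) can be expressed as percentages. A minimal sketch; the helper name `reduction_percent` is an assumption introduced for illustration.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage reduction from `before` to `after`."""
    return (before - after) / before * 100

efd_only = reduction_percent(128, 8)   # database LUNs migrated to EFDs
tiered = reduction_percent(128, 56)    # LUNs tiered across EFD, FC, and SATA

print(f"EFD migration: {efd_only:.2f} percent fewer disks")
print(f"Tiered layout: {tiered:.2f} percent fewer disks")
```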
Conclusion
Customers looking to deploy a large-scale, SQL Server-based OLTP environment need efficiencies in their backup and recovery process. This solution provides a design that simplifies and automates what is normally considered a very complex process with the potential to drain IT resources and cause user downtime. The powerful combination of the Symmetrix V-Max with SRDF/CE and Replication Manager, as tested, provides an integrated backup solution that is ideal for customers who are delivering mission-critical SQL Server service across their enterprise. Solution testing verified increased performance and cost savings utilizing:
The resource optimization features of the Symmetrix V-Max storage system.
A combination of SRDF/CE and Replication Manager with Symmetrix TimeFinder software providing minimal to no impact on the OLTP databases across geographically dispersed sites.
EMC can help to accelerate assessment, design, implementation, and management while lowering the implementation risks and costs of a backup/disaster recovery solution for a Microsoft SQL Server 2008 OLTP environment. To learn more about this and other Microsoft SQL Server 2008 solutions, contact an EMC representative or visit http://www.emc.com/solutions/application-environment/microsoft/index.htm.