Copyright © 2005 EMC Corporation. Do not Copy - All Rights Reserved.
Symmetrix Solutions Design Concepts - 1
© 2005 EMC Corporation. All rights reserved.
Symmetrix Solutions Design Concepts
Symmetrix and Layered Applications
Welcome to Symmetrix Solutions Design Concepts.
The AUDIO portion of this course is supplemental to the material and is not a replacement for the student notes accompanying this course.
EMC recommends downloading the Student Resource Guide from the Supporting Materials tab, and reading the notes in their entirety.
These materials may not be copied without EMC's written consent.
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
EMC is a registered trademark and Celerra is a trademark of EMC Corporation.
All other trademarks used herein are the property of their respective owners.
Course Objectives
Upon completion of this course, you will be able to:
Describe the important technical data to be gathered about the use of Symmetrix
Describe how technical data is gathered for Symmetrix
Describe how to interpret and comprehend the gathered data
State the parameters to set, and tools used to control and manage Symmetrix
Discuss the best practices for configuring and deploying Symmetrix
The objectives for this course are shown here. Please take a moment to read them.
Concepts and Terminology
Upon completion of this lesson, you will be able to:
Describe Symmetrix management concepts and terminology
Describe TimeFinder and SRDF concepts
The objectives for this lesson are shown here. Please take a moment to read them.
Review of Hardware Topics
Basic Symmetrix architecture
– LUN
– RAID Protection Types
– Front End, Back End
– Dynamic Spares
Performance Basics
– Little’s Law
– Application I/O and block sizes
– Workload characterization
Let us review some of the topics that have been covered in the Foundation modules:
A logical unit number (LUN) is how a storage device is presented by the Symmetrix and recognized by the host.
RAID (Redundant Array of Independent Disks) protection is available in one of three variations: RAID 1 refers to Mirror protection; RAID-S is the older EMC technique for providing RAID protection, where there was one dedicated parity drive for a group of 3 or 7 drives; and the newer variation is RAID 5, where parity is distributed across the entire rank of drives.
The front end of the Symmetrix connects to hosts. The back end is connected to the physical drives. Patented EMC technology (Enginuity) carves up the physical storage and presents it as smaller logical volumes to hosts connected to the front end.
Little’s Law, formulated by John Little of MIT’s Sloan School of Management, has been adapted for computer systems to indicate that response time (such as how quickly an I/O is serviced by a disk) increases non-linearly as utilization increases.
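The non-linear growth described above can be illustrated with a small sketch. It assumes a single-server M/M/1 queue, where response time R = S / (1 - U) for service time S and utilization U. That queueing model is an assumption made for illustration only; the course does not state which model applies to a Symmetrix component, and the 5 ms service time is a made-up figure.

```python
def response_time_ms(service_time_ms: float, utilization: float) -> float:
    """Approximate response time for one M/M/1 queue: R = S / (1 - U).

    The M/M/1 model is an illustrative assumption, not a Symmetrix formula.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time_ms / (1.0 - utilization)


if __name__ == "__main__":
    # Hypothetical 5 ms service time; watch the response time climb steeply.
    for u in (0.10, 0.50, 0.80, 0.90, 0.95):
        print(f"utilization {u:4.0%}: {response_time_ms(5.0, u):6.1f} ms")
```

Under this model, doubling utilization from 45% to 90% raises the response time far more than twofold, which is why components running near their limits deserve attention.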
Block sizes associated with application I/O can vary and have an effect on performance. Response times for large block I/Os tend to be greater, but the amount of data moved in MB/sec is larger. Small block I/Os result in more I/Os per second but lower MB/sec throughput.
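The trade-off between I/Os per second and MB/sec follows from simple arithmetic: throughput is the I/O rate multiplied by the block size. A minimal sketch, with hypothetical function names and 1 MB taken as 1024 KB:

```python
def throughput_mb_s(iops: float, block_kb: float) -> float:
    """Data rate in MB/s for a given I/O rate and block size (1 MB = 1024 KB)."""
    return iops * block_kb / 1024.0


def iops_for_throughput(mb_s: float, block_kb: float) -> float:
    """I/O rate needed to sustain a given data rate at a given block size."""
    if block_kb <= 0:
        raise ValueError("block_kb must be positive")
    return mb_s * 1024.0 / block_kb


if __name__ == "__main__":
    # Same device, two hypothetical workloads.
    print(throughput_mb_s(iops=2000, block_kb=4))    # many small I/Os
    print(throughput_mb_s(iops=300, block_kb=64))    # fewer large I/Os
```

In this made-up comparison the small-block workload issues far more I/Os per second yet moves less data per second than the large-block workload.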
Workloads can be batch or interactive, read or write intensive. Interactive workloads, where humans are waiting for a response from the system, tend to be more response time sensitive. Batch workloads are less sensitive to response times. In addition, disk subsystems can handle more reads per second than writes, so the read/write mix has an impact on disk array performance.
Review of Symmetrix Local and Remote Replication
Layered Applications
– TimeFinder
    Mirror
    Snap
    Clones
– EMC Open Replicator
– SRDF
    Synchronous (SRDF/S)
    Asynchronous (SRDF/A)
    Automated Replication (SRDF/AR)
    Consistency Groups
    Data Mobility (SRDF/DM)
    SRDF/Star
EMC’s rich set of local and remote replication products is introduced in the EMC Technology Foundations modules. To review each product, briefly:
The TimeFinder family comprises the local replication products. TimeFinder/Mirror is the most mature and highest performing local replication product. The Business Continuance Volume (BCV) is a detachable mirror that can act either as a transparent mirror of the standard or as an independent device. TimeFinder/Snap is a space-saving, pointer-based copy of a source, suitable for read intensive applications but less suitable for write intensive applications. TimeFinder/Clone is a full volume, pointer-based copy that overcomes the 4 mirror limitation of TimeFinder/Mirror by allowing up to 8 differential and 16 non-differential full copies of the source.
EMC Open Replicator for Symmetrix is a relatively new product that facilitates data transfers between a DMX and other kinds of EMC and third party arrays.
The Symmetrix Remote Data Facility (SRDF) product suite consists of EMC’s remote replication products for the Symmetrix.
SRDF/S replicates all writes to the local array remotely in real time. SRDF/A buffers writes in the local array’s cache and replicates them remotely in near real time (within seconds to minutes). SRDF/AR takes periodic point in time copies of the original data on disk and propagates the data at a later point in time. This mode of delayed replication requires less network bandwidth and saves transmission costs at the expense of having the target lag the source by a few hours. SRDF/DM is a low cost solution that is good for moving data from one Symmetrix to another in adaptive copy mode. It is unsuitable for disaster recovery. SRDF/Star allows higher tiered customers to maintain three data centers. In the event of a failure two data centers can continue running with very little interruption. This preserves the ability to survive a second data center failure.
Terms Related to Cache Management
Logical Volume Write Pending Ceiling
– Maximum number of writes (measured in tracks) that can be queued to a logical volume
– Slows performance to write intensive logical volumes
– Impact on local applications and SRDF/S performance
System Wide WP ceiling
– Maximum number of writes (measured in tracks) that can be queued to the Symmetrix
– When the ceiling is hit (80% of available cache)
    system performance deteriorates
    SRDF/A sessions fail
Logical volume write-pending restrictions are imposed when too much data has been written to an individual Symmetrix system logical device, but not destaged to disk. When a device reaches the LVWP ceiling, each new write to a track that is not already write pending for the device, will trigger a special task that waits for old data to be destaged to disk before the new write is accepted.
The system-wide write-pending limit is imposed when too much data has accumulated system wide in the Symmetrix system without being destaged to the back-end disks. When the Symmetrix system is at the system-wide write-pending limit, and you write to a track that is not already write pending, each new write for any volume to the Symmetrix system will trigger a special task that waits for old data to be destaged to disk before the new write is completed. At the write-pending limit, the Symmetrix system changes the priority of writes to equal that of read misses. Because the Symmetrix system is no longer prioritizing reads over writes, read response time may also be significantly impacted.
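A rough sketch of how the system-wide ceiling could be checked when sizing or monitoring. The 80% figure comes from the slide; the class name, field names, and the idea of supplying track counts directly are assumptions for illustration, not an Enginuity interface.

```python
from dataclasses import dataclass

SYSTEM_WP_PERCENT = 80  # from the slide: ceiling is hit at 80% of available cache


@dataclass
class CacheState:
    """Hypothetical snapshot of cache usage, expressed in tracks."""
    cache_tracks: int          # available cache, in tracks
    write_pending_tracks: int  # tracks written but not yet destaged to disk

    @property
    def system_wp_ceiling(self) -> int:
        # Integer arithmetic avoids floating point rounding surprises.
        return self.cache_tracks * SYSTEM_WP_PERCENT // 100

    def ceiling_utilization(self) -> float:
        """Fraction of the system-wide ceiling currently consumed."""
        return self.write_pending_tracks / self.system_wp_ceiling

    def at_ceiling(self) -> bool:
        return self.write_pending_tracks >= self.system_wp_ceiling


if __name__ == "__main__":
    state = CacheState(cache_tracks=1_000_000, write_pending_tracks=600_000)
    print(state.system_wp_ceiling, f"{state.ceiling_utilization():.0%}", state.at_ceiling())
```

A monitoring script built along these lines would flag a system well before `at_ceiling()` becomes true, since the notes above describe read response times suffering once the limit is reached.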
I/O Performance Concepts
Bandwidth
– The data transfer rate, measured in MB/s
– Key metric for SRDF links
– Attribute of host interconnects such as Fibre Channel
I/O Size
– Exchange like (4K block size)
– Oracle like (8K block size)
– Sybase like (2K block size)
Read / Write ratio
Random I/O
– Consecutive I/O operations are on different disk areas
Sequential I/O
– Small block sequential writes good for Snap performance
– Large block sequential writes poor for Snap performance
Different sources use the term ‘bandwidth’ differently. Some define bandwidth as the link capacity expressed in Megabytes, and others in Megabits. For the purposes of our discussion we will refer to bandwidth as the data transfer rate in Megabytes per second. It is an important consideration in designing networks for SRDF and Open Replicator. Bandwidth is also an important metric for host interconnects such as Fibre Channel.
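Because link capacity is often quoted in megabits while this course uses megabytes per second, a quick conversion helps when sizing links. The sketch below is illustrative: the function names and the `efficiency` factor (standing in for protocol overhead) are assumptions, and 1 GB is taken as 1024 MB.

```python
def megabits_to_megabytes_per_s(mbit_s: float) -> float:
    """Convert a link rate from megabits/s to megabytes/s (8 bits per byte)."""
    return mbit_s / 8.0


def transfer_hours(data_gb: float, link_mbit_s: float, efficiency: float = 1.0) -> float:
    """Hours needed to move data_gb over a link.

    efficiency is a hypothetical 0-1 factor for protocol overhead and is
    not a value taken from the course material.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the range (0, 1]")
    rate_mb_s = megabits_to_megabytes_per_s(link_mbit_s) * efficiency
    return data_gb * 1024.0 / rate_mb_s / 3600.0


if __name__ == "__main__":
    print(megabits_to_megabytes_per_s(1000))          # a 1000 Mbit/s link
    print(round(transfer_hours(450, 1000, 0.7), 2))   # 450 GB at 70% efficiency
```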
Knowing the customer’s application environment can facilitate the planning process. The Performance Engineering group publishes performance information for the EMC internal Speed community. Their reports are based on different workloads. In general they comprise OLTP workloads with different average I/O sizes and read/write mixes and Decision Support workloads with different I/O sizes and read/write I/O mixes.
Typically the variables in a workload are: a) I/O size in KB b) Sequential or Random I/O c) Read/write mix. It is useful to know how each of these variables affects different EMC products. For instance, large block sequential writes pay a noticeable CopyOnWrite penalty when the write occurs to the source or target of a Snap session. Comparatively, small block sequential writes do not pay as high a penalty.
Security and Change Measurement Tools
Access Control
– The ability to limit the amount of control that a channel attached host with SYMCLI can exercise on a Symmetrix
– The function of segregating devices into pools for control operations by different hosts (e.g. TimeFinder, SRDF, Configuration Manager, etc.)
DeltaMark (SDDF)
– Symmetrix Differential Data Facility is used to track changes between the logical volume and its replica
– 16 per Symmetrix logical volume
– Used for Open Replicator, TimeFinder/Snap, TimeFinder/Mirror, Change Tracker, TimeFinder/Clone
Access Control is a free feature of Solutions Enabler that restricts the ability of a channel attached host to control the Symmetrix using SYMCLI commands. The Access Control paradigm divides the Symmetrix operations into 20+ privileges or Access Rights. Selected Access Rights such as BCV, SRDF, etc. can be assigned to different hosts so they can execute different subsets of SYMCLI commands on selected pools of devices. For instance, a host dedicated to the accounting department could be assigned the privilege to issue TimeFinder commands to a pool of standards and BCVs that are earmarked for use by that department. A host from the sales department lacking that privilege could not exercise TimeFinder control on accounting’s pool of devices.
The DeltaMark feature, sometimes known as the Symmetrix Differential Data Facility (SDDF), identifies tracks that have changed on a Symmetrix LUN since the creation of a DeltaMark session. Each Symmetrix LUN supports up to 16 DeltaMark sessions. It is the mechanism by which relative changes between two volumes such as the source and target of a Snap or a clone, and a standard and BCV are measured. It is also the mechanism by which volume changes in Change Tracker and data transfers performed by Open Replicator are measured.
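The idea of a change-tracking session can be shown with a toy model: each session remembers which tracks have been written since it was created, and a volume supports at most 16 sessions. The class and method names are hypothetical, and this is a conceptual sketch rather than a description of how Enginuity implements SDDF.

```python
class ChangeTrackedVolume:
    """Toy model of per-volume change tracking sessions (not EMC code)."""

    MAX_SESSIONS = 16  # per-volume limit stated in the text

    def __init__(self, num_tracks: int) -> None:
        self.num_tracks = num_tracks
        self._sessions = {}  # session name -> set of changed track numbers

    def create_session(self, name: str) -> None:
        if name in self._sessions:
            raise ValueError(f"session {name!r} already exists")
        if len(self._sessions) >= self.MAX_SESSIONS:
            raise RuntimeError("volume already has 16 sessions")
        self._sessions[name] = set()

    def write(self, track: int) -> None:
        if not 0 <= track < self.num_tracks:
            raise IndexError("track out of range")
        # Every open session records the change independently.
        for changed in self._sessions.values():
            changed.add(track)

    def changed_tracks(self, name: str) -> list:
        return sorted(self._sessions[name])


if __name__ == "__main__":
    vol = ChangeTrackedVolume(num_tracks=100)
    vol.create_session("clone_a")
    vol.write(7)
    vol.create_session("snap_b")   # created later, so it misses track 7
    vol.write(42)
    print(vol.changed_tracks("clone_a"), vol.changed_tracks("snap_b"))
```

Only the changed tracks need to be copied when a replica is refreshed, which is what makes differential operations cheaper than full copies.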
TimeFinder Concepts
TimeFinder Emulation Mode
– Enables use of TimeFinder Clones in existing TimeFinder scripts
– Underlying technology behind RAID 5 BCVs
– Settable by session
Enginuity Consistency Assist (ECA)
– Used for TimeFinder consistent splits
TimeFinder Consistent Splits
– Required for SRDF/AR, SRDF/A and local applications
When TimeFinder was originally introduced in the late 1990s, it functioned by allowing a BCV to become an additional mirror. Though revolutionary at the time, over the years the 4 mirror limit of a Symmetrix LUN limited the number of concurrent full volume copies that could be made from a single source. TimeFinder/Clone, which consumes DeltaMark sessions rather than mirror positions, expanded the number of concurrent full volume copies to 8 with Enginuity 71.
TimeFinder Emulation mode preserves customers’ investments in scripts that had been written for TimeFinder/Mirror. Using this mode, calls to the TimeFinder/Mirror “symmir” command are translated by the CLI into the TimeFinder/Clone command “symclone”.
Enginuity Consistency Assist is the Symmetrix feature which makes it possible to perform TimeFinder consistent splits. ECA will hold write I/O to a user defined list of Symmetrix standard volumes while BCVs are being split from them. The momentary stoppage of writes permits the creation of a consistent and re-startable copy of the data on the BCV. Similar functionality is available also on TimeFinder/Snap, TimeFinder/Clone and Open Replicator, where ECA will hold writes to the source volumes while the Target volumes are activated.
TimeFinder Consistent Splits are an important Enginuity feature and can be executed in the local Symmetrix or in an SRDF attached remote Symmetrix.
SRDF Concepts
SRDF Consistency
– Technology to assure that remote copy is consistent and restartable
– Useful in synchronous SRDF setups with more than two Symmetrix units
– Used in SRDF/Star
SRDF Groups
– Theoretical max of 16 per director (pair) and 64 per Symmetrix
– Important for managing SRDF/A
SRDF/A session priorities
– A number from 1 to 64 that regulates the order in which SRDF/A sessions will fail if the Symmetrix fails to cope with incoming writes
SRDF/A Cycle Time
– The minimum time that must elapse before a delta set switch occurs, provided the transmit and the apply cycles have finished
– Governs the amount of data loss exposure
SRDF Daemon
– Host process run on one or more hosts to manage SRDF/A cycle switching and Synchronous SRDF consistency groups
SRDF Consistency Groups for open systems allow customers to define logical volume groups, which can be associated with a given workload. These groups of SRDF logical volumes will automatically be suspended in case there is a failure to write to any volume in the group because of network or other hardware problems. The remote SRDF logical volumes will be consistent, even if these logical volumes span multiple Symmetrix systems. One such example is a large database with its tables on one Symmetrix and its log files on another.
The Open Systems version of this capability is available for HP/UX, Solaris and IBM AIX environments and uses PowerPath to manage inter Symmetrix communication. Consistent split capability is inherent in SRDF/A implementations where data on the R2 is guaranteed to be consistent if a consistent SRDF/A setup encounters a failure.
In most kinds of SRDF implementations, devices belonging to an RDF group can be subdivided into smaller SYMCLI device groups and distributed among different applications. These device groups can establish, split, fail over and fail back independently of each other. In contrast, a group functioning in SRDF/A mode requires that all volumes in the RDF group be managed as a single entity, i.e. devices belonging to an RDF group must all be placed in one device group. This places a limit on the number of device groups, and consequently on the number of applications, that can be supported per director pair in the Symmetrix.
Since Enginuity 71 supports multiple SRDF/A groups within a single Symmetrix, it is possible to assign a session priority to each SRDF/A group. If cache resources become overextended, SRDF/A group traffic will be suspended in order from lowest priority (64) to highest priority (1).
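The suspension order described above amounts to sorting the groups by priority number, largest first. A minimal sketch with hypothetical group names; how Enginuity breaks ties between equal priorities is not stated in the course, so the tie handling here (input order) is an assumption.

```python
from typing import Dict, List


def suspension_order(priorities: Dict[str, int]) -> List[str]:
    """Return SRDF/A group names in the order they would be suspended.

    Priority 64 is the lowest and is dropped first; priority 1 is dropped last.
    """
    for name, prio in priorities.items():
        if not 1 <= prio <= 64:
            raise ValueError(f"{name}: priority must be between 1 and 64")
    # sorted() is stable, so equal priorities keep their input order (assumed).
    return sorted(priorities, key=lambda name: priorities[name], reverse=True)


if __name__ == "__main__":
    groups = {"payroll": 1, "reporting": 64, "test_dev": 32}
    print(suspension_order(groups))
```

When planning priorities, the most critical application should therefore receive the smallest number.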
Cycle times for each individual group can be set using Symmetrix Configuration Manager with Solutions Enabler V6.0 and Enginuity 71.
The SRDF Daemon in Open Systems is used to maintain consistency protection in SRDF/S and SRDF/A environments. In a Synchronous SRDF environment it works with Symmetrix ECA to guarantee consistency. In SRDF/A environments it plays a role in cycle-switching when multiple SRDF/A sessions must be consistently managed.
Important Information About This Technology
Upon completion of this lesson, you will be able to:
Describe the technical data that you need to gather about the Symmetrix configuration and performance measurement
Describe the technical data that you need to gather about OpenReplicator usage
Describe the technical data that you need to gather about TimeFinder usage
Describe the technical data that you need to gather about SRDF usage
The objectives for this lesson are shown here. Please take a moment to read them.
Hardware Layout of Existing Environment
Capacity
– Number of applications
– Amount of storage that each application / host uses
– Amount of used and unused capacity
Availability
– What type of RAID protection is being used (R-5, R-S, R-1)
Security
– Is Access Control being used to limit host control over the Symmetrix
Connectivity
– How many hosts are connected to the array
– Are redundant paths being used
– What is the total number of FA ports available vs. in use today
To understand an existing customer environment, some of the questions that need answers are:
• What is the number of applications and the amount of storage each uses?
• What is the amount of used and unused storage capacity in the entire Symmetrix?
• What type of RAID protection is being used?
• Is Access Control being used to implement security?
• How many hosts are connected to the array?
• Are redundant paths in use?
• How many FA ports are in use? Are there any unused ones?
Performance and Management of Existing Environment
Performance
– Bandwidth (MB/sec) used by each application / host
– Read/Write ratio
– Random/Sequential I/O mix
– I/O Block size
– How much cache is being used
– Devices close to their device WP ceilings
– Is the Symmetrix approaching its WP ceiling
Management
– Any GUI based products (EMC ControlCenter, Performance Manager)
– CLI based
I/O Profile of a Host running E-Mail Application
Different applications have different I/O profiles. For instance an E-mail application exhibits a spike early in the morning, then subsides to a steady level before tailing off late at night. Other applications have other IOPs characteristics.
To gather performance related information about an existing environment it is necessary to find out:
- The nature of the workload
- Bandwidth expressed in MB/sec
- Its read/write ratio
- Average I/O block size
- Random vs. sequential I/O
- Throughput in IOPs
- The size of cache in the existing Symmetrix
- How close individual devices and the Symmetrix are to the Write Pending ceiling limits
Are EMC ControlCenter, Replication Manager or EMC ControlCenter Performance Manager being used for management or monitoring?
Is SYMCLI in use?
Component Usage
[Figure: four graphs showing Host Ports, System-wide Cache, Back-end Directors, and Disks]
This slide shows the four primary component types within a Symmetrix: (from the bottom left) Host Ports (the Front-end activity), System Cache, Back-end directors, and the Disks. Component usage results expose design flaws by showing imbalances and poor use of directors, cache and devices. They also show where insufficient resources could result in poor performance by exposing places where resources are running at or near their technical limits.
The graphs on this page represent the four areas of contention within the Symmetrix and provide the following key metrics:
The host ports display is I/O per sec.
The cache activity is represented by a hit% graph.
Back-end directors show I/O per sec.
The Disks graph is of SCSI commands per sec.
As each of the components gets loaded, the responses become slower. Little’s Law as applied to storage subsystems states that response times increase with greater component utilization.
TimeFinder Usage in Existing Environment
Configuration
– Which TimeFinder product is being used (Mirror, Snap or Clone)
– How much data is being replicated
– How many concurrent and how many sequential copies are being used
– How long is the life of the copies
– What is the rate and total amount of change in the source data during the life of the copy
Performance
– If using Snaps or Clones, is there an unacceptable degradation of write performance due to CopyOnWrite or CopyOnAccess penalties
– Are the copies being predominantly used for reads (e.g. backups) or writes (e.g. data warehouse loads)
– Are Business Continuance time deadlines (for backup, reporting etc.) being adequately met
Management
– Are GUI based management products being used (EMC ControlCenter, RM)
– Are SYMCLI scripts being used for automation
The questions shown here pertain to any existing TimeFinder setup the customer may currently have.
The amount of data and the number of copies being used point to the amount of storage being used for local replication. A maximum of two concurrent copies is permitted in TimeFinder/Mirror and eight with TimeFinder/Clone. The duration of the copies and the amount of data that changes during that time could be an indicator of whether Snaps may be a viable option in the future.
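A back-of-the-envelope comparison of full copies versus Snaps can be sketched from the source size, the change rate, and the life of the copy. This is a rough upper-bound estimate with hypothetical function names: it assumes every changed GB touches previously unchanged tracks and ignores any save-area overhead, neither of which comes from the course material.

```python
def full_copy_gb(source_gb: float, copies: int) -> float:
    """Storage consumed by full volume copies (BCVs or Clones)."""
    return source_gb * copies


def snap_save_area_gb(source_gb: float, change_gb_per_hour: float,
                      lifetime_hours: float, copies: int = 1) -> float:
    """Upper-bound estimate of space needed to preserve changed data for Snaps.

    Assumes each changed GB lands on tracks not changed before; a copy can
    never need more preserved data than the size of the source itself.
    """
    per_copy = min(source_gb, change_gb_per_hour * lifetime_hours)
    return per_copy * copies


if __name__ == "__main__":
    # Hypothetical 500 GB source changing 2 GB/hour, copy kept for 24 hours.
    print(full_copy_gb(500, copies=2))
    print(snap_save_area_gb(500, change_gb_per_hour=2, lifetime_hours=24, copies=2))
```

A low change rate and a short copy life favor Snaps; a high change rate over a long life pushes the estimate toward the cost of a full copy.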
Performance of TimeFinder Snaps and Clones can suffer if heavy writes are executed against the source while the copy is in progress. This is because of the Copy On First Write penalty in the case of Snaps and the Copy On First Access penalty in the case of Clones. With both products, the data from the point in time of Snap or Clone activation has to be moved from the source to the target before the initial write or access can be allowed.
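The Copy On First Write behavior can be illustrated with a toy model: the first write to a track after activation must preserve the original contents before the new data lands, while later writes to the same track pay no extra cost. The class below is a conceptual sketch with hypothetical names and does not represent TimeFinder internals.

```python
class SnapSession:
    """Toy pointer-based snapshot using copy on first write (not EMC code)."""

    def __init__(self, source: list) -> None:
        self.source = source   # live production tracks, shared with the caller
        self.saved = {}        # track number -> contents at activation time
        self.copies_done = 0   # how many times the penalty was paid

    def write_source(self, track: int, data) -> None:
        if track not in self.saved:
            # First write since activation: preserve the old data first.
            self.saved[track] = self.source[track]
            self.copies_done += 1
        self.source[track] = data

    def read_snap(self, track: int):
        # Unchanged tracks are read through the pointer to the source.
        if track in self.saved:
            return self.saved[track]
        return self.source[track]


if __name__ == "__main__":
    volume = ["a", "b", "c"]
    snap = SnapSession(volume)
    snap.write_source(1, "B")     # pays the copy penalty
    snap.write_source(1, "B2")    # same track again, no extra copy
    print(volume, [snap.read_snap(i) for i in range(3)], snap.copies_done)
```

The model makes the performance point concrete: a write-heavy workload that keeps touching new tracks pays the penalty on nearly every write.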
Is the performance suitable for meeting the backup/reporting deadlines?
Is the management of the product being done by using GUIs or CLIs?
Open Replicator Usage in Existing Environment
Configuration
– Is data moving to or from a non-EMC array
    If yes is the other array qualified
– Is data being transferred to (pull) or from (push) the DMX
– Are incremental pushes being done
– How much data is being replicated in how much time
– Are live pushes or pulls being done
– Is the connection over a local or a wide area SAN
Performance
– What is the network distance and quality between the control DMX and the remote array
– What if any bottlenecks are affecting the transfer (e.g. network infrastructure, speed of remote array, hot volumes on the DMX)
– Is Open Replicator consuming too high a share of SAN bandwidth and affecting local I/O
– Are the “ceiling” and “pace” parameters being used
Management
– CLI based
Scalability
– How many LUNS are being migrated
Open Replicator is a relatively new product which enables point in time data transfers to occur between EMC DMX and other types of arrays. Various arrays of third party vendors have been qualified to work with Open Replicator including several models from Hitachi, IBM, and HP StorageWorks.
Incremental copies are only possible when the data originates on the DMX and is pushed out. Cold pushes can service up to 15 targets simultaneously. Live pulls run the risk of data loss if the session is interrupted before it completes. For these reasons it is important to know about the planned direction of data flow. Open Replicator uses a DeltaMark session and is subject to the 16 session limit for every logical volume.
The performance of Open Replicator is dependent on network quality. Other factors include the abilities of the source and target arrays to transfer data. A fast array can perform poorly if the volumes involved in the data transfer are too busy because of an uneven distribution of I/O load.
Open Replicator can sometimes impact host I/O by consuming an unfairly large share of SAN bandwidth. Two parameters, “ceiling” and “pace” can be used to throttle the amount of bandwidth that Open Replicator uses. “Ceiling” sets the maximum bandwidth that Open Replicator can use on a specific director and port regardless of how many sessions may be active. “Pace” can be set to slow down specific sessions by injecting waits between transfers.
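The effect of the two throttles can be approximated with simple arithmetic. The sketch below is a toy model only: it assumes the ceiling is expressed in MB/s, that active sessions share the capped rate equally, and that pace adds a fixed wait between transfers. None of these assumptions, units, or function names come from the Open Replicator documentation.

```python
def per_session_rate_mb_s(port_rate_mb_s: float, ceiling_mb_s: float,
                          active_sessions: int) -> float:
    """Rate each session gets when a port-wide ceiling caps all sessions combined.

    Equal sharing between sessions is an assumption made for illustration.
    """
    if active_sessions < 1:
        raise ValueError("need at least one active session")
    return min(port_rate_mb_s, ceiling_mb_s) / active_sessions


def paced_transfer_seconds(chunks: int, seconds_per_chunk: float,
                           pace_wait_seconds: float) -> float:
    """Total time when a wait is injected between consecutive transfers."""
    waits = max(chunks - 1, 0)
    return chunks * seconds_per_chunk + waits * pace_wait_seconds


if __name__ == "__main__":
    print(per_session_rate_mb_s(port_rate_mb_s=200, ceiling_mb_s=50, active_sessions=5))
    print(paced_transfer_seconds(chunks=10, seconds_per_chunk=1.0, pace_wait_seconds=0.5))
```

The distinction matters when planning: the ceiling protects host I/O on a port no matter how many sessions run, while pace slows one session without affecting the others.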
At this time, the only management interface to Open Replicator is through SYMCLI.
SRDF Configuration in Existing Environment
Configuration
– What kind of SRDF is being used (/DM, /AR, /S or /A)
– Physical and network distance between production and DR site
– Amount of Data that is being replicated
– Number of LUNs being replicated
– Network interconnect type (Fibre, ESCON or GIG-E)
What is the availability of data / what is the maximum possible data loss with this solution
– Days / hours (SRDF/AR)
– Seconds / Minutes (SRDF/A)
– None (SRDF/S or SRDF/Star)
– Are the Recovery Time Objective (RTO) / Recovery Point Objective (RPO) goals being met
In order to discover what kind of SRDF setup the customer has, it is important to know:
Which flavor of SRDF is in use, the physical and network distances between the two sites and the volume of the source and target data. The number of LUNs being replicated and network interconnect types are also relevant.
If SRDF/Star is being used, three distances need to be ascertained: between the Workload site and the Sync target site, between the Sync target site and the Async target site, and between the Workload site and the Async target site.
It is also worth inquiring whether the actual data loss potential is satisfying the expected Recovery Point Objective.
SRDF Operations in Existing Environment
Performance statistics
– Bandwidth
– Latency
– Packet Loss
– Is TimeFinder competing with SRDF Adaptive Copy for Symmetrix back end resources
Management
– GUI based (Replication Manager, SRDF/TimeFinder Manager)
– SYMCLI based
Scalability
– Are consistency groups being used to scale across multiple Symmetrix arrays
Network attributes such as bandwidth, latency, and packet loss have an impact on SRDF performance. With SRDF/AR implementations, the Symmetrix back end, i.e. the DAs, can find themselves trying to satisfy requests for host I/O, TimeFinder synchronizations, and SRDF copy tasks leading to performance degradation.
Management of SRDF can be done with either SRDF/TimeFinder Manager, which is part of EMC ControlCenter, or with SYMCLI, which is part of Solutions Enabler.
Gathering Symmetrix Information
Upon completion of this lesson, you will be able to:
Describe the tools and resources that allow you to gather Symmetrix configuration, management, and performance information
Discuss how to interpret this information
The objectives for this lesson are shown here. Please take a moment to read them.
Information Gathering for Symmetrix
Interview storage managers / users to learn
– Nature of application
    Database type
    Read or write intensive
– Number of hosts connected to the Symmetrix
Interview account CE or examine Symmetrix BIN file to learn
– How many logical volumes are configured in the Symmetrix
– Protection type (RAID 1, RAID-5, RAID-S or unprotected)
– Number of front end host ports
Use host based tools (e.g. iostat, sar, Perfmon) to estimate
– Read and Write IO loads on the Symmetrix
Use EMC tools (if available) to:
– Measure Read and Write I/O loads (Performance Manager, STP, symstat)
– Gauge the size of needed storage by using SYMCLI (symdev)
One of the primary sources of information in existing Symmetrix environments should be the user community. They are the ones who can describe the nature of the application that the Symmetrix is being used for: for example, the type of database, whether it is read or write intensive, and the number of hosts that use it.
The EMC CE can also provide information about the layout of the Symmetrix, the protection scheme of the logical volumes, and the number of front end ports available for host connectivity.
To assess the I/O loads on the Symmetrix there is a choice of tools that can be used. Host based tools such as iostat and sar on Unix, and Perfmon on Windows, are easily accessible but do not provide a comprehensive picture of the load in the Symmetrix.
EMC tools such as Performance Manager, STP and the SYMCLI commands “symstat” and “symdev” provide a more comprehensive picture of the entire array.
Interpreting sar Information
sar -d 5 (example below is on Solaris) shows average R/W I/Os/sec., average number of 512 byte blocks/sec, average service time in millisec

SunOS losap160 5.8 Generic_117350-12 sun4u 08/11/05

18:18:44   device     %busy   avque   r+w/s   blks/s   avwait   avserv
           sd191         99     1.0     112    14386      0.0      8.8
           sd191,a        0     0.0       0        0      0.0      0.0
           sd191,b        0     0.0       0        0      0.0      0.0
           sd191,c       99     1.0     112    14386      0.0      8.8
           sd191,g        0     0.0       0        0      0.0      0.0
           sd192         99     1.0     112    14310      0.0      8.9
           sd192,a        0     0.0       0        0      0.0      0.0
           sd192,b        0     0.0       0        0      0.0      0.0
           sd192,c       99     1.0     112    14310      0.0      8.9
           sd192,g        0     0.0       0        0      0.0      0.0
           sd193         99     1.0     113    14489      0.0      8.7
           sd193,a        0     0.0       0        0      0.0      0.0
           sd193,b        0     0.0       0        0      0.0      0.0
           sd193,c       99     1.0     113    14489      0.0      8.7
           sd193,g        0     0.0       0        0      0.0      0.0
System Activity Reporting (sar) is one of the two major Unix host-based performance monitoring utilities. It collects data and produces reports for CPU, memory, and disk performance. The sar -d option reports disk statistics. The columns show:
• The portion of time the device was busy servicing a transfer request
• Average number of requests outstanding during that time
• Number of read/write transfers per second to or from the device, and the number of 512-byte blocks transferred per second
• Average wait time in milliseconds
• Average service time in milliseconds
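As a rough illustration, the blks/s column can be turned into throughput by multiplying by 512 bytes. The Python sketch below assumes the whitespace-separated column order of the sample report; the function name is purely illustrative and is not part of sar or of any EMC tool.

```python
BLOCK_BYTES = 512  # sar -d reports transfers in 512-byte blocks


def sar_row_to_mb_per_sec(row):
    """Return (device, MB/s) for one data row of `sar -d` output.

    Assumes the column order of the sample report:
    device %busy avque r+w/s blks/s avwait avserv
    """
    fields = row.split()
    device = fields[0]
    blocks_per_sec = float(fields[4])
    return device, blocks_per_sec * BLOCK_BYTES / (1024 * 1024)


device, mb = sar_row_to_mb_per_sec("sd191 99 1.0 112 14386 0.0 8.8")
print(f"{device}: {mb:.1f} MB/s")
```

For sd191 in the sample report this works out to roughly 7 MB/s.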
Interpreting iostat information
iostat -xtczMn (example below is on Solaris) will print out number of reads and writes (IOPS) as well as MB/sec per device
                    extended device statistics
  r/s    w/s   Mr/s   Mw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
  0.0   13.4    0.0    0.1   0.0   1.0     0.0    71.1   0   4  c0t0d0
  0.0  111.6    0.0    7.0   0.0   1.0     0.0     8.9   0  99  c3t0d10
  0.0  114.6    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d11
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d12
  0.0  114.0    0.0    7.1   0.0   1.0     0.0     8.7   0  99  c3t0d13
  0.0  114.8    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d14
  0.0   66.8    0.0    4.2   0.0   1.0     0.0    14.9   0  99  c3t0d15
  0.0   66.6    0.0    4.2   0.0   1.0     0.0    14.9   0  99  c3t0d16
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d17
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d18
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d19
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d20
  0.0  115.4    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d21
  0.0  115.0    0.0    7.2   0.0   1.0     0.0     8.6   0  99  c3t0d22
  0.0  106.4    0.0    6.6   0.0   1.0     0.0     9.3   0  99  c3t0d23
  1.4    1.2    0.0    0.0   0.0   0.0     0.0     0.0   0   0  c3t0d171
The iostat utility is the other major Unix host-based performance monitoring utility. It collects data and produces reports for terminals, disks and tapes. The columns denote:
- r/s reads per second
- w/s writes per second
- Mr/s Megabytes read per second
- Mw/s Megabytes written per second
- wait average queue length of transactions waiting for service
- actv average number of transactions actively being serviced
- wsvc_t average service time in the wait queue, in milliseconds
- asvc_t average service time of active transactions, in milliseconds
- %w percent of time there are transactions waiting for service
- %b percent of time the disk is busy
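The columns above can be cross-checked against each other. The sketch below applies Little's Law (average number of I/Os in service equals I/O rate times service time) and derives the average write size from w/s and Mw/s. The function names are illustrative only, and the figures come from the c3t0d10 row of the sample output.

```python
def expected_active(reads_per_sec, writes_per_sec, asvc_t_ms):
    """Little's Law: average I/Os in service = I/O rate x service time."""
    return (reads_per_sec + writes_per_sec) * asvc_t_ms / 1000.0


def average_write_kb(writes_per_sec, mw_per_sec):
    """Average write size in KB from w/s and Mw/s (0 if there are no writes)."""
    if writes_per_sec == 0:
        return 0.0
    return mw_per_sec * 1024.0 / writes_per_sec


# c3t0d10 in the sample: w/s = 111.6, Mw/s = 7.0, asvc_t = 8.9 ms
print(round(expected_active(0.0, 111.6, 8.9), 2))  # 0.99
print(round(average_write_kb(111.6, 7.0), 1))      # 64.2
```

The computed value of about 0.99 agrees with the reported actv of 1.0, and the average write size comes out at roughly 64 KB.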
Performance Manager Sample Report
Diskperf (Performance Manager) provides a snapshot of disk performance
The diskperf utility is the major Windows performance measurement utility. It controls the types of counters that can be monitored using the System Monitor Utility.
Limitations of Host Performance Measurement
The host views each logical volume as a separate physical spindle
– All modern disk arrays present logical volumes to hosts as if they were standalone devices
– Different logical volumes can share the same physical spindles
– Logical volumes sharing the same physical spindles can contend with each other for the drive actuator and degrade each other's performance
Each host assumes it has the exclusive use of its logical volumes
Can only ‘see’ the I/Os and response times locally
Data must be collected for volumes across each host
Since the host views every logical volume as a separate entity, the information derived from the host based tools can be deceptive. Since disk arrays can host several logical volumes on the same spindle, the host tools could easily present a distorted view of a disk’s performance.
Host based performance measurement tools only offer performance information about the host they are running on. They cannot provide a comprehensive picture that shows the performance of the whole array.
Use of symstat
     DEVICE                   IO/sec          KB/sec       % Hits   %Seq  Num WP
10:37:22                    READ  WRITE    READ  WRITE    RD  WRT   READ  Tracks
10:37:22 0022 (c3t0d10s2)      0    106       0   6826   N/A  100    N/A    5673
         0023 (c3t0d11s2)      0    106       0   6826   N/A  100    N/A    5672
         0024 (c3t0d12s2)      0    106       0   6826   N/A  100    N/A    5682
         0025 (c3t0d13s2)      0    106       0   6826   N/A  100    N/A    5662
         0026 (c3t0d14s2)      0     96       0   6144   N/A   99    N/A    7499
         0027 (c3t0d15s2)      0     53       0   3413   N/A   96    N/A    5662
         0028 (c3t0d16s2)      0     53       0   3413   N/A  100    N/A    5650
         0029 (c3t0d17s2)      0     96       0   6144   N/A   99    N/A    5656
         002A (c3t0d18s2)      0    106       0   6826   N/A  100    N/A    5639
         002B (c3t0d19s2)      0    106       0   6826   N/A  100    N/A    5648
         002C (c3t0d20s2)      0    106       0   6826   N/A  100    N/A    5659
         002D (c3t0d21s2)      0    106       0   6826   N/A  100    N/A    5661
         002E (c3t0d22s2)      0    106       0   6826   N/A  100    N/A    5637
         002F (c3t0d23s2)      0    106       0   6826   N/A  100    N/A    5661
         00B4 (c3t0d140s2)     0      0       0      0   N/A  N/A    N/A    2047
         00B5 (c3t0d141s2)     0      0       0      0   N/A  N/A    N/A    5695
                          ------ ------ ------- -------   ---  ---    ---  ------
         Total                 0   1358       0  87374   N/A  100    N/A   88803
symstat breaks down reads and writes by Symmetrix Logical Volumes
The SYMCLI command, symstat, captures statistics information about the Symmetrix in real time. You can examine the performance of one or more devices or directors. The statistics in this display are broken down by I/Os per second, KB/sec, Read and Write cache hits, as well as a breakdown between reads and writes. The number of write pending tracks indicates the tracks awaiting destaging from cache to disk.
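When symstat output is captured to a file, the per-device rows can be summed to cross-check the Total line or to subtotal the devices belonging to one application. The following is a minimal sketch that assumes the row layout shown in the sample (Symmetrix device number, host device name in parentheses, then eight columns). The parser is hypothetical and is not part of Solutions Enabler.

```python
import re

# Symmetrix device number, host device name in parentheses, then the data columns.
ROW = re.compile(r"([0-9A-F]{4})\s+\((\S+)\)\s+(.*)$")


def parse_symstat_row(line):
    """Parse one device row; return None for headers, separators and totals."""
    match = ROW.search(line)
    if match is None:
        return None
    sym_dev, host_dev, rest = match.groups()
    cols = rest.split()
    if len(cols) != 8:
        return None
    read_io, write_io, read_kb, write_kb = (int(c) for c in cols[:4])
    return {
        "sym": sym_dev,
        "host": host_dev,
        "read_io": read_io,
        "write_io": write_io,
        "read_kb": read_kb,
        "write_kb": write_kb,
        "wp_tracks": int(cols[7]),
    }


def totals(lines):
    """Sum write I/O per second, write KB per second and write-pending tracks."""
    rows = [r for r in map(parse_symstat_row, lines) if r]
    return {k: sum(r[k] for r in rows) for k in ("write_io", "write_kb", "wp_tracks")}


sample = [
    "10:37:22 READ WRITE READ WRITE RD WRT READ Tracks",
    "10:37:22 0022 (c3t0d10s2) 0 106 0 6826 N/A 100 N/A 5673",
    "0026 (c3t0d14s2) 0 96 0 6144 N/A 99 N/A 7499",
    "0027 (c3t0d15s2) 0 53 0 3413 N/A 96 N/A 5662",
]
print(totals(sample))
```

Dividing write_kb by write_io for a device gives its average write size, which for the busy devices in the sample is about 64 KB.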
Use of symdev
losap160/usr/sengupta> symdev list -sid 53
Symmetrix ID: 000387940053
Device Name Directors Device
--------------------------- ------------- -------------------------------------
Cap
Sym Physical SA :P DA :IT Config Attribute Sts (MB)
--------------------------- ------------- -------------------------------------
0000 /dev/rdsk/c3t0d0s2 15C:0 16B:C0 2-Way Mir N/Grp'd VCM WD 11
0001 /dev/rdsk/c3t0d1s2 15C:0 02B:C0 2-Way Mir N/Grp'd (M) RW 17261
0002 Not Visible ***:* 01A:C0 2-Way Mir N/Grp'd (m) RW -
0003 Not Visible ***:* 02A:C1 2-Way Mir N/Grp'd (m) RW -
0004 Not Visible ***:* 01B:C1 2-Way Mir N/Grp'd (m) RW -
symdev list will print out device capacities, which can be added up to arrive at the total storage used by an application
The "symdev" command lists information about devices in the Symmetrix. By using this command it is possible to display the sizes of all volumes in the Symmetrix. Users who know the identities of the volumes dedicated to their application can then sum up the capacities of those volumes to arrive at the total number of MB being used by the application.
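The summation can be scripted once the symdev list output has been saved as text. The sketch below assumes the layout of the sample listing, with the Symmetrix device number in the first column and the capacity in MB in the last. Rows that show "-" instead of a capacity are skipped. The function names and the list of application devices are illustrative assumptions.

```python
HEX_DIGITS = set("0123456789ABCDEF")


def device_capacities(symdev_lines):
    """Map Symmetrix device number -> capacity in MB for rows that report one."""
    caps = {}
    for line in symdev_lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        sym_dev, cap = fields[0], fields[-1]
        is_device = len(sym_dev) == 4 and set(sym_dev) <= HEX_DIGITS
        if is_device and cap.isdigit():
            caps[sym_dev] = int(cap)
    return caps


def application_capacity_mb(caps, app_devices):
    """Total MB for the devices known to belong to one application."""
    return sum(caps[dev] for dev in app_devices)


listing = [
    "Symmetrix ID: 000387940053",
    "0000 /dev/rdsk/c3t0d0s2 15C:0 16B:C0 2-Way Mir N/Grp'd VCM WD 11",
    "0001 /dev/rdsk/c3t0d1s2 15C:0 02B:C0 2-Way Mir N/Grp'd (M) RW 17261",
    "0002 Not Visible ***:* 01A:C0 2-Way Mir N/Grp'd (m) RW -",
]
caps = device_capacities(listing)
print(application_capacity_mb(caps, ["0001"]))
```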
TimeFinder Configuration Information Gathering
Interview storage managers and users to learn:
– Which TimeFinder products are being actively used
– Number of actual copies made
– Life of each copy of data
– Extent of change to the source during the life of the copy
Validate the above information
– Examine SYMCLI license file (symapi_licenses.dat) on the management hosts to see which TimeFinder products are licensed
– Check the device groups (symdg list) to learn how many devices with what capacities are currently being used for TimeFinder
– Peruse some of the log files (symapi-YYYYmmdd.log) to determine which TimeFinder family commands are being used
– If Snap is being used or planned, run Change Tracker to determine extent of change to source volume during the life of the copy
– Is emulation mode being used either through use of RAID 5 BCVs or use of the SYMCLI_CLONE_EMULATION environment variable
By interviewing users, you can discover which TimeFinder products are in use and how many copies of data are needed for business continuity operations. It is also important to know the length of time that a copy must be available and the amount of change to the source and to the copy during the life of the copy.
Answers to those questions would provide a clue to the suitability of Snaps in that environment. The information gathered during user interviews can be validated by examining the following host based files:
The license database resides in /var/symapi/config/ or \Program Files\EMC\SYMAPI\Config. It lists the licenses of the Symmetrix software products that can run on that host.
The composition of the device groups and the actions they are involved in can be deduced by displaying a list of the device groups using the symdg list command and by examining the log files located in the /var/symapi/log and \Program Files\emc\SYMAPI\log directories.
If TimeFinder/Snap is being planned, change tracker may be run to determine the extent of change to the source during the period that the snap is expected to exist.
If TimeFinder/Clones are being used in Emulation Mode, this can be validated by checking whether the SYMCLI_CLONE_EMULATION environment variable is set.
TimeFinder Performance Information Gathering
If performance is deemed unsatisfactory:
– Examine Performance Manager data to discover if write contention exists on the back end
– In case of Snaps, check Change Tracker data to examine amount of change over life of copy
If Business Continuance operations are failing to meet time deadlines
– Assess the viability of making more copies
– Use other products from the TimeFinder portfolio (such as Snap or Clone) to make copies available earlier
A common cause of performance problems with TimeFinder/Mirror is the overloading of the Disk Adapters (DAs). This occurs when excessive Establish/Restore activity collides with host write activity on older model (pre-DMX) Symmetrixes.
Another situation where performance problems can arise is when BCVs and the standards that they are paired with reside on the same spindle. A TimeFinder operation in this configuration will cause contention on the drive.
If BC Operations are failing to meet time deadlines, it may be worth creating more copies of data. If more than 2 concurrent copies of data are needed, Snaps and Clones may be appropriate.
Information Gathering for Open Replicator
Interview storage managers and other users to learn:
– How much data is being transferred in which direction (push or pull)
– Is it to a Symmetrix or a supported non-Symmetrix array
– Are recurring incremental pushes being performed
– Network quality attributes (bandwidth, latency, packet loss)
Use EMC tools to validate:
– Issue symdev list and add up the sizes of the volumes participating in the OR session
– Measure existing load on the SAN to assess impact of OR on existing infrastructure by using Performance Manager
– Use symstat to measure the throughput of the network
– Use symcfg list and symmask list –logins to ascertain which FA is logged in to which FA
Examine data from Performance Manager to:
– Assess the backend load on volumes involved in data transfer
– Possible overloading of SAN infrastructure
If Open Replicator is part of the current environment, you can interview existing users to find out:
- How much data is being transferred
- Whether it is to a Symmetrix or non-Symmetrix array
- Are incremental pushes being used
- Network quality
The SYMCLI command symdev list shows which devices are participating in the Open Replicator session and how large they are.
The symstat command can provide snapshots of Open Replicator performance.
The outputs of the symcfg and symmask commands will show how the DMX and the remote array are configured.
Data from Performance Manager will indicate if the disk spindles or the SAN infrastructure are overloaded.
Information Gathering for SRDF
Interview storage managers and other users to learn:
– Which SRDF products are in use
– Distance between production and remote sites
– Distance between each site and the other two in an SRDF/Star configuration
– How much data is being transferred
– What is the RPO and RTO
– How SRDF is being managed (GUI or CLI)
Validate the above information:
– Examine SYMCLI license file (symapi_licenses.dat) on the management hosts to see which SRDF products are licensed
– Check the device groups (symdg list) to learn how many devices with what capacities are currently being used for SRDF
– Peruse some of the log files (symapi-YYYYmmdd.log) to determine which SRDF family commands are being used
– Use network monitoring tools from network hardware providers to assess the performance of the network
– Use Symmerge (available to Performance Gurus) to verify if SRDF traffic is impeding host or TimeFinder performance
By interviewing users, you can discover information about an SRDF infrastructure such as:
- Which members of the product family are being used
- Distance between the sites
- How much data is being transferred
- What is the RTO and RPO
- If management is being done via GUI or CLI
Answers to those questions can be validated by examining the SYMAPI license database, the SYMAPI log, and by examining the outputs from the command symdg list to examine the sizes of the devices participating in SRDF.
Network hardware providers such as CNT will often provide tools to monitor network performance. These software tools can often give a good indication of how the SRDF traffic is flowing across the network.
Symmerge, an EMC proprietary tool available to SPEED community members, can be used to determine if the back end of the Symmetrix is being overextended by SRDF and/or TimeFinder.
Use of Change Tracker
Each Symmetrix logical volume can use up to 16 DeltaMark (SDDF) sessions
Using the symchg create followed by symchg mark commands you can start a DeltaMark session on a logical volume
Incremental changes can now be measured at discrete intervals
Useful for potential Snap and SRDF/AR implementations
More information can be found in the white paper Using SYMCLI to Measure Volume Changes with Change Tracker
Change Tracker uses DeltaMark bitmap technology to identify logical blocks that have been changed on a Symmetrix FBA device. Before change tracking can begin, a DeltaMark session must be created using the symchg create command. The symchg mark command is then used to perform a timestamp and mark the selected area of disk storage occupied by a data object using the DeltaMark bitmap.
After a set of devices has been marked, incremental changes can be measured at discrete time intervals. Although the measurement interval can be set in seconds, practically, the measurement intervals would be a few hours to a few days depending on the time duration over which changes are being measured.
Data change rates are important for planning SRDF/AR and Snap implementations. Hence, Change Tracker is typically used to estimate change rates on devices that are candidates for participating in an SRDF/AR implementation, or devices that would be the sources for TimeFinder/Snaps.
The Solutions Enabler manual on Change Tracker and the white paper Using SYMCLI to Measure Volume Changes with Change Tracker have more information.
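Once change measurements have been collected, the planning arithmetic is simple. The sketch below does not call symchg. It assumes the measurements have already been reduced to hypothetical (elapsed hours, cumulative changed tracks) pairs, and it compares the result with the guideline of about 30% change given elsewhere in this course for Snap sources.

```python
SNAP_CHANGE_GUIDELINE_PCT = 30.0  # "about 30%" guideline for Snap sources


def percent_changed(changed_tracks, total_tracks):
    """Share of the device that changed since the mark, in percent."""
    if total_tracks <= 0:
        raise ValueError("total_tracks must be positive")
    return 100.0 * changed_tracks / total_tracks


def change_rate_per_hour(samples):
    """Average changed tracks per hour between the first and last sample.

    samples: list of (elapsed_hours, cumulative_changed_tracks) since the mark.
    """
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (c1 - c0) / (t1 - t0)


# Hypothetical measurements for a 100,000-track device over an 8-hour window
samples = [(0, 0), (4, 12000), (8, 30000)]
pct = percent_changed(samples[-1][1], 100000)
print(change_rate_per_hour(samples), pct, pct <= SNAP_CHANGE_GUIDELINE_PCT)
```

In this made-up example the device changes at 3,750 tracks per hour and reaches 30% change after eight hours, which is at the edge of the guideline.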
Parameters and Tools
Upon completion of this lesson, you will be able to:
Describe Symmetrix configuration and management parameters and tools
Describe TimeFinder management parameters and tools
Discuss SRDF management parameters and tools
Discuss Open Replicator management parameters and tools
The objectives for this lesson are shown here. Please take a moment to read them.
Symmetrix Configuration Tool
EMC Customer Service uses Symmwin to configure Symmetrix
Customers can use Symmetrix Configuration Change CLI to make online changes
Any time a new Symmetrix is configured or an old one reconfigured, the EMC account team creates a proposed configuration after consulting the customer. This configuration is validated by the Configuration Control group at EMC. In the case of more complex solutions such as SRDF over IP and Open Replicator, an approval from the Solution Validation Center at corporate is needed.
The primary tool for configuring the Symmetrix is the Symmwin program which runs on the Service Processor. It is used by EMC Customer Service to set up the array in accordance with the approved configuration.
Customers can use the SYMCLI-based Symmetrix Configuration Change CLI to make online configuration changes to the Symmetrix.
TimeFinder Management Tools
Symmetrix Management Console
– Easy, point-and-click access
– Excellent for ad hoc TimeFinder operations
EMC Replication Manager
– Discovers replication environments
– Automates replication process
– Integrates replication technologies at the application level
TimeFinder/Exchange Integration Module (TimeFinder/EIM)
– Provides a comprehensive backup management interface specifically for Windows servers that support Microsoft Exchange databases residing in Symmetrix storage
– Produces exact copies of the production volumes that hold the Exchange server information stores and logs
– Full and single mailbox restores in a fraction of the usual time
TimeFinder/SQL Integration Module (TimeFinder/SIM)
– Provides a comprehensive backup and recovery management interface specifically for Windows servers that support Microsoft SQL Server databases
– Integrates and collectively automates the command actions and behavioral features
EMC offers a rich set of tools to manage and monitor TimeFinder. Symmetrix Management Console is a simple GUI application that is suited for ad-hoc Symmetrix management operations. It features easy point and click access to TimeFinder operations. It is good for individual TimeFinder operations but it is not suitable for automation.
EMC Replication Manager provides a GUI for managing local replicas. It can use multiple TimeFinder products such as mirrors and Snaps, and it permits the user to automate the replication process.
The TimeFinder Exchange Integration Module allows Windows users to automate Exchange Backup using TimeFinder. It has a built in capacity to perform consistency checks on the data before it is backed up to tape. The BCV data can be used for Information Store, Directory, or single mailbox recovery.
The TimeFinder/SQL Integration module is available for customers needing to quickly and easily integrate TimeFinder and SQL Server.
Managing TimeFinder with Solutions Enabler
SYMCLI commands are used to manage different TimeFinder products:
– For TimeFinder/Mirror: symmir
– For TimeFinder/Snap: symsnap
– For TimeFinder/Clones: symclone
Commands are based on device and composite group structure
Device Groups
– A collection of devices, assigned to a named group, to provide a more manageable object to query status and impart control operations
– Devices can be associated as either a device group or a composite group
Most TimeFinder users use SYMCLI scripts to automate their Business Continuance operations. The commands symmir, symsnap and symclone are used to control TimeFinder/Mirror, TimeFinder/Snap, and TimeFinder/Clones respectively.
These commands are designed to act on groups of devices placed by the user into device groups. Device groups are the basic building block of the Solutions Enabler universe. Composite groups are a similar construct, except that a composite group can contain devices belonging to several Symmetrixes, while a device group cannot.
Managing SRDF with EMC ControlCenter
SRDF Manager within EMC ControlCenter
– Easy, "point-and-click" access
– Excellent for ad hoc SRDF operations
Symmetrix Management Console features an easy point and click access to TimeFinder and SRDF operations. It is good for individual operations, but it is not suitable for automation.
Managing SRDF with Solutions Enabler
SYMCLI commands are used to manage different SRDF products:
– For SRDF/A, SRDF/DM and SRDF/S: symrdf
– For SRDF/AR: symreplicate
Commands are based on device and composite group structure
Most SRDF users use SYMCLI scripts to automate their Disaster Recovery operations. The symrdf command controls SRDF/S, SRDF/A and SRDF/DM, while symreplicate controls SRDF/AR.
The commands are designed to act on groups of devices placed by the user into device groups. Device groups are the basic building block of the Solutions Enabler paradigm. Composite groups are a similar construct, except that a composite group can contain devices belonging to several Symmetrixes, while a device group cannot span Symmetrixes.
Tools to Manage Open Replicator
The SYMCLI command symrcopy is used to manage Open Replicator
It is possible to regulate the speed of data transfer using the pace and ceiling parameters
Open Replicator is a product released in 2005. It runs on DMXs running 71 code. Though it can be used between Symmetrixes, its main purpose is to enable transfer of data between a DMX and a dissimilar array. The remote array can be a CLARiiON or a qualified array from another storage vendor.
The symrcopy command controls Open Replicator actions. Two parameters, pace and ceiling, regulate the rate of data flow between the controlling DMX array and the remote storage array. The pace parameter can throttle a single Open Replicator session and can be specified for each transfer. The ceiling parameter determines what percentage of the total FA bandwidth can be used by all Open Replicator sessions using the FA.
Best Practices
Upon completion of this lesson, you will be able to:
Describe best practices for optimizing Symmetrix configuration
Discuss best practices for TimeFinder performance optimization
Discuss best practices for SRDF operations
The objectives for this lesson are shown here. Please take a moment to read them.
Optimizing Symmetrix Configurations
Allow for future growth
– How many more hosts will need to connect to the server
– At what rate will storage capacity grow
Spread load across backend infrastructure
– Have many physical spindles work in parallel
Match application I/O block size to Symmetrix backend
– 2KB I/O size causes SRDF performance to suffer
RAID 1 or RAID 1+0 (striped metavolumes) are best for write intensive applications
RAID 5 is cost effective and performs well for read intensive applications
Important considerations for optimizing a Symmetrix configuration are:
• Allowing for future growth
• Spreading the workload across as many spindles as possible, thereby improving application performance.
• Matching application I/O size with the way Symmetrix handles data. Since the smallest block of data transmitted by SRDF is 4 KB, it is not a good practice to run older versions of Sybase with a 2 KB I/O block size on SRDF volumes. Block sizes corresponding to higher powers of 2 (i.e. 4 KB, 8 KB, 16 KB, 32 KB, etc.) are all right to use.
• RAID 1 or RAID 1+0 volumes are best for write intensive applications
• RAID 5 volumes are cost effective and offer good read performance. They are unsuitable for write intensive applications.
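The effect of the 4 KB minimum on small writes can be shown with simple arithmetic. The sketch below assumes, as a simplification not stated in the course, that every write is sent over the link in whole 4 KB units. The function names are illustrative.

```python
SRDF_MIN_BLOCK = 4 * 1024  # smallest block transmitted by SRDF, per the notes above


def link_bytes_per_write(io_bytes, min_block=SRDF_MIN_BLOCK):
    """Bytes sent over the link, assuming writes are padded to whole blocks."""
    blocks = -(-io_bytes // min_block)  # ceiling division
    return blocks * min_block


def link_efficiency(io_bytes, min_block=SRDF_MIN_BLOCK):
    """Fraction of the transmitted bytes that is application data."""
    return io_bytes / link_bytes_per_write(io_bytes, min_block)


for size_kb in (2, 4, 8, 16):
    print(size_kb, "KB ->", link_efficiency(size_kb * 1024))
```

Under this assumption a 2 KB write uses only half of what is transmitted, while 4 KB and larger power-of-2 sizes use all of it.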
Optimizing TimeFinder Performance
BCV performance could cause contention to source volume if not properly placed
– BCVs and standards should be on different spindles
– Establish/Restore operations stress the back end
– Short SRDF/AR cycles require frequent establishes and can stress the Symmetrix infrastructure
Clones are prone to CopyOnAccess performance penalty
– Avoid heavy performance-sensitive writes to source until clone has finished copying
– Place sources and targets on different spindles
Snaps are susceptible to CopyOnWrite penalty
– Avoid heavy performance-sensitive writes to source for life of Snap
– Limit data changes to source to about 30%
– Spread out Save Devices across many spindles
Avoid using TimeFinder/Clones and TimeFinder/Mirrors in the same Symmetrix
– It can lead to unexpected results
Since BCVs are full image copies, they require the same amount of usable disk space as the source volume. They do not have to be the same RAID type or drive type as their source. BCVs only require incremental resynchronization. BCV establishes place a strain on the Symmetrix back end. A large number of full establishes, or short SRDF/AR cycles which require frequent establishes and splits, can lead to heavy usage of the Symmetrix resources and hamper host I/O performance.
Clone performance has no impact on reads from the source as long as there is no workload on the target. Accessing the target before the clone is fully replicated could cause disk contention with the source. This impact is referred to as the CopyOnAccess penalty.
Immediately after its creation, all tracks on a Snap source are “protected”. This means that these tracks have to be moved to the Save area prior to any new writes to the source. This causes the first write to any track on the source to be delayed, and is called the CopyOnWrite penalty. If too much data on the source volume changes, Snap loses the advantage of being a space saving copy. Spreading out Save devices over many spindles is critical to performance. Otherwise they can become a performance bottleneck when simultaneous changes occur to a lot of snapped volumes.
Using TimeFinder/Mirrors and TimeFinder/Clones in the same Symmetrix is not recommended. It can lead to unexpected results.
Recommendations for Synchronous SRDF
SRDF performance factors
Write rate to SRDF volumes
– Provide sufficient bandwidth
Network distance
– Balance response time needs with impact of latency
Network quality
– Observe packet loss limits
Symmetrix Infrastructure
– Avoid hot volumes
[Figure: R1 and R2 volumes connected by SRDF]
The following factors affect SRDF performance:
Write rate - This refers to the amount of incoming write I/O that has to be replicated. Since Synchronous SRDF completes every write to the remote Symmetrix before acknowledging completion, the available bandwidth must be able to handle the rate of incoming writes. Otherwise, writes waiting for remote acknowledgement will slow the host down.
Network distance - Network distance determines write latency. A good rule of thumb is that every 125 circuit miles adds a millisecond of latency each way. Latency caused by network distance will determine how far the remote site can be without adversely affecting application performance under SRDF/S.
Network quality - Network quality has an impact on latency. Typically a packet loss of more than 0.1% is deemed to be unsatisfactory.
Symmetrix infrastructure - Synchronous SRDF will not queue more than one write to a logical volume. This means that if there is one overworked logical volume, all writes will slow down, because the application will not be able to continue until the write to the hot volume has been processed. Avoiding hot volumes requires careful analysis on a busy Symmetrix. Striping logs is one way of getting around the problem, because logs tend to be heavily accessed. Avoiding a lot of TimeFinder establishes while the SRDF load is heavy can also help.
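The distance rule of thumb above lends itself to a quick estimate. The sketch below applies the figure of one millisecond per 125 circuit miles each way. The number of round trips needed per write is left as a parameter because it is not stated here, so the default of 1 is an assumption.

```python
MILES_PER_MS_ONE_WAY = 125.0  # rule of thumb: 125 circuit miles add 1 ms each way


def one_way_latency_ms(circuit_miles):
    """Latency added in one direction by network distance."""
    return circuit_miles / MILES_PER_MS_ONE_WAY


def added_write_latency_ms(circuit_miles, round_trips=1):
    """Latency added to each synchronous write by distance alone.

    round_trips is an assumption; the default of 1 covers one trip to the
    remote Symmetrix and the acknowledgement back.
    """
    return 2.0 * one_way_latency_ms(circuit_miles) * round_trips


print(one_way_latency_ms(250))      # 2.0
print(added_write_latency_ms(250))  # 4.0
```

At 250 circuit miles, each synchronous write would carry at least 4 ms of distance latency on top of the local service time.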
Balancing SRDF/A Cache and Bandwidth
Cache – must be configured to absorb excess writes during spikes
Bandwidth – Must be sufficient to handle average write throughput of the environment
Data Loss Potential (RPO) – Sum of the current (capture) and previous (transmit and receive) delta sets in seconds
[Figure: write workload between 13:00 and 14:00, with bandwidth levels for the average across an hour, the average across ½ hour, and the link required for sync mode]
The diagram above shows the effect of different bandwidths in an SRDF/A environment. The higher the bandwidth the less the need for caching writes inside the Symmetrix.
A reduction in the SRDF/A cycle time causes the bandwidth requirements to go up and the cache requirements to go down.
An increase in the SRDF/A cycle time reduces the bandwidth requirements and increases the cache requirements.
When cycle times are elongated because the write load exceeds the available bandwidth, the excess writes in the transmit cycle have drained at the point in time when the area below the curve equals the area above the curve.
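The trade-off between cache and bandwidth can be explored with a simple backlog model. The sketch below is a deliberate simplification and does not reflect how Enginuity manages delta sets. Writes that exceed the link rate accumulate in cache, and the backlog drains whenever the link rate exceeds the write rate. The workload figures are hypothetical.

```python
def simulate_backlog(write_mb_per_sec, link_mb_per_sec, interval_sec):
    """Return the cache backlog in MB at the end of each interval."""
    backlog = 0.0
    history = []
    for rate in write_mb_per_sec:
        # The backlog grows when writes exceed the link and never drops below zero.
        backlog = max(0.0, backlog + (rate - link_mb_per_sec) * interval_sec)
        history.append(backlog)
    return history


def peak_cache_mb(history):
    """Largest backlog seen, i.e. the cache needed to ride out the spike."""
    return max(history, default=0.0)


# Hypothetical write rates in MB/s, sampled once per minute
workload = [10, 30, 50, 20, 10, 10]
history = simulate_backlog(workload, link_mb_per_sec=25, interval_sec=60)
print(history, peak_cache_mb(history))
```

With a 25 MB/s link the backlog in this example peaks at 1,800 MB and has drained by the last interval. Raising the link rate to the peak write rate of 50 MB/s removes the backlog entirely, which mirrors the statement that higher bandwidth reduces the need for cache.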
Recommended Practices for SRDF/A
When in SRDF/A mode all devices in the RDF group must be managed together
Engineering recommends maximum 8 groups per director (pair) though theoretical maximum is 16 per director (pair)
The following should be set using Symmetrix Configuration Manager if necessary
– Priority for RDF groups determines in which order an SRDF/A group should be suspended if cache resources become low (default is 33)
– Cycle time (default is 30 sec.) can be lowered to reduce potential data loss, though this will require higher bandwidth
– Amount of cache used by SRDF/A (default is 94% of available cache) can be adjusted downwards to guarantee availability of cache for local applications
After long outages drop out of SRDF/A and synchronize the two sides using Adaptive Copy Write Pending
Unlike other modes in SRDF, devices belonging to an RDF group in SRDF/A mode have to be managed together. They cannot be subdivided into smaller groups. This feature limits the number of independently manageable SRDF/A applications to the number of RDF groups in the Symmetrix.
The theoretical limits of supported SRDF groups are 16 per director and 64 per Symmetrix. If redundancy is desired, this puts the limit at 16 groups per director pair. However, engineering recommends a maximum of 8 SRDF groups per director. To learn about the prevailing limits it is best to consult with the Solution Validation Center.
Starting with Enginuity 5671 it is possible to dynamically adjust:
a) The priority of an SRDF/A group. The priority determines the order in which SRDF/A sessions will be suspended if cache resources become scarce.
b) The cycle time. This is the minimum time that must elapse before a new cycle is started. The data loss potential or RPO in SRDF/A is the amount of data in the Capture and the Transmit-Receive cycles. By reducing the cycle time, it is possible to reduce the RPO. However, this will increase the bandwidth requirements of the solution.
c) The amount of cache used by SRDF/A. By default, SRDF/A is permitted to use 94% of the available cache in a Symmetrix. This percentage can be adjusted downward so there is more cache available for local applications.
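As a rough illustration of how these settings relate to data loss potential and cache headroom, the sketch below adds the durations of the capture and transmit/receive cycles and applies the cache percentage to a cache size. The function names are hypothetical, the defaults (30-second cycle, 94% of cache) are the ones quoted in these notes, and the cache sizes in the usage lines are invented example values.

```python
DEFAULT_CYCLE_S = 30      # default SRDF/A cycle time quoted in the notes
DEFAULT_CACHE_PCT = 94    # default share of cache SRDF/A may use


def rpo_seconds(capture_cycle_s=DEFAULT_CYCLE_S,
                transmit_cycle_s=DEFAULT_CYCLE_S):
    """Data loss potential: capture delta set plus transmit/receive delta set."""
    return capture_cycle_s + transmit_cycle_s


def srdfa_cache_limit_gb(available_cache_gb, max_pct=DEFAULT_CACHE_PCT):
    """Upper bound on the cache SRDF/A is permitted to use."""
    if not 0 <= max_pct <= 100:
        raise ValueError("max_pct must be between 0 and 100")
    return available_cache_gb * max_pct / 100


if __name__ == "__main__":
    print(rpo_seconds())                  # both cycles at the default length
    print(rpo_seconds(15, 15))            # shorter cycles, lower RPO
    print(srdfa_cache_limit_gb(64, 75))   # leave more cache for local I/O
```

Because the cycle time is only a minimum, the real cycle durations, and therefore the real RPO, grow whenever the link cannot keep up with the write load.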
After a failure that has built up a significant number of invalids, it is best to drop out of SRDF/A and change the mode to Adaptive Copy write pending until the two sides are nearly synchronized. Otherwise, you run the risk of SRDF/A being dropped because the link cannot handle the excess load in SRDF/A mode.
Recommendations for SRDF/AR
Symmetrix Automated Replication provides delayed replication using SRDF Adaptive Copy mode
SRDF Adaptive Copy and BCV establishes compete with host I/O for Symmetrix DA time
– Use QOS to slow down Adaptive copy or
– Increase SRDF/AR cycle times so there are fewer TimeFinder establishes
[Diagram: SRDF/AR configuration with devices labeled STD, R1/BCV, R2 and BCV]
SRDF/AR is a flexible disaster recovery solution that allows production data to be replicated at a slower pace than with SRDF/A or SRDF/S in exchange for a higher data loss potential.
Apart from the larger data loss potential, the primary disadvantage of SRDF/AR is its heavy use of the Symmetrix back end resources. Specifically, the Disk Adapters or DAs are responsible for scheduling adaptive copy writes across the RDF link. Since SRDF/AR also involves continual splitting and establishing of BCVs, the establish activity also places a heavy load on the DAs. Finally, host writes also pass through the DAs in the process of being written to disk.
The multiple activities contending for DA resources can sometimes cause performance issues in a Symmetrix. One simple workaround is to slow down SRDF link traffic by using the Quality of Service parameter. Another is to elongate the SRDF/AR cycles so as to reduce the frequency of TimeFinder establishes and splits.
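The effect of elongating the SRDF/AR cycle can be estimated with simple arithmetic. The sketch below assumes, for illustration only, that each SRDF/AR cycle performs one establish per BCV and that cycles run back to back for 24 hours; the function name is hypothetical.

```python
def establishes_per_day(cycle_minutes):
    """Number of complete SRDF/AR cycles (one establish each) in 24 hours."""
    if cycle_minutes <= 0:
        raise ValueError("cycle_minutes must be positive")
    return (24 * 60) // cycle_minutes


if __name__ == "__main__":
    for minutes in (30, 60, 120):
        print(minutes, establishes_per_day(minutes))
```

Longer cycles mean fewer establishes competing for DA time, at the cost of a larger data loss potential.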
Considerations for the SRDF Failover Site (R2)
The Business Resumption Server/Site needs to be:
– Same platform type
– Comparable size
– Same levels of hardware and software
Must have network and power connections equal to those at the primary site
Requires access and data security levels equivalent to the source site
Should have similar physical environment characteristics to the primary production server/site
Planning for the restart requires that all components required for operations at the production site be available at the secondary site. The more resources that have to be procured at the remote site following a site outage, the longer the total Recovery Time.
Design Pitfalls and Overall Risks
Upon completion of this lesson, you will be able to:
Describe the risks, impact and options associated with Symmetrix solutions
Describe the risks, impact and options associated with TimeFinder solutions
Describe the risks, impact and options associated with Open Replicator solutions
Describe the risks, impact and options associated with SRDF solutions
The objectives for this lesson are shown here. Please take a moment to read them.
TimeFinder/Mirror Solution Risks, Impact and Options
Solution Risks | Impact | Options
Back end could get overworked | Busy Symmetrix back end impacts host I/O and SRDF/AR | Use QOS to slow down SRDF
BCVs failing while established | Replication will halt until failed drive is replaced | Use TF/clones instead of standard BCVs - clones do not use mirror positions
All four mirror positions are used | Limits the number of concurrent TF/Mirror BCVs | Use protection mechanisms that do not consume mirror positions
The next few slides highlight some of the possible pitfalls in implementing Symmetrix software solutions.
TimeFinder/Mirror tends to consume Disk Adapter resources during establish operations. If the establish operation coincides with SRDF/Adaptive copy activity and host writes, performance can suffer. The workaround for this problem is to slow down SRDF using QOS or to reduce the number of establishes that conflict with the other two activities.
If a BCV fails while it is established, TimeFinder/Mirror processes will stop until the drive is physically replaced. This is true even if the BCV is mirrored, because an “establish” will pair only the “moving” mirror with the standard. If drive failures are a common problem, use of TimeFinder/Clones will get around the issue.
One of the drawbacks of TimeFinder/Mirror is that it occupies one of the 4 mirror positions in a Symmetrix logical volume. This limits the number of concurrent BCV copies to two. That limit could be extended to 8 and 15 copies by using TimeFinder/Clones or TimeFinder/Snaps respectively.
TimeFinder/Snap Solution Risks, Impact and Options
Solution Risks | Impact | Options
Rapid changes to source data | CopyOnWrite penalty will affect host I/O | Pick a different replication solution
Large percentage of change to source data | Will negate space saving benefit of Snap | Pick a different replication solution
Cannot be cascaded | A Snap target cannot be used as a snap source | Try snapping from standard volumes
TimeFinder/Snap is a great product for environments where the source or the target do not experience heavy writes, and the data does not change much during the life of the copy. It can cause problems if TimeFinder/Snap is used in the wrong environment.
Heavy writes, either to the Snap source or the target, can cause a CopyOnWrite penalty, which will slow down write performance. A large amount of data changing on the source negates the advantage of using Snaps because disk space is no longer conserved; in effect, a second copy of the data ends up residing in the shared save pool. A different local replication solution may be in order in both of these cases.
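The space argument can be made concrete with a simplified model. The sketch below assumes that the save pool keeps one original track the first time each source track changes, and it ignores writes to the Snap target; the function name and track counts are hypothetical.

```python
def snap_space_usage(source_tracks, changed_tracks):
    """Compare save-pool usage of a Snap with that of a full copy.

    Simplified assumption: one preserved track per changed source track.
    """
    if source_tracks <= 0 or not 0 <= changed_tracks <= source_tracks:
        raise ValueError("need 0 <= changed_tracks <= source_tracks, source_tracks > 0")
    saved = source_tracks - changed_tracks
    return {
        "save_pool_tracks": changed_tracks,
        "tracks_saved_vs_full_copy": saved,
        "saving_pct": 100.0 * saved / source_tracks,
    }


if __name__ == "__main__":
    print(snap_space_usage(1000, 100))    # light change rate
    print(snap_space_usage(1000, 1000))   # every track changed
```

When every source track has changed, the saving falls to zero and the save pool holds the equivalent of a full second copy.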
Snaps have a disadvantage in that they cannot be cascaded. Therefore, a Snap target cannot be used as a Snap source.
TimeFinder/Clone Solution Risks, Impact and Options
Solution Risks | Impact | Options
Slow initial performance | CopyOnAccess penalty while data copy is underway | Wait till data copy is complete
Different command structure from TimeFinder/Mirror | New commands have to be learned | Use TimeFinder Emulation Mode
Cannot be cascaded | Clone targets may not be used as Snap or Clone sources | Try cloning from standard volumes
TimeFinder/Clones offer two notable advantages over TimeFinder/Mirrors: they are available immediately without having to wait for synchronization of source and targets, and they permit up to 8 concurrent copies as opposed to two with TimeFinder/Mirror.
The price of immediate availability is slower write performance to the source and target while the copy is in progress. The first write to a source track, or the first access to a track on the target, is preceded by the transfer of the original track from the source to the target. If the performance degradation becomes a problem, it might be a good idea to wait until the data copy is complete.
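The on-demand copy behavior described above can be sketched as a toy model. This is not how Enginuity implements clones; the class and method names are hypothetical, tracks are modeled as list elements, and writes to the target are left out for brevity. The counter shows how many host I/Os had to wait for a track copy.

```python
class CloneSession:
    """Toy model of a clone that is usable before the background copy ends."""

    def __init__(self, source):
        self.source = list(source)              # production tracks
        self.target = [None] * len(source)      # clone tracks, not yet copied
        self.copied = [False] * len(source)
        self.on_demand_copies = 0               # host I/Os that paid the penalty

    def _copy_track(self, i):
        """Copy track i to the target once; return True if a copy happened."""
        if self.copied[i]:
            return False
        self.target[i] = self.source[i]         # preserve point-in-time image
        self.copied[i] = True
        return True

    def write_source(self, i, data):
        # First write to an uncopied track waits for the original to be copied.
        if self._copy_track(i):
            self.on_demand_copies += 1
        self.source[i] = data

    def read_target(self, i):
        # First access to an uncopied target track triggers the same copy.
        if self._copy_track(i):
            self.on_demand_copies += 1
        return self.target[i]

    def background_copy(self):
        # After this completes, host I/O no longer pays the copy penalty.
        for i in range(len(self.source)):
            self._copy_track(i)


if __name__ == "__main__":
    session = CloneSession(["a", "b", "c"])
    session.write_source(0, "A")
    print(session.read_target(0), session.on_demand_copies)
```

In this model the target still returns the original data for a track that was overwritten on the source, and I/O issued after `background_copy()` adds nothing to the penalty counter, which is why waiting for the copy to complete removes the slowdown.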
TimeFinder/Clones have a slightly different command structure from TimeFinder/Mirror, and represent a learning curve for users. One way of shortening the learning process is to use TimeFinder/Clones in emulation mode. In this mode, users can continue using old TimeFinder/Mirror command syntax while the Solutions Enabler software translates each mirror command into its clone equivalent. Apart from subtle differences, this process is transparent.
Clones may not be cascaded. This means clone targets cannot be used as clone or snap sources. One could always use standard volumes as clone sources to avoid this problem.
Open Replicator Solution Risks, Impact and Options
Solution Risks | Impact | Options
FA Port Contention | Contention occurs with incoming writes | Limit FA usage with ceiling and pace values
Poor SAN network quality | Impacts overall migration time | Conduct a network assessment
Application impact during live migration | EOR causes a CopyOnWrite penalty | If application impacts cannot be tolerated use a ‘cold’ push
Hot pull data loss if migration is aborted | Write activity during migration is lost | Use hot pulls only when necessary
The big selling feature of Open Replicator is that it uses existing SAN infrastructure to transfer data between two storage arrays. The disadvantage of this feature is that there may be contention between the host and the Open Replicator session for FA bandwidth. It is possible to set the “pace” and “ceiling” parameters available in the product to prevent Open Replicator from using up an unfair share of the FA port bandwidth.
Poor network quality can lead to long or aborted migration attempts. It is best to conduct a network assessment while implementing Open Replicator over long distances.
Live migration (push) is a powerful Open Replicator feature which allows the production volume to stay online while a point in time snapshot of the data is being migrated. This can lead to application impact, because every “protected” or “yet to be transferred” track of data is first transferred before an application is allowed to write to that track on the production volume. This is known as the CopyOnWrite or CopyOnFirstWrite penalty. This penalty can be avoided if a cold push is undertaken.
When data is being pulled from a remote array while it is being accessed by the host on the DMX, any attempt to access a track that has not yet been moved over will result in the data being moved first, before access is permitted. This is known as the CopyOnAccess penalty. Since the new data written to the DMX is not replicated back to the remote array, there is a potential for data loss if the Open Replicator session is terminated unexpectedly prior to completion.
Synchronous SRDF Risks, Impact and Options
Solution Risks | Impact | Options
Distance between subsystems | Elongated response times | Consider use of SRDF/A if write response is poor
High bandwidth requirements | Expensive | Consider SRDF/AR if bandwidth costs are too high
Busy volumes | A single busy volume can slow down all traffic for the group | Spread application load evenly
Synchronous SRDF offers the benefit of guaranteeing that the source and target sites are exact mirrors of each other, all the time. The price for real time replication is in two forms: Greater write response times, and bandwidth requirements that meet or exceed the peak write capacity.
If write response times are unacceptably high, one possibility might be to consider SRDF/A instead of SRDF/S. By having the target just a few seconds behind the source, performance of source applications can be significantly improved.
High bandwidth can be expensive and can cost millions of dollars per year. To reduce the cost of bandwidth over large distances, SRDF/A or SRDF/AR may be a better solution. The reduction in bandwidth requirements using SRDF/A is not significant, since the bandwidth has to keep up with the average arrival rate of writes.
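The difference between sizing a link for the peak and sizing it for the average can be shown with a short calculation. This is a sketch under stated assumptions: the function name and the sampled write profile are hypothetical, and a real sizing exercise would use measured workload data and allow for protocol overhead.

```python
def link_sizing(write_rates_mb_s):
    """Return (sync_link_mb_s, async_link_mb_s) for a sampled write profile.

    Synchronous replication is sized for the peak write rate; SRDF/A must
    at least keep up with the average write rate.
    """
    if not write_rates_mb_s:
        raise ValueError("write profile must not be empty")
    peak = max(write_rates_mb_s)
    average = sum(write_rates_mb_s) / len(write_rates_mb_s)
    return peak, average


if __name__ == "__main__":
    # Assumed profile: mostly 20 MB/s with a 10-sample spike at 60 MB/s.
    profile = [20.0] * 10 + [60.0] * 10 + [20.0] * 20
    print(link_sizing(profile))
```

For the assumed profile the synchronous link needs 60 MB/s and the SRDF/A link at least 30 MB/s. The flatter the workload, the closer the two numbers become and the smaller the bandwidth saving from SRDF/A.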
SRDF/S is very sensitive to the existence of busy volumes. One overworked volume can impact the performance of the whole group of devices adversely. If busy volumes are affecting SRDF/S, a redistribution of load across the spindles is probably appropriate.
SRDF/A Solution Risks, Impact and Options
Solution Risks | Impact | Options
Insufficient bandwidth | SRDF/A session drops | Acquire adequate bandwidth
Insufficient cache | SRDF/A session drops | Provide enough cache
Longer distances | Longer resynchronization times | Ensure availability of gold copy before starting resynchronization
Unlike synchronous SRDF which waits for an acknowledgement from the remote side before it acknowledges I/O completion to the host, SRDF/A will, by default, logically suspend the links if the bandwidth of the link cannot keep up with the arrival rate of the writes. Other than slowing down writes or buying more bandwidth, there is no simple solution to this problem.
A similar problem arises if there is insufficient cache to buffer the writes. Rather than slow down the host application, SRDF/A will logically suspend the session if it runs out of cache resources. Again, there is no simple solution to the problem other than to increase cache or slow down writes.
Typically, since SRDF/A is implemented over longer distances, it takes a longer time to resynchronize the two sides after a link failure. It is therefore important to preserve a gold copy of consistent restartable data on the target side prior to starting resynchronization after a failure.
SRDF/Star Solution Risks, Impact and Options
Solution Risks | Impact | Options
Complex and relatively new product | Long implementation cycle | Use EMC professional services
Devices used for Star cannot be easily used for other applications | Combining Star with other technologies is non-trivial | Use “-star” option to manage devices
Complex failover actions | Need to understand data implications before deciding how to fail over | Ensure availability of gold copy before starting resynchronization
Synchronous SRDF waits for an acknowledgement from the remote side before it acknowledges I/O completion to the host. With SRDF/A, the links (by default) are logically suspended if the link bandwidth cannot keep up with the arrival rate of the writes. Other than slowing down writes or buying more bandwidth, there is no simple solution to this problem.
A similar problem arises if there is insufficient cache to buffer the writes. Rather than slow down the host application, SRDF/A logically suspends the session if it runs out of cache resources.
Typically, since SRDF/A is implemented over longer distances, it takes longer to resynchronize the two sides after a link failure. Therefore, it is important to preserve a gold copy of consistent restartable data on the target side prior to starting resynchronization after a failure.
Course Summary
Key points covered in this course:
Recognizing important technical data to be gathered about the use of Symmetrix
Gathering technical data for Symmetrix
Interpreting and comprehending the gathered data
Recognizing parameters to set and tools for managing Symmetrix
Identifying the best practices for configuring and deploying Symmetrix and its underlying applications
These are the key points covered in this course. Please take a moment to review them.