
Dell Compellent Storage Center

Disaster Recovery Best Practices for Microsoft Hyper-V (Server 2008 R2)


Document Revisions

Date         Revision   Comments

12/19/2011   A          First Revision

THIS BEST PRACTICES GUIDE IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN

TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT

EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

© 2011 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without

the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell.

Dell, the DELL logo, the DELL badge, and Compellent are trademarks of Dell Inc. Other trademarks and

trade names may be used in this document to refer to either the entities claiming the marks and names

or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than

its own.



Contents

Document Revisions
Contents
    General Syntax
    Conventions
Preface
    Audience
    References
    Purpose
Introduction
    Dell Compellent Storage Center Overview
    Microsoft Hyper-V Overview
An Overview of Basic Disaster Recovery Concepts
    Cost-Risk Analysis
    Disaster Recovery and Disaster Avoidance
    Testing and Documenting the Plan
    Virtualization and Disaster Recovery
    Microsoft Hyper-V and Disaster Recovery
Addressing the Prerequisites
    Multiple Sites
    Alternate Sites
    Dell Compellent Storage Centers at Primary and Alternate Locations
    Connectivity Between the Primary and Alternate Sites
    Network Design at Alternate Sites
    Documentation
Dell Compellent Options for Hyper-V Guest Recovery
Disaster Recovery of a Hyper-V Guest from a Replay
    Verifying the Environment
    Hyper-V Guest Recovery Steps
    Considerations Before Bringing the Primary Site Back on Line
    Fail-back Options
Disaster Avoidance and Live Volume
Conclusion



Tables

Table 1. Document Syntax
Table 2. Dell Compellent Hyper-V Recovery Options
Table 3. Guest Details Needed for Manual Guest Recovery

Figures

Figure 1: Cost/risk analysis
Figure 2: Hyper-V Guest configuration at the primary site
Figure 3: Replay Manager 6 Restore Points
Figure 4: Remote Replication to a remote Storage Center
Figure 5: Replays and VSS Snapshots replicated to an alternate location
Figure 6: Remove remote volume replication
Figure 7: Create a View Volume from a Restore Point
Figure 8: Create a View Volume from a Replay
Figure 9: Clear volume attributes on a View Volume of a Replay Manager Replay
Figure 10: Run CHKDSK after modifying attributes
Figure 11: Validate a Cluster to review disk characteristics
Figure 12: Add the View Volume to the recovery cluster as a new CSV



General Syntax

Table 1. Document Syntax

Item                                               Convention
Menu items, dialog box titles, field names, keys   Bold
Mouse click required                               Click:
User Input                                         Monospace Font
User typing required                               Type:
Website addresses                                  http://www.compellent.com
Email addresses                                    [email protected]

Conventions

Notes are used to convey special information or instructions.

Timesavers are tips specifically designed to save time or reduce the number of steps.

Caution indicates the potential for risk including system or data damage.

Warning indicates that failure to follow directions could result in bodily harm.



Preface

Audience

This document is highly technical and intended for storage and server administrators, as well as other

information technology professionals interested in learning more about how Microsoft Hyper-V

integrates with the Dell Compellent Storage Center.

This document assumes the reader has read, has formal training, or has advanced working knowledge

of the following:

Installation and configuration of Microsoft Hyper-V.

Configuration and operation of the Dell Compellent Storage Center.

Business continuity, disaster recovery, and disaster avoidance planning.

References

Reviewing the following documentation is highly recommended prior to referencing this best practices

guide:

Microsoft Hyper-V Planning and Deployment Guide:

http://www.microsoft.com/downloads/details.aspx?familyid=5DA4058E-72CC-4B8D-BBB1-5E16A136EF42&displaylang=en

The Microsoft TechNet Hyper-V document collection:

http://technet.microsoft.com/en-us/library/cc753637.aspx

Microsoft Exchange and SQL support policies on Hyper-V:


http://support.microsoft.com/default.aspx/kb/956893

Dell Compellent Documentation:

o Storage Center 5.5 Users Guide

o Replay Manager 6 Users Guide

o Replay Manager 6 Best Practices for Microsoft Hyper-V

o Live Volume Best Practices Guide

o PowerShell Command Set 6.1 Release Notes and Users Guide

http://knowledgecenter.compellent.com

Windows 2008 Performance Tuning Guidelines

http://www.microsoft.com/whdc/system/sysperf/Perf_tun_srv.mspx




Purpose

As stated in the Preface, this document assumes that the reader already has a good understanding of

business continuity and disaster recovery/avoidance planning. An overview of key concepts will be

reviewed below. If additional information is needed concerning how business continuity and disaster

recovery/avoidance planning apply to your environment, many good resources are available on the

Internet or from other sources.

The main purpose of this document is to provide disaster recovery best practices for Microsoft Server

2008 R2 Hyper-V in conjunction with the features of the Dell Compellent Storage Center.

Please note that the information contained within this document provides general recommendations

only and may not be applicable to all configurations. Configurations may vary based upon individual

circumstances, environments, or business needs.



Introduction

Dell Compellent Storage Center Overview

The Dell Compellent Storage Center is an enterprise class storage area network (SAN) that significantly

lowers capital expenditures, reduces storage management and administration time, provides

continuous data availability and enables storage virtualization. Storage Center’s industry-standard

hardware and sophisticated software manage data at the block-level, maximizing utilization,

automating tiered storage, simplifying replication and speeding data recovery.

Microsoft Hyper-V Overview

Hyper-V is a virtualization server role provided with Windows Server 2008 and Windows Server 2008 R2.

Hyper-V is a layer of software that sits between the Hyper-V Host server’s hardware and the Hyper-V

Guest VMs. Hyper-V presents hardware resources in a virtualized manner from the Host server to the

Guests. Hyper-V Hosts (also referred to as virtualization servers) can host multiple Hyper-V Guest VMs,

which are isolated from each other but share the same underlying hardware resources (e.g. processors,

memory and I/O devices).

By consolidating traditional physical servers to virtual servers on a single Host server, virtualization can

improve resource utilization, increase power efficiency and reduce operational and maintenance costs.

In addition, Hyper-V Guests and the associated management tools offer greater flexibility for managing

resources, balancing load, provisioning systems, and recovery.

An Overview of Basic Disaster Recovery Concepts

A good business continuity strategy will always incorporate disaster recovery planning. At a high level,

a disaster recovery plan is a process whereby a company ensures it is able to recover as quickly as

possible from data loss (by restoring the most recent data) or from an interruption or failure that

prevents access to data. It is a very important part of overall IT strategy and in some cases is governed

by regulations specific to particular industries.

When a major event happens, a disaster recovery plan will require the involvement of more than just

the IT department. A business continuity/disaster recovery response team should consist of key

staff who are assigned and trained to handle the various aspects of dealing with a disaster.

In addition to the IT department recovering data, or failing over services and applications to alternate

sites, others may be assigned to deal with communications, media releases, security, notifying

customers, accounting for employees, etc.

The disaster recovery scenarios that may be encountered (and therefore must be planned for) are quite

diverse. Disasters can be small (the loss of a single document that impacts one user) or large (a natural

disaster). For the most part, the essential elements of disaster recovery are commonplace, reliable,

fairly inexpensive, easy to implement, and are able to address or prevent the vast majority of events

that are most likely to occur. These elements include tape backups with off-site storage, on-line

backups, network and physical security measures, malware protection, redundant hardware and

internet connections, SAN-based snapshots, and battery backups or generators.

Many of these basic levels of protection are probably already in place in your environment. Business

continuity becomes more complicated with size and number of locations. While virtualization



technologies such as Microsoft Hyper-V can help ensure continuity in case of a disaster, they also add

complexity to the underlying design. This document will help administrators running Microsoft Hyper-V

on Dell Compellent better understand their disaster recovery and avoidance options, particularly when

the design involves multiple locations.

Cost-Risk Analysis

Disaster recovery planning is an ongoing process that involves cost/risk analysis. Some ubiquitous

protections are simple and rather inexpensive to implement. They provide adequate safeguards for

events which are most likely to occur. Other protections for more catastrophic events may be too costly

or complicated to implement, especially given the very low odds that such events

will ever occur.

Figure 1: Cost/risk analysis

The graph in Figure 1 shows the relationship between the odds, cost, complexity, and business impact

of a particular event that invokes a disaster recovery response. In general, the more disruptive or

catastrophic events tend to have much lower odds of occurring, but they also usually require more

expensive or complicated designs to ensure business continuity if they do occur. Therefore, decisions

must be made based on these factors as to which disaster recovery measures will be implemented.

Where to draw the line is unique to each company and each location. Some protections

may be implemented as part of a multi-year plan to spread the cost over time.
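
The trade-off between the odds of an event, its business impact, and the cost of protecting against it can be illustrated with a simple expected-loss calculation. The sketch below is not part of any Dell Compellent tool. The function names and all figures are hypothetical, and the model assumes (as a simplification) that a protection fully prevents the loss it guards against.

```python
def annualized_loss(probability_per_year: float, impact_cost: float) -> float:
    """Expected yearly loss: odds of the event in a given year times its cost."""
    return probability_per_year * impact_cost


def protection_is_justified(probability_per_year: float,
                            impact_cost: float,
                            annual_protection_cost: float) -> bool:
    """True when the protection costs less per year than the loss it guards against."""
    return annual_protection_cost < annualized_loss(probability_per_year, impact_cost)


# Hypothetical example: a 1-in-50-year event with a business impact of 2,000,000.
odds, impact = 1 / 50, 2_000_000
print(protection_is_justified(odds, impact, annual_protection_cost=25_000))
print(protection_is_justified(odds, impact, annual_protection_cost=60_000))
```

In this made-up case the expected loss is roughly 40,000 per year, so a protection costing 25,000 per year would be justified while one costing 60,000 per year would not. Real decisions also weigh regulation, reputation, and other factors that a single number does not capture.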

Questions that might be asked as part of a cost/risk analysis include:

What regulations apply to my industry?

What applications and data are the most mission critical to our business?

What is the Recovery Time Objective (RTO) for each application or service? In other words, how long can something be down before the business impact becomes too great? For example:



o Practice Management System – 30 minutes

o Messaging system – 4 hours

o Research and Development Server – 2 days

What is the Recovery Point Objective (RPO)? In other words, how much data loss is acceptable for a particular subset of data? Is backing up the mail server once a day adequate? If so, the mail server has an RPO of 24 hours, meaning that up to 24 hours of mail data may be lost if the mail server has to be recovered from the last backup.

What types of events are most likely to occur, factoring in the geographic location? A coastal location may be subject to hurricanes, and a location on a fault line may be subject to earthquakes.

Is an alternate site far enough away so that the same event does not take out both locations?

How much will it cost (hardware, software, and staff) to implement and support the desired protections and is that cost justified given the risk?
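
The RTO and RPO questions above reduce to simple comparisons once targets and estimates are written down. The following is a minimal sketch, assuming hypothetical function names and hypothetical recovery-time estimates; only the RTO targets and the daily mail backup come from the examples in this section.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss equals the time between backups, so compare it to the RPO."""
    return backup_interval <= rpo


def meets_rto(estimated_recovery: timedelta, rto: timedelta) -> bool:
    """A service meets its RTO when it can be restored within the allowed downtime."""
    return estimated_recovery <= rto


# RTO targets taken from the examples in this section.
rto_targets = {
    "Practice Management System": timedelta(minutes=30),
    "Messaging system": timedelta(hours=4),
    "Research and Development Server": timedelta(days=2),
}

# Hypothetical recovery-time estimates, e.g. measured during a test of the plan.
estimated_recovery = {
    "Practice Management System": timedelta(minutes=45),
    "Messaging system": timedelta(hours=2),
    "Research and Development Server": timedelta(hours=6),
}

at_risk = [name for name, rto in rto_targets.items()
           if not meets_rto(estimated_recovery[name], rto)]
print("Services that would miss their RTO:", at_risk)

# A mail server backed up once a day satisfies a 24-hour RPO but not a 4-hour one.
print(meets_rpo(timedelta(hours=24), rpo=timedelta(hours=24)))
print(meets_rpo(timedelta(hours=24), rpo=timedelta(hours=4)))
```

With these assumed estimates, only the Practice Management System would miss its target, which is the kind of gap that periodic testing of the plan is meant to expose.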

Disaster Recovery and Disaster Avoidance

Disaster recovery usually means reacting to an unexpected event that has already taken place. These

events can be categorized as follows:

Events that cause data loss such as malware infection, corruption, accidental deletion, or

hardware failure of disks or disk arrays.

Events that interrupt the ability to access data within or between sites, such as damaged

internet lines (e.g. a backhoe operator cuts the fiber link to your main site), short or extended

power outages, or network hardware failures.

Events that cause both loss of data and loss of access to a site, typically caused by more

significant events such as a fire or natural disaster.

Disaster events, small and large, usually come with little or no warning, and recovery can involve both

manual and automatic processes. A manual process might be required to restore lost data from a backup

or snapshot, or to bring a Hyper-V Guest on line at an alternate site. An automatic process kicks in on

its own, such as when a Hyper-V Guest fails over to another Node in a Cluster if the Node that currently

owns the Guest experiences a hardware failure.

Disaster avoidance implies proactively dealing with an event before it happens in a way that avoids or

minimizes down time. This is the strategy commonly used when doing system maintenance. An

administrator may live migrate Hyper-V Guests from Node A to Node B, patch Node A and reboot it, and

then live migrate the Hyper-V Guests from Node B back to Node A, all without affecting user access to

the Guests.

Disaster avoidance also comes into play when a site has advance notice of a disaster, such as a

coastal location that has several days' warning of an impending hurricane, or when a site located in a

flood plain learns that a projected flood crest may overwhelm the levees. With the advance warning,

administrators have limited time to proactively failover or move applications and services to an

alternate location (one that will be unaffected by the impending disaster) before it strikes. However,

having advance notice of major events is obviously the exception and not the rule.



A good business continuity plan will include both disaster recovery and disaster avoidance strategies

involving a combination of both manual and automatic processes to address a wide range of possible

scenarios.

Testing and Documenting the Plan

The time to find out if elements of your disaster recovery plan will work is not when faced with an

event. Testing your plan periodically is just as important as having a plan. This will help ensure that

given an actual event, RTO goals can still be met.

Documentation is also key, especially when manual processes are required as part of a disaster

recovery plan. While steps and configurations may seem familiar and easy to remember at present,

they may not be in the future given the passage of time. Keeping documentation

current and readily available will also help speed recovery when an event occurs.

Virtualization and Disaster Recovery

Virtualization technology now allows servers to be virtualized so that the resources the VM needs (disk,

CPU, RAM, etc.) are not tied to any one piece of physical hardware. Some advantages of virtualization

include higher density, higher utilization, lower maintenance and support costs, and more efficient use

of power. Another big advantage, which lends itself well to business continuity planning, is VM

mobility – the ability for a VM to move from one host or node to another.

VM mobility, along with the lower cost of faster and more reliable Internet connections, has made it

possible for companies to take advantage of disaster recovery and avoidance options involving multiple

locations that were previously too complicated or cost prohibitive to implement.

While VM server mobility can be extended to remote locations over WAN connections, many complex

and often expensive design considerations may still be faced, especially when the goal is to automate

the fail-over and fail-back processes as much as possible. Some reasons why automated failover is

highly desirable when faced with disaster include:

Manual failover and failback processes may take too long, so RTO goals cannot be met.

IT resources can be quickly overwhelmed with manual processes when a disaster occurs.

Manual processes, which can often be complicated, are more prone to user error, especially when administrators are under the stress of dealing with a disaster.

Microsoft Hyper-V and Disaster Recovery

The more routine day-to-day aspects of disaster recovery and avoidance may already be in place in

your Hyper-V environment, such as Storage Center Replays, Replay Manager Backups, and remote

replication of volumes with these Replays to a remote site. For more information about these Dell

Compellent features (including Replay Manager 6 Best Practices for Hyper-V), please refer to the

References section of this document.

While Hyper-V has native features that allow administrators many good disaster recovery and

avoidance options (particularly within the same site and within the same cluster), the design

limitations of Hyper-V become much more apparent when the goal of a business continuity plan is to

extend VM mobility (e.g. failover of Guests) to other sites or between separate Hyper-V Clusters.



The good news is that when administrators leverage Hyper-V in concert with the features available

with the Dell Compellent Storage Center, Hyper-V’s limitations can be mitigated, as will be shown

step-by-step below.

Third party software may be required to bridge the gap if automating the failover/failback of Hyper-V

Guests between sites and/or between separate Hyper-V clusters is necessary in order to meet RTO

goals.

Addressing the Prerequisites

In order for Hyper-V Guests to be brought online at an alternate location as quickly as possible,

a number of prerequisite design considerations must be taken into account and planned for

in advance:

Multiple Sites

With only one location available, recovery options are extremely limited if a major event prevents

access to that site. When two locations are used, they should ideally be far enough apart so that the same event

does not take out both sites.

Alternate Sites

Alternate sites can be referred to as hot, warm or cold.

A “hot” site usually means that one or more applications or systems at a primary site are mirrored or clustered at a secondary site in such a way that automatic failover of at least some of these resources can occur between the two sites. For example, a Hyper-V Cluster may have Nodes that exist at both locations (stretch clustering), allowing for Guest Live Migration in conjunction with Live Volume, assuming all the prerequisite design considerations are in place (e.g. MPIO, adequate bandwidth between the sites, and Layer 2 Networking stretched across both sites).

A “warm” site means an alternate site has all the necessary hardware and networking components in place to accommodate the primary site’s Hyper-V Guests if they need to be manually failed over. This requires having Server 2008 R2 Hyper-V Hosts pre-provisioned at alternate sites that have the capacity to accommodate the primary site’s Guests.

A “cold” site means that some (or all) of the necessary hardware infrastructure is lacking. This hardware might need to be obtained from a vendor or another location and transported to the site. For recovery to a “cold” site, administrators have to plan for the additional time and resources required to set up the required hardware. This additional lead time may make it difficult or impossible to meet RTO goals.

Dell Compellent Storage Centers at Primary and Alternate Locations

The Storage Centers at any alternate locations should have adequate disk performance and capacity to

accommodate the primary site’s Hyper-V Guests and data. To save money, a company may choose to

do the following at their alternate location(s):

Install fewer hardware redundancies (at their own risk).

“Cascade” an older Storage Center from a primary site down to an alternate site to serve as a backup Storage Center there.

Use lower-performance disks with slower back-end connectivity, so long as the design still provides adequate I/O performance.



Connectivity Between the Primary and Alternate Sites

Over long distances, this is usually accomplished by way of private (dedicated) or public (VPN) Internet

lines of sufficient capacity to allow for remote replication of Dell Compellent volumes, Replays and

Restore Points from one Storage Center (at a primary site) to another Storage Center (at an alternate

site).

Within shorter distances, such as within a metro area or a campus, “dark” (dedicated) fiber links may

be available as a means to connect sites that are in close proximity to each other. With a high-speed

low-latency fiber link connecting sites, stretch clustering (in conjunction with Dell Compellent Live

Volume) becomes a viable disaster avoidance strategy.

Network Design at Alternate Sites

The network design at an alternate site needs to allow users and systems (on site and remote) to

continue to access the Hyper-V Guests once they are brought online there. Design considerations

include:

Routing rules and/or gateway settings

VLAN configuration

Firewall rules

DNS (both internal and external)

Domain Authentication for end users, domain service accounts, clusters, etc.

If Active Directory (AD) Domain Controllers (DCs) are virtualized as Hyper-V Guests, they should be on

standalone (not clustered) Hyper-V Server Hosts. Since a Hyper-V Cluster needs to authenticate to an

available DC before it (or any of its Guests) can start, administrators can quickly find themselves in a

bind if the only available DC server exists as a Guest on a Cluster that depends on that particular DC to

authenticate.

The best case recovery scenario provides for stretched Layer 2 networking functionality between the

two sites so that Guests will function at either location without their IP, DNS and gateway settings

needing to be changed. This will greatly simplify the process of bringing Guests on line, and minimize

the manual changes that must be made to Guests before they will function. However, Layer 2

extensibility across sites can involve complex network design considerations that may not always be

practical or possible given the environment.

Documentation

Documentation is a critical (and often overlooked or neglected) component of ensuring that a disaster

recovery plan goes as smoothly as possible.

Documenting key information about Hyper-V Guests that may need to be manually recovered at an alternate site is important because some settings may have to be manually reconfigured.

o Number of CPU Cores assigned.

o Static or dynamic memory settings.

o Disk configuration (e.g. which VHDs are presented as which drive letters).

o IP settings.

o Which applications or services on a Guest are tied to the Guest’s name or current IP address? Application-level changes may also be necessary in some cases if the server’s name or IP settings change.



Document the fail-over process itself and refine it as part of testing. Understand which Hyper-V Guests will take priority based on RTO goals for specific services or subsets of data.

Accessibility – loss of access to a primary site should not interfere with being able to access key documentation. Make sure that current documentation exists at one or more alternate locations that can be ideally accessed remotely and securely by the necessary staff.
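
The Guest details and recovery priorities described above lend themselves to a small machine-readable record that can be stored alongside the written plan at an alternate location. The sketch below is one possible layout, not a Dell Compellent or Microsoft format. The class, its field names, the RTO values, and Guest02 are hypothetical; the Guest01 values mirror the example in Table 3.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class GuestRecord:
    name: str
    cpu_cores: int
    memory: str                                  # e.g. "4 GB (static)"
    ip_address: str
    disks: dict = field(default_factory=dict)    # drive letter -> VHD name
    rto_minutes: int = 0                         # hypothetical field used to rank priority
    notes: str = ""                              # e.g. applications tied to name or IP


def recovery_order(guests):
    """Return Guests sorted so that the shortest RTO is recovered first."""
    return sorted(guests, key=lambda g: g.rto_minutes)


def to_json(guests) -> str:
    """Serialize the records so they can be kept at an alternate location."""
    return json.dumps([asdict(g) for g in guests], indent=2)


guests = [
    GuestRecord("Guest02", 2, "2 GB (dynamic)", "172.16.23.221",
                {"C:": "Guest02_Boot"}, rto_minutes=240),       # hypothetical Guest
    GuestRecord("Guest01", 4, "4 GB (static)", "172.16.23.220",
                {"C:": "TSHVC01_Guest01_Boot", "D:": "TSHVC01_Guest01_Data01"},
                rto_minutes=30),                                 # RTO value is assumed
]

print([g.name for g in recovery_order(guests)])
print(to_json(guests))
```

Keeping the record in a plain format such as JSON makes it easy to copy to several locations and to review as part of each test of the fail-over process.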



Dell Compellent Options for Hyper-V Guest Recovery

Table 2. Dell Compellent Hyper-V Recovery Options

Dell Compellent Option                                          Disaster Recovery   Disaster Avoidance
Storage Center Replays (crash-consistent volume snapshots)      X                   X
Replay Manager Restore Points (application-consistent
  backups of Hyper-V Guests that leverage Microsoft VSS)        X                   X
Remote Volume Replication (which also replicates the
  volume’s associated Replays and Restore Points)               X                   X
Live Volume (asynchronously mirrored volumes between
  two different Storage Centers)                                n/a                 X

As shown in Table 2, the first three options (Storage Center Replays, Replay Manager Restore Points,

and Volume Replication to a remote location) can be used both in disaster recovery and disaster

avoidance situations. However, the fourth option, Live Volume, is not a valid disaster recovery option

at present. Also note that Replay Manager Restore Points are not supported with Live Volume at the

present time.

Disaster Recovery of a Hyper-V Guest from a Replay

In this example, a Hyper-V Guest will be recovered at a secondary site using either a Storage Center

Replay or Replay Manager Restore Point. This example assumes that:

An unexpected disaster at the primary site has resulted in a loss of hardware and data.

A secondary site is available and located far enough away from the primary site so as to be unaffected by the event that took down the primary site.

The secondary site is a “warm” site.

The secondary site has all of the design considerations met to allow for manual recovery of Hyper-V Guests, as reviewed in the Addressing the Prerequisites section of this document.

For more information on how to set up and use Storage Center Data Instant Replays, Replay Manager

Backups, and Volume Replication to a remote Storage Center, please refer to the documentation listed

in the References section of this document.



Verifying the Environment

Figure 2: Hyper-V Guest configuration at the primary site

As shown in Figure 2, the Hyper-V Guest “Guest01” at the primary location (as set up before the disaster)

is configured to use a cluster shared volume (CSV) on Cluster01.

The Hyper-V Cluster Nodes, along with the CSV containing Guest01’s VHD files, are configured on

Storage Center 12 at the primary location. Hourly, daily, and weekly Storage Center Data Instant

Replays are being taken of the CSV volume.

Figure 3: Replay Manager 6 Restore Points

For purposes of this example, Replay Manager Backups using the Hyper-V extension for VSS (application

consistent) backups are also configured for Guest01, as shown in Figure 3.



Figure 4: Remote Replication to a remote Storage Center

As shown from the perspective of Storage Center 12 at the primary location (see Figure 4), the CSV has

been replicated from Storage Center 12 to Storage Center 13 at the remote site. Note that the unique

icon helps with identifying volumes that are being replicated, as shown in Figure 4.

Figure 5: Replays and VSS Snapshots replicated to an alternate location

From the perspective of Storage Center 13 at the alternate location, the CSV and all of its Replays have

been replicated to Storage Center 13 (as shown under the Replays tab for the replicated volume). This

includes both the crash consistent Storage Center Replays, and the application consistent Replay

Manager Restore Points as shown in Figure 5.

For ease of use, it is recommended that administrators append “_Repl” (or something similar) to

replicated volume names and other associated volume objects (such as folder names) on the remote

Storage Center to make identification of these resources easier, as shown in Figure 5.
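
The naming recommendation above can be applied consistently with a small helper. This is only a sketch of the convention itself, using hypothetical function names; it does not call any Storage Center interface, and the volume name is the one used in this document's example.

```python
REPLICA_SUFFIX = "_Repl"


def replicated_name(name: str, suffix: str = REPLICA_SUFFIX) -> str:
    """Append the suffix once; names that already carry it are returned unchanged."""
    return name if name.endswith(suffix) else name + suffix


def is_replica(name: str, suffix: str = REPLICA_SUFFIX) -> bool:
    """True when a volume or folder name follows the replica naming convention."""
    return name.endswith(suffix)


source_volume = "TS-HV-Cluster01_CD02_CSV01_Guests"
print(replicated_name(source_volume))
print(is_replica(source_volume), is_replica(replicated_name(source_volume)))
```

Making the helper idempotent avoids names such as "..._Repl_Repl" if it is run more than once against the same list of objects.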

Hyper-V Guest Recovery Steps

In this example, Guest01 will be recovered from a Replay on the replicated volume on the remote

Storage Center at the alternate location.




Table 3. Guest Details Needed for Manual Guest Recovery

Guest Details                     Primary Location                    Recovery Location
--------------------------------  ----------------------------------  --------------------------------------
Guest Name                        Guest01                             Guest01
Hyper-V Cluster                   Cluster01                           Cluster04
Guest Static IP Address           172.16.23.220                       172.16.23.220
Subnet Mask                       255.255.255.0                       255.255.255.0
Gateway                           172.15.23.1                         172.15.23.1
DNS Servers                       172.16.23.10, 11                    172.16.23.10, 11
RAM                               4 GB (static)                       4 GB (static)
Number of CPU Cores assigned      4 cores                             4 cores
AD Domain name                    techsol.local                       techsol.local
Backup Method                     Storage Center Replays              Storage Center Replays
                                  Replay Manager 6 Hyper-V Backups    Replay Manager 6 Hyper-V Backups
Virtual DVD drive                 Z:\                                 Z:\
Boot Vol.
  Drive Letter                    C:\                                 C:\
  VHD Name                        TSHVC01_Guest01_Boot                TSHVC01_Guest01_Boot
  Storage Center CSV Volume Name  TS-HV-Cluster01_CD02_CSV01_Guests   TS-HV-Cluster01_CD02_CSV01_Guests_Repl
  Storage Center                  Storage Center 12                   Storage Center 13
Data Vol.
  Drive Letter                    D:\                                 D:\
  VHD Name                        TSHVC01_Guest01_Data01              TSHVC01_Guest01_Data01
  Storage Center CSV Volume Name  TS-HV-Cluster01_CD02_CSV01_Guests   TS-HV-Cluster01_CD02_CSV01_Guests_Repl
  Storage Center                  Storage Center 12                   Storage Center 13
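Guest details such as those in Table 3 can also be kept in a machine-readable form, so that a recovery runbook can flag any detail that was never written down. The following is a minimal Python sketch of that idea, not part of any Dell Compellent tool. The class name, field names, and the missing_details helper are hypothetical; the sample values are the Recovery Location column of Table 3.

```python
from dataclasses import dataclass, fields


@dataclass
class GuestRecoveryRecord:
    """Hypothetical record of the details needed to rebuild one Guest."""
    guest_name: str = ""
    cluster: str = ""
    static_ip: str = ""
    subnet_mask: str = ""
    gateway: str = ""
    dns_servers: str = ""
    ram: str = ""
    cpu_cores: str = ""
    ad_domain: str = ""
    boot_vhd: str = ""
    data_vhd: str = ""
    csv_volume: str = ""
    storage_center: str = ""


def missing_details(record):
    """Return the names of all fields that are still blank."""
    return [f.name for f in fields(record)
            if not getattr(record, f.name).strip()]


# Values copied from the Recovery Location column of Table 3.
guest01 = GuestRecoveryRecord(
    guest_name="Guest01",
    cluster="Cluster04",
    static_ip="172.16.23.220",
    subnet_mask="255.255.255.0",
    gateway="172.15.23.1",
    dns_servers="172.16.23.10, 11",
    ram="4 GB (static)",
    cpu_cores="4 cores",
    ad_domain="techsol.local",
    boot_vhd="TSHVC01_Guest01_Boot",
    data_vhd="TSHVC01_Guest01_Data01",
    csv_volume="TS-HV-Cluster01_CD02_CSV01_Guests_Repl",
    storage_center="Storage Center 13",
)

print(missing_details(guest01))  # an empty list means nothing is blank
```

A record with blank fields returns their names, which gives the administrator a short list of items to document before a disaster occurs.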

1) Review available documentation for Guest01. Table 3 above shows an example of data an

administrator might find very useful when needing to recover a Hyper-V Guest at an alternate

site. With no access to the primary site, it might be difficult to recreate Guests with exactly

the same configuration without some minimal documentation available.

2) Decide what to do with the volume replication between the primary and alternate sites:

   • If (as in this example) the Storage Center hardware was lost at the primary location, then volume replication settings should be removed (see Step 3 below).

   • If the primary site will come back on line with its Storage Center and data intact, then an administrator may choose to leave the volume replication settings in place. Once both sites are back on line, volume replication will resume (skip to Step 4).

Figure 6: Remove remote volume replication

3) To remove remote volume replication, from the Storage Center Manager GUI for the Storage Center at the alternate location (SC13 in this example), click on the Mapping tab for the desired volume as shown in Figure 6. The mapped “Server” in this case is the Storage Center at the primary site. Click on the Remove Mappings button to remove the mapping.

4) The next step in the recovery process is to create a View Volume from a Replay that is

associated with the Replicated volume. As shown in Figure 5, there are two types of Replays to

choose from in this example: crash-consistent Storage Center Replays, or application-

consistent Replay Manager 6 Restore Points. A View Volume can be created from either type of

Replay.

Figure 7: Create a View Volume from a Restore Point

5) To create a View Volume, identify the desired Storage Center Replay or Replay Manager 6

Restore Point (choose the most recent to minimize data loss) from under the Replays tab.

Right click on it, and select Create Volume from Replay, as shown in Figure 7. In this

example, a Replay Manager 6 Restore point was chosen to create the View Volume.

The two different kinds of Replays can be differentiated by their icons:

   • Storage Center (crash-consistent) Replay

   • Replay Manager 6 Restore Point (VSS application-consistent) Replay
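The selection rule in Step 5 (choose the most recent Replay, optionally preferring one kind) can be sketched in a few lines of Python. This is an illustration only: the list of Replays, its field names, and the timestamps are hypothetical values that an administrator might copy by hand from the Replays tab. They are not read from a Storage Center.

```python
from datetime import datetime

# Hypothetical inventory copied by hand from the Replays tab.
# Names and timestamps are illustrative, not taken from the figures.
replays = [
    {"name": "Hourly", "kind": "storage_center",
     "frozen": datetime(2011, 12, 19, 9, 0)},
    {"name": "Guest01 backup", "kind": "restore_point",
     "frozen": datetime(2011, 12, 19, 8, 30)},
    {"name": "Daily", "kind": "storage_center",
     "frozen": datetime(2011, 12, 19, 0, 0)},
]


def pick_replay(candidates, prefer_kind=None):
    """Return the most recent Replay, preferring one kind when any exist."""
    if prefer_kind is not None:
        preferred = [r for r in candidates if r["kind"] == prefer_kind]
        if preferred:
            candidates = preferred
    if not candidates:
        raise ValueError("no Replays available")
    return max(candidates, key=lambda r: r["frozen"])


print(pick_replay(replays)["name"])                   # Hourly
print(pick_replay(replays, "restore_point")["name"])  # Guest01 backup
```

Preferring the application-consistent Restore Point can cost some recency, as in the sample data, where the newest Restore Point is 30 minutes older than the newest Storage Center Replay.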



Figure 8: Create a View Volume from a Replay

6) Complete the View Volume creation wizard:

   • Provide a name for the View Volume. By default, the suggested name will be the Volume’s original name with “View1” appended, as shown in Figure 8.

   • Indicate the volume folder to save the View Volume under (TS-HV-Cluster04 in this example).

   • Map the View Volume to the desired Hyper-V Server or Hyper-V Cluster.

If the View Volume was created from a Replay Manager Restore Point (as in this example), it is not

possible to present it to a Hyper-V Cluster using Failover Cluster Manager until the VSS structure is

removed from the volume. If mapping a View Volume of a Replay Manager Replay to a Hyper-V cluster

is desired, then the View Volume must be mapped to a standalone Windows host server first as an

interim step to clear the VSS disk structure. After the VSS structure is cleared, then the volume can be

remapped to a Hyper-V Cluster. See Step 15 below for more details on how to clear VSS disk structure

from a View Volume of a Replay Manager restore point.

7) Next, determine if manually clearing volume attributes is necessary:

   • If the Replay chosen in Step 5 was a Storage Center Replay, then volume attributes do not need to be cleared. Please skip to Step 22.

   • If the Replay chosen in Step 5 was a Replay Manager Restore Point (as in this example), then several volume attributes need to be cleared. Please proceed to Step 8 to clear these attributes.

Replay Manager is unable to manage or restore from Replay Manager Restore Points that have been

replicated to another Storage Center. The only way to recover data from a Replay Manager Restore

Point, given a disaster recovery situation where the Restore Point is at an alternate site, is to create a

View Volume from the Restore Point.

8) Use Disk Manager to bring the view volume on line as a disk on the Windows Server Host and

note its volume information. Then open a command prompt window.



Figure 9: Clear volume attributes on a View Volume of a Replay Manager Replay

9) At the command prompt, type:

Diskpart <enter>

10) Using Figure 9 as an example, select the correct volume (volume 3 in this example), verify the selection, and then clear the three attributes (read only, hidden, and ShadowCopy) by entering the commands as shown. Verify that the attributes are cleared (set to “No”) and then exit DiskPart.
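Because Figure 9 is a screenshot, the following Python sketch writes out one plausible DiskPart command sequence for Steps 9 and 10 as a script. The exact commands in the figure are not reproduced in the text, so this sequence is an assumption based on DiskPart's documented "attributes volume clear" syntax and on the three attributes named in Step 21. The function name build_diskpart_script is hypothetical.

```python
# The three volume attributes named in Step 21, written as DiskPart keywords.
ATTRIBUTES = ("readonly", "hidden", "shadowcopy")


def build_diskpart_script(volume_number):
    """Return DiskPart script text that clears the attributes on one volume."""
    if not isinstance(volume_number, int) or volume_number < 0:
        raise ValueError("volume_number must be a non-negative integer")
    lines = [
        f"select volume {volume_number}",
        "detail volume",              # show the attributes before the change
    ]
    lines += [f"attributes volume clear {name}" for name in ATTRIBUTES]
    lines += [
        "detail volume",              # confirm the attributes now read "No"
        "exit",
    ]
    return "\n".join(lines) + "\n"


# Volume 3 is the volume used in this example.
print(build_diskpart_script(3))
```

The generated text can be saved to a file and run on the Windows host with diskpart /s <file>, or typed interactively. Confirm the volume number first, since clearing attributes on the wrong volume is easy to do by mistake.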



11) If the disk does not have a drive letter assigned to it, then using Disk Manager, assign a drive

letter of your choice. In this example, the letter D is assigned to the disk.

Figure 10: Run CHKDSK after modifying attributes

12) As shown in Figure 10, CHKDSK must be run after clearing the attributes:

CHKDSK <DRIVELETTER:> /F <enter>

13) Verify that the CHKDSK command completes without any errors. If errors are indicated (e.g. “an unspecified error has occurred”), then repeat the CHKDSK command. It should complete without any errors.
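The repeat-until-clean check in Steps 12 and 13 can be scripted. The Python sketch below is one possible approach, assuming that CHKDSK's documented exit code 0 (no errors found) is an acceptable definition of "completes without any errors". The function names and the max_attempts limit are hypothetical. The runner is passed in as a parameter so the retry logic can be exercised without touching a disk.

```python
import subprocess


def run_chkdsk(drive):
    """Run CHKDSK /F on a Windows host and return its exit code."""
    return subprocess.run(["chkdsk", drive, "/F"]).returncode


def chkdsk_until_clean(drive, runner=run_chkdsk, max_attempts=3):
    """Repeat CHKDSK until it reports no errors (exit code 0).

    Returns the number of attempts that were needed. Raises RuntimeError
    if the volume is still not clean after max_attempts runs.
    """
    for attempt in range(1, max_attempts + 1):
        if runner(drive) == 0:
            return attempt
    raise RuntimeError(
        f"CHKDSK on {drive} still reports errors after {max_attempts} runs"
    )


# Example (on the Windows host, from an elevated prompt):
#     chkdsk_until_clean("D:")
```

If the limit is reached, stop and investigate the volume rather than continuing the recovery.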

14) If the View Volume is attached to a standalone Hyper-V Server, then leave the disk on line and

proceed to Step 22 to recover any Guests. If the View Volume is to be attached to a Hyper-V

Cluster as a new CSV (as in this example), then clear the VSS structure as shown starting with

Step 15 below.
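The branching in Steps 7 and 14 determines which of the remaining steps apply. As a quick reference, the Python sketch below encodes those two decisions. The function name and the string labels for the Replay kind and the target are hypothetical; the step numbers are the ones used in this procedure.

```python
def remaining_steps(replay_kind, target):
    """List the step numbers to follow after the decision in Step 7.

    replay_kind: "storage_center" or "restore_point"
    target:      "standalone" or "cluster"
    """
    if replay_kind not in ("storage_center", "restore_point"):
        raise ValueError(f"unknown replay kind: {replay_kind!r}")
    if target not in ("standalone", "cluster"):
        raise ValueError(f"unknown target: {target!r}")

    if replay_kind == "storage_center":
        # Step 7: no attributes to clear, skip to Step 22.
        return list(range(22, 32))
    if target == "standalone":
        # Steps 8-14 clear the attributes, then Step 14 skips to Step 22.
        return list(range(8, 15)) + list(range(22, 32))
    # Restore Point destined for a cluster: every step from 8 through 31,
    # including Steps 15-21 that clear the VSS structure.
    return list(range(8, 32))


print(remaining_steps("restore_point", "cluster"))
```

In this example (a Replay Manager Restore Point attached to Cluster04 as a new CSV), no steps are skipped.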



Figure 11: Validate a Cluster to review disk characteristics

15) A View Volume of a Replay Manager Restore Point has a disk IO structure (assigned to it by VSS) that needs to be reset (removed) before Failover Cluster Manager will recognize the View Volume as available disk space for clustering. If an administrator attempts to add a disk that still has the VSS structure to a Cluster, Failover Cluster Manager will not recognize the disk space. As shown in Figure 11, the presence of this Windows IO structure on a disk can be verified by:

   • Validating the cluster (run all tests).

   • After the test finishes, viewing the report.

   • Under Results by Category, clicking on Storage, then on List all Disks.

   • Examining the detail under the Disk Characteristics column. In this example, Physical Drive 1 contains the IO structure needing to be cleared, as indicated by the presence of the “Disk is a snapshot disk” description.

16) To remove this disk IO structure, first ensure that the disk is mapped to a standalone 2008 Server (not a cluster). Assign a drive letter to the disk, and make sure it is online.

17) On this same server, install the Dell Compellent 6.1 PowerShell Command Set (if not already

installed). The installer .msi can be downloaded from the Knowledge Center.

18) Verify the serial number of the disk (volume) on the Storage Center, as found under the

General tab for the disk. In this example, the disk serial number is 000002bb-0000b0eb.

19) Go to Start > All Programs > Compellent Technologies > Storage Center PowerShell Snapin > Compellent Storage Center Command Set Shell.

20) Connect to the desired Storage Center and make the changes by using the PowerShell Cmdlet

“Set-DiskDevice” with the “-ResetSnapshotInfo” parameter. This Cmdlet is available with the

Dell Compellent Command Set 6.1 for PowerShell release.

For example:

   • Type “Get-SCConnection” <enter>.

   • Type “sc13” for the HostName <enter> (to connect to Storage Center 13).

   • Type “<username>” <enter> (must be a valid user on the Storage Center).

   • Type “<password>” <enter>.

   • Type “Get-Help Set-DiskDevice” <enter> (for syntax if desired).

   • Type “Set-DiskDevice -ResetSnapshotInfo -SerialNumber 000002bb-0000b0eb” <enter> (this is the command that removes the VSS structure).

The above command should return a result such as the following:

DeviceName          Size      Status  Health   SerialNumber
\\?\PhysicalDrive1  550.00GB  Online  Healthy  000002bb-0000b0e9

For more information on using PowerShell cmdlets to clear the VSS Disk structure, please refer to the

PowerShell Command Set 6.1 Release Notes and Users Guide as listed in the References section of this

document.

21) Close the PowerShell window. Use DiskPart to ensure that the read only, hidden, and ShadowCopy attributes have been removed from the volume (see Figure 9). Then take the disk off line.

22) Using Storage Center Manager, map the disk to the desired server or cluster object.

23) Using Disk Manager, make sure that the disk is off line, and that it is visible to all Nodes in the

cluster (by running “rescan disks” on each node). Launch Failover Cluster Manager and add

the new disk space to the Cluster. Then add the disk space as a new CSV.

If Failover Cluster Manager fails to see the new disk space, then make sure that (1) the disk is off line on all nodes in the cluster, and (2) the VSS structure was removed with the PowerShell cmdlet. It may be necessary to run the Cluster Validation Wizard and verify that the “Disk is a snapshot disk” description is no longer present (see Step 15).



Figure 12: Add the View Volume to the recovery cluster as a new CSV

24) Figure 12 shows the new View Volume assigned to the Cluster04 server object as a new CSV.

25) After the CSV has been added to the Cluster, make note of the data path (where the VHDs are

located), in this case, C:\ClusterStorage\Volume2 as shown in Figure 12.

26) Create a new Hyper-V Guest, and by using the Guest configuration information from Table 3,

provide the necessary configuration information for the Guest, and attach its VHDs. In this

example, Guest01 has two VHDs on the CSV: one for the OS (as the C:\ drive), and one for data

(as the D:\ drive), as shown in Table 3.

27) Boot the Guest and verify functionality.

28) Repeat steps 26 and 27 to recover additional Hyper-V Guests (if there are multiple Guests on

the CSV) at the alternate site.

29) Configure Storage Center Replays as desired for this View Volume.

30) If another Storage Center is available at this or another alternate location, Volume replication

for this View Volume can be set up to another Storage Center.

31) If Replay Manager Backups of the recovered Guests are desired, then configure new Replay

Manager Backup Sets from an instance of Replay Manager Explorer at the alternate location

that has visibility to the Hyper V Host servers or Cluster Nodes.

Considerations Before Bringing the Primary Site Back on Line

In this example, since a disaster resulted in loss of hardware, data, and infrastructure at the primary location, it is not possible for Guests to come back on line there. If and when the site does come back on line, it will be as though it were a new site.



But what if the disaster resulted in loss of access only (no loss of data or infrastructure)? Once the

event has passed or has been resolved, the primary site will come back on line with Hyper-V Guests and

data that were current as of the start of the outage.

If it is possible (or anticipated) that the primary site will come back on line with infrastructure,

servers, and (old) data intact, the administrator must take some precautions. The original Hyper-V

Guests at the primary location should not be allowed to come back on line, at least not initially, at the

same time as recovered Guests. Allowing this to happen (in addition to possible IP/Server name

conflicts) may cause undesirable results such as data loss, redundancy, or corruption. Strategies to

avoid this might include:

   • Disconnecting or disabling the WAN links between the primary location and alternate locations.

   • Isolating any Hyper-V Guests at the primary site (the ones that were manually recovered at the alternate site) so they cannot be accessed by users or other systems.

Fail-back Options

Fail-back options might include:

   • Performing no fail-back (the secondary site becomes the primary site for the recovered Guests and data).

   • Setting up Volume Replication from the Dell Compellent Storage Center at the alternate site back to the Storage Center at the primary site. Once replication has finished, schedule a maintenance window and, from a Storage Center Replay or a Replay Manager Restore Point, create a View Volume and manually recover Guests to their original location at the primary site.

   • Restoring only the changed data from the manually recovered Guests at the secondary site back to the original Guests at the primary site. Again, this might require a short maintenance window.

Disaster Avoidance and Live Volume

As has been stated previously (see Table 2), the use of Dell Compellent Live Volume is not considered to be a valid disaster recovery strategy.

Consider a configuration where the primary site’s Storage Center contains the primary copy of a Live Volume and an alternate site’s Storage Center contains the secondary copy. If the primary site goes down (that is, the connectivity between the two Storage Centers is lost), access to data on the secondary copy of the Live Volume at the alternate site is not possible.

While it is possible to have Copilot forcibly remove Live Volume attributes from the secondary copy of

a Live Volume in the event of a disaster, there are other risks involved with attempting to recover data

in this fashion. For more information, please refer to the Live Volume Best Practices Guide as listed in

the References section of this document.

The main purpose of Live Volume is to be able to gracefully move volumes between separate Storage Centers on either a temporary or permanent basis. Three common scenarios include:

   • Planned maintenance needs to be performed on a Storage Center that takes it off line.

   • Due to growth, an administrator needs to move some volumes from one Storage Center to another to balance I/O.

   • An administrator needs to move volumes to a remote Storage Center at a different location.

The last bullet – moving volumes to a different location – is where Live Volume lends itself well to

disaster avoidance planning and strategy. This may involve a scenario where an administrator has a

limited amount of time to proactively prepare for an impending disaster (such as a hurricane) that will

affect one or more sites.

With disaster avoidance, many prerequisites come into play, as reviewed above in this document. Three key components include:

   • Stretch Clustering. This means that a Hyper-V Cluster has one or more nodes at each of two separate locations, on separate Storage Centers. One Hyper-V cluster is in essence “stretched” across two sites.

   • Stretched Layer 2 networking. Layer 2 networking functionality is also “stretched” between the two sites, so that Hyper-V Guests can be live migrated to either site without having to reconfigure any network settings. While stretched Layer 2 networking is not a requirement for Live Volume to function, it is required for seamless migration of Hyper-V Guests.

   • “Well connected” sites. The primary and alternate sites must be “well connected” to ensure that adequate bandwidth is available along with low latency. Connectivity would ideally be a high bandwidth fiber connection with 5 ms or less latency. This “well connected” requirement limits the maximum distance between the two sites to around 60 miles or less, which makes disaster avoidance a challenge given that sites located too closely together may be affected by the same disaster.

Given a disaster avoidance situation affecting the primary location, an administrator has time to proactively:

   • Live migrate Hyper-V Guests to a Storage Center at an alternate site.

   • Shift primary ownership of any associated Live Volumes to the Storage Center at the alternate site.

Once the event has passed or been resolved, the Guests and Live Volumes can be moved back to the primary site.

For more details on how to set up and use Live Volume in conjunction with Microsoft Hyper-V and

stretch clustering, please refer to the Live Volume Best Practices Guide as listed in the References

section of this document.

Conclusion

Hopefully this document has proved helpful and has accomplished its purpose by providing administrators with tips and helpful answers to many commonly asked questions associated with disaster recovery/avoidance planning when implementing Microsoft Server 2008 R2 Hyper-V on a Dell Compellent Storage Center.
