
CTO SERIES

CTO: SSEB-20130130

SHAREPOINT, SQL, AND EXCHANGE BACKUP IN VIRTUAL AND PHYSICAL ENVIRONMENTS

CTO Series: Dr. Mark Campbell, Chief Strategy/Technology Officer, Unitrends

INTRODUCTION

Enterprise Windows demands enterprise-class data protection strategies; yet there has never been more confusion concerning the advantages and disadvantages of protecting various versions of Microsoft Windows Server, Microsoft SharePoint Server, Microsoft SQL Server, and Microsoft Exchange Server in both physical and virtual environments. Some of this uncertainty is due to the inherent complexity of these technologies; the rest is a result of aggressive marketing by vendors, each of whom has a particular axe to grind with respect to data protection methodology. Or to put it more simply - when you’ve spent a lot of time and effort creating a data protection company that is in essence a “hammer”, all customers’ environments start to look like “nails.”

The damage of trying to hammer a screw into a wall is pretty easy to understand; it’s much more difficult to conceptualize the difficulty of applying certain types of backup techniques to distributed SharePoint farms. But what both situations have in common is that when you try to use the result of either - whether it’s hanging a heavy mirror or recovering that distributed SharePoint environment - you end up with a mess.

In this white paper we’ll explore the following with respect to data protection of Windows Server, Microsoft SharePoint, Microsoft SQL, and Microsoft Exchange environments:

• Backup
• Archiving
• Replication (private and public cloud-based data protection)
• Failover virtualization (also called instant recovery)

TYPES OF BACKUP

In order to make sure that you metaphorically don’t try to hammer a screw or screw in a nail, you first have to understand what hammers, screwdrivers, nails, and screws are and what tasks they’re best suited to perform. Likewise, in order to understand the best type of backup to use for Windows Server and the server-level Microsoft applications, you first have to understand the native, HOS (Host Operating System), and GOS (Guest Operating System) block-based backup types.

There are three basic types of Windows Server and Microsoft application backup that have evolved over time: native, HOS-based, and GOS-based block. The level at which each operates is depicted in figure 1.

Native and GOS-based block backup occur within the virtual machine - at the GOS level. HOS-based backup occurs at the physical host machine. All three of these backup types will be discussed briefly in the sections that follow. For much more detail on virtual (and physical) backup approaches, please see the white paper entitled “Losing My Religion: Virtualization Backup Dogma, Faith, and Fact.”

Native Backup

Microsoft uses a data protection architecture known as VSS (Volume Shadow Copy Service) to protect its operating systems, applications, and virtualization platform. VSS at the operating system and application level is used not only by Microsoft but also by other virtualization vendors (for example, VMware) to make sure that the data being used by Microsoft operating systems and applications is in a consistent state so that recovery is ensured (this is also called “quiescing”). However, Microsoft as a virtualization vendor also uses VSS at the HOS level.

Native backup may be used in physical environments as well as in virtual environments by protecting each VM (Virtual Machine) independently through the direct use of VSS.

HOS (Host Operating System) Backup

HOS backup refers to protecting one or more VMs, typically by using a set of programmatic interfaces provided by the virtualization vendor. For VMware vSphere these interfaces are known as VADP (vStorage API for Data Protection); for Microsoft they are VSS implemented at the host level. In both cases, the HOS-level programmatic interfaces indirectly call the lower-level VSS protection architecture.

Note that VSS by itself doesn’t perform maintenance and management operations such as the truncation of an application’s transaction logs. This document assumes that these are handled automatically by backup-aware software that can be pushed into and pulled out of the virtual machine (the GOS) - however, you should make sure that your vendor supports this.

GOS (Guest Operating System) Block-Based Backup

GOS block-based backup refers to data protection architectures that attempt to capture data changes below the file system level (hence the term block-based backup). These architectures often use VSS in some form in order to force loose synchronization between the file system and the underlying blocks that form the file system.

VSS: THE KEY TO MICROSOFT AND WINDOWS BACKUP

Microsoft introduced a set of core data protection primitives called VSS (Volume Shadow Copy Service) with Windows Server 2003; all modern Windows Server and Microsoft application protection is based upon it. VSS is used to quiesce the appropriate applications, operating systems, and storage such that a consistent point-in-time snapshot (also called a shadow copy) can be created for the purpose of backup.

The diagram (figure 2) below from Microsoft depicts the core VSS architecture.

The VSS requestor (typically a data protection solution) asks the Volume Shadow Copy Service for a snapshot. The service directs the VSS writers to bring their application data to a consistent state and then has the VSS provider create the snapshot.
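The freeze, snapshot, and thaw sequence can be sketched as a small simulation. This is only an illustration of the ordering; every class and function name below is hypothetical, and none of it is the Windows VSS API.

```python
from dataclasses import dataclass, field


@dataclass
class Writer:
    """Hypothetical stand-in for an application's VSS writer."""
    name: str
    frozen: bool = False

    def freeze(self) -> None:
        # A real writer would flush buffers and hold new writes here.
        self.frozen = True

    def thaw(self) -> None:
        self.frozen = False


@dataclass
class Provider:
    """Hypothetical stand-in for the component that creates the shadow copy."""
    snapshots: list = field(default_factory=list)

    def create_snapshot(self, writers: list) -> dict:
        # A snapshot is application-consistent only if every writer is frozen.
        snapshot = {
            "id": len(self.snapshots) + 1,
            "consistent": all(w.frozen for w in writers),
            "writers": [w.name for w in writers],
        }
        self.snapshots.append(snapshot)
        return snapshot


def request_snapshot(provider: Provider, writers: list) -> dict:
    """What a requestor asks for: freeze all writers, snapshot, then thaw."""
    for w in writers:
        w.freeze()
    try:
        return provider.create_snapshot(writers)
    finally:
        # Writers are always released, even if the snapshot fails.
        for w in writers:
            w.thaw()


writers = [Writer("SQL Server"), Writer("Exchange")]
snap = request_snapshot(Provider(), writers)
print(snap["consistent"], [w.frozen for w in writers])
```

The point of the ordering is that the snapshot is taken only while every writer is quiesced, and that the writers are released immediately afterwards so the applications are paused as briefly as possible.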

VSS writers are available for Windows Server operating systems (Windows Server 2003 and above) as well as the Microsoft applications SharePoint, SQL, and Exchange.

This is the core architecture used to natively protect Windows Server operating systems and Microsoft applications - which in the prior section we called “Native Backup.” It may also be part of the solution used by what we called “GOS Block-Based Backup” - in essence VSS is typically used in these schemes in order to synchronize the blocks being captured with the operating system, applications, and storage.

But what about virtualization-aware backup - what we called HOS Backup in the prior section? As you’d suspect, it’s a bit more complicated. The figure below (figure 3, from Microsoft) depicts an example of data protection in a virtualized environment.

Now - this particular figure depicts Microsoft Hyper-V - but in general the same concepts apply. The hypervisor layer has its own set of data protection primitives - VSS in the case of Microsoft Hyper-V, VADP (vStorage API for Data Protection) in the case of VMware vSphere - that in turn call the VSS services in each virtual machine.

2 I always wanted to see a backup vendor create a marketing campaign targeted to puppy lovers; to me, it makes as much sense as most advertising in our space.

ADVANTAGES AND DISADVANTAGES OF NATIVE, GOS BLOCK, AND HOS BACKUP TECHNIQUES

Before we discuss the specifics of SharePoint, SQL, and Exchange, there are some general advantages and disadvantages we can explore with regard to the native, GOS block, and HOS backup techniques. These are depicted in the table that follows.

Technique   Level   Advantages                    Disadvantages

Native      GOS     • Granularity                 • TCO
                    • Distributed                 • Load
                    • Integrity
                    • RPO best
                    • RTO varies
                    • Passthrough storage

GOS Block   GOS     • RPO good                    • Granularity
                    • RTO varies                  • Distributed
                    • Passthrough storage         • Integrity
                                                  • TCO
                                                  • Load

HOS         HOS     • TCO and simplicity          • Granularity
                    • Automation and inclusion    • Distributed
                    • vCenter/HA                  • RPO
                    • Performance                 • Passthrough storage
                    • RTO

Native backups typically have the following advantages:

• Backup granularity for both what application data to protect as well as varying RPOs for that application data is better.

• Deeper integration may be achieved with clustered operating systems and applications, including location transparency on failover.

• Better potential integrity with the use of native application techniques to check on the integrity of the backup.

• A better RPO can be gained by taking advantage of discrete transaction log capabilities of the application.

• The RTO varies on native backup; however, it typically takes longer than block- or image-based mounted recovery solutions for incremental data recovery.

• This type of backup supports virtual environments in which data resides on guest-attached mounted storage.

• Passthrough disks configured using iSCSI, physical RDM (VMware), and independent disks (VMware) are supported.

Native backups typically have the following disadvantages:

• If you have large numbers of discrete application servers, then managing each one individually has a higher TCO than managing them collectively as part of a larger set of virtual machines.

• Additional potential load may be incurred, particularly if the RPO is set aggressively. Note that better RPOs may not be possible, however, with other techniques.

GOS block backups typically have the following advantages:

• The RPO of a GOS block backup is not quite as good as what is possible with native backup but is potentially better than the RPO associated with HOS backup. The term “potentially” applies because it depends upon the general and application-specific configuration of the physical or virtual machine being protected.

• The RTO varies on GOS block backup; however, if mounting of the block-based image is possible then this type of backup has a fast RTO for incremental data recovery.

• Passthrough disks configured using iSCSI, physical RDM (VMware), and independent disks (VMware) are supported.

GOS block backups typically have the following disadvantages:

• Support for distributed systems can range from completely absent to absolutely minimal.

• Backup granularity for both what application data to protect as well as varying RPOs for that application data is typically unavailable.

• RPO can be poor since typically much more data is being protected; RPO can be poor due to the inability to roll changes up to a 1-minute level.

• GOS block backups do not handle distributed systems well.

• Additional potential load may be incurred, particularly if the RPO is set aggressively. The RPO can’t be set on a granular basis, which also incurs additional load.

HOS backups typically have the following advantages:

• HOS backups have potentially a lower TCO because backing up tens, hundreds, or even thousands of VMs in which the administrator is abstracted from what is in the VM is simpler and easier.

• It’s easier to schedule dynamic server (VM) inclusion and to perform automatic tasks on those dynamically included VMs.

• Support for virtualization constructs like vCenter or HA is available.

• The performance of HOS-level backup within a specified window is typically superior - no semantic understanding of the data means that “blocks are blocks” and can be transferred more quickly.

• RTO is superior.

HOS backups typically have the following disadvantages:

• Support for distributed systems can range from completely absent to absolutely minimal.

• You back up at the virtual machine level; thus backup granularity for both what application data to protect as well as varying RPOs for that application data is typically unavailable.

• RPO can be poor since typically much more data is being protected; RPO can be poor due to the inability to roll transaction logs up to a 1-minute level.

• Passthrough disks configured using iSCSI, physical RDM (VMware), and independent disks (VMware) are not supported.
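One way to put the comparison to work is to encode the table as data and filter it by what a given environment requires. The sketch below is a minimal illustration rather than a selection tool: the dictionary keys and the `candidates` helper are hypothetical names, and the entries listed as “RTO varies” are left out because they are not clear-cut advantages.

```python
# Advantages and disadvantages transcribed from the comparison table above.
# "RTO varies" is omitted for native and GOS block backup.
TECHNIQUES = {
    "native": {
        "level": "GOS",
        "advantages": {"granularity", "distributed", "integrity", "rpo",
                       "passthrough storage"},
        "disadvantages": {"tco", "load"},
    },
    "gos_block": {
        "level": "GOS",
        "advantages": {"rpo", "passthrough storage"},
        "disadvantages": {"granularity", "distributed", "integrity", "tco",
                          "load"},
    },
    "hos": {
        "level": "HOS",
        "advantages": {"tco", "automation", "vcenter/ha", "performance", "rto"},
        "disadvantages": {"granularity", "distributed", "rpo",
                          "passthrough storage"},
    },
}


def candidates(required: set) -> list:
    """Return the techniques whose listed advantages cover every requirement."""
    return [name for name, t in TECHNIQUES.items()
            if required <= t["advantages"]]


# A distributed SharePoint farm that needs granular recovery:
print(candidates({"distributed", "granularity"}))
# Hundreds of VMs where simplicity and cost dominate:
print(candidates({"tco"}))
```

A lookup like this only narrows the field; the qualifiers in the bullet lists (“typically”, “potentially”) still have to be weighed against the actual environment.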

SHAREPOINT BACKUP

Microsoft has stated that SharePoint is one of the fastest growing applications in their history. SharePoint at the end of 2012 was reported by Microsoft to be generating $2 billion a year. By the beginning of 2012, Microsoft had 125 million SharePoint licenses, 65,000 customers of SharePoint, and 67% of those 65,000 customers reported SharePoint deployments across their entire organization.

SharePoint relies on a lot of technologies. These typically include but aren’t limited to:

• Windows Server
• Microsoft SQL Server
• IIS (Internet Information Services)
• Active Directory
• DNS (Domain Name System)
• Networking
• Exchange Server (optional)

In addition, SharePoint is often deployed on a SharePoint farm - a collection of SharePoint and SQL servers that work in concert to provide SharePoint services to support one or more SharePoint sites. In essence, SharePoint has been designed by Microsoft to be scalable via distributed system semantics.

The next series of figures depicts the basic functions associated with SharePoint backup from the perspective of the previously described VSS architecture (these illustrations were taken from Microsoft documentation.)

The first function that must be performed is the creation of the inventory of all of the components within a SharePoint implementation. As noted previously, SharePoint may be distributed and consists of a number of disparate technologies - so this is a critical step.

The next step is the actual backup itself. The VSS architecture makes it relatively simple once an inventory has been created - a request is made and distributed to all components; all of those components are quiesced together and a synchronized snapshot created.

When a restore needs to occur, VSS is used to recover across the distributed SharePoint implementation as follows.

While SharePoint is a successful and scalable application, the sheer number of technologies and its distributed nature make it a challenge to protect correctly. The nature of both GOS block backup and HOS-level backup make it difficult to synchronize among these different technologies on different systems; thus native backup tends to work best.

SQL SERVER BACKUP

Microsoft SQL Server is a relational database server that is used both as a fundamental building block of SharePoint and directly as SQL-based database infrastructure. SQL Server may be implemented on a single server or in a distributed fashion for higher scalability across a number of servers.

In the simplest cases, native backup, GOS block backup, and HOS backup work fine. However, across scalable, distributed SQL Server implementations native backup tends to be superior. In addition, native backup techniques tend to produce higher integrity backups. Why? SQL Server databases are sensitive to multiple backup and maintenance requests operating on the same database. Typically native SQL backup uses LSNs (Log Sequence Numbers) to ensure that the transaction log chain is intact. If the LSNs are not in sequence, native SQL Server backup can resynchronize the chain by issuing a full backup before proceeding with differential and transaction log backups.
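The LSN check can be sketched in a few lines. This is a simplified model, not SQL Server code: it assumes only that each transaction log backup should begin at the LSN where the previous one ended, and the `LogBackup` type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogBackup:
    """Hypothetical record of one transaction log backup."""
    first_lsn: int
    last_lsn: int


def chain_is_intact(log_backups: list) -> bool:
    """True if every log backup starts where the previous one ended."""
    return all(prev.last_lsn == nxt.first_lsn
               for prev, nxt in zip(log_backups, log_backups[1:]))


def next_backup_type(log_backups: list) -> str:
    """A broken chain is resynchronized by a new full backup."""
    return "transaction_log" if chain_is_intact(log_backups) else "full"


intact = [LogBackup(100, 180), LogBackup(180, 240), LogBackup(240, 310)]
# Another tool took the log backup covering 180-240, so this history has a gap.
broken = [LogBackup(100, 180), LogBackup(240, 310)]

print(next_backup_type(intact))
print(next_backup_type(broken))
```

The broken case is the situation described above: when a second product backs up the log of the same database, the first product’s history has a gap and its later log backups cannot be restored in sequence until a new full backup re-establishes the chain.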

In addition, with native backup, deeper integration with clustered SQL implementations is available, which allows location transparency on failover by accessing SQL data through the clustered node.

EXCHANGE SERVER BACKUP

Microsoft Exchange Server is a mail server, calendaring software, and contact manager. Exchange Server is used directly by Microsoft client-side e-mail programs as well as being an optional component of SharePoint. Exchange Server may be implemented on a single server or in a distributed fashion for higher scalability across a number of servers.

In the simplest cases, native backup, GOS block backup, and HOS backup are acceptable methods to use to protect Microsoft Exchange Server. The best method is native backup even in these simpler cases - the reason is that Exchange Server is built upon an older database technology and has a tendency under load to become corrupted - and a backup of a corrupted Exchange database can fail to restore. Some vendors have audited recovery techniques for this - but what is best is to do deep integrity checking of Exchange before, during and after the backup.

Exchange is often integrated into many other Microsoft services - not only in products such as SharePoint but also in core services such as Active Directory. Make sure that your backup strategy is flexible enough to allow different modes of recovery - from a single Exchange Server to multiple Exchange Servers to a complete loss of all Windows and Microsoft application systems.

Also, across scalable, distributed Exchange Server implementations native backup tends to be superior. The reason is synchronization across the multiple Exchange databases. Native Exchange backup also allows deeper integration into clustered Exchange Server implementations which provide location transparency on Exchange failover. For example, in the case of Exchange Server 2010 DAGs (Database Availability Groups), the backup can occur from either the active node or passive node.

ARCHIVING

Archiving in the context of this document refers to tertiary backup. Some systems support not only D2D (Disk-to-Disk) backup but also D2D2x (Disk-to-Disk-to-Any) data protection. That “x” can refer to a removable disk, removable tape, fixed NAS, fixed SAN, or even cloud gateways such as those offered by Twinstrata or AWS (Amazon Web Services). The hallmark of archiving is that the target and source are typically located on the same premises - thus higher performance interconnects can be used for faster data transfer. These higher performance interconnects can range from USB to eSATA to various LAN-based Ethernet implementations.

Archiving is also called nearline retention - which distinguishes it from the online retention of the primary backup storage. If archiving is performed correctly, it also serves as a disaster recovery method that may be used individually or together with replication (see the next section on replication).

So what does archiving have to do with the backup of SharePoint, SQL, or Exchange? Primarily it’s granularity. Native application backup tends to be more granular (as discussed above) in terms of not only application data but also retention, RPO, and RTO. HOS backup and GOS block backup can be post-indexed to pull data specifically, but doing so is not only time-consuming but also wastes a great deal of resources.

In choosing the type of backup to use with your particular Microsoft application implementation, make sure that your current and future nearline retention needs are well understood as well as your disaster recovery strategy and that you have the granularity and flexibility you require.

REPLICATION

Replication refers to the electronic copying of data - typically from a device in one location to a device in another location over a WAN (Wide Area Network). Because replication typically takes place over a WAN and WAN bandwidth is precious, a key component of replication is not just synchronization of data but also the reduction of the amount of data that has to be sent over the WAN.
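To make the data reduction idea concrete, here is a minimal sketch of one common technique, source-level deduplication: the data is split into chunks, each chunk is hashed, and only chunks the target has not already seen are sent. The fixed 4 KiB chunk size, the `plan_transfer` name, and the in-memory set standing in for the target’s chunk index are all assumptions made for illustration; this does not describe any particular product.

```python
import hashlib


def plan_transfer(data: bytes, known_hashes: set, chunk_size: int = 4096):
    """Split data into fixed-size chunks and pick the ones that must be sent.

    Returns (recipe, to_send): recipe lists every chunk hash in order so the
    target can reassemble the data; to_send holds only chunks whose hash was
    not already known. known_hashes is updated in place.
    """
    recipe = []
    to_send = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        recipe.append(digest)
        if digest not in known_hashes:
            known_hashes.add(digest)
            to_send.append((digest, chunk))
    return recipe, to_send


target_index = set()  # stand-in for the chunk index kept for the remote target
backup = b"A" * 4096 * 3 + b"B" * 4096  # four chunks, only two distinct

recipe, to_send = plan_transfer(backup, target_index)
print(len(recipe), len(to_send))

# Replicating the same data again sends no chunks at all.
recipe, to_send = plan_transfer(backup, target_index)
print(len(recipe), len(to_send))
```

Hashing every chunk costs CPU time on whichever machine performs it, which is why it matters where in the backup process the reduction runs.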

Replication is thus the key enabling technology for cloud-based disaster recovery. Replication is used in both private and public cloud implementations as well as single- and multi-tenant cloud solutions.

There are two basic types of replication: primary and secondary. Primary replication occurs on the primary (active) data; secondary replication occurs on a secondary copy of that data. In the case of backup solutions, replication before or during the backup is typically primary replication while replication after the backup is typically secondary replication. For SharePoint, SQL, Exchange, and for that matter Windows Server or other environments, secondary replication has less potential performance impact on the application or operating system. Note that the typical trade-off in replication is compute time to reduce the amount of data (typically using techniques such as compression and source-level deduplication) versus the time it takes to transfer over the WAN. Since these types of data reduction techniques can be quite intensive, you want to make sure your replication is not only efficient in terms of WAN bandwidth but that it doesn’t add unnecessary load on your application server.
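The trade-off between compute time and transfer time is simple arithmetic, sketched below. The function name and every number in the example (a 10 GB backup, a 10 Mbit/s WAN link, reduction to 25% of the original size at 50 MB/s) are hypothetical values chosen only to show the calculation.

```python
def replication_seconds(size_bytes, wan_bits_per_second,
                        reduction_ratio=1.0, reduce_bytes_per_second=None):
    """Estimate the time to reduce (optionally) and then transfer a backup.

    reduction_ratio is reduced size divided by original size (0 < r <= 1).
    reduce_bytes_per_second is the throughput of compression/deduplication;
    None means no reduction step is performed.
    """
    reduce_time = 0.0
    if reduce_bytes_per_second is not None:
        reduce_time = size_bytes / reduce_bytes_per_second
    transfer_time = size_bytes * reduction_ratio * 8 / wan_bits_per_second
    return reduce_time + transfer_time


size = 10e9   # 10 GB of backup data (hypothetical)
wan = 10e6    # 10 Mbit/s WAN link (hypothetical)

raw = replication_seconds(size, wan)
reduced = replication_seconds(size, wan, reduction_ratio=0.25,
                              reduce_bytes_per_second=50e6)
print(f"no reduction: {raw:.0f} s, with reduction: {reduced:.0f} s")
```

With these assumed figures the reduction step pays for itself many times over, but the seconds spent reducing data are CPU time on some machine. If that machine is the application server (primary replication), the load lands on SharePoint, SQL, or Exchange; with secondary replication it lands on the backup system instead.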

Because of the bandwidth limitations of the WAN, granularity with respect to SharePoint, SQL, and Exchange is even more important with respect to replication than it is with archiving. Thus you want to make sure that the type of backup you choose has the most granularity possible if you plan to replicate - particularly if your WAN bandwidth is limited. Native backup tends to offer the greatest degree of granularity.

FAILOVER VIRTUALIZATION

Compared to backup, archiving, and replication, failover virtualization is a much more recent technology. (Note: This doesn’t mean all, or even most, backup vendors have integrated backup, archiving, and replication - most have not - it simply means that as a technology failover virtualization is the most recent.) Failover virtualization, which is often termed “instant recovery” or “live recovery”, simply means that a backup can be used, in just a few minutes, to create a virtual machine that is a copy of the original physical or virtual machine that was being protected.

There are two primary types of failover virtualization: one in which the virtual machine that is created requires additional resources to operate and the other in which the virtual machine that is created operates more or less in a standalone mode. As an example, it is possible to do an HOS backup of a VMware vSphere VM and then present that backup as a VMDK block image on the backup device itself so that a secondary VMware vSphere ESX or ESXi host server can use it with a hosted VM. This is depicted in the figure that follows.

As another example, it is possible to do a native backup of Windows Server and then use that native backup to create a standalone virtual machine that contains not only Windows Server but also SharePoint, SQL, and/or Exchange. This is depicted in the figure that follows.

Failover virtualization is a valuable tool. But it isn’t a substitute for deep integrity checking before or during a backup. And it should be used very carefully with respect to distributed systems, i.e., distributed SharePoint farms and multi-server implementations of SQL and Exchange.

CONCLUSION

Using a hammer to install a screw doesn’t make a lot of sense. As noted in the introduction, Enterprise Windows demands enterprise-class data protection strategies. And yet far too many data protection vendors are promoting a one-size-fits-all strategy that is focused on one particular technology rather than on their customers’ needs - both now and in the future.

It’s critical that your data protection vendor has flexible strategies and can adapt to your environment - after all, in IT you’re continuously being asked to adapt to better serve your customers. Make sure that your data protection vendor is flexible and adaptable enough to enable you to build an agile IT infrastructure that handles not only your needs today - but your needs tomorrow as well.

7 Technology Circle | Suite 100 | Columbia, SC 29203 | 866.359.5411 | [email protected] | www.unitrends.com

Copyright © 2012 Unitrends