48

Gopal Ashok Program Manager Microsoft Corporation DAT306

Embed Size (px)

Citation preview

Page 1: Gopal Ashok Program Manager Microsoft Corporation DAT306
Page 2: Gopal Ashok Program Manager Microsoft Corporation DAT306

Building a High Availability Strategy for Your Enterprise Using Microsoft SQL Server 2008

Gopal AshokProgram ManagerMicrosoft CorporationDAT306

Page 3: Gopal Ashok Program Manager Microsoft Corporation DAT306

Agenda

Introduction to High Availability and Disaster Recovery

SQL Server Always On Technologies

Developing Your Availability Solution

Conclusion

Page 4: Gopal Ashok Program Manager Microsoft Corporation DAT306

Introduction to High Availability and Disaster Recovery

DefinitionsIntroduce key terms and concepts

Business Continuity PlanningOverview of the BCP process

SQL Server High Availability PlanningHow does BCP apply to SQL Server availability?

Page 5: Gopal Ashok Program Manager Microsoft Corporation DAT306

High Availability and Disaster Recovery: Definition

High AvailabilityHigh availability is a system design protocol and associated implementation that ensures a certain absolute degree of operational continuity during a given measurement period

Availability defined in terms of service level agreements (SLA)

Recovery TimeData loss during unplanned downtime

A highly available application should be accessible by users x% of the time

Disaster RecoveryProcesses and procedures designed to restore business operations due to a natural or human-induced disaster

Typically involves providing redundancy spanning multiple sites or across geographic regions

Page 6: Gopal Ashok Program Manager Microsoft Corporation DAT306

Defining x and SLARecovery Time Objective (RTO) guided by availability requirements

How much downtime can you tolerate?

Recovery Point Objective (RPO) guided by criticality of application data

How much data can you lose?

AvailabilityClass

Acceptable Downtime (hrs/yr) OR RTO

Acceptable Data Loss (time of last copy) OR RPO

Tier 1 >99.99%(1 hr or less)

5 min or less

Tier 2 99.9% - 99.99% (1- 8.5 hrs)

5 mins to 8.5 hrs

Tier 3 (<99.9%)(Hours to days)

Hours to days

Tier1

RTO

RPO

Tier2

Tier3

Availa

bility

Requirements

Page 7: Gopal Ashok Program Manager Microsoft Corporation DAT306

Protection LevelsProtection against resource failures

Machine Database CorruptionDisk

Location RedundancyBuilding< 10 miles

Local HA

Regional DR

Geographic DR

Protection against Network Outages Site Failures

Location Redundancy– City, County– < 100-200 miles

Protection against Natural Disasters

Location Redundancy– State, Country– > 100-200 miles

Page 8: Gopal Ashok Program Manager Microsoft Corporation DAT306

Business Continuity Planning

Analysis

Solution Design

ImplementationTesting

Maintenance

Impact AnalysisCritical FunctionsThreat IdentificationRecovery Objectives

Solution DesignAchieve recovery objectives for relevant threats within specified constraints like budget, human resources etcCost\Benefit analysis of solutions

ImplementationDeploy the recommended solution

TestingTest to see if the solution meets the recovery requirements

MaintenanceYearly testing and review of procedures

Page 9: Gopal Ashok Program Manager Microsoft Corporation DAT306

SQL Server High Availability PlanningAnalysis

Application tiers serviced by the databasesCauses of database downtimeProtection levels: Local HA, Regional DR, Geographic DR

Solution DesignNeed to understand what solutions exists?What are the characteristics and cost of the solution?

ImplementationWhat are the deployment steps and best practices?

TestingHow do I test my implementation?

MaintenanceHow do I monitor and maintain the solution?

Analysis

Solution Design

ImplementationTesting

Maintenance

Page 10: Gopal Ashok Program Manager Microsoft Corporation DAT306

Database Downtime Drivers

Database Downtime

Unplanned Downtime

FailureProtection

User Errors

Planned Downtime

Online Administration

Predictable Resourcing

Analysis

Page 11: Gopal Ashok Program Manager Microsoft Corporation DAT306

Solution Design

Solution Architecture

HA Capabilities

Limitations and Caveats

Cost Vector

Understand the solutions and choices before making a decision

Solution Design

Page 12: Gopal Ashok Program Manager Microsoft Corporation DAT306

SQL Server Always On Technologies

Solution Design

Page 13: Gopal Ashok Program Manager Microsoft Corporation DAT306

Always On TechnologiesProvides a full range of options to minimize downtime and maintain appropriate levels of application availability

Solution Design

Increases Availability

• Backup and Restore• Log Shipping• Database Mirroring• Failover Clustering• Peer-Peer Replication

Decreased Downtime

• Online Index Operations• Table Partitioning• Enhanced Locking• Resource Governor• Database Snapshot• Dedicated Admin Connection• Dynamic Configuration

Page 14: Gopal Ashok Program Manager Microsoft Corporation DAT306

Always On Technology Overview

Architecture OverviewHow does it work?

Solution CharacteristicsData Loss GuaranteesFailover CharacteristicsRedundancy Levels and UtilizationCostLimitations and Caveats

Solution Design

Increases Availability

• Backup and Restore• Log Shipping• Database Mirroring• Failover Clustering• Peer-Peer Replication

Page 15: Gopal Ashok Program Manager Microsoft Corporation DAT306

Backup and RestoreBase availability technology for any solution

Protects against failures and recovery from errorsProvides Local HA and Site DR

Need to ensure the backups are accessible if site goes downHigh RTO due to restore timeRPO=0 can never be guaranteed

Types: Full, Differential, and Transaction LogFile-group backup/restore for large databases

Backup Compression provides faster and smaller backups in SQL Server 2008

Solution Design

Page 16: Gopal Ashok Program Manager Microsoft Corporation DAT306

Log ShippingAutomated transaction log backup and restore provides redundancy at the database level

SQLLogship.exe provides the underlying framework for doing automated backup, copy and restore

Backup on primary instanceRestore on secondary instance(s)

Scheduling is done through SQL Server Agent jobs

SQL Server 2008 provides sub-minute scheduling interval providing the ability to do quick backup and restores

No automatic failover capabilities

Solution Design

Page 17: Gopal Ashok Program Manager Microsoft Corporation DAT306

Database MirroringA database level high availability solution that provides complete protection against data loss and fast recovery through automatic failover

Maintains a redundant database by shipping log blocks when the transactions are committed on the principal

Synchronous and Asynchronous modes provide the spectrum of options to choose between availability and performance

Automatic failover when using witness server

Solution Design

Page 18: Gopal Ashok Program Manager Microsoft Corporation DAT306

Failover Clustering

Instance level protection built on Windows Failover Clustering shared disk model

Cluster nodes typically co-located within the same site to provide local HARegional DR possible using VLAN and stretch storage level replication

No built in data redundancy like database mirroring and log shipping

Data protection has to be provided at the storage level or by combining with other solutions

Solution Design

Page 19: Gopal Ashok Program Manager Microsoft Corporation DAT306

Transactional ReplicationA high performance data replication solution that provides granular table level replication

Logical data movement provides flexibility and better hardware utilization

Key scenarios:Customized application-specific DRReal-time reporting on secondary server that be used for Site DRScale out application queries with ability to use any one database copy for Site DR

Two types relevant for HA and DRTransactional and Peer-to-Peer

Solution Design

Page 20: Gopal Ashok Program Manager Microsoft Corporation DAT306

Always On Solution Characteristics

No Data Loss(RPO=0)

Failover Unit AutoFailover(RTO)

Inst DB Tab

+ **

Read Mult-iple

Write

*

*

*

Solutions

Log Shipping

DBM Sync

Async

Cluster

Transactional Replication

Peer-PeerReplication

RPO FailoverRedundancy and

Utilization

Hard-ware

App PerfImpact

Manag-eability

Low Low Low

Low High Low

Low Low Low

High*** Low *** Low***

Low Low High

Low Low High

Cost

Solution Design

* Database Mirroring and Log Shipping can provide point in time read capability using STANDBY or database snapshots respectively** Database Mirroring provides fastest failover to hot secondary*** Depends on SAN technology

Page 21: Gopal Ashok Program Manager Microsoft Corporation DAT306

What’s New in SQL Server 2008New Features

Resource GovernorManage SQL Server workloads and resources by specifying limits on resource consumption

Backup CompressionReduce backup and restore time

Feature EnhancementsDatabase Mirroring

Automatic recovery from page corruptionLog stream compressionFaster recovery on failover

Log ShippingSub-Minute Log ShippingBackup compression

Failover Clustering16 nodesRolling upgrade

Peer-Peer ReplicationHot add new nodes

Page 22: Gopal Ashok Program Manager Microsoft Corporation DAT306

Backup CompressionCommon questions:

“How much compression will I see?”“Will it be comparable to, say, SQL Litespeed?”

One simple answer: “It depends!”

All data compresses differently – the compression ratio achieved depends on:

The type of data in the databaseWhether the data in the database is already compressedWhether the data/database is encrypted

“We saw an 85 percent reduction in file size using SQL Server 2008 Backup Compression,” says Colin Neller, Senior Software Engineer at ServiceU and part of the company’s SQL Server 2008 implementation team. “A backup file that was previously over 300 GB is now only 40 GB, and the job runs in about half the time.”

Page 23: Gopal Ashok Program Manager Microsoft Corporation DAT306

Backup Compression: Backup Performance

Backup of a 322 MB Adventureworks database

A LOT more CPU used (avg 25%) BUT runtime = 21.6s (45% improvement) and backup stored in 76.7MB (4.2x compression ratio)

Hardly any CPU used (avg 5%), runtime = 39.5s, compression ratio of 0.

Uncom

pressed

Compressed

Page 24: Gopal Ashok Program Manager Microsoft Corporation DAT306

DEMO: Increasing Availability Using Always On Technologies

demo

Page 25: Gopal Ashok Program Manager Microsoft Corporation DAT306

Developing Your Availability Solution using SQL Server Always On Technologies

Solution Design

Page 26: Gopal Ashok Program Manager Microsoft Corporation DAT306

RecapApplication availability requirements or SLA drive primary solution choices

RPO and RTO are the key metrics used to define the SLA

Need mitigation against planned and unplanned downtimes

Multiple solution choices that provides varying cost\benefits

Other requirements apart from application SLA factor into the choice

Understand constraints and tradeoffs you can make

Database Mirroring

Clustering

Log Shipping

Peer-PeerReplication

Application Availability

Unplanned downtime Planned Downtime

Solution Design

Page 27: Gopal Ashok Program Manager Microsoft Corporation DAT306

Always On Solution Characteristics

No Data Loss(RPO=0)

Failover Unit AutoFailover(RTO)

Inst DB Tab

+ **

Read Mult-iple

Write

*

*

*

Solutions

Log Shipping

DBM Sync

Async

Cluster

Transactional Replication

Peer-PeerReplication

RPO FailoverRedundancy and

Utilization

Hard-ware

App PerfImpact

Manag-eability

Low Low Low

Low High Low

Low Low Low

High*** Low *** Low***

Low Low High

Low Low High

Cost

* Database Mirroring and Log Shipping can provide point in time read capability using STANDBY or database snapshots respectively** Database Mirroring provides fastest failover to hot secondary*** Depends on SAN technology

Solution Design

Page 28: Gopal Ashok Program Manager Microsoft Corporation DAT306

AdventureWorks Inc Scenario Adventureworks Inc is a

manufacturing company that manufactures and sells bicycles across the world. There are a number of applications, some that are mission critical that run on multiple SQL Server Instances

The DBA team is run by Darren who is responsible for deploying and managing the application databases. One of his core responsibilities is to ensure availability of all application databases in order to meet the application SLA

One datacenter located in OmahaThree applications

Manufacturing – Tier 1Finance – Tier 2 Scheduling – Tier 3

Manufacturing application runs on a dedicated SQL Server 2008 Instance

All other applications run on a second instance

Availability of manufacturing application is criticalImplement a solution at the lowest possible cost

Solution Design

Page 29: Gopal Ashok Program Manager Microsoft Corporation DAT306

Application Requirements

Manufacturing application has strict SLA’sFinance application requires readability on the secondary

The reports are run every 4 hours and need to be fresh as of the last one hour. To offload the reporting load from the main system they would like to utilize the mirror

Data LossRPO=0

RTO in secs

Failover Unit AutoFailover

Inst DB Tab

Read Multiple Sites

ReadWrite

Applications

Manufacturing

Finance

Scheduling

Solution Design

Page 30: Gopal Ashok Program Manager Microsoft Corporation DAT306

Solution Choice for Manufacturing Application

Clustering can provide a zero data loss solution that can also provide fast instance level failoverUse RAID configuration to provide data redundancy on the SANIf a redundant copy is required that can provide instance failover with zero data loss use SAN replication

High Cost Solution

Use synchronous database mirroring if instance failover is not needed

Solutions Data LossRPO=0

Fast RTO

Failover Unit AutoFailover

Read > 1Sites\ Copy

Read WriteInst DB Tab

Cluster

SAN Replication

DBM - Sync

Log Shipping

Transactional Replication

Peer-PeerReplication

Clustering with RAID

DBM - Async

Solution Design

Page 31: Gopal Ashok Program Manager Microsoft Corporation DAT306

For database level redundancy with acceptable data loss with minimal perf impact, asynchronous database mirroring is an optimal choice

Use database snapshots at periodic intervals to provide a readable snapshot of the data for reporting

Low cost solution

DBM - Async

Cluster

Solution Choice for Finance Application

Solutions Data LossRPO=0

Fast RTO

Failover Unit AutoFailover

Read > 1Sites\ Copy

Read WriteInst DB Tab

SAN Replication

Log Shipping

Transactional Replication

Peer-PeerReplication

Async Database Mirroring

Omaha Datacenter

Finance

Db Snapshotevery hour

Reports

Scheduling

DBM - Sync

Solution Design

Page 32: Gopal Ashok Program Manager Microsoft Corporation DAT306

Adding a Regional Datacenter Into the Mix

Regulatory and compliance requirements drive the need for having a additional datacenter within a 10 mile radius to provide redundancy against site level failure.

It is now required that all applications have the ability to failover to the regional datacenter across the river in Council Bluff

The SLA need to be maintained for tier 1 applications even in the case of site failures

Solution Design

Page 33: Gopal Ashok Program Manager Microsoft Corporation DAT306

Regional Site Solution Choices

Async Database Mirroring

Omaha Datacenter

Finance

Db Snapshotevery hour

Reports

Scheduling

Sync Mirroringno witness

Log Shipping

CB Datacenter

Cluster with SAN

Manufacturing

Solution Design

Page 34: Gopal Ashok Program Manager Microsoft Corporation DAT306

A Complete Topology

Considering the potential of floods and tornadoes destroying the regional data centers, Adventureworks Inc wants to maintain a disaster recovery site in San Antonio, TX

The disaster recovery site has lower SLA requirements for all applications

The manufacturing application can have an RPO of 1 hourThe RTO is set at 4 hours

Solution Design

Page 35: Gopal Ashok Program Manager Microsoft Corporation DAT306

Topology Diagram

Sync MirroringNo witness

Cluster with SAN

Log Shipping

Manufacturing

Solution Design

Page 36: Gopal Ashok Program Manager Microsoft Corporation DAT306

Scale Out and Availability Scenario

Adventureworks is building a new web based order management system that allows customers from all over the world access the system and place ordersThe core group of customers are in Western Europe, South East Asia and North America

Requirements– Geo Redundancy – Data Locality– High Availability– Local Read-Scale

Workload Characteristics– Mainly reads– Few writes

Application Characteristics– Each user logging in connects to a

particular server Partitioned based on user-id and region Writes from a user always happen on one

server regardless of the region the user log in from

– All reads redirected to the closest geo-location

Reasonable tolerance for latency (5-10 minutes)

Solution Design

Page 37: Gopal Ashok Program Manager Microsoft Corporation DAT306

Replication Topology

Peer Nodes

Read-Only Servers

Asia1 Asia2

Solution Design

Page 38: Gopal Ashok Program Manager Microsoft Corporation DAT306

Implementing and maintaining a HA solutiondemo

Page 39: Gopal Ashok Program Manager Microsoft Corporation DAT306

Licensing Facts

Passive servers are mirror, log shipped secondary and clustering passive nodeNo license required on passive if it is truly passiveA passive server does not need a license if the number of processors in the passive server is equal to or less than the number of processors in the active server. The passive server can take the duties of the active server for 30 days. Afterwards, it must be licensed accordingly.

Page 40: Gopal Ashok Program Manager Microsoft Corporation DAT306

HA Features Edition SupportFeature Express Workgroup Standard Enterprise Comments

Database Mirroring 1

Advanced high availability solution that includes fast failover and automatic client redirection

Failover Clustering 2

Backup Log-shipping Data backup and recovery solution

Online System Changes

Includes Hot Add Memory, dedicated administrative connection, and other online operations

Online Indexing

Online Restore

Fast RecoveryDatabase available when undo operations begin

₁Single thread redo₂ Limited to 2 node cluster

Page 41: Gopal Ashok Program Manager Microsoft Corporation DAT306

SummaryThere is no “one size fits all” solution

Consider the cost\benefits\constraints and compare that to availability requirements of the organization to determine the best solution

Use the charts to understand cost, benefit and constraints of the various SQL Server High Availability solutions

TEST the solution to ensure it can meet the availability requirements and meet SLA’s

Page 42: Gopal Ashok Program Manager Microsoft Corporation DAT306

question & answer

Page 43: Gopal Ashok Program Manager Microsoft Corporation DAT306

SQL Server Word of the Day

DATA COMPRESSION

Wednesday, May 13

*Game cards may be picked up at the SQL Server booths in the TLC

Page 44: Gopal Ashok Program Manager Microsoft Corporation DAT306

Additional Resources• Team Forum: http://social.msdn.microsoft.com/Forums/en-

US/sqldisasterrecovery/threads• Other: http://sqlcat.com/

External Resourceshttp://www.microsoft.com/sqlserver/2008/en/us/high-availability.aspxURLhttp://sqlcat.com/

SQL Server 2008 Business Value Calculator: www.moresqlserver.com

Page 45: Gopal Ashok Program Manager Microsoft Corporation DAT306

www.microsoft.com/teched

Sessions On-Demand & Community

http://microsoft.com/technet

Resources for IT Professionals

http://microsoft.com/msdn

Resources for Developers

www.microsoft.com/learningMicrosoft Certification and Training Resources

www.microsoft.com/learning

Microsoft Certification & Training Resources

Resources

Page 46: Gopal Ashok Program Manager Microsoft Corporation DAT306

Related Content

Breakout SessionsDAT302 All You Need to Know about Microsoft SQL Server 2008 Failover ClustersDAT318 Microsoft SQL Server 2008 Virtualization Considerations and Best PracticesDAT322 Tips and Tricks for Successful Database Mirroring Deployments with Microsoft SQL Server

Hands On LabDAT07-HOL Microsoft SQL Server 2008 Peer-to-Peer ReplicationDAT08-HOL Microsoft SQL Server 2008 Table Index and PartitioningDAT12-HOL Microsoft SQL Server 2008 Database Mirroring, Part 1DAT13-HOL Microsoft SQL Server 2008 Database Mirroring, Part 2DAT16-HOL Using Microsoft SQL Server 2008 Resource Governor for Predictable Performance

Page 47: Gopal Ashok Program Manager Microsoft Corporation DAT306

Complete an evaluation on CommNet and enter to win!

Page 48: Gopal Ashok Program Manager Microsoft Corporation DAT306

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS,

IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.