Multi-Site Clustering with Windows Server 2008 R2
Elden Christensen, Senior Program Manager Lead, Microsoft
Session Code: SVR319
Session Objectives And Takeaways
Session Objective(s):
Understanding the need and benefit of multi-site clusters
What to consider as you plan, design, and deploy your first multi-site cluster
Windows Server Failover Clustering is a great solution for not only high availability, but also disaster recovery
Multi-Site Clustering
Introduction Networking Storage Quorum Workloads
Site A
But what if there is a catastrophic event?
Fire, flood, earthquake …
Same Physical Location
SAN
Is my Cluster Resilient to Site Failures?
Site A → Site B
Applications are failed over to a separate physical location; the node is moved to a physically separate site.
Multi-Site Clusters for DR
Extends a cluster from being a High Availability solution, to also being a Disaster Recovery solution
SAN   SAN
Benefits of a Multi-Site Cluster
Protects against loss of an entire datacenter
Automates failover:
Reduced downtime
Lower-complexity disaster recovery plan
Reduces administrative overhead:
Automatically synchronizes application and cluster changes
Easier to keep consistent than standalone servers
The primary reason DR solutions fail is dependence on people.
Multi-Site Clustering
Introduction Networking Storage Quorum Workloads
Network Considerations
Network Options:
1. Stretch VLANs across sites
2. Cluster nodes can reside in different subnets
Site A
Public Network
Site B
10.10.10.1   20.20.20.1
30.30.30.1 40.40.40.1
Separate Network
Stretching the Network
Longer distance traditionally means greater network latency. Too many missed health checks can cause false failover. Heartbeating is fully configurable.
SameSubnetDelay (default = 1 second): frequency at which heartbeats are sent
SameSubnetThreshold (default = 5 heartbeats): missed heartbeats before an interface is considered down
CrossSubnetDelay (default = 1 second): frequency at which heartbeats are sent to nodes on dissimilar subnets
CrossSubnetThreshold (default = 5 heartbeats): missed heartbeats before an interface is considered down for nodes on dissimilar subnets
Command Line: Cluster.exe /prop
PowerShell (R2): Get-Cluster | fl *
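For a high-latency WAN, the cross-subnet heartbeat properties can be relaxed from PowerShell. A sketch against a live cluster — the values are illustrative assumptions, not recommendations; the delay properties are in milliseconds:

```powershell
Import-Module FailoverClusters   # Windows Server 2008 R2

# Inspect the current heartbeat settings
Get-Cluster | Format-List *Subnet*

# Illustrative values: cross-subnet heartbeats every 2 s, interface
# considered down after 10 misses (~20 s of tolerated silence)
$cluster = Get-Cluster
$cluster.CrossSubnetDelay = 2000
$cluster.CrossSubnetThreshold = 10
```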
Security over the WAN
Encrypt intra-node traffic:
0 = clear text
1 = signed (default)
2 = encrypted
Site A   Site B
10.10.10.1   20.20.20.1
30.30.30.1 40.40.40.1
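These three levels map onto the cluster's SecurityLevel common property. A minimal sketch, assuming a live cluster and the R2 PowerShell module:

```powershell
Import-Module FailoverClusters

# SecurityLevel: 0 = clear text, 1 = signed (default), 2 = encrypted
(Get-Cluster).SecurityLevel = 2   # encrypt intra-node traffic over the WAN
```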
Enhanced Dependencies: OR
Network Name resource stays up if either IP Address Resource A OR IP Address Resource B is up
OR
Network Name resource
IP Address Resource A
IP Address Resource B
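An OR dependency like this can be configured with a dependency expression. A sketch with hypothetical resource names:

```powershell
Import-Module FailoverClusters

# "FS Network Name", "IP Address A", and "IP Address B" are hypothetical
# resource names; the Network Name stays up if either IP resource is up
Set-ClusterResourceDependency -Resource "FS Network Name" `
    -Dependency "[IP Address A] or [IP Address B]"

# Verify the resulting dependency expression
Get-ClusterResourceDependency -Resource "FS Network Name"
```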
Client Reconnect Considerations
With nodes in dissimilar subnets, failover changes the resource's IP Address, and clients need that new IP Address from DNS to reconnect.
10.10.10.111 20.20.20.222
DNS Server 1 ↔ DNS Server 2 (DNS Replication)
Record Updated
Record Created
Record Obtained
FS = 10.10.10.111
Record Updated
FS = 20.20.20.222 (Site A, Site B)
Solution #1: Configure NN Settings
RegisterAllProvidersIP (default = 0 for FALSE): determines whether all IP Addresses for a Network Name are registered in DNS. TRUE (1): IP Addresses can be online or offline and will still be registered. Ensure the application is set to try all IP Addresses, so clients can connect more quickly.
HostRecordTTL (default = 1200 seconds): controls how long the DNS record for a cluster network name lives on the client. Shorter TTL: DNS records on clients are updated sooner.
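Both private properties can be set on the Network Name resource with Set-ClusterParameter. A sketch using a hypothetical resource name:

```powershell
Import-Module FailoverClusters

$nn = Get-ClusterResource "FS Network Name"   # hypothetical resource name

# Register all provider IPs in DNS, online or offline
$nn | Set-ClusterParameter RegisterAllProvidersIP 1

# Shorten the client-side TTL from 20 minutes to 5 minutes
$nn | Set-ClusterParameter HostRecordTTL 300

# Recycle the resource so the new settings take effect
$nn | Stop-ClusterResource
$nn | Start-ClusterResource
```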
Solution #2: Prefer Local Failover
Local failover for higher availability: no change in IP Address.
Cross-site failover for disaster recovery.
10.10.10.111
DNS Server 1 DNS Server 2
FS = 10.10.10.111 (Site A, Site B)
20.20.20.222
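The prefer-local-failover policy is expressed through preferred owners: list the nodes in the same site first, so cross-site failover happens only when no local node is available. A sketch with hypothetical node and group names:

```powershell
Import-Module FailoverClusters

# NodeA1/NodeA2 are in Site A (local), NodeB1/NodeB2 in Site B (DR);
# order matters: local nodes are tried first
Set-ClusterOwnerNode -Group "File Server Group" `
    -Owners NodeA1,NodeA2,NodeB1,NodeB2
```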
Solution #3: Stretch VLANs
Deploying a VLAN minimizes client reconnection times.
DNS Server 1 DNS Server 2
FS = 10.10.10.111
Site A Site B
10.10.10.111   10.10.10.111
VLAN
Solution #4: Abstraction in a Network Device
The network device uses a 3rd IP; that 3rd IP is the one registered in DNS and used by clients.
Example: http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/App_Networking/extmsftw2k8vistacisco.pdf
10.10.10.111 20.20.20.222
DNS Server 1
DNS Server 2
FS = 30.30.30.30 (Site A, Site B)
30.30.30.30
This is generic guidance…
If you have other creative ideas, that’s ok!
Multi-Site Clustering
Introduction Networking Storage Quorum Workloads
Storage in Multi-Site Clusters
Different from local clusters:
Multiple storage arrays, independent per site
Nodes commonly access their own site's storage
No “true” shared disk visible to all nodes
Site A Site B
Site A
Changes are made on Site A and replicated to Site B
Site B
Replica
Storage Considerations
Need a data replication mechanism between sites
Replication Options
Replication levels:
Hardware storage-based replication
Software host-based replication
Application-based replication
Synchronous Replication
Host receives the “write complete” response from the storage only after the data is successfully written on both storage devices.
Flow: Write Request → Primary Storage → Replication → Secondary Storage → Acknowledgement → Write Complete
Asynchronous Replication
Host receives the “write complete” response from the storage after the data is successfully written to the primary storage device.
Flow: Write Request → Primary Storage → Write Complete (replication to Secondary Storage continues separately)
Synchronous vs. Asynchronous
Synchronous: no data loss; requires a high bandwidth/low latency connection; stretches over shorter distances; write latencies impact application performance.
Asynchronous: potential data loss on hard failures; needs enough bandwidth to keep up with data replication; stretches over longer distances; no significant impact on application performance.
Storage Resource Dependencies
The Resource Group determines the smallest unit of failover. A typical group: Workload Resource (example: File Server) → Network Name Resource → IP Address Resources* → Disk Resource → Custom Resource. Dependencies establish start order and timing.
The vendor's Custom Resource ensures the node is communicating with local storage and checks array state, and ensures the application comes online only after replication is complete.
Cluster Validation and Replication
Multi-Site clusters are not required to pass the Storage tests to be supported
Validation Guide and Policy: http://go.microsoft.com/fwlink/?LinkID=119949
HP’s Multi-Site Implementation & Demo
Matthias Popp, Architect, HP
HP's Multi-Site Implementation:CLX for Windows
Virtual Machine
VM Config File
Physical Disk
HP CLX
All Physical Disk resources of one Resource Group (VM) depend on a CLX resource; very smooth integration.
HP Cluster Extension: What’s new?
Support for Hyper-V Live Migration across disk arrays
Support for Windows Server 2008 R2
Support for Microsoft Hyper-V Server 2008 R2
TT337AAE – HP StorageWorks Cluster Extension EVA for Windows e-LTU
There is no change to current CLX product pricing.
XP Cluster Extension does not yet support Live Migration; planned for 2010.
Live Migration with Storage Failover
(Host 1 and Host 2, each with an HP EVA Storage array; storage-based remote replication between the arrays)
1. Initiate Live Migration
2. Create VM on target node
3. Copy memory pages from source server to target server via Ethernet
4. Check disk array for replication link and disk pair states
5. Final state transfer: pause virtual machine
6. Move storage connectivity from source server to target server
7. Change storage replication direction
8. Run new VM on target server; delete VM on source server
HP Storage for Virtualization: Hyper-V Live Migration between Replicated Disk Arrays
End-user-transparent app migration across data centers, across servers and storage
Zero-downtime array load balancing (IOPS, cache utilization, response times, power consumption, etc.)
Zero-downtime maintenance: firmware/HBA/server updates without user interruption; plan maintenance without the need to check for downtimes
Follow-the-sun/moon data center access model: move the app/VM closest to the users or closest to the cheapest power source
Failover, failback, Quick and Live Migration using the same management software: no need to learn x different tools and their limitations
Demo: EVA CLX with Exchange 2010 Live Migration
Virtual Machines
Mailbox server: G:\ OS Disk 30 GB, K:\ Database Disk 100 GB
Hub Transport server: OS Disk 30 GB
Client Access server: OS Disk 30 GB
Hyper-V Geo Cluster with Exchange
(Diagram: Site A and Site B connected by LAN and SAN; Command View and SCVMM at each site; a virtual network and an EVA 4400 array per site; Live Migration across sites; VHDs of all VMs replicated in DR Groups 001–003 for the Mailbox, Hub Transport, and Client Access servers; HP Cluster Extension on the Hyper-V cluster.)
Automatically re-direct storage replication during Live Migration
Additional HP Resources
HP website for Hyper-V: www.hp.com/go/hyper-v
HP and Microsoft Frontline Partnership website: www.hp.com/go/microsoft
HP website for Windows Server 2008 R2: www.hp.com/go/ws2008r2
HP website for management tools: www.hp.com/go/insight
HP OS Support Matrix: www.hp.com/go/osssupport
Information on HP ProLiant Network Adapter Teaming for Hyper-V: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01663264/c01663264.pdf
Technical overview on HP ProLiant Network Adapter Teaming: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01415139/c01415139.pdf?jumpid=reg_R1002_USEN
Whitepaper: Disaster Tolerant Virtualization Architecture with HP StorageWorks Cluster Extension and Microsoft Hyper-V™: http://h20195.www2.hp.com/V2/getdocument.aspx?docname=4AA2-6905ENW.pdf
Multi-Site Clustering
Introduction Networking Storage Quorum Workloads
Quorum Overview
4 Quorum Types:
Node Majority
Node and Disk Majority
Node and File Share Majority
Disk Only (not recommended)
Majority is greater than 50%. Possible voters: nodes (1 vote each) + 1 witness (disk or file share).
Replicated Disk Witness
A witness is a decision maker when nodes lose network connectivity. If the witness disk is replicated, it is no longer a single decision maker, and problems occur. Do not use a replicated disk witness in multi-site clusters unless directed by your vendor.
Replicated Storage from vendor
Site A   Site B
Cross-site network connectivity broken!
Each node asks: can I communicate with a majority of the nodes in the cluster?
Yes: stay up.
No: drop out of cluster membership.
5-node cluster: majority = 3; majority in primary site.
SAN   SAN
Node Majority
Site A   Site B
Disaster at the primary site! Surviving nodes ask: can I communicate with a majority of the nodes in the cluster? No: drop out of cluster membership.
5-node cluster: majority = 3, and the majority was in the primary site.
SAN   SAN
Need to force quorum manually
Forcing Quorum
Forcing quorum is used to bring the cluster online without quorum. Always understand why quorum was lost. The cluster starts in a special “forced” state; once majority is achieved, the “forced” state ends.
Command Line: net start clussvc /fixquorum (or /fq)
PowerShell (R2): Start-ClusterNode -FixQuorum (or -fq)
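A possible recovery sequence after losing the primary site, with hypothetical node names — only the first surviving node needs the forced start:

```powershell
Import-Module FailoverClusters

# On a surviving node in Site B: start the cluster service without quorum
Start-ClusterNode -Name NodeB1 -FixQuorum

# Remaining nodes join normally; once a majority is achieved,
# the cluster automatically leaves the "forced" state
Start-ClusterNode -Name NodeB2
```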
Site A Site B
Site C
Complete resiliency and automatic recovery from the loss of any 1 site
Replicated Storage
\\Foo\Cluster1
SAN SAN
WAN
Multi-Site with File Share Witness
File Share Witness
WAN
Site A   Site B
Site C
Complete resiliency and automatic recovery from the loss of connection between sites
Replicated Storage
SAN SAN
Multi-Site with File Share Witness (\\Foo\Cluster1)
Nodes that can reach the witness: can I communicate with a majority of the nodes (+FSW) in the cluster? Yes: stay up.
Nodes that cannot: can I communicate with a majority of the nodes in the cluster? No (lock failed): drop out of cluster membership.
FSW Considerations
Simple Windows file server; a single file server can serve as a witness for multiple clusters, but each cluster requires its own share.
Can be clustered in a second cluster.
Recommended to be at a 3rd separate site so that there is no single point of failure.
FSW cannot be on a node in the same cluster.
Quorum Model Summary
No Majority: Disk Only: not recommended; use as directed by vendor
Node and Disk Majority: use as directed by vendor
Node Majority: odd number of nodes; more nodes in primary site
Node and File Share Majority: even number of nodes; best availability solution, with FSW in a 3rd site
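Switching to the recommended Node and File Share Majority model takes one command. A sketch, assuming a hypothetical witness share hosted in a 3rd site:

```powershell
Import-Module FailoverClusters

# \\SiteC-FS\Cluster1Witness is a hypothetical share in Site C
Set-ClusterQuorum -NodeAndFileShareMajority \\SiteC-FS\Cluster1Witness

# Confirm the new quorum configuration
Get-ClusterQuorum
```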
Multi-Site Clustering
Introduction Networking Storage Quorum Workloads
Hyper-V in a Multi-Site Cluster
Network: On cross-subnet failover, if the guest uses DHCP, its IP is updated automatically; if it has a statically configured IP, an admin needs to configure the new IP. A stretched VLAN is preferred with live migration between sites.
Storage: 3rd-party replication solution required; configuration with CSV (explained next).
Quorum: No special considerations.
Links: http://technet.microsoft.com/en-us/library/dd197488.aspx
CSV in a Multi-Site Cluster
Architectural assumptions collide: replication solutions assume only one array is accessed at a time, while CSV assumes all nodes can concurrently access the LUN.
CSV is not required for Live Migration. Talk to your storage vendor for their support story. CSV requires VLANs.
VHD
Nodes in Primary Site Nodes in Disaster Recovery Site
Read/Write (primary) → Replication → Read-Only (replica)
VM attempts to access replica
SQL in a Multi-Site Cluster
Network: SQL does not support the OR dependency; need to stretch a VLAN between sites.
Storage: No special considerations; 3rd-party replication solution required.
Quorum: No special considerations.
Links:
http://technet.microsoft.com/en-us/library/ms189134.aspx
http://technet.microsoft.com/en-us/library/ms178128.aspx
Exchange in a Multi-Site Cluster
Network: No VLAN needed; change HostRecordTTL from 20 minutes to 5 minutes; CCR supports 2 nodes, one per site.
Storage: Exchange CCR provides application-based replication.
Quorum: File share witness on the Hub Transport server in the primary site.
Links:
http://technet.microsoft.com/en-us/library/bb124721.aspx
http://technet.microsoft.com/en-us/library/aa998848.aspx
Session Summary
Multi-Site Failover Clustering has many benefits
Redundancy is needed everywhere
Understand your replication needs
Compare VLANs with multiple subnets
Plan quorum model & nodes before deployment
Follow the checklist and best practices
Resources
www.microsoft.com/teched: Sessions On-Demand & Community
http://microsoft.com/technet: Resources for IT Professionals
http://microsoft.com/msdn: Resources for Developers
www.microsoft.com/learning: Microsoft Certification & Training Resources
Related Content
Breakout Sessions:
SVR208 Gaining Higher Availability with Windows Server 2008 R2 Failover Clustering
SVR319 Multi-Site Clustering with Windows Server 2008 R2
DAT312 All You Needed to Know about Microsoft SQL Server 2008 Failover Clustering
UNC307 Microsoft Exchange Server 2010 High Availability
SVR211 The Challenges of Building and Managing a Scalable and Highly Available Windows Server 2008 R2 Virtualisation Solution
SVR314 From Zero to Live Migration. How to Set Up a Live Migration
Demo Sessions:
SVR01-DEMO Free Live Migration and High Availability with Microsoft Hyper-V Server 2008 R2
Hands-on Labs:
UNC12-HOL Microsoft Exchange Server 2010 High Availability and Storage Scenarios
Multi-Site Clustering Content
Design guide:http://technet.microsoft.com/en-us/library/dd197430.aspx
Deployment guide/checklist:http://technet.microsoft.com/en-us/library/dd197546.aspx
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS,
IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.