
Page 1: VS5ICM_M11_HighAvailability

High Availability and Fault Tolerance

Module 11

© 2011 VMware Inc. All rights reserved

Page 2: VS5ICM_M11_HighAvailability

You Are Here

Course Introduction

Introduction to Virtualization

Virtual Machines

VMware vCenter Server

Data Protection

Access & Authentication Control

Resource Management and Monitoring

High Availability


VMware vSphere: Install, Configure, Manage – Revision A

VMware vCenter Server

Configure and Manage Virtual Networks

Configure and Manage Virtual Storage

Managing Virtual Machines

High Availability

Scalability

Patch Management

Installing vSphere Components

Page 3: VS5ICM_M11_HighAvailability

Importance

Most organizations rely on computer-based services like email, databases, and Web-based applications. The failure of any of these services can mean lost productivity and revenue. Configuring highly available, computer-based services is extremely important for an organization to remain competitive in contemporary business environments.

With VMware vSphere® 5, a new high availability architecture has been released.

Page 4: VS5ICM_M11_HighAvailability

Module Lessons

Lesson 1: Introduction to vSphere High Availability

Lesson 2: Configuring vSphere High Availability

Lesson 3: vSphere High Availability Architecture

Lesson 4: Introduction to Fault Tolerance


Page 5: VS5ICM_M11_HighAvailability

Lesson 1: Introduction to vSphere High Availability

Page 6: VS5ICM_M11_HighAvailability

Learner Objectives

After this lesson, you should be able to do the following:

• Describe the various options that you can configure to ensure high availability in a vSphere 5 environment.
• Discuss the response of vSphere High Availability when a VMware® ESXi™ host, a virtual machine, or an application fails.

Page 7: VS5ICM_M11_HighAvailability

VMware Offers Protection at Every Level

[Figure: VMware vSphere® protection at each level of the infrastructure]

Component: NIC teaming, storage multipathing
Server: VMware vSphere® vMotion®, DRS, High Availability & Fault Tolerance
Storage: vSphere Storage vMotion
Data: third-party backup solutions, VMware Data Recovery
Site: Site Recovery Manager

• Protection against hardware failures
• Planned maintenance with zero downtime
• Protection against unplanned downtime and disasters

Page 8: VS5ICM_M11_HighAvailability

vCenter Server Availability - Recommendations

Make VMware vCenter Server™ and the components it relies on highly available.

vCenter Server relies on:

• vCenter Server database:
  • Cluster the database. Refer to the specific database documentation.
• Active Directory structure:
  • Set up with multiple redundant servers.

Methods for making vCenter Server available:

• Use vSphere High Availability to protect the vCenter Server virtual machine.
• Use VMware vCenter Server Heartbeat™.

Page 9: VS5ICM_M11_HighAvailability

High Availability

A highly available system is one that is continuously operational for a desirably long length of time.

Level of availability    Downtime per year
99%                      87.6 hours (about 3.65 days)
99.9%                    8.76 hours
99.99%                   about 52 minutes
99.999%                  about 5 minutes

What level of virtual machine availability is important to you?
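The downtime figures follow directly from the availability percentage. The sketch below shows the arithmetic, assuming a 365-day year (8,760 hours); the function name is illustrative and not part of any VMware tool.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored in this sketch


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the hours of downtime per year allowed at a given availability level."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (100 - availability_percent) / 100


if __name__ == "__main__":
    for level in (99, 99.9, 99.99, 99.999):
        hours = downtime_hours_per_year(level)
        print(f"{level}% -> {hours:.2f} hours ({hours * 60:.1f} minutes)")
```

Each additional "nine" cuts the allowed downtime by a factor of ten, which is why the step from 99.9% to 99.99% moves the budget from hours to minutes.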

Page 10: VS5ICM_M11_HighAvailability

vSphere High Availability

                                     vSphere HA
Level of availability                High availability
Amount of downtime                   Minimal
Guest operating systems supported    Works with all supported guest operating systems
VMware ESXi hardware supported       Works with all supported ESXi hardware
Uses                                 Use to provide high availability for the virtual machines that require that level of protection.

Page 11: VS5ICM_M11_HighAvailability

vSphere HA Failure Scenarios

• ESXi host failure
• Guest OS failure
• Application failure

Page 12: VS5ICM_M11_HighAvailability

High Availability Failure Scenario - Host

[Figure: a vSphere HA cluster of three ESXi hosts managed by vCenter Server, running virtual machines A through F on shared storage (LUN 1, LUN 2, LUN 3). Virtual machines A and B are shown a second time, restarted on another host.]

When a host fails, vSphere HA restarts the affected virtual machines on other hosts.

Page 13: VS5ICM_M11_HighAvailability

High Availability Failure Scenario – Guest Operating System

[Figure: a vSphere HA cluster of three ESXi hosts managed by vCenter Server, running virtual machines A through F on shared storage (LUN 1, LUN 2, LUN 3). VMware Tools is installed in each virtual machine.]

When a virtual machine stops sending heartbeats or the virtual machine process (vmx) crashes, vSphere HA resets the virtual machine.

Page 14: VS5ICM_M11_HighAvailability

HA Failure Scenario - Application

[Figure: a vSphere HA cluster of three ESXi hosts managed by vCenter Server, running virtual machines A through F on shared storage (LUN 1, LUN 2, LUN 3). Each virtual machine runs an application.]

When an application fails, vSphere HA restarts the affected virtual machine on the same host. Requires VMware Tools to be installed.
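The three failure scenarios in this lesson can be summarized as a lookup from failure type to the vSphere HA response. This is only a study aid built from the slide captions; the names `Failure` and `ha_response` are illustrative and do not correspond to any VMware API.

```python
from enum import Enum


class Failure(Enum):
    HOST = "ESXi host failure"
    GUEST_OS = "guest OS failure"
    APPLICATION = "application failure"


# Responses as described on the three failure-scenario slides.
RESPONSES = {
    Failure.HOST: "restart the affected virtual machines on other hosts",
    Failure.GUEST_OS: "reset the virtual machine",
    Failure.APPLICATION: "restart the affected virtual machine on the same host",
}


def ha_response(failure: Failure) -> str:
    """Return the vSphere HA response for the given failure type."""
    return RESPONSES[failure]


if __name__ == "__main__":
    for failure in Failure:
        print(f"{failure.value}: {ha_response(failure)}")
```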

Page 15: VS5ICM_M11_HighAvailability

Review of Learner Objectives

You should be able to do the following:

• Describe the various options that you can configure to ensure high availability in a vSphere 5 environment.
• Discuss the response of vSphere High Availability when an ESXi host, a virtual machine, or an application fails.

Page 16: VS5ICM_M11_HighAvailability

Lesson 2: Configuring vSphere High Availability

Page 17: VS5ICM_M11_HighAvailability

Learner Objectives

After this lesson, you should be able to do the following:

• Configure a vSphere HA cluster.

Page 18: VS5ICM_M11_HighAvailability

Enabling vSphere HA

Enable vSphere HA by creating a cluster or modifying a vSphere Distributed Resource Scheduler (DRS) cluster.

Page 19: VS5ICM_M11_HighAvailability

Configuring vSphere HA Settings

Host Monitoring: Disable Host Monitoring when performing maintenance on any cluster or host. Enabled is the default setting.

Admission Control: Refers to the amount of available resources that can be used to start virtual machines on a specific ESXi host. The default setting, which VMware recommends, is to disallow power-on and other operations that would violate the set Admission Control Policy.

Admission Control Policy: Admission control helps ensure sufficient resources to provide high availability. The default setting is Host failures the cluster tolerates.

Page 20: VS5ICM_M11_HighAvailability

Admission Control Policy Choices

Policy: Percentage of cluster resources reserved as failover spare capacity
  Description: Reserves specified percentage of total capacity
  Recommended use: When virtual machines have highly variable CPU and memory reservations

Policy: Host failures cluster tolerates
  Description: Reserves enough resources to tolerate specified number of host failures
  Recommended use: When virtual machines have similar CPU/memory reservations and similar memory overheads

Policy: Specify a failover host
  Description: Dedicates a host exclusively for failover service
  Recommended use: To accommodate organizational policies that dictate the use of a passive failover host
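To make the percentage-based policy concrete, here is a minimal sketch of the check it implies. It assumes a single resource (for example CPU in MHz) and treats failover capacity as the share of total capacity not claimed by virtual machine reservations. The function names are hypothetical, and the real policy evaluates CPU and memory separately and accounts for overhead.

```python
def failover_capacity_percent(total_capacity: float, reserved: float) -> float:
    """Share of cluster capacity not claimed by VM reservations, in percent."""
    if total_capacity <= 0:
        raise ValueError("total_capacity must be positive")
    return (total_capacity - reserved) / total_capacity * 100


def can_power_on(total_capacity: float, reserved: float,
                 new_vm_reservation: float, reserved_percent: float) -> bool:
    """Allow the power-on only if the configured spare percentage is still met afterward."""
    remaining = failover_capacity_percent(total_capacity, reserved + new_vm_reservation)
    return remaining >= reserved_percent


if __name__ == "__main__":
    # Hypothetical cluster: 24,000 MHz total, 16,000 MHz already reserved, 25% kept spare.
    print(can_power_on(24000, 16000, 1000, 25))  # fits within the spare-capacity rule
    print(can_power_on(24000, 16000, 3000, 25))  # would eat into the reserved 25%
```

With admission control enabled, a request like the second one is the kind of operation the cluster disallows.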

Page 21: VS5ICM_M11_HighAvailability

Configuring Virtual Machine Options

Configure options at the cluster level or per virtual machine.

VM restart priority determines the relative order in which virtual machines are restarted after a host failure.

Host Isolation response determines what happens to virtual machines when a host loses the management network but continues running.
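The effect of VM restart priority can be sketched as a simple ordering. The priority labels below (high, medium, low, disabled) are assumed from the vSphere Client and are not listed on this slide; `restart_order` is an illustrative helper, not a VMware function.

```python
# Lower number means restarted earlier. "disabled" VMs are left out entirely.
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def restart_order(vms: dict[str, str]) -> list[str]:
    """Given VM name -> restart priority, return VM names in restart order."""
    eligible = [
        (PRIORITY_ORDER[priority], name)
        for name, priority in vms.items()
        if priority != "disabled"
    ]
    # Ties within a priority level are broken by name only to keep the output stable.
    return [name for _, name in sorted(eligible)]


if __name__ == "__main__":
    cluster_vms = {"db": "high", "web": "medium", "test": "disabled", "batch": "low"}
    print(restart_order(cluster_vms))
```

The priority is relative: it orders restarts after a host failure but does not by itself guarantee that resources exist for every virtual machine.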

Page 22: VS5ICM_M11_HighAvailability

Configuring Virtual Machine Monitoring

• Reset a virtual machine if its VMware Tools heartbeat or VMware Tools application heartbeats are not received.
• Determine how quickly failures are detected.
• Set monitoring sensitivity for individual virtual machines.

Page 23: VS5ICM_M11_HighAvailability

Importance of Redundant Heartbeat Networks

In a vSphere HA cluster, heartbeats are:

• Sent between the master and the slave hosts
• Used to determine if a master or slave host has failed
• Sent over a heartbeat network

The heartbeat network is:

• Implemented using a VMkernel port marked for management

Redundant heartbeat networks:

• Allow for the reliable detection of failures
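Why redundancy makes detection more reliable can be sketched with a small timeout-based monitor. The class name, the host names, and the 15-second timeout are assumptions for illustration only; they are not values from vSphere HA.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each host, on any heartbeat network."""

    def __init__(self, timeout_seconds: float):
        self.timeout_seconds = timeout_seconds
        self.last_seen: dict[str, float] = {}

    def record(self, host: str, now: float) -> None:
        # A heartbeat arriving on ANY network refreshes the host, so losing
        # one of two redundant networks does not make the host look silent.
        self.last_seen[host] = now

    def silent_hosts(self, now: float) -> list[str]:
        """Hosts whose most recent heartbeat is older than the timeout."""
        return sorted(
            host for host, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=15)
    monitor.record("esxi-a", now=0)   # last heard at t=0
    monitor.record("esxi-b", now=10)  # still heard at t=10 over the second network
    print(monitor.silent_hosts(now=20))
```

With a single heartbeat network, a network fault and a host fault look identical to the monitor, which is the ambiguity that redundant networks reduce.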

Page 24: VS5ICM_M11_HighAvailability

Redundancy Using NIC Teaming

You can use NIC teaming to create a redundant heartbeat network on ESXi hosts.

Both port groups must be VMkernel ports.

[Figure: NIC teaming on an ESXi host]

Page 25: VS5ICM_M11_HighAvailability

Redundancy Using Additional Networks

You can also create redundancy by configuring more heartbeat networks:

• On ESXi hosts, add one or more VMkernel networks marked for management traffic.

Configure the port group with these settings:

• Set Load Balancing to originating port ID.
• Do not enable Failback.
• Configure port group with active/standby failover.

Page 26: VS5ICM_M11_HighAvailability

Network Configuration and Maintenance

Before changing the networking configuration on the ESXi hosts (adding port groups, removing vSwitches):

• Deselect Enable Host Monitoring.
• Place the host in maintenance mode.

These steps prevent unwanted attempts to fail over virtual machines.

Page 27: VS5ICM_M11_HighAvailability

Cluster Resource Allocation Tab

How much of its CPU and memory resources is the cluster using now?

How much reserved capacity remains?

Page 28: VS5ICM_M11_HighAvailability

Monitoring Cluster Status

From the cluster's Summary tab:

• The vSphere HA Cluster Status window displays details about host operational status, virtual machine protection, and heartbeat datastores.
• The Configuration Issues window displays the current vSphere HA operational status, including the specific status and errors for each master and slave host in the cluster.

Page 29: VS5ICM_M11_HighAvailability

Lab 18

In this lab, you will modify slot sizes and admission control.

1. Create a cluster enabled for vSphere HA.

2. Add your ESXi host to a cluster.

3. Test vSphere HA functionality.

4. Prepare for the next lab.

Page 30: VS5ICM_M11_HighAvailability

Review of Learner Objectives

You should be able to do the following:

• Configure a vSphere HA cluster.

Page 31: VS5ICM_M11_HighAvailability

Lesson 3: vSphere High Availability Architecture

Page 32: VS5ICM_M11_HighAvailability

Learner Objectives

After this lesson, you should be able to do the following:

• Describe heartbeat mechanisms used by vSphere HA.
• Identify and discuss additional failure scenarios.

Page 33: VS5ICM_M11_HighAvailability

vSphere HA Architecture: Agent Communication

[Figure: vCenter Server (vpxd) connected over the management network to three ESXi hosts, one master and two slaves. Each host runs the FDM, hostd, and vpxa agents and is attached to a datastore.]

Page 34: VS5ICM_M11_HighAvailability

vSphere HA Architecture: Network Heartbeats

[Figure: three ESXi hosts (two slaves and one master) running virtual machines A through F, connected to vCenter Server over Management network 1 and Management network 2. Storage shown: NAS/NFS, VMFS, and local.]

Page 35: VS5ICM_M11_HighAvailability

vSphere HA Architecture: Datastore Heartbeats

[Figure: three ESXi hosts (slave, master, slave) running virtual machines A through F, connected to vCenter Server over Management network 1 and Management network 2. Storage shown: NAS/NFS, VMFS, and local. An inset shows the Cluster Edit Settings window.]

Page 36: VS5ICM_M11_HighAvailability

Additional HA Failure Scenarios

• Slave host failure
• Master host failure
• Host isolation
• Management network failures
  • Network partition
  • Network isolation

Page 37: VS5ICM_M11_HighAvailability

Failed Slave Host

[Figure: three ESXi hosts (slave, master, slave) running virtual machines A through F, connected to vCenter Server over a primary and an alternate heartbeat network. One slave host is marked "?". Storage shown: NAS/NFS (lock file) and VMFS (heartbeat region), with file locks held by the hosts.]

Page 38: VS5ICM_M11_HighAvailability

Failed Master Host

[Figure: three ESXi hosts running virtual machines A through F, connected to vCenter Server over a primary and an alternate heartbeat network. Storage shown: NAS/NFS (lock file) and VMFS (heartbeat region), with file locks held by the hosts. The default gateway is the isolation address. Host roles: slave (MOID 98), master (MOID 99, marked "?"), slave (MOID 100).]

MOID = managed object ID
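The slide does not state how a new master is chosen. A commonly described rule for vSphere HA, treated here as an assumption, is that the host connected to the most datastores wins and ties are broken by the highest MOID compared as a string. That would explain why the host with MOID 99, not 100, is the master in this diagram. The function below is an illustrative sketch, not VMware code.

```python
def elect_master(candidates: dict[str, int]) -> str:
    """Pick a master from MOID -> number of connected datastores.

    Assumed rule: most datastores wins; ties go to the MOID that sorts
    highest as a string (so "99" beats "100").
    """
    if not candidates:
        raise ValueError("no candidate hosts")
    return max(candidates, key=lambda moid: (candidates[moid], moid))


if __name__ == "__main__":
    hosts = {"98": 3, "99": 3, "100": 3}  # all hosts see the same three datastores
    print(elect_master(hosts))            # initial election
    del hosts["99"]                       # the master fails
    print(elect_master(hosts))            # re-election among the remaining hosts
```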

Page 39: VS5ICM_M11_HighAvailability

Isolated Host

[Figure: three ESXi hosts running virtual machines A through F, with the default gateway as the isolation address.]

If the host is not observing any election traffic on the management network and cannot ping its isolation address(es), the host is isolated.
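The isolation rule in the caption reduces to a two-part test from the host's own point of view. The sketch below encodes it; the function and parameter names are hypothetical.

```python
def is_isolated(sees_election_traffic: bool, isolation_ping_results: list[bool]) -> bool:
    """A host declares itself isolated only if it sees no election traffic on the
    management network AND none of its isolation addresses answers a ping."""
    return not sees_election_traffic and not any(isolation_ping_results)


if __name__ == "__main__":
    # One isolation address (the default gateway), unreachable, no election traffic.
    print(is_isolated(False, [False]))
    # A second, reachable isolation address prevents a false isolation declaration.
    print(is_isolated(False, [False, True]))
```

The second call shows why redundant isolation addresses help: a single unreachable gateway no longer decides the outcome on its own.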

Page 40: VS5ICM_M11_HighAvailability

Design Considerations

Host isolation events can be minimized through good design:

• Implement redundant heartbeat networks.
• Implement redundant isolation addresses.

If host isolation events do occur, good design enables vSphere HA to determine whether the isolated host is still alive:

• Implement datastores so that they are separated from the management network using one or both of the following approaches:
  • Fibre Channel over fibre optic
  • Physically separating your IP storage network from the management network

Page 41: VS5ICM_M11_HighAvailability

Network Partition

[Figure: four ESXi hosts, each running two virtual machines (A through H), with vCenter Server and the default gateway (isolation address). One host is labeled MASTER and three are labeled SLAVE; a second MASTER label also appears in the diagram.]

Page 42: VS5ICM_M11_HighAvailability

Review of Learner Objectives

You should be able to do the following:

• Describe heartbeat mechanisms used by vSphere HA.
• Identify and discuss additional failure scenarios.

Page 43: VS5ICM_M11_HighAvailability

Lesson 4: Introduction to Fault Tolerance

Page 44: VS5ICM_M11_HighAvailability

Learner Objectives

After this lesson, you should be able to do the following:

• List Fault Tolerance requirements and limitations.
• Describe Fault Tolerance operation.

Page 45: VS5ICM_M11_HighAvailability

What Is Fault Tolerance (FT)?

A fault-tolerant system is designed so that, in the event of an unplanned outage, a backup virtual machine can immediately take over with no loss of service. (The backup virtual machine is called a secondary virtual machine.)

FT:

• Provides a higher level of business continuity than vSphere HA
• Provides zero downtime and zero data loss for applications

FT can be used for any application that needs to be available at all times.

FT can be used with DRS:

• Fault-tolerant virtual machines benefit from better initial placement and are included in the cluster’s load-balancing calculations.

Page 46: VS5ICM_M11_HighAvailability

VMware Fault Tolerance

                                     Fault Tolerance
Level of availability                Fault tolerance
Amount of downtime                   Zero
Guest operating systems supported    Works with all supported guest operating systems
ESXi hardware supported              Widely compatible
Uses                                 Use to provide fault tolerance to your critical virtual machines.

Page 47: VS5ICM_M11_HighAvailability

Fault Tolerance in Action

[Figure: a primary VM and a secondary VM kept in sync by vLockstep technology. After a failure, the diagram shows a new primary and a new secondary, again linked by vLockstep technology.]

FT provides zero-downtime, zero-data-loss protection to virtual machines in a vSphere HA cluster.

Page 48: VS5ICM_M11_HighAvailability

Fault Tolerance Guidelines

Check the requirements and limitations of FT.

Ensure enough ESXi hosts for fault-tolerant virtual machines:

• No more than four fault-tolerant virtual machines (primaries or secondaries) on any single host

Store ISOs on shared storage for continuous access:

• Especially if used for important operations

Disable BIOS-based power management:

• Prevents the secondary virtual machine from having insufficient CPU resources
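The four-per-host guideline is easy to check against a placement list. The sketch below counts primaries and secondaries together, as the guideline states; the host and virtual machine names are made up for illustration.

```python
from collections import Counter

MAX_FT_VMS_PER_HOST = 4  # guideline from the slide; primaries and secondaries both count


def overloaded_hosts(placements: list[tuple[str, str]]) -> list[str]:
    """Return hosts that carry more FT virtual machines than the guideline allows.

    placements holds (vm_name, host_name) pairs, one per FT primary or secondary.
    """
    counts = Counter(host for _, host in placements)
    return sorted(host for host, n in counts.items() if n > MAX_FT_VMS_PER_HOST)


if __name__ == "__main__":
    placements = [
        ("app1-primary", "esxi-a"), ("app2-primary", "esxi-a"),
        ("app3-primary", "esxi-a"), ("app4-primary", "esxi-a"),
        ("app5-secondary", "esxi-a"),  # fifth FT VM on esxi-a
        ("app1-secondary", "esxi-b"),
    ]
    print(overloaded_hosts(placements))
```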

Page 49: VS5ICM_M11_HighAvailability

Enabling Fault Tolerance on a Virtual Machine


Page 50: VS5ICM_M11_HighAvailability

Review of Learner Objectives

You should be able to do the following:

• List Fault Tolerance requirements and limitations.
• Describe Fault Tolerance operation.

Page 51: VS5ICM_M11_HighAvailability

Key Points

• vSphere HA restarts virtual machines on the remaining hosts in the cluster.
• Hosts in vSphere HA clusters have a master/slave relationship.
• Implement redundant heartbeat networks either with NIC teaming or by creating additional heartbeat networks.
• FT provides zero downtime for applications that need to be available at all times.