Best Practices for designing Farms and Clusters
Daniel McLean, Systems Engineer – Strategic Accounts
Thank you to: Nathan Wheat, Andy Meakin and Michael Francis

TAC9516: Best Practices for designing Farms and Clusters (download3.vmware.com/vmworld/2006/tac9516.pdf)

Objectives

• Speak and be understood!
• What is a cluster?
• How can designing a cluster help with scaling out a Virtual Infrastructure?
• An approach to system sizing and chargeback.
• An approach to sizing the cluster.

Overview

[Diagram: a VI Cluster built from Compute Units]

Building the Virtual Enterprise

[Diagram: six VI Clusters, each built from Compute Units]

Why Many Clusters?

• Technological
  • Performance
  • Software/Hardware limits
  • Services offered (DRS, HA)
  • Tiered storage or servers
  • Redundancy
• Geographic
• Social/Political
  • Business Unit
• Environmental
  • Dev/Test, QA, Production
  • DMZ

[Diagram: VI Cluster built from Compute Units]

Tier 1 service offering: N+2, dual path, highly redundant servers, HA, DRS, VCB. Tier 1 storage with data mirroring. Resource guarantees.

Tier 4 service offering: no resource guarantees, HA if possible, oversubscription, DRS, Tier 3 storage.

Common Questions – One Answer

How many...

• VMs on a server?
• LUNs per server?
• VMs on a LUN?
• LUNs to a server?
• NICs per server?

It Depends!

How Will This Help?

• The “It Depends” is based on each customer environment.
  • Sometimes we don’t know how many VMs we need to have.
  • Sometimes we don’t know the load of a system before we test it.
• The information discussed is meant to give an appreciation of what needs to be considered.
• Move the focus of your designs away from servers and onto clusters. This will likely resolve performance, vMotion and scalability issues.
• Experience and VMware recommendations will put you in the ballpark. A Professional Services engagement will hit the home run.

Clusters – What is a VI Cluster

[Diagram: VI Cluster]

Clusters – Elements of a Cluster

VI Cluster

Compatible CPU between all hosts to support vMotion. Speed and quantity can differ.

Enough memory in hosts to support VMs. Consistency amongst hosts.

A common Gb network between all hosts for vMotion, host management and VMs

Shared storage accessible to all hosts for Virtual Machines

The Virtual Infrastructure Unit

A compute unit of infrastructure.
• Designing for the normal, not the exception.

PROs
• Helps with planning new Virtual Machines and with sizing the cluster
• Is a great way to charge back to business units

CONs
• Is static and can result in some (small) underutilisation

[Diagram: VI Unit]

VI Unit – a Working Example

• Think about the most basic server in your environment: your SOE (Standard Operating Environment).
• We are mainly concerned with RAM and CPU.
• Decide on the ratio of vCPUs to physical CPUs.
  • We will use 8:1.
• Create 3 VM sizes: Light, Standard and Heavy.
  • This only alters the memory and the CPU.
  • Disk and network come later.

VM Element   Base Unit      Notes
vCPU         1              1/8 of a physical CPU
Memory       512 MB         Minimum memory to run a corporate server or desktop VM
Network      1              1/8 of a physical Gb NIC
Disk         10 GB + data   To hold the standard image (OS and base apps)

Server Element   Light (1 Unit)   Standard (2 Units)   Heavy (4 Units)
vCPU             1                1                    1 or 2
Memory           512 MB           1 GB                 2 GB

VI Unit – Physical Server Sizing

We use the ratio to determine the physical server size (we used 8:1). For my example VI Unit (1 vCPU, 512 MB RAM):

• Multiply the VI Unit memory by this ratio to work out how much RAM is needed per CPU in the machine.
  • 512 MB x 8 = 4 GB
• NICs: 2* Gb NICs for the server (Mgmt, vMotion) + 1 Gb NIC for every CPU
• HBAs: minimum of 2

*Add separate NICs if using the iSCSI software initiator.

Physical Server Size
Server   CPU   Memory   NIC
2-way    2     8 GB     4
4-way    4     16 GB    6
8-way    8     32 GB    10

Number of Units
Server   Light   Standard   Heavy
2-way    16      8          4
4-way    32      16         8
8-way    64      32         16
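
The sizing rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the session: the function name `size_server` and the dictionary layout are assumptions, while the 8:1 ratio, the 512 MB unit, the "2 NICs plus 1 per CPU" rule and the unit weights (1, 2, 4) are taken from the slides.

```python
# Hypothetical helper illustrating the VI Unit server-sizing rule.
VCPU_PER_PCPU = 8        # vCPU to physical CPU ratio chosen in the example (8:1)
VIU_MEMORY_MB = 512      # memory in one VI Unit
UNITS_PER_VM = {"light": 1, "standard": 2, "heavy": 4}


def size_server(physical_cpus: int) -> dict:
    """Return RAM, NIC count and VM capacity for a server with the given CPU count."""
    units = physical_cpus * VCPU_PER_PCPU        # VI Units the server can host
    memory_gb = units * VIU_MEMORY_MB // 1024    # 512 MB x 8 = 4 GB per CPU
    nics = 2 + physical_cpus                     # 2 (mgmt, vMotion) + 1 per CPU
    vms = {size: units // weight for size, weight in UNITS_PER_VM.items()}
    return {"memory_gb": memory_gb, "nics": nics, "units": units, "vms": vms}


for cpus in (2, 4, 8):
    print(f"{cpus}-way:", size_server(cpus))
```

Changing `VCPU_PER_PCPU` or `VIU_MEMORY_MB` shows how sensitive the server specification is to the chosen ratio and base unit.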

Cluster Sizing

We have our VI Unit, now we need somewhere to run it: the cluster.

What do we need to consider when sizing the cluster?
• How big is the storage, how big is the server?
• What affects performance for better or worse?
• How do we charge for it?

Goal
• To reach our desired utilisation across all elements
  • Memory, CPU, disk etc.
  • We don’t want to overload the servers and have lots of free space on the disk volumes.
• To recover the cost of the infrastructure

Considerations for Sizing a Cluster

The number of VMs or VI Units per datastore
• Determined by the amount of IO going to and from the storage
• Will depend on the type of storage and the load from the VMs
• Affected by
  • Storage performance
  • VM load
  • Infrastructure changes
  • Redo logs
• A good rule of thumb for FC SAN is
  • Fewer than 100 lightly loaded VMs (80)
  • Fewer than 40 Standard VMs
  • Fewer than 20 Heavy VMs

VIUs/DS x DS = total VIUs per cluster
E.g. 80 per DS x 6 DS = 480 VIUs per cluster

VIUs per DataStore
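
With the unit weights used earlier (Light = 1, Standard = 2, Heavy = 4 VI Units), the three rules of thumb all come to roughly 80 VI Units per datastore. The sketch below checks a mixed VM population against that figure. Treating 80 as a hard budget is an assumption made for illustration, and the names `vius_used` and `fits_on_datastore` are hypothetical.

```python
# Hypothetical check of a VM mix against the ~80 VI Units per datastore rule of thumb.
UNITS_PER_VM = {"light": 1, "standard": 2, "heavy": 4}
VIUS_PER_DATASTORE = 80   # FC SAN rule of thumb from the slide


def vius_used(vm_counts: dict) -> int:
    """Total VI Units consumed by a mix of VM sizes."""
    return sum(UNITS_PER_VM[size] * count for size, count in vm_counts.items())


def fits_on_datastore(vm_counts: dict, limit: int = VIUS_PER_DATASTORE) -> bool:
    """True if the mix stays within the per-datastore VI Unit budget."""
    return vius_used(vm_counts) <= limit


mix = {"light": 30, "standard": 15, "heavy": 5}
print(vius_used(mix), fits_on_datastore(mix))
```

Real limits depend on the array and the IO profile, so the budget should be adjusted after testing.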

Considerations for Sizing a Cluster

The number of ESX hosts connected to one datastore
• Affected by the number of infrastructure changes leading to VMFS locks
  • Power operations, vMotions, creation and deletion of VMs, redo logs
  • Better in ESX3 than ESX2, but still must be considered
  • The more infrastructure changes, the fewer hosts per cluster
• The number of hosts per DS will determine the number of hosts in the cluster.

E.g. 12 hosts per datastore = 12 hosts per cluster

Hosts per DataStore

Considerations for Sizing a Cluster

The number of datastores connected to a single host
• Performance affected by
  • Type of array
    • Enterprise – Active/Active, high performance
    • Mid-range – Active/Passive, mid performance
    • NAS/iSCSI/FC SAN
    • Speed of HBA and number of paths
  • Type of VM
    • Light or heavy IO
• Often a recommendation by the storage vendor
• The number of DS per host will determine the maximum number of DS for the cluster.

E.g. 6 DS per host = 6 DS per cluster

DataStores per Host

Considerations for Sizing a Cluster - Summary

VIUs per DataStore
• Light – 80:1
• Standard – 40:1
• Heavy – 20:1
• VIUs/DS x DS = total number of VIUs for the cluster

Hosts per DataStore
• ESX2 – 10:1
• ESX3 – 16:1
• Hosts/DS = total number of hosts for the cluster

DataStores per Host
• Enterprise – 12:1
• Mid-range – 6:1
• DS/Host = max number of DS for the cluster
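
The three ratios combine into a cluster boundary. The sketch below is one way to express that, assuming every host in the cluster sees every datastore. The `ClusterBoundary` type and the lookup tables are illustrative names; the ratios are the ones quoted on this slide.

```python
from dataclasses import dataclass

# Ratios quoted on the summary slide; the dictionary names are illustrative.
HOSTS_PER_DATASTORE = {"ESX2": 10, "ESX3": 16}
DATASTORES_PER_HOST = {"enterprise": 12, "mid-range": 6}
VIUS_PER_DATASTORE = 80   # 80 Light = 40 Standard = 20 Heavy VMs


@dataclass
class ClusterBoundary:
    max_hosts: int
    max_datastores: int
    max_vius: int


def cluster_boundary(esx_version: str, array_class: str) -> ClusterBoundary:
    """Derive the cluster limits from the ESX version and the array class."""
    hosts = HOSTS_PER_DATASTORE[esx_version]          # hosts/DS = hosts per cluster
    datastores = DATASTORES_PER_HOST[array_class]     # DS/host = DS per cluster
    return ClusterBoundary(hosts, datastores, VIUS_PER_DATASTORE * datastores)


print(cluster_boundary("ESX3", "mid-range"))
```

These are starting points rather than fixed limits, and they are expected to move as hardware and software change.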

[Diagram: Cluster 1, Cluster 2, Templates and ISO Datastore (Template/ISO LUN), Staging Server]

Putting It Together

• We need to consider all aspects when sizing the cluster.
• If we only take one or two items into account when we design, we get huge potential for problems.
• Example:
  • Limit VMs per datastore to 30. Valid for high-performing VMs.
  • The hosts are 4-way servers expected to run 20 VMs per server.
  • After some rapid growth there are now 600 VMs running in the environment. To run this load there are:
    • 28 physical hosts and 20 datastores.
    • Serious performance issues start to occur (storage/SP contention).
    • Still only running 20 VMs per server and 30 per datastore.
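
A small check makes the problem in this example visible. It assumes, for illustration only, that all 28 hosts and 20 datastores sit in a single cluster where every host is connected to every datastore, and it uses the ESX3 (16 hosts per datastore) and enterprise array (12 datastores per host) ratios from the sizing summary as default limits. The function name `boundary_violations` is hypothetical.

```python
# Hypothetical boundary check; the default limits are the ESX3 and enterprise
# array ratios from the sizing summary.
def boundary_violations(hosts: int, datastores: int,
                        max_hosts_per_ds: int = 16,
                        max_ds_per_host: int = 12) -> list:
    """List the cluster boundaries that a single-cluster design would break."""
    problems = []
    if hosts > max_hosts_per_ds:
        problems.append(f"{hosts} hosts share each datastore (limit {max_hosts_per_ds})")
    if datastores > max_ds_per_host:
        problems.append(f"{datastores} datastores per host (limit {max_ds_per_host})")
    return problems


for problem in boundary_violations(hosts=28, datastores=20):
    print(problem)
```

The per-server and per-datastore VM limits are respected, yet both cluster-level boundaries are exceeded, which is the point of the example.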

Putting It Together

• Start with any item that you know about
  • VMs – a known quantity required
  • Servers – a given type of server (dual or quad core)
  • Storage – a known array
• Consider all aspects of the cluster and you won’t go wrong
  • Ensure that the assumptions you make do not go outside any of the boundaries unless you test, test and test.
• Got it wrong? Easy to change: move the boundaries of the cluster around.
• Set in stone? No. As hardware and software change, so will these limits.
  • Storage, servers and software get faster and better with each new release.
  • Be flexible.

Storage Sizing - VMs

Virtual Machine storage
• System Volume for OS (all VMs)
• Data Volume (some VMs)
  • The Data Volume is sized per VM based on the application requirements.
  • The System Volume is standard for all VMs, e.g. a 10 GB System Volume.
• Be frugal with what you include. Do you need the i386 dir locally?
• Don’t be too frugal: you still need the apps installed, some free space and a swap file.

VM Application Data storage placement:

Light      Data volumes in VMDK
Standard   Data volumes in VMDK
Medium     Data volumes in VMDK or RDM (Raw Disk Mapping), sized according to application best practices (number of disks and RAID level)
Heavy      Data volumes on RDM (Raw Disk Mapping), sized according to application best practices (number of disks and RAID level)

Storage Sizing - Datastore

DataStore sizing needs to consider
• VM disk files
• VM swap
• Configuration files
• Redo/snapshot files
• Metadata

Example
• Total size of VM disks (40 VMs)
  • 40 x 10 GB = 400 GB
• Add VM config files and swap (80 VIUs)
  • Swap = ½ of memory
  • VIU memory = 512 MB
  • 80 x 256 MB ≈ 21 GB (rounded up)
• Logs and config
  • 10 MB per VM x 40 = 400 MB
• ~422 GB of space for 40 VMs
• Still need to add space for redo logs
  • VCB will need this
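
The same arithmetic can be written as a small function so it can be rerun for other VM counts. This is a sketch under stated assumptions: it uses decimal units (1 GB = 1000 MB) and does not round each term up, so it lands slightly below the slide's ~422 GB. The `headroom_fraction` parameter for redo/snapshot space is hypothetical, and the 20% in the usage line is an invented figure, not a recommendation from the session.

```python
# Hypothetical datastore sizing helper following the example on the slide.
def datastore_size_gb(vms: int, vius: int,
                      system_disk_gb: float = 10.0,
                      viu_memory_mb: float = 512.0,
                      log_config_mb_per_vm: float = 10.0,
                      headroom_fraction: float = 0.0) -> float:
    """Estimate datastore capacity in GB (decimal units, 1 GB = 1000 MB)."""
    disks_gb = vms * system_disk_gb                   # system volumes
    swap_gb = vius * (viu_memory_mb / 2) / 1000       # swap = half of VIU memory
    logs_gb = vms * log_config_mb_per_vm / 1000       # logs and config files
    return (disks_gb + swap_gb + logs_gb) * (1 + headroom_fraction)


base = datastore_size_gb(vms=40, vius=80)
print(f"{base:.1f} GB before redo/snapshot space")
print(f"{datastore_size_gb(40, 80, headroom_fraction=0.2):.1f} GB with 20% headroom")
```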

Chargeback – Costing a Virtual Infrastructure

• Complex question
  • Finance function
  • Must adhere to finance rules
• Some conditions
  • Generally must be fixed, otherwise it is hard to budget for next year
    • Utilisation charging is difficult to budget.
    • Focus on cost allocation/recovery independent of utilisation
• Different approaches
  • Focus of today is on fixed cost recovery
    • Suits environments that need fixed, predictable costs for an accounting period or budget cycle
    • Simple to administer and report
    • Pay for performance and availability
    • Defer metering or utilisation-based chargeback until later

Cost Recovery – Base Costs

Base Costs – Virtual Infrastructure
• Server hardware, storage, network, datacenter space (power, cooling)
• Infrastructure software – VMware etc.
• Provisioning costs – design, installation, maintenance
= Total Cost to Build Cluster

Base cost to VI Unit cost to VM cost
• Divide the total base cost by the number of VI Units in the cluster. This gives a per-unit cost.
= Cost of 1 VI Unit
• Multiply the cost of the VIU by the number of VIUs in a VM.
• E.g. a Heavy VM is 4 x VIU cost

Consider
• Hardware depreciation, costed over the life of the hardware so you have money to replace it when required.
• A tax to cover growth
• Different values for different SLAs and availability.

Goal
• When all VI Units in a cluster are sold, the cost of the infrastructure should be fully covered.
• These are defined before any VM is deployed.
• Effectively replaces what used to be a server cost.
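
The divide-then-multiply step can be sketched as follows. All monetary figures are made up purely to show the arithmetic, the function names are hypothetical, and applying the growth "tax" as a simple percentage uplift is an assumption.

```python
# Hypothetical per-unit cost calculation; all monetary figures are invented.
UNITS_PER_VM = {"light": 1, "standard": 2, "heavy": 4}


def cost_per_viu(base_costs: dict, vius_in_cluster: int,
                 growth_tax: float = 0.0) -> float:
    """Total cost to build the cluster divided by the VI Units it provides."""
    total = sum(base_costs.values()) * (1 + growth_tax)
    return total / vius_in_cluster


def vm_base_cost(size: str, unit_cost: float) -> float:
    """Base cost of a VM = number of VI Units in the VM x cost of one unit."""
    return UNITS_PER_VM[size] * unit_cost


costs = {"hardware": 200_000, "software": 60_000, "provisioning": 28_000}
unit = cost_per_viu(costs, vius_in_cluster=480)
print(unit, vm_base_cost("heavy", unit))
```

If every VI Unit in the cluster is sold at this price, the sum of the charges equals the total base cost, which is the stated goal.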

Cost Recovery – Service Costs

Service Costs – Virtual Machines
• Software for VM – OS, apps, agents (backup and monitoring)
• Hardware for VM – additional storage, installation, maintenance
= Cost to deploy a VM

Include
• Items unique to that VM – special support requirements, different SLAs etc.
= Cost to maintain a VM

Consider
• Software Capex and Opex per VM.
• Which infrastructure services are standard (backup, DR etc.) and what cost these attract.
• A service catalogue for common VMs – a cost template for DBMS, web servers, infrastructure servers.

Goal
• The cost is defined at time of design.
• Depends on what service is being offered.
  • E.g. a database server has additional disk requirements and a platinum SLA.

Putting the Cost Together

Total Cost of VM = (VI Unit Cost x VI Units) + VM Service Cost

• Create different chargeback “tiers” to match the cluster tiers
  • Price for performance
• Keep the price attractive
  • If your model comes out a lot higher than for a physical system, customers will shy away
  • Consider “promotional pricing” early on to get people on the system
• Don’t compare 1 physical to 1 VM when looking at costs.
  • To get HA for a physical server means at least 2 servers for 1 application
  • VMs are much better than physical servers
  • More services and flexibility with virtual
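
As a rough illustration of tiered chargeback, the total cost formula can be applied with a different VI Unit price per cluster tier. The tier names, the prices and the function name `total_vm_cost` are placeholders invented for this sketch.

```python
# Hypothetical tiered chargeback; the per-VIU prices are invented placeholders.
UNITS_PER_VM = {"light": 1, "standard": 2, "heavy": 4}
VIU_PRICE_BY_TIER = {"tier1": 900.0, "tier4": 300.0}


def total_vm_cost(tier: str, size: str, service_cost: float) -> float:
    """Total cost of VM = (VI Unit cost x VI Units) + VM service cost."""
    return VIU_PRICE_BY_TIER[tier] * UNITS_PER_VM[size] + service_cost


print(total_vm_cost("tier1", "heavy", service_cost=1500.0))
print(total_vm_cost("tier4", "light", service_cost=200.0))
```

Keeping the tier prices in one table makes it easy to adjust them, for example for promotional pricing early on.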

Summary

• Were you able to understand me?
• Clusters are the building block of Virtual Infrastructure. Get the cluster design correct and the environment will sing.
• Consider all aspects of the cluster when building it: VM, storage, network, server.
• Use VI Units to help in sizing estimates and to assist in chargeback.
• Chargeback should map to the finance process.
• An approach to sizing the cluster.

Presentation Download

Please remember to complete your session evaluation form and return it to the room monitors as you exit the session.

The presentation for this session can be downloaded at http://www.vmware.com/vmtn/vmworld/sessions/

Enter the following to download (case-sensitive):

Username: cbv_rep
Password: cbvfor9v9r

Some or all of the features in this document may be representative of feature areas under development. Feature commitments must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery.