What's New in VMware vSphere 4: Performance Enhancements

TECHNICAL WHITE PAPER
Table of Contents

Scalability Enhancements
CPU Enhancements
Memory Enhancements
Storage Enhancements
Networking Enhancements
Resource Management Enhancements
Performance Management Enhancements
Application Performance
    Oracle
    SQL Server
    SAP
    Exchange
Summary
References
VMware vSphere 4, the industry's first cloud operating system, includes several unique new features that allow IT organizations to leverage the benefits of cloud computing with maximum efficiency, uncompromised control, and flexibility of choice. The new VMware vSphere 4 provides significant performance enhancements that make it easier for organizations to virtualize their most demanding and intense workloads. These performance enhancements provide VMware vSphere 4 with better:

Efficiency: Optimizations resulting in reduced virtualization overhead and the highest consolidation ratios.

Control: Enhancements leading to improved ongoing performance monitoring and management, as well as dynamic resource sizing for better scalability.

Choice: Improvements that provide several options of guest OS, virtualization technologies, a comprehensive HCL, and integrations with third-party management tools to choose from.

This document outlines the key performance enhancements of VMware vSphere 4, organized into the following categories:

- Scalability Enhancements
- CPU, Memory, Storage, Networking
- Resource Management
- Performance Management

Finally, the white paper showcases the performance improvements in various tier-1 enterprise applications as a result of these benefits.
Scalability Enhancements

A summary of the key new scalability improvements of vSphere 4, as compared to VMware's previous datacenter product, VMware Infrastructure 3 (VI3), is shown in the following table:

Feature                                  VI3        vSphere 4
Virtual Machine CPU Count                4 vCPUs    8 vCPUs
Virtual Machine Memory Maximum           64 GB      255 GB
Host CPU Core Maximum                    32 cores   64 cores
Host Memory Maximum                      256 GB     1 TB
Powered-on VMs per ESX/ESXi Maximum      128        320

For details, see the Systems Compatibility Guide and the Guest Operating System Installation Guide.

Additional changes that enhance the scalability of vSphere include:

- 64 Logical CPUs and 512 Virtual CPUs Per Host: ESX/ESXi 4.0 provides headroom for more virtual machines per host and the ability to achieve even higher consolidation ratios on larger machines.

- 64-bit VMkernel: The VMkernel, a core component of the ESX/ESXi 4.0 hypervisor, is now 64-bit. This provides greater host physical memory capacity and more seamless hardware support than earlier releases.

- 64-bit Service Console: The Linux-based Service Console for ESX 4.0 has been upgraded to a 64-bit version derived from a recent release of a leading Enterprise Linux vendor.
- New Virtual Hardware: ESX/ESXi 4.0 introduces a new generation of virtual hardware (virtual hardware version 7), which adds significant new features including:

  - Serial Attached SCSI (SAS) virtual device for Microsoft Cluster Service: Provides support for running Windows Server 2008 in a Microsoft Cluster Service configuration.

  - IDE virtual device: Ideal for supporting older operating systems that lack SCSI drivers.

  - VMXNET Generation 3: See the Networking section.

  - Virtual Machine Hot Plug Support: Provides support for adding and removing virtual devices, adding virtual CPUs, and adding memory to a virtual machine without having to power off the virtual machine.

  Hardware version 7 is the default for new ESX/ESXi 4.0 virtual machines. ESX/ESXi 4.0 will continue to run virtual machines created on hosts running ESX Server versions 2.x and 3.x. Virtual machines that use virtual hardware version 7 features are not compatible with ESX/ESXi releases prior to version 4.0.

- VMDirectPath for Virtual Machines: VMDirectPath I/O device access enhances CPU efficiency in handling workloads that require constant and frequent access to I/O devices by allowing virtual machines to directly access the underlying hardware devices. Other virtualization features, such as VMotion, hardware independence, and sharing of physical I/O devices, will not be available to virtual machines using this feature. VMDirectPath I/O for networking I/O devices is fully supported with the Intel 82598 10 Gigabit Ethernet Controller and the Broadcom 57710 and 57711 10 Gigabit Ethernet Controllers. It is experimentally supported for storage I/O devices with the QLogic QLA25xx 8Gb Fibre Channel, the Emulex LPe12000 8Gb Fibre Channel, and the LSI 3442e-R and 3801e (1068 chip based) 3Gb SAS adapters.

- Increased NFS Datastore Support: ESX now supports up to 64 NFS shares as datastores in a cluster.
CPU Enhancements

Resource Management and Processor Scheduling

The ESX 4.0 scheduler includes several new features and enhancements that help improve the throughput of all workloads, with notable gains in I/O-intensive workloads. These include:

- Relaxed co-scheduling of vCPUs, introduced in earlier versions of ESX, has been further fine-tuned, especially for SMP VMs.

- The ESX 4.0 scheduler utilizes new finer-grained locking that reduces scheduling overheads in cases where frequent scheduling decisions are needed.

- The new scheduler is aware of processor cache topology and takes the processor cache architecture into account to optimize CPU usage.

For I/O-intensive workloads, interrupt delivery and the associated processing costs make up a large component of the virtualization overhead. The above scheduler enhancements greatly improve the efficiency of interrupt delivery and the associated processing.
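The idea behind relaxed co-scheduling can be sketched with a toy model: rather than stopping an entire SMP VM whenever any of its vCPUs cannot run (strict co-scheduling), the scheduler tracks each vCPU's progress and only forces the laggards to catch up. The skew threshold and bookkeeping below are invented for illustration and are not ESX internals:

```python
# Toy model of relaxed co-scheduling. SKEW_LIMIT is a hypothetical
# tolerance, not an ESX parameter.
SKEW_LIMIT = 3  # maximum tolerated progress gap, in arbitrary ticks

def lagging_vcpus(progress):
    """Return indices of vCPUs whose skew exceeds SKEW_LIMIT.

    Under strict co-scheduling, any skew would deschedule the whole VM;
    relaxed co-scheduling lets the other vCPUs keep running and only
    schedules the laggards to close the gap.
    """
    lead = max(progress)
    return [i for i, p in enumerate(progress) if lead - p > SKEW_LIMIT]

progress = [10, 9, 10, 5]      # vCPU 3 has fallen 5 ticks behind
print(lagging_vcpus(progress))  # -> [3]
```

Only vCPU 3 needs attention here; in the strict model, the small skew of vCPU 1 alone would have been enough to stop all four.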
Memory Enhancements

Hardware-assisted Memory Virtualization

Memory management in virtual machines differs from physical machines in one key aspect: virtual memory address translation. Guest virtual memory addresses must be translated first to guest physical addresses using the guest OS's page tables before finally being translated to machine physical memory addresses. The latter step is performed by ESX by means of a set of shadow page tables for each virtual machine. Creating and maintaining the shadow page tables adds both CPU and memory overhead.
Hardware support is available in current processors to alleviate this situation. The hardware-assisted memory management capabilities from Intel and AMD are called EPT and RVI, respectively. This support consists of a second level of page tables implemented in hardware. These page tables contain guest physical to machine memory address translations. ESX 4.0 introduces support for the Intel Xeon processors that support EPT. Support for AMD RVI has existed since ESX 3.5.
Figure 1: Efficiency improvements using hardware-assisted memory virtualization (workloads: Apache compile, SQL Server, Citrix XenApp)
Figure 1 illustrates the efficiency improvements seen for a few example workloads when using hardware-assisted memory virtualization.
While this hardware support obviates the need for maintaining shadow page tables (and the associated performance overhead), it introduces some costs of its own. Translation look-aside buffer (TLB) miss costs, in the form of increased latency, are higher with two-level page tables than with the one-level table. Using large memory pages, a feature that has been available since ESX 3.5, the number of TLB misses can be reduced. Since TLB miss latency is higher with this form of hardware virtualization assist, but large pages reduce the number of TLB misses, the combination of hardware assist and the large page support that exists in vSphere yields optimal performance.
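The trade-off described above can be made concrete with a little arithmetic. A native four-level page walk touches four table entries on a TLB miss; with nested paging, each guest table entry (and the final guest-physical address) must itself be translated through the nested tables, so a worst-case miss touches g*n + g + n entries. This is the textbook model of a two-dimensional page walk, not a measurement of any particular processor:

```python
def walk_refs(guest_levels=4, nested_levels=4):
    """Worst-case memory references to resolve one TLB miss.

    Each of the guest's page-table levels, plus the final guest-physical
    access, must itself be walked through the nested page tables:
    g*n (nested walks for each guest level) + g (guest entries)
    + n (nested walk for the final access).
    """
    return (guest_levels * nested_levels
            + guest_levels + nested_levels)

print(walk_refs())      # -> 24, versus 4 for a native four-level walk
print(walk_refs(2, 4))  # large guest pages shorten the guest walk -> 14
```

Large pages shorten the walk and, more importantly, reduce how often a miss occurs at all, which is why pairing hardware assist with large pages yields the best results.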
Storage Enhancements

A variety of architectural improvements have been made to the storage subsystem of vSphere 4. The combination of the new paravirtualized SCSI driver and additional ESX kernel-level storage stack optimizations dramatically improves storage I/O performance. With these improvements, all but a very small segment of the most I/O-intensive applications become attractive targets for VMware virtualization.

VMware Paravirtualized SCSI (PVSCSI)

Emulated versions of hardware storage adapters from BusLogic and LSI Logic were the only choices available in earlier ESX releases. The advantage of this full virtualization is that most operating systems ship drivers for
these devices. However, this precludes the use of performance optimizations that are possible in virtualized environments. To this end, ESX 4.0 ships with a new virtual storage adapter, Paravirtualized SCSI (PVSCSI). PVSCSI adapters are high-performance storage adapters that offer greater throughput and lower CPU utilization for virtual machines. They are best suited for environments in which guest applications are very I/O-intensive. The PVSCSI adapter extends to the storage stack the performance gains associated with other paravirtual devices, such as the network adapter VMXNET available in earlier versions of ESX. As with other device emulations, PVSCSI emulation improves efficiency by:

- Reducing the cost of virtual interrupts
- Batching the processing of I/O requests
- Batching I/O completion interrupts

A further optimization, which is specific to virtual environments, reduces the number of context switches between the guest and the Virtual Machine Monitor. Efficiency gains from PVSCSI can result in an additional 2x CPU savings for Fibre Channel (FC) and up to 30 percent CPU savings for iSCSI.
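The benefit of batching completion interrupts is easy to see in isolation: if the guest reaps several completed I/Os per virtual interrupt instead of one, the interrupt count drops proportionally. A minimal sketch, where the batch size is an arbitrary illustrative number rather than a PVSCSI parameter:

```python
def interrupts_needed(completions, batch_size):
    """Completion interrupts required when up to `batch_size`
    completions are reaped per interrupt (ceiling division)."""
    return -(-completions // batch_size)

# 10,000 completed I/Os: one interrupt each vs. batches of 8
print(interrupts_needed(10_000, 1))  # -> 10000
print(interrupts_needed(10_000, 8))  # -> 1250
```

The same ceiling-division argument applies to batching I/O request submissions on the way down the stack.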
Figure 2: Efficiency gains with the PVSCSI adapter for 4K-block I/Os (LSI Logic vs. PVSCSI, over Fibre Channel and software iSCSI)
VMware recommends that you create a primary adapter for use with the disk that will host the system software (boot disk) and a separate PVSCSI adapter for the disk that will store user data, such as a database or mailbox. The primary adapter will be the default for the guest operating system on the virtual machine. For example, for virtual machines with Microsoft Windows 2008 guest operating systems, LSI Logic is the default primary adapter.
iSCSI Support Improvements

vSphere 4 includes significant updates to the iSCSI stack, for both software iSCSI (that is, in which the iSCSI initiator runs at the ESX layer) and hardware iSCSI (that is, in which ESX leverages a hardware-optimized iSCSI HBA). These changes offer dramatic improvements in both the performance and the functionality of software and hardware iSCSI, delivering a significant reduction of CPU overhead for software iSCSI. Efficiency gains for the iSCSI stack can result in 7-26 percent CPU savings for reads and 18-52 percent for writes.
Figure 3: iSCSI percentage CPU efficiency gains for reads and writes (software and hardware iSCSI), ESX 4 vs. ESX 3.5
Software iSCSI and NFS Support with Jumbo Frames

vSphere 4 adds support for jumbo frames with both the NFS and iSCSI storage protocols, on 1Gb as well as 10Gb NICs. The 10Gb support for iSCSI allows for 10x I/O throughput; more details are in the networking section below.
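Much of the CPU cost of network storage protocols is per-packet, so carrying the same payload in fewer, larger frames cuts overhead roughly in proportion. A back-of-the-envelope sketch using the conventional 1500-byte standard and 9000-byte jumbo MTUs, with protocol headers ignored for simplicity:

```python
def frames_for(total_bytes, mtu):
    """Frames needed to carry a payload at a given MTU
    (ceiling division; headers ignored for simplicity)."""
    return -(-total_bytes // mtu)

payload = 9_000_000  # a 9 MB transfer
print(frames_for(payload, 1500))  # -> 6000 standard frames
print(frames_for(payload, 9000))  # -> 1000 jumbo frames, 6x fewer
```

Six times fewer frames means six times fewer per-packet trips through the stack for the same data moved, which is where the efficiency gain comes from.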
Improved I/O Concurrency

Asynchronous I/O execution has always been a feature of ESX. However, ESX 4.0 has improved the concurrency of the storage stack with an I/O mode that allows vCPUs in the guest to execute other tasks after initiating an I/O request, while the VMkernel handles the actual physical I/O. In VMware's February 2009 announcement on Oracle DB OLTP performance, the gains attributed to this improved concurrency model were measured at 5 percent.
Networking Enhancements

Significant changes have been made to the vSphere 4 network subsystem, delivering dramatic performance improvements.

VMXNET Generation 3

vSphere 4 includes VMXNET3, the third generation of paravirtualized NIC adapter from VMware. New VMXNET3 features over the previous version, Enhanced VMXNET, include:

- MSI/MSI-X support (subject to guest operating system kernel support)

- Receive Side Scaling (supported in Windows 2008 when explicitly enabled through the device's Advanced configuration tab)
- IPv6 checksum and TCP Segmentation Offloading (TSO) over IPv6

- VLAN off-loading

- Large TX/RX ring sizes (configured from within the virtual machine)
Network Stack Performance and Scalability

vSphere 4 includes optimizations to the network stack that can saturate 10Gbps links for both transmit- and receive-side network I/O. The improvements in the VMkernel TCP/IP stack also improve both iSCSI throughput and the maximum network throughput for VMotion.

vSphere 4 utilizes transmit queues to provide 3x improvements in transmit performance for small packet sizes.
Figure 4: Network transmit throughput improvement for vSphere 4 (gains over ESX 3.5 at 1, 4, 8, and 16 VMs)
vSphere 4 supports Large Receive Offload (LRO), a feature that coalesces TCP packets from the same connection to reduce CPU utilization. Using LRO with ESX provides a 40 percent improvement in both throughput and CPU costs.
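LRO's effect can be sketched as a greedy merge: in-order segments from one connection are combined into a larger chunk before being handed up the stack, so per-packet processing is paid once per chunk rather than once per wire segment. A simplified model (the 64KB cap mirrors the usual LRO aggregate limit; real LRO also validates headers, flags, and ordering before merging):

```python
def lro_coalesce(segment_sizes, max_bytes=65536):
    """Greedily merge consecutive TCP segment sizes from one connection
    into larger chunks, capping each merged chunk at `max_bytes`."""
    merged, current = [], 0
    for seg in segment_sizes:
        if current and current + seg <= max_bytes:
            current += seg          # fold segment into the open chunk
        else:
            if current:
                merged.append(current)
            current = seg           # start a new chunk
    if current:
        merged.append(current)
    return merged

# 44 full-size 1460-byte segments collapse into one 64240-byte chunk
print(lro_coalesce([1460] * 44))  # -> [64240]
```

One chunk instead of 44 packets means one trip through the receive path, which is where the CPU savings come from.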
Resource Management Enhancements

VMotion

Performance enhancements in vSphere 4 reduce the time to VMotion a VM by up to 75 percent.

Storage VMotion Performance

Storage VMotion is now fully supported (it was experimental before) and has much improved switchover time. For very I/O-intensive VMs, this improvement can be 100x. Storage VMotion leverages a new and more efficient block copy mechanism called Changed Block Tracking, minimizing CPU and memory resource consumption on the ESX host by up to two times.
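Changed Block Tracking can be pictured as a dirty bitmap over the virtual disk: after the initial copy, subsequent passes transfer only blocks whose bit is set instead of re-reading the whole disk. A schematic sketch; the bitmap representation is illustrative, and the actual tracking format is internal to ESX:

```python
def blocks_to_copy(changed_bitmap):
    """Indices of disk blocks a re-copy pass must transfer when only
    dirty blocks (bit set since the last pass) are considered."""
    return [i for i, dirty in enumerate(changed_bitmap) if dirty]

# A 1000-block disk on which only two blocks changed since the last pass
bitmap = [False] * 1000
bitmap[7] = bitmap[42] = True
print(blocks_to_copy(bitmap))  # -> [7, 42]
```

Copying 2 blocks instead of rescanning 1000 is what makes the iterative copy passes, and therefore the final switchover, so much cheaper.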
Figure 5: Decreased Storage VMotion time in seconds, ESX 3.5 vs. ESX 4 (lower is better)

Figure 6: Improved VMFS performance: 20-VM provisioning time in seconds, ESX 3.5 vs. ESX 4 (lower is better)

Figure 7: Performance enhancements lead to a reduced time to VMotion: elapsed VMotion time in seconds for a 4GB VM, during SPECjbb (active) and after SPECjbb (idle), ESX 3.5 vs. ESX 4 (lower is better)
Figure 8: Time to boot 512 VDI VMs over Fibre Channel (boot-storm time, ESX 3.5 vs. ESX 4)
VM Provisioning

VMFS performance improvements offer more efficient VM creation and cloning. This use case is especially important with vSphere's more ambitious role as a cloud operating system.
Performance Management Enhancements

Enhanced vCenter Server Scalability

As organizations adopt server virtualization at an unprecedented level, the need to manage large-scale virtual data centers is growing significantly. To address this, vCenter Server, included with vSphere 4, has been enhanced to manage up to 300 hosts and 3,000 virtual machines. You also have the ability to link many vCenter Servers in your environment with vCenter Server Linked Mode to manage up to 10,000 virtual machines from a single console.

vCenter Performance Charts Enhancements

Performance charts in vCenter have been enhanced to provide a single view of all performance metrics, such as CPU, memory, disk, and network, without navigating through multiple charts. In addition, the performance charts also include the following improvements:

- Aggregated charts show high-level summaries of resource distribution that are useful for identifying the top consumers.

- Thumbnail views of hosts, resource pools, clusters, and datastores allow for easy navigation to the individual charts.

- Drill-down capability across multiple levels in the inventory helps in isolating the root cause of performance problems quickly.

- Detailed datastore-level views show utilization by file type and unused capacity.
Application Performance

Oracle

VMware testing has shown that, running a resource-intensive OLTP benchmark based on a non-comparable implementation of the TPC-C* workload specification, Oracle DB in an 8-vCPU VM with vSphere 4 achieved 85 percent of native performance. This workload demonstrated 8,900 database transactions per second and 60,000 disk input/outputs per second (IOPS). The results demonstrated in this proof point represent the most I/O-intensive application-based workload ever run in an x86 virtual environment to date.

* The benchmark was a fair-use implementation of the TPC-C business model; these results are not TPC-C compliant results, and not comparable to official TPC-C results. TPC Benchmark is a trademark of the TPC.
Figure 9: Comparison of Oracle DB VM throughput (ESX 4 vs. native) relative to a 2-CPU native configuration, at 2, 4, and 8 processors
The results above were run on a server with only eight physical cores, resulting in an 8-way VM configuration that was not under-committing the host. The slightly less committed four-vCPU configuration ran at 88 percent of native.
SQL Server

Running an OLTP benchmark based on a non-comparable implementation of the TPC-E* workload specification, a SQL Server virtual machine with four virtual CPUs on vSphere 4.0 showed 90 percent efficiency with respect to native. The SQL Server VM with a 500GB database performed 10,500 IOPS and 50Mb/s of network throughput.
Figure 10: Comparison of vSphere SQL Server VM throughput (ESX 4 vs. native) relative to a 1-CPU native configuration, at 1, 2, and 4 CPUs

* The benchmark was a fair-use implementation of the TPC-E business model; these results are not TPC-E compliant results, and not comparable to official TPC-E results. TPC Benchmark is a trademark of the TPC.
SAP

VMware testing demonstrated that running SAP in a VM with vSphere 4 scaled linearly from one to eight vCPUs per VM and achieved 95 percent of native performance on a standard 2-tier SAP benchmark. This multi-tiered application architecture includes the SAP application tier and a back-end SQL Server database instantiated in a single virtual machine.
Figure 11: Comparison of ESX SAP VM throughput (ESX 4 vs. native) relative to a 1-CPU native configuration, at 1, 2, 4, and 8 CPUs
Exchange

Microsoft Exchange Server is one of the most demanding applications in today's datacenters, save the very largest databases being deployed. Previous work on virtual Exchange deployments showed VMware's ability to improve performance over native configurations by designing an Exchange architecture with a greater number of mailbox instances running fewer mailboxes per instance.

With the performance enhancements added to vSphere 4, single-VM Exchange mailbox servers have been demonstrated at up to 8,000 mailboxes per instance. This means that Exchange administrators will have the option of choosing the higher-performing smaller mailbox servers or the more cheaply licensed large mailbox servers.
VMware, Inc., Hillview Avenue, Palo Alto, CA, USA. Copyright 2009 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. VMware products are covered by one or more patents listed at http://www.vmware.com/go/patents. VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions. All other marks and names mentioned herein may be trademarks of their respective companies.
Figure 12: vSphere performance enhancements with Microsoft Exchange: ESX 4 Exchange mailbox count (users, in thousands) and 95th-percentile latency (ms) at 1, 2, 4, 6, and 8 VMs; at the high end, #vCPUs > #pCPUs
Summary

VMware innovations continue to make VMware vSphere 4 the industry standard for computing in data centers of all sizes and across all industries. The numerous performance enhancements in VMware vSphere 4 enable organizations to get even more out of their virtual infrastructure and further reinforce VMware's role as the industry leader in virtualization.

vSphere represents dramatic advances in performance compared to VMware Infrastructure 3, ensuring that even the most resource-intensive and scale-out applications, such as large databases and Microsoft Exchange email systems, can run on private clouds powered by vSphere.
References
Performance Evaluation of AMD RVI Hardware Assist
http://www.vmware.com/pdf/RVI_performance.pdf
Performance Evaluation of Intel EPT Hardware Assist
http://www.vmware.com/pdf/Perf_ESX_Intel-EPT-eval.pdf