Designing a Virtualization Architecture: A Best Practices Approach Greg Shields, MVP – Terminal Services Author / Speaker / Instructor / Consultant / All Around Good Guy [email protected]


Page 1: Designing a Virtualization Architecture: A Best Practices - Rocky

Designing a Virtualization Architecture: A Best Practices Approach

Greg Shields, MVP – Terminal Services
Author / Speaker / Instructor / Consultant / All Around Good Guy

[email protected]

Page 2: Designing a Virtualization Architecture: A Best Practices - Rocky

Join Us @ TechMentor Events

• TechMentor Las Vegas – Three weeks!
  – Early bird registration still available…
• TechMentor 2009
  – Las Vegas in the Spring
  – Orlando in the Fall

Virtualization
Automation & PowerShell
Proactive Windows Management
Becoming an IT Architect
Windows Security, Auditing, and Compliance
Exchange Server Administration
Windows Fundamentals
Windows Technologies

http://www.techmentorevents.com

Page 3: Designing a Virtualization Architecture: A Best Practices - Rocky

Fear the Worst

• The National Academy of Archives and Records states that 96% of companies that lose access to their data centers for 10 days or longer are out of business within a year.
• A study by McGladrey and Pullen shows that 43% of companies experiencing disasters will never recover.
• Tape restorations can take days, and tape failures exacerbate an already critical problem.
  – 72+ hours to restore 1.5 TB of office files

Page 4: Designing a Virtualization Architecture: A Best Practices - Rocky

44% of Virtualization Deployments Fail

• According to a CA announcement from 2007.
• Inability to quantify ROI
• Insufficient administrator training
• Success =
  – Measure performance
  – Diligent Inventory and Load Distribution
  – Thorough Investigation of Technology

Page 5: Designing a Virtualization Architecture: A Best Practices - Rocky

The Lifecycle of a Virtualization Architecture

• Step -1: Hype Recognition & Education
• Step 0: Assessment
• Step 1: Purchase & Implementation
• Step 2: P2V
• Step 3: Backups Expansion
• Step 4: DR Implementation

Page 6: Designing a Virtualization Architecture: A Best Practices - Rocky

Step 0: Assessment

Page 7: Designing a Virtualization Architecture: A Best Practices - Rocky

The Virtualization Assessment

• Successful virtualization rollouts need a virtualization assessment.
  – You need to analyze your environment before you act.
• Virtualization assessment should include:
  – Inventory of servers
  – Inventory of attached peripherals
  – Performance characteristics of servers
  – Analysis of performance characteristics
  – Analysis of hardware needs to support virtualized servers
  – Backups Analysis
  – Disaster Recovery Analysis (Hot vs. warm vs. cold)
  – Initial virtual resource assignment

Page 8: Designing a Virtualization Architecture: A Best Practices - Rocky

(Obvious) Candidates for Virtualization

• Systems with minimal processor utilization
• Systems with minimal RAM requirements
  – We too often add too much RAM in a server.
• Systems that do not require large quantities of drive storage*
• Redundant or warm-spare servers
• Occasional- or limited-use servers
• Systems where many partially-trusted people need console access

Page 9: Designing a Virtualization Architecture: A Best Practices - Rocky

Not Candidates for Virtualization

• Systems with constant and high processor utilization or RAM usage
• Systems with peripherals
  – Serial / parallel / USB / External SCSI / License Keyfobs / Scanners / Bar Code Readers
• Systems with exceptionally high network use
  – Gigabit networking requirements
• Systems with specialized hardware requirements
  – Hardware appliances / OEM / Unique configs

Page 10: Designing a Virtualization Architecture: A Best Practices - Rocky

Assessing Performance

• In the early days of virtualization, we used to say…
  – “Exchange Servers can’t be virtualized”
  – “Terminal Servers can’t be virtualized”
  – “You’ll never virtualize a SQL box”
• Today’s common knowledge is that the decision relates entirely to performance.
  – Thus, before you can determine which servers to virtualize, you need to understand their performance.
  – Measure that performance over time.
  – Compile results into reports and look for deviations from nominal activity.
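One way to "look for deviations from nominal activity" is to compare each sample of a counter with the mean and spread of the whole series. The sketch below is illustrative only: the function name, the sample values, and the 2-sigma cutoff are assumptions, not something the deck prescribes.

```python
from statistics import mean, pstdev

def find_deviations(samples, sigmas=2.0):
    """Return (index, value) pairs lying more than `sigmas` population
    standard deviations away from the mean of `samples`."""
    if len(samples) < 2:
        return []
    avg = mean(samples)
    spread = pstdev(samples)
    if spread == 0:
        # A perfectly flat series has no deviations to report.
        return []
    return [(i, v) for i, v in enumerate(samples) if abs(v - avg) > sigmas * spread]

# Hypothetical hourly "% Processor Time" samples for one server.
cpu_samples = [12, 14, 11, 13, 12, 15, 13, 12, 85, 14, 13, 12]
print(find_deviations(cpu_samples))  # the spike at index 8 is flagged
```

A single spike like this one may be a backup job or a batch run; what matters for candidacy is whether such spikes are rare or routine.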

Page 11: Designing a Virtualization Architecture: A Best Practices - Rocky

Useful Performance Counters

Category        Performance Metric           Example Threshold
Disk            % Disk Time                  > 50%
Memory          Available MBytes             Below Baseline
Memory          Pages / Sec                  > 20
Page File       % Usage                      > 70%
Physical Disk   Current Disk Queue Length    > 18
Processor       % Processor Time             > 40%
System          Processor Queue Length       > 5.4
System          Context Switches / Sec       > 5000
System          Threads                      > 2000

These are examples (starting points). Your actual thresholds may be different.
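The example thresholds above can be applied mechanically once average counter values have been collected. The following sketch assumes the averages are already available as a dictionary keyed by (category, counter); the key spelling and the `exceeded` helper are illustrative choices, and the sample values are the IMAGE-WIN figures from the assessment table in this deck.

```python
# Example thresholds from the slide, keyed by (category, counter).
# "Memory / Available MBytes" is left out because its threshold is
# "below baseline", which needs a per-server baseline, not a fixed number.
THRESHOLDS = {
    ("Disk", "% Disk Time"): 50,
    ("Memory", "Pages/sec"): 20,
    ("Page File", "% Usage"): 70,
    ("Physical Disk", "Current Disk Queue Length"): 18,
    ("Processor", "% Processor Time"): 40,
    ("System", "Processor Queue Length"): 5.4,
    ("System", "Context Switches/sec"): 5000,
    ("System", "Threads"): 2000,
}

def exceeded(averages):
    """Return the sorted counters whose average is above its threshold.
    Counters with no data (missing or None) are skipped."""
    hits = []
    for key, limit in THRESHOLDS.items():
        value = averages.get(key)
        if value is not None and value > limit:
            hits.append(key)
    return sorted(hits)

# Averages for the IMAGE-WIN server, taken from the assessment table.
image_win = {
    ("Disk", "% Disk Time"): 67,
    ("Memory", "Pages/sec"): 7,
    ("Page File", "% Usage"): 1,
    ("Physical Disk", "Current Disk Queue Length"): 1,
    ("Processor", "% Processor Time"): 18,
    ("System", "Processor Queue Length"): 0,
    ("System", "Context Switches/sec"): 5553,
    ("System", "Threads"): 2540,
}
for category, counter in exceeded(image_win):
    print(f"{category} / {counter} is above its example threshold")
```

For IMAGE-WIN this flags disk time, context switches, and thread count, which is consistent with the table rating it "Probable" rather than "Likely".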

Page 12: Designing a Virtualization Architecture: A Best Practices - Rocky

The Virtualization Assessment

Columns, in order: Server; Disk / % Disk Time (%Disk); Memory / Available MBytes (AvailMB); Memory / Pages/sec (Pg/s); Page File / % Usage (PF%); Physical Disk / Current Disk Queue Length (DiskQ); Processor / % Processor Time (%CPU); System / Processor Queue Length (CPUQ); System / Context Switches/sec (CtxSw/s); System / Threads (Threads); Active Sessions, Where Applicable (Sess, "-" where not applicable); Virtualization Candidacy Index (Candidacy); Initial Assigned VProcs (VProcs); Initial Assigned VRAM in G (VRAM(G)).

Server      %Disk AvailMB Pg/s PF%  DiskQ %CPU CPUQ CtxSw/s Threads Sess Candidacy VProcs VRAM(G)
ABCS        0     598     19   2    N/D   7    0    2435    712     -    Likely    1      1.5
ABCSDC0     1     553     2    2    0     1    0    372     520     -    Likely    1      0.5
ABCSTM      2     1525    0    0    0     1    0    302     465     -    Likely    1      0.5
ADS         N/D   236     0    0    0     0    0    85      259     -    Likely    1      0.5
BDC         4     108     3    11   0     2    0    440     577     -    Likely    1      0.5
C3APPSVR    N/A   N/A     N/A  N/A  N/A   N/A  N/A  N/A     N/A     -    N/A       1      N/A
CTX-Surf    2     1319    1    0    0     0    0    557     528     -    Likely    1      1
DC1         N/D   544     20   1    0     13   5    1027    394     -    Likely    1      0.5
DIRECTOR    N/D   84      37   14   0     7    0    2003    587     -    Probable  1      2
EX1         N/D   350     1    4    0     1    0    858     404     -    Likely    1      1
EX2K3       149   359     11   3    1     2    0    2296    927     -    Probable  1      2
EZTELLER    7     143     3    3    0     1    0    458     509     -    Likely    1      2
IFS         N/D   348     0    1    0     0    0    99      311     -    Likely    1      0.5
IMAGE-WIN   67    469     7    1    1     18   0    5553    2540    -    Probable  1      2
ITIAPP02    N/D   1292    1    2    0     2    0    2300    823     -    Likely    1      2
ITIPrime    N/D   34      4    12   0     2    0    830     468     -    Likely    1      1.5
License     1     140     1    39   0     2    1    417     490     -    Likely    1      0.5
PFS1        N/D   330     1    1    0     1    0    231     338     -    Likely    1      0.5
SC-MGR1R    0     255     1    N/D  N/D   N/D  0    1251    490     -    Likely    1      1
SURFCONTROL N/D   32      1    37   0     73   8    338     403     -    Probable  1      1
TESTLPW     N/D   129     0    4    0     1    10   896     489     -    Probable  1      0.5
TSBANK0     0     2521    7    5    0     2    0    5050    1342    7    Probable  1      2
TSBANK1     3     1216    12   10   0     9    0    3381    1237    7    Probable  1      2
TSBANK15    0     2631    7    4    0     8    0    4386    1183    7    Probable  1      2
TSBANK17    0     2652    7    4    0     14   0    4329    1240    7    Probable  1      2
TSBANK2     1     1272    11   12   0     22   0    3314    1168    7    Probable  1      2
TSBANK3     4     1310    5    3    0     38   2    2589    887     4    Probable  1      2
TSBANK4     4     1297    4    3    0     16   2    2702    883     4    Probable  1      2
TSBANK6     7     1191    9    9    0     7    2    3271    1216    7    Probable  1      2
TSWIN1      3     1292    5    2    0     5    1    1689    884     4    Likely    1      2
TSWIN2      4     1272    4    2    N/D   7    1    1677    848     4    Likely    1      2
VIEMCAPP    5     2111    0    1    N/D   0    0    456     541     -    Likely    1      2

Total RAM Count: 40

Page 13: Designing a Virtualization Architecture: A Best Practices - Rocky

Gathering Performance

• PerfMon is really the only mechanism to gather these statistics from servers.
  – But PerfMon can be challenging to use.
• Other products are available to assist...
  – Microsoft Assessment & Planning Solution Accelerator
  – VMware Consolidation & Capacity Planner
  – Platespin PowerRecon
  – CiRBA
  – PerfMan

Page 14: Designing a Virtualization Architecture: A Best Practices - Rocky

Step 1: Purchase & Implementation

Page 15: Designing a Virtualization Architecture: A Best Practices - Rocky

Consolidation = Cost Savings

• Small Server: $6,000 at 1:1 = $6,000 per Server
  – Large Marginal Cost Increases per Additional Server
• Large Server ($15,000) + Virtualization ($5,000) = $20,000
  – 8:1 = $2,500 per Server
  – 15:1 = $1,333 per Server
  – 20:1 = $1,000 per Server
  – Smaller Marginal Cost Increases
• + Power + Cooling + Provisioning Labor
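The per-server figures on this slide follow from dividing the $20,000 host cost (large server plus virtualization) by the consolidation ratio. A small sketch of that arithmetic, with a hypothetical helper name, and ignoring power, cooling, and provisioning labor:

```python
def cost_per_server(hardware_cost, virtualization_cost, consolidation_ratio):
    """Up-front cost per hosted server at a given consolidation ratio."""
    return (hardware_cost + virtualization_cost) / consolidation_ratio

# Slide figures: $15,000 large server + $5,000 virtualization.
for ratio in (8, 15, 20):
    print(f"{ratio}:1 -> ${cost_per_server(15_000, 5_000, ratio):,.0f} per server")
```

Compared with $6,000 for each small physical server at 1:1, the cost per server keeps falling as more virtual machines share the same host.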

Page 16: Designing a Virtualization Architecture: A Best Practices - Rocky

Virtualization Options

• Three types of Virtualization
  – Entire System Virtualization: Virtual O/S is an entire system that has no awareness of underlying host system.
    • VMware
    • Microsoft Virtual Server
  – OS Virtualization: Software runs on system as single file. Requires client.
    • Parallels Virtuozzo
  – Paravirtualization: Similar to Hardware Virtualization, but Virtual O/S is “aware” it is virtualized.
    • Microsoft Hyper-V
    • Xen / Citrix XenSource

Page 17: Designing a Virtualization Architecture: A Best Practices - Rocky

Hardware Virtualization (Type-1)

• ESX
  – Hybrid hypervisor and host OS
  – Device drivers in the hypervisor
  – Emulation (translation from emulated driver to real driver)
  – High cost, high availability, high performance

Page 18: Designing a Virtualization Architecture: A Best Practices - Rocky

Paravirtualization

• Hyper-V, Citrix XenSource
  – Host OS becomes primary partition above hypervisor.
  – Device drivers in the primary partition
  – Paravirtualization (no emulation for “enlightened” VMs)
  – Low cost, moderate-to-high availability, high performance

Page 19: Designing a Virtualization Architecture: A Best Practices - Rocky

Hardware Virtualization (Type-2)

• Microsoft Virtual Server
  – Hypervisor above host OS.
  – Device drivers in hypervisor
  – Emulation (translation from emulated driver to real driver)
  – Low cost, low availability, low performance

Page 20: Designing a Virtualization Architecture: A Best Practices - Rocky

OS Virtualization

• Parallels Virtuozzo
  – Delta-based.
  – No hypervisor. V-layer processes requests.
  – All real device drivers hosted on host OS
  – Moderate cost, moderate availability, very high performance

Page 21: Designing a Virtualization Architecture: A Best Practices - Rocky

Step 2: P2V

Page 22: Designing a Virtualization Architecture: A Best Practices - Rocky

P2V Isn’t Exciting Any More

• After environment stand-up, P2V process converts physical machines to virtual ones.
  – A “ghost” + a “driver injection”
• Numerous applications can do this in one step.
  – These days, P2V process is commodity.
  – Everyone has their own version.
  – Some are faster. Some much slower. Paid options == faster.

Page 23: Designing a Virtualization Architecture: A Best Practices - Rocky

P2V, P2V-DR

• P2V
  – SCVMM, VMware VI/Converter, Acronis, Leostream, others.
• P2V-DR
  – Similar to P2V, but with interim step of image creation/storage.
  – “Poor-man’s DR”

Page 24: Designing a Virtualization Architecture: A Best Practices - Rocky

P2V-DR Uses

• P2V-DR can be leveraged for medium-term storage of server images
  – Useful when DR site does not have hot backup capability or requirements
  – Regularly create images of physical servers, but only store those images rather than load to virtual environment
  – Cheaper-to-maintain DR environment
    • Not fast.
    • Not easy.
    • Not completely reliable.
    • …but essentially cost-free.

Page 25: Designing a Virtualization Architecture: A Best Practices - Rocky

Step 3: Backups Expansion

Page 26: Designing a Virtualization Architecture: A Best Practices - Rocky

Backup Terminology

• File-Level Backup
  – Backup Agent in the Virtual Machine
• Image-Level Backup
  – Backup Agent on the Virtual Host
• Quiescing
  – Quieting the file system to prep for a backup
• O/S Crash Consistency
  – Capability for post-restore O/S functionality
• Application Crash Consistency
  – Capability for post-restore application functionality

Page 27: Designing a Virtualization Architecture: A Best Practices - Rocky

Types of Backups

• Three types of Backups
  – Backing up the host system
    • May be necessary to maintain host configuration
    • But often, not completely necessary
    • The fastest fix for a broken host is often a complete rebuild
  – Backing up Virtual Disk Files
    • Fast and can be done from a single host-based backup client
    • Challenging to do file-level restore
  – Backing up VMs from inside the VM
    • Slower and requires backup clients in every VM.
    • Resource intensive on host
    • Capable of doing file-level restores

Page 28: Designing a Virtualization Architecture: A Best Practices - Rocky

The Problem with Transactional Databases

• O/S Crash Consistency is easy to obtain. Just quiesce the file system before beginning the backup.
• Application Crash Consistency is much harder.
  – Transactional databases like AD, Exchange, SQL don’t quiesce when the file system does.
  – Need to stop these databases before quiescing.
  – Need an agent in the VM that handles DB quiescing.
  – Leverage VSS.
• Restoration without crash consistency will lose data. DB restores into “inconsistent” state.

Page 29: Designing a Virtualization Architecture: A Best Practices - Rocky

The Problem with Transactional Databases

• When considering backups of virtual machines, need to consider file-level backups and image-level backups.
  – File-level backups provide individual file restorability and transactional database crash consistency.
  – Image-level backups provide whole-server restorability.
  – Not all image-level backups provide app crash consistency.
• Solutions exist that call Windows VSS to quiesce apps and the file system prior to snapping a backup.
  – Compelling argument: VSS = Microsoft, Hyper-V = Microsoft.

Page 30: Designing a Virtualization Architecture: A Best Practices - Rocky

Step 4: DR Implementation

Page 31: Designing a Virtualization Architecture: A Best Practices - Rocky

DR, meet Virtualization…

• Early all-physical attempts at DR were cost-prohibitive and operationally complex.
  – Identical server inventory at primary and backup site.
  – Management cost of identical server configuration. Change management costs prohibitive.
• Virtualization eliminates many previous barriers.
  – Virtual servers are chassis independent.
  – Image-level backup == image-level restore.
  – Hot sites one of many options – cold & warm sites.
• Numerous cost-effective solutions available.
  – Don’t believe the hype.
  – Make decisions based on need.

Page 32: Designing a Virtualization Architecture: A Best Practices - Rocky

Disaster Recovery Terminology

• What is Disaster Recovery?
  – Disaster Recovery intends to provide continuity of business services after a critical event.
  – Disaster Recovery is invoked after the large-scale loss of primary business services.
  – DR is not the restoration of a critical server.
  – DR is not the restoration of a critical business service.
• Why the distinction?
  – DR solutions do not resolve daily operational issues.
  – Often, failback is challenging.

Page 33: Designing a Virtualization Architecture: A Best Practices - Rocky

Disaster Recovery Terminology

• RTO – Recovery Time Objective
  – Time period between a failure and when a failed system is restored to full operational capability.
• RPO – Recovery Point Objective
  – Quantity of data that can acceptably be lost as part of a failure.
• MTTR – Mean-Time To Restore
  – The average amount of time expected to bring a system back to full operational capability.
• SLA – Service Level Agreement
  – Agreement between IT and business on restoration metrics, what to restore, priorities, and ownership.
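The definitions above translate directly into simple time arithmetic. The sketch below is illustrative only: the function names and timestamps are hypothetical, and it measures RPO as elapsed time since the last recoverable copy, which is a common stand-in for "quantity of data" rather than something the slide states.

```python
from datetime import datetime, timedelta
from statistics import mean

def mttr(restore_durations):
    """Mean time to restore: the average of observed restore durations."""
    return timedelta(seconds=mean(d.total_seconds() for d in restore_durations))

def meets_rto(failure_time, restored_time, rto):
    """True if the outage, from failure to full restoration, fits the RTO."""
    return restored_time - failure_time <= rto

def meets_rpo(last_good_copy_time, failure_time, rpo):
    """True if the window of lost work, measured as time since the last
    recoverable copy, fits the RPO."""
    return failure_time - last_good_copy_time <= rpo

# Hypothetical incident: nightly backup at 22:00, failure at 02:00,
# service fully restored at 08:30.
last_backup = datetime(2008, 8, 31, 22, 0)
failure = datetime(2008, 9, 1, 2, 0)
restored = datetime(2008, 9, 1, 8, 30)

print(meets_rto(failure, restored, rto=timedelta(hours=8)))     # 6.5 h outage
print(meets_rpo(last_backup, failure, rpo=timedelta(hours=1)))  # 4 h of lost work
print(mttr([timedelta(hours=3), timedelta(hours=5), timedelta(hours=7)]))
```

In this made-up incident the recovery meets an 8-hour RTO but misses a 1-hour RPO, which is the kind of gap an SLA discussion should surface before a disaster rather than after.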

Page 34: Designing a Virtualization Architecture: A Best Practices - Rocky

Disaster Recovery Terminology

• Hot site
  – Servers up and operational at remote site at all times.
• Warm site
  – Servers pre-provisioned at remote site. Tasks to complete for failover to occur.
• Cold site
  – Empty site and servers on retainer awaiting DR event.

Page 35: Designing a Virtualization Architecture: A Best Practices - Rocky

Four DR Tiers

• Continuous Availability
  – RTO: Immediate
  – RPO: Immediate
  – Examples: Business Critical DB’s, Transaction processing appliances
• Immediate Availability
  – RTO: Minutes to Hours
  – RPO: Minutes to Hours
  – Examples: Infrastructure services, support services, messaging services
• Fast Recovery
  – RTO: Hours to Days
  – RPO: Hours to Days
  – Examples: Internal applications, analytic applications
• Eventual Recovery
  – RTO: Days to a Week or More
  – RPO: Days to a Week or More
  – Examples: Development & test environments, stateless applications.
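Placing a service into one of these tiers amounts to a lookup on its required RTO. The tier names below come from the table; the numeric cutoffs (one minute, four hours, three days) are assumptions chosen only to make the sketch concrete, since the slide gives ranges such as "Minutes to Hours" rather than exact limits.

```python
from datetime import timedelta

# Tier names are from the slide; the cutoffs are assumed, not sourced.
TIERS = [
    (timedelta(minutes=1), "Continuous Availability"),
    (timedelta(hours=4), "Immediate Availability"),
    (timedelta(days=3), "Fast Recovery"),
]

def dr_tier(required_rto):
    """Return the first tier whose assumed cutoff covers the required RTO."""
    for cutoff, name in TIERS:
        if required_rto <= cutoff:
            return name
    return "Eventual Recovery"

print(dr_tier(timedelta(0)))        # e.g. a business-critical database
print(dr_tier(timedelta(hours=2)))  # e.g. messaging services
print(dr_tier(timedelta(days=10)))  # e.g. a test environment
```

In practice the cutoffs would come out of the SLA conversation with the business, and RPO would need the same treatment alongside RTO.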


Page 37: Designing a Virtualization Architecture: A Best Practices - Rocky

Four DR Tiers

• $ - Snap & Pray
  – Leverage no-cost or low-cost tools to snapshot image-level backups of VMs.
  – Cold site and replacement equipment on retainer.
  – Store images to tape. Rotate tapes off-site.
  – Restoration:
    • Activate cold site
    • Procure reserved replacement equipment
    • Procure tapes and tape device
    • Restore images to replacement equipment
    • Resolve database (and some O/S) inconsistencies

Page 38: Designing a Virtualization Architecture: A Best Practices - Rocky

Four DR Tiers

• $$ - Warm Snap
  – Leverage no-cost or low-cost tools to create image-level backups of VMs.
  – Connected warm site with data storage location.
  – Transfer images to off-site data storage location
  – Restoration:
    • Procure or spin up reserved replacement equipment
    • Restore images from data storage to replacement equipment
    • Resolve database (and some O/S) inconsistencies

Disk-to-disk backups over the WAN increase backup time, but significantly reduce restore time.
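How much the WAN lengthens the backup window can be estimated from image size and link speed. This is a rough sketch under stated assumptions: the helper name, the 70% usable-bandwidth factor, and the example sizes are all hypothetical, and it ignores compression, deduplication, and incremental transfers.

```python
def transfer_hours(image_gb, link_mbps, efficiency=0.7):
    """Estimated hours to move `image_gb` decimal gigabytes over a link of
    `link_mbps` megabits per second, using only `efficiency` of the link."""
    bits = image_gb * 8 * 1000**3
    usable_bits_per_second = link_mbps * 1000**2 * efficiency
    return bits / usable_bits_per_second / 3600

# Hypothetical case: a 40 GB VM image over a 10 Mbps WAN link.
print(f"{transfer_hours(40, 10):.1f} hours")
```

At roughly half a day per 40 GB image on such a link, a full set of server images will not fit in a nightly window, which is why the next tier leans on incremental storage-to-storage replication.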

Page 39: Designing a Virtualization Architecture: A Best Practices - Rocky

Four DR Tiers

• $$$ - Inconsistent Storage-to-Storage
  – Warm site. Storage-to-storage replication instantiated between sites.
  – Storage data automatically replicated to remote site.
  – Greater support for incrementals. Less WAN usage.
  – Restoration:
    • Procure or spin up reserved replacement equipment
    • Attach virtual machines to replacement equipment and hit the “green VCR button”.
    • Resolve database (and some O/S) inconsistencies

SAN replication is often not aware of quiescing, so this solution can be problematic.

Page 40: Designing a Virtualization Architecture: A Best Practices - Rocky

Four DR Tiers

• $$$$ - Real-time Replication
  – Warm or hot site. Storage-to-storage replication instantiated between sites.
  – 3rd Party tools used for image-to-image transfer.
    • In-VM for transactional database quiescing.
    • On-host for all other machines.
  – Roll-back and roll-forward capabilities
  – Restoration:
    • Hit the “green VCR button”
    • (or, auto-failover…)

Tools like DoubleTake, DoubleTake for Virtual Systems, esxReplicator, DataCore SANMelody enable real-time and consistent DR between sites.

Page 41: Designing a Virtualization Architecture: A Best Practices - Rocky

• Questions?
• Comments?
• Sarcastic Remarks?