
Virtualization

Ian Pratt 

XenSource Inc. and University of Cambridge 

Keir Fraser, Steve Hand, Christian Limpach and many others…


Outline

Virtualization Overview

Xen Architecture

New Features in Xen 3.0

VM Relocation

Xen Roadmap

Questions


Virtualization Overview

Single OS image: OpenVZ, Vservers, Zones

Group user processes into resource containers

Hard to get strong isolation

Full virtualization: VMware, VirtualPC, QEMU

Run multiple unmodified guest OSes

Hard to efficiently virtualize x86

Para-virtualization: Xen

Run multiple guest OSes ported to a special arch

Arch Xen/x86 is very close to normal x86


Virtualization in the Enterprise

Consolidate under-utilized servers

Avoid downtime with VM Relocation

Dynamically re-balance workload to guarantee application SLAs

Enforce security policy


Xen 2.0 (5 Nov 2005)

Secure isolation between VMs

Resource control and QoS

Only guest kernel needs to be ported

User-level apps and libraries run unmodified

Linux 2.4/2.6, NetBSD, FreeBSD, Plan9, Solaris

Execution performance close to native

Broad x86 hardware support

Live Relocation of VMs between Xen nodes


Para-Virtualization in Xen

Xen extensions to x86 arch

Like x86, but Xen invoked for privileged ops

Avoids binary rewriting

Minimize number of privilege transitions into Xen

Modifications relatively simple and self-contained

Modify kernel to understand virtualised env.

Wall-clock time vs. virtual processor time

• Desire both types of alarm timer

Expose real resource availability

• Enables OS to optimise its own behaviour


Xen 2.0 Architecture

[Architecture diagram: the Xen virtual machine monitor (control interface, safe hardware interface, event channels, virtual CPU, virtual MMU) runs directly on the hardware (SMP, MMU, physical memory, Ethernet, SCSI/IDE). VM0 runs XenLinux with native device drivers, back-ends, and the device manager and control software; VM1 and VM2 run XenLinux and VM3 runs Solaris, each with front-end device drivers and unmodified user software on top.]


Xen 3.0 Architecture

[Architecture diagram: as in Xen 2.0, but the hypervisor now spans x86_32, x86_64, and IA64, supports SMP, AGP, ACPI, PCI, and VT-x, and VM3 can run an unmodified guest OS (e.g. WinXP) via hardware virtualization, still with front-end device drivers and unmodified user software.]


I/O Architecture

Xen IO-Spaces delegate guest OSes protected access to specified h/w devices

Virtual PCI configuration space

Virtual interrupts

(Need IOMMU for full DMA protection)

Devices are virtualised and exported to other VMs via Device Channels

Safe asynchronous shared memory transport (sketched below)

'Backend' drivers export to 'frontend' drivers

Net: use normal bridging, routing, iptables

Block: export any blk dev e.g. sda4, loop0, vg3

(Infiniband / "Smart NICs" for direct guest IO)
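The shared-memory transport behind device channels can be pictured with a small user-space sketch. The structure names and layout below are illustrative, not the actual interface from Xen's public ring headers; the sketch shows the single-producer/single-consumer request and response rings that a frontend and backend share. In Xen, each index update would be followed by an event-channel notification rather than polling.

```c
/* Minimal sketch of a Xen-style device channel: request/response
 * rings in shared memory. Names and layout are illustrative. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define RING_SIZE 8            /* must be a power of two */

struct request  { uint32_t id; uint32_t sector; };
struct response { uint32_t id; int32_t status; };

struct ring {
    volatile uint32_t req_prod, req_cons;   /* frontend produces requests */
    volatile uint32_t rsp_prod, rsp_cons;   /* backend produces responses */
    struct request  req[RING_SIZE];
    struct response rsp[RING_SIZE];
};

/* Frontend: queue a request; in Xen this would be followed by an
 * event-channel notification to wake the backend. */
static int frontend_submit(struct ring *r, uint32_t id, uint32_t sector)
{
    if (r->req_prod - r->req_cons == RING_SIZE)
        return -1;                           /* ring full */
    struct request *rq = &r->req[r->req_prod % RING_SIZE];
    rq->id = id;
    rq->sector = sector;
    __sync_synchronize();                    /* publish payload before index */
    r->req_prod++;
    return 0;
}

/* Backend: drain requests and push responses. */
static void backend_poll(struct ring *r)
{
    while (r->req_cons != r->req_prod) {
        struct request rq = r->req[r->req_cons % RING_SIZE];
        r->req_cons++;
        struct response *rs = &r->rsp[r->rsp_prod % RING_SIZE];
        rs->id = rq.id;
        rs->status = 0;                      /* pretend the I/O succeeded */
        __sync_synchronize();
        r->rsp_prod++;
    }
}

int main(void)
{
    struct ring r;
    memset(&r, 0, sizeof(r));
    frontend_submit(&r, 1, 42);
    backend_poll(&r);
    while (r.rsp_cons != r.rsp_prod) {
        struct response rs = r.rsp[r.rsp_cons++ % RING_SIZE];
        printf("response id=%u status=%d\n", (unsigned)rs.id, (int)rs.status);
    }
    return 0;
}
```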


System Performance

[Figure: relative performance (native Linux = 1.0) for SPEC INT2000 score, Linux build time, OSDB-OLTP throughput, and SPEC WEB99 score; benchmark suite running on Linux (L), Xen (X), VMware Workstation (V), and UML (U).]


TCP results

[Figure: TCP bandwidth, Tx and Rx at MTU 1500 and MTU 500, normalized to native, on Linux (L), Xen (X), VMware Workstation (V), and UML (U).]


Scalability

[Figure: aggregate score for 2, 4, 8, and 16 simultaneous SPEC WEB99 instances on Linux (L) and Xen (X).]


x86_32

Xen reserves top of VA space

Segmentation protects Xen from kernel

System call speed unchanged

Xen 3 now supports PAE for >4GB mem

[Diagram: 4GB virtual address space; Xen occupies the top (ring 0, S), the kernel sits below it down to 3GB (ring 1, S), and user space runs from 3GB down to 0GB (ring 3, U).]


x86_64

Large VA space makes life a lot easier, but:

No segment limit support

Need to use page-level protection to protect hypervisor

[Diagram: 2^64 virtual address space; user space (U) occupies the bottom up to 2^47, the non-canonical region in the middle is reserved, and the kernel (S) and Xen sit in the top half above 2^64 - 2^47.]


x86_64

Run user-space and kernel in ring 3 using different pagetables

Two PGDs (PML4s): one with user entries; one with user plus kernel entries

System calls require an additional syscall/ret via Xen

Per-CPU trampoline to avoid needing GS in Xen

[Diagram: Xen in ring 0; kernel and user both in ring 3 with separate pagetables; syscall/sysret transitions bounce through Xen.]


x86 CPU virtualization

Xen runs in ring 0 (most privileged)

Ring 1/2 for guest OS, 3 for user-space

GPF if guest attempts to use privileged instr

Xen lives in top 64MB of linear addr space

Segmentation used to protect Xen, as switching page tables is too slow on standard x86

Hypercalls jump to Xen in ring 0

Guest OS may install ‘fast trap’ handler

Direct user-space to guest OS system calls

MMU virtualisation: shadow vs. direct-mode
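As a rough illustration of how a hypercall reaches Xen from a paravirtualized x86_32 guest: early Xen used a software interrupt (int 0x82) with the hypercall number in EAX and arguments in registers, and later versions moved to a hypercall transfer page. The stub below is a hedged sketch, not code lifted from XenLinux; the hypercall numbering is illustrative, and it can only meaningfully execute inside a Xen guest kernel.

```c
/* Sketch of a paravirtualized x86_32 guest invoking Xen: a software
 * interrupt traps to ring 0 with the hypercall number in EAX and
 * arguments in EBX/ECX. Constants are illustrative; later Xen
 * versions replace int $0x82 with a hypercall transfer page. */
#define HYPERCALL_MMU_UPDATE 1   /* illustrative numbering */

static inline long hypercall2(unsigned int op, unsigned long a1,
                              unsigned long a2)
{
    long ret;
    __asm__ __volatile__ (
        "int $0x82"              /* trap into the hypervisor */
        : "=a" (ret)
        : "0" (op), "b" (a1), "c" (a2)
        : "memory");
    return ret;
}

/* Usage (guest kernel context only):
 *   hypercall2(HYPERCALL_MMU_UPDATE, req_addr, count);
 */
```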


MMU Virtualization: Direct-Mode

[Diagram: in direct mode the guest OS reads its virtual → machine page tables directly; guest writes are trapped and validated by the Xen VMM before the hardware MMU sees them.]


Para-Virtualizing the MMU

Guest OSes allocate and manage own PTs

Hypercall to change PT base

Xen must validate PT updates before use

Allows incremental updates, avoids revalidation

Validation rules applied to each PTE:

1. Guest may only map pages it owns*

2. Pagetable pages may only be mapped RO

Xen traps PTE updates and emulates, or 'unhooks' the PTE page for bulk updates
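A sketch of how a guest might batch page-table updates under this scheme. The queueing helper and the stub standing in for the mmu_update hypercall are our own simplification of Xen's (ptr, val) update interface; the stub just prints what Xen would validate (rules 1 and 2 above) and apply.

```c
/* Sketch of batched direct-mode PTE updates: the guest queues
 * (machine address, new value) pairs and flushes them in one
 * hypercall, amortizing the privilege transition. Simplified. */
#include <stdint.h>
#include <stdio.h>

struct mmu_update {
    uint64_t ptr;   /* machine address of the PTE being updated */
    uint64_t val;   /* new PTE contents, validated by Xen       */
};

#define MMU_BATCH 16

static struct mmu_update queue[MMU_BATCH];
static unsigned int queued;

/* Stand-in for the real hypercall: Xen would check that the guest
 * owns each target frame and that page-table pages stay mapped
 * read-only before applying an entry. */
static long xen_mmu_update(const struct mmu_update *req, unsigned int count)
{
    for (unsigned int i = 0; i < count; i++)
        printf("apply PTE @%#llx = %#llx\n",
               (unsigned long long)req[i].ptr,
               (unsigned long long)req[i].val);
    return 0;
}

static void queue_pte_update(uint64_t pte_maddr, uint64_t new_val)
{
    queue[queued].ptr = pte_maddr;
    queue[queued].val = new_val;
    if (++queued == MMU_BATCH) {      /* flush when the batch fills */
        xen_mmu_update(queue, queued);
        queued = 0;
    }
}

static void flush_pte_updates(void)
{
    if (queued) {
        xen_mmu_update(queue, queued);
        queued = 0;
    }
}

int main(void)
{
    for (int i = 0; i < 3; i++)       /* map three pages, then flush */
        queue_pte_update(0x100000 + 8u * i,
                         ((uint64_t)(0x2000 + i) << 12) | 0x7);
    flush_pte_updates();
    return 0;
}
```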


Writeable Page Tables: 1 - Write fault

[Diagram: the first guest write to a page-table page faults into the Xen VMM; guest reads still go directly to the virtual → machine tables.]


Writeable Page Tables: 2 - Emulate?

[Diagram: Xen decides whether to emulate the faulting write; if yes, the single PTE update is validated and applied on the guest's behalf.]


Writeable Page Tables: 3 - Unhook

[Diagram: for bulk updates, Xen 'unhooks' the page-table page from the active tables, letting the guest write it freely while it is unreferenced.]


Writeable Page Tables: 4 - First Use

[Diagram: a page fault on first use of the unhooked page-table page returns control to Xen.]


Writeable Page Tables: 5 - Re-hook

[Diagram: Xen validates the modified entries and re-hooks the page-table page into the virtual → machine tables.]


MMU Micro-Benchmarks

[Figure: page fault and process fork latencies (µs), normalized to native, on Linux (L), Xen (X), VMware Workstation (V), and UML (U).]


SMP Guest Kernels

Xen extended to support multiple VCPUs

Virtual IPIs sent via Xen event channels

Currently up to 32 VCPUs supported

Simple hotplug/unplug of VCPUs

From within VM or via control tools

Optimize one active VCPU case by binary patching spinlocks (see the sketch below)

NB: Many applications exhibit poor SMP scalability; often better off running multiple instances, each in their own OS
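The one-active-VCPU optimization can be pictured as follows. XenLinux really does patch the spinlock code in place at run time; this hedged sketch uses a function-pointer switch instead, which shows the same effect (a uniprocessor guest skips the atomic lock operations) without the binary-patching machinery.

```c
/* Sketch of the one-VCPU spinlock optimization: with a single VCPU
 * nothing can contend, so lock/unlock collapse to (almost) nothing.
 * A function-pointer switch stands in for in-place binary patching. */
#include <stdatomic.h>
#include <stdio.h>

typedef struct { atomic_flag f; } spinlock_t;

static void lock_smp(spinlock_t *l)   { while (atomic_flag_test_and_set(&l->f)) ; }
static void unlock_smp(spinlock_t *l) { atomic_flag_clear(&l->f); }
static void lock_up(spinlock_t *l)    { (void)l; /* uniprocessor: no-op */ }
static void unlock_up(spinlock_t *l)  { (void)l; }

static void (*spin_lock)(spinlock_t *)   = lock_smp;
static void (*spin_unlock)(spinlock_t *) = unlock_smp;

/* Called at boot, or again on VCPU hotplug/unplug. */
static void patch_locks(int online_vcpus)
{
    spin_lock   = (online_vcpus == 1) ? lock_up   : lock_smp;
    spin_unlock = (online_vcpus == 1) ? unlock_up : unlock_smp;
}

int main(void)
{
    spinlock_t l = { ATOMIC_FLAG_INIT };
    patch_locks(1);                   /* one active VCPU: cheap locks */
    spin_lock(&l);
    puts("critical section (uniprocessor path)");
    spin_unlock(&l);
    return 0;
}
```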


SMP Guest Kernels

Takes great care to get good SMP performance while remaining secure

Requires extra TLB synchronization IPIs

SMP scheduling is a tricky problem

Wish to run all VCPUs at the same time

But, strict gang scheduling is not work conserving

Opportunity for a hybrid approach

Paravirtualized approach enables several important benefits

Avoids many virtual IPIs

Allows 'bad preemption' avoidance

Auto hot plug/unplug of CPUs


Driver Domains

[Architecture diagram: as before, but VM1 is now a driver domain: it runs XenLinux with the native device driver for a given device and the corresponding back-end, while VM0 keeps the device manager and control software. Guests (XenLinux in VM2, XenBSD in VM3) connect their front-end drivers to the driver domain's back-end through the Xen VMM's safe hardware interface.]


Device Channel Interface

[Diagram: the device channel interface between front-end and back-end drivers.]


Isolated Driver VMs

Run device drivers in separate domains

Detect failure e.g.

Illegal access

Timeout

Kill domain, restart

E.g. 275ms outage from failed Ethernet driver

[Figure: network throughput vs. time (s); the failed Ethernet driver domain is killed and restarted, causing a 275ms outage.]


VT-x / Pacifica : hvm

Enable Guest OSes to be run without modification

E.g. legacy Linux, Windows XP/2003

CPU provides vmexits for certain privileged instrs

Shadow page tables used to virtualize MMU

Xen provides simple platform emulation

BIOS, apic, ioapic, rtc, Net (pcnet32), IDE emulation

Install paravirtualized drivers after booting for high-performance IO

Possibility for CPU and memory paravirtualization

Non-invasive hypervisor hints from OS


[Architecture diagram: Domain 0 (Linux xen64, paravirtualized) holds the control panel (xm/xend), native device drivers, back-end virtual drivers, and the device models that perform I/O emulation for unmodified guests. Guest VMs (VMX), 32-bit and 64-bit, run an unmodified OS above a guest BIOS and virtual platform, optionally with front-end virtual drivers. Privileged operations exit to the Xen hypervisor via VMExit; paravirtualized domains use callbacks/hypercalls and event channels; the hypervisor provides the control interface, hypercalls, event channels, the scheduler, and emulated platform devices (PIT, APIC, PIC, IOAPIC).]




MMU Virtualization: Shadow-Mode

[Diagram: the guest OS reads and writes its own virtual → pseudo-physical tables; the VMM propagates updates into the virtual → machine shadow tables that the hardware MMU walks, and reflects accessed and dirty bits back to the guest.]
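A toy illustration of shadow mode: the guest edits its own virtual → pseudo-physical table, and the VMM fills the virtual → machine shadow entry the hardware actually walks, using the per-domain pseudo-physical-to-machine (p2m) map. Table sizes and names here are invented for the sketch.

```c
/* Sketch of shadow-mode MMU virtualization: guest_pt is the guest's
 * view (vfn -> pfn); shadow_pt is what the hardware walks
 * (vfn -> mfn); p2m translates between them. Toy scale. */
#include <stdint.h>
#include <stdio.h>

#define NPAGES 16

static uint64_t guest_pt[NPAGES];   /* guest's view: vfn -> pfn     */
static uint64_t shadow_pt[NPAGES];  /* hardware's view: vfn -> mfn  */
static uint64_t p2m[NPAGES];        /* pseudo-physical -> machine   */

/* VMM side: called when a guest page-table write is trapped (or the
 * shadow misses); translates pfn to mfn and fills the shadow entry. */
static void shadow_propagate(unsigned vfn)
{
    shadow_pt[vfn] = p2m[guest_pt[vfn]];
}

int main(void)
{
    for (unsigned i = 0; i < NPAGES; i++)
        p2m[i] = i + 100;           /* arbitrary machine frames */

    guest_pt[3] = 7;                /* guest maps vfn 3 -> pfn 7   */
    shadow_propagate(3);            /* VMM fills vfn 3 -> mfn 107  */
    printf("vfn 3: guest sees pfn %llu, hardware sees mfn %llu\n",
           (unsigned long long)guest_pt[3],
           (unsigned long long)shadow_pt[3]);
    return 0;
}
```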


Xen Tools

[Diagram: the xm command-line tool (with web services and CIM interfaces above it) uses xmlib; builder, control, and save/restore logic sits on libxc and xenstore in dom0, reaching Xen through the privcmd backend via dom0_op hypercalls, and reaching guest domains (dom1) over xenbus back-end/front-end connections.]


VM Relocation : Motivation

VM relocation enables:

High-availability

• Machine maintenance

Load balancing

• Statistical multiplexing gain

[Diagram: a VM relocating from one Xen host to another.]


Assumptions

Networked storage

NAS: NFS, CIFS

SAN: Fibre Channel

iSCSI, network block dev

drbd network RAID

Good connectivity

common L2 network

L3 re-routeing

[Diagram: two Xen hosts attached to shared networked storage.]


Challenges

VMs have lots of state in memory

Some VMs have soft real-time requirements

E.g. web servers, databases, game servers

May be members of a cluster quorum

 Minimize down-time

Performing relocation requires resources

Bound and control resources used


Relocation Strategy

Stage 0: pre-migration. VM active on host A; destination host selected (block devices mirrored).

Stage 1: reservation. Initialize container on target host.

Stage 2: iterative pre-copy. Copy dirty pages in successive rounds (sketched below).

Stage 3: stop-and-copy. Suspend VM on host A; redirect network traffic; synch remaining state.

Stage 4: commitment. Activate on host B; VM state on host A released.
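The pre-copy stages condense into a short loop. This simulation is a sketch under assumed numbers (4096 pages initially dirty, a quarter re-dirtied per round, a 64-page stop-and-copy threshold), not Xen's actual migration code.

```c
/* Sketch of iterative pre-copy (stages 2-3): each round resends the
 * pages dirtied during the previous round; once the residue is small
 * enough (or rounds run out), suspend and stop-and-copy the rest. */
#include <stdio.h>

#define MAX_ROUNDS      30
#define STOP_COPY_PAGES 64     /* residue small enough for a short outage */

static unsigned dirty = 4096;  /* pages dirty before round 1 (assumed) */

/* Copy every currently-dirty page and clear the dirty log; pretend a
 * quarter are re-dirtied by the still-running VM while we copy. */
static unsigned send_dirty_pages(void)
{
    unsigned sent = dirty;
    dirty = sent / 4;
    return sent;
}

int main(void)
{
    unsigned sent = send_dirty_pages();          /* round 1: all pages */
    printf("round 1: sent %u pages\n", sent);

    for (int round = 2; round <= MAX_ROUNDS; round++) {
        if (dirty <= STOP_COPY_PAGES)
            break;                               /* converged enough */
        sent = send_dirty_pages();               /* resend dirtied pages */
        printf("round %d: sent %u pages\n", round, sent);
    }

    printf("suspend VM, stop-and-copy %u pages, activate on target\n",
           dirty);
    return 0;
}
```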


Pre-Copy Migration: Round 1

[Animation frames: every page is copied to the destination host while the VM keeps running, dirtying some already-sent pages.]

Pre-Copy Migration: Round 2

[Animation frames: only the pages dirtied during the previous round are re-sent.]

Pre-Copy Migration: Final

[Animation: the VM is suspended and the remaining dirty pages are transferred in one final stop-and-copy pass.]


 Writable Working Set

Pages that are dirtied must be re-sent

Super hot pages

• e.g. process stacks; top of page free list

Buffer cache

Network receive / disk buffers

Dirtying rate determines VM down-time

Shorter iterations → less dirtying → …


Rate Limited Relocation

Dynamically adjust resources committed to performing page transfer

Dirty logging costs VM ~2-3%

CPU and network usage closely linked

E.g. first copy iteration at 100Mb/s, then increase based on observed dirtying rate

Minimize impact of relocation on server while minimizing down-time
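One plausible rate-adaptation policy, sketched with invented constants: cap the first iteration at a modest rate, then derive each round's cap from the dirtying rate observed in the previous round, and fall back to stop-and-copy once the required rate would approach the link maximum.

```c
/* Sketch of rate-limited relocation: raise the transfer cap between
 * rounds in step with the observed dirtying rate; if dirtying can't
 * be outrun within the link budget, suspend and stop-and-copy.
 * All constants are illustrative. */
#include <stdio.h>

#define MIN_RATE_MBPS  100
#define MAX_RATE_MBPS  500
#define RATE_STEP_MBPS  50

/* Returns the cap for the next round, or 0 to signal stop-and-copy. */
static int next_round_rate(int observed_dirty_mbps)
{
    int rate = observed_dirty_mbps + RATE_STEP_MBPS;
    if (rate < MIN_RATE_MBPS) rate = MIN_RATE_MBPS;
    if (rate > MAX_RATE_MBPS) return 0;   /* can't outrun dirtying */
    return rate;
}

int main(void)
{
    int dirty[] = { 40, 180, 320, 470 };  /* observed Mb/s per round */
    for (int i = 0; i < 4; i++) {
        int r = next_round_rate(dirty[i]);
        if (!r) {
            printf("round %d: suspend and stop-and-copy\n", i + 2);
            break;
        }
        printf("round %d: cap transfer at %d Mb/s\n", i + 2, r);
    }
    return 0;
}
```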


Web Server Relocation

[Figure: web-server throughput during live relocation.]


Iterative Progress: SPECWeb

[Figure: per-round pre-copy progress for a SPECweb workload; total relocation time 52s.]


Iterative Progress: Quake3

[Figure: per-round pre-copy progress for a Quake 3 server.]


Quake 3 Server relocation

[Figure: Quake 3 server relocation timeline.]


Xen Optimizer Functions

Cluster load balancing / optimization

Application-level resource monitoring

Performance prediction

Pre-migration analysis to predict down-time

Optimization over relatively coarse timescale

Evacuating nodes for maintenance

Move easy to migrate VMs first

Storage-system support for VM clusters

Decentralized, data replication, copy-on-write

Adapt to network constraints

Configure VLANs, routeing, create tunnels etc


Current Status

[Table: feature support matrix across x86_32, x86_32p, x86_64, IA64, and Power covering Privileged Domains, Guest Domains, SMP Guests, Save/Restore/Migrate, >4GB memory, VT, and Driver Domains.]


3.1 Roadmap

Improved full-virtualization support

Pacifica / VT-x abstraction

Enhanced IO emulation

Enhanced control tools

Performance tuning and optimization

Less reliance on manual configuration

NUMA optimizations

Virtual bitmap framebuffer and OpenGL

Infiniband / “Smart NIC” support


IO Virtualization

IO virtualization in s/w incurs overhead

Latency vs. overhead tradeoff

• More of an issue for network than storage

Can burn 10-30% more CPU

Solution is well understood

Direct h/w access from VMs

• Multiplexing and protection implemented in h/w

Smart NICs / HCAs

• Infiniband, Level-5, Aarohi etc

• Will become commodity before too long


Research Roadmap

Whole-system debugging

Lightweight checkpointing and replay

Cluster/distributed system debugging

Software implemented h/w fault tolerance

Exploit deterministic replay

Multi-level secure systems with Xen

VM forking

Lightweight service replication, isolation


Parallax

Managing storage in VM clusters.

Virtualizes storage, fast snapshots

Access optimized storage

[Diagram: a snapshot creates Root B alongside Root A; both roots share the L1/L2 mapping nodes and data blocks copy-on-write until a write diverges them.]
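A toy sketch of the snapshot idea: a two-level mapping tree where a snapshot duplicates only the root and marks all subtrees read-only, and a later write copies just the nodes on its path. Parallax's real design uses deeper radix trees over block addresses; every name and size here is invented.

```c
/* Sketch of copy-on-write snapshots over a two-level block map:
 * snapshot() copies the root and shares the L1 nodes read-only;
 * map_block() copies a shared L1 node before writing through it. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FANOUT 8

struct l1 { uint64_t data_block[FANOUT]; };          /* leaf map */
struct root {
    struct l1 *l1[FANOUT];
    uint8_t    writable[FANOUT];                     /* CoW bit per subtree */
};

/* Snapshot: copy the root; every L1 node is now shared read-only. */
static struct root *snapshot(struct root *r)
{
    struct root *s = malloc(sizeof(*s));
    *s = *r;
    memset(r->writable, 0, sizeof(r->writable));
    memset(s->writable, 0, sizeof(s->writable));
    return s;
}

/* Write: if the subtree is shared, copy it first (copy-on-write). */
static void map_block(struct root *r, unsigned vblock, uint64_t pblock)
{
    unsigned i = vblock / FANOUT, j = vblock % FANOUT;
    if (!r->l1[i]) {
        r->l1[i] = calloc(1, sizeof(struct l1));
        r->writable[i] = 1;
    } else if (!r->writable[i]) {
        struct l1 *copy = malloc(sizeof(struct l1));
        *copy = *r->l1[i];
        r->l1[i] = copy;
        r->writable[i] = 1;
    }
    r->l1[i]->data_block[j] = pblock;
}

int main(void)
{
    struct root live = { 0 };
    map_block(&live, 3, 1001);
    struct root *snap = snapshot(&live);
    map_block(&live, 3, 2002);       /* live diverges after the snapshot */
    printf("live: %llu, snapshot: %llu\n",
           (unsigned long long)live.l1[0]->data_block[3],
           (unsigned long long)snap->l1[0]->data_block[3]);
    return 0;
}
```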


V2E: Taint tracking

[Diagram: a protected VM (virtual net VN, virtual disk VD) runs over the VMM; the control VM holds the real disk and net drivers (DD, ND) plus the I/O taint extensions, and a modified Qemu provides the emulator; the VMM keeps the taint pagemap.]

1. Inbound pages are marked as tainted. Fine-grained taint details in extension; page-granularity bitmap in VMM.

2. VM traps on access to a tainted page. Tainted pages marked not-present. Throw VM to emulation.

3. VM runs in emulation, tracking tainted data. Qemu microcode modified to reflect tainting across data movement.

4. Taint markings are propagated to disk. Disk extension marks tainted data, and re-taints memory on read.
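The first two steps can be sketched as a page-granularity taint bitmap plus a fault handler that bounces tainted accesses into emulation. Everything below (sizes, helper names) is illustrative, not the V2E implementation.

```c
/* Sketch of page-granularity taint tracking: one bit per guest page;
 * inbound I/O taints a page (which the real VMM would also unmap),
 * and a fault on a tainted page switches the VM to the emulator. */
#include <stdint.h>
#include <stdio.h>

#define NPAGES 1024

static uint8_t taint_bitmap[NPAGES / 8];

static void set_taint(unsigned pfn)  { taint_bitmap[pfn / 8] |=  (uint8_t)(1u << (pfn % 8)); }
static int  is_tainted(unsigned pfn) { return taint_bitmap[pfn / 8] & (1u << (pfn % 8)); }

/* Step 1: inbound I/O lands in a page; mark it tainted. A real VMM
 * would also clear the present bit so any guest access faults. */
static void inbound_io(unsigned pfn)
{
    set_taint(pfn);
}

/* Step 2: the page-fault handler decides whether to enter emulation. */
static void page_fault(unsigned pfn)
{
    if (is_tainted(pfn))
        printf("pfn %u tainted: switching VM to emulation\n", pfn);
    else
        printf("pfn %u clean: normal fault path\n", pfn);
}

int main(void)
{
    inbound_io(7);       /* network packet arrives in page 7 */
    page_fault(7);       /* guest touches it -> emulate      */
    page_fault(8);       /* clean page stays on the fast path */
    return 0;
}
```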




Xen Supporters

[Slide of supporter logos, grouped under Hardware Systems, Platforms & I/O, and Operating System and Systems Management. Logos are registered trademarks of their owners.]


Conclusions

Xen is a complete and robust hypervisor

Outstanding performance and scalability

Excellent resource control and protection

Vibrant development community

Strong vendor support

 Try the demo CD to find out more!

(or Fedora 4/5, Suse 10.x)

http://xensource.com/community
