System Virtual Machines, Chapter 8.3 ~ 8.7. October 25, 2006. Yoo Jonghun [email protected] RTOS Lab., SoEECS, SNU

ch8. System VM, VMWare


Page 1: ch8. System VM, VMWare

System Virtual Machines

Chapter 8.3 ~ 8.7

October 25, 2006

Yoo Jonghun

[email protected]

RTOS Lab., SoEECS, SNU

Page 2: ch8. System VM, VMWare

Presentation Outline

Resource Virtualization – Input/Output

Performance Enhancement of System Virtual Machines

Case Study: VMware Virtual Platform

Case Study: The Intel VT-x (Vanderpool) Technology

Page 3: ch8. System VM, VMWare

Virtualizing Devices

Dedicated devices
• Dedicated to one guest VM for a long time, while that VM is active
• E.g., display, keyboard, mouse

Partitioned devices
• Partitioned among the guest VMs, each of which gets its own part
• E.g., disk

Shared devices
• Shared among guest VMs at a fine time granularity
• E.g., network adapter

Page 4: ch8. System VM, VMWare

Virtualizing Devices

Spooled devices
• Shared among guest VMs at a much coarser granularity
• E.g., printer
– Two-level spool table

Virtual Machine 1 Spool Table

Program  Status     Location  Real loc  Size
A        Printed    1000      11000     400
B        Completed  2000      12000     200
C        Running    3000      13000     200
D        Completed  4000      14000     500

Virtual Machine 2 Spool Table

Program  Status     Location  Real loc  Size
P        Running    1000      21000     400
Q        Completed  2000      22000     800

VMM Spool Table

VM  Program  Status    Real loc  Size
1   A        Printed   30000     400
2   Q        Printing  31000     800
1   B        Waiting   31800     200
1   D        Waiting   30400     500

Figure address labels: 10000, 20000, 30000
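
The two-level spool table can be sketched in a few lines. This is a minimal Python sketch, not VM/370 code: the names `SpoolEntry`, `VmmSpool`, and `collect` are hypothetical, and it assumes that the VMM copies each completed guest job into its own spool area and queues it for the real printer.

```python
from dataclasses import dataclass, field


@dataclass
class SpoolEntry:
    program: str
    status: str   # guest-level status, e.g. "Running" or "Completed"
    size: int


@dataclass
class VmmSpool:
    """Second-level spool table kept by the VMM (hypothetical layout)."""
    next_free: int = 30000   # first real location in the VMM table on the slide
    jobs: list = field(default_factory=list)

    def collect(self, vm_id, guest_table):
        """Queue every completed guest job for the real printer."""
        for entry in guest_table:
            if entry.status == "Completed":
                self.jobs.append({"vm": vm_id, "program": entry.program,
                                  "status": "Waiting",
                                  "real_loc": self.next_free,
                                  "size": entry.size})
                self.next_free += entry.size
                entry.status = "Queued"   # assumed marker: collect a job only once


vm1 = [SpoolEntry("B", "Completed", 200), SpoolEntry("C", "Running", 200)]
vm2 = [SpoolEntry("Q", "Completed", 800)]
spool = VmmSpool()
spool.collect(1, vm1)
spool.collect(2, vm2)
```

Each guest sees only its own table; the VMM table is the single queue that actually feeds the printer, which is why the device can be shared at job granularity.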

Page 5: ch8. System VM, VMWare

Virtualizing Devices

Nonexistent physical devices
• E.g., virtual network adapter
– Virtual NIC in VMware

Page 6: ch8. System VM, VMWare

Virtualizing I/O Activity

Major interfaces in an I/O action

Possible interception points
• System call interface
• Device driver interface
• Operation-level interface

[Figure: layers Application, Operating system, I/O drivers, VMM, Hardware; the interfaces between them are system calls, driver calls, and physical memory and I/O operations]

Page 7: ch8. System VM, VMWare

Virtualizing I/O Activity

Virtualizing at the I/O operation level

Instructions for I/O operations
• Processors with memory-mapped I/O: loads and stores to specific memory addresses
• System/360 or IA-32: special I/O instructions

These instructions are easy for the VMM to intercept

However, it is extremely difficult for the VMM to determine exactly what I/O action is being requested
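
To illustrate why the operation level is hard, the following Python sketch rebuilds one disk read from a trace of intercepted port writes. It is a simplified illustration, not a device model: `decode` is a hypothetical helper, the port numbers are the standard primary IDE task-file ports, and the sketch ignores the drive/head register, status polling, and the data transfer itself.

```python
# Primary IDE task-file ports (standard PC assignments).
SECTOR_COUNT, LBA_LOW, LBA_MID, LBA_HIGH, COMMAND = 0x1F2, 0x1F3, 0x1F4, 0x1F5, 0x1F7
READ_SECTORS = 0x20


def decode(port_writes):
    """Rebuild high-level requests from intercepted (port, value) writes."""
    regs, requests = {}, []
    for port, value in port_writes:
        if port == COMMAND:
            if value == READ_SECTORS:
                lba = (regs.get(LBA_LOW, 0)
                       | regs.get(LBA_MID, 0) << 8
                       | regs.get(LBA_HIGH, 0) << 16)
                requests.append(("read", lba, regs.get(SECTOR_COUNT, 0)))
            regs = {}
        else:
            regs[port] = value   # meaningless on its own until the command arrives


    return requests


trace = [(SECTOR_COUNT, 4), (LBA_LOW, 0x10), (LBA_MID, 0x02),
         (LBA_HIGH, 0x00), (COMMAND, READ_SECTORS)]
requests = decode(trace)
```

Five separate traps are needed before the VMM can tell that the guest wants to read four sectors, and the VMM must know the device's register layout to reach that conclusion.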

Page 8: ch8. System VM, VMWare

Virtualizing I/O Activity

Virtualizing at the device driver level
• The VMM should have knowledge of the guest OS
• Typical guest OSes: Windows, Linux
– Virtual device drivers for the guest OS can be distributed to users
– Drivers in the host OS can be used in the case of a hosted VM

Virtualizing at the system call level
• The VMM should have much broader knowledge of the guest OS in order to emulate ABI-level operations

Page 9: ch8. System VM, VMWare

I/O virtualization in hosted VMs

It is not necessary to provide device drivers in the VMM
• Device drivers of the host OS are used indirectly

VMM-n (native)
• Intercepts traps
• For small, performance-critical device drivers

VMM-u (user)
• Uses the host OS's device drivers

VMM-d (driver)
• Communication between VMM-n and VMM-u

Page 10: ch8. System VM, VMWare

Presentation Outline

Resource Virtualization – Input/Output

Performance Enhancement of System Virtual Machines

Case Study: VMware Virtual Platform

Case Study: The Intel VT-x (Vanderpool) Technology

Page 11: ch8. System VM, VMWare

Reasons for performance degradation

Setup
• Setting up resources when a VM is activated

Emulation
• Sensitive instructions must be emulated

Interrupt handling
• Interrupts must be handled by the VMM first

State saving
• Saving the state of the VM when control is transferred to the VMM

Bookkeeping
• E.g., accounting of time charged to a user

Time elongation
• E.g., accessing the shadow page table

Page 12: ch8. System VM, VMWare

Solutions

Hardware techniques to improve performance

IBM VM/370 assist collection

Assist collection                      Number of functions
Virtual Machine Assists (VMA)          13
Extended control program support
  Control program assist               22
  Expanded virtual machine assist      12
  Virtual interval timer assist        1
Shadow table bypass assist             8
Preferred machine assist               22
Dual address space assist              20
Extended storage key assist            3
Total                                  101

Page 13: ch8. System VM, VMWare

Instruction emulation assists

The HW (via microcode) performs the emulation of special instructions

E.g., in System/370, 13 instructions are assisted by HW
• LOAD PSW (LPSW), INSERT PSW KEY (IPK), INSERT STORAGE KEY (ISK), LOAD REAL ADDRESS (LRA), RESET REFERENCE BIT (RRB), SUPERVISOR CALL (SVC), SET STORAGE KEY (SSK), SET SYSTEM MASK (SSM), STORE CONTROL (STCTL), STORE THEN AND SYSTEM MASK (STNSM), STORE THEN OR SYSTEM MASK (STOSM), SET PSW KEY FROM ADDRESS (SPKA)

Example: LPSW
• LPSW traps
• The VMM determines whether the guest VM is in system mode or in user mode
• If the guest VM is in system mode, the PSW is loaded with the corresponding value
• These steps are assisted by HW
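
The LPSW steps can be sketched in software as follows. This is a minimal Python sketch under stated assumptions: `GuestVM`, `emulate_lpsw`, and the dictionary form of the PSW are hypothetical, and reflecting a privileged-operation exception to a guest in virtual user mode is the usual trap-and-emulate behaviour rather than something the slide states. An assist would perform the same check in microcode.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GuestVM:
    virtual_mode: str                          # "system" or "user", as the guest sees it
    psw: dict = field(default_factory=dict)    # the guest's virtual PSW (simplified)
    pending_exception: Optional[str] = None


def emulate_lpsw(vm, new_psw):
    """Run by the VMM after LPSW traps (the guest really runs in user mode)."""
    if vm.virtual_mode == "system":
        vm.psw = dict(new_psw)                 # load the guest's virtual PSW
        # The new PSW decides the mode the guest continues in.
        vm.virtual_mode = "user" if new_psw.get("problem_state") else "system"
    else:
        # Assumption: a guest in virtual user mode receives the exception
        # it would have received on real hardware.
        vm.pending_exception = "privileged-operation"


kernel = GuestVM("system")
emulate_lpsw(kernel, {"address": 0x1000, "problem_state": True})

app = GuestVM("user")
emulate_lpsw(app, {"address": 0x2000})
```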

Page 14: ch8. System VM, VMWare

Virtual machine monitor assists

Virtual machine monitor assists (1)

Context switch between VM and VMM
• HW saves/restores registers

Decoding of privileged instructions
• In a VM, privileged instructions always trap, whereas in a native environment they trap only in user mode
• HW decodes privileged instructions to help the VMM

Page 15: ch8. System VM, VMWare

Virtual machine monitor assists

Virtual machine monitor assists (2)

Virtual interval timer
• While the guest VM is running, the virtual timer at a certain memory location is decremented automatically by the real timer

Additional instruction set, e.g.,
– Obtain free space from free storage area
– Return space to free storage
– Page lock/unlock
– Translate virtual address and test for shared page
– Invalidate segment/page table
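
The virtual interval timer can be modelled as a toy. In this Python sketch the function `real_timer_tick`, the dictionary-based guest memory, and the use of location 80 as the timer word are illustrative assumptions; clamping the timer at zero is also a simplification.

```python
TIMER_ADDR = 80   # illustrative location of the guest's timer word


def real_timer_tick(vms, running, elapsed):
    """Decrement only the running VM's virtual timer; return True if it expired."""
    memory = vms[running]
    memory[TIMER_ADDR] -= elapsed
    if memory[TIMER_ADDR] <= 0:
        memory[TIMER_ADDR] = 0
        return True    # the guest should now receive a timer interrupt
    return False


vms = {"VM1": {TIMER_ADDR: 100}, "VM2": {TIMER_ADDR: 100}}
expired = [real_timer_tick(vms, "VM1", 40) for _ in range(3)]
```

Only the VM that is actually running is charged, so a guest that is descheduled by the VMM does not see its timer run down.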

Page 16: ch8. System VM, VMWare

Improving Performance of the Guest System

System/370 provides handshaking, by which the guest OS sends messages to the VMM

Nonpaged mode
• Turns off virtual memory in the guest OS
• The guest OS disables dynamic address translation and defines its real address space to be as large as the largest virtual address space
• Page frames are mapped to fixed real pages
• No double paging
• No potential conflicts in paging decisions by the guest OS and the VMM

Pseudo-page-fault handling
• When a page fault is caused, the VMM gives control back to the same VM, so the guest OS can run another of its processes while the page is brought in
• Improves fairness among VMs

Page 17: ch8. System VM, VMWare

Improving Performance of the Guest System

Spool files
• When a file is ready to print, the guest VM may issue an I/O operation, which is intercepted by the VMM
• Instead of relying on that interception, handshaking allows the VM to signal the VMM that a file is ready

Inter-virtual-machine communication
• Saves the overhead of processing message packets through the communication layers

Paravirtualization
• The interface presented by the VM is not identical to the architecture of the underlying processor, but is simplified to eliminate the effect of critical instructions

Page 18: ch8. System VM, VMWare

Specialized Systems

Virtual-equals-real (V=R) virtual machine
• The host address space representing the guest real memory is mapped one-to-one to the host real memory address space
– Channel programs do not need to be retranslated

Shadow-table bypass assist
• Multi-level mapping is very expensive
• With the assistance of HW, trusted guests are allowed to access the memory mapping tables directly
• IBM found that most guest operating systems are well behaved
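
The cost of multi-level mapping comes from composing two tables. The Python sketch below shows that composition; `build_shadow_table` and the dictionary page tables are hypothetical stand-ins for the real structures, and page numbers are arbitrary.

```python
def build_shadow_table(guest_page_table, vmm_map):
    """Compose guest-virtual -> guest-real with guest-real -> host-real."""
    shadow = {}
    for vpage, guest_real in guest_page_table.items():
        if guest_real in vmm_map:        # skip pages the VMM has not made resident
            shadow[vpage] = vmm_map[guest_real]
    return shadow


guest_page_table = {0: 5, 1: 7, 2: 9}    # maintained by the guest OS
vmm_map = {5: 105, 7: 230}               # maintained by the VMM; page 9 is paged out
shadow = build_shadow_table(guest_page_table, vmm_map)

# With a one-to-one mapping the second table is the identity,
# so the guest's own table can be used as it stands.
identity = {page: page for page in (5, 7, 9)}
one_to_one = build_shadow_table(guest_page_table, identity)
```

Every change the guest makes to its page table forces the VMM to recompute the affected shadow entries, which is the work that a one-to-one mapping or a bypass for trusted guests avoids.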

Page 19: ch8. System VM, VMWare

Specialized Systems

Preferred-machine assist
• Allows a guest operating system to operate in system mode rather than user mode
• Only minimal checks are imposed on the use of privileged instructions by the guest

Segment sharing
• Sharing the code segments of the operating system among the virtual machines, provided the operating system code is written in a reentrant manner
– Alleviates TLB pressure

Page 20: ch8. System VM, VMWare

Generalized Support for Virtual Machines

Interpretive Execution Facility (IEF)
• The processor directly executes most of the functions of the virtual machine in hardware
• An extreme case of a VM assist

Interpretive execution entry and exit

Entry
• Start Interpretive Execution (SIE): the software gives up control to the IEF hardware, and the processor enters interpretive execution mode

Exit
• Host interrupt
• Interception
– Unsupported hardware instructions
– Exception during the execution of an interpreted instruction
– Some special cases…

Page 21: ch8. System VM, VMWare

Generalized Support for Virtual Machines

[Figure: the VMM software issues SIE to enter interpretive execution mode. An exit for interception leads to emulation in the VMM; an exit for a host interrupt leads to the host interrupt handler.]
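
The entry and exit paths form a loop in the VMM, which can be sketched as below. This is a minimal Python sketch, not a description of the real facility: `run_guest` and the `sie` callable are hypothetical, and the "halt" exit is an assumed addition so that the example terminates.

```python
def run_guest(sie, emulate, handle_host_interrupt, max_exits=10):
    """VMM loop around a hypothetical sie() callable that runs the guest in
    interpretive execution mode and returns (reason, detail) on exit."""
    log = []
    for _ in range(max_exits):
        reason, detail = sie()                # entry into interpretive execution mode
        if reason == "interception":
            log.append(emulate(detail))       # the VMM emulates what the HW could not
        elif reason == "host_interrupt":
            log.append(handle_host_interrupt(detail))
        elif reason == "halt":                # assumed extra exit, only for this sketch
            break
    return log


exits = iter([("interception", "unsupported instruction"),
              ("host_interrupt", "timer"),
              ("halt", None)])
log = run_guest(lambda: next(exits),
                emulate=lambda d: f"emulated {d}",
                handle_host_interrupt=lambda d: f"handled {d} interrupt")
```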

Page 22: ch8. System VM, VMWare

Presentation Outline

Resource Virtualization – Input/Output

Performance Enhancement of System Virtual Machines

Case Study: VMware Virtual Platform

Case Study: The Intel VT-x (Vanderpool) Technology

Page 23: ch8. System VM, VMWare

Challenges for VMware

VMware is a popular virtual machine system for IA-32

Challenges for IA-32
• Not intended to support multiple users
• Openness of the system architecture
– Different types of devices
• Installation/removal must be easy

A hosted VM is selected to cope with the above problems

Page 24: ch8. System VM, VMWare

VMware Components

[Figure: VMware components VMM-n, VMM-u, and VMM-d]

Page 25: ch8. System VM, VMWare

Processor Virtualization

The IA-32 architecture is not efficiently virtualizable
• Theorem 1 (Popek and Goldberg) is violated
• I.e., there are 17 instructions that are sensitive but not privileged

Hybrid VM
• Discover critical instructions and patch them

Critical instructions

Protection system references
• Reference the storage protection system, memory system, or address relocation system (e.g., mov ax, cs)

Sensitive register instructions
• Read or change resource-related registers and memory locations, such as a clock register or interrupt registers (e.g., POPF)

Page 26: ch8. System VM, VMWare

Processor Virtualization

Problems
• Sensitive instructions executed in user mode do not behave as expected unless they are emulated

Solutions
• The VM monitor substitutes the instruction with another set of instructions and emulates the action of the original code

Page 27: ch8. System VM, VMWare

Processor Virtualization

For example, the popfd instruction
• popfd pops a doubleword from the top of the stack and stores it in the EFLAGS register
• One bit of EFLAGS is IF (Interrupt-enable Flag)
– It is modified in system mode
– It is left unchanged in user mode

Solution
• The VMM scans the instruction stream
• If it detects popfd, it substitutes a set of instructions that take the processor into privileged mode and emulate the popfd instruction
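
The popfd problem and the scan-and-patch fix can be sketched as follows. This is a simplified Python illustration, not VMware's implementation: `native_popfd`, `patch`, and `emulate_popfd` are hypothetical names, the instruction stream is symbolic rather than machine code, and the mode check follows the slide's system/user simplification (real hardware compares the privilege level with IOPL).

```python
IF_MASK = 1 << 9   # interrupt-enable flag, bit 9 of EFLAGS


def native_popfd(eflags, value, user_mode):
    """Simplified popfd: in user mode IF silently keeps its old value."""
    if user_mode:
        return (value & ~IF_MASK) | (eflags & IF_MASK)
    return value


def patch(stream):
    """Scan a symbolic instruction stream and replace each popfd."""
    return ["call emulate_popfd" if insn == "popfd" else insn for insn in stream]


def emulate_popfd(vcpu, value):
    """What the substituted code does: update the guest's virtual EFLAGS."""
    vcpu["eflags"] = value   # the guest kernel believes it is privileged
    return vcpu


guest_code = ["push eax", "popfd", "ret"]
patched = patch(guest_code)

# A guest kernel running in user mode tries to clear IF and fails silently.
unpatched_result = native_popfd(IF_MASK, 0, user_mode=True)
# The patched code gets the result the guest expects.
patched_result = emulate_popfd({"eflags": IF_MASK}, 0)["eflags"]
```

Because the unpatched instruction neither traps nor takes effect, trap-and-emulate alone cannot catch it, which is why the instruction stream has to be scanned ahead of execution.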

Page 28: ch8. System VM, VMWare

I/O Virtualization

The PC platform supports many more devices and types of devices than any other platform

Emulation in VMMonitor
• Converting the in and out I/O instructions into new I/O instructions
• Requires some knowledge of the device interfaces

[Figure: Virtual Device Interface (e.g., IDE); I/O Device Simulator in VMMonitor; Hardware Device Interface (e.g., IDE, SCSI)]

Page 29: ch8. System VM, VMWare

I/O Virtualization

Using the services of the host operating system

[Figure: Virtual Device Interface (e.g., disk read, screen write); I/O Device Simulator in VMMonitor; Hardware Device Interface (e.g., IDE, SVGA); I/O Device Simulator in VMApp; OS Interface Commands (e.g., commands in a graphics language); Host Operating System (e.g., Linux, Windows)]

Page 30: ch8. System VM, VMWare

I/O Virtualization

New capabilities for devices through the abstraction layer

Undoable disk
• The disk of the VM can be treated as a file on the host OS
• Disk writes are performed only on an explicit command

Virtual Ethernet switch between a virtual NIC and a physical NIC
• Reduces performance losses due to virtualization

Alternative user interface
• A window can be used instead of the whole display device
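
One way to picture the undoable disk is a base image plus a log of uncommitted writes. The Python sketch below assumes that design; the class `UndoableDisk`, its redo log, and the sector dictionary are illustrative choices, not a description of VMware's actual file format.

```python
class UndoableDisk:
    """Guest disk backed by a base image plus a redo log (both would be host files)."""

    def __init__(self, base):
        self.base = dict(base)   # sector number -> data, stands in for the host file
        self.redo = {}           # guest writes that are not yet committed

    def write(self, sector, data):
        self.redo[sector] = data   # the base image is not touched yet

    def read(self, sector):
        return self.redo.get(sector, self.base.get(sector))

    def commit(self):
        self.base.update(self.redo)   # the explicit command that makes writes permanent
        self.redo.clear()

    def undo(self):
        self.redo.clear()             # discard everything since the last commit


disk = UndoableDisk({0: "boot", 1: "data-v1"})
disk.write(1, "data-v2")
seen_by_guest = disk.read(1)
disk.undo()
after_undo = disk.read(1)
```

The guest always sees its own writes, while the host file changes only on `commit`, so a session can be thrown away without any cooperation from the guest OS.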

Page 31: ch8. System VM, VMWare

Memory Virtualization

Paging requests of the guest OS
• Not directly intercepted by the VMM, but converted into disk reads/writes
• VMMonitor translates them into requests on the host OS through VMApp

Page replacement policy of the host OS
• The host could replace critical pages of the VM system when it competes with other host applications
• VMDriver pins the critical pages of the virtual memory system

Page 32: ch8. System VM, VMWare

Presentation Outline

Resource Virtualization – Input/Output

Performance Enhancement of System Virtual Machines

Case Study: VMware Virtual Platform

Case Study: The Intel VT-x (Vanderpool) Technology

Page 33: ch8. System VM, VMWare

Overview

VT-x (Vanderpool) technology for IA-32 processors
• Conceptually similar to VM assists and Interpretive Execution

Available in recent CPUs
• Pentium 4 6x2, Pentium D 9x0, Xeon, Core Duo, Core 2 Duo

Page 34: ch8. System VM, VMWare

The Intel VT-x (Vanderpool) Technology

Motivation
• Virtualization problems of the IA-32 architecture
• Complexity of code and performance overhead

Main feature
• New VMX mode of operation
– VMX root: fully privileged, intended for the VM monitor
– VMX non-root: not fully privileged, intended for guest software

Page 35: ch8. System VM, VMWare

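Technology Overview

[Figure: timeline of VMX transitions among Regular Mode, Root Mode (VMM), Non-Root (VM1), and Non-Root (VM2). Instruction labels, in order: vmxon, vmlaunch VM1, vmlaunch VM2, vmresume VM2, vmresume VM2, vmresume VM1, vmxoff. Each VM entry is followed by a matching exit: VM1 exits, VM2 exits, VM2 exits, VM2 exits, VM1 exits.]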
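
The sequence in the figure can be replayed with a toy state machine. This Python sketch is not VT-x programming: the class `VmxCpu` and its methods are hypothetical, and the rule "vmlaunch for the first entry into a VM, vmresume afterwards" is a simplification of the real launch-state rules.

```python
class VmxCpu:
    """Toy state machine for VMX mode transitions."""

    def __init__(self):
        self.mode = "regular"
        self.guest = None
        self.launched = set()
        self.trace = []

    def _require(self, mode):
        if self.mode != mode:
            raise RuntimeError(f"expected {mode} mode, but CPU is in {self.mode} mode")

    def vmxon(self):
        self._require("regular")
        self.mode = "root"

    def vmxoff(self):
        self._require("root")
        self.mode = "regular"

    def enter(self, vm):
        """VM entry: vmlaunch the first time, vmresume on later entries."""
        self._require("root")
        insn = "vmresume" if vm in self.launched else "vmlaunch"
        self.launched.add(vm)
        self.trace.append(f"{insn} {vm}")
        self.mode, self.guest = "non-root", vm

    def vm_exit(self):
        self._require("non-root")
        self.trace.append(f"{self.guest} exits")
        self.mode, self.guest = "root", None


cpu = VmxCpu()
cpu.vmxon()
for vm in ["VM1", "VM2", "VM2", "VM2", "VM1"]:
    cpu.enter(vm)
    cpu.vm_exit()
cpu.vmxoff()
```

Every exit returns to root mode, so the VMM regains control between any two stretches of guest execution.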

Page 36: ch8. System VM, VMWare

Capabilities of the Technology

A key aspect
• The elimination of the need to run all guest code in user mode

Maintenance of state information
• A major source of overhead in a software-based solution
• A hardware technique allows all of the state-holding data elements to be mapped to their native structures

VMCS (Virtual Machine Control Structure)
• The hardware implementation takes over the tasks of loading and unloading the state from their physical locations

Page 37: ch8. System VM, VMWare

Maintenance of state information

The state of a virtual machine is maintained in the VMCS data structure

State Area
• Guest State
– Register State
– Interruptibility State
• Host State
– Register State

Control Area
• VM Execution Controls
– Pin-based Execution Controls
– Processor-based Execution Controls
– Bitmap Fields
– etc.
• VM Exit Controls
– Control Bitmap
– MSR Controls
• VM Entry Controls
– Control Bitmap
– MSR Controls
– Controls for Event Injection

VM Exit Information
• Basic Information
• VM-Exit Information
• Vectoring Event Information
• Other Exit Information
– Due to Event Delivery
– Due to Instruction Execution

Page 38: ch8. System VM, VMWare

An Example: The rdtsc Instruction