1
Virtual Machine
Computer System Engineering (2013 Spring)
2
REVIEW
3
Layering Threads
4
Review
• Virtualization
– OS kernel helps run multiple programs on a single computer
– Provides virtual memory to protect programs from each other
– Provides system calls so that programs can interact, e.g., send/receive, file system, ...
– Provides threads to run multiple programs on a few CPUs
• Can we rely on the OS kernel to work correctly?
5
VIRTUAL MACHINE
6
Outline
• Why Virtual Machine?
– Micro-kernel?
– Emulation?
• Virtual Machine
– Memory: shadow paging, H/W support
– CPU: para-virtualization, binary translation, H/W support
– I/O device: driver domain
7
Kernel Complexity
8
Monolithic Kernel
• Linux is monolithic
– No enforced modularity
– Many good software engineering techniques are used in practice
– But ultimately a bug in the kernel can affect the entire system
• How can we deal with the complexity of such systems?
– One result of all this complexity: bugs!
9
Linux Fails
• A kernel bug can
– Cause the whole Linux system to fail
• Is it a good thing that a buggy Linux keeps running this long?
– Problems can be hard to detect, even though they may still do damage
– E.g., maybe my files are now corrupted, but the system didn't crash
– Worse: an adversary can exploit a bug to gain unauthorized access to the system
10
Linux (2002-2009): 2+ bugs fixed per day!
11
Microkernel Design
• Enforced modularity
– Subsystems run as "user" programs
– E.g., file server, device driver, graphics, and networking are separate programs
12
Why not Microkernel Style?
• Many dependencies between components
– E.g., balance between memory used for the file cache vs. process memory
• Moving large components into microkernel-style programs does not win much
– If the entire file system service crashes, you might lose all data
• Better designs are possible, but require a lot of careful design work
– E.g., could create one file server program per file system; mount points tell the FS client to talk to another FS server
13
Why not Microkernel Style?
• Trade-off: spend a year on new features vs. a microkernel rewrite
– Microkernel design may make it more difficult to change interfaces later
– Some parts of Linux have microkernel design aspects: the X server, some USB drivers, FUSE, printing, SQL databases, DNS, SSH, ...
– These components can be restarted without crashing the kernel or other programs
14
One Solution
• Problem:
– Deal with Linux kernel bugs without redesigning Linux from scratch
• One idea: run different programs on different computers
– Each computer has its own Linux kernel; if one crashes, the others are not affected
– Strong isolation (all interactions are client-server), but often impractical
– Can't afford so many physical computers: hardware cost, power, space, ...
15
Run multiple Linux on a Single Computer?
• Virtualization + Abstractions
– New constraint: compatibility, because we want to run the existing Linux kernel
– The Linux kernel is written to run on regular hardware
– So: no new abstractions, pure virtualization
• This approach is called "virtual machines" (VMs):
– Each virtual machine is often called a guest
– The equivalent of a kernel is called a "virtual machine monitor" (VMM)
– The VMM is often called the host
16
Why Virtual Machine?
• Consolidation
– Run several different OSes on a single machine
• Isolation
– Keep VMs separated as error containers
– Fault tolerance
• Maintenance
– Easy to deploy, back up, clone, migrate
• Security
– VM introspection
– Antivirus outside the OS
17
Naive Approach
• Emulate every single instruction from the guest OS
– This is sometimes done in practice, but it is often too slow
– Want to run most instructions directly on the CPU, like a regular OS kernel
18
MEMORY VIRTUALIZATION
19
Virtualizing Memory
• The VMM constructs a page table that maps guest addresses to host physical addresses
– E.g., if a guest VM has 1 GB of memory, it can access memory addresses 0~1 GB
– Each guest VM has its own mapping for memory address 0, etc.
– A different host physical address is used to store the data for those memory locations
20
Virtualizing the PTP register / page tables
• Terminology: 3 levels of addressing now
– Guest virtual, guest physical, host physical
– The guest VM's page table contains guest physical addresses
– The hardware page table must point to host physical addresses (actual DRAM locations)
• Setting the hardware PTP register to point to the guest page table would not work
– Processes in the guest VM might access host physical addresses 0~1 GB
– But those host physical addresses might not belong to that guest VM
21
One Solution: Shadow Pages
• Shadow paging
– The VMM intercepts the guest OS setting the virtual PTP register
– The VMM iterates over the guest page table and constructs a corresponding shadow PT
– In the shadow PT, every guest physical address is translated into a host physical address
– Finally, the VMM loads the host physical address of the shadow PT into the hardware PTP register
– The shadow PT is per process
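The construction step above can be sketched as a small simulation (all names here are invented for illustration): the guest page table maps guest-virtual pages to guest-physical pages, the VMM knows where each guest-physical page really lives, and the shadow table is the composition of the two.

```python
def build_shadow_pt(gpt, gpa_to_hpa):
    """Return a shadow page table mapping GVA pages directly to HPA pages."""
    shadow = {}
    for gva_page, gpa_page in gpt.items():
        # Every guest-physical address is translated into a host-physical
        # address; the hardware PTP register is then pointed at `shadow`.
        shadow[gva_page] = gpa_to_hpa[gpa_page]
    return shadow

gpt = {0x400: 0, 0x401: 1}       # guest PT: GVA page -> GPA page
gpa_to_hpa = {0: 7, 1: 3}        # VMM's choice of host frames for this VM
spt = build_shadow_pt(gpt, gpa_to_hpa)
assert spt == {0x400: 7, 0x401: 3}
```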
22
Shadow Page Table
[Diagram: inside the guest VM, the guest page table (GPT) maps GVA to GPA; the VMM-maintained shadow page table (SPT) serves as the hardware page table (HPT), mapping GVA directly to HPA.]
24
What if guest OS modifies its page table?
• Real hardware would start using the new page table's mappings
– But the virtual machine monitor has a separate shadow page table
• Goal:
– The VMM needs to intercept when the guest OS modifies its page table, and update the shadow page table accordingly
• Technique:
– Use the read/write bit in the PTE to mark the pages holding the guest page table read-only
– If the guest OS tries to modify them, the hardware triggers a page fault
– The page fault is handled by the VMM: update the shadow page table & restart the guest
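The write-protect trick above can be sketched as follows (class and method names are invented): the guest's write to its own page table "faults" into the VMM, which applies the write and keeps the shadow table in sync.

```python
class ShadowingVMM:
    def __init__(self, gpt, gpa_to_hpa):
        self.gpt = gpt
        self.gpa_to_hpa = gpa_to_hpa
        self.spt = {gva: gpa_to_hpa[gpa] for gva, gpa in gpt.items()}

    def guest_write_pte(self, gva_page, new_gpa_page):
        # Hardware would raise a page fault here, because the page holding
        # the guest PT is mapped read-only in the shadow PT.
        self.page_fault(gva_page, new_gpa_page)

    def page_fault(self, gva_page, new_gpa_page):
        self.gpt[gva_page] = new_gpa_page                    # apply guest's update
        self.spt[gva_page] = self.gpa_to_hpa[new_gpa_page]   # keep SPT in sync

vmm = ShadowingVMM({0x400: 0}, {0: 7, 1: 3})
vmm.guest_write_pte(0x400, 1)      # guest remaps the page
assert vmm.spt[0x400] == 3         # shadow PT now points at the right HPA
```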
25
Another Solution: Hardware Support
• Hardware support for memory virtualization
– Intel's EPT (Extended Page Table)
– AMD's NPT (Nested Page Table)
• Another table
– The EPT translates from GPA to HPA
– The EPT is controlled by the hypervisor
– The EPT is per-VM
[Diagram: in non-root mode, the guest VM's own page table (PT) translates GVA (VA) to GPA (PA); in root mode, the VMM's EPT translates GPA to HPA. Hardware walks both tables.]
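With hardware support, every access is translated twice, so no shadow table is needed. A minimal sketch (invented names): the guest controls its own PT, the VMM controls the per-VM EPT, and the guest may rewrite its PT freely without trapping.

```python
def translate(gva_page, guest_pt, ept):
    gpa_page = guest_pt[gva_page]   # first level: guest's own page table
    return ept[gpa_page]            # second level: VMM's per-VM EPT

guest_pt = {0x400: 0, 0x401: 1}
ept = {0: 9, 1: 4}
assert translate(0x400, guest_pt, ept) == 9
# The guest may rewrite guest_pt freely -- no trap needed:
guest_pt[0x400] = 1
assert translate(0x400, guest_pt, ept) == 4
```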
27
CPU VIRTUALIZATION
28
Virtualizing the U/K bit
• The hardware U/K bit should be set to 'U':
– Otherwise the guest OS can do anything
• Behavior affected by the U/K bit:
– 1. Whether privileged instructions can be executed (e.g., load PTP)
– 2. Whether pages marked "kernel-only" in the page table can be accessed
29
Virtual U/K Bit
• Implementing a virtual U/K bit:
– The VMM stores the current state of the guest's U/K bit (in some memory location)
– Privileged instructions in guest code will cause an exception (since the hardware is in U mode)
• The VMM exception handler's job is to emulate these instructions
– This approach is called "trap and emulate"
• E.g., if a load-PTP instruction runs in virtual K mode, load the shadow PT; otherwise, send an exception to the guest OS so it can handle it
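The trap-and-emulate dispatch above can be sketched like this (all names invented): the VMM keeps the virtual U/K bit in memory, and the handler for a trapped load-PTP either emulates it or reflects the exception back into the guest.

```python
class TrapAndEmulateVMM:
    def __init__(self):
        self.virtual_mode = "K"     # guest believes it is in kernel mode
        self.hw_ptp = None
        self.guest_exceptions = []

    def trap_load_ptp(self, guest_pt_addr):
        if self.virtual_mode == "K":
            # Emulate: load the shadow PT corresponding to the guest's PT.
            self.hw_ptp = ("shadow-of", guest_pt_addr)
        else:
            # A guest user process tried a privileged instruction: the
            # guest OS should see the exception, not the VMM.
            self.guest_exceptions.append("illegal-instruction")

vmm = TrapAndEmulateVMM()
vmm.trap_load_ptp(0x1000)
assert vmm.hw_ptp == ("shadow-of", 0x1000)
vmm.virtual_mode = "U"
vmm.trap_load_ptp(0x2000)
assert vmm.guest_exceptions == ["illegal-instruction"]
```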
30
PTP Exception
31
Virtual U/K Bit
32
Protect Kernel-only Pages
• How do we selectively allow / deny access to kernel-only pages in the guest PT?
– Hardware doesn't know about our virtual U/K bit
• Idea:
– Generate two shadow page tables: one for U, one for K
– When the guest OS switches to U mode, the VMM must invoke set_ptp(current, 0)
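The two-table idea can be sketched as follows (invented names and a simplified guest PT format): the K shadow table contains every mapping, while the U shadow table omits kernel-only pages, so the hardware enforces the virtual U/K bit.

```python
def shadow_tables(gpt, gpa_to_hpa):
    """Build the K-mode and U-mode shadow tables from one guest PT."""
    k_spt = {gva: gpa_to_hpa[e["gpa"]] for gva, e in gpt.items()}
    u_spt = {gva: gpa_to_hpa[e["gpa"]]
             for gva, e in gpt.items() if not e["kernel_only"]}
    return u_spt, k_spt

gpt = {0x400: {"gpa": 0, "kernel_only": False},
       0xC00: {"gpa": 1, "kernel_only": True}}
u_spt, k_spt = shadow_tables(gpt, {0: 7, 1: 3})
# In U mode the kernel-only page simply has no mapping:
assert 0xC00 in k_spt and 0xC00 not in u_spt
```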
33
Non-virtualizable Instruction
• Hard: what if some instructions don't cause an exception?
– E.g., on x86, code can always read the current U/K bit value
– The guest OS kernel might observe that it's running in U mode when it expects K
– Similarly, code can jump to U mode from both U & K modes
– How do we intercept the switch to U mode to replace the shadow page table?
34
• Approach 1: "Para-Virtualization"
• Modify the OS slightly to account for these differences
– Compromises our goal of compatibility
– Recall: the goal was to run multiple instances of the OS on a single computer
– Not perfect virtualization, but relatively easy to change Linux
– May be difficult to do with Windows without Microsoft's help
35
Approach 2: Binary Translation
• The VMM analyzes code from the guest VM
– Replaces problematic instructions with an instruction that traps
– Once we can generate a trap, we can emulate the desired behavior
– Original versions of VMware did this
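A toy version of that rewriting pass (the instruction names are symbolic placeholders, not real x86 opcodes): sensitive-but-non-trapping instructions are replaced with explicit traps so the VMM regains control.

```python
# Instructions that read or change privileged state without trapping:
SENSITIVE = {"read_uk_bit", "jump_to_user"}

def translate_block(code):
    out = []
    for insn in code:
        if insn in SENSITIVE:
            # Replace with a trapping instruction; the VMM's trap handler
            # then emulates the original behavior.
            out.append(("trap", insn))
        else:
            out.append(insn)        # safe: run directly on the CPU
    return out

block = ["add", "read_uk_bit", "store"]
assert translate_block(block) == ["add", ("trap", "read_uk_bit"), "store"]
```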
36
Approach 3: Hardware Support
• Both Intel and AMD:
– A special "VMM" operating mode, in addition to the U/K bit
– Two levels of page tables, implemented in hardware
• Makes the VMM's job much easier:
– Let the guest VM manipulate the U/K bit, as long as the VMM bit is cleared
– Let the guest VM manipulate the guest PT, as long as the host PT is set
– Many modern VMMs take this approach
37
DEVICE VIRTUALIZATION
38
Virtualizing Devices
• I/O virtualization
– The OS kernel accesses the disk by issuing special instructions to access I/O memory
– These instructions are only accessible in K mode, so in a guest they raise an exception
– The VMM handles the exception by simulating a disk operation
– The actual disk contents are typically stored in a file in the host file system
• Case: Xen's domain-0 architecture
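The trap-and-simulate disk path above can be sketched as follows (invented names; a bytearray stands in for the image file in the host file system): the guest's privileged I/O instruction traps, and the VMM satisfies the read from the backing store.

```python
SECTOR = 512

class EmulatedDisk:
    def __init__(self, image):
        self.image = image          # host-side backing store for the disk

    def trap_io_read(self, sector):
        # Invoked by the VMM's exception handler when the guest issues a
        # privileged disk-read instruction.
        off = sector * SECTOR
        return bytes(self.image[off:off + SECTOR])

disk = EmulatedDisk(bytearray(b"boot" + b"\x00" * (SECTOR - 4) + b"data" * 128))
assert disk.trap_io_read(0)[:4] == b"boot"
assert disk.trap_io_read(1)[:4] == b"data"
```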
39
Xen’s Domain-0 (Para-Virtualization)
• Split the driver into a backend and a frontend
– Frontend driver in the guest VM
• Appears as a new type of NIC
– Backend driver in domain-0
• One backend instance for each VM
– Communication via an I/O ring
• A shared page used for I/O
[Diagram: in the guest domain, the application's TCP/IP stack sits on the frontend device driver; in domain-0, the backend device driver connects to the real device driver; the split device driver pair communicates through Xen, which runs above the hardware and the physical device.]
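The frontend/backend handoff can be sketched like this (all names invented; a pair of queues stands in for the shared I/O-ring page): the frontend enqueues requests, and the backend drains them into the real driver and posts responses.

```python
from collections import deque

class IORing:                       # stands in for the shared page
    def __init__(self):
        self.requests = deque()
        self.responses = deque()

def frontend_send(ring, packet):
    ring.requests.append(packet)    # guest's frontend enqueues a request

def backend_poll(ring, real_device):
    # Domain-0's backend drains the ring and drives the real NIC driver.
    while ring.requests:
        pkt = ring.requests.popleft()
        real_device.append(pkt)
        ring.responses.append(("sent", pkt))

ring, nic = IORing(), []
frontend_send(ring, "pkt0")
backend_poll(ring, nic)
assert nic == ["pkt0"] and ring.responses[0] == ("sent", "pkt0")
```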
41
Xen’s Domain-0 (Full-Virtualization)
• Qemu for device emulation
– A user-level application running in domain-0
– Implements a NIC (e.g., an 8139) in software
– Each VM has its own Qemu instance running
• I/O request redirection
– The guest VM's I/O requests are sent to domain-0
– Domain-0 then sends them to Qemu
– Eventually, Qemu uses domain-0's NIC driver
[Diagram: the guest domain runs an application, TCP/IP stack, and an emulated-device driver (8139); domain-0 runs the user-level Qemu instance and the real device driver; Xen sits between the domains and the hardware's physical device.]
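The redirection chain above can be sketched as follows (invented names): a trapped guest I/O request is forwarded to domain-0, handed to that VM's Qemu instance, which finally calls domain-0's real driver.

```python
class Qemu:
    def __init__(self, real_driver):
        self.real_driver = real_driver

    def emulate(self, req):
        # Qemu implements the virtual NIC in software, then uses
        # domain-0's real device driver underneath.
        return self.real_driver(req)

class Dom0:
    def __init__(self, qemu):
        self.qemu = qemu            # one Qemu instance per guest VM

    def redirect(self, req):
        return self.qemu.emulate(req)

def guest_io_request(req, dom0):
    return dom0.redirect(req)       # the trapped request leaves the guest

dom0 = Dom0(Qemu(lambda req: ("done", req)))
assert guest_io_request("read-sector-0", dom0) == ("done", "read-sector-0")
```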
43
Hardware Virtualization
• Device support
– SR-IOV: Single-Root I/O Virtualization
– Specified by PCI-SIG and Intel
– Typically used on NIC devices
• VF: Virtual Function
– One VF for each VM
– VFs are limited, e.g., 8 per NIC
• PF: Physical Function
– One PF for each device
– Used only for configuration, not on the critical path
[Diagram: each guest domain's application and TCP/IP stack use a VF device driver directly; domain-0 holds the PF device driver and user-level configuration; the physical device exposes several VFs and one PF, with Xen above the hardware.]
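The VF/PF split can be sketched like this (invented names): the PF is used only on the control path to hand out VFs, and each VM then drives its VF directly, bypassing domain-0 on the data path.

```python
class SriovNic:
    def __init__(self, num_vfs=8):          # VFs are limited, e.g., 8 per NIC
        self.free_vfs = list(range(num_vfs))
        self.assigned = {}                  # vm -> vf

    def pf_assign_vf(self, vm):
        # Configuration goes through the PF, never the data path.
        if not self.free_vfs:
            raise RuntimeError("no VFs left on this NIC")
        vf = self.free_vfs.pop(0)
        self.assigned[vm] = vf
        return vf

nic = SriovNic()
assert nic.pf_assign_vf("vm0") == 0
assert nic.pf_assign_vf("vm1") == 1
assert nic.assigned == {"vm0": 0, "vm1": 1}
```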
45
Benefits of Virtual Machine
• Can share hardware between unrelated services, with enforced modularity
• Can run different OSes (Linux, Windows, etc.)
• Level-of-indirection tricks: can move a VM from one physical machine to another
46
How do VMs relate to microkernels?
• They solve somewhat orthogonal problems
– Microkernel: splitting a monolithic kernel into client/server components
– VMs: how to run many instances of an existing OS kernel
• E.g., can use a VMM to run one network file server VM and one application VM
– The hard problem is still present
• Does the entire file server still fail at once?
• Or do we adopt a more complex design with many file servers?