Faculty of Computer Science Institute for System Architecture, Operating Systems Group Virtualization Dresden, 2009-12-01 Henning Schild


TU Dresden, 2009-12-01 MOS - Virtualization Slide 2 von 58

So Far ...

● Basics
  ● Introduction
  ● Threads & synchronization
  ● Memory
● Real-time
● Resource Management
● Device Drivers


Today: Virtualization

● Introduction
  ● Motivation & classification, flavors
● L4Linux: Para-virtualization on top of L4
  ● Architecture
  ● Address space layout
  ● Scenarios
● NOVA – a μ-hypervisor
● KVM on FiascoOC


One possible definition ...

● Introduction of layers of abstraction between physical resources and users/applications.
  ● partitioning of resources
  ● aggregation of resources
  ● combinations


Virtualization flavours

● Multitasking
  ● OS as layer of abstraction
  ● machine partitioning, virtual memory and time slices
● application level
  ● Unix chroot
  ● FreeBSD Jails, Solaris Zones, Linux Vserver
  ● Wine
  ● …
● multiple OSs on one machine
  ● VMware, QEMU, VirtualBox
  ● UML, Xen, L4Linux


Virtualization – a hype

● A lot of interest in the research community in recent years, e.g.:
  ● SOSP 03: Xen and the Art of Virtualization
  ● EuroSys 07: a whole session on virtualization
● Many virtualization products:
  ● VMware, QEMU, VirtualBox, KVM, Hyper-V
● x86 hardware support
● further increasing demand:
  ● VMware: from 240 to 6300 employees within the last few years


Virtualization - a new idea?

● Originates in IBM's CP/CMS series used on System/3xx mainframes (starting ~1964)
  ● Control Program - VMM
  ● Cambridge Monitor System
    ● Guest OS
● Memory protection
● SIE instruction (VM mode)
● CP encodes much of the guest privileged state in a hardware-defined format
● IBM's first virtual memory system


Motivation


Virtualization - Motivation

● optimize utilization
  ● server consolidation
● Isolation
  ● security reasons
  ● incompatibility
● reusing legacy software
  ● e.g. Windows on Linux
● development
  ● virtual test machines


Virtualization - Buzzwords

TCO

Migration

Consolidation

Availability

Utilization

Efficiency

Security

Flexibility

Manageability

Virtual Appliance

Maintainability

Virtualization


Formal Requirements

● Equivalence
  ● guest behaviour should match real machine
● Isolation
  ● host controls resource access
  ● guests are isolated from host and from each other
● Efficiency
  ● guest code should be executed natively

see paper reading 2010-01-12: “Formal requirements for virtualizable third generation architectures”
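The paper named above gives a sufficient condition for building a trap-and-emulate VMM: every sensitive instruction must also be privileged, so that it traps when executed in user mode. The following is a minimal sketch of that subset check. The instruction sets are simplified illustrations chosen for this example, not complete ISA listings, and the function names are hypothetical.

```python
# Toy check of the Popek/Goldberg condition: every sensitive instruction
# must also be privileged (i.e. trap when executed in user mode).
# The instruction sets below are illustrative, not a complete ISA listing.

def is_classically_virtualizable(sensitive, privileged):
    """Return True if all sensitive instructions trap in user mode."""
    return set(sensitive) <= set(privileged)

def non_trapping_sensitive(sensitive, privileged):
    """Instructions that break trap-and-emulate, sorted for stable output."""
    return sorted(set(sensitive) - set(privileged))

# Hypothetical architecture where everything sensitive traps.
clean_sensitive = {"load_page_table", "disable_interrupts"}
clean_privileged = {"load_page_table", "disable_interrupts", "halt"}

# Simplified x86 without hardware support: popf is sensitive but does
# not trap in user mode.
x86_sensitive = {"lgdt", "cli", "popf"}
x86_privileged = {"lgdt", "cli", "hlt"}

print(is_classically_virtualizable(clean_sensitive, clean_privileged))
print(non_trapping_sensitive(x86_sensitive, x86_privileged))
```

Any instruction reported by `non_trapping_sensitive` has to be handled by other means, such as binary translation, paravirtualization or hardware support.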


Classification help

● Virtualization - an overloaded term
● Some classification criteria:
  ● Objective target: hardware, OS API or ABI?
  ● Emulation vs. virtualization: do we have to interpret some or all instructions? Binary vs. byte code interpretation (e.g.: JVM)
  ● Can we modify the target software? (e.g. using para-virtualization techniques)


Reimplementation of the OS interface

● used to bring existing software to other or newly created OSes
● when copying the API of an OS, target software needs to be re-linked
● in contrast, ABI emulation can run unmodified binaries, e.g.: Wine
● Disadvantages of both approaches:
  ● huge effort
  ● shooting at a moving target


Virtualize the hardware

● instead of emulating the OS API or ABI, take the underlying platform
  ● common to many OSs
● Emulation
  ● interpret/translate guest code
● Virtualization
  ● native execution of guest code
  ● with or without HW support
● Paravirtualization
  ● modification of the guest


Emulation

● binary translation/interpretation of guest code
● no native execution
● contradicts the efficiency requirement
● applicable to a lot of architectures
● often used for peripheral devices
● Examples: QEMU, Bochs
  ● QEMU emulates x86, ARM, SPARC, PowerPC ...
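To make "no native execution" concrete, here is a minimal interpreter sketch for a made-up three-instruction guest ISA. The opcodes, register names and the `emulate` function are assumptions for illustration only and are unrelated to how QEMU or Bochs are implemented.

```python
# Minimal interpreter for a made-up three-instruction guest ISA.
# Every guest instruction is decoded and executed in software; nothing
# runs natively, which is why pure emulation is slow but portable.

def emulate(program, max_steps=1000):
    regs = {"r0": 0, "r1": 0}
    pc = 0
    steps = 0
    while pc < len(program) and steps < max_steps:
        op, *args = program[pc]
        if op == "li":          # load immediate: li reg, value
            regs[args[0]] = args[1]
        elif op == "add":       # add: dst = dst + src
            regs[args[0]] += regs[args[1]]
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
        steps += 1
    return regs

guest_code = [("li", "r0", 2), ("li", "r1", 40), ("add", "r0", "r1"), ("halt",)]
print(emulate(guest_code))
```

Because the guest state lives entirely in host data structures, the same loop works regardless of the host architecture, at the cost of many host instructions per guest instruction.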


Platform virtualization in software

● guest OS runs natively in less privileged mode
● privileged instructions fail and are handled by the VMM (trap-and-emulate)
● VMM derives and manages shadow structures from guest's primary structures, e.g.: shadow page tables
● JIT binary translation
● Examples: VMware, KQEMU, VirtualBox
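The trap-and-emulate idea from this slide can be sketched as follows. This is a toy model under stated assumptions: a Python exception stands in for the hardware trap, and `VirtualCPU`, `execute_deprivileged` and `vmm_emulate` are hypothetical names, not part of any real VMM.

```python
# Sketch of trap-and-emulate: the guest kernel runs deprivileged, so
# privileged instructions raise a trap that the VMM handles by updating
# a virtual CPU state instead of the real one.

class PrivilegedInstructionTrap(Exception):
    def __init__(self, instr):
        super().__init__(instr)
        self.instr = instr

class VirtualCPU:
    def __init__(self):
        self.interrupts_enabled = True   # virtual interrupt flag
        self.acc = 0

PRIVILEGED = {"cli", "sti"}

def execute_deprivileged(instr, vcpu):
    """Stand-in for native execution in a less privileged mode."""
    if instr in PRIVILEGED:
        raise PrivilegedInstructionTrap(instr)
    if instr == "inc":
        vcpu.acc += 1

def vmm_emulate(instr, vcpu):
    """The VMM applies the instruction's effect to the virtual state only."""
    if instr == "cli":
        vcpu.interrupts_enabled = False
    elif instr == "sti":
        vcpu.interrupts_enabled = True

def run_guest(code, vcpu):
    traps = 0
    for instr in code:
        try:
            execute_deprivileged(instr, vcpu)
        except PrivilegedInstructionTrap as trap:
            vmm_emulate(trap.instr, vcpu)
            traps += 1
    return traps

vcpu = VirtualCPU()
trap_count = run_guest(["inc", "cli", "inc", "inc"], vcpu)
print(trap_count, vcpu.acc, vcpu.interrupts_enabled)
```

Only the privileged instruction costs a round trip through the VMM; the other instructions run without intervention, which is what keeps the approach efficient.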


X86 Virtualization


Problems with x86 virtualization

● Ring-alias problem
  ● guest OS runs in privilege level > 0
● Address space compression
  ● part of the guest OS's address space used by the VMM (e.g. IDT, GDT)
● some instructions do not trap, e.g.:
  ● popf: pop stack into EFLAGS register, causes interrupt handling problems (IF not updated in user-mode)
● faulting implies performance loss
  ● kernel entry/exit -> doubled context switch
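The popf problem can be illustrated with a simplified model that tracks only the IF bit. The function below is a sketch, not a full description of the instruction; the CPL and IOPL values are chosen for illustration.

```python
IF_BIT = 1 << 9  # interrupt-enable flag in EFLAGS

def popf(eflags, popped_value, cpl, iopl):
    """Simplified model of popf: only the IF bit is considered.

    If the current privilege level is numerically greater than IOPL,
    the hardware keeps the old IF bit and raises no fault.
    """
    if cpl <= iopl:
        return popped_value
    return (popped_value & ~IF_BIT) | (eflags & IF_BIT)

flags_with_if = IF_BIT | 0x2   # bit 1 of EFLAGS is always set
flags_without_if = 0x2

# The guest kernel expects ring 0 semantics but was deprivileged to ring 3.
native = popf(flags_with_if, flags_without_if, cpl=0, iopl=0)
deprivileged = popf(flags_with_if, flags_without_if, cpl=3, iopl=0)

print(hex(native), hex(deprivileged))
```

In the deprivileged case the guest believes it has disabled interrupts, yet IF is unchanged and the VMM never sees a trap, so it cannot keep a virtual interrupt flag in sync.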


Hardware enabled virtualization

● Example: Intel VT
  ● root and non-root mode, VM entry and exit
  ● Virtual Machine Control Structure (VMCS) in physical memory holds guest and host state and some additional control information
  ● VMCS is used to configure and inspect VM exit conditions, e.g.: whether a guest traps when masking or unmasking interrupts
● AMD SVM is similar


Hardware enabled virtualization

● problematic instructions trap
● reduced software complexity
● Examples: KVM, VirtualBox, Xen, Hyper-V, Windows 7 XP Mode, Parallels ...


MMU Virtualization


Shadow page tables

● Memory tracing of the page tables
● decode and emulate guest's pagefaults

[Figure: guest virtual, guest physical, host virtual and host physical memory, related by the guest page table, host page table and shadow page table]


Shadow page tables

1) pagefault in guest (GVA)

2) caught by hypervisor/VMM

3) parse guest page tables (GVA → GPA)

4) maybe inject pagefault into guest and parse again

5) translate guest pt entry to shadow pt entry (GPA → HVA → HPA)

6) create mapping in shadow pt and resume

→ costly, recent x86 processors come with hardware support

[Figure: same memory and page table diagram as on the previous slide]

GVA: guest virtual address
GPA: guest physical address
HVA: host virtual address
HPA: host physical address
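Steps 3 to 6 can be sketched with dictionaries standing in for page tables. This is a simplified illustration: real page tables are multi-level structures with permission bits, and all names below (`handle_shadow_fault`, `gpa_to_hva`, the page numbers) are assumptions made for this example.

```python
class GuestPageFault(Exception):
    """Fault that must be injected into the guest (step 4)."""

def handle_shadow_fault(gva_page, guest_pt, gpa_to_hva, host_pt, shadow_pt):
    # Step 3: parse the guest page table (GVA -> GPA).
    if gva_page not in guest_pt:
        # Step 4: the guest has no mapping itself and must handle the fault.
        raise GuestPageFault(gva_page)
    gpa_page = guest_pt[gva_page]
    # Step 5: translate GPA -> HVA -> HPA using VMM and host tables.
    hva_page = gpa_to_hva[gpa_page]
    hpa_page = host_pt[hva_page]
    # Step 6: install the combined GVA -> HPA mapping and resume.
    shadow_pt[gva_page] = hpa_page
    return hpa_page

guest_pt = {0x10: 0x2}        # maintained by the guest OS
gpa_to_hva = {0x2: 0x7F0}     # where the VMM placed guest memory
host_pt = {0x7F0: 0x9A}       # maintained by the host
shadow_pt = {}                # what the hardware MMU actually uses

print(hex(handle_shadow_fault(0x10, guest_pt, gpa_to_hva, host_pt, shadow_pt)))
print(shadow_pt)
```

Each first access to a page goes through this software path, which is the cost that the hardware support mentioned above is meant to remove.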


MMU Virtualization with HW support

● hardware can walk two levels of page tables (guest and VMM)
● VM page table constructed by VMM maps GPA to HPA
● guest manages its own GVA to GPA tables
● no shadow paging in software required
● pagefaults can be resolved without mode switching
● AMD: nested paging, Intel: EPT

→ significant performance increase for VMs
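A two-stage translation can be sketched as two successive lookups. The single-level dictionaries are a deliberate simplification (real hardware walks multi-level tables and also translates the guest's own page table accesses), and the table contents are made up for illustration.

```python
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1

def translate(table, addr):
    """Look up the page frame for addr and keep the page offset."""
    frame = table[addr >> PAGE_SHIFT]
    return (frame << PAGE_SHIFT) | (addr & PAGE_MASK)

def nested_translate(guest_pt, nested_pt, gva):
    gpa = translate(guest_pt, gva)     # table maintained by the guest OS
    hpa = translate(nested_pt, gpa)    # table maintained by the VMM
    return gpa, hpa

guest_pt = {0x400: 0x12}    # GVA page -> GPA frame
nested_pt = {0x12: 0x8A3}   # GPA frame -> HPA frame
gpa, hpa = nested_translate(guest_pt, nested_pt, 0x400ABC)
print(hex(gpa), hex(hpa))
```

The guest can change `guest_pt` freely without involving the VMM, since the second stage still confines it to the memory the VMM has assigned.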


Paravirtualization


Paravirtualization

● modify guest OS to integrate it in the runtime environment of another OS
● advantages:
  ● no hardware support required
  ● cooperation from guests possible
● disadvantages:
  ● source code required
  ● high development cost
● L4Linux, Xen, User Mode Linux, coLinux
● Afterburner (Karlsruhe): modify binary code
● paravirtualized drivers: VMware, KVM (virtio)


XEN


Examples from TUDOS group


L4Linux


L4Linux: history

● presented at SOSP '97
  ● based on x86 Linux 2.0 on top of first L4 kernel
● (L4)Linux has evolved over the years
  ● 2.2 supported MIPS and x86
  ● 2.4 first version to run on L4Env
  ● 2.6 uses 'paravirtualization' L4 kernel features
● recently
  ● latest L4Linux release 2.6.31
  ● x86 and ARM support
  ● SMP


Linux Architecture

[Figure: applications in user mode on top of the Linux kernel. The kernel's architecture-independent part contains Processes (Scheduling, IPC), Memory Management (Page allocation, Address spaces, Swapping), File Systems (VFS, File System Impl.), Networking (Sockets, Protocols) and Device Drivers. Its architecture-dependent parts are the System-Call Interface and Hardware Access. Below the kernel: Hardware (CPU, Memory, PCI, Devices)]


Linux Architecture

[Figure: Linux architecture diagram, repeated from the previous slide]

● Architecture dependent part
  ● Small, for x86 about 2% of the kernel
  ● System call interface:
    ● Kernel entry
    ● Signal delivery
    ● Copy from/to user space
  ● Hardware access:
    ● CPU state and features
    ● MMU
    ● Interrupt
    ● Memory mapped I/O, I/O ports
● Architecture dependent part implements generic interface used by independent part


L4Linux Architecture

[Figure: the Linux kernel (architecture-independent part plus architecture-dependent System-Call Interface and Hardware Access) runs as an L4 task in user mode; each application runs in its own L4 task; L4IO, Console, moe and sigma0 run alongside. Only the FiascoOC kernel runs in kernel mode, directly on the hardware]


L4Linux Architecture

● The Linux kernel and each Linux user process run in their own L4 task
● L4/L4RE specific part is implemented as a separate architecture:
  ● arch/l4
  ● include/asm-l4
● L4/L4RE architecture dependent part itself divides into an x86 and an ARM specific part
● most code is reused from the x86 or ARM specific part, respectively


Linux address space layout

● 0x0 – TASK_SIZE
  ● user part
  ● changes on every context switch
● TASK_SIZE – 0xF...
  ● kernel part
  ● constant in all address spaces
● Physical memory mapped beginning at PAGE_OFFSET

[Figure: address space from 0x00000000 to 0xFFFFFFFF. User address space (application, libraries, …) up to TASK_SIZE; kernel address space from 0xC0000000 (PAGE_OFFSET) upwards, containing physical memory with the kernel image, followed by vmalloc, kmap, …]
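Because physical memory is mapped linearly starting at PAGE_OFFSET, converting between physical and kernel virtual addresses is a constant offset. The sketch below uses the 0xC0000000 value from the figure; treating TASK_SIZE as equal to PAGE_OFFSET is an assumption that matches the common 3G/1G split on 32-bit x86, and the function names are illustrative stand-ins.

```python
PAGE_OFFSET = 0xC0000000  # value shown in the layout figure
TASK_SIZE = PAGE_OFFSET   # assumption: 3G/1G split on 32-bit x86

def phys_to_virt(paddr):
    """Kernel virtual address of a physical address in the linear mapping."""
    return paddr + PAGE_OFFSET

def virt_to_phys(vaddr):
    """Inverse of phys_to_virt, valid only inside the linear mapping."""
    return vaddr - PAGE_OFFSET

def is_user_address(vaddr):
    return 0 <= vaddr < TASK_SIZE

print(hex(phys_to_virt(0x00100000)))
print(is_user_address(0x08048000), is_user_address(0xC0100000))
```

In L4Linux this arithmetic no longer refers to real physical memory: the kernel runs in a user-level address space, so its "physical" addresses are guest-physical memory inside the L4Linux server.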


L4Linux address space layout

[Figure: left, the native Linux layout from the previous slide (user address space below TASK_SIZE, kernel address space from PAGE_OFFSET = 0xC0000000 with physical memory, kernel image, vmalloc, kmap, …). Right, two separate address spaces on L4, each from 0x00000000 to 0xFFFFFFFF with the FiascoOC microkernel at the top from 0xC0000000: the L4Linux server (kernel image, guest-physical memory, vmalloc, kmap, …) and an L4Linux user process (application, libraries, …)]


L4Linux: problems to be solved

● L4Linux server has to:
  ● have some basic resources (memory, I/O)
  ● manage page tables of its user processes
  ● handle exceptions from user processes
  ● schedule its tasks
● L4Linux user processes have to:
  ● 'enter' the L4Linux kernel (now in a different address space)
● Kernel needs information from user processes formerly accessible in the same address space, e.g.: syscall arguments


Linux address space management

● Architecture-independent part:
  ● general page table management
  ● implements allocator strategies
  ● page replacement strategies
  ● assumes a 4-level page table provided by the architecture-dependent part
● Architecture-dependent part:
  ● set, remove and test entries
  ● TLB handling
  ● Linux for x86 uses 2-level page tables

[Figure: application on top of the Linux kernel (Memory Management: page allocation, address spaces, swapping; thread_info; architecture-dependent part (i386)) on top of the hardware]


L4Linux address space management

● L4Linux user processes are actually L4 tasks
  ● L4Linux server is the pager
  ● Hardware page tables are managed by L4 kernel
  ● L4Linux page tables are mirrored
● L4Linux uses map/unmap operations
● adding page table entries is done lazily (when a pagefault occurs)

[Figure: same stack as on the previous slide, with the Fiasco kernel between the Linux kernel and the hardware]
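The lazy mirroring described on this slide can be sketched as follows. This is a toy model: the classes and method names are hypothetical, dictionary updates stand in for L4 map and unmap operations, and none of this is taken from the L4Linux sources.

```python
class L4Task:
    """Stand-in for an L4 task whose hardware page table the kernel owns."""
    def __init__(self):
        self.hw_mappings = {}

class L4LinuxPager:
    def __init__(self):
        self.linux_pt = {}   # Linux's own page table for one process
        self.map_calls = 0

    def set_pte(self, vpage, frame):
        # Linux updates only its own table; nothing is mapped yet.
        self.linux_pt[vpage] = frame

    def handle_pagefault(self, task, vpage):
        # Called when the microkernel forwards a fault to the pager.
        frame = self.linux_pt.get(vpage)
        if frame is None:
            return False   # left to the regular Linux fault handling
        task.hw_mappings[vpage] = frame   # stands in for an L4 map operation
        self.map_calls += 1
        return True

    def clear_pte(self, task, vpage):
        self.linux_pt.pop(vpage, None)
        task.hw_mappings.pop(vpage, None)   # stands in for an L4 unmap

pager = L4LinuxPager()
task = L4Task()
pager.set_pte(5, 42)
print(task.hw_mappings)            # still empty: mapping is deferred
pager.handle_pagefault(task, 5)
print(task.hw_mappings)
```

Removal cannot be deferred in the same way: once Linux clears an entry, the mapping has to be revoked immediately, otherwise the process could keep using a page Linux considers gone.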


General exception handling

● if an L4 task raises an exception, the kernel sends an exception IPC to the handler (feature in FiascoOC and L4.X2)
● Exception IPC contains CPU state of the client
● Exception handler can reply with a new state, for instance another instruction pointer
● Exception IPC can be used to recognize Linux system calls:
  ● INT 0x80 will trigger an exception
  ● L4Linux server acts as exception handler for its user processes
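The exception IPC round trip for a system call might look like the sketch below. It is a model under several assumptions, each labeled in the code: the `CpuState` fields, the trap number, the reported instruction pointer and the syscall table are all hypothetical and do not reflect the actual FiascoOC message layout.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class CpuState:
    ip: int       # instruction pointer
    eax: int      # syscall number on entry, return value on reply
    ebx: int      # first syscall argument
    trapno: int   # exception vector reported by the kernel

INT80_LEN = 2     # "int 0x80" is encoded in two bytes (0xCD 0x80)
GP_FAULT = 13     # assumption: the software interrupt shows up as #GP

def sys_double(arg):
    """Hypothetical system call used only for this example."""
    return 2 * arg

SYSCALL_TABLE = {1000: sys_double}

def exception_handler(state):
    """Receive an exception IPC and build the reply state."""
    if state.trapno != GP_FAULT:
        raise NotImplementedError("other exceptions are not modelled")
    result = SYSCALL_TABLE[state.eax](state.ebx)
    # Reply: return value in eax, resume after the trapping instruction
    # (assumes the reported ip points at the int instruction itself).
    return replace(state, eax=result, ip=state.ip + INT80_LEN)

request = CpuState(ip=0x08048100, eax=1000, ebx=21, trapno=GP_FAULT)
reply = exception_handler(request)
print(reply.eax, hex(reply.ip))
```

The user process never notices the detour: from its point of view the instruction completed and the result is in the register it expects.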


L4Linux kernel entry

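● System call costs:
  ● 2x kernel entry/exit (exception and reply)
  ● 2x address space switch

[Figure: an L4Linux user process executes INT 0x80; the Fiasco microkernel passes the exception to the L4Linux server (architecture-dependent and architecture-independent parts) and the reply back to the process; the steps are numbered 1 to 4]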


Interrupt handling

● Interrupt messages are received in separate threads
● Interrupt threads run at a higher priority than other Linux threads (Linux semantics)
● Interrupt threads wake up the idle thread or force the running user process to enter the Linux server
● Plain Linux disables interrupts for synchronization
  ● Use a lock instead of CLI/STI

[Figure: hardware below the Fiasco kernel; L4IO and the L4Linux server on top. Inside the server: interrupt threads, the main thread and a device driver that registers its handler via request_irq(irq_no, handler, …)]
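Replacing CLI/STI with a lock can be sketched with ordinary threads. The names `local_irq_disable` and `local_irq_enable` mirror the Linux macros but are plain Python stand-ins here, and the single global lock is an assumption made to keep the model small.

```python
import threading

# One global lock stands in for the "interrupts disabled" state.
irq_lock = threading.Lock()
pending = []

def local_irq_disable():
    irq_lock.acquire()

def local_irq_enable():
    irq_lock.release()

def interrupt_thread(irq_no, count):
    """Models an L4Linux interrupt thread delivering `count` interrupts."""
    for _ in range(count):
        with irq_lock:           # waits while the "kernel" holds the lock
            pending.append(irq_no)

def kernel_critical_section():
    """Code that plain Linux would bracket with cli/sti."""
    local_irq_disable()
    try:
        snapshot = list(pending)
        pending.clear()
    finally:
        local_irq_enable()
    return snapshot

worker = threading.Thread(target=interrupt_thread, args=(7, 100))
worker.start()
worker.join()
handled = kernel_critical_section()
print(len(handled))
```

An interrupt thread that arrives during a critical section simply blocks on the lock, which has the same effect as a deferred interrupt without needing privileged instructions.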


not covered in detail here ...

● Linux kernel needs to access address space of user processes (e.g. syscall arguments)
  ● walk page tables of user process
● Security problems with DMA
  ● move device drivers out of L4Linux
  ● I/O MMU
● L4Linux scheduling
  ● only one L4Linux process is active at a time
  ● other processes are waiting in IPC (exception or pagefault)
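For the first point, a software page table walk for reading user memory could look like the sketch below. The flat dictionary page table, the `bytearray` standing in for guest-physical memory and the function name are assumptions for illustration; the real kernel uses multi-level tables and reports errors differently.

```python
PAGE_SIZE = 4096

def copy_from_user(process_pt, guest_phys_mem, uaddr, length):
    """Copy `length` bytes starting at user virtual address `uaddr`.

    process_pt maps user page numbers to page numbers in the server's
    guest-physical memory (a bytearray in this model).
    """
    out = bytearray()
    while length > 0:
        page, offset = divmod(uaddr, PAGE_SIZE)
        if page not in process_pt:
            raise LookupError(f"no mapping for user page {page:#x}")
        chunk = min(length, PAGE_SIZE - offset)   # stay within one page
        start = process_pt[page] * PAGE_SIZE + offset
        out += guest_phys_mem[start:start + chunk]
        uaddr += chunk
        length -= chunk
    return bytes(out)

guest_phys_mem = bytearray(4 * PAGE_SIZE)
process_pt = {0x10: 3, 0x11: 1}   # two user pages, not contiguous
guest_phys_mem[3 * PAGE_SIZE + 4094:3 * PAGE_SIZE + 4096] = b"he"
guest_phys_mem[1 * PAGE_SIZE:1 * PAGE_SIZE + 3] = b"llo"
data = copy_from_user(process_pt, guest_phys_mem, 0x10 * PAGE_SIZE + 4094, 5)
print(data)
```

The copy has to proceed page by page because pages that are adjacent in the user process need not be adjacent in the server's memory.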


Hybrid applications

● Linux applications that are 'L4 aware'
● Need to be detected by Linux server
  ● Linux server puts them in UNINTERRUPTIBLE state in its own data structures
  ● Will not disturb ongoing IPC in hybrid task
● L4Linux user processes run as Aliens
  ● Special alien flag used when creating a task
  ● Aliens trap when invoking an L4 system call
  ● Exception handler monitors system call
  ● Fiasco-only feature


L4Linux use cases


Real-time video player

● L4Linux user processes might use L4 services

[Figure: FiascoOC kernel at the bottom; Loader, Roottask, moe and DOpE on top; an RT-MPEG player runs next to L4Linux; an MPlayer frontend inside L4Linux controls the RT-MPEG player]


Multiple L4Linux instances

● Using multiple instances concurrently, e.g. for each security domain
● Devices need to be multiplexed (see resource management lesson: ORe, nitpicker, windhoek, …)
● Communication through network, special IPC monitors ...

[Figure: FiascoOC kernel at the bottom; Loader, Roottask, console and moe; a virtualization infrastructure layer; two L4Linux servers, each with its own applications]


Use L4Linux as a toolbox

● L4Linux instances can provide access to various complex software stacks, e.g.:
  ● Network stacks
  ● Drivers
  ● Filesystems

[Figure: Fiasco kernel at the bottom; Loader, Roottask and moe; an L4 application next to L4Linux; an alien filesystem wrapper inside L4Linux]


Faithful Virtualization


NOVA – μ hypervisor approach

● NOVA OS Virtualization Architecture
● Separate hypervisor and VMM(s)

[Figure: the hypervisor runs in kernel mode; a server and one VMM per guest run in user mode (root mode); each guest OS runs in non-root mode with its own VMM]


NOVA

● Hypervisor manages protection domains:
  ● address spaces and virtual machines
● Virtual machine has associated virtualization handler -> the VMM (codename: Vancouver)
● VMMs handle virtualization faults and implement virtual devices
● split functionality of hypervisor and VMM
  ➔ reduced complexity of hypervisor which runs security-sensitive applications beside the VMs


FiascoOC and KVM-L4

● FiascoOC provides AMD SVM support
● KVM can be reused with little modification

[Figure: Fiasco kernel in kernel mode; Loader, Roottask, DMPhys, Names and the L4Linux server with KVM-L4 in user mode; two qemu-kvm instances on the host side, each with a guest OS on the guest side]


FiascoOC and KVM-L4

● FiascoOC supports AMD SVM
  ● memory is mapped to VMs using map/unmap mechanism
  ● invoke VM capability to enter guest mode
● existing VMM can be reused
  ● KVM with little modification
  ● low development cost
● Virtual Machines next to secure applications


Summary

● Virtualization flavours
  ● API or ABI emulation
  ● Emulation
  ● Full virtualization
  ● Hardware (especially x86) or OS
  ● Paravirtualization
● L4Linux – paravirtualization in detail
  ● Address space layout & management
  ● Taming Linux (interrupts, I/O memory)
● Faithful Virtualization
  ● NOVA – minimal hypervisor + VMM from scratch
  ● KVM-L4 reusing a VMM


References

● Tom Van Vleck: 'The IBM 360/67 and CP/CMS', http://www.multicians.org/thvv/360-67.html
● Keith Adams and Ole Agesen: 'A Comparison of Software and Hardware Techniques for x86 Virtualization', ASPLOS 2006, http://www.vmware.com/pdf/asplos235_adams.pdf
● Intel Virtualization Technology, http://www.intel.com/technology/itj/2006/v10i3/1-hardware/1-abstract.htm
● H. Härtig, M. Roitzsch, A. Lackorzynski, B. Döbel and A. Böttcher: 'L4 – Virtualization and Beyond'


References

● Udo Steinberg: 'NOVA Hypervisor Architecture Whitepaper', Internal Report, 2007
● L4Linux Webpage, http://os.inf.tu-dresden.de/L4/LinuxOnL4
● Adam Lackorzynski: 'L4Linux Porting Optimizations', Diploma Thesis, 2004, http://os.inf.tu-dresden.de/papers_ps/adam-diplom.pdf


Outlook

● now, paper reading:
  ● Singularity - Rethinking the Software Stack
● next weeks:
  ● legacy containers
  ● OS Personalities