1
Virtual Machine Mobility
with Self-Migration
Jacob Gorm Hansen
Department of Computer Science,
University of Copenhagen
(now at VMware)
2
Short Bio
• Studied CS at DIKU in Copenhagen
• Worked for Io Interactive on the first two Hitman games
• Master’s thesis 2002 on “Nomadic Operating Systems”
• Ph.D. thesis 2007 “Virtual Machine Mobility with Self-Migration”
– Early involvement in the Xen VMM project in Cambridge
– Worked on “Tahoma” secure browser at the University of Washington
– Interned at Microsoft Research Cambridge (2004) and Silicon Valley (2006) (security related projects)
• Presently working at VMware on top-secret cool stuff
4
Talk Overview
• Motivation & Background
• Virtual Machine Migration
– Live Migration in NomadBIOS
– Self-Migration in Xen
• Laundromat Computing
• Virtual Machines on the desktop
• Related & future work + conclusion
5
Motivation & Background
6
Motivation
• Researchers and businesses need computing power on demand
– Science increasingly relies on simulation
– Web 2.0 startups grow quickly (and die just as fast)
• Hardware is cheap, manpower and electricity are not
– Idle machines are expensive
– Immobile jobs reduce utilization
– Fear of untrusted users stealing secrets or access
• We need a dedicated Grid/Utility computing platform:
– Simple configuration & instant provisioning
– Strong isolation of untrusted users
– Backwards compatible with legacy apps (C, Fortran, …)
– Location independence & automated load-balancing
– Pay-for-access without the lawyer part
7
Our Proposal
• Use virtual machines as containers for untrusted code
• Use live VM migration to make execution transient and location-independent
• Use micro-payments for pay-as-you-go computing
8
Grid & Utility Computing Vision
Jill creates a web site for sending greeting cards
She finds a Utility to host her application
She pays for access
Her application roams freely, looking for the cheapest and fastest resources
9
Can’t We Do This With UNIX?
• Configuration complexity & lack of isolation
– Hard to agree on a common software install (BSD, Red Hat, Ubuntu?)
– Name-space conflicts, e.g., files in /tmp
– UNIX is designed for sharing, not security
• Mismatch of abstractions
– Process ≠ Application
– User login ≠ Customer
– Quota ≠ Payment
• Location-dependence
– No bullet-proof way of moving a running application to a new host
– Process migration in UNIX just doesn’t work
10
Use Virtual Machines Instead of Processes
“A virtual machine is […] an efficient, isolated duplicate of the real machine” [Popek & Goldberg, 1974]
“A virtual machine cannot be compromised by the operation of any other VM. It provides a private, secure, and reliable
computing environment for its users, …” [Creasy, 1981]
[Figure: VMs running side by side on a VMM, which runs on the hardware]
11
Pros and Cons of VMs
• Pros:
– Strongly isolated
– Name-space is not shared
– More configuration freedom
– Simple interface to hardware
– VMs can migrate between hosts
• Cons:
– Memory and disk footprint of Guest OS
– Less sharing potential
– Extra layer adds I/O overhead
– Not processor-independent
12
Virtual Machine Migration
13
Why Process Migration Doesn’t Work
• Because of residual dependencies
• Interface between app and OS not clearly defined
• Part of application state resides in OS kernel
[Figure: a migrated process leaving residual dependencies (open files, kernel state) behind on the old host]
14
Virtual Machine Migration is Simpler
• A VM is self-contained
• Interface to virtual hardware is clearly defined
• All dependencies abstracted via fault-resilient network protocols
[Figure: the same workloads running inside VMs on two hosts; each host’s VMM exposes a clearly defined virtual-hardware interface]
15
VMs, VM Migration & Utility Computing
• Utility Computing on Commodity hardware
• Let customers submit their application as VMs
• Minimum-complexity base install
– Stateless nodes are disposable
– Small footprint, no bugs or patches
• Can only provide the basic mechanisms
– Job submission
– Scheduling and preemption (migration)
– Pay-as-you-go accounting
• Essentially, a BIOS for Grid and Utility Computing
16
Live Migration in NomadBIOS
Joint work with Asger Jensen, 2002
17
NomadBIOS: Hypervisor on L4
[Figure: two untrusted L4Linux VMs (running make, bash, emacs, xeyes, vi, gcc) hosted by the trusted NomadBIOS on the L4 micro-kernel and physical hardware]
18
NomadBIOS Live Migration
[Figure: a VM moving from one NomadBIOS-on-L4 host to another]
Pre-copy migration + gratuitous ARP = sub-second downtime
19
Why Migration Downtime Matters
•Upsets users of interactive applications such as games
•May trigger failure detectors in a distributed system
20
Live Migration Reduces Downtime
•The VM can still be used while it is migrating
•Data is transferred in the background, changes sent later
21
Multi-Iteration Pre-copy Technique
[Chart: percent of state transferred per pre-copy round, iterations 1–9; series: total memory vs. pages modified per round]
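The pre-copy rounds charted above can be sketched as a simple loop. This is an illustrative sketch, not the thesis code: the dirty-page tracking and transfer callbacks are placeholders, whereas a real implementation would read dirty bits from the VMM’s page tables.

```python
# Illustrative multi-iteration pre-copy: round 1 copies every page while
# the VM runs; later rounds resend only pages dirtied in the meantime,
# until the writable working set is small enough for a brief stop-and-copy.

def precopy_migrate(pages, get_dirty, send, max_rounds=9, threshold=16):
    dirty = set(pages)                 # round 1: transfer all pages
    for _ in range(max_rounds):
        for p in sorted(dirty):
            send(p, pages[p])          # copy while the VM keeps running
        dirty = get_dirty()            # pages modified during this round
        if len(dirty) <= threshold:    # working set converged
            break
    return dirty                       # remainder sent while VM is paused
```

The round count is capped because a write-heavy workload may never converge; the final small set is sent during the sub-second stop-and-copy phase.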
22
Migration Downtime
• Two clients connected to a Quake2 server VM, 100 Mbit network
• Response time increases by ~50 ms when the server migrates
23
Lessons Learned from NomadBIOS
• Migration & TCP/IP resulted in a 10-fold code size increase
– Simplicity/functionality tradeoff
• A lot of stuff was still missing:
– Threading
– Encryption & access control
– Disk access
[Figure: adding migration and TCP/IP grows the trusted NomadBIOS/L4 base compared to a plain VMM on L4]
24
Self-Migration in Xen
Joint work with Cambridge University,
2004-2005
25
The Promise of Xen
• “Xen” open source VMM announced in late 2003
• Xen 1.0 was
– A lean system with many of the same goals as NomadBIOS
– Optimized for para-virtualized VM hosting
– Very low overhead (~5%)
• Our goal was to port Live Migration from NomadBIOS to Xen
– Xen lacked layers of indirection that L4 had
– Worse: They were removed for a reason
– Nasty control plane “Dom0” VM
26
Xen Control Plane (Dom0)
[Figure: guest VMs and the privileged Control Plane VM (Dom0) running side by side on the Xen VMM]
• Xen uses a “side-car” model, with a trusted control VM
– Has absolute powers
– Adds millions of lines of code to the TCB
• Security-wise, the control VM is the Achilles' Heel
27
Reduce Complexity with Self-Migration
• VM migration needs:
– TCP/IP for transferring system state
– Page-table access for checkpointing
• A VM is self-paging & has its own TCP/IP stack
• Reduce VMM complexity by performing migration from within the VM
• No need for networking, threading or crypto in the TCB
[Figure: each VM carries its own paging, TCP/IP stack, and migration code; the VMM underneath stays minimal]
28
An Inspiring Example of Self-Migration
[Image: Baron von Münchhausen pulling himself out of the swamp by his own hair]
29
Simple Brute-Force Solution
• Reserve half of memory for a snapshot buffer
• Checkpoint by copying state into snapshot buffer
• Migrate by copying snapshot to destination host
[Figure: the snapshot buffer copied from the source host to the destination host]
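The brute-force scheme above can be sketched as follows. Memory is modeled as a plain list of page values, and the half-and-half split, page granularity, and function names are illustrative assumptions rather than the thesis implementation.

```python
# Minimal sketch of the brute-force self-checkpoint: half of memory is
# reserved as a snapshot buffer, so a checkpoint is one atomic copy of
# the live half into the buffer half; the frozen snapshot can then be
# streamed to the destination while the VM keeps running.

def checkpoint(memory, live_pages):
    """Freeze the live half of memory into the snapshot half."""
    for i in range(live_pages):
        memory[live_pages + i] = memory[i]

def migrate(memory, live_pages, send):
    checkpoint(memory, live_pages)
    for i in range(live_pages):        # stream the frozen snapshot
        send(i, memory[live_pages + i])
```

The obvious cost is that half of the VM’s memory is unusable, which is what motivates combining the buffer with pre-copy on the next slide.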
30
Combination with Pre-copy
[Chart: percent of state transferred per pre-copy round, repeated from slide 21]
Combine Pre-copy with Snapshot Buffer
31
First Iteration
32
Delta Iteration
33
Snapshot/Copy-on-Write Phase
34
Impact of Migration on Foreground Load
[Chart: httperf measurements of foreground load during migration]
35
Self-Migration Summary
• Pros:
– Self-Migration is more flexible, under application control
– Self-Migration removes hardcoded and complex features from the trusted install
– Self-Migration can work with direct-I/O hardware
• Cons:
– Self-Migration is not transparent; it has to be implemented by each OS
– Self-Migration cannot be forced from the outside
36
Laundromat Computing
37
Pay-as-you-go Processing
• Laundromats do this already
– Accessible to anyone
– Pre-paid & pay-as-you-go
– Small initial investment
• We propose to manage clusters the same way
– Micro-payment currency
– Pay from first packet
– Automatic garbage collection when payments run out
38
Token Payments
• Initial payment is enclosed in Boot Token
• Use a simple hash chain for subsequent payments
– Hⁿ(s), Hⁿ⁻¹(s), …, H(s), s
• Boot Token signed by trusted broker service
• Broker handles authentication
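The hash-chain payments can be sketched as below. SHA-256 and the function names are assumptions for illustration; the key property is that the host verifies each payment with a single hash against the value committed to in the signed Boot Token.

```python
# Hedged sketch of hash-chain micro-payments: the customer commits to
# H^n(s) in the Boot Token; each later payment reveals the next preimage.
import hashlib

def H(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def make_chain(seed: bytes, n: int):
    """Return [H^n(s), ..., H(s), s]; element 0 goes in the Boot Token."""
    chain = [seed]
    for _ in range(n):
        chain.append(H(chain[-1]))
    return list(reversed(chain))

def verify_payment(last_seen: bytes, payment: bytes) -> bool:
    """A payment is valid iff hashing it yields the last value seen."""
    return H(payment) == last_seen
```

Because inverting H is infeasible, only the customer can produce the next preimage, yet anyone can check it, so the host needs no per-customer secrets.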
39
Injecting a New VM
• Two-stage boot loader handles different incoming formats
– ELF loader for injecting a Linux kernel image
– Checkpoint loader for injecting a migrating VM
• The “Evil Man” service decodes the Boot Token from a “magic ping”
• Evil Man is 500 lines of code + network driver
40
Laundromat Summary
• Pros:
– Simple and flexible model
– Hundreds instead of millions of lines of code
– Built-in payment system
– Supports self-scaling applications
• Cons:
– Needs direct network access
– Magic ping does not always get through firewalls, etc.
41
Service-Oriented Model
42
Pull Instead of Push
• In real life, most Grid clusters are hidden behind NATs
– No global IP address for nodes
– No way to connect from the outside
– Usually allowed to initiate a connection from within
• Possible workarounds:
– Run a local broker at each site
– Port-forwarding in the NAT
– Switch to a pull-based model
• Pull model
– Boot VMs over HTTP
– Add an HTTP client to the trusted software for fetching a work description
– VMs run a web service for cloning and migration
43
Pull Model
44
Workload Description
45
Pulse Notifications
• Periodic polling works, but introduces latency
• Essentially, we have a cache-invalidation problem
• Pulse is a simple and secure wide-area cache invalidation protocol
• Clients listen on H(s), publishers release s to invalidate
• We can preserve the pull model, without adding latency
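The H(s) scheme above can be sketched as follows; SHA-256 and the class shape are assumptions for illustration. The publisher pre-commits to H(s) as the channel name, and releasing the preimage s proves an invalidation is authentic, so untrusted relays can fan it out.

```python
# Sketch of a Pulse-style client: invalidate the cached object only
# when a message carries the correct preimage of the channel name.
import hashlib

def H(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

class PulseClient:
    def __init__(self, channel: bytes):
        self.channel = channel        # H(s), learned when the object was cached
        self.valid = True

    def on_message(self, preimage: bytes):
        """Accept the invalidation only if hashing it yields the channel."""
        if H(preimage) == self.channel:
            self.valid = False
```

A forged message without the preimage is simply ignored, which is why the protocol stays secure over an untrusted wide-area fan-out.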
46
Virtual Machines on the Desktop
47
Security Problems on the Desktop
• Web browsers handle sensitive data, such as e-banking logins
• Risk of worms or spy-ware creeping from one site to another
• VMs could provide strong isolation features
48
The Blink Display System
• VMs have traditionally had only simple 2D graphics
• Modern applications need 3D acceleration
• Cannot sacrifice safety for performance here
• Blink:
– JIT-compiled OpenGL stored procedures
– Flexible, efficient and safe control of the screen
– Blink VMs can be checkpointed and migrated to different graphics hardware
49
VMs on Desktop Summary
• VMs can have native performance graphics, without sacrificing safety
• Stored procedures are more flexible than, e.g., shared-memory off-screen buffers
• Introduces a new display model, but still backwards compatible
50
Concluding Remarks
51
Related Work
• All commercial VMMs have or will have live migration:
– VMware VMotion
– Citrix/XenSource XenMotion (derived from our work), Sun, Oracle
– Microsoft Hyper-V (planned)
• Huge body of previous process-migration work
– Distributed V, Emerald cross-platform object mobility
– MOSIX
– Zap process group migration
• Grid/utility computing projects
– BOINC (SETI@home) from Berkeley
– PlanetLab
– Shirako from Duke, Amazon EC2, Minimum intrusion Grid, …
• Security
– L4 and EROS secure display systems
– L4 Nizza architecture
52
Future Work
• A stateless VMM
– All per-VM state stored sealed in the VM
– Seamless checkpointing and migration
– Cannot DoS the VMM or cause starvation of other VMs
• Migration-aware storage
– Failure-resilient network file system for virtual disks
– Peer-to-peer caching of common contents
• Self-Migration of a native OS, directly on the raw hardware
– Also useful for software-suspend / hibernation
53
Conclusion & Contributions
• Compared to processes, VMs offer superior functionality
– Control their own paging and scheduling
– Provide file systems and virtual memory
– Backwards compatible
– Safe containers for untrusted code
• We have shown:
– How VMs can live-migrate across a network, with sub-second downtimes
– How VMs can self-migrate, without help from the VMM
• Furthermore:
– We have designed and implemented a “Laundromat Computing” system
– Reduced the network control plane from millions to hundreds of lines of code
– Pulse and Blink supporting systems
54
Questions
55
VMware is hiring in Aarhus
Thank You
http://www.diku.dk/~jacobg
56
Dealing with Network Side-effects
• The copy-on-write phase results in a network fork
• “Parent” and “child” overlap and diverge
• Firewall network traffic during final copy phase
• All traffic except migration traffic is silently dropped in the final phase
57
Re-routing Network Traffic
• Simple techniques
– IP redirection with gratuitous ARP
– MAC address spoofing
• Wide-area:– IP-in-IP tunnelling
58
Overhead Added by Continuous Migration
59
Control Models Compared
60
User-Space Migration Driver