© 2010 IBM Corporation

Plugging the Hypervisor Abstraction Leaks Caused by Virtual Networking
Alex Landau, David Hadas, Muli Ben-Yehuda
IBM Research – Haifa

Presented by Alex Landau, 25 May 2010
Hypervisor leaks

The original goal of hypervisors: a complete replica of the physical hardware
– An application that runs on the host should be able to run in a guest

But host details leak into the guest:
– Instruction set extensions
– Bridged networking
  • Leaks the host's IP address, subnet mask, etc.
– NAT
  • Not suitable for many applications
Why are leaks bad?

Why is that a problem? It breaks:
– Checkpoint / restart
– Cloning
– Live migration

Example:
– A guest acquires an IP address from DHCP
– The guest is live-migrated to a different data center
– The guest keeps using the old IP address in the new network

Current solution:
– Defer the problem to the guests and the network equipment
– E.g., VLANs
Packet flow today (in KVM)

[Diagram: in each guest, a guest application writes to the socket interface of the guest network stack; the network adapter driver (VIRTIO frontend) in the guest kernel hands frames to the emulated network adapter (VIRTIO backend) in QEMU; QEMU passes them through a TAP virtual network interface to the host network services in the host kernel (e.g., a bridge or VAN central services).]
How to avoid leaks?

The hypervisor, not the network, is responsible for avoiding leaks.

Guests should be:
– Offered an isolated virtual environment
– Independent of physical network characteristics (e.g., topology)
– Independent of physical location (e.g., IP addresses)

Example: a guest should receive an IP address that is independent of:
• The host running the guest
• The data center containing that host
• The network configuration of the host
Avoiding leaks – Encapsulation

– The guest produces a Layer-2 frame
– The host encapsulates it in a UDP packet
– The host finds the destination host:
  • By peeking at the destination (guest) MAC address
  • And “somehow” finding the destination host
– The host transmits the UDP packet
– The receiving host receives the UDP packet
– The receiving host decapsulates the Layer-2 frame from the UDP packet
– The receiving host passes the Layer-2 frame to its guest
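The steps above can be sketched in a few lines. The MAC-to-host table, the frame layout, and the function names are illustrative assumptions; the slide deliberately leaves the "somehow" of the lookup open.

```python
# Hypothetical lookup table mapping a guest MAC address to the IP address of
# the host currently running that guest (the slide's "somehow").
mac_to_host = {
    b"\x52\x54\x00\xaa\xbb\xcc": "192.0.2.10",
}

def encapsulate(frame: bytes) -> tuple[str, bytes]:
    """Peek at the frame's destination MAC to pick the destination host,
    then carry the Layer-2 frame as an opaque UDP payload."""
    dst_mac = frame[0:6]          # an Ethernet frame starts with the destination MAC
    host_ip = mac_to_host[dst_mac]
    return host_ip, frame         # UDP/IP headers are added by the host stack on send

def decapsulate(udp_payload: bytes) -> bytes:
    """The receiving host extracts the Layer-2 frame unchanged."""
    return udp_payload
```

A round trip (`decapsulate(encapsulate(frame)[1])`) returns the original frame, which is exactly the isolation property the scheme needs: the guest's frame crosses the physical network untouched.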
Proposed packet flow – Dual Stack

[Diagram: as before, each guest application writes through the guest socket interface, guest network stack, and VIRTIO frontend driver; but the VIRTIO backend in QEMU now feeds a traffic-encapsulation layer that writes through a socket interface into the host network stack, instead of a TAP device. A glue layer keeps the guest stack isolated from the host stack.]
Performance

The path from guest to wire is long. The added latency shows up as:
– Packet copies
– VM exits and entries
– User/kernel mode switches
– Host QEMU process scheduling
Large packets

– The transport and network layers are capable of packets up to 64KB
– The Ethernet limit is 1500 bytes (ignoring jumbo frames)
– But there is no Ethernet wire between guest and host!
– So set the MTU to 64KB in the guest
– 64KB packets are then transferred from guest to host
  • Also inhibit TCP/UDP checksum calculation and verification on this path
Large packets – Flow

– The application writes 64KB to a TCP socket
– TCP and IP check the MTU (= 64KB) and create 1 TCP segment, 1 IP packet
– The guest virtual NIC driver copies the entire 64KB frame to the host
– The host writes the 64KB frame into a UDP socket
– The host stack creates a single 64KB UDP packet
– If the packet's destination is a VM on the local host:
  • Transfer the 64KB packet directly over the loopback interface
– If the destination is another host:
  • The host NIC segments the 64KB packet in hardware
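Back-of-the-envelope arithmetic shows why the large MTU pays off: every packet costs copies, VM exits, and mode switches, so fewer packets means less overhead. The 40-byte header figure and the 64,000-byte write below are illustrative, not measurements from the talk.

```python
import math

def packets_needed(payload: int, mtu: int, headers: int = 40) -> int:
    """IP packets needed to carry a payload, assuming roughly 40 bytes
    of TCP/IP headers per packet (an illustrative figure)."""
    return math.ceil(payload / (mtu - headers))

# A 64,000-byte application write (kept just under the 64KB packet limit):
print(packets_needed(64000, 1500))       # 44 packets with the Ethernet MTU
print(packets_needed(64000, 64 * 1024))  # 1 packet with a 64KB MTU
```

With the 1500-byte MTU, every one of those ~44 packets traverses the whole guest-to-host path separately; with the 64KB MTU, the per-packet overhead is paid once.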
CPU affinity and pinning

– The QEMU process contains 2 threads:
  • A CPU thread (actually, one CPU thread per guest vCPU)
  • An IO thread
– The Linux process scheduler selects the core(s) to run the threads on
– The scheduler often made wrong decisions:
  • Scheduling both threads on the same core
  • Constantly rescheduling them (core 0 -> 1 -> 0 -> 1 -> …)
– Solution/workaround: pin the CPU thread to core 0 and the IO thread to core 1
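The pinning workaround uses the standard Linux affinity call (`sched_setaffinity(2)`; `os.sched_setaffinity` is its Python wrapper). The sketch below pins the current process only as a demonstration; in the real setup the targets would be the QEMU CPU and IO thread IDs, which are not shown on the slide.

```python
import os  # os.sched_setaffinity is Linux-only

def pin(task_id: int, core: int) -> None:
    """Restrict a task (process or thread) to a single core, as the
    workaround above does for the QEMU CPU and IO threads."""
    os.sched_setaffinity(task_id, {core})

# Demonstration on the current process (task_id 0 = "the calling process");
# the real fix targets the QEMU CPU/IO threads on cores 0 and 1.
core = min(os.sched_getaffinity(0))  # pick a core we are allowed to use
pin(0, core)
```

Pinning trades scheduler flexibility for cache locality and a stable producer/consumer pairing between the two threads, which is why it helped here.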
Flow control

– The guest does not anticipate flow control at Layer-2
– Thus, the host should not provide flow control
  • Otherwise, bad effects similar to TCP-in-TCP encapsulation will occur
– Lacking flow control, the host must have large enough socket buffers
– Example: if the guest uses TCP, host buffers should be at least the guest TCP's bandwidth x delay product
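The sizing rule on this slide is the classic bandwidth-delay product: the buffer must hold everything the guest's TCP can have in flight. A quick sketch, with the link figures as made-up example numbers:

```python
def min_socket_buffer(bandwidth_bps: float, rtt_s: float) -> int:
    """Minimum host socket buffer, in bytes, so that without Layer-2 flow
    control the guest TCP's in-flight data fits without drops:
    bandwidth (bits/s) / 8 * round-trip delay (s)."""
    return int(bandwidth_bps / 8 * rtt_s)

# Example: a 10 Gbit/s link with a 1 ms round-trip time (illustrative numbers)
print(min_socket_buffer(10e9, 1e-3))  # 1,250,000 bytes
```

An undersized buffer would silently drop frames, which the guest's TCP would misinterpret as network congestion, exactly the kind of cross-layer interference the TCP-in-TCP comparison warns about.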
Performance results

[Charts: throughput and receiver CPU utilization]
Thank you!