
© 2010 IBM Corporation

Plugging the Hypervisor Abstraction Leaks Caused by Virtual Networking

Alex Landau, David Hadas, Muli Ben-Yehuda

IBM Research – Haifa

Alex Landau

25 May 2010


Hypervisor leaks

Original goal of hypervisors – complete replica of physical hardware

Application running on host should be able to run in guest

Host details leaked to guest
– Instruction set extensions
– Bridged networking
  • Leaked IP address, subnet mask, etc.
– NAT
  • Not suitable for many applications


Why are leaks bad?

Why is that a problem?
– Checkpoint / restart
– Cloning
– Live migration

Example:
– Guest acquires IP address from DHCP
– Guest is live-migrated to a different data center
– Guest uses old IP address in new network

Current solution:
– Defer problem to guests and network equipment
– E.g., VLANs


Packet flow today (in KVM)

[Figure: two guests, each running inside a QEMU process. Each guest shows a guest application, socket interface, and guest network stack inside the guest kernel. One guest uses a VIRTIO frontend with a VIRTIO backend in QEMU; the other uses a network adapter driver with an emulated network adapter in QEMU. In the host kernel, each guest is attached to a TAP virtual network interface, and both connect to host network services (e.g., bridge or VAN central services).]


How to avoid leaks?

Hypervisor, not network, is responsible for avoiding leaks

Guests should be:
– Offered an isolated virtual environment
– Independent of physical network characteristics (e.g., topology)
– Independent of physical location (e.g., IP addresses)

Example:
– Guest should receive IP address independent of:
  • Host running the guest
  • Data center containing the host
  • Network configuration of the host


Avoiding leaks – Encapsulation

Guest produces Layer-2 frame

Host encapsulates it in UDP packet

Host finds destination host
– By peeking at destination (guest) MAC address
– And “somehow” finding destination host

Host transmits UDP packet

Receiver host receives UDP packet

Receiver host decapsulates Layer-2 frame from UDP packet

Receiver host passes Layer-2 frame to guest
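
The encapsulation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from the talk: the function names are hypothetical, the "somehow" lookup is replaced by a static dictionary from guest MAC address to the destination host's UDP endpoint, and both "hosts" are sockets on the loopback interface.

```python
import socket

def dest_mac(frame: bytes) -> bytes:
    """Return the destination MAC: the first 6 bytes of an Ethernet frame."""
    return frame[:6]

def send_encapsulated(sock, frame, mac_to_host):
    """Look up the host that runs the destination guest, then send the
    whole Layer-2 frame as the payload of one UDP datagram."""
    host_endpoint = mac_to_host[dest_mac(frame)]  # assumed static table
    sock.sendto(frame, host_endpoint)

def receive_decapsulated(sock, max_size=65535):
    """Receive one UDP datagram; its payload is the original Layer-2 frame."""
    frame, _sender = sock.recvfrom(max_size)
    return frame

def demo():
    # Receiver "host": bind to any free UDP port on loopback.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        dst = bytes.fromhex("525400123456")  # made-up guest MAC addresses
        src = bytes.fromhex("525400654321")
        frame = dst + src + b"\x08\x00" + b"guest payload"
        table = {dst: receiver.getsockname()}
        send_encapsulated(sender, frame, table)
        return frame, receive_decapsulated(receiver)
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    sent, received = demo()
    print("frame delivered intact:", sent == received)
```

The guest never sees the UDP endpoint; only the host-side table would need to change when a guest moves to another host.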


Proposed packet flow – Dual Stack

[Figure: two guests, each running inside a QEMU process. Each guest shows a guest application, socket interface, and guest network stack inside the guest kernel. One guest uses a VIRTIO frontend driver with a VIRTIO backend in QEMU; the other uses a network adapter driver with an emulated network adapter in QEMU. Each QEMU process contains a traffic encapsulation component that uses the socket interface of the host network stack in the host kernel. Legend labels: guest stack, (glue), host stack, app., driver, net driver, isolation.]


Performance

Path from guest to wire is long

Latencies are manifested in the form of:
– Packet copies
– VM exits and entries
– User/kernel mode switches
– Host QEMU process scheduling


Large packets

Transport and Network layers capable of up to 64KB packets

Ethernet limit is 1500 bytes
– Ignoring jumbo frames

But there is no Ethernet wire between guest and host!

Set MTU to 64KB in guest

64KB packets are transferred from guest to host
– Inhibit TCP/UDP checksum calculation and verification


Large packets – Flow

Application writes 64KB to TCP socket

TCP, IP check MTU (=64KB) and create 1 TCP segment, 1 IP packet

Guest virtual NIC driver copies entire 64KB frame to host

Host writes 64KB frame into UDP socket

Host stack creates one 64KB UDP packet

If packet destination = VM on local host
– Transfer 64KB packet directly on the loopback interface

If packet destination = other host
– Host NIC segments 64KB packet in hardware
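
To see why the large MTU matters, the short calculation below counts how many TCP segments the guest stack must build for a single application write. It is a rough sketch: the 20-byte IP and TCP header sizes assume IPv4 with no options, the 65,000-byte write stands in for "about 64KB", and the function name is made up for this example.

```python
IP_HEADER_BYTES = 20   # IPv4 header without options (assumption)
TCP_HEADER_BYTES = 20  # TCP header without options (assumption)

def segments_needed(payload_bytes: int, mtu: int) -> int:
    """Number of TCP segments needed to carry payload_bytes at a given MTU."""
    mss = mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES
    return -(-payload_bytes // mss)  # ceiling division

if __name__ == "__main__":
    write_size = 65_000
    for mtu in (1500, 65_535):
        print(f"MTU {mtu}: {segments_needed(write_size, mtu)} segment(s)")
```

Each segment avoided in the guest is one less trip through the guest stack and the guest-to-host path for the same amount of data.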


CPU affinity and pinning

QEMU process contains 2 kinds of threads
– CPU thread (actually, one CPU thread per guest vCPU)
– IO thread

Linux process scheduler selects core(s) to run threads on

The scheduler often made poor decisions
– Scheduled both threads on the same core
– Constantly rescheduled them (core 0 -> 1 -> 0 -> 1 -> …)

Solution/workaround – pin CPU thread to core 0, IO thread to core 1
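
The pinning mechanism itself is a per-thread CPU affinity mask. The sketch below shows it on Linux using Python's standard library, with two worker threads standing in for the CPU and IO threads; it is not how the talk's experiments were scripted. For a real QEMU process the same call would be applied from outside to QEMU's thread IDs. The helper names are hypothetical, and the code picks cores from those the process is allowed to use rather than assuming cores 0 and 1 exist.

```python
import os
import threading

def pin_current_thread(core: int) -> set:
    """Restrict the calling thread to a single core (Linux only) and
    return the affinity mask the kernel reports afterwards."""
    tid = threading.get_native_id()      # kernel thread ID, Python 3.8+
    os.sched_setaffinity(tid, {core})
    return os.sched_getaffinity(tid)

def run_pinned(name, core, results):
    """Thread body: pin this thread and record the resulting mask."""
    results[name] = pin_current_thread(core)

def demo():
    available = sorted(os.sched_getaffinity(0))
    cpu_core = available[0]
    # Fall back to the same core if only one is available.
    io_core = available[1] if len(available) > 1 else available[0]
    results = {}
    threads = [
        threading.Thread(target=run_pinned, args=("cpu", cpu_core, results)),
        threading.Thread(target=run_pinned, args=("io", io_core, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return (cpu_core, io_core), results

if __name__ == "__main__":
    cores, masks = demo()
    print("requested cores:", cores, "reported masks:", masks)
```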


Flow control

Guest does not anticipate flow control at Layer-2

Thus, host should not provide flow control
– Otherwise, bad effects similar to TCP-in-TCP encapsulation will happen

Lacking flow control, host should have large enough socket buffers

Example:
– Guest uses TCP
– Host buffers should be at least the guest TCP connection's bandwidth × delay product
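
As a worked example of the sizing rule, the sketch below computes a bandwidth × delay product and requests that much buffer space on a UDP socket. The 1 Gbit/s and 10 ms figures are illustrative assumptions, not measurements from the talk, and the function names are made up. Note that the kernel may clamp the request (on Linux, to `net.core.wmem_max` and `net.core.rmem_max`), and Linux reports back double the value it accepted.

```python
import socket

def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth x delay product in bytes, rounded up.
    bits/s * ms / 1000 gives bits; dividing by 8 gives bytes."""
    return -(-(bandwidth_bps * rtt_ms) // 8000)

def request_buffers(sock: socket.socket, size: int):
    """Ask for send and receive buffers of `size` bytes and return
    what the kernel reports it actually configured."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return (
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
    )

if __name__ == "__main__":
    wanted = bdp_bytes(1_000_000_000, 10)  # 1 Gbit/s link, 10 ms round trip
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        snd, rcv = request_buffers(sock, wanted)
    print(f"wanted {wanted} bytes; kernel reports snd={snd}, rcv={rcv}")
```

Comparing the reported sizes with the requested size shows whether the system-wide limits need to be raised before the rule can be met.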


Performance results

[Figures: throughput; receiver CPU utilization]


Thank you!