VINI: Virtual Network Infrastructure
Nick Feamster, Georgia Tech
Andy Bavier, Mark Huang, Larry Peterson, Jennifer Rexford, Princeton University
VINI Overview
• Runs real routing software
• Exposes realistic network conditions
• Gives control over network events
• Carries traffic on behalf of real users
• Is shared among many experiments
[Figure: experimentation spectrum from simulation and emulation through small-scale experiments to live deployment; VINI sits between small-scale experiments and live deployment]
Bridge the gap between “lab experiments” and live experiments at scale.
Goal: Control and Realism
• Control
  – Reproduce results
  – Methodically change or relax constraints
• Realism
  – Long-running services attract real users
  – Connectivity to real Internet
  – Forward high traffic volumes (Gb/s)
  – Handle unexpected events
• Topology: actual network (realism) vs. arbitrary, emulated (control)
• Traffic: real clients and servers (realism) vs. synthetic or traces (control)
• Network events: observed in operational network (realism) vs. injected faults and anomalies (control)
Overview
• VINI characteristics
  – Fixed, shared infrastructure
  – Flexible network topology
  – Expose/inject network events
  – External connectivity and routing adjacencies
• PL-VINI: prototype on PlanetLab
• Preliminary experiments
• Ongoing work
Fixed Infrastructure
Shared Infrastructure
Arbitrary Virtual Topologies
Exposing and Injecting Failures
Carry Traffic for Real End Users
Participate in Internet Routing
[Figure: VINI nodes maintain BGP sessions with neighboring networks while carrying traffic between client c and server s]
PL-VINI: Prototype on PlanetLab
• First experiment: Internet In A Slice
  – XORP open-source routing protocol suite (NSDI '05)
  – Click modular router (TOCS '00, SOSP '99)
• Clarify issues that VINI must address
  – Unmodified routing software on a virtual topology
  – Forwarding packets at line speed
  – Illusion of dedicated hardware
  – Injection of faults and other events
PL-VINI: Prototype on PlanetLab
• PlanetLab: testbed for planetary-scale services
• Simultaneous experiments run in separate VMs
  – Each has "root" in its own VM, can customize
• Can reserve CPU and network capacity per VM
[Figure: a PlanetLab node runs a Virtual Machine Monitor (Linux++) hosting VM1 through VMn alongside Node Manager and Local Admin VMs]
XORP: Control Plane
• BGP, OSPF, RIP, PIM-SM, IGMP/MLD
• Goal: run real routing protocols on virtual network topologies
User-Mode Linux: Environment
• Interface ≈ network
• PlanetLab limitation:
  – A slice cannot create new interfaces
• Run routing software in a UML environment
• Create virtual network interfaces in UML
[Figure: XORP (routing protocols) runs inside UML with virtual interfaces eth0 through eth3]
Click: Data Plane
• Performance
  – Avoid UML overhead
  – Move to kernel, FPGA
• Interfaces ↔ tunnels
  – Click UDP tunnels correspond to UML network interfaces
• Filters
  – "Fail a link" by blocking packets at the tunnel (see the sketch after the figure)
[Figure: XORP (routing protocols) in UML with eth0 through eth3; a UmlSwitch element connects UML to the Click packet forwarding engine, which holds the tunnel table and filters; control traffic goes up to UML while data traffic stays in Click]
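The actual Click configuration is not shown in the talk. Purely as an illustration of the idea (user-space forwarding over UDP tunnels, with a filter that can "fail" a link by dropping packets at the tunnel), here is a minimal Python sketch; the addresses, port numbers, and drop flag are hypothetical.

```python
# Minimal sketch, not the real Click configuration: a user-space "link" that
# relays packets between two UDP tunnel endpoints and can be "failed" by
# silently dropping traffic, mimicking the filter-at-the-tunnel idea.
import socket
import threading

LINK_FAILED = threading.Event()  # set() to emulate a link failure

def relay(listen_addr, forward_addr):
    """Forward every UDP datagram received on listen_addr to forward_addr."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(listen_addr)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        pkt, _ = rx.recvfrom(65535)
        if LINK_FAILED.is_set():
            continue          # "fail the link": drop packets at the tunnel
        tx.sendto(pkt, forward_addr)

if __name__ == "__main__":
    # Hypothetical endpoints: datagrams arriving on UDP port 5000 are tunneled
    # to a peer node at 10.0.0.2:5000 (placeholder addresses).
    threading.Thread(target=relay,
                     args=(("0.0.0.0", 5000), ("10.0.0.2", 5000)),
                     daemon=True).start()
    input("Press Enter to fail the link... ")
    LINK_FAILED.set()
    input("Link failed; press Enter to exit. ")
```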
Intra-domain Route Changes
[Figure: virtual topology annotated with OSPF link weights (233 to 2095); traffic flows between client c and server s]
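The weights in the figure drive an OSPF-style shortest-path computation, so failing one link shifts traffic onto the next-best path. A minimal Python sketch of that effect, using a made-up topology and weights rather than the ones in the figure:

```python
# Sketch of an intra-domain route change: OSPF computes shortest paths from
# link weights, so removing a failed link moves traffic to a longer backup
# path. Topology and weights below are illustrative only.
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra over an adjacency dict {node: {neighbor: weight}}."""
    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, {}).items():
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path)), dist[dst]

topology = {
    "c": {"A": 300},
    "A": {"c": 300, "B": 700, "C": 1200},
    "B": {"A": 700, "s": 250},
    "C": {"A": 1200, "s": 850},
    "s": {"B": 250, "C": 850},
}

print("before failure:", shortest_path(topology, "c", "s"))
# Fail the A-B link, then recompute: traffic detours via C.
del topology["A"]["B"], topology["B"]["A"]
print("after failure: ", shortest_path(topology, "c", "s"))
```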
Ping During Link Failure
[Figure: ping RTT (ms) over roughly 50 seconds; annotations mark the link going down, routes converging, and the link coming back up]
Close-Up of TCP Transfer
[Figure: megabytes received in the TCP stream vs. time (close-up of seconds 17.5 to 20); each point is a received packet, with annotations marking slow start and the retransmission of a lost packet]
PL-VINI enables a user-space virtual networkto behave like a real network on PlanetLab
Challenge: Attracting Real Users
• Could have run experiments on Emulab
• Goal: operate our own virtual network
  – Carrying traffic for actual users
  – We can tinker with routing protocols
• Attracting real users
Conclusion
• VINI: Controlled, Realistic Experimentation
• Installing VINI nodes in NLR, Abilene
• Download and run Internet In A Slice
http://www.vini-veritas.net/
TCP Throughput
[Figure: megabytes transferred vs. time (0 to 50 seconds) for a TCP transfer; each point is a received packet, with annotations marking link down, link up, and the region shown in the close-up]
Ongoing Work
• Improving realism
  – Exposing network failures and changes in the underlying topology
  – Participating in routing with neighboring networks
• Improving control
  – Better isolation
  – Experiment specification
Resource Isolation
• Issue: forwarding packets in user space
  – PlanetLab sees heavy use
  – CPU load affects virtual network performance
Property     Depends on              Solution
Throughput   CPU% received           PlanetLab provides CPU reservations
Latency      CPU scheduling delay    PL-VINI boosts the priority of the packet forwarding process
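The deck does not say how the priority boost is implemented; as one plausible illustration, a process's scheduling priority on Linux can be raised by lowering its nice value. The forwarder name and nice value below are hypothetical, not PL-VINI's actual settings.

```python
# Sketch only: raise the scheduling priority of a user-space packet forwarder
# by lowering its nice value (requires appropriate privileges).
import os
import subprocess

def boost_priority(pid, nice_value=-15):
    """Give the forwarding process a higher CPU scheduling priority."""
    os.setpriority(os.PRIO_PROCESS, pid, nice_value)

if __name__ == "__main__":
    # Find a (hypothetical) user-space forwarder named "click" and boost it.
    out = subprocess.run(["pgrep", "-x", "click"], capture_output=True, text=True)
    for pid in out.stdout.split():
        boost_priority(int(pid))
```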
Performance is bad
• User-space Click: ~200Mb/s forwarding
VINI should use Xen
Experimental Results
• Is a VINI feasible?
  – Click in user space: 200 Mb/s forwarded
  – Latency and jitter comparable between the native network and IIAS on PL-VINI
  – (Presenter note: say something about running on just PlanetLab? Don't spend much time talking about CPU scheduling…)
Low latency for everyone?
• PL-VINI provided IIAS with low latency by giving it high CPU scheduling priority
Internet In A Slice
• XORP
  – Run OSPF
  – Configure FIB
• Click
  – FIB
  – Tunnels
  – Inject faults
• OpenVPN & NAT
  – Connect clients and servers
PL-VINI / IIAS Router
• Blue: topology
  – Virtual net devices
  – Tunnels
• Red: routing and forwarding
  – Data traffic does not enter UML
• Green: enter and exit the IIAS overlay
[Figure: IIAS router internals; XORP runs in UML with eth0 through eth3, UmlSwitch elements connect UML to Click, which holds the FIB and encapsulation table; tap0 carries control traffic while data traffic stays in Click]
PL-VINI Summary

Flexible Network Topology
  Virtual point-to-point connectivity        Tunnels in Click
  Unique interfaces per experiment           Virtual network devices in UML
  Exposure of topology changes               Upcalls of layer-3 alarms

Flexible Routing and Forwarding
  Per-node forwarding table                  Separate Click per virtual node
  Per-node routing process                   Separate XORP per virtual node

Connectivity to External Hosts
  End hosts can direct traffic through VINI  Connect to OpenVPN server
  Return traffic flows through VINI          NAT in Click on egress node

Support for Simultaneous Experiments
  Isolation between experiments              PlanetLab VMs and network isolation;
                                             CPU reservations and priorities
  Distinct external routing adjacencies      BGP multiplexer for external sessions
PL-VINI / IIAS Router
• XORP: control plane
• UML: environment
  – Virtual interfaces
• Click: data plane
  – Performance: avoid UML overhead; move to kernel, FPGA
  – Interfaces ↔ tunnels
  – "Fail a link"
[Figure: same IIAS router diagram as before; XORP (routing protocols) in UML with eth0 through eth3, a UmlSwitch element, and the Click packet forwarding engine with its tunnel table; control and data paths shown separately]
Trellis
• Same abstractions as PL-VINI
  – Virtual hosts and links
  – Push performance, ease of use
• Full network-stack virtualization
  – Run XORP, Quagga in a slice
  – Support data plane in kernel
• Approach native Linux kernel performance (15x PL-VINI)
• Be an "early adopter" of new Linux virtualization work
[Figure: Trellis architecture; a Trellis virtual host contains an application, a kernel FIB, and virtual NICs, and each virtual NIC connects through a bridge and traffic shaper to an EGRE tunnel in the Trellis substrate]
Virtual Hosts
• Use container-based virtualization
  – Xen, VMware: poor scalability, performance
• Option #1: Linux Vserver
  – Containers without network virtualization
  – PlanetLab slices share a single IP address and port space
• Option #2: OpenVZ
  – Mature container-based approach
  – Roughly equivalent to Vserver
  – Has full network virtualization
Network Containers for Linux
• Create multiple copies of the TCP/IP stack
• Per network container:
  – Kernel IPv4 and IPv6 routing tables
  – Physical or virtual interfaces
  – iptables, traffic shaping, sysctl.net variables
• Trellis: marry Vserver + NetNS
  – Be an early adopter of the new interfaces
  – Otherwise stay close to PlanetLab
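Trellis itself combined Vserver with the NetNS patches of the time. Purely as an illustration of the same "network container" abstraction with today's iproute2 tooling, here is a sketch; the namespace name, device names, and addresses are placeholders, and the commands need root.

```python
# Illustration only: give a container its own interfaces and routing table
# with Linux network namespaces (ip netns), standing in for the original
# Vserver + NetNS combination.
import subprocess

def sh(cmd):
    print("+", cmd)
    subprocess.run(cmd, shell=True, check=True)

NS = "trellis-demo"

sh(f"ip netns add {NS}")                                  # new network container
sh("ip link add veth-host type veth peer name veth-ns")   # virtual link
sh(f"ip link set veth-ns netns {NS}")                     # move one end inside
sh(f"ip -n {NS} addr add 192.0.2.2/24 dev veth-ns")
sh(f"ip -n {NS} link set lo up")
sh(f"ip -n {NS} link set veth-ns up")
sh(f"ip -n {NS} route add default via 192.0.2.1")         # per-container routing table
sh(f"ip -n {NS} route show")
```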
Virtual Links: EGRE Tunnels
• Virtual Ethernet links
• Make minimal assumptions about the physical network between Trellis nodes
• Trellis: tunnel Ethernet over GRE over IP (see the sketch after the figure)
  – Already a standard, but no Linux implementation
• Other approaches:
  – VLANs, MPLS, other network circuits or tunnels
  – These fit into our framework
[Figure: Trellis virtual host whose virtual NICs connect directly to EGRE tunnels in the Trellis substrate]
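At the time of the talk, Ethernet-over-GRE had no stock Linux implementation; current kernels do ship a gretap device type, so purely as an illustration of the EGRE idea (not the original Trellis code), here is a sketch with placeholder addresses and tunnel key.

```python
# Sketch of the EGRE idea using the gretap device type found in current
# Linux kernels. Addresses and the key are placeholders; requires root.
import subprocess

def sh(cmd):
    subprocess.run(cmd, shell=True, check=True)

LOCAL, REMOTE = "198.51.100.1", "198.51.100.2"   # substrate node addresses

# One virtual Ethernet link between two substrate nodes; the key separates
# links belonging to different virtual networks on the same node pair.
sh(f"ip link add egre0 type gretap local {LOCAL} remote {REMOTE} key 42")
sh("ip link set egre0 up")
```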
Tunnel Termination
• Where is the EGRE tunnel interface?
  – Inside the container: better performance
  – Outside the container: more flexibility
    • Transparently change the implementation
    • Process and shape traffic between container and tunnel
    • User cannot manipulate tunnel or shapers
• Trellis: terminate tunnel outside container
Glue: Bridging
• How to connect virtual hosts to tunnels?
  – Connecting two Ethernet interfaces
• Linux software bridge
  – Ethernet bridge semantics, can create point-to-multipoint (P2M) links
  – Relatively poor performance
• Common case: point-to-point (P2P) links
• Trellis
  – Use the Linux bridge for P2M links
  – Create a new "shortbridge" for P2P links
Glue: Bridging
• How to connect virtual hosts to EGRE tunnels?
  – Two Ethernet interfaces
• Linux software bridge (see the sketch after the figure)
  – Ethernet bridge semantics
  – Supports P2M links
  – Relatively poor performance
• Common case: P2P links
• Trellis:
  – Use the Linux bridge for P2M links
  – New, optimized "shortbridge" module for P2P links
[Figure: Trellis virtual host whose virtual NICs attach through a bridge (or shortbridge) and traffic shaper to EGRE tunnels in the Trellis substrate]
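The shortbridge module is Trellis-specific and not shown here; the standard-bridge glue, though, can be sketched with stock Linux commands. Device names are placeholders and the commands assume the veth pair and EGRE tunnel from the earlier sketches already exist.

```python
# Sketch of the standard (slower, point-to-multipoint) glue: attach the
# host side of the container's veth pair and the EGRE tunnel to one Linux
# software bridge. The optimized Trellis "shortbridge" is not shown.
import subprocess

def sh(cmd):
    subprocess.run(cmd, shell=True, check=True)

sh("ip link add br-link0 type bridge")
sh("ip link set veth-host master br-link0")   # container-facing interface
sh("ip link set egre0 master br-link0")       # EGRE tunnel to the remote node
for dev in ("veth-host", "egre0", "br-link0"):
    sh(f"ip link set {dev} up")
```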
IPv4 Packet Forwarding
2/3 of native performance, 10X faster than PL-VINI
[Figure: IPv4 forwarding rate (kpps) compared across configurations]
Virtualized Data Plane in Hardware
• Software provides flexibility, but poor performance and often inadequate isolation
• Idea: forward packets exclusively in hardware
  – Platform: OpenVZ over NetFPGA
  – Challenge: share common functions while isolating functions that are specific to each virtual network
Accelerating the Data Plane
• Virtual environments in OpenVZ
• Interface to NetFPGA based on Stanford reference router
Control Plane
• Virtual environments
  – Virtualize the control plane by running multiple virtual environments on the host (same as in Trellis)
  – Routing table updates pass through a security daemon
  – Root user updates the VMAC-VE table
• Hardware access control
  – The VMAC-VE table and VE-ID control access to hardware
• Control register
  – Used to multiplex a VE to the appropriate hardware
Virtual Forwarding Table Mapping
Share Common Functions
• Common functions
  – Packet decoding
  – Calculating checksums
  – Decrementing TTLs
  – Input arbitration
• VE-specific functions (a software analogue is sketched below)
  – FIB
  – IP lookup table
  – ARP table
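The real split is implemented in hardware on the NetFPGA; as a software analogue only, the sketch below keeps the decode/TTL logic shared and gives each virtual environment (VE) its own lookup table. The prefixes, ports, and VE IDs are made up.

```python
# Software analogue of the hardware split: packet sanity checks and TTL
# handling are shared, while each virtual environment (VE) keeps its own
# FIB-style lookup table. Values are illustrative only.
import ipaddress

# VE-specific state: one longest-prefix-match table per virtual network.
VE_FIBS = {
    1: {ipaddress.ip_network("10.0.0.0/8"): "port0"},
    2: {ipaddress.ip_network("10.0.0.0/8"): "port3"},   # same prefix, different answer
}

def lookup(ve_id, dst):
    """VE-specific function: longest-prefix match in that VE's FIB."""
    dst = ipaddress.ip_address(dst)
    candidates = [(net.prefixlen, port) for net, port in VE_FIBS[ve_id].items()
                  if dst in net]
    return max(candidates)[1] if candidates else None

def forward(ve_id, dst, ttl):
    """Shared functions (TTL check/decrement), then a per-VE lookup."""
    if ttl <= 1:
        return None          # shared: drop expired packets
    return lookup(ve_id, dst), ttl - 1

print(forward(1, "10.1.2.3", 64))   # ('port0', 63)
print(forward(2, "10.1.2.3", 64))   # ('port3', 63)
```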
Forwarding Performance
Efficiency
• 53K logic cells
• 202 units of block RAM
• Sharing common elements yields up to 75% savings over independent physical routers.
Conclusion
• Virtualization allows physical hardware to be shared among many virtual networks
• Tradeoffs: sharing, performance, and isolation
• Two approaches
  – Trellis: kernel-level packet forwarding (10x packet forwarding rate improvement vs. PL-VINI)
  – NetFPGA-based forwarding for virtual networks (same forwarding rate as a NetFPGA-based router, with 75% improvement in hardware resource utilization)
Accessing Services in the Cloud
[Figure: a cloud data center whose data center router hosts an interactive service and a bulk-transfer service; routing updates and packets reach the Internet through ISP1 and ISP2]
• Hosted services have different requirements; a single route can be
  – Too slow for the interactive service, or
  – Too costly for bulk transfer!
Cloud Routing Today
• Multiple upstream ISPs
  – Amazon EC2 has at least 58 routing peers in its Virginia data center
• The data center router picks one route to a destination for all hosted services
  – Packets from all hosted applications use the same path
Route Control: “Cloudless” Solution
• Obtain connectivity to upstream ISPs
  – Physical connectivity
  – Contracts and routing sessions
• Obtain Internet number resources (AS numbers, IP prefixes) from the authorities
• Expensive and time-consuming!
Routing with Transit Portal (TP)
[Figure: a cloud data center hosting an interactive service and a bulk-transfer service; each service attaches to its own virtual router (A and B) behind the Transit Portal, which exchanges routes and packets with ISP1 and ISP2 on the Internet]
Full Internet route control to hosted cloud services!
Outline
• Motivation and Overview
• Connecting to the Transit Portal
• Advanced Transit Portal Applications
• Scaling the Transit Portal
• Future Work & Summary
Connecting to the TP
• Separate Internet router for each service
  – Virtual or physical routers
• Links between the service router and the TP
  – Each link emulates a connection to an upstream ISP
• Routing sessions to upstream ISPs
  – The TP exposes a standard BGP route control interface (a client-side sketch follows the figure)
[Figure: each hosted service's virtual BGP router peers with the Transit Portal]
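Because the route control interface is plain BGP, a hosted service's virtual router can be configured with an ordinary routing daemon. As an illustration, the sketch below emits a Quagga/FRR-style bgpd stanza; the ASNs, addresses, and prefix are placeholders, not values from the actual deployment.

```python
# Sketch of the client side of the standard BGP interface: generate a
# Quagga/FRR-style bgpd configuration for a hosted service's virtual
# router that peers with the Transit Portal. All values are placeholders.
def bgpd_config(local_asn, tp_neighbors, prefix):
    lines = [f"router bgp {local_asn}", f" network {prefix}"]
    for addr, asn in tp_neighbors:
        lines.append(f" neighbor {addr} remote-as {asn}")
    return "\n".join(lines) + "\n"

print(bgpd_config(
    local_asn=64512,                              # private ASN for the service
    tp_neighbors=[("192.0.2.1", 64700),           # TP session emulating ISP 1
                  ("192.0.2.5", 64700)],          # TP session emulating ISP 2
    prefix="203.0.113.0/24"))
```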
Basic Internet Routing with TP
• Cloud client with two upstream ISPs
  – ISP 1 is preferred
• ISP 1 exhibits excessive jitter
• The cloud client reroutes through ISP 2
[Figure: an interactive cloud service holds BGP sessions through the Transit Portal to ISP 1 and ISP 2; traffic shifts from ISP 1 to ISP 2]
Current TP Deployment
• Server with custom routing software
  – 4 GB RAM, 2x 2.66 GHz Xeon cores
• Three active sites with upstream ISPs
  – Atlanta, Madison, and Princeton
• A number of active experiments
  – BGP poisoning (University of Washington)
  – IP anycast (Princeton University)
  – Advanced networking class (Georgia Tech)
TP Applications: Fast DNS
• Internet services require fast name resolution
• IP anycast for name resolution
  – DNS servers with the same IP address
  – The IP address is announced to ISPs in multiple locations
  – Internet routing converges to the closest server
• Available only to large organizations
TP Applications: Fast DNS
[Figure: a name service is announced via IP anycast through Transit Portals in Asia and North America, which propagate the anycast routes to ISP1 through ISP4]
• TP allows hosted applications to use IP anycast
TP Applications: Service Migration
• Internet services run in geographically diverse data centers
• Operators migrate Internet users' connections
• Two conventional methods:
  – DNS name re-mapping
    • Slow
  – Virtual machine migration with local re-routing
    • Requires a globally routed network
TP Applications: Service Migration
[Figure: an active game service moves between data centers in Asia and North America; tunneled BGP sessions through Transit Portals to ISP1 through ISP4 keep it reachable from the Internet]
Scaling the Transit Portal
• Scale to dozens of sessions to ISPs and hundreds of sessions to hosted services
• At the same time:
  – Present each client with sessions that have the appearance of direct connectivity to an ISP
  – Prevent clients from abusing Internet routing protocols
Conventional BGP Routing
• A conventional BGP router:
  – Receives routing updates from peers
  – Propagates a routing update about one path only
  – Selects one path to forward packets
• Scalable, but not transparent or flexible
[Figure: a conventional BGP router with a single routing process receives updates from ISP1 and ISP2 and passes one chosen route, and the packets, to the client BGP routers (e.g., a bulk-transfer service)]
Scaling BGP Memory Use
• Store and propagate all BGP routes from ISPs
  – Separate routing tables per ISP
• Reduce memory consumption
  – Single routing process with shared data structures (sketched after the figure)
  – Reduces memory use from 90 MB/ISP to 60 MB/ISP
[Figure: a single routing process keeps Routing Table 1 and Routing Table 2 for ISP1 and ISP2 and presents a virtual router to each hosted service (interactive service, bulk transfer)]
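The talk only names the technique (one routing process with shared data structures); as an illustration of why that saves memory, the sketch below interns identical path attributes so every per-ISP table references a single copy. The routes and attributes are made up.

```python
# Sketch of the shared-data-structure idea: per-ISP routing tables keep
# references to one interned copy of each set of path attributes instead
# of duplicating them per table. Values are illustrative only.
_attr_cache = {}

def intern_attrs(as_path, next_hop):
    """Return a shared tuple for identical path attributes."""
    key = (as_path, next_hop)
    return _attr_cache.setdefault(key, key)

# One table per upstream ISP, as in the Transit Portal.
tables = {"ISP1": {}, "ISP2": {}}

def add_route(isp, prefix, as_path, next_hop):
    tables[isp][prefix] = intern_attrs(as_path, next_hop)

add_route("ISP1", "198.51.100.0/24", ("64700", "65010"), "10.0.1.1")
add_route("ISP2", "198.51.100.0/24", ("64700", "65010"), "10.0.1.1")

# Both tables hold the same attribute object, not two copies.
print(tables["ISP1"]["198.51.100.0/24"] is tables["ISP2"]["198.51.100.0/24"])  # True
```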
Scaling BGP CPU Use
• Hundreds of routing sessions to clients
  – High CPU load
• Schedule and send routing updates in bundles (sketched after the figure)
  – Reduces CPU load from 18% to 6% for 500 client sessions
[Figure: virtual routers with per-ISP routing tables (ISP1, ISP2) and a shared forwarding table serving the interactive service]
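The bundling mechanism is only named in the talk; the sketch below shows one plausible shape for it, queueing updates and flushing them to all client sessions on a short timer. The interval and the session interface are hypothetical.

```python
# Sketch of update bundling: instead of pushing each routing update to every
# client session as it arrives, queue updates and flush them in batches.
import time

class UpdateBundler:
    def __init__(self, sessions, interval=1.0):
        self.sessions = sessions          # client session objects with .send()
        self.interval = interval
        self.pending = []
        self.last_flush = time.monotonic()

    def on_update(self, update):
        self.pending.append(update)
        if time.monotonic() - self.last_flush >= self.interval:
            self.flush()

    def flush(self):
        if self.pending:
            batch = list(self.pending)
            self.pending.clear()
            for session in self.sessions:   # one pass over sessions per batch
                session.send(batch)
        self.last_flush = time.monotonic()
```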
Scaling Forwarding Memory for TP
• Connecting clients
  – Tunneling and VLANs
• Curbing memory usage
  – Separate virtual routing tables with a default route to the upstream (sketched after the figure)
  – Forwarding-table memory drops from 50 MB/ISP to ~0.1 MB/ISP
[Figure: per-service virtual BGP routers map onto small per-ISP forwarding tables (Forwarding Table 1 and Forwarding Table 2) for the interactive and bulk-transfer services]
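One way to realize such tiny per-client forwarding state on a Linux-based TP would be policy routing, with each client's table holding only a default route toward its chosen upstream. The talk does not spell out the mechanism, so the table IDs, marks, and device names below are placeholders.

```python
# Sketch using Linux policy routing: each client's traffic is matched to its
# own small table, which holds only a default route toward the upstream that
# client selected, instead of a full Internet forwarding table. Requires root.
import subprocess

def sh(cmd):
    subprocess.run(cmd, shell=True, check=True)

# Client A prefers ISP1, client B prefers ISP2 (devices are placeholders).
sh("ip rule add iif veth-clientA lookup 101")
sh("ip rule add iif veth-clientB lookup 102")
sh("ip route add default via 10.1.0.1 dev isp1-link table 101")
sh("ip route add default via 10.2.0.1 dev isp2-link table 102")
```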