
Rethinking Routers in the Age of Virtualization

Jennifer Rexford
Princeton University

http://www.cs.princeton.edu/~jrex/virtual.html

Traditional View of a Router

• A big, physical device…
  – Processors
  – Multiple links
  – Switching fabric

• … that directs Internet traffic
  – Connects to other routers
  – Computes routes
  – Forwards packets

Times Are Changing

Backbone Links are Virtual

• Flexible underlying transport network
  – Layer-3 links are multi-hop paths at layer 2

[Figure: network nodes labeled Chicago, New York, and Washington D.C.]

Routing Separate From Forwarding

• Separation of functionality
  – Control plane: computes paths
  – Forwarding plane: forwards packets

[Figure: router with a processor, a switching fabric, and six line cards, labeled with the data plane and the control plane]
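The split between the two planes can be illustrated with a minimal sketch. The control plane runs a shortest-path computation and fills a forwarding table; the forwarding plane only performs longest-prefix lookups in that table. The link costs, the prefix assignments, and all function names are assumptions made for illustration, not part of the talk.

```python
import heapq
import ipaddress

def compute_first_hops(graph, source):
    """Control plane: Dijkstra over link costs; returns {destination: first hop}."""
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source, source)]  # (distance, node, first hop used to reach it)
    while heap:
        d, node, hop = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale entry; a shorter path was already found
        if node != source:
            first_hop.setdefault(node, hop)
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                next_hop = neighbor if node == source else hop
                heapq.heappush(heap, (nd, neighbor, next_hop))
    return first_hop

def build_fib(prefix_owner, first_hop):
    """Control plane: turn computed paths into a forwarding table."""
    return {ipaddress.ip_network(prefix): first_hop[owner]
            for prefix, owner in prefix_owner.items() if owner in first_hop}

def forward(fib, address):
    """Forwarding plane: longest-prefix match, no route computation."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in fib if addr in net]
    if not matches:
        return None
    return fib[max(matches, key=lambda net: net.prefixlen)]

# Hypothetical topology and prefixes (all link costs set to 1).
GRAPH = {
    "Chicago": {"New York": 1, "Washington D.C.": 1},
    "New York": {"Chicago": 1, "Washington D.C.": 1},
    "Washington D.C.": {"Chicago": 1, "New York": 1},
}
PREFIX_OWNER = {"12.0.0.0/8": "Washington D.C.", "12.1.0.0/16": "New York"}

fib = build_fib(PREFIX_OWNER, compute_first_hops(GRAPH, "Chicago"))
```

Here `forward(fib, "12.1.2.3")` picks the more specific /16 entry, while an address elsewhere in 12.0.0.0/8 falls back to the /8 entry.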

Multiple Virtual Routers

• Multiple virtual routers on same physical one
  – Virtual Private Networks (VPNs)
  – Router consolidation for smaller footprint

[Figure: switching fabric, labeled with the data plane and the control plane]

Capitalizing on Virtualization

• Simplify network management
  – Hide planned changes in the physical topology

• Improve router reliability
  – Survive bugs in complex routing software

• Deploy new value-added services
  – Customized protocols in virtual networks

• Enable new network business models
  – Separate service providers from the infrastructure

What should the router “hypervisor” look like?

VROOM: Virtual Routers On the Move

With Yi Wang, Eric Keller, Brian Biskeborn, and Kobus van der Merwe

The Two Notions of “Router”

• IP-layer logical functionality, and physical equipment

[Figure: logical (IP layer) and physical views of the network]

Tight Coupling of Physical & Logical

• Root of many network-management challenges (and “point solutions”)

[Figure: logical (IP layer) and physical views of the network]

VROOM: Breaking the Coupling

• Re-mapping logical node to another physical node

[Figure: logical (IP layer) and physical views of the network]

VROOM enables this re-mapping of logical to physical through virtual router migration.

Case 1: Planned Maintenance

• NO reconfiguration of VRs, NO reconvergence

[Figure: physical routers A and B, with virtual router VR-1]


Case 2: Service Deployment/Evolution

• Move (logical) router to more powerful hardware

• VROOM guarantees seamless service to existing customers during the migration

Case 3: Power Savings

• Hundreds of millions of dollars per year in electricity bills

• Contract and expand the physical network according to the traffic volume
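One way to picture "contracting" the physical network is as a packing problem: when traffic is low, place the virtual routers on as few physical routers as possible so that the rest can be powered down. The sketch below uses a first-fit-decreasing heuristic. The loads, the capacity, and the heuristic itself are assumptions for illustration; the talk does not specify a placement algorithm, and a real placement would also have to respect link connectivity.

```python
def consolidate(vr_load, capacity):
    """First-fit decreasing: pack virtual routers onto few physical routers.

    vr_load: {virtual router name: traffic load}, in the same unit as capacity.
    Returns a list of groups; each group shares one physical router.
    """
    hosts = []  # each host: {"load": total load, "vrs": [names]}
    for vr, load in sorted(vr_load.items(), key=lambda kv: -kv[1]):
        if load > capacity:
            raise ValueError(f"{vr} does not fit on any physical router")
        for host in hosts:
            if host["load"] + load <= capacity:
                host["load"] += load
                host["vrs"].append(vr)
                break
        else:
            hosts.append({"load": load, "vrs": [vr]})
    return [host["vrs"] for host in hosts]

# Hypothetical loads at peak and off-peak hours, with capacity 10 per router.
peak = consolidate({"VR-1": 6, "VR-2": 5, "VR-3": 7, "VR-4": 4}, capacity=10)
off_peak = consolidate({"VR-1": 2, "VR-2": 1, "VR-3": 3, "VR-4": 1}, capacity=10)
```

With these made-up numbers the peak placement needs three physical routers and the off-peak placement needs one.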

Virtual Router Migration: Challenges

1. Migrate an entire virtual router instance
   • All control-plane processes & data-plane states
2. Minimize disruption
   • Data plane: millions of packets/sec on a 10 Gbps link
   • Control plane: less strict (with routing-message retransmission)
3. Link migration
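The "millions of packets/sec" figure follows from simple arithmetic. The sketch below assumes Ethernet framing, where each frame carries 20 extra bytes on the wire (8-byte preamble plus 12-byte inter-frame gap); the function names are illustrative.

```python
def packets_per_second(link_bps, frame_bytes, overhead_bytes=20):
    """Packets per second on a saturated link.

    overhead_bytes is the per-frame Ethernet preamble (8) plus
    inter-frame gap (12), which occupy the wire but carry no data.
    """
    return link_bps / ((frame_bytes + overhead_bytes) * 8)

def packets_during_outage(link_bps, frame_bytes, outage_seconds):
    """Packets that would arrive during a data-plane outage of the given length."""
    return packets_per_second(link_bps, frame_bytes) * outage_seconds

min_size_pps = packets_per_second(10e9, 64)    # minimum-size frames
max_size_pps = packets_per_second(10e9, 1518)  # full-size frames
```

At 10 Gbps, minimum-size frames arrive at roughly 14.9 million per second, so even a one-millisecond data-plane outage would affect on the order of ten thousand packets.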

VROOM Architecture

[Figure: architecture diagram; labeled components include Dynamic Interface Binding and the Data-Plane Hypervisor]

VROOM’s Migration Process

• Key idea: separate the migration of control and data planes

1. Migrate the control plane
2. Clone the data plane
3. Migrate the links

Control-Plane Migration

• Leverage virtual server migration techniques
• Router image
  – Binaries, configuration files, etc.
• Memory
  – 1st stage: iterative pre-copy
  – 2nd stage: stall-and-copy (when the control plane is “frozen”)

[Figure: physical routers A and B, with the control plane (CP) and data plane (DP)]
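The two memory-copy stages can be sketched as a small simulation. Pages are copied while the control plane keeps running, pages dirtied in the meantime are re-copied in further rounds, and only the small remainder is copied while the control plane is frozen. The page model, the stopping threshold, and the workload are assumptions for illustration, not details from the talk.

```python
def migrate_memory(source_pages, run_one_round, max_rounds=5, stop_threshold=2):
    """Iterative pre-copy followed by stall-and-copy.

    source_pages: {page id: contents}, mutated by the running control plane.
    run_one_round(round_no): lets the control plane run for one copy round
        and returns the set of page ids it dirtied.
    Returns (destination pages, number of pages copied while frozen).
    """
    destination = {}
    to_copy = set(source_pages)  # the first round copies every page
    for round_no in range(max_rounds):
        # Stage 1: copy while the control plane is still running.
        for page in to_copy:
            destination[page] = source_pages[page]
        to_copy = run_one_round(round_no)  # pages that changed meanwhile
        if len(to_copy) <= stop_threshold:
            break
    # Stage 2: the control plane is frozen; copy the remaining dirty pages.
    for page in to_copy:
        destination[page] = source_pages[page]
    return destination, len(to_copy)

# Hypothetical workload: each round dirties fewer pages than the last.
source = {i: f"v0-{i}" for i in range(8)}

def run_control_plane(round_no):
    dirtied = set(range(max(0, 4 - 2 * round_no)))
    for page in dirtied:
        source[page] = f"v{round_no + 1}-{page}"
    return dirtied

destination, frozen_copies = migrate_memory(source, run_control_plane)
```

In this run only two of the eight pages are copied during the freeze, which is what keeps the control-plane downtime short.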

Data-Plane Cloning

• Clone the data plane by repopulation
  – Enable migration across different data planes
  – Avoid copying duplicate information

[Figure: physical routers A and B, with CP, DP-old, and DP-new]
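A minimal sketch of cloning by repopulation: instead of copying the old forwarding table byte for byte, the control plane reinstalls its selected routes into the new data plane through a common `install` call, so the two data planes may store entries differently. Both data-plane classes, the RIB layout, and the route-selection rule are hypothetical.

```python
class SoftwareDataPlane:
    """Hypothetical software forwarding table: a plain dictionary."""
    def __init__(self):
        self.table = {}

    def install(self, prefix, next_hop):
        self.table[prefix] = next_hop

class HardwareDataPlane:
    """Hypothetical hardware-style table: entries kept longest prefix first."""
    def __init__(self):
        self.entries = []

    def install(self, prefix, next_hop):
        self.entries = [e for e in self.entries if e[0] != prefix]
        self.entries.append((prefix, next_hop))
        self.entries.sort(key=lambda e: -int(e[0].split("/")[1]))

def best_routes(rib):
    """Select the lowest-cost candidate for each prefix in the RIB."""
    return {prefix: min(candidates, key=lambda c: c[1])[0]
            for prefix, candidates in rib.items()}

def repopulate(rib, data_plane):
    """Clone by reinstalling routes derived from control-plane state."""
    for prefix, next_hop in best_routes(rib).items():
        data_plane.install(prefix, next_hop)

# Hypothetical RIB: prefix -> list of (outgoing interface, cost).
rib = {
    "12.0.0.0/8": [("IF 2", 10), ("IF 1", 20)],
    "12.1.0.0/16": [("IF 1", 5)],
}
dp_old = SoftwareDataPlane()
dp_new = HardwareDataPlane()
repopulate(rib, dp_old)
repopulate(rib, dp_new)
```

Because the RIB already travels with the control plane's memory, the forwarding table does not need to be transferred a second time; this is one reading of "avoid copying duplicate information".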

Remote Control Plane

• Data-plane cloning takes time
  – Installing 250k routes takes over 20 seconds
• Control & old data planes need to be kept “online”
• Solution: redirect routing messages through tunnels

[Figure: physical routers A and B, with CP, DP-old, and DP-new]

Double Data Planes

• At the end of data-plane cloning, both data planes are ready to forward traffic

[Figure: CP, DP-old, and DP-new]

Asynchronous Link Migration

• With the double data planes, links can be migrated independently

[Figure: neighbors A and B, with CP, DP-old, and DP-new]
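Why double data planes permit independent link migration can be sketched as follows: both data planes hold the same forwarding table, so a packet gets the same lookup result whichever data plane its incoming link is currently bound to. The class, the link names, and the ingress-only model are assumptions for illustration; the sketch does not model how packets leave the router.

```python
class MigratingRouter:
    """Hypothetical virtual router with an old and a cloned data plane."""

    def __init__(self, fib, links):
        self.fib_old = dict(fib)
        self.fib_new = dict(fib)  # result of data-plane cloning
        self.binding = {link: "old" for link in links}

    def migrate_link(self, link):
        """Move one link to the new data plane, independently of the others."""
        self.binding[link] = "new"

    def forward(self, in_link, prefix):
        """Look up the prefix in whichever data plane the link is bound to."""
        plane = self.binding[in_link]
        fib = self.fib_old if plane == "old" else self.fib_new
        return plane, fib.get(prefix)

    def old_data_plane_idle(self):
        """True once every link has moved and DP-old can be removed."""
        return all(plane == "new" for plane in self.binding.values())

router = MigratingRouter({"12.0.0.0/8": "IF 2"}, links=["to-A", "to-B"])
before = router.forward("to-A", "12.0.0.0/8")
router.migrate_link("to-A")
after_a = router.forward("to-A", "12.0.0.0/8")
still_old_b = router.forward("to-B", "12.0.0.0/8")
```

Between the two link migrations the router serves one link from each data plane, and the forwarding decision is the same on both.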

Prototype Implementation

• Virtualized operating system
  – OpenVZ, supports VM migration
• Routing protocols
  – Quagga software suite
• Packet forwarding
  – NetFPGA hardware
• Router hypervisor
  – Our extensions for repopulating data plane, remote control plane, double data planes, …

Experimental Results

• Data plane: NetFPGA
  – No packet loss or extra delay
• Control plane: Quagga routing software
  – All routing-protocol adjacencies stay up
  – Core router migration (intradomain only)
    • Inject an unplanned link failure at another router
    • At most one retransmission of an OSPF message
  – Edge router migration (intra and interdomain)
    • Control-plane downtime: 3.56 seconds
    • Within reasonable keep-alive timer intervals
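The keep-alive claim can be checked with a short calculation. The sketch assumes commonly used timer values (OSPF: 10 s hello and 40 s dead interval; BGP: 30 s keepalive and the 90 s hold time suggested in RFC 4271); operators often configure other values. In the worst case a neighbor hears nothing for one full keep-alive interval plus the downtime.

```python
# (keep-alive interval, timeout) in seconds; assumed common defaults.
TIMERS = {
    "OSPF": (10.0, 40.0),  # HelloInterval, RouterDeadInterval
    "BGP": (30.0, 90.0),   # KeepAlive, HoldTime
}

CONTROL_PLANE_DOWNTIME = 3.56  # seconds, from the measurement on the slide

def adjacency_survives(downtime_s, keepalive_s, timeout_s):
    """True if the neighbor's timer cannot expire during the freeze.

    Worst case: the last keep-alive was sent just before the freeze and
    the next one is delayed by the full downtime.
    """
    return keepalive_s + downtime_s < timeout_s

results = {
    protocol: adjacency_survives(CONTROL_PLANE_DOWNTIME, keepalive, timeout)
    for protocol, (keepalive, timeout) in TIMERS.items()
}
```

Under these assumed timers both adjacencies stay up with a wide margin.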

Conclusions on VROOM

• Useful network-management primitive
  – Separate tight coupling between physical and logical
  – Simplify management, enable new applications

• Evaluation of prototype
  – No disruption in packet forwarding
  – No noticeable disruption in routing protocols

• Ongoing work
  – Migration scheduling as an optimization problem
  – Extensions to hypervisor for other applications

VERB: Virtually Eliminating Router Bugs

With Eric Keller, Minlan Yu, and Matt Caesar

Router Bugs Are Important

• Routing software is complicated
  – Leads to programming errors (aka “bugs”)
  – Recent string of high-profile outages

• Bugs different from traditional failures
  – Byzantine failures, don’t simply crash the router
  – Violate protocol, and cause cascading outages

• The problem is getting worse
  – Software is getting more complicated
  – Other outages becoming less common
  – Vendors allowing third-party software

Exploit Software and Data Diversity

• Many sources of diversity
  – Diverse code (Quagga, XORP, BIRD)
  – Diverse protocols (OSPF and IS-IS)
  – Diverse environment (timing, ordering, memory)

• Reasonable overhead
  – Extra processor blade for hardware reliability
  – Multi-core processors, separate route servers, …

• Special properties of routing software
  – Clear interfaces to data plane and other routers
  – Limited dependence on past history

Handling Bugs at Run Time

• Diverse replication
  – Run multiple control planes in parallel
  – Vote on routing messages and forwarding table

[Figure: three protocol daemons, each with its own RIB; a hypervisor with an UPDATE VOTER, a FIB VOTER, and a REPLICA MANAGER; the forwarding table (FIB) with interfaces IF 1 and IF 2]

Replicating Incoming Routing Messages

[Figure: the same architecture, with an incoming update for 12.0.0.0/8]

No need for protocol parsing: operates at socket level

Voting: Updates to Forwarding Table

[Figure: the same architecture, with an update for 12.0.0.0/8 and the FIB entry 12.0.0.0/8 → IF 2]

Transparent by intercepting calls to “Netlink”

Voting: Control-Plane Messages

[Figure: the same architecture, with an update for 12.0.0.0/8 and the FIB entry 12.0.0.0/8 → IF 2]

Transparent by intercepting socket system calls
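The replicate-then-vote idea can be sketched in a few lines. The same incoming bytes are handed to every replica, and an output is accepted only if a strict majority of replicas produced exactly the same bytes. The replica functions are hypothetical stand-ins for routing daemons, and one is deliberately buggy; the voter never parses what it compares.

```python
from collections import Counter

def replicate(message, replicas):
    """Hand the same opaque bytes to every replica and collect the outputs."""
    return [replica(message) for replica in replicas]

def majority_vote(outputs):
    """Return the output shared by a strict majority, or None if there is none.

    Outputs are compared as opaque byte strings. Expects a non-empty list.
    """
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

# Hypothetical replicas: each maps a routing update to a FIB update.
def daemon_a(update):
    return update + b" -> IF 2"

def daemon_b(update):
    return update + b" -> IF 2"

def daemon_buggy(update):
    return update + b" -> IF 1"  # simulated software bug

outputs = replicate(b"12.0.0.0/8", [daemon_a, daemon_b, daemon_buggy])
fib_update = majority_vote(outputs)
```

The buggy replica is outvoted, so only the entry for IF 2 would reach the forwarding table.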

Simple Voting and Recovery

• Tolerate transient periods of disagreement
  – During routing-protocol convergence (tens of sec)

• Several different voting mechanisms
  – Master-slave vs. wait-for-consensus

• Small, trusted software component
  – No parsing, treats data as opaque strings
  – Just 514 lines of code in our implementation

• Recovery
  – Kill faulty instance, and invoke a new one
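The two voting mechanisms named above trade latency against safety. The sketch below shows one plausible reading, which is an assumption rather than the authors' implementation: wait-for-consensus emits nothing until a majority agrees, while master-slave emits the master's output at once and corrects it later if the majority disagrees. The replica ids and outputs are made up.

```python
from collections import Counter

def majority(outputs, n_replicas):
    """Value reported by a strict majority of all replicas, or None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > n_replicas / 2 else None

def wait_for_consensus(arrivals, n_replicas):
    """Emit only once a majority agrees.

    arrivals: list of (replica id, output) in arrival order.
    Returns (value, number of outputs waited for); value is None if no
    majority ever forms.
    """
    seen = []
    for _, output in arrivals:
        seen.append(output)
        value = majority(seen, n_replicas)
        if value is not None:
            return value, len(seen)
    return None, len(seen)

def master_slave(arrivals, master_id, n_replicas):
    """Emit the master's output immediately, then check it against the majority.

    Returns (value emitted first, correction or None).
    """
    outputs = dict(arrivals)
    emitted = outputs[master_id]
    agreed = majority(list(outputs.values()), n_replicas)
    correction = agreed if agreed is not None and agreed != emitted else None
    return emitted, correction

def faulty_replicas(arrivals, n_replicas):
    """Replica ids that disagree with the majority: candidates for restart."""
    agreed = majority([output for _, output in arrivals], n_replicas)
    if agreed is None:
        return []
    return [rid for rid, output in arrivals if output != agreed]

arrivals = [("r0", b"IF 1"), ("r1", b"IF 2"), ("r2", b"IF 2")]
consensus = wait_for_consensus(arrivals, n_replicas=3)
master_first = master_slave(arrivals, master_id="r0", n_replicas=3)
suspects = faulty_replicas(arrivals, n_replicas=3)
```

Since disagreement is expected for tens of seconds during convergence, a real replica manager would presumably restart a suspect only after the disagreement persists, not on the first mismatch.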

Conclusion on Bug-Tolerant Router

• Seriousness of routing software bugs
  – Cause serious outages, misbehavior, vulnerability
  – Violate protocol semantics, so not handled by traditional failure detection and recovery

• Software and data diversity
  – Effective, and has reasonable overhead

• Design and prototype of bug-tolerant router
  – Works with Quagga, XORP, and BIRD software
  – Low overhead, and small trusted code base

Conclusions for the Talk

• Router virtualization is exciting
  – Enables wide variety of new networking techniques
  – … for network management & service deployment
  – … and even rethinking the Internet architecture

• Fascinating space of open questions
  – Other possible applications of router virtualization?
  – What is the right interface to router hardware?
  – What is the right programming environment for customized protocols on virtual networks?

http://www.cs.princeton.edu/~jrex/virtual.html