VROOM: Virtual ROuters On the Move

Jennifer Rexford

Joint work with Yi Wang, Eric Keller, Brian Biskeborn, and Kobus van der Merwe

http://www.cs.princeton.edu/~jrex/papers/vroom08.pdf

Virtual ROuters On the Move

• Key idea
  – Routers should be free to roam around

• Useful for many different applications
  – Simplify network maintenance
  – Simplify service deployment and evolution
  – Reduce power consumption
  – …

• Feasible in practice
  – No performance impact on data traffic
  – No visible impact on routing protocols

The Two Notions of “Router”

• The IP-layer logical functionality, and the physical equipment

[Figure: two layers, labeled “Logical (IP layer)” and “Physical”]

Tight Coupling of Physical & Logical

• Root of many network-management challenges (and “point solutions”)

VROOM: Breaking the Coupling

• Re-mapping a logical node to another physical node

VROOM enables this re-mapping of logical to physical through virtual router migration.

Case 1: Planned Maintenance

• NO reconfiguration of VRs, NO reconvergence

[Figure: physical routers A and B, and virtual router VR-1]


Case 2: Service Deployment/Evolution

• Move (logical) router to more powerful hardware


• VROOM guarantees seamless service to existing customers during the migration


Case 3: Power Savings

• Hundreds of millions of dollars per year in electricity bills


• Contract and expand the physical network according to the traffic volume


Virtual Router Migration: Challenges

1. Migrate an entire virtual router instance
  • All control-plane and data-plane processes and state

2. Minimize disruption
  • Data plane: millions of packets/sec on a 10 Gbps link
  • Control plane: less strict (routing messages are retransmitted)

3. Link migration
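A back-of-the-envelope calculation shows why the data plane is the stricter of the two constraints above. The sketch below assumes standard Ethernet framing (20 bytes of preamble and inter-frame gap per frame on the wire), which the slides do not specify; the function names and the frame sizes are chosen here for illustration only.

```python
# Rough packet-rate estimate for a 10 Gbps link.
# Assumption (not from the slides): Ethernet framing, where each frame
# occupies 20 extra bytes on the wire (8-byte preamble + 12-byte gap).

LINK_BPS = 10 * 10**9      # 10 Gbps
WIRE_OVERHEAD_BYTES = 20   # preamble + inter-frame gap


def packets_per_second(frame_bytes: int, link_bps: int = LINK_BPS) -> float:
    """Maximum frames per second at line rate for a given frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def packets_at_risk(frame_bytes: int, outage_seconds: float) -> float:
    """Packets arriving at line rate during a forwarding outage."""
    return packets_per_second(frame_bytes) * outage_seconds


if __name__ == "__main__":
    for size in (64, 512, 1518):
        rate = packets_per_second(size)
        print(f"{size:5d}-byte frames: {rate / 1e6:.2f} Mpps")
```

Even a brief forwarding outage at these rates would affect a very large number of packets, whereas a routing message delayed on the control plane can simply be retransmitted.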

VROOM Architecture

[Figure: VROOM architecture, with components labeled “Dynamic Interface Binding” and “Data-Plane Hypervisor”]

VROOM’s Migration Process

• Key idea: separate the migration of control and data planes
  1. Migrate the control plane
  2. Clone the data plane
  3. Migrate the links

Control-Plane Migration

• Leverage virtual server migration techniques
• Router image
  – Binaries, configuration files, etc.
• Memory
  – 1st stage: iterative pre-copy
  – 2nd stage: stall-and-copy (when the control plane is “frozen”)

[Figure: physical routers A and B, with the control plane (CP) and data plane (DP)]
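The two memory-migration stages named above can be sketched as a small simulation. This is a minimal illustration of the general pre-copy technique used in virtual server migration, not the prototype's actual code: the page model, the `run_one_round` callback, and the `max_rounds` and `stop_threshold` parameters are all hypothetical.

```python
from typing import Callable, Dict, Set, Tuple

Memory = Dict[int, bytes]  # page number -> page contents


def migrate_memory(
    source: Memory,
    run_one_round: Callable[[Memory], Set[int]],
    max_rounds: int = 5,
    stop_threshold: int = 2,
) -> Tuple[Memory, int]:
    """Copy `source` to a new image; return it and the pages sent while frozen.

    run_one_round(source) models the control plane running during one copy
    round: it may modify `source` and returns the page numbers it dirtied.
    """
    dest: Memory = {}
    to_copy: Set[int] = set(source)  # the first round sends every page

    # Stage 1: iterative pre-copy. The control plane keeps running, so pages
    # it modifies during a round must be sent again in the next round.
    for _ in range(max_rounds):
        for page in to_copy:
            dest[page] = source[page]
        to_copy = run_one_round(source)
        if len(to_copy) <= stop_threshold:
            break

    # Stage 2: stall-and-copy. The control plane is frozen, so nothing can
    # change while the last dirty pages are transferred.
    frozen_pages = len(to_copy)
    for page in to_copy:
        dest[page] = source[page]
    return dest, frozen_pages
```

The point of the first stage is to shrink the set of pages that must be copied while the control plane is frozen, which keeps the frozen period short.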

Data-Plane Cloning

• Clone the data plane by repopulation
  – Enable migration across different data planes
  – Avoid copying duplicate information

[Figure: physical routers A and B, with CP, DP-old, and DP-new]
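Cloning by repopulation can be illustrated with a short sketch: instead of copying the old forwarding table byte for byte, the control plane re-installs its routes into the new data plane through that data plane's own interface, so the two tables may use different internal formats. Every class and function name below is hypothetical, and the two table layouts are simplified stand-ins for a software and a hardware data plane.

```python
import ipaddress
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Tuple


class DataPlane(ABC):
    """Hypothetical install/lookup interface exposed to the control plane."""

    @abstractmethod
    def install(self, prefix: str, next_hop: str) -> None: ...

    @abstractmethod
    def lookup(self, address: str) -> Optional[str]: ...


class DictDataPlane(DataPlane):
    """Stand-in for a software table: a dictionary scanned for the longest match."""

    def __init__(self) -> None:
        self.fib: Dict[ipaddress.IPv4Network, str] = {}

    def install(self, prefix: str, next_hop: str) -> None:
        self.fib[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, address: str) -> Optional[str]:
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.fib if addr in net]
        if not matches:
            return None
        return self.fib[max(matches, key=lambda net: net.prefixlen)]


class OrderedTableDataPlane(DataPlane):
    """Stand-in for a hardware table: entries kept longest-prefix first."""

    def __init__(self) -> None:
        self.table: List[Tuple[ipaddress.IPv4Network, str]] = []

    def install(self, prefix: str, next_hop: str) -> None:
        net = ipaddress.ip_network(prefix)
        self.table = [(n, h) for n, h in self.table if n != net]
        self.table.append((net, next_hop))
        self.table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)

    def lookup(self, address: str) -> Optional[str]:
        addr = ipaddress.ip_address(address)
        for net, hop in self.table:
            if addr in net:
                return hop  # first hit is the longest match
        return None


def clone_by_repopulation(routes: Dict[str, str], new_dp: DataPlane) -> None:
    """Rebuild a forwarding table on `new_dp` from the control plane's routes."""
    for prefix, next_hop in routes.items():
        new_dp.install(prefix, next_hop)
```

Because only the control plane's routes cross between machines, the old and new data planes never need to share a table format, which is what makes migration between different kinds of data plane possible.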

Remote Control Plane

• Data-plane cloning takes time
  – Installing 250k routes takes over 20 seconds
• Control & old data planes need to be kept “online”
• Solution: redirect routing messages through tunnels

[Figure: physical routers A and B, with CP, DP-old, and DP-new]

Double Data Planes

• At the end of data-plane cloning, both data planes are ready to forward traffic

[Figure: CP, DP-old, and DP-new]

Asynchronous Link Migration

• With the double data planes, links can be migrated independently

[Figure: physical routers A and B, with CP, DP-old, and DP-new]
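The independence of link migrations can be shown with a small bookkeeping sketch. It assumes, as stated above, that both data planes hold a complete forwarding table, and it only tracks which data plane each link is attached to. The class and method names are hypothetical, and how traffic is carried between the two physical routers while links are split is deliberately not modeled.

```python
from typing import Dict, Iterable


class DoubleDataPlaneRouter:
    """Tracks which data plane each link of a virtual router is attached to."""

    def __init__(self, links: Iterable[str]) -> None:
        # Every link starts attached to the old data plane.
        self.binding: Dict[str, str] = {link: "DP-old" for link in links}

    def migrate_link(self, link: str) -> None:
        """Move one link; the other links are left untouched."""
        if link not in self.binding:
            raise KeyError(f"unknown link: {link}")
        self.binding[link] = "DP-new"

    def forwarding_plane(self, ingress_link: str) -> str:
        """Data plane that handles packets arriving on `ingress_link`."""
        return self.binding[ingress_link]

    def old_plane_can_be_removed(self) -> bool:
        """True once no link is attached to the old data plane."""
        return all(dp == "DP-new" for dp in self.binding.values())
```

Since either data plane can forward a packet correctly, the order and timing of the individual link moves do not need to be coordinated.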

Prototype Implementation

• Virtualized operating system
  – OpenVZ, supports VM migration
• Routing protocols
  – Quagga software suite
• Packet forwarding
  – Linux kernel (software), NetFPGA (hardware)
• Router hypervisor
  – Our extensions to connect everything together

Experimental Evaluation

• Experiments in Emulab
  – On realistic backbone topologies
• Impact on data traffic
  – Linux: modest packet delay due to CPU load
  – NetFPGA: no packet loss or extra delay
• Impact on routing
  – Control plane downtime: 3.56 seconds
  – Routing-protocol adjacencies stay up
  – At most one retransmission of a message

Where To Migrate

• Physical constraints
  – Latency
    • E.g., NYC to Washington D.C.: 2 msec
  – Link capacity
    • Enough remaining capacity for extra traffic
  – Platform compatibility
    • Routers from different vendors
  – Router capability
    • E.g., number of access control lists (ACLs) supported
• Constraints simplify the placement problem
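One way to read the last point is that each constraint prunes the set of candidate physical routers before any optimization is attempted. The sketch below filters candidates on the four constraints listed. The field names, the exact-match test for platform compatibility, and all thresholds are assumptions made for illustration, not values from the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A physical router that could host the virtual router (hypothetical fields)."""
    name: str
    added_latency_ms: float
    spare_capacity_gbps: float
    platform: str
    max_acls: int


@dataclass
class Requirements:
    """What the virtual router needs from its new host (hypothetical fields)."""
    max_added_latency_ms: float
    needed_capacity_gbps: float
    platform: str
    acls_in_use: int


def feasible_targets(candidates: List[Candidate], req: Requirements) -> List[str]:
    """Names of candidates that satisfy every constraint."""
    return [
        c.name
        for c in candidates
        if c.added_latency_ms <= req.max_added_latency_ms      # latency
        and c.spare_capacity_gbps >= req.needed_capacity_gbps  # link capacity
        and c.platform == req.platform                         # compatibility (simplified)
        and c.max_acls >= req.acls_in_use                      # router capability
    ]
```

Only the candidates that survive this filter need to be considered by a later placement or scheduling step.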

Conclusions & Future Work

• VROOM: useful network-management primitive
  – Break the tight coupling between physical and logical
  – Simplify network management, enable new applications

• Evaluation of prototype
  – No disruption in packet forwarding
  – No noticeable disruption in routing protocols

• Future work
  – Migration scheduling as an optimization problem
  – Extensions to router hypervisor for other applications