RouterFarm: Towards a Dynamic, Manageable Network Edge

Mukesh Agrawal, Bobbi Bailey, Zihui Ge, Albert Greenberg, Kobus van der Merwe, Jorge Pastor, Panagiotis Sebos, Srinivasan Seshan, and Jennifer Yates

Internet Network Management Workshop 2006


Today's IP Networks

[Figure: network diagram labeled Customers, Customer Router, Edge Router, Backbone Router, ISP Backbone]

The Weakest Link

[Figure: network diagram labeled Customers, ISP Backbone]

The network edge is a major source of customer downtime, due to...

• software updates
• OS crashes
• CPU failures
• line card failures
• etc.

Edge vs. Backbone Routers

[Figure: network diagram labeled Customers, ISP Backbone]

                      Backbone          Edge
Network Layer         IP, OSPF, MPLS    IP, OSPF, MPLS, BGP, EIGRP, VPN, ACLs
Link Protocols        POS, Ethernet     POS, Ethernet, ATM, Frame Relay, DS3, DSL, ...
Redundancy            High              Low/None
Scale (# interfaces)  Low (1,000s)      High (10,000s)

The State of the Art

[Figure: network diagram labeled Customers, ISP Backbone]

Vendors have proposed a collection of ad-hoc solutions...

• hitless updates
• 1:1 redundant CPUs with fail-over
• 1:1 redundant line cards

These solutions

• are costly
• introduce complexity
• tie ISPs to vendor priorities/schedules
• each requires new testing

A Better Way?

[Figure: network diagram labeled Customers, ISP Backbone]

Let routers fail, but make service restoration fast and easy (like RAID and server farms)

Share resources to minimize cost

Develop one technique that works across a variety of scenarios

The RouterFarm Way

Manage routers as a “Router Farm”, dynamically moving customers as necessary

RouterFarm in Action (Planned Maintenance)

1. Extract customer configuration from initial router

2. Install customer configuration onto target router

3. Reconfigure transport (layer 2) connectivity

4. Wait for network to converge

5. Perform maintenance

[Figure label: BGP]

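
The five steps above can be sketched as a small orchestration routine. This is a minimal illustration, not the authors' tooling: the `Router` class, its dictionary-based configuration store, and the `rehome` function are hypothetical names. Steps 3 and 4 are modeled as callbacks because the slides do not say how the transport network is driven or how convergence is detected.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Router:
    """Hypothetical stand-in for an edge router: maps customer -> config lines."""
    name: str
    customers: Dict[str, List[str]] = field(default_factory=dict)


def rehome(customer: str, initial: Router, target: Router,
           reconfigure_transport: Callable[[str, str, str], None],
           wait_for_convergence: Callable[[], None]) -> List[str]:
    """Move one customer from `initial` to `target`; returns a log of steps."""
    log = []
    # 1. Extract customer configuration from the initial router.
    config = initial.customers.pop(customer)
    log.append("extract")
    # 2. Install customer configuration onto the target router.
    target.customers[customer] = config
    log.append("install")
    # 3. Reconfigure transport (layer 2) connectivity.
    reconfigure_transport(customer, initial.name, target.name)
    log.append("transport")
    # 4. Wait for the network to converge.
    wait_for_convergence()
    log.append("converge")
    return log


if __name__ == "__main__":
    initial = Router("initial", {"cust1": ["interface Serial 1/0/1"]})
    target = Router("target")
    moves = []
    steps = rehome("cust1", initial, target,
                   lambda c, src, dst: moves.append((c, src, dst)),
                   lambda: None)
    print(steps)
    # 5. Perform maintenance: the initial router no longer serves the customer.
    print(initial.customers, target.customers)
```

Once every customer has been moved this way, the initial router carries no customer traffic and step 5 can proceed without further customer impact.
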
RouterFarm Viability

[Figure: testbed diagram labeled Router Farm, Initial, Target, Cross-Connect, Transport Network, IP/MPLS network (twice), Remote Edge, Customer 1, Customer 2, Server, Traffic Generator]

RouterFarm Benefits (Planned Maintenance)

Today: outage of 10-15 min

RouterFarm: outage of 2 x 1 min

Time Breakdown

[Figure: outage timeline; phases labeled as follows]

Link Up: 2
Physical Up: 15
Config Down: 5
Routes CE: 24
Routes Target: 2
BGP Up: 28
Routes PE: 21

Total outage: 57 seconds
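
A quick check on the labeled values, assuming they are durations in seconds (the slide gives units only for the total). They add up to more than the stated total, which suggests that some phases overlap in time rather than running strictly back to back.

```python
# Phase labels and values as they appear on the slide. Treating them as
# durations in seconds is an assumption; only the total is given with units.
phases = {
    "Link Up": 2,
    "Physical Up": 15,
    "Config Down": 5,
    "Routes CE": 24,
    "Routes Target": 2,
    "BGP Up": 28,
    "Routes PE": 21,
}
total_outage = 57  # seconds, from the slide

sum_of_phases = sum(phases.values())
print(sum_of_phases)                 # 97
print(sum_of_phases > total_outage)  # True: phases cannot all be sequential
```
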

Scaling in Customer Routes

[Figure: outage in seconds (y-axis, 0 to 100) versus # of routes (x-axis: 10, 500, 1000, 2000, 3000, 4000, 5000); mean and 95% confidence interval from 10 runs]

Author note: replace CIs with quartiles or similar (CI doesn't make sense, since the times are probably not normally distributed)

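
The note about quartiles can be illustrated with Python's standard `statistics` module. The ten outage times below are made-up placeholder values, not measurements from the talk. The sketch only shows how a median and quartiles would be computed next to a normal-approximation 95% interval; with one slow run in the sample, the mean is pulled above the median.

```python
import statistics

# Hypothetical outage times (seconds) for 10 runs; NOT data from the slides.
runs = [52, 54, 55, 55, 56, 57, 58, 60, 63, 88]

mean = statistics.mean(runs)
# Normal-approximation 95% CI for the mean (assumes roughly normal data;
# a t-based interval would be wider for n = 10).
half_width = 1.96 * statistics.stdev(runs) / len(runs) ** 0.5

# Quartiles make no distributional assumption.
q1, median, q3 = statistics.quantiles(runs, n=4)

print(f"mean {mean:.1f} +/- {half_width:.1f}")
print(f"median {median}, quartiles [{q1}, {q3}]")
```
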
RouterFarm Questions

• How can we reduce outage times further?

• How do outage times scale with number of customers?

• Can we manage configuration in heterogeneous networks?

• How do we keep up with an evolving network?

Challenge: Extracting Configuration

    ip vrf VPN1
     …
    controller T1 1/0
     …
    router bgp 65535
     neighbor 192.168.10.2
     network 10.1.0.0/16
    interface Serial 1/0/1
     ip address 192.168.10.5/30
     ppp XXX
    interface Ethernet 2/0
     ip address 192.168.10.1/30
     vrf forwarding VPN1
     …
    interface ATM3/0/1
     ip address 192.168.10.9/30
     ppp XXX
    interface Multilink 1000
    ip route 10.1.1.0/24 Serial1/0/1
    ip route 10.1.2.0/24 ATM3/0/1

Callouts on the configuration:

• check vrf definition
• check controller definition
• check ATM interface configuration
• check serial interface configuration

• Extraction varies with interface and service

• Configuration idioms can make some of this easier

• Tools which infer relationships may help further

Author note: add ppp chap hostname stuff

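
As a rough illustration of why extraction depends on following references, the sketch below parses an excerpt of the slide's configuration into stanzas and collects the lines tied to one interface. It is not the authors' tool: `parse_stanzas`, `normalize`, and `extract_for_interface` are hypothetical helpers, the elided (…) and `ppp XXX` lines are left out of the excerpt, and only two kinds of dependency are followed (VRF references and static routes). Controller and BGP neighbor dependencies would need further rules.

```python
from typing import Dict, List

# Excerpt of the slide's example configuration (elided lines omitted).
CONFIG = """\
ip vrf VPN1
router bgp 65535
 neighbor 192.168.10.2
 network 10.1.0.0/16
interface Serial 1/0/1
 ip address 192.168.10.5/30
interface Ethernet 2/0
 ip address 192.168.10.1/30
 vrf forwarding VPN1
interface ATM3/0/1
 ip address 192.168.10.9/30
ip route 10.1.1.0/24 Serial1/0/1
ip route 10.1.2.0/24 ATM3/0/1
"""


def parse_stanzas(text: str) -> Dict[str, List[str]]:
    """Group indented lines under the preceding unindented line."""
    stanzas: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] != " ":
            current = line.strip()
            stanzas[current] = []
        elif current is not None:
            stanzas[current].append(line.strip())
    return stanzas


def normalize(name: str) -> str:
    """Make 'Serial 1/0/1' and 'Serial1/0/1' compare equal."""
    return name.replace(" ", "").lower()


def extract_for_interface(stanzas: Dict[str, List[str]],
                          interface: str) -> List[str]:
    """Collect the interface stanza plus lines that reference it."""
    want = normalize(interface)
    picked: List[str] = []
    for header, body in stanzas.items():
        if (header.startswith("interface ")
                and normalize(header[len("interface "):]) == want):
            picked.append(header)
            picked.extend(body)
            # Follow any VRF the interface forwards into.
            for child in body:
                if child.startswith("vrf forwarding "):
                    vrf_header = "ip vrf " + child.split()[-1]
                    if vrf_header in stanzas:
                        picked.append(vrf_header)
                        picked.extend(stanzas[vrf_header])
        elif (header.startswith("ip route ")
                and normalize(" ".join(header.split()[3:])) == want):
            # Static route whose outgoing interface is the one being moved.
            picked.append(header)
    return picked


if __name__ == "__main__":
    for line in extract_for_interface(parse_stanzas(CONFIG), "Serial 1/0/1"):
        print(line)
```

Note that the interface is written `Serial 1/0/1` in its own stanza but `Serial1/0/1` in the static route, which is why names are normalized before comparison.
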
Challenge: Integrating Configuration

• Customer configuration depends on “global” configuration options

• What if configuration differs between routers?
  – Configuration difficult to reason about, but heuristics might help…
  – Observation: some things should differ, others should not
  – Idea: use frequency with which an option differs across network to estimate probability of error

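
One possible reading of the frequency idea, sketched with hypothetical data: for each global option, measure how often routers deviate from the most common value. Options that vary on many routers are probably meant to differ, while an option that is uniform except on a handful of routers is a likely error. The function name, the option names and values, and the 0.2 threshold are illustrative assumptions, not taken from the talk.

```python
from collections import Counter
from typing import Dict, List, Tuple


def suspicious_settings(configs: Dict[str, Dict[str, str]],
                        threshold: float = 0.2) -> List[Tuple[str, str, str]]:
    """Return (router, option, value) triples whose value is rare network-wide.

    For each option, the fraction of routers deviating from the most common
    value estimates how normal it is for that option to differ.
    """
    options = sorted({opt for cfg in configs.values() for opt in cfg})
    flagged = []
    for opt in options:
        values = Counter(cfg.get(opt, "<unset>") for cfg in configs.values())
        common_value, common_count = values.most_common(1)[0]
        differ_fraction = 1 - common_count / len(configs)
        # Rarely-differing option: deviations are more likely to be mistakes.
        if 0 < differ_fraction <= threshold:
            for router in sorted(configs):
                value = configs[router].get(opt, "<unset>")
                if value != common_value:
                    flagged.append((router, opt, value))
    return flagged


if __name__ == "__main__":
    # Hypothetical network: hostnames differ everywhere, CRC length on one box.
    configs = {f"r{i}": {"hostname": f"r{i}", "crc": "32"} for i in range(10)}
    configs["r7"]["crc"] = "16"
    print(suspicious_settings(configs))
```

In this example the hostname differs on every router and is therefore ignored, while the single router with a different CRC setting is flagged for review.
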
Conclusion

• RouterFarm provides a solution to many edge-router reliability problems

• RouterFarm improves outage times for planned maintenance

• Configuration potentially an obstacle; need new tools and techniques to minimize risk

• Performance at scale, and evolving with the network require further investigation


Thank you


Backup

Lab Experiments

Testing Goals

• Good coverage over customer configs

• Limited hardware requirements

• Automated

• Fast (hopefully, run every night)

Testing Design

[Figure: initial router and target router, with repeated labels A and B and a final comparison marked “=?”]

Batched Route Transfer

[Figure: message sequence among Target Router, PE, and CE2, with these labels in order]

• BGP Established
• Customer Routes
• Partial Customer Routes
• IBGP MinAdver Timer (5 sec)
• Partial Customer Routes
• EBGP MinAdver Timer (30 sec)
• Remaining Customer Routes
• Remaining Customer Routes
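
The effect of the two timers can be approximated with a simplified per-peer model: a batch of routes is advertised as soon as it is ready, unless the previous advertisement to that peer was sent less than the MinAdver interval earlier. The sketch below rests on several assumptions not stated on the slide: the target router reaches the PE over iBGP and the PE reaches CE2 over eBGP, routes arrive at the target in two batches one second apart, and transmission and processing delays are ignored. `advertise_times` is a hypothetical helper.

```python
from typing import List


def advertise_times(ready: List[float], min_adver: float) -> List[float]:
    """Times at which a BGP speaker can send each batch of routes to one peer.

    A batch is sent as soon as it is ready, unless the previous advertisement
    was less than `min_adver` seconds earlier, in which case it waits.
    """
    sent: List[float] = []
    for t in sorted(ready):
        earliest = sent[-1] + min_adver if sent else t
        sent.append(max(t, earliest))
    return sent


IBGP_MIN_ADVER = 5.0   # seconds, from the slide
EBGP_MIN_ADVER = 30.0  # seconds, from the slide

# Assumption: two batches of customer routes become available at the target.
at_target = [0.0, 1.0]
at_pe = advertise_times(at_target, IBGP_MIN_ADVER)  # target -> PE over iBGP
at_ce2 = advertise_times(at_pe, EBGP_MIN_ADVER)     # PE -> CE2 over eBGP
print(at_pe, at_ce2)
```

Under these assumptions the remaining routes are held back by the timers rather than by the routers' processing speed, which is consistent with the slide's split between partial and remaining customer routes.
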


Clipboard

The RouterFarm Way

Migration Challenges

• Transport layer capacity (IP vs. transport, bandwidth, duration, distance)

• Inconsistent/noisy data (circuit IDs, transport routing, configuration errors)

• Scale (# routes, # customers)

• Network diversity (DS1 vs. ATM, BGP vs. static, VPNs, CoS)

Feasibility: Goals

• Demonstrate feasibility using “off-the-shelf” commercial routers

• Establish that we reduce outage time over existing practice (especially for planned maintenance)

• Quantify variability in re-homing times

• Determine scaling of outage time in number of routes

Ongoing Work

Challenges

• Scale: can we move all customers to a new router
  – without overwhelming the new router?
  – without overwhelming the network?

• Diversity: moving customers requires configuration of numerous network layers, protocols, and parameters. In a network with 1000s of customers,
  – how do we develop dynamic reconfiguration tools?
  – how do we test these tools, without elaborate (and expensive) testbeds?

Router Configuration Complications

• So many configuration options!!!

• Complicated dependencies: how to extract relevant configuration? (need to understand network services)

• Inconsistent defaults (e.g. CRC length, POS scrambling)

• Channelized vs. unchannelized line cards (“clock source” irrelevant for channelized interfaces)

The RouterFarm Way