Routing
Jinyang Li
Administrivia
• Hand in PS1 to Hui Zhang before you leave!
• Email me project teamlist by Oct-1
Routing basics
• Model the network as a graph
• Goal: Find a (best) path from A to B
• Your favorite path finding algorithm?
  – Breadth first search
  – Bellman-Ford
  – Dijkstra
  – Floyd-Warshall
• Routing protocols must be decentralized
Challenges
• Network topology is dynamic
  – Links go up and down
  – Nodes go up and down
  – Link costs (metrics) change
• Nodes might have stale information
• Nodes might have different information
Basic decentralized routing algorithms
• Distance-vector (DV)
• Link state (LS)
Distance vector routing
• Based on Bellman-Ford
• Nodes only keep path metrics to all destinations
• Neighboring nodes exchange path metrics
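Distance vector routing
[Figure: nodes a, b, c, d; links a-b (cost 1), b-c (cost 1), a-c (cost 10), c-d (cost 1)]
Initial routing tables (destination: next hop, metric):
• Node a: a: a, 0; b: b, 1; c: c, 10
• Node b: a: a, 1; b: b, 0; c: c, 1
• Node c: a: a, 10; b: b, 1; c: c, 0; d: d, 1
• Node d: c: c, 1; d: d, 0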
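DV: routing table update
[Figure: same topology. A node adds the link cost to each metric a neighbor advertises and keeps the better route.]
• a receives b's vector (a: 1, b: 0, c: 1) over the cost-1 link; a's table becomes a: a, 0; b: b, 1; c: b, 2
• b receives c's vector (a: 10, b: 1, c: 0, d: 1) over the cost-1 link; b's table becomes a: a, 1; b: b, 0; c: c, 1; d: c, 2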
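The update step in this figure can be sketched in Python as follows. This is a minimal illustration, not a protocol implementation: the function name `dv_update` and the table layout (destination mapped to a next hop and a metric) are assumptions made for the example.

```python
INF = float("inf")

def dv_update(table, neighbor, link_cost, advertised):
    """Merge a neighbor's advertised metrics into a routing table.

    table maps destination -> (next_hop, metric); advertised maps
    destination -> metric as seen by the neighbor (hypothetical layout).
    """
    updated = dict(table)
    for dest, metric in advertised.items():
        candidate = metric + link_cost           # cost of going through the neighbor
        _, current = updated.get(dest, (None, INF))
        if candidate < current:                  # keep only strictly better routes
            updated[dest] = (neighbor, candidate)
    return updated

# Node a hears b's vector over the cost-1 link a-b.
a_table = {"a": ("a", 0), "b": ("b", 1), "c": ("c", 10)}
a_table = dv_update(a_table, "b", 1, {"a": 1, "b": 0, "c": 1})
print(a_table)
```

With the values from the figure, a's route to c switches from the direct cost-10 link to the two-hop path through b.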
DV: routing table update
• When does DV find best paths?
• Static topology, synchronous exchange
  – A node learns the best path ≤ x hops after x rounds of exchange
  – DV converges after n rounds if the longest shortest path is n hops
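The round-by-round claim can be checked on the four-node example (links a-b 1, b-c 1, a-c 10, c-d 1). The sketch below assumes every node starts out knowing only itself and that all nodes exchange vectors at the same instant; `run_rounds` and the data layout are illustrative names only.

```python
INF = float("inf")

# Undirected links and costs from the four-node example.
LINKS = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 10, ("c", "d"): 1}
NODES = ["a", "b", "c", "d"]

def neighbors(node):
    """Yield (neighbor, link_cost) pairs for a node."""
    for (u, v), cost in LINKS.items():
        if u == node:
            yield v, cost
        elif v == node:
            yield u, cost

def run_rounds(rounds):
    """Return node -> {destination: metric} after synchronous exchanges."""
    # Before any exchange, each node only knows the zero-cost path to itself.
    dist = {n: {n: 0} for n in NODES}
    for _ in range(rounds):
        # Freeze everyone's vector so the exchange is synchronous.
        snapshot = {n: dict(vec) for n, vec in dist.items()}
        for node in NODES:
            for nbr, cost in neighbors(node):
                for dest, metric in snapshot[nbr].items():
                    if metric + cost < dist[node].get(dest, INF):
                        dist[node][dest] = metric + cost
    return dist

for r in range(1, 4):
    print(r, run_rounds(r)["a"])
```

After two rounds a only knows the two-hop path to d through c (metric 11); the three-hop path through b (metric 3) appears in round three, after which nothing changes.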
DV under dynamics
• DV update rule
  – Reduce path metric if you get a better one from a neighbor
• Always correct with static topology
• Might be wrong when topology changes
  – My path metric is based on new topology
  – Neighbor’s path metric could be for old topology
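DV: count-to-infinity
[Figure: same topology; the b-c link fails]
Converged routing tables before the failure (destination: next hop, metric):
• Node a: a: a, 0; b: b, 1; c: b, 2; d: b, 3
• Node b: a: a, 1; b: b, 0; c: c, 1; d: c, 2
• Node c: a: b, 2; b: b, 1; c: c, 0; d: d, 1
• Node d: a: c, 3; b: c, 2; c: c, 1; d: d, 0
After the b-c link fails:
• Node b: a: a, 1; b: b, 0; c: inf; d: inf
• Node c: a: a, 10; b: inf; c: c, 0; d: d, 1
• a advertises its old vector (a: 0, b: 1, c: 2, d: 3) to b; b's table becomes a: a, 1; b: b, 0; c: a, 3; d: a, 4
  – Incorrect update of path metric based on old topology
• b advertises (a: 1, b: 0, c: 3, d: 4) to a; a's table becomes a: a, 0; b: b, 1; c: b, 4; d: b, 5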
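The exchange in this figure can be traced for destination c alone. The sketch below makes two assumptions that the slides do not spell out: a node always accepts a new metric from its current next hop (even a worse one), and within each step a hears from b before it hears from c. The function name `count_to_infinity` is illustrative.

```python
INF = float("inf")

def count_to_infinity(direct_a_c=10, max_steps=20):
    """Trace (a's route, b's route) to destination c after link b-c fails."""
    a = ("b", 2)       # a still believes in the old path a-b-c
    b = (None, INF)    # b has just lost its link to c
    trace = []
    for _ in range(max_steps):
        # a advertises its metric to b over the cost-1 link a-b.
        via_a = a[1] + 1
        if b[0] == "a" or via_a < b[1]:
            b = ("a", via_a)
        # b advertises back; a accepts because b is its current next hop.
        via_b = b[1] + 1
        if a[0] == "b" or via_b < a[1]:
            a = ("b", via_b)
        # c advertises metric 0 to a over the direct a-c link.
        if direct_a_c < a[1]:
            a = ("c", direct_a_c)
        trace.append((a, b))
        if len(trace) >= 2 and trace[-1] == trace[-2]:
            break      # nothing changed: converged
    return trace

for step in count_to_infinity():
    print(step)
```

With the cost-10 link a-c in place, the metrics climb by two per step until a's route through b exceeds 10 and a falls back to the direct link. Passing `direct_a_c=INF` removes that escape route, and the metrics keep growing until `max_steps` is reached.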
(Partial) solutions to count-to-infinity
• Make infinity a finite number (e.g. 64)
• Split-horizon
  – Do not advertise routes you learnt from neighbor N back to N
• Split-horizon with poison reverse
  – Advertise routes you learnt from neighbor N back to N with an infinite metric
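The two split-horizon variants differ only in what a node puts into the vector it sends to a given neighbor. The sketch below treats any route whose next hop is N as "learnt from N", which is a simplifying assumption; the function name `advertisement` and the mode strings are made up for the example.

```python
INF = 64  # "infinity" as a finite number, as suggested above

def advertisement(table, to_neighbor, mode="plain"):
    """Build the distance vector sent to one neighbor.

    table maps destination -> (next_hop, metric).
    mode is "plain", "split_horizon", or "poison_reverse".
    """
    vector = {}
    for dest, (next_hop, metric) in table.items():
        learnt_from_neighbor = next_hop == to_neighbor
        if learnt_from_neighbor and mode == "split_horizon":
            continue                      # stay silent about this route
        if learnt_from_neighbor and mode == "poison_reverse":
            vector[dest] = INF            # advertise it as unreachable
        else:
            vector[dest] = metric
    return vector

a_table = {"a": ("a", 0), "b": ("b", 1), "c": ("b", 2), "d": ("b", 3)}
print(advertisement(a_table, "b", "split_horizon"))
print(advertisement(a_table, "b", "poison_reverse"))
```

With either variant, b no longer hears a usable route to c from a, so the two-node loop from the count-to-infinity example cannot form. Loops through three or more nodes are still possible, which is why these are only partial solutions.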
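Path vector: no more count-to-infinity
[Figure: same topology; each route carries the full path (destination: path, metric)]
Before the failure:
• Node a: a: a, 0; b: b, 1; c: b,c, 2; d: b,c,d, 3
• Node b: a: a, 1; b: b, 0; c: c, 1; d: c,d, 2
• Node c: a: b,a, 2; b: b, 1; c: c, 0; d: d, 1
• Node d: a: c,b,a, 3; b: c,b, 2; c: c, 1; d: d, 0
After the b-c link fails:
• Node b: a: a, 1; b: b, 0; c: inf; d: inf
• Node c: a: a, 10; b: inf; c: c, 0; d: d, 1
• a advertises (a: a, 0; b: b, 1; c: b,c, 2; d: b,c,d, 3) to b; b's table stays a: a, 1; b: b, 0; c: inf; d: inf
  – Discard old info based on path vector: b itself appears in the advertised paths to c and d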
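The discard rule can be sketched as a small change to the DV update: skip any advertised route whose path already contains the receiving node. The names `pv_update` and the (path, metric) tuple layout are assumptions for the example; paths are written as in the figure, listing the hops after the node itself.

```python
INF = float("inf")

def pv_update(node, table, neighbor, link_cost, advertised):
    """Path-vector update at `node` for a vector received from `neighbor`.

    table and advertised map destination -> (path, metric), where path is a
    list of hops (hypothetical layout mirroring the figure).
    """
    updated = dict(table)
    for dest, (path, metric) in advertised.items():
        if node in path:
            continue  # the route runs through us: stale or looping, discard
        candidate = metric + link_cost
        if candidate < updated.get(dest, (None, INF))[1]:
            # The neighbor's route to itself is just [neighbor].
            new_path = path if dest == neighbor else [neighbor] + path
            updated[dest] = (new_path, candidate)
    return updated

# b after the b-c failure, hearing a's old vector.
b_table = {"a": (["a"], 1), "b": (["b"], 0), "c": (None, INF), "d": (None, INF)}
from_a = {"a": (["a"], 0), "b": (["b"], 1),
          "c": (["b", "c"], 2), "d": (["b", "c", "d"], 3)}
print(pv_update("b", b_table, "a", 1, from_a))
```

Because a's paths to c and d both contain b, b ignores them and keeps its routes at infinity instead of adopting the stale metrics.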
DV Summary
• Periodic exchange among neighbors
• Each update has O(N) size, N is the # of nodes (routable prefixes)
• Convergence delays
  – Explicit path vector info speeds up convergence
Link State Routing
• In DV, topology is implicit in the routing tables
  – Convergence is delayed when using old topology for updates
• LSR: make topology explicit!
Link State Routing
• Each node keeps link state information (complete topology)
• Each node computes paths based on topology using a centralized algorithm
[Figure: same topology; node a runs Dijkstra on the full topology and obtains the table a: a, 0; b: b, 1; c: b, 2; d: b, 3]
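As an illustration of the local computation, here is a short Dijkstra sketch over the four-node topology, using Python's standard `heapq`. The dictionary layout and the function name `dijkstra` are assumptions for the example; it returns the first hop and metric for each destination, matching the table format used in these slides.

```python
import heapq

# Complete topology as each node would hold it (node -> {neighbor: cost}).
TOPOLOGY = {
    "a": {"b": 1, "c": 10},
    "b": {"a": 1, "c": 1},
    "c": {"a": 10, "b": 1, "d": 1},
    "d": {"c": 1},
}

def dijkstra(topology, source):
    """Return destination -> (first_hop, metric) for every reachable node."""
    table = {}
    heap = [(0, source, source)]  # (metric, node, first hop from source)
    while heap:
        metric, node, first_hop = heapq.heappop(heap)
        if node in table:
            continue  # already settled with a metric at least as good
        table[node] = (first_hop, metric)
        for nbr, cost in topology[node].items():
            if nbr not in table:
                # Leaving the source fixes the first hop; afterwards it is inherited.
                hop = nbr if node == source else first_hop
                heapq.heappush(heap, (metric + cost, nbr, hop))
    return table

print(dijkstra(TOPOLOGY, "a"))
```

Run from node a, this reproduces the table in the figure: everything except a itself is reached through b.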
Link state updates
• Both ends of a link flood link state to the entire network
  – Immediately upon change
  – Periodically with a long period
• LS seq # distinguishes old LS from new ones
• Old LS times out eventually
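The role of the sequence number can be sketched as a check made when a link-state advertisement arrives. This is a simplified illustration: `accept_lsa` and the database layout are hypothetical, and timeouts and sequence-number wraparound are left out.

```python
def accept_lsa(lsdb, origin, seq, links):
    """Install a link-state advertisement if it is newer than the stored one.

    lsdb maps origin -> (seq, links). Returns True when the advertisement is
    new and should therefore be flooded on to the other neighbors.
    """
    stored = lsdb.get(origin)
    if stored is not None and stored[0] >= seq:
        return False  # old or duplicate: drop it, do not re-flood
    lsdb[origin] = (seq, links)
    return True

lsdb = {}
print(accept_lsa(lsdb, "b", 1, {"a": 1, "c": 1}))  # first copy: install and flood
print(accept_lsa(lsdb, "b", 2, {"a": 1}))          # newer: link b-c is gone
print(accept_lsa(lsdb, "b", 1, {"a": 1, "c": 1}))  # delayed old copy: ignored
```

Dropping duplicates is also what stops the flood: each node forwards a given advertisement at most once.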
Link State vs. DV
• Routing state
  – LS: O(E) to keep complete topology
  – DV: O(N) to keep path metrics to all nodes
• Routing message overhead
  – LS: O(E*E) floods each LS to entire network
  – DV: O(E*N) to exchange routing tables on all links
• LS converges faster than DV
• Does LS guarantee loop-free forwarding?
Common link metrics
• What’s the “cost” of different links?
  – 1
  – Latency
  – Bandwidth
  – Queue length
  – …
Other routing algorithms?
• LS/DV find optimal paths
• Both incur substantial message overhead
• Trade off path optimality for lower overhead?
  – Compact routing: O(√N) state, 3 times longer paths in the worst case
  – Geographic routing: constant state
Routing on the Internet
-- from algorithms to protocols
Intra-domain and Inter-domain routing
Intra-domain routing
• Goal:
  – Find best paths between all intra-AS networks
  – Traffic engineering to load balance different paths
• Popular IGP (interior gateway) protocols:
  – OSPF (LS)
  – IS-IS (LS)
  – RIP (DV)
Inter-domain routing
• Goal:
  – Provide reachability for different ASes
  – Comply with policies of different ASes
• BGP: path vector based on ASes
BGP
• Routing policies
• Protocol operations
• Disseminating BGP routes within an AS
• BGP challenges
  – Policy interactions
  – Multi-homing
  – Security
Inter-AS topology is not simply a graph
[Figure: AT&T, another ISP, a small ISP, NYU, and a customer, connected by provider-customer and peering links]
• Small ISP pays $$ to AT&T
• NYU pays $ to small ISP
• Free for traffic between customers of two peering ISPs only
BGP export policy: what to reveal to neighbors?
• If you tell N about A --> you agree to forward traffic from N to A
  – If you do not want to forward traffic to A, don’t tell others about it
• Always advertise customer routes
  – Carrying traffic for customer brings $$$
• Advertise non-customer routes to customers only
  – If you advertise non-customer routes to another provider/peer, you are carrying traffic for nothing!
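The export rules above reduce to a two-line decision. The sketch below is only an illustration of that decision; `should_export` and the relationship labels are hypothetical names, and real routers express this with per-session policy configuration rather than a single function.

```python
def should_export(learned_from, neighbor):
    """Decide whether to advertise a route to a neighbor.

    Both arguments are "customer", "peer", or "provider", describing the
    relationship from our own point of view.
    """
    if learned_from == "customer":
        return True               # customer routes go to everyone
    return neighbor == "customer" # everything else goes to customers only

for src in ("customer", "peer", "provider"):
    for dst in ("customer", "peer", "provider"):
        print(f"{src:8} -> {dst:8} {should_export(src, dst)}")
```

The combinations that come out False are exactly the ones where the AS would carry traffic between two neighbors that do not pay it.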
BGP import policies: which route to use?
• Not simply shortest path!
• Different preferences for routes from different ASes
• Customer > peer > provider
  – Customers pay for their traffic
  – Avoid paying providers by using peer routes
Example BGP routes
>show ip bgp 216.165.108.8
BGP routing table entry for 216.165.0.0/17, version 221058
Paths: (41 available, best #39, table Default-IP-Routing-Table)
  Not advertised to any peer
  4513 701 7018 12 12, (aggregated by 12 192.76.177.66)
    209.10.12.125 from 209.10.12.125 (209.10.12.125)
      Origin IGP, metric 4103, localpref 100, valid, external, atomic-aggregate
….
Annotations:
• Longest matching prefix: 216.165.0.0/17
• AS path: 4513 701 7018 12 12
• Next hop: 209.10.12.125
• localpref: high values are better
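The "longest matching prefix" lookup can be sketched with Python's standard `ipaddress` module. Only 216.165.0.0/17 comes from the output above; the other two table entries are made up so the example has something to choose between, and `longest_match` is an illustrative name. Routers use tries or dedicated hardware rather than a linear scan.

```python
import ipaddress

def longest_match(prefixes, address):
    """Return the most specific prefix containing the address, or None."""
    addr = ipaddress.ip_address(address)
    networks = [ipaddress.ip_network(p) for p in prefixes]
    matches = [n for n in networks if addr in n]
    if not matches:
        return None
    return str(max(matches, key=lambda n: n.prefixlen))

# Hypothetical table: only the /17 appears in the router output above.
table = ["0.0.0.0/0", "216.0.0.0/8", "216.165.0.0/17"]
print(longest_match(table, "216.165.108.8"))
```

All three entries contain 216.165.108.8, and the /17 wins because it is the most specific.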
Route selection based on attributes
• Local Pref
  – Used to prefer customer > peer > provider
  – High values are better
• ASPATH
  – Prefer paths with lowest # of ASes
• MED
  – Tell others to choose one exit point over another
  – Low values are better
• IGP path cost
  – Lower values are better
  – Leads to “hot potato” routing
• Router ID
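The attribute order above amounts to a lexicographic comparison, sketched below. The sketch covers only the attributes listed here and simplifies in two labeled ways: it compares MED across all routes (BGP normally compares MED only between routes from the same neighboring AS), and it breaks final ties by lowest router ID, which the list above does not spell out. `Route` and `best_route` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Route:
    local_pref: int   # higher is better
    as_path: tuple    # shorter is better
    med: int          # lower is better
    igp_cost: int     # lower is better ("hot potato")
    router_id: int    # assumed tie-breaker: lower wins

def best_route(routes):
    """Pick the preferred route by comparing attributes in order."""
    return min(
        routes,
        key=lambda r: (-r.local_pref, len(r.as_path), r.med, r.igp_cost, r.router_id),
    )

customer = Route(200, (4513, 701, 7018, 12), 0, 50, 2)
peer = Route(100, (701, 12), 0, 5, 1)
print(best_route([customer, peer]))
```

The customer route wins despite its longer AS path, because Local Pref is compared first.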
Hot potato routing
• All ASes want to get rid of external traffic asap
• Hot potato routing causes asymmetric traffic
[Figure: two exit points advertised with MED=100 and MED=500; the blue AS’ preferred route is marked]
BGP operations
• A router establishes a BGP session with its neighbors over TCP
• Neighbors might be many hops away
• Two neighbors exchange
  – UPDATE (announcements, withdrawal)
  – KEEPALIVE
Disseminating routes within an AS
Routers establish eBGP sessions between different ASes
Routers inside an AS establish iBGP sessions to learn external routes
Challenges of route dissemination
• Loop free
  – Routers should not disagree on how to route
• Complete
  – Each router chooses route as if it knows all external routes from all eBGP sessions
• Scalable
A strawman that works: full mesh dissemination
• Each router establishes an iBGP session with all eBGP speaking routers.
• Complete: all routers know all routes.
• Loop free: all routers know the same set of routes.
• Not scalable: requires e(e-1)/2 + ei iBGP sessions among e eBGP routers and i non-eBGP routers
A simple route reflector setup
• Requires e+i BGP sessions
• Clients and the reflector exchange less traffic
• All loads are on one router
• Not all clients get best routes if there are multiple egress routers
[Figure: a route reflector (RR) and its reflector clients]
• RR learns routes from eBGP sessions
• RR tells clients best route for each prefix over iBGP
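To get a feel for the difference in scale, the two session counts quoted in these slides, e(e-1)/2 + ei for the full mesh and e+i for a single route reflector, can be evaluated directly. The function names and the sample sizes are illustrative.

```python
def full_mesh_sessions(e, i):
    """iBGP sessions when every router peers with every eBGP-speaking router."""
    return e * (e - 1) // 2 + e * i

def route_reflector_sessions(e, i):
    """Session count quoted for the single route reflector setup."""
    return e + i

# Hypothetical AS with 10 eBGP routers and 90 internal-only routers.
print(full_mesh_sessions(10, 90))
print(route_reflector_sessions(10, 90))
```

For this hypothetical AS the full mesh needs 945 sessions against 100 for the reflector, which is the scalability argument in numbers.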
A problematic RR topology setup
[Figure: two route reflectors, RR1 and RR2, each with a reflector client; link cost labels in the figure: 3, 13, 2, 1]
• RR1 learns best route to prefix D from eBGP
• RR2 learns equally good route to prefix D from eBGP
• RR1 tells clients its best route to D, next hop RR1
• RR2 tells clients its best route to D, next hop RR2
BGP
• Routing policies
• Protocol operations
• Disseminating BGP routes within an AS
• BGP challenges
  – Policy interactions
  – Multi-homing
  – Security
When policy goes against shortest path…
• Each AS prefers two-hop route via its clockwise neighbor
[Figure: AS1, AS2, and AS3 arranged in a cycle around AS0]
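Why this preference is dangerous can be checked by brute force. The sketch below models n ASes in a ring around the destination and assumes that the two-hop route through the clockwise neighbor is usable only while that neighbor itself uses its direct route, since a neighbor advertises only the route it currently uses. `stable_states` is an illustrative name and the model is a simplification of real BGP behavior.

```python
from itertools import product

def stable_states(n):
    """Enumerate stable route choices for n ASes in a ring around a destination.

    Each AS picks "via" (two-hop route through its clockwise neighbor) when
    that neighbor currently uses its direct route, and "direct" otherwise.
    """
    stable = []
    for choice in product(("direct", "via"), repeat=n):
        ok = True
        for i in range(n):
            neighbor = choice[(i + 1) % n]
            best = "via" if neighbor == "direct" else "direct"
            if choice[i] != best:
                ok = False   # AS i would rather switch: not stable
                break
        if ok:
            stable.append(choice)
    return stable

print(stable_states(3))
print(stable_states(4))
```

With three ASes no assignment is stable, so under this model the routes keep changing forever. With four ASes the alternating assignments are stable, which shows the problem comes from how the policies interact, not from any single policy.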
Shortest path routing always converges
• Why?
• Shortest paths form a DAG (directed acyclic graph) from all nodes to a destination.
• When policies override shortest path, there’s danger…
Ensuring convergence
• Global policy check
  – Each AS submits its policy & neighbors to a global registry
  – Centrally check for bad policy interactions
  – Checking is NP-complete
  – Topology might change
• Gao/Rexford (today’s paper)
  – AS graphs are hierarchical
  – Restrict the set of allowed policies
Gao’s observation
• AS graphs are not just any graph
• Provider-subscriber relationships form a DAG
[Figure legend: provider-subscriber link; peering link]
Gao’s rule for convergence
• Do not go against DAG edges
  – Customer routes > provider, peer routes
• If peering links do not cause cycles…
  – Customer, peer routes > provider routes
[Figure: one peering link that will not cause a cycle and one peering link that might cause a cycle]
Gao’s rule for convergence
• Gao’s rule matches ISPs’ incentives
  – ISP incentives: customer > peer > provider
  – Gao’s: customer > peer, provider
BGP
• Routing policies
• Protocol operations
• Disseminating BGP routes within an AS
• BGP challenges
  – Policy interactions
  – Multi-homing
  – Security
BGP and multi-homing
• “stub” AS uses 2 links to the same ISP
• “stub” AS uses 2 links to different ISPs
• Transit AS uses 2 providers & peers
Stub AS with a single ISP
• Resilient to a single link failure
  – Announce d/19 on both links
• Balance load between two links
  – Split prefix, announce sub-prefix on different links
• No need for a public AS number for stub
[Figure: a stub AS with prefix d/19 and two links to its ISP. One variant announces one route to d/19; the other announces d1/20 on one link and d2/20 on the other. ISP labels in the figure: AS 5, AS 6]
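Splitting a /19 into the two /20 halves announced on the separate links can be illustrated with Python's standard `ipaddress` module. The slides do not say what the prefix d is, so 10.0.0.0/19 below is a placeholder chosen purely for the example.

```python
import ipaddress

# Placeholder for "d/19"; the real prefix is not given in the slides.
d = ipaddress.ip_network("10.0.0.0/19")

# prefixlen_diff=1 yields the two halves, one bit longer than the original.
d1, d2 = d.subnets(prefixlen_diff=1)
print(d1, d2)

# Every address in d falls in exactly one half, so each link carries
# the traffic for one half of the address space.
addr = ipaddress.ip_address("10.0.20.7")
print(addr in d1, addr in d2)
```

In practice the stub would typically also announce the covering /19 on both links, so that traffic for either half can still arrive if one link fails; that combination is an assumption about common practice, not something the figure states.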
Stub AS with multiple ISPs
• Resilient to one ISP failure
  – Announce prefix over both links in primary/backup setup
• Balance load between two ISPs
  – Split prefix and announce sub-prefix on each link
• Need a public AS number
[Figure: stub AS 12 with prefix d/19 and links to two ISPs, AS 5 and AS 6]
• Primary link: stub announces d/19 with ASPATH 12; AS 5 announces d/19 with 5 12
• Backup link: stub announces d/19 with ASPATH 12 12 12; AS 6 announces d/19 with 6 12 12 12
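The primary/backup effect of the repeated AS number can be seen by comparing path lengths. The sketch below uses the AS numbers from the figure and assumes a remote AS that sees both announcements with equal local preference, so AS path length decides; `announce` is an illustrative helper.

```python
def announce(as_number, as_path):
    """Prepend our own AS number when passing a route on to a neighbor."""
    return (as_number,) + tuple(as_path)

# Stub AS 12 announces a plain path on the primary link and a padded
# path (its own number three times) on the backup link.
via_5 = announce(5, (12,))
via_6 = announce(6, (12, 12, 12))

# A remote AS comparing the two on AS path length picks the shorter one.
best = min([via_5, via_6], key=len)
print(via_5, via_6, best)
```

Inbound traffic therefore prefers the link through AS 5, and the path through AS 6 is used only when the shorter one is withdrawn. An AS that assigns a higher local preference to the AS 6 route will still use it, so prepending is a hint, not a guarantee.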
Service providers multi-home
• Load balance transit traffic on many prefixes (inter-domain traffic engineering)
  – Control both outbound and inbound traffic
• Redundancy
  – Primary/backup etc.
• Challenge: scalability and predictability