IP Routing

IMA, Minneapolis

January, 2004

Timothy G. Griffin
Intel Research

tim.griffin@intel.com

http://www.cambridge.intel-research.net/~tgriffin


Common View of the Telco Network

Brick


Common View of the IP Network (Layer 3)


What This Course Is About


Forwarding vs. Routing

• Forwarding: Use of local information (tables, data structures) to determine treatment of data traffic
  – traffic classification
  – treatment based on classification
  – can drop traffic
  – or decide where to send it next, and how to send it

• Routing: Use of global information to populate forwarding data structures in multiple network nodes
  – goal is normally to optimize something given state of the network
  – difficult to fully automate --- often requires intricate configuration
  – is partially automated with dynamic routing protocols
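To make the split concrete, here is a minimal sketch of the forwarding side in Python. It assumes the usual longest-prefix-match rule for IP lookups; the table contents and next-hop names are hypothetical, and in a real router the table would be populated by routing (static configuration or a routing protocol).

```python
import ipaddress

# Hypothetical forwarding table: prefix -> next hop.
# Routing (not shown) is the process that fills this in.
FORWARDING_TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): "upstream",
    ipaddress.ip_network("130.132.0.0/16"): "campus-core",
    ipaddress.ip_network("130.132.5.0/24"): "lab-switch",
}


def forward(destination, table=FORWARDING_TABLE):
    """Return the next hop chosen by longest-prefix match,
    or None (drop the packet) if no prefix covers the destination."""
    address = ipaddress.ip_address(destination)
    matches = [net for net in table if address in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]
```

The lookup uses only local state, which is the point of the first bullet; everything about how the table got its entries belongs to routing.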


What You Should Take Away

• Heterogeneity
  – Many technologies
  – Many networks
  – Many “routing policies”

• IP Routing is evolving
  – New protocols and extensions
  – New router knobs
  – New ways of using existing technologies


Anthropology of Routing

• Protocol Standards: IEEE, IETF, …
• Vendors: Cisco, Juniper, …; Sun, Microsoft, …
• Operator Forums: NANOG, RIPE, …
• Academic Research: EE, CS, Maths
• Books, Training, …

Could add regulation, markets, ….


Routing can happen at any level

[Figure: two seven-layer stacks (Application, Presentation, Session, Transport, Network, DataLink, Physical), one for "data sent" and one for "data received".]


IP is a Network Layer Protocol

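[Figure: two end hosts, each with a full stack (Application, Presentation, Session, Transport, Network, DataLink, Physical), joined by a Router that has a Network layer on top of two lower stacks. One host and one side of the Router use DataLink 1 / Physical 1 over Medium 1; the other side of the Router and the other host use DataLink 2 / Physical 2 over Medium 2.]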

Separate physical networks glued together into one logical network


All Hail the IP Datagram!

1981, RFC 791

HEADER

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  | Service Type  |          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

DATA

   ... up to 65,515 octets of data ...

shaded fields little-used today
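As an illustration of the layout above, the fixed 20-octet part of the header can be decoded with Python's `struct` module. This is a sketch only: the function name and the dictionary keys are choices made for this example, and options (present when IHL > 5) are not parsed.

```python
import ipaddress
import struct


def parse_ipv4_header(data):
    """Decode the fixed 20-octet part of an IPv4 header (RFC 791 layout)."""
    if len(data) < 20:
        raise ValueError("need at least 20 octets")
    (ver_ihl, service_type, total_length, identification,
     flags_frag, ttl, protocol, checksum, src, dst) = struct.unpack(
        "!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,
        "ihl": ver_ihl & 0x0F,                 # header length in 32-bit words
        "service_type": service_type,
        "total_length": total_length,          # header + data, in octets
        "identification": identification,
        "flags": flags_frag >> 13,             # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,  # low 13 bits
        "ttl": ttl,
        "protocol": protocol,
        "header_checksum": checksum,
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }
```

The 16-bit Total Length field is what caps a datagram at 65,535 octets, leaving 65,515 for data after a minimal 20-octet header.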


IP Hour Glass

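[Figure: hourglass with many applications at the top, many technologies at the bottom, and IP at the narrow waist.]

Networking Applications: email, Web, file transfer, Multimedia, Remote Access, Voice, VPN, e-stuff

TCP

IP: Minimalist network layer

Networking Technologies: Frame, ATM, DWDM, SONET, Ethernet, FDDI, X.25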


Routers Talking to Routers

Routing info

Routing info

• Routing computation is distributed among routers within a routing domain

• Computation of best next hop based on routing information is the most CPU/memory intensive task on a router

• Routing messages are usually not routed, but exchanged via layer 2 between physically adjacent routers (internal BGP and multi-hop external BGP are exceptions)


Architecture of Dynamic Routing

[Figure: AS 1 and AS 2 connected by BGP; OSPF runs inside one AS and EIGRP inside the other.]

IGP = Interior Gateway Protocol
Metric based: OSPF, IS-IS, RIP, EIGRP (Cisco)

EGP = Exterior Gateway Protocol
Policy based: BGP

The Routing Domain of BGP is the entire Internet


Technology of Distributed Routing

Link State

• Topology information is flooded within the routing domain
• Best end-to-end paths are computed locally at each router.
• Best end-to-end paths determine next-hops.
• Based on minimizing some notion of distance
• Works only if policy is shared and uniform
• Examples: OSPF, IS-IS

Vectoring

• Each router knows little about network topology
• Only best next-hops are chosen by each router for each destination network.
• Best end-to-end paths result from composition of all next-hop choices
• Does not require any notion of distance
• Does not require uniform policies at all routers
• Examples: RIP, BGP
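The link-state idea (every router holds the full topology map and computes best paths locally) can be sketched with Dijkstra's algorithm. This is an illustrative sketch, not how OSPF or IS-IS are implemented: the function name and the four-router topology with its costs are made up for the example.

```python
import heapq


def link_state_next_hops(topology, source):
    """Run Dijkstra over the flooded topology map and return
    {destination: (distance, next_hop)} as seen from source."""
    infinity = float("inf")
    best = {source: (0, None)}
    heap = [(0, source, None)]
    while heap:
        dist, node, first_hop = heapq.heappop(heap)
        if best.get(node, (infinity,))[0] < dist:
            continue  # stale heap entry
        for neighbor, cost in topology[node].items():
            new_dist = dist + cost
            # The next hop is fixed by the first link taken out of source.
            hop = neighbor if node == source else first_hop
            if new_dist < best.get(neighbor, (infinity,))[0]:
                best[neighbor] = (new_dist, hop)
                heapq.heappush(heap, (new_dist, neighbor, hop))
    del best[source]
    return best


# Hypothetical topology: router -> {neighbor: link cost}
TOPOLOGY = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
```

Every router runs the same computation on the same map, which is why this style works only when the notion of distance is shared and uniform. A vectoring protocol never sees the map; it only compares what its neighbors advertise.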


The Gang of Four

        Link State      Vectoring

EGP                     BGP

IGP     OSPF, IS-IS     RIP


Autonomous Routing Domains (ARDs)

A collection of physical networks glued together using IP, that have a unified administrative routing policy.

• Campus networks
• Corporate networks
• ISP Internal networks
• …


Autonomous System (AS) Numbers

16 bit values.

64512 through 65535 are “private”

• Genuity: 1
• MIT: 3
• JANET: 786
• UC San Diego: 7377
• AT&T: 7018, 6341, 5074, …
• UUNET: 701, 702, 284, 12199, …
• Sprint: 1239, 1240, 6211, 6242, …
• …

ASNs represent units of routing policy

Currently over 16,000 in use.


Autonomous Routing Domains Don’t Always Need BGP or an ASN

Qwest

Yale University

Nail up default routes 0.0.0.0/0 pointing to Qwest

Nail up routes 130.132.0.0/16 pointing to Yale

130.132.0.0/16

Static routing is the most common way of connecting an autonomous routing domain to the Internet. This helps explain why BGP is a mystery to many …


ASNs Can Be “Shared” (RFC 2270)

AS 701 UUNet

ASN 7046 is assigned to UUNet. It is used by customers single homed to UUNet, but needing BGP for some reason (load balancing, etc.) [RFC 2270]

AS 7046 Crestar Bank

AS 7046 NJIT

AS 7046 Hood College

128.235.0.0/16


ARD != AS

• Most ARDs have no ASN (statically routed at Internet edge)

• Some unrelated ARDs share the same ASN (RFC 2270)

• Some ARDs are implemented with multiple ASNs (example: Worldcom, AT&T, …)

ASes are an implementation detail of Interdomain routing


How Many ASNs?

Thanks to Geoff Huston. http://bgp.potaroo.net on October 24, 2003

15,981


How many prefixes?

Thanks to Geoff Huston. http://bgp.potaroo.net on October 24, 2003

154,894

Note: numbers actually depend on point of view…

29%

23%

Address space covered


A Bit of OGI’s AS Neighborhood

[Figure: AS-level neighborhood of OGI.]

AS 11964 OGI 128.223.0.0/16
AS 2914 Verio
AS 7018 AT&T
AS 1239 Sprint
AS 3356 Level 3
AS 6366 Portland State U
AS 11995 Oregon Health Sciences U
AS 14262 Portland Regional Education Network
AS 7774 U of Alaska
AS 3807 U of Montana
AS 101 U of Washington

Sources: ARIN, Route Views, RIPE


[Figure: WiscNet (wiscnet.net) network map.
Sites: UW-Superior, UW-Stout, UW-River Falls, Fox Valley TC, UW-Oshkosh, UW-Milwaukee, UW-Parkside, UW-Whitewater, UW-Madison, UW-Platteville, UW-La Crosse, UW-Eau Claire, UW-Stevens Point, UW-Green Bay, Marshfield, Rhinelander, Rice Lake, Clintonville, Stiles Jct., Portage, Dodgeville, La Crosse, Wausau, Chicago (Chicago - 1, Chicago - 2).
Link types: Gigabit Ethernet, OC-12 (622Mbps), OC-3 (155Mbps), DS-3 (45Mbps), T1 (1.5Mbps).
External connectivity: Genuity; Qwest and Other Provider(s); Internet 2 & Qwest. Categories listed: Peering - Public and Private, Commodity Internet Transit, Internet2, Merit and Other State Networks, National Education Network, Regional Research Peers.
Several links are annotated with dates: Summer '02, Winter '02, Summer '03.]

GO BUCKY!


Partial View of cs.wisc.edu Neighborhood

AS 2381 WiscNet
AS 209 Qwest
AS 3549 Global Crossing
AS 1 Genuity
AS 59 UW Academic Computing
AS 3136 UW Madison
AS 7050 UW Milwaukee

Prefixes shown: 128.105.0.0/16, 129.89.0.0/16, 130.47.0.0/16


Policy : Transit vs. Nontransit

[Figure: IP traffic among UUnet, Bell Labs, and AT&T CBB; AS labels shown: AS 701, AS 144, AS 701.]

A nontransit AS allows only traffic originating from the AS or traffic with destination within the AS

A transit AS allows traffic with neither source nor destination within the AS to flow across the network


Customers and Providers

Customer pays provider for access to the Internet

[Figure legend: provider, customer, IP traffic]


The “Peering” Relationship

[Figure legend: peer-peer link; customer-provider link; traffic allowed; traffic NOT allowed]

Peers provide transit between their respective customers

Peers do not provide transit between peers

Peers (often) do not exchange $$$
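The transit rules on this slide and the previous one can be written as a single export decision. The sketch below is an assumption-laden illustration: it encodes the common convention that customer routes (and an AS's own prefixes) are announced to everyone, while routes learned from peers or providers are announced only to customers. The constant and function names are hypothetical.

```python
# Relationship of a neighbor as seen from the local AS.
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"


def should_export(learned_from, export_to):
    """Decide whether a route learned from one kind of neighbor may be
    announced to another kind of neighbor.

    learned_from is None for prefixes the local AS originates itself.
    """
    if learned_from is None or learned_from == CUSTOMER:
        return True                  # own and customer routes go to everyone
    return export_to == CUSTOMER     # peer/provider routes go only to customers
```

Under this rule a peer hears only customer routes, so the traffic it sends in can only be headed for customers. That is the sense in which peers provide transit between their respective customers and nothing more.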


Peering Provides Shortcuts

Peering also allows connectivity between the customers of “Tier 1” providers.

[Figure legend: peer-peer link; customer-provider link]


Peering Wars

Peer

• Reduces upstream transit costs
• Can increase end-to-end performance
• May be the only way to connect your customers to some part of the Internet (“Tier 1”)

Don’t Peer

• You would rather have customers
• Peers are usually your competition
• Peering relationships may require periodic renegotiation

Peering struggles are by far the most contentious issues in the ISP world!

Peering agreements are often confidential.


Policy-Based vs. Distance-Based Routing?

[Figure: ISP1, ISP2, ISP3 and customers Cust1, Cust2, Cust3; candidate paths between Host 1 and Host 2 are marked YES and NO.]

Minimizing “hop count” can violate commercial relationships that constrain inter-domain routing.


Why not minimize “AS hop count”?

[Figure: National ISP1 and National ISP2, Regional ISP1, Regional ISP2, Regional ISP3, and customers Cust1, Cust2, Cust3; candidate paths are marked YES and NO.]


BGP-4

• BGP = Border Gateway Protocol
• Is a Policy-Based routing protocol
• Is the de facto EGP of today’s global Internet
• Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes.

• 1989 : BGP-1 [RFC 1105]
  – Replacement for EGP (1984, RFC 904)
• 1990 : BGP-2 [RFC 1163]
• 1991 : BGP-3 [RFC 1267]
• 1995 : BGP-4 [RFC 1771]
  – Support for Classless Interdomain Routing (CIDR)


Four Types of BGP Messages

• Open : Establish a peering session.

• Keep Alive : Handshake at regular intervals.

• Notification : Shuts down a peering session.

• Update : Announcing new routes or withdrawing previously announced routes.

announcement = prefix + attribute values


BGP Attributes

Value  Code                      Reference
-----  ------------------------  ---------
    1  ORIGIN                    [RFC1771]
    2  AS_PATH                   [RFC1771]
    3  NEXT_HOP                  [RFC1771]
    4  MULTI_EXIT_DISC           [RFC1771]
    5  LOCAL_PREF                [RFC1771]
    6  ATOMIC_AGGREGATE          [RFC1771]
    7  AGGREGATOR                [RFC1771]
    8  COMMUNITY                 [RFC1997]
    9  ORIGINATOR_ID             [RFC2796]
   10  CLUSTER_LIST              [RFC2796]
   11  DPA                       [Chen]
   12  ADVERTISER                [RFC1863]
   13  RCID_PATH / CLUSTER_ID    [RFC1863]
   14  MP_REACH_NLRI             [RFC2283]
   15  MP_UNREACH_NLRI           [RFC2283]
   16  EXTENDED COMMUNITIES      [Rosen]
  ...
  255  reserved for development

From IANA: http://www.iana.org/assignments/bgp-parameters

Most important attributes

Not all attributes need to be present in every announcement


Attributes are Used to Select Best Routes

[Figure: four announcements for 192.0.2.0/24, each saying "pick me!", arrive at one BGP speaker.]

Given multiple routes to the same prefix, a BGP speaker must pick at most one best route

(Note: it could reject them all!)


BGP Route Processing

[Figure: BGP route processing pipeline]

Receive BGP Updates
→ Apply Import Policies (Apply Policy = filter routes & tweak attributes)
→ Best Route Selection (Based on Attribute Values)
→ Best Route Table
→ Apply Export Policies (Apply Policy = filter routes & tweak attributes)
→ Transmit BGP Updates

Best Routes: Install forwarding entries for best routes in the IP Forwarding Table.

Open ended programming. Constrained only by vendor configuration language


Route Selection Summary

Enforce relationships:
  Highest Local Preference

Traffic engineering:
  Shortest ASPATH
  Lowest MED
  i-BGP < e-BGP
  Lowest IGP cost to BGP egress

Throw up hands and break ties:
  Lowest router ID
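The selection order above can be sketched as a lexicographic sort key. This is a simplified illustration, not a vendor's decision process: the `Route` class and its field names are hypothetical, "i-BGP < e-BGP" is read as "prefer routes learned over e-BGP", and MED is compared across all routes (real implementations normally compare MED only among routes from the same neighboring AS).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    local_pref: int
    as_path: tuple
    med: int
    learned_via_ebgp: bool
    igp_cost: int
    router_id: str  # dotted-quad ID of the advertising router


def _router_id_key(router_id):
    # Compare router IDs numerically, not as strings.
    return tuple(int(part) for part in router_id.split("."))


def preference_key(route):
    """Sort key following the slide's order; smaller tuples win."""
    return (
        -route.local_pref,                   # highest local preference
        len(route.as_path),                  # shortest ASPATH
        route.med,                           # lowest MED
        0 if route.learned_via_ebgp else 1,  # e-BGP preferred over i-BGP
        route.igp_cost,                      # lowest IGP cost to egress
        _router_id_key(route.router_id),     # lowest router ID
    )


def select_best(routes):
    """Return the most preferred route, or None if all were rejected."""
    return min(routes, key=preference_key) if routes else None
```

Because local preference is compared first, a policy that sets it decides the outcome before any path-length consideration is reached.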


ASPATH Attribute

[Figure: 135.207.0.0/16 is originated by AS 6341 (AT&T Research) and the announcement spreads from AS to AS, with the AS path growing at each step.]

ASes shown:
AS 6341 AT&T Research (Prefix Originated: 135.207.0.0/16)
AS 7018 AT&T
AS 1239 Sprint
AS 1755 Ebone
AS 3549 Global Crossing
AS 1129 Global Access
AS 12654 RIPE NCC RIS project

Announcements shown:
135.207.0.0/16 AS Path = 6341
135.207.0.0/16 AS Path = 7018 6341 (shown twice)
135.207.0.0/16 AS Path = 3549 7018 6341
135.207.0.0/16 AS Path = 1239 7018 6341
135.207.0.0/16 AS Path = 1755 1239 7018 6341
135.207.0.0/16 AS Path = 1129 1755 1239 7018 6341
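The way the path grows can be reproduced in a few lines. The sketch assumes the standard behavior that an AS prepends its own number when it re-announces a route to a neighboring AS, and that a receiver discards an announcement whose path already contains its own number; the function name and the linear chain are simplifications for illustration.

```python
def propagate(origin_as, chain):
    """Yield (receiving_as, as_path) as an announcement travels from
    origin_as along chain, one AS after another.

    Each AS prepends its own number before passing the route on. If a
    receiver already appears in the path, the announcement is dropped
    there (loop prevention) and propagation stops.
    """
    as_path = (origin_as,)
    for receiver in chain:
        if receiver in as_path:
            break
        yield receiver, as_path
        as_path = (receiver,) + as_path


# The longest chain on the slide: AT&T Research towards the RIPE RIS project.
for receiver, path in propagate(6341, [7018, 1239, 1755, 1129, 12654]):
    print(receiver, "hears AS Path =", " ".join(str(asn) for asn in path))
```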


Shorter Doesn’t Always Mean Shorter

[Figure: AS 1, AS 2, AS 3, AS 4]

Mr. BGP says that path 4 1 is better than path 3 2 1

Duh!

In fairness: could you do this “right” and still scale?

Exporting internal state would dramatically increase global instability and amount of routing state


BGP Routing Tables

• Use “whois” queries to associate an ASN with “owner” (for example, http://www.arin.net/whois/arinwhois.html)

• 7018 = AT&T Worldnet, 701 = UUnet, 3561 = Cable & Wireless, …

show ip bgp
BGP table version is 111849680, local router ID is 203.62.248.4
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
. . .
*>i192.35.25.0      134.159.0.1                    50      0 16779 1 701 703 i
*>i192.35.29.0      166.49.251.25                  50      0 5727 7018 14541 i
*>i192.35.35.0      134.159.0.1                    50      0 16779 1 701 1744 i
*>i192.35.37.0      134.159.0.1                    50      0 16779 1 3561 i
*>i192.35.39.0      134.159.0.3                    50      0 16779 1 701 80 i
*>i192.35.44.0      166.49.251.25                  50      0 5727 7018 1785 i
*>i192.35.48.0      203.62.248.34                  55      0 16779 209 7843 225 225 225 225 225 i
*>i192.35.49.0      203.62.248.34                  55      0 16779 209 7843 225 225 225 225 225 i
*>i192.35.50.0      203.62.248.34                  55      0 16779 3549 714 714 714 i
*>i192.35.51.0/25   203.62.248.34                  55      0 16779 3549 14744 14744 14744 14744 14744 14744 14744 14744 i
. . .

Thanks to Geoff Huston. http://www.telstra.net/ops on July 6, 2001
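The Path column of a table like this is the raw material for AS graphs. A minimal sketch, assuming the paths have already been cut out of the table as strings of AS numbers (the function name is hypothetical): consecutive repeats of the same ASN are collapsed first, since prepending says nothing about adjacency.

```python
def as_edges(as_paths):
    """Return the set of undirected AS adjacencies implied by AS paths.

    Each path is a whitespace-separated string of AS numbers, as in the
    Path column without the trailing origin code.
    """
    edges = set()
    for path in as_paths:
        hops = [int(token) for token in path.split()]
        collapsed = hops[:1]
        for asn in hops[1:]:
            if asn != collapsed[-1]:   # drop prepended repeats
                collapsed.append(asn)
        for left, right in zip(collapsed, collapsed[1:]):
            edges.add(frozenset((left, right)))
    return edges


# Three of the paths from the table above.
PATHS = [
    "16779 1 701 703",
    "16779 1 3561",
    "16779 209 7843 225 225 225 225 225",
]
```

The result contains only adjacencies that happen to appear on best paths seen from this one router, which is why the graph depends on the point of view.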


AS Graphs Depend on Point of View

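[Figure: the same set of ASes (numbered 1 to 6, joined by peer-peer and customer-provider links) drawn in several panels, each showing the AS graph as seen from a different point of view.]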


AS Graphs Can Be Fun

The subgraph showing all ASes that have more than 100 neighbors in the full graph of 11,158 nodes. July 6, 2001. Point of view: AT&T route-server


AS Graphs Do Not Show “Topology”!

The AS graph may look like this. Reality may be closer to this…

BGP was designed to throw away information!


So Many Choices

Which route should Frank pick to 13.13.0.0/16?

[Figure: Frank’s Internet Barn, AS 1, AS 2, AS 3, AS 4, and the prefix 13.13.0.0/16, joined by peer-peer and customer-provider links.]


LOCAL PREFERENCE

[Figure: routes to 13.13.0.0/16 through AS 1, AS 2, AS 3, and AS 4, tagged with local pref = 80, local pref = 100, and local pref = 90.]

Higher Local preference values are more preferred

Local preference used ONLY in iBGP


Implementing Backup Links with Local Preference (Outbound Traffic)

Forces outbound traffic to take primary link, unless link is down.

[Figure: AS 65000 connected to AS 1 by a primary link and a backup link.]

Primary link: Set Local Pref = 100 for all routes from AS 1

Backup link: Set Local Pref = 50 for all routes from AS 1

We’ll talk about inbound traffic soon …


Multihomed Backups (Outbound Traffic)

Forces outbound traffic to take primary link, unless link is down.

[Figure: AS 2 connected to provider AS 1 by the primary link and to provider AS 3 by the backup link.]

Set Local Pref = 100 for all routes from AS 1

Set Local Pref = 50 for all routes from AS 3


Shedding Inbound Traffic with ASPATH Prepending

Prepending will (usually) force inbound traffic from AS 1 to take primary link

[Figure: customer AS 2 (192.0.2.0/24) connected to provider AS 1 by a primary link and a backup link.]

Primary link: 192.0.2.0/24 ASPATH = 2

Backup link: 192.0.2.0/24 ASPATH = 2 2 2

Yes, this is a Glorious Hack …


… But Padding Does Not Always Work

[Figure: customer AS 2 (192.0.2.0/24) connected to provider AS 1 by the primary link and to provider AS 3 by the backup link.]

Primary link: 192.0.2.0/24 ASPATH = 2

Backup link: 192.0.2.0/24 ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2

AS 3 will send traffic on “backup” link because it prefers customer routes and local preference is considered before ASPATH length!

Padding in this way is often used as a form of load balancing
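A small sketch of AS 3's decision shows why the padding is ignored. The local preference values (100 for customer routes, 90 otherwise) and the alternative route via AS 1 with path "1 2" are assumptions made for the example; only the padded path of fourteen 2s comes from the slide.

```python
def as3_choice(routes):
    """Pick a route the way the slide describes: higher local preference
    first, shorter AS path only as a tie-breaker."""
    return min(routes, key=lambda r: (-r["local_pref"], len(r["as_path"])))


ROUTES_AT_AS3 = [
    # Learned directly from customer AS 2 over the "backup" link, padded.
    {"via": "backup link from AS 2", "local_pref": 100, "as_path": (2,) * 14},
    # Hypothetical alternative learned from AS 1, which uses the primary link.
    {"via": "AS 1", "local_pref": 90, "as_path": (1, 2)},
]

print(as3_choice(ROUTES_AT_AS3)["via"])
```

Path length would matter only if the two routes had the same local preference, and the customer cannot arrange that with prepending alone.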


COMMUNITY Attribute to the Rescue!

[Figure: customer AS 2 (192.0.2.0/24) connected to provider AS 1 by the primary link and to provider AS 3 by the backup link.]

Primary link: 192.0.2.0/24 ASPATH = 2

Backup link: 192.0.2.0/24 ASPATH = 2 COMMUNITY = 3:70

Customer import policy at AS 3:
  If 3:90 in COMMUNITY then set local preference to 90
  If 3:80 in COMMUNITY then set local preference to 80
  If 3:70 in COMMUNITY then set local preference to 70

AS 3: normal customer local pref is 100, peer local pref is 90
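The import policy at AS 3 amounts to a lookup on the communities attached to a customer route. The sketch below is a Python rendering of the three rules, not router configuration; the names are hypothetical, and "first matching rule wins" is an assumption about how the rules are ordered.

```python
# Community value -> local preference, in the order the slide lists the rules.
COMMUNITY_TO_LOCAL_PREF = {"3:90": 90, "3:80": 80, "3:70": 70}
DEFAULT_CUSTOMER_LOCAL_PREF = 100
PEER_LOCAL_PREF = 90


def import_customer_route(communities):
    """Return the local preference AS 3 assigns to a customer route
    carrying the given set of community values."""
    for community, local_pref in COMMUNITY_TO_LOCAL_PREF.items():
        if community in communities:
            return local_pref
    return DEFAULT_CUSTOMER_LOCAL_PREF
```

With 3:70 attached, the route over the backup link gets local preference 70, below the peer value of 90, so AS 3 stops preferring the backup link while the primary path is available.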


Don’t celebrate just yet…

[Figure: customer, Provider A (Tier 1), Provider B (Tier 1), and Provider C (Tier 2), joined by one peering link and provider/customer links.]

Now, customer wants a backup link to C….


Customer installs a “backup link” …

[Figure: customer, Provider A (Tier 1), Provider B (Tier 1), and Provider C (Tier 2); the customer's links are labeled primary and backup.]

customer sends “lower my preference” Community value


Disaster Strikes!

[Figure: customer, Provider A (Tier 1), Provider B (Tier 1), and Provider C (Tier 2); the customer's links are labeled primary and backup.]

customer is happy that backup was installed …


The primary link is repaired, and something odd occurs…

[Figure: customer, Provider A (Tier 1), Provider B (Tier 1), and Provider C (Tier 2); the customer's links are labeled primary and backup.]

YIKES --- routing DOES NOT return to normal!!!


WAIT! It Gets Better…

[Figure: ASes A, B, C, and D joined by links marked P or B.]

P = primary, B = backup


OOOOOPS!

[Figure: ASes A, B, C, and D joined by links marked P or B.]

Suppose A, B, C all break ties in the same direction (clockwise or counter-clockwise)

No solution = Protocol Divergence


What the heck is going on?

• There is no guarantee that a BGP configuration has a unique routing solution.
  – When multiple solutions exist, the (unpredictable) order of updates will determine which one wins.

• There is no guarantee that a BGP configuration has any solution!
  – And checking configurations is NP-Complete [GW1999]

• Complex policies (weights, communities setting preferences, and so on) increase chances of routing anomalies.
  – … yet this is the current trend!
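Both failure modes can be seen on tiny examples by brute force. The sketch below models each node's policy as a ranked list of permitted paths to a destination 0, in the spirit of the stable-paths formulation used in [GW1999]; the two instances are illustrative constructions for this example, not the topologies on the preceding slides, and the model assumes every node always has its direct path available.

```python
from itertools import product


def _best_available(node, ranked_paths, assignment):
    """Most preferred path of node that is consistent with the path its
    next hop currently uses (a direct path to 0 is always available)."""
    for path in ranked_paths[node]:
        next_hop = path[1]
        if next_hop == 0 or assignment[next_hop] == path[1:]:
            return path
    return None


def stable_solutions(ranked_paths):
    """Enumerate assignments in which every node uses its best available
    path. ranked_paths maps node -> list of paths, most preferred first;
    a path is a tuple of nodes ending at destination 0."""
    nodes = sorted(ranked_paths)
    solutions = []
    for choice in product(*(ranked_paths[n] for n in nodes)):
        assignment = dict(zip(nodes, choice))
        if all(_best_available(n, ranked_paths, assignment) == assignment[n]
               for n in nodes):
            solutions.append(assignment)
    return solutions


# Two nodes that each prefer to go through the other: two stable outcomes.
TWO_SOLUTIONS = {1: [(1, 2, 0), (1, 0)], 2: [(2, 1, 0), (2, 0)]}

# Three nodes with cyclic preferences: no stable outcome at all.
NO_SOLUTION = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 2, 0), (3, 0)],
}
```

Exhaustive enumeration is fine for three nodes, but the number of assignments grows exponentially with the number of nodes, which is consistent with the NP-completeness result cited above.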


Are we too complacent?

If the provider/customer digraph is acyclic and every AS obeys the commandments

• Thou shall prefer customer routes over all others

• Thou shall use provider routes only as a last resort

• Thou shall not provide transit between peers or providers

then the BGP configuration is robust. [see Gao-Griffin-Rexford, INFOCOM 2001]
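The third commandment constrains the shape of every AS path. Following an announcement from its origin outward, it may climb customer-to-provider links, cross at most one peer link, and from then on only descend provider-to-customer links. A minimal checker under that reading (the step labels "up", "peer", "down" and the function name are conventions chosen for this sketch):

```python
def is_valley_free(steps):
    """Check a sequence of link types, listed in the direction the
    announcement travels:
      "up"   customer announces to provider
      "peer" peer announces to peer
      "down" provider announces to customer
    Allowed shape: zero or more "up", at most one "peer", then only "down".
    """
    descending = False
    for step in steps:
        if step not in ("up", "peer", "down"):
            raise ValueError("unknown step: %r" % (step,))
        if descending and step != "down":
            return False   # someone provided transit between peers/providers
        if step in ("peer", "down"):
            descending = True
    return True
```

For example, `["up", "up", "peer", "down"]` passes, while `["up", "down", "up"]` fails: the AS in the middle would be carrying traffic between two of its providers.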


Worrisome trends…

• Some Autonomous Routing Domains (ARDs) are implemented with multiple ASNs (example: MCI, InterNap, AT&T)
  – Such “sibling” ASes are not confined to “customer/provider, peer/peer” relationships
  – ASNs are becoming just an implementation detail.

• Some ASes participate in different roles in different parts of the world (Sprint, for example).
  – I don’t think we understand this.

• We all know about MED…
  – But MED oscillation is not a feature interaction problem (MEDs and Route Reflection), but rather a manifestation of BGP’s general principle --- the more complex the policies, the more likely that bad things happen. MED just makes it easy to write very complex policies…

• Communities are being used for clever interdomain signaling.
  – Nobody has read “Inherently Safe Backup Routing with BGP” Gao, Griffin, Rexford. INFOCOM 2001
  – “te” communities and extended communities…


Let’s look at “te” communities…

See: A survey of the utilization of the BGP community attribute. Bruno Quoitin and Olivier Bonaventure. http://www.infonet.fundp.ac.be/doc/tr/Infonet-TR-2002-02.html

AS13129, Global Access Telecommunications, Inc (Frankfurt). Accepted on inbound routes:

13129:101n - do not announce/prepend to Sprint (AS1239)
13129:102n - do not announce/prepend to Cogent (AS16631)
13129:103n - do not announce/prepend to Abovenet (AS6461)
13129:111n - do not announce/prepend to DE-CIX
13129:112n - do not announce/prepend to INXS
13129:113n - do not announce/prepend to SFINX
13129:114n - do not announce/prepend to LINX
13129:115n - do not announce/prepend to AMS-IX
13129:116n - do not announce/prepend to IX-HH
13129:117n - do not announce/prepend to NYIIX
13129:191n - do not announce/prepend to DTAG (AS3320)
13129:192n - do not announce/prepend to DFN (AS680)
13129:1990 - do not announce to the RIPE RIS project (AS12654)

n = 0: do not announce prefix
1 <= n <= 3: prepend n times to announcement
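The encoding packs a target and an action into one community value: the first three digits after the colon name the neighbor, and the last digit n is the action. A sketch decoder follows; the `TARGETS` table holds only a subset of the list above, and the function name and return format are choices made for this example.

```python
# First three digits -> (neighbor, largest n accepted). Subset of the slide.
TARGETS = {
    "101": ("Sprint (AS1239)", 3),
    "102": ("Cogent (AS16631)", 3),
    "103": ("Abovenet (AS6461)", 3),
    "114": ("LINX", 3),
    "115": ("AMS-IX", 3),
    "199": ("RIPE RIS project (AS12654)", 0),  # only "do not announce"
}


def decode(community):
    """Translate a 13129:xxxn community into (action, neighbor),
    or return None if it is not one of the known "te" values."""
    asn, _, value = community.partition(":")
    if asn != "13129" or len(value) != 4 or not value.isdigit():
        return None
    entry = TARGETS.get(value[:3])
    if entry is None:
        return None
    neighbor, max_n = entry
    n = int(value[3])
    if n > max_n:
        return None
    if n == 0:
        return ("do not announce", neighbor)
    return ("prepend %d times" % n, neighbor)
```

A customer tagging a route with 13129:1012 is, in effect, remotely configuring how AS13129 announces that route to Sprint.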


Some AS286 Communities

From KPN Eurorings Backbone:

remarks: +----------------------------------------------------------
remarks: | COMMUNITIES - ROUTE ORIGIN - NOT SETTABLE BY CUSTOMERS
remarks: |
remarks: | 286:286 European customer routes
remarks: | 286:999 US customer routes (received from Qwest)
remarks: | 286:888 European or US peer routes
remarks: |
remarks: | 286:3000 + countrycode Country where route is received
remarks: | countrycode E.164 international dial prefix
remarks: |
remarks: | EXAMPLES
remarks: |
remarks: | 286:286 286:3031 Customer in Amsterdam
remarks: | 286:286 286:3032 Customer in Brussels
remarks: | 286:888 286:3044 Peer in London
remarks: |
remarks: +----------------------------------------------------------

Comment: Aren’t we happy that RPSL has “remarks”!


Need “Semantics of Interdomain Routing”

• Distinct from mechanism of finding routings (protocols).
  – Don’t start with “new algorithms”!!!

• BGP policy languages/usage have evolved organically --- lack of design.
  – Too closely tied to mechanism of BGP
  – RPSL doesn’t even begin to address the issues…

• What do we want to be true?
• What do we mean by “autonomy”?
• How much “expressive power” is really required?


References

• [VGE1996, VGE2000] Persistent Route Oscillations in Inter-Domain Routing. Kannan Varadhan, Ramesh Govindan, and Deborah Estrin. Computer Networks, Jan. 2000. (Also USC Tech Report, Feb. 1996)

• [GW1999] An Analysis of BGP Convergence Properties. Timothy G. Griffin, Gordon Wilfong. SIGCOMM 1999

• [GSW1999] Policy Disputes in Path Vector Protocols. Timothy G. Griffin, F. Bruce Shepherd, Gordon Wilfong. ICNP 1999

• [GW2001] A Safe Path Vector Protocol. Timothy G. Griffin, Gordon Wilfong. INFOCOM 2001

• [GR2000] Stable Internet Routing without Global Coordination. Lixin Gao, Jennifer Rexford. SIGMETRICS 2000

• [GGR2001] Inherently safe backup routing with BGP. Lixin Gao, Timothy G. Griffin, Jennifer Rexford. INFOCOM 2001

• [GW2002a] On the Correctness of IBGP Configurations. Timothy G. Griffin, Gordon Wilfong. SIGCOMM 2002

• [GW2002b] An Analysis of the MED Oscillation Problem. Timothy G. Griffin, Gordon Wilfong. ICNP 2002


Pointers

• Interdomain routing links
  – http://www.cambridge.intel-research.net/~tgriffin/interdomain/

• These slides
  – http://www.cambridge.intel-research.net/~tgriffin/talks_tutorials