
BGP: Border Gateway Protocol (an introduction)

dr. C. P. J. Koymans

Informatics Institute, University of Amsterdam

March 11, 2008

General ideas behind BGP
  Background
  Providers, Customers and Peers
  External and Internal BGP
  BGP information bases

The BGP protocol
  BGP attributes
  BGP packets

Traffic Engineering
  Outbound Traffic Engineering
  Inbound Traffic Engineering

IBGP scaling

BGP version 4

- Border Gateway Protocol version 4 (BGP4)
  - is specified in RFC 4271
  - is an inter-AS routing protocol
  - “monopolises” the Internet
  - uses path vector routing
    - which is in between distance vector and link state
  - uses (often non-coordinated) policy-based routing
    - which is a nuisance for convergence

Autonomous system (AS)

- An Autonomous System or AS is
  - a connected group of networks and routers,
  - representing some assigned set of IP prefixes,
  - having a single, consistent routing policy,
    - both internally and externally

Autonomous system illustration

[Figure: Autonomous Systems, with AS2503, AS192 and AS29077 as examples]

Slide courtesy Iljitsch van Beijnum

Providers, Customers and Peers

Customers and Providers

Customer pays provider for access to the Internet

[Figure: a provider and a customer; legend: IP traffic, provider-customer link]

Slide courtesy Timothy Griffin

Providers, Customers and Peers

The “Peering” Relationship

Peers provide transit between their respective customers

Peers do not provide transit between peers

Peers (often) do not exchange $$$

[Figure legend: peer-peer link, provider-customer link, traffic allowed, traffic NOT allowed]

Slide courtesy Timothy Griffin

Providers, Customers and Peers

Peering Provides Shortcuts

Peering also allows connectivity between the customers of “Tier 1” providers.

[Figure legend: peer-peer link, provider-customer link]

Slide courtesy Timothy Griffin

Providers, Customers and Peers

AS Graph != Internet Topology

The AS graph may look like this. Reality may be closer to this…

[Figures: a simple AS graph next to a much denser router-level topology]

BGP was designed to throw away information!

Slide courtesy Timothy Griffin

Providers, Customers and Peers Treatment

- The order of preference for a route is
  - Customers have highest preference
  - Peers have the next highest preference
  - Providers have the lowest preference
- Transit relationships are enforced by export filtering
  - Do not advertise provider or peer routes to other providers or peers
  - Do advertise all routes to customers
  - Do advertise customer routes to providers and peers
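The export rules above can be sketched as a small predicate. This is only an illustration: the names `Rel`, `LOCAL_PREF` and `may_export` are hypothetical, and the local preference numbers are assumed values that merely respect the customer > peer > provider ordering.

```python
from enum import Enum


class Rel(Enum):
    """Business relationship with a BGP neighbor."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


# Assumed example values; only the ordering customer > peer > provider matters.
LOCAL_PREF = {Rel.CUSTOMER: 100, Rel.PEER: 90, Rel.PROVIDER: 80}


def may_export(learned_from: Rel, send_to: Rel) -> bool:
    """Return True if a route learned from one neighbor type may be sent to another."""
    if send_to is Rel.CUSTOMER:
        # Customers receive all routes.
        return True
    # Providers and peers only receive customer routes.
    return learned_from is Rel.CUSTOMER
```

A route learned from a peer is therefore exported to customers, but never to another peer or to a provider, which is exactly what prevents an AS from giving away free transit.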

Providers, Customers and Peers: Import and Export

Import Routes

[Figure: an ISP importing routes from its peers, from its providers and from its customers; legend: provider route, customer route, peer route, ISP route]

Slide courtesy Timothy Griffin

Providers, Customers and Peers: Import and Export

Export Routes

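[Figure: an ISP exporting routes to its peers, to its customers and to its provider; figure labels: To peer, To customer, To provider, From provider, filters block; legend: provider route, customer route, peer route, ISP route]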

Slide courtesy Timothy Griffin

EBGP and IBGP (1)

- External BGP (EBGP)
  - is used for BGP neighbors between different AS’s
    - to exchange prefixes
    - and to implement policies
- Internal BGP (IBGP)
  - is used for BGP neighbors within only one AS
    - to distribute Internet prefixes across the backbone in order to create a consistent view among all entry/exit points
    - to originate local (customer) prefixes

EBGP and IBGP (2)

- Routes imported from one IBGP peer are not distributed to another IBGP peer
  - This prevents possible routing loops
  - Loop detection is based on duplicates in AS paths, which can only be detected by EBGP between different AS’s (the AS path does not change inside an AS)
  - Requires IBGP peers to be configured as a full mesh

Routing Information Bases (RIBs)

- Adj-RIB-In (one per peer)
  - Routes after input filtering
- Loc-RIB (one globally)
  - Routes after best path selection
- Adj-RIB-Out (one per peer)
  - Routes after output filtering

BGP protocol

- Runs over TCP, using port 179
- Exchanges NLRI
  - Network Layer Reachability Information
  - Prefixes that can or can no longer be reached through the router
  - Accompanied by BGP attributes

Some important BGP attributes

- In order of path selection importance
  - LOCAL_PREF (Local Preference)
  - AS_PATH
  - ORIGIN
  - MULTI_EXIT_DISC (MED; Multi-exit discriminator)
- And further...
  - NEXT_HOP
    - which must be reachable (directly or via IGP)
    - except in the case of multi-hop BGP

Interaction between BGP and IGP

BGP Next Hop Attribute

Every time a route announcement crosses an AS boundary, the Next Hop attribute is changed to the IP address of the border router that announced the route.

[Figure: the prefix 135.207.0.0/16 travels from AS 6431 (AT&T Research) via AS 7018 (AT&T) to AS 12654 (RIPE NCC RIS project); it is announced first with Next Hop = 12.125.133.90 and then with Next Hop = 12.127.0.121]

Slide courtesy Timothy Griffin

Interaction between BGP and IGP

Join EGP with IGP For Connectivity

[Figure: AS 1 and AS 2; 135.207.0.0/16 is announced across the AS boundary with Next Hop = 192.0.2.1, and the router shown reaches 192.0.2.0/30 via 10.10.10.10]

EGP table
destination       next hop
135.207.0.0/16    192.0.2.1

IGP table
destination       next hop
192.0.2.0/30      10.10.10.10

Forwarding Table (EGP + IGP)
destination       next hop
135.207.0.0/16    10.10.10.10
192.0.2.0/30      10.10.10.10

Slide courtesy Timothy Griffin
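The join of the two tables on the slide above amounts to resolving each BGP next hop through the IGP table. The following is a minimal sketch of that idea, using the addresses from the slide; the function names `lookup` and `build_forwarding_table` are hypothetical, and real routers do considerably more (for example handling unresolvable or changing next hops).

```python
import ipaddress


def lookup(table, address):
    """Longest-prefix match of an address in a {prefix: next_hop} table."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


def build_forwarding_table(bgp_routes, igp_routes):
    """Replace every BGP next hop by the IGP next hop that leads towards it."""
    fib = dict(igp_routes)
    for prefix, bgp_next_hop in bgp_routes.items():
        igp_next_hop = lookup(igp_routes, bgp_next_hop)
        if igp_next_hop is not None:  # skip routes whose next hop is unreachable
            fib[prefix] = igp_next_hop
    return fib


bgp_routes = {"135.207.0.0/16": "192.0.2.1"}
igp_routes = {"192.0.2.0/30": "10.10.10.10"}
fib = build_forwarding_table(bgp_routes, igp_routes)
```

With the slide's values, both `135.207.0.0/16` and `192.0.2.0/30` end up pointing at `10.10.10.10`.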

Route selection

Route Selection Summary

1. Highest Local Preference (enforce relationships)
2. Shortest ASPATH (traffic engineering)
3. Lowest MED (traffic engineering)
4. i-BGP < e-BGP (traffic engineering)
5. Lowest IGP cost to BGP egress (traffic engineering)
6. Lowest router ID (throw up hands and break ties)

Slide courtesy Timothy Griffin
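The selection order on the slide above can be read as a lexicographic comparison. The sketch below encodes it as a sort key; the `Route` class and its field names are hypothetical. It is deliberately simplified: it omits ORIGIN (as the summary slide does) and compares MEDs between all routes, whereas implementations typically compare MEDs only between routes from the same neighboring AS.

```python
from dataclasses import dataclass


@dataclass
class Route:
    local_pref: int
    as_path: tuple
    med: int
    learned_via_ebgp: bool
    igp_cost: int
    router_id: int


def preference_key(route: Route):
    """Smaller key means more preferred; fields are compared left to right."""
    return (
        -route.local_pref,                    # highest local preference
        len(route.as_path),                   # shortest AS path
        route.med,                            # lowest MED
        0 if route.learned_via_ebgp else 1,   # EBGP-learned before IBGP-learned
        route.igp_cost,                       # lowest IGP cost to the egress
        route.router_id,                      # lowest router ID as final tie-break
    )


def best_route(routes):
    """Pick the most preferred route from a non-empty list of candidates."""
    return min(routes, key=preference_key)
```

Because local preference comes first in the key, a route with a long AS path but a higher local preference still wins over a shorter path.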

Route selection

BGP Route Processing

1. Receive BGP Updates
2. Apply Import Policies (Apply Policy = filter routes & tweak attributes)
3. Best Route Selection, based on attribute values
4. Best Route Table
   - Install forwarding entries for best routes in the IP Forwarding Table
5. Apply Export Policies (Apply Policy = filter routes & tweak attributes)
6. Transmit BGP Updates

Open ended programming. Constrained only by vendor configuration language.

Slide courtesy Timothy Griffin

BGP attribute types

- Well-known mandatory
  - ORIGIN, AS_PATH, NEXT_HOP
- Well-known discretionary
  - LOCAL_PREF, ATOMIC_AGGREGATE
- Optional transitive
  - COMMUNITIES, AGGREGATOR
- Optional non-transitive
  - MULTI_EXIT_DISC

LOCAL_PREF (Local Preference)

- Advertised within a single AS (via IBGP)
- Used to implement local policies
  - Can depend on any locally available information, possibly learned outside BGP
- Default value is 100
- Highest value wins

AS_PATH

- Sequence of AS’s (or sets of AS’s)
- Used for loop detection
- Shortest path wins
- Prepend own AS (possibly multiple times) in EBGP updates
- Leave unchanged in IBGP updates
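The AS_PATH rules above can be illustrated with two small helpers. This is a sketch under simplifying assumptions: the path is modeled as a plain list of AS numbers (a sequence only, no AS sets), and the function names are hypothetical.

```python
def export_ebgp(as_path, own_as, prepend_count=1):
    """Path sent to an EBGP neighbor: own AS prepended one or more times."""
    return [own_as] * prepend_count + list(as_path)


def export_ibgp(as_path):
    """Path sent to an IBGP neighbor: left unchanged."""
    return list(as_path)


def accept(as_path, own_as):
    """Loop detection: reject any path that already contains our own AS."""
    return own_as not in as_path
```

For example, `export_ebgp([7018, 6341], 1239)` yields `[1239, 7018, 6341]`, and a router in AS 7018 would reject a path such as `[1, 333, 7018, 877]`.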

Examples of AS_PATHs

ASPATH Attribute

[Figure: 135.207.0.0/16 is originated by AS 6341 (AT&T Research) and propagates through AS 7018 (AT&T), AS 1239 (Sprint), AS 1755 (Ebone), AS 1129 (Global Access), AS 3549 (Global Crossing) and AS 12654 (RIPE NCC RIS project). The announcements shown are:]

135.207.0.0/16  AS Path = 6341
135.207.0.0/16  AS Path = 7018 6341
135.207.0.0/16  AS Path = 3549 7018 6341
135.207.0.0/16  AS Path = 1239 7018 6341
135.207.0.0/16  AS Path = 1755 1239 7018 6341
135.207.0.0/16  AS Path = 1129 1755 1239 7018 6341

Slide courtesy Timothy Griffin

Examples of AS_PATHs

Shorter Doesn’t Always Mean Shorter

[Figure: AS 1, AS 2, AS 3 and AS 4]

Mr. BGP says that path 4 1 is better than path 3 2 1. Duh!

In fairness: could you do this “right” and still scale? Exporting internal state would dramatically increase global instability and amount of routing state.

Slide courtesy Timothy Griffin

Examples of AS_PATHs

Interdomain Loop Prevention

BGP at AS YYY will never accept a route with ASPATH containing YYY.

[Figure: AS 1 offers 12.22.0.0/16 with ASPATH = 1 333 7018 877 to AS 7018. Don’t Accept!]

Slide courtesy Timothy Griffin

Examples of AS_PATHs

Traffic Often Follows ASPATH

[Figure: a chain AS 4, AS 3, AS 2, AS 1, where AS 1 holds 135.207.0.0/16; AS 4 sees 135.207.0.0/16 with ASPATH = 3 2 1 and sends an IP packet with Dest = 135.207.44.66]

Slide courtesy Timothy Griffin

Examples of AS_PATHs

… But It Might Not

[Figure: a chain AS 4, AS 3, AS 2, AS 1, where AS 1 holds 135.207.0.0/16 (announced with ASPATH = 1) and AS 4 sees it with ASPATH = 3 2 1; AS 5 holds 135.207.44.0/25 and announces it with ASPATH = 5; AS 4 sends an IP packet with Dest = 135.207.44.66]

AS 2 filters all subnets with masks longer than /24

From AS 4, it may look like this packet will take path 3 2 1, but it actually takes path 3 2 5

Slide courtesy Timothy Griffin
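The surprise on the slide above comes from longest-prefix matching in the forwarding plane. The sketch below models what AS 2 knows, assuming (as the slide suggests) that AS 2 holds both the /16 from AS 1 and the /25 from AS 5; the variable and function names are hypothetical.

```python
import ipaddress

# Routes known inside AS 2, keyed by prefix; the value is the AS traffic is sent to.
as2_routes = {
    ipaddress.ip_network("135.207.0.0/16"): "AS 1",
    ipaddress.ip_network("135.207.44.0/25"): "AS 5",
}


def next_as(routes, destination):
    """Forward on the most specific (longest) prefix that contains the destination."""
    addr = ipaddress.ip_address(destination)
    candidates = [net for net in routes if addr in net]
    if not candidates:
        return None
    return routes[max(candidates, key=lambda net: net.prefixlen)]


print(next_as(as2_routes, "135.207.44.66"))  # AS 5
```

AS 4 only ever sees the /16 with ASPATH = 3 2 1, because the /25 is filtered before it gets that far, yet the packet follows the /25 once it reaches AS 2.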

ORIGIN

- The ORIGIN attribute tells where the route (NLRI) originated
  - Interior to the originating AS: ORIGIN = 0
  - Via the EGP protocol (historic): ORIGIN = 1
  - Via some other means: ORIGIN = 2
- A lower ORIGIN wins

MULTI_EXIT_DISC (Multi-Exit Discriminator or MED)

- The MED (or metric, formerly INTER_AS_METRIC) is meant to be advertised between neighboring AS’s (via EBGP)
  - Some implementations also carry the MED on via IBGP (hot potato versus cold potato)
  - The MED is non-transitive (is not transferred into a third AS)
- A lower MED wins
  - The default MED is 0 (lowest possible value)
  - Some implementations choose the highest possible value

BGP packet header

[Header layout, 32 bits per row]

Marker (16 octets)
Length (2 octets), Type (1 octet)

Remember that BGP “packets” are in fact part of a TCP-stream

BGP header fields

Marker   All 1’s (compatibility)
Length   Total length, including header, no padding
Type     1: OPEN
         2: UPDATE
         3: NOTIFICATION
         4: KEEPALIVE
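As an illustration of the header fields above, the following sketch parses the fixed 19-octet header with Python's `struct` module. The function name `parse_header` is hypothetical; the field sizes (16-octet marker, 2-octet length, 1-octet type) and the 19 to 4096 octet length range follow RFC 4271.

```python
import struct

# Network byte order: 16-octet marker, 2-octet length, 1-octet type.
HEADER = struct.Struct("!16sHB")
TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def parse_header(data: bytes):
    """Return (total length, type name) for the BGP message starting at data[0]."""
    marker, length, msg_type = HEADER.unpack_from(data)
    if marker != b"\xff" * 16:
        raise ValueError("marker must be all ones")
    if not 19 <= length <= 4096:
        raise ValueError("length out of range")
    return length, TYPES.get(msg_type, "UNKNOWN")


# A KEEPALIVE is just the header: length 19, type 4.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
print(parse_header(keepalive))  # (19, 'KEEPALIVE')
```

Since BGP messages are part of a TCP stream, a reader would use the returned length to find where the next message starts.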

BGP OPEN message

[OPEN message layout, after the BGP header]

Version (1 octet)
My Autonomous System (2 octets)
Hold Time (2 octets)
BGP Identifier (4 octets)
Opt Parm Len (1 octet)
Optional Parameters (variable)

OPEN message fields

Version                4
My Autonomous System   Sender’s AS
Hold Time              Liveness detection
BGP Identifier         Sender’s identifying IP address
Opt Parm Length        Length of parameter field
Optional Parameters    TLV-encoded options

One interesting parameter is the Capabilities Optional Parameter, which defines (among others) the Route Refresh Capability.
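A minimal sketch of the OPEN message body described above (the part that follows the 19-octet BGP header). The helper names `build_open` and `parse_open` are hypothetical, optional parameters are passed through as opaque bytes rather than decoded, and the AS number must fit the 2-octet field.

```python
import ipaddress
import struct

# Version (1), My Autonomous System (2), Hold Time (2), BGP Identifier (4), Opt Parm Len (1).
OPEN_FIXED = struct.Struct("!BHH4sB")


def build_open(my_as, hold_time, bgp_id, opt_params=b""):
    """Encode an OPEN body for BGP version 4."""
    ident = ipaddress.IPv4Address(bgp_id).packed
    return OPEN_FIXED.pack(4, my_as, hold_time, ident, len(opt_params)) + opt_params


def parse_open(body):
    """Decode an OPEN body into a dictionary of its fields."""
    version, my_as, hold_time, ident, opt_len = OPEN_FIXED.unpack_from(body)
    return {
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,
        "bgp_identifier": str(ipaddress.IPv4Address(ident)),
        "optional_parameters": body[OPEN_FIXED.size:OPEN_FIXED.size + opt_len],
    }
```

For instance, `parse_open(build_open(65000, 180, "192.0.2.1"))` round-trips the values; the AS number, hold time and identifier used here are arbitrary examples.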

BGP KEEPALIVE message

This page intentionally left blank: a KEEPALIVE message consists of the BGP header only.
http://www.this-page-intentionally-left-blank.org/

KEEPALIVE message fields

:)

BGP NOTIFICATION message

[NOTIFICATION message layout, after the BGP header]

Error code (1 octet)
Error subcode (1 octet)
Data (variable)

NOTIFICATION message fields

Error code      1: Message Header Error
                2: OPEN Error
                3: UPDATE Error
                4: Hold Timer Expired
                . . .
Error subcode   Depends on error code
Data            Depends on error code and subcode

BGP UPDATE message

[UPDATE message layout, after the BGP header]

Unfeasible Routes Length (2 octets)
Withdrawn Routes (variable length)
Total Path Attribute Length (2 octets)
Path Attributes (variable length)
Network Layer Reachability Information (variable length)

UPDATE message fields

Unfeasible Routes Length                 Length of Withdrawn Routes
Withdrawn Routes                         List of prefixes [1]
Total Path Attribute Length              Length of Path Attributes
Path Attributes                          TLV-encoded attributes
Network Layer Reachability Information   List of NLRI prefixes

[1] A prefix is specified by its length and just enough bytes of the network IP address to cover this length
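The prefix encoding described in the footnote can be sketched as follows, for IPv4 only. The function names are hypothetical; the encoding itself (one length octet followed by ceil(length / 8) address octets) is the one the footnote describes.

```python
import ipaddress


def encode_prefix(prefix: str) -> bytes:
    """Encode an IPv4 prefix as <length in bits> + the minimal number of address octets."""
    net = ipaddress.ip_network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def decode_prefix(data: bytes):
    """Decode one prefix; return (prefix string, number of octets consumed)."""
    length = data[0]
    nbytes = (length + 7) // 8
    padded = data[1:1 + nbytes] + b"\x00" * (4 - nbytes)
    return f"{ipaddress.IPv4Address(padded)}/{length}", 1 + nbytes


print(encode_prefix("135.207.0.0/16").hex())  # 1087cf
```

A /16 thus takes three octets on the wire and a /25 takes five, which is why the Withdrawn Routes and NLRI fields need explicit lengths or must run to the end of the message.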

Tweaking your policies

Tweak Tweak Tweak

• For inbound traffic
  - Filter outbound routes
  - Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection
• For outbound traffic
  - Filter inbound routes
  - Tweak attributes on inbound routes to influence best route selection

[Figure: outbound routes and inbound traffic on one side, inbound routes and outbound traffic on the other]

In general, an AS has more control over outbound traffic

Slide courtesy Timothy Griffin

Outbound Traffic Engineering

- This works by manipulating incoming routes
  - Changing local preference
  - Extending inbound AS paths
  - Manipulating the metric (MED), for instance by using inbound communities
- It is relatively simple (and based on your own policy)

Manipulating local preference

So Many Choices

Which route should Frank pick to 13.13.0.0/16?

[Figure: Frank’s Internet Barn and AS 1, AS 2, AS 3 and AS 4, with 13.13.0.0/16 as the destination; legend: peer-peer link, provider-customer link]

Slide courtesy Timothy Griffin

Manipulating local preference

LOCAL PREFERENCE

[Figure: AS 1, AS 2, AS 3 and AS 4 with destination 13.13.0.0/16; the candidate routes are marked local pref = 80, local pref = 100 and local pref = 90]

Higher Local preference values are more preferred

Local preference used ONLY in iBGP

Slide courtesy Timothy Griffin

Manipulating local preference

Implementing Backup Links with Local Preference (Outbound Traffic)

Forces outbound traffic to take primary link, unless link is down.

[Figure: AS 65000 is connected to AS 1 by a primary link and a backup link]

Primary link: Set Local Pref = 100 for all routes from AS 1
Backup link: Set Local Pref = 50 for all routes from AS 1

We’ll talk about inbound traffic soon …

Slide courtesy Timothy Griffin

Manipulating local preference

Multihomed Backups (Outbound Traffic)

Forces outbound traffic to take primary link, unless link is down.

[Figure: AS 2 has two providers, AS 1 over the primary link and AS 3 over the backup link]

Primary link: Set Local Pref = 100 for all routes from AS 1
Backup link: Set Local Pref = 50 for all routes from AS 3

Slide courtesy Timothy Griffin

Inbound Traffic Engineering

- This works by manipulating outgoing routes
  - Extending outbound AS_PATHs is a traditional hack
  - Manipulating the metric (MED) is the traditional way
  - Setting outbound communities is the more modern approach, where agreements with your neighbors are specified
- Inbound is more complex than outbound
  - Inbound depends on neighbor’s policy
- Last resort method: announcing more specific routes (often a bad idea)

Manipulating AS_PATHs

Shedding Inbound Traffic with ASPATH Padding. Yes, this is a Glorious Hack …

Padding will (usually) force inbound traffic from AS 1 to take primary link

[Figure: customer AS 2 announces 192.0.2.0/24 to provider AS 1 over two links, with ASPATH = 2 on the primary link and ASPATH = 2 2 2 on the backup link]

Slide courtesy Timothy Griffin

Manipulating AS_PATHs

… But Padding Does Not Always Work

[Figure: customer AS 2 announces 192.0.2.0/24 with ASPATH = 2 to provider AS 1 over the primary link, and with ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2 to provider AS 3 over the backup link]

AS 3 will send traffic on “backup” link because it prefers customer routes and local preference is considered before ASPATH length!

Padding in this way is often used as a form of load balancing

Slide courtesy Timothy Griffin

Manipulating AS_PATHs

COMMUNITY Attribute to the Rescue!

[Figure: customer AS 2 announces 192.0.2.0/24 with ASPATH = 2 to provider AS 1 over the primary link, and with ASPATH = 2, COMMUNITY = 3:70 to provider AS 3 over the backup link]

Customer import policy at AS 3:
  If 3:90 in COMMUNITY then set local preference to 90
  If 3:80 in COMMUNITY then set local preference to 80
  If 3:70 in COMMUNITY then set local preference to 70

AS 3: normal customer local pref is 100, peer local pref is 90

Slide courtesy Timothy Griffin
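The customer import policy at AS 3 on the slide above can be sketched as a lookup table. Communities are modeled here as `(asn, value)` tuples and the function name is hypothetical; real routers express this in a vendor-specific policy language.

```python
# Values taken from the slide: normal customer routes get local preference 100.
DEFAULT_CUSTOMER_LOCAL_PREF = 100

# Checked in the order the slide lists the rules (dicts keep insertion order).
COMMUNITY_TO_LOCAL_PREF = {(3, 90): 90, (3, 80): 80, (3, 70): 70}


def customer_import_local_pref(communities):
    """Return the local preference AS 3 assigns to a customer route."""
    for community, local_pref in COMMUNITY_TO_LOCAL_PREF.items():
        if community in communities:
            return local_pref
    return DEFAULT_CUSTOMER_LOCAL_PREF
```

A route tagged with 3:70 gets local preference 70, which is below the peer value of 90, so AS 3 prefers any route it learns via a peer and uses the backup link only as a last resort.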

Manipulating MEDs

Hot Potato Routing: Go for the Closest Egress Point

[Figure: a router with two BGP routes to 192.44.78.0/24; the IGP distances are 15 to egress 1 and 56 to egress 2]

This Router has two BGP routes to 192.44.78.0/24.

Hot potato: get traffic off of your network as soon as possible. Go for egress 1!

Slide courtesy Timothy Griffin

Manipulating MEDs

Getting Burned by the Hot Potato

[Figure: a high bandwidth provider backbone and a low bandwidth customer backbone, with locations SFF, NYC and San Diego and a Heavy Content Web Farm; figure labels: 15, 56, 172865; a tiny http request is answered by a huge http reply]

Many customers want their provider to carry the bits!

Slide courtesy Timothy Griffin

Manipulating MEDs

Cold Potato Routing with MEDs (Multi-Exit Discriminator Attribute)

[Figure: the Heavy Content Web Farm network announces 192.44.78.0/24 with MED = 15 on one link and MED = 56 on the other; figure labels: 15, 56, 172865]

Prefer lower MED values

This means that MEDs must be considered BEFORE IGP distance!

Note 1: some providers will not listen to MEDs
Note 2: MEDs need not be tied to IGP distance

Slide courtesy Timothy Griffin

COMMUNITIES

- An optional transitive attribute
- A community can be used to communicate preferred treatment of a route
- Some communities have well-known semantics
  - NO_EXPORT: don’t export beyond current AS (or confederation)
  - NO_ADVERTISE: don’t export at all
  - NO_EXPORT_SUBCONFED: don’t export via EBGP

Use of communities

How Can Routes be Colored? BGP Communities!

A community value is 32 bits

By convention, first 16 bits is ASN indicating who is giving it an interpretation; the remaining 16 bits are the community number

Very powerful BECAUSE it has no (predefined) meaning

Community Attribute = a list of community values. (So one route can belong to multiple communities)

RFC 1997 (August 1996)

Used for signalling within and between ASes

Two reserved communities

no_advertise = 0xFFFFFF02: don’t pass to BGP neighbors
no_export = 0xFFFFFF01: don’t export out of AS

Slide courtesy Timothy Griffin
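The 32-bit layout on the slide above (16-bit ASN followed by a 16-bit community number) can be illustrated with two helpers. The function names are hypothetical; the two reserved values are the ones quoted on the slide.

```python
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02


def encode_community(asn: int, number: int) -> int:
    """Pack 'asn:number' into a 32-bit community value."""
    if not (0 <= asn <= 0xFFFF and 0 <= number <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (asn << 16) | number


def decode_community(value: int):
    """Split a 32-bit community value into (asn, number)."""
    return value >> 16, value & 0xFFFF


print(hex(encode_community(3, 70)))  # 0x30046
```

The community 3:70 from the backup-link example is therefore carried as the single integer 0x00030046.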

Use of communities

Communities Example

AS 1

Import
• 1:100: Customer routes
• 1:200: Peer routes
• 1:300: Provider Routes

Export
• To Customers: 1:100, 1:200, 1:300
• To Peers: 1:100
• To Providers: 1:100

Slide courtesy Timothy Griffin

Route Reflectors

- A route reflector is a kind of “super” IBGP peer
- A route reflector has clients with which it peers via IBGP and for which it reflects (transitively) routes
- A route reflector is part of a full mesh of other route reflectors and non-clients
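To see why route reflection helps IBGP scale, one can count sessions. The sketch below compares a full mesh with a simplified reflector design; the function names are hypothetical, and the model assumes there are no non-client routers and that every client peers with a fixed number of reflectors (one by default).

```python
def full_mesh_sessions(routers: int) -> int:
    """IBGP sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2


def reflector_sessions(reflectors: int, clients: int, reflectors_per_client: int = 1) -> int:
    """Sessions with a full mesh among reflectors plus client-to-reflector sessions."""
    return full_mesh_sessions(reflectors) + clients * reflectors_per_client


print(full_mesh_sessions(100))     # 4950
print(reflector_sessions(4, 96))   # 102
```

Under these assumptions, 100 routers need 4950 sessions in a full mesh but only 102 with four reflectors; giving each client two reflectors for redundancy raises that to 198.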

Route reflectors illustration

[Figure: Full Mesh]

Slide courtesy Iljitsch van Beijnum

Route reflectors illustration

[Figure: Route Reflection]

Slide courtesy Iljitsch van Beijnum

Confederations

- Use multiple private AS’s inside your main AS
- Talk to the outside world with your main AS, hiding the private AS’s
- Talk to the inside world as if using EBGP and IBGP for the different private AS’s
- This needs special AS_PATH segment types

Confederations illustration

[Figure: Confederations]

Slide courtesy Iljitsch van Beijnum