MPLS Yaakov (J) Stein February 2006 Chief Scientist RAD Data Communications

MPLSMPLS

Yaakov (J) Stein February 2006Chief ScientistRAD Data Communications

Y(J)S MPLS Slide 2

Why study MPLS?Why study MPLS?

it’s new

it’s different and interesting

along with IPv6 it will save the Internet

it enables delivery of IP VPN services

it facilitates pseudowire functionality

all of the above

Y(J)S MPLS Slide 3

Course OutlineCourse Outline

1) Introduction

2) MPLS Forwarding Component

3) MPLS Control Component

4) MPLS Traffic Engineering

5) MPLS Applications

Y(J)S MPLS Slide 4

Introduction

12345

12345

1.1 Review of forwarding and routing

1.2 Problems with IP routing

1.3 Label Switching

Y(J)S MPLS Slide 5

Forwarding and RoutingForwarding and Routing

a router (switch) performs 2 distinct algorithms forwarding algorithm (forwarding component) routing algorithm (control component)

may be manual configuration or signaling protocol

from the forwarding point of view there are three different types of network:

broadcast (e.g. Ethernet) connectionless (e.g. IP) connection oriented (e.g. ATM)

1.1

Y(J)S MPLS Slide 6

Broadcast (e.g. Ethernet)Broadcast (e.g. Ethernet)

message is sent to all possible recipients

all receivers check message destination address

if message destination address does not match receiver’s addressmessage ignored (except in promiscuous mode)

if message destination address matches receiver’s addressmessage processed

not for me

not for me

for me!! not for me

not for me

not for me

not for me

1.1

Y(J)S MPLS Slide 7

Connectionless ForwardingConnectionless Forwardingbroadcast is only practical for small networks (e.g. LANs)

large networks can be broken into subnetworkswith routers performing the internetworking

such a network is connectionless (CL) if no setup is required before sending data each router makes independent forwarding decision packets are self-describing packet inserted anywhere will be properly forwarded

Notes:– the address must have global significance– IP actually relies on a L2 protocol (Ethernet, PPP) for final stage

CL forwarder (router)

1.1

Y(J)S MPLS Slide 8

IP RoutingIP Routing Distance Vector (Bellman-Ford), e.g. RIP

– send <addr,cost> to neighbors– routers maintain cost to all destinations– need to solve “count to problem“

Path Vector, e.g. BGP– send <addr,cost,path> to neighbors– similar to distance vector, but w/o “count to problem“– like distance vector has slow convergence*– doesn’t require consistent topology– can support hierarchical topology => exterior protocol (EGP)

Link State, e.g. OSPF, IS-IS– send <neighbor-addr,cost> to all routers– determine entire flat network topology (Dijkstra’s algorithm)– fast convergence*, guaranteed loopless => interior routing protocol (IGP)

*convergence time is the time taken until all routers work consistently before convergence is complete packets may be misforwarded, and there may be loops

1.1

Y(J)S MPLS Slide 9

IP ForwardingIP Forwarding

what field is used for forwarding?field used depends on application regular: use longest destination prefix match ToS: use longest destination prefix match + exact match on ToS multicast: use longest/exact match on source/group address

how is forwarding performed?forwarding algorithm depends on routing algorithm! for Distance Vector - forward by cost for Path Vector - forward by cost and path for Link State - Dijkstra’s algorithm

1.1

Y(J)S MPLS Slide 10

Connection Oriented ForwardingConnection Oriented Forwarding

each forwarder maintains forwarding table (or table per input port)

control component: route must be set-up (table must be updated) before data sent set-up may be manual or signaled once route no longer needed it should be torn-down

input ports output ports

1

2

3

4

5

1

2

3

4

5

CO forwarder (switch)

1.1

Y(J)S MPLS Slide 11

CO ForwardingCO ForwardingCO addresses need not have global significance

by using locally defined addresses: addresses are smaller in size no need for global allocation mechanism no need to maintain global database

L2 forwarding is based on address read and swap

Note:when addresses are purely local

CO forwarding is called L2 (link layer) forwarding

12 17 23

1.1

Y(J)S MPLS Slide 12

CO Forwarding TableCO Forwarding Table

input port input address output port output address

1 21 2 21

1 37 1 21

2 12 5 12

3 5 5 12

4 15 3 37

Note: never really have an input port column either it is irrelevant (CO address is per switch not per input port), or if needed (e.g. ATM) we keep separate tables per input port (more efficient) 1.1

Y(J)S MPLS Slide 13

CO RoutingCO Routing

for CO forwarding we need to find a route at setup– manual configuration (strict explicit route)– use routing protocols (e.g. P-NNI for ATM)

CO routing protocols search for a global path – hence they can guarantee global characteristics (SLA)– pure CL routing can not guarantee path QoS

information source usually knows desired characteristicsso usually use source routing

source routing– can force route to go through specific forwarders (loose explicit route)– can reject forwarders that can not guarantee needed characteristic– can request resource reservation in selected forwarders

1.1

Y(J)S MPLS Slide 14

Problems with IP routingProblems with IP routing

scalability– router table overload– routing convergence slow-down– increase in queuing time and routing traffic– problems specific to underlying L2 technologies

hard to implement load balancing QoS and Traffic Engineering problem of routing changes difficulties in routing protocol update lack of VPN services

1.2

Y(J)S MPLS Slide 15

ScalabilityScalabilitywhen IP was first conceived scalability was not a problemas IP traffic increases, routing shows stress

simplistic example – N hosts– each router serves M hosts – each router entry takes a bytes

hence– router table size a N ~ N – N / M ~ N routers (more routers => slower convergence)– packet processing time* ~N (since have to examine entire table)

– ~N routers send to ~N routers tables of size ~Nso routing table update traffic increases ~N 3 (or ~N 4 )

* IP routing requires 1000s of clocks per decision

1.2

Y(J)S MPLS Slide 16

L2 Backbone ScalabilityL2 Backbone Scalabilityinstead of expensive and slow IP routers

we can use faster and cheaper ATM switches in core

but this doesn’t help! ATM switches are transparent to routerssince ATM switches do not participate in IP routing protocolsevery IP router must be logically adjacent to every other

and we need ~ N2 ATM VCs !if only the ATM switches could understand IP routing protocols!

ATM

Classical IP over ATM

1.2

Y(J)S MPLS Slide 17

Load BalancingLoad Balancing

using hop-count as path costtraffic destined for G from both A and B goes through D

so links CD and DG become congestedwhile links CE, EF, and FG are underutilized

since IP forwarding uses only destination addressthis can’t be overcome in pure IP routing

problem even exists in equal-cost multi-path (ECMP) case

solution requires traffic engineering 1.2

“fish diagram”A

B

C

D

E F

G

1

2

3

2 3

4

Y(J)S MPLS Slide 18

QoS and TEQoS and TE

pure IP, being CL, can not guarantee path QoSbut other protocols in the IP suite can help

TCP adds CO layer compensating for loss and misordering IntServ (RSVP) sets up path with reserved resources DiffServ (ToS) prioritizes packets (neither -Serv widely deployed)

so IP network managers mostly use network engineering– i.e. throw BW at the problem

rather than traffic engineering– i.e. optimally exploit the BW you have

1.2

Y(J)S MPLS Slide 19

Routing ChangesRouting Changes

IP routing is satisfactory in the steady state

but what happens when something changes?

change in routing information (new router, router failure, new inter-router connection, etc)

necessitates updating of tables of all routers

convergence will be slow

change in routing protocol is even worse (e.g. Bellman-Ford to present, classes to classless, IPv4 to IPv6)

necessitates upgrade of all router software

upgrade may have to be simultaneous

need more complete separation of forwarding from routing functionality

1.2

Y(J)S MPLS Slide 20

VPN ServicesVPN Services

IP was designed to interconnect LANsnot to provide VPN services

all routers logically interconnected, so security weak

LANs may use non globally unique addresses (present solution - NAT)

complex provider - customer relationship

VC-merge problem (discussed later)1.2

192.115.243.19

192.115.243.19 192.115.243.79

IP

Y(J)S MPLS Slide 21

Solution - Label SwitchingSolution - Label Switching

label switching adds the strength of CO to CL forwarding

label switching has three stages:– routing (topology determination) using L3 protocols– path setup (label binding and distribution) – data forwarding

label switching the solution to all of the above problems– speeds up forwarding– decreases forwarding table size (by using local labels)– load balance by explicitly setting up paths– complete separation of routing and forwarding algorithms

no new routing algorithm neededbut new signaling algorithm may be needed

CO forwarder (switch) CL forwarder (router) LSR

1.3

Y(J)S MPLS Slide 22

Where is it?Where is it?

unlike TCP, the CO layer lies under the CL layerif there is a broadcast L2 (e.g. Ethernet), the CO layer lies above it

hence, label switching is sometime called layer 2.5 switching

1.3

higher layers

layer 3 (e.g. IP)

label switching (layer 2.5)

layer 2 (e.g. ethernet)

physical layer

shim header

Y(J)S MPLS Slide 23

LabelsLabels

a label is a short, fixed length, structure-less address

the following are not labels: telephone number (not fixed length, country-code+area-code+local-number) Ethernet address (too long, note vendor-code is not meaningful structure) IP address (too long, has fields) ATM address (has VP/VC)

not explicit requirement, but normally only local in significance

label(s) added to CL packet, in addition to L3 address

layer 2.5 forwarding may find a different route than the L3 forwarding is faster than L3 forwarding requires a flow setup process and signaling protocol

1.3

Y(J)S MPLS Slide 24

FForwarding orwarding EEquivalence quivalence CClasslass

equivalence class - set of entities sharing common characteristicsthat can be considered equivalent for some purpose

Theorem (from set theory) - any equality relation (e.g. common features)

divides all entities into non-overlapping equivalence classes

any forwarding algorithm need only consider destination (and sometimes source) address service requirements (e.g. priority, BW, allowed delay, etc)

we can group together all packets with the samedestination and service requirements as a FEC

by the theorem every packet belongs to one unique FEC

packets in the same FEC should follow the same routeso we should map them to the same label (bind them) 1.3

Y(J)S MPLS Slide 25

FECsFECs

what constitutes a FEC ?

for IP routing (de facto)all packets w/ same destination IP prefixthat prefix being the longest in the routing table

for MPLS we can decide !

coarsest granularity all packets with a destination address

served by a given router

finest granularityall packets from given source socket

to given destination socket with specified handling requirements

1.3

Y(J)S MPLS Slide 26

Label Switching ArchitectureLabel Switching Architecture

label switching is needed in the core, access can be L3 forwarding*

core interfaces the access at the edge (ingress, egress)

LSR router that can* perform label switchingLER LSR with non-MPLS neighbors (LSR at edge of core network)LSP unidirectional path used by label switched forwarding (ingress to egress)

* not every packet needs label switching (e.g. only small number of packets, no QoS) 1.3

L3 router L3 routerLabel Edge Router Label Edge RouterLabel Switched Routers

L3 link L3 link

Label Switched Path

ingress egress

downstream directionupstream direction

Y(J)S MPLS Slide 27

Label Switched ForwardingLabel Switched ForwardingLSP needs to be setup before data is forwarded

and torn down once no longer needed

LSR performs – label switched forwarding* for labeled packets– label unique to LSR or unique to input interface (like ATM)– optionally L3 forwarding for unlabeled packets

ingress LER – assigns packet* to FEC– labels packet– forwards it downstream using label switching

egress LER – removes label– forwards packet using L3 forwarding– exception: PHP (discussed later)

*once packet is assigned to a FEC and labeled, no LSR looks at the L3 headers/address 1.3

Y(J)S MPLS Slide 28

Hierarchical ForwardingHierarchical Forwarding

many networks use hierarchical routing– decreases router table size– increases forwarding speed– decreases routing convergence time

telephone numbers

Internet DNS

ATM

Ethernet/802.3 address space is flat(even though written in byte fields)

Country-Code Area-Code Exchange-Code Line-Number

972 2 588 9159

host … SLD TLDmyrad . rad . co . il

1.3

VPVC

VCVC

Y(J)S MPLS Slide 29

IP Routing HierarchyIP Routing HierarchyIPv4

– originally flat space (1st come - 1st) until router tables exploded

– then 3 classes

– now CIDR <RFC1519> AS=Autonomous

SystemIP exploits hierarchy by employing interior/exterior routing protocols

IP can even support arbitrary levels of hierarchyby advertising aggregated addresses

but the exploitation is not optimal

ASany host

0 net7 host24A

10 net14 host16B 110 net22 host8C

AS12

AS13AS13

AS14

1.3

Y(J)S MPLS Slide 30

IP Routing Hierarchy BugIP Routing Hierarchy Bug

traffic from network 1 to network 2 must go through transit domain

to get to any host in network 2, all transit domain routers forward to border router 2

transit domain routers shouldn’t need to know anything about network 2but for IP protocols* change in network 2 forces rerouting in transit domainin worst case, transit domain routers need to know all routes in Internet

avoiding maintenance of interdomain information • conserves routing table memory• ensures faster convergence• transit domain routers restart faster• provides better fault isolation * except at BGP-OSPF interface

Net 1 Net 2transit domain1 2

1.3

Y(J)S MPLS Slide 31

Label StacksLabel Stacks

since labels are structure-less, the label space is flat

label switching can support arbitrary levels of hierarchy by using a label stack

label forwarding based only on top labelbefore forwarding, three possibilities (listed in NHLFE) :

– read top label and pop– read top label and swap– read top label, swap, and push new label(s)

top label

another label

yet another label

bottom label

1.3

Y(J)S MPLS Slide 32

Example Uses of Label StackExample Uses of Label Stack

Example applications that exploit the label stack

– fast rerouting

– VPNs

– X over MPLS (PWE3)

Note: three labels is usually more than enough

1.3

Y(J)S MPLS Slide 33

Fast ReroutingFast ReroutingIP has no inherent recovery method (like SONET)

in order to ensure resilience we provide bypass links

to reroute quickly we pre-prepare labels for the bypass links

10 11 12swap swap

swap+

push popswapswap

11

13 11

1411

from here on no difference! *

when link downchange fwd table

protection LSP1.3* label space per LSR

not per input port

Y(J)S MPLS Slide 34

Label Switched VPNsLabel Switched VPNs

1.3

customer 2 networkcustomer 1 network

C C

C

CC

C

CCE

C

C

CC

C

CE

PPE

PP

CE

PE

P

P

CEprovider network

customer 2 network customer 1 networkKey

C Customer routerCE Customer Edge routerP Provider routerPE Provider Edge router

Y(J)S MPLS Slide 35

Label Switched VPNs Label Switched VPNs (cont.)(cont.)

customers 1 and 2 use overlapping IP addresses

C-routers have inconsistent tables

ingress PE-router inserts two labels

P-routers don’t see IP addressesso no ambiguity

P-routers see only the label of the egress PE-router– they don’t know about VPNs at all– no need to understand customer configuration– hence smaller tables & no rerouting if customer reconfigures

ingress PE router only knows about CE routers– no need to understand customer configuration

egress PE

egress CE

IP headers

payload

1.3

Y(J)S MPLS Slide 36

X over MPLSX over MPLS

1.3

nativeservice

A tunnel may contain many PWs

nativeservice

PE

router

P router

P router

P router

PE

router

P router

tunnel label *

PW label *

PW control word

payload

* MPLS-f inner label outer label(s)Dictionary: ITU-T interworking label transport label(s)

IETF PW label tunnel label(s)

Y(J)S MPLS Slide 37

Service InterworkingService Interworking

Network interworking (PW):

Service interworking:

ProviderEdge

(PE)

CustomerEdge

(CE)

providernetwork

NativeService

A

ProviderEdge

(PE)

CustomerEdge

(CE)

NativeService

A

ProviderEdge

(PE)

CustomerEdge

(CE)

providernetwork

NativeService

A

ProviderEdge

(PE)

CustomerEdge

(CE)

NativeService

B

1.3

Y(J)S MPLS Slide 38

MPLS Forwarding Component

2.1 The essentials

2.2 The documents

Label (20b) Exp(3b) S(1b) TTL (8b)

Y(J)S MPLS Slide 39

MPLS historyMPLS historymany different label switching schemes were invented Cell Switching Router (Toshiba) <RFC 2098,2129> IP Switching (Ipsilon, bought by Nokia) <RFC 2297> Tag Switching (Cisco) <RFC 2105> Aggregate Route-based IP Switching (IBM) IP Navigator (Cascade bought by Ascend bought by Lucent)

so the IETF decided to standardize a single method BOFs 1994-1995 WG chartered 1997

– co-chairs from Cisco and IBM (so similar to tag switching and ARIS)

– Cisco, IBM, Ascend authored architecture document MPLS standards-track RFCs 2001-

2.1

Y(J)S MPLS Slide 40

MPLSMPLS

of all the label switching technologies - what is special about MPLS ?

multiprotocol - from above and below

label in L2 or shim header

single forwarding algorithm, including for multicast and TE

can run on IP router or ATM switch with only SW upgrade (although can benefit from special HW)

label distribution piggybacked on existing routing protocols or via LDP

control-driven downstream label binding

support for constraint-based routing (for TE)

2.1

Y(J)S MPLS Slide 41

MultiprotocolMultiprotocol Label Switching Label Switching

IPv4 IPv6 IPX etc.

MPLS

Ethernet ATM frame-relay etc.

2.1

Y(J)S MPLS Slide 42

MPLS LabelsMPLS Labelslabel may be an appropriate address in either L2 or L3

(even if we lose other features)

Examples: – for ATM use VPI and VCI fields as two labels (only two labels, no TTL)– for frame-relay use DLCI (only one label, no CoS, no TTL)

otherwise use shim header (described later)

2.1

Ethernet header

MPLS shim header

IP headers

payload

multi-access subnet

PPP header

MPLS shim header

IP headers

payloadpoint-to-point & POS

MPLS over ATM

ATM header

MPLS shim header

IP headers

payload fragment

ATM header

payload fragment

Y(J)S MPLS Slide 43

ATM Label SwitchingATM Label Switching

GFCVPI/VCI => top label

PTI/CLP/HEC

payload fragment rest of cells

1st cell

in this mode the label is carried in the VPI/VCIthis facilitates using ATM switcheshowever, we still need the shim header for S, and TTLthe label field in shim header is a placeholder and set to zero 2.1

shim IP packet AAL5 trailer

AAL5 PDU

ATM cell ATM cell ATM cell…

MPLS WG devoted a lot of time to this case

GFCVPI/VCI => top label

PTI/CLP/HEC

IP headers

payload fragment

MPLS shim

Y(J)S MPLS Slide 44

MPLS Shim HeaderMPLS Shim Header

when a shim header is needed, its format should be:

Label there are 220 different labels (+ 220 multicast labels)

Exp (CoS) left undefined by IETF WG was CoS in Cisco Tag Switching could influence packet queuing

Stack bit S=1 indicates bottom of label stack

TTL decrementing hop count used to eliminate infinite routing loops

generally copied from/to IP TTL field

Label (20b) Exp(3b) S(1b) TTL (8b)

2.1

S=0 top label

S=0 another label

S=0 yet another label

S=1 bottom label

Special (reserved) labels0 IPv4 explicit null1 router alert2 IPv6 explicit null3 implicit null

Y(J)S MPLS Slide 45

Single Forwarding AlgorithmSingle Forwarding AlgorithmIP uses different forwarding algorithms

for unicast, unicast w/ ToS, multicast, etc.

LSR uses one forwarding algorithm (LER is more complicated)– read top label L– consult Incoming Label Map (forwarding table) [Cisco terminology LFIB]– perform label stack operation (pop L, swap L - M, swap L - M and push N)

– forward based on L’s Next Hop Label Forwarding Entry

NHLFE contains:– next hop (output port, IP address of next LSR)

if next hop is the LSR itself then operation must be pop for multicast there may be multiple next hops, and packet is replicated

– label stack operation to be performed– any other info needed to forward (e.g. L2 format, how label is encoded)

ILM contains:– a NHLFE for each incoming label– possibly multiple NHLFEs for a label, but only one used per packet

2.1

Y(J)S MPLS Slide 46

LER Forwarding AlgorithmLER Forwarding Algorithm

LER’s forwarding algorithm is more complex check if packet is labeled or not if labeled

– then forward as LSR

– else lookup destination IP address in FEC-To-NHLFE Map if in FTN

– then prepend label and forward using LSR algorithm

– else forward using IP forwarding

[Cisco terminology LIB]

2.1LER LSRIP

routerpure IP link MPLS link

Y(J)S MPLS Slide 47

PPenultimate enultimate HHop op PPoppingopping

the egress LER E also may have to work overtime:– read top label– lookup label in ILM– find that in NHLFE that the label must be popped– lookup IP address in IP routing– forward to CE using IP forwarding

we can save a lookup (and the first 3 steps) by performing PHP but pay in loss of OAM capabilities

penultimate LSR PH performs the following:– read top label– lookup label in ILM– pop label revealing IP address of CE router– forward to CE using IP forwarding 2.1

I E

PH

CEIP link

MPLS domain

Y(J)S MPLS Slide 48

Route AggregationRoute Aggregation

traffic from both network 1 and network 2 is destined for network 3

scalability advantages– fewer labels – conserve table memory

disadvantages– IP forwarding may be required– OAM backwards trail is destroyed

2.1

E 3IP link

MPLS domain

1

2 IP link

IP link 12

22

13

23

31 32

net 2

net 3

net 1

Y(J)S MPLS Slide 49

IP Router / ATM SwitchIP Router / ATM Switchmigration - important to be able to exploit existing forwarding hardware

we can use IP routers as LSRs– already use the routing protocols for topology determination– need to add label distribution protocol (or extend routing protocol)– need to alter the forwarding algorithm– can manage with software upgrade but probably best to upgrade hardware too

we can use ATM switches as LSRs– need to alter the routing protocols for topology determination– need to add label distribution protocol– already uses the forwarding algorithm– probably can just upgrade software

2.1

+ =

=+

Y(J)S MPLS Slide 50

ATM Cell InterleaveATM Cell InterleaveATM switches

– have separate label space per input port– segment traffic into ATM cells

what if the forwarding tables for both port 1and port 2 have the entry

incoming label 13 outgoing label 17 outgoing port 1 ?

2.1B

ATMLSR

ATMLSR

13 data A1

13 data A2

A

13 data B2

13 data B1

17 data A1

17 data B1

17 data A2

17 data B2

1

21

2

B

Y(J)S MPLS Slide 51

VC/VP MergeVC/VP Mergein order to properly reconstruct the packet we can use

VC-merge

buffer cells at the ATM LSR until end-of-packet detected

cells belonging to different packets aren’t interleaved– introduces delay– requires large memory– requires ATM LSR to detect AAL5 end-of-packet

VP merge

use VPI as MPLS label

use VCI as multiplexing index

– limits number of available labels

2.1

Y(J)S MPLS Slide 52

Label Distribution ProtocolsLabel Distribution Protocolswhen LSR creates/removes a FEC - label binding

needs to inform other LSRs (“remote binding information”)

MPLS allows piggybacking label distribution on routing protocols– protocols already in use (don’t need to invent or deploy)

– eliminates race conditions (when route or binding, but not both, defined)

– ensures consistency between binding and routing information– only for distance vector or path vector routing protocols (not OSPF, IS-IS)

– not all routing protocols are sufficiently extensible (RIP isn’t)

– has been implemented for BGP-4

MPLS WG invented a new protocol LDP for “plain” label distribution– messages sent reliably using TCP/IP– messages encoded in TLVs– discovery mechanism to find other LSRs

… and extended RSVP to LSPs for QoS

2.1

Y(J)S MPLS Slide 53

Control MechanismsControl Mechanisms

pre-MPLS protocols used various control mechanisms

Example: label distribution– Cisco and IBM - control driven – Toshiba and Ipsilon - data (traffic) driven

MPLS is distinguished by its choice of control protocols– control driven– downstream binding– unsolicited and on-demand allowed– independent and ordered allowed

we will discuss these in the section on the control plane

2.1

Y(J)S MPLS Slide 54

MPLS RFCs MPLS RFCs (page 1 of 2)(page 1 of 2)

2547 BGP/MPLS VPNs2702 Requirements for Traffic Engineering Over MPLS2917 A Core MPLS IP VPN Architecture3031 Multiprotocol Label Switching Architecture3032 MPLS Label Stack Encoding3035 MPLS Using LDP and ATM VC Switching3036 LDP Specification3037 LDP Applicability3038 VCID Notification over ATM link for LDP3063 MPLS Loop Prevention Mechanism3107 Carrying Label Information in BGP-43209 RSVP-TE: Extensions to RSVP for LSP Tunnels3210 AS for Extensions to RSVP for LSP-Tunnels3212 Constraint-Based LSP Setup using LDP3213 Applicability Statement for CR-LDP3214 LSP Modification Using CR-LDP3215 LDP State Machine

2.2

Y(J)S MPLS Slide 55

More MPLS RFCsMore MPLS RFCs

3270 MPLS Support of Differentiated Services3346 AS for Traffic Engineering with MPLS3353 Overview of IP Multicast in MPLS Environment3429 Assignment of the 'OAM Alert Label' for MPLS OAM functions3443 TTL Processing in MPLS Networks3468 The MPLS WG decision on MPLS signaling protocols3469 Framework for MPLS-based Recovery3477 Signalling Unnumbered Links in RSVP-TE 3496 Support of ATM Service Class-aware MPLS Traffic

Engineering3564 Support of DiffServ-aware MPLS Traffic Engineering3478 Graceful Restart Mechanism for LDP3479 Fault Tolerance for the LDP 3480 Signalling Unnumbered Links in CR-LDP

2.2

Y(J)S MPLS Slide 56

Yet More MPLS RFCsYet More MPLS RFCs

3612 Applicability Statement for Restart Mechanisms for LDP3630 OSPF-TE3809 Generic Requirements for PPVPNs3811 Definitions of Textual Conventions for MPLS Management3812 MPLS Traffic Engineering Management Information Base3813 MPLS Label Switching Router (LSR) MIB3814 MPLS FEC-To-NHLFE MIB3815 Definitions of Managed Objects for LDP 3916 Requirements for Pseudo-Wire Emulation Edge-to-Edge 3988 Maximum Transmission Unit Signalling Extensions for LDP 3985 PWE3 Architecture4023 Encapsulating MPLS in IP or GRE4026 PPVPN Terminology 4031 Service requirements for Layer 3 PPVPNs

2.2

Y(J)S MPLS Slide 57

Even More MPLS RFCsEven More MPLS RFCs4023 Encapsulating MPLS in IP or Generic Routing Encapsulation (GRE)4090 Fast Reroute Extensions to RSVP-TE for LSP Tunnels4105 Requirements for Inter-Area MPLS Traffic Engineering4124 Protocol Extensions for Support of Diffserv-aware MPLS TE4125 Maximum Allocation BW Constraints Model for Diffserv-aware MPLS TE4126 Max Allocation w/ Reservation BW Constraints Model for Diffserv MPLS

TE4127 Russian Dolls Bandwidth Constraints Model for Diffserv-aware MPLS TE4128 Bandwidth Constraints Models for Diffserv-aware MPLS TE4182 Removing a Restriction on the use of MPLS Explicit NULL4201 Link Bundling in MPLS Traffic Engineering (TE)4206 Label Switched Paths (LSP) Hierarchy with GMPLS Traffic Engineering4216 MPLS Inter-AS TE Requirements4220 Traffic Engineering Link Management Information Base4221 Multiprotocol Label Switching (MPLS) Management Overview4247 Requirements for Header Compression over MPLS4364 BGP/MPLS IP VPNs (was 2547bis)4368 MPLS Label-Controlled ATM and FR Management Interface Definition4377 OAM Requirements for MPLS Networks4378 A Framework for MPLS OAM4379 Detecting MPLS Data Plane Failures

2.2

Y(J)S MPLS Slide 58

MFA forum IAsMFA forum IAs

1.0 Voice over MPLS IA

2.0 MPLS-PVC UNI IA

3.0 LDP Conformance IA

4.0 TDM Transport over MPLS using AAL1 IA

5.0 I.366.2 Voice Trunking over MPLS IA

6.0.0 MPLS Proxy Admission Control

7.0.0 MPLS UNI protocol

8.0.0 TDM over MPLS using Raw Encapsulation

2.2

Y(J)S MPLS Slide 59

ITU-T RecommendationsITU-T Recommendations

Y.1411 ATM-MPLS network interworking - Cell mode user plane interworking

Y.1412 ATM-MPLS network interworking - Frame mode user plane interworking

Y.1413 TDM-MPLS network interworking – User plane interworking

Y.1414 Voice services - MPLS network interworking

Y.1415 Ethernet-MPLS network interworking – User plane interworking

Y.1710 Requirements for OAM functionality for MPLS networks

Y.1711 Operation & Maintenance mechanism for MPLS networks

Y.1712 User-plane fault-management for ATM and MPLS OAM

Y.1713 Misbranching detection for MPLS networks

Y.1720 Protection switching for MPLS networks

X.84 FR over MPLS core networks

G.8110/Y.1370 MPLS Layer Network architecture

2.2

Y(J)S MPLS Slide 60

MPLS Control Component3.1 Procedures and scenarios

3.2 Label distribution protocols

3.3 LDP and BGP-4 details

user (data) plane

control plane

Y(J)S MPLS Slide 61

Control and User PlanesControl and User Planes

topology determination

use standard IP routing protocolsBGP, OSPF, IS-IS, PIM

label distribution

piggyback on routing protocol

use LDP, RSVP-TE

forwarding paradigm

label-stack CO forwardinguser (data) plane

control plane

3.1

Y(J)S MPLS Slide 62

What is Needed ?What is Needed ?

all IP routing protocols (BGP,OSPF,PIM,etc) procedure to bind label to FEC (label assignment)

protocol to distribute label binding information procedure to create forwarding table

procedure to label incoming packet forwarding procedure

– forwarding table lookup– label stack operations

control plane

user (data) plane

3.1

Y(J)S MPLS Slide 63

All the TablesAll the Tables

FEC table

Free Labels 128-200 presently free

FTN

ILM

NHLFE

FEC protocol input port handling

192.115/16 IPv4 2 best-effort

FEC port/label in port/label out

192.115/16 2/17 3/137

port/label in port/label out next hop operation

2/17 3/137 5.4.3.2 swap

3.1

Y(J)S MPLS Slide 64

LER ArchitectureLER Architecturecontrol plane

user (data) plane

IP routing protocolsIP routing table label binding and

distribution protocols

IP forwarding

tableMPLS

forwarding table

IP routers

labeling procedure

LSRs

3.1

free label table

egress LER

FTN

Y(J)S MPLS Slide 65

Binding & Distribution OptionsBinding & Distribution Options

label binding (assignment)

– per port or per LSR label space

– control driven vs. data driven (traffic driven)

– liberal vs. conservative label retention

label distribution (advertisement)

– downstream vs. upstream

– downstream on-demand (dod) vs. downstream unsolicited (du)

– independent vs. ordered3.1

Y(J)S MPLS Slide 66

Per Port Label SpacePer Port Label Space

LSR may have a separate label space for each input port (I/F)

or a single common label space

or any combination of the two

separate labels spaces means separate forwarding tables per port

ATM LSR have per port label spaces (leads to interleave problem)

per port label spaces increases number of available labels

common label space facilitates several MPLS mechanisms (e.g. reroute)

3.1

Y(J)S MPLS Slide 67

Control vs. Data DrivenControl vs. Data Driventhere are two philosophies as to when to create a binding

data-driven (traffic-driven) binding (Toshiba CSR, Ipsilon IP-Switching)

automatically create binding when data packets arrive(from first packet?, after enough packets? when tear LSP down?)

control-driven binding (Cisco Tag Switching, IBM ARIS)

create binding when routing updates arrive(only update when topology changes? update upon request?)

although not specifically stated in the architecture documentMPLS assumes control driven binding *

* two implementations of control driven: – topology-driven (routing tables are consulted)– control-traffic driven (only routing update messages are used)

3.1

Y(J)S MPLS Slide 68

Liberal vs. Conservative RetentionLiberal vs. Conservative Retention

LSR receives “advertisements” (label distribution messages) from other LSRs

conservative label retention:LSR retains only label-to-FEC bindings that are presently needed

liberal label retentionLSR stores all bindings received (more labels need to be maintained)

using liberal retention can speed response to topology changes

LSRs must agree upon mode to be used

A advertises labelB is previous hop LSR

but C retains label anywaylater routing change makes C the previous hop

C immediately can start forwardingC

AB

3.1

Y(J)S MPLS Slide 69

Downstream vs. UpstreamDownstream vs. Upstream

binding means to allocate a label to a FEClocal binding - LSR allocates the label from “free label” poolremote binding - LSR that receives the label

which LSR allocates ?

MPLS uses downstream bindinglabel allocated by LSR downstream from the LSR that prepends itlabel distribution information flows upstream

reverse in direction from data packetsto set up LSP through link from LSR A to LSR BLSR B binds label 13 to FECB advertises label to LSR ALSR A sends packets with label 13 to B

downstream

B13A

label 13

3.1

Y(J)S MPLS Slide 70

On-demand vs. UnsolicitedOn-demand vs. Unsolicited

downstream on-demand label distribution:

LSRs may explicitly request a label from its downstream LSR

unsolicited label distribution:

LSR distributes binding to upstream LSR w/o a request(e.g. based on time interval, or upon receipt of topology change)

LSR may support on-demand, unsolicited, or both

adjacent LSRs must agree upon which mode to be used

LSR A needs to send a packet to LSR BLSR A requests a label from LSR BB binds label 13 to the FECB distributes the labelA starts sending data with label 13

B13 dataA

I need a label

I allocated 13

3.1

Y(J)S MPLS Slide 71

Independent vs. OrderedIndependent vs. Orderedindependent binding (Tag Switching)

– each LSR makes independent decision to bind and distribute

ordered binding (ARIS) egress LSR binds first and distributes binding to neighbors LSR that believes that it should be the penultimate LSR

binds and distributes to its neighbors binding proceeds in orderly fashion until ingress LSR is reached

LSRs must agree upon mode to be used

B sees that it is egress LSR for 192.115.6B allocates label 13B distributes label to C and DC distributes label to E 192.115/16

AB

D

CE 13

13

3.1

Y(J)S MPLS Slide 72

LDP tasksLDP tasks

a label distribution protocol is a signaling protocol that can perform the following tasks:

discover LSR peers

initiate and maintain LDP session

signal label request

advertise binding

signal label withdrawal

loop prevention

explicit routing

resource reservation3.2

Y(J)S MPLS Slide 73

Label Distribution ProtocolsLabel Distribution Protocols

label distribution can be carried over various protocols

There are presently four options:

LDP– MPLS-enhanced IP networks

BGP4-MPLS– RFC 2547 VPNs

RSVP-TE– traffic engineering support

CR-LDP – constraint based (no longer recommended by IETF)

3.2

Y(J)S MPLS Slide 74

LDP vs. BGPLDP vs. BGP

both use TCP for reliable transport (LDP uses UDP for hellos)

both are hard-state protocols

both use TLV format for parameters

BGP

multiprotocol (IPv4, IPv6, IPX, MPLS)

highly complex protocol

provides routing / label distribution

built-in autodiscovery mechanism

LDP

MPLS only

simpler protocol

only label distribution

extendable for autodiscovery

3.2

Y(J)S MPLS Slide 75

LDPLDPmajor focus of the IETF MPLS WG was the design of LDP

based on similar TDP from Cisco

LDP sets up a bidirectional LDP session both sides can request or advertise labels

LDP usually uses TCP– needs reliable transport (e.g. what happens if miss a binding)– needs in-order delivery (e.g. binding+withdrawal)– hard to develop new reliable transport protocols– single acknowledgement timer for session– piggybacking ACK on data packets

Use UDP for discovery (hello) messages

periodic keepalive messages (if not received, session terminated)

messages encoded in TLV (Type Length Value) form3.2

Y(J)S MPLS Slide 76

LDP SetupLDP Setup

Discovery

Hello (UDP)

Session

Initialization (TCP)

Distribution

Label Request (TCP)

Label Distribution (TCP)

Hello (UDP)

Initialization (TCP)

3.2

Y(J)S MPLS Slide 77

Discovery PhaseDiscovery Phase

LSR periodically multicast transmits hello to “LDP discovery” UDP port– to “all routers on subnet” multicast group– to preconfigured IP address (when not all LSRs on same subnet) (extended discovery) “targeted LDP”

LSRs listen on this UDP port for hello messages Hello message contains:

– hold time– LSR Identifier

when LSR receives Hello from another LSR– it opens a TCP connection to that other LSR (if needed)

or (for extended discovery)– it unicast transmits a hello back to the other LSR

LDP session can now be established

3.2

Y(J)S MPLS Slide 78

Session InitializationSession Initialization

The LSR with higher identifier sends (TCP)a session initialization message to the other LSR

session initialization message contains:– LDP Protocol version– label distribution and control method– timer values– label space ranges (not any more !)

if receiving LSR accepts these parameters then it transmits a KeepAliveelse it transmits a reject

3.2

Y(J)S MPLS Slide 79

Distribution MessagesDistribution Messages

label mapping– downstream LSR advertisement of a label mapping for a FEC two FEC types: host address, IP address prefix

label withdrawal– reverse of mapping message– downstream LSR informs upstream LSR

that it has revoked a previous binding– upstream LSR can not longer use the label

label release– upstream LSR informs downstream LSR

that it no longer needs a binding– typically when downstream is no longer next hop

and operating in conservative retention mode

3.2

Y(J)S MPLS Slide 80

Request MessagesRequest Messagesin downstream-on-demand mode upstream LSR must request binding

upstream LSR sends label request message when:

– FEC in FEC table next hop LSR is LDP peer FEC not in forwarding table

– FEC next hop changes upstream LSR doesn’t have a mapping from new next hop

– receives FEC label request from upstream LDP peer next hop LSR is LDP peer upstream LSR doesn’t have a mapping from next hop

upstream LSR sends label request abort message when:– upstream LSR needs to revoke request before satisfied for example, next hop LSR for FEC has changed

3.2

Y(J)S MPLS Slide 81

NotificationsNotifications

There are two types of notifications:– error notifications (fatal errors - terminate session)– advisory notifications (status messages)

LSR sends notification messages when:– received LDP message with unsupported protocol version– received LDP message with unknown type– KeepAlive timer expired– session initialization fails due to unacceptable parameters – etc.

3.2

Y(J)S MPLS Slide 82

LDP state machineLDP state machine

LSR periodically transmits hello UDP messages– multicast to “all routers on subnet” group– targeted to preconfigured IP address

LSRs listen on this UDP port for hello messages

when LSR receives hello from another LSR– it opens a TCP connection to that other LSR

or (for extended discovery)– it unicast transmits a hello back to the other LSR

LSR with higher ID sends session initialization message

other LSR LDP accepts (sends keepalive) or rejects

informative or keepalive messages sent

3.2

Y(J)S MPLS Slide 83

LDP packet formatLDP packet format

version – presently 1

length - PDU length, excluding version and length fields

LDP-ID – identifies label space of sending LDP peer– LSR-ID(4B) globally unique LSR ID– label space ID (2B) for per-port label spaces

(zero for per-platform label spaces)

message TLVs – zero or more message TLVs (see next page)

version(2B)

length(2B)

LDP-ID(6B)

message TLVs (variable)

header (10B)

3.3

Y(J)S MPLS Slide 84

LDP message TLVsLDP message TLVs

U – unknown message bitif message type unknown to receiverU=0 – receiver returns notification to senderU=1 – receiver silently ignores

length - message length, excluding type and length fields

message-ID – unique ID for message (for matching with returned notification)

if there are mandatory parameters, they most appear in a specific order

optional parameters may appear in any order

type(15b)

length(2B)

message-ID(4B)

mandatoryparameter

TLVs (variable)

optional parameter

TLVs (variable)

U

3.3

Y(J)S MPLS Slide 85

All LDP message typesAll LDP message types

Hello (0x0100)

Initialization (0x0200)

KeepAlive (0x0201)

Notification (0x0001)

Address (0x0300)

Address Withdraw (0x0301)

Label Mapping (0x0400)

Label Withdraw (0x0402)

Label Request (0x0401)

Label Release (0x0403)

Label Abort Request (0x0404)

3.3

Y(J)S MPLS Slide 86

LDP parameter TLVsLDP parameter TLVs

U – unknown message bitif message type unknown to receiver:

U=0 – receiver returns notification to sender U=1 – receiver silently ignores

F – forward unknown message bit if U=1 and message type unknown to receiver

F=0 – do not forwardF=1 – forward

type - TLV type (FEC TLV, label TLV, address list TLV, hop count TLV, path vector TLV, status TLV )

length - length of value field in bytes

type(14b)

length(2B)

valueU

3.3

F

Y(J)S MPLS Slide 87

FEC TLVFEC TLV

there may be more than one FEC element for mapping messages only

the FEC elements are not themselves TLVs (no length needed), instead– wildcard FEC (0x01)– prefix FEC (0x02) + address family (IPv4, IPv6, Ethernet, E.164, etc.) +

prefix length in bits + prefix– host address FEC (0x03) + address family + length + address

type (2B)U=0 F=0 type=0x0100

length (2B)

3.3

FEC element 1

…

Y(J)S MPLS Slide 88

Generic Label TLVGeneric Label TLV

this is the generic label TLV

there are also special label TLVs for ATM and FR based MPLS

type (2B)U=0 F=0 type=0x0200

length (2B)

3.3

label (20 bits)

Y(J)S MPLS Slide 89

Status TLVStatus TLV

Status TLVs : mandatory parameters in notification messages optional parameters in other messagesU=0 when status in notification message, else U=1E - fatal error bit E=1 for fatal error E=0 for advisory notificationthe two F bits are equal and have the normal meaningstatus code data = 0 means successmessage ID - the message to which the status refersmessage type - the message type to which the status refers

type (2B)U F type=0x0300

length (2B)

3.3

status code data (30b)

message ID (32b)

E F

message type (16b)

Y(J)S MPLS Slide 90

Example full message - Example full message - label mappinglabel mapping

mapping messagetype = 0x0400

length=24(2B)

message-ID(4B)U=0

3.3

FEC TLV (2B)U=0 F=0 type=0x0100

length=8 (2B)

prefix FEC element (8B)type = 0x02 family=1(IPv4) prefix-length=16 prefix=192.115/16

label TLV (2B)U=0 F=0 type=0x0200

length=4 (2B)

label = 17

Y(J)S MPLS Slide 91

BGP4 Label DistributionBGP4 Label Distribution

BGP peers exchange VPN routescan easily associate a label with these routes

all BGP procedures are immediately available for use for label distribution messages

BGP4 is a very extensible protocol – multiprotocol extensions support address families (originally for IPv4,IPv6, etc)– MPLS defines a new address family

3.3

Y(J)S MPLS Slide 92

BGPBGP

marker can be used for authentication (TCP MD5 signature)

length is total BGP PDU length, including header

type– OPEN (for session initialization)

– UPDATE (add, change and withdraw routes)

– NOTIFICATION (return error messages, terminate session)

– KEEPALIVE (heartbeat)

KEEPALIVE packet consists of 19B header only

marker (16B)

length (2B)

type(1B)

data (variable)

header (19B)

3.3

Y(J)S MPLS Slide 93

BGP state machineBGP state machine

idle – no session (awaiting session initialization)

connect – attempting to connect to peer

active – started TCP 3-way handshake (router busy)

open sent – have sent OPEN message

open confirm – after receiving TCP SYN for OPEN message

established – BGP session up and running

3.3

Y(J)S MPLS Slide 94

BGP OPENBGP OPEN

version (3 or 4)

my AS – identifier of autonomous system

hold time – max time (sec) between receipt of messages

BGP ID – sender’s BGP identifier

op len – length (bytes) of optional parameters

opt parameters - TLVs

version (1B)

my AS (2B)

hold time (2B)

opt parameters (variable)

BGP-ID (2B)

op len (1B)

3.3

Y(J)S MPLS Slide 95

BGP UPDATEBGP UPDATE

Withdrawn Routes – list of routes no longer to be used (NLRI format- see below)

Path Attributes – route specific information (see next page)

Network Layer Reachability Information – (classless) routing information

the NLRI is a list of address-prefixeseach prefix must be masked from the left to the length specified

WR len (2B)

withdrawnroutes(var)

PA len(2B)

NLRI(var)

path attributes

(var)

len(1B)

prefix (variable)

3.3

Y(J)S MPLS Slide 96

BGP UPDATE - Path AttributesBGP UPDATE - Path Attributes

flags

O – optional/well-known bitif 1 must be recognized by all BGP implementationsif W=1 and unrecognized attribute, BGP sends notification and session closed

T – transitive/nontransitive bitif 1 and attribute unrecognized it is passed along, else silently ignoredwell-known attributes are always transitive

C – complete/partial bit (for optional transitive attributes only)

L – attribute length bit (=0 attribute length is 1B, =1 length is 2B)

type codeORIGIN, AS_PATH, NEXT_HOP, MED, LOCAL_PREF,AGGREGATOR, COMMUNITY, ORIGINATOR_ID…

flags(1B)

type code(1B)

3.3

Y(J)S MPLS Slide 97

BGP NOTIFICATONBGP NOTIFICATON

all notification messages cause BGP session to close

error codes include:– message header error– open message error– update message error– hold timer expired– state machine error– other fatal error

errorcode (1B)

errorsubcode(2B)

data(var)

3.3

Y(J)S MPLS Slide 98

MPLS Traffic Engineering

4.1 Introduction to Traffic Engineering

4.2 DiffServ and IntServ

4.3 TE protocols

A

B

C

D

E F

G

Y(J)S MPLS Slide 99

Traffic EngineeringTraffic EngineeringTE is control of network traffic to achieve specific objectives

unfortunately users and providers have contradictory objectives

user objectives (QoS)– network availability– packet loss– end-to-end delay– round-trip delay– packet delay variation (PDV)– error rate

provider objectives (performance)– bandwidth utilization– resource utilization– speed of failure recovery– ease of management– monetary outlay

4.1c

Y(J)S MPLS Slide 100

Network and Traffic EngineeringNetwork and Traffic Engineering

Network Engineeringputting the bandwidth where the traffic is– physical cable deployment (thick pipes)– over-provisioning– virtual connection provisioning– violates providers objectives

Traffic Engineeringputting the traffic where the bandwidth is– explicit traffic routing– route optimization– can it meet user objectives?

4.1


IP’s ProblemIP’s Problem

were C,D,E, and F ATM switches there would be no problem (1Gb over ACDG, 500Mb over BCEFG)

with standard hop-count cost function, all traffic over CDGresulting in 1.5Gb there (congestion) and CEFG idle

with administrative cost on CDG we can force all the traffic to CEFGeven worse congestion !

finally with administrative cost and ECMP we can load balance750 Mb over CDG and CEFG, link EF is still congested !

what can we do?

A

B

C

D

E F

Gtraffic from A to G 1Gbtraffic from B to G 500Mball links 1Gb except EF 500 Mb

4.1

1

0.5


SolutionSolutionIP’s problem arises from

– the forwarding being purely local– but the routing being too global

IP routing always optimizes a global (usually additive) metricdiscrete optimization problem

does not take local constraints (e.g. BW of individual links) into account

we need to optimize a global metricwhile taking local constraints into account

this is called constraint-based routingdiscrete optimization with inequality constraints

main constraint is BW, but also maximum delay, packet loss, etc.another constraint: explicit include/exclude links/routers 4.1


MPLS TEMPLS TEMPLS that we have seen so far allows explicit routing

– can minimize number of hops– can tailor LSP to needs

MPLS-TE LSPs can be setup according to constraints– only include in LSP LSR with sufficient available BW– only include in LSP LSR that guarantees sufficiently low delay

traffic engineering achieved by extending base protocols enhanced routing protocols (resource reservation)

– OSPF OSPF-TE– ISIS ISIS-TE

constraint-based signalling protocols (pinned LSPs and reservation)– RSVP RSVP-TE– LDP CR-LDP (no longer under active development)

4.1


IP FixesIP Fixes

two approaches to QoS were developed for IP (never popular)

IntServ (guaranteed QoS)– define traffic flows (CO approach)– guarantee QoS attributes for each flow– reserve resources at each router along the flow– signaling protocol (RSVP) needed

DiffServ (statistical QoS)– retain CL paradigm– no guaranteed QoS attributes– classify packets (differentiated)– offer special treatment (priority) relative to other packets– no resource reservation

4.2


DiffServDiffServ

DIffServ was developed in IETF after IntServ

IntServ too heavy-weight (and too revolutionary) for most purposes

resource reservation is against IP-philosophyif not enough BW, then more democratic for all to sufferif reserve BW and don’t use, then this is simply over-provisioning

DiffServ is evolutionary “coarse-grained” approach to IP QoS

DiffServ – divides traffic into service classes

and allocates resources on a per-class basis– uses 6 bits of ToS byte in IP header to mark packets

field is renamed Differentiated Services Code Point no setup or router state required

– DSCP defines per-hop behaviors (PHB) tells router how to treat packet

– three standard PHBs (BE, AF, EF)

RFCs 2474, 2475

4.2


DiffServ PHBsDiffServ PHBs

Best Effort– standard IP service– QoS depends on momentary network load

Assured Forwarding– AF specifies class that determines queue– in addition, three drop-precedence levels (low, med, high)– AF packets from a single source should not be mis-ordered even if have different drop-precedence (i.e. single queue)

Expedite Forwarding– EF packet should experience no queuing delays– EF packets should have low loss– implemented by dedicated EF router queue

4.2


MPLS and DiffServMPLS and DiffServ

what does all this have to do with MPLS ?

MPLS will have to co-exist with DiffServ IP

MPLS provides similar functionality

Exp bits in MPLS shim header are similar to DSCPs– only three bits (8 classes) while DiffServ ended up with 6 bits– but 8 service classes is usually more than enough commonly 4 classes are offered (bronze, silver, gold, platinum)

LSR can maintain mapping from Exp to PHBe.g. Exp=000 for BE, Exp=001 for AF low drop precedence, etc

no new signaling mechanism required

LSP with Exp mapping to PHB is called an E-LSP

ingress LER assigns the EXP field

4.2


L-LSP vs. E-LSPL-LSP vs. E-LSPwith DiffServ each FEC is defined by

– destination address– service class

there are two ways of implementing the DiffServ mapping

L-LSP (Labeled-inferred LSP)– behavior based on label alone– support different service classes by using different labels– LSP BW allocated from specific queue (class)– Exp may be used for drop precedence

E-LSP (Exp-inferred LSP)– behavior based on label + Exp field– LSP BW allocated from link– can support 8 different service classes per label

link BWL-LSP

L-LSPL-LSP

BW %

4.2


IntServIntServ

IntServ is an overall QoS architecture (not just RSVP)

IntServ is a radical departure from pure IPand requires IntServ-enabled routers

IntServ– enables providing end-to-end QoS guarantees

for services that need them– defines flows (introduces CO to IP’s CL architecture)

flows are classified into three service classes (BE,CLS,GS)– specifies admission control and policing– like all CO architectures, requires signaling protocol (RSVP)

IntServ-enabled routers– reserve needed resources along the flow’s path– routers must retain state

RFCs 2205-2216, 2379-2382

4.2


IntServ CoSIntServ CoS

Best Effort– standard IP service– QoS depends on momentary network load

Controlled Load Service– service equivalent to unloaded network– low packet loss– most packets will experience delay close to minimum– no quantitative guarantees

Guaranteed Service– bounded worst case delay (no PDV guarantee)– low packet loss (zero if node buffers correctly provisioned)– quantitative guarantees

4.2


RSVPRSVP

primary signaling protocol for IntServ

runs over raw IP or UDP/IP

does not setup path, uses path setup by routing protocols

sessions identified by source and destination socket numbers

requests unidirectional QoS characteristics from network

causes routers along path to reserve link and node resources

network responds with success/failure

routers maintain soft-state, reservations time-out unless refreshed

two main message types: PATH and RESV4.2

RSVP


RSVP MessagesRSVP Messages

PATH– message from sender to receiver(s)– carries classification info and TSpecs

RESV– response of receiver to PATH message– carries session ID and RSpec specifying QoS required– this is the actual request for resource reservation

receivers

sender

4.2


MPLS and IntServMPLS and IntServ

what does all this have to do with MPLS ?

MPLS extension of RSVP : RSVP-TE (RFC 3209) can be used as a label distribution protocol

for downstream-on-demand binding

create and distribute bindings between RSVP flows and labelsNote: requests by LSRs and hosts (RSVP only by hosts)

uses label instead of source and destination socket numbers

extends RSVP by adding new objects (e.g. label) and procedures

allows strict/loose explicitly routed LSPs

like LDP there is peer discovery, label requests, binding messages

QoS and traffic parameters are carried transparently

4.2


CR-LDPCR-LDPRSVP-TE is not the only way of setting up a LSP for TE

Constraint-based Routing LDP (CR-LDP) extends LDP

basic LDP lacks– explicit routing– resource reservation

so CR-LDP adds– explicit route object, carried as TLV in label request – traffic parameter object

peak/committed data rate peak/committed burst sizes etc.

and since we are already extending the basic protocol …– path pre-emption priority– path re-optimization (with pinning option)– LSP setup failure notification– automatic LSP failure recovery policies– etc. 4.3


Setup ProceduresSetup Procedures

Example setupwith explicit routes

CR-LDP: A reserves resources, sends label request w/ explicit route BC & resource req.s B reserves resources and sends label request w/ explicit route C & resource req.s C reserves resources, locally binds label and sends label mapping to B B matches mapping to previous request, remotely binds, sends label mapping to A A remotely binds label

RSVP-TE: A sends PATH message to B w/ explicit route BC and resource requirements B forwards PATH message to C after changing explicit route to C C determines required resources, reserves, locally binds label and sends RESV to B B matches, reserves resources, remotely binds, and sends RESV to A A matches, remotely binds label and reserves resources

A CBingress egress

4.3


RSVP-TE/CR-LDP Comparison RSVP-TE/CR-LDP Comparison 1/21/2

CR-LDP extends an MPLS label distribution protocol adding explicit route and resource reservation features

RSVP-TE extends a pure IP resource reservation protocol to MPLSrequires running two protocols

both distribute binding information

both support explicit routing, pinning, and resource reservation

CR-LDP uses TCP (reliable) message order is guaranteedlost packet delays all processing

RSVP-TE uses IP (in core) or UDP/IP (unreliable)needs refreshes for reliability

RSVP-TE supported by Cisco and Juniper, and selected by IETF

CR-LDP supported by Nortel and Nokia, was favorite of ITU-T4.3


RSVP-TE/CR-LDP Comparison RSVP-TE/CR-LDP Comparison 2/22/2

CR-LDP is hard-state

RSVP-TE is soft-state– automatically tracks changes– needs refresh overhead (can be reduced by bundling)

both use ordered binding from egress to ingress after first traversing from ingress to egress

but CR-LDP reserves resources in forward directionRSVP-TE reserves in backward direction

RSVP-TE uses standard IntServ QoS model (TSpec/RSpec)

CR-LDP uses its own QoS model (traffic parameters)

CR-LDP may be vulnerable to TCP DoS attacks

RSVP-TE specifies authentication control

4.3


OSPF-TE and IS-IS-TEOSPF-TE and IS-IS-TEconstraint-based routing needs link attribute information

most important - available BW

link-state protocols have mechanisms to flood link-up/link-downto every router in the domain

piggyback attribute as TLV on link-up/link-down messages– OSPF add to Link-State Advertisement (LSA) (RFC 3630)– IS-IS add to Link-State Packets

once router receives attribute information it can reserve resource

when attribute information changes need to reflood

but every allocation changes attributes!

to decrease overhead: only flood when change passes threshold inherent timing bounds

4.3


MPLS Applications

5.1 L3 VPNs

5.2 L2 VPNs

5.3 PWE3


2547bis (RFC 4364) VPNs2547bis (RFC 4364) VPNs

Virtual router (peering) model, not tunneling

PE maintains Virtual Route Forwarding table for each VPN

BGP (with multiprotocol extensions) used for label distribution

in order to support private IP addresses in BGPPE prepends 8B Route Distinguisher (unique to site) to IP address

PEPE

CE peer to PE

CE not peer to CE

ext label

IP Packet

SP label

CE is IP router

P PC

CEC

C

CC

C

CE

5.1


BGP MPLS VPNs (2547bis)BGP MPLS VPNs (2547bis)

presently most popular provider managed VPN

originally specified in RFC 2547, update in draft called RFC 4364

transports IPv4 (IPv6) traffic in MPLS tunnels

uses BGP for route distributionsince SPs commonly use BGP for routing

2547 is not an overlay model– CE routers at different sites are not routing peers – they do not directly exchange routing information– they don’t even need to know of each other– so customer needn’t manage a backbone or virtual backbone– no inter-site routing problems

5.1


BGP MPLS VPNs (cont.)BGP MPLS VPNs (cont.)

only PE routers maintain VPN informationP routers needn’t maintain any customer routing information

C routes either manually configured in PE or advertised to PE using BGP, OSPF, etc.

PE advertises routes to remote PEs using BGPremote PEs advertise routes to their CEs using BGP, OSPF, etc.

IP address overlap solved using Route Distinguisher (RD)

5.1


L2VPN - VPWSL2VPN - VPWS

Virtual Private Wire Service is a L2 point-to-point service

it emulates a wire supporting the Ethernet physical layer

set up MPLS tunnel between PEsset up Ethernet PW inside tunnel

CEs appear to be connected by a single L2 circuit

(can also make VPWS for ATM, FR, etc.)

CE CEAC AC

PE PE

provider network

5.2


L2VPN - VPLSL2VPN - VPLS

VPLS emulates a LAN over an MPLS network

set up MPLS tunnel between every pair of PEs (full mesh)set up Ethernet PW inside tunnels, for each VPN instance

CEs appear to be connected by a single LAN

PE must know where to send Ethernet frames …but this is what an Ethernet bridge does

CE

CE

CE

for clarity only one VPN is shown

AC

AC

AC

PE

PE

PE

5.2


VPLSVPLS

a VPLS-enabled PE has, in addition to its MPLS functions: VPLS code module (IETF drafts) Bridging module (standard IEEE 802.1D learning bridge, but

no STP)

SP network (inside rectangle) looks like a single Ethernet bridge!

Note: if CE is a router, then PE only sees 1 MAC per customer location

CE

CE

V B

V B

B V

CE

5.2


Pseudowire EmulationPseudowire Emulation

This is a special case of “Pseudowire Emulation Edge to Edge”, or PWE3

ProviderEdge

(PE)Customer

Edge

(CE)

CustomerEdge

(CE)

CustomerEdge

(CE)

nativeservice

PacketSwitchedNetwork

(PSN)

PseudoWires (PWs)

[tunnels]

CustomerEdge

(CE)

CustomerEdge

(CE)

ProviderEdge

(PE)nativeservice

5.3


IETF PWE3 WGIETF PWE3 WG

In the Internet Area of the Internet Engineering Task Force

Native (layer 1,2) services : ATM (port mode, cell mode, AAL5-specific modes) FR Ethernet (DIX, 802.3, VLAN) TDM (SONET/SDH, E1, T1, E3, T3)

PSNs IPv4 IPv6 MPLS L2TPv3

5.3


PWE3 WG CharterPWE3 WG Charter

Edge-to-edge emulation and maintenance of PWs– tunnel creation and placement out of scope

Network interworking, not service interworking Must not exert controls on underlying PSN

– but diffserv, RSVP-TE can be used

Use RTP when necessary– for real-time functions, clock recovery

Realize that emulation will not be perfect– need applicability statement for each native service

WG will produce the following documents– framework, requirements, architecture documents– service specific encapsulation documents for each native service

5.3


Why MPLS for PWE3 ?Why MPLS for PWE3 ?

Much of the PWE work is focused on MPLS Emulated services have QoS and TE requirements

– IP is basically a “best effort” service– diffserv and RSVP extensions not prevalent– MPLS can provide TE guarantees – RSVP-TE (CR-LDP) allows TE signaling

IP provides no standard connection multiplexing method– UDP/TCP ports provide application multiplexing– RTP uses ports in a nonstandard way– L2TP includes a multiplexing mechanism– MPLS label stack provides natural multiplexing method– Using “inner labels” provides two layers of switching (like ATM VP/VC)

ATM-f inner label outer label(s)Dictionary: ITU-T interworking label transport label(s)

IETF PW label tunnel label(s)5.3


PWE packet formatPWE packet format

outer label

inner label

controlword

Payload

• inner and outer labels specify forwarding and multiplexing• outer label is standard MPLS stack• inner label contains PW label

• control word• enables detection of out-of-order and lost packets• flags may indicate critical alarm conditions (TDM PWs)• PW PID nibble

5.3

Documents

MPLS Yaakov (J) Stein February 2006 Chief Scientist RAD Data Communications