MPLS
Yaakov (J) Stein, Chief Scientist, RAD Data Communications, February 2006
Y(J)S MPLS Slide 2
Why study MPLS?
it’s new
it’s different and interesting
along with IPv6 it will save the Internet
it enables delivery of IP VPN services
it facilitates pseudowire functionality
all of the above
Y(J)S MPLS Slide 3
Course Outline
1) Introduction
2) MPLS Forwarding Component
3) MPLS Control Component
4) MPLS Traffic Engineering
5) MPLS Applications
Y(J)S MPLS Slide 4
Introduction
1.1 Review of forwarding and routing
1.2 Problems with IP routing
1.3 Label Switching
Y(J)S MPLS Slide 5
Forwarding and Routing
a router (switch) performs 2 distinct algorithms:
– forwarding algorithm (forwarding component)
– routing algorithm (control component), which may be manual configuration or a signaling protocol
from the forwarding point of view there are three different types of network:
– broadcast (e.g. Ethernet)
– connectionless (e.g. IP)
– connection oriented (e.g. ATM)
1.1
Y(J)S MPLS Slide 6
Broadcast (e.g. Ethernet)
message is sent to all possible recipients
all receivers check message destination address
if message destination address does not match receiver’s address, message ignored (except in promiscuous mode)
if message destination address matches receiver’s address, message processed
[Figure: stations on a shared medium; one station answers “for me!!”, the others “not for me”]
1.1
Y(J)S MPLS Slide 7
Connectionless Forwarding
broadcast is only practical for small networks (e.g. LANs)
large networks can be broken into subnetworks, with routers performing the internetworking
such a network is connectionless (CL) if
– no setup is required before sending data
– each router makes an independent forwarding decision
– packets are self-describing
– a packet inserted anywhere will be properly forwarded
Notes:
– the address must have global significance
– IP actually relies on a L2 protocol (Ethernet, PPP) for the final stage
CL forwarder (router)
1.1
Y(J)S MPLS Slide 8
IP Routing
Distance Vector (Bellman-Ford), e.g. RIP
– send <addr,cost> to neighbors
– routers maintain cost to all destinations
– need to solve the “count to infinity” problem
Path Vector, e.g. BGP
– send <addr,cost,path> to neighbors
– similar to distance vector, but w/o the “count to infinity” problem
– like distance vector, has slow convergence*
– doesn’t require consistent topology
– can support hierarchical topology => exterior protocol (EGP)
Link State, e.g. OSPF, IS-IS
– send <neighbor-addr,cost> to all routers
– determine entire flat network topology (Dijkstra’s algorithm)
– fast convergence*, guaranteed loopless => interior routing protocol (IGP)
* convergence time is the time taken until all routers work consistently; before convergence is complete packets may be misforwarded, and there may be loops
1.1
Y(J)S MPLS Slide 9
IP Forwarding
what field is used for forwarding? the field used depends on the application
– regular: use longest destination prefix match
– ToS: use longest destination prefix match + exact match on ToS
– multicast: use longest/exact match on source/group address
how is forwarding performed? the forwarding algorithm depends on the routing algorithm!
– for Distance Vector: forward by cost
– for Path Vector: forward by cost and path
– for Link State: Dijkstra’s algorithm
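The "longest destination prefix match" used for regular IP forwarding can be sketched in a few lines of Python. This is only an illustration: the route table contents and next-hop names below are made up, and a real router would use a trie or hardware lookup rather than a linear scan.

```python
import ipaddress

# Hypothetical routing table: prefix -> next hop name (illustrative values only).
ROUTES = {
    "0.0.0.0/0": "default-gw",
    "192.115.0.0/16": "router-A",
    "192.115.243.0/24": "router-B",
}

def longest_prefix_match(dest, routes=ROUTES):
    """Return the next hop whose prefix matches dest with the longest mask."""
    addr = ipaddress.ip_address(dest)
    best_len, best_hop = -1, None
    for prefix, hop in routes.items():
        net = ipaddress.ip_network(prefix)
        # keep the matching entry with the most specific (longest) prefix
        if addr in net and net.prefixlen > best_len:
            best_len, best_hop = net.prefixlen, hop
    return best_hop
```

Note that the whole table is scanned for every packet, which is the cost the scalability discussion refers to.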
1.1
Y(J)S MPLS Slide 10
Connection Oriented Forwarding
each forwarder maintains a forwarding table (or a table per input port)
control component:
– route must be set up (table must be updated) before data is sent
– set-up may be manual or signaled
– once the route is no longer needed it should be torn down
[Figure: CO forwarder (switch) with input ports 1-5 and output ports 1-5]
1.1
Y(J)S MPLS Slide 11
CO Forwarding
CO addresses need not have global significance
by using locally defined addresses:
– addresses are smaller in size
– no need for global allocation mechanism
– no need to maintain global database
L2 forwarding is based on address read and swap
Note: when addresses are purely local, CO forwarding is called L2 (link layer) forwarding
[Figure: the address is swapped hop by hop: 12, 17, 23]
1.1
Y(J)S MPLS Slide 12
CO Forwarding Table
input port   input address   output port   output address
1            21              2             21
1            37              1             21
2            12              5             12
3            5               5             12
4            15              3             37
Note: we never really have an input port column; either it is irrelevant (CO address is per switch, not per input port), or, if needed (e.g. ATM), we keep separate tables per input port (more efficient)
1.1
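As a rough sketch of "address read and swap", the forwarding table above can be held as a dictionary keyed by (input port, input address). The function name and the error handling are illustrative choices, not part of the slides.

```python
# (input port, input address) -> (output port, output address), rows from the table above
CO_TABLE = {
    (1, 21): (2, 21),
    (1, 37): (1, 21),
    (2, 12): (5, 12),
    (3, 5): (5, 12),
    (4, 15): (3, 37),
}

def co_forward(in_port, in_addr, table=CO_TABLE):
    """Read the incoming address, swap it, and return (out_port, out_addr)."""
    try:
        return table[(in_port, in_addr)]
    except KeyError:
        # in a CO network nothing can be forwarded before the route is set up
        raise LookupError(f"no connection set up for port {in_port}, address {in_addr}")
```

The lookup is a single exact match, in contrast to the longest-prefix search of IP forwarding.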
Y(J)S MPLS Slide 13
CO Routing
for CO forwarding we need to find a route at setup
– manual configuration (strict explicit route)
– use routing protocols (e.g. P-NNI for ATM)
CO routing protocols search for a global path
– hence they can guarantee global characteristics (SLA)
– pure CL routing can not guarantee path QoS
information source usually knows desired characteristics, so usually use source routing
source routing
– can force route to go through specific forwarders (loose explicit route)
– can reject forwarders that can not guarantee needed characteristic
– can request resource reservation in selected forwarders
1.1
Y(J)S MPLS Slide 14
Problems with IP routing
scalability
– router table overload
– routing convergence slow-down
– increase in queuing time and routing traffic
– problems specific to underlying L2 technologies
hard to implement load balancing
QoS and Traffic Engineering
problem of routing changes
difficulties in routing protocol update
lack of VPN services
1.2
Y(J)S MPLS Slide 15
Scalability
when IP was first conceived scalability was not a problem
as IP traffic increases, routing shows stress
simplistic example
– N hosts
– each router serves M hosts
– each router entry takes a bytes
hence
– router table size a·N ~ N
– N / M ~ N routers (more routers => slower convergence)
– packet processing time* ~ N (since have to examine entire table)
– ~N routers send to ~N routers tables of size ~N, so routing table update traffic increases ~N^3 (or ~N^4)
* IP routing requires 1000s of clocks per decision
1.2
Y(J)S MPLS Slide 16
L2 Backbone Scalability
instead of expensive and slow IP routers we can use faster and cheaper ATM switches in the core
but this doesn’t help!
ATM switches are transparent to routers
since ATM switches do not participate in IP routing protocols, every IP router must be logically adjacent to every other
and we need ~N^2 ATM VCs!
if only the ATM switches could understand IP routing protocols!
[Figure: Classical IP over ATM]
1.2
Y(J)S MPLS Slide 17
Load Balancing
using hop-count as path cost, traffic destined for G from both A and B goes through D
so links CD and DG become congested while links CE, EF, and FG are underutilized
since IP forwarding uses only the destination address, this can’t be overcome in pure IP routing
the problem exists even in the equal-cost multi-path (ECMP) case
the solution requires traffic engineering
1.2
[Figure: “fish diagram” with nodes A, B, C, D, E, F, G]
Y(J)S MPLS Slide 18
QoS and TE
pure IP, being CL, can not guarantee path QoS, but other protocols in the IP suite can help
– TCP adds a CO layer compensating for loss and misordering
– IntServ (RSVP) sets up a path with reserved resources
– DiffServ (ToS) prioritizes packets
(neither -Serv widely deployed)
so IP network managers mostly use network engineering, i.e. throw BW at the problem
rather than traffic engineering, i.e. optimally exploit the BW you have
1.2
Y(J)S MPLS Slide 19
Routing Changes
IP routing is satisfactory in the steady state
but what happens when something changes?
change in routing information (new router, router failure, new inter-router connection, etc)
necessitates updating of tables of all routers
convergence will be slow
change in routing protocol is even worse (e.g. Bellman-Ford to present, classes to classless, IPv4 to IPv6)
necessitates upgrade of all router software
upgrade may have to be simultaneous
need more complete separation of forwarding from routing functionality
1.2
Y(J)S MPLS Slide 20
VPN Services
IP was designed to interconnect LANs, not to provide VPN services
all routers logically interconnected, so security weak
LANs may use non globally unique addresses (present solution - NAT)
complex provider - customer relationship
VC-merge problem (discussed later)
1.2
[Figure: LANs joined by an IP network; hosts 192.115.243.19, 192.115.243.19 and 192.115.243.79, the same address appearing twice]
Y(J)S MPLS Slide 21
Solution - Label Switching
label switching adds the strength of CO to CL forwarding
label switching has three stages:
– routing (topology determination) using L3 protocols
– path setup (label binding and distribution)
– data forwarding
label switching is the solution to all of the above problems
– speeds up forwarding
– decreases forwarding table size (by using local labels)
– load balance by explicitly setting up paths
– complete separation of routing and forwarding algorithms
no new routing algorithm needed, but a new signaling algorithm may be needed
[Figure: CO forwarder (switch), CL forwarder (router), LSR]
1.3
Y(J)S MPLS Slide 22
Where is it?
unlike TCP, the CO layer lies under the CL layer
if there is a broadcast L2 (e.g. Ethernet), the CO layer lies above it
hence, label switching is sometimes called layer 2.5 switching
1.3
higher layers
layer 3 (e.g. IP)
label switching (layer 2.5)
layer 2 (e.g. ethernet)
physical layer
shim header
Y(J)S MPLS Slide 23
Labels
a label is a short, fixed length, structure-less address
the following are not labels:
– telephone number (not fixed length, country-code+area-code+local-number)
– Ethernet address (too long, note vendor-code is not meaningful structure)
– IP address (too long, has fields)
– ATM address (has VP/VC)
not an explicit requirement, but normally only local in significance
label(s) added to CL packet, in addition to L3 address
layer 2.5 forwarding
– may find a different route than the L3 forwarding
– is faster than L3 forwarding
– requires a flow setup process and signaling protocol
1.3
Y(J)S MPLS Slide 24
Forwarding Equivalence Class
equivalence class - set of entities sharing common characteristics that can be considered equivalent for some purpose
Theorem (from set theory) - any equivalence relation (e.g. common features) divides all entities into non-overlapping equivalence classes
any forwarding algorithm need only consider
– destination (and sometimes source) address
– service requirements (e.g. priority, BW, allowed delay, etc)
we can group together all packets with the same destination and service requirements as a FEC
by the theorem every packet belongs to one unique FEC
packets in the same FEC should follow the same route, so we should map them to the same label (bind them)
1.3
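A minimal sketch of FEC-to-label binding, assuming a FEC is the pair (longest matching destination prefix, service class). The class name `FecBinder`, the prefix list and the starting label value 16 (chosen to stay clear of the low reserved label values) are illustrative assumptions, not taken from the slides.

```python
import ipaddress
from itertools import count

class FecBinder:
    """Maps (destination prefix, service class) FECs to locally chosen labels."""

    def __init__(self, prefixes, first_label=16):
        self.prefixes = [ipaddress.ip_network(p) for p in prefixes]
        self._next = count(first_label)   # source of unused labels
        self.bindings = {}                # FEC -> label

    def fec_of(self, dest, service):
        """Return the FEC a packet belongs to, or None if no prefix matches."""
        addr = ipaddress.ip_address(dest)
        matches = [n for n in self.prefixes if addr in n]
        if not matches:
            return None
        best = max(matches, key=lambda n: n.prefixlen)
        return (str(best), service)

    def label_for(self, dest, service):
        """Bind the packet's FEC to a label on first use and return the label."""
        fec = self.fec_of(dest, service)
        if fec is None:
            return None
        if fec not in self.bindings:
            self.bindings[fec] = next(self._next)
        return self.bindings[fec]
```

Because every packet falls into exactly one FEC, two packets with the same prefix and service class always receive the same label, while a different service class yields a different label.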
Y(J)S MPLS Slide 25
FECs
what constitutes a FEC?
for IP routing (de facto): all packets w/ the same destination IP prefix, that prefix being the longest in the routing table
for MPLS we can decide!
– coarsest granularity: all packets with a destination address served by a given router
– finest granularity: all packets from a given source socket to a given destination socket with specified handling requirements
1.3
Y(J)S MPLS Slide 26
Label Switching Architecture
label switching is needed in the core, access can be L3 forwarding*
core interfaces the access at the edge (ingress, egress)
LSR: router that can* perform label switching
LER: LSR with non-MPLS neighbors (LSR at edge of core network)
LSP: unidirectional path used by label switched forwarding (ingress to egress)
* not every packet needs label switching (e.g. only small number of packets, no QoS)
1.3
[Figure: L3 router, L3 link, ingress Label Edge Router, Label Switched Routers along the Label Switched Path, egress Label Edge Router, L3 link, L3 router; the downstream direction runs from ingress to egress, the upstream direction is the reverse]
Y(J)S MPLS Slide 27
Label Switched Forwarding
LSP needs to be set up before data is forwarded and torn down once no longer needed
LSR performs
– label switched forwarding* for labeled packets
– label unique to LSR or unique to input interface (like ATM)
– optionally L3 forwarding for unlabeled packets
ingress LER
– assigns packet* to FEC
– labels packet
– forwards it downstream using label switching
egress LER
– removes label
– forwards packet using L3 forwarding
– exception: PHP (discussed later)
* once a packet is assigned to a FEC and labeled, no LSR looks at the L3 headers/address
1.3
Y(J)S MPLS Slide 28
Hierarchical Forwarding
many networks use hierarchical routing
– decreases router table size
– increases forwarding speed
– decreases routing convergence time
examples:
– telephone numbers: Country-Code 972, Area-Code 2, Exchange-Code 588, Line-Number 9159
– Internet DNS: myrad . rad . co . il (host … SLD TLD)
– ATM: VP, VC
Ethernet/802.3 address space is flat (even though written in byte fields)
1.3
Y(J)S MPLS Slide 29
IP Routing Hierarchy
IPv4
– originally flat space (1st come - 1st served) until router tables exploded
– then 3 classes
  A: 0 | net (7 bits) | host (24 bits)
  B: 10 | net (14 bits) | host (16 bits)
  C: 110 | net (21 bits) | host (8 bits)
– now CIDR <RFC1519>
IP exploits hierarchy by employing interior/exterior routing protocols (AS = Autonomous System)
IP can even support arbitrary levels of hierarchy by advertising aggregated addresses
but the exploitation is not optimal
[Figure: interconnected autonomous systems AS12, AS13, AS14]
1.3
Y(J)S MPLS Slide 30
IP Routing Hierarchy Bug
traffic from network 1 to network 2 must go through transit domain
to get to any host in network 2, all transit domain routers forward to border router 2
transit domain routers shouldn’t need to know anything about network 2
but for IP protocols* a change in network 2 forces rerouting in the transit domain
in the worst case, transit domain routers need to know all routes in the Internet
avoiding maintenance of interdomain information
• conserves routing table memory
• ensures faster convergence
• transit domain routers restart faster
• provides better fault isolation
* except at BGP-OSPF interface
[Figure: Net 1, border router 1, transit domain, border router 2, Net 2]
1.3
Y(J)S MPLS Slide 31
Label Stacks
since labels are structure-less, the label space is flat
label switching can support arbitrary levels of hierarchy by using a label stack
label forwarding is based only on the top label
before forwarding, three possibilities (listed in NHLFE):
– read top label and pop
– read top label and swap
– read top label, swap, and push new label(s)
top label
another label
yet another label
bottom label
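The three label stack operations can be sketched on a plain Python list whose first element is the top label. The function name and argument names are illustrative; only the pop, swap, and swap-then-push behaviour comes from the slide.

```python
def apply_stack_op(stack, op, swap_to=None, push=()):
    """Return a new label stack (top label is stack[0]) after one operation."""
    if not stack:
        raise ValueError("unlabeled packet")
    rest = list(stack[1:])          # labels below the top are never examined
    if op == "pop":
        return rest
    if op == "swap":
        return [swap_to] + rest
    if op == "swap+push":
        # swap the top label, then place the new label(s) above it
        return list(push) + [swap_to] + rest
    raise ValueError(f"unknown operation {op!r}")
```

Only the top label is read or changed, so an LSR in the core never needs to understand the labels that lie deeper in the stack.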
1.3
Y(J)S MPLS Slide 32
Example Uses of Label Stack
Example applications that exploit the label stack
– fast rerouting
– VPNs
– X over MPLS (PWE3)
Note: three labels is usually more than enough
1.3
Y(J)S MPLS Slide 33
Fast Rerouting
IP has no inherent recovery method (the way SONET does)
in order to ensure resilience we provide bypass links
to reroute quickly we pre-prepare labels for the bypass links
when the link goes down, change the forwarding table
[Figure: LSP with labels 10, 11, 12; the protection LSP around the failed link uses swap+push, swap, and pop, carrying labels 13 and 14 on top of label 11; from the merge point on there is no difference*]
* label space per LSR, not per input port
1.3
Y(J)S MPLS Slide 34
Label Switched VPNs
[Figure: a provider network of P and PE routers connects, through CE routers, the sites of the customer 1 network and the customer 2 network]
Key:
C   Customer router
CE  Customer Edge router
P   Provider router
PE  Provider Edge router
1.3
Y(J)S MPLS Slide 35
Label Switched VPNs (cont.)
customers 1 and 2 use overlapping IP addresses
C-routers have inconsistent tables
ingress PE-router inserts two labels
P-routers don’t see IP addresses, so no ambiguity
P-routers see only the label of the egress PE-router
– they don’t know about VPNs at all
– no need to understand customer configuration
– hence smaller tables & no rerouting if customer reconfigures
ingress PE router only knows about CE routers
– no need to understand customer configuration
[Figure: packet format: egress PE label, egress CE label, IP headers, payload]
1.3
Y(J)S MPLS Slide 36
X over MPLS
A tunnel may contain many PWs
[Figure: a native service is carried between two PE routers across P routers; packet format: tunnel label*, PW label*, PW control word, payload]
* Dictionary:
  MPLS-f: inner label, outer label(s)
  ITU-T: interworking label, transport label(s)
  IETF: PW label, tunnel label(s)
1.3
Y(J)S MPLS Slide 37
Service Interworking
Network interworking (PW):
[Figure: Native Service A, Customer Edge (CE), Provider Edge (PE), provider network, Provider Edge (PE), Customer Edge (CE), Native Service A]
Service interworking:
[Figure: Native Service A, Customer Edge (CE), Provider Edge (PE), provider network, Provider Edge (PE), Customer Edge (CE), Native Service B]
1.3
Y(J)S MPLS Slide 38
MPLS Forwarding Component
2.1 The essentials
2.2 The documents
Label (20b) Exp(3b) S(1b) TTL (8b)
Y(J)S MPLS Slide 39
MPLS history
many different label switching schemes were invented
– Cell Switching Router (Toshiba) <RFC 2098,2129>
– IP Switching (Ipsilon, bought by Nokia) <RFC 2297>
– Tag Switching (Cisco) <RFC 2105>
– Aggregate Route-based IP Switching (IBM)
– IP Navigator (Cascade bought by Ascend bought by Lucent)
so the IETF decided to standardize a single method
– BOFs 1994-1995
– WG chartered 1997
  – co-chairs from Cisco and IBM (so similar to tag switching and ARIS)
  – Cisco, IBM, Ascend authored architecture document
– MPLS standards-track RFCs 2001-
2.1
Y(J)S MPLS Slide 40
MPLS
of all the label switching technologies - what is special about MPLS ?
multiprotocol - from above and below
label in L2 or shim header
single forwarding algorithm, including for multicast and TE
can run on IP router or ATM switch with only SW upgrade (although can benefit from special HW)
label distribution piggybacked on existing routing protocols or via LDP
control-driven downstream label binding
support for constraint-based routing (for TE)
2.1
Y(J)S MPLS Slide 41
Multiprotocol Label Switching
IPv4 IPv6 IPX etc.
MPLS
Ethernet ATM frame-relay etc.
2.1
Y(J)S MPLS Slide 42
MPLS Labels
label may be an appropriate address in either L2 or L3 (even if we lose other features)
Examples:
– for ATM use VPI and VCI fields as two labels (only two labels, no TTL)
– for frame-relay use DLCI (only one label, no CoS, no TTL)
otherwise use shim header (described later)
2.1
[Figure: encapsulations]
– multi-access subnet: Ethernet header | MPLS shim header | IP headers | payload
– point-to-point & POS: PPP header | MPLS shim header | IP headers | payload
– MPLS over ATM: ATM header | MPLS shim header | IP headers | payload fragment, then ATM header | payload fragment
Y(J)S MPLS Slide 43
ATM Label Switching
in this mode the label is carried in the VPI/VCI
this facilitates using ATM switches
however, we still need the shim header for S and TTL
the label field in the shim header is a placeholder and set to zero
MPLS WG devoted a lot of time to this case
2.1
[Figure: shim + IP packet + AAL5 trailer form an AAL5 PDU that is segmented into ATM cells; each cell header carries GFC, VPI/VCI (=> top label) and PTI/CLP/HEC; the 1st cell carries the MPLS shim, IP headers and a payload fragment, the rest of the cells carry payload fragments]
Y(J)S MPLS Slide 44
MPLS Shim Header
when a shim header is needed, its format should be:
Label (20b) Exp(3b) S(1b) TTL (8b)
Label: there are 2^20 different labels (+ 2^20 multicast labels)
Exp (CoS): left undefined by IETF WG; was CoS in Cisco Tag Switching; could influence packet queuing
Stack bit: S=1 indicates bottom of label stack
TTL: decrementing hop count used to eliminate infinite routing loops; generally copied from/to IP TTL field
2.1
[Figure: label stack: S=0 top label, S=0 another label, S=0 yet another label, S=1 bottom label]
Special (reserved) labels:
0  IPv4 explicit null
1  router alert
2  IPv6 explicit null
3  implicit null
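The 32-bit shim layout (Label 20b, Exp 3b, S 1b, TTL 8b, most significant bits first) can be illustrated with a small encoder and decoder. The function names are illustrative, and network byte order is assumed for the 4-byte word.

```python
import struct

def pack_shim(label, exp, s, ttl):
    """Encode one label stack entry as 4 bytes in network byte order."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)

def unpack_shim(data):
    """Decode the first 4 bytes of data into (label, exp, s, ttl)."""
    (word,) = struct.unpack("!I", data[:4])
    return (word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF)
```

A label stack is then simply a sequence of such 4-byte entries, with S=1 set only on the last one.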
Y(J)S MPLS Slide 45
Single Forwarding Algorithm
IP uses different forwarding algorithms for unicast, unicast w/ ToS, multicast, etc.
LSR uses one forwarding algorithm (LER is more complicated)
– read top label L
– consult Incoming Label Map (forwarding table) [Cisco terminology LFIB]
– perform label stack operation (pop L, swap L -> M, swap L -> M and push N)
– forward based on L’s Next Hop Label Forwarding Entry
NHLFE contains:
– next hop (output port, IP address of next LSR)
  if next hop is the LSR itself then operation must be pop
  for multicast there may be multiple next hops, and packet is replicated
– label stack operation to be performed
– any other info needed to forward (e.g. L2 format, how label is encoded)
ILM contains:
– a NHLFE for each incoming label
– possibly multiple NHLFEs for a label, but only one used per packet
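A minimal sketch of the LSR forwarding step, assuming the ILM is a dictionary from incoming top label to a single NHLFE. The `NHLFE` field names and the table entries are hypothetical, loosely modeled on the example values used elsewhere in these slides.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NHLFE:
    next_hop: str                     # output port or address of the next LSR
    op: str                           # "pop", "swap" or "swap+push"
    out_label: Optional[int] = None   # label M written by a swap
    push_label: Optional[int] = None  # label N added on top by swap+push

# Hypothetical ILM: incoming top label -> NHLFE
ILM = {
    17: NHLFE("5.4.3.2", "swap", out_label=137),
    20: NHLFE("self", "pop"),         # next hop is this LSR, so the operation is pop
    30: NHLFE("5.4.3.9", "swap+push", out_label=31, push_label=13),
}

def lsr_forward(stack, ilm=ILM):
    """Look up the top label and return (next hop, new label stack)."""
    entry = ilm[stack[0]]             # exact match on the top label only
    rest = stack[1:]
    if entry.op == "pop":
        new_stack = rest
    elif entry.op == "swap":
        new_stack = [entry.out_label] + rest
    else:
        new_stack = [entry.push_label, entry.out_label] + rest
    return entry.next_hop, new_stack
```

The same routine serves every kind of traffic; what differs between unicast, TE or VPN use is only how the ILM entries were installed by the control component.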
2.1
Y(J)S MPLS Slide 46
LER Forwarding Algorithm
LER’s forwarding algorithm is more complex
check if packet is labeled or not
if labeled
– then forward as LSR
– else look up destination IP address in the FEC-To-NHLFE Map (FTN) [Cisco terminology LIB]
  if in FTN
  – then prepend label and forward using LSR algorithm
  – else forward using IP forwarding
2.1
[Figure: IP router, pure IP link, LER, MPLS link, LSR]
Y(J)S MPLS Slide 47
Penultimate Hop Popping
the egress LER E also may have to work overtime:
– read top label
– look up label in ILM
– find in the NHLFE that the label must be popped
– look up IP address in IP routing
– forward to CE using IP forwarding
we can save a lookup (and the first 3 steps) by performing PHP, but pay in loss of OAM capabilities
penultimate LSR PH performs the following:
– read top label
– look up label in ILM
– pop label, revealing IP address of CE router
– forward to CE using IP forwarding
2.1
[Figure: ingress I, penultimate LSR PH and egress E inside the MPLS domain; E connects to CE over an IP link]
Y(J)S MPLS Slide 48
Route Aggregation
traffic from both network 1 and network 2 is destined for network 3
scalability advantages
– fewer labels
– conserve table memory
disadvantages
– IP forwarding may be required
– OAM backwards trail is destroyed
2.1
[Figure: net 1 and net 2 attach via IP links to an MPLS domain whose egress E leads over an IP link to net 3; labels shown: 12, 22, 13, 23, 31, 32]
Y(J)S MPLS Slide 49
IP Router / ATM Switch
migration - important to be able to exploit existing forwarding hardware
we can use IP routers as LSRs
– already use the routing protocols for topology determination
– need to add label distribution protocol (or extend routing protocol)
– need to alter the forwarding algorithm
– can manage with software upgrade but probably best to upgrade hardware too
we can use ATM switches as LSRs
– need to alter the routing protocols for topology determination
– need to add label distribution protocol
– already uses the forwarding algorithm
– probably can just upgrade software
2.1
Y(J)S MPLS Slide 50
ATM Cell Interleave
ATM switches
– have separate label space per input port
– segment traffic into ATM cells
what if the forwarding tables for both port 1 and port 2 have the entry
incoming label 13, outgoing label 17, outgoing port 1 ?
2.1
[Figure: cells “13 data A1”, “13 data A2” from A and “13 data B1”, “13 data B2” from B enter the ATM LSR on ports 1 and 2 and leave interleaved as “17 data A1”, “17 data B1”, “17 data A2”, “17 data B2”]
Y(J)S MPLS Slide 51
VC/VP Merge
in order to properly reconstruct the packet we can use
VC-merge
– buffer cells at the ATM LSR until end-of-packet detected
– cells belonging to different packets aren’t interleaved
– introduces delay
– requires large memory
– requires ATM LSR to detect AAL5 end-of-packet
VP merge
– use VPI as MPLS label
– use VCI as multiplexing index
– limits number of available labels
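VC-merge can be illustrated with a toy model in which each arriving cell is a tuple of (input port, payload, end-of-packet flag). The function name and the tuple layout are assumptions made for the sketch; a real ATM LSR would detect end-of-packet from the AAL5 indication in the cell header.

```python
from collections import defaultdict

def vc_merge(cells):
    """cells: iterable of (in_port, payload, end_of_packet) in arrival order.
    Returns the payloads in departure order, with each packet's cells contiguous."""
    buffers = defaultdict(list)   # per-input-port reassembly buffer
    out = []
    for port, payload, eop in cells:
        buffers[port].append(payload)
        if eop:
            # the whole packet is present: release its cells back to back
            out.extend(buffers.pop(port))
    return out
```

The buffering is what introduces the delay and memory cost mentioned above: no cell of a packet leaves before the last cell of that packet has arrived.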
2.1
Y(J)S MPLS Slide 52
Label Distribution Protocols
when an LSR creates/removes a FEC-label binding it needs to inform other LSRs (“remote binding information”)
MPLS allows piggybacking label distribution on routing protocols
– protocols already in use (don’t need to invent or deploy)
– eliminates race conditions (when route or binding, but not both, defined)
– ensures consistency between binding and routing information
– only for distance vector or path vector routing protocols (not OSPF, IS-IS)
– not all routing protocols are sufficiently extensible (RIP isn’t)
– has been implemented for BGP-4
MPLS WG invented a new protocol LDP for “plain” label distribution
– messages sent reliably using TCP/IP
– messages encoded in TLVs
– discovery mechanism to find other LSRs
… and extended RSVP to LSPs for QoS
2.1
Y(J)S MPLS Slide 53
Control Mechanisms
pre-MPLS protocols used various control mechanisms
Example: label distribution– Cisco and IBM - control driven – Toshiba and Ipsilon - data (traffic) driven
MPLS is distinguished by its choice of control protocols– control driven– downstream binding– unsolicited and on-demand allowed– independent and ordered allowed
we will discuss these in the section on the control plane
2.1
Y(J)S MPLS Slide 54
MPLS RFCs (page 1 of 2)
2547 BGP/MPLS VPNs
2702 Requirements for Traffic Engineering Over MPLS
2917 A Core MPLS IP VPN Architecture
3031 Multiprotocol Label Switching Architecture
3032 MPLS Label Stack Encoding
3035 MPLS Using LDP and ATM VC Switching
3036 LDP Specification
3037 LDP Applicability
3038 VCID Notification over ATM link for LDP
3063 MPLS Loop Prevention Mechanism
3107 Carrying Label Information in BGP-4
3209 RSVP-TE: Extensions to RSVP for LSP Tunnels
3210 AS for Extensions to RSVP for LSP-Tunnels
3212 Constraint-Based LSP Setup using LDP
3213 Applicability Statement for CR-LDP
3214 LSP Modification Using CR-LDP
3215 LDP State Machine
2.2
Y(J)S MPLS Slide 55
More MPLS RFCs
3270 MPLS Support of Differentiated Services
3346 AS for Traffic Engineering with MPLS
3353 Overview of IP Multicast in MPLS Environment
3429 Assignment of the 'OAM Alert Label' for MPLS OAM functions
3443 TTL Processing in MPLS Networks
3468 The MPLS WG decision on MPLS signaling protocols
3469 Framework for MPLS-based Recovery
3477 Signalling Unnumbered Links in RSVP-TE
3496 Support of ATM Service Class-aware MPLS Traffic Engineering
3564 Support of DiffServ-aware MPLS Traffic Engineering
3478 Graceful Restart Mechanism for LDP
3479 Fault Tolerance for the LDP
3480 Signalling Unnumbered Links in CR-LDP
2.2
Y(J)S MPLS Slide 56
Yet More MPLS RFCs
3612 Applicability Statement for Restart Mechanisms for LDP
3630 OSPF-TE
3809 Generic Requirements for PPVPNs
3811 Definitions of Textual Conventions for MPLS Management
3812 MPLS Traffic Engineering Management Information Base
3813 MPLS Label Switching Router (LSR) MIB
3814 MPLS FEC-To-NHLFE MIB
3815 Definitions of Managed Objects for LDP
3916 Requirements for Pseudo-Wire Emulation Edge-to-Edge
3988 Maximum Transmission Unit Signalling Extensions for LDP
3985 PWE3 Architecture
4023 Encapsulating MPLS in IP or GRE
4026 PPVPN Terminology
4031 Service requirements for Layer 3 PPVPNs
2.2
Y(J)S MPLS Slide 57
Even More MPLS RFCs
4023 Encapsulating MPLS in IP or Generic Routing Encapsulation (GRE)
4090 Fast Reroute Extensions to RSVP-TE for LSP Tunnels
4105 Requirements for Inter-Area MPLS Traffic Engineering
4124 Protocol Extensions for Support of Diffserv-aware MPLS TE
4125 Maximum Allocation BW Constraints Model for Diffserv-aware MPLS TE
4126 Max Allocation w/ Reservation BW Constraints Model for Diffserv MPLS TE
4127 Russian Dolls Bandwidth Constraints Model for Diffserv-aware MPLS TE
4128 Bandwidth Constraints Models for Diffserv-aware MPLS TE
4182 Removing a Restriction on the use of MPLS Explicit NULL
4201 Link Bundling in MPLS Traffic Engineering (TE)
4206 Label Switched Paths (LSP) Hierarchy with GMPLS Traffic Engineering
4216 MPLS Inter-AS TE Requirements
4220 Traffic Engineering Link Management Information Base
4221 Multiprotocol Label Switching (MPLS) Management Overview
4247 Requirements for Header Compression over MPLS
4364 BGP/MPLS IP VPNs (was 2547bis)
4368 MPLS Label-Controlled ATM and FR Management Interface Definition
4377 OAM Requirements for MPLS Networks
4378 A Framework for MPLS OAM
4379 Detecting MPLS Data Plane Failures
2.2
Y(J)S MPLS Slide 58
MFA forum IAs
1.0 Voice over MPLS IA
2.0 MPLS-PVC UNI IA
3.0 LDP Conformance IA
4.0 TDM Transport over MPLS using AAL1 IA
5.0 I.366.2 Voice Trunking over MPLS IA
6.0.0 MPLS Proxy Admission Control
7.0.0 MPLS UNI protocol
8.0.0 TDM over MPLS using Raw Encapsulation
2.2
Y(J)S MPLS Slide 59
ITU-T Recommendations
Y.1411 ATM-MPLS network interworking - Cell mode user plane interworking
Y.1412 ATM-MPLS network interworking - Frame mode user plane interworking
Y.1413 TDM-MPLS network interworking – User plane interworking
Y.1414 Voice services - MPLS network interworking
Y.1415 Ethernet-MPLS network interworking – User plane interworking
Y.1710 Requirements for OAM functionality for MPLS networks
Y.1711 Operation & Maintenance mechanism for MPLS networks
Y.1712 User-plane fault-management for ATM and MPLS OAM
Y.1713 Misbranching detection for MPLS networks
Y.1720 Protection switching for MPLS networks
X.84 FR over MPLS core networks
G.8110/Y.1370 MPLS Layer Network architecture
2.2
Y(J)S MPLS Slide 60
MPLS Control Component
3.1 Procedures and scenarios
3.2 Label distribution protocols
3.3 LDP and BGP-4 details
user (data) plane
control plane
Y(J)S MPLS Slide 61
Control and User Planes
control plane
– topology determination: use standard IP routing protocols (BGP, OSPF, IS-IS, PIM)
– label distribution: piggyback on routing protocol, or use LDP, RSVP-TE
user (data) plane
– forwarding paradigm: label-stack CO forwarding
3.1
Y(J)S MPLS Slide 62
What is Needed?
control plane
– all IP routing protocols (BGP, OSPF, PIM, etc.)
– procedure to bind label to FEC (label assignment)
– protocol to distribute label binding information
– procedure to create forwarding table
user (data) plane
– procedure to label incoming packet
– forwarding procedure: forwarding table lookup, label stack operations
3.1
Y(J)S MPLS Slide 63
All the Tables
FEC table
FEC          protocol   input port   handling
192.115/16   IPv4       2            best-effort
Free Labels
128-200 presently free
FTN
FEC          port/label in   port/label out
192.115/16   2/17            3/137
ILM / NHLFE
port/label in   port/label out   next hop   operation
2/17            3/137            5.4.3.2    swap
3.1
Y(J)S MPLS Slide 64
LER Architecture
[Figure: control plane: IP routing protocols, IP routing table, label binding and distribution protocols, free label table; user (data) plane: IP forwarding table, FTN, labeling procedure, MPLS forwarding table; connections to IP routers, LSRs and the egress LER]
3.1
Y(J)S MPLS Slide 65
Binding & Distribution Options
label binding (assignment)
– per port or per LSR label space
– control driven vs. data driven (traffic driven)
– liberal vs. conservative label retention
label distribution (advertisement)
– downstream vs. upstream
– downstream on-demand (dod) vs. downstream unsolicited (du)
– independent vs. ordered
3.1
Y(J)S MPLS Slide 66
Per Port Label Space
LSR may have a separate label space for each input port (I/F)
or a single common label space
or any combination of the two
separate label spaces mean separate forwarding tables per port
ATM LSRs have per port label spaces (leads to interleave problem)
per port label spaces increase the number of available labels
common label space facilitates several MPLS mechanisms (e.g. reroute)
3.1
Y(J)S MPLS Slide 67
Control vs. Data Driven
there are two philosophies as to when to create a binding
data-driven (traffic-driven) binding (Toshiba CSR, Ipsilon IP-Switching)
– automatically create binding when data packets arrive (from first packet? after enough packets? when to tear the LSP down?)
control-driven binding (Cisco Tag Switching, IBM ARIS)
– create binding when routing updates arrive (only update when topology changes? update upon request?)
although not specifically stated in the architecture document, MPLS assumes control-driven binding*
* two implementations of control driven:
– topology-driven (routing tables are consulted)
– control-traffic driven (only routing update messages are used)
3.1
Y(J)S MPLS Slide 68
Liberal vs. Conservative Retention
LSR receives “advertisements” (label distribution messages) from other LSRs
conservative label retention: LSR retains only label-to-FEC bindings that are presently needed
liberal label retention: LSR stores all bindings received (more labels need to be maintained)
using liberal retention can speed response to topology changes
LSRs must agree upon mode to be used
Example:
– A advertises label
– B is previous hop LSR
– but C retains label anyway
– later routing change makes C the previous hop
– C immediately can start forwarding
[Figure: LSRs A, B and C]
3.1
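The A/B/C example can be sketched as a toy model. This is only an illustration under stated assumptions: the class `Lsr`, its methods, the FEC name `fec` and the placeholder next hop `X` are hypothetical names, not part of any MPLS specification.

```python
class Lsr:
    """Toy LSR that stores label bindings advertised by its neighbours."""

    def __init__(self, liberal):
        self.liberal = liberal
        self.bindings = {}   # (fec, neighbour) -> label
        self.next_hop = {}   # fec -> neighbour currently used as next hop

    def receive_mapping(self, fec, neighbour, label):
        # Conservative mode keeps a binding only if it comes from the
        # current next hop; liberal mode keeps every binding received.
        if self.liberal or self.next_hop.get(fec) == neighbour:
            self.bindings[(fec, neighbour)] = label

    def route_change(self, fec, new_next_hop):
        """Switch next hop; return a usable label, or None if none is held."""
        self.next_hop[fec] = new_next_hop
        return self.bindings.get((fec, new_next_hop))


def demo(liberal):
    c = Lsr(liberal)                   # plays the role of LSR C
    c.next_hop["fec"] = "X"            # A is not yet C's next hop
    c.receive_mapping("fec", "A", 13)  # A advertises label 13 anyway
    return c.route_change("fec", "A")  # routing change: A becomes next hop
```

With liberal retention `demo(True)` returns the stored label, so forwarding can start at once; with conservative retention `demo(False)` returns `None` and the label must first be obtained again.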
Y(J)S MPLS Slide 69
Downstream vs. Upstream
binding means allocating a label to a FEC
– local binding: the LSR allocates the label from its “free label” pool
– remote binding: the LSR receives the label from another LSR
which LSR allocates?
MPLS uses downstream binding
– label is allocated by the LSR downstream from the LSR that prepends it
– label distribution information flows upstream, in the reverse direction from data packets
to set up an LSP through the link from LSR A to LSR B:
– LSR B binds label 13 to the FEC
– B advertises the label to LSR A
– LSR A sends packets with label 13 to B
(figure: LSR A sends packets carrying label 13 downstream to LSR B)
3.1
Y(J)S MPLS Slide 70
On-demand vs. Unsolicited
downstream on-demand label distribution:
– LSR may explicitly request a label from its downstream LSR
downstream unsolicited label distribution:
– LSR distributes binding to upstream LSR without a request (e.g. based on time interval, or upon receipt of topology change)
LSR may support on-demand, unsolicited, or both
adjacent LSRs must agree upon which mode is to be used
example:
– LSR A needs to send a packet to LSR B
– LSR A requests a label from LSR B (“I need a label”)
– B binds label 13 to the FEC
– B distributes the label (“I allocated 13”)
– A starts sending data with label 13
3.1
Y(J)S MPLS Slide 71
Independent vs. Ordered
independent binding (Tag Switching)
– each LSR makes independent decision to bind and distribute
ordered binding (ARIS)
– egress LSR binds first and distributes binding to neighbors
– LSR that believes that it should be the penultimate LSR binds and distributes to its neighbors
– binding proceeds in orderly fashion until ingress LSR is reached
LSRs must agree upon mode to be used
example:
– B sees that it is egress LSR for 192.115/16
– B allocates label 13
– B distributes label to C and D
– C distributes label to E
3.1
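The ordered mode in the example can be sketched as a breadth-first walk that starts at the egress. This is a minimal sketch, assuming a hypothetical `upstream_of` map that mirrors the example (B advertises to C and D, C advertises to E); the function name and the single label counter are illustrative only, since real labels are allocated independently by each LSR.

```python
from collections import deque


def ordered_distribution(egress, upstream_of, first_label=13):
    """Return the order in which LSRs bind and the label each one allocates."""
    order, labels = [], {}
    queue = deque([egress])
    next_label = first_label   # one shared counter, only to keep labels distinct
    while queue:
        lsr = queue.popleft()
        if lsr in labels:      # already bound, skip
            continue
        labels[lsr] = next_label   # local binding at this LSR
        next_label += 1
        order.append(lsr)
        # the binding is now advertised to the upstream neighbours,
        # which may bind only after receiving it
        queue.extend(upstream_of.get(lsr, []))
    return order, labels


upstream_of = {"B": ["C", "D"], "C": ["E"]}   # hypothetical topology
order, labels = ordered_distribution("B", upstream_of)
```

No LSR appears in `order` before the LSR it learned the binding from, which is the property that distinguishes ordered from independent binding.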
Y(J)S MPLS Slide 72
LDP tasks
a label distribution protocol is a signaling protocol that can perform the following tasks:
discover LSR peers
initiate and maintain LDP session
signal label request
advertise binding
signal label withdrawal
loop prevention
explicit routing
resource reservation
3.2
Y(J)S MPLS Slide 73
Label Distribution Protocols
label distribution can be carried over various protocols
there are presently four options:
LDP – MPLS-enhanced IP networks
BGP4-MPLS – RFC 2547 VPNs
RSVP-TE – traffic engineering support
CR-LDP – constraint based (no longer recommended by IETF)
3.2
Y(J)S MPLS Slide 74
LDP vs. BGP
both use TCP for reliable transport (LDP uses UDP for hellos)
both are hard-state protocols
both use TLV format for parameters
BGP
multiprotocol (IPv4, IPv6, IPX, MPLS)
highly complex protocol
provides routing / label distribution
built-in autodiscovery mechanism
LDP
MPLS only
simpler protocol
only label distribution
extendable for autodiscovery
3.2
Y(J)S MPLS Slide 75
LDP
major focus of the IETF MPLS WG was the design of LDP
based on the similar TDP from Cisco
LDP sets up a bidirectional LDP session; both sides can request or advertise labels
LDP usually uses TCP
– needs reliable transport (e.g. what happens if a binding is missed?)
– needs in-order delivery (e.g. binding followed by withdrawal)
– hard to develop new reliable transport protocols
– single acknowledgement timer for session
– piggybacking ACK on data packets
uses UDP for discovery (hello) messages
periodic keepalive messages (if not received, session terminated)
messages encoded in TLV (Type Length Value) form
3.2
Y(J)S MPLS Slide 76
LDP Setup
Discovery: Hello (UDP) exchanged between the LSRs
Session: Initialization (TCP) exchanged between the LSRs
Distribution: Label Request (TCP), Label Distribution (TCP)
3.2
Y(J)S MPLS Slide 77
Discovery Phase
LSR periodically transmits hello to the “LDP discovery” UDP port
– multicast to the “all routers on subnet” multicast group
– or unicast to a preconfigured IP address when not all LSRs are on the same subnet (extended discovery, “targeted LDP”)
LSRs listen on this UDP port for hello messages
Hello message contains:
– hold time
– LSR Identifier
when LSR receives Hello from another LSR
– it opens a TCP connection to that other LSR (if needed)
– or (for extended discovery) it unicast transmits a hello back to the other LSR
LDP session can now be established
3.2
Y(J)S MPLS Slide 78
Session Initialization
the LSR with the higher identifier sends (over TCP) a session initialization message to the other LSR
session initialization message contains:
– LDP protocol version
– label distribution and control method
– timer values
– label space ranges (not any more!)
if the receiving LSR accepts these parameters it transmits a KeepAlive, else it transmits a reject
3.2
Y(J)S MPLS Slide 79
Distribution Messages
label mapping
– downstream LSR advertisement of a label mapping for a FEC
– two FEC types: host address, IP address prefix
label withdrawal
– reverse of mapping message
– downstream LSR informs upstream LSR that it has revoked a previous binding
– upstream LSR can no longer use the label
label release
– upstream LSR informs downstream LSR that it no longer needs a binding
– typically when downstream is no longer next hop and operating in conservative retention mode
3.2
Y(J)S MPLS Slide 80
Request Messages
in downstream-on-demand mode upstream LSR must request binding
upstream LSR sends label request message when:
– FEC is in FEC table, next hop LSR is LDP peer, and FEC is not in forwarding table
– FEC next hop changes, and upstream LSR doesn’t have a mapping from new next hop
– it receives FEC label request from upstream LDP peer, next hop LSR is LDP peer, and it doesn’t have a mapping from next hop
upstream LSR sends label request abort message when:
– it needs to revoke a request before it is satisfied (for example, next hop LSR for FEC has changed)
3.2
Y(J)S MPLS Slide 81
Notifications
there are two types of notifications:
– error notifications (fatal errors - terminate session)
– advisory notifications (status messages)
LSR sends notification messages when:
– received LDP message with unsupported protocol version
– received LDP message with unknown type
– KeepAlive timer expired
– session initialization fails due to unacceptable parameters
– etc.
3.2
Y(J)S MPLS Slide 82
LDP state machine
LSR periodically transmits hello UDP messages
– multicast to “all routers on subnet” group
– targeted to preconfigured IP address
LSRs listen on this UDP port for hello messages
when LSR receives hello from another LSR
– it opens a TCP connection to that other LSR
– or (for extended discovery) it unicast transmits a hello back to the other LSR
LSR with higher ID sends session initialization message
other LSR accepts (sends keepalive) or rejects
informative or keepalive messages sent
3.2
Y(J)S MPLS Slide 83
LDP packet format
layout: header (10B) = version (2B) | length (2B) | LDP-ID (6B), followed by message TLVs (variable)
version – presently 1
length – PDU length, excluding version and length fields
LDP-ID – identifies label space of sending LDP peer
– LSR-ID (4B): globally unique LSR ID
– label space ID (2B): for per-port label spaces (zero for per-platform label spaces)
message TLVs – zero or more message TLVs (see next page)
3.3
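The header layout can be decoded with Python's `struct` module. This is a minimal sketch, assuming network byte order and an LSR-ID printed in dotted-quad form; the function name `parse_ldp_header` and the sample values are hypothetical.

```python
import ipaddress
import struct


def parse_ldp_header(pdu: bytes) -> dict:
    """Split an LDP PDU into its 10-byte header fields and message bytes."""
    version, length, lsr_id, label_space = struct.unpack("!HH4sH", pdu[:10])
    return {
        "version": version,
        "length": length,  # counts everything after the length field
        "lsr_id": str(ipaddress.IPv4Address(lsr_id)),
        "label_space": label_space,  # zero for a per-platform label space
        # length excludes version (2B) and length (2B), so the PDU ends at 4 + length
        "messages": pdu[10:4 + length],
    }


# A PDU with no messages: length covers only the 6-byte LDP-ID.
empty_pdu = struct.pack("!HH4sH", 1, 6, bytes([10, 0, 0, 1]), 0)
parsed = parse_ldp_header(empty_pdu)
```

The slice bound `4 + length` follows from the length field excluding the version and length fields themselves.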
Y(J)S MPLS Slide 84
LDP message TLVs
layout: U (1b) | type (15b) | length (2B) | message-ID (4B) | mandatory parameter TLVs (variable) | optional parameter TLVs (variable)
U – unknown message bit, used if message type is unknown to receiver
– U=0: receiver returns notification to sender
– U=1: receiver silently ignores
length – message length, excluding type and length fields
message-ID – unique ID for message (for matching with returned notification)
if there are mandatory parameters, they must appear in a specific order
optional parameters may appear in any order
3.3
Y(J)S MPLS Slide 85
All LDP message types
Hello (0x0100)
Initialization (0x0200)
KeepAlive (0x0201)
Notification (0x0001)
Address (0x0300)
Address Withdraw (0x0301)
Label Mapping (0x0400)
Label Withdraw (0x0402)
Label Request (0x0401)
Label Release (0x0403)
Label Abort Request (0x0404)
3.3
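The message header from the previous slide and the type codes listed here can be combined into a small decoder. This is a sketch only: the enum `LdpMsg` and the function `parse_message_header` are hypothetical names, while the numeric codes are the ones listed on the slide.

```python
import struct
from enum import IntEnum


class LdpMsg(IntEnum):
    NOTIFICATION = 0x0001
    HELLO = 0x0100
    INITIALIZATION = 0x0200
    KEEPALIVE = 0x0201
    ADDRESS = 0x0300
    ADDRESS_WITHDRAW = 0x0301
    LABEL_MAPPING = 0x0400
    LABEL_REQUEST = 0x0401
    LABEL_WITHDRAW = 0x0402
    LABEL_RELEASE = 0x0403
    LABEL_ABORT_REQUEST = 0x0404


def parse_message_header(msg: bytes):
    """Return (U bit, type name or None, message ID, parameter bytes)."""
    first, length, msg_id = struct.unpack("!HHI", msg[:8])
    u_bit = first >> 15            # top bit of the first 16 bits
    msg_type = first & 0x7FFF      # remaining 15 bits
    try:
        name = LdpMsg(msg_type).name
    except ValueError:
        name = None                # unknown type: the U bit decides the reaction
    # length excludes type and length, so parameters end at 4 + length
    return u_bit, name, msg_id, msg[8:4 + length]


keepalive = struct.pack("!HHI", 0x0201, 4, 7)   # no parameters, message ID 7
decoded = parse_message_header(keepalive)
```

A receiver that gets `None` for the name would return a notification when the U bit is 0 and silently ignore the message when it is 1.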
Y(J)S MPLS Slide 86
LDP parameter TLVs
layout: U (1b) | F (1b) | type (14b) | length (2B) | value
U – unknown TLV bit, used if TLV type is unknown to receiver
– U=0: receiver returns notification to sender
– U=1: receiver silently ignores
F – forward unknown TLV bit, used if U=1 and TLV type is unknown to receiver
– F=0: do not forward
– F=1: forward
type – TLV type (FEC TLV, label TLV, address list TLV, hop count TLV, path vector TLV, status TLV)
length – length of value field in bytes
3.3
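The U and F rules amount to a small decision function over the first 16 bits of a parameter TLV. This is an illustrative sketch; the function names and the returned strings are hypothetical.

```python
def tlv_type(first16: int) -> int:
    """Strip the U and F bits, leaving the 14-bit TLV type."""
    return first16 & 0x3FFF


def unknown_tlv_action(first16: int) -> str:
    """What a receiver does with a TLV whose type it does not recognize."""
    u_bit = (first16 >> 15) & 1
    f_bit = (first16 >> 14) & 1
    if u_bit == 0:
        return "notify sender"
    # U=1: ignore silently; F only says whether to pass the TLV on
    return "ignore and forward" if f_bit else "ignore, do not forward"
```

For example, `0x0100` (U=0, F=0) leads to a notification, while `0xC100` carries the same 14-bit type with U=1 and F=1 and is ignored but forwarded.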
Y(J)S MPLS Slide 87
FEC TLV
layout: U=0 F=0 type=0x0100 (2B) | length (2B) | FEC element 1 | …
there may be more than one FEC element for mapping messages only
the FEC elements are not themselves TLVs (no length needed), instead
– wildcard FEC (0x01)
– prefix FEC (0x02) + address family (IPv4, IPv6, Ethernet, E.164, etc.) + prefix length in bits + prefix
– host address FEC (0x03) + address family + length + address
3.3
Y(J)S MPLS Slide 88
Generic Label TLV
layout: U=0 F=0 type=0x0200 (2B) | length (2B) | label (20 bits)
this is the generic label TLV
there are also special label TLVs for ATM and FR based MPLS
3.3
Y(J)S MPLS Slide 89
Status TLV
layout: U F type=0x0300 (2B) | length (2B) | E F status code data (30b) | message ID (32b) | message type (16b)
Status TLVs:
– mandatory parameters in notification messages
– optional parameters in other messages
U=0 when status is in a notification message, else U=1
E – fatal error bit (E=1 for fatal error, E=0 for advisory notification)
the two F bits are equal and have the normal meaning
status code data = 0 means success
message ID – the message to which the status refers
message type – the message type to which the status refers
3.3
Y(J)S MPLS Slide 90
Example full message - label mapping
mapping message: U=0 type=0x0400 (2B) | length=24 (2B) | message-ID (4B)
FEC TLV: U=0 F=0 type=0x0100 (2B) | length=8 (2B)
– prefix FEC element (8B): type=0x02, family=1 (IPv4), prefix-length=16, prefix=192.115/16
label TLV: U=0 F=0 type=0x0200 (2B) | length=4 (2B)
– label = 17
3.3
Y(J)S MPLS Slide 91
BGP4 Label Distribution
BGP peers exchange VPN routes, so a label can easily be associated with these routes
all BGP procedures are immediately available for use for label distribution messages
BGP4 is a very extensible protocol
– multiprotocol extensions support address families (originally for IPv4, IPv6, etc.)
– MPLS defines a new address family
3.3
Y(J)S MPLS Slide 92
BGP
layout: header (19B) = marker (16B) | length (2B) | type (1B), followed by data (variable)
marker can be used for authentication (TCP MD5 signature)
length is total BGP PDU length, including header
type
– OPEN (for session initialization)
– UPDATE (add, change and withdraw routes)
– NOTIFICATION (return error messages, terminate session)
– KEEPALIVE (heartbeat)
KEEPALIVE packet consists of 19B header only
3.3
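The 19-byte header and the header-only KEEPALIVE can be illustrated with `struct`. This is a minimal sketch: the helper names are hypothetical, the type codes 1 to 4 are the standard BGP-4 values, and the all-ones marker is the value used when no authentication information is carried.

```python
import struct

BGP_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def build_keepalive() -> bytes:
    """A KEEPALIVE is just the header: marker, length=19, type=4."""
    return b"\xff" * 16 + struct.pack("!HB", 19, 4)


def parse_bgp_header(pdu: bytes):
    """Return (total length, type name, data following the header)."""
    _marker, length, msg_type = struct.unpack("!16sHB", pdu[:19])
    # length counts the whole PDU including the header
    return length, BGP_TYPES.get(msg_type), pdu[19:length]


keepalive = build_keepalive()
```

Parsing `keepalive` yields a length of 19, the name `KEEPALIVE` and no data, matching the statement that the packet consists of the header only.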
Y(J)S MPLS Slide 93
BGP state machine
idle – no session (awaiting session initialization)
connect – attempting to connect to peer
active – started TCP 3-way handshake (router busy)
open sent – have sent OPEN message
open confirm – OPEN received from peer, awaiting KEEPALIVE
established – BGP session up and running
3.3
Y(J)S MPLS Slide 94
BGP OPEN
layout: version (1B) | my AS (2B) | hold time (2B) | BGP-ID (4B) | op len (1B) | opt parameters (variable)
version (3 or 4)
my AS – identifier of autonomous system
hold time – max time (sec) between receipt of messages
BGP ID – sender’s BGP identifier
op len – length (bytes) of optional parameters
opt parameters – TLVs
3.3
Y(J)S MPLS Slide 95
BGP UPDATE
layout: WR len (2B) | withdrawn routes (var) | PA len (2B) | path attributes (var) | NLRI (var)
each NLRI entry: len (1B) | prefix (variable)
Withdrawn Routes – list of routes no longer to be used (NLRI format - see below)
Path Attributes – route specific information (see next page)
Network Layer Reachability Information – (classless) routing information
the NLRI is a list of address prefixes
each prefix must be masked from the left to the length specified
3.3
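The (length, prefix) encoding can be sketched for IPv4 as follows. The function name `encode_nlri` is hypothetical; `ipaddress.ip_network(..., strict=False)` is used here to perform the masking, and only the bytes needed to hold the prefix length are emitted.

```python
import ipaddress


def encode_nlri(prefixes) -> bytes:
    """Encode IPv4 prefixes as a sequence of len (1B) | prefix (variable)."""
    out = bytearray()
    for text in prefixes:
        # strict=False clears any bits to the right of the prefix length
        net = ipaddress.ip_network(text, strict=False)
        nbytes = (net.prefixlen + 7) // 8   # minimum bytes holding the prefix
        out.append(net.prefixlen)
        out += net.network_address.packed[:nbytes]
    return bytes(out)


nlri = encode_nlri(["192.115.6.1/16", "10.0.0.0/8"])
```

The first entry is masked to 192.115/16 and takes three bytes (length 16, then 192 and 115); the second takes two.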
Y(J)S MPLS Slide 96
BGP UPDATE - Path Attributes
each attribute begins with: flags (1B) | type code (1B)
flags
O – optional/well-known bit (1 = optional, 0 = well-known)
– well-known attributes must be recognized by all BGP implementations
– if a well-known attribute is unrecognized, BGP sends notification and session closed
T – transitive/nontransitive bit
– if 1 and attribute unrecognized it is passed along, else silently ignored
– well-known attributes are always transitive
C – complete/partial bit (for optional transitive attributes only)
L – attribute length bit (=0 attribute length is 1B, =1 length is 2B)
type code: ORIGIN, AS_PATH, NEXT_HOP, MED, LOCAL_PREF, AGGREGATOR, COMMUNITY, ORIGINATOR_ID …
3.3
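The four flag bits occupy the high-order half of the flags byte, in the order O, T, C, L. A minimal decoding sketch, with hypothetical function and key names:

```python
def decode_attr_flags(flags: int) -> dict:
    """Decode the high-order four bits of a path attribute flags byte."""
    return {
        "optional": bool(flags & 0x80),         # O: 1 = optional, 0 = well-known
        "transitive": bool(flags & 0x40),       # T
        "partial": bool(flags & 0x20),          # C: complete/partial
        "extended_length": bool(flags & 0x10),  # L: length field is 2B if set
    }


def attr_length_size(flags: int) -> int:
    """Number of bytes used by the attribute length field."""
    return 2 if flags & 0x10 else 1
```

A flags byte of `0x40` describes a well-known transitive attribute with a 1-byte length, while `0xD0` describes an optional transitive attribute with a 2-byte length.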
Y(J)S MPLS Slide 97
BGP NOTIFICATION
layout: error code (1B) | error subcode (1B) | data (var)
all notification messages cause BGP session to close
error codes include:
– message header error
– open message error
– update message error
– hold timer expired
– state machine error
– other fatal error
3.3
Y(J)S MPLS Slide 98
MPLS Traffic Engineering
4.1 Introduction to Traffic Engineering
4.2 DiffServ and IntServ
4.3 TE protocols
Y(J)S MPLS Slide 99
Traffic Engineering
TE is control of network traffic to achieve specific objectives
unfortunately users and providers have contradictory objectives
user objectives (QoS)
– network availability
– packet loss
– end-to-end delay
– round-trip delay
– packet delay variation (PDV)
– error rate
provider objectives (performance)
– bandwidth utilization
– resource utilization
– speed of failure recovery
– ease of management
– monetary outlay
4.1
Y(J)S MPLS Slide 100
Network and Traffic Engineering
Network Engineering: putting the bandwidth where the traffic is
– physical cable deployment (thick pipes)
– over-provisioning
– virtual connection provisioning
– violates provider’s objectives
Traffic Engineering: putting the traffic where the bandwidth is
– explicit traffic routing
– route optimization
– can it meet user objectives?
4.1
Y(J)S MPLS Slide 101
IP’s Problem
network: A and B connect to C; two paths lead from C to G, namely C-D-G and C-E-F-G
traffic from A to G is 1Gb, traffic from B to G is 500Mb; all links are 1Gb except EF, which is 500Mb
were C, D, E, and F ATM switches there would be no problem (1Gb over ACDG, 500Mb over BCEFG)
with standard hop-count cost function, all traffic goes over CDG, resulting in 1.5Gb there (congestion) and CEFG idle
with administrative cost on CDG we can force all the traffic to CEFG: even worse congestion!
finally, with administrative cost and ECMP we can load balance 750Mb over each of CDG and CEFG; link EF is still congested!
what can we do?
4.1
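The four cases can be checked with simple arithmetic on per-link loads. This is a sketch under stated assumptions: the topology is inferred from the path names on the slide, links A-C and B-C are taken to be 1Gb like the others, and all names in the code are hypothetical. Rates are in Mb.

```python
CAPACITY = {
    ("A", "C"): 1000, ("B", "C"): 1000,
    ("C", "D"): 1000, ("D", "G"): 1000,
    ("C", "E"): 1000, ("E", "F"): 500, ("F", "G"): 1000,
}


def link_loads(flows):
    """Sum the rate of every flow over each link of its path."""
    loads = {}
    for path, rate in flows:
        for link in zip(path, path[1:]):   # consecutive node pairs
            loads[link] = loads.get(link, 0) + rate
    return loads


def congested(flows):
    """Links whose offered load exceeds their capacity."""
    return sorted(l for l, load in link_loads(flows).items() if load > CAPACITY[l])


HOP_COUNT = [("ACDG", 1000), ("BCDG", 500)]
ADMIN_COST = [("ACEFG", 1000), ("BCEFG", 500)]
ECMP = [("ACDG", 500), ("ACEFG", 500), ("BCDG", 250), ("BCEFG", 250)]
EXPLICIT = [("ACDG", 1000), ("BCEFG", 500)]   # the "ATM switch" placement
```

Only the explicit placement leaves every link within capacity; ECMP still overloads E-F because 750Mb is offered to a 500Mb link.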
Y(J)S MPLS Slide 102
Solution
IP’s problem arises from
– the forwarding being purely local
– but the routing being too global
IP routing always optimizes a global (usually additive) metric (a discrete optimization problem)
it does not take local constraints (e.g. BW of individual links) into account
we need to optimize a global metric while taking local constraints into account
this is called constraint-based routing (discrete optimization with inequality constraints)
main constraint is BW, but also maximum delay, packet loss, etc.
another constraint: explicit include/exclude links/routers
4.1
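One common way to realize constraint-based routing is to prune every link that violates the constraint and then run an ordinary shortest-path search on what remains. The sketch below does this for a bandwidth constraint; it is an illustration rather than a description of any particular implementation, and the function name, link table and costs are hypothetical.

```python
import heapq


def cspf(links, src, dst, min_bw):
    """links: {(u, v): (cost, available_bw)}, treated as bidirectional.

    Returns (total cost, path) or None if no feasible path exists.
    """
    adj = {}
    for (u, v), (cost, bw) in links.items():
        if bw >= min_bw:   # local constraint: drop links with too little BW
            adj.setdefault(u, []).append((v, cost))
            adj.setdefault(v, []).append((u, cost))
    heap, seen = [(0, src, [src])], set()
    while heap:            # Dijkstra on the pruned graph (global metric)
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return dist, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, cost in adj.get(node, []):
            if nxt not in seen:
                heapq.heappush(heap, (dist + cost, nxt, path + [nxt]))
    return None


# C-D carries a 1Gb reservation already, so it has no bandwidth left.
LINKS = {("C", "D"): (1, 0), ("D", "G"): (1, 1000),
         ("C", "E"): (1, 1000), ("E", "F"): (1, 500), ("F", "G"): (1, 1000)}
```

A 500Mb request from C to G is routed over C-E-F-G even though C-D-G has fewer hops, and a 600Mb request is refused because no path satisfies the constraint.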
Y(J)S MPLS Slide 103
MPLS TE
MPLS that we have seen so far allows explicit routing
– can minimize number of hops
– can tailor LSP to needs
MPLS-TE LSPs can be set up according to constraints
– only include in LSP LSRs with sufficient available BW
– only include in LSP LSRs that guarantee sufficiently low delay
traffic engineering achieved by extending base protocols
enhanced routing protocols (resource reservation)
– OSPF → OSPF-TE
– ISIS → ISIS-TE
constraint-based signalling protocols (pinned LSPs and reservation)
– RSVP → RSVP-TE
– LDP → CR-LDP (no longer under active development)
4.1
Y(J)S MPLS Slide 104
IP Fixes
two approaches to QoS were developed for IP (never popular)
IntServ (guaranteed QoS)
– define traffic flows (CO approach)
– guarantee QoS attributes for each flow
– reserve resources at each router along the flow
– signaling protocol (RSVP) needed
DiffServ (statistical QoS)
– retain CL paradigm
– no guaranteed QoS attributes
– classify packets (differentiated)
– offer special treatment (priority) relative to other packets
– no resource reservation
4.2
Y(J)S MPLS Slide 105
DiffServ
DiffServ was developed in IETF after IntServ
IntServ too heavy-weight (and too revolutionary) for most purposes
resource reservation is against IP philosophy
– if not enough BW, then more democratic for all to suffer
– if reserve BW and don’t use, then this is simply over-provisioning
DiffServ is evolutionary “coarse-grained” approach to IP QoS
DiffServ
– divides traffic into service classes and allocates resources on a per-class basis
– uses 6 bits of ToS byte in IP header to mark packets; field is renamed Differentiated Services Code Point (DSCP)
– no setup or router state required
– DSCP defines per-hop behaviors (PHB) that tell the router how to treat the packet
– three standard PHBs (BE, AF, EF)
RFCs 2474, 2475
4.2
Y(J)S MPLS Slide 106
DiffServ PHBs
Best Effort
– standard IP service
– QoS depends on momentary network load
Assured Forwarding
– AF specifies class that determines queue
– in addition, three drop-precedence levels (low, med, high)
– AF packets from a single source should not be mis-ordered even if they have different drop precedence (i.e. single queue)
Expedited Forwarding
– EF packet should experience no queuing delays
– EF packets should have low loss
– implemented by dedicated EF router queue
4.2
Y(J)S MPLS Slide 107
MPLS and DiffServ
what does all this have to do with MPLS?
MPLS will have to co-exist with DiffServ IP
MPLS provides similar functionality
Exp bits in MPLS shim header are similar to DSCPs
– only three bits (8 classes) while DiffServ ended up with 6 bits
– but 8 service classes is usually more than enough; commonly 4 classes are offered (bronze, silver, gold, platinum)
LSR can maintain mapping from Exp to PHB, e.g. Exp=000 for BE, Exp=001 for AF low drop precedence, etc.
no new signaling mechanism required
LSP with Exp mapping to PHB is called an E-LSP
ingress LER assigns the EXP field
4.2
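An Exp-to-PHB mapping can be sketched as a lookup keyed by the three Exp bits of the shim header entry, whose 32 bits are laid out as label (20) | Exp (3) | S (1) | TTL (8). Only the two table entries given as examples above come from the slides; the function names and the default of falling back to BE are assumptions made for illustration.

```python
# Exp value -> PHB; only these two entries appear in the slides.
EXP_TO_PHB = {
    0b000: "BE",
    0b001: "AF low drop precedence",
}


def pack_shim(label: int, exp: int, bottom: bool, ttl: int) -> int:
    """Build one 32-bit shim header entry: label | Exp | S | TTL."""
    return (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl


def phb_for(shim: int, default: str = "BE") -> str:
    """Look up the PHB an LSR would apply, based on the Exp bits alone."""
    exp = (shim >> 9) & 0x7
    return EXP_TO_PHB.get(exp, default)   # fallback to BE is an assumption


shim = pack_shim(label=13, exp=0b001, bottom=True, ttl=64)
```

Because the class travels in the header itself, the LSR needs only this table and no additional signaling.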
Y(J)S MPLS Slide 108
L-LSP vs. E-LSP
with DiffServ each FEC is defined by
– destination address
– service class
there are two ways of implementing the DiffServ mapping
L-LSP (Label-inferred LSP)
– behavior based on label alone
– support different service classes by using different labels
– LSP BW allocated from specific queue (class)
– Exp may be used for drop precedence
E-LSP (Exp-inferred LSP)
– behavior based on label + Exp field
– LSP BW allocated from link
– can support 8 different service classes per label
(figure: link BW divided by percentage among several L-LSPs)
4.2
Y(J)S MPLS Slide 109
IntServ
IntServ is an overall QoS architecture (not just RSVP)
IntServ is a radical departure from pure IP and requires IntServ-enabled routers
IntServ
– enables providing end-to-end QoS guarantees for services that need them
– defines flows (introduces CO to IP’s CL architecture)
– flows are classified into three service classes (BE, CLS, GS)
– specifies admission control and policing
– like all CO architectures, requires signaling protocol (RSVP)
IntServ-enabled routers
– reserve needed resources along the flow’s path
– must retain state
RFCs 2205-2216, 2379-2382
4.2
Y(J)S MPLS Slide 110
IntServ CoS
Best Effort
– standard IP service
– QoS depends on momentary network load
Controlled Load Service
– service equivalent to unloaded network
– low packet loss
– most packets will experience delay close to minimum
– no quantitative guarantees
Guaranteed Service
– bounded worst case delay (no PDV guarantee)
– low packet loss (zero if node buffers correctly provisioned)
– quantitative guarantees
4.2
Y(J)S MPLS Slide 111
RSVP
primary signaling protocol for IntServ
runs over raw IP or UDP/IP
does not set up the path; uses the path set up by routing protocols
sessions identified by source and destination socket numbers
requests unidirectional QoS characteristics from network
causes routers along path to reserve link and node resources
network responds with success/failure
routers maintain soft-state, reservations time-out unless refreshed
two main message types: PATH and RESV
4.2
Y(J)S MPLS Slide 112
RSVP Messages
PATH
– message from sender to receiver(s)
– carries classification info and TSpecs
RESV
– response of receiver to PATH message
– carries session ID and RSpec specifying QoS required
– this is the actual request for resource reservation
(figure: sender and receivers)
4.2
Y(J)S MPLS Slide 113
MPLS and IntServ
what does all this have to do with MPLS?
MPLS extension of RSVP: RSVP-TE (RFC 3209) can be used as a label distribution protocol for downstream-on-demand binding
create and distribute bindings between RSVP flows and labels
Note: requests by LSRs and hosts (RSVP only by hosts)
uses label instead of source and destination socket numbers
extends RSVP by adding new objects (e.g. label) and procedures
allows strict/loose explicitly routed LSPs
like LDP there is peer discovery, label requests, binding messages
QoS and traffic parameters are carried transparently
4.2
Y(J)S MPLS Slide 114
CR-LDP
RSVP-TE is not the only way of setting up an LSP for TE
Constraint-based Routing LDP (CR-LDP) extends LDP
basic LDP lacks
– explicit routing
– resource reservation
so CR-LDP adds
– explicit route object, carried as TLV in label request
– traffic parameter object (peak/committed data rate, peak/committed burst sizes, etc.)
and since we are already extending the basic protocol …
– path pre-emption priority
– path re-optimization (with pinning option)
– LSP setup failure notification
– automatic LSP failure recovery policies
– etc.
4.3
Y(J)S MPLS Slide 115
Setup Procedures
example setup with explicit routes (A is ingress, C is egress, B lies between them)
CR-LDP:
– A reserves resources, sends label request w/ explicit route BC & resource requirements
– B reserves resources and sends label request w/ explicit route C & resource requirements
– C reserves resources, locally binds label and sends label mapping to B
– B matches mapping to previous request, remotely binds, sends label mapping to A
– A remotely binds label
RSVP-TE:
– A sends PATH message to B w/ explicit route BC and resource requirements
– B forwards PATH message to C after changing explicit route to C
– C determines required resources, reserves, locally binds label and sends RESV to B
– B matches, reserves resources, remotely binds, and sends RESV to A
– A matches, remotely binds label and reserves resources
4.3
Y(J)S MPLS Slide 116
RSVP-TE/CR-LDP Comparison 1/2
CR-LDP extends an MPLS label distribution protocol, adding explicit route and resource reservation features
RSVP-TE extends a pure IP resource reservation protocol to MPLS; requires running two protocols
both distribute binding information
both support explicit routing, pinning, and resource reservation
CR-LDP uses TCP (reliable)
– message order is guaranteed
– lost packet delays all processing
RSVP-TE uses IP (in core) or UDP/IP (unreliable)
– needs refreshes for reliability
RSVP-TE supported by Cisco and Juniper, and selected by IETF
CR-LDP supported by Nortel and Nokia, was favorite of ITU-T
4.3
Y(J)S MPLS Slide 117
RSVP-TE/CR-LDP Comparison 2/2
CR-LDP is hard-state
RSVP-TE is soft-state
– automatically tracks changes
– needs refresh overhead (can be reduced by bundling)
both use ordered binding from egress to ingress, after first traversing from ingress to egress
but CR-LDP reserves resources in the forward direction, while RSVP-TE reserves in the backward direction
RSVP-TE uses standard IntServ QoS model (TSpec/RSpec)
CR-LDP uses its own QoS model (traffic parameters)
CR-LDP may be vulnerable to TCP DoS attacks
RSVP-TE specifies authentication control
4.3
Y(J)S MPLS Slide 118
OSPF-TE and IS-IS-TE
constraint-based routing needs link attribute information
most important - available BW
link-state protocols have mechanisms to flood link-up/link-down to every router in the domain
piggyback attribute as TLV on link-up/link-down messages
– OSPF: add to Link-State Advertisement (LSA) (RFC 3630)
– IS-IS: add to Link-State Packets
once router receives attribute information it can reserve resource
when attribute information changes need to reflood
but every allocation changes attributes!
to decrease overhead:
– only flood when change passes threshold
– inherent timing bounds
4.3
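The threshold rule can be sketched as a link object that remembers the last advertised value and asks for a reflood only when the available BW has drifted far enough from it. The class name `TeLink`, the 10% default and the use of link capacity as the reference for the threshold are assumptions made for this example.

```python
class TeLink:
    """Tracks available bandwidth and decides when a change is worth flooding."""

    def __init__(self, capacity, threshold=0.1):
        self.capacity = capacity
        self.available = capacity
        self.advertised = capacity   # value the rest of the domain believes
        self.threshold = threshold   # fraction of capacity (assumed policy)

    def reserve(self, bw):
        """Reserve bandwidth; return True if the change should be reflooded."""
        if bw > self.available:
            raise ValueError("not enough bandwidth")
        self.available -= bw
        drift = abs(self.advertised - self.available)
        if drift >= self.threshold * self.capacity:
            self.advertised = self.available
            return True
        return False


link = TeLink(capacity=1000)
floods = [link.reserve(50), link.reserve(40), link.reserve(20)]
```

The first two reservations move the available BW by 90 in total and are not flooded; the third pushes the drift to 110, past the threshold of 100, and triggers a single update carrying the new value.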
Y(J)S MPLS Slide 119
MPLS Applications
5.1 L3 VPNs
5.2 L2 VPNs
5.3 PWE3
Y(J)S MPLS Slide 120
2547bis (RFC 4364) VPNs
Virtual router (peering) model, not tunneling
PE maintains a VPN Routing and Forwarding (VRF) table for each VPN
BGP (with multiprotocol extensions) used for label distribution
in order to support private IP addresses in BGP, PE prepends 8B Route Distinguisher (unique to site) to IP address
CE is an IP router
CE is a routing peer of its PE
CE is not a routing peer of other CEs
(figure: customer devices (C) and CEs attached to PEs, with P routers in the provider core; packet shown as IP packet with ext label and SP label)
5.1
Y(J)S MPLS Slide 121
BGP MPLS VPNs (2547bis)
presently most popular provider managed VPN
originally specified in RFC 2547, updated in the draft called 2547bis (now RFC 4364)
transports IPv4 (IPv6) traffic in MPLS tunnels
uses BGP for route distribution, since SPs commonly use BGP for routing
2547 is not an overlay model
– CE routers at different sites are not routing peers
– they do not directly exchange routing information
– they don’t even need to know of each other
– so customer needn’t manage a backbone or virtual backbone
– no inter-site routing problems
5.1
Y(J)S MPLS Slide 122
BGP MPLS VPNs (cont.)
only PE routers maintain VPN information
P routers needn’t maintain any customer routing information
C routes either manually configured in PE or advertised to PE using BGP, OSPF, etc.
PE advertises routes to remote PEs using BGP
remote PEs advertise routes to their CEs using BGP, OSPF, etc.
IP address overlap solved using Route Distinguisher (RD)
5.1
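How the RD resolves address overlap can be shown by building the 12-byte VPN-IPv4 address. This sketch uses the type 0 RD layout from RFC 4364 (2B type, 2B AS number, 4B assigned number); the function name, the AS number 65000 and the assigned numbers are made-up example values.

```python
import ipaddress
import struct


def vpn_ipv4(asn: int, assigned: int, ip: str) -> bytes:
    """Prepend an 8-byte type 0 Route Distinguisher to an IPv4 address."""
    rd = struct.pack("!HHI", 0, asn, assigned)   # type | AS number | assigned number
    return rd + ipaddress.IPv4Address(ip).packed


# Two customers use the same private address.
customer_a = vpn_ipv4(65000, 1, "10.0.0.1")
customer_b = vpn_ipv4(65000, 2, "10.0.0.1")
```

Both results end in the same four address bytes, yet they differ as BGP routes because the RDs differ, so the PE can carry both without conflict.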
Y(J)S MPLS Slide 123
L2VPN - VPWS
Virtual Private Wire Service is a L2 point-to-point service
it emulates a wire supporting the Ethernet physical layer
set up MPLS tunnel between PEs
set up Ethernet PW inside tunnel
CEs appear to be connected by a single L2 circuit
(can also make VPWS for ATM, FR, etc.)
(figure: CE, AC, PE, provider network, PE, AC, CE)
5.2
Y(J)S MPLS Slide 124
L2VPN - VPLS
VPLS emulates a LAN over an MPLS network
set up MPLS tunnel between every pair of PEs (full mesh)
set up Ethernet PW inside tunnels, for each VPN instance
CEs appear to be connected by a single LAN
PE must know where to send Ethernet frames … but this is what an Ethernet bridge does
(figure: three CEs, each attached by an AC to its PE; for clarity only one VPN is shown)
5.2
Y(J)S MPLS Slide 125
VPLS
a VPLS-enabled PE has, in addition to its MPLS functions:
– VPLS code module (IETF drafts)
– bridging module (standard IEEE 802.1D learning bridge, but no STP)
SP network (inside rectangle) looks like a single Ethernet bridge!
Note: if CE is a router, then PE only sees 1 MAC per customer location
(figure: three CEs attached to PEs, each PE containing a VPLS module (V) and a bridging module (B))
5.2
Y(J)S MPLS Slide 126
Pseudowire Emulation
This is a special case of “Pseudowire Emulation Edge to Edge”, or PWE3
(figure: Customer Edge (CE) devices attach via native service to Provider Edge (PE) devices; PseudoWires (PWs) run in tunnels across the Packet Switched Network (PSN) between the PEs)
5.3
Y(J)S MPLS Slide 127
IETF PWE3 WG
In the Internet Area of the Internet Engineering Task Force
Native (layer 1, 2) services:
– ATM (port mode, cell mode, AAL5-specific modes)
– FR
– Ethernet (DIX, 802.3, VLAN)
– TDM (SONET/SDH, E1, T1, E3, T3)
PSNs:
– IPv4
– IPv6
– MPLS
– L2TPv3
5.3
Y(J)S MPLS Slide 128
PWE3 WG Charter
Edge-to-edge emulation and maintenance of PWs
– tunnel creation and placement out of scope
Network interworking, not service interworking
Must not exert controls on underlying PSN
– but diffserv, RSVP-TE can be used
Use RTP when necessary
– for real-time functions, clock recovery
Realize that emulation will not be perfect
– need applicability statement for each native service
WG will produce the following documents
– framework, requirements, architecture documents
– service specific encapsulation documents for each native service
5.3
Y(J)S MPLS Slide 129
Why MPLS for PWE3?
Much of the PWE work is focused on MPLS
Emulated services have QoS and TE requirements
– IP is basically a “best effort” service
– diffserv and RSVP extensions not prevalent
– MPLS can provide TE guarantees
– RSVP-TE (CR-LDP) allows TE signaling
IP provides no standard connection multiplexing method
– UDP/TCP ports provide application multiplexing
– RTP uses ports in a nonstandard way
– L2TP includes a multiplexing mechanism
– MPLS label stack provides natural multiplexing method
– using “inner labels” provides two layers of switching (like ATM VP/VC)
Dictionary:
– ATM-f: inner label, outer label(s)
– ITU-T: interworking label, transport label(s)
– IETF: PW label, tunnel label(s)
5.3
Y(J)S MPLS Slide 130
PWE packet format
layout: outer label | inner label | control word | payload
• inner and outer labels specify forwarding and multiplexing
  – outer label is standard MPLS stack
  – inner label contains PW label
• control word
  – enables detection of out-of-order and lost packets
  – flags may indicate critical alarm conditions (TDM PWs)
  – PW PID nibble
5.3