AARNet's experiences using MPLS for protection

Glen Turner, Network Engineer [email protected]

Internet2/NLANR Joint Techs Meeting, Boulder, CO, USA

2002-07-28

Australian Academic & Research Network http://www.aarnet.edu.au/


Topics

MPLS overview

Protection technology

AARNet's experiences with MPLS

Other interesting stuff if we have time

Coverage

MPLS is a big topic with multiple implementation choices at almost every turn

Only discuss some of the technology choices
● MPLS generic tagging, not ATM tagging
● RSVP and not LDP
● OSPF and not IS-IS

Coverage

Discuss the use of MPLS for protection, not some other important uses of MPLS
● VPNs (and thus BGP)
● GMPLS, the integrated control layer for switching technologies

“How to speak Australian”
● Words with “or” → “our”, “z” → “s”
● SONET → SDH (slight framing difference)
● T1 → E1 (E1 is 2Mbps)

Topic: MPLS overview

Label switching

Control plane protocols, RSVP

Routing protocols, OSPF

MPLS aims

Scalable IP traffic engineering
● Avoid need for full IP network knowledge at core

Virtual private network service
● By providing label switch paths exclusive to a customer

This presentation focuses on traffic engineering
● Only beginning to experiment with VPNs

MPLS is a layer 2½ protocol

[Figure: the seven-layer stack (1 Physical, 2 Link, 3 Network, 4 Transport, 5 Session, 6 Presentation, 7 Application), shown twice; the second copy adds MPLS between the Link and Network layers]

Advantages of layer 2½

No complex next hop algorithm
● IP address lookup is expensive
  – Closest matching prefix versus table lookup
● IP next hop algorithm gets more complex with each new service
  – Policy routing
  – Multicast

Want GbE switch prices not GbE router prices

New behaviours only affect edge routers
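
The difference between "closest matching prefix" and "table lookup" can be sketched as follows. This is a minimal illustration, not how a router implements either lookup (real routers use tries or TCAM); the prefixes, labels and interface names are hypothetical.

```python
import ipaddress

# Hypothetical IP forwarding table: prefix -> output interface.
IP_ROUTES = {
    ipaddress.ip_network("10.0.0.0/8"): "eth0",
    ipaddress.ip_network("10.4.0.0/16"): "eth1",
    ipaddress.ip_network("0.0.0.0/0"): "eth2",
}

# Hypothetical label table: label -> output interface.
LABEL_TABLE = {10: "eth1", 21: "eth0"}


def ip_next_hop(dst):
    """Closest matching prefix: test every prefix, keep the longest match."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in IP_ROUTES if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return IP_ROUTES[best]


def label_next_hop(label):
    """Label switching: one exact-match lookup on a fixed-length key."""
    return LABEL_TABLE[label]
```

The IP lookup has to compare the address against prefixes of differing lengths, whereas the label lookup is a single indexed read, which is the property that makes it cheap to build in hardware.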

Advantages of layer 2½, cond

No need to follow IP routing
● The shortest path may not be the best path
● Want policy
  – For traffic engineering
    ● Bandwidth
    ● Diverse routers and paths
  – For arbitrary customer requirements
    ● eg: Australian Army doesn't want to be routed over links not owned by Australian-controlled telcos

Advantages of layer 2½, cond

Why MPLS for policy and not BGP?
● BGP is globally visible
  – Scalability: Does outer Mongolia need to know of an interface failure in outback Australia?
  – Can lose connectivity due to dampening, which is essential due to global visibility
● Not all reasonable policies can be expressed in BGP

Disadvantages of layer 2½

Another set of control protocols
● ATM: OAM, ILMI, PNNI
● 802.1Q VLANs: Virtual LAN registration protocol
● SDH/SONET

MPLS uses IP as its control and routing protocol

Layer 2.5 and protection

Network layer protection requires a network-layer response
● Limited by convergence time of routing protocol
● Fast convergence and global visibility do not mix
  – BGP rate limiting is an expression of this

Layer 2.5 and protection

Link layer protection requires a link layer response
● These often have constrained topologies
  – SDH/SONET rings
  – 802.1D and parallel links

● They often inefficiently use protection bandwidth

● They often treat all network traffic as equally valuable

● Lack of network topology: poor decisions

Layer 2.5 and protection

Allow network layer to establish pre-routed fallback path
● Full topology awareness

Allow link layer to switch to fallback path
● Not globally visible
● Fast convergence

This could get messy upon multiple failures
● Run interior routing protocol afterwards

Forwarding equivalence class

Another view of IP routing
● Step 1: Determine forwarding equivalence class from IP header (or more)
  – Standard: Destination IP address
  – Advanced: source IP address, multicast group, DSCP, TCP port, increasingly bizarre
● Step 2: Look up FEC forwarding table to determine output interface (ie: switch the packet)

Forwarding equivalence class, cond

IP router calculates forwarding equivalence class at every hop
● Expensive
  – either in CPU time or hardware
● Extensive
  – IP forwarding table is big with frequent updates
● Difficult to alter for new behaviours
  – ASIC designers may not have anticipated the change (reverse path lookup, source-specific multicast)

Forwarding equivalence class, cond

MPLS switching
● Determine forwarding equivalence class at ingress
● Tag packet with a fixed-length label for this forwarding equivalence class
● Switch using the label at every other hop to egress
  – Tags are designed for hardware manipulation

Labels are not globally unique

Even one router can run multiple “label spaces”
  – eth0, eth1 in LS1
  – eth2, eth3 in LS2

Edge routers need distinct IP routing tables for each label space
● The key to MPLS VPNs
● We often want multiple routing tables and settle for policy routing instead

MPLS tag

A 32-bit header in front of the packet

Tag contains just enough information for forwarding and queuing
● Unlike IPv4/IPv6 header, which carries a lot more

Tag has hardware-friendly structure

MPLS tag, fields

Label
● Determines next-hop interface

Experimental (QoS)
● Determines output interface queuing

S for “last of stack”
● S=1 on last header

Time to live
● Discard upon zero, otherwise decrement
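
The four fields can be packed into the 32-bit tag as sketched below. The slides do not give the field widths; the widths used here (20-bit label, 3-bit Exp, 1-bit S, 8-bit TTL) are those of the generic MPLS label stack encoding, and the function names are only illustrative.

```python
import struct


def pack_tag(label, exp, s, ttl):
    """Build one 32-bit MPLS tag in network byte order."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)


def unpack_tag(data):
    """Split the first four bytes of data into the tag's fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }
```

For example, `pack_tag(10, 0, 1, 64)` describes label 10 as the last tag of the stack with a TTL of 64. Every field sits at a fixed bit offset, which is what makes the tag friendly to hardware.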

MPLS tag, stacking

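An MPLS tagged packet can be tagged again (“stacked”)
● Allows Provider-Provider connections to maintain customer tags
● Simplifies design considerably
● Avoids need for global label space

[Figure: a network-layer packet preceded by a stack of three tags; the last tag of the stack, next to the packet, has S=1 and the other tags have S=0]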

MPLS tag, stacking and MTU

The tag may reduce the size of the path maximum transmission unit (PMTU)
● TCP/IP stacks don't cope well with change of PMTU
  – PMTU at establishment of TCP determines TCP MSS
● Best to ensure that main and protect paths have identical tag depths

Or may not, if the link layer will let us flex the rules
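
The arithmetic behind the MTU concern is small enough to sketch. The 1500-byte link MTU and the 20-byte IPv4 and TCP headers (no options) are assumptions chosen for illustration; only the 4-byte tag size comes from the slides.

```python
TAG_BYTES = 4  # each MPLS tag is a 32-bit header


def payload_mtu(link_mtu, tag_depth):
    """Bytes left for the network-layer packet after tag_depth tags."""
    return link_mtu - TAG_BYTES * tag_depth


def tcp_mss(ip_mtu, ip_header=20, tcp_header=20):
    """TCP maximum segment size for a given IP MTU (assumes no options)."""
    return ip_mtu - ip_header - tcp_header
```

With these assumptions a main path with one tag leaves 1496 bytes for IP, while a protect path that pushes a second tag leaves 1492. A TCP connection established over the main path would have chosen its MSS for the larger value, which is why identical tag depths on main and protect paths are the safe choice.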

MPLS operation

[Figure: mpls-path.dia]

MPLS operation, cond – Label switch router

Incoming packet, look up incoming label map, which contains
● Incoming label
● MPLS opcode: PUSH, POP, etc
● Forwarding equivalence class
● Link to outgoing next hop label entry

MPLS operation, cond – Label switch router

Incoming packet operations
● Extract label from top tag
● Look up incoming label map
● Execute MPLS opcodes to manipulate tags
● Forward packet to outgoing processing

MPLS operation, cond – Label switch router

Outgoing packet, look up next hop label entry, which contains
● Outgoing label
● Outgoing interface
● Perhaps, outgoing per-hop queuing behaviour

MPLS operation, cond – Label switch router

Outgoing packet operations
● Look up next hop label entry
● Create new tag containing outgoing label
● PUSH tag onto label stack
● Add to transmit queue on outgoing interface
  – queuing discipline may depend upon
    ● Value in next hop forwarding entry
    ● Value determined from Exp bits, à la IP DSCP and weighted fair queuing + RED
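
The incoming and outgoing steps of a label switch router can be sketched with two small tables. The table layout, key names and the second entry's interface are assumptions made for illustration; the labels and next hop addresses mirror the Linux example later in the talk. Queuing and TTL handling are left out.

```python
# Incoming label map: (label space, incoming label) -> (opcode, next hop entry key)
ILM = {
    (1, 10): ("POP", "to_10.3.0.2"),
    (1, 21): ("POP", "to_10.2.0.1"),
}

# Next hop label entries: outgoing label, interface and next hop address.
NHLFE = {
    "to_10.3.0.2": {"label": 20, "interface": "eth1", "next_hop": "10.3.0.2"},
    "to_10.2.0.1": {"label": 11, "interface": "eth0", "next_hop": "10.2.0.1"},
}


def label_switch(label_space, stack):
    """Switch one packet; the top of the label stack is the last list element.

    Returns the outgoing interface and the new label stack.
    """
    stack = list(stack)                      # leave the caller's list untouched
    opcode, nhlfe_key = ILM[(label_space, stack[-1])]
    if opcode == "POP":
        stack.pop()                          # incoming processing removes the top tag
    entry = NHLFE[nhlfe_key]
    stack.append(entry["label"])             # outgoing processing PUSHes the new tag
    return entry["interface"], stack
```

Only the top tag is examined, so a packet that arrives with a deeper stack keeps its inner tags unchanged as it crosses the router.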

MPLS operation, cond – Ingress label edge router

Incoming packet, look up forwarding equivalence class to next hop label entry (FTN), which contains
● forwarding equivalence class
● next hop label entry

MPLS operation, cond – Ingress label edge router

Incoming packet operations
● Determine forwarding equivalence class using “standard” IP forwarding
  – Basic: look up destination IP address in IP forwarding table
  – Advanced: policy routing, multicast routing, QoS routing, ...
● Use FEC to look up forwarding equivalence class to next hop label entry table
● Process next hop label entry
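
A minimal sketch of the ingress steps, assuming the "basic" case where the forwarding equivalence class is simply the closest matching destination prefix. The table layout is hypothetical; the single entry reuses the prefix, label and next hop from the Linux ingress example later in the talk.

```python
import ipaddress

# FTN: forwarding equivalence class -> next hop label entry.
FTN = {
    ipaddress.ip_network("10.4.0.0/16"): {
        "label": 10, "interface": "eth0", "next_hop": "10.2.0.2",
    },
}


def classify(dst):
    """Step 1: determine the FEC as the closest matching destination prefix."""
    addr = ipaddress.ip_address(dst)
    matches = [fec for fec in FTN if addr in fec]
    return max(matches, key=lambda fec: fec.prefixlen) if matches else None


def ingress(dst):
    """Steps 2 and 3: look up the FTN and build the initial label stack."""
    fec = classify(dst)
    if fec is None:
        return None                      # no label switched path: plain IP forwarding
    entry = FTN[fec]
    return entry["interface"], [entry["label"]]
```

This is the only place on the path where the IP header is inspected; every later hop switches on the label alone.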

MPLS operation, cond – Egress label edge router

Next hop label entry shows this router as the penultimate hop

Protocol-dependent actions to simulate label switch routers being real routers
● Decrement IP TTL
● Generate any ICMP which would have occurred

Forward the packet using the standard IP algorithm

Faking ICMP gives interesting results

Traceroute from Glen's home to www.internet2.edu

 1 sadial.sa.csiro.au 119.657 ms 129.673 ms 100.004 ms
 2 sa.gw.csiro.au 119.944 ms 129.829 ms 110.382 ms
 3 lis255.atm1-0.central.saard.net 131.917 ms 119.858 ms 109.980 ms
 4 sa-nsw.atm.net.aarnet.edu.au 139.715 ms 149.829 ms 140.002 ms
 5 vlan916.gbe3-0.sccn1.broadway.aarnet.net.au 149.941 ms 149.773 ms 149.968 ms
 6 pos1-0.sccn1.manoa.aarnet.net.au 349.907 ms 279.791 ms 289.963 ms
 7 pos2-0.sccn1.seattle.aarnet.net.au 279.866 ms 329.880 ms 279.904 ms
 8 Abilene-PWAVE.pnw-gigapop.net 279.870 ms 351.155 ms 328.555 ms
 9 dnvr-sttl.abilene.ucaid.edu 339.933 ms 339.861 ms 329.944 ms
10 kscy-dnvr.abilene.ucaid.edu 349.847 ms 339.622 ms 350.053 ms
11 ipls-kscy.abilene.ucaid.edu 339.756 ms 339.932 ms 339.903 ms
12 clev-ipls.abilene.ucaid.edu 339.884 ms 349.808 ms 339.963 ms
13 nycm-clev.abilene.ucaid.edu 349.752 ms 349.857 ms 339.969 ms
14 border-abilene-oc3.advanced.org 360.135 ms 359.857 ms 379.851 ms
15 www.internet2.edu 379.865 ms 359.838 ms 359.950 ms

Architectural issues

There is a lot of complexity at the edge
● Especially in the egress router

But we want the edge to be cheap, as there is a lot of it

There are no MPLS applications

ATM has applications
● (Today's bizarre but true fact)

Links between 3G base stations and switching points are the most recent application to treat ATM as a transport layer

Even ethernet has applications
● DEC Local Area Transport

There are no MPLS applications

MPLS exists only to carry other protocols
● The label edge routers must support the protocol
● This isn't new
  – All routers have to support the network layer protocol they are routing

Model is strained somewhat by abuse of MPLS to carry ethernet frames

Configuring a label switch router – Linux

Both eth0 and eth1 in label space 1
● mplsadm -L eth0:1
  mplsadm -L eth1:1

Configuring a label switch router – Linux

Configure label switching
● mplsadm -A -I gen:10:1 -O gen:20:ipv4:10.3.0.2 -B
  mplsadm -A -I gen:21:1 -O gen:11:ipv4:10.2.0.1 -B
  – -A -B: add and bind
  – -I: incoming on eth0, generic tag, label 10
  – -O: outgoing on eth1, generic tag, label 20, only if next hop is available

Configuring a label edge router – Linux

Configuration for left-most router

Label space
● mplsadm -L eth0:1
  mplsadm -L eth1:1

Configuring a label ingress router – Linux

Ingress label edge router

Set forwarding equivalence class in routing subsystem

● route add -net 10.4.0.0/16 gw 10.2.0.2

Set FEC in MPLS subsystem
● mplsadm -A -B -O gen:10:eth0:ipv4:10.2.0.2 -f 10.4.0.0/16
  – outgoing label of 10

Egress label edge router
● mplsadm -A -I gen:11:1


Configuring a label egress router – Linux

Egress label edge router

Incoming MPLS packets with label 11 are POPed and escalated to IP routing system

● mplsadm -A -I gen:11:1

Configuring a label edge router – Linux

Label space
● mplsadm -L eth0:1
  mplsadm -L eth1:1

Ingress label edge router
● Forwarding equivalence class is determined by routing sub-system
● route add 10.4.0.0/16 gw 10.2.0.2
  mplsadm -A -B -O gen:10:eth0:ipv4:10.2.0.2 -f 10.4.0.0/16

Egress label edge router
● mplsadm -A -I gen:11:1


Representation

How should MPLS look to the network layer?

The preceding is not a good fit
● eth0 has multiple subnets
● eth0 can be partially down
● Routing protocols need considerable work

Representation

A tunnel seems a good fit
● Tunnels run between routers, making intermediate routers invisible
● Tunnels have MTU issues, as does MPLS
● Routing protocols understand tunnels
● Management systems' expectations are met
  – interface either down or up
  – SNMP counters count something useful

Configuring a label ingress router with tunnels – Linux

Create MPLS tagging
● mplsadm -A -O gen:10:eth0:ipv4:10.3.0.2
  – -A -O: add outgoing label
    ● gen:10: generic tag with label 10
    ● eth0: outgoing interface
    ● ipv4:10.3.0.2: address of remote-end of tunnel

Configuring a label ingress router with tunnels – Linux

Create a tunnel interface
● mplsadm -A -T mpls0
  – -A -T: Add tunnel
    ● mpls0: tunnel interface name

Configuring a label ingress router with tunnels – Linux

Assign an IP address to the local end of the tunnel, use the same address as the ethernet interface

● ifconfig eth0 inet addr:10.2.0.1
  ifconfig mpls0 10.2.0.1 netmask 255.255.255.255

– mpls0: tunnel interface to configure

– 10.2.0.1: local-end IPv4 address

Configuring a label ingress router with tunnels – Linux

Bind outgoing label to tunnel
● mplsadm -B -O gen:10:eth0 -T mpls0
  – -B -O: bind outgoing label
    ● gen:10: generic tag with label 10
    ● eth0: interface
  – -T: tunnel
    ● mpls0: tunnel interface name

Configuring a label ingress router with tunnels – Linux

Forward traffic to mpls0 tunnel
● route add -net 10.4.0.0/16 gw 10.3.0.2 dev mpls0

– 10.4.0.0/16: Forwarding equivalence class

– gw 10.3.0.2: remote tunnel-end address

– dev mpls0: next hop interface

Configuring a label egress router with tunnels – Linux

Same as normal egress
● mplsadm -A -I gen:11:1

Configure label edge router – Linux

Configure the rightmost label edge router similarly

We want to do this automatically
● That is, to use a signalling protocol

Topic: MPLS overview

Label switching

Control plane protocols, RSVP

Routing protocols, OSPF

IP-based signalling and routing

Unusual, most link technologies develop their own signalling and routing
● Ethernet: bridge protocol data unit
  – carries
    ● 802.1D spanning tree
    ● 802.1Q virtual LAN registration protocol

● ATM: OAM, ILMI and PNNI

Signalling

LDP: Label distribution protocol

RSVP: Resource reservation protocol

We'll only discuss RSVP

RSVP

A soft-state protocol for establishing and maintaining IntServ QoS paths
● Sent Path message requests an IntServ path
● Received Resv message confirms an IntServ path request

RSVP, cond

New RSVP objects for MPLS paths

Path message
● LABEL_REQUEST: create a label switched path
● EXPLICIT_ROUTE: through these label switch routers

Resv message
● LABEL: Inserts entry into label switch forwarding table
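
The way the Resv message fills in the forwarding tables can be sketched as below. This is a toy model, not an RSVP implementation: the router names are hypothetical, each router simply takes the next free label, and labels start at 16 because lower values are reserved. The Path message is assumed to have already travelled from ingress to egress along the explicit route; the function models the Resv travelling back.

```python
import itertools


def signal_lsp(explicit_route, first_label=16):
    """Return {router: (incoming label, outgoing label)} for one path.

    The ingress has no incoming label and the egress has no outgoing
    label; both are shown as None.
    """
    next_label = itertools.count(first_label)
    tables = {}
    downstream_label = None                  # the egress forwards as plain IP
    # The Resv message walks the explicit route in reverse, egress first.
    for router in reversed(explicit_route):
        is_ingress = router == explicit_route[0]
        incoming = None if is_ingress else next(next_label)
        tables[router] = (incoming, downstream_label)
        downstream_label = incoming          # carried upstream in the LABEL object
    return tables
```

Each router learns its outgoing label from the router downstream of it, so no router needs to know any label beyond those on its own links.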

RSVP and traffic engineering

Sometimes don't want the shortest path
● A longer congestion-free path is always better than a shorter congested path
● Bizarre customer requirements
  – eg: ADF and links controlled by non-ANZUS telcos
● Diversity
  – Complex, as lots of failure modes
    ● Don't want to share core, cable, conduit, router, UPS, building, site, block, substation, road, flood plain, craft personnel, jurisdiction

RSVP and diversity

● RSVP has “resource affinities”, roughly 32 per label space
  – Enough for broad-brush use, say for a national backbone
  – AARNet doesn't use this
    ● Our use of MPLS is either too trivial or too complex

RSVP and degraded service

RSVP has a Setup Priority and Holding Priority
● These allow established paths to be pre-empted by a new path
● AARNet considering use for recovery scenarios
  – So we can prioritise use of degraded capacity
  – eg: voice, commodity, research, quality video, multicast
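
How the two priorities interact can be sketched as follows. In RSVP-TE the priorities run from 0 (most important) to 7, and a new path may pre-empt an established one only when its setup priority is numerically lower than the established path's holding priority. The admission routine around that rule, and the dictionary fields it uses, are simplifications invented for this example.

```python
def can_preempt(new_setup_priority, existing_holding_priority):
    """0 is the highest priority, 7 the lowest."""
    return new_setup_priority < existing_holding_priority


def admit(link_capacity, established, new_lsp):
    """Try to fit new_lsp on a link, pre-empting lower-priority paths if needed.

    established: list of {"name", "bandwidth", "holding"}
    new_lsp:     {"name", "bandwidth", "setup"}
    Returns the names of pre-empted paths, or None if new_lsp cannot fit.
    """
    free = link_capacity - sum(lsp["bandwidth"] for lsp in established)
    victims = []
    # Consider the least important established paths first.
    for lsp in sorted(established, key=lambda l: l["holding"], reverse=True):
        if free >= new_lsp["bandwidth"]:
            break
        if can_preempt(new_lsp["setup"], lsp["holding"]):
            victims.append(lsp["name"])
            free += lsp["bandwidth"]
    return victims if free >= new_lsp["bandwidth"] else None
```

On a degraded link this lets, say, a voice path with a low priority number displace a video path with a high one, while the reverse request is refused.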

RSVP node failure

Hello protocol
● HELLO REQUEST
● HELLO ACK

Detects
● Node down
● Node reboot
  – Thus needs instant path re-establishment
● All links between the two nodes have failed

RSVP node failure, cond

No “alarm hierarchy” of Hellos
● They run on every label switch path

Good
● Alarm hierarchies often fail
  – CPU overwhelmed by massive failure

Bad
● Bandwidth and CPU interrupts
● End-to-end, not segment-based

This won't do for GMPLS

Signalling configures path in one direction

Important that other direction be established :-)

It should follow the same physical segments
● Balakrishnan, Padmanabhan, Fairhurst, et al
  TCP performance implications of network path asymmetry
  – draft-ietf-pilc-asym-07

Topic: MPLS overview

Label switching

Control plane protocols, RSVP

Routing protocols, OSPF

Requirements

We want to specify paths with
● Forwarding equivalence class
● Origin and destination node
● Path placement constraints

So the routing protocol needs to distribute
● Connectivity
● Path attributes to satisfy constraint calculations

Possible constraints – Routers

Support for prioritisation

Support for protocols

Available bandwidth

Link technologies

Protection switching technologies

Possible constraints – Links

Available bandwidth

Reliability

Colour

Cost

Membership of shared link risk group
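
One common way to place a path under such constraints is to prune the links that fail them and then run an ordinary shortest path calculation over what remains. The sketch below does this for two of the link constraints (available bandwidth and colour); it is an illustration of the idea, not the algorithm of any particular router, and the link tuple layout is an assumption.

```python
import heapq


def constrained_path(links, src, dst, bandwidth, exclude_colours=0):
    """Cheapest path from src to dst using only links that satisfy the constraints.

    links: list of (node_a, node_b, cost, available_bandwidth, colour_mask),
           each usable in both directions.
    Returns (total cost, [nodes]) or None if no path satisfies the constraints.
    """
    graph = {}
    for a, b, cost, available, colours in links:
        if available < bandwidth or colours & exclude_colours:
            continue                              # prune links that fail a constraint
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, src, [src])]                     # Dijkstra over the pruned graph
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == dst:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, cost in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (dist + cost, neighbour, path + [neighbour]))
    return None
```

A request that needs more bandwidth than the short route can offer is therefore steered onto a longer route, and a request for which no route passes the constraints fails outright rather than being placed badly.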

OSPF implementation

Add new link state advertisement types which contain link attributes

These LSAs should be ignored by standard OSPF – they are “opaque”

There are three new Opaque LSAs, all identical except for flooding scope

Add an OSPF Hello option so neighbours can become Opaque LSA neighbours and pass Opaque LSAs

Structure of the opaque (huh?)

List of TLVs for routers and links
● Type, length, value
● Allows an unsupported variable to be silently ignored

Attributes are held in sub-TLVs
● TLVs within TLVs

Routers sub-TLV
● Router ID
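
Why TLVs allow unsupported items to be silently ignored can be seen in a short parser. The layout assumed here is that of the OSPF traffic engineering extensions as I understand them: a 16-bit type, a 16-bit length and a value padded to a four-byte boundary, with type 1 carrying the router's address and type 2 describing a link. Treat those numbers as assumptions if you are working from a different draft.

```python
import struct

KNOWN_TYPES = {1: "Router Address", 2: "Link"}


def parse_tlvs(data):
    """Split a byte string into (type, value) pairs."""
    tlvs = []
    offset = 0
    while offset + 4 <= len(data):
        tlv_type, length = struct.unpack_from("!HH", data, offset)
        value = data[offset + 4 : offset + 4 + length]
        tlvs.append((tlv_type, value))
        offset += 4 + ((length + 3) & ~3)     # skip the value and its padding
    return tlvs


def describe(data):
    """Keep the TLVs this implementation understands, silently skip the rest."""
    return [(KNOWN_TYPES[t], v) for t, v in parse_tlvs(data) if t in KNOWN_TYPES]
```

Because the length field says how far to skip, a parser does not need to understand a TLV in order to step over it. Sub-TLVs can be read by calling `parse_tlvs` again on a TLV's value.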

Structure of the opaque

Link TLV
● Identity sub-TLVs
  – Link type: point-to-point, multi-point
  – Router ID of neighbour
  – Local interface IP address
  – Remote interface IP address

Structure of the opaque

Link TLV
● Traffic engineering sub-TLVs
  – Traffic engineering metric, 32-bit cardinal
  – Maximum bandwidth, 32-bit floating point
  – Maximum reservable bandwidth, 32-bit floating point
  – Unreserved bandwidth, 32-bit floating point
  – Resource colour, 32-bit mask
    ● A “colour” might be a DWDM channel, or an E1 time-slice within an E3, or a ...
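
The 32-bit floating point bandwidth fields can be encoded as sketched below. The slides only say "32-bit floating point"; the choice of IEEE single precision in network byte order, with units of bytes per second, follows the OSPF traffic engineering specification as I understand it and should be treated as an assumption here.

```python
import struct


def encode_bandwidth(bits_per_second):
    """Encode a link bandwidth as a 4-byte float in bytes per second."""
    return struct.pack("!f", bits_per_second / 8)


def decode_bandwidth(data):
    """Decode the 4-byte field back into bits per second."""
    (bytes_per_second,) = struct.unpack("!f", data)
    return bytes_per_second * 8
```

A single-precision float carries only about seven significant decimal digits, which is ample for describing link capacity but means values are not always stored exactly.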

Limitations - flooding

Traffic engineering values can change rapidly and repeatedly
● Available bandwidth

Important to limit flooding

Opaque LSAs don't do this nearly as well as they could as there are only three flooding scopes

Limitations - summarisation

Difficult to summarise traffic engineering information

Thus areas are difficult to construct

But areas are vital in limiting flooding

Configuration – Zebra – Interface control

zebra.conf
● interface eth0
   bandwidth 100000
   description Link to LSR
   ip address 10.2.0.1/30
● interface eth1
   description Hosts
   bandwidth 100000
   ip address 10.1.0.1/16
● interface mpls0
   description Tunnel
   bandwidth 100000
   ip address 10.3.0.2/32
   no multicast
   ipv6 nd suppress-ra

Configuration – Zebra – OSPF router

ospfd.conf
● router ospf
   ospf router-id 10.2.0.1
   auto-cost reference-bandwidth 10000
   area 0 authentication message-digest
   network 10.1.0.0/16 area 0
   network 10.2.0.0/30 area 0
   network 10.3.0.2/32 area 0
   neighbor 10.3.0.2
   capability opaque
   mpls-te
   mpls-te router-address 10.2.0.1

Configuration – Zebra – OSPF interfaces

ospfd.conf
● interface eth0
   ip ospf network broadcast
   ip ospf authentication message-digest
   ip ospf message-digest-key ... md5 ...
   mpls-te link metric 0
   mpls-te link max-bw 1e+07
   mpls-te link max-rsv-bw 5e+06
   mpls-te link rsc-clsclr 0x1
● interface eth1
   ip ospf network broadcast
   ip ospf authentication message-digest
   ip ospf message-digest-key ... md5 ...

Configuration – Zebra – OSPF tunnel interface

ospfd.conf
● interface mpls0
   ip ospf network point-to-point
   ip ospf authentication message-digest
   ip ospf message-digest-key ... md5 ...

MPLS is improving OSPF

Dynamic shortest path first algorithms
  – About 10% of full-DB Dijkstra

Hitless restart
  – Remove assumption OSPF comes up in quiescent network

Graceful handling of failure
  – Database overflow
  – Rate limiting
    ● Especially of flapping interfaces

Load sharing – AARNet's US capacity

AARNet's load share configuration – South

● interface POS1/0
   description Seattle-Sydney SDH
   ip address 192.231.212.34 255.255.255.252
   ip ospf cost 128
   mpls traffic-eng tunnels
   mpls traffic-eng backup-path Tunnel8204
   tag-switching ip
   pos ais-shut
   pos report lrdi
   ip rsvp bandwidth 150000 150000
   ...

AARNet's load share configuration – North

● interface POS2/0
   description Seattle-Manoa SDH
   ip address 192.231.212.162 255.255.255.252
   ip ospf cost 64
   mpls traffic-eng tunnels
   mpls traffic-eng backup-path Tunnel 8203
   tag-switching ip
   pos ais-shut
   pos report lrdi
   ip rsvp bandwidth 150000 150000

OSPF design hints – Use current best practice

● Small area 0, consistent with TE
  – Area 0 has total network knowledge
● Using areas allows address aggregation
  – Most importantly this aggregates network state
  – Addressing needs to be thought out in advance

[Figure: network diagram showing Sydney core, Wollongong core, Manoa and Seattle, divided into Area 0, Area 1 and Area 2]

OSPF design hints, cond

● Loopback interface as router ID
  – Make this a /32
● Broadcast and loop media have an advantage
  – Only two routers in subnet (DR and BDR) track area state
● Don't redistribute
  – Use network statements
  – You'll end up with a lot of these so use a Perl script
● Use MD5 authentication

Topic: Protection technology

Fast re-route

Basic mechanism

Fast re-route
● Detect fault using
  – Link layer carrier loss
  – RSVP Hello timeout (150ms)

● Signal failure using RSVP ResvTear message

● Change to pre-established label switch path

● Recalculate optimal paths by running OSPF

RSVP messages

Path FAST_REROUTE
● Request a path protected with a fast re-route path

Path DETOUR
● Request a fast re-route path

Two modes of operation

J: LSP oriented
● Establish a detour LSP to protect one other LSP
● Upon failure switch packets to the detour LSP

Two modes of operation

C: Tunnel oriented
● Establish a tunnel to protect other tunnels
● Upon failure send the packets through the tunnel
  – pushing onto the label stack
● One backup tunnel can protect many other tunnels

These don't interoperate. Ouch

OSPF run to clean up

A multiple failure may not lead to a sane topology

OSPF is run to route all active main and detour LSPs optimally

Need to rate limit how often this is done
● else intermittent interface failures will use more CPU than they deserve

Tunnel-style fast re-route

The main LSP
● interface POS1/0
   description Seattle-Sydney fiber
   ip address 192.231.212.34 255.255.255.252
   mpls traffic-eng tunnels
   ! Seattle-Manoa protect
   mpls traffic-eng backup-path Tunnel8204
   tag-switching ip
   pos ais-shut
   pos report lrdi
   ip rsvp bandwidth 150000 150000
   ...

Tunnel-style fast re-route

The backup tunnel
● interface Tunnel8204
   description Seattle-Manoa backup
   ip unnumbered Loopback0
   tag-switching ip
   ! Loopback0 on manoa
   tunnel destination 192.231.212.148
   tunnel mode mpls traffic-eng
   tunnel mpls traffic-eng priority 0 0
   tunnel mpls traffic-eng path-option 1 explicit name sea-haw
   ...

Topic: AARNet's experiences

Configuration

Protection

Measurement

Properties of international links

Future

“You are in one of a large number of tunnels, all seemingly alike”

The number of MPLS paths explodes quickly

It took some time and a lot of care to get all the tunnels established

Managing tunnels

Naming conventions

Only the beginning of automated tools
● These tend to be proprietary rather than general, and driven from a GUI rather than a database

Had to build a lot of our own tools
● SNMP program to check all LSPs had reverse LSP
● Wanted to write more but insufficient router MIBs

Topic: AARNet's experiences

Configuration

Protection

Measurement

Properties of international links

Future

Restoration

MPLS performance should be worse than SDH performance
● MPLS is end-to-end protection and the link latency is 80ms
● SDH has section protection, longest section is 40ms

Restoration

This was true in practice
● Still not enough time for a phone user to hang up
● Too long to be used to switch routers in and out of working path
  – Want to do this for software upgrades

Performance under stress

MPLS restoration was better behaved than SDH when things fell apart
● AARNet's network management system has a sophistication that the SDH systems do not have
  – This leverages off the work on monitoring generic IP links
● We could detect and isolate odd conditions before they threatened service

SDH in practice

SDH alarms can overwhelm management console
● Some vendors have poor isolation between configuration and operation

Configuration errors are disturbingly common

No interlocks
● Put main circuit into loopback
● Put protect circuit into loopback

OSPF

Far too easy to cause OSPF-TE to fail
● Flapping interfaces drove CPU to 100%
● CPU then fails to generate OSPF Neighbour Hellos
● OSPF loses adjacencies
● CPU returns to 0%
● Repeat

OSPF, cond

Fixes
● Obvious solution is to rate limit repeated OSPF next state output where state inputs are the same
  – Router manufacturers have gone for simpler variants of this, such as rate limiting all state changes
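
The simpler variant, rate limiting all state changes, can be sketched as a hold-down around the expensive calculation. The class and its method names are hypothetical and the hold-down value is arbitrary; the sketch only shows the shape of the mechanism, not any vendor's implementation.

```python
class SpfThrottle:
    """Run the shortest path calculation at most once per hold-down interval."""

    def __init__(self, hold_down):
        self.hold_down = hold_down     # seconds between permitted runs
        self.last_run = None
        self.pending = False           # a change arrived during the hold-down
        self.runs = 0

    def state_change(self, now):
        """Call on every link state change; returns True if SPF ran."""
        if self.last_run is None or now - self.last_run >= self.hold_down:
            self.last_run = now
            self.pending = False
            self.runs += 1             # stands in for the real calculation
            return True
        self.pending = True            # recalculate once the hold-down expires
        return False

    def tick(self, now):
        """Call periodically; runs a deferred calculation when it is due."""
        if self.pending and now - self.last_run >= self.hold_down:
            return self.state_change(now)
        return False
```

A flapping interface that produces a hundred changes within a second then costs two calculations rather than a hundred, which leaves CPU free to keep generating Hellos.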

OSPF, cond

Fixes, cond
● Dynamic alternatives to Dijkstra algorithm
  – Run time depends on “importance” of lost link, not size of total database
  – In practice, about 10% resources of standard algorithm

Topic: AARNet's experiences

Configuration

Protection

Measurement

Properties of international links

Future

Measurement

Traceroute and ping haven't been useful as performance-measuring tools since flow routing

MPLS nails coffin shut
● It's all faked at egress router, probably on slow path

Active measurement

Need to allow for parallel paths
● Four adjacent IP addresses on measuring platforms
● Hashing will place these on differing paths to the same destination

Need to use a fast-path protocol
● Not ICMP

Be careful not to measure the measurement host
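
The reason for probing from several source addresses can be sketched with a toy hash. The hash a real router uses is vendor-specific and is not given in the talk, so the SHA-256 based function here is purely a stand-in; the point is only that the path is a fixed function of the address pair.

```python
import hashlib


def choose_path(src, dst, paths):
    """Pick one of the parallel paths from a hash of the address pair."""
    digest = hashlib.sha256(f"{src}-{dst}".encode()).digest()
    return paths[int.from_bytes(digest[:4], "big") % len(paths)]


def probe_coverage(sources, dst, paths):
    """Show which path each probe source address would exercise."""
    return {src: choose_path(src, dst, paths) for src in sources}
```

A single source address always measures the same path, however many probes it sends. Several adjacent source addresses make it likely, though in this sketch not certain, that every parallel path is exercised, so it is worth checking the coverage rather than assuming it.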

Active measurement

Loss
● Indicates major fault or congestion

Latency
● Indicates protection or misconfiguration
  – Measurement system needs to know nominal latency for main and protect paths
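
Knowing the nominal latencies lets the measurement system turn a latency sample into a statement about which path is in use. A minimal sketch, in which the tolerance and all latency figures are hypothetical:

```python
def classify_latency(measured_ms, main_ms, protect_ms, tolerance_ms=5.0):
    """Decide which path a latency sample is consistent with."""
    if abs(measured_ms - main_ms) <= tolerance_ms:
        return "main"
    if abs(measured_ms - protect_ms) <= tolerance_ms:
        return "protect"
    return "unexpected"
```

A result of "protect" suggests a protection event has occurred, while "unexpected" points towards misconfiguration or congestion and is worth a closer look.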

SNMP

Needed to detect protection event

Needed to detect loss of protect path
● RECOVERY
  Service: Tunnel8194 (Broadway-Seattle) Backup
  Host: SCCN-Broadway-Router
  Address: 162.231.212.20
  State: OK
  Interface: OK – 1
  Date: Sat 20 Jul 12:54:24.3

Useful for checking configuration
● Each label switch path has a reverse path
● Each main LSP has a protect LSP

MPLS load sharing in operation – SCCN interfaces in Sydney

Sydney, NSW – Seattle, WA

Sydney, NSW – Manoa, HI – Seattle, WA

MPLS load sharing in operation

Graphs are similar in shape but not in detail
● Load sharing is by hashing
  – As round robining would deliver every second packet out-of-order
● Sydney-Manoa traffic is not load-shared
  – As the southern path is Sydney-NZ-Fiji-Seattle-Manoa

MPLS traffic engineering in operation – Manoa

Typically file transfer traffic

Label switched path Sydney, NSW – Manoa, HI

MPLS traffic engineering in operation – Manoa

No load sharing Sydney-Manoa
● South path only used if much more direct North path fails

Topic: AARNet's experiences

Configuration

Protection

Measurement

Properties of international links

Future

SDH

SDH works well

But you can't build a genuine 99.999% availability with only one redundant path

Failure patterns

Many small single-segment failures
● These are usually intentional
  – Software upgrades
  – Maintain consistent active service age of equipment
● “Hits” of 50ms or less
  – Better to blackhole this traffic rather than attempt a protection switch
● When should we declare a path failure?
  – A long delay avoids MPLS fast re-routes at the cost of greater time to restore service upon a genuine segment failure

Failure patterns

Causes of major failures
  – Physical break of cable
    ● SCCN had a cable break whilst maintaining the Protect segment
  – Craft technician error
    ● Decommission or loopback of wrong link

Failures are often made worse
  – Loopback test both segments simultaneously
  – Insufficient CPU provisioning in control plane
  – Network Management System fails when most needed

MPLS fast re-route tuning

Configuration
● Need to calculate value for MPLS fast re-route hold-down timer from capacity vendor's SDH automatic protection switching tables
  – You'll get lots of small hits otherwise
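
The trade-off in choosing the hold-down timer can be put into numbers. In this sketch the outage durations and the switch time are hypothetical; in practice the hold-down would be derived from the capacity vendor's protection switching figures, as the slide says.

```python
def protection_switches(outages_ms, hold_down_ms):
    """Count the outages that outlast the hold-down and so trigger fast re-route."""
    return sum(1 for outage in outages_ms if outage > hold_down_ms)


def restore_time_ms(hold_down_ms, switch_ms):
    """Service restoration time after a genuine failure: wait, then switch."""
    return hold_down_ms + switch_ms
```

With no hold-down every small hit causes a protection switch. A hold-down a little longer than the SDH layer's own protection switching removes those switches, at the price of adding the same delay to recovery from a genuine failure.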

International link interior routing design

International links are an obvious OSPF stub area
● With OSPF default pointing back towards NOC
  – BGP default might point towards a US ISP
  – Exterior default overrides interior default during normal operation
● A stub area is good as we want to isolate MPLS-TE information for an international link

H.323 configuration

H.323 gatekeeper should always reject calls to PoP console server modems
● Forcing calls to re-route to PSTN without needing a prefix
● Uni phone books never list that uni's prefix to defeat VoIP toll bypass

Personal relationships are important

SCCN has been forthright and honest about failures
● Helps considerably to estimate risk of outage recurring
● US ISPs compare poorly

AARNet's unusual requirements interested the SCCN technical staff

Allowed us to build excellent relationships which have carried over into operation

Topic: AARNet's experiences

Configuration

Protection

Measurement

Properties of international links

Future

Future intentions

Obviously fuzzy

Lots of trans-Pacific capacity coming available
● SCCN (AU-US)
● AU-JP
● US-JP

Opportunity to construct protection against design and operational failure by international capacity providers

MPLS across undersea vendors

MPLS should be able to do multi-vendor protection better than SDH
● SDH has no segment visibility in this application
● No clocking issues
● No “profile” issues

MPLS VPNs look like a good idea

MPLS VPNs look attractive for virtual research networks
● For example, to research routing protocols
● When MPLS becomes a campus technology it then allows network policies smaller than an autonomous system
  – Not necessarily a good thing
  – Moves the complexity (not removes the complexity)
  – At least the complexity is no longer seen in global BGP routing table

MPLS VPNs look like a bad idea

VPNs are useful for “crunchy outside, soft inside” firewalled networks
● Do not need a firewall at each site
● Firewall configuration is simpler

Assumes that the “baddies” are on the outside, ROFL

MPLS configured as best effort offers no protection from denial of service attacks in network “interior”

Use of MPLS to simplify BGP

A&R networks often want to offer transit to other A&R networks

Problem: BGP configuration for this can be complex, and transit network gets caught up in this complexity

Solution: offer MPLS transit

Use of MPLS to simplify BGP

Often use policy routing because we want multiple routing instances in one router
● Operational nightmare, especially in protection scenarios

Can use MPLS to implement this
● Run two BGP instances, one per VPN
● Place interfaces in particular MPLS VPNs

MPLS monitoring

Routers don't provide nearly enough performance information
● Protection
  – How long did protection take?
  – What was the cost in CPU resource?
  – Enabling capacity planning for protection
● End-to-end performance
  – Loss
  – Latency, especially changes

Topics if we have time

Quality of service

MPLS on Linux

GMPLS

Forwarding equivalence class and quality of service

A forwarding equivalence class is mainly about routing

Quality of service is mainly about queuing

Two choices

Place differing QoS into differing FECs
● Label switch router uses label to infer forwarding and queuing

Place differing QoS into same FEC
● Use Experimental bits to mark 3 bits of service
  – Treating Exp similarly to IP's DSCP

Not really a choice, we can do both

It might make router configuration easier if queueing discipline were always driven from Exp bits

Even if Exp always has the same value for that forwarding equivalence class

Traffic engineering and QoS routing

Traffic engineering can be used to recover some quality services before others
● Recover voice services before recovering best-effort data before recovering video

See “reservation priority”

Experience with QoS so far

No harder than IP DiffServ :-)
● Same lack of coherent, total solution
● Same “will be supported in next version” issues

Topics if we have time

Quality of service

MPLS on Linux

GMPLS

MPLS on Linux

mpls-linux on sourceforge
● Kernel patches against 2.4
  – Compiles against 2.4.18-rc3-ac1 after about an hour's work

● Includes Nortel-based LDP, library and command line configuration. Supports tunnels and MPLS opcode programming. Over ATM, ethernet and (with patches) PPP.

● Beta

MPLS on Linux

Zebra ospfd
● OSPF-TE with Opaque LSA
● CVS is usually more stable than releases
● Late beta

MPLS on Linux

rsvpd
● Not yet with Hellos and fast re-route
● Early beta, research orientation

MPLS on Linux

NIST Switch
● For BSD
● Reasonably complete MPLS, RSVP-TE implementation
● Web site suggested a Linux port could happen, but this was in 2001
● Current status?

Topics if we have time

Quality of service

MPLS on Linux

GMPLS

GMPLS expands MPLS's aims

World domination
● One control plane protocol
  – RSVP-TE

Controlling all switching mechanisms
● MPLS
  – ATM
  – Ethernet
  – RPR
● SDH/SONET

GMPLS expands MPLS's aims, cond

By viewing all switching as a special case of MPLS switching we can get a single
● Control layer
  – Not one per switching mechanism
● Management domain
  – Not one per vendor per switching mechanism
● Security mechanism
  – Not a billion passwords, all known and unalterable

That's all folks

Further reading – Books

Davie & Rekhter
MPLS: Technology and applications
● Good but dated

Alwayn
Advanced MPLS design and implementation
● Good coverage of TE and VPNs
● “Advanced” only in sense of not a “... for dummies” book

Further reading – Internet drafts

Sharma, Hellstrand (eds)
Framework for MPLS-based recovery
● draft-ietf-mpls-recovery-frmwrk-05

Lai, McDysan (eds), Boyle, et al
Network hierarchy and multilayer survivability
● draft-ietf-tewg-restore-hierarchy-00

Further reading – Internet drafts, cond

Owens, Sharma, Oommen, et al
Network survivability considerations for traffic engineered IP networks
● draft-owens-te-network-survivability-03