IP RAN Backhaul A Learning Experience
Sumon Ahmed Sabir ([email protected])
Md. Abdullah-Al-Mamun ([email protected])
APRICOT 2014, Malaysia
Before the IP RAN Backhaul
Before we begin
• An IP/MPLS network with a 10G backbone ring; the ring consists of four Cisco 7609 routers
• Around 2,500 end nodes carrying Internet and data traffic
IP Protocols
• OSPFv2 for IGP
• LDP for label distribution
• BGP for L3VPN
Initial Network Design
Requirement of IP RAN Backhaul
What did we need to do?
• Provide IP RAN backhaul for 3G operators
• Also provide 2G backhaul, mostly E1s, over the IP network
• Initially, three operators asked for service:
  - The 1st operator wants L3VPN, 100 Mbps per Node B
  - The 2nd and 3rd operators want L2 circuits, 20 Mbps per Node B
  - All of them need legacy E1s for their 2G sites
Service Requirements
• MPLS L3VPN/VRF
• MPLS L2 circuit
• SDH (E1) over the IP network
Challenges
Initial Challenges
• Carry mobile voice and data seamlessly
• Fast convergence on link/node failure, without interrupting voice calls
• Connect 500 nodes in six months, 1,000 nodes in one year, and possibly 5,000 nodes within the next two years
Design Philosophy
• Follow best practices
• Keep it simple
• A scalable design that addresses future growth
Design Considerations
• Single AS, multi-area OSPF
• End-to-end MPLS with LDP
• 1G ring topology for the access layer (OSPF area X)
• 10G mesh/ring topology for the backbone (OSPF area 0)
Network Diagram
Starting Service Delivery
• Successfully implemented L3VPN, L2VPN, and E1 over the IP network
• In case of link failure, calls dropped in all three service types
• Noticed 5 to 7 packet drops in ping tests
• Traffic did not always follow the intended path

We needed to find the reasons, and we needed faster convergence times.
IGP Cost and Traffic Flow
OSPF Cost and Traffic Flow
With the default reference bandwidth, OSPF gives 1G and 10GE interfaces the same cost, so we had to set costs explicitly on all access and backbone interfaces; otherwise 10G and 1G links would be treated as equal. We decided to follow this standard:

OSPF Cost Calculation
Interface         Cost
100G                 1
40G                  4
10G backbone         5
10G access          10
OC-48 backbone      15
OC-48 access        20
1G backbone         25
1G access           30
OC-12 backbone      50
OC-12 access        55
OC-3 backbone      100
OC-3 access        155
FE backbone        500
FE access          505
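The cost plan above can be expressed as data and sanity-checked programmatically. This is an illustrative sketch, not anything configured on the routers; the `check_cost_plan` helper is our own name, and only the tiers actually listed in the table are included:

```python
# Illustrative sketch: the manual OSPF cost plan from the table above.
# Invariant we want: listed fastest-first, every backbone cost beats the
# access cost in its own tier, and costs rise monotonically as link
# speed drops, so SPF always prefers the fastest path via the backbone.
COSTS = [
    # (tier, backbone_cost, access_cost) -- fastest first
    ("10G",    5,  10),
    ("OC-48", 15,  20),
    ("1G",    25,  30),
    ("OC-12", 50,  55),
    ("OC-3", 100, 155),
    ("FE",   500, 505),
]

def check_cost_plan(costs):
    """True if the flattened (backbone, access) sequence is strictly increasing."""
    seq = [c for (_, bb, acc) in costs for c in (bb, acc)]
    return all(a < b for a, b in zip(seq, seq[1:]))

print(check_cost_plan(COSTS))  # True
```

Note that a faster tier's access cost (e.g. 10G access = 10) still beats a slower tier's backbone cost (e.g. OC-48 backbone = 15), which the strict ordering captures.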
OSPF Traffic Flow
[Diagram: traffic flow before and after]
Traffic from a cell-site router was not forwarded over the backbone interface even when that was the minimum-hop path; instead it was forwarded within the area first, and the SPF calculation then took multiple extra hops to reach the desired prefix.

To fix this, we created a logical interface between the two ABR routers, so traffic is forwarded to the area with the minimum hop count over the logical backbone interface.
OSPF Traffic Flow (cont.)
[Diagram: traffic flow before and after]
The same thing happens when the ABR is multiple hops away: traffic is forwarded into the area first, SPF is recalculated there, and extra hops are used. ABR routers that are multiple hops apart should share the same logical interface to maintain area contiguity and keep traffic forwarding on the minimum-hop path.
OSPF Traffic Flow
After (5 hops):
C:\>tracert 10.253.4.2
Tracing route to 10.253.4.2 over a maximum of 30 hops
  1     2 ms     1 ms     6 ms  10.1.0.33
  2     1 ms     1 ms     1 ms  10.0.0.121
  3     1 ms     1 ms     1 ms  10.0.0.65
  4     3 ms     2 ms     2 ms  10.10.12.122
  5     1 ms     1 ms     5 ms  10.253.4.2
Trace complete.

Before (6 hops):
C:\>tracert 10.253.4.2
Tracing route to 10.253.4.2 over a maximum of 30 hops
  1     1 ms     1 ms     1 ms  10.1.0.33
  2     1 ms     1 ms     1 ms  10.0.0.121
  3     1 ms     1 ms     1 ms  10.0.0.65
  4     2 ms     2 ms     3 ms  10.10.12.130
  5     3 ms     2 ms     2 ms  10.10.12.190
  6     2 ms     1 ms     1 ms  10.253.4.2
Trace complete.
Fast Convergence: BFD
• Implemented BFD on all links
• Tuned OSPF parameters: iSPF, hello timers, BFD
• Packet loss reduced to 2-3 packets
• L3VPN circuits no longer drop calls, but L2 circuits and E1s are still a challenge
Fast Convergence: BFD
router ospf 65000
 router-id 10.253.39.1
 ispf
 timers throttle spf 50 100 5000
 timers throttle lsa 0 20 1000
 timers lsa arrival 20
 timers pacing flood 15
 passive-interface Loopback1
 mpls ldp sync
interface Vlan10
 description "MPLS Ring Interface to PAN-ASR903-PE03: Gi0/2/5"
 mtu 9178
 ip address 10.10.152.2 255.255.255.252
 ip ospf network point-to-point
 ip ospf bfd
 ip ospf 65000 area 39
 ip ospf cost 30
 carrier-delay msec 0
 mpls ip
 mpls label protocol ldp
 bfd interval 50 min_rx 50 multiplier 3
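The `bfd interval 50 min_rx 50 multiplier 3` line determines how quickly a dead neighbor is declared. A rough sketch of the arithmetic (the helper name is ours; real sessions negotiate each direction independently, so this is the symmetric case):

```python
def bfd_detection_ms(local_tx_ms, peer_min_rx_ms, peer_multiplier):
    """Worst-case BFD detection time: the negotiated interval (the slower
    of our transmit rate and what the peer will accept) multiplied by the
    number of consecutive packets the peer allows to be missed."""
    negotiated = max(local_tx_ms, peer_min_rx_ms)
    return negotiated * peer_multiplier

# With interval 50 / min_rx 50 / multiplier 3 on both ends:
print(bfd_detection_ms(50, 50, 3))  # 150 ms
```

A 150 ms detection time, combined with the tuned SPF throttle timers above, is what brought failover below the threshold at which voice calls drop.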
Fast Convergence: LFA FRR
• Some vendors suggested traffic engineering, but our preference is always generic configuration
• Decided to try LFA FRR for IGP fast convergence; we had to change some of the network design to implement it
• Got a significant improvement in convergence: almost zero packet loss on link failure
mpls label protocol ldp
mpls ldp discovery targeted-hello accept
no l3-over-l2 flush buffers
asr901-platf-frr enable
router ospf 65000
 prefix-priority high route-map TE_PREFIX
 fast-reroute per-prefix enable area 39 prefix-priority high
 fast-reroute per-prefix remote-lfa tunnel mpls-ldp
 mpls ldp sync
ip prefix-list TE_PREFIX seq 5 permit 10.255.255.30/32
route-map TE_PREFIX permit 10
 match ip address prefix-list TE_PREFIX
Before and After OSPF LFA/FRR
Before (link failure causes a timeout and elevated RTTs):
Xshell:\> ping 10.252.51.111 -t
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=4ms TTL=253
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Request timed out.
Reply from 10.252.51.111: bytes=32 time=61ms TTL=253
Reply from 10.252.51.111: bytes=32 time=86ms TTL=253
Reply from 10.252.51.111: bytes=32 time=70ms TTL=253
Reply from 10.252.51.111: bytes=32 time=147ms TTL=253

After (no timeout, only a brief RTT spike):
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=1ms TTL=253
Reply from 10.252.51.111: bytes=32 time=1ms TTL=253
Reply from 10.252.51.111: bytes=32 time=27ms TTL=253
Reply from 10.252.51.111: bytes=32 time=32ms TTL=253
Reply from 10.252.51.111: bytes=32 time=1ms TTL=253
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=1ms TTL=253
Clock Synchronization: IEEE 1588v2
L3VPN and L2 circuits seemed stable after the LFA FRR implementation, but operators complained they were seeing bit errors on their E1 circuits.

SyncE/PTP
• After implementing IEEE 1588v2 PTP clocking on the end nodes, the E1 circuits became clean
Before and After PTP Clock Synchronization
BEFORE
Start Time           Period (min)  T7001: E1/T1 Errored Seconds  T7002: E1/T1 Severely Errored Seconds
01/06/2014 19:00:00  60            1                             1
01/06/2014 20:00:00  60            1                             1
01/06/2014 21:00:00  60            1                             1
01/06/2014 22:00:00  60            1                             1
01/06/2014 23:00:00  60            0                             0
01/07/2014 00:00:00  60            1                             1
01/07/2014 01:00:00  60            1                             1
01/07/2014 02:00:00  60            1                             1
01/07/2014 03:00:00  60            1                             1
01/07/2014 04:00:00  60            1                             1
01/07/2014 05:00:00  60            1                             1
01/07/2014 06:00:00  60            2                             2

AFTER
Start Time           Period (min)  T7001: E1/T1 Errored Seconds  T7002: E1/T1 Severely Errored Seconds
01/08/2014 19:00:00  60            0                             0
01/08/2014 20:00:00  60            0                             0
01/08/2014 21:00:00  60            0                             0
01/08/2014 22:00:00  60            0                             0
01/08/2014 23:00:00  60            0                             0
01/09/2014 00:00:00  60            0                             0
01/09/2014 01:00:00  60            0                             0
01/09/2014 02:00:00  60            0                             0
01/09/2014 03:00:00  60            0                             0
01/09/2014 04:00:00  60            0                             0
01/09/2014 05:00:00  60            0                             0
01/09/2014 06:00:00  60            1                             1
Final IP Protocol Selection
• OSPFv2 for IGP
• LDP for label distribution
• BGP for L3VPN
• BFD for fast failover notification
• IP OSPF FRR for millisecond switchover
• IEEE 1588v2 for PTP clock synchronization
More and More Routers Were Added
Challenges in a New Dimension
Soon the number of nodes/routers grew beyond 500, and new challenges started to appear:
• Convergence time became unpredictable; a few packets were sometimes lost
• Some nodes became unreachable at times
• Edge-router CPU load went high

Degraded network performance forced us to think again.
Requirement for IP Optimization
• We were getting service interruptions on some cell-site routers due to prefix-table exhaustion at the cell-site router
• We were seeing slow or delayed LFA-FRR performance where prefix counts were near the threshold
• We were seeing more than one packet drop on traffic switchover; failover was better when prefix counts were lower
• Within a short period we would reach the limit of the router
Prefix Statistics of the Cell Sites
• Global routing: 1600+
• OM_FH VRF: 700+
• GP_VRF: 700-1500
• OM_ROBI VRF: 200+
• Total prefixes: 2500-3200
Scalability of the Cell-Site Router (Cisco ASR 901 / Juniper ACX1000)
• 12K prefixes without MPLS
• 4K prefixes for VRF without MPLS
• 3K to 4K prefixes with MPLS
• 1,600 prefixes with MPLS + FRR
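Putting the last two slides together makes the problem concrete; a quick back-of-the-envelope check using the figures stated above:

```python
# Cell-site router capacity vs. observed load (figures from the slides).
LIMIT_WITH_MPLS_FRR = 1600                  # prefix limit with MPLS + FRR
OBSERVED_LOW, OBSERVED_HIGH = 2500, 3200    # total prefixes seen at cell sites

# Even the low-end observation exceeds the FRR-capable limit:
assert OBSERVED_LOW > LIMIT_WITH_MPLS_FRR

# Fraction of prefixes that must be removed at the high end:
cut_needed = 1 - LIMIT_WITH_MPLS_FRR / OBSERVED_HIGH
print(f"{cut_needed:.0%}")  # 50%
```

Roughly half the prefixes had to go, which is what motivated the summarization and filtering plan that follows.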
Prefix Optimization Plan
• OSPF area-based route summarization, exporting the summary route to the other OSPF areas
• Import only the expected routes into the cell-site router, in both the global and VRF routing instances
Present Logical Network Design
We Planned Our IP Addressing Early
• One logical P2P interface represents one OSPF area with a ring topology
• Each OSPF area gets a /22 prefix, carved into /30s for the P2P links, e.g. 10.10.112.0/22 for Area 29
• Each area gets a /24 for loopback IPs, which also serve as router IDs, e.g. 10.253.29.0/24
• The backbone area also gets a /22 carved into /30s for P2P links, e.g. 10.0.0.0/22 for Area 0
• The backbone area gets a /24 carved into /32 loopbacks, also used as router IDs, e.g. 10.255.255.0/24
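The plan above can be checked with Python's standard-library ipaddress module; a sketch using the Area 29 blocks from the slide as the example:

```python
import ipaddress

# Area 29 allocations from the addressing plan above.
link_block = ipaddress.ip_network("10.10.112.0/22")  # carved into /30 P2P links
loop_block = ipaddress.ip_network("10.253.29.0/24")  # one /32 loopback per router

p2p_links = sum(1 for _ in link_block.subnets(new_prefix=30))
loopbacks = loop_block.num_addresses

print(p2p_links, loopbacks)  # 256 256
```

A /22 yields 256 /30 links and a /24 yields 256 loopbacks, so both allocations leave ample headroom for the 60-node-per-area ceiling adopted later.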
Link Failure Before Prefix Filtering
Link Failure After Prefix Filtering
Prefix Optimization
                 Before IGP summarization   After IGP summarization
Global routes    1600+                      1066
VRF routes       2400+                      700
Total routes     2500~3200                  1766
Before summarization:
DHAGG3#show ip cef summary
IPv4 CEF is enabled and running
VRF Default
 1677 prefixes (1677/0 fwd/non-fwd)
 Table id 0x0
 Database epoch: 0 (1066 entries at this epoch)

After summarization:
DHAGG3#show ip cef summary
IPv4 CEF is enabled and running
VRF Default
 1066 prefixes (1063/3 fwd/non-fwd)
 Table id 0x0
 Database epoch: 0 (1066 entries at this epoch)
Prefix Optimization
                 Before IGP prefix filtering   After IGP prefix filtering
Global routes    1066                          205
VRF routes       700                           700
Total routes     1766                          905
Before filtering:
DHAGG3#show ip cef summary
IPv4 CEF is enabled and running
VRF Default
 1066 prefixes (1066/0 fwd/non-fwd)
 Table id 0x0
 Database epoch: 0 (1066 entries at this epoch)

After filtering:
ROBI24-DHTEJ11#show ip cef summary
IPv4 CEF is enabled and running
VRF Default
 205 prefixes (205/0 fwd/non-fwd)
 Table id 0x0
 Database epoch: 0 (205 entries at this epoch)
Prefix Optimization
                 Before VRF prefix filtering   After VRF prefix filtering
Global routes    205                           205
VRF routes       700                           11
Total routes     905                           216
Global table:
ROBI24-DHTEJ11#show ip cef summary
IPv4 CEF is enabled and running
VRF Default
 205 prefixes (205/0 fwd/non-fwd)
 Table id 0x0
 Database epoch: 0 (205 entries at this epoch)

VRF OM_FH after import filtering:
ROBI24-DHTEJ11#show ip cef vrf OM_FH summary
IPv4 CEF is enabled and running
VRF OM_FH
 11 prefixes (11/0 fwd/non-fwd)
 Table id 0x1
 Database epoch: 0 (11 entries at this epoch)
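Taken end to end, the three optimization stages shown on these slides cut the cell-site routing table by over 90%. A short summary of the figures, as arithmetic:

```python
# Route counts per optimization stage (figures from the slides above).
stages = [
    ("baseline (high end)",  3200),
    ("IGP summarization",    1766),
    ("IGP prefix filtering",  905),
    ("VRF import filtering",  216),
]
for (_, before), (name, after) in zip(stages, stages[1:]):
    pct = 100 * (before - after) / before
    print(f"{name}: {before} -> {after} routes ({pct:.0f}% cut)")

total_cut = 1 - stages[-1][1] / stages[0][1]
print(f"overall: {total_cut:.0%} fewer routes")  # 93%
```

The final 216 routes sit comfortably under the 1,600-prefix MPLS+FRR limit of the cell-site routers.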
Prefix Optimization
VRF Route Filtering
ip vrf OM_FH
 rd 58587:10
 import map OM-ROUTE
 route-target export 58587:10
 route-target import 58587:10
ip prefix-list FH-OM-ROUTE seq 5 permit 10.1.0.32/27
ip prefix-list FH-OM-ROUTE seq 6 permit 10.1.0.0/27
route-map OM-ROUTE permit 10
 match ip address prefix-list FH-OM-ROUTE
Prefix Optimization
OSPF Area Route Summarization
router ospf 65000
 router-id 10.255.255.15
 ispf
 area 1 range 10.10.0.0 255.255.252.0
 area 1 filter-list prefix BB-ROUTE-IN in
ip prefix-list BB-ROUTE-IN seq 5 permit 10.255.255.0/24 le 32
ip prefix-list BB-ROUTE-IN seq 10 permit 10.0.0.0/22 le 32
ip prefix-list BB-ROUTE-IN seq 150 deny 0.0.0.0/0 le 32
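The `le 32` qualifier on the prefix-list entries above means "any prefix contained in the stated network whose mask length is between the stated length and /32". A small illustrative model of that match (the `matches_le` helper is ours, not part of any router config):

```python
import ipaddress

def matches_le(candidate, entry, le):
    """Model of an IOS 'ip prefix-list ... permit <entry> le <le>' entry:
    the candidate must lie inside the entry's network, and its mask length
    must fall between the entry's length and the 'le' bound."""
    cand = ipaddress.ip_network(candidate)
    net = ipaddress.ip_network(entry)
    return cand.subnet_of(net) and net.prefixlen <= cand.prefixlen <= le

# A backbone /30 link survives the inbound filter:
print(matches_le("10.0.0.64/30", "10.0.0.0/22", 32))    # True
# An access-area link prefix does not, so it is filtered:
print(matches_le("10.10.112.4/30", "10.0.0.0/22", 32))  # False
```

Because the addressing plan allocated loopbacks and backbone links from dedicated blocks, two permit entries plus a final deny are enough to keep cell-site tables small.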
Final Scalability of the Network
Now we have a very stable network with 1,000+ nodes. We plan to follow these best practices and lessons learned for further growth:
• Core routers do not perform any service delivery
• A maximum of 3 areas should be terminated per ABR
• A maximum of 60 nodes per area
• An area should be contiguous, physically or logically; otherwise LFA-FRR and traffic forwarding are hampered
• Keep it simple with LFA-FRR, LDP, and BFD; that is enough for telco voice and data traffic
• No manual TE or policy routing
IOS/JUNOS Bug Issues
• Memory leak on the ASR901 with SNMP configuration: IOS bug in versions 15.2(S1) and 15.3(S2)
• MPLS label egress failures on the ASR903 with IOS XE 3.7; bug fixed in 3.10
• JUNOS bug at 12.R2 on the MX10: the 10GE-3D-MIC card goes down frequently; bug fixed in 13.1R1
• Still some unresolved issues; we are working with the vendors to fix them
Scalability Equation in a Single IGP Domain
• A maximum of 250 nodes should be within Area 0, counting both P and PE devices
• Each P router should connect at most 4 PE routers
• 50 P routers can therefore connect 4 × 50 = 200 PE routers
• Each PE connects up to 60 nodes
• Possible nodes in a single IGP domain: 200 × 60 = 12,000
• For further growth, we should consider separate IGP domains

Possible max nodes per IGP domain: 12K?
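The dimensioning rule of thumb above, written out as plain arithmetic:

```python
# Scaling budget for one IGP domain (figures from the slide above).
AREA0_NODE_BUDGET = 250   # P + PE devices allowed in Area 0
P_ROUTERS = 50
PE_PER_P = 4              # each P connects at most 4 PE routers
NODES_PER_PE = 60         # one access-ring area (max 60 nodes) per PE

pe_routers = P_ROUTERS * PE_PER_P            # 200 PE routers
assert P_ROUTERS + pe_routers == AREA0_NODE_BUDGET
max_nodes = pe_routers * NODES_PER_PE
print(max_nodes)  # 12000 nodes per IGP domain
```

The assertion makes the hidden constraint explicit: 50 P plus 200 PE routers exactly fill the 250-node Area 0 budget, which is why growth beyond ~12K nodes calls for a separate IGP domain.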
Network Monitoring Tools
Diversity of the Network
• Serving multiple operators (open access)
• Multi-vendor environment (Cisco/Juniper)
• Multi-service delivery network (Ethernet, TDM, GPON)
Thank You.
Sumon Ahmed Sabir
Md. Abdullah-Al-Mamun
www.fiberathome.net