1
OmniPoP: GigaPOP-of-GigaPOPs Design and Future
Debbie Montano, dmontano@force10networks.com
Internet2 Member Meeting – April 2007
2
Special Note Regarding Forward Looking Statements
This presentation contains forward-looking statements that involve substantial risks and uncertainties, including but not limited to, statements relating to goals, plans, objectives and future events. All statements, other than statements of historical facts, included in this presentation regarding our strategy, future operations, future financial position, future revenues, projected costs, prospects and plans and objectives of management are forward-looking statements. The words “anticipates,” “believes,” “estimates,” “expects,” “intends,” “may,” “plans,” “projects,” “will,” “would” and similar expressions are intended to identify forward-looking statements, although not all forward-looking statements contain these identifying words. Examples of such statements include statements relating to products and product features on our roadmap, the timing and commercial availability of such products and features, the performance of such products and product features, statements concerning expectations for our products and product features, and projections of revenue or other financial terms. These statements are based on the current estimates and assumptions of management of Force10 as of the date hereof and are subject to risks, uncertainties, changes in circumstances, assumptions and other factors that may cause the actual results to be materially different from those reflected in our forward-looking statements. We may not actually achieve the plans, intentions or expectations disclosed in our forward-looking statements and you should not place undue reliance on our forward-looking statements. In addition, our forward-looking statements do not reflect the potential impact of any future acquisitions, mergers, dispositions, joint ventures or investments we may make. We do not assume any obligation to update any forward-looking statements. Any information contained in our product roadmap is intended to outline our general product direction and it should not be relied on in making purchasing decisions. The information on the roadmap is (i) for information purposes only, (ii) may not be incorporated into any contract and (iii) does not constitute a commitment, promise or legal obligation to deliver any material, code, or functionality. The development, release and timing of any features or functionality described for our products remains at our sole discretion.
3
Topics
Design Trends
OmniPoP Design
Layer 2 Model
VLANs
Switch Requirements & Capabilities
4
R&E Design Trends
R&E utilizes services at all three layers:
– Layer 1: Fiber/Lambdas
– Layer 2: Ethernet/VLANs
– Layer 3: IP
HOPI – Hybrid Optical & Packet Infrastructure
– VLANs to provide dedicated paths/bandwidth
Regional Aggregation
Number of VLANs continues to increase.
Number of Peers continues to increase
Segmentation is a major trend
5
CIC OmniPOP Members
6
OmniPOP Design
OmniPOP – Layer 2 Only
– 10 GbE & 1 GbE connections for each member
– 10 GbE connections to R&E networks
– Multiple VLANs per connection
– VLANs for aggregating router-to-router peering connections
  – GigaPOP-to-GigaPOP
  – University-to-University
  – University-to-GigaPOP
  – University/GigaPOP-to-National Network
– VLANs which extend through to end researchers/sites
– (VLANs can be used for control plane)
CIC also has Chicago Fiber Ring – Layer 1
7
VLANs
Local Area Network (LAN)
Virtual LAN (VLAN)
– Create independent logical networks
– Isolate traffic/groups of users in different VLANs
– Create VLANs across multiple devices
– Multiple VLANs across a single port
Institute of Electrical & Electronics Engineers (IEEE)
– IEEE 802.3 – Ethernet standards
– IEEE 802.1 – higher-layer LAN protocol standards
IEEE 802.1q - Virtual LAN (VLAN) standard
802.1q limit: 4096 VLAN IDs
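The 4096-ID limit comes directly from the 12-bit VLAN ID field in the 802.1Q tag. As a rough illustration (not part of the original slides), the sketch below builds an 802.1Q tag and shows why the ID space is 2^12 = 4096; the field layout follows the published standard, but the helper name is hypothetical.

```python
import struct

TPID = 0x8100  # EtherType value that marks an 802.1Q-tagged frame

def dot1q_tag(vlan_id: int, priority: int = 0, dei: int = 0) -> bytes:
    """Build the 4-byte 802.1Q tag: 16-bit TPID + 3-bit PCP + 1-bit DEI + 12-bit VID."""
    if not 0 <= vlan_id <= 0xFFF:          # 12 bits -> 4096 possible IDs (0-4095)
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID, tci)

print(2 ** 12)                  # 4096 -> the 802.1Q VLAN ID space
print(dot1q_tag(2000).hex())    # tag for OmniPOP VLAN 2000 -> '810007d0'
```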
8
OmniPoP Weather Map
9
Model
StarLight is a Layer 2 service, and does not mandate Layer 3 peering to a central router. Supports open peering amongst its members.
You provide a GigE [or 10GigE] connection from a router in your AS to the Starlight Force10 switch.
Individual peering sessions are bilaterally negotiated.
Your router must support 802.1Q tagged VLANs.
Starlight configures 1 point-to-point VLAN for each of your peerings at your request, or the peer's request.
– If you desire to peer with, say, five other StarLight connectors, we will assign 5 VLAN IDs and configure them on our switch, giving you five individual point-to-point connections.
Reference: http://www.startap.net/starlight/CONNECT/connectPeering.html
10
StarLight Bilateral 802.1q VLANS
StarLight provides bilateral 802.1q VLANs to peering participants for three primary reasons:
– The MTU of the peerings is decided between the peers. StarLight does not enforce a common MTU size for all peers; instead, each peering MTU is individually negotiated between the peers.
– By eliminating a common IEEE 802 broadcast domain, IP multicast is not flooded to all participants. This avoids the problems with PIM Asserts in an interdomain peering mesh.
– IPv4, IPv4 multicast, and IPv6 are transparent to StarLight. The services you enable on your peerings are between you and each of your peers; they are not limited by anything StarLight supports.
11
Matching MTUs
Maximum Transmission Unit (MTU) – the largest physical packet size, measured in bytes, that a network can transmit. Any messages larger than the MTU are divided into smaller packets before being sent.
Link-layer MTU is the frame size of an Ethernet packet.
IP (Layer 3) MTU is the size used for IP fragmentation.
All VLAN members must use the same IP MTU value.
Force10 defines link MTU as the entire Ethernet packet (Ethernet header + payload + frame check sequence).
E.g., max link MTU 9252 B; max IP MTU 9234 B.
Link MTU on OmniPOP switch set to Max (9252)
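To make the link-MTU/IP-MTU relationship concrete, here is a small arithmetic sketch (an illustration, not vendor documentation). It assumes the 18-byte difference between 9252 and 9234 is the untagged Ethernet header (14 B) plus the frame check sequence (4 B), consistent with the Force10 definition above.

```python
ETH_HEADER = 14   # dst MAC + src MAC + EtherType
FCS = 4           # frame check sequence

def ip_mtu_from_link_mtu(link_mtu: int) -> int:
    """IP MTU = link MTU minus the Ethernet framing counted in the link MTU."""
    return link_mtu - ETH_HEADER - FCS

print(ip_mtu_from_link_mtu(9252))  # 9234, matching the OmniPOP switch setting
```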
12
I2 Recommendation - IP MTU
Internet2-wide Recommendation on IP MTU:
Engineers throughout all components of the extended Internet2 infrastructure, including its campus LANs, its gigaPoPs, its backbone(s), and exchange points, are encouraged to support, wherever practical, an IP MTU of 9000 bytes.
Recommended by the Joint Engineering Team (JET) for federal research networks.
13
IP MTU Rationale
The rationale for this recommendation includes the following points:
– Applications, including but not limited to bulk TCP, benefit from being able to send 8K (i.e., 8 times 1024) bytes of payload plus various headers. An IP MTU of 9000 would satisfy this application need.
– A growing number of routers, switches, and host NICs support IP packets of at least 9000.
– Very few routers, switches, and host NICs support IP packets of more than 9500. Thus, there is comparatively little motivation for a value much more than 9000.
– There is anecdotal evidence that Path MTU discovery would be more reliable if a given agreed-on value were commonly used. This relates to weaknesses in current Path MTU discovery technology.
– 9000 is an easy number to remember.
It is stressed that this is an interim recommendation. Engineers are also encouraged to explore the benefits and practicalities of much larger MTUs, up to the full 64 KBytes permitted for an IPv4 datagram.
14
OmniPOP – Connects Members to Each Other
University | Connections | GigaPOP
University of Chicago | 10 GbE & 1 GbE |
Univ of Illinois Chicago | 10 GbE |
Univ of Illinois Urbana-Champaign | 10 GbE |
Indiana University | 10 GbE | Indiana GigaPOP
University of Iowa | 10 GbE & 1 GbE |
University of Michigan | 10 GbE | MERIT
Michigan State Univ. | 10 GbE | MERIT
University of Minnesota | 10 GbE & 1 GbE | Northern Lights
Northwestern University | 10 GbE & 1 GbE |
Ohio State University | 10 GbE & 1 GbE | OARnet
Purdue University | 10 GbE | Indiana GigaPOP
Univ. of Wisconsin - Madison | 10 GbE & 1 GbE | WiscREN
15
OmniPOP: Connects Members to R&E Worldwide
R&E Network | Connection
NLR Layer 2 | 1 GbE
NLR National Exchange Fabric | 1 GbE
NLR Layer 3 | 10 GbE
Internet2 | 10 GbE
MREN | 10 GbE
Starlight (via MREN) | 10 GbE
16
OmniPOP VLANs
OmniPOP selected VLAN IDs 2000-2499
– Each connecting institution tries to set aside this range
IEEE 802.1q standard limit: 4096 VLAN IDs
Need to select a range of VLAN IDs not already in use by connecting institutions and gigaPOPs
Multiple VLANs supported on one physical network connection.
17
VLAN IDs
Reserve the first 20 for broadcast VLANs (we'll need some v4 & v6 address space to talk with each other on each of these):
– 2000 v4 unicast, jumbo frames
– 2001 v4 multicast, jumbo frames
– 2002 v6 unicast, jumbo frames
– 2003 v6 multicast, jumbo frames
– 2004 v4 unicast, 1500B frames
– 2005 v4 multicast, 1500B frames
– 2006 v6 unicast, 1500B frames
– 2007 v6 multicast, 1500B frames
– 2008-2019 reserved
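Picking a shared range like 2000-2499 only works if no connector already uses those IDs. A minimal sketch of that coordination check follows; the member names and in-use ID sets are made-up placeholders, not real campus data.

```python
# Hypothetical in-use VLAN IDs reported by each connecting institution.
in_use = {
    "campus-A": set(range(1, 1000)),
    "campus-B": {1500, 1501, 2600},
    "gigapop-X": set(range(3000, 3500)),
}

proposed = set(range(2000, 2500))   # OmniPOP's chosen block, 2000-2499

# Report any institution whose existing VLAN IDs collide with the proposed block.
conflicts = {name: ids & proposed for name, ids in in_use.items() if ids & proposed}
print(conflicts or "Range 2000-2499 is free for everyone")
```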
18
2099-2070: NLR
– 2099 NLR-U of Chicago 10G
– 2098 NLR-U of Chicago 1G backup
– 2097 NLR-U of Ill-Chicago 10G
– 2096 NLR-U of Ill-Chicago 1G backup
– 2095 NLR-UIUC 10G
– 2094 NLR-UIUC 1G backup
– 2093 NLR-Indiana 10G
– 2092 NLR-Indiana 1G backup
– 2091 NLR-Iowa 10G
– 2090 NLR-Iowa 1G backup
– 2089 NLR-U Michigan 10G
– 2088 NLR-U Michigan 1G backup
– 2087 NLR-Michigan State 10G
– 2086 NLR-Michigan State 1G backup
– 2085 NLR-U Minnesota 10G
– 2084 NLR-U Minnesota 1G backup
– 2083 NLR-Northwestern 10G
– 2082 NLR-Northwestern 1G backup
– 2081 NLR-Ohio State 10G
– 2080 NLR-Ohio State 1G backup
– 2079 NLR-Purdue 10G
– 2078 NLR-Purdue 1G backup
– 2077 NLR-Wisconsin 10G
– 2076 NLR-Wisconsin 1G backup
– 2075
– 2074
– 2073
– 2072
– 2071
– 2070
19
2069-2040: Internet2
– 2069 Internet2-U of Chicago 10G
– 2068 Internet2-U of Chicago 1G backup
– 2067 Internet2-U of Ill-Chicago 10G
– 2066 Internet2-U Ill-Chicago 1G backup
– 2065 Internet2-UIUC 10G
– 2064 Internet2-UIUC 1G backup
– 2063 Internet2-Indiana 10G
– 2062 Internet2-Indiana 1G backup
– 2061 Internet2-Iowa 10G
– 2060 Internet2-Iowa 1G backup
– 2059 Internet2-U Michigan 10G
– 2058 Internet2-U Michigan 1G backup
– 2057 Internet2-Michigan State 10G
– 2056 Internet2-Michigan State 1G backup
– 2055 Internet2-U Minnesota 10G
– 2054 Internet2-U Minnesota 1G backup
– 2053 Internet2-Northwestern 10G
– 2052 Internet2-Northwestern 1G backup
– 2051 Internet2-Ohio State 10G
– 2050 Internet2-Ohio State 1G backup
– 2049 Internet2-Purdue 10G
– 2048 Internet2-Purdue 1G backup
– 2047 Internet2-Wisconsin 10G
– 2046 Internet2-Wisconsin 1G backup
– 2045 Internet2-IndianaCPS
– 2044
– 2043
– 2042
– 2041
– 2040
20
2150-2215 Intra-CIC Point-to-Point VLANs
– 2150 UOC-UIC
– 2151 UOC-UIUC
– 2152 UOC-Indiana
– 2153 UOC-Iowa
– 2154 UOC-Michigan
– 2155 UOC-MSU
– 2156 UOC-UMN
– 2157 UOC-NU
– 2158 UOC-OSU
– 2159 UOC-Purdue
– 2160 UOC-Wisconsin
– 2161 UIC-UIUC
– 2162 UIC-Indiana
– 2163 UIC-Iowa
– 2164 UIC-Michigan
– 2165 UIC-MSU
– 2166 UIC-UMN
– 2167 UIC-NU
– 2168 UIC-OSU
– 2169 UIC-Purdue
– 2170 UIC-Wisconsin
– 2171 UIUC-Indiana
– 2172 UIUC-Iowa
– 2173 UIUC-Michigan
– 2174 UIUC-MSU
– 2175 UIUC-UMN
– 2176 UIUC-NU
– 2177 UIUC-OSU
– 2178 UIUC-Purdue
– 2179 UIUC-Wisc
– 2180 IU-Iowa
21
2150-2215 Intra-CIC Point-to-Point VLANs (Cont)
– 2181 IU-Michigan
– 2182 IU-MSU
– 2183 IU-UMN
– 2184 IU-NU
– 2185 IU-OSU
– 2186 IU-Purdue
– 2187 IU-Wisc
– 2188 Iowa-Michigan
– 2189 Iowa-MSU
– 2190 Iowa-UMN
– 2191 Iowa-NU
– 2192 Iowa-OSU
– 2193 Iowa-Purdue
– 2194 Iowa-Wisc
– 2195 Michigan-MSU
– 2196 Michigan-UMN
– 2197 Michigan-NU
– 2198 Michigan-OSU
– 2199 Michigan-Purdue
– 2200 Michigan-Wisconsin
– 2201 MSU-UMN
– 2202 MSU-NU
– 2203 MSU-OSU
– 2204 MSU-Purdue
– 2205 MSU-Wisconsin
– 2206 UMN-NU
– 2207 UMN-OSU
– 2208 UMN-Purdue
– 2209 UMN-Wisconsin
– 2210 NU-OSU
– 2211 NU-Purdue
– 2212 NU-Wisconsin
– 2213 OSU-Purdue
– 2214 OSU-Wisconsin
– 2215 Purdue-Wisconsin
– 2216 Iowa10G-Iowa1G
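The intra-CIC assignments above follow a simple pattern: one VLAN per unordered pair of members, numbered upward from 2150. A sketch of how such a table could be generated is shown below; the member abbreviations loosely follow the slides, and the script is illustrative rather than the tool OmniPOP actually used.

```python
from itertools import combinations

members = ["UOC", "UIC", "UIUC", "IU", "Iowa", "Michigan",
           "MSU", "UMN", "NU", "OSU", "Purdue", "Wisconsin"]

# One point-to-point VLAN per pair of members, starting at VLAN ID 2150.
assignments = {2150 + i: f"{a}-{b}"
               for i, (a, b) in enumerate(combinations(members, 2))}

print(len(assignments))      # 66 pairs -> IDs 2150 through 2215
print(assignments[2150])     # 'UOC-UIC'
print(assignments[2215])     # 'Purdue-Wisconsin'
```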
22
OmniPOP Switch Design
E1200 Overview
1 GbE and 10 GbE ports
Upgrade path to higher densities & 100 GbE
Resiliency Architecture
Switching Protocols
23
E1200 Overview
– Redundant Power Supplies: 1+1 DC
– Cable Management
– 14 Line Card Slots
– Redundant Switch Fabric Modules (SFMs): 8:1, 1.6875 Tbps capacity
– Passive Copper Backplane: 5 Tbps capacity, 100 GbE ready
– Redundant Fans and Fan Modules
– Redundant Route Processor Modules (RPMs): 1+1, no central clock
24
Line Rate & High Density Ports
OmniPOP includes:
– 4-port 10 GbE Line Rate cards
– 16-port 10 GbE High Density cards
  – 4:1 lookup-oversubscribed 10 GbE ports
  – Functions as a line-rate card if every fourth XFP is used
– Provides:
  – Flexibility & control
  – Can balance ports/chassis use with bandwidth requirements
  – Room for growth: up to 224 10 GbE ports per E1200 chassis
25
Path to 100 GbE & Higher Density
Roadmap milestones (dates shown on the timeline: January 2002, October 2002, September 2004, April 2005, October 2005, March 2006, 2008*, 200x*):
– 1999 – 2002: 1st Generation Line Cards, "EtherScale" – E1200 (1.6875 Tbps): 28 x 10 GbE, 336 x GbE; E600 (900 Gbps): 14 x 10 GbE, 196 x GbE
– 5 Tbps Passive Copper Backplane, 100 GbE Ready
– 1st Generation Switch Fabric Module (SFM): 112.5 Gbps/slot
– 2nd Generation Line Cards, "TeraScale" – E1200: 56 x 10 GbE, 672/1260 x GbE; E600: 28 x 10 GbE, 336/630 x GbE
– 2nd Generation Switch Fabric Module (SFM3): 225 Gbps/slot – E1200 (3.375 Tbps), E600 (1.8 Tbps), 100 GbE Ready
– 3rd Generation Line Cards: High Density 10 GbE (16-port 10 GbE) & Very High Density GbE (90-port GbE) – E1200: 56/224 x 10 GbE, 672/1260 x GbE; E600: 28/112 x 10 GbE, 336/630 x GbE
– 3rd Generation Switch Fabric Module: 337.5 Gbps/slot*
– 4th Generation Line Cards: High Density 100 GbE*
* planned
26
Higher Speeds Drive Switch/Router Requirements
Driving architectural requirements
Massive hardware and software scalability
– >200 Gbps/slot switch fabric capacity
– Support for several thousand interfaces
– Multi-processor, distributed architectures
Fast packet processing at line rate
– 100 GbE is ~149 Mpps, or 1 packet every 6.7 ns (10 GbE is only ~14.9 Mpps, or 1 packet every 67 ns)
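The packets-per-second figures follow from minimum-size Ethernet frames: each 64-byte frame occupies 84 bytes on the wire once the 8-byte preamble and 12-byte inter-frame gap are added. A quick check of the slide's numbers (illustrative only):

```python
MIN_FRAME = 64          # bytes, minimum Ethernet frame
PREAMBLE_AND_IFG = 20   # 8-byte preamble + 12-byte inter-frame gap

def worst_case_pps(line_rate_bps: float) -> float:
    """Packets per second at line rate with back-to-back minimum-size frames."""
    bits_per_frame = (MIN_FRAME + PREAMBLE_AND_IFG) * 8
    return line_rate_bps / bits_per_frame

for rate in (10e9, 100e9):
    pps = worst_case_pps(rate)
    print(f"{rate / 1e9:.0f} GbE: {pps / 1e6:.1f} Mpps, one packet every {1e9 / pps:.1f} ns")
# 10 GbE: 14.9 Mpps, one packet every 67.2 ns
# 100 GbE: 148.8 Mpps, one packet every 6.7 ns
```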
27
Higher Speeds Drive Density
100 Gbps Ethernet will benefit all
Drives 10 GbE port density up and cost down
Possible line-rate combinations
– 1 x 100 GbE port
– 10 x 10 GbE ports
– 100 x 1 GbE ports
– And even more oversubscribed port density…
The more things change the more they stay the same….
28
100 GbE Ready Chassis = No Forklift Upgrade
Current backplane can scale to 5 Tbps and 337.5 Gbps/slot with future components
– Designed and tested for 5 Tbps
– Advanced fiberglass materials improve transmission characteristics
– Unique conductor layers decouple 5 Tbps signal from power traces
– Engineered trace geometry for channel stability and 25 Gbps channels
Force10 has 19 patents awarded and more than 60 patents pending on its switching technology
Vendor | Backplane Capacity (Tbps) | Capacity Per Slot (Gbps)
Force10 | E1200: 5; E600: 2.7; E300: 1.2 | E1200: 337.5; E600: 337.5; E300: 150
Cisco | ?* | ?*
Foundry | ?* | ?*
Extreme | ?* | ?*
* No other vendor has openly discussed testing their backplanes for future capacity
29
Backplane Considerations
Slot capacity, switching capacity, performance
– Signal coding
– BER
– Impacts SerDes design
Design and technology drive scalability
– Advanced fiberglass materials
– Unique conductor layers
– Engineered trace geometries
30
Power Considerations
End User Restrictions?
Total system wattage?
Input power quality not specified
Higher speeds require lower noise
31
Thermal Management Considerations
Cooling capacity per slot?
Front to back filtered airflow for carrier deployments
Cooling redundancy
Heat can affect material performance which affects high-speed signaling performance
32
100 GbE Ready Chassis = No Forklift Upgrade
Chassis designed for 100 GbE and high-density 10 GbE
– Backplane and channel signaling for higher internal speeds
– Lower system BER
– Connectors
– N+1 switch fabric
– Reduced EMI
– Clean power routing architecture
– Thermal and cooling
– Cable management
33
Resiliency Architecture
Resilient Hardware Architecture
• Hardware Redundancy
• Distributed Forwarding
• Hitless Failover
• DoS Protection

Link Resiliency
• LAG
• ECMP
• LFS/WAN PHY
• BFD

Protocol Resiliency
• OSPF/BGP Restart
• RSTP, MST
• VRRP

Manageability and Serviceability
• Hitless Software Upgrade
• Hot Swap
• Logging and Tracing
• One Software Image

HA Software Architecture
• Modular OS (NetBSD)
• 3 CPUs (L2, L3, CP)
• Line Card CPU
• HA Software
34
Backplane/Fabric Scalability and Reliability
[Diagram: line card (1GE/10GE MAC, Ternary CAM, Buffer/Traffic Management, Backplane Scheduler), Route Processor Module, 1.6875 Tbps switch fabric, passive copper backplane, 56.25 Gbps per slot]
• Completely passive copper backplane
• Scalability tested to 5 Tbps
• Passive copper backplanes more reliable than optical backplanes
• Force10 holds many patents on backplane design and manufacturing technology
• 1.6875 Tbps fabric
• 8:1 SFM redundancy provides OpEx savings
35
Modularity Enables Predictability
[Diagram: line card (1GE/10GE MAC, Ternary CAM, Buffer/Traffic Management, Backplane Scheduler), Route Processor Module, 1.6875 Tbps switch fabric, passive copper backplane, 56.25 Gbps per slot]
• Independent data and control paths (among CPUs)
• Multi-CPUs with modular OS for routing and management (control plane)
• Distributed ASICs for line-rate forwarding and packet processing (data plane)
36
Force10 TeraScale Architecture: Embedded Security & Catastrophic Failure Prevention
[Diagram: Route Processor Module with separate Routing, Switching and Management CPUs; line card with 1GE/10GE MAC, Ternary CAM, Buffer/Traffic Management, Backplane Scheduler and ACLs; 1.6875 Tbps switch fabric; passive copper backplane]
• More than 1,000,000 ACLs with no performance impact
• Traffic going to each CPU is prioritized & rate limited
• CPU utilization >85% triggers internal protection mechanisms
37
Distributed and Hitless Forwarding
[Diagram: Route Processor Module; line card with 1GE/10GE MAC, Ternary CAM, Buffer/Traffic Management, Backplane Scheduler; 1.6875 Tbps switch fabric; passive copper backplane; 56.25 Gbps per slot]
• CPUs calculate best paths and download them to line cards synchronously
• Line cards make independent forwarding decisions
• During RPM failover, line cards continue to forward without packet loss (hitless)
• Each line card has distributed CAMs for L2/L3 forwarding, ACL, and QoS lookup
• Always 5 lookups allows line-rate packet processing
• FIB-based architecture scales extremely well against flow-based architectures (no slow path, no flow cache thrashing); see the sketch below
• Hardware-based ACLs with sequence numbers ensure no security holes
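To illustrate the FIB-based model the slide contrasts with flow caching: every packet is forwarded by a longest-prefix match against a pre-downloaded table, with no per-flow state to build or thrash. The sketch below uses Python's ipaddress module and invented example prefixes; it models the lookup itself, not Force10's hardware.

```python
import ipaddress

# A tiny FIB as it might be downloaded to a line card: prefix -> next hop (example data).
fib = {
    ipaddress.ip_network("10.0.0.0/8"): "port 1/0",
    ipaddress.ip_network("10.20.0.0/16"): "port 2/3",
    ipaddress.ip_network("0.0.0.0/0"): "port 13/0",
}

def lookup(dst: str) -> str:
    """Longest-prefix match: every packet takes the same bounded lookup, no flow cache."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in fib if addr in net]
    return fib[max(matches, key=lambda net: net.prefixlen)]

print(lookup("10.20.5.1"))   # port 2/3 (the /16 wins over the /8)
print(lookup("192.0.2.1"))   # port 13/0 (default route)
```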
38
Switching
– 4K VLANs
– 128K MAC + L2 ACL entries per port pipe
– 802.1Q VLAN Tagging
– VLAN Stacking
– 802.1p VLAN Prioritization
– 802.3ad Link Aggregation w/ LACP
– 802.1D Spanning Tree Protocol (STP)
– RRR Rapid Root Redundancy
– Force10 VLAN Redundancy Protocol (FVRP)
– MSTP (802.1s), RSTP (802.1w)
– Filtering/Load balancing on L3 header
– 802.3ac Frame Extension for VLAN tagging
[Diagram: modular OS on the Route Processor – separate module for system management; kernel layer with IPC; protocols run as individual processes (MAC Manager, Spanning Tree, ARP Manager, Link Aggregation (LAG), VRRP/ICMP/PPP, System Manager)]
39
MAC Learning & Filtering
MAC Learning:
– MAC learning limit to control number of entries learnt (see the sketch below)
– MAC addresses received beyond the limit are discarded
– L2 ACL entry to prevent discarded MACs from forwarding traffic
– Counter to measure dropped addresses
L2 Interface Filtering:
– Standard MAC ACL: source MAC address
– Extended MAC ACL: SA, DA, Ethernet frame type, VLAN
– Standard IP ACL: source IP address
– Extended IP ACL: IP-DA, IP-SA, protocol type, destination port, source port
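A rough model of the MAC learning limit described above: addresses learned up to the limit populate the table, anything beyond it is discarded and counted, mirroring the behavior the slide lists. The class name and the limit value are illustrative, not switch internals.

```python
class MacTable:
    """Toy model of per-interface MAC learning with a learning limit."""

    def __init__(self, limit: int):
        self.limit = limit
        self.entries = {}        # MAC -> port
        self.discarded = 0       # counter for addresses seen beyond the limit

    def learn(self, mac: str, port: str) -> bool:
        if mac in self.entries:
            return True
        if len(self.entries) >= self.limit:
            self.discarded += 1  # beyond the limit: discard instead of learning
            return False
        self.entries[mac] = port
        return True

table = MacTable(limit=2)
for mac in ("00:11:22:33:44:55", "00:11:22:33:44:66", "00:11:22:33:44:77"):
    table.learn(mac, "gi 0/1")
print(len(table.entries), table.discarded)   # 2 learned, 1 discarded
```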
40
Link Aggregation 802.3ad
Up to 16 Links in a LAG
Up to 256 LAGs Per System
No dependency on slot
– Any slot, any port
– Must be like ports – all 1 GE or all 10 GE
Adding or Deleting Ports from a LAG Does not Require LAG/System Reset
Map traffic onto a link based on (sketched below):
– L2 header: MAC-DA, MAC-SA
– L3 header within L2 packet: IP-DA, IP-SA, protocol type, destination port, source port
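Hash-based link selection of the kind described here can be sketched as follows: the chosen header fields are hashed and the result picks one member link, so a given flow always maps to the same link. The field mix and hash function are illustrative, not the switch's actual algorithm.

```python
import hashlib

def pick_lag_member(fields: tuple, num_links: int) -> int:
    """Map a flow onto one LAG member by hashing selected header fields."""
    key = "|".join(str(f) for f in fields).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_links

# L3 hashing example: IP-DA, IP-SA, protocol, destination port, source port.
flow = ("10.1.1.1", "10.2.2.2", "tcp", 80, 51515)
print(pick_lag_member(flow, num_links=4))   # same flow always lands on the same link
```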
41
Enhanced STP Support
802.1D Standard Spanning Tree – STP can be disabled per interface
STP Portfast Support on Switch Ports Connected to End Hosts
STP BPDU Guard – Disable port if BPDU received on a portfast port.
Rapid Root Redundancy (RRR) Protocol – Sub 50 msec STP recovery from root port failures
Multiple Spanning tree protocol (802.1s) for rapid convergence and efficient link utilization
RSTP (802.1w)
42
CIC OmniPOP Members
43
Thank You
Debbie Montano, dmontano@force10networks.com
Questions?