Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Routers Technologies & Evolution
C. Pham
Slides byNick McKeown, Pankaj Gupta
[email protected]/~nickm
2
“The Internet is a mesh ofrouters”
The Internet Core
IP Core router
IP EdgeRouter
5
Where high performance packetswitches are used
Enterprise WAN access& Enterprise Campus Switch
- Carrier Class Core Router- ATM Switch- Frame Relay Switch
The Internet Core
Edge Router
6
Ex: Points of Presence (POPs)
A
B
C
POP1
POP3POP2
POP4 D
E
F
POP5
POP6 POP7POP8
7
What a Router Looks LikeCisco GSR 12416 Juniper M160
6ft
19”
2ft
Capacity: 160Gb/sPower: 4.2kW
3ft
2.5ft
19”
Capacity: 80Gb/sPower: 2.6kW
8
Basic Architectural Components
OutputScheduling
Control Plane
Datapath”per-packet processing
SwitchingForwardingTable
ReservationAdmissionControl Routing
Table
Routing Protocols
Policing& AccessControl
PacketClassification
Ingress EgressInterconnect
1. 2. 3.
9
Basic Architectural ComponentsDatapath: per-packet processing
2. Interconnect 3. EgressForwarding
Table
ForwardingDecision
1. Ingress
ForwardingTable
ForwardingDecision
ForwardingTable
ForwardingDecision
11
RFC 1812: Requirements for IPv4Routers
• Must perform an IP datagram forwardingdecision (called forwarding)
• Must send the datagram out theappropriate interface (called switching)
Optionally: a router MAY choose to perform specialprocessing on incoming packets
12
Examples of special processing
• Filtering packets for security reasons• Delivering packets according to a pre-
agreed delay guarantee• Treating high priority packets
preferentially• Maintaining statistics on the number
of packets sent by various routers
13
Special Processing RequiresIdentification of Flows
• All packets of a flow obey a pre-definedrule and are processed similarly by therouter
• E.g. a flow = (src-IP-address, dst-IP-address), or a flow = (dst-IP-prefix,protocol) etc.
• Router needs to identify the flow of everyincoming packet and then performappropriate special processing
14
Flow-aware vs Flow-unawareRouters
• Flow-aware router: keeps track offlows and perform similar processingon packets in a flow
• Flow-unaware router (packet-by-packet router): treats each incomingpacket individually
15
Why do we Need Faster Routers?
1. To prevent routers becoming thebottleneck in the Internet.
2. To increase POP capacity, and toreduce cost, size and power.
16
2x / 18 months2x / 7 months
Source « Optical fibers for Ultra-Large CapacityTransmission » by J. Grochocinski
Why we Need Faster Routers1: To prevent routers from being the bottleneck
Added by C. Pham
17
POP with smaller routers
Why we Need Faster Routers2: To reduce cost, power & complexity of POPs
POP with large routers
Ports: Price >$100k, Power > 400W. It is common for 50-60% of ports to be for interconnection.
18
Why are Fast Routers Difficult toMake?
1. It’s hard to keep up with Moore’s Law:– The bottleneck is memory speed.– Memory speed is not keeping up with
Moore’s Law.
19
Memory BandwidthCommercial DRAM
1. It’s hard to keep up with Moore’s Law:– The bottleneck is memory speed.– Memory speed is not keeping up with
Moore’s Law.
0,0001
0,001
0,01
0,1
1
10
100
1000
1980 1983 1986 1989 1992 1995 1998 2001
Access
Tim
e (ns) DRAM
1.1x / 18months
Moore’s Law2x / 18 months
Router Capacity2.2x / 18months
Line Capacity2x / 7 months
20
Why are Fast Routers Difficult to Make?
1. It’s hard to keep up with Moore’s Law:– The bottleneck is memory speed.– Memory speed is not keeping up with
Moore’s Law.2. Moore’s Law is too slow:
– Routers need to improve faster thanMoore’s Law.
21
Router Performance Exceeds Moore’s Law
Growth in capacity of commercial routers:– Capacity 1992 ~ 2Gb/s– Capacity 1995 ~ 10Gb/s– Capacity 1998 ~ 40Gb/s– Capacity 2001 ~ 160Gb/s– Capacity 2003 ~ 640Gb/s
Average growth rate: 2.2x / 18 months.
22
Things that slow routers down• 250ms of buffering
– Requires off-chip memory, more board space, pins and power.• Multicast
– Affects everything!– Complicates design, slows deployment.
• Latency bounds– Limits pipelining.
• Packet sequence– Limits parallelism.
• Small internal cell size– Complicates arbitration.
• DiffServ, IntServ, priorities, WFQ etc.• Others: IPv6, Drop policies, VPNs, ACLs, DOS traceback,
measurement, statistics, …
23
First Generation Routers
Shared Backplane
LineInterface
CPU
Memory
RouteTableCPU Buffer
Memory
LineInterface
MAC
LineInterface
MAC
LineInterface
MAC
Fixed length “DMA” blocksor cells. Reassembled on egress
linecard
Fixed length cells or variable length packets
Typically <0.5Gb/s aggregate capacity
Most Ethernet switches and cheappacket routersBottleneck can be CPU, host-adaptor or I/O bus
24
Output 2
Output N
First Generation RoutersQueueing Structure: Shared Memory
Large, single dynamicallyallocated memory buffer:N writes per “cell” timeN reads per “cell” time.
Limited by memorybandwidth.
Input 1 Output 1
Input N
Input 2
Numerous work has proven andmade possible:– Fairness– Delay Guarantees– Delay Variation Control– Loss Guarantees– Statistical Guarantees
25
Limitations• First generation router built with 133 MHz Pentium
– Mean packet size 500 bytes– Interrupt takes 10 microseconds, word access take 50 ns– Per-packet processing time is 200 instructions = 1.504 µs
• Copy loopregister <- memory[read_ptr]memory [write_ptr] <- registerread_ptr <- read_ptr + 4write_ptr <- write_ptr + 4counter <- counter -1if (counter not 0) branch to top of loop
• 4 instructions + 2 memory accesses = 130.08 ns• Copying packet takes 500/4 *130.08 = 16.26 µs; interrupt 10 µs• Total time = 27.764 µs => speed is 144.1 Mbps• Amortized interrupt cost balanced by routing protocol cost
26
Second Generation Routers
RouteTableCPU
LineCard
BufferMemory
LineCard
MAC
BufferMemory
LineCard
MAC
BufferMemory
FwdingCache
FwdingCache
FwdingCache
MAC
Slow Path
Drop PolicyDrop Policy OrBackpressure
OutputLink
Scheduling
BufferMemory
Typically <5Gb/s aggregate capacity
Port mappingintelligence in linecards-better forconnection modeHigher hit rate inlocal lookup cache
27
RouteTableCPU
Second Generation RoutersAs caching became ineffective
LineCard
BufferMemory
LineCard
MAC
BufferMemory
LineCard
MAC
BufferMemory
FwdingTable
FwdingTable
FwdingTable
MAC
ExceptionProcessor
28
Second Generation RoutersQueueing Structure: Combined Input and Output
Queueing
Bus
1 write per “cell” time 1 read per “cell” timeRate of writes/reads determined by bus speed
29
Third Generation Routers
LineCard
MAC
LocalBufferMemory
CPUCard
LineCard
MAC
LocalBufferMemory
Switched Backplane
LineInterface
CPUMemory Fwding
Table
RoutingTable
FwdingTable
Typically <50Gb/s aggregate capacity
Third generation switch provides parallel paths (fabric)
30
Arbiter
Third Generation RoutersQueueing Structure
Switch
1 write per “cell” time 1 read per “cell” timeRate of writes/reads determined by switch
fabric speedup
31
Arbiter
Third Generation RoutersQueueing Structure
Switch
1 write per “cell” time 1 read per “cell” timeRate of writes/reads determined by switch
fabric speedup
Per-flow/class or per-output queues (VOQs)
Per-flow/class or per-input queues
Flow-controlbackpressure
32
Review: crossbar, general design
• Simplest possiblespace-division switch
• Crosspoints can beturned on or off, longenough to transfer apacket from an inputto an output
• Expensive– need N2 crosspoints– time to set each
crosspoint growsquadratically
configuration
Dat
a In
Data Out
Added by C. Pham
33
Switch Fabrics: Bufferedcrossbar (packets)
• What happens ifpackets at twoinputs both want togo to same output?
• Can defer one at aninput buffer
• Or, buffer cross-points: complexarbiter
Added by C. Pham
34
Multistage crossbar
• In a crossbar during eachswitching time only onecross-point per row orcolumn is active
• Can save crosspoints if across-point can attach tomore than one input line
• This is done in amultistage crossbar
N/narraysn x k
karraysN/n x N/n
N/narraywk x n
35
Switch fabric element
• Goal: towards building “self-routing”fabrics
• Can build complicated fabrics from a simpleelement
• Routing rule: if 0, send packet to upperoutput, else to lower output
– If both packets to same output, buffer or drop
0
1
data 10
data 00
Added by C. Pham
36
Banyan element (1 possible conf.)
110
001
ATM has boosted research on high-performance switches
0
1
data 10
data 00
Added by C. Pham
37
Batcher-Banyan switch
a same direction than arrow if a > b,a opposite direction if a is alone
5
7
5
7
101
111
Added by C. Pham
38
Buffer management
• Inputbuffers
• Outputbuffer
Added by C. Pham
39
Fourth Generation Routers/Switches
Switch Core Linecards
Optical links
100’sof feet
40
Physically Separating Switch Coreand Linecards
• Distributes power over multiple racks.• Multistage, clustering• Allows all buffering to be placed on
the linecard:– Reduces power.– Places complex scheduling, buffer mgmt,
drop policy etc. on linecard.
41
Do optics belong in routers?
They are already there.– Connecting linecards to switches.
Optical processing doesn’t belong onthe linecard.– You can’t buffer light.– Minimal processing capability.
Optical switching can reduce power.
42
Optics in routers
Switch Core Linecards
Optical links
43
Complex linecards
PhysicalLayer
Framing&
Maintenance
PacketProcessing
Buffer Mgmt&
Scheduling
Buffer Mgmt&
Scheduling
Buffer & StateMemory
Buffer & StateMemory
Typical IP Router Linecard
10Gb/s linecard: Number of gates: 30M Amount of memory: 2Gbits Cost: >$20k Power: 300W
LookupTables
SwitchFabric
Arbitration
Optics
44
Replacing the switch fabric with optics
SwitchFabric
Arbitration
PhysicalLayer
Framing&
Maintenance
PacketProcessing
Buffer Mgmt&
Scheduling
Buffer Mgmt&
Scheduling
Buffer & StateMemory
Buffer & StateMemory
Typical IP Router LinecardLookupTables
OpticsPhysicalLayer
Framing&
Maintenance
PacketProcessing
Buffer Mgmt&
Scheduling
Buffer Mgmt&
Scheduling
Buffer & StateMemory
Buffer & StateMemory
Typical IP Router LinecardLookupTables
Opticselectrical
SwitchFabric
Arbitration
PhysicalLayer
Framing&
Maintenance
PacketProcessing
Buffer Mgmt&
Scheduling
Buffer Mgmt&
Scheduling
Buffer & StateMemory
Buffer & StateMemory
Typical IP Router LinecardLookupTables
OpticsPhysicalLayer
Framing&
Maintenance
PacketProcessing
Buffer Mgmt&
Scheduling
Buffer Mgmt&
Scheduling
Buffer & StateMemory
Buffer & StateMemory
Typical IP Router LinecardLookupTables
Optics
optical
Req/Grant Req/Grant
Candidate technologies1. MEMs.2. Fast tunable lasers + passive optical couplers.3. Diffraction waveguides.4. Electroholographic materials.
48
Some Mc Keown's predictionsabout core Internet routers
• The need for more capacity for a given power and volumebudget will mean:
• Fewer functions in routers:– Little or no optimization for multicast,– Continued overprovisioning will lead to little or no support for
QoS, DiffServ, …,• Fewer unnecessary requirements:
– Mis-sequencing will be tolerated,– Latency requirements will be relaxed.
• Less programmability in routers, and hence no networkprocessors.
• Greater use of optics to reduce power in switch.
49
What McKeown believe is mostlikely
The need for capacity and reliability will mean:
• Widespread replacement of core routers withtransport switching based on circuits:– Circuit switches have proved simpler, more
reliable, lower power, higher capacity and lowercost per Gb/s. Eventually, this is going to matter.
– Internet will evolve to become edge routersinterconnected by rich mesh of DWDM circuitswitches.