
Gigabit Routing on a Software-exposed Tiled-Microprocessor

James W Anderson, Anthony Degangi, Anant Agarwal, Umar Saif
MIT Computer Science and AI Laboratory


Network Routers


Network “Switch”: x Kb/sec, ~5 ports
Network “Processor”: x Gb/sec, ~10^2 ports

Three Challenges

Performance
– 5 -- 10 Gb/sec (OC-192)

Architectural Scalability
– Throughput: x2.2/year
– Port count: 10 -- 100 for edge routers

Programmability
– Network Services: NAT, firewalls, VPN, “Layer 7” switches
– Monitoring: Loss rate, link utilization, traffic patterns

Network Processors

Conventional Wisdom

Tiled “all-purpose” architectures

MIT RAW Microprocessor

• Tiled architecture
• Low-latency mesh networks
• Software-exposed pins

Each tile:
• Compute pipeline: 8-stage 32b MIPS-style single-issue in-order compute processor
• 4-stage 32b pipelined FPU
• 32 KB DCache
• 32 KB IMem
• Routers and wires for three on-chip mesh networks

On-chip networks:
• 8 32-bit channels
• Registered at input; longest wire = length of tile
• 2 DOR dynamic networks
  • Memory Dynamic (MDN)
  • General Dynamic (GDN)
• 2 Static Networks
  • Streaming, Tile-Multicast

RAW Microprocessor

RAW --> Network Routing
• Software-exposed tiled-architecture --> Parallel processing
• Software-exposed pins --> Flexible buffering
• Software-exposed point-to-point networks --> Efficient, scalable switching

However ..

Processing
– Network Processors: Special-purpose hardware
– RAW Microprocessor: Software running on RAW general-purpose tiles

Switching
– Network Processors: Special-purpose switching fabric
– RAW Microprocessor: RAW general-purpose on-chip networks

Buffering
– Network Processors: Centrally accessible, with specialized memory controllers and dedicated interconnects
– RAW Microprocessor: External to the chip, connected to software-exposed pins and accessed via RAW on-chip networks

IPv4 Router: RFC 1812

Look-up
– DIR-24-8-BASIC [Gupta98]

Header verification

TTL update, header re-compute
– Incremental Checksum [RFC 1141]

Switch to destination
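The verify and TTL-update steps can be sketched in a few lines. This is a minimal illustration, not the router's actual tile code: the function names and the sample header are hypothetical, only a subset of the RFC 1812 header checks is shown, and the update uses the RFC 1141 observation that decrementing the TTL lowers one 16-bit header word by 0x0100, so the checksum rises by 0x0100 with end-around carry.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of a byte string of even length."""
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                      # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def header_checksum(header: bytes) -> int:
    """Full recomputation, treating the checksum field (bytes 10-11) as zero."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return ~ones_complement_sum(zeroed) & 0xFFFF


def verify(header: bytes) -> bool:
    """Subset of the header checks: version, header length, checksum."""
    version, ihl = header[0] >> 4, header[0] & 0x0F
    if version != 4 or ihl < 5 or len(header) < ihl * 4:
        return False
    return ones_complement_sum(header[: ihl * 4]) == 0xFFFF


def forward_update(header: bytes) -> bytes:
    """Decrement TTL (byte 8) and patch the checksum incrementally."""
    ttl = header[8]
    if ttl <= 1:
        raise ValueError("TTL expired; packet must not be forwarded")
    old = int.from_bytes(header[10:12], "big")
    new = old + 0x0100                      # TTL is the high byte of its 16-bit word
    new = (new & 0xFFFF) + (new >> 16)      # end-around carry
    return (header[:8] + bytes([ttl - 1]) + header[9:10]
            + new.to_bytes(2, "big") + header[12:])


def build_header(ttl: int) -> bytes:
    """Hypothetical 20-byte test header (protocol 17, total length 64)."""
    raw = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 64, 0, 0, ttl, 17, 0,
                      bytes([10, 0, 0, 1]), bytes([192, 168, 1, 7]))
    return raw[:10] + header_checksum(raw).to_bytes(2, "big") + raw[12:]
```

The incremental form touches two header bytes instead of re-summing all ten header words, which is why it suits a per-packet forwarding path.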

Evaluation Methodology

Maximum Loss Free Forwarding Rate (MLFFR)
– Minimum-sized 64-byte packets
  • Millions of packets per second (mpps)
– Maximum-sized 1500-byte packets
  • Gigabit/sec

Captured Internet Trace: ~128 bytes

Packet Latency

RAW clocked at 425 MHz

Comparison with IXP1200 as a reference point
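The two reporting units are related by simple arithmetic, sketched below. The helper names are hypothetical, and the conversion counts only the packet bytes themselves (no preamble or inter-frame gap), which is an assumption rather than something the slides specify.

```python
def gbps_from_mpps(mpps: float, packet_bytes: int) -> float:
    """Bit rate in Gb/sec for a packet rate given in millions of packets/sec."""
    return mpps * packet_bytes * 8 / 1000.0


def mpps_from_gbps(gbps: float, packet_bytes: int) -> float:
    """Packet rate in mpps needed to sustain a given bit rate."""
    return gbps * 1000.0 / (packet_bytes * 8)


def cycles_per_packet(clock_mhz: float, mpps: float) -> float:
    """Clock cycles available per packet at a given forwarding rate."""
    return clock_mhz / mpps


if __name__ == "__main__":
    # Packet rate a 10 Gb/sec link implies at each packet size.
    for size in (64, 1500):
        print(size, "bytes:", round(mpps_from_gbps(10.0, size), 2), "mpps")
    # Cycle budget per packet at the 425 MHz clock for a 10 mpps stream.
    print(cycles_per_packet(425.0, 10.0), "cycles/packet")
```

Small packets stress the per-packet work (lookup, verify, update), while large packets stress raw data movement, which is why the methodology reports mpps for one and Gb/sec for the other.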

RAW Router, Take 1: Parallelism

[Figure labels: DRAM, SRAM, eight Line Cards, Lookup tables, Packet Buffer. Tile roles: Lookup (2 stage lookup), Header Verify, Header recompute, Interrupt Drain-tile, Drain FIFO.]
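The two-stage lookup presumably reflects the DIR-24-8-BASIC scheme named earlier: a first table indexed by the top 24 address bits, and a second table consulted only for prefixes longer than 24 bits. The sketch below follows that published layout under some loudly stated assumptions: the class name is hypothetical, the first table is a sparse dict standing in for the full 2^24-entry array, next hop 0 means "no route", and routes are inserted shortest-prefix-first so longer prefixes overwrite shorter ones.

```python
import ipaddress

LONG_FLAG = 0x8000  # top bit of a 16-bit first-table entry: "go to second table"


class Dir248:
    def __init__(self, routes):
        """routes: iterable of (cidr_string, next_hop), with 1 <= next_hop < 0x8000."""
        self.tbl24 = {}     # sparse stand-in for the 2**24-entry table (missing = 0)
        self.tbllong = []   # concatenated 256-entry blocks for prefixes longer than /24
        nets = [(ipaddress.ip_network(cidr), hop) for cidr, hop in routes]
        for net, hop in sorted(nets, key=lambda r: r[0].prefixlen):
            self._insert(int(net.network_address), net.prefixlen, hop)

    def _insert(self, addr, plen, hop):
        idx = addr >> 8
        if plen <= 24:
            # Expand the prefix over every 24-bit index it covers.
            for i in range(idx, idx + (1 << (24 - plen))):
                self.tbl24[i] = hop
            return
        entry = self.tbl24.get(idx, 0)
        if not entry & LONG_FLAG:
            # Allocate a block, pre-filled with the covering shorter route.
            block = len(self.tbllong) >> 8
            self.tbllong.extend([entry] * 256)
            entry = LONG_FLAG | block
            self.tbl24[idx] = entry
        base = (entry & 0x7FFF) << 8
        low = addr & 0xFF
        for off in range(low, low + (1 << (32 - plen))):
            self.tbllong[base + off] = hop

    def lookup(self, ip):
        """At most two table reads per address."""
        addr = int(ipaddress.ip_address(ip))
        entry = self.tbl24.get(addr >> 8, 0)
        if entry & LONG_FLAG:
            entry = self.tbllong[((entry & 0x7FFF) << 8) | (addr & 0xFF)]
        return entry
```

For example, `Dir248([("10.0.0.0/8", 1), ("10.1.2.128/25", 3)]).lookup("10.1.2.200")` takes the second-table path. The bounded number of memory reads is what makes the scheme attractive when lookup tables live in off-chip memory.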

Flow of Packets

[Figure: flow of packets among the eight Line Cards, the Lookup tiles, and DRAM.]

L: Lookup, V: Verify, U: Update, D: Drain

RAW Router, Take 1

• Static Network for Streaming Packets
  • Feed the pipeline
  • Stream the payload to DRAM
• General Dynamic Network
  • Header Forwarding 3 -> 4
• Memory Dynamic Network
  • From memory to line-card

[Figure labels: DRAM, SRAM, eight Line Cards. Legend: Static, MDN, GDN.]

Version I Performance

1.8 Gb/sec --> 6.17 Gb/sec
2.9 mpps --> 6.23 mpps

RAW Router Version 1

[Figure labels: DRAM, SRAM, eight Line Cards.]

Bus Contention

Shared Buffering

Memory Dynamic Network

DOR: x --> y
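Dimension-ordered routing (x first, then y) fixes every path in advance, which is why flows can collide on the same links. A minimal sketch follows, assuming tiles are addressed by hypothetical (x, y) grid coordinates; the function names are illustrative only.

```python
def dor_route(src, dst):
    """Tiles visited from src to dst: travel along x first, then along y."""
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append((x, y))
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


def shared_links(route_a, route_b):
    """Directed links of route_b that route_a also uses (potential contention)."""
    links_a = set(zip(route_a, route_a[1:]))
    return [link for link in zip(route_b, route_b[1:]) if link in links_a]


if __name__ == "__main__":
    a = dor_route((0, 0), (3, 0))
    b = dor_route((1, 0), (3, 2))
    print(a)
    print(b)
    print(shared_links(a, b))   # links both flows must cross
```

Because the route depends only on source and destination, the only way to avoid shared links is to change where functions and memories are placed, which is the motivation for revisiting the layout.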

RAW Router, Take 2: Buffering and Switching

[Figure labels: four SDRAMs, eight Line Cards, Lookup tiles. Tile roles: Lookup (2 stage lookup), Header Verify, Header recompute, Interrupt Drain-tile, Drain FIFO.]

RAW Router, Take 2

• Respects DOR
• No “bus contention” for DMAs (bottleneck is shared SDRAMs)
• 2x Memory BW
• No need to look at packet length
• Dynamic networks for “out-of-band” communication

[Figure labels: four SDRAMs, eight Line Cards, Lookup tiles. Legend: Static, MDN, GDN.]

Optimized buffering and switching

6.17 Gb/sec --> 8.68 Gb/sec
6.17 mpps --> 6.77 mpps

RAW Router, Take 3: Reducing Memory Transactions

[Figure labels: DRAM, SDRAM, eight Line Cards.]

Streaming DDR

No fragmentation of frames

Pipelined Memory Requests

Streaming packet buffers + 64-byte minimum buffering

8.68 Gb/sec --> 9.57 Gb/sec
6.77 mpps --> 9.79 mpps

Buffering on Line-cards

9.57 Gb/sec --> 15.03 Gb/sec
9.79 mpps --> 9.79 mpps

All dynamic networks

9.57 Gb/sec --> 8.50 Gb/sec
9.57 mpps --> 6.94 mpps

Evaluation with captured Trace

Packet Latency

Router     Packet size (bytes)   Cycles   Time(ns)
RAW null   64                    416      177
RAW IPv4   64                    690      293
RAW null   1500                  3490     1483
RAW IPv4   1500                  5394     2292
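For reading the latency figures against the 425 MHz clock stated in the methodology, the sketch below converts between cycle counts and nanoseconds (one cycle is about 2.35 ns). The helper names are hypothetical. Note that in every row the smaller figure equals the larger one multiplied by 0.425 (for example 416 x 0.425 is about 177), which is the relation between a duration in nanoseconds and its cycle count at this clock; that is worth keeping in mind when reading the column labels.

```python
CLOCK_MHZ = 425.0  # RAW clock rate stated in the evaluation methodology


def cycles_to_ns(cycles: float, clock_mhz: float = CLOCK_MHZ) -> float:
    """Duration in nanoseconds of a given number of clock cycles."""
    return cycles * 1000.0 / clock_mhz


def ns_to_cycles(ns: float, clock_mhz: float = CLOCK_MHZ) -> float:
    """Number of clock cycles that fit in a given duration."""
    return ns * clock_mhz / 1000.0


# (router, packet size in bytes, larger figure, smaller figure) from the table.
ROWS = [
    ("RAW null", 64, 416, 177),
    ("RAW IPv4", 64, 690, 293),
    ("RAW null", 1500, 3490, 1483),
    ("RAW IPv4", 1500, 5394, 2292),
]

if __name__ == "__main__":
    for name, size, larger, smaller in ROWS:
        print(name, size, larger, smaller, round(ns_to_cycles(larger)))
```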

Conclusions

Tiled-architectures = NPU performance + enhanced programmability

RAW’s low-level software-control was vital for deriving performance:
– Layout of routing functions
  • 30% improvement by altering layout
– Role and behavior of the on-chip networks
  • 15% improvement by using GDN and static networks in place of MDN

Conclusions

Network oblivious: 30-35% degradation

No Static networks: 10-30% degradation

Buffering on line-cards: 35% improvement


Thank you!

Questions: [email protected]