SOCSA Slides: Network Processors

© Institute for Integrated Systems Technische Universität München www.lis.ei.tum.de

Case Study: Network Processors

System-on-Chip

Solutions & Architectures A. Herkersdorf

© Institute for

Integrated Systems

A. Herkersdorf SoC - Network Processors - 2

Network Processors

Motivation
• Where will network processors be used?
• What is the "business case" for network processors?
• Status of today's NP products

IC Requirements for Network Processors
• Processor/CPU requirements
• Memory requirements
• On-chip interconnect requirements


Real-world case studies

[Figure: network overview showing where network processors are deployed. Labels: Firewall, Edge Router, Core Router, Internet Router, LAN/SAN Switch, Server Adapters (NIC), Cable, Cable Headend, Base Station Controllers, Sonet/SDH Transmission]


Networking Trends

• Voice/data integration, both in the WAN backbone and in wireless access
• New broadband access technologies: xDSL, cable
• Need for service differentiation: interactive real-time, streaming, best effort
• Security (crypto, authentication): business use of the Internet, prevention of attacks (firewalls, virus scanning)
• Emerging standards still in flux: IETF, ETSI, ANSI
• Ever-growing data link rates

[Figure: protocol layers processed in networking equipment over time, with the implementation moving from ASIC to network processor. History: L2 / VLAN. Today: L2 / VLAN and Layer 3 (IP routing). Current trend: adds Layer 4 (DiffServ, security, filtering). Future: adds deep packet processing.]

Demand for high performance and flexibility in future networking components


Network Processor Status

• No industry-standard NP architecture established yet
• High conceptual diversity among different vendors
• High-end tends towards chip-set solutions, low-end towards fully integrated NPs
• Market focus is on multi-Gb Ethernet (EN) and OC-48

Trends

• Towards high-level programming language support
• Standard third-party tool chain
• Push towards standardized interfaces
• Coprocessor-NP interworking among different vendors
• NP Forum (API, Streaming / LookAside i/f), IETF ForCES

| | LSI | AMCC | Netronome | Xelerated |
|---|---|---|---|---|
| Product | APP3300 | nP3700 | IXP2855 | X11 |
| Throughput [Gbps] | 0.6-3.5 | 5 | 5 | 10-24 |
| Op. freq. | 290 MHz | 700 MHz | 1.5 GHz | 240 MHz |
| # proc. | 2C + ? D | 3 | 16 | 160 |
| # threads | ? | 72 | 128 | 160 |
| Proc. struct. | SMP CP, D-Pipeline | SMP, 1 thr./pkt | Pipeline | Pipeline |
| # coproc. | 10 | 5 | 3 | 10 |
| Mem type | DDR-RAM, Flash | QDR-SRAM, RLDRAM | QDR-SRAM, RDRAM | RL-/DDR2, SRAM |
| # chips | 1 | 1 | 1 | 1 |
| Power [W] | (garbled in source) | ? | 27-32 | 10 (X10q) |

Sources: www.lsi.com; www.amcc.com; www.intel.com; www.xelerated.com (2008)


LSI APP3300

NP targeted for

• wireline access (MSAN, DSLAM)

• wireless infrastructure (Node B/RNC)

• multi-service switch/router

and integrates

• classifier

• traffic manager

• control & service processor (2 ARM11)

• security protocol processor (1.5 Gbit/s)

• high-level language programmable data path processor pipeline (proprietary)

• MACs for different interface types

Source: www.lsi.com


AMCC – nP3700

Multi-threaded SMP: 3 CPU cores with 24 threads each

Dedicated coprocessors:
• Traffic management (schedule, shape, queuing)
• Policy / rule mgmt.
• Hashing (search)
• Statistics

Fast QDR SRAM and RLDRAM interfaces

Applications: VoIP/media GW, edge routers, WLAN enterprise access

Source: www.amcc.com


Netronome IXP2855 (formerly Intel)

16-core processor pipeline:

• 8 threads/core

• special core2core interconnect

• max. 1.5 GHz

Two Crypto-Engines:

• max. 10 Gbps IPSec

Hash Engine

Multiple Memory connections:

• 4x QDR SRAM/TCAM

• 3x RDRAM

XScale control plane processor (750 MHz)

Sources: www.intel.com, www.netronome.com


Xelerated X11

Multi-stage processor pipeline:

• 5x 32 PISC processors, 1 instr./stage, deterministic performance

• 5 engine access points for coprocessor calls

10 co-processors

Buffer Manager (Q, TS)

Memory interfaces:

• RLDRAM/DDR2-SDRAM

• SRAM/TCAM i/f

• internal TCAM

Source: www.xelerated.com


Anatomy of Switch/Router Systems

Determines box function: Switch, Router, Gateway, etc.

[Figure: block diagram of a switch/router system. Blocks: Line Interface, Network Processor, Switch Fabric, System Processor, Backplane]


NP CPU Performance Requirements

Network processing:
• Wide performance spectrum (x15) between applications for the same link rate
• High absolute MIPS

[Figure: required MIPS per application, with annotated values ranging from 800 MIPS to 12 K MIPS]

Q: Determine the back-to-back packets/s arrival rate on OC-3 to OC-192 links, and the required per-packet instruction budget for L2 switching, QoS forwarding, and virus scanning at OC-48. Assume a packet size of 64 bytes.

Source: [3] Jenkins, "NPU Co-Processors", 2000


NP CPU Performance Requirements

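$$\text{pkt/s} = \frac{\text{link rate [b/s]}}{\text{pkt size [b/pkt]}}, \qquad \text{Instr/pkt} = \frac{\text{MIPS}}{\text{Mpkt/s}}$$

| | OC-3 | OC-12 | OC-48 | OC-192 |
|---|---|---|---|---|
| Mbps | 155.5 | 622.1 | 2488 | 9953 |
| pkt/s | 304 K | 1.21 M | 4.86 M | 19.44 M |
| s/pkt | 3.29 µs | 823 ns | 206 ns | 51 ns |

At OC-48 (4.86 Mpkt/s), with 800 MIPS for L2 switching, 3000 MIPS for QoS forwarding, and 12 K MIPS for virus scanning:

$$\text{Instr/pkt}\,\big|_{\text{L2}} = 800 / 4.86 \approx 165$$

$$\text{Instr/pkt}\,\big|_{\text{QoS}} = 3000 / 4.86 \approx 617$$

$$\text{Instr/pkt}\,\big|_{\text{virus}} = 12000 / 4.86 \approx 2470$$

Q: For the same applications, what are the Instr/pkt requirements for OC-3 and OC-192?

A: Same as for OC-48. The instruction count per packet is a property of the application; only the required MIPS scale with the link rate.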
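The packet rates and per-packet instruction budgets on this slide can be reproduced with a short Python sketch. The function and variable names are illustrative and do not come from the slides; the link rates, the 64-byte packet size, and the MIPS figures are the ones used in the calculation on this slide.

```python
# Link rates in Mbps as listed on the slide; 64-byte back-to-back packets assumed.
LINK_RATE_MBPS = {"OC-3": 155.5, "OC-12": 622.1, "OC-48": 2488.0, "OC-192": 9953.0}
PACKET_BITS = 64 * 8

# Required MIPS per application at OC-48, as used in the slide's calculation.
APP_MIPS_AT_OC48 = {"L2 switching": 800, "QoS forwarding": 3000, "virus scanning": 12000}


def packets_per_second(link_rate_mbps: float, packet_bits: int = PACKET_BITS) -> float:
    """Back-to-back packet arrival rate for a given link rate."""
    return link_rate_mbps * 1e6 / packet_bits


def instructions_per_packet(mips: float, link_rate_mbps: float) -> float:
    """Instruction budget per packet: (instructions/s) / (packets/s)."""
    return mips * 1e6 / packets_per_second(link_rate_mbps)


# Packet rate and per-packet time slot for each link type.
for link, rate in LINK_RATE_MBPS.items():
    pps = packets_per_second(rate)
    print(f"{link:7s} {pps / 1e6:6.2f} Mpkt/s  {1e9 / pps:7.1f} ns/pkt")

# Per-packet instruction budget for each application at OC-48.
for app, mips in APP_MIPS_AT_OC48.items():
    budget = instructions_per_packet(mips, LINK_RATE_MBPS["OC-48"])
    print(f"{app:15s} {budget:7.0f} instr/pkt")
```

Computed from the unrounded packet rate, the budgets come out fractionally below the slide's rounded values (for example about 2469 instead of 2470 for virus scanning); the difference is rounding only.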


NP CPU Performance Requirements

How many embedded processor cores would we need to operate in parallel to support virus scanning at OC-48 rate?

Processing element (PE):
• 32 bit, dual-issue, single-threaded RISC CPU, 600 MHz, CPI = 0.7
• 32 KByte I/D cache
• 4.5 W, 10 mm², 0.13 µm CMOS

System memory:
• 133 MHz DDR SDRAM
• 32-bit data bus, 1 M words in each of 4 banks
• 8 – 0.5 access cycles

Assumptions:
• D-cache miss rate: 2 %
• I-cache miss rate: 0
• Data access freq.: 20 %

[Figure: N processing elements (PE) connected to shared memory (Mem) and I/O]


NP CPU Performance Requirements

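Microprocessor lecture:

$$\text{CPU time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$

• Instructions / Program: application specific
• Clock cycles / Instruction (CPI): CPU and memory hierarchy dependent
• Seconds / Clock cycle: $1 / f_{CPU}$

Applied to virus scanning at OC-48, where N cores share the 206 ns packet slot and the CPI is still unknown:

$$N \times 206\,\text{ns} = 2470 \times \text{CPI} \times \frac{1}{600\,\text{MHz}}$$

$$\text{CPI} = \text{CPI}_{CPU} + \text{CPI}_{MEM} = \text{CPI}_{CPU} + f(D_{acc}) \times \text{Cache}_{miss\,rate} \times \text{Cache}_{miss\,penalty}$$

$$\text{CPI} = 0.7 + 0.2 \times 0.02 \times 8 \times (600 / 133) = 0.7 + 0.14 = 0.84$$

$$N = \left\lceil \frac{2470 \times 0.84}{206\,\text{ns} \times 600\,\text{MHz}} \right\rceil = \lceil 16.79 \rceil = 17$$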
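The calculation on this slide can be checked with a short Python sketch. The helper names `effective_cpi` and `cores_needed` are illustrative and not from the lecture; all numeric inputs are the slide's assumptions (600 MHz CPU, 133 MHz memory, 8 memory cycles per miss, 20 % data accesses, 2 % D-cache miss rate).

```python
import math


def effective_cpi(cpi_cpu: float, data_access_freq: float,
                  miss_rate: float, miss_penalty_cycles: float) -> float:
    """CPI = CPI_CPU + CPI_MEM, with CPI_MEM = access freq. x miss rate x miss penalty."""
    return cpi_cpu + data_access_freq * miss_rate * miss_penalty_cycles


def cores_needed(instr_per_pkt: float, cpi: float,
                 f_cpu_hz: float, pkt_time_s: float) -> int:
    """Number of cores so that N packet slots cover the CPU time of one packet."""
    cpu_time_per_pkt = instr_per_pkt * cpi / f_cpu_hz
    return math.ceil(cpu_time_per_pkt / pkt_time_s)


F_CPU_HZ = 600e6
F_MEM_HZ = 133e6
# Miss penalty: 8 memory cycles, expressed in CPU cycles.
MISS_PENALTY_CYCLES = 8 * F_CPU_HZ / F_MEM_HZ

cpi = effective_cpi(0.7, 0.20, 0.02, MISS_PENALTY_CYCLES)
n = cores_needed(2470, cpi, F_CPU_HZ, 206e-9)
print(f"CPI = {cpi:.2f}, N = {n}")  # prints "CPI = 0.84, N = 17"
```

Without intermediate rounding the quotient is about 16.87 rather than 16.79, which still rounds up to 17 cores.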


NP CPU Performance Requirements

Q: By how much does N increase when the D-cache miss rate increases to 5 %?

$$\text{CPI}_{MEM} = 0.2 \times 0.05 \times 8 \times (600 / 133) = 0.36$$

$$N' = \lceil 21.18 \rceil = 22$$

Five more processor cores (+22.5 W) when the cache miss rate increases from 2 % to 5 %.

N = 17 means: 170 mm² (about 13 x 13 mm) and 76.5 W.

Reasons why our calculation is even optimistic:
• No interconnect latencies nor contention on shared memory access considered
• No runtime kernel (or OS) on the CPU considered
• D-cache miss rate might well be higher, depending on application context size
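A small Python sketch makes the sensitivity to the D-cache miss rate explicit, together with the resulting power and silicon area. The function name and default parameters are illustrative; the defaults restate the slide assumptions (2470 instr/pkt, 206 ns packet slot, 600 MHz CPU, 133 MHz memory, 4.5 W and 10 mm² per core). The 10 % miss rate is a hypothetical extra data point, not a value from the lecture.

```python
import math

CORE_POWER_W = 4.5     # per-core power from the slide assumptions
CORE_AREA_MM2 = 10.0   # per-core area from the slide assumptions


def cores_needed(miss_rate: float, instr_per_pkt: float = 2470, cpi_cpu: float = 0.7,
                 data_access_freq: float = 0.20, miss_penalty_mem_cycles: float = 8,
                 f_cpu_hz: float = 600e6, f_mem_hz: float = 133e6,
                 pkt_time_s: float = 206e-9) -> int:
    """Cores required for one packet slot, as a function of the D-cache miss rate."""
    miss_penalty = miss_penalty_mem_cycles * f_cpu_hz / f_mem_hz  # in CPU cycles
    cpi = cpi_cpu + data_access_freq * miss_rate * miss_penalty
    return math.ceil(instr_per_pkt * cpi / (f_cpu_hz * pkt_time_s))


# 2 % and 5 % are the slide's cases; 10 % is a hypothetical worse case.
for miss_rate in (0.02, 0.05, 0.10):
    n = cores_needed(miss_rate)
    print(f"miss rate {miss_rate:4.0%}: N = {n:2d}, "
          f"power = {n * CORE_POWER_W:5.1f} W, area = {n * CORE_AREA_MM2:5.0f} mm2")
```

The model still ignores interconnect latency, memory contention, and OS overhead, so the real core count would be higher.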


Conclusions (1)

• No one-size-fits-all solution for NPs, not even within one link-speed class
• Different NP solutions for different application ranges and network locations
• CPU-only resources aren't the way to approach network processing

Great insight! But what is it, then, that makes up network processors?


NP Applications by Sub-Function

• Imbalance in processing requirements between NP sub-functions
• Packet classification, security, and traffic management are the most performance-demanding functions

Q: What are the remaining CPU performance requirements for firewalls and security network interface cards if classification, traffic shaping, encryption, and compression were implemented by dedicated coprocessors?

A: Firewall: ≈ 10 %, SecNIC: ≈ 15 %


Network Processor Resource Mix

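[Figure: design space of SoC resource types on logarithmic axes. Axes: computational density = performance / area (annotated 10^3 ... 10^4), power consumption (annotated 10^5 ... 10^6), and flexibility / functional diversity. Plotted resource types: CPU, DSP, ASIP, FPGA, ASIC, custom IC, with the network processor marked among them.]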

Network processors address the demand for high flexibility and performance with a mix of different SoC resources.

The M-$ question: What is the right balance between these resources, and how can they be assembled into a homogeneous, high-performance system?


Network Processor Resource Mix

[Figure: resource types (ASIP, DSP, CPU, FPGA, ASIC, custom IC) combined in the network processor]

ASIC resources:
• Link interfaces: EN Phy/MAC, SONET framer
• Memory controller
• Std. peripherals

FPGA/ASIP:
• Generic coprocessors: classification, traffic mgmt., etc.

CPUs:
• Reserved for functionality with the highest flexibility demand
• Secure the "future-proofness" of the device

Hardware multi-threading (see Microprocessor lecture) is strongly recommended for CPUs to hide memory and coprocessor access latencies!
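Why multi-threading hides latency can be illustrated with a deliberately idealized utilization model: each thread alternates between a run phase and a stall phase, and the core switches threads at zero cost. This model and the name `core_utilization` are assumptions for illustration, not part of the lecture. The cache-miss numbers reuse the earlier CPU example (CPI = 0.7, 20 % data accesses, 2 % miss rate, 8 memory cycles at 133 MHz versus a 600 MHz CPU); the coprocessor call with 100 run cycles and 200 stall cycles is purely hypothetical.

```python
def core_utilization(threads: int, run_cycles: float, stall_cycles: float) -> float:
    """Fraction of cycles spent on useful work (idealized, zero-cost thread switch)."""
    return min(1.0, threads * run_cycles / (run_cycles + stall_cycles))


CPI_CPU = 0.7
MISSES_PER_INSTR = 0.20 * 0.02             # data access frequency x D-cache miss rate
RUN_CYCLES = CPI_CPU / MISSES_PER_INSTR    # about 175 CPU cycles between two misses
STALL_CYCLES = 8 * 600 / 133               # about 36 CPU cycles per miss

print("cache misses (numbers from the CPU example):")
for threads in (1, 2, 4):
    u = core_utilization(threads, RUN_CYCLES, STALL_CYCLES)
    print(f"  {threads} thread(s): utilization = {u:.2f}")

print("hypothetical coprocessor call (100 run cycles, 200 stall cycles):")
for threads in (1, 2, 3):
    u = core_utilization(threads, 100, 200)
    print(f"  {threads} thread(s): utilization = {u:.2f}")
```

In this model a single thread reaches about 83 % utilization under cache misses alone, which matches the ratio 0.7 / 0.84 from the CPI calculation; long coprocessor latencies need more threads to keep the core busy.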