2003 ©UCR CS 162 Computer Architecture Lecture 8: Introduction to Network Processors (II) Instructor: L.N. Bhuyan www.cs.ucr.edu/~bhuyan/cs162

CS 162 Computer Architecture Lecture 8: Introduction to Network Processors (II)


Page 1: CS 162 Computer Architecture  Lecture 8: Introduction to Network Processors (II)

1 2003 ©UCR

CS 162 Computer Architecture Lecture 8: Introduction to Network Processors (II)

Instructor: L.N. Bhuyan

www.cs.ucr.edu/~bhuyan/cs162

Outline

°Introduction to NP Systems

°Relevant Applications

°Design Issues and Challenges

°Relevant Software and Benchmarks

°A case study: Intel IXP network processors

What are Network Processors

°Any device that executes programs to handle packets in a data network

°Examples

• Processors on router line cards

• Processors in network access equipment

Why Network Processors

° Current Situation

• Data rates are increasing

• Protocols are becoming more dynamic and sophisticated

• Protocols are being introduced more rapidly

° Processing Elements

• GP (General-purpose Processor)

- Programmable, but not optimized for networking applications

• ASIC (Application Specific Integrated Circuit)

- High processing capacity, but long development time and little flexibility

• NP (Network Processor)

- Achieves high processing performance

- Programming flexibility

- Cheaper than GP


Organizing Processor Resources

°Design decisions:

• High-level organization

• ISA and micro architecture

• Memory and I/O integration

°Today’s commercial NPs:

• Chip multiprocessors

• Most are multithreaded

• Exploit little ILP (Cisco is the exception)

• No cache

• Micro-programmed

Architectural Comparisons

°High-level organizations

• Aggressive superscalar (SS)

• Fine-grained multithreaded (FGMT)

• Chip multiprocessor (CMP)

• Simultaneous multithreaded (SMT)

Ref: [NPRD]

Architectural Comparisons (cont.)

[Figure: issue-slot usage over time (processor cycles) for Superscalar, Fine-Grained, Coarse-Grained, Multiprocessing, and Simultaneous Multithreading. Legend: Thread 1 to Thread 5, idle slot]

Tasks and Services

Three Benchmarks used in the experiment

Ref: [NPRD]

Performance Evaluation

° Systems must support some form of concurrent packet-level parallelism

° SMT and CMP are nearly equivalent, with SMT always coming out ahead

• Workloads have little ILP

• Need to exploit packet-level parallelism

• CMP and SMT do just that

[Figure: performance of SS, FGMT, CMP, and SMT on the three benchmarks. Forwarding: IP Forward; Authentication: MD5; Encryption: 3DES]

Ref: [NPRD]
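Packet-level parallelism works because each packet can be processed independently of the others. The following is a minimal sketch of that idea, not the setup used in the experiment: `authenticate`, `process_serial`, and `process_parallel` are hypothetical names, a software thread pool stands in for hardware threads or cores, and MD5 is used only because it is the authentication benchmark named on the slide.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def authenticate(packet: bytes) -> str:
    """Per-packet work: an MD5 digest over the packet bytes (illustrative)."""
    return hashlib.md5(packet).hexdigest()


def process_serial(packets):
    """Handle one packet at a time, as a single-threaded engine would."""
    return [authenticate(p) for p in packets]


def process_parallel(packets, workers=4):
    """Hand independent packets to separate workers.

    The pool is a stand-in for hardware contexts or cores (an assumption
    of this sketch); map() returns results in input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(authenticate, packets))


# Eight hypothetical 64-byte packets with distinct contents.
packets = [bytes([i]) * 64 for i in range(8)]

# No packet depends on another, so both orderings give the same result.
assert process_serial(packets) == process_parallel(packets)
```

The sketch only shows that the work decomposes by packet; it says nothing about the relative speed of the four organizations compared above.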

Example Toaster System: Cisco 10000

° Almost all data plane operations execute on the programmable XMC

° Pipeline stages are assigned tasks – e.g. classification, routing, firewall, MPLS

• Classic SW load balancing problem

° External SDRAM shared by common pipe stages

Ref: [NPT]
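The load balancing problem arises because a pipeline forwards packets only as fast as its slowest stage. A small sketch of that relationship, with made-up per-stage cycle counts (the stage names come from the slide, the numbers and the function name `pipeline_stats` do not):

```python
def pipeline_stats(stage_cycles):
    """Return (steady-state cycles per packet, balance efficiency).

    In steady state a pipeline completes one packet every `bottleneck`
    cycles, where the bottleneck is the slowest stage. Efficiency is the
    fraction of stage-cycles spent on useful work.
    """
    bottleneck = max(stage_cycles.values())
    total = sum(stage_cycles.values())
    efficiency = total / (len(stage_cycles) * bottleneck)
    return bottleneck, efficiency


# Hypothetical per-packet costs in cycles for each stage.
stages = {"classification": 40, "routing": 80, "firewall": 20, "MPLS": 20}
bottleneck, efficiency = pipeline_stats(stages)
print(bottleneck, efficiency)  # 80 0.5
```

With these assumed costs the routing stage sets the packet rate and the other stages sit idle half the time on average, which is why the assignment of tasks to stages matters.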

IBM PowerNP

° 16 pico-processors and 1 PowerPC

° Each pico-processor

• Supports 2 hardware threads

• 3-stage pipeline: fetch/decode/execute

° Dyadic Processing Unit

• Two pico-processors

• 2KB Shared memory

• Tree search engine

° Focus is layers 2-4

° PowerPC 405 for control plane operations

• 16K I and D caches

° Target is OC-48

Ref: [NPT]
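A line-rate target such as OC-48 translates into a per-packet processing budget. The rough calculation below assumes the nominal SONET OC-48 line rate of 2488.32 Mbit/s, back-to-back 40-byte packets with framing overhead ignored, and a hypothetical 133 MHz clock; none of these figures appear on the slide, and the function names are illustrative.

```python
OC48_BITS_PER_SEC = 2_488_320_000  # nominal SONET OC-48 line rate


def packets_per_second(line_rate_bps, packet_bytes):
    """Worst-case packet arrival rate for back-to-back packets."""
    return line_rate_bps / (packet_bytes * 8)


def cycles_per_packet(line_rate_bps, packet_bytes, clock_hz, engines=1):
    """Processing budget per packet, summed over all engines.

    Assumes packets are spread evenly across `engines` identical
    processors running at `clock_hz` (an idealization).
    """
    return clock_hz * engines / packets_per_second(line_rate_bps, packet_bytes)


pps = packets_per_second(OC48_BITS_PER_SEC, 40)          # assumed 40-byte packets
budget = cycles_per_packet(OC48_BITS_PER_SEC, 40, 133e6, engines=16)
print(f"{pps:.0f} packets/s, about {budget:.0f} cycles per packet")
```

Under these assumptions a packet arrives roughly every 129 ns, so a single engine at the assumed clock would have under 20 cycles per packet; spreading packets over 16 engines raises the budget to a few hundred cycles.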

C-Port C-5 Chip Architecture

[Figure: C-5 block diagram. Processors CP-0 through CP-15, each with a PHY, grouped into clusters (CP-0 to CP-3 and CP-12 to CP-15 shown); Executive Processor, Fabric Processor, Queue Mngt Unit, Table Lookup Unit, and Buffer Mngt Unit; 60 Gbps busses; external SRAM, Switch Fabric, PROM, PCI, and CONTROL interfaces]

Ref: [NPT]

Some Challenges

° Intelligent Design

• Given a selection of programs and a target network link speed, find the ‘best’ design for the processor

- Least area

- Least power

- Most performance

° Write efficient multithreaded programs

• NPs have

- Heterogeneous computer resources

- Non-uniform memory

- Multiple interacting threads of execution

- Real-time constraints

• Make use of resources

- How to use special instructions and hardware assists

– Compilers

– Hand-coded

• Multithreaded programs

- Manage access to shared state

- Synchronization between threads

Ref: [NPRD]
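Managing access to shared state can be illustrated with a per-flow byte counter that several threads update. This is a generic sketch using a software lock, not how any particular NP does it (the slides note that real parts offer hardware assists such as atomic memory operations); `FlowCounters` and `worker` are hypothetical names.

```python
import threading


class FlowCounters:
    """Per-flow byte counts shared by several packet-processing threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def record(self, flow_id, nbytes):
        # The read-modify-write must be atomic, or concurrent updates
        # to the same flow can be lost.
        with self._lock:
            self._counts[flow_id] = self._counts.get(flow_id, 0) + nbytes

    def total(self, flow_id):
        with self._lock:
            return self._counts.get(flow_id, 0)


def worker(counters, flow_id, packets, nbytes):
    """Simulate one thread handling `packets` packets of `nbytes` each."""
    for _ in range(packets):
        counters.record(flow_id, nbytes)


counters = FlowCounters()
threads = [
    threading.Thread(target=worker, args=(counters, "flow-a", 1000, 64))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 4 threads x 1000 packets x 64 bytes, with no lost updates.
assert counters.total("flow-a") == 4 * 1000 * 64
```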


NP Software

° Teja

• NPU vendor-neutral software tools

• Key is a GUI-based state-machine tool

° CLICK router

• From MIT, supports a specialized development model

° Zebra

• Open source routing environment

• Supports most of the key IP routing protocols in SW

• IP Fusion Inc. is providing commercial support

° LVL7

• Closed source (i.e. traditional commercial), complete IP solutions

Ref: [NPT]

Benchmarks for Network Processors

• NetBench

- 10 applications

- http://cares.icsl.ucla.edu/NetBench

• CommBench

- 8 networking and communications applications

- http://ccrc.wustl.edu/~wolf/cb/

• EEMBC

- http://www.eembc.org/benchmark

• MediaBench

- Transcoders

- Some communications applications

Ref: [NPT]


IXP1200 Block Diagram

° StrongARM processing core

° Microengines introduce new ISA

° I/O

• PCI

• SDRAM

• SRAM

• IX: PCI-like packet bus

° On-chip FIFOs

• 16 entries, 64B each

Ref: [NPT]

IXP1200 Microengine

° 4 hardware contexts

• Single issue processor

• Explicit optional context switch on SRAM access

° Registers

• All are single ported

• Separate GPR

• 256*6 = 1536 registers total

° 32-bit ALU

• Can access GPR or XFER registers

° Shared hash unit

• 1/2/3 values – 48b/64b

• For IP routing hashing

° Standard 5-stage pipeline

° 4KB SRAM instruction store – not a cache!

° Barrel shifter

Ref: [NPT]

IXP 2400 Block Diagram

° XScale core replaces StrongARM

° Microengines

• Faster

• More: 2 clusters of 4 microengines each

° Local memory

° Next neighbor routes added between microengines

° Hardware to accelerate CRC operations and random number generation

° 16 entry CAM

[Figure: IXP2400 block diagram. Microengines ME0 to ME7, Scratch/Hash/CSR, MSF unit, DDR DRAM controller, QDR SRAM controller, XScale core, PCI]
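A content-addressable memory is looked up by value rather than by address: the lookup reports whether a tag is present and in which entry. The software model below is only a sketch of that behaviour for a 16-entry CAM. The class name `Cam`, its methods, and the choice to return the least recently used entry on a miss are assumptions of this example, not details given on the slide, and a hardware CAM compares all entries in parallel rather than scanning them.

```python
class Cam:
    """Software model of a small content-addressable memory (illustrative)."""

    def __init__(self, entries=16):
        self.tags = [None] * entries
        self.lru = list(range(entries))  # front of list = least recently used

    def lookup(self, tag):
        """Return (hit, index).

        On a hit, index is the matching entry. On a miss, index is the
        least recently used entry, i.e. a candidate for replacement.
        """
        if tag is not None and tag in self.tags:
            index = self.tags.index(tag)
            self._touch(index)
            return True, index
        return False, self.lru[0]

    def write(self, index, tag):
        """Store a tag in the given entry and mark it most recently used."""
        self.tags[index] = tag
        self._touch(index)

    def _touch(self, index):
        self.lru.remove(index)
        self.lru.append(index)


cam = Cam()
hit, index = cam.lookup(0xA)   # miss: entry 0 is offered for replacement
cam.write(index, 0xA)
assert cam.lookup(0xA) == (True, 0)
assert cam.lookup(0xB) == (False, 1)
```

A typical use of such a structure would be to let threads check quickly whether a piece of shared state is already held on chip, but that usage is an assumption here rather than something the slide states.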

Different Types of Memory

Type             Width (bytes)  Size (bytes)  Approx. unloaded latency (cycles)  Notes
Local            4              2560          1                                  Indexed addressing, post incr/decr
On-chip Scratch  4              16K           60                                 Atomic ops
SRAM             4              256M          150                                Atomic ops
DRAM             8              2G            300                                Direct path to/from MSF

Ref: [NPRD]
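These latencies are the reason microengines are multithreaded: while one context waits on memory, another can run. A simple model, assuming each thread computes for a fixed number of cycles and then issues one memory reference, context switches are free, and memory is unloaded, estimates how many contexts keep the engine busy. The latencies come from the table; the 50-cycle compute figure and the function names are hypothetical.

```python
LATENCY_CYCLES = {"Local": 1, "On-chip Scratch": 60, "SRAM": 150, "DRAM": 300}


def utilization(threads, compute_cycles, latency_cycles):
    """Fraction of cycles the engine does useful work.

    Each thread alternates `compute_cycles` of work with one memory
    reference of `latency_cycles`; other threads run during the wait.
    """
    return min(1.0, threads * compute_cycles / (compute_cycles + latency_cycles))


def threads_to_hide(compute_cycles, latency_cycles):
    """Smallest thread count that reaches full utilization in this model."""
    period = compute_cycles + latency_cycles
    return -(-period // compute_cycles)  # integer ceiling division


COMPUTE = 50  # hypothetical cycles of work between memory references
for name, latency in LATENCY_CYCLES.items():
    print(name, threads_to_hide(COMPUTE, latency))
```

With the assumed 50 cycles of work per reference, four contexts are enough to cover SRAM latency but not DRAM latency, which would need seven in this model.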

IXA Software Framework

[Figure: IXA software framework block diagram. Labels: Control Plane Protocol Stack, Control Plane PDK, Core Components, Core Component Library, Resource Manager Library, Microblocks, Microblock Library, Protocol Library, Hardware Abstraction Library, Utility Library; XScale Core, Microengine Pipeline, External Processors; Microengine C Language, C/C++ Language]

Ref: [NPRD]

References

° [NPT] W. H. Mangione-Smith, G. Memik, Network Processor Technologies

° [NPRD] Patrick Crowley, Raj Yavatkar, An Introduction to Network Processor Research & Design, HPCA-9 Tutorial