ClearSpeed’s CS301: The World’s First Commercially-Available Stream Processor

ClearSpeed’s CS301:The World’s First Commercially-Available Stream Processor

Simon McIntosh-Smith simon@clearspeed.comDairsie Latimer dairsie@clearspeed.comRon Bell ron.bell@awe.co.ukStephen Hudson stephen.hudson@awe.co.uk

Architecture, Algorithms and Benchmark Results

• Multi-threaded Array Processing– Programmed in high-level languages– Hardware multi-threading

• Enables simultaneous data streaming and computation for latency tolerance

– Run-time extensible instruction set

• Array of Processors Elements– PEs are VLIW cores

– Flexible data parallel processing

– Built-in PE fault tolerance, resiliency

• High performance, low power– 10 GFLOPS/Watt

• Multiple high bandwidth I/O channels

CS301 Processor

3CS301 Processing Elements

Each PE is a VLIW processor:

• Multiple execution units• Floating point adder

• Floating point multiplier

• Divide/square root unit

• Fixed point MAC 8x8->16+48

• Integer ALU with shifter

• Load/store

• High-bandwidth, 5-port register file (3r, 2w)

• Closely coupled 4KB SRAM for data

• High bandwidth per PE load/store (PIO)

• Per PE address generator

• Complete pointer model, including parallel pointer chasing and vectors of addresses

} 32-bit IEEE 754

4CS301-based development board

• 2 chip board – 50 GFLOPS peak @ 10W total• 200K FFTs/s (1K complex single precision IEEE754)• Up to 1GB DRAM for local processing• Shipping since 1Q04• Single slot width full-size PCI card

5What Applications Can Be Accelerated?

Any applications with significant data parallelism:• Fine-grained – vector operations• Medium-grained – unrolled independent loops • Coarse-grained – multiple simultaneous data channels/sets

Example applications and libraries include:• Math libraries – BLAS, LAPACK (→ Matlab, Maple, …)• Chemistry – GROMACS, CHARMM, BLAST, DLPOLY, …• Computational finance – Monte Carlo, genetic algorithms• Intelligent systems – artificial neural networks• Signal processing – FFT (1D, 2D, 3D), FIR• Simulation – CFD, N-body, Finite Element• Image processing – filtering, image recognition, DCTs

6Software Development Environment

Software Development Kit (SDK)• C compiler, assembler, libraries, visual debugger, etc.• CS301-based development boards• Available for Linux and Windows

Applications and libraries under development• Math – L3 BLAS, LAPACK• DSP – FFTs (1D, 2D, 3D)• Bio/Chemistry – GROMACS, DLPOLY, DockIt• Financial – random number generation, Monte Carlo

7Porting Code

void daxpy(double *c, double *a, double alpha, uint N) {

uint i;

for (i=0; i<N; i++)

c[i] = c[i] + a[i]*alpha;

void daxpy(double *c, double *a, double alpha, uint N) {

uint i;

poly double cp, ap;

for (i=0; i<N; i+=num_pes) {

memcpym2p(&cp, &c[i+pe_num], sizeof(double));

memcpym2p(&ap, &a[i+pe_num], sizeof(double));

cp = cp + ap*alpha;

memcpyp2m(&c[i+pe_num], &cp, sizeof(double))

8ClearSpeed-AWE Joint Investigation

• Chemistry codes: DLPOLY (Molecular Dynamics)– Owned by UK Daresbury Lab, heavily used at AWE– Widely used in academia and industry

– 91% of CPU in 5 relatively small routines– One of these (forces) calls the other 4 to compute

forces on all atoms– “forces” called once per time step– Data needing to be returned by “forces” from CS to

host relatively small – Calculation for each atom is independent

• Matrix Multiply Benchmark (SGEMM)– CS301 single precision code started at ~20% efficiency – AWE helped CS restructure code to give 12 GFLOPS – 47%– Performance verified by AWE on CS301 hardware– Next-generation processor from ClearSpeed significantly

increases this performance – “Avebury”

ClearSpeed’s CS301: The World’s First Commercially-Available Stream Processor

Documents

Progr amming Languages: W elcome to CS301 the ultimate ...gexia/cs301/week1.pdfCS301 Session 2 1 Ov erview 2 About oper ational semantics Oper ational semantics of Impcore top lev

CS300 and CS301 - Campbell Sci · The CS301 replaced the CS300 in August 2018. The CS301 has a stainless steel connector, a removable cable, different wire colours, and a serial number

Programación Commercially Speaking

Commercially Effective MCM

Silvertown Tunnel Commercially Sensitive Information · EVERY JOURNEY MATTERS Part 2 Part 2 — Commercially Sensitive Material SCHEDULE 24 COMMERCIALLY SENSITIVE INFORMATION - Commercially

KONTEK 2 Marinarkæologisk undersøgelse af fundområde CS301 · positionerne CS301, CS303 og CS306. I sugehullerne fandtes 69 stk. forarbejdet flint, heriblandt enkelte flækker,

CS301 - Algorithms

The theme is inspired by James Bond and the atmosphere is that …kemper/cs301/HallOfFame/CS301... · 2015. 1. 29. · comes with backgrounds that gives it some glamour and the classic

Data Structure CS301 Lecture No. 01 Introduction to Data Structuresapi.ning.com/files/VQvbBACfG9pOKUzKZC6B3JFS4VeQeAJY3... · 2016-11-03 · Data Structure CS301 Lecture No. 01 Introduction

CS301- Data Structures Feb 06,2012 LATEST SOLVED ...api.ning.com/files/W-cvbOicaVURPviHq70nJfTEsw2YtVqW1EvBWFFNP8Xvt1... · CS301- Data Structures LATEST SOLVED SUBJECTIVES FROM

CS301- Data Structures JAN 30,2011api.ning.com/files/...*u9UtXNQSUaKtMB/cs301MCQSBYMOAZ.pdf · CS301- Data Structures ... A solution is said to be efficient if it solves the problem

ClearSpeed’s CS301: The World’s First Commercially-Available Stream Processor

2016 ANNUAL REPORT - Freshwater Fish · CORPORATE PROFILE Freshwater is a self-sustaining federal Crown corporation – the buyer, processor and marketer of all commercially-caught

(2)Commercially Available Biosensors_Matthew

CS301- Data Structures Feb 06,2012 LATEST SOLVED … · 2019. 6. 16. · CS301- Data Structures LATEST SOLVED SUBJECTIVES FROM FINALTERM PAPERS Feb 06,2012 Cs301 - subjective Mc100401285

Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the

Commercially Important Cra

CS301 Notes

FPGA Based Crypto Processor for Elliptic Curve …...Elliptic Curve Cryptography (ECC) is proposed by Neal Koblitz [1] and Victor Miller [2] in 1985. ECC has been commercially accepted

CS301 Tools for Working with Data · Data Analytics CS301 Tools for Working with Data Week 1: 3rd Sept Fall 2020 Oliver BONHAM-CARTER