ASH: A Substrate for Scalable Architectures Mihai Budiu Seth Copen Goldstein http://www.cs.cmu.edu/~phoenix CALCM Seminar, March 19, 2002

ASH: A Substrate for Scalable Architectures

Page 1: ASH:  A Substrate for Scalable Architectures

ASH: A Substrate for Scalable Architectures

Mihai Budiu

Seth Copen Goldstein

http://www.cs.cmu.edu/~phoenix

CALCM Seminar, March 19, 2002

Page 2: ASH:  A Substrate for Scalable Architectures

Resources

Page 3: ASH:  A Substrate for Scalable Architectures

CPU Problems

• Complexity

• Power

• Global Signals

• Limited issue window => limited ILP

We propose an architecture with none of these limits

Page 4: ASH:  A Substrate for Scalable Architectures

Outline

• Scalability

• Reconfigurable hardware advantages

• A hybrid RH + CPU architecture

• CPU and RH as peers

• Application Specific Hardware

Page 5: ASH:  A Substrate for Scalable Architectures

Computational Bandwidth

FU * clock freq

CPU

RH: Unbounded

[Figure: functional units (*, +, /) computing a=a+b; b=b+c]

Page 6: ASH:  A Substrate for Scalable Architectures

Registers

CPU: Fixed

RH: Unbounded

[Figure: registers eax, ebx, ecx, edx; variables i, j, k, l, m; spill to sp[0]]

Page 7: ASH:  A Substrate for Scalable Architectures

Register Bandwidth

CPU: Fixed

RH: Unbounded

[Figure: register file with R1, R2, R3, W1, W2]

Page 8: ASH:  A Substrate for Scalable Architectures

Out-of-Order Execution

CPU

Fetch, Decode, Dispatch, Execute, Commit

In-order

Limited by window

RH

Compiler’s window is unbounded

Page 9: ASH:  A Substrate for Scalable Architectures

Outline

• Scalability

• Reconfigurable hardware advantages

• A hybrid RH + CPU architecture

• CPU and RH as peers

• Application Specific Hardware

Page 10: ASH:  A Substrate for Scalable Architectures

Hybrid system: CPU+RH

CPU: generic; Low ILP + OS + VM

RH: application-specific; High ILP

Memory

Tight coupling

Page 11: ASH:  A Substrate for Scalable Architectures

Problem

HLL Program

CPU RH

Memory

Compiler

Page 12: ASH:  A Substrate for Scalable Architectures

Our Solution

General: applicable to today’s software

Automatic: compiler-driven [RISC approach]

Scalable: with clock, hardware and program size

Parallelism: exploit application parallelism
  • bit-level
  • ILP
  • pipeline
  • loop-level

Page 13: ASH:  A Substrate for Scalable Architectures

Outline

• Scalability

• Reconfigurable hardware advantages

• A hybrid RH + CPU architecture

• CPU and RH as peers

• Application Specific Hardware

Page 14: ASH:  A Substrate for Scalable Architectures

Peering

Program

a( ) { b( ); }
b( ) { c( ); }
c( ) { d( ); }
d( ) { }

CPU RH

a b c d

Page 15: ASH:  A Substrate for Scalable Architectures

“RPC”

software procedure call

marshalling, control transfer

hardware dependent

CPU RH

a b c d

b’ c’ d’

Stubs built automatically.

Page 16: ASH:  A Substrate for Scalable Architectures

Stub Synthesis

Procedures for RH

RH Compiler

Procedures for CPU

Program

Partitioning

Stubs

Configuration

Linker

Executable

Page 17: ASH:  A Substrate for Scalable Architectures

Outline

• Scalability

• Reconfigurable hardware advantages

• A hybrid RH + CPU architecture

• CPU and RH as peers

• Application Specific Hardware

Page 18: ASH:  A Substrate for Scalable Architectures

Application-Specific Hardware

Reconfigurable hardware

HLL program

Compiler

Circuit

HLL Program

CPU RH

Memory

Compiler

Page 19: ASH:  A Substrate for Scalable Architectures

CASH: Compiling for ASH

Memory partitioning

Interconnection net

Circuits

C Program

RH

Page 20: ASH:  A Substrate for Scalable Architectures

Asynchronous Computation

+

data

data ready

ack

Can extend to locally synchronous, globally asynchronous
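The data / data ready / ack signalling named on this slide can be illustrated with a small software model. This is only a sketch under stated assumptions: `channel_t`, `channel_put`, `channel_release`, and `adder_fire` are hypothetical names, and the firing rule (the adder fires only when both inputs are ready and its output channel is free) is an illustration, not the actual ASH circuit.

```c
#include <assert.h>
#include <stdbool.h>

/* One point-to-point link: a data word plus two handshake wires. */
typedef struct {
    int  data;
    bool ready;  /* raised by the producer when data is valid */
    bool ack;    /* raised by the consumer once data has been used */
} channel_t;

/* Producer side: offer a value unless the previous one is still pending. */
bool channel_put(channel_t *ch, int value) {
    if (ch->ready)
        return false;
    ch->data = value;
    ch->ready = true;
    return true;
}

/* Producer side: after seeing ack, withdraw ready so the link can be reused. */
bool channel_release(channel_t *ch) {
    if (!ch->ack)
        return false;
    ch->ready = false;
    ch->ack = false;
    return true;
}

/* "+" node: fires with no global clock, as soon as both operands are ready
   and have not been consumed yet, and the output link is free. */
bool adder_fire(channel_t *lhs, channel_t *rhs, channel_t *out) {
    if (!lhs->ready || !rhs->ready || lhs->ack || rhs->ack)
        return false;
    if (!channel_put(out, lhs->data + rhs->data))
        return false;
    lhs->ack = true;
    rhs->ack = true;
    return true;
}
```

In this model the adder makes no progress until both `channel_put` calls have happened, and a producer cannot overwrite an operand until it has observed `ack` and called `channel_release`.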

Page 21: ASH:  A Substrate for Scalable Architectures

Dataflow Graphs

int plus(int x, int y)
{
    return x + y;
}

Page 22: ASH:  A Substrate for Scalable Architectures

From Control Flow to Data Flow

Page 25: ASH:  A Substrate for Scalable Architectures

Conditionals = Speculation

int cond(int p, int x, int y)
{
    int z;
    if (p)
        z = x;
    else
        z = y;
    return z;
}
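One way to read "Conditionals = Speculation" is that both arms of the `if` are evaluated unconditionally and a multiplexor picks the result once the predicate is known. The sketch below expresses that reading in plain C. `mux2` and `cond_dataflow` are hypothetical names, and the complementary-predicate encoding is an assumption, not something the slide specifies.

```c
#include <assert.h>

/* Hypothetical multiplexor node: each data input is paired with a 0/1
   predicate, and exactly one predicate is expected to be true. */
int mux2(int pred_true, int val_true, int pred_false, int val_false) {
    assert(pred_true != pred_false);
    return pred_true ? val_true : val_false;
}

/* Same result as cond(p, x, y), but with no control flow: both arms are
   computed speculatively, then one is selected. */
int cond_dataflow(int p, int x, int y) {
    int z_then  = x;          /* "then" arm, computed regardless of p */
    int z_else  = y;          /* "else" arm, computed regardless of p */
    int p_true  = (p != 0);
    int p_false = !p_true;
    return mux2(p_true, z_then, p_false, z_else);
}
```

Speculating like this is only safe when the arms have no side effects; stores and other effectful operations would need to be guarded by the predicate instead.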

Page 26: ASH:  A Substrate for Scalable Architectures

Critical Paths

if (x > 0)
    y = -x;
else
    y = b*x;

[Figure: dataflow graph with inputs x, b, 0; operators *, -, >, !; output y]

Page 27: ASH:  A Substrate for Scalable Architectures

Executing Lenient Operators

if (x > 0)
    y = -x;
else
    y = b*x;

[Figure: dataflow graph with inputs x, b, 0; operators *, -, >, !; output y]

Up to 40% performance improvement.
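The effect of a lenient multiplexor on the critical path of this example can be sketched with a simple timing model. Everything numeric here is an assumption: the latencies `LAT_CMP`, `LAT_NEG`, and `LAT_MUL` are invented for illustration, and `token_t`, `mux_strict`, `mux_lenient`, and `select_y` are hypothetical names. A strict mux waits for all of its inputs, while a lenient one produces its output as soon as the predicate and the selected input have arrived.

```c
#include <assert.h>

/* Latencies in arbitrary time units; all three values are assumptions. */
enum { LAT_CMP = 1, LAT_NEG = 1, LAT_MUL = 5 };

/* A value together with the time at which it becomes available. */
typedef struct {
    int value;
    int ready_at;
} token_t;

int max2(int a, int b) { return a > b ? a : b; }

/* Strict mux: output is ready only after every input has arrived. */
token_t mux_strict(token_t pred, token_t if_true, token_t if_false) {
    token_t out;
    out.value = pred.value ? if_true.value : if_false.value;
    out.ready_at = max2(pred.ready_at,
                        max2(if_true.ready_at, if_false.ready_at));
    return out;
}

/* Lenient mux: output is ready once the predicate and the selected input
   have arrived; the unselected input is not waited for. */
token_t mux_lenient(token_t pred, token_t if_true, token_t if_false) {
    token_t selected = pred.value ? if_true : if_false;
    token_t out;
    out.value = selected.value;
    out.ready_at = max2(pred.ready_at, selected.ready_at);
    return out;
}

/* The slide's example: if (x > 0) y = -x; else y = b*x; */
token_t select_y(int x, int b, int lenient) {
    token_t pred = { x > 0, LAT_CMP };
    token_t neg  = { -x,    LAT_NEG };
    token_t mul  = { b * x, LAT_MUL };
    return lenient ? mux_lenient(pred, neg, mul)
                   : mux_strict(pred, neg, mul);
}
```

With these assumed latencies, `select_y(3, 2, 0)` is ready at time 5 because the strict mux waits for the multiply, while `select_y(3, 2, 1)` is ready at time 1. The model is not meant to reproduce the 40% figure quoted on the slide.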

Page 28: ASH:  A Substrate for Scalable Architectures

Pipelining

Pipelined  Cycles
N          903
Y          653

Page 29: ASH:  A Substrate for Scalable Architectures

Loop Pipelining

Pipe  FIFO  Cycles
N     0     903
N     1     903
Y     0     653
Y     1     474
Y     2     408
Y     3     408
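The cycle counts in the table translate into speedups over the unpipelined baseline, as the short calculation below shows. The cycle counts are the slide's; the struct layout, the names `run_t`, `RUNS`, `speedup`, and `best_cycles`, and the reading of the FIFO column as a buffer depth are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int pipelined;  /* 1 for "Y", 0 for "N" */
    int fifo;       /* "FIFO" column; assumed to be a buffer depth */
    int cycles;     /* measured cycles, copied from the slide */
} run_t;

const run_t RUNS[] = {
    {0, 0, 903}, {0, 1, 903},
    {1, 0, 653}, {1, 1, 474}, {1, 2, 408}, {1, 3, 408},
};
const size_t N_RUNS = sizeof RUNS / sizeof RUNS[0];

/* Speedup of a run relative to a baseline cycle count. */
double speedup(int baseline_cycles, int cycles) {
    assert(cycles > 0);
    return (double)baseline_cycles / (double)cycles;
}

/* Smallest cycle count among the given runs. */
int best_cycles(const run_t *runs, size_t n) {
    assert(n > 0);
    int best = runs[0].cycles;
    for (size_t i = 1; i < n; i++)
        if (runs[i].cycles < best)
            best = runs[i].cycles;
    return best;
}
```

Against the 903-cycle baseline, pipelining alone (653 cycles) gives a speedup of about 1.38, and the best configuration (408 cycles) gives about 2.21. Going from 2 to 3 in the FIFO column brings no further gain.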

Page 31: ASH:  A Substrate for Scalable Architectures

ASH Features

• What you code is what you get
  – no hidden control logic
  – really lean hardware (no CAM, decoders, multiported files, etc.)

• Compiler has complete control

• Dynamic scheduling => latency tolerant

• Naturally exploits ILP, even across loop iterations

Page 32: ASH:  A Substrate for Scalable Architectures

Conclusions

• ASH = Compiler-synthesized hardware

• ASH matches program parallelism

• Dynamically scheduled RH

• ASH scales with
  – clock frequency
  – transistors
  – program size

Page 33: ASH:  A Substrate for Scalable Architectures

Backup Slides

Page 34: ASH:  A Substrate for Scalable Architectures

Reconfigurable Hardware

Universal gates and/or storage elements

Interconnection network

Programmable switches

Page 35: ASH:  A Substrate for Scalable Architectures

Main RH Ingredient: RAM Cell

Universal gate = RAM

[Figure: RAM holding 0001, addressed by a0 and a1; data output labeled a1 & a2]

Switch controlled by a 1-bit RAM cell

[Figure: 1-bit cell holding 0; data in; control]
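The idea that a small RAM acts as a universal gate can be shown with a two-input lookup table: the gate inputs form the address, and the stored bits are the truth table. The sketch below is a software illustration only. `lut2_t`, `lut2_from_rows`, and `lut2_eval` are hypothetical names, and reading the slide's `0001` as the outputs for addresses 0 through 3, in that order, is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* A 2-input lookup table: bit i of `table` is the output for address i. */
typedef struct {
    uint8_t table;
} lut2_t;

/* Build a LUT from a 4-character truth table such as "0001", where
   rows[i] is the output for address i = (a1 << 1) | a0. */
lut2_t lut2_from_rows(const char *rows) {
    lut2_t lut = { 0 };
    for (int i = 0; i < 4; i++) {
        assert(rows[i] == '0' || rows[i] == '1');
        if (rows[i] == '1')
            lut.table |= (uint8_t)(1u << i);
    }
    return lut;
}

/* Evaluate the gate: the inputs select which stored bit is read out. */
int lut2_eval(lut2_t lut, int a0, int a1) {
    assert((a0 == 0 || a0 == 1) && (a1 == 0 || a1 == 1));
    int addr = (a1 << 1) | a0;
    return (lut.table >> addr) & 1;
}
```

Under this encoding `lut2_from_rows("0001")` behaves as an AND gate and `lut2_from_rows("0110")` as an XOR gate, so changing the RAM contents changes the logic function without changing the hardware.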

Page 36: ASH:  A Substrate for Scalable Architectures

Stubs

a( ) { r = b(b_args); }
b(b_args) { }

a( ) { r = b’(b_args); }
b’(b_args) {
    send_rh(b_args);
    invoke_rh(b);
    r = receive_rh( );
    return r;
}

RH

Program
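The stub pattern above can be exercised in plain C by modelling the RH as an ordinary function behind a one-word mailbox. This is a sketch under assumptions: the signatures given here to `send_rh`, `invoke_rh`, and `receive_rh` are invented (the slide names the calls but not their types), the body of `b` is a placeholder because the slide leaves it empty, and `b_stub` stands in for b’.

```c
#include <assert.h>
#include <stdbool.h>

typedef int (*rh_proc_t)(int);

/* Hypothetical one-word mailbox standing in for the CPU<->RH channel. */
static int  mailbox_value;
static bool mailbox_full = false;

void send_rh(int value) {
    assert(!mailbox_full);
    mailbox_value = value;
    mailbox_full = true;
}

int receive_rh(void) {
    assert(mailbox_full);
    mailbox_full = false;
    return mailbox_value;
}

/* Model of the RH running a procedure: take the marshalled argument,
   compute, and post the result back. */
void invoke_rh(rh_proc_t proc) {
    int args = receive_rh();
    send_rh(proc(args));
}

/* Procedure placed on the RH; the body is a placeholder. */
int b(int b_args) { return 2 * b_args; }

/* b': the CPU-side stub that replaces direct calls to b. */
int b_stub(int b_args) {
    send_rh(b_args);        /* marshal the arguments */
    invoke_rh(b);           /* transfer control to the RH */
    int r = receive_rh();   /* unmarshal the result */
    return r;
}

/* Caller on the CPU: unchanged except that it calls the stub. */
int a(int b_args) { return b_stub(b_args); }
```

From the caller's point of view `a(21)` behaves exactly as if `b` had been called directly, which is what lets the stubs be generated automatically.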

Page 37: ASH:  A Substrate for Scalable Architectures

Dispatcher Stubs

Program

a( ) { r = b(b_args); }
b(b_args) { if (x) c( ); return r; }
c( ) { }

b’(b_args) {
    send_rh(b_args);
    invoke_rh(b);
    while (1) {
        com = get_rh_command( );
        if (! com) break;
        (*com)( );
    }
    r = receive_rh( );
    return r;
}

[Figure labels: "Independent of b", "c’s stub"]

Page 38: ASH:  A Substrate for Scalable Architectures

C’s Stub

Program

a( ) { r = b(b_args); }
b(b_args) { if (x) c( ); return r; }
c( ) { }

c’( ) {
    receive_rh(c_args);
    r = c(c_args);
    send_rh(r);
    invoke_rh(return_to_rh);
}
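The dispatcher loop and c’s stub can be combined into one small software model in which the RH-resident `b` calls back into the CPU-resident `c`. This is a sketch under several assumptions: the RH is simulated by the function `invoke_rh` with two hypothetical entry points (`RH_B` and `RH_RETURN_TO_RH`), the channel is a single shared word, `receive_rh` returns its value instead of taking an argument as on the slide, and the bodies chosen for `b` and `c` are placeholders.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*command_t)(void);
typedef enum { RH_B, RH_RETURN_TO_RH } rh_entry_t;

static int       mailbox;          /* one shared word between CPU and RH */
static command_t pending = NULL;   /* command the RH has posted, if any */

void send_rh(int value) { mailbox = value; }
int  receive_rh(void)   { return mailbox; }

/* Fetch and clear the next command from the RH; NULL means b is done. */
command_t get_rh_command(void) {
    command_t com = pending;
    pending = NULL;
    return com;
}

/* c runs on the CPU; the body is a placeholder. */
int c(int x) { return x + 1; }

void c_stub(void);

/* Simulated RH running b(x) { r = 0; if (x) r = c(x); return r; }. */
void invoke_rh(rh_entry_t entry) {
    assert(entry == RH_B || entry == RH_RETURN_TO_RH);
    if (entry == RH_B) {
        int x = mailbox;
        if (x)
            pending = c_stub;  /* ask the CPU to run c; mailbox holds its argument */
        else
            mailbox = 0;       /* b returns 0 without calling c */
    }
    /* RH_RETURN_TO_RH: c's result is already in the mailbox; b returns it. */
}

/* c': unmarshal, call c on the CPU, send the result back, resume the RH. */
void c_stub(void) {
    int c_args = receive_rh();
    int r = c(c_args);
    send_rh(r);
    invoke_rh(RH_RETURN_TO_RH);
}

/* b': start b on the RH, then serve its callbacks until none remain. */
int b_stub(int b_args) {
    send_rh(b_args);
    invoke_rh(RH_B);
    for (;;) {
        command_t com = get_rh_command();
        if (!com)
            break;
        com();
    }
    return receive_rh();
}
```

The loop in `b_stub` never mentions `c`: it only runs whatever command the RH posts, which matches the slide's "Independent of b" label on the dispatcher.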

back

Page 39: ASH:  A Substrate for Scalable Architectures

Input to Output

int io(int x)
{
    return x;
}

Page 40: ASH:  A Substrate for Scalable Architectures

Loops

int loop()
{
    int w = 10;
    while (w > 0)
        w--;
    return w;
}

Page 41: ASH:  A Substrate for Scalable Architectures

Pointers and Arrays

int a[10];

void pointer(int *p)
{
    a[2] += a[4] + *p;
}

Page 42: ASH:  A Substrate for Scalable Architectures

Pointers and Loops

int sum()
{
    int s = 0;
    int i;

    for (i = 0; i < 10; i++)
        s += a[i];

    return s;
}