
CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Page 1: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

1 - CPRE 583 (Reconfigurable Computing): Reconfigurable Computing Architectures Iowa State University (Ames)

CPRE 583 Reconfigurable Computing

Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Instructor: Dr. Phillip Jones ([email protected])

Reconfigurable Computing Laboratory, Iowa State University

Ames, Iowa, USA

http://class.ece.iastate.edu/cpre583/

Page 2: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Overview

• Chapter 2 (Reconfigurable Architectures)

Page 3: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Common Questions

Page 6: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

What you should learn

• Basic trade-offs associated with different aspects of a reconfigurable architecture (Chapter 2)

Page 7: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Reconfigurable Architectures

• Main idea the author of Chapter 2 wants to convey
  – Applications often have one or more small, computationally intense regions of code (kernels)
  – Can these kernels be sped up using dedicated hardware?
  – Different kernels have different needs. How do a kernel’s requirements guide design decisions when implementing a reconfigurable architecture?

Page 8: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Reconfigurable Architectures

• Forces that drive a reconfigurable architecture
  – Price
    • Mass production: 100K to millions
    • Experimental: 1 to 10’s
  – Granularity of reconfiguration
    • Fine grain
    • Coarse grain
  – Degree of system integration/coupling
    • Tightly coupled
    • Loosely coupled

All are a function of the application that will run on the architecture.

Page 9: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Example Points in (Price, Granularity, Coupling) Space

[Figure: example designs plotted in a space with three axes: Price ($100’s to $1M’s), Granularity (coarse to fine), and Coupling (loose to tight). Labels in the figure: Intel/AMD, Int, float, RFU, Processor, PC, ML507, Ethernet, and the pipeline stages Decode, Exec, Store.]

Page 10: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

What’s the point of a Reconfigurable Architecture?

• Performance metrics
  – Computational
    • Throughput
    • Latency
  – Power
    • Total power dissipation
    • Thermal
  – Reliability
    • Recovery from faults

Increase application performance!

Page 11: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Typical Approach for Increasing Performance

• Application/algorithm implemented in software
  – Often easier to write an application in software
• Profile application (e.g. gprof)
  – Determine where the application is spending its time
• Identify kernels of interest
  – e.g. application spends 90% of its time in function matrix_multiply()
• Design custom hardware/instruction to accelerate kernel(s)
  – Analyze the kernel to determine how to extract fine/coarse grain parallelism (does any parallelism even exist?)

Amdahl’s Law!
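
The profile-then-identify steps above can be sketched in Python. This is only an illustration, not part of the lecture: it uses the standard-library cProfile module in place of gprof, and my_app, load_input, and the pure-Python matrix_multiply are hypothetical stand-ins for a real application.

```python
import cProfile
import pstats


def load_input(n):
    """Hypothetical cheap setup step: build an n x n matrix of small integers."""
    return [[(i + j) % 7 for j in range(n)] for i in range(n)]


def matrix_multiply(a, b):
    """Hypothetical kernel: naive O(n^3) matrix multiply."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def my_app(n=40):
    """Hypothetical application: a little setup, then one heavy kernel."""
    a = load_input(n)
    b = load_input(n)
    return matrix_multiply(a, b)


def hottest_function(func, candidates):
    """Profile func() and return the candidate with the largest cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    totals = {}
    # stats.stats maps (file, line, name) -> (calls, prim. calls, tottime, cumtime, callers)
    for (_file, _line, name), (_cc, _nc, _tt, cumtime, _callers) in stats.stats.items():
        if name in candidates:
            totals[name] = totals.get(name, 0.0) + cumtime
    return max(totals, key=totals.get), totals


if __name__ == "__main__":
    name, totals = hottest_function(my_app, {"matrix_multiply", "load_input"})
    print("kernel of interest:", name, totals)
```

The function that dominates the cumulative time is the candidate kernel; whether it is worth moving to hardware still depends on how much parallelism it exposes.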

Page 12: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Amdahl’s Law: Example

• Application My_app
  – Running time: 100 seconds
  – Spends 90 seconds in matrix_mul()
• What is the maximum possible speedup of My_app if I place matrix_mul() in hardware?
  – At best matrix_mul() takes no time, leaving 10 seconds: 100/10 = 10x faster
• What if the original My_app spends 99 seconds in matrix_mul()?
  – At best 1 second remains: 100/1 = 100x faster

Good recent FPGA paper that illustrates increasing an algorithm’s performance with hardware:

“NOVEL FPGA BASED HAAR CLASSIFIER FACE DETECTION ALGORITHM ACCELERATION”, FPL 2008

http://class.ece.iastate.edu/cpre583/papers/Shih-Lien_Lu_FPL2008.pdf
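
The arithmetic in the Amdahl’s Law example can be checked with a few lines of Python. This is a minimal sketch: the function name amdahl_speedup and its parameters are chosen here for illustration, and an infinitely fast kernel stands in for the "maximum possible" case.

```python
def amdahl_speedup(total_s, kernel_s, kernel_speedup=float("inf")):
    """Overall speedup when only the kernel portion of the run time is accelerated.

    total_s        -- original run time of the whole application (seconds)
    kernel_s       -- original time spent in the kernel (seconds)
    kernel_speedup -- how much faster the kernel becomes (inf = takes no time)
    """
    untouched_s = total_s - kernel_s               # time the hardware cannot help
    new_total_s = untouched_s + kernel_s / kernel_speedup
    return total_s / new_total_s


if __name__ == "__main__":
    print(amdahl_speedup(100, 90))      # upper bound with 90 s in the kernel
    print(amdahl_speedup(100, 99))      # upper bound with 99 s in the kernel
    print(amdahl_speedup(100, 90, 10))  # a finite 10x kernel speedup, for comparison
```

With a finite kernel speedup the overall gain falls well short of the bound, which is why the fraction of time spent outside the kernel matters so much.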

Page 13: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity

Page 14: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: Coarse Grain

• rDPA: reconfigurable Data Path Array
• Function units with programmable interconnects

Example:

ALU ALU ALU
ALU ALU ALU
ALU ALU ALU

Page 17: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: Fine Grain

• FPGA: Field Programmable Gate Array
• Sea of general purpose logic gates

CLB CLB CLB CLB
CLB CLB CLB CLB
CLB CLB CLB CLB
CLB CLB CLB CLB

CLB = Configurable Logic Block

Page 24: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: Trade-offs

Trade-offs associated with LUT size

Example: 2-LUT (2^2 = 4 bits, drawn as a 2x2 grid) vs. 10-LUT (2^10 = 1024 bits, drawn as a 32x32 grid)

[Figure: the same 1024 bits drawn twice, once as an array of 2-LUTs and once as a single 10-LUT. Under the label "Microprocessor", a block with inputs A, B, and op is repeated three times; the bus widths marked on each copy are 4, 3, 3, and 3.]
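
The LUT sizes quoted above follow from a k-input LUT storing one output bit per input combination. The sketch below is an illustration only (the helper names lut_config_bits, build_lut, and eval_lut are made up here): it models a LUT as a plain truth table and shows how many 2-LUTs fit in the storage of one 10-LUT.

```python
from itertools import product


def lut_config_bits(k):
    """A k-input LUT needs one configuration bit per input combination: 2**k."""
    return 2 ** k


def build_lut(k, func):
    """Fill the truth table of a k-LUT from a Python function of k single bits."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=k)]


def eval_lut(table, *bits):
    """Evaluate the LUT: the input bits form the table index (first bit is the MSB)."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return table[index]


if __name__ == "__main__":
    and_lut = build_lut(2, lambda a, b: a & b)
    print(and_lut, eval_lut(and_lut, 1, 1))
    print(lut_config_bits(2), lut_config_bits(10))
    print(lut_config_bits(10) // lut_config_bits(2), "2-LUTs per 10-LUT of storage")
```

The same 1024 bits can therefore hold one function of ten inputs or many independent functions of two inputs; which is the better use depends on the logic being mapped.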

Page 28: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: Trade-offs

Trade-offs associated with LUT size

Example: 2-LUT (2^2 = 4 bits, drawn as a 2x2 grid) vs. 10-LUT (2^10 = 1024 bits, drawn as a 32x32 grid)

Bit logic and constants

(A and “1100”) or (B or “1000”)

[Figure: the expression built from 2-LUTs (AND and OR gates on the 4-bit inputs A and B, with constant inputs 1 and 0), and the same expression mapped onto 10-LUTs, where a small marked region is labelled "Area that was required using 2-LUTs".]

It’s much worse: each 10-LUT has only one output.
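
One way to see why bit logic with constants suits small LUTs is to fold the constants bit by bit. The sketch below is my own illustration, not the slide's figure: it treats A and B as 4-bit values, folds the two constants by hand, and checks the folded form against the original expression for every input. The cost comparison at the end assumes one LUT per output bit that still depends on two inputs.

```python
def expr(a, b):
    """The slide's expression on 4-bit values: (A and "1100") or (B or "1000")."""
    return (a & 0b1100) | (b | 0b1000)


def folded(a, b):
    """The same function after folding the constants, one output bit at a time."""
    bit3 = 1                              # (A3 and 1) or (B3 or 1) is always 1
    bit2 = ((a >> 2) | (b >> 2)) & 1      # (A2 and 1) or (B2 or 0) = A2 or B2
    bit1 = (b >> 1) & 1                   # (A1 and 0) or (B1 or 0) = B1
    bit0 = b & 1                          # (A0 and 0) or (B0 or 0) = B0
    return (bit3 << 3) | (bit2 << 2) | (bit1 << 1) | bit0


def all_inputs_match():
    """Exhaustively compare both forms over all 4-bit values of A and B."""
    return all(expr(a, b) == folded(a, b) for a in range(16) for b in range(16))


TWO_LUT_BITS = 2 ** 2     # storage of one 2-LUT
TEN_LUT_BITS = 2 ** 10    # storage of one 10-LUT

if __name__ == "__main__":
    print("forms agree:", all_inputs_match())
    # Only bit2 needs real logic, and it is a function of just two inputs.
    print("one 2-LUT:", TWO_LUT_BITS, "bits; one 10-LUT:", TEN_LUT_BITS, "bits")
```

Under these assumptions a single 2-input function carries all the logic, so spending a whole single-output 10-LUT on it wastes most of the 1024 bits.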

Page 29: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: Example Architectures

• Fine grain: GARP

• Coarse grain: PipeRench

Page 32: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: GARP

[Figure: block diagram of the Garp chip: CPU and RFU, I-cache, D-cache, and a configuration cache, connected to memory. The RFU detail is labelled: RFU control (1), Execution (16, 2-bit), N, and PE (Processing Element).]

Example computations in one cycle:
• A<<10 | (b&c)
• (A-2*b+c)

Page 33: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: GARP

[Figure: block diagram of the Garp chip: CPU and RFU, I-cache, D-cache, and a configuration cache, connected to memory.]

Impact of configuration size
• 1 GHz bus frequency
• 128-bit memory bus
• 512 Kbits of configuration

On an RFU context switch, how long does it take to load a new full configuration?

• 512 Kbits / 128 bits per transfer = 4K transfers; at 1 GHz that is about 4 microseconds

An estimate of the time for the CPU to perform a context switch is ~5 microseconds.

~2x increase in context switch latency!!
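
The configuration-load estimate can be reproduced numerically. The sketch below makes two assumptions that the slide does not state: "512 Kbits" is taken as 512 x 1024 bits, and the bus delivers one full 128-bit transfer every cycle. The function name config_load_time_s is chosen here for illustration.

```python
def config_load_time_s(config_bits, bus_width_bits, bus_hz):
    """Seconds to stream a full configuration over the memory bus.

    Assumes one bus-width transfer per cycle and no other traffic.
    """
    transfers = -(-config_bits // bus_width_bits)   # ceiling division
    return transfers / bus_hz


CONFIG_BITS = 512 * 1024        # assumption: 512 Kbits means 512 * 1024 bits
BUS_WIDTH_BITS = 128
BUS_HZ = 1e9                    # 1 GHz
CPU_CONTEXT_SWITCH_S = 5e-6     # the slide's ~5 microsecond estimate

if __name__ == "__main__":
    load_s = config_load_time_s(CONFIG_BITS, BUS_WIDTH_BITS, BUS_HZ)
    total_s = CPU_CONTEXT_SWITCH_S + load_s
    print(f"configuration load: {load_s * 1e6:.3f} microseconds")
    print(f"context switch grows by a factor of {total_s / CPU_CONTEXT_SWITCH_S:.2f}")
```

Adding roughly 4 microseconds of configuration traffic to a 5 microsecond switch is what the slide rounds to a ~2x increase.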

Page 34: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: GARP

[Figure: block diagram of the Garp chip (CPU, RFU, I-cache, D-cache, configuration cache, memory) with the RFU detail (RFU control, execution PEs).]

Reference: “The Garp Architecture and C Compiler”, http://www.cs.cmu.edu/~tcal/IEEE-Computer-Garp.pdf

Page 35: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: PipeRench

• Coarse granularity
• Higher (higher) level programming
• Reference papers
  – PipeRench: A Coprocessor for Streaming Multimedia Acceleration (ISCA 1999): http://www.cs.cmu.edu/~mihaib/research/isca99.pdf
  – PipeRench Implementation of the Instruction Path Coprocessor (Micro 2000): http://class.ee.iastate.edu/cpre583/papers/piperench_Micro_2000.pdf

Page 36: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: PipeRench

[Figure: PipeRench fabric. Rows of PEs, each PE containing an 8-bit ALU and a register file, are separated by interconnect layers, with a global bus running alongside the rows.]

Page 50: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

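Granularity: PipeRench

[Figure: three rows of PEs, each row acting as one pipeline stage, with two charts of pipeline stage against cycle (cycles 1 to 6).]

First chart, pipeline stages 0 to 4: stages shown in each cycle

Cycle 1: 0
Cycle 2: 0 1
Cycle 3: 0 1 2
Cycle 4: 1 2 3
Cycle 5: 2 3 4
Cycle 6: 0 3 4

Second chart, pipeline stages 0 to 2: stage number shown in each of the three rows per cycle

Cycle 1: 0 - -
Cycle 2: 0 1 -
Cycle 3: 0 1 2
Cycle 4: 3 1 2
Cycle 5: 3 4 2
Cycle 6: 3 4 0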
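
The mapping of a five-stage pipeline onto three rows can be imitated with a short round-robin simulation. This is a sketch under my own assumptions, not PipeRench's actual scheduler: exactly one row is reconfigured per cycle, rows are reused in order, and the function name schedule is made up for illustration.

```python
def schedule(virtual_stages, physical_rows, cycles):
    """Round-robin placement of virtual pipeline stages onto physical rows.

    Each cycle, the next virtual stage is loaded into the next physical row,
    overwriting whatever that row held. Returns one snapshot of the rows per
    cycle; None marks a row that has not been configured yet.
    """
    rows = [None] * physical_rows
    timeline = []
    for cycle in range(cycles):
        rows[cycle % physical_rows] = cycle % virtual_stages
        timeline.append(list(rows))
    return timeline


if __name__ == "__main__":
    for number, rows in enumerate(schedule(5, 3, 6), start=1):
        cells = " ".join("-" if stage is None else str(stage) for stage in rows)
        print(f"Cycle {number}: {cells}")
```

Each stage stays resident for as many cycles as there are rows before it is overwritten, so the fabric behaves like a longer pipeline than it physically contains, at the cost of reconfiguring a row every cycle.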

Page 51: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Degree of Integration/Coupling

• Independent Reconfigurable Coprocessor
  – Reconfigurable fabric does not have direct communication with the CPU
• Processor + Reconfigurable Processing Fabric
  – Loosely coupled on the same chip
  – Tightly coupled on the same chip

Page 52: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Degree of Integration/Coupling

[Figure: baseline system block diagram. CPU pipeline (Fetch, Decode, Execute, Memory, Write Back) with ALU and FPU, L1 and L2 caches, memory controller and main memory, DMA controller, and an I/O controller serving USB, PCI, PCI-Express, and SATA, with a hard drive and NIC attached.]

Page 53: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Degree of Integration/Coupling

[Figure: system block diagram (CPU pipeline with ALU and FPU, L1 and L2 caches, memory controller, main memory, DMA controller, I/O controller with USB, PCI, PCI-Express, SATA, hard drive, NIC) with an RPF block added.]

Page 54: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Degree of Integration/Coupling

[Figure: system block diagram (CPU pipeline with ALU and FPU, L1 and L2 caches, memory controller, main memory, DMA controller, I/O controller with USB, PCI, PCI-Express, SATA, hard drive, NIC) with an RPF block drawn beside the FPU.]

Page 55: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Degree of Integration/Coupling

[Figure: system block diagram (CPU pipeline with ALU and FPU, L1 and L2 caches, memory controller, main memory, DMA controller, I/O controller with USB, PCI, PCI-Express, SATA, hard drive, NIC) with an RPF block and its Config I/F added.]

Page 57: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Degree of Integration/Coupling

[Figure: system block diagram (CPU pipeline with ALU and FPU, L1 and L2 caches, memory controller, main memory, DMA controller, I/O controller with USB, PCI, PCI-Express, SATA, hard drive, NIC) with an RPF block that has its own I/O and a Config I/F.]

Page 58: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Degree of Integration/Coupling

[Figure: system block diagram (CPU pipeline with ALU and FPU, L1 and L2 caches, memory controller, main memory, DMA controller, I/O controller with USB, PCI, PCI-Express, SATA, hard drive, NIC) with an RFU block drawn beside the ALU and FPU.]

Page 59: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Friday: State machines

Page 60: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Questions/Comments/Concerns

Page 61: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Lecture 3 notes / slides in progress

Page 62: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: PipeRench

• Scheduling virtual stages onto physical stages
• Partial/dynamic reconfiguration (each cycle)

Page 63: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Granularity: GARP

• Impact of configuration size on performance
  • Context switching
• Garp features
  • Dynamically reconfigurable
  • Stores multiple configurations in an on-chip cache (4)
  • One configuration at a time
• Example app mapping to GARP (loop)
  • Amdahl’s Law

The Garp Architecture and C Compiler
• http://www.cs.cmu.edu/~tcal/IEEE-Computer-Garp.pdf

Page 64: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Overview

• Dimensions
  – Price
  – Granularity
  – Coupling
  – To optimize app performance (compute (throughput, latency), power, reliability)
• RPF to efficiently implement VICs
  – Main picture the author wants to convey
• What’s the point of having a reconfigurable architecture?
  – Example (increase app performance)
    • App -> SW/CPU
    • Profile
    • ID kernels of intense compute
    • Design custom hardware/instruction (Amdahl’s Law)
  – Intel FPL paper, great example for reading by Friday

Page 65: CPRE 583 Reconfigurable Computing Lecture 5: Wed 9/8/2010 (Reconfigurable Computing Architectures)

Reconfigurable Architectures

• RPF -> VIC (short slide)