
© 2003 Intel Corporation

Software Development Environment for Reconfigurable Communications Architecture

Vladimir Ivanov
Radio Communications Lab/Corporate Technology Group

Contributor: Vicki Tsai
Radio Communications Lab/Corporate Technology Group

Reconfigurable Computing Tutorial
International Symposium on System-on-Chip Conference
Tampere, Finland

Intel Corporation

18 November 2003

Communications Technology Lab

Outline
• RCA Review
  - What are the specific architectural features which impact software development tools?
• Programming flow
  - How do specific architectural features impact software development process?
• Software Development Environment
  - Goals and Challenges
  - Process specifics
  - System-level issues
• Development Environment Concept
  - Algorithm and compiler point of view


RCA Review

What are the specific architectural features which impact software development tools?


RCA Review
• Scalable mesh interconnect of heterogeneous processing elements (PEs)
• Interconnect with Nearest Neighbour Mesh
• Clock frequency dependent on load and process


RCA Review
Ubiquitous wireless communication across multiple protocols

A scalable mesh interconnect of heterogeneous processing elements (PEs):
• Configurable basebands for multiple (concurrent) PHY/MAC operation
• Power and size conserving when compared to “multiple” dedicated cores or “traditional” SDR (S/W defined radio) approaches
• Tools for simple programming and portability to different arrays of elements

[Figure: RCA mesh of PEs (types A to E) with IO (EC) and IO (AFE 1 to 3) blocks, CMOS AFE 1 to 3 and UMAC 1 to 3, serving Ultra-wideband WPAN, 802.11a WLAN and WCDMA WWAN.]

Figure source: Intel research and development


Programming Flow

How do specific architectural features impact the software development process?


1. Divide the protocol into modes
• Preamble Detect
• Diversity Selection
• Steady-State Data

Each mode refers to a different, non-overlapping period in time.


2. Partitioning: specify functions for each mode

Preamble Detect:
  AFE1 (ant. 1), AGC, Dec. Filter, Preamble Det.

Diversity Selection:
  AFE1 (ant. 1), AGC, Dec. Filter, SNR Calc.
  AFE1 (ant. 2), AGC, Dec. Filter, SNR Calc.
  Diversity Sel.

Steady-State Data:
  AFE1 (ant. 1), Dec. Filter, AFC, Fixed IQ Imb. Corr., Guard Int. Removal,
  Deinter., Adapt. IQ Imb Corr, FEQ, 64-Pt FFT, QAM Demap, Viterbi, Descram., Host IO

Note: This description is function based and not hardware based.
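The partitioning can be written down as plain data before any hardware is considered. The following is a minimal sketch in C, assuming a hypothetical table layout (`rx_mode`, `mode_functions`, `MAX_FUNCS` and both helper functions are illustrative names, not part of the RCA tools); the function names are copied from the slide in the order they are listed there.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The three receive modes named on the slides. */
typedef enum {
    MODE_PREAMBLE_DETECT,
    MODE_DIVERSITY_SELECTION,
    MODE_STEADY_STATE_DATA,
    MODE_COUNT
} rx_mode;

#define MAX_FUNCS 16 /* hypothetical bound; leaves room for a NULL terminator */

/* Function instances per mode, in the order listed on the slide.
 * Unused slots are zero-initialized, so every row is NULL-terminated. */
static const char *const mode_functions[MODE_COUNT][MAX_FUNCS] = {
    [MODE_PREAMBLE_DETECT] = {
        "AFE1 (ant. 1)", "AGC", "Dec. Filter", "Preamble Det."
    },
    [MODE_DIVERSITY_SELECTION] = {
        "AFE1 (ant. 1)", "AGC", "Dec. Filter", "SNR Calc.",
        "AFE1 (ant. 2)", "AGC", "Dec. Filter", "SNR Calc.",
        "Diversity Sel."
    },
    [MODE_STEADY_STATE_DATA] = {
        "AFE1 (ant. 1)", "Dec. Filter", "AFC", "Fixed IQ Imb. Corr.",
        "Guard Int. Removal", "Deinter.", "Adapt. IQ Imb Corr", "FEQ",
        "64-Pt FFT", "QAM Demap", "Viterbi", "Descram.", "Host IO"
    }
};

/* Number of function instances specified for a mode. */
size_t count_functions(rx_mode mode)
{
    size_t n = 0;
    assert((int)mode >= 0 && (int)mode < MODE_COUNT);
    while (n < MAX_FUNCS && mode_functions[mode][n] != NULL)
        n++;
    return n;
}

/* Returns 1 if the mode uses a function with the given name, else 0. */
int mode_uses(rx_mode mode, const char *name)
{
    size_t i;
    assert((int)mode >= 0 && (int)mode < MODE_COUNT);
    for (i = 0; i < MAX_FUNCS && mode_functions[mode][i] != NULL; i++)
        if (strcmp(mode_functions[mode][i], name) == 0)
            return 1;
    return 0;
}
```

For example, `mode_uses(MODE_STEADY_STATE_DATA, "Viterbi")` is true while `mode_uses(MODE_PREAMBLE_DETECT, "Viterbi")` is not, which reflects that the table is function based and says nothing about PEs.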


3. Communication: establish communication structure among functions

[Figure: the Preamble Detect, Diversity Selection and Steady-State Data function blocks from step 2, with the communication structure among them.]


4. Aggregation: determine onto which PE types the functions could be mapped

[Figure: the function blocks of each mode, marked by candidate PE type. Legend: PE type A, PE type B, PE type C, PE type D, PE type E.]


5. Check if resources are available for the current hardware layout

[Figure: the function blocks of each mode, marked by PE type, next to a resource usage (%) chart per PE type (B, A, E, D, C) and the HW topology, which lists PEs of types B, A, C, D, C, C, E, D.]
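The resource check in step 5 can be sketched as a comparison between the PEs each type needs and the PEs the topology provides. This is a minimal sketch, assuming that demand is expressed as a count of PEs per type; the names `count_pes` and `resources_available` and the demand arrays are hypothetical, and only the list of PE types (B, A, C, D, C, C, E, D) comes from the HW topology figure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { PE_A, PE_B, PE_C, PE_D, PE_E, PE_TYPE_COUNT };

/* PE types as listed in the HW topology figure. */
static const int hw_topology[] = {
    PE_B, PE_A, PE_C, PE_D, PE_C, PE_C, PE_E, PE_D
};
#define HW_TOPOLOGY_LEN (sizeof hw_topology / sizeof hw_topology[0])

/* Count how many PEs of each type the topology contains. */
void count_pes(const int *topology, size_t n, int counts[PE_TYPE_COUNT])
{
    int t;
    size_t i;
    for (t = 0; t < PE_TYPE_COUNT; t++)
        counts[t] = 0;
    for (i = 0; i < n; i++) {
        assert(topology[i] >= 0 && topology[i] < PE_TYPE_COUNT);
        counts[topology[i]]++;
    }
}

/* True if, for every PE type, the demanded number of PEs (an assumed,
 * illustrative measure of demand) does not exceed what the topology has. */
bool resources_available(const int demand[PE_TYPE_COUNT],
                         const int *topology, size_t n)
{
    int have[PE_TYPE_COUNT];
    int t;
    count_pes(topology, n, have);
    for (t = 0; t < PE_TYPE_COUNT; t++)
        if (demand[t] > have[t])
            return false;
    return true;
}
```

A demand of two type A PEs fails against this topology, which has only one; in the flow above that would send the programmer back to the aggregation step.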


6. Mapping: place functions onto specific PEs

[Figure: the function blocks of each mode placed onto specific PEs, PE1 to PE6.]
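One simple way to picture step 6 is a greedy first-fit placement. The slides do not say which placement algorithm the tools use, so the following is only an illustrative sketch: the `pe_t` and `fn_t` structures, the notion of a per-PE resource budget and the function `map_first_fit` are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int type;      /* PE type, e.g. 0 for type A */
    int capacity;  /* hypothetical resource budget of this PE */
    int used;      /* budget already consumed by placed functions */
} pe_t;

typedef struct {
    int type;  /* PE type chosen for this function during aggregation */
    int cost;  /* hypothetical resource cost of the function */
} fn_t;

/* Greedy first-fit: place each function on the first PE of the right type
 * that still has enough free budget. placement[i] receives the PE index,
 * or -1 if function i could not be placed.
 * Returns 0 if every function was placed, -1 otherwise. */
int map_first_fit(const fn_t *fns, size_t nf, pe_t *pes, size_t np,
                  int *placement)
{
    int status = 0;
    size_t i, p;
    assert(fns != NULL && pes != NULL && placement != NULL);
    for (i = 0; i < nf; i++) {
        placement[i] = -1;
        for (p = 0; p < np; p++) {
            if (pes[p].type == fns[i].type &&
                pes[p].capacity - pes[p].used >= fns[i].cost) {
                pes[p].used += fns[i].cost;
                placement[i] = (int)p;
                break;
            }
        }
        if (placement[i] < 0)
            status = -1;  /* no PE of this type has room left */
    }
    return status;
}
```

A return value of -1 corresponds to the case in the flow where the current mapping does not fit and a different one has to be tried.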


7. Generate “code” for this mapping

[Figure: the mapped function blocks are turned into one binary image per PE, PE1 to PE6.]


8. Check if desired performance has been reached

[Figure: the binary images, stimulus data and HW topology feed the System Profiler, which produces performance results.]

If desired performance has been met, output the binary images. Otherwise, use the results to adjust the mapping and go to Step 2, 4 or 6.


Programming Flow Summary
1. Divide the protocol into modes
2. Specify functions for each mode
3. Establish communication structure among functions
4. Determine onto what PE types the functions could be mapped
5. Check if we have the resources in the hardware
6. Place functions onto specific PEs
7. Generate “code” for this mapping
   - If code cannot be generated because the PE cannot fit the assigned functions, try a different mapping
8. Check if desired performance has been reached
   - If not, try a different mapping
   - Otherwise, output the generated code from Step 7

[Slide annotation: the steps are attributed to either the Programmer or the Tools.]
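The retry logic of steps 6 to 8 amounts to a search over candidate mappings. The sketch below assumes candidates are numbered and tried in order, and that code generation and profiling are available as callbacks; `find_mapping`, the callback types and the two `demo_` stubs are hypothetical stand-ins, not interfaces of the RCA tools.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callbacks standing in for the tools of steps 7 and 8. */
typedef bool (*generate_fn)(int mapping);     /* step 7: can code be generated? */
typedef bool (*performance_fn)(int mapping);  /* step 8: is performance reached? */

/* Try candidate mappings 0..n_candidates-1 in order and return the first
 * one for which code generation succeeds and performance is reached.
 * Returns -1 if no candidate passes both checks. */
int find_mapping(int n_candidates, generate_fn generate,
                 performance_fn meets_target)
{
    int m;
    assert(generate != NULL && meets_target != NULL);
    for (m = 0; m < n_candidates; m++) {
        if (!generate(m))
            continue;  /* a PE cannot fit its functions: try another mapping */
        if (meets_target(m))
            return m;  /* keep the code generated for this mapping */
    }
    return -1;
}

/* Illustrative stubs: mapping 0 does not fit, mapping 1 is too slow. */
bool demo_generate(int mapping)     { return mapping != 0; }
bool demo_meets_target(int mapping) { return mapping >= 2; }
```

With the stubs above, `find_mapping(4, demo_generate, demo_meets_target)` settles on mapping 2, and a search limited to two candidates reports failure with -1.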


Software Development Environment
• Goals and Challenges
• Process specifics
• System-level issues


Tools Goals
• Primary goal is to assure development of effective code for RCA
  - Developed code should effectively use all RCA capabilities
  - Implemented protocols should meet users' requirements
• Abstract code development from hardware
  - If the total number of PEs changes, or the number of PEs of a certain type changes, the algorithm does not need to be altered
• Give a reasonable programming abstraction level for the programmer
• Provide an effective environment for development, debugging and testing of software


Tools Challenges
• Reasonable balance for abstracting software development from hardware
• Classical challenges for parallel architecture
  - Decomposition of program into parallel processes
  - Effective mapping of processes to PEs
  - Effective communication among processes
  - Synchronization among processes
• Protocol concurrency implies dynamic RCA resource distribution among protocols
• Heterogeneity of the PE mesh
  - Variety of Processing Elements (PEs)
  - PEs may not be processor-based
  - Methods to program PEs differ greatly
• Guaranteed protocol performance
• Effective data visualization from multiple PEs
• High performance simulation of RCA


Software Development Process

[Flowchart: from START, the stages are Algorithm development, Source code development, Program code translation, Debugging, Testing and Performance measurement. The decision “Meets user's reqs?” leads to AWARD on Yes and to Redevelopment or Algorithm redevelopment on No.]

Tools shown alongside the stages:
• Tools for algorithm development
• Tools for source code development
• Translation tools, Linkage tools
• Debugger, Simulator
• Profiler


Software Development Environment

[Figure: tool-flow diagram in three groups.
• Algorithm and source code development tools: IDE with Source Code Editor, Process diagram editor, Descriptions editor, etc.; Hardware description, Software description, Source code of processes
• Translation and linkage tools: Translation tool, Linkage tool, Librarian, Library, Relocatable images of processes
• Specific tools: Descriptions Translator (Parsed Descriptions, XML), Mapper (Processes Layout, XML; User Constraints; Execution statistics; Map Directives), Packager (Makefile, Relocatable Image, Loadable Image), Loader, Simulation/Execution]

The Packager creates a Makefile in accordance with the layout scheme and runs make to build the loadable image.


Input Example

[Figure: process diagram between “in data” and “out data” with the blocks myFn (Auto), realFIR, myFn2 (PE type A) and myFn3 (Auto), plus a label L (Auto). Blocks have ports In0 and Out0; one block also has In1.]

myFn2.ccs:

    _myFn2:
    ...
    X0:X7=*P0++8 || Y0:Y7=*P1++8 ||
    M0=X0*Y0 || M1=X1*Y1 || M2=X2*Y2 || M3=X3*Y3 ||
    M4=X4*Y4 || M5=X5*Y5 || M6=X6*Y6 || M7=X7*Y7 ||
    A00=M0+M1 || A20=M2+M3 || A4=M4+M5 || A6=M6+M7;
    ....
    DONE

myFn.c:

    /* Squares each input sample and sends the result to output port 0. */
    void myFn(int16 in0[], int16 out0[])
    {
        int16 i, x;
        for (i = 0; i < IN1LEN; i++) {
            x = in0[i] * in0[i];
            send_output(0, x);
        }
    }


System Simulator
• Cycle accurate simulation
• High performance
• Allows evaluation of latency and computational overhead
• Two instances of the System Simulator can be connected to each other
• Provides debugging facilities


System Simulator
• SysSim contains a Simulator Core (SC) and Individual Simulators (IS)
• Two abstraction layers for IS representation
  - High level object
  - Scheduled Object
• Object design principle: if the object is in state S1 and receives an input signal In, then after delay D it changes its state to S2 and produces an output signal Out

[Figure: the RCA System Simulator contains the Simulator Core, the individual PE simulators and a Scheduled Objects Layer (efficient cycle-accurate scheduling). Its interfaces are a Hardware Configuration File, AFE Data (to data files or to another instance of the Simulator), Host Data and JTAG Control. A Debugger exchanges debug queries and debug events/responses with it, and a User Application connects through the RCA Device Driver (Host Data, Driver control).]
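The object design principle (in state S1, on input In, after delay D, move to state S2 and produce Out) can be sketched as a small scheduled object. This is not the SysSim implementation; `rule_t`, `sched_obj`, `obj_input` and `obj_tick` are hypothetical names, and the single pending transition per object is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>

/* One transition rule: in state s1, on input in, after delay cycles,
 * move to state s2 and emit out. */
typedef struct {
    int s1;
    int in;
    unsigned long delay;
    int s2;
    int out;
} rule_t;

typedef struct {
    int state;          /* current state */
    int pending;        /* nonzero while a transition is scheduled */
    unsigned long due;  /* cycle at which the scheduled transition fires */
    int next_state;     /* state to enter when the transition fires */
    int out;            /* output signal to emit when it fires */
} sched_obj;

/* Present an input at cycle now. Schedules the transition and returns 1 if
 * the rule matches and nothing is already pending; otherwise returns 0. */
int obj_input(sched_obj *o, const rule_t *r, int in, unsigned long now)
{
    assert(o != NULL && r != NULL);
    if (o->pending || o->state != r->s1 || in != r->in)
        return 0;
    o->pending = 1;
    o->due = now + r->delay;
    o->next_state = r->s2;
    o->out = r->out;
    return 1;
}

/* Advance to cycle now. If the scheduled transition is due, apply it,
 * store the output signal in *out and return 1; otherwise return 0. */
int obj_tick(sched_obj *o, unsigned long now, int *out)
{
    assert(o != NULL && out != NULL);
    if (!o->pending || now < o->due)
        return 0;
    o->state = o->next_state;
    o->pending = 0;
    *out = o->out;
    return 1;
}
```

With a rule of delay 3 and an input accepted at cycle 10, `obj_tick` does nothing at cycle 12 and performs the state change at cycle 13. A scheduler only has to visit the object at its `due` cycle, which is one plausible reading of why a scheduled-objects layer helps cycle-accurate simulation run efficiently.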


Simulation Performance
• Comparing SystemC core and SysSim core
• SC_METHOD process was used for SystemC
• Simulated object is N instances of D flip-flop objects
• Simulation on Intel 2.4 GHz Pentium 4
• 4*4 Mesh (~1000 objects), 400 MHz
• 1 sec simulation takes ~100 hours for SystemC Core and ~13 hours for SysSim Core

[Figure: test setup of a Source, a chain of N D flip-flop instances and a Destination. Plot: simulation time (sec, 0 to 900) versus N of scheduled objects (0 to 12000), with curves for the CTL core and the SystemC core.]
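The quoted figures can be turned into rough simulation rates. The arithmetic below is only an estimate from the approximate numbers on the slide, assuming one simulated second at the 400 MHz clock corresponds to 4 x 10^8 cycles.

```latex
\begin{align*}
\text{simulated cycles} &= 400\,\text{MHz} \times 1\,\text{s} = 4 \times 10^{8} \\
\text{SysSim core rate} &\approx \frac{4 \times 10^{8}\ \text{cycles}}{13 \times 3600\,\text{s}} \approx 8.5 \times 10^{3}\ \text{cycles/s} \\
\text{SystemC core rate} &\approx \frac{4 \times 10^{8}\ \text{cycles}}{100 \times 3600\,\text{s}} \approx 1.1 \times 10^{3}\ \text{cycles/s} \\
\text{speedup} &\approx \frac{100\,\text{h}}{13\,\text{h}} \approx 7.7
\end{align*}
```

For the ~1000-object mesh, both cores therefore run several tens of thousands of times slower than real time, with the SysSim core roughly 7.7 times faster than the SystemC core.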


Development Environment Concept

Algorithm and Compiler point of view


Tools Development Concepts
• Naive Phase:
  - Manual program partitioning
  - Manual code optimization
  - Independent compiler tools
  - Static hardware and software
• Mature Phase:
  - Automatic program partitioning
  - Automatic code optimization
  - Common compiler tools
  - Static hardware and software
• Advanced Phase:
  - Macro architecture description tools
  - Automatic generation of micro architecture description
  - Automatic software tools generation
  - Protocol partitioning for joint hardware-software optimization


Tools Development Naive Phase
• Enhanced Traditional Model
  - Networking (communication architecture)
  - Mapping (distributable compilation)
• Traditional tool-suite for RCA
  - Complete development tool-suite
  - Integration of tools for sequential programming
• Solution constraints
  - Aided mapping (user-defined mapping of process to PE)


Enhanced Traditional Model

[Figure: tool-flow diagram.
• Inputs: C source code for PE i, C source code for PEC, Assembly code for PE j, Assembly code for FMCA, Specialized code for VMCA
• Tools: C Compiler, Assembler, Configurator, Linker, RCA Linker, Description Translator, TinyMapper, RCA Simulator, Debugger
• Intermediate and final outputs: Object modules, Executable modules, Reconfiguration vector, Make directives, Link directives, Loadable image]


Tools Development Mature Phase
• True distributable compilation
  - Automated mapping
• Global optimization
  - Intermodule optimization
  - Optimization on heterogeneous environment
• Enhanced development tools
  - C Compiler with high-level IR generation
  - High-level IR Linker
  - Retargetable Code Generator


Distributable Compilation Architecture

[Figure: compilation flow.
• C source code -> C Front-End -> IR 1
• Assembly code -> Assembler -> IR 2
• Specialized code -> Configurator -> IR 3
• IR 1, IR 2, IR 3 and IR Libs -> IR Linker -> General IR
• General IR -> Mapper -> IR for PE1, IR for PE2, IR for PE3
• IR for PE1/PE2/PE3 -> CG for PE1/PE2/PE3 -> Object module 1/2/3, with Obj Libs]


Tools Development Advanced Phase
• Distributable compilation
• Retargetable development tools
  - Retargetable C Compiler (tunable CG and optimization)
  - Retargetable Assembler (target architecture templates)
  - Retargetable Simulator (for RCA configurations)
• Comprehensive Target Descriptive Language
• Target Tools Generator
• HDL code generation
• Joint hardware and software optimization


Co-Design Architecture

[Figure: co-design flow built around a Comprehensive Target Description, which feeds the Tools Generator and the HDL Output.
• RCA Hardware Design: HDL (VHDL)
• Software Design: Source Code for RCA, High Level IR, Target Representation
• Target Tools: C Compiler (with CGi), Assemblers, RCA Simulator, Debugger]


Summary
• RCA programming process characteristics
  - Parallel running processes with message exchange
  - Procedure level parallelism
  - “Partitioning-communication-aggregation-mapping” based optimization cycle
• RCA software development environment contains a standard set of tools for
  - Algorithm and source code development
  - Source code translation and linking
• RCA software development environment contains a specific set of tools for the optimization cycle
• 3 phases of software tools development
  - Main goal of the naive and mature phases is to assure (manually or automatically) program code effectiveness
  - Main goal of the advanced phase is to assure joint hardware-software effectiveness of PHY/MAC algorithm implementation


Acknowledgements
Ernest Tsui, Vladimir Pudovkin, Vladimir Pavlov, Sergey Mironov, Veronica Mikheeva, Tony Chun, Michael K. Chen