Hardware-Software Codesign Elvira Kitsis Hermawan Ho Alex Papadimoulis


Hardware-Software Codesign

Elvira Kitsis
Hermawan Ho
Alex Papadimoulis

HW/SW Codesign Introduction

Unified design of hardware and software systems
  All design based on a logical model
  No HW/SW partition
  Maintained throughout the design process
Concurrent design
  HW/SW optimized for peak performance

HW/SW Codesign Origins

Field of embedded systems
  Demand for consumer information appliances (cell phone, PDA)
  Specialized industrial products
Designers developed new tools and techniques to satisfy demand
These became HW/SW codesign

Traditional Systems Design

Early, key decision: HW/SW partition
  Must be kept; changes require extensive redesign of both HW and SW
Lacks a well-defined HW/SW interface data flow
Leads to suboptimal designs and longer design-to-market time

HW/SW Codesign - A Solution

HW/SW Codesign alleviates traditional design issues

Maps system specification to a mixed HW/SW implementation
  Conventional SW on a RISC processor
  ASICs (Application-Specific Integrated Circuits)

Practical Implementation of Hardware Software Codesign

Elvira Kitsis

Practical implementation of hardware/software co-design

The purpose of hardware/software co-design
Four common approaches to the task of hardware/software co-design:
  unbiased
  hardware-biased
  software-biased
  hardware acceleration

Co-design development routine

Objectives of development routine
The first stage is to determine the performance-critical section of a C program using a profiler tool and routine system, as described in Fig. 1.
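As a rough illustration of this first stage, the sketch below accumulates call counts and elapsed time per named code region and reports the region with the largest total. It is not the profiler tool used in the study; `RegionStats`, `Profile`, `ScopedTimer` and `hottest` are hypothetical names introduced only for this example.

```cpp
#include <chrono>
#include <map>
#include <string>

// Accumulated statistics for one named code region (hypothetical structure).
struct RegionStats {
    long calls = 0;
    std::chrono::nanoseconds total{0};
};

using Profile = std::map<std::string, RegionStats>;

// RAII timer: adds the lifetime of the object to the named region.
class ScopedTimer {
public:
    ScopedTimer(Profile& profile, const std::string& name)
        : stats_(profile[name]), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        stats_.total += std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - start_);
        ++stats_.calls;
    }
private:
    RegionStats& stats_;
    std::chrono::steady_clock::time_point start_;
};

// Returns the name of the region with the largest accumulated time,
// or an empty string if nothing was recorded.
std::string hottest(const Profile& profile) {
    std::string best;
    std::chrono::nanoseconds bestTime{-1};
    for (const auto& entry : profile) {
        if (entry.second.total > bestTime) {
            bestTime = entry.second.total;
            best = entry.first;
        }
    }
    return best;
}
```

Wrapping each candidate function body in a `ScopedTimer` and calling `hottest` after a representative run gives a first guess at which section to move into hardware.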

Next step

The next step in the development routine is to implement a critical section in hardware as shown in Fig.2.

Limitations on the type of C code:

All C types must be mapped to 16/32-bit signed integers in HardwareC
Type qualifiers, enumerated types, unions and structures are NOT permitted
Global variables are NOT allowed
Parameters for functions may consist of simple types, pointer types and data arrays only
No support for "gotos"
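To make these restrictions concrete, here is a small kernel written so that it stays inside the listed subset: plain `int` only, no type qualifiers, no structures, no globals, no `goto`, and only simple and pointer parameters. The function `sum_abs_diff` is a made-up example, not code from the study, and how HardwareC would actually map it is not shown here.

```cpp
// Candidate hardware kernel in the restricted C subset (hypothetical example):
// plain int only, no const/volatile, no struct/union/enum, no globals, no goto.
int sum_abs_diff(int *a, int *b, int n) {
    int i;
    int diff;
    int sum;

    sum = 0;
    for (i = 0; i < n; i = i + 1) {
        diff = a[i] - b[i];
        if (diff < 0) {
            diff = -diff;   /* absolute value without a library call */
        }
        sum = sum + diff;
    }
    return sum;
}
```

Note that even `const` on the pointer parameters is left out, since type qualifiers are among the excluded features.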

Test Results

Execution time

             Software-only   Software-hardware
Example 1    80 ms           47 ms
Example 2    114 ms          80 ms
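The speedup implied by these execution times can be worked out directly. The symbols $S$, $T_{\text{SW}}$ and $T_{\text{HW/SW}}$ are introduced here only for this calculation.

```latex
\[
S = \frac{T_{\text{SW}}}{T_{\text{HW/SW}}}, \qquad
S_1 = \frac{80\ \text{ms}}{47\ \text{ms}} \approx 1.70, \qquad
S_2 = \frac{114\ \text{ms}}{80\ \text{ms}} = 1.425
\]
\[
1 - \frac{47}{80} \approx 41\%, \qquad
1 - \frac{80}{114} \approx 30\%
\]
```

So moving the critical section into hardware cut execution time by roughly 41% in Example 1 and roughly 30% in Example 2.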

Hardware or software?

Performance
Cost
Form factor
Flexibility
Safety
Architectural cleanness and simplicity

System Level Memory Optimization for Hardware-Software Co-Design

Hermawan Ho

Intro

In multimedia applications, a considerable amount of memory is required.

Goal: to reduce this dominant cost.

Example: a quad-tree based image coding application.

Design Model

If we do not need the flexibility, one or more dedicated hardware processor(s) can be designed to perform the functions which are in the cycle.

When the flexibility is needed, we can use data level parallelism. The advantage of this approach is that it is simple to program but the memory overhead is high.

Design Model

Alternatively, we can use task level parallelism.

The advantage is that the code size per processor is relatively low.

The disadvantage is that the design time will be much higher due to the complex processor partitioning and memory management.

System Level Memory Optimization

All functions are taken together in one big function.

We have an algorithm that operates block per block. All computations are done on the first block.

Buffer memory for only one block will be required between the sub modules.
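The effect on buffer size can be sketched with two placeholder sub-modules. In the module-by-module version the intermediate buffer holds a whole frame; in the block-by-block version it holds one block. `stageA`, `stageB`, `Result`, `runPerModule` and `runPerBlock` are hypothetical names, and the two stages are stand-ins rather than the actual coding functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Two placeholder sub-modules (stand-ins for real coding stages).
inline int stageA(int pixel) { return pixel - 128; }
inline int stageB(int value) { return value < 0 ? -value : value; }

struct Result {
    std::vector<int> output;   // final output, identical in both variants
    std::size_t bufferWords;   // size of the intermediate buffer that was needed
};

// Module by module: stage A runs over the whole frame before stage B starts,
// so the intermediate buffer must hold a full frame.
Result runPerModule(const std::vector<int>& frame) {
    std::vector<int> intermediate(frame.size());
    for (std::size_t i = 0; i < frame.size(); ++i) intermediate[i] = stageA(frame[i]);
    Result r{std::vector<int>(frame.size()), intermediate.size()};
    for (std::size_t i = 0; i < frame.size(); ++i) r.output[i] = stageB(intermediate[i]);
    return r;
}

// Block by block: both stages finish one block before the next block starts,
// so the intermediate buffer only holds one block.
Result runPerBlock(const std::vector<int>& frame, std::size_t blockSize) {
    assert(blockSize > 0);
    std::vector<int> block(blockSize);
    Result r{std::vector<int>(frame.size()), block.size()};
    for (std::size_t start = 0; start < frame.size(); start += blockSize) {
        std::size_t len = std::min(blockSize, frame.size() - start);
        for (std::size_t i = 0; i < len; ++i) block[i] = stageA(frame[start + i]);
        for (std::size_t i = 0; i < len; ++i) r.output[start + i] = stageB(block[i]);
    }
    return r;
}
```

Both functions produce the same output; only `bufferWords` differs, which is the saving the slide describes.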

QSDPCM

QSDPCM (Quadtree Structured Difference Pulse Code Modulation) is a compression technique for video.

The algorithm jointly optimizes the displacement vector and the quadtree mean decomposition.

The displacement that requires the minimum number of bits for the quadtree decomposition is selected.
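The selection rule can be sketched as a search over candidate displacements, scoring each by the bits its difference block would need. The cost model below is an assumption made for illustration (1 flag bit per quadtree node, 8 bits per leaf mean, a region is a leaf when all samples lie within `tol` of its mean); it is not the actual QSDPCM bitstream format, and all names are hypothetical. Block sizes are assumed to be powers of two.

```cpp
#include <cstdlib>
#include <vector>

using Image = std::vector<std::vector<int>>;  // row-major samples

// Assumed cost model: 1 flag bit per node, plus 8 bits for each leaf mean.
int quadtreeBits(const Image& d, int r0, int c0, int size, int tol) {
    long sum = 0;
    for (int r = r0; r < r0 + size; ++r)
        for (int c = c0; c < c0 + size; ++c) sum += d[r][c];
    int mean = static_cast<int>(sum / (size * size));

    bool flat = true;
    for (int r = r0; r < r0 + size && flat; ++r)
        for (int c = c0; c < c0 + size; ++c)
            if (std::abs(d[r][c] - mean) > tol) { flat = false; break; }

    if (flat || size == 1) return 1 + 8;       // leaf: flag + mean
    int h = size / 2;                          // otherwise split into four
    return 1 + quadtreeBits(d, r0, c0, h, tol) + quadtreeBits(d, r0, c0 + h, h, tol)
             + quadtreeBits(d, r0 + h, c0, h, tol) + quadtreeBits(d, r0 + h, c0 + h, h, tol);
}

struct Displacement { int dy; int dx; int bits; };

// Tries every displacement in [-range, range]^2 around (y0, x0) in the
// reference frame and keeps the one whose difference block is cheapest.
Displacement selectDisplacement(const Image& cur, const Image& ref,
                                int y0, int x0, int range, int tol) {
    int n = static_cast<int>(cur.size());
    int refRows = static_cast<int>(ref.size());
    int refCols = static_cast<int>(ref[0].size());
    Displacement best{0, 0, -1};               // bits < 0 means "none found yet"
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            int ry = y0 + dy, rx = x0 + dx;
            if (ry < 0 || rx < 0 || ry + n > refRows || rx + n > refCols) continue;
            Image diff(n, std::vector<int>(n));
            for (int r = 0; r < n; ++r)
                for (int c = 0; c < n; ++c) diff[r][c] = cur[r][c] - ref[ry + r][rx + c];
            int bits = quadtreeBits(diff, 0, 0, n, tol);
            if (best.bits < 0 || bits < best.bits) best = Displacement{dy, dx, bits};
        }
    }
    return best;
}
```

Because the displacement is chosen by decomposition cost rather than by a separate matching criterion, motion search and quadtree coding are coupled in one loop, which is the joint optimization the slide refers to.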

Summary

If HW/SW partitioning is performed first, the buffers that remain cannot be optimized away afterwards.

For the QSDPCM application, memory optimization does much better when done before HW/SW partitioning.

The Design of Mixed Hardware/Software Systems

Mixed Hardware/Software Systems

Many digital systems contain both hardware and software

Combining hardware and software design tasks has several advantages.

One is that it may accelerate the design process. Another is that it may enable hardware/software trade-offs to be made dynamically, as the design progresses.

Mixed Hardware/Software Systems

Unless the hardware and software are designed together, we do not think of it as a mixed hardware/software system.

The distinguishing factor is whether the boundary between hardware and software is a logical boundary or a physical boundary.

Simulation of Hardware/Software Systems

Presents the problem of modeling the behavior of a system based on the behavior of the hardware and software components.

Requires a simulation environment that can understand the semantics of both the software and the hardware components

Automated Hardware/Software Co-Synthesis

Allows the designer to explore more of the design space by dynamically reconfiguring the hardware and software.

Another challenge for hardware/software co-synthesis is that hardware and software are often described using different languages and formalisms.

Automated Hardware/Software Co-Synthesis

May include hardware/software partitioning. Some of the considerations are:

Performance requirements
Implementation cost
Modifiability
Nature of computation
Concurrency
Communication

Several Examples of Hardware/Software Co-Design

Embedded microprocessor systems
Heterogeneous multiprocessing systems
Application-specific instruction set processors
Special-purpose functional units
Application-specific co-processors

Using HW/SW Codesign

Alex Papadimoulis

OOP & HW/SW Codesign

Develop entire system in an object-oriented programming language
Treat hardware as an object
Allows for a unified design environment
HW functions can be simulated in a SW object and implemented concurrently

Problems with OOP

Synchronizing sequential code
Interleaved SW and HW functions
HW needs to know exactly when a data object is ready to be worked on
Same holds true for SW

C++ Class Library – Cylib

Handle this synchronization problem

Clock function and Done flag:

    objHardware.Modify( objData, blnDone );
    while (!(blnDone)) {
        objHardware.clock();
    }
    SoftwareFunction( objData );
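To show how such a clock/done handshake could behave end to end, here is a small self-contained software model. It is not the Cylib API: `HardwareModel`, its fixed latency, the doubling "hardware" function and `runPipeline` are all assumptions made for this sketch. Only the call pattern (`Modify`, `clock`, the done flag, `SoftwareFunction`) follows the slide.

```cpp
#include <vector>

// Hypothetical software model of a hardware block; NOT the Cylib API.
class HardwareModel {
public:
    explicit HardwareModel(int latencyCycles) : latency_(latencyCycles) {}

    // Starts an operation on `data`; `done` becomes true once it completes.
    void Modify(std::vector<int>& data, bool& done) {
        data_ = &data;
        done_ = &done;
        remaining_ = latency_;
        done = false;
    }

    // Advances the model by one clock cycle.
    void clock() {
        if (data_ == nullptr) return;      // idle: nothing in flight
        ++cycles_;
        if (--remaining_ > 0) return;      // still busy
        for (int& v : *data_) v *= 2;      // placeholder "hardware" function
        *done_ = true;                     // signal completion to software
        data_ = nullptr;
    }

    int cycles() const { return cycles_; }

private:
    int latency_;
    int remaining_ = 0;
    int cycles_ = 0;
    std::vector<int>* data_ = nullptr;
    bool* done_ = nullptr;
};

// Placeholder software stage that must only run after the hardware is done.
int SoftwareFunction(const std::vector<int>& data) {
    int sum = 0;
    for (int v : data) sum += v;
    return sum;
}

// The handshake from the slide, wrapped in a function.
int runPipeline(HardwareModel& hw, std::vector<int>& data) {
    bool blnDone = false;
    hw.Modify(data, blnDone);
    while (!blnDone) {
        hw.clock();                        // software drives the simulated clock
    }
    return SoftwareFunction(data);
}
```

The software side never inspects the hardware's internal state; it only clocks the object and watches the flag, so the hardware model can change without touching the caller.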

C++ Class Library – Cylib 2

Approach is similar to interrupts
Complexity is greatly reduced
Interface allows HW/SW objects to work hierarchically and in parallel
Modification of HW design requires changing only the class library

Another Approach: Compiler Generation

OOP approach won't work for all cases
Example: MPU architecture changes
Traditional MPU replacement, 2 options:
  Backwards-compatible hardware. Simply increase speed of functions, no new functionality.
  Rewrite compiler, very costly.

Compiler Generation

Theory: third option, generate compiler
Radical architecture changes, compilers wouldn't need time to catch up
Ideal for user-defined processors
Extract HW architecture information, then generate optimized executable code from high-level language

Compiler Generation

Retargetable compilers exist
  Require significant human skill
  Are simply a superset of all CPU instructions
Compiler generator would
  Overcome retargetable compiler limitations
  Maintain quality (speed, size, compilation time) of conventional compiler

How it works 1

Optimize front-end code
  Architecture-independent step
  Performed by conventional compilers
  Passes an optimized grammar tree structure to the next step

How it works 2

Get parameterized architecture info
  Number of general registers, memory word size, instruction behavior, etc.
Modify tree branches
  Using existing language ("twig")
Translate into pattern functions
Allocate registers
Generate executable code
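The role of the architecture parameters can be illustrated with a toy back end that walks an expression tree and emits pseudo-instructions, limited by a register count taken from an `ArchInfo` record. This sketch does not use twig and is not the generator from the slides: it substitutes a hand-written recursive walk, and `ArchInfo`, `Node`, `CodeGen` and the `LOAD`/`ADD`/`MUL` mnemonics are assumptions.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical architecture parameters; a real generator would extract more.
struct ArchInfo {
    int numRegisters;  // number of general registers
};

// Expression tree node: op is '+' or '*' for an operator, 0 for a leaf.
struct Node {
    char op = 0;
    std::string name;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

std::unique_ptr<Node> leaf(const std::string& name) {
    auto n = std::make_unique<Node>();
    n->name = name;
    return n;
}

std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->op = op;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

class CodeGen {
public:
    explicit CodeGen(ArchInfo arch) : arch_(arch) {}

    // Emits code for the subtree and returns the register holding its value.
    int gen(const Node& n) {
        if (n.op == 0) {
            int r = allocate();
            code_.push_back("LOAD r" + std::to_string(r) + ", " + n.name);
            return r;
        }
        int l = gen(*n.left);
        int r = gen(*n.right);
        code_.push_back(mnemonic(n.op) + " r" + std::to_string(l) + ", r" +
                        std::to_string(l) + ", r" + std::to_string(r));
        --next_;  // the right operand's register is free again
        return l;
    }

    const std::vector<std::string>& code() const { return code_; }

private:
    // Stack-style register allocation bounded by the architecture parameter.
    int allocate() {
        if (next_ >= arch_.numRegisters) throw std::runtime_error("out of registers");
        return next_++;
    }
    static std::string mnemonic(char op) {
        if (op == '+') return "ADD";
        if (op == '*') return "MUL";
        throw std::runtime_error("unsupported operator");
    }

    ArchInfo arch_;
    int next_ = 0;
    std::vector<std::string> code_;
};
```

The same tree compiles for `ArchInfo{3}` but is rejected for `ArchInfo{2}`, which shows in miniature how changing the extracted parameters changes what the back end can generate without rewriting it. A fuller version would spill to memory instead of throwing.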

Compiler Gen. Conclusion

Requires a lot of work
  Pipelined compilation techniques
  Automated architecture information extraction (perhaps HDL, etc.)
Experiments show that the concept could be used in HW/SW codesign in the future