Hardware synthesis with the aid of dynamic programming

Citation for published version (APA):
Woudenberg, van, H., & Born, van den, R. (1988). Hardware synthesis with the aid of dynamic programming. (EUT report. E, Fac. of Electrical Engineering; Vol. 88-E-201). Eindhoven University of Technology.

Document status and date:
Published: 01/01/1988

Document Version:
Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)

Please check the document version of this publication:
• A submitted manuscript is the version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website.
• The final author version and the galley proof are versions of the publication after peer review.
• The final published version features the final layout of the paper including the volume, issue and page numbers.

General rights
Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.
• Users may download and print one copy of any publication from the public portal for the purpose of private study or research.
• You may not further distribute the material or use it for any profit-making activity or commercial gain.
• You may freely distribute the URL identifying the publication in the public portal.

If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, please follow below link for the End User Agreement: www.tue.nl/taverne

Take down policy
If you believe that this document breaches copyright please contact us at: [email protected] providing details and we will investigate your claim.

Download date: 20. Aug. 2021



Hardware Synthesis with the Aid of Dynamic Programming by H. van Woudenberg and R. van den Born

EUT Report 88-E-201 ISBN 90-6144-201-X

June 1988


ISSN 0167-9708

Eindhoven University of Technology Research Reports

EINDHOVEN UNIVERSITY OF TECHNOLOGY

Faculty of Electrical Engineering

Eindhoven The Netherlands

Coden: TEUEDE



COOPERATIVE DEVELOPMENT OF AN INTEGRATED, HIERARCHICAL AND MULTIVIEW VLSI DESIGN SYSTEM WITH DISTRIBUTED MANAGEMENT ON WORKSTATIONS

(Multiview VLSI-design System ICD)

code: 991

DELIVERABLE

Report on activity: 5.1.D.

Abstract: This report describes the automated synthesis of hardware structures from behavioural descriptions. To this end, the description is first translated into a demand graph. The nodes of this graph describe operations and algorithmic constructs; the edges describe data flow. A method based on dynamic programming is then used to generate the structure. Dynamic programming restricts the large number of possible implementations by selecting, at intervals, the best intermediate structural solutions. The final hardware structure consists of a list of modules and a net-list with the interconnections between the modules. This hardware description will be completed with a state machine description. The synthesis program is coded in LISP. Some improvements and completions are suggested. The results are encouraging for further research.

deliverable code: WP 5, task: 5.1, activity: 5.1.0

date: 13-06-1988

partner: Eindhoven University of Technology

authors: H. van Woudenberg, R. van den Born

This report was accepted as an M.Sc. thesis of H. van Woudenberg by Prof. Dr.-Ing. J.A.G. Jess, Automatic System Design Group, Faculty of Electrical Engineering, Eindhoven University of Technology. The work was supervised by Drs. R. van den Born.

CIP DATA, KONINKLIJKE BIBLIOTHEEK, THE HAGUE

Woudenberg, H. van

Hardware synthesis with the aid of dynamic programming / by H. van Woudenberg and R. van den Born. - Eindhoven: University of Technology. - Fig. - (Eindhoven University of Technology research reports / Faculty of Electrical Engineering, ISSN 0167-9708; 88-E-201) With bibliography and index. ISBN 90-6144-201-X SISO 664.3 UDC 621.382:681.3.06 NUGI 832 Keywords: electronic circuits; computer aided design.

CONTENTS

1. INTRODUCTION
   1.1 CONTENTS OF THIS REPORT
2. SYSTEM OVERVIEW
   2.1 DEMAND GRAPH
       2.1.1 IF STATEMENT
       2.1.2 WHILE LOOP
   2.2 MODULE LIBRARY
   2.3 COST ESTIMATOR
3. DYNAMIC PROGRAMMING
   3.1 SOLVING DECISION PROBLEMS
   3.2 DYNAMIC PROGRAMMING
   3.3 DYNAMIC HARDWARE GENERATION
4. HARDWARE GENERATING PROCESS
   4.1 IMPLEMENTATION
   4.2 PROCESSING A STATE
       4.2.1 IMPLEMENTING A SIMPLE NODE
       4.2.2 IMPLEMENTING A CASE STATEMENT
       4.2.3 IMPLEMENTING A WHILE LOOP
   4.3 COMPARABILITY AND COST FUNCTION
   4.4 POSTPROCESSOR
5. DATA STRUCTURES
   5.1 STATE IDENTIFICATION
   5.2 RELATIONS TO THE DEMAND GRAPH
   5.3 HARDWARE DESCRIPTION
   5.4 COST DATA
   5.5 INPUTS, OUTPUTS & CONSTANTS
   5.6 TRACING NODES AND MODULES
   5.7 CYCLE MECHANISM
       5.7.1 STARTING A NEW CYCLE
       5.7.2 NEW CYCLES FOR CASE STATEMENT
       5.7.3 NEW CYCLES FOR WHILE LOOP
       5.7.4 ENDING A CYCLE
   5.8 STACKS
6. CONCLUSIONS AND RECOMMENDATIONS
REFERENCES
APPENDIX 1: DEMAND GRAPH NODE- AND EDGE-TYPES
APPENDIX 2: STRUCTURE OF THE MODULE LIBRARY
APPENDIX 3: FORMATS FOR COST ESTIMATOR
APPENDIX 4: SYNTAX OF A STATE

LIST OF FIGURES

Figure 2.1. Hardware synthesis system
Figure 2.2. Demand graph for discriminant
Figure 2.3. Demand graph for min-max-sort
Figure 2.4. Demand graph for factorial
Figure 2.5. Standard cell bit-slice layout for factorial
Figure 3.1. Tree for all possible sequences
Figure 3.2. Lattice during dynamic programming
Figure 4.1. Flow chart for hardware-generation
Figure 4.2. Flow chart for process-state
Figure 4.3. Mapping an operator node on an existing module

1. INTRODUCTION

The synthesis of circuit structures or layouts from higher level descriptions is receiving more attention as the need for more powerful design aids increases. The rapid development of Very Large Scale Integrated (VLSI) circuits creates this need, as the following numbers (from [Latt79]) illustrate. If the average productivity of a layout designer lies between 5 and 10 devices per day, then a VLSI circuit containing 100,000 transistors would take about sixty man-years to lay out and another sixty man-years to debug.

The tremendous amount of detail and complexity associated with large systems has necessitated the development and use of design automation tools. First, automated design aids for simulation and verification were developed. Then, with the increasing complexity of the designs, the need grew for creative aids that synthesise a circuit design. The emphasis was on the placement and layout of regular logic structures in silicon. The design of circuits became an interactive process between man and computer, in which more and more work was done by the computer, although the human designer remained indispensable. The advantages of automatic design are obvious: a reduced design time, reduced design costs, no need for verification of the designed hardware, etc. Next to the advantages in speed and costs, design automation tools make the design of even more complex circuits possible, and they allow a designer who is not versed in the detailed electrical problems of IC design to design his own IC chip.

The next step in this automation process is the reduction or even elimination of the human interactive influence on the design process, as will be done by a silicon compiler. Silicon compilation is ([Davi84]) the translation of a (Very Large Scale) Integrated Circuit described in a high level language into a target language describing the integrated circuit layout. The silicon compilation can be divided into several levels. This division starts with the hardware synthesis, also called the higher levels of the silicon compiler. In general terms, hardware synthesis consists of three stages ([Thom81]). First, the transformation of the algorithm, for example into a graph that serves as an equivalent intermediate description of the circuit. Second, the generation of a net-list from this graph. A net-list represents all connections between the modules that perform the operations prescribed by the algorithm. From this net-list description, a data path will be generated. This data path contains the logic components that store and process the data in the circuit. And third, the design of the control part. The control part represents the sequential state machine that evokes the processing of the information stored in the data path. After the hardware synthesis a logic optimiser changes the data path following some optimisation rules; then the modules are placed and the connecting wires are routed.

Constructing a silicon compiler is at least as complex as designing a circuit. Some obvious problems, arising from the lower levels of the compiler, are (see [Sahn80]): the partitioning of the design process into sub-implementation-problems, keeping wire buildup in the routing channels within tolerable bounds, minimizing the total weighted wire length, routing the wires, minimizing the number of layers, etc. Most problems that arise in the area of design automation have been shown to be NP-hard ([Coho83], [Sahn80]). NP-hard problems also exist at the higher levels, e.g. the circuit realisation problem ([Sahn80]). This points out the importance of heuristics and other tools to obtain algorithms that perform well on the problem instances of interest and thus return solutions that are near the optimal solution.

In some literature (e.g. [Shiv83]), silicon compilation is described as an academic exercise. In this context, mainly its higher levels are meant. But apart from the universities, many industries pay attention to this field of research. As a result, a number of silicon compilers have been developed during the last few years. Mostly, these systems are not general compilers: they only work for a small subset of all electrical circuits, namely circuits of some special type. After all, a silicon compiler can best be used when the highest valued criteria are not the speed or the area of the design, but minimal design costs and especially minimal design time. Reducing design time outweighs the added cost due to less-than-perfect optimisation of design parameters.

The Esprit-991 project at the Automatic System Design Group of the department of Electrical Engineering at the Eindhoven University of Technology concerns a silicon compiler. I worked on this project for my master's degree, and this thesis reports on that work: it covers the generation of the net-list description and some initialisations for the state machine of the compiler. The demand graph that is constructed is transformed into a hardware description that consists of two parts: a net-list that describes all connections in the hardware and a module-list that describes the cells that have been selected from a module library. The dynamic programming approach is used to ensure that a near optimal hardware description can be generated within reasonable time.

1.1 CONTENTS OF THIS REPORT

The next chapter gives a survey of the hardware synthesis part of the silicon compiler system. The parts that are of main interest for the hardware generating subsystem are described in more detail. Chapter 3 deals with dynamic programming. It briefly describes this methodology for solving multi-stage decision problems. Then chapter 4 comes to the heart of this report: the hardware generating system. This chapter describes the way the demand graph is transformed into hardware and how the dynamic programming approach works for the hardware generation. Chapter 5 describes the state, which is the main data structure that holds the status of the system; it also describes the way some parts of the transformation are implemented in a LISP program. Chapter 6 concludes this report; it gives some conclusions about the system and some recommendations for further research.


2. SYSTEM OVERVIEW

This chapter describes the hardware synthesis part of the Esprit-991 silicon compiler, as presented in figure 2.1. The main subject of this graduation work, the hardware generator, is in the middle of this figure. As shown by the arcs, it uses a demand graph as input data and it communicates with the module library and the cost estimator. Three subsections of this chapter are dedicated to these parts of the system. The output of the hardware generator, the data path description and the state machine, is described in chapter 5.

Figure 2.1. Hardware synthesis system (recoverable block labels: demand graph constructor, demand graph optimiser, layout description)

The input to the system is a behavioural description of an algorithm in a high level language (LISP, C, Pascal). This algorithm describes the functions that must be fulfilled. The parser analyses the algorithm syntactically with conventional compiler techniques ([Aho86]) and converts this algorithm to an abstract syntax tree. This tree is converted into a demand graph by the demand graph constructor, which has been described in [Stok86]. An optimiser removes some inefficiencies from the demand graph, converting it into a functionally equivalent demand graph. Such an optimiser performs, for instance, constant folding, dead code elimination, code motion, elimination of redundancies, and strength reduction, i.e. optimisations similar to those used in optimising compilers ([Aho86]). The hardware generator will produce the data path description and the state machine, with the help of the cost estimator and choosing modules from the module library. Then the layout generator should convert the more symbolic data path description and state machine into a detailed layout description. However, this part of the system is beyond the scope of this report.

Comparing the behavioural algorithm at the input of the system with the produced data path and state machine at the (intermediate) output, one can broadly say that the variables will be mapped to nets or registers and operators to logic circuits; assignments can be seen as data flow through the logic circuits; special constructs in the algorithm (if statement, while loop) find their representation in the control by the state machine.

2.1 DEMAND GRAPH

A demand graph is a directed graph that represents the dataflow through operators. Figure 2.2 shows a demand graph for the algorithm that computes the discriminant, used to solve the quadratic equation as described by the next Pascal program:

program discriminant (a, b, c; var D);
  read(a); read(b); read(c);
  D := (b * b) - (4 * a * c);
  writeln(D);

The nodes of a demand graph (shown as circles) represent the operations that are performed on the data. The edges (shown as arcs) represent the dataflow from node to node. Each edge is directed from the node that uses the data to the node that produces the data (demand!).

Figure 2.2. Demand graph for discriminant (inputs b, a, c; output D)
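The demand-driven reading of the graph can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the names `Node`, `evaluate`, `input_node`, `const_node` and `binary` are hypothetical and do not come from the report, whose synthesis program is written in LISP. Each node stores the producers it demands data from, mirroring the edge direction described above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """A demand graph node: an operation plus the producers it demands data from."""
    name: str
    op: Callable[..., int]
    demands: List["Node"] = field(default_factory=list)  # edges point at producers


def evaluate(node: Node, inputs: Dict[str, int]) -> int:
    """Demand-driven evaluation: a node first asks its producers for their data."""
    if not node.demands:  # input or constant node: reads the environment directly
        return node.op(inputs)
    return node.op(*(evaluate(producer, inputs) for producer in node.demands))


def input_node(name: str) -> Node:
    return Node(name, lambda env: env[name])


def const_node(value: int) -> Node:
    return Node(str(value), lambda env: value)


def binary(name: str, op: Callable[[int, int], int], left: Node, right: Node) -> Node:
    return Node(name, op, [left, right])


def mul(x: int, y: int) -> int:
    return x * y


def sub(x: int, y: int) -> int:
    return x - y


# Demand graph for D := (b * b) - (4 * a * c)
a, b, c = input_node("a"), input_node("b"), input_node("c")
b_squared = binary("*", mul, b, b)
four_a_c = binary("*", mul, binary("*", mul, const_node(4), a), c)
D = binary("-", sub, b_squared, four_a_c)

print(evaluate(D, {"a": 1, "b": 5, "c": 6}))  # 25 - 24 = 1
```

Evaluation starts at the output node D and pulls values through the edges, which is why the edges point from consumer to producer.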

Appendix 1 summarises the properties that describe the nodes and the edges. A more detailed specification of the node and edge characteristics can be found in [Stok86] and [Veen85]. In the next two sections two special demand graph constructs will be discussed, one for an if statement and one for a while loop.


2.1.1 IF STATEMENT

The algorithm min-max-sort serves as an example for an if statement. This algorithm exchanges the values of min and max when min is larger than max:

program min-max-sort (min, max);
  read(min); read(max);
  if (min > max) then
  begin
    min := min + max;
    max := min - max;
    min := min - max
  end;
  writeln(min); writeln(max).

The corresponding demand graph is shown in figure 2.3. Two new node types require attention: the merge node and the branch node, which are drawn separately to the right of the demand graph. The names of the connecting edges for these nodes are written at those edges. For each of the variables involved in the if statement a merge and branch node pair has been created. Depending on the value at the control edge, the data flows from the merge nodes to the link-in-1 or to the link-in-2 nodes, where the paths for, respectively, the then and the else clause start. In the branch nodes these paths come together and one of them will be selected depending on the value at the control edge.

Figure 2.3. Demand graph for min-max-sort (edge names shown for the merge and branch nodes: value, control, inlink-1, inlink-2, outlink-1, outlink-2)


The if statement is in fact a special case of the case statement: where the case statement selects between n possible clauses, the if statement can be seen as a case statement with n = 2. For a case statement, each merge node has n incoming edges named inlink-1 .. inlink-n, and each branch node consequently has n outgoing edges outlink-1 .. outlink-n. The control structure will be somewhat more complex, but this does not change the idea behind the graph structure: at the merge nodes n paths start, one path for each clause, and they come together in the branch nodes. Depending on the value of the control signal, one of the paths will be chosen.
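The selection behaviour of the merge and branch node pairs can be sketched functionally. The Python below is a behavioural illustration only, assuming a 1-based control value that picks one of n clauses; `case_construct`, `then_clause` and `else_clause` are hypothetical names, and the sketch does not model the individual nodes or edges of the report's graph.

```python
from typing import Callable, Dict, List, Tuple

Values = Dict[str, int]
Clause = Callable[[Values], Values]


def case_construct(control: int, clauses: List[Clause], values: Values) -> Values:
    """Route the variables into clause number `control` (1-based) and return its results."""
    # Merge side: n paths start here, one per clause; the control value picks one.
    selected = clauses[control - 1]
    # Branch side: the paths come together again; only the selected one delivers data.
    return selected(dict(values))


def then_clause(v: Values) -> Values:
    # Exchange min and max without a temporary, as in the min-max-sort program.
    v["min"] = v["min"] + v["max"]
    v["max"] = v["min"] - v["max"]
    v["min"] = v["min"] - v["max"]
    return v


def else_clause(v: Values) -> Values:
    return v  # the if statement has no else part: values pass through unchanged


def min_max_sort(mn: int, mx: int) -> Tuple[int, int]:
    control = 1 if mn > mx else 2  # an if statement is a case statement with n = 2
    out = case_construct(control, [then_clause, else_clause], {"min": mn, "max": mx})
    return out["min"], out["max"]


print(min_max_sort(7, 3))  # (3, 7)
```

Extending the list of clauses to n entries gives the general case statement without changing `case_construct`.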

2.1.2 WHILE LOOP

The demand graph for a while loop is illustrated with the algorithm factorial that computes the factorial of an integer variable n:

program factorial (n; var f);
  read(n);
  f := 1;
  while (n > 1)
  begin
    f := f * n;
    n := n - 1
  end;
  writeln(f).

Figure 2.4. Demand graph for factorial (edge names shown for the entry and exit nodes: entry, last, control, value)

The demand graph corresponding to this algorithm is shown in figure 2.4. Again two new node types are introduced: the entry node and the exit node. For each variable that is affected in the while loop, an entry- and exit-node pair is introduced. For both of these node types the node with its edges and the names of these edges are listed next to the demand graph for the factorial algorithm.

Initially, the data enters the while loop through the entry edges of the entry nodes. Then the control signal(s) for the entry and exit nodes can be calculated and they are connected to the control edges of both node types. While the control signal is 'true', the data circles around through the while loop: it leaves the exit nodes through the last edges, after the link-in-1 nodes some operations are performed on the data and then the data enters the entry nodes again through the last edges. At this moment the control signal(s) are computed again. The while loop ends when the control becomes 'false'; then the data leaves the exit nodes through the entry edges and becomes available at the connected link-in-2 nodes.
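The circulation of data through the entry and exit nodes can be sketched in the same behavioural style. In the Python below, `while_construct` is a hypothetical helper: the state dictionary stands for the variables that have an entry- and exit-node pair, `control` for the computation of the control signal, and `body` for the operations performed on each pass. It is a sketch under those assumptions, not the report's data structure.

```python
from typing import Callable, Dict

Values = Dict[str, int]


def while_construct(values: Values,
                    control: Callable[[Values], bool],
                    body: Callable[[Values], Values]) -> Values:
    """Circulate the loop variables until the control signal becomes false."""
    state = dict(values)       # data enters the loop once, from outside
    while control(state):      # the control signal is recomputed on every pass
        state = body(state)    # operations on the data, then back to the entry nodes
    return state               # control is false: the data leaves the loop


def factorial(n: int) -> int:
    out = while_construct(
        {"n": n, "f": 1},
        control=lambda s: s["n"] > 1,
        body=lambda s: {"f": s["f"] * s["n"], "n": s["n"] - 1},
    )
    return out["f"]


print(factorial(5))  # 120
```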

2.2 MODULE LIBRARY

The module library contains a set of predefined cells. Each cell is described by a set of characteristics. These characteristics can be accessed by a set of library access functions. Appendix 2 gives a summary of the syntax of the cell characteristics that are of interest, such as the width and height of the layout, the power dissipation, input-, output- and control-connections, delay time through the cell circuit, the operations that can be performed, etc. A complete description of the module library can be found in [Kais87].

The library cells perform specified functions. Simple cells only perform one function (for instance: add, nand, or greater-than), but also more complex cells are provided (an alu that is able to add, subtract, divide and multiply, for instance). Next to the cells performing logic and arithmetic functions, also cells for (de)multiplexers, registers and some other 'functions' are present.

The hardware generator chooses a cell taking into account the functions it can perform; it also takes into account the area the cell occupies and the delay of the cell.
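As an illustration of such a selection, the sketch below filters a toy library by function and then ranks the candidates by a weighted sum of area and delay. Everything in it is an assumption made for the example: the `Cell` fields, the cell data, the weights and the weighted-sum rule are hypothetical, and the report's actual comparison is the subject of section 4.3.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical subset of the cell characteristics listed in appendix 2."""
    name: str
    operations: FrozenSet[str]
    area: float
    delay: float


# Toy library with made-up numbers, for illustration only.
LIBRARY = [
    Cell("adder", frozenset({"add"}), area=4.0, delay=2.0),
    Cell("comparator", frozenset({"greater-than"}), area=3.0, delay=1.5),
    Cell("alu", frozenset({"add", "subtract", "multiply", "divide"}), area=20.0, delay=6.0),
]


def choose_cell(operation: str, library: List[Cell],
                area_weight: float = 1.0, delay_weight: float = 1.0) -> Optional[Cell]:
    """Pick the cheapest cell that can perform `operation`, or None if there is none."""
    candidates = [cell for cell in library if operation in cell.operations]
    if not candidates:
        return None
    # Assumed ranking rule: a weighted sum of area and delay.
    return min(candidates, key=lambda cell: area_weight * cell.area + delay_weight * cell.delay)


print(choose_cell("add", LIBRARY).name)       # adder
print(choose_cell("multiply", LIBRARY).name)  # alu
```

A generator that reuses modules would also weigh whether an already selected alu could take over the add, which a per-operation choice like this one cannot see.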

At the moment all cells in the module library are of a special kind: standard cells for a bit-slice layout methodology. (This is not a limitation of the system, but a deliberately chosen methodology.) Standard cell means that each cell has the same height (the width depends on the complexity of the performed function(s)) and that connections for power supply, ground and (sometimes) clock signals are placed at fixed positions along the height of the cell. These standard cells can economically be placed in rows: the power, ground and clock are then connected automatically, and between the rows channels are created for routing the wires that connect the cells to each other.

In a bit-slice layout methodology, each cell only describes one bit of a function. The functional elements are composed of those bit-slices by taking a number of slices equal to the word length of the data processed. In the layout, the different cells of one slice are placed horizontally next to each other, while the slices themselves are stacked vertically, separated by wiring channels. This bit-slice approach saves the storage of a lot of network data that is similar for each slice.

In figure 2.5 the symbolic layout of two bit-slices of standard cells has been drawn. This placement has been generated with the hardware generator. (Notes: 1. the interconnections between the slices (for carry etc.) have been left out; 2. all cells have equal width in this figure, but in reality they are different.)


Figure 2.5. Standard cell bit-slice layout for factorial (two identical slices; cell labels include multiplexer, demultiplexer, register, control, > and terminal)

2.3 COST ESTIMATOR

The implementation of a node can often be done in several ways. To make a well-considered choice between these possibilities, several different partial implementations that perform the same functions are generated. Because it is unnecessary, and not even desirable, to evaluate all possible implementations to completion, the process must continuously select the best partial implementation(s). These partial implementations must therefore be compared somehow.

The cost estimator, described completely in [Enge88], produces an estimate of the module and net areas of a partial implementation; these values are used (among others) when implementations are compared. This estimator uses a model according to the linear placement heuristic of Kang. Linear placement is a very time-efficient method for modelling hardware, and hence the estimator can work quickly. This is important, because the estimator will be called frequently.

The linear placement also fits in the standard cell bit-slice methodology. The estimator places the slices in one row, trying to find the most economic sequence, i.e. the sequence causing the lowest wiring density in the wiring channel. The placement algorithm works incrementally: when a new cell must be placed in the row, the old placement row is taken, a number of cells are deleted from the end until the new cell can best be placed, and then the deleted cells are placed again.
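One way to read the incremental step is sketched below. This is a simplified interpretation, not the estimator of [Enge88]: it assumes the wiring density of a row is the largest number of nets crossing any gap between neighbouring cells, and it tries every possible number of cells removed from the end, keeping the row with the lowest density. The names `channel_density` and `place_incrementally` and the example nets are hypothetical.

```python
from typing import Dict, List, Optional, Set


def channel_density(row: List[str], nets: Dict[str, Set[str]]) -> int:
    """Largest number of nets crossing any gap between two neighbouring cells."""
    position = {cell: index for index, cell in enumerate(row)}
    density = 0
    for cut in range(len(row) - 1):  # the gap between row[cut] and row[cut + 1]
        crossing = 0
        for cells in nets.values():
            placed = [position[c] for c in cells if c in position]
            if placed and min(placed) <= cut < max(placed):
                crossing += 1
        density = max(density, crossing)
    return density


def place_incrementally(row: List[str], new_cell: str,
                        nets: Dict[str, Set[str]]) -> List[str]:
    """Remove k cells from the end, place the new cell, re-place the k cells; keep the best k."""
    best_row: List[str] = []
    best_density: Optional[int] = None
    for removed in range(len(row) + 1):
        split = len(row) - removed
        candidate = row[:split] + [new_cell] + row[split:]
        density = channel_density(candidate, nets)
        if best_density is None or density < best_density:
            best_row, best_density = candidate, density
    return best_row


# Hypothetical example: cell x is wired only to cell a.
nets = {"n1": {"a", "b"}, "n2": {"b", "c"}, "n3": {"a", "x"}}
print(place_incrementally(["a", "b", "c"], "x", nets))  # ['x', 'a', 'b', 'c']
```

Appending x at the end would stretch net n3 across the whole row and give density 2; placing it next to a keeps the density at 1.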

Although the cost estimator uses a model instead of the real system and therefore is not 100% accurate, this does not matter that much. Because the cost estimator is used for comparing, it is important that it has a good relative accuracy. Furthermore, the model is independent of the actually used layout methodology, so it does not restrict the hardware generating system; a large variety of circuits can be represented. The model leaves out a number of details that are part of the original configuration, thus allowing a considerably smaller complexity in representing the system.

The way the hardware generator communicates with the cost estimator will be described in section 4.3. The syntax diagrams of appendix 3 give a summary of the formats used for the data that is interchanged between hardware generator and cost estimator.


3. DYNAMIC PROGRAMMING

Dynamic programming is a method for obtaining an optimal solution to multi-stage decision problems. This chapter first explains why the dynamic programming strategy is applied to the hardware generating process. Then the method itself is explained. Finally, the special conditions under which dynamic programming is used in our system are discussed.

3.1 SOLVING DECISION PROBLEMS

The class of problems this section deals with is illustrated with the next abstract example. Consider a process of four jobs. These jobs are numbered 1 through 4. The process chooses one of these jobs as the first one to be executed. When this job has been completed, a next job is chosen and so on, until all jobs have been completed.

An essential characteristic of this process is the fact that costs can be calculated for each ellipse in the tree of figure 3.1, i.e. for each subsequence of already completed jobs. The cost of a sequence depends not only on the jobs but also on the order in which they have been executed. Therefore the costs of all sequences will probably differ.

At this point the problem of this process can be stated: which sequence is the cheapest and, especially, how can the optimal solution be found efficiently?

One can easily see, that there are 4! = 24 different sequences scheduling these jobs. Figure 3.1 shows all of them in a tree. The numbers in the ellipses indicate the jobs that still have to be done.

Figure 3.1. Tree for all possible sequences

Of course it is possible to compute the costs for all possible sequences (depth first search or breadth first search) and then select the cheapest one just by comparing all costs. But since the number of sequences grows as the factorial of the number of jobs (which is much worse than exponential!), this approach will be quite time consuming even for small sets of jobs (as in the hardware synthesis process). Therefore it is necessary that the tree is restricted; dynamic programming is a method to achieve this.
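The exhaustive approach for the four-job example can be written down directly. In the Python sketch below the cost function `example_cost` is hypothetical (the report gives no concrete costs): a job costs its number times the position at which it runs, so the cost of a sequence depends on the order of the jobs.

```python
from itertools import permutations
from math import factorial
from typing import Callable, FrozenSet, Iterable, Tuple

StepCost = Callable[[FrozenSet[int], int], int]


def sequence_cost(sequence: Iterable[int], step_cost: StepCost) -> int:
    """Sum the cost of each job, given the set of jobs already completed before it."""
    done: FrozenSet[int] = frozenset()
    total = 0
    for job in sequence:
        total += step_cost(done, job)
        done = done | {job}
    return total


def cheapest_by_enumeration(jobs: Iterable[int], step_cost: StepCost) -> Tuple[int, ...]:
    """Evaluate every leaf of the tree of sequences and keep the cheapest one."""
    return min(permutations(jobs), key=lambda seq: sequence_cost(seq, step_cost))


def example_cost(done: FrozenSet[int], job: int) -> int:
    # Hypothetical order-dependent cost: a job gets more expensive the later it runs.
    return job * (len(done) + 1)


jobs = [1, 2, 3, 4]
best = cheapest_by_enumeration(jobs, example_cost)
print(len(list(permutations(jobs))))            # 24 sequences, i.e. 4!
print(best, sequence_cost(best, example_cost))  # (4, 3, 2, 1) 20
print(factorial(10), 2 ** 10)                   # 3628800 versus 1024
```

Already for ten jobs the enumeration visits 3,628,800 sequences, which is why the tree has to be restricted.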


3.2 DYNAMIC PROGRAMMING

What dynamic programming exactly is has been well described in the literature ([Bell62] and [Leve75], among others). This section presents a very short summary of these works as far as they concern N-step deterministic decision problems. The next section then discusses how dynamic programming can be applied to the hardware generating process.

In a deterministic decision problem one has to make a finite number of decisions, each dependent on the status of the process at that moment. Each decision consists of allocating some amount of a resource to an activity. Each activity returns some yield, depending on the amount of invested resource. The objective is to find the optimal policy, i.e. the sequence of decisions that maximizes the total yield of all activities. Therefore it is necessary that the yields of all activities can be measured in some common unit and that the total yield can be obtained as the sum of the individual yields. Validity of the principle of optimality is a third condition that must be satisfied before dynamic programming can be applied.

The principle of optimality:

An optimal policy ensures, independent of the initial state or the first decision, that the next decisions form an optimal policy for the state resulting from the first decision.

As a result of the dynamic programming approach, you get a sequence of decisions that represents an optimal policy. Suppose N decisions have to be taken, each decision being an investment R_i in activity i (i = 1, 2, ..., N). Then the principle of optimality tells that, having chosen some initial R_N, you do not then examine all policies involving that particular choice of R_N, but rather only those policies which are optimal for an N-1 stage process. The same holds for a choice R_N, R_{N-1} and an N-2 stage process, and so on. In this 'magical' way, operations will be kept essentially additive rather than multiplicative.

Instead of maximizing the yield when investing in activities, dynamic programming can also be applied when costs have to be minimized.

Figure 3.2. Lattice during dynamic programming

During the dynamic programming process, sequences containing the same jobs (or activities) in just different orders are compared and only the cheapest is maintained. This results in the lattice of figure 3.2 that is much smaller than the tree of figure 3.1. At the bottom of the lattice only one empty ellipse is present: this represents automatically the cheapest sequence and in this way the optimal solution has been found.
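The lattice can be sketched in Python as follows (an illustration only, not the LISP program of this report). The job durations and the cost measure (sum of completion times) are hypothetical. They are chosen such that the cost of adding a job depends only on the set of jobs already completed, so that the principle of optimality holds for this toy problem. Each lattice node is identified here by the set of completed jobs, whereas the ellipses of the figure show the jobs that remain.

```python
# Hypothetical durations of the four jobs (not taken from the report).
DURATION = {1: 3, 2: 1, 3: 4, 4: 2}

def lattice_search(jobs):
    """Keep, per set of completed jobs, only the cheapest sequence."""
    level = {frozenset(): (0, ())}      # set of done jobs -> (cost, sequence)
    states_kept = 1
    for _ in jobs:
        next_level = {}
        for done, (cost, sequence) in level.items():
            elapsed = sum(DURATION[j] for j in done)
            for job in jobs:
                if job in done:
                    continue
                # Completion time of this job is the cost of this step.
                new_cost = cost + elapsed + DURATION[job]
                key = done | {job}
                if key not in next_level or new_cost < next_level[key][0]:
                    next_level[key] = (new_cost, sequence + (job,))
        level = next_level
        states_kept += len(level)
    cost, sequence = next(iter(level.values()))
    return sequence, cost, states_kept

sequence, cost, states_kept = lattice_search([1, 2, 3, 4])
print(sequence, cost, states_kept)
```

For four jobs the lattice holds 1 + 4 + 6 + 4 + 1 = 16 nodes, one for every subset of the jobs.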

3.3 DYNAMIC HARDWARE GENERATION

The hardware synthesis implements the nodes of the demand graph one by one. Each implementation concerns the mapping of a node on a hardware module. A decision in our hardware generating process is the way a node from the demand graph is implemented into hardware. The resources that are used when a node is implemented consist of area on the chip and time in a machine cycle. A complete description of a state in the lattice is presented in chapter 5. For the dynamic process, a state represents a hardware description. This hardware has been formed during the implementation of a part of the demand graph. The hardware can perform a certain set of operations; this takes some machine cycle time and some area is needed for the hardware. Each state has a set of demand graph nodes that can be implemented next; implementing one of these nodes delivers a successive state in the state lattice.

The employment of dynamic programming in the hardware generating process rests on two mechanisms: a cost function and comparability. The cost function is needed to measure in some way what a certain partial circuit costs. This measurement is based on the total area of the cells that are selected and the wires that connect them, and on the maximal delay time through the circuit. The second mechanism, comparability, is provided to determine which states are equivalent. In section 4.3 the cost function and the comparability are specified in detail.

The strategies that are used for implementing some algorithmic constructs (e.g. if statement, while loop: see subsections 4.2.2 and 4.2.3) can not assure that the optimal policy will be found; on the contrary, it is more likely that it will not be found. So the principle of optimality is not valid and, strictly speaking, the hardware generating process is not suitable for the dynamic programming approach.

Nevertheless, dynamic programming is used: it can be considered as a breadth first search in which the total number of possibilities is restricted considerably.

4. HARDWARE GENERATING PROCESS

In this chapter the synthesis process will be described. The LISP-coded program is explained top-down by discussing several of its algorithms, going into details when necessary. This chapter often refers to the properties of a state. These properties (always printed in italics) and some of the routines that use them are discussed in detail in the next chapter.

Figure 4.1. Flow chart for hardware-generation

[Flow chart not reproduced. Box texts in order: generate demand graph; block 1: initialise State-0, enter in next-states; block 2: current-states := next-states, next-states := empty; block 3: take state from current-states; block 4: call process-state; block 5: insert resulting state(s) in next-states if no cheaper comparable state is present; test 1 and test 2 (Yes/No branches); block 6, leading to the hardware representation.]

4.1 IMPLEMENTATION

The main part of the hardware generating process is presented in figure 4.1. The algorithm that must be compiled enters the system and is transformed into a demand graph; then it is passed to the hardware generator. In block 1 of the hardware generator some initialisations are performed: the properties of State-0 and the global variables get the proper initial values.

The dynamic programming cycle is represented by block 2 up to test 2. In block 2 the list current-states keeps the states that represent the partial implementations that have been computed until now. Block 3 and test 1 take care that each of these states is passed to routine process-state (block 4). This routine will be discussed in section 4.2. The output of process-state is a list of possible new states, each representing an extended partial implementation with regard to the processed state. Each of these possible new states is added to the list next-states only if no cheaper comparable state is present in it. When a new state is added, the comparable but more expensive states that are already present in next-states are deleted. The calculation of the costs and the comparability mechanism are discussed in detail in section 4.3.

When all states in current-states are processed, a set of new states is present in next-states. Then, in block 2, these states move from next-states to current-states, next-states becomes empty and a new cycle of the dynamic process is passed through. This goes on until all nodes of the demand graph have been implemented; then no new state will be generated and consequently next-states stays empty. Test 2 fails and the states in current-states are passed to block 6: the post-processor selects the cheapest state that has been generated for the full circuit and performs some optimizations on the hardware. See section 4.4 for more details.
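The cycle of figure 4.1 can be summarised in a short Python sketch (an illustration only, not the LISP program of this report). The routines process_state, comparable and cost are passed in as parameters; their names and signatures are assumptions. When a comparable state of equal cost is already present, this sketch keeps the existing one, which is also an assumption. The optimizations of the postprocessor are omitted; block 6 only selects the cheapest final state.

```python
def hardware_generation(initial_state, process_state, comparable, cost):
    next_states = [initial_state]                        # block 1
    while True:
        current_states, next_states = next_states, []    # block 2
        for state in current_states:                     # block 3 and test 1
            for new in process_state(state):             # block 4
                # Block 5: insert only if no cheaper comparable state exists.
                rivals = [s for s in next_states if comparable(new, s)]
                if any(cost(s) <= cost(new) for s in rivals):
                    continue
                # Delete the comparable but more expensive states.
                next_states = [s for s in next_states
                               if not comparable(new, s)]
                next_states.append(new)
        if not next_states:                              # test 2
            return min(current_states, key=cost)         # block 6

# Toy usage with hypothetical job durations; a state is a tuple
# (set of completed jobs, cost so far, order of execution).
DURATION = {1: 3, 2: 1, 3: 4, 4: 2}

def expand(state):
    done, cost_so_far, order = state
    elapsed = sum(DURATION[j] for j in done)
    return [(done | {j}, cost_so_far + elapsed + DURATION[j], order + (j,))
            for j in DURATION if j not in done]

final = hardware_generation(
    (frozenset(), 0, ()),
    expand,
    comparable=lambda a, b: a[0] == b[0],
    cost=lambda s: s[1],
)
print(final)
```

The loop ends in the cycle in which process_state returns no new states at all, which mirrors the description of test 2 above.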

Finally, the generated hardware is presented at the output of the hardware generator.

4.2 PROCESSING A STATE

Every state has a property bucket that holds the nodes from the demand graph that are free. A node is called free if all nodes that are connected to the outgoing edges of the node have already been implemented. A free node can be implemented only if it is implementable. A free node is called implementable if all related nodes, i.e. those nodes that are controlled by the same control node, are free too. This only applies to special constructs in the algorithm: see the next subsections about the while loop and the case statement. Most node-types however are independent; they are implementable when they are free.

Figure 4.2 shows the flow diagram of the state processing routine. This routine handles one state at a time. For every node or set of related nodes in the bucket of this state that can be implemented, a new state is created. Then, depending on the type of the node (-set), an implementing routine is called. The different types of implementing routines are the subject of the next subsections. These routines return a list, containing one or more new states. All of them are put together in the output list, that is passed to the hardware generating routine that called this process.
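The two notions "free" and "implementable" can be illustrated with a minimal Python sketch (not the LISP program of this report). The graph representation (a dictionary that maps each node to the nodes at its outgoing edges), the node names and the table of related nodes are hypothetical.

```python
def free_nodes(outgoing, realised):
    """Nodes not yet implemented whose outgoing-edge neighbours all are."""
    return {node for node, targets in outgoing.items()
            if node not in realised and all(t in realised for t in targets)}

def implementable(node, bucket, related):
    """A free node is implementable if all its related nodes are free too.

    A node without an entry in `related` is independent, so it is
    implementable as soon as it is in the bucket.
    """
    group = related.get(node, {node})
    return group <= bucket

# Hypothetical demand graph: node -> nodes at its outgoing edges.
outgoing = {
    "sink": [],
    "put-y": ["sink"],
    "add": ["put-y"],
    "get-a": ["add"],
    "get-b": ["add"],
}
print(free_nodes(outgoing, set()))
print(free_nodes(outgoing, {"sink", "put-y", "add"}))

# Two hypothetical merge nodes controlled by the same control node.
related = {"merge-1": {"merge-1", "merge-2"},
           "merge-2": {"merge-1", "merge-2"}}
print(implementable("merge-1", {"merge-1", "add"}, related))
print(implementable("merge-1", {"merge-1", "merge-2"}, related))
```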

4.2.1 IMPLEMENTING A SIMPLE NODE

Simple nodes are nodes of type get, put, constant or nodes that represent an operator. These nodes are called simple because they stand alone; they are not related to other nodes. As stated before, a simple node that is present in the bucket is always implementable.

Figure 4.2. Flow chart for process-state

[Flow chart not reproduced. Legible box text: depending on the node-type select implement-... routine.]

When a simple node is implemented, this node is removed from the bucket and placed in the set realised-nodes. Nodes that become free by the implementation of this node are added to the bucket. The implementation of an operator node is the most difficult because of the many possibilities: re-using a free module, making free an existent module in a new machine cycle or selecting a new module from the module library. Figure 4.3 shows the two possibilities concerning the re-use of an existing hardware module.

The implementing routine first searches in the property free-modules for modules that perform the right function and that are not used yet in the current machine cycle. When such modules are present, one of them is selected and the node is mapped on this free module (see below). For the selection criterion, see recommendation 3 in chapter 6.

As figure 4.3 shows, if no free module is present to implement the node, the hardware is searched for a module of the proper type that is already used in this machine cycle. If one is found, a new machine cycle will be started (see section 5.7). Then the node can be mapped on a free module as described below.

Figure 4.3. Mapping an operator node on an existing module

[Flow chart not reproduced. Legible labels: "add node to realised-nodes, update bucket of state"; a test on whether a free module of the node type is present (Yes/No).]

Besides trying to re-use an existing module, another possible implementation is to extend the hardware with an extra module. Therefore a cell must be selected from the module library. The selection criterion implemented selects three cells that all perform the operation as requested by the node; they are the smallest cell, the cell that has the shortest delay time and the cell that embodies most operations covering the operations that still must be performed by the nodes of the demand graph that have not been implemented. The first one is selected for reasons of area economy, the second one is useful when short delay time is an important objective and the last cell is selected with a view to future re-use and is thus area economical too. Each cell that has a delay time exceeding the time left in the current cycle must be removed from this set (see recommendation 5, chapter 6). If no cells are left in the set afterwards, first a new machine cycle will be started and the set is retrieved. Possible doubles are removed from this cell selection. For each cell that remains a new state is generated, a module is created for the cell and is added to the hardware; then the node is mapped on this (free) module.
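The selection of candidate cells can be sketched in Python as follows (an illustration only, not the LISP program of this report). The Cell record, the library contents and all numbers are hypothetical. The sketch returns the remaining cells together with a flag that tells whether a new machine cycle has to be started first.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    name: str
    area: float
    delay: float
    operations: frozenset

def candidate_cells(library, operation, pending_operations, time_left):
    """Select the smallest, the fastest and the best covering cell."""
    capable = [c for c in library if operation in c.operations]
    if not capable:
        return [], False

    def coverage(cell):
        # Number of still pending operations this cell could perform.
        return sum(1 for op in pending_operations if op in cell.operations)

    picks = [min(capable, key=lambda c: c.area),
             min(capable, key=lambda c: c.delay),
             max(capable, key=coverage)]
    unique = list(dict.fromkeys(picks))          # remove possible doubles
    fitting = [c for c in unique if c.delay <= time_left]
    if fitting:
        return fitting, False
    # Nothing fits in the time left: start a new cycle, retrieve the set.
    return unique, True

library = [
    Cell("add-small", 10, 8, frozenset({"+"})),
    Cell("add-fast", 25, 3, frozenset({"+"})),
    Cell("alu", 40, 6, frozenset({"+", "-", "and"})),
    Cell("sub", 12, 7, frozenset({"-"})),
]
cells, new_cycle = candidate_cells(library, "+", ["-", "and", "+"], time_left=7)
print([c.name for c in cells], new_cycle)
```

With 7 time units left in the cycle, the smallest adder (delay 8) is dropped and two candidate states remain, one per cell.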

All generated states (both for re-used modules and for new modules) are put in the outlist and this list is returned to the routine process-state that activated this implementation.

Mapping a node on a free module implies a sequence of actions.

1. The selected module is deleted from the list free-modules.

2. To connect the proper signals to the module, the nodes of the demand graph that produce these signals are determined with the aid of the graph structure routines. Property trace-nodes is used to find out on which modules of the hardware these nodes are mapped. The output nets of these modules are selected; if no net is connected to the output of such a module, a new net is created and then connected to the output of that module. The order in which all data is passed to and fro is always maintained, because this can be important when operators are non-commutative (see chapter 6, recommendation 4).

3. If the module has never been used before, no nets are connected to the inputs and the output nets can directly be connected to the inputs. But if the module has been used once before, the output nets can not be connected directly. First, for each input signal a multiplexer must be inserted. The nets that already are connected to the inputs of the module are now moved to the first input of the multiplexers; new nets are created to connect the outputs of the multiplexers to the inputs of the module. Then the selected output nets can be connected to the second input of the multiplexers. It is also possible that multiplexers are already connected to the inputs of the module. If these multiplexers have no free inputs left, first a larger multiplexer is selected from the module library to substitute the old multiplexer. The existing connections need not be changed; the selected output nets can now be connected to the first free input of each multiplexer.

4. Property trace-nodes is updated for the node that has been mapped on the module.
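The handling of one module input in step 3 can be sketched in Python (an illustration only, not the LISP program of this report). An input is modelled here as the ordered list of nets that feed it; the position in the list is the multiplexer input. The available multiplexer sizes are hypothetical.

```python
MUX_SIZES = (2, 4, 8)  # hypothetical multiplexer sizes in the module library

def connect_input(sources, new_net):
    """Connect one more net to a module input.

    Returns the updated list of source nets and the multiplexer size that
    is needed in front of the input (None while a direct connection
    suffices). Existing connections keep their position.
    """
    updated = sources + [new_net]
    if len(updated) == 1:
        return updated, None
    for size in MUX_SIZES:
        if size >= len(updated):
            return updated, size
    raise ValueError("no multiplexer in the library is large enough")

first_use = connect_input([], "net-1")                   # direct connection
second_use = connect_input(["net-1"], "net-2")           # insert a multiplexer
third_use = connect_input(["net-1", "net-2"], "net-3")   # larger multiplexer
print(first_use, second_use, third_use)
```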

Implementing a node of type get takes the following steps. A get node represents the taking in of a variable from outside. With each get node a port is associated that represents the input pin. Each port will be mapped on a module of type terminal.

The name of the port for the get node is defined in the algorithm that is implemented. In property inputs it can be seen whether the port has been implemented or not; if so, it gives the module on which the port has been mapped. If no terminal is implemented for the get node, a new terminal module is created. At the input side of this module a dummy module of type dum-term is connected because (at this moment) the cost estimator cannot handle a module that is not connected to any other module. At the end of the hardware generating process this dummy module will be removed by the postprocessor. The name of the port together with the name of the module is added to the property inputs. If the terminal is not used already in the current machine cycle, the node is mapped on the free terminal, else first a new machine cycle will be started. Property trace-nodes is updated and the new state is returned to the routine process-state.

Implementing a node of type put starts with looking in property outputs for the output port where the computed signal must be presented to the outside world. If this port is not implemented, this is done first: a new module of type terminal is created and the name of the port together with the name of the module is added to the property outputs. Then the net that represents the signal of the put node is determined with the aid of trace-nodes. If the terminal is not used already in the current machine cycle, the net is connected to this terminal, else first a new cycle will be started.

Implementing a constant node differs a little from the other simple nodes. In fact only some connections to ground or supply voltage must be made. But because of the bit-slice structure this can not be described in that way. Therefore a module for a cell of type constant is created. In property constants is kept which value must be realised by this module. Later, in the layout realisation part of the silicon compiler, the correct connections must be made. Because (at this moment) the cost estimator can not handle an unconnected module, a dummy module of type dum-cons is created and connected to the input of the constant module. At the end of the hardware generating process the postprocessor will remove this dummy module.

4.2.2 IMPLEMENTING A CASE STATEMENT

In section 2.1.1 the demand graph for a case statement (or if statement) has been described; figure 2.3 was used to demonstrate the connections between the edges and nodes used.

A case statement consists of several paths in which operations are performed. Depending on a control signal one of these paths is selected and the values that are computed in the selected path will be present at the end of the case statement. Every path is implemented in a separate state of the state machine to make it simple to select a path. Therefore each path is implemented in a new machine cycle.

When a merge node is passed to the process-state routine, it is tested whether it is implementable or not. This is done by determining all merge nodes that are controlled by the same node that is connected at the control edge of the merge node. If they all are present in the bucket, the entire set of merge nodes is implementable. Then this set is passed to the implement-merge-set routine.

This implement-merge-set routine performs several actions:

1. The control signal from the control node is connected to a module of type control, representing the state machine while not implemented in the program (see chapter 6, recommendation 6).

2. The merge nodes that are implemented are deleted from the bucket. The current properties bucket, cycle-nr, cycle-type and free-modules are pushed on stack (see section 5.7) because each path will be implemented in a separate machine cycle.

3. The different machine cycles will be initialised: all link-in-<i> nodes are sorted for the clause they represent and pushed on the bucket-stack. The <i>-value determines the increasing order in which they will be popped. At the same time the right values are pushed on the other stacks.

4. Property trace-nodes is updated for the merge nodes and all link-in-<i> nodes that are connected; the merge nodes are added to realised-nodes.

5. All paths of the case statement come together in branch nodes. For each branch node a module for a multiplexer is created and an association sublist pair (branch-node module-name) is placed in property cycle-save. (A multiplexer is not always needed; see section 4.4 about the postprocessor and the note at the end of this subsection.)

6. For economical reasons (savings in the width and length of the state lattice), every time when a state of the state machine for a path of a case statement is started, all link-in-<i> nodes are implemented at once. These nodes do not perform any operation, so implementation is done by only updating the realised-nodes and bucket properties. The trace-nodes property was already updated in step 4.

This ends the implementation of a set of merge nodes. In a number of machine cycles the nodes of each path will be implemented in the normal way. When the implementation of a clause of the case statement is finished, the state of the state machine and therefore the current machine cycle will be ended. The computed signals are connected to the proper multiplexers with the aid of property cycle-save.

A branch node becomes free and is placed in the bucket when all paths going to that branch node have been implemented. A branch node is implementable if the entire set of related branch nodes is present in the bucket. Implementing the set of branch nodes takes the next actions:

1. The signals of the multiplexers are stored in property trace-nodes.

2. The control signal (see point 1 of implementing a merge set) is connected to the multiplexers.

3. The association pair (branch-node multiplexer-name) is deleted from the property cycle-save.

4. The branch nodes are added to the realised-nodes property and the bucket is updated.

This completes the implementation of a case statement. If a variable, that is computed in a case statement, always gets its value from the same operator, all inputs of the multiplexer for this variable will be connected to the same output net of that operator. Then this multiplexer is superfluous. The postprocessor detects such a situation and will remove the multiplexer.

4.2.3 IMPLEMENTING A WHILE LOOP

The demand graph for a while loop has been described in section 2.1.1; figure 2.4 shows an example.

A while loop can be described in two parts: a path in which some computations are performed and the computation of a signal that controls this path. While the control signal delivers a value 'true', data will circle around through the path: when it comes to the end of the path it will be fed back to the beginning. Two states of the state machine will be used: one for the computation of the control signal and one for the implementation of the nodes in the path.

A while loop starts with entry nodes. An entry node becomes free when the node at its entry edge has been implemented. An entry node is implementable if all related entry nodes are free. This means that all entry nodes that have their control edges connected to the same node that delivers the control signal, must be present in the bucket. In this case the entire set of related entry nodes is implemented, which results in:

1. Move the entry nodes from bucket to realised-nodes. Then push the current bucket, cycle-nr, cycle-type and free-modules properties on their stacks and start a new machine cycle for a new state of the state machine (see section 5.7).

2. Initialise the new cycle: property cycle-type gets the value for a while cycle, and the free nodes after the entry nodes are put in the bucket. These nodes will compute the control signal for the while loop.

3. For each entry node that is used for a variable a module for a two-input multiplexer is created. (So not for an entry node that realises a path between the sink node and nodes of type constant, see section 5.6.) The initial input signal for an entry node comes from the node that is connected to the edge of type entry. The output nets of the modules on which these nodes are mapped are connected to the first inputs of the created multiplexers. Then properties trace-nodes and trace-invalid are updated.

After the entry nodes have been implemented, normal implementation goes on for the nodes that produce the control signal. When this is ready, the bucket only contains all exit nodes for this while loop. The exit nodes are also implemented as an entire set:

1. Now the control signal for the while loop has been computed. The control edge of all related entry and exit nodes is connected to one node that delivers the control signal. A module of type control, that represents the state machine, is created and the output net of the module on which the controlling node has been mapped is connected to this control module. This module is then connected to the control inputs of the multiplexer modules on which the entry nodes are mapped. Again it must be remarked that this control module is only used because the state machine is still not implemented in the program.

2. When the while loop finishes, the computed signals of the while loop will be present in the link-in-2 nodes at the entry edges of the exit nodes. These link-in-2 nodes are added to the bucket-list at the top of the bucket-stack, that contains the nodes that are pushed in step 1 of implementing entry nodes: these nodes will be popped from stack at the end of the while loop.

3. For each exit node that is used for a variable, a module for a two-output demultiplexer is created. (So not for an exit node that realises a path between the sink node and nodes of type constant, see section 5.6.) The nets that carry these variables are connected to these demultiplexers. The control module is connected to the control input of the demultiplexers. The exit nodes are moved from the bucket to realised-nodes.

4. A new state of the state machine is started for the implementation of the loop path. The link-in-1 nodes are put in realised-nodes, the nodes that become free after them are put in the bucket.

5. Property trace-nodes is updated for the control node and all link-in-1, link-in-2, and exit nodes.

At this moment all exit nodes are implemented and the first nodes that realise the computations of the while loop are present in the bucket. They are implemented in the normal way. If a variable of an exit node (mapped on a demultiplexer) is used, the net that is connected to the first output will be taken. When all nodes in the while loop are implemented, the bucket contains again the set of entry nodes that have been implemented before. This time the following actions will take place:

1. The computed signals are connected to the second input of the multiplexers for the entry nodes.

2. The cycle of type '(while) is ended; this means that for each node of type link-in-2 on the top of the bucket-stack a connection-net is created at the second output of the demultiplexer for its exit node. To the output-side of this net a dummy module (of type dum-conn) is connected, because the cost estimator cannot handle a net that is not connected at both sides. This dummy module will be deleted by the post-processor. Also trace-nodes is updated and the top of each stack is popped.

3. The link-in-2 nodes are implemented (i.e. moved from bucket to realised-nodes and the free nodes after the link-in-2 nodes are put in the bucket).

This ends the implementation of the entry and exit nodes and thus completes the description of the implementation of a while loop.

4.3 COMPARABILITY AND COST FUNCTION

For every state in the set current-states a set of new states is computed by the process-state routine. These sets of new states are passed back to the hardware-generation routine (see figure 4.1: block 4, and figure 4.2). Then the mechanism that delimits the number of states is activated: only the cheapest comparable states are put in the set next-states; these states will be processed in the next cycle of the dynamic hardware generation process.

When is State-A comparable to State-B that is already present in next-states? This is based on two rules:

1. The property realised-nodes of both states must contain exactly the same nodes.

2. The nodes-cells-cover-set of State-B must be smaller than or equal to the nodes-cells-cover-set of State-A.

The nodes-cells-cover-set of a state is the intersection (without removing doubles) of all operations that are represented by nodes in the demand graph and all operations that can be performed by the cells that already have been implemented in hardware.

The first rule assures that the hardware has been generated for the same part of the demand graph; the second rule measures in some way how much the hardware can be used.

All states in next-states that satisfy these conditions are put in a list equal-states. Then the cheapest state out of these equal-states and State-A is determined. This is done with the aid of a cost-function. The cost of a state depends on the area A the implemented hardware occupies and the number of machine cycles M that is needed. A scale-factor C is needed to be able to add area and time. Then the cost-function is defined as:

cost = A + C · M

The value of C can be used to emphasise the importance of a small circuit (small C) or the importance of a fast circuit (large C). This cost function is linear in time and area, but it is also possible to implement another cost function. For example, when the area that can be used is bound to a maximum, a progressive cost function is suggested to restrict hardware extensions when the total available area is almost consumed. Another possibility is to multiply the used area and the number of machine cycles.
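The two comparability rules and the linear cost function can be sketched in Python (an illustration only, not the LISP program of this report). The state representation as a dictionary is hypothetical. "Smaller than or equal" is interpreted here as a comparison of the number of elements of the two cover sets, which is an assumption; the intersection without removing doubles is modelled as a multiset intersection.

```python
from collections import Counter

def cover_set(node_operations, cell_operations):
    """Multiset intersection of demanded and available operations."""
    return Counter(node_operations) & Counter(cell_operations)

def comparable(state_a, state_b):
    """Is new State-A comparable to State-B already in next-states?"""
    same_part = state_a["realised"] == state_b["realised"]       # rule 1
    size_a = sum(state_a["cover"].values())
    size_b = sum(state_b["cover"].values())
    return same_part and size_b <= size_a                         # rule 2

def cost(area, cycles, c):
    """Linear cost function: cost = A + C * M."""
    return area + c * cycles

demanded = ["+", "+", "-"]
state_a = {"realised": frozenset({"n1", "n2"}),
           "cover": cover_set(demanded, ["+", "-", "-"])}
state_b = {"realised": frozenset({"n1", "n2"}),
           "cover": cover_set(demanded, ["+"])}
print(comparable(state_a, state_b), comparable(state_b, state_a))
print(cost(area=120, cycles=3, c=10), cost(area=120, cycles=3, c=100))
```

With the larger value of C the same circuit is dominated by its three machine cycles, which illustrates how C shifts the emphasis from area to speed.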

4.4 POSTPROCESSOR

When one final state has been selected, a postprocessor will perform some optimizations on the generated hardware:

1. All dummy modules are removed (see section 4.2.1: implementing a get or constant node, and section 4.2.3 and 5.7: second connection to demultiplexer for exit node); nets that were connected are removed or reconnected.

2. If a net is connected to the same multiplexer more than one time, only one of the inputs of the multiplexer will stay connected to this net. The state machine will be updated and it is checked if a smaller multiplexer from the module library can be sufficient.

3. When the output of a multiplexer is connected to the input of another multiplexer, this combination can be substituted by only one multiplexer. A new multiplexer will be selected from the module library and the hardware connections and the state machine will be updated. (This situation occurs when a nested case statement has been implemented.)

4. Remove a demultiplexer with only one output connected. Delete one affected net and reconnect the other one. (This situation can occur when a while loop has been implemented and one of the signals in the loop was only local.)
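Optimization 2, together with the superfluous multiplexer mentioned at the end of subsection 4.2.2, can be sketched in Python (an illustration only, not the LISP program of this report). A multiplexer is modelled as the list of nets at its inputs; the update of the state machine is left out.

```python
def simplify_multiplexer(input_nets):
    """Keep each net at only one multiplexer input.

    Returns the remaining input nets (first occurrence kept, order
    preserved) and a flag telling whether the multiplexer has become
    superfluous because all inputs carried the same net.
    """
    unique = list(dict.fromkeys(input_nets))
    return unique, len(unique) == 1

print(simplify_multiplexer(["net-1", "net-2", "net-1"]))
print(simplify_multiplexer(["net-5", "net-5"]))
```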

5. DATA STRUCTURES

The hardware synthesis process has been implemented in a CommonLISP [Wins84] program. The basic data structure around which the process has been built up is the state. This chapter describes what data exactly is held in a state. Also the program implementations for some of the routines that affect properties are presented.

A state represents a hardware description that has been generated as a result of the implementation of a part of a demand graph. The hardware description and the status of the demand graph implementation are described in the properties of the state. The complete demand graph is implemented in a dynamic programming process (see chapter 3). If a new state is generated, first all data in the properties of the predecessor state are copied (except for the cost data, see section 4.4); then a next demand graph node or a set of related nodes is implemented and the hardware is extended according to the implementation rules for the node(s). The dynamic process compares equivalent states (hardware implementations) and deletes those states that probably will not develop into an optimal hardware implementation. This way a state lattice is formed, starting with an initial State-0 and resulting in a set of final states from which the best will be selected.

5.1 STATE IDENTIFICATION

A state is identified by its property state-id, an integer number. This number, converted to its character string representation, forms the suffix part of the name of the state; the prefix is "State-". To determine the place of a state in the state lattice, the property prev-state is provided; this is an integer number equal to the state-id of the parent state, i.e. the previous state from which this state is descended.
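The naming and parent-link scheme can be illustrated with a short sketch. It is written in Python rather than in the Common LISP of the actual program; the dictionary layout, the function names and the use of None as the parent of State-0 are assumptions made for illustration only.

```python
def state_name(state_id):
    """Build the state name: prefix "State-" plus the id as a string."""
    return "State-" + str(state_id)


def path_to_root(states, state_id):
    """Follow the prev-state links from a state back to the initial state."""
    path = []
    current = state_id
    while current is not None:
        path.append(state_name(current))
        current = states[current]["prev-state"]
    return path


# Hypothetical lattice: State-0 is the initial state and has no parent.
states = {
    0: {"state-id": 0, "prev-state": None},
    1: {"state-id": 1, "prev-state": 0},
    2: {"state-id": 2, "prev-state": 0},
    3: {"state-id": 3, "prev-state": 2},
}

print(path_to_root(states, 3))
```

Following prev-state in this way recovers the sequence of decisions that led to a given final state.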

5.2 RELATIONS TO THE DEMAND GRAPH

The property realised-nodes keeps a list of demand graph nodes that represents the part of the graph that has already been implemented. The property bucket, another list of demand graph nodes, contains the nodes that are free (see section 2.1). Initially, the set realised-nodes is empty and the set bucket contains DmgNode-0 and DmgNode-1, i.e. the sink node and the IO-sink node, which together are the starting points of the demand graph.

When a next state is generated, the node (or nodes) that will be implemented is deleted from the bucket and added to the set of realised-nodes. The bucket is extended with the nodes that become free by the implementation of this node.

The computing of new states stops when all nodes have been implemented and the generated hardware covers a complete circuit description that realises the function(s) prescribed by the algorithm.
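The bookkeeping of realised-nodes and bucket can be sketched as follows. This is a Python illustration, not the report's LISP code; the function name, the dictionary layout and the node names DmgNode-2 and DmgNode-3 are hypothetical, and the set of nodes that become free is simply passed in rather than derived from a demand graph.

```python
def implement_node(state, node, newly_free):
    """Return the successor bookkeeping after implementing one node.

    The node leaves the bucket, joins realised-nodes, and the nodes that
    become free through its implementation are added to the bucket.
    """
    bucket = [n for n in state["bucket"] if n != node]
    bucket.extend(newly_free)
    return {
        "realised-nodes": state["realised-nodes"] + [node],
        "bucket": bucket,
    }


# Initial situation as described: nothing realised, sink and IO-sink free.
initial = {"realised-nodes": [], "bucket": ["DmgNode-0", "DmgNode-1"]}
successor = implement_node(initial, "DmgNode-0", ["DmgNode-2", "DmgNode-3"])
print(successor)
```

The predecessor is left untouched, which matches the idea that every state in the lattice keeps its own copy of the properties.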

5.3 HARDWARE DESCRIPTION

The hardware is described by two sorts of elements. Modules constitute the building blocks for the hardware. A module represents a certain cell from the module library; it performs some operation(s) on data. Nets represent the wires that connect the modules; hence they transport data from one module to another.

While the hardware generating process proceeds, the circuit description is constructed. This causes many new modules and new nets to be generated, many searches for, references to and updates of old nets and modules, deletions of nets or modules, insertions of modules in between old connected modules, etc. In other words, the hardware data will be referred to and changed many times. Therefore the access to the nets and modules has to be very fast. This is done using vectors, because in a vector each data field can be accessed directly. Two sorts of vectors are used and will be introduced below: the net-vector and the module-vector.

The net-vector consists of the following three data fields:

net-name            a string that has prefix "net-" and, as suffix, a unique integer number (converted into a string).

net-in-modules      a list that contains the module-names of the modules that have an output connected at the input of the net.

net-out-modules     a list that contains the module-names of the modules that have an input connected at the output side of the net.

The module-vector consists of five data fields:

module-name            a string with prefix "mdl-" and, as suffix, a unique integer number converted into its string representation.

module-in-nets         a list of net-names; each net is connected at a data input of the module.

module-out-nets        a list of net-names; each net is connected at an output of the module.

module-control-nets    a list of net-names; each net is connected at a control input of the module.

library-module         the name of a cell in the module-library that has been selected for implementation.

The order in which the net-names occur in the lists module-in-nets, module-out-nets and module-control-nets is important, because these orders determine at which pin of the cell these nets must be connected.

The module-vectors and net-vectors are stored in two properties of a state, the lists mdl-vecs and net-vecs, respectively. Further, property mdl-nr keeps the integer number for the suffix part of the name of the module that will be generated next. The same holds for net-nr and the next net-name.
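A minimal sketch of the two vector types and of the name counters is given below. It uses Python dataclasses in place of LISP vectors; the class and function names, the helper connect and the cell names "adder" and "register" are assumptions for illustration and do not come from the program itself.

```python
from dataclasses import dataclass, field


@dataclass
class NetVector:
    net_name: str
    net_in_modules: list = field(default_factory=list)   # modules driving the net
    net_out_modules: list = field(default_factory=list)  # modules reading the net


@dataclass
class ModuleVector:
    module_name: str
    library_module: str
    module_in_nets: list = field(default_factory=list)       # order = data input pins
    module_out_nets: list = field(default_factory=list)      # order = output pins
    module_control_nets: list = field(default_factory=list)  # order = control pins


def new_net(state):
    """Create a net named from net-nr and advance the counter."""
    net = NetVector("net-" + str(state["net-nr"]))
    state["net-nr"] += 1
    state["net-vecs"].append(net)
    return net


def new_module(state, library_module):
    """Create a module named from mdl-nr and advance the counter."""
    module = ModuleVector("mdl-" + str(state["mdl-nr"]), library_module)
    state["mdl-nr"] += 1
    state["mdl-vecs"].append(module)
    return module


def connect(net, source, sink):
    """Wire an output of source, through net, to a data input of sink."""
    net.net_in_modules.append(source.module_name)
    source.module_out_nets.append(net.net_name)
    net.net_out_modules.append(sink.module_name)
    sink.module_in_nets.append(net.net_name)


state = {"net-nr": 0, "mdl-nr": 0, "net-vecs": [], "mdl-vecs": []}
adder = new_module(state, "adder")        # hypothetical library cell name
register = new_module(state, "register")  # hypothetical library cell name
wire = new_net(state)
connect(wire, adder, register)
```

Because pin assignment follows list order, connect appends rather than inserts; a routine that rewires a specific pin would have to replace the entry at that position instead.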

Some special actions are taken when something must be changed in the hardware data. To be able to access the hardware data, the hardware must be loaded: all nets and modules are interned. Interning is a LISP function; it means that a symbol is created for each net and module. A net-name or a module-name becomes the name of a symbol and the corresponding net-vector or module-vector becomes its value. Then all hardware data can be accessed immediately and the changes are carried out. Afterwards, the hardware must be saved. The hardware-save routine first saves the newly created vectors in the properties net-vecs and mdl-vecs; then it uninterns all symbols, i.e. the link between each vector as a value and the name of its symbol is released. This prevents a state from being confused by the hardware data of another state. When the hardware is saved (this occurs exactly once for each new state), the changes in hardware are computed and the hardware-save routine activates the cost estimator.
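The load and save steps can be mimicked in Python with a name-to-vector table standing in for the interned symbols. This is only an analogy: the function names and the dictionary layout of the vectors are assumptions, and the computation of the hardware changes and the call to the cost estimator are left out.

```python
def load_hardware(state):
    """Make every net and module reachable by name (stands in for interning)."""
    table = {}
    for vec in state["net-vecs"] + state["mdl-vecs"]:
        table[vec["name"]] = vec
    return table


def save_hardware(state, table):
    """Write the vectors back into the state and release the name table."""
    state["net-vecs"] = [v for name, v in table.items() if name.startswith("net-")]
    state["mdl-vecs"] = [v for name, v in table.items() if name.startswith("mdl-")]
    table.clear()  # stands in for uninterning all symbols


state = {
    "net-vecs": [{"name": "net-0", "in": ["mdl-0"], "out": []}],
    "mdl-vecs": [{"name": "mdl-0", "in": [], "out": ["net-0"]}],
}

table = load_hardware(state)
# Direct access by name: add a new module and connect it to net-0.
table["mdl-1"] = {"name": "mdl-1", "in": ["net-0"], "out": []}
table["net-0"]["out"].append("mdl-1")
save_hardware(state, table)
```

Clearing the table after saving plays the role of uninterning: no name lookup can reach the vectors of this state while another state is being processed.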


5.4 COST DATA

As stated before, the cost estimator calculates some parameters that define the area of the hardware that has been implemented in a state. Because the cost estimator works incrementally (see chapter 3.3), some data must be stored in a state for the benefit of the cost estimator. Three properties are used to store all cost data.

The property costs stores the cost vector. This cost vector holds five fields. The first one should give the total delay time of the critical path through the circuit, but this calculation is not performed by the cost estimator. The second field gives the total area occupied by all library modules. The third field gives the total wiring length, the fourth gives the length of the sequence of standard cells, and the last field gives the maximum number of nets above each other in the wiring channel.

The property placement contains a list of module-names. The order of the module-names determines the order in which the cost estimator has placed the library-modules next to each other.

The property new-cost-vecs is not of interest for the hardware generating process. The cost estimator uses this data for its next incremental cost calculation.

Each time a new state is generated, the hardware has been changed. Then the cost estimator is activated: the three cost data types from the old state are passed to the cost estimator, and the new values that are returned are saved in the properties of the new state. Therefore it is unnecessary to copy these properties when a new state is initialised.
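The consequence for state initialisation can be sketched as follows: every property of the predecessor is copied except the three cost properties, which are filled in later from the estimator's result. The sketch is in Python; the function names, the dictionary layout and the example numbers are assumptions, and the cost estimator itself is not modelled.

```python
import copy

COST_PROPERTIES = ("costs", "placement", "new-cost-vecs")


def init_successor(prev, new_id):
    """Copy all properties of the predecessor except the cost data."""
    new = {k: copy.deepcopy(v) for k, v in prev.items() if k not in COST_PROPERTIES}
    new["state-id"] = new_id
    new["prev-state"] = prev["state-id"]
    return new


def store_cost_data(state, estimator_result):
    """Save the three values returned by the cost estimator in the state."""
    costs, placement, new_cost_vecs = estimator_result
    state["costs"] = costs
    state["placement"] = placement
    state["new-cost-vecs"] = new_cost_vecs


prev = {
    "state-id": 4,
    "prev-state": 2,
    "bucket": ["DmgNode-7"],
    # (delay, module area, wiring length, sequence length, tracks);
    # the delay field is not computed, so None is used here.
    "costs": (None, 120, 45, 30, 3),
    "placement": ["mdl-0", "mdl-1"],
    "new-cost-vecs": [],
}
new = init_successor(prev, 9)
# Made-up estimator result for the extended hardware.
store_cost_data(new, ((None, 150, 60, 36, 4), ["mdl-0", "mdl-2", "mdl-1"], []))
```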

5.5 INPUTS, OUTPUTS & CONSTANTS

The data that the circuit needs from outside enters the circuit via terminals. The output signals generated will also be present at terminals. Terminals are cells from the module-library. They correspond to ports that have been defined in the algorithm. Which port corresponds to which terminal is stored in two properties: input-conn and output-conn. These properties are filled in during the hardware generation; both can be used as an association list in the program. Such an association list consists of sublists; each sublist has two elements: first the name of the port in the algorithm, then the name of the corresponding module in the hardware description.

Constants will be implemented with the aid of library-modules of the constant type. In fact, this is only symbolic: in the integrated circuit a constant will be obtained by connecting the right net-wires for the data bits to ground or supply voltage. The property constants keeps the constant values that these constant modules must realise. It is an association list: for each constant module a sublist is present that contains the name of the module and the value of the constant.
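For readers less familiar with association lists, a small Python sketch of the lookup is given here. The helper assoc mimics the behaviour of the LISP function of the same name on two-element sublists; the port names "a" and "b" and the module names are hypothetical.

```python
def assoc(key, alist):
    """Return the first sublist whose first element equals key, else None."""
    for entry in alist:
        if entry[0] == key:
            return entry
    return None


# (port-name module-name) sublists, as described for input-conn.
input_conn = [["a", "mdl-0"], ["b", "mdl-1"]]
# (module-name constant-value) sublists, as described for constants.
constants = [["mdl-2", 1]]

terminal_for_b = assoc("b", input_conn)[1]
value_of_mdl_2 = assoc("mdl-2", constants)[1]
```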

5.6 TRACING NODES AND MODULES

During the hardware synthesis, nodes from the demand graph are implemented by modules. When the next node is implemented, the module must be connected to those modules that produce the data for its inputs. These modules correspond to the nodes in the demand graph that are connected at the out-edges of the node to be implemented. When the nodes that produce the data are known, the associated modules can be found with the aid of property trace-nodes.


Trace-nodes is an association list that holds a sublist for each node that must be used. The key for the association list is the name of a node; the associated value is a vector of two fields. The first field is an integer number representing the number of times the data of this node will be used in future. Each time the node is used, this number is lowered by one until it becomes zero; then the sublist is deleted from the association list, because the data of this node will never be used again. The second field holds the name of the module where the data is present. When this data moves in the circuit (for instance from an operator to a register when a new cycle is started), this field is updated.
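The use-count mechanism can be sketched in a few lines of Python. A dictionary replaces the association list and a tuple replaces the two-field vector; these choices, the function names and the node and module names are assumptions for illustration.

```python
def use_node(trace_nodes, node):
    """Consume one use of a node's data and return the module holding it."""
    uses, module = trace_nodes[node]
    uses -= 1
    if uses == 0:
        del trace_nodes[node]  # the data will never be used again
    else:
        trace_nodes[node] = (uses, module)
    return module


def move_data(trace_nodes, node, new_module):
    """Record that the data of a node now resides in another module."""
    uses, _ = trace_nodes[node]
    trace_nodes[node] = (uses, new_module)


trace_nodes = {"DmgNode-5": (2, "mdl-3")}
first = use_node(trace_nodes, "DmgNode-5")    # data read from the operator
move_data(trace_nodes, "DmgNode-5", "mdl-7")  # e.g. saved in a register
second = use_node(trace_nodes, "DmgNode-5")   # last use removes the entry
```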

As a consequence of this tracing mechanism, two other properties are used. One is trace-invalid, a list of nodes that cannot be used because they do not produce any data, although other nodes refer to them. For this reason these nodes disturb the trace-nodes and have to be set apart. For an example of such nodes, see figure 2.4 for the demand graph of algorithm factorial. There the right-most entry, exit and link-in-1 nodes are only used to provide a path between some constant nodes and the sink node.

The last property in this context is free-modules. This is a list of names of modules that have not been used in the current machine cycle.

5.7 CYCLE MECHANISM

When a lot of operators are connected in series to perform some complex function, the delay time through these operators can grow to such a large value, that it exceeds the duration of a period of the system clock. When this happens, the signals must be stored in registers and a new machine cycle must be started. In the new cycle, all implemented operators are set free and can be re-used with the aid of multiplexers connected at their inputs.

The implementation of special syntax constructs, such as a case-statement or a while loop, also requires new machine cycles, because the different paths of these constructs will be implemented in separate cycles.

The current machine cycle is identified by the integer number that the property cycle-nr keeps. When a new cycle is started, its cycle-nr will be the number present in property next-cycle-nr, and this number will be increased by one. With each cycle a property cycle-type is associated that is used when a cycle is ended; the actions that have to be taken when ending a cycle depend on this cycle-type.
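The numbering rule can be written down directly. The Python sketch below assumes a dictionary of properties and represents cycle types as tuples such as ("normal",) or ("while",); both are illustrative choices.

```python
def start_cycle(state, cycle_type):
    """Start a new machine cycle and return its number."""
    state["cycle-nr"] = state["next-cycle-nr"]
    state["next-cycle-nr"] += 1
    state["cycle-type"] = cycle_type
    return state["cycle-nr"]


state = {"cycle-nr": 0, "next-cycle-nr": 1, "cycle-type": ("normal",)}
started = start_cycle(state, ("while",))
```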

5.7.1 STARTING A NEW CYCLE

When a new cycle must be started, first all signals that must be saved for future use will be stored in registers. These signals can be determined with the aid of the trace-nodes property: all output signals of operators and input terminals that are held in this list must be stored. When more signals must be stored than free registers are present, new register modules are created. The hardware is extended for the connections between the selected signals and the registers.

Then property trace-nodes is updated for these movements of data. Also the set of free modules for the new machine cycle is determined and stored in property free-modules.
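A possible shape of the register allocation at the start of a cycle is sketched below in Python. It assumes a hypothetical mapping module_kind that tells whether a module is an operator, an input terminal or a register, takes free registers in list order, and omits the creation of nets between signals and registers; none of these details is taken from the program.

```python
def store_signals(trace_nodes, module_kind, free_registers, mdl_nr):
    """Move every traced signal held by an operator or input terminal
    into a register; create new registers when the free ones run out.

    Returns the updated mdl-nr counter.
    """
    free = list(free_registers)
    for node, (uses, module) in list(trace_nodes.items()):
        if module_kind[module] in ("operator", "input-terminal"):
            if free:
                register = free.pop(0)
            else:
                register = "mdl-" + str(mdl_nr)
                mdl_nr += 1
                module_kind[register] = "register"
            trace_nodes[node] = (uses, register)  # the data moved
    return mdl_nr


trace_nodes = {
    "DmgNode-4": (1, "mdl-0"),
    "DmgNode-5": (2, "mdl-1"),
    "DmgNode-6": (1, "mdl-2"),  # already held in a register
}
module_kind = {
    "mdl-0": "operator",
    "mdl-1": "input-terminal",
    "mdl-2": "register",
    "mdl-3": "register",
}
next_mdl_nr = store_signals(trace_nodes, module_kind, ["mdl-3"], 4)
```

In this example one free register is available for two signals, so a second register is created and the module counter advances by one.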


5.7.2 NEW CYCLES FOR CASE-STATEMENT

(See also section 4.2.2.) While a case statement is implemented, no other parts of the demand graph are implemented. So before the new machine cycles for the case statement can be started, the current situation must first be preserved: the properties cycle-nr, cycle-type, bucket and free-modules are pushed on their stacks so that the current situation can be restored when the implementation of the case statement is finished.

Each path will be implemented in a separate machine cycle: for each cycle some initialisation data is pushed on the stacks. These initialisations are:

- The cycle-nr for the cycle, in which the implementation of a path starts, is pushed on the cycle-nr-stack.

- A list '(merge <i>) is pushed on the cycle-type-stack for the i-th path (i = 1 .. number of paths).

- The modules that are set free now are pushed on the free-modules-stack. When a new module for an operator is implemented in a cycle, this module must be added to every list of free-modules on the free-modules-stack.

- The implementation of path <i> starts with the link-in-i nodes, so these nodes are pushed on the bucket-stack.
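The initialisations listed above can be sketched as a sequence of pushes. The Python code below makes several assumptions that the text does not settle: cycle numbers for the paths are drawn from next-cycle-nr, the paths are pushed in reverse so that path 1 is popped first, and the current free-modules list is taken as the set of modules that are set free. All names are hypothetical.

```python
def push_all(state, cycle_nr, cycle_type, bucket, free_modules):
    """Push one entry on each of the four stacks."""
    state["cycle-nr-stack"].append(cycle_nr)
    state["cycle-type-stack"].append(cycle_type)
    state["bucket-stack"].append(bucket)
    state["free-mdls-stack"].append(free_modules)


def push_case_paths(state, link_in_nodes):
    """Preserve the current situation, then push one entry per case path.

    link_in_nodes[i - 1] holds the link-in-i nodes of path i.
    """
    push_all(state, state["cycle-nr"], state["cycle-type"],
             state["bucket"], state["free-modules"])
    base = state["next-cycle-nr"]
    paths = len(link_in_nodes)
    state["next-cycle-nr"] += paths
    for i in range(paths, 0, -1):  # reverse order: path 1 ends on top
        push_all(state, base + i - 1, ("merge", i),
                 link_in_nodes[i - 1], list(state["free-modules"]))


state = {
    "cycle-nr": 3, "next-cycle-nr": 4, "cycle-type": ("normal",),
    "bucket": ["DmgNode-9"], "free-modules": ["mdl-2"],
    "cycle-nr-stack": [], "cycle-type-stack": [],
    "bucket-stack": [], "free-mdls-stack": [],
}
push_case_paths(state, [["DmgNode-10"], ["DmgNode-11"]])
```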

5.7.3 NEW CYCLES FOR WHILE LOOP

(See also section 4.2.3.) During the implementation of a while loop no other nodes are implemented. Therefore the current situation is pushed on the stacks: after the implementation of the while loop has been finished, this situation will be popped from the stacks. For a while loop two new cycles are initialised. The first one, which implements the nodes that realise the control signal, will be started immediately, so the data for this cycle is not pushed on the stacks but placed in the 'current' properties. The second cycle will be used to implement the nodes that perform the operations of the loop.

To initialise the cycle for the control signal, the properties free-modules, next-cycle-nr and cycle-nr are updated, type '(while) is placed in cycle-type and the free nodes after the entry nodes are put in the bucket.

The implementation of the control part of the while loop ends when the bucket contains all related exit nodes. Then a new cycle will be started. Again free-modules, next-cycle-nr and cycle-nr are updated, type '(while) remains in cycle-type and the nodes that become free after implementation of the link-in-1 nodes after the exit nodes are put in the bucket. The link-in-2 nodes after the exit nodes are added to the top list of the bucket-stack.

5.7.4 ENDING A CYCLE

If no nodes are present in the bucket and the stacks are empty, the complete demand graph has been implemented. Then the hardware generating process will be finished.

Otherwise the top of each stack is popped and the popped data is placed in the corresponding properties. But first the current machine cycle must be ended; the actions that have to take place depend on the cycle-type:

'(normal)     The signals that must be saved are stored in registers.

'(merge i)    The signals that are created in the cycle must be connected to the multiplexers for the branch nodes; these multiplexers are traced with the aid of property cycle-save. This property provides an association list of (branch-node demultiplexer-module-name) sublists.

'(while)      Dummy modules are connected at all second outputs of the demultiplexers for the exit nodes that will be used later.

5.8 STACKS

The last set of properties deals with the stack mechanism of the program. In the case of a case-statement or a while-loop, a choice between two or more paths through the algorithm (thus through the demand graph, thus through the circuit) is made on the basis of the value of some control signal(s). These paths cannot be created at the same time. Therefore the different paths of the demand graph are traced one at a time and each path is implemented in a new state of the state machine. While generating the hardware for one path, some data about the other ones must be preserved.

When the case-statement (or while-loop) is finished, data has flowed through one of the paths, selected by a control signal. All different paths come together in a demultiplexer. Then the data must flow further; therefore the status of the process just before this statement is needed, so this status must be preserved too.

The preservation of all data that determines the status of a machine cycle is accomplished by the stack mechanism. Four stacks are in use: the cycle-nr-stack stores the cycle-nr, the cycle-type-stack stores the cycle-type, the bucket-stack does the same for the bucket and the free-mdls-stack for the free-modules. Each stack always holds the same number of elements as the other stacks. Pushing and popping data always occurs simultaneously for all stacks.
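The lockstep behaviour of the four stacks can be sketched as follows. The Python code assumes a dictionary of properties with Python lists as stacks; the table STACKED and the function names are hypothetical.

```python
# Each property is paired with the stack that preserves it.
STACKED = (
    ("cycle-nr", "cycle-nr-stack"),
    ("cycle-type", "cycle-type-stack"),
    ("bucket", "bucket-stack"),
    ("free-modules", "free-mdls-stack"),
)


def push_situation(state):
    """Push the current value of every stacked property at once."""
    for prop, stack in STACKED:
        state[stack].append(state[prop])


def pop_situation(state):
    """Pop every stack at once and restore the properties."""
    for prop, stack in STACKED:
        state[prop] = state[stack].pop()


def stacks_consistent(state):
    """Check that all four stacks hold the same number of elements."""
    return len({len(state[stack]) for _, stack in STACKED}) == 1


state = {
    "cycle-nr": 2, "cycle-type": ("normal",),
    "bucket": ["DmgNode-8"], "free-modules": ["mdl-1"],
    "cycle-nr-stack": [], "cycle-type-stack": [],
    "bucket-stack": [], "free-mdls-stack": [],
}
push_situation(state)
# A new cycle replaces the current properties while the old ones are stacked.
state.update({"cycle-nr": 3, "cycle-type": ("while",),
              "bucket": ["DmgNode-12"], "free-modules": []})
depth_while_stacked = len(state["bucket-stack"])
pop_situation(state)
```

Routing every push and pop through one pair of functions is one way to guarantee the invariant that the stacks never differ in depth.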


6. CONCLUSIONS AND RECOMMENDATIONS

This chapter lists some necessary extensions and some recommendations for improvements on the hardware synthesis program.

1. The status of the free-modules property is preliminary: when the program part that produces the state machine is implemented, it will probably be much easier to find the free modules with the aid of trace-nodes and the state machine description than to update the free-modules list each time. N.B.: If new modules are created, all lists in the free-modules-stack (if present) must be updated for these modules.

2. Perhaps the way the module-vectors and net-vectors are used must be changed. At the moment, every time the hardware is changed a symbol is created for each of these vectors, and these symbols are uninterned when they are no longer used. This strategy leads to very short access times to the data, but the interning and uninterning of the symbols costs some overhead time, especially because all vectors are interned and not only those needed. Another possibility is to store the net- and module-data in association lists: the names of the nets and modules serve as the key while searching for the vector needed. This method costs more time due to searching the association lists, but it removes the overhead time. A further advantage can be found in the communication with the cost estimator. This program was initially written for net- and module-data in association lists; for the benefit of the hardware generating process, some special interface functions have been implemented that would then become unnecessary.

3. The criterion that is used to select a module for re-use has to test whether the delay time of the module fits in the time that is left in the current machine cycle. It is also recommended to test for already existing connections between the input signals and the free modules; the number of functions that a module can perform can also be taken into consideration. At the moment, the module with the shortest delay time is selected; then the implementing routine can test whether it fits in the current machine cycle.

4. When input nets must be connected to a non-commutative operator, the order of connection is important for correct operation; but for commutative operations this order may be altered: perhaps it becomes possible to re-use an existing net. At the moment, the order of the input nets is always maintained as it has been described in the algorithm.

5. The clock frequency on the chip that is under development determines the duration of a machine cycle; therefore the number of operations that can be performed in one cycle is limited and dependent on the delay times of the modules used. The machine cycle time can be tested when a state machine description is present; at the moment this is not done.

6. Control signals that are computed for special constructs of the algorithm will be used by the state machine to generate signals for proper operation of the hardware. At the moment, the state machine has not been implemented; the control signals are connected to symbolic control modules and these modules are connected to the control inputs of the (de)multiplexers.

7. If a new machine cycle is started, the signals in the hardware must first be stored in registers. At the moment, each signal is saved in an arbitrary register; it would be better to re-use any existing connections between the operators that produce the signals and the registers.

8. The transformation of the demand graph construct for functions and procedures into hardware has not been implemented yet.
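The current selection rule described in recommendation 3 (take the free module with the shortest delay, then check that it fits in the remaining cycle time) can be sketched as below. The Python function, its parameters and the delay values are hypothetical; the additional criteria proposed in the recommendation are not modelled.

```python
def select_module(free_modules, delay, time_left):
    """Pick the free module with the shortest delay, if it still fits.

    Returns None when there is no free module or when even the fastest
    one does not fit in the time left in the current machine cycle.
    """
    if not free_modules:
        return None
    candidate = min(free_modules, key=lambda name: delay[name])
    return candidate if delay[candidate] <= time_left else None


# Made-up delay times in arbitrary units.
delay = {"mdl-0": 12, "mdl-1": 8, "mdl-2": 20}
free_modules = ["mdl-0", "mdl-1", "mdl-2"]

fits = select_module(free_modules, delay, 10)
too_slow = select_module(free_modules, delay, 5)
```

Extending the key function to a tuple (for example existing connections first, delay second) would be one way to add the extra criteria without changing the rest of the routine.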

REFERENCES

[Aho86] Aho, A.V. and R. Sethi, J.D. Ullman. Compilers: principles, techniques and tools. Reading, Mass.: Addison-Wesley, 1986. Addison-Wesley series in computer science.

[Bell62] Bellman, R.E. and S.E. Dreyfus. Applied dynamic programming. Princeton, N.J.: Princeton University Press, 1962.

[Coho83] Cohoon, J. and S. Sahni. Heuristics for the circuit realization problem. In: Proc. 20th Design Automation Conf., Miami Beach, Fla., 27-29 June 1983. New York: IEEE, 1983. P. 560-566.

[Davi84] Davio, M. Algorithmic aspects of digital system design. Philips J. Res., Vol. 39(1984), p. 206-225.

[Enge88] [...], R.J. van and R. van den Born. [...] for incremental hardware synthesis. Faculty of Electrical Engineering, Eindhoven University of Technology, 1988. EUT Report 88-E-202.

[Kai87] Kaiser, F. and L. Stok, R. van den Born. Design and implementation of a module-library to support the structural synthesis. Faculty of Electrical Engineering, Eindhoven University of Technology, 1988. EUT Report 88-E-187.

[Latt79] Lattin, B. VLSI design methodology: the problem of the 80's for microprocessor design. In: Proc. 16th Design Automation Conf., San Diego, Cal., 25-27 June 1979. New York: IEEE, 1979. P. 548-549.

[Leve75] Leve, G. de. Leergang besliskunde. Deel 7a: Dynamische programmering 1. Amsterdam: Mathematisch Centrum, 1979. MC syllabus, 1.7a.

[Sahn80] Sahni, S. and A. Bhatt. The complexity of design automation problems. In: Proc. 17th Design Automation Conf., Minneapolis, Minn., 23-25 June 1980. New York: IEEE, 1980. P. 402-411.

[Shiv83] Shiva, S.G. Automatic hardware synthesis. Proc. IEEE, Vol. 71(1983), p. 76-87.

[Stok86] Stok, L. and R. van den Born, G.L.J.M. Janssen. Higher levels of a silicon-compiler. Faculty of Electrical Engineering, Eindhoven University of Technology, 1986. EUT Report 86-E-163.

[Thom81] Thomas, D.E. The automatic synthesis of digital systems. Proc. IEEE, Vol. 69(1981), p. 1100-1211.

[Veen85] Veen, A.H. The misconstrued semicolon: reconciling imperative languages and dataflow machines. Ph.D. Thesis. Eindhoven University of Technology, 1985.

[Wins84] Winston, P.H. and B.K.P. Horn. Lisp. 2nd ed. Reading, Mass.: Addison-Wesley, 1984.


APPENDIX 1: DEMAND GRAPH NODE- AND EDGE-TYPES

Properties of a demand graph node (named: DmgNode-<i>):

type              describes the sort of operation the node performs (sink, IO-sink, get, <j>, <arithmetic or logic operation>, merge, branch, entry, exit, link-in-<k>, put, ...)

origin            (simple)

in-edges          a list of names of demand graph edges pointing at the node

out-edges         a list of names of demand graph edges starting from the node

io-in-edges       a list of names of incoming io-chain edges

io-out-edges      a list of names of leaving io-chain edges

name              name of the variable that will be computed by the node, as declared in the algorithm

port              [only for get and put nodes] name of the io-port where the data comes in or leaves

time-out-edges    list of names of time-constraint edges

sel-list          [only for merge and branch nodes] (0 1)

Properties of a demand graph edge (named: DmgEdge-<i>):

type              describes the sort of connection between the nodes (left-source, right-source, last, control, entry, value, inlink-<j>, outlink-<k>, sink, ...)

to-node           the name of the node the edge is pointing at

from-node         the name of the node the edge is starting from

name              the name of the variable flowing through the edge, as declared in the algorithm.


APPENDIX 2: STRUCTURE OF THE MODULE LIBRARY

[Syntax diagrams; the graphical layout is lost in this copy. The rules and elements that remain legible are listed below.]

library :                                   library-module, maintenance-function, access-function, comment

documentation :                             text

class :                                     monadic-logic, monadic-arithmetic, dyadic-logic, dyadic-arithmetic, relational

monadic-arithmetic :                        square, square-root

input-list, output-list, control-list :     terminal-list

name :                                      symbol

length, width, dissipation, delay :         integer-number

Rules whose diagrams are not legible: library-module, behaviour, interface, version, operation, monadic-logic, dyadic-logic, dyadic-arithmetic, relational, terminal-list, terminal, electrical-characteristics.

Element names that appear in these diagrams: documentation, version, operation, logic-table, input-list, output-list, control-list, behaviour, commutativity, control-table, date-of-last-update, technology, electrical-characteristics, boolean-function, complexity.


APPENDIX 3: FORMATS FOR COST ESTIMATOR

The rules hdw-net-vector, hdw-module-vector, net-name and module-name are described in appendix 4.

[Syntax diagrams; the graphical layout is lost in this copy. The rules and elements that remain legible are listed below.]

input-sequence :                    (legible elements) new-net-vecs, new-mod-vecs, old-net-vecs, old-mod-vecs

output-sequence :                   costs, placement, new-cost-vecs

changes :                           extension, deletion, substitution

old-cost-vecs, new-cost-vecs :      a list of module-cost-vector

module-cost-vector :                (legible elements) cst-comp-name, cst-comp-cost-info

cst-comp-name :                     module-name

cst-comp-max-delay :                integer-number

cst-comp-cost-info :                cost-vector

cross-list :                        a list of (net-name used-terminals) entries

used-terminals, cum-mod-area, cum-seq-length, delay, cum-net-area, tracks :     integer-number

Rules whose diagrams are not legible: extension (refers to hdw-net-vector), deletion, delete-pair (refers to net-name), substitution (refers to module-name), new-net-vecs, new-mod-vecs, old-net-vecs, old-mod-vecs (one of these refers to a list of hdw-module-vector), costs (refers to cum-net-area), placement, cost-vector.


APPENDIX 4: SYNTAX OF A STATE

See appendix 3 for the syntax diagrams of costs, placement and new-cost-vecs. See appendix 2 for the syntax diagram of library-module.

[Syntax diagrams; the graphical layout is lost in this copy. The rules and elements that remain legible are listed below.]

state :                                             (legible elements) realised-nodes, next-cycle-nr, cycle-type-stack, bucket-stack, free-mdls-stack

state-id, prev-state, mdl-nr, net-nr :              integer-number

realised-nodes, bucket, trace-invalid :             node-list

mdl-vecs :                                          a list of hdw-module-vector

hdw-net-vector :                                    (legible elements) net-name, net-in-modules

net-in-modules, net-out-modules, free-modules :     module-list

hdw-module-vector :                                 (legible elements) module-name, module-in-nets, module-control-nets, library-module

module-in-nets, module-out-nets, module-control-nets :     net-list

net-name, module-name :                             a name ending in an integer-number (the prefix is not legible in the diagram)

node-name :                                         "DmgNode-" followed by an integer-number

input-conn, output-conn :                           port-list

port-list :                                         a list of (port-name module-name) entries

port-name :                                         character-string

constants :                                         a list of (module-name constant-value) entries

constant-value :                                    integer-number or real-number

use, cycle-nr, case-nr, next-cycle-nr :             integer-number

cycle-type :                                        (legible element) normal

Rules whose diagrams are not legible: net-vecs, module-list, net-list, node-list, trace-nodes, cycle-nr-stack, cycle-type-stack, bucket-stack, free-mdls-stack.

Page 46: Hardware synthesis with the aid of dynamic programming · The demand graph that is constructed is transformed into a hardware deSCription that consists of two parts: a net-list that


