Chapter 1

INTRODUCTION

1.1 MOTIVATION

The major objective of floorplanning/placement is to locate the modules of a circuit on a chip so as to optimize its area and timing. Floorplanning, being the first stage of VLSI physical design, is the phase best suited to early optimization of timing, congestion, and routability. Floorplanning thus has a profound impact on the area, delay, power, and many other design parameters. To ensure an effective and reliable design, careful and accurate floorplanning is necessary. Due to the increase in design complexity, circuit size is getting larger in modern VLSI design. To handle the design complexity, hierarchical design and reuse of IP modules have become popular, which makes floorplanning/placement much more important than ever.

Further, the need to integrate heterogeneous systems or special modules imposes some placement constraints, e.g., the boundary-module constraint, which requires some modules to be placed along the chip boundaries for shorter connections to pads, and the pre-placed-module constraint, which pre-assigns modules to specific positions. These trends make it particularly significant to consider floorplanning/placement with various constraints. To cope with these challenges, it is desirable to develop an efficient and effective floorplan representation that can model the geometric relations among regular as well as constrained modules.

The realization of floorplanning relies on a representation which describes the geometric relations among modules. The representation has a great impact on the feasibility and complexity of floorplan designs. Thus it is of particular significance to develop an efficient, effective, and flexible representation for floorplan designs.

1.2 ORGANIZATION OF THE REPORT

This report is organized as follows. Chapter 1 introduces the thesis work. Chapter 2 formulates the floorplan/placement design problem. Chapter 3 presents the procedures to derive a TCG from a placement and to construct a placement from a TCG. Chapter 4 introduces the operations to perturb a TCG. Experimental results are reported in Chapter 5. Finally, Chapter 6 concludes the work and discusses future research directions.

Chapter 2

LITERATURE SURVEY

2.1 Physical Design

Physical design of a circuit is the phase that precedes the fabrication of a circuit. In most general terms, physical design refers to all the synthesis steps succeeding logic design and preceding fabrication. The performance of the circuit, its area, its yield, and its reliability depend critically on the way the circuit is physically laid out.

In an integrated circuit layout, metal and polysilicon are used to connect two points that are electrically equivalent. Both metal and poly lines introduce wiring impedances. Thus a wire can impede a signal from traveling at a fast speed. The longer the wire, the larger the wiring impedance, and the longer the delays introduced by the wiring impedance. When more than one metal layer is used for layout, there is another source of impedance. If a connection is implemented partly in metal layer 1 and partly in metal layer 2, a via is used at the point of layer change. Similarly, if a connection is implemented partly in polysilicon and partly in metal, a contact becomes necessary to perform the layer change. Contacts and vias introduce a significant amount of impedance, once again contributing to the slowing down of signals.

Layout critically affects the area of a circuit. There are two components to the area of an integrated circuit: the functional area and the wiring area. The area taken up by the active elements in the circuit is known as the functional area.

The wires used to interconnect these functional modules contribute to the wiring area. Wires therefore affect the area of the circuit just as they affect its performance. A good layout should place strongly connected modules close together, so that longer wires are avoided as much as possible. Similarly, a good layout will have as few vias as possible. The area of the circuit in turn has a direct influence on the yield of the manufacturing process. We define yield to be the number of chips that are defect free in a batch of manufactured chips. The larger the chip area, the lower the yield. A low yield would mean high production cost, which in turn would increase the selling cost of the chip.

The reliability of the chip is also influenced by the layout. For instance, vias are sources of unreliability, and a layout which has a large number of vias is more likely to have defects. Further, the width of a metal wire must be chosen appropriately by the layout program to avoid metal migration. If a thin metal wire carries a large current, the excessive current density may cause wearing away of metal, tapering the wire slowly and eventually resulting in an open circuit. Hence floorplanning, placement, and routing are key stages in the layout of any circuit, as they directly determine its performance, yield, and reliability. In this thesis, a study of various approaches for floorplanning is performed.

2.2 Physical Design Cycle

Physical design is the phase that precedes fabrication. The different stages of the physical design cycle are shown in Figure 2.1.

2.2.1 Partitioning

A chip may contain several million transistors. Layout of the entire circuit cannot be handled at once due to the limitation of memory space as well as of the computation power available. Therefore, the circuit is normally partitioned by grouping the components into blocks (sub-circuits/modules). The actual partitioning process considers many factors, such as the size of the blocks, the number of blocks, and the number of interconnections between the blocks. The set of interconnections required is referred to as a netlist. The output of partitioning is a set of blocks and the interconnections required between the blocks.

2.2.2 Floorplanning and Placement

This step is concerned with selecting good layout alternatives for each block, as well as for the entire chip. The area of each block can be estimated after partitioning and is based approximately on the number and the type of components in that block. Floorplanning is a critical step, as it sets up the groundwork for a good layout. However, it is computationally quite hard. During placement, the blocks are exactly positioned on the chip. The goal of placement is to find a minimum-area arrangement for the blocks that allows completion of interconnections between the blocks, while meeting the performance constraints.

Fig.2.1 VLSI Physical design cycle

2.2.3 Routing

The objective of the routing phase is to complete the interconnections between blocks according to the specified netlist. First, the space not occupied by the blocks (called the routing space) is partitioned into rectangular regions called channels and switchboxes. The goal of the router is to complete all circuit connections using the shortest possible wire length and using only the channels and switchboxes. This is usually done in two phases, referred to as the global routing and detailed routing phases. For each wire, the global router finds a list of channels and switchboxes which are to be used as a passageway for that wire. Global routing is followed by detailed routing, which completes point-to-point connections between pins on the blocks. Detailed routing includes channel routing and switchbox routing, and is done for each channel and switchbox.

2.2.4 Compaction

Compaction is simply the task of compressing the layout in all directions such that the total area is reduced. By making the chip smaller, wire lengths are reduced, which in turn reduces the signal delay between components of the circuit. At the same time, a smaller area may imply that more chips can be produced on a wafer, which in turn reduces the cost of manufacturing.

2.2.5 Extraction and Verification

Design Rule Checking is the process which verifies that the layout obeys the design rules imposed by the fabrication process. After the layout has been checked and its design rule violations removed, the functionality of the layout is verified by Circuit Extraction. This is a reverse engineering process that generates the circuit representation from the layout. The extracted description is compared with the circuit description to verify its correctness. This process is called Layout Versus Schematics (LVS) verification. Geometric information is extracted to compute resistance and capacitance. This allows the timing of each component, including interconnect, to be calculated accurately. This process is called Performance Verification.

Physical design is iterative in nature, and many steps such as global routing and channel routing are repeated several times to obtain a better layout. In addition, the results obtained in a step depend on the quality of the solution obtained in earlier steps. Finally, after meeting all the specifications of the physical design cycle, the design is sent for fabrication.

2.3 FLOORPLANNING

Floorplanning is a major step in the physical design cycle of VLSI circuits. It is the step that plans the positions and the shapes of the top-level blocks of a hierarchical design. The floorplan is a physical description of an ASIC. The traditional floorplanning problem takes as input a set of modules (blocks), their widths and heights, and the interconnections between them. It tries to find a floorplan such that the total area, delay, and power are minimized. The input to a floorplanning step is a hierarchical netlist that describes the interconnection of the blocks (RAM, ROM, ALU, cache controller, and so on), the logic cells (NAND, NOR, D flip-flop, and so on) within the blocks, and the logic cell connectors (the terms terminals, pins, or ports mean the same thing as connectors). The netlist is a logical description of the ASIC; we now have to set aside spaces (channels) for interconnect and arrange the cells. Floorplanning is thus a mapping between the logical description (the netlist) and the physical description (the floorplan).

Fig.2.2 Typical Floor Plan of a chip

2.3.1 Floorplanning Goals and Objectives

Floorplanning being the first stage of VLSI Physical Design is the most suited for early optimization of timing, congestion and routability. Floorplanning follows the system partitioning step and is the first step in arranging circuit blocks on an ASIC. There are many factors to be considered during floorplanning: minimizing connection length and signal delay between blocks; arranging fixed blocks and reshaping flexible blocks to occupy the minimum die area; organizing the interconnect areas between blocks; planning the power, clock, and I/O distribution. The criterion for optimization may be minimum interconnect area, minimum total interconnect length, or performance.

The goals of floorplanning are to

arrange the blocks on a chip,

decide the location of the I/O pads,

decide the location and number of the power pads,

decide the type of power distribution, and

decide the location and type of clock distribution.

The objectives of floorplanning are to:

minimize delay

minimize the chip area

2.3.2 Floorplan Problem Definition

The major objective of floorplanning/placement is to locate the modules of a circuit on a chip so as to optimize its area and timing. The input to the floorplanning algorithm is a circuit C(M, N), where M is the set of modules in the floorplanning system and N is the set of nets defining the connectivity among these modules. Modules can be soft modules or hard modules. A soft module is a module whose width and height can be changed as long as the aspect ratio is within a given range and the area is as given. A hard module is a module whose width and height are fixed.
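The soft-module definition above can be made concrete with a small sketch. The class name `SoftModule` and its fields are illustrative assumptions, not part of the report; the aspect ratio is taken as height/width, as in the shape constraint of Section 2.3.3.

```python
import math
from dataclasses import dataclass


@dataclass
class SoftModule:
    """Hypothetical soft module: fixed area, aspect ratio (h/w) within a range."""
    name: str
    area: float
    min_aspect: float  # lower bound on height / width
    max_aspect: float  # upper bound on height / width

    def shape_for(self, aspect: float) -> tuple[float, float]:
        """Return (width, height) with height / width == aspect and the fixed area."""
        if not (self.min_aspect <= aspect <= self.max_aspect):
            raise ValueError("aspect ratio outside the allowed range")
        width = math.sqrt(self.area / aspect)
        return width, self.area / width

    def width_range(self) -> tuple[float, float]:
        """Narrowest and widest legal widths (tallest and flattest shapes)."""
        return (math.sqrt(self.area / self.max_aspect),
                math.sqrt(self.area / self.min_aspect))
```

A hard module corresponds to the degenerate case in which only one shape (and its rotation) is allowed.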

2.3.3 User Defined Constraints

i) Shape constraint: Since we do not want to lay out chips as long strips, there should be bounds on the aspect ratio of each block: ri < hi/wi < si, where hi/wi is called the aspect ratio of block i. For hard blocks, only the orientation can be changed.

ii) Capacity constraint: The chip is divided into several bins by super-imposing a grid. Each bin boundary has a capacity (maximum number of nets that can cross it) associated with it. The objective is to minimize the capacity violation on these bins.

iii) Timing constraints: Based on the given clock speed, the goal is to meet the critical delay for the longest paths (between two flip-flop boundaries). This delay may be modeled as the sum of the delays of the nets and gates on the critical path. Given the input specification, the objective is to find a floorplan which best meets the given constraints.

iv) Overlap constraints:

Prevent any two blocks from overlapping

v) Routability constraints:

Estimate the routing area required between the blocks

2.3.4 Wirelength estimation

The exact wire length of each net is not known until routing is done. In floorplanning, even the pin positions are not known. The process of identifying pin locations is called pin assignment. Possible wire length estimation methods are center-to-center estimation, half of the perimeter of the rectangle enclosing all terminals in a net, and the minimum rectilinear spanning/Steiner tree.

(a) Center-to-center estimation (b) Half of the perimeter

Fig.2.3 Wire Length Estimation
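The first two estimation methods can be sketched as follows. This is an illustration only: the function names are hypothetical, modules are assumed to be `(x, y, width, height)` tuples, and the center-to-center distance is assumed to be measured in the Manhattan (rectilinear) metric.

```python
def center_to_center(a, b):
    """Manhattan distance between the centers of modules a and b.

    Each module is an (x, y, width, height) tuple (an assumed format).
    """
    ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2
    return abs(ax - bx) + abs(ay - by)


def half_perimeter(terminals):
    """Half the perimeter of the bounding box of a net's terminals.

    terminals is a list of (x, y) points.
    """
    xs = [x for x, _ in terminals]
    ys = [y for _, y in terminals]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))
```

For example, `half_perimeter([(0, 0), (3, 4), (1, 1)])` evaluates the box spanning (0, 0) to (3, 4), giving 3 + 4 = 7.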

2.3.5 Dead space

Dead space is the space that is wasted; minimizing area is the same as minimizing dead space. With A the area of the floorplan and A_i the area of module i, the dead space percentage is computed as

\[ \frac{A - \sum_i A_i}{\sum_i A_i} \times 100\% \]
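As a quick numerical check, the expression can be evaluated directly. The function name is hypothetical, and the normalization by the total module area follows the expression as it appears in this report.

```python
def dead_space_percent(floorplan_area, module_areas):
    """Wasted area as a percentage of the total module area."""
    used = sum(module_areas)
    return (floorplan_area - used) / used * 100.0
```

For a floorplan of area 120 holding modules of area 40, 30, and 30, the wasted area is 20 against a module area of 100, so the function returns 20.0.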

2.4 Approaches to Floorplanning

Several approaches have been reported to tackle the floorplanning problem. The reported approaches belong to three general classes:

Constructive

Iterative and

Knowledge based.

The constructive algorithms attempt to build a feasible solution by starting from a seed module; then other modules are selected one (or a group) at a time and added to the partial floorplan. This process continues until all the modules have been selected. Among the approaches that fall into this class are cluster growth, partitioning and slicing, connectivity clustering, the geometric approach, mathematical programming, and rectangular dualization.

The iterative techniques start from an initial floorplan. This floorplan then undergoes a series of perturbations until a feasible floorplan is obtained or no more improvements can be achieved. Typical iterative techniques which have been successfully applied to floorplanning are simulated annealing and genetic algorithms.

The knowledge-based approach has been applied to several design automation problems including cell generation and layout, circuit extraction, routing, and floorplanning. In this approach, a knowledge expert system is implemented which consists of three basic elements: (a) a knowledge base that contains data describing the floorplan problem and its current state, (b) rules stating how to manipulate the data in the knowledge base in order to progress toward a solution, and (c) an inference engine controlling the application of the rules to the knowledge base.

2.5 Floorplan structures

The geometrical relationship among the blocks is commonly specified by a rectangular dissection of the floorplan region. The floorplan region is first dissected into rectangular rooms and each block is then mapped to a different room. In order to restrict the size of the solution space, three different ways of dissection are proposed. The corresponding floorplanning structures are called slicing, mosaic, and general floorplans. A slicing floorplan is a rectangular dissection that can be obtained by recursively cutting a rectangle horizontally or vertically into two smaller rectangles. Otherwise it is a non-slicing floorplan, as shown in Fig. 2.4.

(a) Slicing floorplan (b) Non-slicing floorplan

Fig.2.4 Floorplan structures

2.6 Floorplan Representation

A floorplan representation is used to represent the geometrical relationships among the blocks. The floorplan representation is perturbed repeatedly by stochastic techniques to search for a good floorplan. The run time and the quality of the solutions depend strongly on the size of the solution space, i.e., the number of possible representations.

2.6.1 Slicing Floorplans

2.6.1.1 Slicing Tree

The first proposed slicing floorplan representation uses a binary tree called a slicing tree. Each leaf of the slicing tree corresponds to a block, and each internal node represents a vertical or horizontal merge operation on its two descendants. One slicing floorplan may correspond to more than one slicing tree; this redundancy in the slicing tree was identified later.

(a) Slicing floorplan (b) slicing tree

Fig.2.5 Slicing floorplan

2.6.1.2 Polish Expression (PE)

A Polish expression is a string of symbols obtained by traversing a binary tree in post-order; it is used to represent a slicing floorplan. The left child of a V-cut in the tree represents the left slice in the floorplan. The left child of an H-cut in the tree represents the top slice in the floorplan. Example:

Fig.2.6 Polish Expression

Problems with PE: there are multiple representations for some slicing floorplans (when more than one cut in the same direction cuts a floorplan), which leads to a larger solution space.
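A Polish expression can be evaluated with a stack, in the same way as any postfix expression. The sketch below computes only the bounding box of the floorplan; the tokens `V` and `H` for the cut operators and the `sizes` dictionary are notational assumptions made for this illustration.

```python
def polish_bbox(expr, sizes):
    """Bounding box (width, height) of the slicing floorplan given by expr.

    expr  : sequence of tokens in post-order; 'V' and 'H' are cut operators,
            every other token is a block name.
    sizes : dict mapping block name -> (width, height).
    """
    stack = []
    for tok in expr:
        if tok in ("V", "H"):
            w_right, h_right = stack.pop()   # right child (right / bottom slice)
            w_left, h_left = stack.pop()     # left child (left / top slice)
            if tok == "V":                   # slices sit side by side
                stack.append((w_left + w_right, max(h_left, h_right)))
            else:                            # slices are stacked vertically
                stack.append((max(w_left, w_right), h_left + h_right))
        else:
            stack.append(sizes[tok])
    if len(stack) != 1:
        raise ValueError("not a valid Polish expression")
    return stack[0]
```

With blocks a (2 x 3), b (1 x 3), and c (3 x 1), the expression `abVcH` places a to the left of b and that pair above c, giving a 3 x 4 bounding box.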

2.6.2 Floorplan Representations in non slicing structure

2.6.2.1 Sequence Pair (SP)

A sequence pair for a set of modules is a pair of sequences of the module names. A sequence pair imposes a horizontal or vertical constraint on every pair of modules, as in the following example:

(ab, ab) => a should be placed to the left of b

(ba, ab) => a should be placed below b

SP1 = (ABCDFE, FADEBC), SP2 = (ABCDFE, FADBEC)

Fig.2.7 Sequence Pair

SP can handle non-slicing structures and is very flexible in representation. However, it is time-consuming, since the solution space is large, and it is harder to transform between a sequence pair and a placement. Moreover, a sequence pair cannot handle soft modules directly.
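The two rules in the example extend to all four orderings of a pair of modules. The following sketch (the function name is hypothetical) reads the relation of module a to module b off a sequence pair:

```python
def sp_relation(first, second, a, b):
    """Position of module a relative to module b under the pair (first, second).

    Returns 'left', 'right', 'below', or 'above'.
    """
    a_before_b_first = first.index(a) < first.index(b)
    a_before_b_second = second.index(a) < second.index(b)
    if a_before_b_first and a_before_b_second:
        return "left"            # (..a..b.., ..a..b..)
    if not a_before_b_first and not a_before_b_second:
        return "right"           # (..b..a.., ..b..a..)
    if a_before_b_second:
        return "below"           # (..b..a.., ..a..b..)
    return "above"               # (..a..b.., ..b..a..)
```

For SP1 = (ABCDFE, FADEBC), `sp_relation("ABCDFE", "FADEBC", "A", "B")` gives `"left"`, and module F comes out below module A because F follows A in the first sequence but precedes it in the second.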

2.6.2.2 Bounded-Sliceline Grid (BSG)

In the bounded-sliceline grid (BSG) representation, blocks are randomly placed in a special n-by-n grid; that is, modules are assigned to n x n rooms. Edge weights of the horizontal (vertical) constraint graph Gh (Gv) denote the widths (heights) of modules. The corresponding size of the solution space is even larger than that of SP. The huge solution spaces of SP and BSG restrict the applicability of these representations to large floorplan problems.

Fig.2.8 Bounded-Sliceline Grid

2.6.2.3 O-Tree

An O-tree is a rooted directed tree in which the order of the sub trees T1... Tm is important. The order of the sub trees T1... Tm determines the DFS order when we traverse the tree. To encode a rooted ordered tree with n nodes, we need a 2(n-1)-bit string T to identify the branching structure of tree, and a permutation p as the labels of n nodes. The bit string T is a realization of the tree structure. We write a 0 for a traversal which descends an edge and a 1 when it subsequently ascends that edge in tree. The permutation p is the label sequence when we traverse the tree in depth-first search order. The first element in permutation p is the root of tree.

The following example demonstrates the encoding of an 8-node rooted ordered tree. Given the 8-node tree shown in Fig. 2.9, its root node has three subtrees rooted at a, b, and c. We can represent it by (00110100011011, adbcegf). Starting from the root, we visit node a first and record a bit 0 to T and a label a to p. Then we visit node d and record a bit 0 to T and a label d to p. On the way back to the root from nodes d and a, we record two bits 11 to T. Then we visit the subtrees b and c in sequence, and record the remainder of T and p respectively. The length of the bit string T is 14, i.e., 2(n-1) for n = 8.
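The encoding procedure is a plain depth-first traversal, as the sketch below shows. The tree shape used in the usage note is decoded from the bit string and label sequence quoted above and is assumed to match the figure; the function name and the child-list dictionary are illustrative choices.

```python
def encode_otree(children, root):
    """Encode a rooted ordered tree as (bit string T, label sequence p).

    children maps a node to the ordered list of its children.
    A 0 is written when an edge is descended, a 1 when it is ascended.
    """
    bits, labels = [], []

    def visit(node):
        for child in children.get(node, []):
            bits.append("0")          # descend the edge to child
            labels.append(child)
            visit(child)
            bits.append("1")          # ascend the same edge
    visit(root)
    return "".join(bits), "".join(labels)
```

Calling `encode_otree({"root": ["a", "b", "c"], "a": ["d"], "c": ["e", "f"], "e": ["g"]}, "root")` reproduces the pair (00110100011011, adbcegf); each of the seven edges contributes two bits.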

Fig.2.9 O-Tree and placement

The solution space of the O-tree is smaller, the transformation between the representation and a placement takes only linear time, and an O-tree can be encoded with fewer bits than a sequence pair or a BSG. However, the O-tree is less flexible in representation than BSG and sequence pair. Its tree structure is irregular and thus harder to implement, it needs to encode and operate on a module sequence, and it needs to transform between the tree and its placement during processing. The inserting positions are limited, so the search might deviate from the optimum during solution perturbation.

2.6.2.4 B*-tree

A B*-tree is an ordered binary tree whose root corresponds to the module on the bottom-left corner. Similar to the DFS procedure, we construct the B*-tree for an admissible placement p in a recursive fashion. We make n0 the root of tree since b0 is on the bottom-left corner. Constructing the left sub tree of n0 recursively, we make n7 the left child of n0. Since the left child of n7 does not exist, we then construct the right sub tree of n7 (which is rooted by n8). The construction is recursively performed in the DFS order. After completing the left sub tree of n0, the same procedure applies to the right sub tree of n0.

Fig.2.10 B*-Tree and placement

The binary-tree-based representation is efficient and flexible in dealing with hard, pre-placed, soft, and rectilinear modules, and the encoding cost of a B*-tree is small. Except when handling soft modules, a tree can be transformed to its placement during processing in only linear time. A B*-tree can evaluate the area cost incrementally, and its solution space is smaller.

2.6.2.5 Corner Block List (CBL)

The corner block list is constructed from the record of a recursive corner block deletion. For each block deletion, we keep a record of the block name, the corner block orientation, and the number of T-junctions uncovered. At the end of the deletion iterations, we concatenate the data of these three items in reverse order. Thus, we have a sequence S of block names, a list L of orientations, and a list T of T-junction information. The triple (S, L, T) is called a corner block list. We use the floorplan of Figure 2.11 as an example.

Fig.2.11 CBL and placement

First, block d is deleted. Block d is vertically oriented and there is one T-junction attached at the bottom edge of block d. Blocks a, b, g, e, c, and f are then deleted successively. We concatenate these records in the reverse order of deletion and derive a corner block list (S, L, T), where S = (fcegbad), L = (001100), and T = (001010010).

CBL, a new effective representation for non-slicing floorplans, has the same computing complexity as the binary tree of the slicing structure. However, it can represent not only all floorplans with slicing structure but also non-slicing floorplans. The time complexity of CBL is much lower than that of other non-slicing representations such as SP and BSG. CBL has almost the same time and space complexity as O-tree; however, it is better suited for floorplan optimization with various size configurations of each block.

CHAPTER 3

TRANSITIVE CLOSURE GRAPH

3.1 Introduction

The transitive closure graph (TCG) is a graph-based representation for general floorplans. TCG uses a horizontal and a vertical transitive closure graph to describe the horizontal and vertical relations for each pair of modules.

It combines the advantages of sequence pair, BSG, and B*-tree. Like sequence pair and BSG, but unlike O-tree, B*-tree, and CBL, TCG satisfies the four properties of P-admissibility: (1) its solution space is finite, (2) it guarantees a unique feasible packing for each representation, (3) packing and cost evaluation can be performed in O(m^2) time, and (4) the best evaluated packing in the solution space corresponds to an optimum placement. Like B*-tree, but unlike sequence pair, BSG, O-tree, and CBL, TCG does not need to construct additional constraint graphs for the cost evaluation during packing, implying faster runtime. Further, TCG supports incremental update during operations, and keeps the information of boundary modules as well as the shapes and the relative positions of modules in the representation. More importantly, the geometric relation among modules is transparent not only to the TCG representation but also to its operations (i.e., the effect of an operation on the change of the geometric relation is known before packing), facilitating faster convergence to a desired solution. All these properties make TCG an effective and flexible representation for handling general floorplan/placement design problems with various constraints such as boundary constraints. The runtime requirements of TCG are also much smaller than those of O-tree, enhanced O-tree, and B*-tree.

3.2 From a Placement to Its TCG

For two non-overlapping modules bi and bj, bi is said to be horizontally (vertically) related to bj, denoted by bi ⊢ bj (bi ⊥ bj), if bi is on the left (bottom) side of bj and their projections on the y (x) axis overlap. For two non-overlapping modules bi and bj, bi is said to be diagonally related to bj if bi is on the left side of bj and their projections on the x and the y axes do not overlap. In a placement, every two modules must bear one of the three relations: horizontal, vertical, or diagonal. To simplify the operations on geometric relations, we treat a diagonal relation for modules bi and bj as a horizontal one, unless there exists a chain of vertical relations from bi (bj), followed by the modules enclosed within the rectangle defined by the two closest corners of bi and bj, and finally to bj (bi), for which we make bi ⊥ bj (bj ⊥ bi). Figure 3.1 shows a placement and its TCG (Ch, Cv).

a) Floorplan b) Horizontal closure graph c) Vertical closure graph

Fig.3.1 floorplan to TCG
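The three basic relations can be tested directly on module coordinates. The sketch below classifies only the raw geometric relation of an ordered pair; it does not apply the rule that converts diagonal relations into horizontal or vertical ones. The function names and the `(x, y, width, height)` module format are assumptions made for illustration.

```python
def _overlap(lo1, hi1, lo2, hi2):
    """True if the open intervals (lo1, hi1) and (lo2, hi2) intersect."""
    return lo1 < hi2 and lo2 < hi1


def relation(bi, bj):
    """Relation of bi to bj for modules given as (x, y, width, height).

    Returns 'horizontal' if bi is left of bj with overlapping y projections,
    'vertical' if bi is below bj with overlapping x projections,
    'diagonal' if bi is left of bj and neither projection overlaps,
    and None if the relation holds in the other direction (bj to bi).
    """
    xi, yi, wi, hi = bi
    xj, yj, wj, hj = bj
    x_overlap = _overlap(xi, xi + wi, xj, xj + wj)
    y_overlap = _overlap(yi, yi + hi, yj, yj + hj)
    if x_overlap and y_overlap:
        raise ValueError("modules overlap")
    if y_overlap:
        return "horizontal" if xi < xj else None
    if x_overlap:
        return "vertical" if yi < yj else None
    return "diagonal" if xi < xj else None
```

A horizontal relation would contribute an edge (ni, nj) to Ch and a vertical one an edge to Cv.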

3.3 From a TCG to its placement

Given a TCG, its corresponding placement can be obtained by performing a well-known longest path algorithm, the Bellman-Ford algorithm, on the TCG. To facilitate the implementation of the longest path algorithm, we augment the two given closure graphs as follows. We introduce two special nodes with zero weights for each closure graph, the source ns and the sink nt, and construct an edge from ns to each node with in-degree equal to zero, and also from each node with out-degree equal to zero to nt. Figure 3.2 shows the augmented Ch and Cv for the TCG shown in Figures 3.1(b) and 3.1(c).

(a) Augmented Ch (b) Augmented Cv

Fig.3.2 Augmented TCG.

3.3.1 The Bellman-Ford Algorithm

Pseudocode

for (i ← 1; i ≤ n; i ← i + 1)
    Xi ← −∞;
X0 ← 0;    (V0 is the source node)
Count ← 0;
S1 ← {V0};
S2 ← ∅;
while (Count ≤ n && S1 ≠ ∅)
{
    for each Vi ∈ S1
        for each Vj such that (Vi, Vj) ∈ E
            if (Xj < Xi + dij)
            {
                Xj ← Xi + dij;
                S2 ← S2 ∪ {Vj}
            }
    S1 ← S2;
    S2 ← ∅;
    Count ← Count + 1;
}
if (Count > n)
    Error (positive cycle);

Let Lh(ni) (Lv(ni)) be the length of the longest path from ns to ni in the augmented Ch (Cv). Lh(ni) (Lv(ni)) can be determined by performing the single-source longest path algorithm on the augmented Ch (Cv). The coordinate (xi, yi) of a module bi is given by (Lh(ni), Lv(ni)). Since the respective width and height of the placement for the given TCG are Lh(nt) and Lv(nt), the area of the placement is given by Lh(nt) × Lv(nt).
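The packing step can be sketched in Python as below. This is an illustration under stated assumptions rather than the implementation used in this work: graphs are dictionaries from a node to its successors, the node names `ns` and `nt` are reserved for the source and sink, and the length d_ij of an edge (ni, nj) is taken to be the weight of ni, so that the longest path to a node equals the coordinate of the corresponding module.

```python
def longest_paths(weights, edges):
    """Longest path length from the source ns to every node.

    weights : node -> module width (for Ch) or height (for Cv)
    edges   : node -> list of successor nodes
    """
    nodes = list(weights)
    succ = {n: list(edges.get(n, [])) for n in nodes}
    has_pred = {v for vs in succ.values() for v in vs}
    # Augment the graph: ns feeds every in-degree-0 node,
    # every out-degree-0 node feeds nt.
    succ["ns"] = [n for n in nodes if n not in has_pred]
    for n in nodes:
        if not succ[n]:
            succ[n] = ["nt"]
    succ["nt"] = []
    w = {**weights, "ns": 0, "nt": 0}

    dist = {n: float("-inf") for n in succ}
    dist["ns"] = 0
    frontier, rounds = {"ns"}, 0
    while frontier and rounds <= len(succ):
        updated = set()
        for u in frontier:
            for v in succ[u]:
                if dist[v] < dist[u] + w[u]:   # relax edge (u, v)
                    dist[v] = dist[u] + w[u]
                    updated.add(v)
        frontier, rounds = updated, rounds + 1
    if frontier:
        raise ValueError("positive cycle")
    return dist


def pack(widths, heights, ch, cv):
    """Return ({module: (x, y)}, area) for the TCG (ch, cv)."""
    lh = longest_paths(widths, ch)
    lv = longest_paths(heights, cv)
    coords = {n: (lh[n], lv[n]) for n in widths}
    return coords, lh["nt"] * lv["nt"]
```

For three modules a (2 x 2), b (3 x 1), and c (2 x 2) with `ch = {"a": ["b"]}` and `cv = {"a": ["c"], "b": ["c"]}`, module b lands to the right of a and c on top of a, in a 5 x 4 bounding box.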

3.4 Floorplanning Algorithm

A simulated-annealing-based algorithm is used for floorplanning. Given an initial solution represented by a TCG, the algorithm perturbs the TCG to obtain a new TCG.

3.4.1 Solution Perturbation

Four operations can be applied to perturb a TCG to obtain a new TCG

3.4.1.1 Rotate

To rotate a module bi, we only need to exchange the weights of the corresponding node ni in Ch and Cv. Figure 3.3(b) shows the resulting Ch, Cv, and placement after rotating the module d shown in Figure 3.3(a). The weights associated with the node nd in Ch and Cv have been exchanged.

Fig 3.3(a) initial configuration of TCG

Fig 3.3 (b) rotate module d

3.4.1.2 Swap

To swap two nodes ni and nj, we only need to exchange the two nodes in both Ch and Cv. Figure 3.4(b) shows the resulting Ch, Cv, and placement after swapping the nodes na and nb shown in Figure 3.4(a). Notice that the nodes na and nb in both Ch and Cv have been exchanged.

Fig 3.4(a) TCG before swap

Fig 3.4 (b) swap na and nb
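Rotate and Swap touch only node weights and node labels, which the sketch below makes explicit. The `TCG` container, with weights stored per module and each graph stored as a set of directed edges, is an assumed data layout chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class TCG:
    """Hypothetical container: node weights plus edge sets of Ch and Cv."""
    width: dict   # node -> weight in Ch
    height: dict  # node -> weight in Cv
    ch: set       # directed edges (u, v) of the horizontal graph
    cv: set       # directed edges (u, v) of the vertical graph


def rotate(tcg, n):
    """Exchange the weights of node n in Ch and Cv."""
    tcg.width[n], tcg.height[n] = tcg.height[n], tcg.width[n]


def swap(tcg, ni, nj):
    """Exchange nodes ni and nj in both graphs; the edge structure is unchanged."""
    def relabel(n):
        return nj if n == ni else ni if n == nj else n
    tcg.ch = {(relabel(u), relabel(v)) for u, v in tcg.ch}
    tcg.cv = {(relabel(u), relabel(v)) for u, v in tcg.cv}
```

Neither operation adds or removes edges, so both graphs remain transitive closure graphs after the change.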

3.4.1.3 Reverse

An edge (ni, nj) is a reduction edge if there is no other path from ni to nj apart from the edge itself. The Reverse operation reverses the direction of a reduction edge (ni, nj) in a transitive closure graph, which corresponds to changing the geometric relation of the two modules bi and bj. For two modules bi and bj, bi ⊢ bj (bi ⊥ bj) if there exists a reduction edge (ni, nj) in Ch (Cv); after reversing the edge (ni, nj), we have the new geometric relation bj ⊢ bi (bj ⊥ bi). Therefore, the geometric relation among modules is transparent not only to the TCG representation but also to the Reverse operation (i.e., the effect of such an operation on the change of the geometric relation is known before packing); this property can facilitate the convergence to a desired solution.

Fig 3.5(a) TCG before reverse

Fig 3.5 (b) reverse (nc, ne)

3.4.1.4 Move

The Move operation moves a reduction edge (ni, nj) in one transitive closure graph to the other, which corresponds to switching the geometric relation of the two modules bi and bj between a horizontal relation and a vertical one. For two modules bi and bj, bi ⊢ bj (bi ⊥ bj) if there exists a reduction edge (ni, nj) in Ch (Cv); after moving the edge (ni, nj) to Cv (Ch), we have the new geometric relation bi ⊥ bj (bi ⊢ bj). Therefore, the geometric relation among modules is also transparent to the Move operation.

Fig 3.6(a) TCG before move

Fig 3.6(b) move (nb, ne)

3.5 Simulated Annealing physical model

Annealing is a process in which a material is slowly cooled, allowing the molecules to arrange themselves in such a way that the material is less strained, thereby making it more stable. If materials such as glass or metal are cooled too quickly, their constituent molecules will be under high stress, leading to failure (breaking) if further thermal or physical shocks are encountered. Slowing the cooling of the material allows each molecule to move into a position of lower stress. While the material is kept at a high temperature, the molecules are able to move around quite freely, thus reducing stress on a large scale; indeed, if the material is made too hot it will move into the liquid state, allowing free movement of the molecules. As the material is cooled, the molecules are not able to move around as freely but still move limited distances, reducing stress in local regions. The result is a material with significantly less internal stress that is resistant to failure due to external shock. If one equates molecules to components and the substance to the overall design of an electronic circuit, simulated annealing can be applied to efficiently place the system onto the target die.

Figure 3.7 Molecules Movements per Temperature Region

3.5.1 Simulated annealing

Simulated annealing is a generic probabilistic search algorithm for finding a good approximation of the global optimum of a given objective function in a large discrete search space. During each search step the annealing algorithm considers replacing the current solution by a randomly selected "neighbor" solution. The neighbor is accepted with a probability that depends on the difference between the corresponding objective function values and on a global control parameter T, typically referred to as the temperature. Starting from a high value, T is gradually decreased during the search such that the current solution changes almost randomly when T is large, but the moves become increasingly biased towards better solutions as T approaches zero. The possibility of uphill moves for larger values of T makes it probable that the search climbs out of local minima.
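A generic annealing loop can be sketched as follows. The acceptance rule exp(-delta / T) (the Metropolis criterion), the geometric cooling schedule, and all parameter names and default values are assumptions made for this illustration; the report does not fix them.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor, t_start=100.0, t_end=1e-3,
                        alpha=0.95, moves_per_t=50, rng=None):
    """Minimize cost(); returns (best solution found, its cost).

    neighbor(solution, rng) must return a randomly perturbed copy.
    """
    rng = rng or random.Random(0)
    current, cur_cost = initial, cost(initial)
    best, best_cost = current, cur_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            cand = neighbor(current, rng)
            delta = cost(cand) - cur_cost
            # Downhill moves are always taken; uphill moves are taken
            # with a probability that shrinks as T decreases.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, cur_cost = cand, cur_cost + delta
                if cur_cost < best_cost:
                    best, best_cost = current, cur_cost
        t *= alpha  # geometric cooling
    return best, best_cost
```

In the floorplanning setting, `initial` would be a TCG, `neighbor` would apply one of the perturbation operations of Section 3.4.1, and `cost` would pack the TCG and evaluate the resulting placement.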

Figure 3.8: Series of Neighboring Solutions Containing a Local Minimum

3.5.2 Application to Combinatorial Optimization

Simulated Annealing (SA) is a stochastic algorithm. Just as a Genetic Algorithm models evolution as a way to select an optimal solution, Simulated Annealing looks to the model of molecules in a heated mass and the way they behave as they cool to form a structured solid. The aim of the algorithm is to reduce the energy of the system through slow cooling. As applied to placement, system energy is measured by the inefficiency (cost) of the placement; a poor placement corresponds to a system with higher energy. The analogy runs from molecules in the cooling mass to components in the placement: a quickly cooled mass is fragile, just as a poorly placed design is inefficient, because the molecules and components, respectively, are arranged in such a way that they experience internal tensions as they try to move to regions that would lower their energy. After the material is cooled, the molecules (components) are frozen in these non-optimal positions, resulting in overall fragility of the system.

3.5.3 SA & Placement

VLSI placement in general consists of placing rectilinear components onto a rectangular or square die area in such a way that the interconnect wire length is minimized. In general, components are free to move to any location on the die, and the interconnect wire length is calculated by measuring and summing the length of wire used to connect each net, i.e., each connected group of ports.

3.5.4 Cost Function

The initial placement may simply be a random placement. The cost function usually comprises several parameters, each measuring a different aspect of the current solution. A very simple and widely used cost function parameter is the interconnect wire length of a placement solution, which can be easily approximated using the bounding box method. This estimation method draws a bounding box around all ports in a given net; half the perimeter of this box is taken as the approximate interconnect length of the net. The half-perimeter wire length (HPWL) estimate is exact for minimally routed two- and three-port nets. The sum of the HPWL over all nets approximates the interconnect wire length of a placement solution. Another widely used cost function parameter is component overlap: as design rules do not allow components to overlap each other, any overlap is penalized, and the penalty factor is added directly to the wire length approximation. A commonly used objective function is a weighted sum of area and wire length.

$\mathrm{cost} = \alpha A + \beta L$, where $A$ is the total area of the packing, $L$ is the total wire length, and $\alpha$ and $\beta$ are constants.
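The half-perimeter estimate and the weighted objective can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: the function names, the representation of a net as a list of (x, y) port coordinates, and the default weights are all assumptions made for the example.

```python
def hpwl(net):
    """Half-perimeter of the bounding box around a net's port coordinates."""
    xs = [x for x, _ in net]
    ys = [y for _, y in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def placement_cost(area, nets, alpha=1.0, beta=1.0):
    """Weighted sum of packing area and total HPWL (alpha, beta are assumed weights)."""
    wirelength = sum(hpwl(net) for net in nets)
    return alpha * area + beta * wirelength


# Two hypothetical nets: a two-port net and a three-port net.
nets = [[(0, 0), (3, 4)], [(1, 1), (2, 5), (4, 2)]]
total = placement_cost(area=20.0, nets=nets, alpha=0.5, beta=1.0)
```

An overlap penalty, if used, would simply be one more weighted term added inside `placement_cost`.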

The simplest way to generate a new placement is to move one random component from one position to another random position; another fairly simple change is to swap the positions of two random components. Changing a component's orientation will move its ports, resulting in small changes to the interconnect lengths of the nets of which the component is a member. This type of move is usually only performed when no other type of perturbation yields a new solution. The key to applying Simulated Annealing to placement is the cooling schedule that the algorithm follows. Because the algorithm's greediness is inversely related to the system's temperature, moves that increase a placement's cost may be accepted while the temperature is high. Enough time must be spent at the upper and lower temperatures to allow first a quick arrangement of the system and then a final localized arrangement, respectively. If too much time is spent at the upper temperatures, processing time is wasted because many inefficient intermediate solutions are accepted; if too much time is spent at the lower temperatures, processing time is again wasted because of the tight restriction on accepted moves. The algorithm is summarized in Figure 3.9.

3.5.5 Simulated annealing algorithm

Algorithm Simulated Annealing
Begin
    Temp = Initial_Temp
    Current placement = Initial placement
    While (Temp > Final_Temp) do
    {
        No of iterations = 0
        While (No of iterations < max_iterations) do
        {
            New placement = Rand Move (Current placement)
            c = COST (New placement) - COST (Current placement)
            If (c < 0) then
                Current placement = New placement
            Else if (exp (-c/Temp) > Random (0, 1)) then
                Current placement = New placement
            No of iterations = No of iterations + 1
        }
        Temp = Schedule (Temp)
    }
End

Figure 3.9 Simulated Annealing Placement Flow Chart
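The flow of Figure 3.9 can be sketched as a small, generic Python routine. This is a minimal sketch under stated assumptions: the `cost` and `neighbor` callables, the parameter names and their default values, the geometric temperature update, and the bookkeeping of the best solution seen so far are all illustrative choices rather than details of the implementation described in this report. The toy problem (ordering four modules on a line to shorten hypothetical two-pin nets) only exercises the loop.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor, t_initial=100.0, t_final=0.1,
                        ratio=0.95, moves_per_temp=50, rng=None):
    """Generic annealing loop: always accept downhill moves, sometimes uphill ones."""
    rng = rng or random.Random(0)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t_initial
    while t > t_final:
        for _ in range(moves_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Accept if cheaper, or with probability exp(-delta / t) otherwise.
            if delta < 0 or math.exp(-delta / t) > rng.random():
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= ratio  # assumed geometric cooling step
    return best, best_cost


# Toy example: order modules on a line so that connected modules are close.
toy_nets = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]


def line_cost(order):
    pos = {m: k for k, m in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in toy_nets)


def swap_two(order, rng):
    i, j = rng.sample(range(len(order)), 2)
    new = list(order)
    new[i], new[j] = new[j], new[i]
    return new


start = ["a", "c", "b", "d"]
best, best_cost = simulated_annealing(start, line_cost, swap_two)
```

Keeping the best solution separately costs little and guards against the final accepted state being worse than an earlier one.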

3.5.6 Cooling Schedule

The cooling schedule can follow any function, but it typically employs two slopes: a steep slope for the extreme high and low temperatures and a gentler slope for the intermediate temperatures, where the most beneficial placement changes are made. As stated above, changes that reduce the cost are always accepted, but changes that increase the cost are not always rejected. A cost-increasing change is accepted with probability exp(-c/T), where c is the positive cost change due to the new placement and T is the current system temperature given by the cooling schedule. This value is compared with a random number drawn uniformly from (0, 1): if the random number is less than exp(-c/T), the new placement is accepted. For very large temperatures almost any change is accepted, while as the temperature is reduced the chance that a cost-increasing change is accepted shrinks. Each temperature step may contain several placement perturbations; adjusting this number is one of the refinements that can make the placement more computationally efficient.

One of the most important parts is the proper definition of the cooling schedule, so as to maximize the cost reduction for each temperature step. If too much time is spent at the upper temperatures, placement attempts are wasted because an inordinate number of cost-increasing moves are accepted. At this point in the temperature schedule the design can be thought of as liquid, and placement is simply randomizing the initial solution. If the design is cooled too quickly, the algorithm tends to get trapped in local minima that would be avoided if the design were allowed to cool slowly. Slow cooling allows cost-increasing moves to be accepted, moving the design to new points from which previously unavailable cost-reducing moves may become possible. If the design is cooled too slowly, however, each temperature step will reach a point where no further cost reductions are seen; the algorithm converges on a cost which is then maintained by the balance between accepting cost-increasing moves and discovering cost-reducing moves.

The initial temperature, the cooling schedule, and the freezing point are usually determined experimentally. Some common cooling schedules are $t = \alpha t$, where the cooling ratio $\alpha$ is typically around 0.95, and $t = e^{-\beta t}$, where $\beta$ is typically around 0.7.
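To give a feel for these numbers, the following sketch generates a geometric schedule, in which the temperature is multiplied by a constant ratio at every step, and evaluates the acceptance probability exp(-c/T) at different temperatures. The function names and the start and stop temperatures are assumptions chosen for illustration; only the ratio 0.95 comes from the text.

```python
import math


def geometric_schedule(t_initial, t_final, ratio=0.95):
    """Yield temperatures t, ratio*t, ratio^2*t, ... while above t_final."""
    t = t_initial
    while t > t_final:
        yield t
        t *= ratio


def acceptance_probability(delta, t):
    """Probability of accepting a move that changes the cost by delta at temperature t."""
    return 1.0 if delta <= 0 else math.exp(-delta / t)


steps = list(geometric_schedule(100.0, 1.0))
hot = acceptance_probability(5.0, 100.0)   # cost increase of 5 at T = 100
cold = acceptance_probability(5.0, 1.0)    # the same increase at T = 1
```

With a ratio of 0.95, cooling from 100 to 1 takes 90 temperature steps, and the same cost increase that is accepted about 95% of the time at T = 100 is accepted less than 1% of the time at T = 1.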

3.5.7 Advantages and Disadvantages

The simulated annealing algorithm is one of the most established algorithms for placement problems, and it produces good-quality floorplans. However, simulated annealing is computationally expensive and can lead to long runtimes, so it is suitable mainly for small to medium-sized circuits. Although it is proven to converge to the optimum, this guarantee holds only in the limit of infinite time. For this reason, and because the system has to be cooled slowly, the algorithm is usually not faster than its contemporaries.

Chapter 4

TCG-S: Combination of TCG and SP

4.1 Introduction

This chapter shows the equivalence of the two most promising P*-admissible representations, TCG and SP, and integrates TCG with a packing sequence (part of SP) into a new representation, called TCG-S. TCG-S combines the advantages of SP and TCG and at the same time eliminates their disadvantages. With the property of SP, faster packing and perturbation schemes are possible. With the properties inherited from TCG, the geometric relations among modules are transparent to TCG-S (implying faster convergence to a desired solution), placement with position constraints becomes much easier, and incremental update for cost evaluation can be realized. These properties make TCG-S a superior representation that exhibits an elegant solution structure to facilitate the search for a desired floorplan/placement. TCG-S results in the best area utilization, wirelength optimization, convergence speed, and stability among existing works and is very flexible in handling placement with special constraints.

4.2 P*-admissible and non-P*-admissible representations

A representation is said to be P-admissible if it satisfies the following four conditions: (1) the solution space is finite, (2) every solution is feasible, (3) packing and cost evaluation can be performed in polynomial time, and (4) the best evaluated packing in the space corresponds to an optimal placement. The P-admissible representation is extended to a P*-admissible one by adding a fifth condition: (5) the geometric relation between each pair of modules is defined in the representation. With this condition, general floorplans/placements can be modeled. A P*-admissible representation therefore contains a complete structure for searching for an optimal floorplan/placement solution, and it is desirable to develop an effective and flexible P*-admissible representation.

Among the existing popular representations, SP, BSG, and TCG are P*-admissible, while slicing tree, NPE, O-tree, B*-tree, CBL, and Q-sequence are not. The slicing tree and NPE are intended for slicing floorplans only. Since an optimal placement could be a non-slicing structure, the two representations are not P-admissible and thus not P*-admissible (i.e., violation of P*-admissibility Condition (4)). An O-tree defines only a one-dimensional geometric relation between compacted modules and thus can obtain the relation in the other dimension only after packing (i.e., violation of Condition (5)). A B*-tree requires a placement to be left and/or bottom compacted. However, the space intended for placing a module may be occupied by previously placed modules during packing, resulting in a mismatch between the original representation and its compacted placement. Therefore, it may not be feasible to find a compacted placement corresponding to the original B*-tree, and thus it is not P-admissible (i.e., violation of Condition (2)). CBL and Q-sequence can represent only mosaic floorplans, in which each region in the floorplan contains exactly one module. CBL and Q-sequence are not P-admissible because they cannot guarantee a feasible solution after a perturbation (i.e., violation of Conditions (2) and (4)).

4.3 Combining TCG and SP

Both SP and TCG are considered very flexible representations, and both construct constraint graphs to evaluate their packing cost. SP consists of two sequences of modules ($\Gamma_+$, $\Gamma_-$), where $\Gamma_+$ specifies the module ordering from top-left to bottom-right and $\Gamma_-$ corresponds to the ordering from bottom-left to top-right. The latter can be used to guide module packing. However, as in most existing representations (e.g., NPE, BSG, O-tree, B*-tree, CBL, Q-sequence), the geometric relations between modules are not transparent to the operations of SP (i.e., the effect of an operation on the change of module relations is not clear before packing), and thus we need to construct constraint graphs from scratch after each perturbation to evaluate the packing cost. This deficiency makes it harder for SP to converge to a desired solution and to handle placement with constraints (e.g., boundary modules, pre-placed modules, etc.).

TCG consists of a horizontal transitive closure graph Ch, which defines the horizontal geometric relations between modules, and a vertical one Cv for the vertical geometric relations. In contrast to SP, the geometric relations between modules are transparent to TCG as well as to its operations, facilitating the convergence to a desired solution. Further, TCG supports incremental update during operations and keeps the information of boundary modules as well as the shapes and the relative positions of modules in the representation. Nevertheless, like SP, TCG also needs constraint graphs to evaluate its packing cost, and unlike SP, it requires extra operations to obtain the module packing sequence.

4.5 Problem Definition

Let B = {b1, b2, b3, ...} be a set of rectangular modules whose width, height, and area are denoted by wi, hi, and ai. A placement P is an assignment of a position to each module bi such that no two modules overlap. The goal of floorplanning/placement is to optimize a predefined cost metric, such as a combination of the area (i.e., the minimum bounding rectangle of P) and/or the wirelength (i.e., the sum of the half bounding boxes of the interconnections) induced by the assignment of the modules bi on the chip.

4.6 Equivalence of SP and TCG

We can transform between TCG and SP as follows. Let the fan-in (fan-out) of a node ni, denoted by $f_{in}(n_i)$ ($f_{out}(n_i)$), be the set of nodes nj with edges (nj, ni) ((ni, nj)). Given a TCG, we can obtain the sequence $\Gamma_+$ by repeatedly extracting a node ni with $f_{in}(n_i) = \emptyset$ in Ch and $f_{out}(n_i) = \emptyset$ in Cv, that is, a module with nothing to its left and nothing above it, and then deleting the edges (ni, nj) ((nj, ni)) from Ch (Cv), until no node is left in Ch (Cv). Similarly, we can obtain the sequence $\Gamma_-$ by repeatedly extracting the node ni with $f_{in}(n_i) = \emptyset$ in both Ch and Cv and then deleting the edges (ni, nj) from both Ch and Cv, until no node is left in Ch and Cv. Conversely, given an SP ($\Gamma_+$, $\Gamma_-$), we can obtain a unique TCG (Ch, Cv).
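The two transformations can be illustrated with a short Python sketch. It assumes that Ch and Cv are stored as sets of (from, to) pairs, that an edge (a, b) in Ch means a is to the left of b and an edge (a, b) in Cv means a is below b, and that the sequence pair is a pair of lists; the function names are hypothetical. For a valid TCG exactly one node is extractable at each step, so the tie-break by name is only a safeguard.

```python
def sp_to_tcg(gamma_plus, gamma_minus):
    """Build (Ch, Cv) edge sets from a sequence pair."""
    pos_p = {m: i for i, m in enumerate(gamma_plus)}
    pos_m = {m: i for i, m in enumerate(gamma_minus)}
    ch, cv = set(), set()
    for a in gamma_minus:
        for b in gamma_minus:
            if pos_m[a] < pos_m[b]:
                if pos_p[a] < pos_p[b]:
                    ch.add((a, b))   # a before b in both sequences: a left of b
                else:
                    cv.add((a, b))   # a after b in Gamma+, before in Gamma-: a below b
    return ch, cv


def _extract(nodes, edges):
    """Repeatedly remove a node that has no incoming edge from the remaining nodes."""
    remaining, sequence = set(nodes), []
    while remaining:
        free = sorted(n for n in remaining
                      if not any(v == n and u in remaining for u, v in edges))
        sequence.append(free[0])
        remaining.remove(free[0])
    return sequence


def tcg_to_sp(nodes, ch, cv):
    """Recover (Gamma+, Gamma-) from the two transitive closure graphs."""
    gamma_minus = _extract(nodes, ch | cv)                        # nothing left, nothing below
    gamma_plus = _extract(nodes, ch | {(v, u) for u, v in cv})    # nothing left, nothing above
    return gamma_plus, gamma_minus


ch, cv = sp_to_tcg(["a", "b", "c"], ["b", "a", "c"])
recovered = tcg_to_sp(["a", "b", "c"], ch, cv)
```

In this example module a sits on top of b, and c lies to the right of both, so the round trip returns the original pair of sequences.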

4.7 Comparison between TCG and SP

$\Gamma_-$ of an SP corresponds to the ordering for packing modules in the bottom-left direction and thus can be used for guiding module packing. However, as in most existing representations, the geometric relations among modules are not transparent to the operations of SP (i.e., the effect of an operation on the change of module relations is not clear before packing), and thus we need to construct constraint graphs from scratch after each perturbation to evaluate the packing cost. This deficiency makes it harder for SP to converge to a desired solution and to handle placement with constraints (e.g., boundary modules, pre-placed modules, etc.).

In contrast to SP, the geometric relations among modules are transparent to TCG as well as to its operations, facilitating the convergence to a desired solution. Further, TCG supports incremental update during operations and keeps the information of boundary modules as well as the shapes and the relative positions of modules in the representation. Unlike SP, however, TCG requires extra operations to obtain the module packing sequence, and additional time to find a special type of edges, called reduction edges, in Ch (Cv) for some operations.

For both SP and TCG, the packing scheme based on the longest path algorithm is time-consuming, since all edges in the constraint graphs are processed even when they are not on the longest path. If we add a source with zero weight and connect it to the nodes with zero in-degree, the x coordinate of each module can be obtained by applying the longest path algorithm on the resulting directed acyclic graph. Let $(x_i, y_i)$ and $(x'_i, y'_i)$ denote the bottom-left and top-right corners of module bi. For the example module g we then have $x_g = \max(x'_a, x'_b, x'_c, x'_d, x'_e)$. However, if we place modules based on the sequence $\Gamma_-$ and maintain a horizontal and a vertical contour, denoted by RH and RV respectively, for the placed modules, the number of nodes that need to be considered can be reduced. Let RH (RV) be a list of modules bi for which there exists no module bj with $y_j \ge y'_i$ ($x_j \ge x'_i$) and $x'_j \ge x'_i$ ($y'_j \ge y'_i$). Suppose we have packed the modules a, b, c, d, e based on the sequence $\Gamma_-$; the resulting horizontal contour is then RH = <c, e, d>. Keeping RH, we only need to traverse the contour from e, the successor of c, to the last module d, which are the modules that have horizontal relations with g. Thus, we have $x_g = x'_d$. Packing modules in this way, we only need to consider $x'_e$ and $x'_d$ and can avoid computing a maximum over all predecessors, leading to a faster packing scheme.
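For reference, the baseline longest-path packing mentioned at the start of this paragraph can be sketched as follows. The sketch assumes the constraint graph is given as a set of (from, to) edges with node weights equal to module widths (or heights for the vertical graph); the function name and the three-module example are hypothetical. Nodes with zero in-degree play the role of successors of the zero-weight source and therefore start at coordinate 0.

```python
def longest_path_coords(weights, edges):
    """Coordinate of each node = longest weighted path from the implicit source."""
    preds = {n: 0 for n in weights}      # remaining in-degree per node
    succs = {n: [] for n in weights}
    for u, v in edges:
        preds[v] += 1
        succs[u].append(v)
    ready = [n for n in weights if preds[n] == 0]
    coord = {n: 0 for n in weights}
    while ready:                          # process nodes in topological order
        u = ready.pop()
        for v in succs[u]:
            coord[v] = max(coord[v], coord[u] + weights[u])
            preds[v] -= 1
            if preds[v] == 0:
                ready.append(v)
    return coord


widths = {"a": 2, "b": 3, "c": 1}
ch_edges = {("a", "c"), ("b", "c")}      # a and b are both left of c
x = longest_path_coords(widths, ch_edges)
chip_width = max(x[n] + widths[n] for n in widths)
```

Every edge is visited once, which is exactly the cost the contour-based scheme tries to avoid.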

4.8 The TCG-S Representation

Combining TCG (Ch, Cv) and the packing sequence $\Gamma_-$ of SP, we develop a representation called TCG-S = (Ch, Cv, $\Gamma_-$), which uses a horizontal and a vertical transitive closure graph as well as a packing sequence $\Gamma_-$ to represent a placement.

4.8.1 From a placement to TCG-S

This section shows how to construct (Ch, Cv) and $\Gamma_-$ from a placement. First, $\Gamma_-$ is extracted from the placement, and then Ch and Cv are constructed according to $\Gamma_-$. For two non-overlapping modules bi and bj, bi is said to be horizontally (vertically) related to bj, denoted by $b_i \vdash b_j$ ($b_i \perp b_j$), if bi is on the left (bottom) side of bj and their projections on the y (x) axis overlap. (Two modules cannot have both horizontal and vertical relations unless they overlap.) For two non-overlapping modules bi and bj, bi is said to be diagonally related to bj if bi is on the left side of bj and their projections on the x and the y axes do not overlap. In a placement, every two modules must bear one of the three relations: horizontal, vertical, or diagonal. To simplify the operations on geometric relations, we treat a diagonal relation for modules bi and bj as a horizontal one, unless there exists a chain of vertical relations from bi (bj), followed by the modules enclosed within the rectangle defined by the two closest corners of bi and bj, and finally to bj (bi), for which we make $b_i \perp b_j$ ($b_j \perp b_i$).
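The three relations can be told apart from module coordinates alone. The sketch below assumes each module is a hypothetical (x, y, width, height) tuple with (x, y) its bottom-left corner; the function name and the returned labels are illustrative. It only classifies a pair and does not implement the chain-of-vertical-relations exception for diagonal pairs.

```python
def relation(a, b):
    """Classify module a relative to module b from their rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    x_overlap = ax < bx + bw and bx < ax + aw   # projections on the x axis overlap
    y_overlap = ay < by + bh and by < ay + ah   # projections on the y axis overlap
    if x_overlap and y_overlap:
        return "overlap"
    if y_overlap:                                # horizontal relation
        return "left" if ax < bx else "right"
    if x_overlap:                                # vertical relation
        return "below" if ay < by else "above"
    return "diagonal"


a = (0, 0, 2, 2)
b = (2, 1, 2, 2)   # to the right of a, y ranges overlap
c = (0, 2, 1, 1)   # directly on top of a
d = (4, 4, 1, 1)   # up and to the right of a, no shared projection
```

Here `relation(a, b)` reports a horizontal relation, `relation(a, c)` a vertical one, and `relation(a, d)` a diagonal one.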

(a) A placement (b) The corresponding TCG-S of (a).

Fig 4.1: a placement to TCG-S

4.8.2 Sequence Pair Representation from a placement

Given a placement, $\Gamma_-$ can be extracted based on the procedure described. A sequence pair imposes a horizontal/vertical (H/V) constraint on every pair of modules as follows:

(...a...b..., ...a...b...) => a should be placed to the left of b

(...b...a..., ...a...b...) => a should be placed below b.

After extracting $\Gamma_-$, we can construct Ch and Cv based on $\Gamma_-$. For each module bi in $\Gamma_-$, we introduce a node ni whose weight is bi's width (height) in Ch (Cv). Also, for each module bi before bj in $\Gamma_-$, we introduce an edge (ni, nj) in Ch (Cv) if $b_i \vdash b_j$ ($b_i \perp b_j$), as shown in Figure 4.1.

4.8.3 From TCG-S to a placement

The basic idea is to process the modules in the order defined by $\Gamma_-$ and to pack the current module into a corner formed by two previously placed modules in RH (RV), according to the geometric relations defined in Ch and Cv. The modules bi in RH (RV) are kept in a balanced binary search tree TH (TV), in increasing order of their right (top) boundaries. For easier presentation, we add a dummy module bs (bt) to RH (RV) to denote the left (bottom) boundary module of a placement, so that $b_s \vdash b_i$ ($b_t \perp b_i$) for every module bi. Let $(x'_s, y'_s) = (0, \infty)$ and $(x'_t, y'_t) = (\infty, 0)$, where $(x'_i, y'_i)$ denotes the top-right corner of bi. RH (RV) initially consists of bs (bt), and so does the corresponding TH (TV). To pack a module bj in $\Gamma_-$, we traverse the modules bk in TH (TV) from the root, going to the right child if $b_k \vdash b_j$ ($b_k \perp b_j$) and to the left child otherwise. The process is repeated for each newly encountered module until a leaf node is met.

Then bj is connected to the leaf node, and $x_j = x'_p$ ($y_j = y'_p$), where bp is the last module with $b_p \vdash b_j$ ($b_p \perp b_j$) on the path and $x'_p$ ($y'_p$) is its right (top) boundary. After bj is inserted into TH (TV), every successor bl with xl < xj (yl < yj) in TH (TV) is deleted, since bl is no longer in the contour. The ordering of nodes in TH (TV) can be obtained by depth-first search. This process is repeated for all modules in $\Gamma_-$. We have $W = x'_v$ ($H = y'_v$) if bv is the module in the resulting TH (TV) with the largest value, where W (H) denotes the width (height) of the placement.

To pack the first module ba in $\Gamma_-$, we traverse TH (TV) from the root bs (bt) and insert ba as the right child of bs (bt), since $b_s \vdash b_a$ ($b_t \perp b_a$). The first module ba is therefore placed at the bottom-left corner, i.e., $(x_a, y_a) = (0, 0)$, since bs (bt) is the last module that is horizontally (vertically) related to ba and $x'_s = 0$, $y'_t = 0$. Figure 4.2(a) shows the balanced binary search tree after ba is inserted into TH (TV). Similarly, to pack the second module bb in $\Gamma_-$, we traverse TH from the root bs and then go to its right child ba, since $b_s \vdash b_b$. Then bb is inserted as the left child of ba, since ba is not horizontally related to bb. Because bs is the last module with $b_s \vdash b_b$ on the path, $x_b = x'_s = 0$. Likewise, we traverse TV from the root bt and then go to its right child ba, since $b_t \perp b_b$. Then bb is inserted as the right child of ba in TV, since $b_a \perp b_b$. Therefore $y_b = y'_a = 1.5$, because ba is the last module with $b_a \perp b_b$ on the path. The resulting balanced binary search trees after performing tree rotations, TH' and TV', are shown in Figure 4.2(b). As shown in Figure 4.2(c), after bc is inserted, bb in TH is deleted since bb is the successor of bc and xb [...]

[...] by extracting the nodes in $f_{out}(n_c)$ based on the sequence in $\Gamma_-$. nc and the first node nd in S form a reduction edge (nc, nd). Traversing S, we find another reduction edge (nc, ne), since the edge (nd, ne) is not in Ch. Starting from ne, we search for the next node n with (ne, n) not in Ch. We find node nf, implying that (nc, nf) is also a reduction edge. We have therefore found all reduction edges emanating from nc: (nc, nd), (nc, ne), and (nc, nf).

4.9.2 Solution Perturbation

The four operations Rotation, Swap, Reverse, and Move are used to perturb Ch and Cv. During each perturbation, we must maintain the three feasibility properties for Ch and Cv. Unlike the Rotation operation, Swap, Reverse, and Move may change the configurations of Ch and Cv and thus their properties. Further, we also need to maintain $\Gamma_-$ so that it conforms to the topological ordering of the new Ch and Cv.

(i) Rotation

To rotate a module bi, we exchange the weights of the corresponding node ni in Ch and Cv. Since the configurations of Ch and Cv do not change, neither does $\Gamma_-$. Figure 4.3 shows the resulting TCG-S after rotating the module g.

Fig 4.3: The resulting TCG-S after rotating the module g

(ii) Swap

Swapping ni and nj does not change the topologies of Ch and Cv, except that the nodes ni and nj are exchanged in both Ch and Cv. Therefore we only need to exchange bi and bj in $\Gamma_-$. Figure 4.4 shows the resulting TCG-S after swapping the nodes nc and ng shown in Figure 4.3. The modules bc and bg in $\Gamma_-$ are exchanged.

Fig 4.4: The resulting TCG-S after swapping the nodes nc and ng
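Rotation and Swap are simple enough to sketch directly. The following Python fragment assumes, for illustration only, that module widths and heights are kept in dictionaries keyed by module name (standing in for the node weights in Ch and Cv), that Ch and Cv are sets of (from, to) pairs, and that $\Gamma_-$ is a list; the function names are hypothetical.

```python
def rotate(widths, heights, module):
    """Rotation: exchange the node's weights in Ch (width) and Cv (height)."""
    widths[module], heights[module] = heights[module], widths[module]


def swap(ch, cv, gamma_minus, a, b):
    """Swap: exchange nodes a and b in both graphs and in the packing sequence."""
    rename = {a: b, b: a}

    def r(n):
        return rename.get(n, n)

    new_ch = {(r(u), r(v)) for u, v in ch}
    new_cv = {(r(u), r(v)) for u, v in cv}
    new_gamma = [r(m) for m in gamma_minus]
    return new_ch, new_cv, new_gamma


widths, heights = {"a": 2.0, "b": 1.0, "c": 3.0}, {"a": 1.5, "b": 1.0, "c": 0.5}
rotate(widths, heights, "a")

ch, cv, gamma = {("a", "b"), ("b", "c"), ("a", "c")}, set(), ["a", "b", "c"]
ch2, cv2, gamma2 = swap(ch, cv, gamma, "a", "c")
```

Because Swap only relabels nodes, the shape of both graphs is untouched and the relabeled sequence is still a topological order of them.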

(iii) Reverse

Reverse changes the geometric relation between bi and bj from $b_i \vdash b_j$ ($b_i \perp b_j$) to $b_j \vdash b_i$ ($b_j \perp b_i$). To reverse a reduction edge (ni, nj) in one transitive closure graph, we first delete the edge (ni, nj) from the graph and then add the edge (nj, ni) to the graph. To keep Ch and Cv feasible, for each node $n_k \in f_{in}(n_j) \cup \{n_j\}$ and $n_l \in f_{out}(n_i) \cup \{n_i\}$ in the new graph, we have to keep the edge (nk, nl) in the graph. If the edge does not exist in the graph, we add it to the graph and delete the corresponding edge (nk, nl) (or (nl, nk)) in the other graph. To make $\Gamma_-$ conform to the topological ordering of the new Ch and Cv, we delete bi from $\Gamma_-$ and insert bi after bj. For each module bk between bi and bj in $\Gamma_-$, we check whether the edge (ni, nk) exists in the same graph. If it does, we delete bk from $\Gamma_-$ and insert it after the most recently inserted module.

Figure 4.5 shows the resulting TCG-S after reversing the reduction edge (nd, ne) of the Cv in Figure 4.4. Since there exists no module between bd and be in $\Gamma_-$, we only need to delete bd from $\Gamma_-$ and insert it after be; the resulting $\Gamma_-$ is shown in Figure 4.5.

Figure 4.5: The resulting TCG-S after reversing the reduction edge (nd, ne)

(iv) Move

Move changes the geometric relation between bi and bj from $b_i \vdash b_j$ ($b_i \perp b_j$) to $b_i \perp b_j$ ($b_i \vdash b_j$). To move a reduction edge (ni, nj) from a transitive closure graph G to the other graph G', we delete the edge from G and then add it to G'. As with Reverse, for each node $n_k \in f_{in}(n_i) \cup \{n_i\}$ and $n_l \in f_{out}(n_j) \cup \{n_j\}$ in G', we must move the edge (nk, nl) to G' if the corresponding edge (nk, nl) (or (nl, nk)) is in G. Since the operation changes only the edges in Ch or Cv but not the topological ordering among nodes, $\Gamma_-$ remains unchanged. Figure 4.6 shows the resulting TCG-S after moving the reduction edge (na, ne) from Ch to Cv in Figure 4.5. Notice that the resulting $\Gamma_-$ is the same as that in Figure 4.5.

Figure 4.6: The resulting TCG-S after moving the reduction edge (na, ne) from Ch to Cv

4.10 Placement with Constraints

4.10.1 TCG-S with boundary constraints

Placement with boundary constraints requires a set of pre-specified modules to be placed along the designated boundaries of a chip.

Boundary Constraint: Given a boundary module bi, it must be placed on one of the four sides of the chip in the final packing: on the left, on the right, at the bottom, or at the top. If a module bi is placed along the left (right) boundary, the in-degree (out-degree) of the node ni in Ch equals zero. If a module is placed along the bottom (top) boundary, the in-degree (out-degree) of ni in Cv equals zero. For each perturbation, we can guarantee a feasible placement by checking whether the conditions of the boundary modules are satisfied.
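The degree conditions translate into a very small feasibility check. The sketch below assumes Ch and Cv are sets of (from, to) edges; the function name, the side labels, and the example graphs are hypothetical.

```python
def satisfies_boundary(module, side, ch, cv):
    """True if the module's node has zero in-/out-degree in the graph for that side."""
    graph, check_in_degree = {
        "left": (ch, True),     # nothing to its left: in-degree 0 in Ch
        "right": (ch, False),   # nothing to its right: out-degree 0 in Ch
        "bottom": (cv, True),   # nothing below: in-degree 0 in Cv
        "top": (cv, False),     # nothing above: out-degree 0 in Cv
    }[side]
    if check_in_degree:
        return all(v != module for _, v in graph)
    return all(u != module for u, _ in graph)


# a sits on top of b, and c lies to the right of both.
ch = {("a", "c"), ("b", "c")}
cv = {("b", "a")}
```

After each perturbation, a check of this kind can be run for every boundary module, and the move rejected or repaired if any check fails.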

4.11 Advantages Of TCG-S

The orthogonal combination of TCG and SP, TCG-S = (Ch, Cv, $\Gamma_-$), leads to a representation with at least the following advantages:

With the property of SP, faster packing and perturbation schemes are possible for a P*-admissible representation.

Inherited from TCG, the geometric relations among modules are transparent to TCG-S, implying faster convergence to a desired solution.

Inherited from TCG, placement with position constraints becomes much easier.

TCG-S can support incremental update for cost evaluation.

Chapter 5

Software Implementation

5.1 Software Implementation Phases

In order to study and characterize the floorplan representation TCG-S when applied to floorplan optimization, a C++ program was implemented under Linux. The program takes the floorplan benchmark circuit file formats as its input. The software implementation of the SA placement algorithm went through several phases, as given below.

PHASE 1 : Placement to TCG-S

i) TCG-S: TCG + SP

ii) TCG-S: HCG & VCG from initial floor plan

PHASE 2: Solution Perturbation using simulated annealing algorithm.

i) Rotation

ii) Swap

iii) Reverse

iv) Move

PHASE 3 : TCG-S to Placement

i) HCG & VCG to floor plan.

5.2 Experimental Code

This work began with the definition of the data structures to be used. As a design consists of a set of soft and hard rectangular blocks with specified area and aspect ratio constraints, structures representing these elements are defined and populated with the data required to manage the floorplan algorithm. An overview of the initial organization is given below.

i) A structure that defines the rectangular blocks in a floorplan benchmark circuit. Its components are block name, type (hard or soft), area, xcord, ycord, leng, width, and minimum and maximum aspect ratios.

ii) A structure that defines the nodes in a graph (Gerez's structure for directed acyclic graphs). Its components are node name, ch weight, cv weight, longest path, xcord, ycord, struct edge *outgoing, indegree, and outdegree.

iii) A structure that defines the edges in a graph. Its components are edge index, from, to, and next.

An array of these data structures is maintained to hold the blocks of a floorplan benchmark circuit. The first step is to read the benchmark circuit file and store its information in the data structures defined above.
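The three structures can be mirrored compactly with Python dataclasses. This is only an illustrative sketch of the field layout listed above: the class names, the default values, the `connect` helper, and the use of a Python list in place of the linked `next` pointers are assumptions, not the original definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    name: str
    kind: str                 # "hard" or "soft"
    area: float
    xcord: float = 0.0
    ycord: float = 0.0
    leng: float = 0.0
    width: float = 0.0
    min_aspect: float = 1.0
    max_aspect: float = 1.0


@dataclass
class Edge:
    index: int
    src: str                  # the "from" node
    dst: str                  # the "to" node


@dataclass
class Node:
    name: str
    ch_weight: float          # module width, used in the horizontal graph
    cv_weight: float          # module height, used in the vertical graph
    longest_path: float = 0.0
    xcord: float = 0.0
    ycord: float = 0.0
    outgoing: List[Edge] = field(default_factory=list)
    indegree: int = 0
    outdegree: int = 0


def connect(src: Node, dst: Node, index: int) -> Edge:
    """Add a directed edge src -> dst and update both degree counters."""
    edge = Edge(index, src.name, dst.name)
    src.outgoing.append(edge)
    src.outdegree += 1
    dst.indegree += 1
    return edge


block = Block("bk1", "soft", 12.0, min_aspect=0.5, max_aspect=2.0)
n1, n2 = Node("bk1", 3.0, 4.0), Node("bk2", 2.0, 2.0)
connect(n1, n2, index=0)
```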

5.3 Benchmarks

The GSRC/MCNC benchmarks are employed as the test set for characterizing the algorithm. The benchmark sizes range from 33 to 200k components across 18 circuits. In the simplest formulation of the block packing problem, all the blocks are rectangular hard blocks with fixed heights and widths, and their locations can be assigned freely within the whole layout region. However, in a real design, the block packing algorithm may have to deal with blocks with more complex features:

Soft blocks: In the early stage of physical design, many of the circuit blocks are not yet designed and are thus flexible (soft) in shape. For these soft blocks, the block packing algorithm not only needs to determine their locations, but also needs to assign specific shapes to them.

Rectilinear blocks: As some of the circuit blocks come from design re-use, their shapes are not necessarily rectangular. Therefore the block packing algorithm should be able to handle arbitrarily shaped rectilinear blocks.

Pre-placed blocks: In some cases, the locations of some blocks may be fixed, or a region may be specified for their placement. For example, in high-performance chips, the clock buffer may have to be located in the center of the chip to reduce the difference between the clock signal arrival times at different blocks. The block packing algorithm should put these pre-placed blocks at their specified location or region without overlapping other blocks.

The benchmark circuits Ami33, Ami49, Apte, Xerox, Hp, N200, and N300 were used to test the software implementation.

5.3.1 The blocks format

The blocks file specifies the name and other optional information about each node. After the standard header, the format specifies:

NumSoftRectangularBlocks: number of soft rectangular block nodes

NumHardRectilinearBlocks: number of hard rectilinear block nodes

NumTerminals: number of terminal (pad etc.) nodes

Then, one line for each soft rectangular block node with format as follows:

NodeName softrectangular area minAspectRatio maxAspectRatio

Each line specifies a single block/terminal node in the floorplan.

5.4 Logical Modules

In order to analyze the SA algorithm and characterize its behavior, it is necessary to logically modularize the implemented software. Each module is represented by a number of function calls in the program.

5.4.1 Design of width and length for given area of block

A width and a length are chosen for each soft block of given area so that the minimum and maximum aspect ratio constraints are satisfied.
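One simple way to do this is sketched below. It assumes the aspect ratio is defined as length divided by width and picks the admissible ratio closest to 1 (a square); both the definition and the selection rule are assumptions made for the example, since the text does not specify them, and the function name is hypothetical.

```python
import math


def soft_block_dimensions(area, min_aspect, max_aspect):
    """Return (width, length) with width * length == area and min <= length/width <= max."""
    aspect = min(max(1.0, min_aspect), max_aspect)   # clamp the square shape into range
    width = math.sqrt(area / aspect)
    length = aspect * width
    return width, length


square = soft_block_dimensions(16.0, 0.5, 2.0)   # a square is allowed
tall = soft_block_dimensions(18.0, 2.0, 3.0)     # must be at least twice as long as wide
```

During annealing the same function could be called with other admissible ratios to reshape a soft block.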

5.4.2 Building an initial floorplan from the blocks of benchmark circuit.

To construct an initial floorplan from the blocks of a benchmark circuit, the blocks are first sorted in ascending order of length and arranged row-wise. The x and y coordinates of each block are saved in the array of floorplan data structures.
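A row-wise arrangement of this kind can be sketched as follows. The sketch assumes that "length" is the vertical dimension of a block, that blocks are given as a hypothetical name-to-(width, length) mapping, and that a new row is started whenever the next block would exceed an assumed `row_width` limit; none of these details are stated in the text.

```python
def rowwise_floorplan(blocks, row_width):
    """Place blocks left to right in rows, shortest blocks first; returns name -> (x, y)."""
    placed = {}
    x = y = row_height = 0.0
    for name, (w, l) in sorted(blocks.items(), key=lambda kv: kv[1][1]):
        if x > 0 and x + w > row_width:
            # Start a new row on top of the tallest block of the current row.
            x, y, row_height = 0.0, y + row_height, 0.0
        placed[name] = (x, y)
        x += w
        row_height = max(row_height, l)
    return placed


blocks = {"a": (2.0, 1.0), "b": (2.0, 3.0), "c": (3.0, 2.0)}
positions = rowwise_floorplan(blocks, row_width=5.0)
```

Because every row starts above the tallest block of the previous row, the resulting placement is overlap-free and can serve as the starting point for extracting the TCG-S.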

5.4.3 Extracting $\Gamma_-$ (sequence pair representation) from the floorplan.

$\Gamma_-$ of the sequence pair representation can be obtained using the method explained in section 3.8.1. First, a tree is constructed from the placement. A depth-first search traversal will give $\Gamma_-$ and a breadth-first traversal will give $\Gamma_-$.

5.4.4 Construction of Horizontal constraint and Vertical constraint graph

The HCG and VCG are constructed from the initial floorplan according to $\Gamma_-$ (of the SP), as explained in section 3.8.

5.4.5 Finding reduction edges.

The set of reduction edges is found, as these edges are needed for performing solution perturbations.

5.4.6 Floorplanning algorithm using simulated annealing.

Initially we have to decide:

i) The state space

ii) The neighborhood structure

iii) The cost function

iv) The initial state

v) The initial temperature

vi) The cooling schedule (how to change t)

vii) The freezing point

Solution perturbations are then performed on reduction edges using the four operations swap, move, rotation, and reverse.

5.4.6.1 Design Perturbation

The perturbation function is a combination of four different operations: move, rotate, reverse, and swap.

5.4.6.2 Move Acceptance

This module is simple in concept but represents the essence of SA: the unconditional acceptance of cost-decreasing moves and the possibility of accepting cost-increasing moves based on the current system temperature. The change in the design's cost is generated by the combination of the previous two modules and is the quantity on which the move(s) are analyzed for acceptance. Any negative change in cost (i.e., a good move) is unconditionally accepted, while a positive change in cost is not unconditionally rejected. If the random value is less than the calculated acceptance factor, the move is accepted even though it increases the cost. During the high-temperature phase of placement, moves that increase the design's cost are readily accepted. As the temperature is reduced, the size of the cost increase that has a given chance of being accepted is also reduced. Large increases therefore start to become unlikely to be accepted already at higher temperatures, while small increases become unlikely only at lower temperatures. If the move is not accepted under either of the above two circumstances, it is rejected and any changes that have been made are undone.

5.4.6.3 Design Update

After the decision has been made to either accept or reject the move, the design must be updated accordingly, since some change to the design was made during the move attempt. If the move is accepted, all that is required is to check the overlap list and remove the components that no longer exhibit overlap. If the move is rejected, the design must be reset as if the move had never been attempted. This involves not only moving the component(s) back to their original position(s), but also iterating through the list of modified nets and resetting their costs, iterating through the list of overlapping components and resetting any changes made to their overlap status, and finally resetting the design's overlap cost, which is recorded across iterations. In both cases the list of nets to be updated is deleted, as this point marks the end of the move attempt.
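One way to make a rejected move reversible is to record every value before it is overwritten. The sketch below illustrates that idea only; the types `Design` and `MoveRecord` and the helper functions are hypothetical, and the overlap list is reduced to a single overlap cost for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Position { int x; int y; };

// Hypothetical design state: component positions, per-net costs, overlap cost.
struct Design {
    std::vector<Position> positions;
    std::vector<double> netCosts;
    double overlapCost = 0.0;
};

// Remembers every value that a move attempt overwrites.
struct MoveRecord {
    std::vector<std::pair<std::size_t, Position>> oldPositions;
    std::vector<std::pair<std::size_t, double>> oldNetCosts;
    double oldOverlapCost = 0.0;
};

// Starts a move attempt by saving the current overlap cost.
MoveRecord beginMove(const Design& d) {
    MoveRecord r;
    r.oldOverlapCost = d.overlapCost;
    return r;
}

// Moves component i to p, saving its previous position first.
void moveComponent(Design& d, MoveRecord& r, std::size_t i, Position p) {
    assert(i < d.positions.size());
    r.oldPositions.emplace_back(i, d.positions[i]);
    d.positions[i] = p;
}

// Changes the cost of one net, saving its previous cost first.
void setNetCost(Design& d, MoveRecord& r, std::size_t net, double c) {
    assert(net < d.netCosts.size());
    r.oldNetCosts.emplace_back(net, d.netCosts[net]);
    d.netCosts[net] = c;
}

// Rejected move: restore saved values in reverse order, so that a value
// overwritten more than once ends up with its original content.
void rollback(Design& d, const MoveRecord& r) {
    for (auto it = r.oldPositions.rbegin(); it != r.oldPositions.rend(); ++it)
        d.positions[it->first] = it->second;
    for (auto it = r.oldNetCosts.rbegin(); it != r.oldNetCosts.rend(); ++it)
        d.netCosts[it->first] = it->second;
    d.overlapCost = r.oldOverlapCost;
}
```

For an accepted move the record is simply discarded, which corresponds to deleting the list of nets to be updated at the end of the move attempt.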

5.4.6.4 Temperature Schedule

The temperature schedule greatly influences the behavior and quality of the placement, so it deserves a place distinct from the rest of the algorithm. The temperature is reduced by one of two factors chosen by the user of the program. One factor is used in the outer regions of the schedule, while the other is used in the middle region. Typically, the reduction factors and the borders that determine their use are set so that the high-temperature zone is traversed quickly, down to the point where moves begin to reduce the overall cost of the design. From this point the temperature is reduced slowly until moves are mostly rejected, after which the temperature is again reduced quickly, as in the high-temperature region.
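A three-zone schedule of this kind can be sketched as below. The structure `Schedule`, its field names, and the multiplicative update are assumptions made for illustration; the text only states that two user-chosen factors and two borders are involved.

```cpp
#include <cassert>

// Hypothetical schedule parameters, all chosen by the user.
struct Schedule {
    double outerFactor;   // applied above highBorder and below lowBorder (fast cooling)
    double middleFactor;  // applied between the two borders (slow cooling)
    double highBorder;    // temperature at which slow cooling starts
    double lowBorder;     // temperature at which fast cooling resumes
};

// Returns the temperature for the next iteration of the annealing loop.
double nextTemperature(double t, const Schedule& s) {
    assert(t > 0.0);
    assert(s.outerFactor > 0.0 && s.outerFactor < 1.0);
    assert(s.middleFactor > 0.0 && s.middleFactor < 1.0);
    assert(s.highBorder > s.lowBorder);
    const bool inMiddle = (t <= s.highBorder) && (t >= s.lowBorder);
    return t * (inMiddle ? s.middleFactor : s.outerFactor);
}
```

For the schedule to behave as described, `middleFactor` would be chosen closer to 1 than `outerFactor`, for example 0.95 against 0.8, so that most of the run time is spent in the region where moves are still reducing the cost.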

5.4.6.5 Construction of horizontal and vertical contours RH (RV) and packing the modules according to Γ− in TH (TV)

RH (RV) is the list of modules bi for which there exists no module bj with yj ≥ yi' (xj ≥ xi') and xj' ≥ xi' (yj' ≥ yi'), where (xi, yi) and (xi', yi') denote the bottom-left and top-right corners of bi. The modules bi in RH (RV) can be kept in a balanced binary search tree TH (TV), in increasing order of their right (top) boundaries.
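The contour definition can be checked directly with a brute-force sketch. It assumes the reading that a module bi belongs to RH when no other module bj satisfies yj ≥ yi' and xj' ≥ xi', with (x, y) the bottom-left and (x', y') the top-right corner of a module. The names `Module` and `horizontalContour` are hypothetical, and `std::multimap` (typically implemented as a red-black tree) stands in for TH.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// A placed module: (x, y) is the bottom-left corner, (x2, y2) the top-right.
struct Module { int x, y, x2, y2; };

// Returns the indices of the modules on the horizontal contour R_H,
// ordered by increasing right boundary.
std::vector<std::size_t> horizontalContour(const std::vector<Module>& mods) {
    std::multimap<int, std::size_t> tree;  // key: right boundary, value: module index
    for (std::size_t i = 0; i < mods.size(); ++i) {
        assert(mods[i].x < mods[i].x2 && mods[i].y < mods[i].y2);
        bool covered = false;
        for (std::size_t j = 0; j < mods.size() && !covered; ++j) {
            if (j == i) continue;
            // b_j lies at or above the top of b_i and reaches at least as far right.
            covered = mods[j].y >= mods[i].y2 && mods[j].x2 >= mods[i].x2;
        }
        if (!covered) tree.emplace(mods[i].x2, i);
    }
    std::vector<std::size_t> order;
    for (const auto& entry : tree) order.push_back(entry.second);
    return order;
}
```

The vertical contour RV follows by exchanging the roles of x and y. This version rebuilds the contour from scratch in O(n^2) time, whereas keeping the contour in a balanced tree allows it to be updated as each module is packed.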

CHAPTER 6

RESULTS

6.1 Organization

This chapter presents the experimental results of a software implementation of the floorplan representation TCG-S and analyzes its performance against other representations. The TCG-S representation has been implemented in the C++ programming language on a Linux platform, using a simulated annealing method. The experiments use the five commonly used MCNC benchmark circuits together with GSRC benchmark circuits. The results summarize the area utilization of the two floorplan representations TCG and TCG-S.
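The formula used for area utilization is not spelled out here. A common definition, assumed in the sketch below, is the total area of the modules divided by the area of the bounding rectangle of the packing, expressed as a percentage. The names `Rect` and `areaUtilizationPercent` are hypothetical, and the modules are assumed not to overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A placed module given by its lower-left corner and its size.
struct Rect { double x, y, width, height; };

// Total module area as a percentage of the bounding-rectangle area.
double areaUtilizationPercent(const std::vector<Rect>& placed) {
    assert(!placed.empty());
    double used = 0.0;
    double minX = placed[0].x, minY = placed[0].y;
    double maxX = placed[0].x, maxY = placed[0].y;
    for (const Rect& r : placed) {
        assert(r.width > 0.0 && r.height > 0.0);
        used += r.width * r.height;
        minX = std::min(minX, r.x);
        minY = std::min(minY, r.y);
        maxX = std::max(maxX, r.x + r.width);
        maxY = std::max(maxY, r.y + r.height);
    }
    return 100.0 * used / ((maxX - minX) * (maxY - minY));
}
```

For example, a 2 x 2 module at (0, 0) next to a 2 x 1 module at (2, 0) occupies 6 units of area inside a 4 x 2 bounding rectangle, which gives a utilization of 75%.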

6.2 TCG

Circuit Name    No. of Blocks    Area Utilization (%)
Apte            9                79
Hp              11               80
Xerox           10               77
Ami33           33               85
Ami49           49               86
N10             10               82
N30             30               83
N50             50               86
N200            200              87
N300            300              90

Table 6.1 Summary of area utilization for various benchmark circuits (TCG)

6.3 Sample Output

Circuit Name: n200 (GSRC Benchmark)

Number of Modules: 200

Circuit Name: n300 (GSRC Benchmark)

Number of Modules: 300

6.4 TCG-S

Circuit Name    No. of Blocks    Area Utilization (%)
Apte            9                79
Hp              11               80
Xerox           10               78
Ami33           33               85
Ami49           49               87
N10             10               84
N30             30               83
N50             50               88
N200            200              88
N300            300              91

Table 6.2 Summary of area utilization for various benchmark circuits (TCG-S)

6.5 Sample Output

Circuit Name: Apte (MCNC Benchmark)

Number of Modules: 9

Circuit Name: Xerox (MCNC Benchmark)

Number of Modules: 10

Circuit Name: hp (MCNC Benchmark)

Number of Modules: 11

Circuit Name: n10 (GSRC Benchmark)

Number of Modules: 10

Circuit Name: n30 (GSRC Benchmark)

Number of Modules: 30

Circuit Name: ami33 (MCNC Benchmark)

Number of Modules: 33

Circuit Name: ami49 (MCNC Benchmark)

Number of Modules: 49

Circuit Name: n50 (GSRC Benchmark)

Number of Modules: 50

Circuit Name: n200 (GSRC Benchmark)

Number of Modules: 200

Circuit Name: n300 (GSRC Benchmark)

Number of Modules: 300

CHAPTER 7

CONCLUSIONS

This work began by implementing the floorplan representations TCG (transitive closure graph) and SP (sequence pair); combining these two representations yields a new representation, TCG-S. A simulated annealing algorithm was designed for solution perturbation, and it produces a near-optimal solution with respect to component placement and minimization of floorplan area. This step gave insight into the intricacies of the simulated annealing algorithm, leading to a novel approach to modeling a floorplan. With area optimization as the primary objective, experiments on a set of commonly used MCNC and GSRC benchmarks show that TCG-S gives the best area utilization and convergence speed of the representations compared; its area utilization equals or exceeds that of TCG on every circuit (Tables 6.1 and 6.2).

CHAPTER 8

FUTURE DEVELOPMENTS

The annealing-based optimization presented here considers only area. However, this work can be extended to other objectives, such as wirelength minimization and placement with pre-placed modules.

8.1 Wirelength Minimization

Global interconnect is commonly recognized as a key factor in designing high-performance integrated circuits, as VLSI process technology migrates into deep submicron (DSM) dimensions and operates at gigahertz clock frequencies. By using a wide range of interconnect synthesis and optimization techniques, such as topology optimization, buffer insertion, layer assignment, wire sizing, and wire spacing, the performance of a global interconnect can be improved by a factor of 5 or more. As the global interconnects are largely determined by floorplanning, it becomes critical for floorplanning engines to handle interconnect planning and optimization efficiently, so that overall timing and design convergence can be achieved.

8.2 TCG-S with Pre-placed Modules

Placement with pre-placed modules requires a set of prespecified modules to be placed at designated locations on the chip.

Pre-placed Constraint: Given a module bi with a fixed coordinate (xi, yi) and an orientation, bi must be placed at the designated location with the same orientation in the final packing.

BIBLIOGRAPHY

[1] J.-M. Lin and Y.-W. Chang, "TCG-S: Orthogonal Coupling of P*-admissible Representations for General Floorplans," IEEE Trans. Computer-Aided Design, Vol. 23, No. 6, pp. 968-980, June 2004.

[2] Jai-Ming Lin and Yao-Wen Chang, "TCG: A Transitive Closure Graph-Based Representation for General Floorplans," IEEE Transactions on Very Large Scale Integration Systems, 2003.

[5] P.-N. Guo, C.-K. Cheng, and T. Yoshimura, "An O-Tree representation of non-slicing floorplan and its applications," Proc. DAC, pp. 268-273, 1999.

[6] Sadiq M. Sait and Habib Youssef, VLSI Physical Design Automation: Theory and Practice, IEEE Press.

[7] Naveed Sherwani, Algorithms for VLSI Physical Design Automation, Kluwer Academic Publishers, 1995.

[8] Floorplanning in VLSI, available online: www.manchester.ac.uk

[9] Design optimization, available online: www.me.mtu.edu

[10] VLSI Design Automation, available online: www.mountains.ece.umn.edu

[11] Floorplanning by Dinesh Bhatia, available online: www.dallas.edu

[12] Floorplanning by Professor Lei He, available online: www.ee.ucla.edu

[13] Recent Development in VLSI Floorplan Representations, available online: www.ee.nthu.edu.

[14] CAD for VLSI Simulated Annealing Algorithm, available online: www.ece.wisc.edu

[15] Physical Design. Floorplan, available online: www.sharif.edu

[16] VLSI Placement, available online: www.ee.ucla.edu

[17] VLSI CAD, available online: www.vlsicad.ucsd.edu

[18] Modeling of a Hardware VLSI Placement System: Accelerating the Simulated Annealing Algorithm (thesis), available online: www.ritdml.rit.edu
