76
ECE 260B – CSE 241A Static Timing Analysis 1 http://vlsicad.ucsd.edu ECE260B – CSE241A Winter 2005 Logic Synthesis Website: http://vlsicad.ucsd.edu/courses/ece260b-w05 Slides courtesy of Dr. Cho Moon

ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

Embed Size (px)

Citation preview

Page 1: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 1 http://vlsicad.ucsd.edu

ECE260B – CSE241A

Winter 2005

Logic Synthesis

Website: http://vlsicad.ucsd.edu/courses/ece260b-w05

Slides courtesy of Dr. Cho Moon

Page 2: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 2 http://vlsicad.ucsd.edu

Introduction

Why logic synthesis? Ubiquitous – used almost everywhere VLSI is done Body of useful and general techniques – same solutions can be

used for different problems Foundation for many applications such as

- Formal verification

- ATPG

- Timing analysis

- Sequential optimization

Page 3: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 3 http://vlsicad.ucsd.edu

RTL Design Flow

RTLSynthesis

HDL

netlist

LogicSynthesis

netlist

Library

Physical Synthesis

layout

a

b

s

q0

1

d

clk

a

b

s

q0

1

d

clk

ModuleGenerators

ManualDesign

Slide courtesy of Devadas, et. al

Page 4: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 4 http://vlsicad.ucsd.edu

Logic Synthesis Problem

Given Initial gate-level netlist Design constraints

- Input arrival times, output required times, power consumption, noise immunity, etc…

Target technology libraries

Produce Smaller, faster or cooler gate-level netlist that meets constraints

Very hard optimization problem!

Page 5: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 5 http://vlsicad.ucsd.edu

Combinational Logic Synthesis

Logic Synthesis

netlist

netlist

Library

techindependent

techdependent

2-levelLogic opt

multilevelLogic opt

Library

Slide courtesy of Devadas, et. al

Page 6: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 6 http://vlsicad.ucsd.edu

Outline

Introduction

Two-level Logic Synthesis

Multi-level Logic Synthesis

Timing Optimization in Synthesis

Page 7: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 7 http://vlsicad.ucsd.edu

Two-level Logic Synthesis Problem

Given an arbitrary logic function in two-level form, produce a smaller representation.

For sum-of-products (SOP) implementation on PLAs, fewer product terms and fewer inputs to each product term mean smaller area.

F = A B + A B C

F = A B

I1

I2

O1

O2

Page 8: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 8 http://vlsicad.ucsd.edu

Boolean Functions

f(x) : Bn B

B = {0, 1}, x = (x1, x2, …, xn)

x1, x2, … are variables

x1, x1, x2, x2, … are literals

each vertex of Bn is mapped to 0 or 1

the onset of f is a set of input values for which f(x) = 1

the offset of f is a set of input values for which f(x) = 0

Page 9: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 9 http://vlsicad.ucsd.edu

Logic Functions:

Slide courtesy of Devadas, et. al

Page 10: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 10 http://vlsicad.ucsd.edu

Cube Representation

Slide courtesy of Devadas, et. al

Page 11: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 11 http://vlsicad.ucsd.edu

A function can be represented by a sum of cubes (products):

f = ab + ac + bc

Since each cube is a product of literals, this is a “sum of products” representation

A SOP can be thought of as a set of cubes FF = {ab, ac, bc} = C

A set of cubes that represents f is called a cover of f. F={ab, ac, bc} is a cover of f = ab + ac + bc.

Sum-of-products (SOP)

Page 12: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 12 http://vlsicad.ucsd.edu

Prime Cover

A cube is prime if there is no other cube that contains it(for example, b c is not a prime but b is)

A cover is prime iff all of its cubes are prime

c

ab

Page 13: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 13 http://vlsicad.ucsd.edu

Irredundant Cube

A cube of a cover C is irredundant if C fails to be a cover if c is dropped from C

A cover is irredundant iff all its cubes are irredudant (for exmaple, F = a b + a c + b c)

c

ba

Not covered

Page 14: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 14 http://vlsicad.ucsd.edu

Quine-McCluskey Method

We want to find a minimum prime and irredundant cover for a given function.

Prime cover leads to min number of inputs to each product term.

Min irredundant cover leads to min number of product terms.

Quine-McCluskey (QM) method (1960’s) finds a minimum prime and irredundant cover.

Step 1: List all minterms of on-set: O(2^n) n = #inputs Step 2: Find all primes: O(3^n) n = #inputs Step 3: Construct minterms vs primes table Step 4: Find a min set of primes that covers all the minterms:

O(2^m) m = #primes

Page 15: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 15 http://vlsicad.ucsd.edu

QM Example (Step 1)

F = a’ b’ c’ + a b’ c’ + a b’ c + a b c + a’ b c

List all on-set minterms

Minterms

a’ b’ c’

a b’ c’

a b’ c

a b c

a’ b c

Page 16: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 16 http://vlsicad.ucsd.edu

QM Example (Step 2)

F = a’ b’ c’ + a b’ c’ + a b’ c + a b c + a’ b c

Find all primes.

primes b’ c’ a b’ a c b c

Page 17: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 17 http://vlsicad.ucsd.edu

QM Example (Step 3)

F = a’ b’ c’ + a b’ c’ + a b’ c + a b c + a’ b c

Construct minterms vs primes table (prime implicant table) by determining which cube is contained in which prime. X at row i, colum j means that cube in row i is contained by prime in column j.

b’ c’ a b’ a c b c

a’ b’ c’ X

a b’ c’ X X

a b’ c X X

a b c X X

a’ b c X

Page 18: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 18 http://vlsicad.ucsd.edu

QM Example (Step 4)

F = a’ b’ c’ + a b’ c’ + a b’ c + a b c + a’ b c

Find a minimum set of primes that covers all the minterms

“Minimum column covering problem”

b’ c’ a b’ a c b c

a’ b’ c’ X

a b’ c’ X X

a b’ c X X

a b c X X

a’ b c X

Essential primes

Page 19: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 19 http://vlsicad.ucsd.edu

ESPRESSO – Heuristic Minimizer

Quine-McCluskey gives a minimum solution but is only good for functions with small number of inputs (< 10)

ESPRESSO is a heuristic two-level minimizer that finds a “minimal” solution

ESPRESSO(F) {

do {

reduce(F);

expand(F);

irredundant(F);

} while (fewer terms in F);

verfiy(F);

}

Page 20: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 20 http://vlsicad.ucsd.edu

ESPRESSO ILLUSTRATED

Reduce

Irredundant

Expand

Page 21: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 21 http://vlsicad.ucsd.edu

Outline

Introduction

Two-level Logic Synthesis

Multi-level Logic Synthesis

Timing optimization in Synthesis

Page 22: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 22 http://vlsicad.ucsd.edu

Multi-level Logic Synthesis

Two-level logic synthesis is effective and mature

Two-level logic synthesis is directly applicable to PLAs and PLDs

But…

There are many functions that are too expensive to implement in two-level forms (too many product terms!)

Two-level implementation constrains layout (AND-plane, OR-plane)

Rule of thumb: Two-level logic is good for control logic Multi-level logic is good for datapath or random logic

Page 23: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 23 http://vlsicad.ucsd.edu

Two-Level (PLA) vs. Multi-Level

PLAcontrol logic

constrained layout

highly automatic

technology independent

multi-valued logic

slower?

input, output, state encoding

Multi-levelall logic

general

automatic

partially technology independent

coming

can be high speed

some results

Page 24: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 24 http://vlsicad.ucsd.edu

Multi-level Logic Synthesis Problem

Given Initial Boolean network Design constraints

- Arrival times, required times, power consumption, noise immunity, etc…

Target technology libraries

Produce a minimum area netlist consisting of the gates from the target

libraries such that design constraints are satisfied

Page 25: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 25 http://vlsicad.ucsd.edu

Modern Approach to Logic Optimization

Divide logic optimization into two subproblems:

Technology-independent optimization- determine overall logic structure

- estimate costs (mostly) independent of technology

- simplified cost modeling

Technology-dependent optimization (technology mapping)

- binding onto the gates in the library

- detailed technology-specific cost model

Orchestration of various optimization/transformation techniques for each subproblem

Slide courtesy of Keutzer

Page 26: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 26 http://vlsicad.ucsd.edu

Optimization Cost Criteria

The accepted optimization criteria for multi-level logic are to minimize some function of:

1. Area occupied by the logic gates and interconnect (approximated by literals = transistors in technology independent optimization)

2. Critical path delay of the longest path through the logic

3. Degree of testability of the circuit, measured in terms of the percentage of faults covered by a specified set of test vectors for an approximate fault model (e.g. single or multiple stuck-at faults)

4. Power consumed by the logic gates

5. Noise Immunity

6. Wireability

while simultaneously satisfying upper or lower bound constraints placed on these physical quantities

Page 27: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 27 http://vlsicad.ucsd.edu

Representation: Boolean Network

Boolean network:• directed acyclic graph (DAG)• node logic function representation

fj(x,y)

• node variable yj: yj= fj(x,y)

• edge (i,j) if fj depends explicitly on yi

Inputs x = (x1, x2,…,xn )

Outputs z = (z1, z2,…,zp )

Slide courtesy of Brayton

Page 28: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 28 http://vlsicad.ucsd.edu

Network Representation

Boolean network:

Page 29: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 29 http://vlsicad.ucsd.edu

Node Representation: Sum of Products (SOP)

Example:

abc’+a’bd+b’d’+b’e’f (sum of cubes)

Advantages:• easy to manipulate and minimize• many algorithms available (e.g. AND, OR, TAUTOLOGY)• two-level theory applies

Disadvantages:• Not representative of logic complexity. For example f=ad+ae+bd+be+cd+ce f’=a’b’c’+d’e’

These differ in their implementation by an inverter.• hence not easy to estimate logic; difficult to estimate progress during logic manipulation

Page 30: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 30 http://vlsicad.ucsd.edu

Factored Forms

Example: (ad+b’c)(c+d’(e+ac’))+(d+e)fg

Advantages• good representative of logic complexity f=ad+ae+bd+be+cd+ce

f’=a’b’c’+d’e’ f=(a+b+c)(d+e)• in many designs (e.g. complex gate CMOS) the implementation of a function corresponds directly to

its factored form• good estimator of logic implementation complexity• doesn’t blow up easily

Disadvantages• not as many algorithms available for manipulation• hence usually just convert into SOP before manipulation

Page 31: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 31 http://vlsicad.ucsd.edu

Factored Forms

Note:

literal count transistor count area

(however, area also depends on wiring)

Page 32: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 32 http://vlsicad.ucsd.edu

Factored Forms

Definition : a factored form can be defined recursively by the following rules. A factored form is either a product or sum where:• a product is either a single literal or product of factored forms• a sum is either a single literal or sum of factored forms

A factored form is a parenthesized algebraic expression.

In effect a factored form is a product of sums of products … or a sum of products of sums …

Any logic function can be represented by a factored form, and any factored form is a representation of some logic function.

Page 33: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 33 http://vlsicad.ucsd.edu

Factored Forms

When measured in terms of number of inputs, there are functions whose size is exponential in sum of products representation, but polynomial in factored form.

Example: Achilles’ heel function

There are n literals in the factored form and (n/2)2n/2 literals in the SOP form.

2/

1212 )(

ni

iii xx

Factored forms are useful in estimating area and delay in a multi-level synthesis and optimization system.

In many design styles (e.g. complex gate CMOS design) the implementation of a function corresponds directly to its factored form.

Page 34: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 34 http://vlsicad.ucsd.edu

Factored Forms

Factored forms cam be graphically represented as labeled trees, called factoring trees, in which each internal node including the root is labeled with either + or , and each leaf has a label of either a variable or its complement.

Example: factoring tree of ((a’+b)cd+e)(a+b’)+e’

Page 35: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 35 http://vlsicad.ucsd.edu

Reduced Ordered BDDs

• like factored form, represents both function and complement

• like network of muxes, but restricted since controlled by primary input variables

- not really a good estimator for implementation complexity

• given an ordering, reduced BDD is canonical, hence a good replacement for truth tables

• for a good ordering, BDDs remain reasonably small for complicated functions (e.g. not multipliers)

• manipulations are well defined and efficient• true support (dependency) is displayed

Page 36: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 36 http://vlsicad.ucsd.edu

Technology-Independent Optimization

Technology-independent optimization is a bag of tricks: Two-level minimization (also called simplify) Constant propagation (also called sweep)

f = a b + c; b = 1 => f = a + c Decomposition (single function)

f = abc+abd+a’c’d’+b’c’d’ => f = xy + x’y’; x = ab ; y = c+d

Extraction (multiple functions)

f = (az+bz’)cd+e g = (az+bz’)e’ h = cde

f = xy+e g = xe’ h = ye x = az+bz’ y = cd

Page 37: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 37 http://vlsicad.ucsd.edu

More Technology-Independent Optimization

More technology-independent optimization tricks: Substitution

g = a+b f = a+bc

f = g(a+c)

Collapsing (also called elimination)f = ga+g’b g = c+d

f = ac+ad+bc’d’ g = c+d

Factoring (series-parallel decomposition)

f = ac+ad+bc+bd+e => f = (a+b)(c+d)+e

Page 38: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 38 http://vlsicad.ucsd.edu

Summary of Typical Recipe for TI Optimization

Propagate constants

Simplify: two-level minimization at Boolean network node

Decomposition

Local “Boolean” optimizations Boolean techniques exploit Boolean identities (e.g., a a’ = 0)

Consider f = a b’ + a c’ + b a’ + b c’ + c a’ + c b’ Algebraic factorization procedures

f = a (b’ + c’) + a’ (b + c) + b c’ + c b’ Boolean factorization procedures

f = (a + b + c) (a’ + b’ + c’)

Slide courtesy of Keutzer

Page 39: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 39 http://vlsicad.ucsd.edu

Technology-Dependent Optimization

Technology-dependent optimization consists of

Technology mapping: maps Boolean network to a set of gates from technology libraries

Local transformations Discrete resizing Cloning Fanout optimization (buffering) Logic restructuring

Slide courtesy of Keutzer

Page 40: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 40 http://vlsicad.ucsd.edu

Technology Mapping

Input Technology independent, optimized logic network Description of the gates in the library with their cost

Output Netlist of gates (from library) which minimizes total cost

General Approach Construct a subject DAG for the network Represent each gate in the target library by pattern DAG’s Find an optimal-cost covering of subject DAG using the

collection of pattern DAG’s Canonical form: 2-input NAND gates and inverters

Page 41: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 41 http://vlsicad.ucsd.edu

DAG Covering

DAG covering is an NP-hard problem

Solve the sub-problem optimally Partition DAG into a forest of trees Solve each tree optimally using tree covering Stitch trees back together

Slide courtesy of Keutzer

Page 42: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 42 http://vlsicad.ucsd.edu

Tree Covering Algorithm

Transform netlist and libraries into canonical forms 2-input NANDs and inverters

Visit each node in BFS from inputs to outputs Find all candidate matches at each node N

- Match is found by comparing topology only (no need to compare functions)

Find the optimal match at N by computing the new cost- New cost = cost of match at node N + sum of costs for matches at

children of N Store the optimal match at node N with cost

Optimal solution is guaranteed if cost is area

Complexity = O(n) where n is the number of nodes in netlist

Page 43: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 43 http://vlsicad.ucsd.edu

Tree Covering Example

into the technology library (simple example below)

Find an ``optimal’’ (in area, delay, power) mapping of this circuit

Slide courtesy of Keutzer

Page 44: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 44 http://vlsicad.ucsd.edu

Elements of a library - 1

INVERTER 2

NAND2 3

NAND3 4

NAND4 5

Element/Area Cost Tree Representation (normal form)

Slide courtesy of Keutzer

Page 45: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 45 http://vlsicad.ucsd.edu

Trivial Covering

subject DAG

7 NAND2 (3) = 215 INV (2) = 10

Area cost 31

Slide courtesy of Keutzer

Can we do better with tree covering?

Page 46: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 46 http://vlsicad.ucsd.edu

Optimal tree covering - 1

``subject tree’’

3

2

2

3

Slide courtesy of Keutzer

Page 47: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 47 http://vlsicad.ucsd.edu

Optimal tree covering - 2

``subject tree’’

5

8

3

2

2

3

Slide courtesy of Keutzer

Page 48: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 48 http://vlsicad.ucsd.edu

Optimal tree covering - 3

``subject tree’’

Cover with ND2 or ND3 ?

3

2

2

3

813

5

1 NAND2 3+ subtree 5

1 NAND3 = 4

Area cost 8Slide courtesy of Keutzer

Page 49: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 49 http://vlsicad.ucsd.edu

Optimal tree covering – 3b

``subject tree’’

3

2

2

3

813

5 4

Label the root of the sub-tree with optimal match and cost

Slide courtesy of Keutzer

Page 50: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 50 http://vlsicad.ucsd.edu

Optimal tree covering - 4

``subject tree’’

Cover with INV or AO21 ?

54

3

8

2

2

13

2

1 Inverter 2+ subtree 13

Area cost 15

1 AO21 4+ subtree 1 3+ subtree 2 2

Area cost 9

Slide courtesy of Keutzer

Page 51: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 51 http://vlsicad.ucsd.edu

Optimal tree covering – 4b

``subject tree’’54

3

8

2

2

13

2

9

Label the root of the sub-tree with optimal match and cost

Slide courtesy of Keutzer

Page 52: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 52 http://vlsicad.ucsd.edu

Optimal tree covering - 5

``subject tree’’

Cover with ND2 or ND3 ?

subtree 1 9subtree 2 41 NAND2 3

Area cost 16

NAND2 NAND3

8

4

9

subtree 1 8subtree 2 2subtree 3 41 NAND3 4

Area cost 18

2

Slide courtesy of Keutzer

Page 53: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 53 http://vlsicad.ucsd.edu

Optimal tree covering – 5b

``subject tree’’

168

4

9

2

Label the root of the sub-tree with optimal match and cost

Slide courtesy of Keutzer

Page 54: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 54 http://vlsicad.ucsd.edu

Optimal tree covering - 6

``subject tree’’

Cover with INV or AOI21 ?

INV AOI21

Area cost 22

5

16

Area cost 18

subtree 1 161 INV 2

subtree 1 13subtree 2 51 AOI21 4

13

Slide courtesy of Keutzer

Page 55: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 55 http://vlsicad.ucsd.edu

Optimal tree covering – 6b

``subject tree’’5

16

1813

Label the root of the sub-tree with optimal match and cost

Slide courtesy of Keutzer

Page 56: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 56 http://vlsicad.ucsd.edu

Optimal tree covering - 7

``subject tree’’

Cover with ND2 or ND3 or ND4 ?

Slide courtesy of Keutzer

Page 57: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 57 http://vlsicad.ucsd.edu

Cover 1 - NAND2

``subject tree’’

Cover with ND2 ?

16

18

subtree 1 18subtree 2 01 NAND2 3

Area cost 21

4

9

Slide courtesy of Keutzer

Page 58: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 58 http://vlsicad.ucsd.edu

Cover 2 - NAND3

``subject tree’’

Cover with ND3?

subtree 1 9subtree 2 4subtree 3 01 NAND3 4

Area cost 17

9

4

Slide courtesy of Keutzer

Page 59: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 59 http://vlsicad.ucsd.edu

Cover - 3

``subject tree’’

Cover with ND4 ?

Area cost 19

subtree 1 8subtree 2 2subtree 3 4subtree 4 01 NAND4 5

8

4

2

Slide courtesy of Keutzer

Page 60: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 60 http://vlsicad.ucsd.edu

Optimal Cover was Cover 2

``subject tree’’

INV 2ND2 32 ND3 8AOI21 4

Area cost 17

AOI21

ND2

INV

ND3

ND3

Slide courtesy of Keutzer

Page 61: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 61 http://vlsicad.ucsd.edu

Summary of Technology Mapping

DAG covering formulation Separated library issues from mapping algorithm (can’t do this

with rule-based systems)

Tree covering approximation Very efficient (linear time) Applicable to wide range of libraries (std cells, gate arrays) and

technologies (FPGAs, CPLDs)

Weaknesses Problems with DAG patterns (Multiplexors, full adders, …) Large input gates lead to a large number of patterns

Page 62: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 62 http://vlsicad.ucsd.edu

Outline

Introduction

Two-level Logic Synthesis

Multi-level Logic Synthesis

Timing optimization in Synthesis

Page 63: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 63 http://vlsicad.ucsd.edu

Timing Optimization in Synthesis

Factors determining delay of circuit:

Underlying circuit technology Circuit type (e.g. domino, static CMOS, etc.) Gate type Gate size

Logical structure of circuit Length of computation paths False paths Buffering

Parasitics Wire loads Layout

Page 64: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 64 http://vlsicad.ucsd.edu

Problem Statement

Given:

Initial circuit function description

Library of primitive functions

Performance constraints (arrival/required times)

Generate:

an implementation of the circuit using the primitive functions, such that:

1. performance constraints are met

2. circuit area is minimized

Page 65: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 65 http://vlsicad.ucsd.edu

Current Design Process

BehaviorBehaviorOptiizationOptiization(scheduling)(scheduling)

PartitioningPartitioning(retiming)(retiming)

Logic synthesisLogic synthesis•Technology independentTechnology independent•Technology mappingTechnology mapping

Timing drivenTiming drivenplace and routeplace and route

Behavioral descriptionBehavioral description

Logic and latchesLogic and latches

Logic equationsLogic equations

Gate netlistGate netlist

Layout Layout

•Gate libraryGate library•Perf. ConstraintsPerf. Constraints•Delay modelsDelay models

Page 66: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 66 http://vlsicad.ucsd.edu

Synthesis delay models

Why are technology independent delay reductions hard?

Lack of fast and accurate delay models

1. # levels, fast but crude

2. # levels + correction term (fanout, wires,… ): a little better, but still crude (what coefficients to use?)

3. Technology mapped: reasonable, but very slow

4. Place and route: better but extremely slow

5. Silicon: best, but infeasibly slow (except for FPGAs)

bbeetttteerr

sslloowweerr

Page 67: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 67 http://vlsicad.ucsd.edu

Clustering/partial-collapse

Traditional critical-path based methods require Well defined critical path Good delay/slack information

Problems: Good delay information comes from mapper and layout Delay estimates and models are weak

Possible solutions: Better delay modeling at technology independent level

Make speedup, insensitive to actual critical paths and mapped delays

Page 68: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 68 http://vlsicad.ucsd.edu

Overview of Solutions for delay

1. Circuit re-structuring Rescheduling operations to reduce time of computation

2. Implementation of function trees (technology mapping) Selection of gates from library

- Minimum delay (load independent model)- Minimize delay and area

Implementation of buffer trees

3. Resizing

Focus here on circuit re-structuring

Page 69: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 69 http://vlsicad.ucsd.edu

Circuit re-structuring

Approaches:

Local:

Mimic optimization techniques in adders Carry lookahead (tree height reduction) Conditional sum (generalized select transformation) Carry bypass (generalized bypass transformation)

Global:

Reduce depth of entire circuit Partial collapsing Boolean simplification

Page 70: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 70 http://vlsicad.ucsd.edu

Re-structuring methods

Performance measured by 1. levels,

2. sensitizable paths,

3. technology dependent delays

Level based optimizations: Tree height reduction (Singh ‘88) Partial collapsing and simplification (Touati ‘91) Generalized select transform (Berman ‘90)

Sensitizable paths Generalized bypass transform (Mcgeer ‘91)

Page 71: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 71 http://vlsicad.ucsd.edu

Re-structuring for delay: tree-height reduction

nn

ll mm

ii jj

hh

kk33

66

55 55

11 4411

00 00 00 00 22 00 00aa bb cc dd ee ff gg

ii11

00 00

aa bb

mm

jj

hh

kk3344

11

00 00 22 00 00

cc dd ee ff gg

n’n’DuplicatedDuplicatedlogiclogic

11220000

55CriticalCriticalregionregion

CollapsedCollapsedCritical regionCritical region

Page 72: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 72 http://vlsicad.ucsd.edu

Restructuring for delay: path reduction

ii11

00 00

aa bb

mm

jj

hh

kk3344

11

00 00 22 00 00

cc dd ee ff gg

n’n’DuplicatedDuplicatedlogiclogic

11220000

55

ii11

00 00

aa bb

mm

jj

hh

kk33

4411

00 00 22 00 00

cc dd ee ff gg

1122

00

3355

n’n’

22

11

00

44

CollapsedCollapsedCritical regionCritical region

New delay = 5New delay = 5

Page 73: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 73 http://vlsicad.ucsd.edu

Generalized select transform (GST)

Late signal feeds multiplexor

cc dd ee ff gg

aa

bb

outout

cc dd ee ff gg

bb

cc dd ee ff gg

bb

a=0a=0

a=1a=1

outout00

11

aa

Page 74: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 74 http://vlsicad.ucsd.edu

Generalized bypass transform (GBX)

Make critical path false Speed up the circuit

Bypass logic of critical path(s)

ffmm=f=f ffm+1m+1 ffnn=g=g……

ffm m =f=f ffm+1m+1 ffnn=g=g…… 00

11g’g’

dgdg____dfdf

BooleanBooleandifferencedifference

s-a-0 redundants-a-0 redundant

Page 75: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 75 http://vlsicad.ucsd.edu

cc dd ee ff ggbb

cc dd ee ff ggbb

a=0a=0

a=1a=1

outout00

11

aa

GSTGST

GST vs GBX

…… 00

11g’g’

dhdh____dada

aa

0/10/1

bb

cc gg

hh

cc dd ee ff ggbb

cc dd ee ff ggbb

a=0a=0

a=1a=1

…… 00

11g’g’

0/10/1 cc gg

bb

GBXGBX

aa

hh

Boolean

diffe

Note:

rence =

a a a

hh h

GBXGBX

Page 76: ECE 260B – CSE 241A Static Timing Analysis 1 ECE260B – CSE241A Winter 2005 Logic Synthesis Website:

ECE 260B – CSE 241A Static Timing Analysis 76 http://vlsicad.ucsd.edu

Thank you!