A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Sunil P. Khatri (University of Colorado, Boulder), Amit Mehrotra (University of Illinois, Urbana-Champaign), Robert K. Brayton and Alberto L. Sangiovanni-Vincentelli (University of California, Berkeley)


Page 1: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Sunil P. Khatri, University of Colorado, Boulder
Amit Mehrotra, University of Illinois, Urbana-Champaign
Robert K. Brayton and Alberto L. Sangiovanni-Vincentelli, University of California, Berkeley

Page 2: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Our VLSI Design Flow

Logic netlist -> Logic Optimization -> Optimized logic netlist -> Technology Mapping -> Placement -> Routing -> Layout

Page 3: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Motivation

Modern IC processes
  Feature size well below 1 micron
  Certain electrical effects increasingly important
    Cross-talk
    Electromigration
    Self-heating
    Statistical variations
Logic abstraction eroded
Existing design paradigms need to be rethought

Page 4: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Research Focus

The cross-talk issue
  Tackled in an ad-hoc manner
  Increases turn-around time
Verified cross-talk trends
  Accurate 3-D capacitance extraction
  Delay variation 2.47:1 (200 micron wires, 10X drivers, 0.1 micron technology)

[Figure: neighbouring wires labeled a and v, with capacitances labeled C1 and C2]

Page 5: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Outline

Previous Approaches
New idea: The Fabric Approach
  Fabric1 (in DAC-1999): Standard-cell based design
  Fabric3 (in ICCAD-2000): Network of PLA based design
Further Tasks
Summary

Page 6: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Previous Approaches

[ALPHA 97]: Metal layers 3 and 6 dedicated to power
  Not viable in future processes
[Rubio 94]: Functional analysis based on layout
  Post-layout methods don’t scale
[Kirkpatrick 94, 96]: Concept of digital sensitivity
  Requires don’t-care and image computations

Page 7: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Solution: Layout Fabrics

Repeating dense wiring fabric (DWF) pattern at minimum pitch
We handle cross-talk by design
A new layout and design paradigm

[Figure: the DWF pattern, with wires labeled V, S, and G]
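As a minimal sketch of what a repeating fabric can look like, the Python below lays out tracks in a repeating V-S-G-S order and checks that no two signal tracks are adjacent. The period "VSGS", the function names, and the reading V = VDD, G = GND, S = signal are assumptions for illustration; the slide itself only shows wires labeled V, S, and G.

```python
from itertools import cycle, islice

# Assumed repeating unit of the fabric: V = VDD, S = signal, G = GND.
DWF_PERIOD = "VSGS"

def dwf_tracks(n_tracks: int, period: str = DWF_PERIOD) -> list[str]:
    """Return the labels of the first n_tracks wiring tracks."""
    return list(islice(cycle(period), n_tracks))

def signals_are_shielded(tracks: list[str]) -> bool:
    """True if no signal track is adjacent to another signal track."""
    return all(
        not (left == "S" and right == "S")
        for left, right in zip(tracks, tracks[1:])
    )

def signal_utilization(tracks: list[str]) -> float:
    """Fraction of tracks available for signals."""
    return tracks.count("S") / len(tracks)

tracks = dwf_tracks(12)
print("".join(tracks))
print(signals_are_shielded(tracks))
print(signal_utilization(tracks))
```

Under this assumed period only half of the tracks carry signals, which gives a rough feel for where a routing-area cost could come from.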

Page 8: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Research Contribution

Verify cross-talk trends
Fabric1 [KMBSO99] (in DAC)
  Incorporated into traditional design flow
Fabric3 [KBS00] (in ICCAD-00)
  Network of PLAs
  Detailed electrical characterization
  Synthesis, wire removal algorithms
Both utilize DWF pattern
  1.02:1 cross-talk delay variation

Page 9: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Layout Fabrics

Advantages
  Pre-characterized parasitics
  Uniform, low cross-coupling capacitance
    40X lower, 2% delay variation
  Uniform, low signal inductance
  Automatic power and ground routing
    Uniform, low power and ground resistance
  Can effectively implement regular structures
Disadvantages
  5% increase in total capacitance
  Area penalty
  Power increase

Page 10: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Capacitance in DWF

Experimental setup
  “Strawman” process model, copper wires, low-K dielectric
  Capacitances from 3-D field solver (space3d)
  Simulated three wires in SPICE
    0.1 micron process, Metal2 wires
    Length 200 microns, 10x minimum drivers
Non-DWF
  Delay variation 2.47:1
  Signal integrity problems for fast slew rates
With DWF
  40X reduction in cross-coupling capacitance
  Delay variation 1.02:1, no signal integrity problem
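To see why a 40X drop in coupling capacitance collapses the delay spread, here is a first-order sketch in Python. It assumes delay is proportional to effective load capacitance and uses the common Miller approximation, in which a neighbour switching against the victim counts twice and one switching with it counts zero. The function names and the normalised capacitance are illustrative assumptions; this is not the space3d/SPICE setup of the slide, and it ignores the change in total capacitance.

```python
def delay_variation(c_ground: float, c_couple: float, neighbours: int = 2) -> float:
    """Worst-case to best-case delay ratio under a Miller-factor model.

    Delay is assumed proportional to effective capacitance:
      best case  : all neighbours switch with the victim    -> factor 0
      worst case : all neighbours switch against the victim -> factor 2
    """
    best = c_ground
    worst = c_ground + 2.0 * neighbours * c_couple
    return worst / best

def coupling_for_ratio(c_ground: float, ratio: float, neighbours: int = 2) -> float:
    """Invert the model: coupling capacitance that yields a given ratio."""
    return (ratio - 1.0) * c_ground / (2.0 * neighbours)

# Normalised ground capacitance (hypothetical unit value).
c_g = 1.0
# Coupling capacitance that would explain the 2.47:1 non-DWF spread.
c_c = coupling_for_ratio(c_g, 2.47)
print(round(delay_variation(c_g, c_c), 2))
print(round(delay_variation(c_g, c_c / 40.0), 2))
```

In this crude model the 40X reduction brings the ratio down to roughly 1.04:1, the same order as the 1.02:1 reported on the slide; an exact match is not expected from a model this simple.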

Page 11: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Inductance in the DWF

Low and uniform in DWF
  Current return path is at minimum spacing
In regular layout style, varies greatly
  Problems reported for clock signals
Compared inductance of Metal8 trace
  Verified using ASITIC

Inductance (nH / micron)
Process   DWF    Stdcell
0.10      2.68   4.16

Page 12: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

VDD/GND Resistance in DWF

Check resistance at various points in DWF
Compare with standard cell case
  Varies greatly
  Measured at end of row
  L/W = 1000/8

VDD/GND resistance (ohms)
Process   DWF           Stdcell
0.25      0.17 – 0.24   5.5
0.10      0.39 – 0.54   10.63
0.05      0.68 – 0.86   20.1
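The L/W = 1000/8 figure ties into the usual sheet-resistance relation R = R_sheet * (L / W). The sketch below back-computes the sheet resistance that each standard-cell entry would imply. It assumes, and this is not stated on the slide, that the Stdcell column is the end-to-end resistance of a single uniform rail with that L/W; the function names are hypothetical.

```python
def rail_resistance(r_sheet: float, length: float, width: float) -> float:
    """Resistance of a uniform rectangular wire: R = R_sheet * (L / W)."""
    return r_sheet * length / width

def implied_sheet_resistance(r_total: float, length: float, width: float) -> float:
    """Invert the relation to get ohms per square from a total resistance."""
    return r_total * width / length

# Stdcell column of the slide's table (ohms), keyed by process in microns.
stdcell_ohms = {0.25: 5.5, 0.10: 10.63, 0.05: 20.1}

for process, r_total in stdcell_ohms.items():
    r_sq = implied_sheet_resistance(r_total, length=1000, width=8)
    print(f"{process:.2f} micron: {r_sq:.4f} ohm/sq over {1000 // 8} squares")
```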

Page 13: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Buffer Insertion in DWF

Easily performed
  VDD and GND available all over routing area

Page 14: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Fabric1 - Introduction

DWF pattern utilized chip-wide
Library cells implemented in this pattern
[Figure: Std Cell versus Fabric Cell]
Synthesis, placement and routing use standard cell methodology

Page 15: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Fabric1 - Results

[Figure: Layout Area (2 Layers, using DWF), Stdcell versus Fabric1]

Page 16: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Fabric1 - Results

[Figure: Layout Area (using DWF), Stdcell versus Fabric1]
[Figure: Layout Area (no DWF), Stdcell versus Fabric1_min]

Page 17: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Fabric3

Network of Programmable Logic Arrays
  Combine many logic nodes into a PLA
  Routing area utilizes DWF pattern
  PLA implements a multi-output function
    Example: f = a b + c ; g = a b + c

[Figure: PLA with an AND plane and an OR plane, inputs a, b, c and outputs f, g]
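A small Python sketch may help show how an AND plane and an OR plane together implement a multi-output function. The two functions used here, f = a b + c and g = a b + c', are hypothetical stand-ins chosen so that the outputs share a product term; the data layout and the name `evaluate_pla` are likewise assumptions for illustration.

```python
# Each AND-plane row is a product term: a dict mapping an input name to the
# literal value it requires (True = uncomplemented, False = complemented).
AND_PLANE = [
    {"a": True, "b": True},   # row 0: a AND b
    {"c": True},              # row 1: c
    {"c": False},             # row 2: NOT c
]
# Each OR-plane column lists the rows that feed that output.
OR_PLANE = {
    "f": [0, 1],  # f = a b + c
    "g": [0, 2],  # g = a b + c'  (hypothetical; shares row 0 with f)
}

def evaluate_pla(inputs: dict[str, bool]) -> dict[str, bool]:
    """Evaluate every output of the PLA for one input assignment."""
    rows = [
        all(inputs[name] == wanted for name, wanted in term.items())
        for term in AND_PLANE
    ]
    return {out: any(rows[r] for r in feeds) for out, feeds in OR_PLANE.items()}

print(evaluate_pla({"a": True, "b": True, "c": False}))
print(evaluate_pla({"a": False, "b": True, "c": False}))
```

Row 0 is computed once and used by both outputs, which is the sharing that makes a multi-output PLA compact.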

Page 18: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Fabric3

PLA Core Layout

[Figure: PLA core layout, with signals a, b, f, g and clk]

Page 19: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

PLAs v/s Standard Cells

PLAs are dense and fast

[Figures: Delay, Area, and Power for cmb, cu, x2, and z4ml, PLA versus Standard Cell]

Circuit   # input   # output   # row
cmb       16        4          15
cu        14        11         19
x2        10        7          17
z4ml      7         4          59
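For a rough sense of how the input, output, and row counts of these benchmarks translate into array size, the sketch below uses the textbook model of an unfolded PLA core: two AND-plane columns per input (true and complemented) plus one OR-plane column per output, times the number of rows. This model and the name `pla_grid` are assumptions for illustration; the slides do not say how the area figures were obtained.

```python
# (inputs, outputs, rows) for each benchmark, copied from the slide's table.
BENCHMARKS = {
    "cmb": (16, 4, 15),
    "cu": (14, 11, 19),
    "x2": (10, 7, 17),
    "z4ml": (7, 4, 59),
}

def pla_grid(n_inputs: int, n_outputs: int, n_rows: int) -> tuple[int, int]:
    """Return (columns, crosspoints) for an unfolded PLA core.

    Assumes each input contributes a true and a complemented column in the
    AND plane and each output contributes one column in the OR plane.
    """
    columns = 2 * n_inputs + n_outputs
    return columns, columns * n_rows

for name, (n_in, n_out, n_rows) in BENCHMARKS.items():
    columns, crosspoints = pla_grid(n_in, n_out, n_rows)
    print(f"{name}: {columns} columns x {n_rows} rows = {crosspoints} crosspoints")
```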

Page 20: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

PLA Characteristics

Why are the PLA area and delay so low?
  Wiring localized within PLA
  PLA core transistor sizes are minimum
  No p-transistor to n-transistor diffusion spacing
“Gigahertz” chip utilized pre-charged PLAs
  High performance
  Quick implementation
  Didn’t use a network of PLAs

Page 21: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Network of PLAs

PLAs are pre-charged
  Inputs to all PLAs must settle before evaluation begins

[Figure: example network with nodes a, b, c, d, e, f, g]

Page 22: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Network of PLAs

For correct operation:
  PLA dependency graph must be acyclic
  Evaluation of PLAi after completion of slowest PLAj in its “fanin”
Self-timed design style
  Each PLA generates a completion signal
  Overhead of one wordline, one output
  Delay formula to find slowest PLAj
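The two correctness conditions can be sketched in Python: reject a cyclic dependency graph, then start each PLA when the slowest PLA in its fanin has finished. The per-PLA delays, the PLA names, and the function `evaluation_schedule` are hypothetical; the slides do not give the actual delay formula, and primary inputs are assumed to be settled at time 0.

```python
from graphlib import CycleError, TopologicalSorter

def evaluation_schedule(
    fanin: dict[str, set[str]], delay: dict[str, float]
) -> dict[str, tuple[float, float]]:
    """Return {pla: (start, finish)} for a self-timed network of PLAs.

    fanin maps each PLA to the PLAs whose outputs it reads. A PLA starts
    evaluating when the slowest PLA in its fanin has finished.
    Raises ValueError if the dependency graph has a cycle.
    """
    try:
        order = list(TopologicalSorter(fanin).static_order())
    except CycleError as exc:
        raise ValueError(f"PLA dependency graph is cyclic: {exc.args[1]}") from exc

    times: dict[str, tuple[float, float]] = {}
    for pla in order:
        start = max((times[p][1] for p in fanin.get(pla, ())), default=0.0)
        times[pla] = (start, start + delay[pla])
    return times

# Hypothetical four-PLA network: P3 reads P1 and P2, P4 reads P3.
fanin = {"P1": set(), "P2": set(), "P3": {"P1", "P2"}, "P4": {"P3"}}
delay = {"P1": 1.0, "P2": 2.5, "P3": 1.5, "P4": 1.0}
for pla, (start, finish) in sorted(evaluation_schedule(fanin, delay).items()):
    print(pla, start, finish)
```

Here P3 waits for P2, the slower of its two fanin PLAs, which is the role the completion signal plays in the self-timed style.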

Page 23: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Decomposition

Algorithm collapses wiring into PLAs
Input: multi-level combinational network, W bound, H bound
Output: Correct network of PLAs
Our algorithm greedily grows a PLA until either bound is violated
Attempt to reduce wires by selecting fanouts for inclusion in the PLA being grown
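The greedy growth step can be illustrated with a deliberately simplified Python sketch. It is not the authors' algorithm: instead of selecting fanouts, it walks the nodes in topological order and closes the current PLA when adding the next node would exceed either bound, which keeps the PLA dependency graph acyclic by construction. The width and height measures (external inputs plus outputs, and a plain sum of rows with no espresso-style minimization) and all names are assumptions.

```python
from graphlib import TopologicalSorter

def greedy_decompose(fanin, rows, w_bound, h_bound):
    """Group nodes into PLAs greedily, in topological order.

    fanin: {node: set of signals it reads (nodes or primary inputs)}
    rows:  {node: number of product terms the node needs}
    Width of a group  = external inputs + outputs seen outside the group.
    Height of a group = sum of its nodes' rows (no merging of terms).
    A single node that exceeds a bound on its own still gets its own PLA.
    """
    fanout = {n: set() for n in fanin}
    for node, sources in fanin.items():
        for src in sources:
            if src in fanout:
                fanout[src].add(node)

    def width(group):
        ext_inputs = set().union(*(fanin[n] for n in group)) - group
        outputs = {n for n in group if not fanout[n] or fanout[n] - group}
        return len(ext_inputs) + len(outputs)

    def height(group):
        return sum(rows[n] for n in group)

    # static_order() also yields primary inputs; keep only real nodes.
    order = [n for n in TopologicalSorter(fanin).static_order() if n in fanin]
    plas, current = [], set()
    for node in order:
        trial = current | {node}
        if current and (width(trial) > w_bound or height(trial) > h_bound):
            plas.append(current)  # a bound would be violated: close this PLA
            current = {node}
        else:
            current = trial
    if current:
        plas.append(current)
    return plas

# Hypothetical chain of four nodes over primary inputs x1..x4.
fanin = {"n1": {"x1", "x2"}, "n2": {"n1", "x3"}, "n3": {"n2", "x4"}, "n4": {"n3", "x1"}}
rows = {"n1": 2, "n2": 2, "n3": 2, "n4": 2}
print(greedy_decompose(fanin, rows, w_bound=4, h_bound=4))
```

Absorbing n1 into the same PLA as its only fanout n2 removes the n1 wire from the routing area, which is the kind of wire reduction the slide describes.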

Page 24: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Choice of W, H

Choice of W
  Driven by synthesis constraints
  Large W means larger runtimes
    espresso and folding done in inner loop
  Used W between 25 and 50
Choice of H
  Driven by power considerations
  Large H also affects synthesis runtimes
  Used H between 15 and 40

Page 25: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Fabric3 - Decomposition

[Figure: step-by-step decomposition of an example network with nodes a, b, c, d, e, f, g]

Page 26: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Place/Route Flow

PLA generation using perl script
  Layout generated on the fly
2 Layer experiments:
  Placement using vpr
    FPGA placement tool
    All PLAs have approximately same size
  Routing using wolfe interface to TimberWolfSC and yacr
3-6 Layer experiments:
  Placement using CADENCE qplace
  Routing using CADENCE router

Page 27: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Fabric3 - Area Results

[Figure: Layout Area (2 Layers, no DWF), Stdcell versus Fabric3_min]
[Figure: Layout Area (2 Layer, with DWF), Stdcell versus Fabric3]

Page 28: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Fabric3 - Timing Results

[Figure: Timing (2 Layers, no DWF), Stdcell versus Fabric3_min]
[Figure: Timing (2 Layer, with DWF), Stdcell versus Fabric3]

Page 29: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Fabric3 - Results

[Figure: Layout Area (using DWF), Stdcell versus Fabric3]
[Figure: Layout Area (no DWF), Stdcell versus Fabric3_min]

Timing results essentially unchanged
For C3540, delay variation due to cross-talk is 3.45:1 (Stdcell) versus 1.07:1 (Fabric3)

Page 30: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Fabric3 layout (2 Layer)

Page 31: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Future Tasks

Better algorithms:
  Better ways of decomposing original netlist
Refining the fabric:
  Alternative denser fabrics
  Encoding PLA inputs [Schmookler80]
  Connecting gates to PLA outputs
Alternative implementation of logic blocks:
  Different PLA styles
  Alternative circuits

Page 32: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics

Summary

Layout fabrics to eliminate cross-talk in DSM VLSI design
  New layout and design paradigm
  Fix cross-talk by design
  Highly regular and predictable
Network of PLA based design flow
  PLA decomposition algorithms
  Minimal area penalty
  15% timing improvement

Page 33: A Deep Sub-Micron VLSI Design Flow using Layout Fabrics


Thank you!!