Hierarchical Physical Design Methodology for Multi-Million Gate Chips

Session 11

Wei-Jin Dai



Overview

• Introduction

• Challenges of hierarchical design

• Hierarchical methodology
  – Full chip physical prototyping

• Performance data

• Summary


Introduction

• As chip size and complexity grow, a hierarchical design approach becomes necessary

• Over the last 12 months, there has been a big increase in the number of chips designed with a hierarchical approach

• The advantage of the hierarchical approach is divide-and-conquer


The Challenges

• How to get full-chip (10 million gates+) physical reality early on to identify potential problems?

• How to establish a convergent process that reaches design closure from beginning to end?

• How to achieve die utilization similar to the “flat” approach?

• How to achieve clock speed and skews similar to the “flat” approach?

• How to automatically generate optimal pin assignments for each module?

• How to automatically come up with realistic timing budgets for each module?

• How to achieve top-level timing/signal integrity closure?


Creating the Physical Prototype

• Full-chip flat prototype delivers the complete physical, timing, clock and power data
  – Eliminates the guessing of the traditional block-based approaches

• Drives the partitioning into manageable blocks

[Figure: Flat Full-Chip Delivers an Accurate Physical Prototype]


Prototyping Starts Early in the Flow

• Most accurate view possible at all design stages

• Physical timing budgeting drives synthesis

[Figure labels: phases Estimation, Refinement, Optimization; design state RTL/Black box, 75% netlist/Black box, Complete netlist, Design Completion; Prototyping across all stages; Initial timing budgets, then Refined timing budgets]


Hierarchical Design Flow

[Figure: flow diagram. Recoverable labels:]

• Inputs: LEF/GDSII, RTL/Black Box, Process Data

• Flat Full Chip Physical Prototype: Quick synthesis, Floor planning, Placement, CTS, Trial route

• Physically Feasible? (Die size, Timing, Clock skew, Power, SI), with a NO branch

• Physical Partitioning: Pin assignment, Timing budget, Clock spec, Power grid

• Partition Data (one set per block), DEF Placement, Chip Level Timing Constraints

• Block Implementation: Place, CTS, Optimize

• Top Level Implementation: CTS, Optimization, Power

• Optimized Top Level Netlist
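The control structure of this flow (build a flat prototype, repeat until it is physically feasible, then partition and implement blocks independently) can be sketched in a few lines. This is only an illustrative sketch: every function name, the `Prototype` fields, and the toy slack model are hypothetical placeholders, not part of any tool described in the slides.

```python
from dataclasses import dataclass


@dataclass
class Prototype:
    """Hypothetical summary of one flat full-chip prototype run."""
    utilization_pct: int
    worst_slack_ps: int
    clock_skew_ps: int


def build_prototype(utilization_pct: int) -> Prototype:
    # Toy model (assumption): each point of utilization above 50%
    # costs 10 ps of slack; skew is fixed at 150 ps.
    return Prototype(utilization_pct, (50 - utilization_pct) * 10, 150)


def physically_feasible(proto: Prototype, max_skew_ps: int = 200) -> bool:
    # Stand-in for the die size / timing / clock skew / power / SI checks.
    return proto.worst_slack_ps >= 0 and proto.clock_skew_ps <= max_skew_ps


def partition(block_names):
    # Stand-in for the per-block partition data handed to block implementation.
    return {name: ["pin assignment", "timing budget", "clock spec", "power grid"]
            for name in block_names}


def hierarchical_flow(block_names, utilization_pct=80, step_pct=10):
    # "NO" branch: relax the floorplan and rebuild the prototype.
    while utilization_pct > 0:
        proto = build_prototype(utilization_pct)
        if physically_feasible(proto):
            break
        utilization_pct -= step_pct
    else:
        raise RuntimeError("no feasible prototype found")
    partition_data = partition(block_names)
    # Blocks are implemented independently from their partition data.
    blocks = [f"{name}: place, CTS, optimize" for name in partition_data]
    return proto, blocks


proto, blocks = hierarchical_flow(["block1", "block2"])
print(proto.utilization_pct, blocks)
```

The point of the sketch is the ordering: feasibility is settled on the flat prototype before any partition data is generated.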


Hierarchical Partitioning

• Pin assignment

• Timing budgeting

• Clock tree generation

• Power grid planning

[Figure: Partitioning, Independent block-level implementation, SoC assembly]


Accurate Pin Assignment

• Full-chip prototype results in optimal pin placement
  – Results in narrower channels and reduced die size
  – Reduces the routing congestion
  – Improves the chip timing

[Figure: Flat Full-Chip Accurate Physical Prototype; Top Level Partition View]


Timing Budgeting

Each block requires:

• Clock definition

• set_input_delay

• set_output_delay

• set_drive

• set_load

• Path exceptions (false, multicycle paths)

[Figure: Block 1, Block 2, Block 3]

Accurate timing budgets result in predictable timing convergence
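To make the per-block budget concrete, the sketch below emits an SDC fragment containing the items listed on this slide. The command names are standard SDC; the helper `block_sdc`, the port names, and every numeric value are hypothetical and chosen only for illustration.

```python
def block_sdc(clock, period_ns, input_delays, output_delays,
              drives=None, loads=None, false_paths=(), multicycle_paths=()):
    """Return SDC text for one block. All values are supplied by the caller."""
    lines = [f"create_clock -name {clock} -period {period_ns} [get_ports {clock}]"]
    for port, delay in input_delays.items():
        lines.append(f"set_input_delay {delay} -clock {clock} [get_ports {port}]")
    for port, delay in output_delays.items():
        lines.append(f"set_output_delay {delay} -clock {clock} [get_ports {port}]")
    for port, resistance in (drives or {}).items():
        lines.append(f"set_drive {resistance} [get_ports {port}]")
    for port, cap in (loads or {}).items():
        lines.append(f"set_load {cap} [get_ports {port}]")
    for src, dst in false_paths:
        lines.append(f"set_false_path -from [get_ports {src}] -to [get_ports {dst}]")
    for cycles, src, dst in multicycle_paths:
        lines.append(
            f"set_multicycle_path {cycles} -from [get_ports {src}] -to [get_ports {dst}]"
        )
    return "\n".join(lines)


# Hypothetical budget for one block with a 10 ns clock.
sdc = block_sdc(
    clock="clk",
    period_ns=10.0,
    input_delays={"din": 3.0},    # time consumed outside the block before din
    output_delays={"dout": 4.0},  # time required outside the block after dout
    drives={"din": 0.5},
    loads={"dout": 0.05},
    false_paths=[("test_mode", "dout")],
)
print(sdc)
```

In a hierarchical flow, the delay values would come from the full-chip prototype rather than being typed in by hand, which is what makes the budgets realistic.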


Hierarchical Clock Tree Synthesis

• Accurate physical timing data enables the creation of an optimal clock tree
  – Block-level followed by top-level clock tree

• Final clock tree routing generates near zero skew
  – Balanced tree at the top level

Worst block skew + Zero top level skew = 150ps total clock skew

[Figure: balanced clock tree driving blocks with skews of 150ps, 120ps, 50ps, 50ps, 100ps and 130ps]
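The skew arithmetic on this slide can be written out directly. The block skew values are the ones shown in the figure; the list order is arbitrary, and treating the top-level skew as exactly zero follows the slide's "zero top level skew" idealization.

```python
# Block-level skews from the figure, in picoseconds (order is arbitrary).
block_skews_ps = [150, 120, 50, 50, 100, 130]

# Idealized balanced top-level tree, as stated on the slide.
top_level_skew_ps = 0

worst_block_skew_ps = max(block_skews_ps)
total_skew_ps = worst_block_skew_ps + top_level_skew_ps

print(f"worst block skew: {worst_block_skew_ps} ps")
print(f"total clock skew: {total_skew_ps} ps")
```

With a balanced top level, the chip-level skew is set by the worst block, so tightening that one block is what improves the total.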


Full Chip Power Analysis


Hierarchical Power Grid Design

• The P/G network is planned at the full-chip level

• The P/G network is automatically pushed down during partitioning

[Figure: Full chip; Block]


Performance Data

Design Description        Netlist to SDF Time
1.8M cells; 200 macros    6 hours
900K cells                3 hours
2.3M cells; 700 macros    14 hours
2M cells; 100+ macros     5 hours
2.8M cells                10 hours
1.7M cells; 70 macros     5 hours
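One way to compare these runs is to normalize them to cells processed per hour. The sketch below does only that arithmetic on the figures from the table; the cell counts are the rounded values given there, so the rates are approximate.

```python
# (description, cell count, netlist-to-SDF hours) taken from the table.
designs = [
    ("1.8M cells; 200 macros", 1_800_000, 6),
    ("900K cells", 900_000, 3),
    ("2.3M cells; 700 macros", 2_300_000, 14),
    ("2M cells; 100+ macros", 2_000_000, 5),
    ("2.8M cells", 2_800_000, 10),
    ("1.7M cells; 70 macros", 1_700_000, 5),
]

# Cells per hour, rounded to the nearest cell.
rates = [round(cells / hours) for _, cells, hours in designs]

for (description, _, _), rate in zip(designs, rates):
    print(f"{description:<24} {rate:>8,} cells/hour")
```

The design with 700 macros has the lowest rate, but the table alone does not establish that the macro count is the cause.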


High Performance Environment

Step                 First Encounter    Traditional     Speedup
Design Import        4 min              4 hr            60x
Detail Place         2 hr 50 min        3 hr 20 min     1x
Detail Route (*)     8 min              7 hr 30 min     56x
RC Extract           6 min              5 hr 45 min     57x
Delay Calculation    7 min              3 hr 50 min     33x
Timing Analysis      20 min             2 hr 15 min     7x
IPO                  1 hr 50 min        9 hr            5x
Design Iteration     5 hr 25 min        35 hr 40 min    6x

(*) SPC Trial Route

• Design: 580K cells, 0.25um process, 5LM, 100MHz

• Data collected on a 500MHz processor workstation
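The chart's numbers can be cross-checked with a short script. It assumes the chart values pair up per step as (First Encounter, Traditional) run times in the order the steps are listed, which is a reading of the figure rather than something the slide states outright. Under that reading, the per-step times sum exactly to the Design Iteration totals of 5 hr 25 min and 35 hr 40 min.

```python
# Per-step run times in minutes: (First Encounter, Traditional).
# The pairing of values to steps is an assumption about the chart.
steps = {
    "Design Import": (4, 240),
    "Detail Place": (170, 200),
    "Detail Route": (8, 450),
    "RC Extract": (6, 345),
    "Delay Calculation": (7, 230),
    "Timing Analysis": (20, 135),
    "IPO": (110, 540),
}

for name, (fe_min, trad_min) in steps.items():
    print(f"{name:<18} {trad_min / fe_min:5.1f}x")

fe_total_min = sum(fe for fe, _ in steps.values())
trad_total_min = sum(trad for _, trad in steps.values())

print(f"First Encounter total: {fe_total_min // 60} hr {fe_total_min % 60} min")
print(f"Traditional total:     {trad_total_min // 60} hr {trad_total_min % 60} min")
print(f"Overall speedup:       {trad_total_min / fe_total_min:.1f}x")
```

Detail placement takes roughly the same time in both flows, so most of the overall gain comes from the routing, extraction, and delay calculation steps.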


High Accuracy of the Prototype

• The prototype closely correlates with post-route layout
  – Comparison to ‘tape-out’ back-end flow
  – More than 90% of the interconnect and IO path delays within 2%

Design: 5LM, 0.25um, 580K cells, 620K nets, 572 I/Os, 4 blocks
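A correlation figure such as "more than 90% of delays within 2%" can be computed as the fraction of paths whose prototype delay falls within a relative tolerance of the post-route delay. The sketch below shows that calculation. The function name, the choice of post-route delay as the reference, and the five sample delays are hypothetical and are not data from the slides.

```python
def fraction_within(prototype_ps, post_route_ps, tolerance=0.02):
    """Fraction of paths whose prototype delay is within `tolerance`
    (relative to the post-route delay) of the post-route value."""
    if len(prototype_ps) != len(post_route_ps) or not post_route_ps:
        raise ValueError("need two non-empty delay lists of equal length")
    within = sum(
        1
        for proto, final in zip(prototype_ps, post_route_ps)
        if abs(proto - final) <= tolerance * final
    )
    return within / len(post_route_ps)


# Hypothetical path delays in picoseconds, for illustration only.
prototype_ps = [1000, 2010, 495, 3200, 1510]
post_route_ps = [1010, 2000, 500, 3000, 1500]

correlation = fraction_within(prototype_ps, post_route_ps)
print(f"{correlation:.0%} of paths within 2%")
```

In this made-up sample only the fourth path (3200 ps against 3000 ps) misses the tolerance, giving 80%.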


Summary

SoC Hierarchical Methodology

• Build a full-chip physical prototype early on
  – Start at RTL
  – Identify problems early

• Achieve design closure before partitioning
  – Close full-chip timing
  – Optimize die size
  – Meet power requirements
  – Resolve signal integrity issues

• Maintain the design closure throughout the design process