1 © 2015 Broadcom Corporation. All rights reserved.
A PRACTICAL MODEL TO REDUCE MARGIN PESSIMISM FOR MULTI-INPUT SWITCHING IN STATIC TIMING ANALYSIS OF DIGITAL CMOS CIRCUITS
3/12/2015
Christian Lütkemeyer
OUTLINE
• The basics of Multi-Input Switching (MIS).
• “Historical” (Y2008) context when we started to margin for MIS in Static Timing Analysis (STA).
• Examples for MIS speedup:
  • NAND3.
  • AND3.
• Timing window-based pessimism reduction when only a subset of inputs can contribute to MIS speedup.
• Caveat on window-based pessimism reduction for complex gates (AOI, OAI, etc.).
• A pseudo-current based soft metric calculation that formalizes the translation of input arrival times to a quantitative MIS impact metric.
• Summary.
MULTI-INPUT SWITCHING EXAMPLE NAND4
[Figure: NAND4 gate with inputs A, B, C, D and output Y, with the output response over time t for both cases below.]
Single Input Switching: A falling, B=C=D=1. Output Y responds with delay τ1.
Only single input switching is captured in timing libraries.
Multi Input Switching: A=B=C=D=falling. Output Y responds with delay τ4 << τ1.
MIS derate: R = τ4/τ1
HSPICE MIS EXAMPLE FOR A NAND4X1 GATE
[Figure: HSPICE simulation of the NAND4X1 gate; the MIS speedup is annotated.]
• i1, i2, and i3 are delayed by 20 ps vs. i0.
• Significantly shorter output transition time and cell delay.
WHY WE USE COMMERCIAL STA TOOLS
Developing and maintaining state-of-the-art Electronic Design Automation (EDA) tools is not economical for most ASIC design houses.
Using standardized models (Liberty library modeling standard) with widely deployed commercial tools for characterization and analysis reduces design and modeling risk.
Competition between EDA companies drives tool performance to match Moore’s Law.
We actively work with our EDA Partners to address new modeling needs like MIS.
Innovation takes time!
Complexity adds risk!
You have to design ICs with the tools you have, not the tools you wish you had!
MIS MODELING Y2008: A DISCONNECT BETWEEN ACADEMIA AND COMMERCIAL STA TOOLS
Academia (Samples)
[1] V. Chandramouli, K. A. Sakallah, “Modeling the Effects of Temporal Proximity of Input Transitions on Gate Propagation Delay and Transition Time”, DAC 1996.
[2] L.-C. Chen, S. K. Gupta, M. A. Breuer, “A New Gate Delay Model for Simultaneous Switching and Its Applications”, DAC 2001.
[3] A. Agarwal, F. Dartu, D. Blaauw, “Statistical Gate Delay Model Considering Multiple Input Switching”, DAC 2004.
[4] S. Yanamanamanda, J. Li, J. Wang, “Uncertainty Modeling of Gate Delay Considering Multiple Input Switching”, ISCAS 2005, patented.
…
Complex models, some patented!
Commercial STA Tools (~Y2008)
Library models moved from NLDM to CCS.
Statistical STA looked promising, but failed to take off as statistical characterization was too expensive/hard and runtimes were prohibitively long.
OCV (On Chip Variation) was enhanced to Advanced OCV (AOCV) or Parametric OCV (POCV) to model statistical averaging of local variation over longer paths.
No Liberty File Standard support for MIS (still missing in 2015).
No commercial characterization solution for MIS.
Even today, MIS support is barely mentioned in commercial STA tool marketing material.
PrimeTime ADV (Synopsys) supports Multi-input switching analysis.
Details are essentially unpublished.
BROADCOM’S Y2008 REQUEST TO COMMERCIAL STA PARTNERS FOR SUPPORT OF MIS
Goal: Reduce significant constant margin in timing closure.
Large constant margin became more and more costly as gates became faster.
We need STA support for a MIS margin mechanism:
• It needs to be robust and pessimistic.
• It should be reasonably simple so that it can be added to commercial tools based on a limited customer request.
• There should be no required changes to Liberty library models.
• MIS characterization needs to be developed in-house.
We requested a MIS margin mechanism that would reduce gate delays in minimum delay analysis based on a derate factor that we provided:
• Gate_specific_derate_rise.
• Gate_specific_derate_fall.
SINGLE STAGE GATE EXAMPLE: NAND3 DELAY VS. OUTPUT LOAD
[Figure: NAND3 cell delay vs. output load for a short input fall time of 1 ps; the MIS speedup is annotated.]
• The MIS speedup grows linearly with Cload.
SINGLE STAGE GATE EXAMPLE: NAND3 OUTPUT TRANSITION TIME VS. OUTPUT LOAD
[Figure: NAND3 output transition time vs. output load; the MIS speedup is annotated.]
• The MIS speedup of the transition time also grows linearly with Cload.
NAND3 MIS DELAY DERATE FACTOR AS A FUNCTION OF LOAD AND INPUT TRANSITION TIME
[Figure: NAND3 MIS delay derate factor over load and input transition time; annotations mark Rwc, the primary region of MIS concern, and the exception region.]
• For inverting gates, the primary region of concern (large cell delay, large speedup) is found for larger loads and fast input transition times.
• Approximately Rwc=1/3 for NAND3.
• Using the R factor from the large load and small input transition time area is mostly pessimistic.
• Exception: A tiny region (with very small loads and large input transition time) where cell delay is very small. MIS speedup in that region is small in absolute terms.
TWO STAGE GATE EXAMPLE: AND3 DELAY VS. OUTPUT LOAD
[Figure: AND3 cell delay vs. output load; the MIS speedup is annotated.]
• MIS speedup is constant, as only the first stage of the AND gate speeds up. => Derating of gate delay is not an accurate mechanism to margin for MIS speedup.
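The load dependence described for the two-stage gate can be illustrated with a minimal Python sketch. It assumes a simple delay model that is not given on the slide: a first-stage delay that is divided by the number of switching inputs, plus an output-inverter delay that grows linearly with the load. The function names and all numeric values are hypothetical.

```python
def and_gate_delay_ps(n_switching, c_load_ff,
                      tau_gate_ps=30.0, tau_inv0_ps=10.0, k_inv_ps_per_ff=1.0):
    """Hypothetical two-stage delay model (all parameters are assumptions)."""
    first_stage = tau_gate_ps / n_switching            # only this stage sees MIS
    inverter = tau_inv0_ps + k_inv_ps_per_ff * c_load_ff  # load-dependent stage
    return first_stage + inverter


def mis_speedup_and_derate(n_inputs, c_load_ff):
    """Return (absolute speedup in ps, derate ratio MIS delay / SIS delay)."""
    sis = and_gate_delay_ps(1, c_load_ff)
    mis = and_gate_delay_ps(n_inputs, c_load_ff)
    return sis - mis, mis / sis


for c_load in (5.0, 20.0, 80.0):
    speedup, derate = mis_speedup_and_derate(3, c_load)
    print(f"C={c_load:5.1f} fF  speedup={speedup:5.1f} ps  derate={derate:.3f}")
```

In this sketch the absolute speedup stays the same for every load while the derate ratio moves toward 1, which is why a single worst-case derate taken at small loads becomes pessimistic at larger loads.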
TWO STAGE GATE EXAMPLE: AND3 OUTPUT TRANSITION TIME VS. OUTPUT LOAD
[Figure: AND3 output transition time vs. output load; the small MIS transition time speedup is annotated.]
• The output inverter decouples the output transition time from the MIS speedup of the first stage.
• The transition time speedup is marginal and only barely noticeable for small loads.
AND3 MIS DERATE FACTOR AS A FUNCTION OF LOAD AND INPUT TRANSITION TIME
[Figure: AND3 MIS derate factor over load and input transition time; annotations mark Rwc, the primary region of MIS concern, and a region that is worse than Rwc.]
• For non-inverting gates, the largest relative speedup is observed for small loads.
• Select Rwc=0.76 for AND3 (small load, fast input transition). This is pessimistic for regions with larger loads where the relative speedup is smaller.
• Exception: A small region (with small loads and large input transition time) where cell delay is small.
EXAMPLES OF MIS DERATE FACTORS
Gate    MIS derate rising    MIS derate falling
NAND2   R=0.5                R=1
NAND3   R=0.33               R=1
NAND4   R=0.25               R=1
AND2    R=1                  R=0.84
AND3    R=1                  R=0.76
AND4    R=1                  R=0.75
Inverting NAND or NOR gates have a worst case MIS speed-up of about 1/N (N: number of inputs).
The additional output inverter in the AND gates reduces the relative speed-up due to MIS.
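As a hedged illustration of how such per-gate derates could be applied in minimum delay analysis, the following Python sketch stores the factors from the table and scales a library (single input switching) delay. The dictionary layout, the function name, and the example delays are assumptions for illustration and do not represent any STA tool's interface.

```python
# Worst-case MIS derates from the table, keyed by gate and transition.
MIS_DERATE = {
    "NAND2": {"rise": 0.50, "fall": 1.00},
    "NAND3": {"rise": 0.33, "fall": 1.00},
    "NAND4": {"rise": 0.25, "fall": 1.00},
    "AND2":  {"rise": 1.00, "fall": 0.84},
    "AND3":  {"rise": 1.00, "fall": 0.76},
    "AND4":  {"rise": 1.00, "fall": 0.75},
}


def min_delay_with_mis_ps(gate, edge, library_delay_ps):
    """Scale a single-input-switching library delay by the worst-case derate.

    Intended for minimum (early/hold) delay analysis only; the maximum
    delay analysis keeps the undesrated library delay.
    """
    return library_delay_ps * MIS_DERATE[gate][edge]


# Hypothetical library delays in ps.
print(min_delay_with_mis_ps("NAND3", "rise", 60.0))
print(min_delay_with_mis_ps("AND3", "fall", 100.0))

# The inverting gates in the table are close to the 1/N rule.
for n in (2, 3, 4):
    print(n, MIS_DERATE[f"NAND{n}"]["rise"], round(1.0 / n, 2))
```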
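MIS PESSIMISM REDUCTION BASED ON INPUT TIMING WINDOWS EXAMPLE: NAND4
[Figure: NAND4 with inputs i0, i1, i2, i3 and output o; the input timing window TW(i0) and the output timing window TW(o) are shown, together with Reff as a function of the number of early switching inputs Neff.]
# early switching inputs Neff:   1    2     3     4
Reff:                            1    1/2   1/3   1/4=R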
NAND4 EFFECTIVE MIS DERATE FACTOR
[Figure: NAND4 Reff vs. Neff, reaching R=0.25 at Neff=4; the optimism of a linear interpolation is annotated.]
• Reff follows a 1/Neff characteristic as the combined output current grows ~Neff.
• Linear interpolation between Neff=1 and Neff=4 leads to optimistic Reff values! => Cannot use linear derate reduction.
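The optimism of linear interpolation can be checked numerically. The short Python sketch below compares the 1/Neff characteristic with a straight line between (Neff=1, Reff=1) and (Neff=4, Reff=0.25) for the NAND4; the function names are illustrative only.

```python
def reff_one_over_n(n_eff):
    """1/Neff characteristic of a single-stage gate such as the NAND4."""
    return 1.0 / n_eff


def reff_linear(n_eff, n_max=4):
    """Straight line between Reff=1 at Neff=1 and Reff=1/n_max at Neff=n_max."""
    r_wc = 1.0 / n_max
    return 1.0 + (r_wc - 1.0) * (n_eff - 1.0) / (n_max - 1.0)


for n_eff in (1, 2, 3, 4):
    physical = reff_one_over_n(n_eff)
    linear = reff_linear(n_eff)
    # A larger derate factor means less delay reduction in min-delay analysis,
    # so any positive difference is optimism.
    print(f"Neff={n_eff}  1/Neff={physical:.3f}  linear={linear:.3f}  "
          f"optimism={linear - physical:+.3f}")
```

At Neff=2 the linear model returns 0.75 where the 1/Neff characteristic requires 0.5, so the minimum delay would be overestimated.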
GENERALIZED 1/N MIS PESSIMISM REDUCTION MODEL THAT ALSO COVERS MULTI-STAGE GATES
A gate with an output inverter will show MIS speedup mostly on the first stage.
$$\tau_n = \tau_{gate}/n + \tau_{inv}$$
n: number of simultaneous switching inputs.
Worst case derate:
$$R = \frac{\tau_N}{\tau_1} = \frac{\tau_{gate}/N + \tau_{inv}}{\tau_{gate} + \tau_{inv}} = \frac{1/N + \alpha}{1 + \alpha} \quad \text{with} \quad \alpha = \frac{\tau_{inv}}{\tau_{gate}}$$
Effective derate:
$$R_{eff} = \frac{1/N_{eff} + \alpha}{1 + \alpha} \qquad (1)$$
$$\alpha = \frac{N R - 1}{N (1 - R)}$$
α can be calculated based on R and N, R<1.
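A minimal Python sketch of the derate calculation follows. It assumes the model reads Reff = (1/Neff + α)/(1 + α) with α = (N·R − 1)/(N·(1 − R)), which reproduces Reff=1 at Neff=1 and Reff=R at Neff=N; the function names and the clamping of Neff to the range [1, N] are assumptions of this sketch.

```python
def alpha_from_worst_case(r_wc, n_max):
    """Stage ratio alpha derived from the worst-case derate R and N (R < 1)."""
    if not 0.0 < r_wc < 1.0:
        raise ValueError("worst-case derate R must satisfy 0 < R < 1")
    return (n_max * r_wc - 1.0) / (n_max * (1.0 - r_wc))


def effective_derate(r_wc, n_max, n_eff):
    """Effective MIS derate Reff for n_eff effectively switching inputs."""
    if r_wc >= 1.0:
        return 1.0  # no MIS speedup characterized for this arc
    n_eff = min(max(n_eff, 1.0), float(n_max))  # assumption: clamp to [1, N]
    alpha = alpha_from_worst_case(r_wc, n_max)
    return (1.0 / n_eff + alpha) / (1.0 + alpha)


# NAND4 (R=0.25): alpha is 0, so Reff reduces to 1/Neff.
print([round(effective_derate(0.25, 4, n), 3) for n in (1, 2, 3, 4)])
# AND3 (R=0.76): the output inverter limits the relative speedup.
print([round(effective_derate(0.76, 3, n), 3) for n in (1, 2, 3)])
```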
EXAMPLE: REFF AS A FUNCTION OF NEFF FOR DIFFERENT GATES
[Figure: Reff vs. Neff for NAND2 (=AOI22), NAND3, NAND4, AND3, and AND4.]
• Eqn. (1) for Reff avoids the optimism of linear interpolation and creates a 1/N characteristic of MIS derate adjustment.
• A linear derate adjustment would create significant optimism vs. the physical MIS model.
• NAND2, NAND3, and NAND4 align for the same Neff, as the physical MIS mechanism requires.
• Reff is defined over the range of 1 to N, not just at discrete points. => Can we develop a soft metric to create Neff that models partial input overlap?
CAVEAT FOR TIMING WINDOW BASED PESSIMISM REDUCTION EXAMPLE: COMPLEX AOI22 GATE
[Figure: AOI22 schematic with inputs A, B, C, D and output Y.]
Y=not((A&B)|(C&D))
• AOI22 has four inputs.
• A MIS Y-falling event can be created by activating the two parallel pull-down branches connected to Y.
• => R=0.5 worst case, i.e., cell delay can be cut in half.
• A worst-case simultaneous switching event can be created by aligned switching on pairs of inputs:
  • A rising, D rising, B=1, C=1
  • A rising, C rising, B=1, D=1
  • B rising, D rising, A=1, C=1
  • B rising, C rising, A=1, D=1
• The pessimism reduction function Reff of Eqn. (1) has to be adjusted to consider the number of input signals NMIS that are sufficient to create the worst-case MIS event. If NMIS or more signals overlap, then the worst case MIS derate R has to be used.
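The adjustment for complex gates can be sketched in Python as follows. The sketch assumes that the pessimism reduction function has the form Reff = (1/Neff + α)/(1 + α) with α derived from R and NMIS instead of the pin count, and that Neff is simply capped at NMIS; both the function name and this capping rule are assumptions for illustration.

```python
def effective_derate_complex(r_wc, n_mis, n_overlap):
    """Reff for a gate where n_mis overlapping inputs already cause worst-case MIS.

    r_wc:      worst-case MIS derate R (0 < R < 1)
    n_mis:     number of inputs sufficient for the worst-case MIS event (NMIS)
    n_overlap: effective number of inputs whose timing windows overlap
    """
    # Assumption: once NMIS or more signals overlap, the worst case applies.
    n_eff = min(max(float(n_overlap), 1.0), float(n_mis))
    alpha = (n_mis * r_wc - 1.0) / (n_mis * (1.0 - r_wc))
    return (1.0 / n_eff + alpha) / (1.0 + alpha)


# AOI22: four input pins, but two aligned inputs already give R=0.5.
for n_overlap in (1, 2, 3, 4):
    print(n_overlap, effective_derate_complex(0.5, 2, n_overlap))
```

Using the pin count of four instead of NMIS=2 would return a derate above 0.5 for two overlapping inputs, which would be optimistic for the AOI22.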
CHALLENGES FOR TRULY ACCURATE MIS DELAY MODELING
[Figure: Library Model Characterization; MISdelay(load, slew, input skews, input sets); STA Tool MIS Implementation.]
Dependence on load and input slews (per pin?).
Skews between input arrival times.
Large set of logic functions: NAND, AND, NOR, OR, AOI, AO, OAI, OA, MUX, XOR, or FA.
Input ranges per logic function: NAND[2,3,4], AOI[21,22,31,32,33,211,221,222,311,321,331,322,332,333]
Information for sets of pins which can interact in MIS speedup.
As the level of detail captured in the MIS models increases, the implementation complexity in STA tools may “explode”.
The validation of the overall accuracy of library characterization software, library models, and STA implementation will be daunting!
Are the cost savings in silicon worth the investment into very complex MIS modeling? There is an increased risk that bugs in such complex MIS modeling may still cause product failures.
The State-of-the-Art solution appears to say: "Probably not!"
A reasonably simple model that can be understood to be largely pessimistic is an accepted product even if it incurs some avoidable hold-fixing overhead!
Next: One more idea to improve the state-of-the-art without introducing significant complexity.
A pseudo-current based soft metric calculation that formalizes the translation of input arrival times to a quantitative MIS impact metric Neff.
PROPOSAL FOR A PSEUDO-CURRENT BASED CALCULATION OF THE EFFECTIVE NUMBER OF SWITCHING INPUTS
Goals for the resulting metric:
• Simplify the detection of MIS relevant input timing window overlap, including partial overlap.
• Be equal to one when all MIS aggressors arrive after the output settles.
• Create a MIS derate that has a small error vs. SPICE when plugged into Eqn. (1) for Reff, especially for single-stage gates where the largest MIS speedup is observed.
• Be affordable to calculate.
MIS PSEUDO-CURRENT BASED CALCULATION OF EFFECTIVE DRIVE PARALLELISM FACTOR NEFF
[Figure: input i0 and the output transitions o(i0), o(i1), o(i2) with related_early_transition_window(i0), related_early_transition_window(i1), and related_early_transition_window(i2) of widths tr0, tr1, and tr2; below, the resulting MIS pseudo-current over time.]
MIS pseudo-current: each window adds a step of +1/tr0, +1/tr1, or +1/tr2 where it opens and a step of -1/tr0, -1/tr1, or -1/tr2 where it closes.
Calculate trMIS, which is the time we need to integrate the MIS pseudo-current until we reach area=1 under the MIS pseudo-current curve.
Neff=tr0/trMIS
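The Neff calculation can be sketched in Python under stated assumptions: every input contributes a rectangular pseudo-current pulse of height 1/tr over its window (unit area), the integration starts where the victim window opens, and any pulse area before that point is ignored. The function name and the (start, width) representation of the windows are hypothetical.

```python
def neff_pseudo_current(windows):
    """Return Neff = tr0 / trMIS for rectangular unit-area pseudo-current pulses.

    windows: list of (start_ps, width_ps); windows[0] belongs to the victim i0.
    """
    t0, w0 = windows[0]
    eps = 1e-9 * w0  # tolerance for floating-point round-off

    # Each window steps the current up by 1/width at its start and down at its end.
    events = []
    for start, width in windows:
        events.append((start, 1.0 / width))
        events.append((start + width, -1.0 / width))
    events.sort()

    t, area, current = t0, 0.0, 0.0
    for time, step in events:
        if time > t:
            if current > 0.0:
                # Time at which the integrated pseudo-current reaches area 1.
                t_hit = t + (1.0 - area) / current
                if t_hit <= time + eps:
                    return max(1.0, w0 / (t_hit - t0))
            area += current * (time - t)
            t = time
        current += step
    return 1.0


# Three aligned 100 ps windows: all inputs contribute fully.
print(neff_pseudo_current([(0.0, 100.0), (0.0, 100.0), (0.0, 100.0)]))
# One aggressor skewed late by 50 ps: partial overlap.
print(neff_pseudo_current([(0.0, 100.0), (50.0, 100.0)]))
# Aggressors arrive after the victim window closes: no MIS.
print(neff_pseudo_current([(0.0, 100.0), (300.0, 100.0), (400.0, 100.0)]))
```

With equal widths w and one aggressor skewed late by s, this sketch gives Neff = 2w/(w + s), which moves smoothly from 2 at zero skew to 1 when the skew reaches the window width.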
TEST BASED ON A NAND3 IN A SPICE SIMULATION
[Figure: NEFF, C=40 FEMTO FARAD, with the annotations below.]
• Zero skew => Two MIS aggressors.
• One MIS aggressor (annotated in two regions).
• No MIS aggressor.
• The window with MIS effect extends to 200 ps skew.
CALCULATED REFF, C=40 FEMTO FARAD
No MIS margin required.
Reff for two switching inputs.
All three inputs switch.
Zero skew.
Maximum MIS margin.
Reff=1/Neff as alpha=0 for R=1/3 and Nmax=3
REFF ERROR, C=40 FEMTO FARAD
• MIS derate Reff is pessimistic by up to -0.018.
• This is good accuracy relative to the Reff range of [0.33 … 1] and to the overall accuracy of the derating-based MIS margin approach.
• The pseudo current model is affordable, as the complexity of the pseudo current integration is comparable to a CCS delay calculation.
LOCAL VARIATION CREATES VARIATION OF TIMING WINDOWS => MODIFIED NEFF CALCULATION TO ACCOUNT FOR LOCAL VARIATION
k: confidence factor for MIS local variation pessimism.
Scenario 1: positive skew to confidence tail
[Figure: MIS pseudo-current over time; the victim starts at tstart,1 and the aggressor at tstart,2; the confidence tail tstart,2,tail = tstart,2 - k*sqrt(σi0² + σi1²) is still later than tstart,1.]
• MIS aggressor pseudo current is shifted towards the victim by k*sqrt(σi0² + σi1²).
Scenario 2: negative skew to confidence tail
[Figure: MIS pseudo-current over time; the confidence tail would fall before the victim, so tstart,2,tail is set to tstart,1 and the shift is limited to the skew between tstart,1 and tstart,2.]
• MIS aggressor pseudo current is shifted to align with the victim to create a worst case Neff.
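The two scenarios can be combined into one small Python helper. This is a sketch under the assumption that the shift equals k times the root-sum-square of the local variation sigmas of the two inputs, and that the shifted aggressor is never moved past the victim; the function name, parameter names, and the value of k are hypothetical.

```python
import math


def worst_case_skew_ps(t_start_victim, t_start_aggressor,
                       sigma_victim, sigma_aggressor, k=3.0):
    """Skew to use in the Neff calculation after accounting for local variation.

    The aggressor pseudo current is shifted towards the victim by
    k * sqrt(sigma_victim**2 + sigma_aggressor**2), but not beyond alignment.
    """
    skew = t_start_aggressor - t_start_victim
    shift = k * math.hypot(sigma_victim, sigma_aggressor)
    # Scenario 1: the shift is smaller than the skew, a reduced skew remains.
    # Scenario 2: the shift exceeds the skew, align aggressor and victim.
    remaining = max(0.0, abs(skew) - shift)
    return math.copysign(remaining, skew)


print(worst_case_skew_ps(0.0, 100.0, 3.0, 4.0))  # scenario 1
print(worst_case_skew_ps(0.0, 10.0, 3.0, 4.0))   # scenario 2
```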
SUMMARY
A MIS pessimism reduction equation was derived to accurately calculate the effective MIS derate Reff based on a worst case MIS derate R and the effective number of inputs Neff that switch simultaneously and can interact to create MIS.
$$R_{eff} = \frac{1/N_{eff} + \alpha}{1 + \alpha} \qquad (1) \quad \text{with} \quad \alpha = \frac{N_{maxMIS} R - 1}{N_{maxMIS} (1 - R)}$$
• The model correctly follows the 1/N behavior of the physical MIS mechanism.
• It avoids optimism that a linear interpolation model would create.
A pseudo-current based soft metric calculation was introduced that provides a refined effective number of inputs Neff that contribute to a MIS speedup event.
• The calculation formalizes, and thereby simplifies, the translation of input arrival time information to a quantitative MIS impact metric.
• The metric automatically adjusts the region of MIS impact according to the output transition time of a gate.
• It can be easily adjusted to account for timing window uncertainty due to local variation.
• In conjunction with Eqn. (1), this method enables reasonably accurate gradual MIS margin adjustments to model partial MIS aggressor overlaps.
MIS MODELING SUPPORT IN LIBERTY
The absence of MIS related data in the Liberty Standard is a significant gap, IMHO.
From an STA and Implementation tool user’s perspective, it would be desirable to integrate MIS related data into the Liberty Standard.
It requires a review of published and unpublished proprietary models to determine a good model for standardization.
Beware of complexities!
Relevant data for a simple, derate based model as presented:
Data                                                          rise           fall
Worst case delay speedup R                                    Rrise          Rfall
Worst case transition time speedup Rtr                        Rtr_rise       Rtr_fall
Number of inputs that create the worst case speedup NmaxMIS   NmaxMIS_rise   NmaxMIS_fall
For multi-stage gates, a gate-specific constant margin may be preferable in cases where the MIS speedup is fairly independent of the output load due to the output inverter. For multi-stage gates, derating creates significant pessimism for larger loads.
I hope the Liberty Technical Advisory Board Members here at Tau 2015 will start to think about MIS standardization! Thanks!
[Figure labels: CCS Models; Waveform Propagation; MIS Margin Model; Constant Margin against MIS.]
Appendix: Data from a NAND3 comparing the MIS speed-up from SPICE to an estimated speed-up based on the Neff calculation
REFF ACCURACY DATA FROM A NAND3 EXPERIMENT IN SPICE
In the following slides we present data from a SPICE experiment with a NAND3 gate that emulates how the overlap estimation with a Neff and Reff calculation in the STA tool would compare to a SPICE simulation.
• i0 is the reference pin for the MIS event.
• i1 and i2 are skewed late from 0 ps to 250 ps relative to i0 in 25 ps steps.
• The following data from the SPICE simulations is used to calculate Neff and then Reff based on the Pseudo-Current model:
  • Delay dr0, dr1, and dr2 for an i0, i1, or i2 falling to output o rising event.
  • The output rise time tr0 for an i0 falling (output o rising) event.
• Input signals fall from VDD to VSS on a linear ramp in 20 ps.
• Output loads are varied [10 20 40] fF.
• A scale factor k_tr=0.45 is used to set the width of the pseudo-current transition window to 2*k_tr*tr0.
• The error between the calculated derating factor Reff and the ratio of the MIS delay with all three inputs switching divided by the delay if only i0 switches is plotted over the skew1 x skew2 plane.
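The two quantities used in this experiment can be written down as a short Python sketch. The sign convention for the error (calculated Reff minus the SPICE delay ratio, so that negative values are pessimistic) is an assumption inferred from the way the error slides are worded; the function names and all numeric inputs are hypothetical.

```python
K_TR = 0.45  # scale factor from the experiment description


def pseudo_current_window_ps(tr0_ps, k_tr=K_TR):
    """Width of the pseudo-current transition window: 2 * k_tr * tr0."""
    return 2.0 * k_tr * tr0_ps


def reff_error(reff_calculated, delay_mis_ps, delay_i0_only_ps):
    """Calculated Reff minus the SPICE ratio (MIS delay / i0-only delay).

    Assumed sign convention: negative means the calculated derate is smaller
    than the SPICE ratio, i.e. pessimistic for minimum delay analysis.
    """
    return reff_calculated - delay_mis_ps / delay_i0_only_ps


# Hypothetical values for one point of the skew1 x skew2 plane.
print(pseudo_current_window_ps(100.0))
print(reff_error(0.40, 42.0, 100.0))
```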
NEFF, C=10 FEMTO FARAD
[Figure: Neff over the skew1 x skew2 plane, with the annotations below.]
• All three inputs switch at the same time.
• i1 is skewed late, only two inputs switch simultaneously.
• i2 is skewed late, only two inputs switch simultaneously.
• i1 and i2 are skewed late, no MIS effect is expected when skew > 50 ps.
NEFF, C=20 FEMTO FARAD
[Figure: Neff over the skew1 x skew2 plane, with the annotations below.]
• The increased load creates a slower output transition.
• The window with MIS effect increases to 100 ps skew.
NEFF, C=40 FEMTO FARAD
[Figure: Neff over the skew1 x skew2 plane, with the annotations below.]
• Doubling the load again creates a slower output transition.
• The window with MIS effect increases to 200 ps skew.
REFF, C=10 FEMTO FARAD
No MIS margin required.
Margin for two inputs switching.
All three inputs switch.
Maximum margin.
Reff=1/Neff as alpha=0 for R=1/3 and Nmax=3
REFF, C=20 FEMTO FARAD
No MIS margin required.
Margin for two inputs switching.
All three inputs switch.
Maximum margin.
Reff=1/Neff as alpha=0 for R=1/3 and Nmax=3
REFF, C=40 FEMTO FARAD
No MIS margin required
Margin for two inputs switching.
All three inputs switch.
Maximum margin.
Reff=1/Neff as alpha=0 for R=1/3 and Nmax=3
REFF ERROR, C=10 FEMTO FARAD
• MIS derate Reff is pessimistic by -0.045 to slightly optimistic by 0.01.
• The region where margin is applied tracks very well with SPICE.
REFF ERROR, C=20 FEMTO FARAD
• MIS derate Reff is pessimistic by up to -0.03.
• The reduction of pessimism is related to the fact that the worst case MIS appears at larger loads.
• For smaller loads, the MIS speedup ratio does not drop all the way to 1/N.
• The region where margin is applied tracks very well with SPICE.
REFF ERROR, C=40 FEMTO FARAD
• MIS derate Reff is pessimistic by up to -0.018.
• For 40 fF, this gate closely approaches the theoretical worst case 1/N MIS speed-up.
SUMMARY OF THE REFF ESTIMATION EXPERIMENT FOR THE NAND3
The pseudo-current based calculation of the effective number of MIS aggressors Neff enables the estimation of an effective MIS derate Reff that is a close approximation for the MIS speed-up.
This model should be considered a good candidate for an improved MIS model that is based on adaptive windows.
The End