Delay Calculation(IMP)

Clock Definitions

Rising and falling edge of the clock

For a +ve edge triggered design +ve (or rising) edge is called ‘leading edge’ whereas –ve (or falling) edge is called ‘trailing edge’.

For a -ve edge triggered design –ve (or falling) edge is called ‘leading edge’ whereas +ve (or rising) edge is called ‘trailing edge’.

basic clock Minimum pulse width of the clock can be checked in PrimeTime by using commands given below:

set_min_pulse_width -high 2.5 [all_clocks]

set_min_pulse_width -low 2.0 [all_clocks]

These checks are generally carried out for post layout timing analysis. Once these commands are set, PrimeTime checks for high and low pulse widths and reports any violations.

Capture Clock Edge

The edge of the clock for which data is detected is known as capture edge.

Launch Clock Edge

This is the edge of the clock wherein data is launched in previous flip flop and will be captured at this flip flop.

http://asic-soc.blogspot.com/2009/01/clock-definitions.html

http://3.bp.blogspot.com/_Se0VANaI9uM/SXQFl4ol5rI/AAAAAAAAAoA/B8Nz0TKBtwU/s1600-h/basic_clock.jpg

http://3.bp.blogspot.com/_Se0VANaI9uM/SXRJY8hZZMI/AAAAAAAAAoI/7RX5Cz9_bjM/s1600-h/launch_capture_clock.jpg

launch clock and capture clock

Skew

Skew is the difference in arrival of clock at two consecutive pins of a sequential element is called skew. Clock skew is the variation at arrival time of clock at destination points in the clock network. The difference in the arrival of clock signal at the clock pin of different flops.

Two types of skews are defined: Local skew and Global skew.

Local skew

Local skew is the difference in the arrival of clock signal at the clock pin of related flops.

Global skew

Global skew is the difference in the arrival of clock signal at the clock pin of non related flops. This also defined as the difference between shortest clock path delay and longest clock path delay reaching two sequential elements.

local and global skew

Skew can be positive or negative. When data and clock are routed in same direction then it is Positive skew. When data and clock are routed in opposite direction then it is negative skew.

Positive Skew

If capture clock comes late than launch clock then it is called +ve skew.

http://4.bp.blogspot.com/_Se0VANaI9uM/SXQFUS_4mVI/AAAAAAAAAnw/zljfCv3YGDw/s1600-h/local_skew_global_skew.jpg

Clock and data both travel in same direction.

When data and clock are routed in same direction then it is Positive skew.

+ve skew can lead to hold violation.

+ve skew improves setup time.

positive skew negative skew

Negative Skew

If capture clock comes early than launch clock it is called –ve skew. Clock and data travel in opposite direction. When data and clock are routed in opposite then it is negative skew. -ve skew can lead to setup violation. -ve skew improves hold time. (Effects of skew on setup and hold will be discussed in detail in forthcoming articles)

Uncertainty

Clock uncertainty is the time difference between the arrivals of clock signals at registers in one clock domain or between domains.

Pre-layout and Post-layout Uncertainty

Pre CTS uncertainty is clock skew, clock Jitter and margin. After CTS skew is calculated from the actual propagated value of the clock. We can have some margin of skew + Jitter.

http://4.bp.blogspot.com/_Se0VANaI9uM/SXQFLnczIgI/AAAAAAAAAno/JMbNmYfPLX0/s1600-h/%2Bve_-ve_skew.jpg

timing diagram depicting skew, latency, jitter

Clock latency

Latency is the delay of the clock source and clock network delay.

Clock source delay is the time taken to propagate from ideal waveform origin point to clock definition point. Clock network latency is the delay from clock definition point to register clock pin.

Pre CTS Latency and Post CTS Latency

Latency is the summation of the Source latency and the Network latency. Pre CTS estimated latency will be considered during the synthesis and after CTS propagated latency is considered.

Source Delay or Source Latency

It is known as source latency also. It is defined as "the delay from the clock origin point to the clock definition point in the design".

http://3.bp.blogspot.com/_Se0VANaI9uM/SXQFB5XZ_9I/AAAAAAAAAng/6rObneRUOPM/s1600-h/skew_latency_jitter_timing_dig.jpg

Delay from clock source to beginning of clock tree (i.e. clock definition point).

The time a clock signal takes to propagate from its ideal waveform origin point to the clock definition point in the design.

Network Delay (latency) or Insertion Delay

It is also known as Insertion delay or Network latency. It is defined as "the delay from the clock definition point to the clock pin of the register".

The time clock signal (rise or fall) takes to propagate from the clock definition point to a register clock pin.

Figure below shows example of latency for a design without PLL.

latency for a design without PLL

The latency definitions for designs with PLL are slightly different.

Figure below shows latency specifications of such kind of designs.

Latency from the PLL output to the clock input of generated clock circuitry becomes source latency. From this point onwards till generated clock divides to flops is now known as network latency. Here we can observe that part of the network latency is clock to q delay of the flip flop (of divide by 2 circuit in the given example) is known value.

http://4.bp.blogspot.com/_Se0VANaI9uM/SXQEz9weovI/AAAAAAAAAnY/Zg4vG8kg-g0/s1600-h/onchip_offchip_latency_without_pll.jpg

latency for a design with PLL

Jitter

Jitter is the short-term variations of a signal with respect to its ideal position in time.

Jitter is the variation of the clock period from edge to edge. It can vary +/- jitter value.

From cycle to cycle the period and duty cycle can change slightly due to the clock generation circuitry. Jitter can also be generated from PLL known as PLL jitter. Possible jitter values should be considered for proper PLL design. Jitter can be modeled by adding uncertainty regions around the rising and falling edges of the clock waveform.

Sources of Jitter Common sources of jitter include:

Internal circuitry of the phase-locked loop (PLL)

Random thermal noise from a crystal

Other resonating devices

Random mechanical noise from crystal vibration

Signal transmitters

http://1.bp.blogspot.com/_Se0VANaI9uM/SXQEpAkS7XI/AAAAAAAAAnQ/R9glpQtF0js/s1600-h/onchip_offchip_latency_with_pll.jpg

Traces and cables

Connectors

Receivers

Click here to read more about jitter from Altera.

Click here to read what wiki says about jitter.

Multiple Clocks

If more than one clock is used in a design, then they can be defined to have different waveforms and frequencies. These clocks are known as multiple clocks. The logics triggered by each individual clock are then known as “clock domain”.

If clocks have different frequencies there must be a base period over which all waveforms repeat.

Base period is the least common multiple (LCM) of all clock periods

Asynchronous Clocks

In multiple clock domains, if these clocks do not have a common base period then they are called as asynchronous clocks. Clocks generated from two different crystals, PLLs are asynchronous clocks. Different clocks having different frequencies generated from single crystal or PLL are not asynchronous clocks but they are synchronous clocks.

Gated clocks

Clock signals that are passed through some gate other than buffer and inverters are called gated clocks. These clock signals will be under the control of gated logic. Clock gating is used to turn off clock to some sections of design to save power. Click here to read more about clock gating.

Generated clocks

Generated clocks are the clocks that are generated from other clocks by a circuit within the design such as divider/multiplier circuit.

Static timing analysis tools such as PrimeTime will automatically calculate the latency (delay) from the source clock to the generated clock if the source clock is propagated and you have not set source latency on the generated clock.

http://asic-soc.blogspot.com/2008/04/clock-gating.html

http://en.wikipedia.org/wiki/Jitter

http://www.altera.com/support/devices/pll_clock/jitter/pll-jitter.html

generated clock

‘Clock’ is the master clock and new clock is generated from F1/Q output. Master clock is defined with the constraint ‘create_clok’. Unless and until new generated clock is defined as ‘generated clock’ timing analysis tools won’t consider it as generated clock. Hence to accomplish this requirement use “create_generated_clock” command. ‘CLK’ pin of F1 is now treated as clock definition point for the new generated clock. Hence clock path delay till F1/CLK contributes source latency whereas delay from F1/CLK contributes network latency.

Virtual Clocks

Virtual clock is the clock which is logically not connected to any port of the design and physically doesn’t exist. A virtual clock is used when a block does not contain a port for the clock that an I/O signal is coming from or going to. Virtual clocks are used during optimization; they do not really exist in the circuit.

Virtual clocks are clocks that exist in memory but are not part of a design. Virtual clocks are used as a reference for specifying input and output delays relative to a clock. This means there is no actual clock source in the design. Assume the block to be synthesized is “Block_A”. The clock signal, “VCLK”, would be a virtual clock. The input delay and output delay would be specified relative to the virtual clock.

Transition Delay and Propagation Delay

Transition Delay

Transition delay or slew is defined as the time taken by signal to rise from 10 %( 20%) to the 90 %( 80%) of its maximum value. This is known as “rise time”.

http://asic-soc.blogspot.com/2008/12/transition-delay-and-propagation-delay.html

http://1.bp.blogspot.com/_Se0VANaI9uM/SXQEYcb9FsI/AAAAAAAAAnI/UXhPrrS5vKE/s1600-h/generated_clock.jpg

Similarly “fall time” can be defined as the time taken by a signal to fall from 90 %( 80%) to the 10 %( 20%) of its maximum value.

Transition is the time it takes for the pin to change state.

Setting Transition Time Constraints

The above theoretical definitions are to be applied on practical designs. Now, the transition time of a net becomes the time required for its driving pin to change logic values (from 10 %( 20%) to the 90 %( 80%) of its maximum value). This transition time used for delay calculations are based on the timing library (.lib files).

Transition related constraints can be provided in Design Compiler (logic synthesis tool from Synopsys) by using below commands:

1. max_transition : This attribute is applied to each output of a cell. During optimization, Design Compiler tries to make the transition time of each net less than the value of the max_transition attribute.

2. set_max_transition: This command is used to change the maximum transition time restriction specified in a technology library.

“This command sets a maximum transition time for the nets attached to the identified ports or to all the nets in a design by setting the max_transition attribute on the named objects.

For example, to set a maximum transition time of 3.2 on all nets in the design adder, enter the following command:

http://1.bp.blogspot.com/_Se0VANaI9uM/SUc_Bx4AlbI/AAAAAAAAAmI/pYdjFqCqddc/s1600-h/propagation_delays.jpg

set_max_transition 3.2 [get_designs adder]

To undo a set_max_transition command, use the remove_attribute command. For example, enter the following command:

remove_attribute [get_designs adder] max_transition”

(Directly quoted from Design Complier user manual)

Setting Capacitance Constraints

The transition time constraints specified above do not provide a direct way to control the actual capacitance of nets. To control capacitance directly, below command has to be used:

set_max_capacitance: This command sets the maximum capacitance constraint on input ports or designs.

In addition to set_max_transition, set_max_capacitance can also be used as this command works independent.

This command applies maximum capacitance limit to output pin or port of the design.

This command can also be used to apply capacitance limit on any net.

Eg:

set_max_capacitance 4 [get_designs decoder]

To remove the set_max_capacitance command, use the remove_attribute command.

remove_attribute [get_designs decoder] max_capacitance

Propagation Delay

Propagation delay is the time required for a signal to propagate through a gate or net.

Hence if it is cell, you can call it as “Gate or Cell Delay” or if it is net you can call it as “Net Delay”

Propagation delay of a gate or cell is the time it takes for a signal at the input pin to affect the output signal at output pin.

For any gate propagation delay is measured between 50% of input transition to the corresponding 50% of output transition.

There are 4 possibilities:

Propagation delay between 50 % of Input rising to 50 % of output rising.

Propagation delay between 50 % of Input rising to 50 % of output falling.

Propagation delay between 50 % of Input falling to 50 % of output rising.

Propagation delay between 50 % of Input falling to 50 % of output falling.

Each of these delays has different values. Maximum and minimum values of these set are very important. Maximum and minimum propagation delay values are considered for timing analysis.

For net propagation delay is the delay between the time a signal is first applied to the net and the time it reaches other devices connected to that net.

Propagation delay is taken as the average of rise time and fall time i.e. Tpd= (Tphl+Tplh)/2.

Propagation delay depends on the input transition time (slew rate) and the output load. Hence two dimensional look up tables are used to calculate these delays. How to calculate propagation delay of net and gate? Please refer below articles to find the detailed explanation.

How gate delay is calculated?

We encounter several types of delays in ASIC design. They are as follows:

Gate delay or Intrinsic delay Net delay or Interconnect delay or Wire delay or Extrinsic delay or Flight

time Transition or Slew Propagation delay Contamination delay

Wire delays or extrinsic delays are calculated using output drive strength, input capacitance and wire load models. Other delays are intrinsic properties of each and every gate.

Delays are interdependent on different electrical properties. [Nekoogar]:

http://asic-soc.blogspot.com/2008/09/delays-in-asic-design.html

Input capacitance of the logic gate is a function of output state, output loads and input slew rate.

Internal timing arcs and output slew rate is a function of switching input(s).

Capacitance of the wire is dependent on frequency.

Internal timing arcs are a function of input slew rates.

Output slew rate is a function of input slew rate on each input.

Wires exhibit RLC characteristics instead of lumped RC.

Gate Delay

Transistors within a gate take a finite time to switch. This means that a change on the input of a gate takes a finite time to cause a change on the output. [Magma]

Gate delay =function of (input transition (slew) time, Cnet+Cpin).

or

Gate delay =function of (input transition (slew) time, Cload).

where Cload=Cnet+Cpin

Cnet-->Net capacitance

Cpin-->pin capacitance of the driven cell

Cell delay is also same as Gate delay.

How gate delay is calculated?

http://4.bp.blogspot.com/_Se0VANaI9uM/SLt6koxEfsI/AAAAAAAAAYA/SzwpjCiA_k0/s1600-h/stage_delay.jpg

Cell or gate delay is calculated using Non-Linear Delay Models (NLDM). NLDM is highly accurate as it is derived from SPICE characterizations. The delay is a function of the input transition time (i.e. slew) of the cell, the wire capacitance and the pin capacitance of the driven cells. A slow input transition time will slow the rate at which the cell’s transistors can change state logic 1 to logic 0 (or logic 0 to logic 1), as well as a large output load Cload (Cnet + Cpin), thereby increasing the delay of the logic gate.

There is another NLDM table in the library to calculate output transition. Output transition of a cell becomes the input transition of the next cell down the chain.

Table models are usually two-dimensional to allow lookups based on the input slew and the output load (Cload). A sample table is given below.

timing() {

related_pin : "CKN";

timing_type : falling_edge;

timing_sense : non_unate;

cell_rise(delay_template_7x7) {

index_1 ("0.012, 0.032, 0.074, 0.154, 0.318, 0.644, 1.3");

index_2 ("0.001278, 0.0046008, 0.0112464, 0.0245376, 0.05112, 0.10454, 0.212148");

values ( \

"0.225894, 0.249015, 0.285537, 0.352680, 0.484244, 0.748180, 1.279570", \

"0.231295, 0.254415, 0.290938, 0.358081, 0.489646, 0.753585, 1.284980", \

"0.243754, 0.266878, 0.303398, 0.370542, 0.502105, 0.766044, 1.297440", \

"0.267240, 0.290389, 0.326908, 0.394052, 0.525615, 0.789561, 1.320950", \

"0.307080, 0.330200, 0.366721, 0.433861, 0.565425, 0.829373, 1.360760", \

"0.380552, 0.403875, 0.440426, 0.507569, 0.639136, 0.903084, 1.434500", \

"0.497588, 0.521769, 0.558548, 0.625744, 0.757301, 1.021260, 1.552680");

}

rise_transition(delay_template_7x7) {

index_1 ("0.012, 0.032, 0.074, 0.154, 0.318, 0.644, 1.3");

index_2 ("0.001278, 0.0046008, 0.0112464, 0.0245376, 0.05112, 0.10454, 0.212148");

values ( \

"0.040574, 0.068619, 0.125391, 0.246672, 0.497688, 1.005982, 2.030120", \

"0.040570, 0.068618, 0.125390, 0.246672, 0.497688, 1.005940, 2.030240", \

"0.040565, 0.068616, 0.125389, 0.246650, 0.497770, 1.006180, 2.030120", \

"0.040532, 0.068612, 0.125387, 0.246670, 0.497710, 1.006164, 2.030100", \

"0.040578, 0.068621, 0.125392, 0.246636, 0.497688, 1.006182, 2.030040", \

"0.041763, 0.069211, 0.125662, 0.246758, 0.497726, 1.005930, 2.030000", \

"0.045813, 0.071321, 0.126671, 0.247154, 0.497846, 1.005962, 2.030180");

}

index_1 --> input transition values

index_2--> output load capacitance values

values--> delay values

Situation 1:

Input transition and output load values match with table index values

If both input transition and output load values match with table index values then corresponding delay value is directly picked up from the delay “values” table as highlighted by yellow shaded data.

Situation 2:

Output load values doesn't match with table index values

When the actual load capacitance values does not fall directly on or at one of the load-axis index points, the delay is determined by interpolation from the

closest points. Note that to carry out interpolation input transition point should match with the any one of the table index values.

Determine the equation for the line segment connecting the two nearest points in the table.

To do this first we need to find the slope value.

Slope m = (y2-y1)/(x2-x1) where (y2-y1) is delay segment (generally in ns) on y axis and (x2-x1) is load segment (generally in pf) on x-axis.

Solve for the delay at the load point of interest.

The linear equation is:

y = mx+c

where

y-->delay (ns)

m-->slope

x-->load capacitance (pf)

i.e. delay=slope*load point of interest (constant value is zero)

Load point of interest means load capacitance value for which delay has to be calculated.

Situation 3:

Both input transition and output load values doesn't match with table index values

If both input transition and load capacitance values do not match exactly with the look up table index values then bilinear interpolation is used.

Multiple linear interpolations (~3) are performed on multiple closest table data points (~4) as shown in highlighted violet color in the look up table.

Situation 4:

Output load values doesn't match with table index values and is outside the table boundary

When the load point is outside of the boundary of the index, the delay is extrapolated to the closest known points.

Lookup value too far out of range of the given table value could lead to inaccuracy. [Cadence]

Intrinsic delay

Intrinsic delay is the delay internal to the gate. This is from input pin of the cell to output pin of the cell.

It is defined as the delay between an input and output pair of a cell, when a near zero slew is applied to the input pin and the output does not see any load condition. It is caused by the internal capacitance associated with its transistor.

This delay is largely dependent on the size of the transistors forming the gate because increasing size of transistors increase internal capacitors.

How net delay is calculated?

Net delay is the difference between the time a signal is first applied to the net and the time it reaches other devices connected to that net.

It is due to the finite resistance and capacitance of the net. It is also known as wire delay.

Wire delay = function of (Rnet, Cnet+Cpin)

This is output pin of the cell to the input pin of the next cell.

Net delay is calculated using Rs and Cs.

There are several factors which affect net parasitic:

http://asic-soc.blogspot.com/2008/10/net-delay.html

http://3.bp.blogspot.com/_Se0VANaI9uM/SPRYqi9e9iI/AAAAAAAAAlI/lFq-PDJaqug/s1600-h/net_delay.jpg

Net Length

Net cross-sectional area Resistively of material used for metal layers (Aluminum vs. copper) Number of vias traversed by the net Proximity to other nets (crosstalk)

Post-layout design is annotated with RCs extracted from layout for better accuracy. Annotated RCs override information from WLM.

Interconnect introduces capacitive, resistive and inductive parasites. All three have multiple effects on the circuit behavior.

1. Interconnect parasites cause an increase in propagation delay (i.e. it slows down working speed)

2. Interconnect parasites increase energy dissipation and affect the power distribution.

3. Interconnect parasites introduce extra noise sources, which affect reliability of the circuit. (Signal Integrity effects)

Dominant parameters determine the circuit behavior at a given circuit node. Non-dominant parameters can be neglected for interconnect analysis.

Inductive effect can be ignored if the resistance of the wire is substantial enough-this is the case for long aluminum wires with a small cross section or if the rise and fall times of the applied signals are slow.

When the wires are short, the cross section of the wire is large or the interconnect material used has a low resistivity, a capacitive only model can be used.

When the separation between neighboring wires is large or when the wires only run together for short distance, inter-wire capacitance can be ignored, and all the parasitic capacitance can be modeled as capacitance to ground.

Capacitance

Capacitance can be modeled by the parallel plate capacitor model.

C = ( / t).WLε

Where

--> permittivity of dielectric material (SiO2)ε

t --> thickness of dielectric material (SiO2)

W --> width of wire

L --> length of wire

--> ε εr εo where εr --> relative permittivity of SiO2

εo --> 8.854 x 10-12 F/m; permittivity of free space

As technology node shrinks (scaling), to minimize resistance of the wires, it is desirable to keep the cross section of the wire (WxH) as large as possible. But this increases area. Small values of W lead to denser wiring and less area overhead. In advanced process W/H ratio has reduced below unity. Under such circumstances parallel plate capacitance model becomes inaccurate. The capacitance between the sidewall of the wires and substrate called fringing capacitance can no longer be ignored and contributes to the overall capacitance.

Inter-wire capacitance become dominant factor in multilayer interconnect structures. These floating capacitors (not connected to substrate or ground) form a source of noise (cross talk). This effect is more pronounced for wires in the higher interconnect layer, as these are farther away from the substrate.

Generally higher metal layers (i.e. interconnects) have higher thickness (i.e. height) and higher dielectric layers have higher permittivity. Hence these

http://1.bp.blogspot.com/_Se0VANaI9uM/SPRYcPRqdvI/AAAAAAAAAlA/QIUbL-yEqdI/s1600-h/cfringe.jpg

wires display the highest inter-wire capacitance. Hence use it for global signals that are not sensitive to interference. (eg. Supply rails). Or it is advisable to separate wires by an amount that is larger than minimum spacing.

Resistance

Resistance R= ( .L)/ (H.W) = ( . L)/ Area ρ ρ

L --> length

W --> width

--> resistivity (ohm-m)ρ

Since H (height, thickness) is constant for a given technology we can write: R = Rs.(L/W) where Rs= /H ohm/sqare is called “ρ sheet resistance”.

At very high frequencies “skin effect” comes into play such that the resistance becomes frequency dependent. High frequency currents tend to flow primarily on the surface of a conductor, with the current density falling off exponentially with depth into the conductor.

Skin effect is only an issue for wider wires. Since clocks tends to carry the highest frequency signals on a chip and also fairly wide to limit resistance, the skin effect likely to have its first impact on these lines.

Inductance

With the adoption of low resistance interconnect materials and the increase of switching frequencies to GHz range, inductance starts to an important role. Consequences of on chip inductance include ringing and overshoot effect, reflection of signals due to impedance mismatch, inductive coupling between lines, and switching noise due to (Ldi/dt) voltage drops.

Lumped Capacitor Model

As long as the resistive component of the wire is small, and switching frequencies are in the low to medium range, it is meaningful to consider only the capacitive component of the wire, and to lump the distributed capacitance into a single capacitance.

The only impact on performance is introduced by the loading effect of the capacitor on the driving gate.

Lumped RC Model

If wire length is more than a few millimeters, the lumped capacitance model is inadequate and a resistive capacitive model has to be adopted.

In lumped RC model the total resistance of each wire segment is lumped into one single R, combines the global capacitive into single capacitor C.

Analysis of network with larger number of R and C becomes complex as network contains many time constants (zeroes and poles). Elmore delay model overcome such problem.

Elmore Delay Model

Properties of the network:

http://1.bp.blogspot.com/_Se0VANaI9uM/SPRYObsl84I/AAAAAAAAAk4/B7GimmAZLGY/s1600-h/lumped+c+model.jpg

http://3.bp.blogspot.com/_Se0VANaI9uM/SPRX_YTjZ3I/AAAAAAAAAkw/Xj47EMHYvqs/s1600-h/elmore+delay+model.jpg

Has single input node All the capacitors are between a node and ground. Network does not contain any resistive loops.

“Path resistance” is the resistance from source node to any other node.

“Shared path resistance” is the resistance shared among the paths from the source node to any other two nodes.

Hence,

Delay at node 1: Tow d1 = R1C1

Delay at node 2: Tow d2= (R1+R2)C2

Delay at node 3: Tow d3 = (R1+R2+R3)C3

In general:

di=R1C1+(R1+R2)C2+……..+(R1+R2+R3+…..+Ri)Ciτ

If

R1=R2=R3=….=R

C1=C2=C3=…..C then

di=RC+2RC+……..+nRCτ

Thus Elmore delay is equivalent to the first order time constant of the network.

Assuming an interconnect wire of length L is partitioned into N identical segments. Each segment has length L/N.

Then,

d=L/N.R.L/N.C+ 2 (L/n.r+L/N.C)+……τ

=(L/N)2(RC+2RC+…….+NRC)

=(L/N)2. N(N+1)

or d=RC.Lτ 2/2

=> The delay of a wire is a quadratic function of its length

=> doubling the length of the wire quadruples its delay

Advantages

It is simple It is always situated between minimum and maximum bounds

Disadvantages

It is pessimistic and inaccurate for long interconnect wires.

Distributed RC model

Lumped RC model is always pessimistic and distributed RC model provides better accuracy over lumped RC model.

But distributed RC model is complex and no closed form solution exists. Hence distributed RC line model is not suitable for Computer Aided Design Tools.

The behavior of the distributed RC line can be approximated by a lumped RC ladder network such as Elmore Delay model hence these are extensively used in EDA tools.

Transmission Line Model

When frequency of operation increases to a larger extent, rise (or fall) time of the signal becomes comparable to time of flight of the net, then inductive effects starts dominating over RC values. This inductive effect is modeled by Transmission Line models. The model assumes that the signal is a "wave" and it propagates over the medium "net".

There are two types of transmission models:

Lossless transmission line model: This is good for Printed Circuit Board level design.

Lossy transmission line model: This model is used for IC interconnect model.

Transmission line effects should be considered when the rise or fall time of the input signal is smaller than the time of flight of the transmission line or resistance of the wire is less than characteristics impedance.

Wire Load Models

“A Wire load model is an estimate of a net’s RC parasitic based on the net’s fanout”

Extraction data from already routed designs are used to build a lookup table known as the wire load model (WLM). WLM is based on the statistical estimates of R and C based on “Net Fan-out”.

For fanouts greater than those specified in a wire load table, a “slope factor” is specified for linear extrapolation.

wire_load (“5KGATES”) {

resistance : 0.000271 -------------> R per unit length

capacitance : 0.00017 -------------> C per unit length

slope : 29.4005 ---------------------> Used for linear extrapolation

fanout_length (1, 18.38) ----------> (fanout = 1, length = 18.38)

fanout_length (2, 47.78)




}

Eg:

Fanout = 7

Net length = 135.98 + 2 x 29.4005 (slope) = 194.78 ----------> length of net with fanout of 7 Resistance = 194.78 x 0.000271 = 0.05279 units Capacitance = 194.78 x 0.00017 = 0.03311 units

Wire load models for synthesis

Wire load modeling allows us to estimate the effect of wire length and fanout on the resistance, capacitance, and area of nets. Synthesizer uses these physical values to calculate wire delays and circuit speeds. Semiconductor vendors develop wire load models, based on statistical information specific to the vendors’ process. The models include coefficients for area, capacitance, and resistance per unit length, and a fanout-to-length table for estimating net lengths (the number of fanouts determines a nominal length).

Selection of wire load models in the initial stage (before physical design) depends on the fallowing factors:

1. User specification

2. Automatic selection based on design area

3. Default specification in the technology library

Once the final routing step is over in the physical design stage, wire load models are generated based on the actual routing in the design and synthesis is redone using those wire load models.

In hierarchical designs, we have to determine which wire load model to use for nets that cross hierarchical boundaries. There are three modes for determining which wire load model to use for nets that cross hierarchical boundaries:

Top:

Applying same wire load models to all nets as if the design has no hierarchy and uses the wire load model specified for the top level of the design hierarchy for all nets in a design and its sub designs.

http://2.bp.blogspot.com/_Se0VANaI9uM/SUja1Nz_hdI/AAAAAAAAAmQ/7pDggc5Yu2k/s1600-h/wire_load_models.jpg

Enclosed:

The wire load model of the smallest design that fully encloses the net is applied. If the design enclosing the net has no wire load model, then traverses the design hierarchy upward until we finds a wire load model. Enclosed mode is more accurate than top mode when cells in the same design are placed in a contiguous region during layout.

Use enclosed mode if the design has similar logical and physical hierarchies.

Segmented:

Wire load model for each segment of a net is determined by the design encompassing the segment. Nets crossing hierarchical boundaries are divided into segments. For each net segment, the wire load model of the design containing the segment is used. If the design contains a segment that has no wire load model, then traverse the design hierarchy upward until it finds a wire load model.

Interconnect Delay vs. Deep Sub Micron Issues

Performances of deep sub micron ICs are limited by increasing interconnect loading affect. Long global clock networks account for the larger part of the power consumption in chips. Traditional CAD design methodologies are largely affected by the interconnect scaling. Capacitance and resistance of interconnects have increased due to the smaller wire cross sections, smaller wire pitch and longer length. This has resulted in increased RC delay. As technology is advancing scaling of interconnect is also increasing. In such scenario increased RC delay is becoming major bottleneck in improving performance of advanced ICs.

Here the gate delay and the interconnect delay are shown as functions of various technology nodes ranging from 180nm to 60nm. The interconnect delays shown assumes a line where repeaters are connected optimally and includes the delay due to the repeaters. From the graph it can be observed that with the shrinking of technology gate delay reduces but interconnect delay increases.

Limits of Cu/low-k interconnects

At submicron level of 250 nm copper with low-k dielectric was introduced to decrease affects of increasing interconnect delay. But below 130 nm technology node interconnect delays are increasing further despite of introducing low-k dielectric. As the scaling increases new physical and technological effects like resistivity and barrier thickness start dominating and interconnect delay increases. Introduction of repeaters to shorten the interconnect length increases total area. The vias connecting repeaters to global layers can cause blockage in lower metal layers. Thus as the technology improves material limitations will dominate factor in the interconnect delay. Increasing metal layer width will cause

increase in metallization layer. This can’t be a solution for the problem as it increases complexity, reliability and cost.

Cu low-k dielectric films are deposited by a special process known as Damascene process. Adhesion property of Cu with dielectric materials is very poor. Under electric bias they easily drift and cause short between metal layers. To avoid this problem a barrier layer is deposited between dielectric and Cu trench. Even though it decreases effective cross section of interconnects compared to drawn dimensions, it improves reliability. The barrier thickness becomes significant in deep submicron level and effective resistance of the interconnect rises further. In addition to this increasing electron scattering and self heating caused by the electron flow in interconnects due to comparable increase in internal chip temperature also contribute to increase interconnect resistance.

Contamination Delay:

Best case delay from valid input to valid output. i.e. minimum propagation delay.

Documents

Delay Calculation(IMP)