Timing Analysis of Cyclic Combinational Circuits
Marc D. Riedel and Jehoshua Bruck California Institute of Technology
IWLS, Temecula Creek, CA, June 4, 2004
[Figures: Marrella splendens; a cyclic circuit]
Combinational Circuits
The current outputs depend only on the current inputs.
[Figure: a block of combinational logic with inputs $x_1, x_2, \dots, x_m$ and outputs $f_1(x_1, \dots, x_m), f_2(x_1, \dots, x_m), \dots, f_n(x_1, \dots, x_m)$.]
$x_i \in \{0,1\}$, $i = 1, \dots, m$
$f_j : \{0,1\}^m \to \{0,1\}$, $j = 1, \dots, n$
Combinational Circuits
Generally acyclic (i.e., feed-forward) structures.
[Figure: acyclic circuit with inputs x, y, z and outputs s and c, built from two XOR gates, two AND gates, and one OR gate.]
Cyclic Combinational Circuits
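[Figure: six gates (three AND, three OR) connected in a cycle, with inputs a, b, c, d, x and outputs $f_1$ and $f_2$.]
$f_1 = b(a + x(d + c(x + f_1)))$, which simplifies to $f_1 = b(a + x(c + d))$
$f_2 = d + c(x + b(a + x f_2))$, which simplifies to $f_2 = d + c(x + ba)$
The circuit is cyclic yet combinational; it computes the functions $f_1$ and $f_2$ with 6 gates.
An acyclic circuit computing these functions requires 8 gates.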
Timing Analysis
Predicated on a topological ordering.
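level: $l_j = \max_{i \in \mathrm{fanin}(g_j)} l_i + 1$
$l_1 = 1$, $l_2 = 1$, $l_3 = 2$, $l_5 = 2$, $l_4 = 3$
[Figure: the acyclic circuit with gates $g_1, \dots, g_5$, inputs x, y, z and outputs c, s; each gate is labelled with its level and each wire with its value and arrival time.]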
Timing Analysis
Predicated on a topological ordering.
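Arrival times (assume a delay bound of 1 time unit for each gate).
level: $l_j = \max_{i \in \mathrm{fanin}(g_j)} l_i + 1$
$l_1 = 1$, $l_2 = 1$, $l_3 = 2$, $l_5 = 2$, $l_4 = 3$
[Figure: the acyclic circuit with gates $g_1, \dots, g_5$, inputs x, y, z and outputs c, s; each wire is annotated with its value and arrival time.]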
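The level recurrence can be sketched in Python as follows. The gate wiring below is an assumption (a full adder), since the slides show the circuit only as a figure; it was chosen because it yields the same levels as those listed. With a delay bound of 1 time unit per gate, the level of a gate is also a bound on its arrival time.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical wiring (not given in the slides): a full adder.
# Each gate maps to the signals in its fan-in.
FANIN = {
    "g1": ["x", "y"],    # XOR
    "g2": ["x", "y"],    # AND
    "g3": ["g1", "z"],   # AND
    "g4": ["g2", "g3"],  # OR,  carry output c
    "g5": ["g1", "z"],   # XOR, sum output s
}

def levels(fanin):
    """Return l_j = 1 + max of the fan-in levels; primary inputs get level 0."""
    level = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(fanin).static_order():
        preds = fanin.get(node, [])
        level[node] = 1 + max(level[p] for p in preds) if preds else 0
    return level

lv = levels(FANIN)
print({g: lv[g] for g in sorted(FANIN)})
# {'g1': 1, 'g2': 1, 'g3': 2, 'g4': 3, 'g5': 2}
```

The sketch relies on a topological order existing, which is exactly what fails once the circuit contains a cycle.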
Cyclic Combinational Circuits
[Figure: the cyclic circuit of six gates (three AND, three OR) with inputs a, b, c, d, x, computing $f_1 = b(a + x(c + d))$ and $f_2 = d + c(x + ba)$.]
No topological ordering.
How can we perform timing analysis?
Cyclic Combinational Circuits
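[Figure: the same cyclic circuit, computing $f_1 = b(a + x(c + d))$ and $f_2 = d + c(x + ba)$. The six input terminals are annotated $1^0$, $1^0$, $0^0$, $1^0$, $1^0$, $0^0$ (value, with arrival time as superscript) and the six gate outputs $1^1$, $1^2$, $1^3$, $1^4$, $1^5$, $1^6$.]
No topological ordering.
How can we perform timing analysis?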
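One way to see that arrival times are still well defined without a topological ordering is to propagate events until nothing changes. The sketch below does this for a single input assignment. The ring wiring is an assumption, chosen to be consistent with $f_1 = b(a + x(c + d))$ and $f_2 = d + c(x + ba)$; the gate names and the input assignment are likewise illustrative.

```python
# Assumed wiring: gate -> (type, primary input, gate feeding it from the cycle).
RING = {
    "g1": ("AND", "x", "g6"),
    "g2": ("OR",  "a", "g1"),
    "g3": ("AND", "b", "g2"),   # output f1
    "g4": ("OR",  "x", "g3"),
    "g5": ("AND", "c", "g4"),
    "g6": ("OR",  "d", "g5"),   # output f2
}

def propagate(inputs, delay=1):
    """Return {gate: (value, arrival time)}; gates that stay undefined are omitted."""
    sig = {name: (val, 0) for name, val in inputs.items()}  # inputs arrive at time 0
    changed = True
    while changed:
        changed = False
        for g, (kind, pin, prev) in RING.items():
            if g in sig:
                continue
            ctrl = 0 if kind == "AND" else 1
            known = [sig[s] for s in (pin, prev) if s in sig]
            ctrl_times = [t for v, t in known if v == ctrl]
            if ctrl_times:                    # a controlling input decides the output
                sig[g] = (ctrl, min(ctrl_times) + delay)
            elif len(known) == 2:             # both inputs known and non-controlling
                sig[g] = (1 - ctrl, max(t for _, t in known) + delay)
            else:
                continue                      # output still unknown
            changed = True
    return {g: sig[g] for g in RING if g in sig}

result = propagate({"a": 0, "b": 1, "c": 1, "d": 0, "x": 1})
print(result)
```

For this assignment every gate settles to 1, with arrival times 1 through 6 around the cycle, starting from the OR gate that sees x = 1.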
Prior Work
In previous papers, we presented:
• Algorithms for functional analysis (IWLS’03);
• Strategies for synthesis (DAC’03).
In trials on benchmark circuits, cyclic optimizations reduced the area by as much as 30%.
Optimization for Area
Number of NAND2/NOR2 gates in Berkeley SIS vs. CYCLIFY solutions
(application of “script.rugged” and mapping)

Benchmark   Berkeley SIS   Caltech CYCLIFY   Improvement
5xp1                 203               182        10.34%
ex6                  194               152        21.65%
planet               943               889         5.73%
s386                 231               222         3.90%
bw                   302               255        15.56%
cse                  344               329         4.36%
pma                  409               393         3.91%
s510                 514               483         6.03%
duke2                847               673        20.54%
styr                 858               758        11.66%
s1488               1084              1003         7.47%
Contributions
In this paper, we discuss:
• An algorithm for timing analysis.
• Synthesis results, with optimization jointly targeting area and delay.
In trials on benchmark circuits, cyclic optimizations simultaneously reduced the area by up to 10% and the delay by up to 25%.
Related Work
Malik (1994), Hsu, Sun and Du (1998), and Edwards (2003) considered analysis techniques for cyclic circuits.
Their approach: identify equivalent acyclic circuits.
[Figure: a cyclic circuit between inputs and outputs is unravelled into an acyclic circuit by cutting it at a minimum-cut feedback set.]
Unravelling cyclic circuits this way is a difficult task.
Our Approach
Perform event propagation, directly on a cyclic circuit.
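[Figure: a cyclic circuit between inputs and outputs; the inputs are annotated $1^0$, $1^0$, $0^0$, $1^0$, $0^0$ and the outputs $1^3$ and $1^6$ (value, with arrival time as superscript).]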
Our Approach
Perform event propagation, directly on a cyclic circuit.
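Compute events symbolically, with BDDs.
Input events: $[a]^0$, $[b]^0$, $[c]^0$, $[d]^0$, $[x]^0$
Output events: $f_1 = [b(a + x(c + d))]^6$, $f_2 = [d + c(x + ba)]^6$
[Figure: the cyclic circuit with these symbolic events at its inputs and outputs.]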
Circuit Model
Perform static analysis in the “floating-mode”. At the outset:
• all wires are assumed to have unknown/undefined values (⊥);
• the primary inputs assume definite values in {0, 1}.
[Figure: three AND gates. A “controlling” input (0) forces the output to 0; a full set of “non-controlling” inputs (1, 1) gives output 1; otherwise the output is unknown/undefined.]
Circuit Model
During the analysis, only signals driven (directly or indirectly) by the primary inputs are assigned definite values.
[Figure: an OR gate and an AND gate; a primary input with value 1 drives a definite value through the gates.]
Circuit Model
Up-bounded inertial delay model: each gate has a delay in $[0, t_d]$.
This ensures the monotone speed-up property.
Circuit Model
The arrival time at a gate output is determined:
• either by the earliest controlling input.
[Figure: AND gate with inputs $1^3$, $0^2$, $0^6$ and output $0^3$ (assuming a delay bound of 1).]
Circuit Model
The arrival time at a gate output is determined:
• either by the earliest controlling input;
• or by the latest non-controlling input.
[Figure: AND gate with inputs $1^3$, $1^2$, $1^6$ and output $1^7$ (assuming a delay bound of 1).]
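The two cases can be written as a small Python function. This is an illustrative sketch, not the authors' implementation: the function name and the representation of a signal as a `(value, time)` pair, with `None` standing for an unknown value, are assumptions.

```python
def gate_output(inputs, controlling, delay=1):
    """Value and arrival time of a gate output.

    inputs: list of (value, time) pairs; value None means unknown.
    controlling: 0 for an AND gate, 1 for an OR gate.
    """
    ctrl_times = [t for v, t in inputs if v == controlling]
    if ctrl_times:
        # The earliest controlling input determines the output.
        return controlling, min(ctrl_times) + delay
    if all(v is not None for v, _ in inputs):
        # All inputs are non-controlling: wait for the latest one.
        return 1 - controlling, max(t for _, t in inputs) + delay
    return None, None  # output still unknown

print(gate_output([(1, 3), (0, 2), (0, 6)], controlling=0))  # (0, 3)
print(gate_output([(1, 3), (1, 2), (1, 6)], controlling=0))  # (1, 7)
```

The two calls use the values from the AND-gate figures: a 0 arriving at time 2 fixes the output at time 3, whereas with all inputs at 1 the output has to wait for the input arriving at time 6.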
Timing Analysis
Characterize arrival times symbolically (with BDDs):
$C^{(0)}$: set of input assignments that produce 0
$C^{(1)}$: set of input assignments that produce 1
Implicitly, $\overline{C^{(0)} \cup C^{(1)}}$: set of input assignments for which the output is unknown/undefined (⊥)
Timing Analysis
Characterize arrival times symbolically (with BDDs):
$C^{(0)}$: set of input assignments that produce 0
$C^{(1)}$: set of input assignments that produce 1
Time-stamp the characteristic sets: $[C^{(v)}]^t$, where $t$ is the arrival time.
Initialization
internal signals: $[C^{(0)}]^0 = 0$, $[C^{(1)}]^0 = 0$
primary inputs $x$: $[C^{(0)}]^0 = \bar{x}$, $[C^{(1)}]^0 = x$
Propagation
If there is a change in the characteristic set of a gate’s fan-in:
For a controlling input value $v$, producing an output value $w$,
$C_j^{(w)} := C_j^{(w)} \cup C_i^{(v)}$
Propagation
If there is a change in the characteristic set of a gate’s fan-in:
For non-controlling input values $v_1, v_2, v_3$ producing an output value $w$,
$C_j^{(w)} := C_{k_1}^{(v_1)} \cap C_{k_2}^{(v_2)} \cap C_{k_3}^{(v_3)}$
Propagation
If there is a change in the characteristic set of a gate’s fan-in:
If $C_j^{(w)}$ changes as a result, update its time-stamp:
$[C_j^{(w)}]^{t}$, $t := t + t_d$ (each gate has a delay in $[0, t_d]$)
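Example
[Figure: the cyclic circuit with gates $g_1, \dots, g_6$ (three AND, three OR) and inputs a, b, c, d, x; each gate $g_i$ carries the characteristic sets $C_i^{(0)}$ and $C_i^{(1)}$.]
Characteristic sets that change at each time step:
time 0: all twelve sets are 0
time 1: $\bar{x}$, $\bar{b}$, $\bar{c}$, $a$, $x$, $d$
time 2: $xd$, $\bar{x}\bar{a}$, $ab$, $\bar{b}\bar{x}$, $xc$, $\bar{c}\bar{d}$
time 3: $\bar{c}\bar{d} + \bar{x}$, $xd + a$, $\bar{x}\bar{a} + \bar{b}$, $ab + x$, $\bar{b}\bar{x} + \bar{c}$, $cx + d$
time 4: $x(c + d)$, $\bar{a}(\bar{x} + \bar{c}\bar{d})$, $b(a + xd)$, $\bar{x}(\bar{a} + \bar{b})$, $c(x + ab)$, $\bar{d}(\bar{c} + \bar{b}\bar{x})$
time 5: $\bar{b} + \bar{a}(\bar{x} + \bar{c}\bar{d})$, $d + c(x + ab)$, $a + x(c + d)$, $\bar{c} + \bar{x}(\bar{a} + \bar{b})$
time 6: $\bar{d}(\bar{c} + \bar{x}(\bar{a} + \bar{b}))$, $b(a + x(c + d))$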
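The propagation in this example can be imitated in a few lines of Python. This is only a sketch under stated assumptions: explicit sets of input assignments stand in for BDDs, the ring wiring and gate names are assumed (chosen to be consistent with $f_1 = b(a + x(c + d))$ and $f_2 = d + c(x + ba)$), and all sets are updated in lock-step with the delay bound $t_d$.

```python
from itertools import product

VARS = "abcdx"
ALL = frozenset(product((0, 1), repeat=len(VARS)))  # every input assignment

def lit(name, value):
    """Set of assignments in which primary input `name` equals `value`."""
    i = VARS.index(name)
    return frozenset(a for a in ALL if a[i] == value)

# Assumed wiring: gate -> (type, primary input, gate feeding it from the cycle).
RING = {
    "g1": ("AND", "x", "g6"), "g2": ("OR", "a", "g1"),
    "g3": ("AND", "b", "g2"), "g4": ("OR", "x", "g3"),
    "g5": ("AND", "c", "g4"), "g6": ("OR", "d", "g5"),
}

def analyze(t_d=1):
    """Return (C, stamp): characteristic sets and the time of their last change."""
    C = {g: {0: frozenset(), 1: frozenset()} for g in RING}
    stamp = {g: {0: 0, 1: 0} for g in RING}
    t = 0
    while True:
        t += t_d
        new = {}
        for g, (kind, pin, prev) in RING.items():
            ctrl = 0 if kind == "AND" else 1
            # Controlling value on either input: union into the set.
            on_ctrl = C[g][ctrl] | lit(pin, ctrl) | C[prev][ctrl]
            # Non-controlling values on all inputs: intersection.
            on_non = lit(pin, 1 - ctrl) & C[prev][1 - ctrl]
            new[g] = {ctrl: on_ctrl, 1 - ctrl: on_non}
        changed = False
        for g in RING:
            for v in (0, 1):
                if new[g][v] != C[g][v]:
                    stamp[g][v] = t  # time-stamp the set that just changed
                    changed = True
        if not changed:
            return C, stamp
        C = new

C, stamp = analyze()
f1 = frozenset(s for s in ALL if s[1] and (s[0] or (s[4] and (s[2] or s[3]))))
f2 = frozenset(s for s in ALL if s[3] or (s[2] and (s[4] or (s[1] and s[0]))))
print(C["g3"][1] == f1, C["g6"][1] == f2)  # True True
print(max(stamp["g3"].values()), max(stamp["g6"].values()))  # 6 6
```

Under the assumed wiring, $g_3$ and $g_6$ play the roles of $f_1$ and $f_2$. Both settle at time 6, and for every gate the two sets together cover all input assignments, so the circuit is combinational.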
Timing Analysis
• The algorithm terminates since the cardinality of each set $C_i^{(v)}$ can only increase over time and is bounded: at most, $C_i^{(v)} = 1$.
• The circuit is combinational iff the “care” set of input assignments is contained within $C_i^{(0)} \cup C_i^{(1)}$ for each output gate $g_i$.
• The bounds on the arrival times at the output gates give a bound on the circuit delay.
Multi-Terminal BDDs
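For finer-grained timing information, preserve a history of the changes.
[Figure: multi-terminal BDD for $f_2 = d + c(x + ba)$ over the variables d, c, x, b, a, with terminals $1^1$, $0^2$, $1^3$, $0^4$, $1^5$, $0^6$ (value, with arrival time as superscript).]
Reference: Bahar et al., “Timing Analysis using ADDs”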
Synthesis
Select best solution through a branch-and-bound search.
[Figure: search tree over candidate networks N1 to N9.]
The analysis algorithm is used to validate and rank potential solutions.
See The Synthesis of Cyclic Combinational Circuits, DAC’03.
Implementation: CYCLIFY Program
• Incorporated synthesis methodology in a general logic synthesis environment (Berkeley SIS package).
• Trials on a wide range of circuits:
– randomly generated
– benchmarks
– industrial designs.
• Conclusion: nearly all circuits of practical interest can be optimized with feedback.
Optimization for Area and Delay
Area and delay of Berkeley SIS vs. CYCLIFY solutions
(application of “script.delay” and mapping)
Area: number of NAND2/NOR2 gates. Delay: 1 time unit/gate.

             Berkeley SIS      Caltech CYCLIFY
Benchmark    Area    Delay     Area   Improvement   Delay   Improvement
p82           175       19      167         4.57%      15        21.05%
t1            343       17      327         4.66%      14        17.65%
in3           599       40      593         1.00%      33        17.50%
in2           590       34      558         5.42%      29        14.71%
5xp1          210       23      180        14.29%      22         4.35%
bw            280       28      254         9.29%      20        28.57%
s510          452       28      444         1.77%      24        14.29%
s1            566       36      542         4.24%      31        13.89%
duke2         742       38      716         3.50%      34        10.53%
s1488        1016       43      995         2.07%      34        20.93%
s1494        1090       46     1079         1.01%      39        15.22%
Discussion
Synthesis strategies targeting area and delay:
• Nearly all circuits can be optimized with cycles.
• Optimizations are significant.
Analysis through symbolic event propagation:
• Existing methods can be applied to cyclic circuits.
• Complexity is comparable for cyclic and acyclic circuits.
Future Directions
• Apply more realistic timing models for analysis.
• Use more efficient symbolic techniques (e.g., Boolean satisfiability (SAT)-based techniques).
• Incorporate more sophisticated search heuristics into synthesis.