ELEC 7770 Advanced VLSI Design
Spring 2014
Retiming
Vishwani D. Agrawal
James J. Danaher Professor
ECE Department, Auburn University
Auburn, AL 36849
[email protected]
http://www.eng.auburn.edu/~vagrawal/COURSE/E7770_Spr14/course.html
Spring 2014, Feb 10 . . . ELEC 7770: Advanced VLSI Design (Agrawal)

Retiming
Retiming is a function-preserving transformation of a synchronous sequential circuit.
Flip-flops are moved according to specific rules.
Original references:
  C. E. Leiserson, F. Rose and J. B. Saxe, “Optimizing Synchronous Circuits by Retiming,” Proc. 3rd Caltech Conf. on VLSI, 1983, pp. 87-116.
  C. E. Leiserson and J. B. Saxe, “Retiming Synchronous Circuitry,” Algorithmica, vol. 6, pp. 5-35, 1991.

A Trivial Example: Reduced Hardware
[Figure: circuit diagram with three flip-flops (FF).]

Example 2: Faster Clock
[Figure: circuit diagram with two flip-flops (FF).]

Example 3: Reduced Flip-Flops
[Figure: circuit diagram with three flip-flops (FF).]

Applications of Retiming
Performance optimization
Area optimization
Power optimization
Testability enhancement
FPGA optimization

Fundamental Operation of Retiming
A retiming move in a circuit consists of moving all of the memory elements at the inputs of a combinational block to all of its outputs, or vice versa.
[Figure: combinational logic with flip-flops (FF) at its inputs ≡ the same combinational logic with flip-flops (FF) at its outputs.]

A Correlator Circuit
[Figure: correlator circuit connected to a host through a primary input (PI) and a primary output (PO). It contains four comparators (=, delay = 3) with inputs a1, a2, a3, a4, three adders (+, delay = 7), and flip-flops.]

Graph Model
[Figure: graph model of the correlator with vertices a, b, c, d, e, f, g and host h; vertex delays 7, 7, 7 and 3, 3, 3, 3; edge weights 0 and 1.]
Vertex vi: combinational, delay = d(vi), assumed unchanged by retiming; d(host) = 0
Edge e(vi,vj), or eij: weight wij = number of flip-flops between vi and vj
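The graph model can be written down as two small dictionaries. The sketch below is only illustrative: the edge list is reconstructed from the figure labels and should be treated as an assumption, as should the names `DELAY` and `EDGES`.

```python
# Vertex delays d(v): comparators a-d (delay 3), adders e-g (delay 7), host h (delay 0).
DELAY = {"h": 0, "a": 3, "b": 3, "c": 3, "d": 3, "e": 7, "f": 7, "g": 7}

# Edge weights w(u, v) = number of flip-flops on the connection u -> v.
# Assumed topology: the chain h -> a -> b -> c -> d carries one flip-flop per edge;
# all edges into the adders e, f, g and back to the host carry none.
EDGES = {
    ("h", "a"): 1, ("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("d", "e"): 0, ("e", "f"): 0, ("f", "g"): 0, ("g", "h"): 0,
    ("a", "g"): 0, ("b", "f"): 0, ("c", "e"): 0,
}

total_flip_flops = sum(EDGES.values())
fan_in_of_g = sorted(u for (u, v) in EDGES if v == "g")
print(total_flip_flops, fan_in_of_g)
```

Keeping delays on vertices and flip-flop counts on edges mirrors the slide's definitions of d(vi) and wij directly.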

Path Delay and Path Weight
A sequence of connected nodes specifies a path. A path does not traverse the host node.
Path delay = ∑ d(vi) = combinational delay of path
Path weight = ∑ wij = clock delay of path
Retiming of a node i is denoted by an integer ri
  It represents the number of registers moved across the node; initially ri = 0
  Register moved from output to input, ri → ri + 1
  Register moved from input to output, ri → ri – 1
After retiming, edge weight wij’ = wij + rj – ri
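Summing the edge rule along a path shows why only the two end vertices of a path matter. As a short worked step (the path notation v_0, ..., v_k and the symbols W(p), W'(p) are introduced here for illustration):

```latex
\begin{aligned}
W'(p) &= \sum_{i=0}^{k-1} w'_{i,i+1}
       = \sum_{i=0}^{k-1} \bigl( w_{i,i+1} + r_{i+1} - r_i \bigr) \\
      &= W(p) + r_k - r_0
\end{aligned}
```

The intermediate r terms cancel in pairs. For a cycle (v_k = v_0) the weight is unchanged, so retiming cannot alter the number of flip-flops on any cycle.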

Example of Node Retiming
[Figure: a chain of vertices, each with delay 3, shown before and after retiming.]
Before retiming: r1 = 0, r2 = 0, r3 = 0, r4 = 0, r5 = 0, r6 = 0; ∑ d(vi) = 12, ∑ wij = 0
After retiming: r1 = 0, r2 = -1, r3 = 0, r4 = 0, r5 = 1, r6 = 0; ∑ d(vi) = 12, ∑ wij = 2
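The numbers in this example can be checked with a few lines of Python. This is a sketch under stated assumptions: the chain topology and the initial flip-flops on the two outer edges are guesses that make the example legal, and `retime` is a hypothetical helper name.

```python
def retime(edges, r):
    """Return retimed weights w'(u, v) = w(u, v) + r[v] - r[u]."""
    return {(u, v): w + r[v] - r[u] for (u, v), w in edges.items()}

# Chain v1 -> v2 -> ... -> v6. Assumed initial weights: one flip-flop on each
# outer edge and none on the inner edges, so the path v2..v5 starts with weight 0.
edges = {(1, 2): 1, (2, 3): 0, (3, 4): 0, (4, 5): 0, (5, 6): 1}
r = {1: 0, 2: -1, 3: 0, 4: 0, 5: 1, 6: 0}

retimed = retime(edges, r)
inner = [(2, 3), (3, 4), (4, 5)]  # edges of the path from v2 to v5
before = sum(edges[e] for e in inner)
after = sum(retimed[e] for e in inner)
print(before, "->", after)
```

With r2 = -1 the flip-flop at the input of v2 moves to its output, and with r5 = 1 the flip-flop at the output of v5 moves to its input, so the path from v2 to v5 gains two flip-flops while its delay of 12 is unchanged.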

Legal Retiming
Retiming is legal if the retimed circuit has no negative weights.
A legally retimed circuit is functionally equivalent to the original circuit; proof by Leiserson and Saxe (1991).
Retiming is the most general method for changing the register count and position without knowing the functions of vertices.
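The legality rule is a one-line check on the retimed edge weights. The sketch below uses a hypothetical three-vertex ring (vertices u, v, t are made up for illustration) and the edge rule wij’ = wij + rj – ri.

```python
def is_legal(edges, r):
    """A retiming r is legal when every retimed edge weight is non-negative."""
    return all(w + r[v] - r[u] >= 0 for (u, v), w in edges.items())

# Hypothetical ring u -> v -> t -> u with a single flip-flop on edge u -> v.
ring = {("u", "v"): 1, ("v", "t"): 0, ("t", "u"): 0}

# Moving the flip-flop from the input of v to its output is legal.
legal = is_legal(ring, {"u": 0, "v": -1, "t": 0})

# Vertex t has no flip-flop at its input, so r_t = -1 would leave
# edge v -> t with weight -1.
illegal = is_legal(ring, {"u": 0, "v": 0, "t": -1})

print(legal, illegal)
```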

Example
[Figure: example circuit (labels FF, a, b, c, d, x) and its graph model (vertices host, c, x; edge weights 1 and 0).]

Example: Illegal Retiming
[Figure: graph model (vertices host, c, x) before and after retiming, with the corresponding circuit (labels FF, a, b, c, d, x).]
Before: retiming vector = {0, 0, 0}
After: retiming vector = {0, 0, –1}; edge weights change 1 → 0, 0 → 1, 0 → –1 and 0 → –1. The negative weights make this retiming illegal.

Example: Legal Retiming
[Figure: graph model (vertices host, c, x) before and after retiming, with the corresponding circuit (labels FF, FF, a, b, c, d, x).]
Before: retiming vector = {0, 0, 0}
After: retiming vector = {0, 1, 0}; edge weights change 1 → 0, 0 → 1 and 0 → 1. No weight is negative, so this retiming is legal.

Correlator Circuit
[Figure: graph model of the correlator with vertices a, b, c, d, e, f, g and host h; vertex delays 7, 7, 7 and 3, 3, 3, 3; edge weights 0 and 1.]
Initial retiming vector = {0,0,0,0,0,0,0,0}
ra = 0, rb = 0, rc = 0, rd = 0, re = 0, rf = 0, rg = 0, rh = 0
Critical path delay = 24

Retimed Correlator Circuit
[Figure: graph model of the correlator after retiming; three edge weights change 0 → 1 and two change 1 → 0.]
Retiming vector = {-1,-1,-2,-2,-2,-1,0,0}
ra = -1, rb = -1, rc = -2, rd = -2, re = -2, rf = -1, rg = 0, rh = 0
Critical path delay = 13
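Applying this retiming vector to the correlator graph reproduces the edge changes noted in the figure. The edge list below is reconstructed from the figure labels and is an assumption; `EDGES`, `R` and `changed` are illustrative names.

```python
# Assumed correlator edges (u, v) -> number of flip-flops.
EDGES = {
    ("h", "a"): 1, ("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("d", "e"): 0, ("e", "f"): 0, ("f", "g"): 0, ("g", "h"): 0,
    ("a", "g"): 0, ("b", "f"): 0, ("c", "e"): 0,
}
# Retiming vector from the slide, keyed by vertex name.
R = {"a": -1, "b": -1, "c": -2, "d": -2, "e": -2, "f": -1, "g": 0, "h": 0}

retimed = {(u, v): w + R[v] - R[u] for (u, v), w in EDGES.items()}
changed = {e: (EDGES[e], retimed[e]) for e in EDGES if EDGES[e] != retimed[e]}

for (u, v), (old, new) in changed.items():
    print(f"{u}->{v}: {old} -> {new}")
print("legal:", min(retimed.values()) >= 0)
```

Under the assumed topology, two edges lose their flip-flop and three edges gain one, and no weight becomes negative.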

Retiming Theorem
Given a network G(V, E, W) and a cycle time T, (r1, . . . ) is a feasible retiming if and only if:
1. ri – rj ≤ wij for all edges (vi,vj) ∈ E
2. ri – rj ≤ W(vi,vj) – 1 for all node-pairs vi, vj such that D(vi,vj) > T
Where,
W(vi,vj) is the minimum weight over all paths between vi and vj
D(vi,vj) is the maximum delay among all minimum-weight paths between vi and vj
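One way to evaluate the two conditions is to compute W and D with an all-pairs shortest-path pass in which each path is scored by the pair (weight, negated delay). The sketch below uses Floyd-Warshall for that; the correlator edge list is an assumption reconstructed from the figures, the trivial path from a vertex to itself is included (W = 0, D = d(v)), and the function names are illustrative.

```python
from itertools import product

DELAY = {"h": 0, "a": 3, "b": 3, "c": 3, "d": 3, "e": 7, "f": 7, "g": 7}
EDGES = {  # assumed correlator topology
    ("h", "a"): 1, ("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("d", "e"): 0, ("e", "f"): 0, ("f", "g"): 0, ("g", "h"): 0,
    ("a", "g"): 0, ("b", "f"): 0, ("c", "e"): 0,
}

def wd_matrices(delay, edges):
    """Return W (min path weight) and D (max delay among min-weight paths)."""
    inf = float("inf")
    # Score = (weight, -delay of every vertex except the last). Tuples compare
    # lexicographically, so equal-weight paths prefer the larger delay.
    best = {(u, v): (inf, 0) for u in delay for v in delay}
    for u in delay:
        best[u, u] = (0, 0)  # trivial path from u to itself
    for (u, v), w in edges.items():
        best[u, v] = min(best[u, v], (w, -delay[u]))
    for k, u, v in product(delay, repeat=3):  # k varies slowest (outer loop)
        via = (best[u, k][0] + best[k, v][0], best[u, k][1] + best[k, v][1])
        if via < best[u, v]:
            best[u, v] = via
    W = {p: x for p, (x, _) in best.items() if x != inf}
    D = {(u, v): delay[v] - best[u, v][1] for (u, v) in W}
    return W, D

def is_feasible(delay, edges, r, T):
    """Check conditions 1 and 2 of the theorem for retiming r and cycle time T."""
    W, D = wd_matrices(delay, edges)
    cond1 = all(r[u] - r[v] <= w for (u, v), w in edges.items())
    cond2 = all(r[u] - r[v] <= W[u, v] - 1 for (u, v) in W if D[u, v] > T)
    return cond1 and cond2

ZERO = {v: 0 for v in DELAY}
R_SLIDE = {"a": -1, "b": -1, "c": -2, "d": -2, "e": -2, "f": -1, "g": 0, "h": 0}
print(is_feasible(DELAY, EDGES, ZERO, 13), is_feasible(DELAY, EDGES, R_SLIDE, 13))
```

The sketch assumes every cycle carries at least one flip-flop, so no zero-weight cycle can distort the shortest-path pass.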

Proof of Condition 1
We assume that the original network is legal, i.e., all edge weights are non-negative.
For an arbitrary edge (vi,vj) ∈ E: ri – rj ≤ wij, or wij + rj – ri ≥ 0, means that after retiming the new weight wij’ = wij + rj – ri will be non-negative. Thus, condition 1 ensures the legality of retiming.
[Figure: edge (i,j) between vertices i and j, showing wij original flip-flops, ri flip-flops and rj flip-flops. Original flip-flops: wij. Retimed flip-flops: wij’ = wij + rj – ri ≥ 0.]

Proof of Condition 2
Given: d(vi) < T, for all i.
Any retimed path whose combinational delay exceeds the clock period must have at least one flip-flop.
The above is the requirement for correct operation.
[Figure: path (i,j) with D(i,j) > T between vertices i and j, showing Wij original flip-flops, ri flip-flops and rj flip-flops. Original weight: Wij. Retimed weight: Wij’ = Wij + rj – ri ≥ 1.]

Retiming Optimization Problem
Given the initial retiming graph G(V, E, d, w) of a synchronous system and a required clock period P, find a feasible retiming transformation such that for the retimed graph G’
  CP(G’) ≤ P
Solution:
  Algorithm 1 – Finds CP(G), critical path of G
  Algorithm 2 – Finds feasible retiming G → G’

Algorithm 1: Critical Path Delay
Delete all edges (vi, vj) for which wij ≥ 1.
Create a level order for vertices such that an edge (vi, vj) requires the order of vj to be higher than that of vi.
Traversing all nodes v in level order, compute ∆(v):
  ∆(v) = d(v), if v has no incoming edge
  ∆(v) = d(v) + max over i of ∆(vi), for all incoming edges (vi, v)
CP(G) = max over j of ∆(vj), for all vertices vj
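The steps above can be sketched in Python as follows. The level order is built with a standard queue-based topological sort, which is one possible choice rather than something the slides prescribe; the correlator edge list is an assumption reconstructed from the figures, and the host is treated as an ordinary vertex with delay 0.

```python
from collections import deque

DELAY = {"h": 0, "a": 3, "b": 3, "c": 3, "d": 3, "e": 7, "f": 7, "g": 7}
EDGES = {  # assumed correlator topology
    ("h", "a"): 1, ("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("d", "e"): 0, ("e", "f"): 0, ("f", "g"): 0, ("g", "h"): 0,
    ("a", "g"): 0, ("b", "f"): 0, ("c", "e"): 0,
}

def critical_path(delay, edges):
    """Return (CP(G), delta) where delta[v] is the longest delay ending at v."""
    preds = {v: [] for v in delay}
    succs = {v: [] for v in delay}
    for (u, v), w in edges.items():
        if w == 0:  # step 1: edges with w >= 1 are deleted
            preds[v].append(u)
            succs[u].append(v)
    # Step 2: level order, every vertex placed after all of its predecessors.
    pending = {v: len(preds[v]) for v in delay}
    ready = deque(v for v in delay if pending[v] == 0)
    delta = {}
    while ready:
        v = ready.popleft()
        # Step 3: delta(v) = d(v) + max over incoming edges (0 if there are none).
        delta[v] = delay[v] + max((delta[u] for u in preds[v]), default=0)
        for s in succs[v]:
            pending[s] -= 1
            if pending[s] == 0:
                ready.append(s)
    if len(delta) != len(delay):
        raise ValueError("zero-weight cycle: no level order exists")
    return max(delta.values()), delta

cp, delta = critical_path(DELAY, EDGES)
print(cp, delta)
```

The explicit cycle check matters because a level order exists only when the zero-weight subgraph is acyclic, which holds whenever every cycle carries at least one flip-flop.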

Algorithm 1 Application
[Figure: Algorithm 1 applied to the correlator graph (vertices a, b, c, d, e, f, g and host h), shown before and after deleting the edges of weight 1. Computed values: ∆ = 3, 3, 3, 3, 10, 17 and 24, giving CP(G) = ∆ = 24.]

Algorithm 2: Retiming for Period = P
Initialize retiming variable, r(v) = 0, for all v.
Repeat |V| – 1 times:
  Derive retiming graph.
  Run Algorithm 1 to determine ∆(v) for all v.
  For each v such that ∆(v) > P, set r(v) to r(v) + 1.
Derive retiming graph and run Algorithm 1:
  If CP(G) > P, then no feasible retiming exists.
  Otherwise, CP(G) ≤ P and the retimed graph is the required result.
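A compact Python sketch of this loop follows. It is illustrative only: the correlator edge list is an assumption reconstructed from the figures, the host is treated as an ordinary vertex with delay 0, and `deltas`, `retime` and `feasible_retiming` are hypothetical names. The ∆ computation uses memoized recursion instead of an explicit level order, which gives the same values when the zero-weight subgraph is acyclic.

```python
DELAY = {"h": 0, "a": 3, "b": 3, "c": 3, "d": 3, "e": 7, "f": 7, "g": 7}
EDGES = {  # assumed correlator topology
    ("h", "a"): 1, ("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("d", "e"): 0, ("e", "f"): 0, ("f", "g"): 0, ("g", "h"): 0,
    ("a", "g"): 0, ("b", "f"): 0, ("c", "e"): 0,
}

def deltas(delay, edges):
    """Algorithm 1: longest combinational delay ending at each vertex.
    Assumes every cycle has at least one flip-flop (no zero-weight cycle)."""
    preds = {v: [u for (u, x), w in edges.items() if x == v and w == 0]
             for v in delay}
    memo = {}
    def delta(v):
        if v not in memo:
            memo[v] = delay[v] + max((delta(u) for u in preds[v]), default=0)
        return memo[v]
    return {v: delta(v) for v in delay}

def retime(edges, r):
    """Retimed weights w'(u, v) = w(u, v) + r[v] - r[u]."""
    return {(u, v): w + r[v] - r[u] for (u, v), w in edges.items()}

def feasible_retiming(delay, edges, period):
    """Algorithm 2: return a retiming r with CP <= period, or None."""
    r = {v: 0 for v in delay}
    for _ in range(len(delay) - 1):
        delta = deltas(delay, retime(edges, r))
        for v in delay:
            if delta[v] > period:
                r[v] += 1
    if max(deltas(delay, retime(edges, r)).values()) > period:
        return None
    return r

r13 = feasible_retiming(DELAY, EDGES, 13)
print(r13, feasible_retiming(DELAY, EDGES, 12))
```

Feasible retimings are not unique, so the vector returned for P = 13 need not match the one shown on the slides; what matters is that the retimed weights are non-negative and the critical path does not exceed P.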

Algorithm 2 Application, P = 13
[Figure: Algorithm 2 applied to the correlator graph (vertices a, b, c, d, e, f, g and host h). Initial pass: ∆ = 3, 3, 3, 3, 10, 17 and 24, with CP(G) = ∆ = 24. A later pass, after the edge weights are updated: ∆ = 3, 3, 3, 7, 10, 14, 14 and 14.]

Retimed Circuit for P = 13
[Figure: graph model of the retimed correlator with vertices a, b, c, d, e, f, g and host h; edge weights 0 and 1.]
Retiming vector = {-1,-1,-2,-2,-2,-1,0,0}
ra = -1, rb = -1, rc = -2, rd = -2, re = -2, rf = -1, rg = 0, rh = 0
Critical path delay = 13

References
Two papers by Leiserson et al. (see slide 2).
G. De Micheli, Synthesis and Optimization of Digital Circuits, New York: McGraw-Hill, 1994.
N. Maheshwari and S. S. Sapatnekar, Timing Analysis and Optimization of Sequential Circuits, Boston: Springer, 1999.