Spring 07, Apr 10, 12    ELEC 7770: Advanced VLSI Design (Agrawal)

ELEC 7770 Advanced VLSI Design
Spring 2007
Constraint Graph and Performance Optimization

Vishwani D. Agrawal
James J. Danaher Professor
ECE Department, Auburn University
Auburn, AL 36849
[email protected]
http://www.eng.auburn.edu/~vagrawal/COURSE/E7770_Spr07

Retiming Theorem

Given a network G(V, E, W) and a cycle time T, (r1, . . . ) is a feasible retiming if and only if:
  ri – rj ≤ wij for all edges (vi,vj) ∈ E
  ri – rj ≤ W(vi,vj) – 1 for all node-pairs vi, vj such that D(vi,vj) > T
where
  W(vi,vj) is the minimum weight of any path from vi to vj
  D(vi,vj) is the maximum delay among all minimum-weight paths from vi to vj
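
The two conditions of the theorem can be checked directly for a candidate retiming. The sketch below is illustrative only: the function name and the dictionary layout are assumptions, and the data are the edge weights and W, D values of the three-gate example tabulated later in these slides.

```python
def is_feasible_retiming(r, w, W, D, T):
    """Return True if r satisfies both conditions of the retiming theorem."""
    # Condition 1: ri - rj <= wij for every edge (vi, vj).
    for (i, j), wij in w.items():
        if r[i] - r[j] > wij:
            return False
    # Condition 2: ri - rj <= W(vi, vj) - 1 for every pair with D(vi, vj) > T.
    for (i, j), Wij in W.items():
        if D[i, j] > T and r[i] - r[j] > Wij - 1:
            return False
    return True

# Edge weights (register counts) of the three-gate example.
w = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}

# (i, j): (W, D) as tabulated in the slides.
pairs = {
    ("h", "a"): (0, 10), ("h", "b"): (0, 15), ("h", "c"): (1, 20),
    ("a", "b"): (0, 15), ("a", "c"): (1, 20), ("a", "h"): (2, 20),
    ("b", "c"): (1, 10), ("b", "h"): (2, 10), ("b", "a"): (2, 20),
    ("c", "h"): (1, 5), ("c", "a"): (1, 15), ("c", "b"): (1, 20),
}
W = {p: wd[0] for p, wd in pairs.items()}
D = {p: wd[1] for p, wd in pairs.items()}

r_zero = {"h": 0, "a": 0, "b": 0, "c": 0}   # no retiming
r_good = {"h": 0, "a": 0, "b": 1, "c": 0}   # retiming found in the slides

print(is_feasible_retiming(r_zero, w, W, D, 15))     # original circuit, T = 15
print(is_feasible_retiming(r_zero, w, W, D, 11.25))  # original circuit, T = 11.25
print(is_feasible_retiming(r_good, w, W, D, 11.25))  # retimed circuit, T = 11.25
```

The unretimed circuit passes at T = 15 but fails at T = 11.25, while the retiming rb = 1 passes at T = 11.25.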

Timing Optimization

Find the clock period (T) by path analysis.
Set clock period to T/2 and find a feasible retiming.
If feasible, further reduce the clock period to half.
If not feasible, increase clock period.
Do a binary search for optimum clock period.
Retime the circuit.
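
The search loop above can be sketched as a bisection over the clock period. This is a minimal sketch under stated assumptions: `search_clock_period`, the tolerance `tol`, and the `is_feasible` callback are hypothetical names, and the callback stands in for a real retiming feasibility check.

```python
def search_clock_period(t_initial, is_feasible, tol=1e-3):
    """Bisect between 0 and t_initial for the smallest feasible clock period.

    t_initial is the period found by path analysis (always feasible);
    is_feasible(T) is assumed to report whether a retiming exists for T.
    Returns the best period found and the list of periods that were tried.
    """
    lo, hi = 0.0, float(t_initial)   # lo is infeasible, hi is feasible
    probes = []
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        probes.append(mid)
        if is_feasible(mid):
            hi = mid                 # feasible: try a shorter period
        else:
            lo = mid                 # infeasible: lengthen the period
    return hi, probes

# Stand-in feasibility test (an assumption for illustration): pretend that
# every period of at least 10 time units admits a feasible retiming.
period, probes = search_clock_period(15, lambda t: t >= 10)
print(probes[:2])   # the first two trial periods are 7.5 and 11.25
```

Starting from T = 15, the first two trial periods are 7.5 and 11.25, the same values examined in the timing optimization example later in the slides.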

Representing a Constraint

ri – rj ≤ wij or rj ≥ ri – wij

[Figure: an edge from vertex ri to vertex rj with weight –wij]

Constraint Graph

r1 ≥ r0 + 3
r1 ≥ r2 + 1
r2 ≥ r0 + 1
r2 ≥ r1 – 1
r3 ≥ r1 + 1
r3 ≥ r2 + 4
r0 ≥ r3 – 6

[Figure: constraint graph on vertices r0, r1, r2, r3 with edge weights r0→r1 = 3, r2→r1 = 1, r0→r2 = 1, r1→r2 = -1, r1→r3 = 1, r2→r3 = 4, r3→r0 = -6]

Feasibility Condition

A set of values for variables can be found if and only if the constraint graph has no positive cycles.
This is also the condition for the solvability of the longest path problem, which provides a solution to the set of constraints.

Example: Infeasible Constraints

x1 ≥ x2 + 6
x2 ≥ x1 – 3

[Figure: constraint graph with edge x2→x1 of weight 6 and edge x1→x2 of weight -3; the cycle has weight 6 – 3 = 3]

A positive cycle means no longest path can be found.
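
To see why a positive cycle rules out a solution, one can repeatedly enforce the two constraints and watch the values grow. The sketch below is only an illustration; the function name and the starting values of 0 are assumptions.

```python
def relax_rounds(rounds):
    """Repeatedly enforce x1 >= x2 + 6 and x2 >= x1 - 3, starting from 0."""
    x1, x2 = 0, 0
    history = []
    for _ in range(rounds):
        x1 = max(x1, x2 + 6)   # enforce x1 >= x2 + 6
        x2 = max(x2, x1 - 3)   # enforce x2 >= x1 - 3
        history.append((x1, x2))
    return history

print(relax_rounds(3))   # [(6, 3), (9, 6), (12, 9)]
```

Each round raises both values by 3, the weight of the cycle, so the iteration never settles.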

Solving a Constraint Set

r1 ≥ r0 + 3
r1 ≥ r2 + 1
r2 ≥ r0 + 1
r2 ≥ r1 – 1
r3 ≥ r1 + 1
r3 ≥ r2 + 4
r0 ≥ r3 – 6

[Figure: constraint graph for these constraints on vertices r0, r1, r2, r3]

Longest path from source r0: r0, r1, r2, r3
Path lengths: s0=0, s1=3, s2=2, s3=6
Solution: r0=0, r1=3, r2=2, r3=6
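
The longest-path solution can be reproduced with a few relaxation passes over the constraints. This is a minimal sketch, not the deck's own code: the function name and the `(j, i, c)` triple encoding of "rj ≥ ri + c" are assumptions made for illustration.

```python
def solve_constraints(num_vars, constraints, source=0):
    """Solve constraints r[j] >= r[i] + c by longest paths from the source.

    constraints is a list of (j, i, c) triples meaning r[j] >= r[i] + c.
    Returns the list of path lengths, or None if a positive cycle is found.
    """
    neg_inf = float("-inf")
    s = [neg_inf] * num_vars
    s[source] = 0
    for _ in range(num_vars):
        changed = False
        for j, i, c in constraints:
            if s[i] != neg_inf and s[i] + c > s[j]:
                s[j] = s[i] + c          # a longer path to vertex j was found
                changed = True
        if not changed:
            return s                      # all constraints are satisfied
    return None                           # still changing: positive cycle

slide_constraints = [
    (1, 0, 3),    # r1 >= r0 + 3
    (1, 2, 1),    # r1 >= r2 + 1
    (2, 0, 1),    # r2 >= r0 + 1
    (2, 1, -1),   # r2 >= r1 - 1
    (3, 1, 1),    # r3 >= r1 + 1
    (3, 2, 4),    # r3 >= r2 + 4
    (0, 3, -6),   # r0 >= r3 - 6
]
print(solve_constraints(4, slide_constraints))   # [0, 3, 2, 6]

# The infeasible pair x1 >= x2 + 6, x2 >= x1 - 3 (x1 is index 0, x2 is index 1).
print(solve_constraints(2, [(0, 1, 6), (1, 0, -3)]))   # None
```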

The General Path Problem

Find the shortest (or longest) path in a graph from a source vertex to any other vertex.
Graph has vertices and directed edges:
  Edge weights can be positive or negative
  Graph can be cyclic
  Single source vertex – a vertex with 0 in-degree
Inconsistent problem:
  Negative cycles for shortest path
  Positive cycles for longest path

Dijkstra’s Shortest Path Algorithm

Greedy algorithm.
Applies to directed graphs with non-negative edge weights (cycles are allowed).
Computational complexity: O(|E| + |V| log |V|) ≤ O(n^2)
References:
  A. Aho, J. Hopcroft and J. Ullman, Data Structures and Algorithms, Reading, Massachusetts: Addison-Wesley, 1983.
  T. Cormen, C. Leiserson and R. Rivest, Introduction to Algorithms, New York: McGraw-Hill, 1990.

Dijkstra’s Shortest Path Algorithm

[Figure: graph with source v0 and edges v0→v1 = 15, v0→v2 = 2, v2→v1 = 10, v2→v3 = 6, v3→v1 = 3]

si = path weight (v0, vi)

Alg. steps           s0   s1   s2   s3
Initially: mark s0    0   15    2    ∞
Step 1: mark s2       0   12    2    8
Step 2: mark s3       0   11    2    8
Step 3: mark s1       0   11    2    8

Each step marks the path with smallest weight and updates the unmarked path weights.

Dijkstra’s Algorithm, G(V, E, W)

s0(1) = 0                    initialize source
for ( i = 1 to n )           initialize path weights, n = |V| – 1
    si(1) = w0i
repeat {
    Select an unmarked vertex vq such that sq is minimal
    Mark vq
    foreach ( unmarked vertex vi )
        si = min { si, sq + wqi }
}
until (all vertices are marked)
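
A runnable version of the pseudocode might look like the sketch below. It swaps the linear "select minimal unmarked vertex" step for a binary heap from Python's `heapq` module, which gives O(|E| log |V|) rather than the Fibonacci-heap bound quoted earlier. The adjacency-dictionary layout and the names are assumptions; the graph is the four-vertex example from the preceding slide.

```python
import heapq

def dijkstra(graph, source):
    """Shortest path weights from source; graph[u] maps successor -> weight.

    Assumes all edge weights are non-negative.
    """
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    marked = set()
    heap = [(0, source)]
    while heap:
        d, q = heapq.heappop(heap)      # unmarked vertex with minimal weight
        if q in marked:
            continue                     # stale heap entry, already marked
        marked.add(q)
        for i, wqi in graph[q].items():
            if i not in marked and d + wqi < dist[i]:
                dist[i] = d + wqi        # si = min { si, sq + wqi }
                heapq.heappush(heap, (dist[i], i))
    return dist

example_graph = {
    "v0": {"v1": 15, "v2": 2},
    "v1": {},
    "v2": {"v1": 10, "v3": 6},
    "v3": {"v1": 3},
}
print(dijkstra(example_graph, "v0"))
# {'v0': 0, 'v1': 11, 'v2': 2, 'v3': 8}
```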

Dijkstra’s Longest Path Algorithm

[Figure: the example graph with source v0 and edge weights 15, 3, 10, 2, 6, and the same graph with the weights negated: -15, -3, -10, -2, -6]

si = path length (v0, vi)

Alg. steps   s0    s1    s2    s3
Initially     0   -15    -2     ∞
Step 1        0   -15    -2     ∞
Step 2        0   -15    -2    -8
Step 3        0   -15    -2    -8

Dijkstra’s Alg. for Cycles, Neg. Weights

[Figure: graph with source v0 and edges v0→v1 = 15, v0→v2 = 2, v2→v1 = 5, v2→v3 = 4, v3→v1 = 3, v1→v3 = -2]

si = path weight (v0, vi)

Alg. steps   s0   s1   s2   s3
Initially     0   15    2    ∞
Step 1        0    7    2    6
Step 2        0    7    2    6
Step 3        0    7    2    6?

There exists a v0 to v3 path of length 5 (v0→v2→v1→v3: 2 + 5 – 2 = 5).
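
The failure can be reproduced by running the marking procedure literally. In this sketch the edge dictionary keyed by `(from, to)` index pairs and the function name are assumptions; the edge directions are inferred from the table values above.

```python
INF = float("inf")

def dijkstra_marking(n_vertices, w, source=0):
    """Array-based Dijkstra: a marked vertex is never updated again."""
    s = [w.get((source, i), INF) for i in range(n_vertices)]
    s[source] = 0
    marked = {source}
    while len(marked) < n_vertices:
        # Select the unmarked vertex with minimal path weight and mark it.
        q = min((v for v in range(n_vertices) if v not in marked),
                key=lambda v: s[v])
        marked.add(q)
        for i in range(n_vertices):
            if i not in marked and (q, i) in w:
                s[i] = min(s[i], s[q] + w[q, i])
    return s

# Graph with one negative edge, v1 -> v3 = -2.
neg_edges = {(0, 1): 15, (0, 2): 2, (2, 1): 5, (2, 3): 4, (3, 1): 3, (1, 3): -2}

print(dijkstra_marking(4, neg_edges))   # [0, 7, 2, 6]
# The path v0 -> v2 -> v1 -> v3 is shorter than the reported s3 = 6:
print(neg_edges[0, 2] + neg_edges[2, 1] + neg_edges[1, 3])   # 5
```

Vertex v3 is marked with weight 6 before v1 is processed, so the later improvement through the negative edge is never applied.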

Bellman’s Equations – Shortest Path

[Figure: vertex vi with predecessor vertices vj, vk, vm, vn and incoming edge weights wji, wki, wmi, wni]

sq = minimum path weight between source and vq

For all vertices:
si = min (sq + wqi), taken over all vq ∈ pred(vi)

Bellman-Ford Algorithm, G(V, E, W)

Bellman-Ford {
    s0(1) = 0                    initialize source
    for ( i = 1 to n )           initialize path weights, n = |V| – 1
        si(1) = w0i
    for ( j = 1 to n ) {         n iterations
        for ( i = 1 to n ) {
            si(j+1) = min { si(j), sk(j) + wki }    over all vk ∈ pred(vi)
        }
        if ( si(j+1) == si(j) for all i ) return (true)
    }
    return (false)
}

Complexity = O(|V||E|) ≤ O(n^3)
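
The pseudocode translates almost line for line into Python. The sketch below is one possible rendering under stated assumptions: vertices are numbered 0 to n with vertex 0 as the source, edges are stored in a dictionary keyed by `(k, i)`, and the function returns the final path weights alongside the true/false result.

```python
INF = float("inf")

def bellman_ford(n_vertices, w, source=0):
    """Return (converged, s) after at most n = |V| - 1 iterations."""
    s = [w.get((source, i), INF) for i in range(n_vertices)]   # si(1) = w0i
    s[source] = 0
    for _ in range(n_vertices - 1):
        new_s = list(s)
        for (k, i), wki in w.items():
            if i != source:              # the source weight stays at 0
                # si(j+1) = min { si(j), sk(j) + wki } for vk in pred(vi)
                new_s[i] = min(new_s[i], s[k] + wki)
        if new_s == s:
            return True, s               # values have stabilized
        s = new_s
    return False, s                      # not stabilized after n iterations

example_edges = {(0, 1): 15, (0, 2): 2, (2, 1): 10, (2, 3): 6, (3, 1): 3}
print(bellman_ford(4, example_edges))    # (True, [0, 11, 2, 8])
```

Each iteration reads only the values of the previous iteration, matching the si(j) and si(j+1) indexing of the pseudocode.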

Bellman-Ford Shortest Path

[Figure: graph with source v0 and edges v0→v1 = 15, v0→v2 = 2, v2→v1 = 10, v2→v3 = 6, v3→v1 = 3]

si = path weight (v0, vi)

Alg. steps    s0   s1   s2   s3
Initially      0   15    2    ∞
Iteration 1    0   12    2    8
Iteration 2    0   11    2    8
Iteration 3    0   11    2    8

n = 3

Bellman-Ford Longest Path

[Figure: the example graph with weights reversed: v0→v1 = -15, v0→v2 = -2, v2→v1 = -10, v2→v3 = -6, v3→v1 = -3]

si = path weight (v0, vi)

Alg. steps    s0    s1    s2    s3
Initially      0   -15    -2     ∞
Iteration 1    0   -15    -2    -8
Iteration 2    0   -15    -2    -8

n = 3 (shortest path)

Reverse the sign of weights and solve shortest path problem.
(Alternative: keep original weights and change min operator in algorithm to max.)
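
The sign-reversal trick takes only a few lines. The following sketch assumes the graph has no positive cycle, so no cycle check is included; the function names and the edge dictionary are illustrative.

```python
INF = float("inf")

def shortest_paths(n_vertices, w, source=0):
    """Plain Bellman-Ford relaxation, |V| - 1 passes, no cycle check."""
    s = [INF] * n_vertices
    s[source] = 0
    for _ in range(n_vertices - 1):
        for (k, i), wki in w.items():
            if s[k] + wki < s[i]:
                s[i] = s[k] + wki
    return s

def longest_paths(n_vertices, w, source=0):
    """Longest paths: negate the weights, solve shortest path, negate back."""
    negated = {edge: -weight for edge, weight in w.items()}
    return [-d for d in shortest_paths(n_vertices, negated, source)]

example_edges = {(0, 1): 15, (0, 2): 2, (2, 1): 10, (2, 3): 6, (3, 1): 3}
print(longest_paths(4, example_edges))   # [0, 15, 2, 8]
```

The result is the table's final row with signs restored: s1 = 15, s2 = 2, s3 = 8.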

Bellman’s Equations – Longest Path

[Figure: vertex vi with predecessor vertices vj, vk, vm, vn and incoming edge weights wji, wki, wmi, wni]

sq = maximum path weight between source and vq

For all vertices:
si = max (sq + wqi), taken over all vq ∈ pred(vi)

Bellman-Ford for Cycles, Neg. Weights

[Figure: graph with source v0 and edges v0→v1 = 15, v0→v2 = 2, v2→v1 = 5, v2→v3 = 4, v3→v1 = 3, v1→v3 = -2]

si = path weight (v0, vi)

Alg. steps    s0   s1   s2   s3
Initially      0   15    2    ∞
Iteration 1    0    7    2    6
Iteration 2    0    7    2    5
Iteration 3    0    7    2    5

n = 3 (shortest path)

Dijkstra’s shortest path algorithm gave an incorrect value (s3 = 6) for this graph.

Bellman-Ford for Negative Cycle

[Figure: graph with source v0 and edges v0→v1 = 15, v0→v2 = 2, v2→v1 = 5, v2→v3 = 4, v3→v1 = -3, v1→v3 = 2]

si = path weight (v0, vi)

Alg. steps    s0   s1   s2   s3
Initially      0   15    2    ∞
Iteration 1    0    7    2    6
Iteration 2    0    3    2    6
Iteration 3    0    3    2    5

n = 3 (shortest path)

Values not stabilized after n iterations.
Inconsistent problem: negative cycle.
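
A common way to code this test is to run |V| – 1 relaxation passes and then check whether any edge can still be relaxed. Note that this is the textbook extra-pass variant, not the exact stabilization test used in the table above; the names and the edge dictionaries are assumptions.

```python
INF = float("inf")

def has_negative_cycle(n_vertices, w, source=0):
    """True if a negative cycle is reachable from the source."""
    s = [INF] * n_vertices
    s[source] = 0
    for _ in range(n_vertices - 1):
        for (k, i), wki in w.items():
            if s[k] + wki < s[i]:
                s[i] = s[k] + wki
    # With no negative cycle, every edge now satisfies s[i] <= s[k] + wki.
    return any(s[k] + wki < s[i] for (k, i), wki in w.items())

# Cycle v1 -> v3 -> v1 has weight 2 + (-3) = -1.
cyclic_edges = {(0, 1): 15, (0, 2): 2, (2, 1): 5, (2, 3): 4, (3, 1): -3, (1, 3): 2}
# Same structure, but the cycle weight is -2 + 3 = +1.
consistent_edges = {(0, 1): 15, (0, 2): 2, (2, 1): 5, (2, 3): 4, (3, 1): 3, (1, 3): -2}

print(has_negative_cycle(4, cyclic_edges))       # True
print(has_negative_cycle(4, consistent_edges))   # False
```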

Retiming Example

[Figure: circuit with a flip-flop (FF) and three gates a, b, c with delays 10, 5, 5]

Retiming Graph

[Figure: the circuit and its retiming graph. Vertices (delay): h (0), a (10), b (5), c (5). Edge weights: h→a = 0, a→b = 0, b→c = 1, c→h = 1]

Critical path = 15
It is the longest path consisting only of zero weight edges.

Feasibility Constraints

[Figure: the circuit and its retiming graph. Vertices (delay): h (0), a (10), b (5), c (5). Edge weights: h→a = 0, a→b = 0, b→c = 1, c→h = 1]

ri – rj ≤ wij for all edges i → j
Retiming should not cause negative edge weights.

rh – ra ≤ 0
ra – rb ≤ 0
rb – rc ≤ 1
rc – rh ≤ 1

Constraint Graph

[Figure: the circuit and its constraint graph. Vertices: rh, ra, rb, rc. Edge weights: rh→ra = 0, ra→rb = 0, rb→rc = -1, rc→rh = -1]

ri – rj ≤ wij for all edges i → j
Retiming should not cause negative edge weights.

rh – ra ≤ 0
ra – rb ≤ 0
rb – rc ≤ 1
rc – rh ≤ 1

Observation: Constraint graph has the same structure as the original retiming graph, with signs of weights reversed. Vertex labels are the retiming integer variables.

Max Delay for Min Weight Paths

[Figure: retiming graph. Vertices (delay): h (0), a (10), b (5), c (5). Edge weights: h→a = 0, a→b = 0, b→c = 1, c→h = 1]

Pair (i,j)   W(i,j)   D(i,j)
(h,a)           0       10
(h,b)           0       15
(h,c)           1       20
(a,b)           0       15
(a,c)           1       20
(a,h)           2       20
(b,c)           1       10
(b,h)           2       10
(b,a)           2       20
(c,h)           1        5
(c,a)           1       15
(c,b)           1       20

T = 15
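
The W and D values can be computed with an all-pairs shortest path pass in which each edge u→v carries the pair (w(u,v), –d(u)) and pairs are compared lexicographically. The sketch below uses a Floyd-Warshall style loop; this choice of algorithm and all names are assumptions for illustration, not necessarily how the table was produced.

```python
from itertools import product

def compute_W_D(delay, w):
    """Return (W, D) for all ordered pairs of distinct vertices.

    best[u, v] holds (minimum register count, -(delay of the path from u
    up to but excluding v)); tuples compare lexicographically, so among
    minimum-weight paths the one with maximum delay wins.
    """
    inf = (float("inf"), float("inf"))
    verts = list(delay)
    best = {(u, v): inf for u in verts for v in verts}
    for (u, v), wuv in w.items():
        best[u, v] = min(best[u, v], (wuv, -delay[u]))
    for k, u, v in product(verts, repeat=3):    # k is the outermost loop
        via = (best[u, k][0] + best[k, v][0], best[u, k][1] + best[k, v][1])
        if via < best[u, v]:
            best[u, v] = via
    W = {(u, v): best[u, v][0] for u in verts for v in verts if u != v}
    D = {(u, v): delay[v] - best[u, v][1] for u in verts for v in verts if u != v}
    return W, D

delay = {"h": 0, "a": 10, "b": 5, "c": 5}
w = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
W, D = compute_W_D(delay, w)
print(W["b", "a"], D["b", "a"])   # 2 20
```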

Timing Optimization, T = 7.5?

W(i,j) and D(i,j) are as tabulated under "Max Delay for Min Weight Paths".

ri – rj ≤ W(i,j) – 1 for all paths (i,j) such that D(i,j) > 7.5

[Figure: constraint graph (feasibility) on vertices rh, ra, rb, rc with edge weights rh→ra = 0, ra→rb = 0, rb→rc = -1, rc→rh = -1, extended with one edge of weight 1 – W(i,j) for each pair with D(i,j) > 7.5, that is, every pair except (c,h)]

Positive cycle (for example ra→rb→rc→ra, with weight 1 + 0 + 0 = 1): no solution.

Timing Optimization, T = 11.25?

W(i,j) and D(i,j) are as tabulated under "Max Delay for Min Weight Paths".

[Figure: constraint graph on vertices rh, ra, rb, rc with feasibility edges rh→ra = 0, ra→rb = 0, rb→rc = -1, rc→rh = -1, plus edges of weight 1 – W(i,j) for the pairs with D(i,j) > 11.25: rh→rb = 1, rh→rc = 0, ra→rb = 1, ra→rc = 0, ra→rh = -1, rb→ra = -1, rc→ra = 0, rc→rb = 0]

rh = 0  ra = 0  rb = 1  rc = 0
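
Both trial periods can be checked mechanically by building the constraint graph for a given T and solving the longest path problem from rh. This is a sketch under stated assumptions: the helper names are hypothetical, and the W, D values are copied from the table rather than computed.

```python
w = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
pairs = {  # (i, j): (W, D) as tabulated in the slides
    ("h", "a"): (0, 10), ("h", "b"): (0, 15), ("h", "c"): (1, 20),
    ("a", "b"): (0, 15), ("a", "c"): (1, 20), ("a", "h"): (2, 20),
    ("b", "c"): (1, 10), ("b", "h"): (2, 10), ("b", "a"): (2, 20),
    ("c", "h"): (1, 5), ("c", "a"): (1, 15), ("c", "b"): (1, 20),
}

def constraint_edges(T):
    """Edges (i, j, c) of the constraint graph, each meaning rj >= ri + c."""
    edges = [(i, j, -wij) for (i, j), wij in w.items()]   # ri - rj <= wij
    edges += [(i, j, 1 - Wij)                              # ri - rj <= W - 1
              for (i, j), (Wij, Dij) in pairs.items() if Dij > T]
    return edges

def retime_for_period(T, vertices=("h", "a", "b", "c"), source="h"):
    """Longest paths from the source; None if a positive cycle exists."""
    s = {v: float("-inf") for v in vertices}
    s[source] = 0
    edges = constraint_edges(T)
    for _ in range(len(vertices)):
        changed = False
        for i, j, c in edges:
            if s[i] + c > s[j]:
                s[j] = s[i] + c
                changed = True
        if not changed:
            return s
    return None

print(retime_for_period(7.5))     # None
print(retime_for_period(11.25))   # {'h': 0, 'a': 0, 'b': 1, 'c': 0}
```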

Retiming Graph

[Figure: the circuit and its retiming graph with original edge weights h→a = 0, a→b = 0, b→c = 1, c→h = 1; the retimed weights a→b = 1 and b→c = 0 are marked]

rh = 0  ra = 0  rb = 1  rc = 0

wij_retimed = wij + rj – ri

Retimed Circuit

[Figure: the retimed circuit and its retiming graph with edge weights h→a = 0, a→b = 1, b→c = 0, c→h = 1]

rh = 0  ra = 0  rb = 1  rc = 0

Critical Path = 10

Figure annotation: "Logic optimization will remove these."
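
The retiming step and the resulting critical path can be verified with a short script. In this sketch the function names are assumptions, and the critical path routine assumes the zero-weight edges form no cycle.

```python
def retime(w, r):
    """Apply wij_retimed = wij + rj - ri to every edge."""
    return {(i, j): wij + r[j] - r[i] for (i, j), wij in w.items()}

def critical_path(delay, w):
    """Largest total delay along a path made only of zero-weight edges."""
    zero_succ = {v: [j for (i, j), wij in w.items() if i == v and wij == 0]
                 for v in delay}
    memo = {}

    def longest_from(v):
        if v not in memo:
            memo[v] = delay[v] + max(
                (longest_from(j) for j in zero_succ[v]), default=0)
        return memo[v]

    return max(longest_from(v) for v in delay)

delay = {"h": 0, "a": 10, "b": 5, "c": 5}
w = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
r = {"h": 0, "a": 0, "b": 1, "c": 0}

w_retimed = retime(w, r)
print(w_retimed)                         # a->b becomes 1, b->c becomes 0
print(critical_path(delay, w))           # 15
print(critical_path(delay, w_retimed))   # 10
```

Moving the register from edge b→c to edge a→b splits the 15-unit path through a and b, leaving two 10-unit paths.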