
Algorithmic Techniques in VLSI CAD
Shantanu Dutt, University of Illinois at Chicago

Algorithms in VLSI CAD
- Divide & Conquer (D&C) [e.g., merge-sort, partition-driven placement]
- Reduce & Conquer (R&C) [e.g., multilevel techniques such as the hMetis partitioner]
- Dynamic programming [e.g., matrix multiplication, optimal buffer insertion]
- Mathematical programming: linear, quadratic, 0/1 integer programming [e.g., floorplanning, global placement]

Algorithms in VLSI CAD (contd)
Search methods:
- Depth-first search (DFS): mainly used to find any solution when cost is not an issue [e.g., FPGA detailed routing---cost is generally determined at the global-routing phase]
- Breadth-first search (BFS): mainly used to find a soln at minimum distance from the root of the search tree [e.g., maze routing, where cost = distance from root]
- Best-first search (BeFS): used to find optimal solutions with any cost function; applicable when a provable lower bound of the cost can be determined for each branching choice from the current partial-soln node [e.g., TSP, global routing]
- Iterative improvement: deterministic, stochastic

Divide & Conquer
Determine whether the problem can be solved in a hierarchical or divide-&-conquer (D&C) manner:
- D&C approach: see if the problem can be broken up into 2 or more smaller subproblems whose solns can be stitched up to give a soln to the parent problem.
- Do this recursively for each large subproblem until the subproblems are small enough for an easy solution technique (which could even be exhaustive!).
- If the subproblems are of a similar kind to the root problem, then the breakup and stitching will also be similar.
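The D&C recipe above can be illustrated with merge-sort, the first example on the algorithms list; a minimal Python sketch:

```python
# Merge-sort as a D&C instance: split into two subproblems, solve each
# recursively, and "stitch up" by merging the two sorted halves.

def merge_sort(a):
    if len(a) <= 1:                      # small enough for a direct solution
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # stitch-up step: merge the two sorted halves
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]
```

Partition-driven placement follows the same pattern: cut the netlist in two, place each half in its region, and the "stitch-up" is handled by how the cut minimizes inter-region wiring.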

Reduce-&-Conquer Examples: Multilevel graph/hypergraph partitioning (e.g., hMetis), multilevel routing

Dynamic Programming (DP)
DP applies when the optimal soln to a problem is composed of optimal solns to its subproblems (optimal substructure), and the same subproblems recur. This property means that every time we optimally solve a subproblem, we can store/record the soln and reuse it every time it is part of the formulation of a higher-level problem.

Dynamic Programming (contd)
Matrix-multiplication example: find the most computationally efficient way to perform the series of matrix mults M = M1 x M2 x ... x Mn, where Mi is of size ri x ci with ri = c(i-1) for i > 1.
DP formulation:
  opt_seq(M) = (by defn) opt_seq(M(1,n)) = min over i = 1 to n-1 of { opt_seq(M(1,i)) + opt_seq(M(i+1,n)) + r1 * ci * cn }
Correctness rests on the property that the optimal ways of multiplying M1 to Mi and M(i+1) to Mn will be used in the min stitch-up function to determine the optimal soln for M. Thus if the optimal soln involves a cut at Mr, then opt_seq(M(1,r)) & opt_seq(M(r+1,n)) will be part of opt_seq(M).
Perform the computation bottom-up (smallest sequences first).
Complexity: each subsequence M(j,k) appears in the above computation but is solved exactly once (irrespective of how many times it appears). The time to solve M(j,k), not counting the time to solve its subproblems (which is accounted for in their own complexities), is l - 1 for a sequence of length l = k - j + 1 (a min over l - 1 cut options is computed). There are n - l + 1 different M(j,k)s of length l, for l = 2 to n, giving O(n^3) total time.
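The opt_seq recurrence can be sketched in a few lines of Python (memoized top-down rather than the bottom-up table the slide describes, but each M(j,k) is still solved exactly once):

```python
# Matrix-chain ordering via DP. dims encodes the sizes: matrix M_i is
# dims[i-1] x dims[i], so r_i = c_(i-1) holds by construction.
from functools import lru_cache

def matrix_chain_cost(dims):
    n = len(dims) - 1                      # number of matrices

    @lru_cache(maxsize=None)
    def opt(j, k):                         # min scalar mults for M_j x ... x M_k
        if j == k:
            return 0                       # a single matrix needs no mults
        # stitch-up: try every cut point i; the cut itself costs r_j * c_i * c_k
        return min(opt(j, i) + opt(i + 1, k) + dims[j - 1] * dims[i] * dims[k]
                   for i in range(j, k))

    return opt(1, n)
```

For dims = [10, 30, 5, 60] (M1: 10x30, M2: 30x5, M3: 5x60), cutting after M2 gives 10*30*5 + 10*5*60 = 4500 scalar multiplications, versus 27000 for the other order.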
A DP Example: Simple Buffer Insertion Problem
Given: source and sink locations, sink capacitances and RATs (required arrival times), a buffer type, source delay rules, and unit wire resistance and capacitance.
[Figure: a source s0 driving four sinks with RAT1-RAT4, plus the buffer type.]
Courtesy: Chuck Alpert, IBM

Simple Buffer Insertion Problem (contd)
Find: buffer locations and a routing tree such that the slack/RAT at the source is maximized.

Slack/RAT Example
[Figure: two routing scenarios with sink (RAT, delay) pairs (400, 600), (500, 350), (400, 300), (500, 400); one scenario yields slack/RAT = -200 at the source, the other +100.]
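The slack numbers in the example can be checked in a couple of lines; the grouping of the (RAT, delay) pairs into the two scenarios is inferred from the figure residue, not stated explicitly on the slide:

```python
# Slack at the source = min over sinks of (RAT_sink - delay_to_sink).
def source_slack(sinks):
    # sinks: list of (RAT, delay) pairs
    return min(rat - delay for rat, delay in sinks)

# Inferred scenario grouping from the slide's figure:
print(source_slack([(400, 600), (500, 350)]))  # -200
print(source_slack([(400, 300), (500, 400)]))  # 100
```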

Elmore Delay
[Figure: the Elmore delay model for an RC tree.]
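For reference, the Elmore delay to a sink is the sum, over the wire segments on the source-to-sink path, of each segment's resistance times the total capacitance downstream of it. A small sketch (the tree encoding and node names are illustrative, not from the slide):

```python
# Elmore delay on a small RC tree.
# tree: node -> list of (child, segment resistance R); cap: node -> node capacitance.
# Delay to a sink = sum over path segments of R_seg * C_downstream(seg).

def elmore_delay(tree, cap, root, sink):
    def subtree_cap(v):
        # total capacitance in the subtree rooted at v
        return cap[v] + sum(subtree_cap(c) for c, _ in tree.get(v, []))

    def walk(v):
        # returns (delay from v to sink, sink-found flag)
        if v == sink:
            return 0.0, True
        for c, r in tree.get(v, []):
            d, found = walk(c)
            if found:
                return d + r * subtree_cap(c), True
        return 0.0, False

    return walk(root)[0]
```

Example: a chain a -(R=2)-> b -(R=3)-> c with caps 0, 1, 2 gives delay 2*(1+2) + 3*2 = 12.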

DP Example: Van Ginneken Buffer Insertion Algorithm [ISCAS90]
- Associate each leaf node/sink with two metrics (Ct, Tt): downstream loading capacitance (Ct) and RAT (Tt).
- The DP-based alg propagates potential solutions bottom-up [Van Ginneken, 90]:
- Add a wire: a candidate (Ct, Tt) becomes (Ct + Cw, Tt - Rw(Cw/2 + Ct)) for a wire with resistance Rw and capacitance Cw (Elmore delay of the segment).

Add a buffer: a buffered candidate (Cb, Tt - tb - Rb*Ct) is created, where Cb, Rb, tb are the buffer's input capacitance, driving resistance, and intrinsic delay.

Merge two solutions: for each pair of soln vectors Zn = (Cn, Tn), Zm = (Cm, Tm) in the 2 subtrees, create a soln vector Zt = (Ct, Tt), where Ct = Cn + Cm and Tt = min(Tn, Tm).

[Figure: the wire, buffer, and merge operations acting on (C, T) pairs, with wire parameters Cw, Rw. Note: take Ln = Cn. Courtesy: UCLA]

DP Example (contd)
- Add a wire to each merged solution Zt (same cap. & delay change formulation as before).
- Add a buffer to each Zt.
- Delete all dominated solutions Zd: Zd = (Cd, Td) is dominated if there exists a Zr = (Cr, Tr) s.t. Cd >= Cr and Td <= Tr.
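The four operations can be sketched on (C, T) pairs as follows. The delay expressions follow the standard Elmore-based van Ginneken formulation; the parameter names (Rw, Cw, Rb, Cb, tb) are the conventional ones, not symbols taken from these slides:

```python
# Candidate-solution operations for van Ginneken-style buffer insertion.

def add_wire(sols, Rw, Cw):
    # wire adds load Cw; its Elmore delay seen by each candidate is Rw*(Cw/2 + C)
    return [(C + Cw, T - Rw * (Cw / 2 + C)) for C, T in sols]

def add_buffer(sols, Rb, Cb, tb):
    # a buffer decouples the downstream load; keep the best buffered option
    best = max(T - tb - Rb * C for C, T in sols)
    return sols + [(Cb, best)]

def merge(sols_left, sols_right):
    # merged load is the sum; merged RAT is the more critical (smaller) one
    return [(Cl + Cr, min(Tl, Tr)) for Cl, Tl in sols_left for Cr, Tr in sols_right]

def prune(sols):
    # delete dominated (Cd, Td): some (Cr, Tr) has Cr <= Cd and Tr >= Td.
    # Sort by increasing C (best T first on ties); keep only strict T improvements.
    kept, bestT = [], float("-inf")
    for C, T in sorted(sols, key=lambda s: (s[0], -s[1])):
        if T > bestT:
            kept.append((C, T))
            bestT = T
    return kept
```

On the candidates from the worked example, prune([(5, 0), (5, 70), (20, 100), (45, 50)]) keeps exactly (5, 70) and (20, 100), matching the inferiority relations stated there.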
Van Ginneken Example
[Figure: candidate solutions propagated up a routing tree. Sink candidates: (20, 400), (30, 250), (5, 220). Wire segments: C=10, d=150; C=15, d=200; C=15, d=120. Buffers: C=5, d=30; C=5, d=50. The resulting candidate set at the merge point is (45, 50), (5, 0), (20, 100), (5, 70).]

Van Ginneken Example (contd)
[Figure: among the candidates (45, 50), (5, 0), (20, 100), (5, 70), (5, 0) is inferior to (5, 70) and (45, 50) is inferior to (20, 100); after adding a wire (C=10), the surviving solutions become (30, 10) and (15, -10).]
Pick the solution with the largest slack, then follow the arrows (stored choices) to recover the buffer placement.

Mathematical Programming
Linear programming (LP). E.g., objective: Min 2x1 - x2 + x3, with constraints x1 + x2
0/1 ILP/IQP Examples
- Generally useful for assignment problems, where objects {O1, ..., On} are assigned to bins {B1, ..., Bm}.
- 0/1 variable xi,1 = 1 if object Oi is assigned to bin Bj.
- Min-cut bi-partitioning for graphs G(V, E) can be modeled as a 0/1 IQP:
  - xi,1 = 1 => ui in V1, else ui in V2.
  - Edge (ui, uj) is in the cutset iff xi,1(1 - xj,1) + (1 - xi,1)xj,1 = 1.
  - Objective function: Min Sum over (ui, uj) in E of c(i,j) [xi,1(1 - xj,1) + (1 - xi,1)xj,1].
  - Constraint: Sum over i of w(ui) xi,1
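The quadratic cut objective can be checked by brute force on a tiny graph (exponential in |V|, purely illustrative). The unit-weight balance constraint below stands in for the slide's weight constraint, which is truncated in this copy:

```python
# Brute-force evaluation of the 0/1 quadratic min-cut objective.
from itertools import product

def min_cut(n, edges):
    # edges: list of (i, j, c) with edge cost c. x[i] = 1 puts vertex i in V1.
    best = None
    for x in product((0, 1), repeat=n):
        if 2 * sum(x) != n:          # assumed unit-weight balance constraint
            continue
        # edge (i, j) is cut iff x[i](1 - x[j]) + (1 - x[i])x[j] == 1
        cut = sum(c for i, j, c in edges
                  if x[i] * (1 - x[j]) + (1 - x[i]) * x[j] == 1)
        if best is None or cut < best:
            best = cut
    return best
```

For a 4-cycle with unit edge costs, the best balanced cut severs 2 edges (split the cycle into two adjacent pairs).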
Search Techniques

dfs(v) /* for basic graph visit, or for soln finding when nodes are partial solns */
  v.mark = 1;
  for each (v,u) in E
    if (u.mark != 1) then dfs(u)

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  for each v in V
    if v.mark = 0 then
      if G has partial-soln nodes then dfs(v);
      else soln_dfs(v);

soln_dfs(v) /* used when nodes are basic elts of the problem and not partial-soln nodes */
  v.mark = 1;
  if path to v is a soln, then return(1);
  for each (v,u) in E
    if (u.mark != 1) then
      soln_found = soln_dfs(u)
      if (soln_found = 1) then return(soln_found)
  end for;
  v.mark = 0; /* can visit v again to form another soln on a different path */
  return(0)

Search Techniques: Exhaustive DFS

optimal_soln_dfs(v) /* used when nodes are basic elts of the problem and not partial-soln nodes */
begin
  v.mark = 1;
  if path to v is a soln then begin
    if cost < best_cost then begin
      best_soln = soln; best_cost = cost;
    endif
    v.mark = 0; return;
  endif
  for each (v,u) in E
    if (u.mark != 1) then optimal_soln_dfs(u)
  end for;
  v.mark = 0; /* can visit v again to form another soln on a different path */
end

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  best_cost = infinity;
  optimal_soln_dfs(root);
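A runnable Python rendering of optimal_soln_dfs, using a visited set instead of mark bits and treating "the path visits every vertex" as a toy solution test (the slide leaves the soln test abstract):

```python
# Exhaustive DFS with backtracking: enumerate all simple paths from the root
# that visit every vertex, keeping the cheapest one found.

def optimal_dfs(graph, root):
    # graph: node -> list of (neighbor, edge weight)
    n = len(graph)
    best = {"cost": float("inf"), "path": None}

    def dfs(v, path, cost, visited):
        if len(path) == n:                    # path to v is a soln
            if cost < best["cost"]:
                best["cost"], best["path"] = cost, path[:]
            return
        for u, w in graph[v]:
            if u not in visited:
                visited.add(u)
                path.append(u)
                dfs(u, path, cost + w, visited)
                path.pop()
                visited.discard(u)            # unmark: u may lie on another soln path

    dfs(root, [root], 0, {root})
    return best["cost"], best["path"]
```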

Best-First Search

BeFS(root)
begin
  open = {root} /* open is the list of generated but not expanded nodes---partial solns */
  best_soln_cost = infinity;
  while open != nullset do begin
    curr = first(open);
    if curr is a soln then return(curr) /* curr is an optimal soln */
    else children = Expand_&_est_cost(curr);
      /* generate all children of curr & estimate their costs---cost(u) should be
         a lower bound of the cost of the best soln reachable from u */
    for each child in children do begin
      if child is a soln then
        delete all nodes w in open s.t. cost(w) >= cost(child);
      endif
      store child in open in increasing order of cost;
    endfor
  endwhile
end /* BeFS */

Expand_&_est_cost(Y)
begin
  children = nullset;
  for each basic elt x of the problem reachable from Y & able to be part of the current partial soln Y do begin
    if x not in Y and if feasible
      child = Y U {x};
      path_cost(child) = path_cost(Y) + cost(Y, x) /* cost(Y,x) is cost of reaching x from Y */
      est(child) = lower-bound cost of best soln reachable from child;
      cost(child) = path_cost(child) + est(child);
      children = children U {child};
  endfor
end /* Expand_&_est_cost */
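A compact executable version of the pseudocode above, with the open list kept as a heap rather than a sorted list; expand, est, and is_soln are caller-supplied, and est must be a lower bound for the optimality argument to hold:

```python
# Minimal best-first search: pop the cheapest open node; the first soln popped
# is optimal when est() lower-bounds the best completion cost.
import heapq
import itertools

def befs(root, expand, est, is_soln):
    counter = itertools.count()               # tie-breaker for equal costs
    open_list = [(est(root), next(counter), root, 0)]
    while open_list:
        cost, _, node, path_cost = heapq.heappop(open_list)
        if is_soln(node):
            return cost, node                 # first popped soln is optimal
        for child, step in expand(node):
            g = path_cost + step
            heapq.heappush(open_list, (g + est(child), next(counter), child, g))
    return None
```

Usage sketch on a small DAG with est = 0 (a trivially valid lower bound, reducing BeFS to Dijkstra-like search): with edges A-B (1), A-C (4), B-C (1), B-D (5), C-D (1), befs finds the A-to-D cost 3 via A-B-C-D.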

Best-First Search: Proof of optimality when the cost estimate is a lower bound
- The current set of nodes in open represents a complete front of generated nodes, i.e., the rest of the nodes in the search space are descendants of nodes in open.
- Assuming the basic cost (the cost of adding an elt to a partial soln to construct another partial soln that is closer to a full soln) is non-negative, the cost is monotonic, i.e., cost(child) >= cost(parent).
- If the first node curr in open is a soln, then every other soln is a descendant of some node w in open with cost(w) >= cost(curr), and by monotonicity its cost is >= cost(w) >= cost(curr). Thus curr is an optimal (min-cost) soln.

Search techs for a TSP example
[Figure: a TSP graph on cities A-F, and the search tree of an exhaustive DFS (with backtracking) for finding an optimal solution; the leaves are solution nodes with tour costs 27, 31, and 33.]

Search techs for a TSP example (contd)
[Figure: BeFS on the same TSP graph. Partial-path nodes carry costs of the form path_cost + lower-bound estimate (e.g., 5+15, 8+16, 11+14, 14+9, 21+6, 22+9, 23+8); pruned branches are marked X, and BeFS finds the optimal tour of cost 27 without exhausting the tree.]
Lower-bound cost estimate: min_cost(MST({unvisited cities} U {current city} U {start city})). This is a valid LB because the required soln structure (a tour through these cities) contains a spanning tree over them, so the min-cost spanning tree cannot cost more than the best completion.
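The MST lower bound is cheap to compute; a small Prim's-algorithm sketch (the distance table and city names below are illustrative, not the slide's graph):

```python
# MST cost via Prim's algorithm, usable as the BeFS lower bound:
# mst_cost({unvisited} | {current city} | {start city}, dist).

def mst_cost(nodes, dist):
    # dist: symmetric dict-of-dicts of pairwise distances
    nodes = list(nodes)
    if len(nodes) <= 1:
        return 0
    in_tree, total = {nodes[0]}, 0
    while len(in_tree) < len(nodes):
        # cheapest edge leaving the tree so far
        w, v = min((dist[u][x], x) for u in in_tree
                   for x in nodes if x not in in_tree)
        total += w
        in_tree.add(v)
    return total
```

E.g., for cities A, B, C with distances AB=1, AC=2, BC=4, the MST uses edges AB and AC for a bound of 3, while any tour costs 1+2+4 = 7.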
BeFS (Branch & Bound) for 0/1 ILP Solution
X = {x1, ..., xm} are 0/1 vars.
[Figure: branch-and-bound tree. The root branches on x2: solve the LP relaxation with x2 = 0 (cost C1) and with x2 = 1 (cost C2). The x2 = 1 node branches on x4: LP with x2 = 1, x4 = 0 (cost C3) and with x2 = 1, x4 = 1 (cost C4). The x2 = 1, x4 = 1 node branches on x5: LP with x5 = 0 (cost C5) and with x5 = 1 (cost C6); the x5 = 0 leaf is the optimal soln.]
Cost relations: C5 < C3 < C1 < C6; C2 < C1; C4 < C3.
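The LP relaxations in the tree need an LP solver; as a self-contained stand-in, here is the same branch-and-bound pattern on a 0/1 knapsack (a maximization, unlike the slide's minimization), where the relaxation at each node is the classic fractional-knapsack bound:

```python
# Branch and bound for 0/1 knapsack: branch on x_i in {0, 1}, prune a node
# when its relaxation bound cannot beat the best complete solution so far.

def knapsack_bb(items, cap):
    # items: list of (value, weight), pre-sorted by value/weight descending
    best = [0]

    def lp_bound(i, val, room):
        # fractional relaxation: fill greedily, last item taken fractionally
        for v, w in items[i:]:
            if w <= room:
                val, room = val + v, room - w
            else:
                return val + v * room / w
        return val

    def branch(i, val, room):
        if i == len(items) or room == 0:      # leaf: a complete 0/1 assignment
            best[0] = max(best[0], val)
            return
        if lp_bound(i, val, room) <= best[0]:
            return                            # prune: relaxation can't beat best
        if items[i][1] <= room:               # branch x_i = 1
            branch(i + 1, val + items[i][0], room - items[i][1])
        branch(i + 1, val, room)              # branch x_i = 0

    branch(0, 0, cap)
    return best[0]
```

On the classic instance with items (60, 10), (100, 20), (120, 30) and capacity 50, the optimum is 220 (take the last two items).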

Iterative Improvement Techniques
[Taxonomy: iterative improvement splits into deterministic (greedy) and stochastic (non-greedy) methods; greedy methods split into locally/immediately greedy and non-locally greedy.]
- Locally/immediately greedy: make the move that is immediately (locally) best, until no further improvement (e.g., FM).
- Non-locally greedy: make move