
Layout Decomposition for Triple Patterning Lithography

Bei Yu, Kun Yuan†, Boyang Zhang, Duo Ding, David Z. Pan
ECE Dept., University of Texas at Austin, Austin, TX USA 78712
† Cadence Design Systems, Inc., San Jose, CA USA 95134

Email: {bei, dpan}@cerc.utexas.edu

Abstract—As minimum feature size and pitch spacing further decrease, triple patterning lithography (TPL) is a possible 193nm extension along the paradigm of double patterning lithography (DPL). However, there has been very little study of TPL layout decomposition. In this paper, we show that TPL layout decomposition is a more difficult problem than that for DPL. We then propose a general integer linear programming (ILP) formulation for TPL layout decomposition which can simultaneously minimize conflict and stitch numbers. Since ILP has very poor scalability, we propose three acceleration techniques that do not sacrifice solution quality: independent component computation, layout graph simplification, and bridge computation. For very dense layouts, even with these speedup techniques, the ILP formulation may still be too slow. Therefore, we propose a novel vector programming formulation for TPL decomposition, and solve it through an effective semidefinite programming (SDP) approximation. Experimental results show that the ILP with acceleration techniques reduces runtime by 82% compared to the baseline ILP. Using the SDP based algorithm, the runtime can be further reduced by 42%, with some tradeoff in the stitch number (reduced by 7%) and the conflict number (9% more). For very dense layouts, however, the SDP based algorithm can achieve a 140× speed-up even compared with the accelerated ILP.

1. INTRODUCTION

As minimum feature size further scales, the semiconductor industry is greatly challenged in patterning sub-22nm half-pitch due to the delay of viable next generation lithography such as Extreme Ultra Violet (EUV). Double patterning lithography (DPL) is widely recognized as a promising solution for 32nm, 22nm, and possibly 16nm volume chip production.

As shown in Fig. 1, the key challenge of DPL lies in the decomposition process by which the original layout is divided into two masks. Then, there are two exposure/etching steps, through which the layout can be produced. The advantage of this approach is that the effective pitch can be doubled, which improves the lithography resolution. During the decomposition, when the distance between two patterns is less than the minimum colorable distance $min_s$, they need to be assigned to different masks to avoid a conflict. Sometimes a conflict can be solved by splitting a pattern into two touching parts, joined by what is called a stitch. In Fig. 1(b), polygon a is split into two polygons a1 and a2 in order to resolve the decomposition conflicts. However, the introduced stitches lead to yield loss due to overlay error [1]. Therefore, two of the main challenges in layout decomposition are conflict and stitch minimization.

The paradigm of double patterning may be further extended to triple patterning lithography (TPL). Industry has already explored test-chip patterns with triple patterning or even quadruple patterning [2]. By using TPL, we can achieve further feature-size scaling through pitch-tripling.

Fig. 1. In DPL, a single layer is decomposed into two masks and the pitch can be increased effectively. Panels: (a) features a, b and c; (b) feature a split into a1 and a2 by a stitch; (c) assignment to mask1 and mask2.

Fig. 2. (a) In DPL, even stitch insertion cannot avoid native conflict. (b) The native conflicts can be resolved by TPL.

It should be noted that in DPL, even with stitch insertion, there may be native conflicts [3]. Fig. 2(a) shows a three-way conflict cycle between features a, b and c, where any two of them are within $min_s$ of each other. As a consequence, there is no chance of producing a conflict-free solution using DPL decomposition. However, we can easily resolve this problem if the layout is decomposed into three masks, as shown in Fig. 2(b). Yet this does not mean that the TPL layout decomposition problem becomes easier. In fact, since the features can be packed closer, the problem turns out to be more difficult, as shown in Section 2.

Much previous research focuses on the double patterning layout decomposition problem, which is generally regarded as a two-coloring problem on a conflict graph. Integer Linear Programming (ILP) is adopted in [4][5] to minimize the stitch number and/or the conflict number. Xu et al. [6] propose an efficient graph reduction-based algorithm for stitch minimization, and Yang et al. [7] propose a fast min-cut based approach. A matching based decomposer is proposed to minimize both the conflict and the stitch numbers [8]. To resolve native conflicts, several works introduce layout modification to further minimize the conflict number [9][10][11]. However, layout modification may cause new problems, e.g., with timing closure and hotspots.

Until now, there have been very few investigations of TPL layout decomposition. [12] proposes a triple patterning coloring algorithm which adopts a SAT solver. However, that work only deals with contact arrays, not general layout structures with wires, contacts, and so on. Besides, it does not involve any stitch minimization. [13][14] propose a self-aligned triple patterning (SATP) process to extend 193nm immersion lithography to half-pitch 15nm patterning. But the SATP process cannot insert any stitch, which would greatly constrain the possible layout patterns that are decomposable [15]. To the best of our knowledge, there is no study so far on TPL layout decomposition for general layout styles.

In this paper, we propose the first systematic study on layout decomposition for triple patterning lithography. We first give a general ILP formulation for TPL layout decomposition to simultaneously minimize conflicts and stitches. To improve scalability, we further propose three acceleration techniques without loss of solution quality: layout graph simplification, independent component computation and bridges computation. A semidefinite programming based approximation algorithm is further proposed to improve scalability. Semidefinite programming is an extension of linear programming used to approximately solve NP-hard problems, and it has been successfully applied to many combinatorial problems [16][17]. The main contributions of this paper include:

• A general ILP formulation to simultaneously minimize conflicts and stitches for TPL layout decomposition;
• Three acceleration techniques to improve ILP scalability;
• A novel vector programming formulation for TPL decomposition and its semidefinite programming based approximation algorithm, which can further deal with very dense layouts where even the accelerated ILP becomes too slow;
• Experimental results that are very promising in terms of the tradeoff between quality of results and runtime.

The rest of the paper is organized as follows: in Section 2, we discuss the problem formulation and then analyze the problem complexity. The basic algorithm and some acceleration techniques are described in Section 3. Section 4 proposes a semidefinite programming based algorithm to further accelerate the basic algorithm. Section 5 presents the experimental results, followed by the conclusion in Section 6.

2. PROBLEM FORMULATION AND COMPLEXITY

Some preliminaries on TPL are provided in this section, including some definitions and the problem formulation. We also demonstrate the complexity of the problem.

A. Problem Formulation

Given a layout which is specified by features in polygonal shapes, a layout graph [4] and a decomposition graph [9] are constructed.

Definition 1 (Layout Graph): The layout graph (LG) is an undirected graph whose nodes are the given layout's polygonal shapes and where an edge exists if and only if the two polygonal shapes are within minimum coloring distance $min_s$ of each other.

Fig. 3. Layout graph construction and decomposition graph construction. (a) Input layout represented as irregular polygons. (b) Corresponding layout graph, where all edges are conflict edges. (c) The node projection. (d) Corresponding decomposition graph, where dashed edges are stitch edges.

Fig. 3(a) gives an example of an input layout; the corresponding layout graph is shown in Fig. 3(b). All the edges in a layout graph are called Conflict Edges (CE). A conflict exists if and only if two nodes are connected by a CE and are in the same mask. In other words, each conflict edge is a conflict candidate.
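The construction in Definition 1 can be sketched in a few lines. The sketch below is only illustrative: it assumes every shape is an axis-aligned rectangle `(x1, y1, x2, y2)` rather than a general polygon, treats "within $min_s$" as a strict inequality on the Euclidean gap, and the names `rect_distance` and `build_layout_graph` are hypothetical.

```python
from itertools import combinations
from math import hypot

def rect_distance(a, b):
    """Euclidean gap between two axis-aligned rectangles (x1, y1, x2, y2)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)  # horizontal gap, 0 if they overlap in x
    dy = max(a[1] - b[3], b[1] - a[3], 0)  # vertical gap, 0 if they overlap in y
    return hypot(dx, dy)

def build_layout_graph(shapes, min_s):
    """Return the conflict edges {(i, j)} with i < j (assumed rectangle model)."""
    edges = set()
    for (i, a), (j, b) in combinations(enumerate(shapes), 2):
        if rect_distance(a, b) < min_s:
            edges.add((i, j))
    return edges

# Hypothetical layout: shapes 0 and 1 are 5 units apart, shape 2 is far away.
shapes = [(0, 0, 10, 10), (15, 0, 25, 10), (100, 0, 110, 10)]
conflict_edges = build_layout_graph(shapes, min_s=8)
```

The pairwise loop is quadratic in the number of shapes; a production flow would presumably use a spatial index, which is outside the scope of this sketch.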

Definition 2 (Decomposition Graph): Given a layout represented by a set of polygonal shapes, the decomposition graph (DG) is an undirected graph with a single set of nodes V, and two sets of edges, CE and SE, which contain the conflict edges and stitch edges, respectively. V has one or more nodes for each polygonal shape and each node is associated with a polygonal shape. An edge is in CE iff the two polygonal shapes are within minimum coloring distance $min_s$ of each other. An edge is in SE iff there is a stitch between the two nodes, which are associated with the same polygonal shape.

On the layout graph, the node projection is first performed, where projected segments are highlighted by bold lines in Fig. 3(c). Based on the projection result, all the legal splitting locations are computed. Then the decomposition graph is constructed, as shown in Fig. 3(d). Note that the conflict edges are drawn as solid black edges, while stitch edges are drawn as dashed edges.

Problem 1 (TPL layout decomposition): Given a layout which is specified by features in polygonal shapes, the layout graph and the decomposition graph are constructed. Our goal is to assign all the nodes in the decomposition graph to three masks so as to minimize the stitch number and the conflict number.

TABLE I
NOTATION

| Symbol | Meaning |
|---|---|
| Notation used in mathematical programming | |
| $CE$ | set of conflict edges |
| $SE$ | set of stitch edges |
| $V$ | set of polygons |
| $r_i$ | the i-th layout polygon |
| $x_i$ | variable denoting the coloring of $r_i$ |
| $c_{ij}$ | 0-1 variable, $c_{ij} = 1$ when there is a conflict between $r_i$ and $r_j$ |
| $s_{ij}$ | 0-1 variable, $s_{ij} = 1$ when there is a stitch between $r_i$ and $r_j$ |
| Notation used in ILP formulation | |
| $x_{i1}, x_{i2}$ | two 1-bit 0-1 variables to represent 3 colors |
| $c_{ij1}, c_{ij2}$ | two 1-bit 0-1 variables to determine $c_{ij}$ |
| $s_{ij1}, s_{ij2}$ | two 1-bit 0-1 variables to determine $s_{ij}$ |

B. Problem Complexity

At first glance, layout decomposition is similar to the graph coloring problem. However, since stitch edges are introduced, the problem of minimizing conflicts and stitches is more complicated. In the double patterning case, deciding whether a graph is 2-colorable is easy: it suffices to check whether the graph contains an odd cycle. For conflict and stitch minimization, if the decomposition graph is planar, DPL layout decomposition can be solved in polynomial time [8]. For triple patterning, the problem becomes more complicated.

Lemma 1: Deciding whether a planar graph is 3-colorable is NP-complete [18].

Lemma 1 extends naturally to general graphs. Based on Lemma 1, the methodology in [12] is not suitable for TPL decomposition: a SAT solver can only work for a 3-colorable layout graph, and 3-colorability cannot be checked in polynomial time unless P = NP.

Lemma 2: Coloring a 3-colorable graph with 4 colors is NP-complete [19].

The 3-coloring problem is to assign the nodes of a 3-colorable graph to 3 colors. Since even coloring such a graph with 4 colors cannot be done in polynomial time, it can be shown that the 3-coloring problem is NP-hard. Based on the above lemmas, even if the decomposition graph is planar, we reach the following theorem:

Theorem 1: TPL layout decomposition problem is NP-hard.

We can prove this theorem by reducing the 3-Coloring problem to the TPL decomposition problem. Due to page limit, the detailed proof is skipped here.

3. BASIC ALGORITHM

In this section, we present our basic algorithms, which are based on Integer Linear Programming (ILP). Since the time complexity of ILP is very high, we propose three acceleration techniques to divide the whole problem into several smaller ones. The entire flow is shown in Fig. 4.

A. Mathematical Formulation for TPL Decomposition

The mathematical formulation for TPL layout decomposition is shown in (1). For convenience, some notations used in the mathematical programming and ILP formulations are listed in Table I. The objective is to simultaneously minimize both the conflict number and the stitch number. The parameter α is a user-defined parameter for assigning relative importance between the conflict number and the stitch number.

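\begin{align}
\min\ & \sum_{e_{ij} \in CE} c_{ij} + \alpha \sum_{e_{ij} \in SE} s_{ij} \tag{1}\\
\text{s.t.}\ & c_{ij} = (x_i == x_j) \quad \forall e_{ij} \in CE \tag{1a}\\
& s_{ij} = x_i \oplus x_j \quad \forall e_{ij} \in SE \tag{1b}\\
& x_i \in \{0, 1, 2\} \quad \forall i \in V \tag{1c}
\end{align}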

where $x_i$ is a variable representing the color (one of three) of polygon $r_i$, $c_{ij}$ is a binary variable for conflict edge $e_{ij} \in CE$, and $s_{ij}$ is a binary variable for stitch edge $e_{ij} \in SE$. Constraint (1a) evaluates the conflict number: a conflict is counted when the nodes $r_i$ and $r_j$ joined by a conflict edge are assigned the same color (mask). Constraint (1b) calculates the stitch number: if the touching nodes $r_i$ and $r_j$ joined by a stitch edge are assigned different colors (masks), stitch $s_{ij}$ is introduced.
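To make objective (1) concrete, the following sketch evaluates it for a given color assignment. The function name `decomposition_cost` and the four-node graph are hypothetical and chosen only for illustration; the default α = 0.1 matches the value used in the experiments.

```python
def decomposition_cost(color, conflict_edges, stitch_edges, alpha=0.1):
    """Evaluate objective (1) for a coloring given as {node: 0 | 1 | 2}."""
    # (1a): a conflict edge costs 1 when both endpoints share a mask.
    conflicts = sum(1 for i, j in conflict_edges if color[i] == color[j])
    # (1b): a stitch edge costs alpha when its endpoints use different masks.
    stitches = sum(1 for i, j in stitch_edges if color[i] != color[j])
    return conflicts, stitches, conflicts + alpha * stitches

# Hypothetical decomposition graph: a triangle 0-1-2, node 3 in conflict with
# node 2, and a stitch edge between nodes 0 and 3 (two parts of one polygon).
CE = [(0, 1), (1, 2), (0, 2), (2, 3)]
SE = [(0, 3)]

with_stitch = decomposition_cost({0: 0, 1: 1, 2: 2, 3: 1}, CE, SE)
without_stitch = decomposition_cost({0: 0, 1: 1, 2: 2, 3: 0}, CE, SE)
```

Both assignments are conflict-free, but the second keeps the two halves of the polygon on one mask and therefore has the lower cost.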

B. ILP Formulation for TPL Layout Decomposition

We now show how to formulate (1) as an integer linear program. Note that eqs. (1a) and (1b) can be linearized only when $x_i$ is a 0-1 variable [4], which cannot represent three different colors. To handle this problem, we represent the color of each node using two 1-bit 0-1 variables $x_{i1}$ and $x_{i2}$. In order to limit the number of colors for each node to 3, the value (1, 1) is not permitted for any pair $(x_{i1}, x_{i2})$. In other words, only the values (0, 0), (0, 1) and (1, 0) are allowed.

Thus, (1) can be formulated as follows:

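\begin{align}
\min\ & \sum_{e_{ij} \in CE} c_{ij} + \alpha \sum_{e_{ij} \in SE} s_{ij} \tag{2}\\
\text{s.t.}\ & x_{i1} + x_{i2} \le 1 \tag{2a}\\
& x_{i1} + x_{j1} \le 1 + c_{ij1} \quad \forall e_{ij} \in CE \tag{2b}\\
& (1 - x_{i1}) + (1 - x_{j1}) \le 1 + c_{ij1} \quad \forall e_{ij} \in CE \tag{2c}\\
& x_{i2} + x_{j2} \le 1 + c_{ij2} \quad \forall e_{ij} \in CE \tag{2d}\\
& (1 - x_{i2}) + (1 - x_{j2}) \le 1 + c_{ij2} \quad \forall e_{ij} \in CE \tag{2e}\\
& c_{ij1} + c_{ij2} \le 1 + c_{ij} \quad \forall e_{ij} \in CE \tag{2f}\\
& x_{i1} - x_{j1} \le s_{ij1} \quad \forall e_{ij} \in SE \tag{2g}\\
& x_{j1} - x_{i1} \le s_{ij1} \quad \forall e_{ij} \in SE \tag{2h}\\
& x_{i2} - x_{j2} \le s_{ij2} \quad \forall e_{ij} \in SE \tag{2i}\\
& x_{j2} - x_{i2} \le s_{ij2} \quad \forall e_{ij} \in SE \tag{2j}\\
& s_{ij} \ge s_{ij1},\ s_{ij} \ge s_{ij2} \quad \forall e_{ij} \in SE \tag{2k}
\end{align}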

The objective function is the same as that in (1), which minimizes the weighted sum of the conflict number and the stitch number. Constraint (2a) is used to limit the number of colors for each node to 3.

Constraints (2b) to (2f) are equivalent to constraint (1a), where the 0-1 variable $c_{ij1}$ indicates whether $x_{i1}$ equals $x_{j1}$, and $c_{ij2}$ indicates whether $x_{i2}$ equals $x_{j2}$. The 0-1 variable $c_{ij}$ is true only if the two nodes connected by conflict edge $e_{ij}$ have the same color, i.e. both $c_{ij1}$ and $c_{ij2}$ are true.

Fig. 4. Basic Algorithms Flow. Flowchart boxes: ILP / Vector Programming, Input Layout, Output Masks, Layout Graph Construction, Bridge Computation, Independent Component Computation, Decomposition Graph Construction, Layout Graph Simplification.

Similarly, constraints (2g) to (2k) are equivalent to constraint (1b). The 0-1 variable $s_{ij1}$ indicates whether $x_{i1}$ is different from $x_{j1}$, and $s_{ij2}$ indicates whether $x_{i2}$ is different from $x_{j2}$. Stitch $s_{ij}$ is true if either $s_{ij1}$ or $s_{ij2}$ is true.
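The claimed equivalence between (2b)-(2k) and (1a)-(1b) can be checked by brute force over the three allowed bit pairs. The sketch below assumes that the minimization drives each auxiliary 0-1 variable to the smallest value its constraints allow; the helper names are hypothetical.

```python
from itertools import product

ALLOWED = [(0, 0), (0, 1), (1, 0)]  # (1, 1) is excluded by constraint (2a)

def min_conflict_var(xi, xj):
    """Smallest c_ij permitted by (2b)-(2f) for two 2-bit colors."""
    c_bits = []
    for b in range(2):
        # (2b)/(2d): x_ib + x_jb <= 1 + c_ijb
        # (2c)/(2e): (1 - x_ib) + (1 - x_jb) <= 1 + c_ijb
        c_bits.append(max(xi[b] + xj[b] - 1, (1 - xi[b]) + (1 - xj[b]) - 1, 0))
    return max(c_bits[0] + c_bits[1] - 1, 0)  # (2f)

def min_stitch_var(xi, xj):
    """Smallest s_ij permitted by (2g)-(2k) for two 2-bit colors."""
    s_bits = [abs(xi[b] - xj[b]) for b in range(2)]  # (2g)-(2j)
    return max(s_bits)                                # (2k)

# For every pair of allowed colors, compare with the definitions in (1a)/(1b).
checks = [
    (min_conflict_var(a, b) == int(a == b), min_stitch_var(a, b) == int(a != b))
    for a, b in product(ALLOWED, repeat=2)
]
```

Every entry of `checks` is `(True, True)`, so the linear constraints reproduce "same color" and "different color" on all nine combinations.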

C. Acceleration Techniques

Since ILP is an NP-hard problem, its runtime increases dramatically with the size of the decomposition graph. We propose three acceleration techniques to simplify the layout graph and the decomposition graph in order to reduce the time complexity of ILP. As shown in Fig. 4, our acceleration flow consists of three steps: Independent Component Computation, Layout Graph Simplification and Bridges Computation.

1) Independent Component Computation: We propose independent component computation on the decomposition graph to reduce the ILP problem size without losing optimality. In the layout graph of a real design, we observe many isolated clusters. Therefore, we can break down the whole design into several independent components, and apply the basic ILP formulation to each one. The overall solution can be taken as the union of the solutions of all the components without affecting the global optimality. The runtime of the ILP formulation decreases dramatically with the reduction of variables and constraints, and the coloring assignment can be effectively accelerated. Independent component computation is a well-known technique which has been applied in many previous studies [4][5][7][9].
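Independent components are the connected components of the graph, so a breadth-first search is enough to find them. The sketch below is one possible way to do this; the function name and the node labels are hypothetical.

```python
from collections import defaultdict, deque

def independent_components(nodes, edges):
    """Split a graph into connected components using breadth-first search."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:
            u = queue.popleft()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(sorted(comp))
    return components

# Hypothetical graph with two clusters and one isolated node.
components = independent_components("abcdef", [("a", "b"), ("b", "c"), ("d", "e")])
```

Each returned component can then be handed to the ILP separately, and the per-component colorings are simply merged.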

Algorithm 1 Layout Graph Simplification and Color Assignment
Require: Layout graph G to be simplified, stack S
  while ∃ n ∈ G s.t. degree(n) ≤ 2 do
    S.push(n);
    G.delete(n);
  end while
  Decomposition graph construction.
  TPL layout decomposition for the nodes that were not simplified.
  while !S.empty() do
    n = S.pop();
    G.add(n);
    Assign n a legal color.
  end while

Fig. 6. Bridges Computation. (a) After bridges computation, edge $e_{ab}$ is labeled as a bridge. (b) The ILP formulation is carried out in the two sub-graphs. (c) Colors in one sub-graph are rotated so that the bridge can be added back.

2) Layout Graph Simplification: We can simplify the layout graph by removing all nodes with degree less than or equal to two. At the beginning, all nodes with degree less than or equal to two are detected and removed temporarily from the layout graph. This removal process continues until every remaining node has degree at least three. The layout graph simplification algorithm is shown in Algorithm 1.

If all the nodes in the layout graph can be pushed onto the stack, Algorithm 1 solves TPL layout decomposition optimally in linear time. In the example shown in Fig. 5, every node can be pushed onto the stack and is finally colored when it is popped off. Note that even when some nodes cannot be simplified, layout graph simplification can reduce the problem size dramatically. Additionally, we observe that this algorithm can also partition the layout graph into several sub-graphs.
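A minimal sketch of the peel-and-recolor idea is given below. It works on conflict edges only and leaves stitch handling aside; `simplify_and_color`, the `solve_core` callback (standing in for the ILP on the remaining nodes) and the example graph are all hypothetical names introduced for illustration.

```python
def simplify_and_color(adj, solve_core=None):
    """Peel nodes of degree <= 2, color the remaining core, then pop and color.

    adj: {node: set of neighbours}; solve_core: optional callable that colors
    the nodes which could not be removed and returns {node: color}.
    """
    work = {n: set(nb) for n, nb in adj.items()}  # mutable copy of the graph
    stack = []
    candidates = [n for n in work if len(work[n]) <= 2]
    while candidates:
        n = candidates.pop()
        if n not in work:            # already removed through an earlier entry
            continue
        for m in work.pop(n):        # delete n from the graph
            work[m].discard(n)
            if len(work[m]) <= 2:
                candidates.append(m)
        stack.append(n)              # push n
    if work:
        if solve_core is None:
            raise ValueError("a core remains; supply solve_core (e.g. an ILP)")
        color = dict(solve_core(work))
    else:
        color = {}
    while stack:
        n = stack.pop()
        # At most two neighbours are colored at this point, so a color is free.
        used = {color[m] for m in adj[n] if m in color}
        color[n] = min({0, 1, 2} - used)
    return color

# Hypothetical layout graph: a 5-cycle a-b-c-d-e with the chord a-c.
adj = {"a": {"b", "c", "e"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
       "d": {"c", "e"}, "e": {"a", "d"}}
coloring = simplify_and_color(adj)
```

A popped node always finds a free color because, when it was removed, it had at most two neighbors left, and only those neighbors are colored when it returns.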

3) Bridges Computation: A bridge of a graph is an edge whose removal disconnects the graph into two components. If the two components are independent, removing the bridge can divide the whole ILP into two independent ILP formulations.

Theorem 2: Partitioning the decomposition graph by removing bridges does not introduce new stitches.

An example of the bridges computation is shown in Fig. 6. First of all, conflict edge $e_{ab}$ is found to be a bridge. Removing the bridge divides the decomposition graph into two sides. After ILP based color assignment, if node a and node b are assigned the same color, we can rotate the colors of all nodes in one sub-graph. A similar method can be adopted when the bridge is a stitch edge. We adopt an $O(|V| + |E|)$ algorithm [20] to find bridges in the decomposition graph.
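The sketch below shows one standard linear-time way to find bridges (a depth-first search with low-link values) together with the color rotation described above. It is not necessarily the algorithm of [20]; it assumes a simple graph without parallel edges, uses recursion (so it suits small graphs), and all names are hypothetical.

```python
def find_bridges(adj):
    """Return the bridges of an undirected simple graph {node: [neighbours]}."""
    disc, low, bridges = {}, {}, []

    def dfs(u, parent):
        disc[u] = low[u] = len(disc)          # discovery time of u
        for v in adj[u]:
            if v == parent:
                continue
            if v in disc:                      # back edge
                low[u] = min(low[u], disc[v])
            else:                              # tree edge
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:           # nothing below v reaches u or above
                    bridges.append((u, v))

    for u in adj:
        if u not in disc:
            dfs(u, None)
    return bridges

def rotate_colors(color, side):
    """Shift every color in one sub-graph by one mask (0 -> 1 -> 2 -> 0)."""
    for n in side:
        color[n] = (color[n] + 1) % 3

# Hypothetical graph: two triangles joined by the conflict edge a1-b1.
adj = {"a1": ["a2", "a3", "b1"], "a2": ["a1", "a3"], "a3": ["a1", "a2"],
       "b1": ["a1", "b2", "b3"], "b2": ["b1", "b3"], "b3": ["b1", "b2"]}
bridges = find_bridges(adj)

# Each triangle was colored on its own; a1 and b1 happen to share color 0.
color = {"a1": 0, "a2": 1, "a3": 2, "b1": 0, "b2": 1, "b3": 2}
if color["a1"] == color["b1"]:
    rotate_colors(color, ["b1", "b2", "b3"])
```

Rotation permutes the masks uniformly inside one sub-graph, so it changes neither the conflicts nor the stitches found there, while making the two bridge endpoints differ.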

With the above three acceleration techniques, the ILP formulation still achieves optimal solutions; in other words, the acceleration preserves optimality. Due to page limit, we skip the detailed discussion here.

4. SDP BASED ALGORITHM

Although the acceleration techniques can reduce the problem size in many ways, ILP may still be too slow for large problems which cannot be simplified effectively. In this section we provide an approximation algorithm to obtain solutions more rapidly. First, a novel vector programming formulation for TPL layout decomposition is given. Then we relax the vector program into a semidefinite program (SDP). Given the solution of the semidefinite program, we can obtain the TPL decomposition results in polynomial time.

Fig. 5. This layout can be directly decomposed by layout graph simplification. (a) Input layout. (b) Corresponding layout graph. (c)(d)(e) Iteratively remove nodes with no more than 2 edges and push them onto the stack. (f)(g)(h)(i) Iteratively pop and recover nodes, and assign any legal color. (j) Final decomposition result.

Fig. 7. Three vectors $(1, 0)$, $(-\frac{1}{2}, \frac{\sqrt{3}}{2})$, $(-\frac{1}{2}, -\frac{\sqrt{3}}{2})$ represent three different colors.

A. Vector Programming for TPL Layout Decomposition

In TPL decomposition, there are three possible colors. We assign a unit vector $\vec{v}_i$ to every node i. If $e_{ij}$ is a conflict edge, we want the vectors $\vec{v}_i$ and $\vec{v}_j$ to be far apart. If $e_{ij}$ is a stitch edge, we want $\vec{v}_i$ and $\vec{v}_j$ to be the same. As shown in Fig. 7, we associate all the nodes with three different unit vectors: $(1, 0)$, $(-\frac{1}{2}, \frac{\sqrt{3}}{2})$ and $(-\frac{1}{2}, -\frac{\sqrt{3}}{2})$. Note that the angle between any two vectors of the same color is 0, while the angle between vectors with different colors is $2\pi/3$.

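Additionally, we define the inner product of two m-dimensional vectors $\vec{v}_i$ and $\vec{v}_j$ as follows:

$$\vec{v}_i \cdot \vec{v}_j = \sum_{k=1}^{m} v_{ik} v_{jk}$$

where each vector $\vec{v}_i$ can be represented as $(v_{i1}, v_{i2}, \ldots, v_{im})$. Then for the vectors $\vec{v}_i, \vec{v}_j \in \{(1, 0), (-\frac{1}{2}, \frac{\sqrt{3}}{2}), (-\frac{1}{2}, -\frac{\sqrt{3}}{2})\}$, we have the following property:

$$\vec{v}_i \cdot \vec{v}_j = \begin{cases} 1, & \vec{v}_i = \vec{v}_j \\ -\frac{1}{2}, & \vec{v}_i \neq \vec{v}_j \end{cases}$$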
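As a quick check of this property, the inner products of the three color vectors can be worked out directly; each value equals the cosine of the angle between the two unit vectors, i.e. $\cos 0 = 1$ or $\cos(2\pi/3) = -\frac{1}{2}$:

```latex
\begin{align*}
(1, 0) \cdot (1, 0) &= 1 \cdot 1 + 0 \cdot 0 = 1, \\
(1, 0) \cdot \left(-\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}\right)
  &= -\tfrac{1}{2} + 0 = -\tfrac{1}{2}, \\
\left(-\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}\right) \cdot
\left(-\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}\right)
  &= \tfrac{1}{4} + \tfrac{3}{4} = 1, \\
\left(-\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}\right) \cdot
\left(-\tfrac{1}{2}, -\tfrac{\sqrt{3}}{2}\right)
  &= \tfrac{1}{4} - \tfrac{3}{4} = -\tfrac{1}{2}.
\end{align*}
```

The remaining pairs follow by symmetry.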

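Based on the above property, we can formulate the TPL layout decomposition as the following vector program [21]:

\begin{align}
\min\ & \sum_{e_{ij} \in CE} \frac{2}{3}\left(\vec{v}_i \cdot \vec{v}_j + \frac{1}{2}\right) + \frac{2\alpha}{3} \sum_{e_{ij} \in SE} \left(1 - \vec{v}_i \cdot \vec{v}_j\right) \tag{3}\\
\text{s.t.}\ & \vec{v}_i \in \left\{(1, 0), \left(-\frac{1}{2}, \frac{\sqrt{3}}{2}\right), \left(-\frac{1}{2}, -\frac{\sqrt{3}}{2}\right)\right\} \tag{3a}
\end{align}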

Formula (3) is equivalent to mathematical formula (1). Since the TPL decomposition is NP-hard, this vector program is also NP-hard. In the next part, we relax (3) to a semidefinite program, which can be solved in polynomial time.
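The equivalence of (3) and (1) can be checked numerically on a small instance by enumerating every coloring. The graph below is hypothetical, and the function names are introduced only for this sketch.

```python
from itertools import product
from math import isclose, sqrt

# The three unit vectors standing for the three masks.
VEC = {0: (1.0, 0.0), 1: (-0.5, sqrt(3) / 2), 2: (-0.5, -sqrt(3) / 2)}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def vector_objective(color, ce, se, alpha):
    """Objective (3), evaluated on the vectors assigned to each node."""
    conflict = sum(2 / 3 * (dot(VEC[color[i]], VEC[color[j]]) + 0.5) for i, j in ce)
    stitch = 2 * alpha / 3 * sum(1 - dot(VEC[color[i]], VEC[color[j]]) for i, j in se)
    return conflict + stitch

def discrete_objective(color, ce, se, alpha):
    """Objective (1): number of conflicts plus alpha times number of stitches."""
    conflicts = sum(color[i] == color[j] for i, j in ce)
    stitches = sum(color[i] != color[j] for i, j in se)
    return conflicts + alpha * stitches

# Hypothetical 4-node decomposition graph.
CE = [(0, 1), (1, 2), (0, 2), (2, 3)]
SE = [(0, 3)]
mismatches = [
    c for c in product(range(3), repeat=4)
    if not isclose(vector_objective(c, CE, SE, 0.1),
                   discrete_objective(c, CE, SE, 0.1), abs_tol=1e-9)
]
```

The list `mismatches` stays empty: each conflict edge contributes $\frac{2}{3}(1 + \frac{1}{2}) = 1$ when the colors match and 0 otherwise, and each stitch edge contributes $\frac{2\alpha}{3} \cdot \frac{3}{2} = \alpha$ when the colors differ.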

B. Semidefinite Programming Approximation

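Constraint (3a) requires the solutions of (3) to be discrete. After removing this constraint, we obtain formula (4) as follows:

\begin{align}
\min\ & \sum_{e_{ij} \in CE} \frac{2}{3}\left(\vec{y}_i \cdot \vec{y}_j + \frac{1}{2}\right) + \frac{2\alpha}{3} \sum_{e_{ij} \in SE} \left(1 - \vec{y}_i \cdot \vec{y}_j\right) \tag{4}\\
\text{s.t.}\ & \vec{y}_i \cdot \vec{y}_i = 1, \quad \forall i \in V \tag{4a}\\
& \vec{y}_i \cdot \vec{y}_j \ge -\frac{1}{2}, \quad \forall e_{ij} \in CE \tag{4b}
\end{align}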

This formula is a relaxation of (3), since we can take any feasible solution $\vec{v}_i = (v_{i1}, v_{i2})$ of (3) and produce a feasible solution of (4) by setting $\vec{y}_i = (v_{i1}, v_{i2}, 0, 0, \ldots, 0)$, i.e. $\vec{y}_i \cdot \vec{y}_i = 1$ and $\vec{y}_i \cdot \vec{y}_j = \vec{v}_i \cdot \vec{v}_j$ in this solution. Thus if $Z_R$ is the value of an optimal solution of formula (4) and $OPT$ is the optimal value of formula (3), it must hold that $Z_R \le OPT$. In other words, the solution of (4) is an approximation to that of (3). Since we only care about the values of $\vec{y}_i$, program (4) can be further simplified by eliminating the constants and the common positive factor $\frac{2}{3}$ in the objective function:

\begin{align}
\min\ & \sum_{e_{ij} \in CE} \left(\vec{y}_i \cdot \vec{y}_j\right) - \alpha \sum_{e_{ij} \in SE} \left(\vec{y}_i \cdot \vec{y}_j\right) \tag{5}\\
\text{s.t.}\ & \text{(4a)} - \text{(4b)}
\end{align}

Without the discrete constraint (3a), programs (4) and (5) are no longer NP-hard. To solve (5) in polynomial time, we will show that it is equivalent to a semidefinite program. Semidefinite programming (SDP) is similar to linear programming in that it has a linear objective function and linear constraints; in addition, a square symmetric matrix of variables can be constrained to be positive semidefinite. Although semidefinite programs are more general than linear programs, both can be solved in polynomial time. Besides, relaxations based on semidefinite programming have better theoretical results than those based on LP [16].

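Consider the following standard semidefinite program:

\begin{align}
\text{SDP:}\quad \min\ & A \bullet X \tag{6}\\
\text{s.t.}\ & X_{ii} = 1, \quad \forall i \in V \tag{6a}\\
& X_{ij} \ge -\frac{1}{2}, \quad \forall e_{ij} \in CE \tag{6b}\\
& X \succeq 0 \tag{6c}
\end{align}

where $A \bullet X$ is the inner product between two matrices A and X, i.e. $\sum_i \sum_j A_{ij} X_{ij}$. Here $A_{ij}$ is the entry that lies in the i-th row and the j-th column of matrix A. Constraint (6c) means that matrix X should be positive semidefinite.

$$A_{ij} = \begin{cases} 1, & \forall e_{ij} \in CE \\ -\alpha, & \forall e_{ij} \in SE \\ 0, & \text{otherwise} \end{cases} \tag{7}$$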

Similarly, $X_{ij}$ is the entry in the i-th row and the j-th column of X. Note that the solution of the SDP is represented as a positive semidefinite matrix X, while solutions of the vector program are stored as a list of vectors. However, we can show that they are equivalent.
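As an illustration of (7), the sketch below builds A from edge lists and evaluates $A \bullet X$. The edge lists are read off the non-zero entries of the matrix A in the worked example of Section 4-D (nodes numbered from 1); the function names are hypothetical.

```python
def build_cost_matrix(n, conflict_edges, stitch_edges, alpha):
    """Matrix A of equation (7) for nodes numbered 1..n."""
    A = [[0.0] * n for _ in range(n)]
    for i, j in conflict_edges:
        A[i - 1][j - 1] = A[j - 1][i - 1] = 1.0
    for i, j in stitch_edges:
        A[i - 1][j - 1] = A[j - 1][i - 1] = -alpha
    return A

def frobenius_inner(A, X):
    """A . X = sum over i, j of A_ij * X_ij."""
    return sum(a * x for row_a, row_x in zip(A, X) for a, x in zip(row_a, row_x))

CE = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5)]
SE = [(1, 4)]
A = build_cost_matrix(5, CE, SE, alpha=0.1)

# A feasible but poor X: all nodes share one vector (everything on one mask).
X_one_mask = [[1.0] * 5 for _ in range(5)]
value = frobenius_inner(A, X_one_mask)  # about 2 * (7 - 0.1) = 13.8
```

Because A is symmetric and every edge appears twice in the double sum, $A \bullet X$ is twice the objective of (5), which does not change the minimizer.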

Lemma 3: A symmetric matrix X is positive semidefinite if and only if $X = VV^T$ for some matrix V.

Given a positive semidefinite matrix X, using the Cholesky decomposition we can find the corresponding matrix V in $O(n^3)$ time.

Theorem 3: The semidefinite program (6) and the vector program (5) are equivalent.

Proof: Given a solution $\{\vec{v}_1, \vec{v}_2, \cdots, \vec{v}_m\}$ of (5), the corresponding matrix X is defined by $X_{ij} = \vec{v}_i \cdot \vec{v}_j$. In the other direction, based on Lemma 3, given a matrix X from (6), we can find a matrix V satisfying $X = VV^T$ by using the Cholesky decomposition. The rows of V are vectors $\{\vec{v}_i\}$ that form a solution of (5).
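Both directions of the proof can be tried out on a small case. The sketch below first builds X from a list of vectors and then recovers a factor V with $X = VV^T$. Since such an X is usually singular, the textbook Cholesky recurrence would divide by zero, so the code uses a tolerant variant that zeroes a column when its pivot vanishes; this variant and the names `gram` and `psd_cholesky` are assumptions of this sketch, not the paper's implementation.

```python
from math import isclose, sqrt

def gram(vectors):
    """X_ij = v_i . v_j (vector program -> SDP direction)."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

def psd_cholesky(X, tol=1e-9):
    """Lower-triangular V with X = V V^T for a positive semidefinite X."""
    n = len(X)
    V = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = X[j][j] - sum(V[j][k] ** 2 for k in range(j))
        V[j][j] = sqrt(d) if d > tol else 0.0     # zero pivot: rank-deficient X
        for i in range(j + 1, n):
            r = X[i][j] - sum(V[i][k] * V[j][k] for k in range(j))
            V[i][j] = r / V[j][j] if V[j][j] > 0 else 0.0
    return V

# Five nodes colored 0, 1, 2, 0, 2 with the three unit vectors of Fig. 7.
unit = [(1.0, 0.0), (-0.5, sqrt(3) / 2), (-0.5, -sqrt(3) / 2)]
X = gram([unit[c] for c in (0, 1, 2, 0, 2)])
V = psd_cholesky(X)          # SDP -> vector program direction
X_back = gram(V)             # rows of V reproduce X
```

The rows of V are again unit vectors whose pairwise inner products are 1 or $-\frac{1}{2}$, so they form a solution of (5) with the same objective value.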

C. Mapping Algorithm

Solutions of program (6) are continuous, while optimal solutions of (3) are discrete. In this subsection we map the continuous solutions into discrete ones.

In the matrix X generated by the SDP, if $X_{ij}$ is close to 1, then nodes i and j should be in the same mask, while if $X_{ij}$ is close to −0.5, node i and node j tend to be in different masks. Our mapping algorithm is given in Algorithm 2, which finds the relative relationships among the nodes and maps them into three different masks. First, some triplets are constructed and sorted to store all $X_{ij}$ information. Then, we carry out our mapping algorithm in two steps. In the first step, if $X_{ij}$ is close to 1, node i and node j will be put in the same mask, while if $X_{ij}$ is close to −0.5, they will be labeled to be in different masks. Here the vectors UnionLevel[k] and SepaLevel[k] are user-defined parameters: the entries of UnionLevel[] are close to 1 and the entries of SepaLevel[] are close to −0.5. In the second step, we continue to union the node i and the node j with maximum $X_{ij}$ until all nodes are assigned into three masks.

We use the disjoint-set data structure to group nodes into three masks. Implemented with "union by rank" and "path compression", the running time per disjoint-set operation is almost constant [22]. Let n be the number of nodes; the number of triplets is then at most $n^2$. Sorting all the triplets requires $O(n^2 \log n)$ time. Since all triplets are sorted, each of them is visited at most once. Because each operation finishes in almost constant time, the complexity of Algorithm 2 is $O(n^2 \log n)$.

Algorithm 2 Mapping Algorithm
  Solve the program (6), get a matrix X.
  Label each non-zero entry $X_{ij}$ as a triplet $(X_{ij}, i, j)$.
  Sort all $(X_{ij}, i, j)$ by $X_{ij}$.
  for k = 1 to R do
    for each triplet $(X_{ij}, i, j)$ do
      if $X_{ij}$ > UnionLevel[k] && Compatible(i, j) then
        Union(i, j);
      end if
    end for
    for each triplet $(X_{ij}, i, j)$ do
      if $X_{ij}$ < SepaLevel[k] then
        Separate(i, j);
      end if
    end for
  end for
  while number of masks > 3 do
    Pick the triplet with maximum $X_{ij}$ and Compatible(i, j);
    Union(i, j);
  end while
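A compact sketch of Algorithm 2 is shown below. Several details are assumptions made for illustration: `Compatible(i, j)` is taken to mean that the groups of i and j are distinct and no recorded separation lies between them, a single threshold round (0.9 and −0.4) is used, and the compatibility test scans all recorded separations instead of using a constant-time structure. The input X is the matrix of the worked example in Section 4-D.

```python
class DisjointSet:
    def __init__(self, n):
        self.parent, self.rank = list(range(n)), [0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path compression
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:                 # union by rank
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.rank[ra] += self.rank[ra] == self.rank[rb]

def map_to_masks(X, union_levels=(0.9,), sepa_levels=(-0.4,)):
    n = len(X)
    ds, separated = DisjointSet(n), []
    triples = sorted(((X[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
                     reverse=True)

    def compatible(i, j):
        ri, rj = ds.find(i), ds.find(j)
        return ri != rj and not any(
            {ds.find(a), ds.find(b)} == {ri, rj} for a, b in separated)

    for u_lvl, s_lvl in zip(union_levels, sepa_levels):
        for x, i, j in triples:
            if x > u_lvl and compatible(i, j):
                ds.union(i, j)
        separated += [(i, j) for x, i, j in triples if x < s_lvl]
    while len({ds.find(i) for i in range(n)}) > 3:
        pair = next(((i, j) for _, i, j in triples if compatible(i, j)), None)
        if pair is None:          # no compatible pair left: more than 3 groups remain
            break
        ds.union(*pair)
    roots = sorted({ds.find(i) for i in range(n)})
    return [roots.index(ds.find(i)) for i in range(n)]

X = [[1.0, -0.5, -0.5, 1.0, -0.5],
     [-0.5, 1.0, -0.5, -0.5, -0.5],
     [-0.5, -0.5, 1.0, -0.5, 1.0],
     [1.0, -0.5, -0.5, 1.0, -0.5],
     [-0.5, -0.5, 1.0, -0.5, 1.0]]
masks = map_to_masks(X)  # mask index for nodes 1..5 (list positions 0..4)
```

On this input the first round already yields three groups, matching the result {1, 4}, {2}, {3, 5} reported for Fig. 8(b).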

D. An Example of the SDP Based Algorithm

Fig. 8 shows an example of the decomposition graph. This graph includes 7 conflict edges and 1 stitch edge, and can be colored with 3 colors. Moreover, the graph is not 2-colorable since it contains odd cycles. Here we show how semidefinite programming can be used to solve the TPL layout decomposition problem.

Fig. 8. Example of coloring a decomposition graph with nodes 1 to 5. (a) Input decomposition graph. (b) Using semidefinite programming and Algorithm 2, we assign the nodes to 3 different colors (masks).

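If we set α = 0.1, then matrix A is as follows:

$$A = \begin{pmatrix}
0 & 1 & 1 & -0.1 & 1 \\
1 & 0 & 1 & 0 & 1 \\
1 & 1 & 0 & 1 & 0 \\
-0.1 & 0 & 1 & 0 & 1 \\
1 & 1 & 0 & 1 & 0
\end{pmatrix}$$

After solving the semidefinite program (6), we get the following matrix X:

$$X = \begin{pmatrix}
1.0 & -0.5 & -0.5 & 1.0 & -0.5 \\
 & 1.0 & -0.5 & -0.5 & -0.5 \\
 & & 1.0 & -0.5 & 1.0 \\
 & \ddots & & 1.0 & -0.5 \\
 & & & & 1.0
\end{pmatrix}$$

Here we only show the upper part of the matrix X.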

From the matrix X we can see that node 1 and node 4 should be in the same color (because $X_{14} = 1.0$), and node 3 and node 5 should also be in the same color (because $X_{35} = 1.0$). However, since $X_{12}$, $X_{13}$ and $X_{15}$ are close to −0.5, nodes 2, 3 and 5 cannot be assigned the same color as node 1. Using Algorithm 2, we can map all the nodes into three colors: {1, 4}, {2} and {3, 5}. The final mapping result is shown in Fig. 8(b).

5. EXPERIMENTAL RESULTS

We implement our algorithm in C++ and test it on an Intel Core 3.0GHz Linux machine with 32 GB RAM. OpenAccess 2.2 [23] is used for interfacing with GDSII directly. We choose CBC [24] as our solver for the integer linear programming, and CSDP [25] as the solver for the semidefinite programming.

ISCAS-85 & 89 benchmarks are scaled down and modified to reflect the 16nm technology node. The metal-one layer is used for the experiments because it is one of the most complex layers in terms of layout decomposition. The minimum width and spacing are 25nm and 30nm, respectively. The minimum colorable distance is set to 85nm, and the minimum overlapping margin for stitch insertion is 10nm. The parameter α is set to 0.1.

A. Comparison

First, we show the effectiveness of the layout graph simplification and the bridge computation. Table II compares the Normal ILP and the Accelerated ILP, where the Normal ILP uses only the independent component computation technique, while the Accelerated ILP uses all three acceleration techniques. Columns “SE#” and “CE#” denote the stitch edge number and the conflict edge number, respectively. From Table II we can see that layout graph simplification and bridge computation are quite effective: on average, the stitch edge number is reduced by 90% and the conflict edge number by 93%. The columns “st#” and “cn#” show the stitch number and the conflict number in the final design, and “CPU(s)” is the computational time in seconds. Compared with the Normal ILP, the Accelerated ILP achieves the same results in around 18% of the runtime. Note that if no acceleration technique is used, the runtime of ILP is unacceptable even for small circuits such as C432. Table II also shows that, on every circuit where the Normal ILP finishes, both ILP formulations reach the same conflict number and stitch number, which is consistent with the acceleration techniques preserving optimality.

Second, we verify the quality and efficiency of our approximation algorithm based on semidefinite programming. Table II also compares the Accelerated ILP and the SDP based algorithm, where “SDP Based” denotes the semidefinite programming based algorithm. Note that these two methods share the same decomposition graph, i.e., the stitch edge number and the conflict edge number in their decomposition graphs are equal. As we can see, using the SDP based method the runtime can be further reduced by 42% and the stitch number can be reduced by 7%. The tradeoff for this acceleration is 9% more conflicts.

B. Efficiency

In order to further evaluate the scalability of our SDP based method, we create four additional benchmarks (C1-C4) to test the two decomposition algorithms on very dense layouts. Table III compares the Accelerated ILP and the SDP based method on these very dense layouts. As we can see, compared with the Accelerated ILP, the SDP based method reduces the stitch number by 10% while introducing 5% more conflicts. Furthermore, the SDP based method achieves a 140× speed-up. The reason for this dramatic acceleration is that, for a low-density layout, the decomposition problem can be divided into many sub-problems, each of which typically contains no more than 20 nodes, whereas for a high-density layout each sub-problem contains more nodes, and on such sub-problems SDP can be much faster than ILP.

TABLE III
COMPARISON ON VERY DENSE LAYOUTS

                     Accelerated ILP         SDP Based
Circuit   SE#  CE#   st#   cn#  CPU(s)   st#   cn#  CPU(s)
C1         16  247     1     5     5.5     0     6    0.29
C2         38  289     0    15   17.32     0    16    0.77
C3         24  381     0    14   33.41     0    15    0.32
C4         56  437     9    32  203.17     9    32    0.49

avg.        -    -   2.5  16.5    64.9  2.25  17.3   0.468
ratio       -    -     1     1       1   0.9  1.05   0.007
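The summary rows of Table III follow from the four benchmark rows by simple averaging. The short sketch below recomputes them; the dictionary layout and variable names are illustrative, while the numbers are copied from the table.

```python
# circuit: (ILP st#, ILP cn#, ILP CPU(s), SDP st#, SDP cn#, SDP CPU(s))
rows = {
    "C1": (1, 5, 5.5, 0, 6, 0.29),
    "C2": (0, 15, 17.32, 0, 16, 0.77),
    "C3": (0, 14, 33.41, 0, 15, 0.32),
    "C4": (9, 32, 203.17, 9, 32, 0.49),
}

# Transpose to columns and average each one.
columns = list(zip(*rows.values()))
ilp_st, ilp_cn, ilp_cpu, sdp_st, sdp_cn, sdp_cpu = (sum(c) / len(c) for c in columns)

stitch_ratio = sdp_st / ilp_st      # SDP stitches relative to Accelerated ILP
conflict_ratio = sdp_cn / ilp_cn    # SDP conflicts relative to Accelerated ILP
speed_up = ilp_cpu / sdp_cpu        # how many times faster SDP is on average

print(round(stitch_ratio, 2))    # 0.9
print(round(conflict_ratio, 2))  # 1.05
print(round(speed_up))           # 139
```

The averaged speed-up comes out at roughly 139×, which the text rounds to 140×. The ratio is dominated by C4, where the Accelerated ILP needs 203.17 s against 0.49 s for the SDP based method.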

TABLE II
RUNTIME AND PERFORMANCE COMPARISONS

                           Normal ILP                       Accelerated ILP                  SDP Based
Circuit  Comp#    SE#    CE#  st#  cn#   CPU(s)     SE#     CE#   st#    cn#  CPU(s)   st#    cn#  CPU(s)
C432       261    300    930    0    1    38.01      32     136     0      1    1.11     0      1    0.26
C499       418    566   2239    0    0    34.15     171     838     0      0    6.34     0      4    1.01
C880       516    844   2219    1    3    25.91       6      24     1      3    0.49     1      3    0.06
C1355      872   1008   2543    0    1    23.27       2      14     0      1    0.12     0      1    0.03
C1908     1132   1332   4480    0    1    48.66       1      14     0      1    0.10     0      1    0.03
C2670     1501   2186   7469    0    4   101.98       6      40     0      4    0.50     0      4    0.10
C3540     1964   3197   9283    2    2  3975.25       9      40     2      2    0.43     2      2    0.11
C5315     2767   4644  13211    5    0   173.86      20      70     5      0    0.70     5      0    0.18
C6288     3740   5319  11394    ?    ?   > 7200     259     509     9     72   26.05     9     72    1.36
C7552     4164   6591  19187    ?    ?   > 7200      64     180    10      6    2.19     7      9    0.46
S1488      588   1932   4284    0    1    73.37      31      54     0      1    0.50     0      1     0.1
S38417    9385  15912  40734    ?    ?   > 7200    1617    2724     3     19   20.56     3     21    9.33
S35932   23565  25252  71198    ?    ?   > 7200    3776    6317     3     18   49.87     3     22   33.55
S38584   24724  28808  74968    ?    ?   > 7200    3657    6066     4     26   44.16     4     26   34.52
S15850   20881  33134  87358    ?    ?   > 7200    3877    6504     6     34   49.18     6     39   35.92

avg.         -   8735  23433    -    -        -   901.8  1568.7  2.87  12.53   13.49  2.67  13.73    7.80
ratio        -      1      1    -    -        -    0.10    0.07     1      1       1  0.93   1.09    0.58

6. CONCLUSION

In this paper, we propose a general integer linear programming (ILP) formulation for the TPL layout decomposition to simultaneously minimize the conflicts and stitches. To improve scalability, we develop three acceleration techniques without losing solution quality: independent component computation, layout graph simplification, and bridge computation. Furthermore, we propose a novel semidefinite programming algorithm to improve scalability for very dense layouts. Experimental results show that our methods are effective: the Accelerated ILP matches the Normal ILP results in around 18% of the runtime, and the SDP based method achieves a 140× speed-up over the Accelerated ILP on very dense layouts. Since this is the first systematic attempt on TPL layout decomposition for general layouts, we expect much follow-up research, as TPL may be adopted by industry in the near future.

REFERENCES

[1] J.-S. Yang and D. Z. Pan, “Overlay aware interconnect and timing variation modeling for double patterning technology,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2008, pp. 488–493.

[2] Y. Borodovsky, “Lithography 2009 overview of opportunities,” in Semicon West, 2009.

[3] V. O. Anton, N. Peter, H. Judy, G. Ronald, and N. Robert, “Pattern split rules! a feasibility study of rule based pitch decomposition for double patterning,” in Proc. of SPIE, 2007.

[4] A. B. Kahng, C.-H. Park, X. Xu, and H. Yao, “Layout decomposition for double patterning lithography,” in ACM/IEEE International Conference on Computer Aided Design (ICCAD), 2008, pp. 465–472.

[5] K. Yuan, J.-S. Yang, and D. Pan, “Double patterning layout decomposition for simultaneous conflict and stitch minimization,” in ACM International Symposium on Physical Design (ISPD), 2009, pp. 107–114.

[6] Y. Xu and C. Chu, “GREMA: graph reduction based efficient mask assignment for double patterning technology,” in ACM/IEEE International Conference on Computer Aided Design (ICCAD), 2009, pp. 601–606.

[7] J.-S. Yang, K. Lu, M. Cho, K. Yuan, and D. Pan, “A new graph-theoretic, multi-objective layout decomposition framework for double patterning lithography,” in IEEE/ACM Asia and South Pacific Design Automation Conference (ASPDAC), 2010.

[8] Y. Xu and C. Chu, “A matching based decomposer for double patterning lithography,” in ACM International Symposium on Physical Design (ISPD), 2010, pp. 121–126.

[9] K. Yuan and D. Pan, “WISDOM: Wire spreading enhanced decomposition of masks in double patterning lithography,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2010, pp. 32–38.

[10] S.-Y. Chen and Y.-W. Chang, “Native-conflict-aware wire perturbation for double patterning technology,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2010, pp. 556–561.

[11] C.-H. Hsu, Y.-W. Chang, and S. R. Nassif, “Simultaneous layout migration and decomposition for double patterning technology,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2009, pp. 595–600.

[12] C. Cork, J.-C. Madre, and L. Barnes, “Comparison of triple-patterning decomposition algorithms using aperiodic tiling patterns,” in Proc. of SPIE, 2008.

[13] Y. Chen, P. Xu, L. Miao, Y. Chen, X. Xu, D. Mao, P. Blanco, C. Bencher, R. Hung, and C. S. Ngai, “Self-aligned triple patterning for continuous IC scaling to half-pitch 15nm,” in Proc. of SPIE, 2011.

[14] B. Mebarki, H. D. Chen, Y. Chen, A. Wang, J. Liang, K. Sapre, T. Mandrekar, X. Chen, P. Xu, P. Blanko, C. Ngai, C. Bencher, and M. Naik, “Innovative self-aligned triple patterning for 1x half pitch using single ‘spacer deposition-spacer etch’ step,” in Proc. of SPIE, 2011.

[15] Q. Li, “NP-completeness result for positive line-by-fill SADP process,” in Proc. of SPIE, 2010.

[16] L. Vandenberghe and S. Boyd, “Semidefinite programming,” SIAM Rev., vol. 38, pp. 49–95, March 1996.

[17] D. Karger, R. Motwani, and M. Sudan, “Approximate graph coloring by semidefinite programming,” J. ACM, vol. 45, pp. 246–265, March 1998.

[18] M. R. Garey, D. S. Johnson, and L. Stockmeyer, “Some simplified NP-complete graph problems,” Theoretical Computer Science, vol. 1, pp. 237–267, 1976.

[19] S. Khanna, N. Linial, and S. Safra, “On the hardness of approximating the chromatic number,” in Proceedings of the 2nd Israel Symposium on the Theory and Computing Systems, Jun. 1993, pp. 250–260.

[20] R. E. Tarjan, “A note on finding the bridges of a graph,” Information Processing Letters, vol. 2, pp. 160–161, 1974.

[21] D. P. Williamson and D. B. Shmoys, The design of approximation algorithms. Cambridge University Press, 2011.

[22] T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to algorithms. Cambridge, MA, USA: MIT Press, 1990.

[23] [Online]. Available: http://www.si2.org/?page=69

[24] [Online]. Available: http://www.coin-or.org/projects/Cbc.xml

[25] B. Borchers, “CSDP, a C library for semidefinite programming,” Optimization Methods and Software, vol. 11, pp. 613–623, 1999.