11

E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

E�cient Electrostatic and Electromagnetic Simulation Using IES3Sharad Kapur David E. LongBell Laboratories, Lucent TechnologiesMurray Hill, NJ 07974AbstractIntegral equation techniques are often used to ex-tract models of integrated circuit structures. Thisextraction involves solving a dense system of linearequations, and using direct solution methods is pro-hibitive for large problems. In this paper, we presentIES3 (pronounced \ice cube"), a fast Integral Equa-tion Solver for three-dimensional problems with arbi-trary kernels. We apply our method to solving elec-trostatic problems and electromagnetic problems inthe electrically small regime (i.e., when circuit struc-tures are at most a wavelength or so in size). Theoverall approach gives O(N logN) complexity, whereN is the number of panels in a discretization of theconductor surfaces.1 IntroductionExtracting compact, accurate linear models for pack-ages, interconnect, and components plays a signi�-cant role in modern Radio Frequency designs. Modelscan be extracted in a variety of ways, but for the highaccuracy that critical sections of RF designs demand,only direct numeric simulation su�ces. At lower fre-quencies (far from resonance), capacitive or induc-tive coupling can be extracted with electrostatic ormagnetostatic simulations. These simulations requiresolving a scalar or vector Laplace equation. At highfrequencies (near resonances), electromagnetic simu-lation must be used. In this case, we solve a vectorHelmholtz equation.Simulation methods can be broadly divided intotwo classes. Methods in the �rst class use di�eren-tial equation formulations. Finite-element (FE) [1],�nite-di�erence (FD) [18], and �nite-di�erence time-domain (FDTD) [21] approaches all fall into thisclass. Methods from the second class use inte-gral equations. The method-of-moments (MoM) ap-proach [4] is based on integral equations. While the

integral integral formulation leads to dense matri-ces, it allows us to apply Green's theorem, reducingvolume integrals to surface integrals. This can re-duce the matrix dimension signi�cantly since the dis-cretization only involves surfaces such as the bound-ary of a conductor or the interface between two di-electrics. For simulations that involve di�cult-to-describe material variations (e.g., the doping pro�leof a MOSFET), the di�erential approach is supe-rior. For typical IC, board, or MCM simulations,where the material variation is simpler, the integralapproach becomes attractive due to the use of sur-face discretizations. In these cases, the integral for-mulation often reduces the problem size by ordersof magnitude. However, the matrix dimension caneasily be many thousands for complex problems. Inthis paper, we present IES3 (\ice cube"), an Inte-gral Equation Solver for three-dimensional problemsinvolving arbitrary kernels, and show that this newsolver has signi�cant performance advantages com-pared to competing approaches. IES3 uses the factthat large parts of the integral operator matrix arenumerically low rank. The singular value decomposi-tion (SVD) is an extremely e�ective tool for the com-pression of rank-de�cient matrices [19]. In IES3, thematrix is recursively partitioned and the submatricesare compressed using the SVD.2 Related ApproachesThe fast multipole method (FMM) [2], while origi-nally developed for particle simulation problems, canbe combined with iterative techniques to solve thedense integral equation matrices that arise from adiscretized Laplace equation. Parameter extractionprograms such as FastCap [12] and FastHenry [5] usethe FMM for accelerating the dense matrix-vectorproducts required by an iterative solver. Because theFMM is tailored to the 1=jjx � x0jj kernel, dealingPage 1

Page 2: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

with common situations such as layered dielectricsis di�cult. For example, in FastCap this requiresdiscretizing the dielectric boundaries and introducingextra unknowns for the polarization charge [13, 15].This can result in a large increase in problem size.An extractor based on IES3 has no such di�culty: itsimply incorporates the variation in dielectrics intothe Green's function.Surprisingly, even for extraction problems that onlyinvolve the Laplace kernel, IES3 is faster and morememory-e�cient than the fast multipole implemen-tation in FastCap, as we show in Section 5. This isbecause IES3 provides greater compression than theFMM, though at the cost of a more expensive prepro-cessing phase. In extraction problems, the numberof matrix-vector multiplies is typically large; hencethe added compression provided by IES3 more thanmakes up for the FMM's smaller startup cost. In con-trast, in a particle simulation environment where thegeometry (and hence the matrix) changes after eachmultiply, the FMM has a de�nite advantage [2].We did not have access to the Precorrected-FFTversion of FastCap [14], but based on the relative per-formance of both approaches to the FMM version ofFastCap, we believe that our approach would still bevery competitive. Also note that there have been re-cent advances in the FMM algorithms [3], but theyhave not yet been incorporated into extraction pro-grams such as FastCap.3 FormulationWe assume that the extraction problem has been ex-pressed as an integral equation, such as the �rst-kindequation �(x) = ZR0G(x; x0)�(x0) dR0: (1)Here � is the known \right-hand side" (for histori-cal reasons, written on the left), � is the unknown tobe solved for, R is the region of integration and G isthe kernel (or Green's function). For example, in thestandard problem of capacitance extraction in threedimensions, � is the potential, � is the surface chargedensity, R ranges over the surfaces of conductors, andG is the free-space Green's function 1=(4��0jjx�x0jj).To solve such a problem numerically, we discretize theregion and reduce (1) to a matrix equation. In engi-neering applications, �rst-order collocation schemesare commonly used. The domain R is subdividedinto regions fR1; R2; : : : ; RNg, a collocation point xjis chosen in each region Rj , and � is assumed to bepiecewise constant over each region. Then (1) reduces

to the following set of equations:�(xi) = NXj=1�ZR0jG(xi; x0) dR0j��(xj) 1 � i � N:(2)If we de�ne the dense matrix A = faijg byaij = ZR0jG(xi; x0) dR0j ; 1 � i; j � N: (3)then the system (2) is equivalent to the matrix equa-tion � = A�. We emphasize that IES3 does notassume any particular method of discretization andonly requires that the Green's function be smooth inthe far �eld. For example, in the IES3-based simu-lators that we have written, layered media are han-dled directly in the Green's function [9, 11, 23], andour capacitance extractor actually uses a higher-orderNystr�om method [10] for discretization. For the re-mainder of this paper, A will denote the N �N ma-trix (3) associated with the integral equation. Fornotational convenience, we assume that N is a powerof 2.4 Rapid matrix solutionDirect solution of the linear system A� = � via Gaus-sian elimination requires O(N2) storage and O(N3)time and is impractical for large problems. Fortu-nately, typical systems arising from integral equationsare well conditioned and can be solved by Krylov-subspace iterative schemes such as Conjugate Gra-dient [19] or GMRES [16]. Iterative solvers requireapplication of the matrix A to a sequence of recur-sively generated vectors. The dominant costs becomethe O(N2) time and space required for constructingand storing the matrix and the O(N2) time requiredfor each matrix-vector product. IES3 reduces each ofthese costs to O(N logN) for typical problems.4.1 A simple exampleThe idea behind IES3 is to exploit the fact that typ-ical Green's functions vary smoothly with distance.Consider the situation depicted in Figure 1. Supposethat X = fx1; x2; : : : ; xng and Y = fy1; y2; : : : ; yngare two sets of points in R3 separated by a distance d.Let H(x; y) be the function giving the interaction be-tween the points. Then the in uence of Y on X is canbe computed by multiplication of the n � n matrixB = fbijg = fH(xi; yj)g with a vector. Direct eval-uation of this product would require n2 operations.However, if d is relatively large, and the function HPage 2

Page 3: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

is smooth, then the numerical rank of the matrix Bis small compared to n. Intuitively the in uence dueto a point yj is very similar to the in uence of itsneighbors. As a result, the columns of B are nearlylinearly dependent.X Ydxi yjFigure 1: Well-separated regionsWe take advantage of this fact using the SingularValue Decomposition (SVD) [19]. The SVD of B isgiven by B = USWwhere UU� =W �W = I;and S = diag(s1; : : : ; sn) with s1 � s2 � � � � � sn �0. The numerical rank of a matrix to a precision � isthe integer r such that sr=s1 < �. Clearly, if B is ofrank r then it can be approximated as1~B = U(:; 1:r)S(1:r; 1:r)W (1:r; :)with jjB � ~Bjj < �:Given the SVD of the matrix B of rank r, a matrix-vector product can be computed in (2n+ 1)r opera-tions by multiplying the vector in turn by W (1:r; :),S(1:r; 1:r), and U(:; 1:r). When r � n it is muchmore e�cient to multiply with B in this manner.Use of the SVD to represent B requires no priorknowledge of the function H , in contrast to the mul-tipole approach which is based on a Taylor series ap-proximation of H . This added exibility does not re-duce the compression that is achieved. In fact, as ournumerical experiments in Section 5 show, we obtaingreater compression than the multipole schemes.Even though we use the SVD for the theoreticaldevelopment of our algorithm, it is actually a bit moree�cient to use the Gram-Schmidt process [19]. Givenan n�nmatrix B and a user speci�ed tolerance �, the1We use Matlab notation for submatrices. U(:; 1:r) denotesthe submatrix consisting of columns 1 through r of U . Simi-larly, W (1:r; :) denotes the submatrix consisting of the �rst rrows of W .

Gram-Schmidt process computes the numerical rankr of B and an n�r orthogonal matrix U which spansthe column space of B. B can be approximated asUV where V = U�B. Decomposing B in this way isfaster than computing the SVD of B.4.2 OrderingTo compress the integral operator matrix A in (3),which represents both near and far interactions, weneed to identify large submatrices that correspondto interactions between well-separated regions. Insome situations, this is easy. For example, in a one-dimensional problem, the natural ordering of pointsby their position in space is a good one. In generalthough, the problem is harder. Our goal is to �nda permutation matrix P such that ~A = P�1AP ishighly compressible. Ideally, all strong interactionsshould be close to the diagonal in ~A, as in Figure 3.For two- and three-dimensional problems there is gen-erally no permutation P that can completely accom-plish this. Nevertheless, the following heuristic basedon recursive subdivision yields excellent results andrequires only O(N logN) time. We assume that thereis a spatial coordinate associated with each element inthe discretization. Under this assumption, the prob-lem reduces to one of ordering a set of points suchthat points that are close in space are also close inthe ordering. For notational simplicity, we presentthe algorithm for the two dimensional case.1. If the current set of points S is the singletonf(x; y)g then return the sequence h(x; y)i.2. Otherwise compute the bounding box for thepoints in S. We assume without loss of gener-ality that the longest dimension corresponds tothe x-axis. Equally divide S into two disjointsets S1 and S2 such that x1 � x2 for all points(x1; y1) 2 S1 and (x2; y2) 2 S2. Recursively or-der S1 and S2 and concatenate the results to ob-tain the ordering for S.The rank map for a three-dimensional problem af-ter ordering with this technique is shown in Figure 3.The rank map shows the partitioning of the matrixA into submatrices and the rank of each submatrix.The structure of this rank map is typical in problemswhere the storage requirements drop from O(N2) toO(N logN). This problem corresponds to capaci-tance extraction for the structure of Figure 2 [13]. Al-though there are some strong interactions o� the di-agonal, most o�-diagonal blocks are indeed low rank.For the remainder of the paper, we will assume thatall matrices have been ordered in this fashion.Page 3

Page 4: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

Figure 2: The backplane connector example from [13]17

2016

13 6

6 9

9 66

56

88

88 8

6

8 45 5

6 9 56

4 3 6

4 4 5

8

7

9

7 6

57 4

35

4 4

5 66

5 4 7 6

66 14

176

6

6 4

19 57

4 453 4 4

7 46

5 6

6 445 5

5

116 4

45 5

5 58 6

5 412

6

5 56

5 4

13 8

8 13

164

5 33

4 4

76

66

4 7

3 76 5

5 46

11

45 5

5 4

6 5 6

4 5

44

4 3

4 4

66 5 5

4 5

13 8

8 13

21

18

48 4

45 4

6 12

9 7

694 4

96

6

16

768 7

13 455 7 4

4 4 554 5 6

5

77

6 7

54

5 45

4 49 76 6

5 58

4 4 6 7

6

6

15

16

6

6Figure 3: The rank map for the backplane connectorexample (discretized into N = 10672 panels)

4.3 Matrix compressionConstruction of the compressed representation pro-ceeds as follows.1. If the dimension n of the current submatrix Bis less than b, where b is a small �xed number,then:(a) Use the Gram-Schmidt procedure to obtainan orthonormal column basis to a user spec-i�ed precision �. If the size of the basis r isgreater than n=2 then B is simply stored asa dense matrix.(b) Otherwise let U be the n� r matrix whosecolumns are the basis. Compute V = U�B.2. If n is at least b and a statically-determinedrank map indicates that B should be low rank,then sample B and construct the UV decompo-sition using interpolation, as discussed in Sub-section 4.4.3. Otherwise, construct the representationof the four submatrices B(1:n=2; 1:n=2),B(1:n=2; n=2 + 1:n), B(n=2 + 1:n; 1:n=2), andB(n=2+1:n; n=2+1:n). If these child submatri-ces are all low rank, i.e., are represented as UVdecompositions, then(a) Construct a decomposition UV for B bymerging (as described below).(b) If the size of this decomposition is smallerthan the sum of the sizes of the child sub-matrix representations then represent B byU and V . Otherwise represent B by thechild submatrices.4. If some submatrix is not low rank, then representB by its four children.The merging process constructs a decompositionU12V12 for the matrix �U1V1U2V2 �where U1; V1; U2, and V2 have dimensions n�r1, r1�n, n� r2, and r2 � n respectively, and where U1 andU2 are both orthogonal. We run the Gram-Schmidtprocedure on the rows of V1 and V2 to obtain a rowbasis of size r. V12 is the r�n matrix whose rows arethis basis. The 2n� r matrix U12 is given by�U1V1V �U2V2V � � :Page 4

Page 5: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

U1V1 U3V3U2V2 U4V4 - U12V12 U34V34 - B = UVFigure 4: The merging process.Note that the matrix V12 generated by this proce-dure is orthogonal. Three merges (two vertical andone horizontal) yield the UV representation of B inStep 3a above. As a result of the �nal horizontalmerge, U is orthogonal. The process is illustratedin Figure 4. Note that directly using the Gram-Schmidt process to construct the UV decompositionof B would require O(n2r) operations. In contrast,merging requires only O(nr2) steps. For rank de�-cient matrices (r � n) merging is much more e�-cient.Computing a matrix-vector product with the �nalcompressed representation of is done with an obvi-ous recursive algorithm. Empirically, the compressionobtained is much higher than that achieved by themultipole algorithms. This leads to a faster matrix-vector multiply. However, the construction overheadusing IES3 is higher than with the FMM. In situa-tions where the construction cost can be amortizedover a large number of solves, the UV representationis more e�cient.4.4 Constructing UV representationswith interpolationConsider constructing a UV decomposition of the in-teraction matrix B for the situation shown in Fig-ure 5. Let SX and SY be sets of p sampling pointsin X and Y respectively. The idea for reconstructingB is to use the in uence of SY on X to construct acolumn basis for B. Then the in uence of Y on SX isused to express each column of B in terms of this ba-sis. The columns of B that have not been completelysampled (which correspond to points in Y � SY ) areinterpolated based on the sampling SX of rows. Morespeci�cally, the in uence due to any point in Y on Xis approximated by a linear combination of the in u-ences of the sampling points SY . Hence if C is then�p matrix of columns of B corresponding to pointsin SY , then a column basis for C should also span thecolumns of B (the validity of this is discussed below).De�ne U to be the n� r matrix obtained by runningthe Gram-Schmidt process on the columns of C. Toget the coe�cients of the columns of B in the basis

represented by U , we use the sampling points SX . LetR be the p� n matrix of rows of B corresponding topoints in SX and let ~U be the p� r matrix of rows ofU corresponding to these same points. Finally, de�neV to be the solution (in the least squares sense) to~UV = R. Then B is approximated by UV .X Ydxi yjSX SYFigure 5: Well-separated regions with samplingpointsClearly, obtaining U from C instead of B is onlyjusti�ed when there are su�cient sampling points toreconstruct the interaction. The issue is akin to in-terpolation of an unknown function. When the func-tion is smooth (in the sense that the derivatives overthe interval have a known bound), the accuracy ofthe interpolation can be rigorously controlled by thenumber and location of sampling points [19]. How-ever, when designing a general-purpose algorithm, re-quiring bounds on the derivatives of the function isusually very conservative. Empirically, given the as-sumptions that the function is smooth, the sets arewell separated, and the sampling is reasonably uni-form, the interpolation can be made arbitrarily ac-curate. Hence, instead of derivative information, ouralgorithm uses a statically-determined map which in-dicates the low-rank regions of the matrix A and in-formation about appropriate sampling points withineach such region. We provide default methods thatcompute this information based purely on the spatialcoordinates of elements. For typical kernels these de-fault routines su�ce. As an example, a static mapof the interactions for the backplane connector men-tioned in Subsection 4.2 is shown in Figure 6. Fig-ure 3 shows the rank map for the same example; notethe close correspondence between the two. Also notethat the static map is conservative, in that many ofthe small submatrices shown on the static map havemerged into larger low-rank matrices in Figure 3.For a (statically-identi�ed low-rank) submatrix Bof size n � n and rank r, the algorithm takes onlyO(r2n) operations to construct U and V . As a result,if the statically-determined rank map has O(N logN)size, the total time to construct the compressed rep-

Page 5

Page 6: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

Figure 6: The static sampling map for the backplaneconnector exampleresentation will be O(r2N logN).5 Experimental resultsIn this section, we present experimental results forIES3. The capacitance extraction problems were runon an SGI machine (194 MHz R10000 CPU). Theelectromagnetic simulation was run on a Sun UltraEnterprise (248 MHz UltraSPARC-II CPU).5.1 Comparison to FastCapWe begin by comparing IES3 [6] to the FMM-basedFastCap program (version 2) [12, 13]. To make afair comparison, we modi�ed FastCap to use our al-gorithm within FastCap's iterative solver. The dis-cretization, matrix formulation, and preconditioningwere unchanged. The relative tolerance was set togive the same accuracy as FastCap's default two-termmultipole expansions. Results for a number of thestandard examples supplied with FastCap are pre-sented in Table 1. For FastCap we computed thememory requirements by subtracting the precondi-tioner memory and the \miscellaneous" memory fromthe total memory use reported. Error is determinedby running the same examples at a much higher ac-curacy and computing jjC� ~Cjj=jjCjj where C and ~Care the high- and low-accuracy capacitance matrices,respectively. For the RAM cell example, the defaulttolerances only yielded one digit of accuracy for both

methods. The example was run with tighter toler-ances for both.The �rst two examples in the table involve con-ductors in free space, while the last two also in-clude dielectric interfaces. For examples with di-electrics, FastCap uses a �nite-di�erence approxima-tion to compute normal components of the electric�eld at the dielectric interfaces. This computationis done after the potential is evaluated, since incor-porating it directly would require substantial alter-ations to the multipole algorithm (as discussed in theappendix of [13]). Because IES3 is not restricted toany speci�c kernel, we chose to incorporate the same�nite-di�erence computation directly into the matrix.We could as easily have analytically di�erentiated thepotential to avoid the inaccuracies inherent in �nite-di�erencing.The memory required by IES3 can be as little asone-�fth that required by FastCap. As a consequence,IES3 is much faster than FastCap for larger problemsand those involving many conductors. Our startupoverhead is still substantial for small examples suchas the via structure. For medium-sized examples suchas the backplane connector, IES3 is almost two andone-half times faster.5.2 Higher-order discretizationsFirst-order collocation methods are easy to imple-ment and give satisfactory results for simple prob-lems. However, for structures with many �ne fea-tures, getting accurate results requires extremelylarge discretizations. This results in large compu-tation times even with fast solution methods. Inthis subsection, we demonstrate a higher-order capac-itance extractor based on the Nystr�om method [10].The Nystr�om method represents functions � and �in the integral equation (1) by their values a selectedpoints fr1; r2; : : : ; rng on R and replaces the integralby a quadrature at those points. We obtain the linearsystem nXj=1Aij�(rj ) = �(ri); i = 1; : : : ; nwhere Aij = wijG(ri; rj) is a matrix and fwijg arequadrature weights that are constructed to integratethe Green's function kernel G convolved with low-order polynomial basis functions [8, 20].The example that we consider is an interdigitatedcapacitor fabricated in CMOS. Figure 7 is a (coarse)mesh of the capacitor produced using a surface meshgenerator [17]. Two sets of IES3-based simulationsPage 6

Page 7: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

FastCap IES3Example Panels Time (sec) Mem. (MB) Error Time Mem. Error5x5 bus crossing 7360 185 66 3e-3 105 26 6e-3via structure 6126 45 37 1e-3 60 22 4e-3backplane connector 10672 675 165 4e-3 280 45 3e-3RAM cell 6129 480 135 1e-2 195 26 1e-2Table 1: Comparison of IES3 to the FMM-based scheme in FastCapwere run. The �rst set used simple �rst-order collo-cation while the second used a third-order Nystr�omscheme.

Figure 7: Interdigitated capacitorSince we could not run the �rst-order scheme toconvergence on the full capacitor, we �rst consider aversion with only a few �ngers. Convergence of themutual capacitance under both schemes is shown inFigure 10. The Nystr�om scheme shows a slight in-crease in capacitance with increasing discretization,presumably due to the fact that edge and corner so-lution singularities are not re ected in the quadra-

ture. However, the Nystr�om approach at the lowestdiscretization gives much better accuracy than thecollocation scheme at the highest discretization. Fur-ther, the e�ciency of the two schemes is similar: atten thousand points, the Nystr�om version takes 147seconds while the collocation scheme requires 161.On other examples, the Nystr�om scheme is somewhatslower, but the ratio in times is generally less than afactor of two. Thus, the superior convergence avail-able with the higher-order scheme comes at very littlecost. The scaling of time and memory requirementswith problem size is shown in Figures 8 and 9. Forboth collocation and the Nystr�om method, the re-source requirements grow slightly faster than linearly.For the full capacitor, the Nystr�om-based solver us-ing 38,142 points took just over an hour of CPU timeand predicted a capacitance of 252 fF. The measuredvalue was 241 fF, a di�erence of �ve percent.

102

103

104

105

106

100

101

102

103

104

Points

CP

U s

econ

ds

Simulation Time

3rd order Nystrom 1st order collocationFigure 8: Time requirements for the interdigitatedcapacitor

Page 7

Page 8: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

102

103

104

105

10−1

100

101

102

Points

Mill

ions

of n

onze

ros

Simulation Memory

3rd order Nystrom 1st order collocation

Figure 9: Memory requirements for the interdigitatedcapacitor

103

104

105

1.58

1.6

1.62

1.64

1.66

1.68

1.7x 10

−15

Points

Cap

acita

nce

Convergence Behavior

3rd order Nystrom 1st order collocation

Figure 10: Convergence of mutual capacitance for theinterdigitated capacitor example

5.3 Electromagnetic simulationIn this subsection, we present results from a 2.5D elec-tromagnetic simulator [7] based on IES3. The simu-lator uses a �rst-order collocation scheme with basisfunctions that decompose the unknown surface cur-rent into curl-free and divergence-free parts. This de-composition ensures a well-conditioned system evenfor electrically small structures [22]. The examplethat we consider is an inductor on a seven layer MCMsubstrate. The layout is shown in Figure 11. We ex-amined the performance of the simulator as the dis-cretization level was increased. The results are showin Figure 12. These simulations were at a single fre-quency of 1 GHz. Discretization sizes ranged from988 elements to 10,228 elements, yielding matrix sizesof between 1200 and 14,000 (the di�erence is the num-ber of divergence-free basis functions). The graphshows the near-linear growth in total CPU time withproblem size. Almost all of the time goes into buildingthe compressed matrix representation. The iterativesolve, which is preconditioned using a sparse matrixthat captures only the nearby interactions, requiredno more than thirty iterations at any discretizationlevel to achieve a relative reduction of 104 in the normof the residual. The smallest simulation took about90 seconds, and the largest required about thirty min-utes. The total simulation times compare favorably tothe time required for extracting only the parasitic ca-pacitance of the same inductor. The electromagneticsimulation is typically �ve times slower than the elec-trostatic simulation. Note that in the electromagneticcase, a triangle with three neighbors requires four in-tegrals: one for computing the scalar potential, andone for each edge when computing the vector poten-tial. Since each vector potential integral also involvesmanipulating vectors rather than scalars, the factorof �ve di�erence in time is about the best that canbe expected. Comparisons between the simulation re-sults and measurement are shown in Figure 13. Thegraphs show the magnitude and phase of the com-plex impedance. The density of current ow nearresonance is shown in Figure 14.6 ConclusionsIn this paper we presented IES3, a fast general-purpose integral equation solver. The algorithm usesa combination of SVD-based compression and an it-erative solver. The compressed representation of theintegral operator matrix is constructed in O(N logN)time by combining a recursive partitioning strategywith interpolation of the function that de�nes the ma-Page 8

Page 9: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

Figure 11: Layout of the two metal layer MCM in-ductor.

103

104

102

103

System size

CP

U s

econ

ds

Simulation Time

Figure 12: Time requirements for the EM solver

108

109

1010

1011

100

101

102

103

104

105

Impedance

Frequency(Hz)

Ohm

s

108

109

1010

1011

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2Phase

Frequency

Rad

ians

Figure 13: Comparison of simulation (circles) to mea-surement (lines)

Figure 14: Density of current ow in the inductornear resonance.Page 9

Page 10: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

trix entries. Unlike the kernel-speci�c multipole algo-rithms, IES3 does not depend on the particular formof the Green's function. Our extraction tools take ad-vantage of IES3's kernel independence by incorporat-ing dielectric variations directly in the Green's func-tion. Further, our scheme gives higher compressioneven for those kernels for which the multipole meth-ods were developed. We demonstrated IES3 on three-dimensional capacitance extraction problems usingboth collocation and higher-order Nystr�om methods.We also gave results for an IES3-based electromag-netic solver.AcknowledgementsBob Frye and Pete Smith provided the descriptionand measurements for the inductor example. PeterKinget provided description and measurement for theinter-digititated capacitor example.References[1] R. L. Ferrari and P. P. Silvester. Finite Elementsfor Electrical Engineers. Cambridge UniversityPress, 1996.[2] L. Greengard and V. Rokhlin. A fast algorithmfor particle simulations. Journal of Computa-tional Physics, 73(2):325{348, December 1987.[3] L. Greengard and V. Rokhlin. A new versionof the fast multipole method for the Laplaceequation in three dimensions. Technical Re-port YALEU/DCS/RR-1115, Yale UniversityDepartment of Computer Science, September1996.[4] R. F. Harrington. Field Computation by MomentMethods. IEEE Press, New York, 1991.[5] M. Kamon, M. J. Tsuk, and J.White. FAS-THENRY: A multipole-accelerated 3-D induc-tance extraction program. IEEE Transac-tions on Microwave Theory and Techniques,42(9):1750{1758, September 1994.[6] S. Kapur and D. E. Long. IES3: A fast inte-gral equation solver for e�cient 3-dimensionalextraction. In 37th International Conference onComputer Aided Design, Nov 1997.[7] S. Kapur, D. E. Long, and J. Zhao. E�cientfull-wave simulation in layered, lossy media. InProceedings of the IEEE Custom Integrated Cir-cuits Conference, May 1998.

[8] S. Kapur and V. Rokhlin. High-order correctedtrapezoidal rules for singular functions. SIAMJournal of Numerical Analysis, 34(4):1331{1356,August 1997.[9] K. A. Michalski and J. R. Mosig. Multilayeredmedia Green's functions in integral equation for-mulation. IEEE Transaction on Antenna andPropagation, 45(3):508{519, March 1997.[10] S. G. Michlin. Integral Equations and their Ap-plications to Certain Problems in Mechanics,Mathematical Physics, and Technology. Perga-mon Press, New York, 1957.[11] J. R. Mosig and F. E. Gardiol. Analytic and nu-merical techniques in the Green's function treat-ment of microstrip antennas and scatterers. InInstitute of Electrical Engineers Proceedings, vol-ume 130, pages 175{182, March 1983. Pt. H.[12] K. Nabors and J. White. Fastcap: A multipole-accelerated 3-D capacitance extraction program.IEEE Transactions on Computer-Aided Design,10(10):1447{1459, November 1991.[13] K. Nabors and J. White. Multipole-acceleratedcapacitance extraction for 3-D structures withmultiple dielectrics. IEEE Transactions on Cir-cuits and Systems, 39(11):946{954, November1992.[14] J. R. Philips and J. White. A precorrected-FFT method for capacitance extraction of com-plicated 3-D structures. In Proceedings of the In-ternational Conference on Computer-Aided De-sign, November 1994.[15] S. M. Rao, T. K. Sarkar, and R. F. Harrington.The electrostatic �eld of conducting bodies inmultiple dielectric media. IEEE Transactions onMicrowave Theory and Techniques, 32(11):1441{1448, November 1984.[16] Y. Saad and M. H. Shultz. GMRES: A general-ized minimal residual algorithm for solving non-symmetric linear systems. SIAM Journal on Sci-enti�c and Statistical Computing, 7(3):856{869,July 1986.[17] J. R. Shewchuk. Triangle: Engineering a 2Dquality mesh generator and Delaunay triangu-lator. In First Workshop on Applied Computa-tional Geometry, pages 124{133. Association forComputing Machinery, May 1996.Page 10

Page 11: E cien - Integrand Software · 2003-12-18 · (FE) [1], nite-di erence (FD) [18], and time-domain (FDTD) [21] approac hes all fall in to this class. Metho ds from the second class

[18] G. D. Smith. Numerical Solution of Partial Dif-ferential Equations: Finite Di�erence Methods.Oxford University Press, 1986.[19] J. Stoer and R. Bulirsch. Introduction to Numer-ical Analysis. Springer-Verlag, New York, 1979.[20] J. Strain. Locally-corrected multidimensionalquadrature rules for singular functions. SIAMJournal of Scienti�c Computing, 16:992{1017,1995.[21] A. Ta ove. Computational Electrodynamics:The Finite-Di�erence Time-Domain Method.Artech House, 1995.[22] D. R. Wilton and A. W. Glisson. On improvingthe stability of the electric �eld integral equationat low frequency. In Proceedings of the IEEEAP-S National Symposium, pages 124{133, LosAngeles, June 1981. IEEE.[23] J. Zhao, S. Kapur, D. E. Long, and W. W.-M.Dai. E�cient three-dimensional extraction basedon static and full-wave layered Green's functions.In Proceedings of the 35th Design AutomationConference, June 1998.

Page 11