EMX: a commercial full-wave 3D electromagnetic simulator
Sharad Kapur and David E. Long
IMA workshop: Integral equation methods, fast algorithms and applications, Aug 2010
www.integrandsoftware.com
Outline
Introduction to EMX
• 3D, full-wave, volume integral equation EM solver
Application areas
• High speed ICs, IPD substrates, packaging
3D Formulation
Numerical techniques
• Layered media Green's functions
• Iterative solver
• 3D FMM
• Adaptive frequency sweep
• Multi-threading
Experiments
Measurement
Application areas
Mobile (Smart phones)
Wireless (WiFi, WiMax)
Wired (Ethernet)
Storage (Hard disks)
Frequencies
• DC to 60GHz
• Usually full circuits are less than 2mm in size
Electrically small structures, (fraction of a wavelength)
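A quick check of the "fraction of a wavelength" claim. This is a minimal sketch: the relative permittivity of 4.0 is an assumed, oxide-like value and is not taken from the slides.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def wavelength_m(freq_hz, eps_r=1.0):
    """Wavelength in a lossless, non-magnetic medium with relative permittivity eps_r."""
    return C0 / (freq_hz * math.sqrt(eps_r))

def electrical_size(length_m, freq_hz, eps_r=1.0):
    """Structure length expressed as a fraction of the wavelength."""
    return length_m / wavelength_m(freq_hz, eps_r)

if __name__ == "__main__":
    size = 2e-3  # 2 mm, the circuit size bound quoted above
    for f in (1e9, 10e9, 60e9):
        # eps_r = 4.0 is an assumption for illustration only
        frac = electrical_size(size, f, eps_r=4.0)
        print(f"{f / 1e9:5.0f} GHz: {frac:.3f} wavelengths")
```

Even at the top of the 60GHz range, a 2mm circuit stays below one wavelength under this assumption.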
Passive components
inductor ind + shield
MOM cap Transformer
Full circuits
Full circuit1
Diplexer2
VCO3: inductors +
interconnect+
capacitor bank
1. SiGe Semiconductor on IBM BiCMOS
2. STATS ChipPAC IPD technology
3. Wipro on TSMC 90nm
Technology
65nm technology node
40 dielectric layers
Layered lossy substrate
• lossless to 1000 S/m
10 layers of metal
• 0.1um width/thickness for interconnect metals
• Special thin film capacitors
• Ultra-thick (3um) top metals for low-loss components like inductors
Accuracy considerations
Final solution accuracy required is about 1%
All circuits and devices are built to operate within 3 sigma variation of about 10%
The solver formulation is optimized for this “3 significant digit” accuracy realm
The physics captures reality very well. Benchmark is always comparison to measurement.
[Figure: Current]
Influencing work
Drawn from the best numerical techniques of the last decades
FMM
• Yale, NYU (Rokhlin, Greengard)
Fast Solvers
• MIT (White and students), Bell Labs (Kapur, Long).
Green's functions and Hankel transforms
• Text books: J.Mosig, W.Chew
Krylov Subspace Iterative solvers
• Yale, Bell Labs
Reduced order modeling
• CMU, Bell Labs, MIT
3D Formulation
3D volume integral formulation
Mixed-potential Integral Equation (MPIE)
Time harmonic
Electric field
• Ohmic loss
• Vector Potential
• Scalar Potential
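The slide's equation did not survive extraction. For orientation only, a common textbook form of the time-harmonic MPIE with the three listed terms is sketched below. The symbols are assumptions rather than the authors' notation: an e^{jωt} convention, conductivity σ, a dyadic vector-potential Green's function (with the permeability absorbed into it), and a scalar-potential Green's function G_φ. Current J and charge ρ are additionally linked by charge conservation.

```latex
\underbrace{\frac{\mathbf{J}(\mathbf{r})}{\sigma}}_{\text{ohmic loss}}
  + \underbrace{j\omega\,\mathbf{A}(\mathbf{r})}_{\text{vector potential}}
  + \underbrace{\nabla\phi(\mathbf{r})}_{\text{scalar potential}}
  = \mathbf{E}^{\mathrm{inc}}(\mathbf{r}),
\qquad
\mathbf{A}(\mathbf{r}) = \int_{V} \overline{\mathbf{G}}_{A}(\mathbf{r},\mathbf{r}')\,
  \mathbf{J}(\mathbf{r}')\,dv',
\qquad
\phi(\mathbf{r}) = \int_{S} G_{\phi}(\mathbf{r},\mathbf{r}')\,
  \rho(\mathbf{r}')\,ds'
```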
Discretization
3D volume formulation
Charge:
• Discretized into triangular and rectangular surface charge elements
Currents
• Rectangular and triangular prisms that carry volume currents.
• Current vectors can be in x-y-z directions
Galerkin method used
3D Basis functions
Rao-Wilton-Glisson (RWG) basis functions
2D picture, has 3D analog
Linear “roof top” functions
To avoid low-frequency ill-conditioning, basis functions are decomposed into curl-free and divergence-free parts
At low frequencies the problem decomposes into electrostatic and magnetostatic problems
Finding loops (J.D. Horton '87): linear time for short cycles
Divergence free basis function
Curl free basis function
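A minimal sketch of how independent loops (candidates for divergence-free basis functions) can be found on a mesh connectivity graph. This is not Horton's algorithm: it is the simpler fundamental-cycle construction from a BFS spanning tree, and the graph representation (integer nodes, undirected edges, no repeated edges) is an assumption for illustration.

```python
from collections import deque

def fundamental_cycles(num_nodes, edges):
    """Return one cycle (list of nodes) per non-tree edge of a simple undirected graph."""
    adj = {n: [] for n in range(num_nodes)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    parent, depth, tree = {}, {}, set()
    for root in range(num_nodes):          # handle disconnected pieces too
        if root in parent:
            continue
        parent[root], depth[root] = None, 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent:
                    parent[v], depth[v] = u, depth[u] + 1
                    tree.add(frozenset((u, v)))
                    queue.append(v)

    cycles = []
    for u, v in edges:
        if frozenset((u, v)) in tree:
            continue
        # Walk both endpoints up the tree until they meet at a common ancestor.
        left, right = [u], [v]
        a, b = u, v
        while a != b:
            if depth[a] >= depth[b]:
                a = parent[a]
                left.append(a)
            else:
                b = parent[b]
                right.append(b)
        # left ends at the ancestor; append right's path back down, ancestor excluded.
        cycles.append(left + right[-2::-1])
    return cycles

# Square with one diagonal: 4 nodes, 5 edges, so 5 - 4 + 1 = 2 independent loops.
square = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
loops = fundamental_cycles(4, square)
```

For a connected graph the number of independent loops is E - V + 1, which is what the spanning-tree construction returns.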
Matrix formulation
V is an r x b sparse matrix
• r is number of roof tops
• b is number of basis functions
S is a t x b sparse matrix
• t is number of surface charge elements
• b is number of basis functions
(continuous form)
(discrete matrix form)
Solving the linear system
Solve system at a set of frequencies
To solve the linear system we use a Block Krylov-subspace solver (similar to GMRES).
The matrices are applied using a new 3D FMM
The dominant cost of the algorithm is the representation and application of the matrix to a sequence of right-hand-sides
Every step is O(N) (with a large constant)
Lowering the constant by several techniques (some of them domain specific)
Calculating Green’s functions
Layered media Green's functions
Green's functions calculated on the fly without lookup tables
Use the “transmission line equivalence” method to represent a lossy, layered dielectric in “spectral domain”
Use a modified Fast Hankel Transform to convert to “space domain” Green's functions
Messy and tedious numerics
Layout regularity
Layout is regular
1. Wires are paths of constant width.
2. Distance between adjacent routing is constant; routing is at 45 or 90 degrees
3. Components, spiral inductors, capacitors, are symmetric
Layout “space” is actually a very small subset of all possible routing
Exploiting regularity
Mesh generation was regarded as an orthogonal subproblem (typically unstructured Delaunay triangulation)
Layout has a lot of structure
This structure can be imposed on the mesh
Identical interactions are repeated all over
Few unstructured “left over” regions are a small part of the mesh
Algorithm for creating regular meshes
Wire recognition algorithm was developed
Sweep through the layout identifying wires
Grey regions are identified wires
Once the wires are identified
A mesh is created from a small set of canonical shapes
Exploiting the regularity
Embedded in the FMM
Direct interactions represented by a relatively sparse matrix
Lot of structure in the sparse matrix with identical entries
Substantially more compact representation
• Reduction in time for matrix construction (integral time)
• Reduction in storage
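A toy illustration of why a regular mesh makes the direct-interaction matrix compact. It assumes identical cells on a uniform grid within one layer, so an interaction depends only on the relative offset between cells. `placeholder_integral` is a hypothetical stand-in for an expensive panel-to-panel integral, not an EMX routine.

```python
import math

def placeholder_integral(dx, dy):
    """Hypothetical stand-in for an expensive panel-to-panel integral."""
    return 1.0 / (1.0 + math.hypot(dx, dy))

def build_near_field(nx, ny, reach=1):
    """Near-field entries on an nx-by-ny grid of identical cells.

    Returns (entries, cache): entries maps each (source, observer) pair to a
    key into cache, and cache holds one computed integral per distinct offset.
    """
    cache, entries = {}, {}
    for ix in range(nx):
        for iy in range(ny):
            for dx in range(-reach, reach + 1):
                for dy in range(-reach, reach + 1):
                    jx, jy = ix + dx, iy + dy
                    if 0 <= jx < nx and 0 <= jy < ny:
                        key = (dx, dy)
                        if key not in cache:      # integrate once per offset
                            cache[key] = placeholder_integral(dx, dy)
                        entries[((ix, iy), (jx, jy))] = key
    return entries, cache

entries, cache = build_near_field(10, 10)
```

On this 10 x 10 grid there are 784 near-field entries but only 9 distinct integrals, which is the kind of saving in both integration time and storage that the slide refers to.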
Approximating the vector potential
Vector potential term is dominant cost
With RWG basis functions
• 3 roof tops for each triangle
• 4 roof tops for each rectangle
• Between two shapes need to compute 9-16 interactions (roof top to roof top)
• 1 for scalar interaction
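The 9-16 range follows directly from the roof-top counts per shape. A small sketch of the arithmetic (names are illustrative):

```python
ROOFTOPS = {"triangle": 3, "rectangle": 4}

def vector_interactions(src_shape, obs_shape):
    """Roof-top-to-roof-top interactions between two shapes."""
    return ROOFTOPS[src_shape] * ROOFTOPS[obs_shape]

SCALAR_INTERACTIONS = 1  # one charge-to-charge interaction per pair of shapes

counts = {
    (s, o): vector_interactions(s, o)
    for s in ROOFTOPS
    for o in ROOFTOPS
}
```

Triangle-triangle gives 9, mixed pairs give 12, and rectangle-rectangle gives 16, against a single scalar interaction per pair.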
Approximating the Vector potential
To avoid ill-conditioning, basis functions are decomposed into curl-free and divergence-free bases (loops and patches)
Current flow through a triangle due to a loop is a constant vector!
Can be exactly represented by a scalar integral over source
Approximation for other vector contributions
Approximating the vector potential
In the limit of a fine mesh, the approximation is exact
Intuition: the current flow varies smoothly across shapes, and a very small amount of charge is deposited as current leaves a shape
Approximation is valid for practical problems and frequencies
[Figure: error in V^T A V]
3D Fast Kernel Independent Fast Multipole
Based on the Greengard, Rokhlin FMM using diagonal form of translation operators (M2L) (1997)
The M operator is approximated as U M* V^T, where M* is diagonal
U and V constructed analytically
[Figure: translation operator M from source box (src) to observation box (obs), with M ≈ U M* V^T]
Diagonalizing translation operators
Same U and V diagonalize all the M2L operators for all source and observation boxes
Cannot be done in the kernel independent case
• Green's function is numerical
• No analytic form
• Z dependence
Use SVDs to “compress” M2Ls. Not diagonal. M* is full rank but much smaller than M.
[Figure: translation operators M1, M2, …, MN into one observation box (obs), each with M ≈ U M* V^T]
Diagonalizing translation operators
Organize M1, M2, …, MN by distance between boxes
Construct U incrementally
Use Gram Schmidt to construct column basis for M1
• Lowest rank for M1 (furthest away)
• Higher for M2 (project with U for M1 and add new columns)
• Highest rank for MN
Similar for V^T with source boxes
Typical saving: average rank goes from 15 to 5 (a factor of 9 in the size of M*, since 15^2/5^2 = 9)
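The incremental construction of U can be sketched with modified Gram-Schmidt on small real-valued vectors. This is only an illustration of the idea: `extend_basis`, the absolute tolerance, and the toy 4-component columns are assumptions, and a production code would work with complex operators and a relative truncation threshold.

```python
import math

def extend_basis(basis, columns, tol=1e-10):
    """Extend an orthonormal basis (list of vectors) so it also spans `columns`.

    Each column is projected against the current basis (modified Gram-Schmidt)
    and appended only if a significant residual remains.
    """
    basis = [list(q) for q in basis]
    for col in columns:
        r = list(col)
        for q in basis:
            coeff = sum(qi * ri for qi, ri in zip(q, r))
            r = [ri - coeff * qi for qi, ri in zip(q, r)]
        norm = math.sqrt(sum(ri * ri for ri in r))
        if norm > tol:
            basis.append([ri / norm for ri in r])
    return basis

# Columns of the furthest-away operator (lowest rank) come first ...
m1_cols = [[1.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0]]
# ... then a closer operator, which only contributes what M1's basis misses.
m2_cols = [[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

u_m1 = extend_basis([], m1_cols)       # basis for M1
u_m2 = extend_basis(u_m1, m2_cols)     # reuses M1's columns, adds new ones
```

Because the basis for M2 begins with the columns already found for M1, the far operators use only a leading block of a shared U, which is how the average rank drops.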
[Figure: operators M1, M2, …, MN around an observation box (obs); the columns of U are built up block by block for M1, M2, …, MN]
Symmetries
Four quadrants of interactions
Due to symmetries in basis functions, U and V^T in different quadrants are the same modulo sign flips
Construct U for one quadrant
All the Us can be applied by using FFT-like butterfly with first quadrant U
Similar operations for V
Z dimension does not have same symmetries
[Figure: observation box (obs) with the four quadrants numbered 1 to 4]
Adaptive frequency sweep
An adaptive frequency sweep is used to speed up simulation
This is based on creating a reduced order model using Krylov subspace methods
The reduced order model is created from a small set of EM solutions
The playback is done on a finer grid. Instead of hundreds of simulations you can do only a few
Multithreading
EMX exploits parallelism in various ways
• Multipole setup (direct interaction computation)
• Independent solves at different frequencies (has memory cost)
• Iterative solver orthogonalization as the basis expands
• Different vectors in the block solve are independent
• Different kernels (vector, scalar) are independent
• Direct part of a single kernel application (each row of the block sparse matrix can be done independently)
• Indirect part of a kernel application (partition into phases up+V^T and M+U+down, then split into independent parts according to multipole spatial subdivision)
EMX determines parallel scheduling on the fly, preferring “higher level” splitting when possible
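The coarsest level in the list above, independent solves at different frequencies, has a simple task structure. A minimal sketch using Python's standard thread pool: `solve_at_frequency` is a hypothetical stand-in for one EM solve, and in CPython this shows only the scheduling pattern, since CPU-bound Python threads do not run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def solve_at_frequency(freq_hz):
    """Hypothetical stand-in for one EM solve; returns (frequency, result)."""
    return freq_hz, 1.0 / (1.0 + (freq_hz / 1e9) ** 2)

def sweep(freqs_hz, max_workers=4):
    """Run one independent solve per frequency and collect results by frequency."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(solve_at_frequency, freqs_hz))

results = sweep([1e9, 2e9, 4e9])
```

Each concurrent solve holds its own working data, which matches the memory cost noted for this form of parallelism.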
Time and Memory scaling
Single frequency simulation (including iterative solve)
Compare speed and memory for 1, 2, 4, 8, …, 64 inductors
[Figure panels: 1 inductor; 64 inductors]
Benchmarks: Spiral Inductor
3D mesh
Current
Courtesy: TSMC. 65nm RFCMOS, 9LM thick metal technology. Published at RFIC 2009, “Including Pattern-Dependent Effects in Electromagnetic Simulations of On-Chip Passive Components”, Integrand and TSMC
Stacked Inductor
3D mesh
Current
Courtesy: TSMC. 65nm RFCMOS, 9LM thick metal technology. Published at RFIC 2009, “Including Pattern-Dependent Effects in Electromagnetic Simulations of On-Chip Passive Components”, Integrand and TSMC
MOM (finger) Capacitor
3D mesh of 0.6pF Cap
Courtesy: TSMC. 65nm RFCMOS, 9LM thick metal technology. Published at RFIC 2009, “Including Pattern-Dependent Effects in Electromagnetic Simulations of On-Chip Passive Components”, Integrand and TSMC
Transformer
3D mesh
Current
Courtesy: UMC. 90nm RFCMOS, 8LM thick metal technology. Published at CICC 2007, “Synthesis of Optimal On-Chip Baluns”, Integrand and UMC
Balun (with MiM caps)
3D mesh
Current
Courtesy: UMC. 90nm RFCMOS, 8LM thick metal technology. Published at CICC 2007, “Synthesis of Optimal On-Chip Baluns”, Integrand and UMC
BiCMOS Diplexer (with Thru Silicon Vias)
3D mesh
Current
Courtesy: SiGe Semiconductor. IBM BiCMOS 5PAE. To be published in IEEE Trans. on Advanced Packaging, 2010
IPD Diplexer
3D mesh
Current
Courtesy: STATS ChipPAC, IPD technology (8um Cu on high resistivity Si substrate)
CMOS VCO
3D mesh (inductor+ capacitor bank)
Courtesy: Wipro, TSMC 90nm, 1P5M
Benchmark Summary
Multi-threaded EMX is 2-3X faster for small examples and 5-7X faster for larger examples on an 8 CPU machine.
The memory for the multi-threaded version goes up at a slower rate than the speedup.
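The quoted speedups translate into parallel efficiencies as follows. The Amdahl's-law estimate of the parallel fraction is an interpretive model added here for illustration; it is not a figure from the benchmarks.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores

def amdahl_parallel_fraction(speedup, cores):
    """Parallel fraction p implied by Amdahl's law S = 1 / ((1 - p) + p / cores)."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)

CORES = 8
summary = {
    s: (parallel_efficiency(s, CORES), amdahl_parallel_fraction(s, CORES))
    for s in (2.0, 3.0, 5.0, 7.0)
}
```

On 8 CPUs, 5-7X corresponds to roughly 62-88% efficiency, and 2-3X to roughly 25-38%.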
Generating scalable component models
EMX used by foundries for generation of scalable models
Foundries moved from “hand-crafted” physics-based scalable models to “machine generated” based on EM simulations
• Machines are made of iron
1000s of simulations combined with an optimization algorithm to generate parameterized models of components
• Inductors, transformers, capacitors
EMX-Continuum
Spice Models
(Spectre, Eldo, Hspice, ADS)
Design Space (5D)
nt: 1.5 to 7.5 turns
w: 3um to 10um
s: 1.5um to 5um
od: 75um to 300um
f: DC to 20 GHz
Layout Generator
(PCELL)
Technology File
EMX
Scalable Model Space Lookup
Spiral Inductor
Automated loop of
layout, simulation
and synthesis
Spice Model
Generation of scalable models
Mimicking IC fabrication in EMX
EMX uses Voronoi diagrams to capture the width-and-spacing dependent parameters in the iRCX files
These Voronoi diagrams are used to alter the drawn layout to mimic the fabrication process
The shaded region shows the “drawn” layout and the “line” shows the modified layout according to the foundry rules
Courtesy: TSMC. 65nm RFCMOS, 9LM thick metal technology. Published at RFIC 2009, “Including Pattern-Dependent Effects in Electromagnetic Simulations of On-Chip Passive Components”, Integrand and TSMC
EMX simulation of MOM capacitors
The iRCX width-and-spacing dependence is more critical for structures
that are not at minimum dimensions. The accuracy of EMX using iRCX
is increased since the fabrication process is mimicked more closely.
Easy to Use
EMX accepts true mask layout and will automatically simplify the layout for meshing and EM simulation
• Via arrays, slotting rules, metal fill simplified
• Boolean masking operations can be performed
• Can apply grows and shrinks to the geometry including half-node scaling and metal bias
What goes to mask goes to simulator!
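A minimal sketch of a grow/shrink step on one rectangle, assuming axis-aligned shapes given as (x0, y0, x1, y1) in um. The function name, the 0.9 half-node scale factor, and the -0.05um bias are illustrative values, not taken from any technology file.

```python
def apply_scale_and_bias(rect, scale=1.0, bias=0.0):
    """Scale a rectangle about the origin, then move every edge outward by `bias`.

    rect is (x0, y0, x1, y1) in um; a negative bias shrinks the shape.
    """
    x0, y0, x1, y1 = (c * scale for c in rect)
    return (x0 - bias, y0 - bias, x1 + bias, y1 + bias)

drawn = (0.0, 0.0, 10.0, 2.0)                 # drawn wire, 10um x 2um
# Assumed values: 0.9 optical shrink and a 0.05um per-edge shrink.
on_silicon = apply_scale_and_bias(drawn, scale=0.9, bias=-0.05)
```

With these assumed values the 10um x 2um drawn wire becomes 8.9um x 1.7um.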
Layout automatically
simplified
Vias merged
Fill removed
Metal bias applied
MiM cap bank
handled
EMX Virtuoso interface
Create a layout using Virtuoso
Use EMX menu pull-down to access the EMX simulator interface
Challenges
Multiple scales
• MOM cap (0.05um), Inductor (1um), Package (50um)
Approximating reality
• Conformal dielectrics and dielectric blocks
• Local doping in the substrates
Further speedup needed.
• Hardware: GPUs
• Algorithms: better basis functions
Converting frequency domain data to passive “time-domain” circuit representations
• Partial solutions exist, but nothing is robust.
Conclusions
Described a commercial implementation of EMX
• a 3D full-wave integral equation solver
Various ways to improve speed and reduce memory
• Kernel independent FMM
• Exploiting layout regularity
• On-the-fly layered media Green's functions
• Multithreading
• Efficient meshing
Demonstrated accuracy for a number of real circuits
For references and citations go to: www.integrandsoftware.com/publications.php
Extra Slides
EMX: “Software Network Analyzer”
EMX
GDS
Wafer
Network Analyzer
Process Stack/
Technology File
Measurement vs EMX simulation results
Same mask GDSII layout is used for wafer fabrication and EM simulation.
De-embedding
View the raw .gds
Insertion Loss of Baluns (silicon verification)
Transformers and Baluns
(Courtesy of UMC)
Designing full RF blocks with EMX
Step 1:
• Rough design of inductors and baluns
• Run EMX simulation
(seconds to minutes; sometimes in a
scripting loop along with a layout
generator)
Step 2:
• Simulate full structure with
interconnect and large number of
internal ports (e.g., 20) for capacitors.
Tune design for caps.
Step 3:
• Re-simulate final structure with caps
included (few ports and high
discretization). Accounts for coupling
and effects of interconnect.
802.11b/WIMAX balanced diplexer
[Figure: insertion loss (dB), -50 to 0, versus frequency (Hz), 0 to 12 GHz]
802.11b/WIMAX Balanced Diplexer
[Figure: two panels, Low Band and High Band; insertion loss (dB), -50 to 0, versus frequency (Hz), 0 to 12 GHz]
Measurement: red
Simulation: blue
(STATS-ChipPAC Data)
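For readers reproducing such plots from S-parameter data, the plotted quantity is conventionally 20·log10|S21|. A minimal sketch (function names are illustrative; the plots above use the negative-dB convention on a 0 to -50 dB axis):

```python
import math

def s21_db(s21):
    """Transmission magnitude in dB for a (possibly complex) S21 value."""
    return 20.0 * math.log10(abs(s21))

def insertion_loss_db(s21):
    """Insertion loss as a positive number of dB."""
    return -s21_db(s21)

passband = s21_db(0.9 + 0.0j)    # about -0.9 dB
stopband = s21_db(0.0 + 0.01j)   # -40 dB
```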