Transcript
Page 1: Transient Analysis CK Cheng UC San Diego CK Cheng UC San Diego Jan. 25, 2007

Transient Analysis

CK Cheng

UC San Diego

Jan. 25, 2007

Page 2

Outline

• Research Directions
• Simulation test case results
• Overview of Simulation
• Commercial Package
• Alternating direction implicit (ADI) Method
• General Operator Splitting Method
• Distributed Computing
• Conclusions and Future Works

Page 3

Research Directions

• Simulation: SPICE, STA

• Network on Chip: topology and wire styles

• Power and Clock Networks

• Data Path Components: adders, shifters, multipliers, division

• Packaging: passive distortion compensation

Page 4

6x6 Bump Simulation Results

• The Circuit:
  – 184K capacitors, 17K current sources, 120K inductors, and 246K resistors
  – 306K nodes

• Accuracy:
  – Waveform and measurement results match Fujitsu’s with less than 0.002% error.

• Runtime / Memory Comparison:

                     CPU Time   Memory   Computer Used
  UCSD               678s       600.2M   Pentium 4 3.2G, Linux
  Fujitsu log file   1845s      771M     unknown

Page 5

6x6 Bump Simulation Results

• Measurement results and waveform

                     Min_pwr_l_est_10000954   Min_18269323   Min_33085875
  UCSD               0.9980790                0.9967357      0.9934251
  Fujitsu log file   0.9980620                0.9966940      0.9933790
  Error              0.002%                   0.004%         0.005%

(Figure: waveform comparison. The red curve is the UCSD result.)
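The Error row can be reproduced from the two rows of measurements. The sketch below assumes the error is the deviation of the UCSD value relative to the Fujitsu value, which the slide does not state explicitly; the variable names are illustrative.

```python
# Values copied from the 6x6 bump measurement table.
ucsd = [0.9980790, 0.9967357, 0.9934251]
fujitsu = [0.9980620, 0.9966940, 0.9933790]


def relative_error_percent(value, reference):
    """Deviation of value from reference, in percent (assumed definition)."""
    return abs(value - reference) / abs(reference) * 100.0


errors = [relative_error_percent(u, f) for u, f in zip(ucsd, fujitsu)]
print(["%.3f%%" % e for e in errors])
```

Rounded to three decimals, these values agree with the percentages listed in the table.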

Page 6

703KR Simulation Results

• The Circuit:
  – 514K capacitors, 76K current sources, 370K inductors, and 703K resistors
  – 1.3M nodes

• Accuracy:
  – Measurement results match Fujitsu’s with less than 0.02% error.

• Runtime / Memory Comparison:

                     CPU Time         Memory   Computer Used
  UCSD               2575s (0.7h)     1.7G     Pentium 4 3.2G, Linux
  Fujitsu log file   864561s (240h)   2.28G    unknown

Page 7

703KR Simulation Results

• Measurement results and waveform

                     Min_33096003   Min_33096004   Min_33097557
  UCSD               0.9400988      0.9421157      0.9370827
  Fujitsu log file   0.9399610      0.9419260      0.9368400
  Error              0.015%         0.02%          0.026%

(Figure: UCSD waveform only. The Fujitsu waveform is not available for comparison.)

Page 8

Further Speed-ups

• Reduce iteration count by 50% for pure linear circuits (like 6x6 bump and 703KR)
  – 2x speed-up

• More effective time step control
  – DVDT, breakpoint, truncation error: 1.5x - 3x speed-up

• Use multigrid solver
  – 1.5x - 2x speed-up for medium circuits (6x6 bump)
  – 2x - 10x speed-up for large circuits (703KR)

• Parallel simulation
  – 4 or more processors on a Linux cluster
  – 32 to hundreds of processors on a supercomputer

• Overall speed-up
  – 6x - 60x speed-up without parallel simulation
  – 12x - 1000x speed-up with parallel simulation
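The overall figure without parallel simulation appears to be the product of the individual factors for a large circuit. The sketch below assumes the three improvements compound independently, which the slide implies but does not state; the dictionary keys are illustrative labels.

```python
from math import prod

# (low, high) speed-up factors quoted on the slide for a large circuit (703KR).
large_circuit_factors = {
    "fewer iterations (linear circuit)": (2.0, 2.0),
    "time step control": (1.5, 3.0),
    "multigrid solver": (2.0, 10.0),
}

# Assumption: independent improvements multiply.
low = prod(lo for lo, _ in large_circuit_factors.values())
high = prod(hi for _, hi in large_circuit_factors.values())
print(f"combined speed-up: {low:g}x - {high:g}x")
```

Under that assumption the product spans 6x to 60x, matching the quoted range.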

Page 9

Performance and capacity prediction

Cases 10x-100x larger than 703KR.

  Circuit size                   Preferred Solver       CPU Time        Memory
  Small - Medium (0.3M nodes)    LU Decomposition       11 minutes      600M
  Medium - Large (1.3M nodes)    Multigrid              43 minutes      1.7G
  Huge (10-100M nodes)           Multigrid + Parallel   5 - 100 hours   15G - 200G

Page 10

Overview of Simulation

Our research
• Fast speed with SPICE accuracy
• Nonlinear devices
• Efficient matrix solvers
• Effective integration methods
• Time step controls according to different integration methods
• Distributed computing

(Flowchart boxes: Load Circuit; Device Evaluation; Linearization; Integration Approximation; LU Decomposition; N-R Converge? (Yes / No); Time Step Control; Next Time Point)
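The flowchart boxes can be read as the usual transient loop. The sketch below is only an illustration under assumptions not taken from the slides: a hypothetical one-node circuit (source Vs through R into a node with capacitor C and a diode to ground), backward Euler as the integration approximation, and a crude step control that halves the step when Newton-Raphson fails to converge. All element values are made up.

```python
import math


def simulate(t_end=20e-6, h=0.1e-6, C=1e-9, R=1e3, Vs=5.0, Is=1e-14, Vt=0.025852):
    """Backward-Euler transient of C dv/dt = (Vs - v)/R - Is*(exp(v/Vt) - 1)."""
    t, v = 0.0, 0.0                        # load circuit: initial state
    waveform = [(t, v)]
    while t < t_end:
        v_new, converged = v, False
        for _ in range(50):                # Newton-Raphson iterations
            # Device evaluation and linearization of the diode
            i_d = Is * (math.exp(v_new / Vt) - 1.0)
            g_d = Is / Vt * math.exp(v_new / Vt)
            # Integration approximation: backward Euler for the capacitor
            residual = C * (v_new - v) / h - (Vs - v_new) / R + i_d
            jacobian = C / h + 1.0 / R + g_d
            # Linear solve (1x1 here; an LU decomposition in general)
            dv = -residual / jacobian
            v_new += dv
            if abs(dv) < 1e-12:            # N-R converged?
                converged = True
                break
        if not converged:
            h *= 0.5                       # time step control: retry with a smaller step
            continue
        t += h                             # next time point
        v = v_new
        waveform.append((t, v))
    return waveform


final_time, final_voltage = simulate()[-1]
print(final_time, final_voltage)
```

After the transient settles, the node voltage satisfies the DC balance between the resistor current and the diode current, which is a convenient sanity check for the loop.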

Page 11

Overview of Simulation

• Matrix Solver
  • LU Decomposition
  • Iterative Approach

• Integration
  • Time Step Control
  • ADI

• Nonlinear Devices
  • Two Stage Newton Raphson

• Distributed Computing
  • Commercial Implementation

Page 12

Overview of Simulation

• Integration
  • Time Step Control
  • ADI (two-way partitioning)
  • Operator Splitting (multi-way)

• Distributed Computing
  • MPI
  • Partitioning

• Three Ph.D. Students

Page 13

Commercial Package: Fastrack Design

• Founded in January 2001
• Headquartered in San Jose
• Privately funded, cash-flow positive
• Two Business Units
  • Design Services
  • Technology Products

Page 14

Analog Designs

  Design              # Elements   Sim. Length   HSpice       mSPICE   Speedup Factor
  LVDS                13490        20us          80h          26h      3.1X
  Oscillator          222          1 ms          13,706s      2,670s   5.1X
  Biasing Circuit     49197        200ns         427s         82s      5.2X
  PLL                 16050        40us          67d          12d      5.6X
  PLL (post-layout)   300K         40us          290d (est)   16d      18.1X

Page 15

Digital Blocks

                Devices                     Runtime
  Design Name   MOS     R        C          mSPICE   Traditional Spice   Speedup Factor
  ALU           10.1k   12.7k    7.5k       6.9m     7m                  1.0X
  CONTROL       69k     83.7k    52.5k      1.5h     9.5h                6.3X
  YN_BLK        205K    242.8k   203.9k     3.5h     > 2d                >13.7X
  THP           437k    499.3k   313.5k     5.0h     COULD NOT RUN       ∞
  VCON          936k    753k     561k       15.0h    COULD NOT RUN       ∞

Page 16

Memory Blocks

  Design                # Tr   # R     # C    # Vectors / Sim. Length   mSPICE Run Time
  BRAM (pre)            220K   0       500    2                         2.5 hours
  SRAM (pre) 8Kx8 SP    410K   0       0      2                         7 hours
  eRAM (post) 256x16    72K    28K     427K   48ns                      8 hours
  BRAM (post)           220K   1320K   870K   2                         18 hours

• 100% accurate Spice simulation

Page 17

mSPICE-Parallel

• Industry’s first practical parallel Spice simulation solution

– Increases capacity further

– Dramatically improves throughput

• Uses Matrix Level Partitioning

– No loss of accuracy

– Client-Server configuration

– Minimal memory requirement for client nodes

Page 18

Client-Server Configuration

• Server distributes sub-matrices to clients
• Clients communicate partial solutions
• Minimal memory requirements for clients

(Figure: a 0/1 matrix and the sub-matrices taken from it)

Page 19

Experimental Results

                                                          Runtime
  Design                   Total Elements   Sim. Length   1-proc   2-proc          4-proc
  ASIC                     1.2M             8ns           12.2h    7.0h (1.7X)     5.1h (2.4X)
  38IO SSO                 1.4M             30ns          3.0h     2.0h (1.5X)     1.4h (2.2X)
  Signal-power             2.1M             1.2us         13d      7d18h (1.7X)    5d12h (2.4X)
  4096x8 RAM (extracted)   2.3M             10ns          32h      18.5h (1.7X)    13.4h (2.4X)
  120IO SSO                3.5M             30ns          6.2h     4.1h (1.5X)     3.1h (2.0X)
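The bracketed factors are runtime ratios against the single-processor run. The sketch below recomputes them for the ASIC row and adds a parallel efficiency (speed-up divided by processor count), a derived quantity that is not on the slide.

```python
# ASIC row of the table: runtime in hours keyed by processor count.
asic_runtime_hours = {1: 12.2, 2: 7.0, 4: 5.1}

baseline = asic_runtime_hours[1]
speedups = {p: baseline / t for p, t in asic_runtime_hours.items()}
# Parallel efficiency is an added illustration, not a figure from the slide.
efficiency = {p: s / p for p, s in speedups.items()}

for p in sorted(speedups):
    print(f"{p}-proc: speed-up {speedups[p]:.1f}X, efficiency {efficiency[p]:.0%}")
```

The recomputed speed-ups round to the 1.7X and 2.4X shown in the table, and the efficiency falls as processors are added.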

Page 20

ADI: Previous Works

• 1999, Namiki and Ito

– The alternating direction implicit (ADI) method is used to simulate a 2D TE wave.

• 2001, Zheng et al.

– Extended to 3D problems

• 2001 & 2003, Lee and Chen

– ADI is applied to power grids modeled as transmission lines

In these works the alternation is among different geometric directions, so the geometric structure that can be simulated is constrained.

Page 21

Alternating Direction Implicit (ADI)

• ADI Integration Method
  – Two-way partition of the circuit
  – One partition is used for each backward integration
  – Unconditionally stable (A-stable: independent of time step size)
  – Time step size is chosen according to the local truncation error.

Page 22

Alternating Direction Implicit (ADI)

• ADI method formulation
• Circuit partition algorithm
• Local truncation error estimation
• Stability discussion
• Experimental results

Page 23

SPICE Formulation

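• Equations for RLC circuits

\[
\begin{bmatrix} C & 0 \\ 0 & L \end{bmatrix}
\begin{bmatrix} \dot V(t) \\ \dot I(t) \end{bmatrix}
= -\begin{bmatrix} G & E \\ -E^T & R \end{bmatrix}
\begin{bmatrix} V(t) \\ I(t) \end{bmatrix} + U(t)
\]

where C: capacitance matrix, L: inductance matrix, R: resistance matrix, G: conductance matrix, E: incidence matrix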
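To make the block structure concrete, here is a minimal sketch for a hypothetical two-unknown circuit: a source Vs drives a series R-L branch into a node loaded by C and G. The element values, the sign chosen for E, and the use of trapezoidal integration are assumptions made for the illustration.

```python
import numpy as np

# Hypothetical element values, chosen only for illustration.
C, L, G, R, Vs = 1.0, 1.0, 0.5, 1.0, 1.0
E = -1.0  # incidence entry: the inductor current flows into the node

# Unknowns X = [V, I]; system M dX/dt = -N X + U.
M = np.array([[C, 0.0], [0.0, L]])
N = np.array([[G, E], [-E, R]])
U = np.array([0.0, Vs])


def trapezoidal_step(x, h):
    """(M/h + N/2) x_next = (M/h - N/2) x + U for a constant source U."""
    return np.linalg.solve(M / h + N / 2, (M / h - N / 2) @ x + U)


x = np.zeros(2)
for _ in range(2000):
    x = trapezoidal_step(x, 0.05)
print(x)
```

Once the transient has died out, the state should satisfy N X = U, the DC solution of the same system.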

Page 24

ADI Formulation

• Transient simulation

– Split the resistor and inductor branches into two parts

• G = G1 + G2

• E = E1 + E2

• R = R1 + R2

– Alternate backward and forward integration on each partition

Page 25

ADI Formulation (Cont.)

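• Equations of ADI method

\[
\begin{bmatrix} \frac{2C}{h} + G_1 & E_1 \\ -E_1^T & \frac{2L}{h} + R_1 \end{bmatrix}
\begin{bmatrix} V(t+\frac{h}{2}) \\ I(t+\frac{h}{2}) \end{bmatrix}
=
\begin{bmatrix} \frac{2C}{h} - G_2 & -E_2 \\ E_2^T & \frac{2L}{h} - R_2 \end{bmatrix}
\begin{bmatrix} V(t) \\ I(t) \end{bmatrix}
+ U(t+\tfrac{h}{2})
\]

\[
\begin{bmatrix} \frac{2C}{h} + G_2 & E_2 \\ -E_2^T & \frac{2L}{h} + R_2 \end{bmatrix}
\begin{bmatrix} V(t+h) \\ I(t+h) \end{bmatrix}
=
\begin{bmatrix} \frac{2C}{h} - G_1 & -E_1 \\ E_1^T & \frac{2L}{h} - R_1 \end{bmatrix}
\begin{bmatrix} V(t+\frac{h}{2}) \\ I(t+\frac{h}{2}) \end{bmatrix}
+ U(t+\tfrac{h}{2})
\]

– the size of the left-hand-side matrix remains unchanged

– the number of non-zero elements is decreased

– direct solving methods can be efficient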
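The two half-steps can be sketched in the compact form (2M/h + N1) X(t+h/2) = (2M/h - N2) X(t) + U and (2M/h + N2) X(t+h) = (2M/h - N1) X(t+h/2) + U. The example below is a hypothetical RC chain whose links are split alternately between the two partitions; the circuit, the partition, and the zero source are assumptions, and the exact solution is used only to observe the convergence order.

```python
import numpy as np


def stamp(n, links, leak):
    """Conductance matrix: unit conductances on the given links plus a leak to ground."""
    N = np.eye(n) * leak
    for i, j in links:
        N[i, i] += 1.0
        N[j, j] += 1.0
        N[i, j] -= 1.0
        N[j, i] -= 1.0
    return N


def adi_step(M, N1, N2, x, u_half, h):
    """One ADI time step: implicit in partition 1, then implicit in partition 2."""
    x_half = np.linalg.solve(2 * M / h + N1, (2 * M / h - N2) @ x + u_half)
    return np.linalg.solve(2 * M / h + N2, (2 * M / h - N1) @ x_half + u_half)


def adi_error(h, t_end=1.0, n=6):
    M = np.eye(n)                                    # unit capacitors
    N1 = stamp(n, [(0, 1), (2, 3), (4, 5)], 0.05)    # hypothetical partition 1
    N2 = stamp(n, [(1, 2), (3, 4)], 0.05)            # hypothetical partition 2
    x0 = np.arange(1.0, n + 1.0)
    x = x0.copy()
    for _ in range(round(t_end / h)):
        x = adi_step(M, N1, N2, x, np.zeros(n), h)
    w, Q = np.linalg.eigh(N1 + N2)                   # exact solution (symmetric N, M = I)
    exact = Q @ (np.exp(-w * t_end) * (Q.T @ x0))
    return np.linalg.norm(x - exact)


e_coarse, e_fine = adi_error(0.05), adi_error(0.025)
print(e_coarse / e_fine)
```

Halving the step should cut the error by roughly a factor of four, which is what a second-order method predicts.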

Page 26

Experiments of non-zero fill-ins

• A small ASIC Design

  Spice matrix: dimension 10,286; number of non-zero elements 46,655; number of non-zero fill-ins 90,960

• A large I/O Design

  Spice matrix: dimension 615,436; number of non-zero elements 2,126,246

            Sub-matrix 1               Sub-matrix 2               Total
            # non-zero   # non-zero    # non-zero   # non-zero    # non-zero
            elements     fill-ins      elements     fill-ins      fill-ins
  Case 1    38,572       2,618         42,020       10,040        12,658
  Case 2    1,176,208    12,421,534    950,038      14,772,068    27,193,602

Page 27

Local Truncation Error (LTE)

• Time step control using LTE

– In circuit transient analysis, the next time step can be estimated from the local truncation error at the present time point

– LTE is defined as the difference between the calculated solution and the exact solution

\[
LTE = \varepsilon_n = x_{n+1} - \hat{x}_{n+1} = f(\Delta t_n)
\]

– To ensure consistency, the local truncation error should not exceed the error tolerance, so the time step can be estimated using

\[
LTE = \varepsilon_n = x_{n+1} - \hat{x}_{n+1} = f(\Delta t_n) \le E_{tol}
\]
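One common way to turn the tolerance condition into a step size is sketched below. It assumes the LTE of a second-order method scales with the cube of the step, so the step is rescaled by the cube root of tolerance over error. The safety factor and growth cap are conventional heuristics, not values from the slides, and `next_time_step` is a hypothetical helper.

```python
def next_time_step(h, lte, tol, order=2, safety=0.9, max_growth=2.0):
    """Suggest the next step, assuming LTE is proportional to h**(order + 1)."""
    if lte <= 0.0:
        return h * max_growth            # no measurable error: grow by the cap
    scale = safety * (tol / lte) ** (1.0 / (order + 1))
    return h * min(scale, max_growth)


# An error eight times the tolerance halves the step (with safety = 1).
print(next_time_step(1.0, 8e-6, 1e-6, safety=1.0))
```

Capping the growth keeps a single optimistic estimate from producing a very large next step.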

Page 28

Local Truncation Error (Cont.)

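• LTE of ADI method

(1) Equations

\[
\begin{bmatrix} C & 0 \\ 0 & L \end{bmatrix}
\begin{bmatrix} \dot V(t) \\ \dot I(t) \end{bmatrix}
= -\begin{bmatrix} G & E \\ -E^T & R \end{bmatrix}
\begin{bmatrix} V(t) \\ I(t) \end{bmatrix} + U(t)
\]

let

\[
X = \begin{bmatrix} V(t) \\ I(t) \end{bmatrix}, \quad
M = \begin{bmatrix} C & 0 \\ 0 & L \end{bmatrix}, \quad \text{and} \quad
N = \begin{bmatrix} G & E \\ -E^T & R \end{bmatrix}
\]

then

\[
M\dot X = -NX + U
\]

\[
\dot X = -M^{-1}NX + M^{-1}U = AX + BU
\]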

Page 29

Local Truncation Error (Cont.)

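• LTE of ADI method

(2) Estimate exact solution

If we characterize the input as a simple ramp over the interval (t_n, t_{n+1}), the exact analytic solution with time step \Delta t_n is:

\[
X_{n+1} = e^{A\Delta t_n}\left[X_n + A^{-1}\left(BU_n + A^{-1}B\frac{\Delta U_n}{\Delta t_n}\right)\right]
- A^{-1}\left[B(U_n + \Delta U_n) + A^{-1}B\frac{\Delta U_n}{\Delta t_n}\right]
\]

\[
\begin{aligned}
X_{n+1} ={}& \left(I + A\Delta t_n + \tfrac{1}{2}A^2\Delta t_n^2 + \tfrac{1}{6}A^3\Delta t_n^3\right)X_n \\
&+ \left(B\Delta t_n + \tfrac{1}{2}AB\Delta t_n^2 + \tfrac{1}{6}A^2B\Delta t_n^3\right)U_n \\
&+ \left(\tfrac{1}{2}B\Delta t_n + \tfrac{1}{6}AB\Delta t_n^2\right)\Delta U_n + O(\Delta t_n^4)
\end{aligned}
\]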

Page 30

Local Truncation Error (Cont.)

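• LTE of ADI method

(3) Estimate ADI solution

\[
\left(\frac{2}{\Delta t_n}M + N_1\right)X_{n+1/2} = \left(\frac{2}{\Delta t_n}M - N_2\right)X_n + U_{n+1/2}
\]

\[
\left(\frac{2}{\Delta t_n}M + N_2\right)X_{n+1} = \left(\frac{2}{\Delta t_n}M - N_1\right)X_{n+1/2} + U_{n+1/2}
\]

With A_1 = -M^{-1}N_1 and A_2 = -M^{-1}N_2 (so that A = A_1 + A_2), eliminating X_{n+1/2} gives

\[
\begin{aligned}
\hat X_{n+1} ={}& \left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_1\right)\left(I - \tfrac{\Delta t_n}{2}A_1\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_2\right)X_n \\
&+ \left[\left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_1\right)\left(I - \tfrac{\Delta t_n}{2}A_1\right)^{-1} + \left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}\right]\frac{\Delta t_n}{2}BU_{n+1/2}
\end{aligned}
\]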

Page 31

Local Truncation Error (Cont.)

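• LTE of ADI method

(3) Estimate ADI solution

\[
\begin{aligned}
\hat X_{n+1} ={}& \left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_1\right)\left(I - \tfrac{\Delta t_n}{2}A_1\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_2\right)X_n \\
&+ \left[\left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_1\right)\left(I - \tfrac{\Delta t_n}{2}A_1\right)^{-1} + \left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}\right]\frac{\Delta t_n}{2}BU_{n+1/2}
\end{aligned}
\]

\[
\begin{aligned}
\hat X_{n+1} ={}& \left(I + A\Delta t_n + \tfrac{1}{2}A^2\Delta t_n^2 + \tfrac{1}{4}A^3\Delta t_n^3 - \tfrac{1}{4}A_1A_2A\Delta t_n^3\right)X_n \\
&+ \left(B\Delta t_n + \tfrac{1}{2}AB\Delta t_n^2 + \tfrac{1}{4}A^2B\Delta t_n^3 - \tfrac{1}{4}A_1A_2B\Delta t_n^3\right)U_n \\
&+ \left(\tfrac{1}{2}B\Delta t_n + \tfrac{1}{4}AB\Delta t_n^2\right)\Delta U_n + O(\Delta t_n^4)
\end{aligned}
\]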

Page 32

Local Truncation Error (Cont.)

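• LTE of ADI method

(4) LTE estimation

\[
\begin{aligned}
LTE = \varepsilon_n ={}& X_{n+1} - \hat X_{n+1} \\
={}& \left(-\tfrac{1}{12}A^3\Delta t_n^3 + \tfrac{1}{4}A_1A_2A\Delta t_n^3\right)X_n \\
&+ \left(-\tfrac{1}{12}A^2B\Delta t_n^3 + \tfrac{1}{4}A_1A_2B\Delta t_n^3\right)U_n - \tfrac{1}{12}AB\Delta t_n^2\,\Delta U_n + O(\Delta t_n^4) \\
={}& -\tfrac{1}{12}\Delta t_n^3\,\dddot X_n + \tfrac{1}{4}A_1A_2\Delta t_n^3\,\dot X_n + O(\Delta t_n^4)
\end{aligned}
\]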
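The leading LTE term can be checked numerically for the source-free case (U = 0), where it reduces to (-A^3/12 + A_1 A_2 A/4) Δt^3 X_n. The sketch below uses random symmetric negative-definite A_1 and A_2, an assumption made so that the exact step can be computed with an eigendecomposition; the matrices and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_stable(n):
    """Random symmetric negative-definite matrix (hypothetical test data)."""
    B = rng.standard_normal((n, n))
    return -(B @ B.T) / n - np.eye(n)


def adi_step(A1, A2, x, h):
    """One source-free ADI step in terms of A1 and A2."""
    I = np.eye(len(x))
    x_half = np.linalg.solve(I - 0.5 * h * A1, (I + 0.5 * h * A2) @ x)
    return np.linalg.solve(I - 0.5 * h * A2, (I + 0.5 * h * A1) @ x_half)


def exact_step(A, x, h):
    """exp(A h) x via eigendecomposition; valid because A is symmetric here."""
    w, Q = np.linalg.eigh(A)
    return Q @ (np.exp(w * h) * (Q.T @ x))


A1, A2 = random_stable(4), random_stable(4)
A = A1 + A2
x = rng.standard_normal(4)
h = 1e-3

measured = exact_step(A, x, h) - adi_step(A1, A2, x, h)
predicted = h**3 * (-(A @ A @ A @ x) / 12 + (A1 @ A2 @ A @ x) / 4)
relative_gap = np.linalg.norm(measured - predicted) / np.linalg.norm(predicted)
print(relative_gap)
```

The gap between the measured one-step error and the predicted leading term should shrink in proportion to the step, since the neglected terms are of fourth order.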

Page 33

Local Truncation Error (Cont.)

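• LTE of ADI method

(5) Time step control

\[
\left(\frac{2}{\Delta t_n}M + N_1\right)X_{n+1/2} = \left(\frac{2}{\Delta t_n}M - N_2\right)X_n + U_{n+1/2}
\]

\[
\left(\frac{2}{\Delta t_n}M + N_2\right)X_{n+1} = \left(\frac{2}{\Delta t_n}M - N_1\right)X_{n+1/2} + U_{n+1/2}
\]

\[
\frac{2}{\Delta t_n}M\left(X_{n+1/2} - X_n\right) = -N_1X_{n+1/2} - N_2X_n + U_{n+1/2}
\]

\[
\frac{2}{\Delta t_n}M\left(X_{n+1} - X_{n+1/2}\right) = -N_1X_{n+1/2} - N_2X_{n+1} + U_{n+1/2}
\]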

Page 34

Local Truncation Error (Cont.)

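• LTE of ADI method

(5) Time step control

\[
X_{n+1} - X_n = \frac{\Delta t_n}{2}\left(\dot X_{n+1} + \dot X_n\right) - \frac{1}{4}A_1A_2\Delta t_n^2\left(X_{n+1} - X_n\right)
\]

\[
\Delta t_n\dot X_n \approx \frac{\Delta t_n}{2}\left(\dot X_{n+1} + \dot X_n\right) - \frac{1}{4}A_1A_2\Delta t_n^3\dot X_n
\]

\[
\frac{1}{4}A_1A_2\Delta t_n^3\dot X_n \approx \frac{\Delta t_n}{2}\left(\dot X_{n+1} - \dot X_n\right)
\]

\[
\frac{1}{4}A_1A_2\Delta t_{n-1}^3\dot X_{n-1} \approx \frac{\Delta t_{n-1}}{2}\left(\dot X_n - \dot X_{n-1}\right)
\]

\[
LTE = -\frac{1}{12}\Delta t_n^3\,\dddot X_n + \frac{1}{4}A_1A_2\Delta t_n^3\,\dot X_n
\approx \left(-\frac{\dddot X_n}{12} + \frac{\dot X_n - \dot X_{n-1}}{2\Delta t_{n-1}^2}\right)\Delta t_n^3
\]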

Page 35

Stability Discussion

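• Stability is concerned with whether the accumulated error grows or decays as time evolves through a series of time steps.

\[
X_{n+1} = -A^{-1}\left[B(U_n + \Delta U_n) + A^{-1}B\frac{\Delta U_n}{\Delta t_n}\right]
+ e^{A\Delta t_n}\left[X_n + A^{-1}\left(BU_n + A^{-1}B\frac{\Delta U_n}{\Delta t_n}\right)\right]
\]

• For one-step integration approximations, the error is accumulated by a factor of

\[
e^{A\Delta t_n}
\]

• If the final steady-state error vector is smaller than the initial one, then the integration method is stable.

• In the ADI integration method:

\[
e^{A\Delta t_n} \approx \left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_1\right)\left(I - \tfrac{\Delta t_n}{2}A_1\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_2\right)
\]

– It can be proved to be unconditionally stable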
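A numerical illustration of the stability claim: the sketch below forms the ADI amplification matrix for random symmetric negative-definite A_1 and A_2 (an assumption; the slide's proof is not reproduced here) and checks its spectral radius over a wide range of step sizes. Forward Euler is included for contrast.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_negative_definite(n):
    """Random symmetric negative-definite matrix (hypothetical test data)."""
    B = rng.standard_normal((n, n))
    return -(B @ B.T) - np.eye(n)


def adi_amplification(A1, A2, h):
    """(I - h/2 A2)^-1 (I + h/2 A1) (I - h/2 A1)^-1 (I + h/2 A2)."""
    I = np.eye(A1.shape[0])
    left = np.linalg.solve(I - 0.5 * h * A2, I + 0.5 * h * A1)
    right = np.linalg.solve(I - 0.5 * h * A1, I + 0.5 * h * A2)
    return left @ right


def spectral_radius(P):
    return max(abs(np.linalg.eigvals(P)))


A1, A2 = random_negative_definite(5), random_negative_definite(5)
steps = [1e-3, 1e-1, 1e1, 1e3]
adi_radii = [spectral_radius(adi_amplification(A1, A2, h)) for h in steps]
euler_radii = [spectral_radius(np.eye(5) + h * (A1 + A2)) for h in steps]
print(adi_radii)
print(euler_radii)
```

For these matrices the ADI radius stays below one at every step size tried, while forward Euler exceeds one once the step is large.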

Page 36

Experimental Results

                           Circuit1   Circuit2   Circuit3   1k-cell
  #Nodes                   10,000     40,000     90,000     10,200
  #Transistors             0          0          0          6,500
  Period                   10ns       10ns       10ns       10ns
  SPICE3  CPU time (sec)   77.8       485.3      3,061.1    181.6
          #steps           115        115        114        193
  ADI     CPU time (sec)   28.6       117.8      275.2      523.3
          #steps           102        102        102        949
  Speedup                  2.7x       4.1x       11.1x      -

Page 37

Voltage drop of Circuit3 (power mesh with sinks)

Page 38

Signal in 1k_cell (ASIC design)

Page 39

General Operator Splitting

• General operator splitting method

– Multi-way partitions

– Each partition is considered separately in each time step of the simulation

– No geometric constraints

– Local truncation error is used to dynamically control the time step size

Page 40

General Operator Splitting

• Fundamental theory
• Operator splitting formulation
• Local truncation error estimation
• Stability discussion
• Experimental results

Page 41

Fundamental theory

• In circuit transient simulation, the integration approximation is actually the approximation of the exponential operator

• The exponential operators can be approximated in any order using a general scheme of fractal decomposition

• The decomposition of exponential operators corresponds to the circuit multi-way partition

New integration approximation in transient simulation

Page 42

Fundamental theory

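• Approximation of exponential operator

– General circuit equation and solution

\[
\dot x(t) = Ax(t) + Bu(t)
\]

\[
x(t + \Delta t) = e^{A\Delta t}x(t) + \int_t^{t+\Delta t} e^{A(t + \Delta t - \tau)}Bu(\tau)\,d\tau
\]

– If we characterize the input as a simple ramp over the interval (t_n, t_{n+1}), the exact analytic solution with time step \Delta t_n is

\[
X_{n+1} = e^{A\Delta t_n}\left[X_n + A^{-1}\left(BU_n + A^{-1}B\frac{\Delta U_n}{\Delta t_n}\right)\right]
- A^{-1}\left[B(U_n + \Delta U_n) + A^{-1}B\frac{\Delta U_n}{\Delta t_n}\right]
\]

– Exponential operator approximation

• Forward Euler

\[
e^{A\Delta t} \approx I + A\Delta t
\]

• Backward Euler

\[
e^{A\Delta t} \approx (I - A\Delta t)^{-1}
\]

• Trapezoidal

\[
e^{A\Delta t} \approx \left(I - \tfrac{1}{2}A\Delta t\right)^{-1}\left(I + \tfrac{1}{2}A\Delta t\right)
\]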
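A scalar example makes the three approximations easy to compare. The sketch below takes A = a = -1 (an arbitrary choice) and measures how the one-step error of each formula shrinks when the step is halved.

```python
import math

a = -1.0  # scalar stand-in for the matrix A (arbitrary choice)

approximations = {
    "forward Euler": lambda h: 1.0 + a * h,
    "backward Euler": lambda h: 1.0 / (1.0 - a * h),
    "trapezoidal": lambda h: (1.0 + 0.5 * a * h) / (1.0 - 0.5 * a * h),
}


def one_step_error(approx, h):
    """Distance between the approximation and the exact factor exp(a*h)."""
    return abs(approx(h) - math.exp(a * h))


ratios = {
    name: one_step_error(f, 0.1) / one_step_error(f, 0.05)
    for name, f in approximations.items()
}
for name, ratio in ratios.items():
    print(f"{name}: error ratio when halving the step = {ratio:.2f}")
```

A ratio near 4 indicates a one-step error proportional to the square of the step (both Euler formulas), and a ratio near 8 indicates a cubic dependence (trapezoidal).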

Page 43

Fundamental theory

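• Decomposition of exponential operators (Masuo Suzuki, 1991, Physics)

– Function

\[
F(x) = e^{x(A+B)}
\]

– First order:

\[
f_1(x) = e^{xA}e^{xB}
\]

– Second order:

\[
f_2(x) = e^{\frac{1}{2}xA}e^{xB}e^{\frac{1}{2}xA}
\]

– Third order:

\[
f_3(x) = e^{\frac{s}{2}xA}\,e^{sxB}\,e^{\frac{1-s}{2}xA}\,e^{(1-2s)xB}\,e^{\frac{1-s}{2}xA}\,e^{sxB}\,e^{\frac{s}{2}xA},
\qquad s = \frac{1}{2 - \sqrt[3]{2}}
\]

– (2m-1)th and (2m)th order:

\[
f_{2m-1}(x) = f_{2m}(x) = f_{2m-3}(k_m x)\,f_{2m-3}\big((1 - 2k_m)x\big)\,f_{2m-3}(k_m x),
\qquad k_m = \frac{1}{2 - \sqrt[2m-1]{2}}
\]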

Page 44

Fundamental theory

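• Decomposition of exponential operators

\[
F(x) = e^{x(A+B)} = I + x(A+B) + \tfrac{1}{2}x^2(A+B)^2 + O(x^3)
\]

\[
\begin{aligned}
f_2(x) &= e^{\frac{1}{2}xA}e^{xB}e^{\frac{1}{2}xA} \\
&= \left[I + \tfrac{1}{2}xA + \tfrac{1}{8}x^2A^2 + O(x^3)\right]\left[I + xB + \tfrac{1}{2}x^2B^2 + O(x^3)\right]\left[I + \tfrac{1}{2}xA + \tfrac{1}{8}x^2A^2 + O(x^3)\right] \\
&= I + x\left(\tfrac{1}{2}A + B + \tfrac{1}{2}A\right) + x^2\left(\tfrac{1}{8}A^2 + \tfrac{1}{2}AB + \tfrac{1}{4}A^2 + \tfrac{1}{2}B^2 + \tfrac{1}{2}BA + \tfrac{1}{8}A^2\right) + O(x^3) \\
&= I + x(A+B) + x^2\left(\tfrac{1}{2}A^2 + \tfrac{1}{2}AB + \tfrac{1}{2}BA + \tfrac{1}{2}B^2\right) + O(x^3) \\
&= I + x(A+B) + \tfrac{1}{2}x^2(A+B)^2 + O(x^3)
\end{aligned}
\]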
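The expansion can be checked numerically: the first-order product should miss F(x) by a term proportional to x^2 and the symmetric product by a term proportional to x^3. The sketch below uses random non-commuting 3x3 matrices and a truncated Taylor series for the matrix exponential; both are assumptions made only for this illustration.

```python
import numpy as np


def expm_taylor(X, terms=25):
    """Matrix exponential by a truncated Taylor series (adequate for small norms)."""
    result = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        result = result + term
    return result


rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))


def splitting_errors(x):
    """Errors of f1(x) and f2(x) against F(x) = exp(x(A+B))."""
    exact = expm_taylor(x * (A + B))
    f1 = expm_taylor(x * A) @ expm_taylor(x * B)
    f2 = expm_taylor(0.5 * x * A) @ expm_taylor(x * B) @ expm_taylor(0.5 * x * A)
    return np.linalg.norm(f1 - exact), np.linalg.norm(f2 - exact)


e1_big, e2_big = splitting_errors(0.02)
e1_small, e2_small = splitting_errors(0.01)
ratio_f1 = e1_big / e1_small
ratio_f2 = e2_big / e2_small
print(ratio_f1, ratio_f2)
```

Halving x should reduce the f1 error by about 4 and the f2 error by about 8.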

Page 45

General Operator Splitting Formulation

• Transient simulation:

– Apply the second-order approximation

\[
e^{x(A_1 + A_2)} \approx e^{\frac{1}{2}xA_1}\,e^{xA_2}\,e^{\frac{1}{2}xA_1}
\]

– In each time step, every partition is calculated separately and trapezoidal integration is used for every partition

– The size of the left-hand-side matrix may be changed

– The number of non-zero elements is definitely decreased

– Can be easily extended to multi-way partitions

\[
e^{xA} = e^{x(A_1 + A_2 + \dots + A_q)} \approx
e^{\frac{1}{2}xA_1}\,e^{\frac{1}{2}xA_2}\cdots e^{\frac{1}{2}xA_{q-1}}\,e^{xA_q}\,e^{\frac{1}{2}xA_{q-1}}\cdots e^{\frac{1}{2}xA_2}\,e^{\frac{1}{2}xA_1}
\]
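A minimal sketch of the multi-way symmetric product, with each exponential factor replaced by a trapezoidal substep. It covers only the source-free case, and the three-way partition of a small RC chain is hypothetical; the exact solution is used only to observe the convergence order.

```python
import numpy as np


def trapezoidal_substep(A, x, h):
    """One trapezoidal step of x' = A x: (I - h/2 A) x_next = (I + h/2 A) x."""
    I = np.eye(len(x))
    return np.linalg.solve(I - 0.5 * h * A, (I + 0.5 * h * A) @ x)


def splitting_step(parts, x, h):
    """Half steps for A_1..A_{q-1}, a full step for A_q, then half steps in reverse."""
    for A in parts[:-1]:
        x = trapezoidal_substep(A, x, 0.5 * h)
    x = trapezoidal_substep(parts[-1], x, h)
    for A in reversed(parts[:-1]):
        x = trapezoidal_substep(A, x, 0.5 * h)
    return x


def chain_part(n, links, leak):
    """Hypothetical sub-circuit: unit conductances on the links plus a leak to ground."""
    A = -leak * np.eye(n)
    for i, j in links:
        A[i, i] -= 1.0
        A[j, j] -= 1.0
        A[i, j] += 1.0
        A[j, i] += 1.0
    return A


def splitting_error(h, t_end=1.0, n=7):
    # Three-way partition: every third link of the chain goes to the same part.
    parts = [chain_part(n, [(i, i + 1) for i in range(k, n - 1, 3)], 0.1) for k in range(3)]
    x0 = np.arange(1.0, n + 1.0)
    x = x0.copy()
    for _ in range(round(t_end / h)):
        x = splitting_step(parts, x, h)
    w, Q = np.linalg.eigh(sum(parts))      # exact solution; the total matrix is symmetric
    exact = Q @ (np.exp(w * t_end) * (Q.T @ x0))
    return np.linalg.norm(x - exact)


e_coarse, e_fine = splitting_error(0.05), splitting_error(0.025)
print(e_coarse / e_fine)
```

The error ratio near 4 when the step is halved is consistent with second-order accuracy for the multi-way case as well.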

Page 46

General Operator Splitting Formulation

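• Equations

\[
e^{h(A_1 + A_2)} \approx e^{\frac{1}{2}hA_1}\,e^{hA_2}\,e^{\frac{1}{2}hA_1}
\]

With V', I' and V'', I'' denoting the intermediate states after the first and second sub-steps:

\[
\begin{bmatrix} \frac{2C}{h} + \frac{G_1}{2} & \frac{E_1}{2} \\ -\frac{E_1^T}{2} & \frac{2L}{h} + \frac{R_1}{2} \end{bmatrix}
\begin{bmatrix} V'(t) \\ I'(t) \end{bmatrix}
=
\begin{bmatrix} \frac{2C}{h} - \frac{G_1}{2} & -\frac{E_1}{2} \\ \frac{E_1^T}{2} & \frac{2L}{h} - \frac{R_1}{2} \end{bmatrix}
\begin{bmatrix} V(t) \\ I(t) \end{bmatrix}
+ \frac{1}{2}U(t + \tfrac{h}{2})
\]

\[
\begin{bmatrix} \frac{C}{h} + \frac{G_2}{2} & \frac{E_2}{2} \\ -\frac{E_2^T}{2} & \frac{L}{h} + \frac{R_2}{2} \end{bmatrix}
\begin{bmatrix} V''(t) \\ I''(t) \end{bmatrix}
=
\begin{bmatrix} \frac{C}{h} - \frac{G_2}{2} & -\frac{E_2}{2} \\ \frac{E_2^T}{2} & \frac{L}{h} - \frac{R_2}{2} \end{bmatrix}
\begin{bmatrix} V'(t) \\ I'(t) \end{bmatrix}
+ \frac{1}{2}U(t + \tfrac{h}{2})
\]

\[
\begin{bmatrix} \frac{2C}{h} + \frac{G_1}{2} & \frac{E_1}{2} \\ -\frac{E_1^T}{2} & \frac{2L}{h} + \frac{R_1}{2} \end{bmatrix}
\begin{bmatrix} V(t+h) \\ I(t+h) \end{bmatrix}
=
\begin{bmatrix} \frac{2C}{h} - \frac{G_1}{2} & -\frac{E_1}{2} \\ \frac{E_1^T}{2} & \frac{2L}{h} - \frac{R_1}{2} \end{bmatrix}
\begin{bmatrix} V''(t) \\ I''(t) \end{bmatrix}
+ \frac{1}{2}U(t + \tfrac{h}{2})
\]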

Page 47

Local Truncation Error (Cont.)

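• LTE of general operator splitting method

Estimate solution (X'_n and X''_n denote the intermediate states)

\[
\left(\frac{2}{\Delta t_n}M + \frac{N_1}{2}\right)X'_n = \left(\frac{2}{\Delta t_n}M - \frac{N_1}{2}\right)X_n + \frac{1}{2}U_{n+1/2}
\]

\[
\left(\frac{1}{\Delta t_n}M + \frac{N_2}{2}\right)X''_n = \left(\frac{1}{\Delta t_n}M - \frac{N_2}{2}\right)X'_n + \frac{1}{2}U_{n+1/2}
\]

\[
\left(\frac{2}{\Delta t_n}M + \frac{N_1}{2}\right)X_{n+1} = \left(\frac{2}{\Delta t_n}M - \frac{N_1}{2}\right)X''_n + \frac{1}{2}U_{n+1/2}
\]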

Page 48

Local Truncation Error (Cont.)

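• LTE of general operator splitting method

Estimate solution

\[
\begin{aligned}
\hat X_{n+1} ={}& \left(I - \tfrac{\Delta t_n}{4}A_1\right)^{-1}\left(I + \tfrac{\Delta t_n}{4}A_1\right)\left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_2\right)\left(I - \tfrac{\Delta t_n}{4}A_1\right)^{-1}\left(I + \tfrac{\Delta t_n}{4}A_1\right)X_n \\
&+ \Big[\tfrac{\Delta t_n}{4}\left(I - \tfrac{\Delta t_n}{4}A_1\right)^{-1}\left(I + \tfrac{\Delta t_n}{4}A_1\right)\left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_2\right)\left(I - \tfrac{\Delta t_n}{4}A_1\right)^{-1} \\
&\quad + \tfrac{\Delta t_n}{2}\left(I - \tfrac{\Delta t_n}{4}A_1\right)^{-1}\left(I + \tfrac{\Delta t_n}{4}A_1\right)\left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}
+ \tfrac{\Delta t_n}{4}\left(I - \tfrac{\Delta t_n}{4}A_1\right)^{-1}\Big]BU_{n+1/2}
\end{aligned}
\]

\[
\begin{aligned}
\hat X_{n+1} ={}& \left(I + A\Delta t_n + \tfrac{1}{2}A^2\Delta t_n^2 + \tfrac{1}{4}A^3\Delta t_n^3
- \left(\tfrac{1}{16}A_1^3 + \tfrac{1}{8}A_1^2A_2 + \tfrac{1}{8}A_2A_1^2 + \tfrac{1}{4}A_2A_1A_2\right)\Delta t_n^3\right)X_n \\
&+ \left(B\Delta t_n + \tfrac{1}{2}AB\Delta t_n^2 + \tfrac{1}{4}A^2B\Delta t_n^3
- \left(\tfrac{3}{32}A_1^2 + \tfrac{3}{16}A_2A_1\right)B\Delta t_n^3\right)U_n \\
&+ \left(\tfrac{1}{2}B\Delta t_n + \tfrac{1}{4}AB\Delta t_n^2\right)\Delta U_n + O(\Delta t_n^4)
\end{aligned}
\]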

Page 49

Local Truncation Error (Cont.)

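• LTE of general operator splitting method

LTE estimation

\[
\begin{aligned}
LTE = \varepsilon_n ={}& X_{n+1} - \hat X_{n+1} \\
={}& -\tfrac{1}{12}\Delta t_n^3\,\dddot X_n
+ \left(\tfrac{1}{16}A_1^3 + \tfrac{1}{8}A_1^2A_2 + \tfrac{1}{8}A_2A_1^2 + \tfrac{1}{4}A_2A_1A_2\right)\Delta t_n^3X_n \\
&+ \left(\tfrac{3}{32}A_1^2 + \tfrac{3}{16}A_2A_1\right)B\Delta t_n^3U_n + O(\Delta t_n^4)
\end{aligned}
\]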

Page 50

Local Truncation Error (Cont.)

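• LTE of general operator splitting method

LTE estimation

\[
X'_n - X_n = \tfrac{1}{4}A_1\Delta t_n\left(X'_n + X_n\right) + \tfrac{1}{4}B\Delta t_nU_{n+1/2}
\]

\[
X''_n - X'_n = \tfrac{1}{2}A_2\Delta t_n\left(X''_n + X'_n\right) + \tfrac{1}{2}B\Delta t_nU_{n+1/2}
\]

\[
X_{n+1} - X''_n = \tfrac{1}{4}A_1\Delta t_n\left(X_{n+1} + X''_n\right) + \tfrac{1}{4}B\Delta t_nU_{n+1/2}
\]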

Page 51

Local Truncation Error (Cont.)

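• LTE of general operator splitting method

LTE estimation

\[
\begin{aligned}
X_{n+1} - X_n ={}& \frac{\Delta t_n}{2}\left(\dot X_{n+1} + \dot X_n\right)
- \left(\tfrac{1}{16}A_1^3 + \tfrac{1}{8}A_1^2A_2 + \tfrac{1}{8}A_2A_1^2 + \tfrac{1}{4}A_2A_1A_2\right)\Delta t_n^3X_n \\
&- \left(\tfrac{3}{32}A_1^2 + \tfrac{3}{16}A_2A_1\right)B\Delta t_n^3U_n
\end{aligned}
\]

\[
LTE \approx \left(-\frac{\dddot X_n}{12} + \frac{\dot X_n - \dot X_{n-1}}{2\Delta t_{n-1}^2}\right)\Delta t_n^3
\]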

Page 52

Stability Discussion

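• The trapezoidal integration method is unconditionally stable for a stable system.

\[
e^{A\Delta t_n} \approx \left(I - \tfrac{\Delta t_n}{2}A\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A\right)
\]

• In our operator splitting method, the trapezoidal method is used for all the sub-systems

\[
e^{x(A_1 + A_2)} \approx e^{\frac{1}{2}xA_1}\,e^{xA_2}\,e^{\frac{1}{2}xA_1}
\]

\[
e^{A\Delta t_n} \approx \left(I - \tfrac{\Delta t_n}{4}A_1\right)^{-1}\left(I + \tfrac{\Delta t_n}{4}A_1\right)\left(I - \tfrac{\Delta t_n}{2}A_2\right)^{-1}\left(I + \tfrac{\Delta t_n}{2}A_2\right)\left(I - \tfrac{\Delta t_n}{4}A_1\right)^{-1}\left(I + \tfrac{\Delta t_n}{4}A_1\right)
\]

so the method is still unconditionally stable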

Page 53

Experimental Results

                                          Circuit1   Circuit2   Circuit3
  #Nodes                                  10,000     40,000     90,000
  #Transistors                            0          0          0
  Period                                  10ns       10ns       10ns
  SPICE3  CPU time (sec)                  77.8       485.3      3,061.1
          #steps                          115        115        114
  GOS     CPU time (sec)                  164.7      1011.6     3435.9
          #steps                          102        102        102
  Comparison (GOS time / SPICE3 time)     2.1x       2x         1.1x

Page 54

Voltage drop of Circuit3 (power mesh with sinks)

Page 55

Conclusions

• We investigate the alternating direction implicit and general operator splitting integration methods for transistor-level circuit transient simulation.

• In both methods, the circuit is divided into several sub-circuits. Each sub-circuit matrix is simpler than the full matrix, so a direct matrix solver remains efficient.

• Both methods are second-order accurate and unconditionally stable.

• Overhead:
  – Circuit partition
  – Each time step consists of many sub-steps, and each sub-step is an N-R iteration process

• Better for circuits with a large linear network

Page 56

Distributed Computing

• Distributed Processors
  – Cluster
  – Supercomputer
  – Multi-Core Processors (Intel Dual/Quad-Core, IBM Cell etc.)

• Standard
  – MPI
  – Partitioning
  – Matrix Solver

• Capabilities
  – Speed-up (10-100+)
  – Memory Capacity (10-100+)

Page 57

Future Works

• ADI method
  – More experiments

• General operator splitting method
  – Design and implement multi-way circuit partition algorithm
  – Implement multi-way general operator splitting program
  – Derive LTE for general multi-way situation
  – More experiments

• Distributed Computing
  – MPI Standard
  – Distributed Partitioning, Matrix Solver
