
Implicit coupling of Lagrangian grid and solution in plasma simulation with additive Schwarz preconditioned inexact Newton

Xuefei (Rebecca) Yuan$^{1,4}$, Stephen C. Jardin$^{2}$, David E. Keyes$^{3,4}$ and Alice E. Koniges$^{1}$

$^{1}$LBNL (USA), $^{2}$PPPL (USA), $^{3}$KAUST (Saudi Arabia), $^{4}$Columbia University (USA)

The work was supported by the Department of Applied Physics and Applied Mathematics of Columbia University, the Center for Simulation of RF Wave Interactions with Magnetohydrodynamics (CSWIM), and the Petascale Initiative in Computational Science at NERSC. Additionally, we are grateful for the extended computer time as well as the valuable support from NERSC. This work was supported by the Director, Office of Science, Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract No. DE-DC02-06ER54863.

MOTIVATION: TO SIMULATE ITER, ONE CAN ESTIMATE THAT AN ADDITIONAL 12 ORDERS OF MAGNITUDE ARE REQUIRED

a) 1.5 orders: increased processor speed and efficiency
b) 1.5 orders: increased concurrency
c) 1 order: higher-order discretizations
d) 1 order: flux-surface-following gridding, with less resolution required along than across field lines
e) 4 orders: adaptive gridding; zones requiring refinement are < 1% of the ITER volume, and resolution requirements away from them are $10^{2}$ times less severe
f) 3 orders: implicit solvers

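| Name | Symbol | Units | CDX-U | DIII-D | ITER |
|---|---|---|---|---|---|
| Field | $B_0$ | Tesla | 0.22 | 1.0 | 5.3 |
| Minor radius | $a$ | m | 0.22 | 0.67 | 2.0 |
| Temperature | $T_e$ | keV | 0.1 | 2.0 | 8.0 |
| Lundquist no. | $S$ | | $1\times10^{4}$ | $7\times10^{6}$ | $5\times10^{8}$ |
| Mode growth time | $\tau_A S^{1/2}$ | s | $2\times10^{-4}$ | $9\times10^{-3}$ | $7\times10^{-2}$ |
| Layer thickness | $a S^{-1/2}$ | m | $2\times10^{-3}$ | $2\times10^{-4}$ | $8\times10^{-5}$ |
| Zones | $N_R\times N_\theta\times N_\phi$ | | $3\times10^{6}$ | $5\times10^{10}$ | $3\times10^{13}$ |
| CFL time step size | $\Delta X/V_A$ | s | $2\times10^{-9}$ | $8\times10^{-11}$ | $7\times10^{-12}$ |
| Space-time points | | | $6\times10^{12}$ | $1\times10^{20}$ | $6\times10^{24}$ |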
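The layer-thickness row follows from the formula $a S^{-1/2}$ in the Symbol column. The sketch below recomputes it from the minor radius and Lundquist number rows; the dictionary and function names are illustrative only, and the results agree with the tabulated values only to about one significant figure, since the table entries are rounded.

```python
import math

# (minor radius a [m], Lundquist number S), copied from the table above
machines = {
    "CDX-U": (0.22, 1e4),
    "DIII-D": (0.67, 7e6),
    "ITER": (2.0, 5e8),
}

def layer_thickness(a, S):
    """Layer thickness a * S**(-1/2), in metres."""
    return a / math.sqrt(S)

# Recompute the "Layer thickness" row for each device
thickness = {name: layer_thickness(a, S) for name, (a, S) in machines.items()}

for name, value in thickness.items():
    print(f"{name}: {value:.1e} m")
```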

International Thermonuclear Experimental Reactor (ITER)

Nov. 2019 – First Plasma in Cadarache, France

GRID GENERATION TECHNIQUE: THE EQUIDISTRIBUTION METHOD

The logical domain: $\Xi^{2}$

The physical domain: $X^{2}$

The transformation: $\vec{x}(\vec{\xi}): \Xi^{2} \to X^{2}$

Let $\vec{x}(\vec{\xi}): \Xi^{2} \to X^{2}$ be a two-dimensional coordinate transformation between a reference grid (a curvilinear coordinate system) $\vec{\xi} = (\xi, \eta)$ and a physical grid (a Cartesian coordinate system) $\vec{x} = (x, y)$. Assume the transformation has the form $\vec{x} = \vec{\xi} + \nabla\omega$, where $\omega$ is the displacement potential. A single Monge-Ampère (MA) equation

$$\nabla^{2}\omega + \frac{\partial^{2}\omega}{\partial\xi^{2}}\frac{\partial^{2}\omega}{\partial\eta^{2}} - \left(\frac{\partial^{2}\omega}{\partial\xi\,\partial\eta}\right)^{2} = \frac{1}{\rho_x(x,y)} - 1$$

leads to an adaptive grid equidistributed by the density function $\rho_x(x,y)$, with the Neumann boundary condition $\nabla\omega\cdot\vec{n} = 0$ on $\partial X$.

The density function has the form

$$\rho_x(x,y) = \frac{\tilde{\rho}_x(x,y)}{V_o}\int_{\Xi^{2}}\frac{1}{\tilde{\rho}_x(x,y)}\,d\xi\,d\eta,$$

where $V_o = \int_{\Xi^{2}} d\xi\,d\eta$ is the area of the logical domain and $\tilde{\rho}_x(x,y)$ is provided by the user.

[Figure: the density function $\rho_x(x,y)$; adaptive grids; zoom-in areas.]
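The equidistribution idea can be illustrated in one dimension, where no MA solve is needed: nodes are placed so that every cell holds an equal share of the integral of the density, which makes cells small where the density is large. The sketch below is a simplified 1D analogue, not the two-dimensional MA-based method of the poster; the function `equidistribute` and the peaked test density are hypothetical.

```python
import math

def equidistribute(density, n_cells, a=0.0, b=1.0, n_fine=2000):
    """Return n_cells + 1 nodes in [a, b] so that each cell holds an equal
    share of the integral of `density` (trapezoidal rule on a fine grid)."""
    xs = [a + (b - a) * i / n_fine for i in range(n_fine + 1)]
    rho = [density(x) for x in xs]
    # Cumulative integral of the density on the fine grid
    cum = [0.0]
    for i in range(n_fine):
        cum.append(cum[-1] + 0.5 * (rho[i] + rho[i + 1]) * (xs[i + 1] - xs[i]))
    total = cum[-1]
    nodes = [a]
    j = 0
    for k in range(1, n_cells):
        target = total * k / n_cells
        while cum[j + 1] < target:
            j += 1
        # Linear interpolation inside fine interval j
        frac = (target - cum[j]) / (cum[j + 1] - cum[j])
        nodes.append(xs[j] + frac * (xs[j + 1] - xs[j]))
    nodes.append(b)
    return nodes

def peaked_density(x):
    """Hypothetical density with a sharp peak at x = 0.5."""
    return 1.0 + 20.0 * math.exp(-200.0 * (x - 0.5) ** 2)

nodes = equidistribute(peaked_density, 16)
sizes = [nodes[i + 1] - nodes[i] for i in range(16)]
print("smallest cell:", min(sizes), "edge cell:", sizes[0])
```

The cells cluster around the peak of the density, which is the behaviour the two-dimensional grid is meant to reproduce near regions flagged by $\rho_x$.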

R-TYPE ADAPTIVE GRID: THE CURRENT DENSITY AND TIME-DEPENDENT COORDINATES

The out-of-plane current density $j = -\nabla^{2}\psi$ can develop large gradients and near-singular behavior in the reconnection region as time evolves, requiring a localized region of higher resolution. Assume $\tilde{\rho}_x(x,y)$ is

$$\tilde{\rho}_x\big(x(\xi,\eta),\,y(\xi,\eta)\big) = 1 + \alpha\,|j(\xi,\eta)|,$$

where $\alpha = (r-1)/(j_{\max} - j_{\min})$, $r$ is the ratio between the biggest cell size and the smallest cell size, and $j_{\max}$ and $j_{\min}$ are the maximum and minimum of the current density, respectively.

[Figure: the mid-plane current density $-j$ vs. time. The vertical axis is along the mid-plane $y = 0$, and the horizontal axis is time $t = 0 \sim 40$. Panels: $\phi$, $V$, $\psi$, $B$, $-j$ and the adaptive grids at $t = 0, 10, 20, 30, 40$.]
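The density formula above is straightforward to evaluate once the current density is sampled on the grid. The following is a minimal sketch, assuming $j_{\max}$ and $j_{\min}$ are taken literally as the maximum and minimum of the sampled $j$ values; the function name and the sample values are hypothetical.

```python
def density_from_current(j_values, r):
    """Evaluate rho_tilde = 1 + alpha * |j| with alpha = (r - 1) / (j_max - j_min).

    j_values: sampled out-of-plane current density (hypothetical data)
    r: requested ratio between the biggest and the smallest cell size
    """
    j_max, j_min = max(j_values), min(j_values)
    if j_max == j_min:
        raise ValueError("current density is constant; alpha is undefined")
    alpha = (r - 1.0) / (j_max - j_min)
    return alpha, [1.0 + alpha * abs(v) for v in j_values]

j_samples = [0.0, 0.5, 2.0, 4.0]  # hypothetical samples of j
alpha, rho_tilde = density_from_current(j_samples, r=5.0)
print(alpha, rho_tilde)
```

In this particular sample $j_{\min} = 0$, so the ratio of the largest to the smallest density equals $r$.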

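LAGRANGIAN AND EULERIAN VELOCITIES

• The fluid velocity $\vec{V}$ is the sum of the velocity of the grid (the Lagrangian velocity $\vec{V}_C$) and the velocity of the fluid relative to the moving grid (the Eulerian velocity $\vec{V}_R$): $\vec{V} = \vec{V}_C + \vec{V}_R$.
• The temporal derivatives of a scalar function $f$ taken at a fixed spatial location $\vec{x}$ and at fixed coordinates $\vec{\xi}$ are related by the chain rule of differentiation:

$$\left.\frac{\partial f}{\partial t}\right|_{\vec{x}} = \left.\frac{\partial f}{\partial t}\right|_{\vec{\xi}} - \vec{V}_C\cdot\nabla f, \quad\text{where}\quad \vec{V}_C \equiv \left.\frac{\partial\vec{x}}{\partial t}\right|_{\vec{\xi}}.$$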
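The chain-rule relation can be checked numerically in one dimension. The sketch below uses a hypothetical moving grid $x(\xi, t) = \xi\,(1 + 0.3\sin t)$ and a hypothetical test function; both sides of the relation are approximated by central differences and should agree up to discretization error.

```python
import math

def f(x, t):
    """Hypothetical scalar field."""
    return math.sin(2.0 * x) * math.exp(-t)

def x_of(xi, t):
    """Hypothetical moving-grid map x(xi, t)."""
    return xi * (1.0 + 0.3 * math.sin(t))

def grid_velocity(xi, t):
    """V_C = dx/dt at fixed xi for the map above."""
    return 0.3 * xi * math.cos(t)

def chain_rule_sides(xi, t, h=1e-5):
    x = x_of(xi, t)
    # Left-hand side: df/dt at fixed physical location x
    lhs = (f(x, t + h) - f(x, t - h)) / (2 * h)
    # df/dt at fixed logical coordinate xi (the grid point moves with time)
    dfdt_xi = (f(x_of(xi, t + h), t + h) - f(x_of(xi, t - h), t - h)) / (2 * h)
    # Spatial gradient of f at fixed time
    dfdx = (f(x + h, t) - f(x - h, t)) / (2 * h)
    rhs = dfdt_xi - grid_velocity(xi, t) * dfdx
    return lhs, rhs

lhs, rhs = chain_rule_sides(0.7, 0.4)
print(lhs, rhs)
```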

MATHEMATICAL MODEL: GEM MAGNETIC RECONNECTION

By applying a mixed Lagrangian/Eulerian description of the fluid flow, the 4-field extended MHD equations are transformed from the physical domain $(x, y)$ to a time-dependent curvilinear coordinate system $(\xi, \eta)$:

$$\left\{\begin{aligned}
\frac{\partial}{\partial t}\nabla^{2}\phi + \vec{V}_R\cdot\nabla(\nabla^{2}\phi) &= [\nabla^{2}\psi, \psi] + \mu\nabla^{4}\phi \\
\frac{\partial}{\partial t}V + \vec{V}_R\cdot\nabla V &= [B, \psi] + \mu\nabla^{2}V - \mu_h\nabla^{4}V \\
\frac{\partial}{\partial t}\psi + \vec{V}_R\cdot\nabla\psi &= d_i[\psi, B] + \eta\nabla^{2}\psi - \nu\nabla^{4}\psi \\
\frac{\partial}{\partial t}B + \vec{V}_R\cdot\nabla B &= [V, \psi] + d_i[\nabla^{2}\psi, \psi] + \eta\nabla^{2}B - \nu\nabla^{4}B
\end{aligned}\right.$$

• the ion velocity: $\vec{V} = \nabla\phi\times\hat{z} + V\hat{z}$
• the magnetic field: $\vec{B} = \nabla\psi\times\hat{z} + B\hat{z}$
• the out-of-plane current density: $j = -\nabla^{2}\psi$
• the Poisson bracket: $[f, g] \equiv \nabla f\times\nabla g\cdot\hat{z}$
• the electrical resistivity: $\eta$
• the collisionless ion skin depth: $d_i$
• the fluid viscosity: $\mu$
• the hyper-resistivity (or electron viscosity): $\nu$
• the hyper-viscosity: $\mu_h$

• the computational domain: $\Omega = [0, 0.5L_x]\times[0, 0.5L_y]$, the first quadrant of the physical domain (finite difference, (anti-)symmetric fields)
• boundary conditions: Dirichlet at the top; anti-symmetric in $\phi, B$ and symmetric in $\psi, V$ at the other three boundaries
• initial conditions: a Harris equilibrium and perturbation combination for $\psi$; the other three fields are zero

$$\psi(\xi, \eta, 0) = \frac{1}{2}\ln\cosh 2\eta + \varepsilon\cos k_x\xi\,\cos k_y\eta, \quad k_x = \frac{2\pi}{L_x},\ k_y = \frac{2\pi}{L_y},\ \varepsilon = 0.1,\ L_x = 25.6,\ L_y = 12.8$$
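The initial flux function can be evaluated directly from the parameters listed above. The sketch below assumes the perturbation term has the form $\varepsilon\cos k_x\xi\,\cos k_y\eta$, which is a reading of the extracted slide text rather than a confirmed formula; `psi0` is a hypothetical name.

```python
import math

# Parameters as listed with the initial condition
LX, LY, EPS = 25.6, 12.8, 0.1
KX, KY = 2.0 * math.pi / LX, 2.0 * math.pi / LY

def psi0(xi, eta):
    """Harris equilibrium plus perturbation at t = 0 (assumed form)."""
    equilibrium = 0.5 * math.log(math.cosh(2.0 * eta))
    perturbation = EPS * math.cos(KX * xi) * math.cos(KY * eta)
    return equilibrium + perturbation

# Sample psi on a coarse grid covering the computational domain
# Omega = [0, 0.5*LX] x [0, 0.5*LY]
n = 4
samples = [[psi0(0.5 * LX * i / n, 0.5 * LY * j / n) for i in range(n + 1)]
           for j in range(n + 1)]
print(samples[0][0])  # value at the origin
```

Under this form $\psi$ is even in both coordinates, consistent with the symmetric boundary treatment of $\psi$ stated above.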

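CONVERGENCE, ACCURACY AND COMPLEXITY STUDY

• The total energy of the system is $T(t)$, the sum of the kinetic and magnetic energies integrated over the domain.
• The total energy of both the curvilinear and the Cartesian solutions approaches $\tilde{T}$ at $t = 40$.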

SIMULATION ARCHITECTURE: NERSC CRAY XE6 “HOPPER”

• 6384 nodes, 24 cores per node (153,216 total cores)
• 2 twelve-core AMD ‘MagnyCours’ 2.1 GHz processors per node (NUMA)
• 32 GB DDR3 1333 MHz memory per node (6000 nodes), 64 GB DDR3 1333 MHz memory per node (384 nodes)
• 1.28 Petaflop/s for the entire machine
• 6 MB L3 cache shared between 6 cores on the ‘MagnyCours’ processor
• 4 DDR3 1333 MHz memory channels per twelve-core ‘MagnyCours’ processor

[Figure: the total energy for $t = 0 \sim 40$; asymptotic limits at $t = 40$, with $\tilde{T} = 141.0179$.]

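| $M_\xi\times M_\eta$ | Cartesian | curvilinear |
|---|---|---|
| 32×32 | $7.859\times10^{-3}$ | $8.773\times10^{-4}$ |
| 64×64 | $2.368\times10^{-3}$ | $1.369\times10^{-4}$ |
| 128×128 | $6.416\times10^{-4}$ | $2.530\times10^{-5}$ |
| 256×256 | $1.647\times10^{-4}$ | $5.840\times10^{-6}$ |
| 512×512 | $4.165\times10^{-5}$ | |

• The curvilinear solutions are more accurate than the Cartesian solutions on the same problem size.

Relative error: $\lvert e\rvert = \lvert T - \tilde{T}\rvert / \lvert\tilde{T}\rvert$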
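One way to read the accuracy table is through the observed convergence order between successive grid sizes, $\log_2$ of the ratio of consecutive errors. The sketch below computes it from the tabulated relative errors; the helper name `observed_orders` is illustrative, and the calculation is an added reading of the data, not part of the original study.

```python
import math

# Relative errors |e| from the accuracy table, keyed by grid size M (M x M grid)
cartesian = {32: 7.859e-3, 64: 2.368e-3, 128: 6.416e-4, 256: 1.647e-4, 512: 4.165e-5}
curvilinear = {32: 8.773e-4, 64: 1.369e-4, 128: 2.530e-5, 256: 5.840e-6}

def observed_orders(errors):
    """log2(e_M / e_2M) for each consecutive pair of grid sizes."""
    sizes = sorted(errors)
    return [math.log2(errors[a] / errors[b]) for a, b in zip(sizes, sizes[1:])]

cart_orders = observed_orders(cartesian)
curv_orders = observed_orders(curvilinear)

print("Cartesian:  ", [round(p, 2) for p in cart_orders])
print("curvilinear:", [round(p, 2) for p in curv_orders])
```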

| | np | $M_\xi\times M_\eta$ | $\lvert e\rvert$ | T | F | M | N1 | N2 | N2/N1 |
|---|---|---|---|---|---|---|---|---|---|
| curv. | 1 | 64×64 | $1.369\times10^{-4}$ | 3572 | 262 | 20 | 979 | 5816 | 6 |
| Cart. | 16 | 256×256 | $1.647\times10^{-4}$ | 23700 | 907 | 13 | 1983 | 79945 | 40 |
| curv. | 16 | 256×256 | $5.840\times10^{-6}$ | 14400 | 422 | 35 | 390 | 16669 | 43 |

Complexity study. The number of processors (np), the problem size, the relative error, the execution time T (sec), flops F ($10^{9}$), memory M ($10^{7}$), the total number of nonlinear iterations N1, the total number of linear iterations N2, and linear iterations per nonlinear iteration N2/N1 are listed.

[Figures: MagnyCours processor; compute node configuration; $-j$ at $t = 0.2$ and $t = 40$; $\vec{V}_C$ at $t = 0.2$ and $t = 40$.]

Total energy:

$$T(t) = \frac{1}{2}\iint\left\{\lvert\nabla\phi\rvert^{2} + V^{2} + \lvert\nabla\psi\rvert^{2} + B^{2}\right\}dA.$$

Accuracy study. Taking $\tilde{T}$ as the “exact” solution, relative errors at the final time for the curvilinear and Cartesian solutions of different problem sizes are listed.

• The Cartesian solution on 256×256 is less accurate than the curvilinear solution on 64×64, so that for comparable accuracy, the curvilinear approach is 106 times more efficient.
• The Cartesian solution on 256×256 is not only much less accurate than the curvilinear solution of the same size, but also requires much more computational time and many more nonlinear and linear iterations.
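The factor of 106 quoted above matches the ratio of processor-seconds (np × T) between the Cartesian 256×256 run and the curvilinear 64×64 run in the complexity table. That interpretation is an assumption made here, since the poster does not spell out the calculation. The sketch also recomputes the N2/N1 column; all names are illustrative.

```python
# (np, execution time T [s], nonlinear iterations N1, linear iterations N2),
# copied from the complexity table
runs = {
    "curv. 64x64": (1, 3572, 979, 5816),
    "Cart. 256x256": (16, 23700, 1983, 79945),
    "curv. 256x256": (16, 14400, 390, 16669),
}

def processor_seconds(name):
    """Number of processors times execution time."""
    n_proc, seconds, _, _ = runs[name]
    return n_proc * seconds

def linear_per_nonlinear(name):
    """Average linear iterations per nonlinear iteration, N2 / N1."""
    _, _, n1, n2 = runs[name]
    return n2 / n1

efficiency_ratio = processor_seconds("Cart. 256x256") / processor_seconds("curv. 64x64")
print(round(efficiency_ratio))
for name in runs:
    print(name, round(linear_per_nonlinear(name)))
```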

Poisson bracket: $[f, g] \equiv \nabla f\times\nabla g\cdot\hat{z}$
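In Cartesian components the bracket reduces to $[f, g] = \partial_x f\,\partial_y g - \partial_y f\,\partial_x g$. The sketch below evaluates it pointwise with central differences for hypothetical test functions; it only illustrates the definition and is not the discretization used in the simulations.

```python
def poisson_bracket(f, g, x, y, h=1e-5):
    """Approximate [f, g] = f_x * g_y - f_y * g_x at (x, y) by central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return fx * gy - fy * gx

# Hypothetical test functions: [x^2, x*y] = 2x * x - 0 * y = 2 x^2
def f_test(x, y):
    return x * x

def g_test(x, y):
    return x * y

value = poisson_bracket(f_test, g_test, 1.5, 2.0)
self_bracket = poisson_bracket(f_test, f_test, 1.5, 2.0)
print(value, self_bracket)
```

The bracket of any function with itself vanishes, which the second evaluation reproduces.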