SPPEXA Garching, 25.01.16
Rochus Schmid
Lehrstuhl für Anorganische Chemie II
Computational Materials Chemistry Group
Ruhr-Universität Bochum
Efficient Solvers for Density Functional Theory
or Quantum Chemistry and Stencils?
The Target: Atomistic Simulations of Chemical Systems and Processes
Ab initio MD (quantum mechanical description computed with our
RSDFT code)
Surface of a Ge single crystal terminated with hydroxyl-groups
• 192 atoms
• ca. 1.5 nm³
• 104x104x97 grid points
• 0.12 fs timestep
• ~5s cpu-time/step
2.5 ps in 2 days
Quantum Chemistry
1. Born-Oppenheimer Approximation:
Because of the mass difference: approximate nuclei as classical
point charges (justification for E = f(R))
Must treat electrons by QM
2. $H\Psi = E\Psi$: Approximate Ψ (many-electron WF) by a product of single
particle WFs φ ("molecular orbitals")
electrons move in the mean field of the others
electrons are fermions: use a Slater determinant, which
changes sign when exchanging electrons
Hartree-Fock approximation
$H = -\tfrac{1}{2}\nabla^2 + V[\rho]$: kinetic and potential (Coulomb)
$\rho = |\psi|^2$ → $H$ depends on the solution
Kohn-Sham DFT
• Second order non-linear eigenvalue problem:
$H_{KS} = -\tfrac{1}{2}\nabla^2 + V_{EE} + V_{NE} + V_{XC}$
solve iteratively (e.g. by matrix diagonalization), since $V_{EE}$ and $V_{XC}$ depend on the solution
NOTE: only a few lowest eigenvalues (occupied states) are
needed for the energy
• Direct minimisation of the energy expression $E[\psi]$ by following the
"force on the wavefunction" $-\delta E/\delta\psi$ (only lowest eigenvalues)
For (energy conserving) dynamics: accurate forces are needed
in the integration of the Newtonian eq. of motion
Because of the dependence of $\rho$ on $\psi$: potentials $\delta E/\delta\rho$
Consequences of MD
• Integrate Newton's EOM with sufficient accuracy to give a proper
Boltzmann ensemble
• We need to solve the problem E = f(R) many times
• We need accurate forces $\partial E/\partial R$ at each step
Large variation of V, ρ and φ close to the nuclei, but much smoother in
the rest of the domain
• It is tempting to use hierarchical methods with some kind of
local refinement, but the refinement changes when the nuclei move!
• Either we need converged solutions (usually too expensive) or
methods where we can compute the resulting forces (due to
changes in the local refinement) in an analytic and efficient
way.
Representation of Wavefunctions
“Traditional” forms:
• LCAO: atom-centered BF
GTO, STO (but also numeric)
• Planewaves: non-atom-centered BF
New forms:
• Finite Difference (Beck; Bernholc; Chelikowsky; Fattebert; …)
• Finite Elements (Pask)
• Discrete Variable Representation (Tuckerman)
• Wavelets (Goedecker; Arias)
• “BLIP”-functions (Conquest-code)
• Localized numeric atom-centered BF (Siesta-code)
• ...
See the excellent review by Th. Beck, Rev. Mod. Phys. 2000, 72, 1041.
RS versus FS
Wavefunctions represented in Fourier Space (PW)
• Laplacian is diagonal in FS
→ kinetic energy ($\nabla^2\psi$) or Coulomb potential ($\nabla^2 V_{ee}$)
• other contributions in real space (e.g. $E_{xc}$)
• convert RS ↔ FS via 3D-FFT
Wavefunctions discretized in Real Space (RS)
• everything evaluated in RS → no FFT
• approximate Laplacian by Finite Difference
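To illustrate why a diagonal Laplacian is attractive: in a periodic box, $\nabla^2 V_{ee} = -4\pi\rho$ turns into $V(G) = 4\pi\rho(G)/|G|^2$ for every plane-wave component. The following is a minimal NumPy sketch under that assumption, not code from RSDFT; the function name `poisson_fft` and the test density are made up for illustration.

```python
import numpy as np

def poisson_fft(rho, box):
    """Solve nabla^2 V = -4 pi rho on a periodic grid (hypothetical helper).

    rho: 3D array of density values; box: the three box edge lengths.
    """
    # Angular wave numbers G along each axis; any grid size works, not just 2^N.
    g = [2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
         for n, length in zip(rho.shape, box)]
    gx, gy, gz = np.meshgrid(*g, indexing="ij")
    g2 = gx**2 + gy**2 + gz**2

    rho_g = np.fft.fftn(rho)
    v_g = np.zeros_like(rho_g)
    nonzero = g2 > 0.0
    # Laplacian is diagonal in Fourier space: -|G|^2 V(G) = -4 pi rho(G).
    # The G = 0 component is left at zero (assumes a neutral cell).
    v_g[nonzero] = 4.0 * np.pi * rho_g[nonzero] / g2[nonzero]
    return np.fft.ifftn(v_g).real

# Single cosine mode along x, for which the exact answer is known.
shape, box = (16, 12, 10), (8.0, 6.0, 5.0)
k = 2.0 * np.pi / box[0]
x = np.arange(shape[0]) * box[0] / shape[0]
rho = np.cos(k * x)[:, None, None] * np.ones(shape)
v = poisson_fft(rho, box)
```

For this single mode the result should equal $4\pi\rho/k^2$. The price of the FS route is the global communication inside the 3D-FFT, which the RS route avoids.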
A Car-Parrinello Real Space DFT Code
• Ab initio Molecular Dynamics (Car-Parrinello MD)
• Numerical wave functions discretized in real space
• Norm-conserving PPs
• Multigrid Poisson solver
R. Schmid, J. Comput. Chem. 2004, 25, 799-812.
R. Schmid, M. Tafipolsky, P. H. König, and H. Köstler, Phys. Stat. Sol. B 2006, 243, 1001-1015.
M. Tafipolsky and R. Schmid, J. Chem. Phys. 2006, 124, 174102/1-174102/9.
H. Köstler, R. Schmid, U. Rüde, Ch. Scheit, Comput. Visual Sci. 2008, 11, 115-122.
[Figure: hydrogen-terminated GaN [0001] surface]
Energy expression
• Higher order FD Laplacian (19 point stencil)
3 neighbor points in each direction
• Norm-conserving pseudopotentials (NCPP)
Kleinman-Bylander separation
• Hartree potential via Poisson-Equation (MG w. MSD)
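• XC: for GGA FD gradient norm (19 point stencil)
$\nabla^2 V_{ee} = -4\pi\rho$
$V_{ne} = \sum_{atom}\Big[ V_{loc} + \sum_{lm} D_l\,|p_{lm}\rangle\langle p_{lm}| \Big]$ with $|p_{lm}\rangle = (V_l - V_{loc})\,|\phi_{lm}\rangle$
$E_{DFT}[\psi_i, R_n] = \sum_i^{occ} f_i\,\langle\psi_i|-\tfrac{1}{2}\nabla^2|\psi_i\rangle + E_{ne}[\rho, R_n] + E_{ee}[\rho] + E_{xc}[\rho] + E_{nn}(R_n)$
with $\rho = \sum_i^{occ} f_i\,|\psi_i|^2$ and $S_{ij} = \langle\psi_i|\psi_j\rangle = \delta_{ij}$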
R. Schmid, M. Tafipolsky, P. H. König, H. Köstler, phys. stat. sol. (b) 2006, 243, 1001-1015.
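A 19-point stencil with 3 neighbor points in each direction can be sketched as follows. This is an illustration only: it assumes the standard 6th-order central-difference weights and periodic boundaries (via `np.roll`), neither of which is stated on the slide, and `fd_laplacian` is a hypothetical name.

```python
import numpy as np

# Standard 6th-order central-difference weights for d^2/dx^2 (offsets 0..3).
WEIGHTS = (-49.0 / 18.0, 3.0 / 2.0, -3.0 / 20.0, 1.0 / 90.0)

def fd_laplacian(f, spacing):
    """19-point FD Laplacian of a periodic 3D array f; spacing = (hx, hy, hz)."""
    lap = np.zeros_like(f)
    for axis, h in enumerate(spacing):
        # Central point contributes once per axis (3 * 6 neighbors + 1 center = 19).
        lap += WEIGHTS[0] * f / h**2
        for offset in (1, 2, 3):
            neighbors = np.roll(f, offset, axis=axis) + np.roll(f, -offset, axis=axis)
            lap += WEIGHTS[offset] * neighbors / h**2
    return lap

# Check against sin(x), whose Laplacian is -sin(x).
n = 32
h = 2.0 * np.pi / n
x = np.arange(n) * h
f = np.sin(x)[:, None, None] * np.ones((n, n, n))
error = np.max(np.abs(fd_laplacian(f, (h, h, h)) + f))
```

Each axis may carry its own spacing, which matters when the box is set by lattice parameters and hx, hy, hz differ slightly.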
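Car-Parrinello-MD
• The idea:
$\psi_i$ are expanded in some basis functions by coefficients $C_i$
• $E_{DFT}(R, C_i)$ → $\partial E_{DFT}/\partial R$ and $\partial E_{DFT}/\partial C_i$
• Give the "C's" a fictitious mass μ
• Integrate the Eq. of Motion of both R and C together
• Maintain orthonormality of $\psi_i$ by Lagrange mult.
• Integrate by Position Verlet (or Vel. Verlet)
$L_{CP} = \tfrac{1}{2}\sum_n M_n \dot{R}_n^2 + \tfrac{1}{2}\mu\sum_i \langle\dot\psi_i|\dot\psi_i\rangle - E_{DFT} + \sum_{ij}\Lambda_{ij}\,(S_{ij} - \delta_{ij})$
$|\psi_i(+)\rangle = 2|\psi_i(0)\rangle - |\psi_i(-)\rangle - \frac{\Delta t^2}{\mu}\Big[ 2\hat{H}_{DFT}|\psi_i(0)\rangle - \sum_j \Lambda_{ij}|\psi_j(0)\rangle \Big]$
$R_n(+) = 2R_n(0) - R_n(-) - \frac{\Delta t^2}{M_n}\,\frac{\partial E_{DFT}}{\partial R_n}\Big|_{0}$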
WF gradient Ĥ_DFT|ψ>
|gpsi_i> = -∇² |psi_i>                          Comm. (Laplacian)
|gpsi_i> += Σ |p><p|psi_i>
rho = Σ_i f_i |psi_i|² ; drho = rho - compn(R_n)
Vee[drho]                                       Comm.!!! (Poisson)
Vxc[rho]                                        Comm. (only GGA)
Vtot = Vloc(R_n) + Vxc + Vee
|gpsi_i> += 2 |psi_i> Vtot
[Diagram labels: psi, rho]
Verlet / SHAKE
|psi_i'(+)> = 2 |psi_i(0)> - |psi_i(-)> - Δt²/μ |gpsi_i(0)>        O(NM)
A_ij = <psi_i'(+)|psi_j'(+)> ; B_ij = <psi_i(0)|psi_j'(+)>         O(N²M)
Solve: XX + XB + B†X = I - A ; X_ij = Δt²/μ Λ_ij
use: A' = I - A ; B' = I - B ; X_0 = ½ A'
iterate: X_{n+1} = ½ [A' + X_n B' + B'† X_n - (X_n)²]
|psi_i(+)> = |psi_i'(+)> + Σ_j X_ij |psi_j(0)>                     O(N²M)
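The SHAKE iteration above can be sketched in a few lines of NumPy. This is a toy version under stated assumptions: real orbitals stored as rows of an N x M array, plain dot products as overlaps (no grid volume element, no parallel distribution), and the convention B_ij = <psi_i(0)|psi_j'(+)>. The name `shake` and the random test data are illustrative only.

```python
import numpy as np

def shake(psi_prime, psi0, tol=1e-13, max_iter=200):
    """Restore orthonormality of the unconstrained Verlet step psi_prime.

    psi_prime, psi0: arrays of shape (N orbitals, M grid points);
    psi0 must have orthonormal rows. Returns (psi_plus, X).
    """
    eye = np.eye(psi0.shape[0])
    a_prime = eye - psi_prime @ psi_prime.T   # A' = I - A, cost O(N^2 M)
    b_prime = eye - psi0 @ psi_prime.T        # B' = I - B, cost O(N^2 M)
    x = 0.5 * a_prime                         # X_0 = A'/2
    for _ in range(max_iter):
        x_next = 0.5 * (a_prime + x @ b_prime + b_prime.T @ x - x @ x)
        converged = np.max(np.abs(x_next - x)) < tol
        x = x_next
        if converged:
            break
    # |psi_i(+)> = |psi_i'(+)> + sum_j X_ij |psi_j(0)>, cost O(N^2 M)
    return psi_prime + x @ psi0, x

# Toy data: orthonormal start, small random "Verlet step" that breaks it.
rng = np.random.default_rng(0)
n_orb, n_grid = 4, 50
q, _ = np.linalg.qr(rng.standard_normal((n_grid, n_orb)))
psi0 = q.T
psi_prime = psi0 + 0.01 * rng.standard_normal((n_orb, n_grid))
psi_plus, x = shake(psi_prime, psi0)
```

The matrix iteration itself only involves N x N matrices; the grid-sized work is in the two overlap matrices and the final update.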
Coding
Python (+NumPy):
• Object Oriented
• User Interface
• Basic I/O
• Analysis
• Some calculations
Wrapped C-functions for speed: fd, pp, ps, xc, BLAS
Python: ~7000 lines; C: ~10000 lines
Automatic wrapping of C-functions by SWIG
Grid Objects:
wavefunctions, densities, potentials
• data distributed, parallel bookkeeping hidden
• add, scale, copy in Python (via BLAS)
• Overlap operator "|" for <bra|ket>
Example: Eigenvalues as method of psi object
    def eigenvalues(self):
        # "|" is the overlap operator of the grid objects: Hij = <psi_i|gpsi_j>
        Hij = (self.psi | self.gpsi)
        # LinearAlgebra is the legacy Numeric module providing eigenvalues()
        self.eigenv = LinearAlgebra.eigenvalues(Hij)
        return
$H_{ij} = \langle\psi_i|\hat{H}|\psi_j\rangle$
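How an overlap operator "|" might be provided in Python can be sketched with the `__or__` special method. The class below is a hypothetical stand-in for the grid objects (serial, no distributed data); `GridFunctions`, `values` and `volume_element` are illustrative names, and `numpy.linalg.eigvalsh` is used here in place of the legacy call.

```python
import numpy as np

class GridFunctions:
    """Hypothetical set of functions sampled on a grid (one function per row)."""

    def __init__(self, values, volume_element=1.0):
        self.values = np.asarray(values, dtype=float)
        self.volume_element = volume_element  # h^3 of one grid point

    def __or__(self, other):
        # (bra | ket) returns the matrix of overlaps <bra_i|ket_j>.
        return self.volume_element * (self.values @ other.values.T)

# Three orthonormal "orbitals" on 8 grid points and a scaled copy as "gpsi".
psi = GridFunctions(np.eye(3, 8))
gpsi = GridFunctions(np.diag([1.0, 2.0, 3.0]) @ psi.values)
overlap = psi | psi
eigenvalues = np.linalg.eigvalsh(psi | gpsi)
```

In a parallel code the same operator would hide the reduction over the distributed grid, so that the calling Python code stays unchanged.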
A Note on Domain Size and Grid Spacing
Solvers must be efficient for any grid size (not just $2^N$!)
• Domain is defined by physics (e.g. crystal lattice
parameters)
• Find integer numbers of grid points giving close to equal grid spacing
in all spatial directions
• Use the largest grid spacing admissible for the given
chemical species in order to reduce the size of the
problem (orthogonalization!)
Example GaN
x = 4 a = 12.6092 Å
Nx = 80, hx = 0.2978 Bohr
y = 2 sqrt(3) a = 10.91989 Å
Ny = 70, hy = 0.2948 Bohr
for 1 ML H: Ga-H 1.61905 / 1.61928
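The grid spacings of the GaN example follow from the box edges and the number of grid points. The sketch below reproduces them and adds a hypothetical helper that picks the smallest point count not exceeding a maximum spacing; the limit of 0.3 Bohr and the `multiple` argument are assumptions for illustration, since the slide does not state the selection rule that was actually used.

```python
import math

ANGSTROM_PER_BOHR = 0.529177  # approximate conversion factor

def spacing_bohr(length_angstrom, n_points):
    """Grid spacing in Bohr for a box edge given in Angstrom."""
    return length_angstrom / ANGSTROM_PER_BOHR / n_points

def points_for_max_spacing(length_angstrom, h_max_bohr, multiple=1):
    """Smallest point count (a multiple of `multiple`) with spacing <= h_max_bohr."""
    length_bohr = length_angstrom / ANGSTROM_PER_BOHR
    blocks = math.ceil(length_bohr / (h_max_bohr * multiple))
    return blocks * multiple

a = 12.6092 / 4.0                                  # lattice parameter from x = 4 a
hx = spacing_bohr(4.0 * a, 80)                     # about 0.2978 Bohr
hy = spacing_bohr(2.0 * math.sqrt(3.0) * a, 70)    # about 0.2948 Bohr
nx = points_for_max_spacing(4.0 * a, 0.3, multiple=8)
```

The two spacings differ by about 1 %, which is what "close to equal" amounts to once the box is fixed by the lattice.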
Problem: Iterative Poisson Solver
“direct integration”
of Coulomb’s law impossible:
$V_{ee}(r_i) = \sum_j \frac{\rho(r_j)}{|r_i - r_j|}\,h^3$
→ iterative solution of the Poisson eq.
(SOR-solver, FD-Laplacian):
$\nabla^2 V_{ee}(r_i) = -4\pi\rho(r_i)$
$V_{ee}$ acts as force on ψ
→ Critical for energy conservation in CP-MD
• Convergence of solver
• Boundary conditions (for non periodic systems)
• Discretization errors
Prediction of Vee
• 1st order: $V_{ee}^{init}(0) = 2V_{ee}(-1) - V_{ee}(-2)$
• 2nd order: $V_{ee}^{init}(0) = 3V_{ee}(-1) - 3V_{ee}(-2) + V_{ee}(-3)$
Testcase: free dyn. of Si8, h = 0.3 a.u. (87x87x87)
Δt = 5 a.u.; μ_e = 700 a.u.
Vee BC enf. zero
PS conv. 10^-10
2 dual Xeon nodes (4 CPUs)
Prediction   PS cyc./step   CPU total   CPU ener.   CPU Vee
“zero”       245.1          7.45 s      5.26 s      3.38 s
1st          83.8           5.20 s      3.02 s      1.16 s
2nd          54.5           4.82 s      2.65 s      0.79 s
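The two prediction formulas are plain polynomial extrapolations from the potentials of the previous time steps. A minimal sketch, with `predict_vee` and the list-based history as assumptions for illustration:

```python
import numpy as np

def predict_vee(history, order):
    """Initial guess for Vee at the new step.

    history: previously converged potentials, oldest first, newest last.
    """
    if order == 1:
        return 2.0 * history[-1] - history[-2]
    if order == 2:
        return 3.0 * history[-1] - 3.0 * history[-2] + history[-3]
    raise ValueError("only 1st and 2nd order prediction are sketched here")

# Potentials that vary linearly / quadratically in time are predicted exactly.
times = np.array([-3.0, -2.0, -1.0])
linear = [(5.0 + 2.0 * t) * np.ones(4) for t in times]
quadratic = [t**2 * np.ones(4) for t in times]
guess_linear = predict_vee(linear, order=1)        # value at t = 0 is 5
guess_quadratic = predict_vee(quadratic, order=2)  # value at t = 0 is 0
```

The better the starting guess, the fewer Poisson solver cycles are needed per step, which is the effect shown in the table.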
Poisson-Solver: Multigrid
• From SOR (ω = 1.7) to
Multigrid (MG)
• Method from J. Bernholc
(MSD on finest level,
7-point stencil on others)
• We use GS (ω = 1.0)
instead of Jacobi
• FMG doesn’t help
[Figure: multigrid cycle over the grid levels h, 2h, 4h, 8h; step counts 4, 4, 4, 4, 1, 1, 1]
Opt. of Si8, h = 0.3 (71x71x71), serial (1 CPU):
CPU Vee     SOR       MG
1st step    21.9 s    2.28 s
average     3.37 s    0.644 s
Higher order techniques: MSD
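• Balance between accuracy and communication overhead:
Mehrstellen-discretization (MSD)
• MSD on finest grid only
and 2nd order FD on all coarse grids
• Scaling of correction of finest grid (~1.25)
$A_h V_{ee} = -4\pi B\rho$, with the 3x3x3 stencils given plane by plane:
$B = \frac{1}{12}\left[\begin{pmatrix}0&0&0\\0&1&0\\0&0&0\end{pmatrix},\ \begin{pmatrix}0&1&0\\1&6&1\\0&1&0\end{pmatrix},\ \begin{pmatrix}0&0&0\\0&1&0\\0&0&0\end{pmatrix}\right]$
$A_h = \frac{1}{6h^2}\left[\begin{pmatrix}0&1&0\\1&2&1\\0&1&0\end{pmatrix},\ \begin{pmatrix}1&2&1\\2&-24&2\\1&2&1\end{pmatrix},\ \begin{pmatrix}0&1&0\\1&2&1\\0&1&0\end{pmatrix}\right]$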
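The Mehrstellen stencils can be written down and checked numerically. The sketch below builds A and B as 3x3x3 arrays (center -24 and 6, faces 2 and 1, edges 1 and 0, corners 0) and applies them on a periodic grid with `np.roll`; the periodic boundaries, the function names and the cosine test case are assumptions for illustration.

```python
import numpy as np

def mehrstellen_stencils(h):
    """Return (A, B) as 3x3x3 arrays for A V = -4 pi B rho."""
    a = np.zeros((3, 3, 3))
    b = np.zeros((3, 3, 3))
    for index in np.ndindex(3, 3, 3):
        steps = sum(abs(i - 1) for i in index)  # 0 center, 1 face, 2 edge, 3 corner
        if steps == 0:
            a[index], b[index] = -24.0, 6.0
        elif steps == 1:
            a[index], b[index] = 2.0, 1.0
        elif steps == 2:
            a[index] = 1.0
    return a / (6.0 * h**2), b / 12.0

def apply_stencil(stencil, f):
    """Apply a 3x3x3 stencil to a periodic 3D array (nearest neighbors only)."""
    out = np.zeros_like(f)
    for (i, j, k), weight in np.ndenumerate(stencil):
        if weight != 0.0:
            out += weight * np.roll(f, (1 - i, 1 - j, 1 - k), axis=(0, 1, 2))
    return out

# V = cos(x) solves nabla^2 V = -4 pi rho for rho = cos(x) / (4 pi).
n = 16
h = 2.0 * np.pi / n
x = np.arange(n) * h
v = np.cos(x)[:, None, None] * np.ones((n, n, n))
rho = v / (4.0 * np.pi)
a, b = mehrstellen_stencils(h)
residual = np.max(np.abs(apply_stencil(a, v) + 4.0 * np.pi * apply_stencil(b, rho)))
```

Only nearest grid points enter, so the halo exchanged between parallel domains is one layer wide, while the residual for the exact solution is still much smaller than with the plain 7-point Laplacian on the same grid.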
A Nested Problem
• Inner loop: MG Poisson solver
• From the solution $V_{EE}$ we need to compute $E_{EE}$ and $\partial E_{EE}/\partial\rho$ (in the
discretized case).
• We need higher order discretizations (MSD is efficient in
parallel, because only one neighbor point is needed)
• 3D (or 2D) periodic with proper boundary conditions
• Grid spacing should be similar and the simulation box is defined
from physics → arbitrary number of grid points in all directions!
(currently limited to multiples of 8 because of three levels in MG)
• We solve multiple times (~10000 timesteps for a ps trajectory)
starting from a very good approx. from the last step
[Diagram: outer loop KS-Equations ($\nabla^2\psi$), inner loop Poisson ($\nabla^2 V_{EE}$)]
Routes for Improvement
• Discard the Car-Parrinello scheme: fictitious wavefunction
dynamics can lead to difficulties
Solve the KS eigenvalue problem by FAS-MG?
Problem: resolving pseudopotentials (nuclei-electron interaction)
on coarse grids?
How to do it? Solve the Hartree problem every step? Solve all
wavefunctions at the same time or one after the other?
My Personal Wish-List
• Efficient and flexible parallelization on different hardware with
different parallelization strategies (Infiniband MPI, OpenMPI on
multicore CPUs and GPGPU) → autotuning??
• No restrictions in domain sizes and grid spacings
• Higher order (and Mehrstellen) discretizations
• Local refinement with accurate forces (can this be done?)
• WF either localized (order-N) or over the full domain ($N^2$)
Performance
(on the example of our own code RSDFT and CPMD):
• Current status: ~200 atoms, ~400 orbitals, $100^3$ grid points,
~10 GB, 48 cores, ca. 5 s/timestep
• Wish: ~1000-2000 atoms, ~5000 orbitals, $200^3$-$300^3$ grid points,
~1000 GB, 1024-2048 cores, ~10 s/timestep
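The memory figures can be related to the storage of the orbitals on the grid. The estimate below assumes real double-precision values (8 bytes) and counts a single set of orbitals; the helper name is hypothetical. A Car-Parrinello run keeps several such sets (for example current, previous and gradient), which is an assumption about how the quoted ~10 GB comes about.

```python
def orbital_storage_gb(n_orbitals, points_per_axis, bytes_per_value=8):
    """Storage in GB for one set of orbitals on a cubic grid."""
    return n_orbitals * points_per_axis**3 * bytes_per_value / 1e9

current = orbital_storage_gb(400, 100)      # 3.2 GB per orbital set
wish_low = orbital_storage_gb(5000, 200)    # 320 GB per orbital set
wish_high = orbital_storage_gb(5000, 300)   # 1080 GB per orbital set
```

The upper end of the wish list is thus already of the order of 1000 GB for one orbital set alone.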
Thank you!
Electrochemistry
• Macroscopic
“system”
• The actual
interface is an
open system (ions
and electrons leave
and enter)
[Figure: interface exchanging ions and electrons with its surroundings]
Potential Drop and Field in the EDL
• The EDL is “diffuse” and dynamic
• Ion charges compensate surface
charge
For an atomistic model:
• We need ions (water alone is not
enough)
• The system must be “open” (grand
canonical)
• We need to include the complete
EDL up to the point where the
potential is constant (system neutral)
→ can be large, e.g. for low conc. of
the electrolyte
[Figure: negatively charged electrode facing positive ions in the solvent; charge and potential profiles, with the Debye length marked]
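Electrostatic problem including implicit solvent/open boundary
1.) ε = const.: $\nabla^2\phi(r) = -4\pi\rho(r)$ → solve in FS (or in RS via MG)
2.) ε(r) (1 … 78): $\nabla\cdot[\varepsilon(r)\nabla\phi(r)] = -4\pi\rho(r)$
density dependent ε(r) + cavitation term: Fattebert, Gygi, Marzari et al., JCP 2006, 124, 074103
3.) ε(r) (1 … 78) plus ions:
(modified) Poisson-Boltzmann Eq.
$\nabla\cdot[\varepsilon(r)\nabla\phi(r)] = -4\pi[\rho(r) + F(\phi(r))]$
→ solve in RS
[Equation annotations: damping, dielectric, compensating ion charges]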
Real Space: 2D Periodic