Algorithms and Computational Aspects of DFT Calculations
Part II
Juan Meza and Chao Yang
High Performance Computing Research
Lawrence Berkeley National Laboratory
IMA Tutorial: Mathematical and Computational Approaches to Quantum Chemistry
Institute for Mathematics and its Applications, University of Minnesota
September 26-27, 2008
Juan Meza (LBNL) Algorithms and Computational Aspects of DFT Calculations September 27, 2008 1 / 37
1 Goals and Motivation
2 Review of Equations
3 Plane Wave DFT Computational Components
4 Parallelization Strategies
5 Future Computational Challenges
    Linear Scaling Methods
    Parallelism Issues
6 Software
    Available Codes
    KSSOLV
7 Summary
Goals
1 The Role of Computation
2 Review Equations and Solution Techniques
3 Discuss Major Computational Aspects of Plane Wave DFT codes
4 Present Some Parallelization Issues
5 Highlight Computational Challenges
Materials by design
Advances in density functional theory coupled with multinode computational clusters now enable accurate simulation of the behavior of multi-thousand atom complexes that mediate the electronic and ionic transfers of solar energy conversion. These new and emerging nanoscience capabilities bring a fundamental understanding of the atomic and molecular processes of solar energy utilization within reach.
Basic Research Needs for Solar Energy Utilization, Report of the BES Workshop on Solar Energy Utilization, April 18-21, 2005
DFT codes are widely used for science applications
9470 nodes; 19,480 cores
13 Tflops/s SSP (100 Tflops/s peak)
Upgrade to QuadCore (355 Tflops/s peak)
DFT methods account for 75% of the materials sciences simulations at NERSC, totaling over 5 million hours of computer time in 2006
We can now simulate some realistic structures
The charge density of a 15,000 atom quantum dot, $\mathrm{Si}_{13607}\mathrm{H}_{2236}$. Using 2048 processors at NERSC the calculation took about 5 hours.
The calculated dipole moment of a 2633 atom CdSe quantum rod, $\mathrm{Cd}_{961}\mathrm{Se}_{724}\mathrm{H}_{948}$. Using 2560 processors at NERSC the calculation took about 30 hours.
Kohn-Sham Equations
Recall our goal is to find the ground state energy by minimizing the Kohn-Sham total energy, $E_{\mathrm{total}}$
Leads to:
Kohn-Sham equations
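$H \psi_i = \varepsilon_i \psi_i, \quad i = 1, 2, \ldots, n_e$
$H = \left[ -\frac{1}{2} \nabla^2 + V(\rho(r)) \right]$
$V(\rho(r)) = V_{\mathrm{ext}}(r) + \int \frac{\rho(r')}{|r - r'|} \, dr' + V_{xc}(\rho)$
Nonlinear eigenvalue problem since the Hamiltonian, $H$, depends on $\psi$ through the charge density, $\rho$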
Discretized Kohn-Sham Equations
KKT conditions
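$\nabla_X L(X, \Lambda) = 0, \quad X^* X = I_{n_e}$
Discretized Kohn-Sham equations can now be written as:
$H(X) X = X \Lambda, \quad X^* X = I_{n_e}$
Kohn-Sham Hamiltonian given by:
$H(X) = \frac{1}{2} L + V(X)$
$V(X) = V_{\mathrm{ext}} + \mathrm{Diag}(L^{\dagger} \rho(X)) + \mathrm{Diag}(g_{xc}(\rho(X)))$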
The SCF Iteration
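[Figure: the SCF cycle. $V(\rho(r))$ $\to$ $\left[ -\frac{1}{2} \nabla^2 + V(\rho(r)) \right] \psi_i = \varepsilon_i \psi_i$ $\to$ $\{\psi_i\}_{i=1,\ldots,n_e}$ $\to$ $\rho(r) = \sum_{i}^{n_e} |\psi_i(r)|^2$ $\to$ $V(\rho(r))$]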
1 Given an initial charge density $\rho$, compute a potential $V_k(\rho(r))$
2 Solve the linear eigenvalue problem for the $\psi_i$, $i = 1, \ldots, n_e$
3 Compute the new charge density $\rho$
4 Update $\rho$ using your favorite mixing scheme
5 Compute $V_{k+1}$ and repeat until converged
Overall computational complexity is $O(N n_e^2)$ due to linear algebra
Major computational components
CG method
Orthogonalization
Computation of potentials
3D FFT
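The five SCF steps above can be sketched on a toy problem. The following is a minimal illustration, not the method of any particular DFT code: it assumes a hypothetical two-site model with one electron, hopping `t`, on-site offset `delta`, and a density-dependent on-site term `u * rho`, and it uses simple linear mixing with a hypothetical mixing parameter `mix`. All of these names are assumptions introduced for the example.

```python
import math


def lowest_eigenpair_2x2(a, b, d):
    """Lowest eigenpair of the real symmetric matrix [[a, b], [b, d]], b != 0."""
    lam = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    v = (b, lam - a)  # satisfies (a - lam) * v[0] + b * v[1] = 0
    norm = math.hypot(v[0], v[1])
    return lam, (v[0] / norm, v[1] / norm)


def scf(t=1.0, delta=1.0, u=2.0, mix=0.5, tol=1e-10, max_iter=200):
    """Toy SCF loop: H(rho) = [[u*rho0, -t], [-t, delta + u*rho1]] (assumed model)."""
    rho = [0.5, 0.5]  # step 1: initial charge density
    for k in range(1, max_iter + 1):
        # steps 1 and 5: the potential (diagonal of H) depends on the density
        v_diag = [u * rho[0], delta + u * rho[1]]
        # step 2: solve the *linear* eigenvalue problem for the fixed potential
        eps, psi = lowest_eigenpair_2x2(v_diag[0], -t, v_diag[1])
        # step 3: new charge density from the occupied orbital
        rho_out = [psi[0] ** 2, psi[1] ** 2]
        residual = max(abs(o - i) for o, i in zip(rho_out, rho))
        if residual < tol:
            return eps, rho_out, k
        # step 4: linear mixing of the old and new densities
        rho = [(1.0 - mix) * i + mix * o for i, o in zip(rho, rho_out)]
    raise RuntimeError("SCF did not converge")


eps, rho, iters = scf()
print(f"eigenvalue {eps:.6f}, density {rho}, iterations {iters}")
```

In a real plane wave code, step 2 is an iterative eigensolver over many bands and the mixing scheme is usually more sophisticated than the linear mixing assumed here.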
What Are the Computational Issues?
DFT methods account for 75% of the material science simulations at NERSC
Parallel efficiencies can be quite high
    on plane wave basis can scale to ≈ 1000 processors
    on plane wave basis and wavefunction index can scale to ≈ 10,000 processors
Most codes still based on $O(N^3)$ algorithms
Not systematically improvable
Inadequate for strong and/or non-local correlations
Parallel efficiencies can be difficult to achieve; 10-20% parallel efficiency is not uncommon
Major Computational Components of Plane Wave DFT Codes
Eigenvalue solver
Orthogonalization
3D FFTs
Computation of potentials
Eigenvalue Solver
Need to solve one $N \times n_e$ linear eigenvalue problem at each SCF iteration
The size of $N$ can easily be 10,000 – 100,000
Only need the $n_e$ (≈ number of atoms) lowest eigenvalues and corresponding eigenvectors
Called diagonalization in chemistry/materials science circles
Various approaches including CG, Grassmann CG, residual minimization
Distinction is usually made between all band vs. band-by-band, which corresponds to solving for all eigenvectors simultaneously vs. solving for one eigenvector at a time. We would call this blocked vs. unblocked
Use of optimized high-level BLAS3 routines can significantly improve performance
Orthogonalization
Due to physical constraints, the electronic wavefunctions must be orthonormal
This adds a constraint to the KS equations in the form of $X^* X = I_{n_e}$
Can be time consuming for large systems
Complexity is $O(N n_e^2)$, where $N$ is the size of the discretization and $n_e$ is the number of electrons
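To make the $O(N n_e^2)$ cost concrete, here is a minimal sketch of one way to enforce $X^* X = I_{n_e}$, assuming modified Gram-Schmidt on a plain list of columns. The slides do not say which orthogonalization scheme a given code uses (Cholesky-based schemes are also common), so the choice of algorithm and the function name `orthonormalize` are assumptions for illustration.

```python
import math


def orthonormalize(columns):
    """Modified Gram-Schmidt on n_e column vectors, each of length N.

    Each of the roughly n_e^2 / 2 column pairs costs O(N) work,
    giving O(N * n_e^2) overall.
    """
    q = []
    for col in columns:
        v = list(col)
        for u in q:
            # projection coefficient <u, v>, conjugating the first argument
            proj = sum(ui.conjugate() * vi for ui, vi in zip(u, v))
            v = [vi - proj * ui for ui, vi in zip(u, v)]
        norm = math.sqrt(sum(abs(vi) ** 2 for vi in v))
        if norm == 0.0:
            raise ValueError("columns are linearly dependent")
        q.append([vi / norm for vi in v])
    return q


# N = 4, n_e = 3; the last column has a complex entry
cols = [[1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 1.0, 1j]]
x = orthonormalize(cols)
gram = [[sum(a.conjugate() * b for a, b in zip(xi, xj)) for xj in x] for xi in x]
print(gram)  # should be close to the 3 x 3 identity
```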
FFTs
Recall that the kinetic energy operator takes on a particularly simple form in Fourier space (also called G-space)
Most DFT codes take advantage of this fact by converting from real space to G-space for computation of the Hamiltonian
Since systems are usually 3D, codes need to compute the 3D FFTs through a series of 1D FFTs
This has a consequence both in the total amount of work and when trying to parallelize the codes
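The "simple form in Fourier space" can be illustrated in one dimension: applying $-\frac{1}{2}\frac{d^2}{dx^2}$ to a periodic function amounts to multiplying each Fourier coefficient by $\frac{1}{2}G^2$. The sketch below assumes a periodic cell of length `length`, a power-of-two grid, and a small hand-written radix-2 FFT; the function names are hypothetical and a production code would call an optimized FFT library instead.

```python
import cmath
import math


def fft(x, inverse=False):
    """Unnormalized recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def ifft(x):
    """Inverse FFT with the 1/n normalization applied once at the top level."""
    n = len(x)
    return [v / n for v in fft(x, inverse=True)]


def apply_kinetic(psi, length):
    """Apply -1/2 d^2/dx^2 on a periodic grid by scaling Fourier coefficients."""
    n = len(psi)
    coeffs = fft(psi)
    for k in range(n):
        m = k if k < n // 2 else k - n  # signed frequency index
        g = 2.0 * math.pi * m / length  # wave number G
        coeffs[k] *= 0.5 * g * g
    return ifft(coeffs)


# A single plane wave exp(i G x) is an eigenfunction with eigenvalue G^2 / 2.
n, length, m = 16, 10.0, 3
g = 2.0 * math.pi * m / length
psi = [cmath.exp(1j * g * j * length / n) for j in range(n)]
t_psi = apply_kinetic(psi, length)
max_err = max(abs(t - 0.5 * g * g * p) for t, p in zip(t_psi, psi))
print(max_err)
```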
Computation of potentials
The Hartree potential, $V_{\mathrm{Hartree}}(r) = \int \frac{\rho(r')}{|r - r'|} \, dr'$, can be computed in several ways
The calculation can be posed as the solution of a Poisson problem.
Fast Poisson solvers or multigrid can also be used
Because the potential can be viewed as a convolution, it can also be computed using FFTs
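The Poisson and FFT views are connected by a standard identity, sketched here in atomic units. The Fourier coefficients $\hat{\rho}(G)$ and $\hat{V}_{\mathrm{Hartree}}(G)$ are notation introduced for this example, and the treatment of the $G = 0$ term (set to zero, as for a neutral cell) is an assumption about convention rather than something stated in the slides.

```latex
\nabla^2 V_{\mathrm{Hartree}}(r) = -4\pi \rho(r)
\quad\Longrightarrow\quad
\hat{V}_{\mathrm{Hartree}}(G) = \frac{4\pi \, \hat{\rho}(G)}{|G|^2}, \qquad G \neq 0 .
```

Under these assumptions the Hartree potential costs one forward FFT of $\rho$, a pointwise division by $|G|^2$, and one inverse FFT.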
Parallel Calculations Milestones
1991 Silicon surface reconstruction (7x7), Meiko I860, 64 processors (Stich, Payne, King-Smith, Lin, Clarke)
1998 FeMn alloys (exchange bias), Cray T3E, 1500 procs; first > 1 Tflop simulation, Gordon Bell prize (Ujfalussy, Stocks, Canning, Y. Wang, Shelton et al.)
2005 1000 atom Molybdenum simulation with Qbox, BlueGene/L at LLNL with 32,000 processors (F. Gygi et al.)
2008 Band-gap calculation of a 13,824 atom ZnTeO alloy proposed as a new solar cell material. Used 131,072 processors on Blue Gene/P at ANL and achieved 107.5 Tflops/s
Parallelization Strategies
Parallel across k-points – Not useful for large systems as k is usually small
Parallel over electrons – number of processors limited by number of electrons
Parallel over the number of plane-wave basis, $n_g$ – most commonly used in plane-wave codes
Parallelization of DFT codes is nontrivial and most codes cannot scale to large numbers of processors with even moderate efficiencies.
30% parallel efficiency is usually considered very good
Parallelization issues for Hartree-Fock codes are similar, especially for SCF
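For readers unfamiliar with the term, parallel efficiency is usually defined as below. This is the standard textbook definition rather than one given in the slides; $T_1$ and $T_p$ (run time on one and on $p$ processors) are symbols introduced for this example, and the value $p = 1000$ is purely illustrative.

```latex
E_p = \frac{T_1}{p \, T_p}, \qquad
S_p = \frac{T_1}{T_p} = p \, E_p, \qquad
\text{e.g. } E_p = 0.3,\ p = 1000 \ \Rightarrow\ S_p = 300 .
```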
Parallelization of 3D FFT
3D FFTs are computed via 3 sets of 1D FFTs and 2 transposes
Most of the communication is in global transpose (b) to (c)
Ratio of flops/comm ≈ $\log N$
Many FFTs are computed at the same time to avoid latency issues
Only non-zero elements computed/communicated
For details see (Canning et al.): http://www.nersc.gov/projects/paratec/
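The "3 sets of 1D FFTs and 2 transposes" pattern can be sketched serially. This is an illustration only, not the PARATEC implementation: it assumes a dense grid stored as nested lists with power-of-two dimensions, uses a small hand-written 1D FFT, and treats an in-memory axis rotation as a stand-in for the global transpose that a distributed code would perform with communication. The function names are hypothetical.

```python
import cmath
import math


def fft1d(x):
    """Unnormalized recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft1d(x[0::2]), fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + w, even[k] - w
    return out


def fft_last_axis(a):
    """One set of 1D FFTs: transform every line along the last axis."""
    return [[fft1d(line) for line in plane] for plane in a]


def rotate_axes(a):
    """Transpose step: return b with b[k][i][j] = a[i][j][k]."""
    n0, n1, n2 = len(a), len(a[0]), len(a[0][0])
    return [[[a[i][j][k] for j in range(n1)] for i in range(n0)]
            for k in range(n2)]


def fft3d(a):
    """3D FFT of a[x][y][z]; the result is left in the layout f[ky][kz][kx]."""
    a = fft_last_axis(a)   # 1D FFTs along z
    a = rotate_axes(a)     # transpose 1: layout [kz][x][y]
    a = fft_last_axis(a)   # 1D FFTs along y
    a = rotate_axes(a)     # transpose 2: layout [ky][kz][x]
    return fft_last_axis(a)  # 1D FFTs along x


n0, n1, n2 = 2, 4, 8
grid = [[[complex(x + 2 * y - z, (x * y + z) % 3) for z in range(n2)]
         for y in range(n1)] for x in range(n0)]
f = fft3d(grid)
print(f[0][0][0])  # the (0, 0, 0) coefficient equals the sum of all grid values
```

Leaving the result in a transposed layout avoids a third transpose; whether a given code does this is an assumption here, not something the slide states.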
Linear Scaling Electronic Structure Methods
Goal is to reduce the computational work from $O(N^3)$ to $O(N)$
Quantum mechanical effects are near-sighted, e.g. treat the computation of the exchange-correlation potential locally
Need to introduce concept of a localization region, inside which the quantity of interest is computed and outside which it is assumed to vanish
Six strategies for taking advantage of this fact (see Goedecker (1999)):
1 Fermi operator expansion
2 Fermi operator projection
3 Divide-and-conquer
4 Density-matrix minimization
5 Orbital minimization approach
6 Optimal basis density-matrix minimization
LS3DF
Based on Divide-and-Conquer approach
Divide a large system into smaller sub-domains that can be solved independently, then stitch the sub-domains back together again
Classical electrostatic interactions are long-ranged, i.e. solve one global Poisson equation
Requires minimal communication between the sub-domains
Artificial boundary effects due to sub-dividing domains can be cancelled out
Based on ideas from fragment molecular method
We call our method Linear Scaling 3D Fragment or LS3DF [1]
[1] L.W. Wang, Z. Zhao, J. Meza, LBNL-61691 (2006)
Parallelism Issues
[Figure: IBM Cell Blade. Same processor as found in a Sony Playstation 3]
Multi-core and many-core are the wave of the future
Current algorithms are difficult to parallelize with high efficiency
Many quantum chemistry codes do not parallelize well, even for medium-scale parallelism
Electronic Structure Codes
ABINIT – www.abinit.org
PARATEC – www.nersc.gov/projects/paratec
PEtot – hpcrd.lbl.gov/linwang/PEtot/PEtot.html
PWscf – www.pwscf.org
NWChem – www.emsl.pnl.gov/docs/nwchem/nwchem.html
Q-Chem – www.q-chem.com/
Quantum Espresso – www.quantum-espresso.org
Socorro – dft.sandia.gov/Socorro
VASP – cms.mpi.univie.ac.at/vasp
Many, many more – apologies if your favorite code was not listed
KSSOLV Matlab package
KSSOLV Matlab code for solving the Kohn-Sham equations
Open source package
Handles SCF, DCM, Trust Region
Example problems to get started with
Object-oriented design - easy to extend
Good starting point for students
Beta version of KSSOLV available, ask one of us for more information!
Example: SiH4
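% Define the atom types and the list of atoms in the molecule
a1 = Atom('Si');
a2 = Atom('H');
alist = [a1 a2 a2 a2 a2];
% Atomic coordinates, one row per atom (remaining rows elided on the slide)
xyzlist= [
0.0 0.0 0.0
1.61 1.61 1.61
... ];
% Build the molecule; C is the supercell, defined elsewhere (not shown on the slide)
mol = Molecule();
mol = set(mol,'supercell',C);
mol = set(mol,'atomlist',alist);
mol = set(mol,'xyzlist' ,xyzlist);
mol = set(mol,'ecut', 25);
mol = set(mol,'name','SiH4');
...
% Plot the computed charge density
isosurface(rho);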
Convergence
[Etot, X, vtot, rho] = scf(mol);
[Etot, X, vtot, rho] = dcm(mol);
Charge Density
isosurface(rho);
Example: Pt6Ni2O
cell:
19.59 0.0 0.0
...
sampling size: n1 = 96, n2 = 48, n3 = 48
atoms and coordinates:
1 Pt 1.3 -0.180 -0.015
...
7 Ni 8.4 0.003 3.069
8 Ni 8.5 7.998 7.762
9 O 14.9 2.644 1.511
number of electrons : 86
spin type : 1
kinetic energy cutoff: 60.0
Comparison of DCM vs. SCF
Summary
Described most common PW DFT computational components
Overview of standard numerical methods used
Brief introduction into some parallelization issues
Listed some computational challenges
Introduced KSSOLV, Matlab package for solving KS equations
References
Aron J. Cohen, Paula Mori-Sánchez, and Weitao Yang, Insights into Current Limitations of Density Functional Theory, Science, Vol. 321, no. 5890, pp. 792-794 (2008).
F. Gygi, R. K. Yates, J. Lorenz, E. W. Draeger, F. Franchetti, C. W. Ueberhuber, B. R. de Supinski, S. Kral, J. A. Gunnels, and J. C. Sexton, Proceedings of the 2005 ACM/IEEE Conference on Supercomputing (2005).
S. Goedecker, Linear Scaling Electronic Structure Methods, Rev. Mod. Phys. 71, 1085 (1999).
Curtis L. Janssen and Ida M. B. Nielsen, Parallel Computing in Quantum Chemistry, CRC Press (2008).