


BIGDML: Towards Exact Machine Learning Force Fields for Materials

Huziel E. Sauceda,1,2,∗ Luis E. Gálvez-González,3 Stefan Chmiela,1,4 Lauro Oliver Paz-Borbón,5 Klaus-Robert Müller,1,4,6,7,8,† and Alexandre Tkatchenko9,‡

1 Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
2 BASLEARN - TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587 Berlin, Germany
3 Programa de Doctorado en Ciencias (Física), División de Ciencias Exactas y Naturales, Universidad de Sonora, Blvd. Luis Encinas & Rosales, Hermosillo, Mexico
4 BIFOLD – Berlin Institute for the Foundations of Learning and Data, Germany
5 Instituto de Física, Universidad Nacional Autónoma de México, Apartado Postal 20-364, 01000 CDMX, Mexico
6 Google Research, Brain team, Berlin, Germany
7 Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
8 Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
9 Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg, Luxembourg
(Dated: June 9, 2021)

Machine-learning force fields (MLFF) should be accurate, computationally and data efficient, and applicable to molecules, materials, and interfaces thereof. Currently, MLFFs often introduce tradeoffs that restrict their practical applicability to small subsets of chemical space or require exhaustive datasets for training. Here, we introduce the Bravais-Inspired Gradient-Domain Machine Learning (BIGDML) approach and demonstrate its ability to construct reliable force fields using a training set with just 10-200 geometries for materials including pristine and defect-containing 2D and 3D semiconductors and metals, as well as chemisorbed and physisorbed atomic and molecular adsorbates on surfaces. The BIGDML model employs the full relevant symmetry group for a given material, does not assume artificial atom types or localization of atomic interactions, and exhibits high data efficiency and state-of-the-art energy accuracies (errors substantially below 1 meV per atom) for an extended set of materials. Extensive path-integral molecular dynamics carried out with BIGDML models demonstrate the counterintuitive localization of benzene–graphene dynamics induced by nuclear quantum effects and allow us to rationalize the Arrhenius behavior of the hydrogen diffusion coefficient in a Pd crystal over a wide range of temperatures.

I. INTRODUCTION

The development and implementation of accurate and efficient machine learning force fields (MLFF) is transforming atomistic simulations throughout the fields of physics [1–5], chemistry [6–13], biology [14, 15], and materials science [16–21]. The application of MLFFs has enabled a wealth of novel discoveries and quantum-mechanical insights into atomic-scale mechanisms in molecules [3, 6, 21–24] and materials [2, 4, 25–27].

A major hurdle in the development of MLFFs is to optimize the conflicting requirements of ab initio accuracy, computational speed and data efficiency, as well as universal applicability to increasingly larger chemical spaces [28]. In practice, all existing MLFFs introduce tradeoffs that restrict their accuracy, efficiency, or applicability. In the domain of materials modeling, all MLFFs known to the authors employ the so-called locality approximation, i.e. the global problem of predicting the total energy of a many-body condensed-matter system is approximated by its partitioning into localized atomic contributions. The locality approximation has been rather successful for capturing local chemical degrees of freedom, as demonstrated in a wide number of applications [29–33]. However, we emphasize that the locality assumption disregards non-local interactions and its validity can only be truly assessed by comparison to experimental observables or explicit ab initio dynamics. This fact restricts truly predictive MLFF simulations of realistic materials, whose properties are often determined by a complex interplay between local chemical bonds and a multitude of non-local interactions.

The chemical space of materials is exceedingly diverse if we count all possible compositions and configurations of a given number of chemical elements. For example, an accurate MLFF reconstruction of the potential-energy surface (PES) of elemental bulk materials to meV/atom accuracy often requires many thousands of configurations for training [20, 29, 34–37]. The MLFF errors also increase by at least an order of magnitude when including defects or surfaces [31, 34]. Heteroatomic materials and interfaces between molecules and materials would require substantially more training data for creating predictive MLFFs and accuracies much better than 1 meV/atom, eventually making the modeling of such materials intractable. In addition, there is a strong desire to go beyond traditional density-functional theory (DFT) reference data in the field of atomistic materials modeling [38–40]. Beyond-DFT methods can only be realistically applied to compute dozens or hundreds of geometries, making the construction of beyond-DFT MLFFs impractical.


To address these challenges, in this work we introduce a Bravais-Inspired Gradient-Domain Machine Learning (BIGDML) model for periodic materials that is accurate, data efficient, and computationally inexpensive at the same time. The BIGDML model extends the applicability domain of the Symmetric Gradient-Domain Machine Learning (sGDML) framework [22, 41, 42] to include periodic systems with unit cells containing up to roughly one hundred atoms. The BIGDML model employs a global representation of the full system, i.e. treating the supercell as a whole instead of as a collection of atoms. This avoids the uncontrollable locality approximation, but also restricts the maximum number of atoms in the unit cell. Extending the applicability of BIGDML to much larger unit cells will require the development of a global multiscale representation, which will be the topic of our future work. An additional advantage of a global representation is that cross-correlations between forces on different atomic species are dealt with rigorously, at variance with existing atomic representations. Similarly to the sGDML model, another key advantage of the BIGDML model is the usage of physical constraints (energy conservation) and all relevant physical symmetries of periodic systems, including the full translation and Bravais symmetry groups. As a consequence, BIGDML models achieve meV/atom accuracy already for 10-200 training points, surpassing state-of-the-art atom-based models by 1-2 orders of magnitude. This result underlines once again the importance of including prior knowledge, such as physical laws and symmetries, in ML models. Clearly, what is known does not need to be learned from data; in effect, the complexity of the data manifold is reduced (see e.g. [22, 23, 33, 43–46]).

Altogether, the BIGDML framework opens the possibility to reconstruct the PES of complex periodic materials with unprecedented accuracy at very low computational cost. In addition, the BIGDML model can be straightforwardly implemented as an ML engine in any periodic DFT code, and used as a molecular dynamics driver after being trained on just a handful of geometries.

II. RESULTS

The BIGDML framework relies on two advances: (i) a global atomistic representation with periodic boundary conditions (PBC), and (ii) the use of the full translation and Bravais symmetry groups for a given material.

A. PBC-preserving Representation

To avoid localization of interatomic interactions and artificial (from the electronic perspective) atom-type assignments, we use an efficient global representation with PBC. Following the sGDML approach for molecules [22, 41], we take the atomistic Coulomb matrix (CM) [47] as a starting representation. When used with sGDML, the CM has proven to be a robust, accurate, and efficient representation [22, 41, 48].

Here, we introduce a generalization of the molecular CM descriptor to represent periodic materials, D^(PBC). In order to construct the Coulomb matrix for extended systems, we first enforce the PBC using the minimum-image convention (MIC) [49, 50]:

D^{(\mathrm{PBC})}_{ij} = \begin{cases} 1 / \left| \mathbf{r}_{ij} - A \, \mathrm{mod}\!\left(A^{-1}\mathbf{r}_{ij}\right) \right| & \text{if } i \neq j \\ 0 & \text{if } i = j \end{cases} \qquad (1)

where r_ij = r_i − r_j is the difference between the coordinates of atoms i and j, and A is the matrix with the supercell translation vectors as columns. Fig. 1-A (left) shows the Coulomb matrix descriptor when considering only the supercell structure with no PBC, which means that the ML model treats the system as a finite "molecule". The right side of Fig. 1-A shows the descriptor with the PBC enforced (Eq. 1), now having the correct periodic structure.

Many widely used periodic global representations already exist, for example CM-inspired global descriptors such as the Ewald-sum, extended Coulomb-like and sine matrices [51]. In the cases of the extended Coulomb-like and Ewald matrices, these representations account for the contribution of the same atom repeatedly by considering its multiple periodic images, which is computationally demanding and algebraically involved. Of these global periodic representations [51], only the sine matrix avoids using redundant information, since it depends only on the atomic positions in a single unit cell. Our choice of the CM with PBC enforced through the MIC is the simplest and most efficient one, and it also turns out to be exceptionally accurate and data-efficient, as will be shown below.

As an alternative to the global approach, many local materials representations have been developed. Among them are numerous descriptors based on local atomic environments, for example atom-density representations [52–55], partial radial-distribution functions [56], the FCHL descriptor [57], rotationally-invariant internal representations [58], many-body vector interactions [59] and moment tensor potentials [17]. In all these cases, the PBC can be naturally incorporated by using the MIC, as has been done for mechanistic force fields. These local representations in principle aim at the construction of transferable interatomic MLFFs, as done by the GAP/SOAP framework [54], which is the basis of a series of high-quality chemical-bonding potentials for phosphorus [31], carbon [34], and silicon [20]. However, the intrinsic cutoff radius in these descriptors limits the extent of the atomic environments, neglecting the ubiquitous long-range interactions and correlations between different atomic species. Here, by using a global D^(PBC) descriptor we avoid the need to fine-tune representation hyperparameters while preserving high accuracy in the description of the many possible configuration states of a material.

FIG. 1. Description of the main components of BIGDML models. A) Coulomb matrix representation for the non-periodic (left) and periodic (right) supercell (2×2×3L) of Pd1/MgO (100). B) Description of the local symmetries (i.e. the Bravais group G, or point group of the unit cell) and the translation symmetries of the unit cell T. C) Analytical form of the sGDML predictor, highlighting the explicit usage of the full symmetry group of the supercell F (in blue) and the Coulomb matrix PBC descriptor (in red). D) Systematic symmetrization of the PES. The axes of the PES are the x and y coordinates of the Pd atom in units of the lattice constant of MgO. Models: i) pure GDML, ii) GDML+D^(PBC), iii) sGDML+D^(PBC) (s=G), iv) sGDML+D^(PBC) (s=T), and v) BIGDML. Panel vi) displays the incremental accuracy upon addition of each symmetry for the Pd1/MgO (100) system. All models used for this comparison were trained on 50 data samples.

B. Translation Symmetries and the Bravais’ Group

The full symmetry group F of a crystal is given by the semidirect product of the translation symmetries T and the rotation and reflection symmetries of the Bravais lattice G (Bravais' group): F = T ⊗ G [60] (see Fig. 1-B). This is a general result, meaning that it applies to any periodic system of dimension d, F^(d) = T^(d) ⊗ G^(d). In practice, the translation group T is constructed from the set of translations of the Bravais cell that span the supercell, using the primitive translation vectors as a basis, while the Bravais' group G is the symmetry point group of the unit cell. In order to illustrate these concepts, let us consider as an example a graphene (d = 2) supercell of size 5×5. Its full symmetry group is T^(2)_{5×5} ⊗ G^(2) = T^(2)_{5×5} ⊗ D_{6h} and contains 300 symmetry elements. Further important materials with ample symmetries are surfaces and interfaces. Analogous to molecules possessing internal rotors, molecules interacting with a surface are another case of a fluxional system. For example, benzene adsorbed on graphene has a full fluxional symmetry group defined by the direct product of graphene's full symmetry group and benzene's molecular point group, [T^(2)_{5×5} ⊗ D_{6h}]_graphene ⊗ [D_{6h}]_benzene, which contains 3600 symmetry elements. Such a large number of symmetries considerably reduces the region of configuration space that needs to be sampled to reconstruct the full PES and consequently generates MLFF models with high data efficiency. The presented arguments generalize to other materials, such as molecular crystals, rigid bulk materials, porous materials, and hybrid organic-inorganic materials, e.g. perovskites.
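To make the size and structure of T concrete, the sketch below (a hypothetical helper, not part of the BIGDML code) enumerates the translations of a 2D n1×n2 supercell as permutations of atom indices, assuming a cell-by-cell atom ordering; combining these with the point-group permutations of G yields the full group F.

import numpy as np
from itertools import product

def translation_permutations(n1, n2, atoms_per_cell):
    """Enumerate the pure translations T of an n1 x n2 supercell as permutations
    of atom indices. Assumes the (illustrative) ordering
    index = (c1 * n2 + c2) * atoms_per_cell + a, with (c1, c2) the unit-cell label
    and a the atom within the cell."""
    n_atoms = n1 * n2 * atoms_per_cell
    perms = []
    for t1, t2 in product(range(n1), range(n2)):      # one permutation per lattice translation
        perm = np.empty(n_atoms, dtype=int)
        for c1, c2, a in product(range(n1), range(n2), range(atoms_per_cell)):
            src = (c1 * n2 + c2) * atoms_per_cell + a
            dst = (((c1 + t1) % n1) * n2 + ((c2 + t2) % n2)) * atoms_per_cell + a
            perm[src] = dst
        perms.append(perm)
    return perms

# A 5x5 graphene supercell (2 atoms per primitive cell) yields |T| = 25 translations.
assert len(translation_permutations(5, 5, 2)) == 25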

C. The BIGDML model

The construction of a BIGDML model consists of combining a global PBC descriptor and the full symmetry group of the system within the gradient-domain machine learning framework (see Fig. 1), which leads to a robust and highly data-efficient MLFF capable of reaching state-of-the-art accuracy using only a few dozen training points. We would like to stress that such unprecedented data efficiency opens up many opportunities to study advanced materials using high levels of electronic-structure theory, such as sophisticated DFT approximations or even coupled-cluster theory [61].

In a nutshell, the periodic global supercell descriptor and the symmetries presented in the previous sections are combined with the sGDML framework to create the BIGDML predictor displayed in Fig. 1-C. To illustrate the effects of the symmetries on the PES reconstruction process for the atom–surface Pd1/MgO system, Fig. 1-D presents a diagram where the different core elements of the BIGDML model are systematically included and the resulting (learned) PES is displayed. In this figure, the shown PES corresponds to the energy surface experienced by the Pd atom. Panel i) displays the reconstructed energy surface with no symmetries, where the training samples (purple squares) represent the positions of the Pd atom. In panel ii) the PBC are enforced by the periodic descriptor (Eq. 1); this is then combined with the point group of the unit cell in panel iii) and with the translation symmetries in panel iv). From the last two panels, we can see the characteristic contribution of each symmetry group: G symmetrizes the local PES by adding effective training samples (shown as grey circles), while T delocalises the effective sampling over the whole supercell. Then, by considering the full symmetry group F, in panel v) we arrive at the PES reconstructed by the BIGDML model, where the effective training data symmetrically span the whole supercell. Panels i) to v) show the increasing symmetrization of the PES, but also illustrate the accuracy gain at each stage. The prediction-accuracy plot in panel vi) clearly shows the important impact of each symmetry group in generating accurate and robust BIGDML models.
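The "effective training samples" added in panels iii)–v) can be viewed as a symmetrization of the covariance itself: each training geometry enters once per symmetry operation. A minimal sketch of this idea follows (illustrative only; BIGDML applies the group inside the gradient-domain kernel rather than to a scalar kernel, and the function name is hypothetical).

import numpy as np

def symmetrized_kernel(base_kernel, R1, R2, permutations):
    """Average a base kernel over the supercell symmetry group, realized as
    atom-index permutations P acting on the second geometry:
        k_sym(R1, R2) = (1/|S|) * sum_P base_kernel(R1, R2[P]).
    R1, R2 are (N, 3) geometry arrays; base_kernel is any descriptor-based kernel."""
    return float(np.mean([base_kernel(R1, R2[perm]) for perm in permutations]))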

D. Prediction performance of BIGDML for different materials

The BIGDML model can be applied to accurately reproduce the atomic forces and total energies of bulk materials, surfaces, and interfaces. To illustrate the applicability of BIGDML, in this section we have selected representative systems that cover a broad spectrum of materials and study the prediction accuracy of our MLFFs as judged by the learning curves (test error as a function of the number of data points used for training). The considered systems include bulk materials (graphene as a representative 2D material, 3D metallic and semiconducting solids), surfaces (Pd adsorbed on a MgO surface), and van der Waals bonded molecules on surfaces (benzene adsorbed on graphene), as well as a bulk material with interstitial defects (hydrogen in palladium). For a detailed description of the database generation and the levels of theory, as well as the parameters of the simulations and software packages employed, we refer the reader to the Methods section.
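A learning curve of the kind shown in Fig. 2 is assembled by training models of increasing training-set size and evaluating them on a fixed held-out set. A generic sketch is given below; train_fn is a hypothetical stand-in for any MLFF trainer (e.g. a BIGDML/sGDML wrapper), not code from this work.

import numpy as np

def force_learning_curve(train_fn, R, F, train_sizes, n_test=500, seed=0):
    """For each training-set size, fit a model with train_fn(R_train, F_train) -> predict_fn
    and report the force MAE on a fixed held-out test set. R holds geometries,
    F the corresponding reference atomic forces."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(R))
    test_idx, pool = idx[:n_test], idx[n_test:]
    curve = []
    for n in train_sizes:
        predict_fn = train_fn(R[pool[:n]], F[pool[:n]])
        mae = np.mean(np.abs(predict_fn(R[test_idx]) - F[test_idx]))
        curve.append((n, mae))
    return curve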

1. Bulk materials

Graphene as a representative 2D material. Graphene is a well-characterized layered material that continues to exhibit many remarkable properties despite being thoroughly studied [35, 62, 63]. Hence, developing accurate and widely applicable force fields for graphene and its derivatives is an active research area.

FIG. 2. Learning curves for different materials. 3D bulk materials: Pd (FCC), Na (BCC), and Au (FCC). 2D material: graphene. Interstitial in materials: H in a supercell of Pd. Chemisorption of an atom at a surface: single Pd atom adsorbed on a MgO (100) surface. Van der Waals interactions: benzene molecule adsorbed on graphene. The respective full (fluxional) symmetry groups, supercell dimensions and reference levels of theory used in each case are presented in the Methods section. The reported values are mean absolute errors (MAE). See Supplementary Figure 1 for additional information on the learning curves.

Recently, Rowe et al. [35] presented a comprehensive comparison of existing hand-crafted force fields and a Gaussian approximation potential (GAP) using the Smooth Overlap of Atomic Positions (SOAP) local descriptor. The GAP/SOAP approach was shown to generalize much better than mechanistic carbon FFs. In Fig. 2 we show the learning curves of the BIGDML model for a 5×5 supercell of graphene, showing that only 10 geometries (data samples) are needed to match the best-performing method to date (≈25 meV Å⁻¹ in force RMSE) [35]. The performance and data efficiency of BIGDML is remarkable, given that it uses less than 1% of the amount of data employed by atom-based local descriptors. More importantly, by increasing the number of data samples used for training to 100, we reach a generalization error of ≈1 meV (0.02 meV/atom) in energies and ≈6 meV Å⁻¹ for forces. To our knowledge, such accuracies have not been obtained in the field of MLFFs for extended materials. In order to put our results into the context of state-of-the-art MLFFs, in Fig. 3 we show the learning curves comparing GAP/SOAP and BIGDML for graphene (see Supplementary Figure 2 for an extended comparison using different materials). Given the same data for training, BIGDML achieves an improvement of a factor of 10 to 30 in accuracy, both for the total energy and the atomic forces. The same conclusions hold for the other systems studied in this work, as shown in the Supporting Information.

FIG. 3. Comparison of learning curves of the BIGDML model and GAP/SOAP for graphene.

3D materials: The case of cubic crystals. In the case of 3D materials, we apply our model to monoatomic metallic materials covering common cubic crystal structures: Pd[FCC], Au[FCC] and Na[BCC]. Figure 2 shows the learning curves for these three structures with a supercell of 3×3×3 and symmetry group T^(3)_{3×3×3} ⊗ O_h. An accuracy of ≈10 meV Å⁻¹ for a monoatomic metal can be achieved using approximately 70 samples in the case of Pd (only 10,000 atomic forces), which is only a fraction of the data (less than 1%) required by other models to obtain the same accuracy [37].

2. Surfaces

One of the main challenges of constructing MLFFs on local atomic environments is that such representations can fail to capture subtle local changes with global implications. For example, when describing a surface or an interface, atoms of the same element are described by the same atomic embedding function, which, in order to encode the many possible neighbourhoods (atoms in deeper layers, atoms close to the surface of the material), requires large amounts of training data. This eventually leads to a degradation of MLFF performance, a problem that could become practically intractable for local MLFFs when dealing with molecule–surface interactions. These limitations can be addressed in local models, but at the cost of higher model complexity and manual tuning of hyperparameters, hence losing the key advantages of MLFFs. In this section we show that the BIGDML method does not have such limitations by studying two representative systems: chemisorbed Pd1/MgO (100) and physisorbed benzene/graphene.

FIG. 4. Comparison of the minimum-energy path between neighbouring oxygen sites for a single Pd atom supported on a MgO (100) surface using BIGDML (continuous line) and the reference method, DFT/PBE (circles). The purple lines indicate the locations of the Pd atom in the training dataset.

Atom chemisorbed at a surface: Pd1/MgO. In recent years, it has been shown that single-atom catalysts (SACs) can offer superior catalytic performance compared to clusters and nanoparticles [64–66]. These heterogeneous catalysts consist of isolated metal atoms supported on a range of substrates, such as metal oxides, metal surfaces or carbon-based materials. As a showcase, here we use a single Pd atom supported on a pristine MgO (100) surface. The considered supercell consists of a 2×2 slab of MgO (100) with 3 layers, where the lowest layer is kept fixed, and a single Pd atom is deposited on the surface.

The full symmetry group for this system is T^(2)_{2×2} ⊗ C_{4v} with 64 elements. The learning curve (see Fig. 2) shows that only 200 samples are needed to reach energy and force accuracies of ≈34 meV (≈0.7 meV/atom) and ≈30 meV Å⁻¹, respectively. As in the case of learning force fields for molecules in the gas phase, the target error is always relative to the relevant dynamics of the system and its energetics [22, 41, 42]. In this context, the Pd atom is chemisorbed at an oxygen site and the lowest energetic barrier that the Pd atom experiences is 450 meV; our error is thus ≈6% of this value. In Fig. 4 we show the minimum-energy barrier (MEB) for the Pd atom displacing from one minimum to another on the MgO surface, computed with the nudged elastic band (NEB) method (see Methods section for details). It must be noted that the Pd atom never crossed this barrier during the MD simulation used to generate the reference dataset, as displayed by the purple lines in Fig. 4 indicating the distribution of the Pd atom locations in the training dataset. Hence, even though the model did not have information regarding the saddle point, the energetic barrier was nevertheless correctly modeled by BIGDML by incorporating translational and Bravais symmetries.

Molecule physisorbed at a surface: Benzene/graphene. A highly active field of research in materials science concerns the interaction between molecules and surfaces, due to its fundamental and technological relevance. From the modeling point of view, describing non-covalent interactions within the framework of DFT remains a competitive research area given its intricacies, which has led to very accurate dispersion-interaction methods [67–71]. Nevertheless, most of the studies of these systems focus on global optimizations or short MD simulations. Here, we demonstrate the applicability of BIGDML by learning the molecular force field of the benzene molecule interacting with graphene.

The full symmetry group of the benzene/graphene system is T^(2)_{5×5} ⊗ C_{6v}^(graphene) ⊗ C_{6v}^(benzene), which has a total of 3600 elements. This large number of symmetries greatly reduces the configurational-space sampling requirements to reconstruct its PES, as can be seen from the learning curve shown in Fig. 2, where the energy error quickly drops below ≈43 meV (1 kcal mol⁻¹) when training on only 10 data points and reaches ≈21 meV with 30 training data points. For this system, the energy generalisation accuracy starts to saturate at 0.18 meV/atom when training on 100 configurations. Achieving such high generalization accuracy using only a handful of training data points for such a complex system convincingly illustrates the high potential of the BIGDML model, since it opens the possibility of performing predictive simulations for a wide variety of systems for which only static DFT calculations have been available so far.

The systems discussed in this section offer a general picture of the broad diversity of extended materials that the BIGDML model can describe with high data efficiency and unprecedented accuracy.

E. Validation of BIGDML models for materials properties

In the previous section we demonstrated the prediction capabilities of the BIGDML method using statistical accuracy measures. Now, we assess the predictive power of BIGDML models in terms of predicting physical properties of materials. In this section we first perform a thorough test of the ML models by assessing the phonon spectra of 2D graphene and 3D bulk materials. Then, we proceed to test the performance beyond the harmonic approximation by carrying out molecular dynamics simulations and comparing observables against explicit DFT calculations. All simulations performed in this section were done using the best trained models displayed in the learning curves (see Fig. 2 and Methods section).

FIG. 5. Phonon spectra (left) and vibrational densities of states (right) for (a) graphene, (b) bulk sodium, and (c) bulk palladium along the high-symmetry paths in their respective Brillouin zones. The dashed line represents BIGDML and the continuous line the reference DFT-PBE level of theory. The differences between DFT and BIGDML are visually imperceptible.

FIG. 6. Radial distribution function for first neighbours in graphene. Classical MD (blue) and path-integral molecular dynamics (PIMD, orange) simulations of graphene at room temperature described by the BIGDML molecular force field. A Gaussian fit gives standard deviations of 0.026 Å and 0.046 Å for classical MD and PIMD, respectively.

1. Phonon spectra

A common and challenging test to assess force fields (machine-learned [19, 34, 35] as well as conventional FFs [72–74]) is the comparison of phonon dispersion curves and phonon densities of states, since they give a clear view of (i) the proper symmetrization of the FF and (ii) the correct description of the elastic properties of the material in the harmonic approximation. The main challenge for FFs is describing both collective low-frequency phonon modes and local high-frequency ones with equal accuracy. In Fig. 5 we show the comparison of the BIGDML- and DFT-generated phonon bands, displaying a perfect match, with RMSE phonon errors across the Brillouin zone of 0.85 meV for graphene, 0.35 meV for Na, and 0.38 meV for Pd. These values are comparable to those reported in the literature using MLFFs trained on thousands of configurations and hand-crafted datasets [75], while in our case we require fewer than 100 randomly selected training points. Such accuracy originates from the use of a global representation for the supercell, which captures local and non-local interactions with high fidelity, a feature that is crucial in describing vibrational properties.
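As a point of reference for how a force field enters such a test, the sketch below obtains harmonic frequencies from finite differences of the predicted forces at the Γ point only; the dispersions in Fig. 5 are computed along full high-symmetry paths with a dedicated phonon workflow, so this is merely an illustration of the principle (function and argument names are hypothetical).

import numpy as np

def gamma_point_frequencies(forces_fn, positions, masses, delta=1e-2):
    """Finite-difference Hessian at the Gamma point and the resulting harmonic
    frequencies, given any force field forces_fn(positions) -> (N, 3) array.
    Units of the returned frequencies follow those of the inputs."""
    N = len(positions)
    H = np.zeros((3 * N, 3 * N))
    for i in range(N):
        for a in range(3):
            dp, dm = positions.copy(), positions.copy()
            dp[i, a] += delta
            dm[i, a] -= delta
            # second derivative of the energy = -(change of force / change of coordinate)
            H[3 * i + a, :] = -(forces_fn(dp) - forces_fn(dm)).ravel() / (2 * delta)
    m = np.repeat(masses, 3)
    D = 0.5 * (H + H.T) / np.sqrt(np.outer(m, m))     # mass-weighted dynamical matrix
    w2 = np.linalg.eigvalsh(D)                        # squared angular frequencies
    return np.sign(w2) * np.sqrt(np.abs(w2))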

Now, we proceed to a more challenging physical test: the prediction of properties at finite temperature, where the anharmonic parts of the PES also become important.

2. Molecular dynamics simulations

Graphene. Simulating graphene at finite temperature with an accurate description of the interatomic forces is a highly relevant topic given the plethora of applications of this material. In particular, a necessary contribution to its realistic description is the inclusion of nuclear quantum effects (NQE). For example, the experimental free-energy barrier for the permeability of graphene-based membranes to thermal protons can only be correctly described by including the NQE of the carbon atoms [76, 77]. In order to corroborate that our graphene BIGDML model gives the correct physical delocalization of the nuclei, we performed path-integral molecular dynamics (PIMD) simulations at 300 K for a 5×5 supercell. In Fig. 6 we compare the distribution of the first-neighbor interatomic distance rCC between classical MD (blue) and PIMD (orange); the results show that the fluctuations in rCC double when considering NQE. These findings are in excellent agreement with explicit first-principles PIMD simulations in the literature [77].
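The distribution in Fig. 6 corresponds to the following kind of analysis: collect first-neighbor C–C distances over the trajectory using the minimum-image convention and take the standard deviation. A minimal sketch, with an assumed bond cutoff of 1.8 Å (the helper name and cutoff are illustrative, not from the original work):

import numpy as np

def first_neighbor_distances(frames, cell, cutoff=1.8):
    """Collect first-neighbor C-C distances over MD/PIMD frames using the
    minimum-image convention. 'frames' is an iterable of (N, 3) coordinate arrays,
    'cell' a (3, 3) matrix with the supercell lattice vectors as columns."""
    A_inv = np.linalg.inv(cell)
    dists = []
    for pos in frames:
        for i in range(len(pos)):
            for j in range(i + 1, len(pos)):
                d = pos[i] - pos[j]
                d = d - cell @ np.rint(A_inv @ d)     # minimum-image vector
                r = np.linalg.norm(d)
                if r < cutoff:
                    dists.append(r)
    return np.asarray(dists)

# The spread compared in Fig. 6 is first_neighbor_distances(frames, cell).std().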

As an additional robustness test, we have performed extended classical MD simulations at various temperatures using the EAM force field [78] and a BIGDML model trained on this level of theory, obtaining a perfect match between these two different methods. This further validates the predictive power of our methodology even at long time scales. These results are shown in the Supporting Information.

Up to this point, we have performed simulations to validate our models under different conditions. In the next section we perform predictive simulations, which highlight the potential of BIGDML for novel applications, including the unexpected NQE-driven localization of benzene/graphene dynamics and the diffusion of interstitial hydrogen in bulk palladium.

F. Validation of BIGDML in dynamical simulations of materials

1. Benzene/graphene

The interaction between different molecules and graphene has been extensively studied given the potential applications of molecule/graphene systems as electrical and optical materials and even as candidates for drug-delivery systems [79–89]. Of particular interest is the understanding of the effective binding strength and structural fluctuations of adsorbed molecules at finite temperature, which requires long time-scale molecular dynamics simulations, unaffordable when using explicit ab initio calculations. Here we demonstrate that BIGDML models can be used for studying explicit long-time dynamics of a realistic system such as benzene (Bz) adsorbed on graphene with an accurate and converged quantum treatment of both electrons and nuclei (see Fig. 7-A). The Bz/graphene system has three minima that resemble those of the benzene dimer: the π–π stacking (parallel-displaced) structure as the global minimum and two local minima corresponding to parallel and T-shaped configurations, as displayed in Fig. 7-B [3], along with the corresponding structural parameters and adsorption energies computed at the PBE+MBD level of theory [68, 69, 90]. The calculated adsorption energy for the global minimum is in very good agreement with the experimental measurement of 500±80 meV [91].

FIG. 7. Benzene/graphene. A) Depiction of the system with its two main degrees of freedom: (1) the angle between the normal vector n defined by the benzene ring and the normal to the graphene plane (z), and (2) the relative distance h between the benzene center of mass and the graphene. B) The three minima of the system and their defining characteristic parameters h and θ, together with the adsorption energy Eads, computed as Eads = E_benzene/graphene − (E_benzene + E_graphene):

              π–π stacked   Parallel   T-shaped
  h [Å]          3.30         3.41     4.85/2.74
  θ [°]          0            0        90
  Eads [eV]     -0.483       -0.446    -0.260

C) Classical molecular dynamics (MD) and path-integral molecular dynamics (PIMD) simulations describing the dynamics of the benzene molecule interacting with graphene at room temperature, described by the BIGDML molecular force field. The plots display projections of the dynamics (classical MD and PIMD) onto the two main degrees of freedom of the system: h and θ = arccos(z · n).

An extensive amount of studies exists on the implications of NQE for properties of molecules and materials at finite temperature [92, 93]; however, much less is known about the implications of NQE for non-covalent van der Waals (vdW) interactions [3, 94]. In the particular case of Bz/graphene, considering the translational symmetries of the PES experienced by the Bz molecule as well as thermal fluctuations and its many degrees of freedom, it is to be expected that the Bz dynamics will be highly delocalized. Nonetheless, it was recently reported that the inclusion of NQE in a molecular dimer can considerably enhance intermolecular vdW interactions [3]. However, the adsorption/binding energy ratio between the Bz/graphene and Bz/Bz systems is E_ads^(Bz/graphene) / E_int^(Bz/Bz) ≈ 4; it is therefore not clear how NQE will affect such strongly interacting vdW systems.

In order to assess the role of temperature and NQE for Bz/graphene, in Fig. 7-C we present the results obtained from classical MD and PIMD simulations at 300 K using a BIGDML FF trained at the PBE+MBD level of theory. At this temperature, the benzene molecule tends to mostly populate configurations at an angle of ≈10° relative to the graphene normal vector in both cases (see Fig. 7-A). Nevertheless, classical MD simulations explore substantially wider regions of the PES, reaching angles of up to 80°, close to the T-shaped minimum. In contrast, PIMD simulations yield a localized sampling of θ with a maximum angle of ≈30°. To understand the origin of this localization, we have systematically increased the "quantumness" of the system by raising the number of beads in the PIMD simulations to converge towards the exact treatment of NQE. This approach provides concrete evidence of the progressive localization of the benzene normal orientation as the NQE increase (see Supporting Information). The physical origin of this phenomenon is the NQE-induced interatomic bond dilation, where the zero-point energy generated by NQE drives the system beyond the harmonic-oscillation regime. The intramolecular delocalization produces an effective molecular volume dilation and increases the average polarizability of the benzene and graphene rings, akin to a recent analysis of non-covalent interactions between molecular dimers upon constraining their centers of mass [3]. In contrast, in this work no constraints were imposed on the Bz/graphene system, suggesting that the localization of the Bz molecule on graphene should be observable in experiments. In order to further rationalize the NQE-induced stabilization of vdW interactions, we have computed the vdW interaction energy as a function of compression/dilation of the Bz molecule on graphene and found a linear dependence between dilation and vdW interaction (see Supporting Information). This analysis fully supports our hypothesis of NQE-induced stabilization and dynamical localization.
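The projections h and θ = arccos(z · n) shown in Fig. 7-C can be extracted per frame as sketched below (hypothetical helper; a geometric center is used instead of the mass-weighted center of mass for brevity):

import numpy as np

def benzene_orientation(benzene_xyz, graphene_z_mean):
    """Order parameters of Fig. 7: height h of the benzene center above the graphene
    plane and tilt angle theta between the ring normal n and the surface normal z.
    benzene_xyz: (M, 3) coordinates of the benzene atoms (carbons suffice)."""
    center = benzene_xyz.mean(axis=0)
    h = center[2] - graphene_z_mean
    # Ring normal from a plane fit: smallest right-singular vector of the centered coordinates.
    _, _, vt = np.linalg.svd(benzene_xyz - center)
    n = vt[-1] / np.linalg.norm(vt[-1])
    z = np.array([0.0, 0.0, 1.0])
    theta = np.degrees(np.arccos(abs(np.dot(z, n))))   # abs(): the sign of n is arbitrary
    return h, theta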

The rather fundamental nature of the underlying physical phenomenon of NQE-induced stabilization suggests that many polarizable molecules interacting with surfaces will exhibit a similar dynamical localization effect. It is worth mentioning that a thorough analysis of the Bz/graphene system demands extensive simulations, which are now made accessible due to the computational efficiency and accuracy of the BIGDML model. Our modeling could also be applied to larger molecules with peculiar behavior under applied external forces [95].

2. Hydrogen interstitial in bulk palladium

Hydrogen has become a promising alternative to fossil fuels as a cleaner energy source. Nevertheless, finding a safe, economical and high-energy-density hydrogen storage medium remains a challenge [96]. One of the proposed methods is to store hydrogen in interstitial sites of the crystal lattices of bulk metals [96–98]. Among these metals, palladium has been widely researched as a candidate, since it can absorb large quantities of hydrogen in a reversible manner [97].

Characterizing the diffusion of hydrogen in the crystal lattice at different temperatures is crucial to assess the performance of such storage materials. Hence, in this section we study a system consisting of a hydrogen-atom interstitial in bulk palladium, with a cubic supercell containing 32 Pd atoms and full symmetry group T^(3)_{2×2×2} ⊗ O_h, described at the DFT-PBE level of theory (see Methods section for more details). The BIGDML learning curve for this system is presented in Fig. 2. Within the FCC lattice there are two possible cavities for hydrogen storage: the octahedral (O-site) and the tetrahedral (T-site) cavities (see Fig. 8-A, top), where the O-site is the global minimum [97], separated from the T-site by an energetic barrier of ≈160 meV, as shown in Fig. 8-A, bottom. This figure also shows the excellent agreement between the BIGDML model and the reference DFT calculations.

Given the height of the energetic barrier between the two minima, it is to be expected that the NQE-induced delocalization of the hydrogen atom is insufficient to promote H-atom tunneling at temperatures close to room temperature, while at higher temperatures classical thermal effects should dominate the dynamics. In contrast, Kimizuka et al. [97] reported a study based on transition state theory (TST) suggesting not only that the inclusion of NQE has a strong effect on H-atom diffusion, but also that NQE hinder the migration from the O-site to the T-site. In order to elucidate the realistic dynamics of the H atom in the metal lattice and the impact of NQE without relying on approximations such as TST, we performed direct classical MD and PIMD simulations at different temperatures, from 100 K to 1000 K (see Methods for more details). We first studied the NQE-induced statistical sampling of the hydrogen atom in each cavity, as shown in Fig. 8-B (see supporting material for an animated version of this figure). This helps us to visualize the hydrogen dynamics in the temperature range from 100 K to 300 K and to determine the shape of the cavity, which transforms from a cube to a much larger truncated octahedron as the temperature increases.
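The occupancy plots of Fig. 8-B can be reproduced from a trajectory by histogramming the H-atom positions in fractional coordinates; a minimal sketch follows (hypothetical helper; units and bin count are arbitrary):

import numpy as np

def h_site_occupancy(h_positions, cell, bins=40):
    """3D probability density of the interstitial H atom (a Fig. 8-B style analysis).
    h_positions: (n_steps, 3) Cartesian coordinates of the H atom along the trajectory,
    wrapped into fractional coordinates of the supercell before histogramming."""
    frac = (np.linalg.inv(cell) @ h_positions.T).T % 1.0   # wrap into the unit cube
    hist, edges = np.histogramdd(frac, bins=bins, range=[(0.0, 1.0)] * 3, density=True)
    return hist, edges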


FIG. 8. A) (Top) Minima of the trajectory used to calculate the minimum-energy path (MEP) for the diffusion of H in FCC Pd. (Bottom) MEP between adjacent O- and T-sites as calculated using BIGDML (continuous line) and DFT-PBE (circles). B) Three-dimensional plots of the probability distribution of the H atom in the O-site at different temperatures. C) Transition rates from O-sites to T-sites of an H atom in bulk Pd as a function of temperature. The green circles and yellow squares represent the values calculated using classical MD and PIMD with the BIGDML model, respectively. The dotted lines are Arrhenius fits to the data from BIGDML@MD (green) and to the classical transition state theory (TST) results of Kimizuka et al. [97] (blue); the red diamonds are the TST-PIMD results of Kimizuka et al.

Then, from the generated (classical and quantum) trajectories we have estimated the diffusivity of the hydrogen atom as a function of temperature, which is shown in Fig. 8-C along with the TST results. From these results we observe the usual Arrhenius temperature dependence in the classical MD case, but, more interestingly, the quantum dynamics lead to essentially the same diffusion coefficients. In fact, it is to be expected that NQE do not play a major role in the diffusivity in this particular case at room temperature, given that the thermal energy is ≈26 meV while the energetic barrier between the O- and T-sites is 160 meV (Fig. 8-A). Hence, the NQE do not provide the excess energy required to promote proton tunneling. Furthermore, as the temperature increases the NQE become less important and classical thermal effects dominate the dynamics, hence the hydrogen diffusion remains Arrhenius-like. Comparing our explicit MD simulations with previous approximate TST results [97], we see pronounced differences. The TST prediction of the classical transition rate considerably overestimates the actual value obtained from our more robust simulations. The deviations observed in TST are due to the neglect of anharmonicities in this approximate theory, which are, in contrast, fully treated in our MD/PIMD simulations. The results presented in this section demonstrate how BIGDML enables long PIMD simulations to obtain novel insights into the dynamical behavior of intricate materials containing vacancies or interstitial atoms.
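For reference, a generic sketch of the analysis behind Fig. 8-C: a diffusion coefficient from the Einstein relation and an Arrhenius fit of D(T). The helper names are illustrative, the units follow those of the input trajectory, and this is not the authors' analysis code.

import numpy as np

K_B = 8.617333e-5   # Boltzmann constant in eV/K

def diffusion_coefficient(positions, dt):
    """Einstein-relation estimate: fit MSD(t) ~ 6 D t over the trajectory tail.
    positions: (n_steps, 3) unwrapped coordinates of the diffusing atom; dt: time step.
    Single-origin MSD for brevity; production analyses average over time origins."""
    msd = np.sum((positions - positions[0])**2, axis=1)
    t = dt * np.arange(len(msd))
    tail = slice(len(msd) // 2, None)          # discard the early ballistic regime
    slope, _ = np.polyfit(t[tail], msd[tail], 1)
    return slope / 6.0

def arrhenius_fit(temperatures, diffusivities):
    """Fit ln D = ln D0 - Ea / (kB T); returns the prefactor D0 and activation energy Ea in eV."""
    slope, intercept = np.polyfit(1.0 / np.asarray(temperatures),
                                  np.log(np.asarray(diffusivities)), 1)
    return np.exp(intercept), -slope * K_B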

III. DISCUSSION

In this work, we introduced the BIGDML approach: an MLFF for materials that is accurate, straightforward to construct, efficient in terms of learning from reference ab initio data, and computationally inexpensive to evaluate. The accuracy and efficiency of the BIGDML method stem from extending the sGDML framework for finite systems [22, 41] by employing a global periodic descriptor and making use of the translational and Bravais symmetry groups of materials. The BIGDML approach enables carrying out extended dynamical simulations of materials, while correctly describing all relevant chemical and physical (long-range) interactions in periodic systems contained within the reference data. In principle, the BIGDML method would allow exact dynamics of materials to be executed once high-level electronic-structure force calculations for periodic systems (with CCSD(T) or quantum Monte Carlo methods) become a reality [38–40]. We remark that the molecular sGDML approach has already fulfilled this long-standing goal for molecules with up to a few dozen atoms [3, 22].

We have demonstrated the applicability and robustness of the BIGDML method by studying a wide variety of relevant materials and their static and dynamical properties, for example successfully assessing the performance of BIGDML models for physical observables in the harmonic and anharmonic regimes in the form of phonon bands and molecular dynamics. Furthermore, we carried out predictive simulations of interstitial hydrogen diffusion in bulk Pd, and accurately captured the intricate van der Waals forces and dynamics of the interface formed by molecular benzene and a 2D graphene layer.

From a practical perspective, the BIGDML approach represents an advantageous framework beyond its accuracy and data efficiency, given that model generation is a straightforward process, from the simplicity of database generation to its out-of-the-box training procedure [42]. Given the very few data samples needed to generate a relatively accurate BIGDML model (see Fig. 2), our model can be coupled with any DFT code to substantially accelerate DFT-based dynamics with minimal human effort invested in constructing the initial dataset. To illustrate the gain in computational speed, we remark that for benzene/graphene we gain a factor of 50,000 for computing atomic forces with BIGDML when compared to the PBE+MBD level of electronic-structure theory. This gain would further increase when using a higher level of quantum-mechanical method for generating the reference data.

Many powerful MLFFs for materials have been proposed and some are already widely used for materials modeling [53, 99]. In order to place the BIGDML model in the current context of MLFFs for materials, it is convenient to address some of the limitations that current methodologies face, as well as to discuss goals to pursue with the next generation of MLFFs in materials science.

All current MLFFs for materials known to the authors employ the locality approximation, i.e. they build a model for the energy of an atom in a certain chemical environment, which is defined by a cutoff function. The typically employed cutoffs of 3-8 Å are of rather short range. Increasing the cutoff does not necessarily lead to a better model, because electronic interactions exhibit hard-to-learn multiscale structure [9]. In addition, different interaction scales are mutually coupled. An attractive feature of the locality approximation is that, in principle, the short-range interactions are transferable to different systems. However, in practice this is not a general finding. For example, it was shown that a general-purpose GAP/SOAP MLFF for carbon [34] yields errors an order of magnitude higher for graphene compared to the same methodology trained specifically on graphene data [35]. In addition, local MLFFs typically decouple the interaction potentials of different atoms by assigning atom types. For example, carbon in benzene and carbon in graphene could be treated as different atom types. Obviously, such decoupling makes the learning problem harder because more data is necessary to "restore the coupling" between different atomic species.

BIGDML solves both localization problems in a robust way by using a global descriptor with periodic boundary conditions. Any type of interaction can be captured by BIGDML and all atoms are mutually coupled by construction. The disadvantage of such an approach is that a BIGDML model is system-specific and hence not transferable between different systems, or even between different supercell sizes for the same system. Despite this slight drawback, it is clear that having access to an MLFF that can robustly represent all possible interactions in extended materials is a substantial achievement. In addition, we should stress that BIGDML has a superior learning capacity compared to local MLFFs, since it can reach generalization accuracies up to two orders of magnitude better than localized MLFFs (see Fig. 3).

Another crucial aspect of MLFFs is their data efficiency and ability to correctly capture all relevant symmetries of a given system. Symmetries play a crucial role when studying nuclear displacements (phonons, thermal conductivities, etc.). BIGDML solves both of these challenges at the same time. The symmetries are obtained from the periodic cell and the reference geometries in a data-driven way. Symmetries are known to effectively reduce the complexity of the learning problem, and we have convincingly demonstrated this fact for finite molecular systems [22]. Periodic systems have even more symmetries than molecules, making the force-field reconstruction effectively a lower-dimensional task. While this qualitative outcome could have been expected prior to the formulation of BIGDML, the enormous practical advantage of incorporating crystalline symmetries is remarkable. Even a few dozen samples (atomic forces for a few unit-cell geometries) already yield BIGDML models that can be used in practical molecular dynamics applications.

We would like to remark further that, while BIGDML is a kernel-based approach, elegantly able to formally include symmetries and prior physical information, it will be an interesting and important challenge to transfer the learning machinery established here to deep learning approaches (such as convolutional neural networks, graph neural networks or even generative adversarial models), ideally by incorporating symmetries, prior physical knowledge and equivariance constructions into their architecture (see Refs. [23, 33, 100, 101] for some first steps in this direction).

With the advent of new advanced materials such as high-performance perovskite solar cells, topological insulators and van der Waals materials, it is crucial to construct reliable MLFFs capable of dynamical simulations at the highest level of accuracy given by electronic-structure theories while maintaining a relatively low computational cost. While local MLFFs and BIGDML are complementary approaches, we would like to emphasize that global representations and symmetries could also be readily incorporated into other MLFF models. The challenge of developing accurate, efficient, scalable, and transferable MLFFs valid for molecules, materials, and interfaces thereof suggests the need for many further developments aiming towards universally applicable MLFF models.


IV. METHODS

1. Data generation and DFT calculations

Given the different types of calculations and materials in this work, we present the details of the data generation, model training and simulations organized per system. All databases were generated from molecular dynamics simulations in the NVT ensemble.

Graphene. Here we used a 5×5 supercell described at the DFT level within the generalized gradient approximation (GGA), using the Perdew-Burke-Ernzerhof (PBE) [90] exchange-correlation functional. The calculations were performed with the Quantum Espresso [102, 103] software suite, using plane waves with ultrasoft pseudopotentials and scalar-relativistic corrections and an energy cutoff of 40 Ry. A uniform 3×3×1 Monkhorst-Pack grid of k-points was used to integrate over the Brillouin zone. The ab initio MD (AIMD) used to generate the database was run at 500 K for 10,000 time steps with an integration step of 0.5 fs. The results displayed in Fig. 6 were obtained from PIMD simulations with 32 beads, run for 300 ps with an integration step of 0.5 fs.

Pd1/MgO. In this case, we used a 2×2 supercell with 3 atomic layers to model the MgO (100) surface. The calculations were performed with Quantum Espresso, using an energy cutoff of 50 Ry and integrating over the Brillouin zone at the Γ-point only. For this system, we ran AIMD at 500 K with an integration step of 1.0 fs for 10,000 steps to generate the material's database.

Benzene/graphene. For this example we used the same graphene supercell mentioned above and placed a benzene molecule on top. In order to include the correct non-covalent interactions between the benzene molecule and the graphene layer, we used an all-electron DFT/PBE level of theory with the many-body dispersion (MBD) [68, 69] treatment of the van der Waals interactions, using the FHI-aims [104] code. The AIMD simulation for the construction of the system's database was performed at 500 K with an integration step of 1.0 fs for 15,000 steps. The results displayed in Fig. 7 were obtained from PIMD simulations with 1, 8, 16 and 32 beads (in order to guarantee converged NQE), run for 200 ps with an integration step of 0.5 fs.

Bulk metals. In this case we were interested in a variety of materials and their different interactions. We considered Pd[FCC] and Na[BCC], described at the DFT/PBE level of theory using the Quantum Espresso software. The databases were created by running AIMD simulations at 500 K and 1000 K for Pd, and at 300 K for Na, using a time step of 1.0 fs for all simulations. Monkhorst-Pack grids of 3×3×3 k-points were used to integrate over the Brillouin zone for all materials. All calculations for the bulk metals were spin-polarized.

H in Pd[FCC]. In this case we used a 3×3×3 supercell with 32 Pd atoms and a single hydrogen atom, described at the DFT/PBE level of theory using the Quantum Espresso software. The database was generated by running AIMD at 1000 K with a time step of 1.0 fs and a total trajectory length of 6 ps. A Monkhorst-Pack grid of 3×3×3 k-points was used to integrate over the Brillouin zone. The results shown in Fig. 8 were obtained by running classical MD and PIMD simulations using an interface of the BIGDML FF with the i-PI simulation package [105]. We ran the simulations at various temperatures from 300 K to 1000 K, in each case employing a time step of 2.0 fs for 2,000,000 steps, for a total simulation time of 4 ns. For the PIMD simulations we used a different number of beads at each temperature: 32 for 100, 300, and 600 K; 24 for 400 K; 2 for 700 K; and 4 for 800 K. Using these data we computed the H diffusivity as a function of temperature.
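The diffusivity itself follows from the Einstein relation, D = lim_{t→∞} ⟨|r_H(t) − r_H(0)|²⟩/(6t). The following sketch (with a synthetic trajectory standing in for the actual MD data) shows one possible way to estimate D from an unwrapped hydrogen trajectory.

```python
# Hedged sketch: estimating a diffusion coefficient from the slope of the mean
# squared displacement (Einstein relation). The trajectory below is synthetic.
import numpy as np

def diffusion_coefficient(positions, dt_fs, fit_start=0.5):
    """positions: (n_frames, 3) unwrapped coordinates in Angstrom; returns D in cm^2/s."""
    n = len(positions)
    lags = np.arange(1, n // 2, 10)
    msd = np.array([np.mean(np.sum((positions[lag:] - positions[:-lag]) ** 2, axis=1))
                    for lag in lags])
    t = lags * dt_fs                                  # time lag in fs
    i0 = int(fit_start * len(t))                      # fit only the late-time regime
    slope = np.polyfit(t[i0:], msd[i0:], 1)[0]        # Angstrom^2 / fs
    return slope / 6.0 * 0.1                          # 1 Angstrom^2/fs = 0.1 cm^2/s

rng = np.random.default_rng(0)
traj = np.cumsum(0.05 * rng.standard_normal((20000, 3)), axis=0)  # stand-in for r_H(t)
print(f"D = {diffusion_coefficient(traj, dt_fs=2.0):.3e} cm^2/s")
```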

2. The sGDML framework

A data-efficient reconstruction of accurate force fields with ML hinges on including the right inductive biases in the model to compensate for finite reference dataset sizes. The Symmetric Gradient-Domain Machine Learning (sGDML) approach achieves this through constraints derived from exact physical laws [22, 41, 42]. In addition to the basic roto-translational invariance of the energy, sGDML implements energy conservation, a fundamental property of closed classical and quantum mechanical systems. The key idea behind sGDML is to define a Gaussian process (GP) using a kernel $k(\mathbf{x},\mathbf{x}') = \nabla_{\mathbf{x}} k_E(\mathbf{x},\mathbf{x}')\,\nabla_{\mathbf{x}'}^{\top}$ that models any force field $\mathbf{f}_F$ as a transformation of some unknown potential energy surface $f_E$ such that

$$\mathbf{f}_F = -\nabla f_E \sim \mathcal{GP}\!\left[-\nabla \mu_E(\mathbf{x}),\ \nabla_{\mathbf{x}} k_E(\mathbf{x},\mathbf{x}')\,\nabla_{\mathbf{x}'}^{\top}\right]. \quad (2)$$

Here, $\mu_E : \mathbb{R}^d \to \mathbb{R}$ and $k_E : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ are the prior mean and prior covariance functions of the latent energy GP predictor, respectively.
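To make the structure of this kernel concrete, the snippet below writes out the Hessian block ∇_x k_E(x,x′)∇_{x′}^⊤ analytically for a simple RBF energy kernel; note that the actual (s)GDML models employ a Matérn kernel over a descriptor rather than a plain RBF, so this is only an illustrative stand-in.

```python
# Illustrative Hessian kernel grad_x k_E(x, x') grad_x'^T for an RBF energy prior.
# The actual (s)GDML kernel uses a Matern 5/2 covariance over a descriptor.
import numpy as np

def force_kernel_rbf(x, xp, sigma=1.0):
    """Return the d x d force-force covariance block for descriptors x, xp."""
    d = x - xp
    k_E = np.exp(-d @ d / (2.0 * sigma**2))                       # energy kernel k_E
    return (np.eye(x.size) / sigma**2 - np.outer(d, d) / sigma**4) * k_E

# Forces are then predicted as f(x) = sum_i K(x, x_i) alpha_i, where the alpha_i
# solve the regularized linear system assembled from these same blocks.
x, xp = np.random.rand(6), np.random.rand(6)
print(force_kernel_rbf(x, xp).shape)                              # (6, 6)
```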

The sGDML model also incorporates all relevant rigid space-group symmetries, as well as dynamic non-rigid symmetries of the system at hand, into the same kernel to further improve its efficiency. Those symmetries are automatically recovered as atom permutations via multi-partite matching of all geometries in the training dataset [22]. BIGDML extends sGDML to periodic systems, which possess unique permutational symmetries that were previously not considered.
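Conceptually, incorporating a recovered symmetry group amounts to summing the covariance over its transformations, k_sym(x,x′) = Σ_P k(x, Px′); the toy sketch below illustrates this symmetrization step with a hand-picked permutation list rather than the multi-partite matching output of Ref. [22].

```python
# Toy illustration of kernel symmetrization over atom permutations,
# k_sym(x, x') = sum_P k(x, P x'); the permutations here are hand-picked,
# not the result of the multi-partite matching of Ref. [22].
import numpy as np

def symmetrized_kernel(x, xp, perms, base_kernel):
    xp_atoms = xp.reshape(-1, 3)                     # (n_atoms, 3) view of x'
    return sum(base_kernel(x, xp_atoms[p].ravel()) for p in perms)

rbf = lambda a, b: np.exp(-np.sum((a - b) ** 2) / 2.0)
perms = [np.array([0, 1, 2]), np.array([1, 0, 2])]   # identity + swap of two atoms
x, xp = np.random.rand(9), np.random.rand(9)
print(symmetrized_kernel(x, xp, perms, rbf))
```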

3. Coulomb matrix PBC implementation

The periodic boundary conditions were implemented using the minimum image convention. Under this convention, we take the distance between two atoms to be the shortest distance between their periodic images. We start by expressing the distance vectors $\mathbf{d}_{ij} = \mathbf{r}_i - \mathbf{r}_j$ in the basis of the simulation supercell lattice vectors as

$$\mathbf{d}_{ij} = A\,\mathbf{c}_{ij}, \quad (3)$$

where $A$ is a $3 \times 3$ matrix which contains the lattice (supercell) vectors as columns, and $\mathbf{c}_{ij}$ are the distance vectors in the new basis. We then confine the original distance vectors to the simulation cell,

$$\mathbf{d}^{\mathrm{(PBC)}}_{ij} = \mathbf{d}_{ij} - A\,\mathrm{nint}(\mathbf{c}_{ij}), \quad (4)$$

where $\mathrm{nint}(x)$ is the nearest-integer function. By replacing the ordinary distance vectors $\mathbf{d}_{ij}$ with $\mathbf{d}^{\mathrm{(PBC)}}_{ij}$ in the Coulomb matrix descriptor, it becomes

$$D^{\mathrm{(PBC)}}_{ij} = \begin{cases} 1/|\mathbf{d}^{\mathrm{(PBC)}}_{ij}| & \text{if } i \neq j \\ 0 & \text{if } i = j \end{cases} \quad (5)$$

In practice, only the upper triangular part of $D^{\mathrm{(PBC)}}$ is used.
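A minimal NumPy sketch of Eqs. (3)-(5) (not the packaged BIGDML implementation) reads:

```python
# Minimal sketch of the minimum-image Coulomb matrix of Eqs. (3)-(5), assuming
# a supercell matrix A with lattice vectors as columns and Cartesian positions r.
import numpy as np

def coulomb_matrix_pbc(r, A):
    """Return D^(PBC) for positions r (n, 3) and lattice matrix A (3, 3)."""
    d = r[:, None, :] - r[None, :, :]          # d_ij = r_i - r_j
    c = d @ np.linalg.inv(A).T                 # fractional components, d_ij = A c_ij
    d_pbc = d - np.rint(c) @ A.T               # Eq. (4): fold into the cell
    dist = np.linalg.norm(d_pbc, axis=-1)
    D = np.zeros_like(dist)
    off = ~np.eye(len(r), dtype=bool)
    D[off] = 1.0 / dist[off]                   # Eq. (5): 1/|d_ij^(PBC)| off-diagonal
    return D

# Example: two atoms in a cubic cell of side 5 Angstrom
A = 5.0 * np.eye(3)
r = np.array([[0.2, 0.0, 0.0], [4.9, 0.0, 0.0]])
print(coulomb_matrix_pbc(r, A))                # off-diagonal ~ 1/0.3
```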

4. Software: Interface with i-PI

For this work, a highly optimised interface of BIGDML has been implemented in the i-PI molecular dynamics package [105]. The main features of this implementation are: (1) it allows the use of periodic boundary conditions and the calculation of the stress tensor, (2) it supports parallel querying of all beads at once in PIMD simulations, and (3) it uses the highly optimized sGDML GPU implementation in PyTorch to parallelise the bead calculations, dramatically increasing the simulation efficiency.

5. Software: Interface with Phonopy for phonons

An ASE calculator is already provided by the sGDML package, which gives access to all of ASE's simulation options. In particular, the phonon analysis for materials is easily computed within this framework using Phonopy [106]. An example of the scripts used to compute the phonons in this paper is provided in the Supporting Information.
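As an indication of how such a script can look (a hedged sketch; the model file, cell geometry, supercell matrix, and displacement distance below are placeholders rather than the settings used in the paper), one can feed finite-displacement supercells generated by Phonopy to the sGDML ASE calculator:

```python
# Hedged sketch: phonons from a trained (BIG)sGDML model via the sGDML ASE
# calculator and Phonopy. Model file, geometry, supercell matrix and
# displacement distance are placeholders.
import numpy as np
from ase import Atoms
from sgdml.intf.ase_calc import SGDMLCalculator
from phonopy import Phonopy
from phonopy.structure.atoms import PhonopyAtoms

calc = SGDMLCalculator('bigdml_model.npz')            # placeholder model file

unitcell = PhonopyAtoms(symbols=['C', 'C'],            # illustrative two-atom cell
                        cell=[[2.46, 0, 0], [-1.23, 2.13, 0], [0, 0, 15.0]],
                        scaled_positions=[[0, 0, 0.5], [1/3, 2/3, 0.5]])
phonon = Phonopy(unitcell, supercell_matrix=np.diag([2, 2, 1]))
phonon.generate_displacements(distance=0.01)

forces = []
for sc in phonon.supercells_with_displacements:        # finite-displacement supercells
    atoms = Atoms(sc.symbols, cell=sc.cell,
                  scaled_positions=sc.scaled_positions, pbc=True)
    atoms.calc = calc
    forces.append(atoms.get_forces())

phonon.forces = np.array(forces)
phonon.produce_force_constants()                       # force constants from ML forces
```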

V. SOFTWARE AVAILABILITY

Our code, documentation and datasets are available at http://sgdml.org.

VI. ACKNOWLEDGEMENTS

AT was supported by the Luxembourg National Research Fund (DTU PRIDE MASSENA) and by the European Research Council (ERC-CoG BeStMo). KRM was supported in part by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea Government (No. 2019-0-00079, Artificial Intelligence Graduate School Program, Korea University), and was partly supported by the German Ministry for Education and Research (BMBF) under Grants 01IS14013A-E, 01GQ1115, 01GQ0850, 01IS18025A and 01IS18037A, and the German Research Foundation (DFG) under Grant Math+, EXC 2046/1, Project ID 390685689. Correspondence should be addressed to HES, KRM and AT. LOPB thanks DGAPA-UNAM (PAPIIT) for financial support under Projects IA102218 and IN116020, as well as CPU time at Supercomputo UNAM (Miztli) through DGTIC-UNAM grant LANCAD-UNAM-DGTIC-307. LOPB and LEGG are also grateful to CONACYT-Mexico for support through Project 285218 and doctoral scholarship 493775, respectively. HES works at the BASLEARN - TU Berlin/BASF Joint Lab for Machine Learning, co-financed by TU Berlin and BASF SE.

[1] M. Veit, S. K. Jain, S. Bonakala, I. Rudra, D. Hohl, and G. Csányi, J. Chem. Theory Comput. 15, 2574 (2019).
[2] B. Cheng, G. Mazzola, C. J. Pickard, and M. Ceriotti, Nature 585, 217 (2020).
[3] H. E. Sauceda, V. Vassilev-Galindo, S. Chmiela, K.-R. Müller, and A. Tkatchenko, Nat. Commun. 12, 442 (2021).
[4] V. L. Deringer, N. Bernstein, G. Csányi, C. Ben Mahmoud, M. Ceriotti, M. Wilson, D. A. Drabold, and S. R. Elliott, Nature 589, 59 (2021).
[5] V. Ladygin, P. Korotaev, A. Yanilkin, and A. Shapeev, Comput. Mater. Sci. 172, 109333 (2020).
[6] J. S. Smith, O. Isayev, and A. E. Roitberg, Chem. Sci. 8, 3192 (2017).
[7] F. Noé, A. Tkatchenko, K.-R. Müller, and C. Clementi, Annu. Rev. Phys. Chem. 71, 361 (2020).
[8] A. Tkatchenko, Nat. Commun. 11, 4125 (2020).
[9] O. T. Unke, S. Chmiela, H. E. Sauceda, M. Gastegger, I. Poltavsky, K. T. Schütt, A. Tkatchenko, and K.-R. Müller, Chem. Rev. (2021).
[10] O. A. von Lilienfeld, Angew. Chem. Int. Ed. 57, 4164 (2018).
[11] K. T. Schütt, S. Chmiela, O. A. von Lilienfeld, A. Tkatchenko, K. Tsuda, and K.-R. Müller, Machine Learning Meets Quantum Physics, Vol. 968 (Springer Lecture Notes in Physics, 2020).

[12] F. Musil, A. Grisafi, A. P. Bartók, C. Ortner, G. Csányi, and M. Ceriotti, "Physics-inspired structural representations for molecules and materials," (2021), arXiv:2101.04673.

[13] O. A. von Lilienfeld and K. Burke, Nat. Commun. 11, 4895 (2020).
[14] W. Gao, S. P. Mahajan, J. Sulam, and J. J. Gray, Patterns 1, 100142 (2020).
[15] F. Noé, G. De Fabritiis, and C. Clementi, Curr. Opin. Struct. Biol. 60, 77 (2020).
[16] S. A. Ghasemi, A. Hofstetter, S. Saha, and S. Goedecker, Phys. Rev. B 92, 045131 (2015).
[17] I. S. Novikov, K. Gubaev, E. V. Podryabinkin, and A. V. Shapeev, Mach. Learn.: Sci. Technol. 2, 025002 (2021).
[18] N. Artrith, A. Urban, and G. Ceder, J. Chem. Phys. 148, 241711 (2018).
[19] J. Byggmästar, K. Nordlund, and F. Djurabekova, "Gaussian approximation potentials for body-centered cubic transition metals," (2020), arXiv:2006.14365.
[20] A. P. Bartók, J. Kermode, N. Bernstein, and G. Csányi, Phys. Rev. X 8, 041048 (2018).
[21] A. P. Bartók, S. De, C. Poelking, N. Bernstein, J. R. Kermode, G. Csányi, and M. Ceriotti, Sci. Adv. 3, e1701816 (2017).
[22] S. Chmiela, H. E. Sauceda, K.-R. Müller, and A. Tkatchenko, Nat. Commun. 9, 3887 (2018).
[23] O. T. Unke and M. Meuwly, J. Chem. Theory Comput. 15, 3678 (2019).
[24] C. Devereux, J. S. Smith, K. K. Davis, K. Barros, R. Zubatyuk, O. Isayev, and A. E. Roitberg, J. Chem. Theory Comput. 16, 4192 (2020).
[25] J. Behler, Int. J. Quantum Chem. 115, 1032 (2015).
[26] K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev, and A. Walsh, Nature 559, 547 (2018).
[27] S. K. Wallace, A. van Roekeghem, A. S. Bochkarev, J. Carrasco, A. Shapeev, and N. Mingo, Phys. Rev. Research 3, 013139 (2021).

[28] O. A. von Lilienfeld, K.-R. Müller, and A. Tkatchenko, Nat. Rev. Chem. 4, 347 (2020).
[29] P. Seema, J. Behler, and D. Marx, Phys. Rev. Lett. 115, 036102 (2015).
[30] K. T. Schütt, H. E. Sauceda, P.-J. Kindermans, A. Tkatchenko, and K.-R. Müller, J. Chem. Phys. 148, 241722 (2018).
[31] V. L. Deringer, M. A. Caro, and G. Csányi, Nat. Commun. 11, 5461 (2020).
[32] T. W. Ko, J. A. Finkler, S. Goedecker, and J. Behler, Nat. Commun. 12, 398 (2021).
[33] O. T. Unke, S. Chmiela, M. Gastegger, K. T. Schütt, H. E. Sauceda, and K.-R. Müller, "SpookyNet: Learning force fields with electronic degrees of freedom and nonlocal effects," (2021), arXiv:2105.00304.
[34] P. Rowe, V. L. Deringer, P. Gasparotto, G. Csányi, and A. Michaelides, J. Chem. Phys. 153, 034702 (2020).
[35] P. Rowe, G. Csányi, D. Alfè, and A. Michaelides, Phys. Rev. B 97, 054303 (2018).
[36] J. Behler, Angew. Chem. Int. Ed. 56, 12828 (2017).
[37] N. Artrith and J. Behler, Phys. Rev. B 85, 045439 (2012).
[38] G. H. Booth, A. Grüneis, G. Kresse, and A. Alavi, Nature 493, 365 (2013).
[39] T. Gruber, K. Liao, T. Tsatsoulis, F. Hummel, and A. Grüneis, Phys. Rev. X 8, 021043 (2018).
[40] A. Zen, J. G. Brandenburg, J. Klimeš, A. Tkatchenko, D. Alfè, and A. Michaelides, Proc. Natl. Acad. Sci. USA 115, 1724 (2018).
[41] S. Chmiela, A. Tkatchenko, H. E. Sauceda, I. Poltavsky, K. T. Schütt, and K.-R. Müller, Sci. Adv. 3, e1603015 (2017).
[42] S. Chmiela, H. E. Sauceda, I. Poltavsky, K.-R. Müller, and A. Tkatchenko, Comput. Phys. Commun. 240, 38 (2019).

[43] G. Montavon, K. Hansen, S. Fazli, M. Rupp, F. Biegler, A. Ziehe, A. Tkatchenko, A. Lilienfeld, and K.-R. Müller, Advances in Neural Information Processing Systems 25, 440 (2012).
[44] G. Montavon, M. Rupp, V. Gobre, A. Vazquez-Mayagoitia, K. Hansen, A. Tkatchenko, K.-R. Müller, and O. A. von Lilienfeld, New J. Phys. 15, 095003 (2013).
[45] F. Anselmi, L. Rosasco, and T. Poggio, Information and Inference: A Journal of the IMA 5, 134 (2016).
[46] T. Poggio and F. Anselmi, Visual Cortex and Deep Networks: Learning Invariant Representations (MIT Press, 2016).
[47] M. Rupp, A. Tkatchenko, K.-R. Müller, and O. A. von Lilienfeld, Phys. Rev. Lett. 108, 058301 (2012).
[48] H. E. Sauceda, S. Chmiela, I. Poltavsky, K.-R. Müller, and A. Tkatchenko, J. Chem. Phys. 150, 114102 (2019).
[49] M. Hloucha and U. K. Deiters, Mol. Simul. 20, 239 (1998).
[50] S. Chmiela, Towards Exact Molecular Dynamics Simulations with Invariant Machine-Learned Models (Technische Universität Berlin, Germany, 2019).
[51] F. Faber, A. Lindmaa, O. A. von Lilienfeld, and R. Armiento, Int. J. Quantum Chem. 115, 1094 (2015).
[52] M. J. Willatt, F. Musil, and M. Ceriotti, J. Chem. Phys. 150, 154110 (2019).
[53] J. Behler and M. Parrinello, Phys. Rev. Lett. 98, 146401 (2007).
[54] A. P. Bartók, R. Kondor, and G. Csányi, Phys. Rev. B 87, 184115 (2013).
[55] H. Huo and M. Rupp, "Unified representation of molecules and crystals for machine learning," (2018), arXiv:1704.06439.
[56] K. T. Schütt, H. Glawe, F. Brockherde, A. Sanna, K. R. Müller, and E. K. U. Gross, Phys. Rev. B 89, 205118 (2014).
[57] F. A. Faber, A. S. Christensen, B. Huang, and O. A. von Lilienfeld, J. Chem. Phys. 148, 241717 (2018).
[58] Z. Li, J. R. Kermode, and A. De Vita, Phys. Rev. Lett. 114, 096405 (2015).
[59] W. Pronobis, A. Tkatchenko, and K.-R. Müller, J. Chem. Theory Comput. 14, 2991 (2018).
[60] J. Sólyom, Fundamentals of the Physics of Solids: Volume I: Structure and Dynamics, 1st ed. (Springer, 2008).

[61] I. Y. Zhang and A. Grüneis, Front. Mater. 6, 123 (2019).
[62] D. Yoon, Y.-W. Son, and H. Cheong, Nano Lett. 11, 3227 (2011).
[63] Y. Fan, Y. Xiang, and H. Shen, Nanotechnol. Rev. 8, 415–421 (2019).
[64] X.-F. Yang, A. Wang, B. Qiao, J. Li, J. Liu, and T. Zhang, Acc. Chem. Res. 46, 1740 (2013).
[65] A. Wang, J. Li, and T. Zhang, Nat. Rev. Chem. 2, 65 (2018).
[66] F. Doherty, H. Wang, M. Yang, and B. R. Goldsmith, Catal. Sci. Technol. 10, 5772 (2020).


[67] A. Tkatchenko and M. Scheffler, Phys. Rev. Lett. 102, 073005 (2009).
[68] A. Tkatchenko, R. A. DiStasio, R. Car, and M. Scheffler, Phys. Rev. Lett. 108, 236402 (2012).
[69] A. Ambrosetti, A. M. Reilly, R. A. DiStasio, and A. Tkatchenko, J. Chem. Phys. 140, 18A508 (2014).
[70] V. G. Ruiz, W. Liu, E. Zojer, M. Scheffler, and A. Tkatchenko, Phys. Rev. Lett. 108, 146103 (2012).
[71] J. Hermann and A. Tkatchenko, Phys. Rev. Lett. 124, 146401 (2020).
[72] F. Cleri and V. Rosato, Phys. Rev. B 48, 22 (1993).
[73] M. S. Daw, S. M. Foiles, and M. I. Baskes, Mat. Sci. Eng. Rep. 9, 251 (1993).
[74] H. E. Sauceda and I. L. Garzón, J. Phys. Chem. C 119, 10876 (2015).
[75] J. George, G. Hautier, A. P. Bartók, G. Csányi, and V. L. Deringer, J. Chem. Phys. 153, 044104 (2020).
[76] M. Lozada-Hidalgo, S. Hu, O. Marshall, A. Mishchenko, A. N. Grigorenko, R. A. W. Dryfe, B. Radha, I. V. Grigorieva, and A. K. Geim, Science 351, 68 (2016).
[77] I. Poltavsky, L. Zheng, M. Mortazavi, and A. Tkatchenko, J. Chem. Phys. 148, 204707 (2018).
[78] E. Tadmor, "EAM potential (LAMMPS cubic hermite tabulation) for Pd developed by Zhou, Johnson, and Wadley (2004); NIST retabulation v000," OpenKIM, https://doi.org/10.25950/9edc9c7c (2018).
[79] S. Gowtham, R. H. Scheicher, R. Ahuja, R. Pandey, and S. P. Karna, Phys. Rev. B 76, 033401 (2007).
[80] N. Varghese, U. Mogera, A. Govindaraj, A. Das, P. K. Maiti, A. K. Sood, and C. N. R. Rao, ChemPhysChem 10, 206 (2009).
[81] A. AlZahrani, Appl. Surf. Sci. 257, 807 (2010).
[82] T. Gan and S. Hu, Microchim. Acta 175, 1 (2011).
[83] B. D. Mohapatra, S. P. Mantry, N. Behera, B. Behera, S. Rath, and K. S. K. Varadwaj, Chem. Commun. 52, 10385 (2016).

[84] A. Chakradhar, N. Sivapragasam, M. T. Nayakasinghe, and U. Burghaus, J. Vac. Sci. Technol. A 34, 021402 (2016).
[85] S. Roychoudhury, C. Motta, and S. Sanvito, Phys. Rev. B 93, 045130 (2016).
[86] M. Z. Tonel, I. V. Lara, I. Zanella, and S. B. Fagan, Phys. Chem. Chem. Phys. 19, 27374 (2017).
[87] M. Z. Tonel, M. O. Martins, I. Zanella, R. B. Pontes, and S. B. Fagan, Comput. Theor. Chem. 1115, 270 (2017).
[88] E. E. de Moraes, M. Z. Tonel, S. B. Fagan, and M. C. Barbosa, J. Mol. Model. 25, 302 (2019).
[89] N. Ojaghlou, D. Bratko, M. Salanne, M. Shafiei, and A. Luzar, ACS Nano 14, 7987 (2020).
[90] J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996).
[91] R. Zacharia, H. Ulbricht, and T. Hertel, Phys. Rev. B 69, 155406 (2004).
[92] W. Fang, J. Chen, M. Rossi, Y. Feng, X.-Z. Li, and A. Michaelides, J. Phys. Chem. Lett. 7, 2125 (2016).
[93] T. E. Markland and M. Ceriotti, Nat. Rev. Chem. 2, 0109 (2018).
[94] M. Rossi, W. Fang, and A. Michaelides, J. Phys. Chem. Lett. 6, 4233 (2015).
[95] P. Leinen, M. Esders, K. T. Schütt, C. Wagner, K.-R. Müller, and F. S. Tautz, Sci. Adv. 6, eabb6987 (2020).
[96] A. Züttel, Materials Today 6, 24 (2003).
[97] H. Kimizuka, S. Ogata, and M. Shiga, Phys. Rev. B 97, 014102 (2018).
[98] D. E. Jiang and E. A. Carter, Phys. Rev. B 70, 064102 (2004).
[99] A. P. Bartók, M. C. Payne, R. Kondor, and G. Csányi, Phys. Rev. Lett. 104, 136403 (2010).
[100] N. Thomas, T. Smidt, S. Kearnes, L. Yang, L. Li, K. Kohlhoff, and P. Riley, "Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds," (2018), arXiv:1802.08219.
[101] K. T. Schütt, O. T. Unke, and M. Gastegger, "Equivariant message passing for the prediction of tensorial properties and molecular spectra," (2021), arXiv:2102.03150.

[102] P. Giannozzi, S. Baroni, N. Bonini, M. Calandra, R. Car, C. Cavazzoni, D. Ceresoli, G. L. Chiarotti, M. Cococcioni, I. Dabo, A. D. Corso, S. de Gironcoli, S. Fabris, G. Fratesi, R. Gebauer, U. Gerstmann, C. Gougoussis, A. Kokalj, M. Lazzeri, L. Martin-Samos, N. Marzari, F. Mauri, R. Mazzarello, S. Paolini, A. Pasquarello, L. Paulatto, C. Sbraccia, S. Scandolo, G. Sclauzero, A. P. Seitsonen, A. Smogunov, P. Umari, and R. M. Wentzcovitch, J. Phys.: Condens. Matter 21, 395502 (2009).
[103] P. Giannozzi, O. Andreussi, T. Brumme, O. Bunau, M. B. Nardelli, M. Calandra, R. Car, C. Cavazzoni, D. Ceresoli, M. Cococcioni, N. Colonna, I. Carnimeo, A. D. Corso, S. de Gironcoli, P. Delugas, R. A. DiStasio, A. Ferretti, A. Floris, G. Fratesi, G. Fugallo, R. Gebauer, U. Gerstmann, F. Giustino, T. Gorni, J. Jia, M. Kawamura, H.-Y. Ko, A. Kokalj, E. Kucukbenli, M. Lazzeri, M. Marsili, N. Marzari, F. Mauri, N. L. Nguyen, H.-V. Nguyen, A. O. de-la-Roza, L. Paulatto, S. Ponce, D. Rocca, R. Sabatini, B. Santra, M. Schlipf, A. P. Seitsonen, A. Smogunov, I. Timrov, T. Thonhauser, P. Umari, N. Vast, X. Wu, and S. Baroni, J. Phys.: Condens. Matter 29, 465901 (2017).
[104] V. Blum, R. Gehrke, F. Hanke, P. Havu, V. Havu, X. Ren, K. Reuter, and M. Scheffler, Comput. Phys. Commun. 180, 2175 (2009).
[105] V. Kapil, M. Rossi, O. Marsalek, R. Petraglia, Y. Litman, T. Spura, B. Cheng, A. Cuzzocrea, R. H. Meißner, D. M. Wilkins, B. A. Helfrecht, P. Juda, S. P. Bienvenue, W. Fang, J. Kessler, I. Poltavsky, S. Vandenbrande, J. Wieme, C. Corminboeuf, T. D. Kühne, D. E. Manolopoulos, T. E. Markland, J. O. Richardson, A. Tkatchenko, G. A. Tribello, V. Van Speybroeck, and M. Ceriotti, Comput. Phys. Commun. 236, 214 (2019).

[106] A. Togo and I. Tanaka, Scr. Mater. 108, 1 (2015).