Crystal Structure Representations for Machine Learning Models of Formation Energies
Abstract
We introduce and evaluate a set of feature vector representations of crystal structures for machine learning (ML) models of formation energies of solids. ML models of atomization energies of organic molecules have been successful using a Coulomb matrix representation of the molecule. We consider three ways to generalize such representations to periodic systems: (i) a matrix where each element is related to the Ewald sum of the electrostatic interaction between two different atoms in the unit cell repeated over the lattice; (ii) an extended Coulomb-like matrix that takes into account a number of neighboring unit cells; and (iii) an Ansatz that mimics the periodicity and the basic features of the elements in the Ewald sum matrix by using a sine function of the crystal coordinates of the atoms. The representations are compared for a Laplacian kernel with Manhattan norm, trained to reproduce formation energies using a data set of 3938 crystal structures obtained from the Materials Project. For training sets consisting of 3000 crystals, the generalization error in predicting formation energies of new structures corresponds to (i) 0.49, (ii) 0.64, and (iii) 0.37 eV/atom for the respective representations.
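As a rough illustration of two ingredients named in the abstract, the sketch below builds a molecular Coulomb matrix and evaluates a Laplacian kernel with the Manhattan (L1) norm. It is not taken from the paper: the function names `coulomb_matrix` and `laplacian_kernel` and the width parameter `sigma` are hypothetical, the diagonal term 0.5 Z^2.4 follows the commonly used molecular Coulomb matrix convention, and the choice of units is an assumption. The three periodic generalizations evaluated in the paper are not reproduced here.

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Molecular Coulomb matrix (commonly used convention, assumed here).

    Z: nuclear charges, shape (n,)
    R: Cartesian positions, shape (n, 3), in whatever length unit is chosen
    Off-diagonal: Z_i * Z_j / |R_i - R_j|; diagonal: 0.5 * Z_i**2.4.
    """
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    # Pairwise Euclidean distances between all atoms.
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    # Placeholder on the diagonal to avoid dividing by zero; overwritten below.
    np.fill_diagonal(dist, 1.0)
    M = np.outer(Z, Z) / dist
    np.fill_diagonal(M, 0.5 * Z ** 2.4)
    return M

def laplacian_kernel(X, Y, sigma):
    """Laplacian kernel with Manhattan norm: k(x, y) = exp(-||x - y||_1 / sigma).

    X: shape (n, d), Y: shape (m, d); returns an (n, m) kernel matrix.
    sigma is a hypothetical width hyperparameter, to be tuned by the user.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    l1 = np.abs(X[:, None, :] - Y[None, :, :]).sum(axis=-1)
    return np.exp(-l1 / sigma)

# Usage: two toy diatomics, flattened into descriptor vectors.
m1 = coulomb_matrix([1, 1], [[0, 0, 0], [0, 0, 2.0]])
m2 = coulomb_matrix([1, 1], [[0, 0, 0], [0, 0, 4.0]])
X = np.stack([m1.ravel(), m2.ravel()])
K = laplacian_kernel(X, X, sigma=1.0)
```

Flattening each matrix into a vector is one simple way to feed it to the kernel; it assumes all structures have the same number of atoms and a consistent atom ordering, which a real data set would have to handle separately.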
Publication: arXiv e-prints
Pub Date: March 2015
DOI: 10.48550/arXiv.1503.07406
arXiv: arXiv:1503.07406
Bibcode: 2015arXiv150307406F
Keywords: Physics - Chemical Physics