Big Data of Materials Science: Critical Role of the Descriptor
Abstract
Statistical learning of materials properties or functions has so far started with a largely silent, unchallenged step: the choice of the set of descriptive parameters (termed the descriptor). However, when the scientific connection between the descriptor and the actuating mechanisms is unclear, the causality of the learned descriptor-property relation is uncertain. Thus, trustworthy prediction of new promising materials, identification of anomalies, and scientific advancement are doubtful. We analyze this issue and define requirements for a suitable descriptor. For a classic example, the energy difference between zinc blende or wurtzite and rocksalt semiconductors, we demonstrate how a meaningful descriptor can be found systematically.
- Publication:
- Physical Review Letters
- Pub Date:
- March 2015
- DOI:
- 10.1103/PhysRevLett.114.105503
- arXiv:
- arXiv:1411.7437
- Bibcode:
- 2015PhRvL.114j5503G
- Keywords:
- 61.50.-f;
- 02.60.Ed;
- 71.15.Mb;
- 89.20.Ff;
- Crystalline state;
- Interpolation; curve fitting;
- Density functional theory, local density approximation, gradient and other corrections;
- Computer science and technology;
- Physics - Data Analysis, Statistics and Probability
- E-Print:
- Accepted to Phys. Rev. Lett.