Icospherical Chemical Objects (ICOs) allow for chemical data augmentation and maintain rotational, translation and permutation invariance
Abstract
Dataset augmentation is a common way to deal with small datasets, and chemistry datasets are often small. Spherical convolutional neural networks (SphNNs) and icosahedral neural networks (IcoNNs) are geometric machine learning algorithms that maintain rotational symmetry. Molecular structure is rotationally invariant and inherently 3-D, so 3-D encoding methods are needed to input molecular structure into machine learning models. In this paper I present Icospherical Chemical Objects (ICOs), which encode 3-D data in a rotationally invariant way, work with spherical or icosahedral neural networks, and allow for dataset augmentation. I demonstrate the ICO featurisation method on three tasks: predicting general molecular properties, predicting the solubility of drug-like molecules, and the protein binding problem. I find that ICOs and SphNNs perform well on all three.
- Publication: arXiv e-prints
- Pub Date: April 2023
- DOI: 10.48550/arXiv.2304.07558
- arXiv: arXiv:2304.07558
- Bibcode: 2023arXiv230407558G
- Keywords: Computer Science - Machine Learning
- E-Print: 16 pages, 13 figures