Automatic learning of hydrogen-bond fixes in an AMBER RNA force field
Abstract
Current force fields reproduce RNA structural dynamics only to a limited extent. Several methods have been developed that use experimental data to enforce agreement with experiments. We herein extend an existing framework that fits arbitrarily chosen force-field correction terms by quantifying the discrepancy between observables back-calculated from simulation and the corresponding experiments. We apply a robust regularization protocol to avoid overfitting, and additionally introduce and compare a number of regularization strategies, namely L1, L2, Kish size, relative Kish size, and relative entropy penalties. The training set includes a GACC tetramer as well as more challenging systems, namely the gcGAGAgc and gcUUCGgc RNA tetraloops. Specific intramolecular hydrogen bonds in the AMBER RNA force field are corrected with automatically determined parameters that we call gHBfix$_{opt}$. Validation on a separate simulation of a system present in the training set (gcUUCGgc) and on new systems not seen during training (CAAU and UUUU tetramers) shows an improved native population of the tetraloop and good agreement with NMR experiments for the tetramers when the new parameters are used. We then simulate folded RNAs (a kink-turn and the L1 stalk rRNA) that include hydrogen-bond types not sufficiently represented in the training set. This allows a final modification of the parameter set, named gHBfix21, which is suggested to be applicable to a wider range of RNA systems.
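The regularization penalties named in the abstract can be illustrated with a minimal reweighting sketch. It is not the authors' implementation: the function names, the array layout, and the assumption that the correction energy is linear in the parameters and expressed in units of kT are all hypothetical, and the paper's exact definitions (in particular of the relative Kish size) may differ. The sketch assigns each simulation frame a Boltzmann weight for a trial set of correction parameters, then computes the Kish effective sample size, the relative entropy with respect to uniform weights, and the L1 and L2 norms of the parameters.

```python
import numpy as np

def reweight(features, lam):
    """Normalized frame weights for trial correction parameters.

    features: (n_frames, n_params) array, assumed to hold per-frame
              correction terms in units of kT (hypothetical layout).
    lam:      (n_params,) array of trial correction coefficients.
    """
    log_w = -(features @ lam)      # Boltzmann factor of the correction energy
    log_w -= log_w.max()           # shift for numerical stability
    w = np.exp(log_w)
    return w / w.sum()

def kish_size(w):
    """Kish effective sample size of normalized weights (between 1 and n_frames)."""
    return 1.0 / np.sum(w ** 2)

def relative_entropy(w):
    """Kullback-Leibler divergence of w from uniform weights (zero when unchanged)."""
    nonzero = w > 0
    return float(np.sum(w[nonzero] * np.log(w[nonzero] * len(w))))

def l1_penalty(lam):
    """Sum of absolute parameter values."""
    return float(np.sum(np.abs(lam)))

def l2_penalty(lam):
    """Sum of squared parameter values."""
    return float(np.sum(lam ** 2))
```

With all parameters set to zero the weights are uniform, so the Kish size equals the number of frames and the relative entropy vanishes. Larger corrections concentrate the weight on fewer frames, which lowers the Kish size and raises the relative entropy; penalizing either quantity therefore keeps the corrected ensemble close to the one that was actually simulated, whereas the L1 and L2 penalties act directly on the parameter magnitudes.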
- Publication: arXiv e-prints
- Pub Date: January 2022
- DOI: 10.48550/arXiv.2201.04078
- arXiv: arXiv:2201.04078
- Bibcode: 2022arXiv220104078F
- Keywords: Physics - Chemical Physics; Physics - Biological Physics; Physics - Computational Physics; Quantitative Biology - Biomolecules
- E-Print: Supporting information included in ancillary files