AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity
Abstract
We present an improved method for symbolic regression that seeks to fit data to formulas that are Pareto-optimal, in the sense of having the best accuracy for a given complexity. It improves on the previous state-of-the-art by typically being orders of magnitude more robust toward noise and bad data, and also by discovering many formulas that stumped previous methods. We develop a method for discovering generalized symmetries (arbitrary modularity in the computational graph of a formula) from gradient properties of a neural network fit. We use normalizing flows to generalize our symbolic regression method to probability distributions from which we only have samples, and employ statistical hypothesis testing to accelerate robust brute-force search.
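As an illustration of what "Pareto-optimal" means in the abstract, the sketch below filters a list of candidate formulas down to those with the best accuracy for a given complexity. It is a minimal example under stated assumptions, not the paper's implementation: the function name `pareto_frontier`, the tuple layout, and the complexity and loss values are all hypothetical and chosen only for demonstration.

```python
from typing import List, Tuple

# Hypothetical record layout: (formula text, complexity score, loss).
Candidate = Tuple[str, float, float]


def pareto_frontier(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no simpler-or-equal candidate matches or beats in loss."""
    # Sort by complexity first, then by loss, so that among equally complex
    # formulas the most accurate one is visited first.
    ordered = sorted(candidates, key=lambda c: (c[1], c[2]))
    frontier: List[Candidate] = []
    best_loss = float("inf")
    for formula, complexity, loss in ordered:
        # A more complex formula earns its place only by strictly lowering the loss.
        if loss < best_loss:
            frontier.append((formula, complexity, loss))
            best_loss = loss
    return frontier


# Made-up scores for illustration only.
candidates = [
    ("x", 1, 5.0),
    ("x + y", 3, 2.0),
    ("x * y", 3, 4.0),
    ("sin(x) + y", 6, 2.5),
    ("x + y + 0.1", 8, 1.0),
]
frontier = pareto_frontier(candidates)
```

Here `frontier` retains only the formulas that trade extra complexity for a strictly lower loss; the rest are dominated by a simpler or equally simple alternative.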
 Publication:

arXiv e-prints
 Pub Date:
 June 2020
 DOI:
 10.48550/arXiv.2006.10782
 arXiv:
 arXiv:2006.10782
 Bibcode:
 2020arXiv200610782U
 Keywords:

 Computer Science - Machine Learning;
 Computer Science - Artificial Intelligence;
 Computer Science - Information Theory;
 Physics - Computational Physics;
 Statistics - Machine Learning
 E-Print:
 17 pages, 6 figs, replaced to match accepted NeurIPS version