Discriminative Bayesian filtering lends momentum to the stochastic Newton method for minimizing log-convex functions

Burkhart, Michael C.

To minimize the average of a set of log-convex functions, the stochastic Newton method iteratively updates its estimate using subsampled versions of the full objective's gradient and Hessian. We contextualize this optimization problem as sequential Bayesian inference on a latent state-space model with a discriminatively-specified observation process. Applying Bayesian filtering then yields a novel optimization algorithm that considers the entire history of gradients and Hessians when forming an update. We establish matrix-based conditions under which the effect of older observations diminishes over time, in a manner analogous to Polyak's heavy ball momentum. We illustrate various aspects of our approach with an example and review other relevant innovations for the stochastic Newton method.
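For orientation, the baseline stochastic Newton update that the abstract starts from can be sketched as follows. This is a minimal illustration of the plain subsampled method only, not the filtering-based algorithm the paper proposes. The data matrix `A`, the choice of terms f_i(x) = exp(a_i · x), the batch size, the small damping constant, and the function names are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: each term f_i(x) = exp(a_i . x) is log-convex,
# because log f_i(x) = a_i . x is linear (hence convex) in x.
A = rng.normal(size=(200, 2))


def objective(x):
    """Full objective: the average of the terms f_i(x) = exp(a_i . x)."""
    return np.exp(A @ x).mean()


def stochastic_newton(x0, n_steps=50, batch_size=50, damping=1e-8):
    """Plain stochastic Newton iteration on subsampled gradients and Hessians."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_steps):
        # Draw a random subsample of the terms (assumed sampling scheme).
        idx = rng.choice(len(A), size=batch_size, replace=False)
        a = A[idx]
        w = np.exp(a @ x)                      # values f_i(x) on the batch
        grad = (a * w[:, None]).mean(axis=0)   # subsampled gradient
        hess = (a.T * w) @ a / batch_size      # subsampled Hessian
        hess += damping * np.eye(len(x))       # assumed tiny ridge for numerical safety
        x -= np.linalg.solve(hess, grad)       # Newton step using only this batch
    return x


x_start = np.array([1.0, -1.0])
x_final = stochastic_newton(x_start)
initial_value = objective(x_start)
final_value = objective(x_final)
print(initial_value, final_value)
```

Each step in this sketch uses only the current batch, so earlier gradients and Hessians are forgotten. The approach described in the abstract instead forms each update from the entire history of gradients and Hessians.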

Publication:

arXiv e-prints

Pub Date:

April 2021

DOI:

10.48550/arXiv.2104.12949

arXiv:

arXiv:2104.12949

Bibcode:

2021arXiv210412949B

Keywords:

Statistics - Machine Learning;
Computer Science - Machine Learning;
Mathematics - Optimization and Control;
49M15;
90C15;
62M20 (Primary);
90C25 (Secondary)

E-Print:

to appear in: Optimization Letters (2022)

NASA/ADS
