Gradient descent learning in and out of equilibrium
Abstract
Relations between the out-of-thermal-equilibrium dynamical process of online learning and thermally equilibrated offline learning are studied for potential gradient descent learning. Opper's approach to studying online Bayesian algorithms is applied to potential-based, or maximum-likelihood, learning. We look at the online learning algorithm that best approximates the offline algorithm in the sense of least Kullback-Leibler information loss. The closest online algorithm works by updating the weights along the gradient of an effective potential, which differs from the parent offline potential. A few examples are analyzed and the origin of the potential annealing is discussed.
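The contrast the abstract draws can be illustrated with a minimal numerical sketch. The potential below (a quadratic per-example potential), the learning rate, and the data are illustrative assumptions, not the paper's setup; the sketch only shows the basic distinction between online updates (one example per step) and offline updates (gradient of the summed potential over all data).

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])  # hypothetical target weights

def potential_grad(w, x, y):
    """Gradient of an illustrative per-example potential V = 0.5*(y - w.x)^2."""
    return -(y - w @ x) * x

# Synthetic data (assumed for the sketch)
X = rng.normal(size=(200, 2))
Y = X @ w_true + 0.1 * rng.normal(size=200)

eta = 0.05  # illustrative learning rate

# Online learning: stochastic update along the per-example potential gradient
w_on = np.zeros(2)
for x, y in zip(X, Y):
    w_on -= eta * potential_grad(w_on, x, y)

# Offline learning: repeated full-batch updates on the averaged potential
w_off = np.zeros(2)
for _ in range(200):
    g = sum(potential_grad(w_off, x, y) for x, y in zip(X, Y)) / len(X)
    w_off -= eta * g

print("online:", w_on, "offline:", w_off)
```

The paper's point is subtler than this sketch: the KL-closest online algorithm follows the gradient of an *effective* potential that differs from the offline one, rather than simply reusing the per-example offline potential as done here.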
 Publication:

Physical Review E
 Pub Date:
 June 2001
 DOI:
 10.1103/PhysRevE.63.061905
 arXiv:
 arXiv:cond-mat/0004047
 Bibcode:
 2001PhRvE..63f1905C
 Keywords:

 87.10.+e;
 84.35.+i;
 89.70.+c;
 05.50.+q;
 General theory and mathematical aspects;
 Neural networks;
 Information theory and communication theory;
 Lattice theory and statistics;
 Condensed Matter - Disordered Systems and Neural Networks;
 Condensed Matter - Statistical Mechanics
 E-Print:
 8 pages, submitted to the Journal of Physics A