Accelerated Dual Learning by Homotopic Initialization

Gradient descent and coordinate descent are well understood in terms of their asymptotic behavior, but less so in a transient regime often used for approximations in machine learning. We investigate how proper initialization can have a profound effect on finding near-optimal solutions quickly. We show that a certain property of a data set, namely the boundedness of the correlations between eigenfeatures and the response variable, can lead to faster initial progress than expected by commonplace analysis. Convex optimization problems can tacitly benefit from that, but this automatism does not apply to their dual formulation. We analyze this phenomenon and devise provably good initialization strategies for dual optimization as well as heuristics for the non-convex case, relevant for deep learning. We find our predictions and methods to be experimentally well-supported.

Publication: arXiv e-prints

Pub Date: June 2017

DOI: 10.48550/arXiv.1706.03958

arXiv: arXiv:1706.03958

Bibcode: 2017arXiv170603958D

Keywords: Computer Science - Machine Learning
