Provably Faster Algorithms for Bilevel Optimization

doi:10.48550/arXiv.2106.04692

Provably Faster Algorithms for Bilevel Optimization

Bilevel optimization has been widely applied in many important machine learning applications such as hyperparameter optimization and meta-learning. Recently, several momentum-based algorithms have been proposed to solve bilevel optimization problems faster. However, those momentum-based algorithms do not achieve provably better computational complexity than $\mathcal{\widetilde O}(\epsilon^{-2})$ of the SGD-based algorithm. In this paper, we propose two new algorithms for bilevel optimization, where the first algorithm adopts momentum-based recursive iterations, and the second algorithm adopts recursive gradient estimations in nested loops to decrease the variance. We show that both algorithms achieve the complexity of $\mathcal{\widetilde O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by the order of magnitude. Our experiments validate our theoretical results and demonstrate the superior empirical performance of our algorithms in hyperparameter applications.

Publication:

arXiv e-prints

Pub Date:

June 2021

DOI:

10.48550/arXiv.2106.04692

arXiv:

arXiv:2106.04692

Bibcode:

2021arXiv210604692Y

Keywords:

Computer Science - Machine Learning;
Mathematics - Optimization and Control;
Statistics - Machine Learning

E-Print:

This paper is accepted in NeurIPS 2021

NASA/ADS

Provably Faster Algorithms for Bilevel Optimization

Abstract