Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models
Abstract
One of the most surprising and exciting discoveries in supervised learning was the benefit of overparameterization (i.e., training a very large model) in improving the optimization landscape of a problem, with minimal effect on statistical performance (i.e., generalization). In contrast, unsupervised settings have been under-explored, even though overparameterization was observed to be helpful as early as Dasgupta & Schulman (2007). We perform an empirical study of different aspects of overparameterization in unsupervised learning of latent variable models via synthetic and semi-synthetic experiments. We discuss benefits to different metrics of success (recovering the parameters of the ground-truth model, held-out log-likelihood), sensitivity to variations of the training algorithm, and behavior as the amount of overparameterization increases. We find that across a variety of models (noisy-OR networks, sparse coding, probabilistic context-free grammars) and training algorithms (variational inference, alternating minimization, expectation-maximization), overparameterization can significantly increase the number of ground-truth latent variables recovered.
- Publication: arXiv e-prints
- Pub Date: June 2019
- DOI: 10.48550/arXiv.1907.00030
- arXiv: arXiv:1907.00030
- Bibcode: 2019arXiv190700030B
- Keywords: Statistics - Machine Learning; Computer Science - Machine Learning
- E-Print: 22 pages, to appear at ICML 2020