Complete Dictionary Recovery over the Sphere II: Recovery by Riemannian Trust-region Method
Abstract
We consider the problem of recovering a complete (i.e., square and invertible) matrix $\mathbf A_0$ from $\mathbf Y \in \mathbb{R}^{n \times p}$ with $\mathbf Y = \mathbf A_0 \mathbf X_0$, provided $\mathbf X_0$ is sufficiently sparse. This recovery problem is central to the theoretical understanding of dictionary learning, which seeks a sparse representation for a collection of input signals and finds numerous applications in modern signal processing and machine learning. We give the first efficient algorithm that provably recovers $\mathbf A_0$ when $\mathbf X_0$ has $O(n)$ nonzeros per column, under a suitable probability model for $\mathbf X_0$. Our algorithmic pipeline centers around solving a certain nonconvex optimization problem with a spherical constraint, and hence is naturally phrased in the language of manifold optimization. In a companion paper (arXiv:1511.03607), we have shown that, with high probability, our nonconvex formulation has no "spurious" local minimizers and that around any saddle point the objective function has a direction of negative curvature. In this paper, we take advantage of this particular geometric structure and describe a Riemannian trust-region algorithm that provably converges to a local minimizer from arbitrary initializations. Such minimizers give excellent approximations to rows of $\mathbf X_0$. The rows are then recovered by linear programming rounding and deflation.
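As a rough illustration of optimization under a spherical constraint, the following Python sketch generates data of the form $\mathbf Y = \mathbf A_0 \mathbf X_0$ and runs a descent method that stays on the unit sphere. Several pieces are assumptions not stated in the abstract: the Bernoulli-Gaussian model for $\mathbf X_0$, the smooth sparsity surrogate $\mu \log\cosh(\mathbf q^\top \mathbf y / \mu)$ averaged over the columns of $\mathbf Y$, and all function names and parameter values. It also substitutes plain Riemannian gradient descent with backtracking for the trust-region method described above, so it shows only the tangent-space projection and the retraction, not the paper's algorithm or its guarantees.

```python
import math
import random


def make_data(n, p, theta, seed=0):
    """Hypothetical generator: columns of Y = A0 X0, X0 Bernoulli(theta)-Gaussian."""
    rng = random.Random(seed)
    A0 = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    cols = []
    for _ in range(p):
        x = [rng.gauss(0.0, 1.0) if rng.random() < theta else 0.0 for _ in range(n)]
        cols.append([sum(A0[i][k] * x[k] for k in range(n)) for i in range(n)])
    return cols


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    nrm = math.sqrt(dot(v, v))
    return [a / nrm for a in v]


def objective(q, cols, mu):
    """Assumed surrogate: mean over columns of mu * log cosh(q.y / mu)."""
    total = 0.0
    for y in cols:
        z = abs(dot(q, y)) / mu
        # Overflow-safe log cosh: |z| + log(1 + exp(-2|z|)) - log 2.
        total += mu * (z + math.log1p(math.exp(-2.0 * z)) - math.log(2.0))
    return total / len(cols)


def riemannian_grad(q, cols, mu):
    """Euclidean gradient projected onto the tangent space of the sphere at q."""
    n = len(q)
    g = [0.0] * n
    for y in cols:
        w = math.tanh(dot(q, y) / mu) / len(cols)
        for i in range(n):
            g[i] += w * y[i]
    c = dot(q, g)
    return [gi - c * qi for gi, qi in zip(g, q)]


def descend(q, cols, mu=0.1, iters=50):
    """Riemannian gradient descent with backtracking (not a trust-region method)."""
    q = normalize(q)
    for _ in range(iters):
        rg = riemannian_grad(q, cols, mu)
        sq = dot(rg, rg)
        if sq < 1e-16:
            break
        f0, t = objective(q, cols, mu), 1.0
        while t > 1e-10:
            # Retraction: step in the tangent direction, then renormalize.
            cand = normalize([qi - t * gi for qi, gi in zip(q, rg)])
            if objective(cand, cols, mu) <= f0 - 1e-4 * t * sq:
                q = cand
                break
            t *= 0.5
        else:
            break  # no acceptable step size found
    return q
```

A call such as `descend(normalize([1.0, 0.5, -0.25, 0.1]), make_data(4, 60, 0.3))` returns a unit vector whose surrogate value is no larger than at the starting point. Whether that vector approximates a row of $\mathbf X_0$ depends on the conditions established in the paper, which this sketch does not check.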
- Publication: arXiv e-prints
- Pub Date: November 2015
- DOI: 10.48550/arXiv.1511.04777
- arXiv: arXiv:1511.04777
- Bibcode: 2015arXiv151104777S
- Keywords: Computer Science - Information Theory; Computer Science - Computer Vision and Pattern Recognition; Mathematics - Optimization and Control; Statistics - Machine Learning
- E-Print: The second of two papers based on the report arXiv:1504.06785. Accepted by IEEE Transactions on Information Theory