Variational Gram Functions: Convex Analysis and Optimization
Abstract
We propose a new class of convex penalty functions, called \emph{variational Gram functions} (VGFs), that can promote pairwise relations, such as orthogonality, among a set of vectors in a vector space. These functions can serve as regularizers in convex optimization problems arising from hierarchical classification, multitask learning, and estimating vectors with disjoint supports, among other applications. We study convexity for VGFs, and give efficient characterizations for their convex conjugates, subdifferentials, and proximal operators. We discuss efficient optimization algorithms for regularized loss minimization problems where the loss admits a common, yet simple, variational representation and the regularizer is a VGF. These algorithms enjoy a simple kernel trick, an efficient line search, as well as computational advantages over first-order methods based on the subdifferential or proximal maps. We also establish a general representer theorem for such learning problems. Lastly, numerical experiments on a hierarchical classification problem are presented to demonstrate the effectiveness of VGFs and the associated optimization algorithms.
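As a rough illustration of the kind of penalty described above, the sketch below evaluates one simple Gram-based function of a set of vectors. It assumes (this is not stated in the abstract) that a VGF has the form max over M in a compact set of symmetric matrices of sum_{i,j} M_ij <x_i, x_j>, and that the set is the box {M : |M_ij| <= W_ij}, in which case the maximum reduces to sum_{i,j} W_ij |<x_i, x_j>|. The names `gram`, `vgf_box`, and `W` are hypothetical, and the sketch only evaluates the function; it makes no claim about convexity for this particular choice of `W`.

```python
def gram(vectors):
    """Return the Gram matrix of pairwise inner products <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


def vgf_box(vectors, W):
    """Evaluate sum_{i,j} W[i][j] * |<x_i, x_j>|.

    Assumption: this is the closed form of the maximum of
    sum_{i,j} M_ij <x_i, x_j> over symmetric M with |M_ij| <= W[i][j],
    where W is symmetric with nonnegative entries.
    """
    G = gram(vectors)
    m = len(vectors)
    return sum(W[i][j] * abs(G[i][j]) for i in range(m) for j in range(m))


# Hypothetical weights: penalize only the inner product between vectors 0 and 1.
W = [[0.0, 1.0],
     [1.0, 0.0]]

orthogonal = [[1.0, 0.0], [0.0, 2.0]]   # <x_0, x_1> = 0
correlated = [[1.0, 0.0], [1.0, 1.0]]   # <x_0, x_1> = 1

penalty_orthogonal = vgf_box(orthogonal, W)
penalty_correlated = vgf_box(correlated, W)
```

Under these assumptions the penalty vanishes exactly when the weighted pairs of vectors are orthogonal and grows with the magnitude of their inner products, which is the sense in which such a function promotes orthogonality.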
- Publication: arXiv e-prints
- Pub Date: July 2015
- DOI: 10.48550/arXiv.1507.04734
- arXiv: arXiv:1507.04734
- Bibcode: 2015arXiv150704734J
- Keywords: Mathematics - Optimization and Control; Computer Science - Machine Learning; Statistics - Machine Learning
- E-Print: 26 pages, 5 figures; additional revisions to text; under revision in SIOPT. An earlier version of this work has appeared as Chapter 3 in reference [21].