Group-sparse Embeddings in Collective Matrix Factorization
Abstract
Collective matrix factorization (CMF) is a technique for simultaneously learning low-rank representations based on a collection of matrices with shared entities. A typical example is the joint modeling of user-item, item-property, and user-feature matrices in a recommender system. The key idea in CMF is that the embeddings are shared across the matrices, which enables transferring information between them. The existing solutions, however, break down when the individual matrices have low-rank structure not shared with others. In this work we present a novel CMF solution that allows each of the matrices to have a separate low-rank structure that is independent of the other matrices, as well as structures that are shared only by a subset of them. We compare MAP and variational Bayesian solutions based on alternating optimization algorithms and show that the model automatically infers the nature of each factor using group-wise sparsity. Our approach supports continuous, binary and count observations in a principled way and is efficient for sparse matrices involving missing data. We illustrate the solution on a number of examples, focusing in particular on an interesting use case of augmented multi-view learning.
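To make the idea of shared embeddings concrete, the following is a minimal sketch of plain CMF fitted by alternating least squares. It is not the method of the paper: it assumes fully observed, real-valued matrices, a squared loss with ordinary L2 regularization, and no group-sparse prior, so it cannot separate private from shared factors. The function name `cmf_als` and all parameter names are hypothetical and chosen only for illustration.

```python
import numpy as np


def cmf_als(X_ui, X_uf, rank=3, reg=0.1, n_iters=50, seed=0):
    """Jointly factorize a user-item and a user-feature matrix.

    Illustrative assumption: X_ui ~ U @ V.T and X_uf ~ U @ W.T, where the
    user embedding U is shared by both matrices. Both inputs are dense and
    fully observed; missing data is not handled in this sketch.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = X_ui.shape
    n_feats = X_uf.shape[1]

    # One embedding matrix per entity set (users, items, features).
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    W = rng.normal(scale=0.1, size=(n_feats, rank))
    ridge = reg * np.eye(rank)

    for _ in range(n_iters):
        # Users occur in both matrices, so their update pools evidence
        # from the item side and the feature side.
        A = V.T @ V + W.T @ W + ridge
        B = X_ui @ V + X_uf @ W
        U = np.linalg.solve(A, B.T).T

        # Items and features each occur in one matrix only.
        G = U.T @ U + ridge
        V = np.linalg.solve(G, U.T @ X_ui).T
        W = np.linalg.solve(G, U.T @ X_uf).T

    return U, V, W
```

Because the user update solves a single ridge regression against both matrices, information in the user-feature matrix influences the reconstruction of the user-item matrix, which is the transfer effect described above.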
 Publication:

arXiv e-prints
 Pub Date:
 December 2013
 DOI:
 10.48550/arXiv.1312.5921
 arXiv:
 arXiv:1312.5921
 Bibcode:
 2013arXiv1312.5921K
 Keywords:

 Statistics - Machine Learning;
 Computer Science - Machine Learning
 E-Print:
 9+2 pages, submitted to the International Conference on Learning Representations 2014. This version fixes minor typographic mistakes, adds one new paragraph on computational efficiency, and describes the algorithm in more detail in the Supplementary material