Compressive Statistical Learning with Random Feature Moments
Abstract
We describe a general framework, compressive statistical learning, for resource-efficient large-scale learning: the training collection is compressed in one pass into a low-dimensional sketch (a vector of random empirical generalized moments) that captures the information relevant to the considered learning task. A near-minimizer of the risk is computed from the sketch through the solution of a nonlinear least squares problem. We investigate sufficient sketch sizes to control the generalization error of this procedure. The framework is illustrated on compressive PCA, compressive clustering, and compressive Gaussian mixture modeling with fixed known variance. The latter two are further developed in a companion paper.
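The sketching step described above can be illustrated with a minimal example. The sketch below assumes random Fourier features as the generalized moments (one common instantiation of random feature moments); the function name and dimensions are illustrative, not from the paper.

```python
import numpy as np

def compute_sketch(X, W):
    """Average random Fourier features over the dataset.

    One pass over the data, O(m) memory regardless of n.
    X: (n, d) data matrix; W: (m, d) random frequency matrix (illustrative choice).
    Returns the m-dimensional complex empirical sketch.
    """
    # exp(1j * <w_j, x_i>) is the j-th random feature of sample x_i;
    # averaging over samples gives an empirical generalized moment vector.
    return np.exp(1j * (X @ W.T)).mean(axis=0)

rng = np.random.default_rng(0)
n, d, m = 1000, 2, 10            # samples, ambient dimension, sketch size
X = rng.normal(size=(n, d))
W = rng.normal(size=(m, d))      # frequencies drawn i.i.d. Gaussian (an assumption)
z = compute_sketch(X, W)
print(z.shape)                   # m-dimensional sketch, here (10,)
```

Learning then amounts to fitting a parametric model whose theoretical sketch best matches `z` in least squares; that nonlinear fitting step is developed in the paper and its companion.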
 Publication:

arXiv e-prints
 Pub Date:
 June 2017
 arXiv:
 arXiv:1706.07180
 Bibcode:
 2017arXiv170607180G
 Keywords:

 Statistics - Machine Learning;
 Computer Science - Information Theory;
 Computer Science - Machine Learning;
 Mathematics - Statistics Theory
 E-Print:
 Main novelties between version 1 and version 2: improved concentration bounds; improved sketch sizes for compressive k-means and compressive GMM that now scale linearly with the ambient dimension. Main novelties of version 3: all content on compressive clustering and compressive GMM is now developed in the companion paper hal-02536818