On Linear Separability under Linear Compression with Applications to Hard Support Vector Machine
Abstract
This paper investigates the theoretical problem of maintaining linear separability of the data-generating distribution under linear compression. While it has long been known that linear separability may be maintained by linear transformations that approximately preserve the inner products between the domain points, the degree to which the inner products must be preserved in order to maintain linear separability was unknown. In this paper, we show that linear separability is maintained as long as the distortion of the inner products is smaller than the squared margin of the original data-generating distribution. The proof is mainly based on the geometry of the hard support vector machine (SVM), extended from the finite set of training examples to the (possibly) infinite domain of the data-generating distribution. As applications, we derive bounds on (i) the compression length of random sub-Gaussian matrices and (ii) the generalization error for compressive learning with hard-SVM.
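A short derivation may help show why a squared margin is the natural threshold. It is only a sketch for a finite training set and a bias-free (homogeneous) hard-SVM, and is not necessarily the argument used in the paper. The symbols are assumptions introduced here for illustration: training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, hard-SVM solution $w^*$, dual coefficients $\alpha_i$, margin $\gamma = 1/\|w^*\|$, compression matrix $A$, and distortion $\varepsilon$ defined as the largest change in any pairwise inner product. The paper's own normalization and its distribution-level definitions may differ.

```latex
% Sketch under stated assumptions: finite sample, homogeneous hard-SVM.
% Hard-SVM: w^* = argmin ||w||^2  subject to  y_i <w, x_i> >= 1 for all i.
% KKT conditions: w^* = \sum_i \alpha_i y_i x_i,  \alpha_i >= 0,
%                 \alpha_i ( y_i <w^*, x_i> - 1 ) = 0.
\begin{align}
  \|w^*\|^2
    &= \sum_i \alpha_i y_i \langle w^*, x_i \rangle
     = \sum_i \alpha_i
     = \frac{1}{\gamma^2},
  \\
  % Assumed distortion bound on the compressed inner products:
  \bigl| \langle A x_i, A x_j \rangle - \langle x_i, x_j \rangle \bigr|
    &\le \varepsilon
    \quad \text{for all } i, j,
  \\
  % Candidate separator in the compressed space: A w^* = \sum_i \alpha_i y_i A x_i.
  y_j \langle A w^*, A x_j \rangle
    &= \sum_i \alpha_i \, y_i y_j \langle A x_i, A x_j \rangle
  \\
    &\ge \sum_i \alpha_i \, y_i y_j \langle x_i, x_j \rangle
       - \varepsilon \sum_i \alpha_i
  \\
    &= y_j \langle w^*, x_j \rangle - \frac{\varepsilon}{\gamma^2}
     \;\ge\; 1 - \frac{\varepsilon}{\gamma^2}.
\end{align}
```

Under these assumptions, every compressed training point stays on the correct side of the hyperplane defined by $A w^*$ whenever $\varepsilon < \gamma^2$, which matches the form of the condition stated in the abstract.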
- Publication: arXiv e-prints
- Pub Date: February 2022
- arXiv: arXiv:2202.01118
- Bibcode: 2022arXiv220201118M
- Keywords: Computer Science - Machine Learning; Mathematics - Statistics Theory
- E-Print: 12 pages