On the Consistency of Graph-based Bayesian Learning and the Scalability of Sampling Algorithms
Abstract
A popular approach to semi-supervised learning proceeds by endowing the input data with a graph structure in order to extract geometric information and incorporate it into a Bayesian framework. We introduce new theory that gives appropriate scalings of graph parameters that provably lead to a well-defined limiting posterior as the size of the unlabeled data set grows. Furthermore, we show that these consistency results have profound algorithmic implications. When consistency holds, carefully designed graph-based Markov chain Monte Carlo algorithms are proved to have a uniform spectral gap, independent of the number of unlabeled inputs. Several numerical experiments corroborate both the statistical consistency and the algorithmic scalability established by the theory.
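As a rough illustration of the pipeline the abstract describes, the sketch below builds a graph Laplacian on the inputs, uses it to define a Gaussian prior over functions on the nodes, and samples the posterior with a preconditioned Crank-Nicolson (pCN) proposal. The abstract does not name a specific graph construction, prior, likelihood, or sampler, so everything here is an assumption made for illustration: the ε-neighborhood graph, the prior covariance (L + τ²I)^(-α), the Gaussian noise model on labeled nodes, and the names `graph_laplacian`, `prior_sqrt`, and `pcn`. In this sketch `eps` plays the role of a graph parameter whose scaling with the number of unlabeled inputs would need to be chosen with care; no particular scaling is implied.

```python
import numpy as np

def graph_laplacian(X, eps):
    """Unnormalized Laplacian of an eps-neighborhood graph on the rows of X
    (illustrative construction, not taken from the abstract)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    W = (d2 < eps ** 2).astype(float)                    # 0/1 edge weights
    np.fill_diagonal(W, 0.0)                             # no self-loops
    return np.diag(W.sum(1)) - W

def prior_sqrt(L, tau=1.0, alpha=2.0):
    """Matrix A with A @ A.T = (L + tau^2 I)^(-alpha), an assumed prior covariance."""
    lam, V = np.linalg.eigh(L)
    return V * (lam + tau ** 2) ** (-alpha / 2)          # scale eigenvector columns

def pcn(C_half, labeled, y, gamma=0.1, beta=0.3, n_steps=2000, seed=0):
    """pCN sampler for a posterior with the Gaussian prior above and
    Gaussian observation noise (std gamma) on the labeled nodes."""
    rng = np.random.default_rng(seed)
    n = C_half.shape[0]

    def phi(u):
        # Negative log-likelihood of the labels under the assumed noise model.
        return 0.5 * np.sum((u[labeled] - y) ** 2) / gamma ** 2

    u = C_half @ rng.standard_normal(n)                  # start from a prior draw
    phi_u = phi(u)
    running_sum = np.zeros(n)
    accepted = 0
    for _ in range(n_steps):
        xi = C_half @ rng.standard_normal(n)             # fresh prior draw
        v = np.sqrt(1.0 - beta ** 2) * u + beta * xi     # prior-preserving proposal
        phi_v = phi(v)
        if np.log(rng.uniform()) < phi_u - phi_v:        # accept w.p. min(1, e^{phi_u - phi_v})
            u, phi_u = v, phi_v
            accepted += 1
        running_sum += u
    return running_sum / n_steps, accepted / n_steps

# Usage: two point clouds with one labeled node in each (synthetic data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, (30, 2)), rng.normal(2.0, 0.5, (30, 2))])
L = graph_laplacian(X, eps=1.0)
C_half = prior_sqrt(L)
post_mean, acc_rate = pcn(C_half, labeled=np.array([0, 30]), y=np.array([-1.0, 1.0]))
```

The pCN proposal is used here because its acceptance ratio depends only on the likelihood and not on the prior, which is one common way to obtain samplers whose behavior does not degrade as the number of nodes grows; whether that holds in a given setting depends on the graph and prior scalings.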
- Publication: arXiv e-prints
- Pub Date: October 2017
- DOI: 10.48550/arXiv.1710.07702
- arXiv: arXiv:1710.07702
- Bibcode: 2017arXiv171007702G
- Keywords: Statistics - Machine Learning; Computer Science - Machine Learning; Mathematics - Probability; Statistics - Computation