Large Data and Zero Noise Limits of Graph-Based Semi-Supervised Learning Algorithms

doi:10.48550/arXiv.1805.09450

Scalings in which the graph Laplacian approaches a differential operator in the large graph limit are used to develop understanding of a number of algorithms for semi-supervised learning; in particular, extensions to this graph setting of the probit algorithm and of the level set and kriging methods are studied. Both optimization and Bayesian approaches are considered, based around a regularizing quadratic form found from an affine transformation of the Laplacian, raised to a possibly fractional exponent. Conditions on the parameters defining this quadratic form are identified under which well-defined limiting continuum analogues of the optimization and Bayesian semi-supervised learning problems may be found, thereby shedding light on the design of algorithms in the large graph setting. The large graph limits of the optimization formulations are tackled through $\Gamma$-convergence, using the recently introduced $TL^p$ metric. The small labelling noise limits of the Bayesian formulations are also identified, and contrasted with pre-existing harmonic function approaches to the problem.
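As a rough illustration of the regularizing quadratic form described above, the following sketch builds a graph Laplacian $L$, forms $A = (L + \tau^2 I)^\alpha$, and solves a kriging-type least-squares problem with labelling noise level $\gamma$. Everything specific here is an assumption made for illustration and is not taken from the abstract: the Gaussian edge weights, the unnormalized Laplacian, the omission of any large-data scaling of the weights, the form $L + \tau^2 I$ of the affine transformation, and the names `graph_laplacian`, `regularizer`, `kriging_estimate`, `tau`, `alpha`, `gamma`.

```python
import numpy as np


def graph_laplacian(points, eps):
    """Unnormalized graph Laplacian L = D - W with Gaussian weights (assumed kernel)."""
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    weights = np.exp(-sq_dists / (2.0 * eps ** 2))
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return np.diag(weights.sum(axis=1)) - weights


def regularizer(laplacian, tau, alpha):
    """A = (L + tau^2 I)^alpha, computed spectrally so alpha may be fractional."""
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    eigvals = np.clip(eigvals, 0.0, None)  # remove tiny negative round-off
    powered = (eigvals + tau ** 2) ** alpha
    return (eigvecs * powered) @ eigvecs.T


def kriging_estimate(A, labeled_idx, labels, gamma):
    """Minimize 0.5 * u^T A u + (1 / (2 gamma^2)) * sum_j (u_j - y_j)^2 over labeled j."""
    n = A.shape[0]
    B = np.zeros((n, n))
    B[labeled_idx, labeled_idx] = 1.0  # diagonal indicator of labeled nodes
    rhs = np.zeros(n)
    rhs[labeled_idx] = labels
    # First-order condition: (A + B / gamma^2) u = B y / gamma^2
    return np.linalg.solve(A + B / gamma ** 2, rhs / gamma ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated clusters with one labeled point in each (toy data).
    cluster_a = rng.normal(0.0, 0.2, size=(5, 2))
    cluster_b = rng.normal(0.0, 0.2, size=(5, 2)) + np.array([10.0, 0.0])
    points = np.vstack([cluster_a, cluster_b])

    A = regularizer(graph_laplacian(points, eps=1.0), tau=1.0, alpha=2.0)
    u = kriging_estimate(A, np.array([0, 5]), np.array([1.0, -1.0]), gamma=0.1)
    print(np.sign(u))  # predicted class per node
```

Decreasing `gamma` in this sketch pushes the estimate toward exact interpolation of the labels, which is the finite-graph counterpart of the small labelling noise regime mentioned in the abstract.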

Publication: arXiv e-prints

Pub Date: May 2018

DOI: 10.48550/arXiv.1805.09450

arXiv: arXiv:1805.09450

Bibcode: 2018arXiv180509450D

Keywords: Statistics - Machine Learning; Computer Science - Machine Learning; Mathematics - Analysis of PDEs; 62G20; 62C10; 62F15; 49J55
