Density-sensitive semisupervised inference

doi:10.48550/arXiv.1204.1685

Semisupervised methods use labeled data $(X_1,Y_1),\ldots,(X_n,Y_n)$ together with unlabeled data $X_{n+1},\ldots,X_N$ to make predictions. These methods rely on assumptions that link the marginal distribution $P_X$ of $X$ to the regression function $f(x)$. For example, it is common to assume that $f$ is very smooth over high-density regions of $P_X$. Many of these methods are ad hoc: they have been shown to work in specific examples but lack a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution $P_X$. Our model includes a parameter $\alpha$ that controls the strength of the semisupervised assumption. We then use the data to adapt to $\alpha$.
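The abstract does not spell out the metric, so the following is only a rough illustration of what a distance "sensitive to $P_X$" could look like, not the paper's estimator. It assumes a Gaussian kernel density estimate, a k-nearest-neighbour graph, and edge costs equal to Euclidean length divided by the estimated density raised to the power $\alpha$. The function names and the `bandwidth` and `k` parameters are hypothetical choices made for this sketch.

```python
import heapq
import numpy as np


def kde_density(X, bandwidth):
    """Gaussian kernel density estimate evaluated at each sample point."""
    n, d = X.shape
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return np.exp(-sq / (2.0 * bandwidth ** 2)).sum(axis=1) / (n * norm)


def density_sensitive_distances(X, source, alpha, bandwidth=0.3, k=5):
    """Shortest-path distances from X[source] on a k-nearest-neighbour graph.

    Assumption for this sketch: edge (i, j) costs
    ||X_i - X_j|| / (mean density of its endpoints) ** alpha,
    so paths through high-density regions are cheap when alpha > 0,
    and alpha = 0 gives plain path-length distances on the graph.
    Points are assumed distinct, so each point is its own nearest neighbour.
    """
    n = X.shape[0]
    dens = kde_density(X, bandwidth)
    eu = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    nbrs = np.argsort(eu, axis=1)[:, 1:k + 1]  # column 0 is the point itself

    # Symmetric adjacency list with density-weighted edge costs.
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in nbrs[i]:
            j = int(j)
            w = eu[i, j] / (0.5 * (dens[i] + dens[j])) ** alpha
            adj[i].append((j, w))
            adj[j].append((i, w))

    # Dijkstra's algorithm from the source node.
    dist = np.full(n, np.inf)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d_u + w < dist[v]:
                dist[v] = d_u + w
                heapq.heappush(heap, (dist[v], v))
    return dist
```

In this sketch, unlabeled points enter only through the density estimate and the graph, which is one way the marginal distribution can shape predictions: a nearest-neighbour or kernel regressor built on these distances would treat two points joined by a high-density path as close. Setting `alpha=0.0` removes the density weighting entirely.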

Publication:

arXiv e-prints

Pub Date:

April 2012

DOI:

10.48550/arXiv.1204.1685

arXiv:

arXiv:1204.1685

Bibcode:

2012arXiv1204.1685A

Keywords:

Mathematics - Statistics Theory;
Computer Science - Machine Learning;
Statistics - Machine Learning

E-Print:

Published at http://dx.doi.org/10.1214/13-AOS1092 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
