Density-sensitive semisupervised inference

Statistics Theory; Machine Learning | 2013-05-27 | v2

Abstract

Semisupervised methods are techniques for using labeled data $(X_1,Y_1),\ldots,(X_n,Y_n)$ together with unlabeled data $X_{n+1},\ldots,X_N$ to make predictions. These methods invoke some assumptions that link the marginal distribution $P_X$ of $X$ to the regression function $f(x)$. For example, it is common to assume that $f$ is very smooth over high density regions of $P_X$. Many of the methods are ad-hoc and have been shown to work in specific examples but are lacking a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution $P_X$. Our model includes a parameter $\alpha$ that controls the strength of the semisupervised assumption. We then use the data to adapt to $\alpha$.

Cite

@article{arxiv.1204.1685,
  title  = {Density-sensitive semisupervised inference},
  author = {Martin Azizyan and Aarti Singh and Larry Wasserman},
  journal= {arXiv preprint arXiv:1204.1685},
  year   = {2013}
}

Comments

Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org), available at http://dx.doi.org/10.1214/13-AOS1092
