Unsupervised Supervised Learning II: Training Margin Based Classifiers without Labels

Krishnakumar Balasubramanian; Pinar Donmez; Guy Lebanon

Machine Learning 2010-07-23 v2

Authors: Krishnakumar Balasubramanian, Pinar Donmez, Guy Lebanon

Abstract

Many popular linear classifiers, such as logistic regression, boosting, or SVM, are trained by optimizing a margin-based risk function. Traditionally, these risk functions are computed based on a labeled dataset. We develop a novel technique for estimating such risks using only unlabeled data and the marginal label distribution. We prove that the proposed risk estimator is consistent on high-dimensional datasets and demonstrate it on synthetic and real-world data. In particular, we show how the estimate is used for evaluating classifiers in transfer learning, and for training classifiers with no labeled data whatsoever.

Keywords

semi-supervised learning, multi-label classification, classification

Cite

@article{arxiv.1003.0470,
  title  = {Unsupervised Supervised Learning II: Training Margin Based Classifiers without Labels},
  author = {Krishnakumar Balasubramanian and Pinar Donmez and Guy Lebanon},
  journal= {arXiv preprint arXiv:1003.0470},
  year   = {2010}
}

Comments

22 pages, 43 figures
