Does quantification without adjustments work?

Dirk Tasche

Machine Learning | 2016-08-15 | v2 | Machine Learning, Statistics Theory

Abstract

Classification is the task of predicting the class labels of objects based on the observation of their features. In contrast, quantification has been defined as the task of determining the prevalences of the different sorts of class labels in a target dataset. The simplest approach to quantification is Classify & Count, where a classifier is optimised for classification on a training set and applied to the target dataset for the prediction of class labels. In the case of binary quantification, the number of predicted positive labels is then used as an estimate of the prevalence of the positive class in the target dataset. Since the performance of Classify & Count for quantification is known to be inferior, its results are typically subject to adjustments. However, some researchers have recently suggested that Classify & Count might actually work without adjustments if it is based on a classifier that was specifically trained for quantification. We discuss the theoretical foundation for this claim and explore its potential and limitations with a numerical example based on the binormal model with equal variances. In order to identify an optimal quantifier in the binormal setting, we introduce the concept of local Bayes optimality. As a side remark, we present a complete proof of a theorem by Ye et al. (2012).
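
To make the Classify & Count idea concrete, the following minimal Python sketch applies it in an equal-variance binormal setting. It is an illustration only and not the paper's numerical example: the means, the standard deviation, the prevalences, and all function names are assumptions chosen for this sketch. The classifier is the minimum-error threshold rule for the assumed training prevalence, and the estimate is the fraction of target observations it labels positive.

```python
import math
import random


def bayes_threshold(mu_neg, mu_pos, sigma, prevalence):
    """Minimum-error threshold for the equal-variance binormal model.

    Assumes mu_pos > mu_neg; an observation x is labelled positive if x > threshold.
    """
    midpoint = 0.5 * (mu_neg + mu_pos)
    shift = sigma ** 2 / (mu_pos - mu_neg) * math.log((1.0 - prevalence) / prevalence)
    return midpoint + shift


def sample_features(n, prevalence, mu_neg, mu_pos, sigma, rng):
    """Draw n feature values from the two-component Gaussian mixture."""
    return [
        rng.gauss(mu_pos if rng.random() < prevalence else mu_neg, sigma)
        for _ in range(n)
    ]


def classify_and_count(features, threshold):
    """Unadjusted Classify & Count: fraction of observations labelled positive."""
    return sum(x > threshold for x in features) / len(features)


def expected_classify_and_count(threshold, prevalence, mu_neg, mu_pos, sigma):
    """Large-sample value of the Classify & Count estimate for a given true prevalence."""
    def upper_tail(mu):
        # P(N(mu, sigma^2) > threshold)
        return 0.5 * math.erfc((threshold - mu) / (sigma * math.sqrt(2.0)))
    return prevalence * upper_tail(mu_pos) + (1.0 - prevalence) * upper_tail(mu_neg)


# Hypothetical parameters, chosen only for illustration.
MU_NEG, MU_POS, SIGMA = 0.0, 2.0, 1.0
TRAIN_PREVALENCE = 0.5

rng = random.Random(0)
threshold = bayes_threshold(MU_NEG, MU_POS, SIGMA, TRAIN_PREVALENCE)

for target_prevalence in (0.5, 0.2):
    target = sample_features(20000, target_prevalence, MU_NEG, MU_POS, SIGMA, rng)
    estimate = classify_and_count(target, threshold)
    limit = expected_classify_and_count(threshold, target_prevalence, MU_NEG, MU_POS, SIGMA)
    print(f"true={target_prevalence:.2f} estimate={estimate:.3f} large-sample value={limit:.3f}")
```

Under these assumed parameters the estimate is close to the truth when the target prevalence equals the training prevalence, but it settles near 0.295 when the true target prevalence is 0.2. This bias under a change in prevalence is the kind of shortfall that adjustments to Classify & Count are meant to address.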

Keywords

semi-supervised learning; statistical learning theory; statistical inference and model selection

Cite

@article{arxiv.1602.08780,
  title  = {Does quantification without adjustments work?},
  author = {Dirk Tasche},
  journal= {arXiv preprint arXiv:1602.08780},
  year   = {2016}
}

Comments

20 pages, 2 figures, major update
