P-values for classification
Abstract
Let $(X, Y)$ be a random variable consisting of an observed feature vector $X \in \mathcal{X}$ and an unobserved class label $Y \in \{1, 2, \ldots, L\}$ with unknown joint distribution. In addition, let $\mathcal{D}$ be a training data set consisting of $n$ completely observed independent copies of $(X, Y)$. Usual classification procedures provide point predictors (classifiers) $\widehat{Y}(X, \mathcal{D})$ of $Y$ or estimate the conditional distribution of $Y$ given $X$. In order to quantify the certainty of classifying $X$, we propose to construct for each $\theta = 1, 2, \ldots, L$ a p-value $\pi_{\theta}(X, \mathcal{D})$ for the null hypothesis that $Y = \theta$, treating $Y$ temporarily as a fixed parameter. In other words, the point predictor $\widehat{Y}(X, \mathcal{D})$ is replaced with a prediction region for $Y$ with a certain confidence. We argue that (i) this approach is advantageous over traditional approaches and (ii) any reasonable classifier can be modified to yield nonparametric p-values. We discuss issues such as optimality, single use and multiple use validity, as well as computational and graphical aspects.
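The idea of testing each candidate label in turn can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `atypicality`, `class_p_values`, and `prediction_region`, the parameter `k`, and the nearest-neighbour score are all assumptions chosen for illustration. For each candidate label, the new observation is temporarily pooled with the training data under that label, every member of that class receives a score, and the p-value is the fraction of class members whose score is at least as large as that of the new observation.

```python
import math

def atypicality(i, pool_X, pool_y, theta, k):
    """Hypothetical score: share of the k nearest neighbours of point i
    (excluding i itself) whose label differs from theta."""
    others = [j for j in range(len(pool_X)) if j != i]
    others.sort(key=lambda j: math.dist(pool_X[i], pool_X[j]))
    nearest = others[:k]
    return sum(1 for j in nearest if pool_y[j] != theta) / len(nearest)

def class_p_values(x_new, X_train, y_train, labels, k=3):
    """Return {theta: p-value} for the null hypotheses 'label of x_new is theta'."""
    p_values = {}
    for theta in labels:
        # Pool the training data with (x_new, theta), as if theta were the true label.
        pool_X = list(X_train) + [x_new]
        pool_y = list(y_train) + [theta]
        new_idx = len(pool_X) - 1
        members = [i for i, y in enumerate(pool_y) if y == theta]
        scores = {i: atypicality(i, pool_X, pool_y, theta, k) for i in members}
        # Rank of the new point among all class members (ties counted against it).
        at_least = sum(1 for i in members if scores[i] >= scores[new_idx])
        p_values[theta] = at_least / len(members)
    return p_values

def prediction_region(p_values, alpha):
    """Labels that are not rejected at level alpha."""
    return sorted(theta for theta, p in p_values.items() if p > alpha)

# Toy data (made up for illustration): two well-separated classes on the line.
X_train = [[0.0], [0.1], [0.2], [0.3], [10.0], [10.1], [10.2], [10.3]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
p = class_p_values([0.15], X_train, y_train, labels=[0, 1], k=3)
print(p, prediction_region(p, alpha=0.25))
```

Because the score treats the new observation like any other member of the class, a small p-value for a label indicates that the observation looks unusual among points with that label. The prediction region may contain one label, several labels, or none, which is how the approach expresses classification certainty.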
Cite
@article{arxiv.0801.2934,
title = {P-values for classification},
author = {Lutz Duembgen and Bernd-Wolfgang Igl and Axel Munk},
journal = {arXiv preprint arXiv:0801.2934},
year = {2008}
}
Comments
Published in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/08-EJS245