A Bayesian Evaluation Framework for Subjectively Annotated Visual Recognition Tasks

Computer Vision and Pattern Recognition; Machine Learning. 2021-12-23, v2

Abstract

An interesting development in automatic visual recognition has been the emergence of tasks where it is not possible to assign objective labels to images, yet still feasible to collect annotations that reflect human judgements about them. Machine learning-based predictors for these tasks rely on supervised training that models the behavior of the annotators, i.e., what the average person's judgement of an image would be. A key open question for this type of work, especially for applications where inconsistency with human behavior can lead to ethical lapses, is how to evaluate the epistemic uncertainty of trained predictors, i.e., the uncertainty that comes from the predictor's model. We propose a Bayesian framework for evaluating black-box predictors in this regime that is agnostic to the predictor's internal structure. The framework specifies how to estimate the predictor's epistemic uncertainty with respect to human labels by approximating a conditional distribution and producing credible intervals for the predictions and their measures of performance. The framework is successfully applied to four image classification tasks that use subjective human judgements: facial beauty assessment, social attribute assignment, apparent age estimation, and ambiguous scene labeling.

Cite

@article{arxiv.2007.06711,
  title  = {A Bayesian Evaluation Framework for Subjectively Annotated Visual Recognition Tasks},
  author = {Derek S. Prijatelj and Mel McCurrie and Walter J. Scheirer},
  journal= {arXiv preprint arXiv:2007.06711},
  year   = {2021}
}

Comments

21 pages. 6 figures. 2 tables. Supplementary Material as Appendix with 28 pages, 6 figures, 2 tables. First major revision for journal Pattern Recognition. Code to be included after publication at https://github.com/prijatelj/bayesian_eval_ground_truth-free
