Rethinking Generalisation

Machine Learning 2020-03-27 v2

Abstract

This paper presents a new approach to computing generalisation performance that assumes the distribution of risks, $\rho(r)$, for a learning scenario is known. From this distribution, the expected error of a learning machine using empirical risk minimisation is computed for both classification and regression problems. A critical quantity in determining the generalisation performance is the power-law behaviour of $\rho(r)$ around its minimum value, a quantity we call attunement. The distribution $\rho(r)$ is computed for the case of all Boolean functions and for the perceptron used in two different problem settings. Initially, a simplified analysis is presented in which the losses are assumed to be independent. A more accurate analysis is then carried out that takes into account chance correlations in the training set, which leads to corrections to the typical behaviour that is observed.
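
To make the notion of a distribution of risks concrete, the following minimal sketch enumerates every Boolean function on a small number of inputs and counts how many functions attain each risk against a fixed target. It is an illustration only, not the authors' code: the function name `risk_distribution`, the constant-zero target, the 0-1 loss, and the uniform distribution over inputs are all assumptions made here for the example.

```python
from collections import Counter
from itertools import product


def risk_distribution(n_inputs, target=None):
    """Count Boolean functions on n_inputs bits by their risk against a target.

    Assumptions (not taken from the paper): inputs are uniformly distributed,
    the loss is 0-1, and the default target is the constant-zero function.
    """
    n_points = 2 ** n_inputs
    if target is None:
        target = (0,) * n_points  # hypothetical target truth table
    counts = Counter()
    # Each Boolean function is identified with its truth table of length 2**n_inputs.
    for truth_table in product((0, 1), repeat=n_points):
        errors = sum(a != b for a, b in zip(truth_table, target))
        counts[errors / n_points] += 1  # risk = fraction of inputs misclassified
    return dict(sorted(counts.items()))


rho = risk_distribution(3)
for risk, count in rho.items():
    print(f"risk {risk:.3f}: {count} functions")
```

Under these assumptions the counts are binomial coefficients, so exactly one function attains the minimum risk and the mass concentrates around a risk of one half. Exhaustive enumeration is only feasible for very small input sizes, since the number of Boolean functions grows doubly exponentially.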

Cite

@article{arxiv.1911.04301,
  title  = {Rethinking Generalisation},
  author = {Antonia Marcu and Adam Prügel-Bennett},
  journal= {arXiv preprint arXiv:1911.04301},
  year   = {2020}
}