Cross-validation

Sylvain Arlot

Statistics Theory | 2017-03-10 | v1 | Machine Learning, Statistics Theory

Authors: Sylvain Arlot

Abstract

This text is a survey of cross-validation. We define all the classical cross-validation procedures and study their properties for two different goals: estimating the risk of a given estimator, and selecting the best estimator among a given family. For the risk estimation problem, we compute the bias (which can also be corrected) and the variance of cross-validation methods. For estimator selection, we first provide a first-order analysis (based on expectations). We then explain how to take second-order terms into account (from variance computations, and by taking into account the usefulness of overpenalization). In the end, this allows us to provide some guidelines for choosing the best cross-validation method for a given learning problem.

Keywords

model selection, statistical inference, covariance estimation

Cite

@article{arxiv.1703.03167,
  title  = {Cross-validation},
  author = {Sylvain Arlot},
  journal= {arXiv preprint arXiv:1703.03167},
  year   = {2017}
}

Comments

in French
