English
Related papers

Related papers: Estimating Subagging by cross-validation

200 papers

In this article, we derive concentration inequalities for the cross-validation estimate of the generalization error for empirical risk minimizers. In the general setting, we prove sanity-check bounds in the spirit of \cite{KR99}…

Machine Learning · Statistics 2010-11-02 Matthieu Cornec

In this article, we derive concentration inequalities for the cross-validation estimate of the generalization error for stable predictors in the context of risk assessment. The notion of stability has been first introduced by \cite{DEWA79}…

Machine Learning · Statistics 2010-11-24 Matthieu Cornec

Cross-validation is a widely-used technique to estimate prediction error, but its behavior is complex and not fully understood. Ideally, one would like to think that cross-validation estimates the prediction error for the model at hand, fit…

Methodology · Statistics 2024-03-12 Stephen Bates , Trevor Hastie , Robert Tibshirani

Tuning parameters in supervised learning problems are often estimated by cross-validation. The minimum value of the cross-validation error can be biased downward as an estimate of the test error at that same value of the tuning parameter.…

Applications · Statistics 2009-08-21 Ryan J. Tibshirani , Robert Tibshirani

Cross-validation is a common method for estimating the predictive performance of machine learning models. In a data-scarce regime, where one typically wishes to maximize the number of instances used for training the model, an approach…

Methodology · Statistics 2025-03-25 George I. Austin , Itsik Pe'er , Tal Korem

Cross-validation is a well-known and widely used bandwidth selection method in nonparametric regression estimation. However, this technique has two remarkable drawbacks: (i) the large variability of the selected bandwidths, and (ii) the…

Methodology · Statistics 2021-05-11 D. Barreiro-Ures , R. Cao , M. Francisco-Fernández

Cross-validation is a standard tool for obtaining a honest assessment of the performance of a prediction model. The commonly used version repeatedly splits data, trains the prediction model on the training set, evaluates the model…

Machine Learning · Statistics 2025-10-10 Tianyu Pan , Vincent Z. Yu , Viswanath Devanarayan , Lu Tian

This text is a survey on cross-validation. We define all classical cross-validation procedures, and we study their properties for two different goals: estimating the risk of a given estimator, and selecting the best estimator among a given…

Statistics Theory · Mathematics 2017-03-10 Sylvain Arlot

This work develops central limit theorems for cross-validation and consistent estimators of its asymptotic variance under weak stability conditions on the learning algorithm. Together, these results provide practical, asymptotically-exact…

Machine Learning · Statistics 2020-11-03 Pierre Bayle , Alexandre Bayle , Lucas Janson , Lester Mackey

In this article we prove that estimator stability is enough to show that leave-one-out cross validation is a sound procedure, by providing concentration bounds in a general framework. In particular, we provide concentration bounds beyond…

Statistics Theory · Mathematics 2023-10-17 Benny Avelin , Lauri Viitasaari

A general framework is that the estimators of a distribution are obtained by minimizing a function (the estimating function) and they are assessed through another function (the assessment function). The estimating and assessment functions…

Statistics Theory · Mathematics 2022-01-14 Daniel Commenges , Cécile Proust-Lima , Cécilia Samieri , Benoit Liquet

Despite ongoing theoretical research on cross-validation (CV), many theoretical questions remain widely open. This motivates our investigation into how properties of algorithm-distribution pairs can affect the choice for the number of folds…

Statistics Theory · Mathematics 2026-01-09 Ido Nachum , Rüdiger Urbanke , Thomas Weinberger

Cross-validation (CV) is a common method to tune machine learning methods and can be used for model selection in regression as well. Because of the structured nature of small, traditional experimental designs, the literature has warned…

Applications · Statistics 2025-06-18 Maria L. Weese , Byran J. Smucker , David J. Edwards

Cross-validation is one of the most popular model selection methods in statistics and machine learning. Despite its wide applicability, traditional cross validation methods tend to select overfitting models, due to the ignorance of the…

Methodology · Statistics 2017-12-25 Jing Lei

This article introduces a subbagging (subsample aggregating) approach for variable selection in regression within the context of big data. The proposed subbagging approach not only ensures that variable selection is scalable given the…

Methodology · Statistics 2025-03-10 Xian Li , Xuan Liang , Tao Zou

Cross-validation plays a fundamental role in Machine Learning, enabling robust evaluation of model performance and preventing overestimation on training and validation data. However, one of its drawbacks is the potential to create data…

Machine Learning · Computer Science 2025-08-28 Afonso Martini Spezia , Thomas Fontanari , Mariana Recamonde-Mendoza

We present a methodology for model evaluation and selection where the sampling mechanism violates the i.i.d. assumption. Our methodology involves a formulation of the bias between the standard Cross-Validation (CV) estimator and the mean…

Methodology · Statistics 2025-03-14 Oren Yuval , Saharon Rosset

Cross-validation under sample selection bias can, in principle, be done by importance-weighting the empirical risk. However, the importance-weighted risk estimator produces sub-optimal hyperparameter estimates in problem settings where…

Machine Learning · Computer Science 2019-08-28 Wouter M. Kouw , Jesse H. Krijthe , Marco Loog

Predictive models ground many state-of-the-art developments in statistical brain image analysis: decoding, MVPA, searchlight, or extraction of biomarkers. The principled approach to establish their validity and usefulness is…

Quantitative Methods · Quantitative Biology 2017-06-26 Gaël Varoquaux

Cross-validation (CV) is one of the most widely used techniques in statistical learning for estimating the test error of a model, but its behavior is not yet fully understood. It has been shown that standard confidence intervals for test…

Methodology · Statistics 2023-10-10 Min Woo Sun , Robert Tibshirani
‹ Prev 1 2 3 10 Next ›