Related papers: Fast Cross-Validation for Incremental Learning

Iterative Approximate Cross-Validation

Cross-validation (CV) is one of the most popular tools for assessing and selecting predictive models. However, standard CV suffers from high computational cost when the number of folds is large. Recently, under the empirical risk…

Methodology · Statistics 2023-05-30 Yuetian Luo , Zhimei Ren , Rina Foygel Barber

Fast and Informative Model Selection using Learning Curve Cross-Validation

Common cross-validation (CV) methods like k-fold cross-validation or Monte-Carlo cross-validation estimate the predictive performance of a learner by repeatedly training it on a large portion of the given data and testing on the remaining…

Machine Learning · Computer Science 2021-11-30 Felix Mohr , Jan N. van Rijn

Is K-fold cross validation the best model selection method for Machine Learning?

As a technique that can compactly represent complex patterns, machine learning has significant potential for predictive inference. K-fold cross-validation (CV) is the most common approach to ascertaining the likelihood that a machine…

Machine Learning · Statistics 2026-04-24 Juan M Gorriz , R. Martin Clemente , F Segovia , J Ramirez , A Ortiz , J. Suckling

Cross-Validation for Unsupervised Learning

Cross-validation (CV) is a popular method for model-selection. Unfortunately, it is not immediately obvious how to apply CV to unsupervised or exploratory contexts. This thesis discusses some extensions of cross-validation to unsupervised…

Methodology · Statistics 2009-09-17 Patrick O. Perry

Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models

K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning. However, the very process of CV requires random partitioning of the data and so…

Computation and Language · Computer Science 2018-06-20 Henry B. Moss , David S. Leslie , Paul Rayson

Cross Validation for Correlated Data in Regression and Classification Models, with Applications to Deep Learning

We present a methodology for model evaluation and selection where the sampling mechanism violates the i.i.d. assumption. Our methodology involves a formulation of the bias between the standard Cross-Validation (CV) estimator and the mean…

Methodology · Statistics 2025-03-14 Oren Yuval , Saharon Rosset

Blocked Cross-Validation: A Precise and Efficient Method for Hyperparameter Tuning

Hyperparameter tuning plays a crucial role in optimizing the performance of predictive learners. Cross--validation (CV) is a widely adopted technique for estimating the error of different hyperparameter settings. Repeated cross-validation…

Machine Learning · Computer Science 2023-08-01 Giovanni Maria Merola

Approximate cross-validation formula for Bayesian linear regression

Cross-validation (CV) is a technique for evaluating the ability of statistical models/learning systems based on a given data set. Despite its wide applicability, the rather heavy computational cost can prevent its use as the system size…

Machine Learning · Statistics 2016-10-26 Yoshiyuki Kabashima , Tomoyuki Obuchi , Makoto Uemura

When to Impute? Imputation before and during cross-validation

Cross-validation (CV) is a technique used to estimate generalization error for prediction models. For pipeline modeling algorithms (i.e. modeling procedures with multiple steps), it has been recommended the entire sequence of steps be…

Machine Learning · Statistics 2020-10-05 Byron C. Jaeger , Nicholas J. Tierney , Noah R. Simon

Stable and Robust Hyper-Parameter Selection Via Robust Information Sharing Cross-Validation

Robust estimators for linear regression require non-convex objective functions to shield against adverse affects of outliers. This non-convexity brings challenges, particularly when combined with penalization in high-dimensional settings.…

Computation · Statistics 2025-08-08 David Kepplinger , Siqi Wei

Efficient, adaptive cross-validation for tuning and comparing models, with application to drug discovery

Cross-validation (CV) is widely used for tuning a model with respect to user-selected parameters and for selecting a "best" model. For example, the method of $k$-nearest neighbors requires the user to choose $k$, the number of neighbors,…

Applications · Statistics 2012-03-01 Hui Shen , William J. Welch , Jacqueline M. Hughes-Oliver

Approximate Cross-validation: Guarantees for Model Assessment and Selection

Cross-validation (CV) is a popular approach for assessing and selecting predictive models. However, when the number of folds is large, CV suffers from a need to repeatedly refit a learning procedure on a large number of training datasets.…

Machine Learning · Statistics 2020-06-12 Ashia Wilson , Maximilian Kasy , Lester Mackey

Leave Zero Out: Towards a No-Cross-Validation Approach for Model Selection

As the main workhorse for model selection, Cross Validation (CV) has achieved an empirical success due to its simplicity and intuitiveness. However, despite its ubiquitous role, CV often falls into the following notorious dilemmas. On the…

Machine Learning · Computer Science 2020-12-29 Weikai Li , Chuanxing Geng , Songcan Chen

The use of cross validation in the analysis of designed experiments

Cross-validation (CV) is a common method to tune machine learning methods and can be used for model selection in regression as well. Because of the structured nature of small, traditional experimental designs, the literature has warned…

Applications · Statistics 2025-06-18 Maria L. Weese , Byran J. Smucker , David J. Edwards

Efficient algorithms for decision tree cross-validation

Cross-validation is a useful and generally applicable technique often employed in machine learning, including decision tree induction. An important disadvantage of straightforward implementation of the technique is its computational…

Machine Learning · Computer Science 2007-05-23 Hendrik Blockeel , Jan Struyf

Optimizing for Generalization in Machine Learning with Cross-Validation Gradients

Cross-validation is the workhorse of modern applied statistics and machine learning, as it provides a principled framework for selecting the model that maximizes generalization performance. In this paper, we show that the cross-validation…

Machine Learning · Statistics 2018-05-21 Shane Barratt , Rishi Sharma

The Generalized Cross Validation Filter

Generalized cross validation (GCV) is one of the most important approaches used to estimate parameters in the context of inverse problems and regularization techniques. A notable example is the determination of the smoothness parameter in…

Machine Learning · Statistics 2017-06-09 Giulio Bottegal , Gianluigi Pillonetto

Bayesian cross-validation by parallel Markov Chain Monte Carlo

Brute force cross-validation (CV) is a method for predictive assessment and model selection that is general and applicable to a wide range of Bayesian models. Naive or `brute force' CV approaches are often too computationally costly for…

Methodology · Statistics 2024-01-17 Alex Cooper , Aki Vehtari , Catherine Forbes , Lauren Kennedy , Dan Simpson

Approximate Cross-Validation for Structured Models

Many modern data analyses benefit from explicitly modeling dependence structure in data -- such as measurements across time or space, ordered words in a sentence, or genes in a genome. A gold standard evaluation technique is structured…

Machine Learning · Statistics 2020-12-02 Soumya Ghosh , William T. Stephenson , Tin D. Nguyen , Sameer K. Deshpande , Tamara Broderick

Statistical learning and cross-validation for point processes

This paper presents the first general (supervised) statistical learning framework for point processes in general spaces. Our approach is based on the combination of two new concepts, which we define in the paper: i) bivariate innovations,…

Methodology · Statistics 2021-03-03 Ottmar Cronie , Mehdi Moradi , Christophe A. N. Biscio