Related papers: Targeted Cross-Validation

Aligning Validation with Deployment in Spatial Prediction: Target-Weighted Cross-Validation

Reliable estimation of predictive performance is essential for spatial environmental modeling, where machine-learning models are used to generate maps from unevenly distributed observations. Standard cross-validation (CV) assumes that…

Machine Learning · Computer Science 2026-05-22 Alexander Brenning , Thomas Suesse

The restricted consistency property of leave-$n_v$-out cross-validation for high-dimensional variable selection

Cross-validation (CV) methods are popular for selecting the tuning parameter in the high-dimensional variable selection problem. We show the mis-alignment of the CV is one possible reason of its over-selection behavior. To fix this issue,…

Methodology · Statistics 2018-01-17 Yang Feng , Yi Yu

Cross Validation for Correlated Data in Regression and Classification Models, with Applications to Deep Learning

We present a methodology for model evaluation and selection where the sampling mechanism violates the i.i.d. assumption. Our methodology involves a formulation of the bias between the standard Cross-Validation (CV) estimator and the mean…

Methodology · Statistics 2025-03-14 Oren Yuval , Saharon Rosset

Fast and Informative Model Selection using Learning Curve Cross-Validation

Common cross-validation (CV) methods like k-fold cross-validation or Monte-Carlo cross-validation estimate the predictive performance of a learner by repeatedly training it on a large portion of the given data and testing on the remaining…

Machine Learning · Computer Science 2021-11-30 Felix Mohr , Jan N. van Rijn

Is K-fold cross validation the best model selection method for Machine Learning?

As a technique that can compactly represent complex patterns, machine learning has significant potential for predictive inference. K-fold cross-validation (CV) is the most common approach to ascertaining the likelihood that a machine…

Machine Learning · Statistics 2026-04-24 Juan M Gorriz , R. Martin Clemente , F Segovia , J Ramirez , A Ortiz , J. Suckling

Efficient, adaptive cross-validation for tuning and comparing models, with application to drug discovery

Cross-validation (CV) is widely used for tuning a model with respect to user-selected parameters and for selecting a "best" model. For example, the method of $k$-nearest neighbors requires the user to choose $k$, the number of neighbors,…

Applications · Statistics 2012-03-01 Hui Shen , William J. Welch , Jacqueline M. Hughes-Oliver

The use of cross validation in the analysis of designed experiments

Cross-validation (CV) is a common method to tune machine learning methods and can be used for model selection in regression as well. Because of the structured nature of small, traditional experimental designs, the literature has warned…

Applications · Statistics 2025-06-18 Maria L. Weese , Byran J. Smucker , David J. Edwards

Distributional bias compromises leave-one-out cross-validation

Cross-validation is a common method for estimating the predictive performance of machine learning models. In a data-scarce regime, where one typically wishes to maximize the number of instances used for training the model, an approach…

Methodology · Statistics 2025-03-25 George I. Austin , Itsik Pe'er , Tal Korem

Model selection by cross-validation in an expectile linear regression

For linear models that may have asymmetric errors, we study variable selection by cross-validation. The data are split into training and validation sets, with the number of observations in the validation set much larger than in the training…

Methodology · Statistics 2026-01-16 Bilel Bousselmi , Gabriela Ciuperca

Cross-Validation with Confidence

Cross-validation is one of the most popular model selection methods in statistics and machine learning. Despite its wide applicability, traditional cross validation methods tend to select overfitting models, due to the ignorance of the…

Methodology · Statistics 2017-12-25 Jing Lei

Stable and Robust Hyper-Parameter Selection Via Robust Information Sharing Cross-Validation

Robust estimators for linear regression require non-convex objective functions to shield against adverse affects of outliers. This non-convexity brings challenges, particularly when combined with penalization in high-dimensional settings.…

Computation · Statistics 2025-08-08 David Kepplinger , Siqi Wei

Unified optimal model averaging with a general loss function based on cross-validation

Studying unified model averaging estimation for situations with complicated data structures, we propose a novel model averaging method based on cross-validation (MACV). MACV unifies a large class of new and existing model averaging…

Methodology · Statistics 2024-12-16 Dalei Yu , Xinyu Zhang , Hua Liang

Efficient Test-based Variable Selection for High-dimensional Linear Models

Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…

Methodology · Statistics 2018-02-01 Siliang Gong , Kai Zhang , Yufeng Liu

Consistency of cross validation for comparing regression procedures

Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finite-dimensional models (e.g., subset or order selection in linear regression) or selecting a smoothing parameter (e.g., bandwidth for…

Statistics Theory · Mathematics 2008-12-18 Yuhong Yang

Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models

K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning. However, the very process of CV requires random partitioning of the data and so…

Computation and Language · Computer Science 2018-06-20 Henry B. Moss , David S. Leslie , Paul Rayson

Statistical learning and cross-validation for point processes

This paper presents the first general (supervised) statistical learning framework for point processes in general spaces. Our approach is based on the combination of two new concepts, which we define in the paper: i) bivariate innovations,…

Methodology · Statistics 2021-03-03 Ottmar Cronie , Mehdi Moradi , Christophe A. N. Biscio

Is Cross-Validation the Gold Standard to Evaluate Model Performance?

Cross-Validation (CV) is the default choice for evaluating the performance of machine learning models. Despite its wide usage, their statistical benefits have remained half-understood, especially in challenging nonparametric regimes. In…

Statistics Theory · Mathematics 2024-08-22 Garud Iyengar , Henry Lam , Tianyu Wang

Cross-Validation for Unsupervised Learning

Cross-validation (CV) is a popular method for model-selection. Unfortunately, it is not immediately obvious how to apply CV to unsupervised or exploratory contexts. This thesis discusses some extensions of cross-validation to unsupervised…

Methodology · Statistics 2009-09-17 Patrick O. Perry

Joint leave-group-out cross-validation in Bayesian spatial models

Cross-validation (CV) is a widely-used method of predictive assessment based on repeated model fits to different subsets of the available data. CV is applicable in a wide range of statistical settings. However, in cases where data are not…

Methodology · Statistics 2025-04-23 Alex Cooper , Aki Vehtari , Catherine Forbes

Blocked Cross-Validation: A Precise and Efficient Method for Hyperparameter Tuning

Hyperparameter tuning plays a crucial role in optimizing the performance of predictive learners. Cross--validation (CV) is a widely adopted technique for estimating the error of different hyperparameter settings. Repeated cross-validation…

Machine Learning · Computer Science 2023-08-01 Giovanni Maria Merola