Related papers: Approximate Cross-Validation for Structured Models

Estimating the Prediction Performance of Spatial Models via Spatial k-Fold Cross Validation

In machine learning one often assumes the data are independent when evaluating model performance. However, this rarely holds in practise. Geographic information data sets are an example where the data points have stronger dependencies among…

Applications · Statistics 2020-06-01 Jonne Pohjankukka , Tapio Pahikkala , Paavo Nevalainen , Jukka Heikkonen

Approximate Cross-validation: Guarantees for Model Assessment and Selection

Cross-validation (CV) is a popular approach for assessing and selecting predictive models. However, when the number of folds is large, CV suffers from a need to repeatedly refit a learning procedure on a large number of training datasets.…

Machine Learning · Statistics 2020-06-12 Ashia Wilson , Maximilian Kasy , Lester Mackey

The use of cross validation in the analysis of designed experiments

Cross-validation (CV) is a common method to tune machine learning methods and can be used for model selection in regression as well. Because of the structured nature of small, traditional experimental designs, the literature has warned…

Applications · Statistics 2025-06-18 Maria L. Weese , Byran J. Smucker , David J. Edwards

Assessing the performance of spatial cross-validation approaches for models of spatially structured data

Evaluating models fit to data with internal spatial structure requires specific cross-validation (CV) approaches, because randomly selecting assessment data may produce assessment sets that are not truly independent of data used to train…

Computation · Statistics 2023-03-14 Michael J Mahoney , Lucas K Johnson , Julia Silge , Hannah Frick , Max Kuhn , Colin M Beier

Efficient, adaptive cross-validation for tuning and comparing models, with application to drug discovery

Cross-validation (CV) is widely used for tuning a model with respect to user-selected parameters and for selecting a "best" model. For example, the method of $k$-nearest neighbors requires the user to choose $k$, the number of neighbors,…

Applications · Statistics 2012-03-01 Hui Shen , William J. Welch , Jacqueline M. Hughes-Oliver

Fast and Informative Model Selection using Learning Curve Cross-Validation

Common cross-validation (CV) methods like k-fold cross-validation or Monte-Carlo cross-validation estimate the predictive performance of a learner by repeatedly training it on a large portion of the given data and testing on the remaining…

Machine Learning · Computer Science 2021-11-30 Felix Mohr , Jan N. van Rijn

Cross-Validation for Correlated Data

K-fold cross-validation (CV) with squared error loss is widely used for evaluating predictive models, especially when strong distributional assumptions cannot be taken. However, CV with squared error loss is not free from distributional…

Methodology · Statistics 2021-08-10 Assaf Rabinowicz , Saharon Rosset

Is Cross-Validation the Gold Standard to Evaluate Model Performance?

Cross-Validation (CV) is the default choice for evaluating the performance of machine learning models. Despite its wide usage, their statistical benefits have remained half-understood, especially in challenging nonparametric regimes. In…

Statistics Theory · Mathematics 2024-08-22 Garud Iyengar , Henry Lam , Tianyu Wang

Cross Validation Based Model Selection via Generalized Method of Moments

Structural estimation is an important methodology in empirical economics, and a large class of structural models are estimated through the generalized method of moments (GMM). Traditionally, selection of structural models has been performed…

Econometrics · Economics 2018-07-19 Junpei Komiyama , Hajime Shimao

Fast Cross-Validation for Incremental Learning

Cross-validation (CV) is one of the main tools for performance estimation and parameter tuning in machine learning. The general recipe for computing CV estimate is to run a learning algorithm separately for each CV fold, a computationally…

Machine Learning · Statistics 2015-07-02 Pooria Joulani , András György , Csaba Szepesvári

Iterative Approximate Cross-Validation

Cross-validation (CV) is one of the most popular tools for assessing and selecting predictive models. However, standard CV suffers from high computational cost when the number of folds is large. Recently, under the empirical risk…

Methodology · Statistics 2023-05-30 Yuetian Luo , Zhimei Ren , Rina Foygel Barber

Leave Zero Out: Towards a No-Cross-Validation Approach for Model Selection

As the main workhorse for model selection, Cross Validation (CV) has achieved an empirical success due to its simplicity and intuitiveness. However, despite its ubiquitous role, CV often falls into the following notorious dilemmas. On the…

Machine Learning · Computer Science 2020-12-29 Weikai Li , Chuanxing Geng , Songcan Chen

Approximate Cross-Validation with Low-Rank Data in High Dimensions

Many recent advances in machine learning are driven by a challenging trifecta: large data size $N$; high dimensions; and expensive algorithms. In this setting, cross-validation (CV) serves as an important tool for model assessment. Recent…

Machine Learning · Statistics 2022-11-03 William T. Stephenson , Madeleine Udell , Tamara Broderick

Is K-fold cross validation the best model selection method for Machine Learning?

As a technique that can compactly represent complex patterns, machine learning has significant potential for predictive inference. K-fold cross-validation (CV) is the most common approach to ascertaining the likelihood that a machine…

Machine Learning · Statistics 2026-04-24 Juan M Gorriz , R. Martin Clemente , F Segovia , J Ramirez , A Ortiz , J. Suckling

Generalised learning of time-series: Ornstein-Uhlenbeck processes

In machine learning, statistics, econometrics and statistical physics, cross-validation (CV) is used asa standard approach in quantifying the generalisation performance of a statistical model. A directapplication of CV in time-series leads…

Machine Learning · Statistics 2021-12-14 Mehmet Süzen , Alper Yegenoglu

Approximate cross-validation formula for Bayesian linear regression

Cross-validation (CV) is a technique for evaluating the ability of statistical models/learning systems based on a given data set. Despite its wide applicability, the rather heavy computational cost can prevent its use as the system size…

Machine Learning · Statistics 2016-10-26 Yoshiyuki Kabashima , Tomoyuki Obuchi , Makoto Uemura

Foundation for unbiased cross-validation of spatio-temporal models for species distribution modeling

Evaluating the predictive performance of species distribution models (SDMs) under realistic deployment scenarios requires careful handling of spatial and temporal dependencies in the data. Cross-validation (CV) is the standard approach for…

Applications · Statistics 2025-12-22 Diana Koldasbayeva , Alexey Zaytsev

Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models

K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning. However, the very process of CV requires random partitioning of the data and so…

Computation and Language · Computer Science 2018-06-20 Henry B. Moss , David S. Leslie , Paul Rayson

Confidence intervals for the Cox model test error from cross-validation

Cross-validation (CV) is one of the most widely used techniques in statistical learning for estimating the test error of a model, but its behavior is not yet fully understood. It has been shown that standard confidence intervals for test…

Methodology · Statistics 2023-10-10 Min Woo Sun , Robert Tibshirani

Cross Validation for Correlated Data in Regression and Classification Models, with Applications to Deep Learning

We present a methodology for model evaluation and selection where the sampling mechanism violates the i.i.d. assumption. Our methodology involves a formulation of the bias between the standard Cross-Validation (CV) estimator and the mean…

Methodology · Statistics 2025-03-14 Oren Yuval , Saharon Rosset