Related papers: Estimation Stability with Cross Validation (ESCV)

The Relative Instability of Model Comparison with Cross-validation

Cross-validation (CV) is known to provide asymptotically exact tests and confidence intervals for model improvement but only when the model comparison is relatively stable. Surprisingly, we prove that even simple, individually stable models…

Machine Learning · Statistics 2026-02-10 Alexandre Bayle, Lucas Janson, Lester Mackey

Stability

Reproducibility is imperative for any scientific discovery. More often than not, modern scientific findings rely on statistical analysis of high-dimensional data. At a minimum, reproducibility manifests itself in stability of statistical…

Statistics Theory · Mathematics 2013-10-02 Bin Yu

Stable and Robust Hyper-Parameter Selection Via Robust Information Sharing Cross-Validation

Robust estimators for linear regression require non-convex objective functions to shield against adverse effects of outliers. This non-convexity brings challenges, particularly when combined with penalization in high-dimensional settings.…

Computation · Statistics 2025-08-08 David Kepplinger, Siqi Wei

Model selection by cross-validation in an expectile linear regression

For linear models that may have asymmetric errors, we study variable selection by cross-validation. The data are split into training and validation sets, with the number of observations in the validation set much larger than in the training…

Methodology · Statistics 2026-01-16 Bilel Bousselmi, Gabriela Ciuperca

Approximate Cross-validation: Guarantees for Model Assessment and Selection

Cross-validation (CV) is a popular approach for assessing and selecting predictive models. However, when the number of folds is large, CV suffers from a need to repeatedly refit a learning procedure on a large number of training datasets.…

Machine Learning · Statistics 2020-06-12 Ashia Wilson, Maximilian Kasy, Lester Mackey

Analyzing Cross Validation In Compressed Sensing With Mixed Gaussian And Impulse Measurement Noise With L1 Errors

Compressed sensing (CS) involves sampling signals at rates less than their Nyquist rates and attempting to reconstruct them after sample acquisition. Most such algorithms have parameters, for example the regularization parameter in LASSO,…

Information Theory · Computer Science 2021-02-23 Chinmay Gurjarpadhye, Shubhang Bhatnagar, Ajit Rajwade

Efficient Test-based Variable Selection for High-dimensional Linear Models

Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…

Methodology · Statistics 2018-02-01 Siliang Gong, Kai Zhang, Yufeng Liu

Leave Zero Out: Towards a No-Cross-Validation Approach for Model Selection

As the main workhorse for model selection, Cross Validation (CV) has achieved empirical success due to its simplicity and intuitiveness. However, despite its ubiquitous role, CV often falls into the following notorious dilemmas. On the…

Machine Learning · Computer Science 2020-12-29 Weikai Li, Chuanxing Geng, Songcan Chen

Iterative Approximate Cross-Validation

Cross-validation (CV) is one of the most popular tools for assessing and selecting predictive models. However, standard CV suffers from high computational cost when the number of folds is large. Recently, under the empirical risk…

Methodology · Statistics 2023-05-30 Yuetian Luo, Zhimei Ren, Rina Foygel Barber

Efficient implementations of echo state network cross-validation

Background/introduction: Cross-Validation (CV) is still uncommon in time series modeling. Echo State Networks (ESNs), as a prime example of Reservoir Computing (RC) models, are known for their fast and precise one-shot learning, that often…

Machine Learning · Computer Science 2021-03-05 Mantas Lukoševičius, Arnas Uselis

The restricted consistency property of leave-$n_v$-out cross-validation for high-dimensional variable selection

Cross-validation (CV) methods are popular for selecting the tuning parameter in the high-dimensional variable selection problem. We show that the misalignment of CV is one possible reason for its over-selection behavior. To fix this issue,…

Methodology · Statistics 2018-01-17 Yang Feng, Yi Yu

Ensemble Conditional Variance Estimator for Sufficient Dimension Reduction

Ensemble Conditional Variance Estimation (ECVE) is a novel sufficient dimension reduction (SDR) method in regressions with continuous response and predictors. ECVE applies to general non-additive error regression models. It operates under…

Methodology · Statistics 2021-03-01 Lukas Fertl, Efstathia Bura

Approximate cross-validation formula for Bayesian linear regression

Cross-validation (CV) is a technique for evaluating the ability of statistical models/learning systems based on a given data set. Despite its wide applicability, the rather heavy computational cost can prevent its use as the system size…

Machine Learning · Statistics 2016-10-26 Yoshiyuki Kabashima, Tomoyuki Obuchi, Makoto Uemura

The use of cross validation in the analysis of designed experiments

Cross-validation (CV) is a common method to tune machine learning methods and can be used for model selection in regression as well. Because of the structured nature of small, traditional experimental designs, the literature has warned…

Applications · Statistics 2025-06-18 Maria L. Weese, Byran J. Smucker, David J. Edwards

On the Asymptotic Optimality of Cross-Validation based Hyper-parameter Estimators for Regularized Least Squares Regression Problems

The asymptotic optimality (a.o.) of various hyper-parameter estimators with different optimality criteria has been studied in the literature for regularized least squares regression problems. The estimators include e.g., the maximum…

Statistics Theory · Mathematics 2021-04-28 Biqiang Mu, Tianshi Chen, Lennart Ljung

Estimating the Prediction Performance of Spatial Models via Spatial k-Fold Cross Validation

In machine learning one often assumes the data are independent when evaluating model performance. However, this rarely holds in practise. Geographic information data sets are an example where the data points have stronger dependencies among…

Applications · Statistics 2020-06-01 Jonne Pohjankukka, Tapio Pahikkala, Paavo Nevalainen, Jukka Heikkonen

Cross-validation on Extreme Regions

We conduct a non-asymptotic study of the Cross Validation (CV) estimate of the generalization risk for learning algorithms dedicated to extreme regions of the covariates space. In this Extreme Value Analysis context, the risk function…

Statistics Theory · Mathematics 2024-09-12 Anass Aghbalou, Patrice Bertail, François Portier, Anne Sabourin

Cross-Validation, Risk Estimation, and Model Selection

Cross-validation is a popular non-parametric method for evaluating the accuracy of a predictive rule. The usefulness of cross-validation depends on the task we want to employ it for. In this note, I discuss a simple non-parametric setting,…

Methodology · Statistics 2019-09-27 Stefan Wager

Stabilized Cross-Validation of Smoothness in Density Deconvolution

We consider density estimation under measurement error with the Smoothness-Penalized Deconvolution (SPeD) estimator. The estimator has a tuning parameter regulating the smoothness of the estimate, and proper choice of this parameter is…

Statistics Theory · Mathematics 2025-08-25 David Kent

Approximate Cross-Validation for Structured Models

Many modern data analyses benefit from explicitly modeling dependence structure in data -- such as measurements across time or space, ordered words in a sentence, or genes in a genome. A gold standard evaluation technique is structured…

Machine Learning · Statistics 2020-12-02 Soumya Ghosh, William T. Stephenson, Tin D. Nguyen, Sameer K. Deshpande, Tamara Broderick