Optimal Cross-Validation for Sparse Linear Regression

Ryan Cory-Wright; Andrés Gómez

Optimization and Control 2026-02-13 v4 Machine Learning Methodology

Abstract

Given a high-dimensional covariate matrix and a response vector, ridge-regularized sparse linear regression selects a subset of features that explains the relationship between covariates and the response in an interpretable manner. To choose hyperparameters that control the sparsity level and amount of regularization, practitioners commonly use k-fold cross-validation. However, cross-validation substantially increases the computational cost of sparse regression, as it requires solving many mixed-integer optimization problems (MIOs) for each hyperparameter combination. To address this computational burden, we derive computationally tractable relaxations of the k-fold cross-validation loss, facilitating hyperparameter selection while solving $50$--$80\%$ fewer MIOs in practice. Our computational results demonstrate, across eleven real-world UCI datasets, that exact MIO-based cross-validation can be competitive with mature software packages such as glmnet and L0Learn, particularly when the sample-to-feature ratio is small.
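
To make the cost of the baseline procedure concrete, the following is a minimal sketch of plain k-fold cross-validation over a grid of sparsity levels and ridge penalties. It is not the relaxation-based method of the paper: exhaustive enumeration of supports stands in for the MIO solver, which is only feasible for a handful of features. The function names, the parameter names `tau` (sparsity budget) and `lam` (ridge penalty), and the objective scaling $\|y - X\beta\|_2^2 + \lambda\|\beta\|_2^2$ are assumptions made for illustration and may differ from the paper's parameterization.

```python
from itertools import combinations

import numpy as np


def fit_sparse_ridge(X, y, tau, lam):
    """Brute-force stand-in for the MIO: minimize
    ||y - X b||^2 + lam * ||b||^2 subject to ||b||_0 <= tau.

    Assumes 1 <= tau <= p and lam > 0. With a ridge penalty, enlarging the
    support never worsens the objective, so enumerating supports of size
    exactly tau is enough.
    """
    p = X.shape[1]
    best_obj, best_beta = np.inf, np.zeros(p)
    for support in combinations(range(p), tau):
        idx = list(support)
        Xs = X[:, idx]
        # Closed-form ridge solution restricted to the candidate support.
        coef = np.linalg.solve(Xs.T @ Xs + lam * np.eye(tau), Xs.T @ y)
        resid = y - Xs @ coef
        obj = resid @ resid + lam * (coef @ coef)
        if obj < best_obj:
            best_obj = obj
            best_beta = np.zeros(p)
            best_beta[idx] = coef
    return best_beta


def kfold_cv_loss(X, y, tau, lam, k=5, seed=0):
    """Mean squared held-out error over k folds (one sparse fit per fold)."""
    n = X.shape[0]
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    total = 0.0
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        beta = fit_sparse_ridge(X[train], y[train], tau, lam)
        total += np.sum((y[held_out] - X[held_out] @ beta) ** 2)
    return total / n


def select_hyperparameters(X, y, taus, lams, k=5):
    """Grid search: k sparse fits for every (tau, lam) combination."""
    scores = {(t, l): kfold_cv_loss(X, y, t, l, k) for t in taus for l in lams}
    return min(scores, key=scores.get), scores
```

In this sketch a grid with `len(taus) * len(lams)` combinations triggers `k * len(taus) * len(lams)` sparse fits, each of which would be a separate MIO in the exact setting. That multiplicative count is the burden the abstract refers to.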

Keywords

covariance estimation, compressed sensing, statistical estimation

Cite

@article{arxiv.2306.14851,
  title  = {Optimal Cross-Validation for Sparse Linear Regression},
  author = {Ryan Cory-Wright and Andrés Gómez},
  journal= {arXiv preprint arXiv:2306.14851},
  year   = {2026}
}

Comments

Updated manuscript for revision
