Related papers: Selective inference after cross-validation

Model selection by cross-validation in an expectile linear regression

For linear models that may have asymmetric errors, we study variable selection by cross-validation. The data are split into training and validation sets, with the number of observations in the validation set much larger than in the training…

Methodology · Statistics 2026-01-16 Bilel Bousselmi, Gabriela Ciuperca

Unifying approach to selective inference with applications to cross-validation

We develop tools to do valid post-selective inference for a family of model selection procedures, including choosing a model via cross-validated Lasso. The tools apply universally when the following random vectors are jointly asymptotically…

Methodology · Statistics 2018-02-13 Jelena Markovic, Lucy Xia, Jonathan Taylor

Cross-Validation with Confidence

Cross-validation is one of the most popular model selection methods in statistics and machine learning. Despite its wide applicability, traditional cross validation methods tend to select overfitting models, due to the ignorance of the…

Methodology · Statistics 2017-12-25 Jing Lei

Selective inference in regression models with groups of variables

We provide a general mathematical framework for selective inference with supervised model selection procedures characterized by quadratic forms in the outcome variable. Forward stepwise with groups of variables is an important special case…

Methodology · Statistics 2015-11-05 Joshua R. Loftus, Jonathan E. Taylor

Cross-Validation for Unsupervised Learning

Cross-validation (CV) is a popular method for model-selection. Unfortunately, it is not immediately obvious how to apply CV to unsupervised or exploratory contexts. This thesis discusses some extensions of cross-validation to unsupervised…

Methodology · Statistics 2009-09-17 Patrick O. Perry

A survey of cross-validation procedures for model selection

Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its apparent universality. Many results exist on the model selection performances of…

Statistics Theory · Mathematics 2011-02-01 Sylvain Arlot, Alain Celisse

Tuning Parameter Selection for Penalized Estimation via $R^2$

The tuning parameter selection strategy for penalized estimation is crucial to identify a model that is both interpretable and predictive. However, popular strategies (e.g., minimizing average squared prediction error via cross-validation)…

Methodology · Statistics 2022-11-10 Julia Holter, Jonathan Stallrich

Cross-Validation, Risk Estimation, and Model Selection

Cross-validation is a popular non-parametric method for evaluating the accuracy of a predictive rule. The usefulness of cross-validation depends on the task we want to employ it for. In this note, I discuss a simple non-parametric setting,…

Methodology · Statistics 2019-09-27 Stefan Wager

Selective Inference in Propensity Score Analysis

Selective inference (post-selection inference) is a methodology that has attracted much attention in recent years in the fields of statistics and machine learning. Naive inference based on data that are also used for model selection tends…

Methodology · Statistics 2021-11-25 Yoshiyuki Ninomiya, Yuta Umezu, Ichiro Takeuchi

Test Error Estimation after Model Selection Using Validation Error

When performing supervised learning with the model selected using validation error from sample splitting and cross validation, the minimum value of the validation error can be biased downward. We propose two simple methods that use the…

Methodology · Statistics 2018-02-13 Leying Guan

Cross Validation Based Model Selection via Generalized Method of Moments

Structural estimation is an important methodology in empirical economics, and a large class of structural models are estimated through the generalized method of moments (GMM). Traditionally, selection of structural models has been performed…

Econometrics · Economics 2018-07-19 Junpei Komiyama, Hajime Shimao

Model selection for estimation of causal parameters

A popular technique for selecting and tuning machine learning estimators is cross-validation. Cross-validation evaluates overall model fit, usually in terms of predictive accuracy. In causal inference, the optimal choice of estimator…

Methodology · Statistics 2021-07-07 Dominik Rothenhäusler

Exact post-selection inference, with application to the lasso

We develop a general approach to valid inference after model selection. At the core of our framework is a result that characterizes the distribution of a post-selection estimator conditioned on the selection event. We specialize the…

Statistics Theory · Mathematics 2016-05-04 Jason D. Lee, Dennis L. Sun, Yuekai Sun, Jonathan E. Taylor

Selective Inference for Latent Block Models

Model selection in latent block models has been a challenging but important task in the field of statistics. Specifically, a major challenge is encountered when constructing a test on a block structure obtained by applying a specific…

Machine Learning · Statistics 2021-06-08 Chihiro Watanabe, Taiji Suzuki

Cross-validation: what does it estimate and how well does it do it?

Cross-validation is a widely-used technique to estimate prediction error, but its behavior is complex and not fully understood. Ideally, one would like to think that cross-validation estimates the prediction error for the model at hand, fit…

Methodology · Statistics 2024-03-12 Stephen Bates, Trevor Hastie, Robert Tibshirani

Approximate Cross-validation: Guarantees for Model Assessment and Selection

Cross-validation (CV) is a popular approach for assessing and selecting predictive models. However, when the number of folds is large, CV suffers from a need to repeatedly refit a learning procedure on a large number of training datasets.…

Machine Learning · Statistics 2020-06-12 Ashia Wilson, Maximilian Kasy, Lester Mackey

Selective Inference for Additive and Linear Mixed Models

This work addresses the problem of conducting valid inference for additive and linear mixed models after model selection. One possible solution to overcome overconfident inference results after model selection is selective inference, which…

Methodology · Statistics 2020-12-22 David Rügamer, Philipp F. M. Baumann, Sonja Greven

Variable selection for longitudinal survey data

In this article we propose a new variable selection method for analyzing data collected from longitudinal sample surveys. The procedure is based on the survey-weighted quadratic inference function, which was recently introduced as an…

Statistics Theory · Mathematics 2021-05-04 Laura Dumitrescu, Wei Qian, J. N. K. Rao

Selective inference after likelihood- or test-based model selection in linear models

Statistical inference after model selection requires an inference framework that takes the selection into account in order to be valid. Following recent work on selective inference, we derive analytical expressions for inference after…

Methodology · Statistics 2017-09-26 David Rügamer, Sonja Greven

Cross Validation for Correlated Data in Regression and Classification Models, with Applications to Deep Learning

We present a methodology for model evaluation and selection where the sampling mechanism violates the i.i.d. assumption. Our methodology involves a formulation of the bias between the standard Cross-Validation (CV) estimator and the mean…

Methodology · Statistics 2025-03-14 Oren Yuval, Saharon Rosset