Related papers: Selective inference after cross-validation
For linear models that may have asymmetric errors, we study variable selection by cross-validation. The data are split into training and validation sets, with the number of observations in the validation set much larger than in the training…
We develop tools to do valid post-selective inference for a family of model selection procedures, including choosing a model via cross-validated Lasso. The tools apply universally when the following random vectors are jointly asymptotically…
Cross-validation is one of the most popular model selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to select overfitting models, due to the ignorance of the…
We provide a general mathematical framework for selective inference with supervised model selection procedures characterized by quadratic forms in the outcome variable. Forward stepwise with groups of variables is an important special case…
Cross-validation (CV) is a popular method for model selection. Unfortunately, it is not immediately obvious how to apply CV to unsupervised or exploratory contexts. This thesis discusses some extensions of cross-validation to unsupervised…
Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its apparent universality. Many results exist on the model selection performances of…
The tuning parameter selection strategy for penalized estimation is crucial to identify a model that is both interpretable and predictive. However, popular strategies (e.g., minimizing average squared prediction error via cross-validation)…
Cross-validation is a popular non-parametric method for evaluating the accuracy of a predictive rule. The usefulness of cross-validation depends on the task we want to employ it for. In this note, I discuss a simple non-parametric setting,…
Selective inference (post-selection inference) is a methodology that has attracted much attention in recent years in the fields of statistics and machine learning. Naive inference based on data that are also used for model selection tends…
When performing supervised learning with the model selected using validation error from sample splitting and cross-validation, the minimum value of the validation error can be biased downward. We propose two simple methods that use the…
Structural estimation is an important methodology in empirical economics, and a large class of structural models are estimated through the generalized method of moments (GMM). Traditionally, selection of structural models has been performed…
A popular technique for selecting and tuning machine learning estimators is cross-validation. Cross-validation evaluates overall model fit, usually in terms of predictive accuracy. In causal inference, the optimal choice of estimator…
We develop a general approach to valid inference after model selection. At the core of our framework is a result that characterizes the distribution of a post-selection estimator conditioned on the selection event. We specialize the…
Model selection in latent block models has been a challenging but important task in the field of statistics. Specifically, a major challenge is encountered when constructing a test on a block structure obtained by applying a specific…
Cross-validation is a widely-used technique to estimate prediction error, but its behavior is complex and not fully understood. Ideally, one would like to think that cross-validation estimates the prediction error for the model at hand, fit…
Cross-validation (CV) is a popular approach for assessing and selecting predictive models. However, when the number of folds is large, CV suffers from a need to repeatedly refit a learning procedure on a large number of training datasets.…
This work addresses the problem of conducting valid inference for additive and linear mixed models after model selection. One possible solution to overcome overconfident inference results after model selection is selective inference, which…
In this article we propose a new variable selection method for analyzing data collected from longitudinal sample surveys. The procedure is based on the survey-weighted quadratic inference function, which was recently introduced as an…
Statistical inference after model selection requires an inference framework that takes the selection into account in order to be valid. Following recent work on selective inference, we derive analytical expressions for inference after…
We present a methodology for model evaluation and selection where the sampling mechanism violates the i.i.d. assumption. Our methodology involves a formulation of the bias between the standard Cross-Validation (CV) estimator and the mean…