Related papers: Model selection by resampling penalization

Model selection by resampling penalization

We present a new family of model selection algorithms based on the resampling heuristics. It can be used in several frameworks, does not require any knowledge about the unknown law of the data, and may be seen as a generalization of local…

Statistics Theory · Mathematics 2007-06-13 Sylvain Arlot

Choosing a penalty for model selection in heteroscedastic regression

We consider the problem of choosing between several models in least-squares regression with heteroscedastic data. We prove that any penalization procedure is suboptimal when the penalty is a function of the dimension of the model, at least…

Statistics Theory · Mathematics 2010-07-28 Sylvain Arlot

The Slope Heuristics in Heteroscedastic Regression

We consider the estimation of a regression function with random design and heteroscedastic noise in a nonparametric setting. More precisely, we address the problem of characterizing the optimal penalty when the regression function is…

Statistics Theory · Mathematics 2015-06-29 Adrien Saumard

Bootstrap for neural model selection

Bootstrap techniques (also called resampling computation techniques) have introduced new advances in modeling and model evaluation. Using resampling methods to construct a series of new samples which are based on the original data set,…

Statistics Theory · Mathematics 2007-06-13 Riadh Kallel, Marie Cottrell, Vincent Vigneron

V-fold cross-validation improved: V-fold penalization

We study the efficiency of V-fold cross-validation (VFCV) for model selection from the non-asymptotic viewpoint, and suggest an improvement on it, which we call "V-fold penalization". Considering a particular (though simple) regression…

Statistics Theory · Mathematics 2008-02-07 Sylvain Arlot

Bootstrap based asymptotic refinements for high-dimensional nonlinear models

We consider penalized extremum estimation of a high-dimensional, possibly nonlinear model that is sparse in the sense that most of its parameters are zero but some are not. We use the SCAD penalty function, which provides model selection…

Econometrics · Economics 2024-02-23 Joel L. Horowitz, Ahnaf Rafi

Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases

We investigate the optimality for model selection of the so-called slope heuristics, $V$-fold cross-validation and $V$-fold penalization in a heteroscedastic regression context with random design. We consider a new class of linear models…

Statistics Theory · Mathematics 2023-03-08 Fabien Navarro, Adrien Saumard

Variable selection using pseudo-variables

Penalized regression has become a standard tool for model building across a wide range of application domains. Common practice is to tune the amount of penalization to trade off bias and variance or to optimize some other measure of…

Methodology · Statistics 2018-04-05 Wenhao Hu, Eric Laber, Leonard Stefanski

Improving prediction accuracy by choosing resampling distribution via cross-validation

In a regression model, prediction is typically performed after model selection. The large variability in the model selection makes the prediction unstable. Thus, it is essential to reduce the variability in model selection and improve…

Computation · Statistics 2024-04-11 Wataru Yoshida, Kei Hirose

Wild Residual Bootstrap Inference for Penalized Quantile Regression with Heteroscedastic Errors

We consider a heteroscedastic regression model in which some of the regression coefficients are zero but it is not known which ones. Penalized quantile regression is a useful approach for analyzing such data. By allowing different…

Methodology · Statistics 2018-07-23 Lan Wang, Ingrid Van Keilegom, Adam Maidman

Lasso tuning through the flexible-weighted bootstrap

Regularized regression approaches such as the Lasso have been widely adopted for constructing sparse linear models in high-dimensional datasets. A complexity in fitting these models is the tuning of the parameters which control the level of…

Methodology · Statistics 2019-03-12 Ellis Patrick, Samuel Mueller

lassopack: Model selection and prediction with regularized regression in Stata

This article introduces lassopack, a suite of programs for regularized regression in Stata. lassopack implements lasso, square-root lasso, elastic net, ridge regression, adaptive lasso and post-estimation OLS. The methods are suitable for…

Econometrics · Economics 2019-01-17 Achim Ahrens, Christian B. Hansen, Mark E. Schaffer

Risk and resampling under model uncertainty

In statistical exercises where there are several candidate models, the traditional approach is to select one model using some data driven criterion and use that model for estimation, testing and other purposes, ignoring the variability of…

Statistics Theory · Mathematics 2008-12-18 Snigdhansu Chatterjee, Nitai D. Mukhopadhyay

An analysis of the cost of hyper-parameter selection via split-sample validation, with applications to penalized regression

In the regression setting, given a set of hyper-parameters, a model-estimation procedure constructs a model from training data. The optimal hyper-parameters that minimize generalization error of the model are usually unknown. In practice…

Machine Learning · Statistics 2019-04-01 Jean Feng, Noah Simon

Optimization with Sparsity-Inducing Penalties

Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel…

Machine Learning · Computer Science 2011-11-24 Francis Bach, Rodolphe Jenatton, Julien Mairal, Guillaume Obozinski

Optimal model selection in density estimation

We build penalized least-squares estimators using the slope heuristic and resampling penalties. We prove oracle inequalities for the selected estimator with leading constant asymptotically equal to 1. We compare the practical performances…

Statistics Theory · Mathematics 2015-03-13 Matthieu Lerasle

Meta-Learning for Resampling Recommendation Systems

One possible approach to tackle the class imbalance in classification tasks is to resample a training dataset, i.e., to drop some of its elements or to synthesize new ones. There exist several widely-used resampling methods. Recent research…

Machine Learning · Computer Science 2018-09-18 Smolyakov Dmitry, Alexander Korotin, Pavel Erofeev, Artem Papanov, Evgeny Burnaev

Transfer learning of regression models from a sequence of datasets by penalized estimation

Transfer learning refers to the promising idea of initializing model fits based on pre-training on other data. We particularly consider regression modeling settings where parameter estimates from previous data can be used as anchoring…

Methodology · Statistics 2020-07-07 Wessel N. van Wieringen, Harald Binder

Non-asymptotic model selection for linear non least-squares estimation in regression models and inverse problems

We propose to address the common problem of linear estimation in linear statistical models by using a model selection approach via penalization. Depending then on the framework in which the linear statistical model is considered, namely the…

Statistics Theory · Mathematics 2009-09-11 Ikhlef Bechar

Bootstrap Bias Corrections for Ensemble Methods

This paper examines the use of a residual bootstrap for bias correction in machine learning regression methods. Accounting for bias is an important obstacle in recent efforts to develop statistical inference for machine learning methods. We…

Machine Learning · Statistics 2015-06-02 Giles Hooker, Lucas Mentch