Related papers: Choosing a penalty for model selection in heterosc…

The Slope Heuristics in Heteroscedastic Regression

We consider the estimation of a regression function with random design and heteroscedastic noise in a nonparametric setting. More precisely, we address the problem of characterizing the optimal penalty when the regression function is…

Statistics Theory · Mathematics 2015-06-29 Adrien Saumard

Data-driven calibration of penalties for least-squares regression

Penalization procedures often suffer from their dependence on multiplying factors, whose optimal values are either unknown or hard to estimate from the data. We propose a completely data-driven calibration algorithm for this parameter in…

Statistics Theory · Mathematics 2010-07-02 Sylvain Arlot, Pascal Massart

Model selection by resampling penalization

In this paper, a new family of resampling-based penalization procedures for model selection is defined in a general framework. It generalizes several methods, including Efron's bootstrap penalization and the leave-one-out penalization…

Statistics Theory · Mathematics 2009-06-19 Sylvain Arlot

Model selection by resampling penalization

We present a new family of model selection algorithms based on the resampling heuristics. It can be used in several frameworks, does not require any knowledge about the unknown law of the data, and may be seen as a generalization of local…

Statistics Theory · Mathematics 2007-06-13 Sylvain Arlot

High-dimensional classification by sparse logistic regression

We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic…

Statistics Theory · Mathematics 2018-11-20 Felix Abramovich, Vadim Grinshtein

Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases

We investigate the optimality for model selection of the so-called slope heuristics, $V$-fold cross-validation and $V$-fold penalization in a heteroscedastic regression context with random design. We consider a new class of linear models…

Statistics Theory · Mathematics 2023-03-08 Fabien Navarro, Adrien Saumard

V-fold cross-validation improved: V-fold penalization

We study the efficiency of V-fold cross-validation (VFCV) for model selection from the non-asymptotic viewpoint, and suggest an improvement on it, which we call "V-fold penalization". Considering a particular (though simple) regression…

Statistics Theory · Mathematics 2008-02-07 Sylvain Arlot

Penalized Composite Quasi-Likelihood for Ultrahigh-Dimensional Variable Selection

In high-dimensional model selection problems, penalized simple least-squares approaches have been extensively used. This paper addresses the question of both robustness and efficiency of penalized model selection methods, and proposes a…

Methodology · Statistics 2011-07-06 Jelena Bradic, Jianqing Fan, Weiwei Wang

Data-driven calibration of linear estimators with minimal penalties

This paper tackles the problem of selecting among several linear estimators in non-parametric regression; this includes model selection for linear regression, the choice of a regularization parameter in kernel ridge regression, spline…

Statistics Theory · Mathematics 2011-09-15 Sylvain Arlot, Francis Bach

Weighted least squares methods for prediction in the functional data linear model

The problem of prediction in functional linear regression is conventionally addressed by reducing dimension via the standard principal component basis. In this paper we show that an alternative basis chosen through weighted least-squares,…

Methodology · Statistics 2009-02-20 Aurore Delaigle, Peter Hall, Tatiyana V. Apanasovich

Multiclass classification by sparse multinomial logistic regression

In this paper we consider high-dimensional multiclass classification by sparse multinomial logistic regression. We propose first a feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size…

Statistics Theory · Mathematics 2020-11-20 Felix Abramovich, Vadim Grinshtein, Tomer Levy

Horseshoe Regularization for Feature Subset Selection

Feature subset selection arises in many high-dimensional applications of statistics, such as compressed sensing and genomics. The $\ell_0$ penalty is ideal for this task, the caveat being it requires the NP-hard combinatorial evaluation of…

Machine Learning · Statistics 2017-06-26 Anindya Bhadra, Jyotishka Datta, Nicholas G. Polson, Brandon Willard

Tuning Parameter Selection for Penalized Estimation via $R^2$

The tuning parameter selection strategy for penalized estimation is crucial to identify a model that is both interpretable and predictive. However, popular strategies (e.g., minimizing average squared prediction error via cross-validation)…

Methodology · Statistics 2022-11-10 Julia Holter, Jonathan Stallrich

A Cluster Elastic Net for Multivariate Regression

We propose a method for estimating coefficients in multivariate regression when there is a clustering structure to the response variables. The proposed method includes a fusion penalty, to shrink the difference in fitted values from…

Machine Learning · Statistics 2018-03-28 Bradley S. Price, Ben Sherwood

Trade-off between predictive performance and FDR control for high-dimensional Gaussian model selection

In the context of high-dimensional Gaussian linear regression for ordered variables, we study the variable selection procedure via the minimization of the penalized least-squares criterion. We focus on model selection where the penalty…

Statistics Theory · Mathematics 2024-07-01 Perrine Lacroix, Marie-Laure Martin

Penalized Euclidean Distance Regression

A new method is proposed for variable screening, variable selection and prediction in linear regression problems where the number of predictors can be much larger than the number of observations. The method involves minimizing a penalized…

Statistics Theory · Mathematics 2017-09-14 D. Vasiliu, T. Dey, I. L. Dryden

Optimal Subsampling for Large Sample Ridge Regression

Subsampling is a popular approach to alleviating the computational burden for analyzing massive datasets. Recent efforts have been devoted to various statistical models without explicit regularization. In this paper, we develop an efficient…

Methodology · Statistics 2022-04-12 Yunlu Chen, Nan Zhang

Stable and Robust Hyper-Parameter Selection Via Robust Information Sharing Cross-Validation

Robust estimators for linear regression require non-convex objective functions to shield against adverse effects of outliers. This non-convexity brings challenges, particularly when combined with penalization in high-dimensional settings.…

Computation · Statistics 2025-08-08 David Kepplinger, Siqi Wei

Penalized Likelihood Regression in Reproducing Kernel Hilbert Spaces with Randomized Covariate Data

Classical penalized likelihood regression problems deal with the case that the independent variables data are known exactly. In practice, however, it is common to observe data with incomplete covariate information. We are concerned with a…

Methodology · Statistics 2010-08-04 Xiwen Ma, Bin Dai, Ronald Klein, Barbara E. K. Klein, Kristine E. Lee, Grace Wahba

Bayesian MIDAS Penalized Regressions: Estimation, Selection, and Prediction

We propose a new approach to mixed-frequency regressions in a high-dimensional environment that resorts to Group Lasso penalization and Bayesian techniques for estimation and inference. In particular, to improve the prediction properties of…

Econometrics · Economics 2020-06-12 Matteo Mogliani, Anna Simoni