English
Related papers

Related papers: Better subset regression

200 papers

In this paper we discuss the variable selection method from \ell0-norm constrained regression, which is equivalent to the problem of finding the best subset of a fixed size. Our study focuses on two aspects, consistency and computation. We…

Methodology · Statistics 2013-03-20 Shifeng Xiong

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that tradeoff fitting error (explanatory power) and model complexity (number of variables selected). We build mathematical programming…

Machine Learning · Statistics 2020-09-04 Young Woong Park , Diego Klabjan

The problem of best subset selection in linear regression is considered with the aim to find a fixed size subset of features that best fits the response. This is particularly challenging when the total available number of features is very…

Methodology · Statistics 2023-11-28 Sarat Moka , Benoit Liquet , Houying Zhu , Samuel Muller

Analysis of high-dimensional data has led to increased interest in both single index models (SIMs) and the best-subset selection. SIMs provide an interpretable and flexible modeling framework for high-dimensional data, while the best-subset…

Machine Learning · Statistics 2025-08-19 Borui Tang , Jin Zhu , Junxian Zhu , Xueqin Wang , Heping Zhang

In high-dimensional linear regression, the goal pursued here is to estimate an unknown regression function using linear combinations of a suitable set of covariates. One of the key assumptions for the success of any statistical procedure in…

Statistics Theory · Mathematics 2015-03-13 Philippe Rigollet , Alexandre Tsybakov

We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the…

We consider the problem of best subset selection (BSS) under high-dimensional sparse linear regression model. Recently, Guo et al. (2020) showed that the model selection performance of BSS depends on a certain identifiability margin, a…

Statistics Theory · Mathematics 2025-04-15 Saptarshi Roy , Ambuj Tewari , Ziwei Zhu

Best subset selection in linear regression is well known to be nonconvex and computationally challenging to solve, as the number of possible subsets grows rapidly with increasing dimensionality of the problem. As a result, finding the…

Machine Learning · Statistics 2025-04-01 Vikram Singh , Min Sun

We study the problem of variable selection in convex nonparametric regression. Under the assumption that the true regression function is convex and sparse, we develop a screening procedure to select a subset of variables that contains the…

Statistics Theory · Mathematics 2014-11-19 Min Xu , Minhua Chen , John Lafferty

In high-dimensional generalized linear models, it is crucial to identify a sparse model that adequately accounts for response variation. Although the best subset section has been widely regarded as the Holy Grail of problems of this type,…

Machine Learning · Statistics 2023-08-02 Junxian Zhu , Jin Zhu , Borui Tang , Xuanyu Chen , Hongmei Lin , Xueqin Wang

Identifying the underlying models in a set of data points contaminated by noise and outliers, leads to a highly complex multi-model fitting problem. This problem can be posed as a clustering problem by the projection of higher order…

Computer Vision and Pattern Recognition · Computer Science 2018-08-01 Ruwan Tennakoon , Alireza Sadri , Reza Hoseinnezhad , Alireza Bab-Hadiashar

Filter or screening methods are often used as a preprocessing step for reducing the number of variables used by a learning algorithm in obtaining a classification or regression model. While there are many such filter methods, there is a…

Machine Learning · Statistics 2019-09-13 Mingyuan Wang , Adrian Barbu

The dramatic growth of big datasets presents a new challenge to data storage and analysis. Data reduction, or subsampling, that extracts useful information from datasets is a crucial step in big data analysis. We propose an orthogonal…

Methodology · Statistics 2021-06-01 Lin Wang , Jake Elmstedt , Weng Kee Wong , Hongquan Xu

We study the problem of exact support recovery for high-dimensional sparse linear regression under independent Gaussian design when the signals are weak, rare, and possibly heterogeneous. Under a suitable scaling of the sample size and…

Statistics Theory · Mathematics 2023-07-19 Saptarshi Roy , Ambuj Tewari , Ziwei Zhu

Both feature selection and hyperparameter tuning are key tasks in machine learning. Hyperparameter tuning is often useful to increase model performance, while feature selection is undertaken to attain sparse models. Sparsity may yield…

Machine Learning · Statistics 2020-02-14 Martin Binder , Julia Moosbauer , Janek Thomas , Bernd Bischl

We study a seemingly unexpected and relatively less understood overfitting aspect of a fundamental tool in sparse linear modeling - best subset selection, which minimizes the residual sum of squares subject to a constraint on the number of…

Methodology · Statistics 2022-01-11 Rahul Mazumder , Peter Radchenko , Antoine Dedieu

In this article, we propose a new algorithm for supervised learning methods, by which one can both capture the non-linearity in data and also find the best subset model. To produce an enhanced subset of the original variables, an ideal…

Applications · Statistics 2017-01-23 Peyman Tavallali , Marianne Razavi , Sean Brady

We consider a discriminative learning (regression) problem, whereby the regression function is a convex combination of k linear classifiers. Existing approaches are based on the EM algorithm, or similar techniques, without provable…

Machine Learning · Computer Science 2014-08-01 Yuekai Sun , Stratis Ioannidis , Andrea Montanari

While the SLIM approach obtained high ranking-accuracy in many experiments in the literature, it is also known for its high computational cost of learning its parameters from data. For this reason, we focus in this paper on variants of…

Information Retrieval · Computer Science 2019-05-01 Harald Steck

The problems of Lasso regression and optimal design of experiments share a critical property: their optimal solutions are typically \emph{sparse}, i.e., only a small fraction of the optimal variables are non-zero. Therefore, the…

Methodology · Statistics 2023-12-07 Guillaume Sagnol , Luc Pronzato
‹ Prev 1 2 3 10 Next ›