Related papers: Better subset regression

On best subset regression

In this paper we discuss the variable selection method from $\ell_0$-norm constrained regression, which is equivalent to the problem of finding the best subset of a fixed size. Our study focuses on two aspects, consistency and computation. We…

Methodology · Statistics 2013-03-20 Shifeng Xiong

Subset Selection for Multiple Linear Regression via Optimization

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that trade off fitting error (explanatory power) and model complexity (number of variables selected). We build mathematical programming…

Machine Learning · Statistics 2020-09-04 Young Woong Park, Diego Klabjan

COMBSS: Best Subset Selection via Continuous Optimization

The problem of best subset selection in linear regression is considered with the aim to find a fixed size subset of features that best fits the response. This is particularly challenging when the total available number of features is very…

Methodology · Statistics 2023-11-28 Sarat Moka, Benoit Liquet, Houying Zhu, Samuel Muller

A Consistent and Scalable Algorithm for Best Subset Selection in Single Index Models

Analysis of high-dimensional data has led to increased interest in both single index models (SIMs) and the best-subset selection. SIMs provide an interpretable and flexible modeling framework for high-dimensional data, while the best-subset…

Machine Learning · Statistics 2025-08-19 Borui Tang, Jin Zhu, Junxian Zhu, Xueqin Wang, Heping Zhang

Exponential Screening and optimal rates of sparse estimation

In high-dimensional linear regression, the goal pursued here is to estimate an unknown regression function using linear combinations of a suitable set of covariates. One of the key assumptions for the success of any statistical procedure in…

Statistics Theory · Mathematics 2015-03-13 Philippe Rigollet, Alexandre Tsybakov

Fast Feature Selection with Fairness Constraints

We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the…

Machine Learning · Computer Science 2023-02-06 Francesco Quinzan, Rajiv Khanna, Moshik Hershcovitch, Sarel Cohen, Daniel G. Waddington, Tobias Friedrich, Michael W. Mahoney

Understanding Best Subset Selection: A Tale of Two C(omplex)ities

We consider the problem of best subset selection (BSS) under the high-dimensional sparse linear regression model. Recently, Guo et al. (2020) showed that the model selection performance of BSS depends on a certain identifiability margin, a…

Statistics Theory · Mathematics 2025-04-15 Saptarshi Roy, Ambuj Tewari, Ziwei Zhu

Solving the Best Subset Selection Problem via Suboptimal Algorithms

Best subset selection in linear regression is well known to be nonconvex and computationally challenging to solve, as the number of possible subsets grows rapidly with increasing dimensionality of the problem. As a result, finding the…

Machine Learning · Statistics 2025-04-01 Vikram Singh, Min Sun

Faithful Variable Screening for High-Dimensional Convex Regression

We study the problem of variable selection in convex nonparametric regression. Under the assumption that the true regression function is convex and sparse, we develop a screening procedure to select a subset of variables that contains the…

Statistics Theory · Mathematics 2014-11-19 Min Xu, Minhua Chen, John Lafferty

Best-Subset Selection in Generalized Linear Models: A Fast and Consistent Algorithm via Splicing Technique

In high-dimensional generalized linear models, it is crucial to identify a sparse model that adequately accounts for response variation. Although the best subset selection has been widely regarded as the Holy Grail of problems of this type,…

Machine Learning · Statistics 2023-08-02 Junxian Zhu, Jin Zhu, Borui Tang, Xuanyu Chen, Hongmei Lin, Xueqin Wang

Effective Sampling: Fast Segmentation Using Robust Geometric Model Fitting

Identifying the underlying models in a set of data points contaminated by noise and outliers leads to a highly complex multi-model fitting problem. This problem can be posed as a clustering problem by the projection of higher order…

Computer Vision and Pattern Recognition · Computer Science 2018-08-01 Ruwan Tennakoon, Alireza Sadri, Reza Hoseinnezhad, Alireza Bab-Hadiashar

Are screening methods useful in feature selection? An empirical study

Filter or screening methods are often used as a preprocessing step for reducing the number of variables used by a learning algorithm in obtaining a classification or regression model. While there are many such filter methods, there is a…

Machine Learning · Statistics 2019-09-13 Mingyuan Wang, Adrian Barbu

Orthogonal Subsampling for Big Data Linear Regression

The dramatic growth of big datasets presents a new challenge to data storage and analysis. Data reduction, or subsampling, that extracts useful information from datasets is a crucial step in big data analysis. We propose an orthogonal…

Methodology · Statistics 2021-06-01 Lin Wang, Jake Elmstedt, Weng Kee Wong, Hongquan Xu

High-dimensional variable selection with heterogeneous signals: A precise asymptotic perspective

We study the problem of exact support recovery for high-dimensional sparse linear regression under independent Gaussian design when the signals are weak, rare, and possibly heterogeneous. Under a suitable scaling of the sample size and…

Statistics Theory · Mathematics 2023-07-19 Saptarshi Roy, Ambuj Tewari, Ziwei Zhu

Multi-Objective Hyperparameter Tuning and Feature Selection using Filter Ensembles

Both feature selection and hyperparameter tuning are key tasks in machine learning. Hyperparameter tuning is often useful to increase model performance, while feature selection is undertaken to attain sparse models. Sparsity may yield…

Machine Learning · Statistics 2020-02-14 Martin Binder, Julia Moosbauer, Janek Thomas, Bernd Bischl

Subset Selection with Shrinkage: Sparse Linear Modeling when the SNR is low

We study a seemingly unexpected and relatively less understood overfitting aspect of a fundamental tool in sparse linear modeling - best subset selection, which minimizes the residual sum of squares subject to a constraint on the number of…

Methodology · Statistics 2022-01-11 Rahul Mazumder, Peter Radchenko, Antoine Dedieu

Parameter Selection Algorithm For Continuous Variables

In this article, we propose a new algorithm for supervised learning methods, by which one can both capture the non-linearity in data and also find the best subset model. To produce an enhanced subset of the original variables, an ideal…

Applications · Statistics 2017-01-23 Peyman Tavallali, Marianne Razavi, Sean Brady

Learning Mixtures of Linear Classifiers

We consider a discriminative learning (regression) problem, whereby the regression function is a convex combination of k linear classifiers. Existing approaches are based on the EM algorithm, or similar techniques, without provable…

Machine Learning · Computer Science 2014-08-01 Yuekai Sun, Stratis Ioannidis, Andrea Montanari

Collaborative Filtering via High-Dimensional Regression

While the SLIM approach obtained high ranking-accuracy in many experiments in the literature, it is also known for its high computational cost of learning its parameters from data. For this reason, we focus in this paper on variants of…

Information Retrieval · Computer Science 2019-05-01 Harald Steck

Fast Screening Rules for Optimal Design via Quadratic Lasso Reformulation

The problems of Lasso regression and optimal design of experiments share a critical property: their optimal solutions are typically sparse, i.e., only a small fraction of the optimal variables are non-zero. Therefore, the…

Methodology · Statistics 2023-12-07 Guillaume Sagnol, Luc Pronzato