Related papers: Fair Column Subset Selection

200 papers

Low-rank approximation and column subset selection are two fundamental and related problems that are applied across a wealth of machine learning applications. In this paper, we study the question of socially fair low-rank approximation and…

Machine Learning · Computer Science 2024-12-10 Zhao Song, Ali Vakilian, David P. Woodruff, Samson Zhou

We consider the problem of matrix column subset selection, which selects a subset of columns from an input matrix such that the input can be well approximated by the span of the selected columns. Column subset selection has been applied to…

Machine Learning · Statistics 2018-01-26 Yining Wang, Aarti Singh

The problem of column subset selection has recently attracted a large body of research, with feature selection serving as one obvious and important application. Among the techniques that have been applied to solve this problem, the greedy…

Data Structures and Algorithms · Computer Science 2021-11-16 Jason Altschuler, Aditya Bhaskara, Gang Fu, Vahab Mirrokni, Afshin Rostamizadeh, Morteza Zadimoghaddam

In this paper, we introduce an efficient algorithm for column subset selection that combines the column-pivoted QR factorization with sparse subspace embeddings. The proposed method, SE-QRSC, is particularly effective for wide matrices with…

Numerical Analysis · Mathematics 2025-09-05 Israa Fakih, Laura Grigori

Given a fixed matrix, the problem of column subset selection requests a column submatrix that has favorable spectral properties. Most research from the algorithms and numerical linear algebra communities focuses on a variant called…

Numerical Analysis · Mathematics 2014-04-29 Joel A. Tropp

In this paper, we consider the problem of column subset selection. We present a novel analysis of the spectral norm reconstruction for a simple randomized algorithm and establish a new bound that depends explicitly on the sampling…

Numerical Analysis · Mathematics 2015-05-05 Tianbao Yang, Lijun Zhang, Rong Jin, Shenghuo Zhu

Selecting a good column (or row) subset of massive data matrices has found many applications in data analysis and machine learning. We propose a new adaptive sampling algorithm that can be used to improve any relative-error column selection…

Data Structures and Algorithms · Computer Science 2015-10-15 Saurabh Paul, Malik Magdon-Ismail, Petros Drineas

We propose a randomized method for solving linear programs with a large number of columns but a relatively small number of constraints. Since enumerating all the columns is usually unrealistic, such linear programs are commonly solved by…

Optimization and Control · Mathematics 2023-11-29 Yi-Chun Akchen, Velibor V. Mišić

We address the subset selection problem for matrices, where the goal is to select a subset of $k$ columns from a "short-and-fat" matrix $X \in \mathbb{R}^{m \times n}$, such that the pseudoinverse of the sampled submatrix has as small…

Numerical Analysis · Mathematics 2025-07-29 Ivan Kozyrev, Alexander Osinsky

In this work, we analyze a sublinear-time algorithm for selecting a few rows and columns of a matrix for low-rank approximation purposes. The algorithm is based on an initial uniformly random selection of rows and columns, followed by a…

Numerical Analysis · Mathematics 2024-02-22 Alice Cortinovis, Lexing Ying

Column selection is an essential tool for structure-preserving low-rank approximation, with wide-ranging applications across many fields, such as data science, machine learning, and theoretical chemistry. In this work, we develop unified…

Numerical Analysis · Mathematics 2024-08-09 Mark Fornace, Michael Lindsey

The Column Subset Selection Problem provides a natural framework for unsupervised feature selection. Despite being a hard combinatorial optimization problem, there exist efficient algorithms that provide good approximations. The drawback of…

Machine Learning · Computer Science 2018-04-13 Bruno Ordozgoiti, Alberto Mozo, Jesús García López de Lacalle

We study the low rank approximation problem of any given matrix $A$ over $\mathbb{R}^{n\times m}$ and $\mathbb{C}^{n\times m}$ in entry-wise $\ell_p$ loss, that is, finding a rank-$k$ matrix $X$ such that $\|A-X\|_p$ is minimized. Unlike…

Machine Learning · Computer Science 2019-10-31 Chen Dan, Hong Wang, Hongyang Zhang, Yuchen Zhou, Pradeep Ravikumar

We introduce a new rule-based optimization method for classification with constraints. The proposed method leverages column generation for linear programming, and hence, is scalable to large datasets. The resulting pricing subproblem is…

Machine Learning · Computer Science 2025-02-07 Tabea E. Röber, Adia C. Lumadjeng, M. Hakan Akyüz, Ş. İlker Birbil

This paper defines a generalized column subset selection problem which is concerned with the selection of a few columns from a source matrix A that best approximate the span of a target matrix B. The paper then proposes a fast greedy…

Data Structures and Algorithms · Computer Science 2013-12-25 Ahmed K. Farahat, Ali Ghodsi, Mohamed S. Kamel

We consider the problem of selecting the best subset of exactly $k$ columns from an $m \times n$ matrix $A$. We present and analyze a novel two-stage algorithm that runs in $O(\min\{mn^2,m^2n\})$ time and returns as output an $m \times k$…

Data Structures and Algorithms · Computer Science 2015-03-13 Christos Boutsidis, Michael W. Mahoney, Petros Drineas

Best subset selection in linear regression is well known to be nonconvex and computationally challenging to solve, as the number of possible subsets grows rapidly with increasing dimensionality of the problem. As a result, finding the…

Machine Learning · Statistics 2025-04-01 Vikram Singh, Min Sun

We consider a variety of criteria for selecting $k$ representative columns from a real $m \times n$ matrix $A$, when sufficiently few columns are required, i.e., $1 \le k \le \min\{\operatorname{rank}(A), m/3\}$. The criteria include the following optimization problems:…

Numerical Analysis · Mathematics 2026-04-13 Ilse C. F. Ipsen, Arvind K. Saibaba

Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time…

Data Structures and Algorithms · Computer Science 2014-08-22 Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, Aaron Sidford

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that trade off fitting error (explanatory power) and model complexity (number of variables selected). We build mathematical programming…

Machine Learning · Statistics 2020-09-04 Young Woong Park, Diego Klabjan