Related papers: Column Subset Selection, Matrix Factorization, and…
The problem of extracting a well conditioned submatrix from any rectangular matrix (with normalized columns) has been studied for some time in functional and harmonic analysis; see…
In this work, we analyze a sublinear-time algorithm for selecting a few rows and columns of a matrix for low-rank approximation purposes. The algorithm is based on an initial uniformly random selection of rows and columns, followed by a…
We consider the problem of matrix column subset selection, which selects a subset of columns from an input matrix such that the input can be well approximated by the span of the selected columns. Column subset selection has been applied to…
The problem of column subset selection asks for a subset of columns from an input matrix such that the matrix can be reconstructed as accurately as possible within the span of the selected columns. A natural extension is to consider a…
Low-rank approximation and column subset selection are two fundamental and related problems that are applied across a wealth of machine learning applications. In this paper, we study the question of socially fair low-rank approximation and…
The problem of column subset selection has recently attracted a large body of research, with feature selection serving as one obvious and important application. Among the techniques that have been applied to solve this problem, the greedy…
In this paper, we introduce an efficient algorithm for column subset selection that combines the column-pivoted QR factorization with sparse subspace embeddings. The proposed method, SE-QRSC, is particularly effective for wide matrices with…
We address the subset selection problem for matrices, where the goal is to select a subset of $k$ columns from a "short-and-fat" matrix $X \in \mathbb{R}^{m \times n}$, such that the pseudoinverse of the sampled submatrix has as small…
Column selection is an essential tool for structure-preserving low-rank approximation, with wide-ranging applications across many fields, such as data science, machine learning, and theoretical chemistry. In this work, we develop unified…
The Column Subset Selection Problem provides a natural framework for unsupervised feature selection. Despite being a hard combinatorial optimization problem, there exist efficient algorithms that provide good approximations. The drawback of…
A CUR approximation of a matrix $A$ is a particular type of low-rank approximation $A \approx C U R$, where $C$ and $R$ consist of columns and rows of $A$, respectively. One way to obtain such an approximation is to apply column subset…
We present a matrix-factorization algorithm that scales to input matrices with a huge number of both rows and columns. Learned factors may be sparse or dense and/or non-negative, which makes our algorithm suitable for dictionary learning,…
We discuss a randomized strong rank-revealing QR factorization that effectively reveals the spectrum of a matrix $\textbf{M}$. This factorization can be used to address problems such as selecting a subset of the columns of $\textbf{M}$,…
Subset selection for matrices is the task of extracting a column sub-matrix from a given matrix $B\in\mathbb{R}^{n\times m}$ with $m>n$ such that the pseudoinverse of the sampled matrix has as small Frobenius or spectral norm as possible.…
We propose a continuous optimization algorithm for the Column Subset Selection Problem (CSSP) and Nyström approximation. The CSSP and Nyström method construct low-rank approximations of matrices based on a predetermined subset of…
We propose a randomized method for solving linear programs with a large number of columns but a relatively small number of constraints. Since enumerating all the columns is usually unrealistic, such linear programs are commonly solved by…
Nonnegative matrix factorization arises widely in machine learning and data analysis. In this paper, for a given factorization of rank r, we consider the sparse stochastic matrix factorization (SSMF) of decomposing a prescribed m-by-n…
We consider the problem of selecting the best subset of exactly $k$ columns from an $m \times n$ matrix $A$. We present and analyze a novel two-stage algorithm that runs in $O(\min\{mn^2,m^2n\})$ time and returns as output an $m \times k$…
This paper defines a generalized column subset selection problem which is concerned with the selection of a few columns from a source matrix A that best approximate the span of a target matrix B. The paper then proposes a fast greedy…
Dimensionality reduction is a first step of many machine learning pipelines. Two popular approaches are principal component analysis, which projects onto a small number of well chosen but non-interpretable directions, and feature selection,…
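
Several of the abstracts above state the same objective: choose $k$ columns $C$ of a matrix $A$ so that $A$ is well approximated within the span of $C$, i.e. so that $\|A - C C^{+} A\|_F$ is small. As a minimal illustration only, the sketch below uses a simple greedy pivoting rule (repeatedly pick the column with the largest residual norm, then project it out), which is the pivot order of a column-pivoted Gram-Schmidt/QR process. It is not the algorithm of any paper listed here; the function names `greedy_column_subset` and `projection_error` and the tolerance `tol` are hypothetical choices made for this example.

```python
import numpy as np


def greedy_column_subset(A, k, tol=1e-12):
    """Return up to k column indices of A chosen by greedy pivoting.

    At each step the column with the largest residual norm is selected and
    its direction is projected out of all columns. This is an illustrative
    heuristic, not a method taken from the listed papers.
    """
    R = np.array(A, dtype=float)  # working copy holding the residual
    selected = []
    for _ in range(k):
        sq_norms = np.sum(R * R, axis=0)  # squared residual norm per column
        j = int(np.argmax(sq_norms))
        if sq_norms[j] <= tol:  # remaining columns are numerically spanned
            break
        selected.append(j)
        q = R[:, j] / np.sqrt(sq_norms[j])  # unit vector along chosen column
        R -= np.outer(q, q @ R)  # remove that direction from every column
    return selected


def projection_error(A, cols):
    """Frobenius norm of A minus its projection onto span(A[:, cols])."""
    C = A[:, cols]
    return np.linalg.norm(A - C @ np.linalg.pinv(C) @ A, "fro")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A rank-2 matrix with 5 columns: two well-chosen columns span all of it.
    A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))
    cols = greedy_column_subset(A, 2)
    print("selected columns:", cols)
    print("projection error:", projection_error(A, cols))
```

The greedy rule is cheap and easy to follow, but it carries none of the approximation guarantees that the works above analyze; it is meant only to make the objective concrete.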