Related papers: An Improved Approximation Algorithm for the Column…
We prove that for any real-valued matrix $X \in \R^{m \times n}$, and positive integers $r \ge k$, there is a subset of $r$ columns of $X$ such that projecting $X$ onto their span gives a $\sqrt{\frac{r+1}{r-k+1}}$-approximation to best…
We consider a variety of criteria for selecting k representative columns from a real mxn matrix A, when sufficiently few columns are required, i.e., 1<= k<= min{rank(A), m/3}. The criteria include the following optimization problems:…
Subset selection for the rank $k$ approximation of an $n\times d$ matrix $A$ offers improvements in the interpretability of matrices, as well as a variety of computational savings. This problem is well-understood when the error measure is…
We study the low rank approximation problem of any given matrix $A$ over $\mathbb{R}^{n\times m}$ and $\mathbb{C}^{n\times m}$ in entry-wise $\ell_p$ loss, that is, finding a rank-$k$ matrix $X$ such that $\|A-X\|_p$ is minimized. Unlike…
A CUR approximation of a matrix $A$ is a particular type of low-rank approximation $A \approx C U R$, where $C$ and $R$ consist of columns and rows of $A$, respectively. One way to obtain such an approximation is to apply column subset…
We study subset selection for matrices defined as follows: given a matrix $\matX \in \R^{n \times m}$ ($m > n$) and an oversampling parameter $k$ ($n \le k \le m$), select a subset of $k$ columns from $\matX$ such that the pseudo-inverse of…
We study the problem of entrywise $\ell_1$ low rank approximation. We give the first polynomial time column subset selection-based $\ell_1$ low rank approximation algorithm sampling $\tilde{O}(k)$ columns and achieving an…
We study the column subset selection problem with respect to the entrywise $\ell_1$-norm loss. It is known that in the worst case, to obtain a good rank-$k$ approximation to a matrix, one needs an arbitrarily large $n^{\Omega(1)}$ number of…
We address the subset selection problem for matrices, where the goal is to select a subset of $k$ columns from a "short-and-fat" matrix $X \in \mathbb{R}^{m \times n}$, such that the pseudoinverse of the sampled submatrix has as small…
We give the first single-pass streaming algorithm for Column Subset Selection with respect to the entrywise $\ell_p$-norm with $1 \leq p < 2$. We study the $\ell_p$ norm loss since it is often considered more robust to noise than the…
We consider the problem of matrix column subset selection, which selects a subset of columns from an input matrix such that the input can be well approximated by the span of the selected columns. Column subset selection has been applied to…
The problem of column subset selection asks for a subset of columns from an input matrix such that the matrix can be reconstructed as accurately as possible within the span of the selected columns. A natural extension is to consider a…
We propose a continuous optimization algorithm for the Column Subset Selection Problem (CSSP) and Nystr\"om approximation. The CSSP and Nystr\"om method construct low-rank approximations of matrices based on a predetermined subset of…
This paper investigates the spectral norm version of the column subset selection problem. Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a positive integer $k\leq\text{rank}(\mathbf{A})$, the objective is to select exactly $k$…
Given a fixed matrix, the problem of column subset selection requests a column submatrix that has favorable spectral properties. Most research from the algorithms and numerical linear algebra communities focuses on a variant called…
Dimensionality reduction is a first step of many machine learning pipelines. Two popular approaches are principal component analysis, which projects onto a small number of well chosen but non-interpretable directions, and feature selection,…
Selecting a good column (or row) subset of massive data matrices has found many applications in data analysis and machine learning. We propose a new adaptive sampling algorithm that can be used to improve any relative-error column selection…
Subset selection for matrices is the task of extracting a column sub-matrix from a given matrix $B\in\mathbb{R}^{n\times m}$ with $m>n$ such that the pseudoinverse of the sampled matrix has as small Frobenius or spectral norm as possible.…
We derive a new adaptive leverage score sampling strategy for solving the Column Subset Selection Problem (CSSP). The resulting algorithm, called Adaptive Randomized Pivoting, can be viewed as a randomization of Osinsky's recently proposed…
Let $M$ be a real $r\times c$ matrix and let $k$ be a positive integer. In the column subset selection problem (CSSP), we need to minimize the quantity $\|M-SA\|$, where $A$ can be an arbitrary $k\times c$ matrix, and $S$ runs over all…