Related papers: Uniform Sampling for Matrix Approximation
In statistics and machine learning, logistic regression is a widely-used supervised learning technique primarily employed for binary classification tasks. When the number of observations greatly exceeds the number of predictor variables, we…
Matrix completion, i.e., the exact and provable recovery of a low-rank matrix from a small subset of its elements, is currently only known to be possible if the matrix satisfies a restrictive structural constraint---known as {\em…
One popular method for dealing with large-scale data sets is sampling. For example, by using the empirical statistical leverage scores as an importance sampling distribution, the method of algorithmic leveraging samples and rescales…
We consider the related tasks of matrix completion and matrix approximation from missing data and propose adaptive sampling procedures for both problems. We show that adaptive sampling allows one to eliminate standard incoherence…
Low-rank matrix completion is an important problem with extensive real-world applications. When observations are uniformly sampled from the underlying matrix entries, existing methods all require the matrix to be incoherent. This paper…
There has been significant interest and progress recently in algorithms that solve regression problems involving tall and thin matrices in input sparsity time. These algorithms find shorter equivalent of a n*d matrix where n >> d, which…
We focus the use of \emph{row sampling} for approximating matrix algorithms. We give applications to matrix multipication; sparse matrix reconstruction; and, \math{\ell_2} regression. For a matrix \math{\matA\in\R^{m\times d}} which…
Selecting a good column (or row) subset of massive data matrices has found many applications in data analysis and machine learning. We propose a new adaptive sampling algorithm that can be used to improve any relative-error column selection…
This paper deals with the problem of robust matrix completion -- retrieving a low-rank matrix and a sparse matrix from the compressed counterpart of their superposition. Though seemingly not an unresolved issue, we point out that the…
Leverage score sampling provides an appealing way to perform approximate computations for large matrices. Indeed, it allows to derive faithful approximations with a complexity adapted to the problem at hand. Yet, performing leverage scores…
Low-rank approximation of a matrix by means of random sampling has been consistently efficient in its empirical studies by many scientists who applied it with various sparse and structured multipliers, but adequate formal support for this…
We focus on \emph{row sampling} based approximations for matrix algorithms, in particular matrix multipication, sparse matrix reconstruction, and \math{\ell_2} regression. For \math{\matA\in\R^{m\times d}} (\math{m} points in \math{d\ll m}…
We explain theoretically a curious empirical phenomenon: "Approximating a matrix by deterministically selecting a subset of its columns with the corresponding largest leverage scores results in a good low-rank matrix surrogate". To obtain…
Finding a small spectral approximation for a tall $n \times d$ matrix $A$ is a fundamental numerical primitive. For a number of reasons, one often seeks an approximation whose rows are sampled from those of $A$. Row sampling improves…
We present a new algorithm for finding a near optimal low-rank approximation of a matrix $A$ in $O(nnz(A))$ time. Our method is based on a recursive sampling scheme for computing a representative subset of $A$'s columns, which is then used…
A novel matrix approximation problem is considered herein: observations based on a few fully sampled columns and quasi-polynomial structural side information are exploited. The framework is motivated by quantum chemistry problems wherein…
Subsampling algorithms are a natural approach to reduce data size before fitting models on massive datasets. In recent years, several works have proposed methods for subsampling rows from a data matrix while maintaining relevant information…
We consider the problem of matrix column subset selection, which selects a subset of columns from an input matrix such that the input can be well approximated by the span of the selected columns. Column subset selection has been applied to…
This paper is concerned with the problem of low rank plus sparse matrix decomposition for big data. Conventional algorithms for matrix decomposition use the entire data to extract the low-rank and sparse components, and are based on…
Ridge leverage scores provide a balance between low-rank approximation and regularization, and are ubiquitous in randomized linear algebra and machine learning. Deterministic algorithms are also of interest in the moderately big data…