Related papers: Near-Optimal Entrywise Sampling for Data Matrices
Many real-world data sets are sparse or almost sparse. One method to measure this for a matrix $A\in \mathbb{R}^{n\times n}$ is the \emph{numerical sparsity}, denoted $\mathsf{ns}(A)$, defined as the minimum $k\geq 1$ such that…
We adapt a well-known streaming algorithm for approximating item frequencies to the matrix sketching setting. The algorithm receives the rows of a large matrix $A \in \R^{n \times m}$ one after the other in a streaming fashion. It maintains…
Matrix sketching is a powerful tool for reducing the size of large data matrices. Yet there are fundamental limitations to this size reduction when we want to recover an accurate estimator for a task such as least-squares regression. We show…
We provide the first streaming algorithm for computing a provable approximation to the $k$-means of sparse Big Data. Here, sparse Big Data is a set of $n$ vectors in $\mathbb{R}^d$, where each vector has $O(1)$ nonzero entries, and…
This paper argues that randomized linear sketching is a natural tool for on-the-fly compression of data matrices that arise from large-scale scientific simulations and data collection. The technical contribution consists in a new algorithm…
This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch. These methods can preserve structural properties of the input matrix, such as…
Tracking and approximating data matrices in streaming fashion is a fundamental challenge. The problem requires more care and attention when data comes from multiple distributed sites, each receiving a stream of data. This paper considers…
We study the streaming model for approximate matrix multiplication (AMM). We are interested in the scenario that the algorithm can only take one pass over the data with limited memory. The state-of-the-art deterministic sketching algorithm…
In this paper, we design sub-linear space streaming algorithms for estimating three fundamental parameters -- maximum independent set, minimum dominating set and maximum matching -- on sparse graph classes, i.e., graphs which satisfy…
Matrices arising in scientific applications frequently admit linear low-rank approximations due to smoothness in the physical and/or temporal domain of the problem. In large-scale problems, computing an optimal low-rank approximation can be…
We study how well one can recover sparse principal components of a data matrix using a sketch formed from a few of its elements. We show that for a wide class of optimization problems, if the sketch is close (in the spectral norm) to the…
This survey highlights the recent advances in algorithms for numerical linear algebra that have come from the technique of linear sketching, whereby given a matrix, one first compresses it to a much smaller matrix by multiplying it by a…
Motivated by a sampling problem basic to computational statistical inference, we develop a nearly optimal algorithm for a fundamental problem in spectral graph theory and numerical analysis. Given an $n\times n$ SDDM matrix ${\bf…
We present a new algorithm for finding a near optimal low-rank approximation of a matrix $A$ in $O(nnz(A))$ time. Our method is based on a recursive sampling scheme for computing a representative subset of $A$'s columns, which is then used…
Analyzing massive data sets has been one of the key motivations for studying streaming algorithms. In recent years, there has been significant progress in analyzing distributions in a streaming setting, but the progress on graph problems…
Low-rank approximation in data streams is a fundamental and significant task in computing science, machine learning and statistics. Multiple streaming algorithms have emerged over the years and most of them are inspired by randomized…
We present an approximate algorithm for matrix multiplication based on matrix sketching techniques. First, one of the matrices is chosen and sparsified using the online matrix sketching algorithm, and then the matrix product is calculated…
We study the problem of solving semidefinite programs (SDP) in the streaming model. Specifically, $m$ constraint matrices and a target matrix $C$, all of size $n\times n$ together with a vector $b\in \mathbb{R}^m$ are streamed to us…
We give two algorithms for output-sparse matrix multiplication (OSMM), the problem of multiplying two $n \times n$ matrices $A, B$ when their product $AB$ is promised to have at most $O(n^{\delta})$ many non-zero entries for a given value…
In recent years, a variety of randomized constructions of sketching matrices have been devised and used in fast algorithms for numerical linear algebra problems, such as least squares regression, low-rank approximation, and the…
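Several of the abstracts above rely on the same basic idea: compress a tall matrix by multiplying it with a random sketching matrix, then solve the smaller problem. The snippet below is a minimal sketch of that idea for least squares regression, assuming a dense Gaussian sketching matrix. It is not taken from any of the listed papers; the function name `sketch_and_solve`, the sketch size, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def sketch_and_solve(A, b, sketch_rows, seed=0):
    """Approximate argmin_x ||Ax - b||_2 by solving a sketched problem.

    Assumption: S is a dense Gaussian sketch, chosen here for simplicity.
    The listed papers study other (often sparser or faster) constructions.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Scale so that E[S^T S] = I, which preserves norms in expectation.
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    # Solve the compressed problem min_x ||(SA)x - Sb||_2.
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

# Synthetic tall problem: 2000 rows, 5 columns (hypothetical data).
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 5))
b = A @ np.arange(1.0, 6.0) + 0.1 * rng.standard_normal(2000)

x_sketch = sketch_and_solve(A, b, sketch_rows=200)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# Compare residuals of the sketched and the exact solution.
res_sketch = np.linalg.norm(A @ x_sketch - b)
res_exact = np.linalg.norm(A @ x_exact - b)
print(res_sketch / res_exact)
```

The printed ratio is at least 1, because the exact solution minimizes the residual. How close it gets to 1 depends on the number of sketch rows relative to the number of columns.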