Related papers: Relative Error Streaming Quantiles
Computing the approximate quantiles or ranks of a stream is a fundamental task in data monitoring. Given a stream of elements $x_1, x_2, \dots, x_n$ and a query $x$, a relative-error quantile estimation algorithm can estimate the rank of…
This paper resolves one of the longest standing basic problems in the streaming computational model. Namely, optimal construction of quantile sketches. An $\varepsilon$ approximate quantile sketch receives a stream of items $x_1,\ldots,x_n$…
Approximating quantiles and distributions over streaming data has been studied for roughly two decades now. Recently, Karnin, Lang, and Liberty proposed the first asymptotically optimal algorithm for doing so. This manuscript complements…
We develop a new algorithmic technique that allows to transfer some constant time approximation algorithms for general graphs into random order streaming algorithms. We illustrate our technique by proving that in random order streams with…
An $\varepsilon$-approximate quantile sketch over a stream of $n$ inputs approximates the rank of any query point $q$ - that is, the number of input points less than $q$ - up to an additive error of $\varepsilon n$, generally with some…
We consider streaming algorithms for approximating a product of input probabilities up to multiplicative error of $1-\epsilon$. It is shown that every randomized streaming algorithm for this problem needs space $\Omega(\log n + \log b -…
We initiate a broad study of classical problems in the streaming model with insertions and deletions in the setting where we allow the approximation factor $\alpha$ to be much larger than $1$. Such algorithms can use significantly less…
The distinct elements problem is one of the fundamental problems in streaming algorithms --- given a stream of integers in the range $\{1,\ldots,n\}$, we wish to provide a $(1+\varepsilon)$ approximation to the number of distinct elements…
Estimating quantiles is one of the foundational problems of data sketching. Given $n$ elements $x_1, x_2, \dots, x_n$ from some universe of size $U$ arriving in a data stream, a quantile sketch estimates the rank of any element with…
We study the classic NP-Hard problem of finding the maximum $k$-set coverage in the data stream model: given a set system of $m$ sets that are subsets of a universe $\{1,\ldots,n \}$, find the $k$ sets that cover the most number of distinct…
We consider the problem of estimating the value of max cut in a graph in the streaming model of computation. At one extreme, there is a trivial $2$-approximation for this problem that uses only $O(\log n)$ space, namely, count the number of…
We study learning-augmented streaming algorithms for estimating the value of MAX-CUT in a graph. In the classical streaming model, while a $1/2$-approximation for estimating the value of MAX-CUT can be trivially achieved with $O(1)$ words…
We initiate the study of sub-linear sketching and streaming techniques for estimating the output size of common dictionary compressors such as Lempel-Ziv '77, the run-length Burrows-Wheeler transform, and grammar compression. To this end,…
Space efficient algorithms play a central role in dealing with large amount of data. In such settings, one would like to analyse the large data using small amount of "working space". One of the key steps in many algorithms for analysing…
We resolve the space complexity of linear sketches for approximating the maximum matching problem in dynamic graph streams where the stream may include both edge insertion and deletion. Specifically, we show that for any $\epsilon > 0$,…
The majority of streaming problems are defined and analyzed in a static setting, where the data stream is any worst-case sequence of insertions and deletions that is fixed in advance. However, many real-world applications require a more…
Estimating the quantiles of a large dataset is a fundamental problem in both the streaming algorithms literature and the differential privacy literature. However, all existing private mechanisms for distribution-independent quantile…
In this paper, we present a new algorithm for maintaining linear sketches in turnstile streams with faster update time. As an application, we show that $\log n$ \texttt{Count} sketches or \texttt{CountMin} sketches with a constant number of…
Quantile summaries provide a scalable way to estimate the distribution of individual attributes in large datasets that are often distributed across multiple machines or generated by sensor networks. ReqSketch (arXiv:2004.01668) is currently…
Given a finite set of points $P \subseteq \mathbb{R}^d$, we would like to find a small subset $S \subseteq P$ such that the convex hull of $S$ approximately contains $P$. More formally, every point in $P$ is within distance $\epsilon$ from…