Related papers: Efficient Algorithm for Deterministic Search of Ho…
The distinct elements problem is one of the fundamental problems in streaming algorithms: given a stream of integers in the range $\{1,\ldots,n\}$, we wish to provide a $(1+\varepsilon)$ approximation to the number of distinct elements…
Detecting frequent elements is among the oldest and most-studied problems in the area of data streams. Given a stream of $m$ data items in $\{1, 2, \dots, n\}$, the objective is to output items that appear at least $d$ times, for some…
Many streaming algorithms provide only a high-probability relative approximation. These two relaxations, of allowing approximation and randomization, seem necessary: for many streaming problems, both relaxations must be employed…
Many problems on data streams have been studied at two extremes of difficulty: either allowing randomized algorithms, in the static setting (where they should err with bounded probability on the worst case stream); or when only…
Frequency estimation of elements is an important task for summarizing data streams and machine learning applications. The problem is often addressed by using streaming algorithms with sublinear space data structures. These algorithms allow…
We give the first optimal bounds for returning the $\ell_1$-heavy hitters in a data stream of insertions, together with their approximate frequencies, closing a long line of work on this problem. For a stream of $m$ items in $\{1, 2, \dots,…
We devise a deterministic algorithm for minimum Steiner cut, which uses $(\log n)^{O(1)}$ maximum flow calls and additional near-linear time. This algorithm improves on Li and Panigrahi's (FOCS 2020) algorithm, which uses $(\log…
We present a novel approach for the problem of frequency estimation in data streams that is based on optimization and machine learning. Contrary to state-of-the-art streaming frequency estimation algorithms, which heavily rely on random…
The maximum coverage problem is to select $k$ sets from a collection of sets such that the cardinality of the union of the selected sets is maximized. We consider $(1-1/e-\epsilon)$-approximation algorithms for this NP-hard problem in three…
We consider the problem of monotone, submodular maximization over a ground set of size $n$ subject to cardinality constraint $k$. For this problem, we introduce the first deterministic algorithms with linear time complexity; these…
In this work, we study the problem of finding the maximum value of a non-negative submodular function subject to a limit on the number of items selected, a ubiquitous problem that appears in many applications, such as data summarization and…
Finding dense subgraphs is a fundamental algorithmic tool in data mining, community detection, and clustering. In this problem, one aims to find an induced subgraph whose edge-to-vertex ratio is maximized. We study the directed case of this…
We consider the following fundamental routing problem. An adversary inputs packets arbitrarily at sources, each packet with an arbitrary destination. Traffic is constrained by link capacities and buffer sizes, and packets may be dropped at…
In the online sorting problem, a sequence of $n$ numbers in $[0, 1]$ (including $\{0,1\}$) has to be inserted in an array of size $m \ge n$ so as to minimize the sum of absolute differences between pairs of numbers occupying consecutive…
This paper studies the classic problem of finding heavy hitters in the turnstile streaming model. We give the first deterministic linear sketch that has $O(\epsilon^{-2} \log n \cdot \log^*(\epsilon^{-1}))$ rows and answers queries in…
We revisit one of the classic problems in the data stream literature, namely, that of estimating the frequency moments $F_p$ for $0 < p < 2$ of an underlying $n$-dimensional vector presented as a sequence of additive updates in a stream. It…
We study the problem of extracting a small subset of representative items from a large data stream. In many data mining and machine learning applications such as social network analysis and recommender systems, this problem can be…
Adversarially robust streaming algorithms are required to process a stream of elements and produce correct outputs, even when each stream element can be chosen as a function of earlier algorithm outputs. As with classic streaming…
We introduce an online version of the multiselection problem, in which $q$ selection queries are requested on an unsorted array of $n$ elements. We provide the first online algorithm that is 1-competitive with Kaligosi et al. [ICALP 2005] in…
The problem of finding locally dense components of a graph is an important primitive in data analysis, with wide-ranging applications from community mining to spam detection and the discovery of biological network modules. In this paper we…