Related papers: Streaming algorithms for groups and semigroups
A pseudo-deterministic algorithm is a (randomized) algorithm which, when run multiple times on the same input, with high probability outputs the same result on all executions. Classic streaming algorithms, such as those for finding heavy…
We study the complexity of the following problems in the streaming model. Membership testing for \DLIN We show that every language in \DLIN\ can be recognised by a randomized one-pass $O(\log n)$ space algorithm with inverse polynomial…
Many problems on data streams have been studied at two extremes of difficulty: either allowing randomized algorithms, in the static setting (where they should err with bounded probability on the worst case stream); or when only…
We consider streaming algorithms for approximating a product of input probabilities up to multiplicative error of $1-\epsilon$. It is shown that every randomized streaming algorithm for this problem needs space $\Omega(\log n + \log b -…
We propose a streaming algorithm for the binary classification of data based on crowdsourcing. The algorithm learns the competence of each labeller by comparing her labels to those of other labellers on the same tasks and uses this…
Many streaming algorithms provide only a high-probability relative approximation. These two relaxations, of allowing approximation and randomization, seem necessary -- for many streaming problems, both relaxations must be employed…
We consider the independent set problem in the semi-streaming model. For any input graph $G=(V, E)$ with $n$ vertices, an independent set is a set of vertices with no edges between any two elements. In the semi-streaming model, $G$ is…
A longstanding observation, which was partially proven in \cite{LNW14,AHLW16}, is that any turnstile streaming algorithm can be implemented as a linear sketch (the reverse is trivially true). We study the relationship between turnstile…
In this paper, we study streaming and online algorithms in the context of randomness in the input. For several problems, a random order of the input sequence---as opposed to the worst-case order---appears to be a necessary evil in order to…
Diversity maximization is a fundamental problem with wide applications in data summarization, web search, and recommender systems. Given a set $X$ of $n$ elements, it asks to select a subset $S$ of $k \ll n$ elements with maximum…
In this work, we present a combinatorial, deterministic single-pass streaming algorithm for the problem of maximizing a submodular function, not necessarily monotone, with respect to a cardinality constraint (SMCC). In the case the function…
Adversarially robust streaming algorithms are required to process a stream of elements and produce correct outputs, even when each stream element can be chosen as a function of earlier algorithm outputs. As with classic streaming…
We study the problem of extracting a small subset of representative items from a large data stream. In many data mining and machine learning applications such as social network analysis and recommender systems, this problem can be…
Many well-known, real-world problems involve dynamic data which describe the relationship among the entities. Hypergraphs are powerful combinatorial structures that are frequently used to model such data. For many of today's data-centric…
We investigate the adversarial robustness of streaming algorithms. In this context, an algorithm is considered robust if its performance guarantees hold even if the stream is chosen adaptively by an adversary that observes the outputs of…
The discrete logarithm problem in a finite group is the basis for many protocols in cryptography. The best general algorithms which solve this problem have time complexity of $\mathcal{O}(\sqrt{N}\log N)$, and a space complexity of…
We study streaming algorithms in the white-box adversarial model, where the stream is chosen adaptively by an adversary who observes the entire internal state of the algorithm at each time step. We show that nontrivial algorithms are still…
In the fields of big data, AI, and streaming processing, we work with large amounts of data from multiple sources. Due to memory and network limitations, we process data streams on distributed systems to alleviate computational and network…
We consider the problem of monotone, submodular maximization over a ground set of size $n$ subject to cardinality constraint $k$. For this problem, we introduce the first deterministic algorithms with linear time complexity; these…
Given a data stream $\mathcal{A} = \langle a_1, a_2, \ldots, a_m \rangle$ of $m$ elements where each $a_i \in [n]$, the Distinct Elements problem is to estimate the number of distinct elements in $\mathcal{A}$.Distinct Elements has been a…