Related papers: Better Bounds for Frequency Moments in Random-Orde…
We revisit one of the classic problems in the data stream literature, namely, that of estimating the frequency moments $F_p$ for $0 < p < 2$ of an underlying $n$-dimensional vector presented as a sequence of additive updates in a stream. It…
Frequency estimation in data streams is one of the classical problems in streaming algorithms. Following much research, there are now almost matching upper and lower bounds for the trade-off needed between the number of samples and the…
In this paper we consider the problem of approximating frequency moments in the streaming model. Given a stream $D = \{p_1,p_2,\dots,p_m\}$ of numbers from $\{1,\dots, n\}$, a frequency of $i$ is defined as $f_i = |\{j: p_j = i\}|$. The…
We consider the \textsf{Unit Interval Selection} problem in the one-pass random order streaming model. Here, an algorithm is presented a sequence of $n$ unit-length intervals on the line that arrive in uniform random order, and the…
We show an improved lower bound for the Fp estimation problem in a data stream setting for p>2. A data stream is a sequence of items from the domain [n] with possible repetitions. The frequency vector x is an n-dimensional non-negative…
We present a novel approach for the problem of frequency estimation in data streams that is based on optimization and machine learning. Contrary to state-of-the-art streaming frequency estimation algorithms, which heavily rely on random…
Mining frequent itemsets through static Databases has been extensively studied and used and is always considered a highly challenging task. For this reason it is interesting to extend it to data streams field. In the streaming case, the…
Machine learning from data streams is an active and growing research area. Research on learning from streaming data typically makes strict assumptions linked to computational resource constraints, including requirements for stream mining…
Estimating the first moment of a data stream defined as $F_1 = \sum_{i \in \{1, 2, \ldots, n\}} \abs{f_i}$ to within $1 \pm \epsilon$-relative error with high probability is a basic and influential problem in data stream processing. A tight…
We study the classical problem of moment estimation of an underlying vector whose $n$ coordinates are implicitly defined through a series of updates in a data stream. We show that if the updates to the vector arrive in the random-order…
Estimating the second frequency moment of a stream up to $(1\pm\varepsilon)$ multiplicative error requires at most $O(\log n / \varepsilon^2)$ bits of space, due to a seminal result of Alon, Matias, and Szegedy. It is also known that at…
We study which property testing and sublinear time algorithms can be transformed into graph streaming algorithms for random order streams. Our main result is that for bounded degree graphs, any property that is constant-query testable in…
A central problem in data streams is to characterize which functions of an underlying frequency vector can be approximated efficiently. Recently there has been considerable effort in extending this problem to that of estimating functions of…
We obtain the best possible upper bounds for the moments of a single order statistic from independent, non-negative random variables, in terms of the population mean. The main result covers the independent identically distributed case.…
One of the oldest problems in the data stream model is to approximate the $p$-th moment $\|\mathcal{X}\|_p^p = \sum_{i=1}^n |\mathcal{X}_i|^p$ of an underlying vector $\mathcal{X} \in \mathbb{R}^n$, which is presented as a sequence of…
The bounds for absolute moments of order statistics are established. Let $X_1,\dots ,X_n$ be independent identically distributed real-valued random variables and let $X_{1:n}\le \dots \le X_{n:n}$ be the corresponding order statistics. The…
We introduce a new notion of information complexity for multi-pass streaming problems and use it to resolve several important questions in data streams. In the coin problem, one sees a stream of $n$ i.i.d. uniform bits and one would like to…
Estimating the second frequency moment $F_2$ of a data stream up to a $(1 \pm \varepsilon)$ factor is a central problem in the streaming literature. For errors $\varepsilon > \Omega(1/\sqrt{n})$, the tight bound…
We consider the problem of learning over non-stationary ranking streams. The rankings can be interpreted as the preferences of a population and the non-stationarity means that the distribution of preferences changes over time. Our goal is…
We use fluid limits to explore the (in)stability properties of wireless networks with queue-based random-access algorithms. Queue-based random-access schemes are simple and inherently distributed in nature, yet provide the capability to…