Related papers: Analysis of a Bloom Filter Algorithm via the Super…
We propose in this paper an on-line algorithm based on Bloom filters for identifying large flows in IP traffic (a.k.a. elephants). Because of the large number of small flows, hash tables of these algorithms have to be regularly refreshed.…
The majority of Internet traffic is caused by a relatively small number of flows (so-called elephant flows). This phenomenon can be exploited to facilitate traffic engineering: resource-costly individual flow forwarding entries can be…
Monitoring the traffic volumes of elephant flows, including the total byte count per flow, is a fundamental capability for online network measurements. We present an asymptotically optimal algorithm for solving this problem in terms of both…
A Bloom filter is a method for reducing the space (memory) required for representing a set by allowing a small error probability. In this paper we consider a \emph{Sliding Bloom Filter}: a data structure that, given a stream of elements,…
A Bloom Filter is a probabilistic data structure designed to check, rapidly and memory-efficiently, whether an element is present in a set. It has been vastly used in various computing areas and several variants, allowing deletions, dynamic…
We introduce a framework for proving lower bounds on computational problems over distributions against algorithms that can be implemented using access to a statistical query oracle. For such algorithms, access to the input distribution is…
Bloom filters are space-efficient probabilistic data structures that are used to test whether an element is a member of a set, and may return false positives. Recently, variations referred to as learned Bloom filters were developed that can…
In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions. It then predicts whether the mean of each distribution is larger or lower than a given threshold.…
Optimizing modern production plants using the job-shop principle is a known hard problem. For very large plants, like semiconductor fabs, the problem becomes unsolvable on a plant-wide scale in a reasonable amount of time using classical…
Recent work has suggested enhancing Bloom filters by using a pre-filter, based on applying machine learning to determine a function that models the data set the Bloom filter is meant to represent. Here we model such learned Bloom filters,,…
The Distributed Bloom Filter is a space-efficient, probabilistic data structure designed to perform more efficient set reconciliations in distributed systems. It guarantees eventual consistency of states between nodes in a system, while…
Optimizing the assortment of products to display to customers is a key to increasing revenue for both offline and online retailers. To trade-off between exploring customers' preference and exploiting customers' choices learned from data, in…
In this paper, we propose a distributed algorithm for the minimum dominating set problem. For some especial networks, we prove theoretically that the achieved answer by our proposed algorithm is a constant approximation factor of the exact…
In this paper, we show a connection between a certain online low-congestion routing problem and an online prediction of graph labeling. More specifically, we prove that if there exists a routing scheme that guarantees a congestion of…
In statistical learning for real-world large-scale data problems, one must often resort to "streaming" algorithms which operate sequentially on small batches of data. In this work, we present an analysis of the information-theoretic limits…
We study an online version of the max-min fair allocation problem for indivisible items. In this problem, items arrive one by one, and each item must be allocated irrevocably on arrival to one of $n$ agents, who have additive valuations for…
We study a model of market economics wherein the $(n+1)$-st customer, for each $n\geqslant N$, with $N$ being a prespecified positive integer, draws a sample of (random) size $K_{n}$, either with replacement or without, from the customers…
Bloom filters are data structures used to determine set membership of elements, with applications from string matching to networking and security problems. These structures are favored because of their reduced memory consumption and fast…
Makespan minimization on identical parallel machines is a classical scheduling problem. We consider the online scenario where a sequence of $n$ jobs has to be scheduled non-preemptively on $m$ machines so as to minimize the maximum…
The paper addresses the problem of distributed filtering with guaranteed convergence properties using minimum-energy filtering and $H_\infty$ filtering methodologies. A linear state space plant model is considered observed by a network of…