Related papers: Bloom maps
A Bloom filter is a method for reducing the space (memory) required for representing a set by allowing a small error probability. In this paper we consider a \emph{Sliding Bloom Filter}: a data structure that, given a stream of elements,…
Bloom filters are data structures used to determine set membership of elements, with applications from string matching to networking and security problems. These structures are favored because of their reduced memory consumption and fast…
We present a method for computing the topological entropy of one-dimensional maps. As an approximation scheme, the algorithm converges rapidly and provides both upper and lower bounds.
We study data structure problems related to document indexing and pattern matching queries and our main contribution is to show that the pointer machine model of computation can be extremely useful in proving high and unconditional lower…
The question of what can be computed, and how efficiently, are at the core of computer science. Not surprisingly, in distributed systems and networking research, an equally fundamental question is what can be computed in a…
We use entropy numbers in combination with the polynomial method to derive a new general lower bound for the n-th minimal error in the quantum setting of information-based complexity. As an application, we improve some lower bounds on…
Sequence representations supporting queries $access$, $select$ and $rank$ are at the core of many data structures. There is a considerable gap between the various upper bounds and the few lower bounds known for such representations, and how…
Indexing of static and dynamic sets is fundamental to a large set of applications such as information retrieval and caching. Denoting the characteristic vector of the set by B, we consider the problem of encoding sets and multisets to…
We consider a range of simply stated dynamic data structure problems on strings. An update changes one symbol in the input and a query asks us to compute some function of the pattern of length $m$ and a substring of a longer text. We give…
Consider a rectangular matrix describing some type of communication or transportation between a set of origins and a set of destinations, or a classification of objects by two attributes. The problem is to infer the entries of the matrix…
There is a plethora of data structures, algorithms, and frameworks dealing with major data-stream problems like estimating the frequency of items, answering set membership, association and multiplicity queries, and several other statistics…
We illustrate how computer-aided methods can be used to investigate the fundamental limits of the caching systems, which are significantly different from the conventional analytical approach usually seen in the information theory…
Within the task of collaborative filtering two challenges for computing conditional probabilities exist. First, the amount of training data available is typically sparse with respect to the size of the domain. Thus, support for higher-order…
Bloom filters are probabilistic data structures commonly used for approximate membership problems in many areas of Computer Science (networking, distributed systems, databases, etc.). With the increase in data size and distribution of data,…
MAP is the problem of finding a most probable instantiation of a set of variables in a Bayesian network given some evidence. Unlike computing posterior probabilities, or MPE (a special case of MAP), the time and space complexity of…
In many high-impact applications, it is important to ensure the quality of output of a machine learning algorithm as well as its reliability in comparison with the complexity of the algorithm used. In this paper, we have initiated a…
Bloom filters are space-efficient probabilistic data structures that are used to test whether an element is a member of a set, and may return false positives. Recently, variations referred to as learned Bloom filters were developed that can…
We study entropy-bounded computational geometry, that is, geometric algorithms whose running times depend on a given measure of the input entropy. Specifically, we introduce a measure that we call range-partition entropy, which unifies and…
We present a version of the Bloom filter data structure that supports not only the insertion, deletion, and lookup of key-value pairs, but also allows a complete listing of its contents with high probability, as long the number of key-value…
We revisit the well-known problem of sorting under partial information: sort a finite set given the outcomes of comparisons between some pairs of elements. The input is a partially ordered set P, and solving the problem amounts to…