Related papers: Succinct Data Structures for Retrieval and Approxi…
We suggest a method for holding a dictionary data structure, which maps keys to values, in the spirit of Bloom Filters. The space requirements of the dictionary we suggest are much smaller than those of a hashtable. We allow storing n keys,…
A retrieval data structure for a static function $f:S\rightarrow \{0,1\}^r$ supports queries that return $f(x)$ for any $x \in S$. Retrieval data structures can be used to implement a static approximate membership query data structure…
Let $\mathcal{D}$ be a collection of $D$ documents, which are strings over an alphabet of size $\sigma$, of total length $n$. We describe a data structure that uses linear space and and reports $k$ most relevant documents that contain a…
In the static retrieval problem, a data structure must answer retrieval queries mapping a set of $n$ keys in a universe $[U]$ to $v$-bit values. Information-theoretically, retrieval data structures can use as little as $nv$ bits of space.…
A retrieval data structure stores a static function f : S -> {0,1}^r . For all x in S, it returns the r-bit value f(x), while for other inputs it may return an arbitrary result. The structure cannot answer membership queries, so it does not…
Given an $n$-bit array $A$, the succinct rank data structure problem asks to construct a data structure using space $n+r$ bits for $r\ll n$, supporting rank queries of form $\mathtt{rank}(x)=\sum_{i=0}^{x-1} A[i]$. In this paper, we design…
Given a set $S$ of $n$ (distinct) keys from key space $[U]$, each associated with a value from $\Sigma$, the \emph{static dictionary} problem asks to preprocess these (key, value) pairs into a data structure, supporting value-retrieval…
The rank problem in succinct data structures asks to preprocess an array A[1..n] of bits into a data structure using as close to n bits as possible, and answer queries of the form rank(k) = Sum_{i=1}^k A[i]. The problem has been intensely…
In the polytope membership problem, a convex polytope $K$ in $R^d$ is given, and the objective is to preprocess $K$ into a data structure so that, given a query point $q \in R^d$, it is possible to determine efficiently whether $q \in K$.…
Let ${\cal{D}}$ = $\{d_1, d_2, d_3, ..., d_D\}$ be a given set of $D$ (string) documents of total length $n$. The top-$k$ document retrieval problem is to index $\cal{D}$ such that when a pattern $P$ of length $p$, and a parameter $k$ come…
A minimal perfect hash function bijectively maps a key set $S$ out of a universe $U$ into the first $|S|$ natural numbers. Minimal perfect hash functions are used, for example, to map irregularly-shaped keys, such as string, in a compact…
A dictionary data structure maintains a set of at most $n$ keys from the universe $[U]$ under key insertions and deletions, such that given a query $x \in [U]$, it returns if $x$ is in the set. Some variants also store values associated to…
The membership problem asks to maintain a set $S\subseteq[u]$, supporting insertions and membership queries, i.e., testing if a given element is in the set. A data structure that computes exact answers is called a dictionary. When a (small)…
The selection problem, where one wishes to locate the $k^{th}$ smallest element in an unsorted array of size $n$, is one of the basic problems studied in computer science. The main focus of this work is designing algorithms for solving the…
Given an unordered array of $N$ elements drawn from a totally ordered set and an integer $k$ in the range from $1$ to $N$, in the classic selection problem the task is to find the $k$-th smallest element in the array. We study the…
A classic data structure problem is to preprocess a string T of length $n$ so that, given a query $q$, we can quickly find all substrings of T with Hamming distance at most $k$ from the query string. Variants of this problem have seen…
Perfect hash functions can potentially be used to compress data in connection with a variety of data management tasks. Though there has been considerable work on how to construct good perfect hash functions, there is a gap between theory…
Pattern matching is a fundamental process in almost every scientific domain. The problem involves finding the positions of a given pattern (usually of short length) in a reference stream of data (usually of large length). The matching can…
Motivated by the imminent growth of massive, highly redundant genomic databases, we study the problem of compressing a string database while simultaneously supporting fast random access, substring extraction and pattern matching to the…
Given a set S of n keys, a k-perfect hash function (kPHF) is a data structure that maps the keys to the first m integers, where each output integer can be hit by at most k input keys. When m=n/k, the resulting function is called a minimal…