Related papers: Substring Range Reporting
A longest repeat query on a string, motivated by its applications in many subfields including computational biology, asks for the longest repetitive substring(s) covering a particular string position (point query). In this paper, we extend…
Bille and G{\o}rtz (2011) recently introduced the problem of substring range counting, for which we are asked to store compactly a string $S$ of $n$ characters with integer labels in ([0, u]), such that later, given an interval ([a, b]) and…
In Gapped String Indexing, the goal is to compactly represent a string $S$ of length $n$ such that for any query consisting of two strings $P_1$ and $P_2$, called patterns, and an integer interval $[\alpha, \beta]$, called gap range, we can…
The string indexing problem is a fundamental computational problem with numerous applications, including information retrieval and bioinformatics. It aims to efficiently solve the pattern matching problem: given a text T of length n for…
We revisit the classic problem of simplex range searching and related problems in computational geometry. We present a collection of new results which improve previous bounds by multiple logarithmic factors that were caused by the use of…
We consider string matching with variable length gaps. Given a string $T$ and a pattern $P$ consisting of strings separated by variable length gaps (arbitrary strings of length in a specified range), the problem is to find all ending…
The classic string indexing problem is to preprocess a string S into a compact data structure that supports efficient pattern matching queries. Typical queries include existential queries (decide if the pattern occurs in S), reporting…
Given an array A[1: n] of n elements drawn from an ordered set, the sorted range selection problem is to build a data structure that can be used to answer the following type of queries efficiently: Given a pair of indices i, j $ (1\le i\le…
In the orthogonal range reporting problem, we are to preprocess a set of $n$ points with integer coordinates on a $U \times U$ grid. The goal is to support reporting all $k$ points inside an axis-aligned query rectangle. This is one of the…
We study the non-overlapping indexing problem: Given a text T, preprocess it so that you can answer queries of the form: given a pattern P, report the maximal set of non-overlapping occurrences of P in T. A generalization of this problem is…
We give a simplified and improved lower bound for the simplex range reporting problem. We show that given a set $P$ of $n$ points in $\mathbb{R}^d$, any data structure that uses $S(n)$ space to answer such queries must have…
We study the fundamental question of how efficiently suffix array entries can be accessed when the array cannot be stored explicitly. The suffix array $SA_T[1..n]$ of a text $T$ of length $n$ encodes the lexicographic order of its suffixes…
We consider the two-dimensional sorted range reporting problem. Our data structure requires O(n lglg n) words of space and O(lglg n + k lglg n) query time, where k is the number of points in the query range. This data structure improves a…
Repeat finding in strings has important applications in subfields such as computational biology. Surprisingly, all prior work on repeat finding did not consider the constraint on the locality of repeats. In this paper, we propose and study…
Several biological problems require the identification of regions in a sequence where some feature occurs within a target density range: examples including the location of GC-rich regions, identification of CpG islands, and sequence…
In this paper we study lower bounds for the fundamental problem of text indexing with mismatches and differences. In this problem we are given a long string of length $n$, the "text", and the task is to preprocess it into a data structure…
We consider encoding problems for range queries on arrays. In these problems the goal is to store a structure capable of recovering the answer to all queries that occupies the information theoretic minimum space possible, to within lower…
In a typical range emptiness searching (resp., reporting) problem, we are given a set $P$ of $n$ points in $\reals^d$, and wish to preprocess it into a data structure that supports efficient range emptiness (resp., reporting) queries, in…
Given a string $S$ of length $n$, the classic string indexing problem is to preprocess $S$ into a compact data structure that supports efficient subsequent pattern queries. In the \emph{deterministic} variant the goal is to solve the string…
We consider the problem of encoding a string of length $n$ from an integer alphabet of size $\sigma$ so that access and substring equality queries (that is, determining the equality of any two substrings) can be answered efficiently. Any…