Related papers: Data Structures for Range Sorted Consecutive Occur…

Gapped Indexing for Consecutive Occurrences

The classic string indexing problem is to preprocess a string S into a compact data structure that supports efficient pattern matching queries. Typical queries include existential queries (decide if the pattern occurs in S), reporting…

Data Structures and Algorithms · Computer Science 2021-02-05 Philip Bille, Inge Li Gørtz, Max Rishøj Pedersen, Teresa Anna Steiner

String Indexing for Top-$k$ Close Consecutive Occurrences

The classic string indexing problem is to preprocess a string $S$ into a compact data structure that supports efficient subsequent pattern matching queries, that is, given a pattern string $P$, report all occurrences of $P$ within $S$. In…

Data Structures and Algorithms · Computer Science 2024-02-15 Philip Bille, Inge Li Gørtz, Max Rishøj Pedersen, Eva Rotenberg, Teresa Anna Steiner

Gapped String Indexing in Subquadratic Space and Sublinear Query Time

In Gapped String Indexing, the goal is to compactly represent a string $S$ of length $n$ such that for any query consisting of two strings $P_1$ and $P_2$, called patterns, and an integer interval $[\alpha, \beta]$, called gap range, we can…

Data Structures and Algorithms · Computer Science 2024-03-06 Philip Bille, Inge Li Gørtz, Moshe Lewenstein, Solon P. Pissis, Eva Rotenberg, Teresa Anna Steiner

Compressed Indexing for Consecutive Occurrences

The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact…

Data Structures and Algorithms · Computer Science 2023-04-04 Paweł Gawrychowski, Garance Gourdel, Tatiana Starikovskaya, Teresa Anna Steiner

String Matching with Variable Length Gaps

We consider string matching with variable length gaps. Given a string $T$ and a pattern $P$ consisting of strings separated by variable length gaps (arbitrary strings of length in a specified range), the problem is to find all ending…

Data Structures and Algorithms · Computer Science 2011-10-14 Philip Bille, Inge Li Goertz, Hjalte Wedel Vildhøj, David Kofoed Wind

Range Non-Overlapping Indexing

We study the non-overlapping indexing problem: Given a text T, preprocess it so that you can answer queries of the form: given a pattern P, report the maximal set of non-overlapping occurrences of P in T. A generalization of this problem is…

Data Structures and Algorithms · Computer Science 2010-01-12 Hagai Cohen, Ely Porat

Pattern Matching on Grammar-Compressed Strings in Linear Time

The most fundamental problem considered in algorithms for text processing is pattern matching: given a pattern $p$ of length $m$ and a text $t$ of length $n$, does $p$ occur in $t$? Multiple versions of this basic question have been…

Data Structures and Algorithms · Computer Science 2021-11-10 Moses Ganardi, Paweł Gawrychowski

On Stabbing Queries for Generalized Longest Repeat

A longest repeat query on a string, motivated by its applications in many subfields including computational biology, asks for the longest repetitive substring(s) covering a particular string position (point query). In this paper, we extend…

Data Structures and Algorithms · Computer Science 2015-11-10 Bojian Xu

Fast and linear-time string matching algorithms based on the distances of $q$-gram occurrences

Given a text $T$ of length $n$ and a pattern $P$ of length $m$, the string matching problem is a task to find all occurrences of $P$ in $T$. In this study, we propose an algorithm that solves this problem in $O((n + m)q)$ time considering…

Data Structures and Algorithms · Computer Science 2020-04-14 Satoshi Kobayashi, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara

Space-Efficient Text Indexing with Mismatches using Function Inversion

A classic data structure problem is to preprocess a string T of length $n$ so that, given a query $q$, we can quickly find all substrings of T with Hamming distance at most $k$ from the query string. Variants of this problem have seen…

Data Structures and Algorithms · Computer Science 2026-04-03 Jackson Bibbens, Levi Borevitz, Samuel McCauley

On Longest Repeat Queries

Repeat finding in strings has important applications in subfields such as computational biology. Surprisingly, all prior work on repeat finding did not consider the constraint on the locality of repeats. In this paper, we propose and study…

Data Structures and Algorithms · Computer Science 2015-01-27 Atalay Mert İleri, M. Oğuzhan Külekci, Bojian Xu

Compressed Indexing with Signature Grammars

The compressed indexing problem is to preprocess a string $S$ of length $n$ into a compressed representation that supports pattern matching queries. That is, given a string $P$ of length $m$ report all occurrences of $P$ in $S$. We present…

Data Structures and Algorithms · Computer Science 2018-04-12 Anders Roy Christiansen, Mikko Berggren Ettienne

Finding Top-k Longest Palindromes in Substrings

Palindromes are strings that read the same forward and backward. Problems of computing palindromic structures in strings have been studied for many years with a motivation of their application to biology. The longest palindrome problem is…

Data Structures and Algorithms · Computer Science 2023-06-21 Kazuki Mitani, Takuya Mieno, Kazuhisa Seto, Takashi Horiyama

Detecting $k$-(Sub-)Cadences and Equidistant Subsequence Occurrences

The equidistant subsequence pattern matching problem is considered. Given a pattern string $P$ and a text string $T$, we say that $P$ is an equidistant subsequence of $T$ if $P$ is a subsequence of the text such that consecutive…

Data Structures and Algorithms · Computer Science 2020-02-18 Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Ayumi Shinohara

Substring Range Reporting

We revisit various string indexing problems with range reporting features, namely, position-restricted substring searching, indexing substrings with gaps, and indexing substrings with intervals. We obtain the following main results.…

Data Structures and Algorithms · Computer Science 2011-08-19 Philip Bille, Inge Li Goertz

Simplex Range Searching Revisited: How to Shave Logs in Multi-Level Data Structures

We revisit the classic problem of simplex range searching and related problems in computational geometry. We present a collection of new results which improve previous bounds by multiple logarithmic factors that were caused by the use of…

Computational Geometry · Computer Science 2022-10-24 Timothy M. Chan, Da Wei Zheng

On Optimal Top-K String Retrieval

Let $\mathcal{D} = \{d_1, d_2, d_3, \ldots, d_D\}$ be a given set of $D$ (string) documents of total length $n$. The top-$k$ document retrieval problem is to index $\mathcal{D}$ such that when a pattern $P$ of length $p$, and a parameter $k$ come…

Data Structures and Algorithms · Computer Science 2012-11-20 Rahul Shah, Cheng Sheng, Sharma V. Thankachan, Jeffrey Scott Vitter

Space-Efficient k-Mismatch Text Indexes

A central task in string processing is text indexing, where the goal is to preprocess a text (a string of length $n$) into an efficient index (a data structure) supporting queries about the text. Cole, Gottlieb, and Lewenstein (STOC 2004)…

Data Structures and Algorithms · Computer Science 2025-10-31 Tomasz Kociumaka, Jakub Radoszewski

The Complexity of the Co-Occurrence Problem

Let $S$ be a string of length $n$ over an alphabet $\Sigma$ and let $Q$ be a subset of $\Sigma$ of size $q \geq 2$. The 'co-occurrence problem' is to construct a compact data structure that supports the following query: given an integer $w$…

Data Structures and Algorithms · Computer Science 2022-11-11 Philip Bille, Inge Li Gørtz, Tord Stordalen

Random Access in Persistent Strings and Segment Selection

We consider compact representations of collections of similar strings that support random access queries. The collection of strings is given by a rooted tree where edges are labeled by an edit operation (inserting, deleting, or replacing a…

Data Structures and Algorithms · Computer Science 2021-02-12 Philip Bille, Inge Li Gørtz