Related papers: Linear pattern matching on sparse suffix trees

Compressed Subsequence Matching and Packed Tree Coloring

We present a new algorithm for subsequence matching in grammar compressed strings. Given a grammar of size $n$ compressing a string of size $N$ and a pattern string of size $m$ over an alphabet of size $\sigma$, our algorithm uses…

Data Structures and Algorithms · Computer Science 2014-06-06 Philip Bille, Patrick Hagge Cording, Inge Li Gørtz

Packed Compact Tries: A Fast and Efficient Data Structure for Online String Processing

In this paper, we present a new data structure called the packed compact trie (packed c-trie) which stores a set $S$ of $k$ strings of total length $n$ in $n \log\sigma + O(k \log n)$ bits of space and supports fast pattern matching queries…

Data Structures and Algorithms · Computer Science 2017-10-11 Takuya Takagi, Shunsuke Inenaga, Kunihiko Sadakane, Hiroki Arimura

Fast Packed String Matching for Short Patterns

Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other fields, like natural language processing, information retrieval and computational biology. In the last two…

Information Retrieval · Computer Science 2012-10-01 Simone Faro, M. Oguzhan Külekci

Space-Efficient String Indexing for Wildcard Pattern Matching

In this paper we describe compressed indexes that support pattern matching queries for strings with wildcards. For a constant size alphabet our data structure uses $O(n\log^{\varepsilon}n)$ bits for any $\varepsilon>0$ and reports all…

Data Structures and Algorithms · Computer Science 2014-01-06 Moshe Lewenstein, Yakov Nekrich, Jeffrey Scott Vitter

Fast Searching in Packed Strings

Given strings $P$ and $Q$ the (exact) string matching problem is to find all positions of substrings in $Q$ matching $P$. The classical Knuth-Morris-Pratt algorithm [SIAM J. Comput., 1977] solves the string matching problem in linear time…

Data Structures and Algorithms · Computer Science 2010-09-08 Philip Bille

Grammar Index By Induced Suffix Sorting

Pattern matching is the most central task for text indices. Most recent indices leverage compression techniques to make pattern matching feasible for massive but highly-compressible datasets. Within this kind of indices, we propose a new…

Data Structures and Algorithms · Computer Science 2021-05-31 Tooru Akagi, Dominik Köppl, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Compressed Dictionary Matching on Run-Length Encoded Strings

Given a set of pattern strings $\mathcal{P}=\{P_1, P_2,\ldots P_k\}$ and a text string $S$, the classic dictionary matching problem is to report all occurrences of each pattern in $S$. We study the dictionary problem in the compressed…

Data Structures and Algorithms · Computer Science 2025-09-04 Philip Bille, Inge Li Gørtz, Simon J. Puglisi, Simon R. Tarnow

Cartesian Tree Matching and Indexing

We introduce a new metric of match, called Cartesian tree matching, which means that two strings match if they have the same Cartesian trees. Based on Cartesian tree matching, we define single pattern matching for a text of length $n$ and a…

Data Structures and Algorithms · Computer Science 2019-05-23 Sung Gwan Park, Amihood Amir, Gad M. Landau, Kunsoo Park

Faster Approximate Pattern Matching in Compressed Repetitive Texts

Motivated by the imminent growth of massive, highly redundant genomic databases, we study the problem of compressing a string database while simultaneously supporting fast random access, substring extraction and pattern matching to the…

Data Structures and Algorithms · Computer Science 2012-11-01 Travis Gagie, Paweł Gawrychowski, Christopher Hoobin, Simon J. Puglisi

Sparse Suffix Tree Construction in Optimal Time and Space

Suffix tree (and the closely related suffix array) are fundamental structures capturing all substrings of a given text essentially by storing all its suffixes in the lexicographical order. In some applications, we work with a subset of $b$…

Data Structures and Algorithms · Computer Science 2016-08-03 Paweł Gawrychowski, Tomasz Kociumaka

Compressed Indexing with Signature Grammars

The compressed indexing problem is to preprocess a string $S$ of length $n$ into a compressed representation that supports pattern matching queries. That is, given a string $P$ of length $m$ report all occurrences of $P$ in $S$. We present…

Data Structures and Algorithms · Computer Science 2018-04-12 Anders Roy Christiansen, Mikko Berggren Ettienne

A Faster Grammar-Based Self-Index

To store and search genomic databases efficiently, researchers have recently started building compressed self-indexes based on grammars. In this paper we show how, given a straight-line program with $r$ rules for a string $S[1..n]$ whose…

Data Structures and Algorithms · Computer Science 2012-09-28 Travis Gagie, Paweł Gawrychowski, Juha Kärkkäinen, Yakov Nekrich, Simon J. Puglisi

Linear Index for Logarithmic Search-Time for any String under any Internal Node in Suffix Trees

Suffix trees are a key and efficient data structure for solving string problems. A suffix tree is a compressed trie containing all the suffixes of a given text of length $n$ with a linear construction cost. In this work, we introduce an…

Data Structures and Algorithms · Computer Science 2024-06-04 Anas Al-okaily

Online Grammar Compression for Frequent Pattern Discovery

Various grammar compression algorithms have been proposed in the last decade. A grammar compression is a restricted CFG deriving the string deterministically. An efficient grammar compression develops a smaller CFG by finding duplicated…

Data Structures and Algorithms · Computer Science 2016-09-01 Shouhei Fukunaga, Yoshimasa Takabatake, I Tomohiro, Hiroshi Sakamoto

Deterministic Indexing for Packed Strings

Given a string $S$ of length $n$, the classic string indexing problem is to preprocess $S$ into a compact data structure that supports efficient subsequent pattern queries. In the deterministic variant the goal is to solve the string…

Data Structures and Algorithms · Computer Science 2016-12-07 Philip Bille, Inge Li Gørtz, Frederik Rye Skjoldjensen

Efficient Online String Matching Based on Characters Distance Text Sampling

Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other fields, like natural language processing, information retrieval and computational biology. Sampled string…

Data Structures and Algorithms · Computer Science 2019-08-19 Simone Faro, Arianna Pavone, Francesco Pio Marino

Order-Preserving Suffix Trees and Their Algorithmic Applications

Recently Kubica et al. (Inf. Process. Let., 2013) and Kim et al. (submitted to Theor. Comp. Sci.) introduced order-preserving pattern matching. In this problem we are looking for consecutive substrings of the text that have the same "shape"…

Data Structures and Algorithms · Computer Science 2013-03-28 Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Alessio Langiu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Tomasz Walen

Linear Approximate Pattern Matching Algorithm

Pattern matching is a fundamental process in almost every scientific domain. The problem involves finding the positions of a given pattern (usually of short length) in a reference stream of data (usually of large length). The matching can…

Data Structures and Algorithms · Computer Science 2022-07-01 Anas Al-okaily, Abdelghani Tbakhi

Load-Balancing Succinct B Trees

We propose a B tree representation storing $n$ keys, each of $k$ bits, in either (a) $nk + O(nk / \lg n)$ bits or (b) $nk + O(nk \lg \lg n/ \lg n)$ bits of space supporting all B tree operations in either (a) $O(\lg n )$ time or (b) $O(\lg…

Data Structures and Algorithms · Computer Science 2021-04-20 Tomohiro I, Dominik Köppl

Top Tree Compression of Tries

We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to…

Data Structures and Algorithms · Computer Science 2019-09-23 Philip Bille, Inge Li Gørtz, Paweł Gawrychowski, Gad M. Landau, Oren Weimann