Related papers: String Inference from the LCP Array

Lightweight LCP-Array Construction in Linear Time

The suffix tree is a very important data structure in string processing, but it suffers from a huge space consumption. In large-scale applications, compressed suffix trees (CSTs) are therefore used instead. A CST consists of three…

Data Structures and Algorithms · Computer Science 2010-12-21 Simon Gog, Enno Ohlebusch

Lightweight LCP Construction for Very Large Collections of Strings

The longest common prefix array is a very advantageous data structure that, combined with the suffix array and the Burrows-Wheeler transform, makes it possible to efficiently compute some combinatorial properties of a string useful in several…

Data Structures and Algorithms · Computer Science 2016-05-16 Anthony J. Cox, Fabio Garofalo, Giovanna Rosone, Marinella Sciortino

Sampled Longest Common Prefix Array

When augmented with the longest common prefix (LCP) array and some other structures, the suffix array can solve many string processing problems in optimal time and space. A compressed representation of the LCP array is also one of the main…

Data Structures and Algorithms · Computer Science 2010-06-30 Jouni Sirén

Low Space External Memory Construction of the Succinct Permuted Longest Common Prefix Array

The longest common prefix (LCP) array is a versatile auxiliary data structure in indexed string matching. It can be used to speed up searching using the suffix array (SA) and provides an implicit representation of the topology of an…

Data Structures and Algorithms · Computer Science 2016-03-09 German Tischler

Computing the LCP Array of a Labeled Graph

The LCP array is an important tool in stringology, allowing pattern matching algorithms to be sped up and enabling compact representations of the suffix tree. Recently, Conte et al. [DCC 2023] and Cotumaccio et al. [SPIRE 2023] extended the…

Data Structures and Algorithms · Computer Science 2024-04-23 Jarno Alanko, Davide Cenzato, Nicola Cotumaccio, Sung-Hwan Kim, Giovanni Manzini, Nicola Prezza

In-Place Sparse Suffix Sorting

Suffix arrays encode the lexicographical order of all suffixes of a text and are often combined with the Longest Common Prefix array (LCP) to simulate navigational queries on the suffix tree in reduced space. In space-critical applications…

Data Structures and Algorithms · Computer Science 2017-11-02 Nicola Prezza

Wee LCP

We prove that longest common prefix (LCP) information can be stored in much less space than previously known. More precisely, we show that in the presence of the text and the suffix array, o(n) additional bits are sufficient to answer…

Data Structures and Algorithms · Computer Science 2010-02-19 Johannes Fischer

Inducing the LCP-Array

We show how to modify the linear-time construction algorithm for suffix arrays based on induced sorting (Nong et al., DCC'09) such that it computes the array of longest common prefixes (LCP-array) as well. Practical tests show that this…

Data Structures and Algorithms · Computer Science 2011-01-19 Johannes Fischer

Linear Time Inference of Strings from Cover Arrays using a Binary Alphabet

Covers, being one of the most popular forms of regularity in strings, have drawn much attention over time. In this paper, we focus on the problem of linear time inference of strings from cover arrays using the smallest alphabet possible.…

Data Structures and Algorithms · Computer Science 2011-08-30 Tanaeem M. Moosa, Sumaiya Nazeen, M. Sohel Rahman, Rezwana Reaz

Sparse Suffix and LCP Array: Simple, Direct, Small, and Fast

Sparse suffix sorting is the problem of sorting $b=o(n)$ suffixes of a string of length $n$. Efficient sparse suffix sorting algorithms have existed for more than a decade. Despite the multitude of works and their justified claims for…

Data Structures and Algorithms · Computer Science 2024-07-08 Lorraine A. K. Ayad, Grigorios Loukides, Solon P. Pissis, Hilde Verbeek

Optimal Time and Space Construction of Suffix Arrays and LCP Arrays for Integer Alphabets

Suffix arrays and LCP arrays are among the most fundamental data structures, widely used for various kinds of string processing. We consider two problems for a read-only string of length $N$ over an integer alphabet $[1, \dots, \sigma]$ for…

Data Structures and Algorithms · Computer Science 2019-07-16 Keisuke Goto

Computing matching statistics on Wheeler DFAs

Matching statistics were introduced to solve the approximate string matching problem, which is a recurrent subroutine in bioinformatics applications. In 2010, Ohlebusch et al. [SPIRE 2010] proposed a time and space efficient algorithm for…

Data Structures and Algorithms · Computer Science 2023-01-16 Alessio Conte, Nicola Cotumaccio, Travis Gagie, Giovanni Manzini, Nicola Prezza, Marinella Sciortino

The Inverse Lyndon Array: Definition, Properties, and Linear-Time Construction

The Lyndon array stores, at each position of a word, the length of the longest maximal Lyndon subword starting at that position, and plays an important role in combinatorics on words, for example in the construction of fundamental data…

Data Structures and Algorithms · Computer Science 2026-03-19 Pietro Negri, Manuel Sica, Rocco Zaccagnino, Rosalba Zizza

Computing the BWT and LCP array of a Set of Strings in External Memory

Indexing very large collections of strings, such as those produced by the widespread next generation sequencing technologies, relies heavily on the multistring generalization of the Burrows-Wheeler Transform (BWT): large requirements of…

Data Structures and Algorithms · Computer Science 2020-12-07 Paola Bonizzoni, Gianluca Della Vedova, Yuri Pirola, Marco Previtali, Raffaella Rizzi

The colored longest common prefix array computed via sequential scans

Due to the increased availability of large datasets of biological sequences, the tools for sequence comparison are now relying on efficient alignment-free approaches to a greater extent. Most of the alignment-free approaches require the…

Data Structures and Algorithms · Computer Science 2018-07-23 F. Garofalo, G. Rosone, M. Sciortino, D. Verzotto

Direct Linear Time Construction of Parameterized Suffix and LCP Arrays for Constant Alphabets

We present the first worst-case linear time algorithm that directly computes the parameterized suffix and LCP arrays for constant sized alphabets. Previous algorithms either required quadratic time or the parameterized suffix tree to be…

Data Structures and Algorithms · Computer Science 2019-06-04 Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Suffix Tree of Alignment: An Efficient Index for Similar Data

We consider an index data structure for similar strings. The generalized suffix tree can be a solution for this. The generalized suffix tree of two strings $A$ and $B$ is a compacted trie representing all suffixes in $A$ and $B$. It has…

Data Structures and Algorithms · Computer Science 2013-05-09 Joong Chae Na, Heejin Park, Maxime Crochemore, Jan Holub, Costas S. Iliopoulos, Laurent Mouchard, Kunsoo Park

A Data-Structure for Approximate Longest Common Subsequence of A Set of Strings

Given a set of $k$ strings $I$, their longest common subsequence (LCS) is the string with the maximum length that is a subset of all the strings in $I$. A data-structure for this problem preprocesses $I$ into a data-structure such that the…

Data Structures and Algorithms · Computer Science 2021-01-13 Sepideh Aghamolaei

Longest Common Prefix Arrays for Succinct k-Spectra

The k-spectrum of a string is the set of all distinct substrings of length k occurring in the string. K-spectra have many applications in bioinformatics including pseudoalignment and genome assembly. The Spectral Burrows-Wheeler Transform…

Data Structures and Algorithms · Computer Science 2023-06-09 Jarno N. Alanko, Elena Biagi, Simon J. Puglisi

Lightweight LCP Construction for Next-Generation Sequencing Datasets

The advent of "next-generation" DNA sequencing (NGS) technologies has meant that collections of hundreds of millions of DNA sequences are now commonplace in bioinformatics. Knowing the longest common prefix array (LCP) of such a collection…

Data Structures and Algorithms · Computer Science 2013-05-02 Markus J. Bauer, Anthony J. Cox, Giovanna Rosone, Marinella Sciortino