Related papers: Inducing the LCP-Array
The suffix tree is a very important data structure in string processing, but it suffers from a huge space consumption. In large-scale applications, compressed suffix trees (CSTs) are therefore used instead. A CST consists of three…
In this paper we propose a variant of the induced suffix sorting algorithm by Nong (TOIS, 2013) that computes simultaneously the Lyndon array and the suffix array of a text in $O(n)$ time using $\sigma + O(1)$ words of working space, where…
When augmented with the longest common prefix (LCP) array and some other structures, the suffix array can solve many string processing problems in optimal time and space. A compressed representation of the LCP array is also one of the main…
The suffix array, perhaps the most important data structure in modern string processing, is often augmented with the longest common prefix (LCP) array which stores the lengths of the LCPs for lexicographically adjacent suffixes of a string.…
We present the first worst-case linear time algorithm that directly computes the parameterized suffix and LCP arrays for constant sized alphabets. Previous algorithms either required quadratic time or the parameterized suffix tree to be…
The longest common prefix (LCP) array is a versatile auxiliary data structure in indexed string matching. It can be used to speed up searching using the suffix array (SA) and provides an implicit representation of the topology of an…
Sparse suffix sorting is the problem of sorting $b=o(n)$ suffixes of a string of length $n$. Efficient sparse suffix sorting algorithms have existed for more than a decade. Despite the multitude of works and their justified claims for…
The LCP array is an important tool in stringology, allowing to speed up pattern matching algorithms and enabling compact representations of the suffix tree. Recently, Conte et al. [DCC 2023] and Cotumaccio et al. [SPIRE 2023] extended the…
Suffix arrays encode the lexicographical order of all suffixes of a text and are often combined with the Longest Common Prefix array (LCP) to simulate navigational queries on the suffix tree in reduced space. In space-critical applications…
We give the first concise description of the fastest known suffix sorting algorithm in main memory, the DivSufSort by Yuta Mori. We then present an extension that also computes the LCP-array, which is competitive with the fastest known…
We prove that longest common prefix (LCP) information can be stored in much less space than previously known. More precisely, we show that in the presence of the text and the suffix array, o(n) additional bits are sufficient to answer…
The longest common prefix array is a very advantageous data structure that, combined with the suffix array and the Burrows-Wheeler transform, allows to efficiently compute some combinatorial properties of a string useful in several…
We propose a simple linear-time on-line algorithm for constructing a position heap for a string [Ehrenfeucht et al, 2011]. Our definition of position heap differs slightly from the one proposed in [Ehrenfeucht et al, 2011] in that it…
Recently, Cenzato et al.\ proposed a new text index, called the \emph{suffixient array}, which is a subset of the suffix array and supports locating a single pattern occurrence or finding its maximal exact matches (MEMs), assuming random…
The suffix array is a fundamental data structure for many applications that involve string searching and data compression. Designing time/space-efficient suffix array construction algorithms has attracted significant attention and…
A deterministic BSP algorithm for constructing the suffix array of a given string is presented, based on a technique which we call accelerated sampling. It runs in optimal O(n/p) local computation and communication, and requires a near…
We introduce a new algorithm for constructing the generalized suffix array of a collection of highly similar strings. As a first step, we construct a compressed representation of the matching statistics of the collection with respect to a…
The Lyndon array stores, at each position of a word, the length of the longest maximal Lyndon subword starting at that position, and plays an important role in combinatorics on words, for example in the construction of fundamental data…
We present first algorithmic ideas for a practical and lightweight adaption of the DCX suffix array construction algorithm [Sanders et al., 2003] to the distributed-memory setting. Our approach relies on a bucketing technique which enables…
The suffix array is arguably one of the most important data structures in sequence analysis and consequently there is a multitude of suffix sorting algorithms. However, to this date the GSACA algorithm introduced in 2015 is the only known…