Related papers: Computing All Distinct Squares in Linear Time for …
Suffix trees are a key and efficient data structure for solving string problems. A suffix tree is a compressed trie containing all the suffixes of a given text of length $n$ with a linear construction cost. In this work, we introduce an…
Given two strings $T$ and $S$ and a set of strings $P$, for each string $p \in P$, consider the unique substrings of $T$ that have $p$ as their prefix and $S$ as their suffix. Two problems then come to mind; the first problem being the…
We show that the number of distinct squares in a packed string of length $n$ over an alphabet of size $\sigma$ can be computed in $O(n/\log_\sigma n)$ time in the word-RAM model. This paper is the first to introduce a sublinear-time…
We present the first worst-case linear-time algorithm that directly computes the parameterized suffix and LCP arrays for constant-sized alphabets. Previous algorithms either required quadratic time or the parameterized suffix tree to be…
The suffix tree is a very important data structure in string processing, but it suffers from a huge space consumption. In large-scale applications, compressed suffix trees (CSTs) are therefore used instead. A CST consists of three…
Suffix trees (and the closely related suffix arrays) are fundamental structures capturing all substrings of a given text, essentially by storing all its suffixes in lexicographical order. In some applications, we work with a subset of $b$…
We show that the compressed suffix array and the compressed suffix tree of a string $T$ can be built in $O(n)$ deterministic time using $O(n\log\sigma)$ bits of space, where $n$ is the string length and $\sigma$ is the alphabet size.…
We present a simple algorithm for computing the document array given a string collection and its suffix array as input. Our algorithm runs in linear time using constant additional space for strings from constant alphabets.
Suppose we want to seek the longest common subsequences (LCSs) of two strings as informative patterns that explain the relationship between the strings. The dynamic programming algorithm gives us a table from which all LCSs can be extracted…
A trie $\mathcal{T}$ is a rooted tree such that each edge is labeled by a single character from the alphabet, and the labels of out-going edges from the same node are mutually distinct. Given a trie $\mathcal{T}$ with $n$ edges, we show how…
A seed in a word is a relaxed version of a period in which the occurrences of the repeating subword may overlap. We show a linear-time algorithm computing a linear-size representation of all the seeds of a word (the number of seeds might be…
We show that the sum of a sequence of integers can be computed in linear time on a Turing machine. In particular, the most obvious algorithm for this problem, which appears to require quadratic time due to carry propagation, actually runs…
We derive a simple, efficient algorithm for Abelian periods, given all Abelian squares in a string. An efficient algorithm for the latter problem was given by Cummings and Smyth in 1997. Along the way, we show an alternative algorithm for…
We consider the problem of finding repetitive structures and inherent patterns in a given string $s$ of length $n$ over a finite totally ordered alphabet. A border $u$ of a string $s$ is both a prefix and a suffix of $s$…
The suffix array is a fundamental data structure for many applications that involve string searching and data compression. Designing time/space-efficient suffix array construction algorithms has attracted significant attention and…
We consider an index data structure for similar strings. The generalized suffix tree can be a solution for this. The generalized suffix tree of two strings $A$ and $B$ is a compacted trie representing all suffixes in $A$ and $B$. It has…
We study the following substring suffix selection problem: given a substring of a string $T$ of length $n$, compute its $k$-th lexicographically smallest suffix. This is a natural generalization of the well-known question of computing the maximal…
We consider the problems of computing maximal palindromes and distinct palindromes in a trie. A trie is a natural generalization of a string, which can be seen as a single-path tree. There is a linear-time offline algorithm to compute…
A suffix tree is a data structure used mainly for pattern matching. It is known that the space complexity of simple suffix trees is quadratic in the length of the string. By a slight modification of the simple suffix trees one gets the…
This paper addresses the problem of finding a representation of a subtree distance, which is an extension of the tree metric. We show that a minimal representation is uniquely determined by a given subtree distance, and give a linear time…