Related papers: Efficiently computing runs on a trie
A run is a maximal occurrence of a repetition $v$ with a period $p$ such that $2p \le |v|$. The maximal number of runs in a string of length $n$ was studied by several authors and it is known to be between $0.944 n$ and $1.029 n$. We…
The cornerstone of any algorithm computing all repetitions in a string of length n in O(n) time is the fact that the number of runs (or maximal repetitions) is O(n). We give a simple proof of this result. As a consequence of our approach,…
We give a new characterization of maximal repetitions (or runs) in strings based on Lyndon words. The characterization leads to a proof of what was known as the "runs" conjecture (Kolpakov \& Kucherov (FOCS '99)), which states that the…
A run is an inclusion maximal occurrence in a string (as a subinterval) of a repetition $v$ with a period $p$ such that $2p \le |v|$. The exponent of a run is defined as $|v|/p$ and is $\ge 2$. We show new bounds on the maximal sum of…
Maximal repetition of a string is the maximal length of a repeated substring. This paper investigates maximal repetition of strings drawn from stochastic processes. Strengthening previous results, two new bounds for the almost sure growth…
A trie $\mathcal{T}$ is a rooted tree such that each edge is labeled by a single character from the alphabet, and the labels of out-going edges from the same node are mutually distinct. Given a trie $\mathcal{T}$ with $n$ edges, we show how…
A run in a string is a maximal periodic substring. For example, the string $\texttt{bananatree}$ contains the runs $\texttt{anana} = (\texttt{an})^{3/2}$ and $\texttt{ee} = \texttt{e}^2$. There are less than $n$ runs in any length-$n$…
We describe a RAM algorithm computing all runs (maximal repetitions) of a given string of length $n$ over a general ordered alphabet in $O(n\log^{\frac{2}3} n)$ time and linear space. Our algorithm outperforms all known solutions working in…
We show a new lower bound for the maximum number of runs in a string. We prove that for any e > 0, (a -- e)n is an asymptotic lower bound, where a = 56733/60064 = 0.944542. It is superior to the previous bound 0.927 given by Franek et al.…
Recently, a short and elegant proof was presented showing that a binary word of length $n$ contains at most $n-3$ runs. Here we show, using the same technique and a computer search, that the number of runs in a binary word of length $n$ is…
Much research in stringology focuses on structures that can, in a way, ``grasp'' repeats (substrings that occur multiple times) as, for example, the so-called runs, a.k.a. maximal repetitions, compactly describe all tandem repeats. In this…
Repeat finding in strings has important applications in subfields such as computational biology. Surprisingly, all prior work on repeat finding did not consider the constraint on the locality of repeats. In this paper, we propose and study…
We consider the problems of computing maximal palindromes and distinct palindromes in a trie. A trie is a natural generalization of a string, which can be seen as a single-path tree. There is a linear-time offline algorithm to compute…
In the deletion channel, an important problem is to determine the number of subsequences derived from a string $U$ of length $n$ when subjected to $t$ deletions. It is well-known that the number of subsequences in the setting exhibits a…
The first two authors have shown [KK99,KK00] that the sum the exponent (and thus the number) of maximal repetitions of exponent at least 2 (also called runs) is linear in the length of the word. The exponent 2 in the definition of a run may…
Repeat finding in strings has important applications in subfields such as computational biology. The challenge of finding the longest repeats covering particular string positions was recently proposed and solved by \.{I}leri et al., using a…
In this paper we initiate the study of computing a maximal (not necessarily maximum) repeating pattern in a single input string, where the corresponding problems have been studied (e.g., a maximal common subsequence) only in two or more…
A border of a string is a non-empty proper prefix of the string that is also a suffix. A string is unbordered if it has no border. The longest unbordered factor is a fundamental notion in stringology, closely related to string periodicity.…
A run in a word is a periodic factor whose length is at least twice its period and which cannot be extended to the left or right (by a letter) to a factor with greater period. In recent years a great deal of work has been done on estimating…
A drawing of a graph in the plane is called a thrackle if every pair of edges meets precisely once, either at a common vertex or at a proper crossing. Let t(n) denote the maximum number of edges that a thrackle of n vertices can have.…