Related papers: On Longest Repeat Queries Using GPU
Repeat finding in strings has important applications in subfields such as computational biology. Surprisingly, all prior work on repeat finding did not consider the constraint on the locality of repeats. In this paper, we propose and study…
A longest repeat query on a string, motivated by its applications in many subfields including computational biology, asks for the longest repetitive substring(s) covering a particular string position (point query). In this paper, we extend…
We revisit the problem of finding shortest unique substring (SUS) proposed recently by [6]. We propose an optimal $O(n)$ time and space algorithm that can find an SUS for every location of a string of size $n$. Our algorithm significantly…
We consider string matching with variable length gaps. Given a string $T$ and a pattern $P$ consisting of strings separated by variable length gaps (arbitrary strings of length in a specified range), the problem is to find all ending…
In this paper we initiate the study of computing a maximal (not necessarily maximum) repeating pattern in a single input string, where the corresponding problems have been studied (e.g., a maximal common subsequence) only in two or more…
In the area of Pattern Recognition and Matching, finding a Longest Common Subsequence plays an important role. In this paper, we have proposed one algorithm based on parallel computation. We have used OpenMP API package as middleware to…
Given strings $P$ and $Q$ the (exact) string matching problem is to find all positions of substrings in $Q$ matching $P$. The classical Knuth-Morris-Pratt algorithm [SIAM J. Comput., 1977] solves the string matching problem in linear time…
In this paper, first we give a sequential linear-time algorithm for the longest path problem in meshes. This algorithm can be considered as an improvement of [13]. Then based on this sequential algorithm, we present a constant-time parallel…
We describe a RAM algorithm computing all runs (maximal repetitions) of a given string of length $n$ over a general ordered alphabet in $O(n\log^{\frac{2}3} n)$ time and linear space. Our algorithm outperforms all known solutions working in…
Given a set of strings, the shortest common superstring problem is to find the shortest possible string that contains all the input strings. The problem is NP-hard, but a lot of work has gone into designing approximation algorithms for…
String matching algorithms are among one of the most widely used algorithms in computer science. Traditional string matching algorithms efficiency of underlaying string matching algorithm will greatly increase the efficiency of any…
Motivated by computing duplication patterns in sequences, a new fundamental problem called the longest subsequence-repeated subsequence (LSRS) is proposed. Given a sequence $S$ of length $n$, a letter-repeated subsequence is a subsequence…
Persistent homology is a crucial invariant that is used in many areas to understand data. The $O(N^4)$ run time is a hindrance to its use on most large datasets. We give a parallelization method to utilize multi-core machines and clusters.…
We consider the problem of encoding a string of length $n$ from an integer alphabet of size $\sigma$ so that access and substring equality queries (that is, determining the equality of any two substrings) can be answered efficiently. Any…
Following (Kolpakov et al., 2013; Gawrychowski and Manea, 2015), we continue the study of {\em $\alpha$-gapped repeats} in strings, defined as factors $uvu$ with $|uv|\leq \alpha |u|$. Our main result is the $O(\alpha n)$ bound on the…
In the classical longest palindromic substring (LPS) problem, we are given a string $S$ of length $n$, and the task is to output a longest palindromic substring in $S$. Gilbert, Hajiaghayi, Saleh, and Seddighin [SPAA 2023] showed how to…
Much research in stringology focuses on structures that can, in a way, ``grasp'' repeats (substrings that occur multiple times) as, for example, the so-called runs, a.k.a. maximal repetitions, compactly describe all tandem repeats. In this…
The problem of approximate string matching is important in many different areas such as computational biology, text processing and pattern recognition. A great effort has been made to design efficient algorithms addressing several variants…
We present a new scalable, lightweight algorithm to incrementally construct the BWT and FM-index of large string sets such as those produced by Next Generation Sequencing. The algorithm is designed for massive parallelism and can…
We study quantum algorithms for several fundamental string problems, including Longest Common Substring, Lexicographically Minimal String Rotation, and Longest Square Substring. These problems have been widely studied in the stringology…