Related papers: Computing and Enumerating Minimal Common Supersequ…
In this paper we initiate the study of computing a maximal (not necessarily maximum) repeating pattern in a single input string, where the corresponding problems have been studied (e.g., a maximal common subsequence) only in two or more…
Described are two algorithms to find long approximate palindromes in a string, for example a DNA sequence. A simple algorithm requires O(n)-space and almost always runs in $O(k.n)$-time where n is the length of the string and k is the…
We present the first $\mathrm{o}(n)$-space polynomial-time algorithm for computing the length of a longest common subsequence. Given two strings of length $n$, the algorithm runs in $\mathrm{O}(n^{3})$ time with $\mathrm{O}\left(\frac{n…
We consider the longest common subsequence (LCS) problem with the restriction that the common subsequence is required to consist of at least $k$ length substrings. First, we show an $O(mn)$ time algorithm for the problem which gives a…
This paper reformulates the problem of finding a longest common increasing subsequence of the two given input sequences in a very succinct way. An extremely simple linear space algorithm based on the new formula can find a longest common…
The longest common substring with $k$-mismatches problem is to find, given two strings $S_1$ and $S_2$, a longest substring $A_1$ of $S_1$ and $A_2$ of $S_2$ such that the Hamming distance between $A_1$ and $A_2$ is $\le k$. We introduce a…
In this paper we define a new problem, motivated by computational biology, $LCSk$ aiming at finding the maximal number of $k$ length $substrings$, matching in both input strings while preserving their order of appearance. The traditional…
This paper shows that a simple algorithm produces the {\em all-prefixes-LCSs-graph} in $O(mn)$ time for two input sequences of size $m$ and $n$. Given any prefix $p$ of the first input sequence and any prefix $q$ of the second input…
Computation on compressed strings is one of the key approaches to processing massive data sets. We consider local subsequence recognition problems on strings compressed by straight-line programs (SLP), which is closely related to…
The equidistant subsequence pattern matching problem is considered. Given a pattern string $P$ and a text string $T$, we say that $P$ is an \emph{equidistant subsequence} of $T$ if $P$ is a subsequence of the text such that consecutive…
Given a set of strings, the shortest common superstring problem is to find the shortest possible string that contains all the input strings. The problem is NP-hard, but a lot of work has gone into designing approximation algorithms for…
In this paper, we consider the ``Shortest Superstring Problem''(SSP) or the ``Shortest Common Superstring Problem''(SCS). The problem is as follows. For a positive integer $n$, a sequence of n strings $S=(s^1,\dots,s^n)$ is given. We should…
This note provides very simple, efficient algorithms for computing the number of distinct longest common subsequences of two input strings and for computing the number of LCS embeddings.
We propose efficient algorithms for enumerating maximal common subsequences (MCSs) of two strings. Efficiency of the algorithms are estimated by the preprocessing-time, space, and delay-time complexities. One algorithm prepares a…
Given a set of $k$ strings $I$, their longest common subsequence (LCS) is the string with the maximum length that is a subset of all the strings in $I$. A data-structure for this problem preprocesses $I$ into a data-structure such that the…
In this paper, we consider two versions of the Text Assembling problem. We are given a sequence of strings $s^1,\dots,s^n$ of total length $L$ that is a dictionary, and a string $t$ of length $m$ that is texts. The first version of the…
This paper investigates the approximability of the Longest Common Subsequence (LCS) problem. The fastest algorithm for solving the LCS problem exactly runs in essentially quadratic time in the length of the input, and it is known that under…
Real-world data often comes in compressed form. Analyzing compressed data directly (without decompressing it) can save space and time by orders of magnitude. In this work, we focus on fundamental sequence comparison problems and try to…
The problem of approximate string matching is important in many different areas such as computational biology, text processing and pattern recognition. A great effort has been made to design efficient algorithms addressing several variants…
Finding the longest common subsequence in $k$-length substrings (LCS$k$) is a recently proposed problem motivated by computational biology. This is a generalization of the well-known LCS problem in which matching symbols from two sequences…