Related papers: On Approximating String Selection Problems with Ou…
Given $n$ length-$\ell$ strings $S = \{s_1, \ldots, s_n\}$ over a constant-size alphabet $\Sigma$ together with parameters $d$ and $k$, the objective in the Consensus String with Outliers problem is to find a subset $S^*$ of $S$ of size…
The problem of finding a center string that is 'close' to every given string arises in many applications in computational biology and coding theory. This problem has two versions: the Closest String problem and the Closest Substring…
In the Closest String problem one is given a family $\mathcal S$ of equal-length strings over some fixed alphabet, and the task is to find a string $y$ that minimizes the maximum Hamming distance between $y$ and a string from $\mathcal S$.…
Clustering with outliers is one of the most fundamental problems in Computer Science. Given a set $X$ of $n$ points and two integers $k$ and $m$, the clustering with outliers aims to exclude $m$ points from $X$ and partition the remaining…
Finding an Approximate Longest Common Substring (ALCS) within a given set $S=\{s_1,s_2,\ldots,s_m\}$ of $m \ge 2$ strings is a key problem in computational biology, such as identifying related mutations across multiple genetic sequences. We…
We study the fundamental problem of finding the best string to represent a given set, in the form of the Closest String problem: Given a set $X \subseteq \Sigma^d$ of $n$ strings, find the string $x^*$ minimizing the radius of the smallest…
Approximate string matching is a fundamental and recurrent problem that arises in most computer science fields. This problem can be defined as follows: Let $D=\{x_1,x_2,\ldots,x_d\}$ be a set of $d$ words defined on an alphabet…
We consider string matching with variable length gaps. Given a string $T$ and a pattern $P$ consisting of strings separated by variable length gaps (arbitrary strings of length in a specified range), the problem is to find all ending…
The problem of approximate string matching is important in many different areas such as computational biology, text processing and pattern recognition. A great effort has been made to design efficient algorithms addressing several variants…
String consensus problems aim at finding a string that minimizes some given distance with respect to an input set of strings. In particular, in the Closest String problem, we are given a set of strings of equal length and a radius $d$. The…
We report (to our knowledge) the first evaluation of Constraint Satisfaction as a computational framework for solving closest string problems. We show that careful consideration of symbol occurrences can provide search heuristics that…
We study the complexity of the problem of searching for a set of patterns that separate two given sets of strings. This problem has applications in a wide variety of areas, most notably in data mining, computational biology, and in…
Described are two algorithms to find long approximate palindromes in a string, for example a DNA sequence. A simple algorithm requires $O(n)$ space and almost always runs in $O(k \cdot n)$ time, where $n$ is the length of the string and $k$ is the…
The Closest String Problem is an NP-hard problem that aims to find a string that has the minimum distance from all sequences that belong to the given set of strings. Its applications can be found in coding theory, computational biology, and…
String matching is the problem of deciding whether a given $n$-bit string contains a given $k$-bit pattern. We study the complexity of this problem in three settings. Communication complexity. For small $k$, we provide near-optimal upper…
In this paper we consider two problems concerning string factorisation. Specifically, given a string $w$ and an integer $k$, find a factorisation of $w$ where each factor has length bounded by $k$ and has the minimum (the FmD problem) or the…
We show that Closest Substring, one of the most important problems in the field of biological sequence analysis, is W[1]-hard when parameterized by the number $k$ of input strings (and remains so, even over a binary alphabet). This problem is…
In the Shortest-Superstring problem, we are given a set of strings S and want to find a string that contains all strings in S as substrings and has minimum length. This is a classical problem in approximation and the best known…
We consider a range of simply stated dynamic data structure problems on strings. An update changes one symbol in the input and a query asks us to compute some function of the pattern of length $m$ and a substring of a longer text. We give…
This study investigates whether reoptimization can help in solving the closest substring problem. We consider the following reoptimization scenario. Suppose we have an optimal $l$-length closest substring of a given set of sequences…
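
Several of the abstracts above state the Closest String objective: given equal-length strings over a fixed alphabet, find a string that minimizes the maximum Hamming distance to any input string. The following is a minimal illustrative sketch of that objective, not the method of any listed paper. It uses exhaustive search over all candidate strings, so it is only usable for tiny inputs; the function names `hamming` and `closest_string_bruteforce` are hypothetical and chosen here for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def closest_string_bruteforce(strings, alphabet):
    """Return (center, radius) minimizing the maximum Hamming distance.

    Enumerates all |alphabet|^length candidates, so the running time is
    exponential in the string length. This is an assumption-laden toy,
    intended only to make the problem definition concrete.
    """
    if not strings:
        raise ValueError("need at least one input string")
    length = len(strings[0])
    best, best_radius = None, length + 1
    for candidate in product(alphabet, repeat=length):
        # Radius of a candidate = distance to the farthest input string.
        radius = max(hamming(candidate, s) for s in strings)
        if radius < best_radius:
            best, best_radius = "".join(candidate), radius
    return best, best_radius


if __name__ == "__main__":
    center, radius = closest_string_bruteforce(["AAB", "ABA", "BAA"], "AB")
    print(center, radius)
```

The exhaustive enumeration is chosen purely for clarity; the hardness results mentioned in the abstracts are the reason practical approaches rely on approximation or parameterized algorithms instead.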