Related papers: Faster Binary Mean Computation Under Dynamic Time …
Given a set of strings over a specified alphabet, identifying a median or consensus string that minimizes the total distance to all input strings is a fundamental data aggregation problem. When the Hamming distance is considered as the…
We study variants of the mean problem under the $p$-Dynamic Time Warping ($p$-DTW) distance, a popular and robust distance measure for sequential data. In our setting we are given a set of finite point sequences over an arbitrary metric…
String consensus problems aim at finding a string that minimizes some given distance with respect to an input set of strings. In particular, in the Closest string problem, we are given a set of strings of equal length and a radius $d$. The…
We study the fundamental problem of finding the best string to represent a given set, in the form of the Closest String problem: Given a set $X \subseteq \Sigma^d$ of $n$ strings, find the string $x^*$ minimizing the radius of the smallest…
Consensus problems for strings and sequences appear in numerous application contexts, ranging from bioinformatics over data mining to machine learning. Closing some gaps in the literature, we show that several fundamental problems in this…
Classic similarity measures of strings are longest common subsequence and Levenshtein distance (i.e., the classic edit distance). A classic similarity measure of curves is dynamic time warping. These measures can be computed by simple…
The closest string problem is an NP-hard problem, whose task is to find a string that minimizes maximum Hamming distance to a given set of strings. This can be reduced to an integer program (IP). However, to date, there exists no known…
In this paper we consider the $p$-Norm Hamming Centroid problem which asks to determine whether some given binary strings have a centroid with a bound on the $p$-norm of its Hamming distances to the strings. Specifically, given a set of…
Strings are a natural representation of biological data such as DNA, RNA and protein sequences. The problem of finding a string that summarizes a set of sequences has direct application in relative compression algorithms for genome and…
Weighted Hamming distance, as a similarity measure between binary codes and binary queries, provides superior accuracy in search tasks than Hamming distance. However, how to efficiently and accurately find $K$ binary codes that have the…
Dynamic time warping constitutes a major tool for analyzing time series. In particular, computing a mean series of a given sample of series in dynamic time warping spaces (by minimizing the Fr\'echet function) is a challenging computational…
The Closest String Problem is an NP-hard problem that aims to find a string that has the minimum distance from all sequences that belong to the given set of strings. Its applications can be found in coding theory, computational biology, and…
In the Manhattan Sequence Consensus problem (MSC problem) we are given $k$ integer sequences, each of length $l$, and we are to find an integer sequence $x$ of length $l$ (called a consensus sequence), such that the maximum Manhattan…
We study approximation algorithms for variants of the \emph{median string} problem, which asks for a string that minimizes the sum of edit distances from a given set of $m$ strings of length $n$. Only the straightforward $2$-approximation…
String matching is the problem of finding all the substrings of a text which match a given pattern. It is one of the most investigated problems in computer science, mainly due to its very diverse applications in several fields. Recently,…
In this work, we consider the problem of pattern matching under the dynamic time warping (DTW) distance motivated by potential applications in the analysis of biological data produced by the third generation sequencing. To measure the DTW…
The dynamic time warping (dtw) distance is an established tool for mining time series data. The DTW-Mean problem consists of computing a series which minimizes the so-called Fr\'echet function, that is, the sum of squared dtw-distances to a…
We give an $\tilde O(n^2)$ time algorithm for computing the exact Dynamic Time Warping distance between two strings whose run-length encoding is of size at most $n$. This matches (up to log factors) the known (conditional) lower bound, and…
The problem of approximate string matching is important in many different areas such as computational biology, text processing and pattern recognition. A great effort has been made to design efficient algorithms addressing several variants…
Dynamic Time Warping (DTW) is a widely used similarity measure for comparing strings that encode time series data, with applications to areas including bioinformatics, signature verification, and speech recognition. The standard…