Related papers: Improved Sublinear-Time Edit Distance for Preproce…
We study the fundamental problem of approximating the edit distance of two strings. After an extensive line of research led to the development of a constant-factor approximation algorithm in almost-linear time, recent years have witnessed a…
We study the problem of estimating the edit distance between two $n$-character strings. While exact computation in the worst case is believed to require near-quadratic time, previous work showed that in certain regimes it is possible to…
We revisit the task of computing the edit distance in sublinear time. In the $(k,K)$-gap edit distance problem the task is to distinguish whether the edit distance of two strings is at most $k$ or at least $K$. It has been established by…
In this paper, we design new sublinear-time algorithms for solving the gap edit distance problem and for embedding edit distance to Hamming distance. For the gap edit distance problem, we give an $\tilde{O}(\frac{n}{k}+k^2)$-time greedy…
The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. A simple dynamic…
We study edit distance computation with preprocessing: the preprocessing algorithm acts on each string separately, and then the query algorithm takes as input the two preprocessed strings. This model is inspired by scenarios where we would…
We study the problem of approximating edit distance in sublinear time. This is formalized as the $(k,k^c)$-Gap Edit Distance problem, where the input is a pair of strings $X,Y$ and parameters $k,c>1$, and the goal is to return YES if…
The edit distance is a fundamental measure of sequence similarity, defined as the minimum number of character insertions, deletions, and substitutions needed to transform one string into the other. Given two strings of length at most $n$,…
The problem of approximate string matching is important in many different areas such as computational biology, text processing and pattern recognition. A great effort has been made to design efficient algorithms addressing several variants…
We present an algorithm for approximating the edit distance between two strings of length $n$ in time $n^{1+\varepsilon}$ up to a constant factor, for any $\varepsilon>0$. Our result completes a research direction set forth in the recent…
We show how to compute the edit distance between two strings of length n up to a factor of 2^{\~O(sqrt(log n))} in n^(1+o(1)) time. This is the first sub-polynomial approximation algorithm for this problem that runs in near-linear time,…
The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. In this paper we…
The edit distance of two strings is the minimum number of insertions, deletions, and substitutions needed to transform one string into the other. The textbook algorithm determines the edit distance of length-$n$ strings in $O(n^2)$ time,…
We consider the problem of preprocessing two strings $S$ and $T$, of lengths $m$ and $n$, respectively, in order to be able to efficiently answer the following queries: Given positions $i,j$ in $S$ and positions $a,b$ in $T$, return the…
The edit distance of two strings is the minimum number of insertions, deletions, and substitutions of characters needed to transform one string into the other. The textbook dynamic-programming algorithm computes the edit distance of two…
The edit distance problem is a classical fundamental problem in computer science in general, and in combinatorial pattern matching in particular. The standard dynamic programming solution for this problem computes the edit-distance between…
We present a near-linear time algorithm that approximates the edit distance between two strings within a polylogarithmic factor; specifically, for strings of length n and every fixed epsilon>0, it can compute a (log n)^O(1/epsilon)…
Real-world data often comes in compressed form. Analyzing compressed data directly (without decompressing it) can save space and time by orders of magnitude. In this work, we focus on fundamental sequence comparison problems and try to…
Given two strings of length $n$ over alphabet $\Sigma$, and an upper bound $k$ on their edit distance, the algorithm of Myers (Algorithmica'86) and Landau and Vishkin (JCSS'88) computes the unweighted string edit distance in…
Edit distance is a fundamental measure of distance between strings and has been widely studied in computer science. While the problem of estimating edit distance has been studied extensively, the equally important question of actually…