Related papers: An Improved Sketching Algorithm for Edit Distance
We show that in the document exchange problem, where Alice holds $x \in \{0,1\}^n$ and Bob holds $y \in \{0,1\}^n$, Alice can send Bob a message of size $O(K(\log^2 K+\log n))$ bits such that Bob can recover $x$ using the message and his…
Edit distance is an important measure of string similarity. It counts the number of insertions, deletions and substitutions one has to make to a string $x$ to get a string $y$. In this paper we design an almost linear-size sketching scheme…
We present an algorithm for approximating the edit distance $\operatorname{ed}(x, y)$ between two strings $x$ and $y$ in time parameterized by the degree to which one of the strings $x$ satisfies a natural pseudorandomness property. The…
We consider an efficient two-party protocol for securely computing the similarity of strings w.r.t. an extended edit distance measure. Here, two parties possessing strings $x$ and $y$, respectively, want to jointly compute an approximate…
The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. In this paper we…
The edit distance of two strings is the minimum number of insertions, deletions, and substitutions of characters needed to transform one string into the other. The textbook dynamic-programming algorithm computes the edit distance of two…
We study the robust communication complexity of maximum matching. Edges of an arbitrary $n$-vertex graph $G$ are randomly partitioned between Alice and Bob independently and uniformly. Alice has to send a single message to Bob such that Bob…
The shift distance $\mathsf{sh}(S_1,S_2)$ between two strings $S_1$ and $S_2$ of the same length is defined as the minimum Hamming distance between $S_1$ and any rotation (cyclic shift) of $S_2$. We study the problem of sketching the shift…
Suppose that we have two parties that possess each a binary string. Suppose that the length of the first string (document) is $n$ and that the two strings (documents) have edit distance (minimal number of deletes, inserts and substitutions…
We present novel randomized approximation schemes for the Edit Distance (ED) problem and the Longest Common Subsequence (LCS) problem that, for any constant $\epsilon>0$, compute a $(1+\epsilon)$-approximation for ED and a…
The edit distance $ed(X,Y)$ of two strings $X,Y\in \Sigma^*$ is the minimum number of character edits (insertions, deletions, and substitutions) needed to transform $X$ into $Y$. Its weighted counterpart $ed^w(X,Y)$ minimizes the total cost…
We study the problem of approximating edit distance in sublinear time. This is formalized as the $(k,k^c)$-Gap Edit Distance problem, where the input is a pair of strings $X,Y$ and parameters $k,c>1$, and the goal is to return YES if…
Given a pair of strings, the problems of computing their Longest Common Subsequence and Edit Distance have been extensively studied for decades. For exact algorithms, LCS and Edit Distance (with character insertions and deletions) are…
The edit distance (a.k.a. the Levenshtein distance) between two strings is defined as the minimum number of insertions, deletions or substitutions of symbols needed to transform one string into another. The problem of computing the edit…
The edit distance of two strings is the minimum number of insertions, deletions, and substitutions needed to transform one string into the other. The textbook algorithm determines the edit distance of length-$n$ strings in $O(n^2)$ time,…
The normalized edit distance is one of the distances derived from the edit distance. It is useful in some applications because it takes into account the lengths of the two strings compared. The normalized edit distance is not defined in…
In this paper we provide a new locally consistent decomposition of strings. Each string $x$ is decomposed into blocks that can be described by grammars of size $\widetilde{O}(k)$ (using some amount of randomness). If we take two strings $x$…
The edit distance (ED) and longest common subsequence (LCS) are two fundamental problems which quantify how similar two strings are to one another. In this paper, we consider these problems in the asymmetric streaming model introduced by…
We present an algorithm for approximating the edit distance between two strings of length $n$ in time $n^{1+\varepsilon}$ up to a constant factor, for any $\varepsilon>0$. Our result completes a research direction set forth in the recent…
Edit distance is a fundamental measure of distance between strings and has been widely studied in computer science. While the problem of estimating edit distance has been studied extensively, the equally important question of actually…