Related papers: A Unified Algorithm for Accelerating Edit-Distance…

Unified Compression-Based Acceleration of Edit-Distance Computation

The edit distance problem is a classical fundamental problem in computer science in general, and in combinatorial pattern matching in particular. The standard dynamic programming solution for this problem computes the edit-distance between…

Data Structures and Algorithms · Computer Science 2016-10-05 Danny Hermelin, Gad M. Landau, Shir Landau, Oren Weimann

How Compression and Approximation Affect Efficiency in String Distance Measures

Real-world data often comes in compressed form. Analyzing compressed data directly (without decompressing it) can save space and time by orders of magnitude. In this work, we focus on fundamental sequence comparison problems and try to…

Data Structures and Algorithms · Computer Science 2021-12-14 Arun Ganesh, Tomasz Kociumaka, Andrea Lincoln, Barna Saha

Streaming Algorithms For Computing Edit Distance Without Exploiting Suffix Trees

The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. In this paper we…

Data Structures and Algorithms · Computer Science 2016-07-14 Diptarka Chakraborty, Elazar Goldenberg, Michal Koucký

Indexed Dynamic Programming to boost Edit Distance and LCSS Computation

There are efficient dynamic programming solutions to the computation of the Edit Distance from $S\in[1..\sigma]^n$ to $T\in[1..\sigma]^m$, for many natural subsets of edit operations, typically in time within $O(nm)$ in the worst-case over…

Information Retrieval · Computer Science 2018-06-13 Jérémy Barbay, Andrés Olivares

Does Preprocessing help in Fast Sequence Comparisons?

We study edit distance computation with preprocessing: the preprocessing algorithm acts on each string separately, and then the query algorithm takes as input the two preprocessed strings. This model is inspired by scenarios where we would…

Data Structures and Algorithms · Computer Science 2021-08-23 Elazar Goldenberg, Aviad Rubinstein, Barna Saha

Optimal Algorithms for Bounded Weighted Edit Distance

The edit distance of two strings is the minimum number of insertions, deletions, and substitutions of characters needed to transform one string into the other. The textbook dynamic-programming algorithm computes the edit distance of two…

Data Structures and Algorithms · Computer Science 2023-10-25 Alejandro Cassis, Tomasz Kociumaka, Philip Wellnitz

Faster Algorithm of String Comparison

In many applications, it is necessary to determine the string similarity. Edit distance [WF74] approach is a classic method to determine Field Similarity. A well known dynamic programming algorithm [GUS97] is used to calculate edit distance…

Data Structures and Algorithms · Computer Science 2007-05-23 Qi Xiao Yang, Sung Sam Yuan, Lu Chun, Li Zhao, Sun Peng

Improved Algorithms for Approximate String Matching (Extended Abstract)

The problem of approximate string matching is important in many different areas such as computational biology, text processing and pattern recognition. A great effort has been made to design efficient algorithms addressing several variants…

Data Structures and Algorithms · Computer Science 2008-07-29 Dimitris Papamichail, Georgios Papamichail

A Simple Sublinear Algorithm for Gap Edit Distance

We study the problem of estimating the edit distance between two $n$-character strings. While exact computation in the worst case is believed to require near-quadratic time, previous work showed that in certain regimes it is possible to…

Data Structures and Algorithms · Computer Science 2020-07-29 Joshua Brakensiek, Moses Charikar, Aviad Rubinstein

RLE edit distance in near optimal time

We show that the edit distance between two run-length encoded strings of compressed lengths $m$ and $n$ respectively, can be computed in $\mathcal{O}(mn\log(mn))$ time. This improves the previous record by a factor of…

Data Structures and Algorithms · Computer Science 2019-05-06 Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, Przemysław Uznański

Sublinear Algorithms for Gap Edit Distance

The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. A simple dynamic…

Computational Complexity · Computer Science 2019-10-03 Elazar Goldenberg, Robert Krauthgamer, Barna Saha

Near-Linear Time Insertion-Deletion Codes and (1+$\varepsilon$)-Approximating Edit Distance via Indexing

We introduce fast-decodable indexing schemes for edit distance which can be used to speed up edit distance computations to near-linear time if one of the strings is indexed by an indexing string $I$. In particular, for every length $n$ and…

Data Structures and Algorithms · Computer Science 2019-04-11 Bernhard Haeupler, Aviad Rubinstein, Amirbehshad Shahrasbi

Weighted Edit Distance Computation: Strings, Trees and Dyck

Given two strings of length $n$ over alphabet $\Sigma$, and an upper bound $k$ on their edit distance, the algorithm of Myers (Algorithmica'86) and Landau and Vishkin (JCSS'88) computes the unweighted string edit distance in…

Data Structures and Algorithms · Computer Science 2023-02-09 Debarati Das, Jacob Gilbert, MohammadTaghi Hajiaghayi, Tomasz Kociumaka, Barna Saha

Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity

We present a near-linear time algorithm that approximates the edit distance between two strings within a polylogarithmic factor; specifically, for strings of length $n$ and every fixed $\varepsilon>0$, it can compute a $(\log n)^{O(1/\varepsilon)}$…

Data Structures and Algorithms · Computer Science 2010-05-24 Alexandr Andoni, Robert Krauthgamer, Krzysztof Onak

Edit Distance in Near-Linear Time: it's a Constant Factor

We present an algorithm for approximating the edit distance between two strings of length $n$ in time $n^{1+\varepsilon}$ up to a constant factor, for any $\varepsilon>0$. Our result completes a research direction set forth in the recent…

Data Structures and Algorithms · Computer Science 2022-07-18 Alexandr Andoni, Negev Shekel Nosatzki

Faster Sublinear-Time Edit Distance

We study the fundamental problem of approximating the edit distance of two strings. After an extensive line of research led to the development of a constant-factor approximation algorithm in almost-linear time, recent years have witnessed a…

Data Structures and Algorithms · Computer Science 2023-12-05 Karl Bringmann, Alejandro Cassis, Nick Fischer, Tomasz Kociumaka

Approximating Edit Distance in Near-Linear Time

We show how to compute the edit distance between two strings of length $n$ up to a factor of $2^{\tilde{O}(\sqrt{\log n})}$ in $n^{1+o(1)}$ time. This is the first sub-polynomial approximation algorithm for this problem that runs in near-linear time,…

Data Structures and Algorithms · Computer Science 2011-09-27 Alexandr Andoni, Krzysztof Onak

Approximating Edit Distance in the Fully Dynamic Model

The edit distance is a fundamental measure of sequence similarity, defined as the minimum number of character insertions, deletions, and substitutions needed to transform one string into the other. Given two strings of length at most $n$,…

Data Structures and Algorithms · Computer Science 2023-07-17 Tomasz Kociumaka, Anish Mukherjee, Barna Saha

Approximating Edit Distance Within Constant Factor in Truly Sub-Quadratic Time

Edit distance is a measure of similarity of two strings based on the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. The edit distance can be computed exactly using a…

Data Structures and Algorithms · Computer Science 2021-02-17 Diptarka Chakraborty, Debarati Das, Elazar Goldenberg, Michal Koucky, Michael Saks

Near-Optimal Quantum Algorithms for Bounded Edit Distance and Lempel-Ziv Factorization

Classically, the edit distance of two length-$n$ strings can be computed in $O(n^2)$ time, whereas an $O(n^{2-\epsilon})$-time procedure would falsify the Orthogonal Vectors Hypothesis. If the edit distance does not exceed $k$, the running…

Data Structures and Algorithms · Computer Science 2023-11-06 Daniel Gibney, Ce Jin, Tomasz Kociumaka, Sharma V. Thankachan