Related papers: Algorithm to derive shortest edit script using Lev…

Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)

The edit distance (a.k.a. the Levenshtein distance) between two strings is defined as the minimum number of insertions, deletions or substitutions of symbols needed to transform one string into another. The problem of computing the edit…

Computational Complexity · Computer Science · 2017-08-17 · Arturs Backurs, Piotr Indyk

Learning string edit distance

In many applications, it is necessary to determine the similarity of two strings. A widely-used notion of string similarity is the edit distance: the minimum number of insertions, deletions, and substitutions required to transform one…

cmp-lg · Computer Science · 2008-02-03 · Eric Sven Ristad, Peter N. Yianilos

Improved Algorithms for Approximate String Matching (Extended Abstract)

The problem of approximate string matching is important in many different areas such as computational biology, text processing and pattern recognition. A great effort has been made to design efficient algorithms addressing several variants…

Data Structures and Algorithms · Computer Science · 2008-07-29 · Dimitris Papamichail, Georgios Papamichail

A New String Edit Distance and Applications

String edit distances have been used for decades in applications ranging from spelling correction and web search suggestions to DNA analysis. Most string edit distances are variations of the Levenshtein distance and consider only…

Genomics · Quantitative Biology · 2022-05-12 · Taylor Petty, Jan Hannig, Tunde I Huszar, Hari Iyer

Reducing approximate Longest Common Subsequence to approximate Edit Distance

Given a pair of strings, the problems of computing their Longest Common Subsequence and Edit Distance have been extensively studied for decades. For exact algorithms, LCS and Edit Distance (with character insertions and deletions) are…

Data Structures and Algorithms · Computer Science · 2019-04-12 · Aviad Rubinstein, Zhao Song

Sublinear Algorithms for Gap Edit Distance

The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. A simple dynamic…

Computational Complexity · Computer Science · 2019-10-03 · Elazar Goldenberg, Robert Krauthgamer, Barna Saha

Streaming Algorithms For Computing Edit Distance Without Exploiting Suffix Trees

The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. In this paper we…

Data Structures and Algorithms · Computer Science · 2016-07-14 · Diptarka Chakraborty, Elazar Goldenberg, Michal Koucký

Approximating Edit Distance in the Fully Dynamic Model

The edit distance is a fundamental measure of sequence similarity, defined as the minimum number of character insertions, deletions, and substitutions needed to transform one string into the other. Given two strings of length at most $n$,…

Data Structures and Algorithms · Computer Science · 2023-07-17 · Tomasz Kociumaka, Anish Mukherjee, Barna Saha

Many Flavors of Edit Distance

Several measures exist for string similarity, including notable ones like the edit distance and the indel distance. The former measures the count of insertions, deletions, and substitutions required to transform one string into another,…

Data Structures and Algorithms · Computer Science · 2024-10-15 · Sudatta Bhattacharya, Sanjana Dey, Elazar Goldenberg, Michal Koucký

Identifying document similarity using a fast estimation of the Levenshtein Distance based on compression and signatures

Identifying document similarity has many applications, e.g., source code analysis or plagiarism detection. However, identifying similarities is not trivial and can be time complex. For instance, the Levenshtein Distance is a common metric…

Information Retrieval · Computer Science · 2023-07-24 · Peter Coates, Frank Breitinger

Unified Compression-Based Acceleration of Edit-Distance Computation

The edit distance problem is a classical fundamental problem in computer science in general, and in combinatorial pattern matching in particular. The standard dynamic programming solution for this problem computes the edit-distance between…

Data Structures and Algorithms · Computer Science · 2016-10-05 · Danny Hermelin, Gad M. Landau, Shir Landau, Oren Weimann

Approximating Edit Distance Within Constant Factor in Truly Sub-Quadratic Time

Edit distance is a measure of similarity of two strings based on the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. The edit distance can be computed exactly using a…

Data Structures and Algorithms · Computer Science · 2021-02-17 · Diptarka Chakraborty, Debarati Das, Elazar Goldenberg, Michal Koucky, Michael Saks

Approximation Schemes for Edit Distance and LCS in Quasi-Strongly Subquadratic Time

We present novel randomized approximation schemes for the Edit Distance (ED) problem and the Longest Common Subsequence (LCS) problem that, for any constant $\epsilon>0$, compute a $(1+\epsilon)$-approximation for ED and a…

Data Structures and Algorithms · Computer Science · 2026-04-01 · Xiao Mao, Aviad Rubinstein

Quadratic Conditional Lower Bounds for String Problems and Dynamic Time Warping

Classic similarity measures of strings are longest common subsequence and Levenshtein distance (i.e., the classic edit distance). A classic similarity measure of curves is dynamic time warping. These measures can be computed by simple…

Computational Complexity · Computer Science · 2015-04-06 · Karl Bringmann, Marvin Künnemann

Faster Algorithm of String Comparison

In many applications, it is necessary to determine string similarity. The edit distance [WF74] approach is a classic method to determine field similarity. A well-known dynamic programming algorithm [GUS97] is used to calculate edit distance…

Data Structures and Algorithms · Computer Science · 2007-05-23 · Qi Xiao Yang, Sung Sam Yuan, Lu Chun, Li Zhao, Sun Peng

Approximating the Geometric Edit Distance

Edit distance is a measurement of similarity between two sequences such as strings, point sequences, or polygonal curves. Many matching problems from a variety of areas, such as signal analysis, bioinformatics, etc., need to be solved in a…

Computational Geometry · Computer Science · 2020-09-10 · Kyle Fox, Xinyi Li

Language Edit Distance & Scored Parsing: Faster Algorithms & Connection to Fundamental Graph Problems

Given a context-free language $\mathcal{L(G)}$ over alphabet $\Sigma$ and a string $s \in \Sigma^*$, the language edit distance problem seeks the minimum number of edits (insertions, deletions and substitutions) required to convert…

Data Structures and Algorithms · Computer Science · 2024-10-25 · Tomasz Kociumaka, Barna Saha

Approximate Trace Reconstruction

In the usual trace reconstruction problem, the goal is to exactly reconstruct an unknown string of length $n$ after it passes through a deletion channel many times independently, producing a set of traces (i.e., random subsequences of the…

Data Structures and Algorithms · Computer Science · 2020-12-17 · Sami Davies, Miklos Z. Racz, Cyrus Rashtchian, Benjamin G. Schiffer

Space Efficient Deterministic Approximation of String Measures

We study approximation algorithms for the following three string measures that are widely used in practice: edit distance (ED), longest common subsequence (LCS), and longest increasing sequence (LIS). All three problems can be solved…

Data Structures and Algorithms · Computer Science · 2020-07-28 · Kuan Cheng, Zhengzhong Jin, Xin Li, Yu Zheng

Approximating Edit Distance in Truly Subquadratic Time: Quantum and MapReduce

The edit distance between two strings is defined as the smallest number of insertions, deletions, and substitutions that need to be made to transform one of the strings to another one. Approximating edit distance in subquadratic time is…

Data Structures and Algorithms · Computer Science · 2018-04-27 · Mahdi Boroujeni, Soheil Ehsani, Mohammad Ghodsi, MohammadTaghi HajiAghayi, Saeed Seddighin