Related papers: Improved Algorithms for Approximate String Matchin…

Average-Case Optimal Approximate Circular String Matching

Approximate string matching is the problem of finding all factors of a text t of length n that are at a distance at most k from a pattern x of length m. Approximate circular string matching is the problem of finding all factors of t that…

Data Structures and Algorithms · Computer Science 2016-04-26 Carl Barton, Costas S. Iliopoulos, Solon P. Pissis

Approximate Similarity Search Under Edit Distance Using Locality-Sensitive Hashing

Edit distance similarity search, also called approximate pattern matching, is a fundamental problem with widespread database applications. The goal of the problem is to preprocess $n$ strings of length $d$, to quickly answer queries $q$ of…

Data Structures and Algorithms · Computer Science 2020-07-10 Samuel McCauley

Faster Approximate String Matching for Short Patterns

We study the classical approximate string matching problem, that is, given strings $P$ and $Q$ and an error threshold $k$, find all ending positions of substrings of $Q$ whose edit distance to $P$ is at most $k$. Let $P$ and $Q$ have…

Data Structures and Algorithms · Computer Science 2011-03-21 Philip Bille

Approximating Edit Distance in the Fully Dynamic Model

The edit distance is a fundamental measure of sequence similarity, defined as the minimum number of character insertions, deletions, and substitutions needed to transform one string into the other. Given two strings of length at most $n$,…

Data Structures and Algorithms · Computer Science 2023-07-17 Tomasz Kociumaka, Anish Mukherjee, Barna Saha

Faster Algorithm of String Comparison

In many applications, it is necessary to determine string similarity. The edit distance [WF74] approach is a classic method for determining field similarity. A well-known dynamic programming algorithm [GUS97] is used to calculate edit distance…

Data Structures and Algorithms · Computer Science 2007-05-23 Qi Xiao Yang, Sung Sam Yuan, Lu Chun, Li Zhao, Sun Peng

Unified Compression-Based Acceleration of Edit-Distance Computation

The edit distance problem is a classical fundamental problem in computer science in general, and in combinatorial pattern matching in particular. The standard dynamic programming solution for this problem computes the edit-distance between…

Data Structures and Algorithms · Computer Science 2016-10-05 Danny Hermelin, Gad M. Landau, Shir Landau, Oren Weimann

Approximating Text-to-Pattern Hamming Distances

We revisit a fundamental problem in string matching: given a pattern of length m and a text of length n, both over an alphabet of size $\sigma$, compute the Hamming distance between the pattern and the text at every location. Several…

Data Structures and Algorithms · Computer Science 2020-01-03 Timothy M. Chan, Shay Golan, Tomasz Kociumaka, Tsvi Kopelowitz, Ely Porat

Sublinear Algorithms for Gap Edit Distance

The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. A simple dynamic…

Computational Complexity · Computer Science 2019-10-03 Elazar Goldenberg, Robert Krauthgamer, Barna Saha

Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity

We present a near-linear time algorithm that approximates the edit distance between two strings within a polylogarithmic factor; specifically, for strings of length $n$ and every fixed $\epsilon>0$, it can compute a $(\log n)^{O(1/\epsilon)}$…

Data Structures and Algorithms · Computer Science 2010-05-24 Alexandr Andoni, Robert Krauthgamer, Krzysztof Onak

Approximate String Matching: Theory and Applications (La Recherche Approchée de Motifs : Théorie et Applications)

The approximate string matching is a fundamental and recurrent problem that arises in most computer science fields. This problem can be defined as follows: Let $D=\{x_1,x_2,\ldots x_d\}$ be a set of $d$ words defined on an alphabet…

Data Structures and Algorithms · Computer Science 2017-01-31 Ibrahim Chegrane

Fast and linear-time string matching algorithms based on the distances of $q$-gram occurrences

Given a text $T$ of length $n$ and a pattern $P$ of length $m$, the string matching problem is a task to find all occurrences of $P$ in $T$. In this study, we propose an algorithm that solves this problem in $O((n + m)q)$ time considering…

Data Structures and Algorithms · Computer Science 2020-04-14 Satoshi Kobayashi, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara

Reducing approximate Longest Common Subsequence to approximate Edit Distance

Given a pair of strings, the problems of computing their Longest Common Subsequence and Edit Distance have been extensively studied for decades. For exact algorithms, LCS and Edit Distance (with character insertions and deletions) are…

Data Structures and Algorithms · Computer Science 2019-04-12 Aviad Rubinstein, Zhao Song

On Estimating Edit Distance: Alignment, Dimension Reduction, and Embeddings

Edit distance is a fundamental measure of distance between strings and has been widely studied in computer science. While the problem of estimating edit distance has been studied extensively, the equally important question of actually…

Data Structures and Algorithms · Computer Science 2018-05-08 Moses Charikar, Ofir Geri, Michael P. Kim, William Kuszmaul

Approximating Edit Distance in Truly Subquadratic Time: Quantum and MapReduce

The edit distance between two strings is defined as the smallest number of insertions, deletions, and substitutions that need to be made to transform one of the strings to another one. Approximating edit distance in subquadratic time is…

Data Structures and Algorithms · Computer Science 2018-04-27 Mahdi Boroujeni, Soheil Ehsani, Mohammad Ghodsi, MohammadTaghi HajiAghayi, Saeed Seddighin

String Matching with Variable Length Gaps

We consider string matching with variable length gaps. Given a string $T$ and a pattern $P$ consisting of strings separated by variable length gaps (arbitrary strings of length in a specified range), the problem is to find all ending…

Data Structures and Algorithms · Computer Science 2011-10-14 Philip Bille, Inge Li Goertz, Hjalte Wedel Vildhøj, David Kofoed Wind

On The Closest String and Substring Problems

The problem of finding a center string that is 'close' to every given string arises and has many applications in computational biology and coding theory. This problem has two versions: the Closest String problem and the Closest Substring…

Computational Engineering, Finance, and Science · Computer Science 2007-05-23 Ming Li, Bin Ma, Lusheng Wang

Faster Sublinear-Time Edit Distance

We study the fundamental problem of approximating the edit distance of two strings. After an extensive line of research led to the development of a constant-factor approximation algorithm in almost-linear time, recent years have witnessed a…

Data Structures and Algorithms · Computer Science 2023-12-05 Karl Bringmann, Alejandro Cassis, Nick Fischer, Tomasz Kociumaka

MinJoin: Efficient Edit Similarity Joins via Local Hash Minima

We study the problem of computing similarity joins under edit distance on a set of strings. Edit similarity joins is a fundamental problem in databases, data mining and bioinformatics. It finds important applications in data cleaning and…

Databases · Computer Science 2019-05-30 Haoyu Zhang, Qin Zhang

Finding approximate palindromes in strings

We introduce a novel definition of approximate palindromes in strings, and provide an algorithm to find all maximal approximate palindromes in a string with up to $k$ errors. Our definition is based on the usual edit operations of…

Data Structures and Algorithms · Computer Science 2007-05-23 A. H. L. Porto, V. C. Barbosa

Faster Pattern Matching under Edit Distance

We consider the approximate pattern matching problem under the edit distance. Given a text $T$ of length $n$, a pattern $P$ of length $m$, and a threshold $k$, the task is to find the starting positions of all substrings of $T$ that can be…

Data Structures and Algorithms · Computer Science 2022-04-08 Panagiotis Charalampopoulos, Tomasz Kociumaka, Philip Wellnitz