Sublinear Algorithms for Gap Edit Distance
Abstract
The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. A simple dynamic programming computes the edit distance between two strings of length in time, and a more sophisticated algorithm runs in time when the edit distance is [Landau, Myers and Schmidt, SICOMP 1998]. In pursuit of obtaining faster running time, the last couple of decades have seen a flurry of research on approximating edit distance, including polylogarithmic approximation in near-linear time [Andoni, Krauthgamer and Onak, FOCS 2010], and a constant-factor approximation in subquadratic time [Chakrabarty, Das, Goldenberg, Kouck\'y and Saks, FOCS 2018]. We study sublinear-time algorithms for small edit distance, which was investigated extensively because of its numerous applications. Our main result is an algorithm for distinguishing whether the edit distance is at most or at least (the quadratic gap problem) in time . This time bound is sublinear roughly for all in , which was not known before. The best previous algorithms solve this problem in sublinear time only for [Andoni and Onak, STOC 2009]. Our algorithm is based on a new approach that adaptively switches between uniform sampling and reading contiguous blocks of the input strings. In contrast, all previous algorithms choose which coordinates to query non-adaptively. Moreover, it can be extended to solve the vs gap problem in time .
Cite
@article{arxiv.1910.00901,
title = {Sublinear Algorithms for Gap Edit Distance},
author = {Elazar Goldenberg and Robert Krauthgamer and Barna Saha},
journal= {arXiv preprint arXiv:1910.00901},
year = {2019}
}