English

Sublinear Algorithms for Gap Edit Distance

Computational Complexity 2019-10-03 v1 Data Structures and Algorithms

Abstract

The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. A simple dynamic programming computes the edit distance between two strings of length nn in O(n2)O(n^2) time, and a more sophisticated algorithm runs in time O(n+t2)O(n+t^2) when the edit distance is tt [Landau, Myers and Schmidt, SICOMP 1998]. In pursuit of obtaining faster running time, the last couple of decades have seen a flurry of research on approximating edit distance, including polylogarithmic approximation in near-linear time [Andoni, Krauthgamer and Onak, FOCS 2010], and a constant-factor approximation in subquadratic time [Chakrabarty, Das, Goldenberg, Kouck\'y and Saks, FOCS 2018]. We study sublinear-time algorithms for small edit distance, which was investigated extensively because of its numerous applications. Our main result is an algorithm for distinguishing whether the edit distance is at most tt or at least t2t^2 (the quadratic gap problem) in time O~(nt+t3)\tilde{O}(\frac{n}{t}+t^3). This time bound is sublinear roughly for all tt in [ω(1),o(n1/3)][\omega(1), o(n^{1/3})], which was not known before. The best previous algorithms solve this problem in sublinear time only for t=ω(n1/3)t=\omega(n^{1/3}) [Andoni and Onak, STOC 2009]. Our algorithm is based on a new approach that adaptively switches between uniform sampling and reading contiguous blocks of the input strings. In contrast, all previous algorithms choose which coordinates to query non-adaptively. Moreover, it can be extended to solve the tt vs t2ϵt^{2-\epsilon} gap problem in time O~(nt1ϵ+t3)\tilde{O}(\frac{n}{t^{1-\epsilon}}+t^3).

Keywords

Cite

@article{arxiv.1910.00901,
  title  = {Sublinear Algorithms for Gap Edit Distance},
  author = {Elazar Goldenberg and Robert Krauthgamer and Barna Saha},
  journal= {arXiv preprint arXiv:1910.00901},
  year   = {2019}
}