Approximating Approximate Pattern Matching

Data Structures and Algorithms 2019-07-24 v3

Abstract

Given a text $T$ of length $n$ and a pattern $P$ of length $m$, the approximate pattern matching problem asks for the computation of a particular distance function between $P$ and every $m$-substring of $T$. We consider a $(1\pm\varepsilon)$ multiplicative approximation variant of this problem for the $\ell_p$ distance function. In this paper, we describe two $(1+\varepsilon)$-approximate algorithms with a runtime of $\widetilde{O}(\frac{n}{\varepsilon})$ for all (constant) non-negative values of $p$. For constant $p \ge 1$ we show a deterministic $(1+\varepsilon)$-approximation algorithm. Previously, such a runtime was known only for the case of $\ell_1$ distance, by Gawrychowski and Uznański [ICALP 2018], and only with a randomized algorithm. For constant $0 \le p \le 1$ we show a randomized algorithm for the $\ell_p$ distance, thereby providing a smooth tradeoff between the algorithms of Kopelowitz and Porat [FOCS 2015, SOSA 2018] for Hamming distance (the case of $p=0$) and of Gawrychowski and Uznański for $\ell_1$ distance.
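To make the problem statement concrete, the following is a minimal sketch of the exact, quadratic-time baseline that the approximation algorithms aim to beat. It is not the method of the paper. The function names `lp_distances` and `within_factor` are hypothetical, and the sketch assumes integer sequences, the usual definition $(\sum_i |x_i - y_i|^p)^{1/p}$ for $p > 0$, and the mismatch count (Hamming distance) for $p = 0$.

```python
def lp_distances(text, pattern, p):
    """Exact l_p distance between `pattern` and every m-substring of `text`.

    Naive O(n * m) baseline, shown only to illustrate the problem definition.
    Assumption: p = 0 means Hamming distance (number of mismatches).
    """
    n, m = len(text), len(pattern)
    distances = []
    for start in range(n - m + 1):
        window = text[start:start + m]
        if p == 0:
            # Hamming distance: count positions where the symbols differ.
            d = sum(1 for a, b in zip(window, pattern) if a != b)
        else:
            # l_p distance: p-th root of the sum of p-th powers of differences.
            d = sum(abs(a - b) ** p for a, b in zip(window, pattern)) ** (1.0 / p)
        distances.append(d)
    return distances


def within_factor(approx, exact, eps):
    """Check that every approximate value is within a (1 +/- eps) factor of the exact one."""
    return all((1 - eps) * e <= a <= (1 + eps) * e for a, e in zip(approx, exact))
```

For example, `lp_distances([1, 2, 3, 4], [2, 3], 1)` compares the pattern against the three windows `[1, 2]`, `[2, 3]`, and `[3, 4]`. A $(1\pm\varepsilon)$-approximate algorithm would be allowed to return any output for which `within_factor(output, exact, eps)` holds.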

Cite

@article{arxiv.1810.01676,
  title  = {Approximating Approximate Pattern Matching},
  author = {Jan Studený and Przemysław Uznański},
  journal = {arXiv preprint arXiv:1810.01676},
  year   = {2019}
}