Approximate Circular Pattern Matching under Edit Distance

Data Structures and Algorithms 2024-02-23 v1

Abstract

In the $k$-Edit Circular Pattern Matching ($k$-Edit CPM) problem, we are given a length-$n$ text $T$, a length-$m$ pattern $P$, and a positive integer threshold $k$, and we are to report all starting positions of the substrings of $T$ that are at edit distance at most $k$ from some cyclic rotation of $P$. In the decision version of the problem, we are to check if any such substring exists. Very recently, Charalampopoulos et al. [ESA 2022] presented $O(nk^2)$-time and $O(nk \log^3 k)$-time solutions for the reporting and decision versions of $k$-Edit CPM, respectively. Here, we show that the reporting and decision versions of $k$-Edit CPM can be solved in $O(n+(n/m) k^6)$ time and $O(n+(n/m) k^5 \log^3 k)$ time, respectively, thus obtaining the first algorithms with a complexity of the type $O(n+(n/m) \mathrm{poly}(k))$ for this problem. Notably, our algorithms run in $O(n)$ time when $m=\Omega(k^6)$ and are superior to the previous respective solutions when $m=\omega(k^4)$. We provide a meta-algorithm that yields efficient algorithms in several other interesting settings, such as when the strings are given in a compressed form (as straight-line programs), when the strings are dynamic, or when we have a quantum computer. We obtain our solutions by exploiting the structure of approximate circular occurrences of $P$ in $T$, when $T$ is relatively short w.r.t. $P$. Roughly speaking, either the starting positions of approximate occurrences of rotations of $P$ form $O(k^4)$ intervals that can be computed efficiently, or some rotation of $P$ is almost periodic (is at a small edit distance from a string with small period). Dealing with the almost periodic case is the most technically demanding part of this work; we tackle it using properties of locked fragments (originating from [Cole and Hariharan, SICOMP 2002]).
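To make the problem statement concrete, the following is a minimal brute-force sketch of the reporting version of $k$-Edit CPM. It is not the algorithm of the paper: it is a naive reference that tries every rotation of $P$ at every starting position using textbook dynamic programming for edit distance, and it runs in roughly $O(n \cdot m^2 \cdot (m+k))$ time. The function names `prefix_edit_distances` and `k_edit_cpm_bruteforce` are hypothetical, positions are 0-based, and the sketch assumes that only non-empty substrings of $T$ count as occurrences.

```python
def prefix_edit_distances(pattern, text):
    """Return a list d where d[j] is the edit distance between
    `pattern` and the prefix text[:j], for j = 0..len(text)."""
    # Distance between the empty pattern prefix and text[:j] is j.
    prev = list(range(len(text) + 1))
    for i, pc in enumerate(pattern, 1):
        cur = [i]  # distance between pattern[:i] and the empty prefix
        for j, tc in enumerate(text, 1):
            cur.append(min(
                prev[j] + 1,               # delete pc from the pattern
                cur[j - 1] + 1,            # insert tc into the pattern
                prev[j - 1] + (pc != tc),  # match or substitute
            ))
        prev = cur
    return prev


def k_edit_cpm_bruteforce(text, pattern, k):
    """Report every 0-based position i such that some non-empty substring
    of `text` starting at i is within edit distance k of a rotation of
    `pattern`. Naive reference only (assumption: not the paper's method)."""
    m = len(pattern)
    rotations = {pattern[r:] + pattern[:r] for r in range(m)}
    result = []
    for i in range(len(text)):
        # A string within edit distance k of a length-m string has
        # length at most m + k, so longer substrings can be ignored.
        window = text[i:i + m + k]
        if any(min(prefix_edit_distances(rot, window)[1:]) <= k
               for rot in rotations):
            result.append(i)
    return result
```

For instance, with `text = "xxbcaxx"` and `pattern = "abc"`, the call `k_edit_cpm_bruteforce(text, pattern, 0)` reports only position 2, where the rotation `bca` occurs exactly, while `k = 1` additionally reports positions 1 and 3 (via `xbca` and `ca`). A reference of this kind is mainly useful for checking a faster implementation on small inputs.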

Cite

@article{arxiv.2402.14550,
  title  = {Approximate Circular Pattern Matching under Edit Distance},
  author = {Panagiotis Charalampopoulos and Solon P. Pissis and Jakub Radoszewski and Wojciech Rytter and Tomasz Waleń and Wiktor Zuba},
  journal= {arXiv preprint arXiv:2402.14550},
  year   = {2024}
}

Comments

Full version of a paper accepted to STACS 2024