Approximate Circular Pattern Matching

Data Structures and Algorithms 2025-06-13 v2

Abstract

We consider approximate circular pattern matching (CPM, in short) under the Hamming and edit distance, in which we are given a length-$n$ text $T$, a length-$m$ pattern $P$, and a threshold $k>0$, and we are to report all starting positions of fragments of $T$ (called occurrences) that are at distance at most $k$ from some cyclic rotation of $P$. In the decision version of the problem, we are to check if any such occurrence exists. All previous results for approximate CPM were either average-case upper bounds or heuristics, except for the work of Charalampopoulos et al. [CKP$^+$, JCSS'21], who considered only the Hamming distance. For the reporting version of the approximate CPM problem, under the Hamming distance we improve upon the main algorithm of [CKP$^+$, JCSS'21] from ${\cal O}(n+(n/m)\cdot k^4)$ to ${\cal O}(n+(n/m)\cdot k^3)$ time; for the edit distance, we give an ${\cal O}(nk^2)$-time algorithm. We also consider the decision version of the approximate CPM problem. Under the Hamming distance, we obtain an ${\cal O}(n+(n/m)\cdot k^2\log k/\log\log k)$-time algorithm, which nearly matches the algorithm by Chan et al. [CGKKP, STOC'20] for the standard counterpart of the problem. Under the edit distance, the ${\cal O}(nk\log^2 k)$ running time of our algorithm nearly matches the ${\cal O}(nk)$ running time of the Landau-Vishkin algorithm [LV, J. Algorithms'89]. As a stepping stone, we propose an ${\cal O}(nk\log^2 k)$-time algorithm for the Longest Prefix $k'$-Approximate Match problem, proposed by Landau et al. [LMS, SICOMP'98], for all $k'\in \{1,\dots,k\}$. We give a conditional lower bound that suggests a polynomial separation between approximate CPM under the Hamming distance over the binary alphabet and its non-circular counterpart. We also show that a strongly subquadratic-time algorithm for the decision version of approximate CPM under edit distance would refute SETH.
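
To make the problem statement concrete, the following is a minimal brute-force sketch of the reporting and decision versions under the Hamming distance. It is not the algorithm of the paper: it simply compares every length-$m$ window of the text against every rotation of the pattern, so it runs in roughly ${\cal O}(n m^2)$ time. The function names and the choice of 0-based starting positions are assumptions made for illustration only.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def circular_occurrences_hamming(text: str, pattern: str, k: int) -> list[int]:
    """Reporting version (naive): 0-based starts i such that text[i:i+m]
    is within Hamming distance k of some cyclic rotation of pattern."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # All distinct cyclic rotations of the pattern.
    rotations = {pattern[r:] + pattern[:r] for r in range(m)}
    result = []
    for i in range(n - m + 1):
        window = text[i:i + m]
        if any(hamming(window, rot) <= k for rot in rotations):
            result.append(i)
    return result


def has_circular_occurrence_hamming(text: str, pattern: str, k: int) -> bool:
    """Decision version (naive): does any such occurrence exist?"""
    return bool(circular_occurrences_hamming(text, pattern, k))
```

For instance, with text `abcab` and pattern `bca`, every length-3 window is itself a rotation of the pattern, so all three starting positions are reported even with zero mismatches allowed.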

Cite

@article{arxiv.2208.08915,
  title  = {Approximate Circular Pattern Matching},
  author = {Panagiotis Charalampopoulos and Tomasz Kociumaka and Jakub Radoszewski and Solon P. Pissis and Wojciech Rytter and Tomasz Waleń and Wiktor Zuba},
  journal= {arXiv preprint arXiv:2208.08915},
  year   = {2025}
}

Comments

A preliminary version of this article was presented at ESA 2022. In this version, we have improved the exposition and improved some complexities by polylogarithmic factors. Abstract abridged to meet arXiv requirements.