Quantum Algorithm for the Multiple String Matching Problem

Quantum Physics 2024-11-25 v1 Data Structures and Algorithms

Abstract

We consider the Multiple String Matching Problem. The input is a long string $t$ of length $n$, referred to as the text, and a sequence $S$ of $m$ strings, referred to as the dictionary. The total length of all strings in the dictionary is denoted by $L$. The objective is to find all occurrences of dictionary strings in the text. The standard classical solution is the Aho-Corasick algorithm, which has $O(n+L)$ query and time complexity. The classical lower bound for the problem is the same, $\Omega(n+L)$. We propose a quantum algorithm with $O(n+\sqrt{mL\log n}+m\log n)$ query complexity and $O(n+\sqrt{mL\log n}\log b+m\log n)=O^*(n+\sqrt{mL})$ time complexity, where $b$ is the maximal length of a string in the dictionary. The improvement is particularly significant for dictionaries consisting of long words. The complexity of our algorithm matches the quantum lower bound $\Omega(n+\sqrt{mL})$ up to a logarithmic factor. In some sense, our algorithm can be viewed as a quantum analogue of the Aho-Corasick algorithm.
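For reference, the classical baseline named in the abstract can be sketched as follows. This is a minimal, illustrative Aho-Corasick implementation in Python, not the quantum algorithm proposed in the paper. The function names `build_automaton` and `find_all` are hypothetical, and the sketch assumes all dictionary strings are non-empty.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie, failure links, and output lists for the dictionary."""
    goto = [{}]   # goto[state][char] -> next state (trie edges)
    fail = [0]    # fail[state] -> state of the longest proper suffix in the trie
    out = [[]]    # out[state] -> indices of patterns ending at this state

    # Insert every pattern into the trie: time linear in the total length.
    for idx, pat in enumerate(patterns):
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(idx)

    # Breadth-first traversal to compute failure links level by level.
    queue = deque(goto[0].values())  # depth-1 states keep fail = 0
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure target (proper suffixes).
            out[nxt] = out[nxt] + out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out


def find_all(text, patterns):
    """Return (start_index, pattern) for every dictionary occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    matches = []
    state = 0
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists or we hit the root.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for idx in out[state]:
            matches.append((i - len(patterns[idx]) + 1, patterns[idx]))
    return matches


if __name__ == "__main__":
    print(find_all("ushers", ["he", "she", "his", "hers"]))
```

The scan reads each text character once and the trie is built in time linear in the total dictionary length, which corresponds to the linear classical bound quoted above; reporting the matches themselves adds time proportional to their number.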

Cite

@article{arxiv.2411.14850,
  title  = {Quantum Algorithm for the Multiple String Matching Problem},
  author = {Kamil Khadiev and Danil Serov},
  journal= {arXiv preprint arXiv:2411.14850},
  year   = {2024}
}

Comments

The paper has been accepted at the SOFSEM 2025 conference.