Cartesian Tree Subsequence Matching

Data Structures and Algorithms 2022-04-18 v2

Abstract

Park et al. [TCS 2020] observed that the similarity between two (numerical) strings can be captured by their Cartesian trees: the Cartesian tree of a string is a binary tree constructed recursively by picking the smallest value of the string as the root of the tree. Two strings of equal length are said to Cartesian-tree match if their Cartesian trees are isomorphic. Park et al. [TCS 2020] introduced the following Cartesian tree substring matching (CTMStr) problem: given a text string $T$ of length $n$ and a pattern string $P$ of length $m$, find every consecutive substring $S = T[i..j]$ of $T$ such that $S$ and $P$ Cartesian-tree match. They showed how to solve this problem in $\tilde{O}(n+m)$ time. In this paper, we introduce the Cartesian tree subsequence matching (CTMSeq) problem, which asks to find every minimal substring $S = T[i..j]$ of $T$ such that $S$ contains a subsequence $S'$ that Cartesian-tree matches $P$. We prove that the CTMSeq problem can be solved efficiently, in $O(m n p(n))$ time, where $p(n)$ denotes the update/query time for dynamic predecessor queries. By using a suitable dynamic predecessor data structure, we obtain an $O(mn \log \log n)$-time and $O(n \log m)$-space solution for CTMSeq. This contrasts CTMSeq with the closely related order-preserving subsequence matching (OPMSeq) problem, which was shown to be NP-hard by Bose et al. [IPL 1998].
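
To make the definitions above concrete, the following is a minimal Python sketch, not the algorithm of the paper. It builds the Cartesian tree of a string as a parent array using the standard stack-based construction, tests whether two strings Cartesian-tree match, and enumerates the minimal intervals of the CTMSeq problem by exponential-time brute force. The function names (`cartesian_tree_parents`, `ct_match`, `minimal_match_intervals`), the 0-based indexing, and the tie-breaking rule (the leftmost minimum becomes the root) are assumptions made for illustration only.

```python
from itertools import combinations


def cartesian_tree_parents(s):
    """Return parent[i] for every position i of s (-1 marks the root).

    Assumption: among equal values, the leftmost one is closer to the root.
    """
    parent = [-1] * len(s)
    stack = []  # positions on the rightmost path, values non-decreasing
    for i, x in enumerate(s):
        last = -1
        # Pop strictly larger values; they end up in the left subtree of i.
        while stack and s[stack[-1]] > x:
            last = stack.pop()
        if last != -1:
            parent[last] = i
        if stack:
            parent[i] = stack[-1]
        stack.append(i)
    return parent


def ct_match(a, b):
    """True if a and b have equal length and identically shaped Cartesian trees."""
    return len(a) == len(b) and cartesian_tree_parents(a) == cartesian_tree_parents(b)


def minimal_match_intervals(text, pattern):
    """Brute-force CTMSeq: all minimal (i, j), 0-based and inclusive, such that
    text[i..j] contains a subsequence that Cartesian-tree matches pattern.

    Exponential time; intended only to illustrate the problem statement.
    """
    m = len(pattern)
    if m == 0:
        return []
    target = cartesian_tree_parents(pattern)
    spans = set()
    for idx in combinations(range(len(text)), m):
        if cartesian_tree_parents([text[k] for k in idx]) == target:
            spans.add((idx[0], idx[-1]))
    # Keep only intervals that contain no other matching interval.
    return sorted(
        (i, j)
        for (i, j) in spans
        if not any((a, b) != (i, j) and i <= a and b <= j for (a, b) in spans)
    )
```

For example, with text `[5, 1, 3, 2, 4]` and pattern `[2, 1, 3]`, the pattern's tree has its middle element as the root, and `minimal_match_intervals` returns `[(0, 2), (2, 4)]`, corresponding to the subsequences `5, 1, 3` and `3, 2, 4`.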

Cite

@article{arxiv.2202.04349,
  title  = {Cartesian Tree Subsequence Matching},
  author = {Tsubasa Oizumi and Takeshi Kai and Takuya Mieno and Shunsuke Inenaga and Hiroki Arimura},
  journal= {arXiv preprint arXiv:2202.04349},
  year   = {2022}
}