The Longest Subsequence-Repeated Subsequence Problem

Data Structures and Algorithms 2023-09-01 v3 Computational Complexity

Abstract

Motivated by computing duplication patterns in sequences, a new fundamental problem called the longest subsequence-repeated subsequence (LSRS) is proposed. Given a sequence $S$ of length $n$, a subsequence-repeated subsequence is a subsequence of $S$ in the form of $x_1^{d_1}x_2^{d_2}\cdots x_k^{d_k}$ with $x_i$ a subsequence of $S$, $x_j\neq x_{j+1}$ and $d_i\geq 2$ for all $i$ in $[k]$ and $j$ in $[k-1]$. We first present an $O(n^6)$ time algorithm to compute the longest cubic subsequences of all the $O(n^2)$ substrings of $S$, improving the trivial $O(n^7)$ bound. Then, an $O(n^6)$ time algorithm for computing the longest subsequence-repeated subsequence (LSRS) of $S$ is obtained. Finally, we focus on two variants of this problem. We first consider the constrained version when $\Sigma$ is unbounded, each letter appears in $S$ at most $d$ times and all the letters in $\Sigma$ must appear in the solution. We show that the problem is NP-hard for $d=4$, via a reduction from a special version of SAT (which is obtained from 3-COLORING). We then show that when each letter appears in $S$ at most $d=3$ times, the problem is solvable in $O(n^5)$ time.
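To make the definition concrete, the following is a minimal Python sketch, not the paper's $O(n^6)$ algorithm. It checks whether a string can be written as $x_1^{d_1}\cdots x_k^{d_k}$ with every $d_i\geq 2$, and then finds a longest such subsequence by exhaustive search, which takes exponential time and is only usable on very short inputs. The function names `is_repeat`, `is_subsequence_repeated`, and `brute_force_lsrs` are hypothetical and chosen for illustration. The sketch does not test the condition $x_j\neq x_{j+1}$ explicitly, on the assumption that it never affects feasibility: two equal adjacent blocks $x^a x^b$ can always be merged into $x^{a+b}$.

```python
from itertools import combinations


def is_repeat(w: str) -> bool:
    """True if w == x * d for some non-empty string x and integer d >= 2."""
    m = len(w)
    return any(m % p == 0 and w[:p] * (m // p) == w for p in range(1, m // 2 + 1))


def is_subsequence_repeated(t: str) -> bool:
    """True if t splits into consecutive blocks, each of the form x^d with d >= 2.

    ok[j] records whether the prefix t[:j] admits such a split.
    The empty string counts as a valid (k = 0) decomposition.
    """
    m = len(t)
    ok = [True] + [False] * m
    for j in range(2, m + 1):
        # The last block t[i:j] must have length at least 2.
        ok[j] = any(ok[i] and is_repeat(t[i:j]) for i in range(j - 1))
    return ok[m]


def brute_force_lsrs(s: str) -> str:
    """Exhaustive search for a longest valid subsequence of s (exponential time)."""
    n = len(s)
    for length in range(n, 1, -1):
        for idx in combinations(range(n), length):
            t = "".join(s[i] for i in idx)
            if is_subsequence_repeated(t):
                return t
    return ""  # no non-empty solution exists


print(brute_force_lsrs("abcab"))  # abab
```

For example, in `abcab` the subsequence `abab` equals $(ab)^2$, while no length-5 candidate qualifies, so the search returns a solution of length 4.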

Cite

@article{arxiv.2304.06862,
  title  = {The Longest Subsequence-Repeated Subsequence Problem},
  author = {Manuel Lafond and Wenfeng Lai and Adiesha Liyanage and Binhai Zhu},
  journal= {arXiv preprint arXiv:2304.06862},
  year   = {2023}
}

Comments

15 pages, 1 figure