Faster Algorithms for Longest Common Substring
Abstract
In the classic longest common substring (LCS) problem, we are given two strings $S$ and $T$, each of length at most $n$, over an alphabet of size $\sigma$, and we are asked to find a longest string occurring as a fragment of both $S$ and $T$. Weiner, in his seminal paper that introduced the suffix tree, presented an $\mathcal{O}(n \log \sigma)$-time algorithm for this problem [SWAT 1973]. For polynomially-bounded integer alphabets, the linear-time construction of suffix trees by Farach yielded an $\mathcal{O}(n)$-time algorithm for the LCS problem [FOCS 1997]. However, for small alphabets, this is not necessarily optimal for the LCS problem in the word RAM model of computation, in which the strings can be stored in $\mathcal{O}(n \log \sigma / \log n)$ space and read in $\mathcal{O}(n \log \sigma / \log n)$ time. We show that, in this model, we can compute an LCS in time $\mathcal{O}(n \log \sigma / \sqrt{\log n})$, which is sublinear in $n$ if $\sigma = 2^{o(\sqrt{\log n})}$ (in particular, if $\sigma = \mathcal{O}(1)$), using optimal space $\mathcal{O}(n \log \sigma / \log n)$. In fact, it was recently shown that this result is conditionally optimal [Kempa and Kociumaka, STOC 2025]. We then lift our ideas to the problem of computing a $k$-mismatch LCS, which has received considerable attention in recent years. In this problem, the aim is to compute a longest substring of $S$ that occurs in $T$ with at most $k$ mismatches. Thankachan et al. showed how to compute a $k$-mismatch LCS in $\mathcal{O}(n \log^k n)$ time for $k = \mathcal{O}(1)$ [J. Comput. Biol. 2016]. We show an $\mathcal{O}(n \log^{k-1/2} n)$-time algorithm, for any constant $k > 0$ and irrespective of the alphabet size, using $\mathcal{O}(n)$ space as the previous approaches. We thus notably break through the well-known $n \log^k n$ barrier, which stems from a recursive heavy-path decomposition technique that was first introduced in the seminal paper of Cole et al. [STOC 2004] for string indexing with errors.
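To make the two problem statements concrete, here is a minimal sketch of quadratic-time baselines for the exact LCS and for the mismatch-tolerant variant. It is not the algorithm of the paper: it uses plain dynamic programming and a sliding window over diagonals, with none of the suffix-tree or word-RAM packing techniques mentioned above. The function names `longest_common_substring` and `k_mismatch_lcs_length` are hypothetical and chosen only for illustration.

```python
def longest_common_substring(s: str, t: str) -> str:
    """Return one longest string occurring as a fragment of both s and t.

    Illustrative O(|s| * |t|)-time dynamic programming baseline.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                # cur[j] = length of the longest common suffix of s[:i] and t[:j]
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return s[best_end - best_len:best_end]


def k_mismatch_lcs_length(s: str, t: str, k: int) -> int:
    """Return the maximum L such that some length-L fragment of s and some
    length-L fragment of t differ in at most k positions.

    Illustrative O(|s| * |t|)-time baseline: for every alignment (diagonal)
    of s against t, slide a window that contains at most k mismatches.
    """
    n, m = len(s), len(t)
    best = 0
    for d in range(-(m - 1), n):          # diagonal d = i - j
        i0 = max(d, 0)                    # first position of s on this diagonal
        j0 = i0 - d                       # matching first position of t
        length = min(n - i0, m - j0)
        left = 0
        mismatches = 0
        for right in range(length):
            if s[i0 + right] != t[j0 + right]:
                mismatches += 1
            while mismatches > k:         # shrink window until it is valid again
                if s[i0 + left] != t[j0 + left]:
                    mismatches -= 1
                left += 1
            best = max(best, right - left + 1)
    return best
```

With k = 0 the second function returns the length of an exact LCS, so the two baselines can be cross-checked against each other on small inputs.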
Cite
@article{arxiv.2105.03106,
title = {Faster Algorithms for Longest Common Substring},
author = {Panagiotis Charalampopoulos and Tomasz Kociumaka and Jakub Radoszewski and Solon P. Pissis},
journal = {arXiv preprint arXiv:2105.03106},
year = {2025}
}
Comments
Accepted for publication in ACM TALG; extended version of a paper that appeared in the proceedings of ESA 2021. Abstract abridged to meet arXiv requirements.