Sampling the suffix array with minimizers

Data Structures and Algorithms 2014-12-04 v2

Abstract

Evenly sampling the suffixes of the suffix array is an old idea that trades pattern search time for reduced index space. A few years ago, Claude et al. presented an alphabet sampling scheme that allows more efficient pattern searches than the sparse suffix array, provided the patterns are long enough. A drawback of their approach is that every sought pattern must contain at least one character from the chosen subalphabet. In this work we propose an alternative suffix sampling approach whose only requirement is a minimum pattern length, which seems more convenient in practice. Experiments show that our algorithm achieves competitive time-space tradeoffs on most standard benchmark data.

Cite

@article{arxiv.1406.2348,
  title  = {Sampling the suffix array with minimizers},
  author = {Szymon Grabowski and Marcin Raniszewski},
  journal= {arXiv preprint arXiv:1406.2348},
  year   = {2014}
}

Comments

One new SamSAMi variant; extended experimental results
