
Efficiently decoding strings from their shingles

Discrete Mathematics 2012-04-17 v1

Abstract

Determining whether an unordered collection of overlapping substrings (called shingles) can be uniquely decoded into a consistent string is a problem that lies within the foundation of a broad assortment of disciplines ranging from networking and information theory through cryptography and even genetic engineering and linguistics. We present three perspectives on this problem: a graph theoretic framework due to Pevzner, an automata theoretic approach from our previous work, and a new insight that yields a time-optimal streaming algorithm for determining whether a string of $n$ characters over the alphabet $\Sigma$ can be uniquely decoded from its two-character shingles. Our algorithm achieves an overall time complexity $\Theta(n)$ and space complexity $O(|\Sigma|)$. As an application, we demonstrate how this algorithm can be extended to larger shingles for efficient string reconciliation.
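
To make the decoding question concrete, the following is a minimal brute-force sketch in Python. It is not the paper's streaming algorithm: it runs in exponential time and only illustrates what "uniquely decodable from two-character shingles" means. The function names `shingles`, `decodings`, and `uniquely_decodable` are hypothetical, and the sketch assumes the shingles form a multiset, that the string has at least two characters, and that the starting character is not known in advance.

```python
from collections import Counter

def shingles(s, k=2):
    """Return the multiset of length-k substrings (shingles) of s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def decodings(bag):
    """Return every string whose 2-shingle multiset equals bag.

    Exhaustive search: try each possible first character, then keep
    appending any unused shingle that overlaps the last character.
    """
    remaining = Counter(bag)
    total = sum(remaining.values())
    results = set()

    def extend(prefix, used):
        if used == total:
            results.add(prefix)
            return
        last = prefix[-1]
        for sh in list(remaining):
            if remaining[sh] > 0 and sh[0] == last:
                remaining[sh] -= 1          # consume one copy of the shingle
                extend(prefix + sh[1], used + 1)
                remaining[sh] += 1          # backtrack

    for start in {sh[0] for sh in remaining}:
        extend(start, 0)
    return results

def uniquely_decodable(s):
    """True if s is the only string with its 2-shingle multiset."""
    return decodings(shingles(s)) == {s}

print(uniquely_decodable("abc"))             # only "abc" fits {ab, bc}
print(sorted(decodings(shingles("abacab")))) # two strings share these shingles
```

Here "abacab" and "acabab" have the same two-character shingles, so neither can be uniquely decoded, whereas "abc" can. The exhaustive search is only practical for short strings, which is the gap a linear-time test addresses.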

Cite

@article{arxiv.1204.3293,
  title  = {Efficiently decoding strings from their shingles},
  author = {Aryeh Kontorovich and Ari Trachtenberg},
  journal= {arXiv preprint arXiv:1204.3293},
  year   = {2012}
}