
Coding Schemes for Document Exchange under Multiple Substring Edits

Information Theory 2026-01-27 v1 math.IT

Abstract

We study the document exchange problem under multiple substring edits. A substring edit in a string $\mathbf{x}$ occurs when a substring $\mathbf{u}$ of $\mathbf{x}$ is replaced by an arbitrary string $\mathbf{v}$. The lengths of $\mathbf{u}$ and $\mathbf{v}$ are bounded from above by a fixed constant. Let $\mathbf{x}$ and $\mathbf{y}$ be two binary strings that differ by multiple substring edits. The aim of document exchange schemes is to construct an encoding of $\mathbf{x}$ with small length such that $\mathbf{x}$ can be recovered using $\mathbf{y}$ and the encoding. We construct a low-complexity document exchange scheme with encoding length of $4t\log n+o(\log n)$ bits, where $n$ is the length of the string $\mathbf{x}$. The best known scheme achieves an encoding length of $4t\log n+O(\log\log n)$ bits, but at a much higher computational complexity. Then, we investigate the average length of valid encodings for document exchange schemes with uniform strings $\mathbf{x}$ and develop a scheme with an expected encoding length of $(4t-1)\log n+o(\log n)$ bits. In this setting, prior works have only constructed schemes for a single substring edit.
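As an illustration only, the definition of a substring edit and the leading terms of the encoding lengths quoted above can be sketched in Python. The function names `apply_substring_edit` and `leading_term_bits` are hypothetical and not taken from the paper. The sketch assumes that $t$ denotes the number of substring edits and that logarithms are base 2, and it ignores the $o(\log n)$ and $O(\log\log n)$ terms. It does not implement any encoding scheme.

```python
import math


def apply_substring_edit(x: str, start: int, u_len: int, v: str) -> str:
    """Replace the substring u = x[start:start + u_len] of x with the string v.

    Hypothetical helper illustrating the definition of one substring edit.
    """
    if u_len < 0 or not 0 <= start <= len(x) - u_len:
        raise ValueError("substring u must lie inside x")
    return x[:start] + v + x[start + u_len:]


def leading_term_bits(t: int, n: int, offset: int = 0) -> float:
    """Leading term (4t - offset) * log2(n) of an encoding length.

    Assumptions: t is the number of substring edits, logs are base 2,
    and lower-order terms are dropped.
    """
    return (4 * t - offset) * math.log2(n)


# One substring edit: u = "101" (positions 2..4) is replaced by v = "00".
x = "0110100111"
y = apply_substring_edit(x, start=2, u_len=3, v="00")

# Leading terms for t = 2 edits and n = 1024.
worst_case = leading_term_bits(2, 1024)          # 4t log n
average_case = leading_term_bits(2, 1024, 1)     # (4t - 1) log n
```

Under these assumptions, the gap between the two leading terms is exactly $\log n$ bits, independent of $t$.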

Cite

@article{arxiv.2601.18441,
  title  = {Coding Schemes for Document Exchange under Multiple Substring Edits},
  author = {Hrishi Narayanan and Vinayak Ramkumar and Rawad Bitar and Antonia Wachter-Zeh},
  journal= {arXiv preprint arXiv:2601.18441},
  year   = {2026}
}