An Improved Sketching Algorithm for Edit Distance

Data Structures and Algorithms 2021-05-04 v3

Abstract

We provide improved upper bounds for the simultaneous sketching complexity of edit distance. Consider two parties, Alice with input $x\in\Sigma^n$ and Bob with input $y\in\Sigma^n$, that share public randomness and are given a promise that the edit distance $\mathsf{ed}(x,y)$ between their two strings is at most some given value $k$. Alice must send a message $sx$ and Bob must send $sy$ to a third party Charlie, who does not know the inputs but shares the same public randomness and also knows $k$. Charlie must output $\mathsf{ed}(x,y)$ precisely as well as a sequence of $\mathsf{ed}(x,y)$ edits required to transform $x$ into $y$. The goal is to minimize the lengths $|sx|, |sy|$ of the messages sent. The protocol of Belazzougui and Zhang (FOCS 2016), building upon the random walk method of Chakraborty, Goldenberg, and Koucký (STOC 2016), achieves a maximum message length of $\tilde O(k^8)$ bits, where $\tilde O(\cdot)$ hides $\mathrm{poly}(\log n)$ factors. In this work we build upon Belazzougui and Zhang's protocol and provide an improved analysis demonstrating that a slight modification of their construction achieves a bound of $\tilde O(k^3)$.
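To make concrete what Charlie is required to output, the following is a minimal sketch of the textbook dynamic program for edit distance with a backtrace. It is not the sketching protocol of the paper: it reads $x$ and $y$ in full, which Charlie cannot do, and it only illustrates the target output, namely $\mathsf{ed}(x,y)$ together with a sequence of that many edits. The function names `edit_script` and `apply_edits` and the tuple encoding of edits are assumptions made for this illustration.

```python
from typing import List, Tuple

Edit = Tuple[str, int, str]  # (operation, position in x, character); hypothetical encoding


def edit_script(x: str, y: str) -> Tuple[int, List[Edit]]:
    """Return ed(x, y) and one optimal edit sequence, via the standard O(|x||y|) DP."""
    n, m = len(x), len(y)
    # dp[i][j] = edit distance between the prefixes x[:i] and y[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if x[i - 1] == y[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # delete x[i-1]
                           dp[i][j - 1] + 1,         # insert y[j-1]
                           dp[i - 1][j - 1] + cost)  # match or substitute

    # Walk back from (n, m) to (0, 0), recording one edit per unit of cost.
    edits: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and x[i - 1] == y[j - 1] and dp[i][j] == dp[i - 1][j - 1]:
            i, j = i - 1, j - 1  # characters match, no edit
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            edits.append(("substitute", i - 1, y[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            edits.append(("delete", i - 1, x[i - 1]))
            i -= 1
        else:
            edits.append(("insert", i, y[j - 1]))  # insert before x[i]
            j -= 1
    edits.reverse()
    return dp[n][m], edits


def apply_edits(x: str, edits: List[Edit]) -> str:
    """Apply edits right to left so positions in the original x stay valid."""
    chars = list(x)
    for op, pos, ch in reversed(edits):
        if op == "substitute":
            chars[pos] = ch
        elif op == "delete":
            del chars[pos]
        else:  # "insert"
            chars.insert(pos, ch)
    return "".join(chars)
```

For instance, `edit_script("abcde", "bcdef")` yields distance 2 (delete the leading `a`, insert a trailing `f`), and `apply_edits` replays the returned edits to recover `y` from `x`. The sketching problem asks for the same output when Charlie sees only the two short messages rather than the strings themselves.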

Cite

@article{arxiv.2010.13170,
  title  = {An Improved Sketching Algorithm for Edit Distance},
  author = {Ce Jin and Jelani Nelson and Kewen Wu},
  journal= {arXiv preprint arXiv:2010.13170},
  year   = {2021}
}

Comments

Appeared in STACS 2021. Fixed the title to match the conference version.