Speeding up decimal multiplication

Data Structures and Algorithms 2020-12-11 v4

Abstract

Decimal multiplication is the task of multiplying two numbers in base $10^N$. Specifically, we focus on the number-theoretic transform (NTT) family of algorithms. Using only portable techniques, we achieve a 3x-5x speedup over the mpdecimal library. In this paper we describe our implementation and discuss possible further optimizations. We also present a simple cache-efficient algorithm for in-place $2n \times n$ or $n \times 2n$ matrix transposition, which does not seem to be widely known; the need for it arises in the "six-step algorithm" variation of the matrix Fourier algorithm. Another finding is that using two prime moduli instead of three makes sense even considering the worst case of increasing the size of the input, and makes for simpler answer recovery.
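To make the two-modulus idea concrete, here is a minimal Python sketch of NTT-based multiplication of numbers stored as base $10^N$ limbs, with the answer recovered from two residues by the Chinese remainder theorem. It is an illustration only, not the paper's implementation: the primes `P1` and `P2`, the generator `G`, the limb size `BASE_DIGITS = 4`, and all function names are assumptions chosen for readability, and the transform is a plain radix-2 one rather than the six-step variant.

```python
# Illustrative sketch; primes, limb size and names are assumptions, not the paper's.
P1, P2 = 998244353, 469762049   # NTT-friendly primes, both with primitive root 3
G = 3
BASE_DIGITS = 4                 # N in base 10^N
BASE = 10 ** BASE_DIGITS
P1_INV_MOD_P2 = pow(P1, P2 - 2, P2)  # inverse of P1 modulo P2 (Fermat)


def ntt(values, p, invert=False):
    """Iterative radix-2 NTT modulo p; len(values) must be a power of two."""
    a = list(values)
    n = len(a)
    # Bit-reversal permutation.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Butterfly passes.
    length = 2
    while length <= n:
        w_len = pow(G, (p - 1) // length, p)
        if invert:
            w_len = pow(w_len, p - 2, p)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u, v = a[k], a[k + half] * w % p
                a[k] = (u + v) % p
                a[k + half] = (u - v) % p
                w = w * w_len % p
        length <<= 1
    if invert:
        n_inv = pow(n, p - 2, p)
        a = [x * n_inv % p for x in a]
    return a


def convolve_mod(x, y, p):
    """Linear convolution of two limb lists, with coefficients reduced modulo p."""
    size = 1
    while size < len(x) + len(y) - 1:
        size <<= 1
    fx = ntt(x + [0] * (size - len(x)), p)
    fy = ntt(y + [0] * (size - len(y)), p)
    return ntt([u * v % p for u, v in zip(fx, fy)], p, invert=True)


def crt2(r1, r2):
    """Recover the unique x in [0, P1*P2) with x = r1 (mod P1) and x = r2 (mod P2)."""
    return r1 + P1 * ((r2 - r1) * P1_INV_MOD_P2 % P2)


def to_limbs(n):
    """Split a non-negative integer into base-10^N limbs, least significant first."""
    limbs = []
    while n:
        n, r = divmod(n, BASE)
        limbs.append(r)
    return limbs or [0]


def multiply(a, b):
    """Multiply two non-negative integers via two NTTs and CRT recovery."""
    x, y = to_limbs(a), to_limbs(b)
    c1, c2 = convolve_mod(x, y, P1), convolve_mod(x, y, P2)
    limbs, carry = [], 0
    for r1, r2 in zip(c1, c2):
        carry, limb = divmod(crt2(r1, r2) + carry, BASE)
        limbs.append(limb)
    result = carry
    for limb in reversed(limbs):  # reassemble, most significant limb first
        result = result * BASE + limb
    return result
```

For example, `multiply(123456789, 987654321)` should agree with Python's built-in `123456789 * 987654321`. The recovery is exact only while every convolution coefficient stays below `P1 * P2`; with 4-digit limbs each product is under $10^8$, so these assumed primes leave ample headroom for any transform length they support (up to $2^{23}$ points).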

Cite

@article{arxiv.2011.11524,
  title  = {Speeding up decimal multiplication},
  author = {Viktor Krapivensky},
  journal= {arXiv preprint arXiv:2011.11524},
  year   = {2020}
}