Scalable Distributed String Sorting

Data Structures and Algorithms 2024-04-26 v1

Abstract

String sorting is an important part of tasks such as building index data structures. Unfortunately, current string sorting algorithms do not scale to massively parallel distributed-memory machines since they either have latency (at least) proportional to the number of processors $p$ or communicate the data a large number of times (at least logarithmic). We present practical and efficient algorithms for distributed-memory string sorting that scale to large $p$. Similar to state-of-the-art sorters for atomic objects, the algorithms have latency of about $p^{1/k}$ when allowing the data to be communicated $k$ times. Experiments indicate good scaling behavior on a wide range of inputs on up to 49152 cores. Overall, we achieve speedups of up to 5 over the current state-of-the-art distributed string sorting algorithms.
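
To give a feel for the trade-off between latency and the number of communication rounds, the following minimal sketch evaluates the $p^{1/k}$ term for the largest machine size mentioned above. It assumes the latency term is exactly $p^{1/k}$ with all constant factors and lower-order terms ignored; the function name `latency_term` is hypothetical and not taken from the paper.

```python
def latency_term(p: int, k: int) -> float:
    """Return p**(1/k), an idealized stand-in for the latency term
    when the data is communicated k times (constants ignored)."""
    if p < 1 or k < 1:
        raise ValueError("p and k must be positive integers")
    return p ** (1.0 / k)


if __name__ == "__main__":
    p = 49152  # core count from the experiments mentioned in the abstract
    for k in (1, 2, 3):
        # Larger k means more rounds of data movement but a smaller latency term.
        print(f"k={k}: p^(1/k) is roughly {latency_term(p, k):.1f}")
```

Under this simplification, moving from one to two communication rounds reduces the latency term from $p$ to roughly $\sqrt{p}$, which is why allowing the data to be sent more than once can pay off for large $p$.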

Cite

@article{arxiv.2404.16517,
  title  = {Scalable Distributed String Sorting},
  author = {Florian Kurpicz and Pascal Mehnert and Peter Sanders and Matthias Schimek},
  journal= {arXiv preprint arXiv:2404.16517},
  year   = {2024}
}