A sparsity-aware distributed-memory algorithm for sparse-sparse matrix multiplication

Distributed, Parallel, and Cluster Computing 2024-08-28 v1

Abstract

Multiplying two sparse matrices (SpGEMM) is a common computational primitive used in many areas including graph algorithms, bioinformatics, algebraic multigrid solvers, and randomized sketching. Distributed-memory parallel algorithms for SpGEMM have mainly focused on sparsity-oblivious approaches that use 2D and 3D partitioning. Sparsity-aware 1D algorithms can theoretically reduce communication by not fetching nonzeros of the sparse matrices that do not participate in the multiplication. Here, we present a distributed-memory 1D SpGEMM algorithm and implementation. It uses MPI RDMA operations to mitigate the cost of packing/unpacking submatrices for communication, and it uses a block fetching strategy to avoid excessive fine-grained messaging. Our results show that our 1D implementation outperforms state-of-the-art 2D and 3D implementations within CombBLAS for many configurations, inputs, and use cases, while remaining conceptually simpler.
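The sparsity-aware idea and the block fetching strategy described above can be illustrated with a small single-process sketch. It is not the authors' implementation. It assumes a column-wise 1D partition in which a process owns some columns of B and computes the matching columns of C = A * B; the abstract does not spell out the layout. The dict-of-dicts matrix format, the function names, and `block_size` are hypothetical, and `fetch_blocks` is a plain-Python stand-in for a one-sided remote read rather than an MPI call.

```python
from collections import defaultdict

# A matrix is stored column-wise: {column index: {row index: value}}.
# All names below are illustrative, not taken from the paper or CombBLAS.

def needed_columns(B_local):
    """Columns of A that matter: row indices k where local B has a nonzero B(k, j)."""
    return sorted({k for col in B_local.values() for k in col})

def group_into_blocks(cols, block_size):
    """Coalesce needed column indices into block ids to avoid per-column messages."""
    return sorted({k // block_size for k in cols})

def fetch_blocks(A_remote, blocks, block_size):
    """Stand-in for a remote read: return every column of A inside the requested blocks."""
    wanted = set(blocks)
    return {k: col for k, col in A_remote.items() if k // block_size in wanted}

def local_spgemm(A_cols, B_local):
    """Column-by-column product: C(:, j) = sum over k of A(:, k) * B(k, j)."""
    C = {}
    for j, b_col in B_local.items():
        acc = defaultdict(float)
        for k, b_kj in b_col.items():
            for i, a_ik in A_cols.get(k, {}).items():
                acc[i] += a_ik * b_kj
        C[j] = dict(acc)
    return C

if __name__ == "__main__":
    # Six columns of A, imagined as living on other processes.
    A = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}, 2: {0: 4.0},
         3: {3: 5.0}, 4: {2: 6.0}, 5: {1: 7.0}}
    # This process owns columns 0 and 1 of B.
    B_local = {0: {0: 1.0}, 1: {0: 2.0, 5: 1.0}}

    cols = needed_columns(B_local)
    blocks = group_into_blocks(cols, block_size=2)
    A_fetched = fetch_blocks(A, blocks, block_size=2)
    print("needed columns:", cols)
    print("fetched columns:", sorted(A_fetched))
    print("C_local:", local_spgemm(A_fetched, B_local))
```

In this toy input only columns 0 and 5 of A participate, so the block containing columns 2 and 3 is never requested. Larger blocks mean fewer requests but more unused columns per request, which is the trade-off a block fetching strategy has to balance.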

Cite

@article{arxiv.2408.14558,
  title  = {A sparsity-aware distributed-memory algorithm for sparse-sparse matrix multiplication},
  author = {Yuxi Hong and Aydin Buluc},
  journal= {arXiv preprint arXiv:2408.14558},
  year   = {2024}
}

Comments

Accepted to the 2024 International Conference for High Performance Computing, Networking, Storage and Analysis (SC'24)