Exploring GPU Stream-Aware Message Passing using Triggered Operations

Distributed, Parallel, and Cluster Computing 2022-08-10 v1 Networking and Internet Architecture

Abstract

Modern heterogeneous supercomputing systems comprise compute blades that offer both CPUs and GPUs. On such systems, it is essential to move data efficiently between these different compute engines across a high-speed network. While current-generation scientific applications and systems software stacks are GPU-aware, CPU threads are still required to orchestrate data-moving communication operations and inter-process synchronization operations. A new GPU stream-aware MPI communication strategy called stream-triggered (ST) communication is explored to allow offloading both the computation and communication control paths to the GPU. The proposed ST communication strategy is implemented on HPE Slingshot Interconnects over a new proprietary HPE Slingshot NIC (Slingshot 11) using its supported triggered operations feature. The performance of the proposed communication strategy is evaluated using a microbenchmark kernel called Faces, based on the nearest-neighbor communication pattern in the CORAL-2 Nekbone benchmark, over a heterogeneous node architecture consisting of AMD CPUs and GPUs.
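The idea behind triggered operations can be illustrated with a toy model: the host queues a communication operation ahead of time, and it fires only when a counter reaches a threshold, so a write issued from the GPU stream, rather than a CPU thread, starts the transfer. The Python sketch below is a minimal simulation under that assumption. `TriggerCounter`, `defer`, `increment`, and `run_stream` are hypothetical names chosen for illustration; they are not the HPE Slingshot, MPI, or GPU runtime interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class TriggerCounter:
    """Toy stand-in for a NIC counter (hypothetical, not a real NIC API)."""
    value: int = 0
    # Each pending entry is (threshold, deferred operation).
    pending: List[Tuple[int, Callable[[], None]]] = field(default_factory=list)

    def defer(self, threshold: int, op: Callable[[], None]) -> None:
        """Queue an operation that fires once the counter reaches threshold."""
        if self.value >= threshold:
            op()
        else:
            self.pending.append((threshold, op))

    def increment(self) -> None:
        """Bump the counter and fire every operation whose threshold is met."""
        self.value += 1
        ready = [op for t, op in self.pending if t <= self.value]
        self.pending = [(t, op) for t, op in self.pending if t > self.value]
        for op in ready:
            op()


def run_stream(stream: List[Callable[[], None]]) -> None:
    """Execute stream entries strictly in order, like a GPU stream would."""
    for step in stream:
        step()


log: List[str] = []
counter = TriggerCounter()

# The host sets everything up in advance, then leaves the control path.
counter.defer(1, lambda: log.append("send halo"))
stream = [
    lambda: log.append("pack kernel"),      # computation producing the data
    counter.increment,                      # stream-side write that triggers the send
    lambda: log.append("interior kernel"),  # later work in the same stream
]

run_stream(stream)
print(log)
```

In this model the send is ordered after the packing step purely by its position in the stream; no host-side synchronization sits between the kernel and the communication.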

Cite

@article{arxiv.2208.04817,
  title  = {Exploring GPU Stream-Aware Message Passing using Triggered Operations},
  author = {Naveen Namashivayam and Krishna Kandalla and Trey White and Nick Radcliffe and Larry Kaplan and Mark Pagel},
  journal= {arXiv preprint arXiv:2208.04817},
  year   = {2022}
}

Comments

11 pages, 12 figures
