Taking GPU Programming Models to Task for Performance Portability

Joshua H. Davis; Pranav Sivaraman; Joy Kitson; Konstantinos Parasyris; Harshitha Menon; Isaac Minn; Giorgis Georgakoudis; Abhinav Bhatele

doi:10.1145/3721145.3730423

Distributed, Parallel, and Cluster Computing 2025-09-08 v4 Performance

Abstract

Portability is critical to ensuring high productivity in developing and maintaining scientific software as the diversity in on-node hardware architectures increases. While several programming models provide portability for diverse GPU systems, they do not make any guarantees about performance portability. In this work, we explore several programming models (CUDA, HIP, Kokkos, RAJA, OpenMP, OpenACC, and SYCL) to assess the consistency of their performance across NVIDIA and AMD GPUs. We use five proxy applications from different scientific domains, create implementations where missing, and use them to present a comprehensive comparative evaluation of the performance portability of these programming models. We provide a Spack scripting-based methodology to ensure reproducibility of the experiments conducted in this work. Finally, we analyze the reasons why some programming models underperform in certain scenarios and, in some cases, present performance optimizations to the proxy applications.

Keywords

computer architecture; GPU computing; parallel programming

Cite

@article{arxiv.2402.08950,
  title  = {Taking GPU Programming Models to Task for Performance Portability},
  author = {Joshua H. Davis and Pranav Sivaraman and Joy Kitson and Konstantinos Parasyris and Harshitha Menon and Isaac Minn and Giorgis Georgakoudis and Abhinav Bhatele},
  journal= {arXiv preprint arXiv:2402.08950},
  year   = {2025}
}

Comments

16 pages, 5 figures
