Related papers: Taking GPU Programming Models to Task for Performa…

A Study of Performance Portability in Plasma Physics Simulations

The high-performance computing (HPC) community has recently seen a substantial diversification of hardware platforms and their associated programming models. From traditional multicore processors to highly specialized accelerators, vendors…

Plasma Physics · Physics 2024-11-11 Josef Ruzicka , Christian Asch , Esteban Meneses , Markus Rampp , Erwin Laure

Implementing Multi-GPU Scientific Computing Miniapps Across Performance Portable Frameworks

Scientific computing in the exascale era demands increased computational power to solve complex problems across various domains. With the rise of heterogeneous computing architectures, the need for vendor-agnostic, performance portability…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-05 Johansell Villalobos , Josef Ruzicka , Silvio Rizzi

Evaluating Application Characteristics for GPU Portability Layer Selection

GPUs have become the dominant source of computing power for high performance computing and are increasingly being used across the High Energy Physics computing landscape for a wide variety of tasks. Though NVIDIA is currently the main…

High Energy Physics - Experiment · Physics 2026-01-27 Mohammad Atif , Meghna Bhattacharya , Mark Dewing , Zhihua Dong , Julien Esseiva , Oliver Gutsche , Matti Kortelainen , Ka Hei Martin Kwok , Charles Leggett , Meifeng Lin , Aleksei Strelchenko , Vakhtang Tsulaia , Brett Viren , Tianle Wang , Haiwang Yu

Comparing Performance and Portability between CUDA and SYCL for Protein Database Search on NVIDIA, AMD, and Intel GPUs

The heterogeneous computing paradigm has led to the need for portable and efficient programming solutions that can leverage the capabilities of various hardware devices, such as NVIDIA, Intel, and AMD GPUs. This study evaluates the…

Programming Languages · Computer Science 2023-11-13 Manuel Costanzo , Enzo Rucci , Carlos García Sánchez , Marcelo Naiouf , Manuel Prieto-Matías

An approach to performance portability through generic programming

The expanding hardware diversity in high performance computing adds enormous complexity to scientific software development. Developers who aim to write maintainable software have two options: 1) To use a so-called data locality abstraction…

Software Engineering · Computer Science 2023-11-10 Andreas Hadjigeorgiou , Christodoulos Stylianou , Michele Weiland , Dirk Jacob Verschuur , Jacob Finkenrath

Evaluating performance and portability of high-level programming models: Julia, Python/Numba, and Kokkos on exascale nodes

We explore the performance and portability of the high-level programming models: the LLVM-based Julia and Python/Numba, and Kokkos on high-performance computing (HPC) nodes: AMD Epyc CPUs and MI250X graphical processing units (GPUs) on…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-20 William F. Godoy , Pedro Valero-Lara , T. Elise Dettling , Christian Trefftz , Ian Jorquera , Thomas Sheehy , Ross G. Miller , Marc Gonzalez-Tallada , Jeffrey S. Vetter , Valentin Churavy

Evaluation of Programming Models and Performance for Stencil Computation on Current GPU Architectures

Accelerated computing is widely used in high-performance computing. Therefore, it is crucial to experiment and discover how to better utilize the latest generations of GPGPUs on relevant applications. In this paper, we present results and share…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-13 Baodi Shan , Mauricio Araya-Polo

A Performance-Portable SYCL Implementation of CRK-HACC for Exascale

The first generation of exascale systems will include a variety of machine architectures, featuring GPUs from multiple vendors. As a result, many developers are interested in adopting portable programming models to avoid maintaining…

Performance · Computer Science 2023-10-26 Esteban M. Rangel , S. John Pennycook , Adrian Pope , Nicholas Frontiere , Zhiqiang Ma , Varsha Madananth

Application of performance portability solutions for GPUs and many-core CPUs to track reconstruction kernels

Next generation High-Energy Physics (HEP) experiments are presented with significant computational challenges, both in terms of data volume and processing power. Using compute accelerators, such as GPUs, is one of the promising ways to…

Accelerator Physics · Physics 2024-01-26 Ka Hei Martin Kwok , Matti Kortelainen , Giuseppe Cerati , Alexei Strelchenko , Oliver Gutsche , Allison Reinsvold Hall , Steve Lantz , Michael Reid , Daniel Riley , Sophie Berkman , Seyong Lee , Hammad Ather , Boyana Norris , Cong Wang

Performance Portability Strategies for Grid C++ Expression Templates

One of the key requirements for the Lattice QCD Application Development as part of the US Exascale Computing Project is performance portability across multiple architectures. Using the Grid C++ expression template as a starting point, we…

High Energy Physics - Lattice · Physics 2018-04-18 Peter A. Boyle , M. A. Clark , Carleton DeTar , Meifeng Lin , Verinder Rana , Alejandro Vaquero Avilés-Casco

Runtime Support for Performance Portability on Heterogeneous Distributed Platforms

Hardware heterogeneity is here to stay for high-performance computing. Large-scale systems are currently equipped with multiple GPU accelerators per compute node and are expected to incorporate more specialized hardware. This shift in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-03-09 Polykarpos Thomadakis , Nikos Chrisochoides

Portable Programming Model Exploration for LArTPC Simulation in a Heterogeneous Computing Environment: OpenMP vs. SYCL

The evolution of the computing landscape has resulted in the proliferation of diverse hardware architectures, with different flavors of GPUs and other compute accelerators becoming more widely available. To facilitate the efficient use of…

High Energy Physics - Experiment · Physics 2023-04-05 Meifeng Lin , Zhihua Dong , Tianle Wang , Mohammad Atif , Meghna Bhattacharya , Kyle Knoepfel , Charles Leggett , Brett Viren , Haiwang Yu

Many Cores, Many Models: GPU Programming Model vs. Vendor Compatibility Overview

In recent history, GPUs became a key driver of compute performance in HPC. With the installation of the Frontier supercomputer, they became the enablers of the Exascale era; further largest-scale installations are in progress (Aurora, El…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-05 Andreas Herten

Analyzing the Performance Portability of SYCL across CPUs, GPUs, and Hybrid Systems with SW Sequence Alignment

The high-performance computing (HPC) landscape is undergoing rapid transformation, with an increasing emphasis on energy-efficient and heterogeneous computing environments. This comprehensive study extends our previous research on SYCL's…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-15 Manuel Costanzo , Enzo Rucci , Carlos García-Sánchez , Marcelo Naiouf , Manuel Prieto-Matías

Exploring code portability solutions for HEP with a particle tracking test code

Traditionally, high energy physics (HEP) experiments have relied on x86 CPUs for the majority of their significant computing needs. As the field looks ahead to the next generation of experiments such as DUNE and the High-Luminosity LHC, the…

High Energy Physics - Experiment · Physics 2024-09-17 Hammad Ather , Sophie Berkman , Giuseppe Cerati , Matti Kortelainen , Ka Hei Martin Kwok , Steven Lantz , Seyong Lee , Boyana Norris , Michael Reid , Allison Reinsvold Hall , Daniel Riley , Alexei Strelchenko , Cong Wang

Evaluating the performance portability of SYCL across CPUs and GPUs on bandwidth-bound applications

In this paper, we evaluate the portability of the SYCL programming model on some of the latest CPUs and GPUs from a wide range of vendors, utilizing the two main compilers: DPC++ and hipSYCL/OpenSYCL. Both compilers currently support GPUs…

Performance · Computer Science 2023-09-20 Istvan Z Reguly

Open SYCL on heterogeneous GPU systems: A case of study

Computational platforms for high-performance scientific applications are becoming more heterogeneous, including hardware accelerators such as multiple GPUs. Applications in a wide variety of scientific fields require an efficient and careful…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-12 Rocío Carratalá-Sáez , Francisco J. Andújar , Yuri Torres , Arturo Gonzalez-Escribano , Diego R. Llanos

Parallel Paradigms in Modern HPC: A Comparative Analysis of MPI, OpenMP, and CUDA

This paper presents a comprehensive comparison of three dominant parallel programming models in High Performance Computing (HPC): Message Passing Interface (MPI), Open Multi-Processing (OpenMP), and Compute Unified Device Architecture…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-19 Nizar ALHafez , Ahmad Kurdi

Evaluation of performance portability frameworks for the implementation of a particle-in-cell code

This paper reports on an in-depth evaluation of the performance portability frameworks Kokkos and RAJA with respect to their suitability for the implementation of complex particle-in-cell (PIC) simulation codes, extending previous studies…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-20 Victor Artigues , Katharina Kormann , Markus Rampp , Klaus Reuter

Portability for GPU-accelerated molecular docking applications for cloud and HPC: can portable compiler directives provide performance across all platforms?

High-throughput structure-based screening of drug-like molecules has become a common tool in biomedical research. Recently, acceleration with graphics processing units (GPUs) has provided a large performance boost for molecular docking…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-07 Mathialakan Thavappiragasam , Wael Elwasif , Ada Sedova