Enabling predictable parallelism in single-GPU systems with persistent CUDA threads

Paolo Burgio

Distributed, Parallel, and Cluster Computing 2023-10-03 v1

Authors: Paolo Burgio

Abstract

Graphics Processing Units, or GPUs, have been successfully adopted both for graphics computation in 3D applications and for general-purpose applications (GP-GPUs), thanks to their tremendous performance-per-watt. Recently, there has been strong interest in adopting them in automotive and avionic industrial settings as well, imposing real-time constraints on the design of such devices for the first time. Unfortunately, it is extremely hard to extract timing guarantees from modern GPU designs, and current approaches rely on a model where the GPU is treated as a single monolithic execution device. Unlike the state of the art in research, we try to open the box of modern GPU architectures, providing a clean way to exploit predictable intra-GPU execution.
