English
Related papers

Related papers: OpenMP Advisor

200 papers

Heterogeneity has become a mainstream architecture design choice for building High Performance Computing systems. However, heterogeneity poses significant challenges for achieving performance portability of execution. Adapting a program to…

Programming Languages · Computer Science 2023-03-17 Giorgis Georgakoudis , Konstantinos Parasyris , Chunhua Liao , David Beckingsale , Todd Gamblin , Bronis de Supinski

Software developers must adapt to keep up with the changing capabilities of platforms so that they can utilize the power of High- Performance Computers (HPC), including exascale systems. OpenMP, a directive-based parallel programming model,…

This documentation is designed for beginners in Graphics Processing Unit (GPU)-programming and who want to get familiar with OpenACC and OpenMP offloading models. Here we present an overview of these two programming models as well as of the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-01-31 Hichan Agueny

In recent years, utilization of heterogeneous hardware other than small core CPU such as GPU, FPGA or many core CPU is increasing. However, when using heterogeneous hardware, barriers of technical skills such as OpenMP, CUDA and OpenCL are…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-07-13 Yoji Yamato

In recent years, utilization of heterogeneous hardware other than small core CPU such as GPU, FPGA or many core CPU is increasing. However, when using heterogeneous hardware, barriers of technical skills such as OpenMP, CUDA and OpenCL are…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-24 Yoji Yamato

GPU-based HPC clusters are attracting more scientific application developers due to their extensive parallelism and energy efficiency. In order to achieve portability among a variety of multi/many core architectures, a popular choice for an…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-10 Ali TehraniJamsaz , Alok Mishra , Akash Dutta , Abid M. Malik , Barbara Chapman , Ali Jannesari

This paper presents a comprehensive comparison of three dominant parallel programming models in High Performance Computing (HPC): Message Passing Interface (MPI), Open Multi-Processing (OpenMP), and Compute Unified Device Architecture…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-19 Nizar ALHafez , Ahmad Kurdi

When using heterogeneous hardware other than CPUs, barriers of technical skills such as OpenCL are high. Based on that, I have proposed environment adaptive software that enables automatic conversion, configuration, and high-performance…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-22 Yoji Yamato

Utilizing GPUs is critical for high performance on heterogeneous systems. However, leveraging the full potential of GPUs for accelerating legacy CPU applications can be a challenging task for developers. The porting process requires…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-27 Shilei Tian , Tom Scogland , Barbara Chapman , Johannes Doerfert

High-performance computing are based more and more in heterogeneous architectures and GPGPUs have become one of the main integrated blocks in these, as the recently emerged Mali GPU in embedded systems or the NVIDIA GPUs in HPC servers. In…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-07-28 Albert Saà-Garriga , David Castells-Rufas , Jordi Carrabina

In this paper we describe an autotuning tool for optimization of OpenMP applications on highly multicore and multithreaded architectures. Our work was motivated by in-depth performance analysis of scientific applications and synthetic…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-01-17 Jakub Katarzyński , Maciej Cytowski

Parallel programming is central to HPC and AI, but producing code that is correct and fast remains challenging, especially for OpenMP GPU offload, where data movement and tuning dominate. Autonomous coding agents can compile, test, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-09 Erel Kaplan , Tomer Bitan , Lian Ghrayeb , Le Chen , Tom Yotam , Niranjan Hasabnis , Gal Oren

GPU runtimes are historically implemented in CUDA or other vendor specific languages dedicated to GPU programming. In this work we show that OpenMP 5.1, with minor compiler extensions, is capable of replacing existing solutions without a…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-06 Shilei Tian , Jon Chesterfield , Johannes Doerfert , Barbara Chapman

Growing heterogeneity and configurability in HPC architectures has made auto-tuning applications and runtime parameters on these systems very complex. Users are presented with a multitude of options to configure parameters. In addition to…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-28 Akash Dutta , Jordi Alcaraz , Ali TehraniJamsaz , Eduardo Cesar , Anna Sikora , Ali Jannesari

When using heterogeneous hardware, barriers of technical skills such as OpenMP, CUDA and OpenCL are high. Based on that, I have proposed environment-adaptive software that enables automatic conversion, configuration. However, including…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-17 Yoji Yamato

Heterogeneous systems are becoming increasingly prevalent. In order to exploit the rich compute resources of such systems, robust programming models are needed for application developers to seamlessly migrate legacy code from today's…

GPU is the dominant accelerator device due to its high performance and energy efficiency. Directive-based GPU offloading using OpenACC or OpenMP target is a convenient way to port existing codes originally developed for multicore CPUs.…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-02 Yohei Miki , Toshihiro Hanawa

In recent years, utilization of heterogeneous hardware other than small core CPU such as GPU, FPGA or many core CPU is increasing. However, when using heterogeneous hardware, barriers of technical skills such as CUDA are high. Based on…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-10 Yoji Yamato

OpenCL, along with CUDA, is one of the main tools used to program GPGPUs. However, it allows running the same code on multi-core CPUs too, making it a rival for the long-established OpenMP. In this paper we compare OpenCL and OpenMP when…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-03-24 Kamran Karimi

To use heterogeneous hardware, programmers must have sufficient technical skills to utilize OpenMP, CUDA, and OpenCL. On the basis of this, I have proposed environment-adaptive software that enables automatic conversion, configuration, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-27 Yoji Yamato
‹ Prev 1 2 3 10 Next ›