Related papers: OMP2HMPP: HMPP Source Code Generation from Program…

OMP2MPI: Automatic MPI code generation from OpenMP programs

In this paper, we present OMP2MPI, a tool that automatically generates MPI source code from OpenMP. With this transformation, the original program can be adapted to exploit a larger number of processors by surpassing the limits of…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-06-12 Albert Saà-Garriga, David Castells-Rufas, Jordi Carrabina

OMP2HMPP: Compiler Framework for Energy Performance Trade-off Analysis of Automatically Generated Codes

We present OMP2HMPP, a tool that, in a first step, automatically translates OpenMP code into various possible transformations of HMPP. In a second step OMP2HMPP executes all variants to obtain the performance and power consumption of each…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-06-10 Albert Saà-Garriga, David Castells-Rufas, Jordi Carrabina

OpenMP Advisor

With the increasing diversity of heterogeneous architecture in the HPC industry, porting a legacy application to run on different architectures is a tough challenge. In this paper, we present OpenMP Advisor, a first of its kind compiler…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-11 Alok Mishra, Abid M. Malik, Meifeng Lin, Barbara Chapman

Advances in Semantic Patching for HPC-oriented Refactorings with Coccinelle

Currently, the most energy-efficient hardware platforms for floating point-intensive calculations (also known as High Performance Computing, or HPC) are graphical processing units (GPUs). However, porting existing scientific codes to GPUs…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-28 Michele Martone, Julia Lawall

CI/CD Efforts for Validation, Verification and Benchmarking OpenMP Implementations

Software developers must adapt to keep up with the changing capabilities of platforms so that they can utilize the power of High-Performance Computers (HPC), including exascale systems. OpenMP, a directive-based parallel programming model,…

Programming Languages · Computer Science 2024-08-22 Aaron Jarmusch, Felipe Cabarcas, Swaroop Pophale, Andrew Kallai, Johannes Doerfert, Luke Peyralans, Seyong Lee, Joel Denny, Sunita Chandrasekaran

OMPGPT: A Generative Pre-trained Transformer Model for OpenMP

Large language models (LLMs) such as ChatGPT have significantly advanced the field of Natural Language Processing (NLP). This trend led to the development of code-based large language models such as StarCoder, WizardCoder, and CodeLlama,…

Software Engineering · Computer Science 2024-11-08 Le Chen, Arijit Bhattacharjee, Nesreen Ahmed, Niranjan Hasabnis, Gal Oren, Vy Vo, Ali Jannesari

HPC-Coder-V2: Studying Code LLMs Across Low-Resource Parallel Languages

Large Language Model (LLM) based coding tools have been tremendously successful as software development assistants, yet they are often designed for general purpose programming tasks and perform poorly for more specialized domains such as…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-20 Aman Chaturvedi, Daniel Nichols, Siddharth Singh, Abhinav Bhatele

Machine Learning-Driven Adaptive OpenMP For Portable Performance on Heterogeneous Systems

Heterogeneity has become a mainstream architecture design choice for building High Performance Computing systems. However, heterogeneity poses significant challenges for achieving performance portability of execution. Adapting a program to…

Programming Languages · Computer Science 2023-03-17 Giorgis Georgakoudis, Konstantinos Parasyris, Chunhua Liao, David Beckingsale, Todd Gamblin, Bronis de Supinski

Development and performance of a HemeLB GPU code for human-scale blood flow simulation

In recent years, it has become increasingly common for high performance computers (HPC) to possess some level of heterogeneous architecture - typically in the form of GPU accelerators. In some machines these are isolated within a dedicated…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-19 I. Zacharoudiou, J. W. S. McCullough, P. V. Coveney

Advanced Programming Platform for efficient use of Data Parallel Hardware

Graphics processing units (GPUs) have evolved from specialized hardware for rendering high-quality graphics in games into commodity hardware for effectively processing blocks of data in a parallel schema. This evolution is particularly…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-03-26 Luis Cabellos

Parallel Paradigms in Modern HPC: A Comparative Analysis of MPI, OpenMP, and CUDA

This paper presents a comprehensive comparison of three dominant parallel programming models in High Performance Computing (HPC): Message Passing Interface (MPI), Open Multi-Processing (OpenMP), and Compute Unified Device Architecture…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-19 Nizar ALHafez, Ahmad Kurdi

Automatic Multi-GPU Code Generation applied to Simulation of Electrical Machines

Electrical and electronic engineering has used parallel programming to solve its large-scale, complex problems for performance reasons. However, as parallel programming requires a non-trivial distribution of tasks and data, developers…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-07-05 Antonio Wendell De Oliveira Rodrigues, Frédéric Guyomarc'h, Jean-Luc Dekeyser, Yvonnick Le Menach

OMPar: Automatic Parallelization with AI-Driven Source-to-Source Compilation

Manual parallelization of code remains a significant challenge due to the complexities of modern software systems and the widespread adoption of multi-core architectures. This paper introduces OMPar, an AI-driven tool designed to automate…

Computation and Language · Computer Science 2024-09-24 Tal Kadosh, Niranjan Hasabnis, Prema Soundararajan, Vy A. Vo, Mihai Capota, Nesreen Ahmed, Yuval Pinter, Gal Oren

GPU First -- Execution of Legacy CPU Codes on GPUs

Utilizing GPUs is critical for high performance on heterogeneous systems. However, leveraging the full potential of GPUs for accelerating legacy CPU applications can be a challenging task for developers. The porting process requires…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-27 Shilei Tian, Tom Scogland, Barbara Chapman, Johannes Doerfert

ParaGraph: Weighted Graph Representation for Performance Optimization of HPC Kernels

GPU-based HPC clusters are attracting more scientific application developers due to their extensive parallelism and energy efficiency. In order to achieve portability among a variety of multi/many core architectures, a popular choice for an…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-10 Ali TehraniJamsaz, Alok Mishra, Akash Dutta, Abid M. Malik, Barbara Chapman, Ali Jannesari

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

The rapid growth of deep learning has driven exponential increases in model parameters and computational demands. NVIDIA GPUs and their CUDA-based software ecosystem provide robust support for parallel computing, significantly alleviating…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-08 Jiaqi Lv, Xufeng He, Yanchen Liu, Xu Dai, Aocheng Shen, Yinghao Li, Jiachen Hao, Jianrong Ding, Yang Hu, Shouyi Yin

Experience Report: Writing A Portable GPU Runtime with OpenMP 5.1

GPU runtimes are historically implemented in CUDA or other vendor specific languages dedicated to GPU programming. In this work we show that OpenMP 5.1, with minor compiler extensions, is capable of replacing existing solutions without a…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-06 Shilei Tian, Jon Chesterfield, Johannes Doerfert, Barbara Chapman

HLSPilot: LLM-based High-Level Synthesis

Large language models (LLMs) have catalyzed an upsurge in automatic code generation, garnering significant attention for register transfer level (RTL) code generation. Despite the potential of RTL code generation with natural language, it…

Hardware Architecture · Computer Science 2024-08-14 Chenwei Xiong, Cheng Liu, Huawei Li, Xiaowei Li

HPC-Coder: Modeling Parallel Programs using Large Language Models

Parallel programs in high performance computing (HPC) continue to grow in complexity and scale in the exascale era. The diversity in hardware and parallel programming models makes developing, optimizing, and maintaining parallel software…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-15 Daniel Nichols, Aniruddha Marathe, Harshitha Menon, Todd Gamblin, Abhinav Bhatele

The Feasibility of Using OpenCL Instead of OpenMP for Parallel CPU Programming

OpenCL, along with CUDA, is one of the main tools used to program GPGPUs. However, it allows running the same code on multi-core CPUs too, making it a rival for the long-established OpenMP. In this paper we compare OpenCL and OpenMP when…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-03-24 Kamran Karimi