Related papers: AutOMP: An Automatic OpenMP Parallelization Genera…

OMPar: Automatic Parallelization with AI-Driven Source-to-Source Compilation

Manual parallelization of code remains a significant challenge due to the complexities of modern software systems and the widespread adoption of multi-core architectures. This paper introduces OMPar, an AI-driven tool designed to automate…

Computation and Language · Computer Science 2024-09-24 Tal Kadosh , Niranjan Hasabnis , Prema Soundararajan , Vy A. Vo , Mihai Capota , Nesreen Ahmed , Yuval Pinter , Gal Oren

Learning to Parallelize in a Shared-Memory Environment with Transformers

In past years, the world has switched to many-core and multi-core shared memory architectures. As a result, there is a growing need to utilize these architectures by introducing shared memory parallelization schemes to software…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-15 Re'em Harel , Yuval Pinter , Gal Oren

Dynamic Loop Parallelisation

Regions of nested loops are a common feature of High Performance Computing (HPC) codes. In shared memory programming models, such as OpenMP, these structure are the most common source of parallelism. Parallelising these structures requires…

Programming Languages · Computer Science 2012-05-14 Adrian Jackson , Orestis Agathokleous

Enabling performance portability of data-parallel OpenMP applications on asymmetric multicore processors

Asymmetric multicore processors (AMPs) couple high-performance big cores and low-power small cores with the same instruction-set architecture but different features, such as clock frequency or microarchitecture. Previous work has shown that…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-02-13 Juan Carlos Saez , Fernando Castro , Manuel Prieto-Matias

Automatic task-based parallelization of C++ applications by source-to-source transformations

Currently, multi/many-core CPUs are considered standard in most types of computers including, mobile phones, PCs or supercomputers. However, the parallelization of applications as well as refactoring/design of applications for efficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-25 Garip Kusoglu , Berenger Bramas , Stephane Genaud

Source-to-Source Automatic Differentiation of OpenMP Parallel Loops

This paper presents our work toward correct and efficient automatic differentiation of OpenMP parallel worksharing loops in forward and reverse mode. Automatic differentiation is a method to obtain gradients of numerical programs, which are…

Mathematical Software · Computer Science 2021-11-04 Jan Hückelheim , Laurent Hascoët

Towards Autotuning of OpenMP Applications on Multicore Architectures

In this paper we describe an autotuning tool for optimization of OpenMP applications on highly multicore and multithreaded architectures. Our work was motivated by in-depth performance analysis of scientific applications and synthetic…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-01-17 Jakub Katarzyński , Maciej Cytowski

A Formal Semantics of C with OpenMP Parallelism (Extended Version)

OpenMP is a popular parallelization framework that lets users transform sequential code into parallel code with a few simple annotations. Unfortunately, it is also easy to inadvertently introduce errors by adding OpenMP pragmas into…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-28 Ke Du , Anshu Sharma , Liyi Li , William Mansky

Toward a Standard Interface for User-Defined Scheduling in OpenMP

Parallel loops are an important part of OpenMP programs. Efficient scheduling of parallel loops can improve performance of the programs. The current OpenMP specification only offers three options for loop scheduling, which are insufficient…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-10 Vivek Kale , Christian Iwainsky , Michael Klemm , Jonas H. Muller Korndorfer , Florina M. Ciorba

Using Cognitive Computing for Learning Parallel Programming: An IBM Watson Solution

While modern parallel computing systems provide high performance resources, utilizing them to the highest extent requires advanced programming expertise. Programming for parallel computing systems is much more difficult than programming for…

Programming Languages · Computer Science 2017-04-06 Adrian Calvo Chozas , Suejb Memeti , Sabri Pllana

Supporting OpenMP 5.0 Tasks in hpxMP -- A study of an OpenMP implementation within Task Based Runtime Systems

OpenMP has been the de facto standard for single node parallelism for more than a decade. Recently, asynchronous many-task runtime (AMT) systems have increased in popularity as a new programming paradigm for high performance computing…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-20 Tianyi Zhang , Shahrzad Shirzad , Bibek Wagle , Adrian S. Lemoine , Patrick Diehl , Hartmut Kaiser

OMP-Engineer: Bridging Syntax Analysis and In-Context Learning for Efficient Automated OpenMP Parallelization

In advancing parallel programming, particularly with OpenMP, the shift towards NLP-based methods marks a significant innovation beyond traditional S2S tools like Autopar and Cetus. These NLP approaches train on extensive datasets of…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-07 Weidong Wang , Haoran Zhu

OMP2MPI: Automatic MPI code generation from OpenMP programs

In this paper, we present OMP2MPI a tool that generates automatically MPI source code from OpenMP. With this transformation the original program can be adapted to be able to exploit a larger number of processors by surpassing the limits of…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-06-12 Albert Saa-Garriga , David Castells-Rufas , Jordi Carrabina

Implementing implicit OpenMP data sharing on GPUs

OpenMP is a shared memory programming model which supports the offloading of target regions to accelerators such as NVIDIA GPUs. The implementation in Clang/LLVM aims to deliver a generic GPU compilation toolchain that supports both the…

Programming Languages · Computer Science 2017-11-29 Gheorghe-Teodor Bercea , Carlo Bertolli , Arpith C. Jacob , Alexandre Eichenberger , Alexey Bataev , Georgios Rokos , Hyojin Sung , Tong Chen , Kevin O'Brien

Learning to Parallelize with OpenMP by Augmented Heterogeneous AST Representation

Detecting parallelizable code regions is a challenging task, even for experienced developers. Numerous recent studies have explored the use of machine learning for code analysis and program synthesis, including parallelization, in light of…

Machine Learning · Computer Science 2024-11-25 Le Chen , Quazi Ishtiaque Mahmud , Hung Phan , Nesreen K. Ahmed , Ali Jannesari

Quantifying OpenMP: Statistical Insights into Usage and Adoption

In high-performance computing (HPC), the demand for efficient parallel programming models has grown dramatically since the end of Dennard Scaling and the subsequent move to multi-core CPUs. OpenMP stands out as a popular choice due to its…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-21 Tal Kadosh , Niranjan Hasabnis , Timothy Mattson , Yuval Pinter , Gal Oren

Accelerating Fortran Codes: A Method for Integrating Coarray Fortran with CUDA Fortran and OpenMP

Fortran's prominence in scientific computing requires strategies to ensure both that legacy codes are efficient on high-performance computing systems, and that the language remains attractive for the development of new high-performance…

Instrumentation and Methods for Astrophysics · Physics 2024-09-13 James McKevitt , Eduard I. Vorobyov , Igor Kulikov

Advising OpenMP Parallelization via a Graph-Based Approach with Transformers

There is an ever-present need for shared memory parallelization schemes to exploit the full potential of multi-core architectures. The most common parallelization API addressing this need today is OpenMP. Nevertheless, writing parallel code…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-23 Tal Kadosh , Nadav Schneider , Niranjan Hasabnis , Timothy Mattson , Yuval Pinter , Gal Oren

An Operational Semantic Basis for OpenMP Race Analysis

OpenMP is the de facto standard to exploit the on-node parallelism in new generation supercomputers.Despite its overall ease of use, even expert users are known to create OpenMP programs that harbor concurrency errors, of which one of the…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-15 Simone Atzeni , Ganesh Gopalakrishnan

ComPar: Optimized Multi-Compiler for Automatic OpenMP S2S Parallelization

Parallelization schemes are essential in order to exploit the full benefits of multi-core architectures. In said architectures, the most comprehensive parallelization API is OpenMP. However, the introduction of correct and optimal OpenMP…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-28 Idan Mosseri , Lee-or Alon , Re'em Harel , Gal Oren