Related papers: ComPar: Optimized Multi-Compiler for Automatic Ope…

OMPar: Automatic Parallelization with AI-Driven Source-to-Source Compilation

Manual parallelization of code remains a significant challenge due to the complexities of modern software systems and the widespread adoption of multi-core architectures. This paper introduces OMPar, an AI-driven tool designed to automate…

Computation and Language · Computer Science 2024-09-24 Tal Kadosh, Niranjan Hasabnis, Prema Soundararajan, Vy A. Vo, Mihai Capota, Nesreen Ahmed, Yuval Pinter, Gal Oren

Advising OpenMP Parallelization via a Graph-Based Approach with Transformers

There is an ever-present need for shared memory parallelization schemes to exploit the full potential of multi-core architectures. The most common parallelization API addressing this need today is OpenMP. Nevertheless, writing parallel code…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-23 Tal Kadosh, Nadav Schneider, Niranjan Hasabnis, Timothy Mattson, Yuval Pinter, Gal Oren

Learning to Parallelize in a Shared-Memory Environment with Transformers

In past years, the world has switched to many-core and multi-core shared memory architectures. As a result, there is a growing need to utilize these architectures by introducing shared memory parallelization schemes to software…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-15 Re'em Harel, Yuval Pinter, Gal Oren

OMP-Engineer: Bridging Syntax Analysis and In-Context Learning for Efficient Automated OpenMP Parallelization

In advancing parallel programming, particularly with OpenMP, the shift towards NLP-based methods marks a significant innovation beyond traditional S2S tools like Autopar and Cetus. These NLP approaches train on extensive datasets of…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-07 Weidong Wang, Haoran Zhu

A comparison between Automatically versus Manually Parallelized NAS Benchmarks

We compare automatically and manually parallelized NAS Benchmarks in order to identify code sections that differ. We discuss opportunities for advancing automatic parallelizers. We find ten patterns that pose challenges for current…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-12-02 Parinaz Barakhshan, Rudolf Eigenmann

Estimating the overlap between dependent computations for automatic parallelization

Researchers working on the automatic parallelization of programs have long known that too much parallelism can be even worse for performance than too little, because spawning a task to be run on another CPU incurs overheads.…

Programming Languages · Computer Science 2011-09-08 Paul Bone, Zoltan Somogyi, Peter Schachte

MKPipe: A Compiler Framework for Optimizing Multi-Kernel Workloads in OpenCL for FPGA

OpenCL for FPGA enables developers to design FPGAs using a programming model similar to that for processors. Recent works have shown that code optimization at the OpenCL level is important to achieve high computational efficiency. However, existing…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-06 Ji Liu, Abdullah-Al Kafi, Xipeng Shen, Huiyang Zhou

Learning to Parallelize with OpenMP by Augmented Heterogeneous AST Representation

Detecting parallelizable code regions is a challenging task, even for experienced developers. Numerous recent studies have explored the use of machine learning for code analysis and program synthesis, including parallelization, in light of…

Machine Learning · Computer Science 2024-11-25 Le Chen, Quazi Ishtiaque Mahmud, Hung Phan, Nesreen K. Ahmed, Ali Jannesari

Parallelizing Program Execution on Distributed Quantum Systems via Compiler/Hardware Co-Design

As quantum computers continue to improve and support larger, more complex computations, smart control hardware and compilers are needed to efficiently leverage the capabilities of these systems. This paper introduces a novel approach to…

Quantum Physics · Physics 2025-11-19 Folkert de Ronde, Alexander Knapen, Stephan Wong, Sebastian Feld

Dynamic Loop Parallelisation

Regions of nested loops are a common feature of High Performance Computing (HPC) codes. In shared memory programming models, such as OpenMP, these structures are the most common source of parallelism. Parallelising these structures requires…

Programming Languages · Computer Science 2012-05-14 Adrian Jackson, Orestis Agathokleous

Parallel Combining: Benefits of Explicit Synchronization

Parallel batched data structures are designed to process synchronized batches of operations in a parallel computing model. In this paper, we propose parallel combining, a technique that implements a concurrent data structure from a parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-14 Vitaly Aksenov, Petr Kuznetsov, Anatoly Shalyto

Parallel Algorithm for Longest Common Subsequence in a String

In the area of Pattern Recognition and Matching, finding a Longest Common Subsequence plays an important role. In this paper, we have proposed one algorithm based on parallel computation. We have used OpenMP API package as middleware to…

Data Structures and Algorithms · Computer Science 2013-06-20 Tirtharaj Dash, Tanistha Nayak

Automatic Parallelization: Executing Sequential Programs on a Task-Based Parallel Runtime

There are billions of lines of sequential code in today's software which do not benefit from the parallelism available in modern multicore architectures. Automatically parallelizing sequential code, to promote an efficient use of the…

Programming Languages · Computer Science 2016-04-13 Alcides Fonseca, Bruno Cabral, João Rafael, Ivo Correia

Enabling Dynamic Selection of Implementation Variants in Component-Based Parallel Programming for Heterogeneous Systems

Heterogeneous systems, consisting of CPUs and GPUs, offer the capability to address the demands of compute- and data-intensive applications. However, programming such systems is challenging, requiring knowledge of various parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-08 Suejb Memeti

An OpenMP translator for the GAP8 MPSoC

One of the barriers to the adoption of parallel computing is the inherent complexity of its programming. The Open Multi-Processing (OpenMP) Application Programming Interface (API) facilitates such implementations, providing high abstraction…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-21 Reinaldo Agostinho de Souza Filho, Diego V. Cirilo do Nascimento, Samuel Xavier-de-Souza

Compiler Enhanced Scheduling for OpenMP for Heterogeneous Multiprocessors

Scheduling in Asymmetric Multicore Processors (AMP), a special case of Heterogeneous Multiprocessors, is a widely studied topic. The scheduling techniques which are mostly runtime do not usually consider parallel programming pattern used in…

Performance · Computer Science 2018-08-21 Jyothi Krishna V S, Shankar Balachandran

MCompiler: A Synergistic Compilation Framework

This paper presents a meta-compilation framework, the MCompiler. The main idea is that different segments of a program can be compiled with different compilers/optimizers and combined into a single executable. The MCompiler can be used in a…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-31 Aniket Shivam, Alexandru Nicolau, Alexander V. Veidenbaum

Redesigning OP2 Compiler to Use HPX Runtime Asynchronous Techniques

Maximizing parallelism level in applications can be achieved by minimizing overheads due to load imbalances and waiting time due to memory latencies. Compiler optimization is one of the most effective solutions to tackle this problem. The…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-03-29 Zahra Khatami, Hartmut Kaiser, J. Ramanujam

UPIR: Toward the Design of Unified Parallel Intermediate Representation for Parallel Programming Models

The complexity of heterogeneous computing architectures, as well as the demand for productive and portable parallel application development, have driven the evolution of parallel programming models to become more comprehensive and complex…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-31 Anjia Wang, Xinyao Yi, Yonghong Yan

Framework for the hybrid parallelisation of simulation codes

Writing efficient hybrid parallel code is tedious, error-prone, and requires good knowledge of both parallel programming and multithreading such as MPI and OpenMP, resp. Therefore, we present a framework which is based on a job model that…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-07-03 Ralf-Peter Mundani, Marko Ljucović, Ernst Rank