
Related papers: OMP2HMPP: HMPP Source Code Generation from Program…

200 papers

In this paper, we present OMP2MPI, a tool that automatically generates MPI source code from OpenMP. With this transformation, the original program can be adapted to exploit a larger number of processors by surpassing the limits of…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-06-12 Albert Saà-Garriga, David Castells-Rufas, Jordi Carrabina

We present OMP2HMPP, a tool that, in a first step, automatically translates OpenMP code into various possible transformations of HMPP. In a second step, OMP2HMPP executes all variants to obtain the performance and power consumption of each…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-06-10 Albert Saà-Garriga, David Castells-Rufas, Jordi Carrabina

With the increasing diversity of heterogeneous architectures in the HPC industry, porting a legacy application to run on different architectures is a tough challenge. In this paper, we present OpenMP Advisor, a first-of-its-kind compiler…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-11 Alok Mishra, Abid M. Malik, Meifeng Lin, Barbara Chapman

Currently, the most energy-efficient hardware platforms for floating point-intensive calculations (also known as High Performance Computing, or HPC) are graphical processing units (GPUs). However, porting existing scientific codes to GPUs…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-28 Michele Martone, Julia Lawall

Software developers must adapt to keep up with the changing capabilities of platforms so that they can utilize the power of High-Performance Computers (HPC), including exascale systems. OpenMP, a directive-based parallel programming model,…

Large language models (LLMs) such as ChatGPT have significantly advanced the field of Natural Language Processing (NLP). This trend led to the development of code-based large language models such as StarCoder, WizardCoder, and CodeLlama,…

Software Engineering · Computer Science 2024-11-08 Le Chen, Arijit Bhattacharjee, Nesreen Ahmed, Niranjan Hasabnis, Gal Oren, Vy Vo, Ali Jannesari

Large Language Model (LLM) based coding tools have been tremendously successful as software development assistants, yet they are often designed for general-purpose programming tasks and perform poorly in more specialized domains such as…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-20 Aman Chaturvedi, Daniel Nichols, Siddharth Singh, Abhinav Bhatele

Heterogeneity has become a mainstream architecture design choice for building High Performance Computing systems. However, heterogeneity poses significant challenges for achieving performance portability of execution. Adapting a program to…

Programming Languages · Computer Science 2023-03-17 Giorgis Georgakoudis, Konstantinos Parasyris, Chunhua Liao, David Beckingsale, Todd Gamblin, Bronis de Supinski

In recent years, it has become increasingly common for high performance computers (HPC) to possess some level of heterogeneous architecture, typically in the form of GPU accelerators. In some machines these are isolated within a dedicated…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-19 I. Zacharoudiou, J. W. S. McCullough, P. V. Coveney

Graphics processing units (GPUs) have evolved from specialized hardware capable of rendering high-quality graphics in games to commodity hardware for effectively processing blocks of data in a parallel schema. This evolution is particularly…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-03-26 Luis Cabellos

This paper presents a comprehensive comparison of three dominant parallel programming models in High Performance Computing (HPC): Message Passing Interface (MPI), Open Multi-Processing (OpenMP), and Compute Unified Device Architecture…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-19 Nizar ALHafez, Ahmad Kurdi

Electrical and electronic engineering has used parallel programming to solve its large-scale complex problems for performance reasons. However, as parallel programming requires a non-trivial distribution of tasks and data, developers…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-07-05 Antonio Wendell De Oliveira Rodrigues, Frédéric Guyomarc'h, Jean-Luc Dekeyser, Yvonnick Le Menach

Manual parallelization of code remains a significant challenge due to the complexities of modern software systems and the widespread adoption of multi-core architectures. This paper introduces OMPar, an AI-driven tool designed to automate…

Computation and Language · Computer Science 2024-09-24 Tal Kadosh, Niranjan Hasabnis, Prema Soundararajan, Vy A. Vo, Mihai Capota, Nesreen Ahmed, Yuval Pinter, Gal Oren

Utilizing GPUs is critical for high performance on heterogeneous systems. However, leveraging the full potential of GPUs for accelerating legacy CPU applications can be a challenging task for developers. The porting process requires…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-27 Shilei Tian, Tom Scogland, Barbara Chapman, Johannes Doerfert

GPU-based HPC clusters are attracting more scientific application developers due to their extensive parallelism and energy efficiency. In order to achieve portability among a variety of multi/many-core architectures, a popular choice for an…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-10 Ali TehraniJamsaz, Alok Mishra, Akash Dutta, Abid M. Malik, Barbara Chapman, Ali Jannesari

The rapid growth of deep learning has driven exponential increases in model parameters and computational demands. NVIDIA GPUs and their CUDA-based software ecosystem provide robust support for parallel computing, significantly alleviating…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-08 Jiaqi Lv, Xufeng He, Yanchen Liu, Xu Dai, Aocheng Shen, Yinghao Li, Jiachen Hao, Jianrong Ding, Yang Hu, Shouyi Yin

GPU runtimes have historically been implemented in CUDA or other vendor-specific languages dedicated to GPU programming. In this work we show that OpenMP 5.1, with minor compiler extensions, is capable of replacing existing solutions without a…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-06 Shilei Tian, Jon Chesterfield, Johannes Doerfert, Barbara Chapman

Large language models (LLMs) have catalyzed an upsurge in automatic code generation, garnering significant attention for register transfer level (RTL) code generation. Despite the potential of RTL code generation with natural language, it…

Hardware Architecture · Computer Science 2024-08-14 Chenwei Xiong, Cheng Liu, Huawei Li, Xiaowei Li

Parallel programs in high performance computing (HPC) continue to grow in complexity and scale in the exascale era. The diversity in hardware and parallel programming models makes developing, optimizing, and maintaining parallel software…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-15 Daniel Nichols, Aniruddha Marathe, Harshitha Menon, Todd Gamblin, Abhinav Bhatele

OpenCL, along with CUDA, is one of the main tools used to program GPGPUs. However, it allows running the same code on multi-core CPUs too, making it a rival for the long-established OpenMP. In this paper we compare OpenCL and OpenMP when…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-03-24 Kamran Karimi