Related papers: Advanced Synchronization Techniques for Task-based…

Asynchronous Runtime with Distributed Manager for Task-based Programming Models

Parallel task-based programming models, like OpenMP, allow application developers to easily create a parallel version of their sequential codes. The standard OpenMP 4.0 introduced the possibility of describing a set of data dependences per…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-09 Jaume Bosch, Carlos Álvarez, Daniel Jiménez-González, Xavier Martorell, Eduard Ayguadé

Taskgraph: A Low Contention OpenMP Tasking Framework

OpenMP is the de-facto standard for shared memory systems in High-Performance Computing (HPC). It includes a task-based model that offers a high level of abstraction to effectively exploit highly dynamic structured and unstructured…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-12-12 Chenle Yu, Sara Royuela, Eduardo Quiñones

Supporting OpenMP 5.0 Tasks in hpxMP -- A study of an OpenMP implementation within Task Based Runtime Systems

OpenMP has been the de facto standard for single node parallelism for more than a decade. Recently, asynchronous many-task runtime (AMT) systems have increased in popularity as a new programming paradigm for high performance computing…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-20 Tianyi Zhang, Shahrzad Shirzad, Bibek Wagle, Adrian S. Lemoine, Patrick Diehl, Hartmut Kaiser

Accelerating Task-based Iterative Applications

Task-based programming models have risen in popularity as an alternative to traditional fork-join parallelism. They are better suited to write applications with irregular parallelism that can present load imbalance. However, these…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-15 David Álvarez, Vicenç Beltran

Task inefficiency patterns for a wave equation solver

The orchestration of complex algorithms demands high levels of automation to use modern hardware efficiently. Task-based programming with OpenMP 5.0 is a prominent candidate to accomplish this goal. We study OpenMP 5.0's tasking in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-02-11 Holger Schulz, Gonzalo Brito Gadeschi, Oleksandr Rudyy, Tobias Weinzierl

Enabling performance portability of data-parallel OpenMP applications on asymmetric multicore processors

Asymmetric multicore processors (AMPs) couple high-performance big cores and low-power small cores with the same instruction-set architecture but different features, such as clock frequency or microarchitecture. Previous work has shown that…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-02-13 Juan Carlos Saez, Fernando Castro, Manuel Prieto-Matias

Task-Graph Scheduling Extensions for Efficient Synchronization and Communication

Task graphs have been studied for decades as a foundation for scheduling irregular parallel applications and incorporated in programming models such as OpenMP. While many high-performance parallel libraries are based on task graphs, they…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-09 Seonmyeong Bak, Oscar Hernandez, Mark Gates, Piotr Luszczek, Vivek Sarkar

Myrmics: Scalable, Dependency-aware Task Scheduling on Heterogeneous Manycores

Task-based programming models have become very popular, as they offer an attractive solution to parallelize serial application code with task and data annotations. They usually depend on a runtime system that schedules the tasks to multiple…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-15 Spyros Lyberis, Polyvios Pratikakis, Iakovos Mavroidis, Dimitrios S. Nikolopoulos

Scheduling Task-parallel Applications in Dynamically Asymmetric Environments

Shared resource interference is observed by applications as dynamic performance asymmetry. Prior art has developed approaches to reduce the impact of performance asymmetry mainly at the operating system and architectural levels. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-24 Jing Chen, Pirah Noor Soomro, Mustafa Abduljabbar, Madhavan Manivannan, Miquel Pericas

Analysis and Characterization of Performance Variability for OpenMP Runtime

In the high performance computing (HPC) domain, performance variability is a major scalability issue for parallel computing applications with heavy synchronization and communication. In this paper, we present an experimental performance…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-10 Minyu Cui, Nikela Papadopoulou, Miquel Pericàs

Real-Time Parallel Programming: State of Play and Open Issues

Real-time systems applications usually consist of a set of concurrent activities with timing-related properties. Developing these applications requires programming paradigms that can effectively handle the specification of concurrent…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-03-21 Luis Miguel Pinho

A Granularity Characterization of Task Scheduling Effectiveness

Task-based runtime systems provide flexible load balancing and portability for parallel scientific applications, but their strong scaling is highly sensitive to task granularity. As parallelism increases, scheduling overhead may transition…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-26 Sana Taghipour Anvari, David Kaeli

The OpenMP Cluster Programming Model

Despite the various research initiatives and proposed programming models, efficient solutions for parallel programming in HPC clusters still rely on a complex combination of different programming models (e.g., OpenMP and MPI), languages…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-16 Hervé Yviquel, Marcio Pereira, Emílio Francesquini, Guilherme Valarini, Gustavo Leite, Pedro Rosso, Rodrigo Ceccato, Carla Cusihualpa, Vitoria Dias, Sandro Rigo, Alan Souza, Guido Araujo

Experiences with task-based programming using cluster nodes as OpenMP devices

Programming a distributed system, such as a cluster, requires extended use of low-level communication libraries and can often become cumbersome and error prone for the average developer. In this work, we consider each node of a cluster as a…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-05-24 Ilias Keftakis, Vassilios V. Dimakopoulos

A Proof of Concept for Optimizing Task Parallelism by Locality Queues

Task parallelism as employed by the OpenMP task construct, although ideal for tackling irregular problems or typical producer/consumer schemes, bears some potential for performance bottlenecks if locality of data access is important, which…

Performance · Computer Science 2009-02-12 Markus Wittmann, Georg Hager

OMP-Engineer: Bridging Syntax Analysis and In-Context Learning for Efficient Automated OpenMP Parallelization

In advancing parallel programming, particularly with OpenMP, the shift towards NLP-based methods marks a significant innovation beyond traditional S2S tools like Autopar and Cetus. These NLP approaches train on extensive datasets of…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-07 Weidong Wang, Haoran Zhu

OpenMP Loop Scheduling Revisited: Making a Case for More Schedules

In light of continued advances in loop scheduling, this work revisits the OpenMP loop scheduling by outlining the current state of the art in loop scheduling and presenting evidence that the existing OpenMP schedules are insufficient for…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-09-11 Florina M. Ciorba, Christian Iwainsky, Patrick Buder

Towards Efficient OpenMP Strategies for Non-Uniform Architectures

Parallel processing is considered today's and the future trend for improving the performance of computers. Computing devices ranging from small embedded systems to big clusters of computers rely on parallelizing applications to reduce execution…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-11-27 Oussama Tahan

A Comparative Study of OpenMP Scheduling Algorithm Selection Strategies

Scientific and data science applications are becoming increasingly complex, with growing computational and memory demands. Modern high performance computing (HPC) systems provide high parallelism and heterogeneity across nodes, devices, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-29 Jonas H. Müller Korndörfer, Ali Mohammed, Ahmed Eleliemy, Quentin Guilloteau, Reto Krummenacher, Florina M. Ciorba

Ensemble Toolkit: Scalable and Flexible Execution of Ensembles of Tasks

There are many science applications that require scalable task-level parallelism and support for flexible execution and coupling of ensembles of simulations. Most high-performance system software and middleware, however, are designed to…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-29 Vivekanandan Balasubramanian, Antons Treikalis, Ole Weidner, Shantenu Jha