Related papers: A Proof of Concept for Optimizing Task Parallelism…

Optimizing ccNUMA locality for task-parallel execution under OpenMP and TBB on multicore-based systems

Task parallelism as employed by the OpenMP task construct or some Intel Threading Building Blocks (TBB) components, although ideal for tackling irregular problems or typical producer/consumer schemes, bears some potential for performance…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-01-04 Markus Wittmann, Georg Hager

Dynamic Loop Parallelisation

Regions of nested loops are a common feature of High Performance Computing (HPC) codes. In shared memory programming models, such as OpenMP, these structures are the most common source of parallelism. Parallelising these structures requires…

Programming Languages · Computer Science 2012-05-14 Adrian Jackson, Orestis Agathokleous

Scheduling Task-parallel Applications in Dynamically Asymmetric Environments

Shared resource interference is observed by applications as dynamic performance asymmetry. Prior art has developed approaches to reduce the impact of performance asymmetry mainly at the operating system and architectural levels. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-24 Jing Chen, Pirah Noor Soomro, Mustafa Abduljabbar, Madhavan Manivannan, Miquel Pericas

Towards Efficient OpenMP Strategies for Non-Uniform Architectures

Parallel processing is considered the current and future trend for improving the performance of computers. Computing devices ranging from small embedded systems to big clusters of computers rely on parallelizing applications to reduce execution…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-11-27 Oussama Tahan

Optimizing Fine-Grained Parallelism Through Dynamic Load Balancing on Multi-Socket Many-Core Systems

Achieving efficient task parallelism on many-core architectures is an important challenge. The widely used GNU OpenMP implementation of the popular OpenMP parallel programming model incurs high overhead for fine-grained, short-running tasks…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-20 Wenyi Wang, Maxime Gonthier, Poornima Nookala, Haochen Pan, Ian Foster, Ioan Raicu, Kyle Chard

Extending Task Parallelism for Frequent Pattern Mining

Algorithms for frequent pattern mining, a popular informatics application, have unique requirements that are not met by any of the existing parallel tools. In particular, such applications operate on extremely large data sets and have…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-11-08 Prabhanjan Kambadur, Amol Ghoting, Anshul Gupta, Andrew Lumsdaine

Learning-based Dynamic Pinning of Parallelized Applications in Many-Core Systems

Motivated by the need for adaptive, secure and responsive scheduling in a great range of computing applications, including human-centered and time-critical applications, this paper proposes a scheduling framework that seamlessly adds…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-14 Georgios C. Chasparis, Vladimir Janjic, Michael Rossbory

Proactive bottleneck performance analysis in parallel computing using openMP

The aim of parallel computing is to increase an application's performance by executing the application on multiple processors. OpenMP is an API that supports a multi-platform shared-memory programming model, and shared-memory programs are…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-11-12 Vibha Rajput, Alok Katiyar

Integrating Job Parallelism in Real-Time Scheduling Theory

We investigate the global scheduling of sporadic, implicit deadline, real-time task systems on multiprocessor platforms. We provide a task model which integrates job parallelism. We prove that the time-complexity of the feasibility problem…

Operating Systems · Computer Science 2008-05-22 S. Collette, L. Cucu, J. Goossens

Task-Graph Scheduling Extensions for Efficient Synchronization and Communication

Task graphs have been studied for decades as a foundation for scheduling irregular parallel applications and incorporated in programming models such as OpenMP. While many high-performance parallel libraries are based on task graphs, they…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-09 Seonmyeong Bak, Oscar Hernandez, Mark Gates, Piotr Luszczek, Vivek Sarkar

Advanced Synchronization Techniques for Task-based Runtime Systems

Task-based programming models like OmpSs-2 and OpenMP provide a flexible data-flow execution model to exploit dynamic, irregular and nested parallelism. Providing an efficient implementation that scales well with small granularity tasks…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-18 David Álvarez, Kevin Sala, Marcos Maroñas, Aleix Roca, Vicenç Beltran

Multi-Resource Parallel Query Scheduling and Optimization

Scheduling query execution plans is a particularly complex problem in shared-nothing parallel systems, where each site consists of a collection of local time-shared (e.g., CPU(s) or disk(s)) and space-shared (e.g., memory) resources and…

Databases · Computer Science 2014-04-01 Minos Garofalakis, Yannis Ioannidis

Asynchronous Runtime with Distributed Manager for Task-based Programming Models

Parallel task-based programming models, like OpenMP, allow application developers to easily create a parallel version of their sequential codes. The standard OpenMP 4.0 introduced the possibility of describing a set of data dependences per…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-09 Jaume Bosch, Carlos Álvarez, Daniel Jiménez-González, Xavier Martorell, Eduard Ayguadé

New Thread Migration Strategies for NUMA Systems

Multicore systems present on-board memory hierarchies and communication networks that influence performance when executing shared memory parallel codes. Characterising this influence is complex, and understanding the effect of particular…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-10-01 O. G. Lorenzo, M. L. Becoña, T. F. Pena, J. C. Cabaleiro, J. A. Lorenzo, F. F. Rivera

Scheduling Parallel-Task Jobs Subject to Packing and Placement Constraints

Motivated by modern parallel computing applications, we consider the problem of scheduling parallel-task jobs with heterogeneous resource requirements in a cluster of machines. Each job consists of a set of tasks that can be processed in…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-03 Mehrnoosh Shafiee, Javad Ghaderi

Diversity/Parallelism Trade-off in Distributed Systems with Redundancy

As numerous machine learning and other algorithms increase in complexity and data requirements, distributed computing becomes necessary to satisfy the growing computational and storage demands, because it enables parallel execution of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-21 Pei Peng, Emina Soljanin, Philip Whiting

Approximation algorithms for energy, reliability and makespan optimization problems

In this paper, we consider the problem of scheduling an application on a parallel computational platform. The application is a particular task graph, either a linear chain of tasks, or a set of independent tasks. The platform is made of…

Data Structures and Algorithms · Computer Science 2012-10-18 Guillaume Aupy, Anne Benoit

CPU Scheduling in Data Centers Using Asynchronous Finite-Time Distributed Coordination Mechanisms

We propose an asynchronous iterative scheme that allows a set of interconnected nodes to distributively reach an agreement within a pre-specified bound in a finite number of steps. While this scheme could be adopted in a wide variety of…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-01-13 Andreas Grammenos, Themistoklis Charalambous, Evangelia Kalyvianaki

Dynamic load balancing with enhanced shared-memory parallelism for particle-in-cell codes

Furthering our understanding of many of today's interesting problems in plasma physics, including plasma-based acceleration and magnetic reconnection with pair production due to quantum electrodynamic effects, requires large-scale kinetic…

Computational Physics · Physics 2020-10-28 Kyle G. Miller, Roman P. Lee, Adam Tableman, Anton Helm, Ricardo A. Fonseca, Viktor K. Decyk, Warren B. Mori

On the Optimal Control of Parallel Processing Networks with Resource Collaboration and Multitasking

We study scheduling control of parallel processing networks in which some resources need to simultaneously collaborate to perform some activities and some resources multitask. Resource collaboration and multitasking give rise to…

Optimization and Control · Mathematics 2020-12-29 Erhun Özkan