Related papers: Dynamic Loop Scheduling Using MPI Passive-Target R…

Hierarchical Dynamic Loop Self-Scheduling on Distributed-Memory Systems Using an MPI+MPI Approach

Computationally-intensive loops are the primary source of parallelism in scientific applications. Such loops are often irregular and a balanced execution of their loop iterations is critical for achieving high performance. However, several…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-03-25 Ahmed Eleliemy, Florina M. Ciorba

A Distributed Chunk Calculation Approach for Self-scheduling of Parallel Applications on Distributed-memory Systems

Loop scheduling techniques aim to achieve load-balanced executions of scientific applications. Dynamic loop self-scheduling (DLS) libraries for distributed-memory systems are typically MPI-based and employ a centralized chunk calculation…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-19 Ahmed Eleliemy, Florina M. Ciorba

Performance Reproduction and Prediction of Selected Dynamic Loop Scheduling Experiments

Scientific applications are complex, large, and often exhibit irregular and stochastic behavior. The use of efficient loop scheduling techniques in computationally-intensive applications is crucial for improving their performance on…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-08 Ali Mohammed, Ahmed Eleliemy, Florina M. Ciorba

Experimental Verification and Analysis of Dynamic Loop Scheduling in Scientific Applications

Scientific applications are often irregular and characterized by large computationally-intensive parallel loops. Dynamic loop scheduling (DLS) techniques improve the performance of computationally-intensive scientific applications via load…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-05-08 Ali Mohammed, Ahmed Eleliemy, Florina M. Ciorba, Franziska Kasielke, Ioana Banicescu

rDLB: A Novel Approach for Robust Dynamic Load Balancing of Scientific Applications with Parallel Independent Tasks

Scientific applications often contain large and computationally intensive parallel loops. Dynamic loop self scheduling (DLS) is used to achieve a balanced load execution of such applications on high performance computing (HPC) systems.…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-07 Ali Mohammed, Aurelien Cavelan, Florina M. Ciorba

An Approach for Realistically Simulating the Performance of Scientific Applications on High Performance Computing Systems

Scientific applications often contain large, computationally-intensive, and irregular parallel loops or tasks that exhibit stochastic characteristics. Applications may suffer from load imbalance during their execution on high-performance…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-16 Ali Mohammed, Ahmed Eleliemy, Florina M. Ciorba, Franziska Kasielke, Ioana Banicescu

Resource Optimization with MPI Process Malleability for Dynamic Workloads in HPC Clusters

Dynamic resource management is essential for optimizing computational efficiency in modern high-performance computing (HPC) environments, particularly as systems scale. While research has demonstrated the benefits of malleability in…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-18 Sergio Iserte, Iker Martín-Álvarez, Krzysztof Rojek, José I. Aliaga, Maribel Castillo, Weronika Folwarska, Antonio J. Peña

Decentralized Task Scheduling in Distributed Systems: A Deep Reinforcement Learning Approach

Efficient task scheduling in large-scale distributed systems presents significant challenges due to dynamic workloads, heterogeneous resources, and competing quality-of-service requirements. Traditional centralized approaches face…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-27 Daniel Benniah John

Two-level Dynamic Load Balancing for High Performance Scientific Applications

Scientific applications are often complex, irregular, and computationally-intensive. To accommodate the ever-increasing computational demands of scientific applications, high-performance computing (HPC) systems have become larger and more…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-20 Ali Mohammed, Aurelien Cavelan, Florina M. Ciorba, Ruben M. Cabezon, Ioana Banicescu

Extending SLURM for Dynamic Resource-Aware Adaptive Batch Scheduling

With the growing constraints on power budget and increasing hardware failure rates, the operation of future exascale systems faces several challenges. Towards this, resource awareness and adaptivity by enabling malleable jobs has been…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-21 Mohak Chadha, Jophin John, Michael Gerndt

Structure-Aware Dynamic Scheduler for Parallel Machine Learning

Training large machine learning (ML) models with many variables or parameters can take a long time if one employs sequential procedures even with stochastic updates. A natural solution is to turn to distributed computing on a cluster;…

Machine Learning · Statistics 2013-12-31 Seunghak Lee, Jin Kyu Kim, Qirong Ho, Garth A. Gibson, Eric P. Xing

Distributed In-memory Data Management for Workflow Executions

Complex scientific experiments from various domains are typically modeled as workflows and executed on large-scale machines using a Parallel Workflow Management System (WMS). Since such executions usually last for hours or days, some WMSs…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-13 Renan Souza, Vítor Silva, Alexandre A. B. Lima, Daniel de Oliveira, Patrick Valduriez, Marta Mattoso

Compiler Discovered Dynamic Scheduling of Irregular Code in High-Level Synthesis

Dynamically scheduled high-level synthesis (HLS) achieves higher throughput than static HLS for codes with unpredictable memory accesses and control flow. However, excessive dataflow scheduling results in circuits that use more resources…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-30 Robert Szafarczyk, Syed Waqar Nabi, Wim Vanderbauwhede

Mitigating inefficient task mappings with an Adaptive Resource-Moldable Scheduler (ARMS)

Efficient runtime task scheduling on complex memory hierarchy becomes increasingly important as modern and future High-Performance Computing (HPC) systems are progressively composed of multisocket and multi-chiplet nodes with nonuniform…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-20 Mustafa Abduljabbar, Mahmoud Eljammaly, Miquel Pericas

An Adaptive Self-Scheduling Loop Scheduler

Many shared-memory parallel irregular applications, such as sparse linear algebra and graph algorithms, depend on efficient loop scheduling (LS) in a fork-join manner despite that the work per loop iteration can greatly vary depending on…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-29 Joshua Dennis Booth, Phillip Lane

Distributed TensorFlow with MPI

Machine Learning and Data Mining (MLDM) algorithms are becoming increasingly important in analyzing large volume of data generated by simulations, experiments and mobile devices. With increasing data volume, distributed memory systems (such…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-08-21 Abhinav Vishnu, Charles Siegel, Jeffrey Daily

DMRlib: Easy-coding and Efficient Resource Management for Job Malleability

Process malleability has proved to have a highly positive impact on the resource utilization and global productivity in data centers compared with the conventional static resource allocation policy. However, the non-negligible additional…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-30 Sergio Iserte, Rafael Mayo, Enrique S. Quintana-Ortí, Antonio J. Peña

Learning-based Dynamic Pinning of Parallelized Applications in Many-Core Systems

Motivated by the need for adaptive, secure and responsive scheduling in a great range of computing applications, including human-centered and time-critical applications, this paper proposes a scheduling framework that seamlessly adds…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-14 Georgios C. Chasparis, Vladimir Janjic, Michael Rossbory

Distributed dynamic load balancing for task parallel programming

In this paper, we derive and investigate approaches to dynamically load balance a distributed task parallel application software. The load balancing strategy is based on task migration. Busy processes export parts of their ready task queue…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-01-16 Afshin Zafari, Elisabeth Larsson

Dynamic Loop Fusion in High-Level Synthesis

Dynamic High-Level Synthesis (HLS) uses additional hardware to perform memory disambiguation at runtime, increasing loop throughput in irregular codes compared to static HLS. However, most irregular codes consist of multiple sibling loops,…

Hardware Architecture · Computer Science 2025-01-27 Robert Szafarczyk, Syed Waqar Nabi, Wim Vanderbauwhede