
Related papers: An Interrupt-Driven Work-Sharing For-Loop Schedule…

200 papers

Work-stealing is a widely used technique for balancing irregular parallel workloads, and most modern runtime systems adopt lock-free work-stealing deques to reduce contention and improve scalability. However, existing algorithms are…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-09 Raja Sai Nandhan Yadav Kataru, Danial Davarnia, Ali Jannesari

Work Stealing has been a very successful algorithm for scheduling parallel computations, and is known to achieve high performance even for computations exhibiting fine-grained parallelism. We present a variant of Work Stealing that provably avoids…

Data Structures and Algorithms · Computer Science 2019-04-30 Guilherme Rito, Hervé Paulino

Work-stealing is a popular technique to implement dynamic load balancing in a distributed manner. In this approach, each process owns a set of tasks that have to be executed. The owner of the set can put tasks in it and can take tasks from…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-02-23 Armando Castañeda, Miguel Piña

Many shared-memory parallel irregular applications, such as sparse linear algebra and graph algorithms, depend on efficient loop scheduling (LS) in a fork-join manner, even though the work per loop iteration can vary greatly depending on…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-29 Joshua Dennis Booth, Phillip Lane

Parallelism has become extremely popular over the past decade, and there have been a lot of new parallel algorithms and software. The randomized work-stealing (RWS) scheduler plays a crucial role in this ecosystem. In this paper, we study…

Data Structures and Algorithms · Computer Science 2021-11-10 Yan Gu, Zachary Napier, Yihan Sun

Algorithms for scheduling structured parallel computations have been widely studied in the literature. For some time now, Work Stealing has been one of the most popular for scheduling such computations, and its performance has been studied in…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-10-26 Guilherme Rito, Hervé Paulino

The intelligent Distributed Dispatch and Scheduling (iDDS) service is a versatile workflow orchestration system designed for large-scale, distributed scientific computing. iDDS extends traditional workload and data management by integrating…

To improve the utility of learning applications and render machine learning solutions feasible for complex applications, a substantial amount of heavy computations is needed. Thus, it is essential to delegate the computations among several…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-04-29 Homa Esfahanizadeh, Alejandro Cohen, Muriel Medard

The main goal of parallel processing is to provide users with performance that is much better than that of single processor systems. The execution of jobs is scheduled, which requires certain resources in order to meet certain criteria.…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-07 Yang Cao, Fei Wu, Thomas Robertazzi

Scheduling is an important task allowing parallel systems to perform efficiently and reliably. For modern computation systems, divisible load is a special type of data which can be divided into arbitrary sizes and independently processed in…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-07 Fei Wu, Yang Cao, Thomas Robertazzi

Work-stealing systems are typically oblivious to the nature of the tasks they are scheduling. For instance, they do not know or take into account how long a task will take to execute or how many subtasks it will spawn. Moreover, the actual…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-05-29 Martin Wimmer, Daniel Cederman, Jesper Larsson Träff, Philippas Tsigas

The nested parallel (a.k.a. fork-join) model is widely used for writing parallel programs. However, the two composition constructs, i.e. "$\parallel$" (parallel) and "$;$" (serial), are insufficient in expressing "partial dependencies" or…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-02-16 David Dinh, Harsha Vardhan Simhadri, Yuan Tang

The augmented scale and complexity of urban transportation networks have significantly increased the execution time and resource requirements of vehicular network simulations, exceeding the capabilities of sequential simulators. The need…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-08 Mohammad A Hoque, Xiaoyan Hong, Md Salman Ahmed

The convergence of high-performance computing (HPC) and artificial intelligence (AI) is driving the emergence of increasingly complex parallel applications and workloads. These workloads often combine multiple parallel runtimes within the…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-29 Aleix Roca, Vicenç Beltran

Parallel task-based programming models, like OpenMP, allow application developers to easily create a parallel version of their sequential codes. The standard OpenMP 4.0 introduced the possibility of describing a set of data dependences per…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-09 Jaume Bosch, Carlos Álvarez, Daniel Jiménez-González, Xavier Martorell, Eduard Ayguadé

We consider shared-object systems that require their threads to fulfill the system jobs by first sequentially acquiring the objects needed for the jobs and then holding on to them until the job completion. Such systems are at the core of a…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-11-04 Iosif Salem, Elad M. Schiller, Marina Papatriantafilou, Philippas Tsigas

Task parallelism is designed to simplify the task of parallel programming. When executing a task parallel program on modern NUMA architectures, it can fail to scale due to the phenomenon called work inflation, where the overall processing…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-01-08 Justin Deters, Jiaye Wu, Yifan Xu, I-Ting Angelina Lee

Shared memory programming models usually provide worksharing and task constructs. The former relies on the efficient fork-join execution model to exploit structured parallelism; while the latter relies on fine-grained synchronization among…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-08 M. Maronas, K. Sala, S. Mateo, E. Ayguadé, V. Beltran (Barcelona Supercomputing Center)

The task-based dataflow programming model has emerged as an alternative to the process-centric programming model for extreme-scale applications. However, load balancing is still a challenge in task-based dataflow runtimes. In this paper, we…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-11-11 Joseph John, Josh Milthorpe, Peter Strazdins

Computationally-intensive loops are the primary source of parallelism in scientific applications. Such loops are often irregular and a balanced execution of their loop iterations is critical for achieving high performance. However, several…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-03-25 Ahmed Eleliemy, Florina M. Ciorba