Related papers: Task-Graph Scheduling Extensions for Efficient Syn…

200 papers

Large Language Models (LLMs) have demonstrated exceptional abilities in reasoning for task planning. However, challenges remain under-explored for parallel schedules. This paper introduces a novel paradigm, plan-over-graph, in which the…

Artificial Intelligence · Computer Science 2025-02-21 Shiqi Zhang, Xinbei Ma, Zouying Cao, Zhuosheng Zhang, Hai Zhao

Parallel task-based programming models, like OpenMP, allow application developers to easily create a parallel version of their sequential codes. The standard OpenMP 4.0 introduced the possibility of describing a set of data dependences per…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-09 Jaume Bosch, Carlos Álvarez, Daniel Jiménez-González, Xavier Martorell, Eduard Ayguadé

Dataflow devices represent an avenue towards saving the control and data movement overhead of Load-Store Architectures. Various dataflow accelerators have been proposed, but how to efficiently schedule applications on such devices remains…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-06 Tiziano De Matteis, Lukas Gianinazzi, Johannes de Fine Licht, Torsten Hoefler

OpenMP is the de facto standard for shared-memory systems in High-Performance Computing (HPC). It includes a task-based model that offers a high level of abstraction to effectively exploit highly dynamic structured and unstructured…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-12-12 Chenle Yu, Sara Royuela, Eduardo Quiñones

Parallel programming models can encourage performance portability by moving the responsibility for work assignment and data distribution from the programmer to a runtime system. However, analyzing the resulting implicit memory allocations,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-14 Fabian Knorr, Philip Salzmann, Peter Thoman, Thomas Fahringer

Task graphs provide a simple way to describe scientific workflows (sets of tasks with dependencies) that can be executed on both HPC clusters and in the cloud. An important aspect of executing such graphs is the scheduling algorithm used.…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-04-18 Jakub Beránek, Stanislav Böhm, Vojtěch Cima

Task-based programming models like OmpSs-2 and OpenMP provide a flexible data-flow execution model to exploit dynamic, irregular and nested parallelism. Providing an efficient implementation that scales well with small granularity tasks…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-18 David Álvarez, Kevin Sala, Marcos Maroñas, Aleix Roca, Vicenç Beltran

Several methods exist today to accelerate Machine Learning (ML) or Deep Learning (DL) model performance for training and inference. However, modern techniques that rely on various graph and operator parallelism methodologies rely on search…

Machine Learning · Computer Science 2023-08-23 Srinjoy Das, Lawrence Rauchwerger

With the rapidly growing demand for graph processing in real-world settings, graph processing systems have to efficiently handle massive numbers of concurrent jobs. Although existing work can efficiently handle a single graph processing job, there are plenty of memory…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-06-05 Jin Zhao

The scheduling of task graphs with communication delays has been extensively studied. Recently, new results for the common sub-case of fork-join shaped task graphs were published, including an EPTAS and polynomial algorithms for special…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-06 Huijun Wang, Oliver Sinnen

Over the years, many multiprocessor locking protocols have been designed and analyzed. However, the performance of these protocols highly depends on how the tasks are partitioned and prioritized and how the resources are shared locally and…

Operating Systems · Computer Science 2018-09-11 Jian-Jia Chen, Georg von der Brüggen, Junjie Shi, Niklas Ueter

Parallel real-time embedded applications can be modelled as directed acyclic graphs (DAGs) whose nodes model subtasks and whose edges model precedence constraints among subtasks. Efficiently scheduling such parallel tasks can be challenging…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-24 Shardul Lendve, Konstantinos Bletsas, Pedro F. Souto

Work-stealing is a widely used technique for balancing irregular parallel workloads, and most modern runtime systems adopt lock-free work-stealing deques to reduce contention and improve scalability. However, existing algorithms are…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-09 Raja Sai Nandhan Yadav Kataru, Danial Davarnia, Ali Jannesari

The Integrative Model for Parallelism (IMP) derives a task graph from a higher level description of parallel algorithms. In this note we show how task graph transformations can be used to achieve latency tolerance in the program execution.…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-14 Victor Eijkhout

To satisfy the increasing performance needs of modern cyber-physical systems, multiprocessor architectures are increasingly utilized. To efficiently exploit their potential parallelism in hard real-time systems, appropriate task models and…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-26 Niklas Ueter, Mario Günzel, Georg von der Brüggen, Jian-Jia Chen

Work-stealing systems are typically oblivious to the nature of the tasks they are scheduling. For instance, they do not know or take into account how long a task will take to execute or how many subtasks it will spawn. Moreover, the actual…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-05-29 Martin Wimmer, Daniel Cederman, Jesper Larsson Träff, Philippas Tsigas

Task parallelism as employed by the OpenMP task construct, although ideal for tackling irregular problems or typical producer/consumer schemes, bears some potential for performance bottlenecks if locality of data access is important, which…

Performance · Computer Science 2009-02-12 Markus Wittmann, Georg Hager

There are billions of lines of sequential code in today's software that do not benefit from the parallelism available in modern multicore architectures. Automatically parallelizing sequential code, to promote an efficient use of the…

Programming Languages · Computer Science 2016-04-13 Alcides Fonseca, Bruno Cabral, João Rafael, Ivo Correia

Shared memory programming models usually provide worksharing and task constructs. The former relies on the efficient fork-join execution model to exploit structured parallelism; while the latter relies on fine-grained synchronization among…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-08 M. Maronas, K. Sala, S. Mateo, E. Ayguadé, V. Beltran (Barcelona Supercomputing Center)

The vast amounts of data used in social, business or traffic networks, biology and other natural sciences are often managed in graph-based data sets, consisting of a few thousand up to billions and trillions of vertices and edges,…

Databases · Computer Science 2021-10-22 Matthias Hauck, Ismail Oukid, Holger Fröning