Related papers: Accelerating Task-based Iterative Applications
OpenMP is the de-facto standard for shared memory systems in High-Performance Computing (HPC). It includes a task-based model that offers a high-level of abstraction to effectively exploit highly dynamic structured and unstructured…
Task-based programming models have proven to be a robust and versatile way to approach development of applications for distributed environments. They provide natural programming patterns with high performance. However, execution on this…
Task-based programming models like OmpSs-2 and OpenMP provide a flexible data-flow execution model to exploit dynamic, irregular and nested parallelism. Providing an efficient implementation that scales well with small granularity tasks…
Parallel task-based programming models, like OpenMP, allow application developers to easily create a parallel version of their sequential codes. The standard OpenMP 4.0 introduced the possibility of describing a set of data dependences per…
Task graphs have been studied for decades as a foundation for scheduling irregular parallel applications and incorporated in programming models such as OpenMP. While many high-performance parallel libraries are based on task graphs, they…
A heterogeneous architecture composed by a host and an accelerator must frequently deal with situations where several independent tasks are available to be offloaded onto the accelerator. These tasks can be generated by concurrent…
Parallel real-time embedded applications can be modelled as directed acyclic graphs (DAGs) whose nodes model subtasks and whose edges model precedence constraints among subtasks. Efficiently scheduling such parallel tasks can be challenging…
Taskflow aims to streamline the building of parallel and heterogeneous applications using a lightweight task graph-based approach. Taskflow introduces an expressive task graph programming model to assist developers in the implementation of…
As the Moore's scaling era comes to an end, application specific hardware accelerators appear as an attractive way to improve the performance and power efficiency of our computing systems. A massively heterogeneous system with a large…
Open-source matters, not just to the current cohort of HPC users but also to potential new HPC communities, such as machine learning, themselves often rooted in open-source. Many of these potential new workloads are, by their very nature,…
There are billions of lines of sequential code inside nowadays' software which do not benefit from the parallelism available in modern multicore architectures. Automatically parallelizing sequential code, to promote an efficient use of the…
We consider the classical problem of scheduling task graphs corresponding to complex applications on distributed computing systems. A number of heuristics have been previously proposed to optimize task scheduling with respect to metrics…
Hard real-time systems like image processing, autonomous driving, etc. require an increasing need of computational power that classical multi-core platforms can not provide, to fulfill with their timing constraints. Heterogeneous…
As modern HPC computing platforms become increasingly heterogeneous, it is challenging for programmers to fully leverage the computation power of massive parallelism offered by such heterogeneity. Consequently, task-based runtime systems…
There has been significant progress in understanding the parallelism inherent to iterative sequential algorithms: for many classic algorithms, the depth of the dependence structure is now well understood, and scheduling techniques have been…
Despite the various research initiatives and proposed programming models, efficient solutions for parallel programming in HPC clusters still rely on a complex combination of different programming models (e.g., OpenMP and MPI), languages…
Task graphs provide a simple way to describe scientific workflows (sets of tasks with dependencies) that can be executed on both HPC clusters and in the cloud. An important aspect of executing such graphs is the used scheduling algorithm.…
Scheduling computational tasks represented by directed acyclic graphs (DAGs) is challenging because of its complexity. Conventional scheduling algorithms rely heavily on simple heuristics such as shortest job first (SJF) and critical path…
Scientific workflows are often represented as directed acyclic graphs (DAGs), where vertices correspond to tasks and edges represent the dependencies between them. Since these graphs are often large in both the number of tasks and their…
With the rapid advancement of Artificial Intelligence, the Graphics Processing Unit (GPU) has become increasingly essential across a growing number of safety-critical application domains. Applying a GPU is indispensable for parallel…