Related papers: Load-Balanced Bottleneck Objectives in Process Map…
We introduce a new model for the task mapping problem to aid in the systematic design of algorithms for heterogeneous systems including, but not limited to, CPUs, GPUs and FPGAs. A special focus is set on the communication between the…
Hardware accelerators, such as those based on GPUs and FPGAs, offer an excellent opportunity to efficiently parallelize functionalities. Recently, modern embedded platforms started being equipped with such accelerators, resulting in a…
Distributing spatially located heterogeneous workloads is an important problem in parallel scientific computing. We investigate the problem of partitioning such workloads (represented as a matrix of non-negative integers) into rectangles,…
Motivated by performance optimization of large-scale graph processing systems that distribute the graph across multiple machines, we consider the balanced graph partitioning problem. Compared to the previous work, we study the…
Graph partitioning is a key fundamental problem in the area of big graph computation. Previous work does not consider practical requirements when optimizing big data analysis in real applications. In this paper, motivated by…
The balanced hypergraph partitioning problem (HGP) is to partition the vertex set of a hypergraph into k disjoint blocks of bounded weight, while minimizing an objective function defined on the hyperedges. Whereas real-world applications…
Partitioning graphs into blocks of roughly equal size such that few edges run between blocks is a frequently needed operation when processing graphs on a parallel computer. When a topology of a distributed system is known, an important task…
In order to improve system performance efficiently, a number of systems are equipped with multi-core and many-core processors (such as GPUs). Due to their discrete memory, these heterogeneous architectures comprise a distributed system within…
We describe an algorithm for dynamic load balancing of geometrically parallelized synchronous Monte Carlo simulations of physical models. This algorithm is designed for a (heterogeneous) multiprocessor system of the MIMD type with…
This study addresses the challenge of simulating realistic particle systems by proposing a novel particle decomposition scheme that improves the parallel performance of surface resolved particle simulations. Realistic particle systems often…
The simulation of the physical movement of multi-body systems at an atomistic level, with forces calculated from a quantum mechanical description of the electrons, motivates a graph partitioning problem studied in this article. Several…
We survey recent trends in practical algorithms for balanced graph partitioning together with applications and future research directions.
Graph partitioning schedules parallel calculations like sparse matrix-vector multiply (SpMV). We consider contiguous partitions, where the $m$ rows (or columns) of a sparse matrix with $N$ nonzeros are split into $K$ parts without…
We study a graph partitioning problem motivated by the simulation of the physical movement of multi-body systems on an atomistic level, where the forces are calculated from a quantum mechanical description of the electrons. Several advanced…
In recent years we have witnessed a rapid development of new algorithmic techniques for parameterized algorithms for graph separation problems. We present an experimental evaluation of two cornerstone theoretical results in this area:…
In recent years, significant advances have been made in the design and evaluation of balanced (hyper)graph partitioning algorithms. We survey trends of the last decade in practical algorithms for balanced (hyper)graph partitioning together…
Modeling data sharing in GPU programs is a challenging task because of the massive parallelism and complex data sharing patterns provided by GPU architectures. Better GPU caching efficiency can be achieved through careful task scheduling…
We propose three novel mathematical optimization formulations that solve the same two-type heterogeneous multiprocessor scheduling problem for a real-time taskset with hard constraints. Our formulations are based on a global scheduling…
In this paper, we explore the graph partitioning problem, a pivotal combinatorial optimization challenge with extensive applications in various fields such as science, technology, and business. Recognized as an NP-hard problem, graph…
The problem of scheduling conflicting jobs on parallel machines consists in assigning a set of jobs to a set of machines so that no two conflicting jobs are allocated to the same machine, and the maximum processing time among all machines…