English
Related papers

Related papers: Dynamic load-balancing on multi-FPGA systems: a ca…

200 papers

In this paper, we introduce a software-defined framework that enables the parallel utilization of all the programmable processing resources available in heterogeneous system-on-chip (SoC) including FPGA-based hardware accelerators and…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-12 Jose Nunez-Yanez , Mohammad Hosseinabady , Moslem Amiri , Andrés Rodríguez , Rafael Asenjo , Angeles Navarro , Rubén Gran-Tejero , Darío Suárez-Gracia

A parallel computer system is a collection of processing elements that communicate and cooperate to solve large computational problems efficiently. To achieve this, at first the large computational problem is partitioned into several tasks…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-09-09 Ardhendu Mandal , Subhas Chandra Pal

FPGA-based hardware accelerators have received increasing attention mainly due to their ability to accelerate deep pipelined applications, thus resulting in higher computational performance and energy efficiency. Nevertheless, the amount of…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-23 R. Nepomuceno , R. Sterle , G. Valarini , M. Pereira , H. Yviquel , G. Araujo

We introduce a new model for the task mapping problem to aid in the systematic design of algorithms for heterogeneous systems including, but not limited to, CPUs, GPUs and FPGAs. A special focus is set on the communication between the…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-15 Martin Wilhelm , Hanna Geppert , Anna Drewes , Thilo Pionteck

Simultaneous multithreading processors improve throughput over single-threaded processors thanks to sharing internal core resources among instructions from distinct threads. However, resource sharing introduces inter-thread interference…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-20 Marta Navarro , Josué Feliu , Salvador Petit , María E. Gómez , Julio Sahuquillo

The latest trends in high-performance computing systems show an increasing demand on the use of a large scale multicore systems in a efficient way, so that high compute-intensive applications can be executed reasonably well. However, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-02-25 Juliana M. N. Silva , Cristina Boeres , Lúcia M. A. Drummond , Artur A. Pessoa

In parallel iterative applications, computational efficiency is essential for addressing large problems. Load imbalance is one of the major performance degradation factors of parallel applications. Therefore, distributing, cleverly, and as…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-18 Anthony Boulmier , Franck Raynaud , Nabil Abdennadher , Bastien Chopard

As compute power increases with time, more involved and larger simulations become possible. However, it gets increasingly difficult to efficiently use the provided computational resources. Especially in particle-based simulations with a…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-08-05 Sebastian Eibl , Ulrich Rüde

Heterogeneous multi-core architectures combine a few "host" cores, optimized for single-thread performance, with many small energy-efficient "accelerator" cores for data-parallel processing, on a single chip. Offloading a computation to the…

Hardware Architecture · Computer Science 2025-11-11 Luca Colagrande , Luca Benini

The increasing parallelism of many-core systems demands for efficient strategies for the run-time system management. Due to the large number of cores the management overhead has a rising impact to the overall system performance. This work…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-02-11 Daniel Gregorek , Robert Schmidt , Alberto Garcia-Ortiz

The main goal of parallel processing is to provide users with performance that is much better than that of single processor systems. The execution of jobs is scheduled, which requires certain resources in order to meet certain criteria.…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-07 Yang Cao , Fei Wu , Thomas Robertazzi

Growing deployment of power and energy efficient throughput accelerators (GPU) in data centers demands enhancement of power-performance co-optimization capabilities of GPUs. Realization of exascale computing using accelerators requires…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-11-06 Nilanjan Goswami , Amer Qouneh , Chao Li , Tao Li

Nowadays, latency-critical, high-performance applications are parallelized even on power-constrained client systems to improve performance. However, an important scenario of fine-grained tasking on simultaneous multithreading CPU cores in…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-03 Denis Los , Igor Petushkov

This paper presents a methodology for simultaneous heterogeneous computing, named ENEAC, where a quad core ARM Cortex-A53 CPU works in tandem with a preprogrammed on-board FPGA accelerator. A heterogeneous scheduler distributes the tasks…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-16 Kris Nikov , Mohammad Hosseinabady , Rafael Asenjo , Andrés Rodríguezz , Angeles Navarro , Jose Nunez-Yanez

Nowadays, several industrial applications are being ported to parallel architectures. These applications take advantage of the potential parallelism provided by multiple core processors. Many-core processors, especially the GPUs(Graphics…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-03-28 Wendell Rodrigues , Frédéric Guyomarc'h , Jean-Luc Dekeyser

Parallel programming models can encourage performance portability by moving the responsibility for work assignment and data distribution from the programmer to a runtime system. However, analyzing the resulting implicit memory allocations,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-14 Fabian Knorr , Philip Salzmann , Peter Thoman , Thomas Fahringer

Scientific applications are often complex, irregular, and computationally-intensive. To accommodate the ever-increasing computational demands of scientific applications, high-performance computing (HPC) systems have become larger and more…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-20 Ali Mohammed , Aurelien Cavelan , Florina M. Ciorba , Ruben M. Cabezon , Ioana Banicesu

The balance metric is a simple approach to estimate the performance of bandwidth-limited loop kernels. However, applying the method to in-cache situations and modern multi-core architectures yields unsatisfactory results. This paper…

Performance · Computer Science 2009-10-27 Jan Treibig , Georg Hager , Gerhard Wellein

Multithreaded Multi-core processors are prevalent today and are used for solving some of the important problems in computing. Resource imbalance can negatively impact overall performance in such processors. Hence balanced resource…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-25 Suryanarayana Murthy Durbhakula

Maintaining computational load balance is important to the performant behavior of codes which operate under a distributed computing model. This is especially true for GPU architectures, which can suffer from memory oversubscription if…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-05 Michael E. Rowan , Axel Huebl , Kevin N. Gott , Jack Deslippe , Maxence Thévenet , Remi Lehe , Jean-Luc Vay
‹ Prev 1 2 3 10 Next ›