Related papers: Improving tasks throughput on accelerators using O…

200 papers

As the era of Moore's law scaling comes to an end, application-specific hardware accelerators appear as an attractive way to improve the performance and power efficiency of our computing systems. A massively heterogeneous system with a large…

Operating Systems · Computer Science 2019-07-02 Kartik Hegde, Abhishek Srivastava, Rohit Agrawal

Developing parallel algorithms efficiently requires careful management of concurrency across diverse hardware architectures. C++ executors provide a standardized interface that simplifies the development process, allowing developers to…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-22 Karame Mohammadiporshokooh, Steven R. Brandt, Hartmut Kaiser

An acceptable response time of a server is an important aspect in many client-server applications; this is evident in situations in which the server is overloaded by many computationally intensive requests. In this work, we consider that…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-02-09 István Módos, Přemysl Šůcha, Roman Václavík, Jan Smejkal, Zdeněk Hanzálek

Accelerator-based heterogeneous architectures, such as CPU-GPU, CPU-TPU, and CPU-FPGA systems, are widely adopted to support the popular artificial intelligence (AI) algorithms that demand intensive computation. When deployed in real-time…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-20 An Zou, Yuankai Xu, Yinchen Ni, Jintao Chen, Yehan Ma, Jing Li, Christopher Gill, Xuan Zhang, Yier Jin

This paper investigates co-scheduling algorithms for processing a set of parallel applications. Instead of executing each application one by one, using a maximum degree of parallelism for each of them, we aim at scheduling several…

Data Structures and Algorithms · Computer Science 2013-05-01 Guillaume Aupy, Manu Shantharam, Anne Benoit, Yves Robert, Padma Raghavan

Performance-, power-, and energy-aware scheduling techniques play an essential role in optimally utilizing processing elements (PEs) of heterogeneous systems. List schedulers, a class of low-complexity static schedulers, have commonly been…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-12-17 Joshua Mack, Samet E. Arda, Umit Y. Ogras, Ali Akoglu

To deliver high performance in power limited systems, architects have turned to using heterogeneous systems, either CPU+GPU or mixed CPU-hardware systems. However, in systems with different processor types and task affinities, scheduling…

Performance · Computer Science 2017-12-12 Zhuo Chen, Diana Marculescu

Heterogeneous systems are found everywhere from powerful supercomputers to desktop computers and mobile devices, thanks to their excellent performance and energy consumption. The ubiquity of these architectures in both desktop systems and…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-27 Raúl Nozal, Jose Luis Bosque, Ramon Beivide

The aim of the paper is to introduce general techniques to optimize the parallel execution time of sorting on distributed architectures with processors of various speeds. Such an application requires a partitioning step. For…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-08-16 Christophe Cérin, Jean-Christophe Dubacq, Jean-Louis Roch, the SafeScale Collaboration

High-speed computing meets ever-increasing real-time computational demands by leveraging flexibility and parallelism. Flexibility is achieved when the computing platform is designed with heterogeneous resources to support…

Operating Systems · Computer Science 2015-01-08 Mahendra Vucha, Arvind Rajawat

We study the problem of executing an application represented by a precedence task graph on a parallel machine composed of standard computing cores and accelerators. Contrary to most existing approaches, we distinguish the allocation and the…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-20 Marcos Amaris, Giorgio Lucarelli, Clément Mommessin, Denis Trystram

The demand for stringent interactive quality-of-service has intensified in both mobile edge computing (MEC) and cloud systems, driven by the imperative to improve user experiences. As a result, the processing of computation-intensive tasks…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-28 Ngoc Hung Nguyen, Van-Dinh Nguyen, Anh Tuan Nguyen, Nguyen Van Thieu, Hoang Nam Nguyen, Symeon Chatzinotas

Although High Performance Computing (HPC) users understand basic resource requirements such as the number of CPUs and memory limits, internal infrastructural utilization data is exclusively leveraged by cluster operators, who use it to…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-19 Abel Souza, Kristiaan Pelckmans, Johan Tordsson

We consider the problem of scheduling multiprocessor jobs to minimize the total completion time under the given energy budget. Each multiprocessor job requires more than one processor at the same moment of time. Processors may operate at…

Optimization and Control · Mathematics 2021-07-22 Alexander Kononov, Yulia Kovalenko

Parallel programming models can encourage performance portability by moving the responsibility for work assignment and data distribution from the programmer to a runtime system. However, analyzing the resulting implicit memory allocations,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-14 Fabian Knorr, Philip Salzmann, Peter Thoman, Thomas Fahringer

Scheduled batch jobs are widely used on asynchronous computing platforms to execute various enterprise applications, including scheduled notifications and candidate pre-computation for modern recommender systems. It is…

Machine Learning · Computer Science 2022-12-06 Yang Liu, Juan Wang, Zhengxing Chen, Ian Fox, Imani Mufti, Jason Sukumaran, Baokun He, Xiling Sun, Feng Liang

Hardware accelerators, such as those based on GPUs and FPGAs, offer an excellent opportunity to efficiently parallelize functionalities. Recently, modern embedded platforms started being equipped with such accelerators, resulting in a…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-05-16 Daniel Casini, Paolo Pazzaglia, Alessandro Biondi, Marco Di Natale

We present a scheduler that improves cluster utilization and job completion times by packing tasks having multi-resource requirements and inter-dependencies. While the problem is algorithmically very hard, we achieve near-optimality on the…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-04-26 Robert Grandl, Srikanth Kandula, Sriram Rao, Aditya Akella, Janardhan Kulkarni

In this paper, we propose the first optimum process scheduling algorithm for an increasingly prevalent type of heterogeneous multicore (HEMC) system that combines high-performance big cores and energy-efficient small cores with the same…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-09-13 Chien-Hao Chen, Ren-Song Tsay

In neural network topologies, algorithms are running on batches of data tensors. The batches of data are typically scheduled onto the computing cores which execute in parallel. For the algorithms running on batches of data, an optimal batch…

Performance · Computer Science 2020-02-18 Phani Kumar Nyshadham, Mohit Sinha, Biswajit Mishra, H S Vijay