Related papers: Improving tasks throughput on accelerators using O…

HTS: A Hardware Task Scheduler for Heterogeneous Systems

As the Moore's scaling era comes to an end, application specific hardware accelerators appear as an attractive way to improve the performance and power efficiency of our computing systems. A massively heterogeneous system with a large…

Operating Systems · Computer Science · 2019-07-02 · Kartik Hegde, Abhishek Srivastava, Rohit Agrawal

A New Execution Model and Executor for Adaptively Optimizing the Performance of Parallel Algorithms Using HPX Runtime System

Developing parallel algorithms efficiently requires careful management of concurrency across diverse hardware architectures. C++ executors provide a standardized interface that simplifies the development process, allowing developers to…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-10-22 · Karame Mohammadiporshokooh, Steven R. Brandt, Hartmut Kaiser

Adaptive online scheduling of tasks with anytime property on heterogeneous resources

An acceptable response time of a server is an important aspect in many client-server applications; this is evident in situations in which the server is overloaded by many computationally intensive requests. In this work, we consider that…

Distributed, Parallel, and Cluster Computing · Computer Science · 2018-02-09 · István Módos, Přemysl Šůcha, Roman Václavík, Jan Smejkal, Zdeněk Hanzálek

A Survey of Real-time Scheduling on Accelerator-based Heterogeneous Architecture for Time Critical Applications

Accelerator-based heterogeneous architectures, such as CPU-GPU, CPU-TPU, and CPU-FPGA systems, are widely adopted to support the popular artificial intelligence (AI) algorithms that demand intensive computation. When deployed in real-time…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-05-20 · An Zou, Yuankai Xu, Yinchen Ni, Jintao Chen, Yehan Ma, Jing Li, Christopher Gill, Xuan Zhang, Yier Jin

Co-Scheduling Algorithms for High-Throughput Workload Execution

This paper investigates co-scheduling algorithms for processing a set of parallel applications. Instead of executing each application one by one, using a maximum degree of parallelism for each of them, we aim at scheduling several…

Data Structures and Algorithms · Computer Science · 2013-05-01 · Guillaume Aupy, Manu Shantharam, Anne Benoit, Yves Robert, Padma Raghavan

Performant, Multi-objective Scheduling of Highly Interleaved Task Graphs on Heterogeneous System on Chip Devices

Performance-, power-, and energy-aware scheduling techniques play an essential role in optimally utilizing processing elements (PEs) of heterogeneous systems. List schedulers, a class of low-complexity static schedulers, have commonly been…

Distributed, Parallel, and Cluster Computing · Computer Science · 2021-12-17 · Joshua Mack, Samet E. Arda, Umit Y. Ogras, Ali Akoglu

Priority-Aware Near-Optimal Scheduling for Heterogeneous Multi-Core Systems with Specialized Accelerators

To deliver high performance in power limited systems, architects have turned to using heterogeneous systems, either CPU+GPU or mixed CPU-hardware systems. However, in systems with different processor types and task affinities, scheduling…

Performance · Computer Science · 2017-12-12 · Zhuo Chen, Diana Marculescu

Towards Co-execution on Commodity Heterogeneous Systems: Optimizations for Time-Constrained Scenarios

Heterogeneous systems are present from powerful supercomputers, to mobile devices, including desktop computers, thanks to their excellent performance and energy consumption. The ubiquity of these architectures in both desktop systems and…

Distributed, Parallel, and Cluster Computing · Computer Science · 2020-10-27 · Raúl Nozal, Jose Luis Bosque, Ramon Beivide

Methods for Partitioning Data to Improve Parallel Execution Time for Sorting on Heterogeneous Clusters

The aim of the paper is to introduce general techniques in order to optimize the parallel execution time of sorting on distributed architectures with processors of various speeds. Such an application requires a partitioning step. For…

Distributed, Parallel, and Cluster Computing · Computer Science · 2016-08-16 · Christophe Cérin, Jean-Christophe Dubacq, Jean-Louis Roch, the SafeScale Collaboration

A Case Study: Task Scheduling Methodologies for High Speed Computing Systems

High-speed computing meets ever-increasing real-time computational demands by leveraging flexibility and parallelism. Flexibility is achieved when the computing platform is designed with heterogeneous resources to support…

Operating Systems · Computer Science · 2015-01-08 · Mahendra Vucha, Arvind Rajawat

Generic algorithms for scheduling applications on heterogeneous multi-core platforms

We study the problem of executing an application represented by a precedence task graph on a parallel machine composed of standard computing cores and accelerators. Contrary to most existing approaches, we distinguish the allocation and the…

Distributed, Parallel, and Cluster Computing · Computer Science · 2017-11-20 · Marcos Amaris, Giorgio Lucarelli, Clément Mommessin, Denis Trystram

Deadline-Aware Joint Task Scheduling and Offloading in Mobile Edge Computing Systems

The demand for stringent interactive quality-of-service has intensified in both mobile edge computing (MEC) and cloud systems, driven by the imperative to improve user experiences. As a result, the processing of computation-intensive tasks…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-07-28 · Ngoc Hung Nguyen, Van-Dinh Nguyen, Anh Tuan Nguyen, Nguyen Van Thieu, Hoang Nam Nguyen, Symeon Chatzinotas

A HPC Co-Scheduler with Reinforcement Learning

Although High Performance Computing (HPC) users understand basic resource requirements such as the number of CPUs and memory limits, internal infrastructural utilization data is exclusively leveraged by cluster operators, who use it to…

Distributed, Parallel, and Cluster Computing · Computer Science · 2024-01-19 · Abel Souza, Kristiaan Pelckmans, Johan Tordsson

Minimizing Total Completion Time in Multiprocessor Job Systems with Energy Constraint

We consider the problem of scheduling multiprocessor jobs to minimize the total completion time under the given energy budget. Each multiprocessor job requires more than one processor at the same moment of time. Processors may operate at…

Optimization and Control · Mathematics · 2021-07-22 · Alexander Kononov, Yulia Kovalenko

Concurrent Scheduling of High-Level Parallel Programs on Multi-GPU Systems

Parallel programming models can encourage performance portability by moving the responsibility for work assignment and data distribution from the programmer to a runtime system. However, analyzing the resulting implicit memory allocations,…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-03-14 · Fabian Knorr, Philip Salzmann, Peter Thoman, Thomas Fahringer

Personalized Execution Time Optimization for the Scheduled Jobs

Scheduled batch jobs have been widely used on the asynchronous computing platforms to execute various enterprise applications, including the scheduled notifications and the candidate pre-computation for the modern recommender systems. It is…

Machine Learning · Computer Science · 2022-12-06 · Yang Liu, Juan Wang, Zhengxing Chen, Ian Fox, Imani Mufti, Jason Sukumaran, Baokun He, Xiling Sun, Feng Liang

Optimized Partitioning and Priority Assignment of Real-Time Applications on Heterogeneous Platforms with Hardware Acceleration

Hardware accelerators, such as those based on GPUs and FPGAs, offer an excellent opportunity to efficiently parallelize functionalities. Recently, modern embedded platforms started being equipped with such accelerators, resulting in a…

Distributed, Parallel, and Cluster Computing · Computer Science · 2022-05-16 · Daniel Casini, Paolo Pazzaglia, Alessandro Biondi, Marco Di Natale

Do the Hard Stuff First: Scheduling Dependent Computations in Data-Analytics Clusters

We present a scheduler that improves cluster utilization and job completion times by packing tasks having multi-resource requirements and inter-dependencies. While the problem is algorithmically very hard, we achieve near-optimality on the…

Distributed, Parallel, and Cluster Computing · Computer Science · 2016-04-26 · Robert Grandl, Srikanth Kandula, Sriram Rao, Aditya Akella, Janardhan Kulkarni

Analytical Process Scheduling Optimization for Heterogeneous Multi-core Systems

In this paper, we propose the first optimum process scheduling algorithm for an increasingly prevalent type of heterogeneous multicore (HEMC) system that combines high-performance big cores and energy-efficient small cores with the same…

Distributed, Parallel, and Cluster Computing · Computer Science · 2021-09-13 · Chien-Hao Chen, Ren-Song Tsay

An optimal scheduling architecture for accelerating batch algorithms on Neural Network processor architectures

In neural network topologies, algorithms are running on batches of data tensors. The batches of data are typically scheduled onto the computing cores which execute in parallel. For the algorithms running on batches of data, an optimal batch…

Performance · Computer Science · 2020-02-18 · Phani Kumar Nyshadham, Mohit Sinha, Biswajit Mishra, H S Vijay