Related papers: A Framework for Developing Real-Time OLAP algorith…

Optimized Partitioning and Priority Assignment of Real-Time Applications on Heterogeneous Platforms with Hardware Acceleration

Hardware accelerators, such as those based on GPUs and FPGAs, offer an excellent opportunity to efficiently parallelize functionalities. Recently, modern embedded platforms have started to be equipped with such accelerators, resulting in a…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-05-16 Daniel Casini, Paolo Pazzaglia, Alessandro Biondi, Marco Di Natale

RTGPU: Real-Time Computing with Graphics Processing Units

In this work, we survey the role of GPUs in real-time systems. Originally designed for parallel graphics workloads, GPUs are now widely used in time-critical applications such as machine learning, autonomous vehicles, and robotics due to…

Hardware Architecture · Computer Science 2025-12-11 Atiyeh Gheibi-Fetrat, Amirsaeed Ahmadi-Tonekaboni, Farzam Koohi-Ronaghi, Pariya Hajipour, Sana Babayan-Vanestan, Fatemeh Fotouhi, Elahe Mortazavian-Farsani, Pouria Khajehpour-Dezfouli, Sepideh Safari, Shaahin Hessabi, Hamid Sarbazi-Azad

Optimized Composition: Generating Efficient Code for Heterogeneous Systems from Multi-Variant Components, Skeletons and Containers

In this survey paper, we review recent work on frameworks for the high-level, portable programming of heterogeneous multi-/manycore systems (especially, GPU-based systems) using high-level constructs such as annotated user-level software…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-05-14 Christoph Kessler, Usman Dastgeer, Lu Li

Evaluating Rapid Makespan Predictions for Heterogeneous Systems with Programmable Logic

Heterogeneous computing systems, which combine general-purpose processors with specialized accelerators, are increasingly important for optimizing the performance of modern applications. A central challenge is to decide which parts of an…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-15 Martin Wilhelm, Franz Freitag, Max Tzschoppe, Thilo Pionteck

HPM-Frame: A Decision Framework for Executing Software on Heterogeneous Platforms

Heterogeneous computing is one of the most important computational solutions to meet rapidly increasing demands on system performance. It typically allows the main flow of applications to be executed on a CPU while the most computationally…

Software Engineering · Computer Science 2020-12-11 Hugo Andrade, Ola Benderius, Christian Berger, Ivica Crnkovic, Jan Bosch

Manycore processing of repeated range queries over massive moving objects observations

The ability to timely process significant amounts of continuously updated spatial data is mandatory for an increasing number of applications. Parallelism enables such applications to face this data-intensive challenge and allows the devised…

Databases · Computer Science 2014-11-13 Francesco Lettich, Salvatore Orlando, Claudio Silvestri, Christian S. Jensen

Multicore architecture and cache optimization techniques for solving graph problems

With the advent of the era of Big Data and the Internet of Things, there has been an exponential increase in the availability of large data sets. These data sets require in-depth analysis that provides intelligence for improvements in methods for…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-07-11 Alvaro Tzul

Parallel Programming Models for Heterogeneous Many-Cores: A Survey

Heterogeneous many-cores are now an integral part of modern computing systems ranging from embedded systems to supercomputers. While heterogeneous many-core design offers the potential for energy-efficient high performance, such potential…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-11 Jianbin Fang, Chun Huang, Tao Tang, Zheng Wang

Efficient Use of Limited-Memory Accelerators for Linear Learning on Heterogeneous Systems

We propose a generic algorithmic building block to accelerate training of machine learning models on heterogeneous compute systems. Our scheme allows compute accelerators such as GPUs and FPGAs to be employed efficiently for the training of…

Machine Learning · Computer Science 2017-11-08 Celestine Dünner, Thomas Parnell, Martin Jaggi

Adaptive GPU Resource Allocation for Multi-Agent Collaborative Reasoning in Serverless Environments

Multi-agent systems powered by large language models have emerged as a promising paradigm for solving complex reasoning tasks through collaborative intelligence. However, efficiently deploying these systems on serverless GPU platforms…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-05 Guilin Zhang, Wulan Guo, Ziqi Tan

A Tool for Automatically Suggesting Source-Code Optimizations for Complex GPU Kernels

Future computing systems, from handhelds to supercomputers, will undoubtedly be more parallel and heterogeneous than today's systems to provide more performance and energy efficiency. Thus, GPUs are increasingly being used to accelerate…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-18 Saeed Taheri, Apan Qasem, Martin Burtscher

A Survey of Real-time Scheduling on Accelerator-based Heterogeneous Architecture for Time Critical Applications

Accelerator-based heterogeneous architectures, such as CPU-GPU, CPU-TPU, and CPU-FPGA systems, are widely adopted to support the popular artificial intelligence (AI) algorithms that demand intensive computation. When deployed in real-time…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-20 An Zou, Yuankai Xu, Yinchen Ni, Jintao Chen, Yehan Ma, Jing Li, Christopher Gill, Xuan Zhang, Yier Jin

Feedback Scheduling for Energy-Efficient Real-Time Homogeneous Multiprocessor Systems

Real-time scheduling algorithms proposed in the literature are often based on worst-case estimates of task parameters. The performance of an open-loop scheme can be degraded significantly if there are uncertainties in task parameters, such…

Operating Systems · Computer Science 2017-10-13 Mason Thammawichai, Eric C. Kerrigan

Bi-objective Optimisation of Data-parallel Applications on Heterogeneous Platforms for Performance and Energy via Workload Distribution

Performance and energy are the two most important objectives for optimisation on modern parallel platforms. Recent research has demonstrated the importance of workload distribution as a decision variable in the bi-objective optimisation for…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-10 Hamidreza Khaleghzadeh, Muhammad Fahad, Arsalan Shahid, Ravi Reddy Manumachu, Alexey Lastovetsky

Towards Green Computing: A Survey of Performance and Energy Efficiency of Different Platforms using OpenCL

When considering different hardware platforms, not just the time-to-solution can be of importance but also the energy necessary to reach it. This is not only the case with battery-powered and mobile devices but also with high-performance…

Performance · Computer Science 2020-06-30 Philip Heinisch, Katharina Ostaszewski, Hendrik Ranocha

RTGPU: Real-Time GPU Scheduling of Hard Deadline Parallel Tasks with Fine-Grain Utilization

Many emerging cyber-physical systems, such as autonomous vehicles and robots, rely heavily on artificial intelligence and machine learning algorithms to perform important system operations. Since these highly parallel applications are…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-07 An Zou, Jing Li, Christopher D. Gill, Xuan Zhang

A Hybrid Multi-GPU Implementation of Simplex Algorithm with CPU Collaboration

The simplex algorithm has been successfully used for many years in solving linear programming (LP) problems. Due to the intensive computations required (especially for the solution of large LP problems), parallel approaches have also…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-11-22 Basilis Mamalis, Marios Perlitis

Multi-Objective Task Assignment and Multiagent Planning with Hybrid GPU-CPU Acceleration

Allocation and planning with a collection of tasks and a group of agents is an important problem in multiagent systems. One commonly faced bottleneck is scalability, as in general the multiagent model increases exponentially in size with…

Multiagent Systems · Computer Science 2023-05-09 Thomas Robinson, Guoxin Su

Orchestrated Co-scheduling, Resource Partitioning, and Power Capping on CPU-GPU Heterogeneous Systems via Machine Learning

CPU-GPU heterogeneous architectures are now commonly used in a wide variety of computing systems from mobile devices to supercomputers. Maximizing the throughput for multi-programmed workloads on such systems is indispensable as one single…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-08 Issa Saba, Eishi Arima, Dai Liu, Martin Schulz

Ripple: Simplified Large-Scale Computation on Heterogeneous Architectures with Polymorphic Data Layout

GPUs are now used for a wide range of problems within HPC. However, making efficient use of the computational power available with multiple GPUs is challenging. The main challenges in achieving good performance are memory layout, affecting…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-20 Robert Clucas, Philip Blakely, Nikolaos Nikiforakis