Related papers: A Server-based Approach for Predictable GPU Access…

GPGPU Based Parallelized Client-Server Framework for Providing High Performance Computation Support

Parallel data processing has become indispensable for processing applications involving huge data sets. This brings into focus the Graphics Processing Units (GPUs), which emphasize many-core computing. With the advent of General Purpose…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-05-22 Poorna Banerjee, Amit Dave

Queueing Analysis of GPU-Based Inference Servers with Dynamic Batching: A Closed-Form Characterization

GPU-accelerated computing is a key technology to realize high-speed inference servers using deep neural networks (DNNs). An important characteristic of GPU-based inference is that the computational efficiency, in terms of the processing…

Performance · Computer Science 2021-01-13 Yoshiaki Inoue

Towards Fast Setup and High Throughput of GPU Serverless Computing

Integrating GPUs into serverless computing platforms is crucial for improving efficiency. However, existing solutions for GPU-enabled serverless computing platforms face two significant problems due to coarse-grained GPU management: long…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-24 Han Zhao, Weihao Cui, Quan Chen, Shulai Zhang, Zijun Li, Jingwen Leng, Chao Li, Deze Zeng, Minyi Guo

Contention-Aware GPU Partitioning and Task-to-Partition Allocation for Real-Time Workloads

In order to satisfy timing constraints, modern real-time applications require massively parallel accelerators such as General Purpose Graphic Processing Units (GPGPUs). Generation after generation, the number of computing clusters made…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-24 Houssam-Eddine Zahaf, Ignacio Sanudo Olmedo, Jayati Singh, Nicola Capodieci, Sebastien Faucou

Skew Handling in Aggregate Streaming Queries on GPUs

Nowadays, the data to be processed by database systems has grown so large that any conventional, centralized technique is inadequate. At the same time, general purpose computation on GPU (GPGPU) recently has successfully drawn attention…

Databases · Computer Science 2013-09-04 Georgios Koutsoumpakis, Iakovos Koutsoumpakis, Anastasios Gounaris

RTGPU: Real-Time Computing with Graphics Processing Units

In this work, we survey the role of GPUs in real-time systems. Originally designed for parallel graphics workloads, GPUs are now widely used in time-critical applications such as machine learning, autonomous vehicles, and robotics due to…

Hardware Architecture · Computer Science 2025-12-11 Atiyeh Gheibi-Fetrat, Amirsaeed Ahmadi-Tonekaboni, Farzam Koohi-Ronaghi, Pariya Hajipour, Sana Babayan-Vanestan, Fatemeh Fotouhi, Elahe Mortazavian-Farsani, Pouria Khajehpour-Dezfouli, Sepideh Safari, Shaahin Hessabi, Hamid Sarbazi-Azad

Intra-node Memory Safe GPU Co-Scheduling

GPUs in High-Performance Computing systems remain under-utilised due to the unavailability of schedulers that can safely schedule multiple applications to share the same GPU. The research reported in this paper is motivated to improve the…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-12-14 Carlos Reano, Federico Silla, Dimitrios S. Nikolopoulos, Blesson Varghese

Enabling predictable parallelism in single-GPU systems with persistent CUDA threads

Graphics Processing Units, or GPUs, have been successfully adopted both for graphic computation in 3D applications and for general-purpose applications (GP-GPUs), thanks to their tremendous performance-per-watt. Recently, there is a big…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-03 Paolo Burgio

GCAPS: GPU Context-Aware Preemptive Priority-based Scheduling for Real-Time Tasks

Scheduling real-time tasks that utilize GPUs with analyzable guarantees poses a significant challenge due to the intricate interaction between CPU and GPU resources, as well as the complex GPU hardware and software stack. While much…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-11 Yidi Wang, Cong Liu, Daniel Wong, Hyoseung Kim

State-of-the-Art on Query & Transaction Processing Acceleration

The vast amount of processing power and memory bandwidth provided by modern Graphics Processing Units (GPUs) make them a platform for data-intensive applications. The database community identified GPUs as effective co-processors for data…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-02 Bernd Amann, Youry Khmelevsky, Gaetan Hains

Revisiting Query Performance in GPU Database Systems

GPUs offer massive compute parallelism and high-bandwidth memory accesses. GPU database systems seek to exploit those capabilities to accelerate data analytics. Although modern GPUs have more resources (e.g., higher DRAM bandwidth) than…

Databases · Computer Science 2023-02-03 Jiashen Cao, Rathijit Sen, Matteo Interlandi, Joy Arulraj, Hyesoon Kim

Exploiting Dependency and Parallelism: Real-Time Scheduling and Analysis for GPU Tasks

With the rapid advancement of Artificial Intelligence, the Graphics Processing Unit (GPU) has become increasingly essential across a growing number of safety-critical application domains. Applying a GPU is indispensable for parallel…

Operating Systems · Computer Science 2026-02-25 Yuanhai Zhang, Songyang He, Ruizhe Gou, Mingyue Cui, Boyang Li, Shuai Zhao, Kai Huang

Supporting Parallelism in Server-based Multiprocessor Systems

Developing an efficient server-based real-time scheduling solution that supports dynamic task-level parallelism is now relevant to even the desktop and embedded domains and no longer only to the high performance computing market niche. This…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-06-15 Luís Nogueira, Luís Miguel Pinho

Techniques for Shared Resource Management in Systems with Throughput Processors

The continued growth of the computational capability of throughput processors has made throughput processors the platform of choice for a wide variety of high performance computing applications. Graphics Processing Units (GPUs) are a prime…

Hardware Architecture · Computer Science 2018-05-01 Rachata Ausavarungnirun

Understanding GPU Resource Interference One Level Deeper

GPUs are vastly underutilized, even when running resource-intensive AI applications, as GPU kernels within each job have diverse resource profiles that may saturate some parts of a device while often leaving other parts idle. Colocating…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-17 Paul Elvinger, Foteini Strati, Natalie Enright Jerger, Ana Klimovic

GPUs as Storage System Accelerators

Massively multicore processors, such as Graphics Processing Units (GPUs), provide, at a comparable price, a one order of magnitude higher peak performance than traditional CPUs. This drop in the cost of computation, as any…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-11-18 Samer Al-Kiswany, Abdullah Gharaibeh, Matei Ripeanu

ESG: Pipeline-Conscious Efficient Scheduling of DNN Workflows on Serverless Platforms with Shareable GPUs

Recent years have witnessed increasing interest in machine learning inferences on serverless computing for its auto-scaling and cost effective properties. Existing serverless computing, however, lacks effective job scheduling methods to…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-26 Xinning Hui, Yuanchao Xu, Zhishan Guo, Xipeng Shen

Efficient Resource Sharing Through GPU Virtualization on Accelerated High Performance Computing Systems

The High Performance Computing (HPC) field is witnessing a widespread adoption of Graphics Processing Units (GPUs) as co-processors for conventional homogeneous clusters. The adoption of prevalent Single-Program Multiple-Data (SPMD)…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-11-25 Teng Li, Vikram K. Narayana, Tarek El-Ghazawi

A Comprehensive Overview of GPU Accelerated Databases

Over the past decade, the landscape of data analytics has seen a notable shift towards heterogeneous architectures, particularly the integration of GPUs to enhance overall performance. In the realm of in-memory analytics, which often…

Databases · Computer Science 2024-06-21 Harshit Sharma, Anmol Sharma

A Graph-based Model for GPU Caching Problems

Modeling data sharing in GPU programs is a challenging task because of the massive parallelism and complex data sharing patterns provided by GPU architectures. Better GPU caching efficiency can be achieved through careful task scheduling…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-04 Lingda Li, Ari B. Hayes, Stephen A. Hackler, Eddy Z. Zhang, Mario Szegedy, Shuaiwen Leon Song