Related papers: A Programming Model for GPU Load Balancing

200 papers

Fine-grained workload and resource balancing is the key to high performance for regular and irregular computations on the GPUs. In this dissertation, we conduct an extensive survey of existing load-balancing techniques to build an…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-12-20 Muhammad Osama

Load-balancing among the threads of a GPU for graph analytics workloads is difficult because of the irregular nature of graph applications and the high variability in vertex degrees, particularly in power-law graphs. We describe a novel…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-28 Vishwesh Jatala, Loc Hoang, Roshan Dathathri, Gurbinder Gill, V Krishna Nandivada, Keshav Pingali

Acceleration of graph applications on GPUs has found large interest due to the ubiquitous use of graph processing in various domains. The inherent irregularity in graph applications leads to several challenges for parallelization.…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-02 Ananya Raval, Rupesh Nasre, Vivek Kumar, Vasudevan R, Sathish Vadhiyar, Keshav Pingali

Maintaining computational load balance is important to the performant behavior of codes which operate under a distributed computing model. This is especially true for GPU architectures, which can suffer from memory oversubscription if…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-05 Michael E. Rowan, Axel Huebl, Kevin N. Gott, Jack Deslippe, Maxence Thévenet, Remi Lehe, Jean-Luc Vay

A parallel computer system is a collection of processing elements that communicate and cooperate to solve large computational problems efficiently. To achieve this, at first the large computational problem is partitioned into several tasks…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-09-09 Ardhendu Mandal, Subhas Chandra Pal

Using GPUs as general-purpose processors has revolutionized parallel computing by offering, for a large and growing set of algorithms, massive data-parallelization on desktop machines. An obstacle to widespread adoption, however, is the…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-10-14 Alexey Kolesnichenko, Christopher M. Poskitt, Sebastian Nanz, Bertrand Meyer

In parallel iterative applications, computational efficiency is essential for addressing large problems. Load imbalance is one of the major performance degradation factors of parallel applications. Therefore, distributing, cleverly, and as…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-18 Anthony Boulmier, Franck Raynaud, Nabil Abdennadher, Bastien Chopard

The ability to model, analyze, and predict execution time of computations is an important building block supporting numerous efforts, such as load balancing, performance optimization, and automated performance tuning for high performance,…

Performance · Computer Science 2020-06-22 James D. Stevens, Andreas Klöckner

Nowadays, the data to be processed by database systems has grown so large that any conventional, centralized technique is inadequate. At the same time, general purpose computation on GPU (GPGPU) recently has successfully drawn attention…

Databases · Computer Science 2013-09-04 Georgios Koutsoumpakis, Iakovos Koutsoumpakis, Anastasios Gounaris

Modeling data sharing in GPU programs is a challenging task because of the massive parallelism and complex data sharing patterns provided by GPU architectures. Better GPU caching efficiency can be achieved through careful task scheduling…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-04 Lingda Li, Ari B. Hayes, Stephen A. Hackler, Eddy Z. Zhang, Mario Szegedy, Shuaiwen Leon Song

This paper presents Block, a distributed scheduling framework designed to optimize load balancing and auto-provisioning across instances in large language model serving frameworks by leveraging contextual information from incoming requests.…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-14 Wei Da, Evangelia Kalyvianaki

GPUs are vastly underutilized, even when running resource-intensive AI applications, as GPU kernels within each job have diverse resource profiles that may saturate some parts of a device while often leaving other parts idle. Colocating…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-17 Paul Elvinger, Foteini Strati, Natalie Enright Jerger, Ana Klimovic

We propose a GPU-based distributed optimization algorithm, aimed at controlling optimal power flow in multi-phase and unbalanced distribution systems. Typically, conventional distributed optimization algorithms employed in such scenarios…

Optimization and Control · Mathematics 2023-10-17 Minseok Ryu, Geunyeong Byeon, Kibaek Kim

With the advent of exascale computing, effective load balancing in massively parallel software applications is critically important for leveraging the full potential of high performance computing systems. Load balancing is the distribution…

Quantum Physics · Physics 2025-01-30 Omer Rathore, Alastair Basden, Nicholas Chancellor, Halim Kusumaatmaja

There is growing interest in accelerating irregular data-parallel algorithms on GPUs. These algorithms are typically blocking, so they require fair scheduling. But GPU programming models (e.g., OpenCL) do not mandate fair scheduling, and…

Programming Languages · Computer Science 2017-07-10 Tyler Sorensen, Hugues Evrard, Alastair F. Donaldson

The dynamic load-balancing framework in Charm++/AMPI, developed at the University of Illinois, is based on using processor virtualization to allow thread migration across processors. This framework has been successfully applied to many…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-10-17 Alvaro Luiz Fazenda, Celso L. Mendes, Laxmikant V. Kale, Jairo Panetta, Eduardo Rocha Rodrigues

In order to satisfy timing constraints, modern real-time applications require massively parallel accelerators such as General Purpose Graphic Processing Units (GPGPUs). Generation after generation, the number of computing clusters made…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-24 Houssam-Eddine Zahaf, Ignacio Sanudo Olmedo, Jayati Singh, Nicola Capodieci, Sebastien Faucou

High-performance implementations of graph algorithms are challenging to implement on new parallel hardware such as GPUs because of three challenges: (1) the difficulty of coming up with graph building blocks, (2) load imbalance on parallel…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-16 Carl Yang, Aydin Buluc, John D. Owens

In this work, we survey the role of GPUs in real-time systems. Originally designed for parallel graphics workloads, GPUs are now widely used in time-critical applications such as machine learning, autonomous vehicles, and robotics due to…

In this dissertation, we propose a memory and computing coordinated methodology to thoroughly exploit the characteristics and capabilities of the GPU-based heterogeneous system to effectively optimize applications' performance and privacy.…

Cryptography and Security · Computer Science 2022-09-07 Zhendong Wang, Yang Hu